NLP Collective

Discuss NLP with peers and experts

A new space for technical discussions about NLP

Share your insights, advice and experience with peers and experts

Engage and discuss in threaded post replies

Discussions

Browse discussion posts about NLP.

44 discussion posts

10 votes

186 views

6 replies

What is your ideal development environment for deep learning/NLP?

I'm curious about the ideal development setups for NLP developers who train deep learning models. I know in my journey to work more with deep learning, I use Jupyter Notebooks a lot, but I'm ...

Travis J

81.9k

replied yesterday

0 votes

75 views

0 replies

Classifier-Free-Guidance with Transformers

I'm working on music generation using transformers, using the decoder part for the audio tokens with text conditioning by the T5 encoder. In Classifier-Free-Guidance, the text conditioning randomly ...

nlp transformer-model

qmzp

replied Jul 10 at 14:28

12 votes

765 views

19 replies

Is R efficient for sentiment analysis?

I would like to explore sentiment analysis further, but I cannot decide whether to start a project in Python or R. What would you suggest?

r sentiment-analysis sentimentr

CommunityBot

replied Jul 3 at 6:03

0 votes

43 views

0 replies

Tools that can combine a CSV with metadata and feed them to an LLM for querying

I have a table that looks like this: pd.DataFrame({'HRHHID': [1,2,3,4,5], 'HEHOUSUT': [2,3,1,4,2], 'HETELHHD': [1,2,1,1,1]}) I also have a txt file with some "meta data" for this file that ...

amazon-web-services large-language-model nlp

quant

4,388

created Jun 21 at 7:45

0 votes

22 views

0 replies

Using medspaCy with target rules from Metathesaurus

Before I start this discussion, here are some useful links that could provide some context: https://github.com/medspacy/medspacy https://www.nlm.nih.gov/research/umls/knowledge_sources/metathesaurus/...

spacy

Alex K

created Jun 20 at 16:54

1 vote

99 views

3 replies

A fantastic idea about using several sentences to represent another sentence in NLP

When learning NLP, I found that the current representation methods are basically word representation, so I wonder if there is a sentence representation? My hypothesis: to represent sentences using ...

nlp stanford-nlp

Software Engineer

15.9k

replied Jun 20 at 11:08

0 votes

35 views

0 replies

What are the advantages and disadvantages of using spaCy vs. NLTK for NLP tasks?

I am currently working on several NLP projects and I'm trying to decide between using spaCy and NLTK as my main NLP library. Both libraries seem to offer a range of features, but I'm not sure which ...

nlp nltk spacy

Thomas Markov

edited Jun 19 at 0:34

1 vote

39 views

0 replies

Post-Processing Arabic OCR

Has anyone worked in a place where they extract a lot of text using Arabic OCR and then clean it to be as accurate as possible? How is this done? For example, if you digitize many documents and use ...

arabic dataset easyocr nlp ocr

Hello

created Jun 4 at 6:00

10 votes

387 views

7 replies

How are OCR texts post-processed to increase accuracy of recognition?

Has anyone worked in a company where they extract large amounts of text using OCR and then clean the text to be as accurate as possible? How is this done? Say I digitize a lot of legal documents, run ...

cloud-document-ai nlp ocr post-processing

Hello

replied Jun 3 at 11:37

0 votes

81 views

0 replies

How to Develop an AI Model for Generating Website Templates from Text Prompts?

I am working on a project to develop an AI model that can generate website templates based on user-provided text prompts. The model should be able to interpret details such as desired features, color ...

artificial-intelligence deep-learning large-language-model machine-learning nlp

Quartz Mode

created Jun 3 at 8:36

0 votes

67 views

0 replies

Long Context Embedding Models eg. bge-m3 - To Chunk or Not to Chunk?

bge-m3 is a highly performant embedding model that can produce both sparse and dense encodings. It has a context length of 8192 tokens. What I am wondering is, with such models, whether it would be useful to BOTH long encode ...

large-language-model llama nlp

Draco

created May 23 at 18:57

1 vote

54 views

2 replies

Am I overengineering my lenient NER F1 measures?

Hi all, I need to customize my F1 measurement a lot when evaluating my fine-tuned NER model's performance, but I don't see other people with similar issues. I wonder if I am doing the wrong thing, or if ...

named-entity-recognition nlp

FewKey

replied May 8 at 9:24

1 vote

206 views

1 reply

Can you help a classification algorithm by offering it cues?

Is there a way to teach a classification algorithm to learn better from sparse data? I was also thinking about the possibility of giving it cues as a list of words or phrases. If you ...

classification nlp-question-answering text-classification

Ekin Bozyel

replied Apr 13 at 13:15

12 votes

552 views

3 replies

LLM learning roadblock: how to fine-tune a model with zero budget

As an early learner of LLMs, I completed most of the Hugging Face tutorials without much trouble on a laptop. Once I moved past that to wanting to fine-tune a model with a large amount of data, my ...

huggingface-transformers

Utkarsh Dadhich

replied Apr 13 at 9:44

2 votes

31 views

0 replies

Do you know of more datasets of English sentences containing idioms?

Do you know of any lists of datasets of English sentences containing idioms? I need them for research. I found these, but need more: https://metatext.io/datasets/english-possible-idiomatic-expressions-(epie) https://...

nlp python

Vy Do

50.7k

edited Apr 6 at 17:10

Got a burning question about NLP? This is the place to trade your best insights, advice, discoveries and expertise. Learn more about discussions.

Share perspectives, advice, and insights

Use Discussions to engage in deeper dialogue, have opinion-based conversations, and exchange perspectives about a technical concept. See full Discussions guidelines.

Discussions is different than Q&A

Discussions exists separately from the traditional question-and-answer space. If you have a specific programming question, go to Stack Overflow Q&A to post your question.

Be welcoming and patient

All users are expected to treat one another with kindness and respect. Remember, everyone is here to learn, and sometimes while learning, people make mistakes. See code of conduct.

No resume or job listings

Discussions are not for sharing your resume or job listing.

Avoid self-promotion

If your post happens to be about your product or website, you must disclose your affiliation. See spam guidelines and best practices.
