NLP Collective
Discussions
Browse discussion posts about NLP.
What is your ideal development environment for deep learning/NLP?
I'm curious what the ideal development setups are for NLP developers who train deep learning models. In my own journey toward working more with deep learning, I use Jupyter Notebooks a lot, but I'm ...
Classifier-Free-Guidance with Transformers
I'm working on music generation using transformers, using the decoder part for the audio tokens with text conditioning from the T5 encoder. In Classifier-Free-Guidance, the text conditioning randomly ...
Is R efficient for sentiment analysis?
I would like to explore sentiment analysis further, but I cannot decide whether to start a project in Python or R. What would you suggest?
Tools that can combine a CSV with metadata and feed them to an LLM for querying
I have a table that looks like this: pd.DataFrame({'HRHHID': [1,2,3,4,5], 'HEHOUSUT': [2,3,1,4,2], 'HETELHHD': [1,2,1,1,1]}) I also have a txt file with some "meta data" for this file that ...
Using medspaCy with target rules from Metathesaurus
Before I start this discussion, here are some useful links that could provide some context: https://github.com/medspacy/medspacy https://www.nlm.nih.gov/research/umls/knowledge_sources/metathesaurus/...
A fantastic idea about using several sentences to represent another sentence in NLP
When learning NLP, I found that the current representation methods are basically word representations, so I wonder whether there is a sentence representation. My hypothesis: to represent sentences using ...
What are the advantages and disadvantages of using spaCy vs. NLTK for NLP tasks?
I am currently working on several NLP projects and I'm trying to decide between using spaCy and NLTK as my main NLP library. Both libraries seem to offer a range of features, but I'm not sure which ...
Post-Processing Arabic OCR
Has anyone worked in a place where they extract a lot of text using Arabic OCR and then clean it to be as accurate as possible? How is this done? For example, if you digitize many documents and use ...
How are OCR texts post-processed to increase accuracy of recognition?
Has anyone worked in a company where they extract large amounts of text using OCR and then clean the text to be as accurate as possible? How is this done? Say I digitize a lot of legal documents, run ...
How to Develop an AI Model for Generating Website Templates from Text Prompts?
I am working on a project to develop an AI model that can generate website templates based on user-provided text prompts. The model should be able to interpret details such as desired features, color ...
Long Context Embedding Models, e.g. bge-m3 - To Chunk or Not to Chunk?
bge-m3 is a highly performant embedding model that can produce both sparse and dense encodings. It has a context length of 8k tokens. What I am wondering is with such models if it would be useful to BOTH long encode ...
Am I overengineering my lenient NER F1 measures?
Hi all, I need to customize my F1 measurement a lot when evaluating my fine-tuned NER model's performance, but I don't see other people with similar issues. I wonder if I am doing the wrong thing, or if ...
Can you help a classification algorithm by offering it cues?
Is there a way to teach a classification algorithm to learn better from sparse data? I was also thinking about the possibility of giving it cues as a list of words or phrases. If you ...
LLM learning roadblock: how to fine-tune a model with zero budget
As an early learner of LLMs, I completed most of the Hugging Face tutorials without much trouble on a laptop. Once I moved past that to wanting to fine-tune a model with a large amount of data, my ...
Do you know of more datasets of English sentences containing idioms?
Do you know of a list of datasets with English sentences that contain idioms? I would use them for research. I found these, but need more: https://metatext.io/datasets/english-possible-idiomatic-expressions-(epie) https://...