Questions tagged [nlp]

Natural language processing (NLP) is a subfield of artificial intelligence that involves transforming or extracting useful information from natural language data. Methods include machine-learning and rule-based approaches.

0 votes
0 answers
19 views

BERT embedding cosine similarities look very random and useless

I am new to this field, so maybe I am misunderstanding something. However, I thought you can use BERT embeddings to determine semantic similarity. I was trying to group some words in categories using ...
mihovg93's user avatar
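
For reference, the cosine similarity this question relies on can be sketched in plain Python. This is only a minimal illustration: the three-dimensional vectors below are invented toy values, not real BERT embeddings, and the names `cosine_similarity` and `toy_vectors` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors made up for illustration; real embeddings have hundreds of dimensions.
toy_vectors = {
    "truck": [1.0, 0.0, 1.0],
    "car": [1.0, 0.0, 0.5],
    "green": [0.0, 1.0, 0.0],
}

sim_truck_car = cosine_similarity(toy_vectors["truck"], toy_vectors["car"])
sim_truck_green = cosine_similarity(toy_vectors["truck"], toy_vectors["green"])
```

With these toy values, `sim_truck_car` is close to 1 and `sim_truck_green` is 0, because the score depends only on the direction of the vectors, not their length.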
0 votes
0 answers
16 views

How do I install a language model for spaCy on Kaggle?

Aloha! Everybody knows how to install a model at home: python -m spacy download ru_core_news_md But since a Python notebook on Kaggle is isolated from the internet, it does not seem possible to do so. ...
Dimas del Pablo's user avatar
0 votes
0 answers
12 views

module 'keras_nlp' has no attribute 'models'

Has anyone else experienced the same error when running it locally? It runs correctly on Colab. module 'keras_nlp' has no attribute 'models'. I tried to install the updated version with pip install -U ...
Mody's user avatar
  • 1
0 votes
0 answers
13 views

Bus Error and Resource Tracker Warning When Training PyTorch Model on GPU with MPS

I’ve built a vanilla Transformer using PyTorch for machine translation and am encountering issues while trying to train it on an Apple Mac M3 with a 12-core CPU and an 18-core GPU (18GB RAM) ...
Pratheesh Kumar's user avatar
-2 votes
0 answers
16 views

Enhancing Document Layout Analysis by Adding Positional and Character Information to CNN Inputs

I am working on document layout analysis and have been exploring CNNs and transformer-based networks for this task. Typically, images are passed as 3-channel RGB inputs to these networks. However, my ...
HARSH DEVMURARI's user avatar
-2 votes
0 answers
13 views

Using RAG to identify the source of error in a code file [closed]

I am trying to implement a small tool that automatically identifies which part of the code from a set of code files (given as input) is responsible for the error text displayed during execution. The ...
S R's user avatar
  • 7
1 vote
0 answers
16 views

Summarizing comments and replacing duplicates with the first occurrence when the meaning of the comments is the same

Context - Doing an NLP project to analyze the comments column in a data frame. I want to replace the duplicates with the first occurrence if the meaning of the comments is the same. I want to compare all ...
Bhuvaneshwari D Raman Effect's user avatar
-1 votes
0 answers
14 views

Models for getting similarity scores between categories and keywords

I want to get a similarity score between a category like vehicles and a list of words like headphone, water, truck, and green. The goal would be for each score to be low on words outside the category ...
Kayla Farivar's user avatar
0 votes
0 answers
12 views

Feature Importance Chart in Keras API

I am implementing a series of three deep learning models (RNN, 1D-CNN, and custom transformer), all utilizing the Keras API, for an NLP binary classification problem. I would like to generate a ...
Alex K's user avatar
  • 113
-2 votes
0 answers
14 views

What's the difference between contextual encoders like USE and OpenAI's text-embedding-ada-002? [closed]

I just came across Universal Sentence Encoder, which can also keep the context during semantic search and other operations. Is OpenAI's text-embedding-ada-002 much more advanced or more or less ...
Ujjwal Kumar Singh's user avatar
0 votes
0 answers
29 views

The Impact of Pretraining on Fine-tuning and Inference

I am working on a binary prediction classification task, primarily focusing on fine-tuning a BERT model to learn the association between CVEs and CWEs. I've structured my task into three phases: first,...
joehu's user avatar
  • 19
-1 votes
0 answers
21 views

Live transcription with word-level timestamps

I'm new to machine learning. My task is to transcribe audio live from a microphone and to produce word-level rather than utterance-level transcripts of this live transcription. I tried these projects: ...
guresha's user avatar
0 votes
0 answers
18 views

How to Combine Semantic Search with SQL Analytical Queries?

I'm creating an LLM-agent that can provide insights from a complex database. The database includes several columns of different types (datetime, numeric, and text). For simplicity, let's assume I have ...
Pepe Moreno's user avatar
0 votes
0 answers
20 views

Memory usage when using spaCy Doc extensions

Issue Before preprocessing my data with spaCy, I typically have my data stored in a Pandas Series. Since I'd like to preserve the index for each document before serializing my Docs, I decided to use ...
falsum's user avatar
  • 359
0 votes
0 answers
26 views

Combine strings with redundant information without losing business logic in Python

I have a business entity called "Groups". These groups are linked to a PQL-like string called "rule". These rules are evaluated to check which products from an ecommerce match with ...
LuisGT's user avatar
  • 23
0 votes
0 answers
11 views

Which RAG methods/concepts can I use for a benchmark?

I am writing a practical assignment for my uni. There I have to analyse different RAG methods and compare them. Since I am in my 2nd semester of information systems and I lack experience within the ...
Arian Ott's user avatar
0 votes
0 answers
16 views

spacy-llm & SpanCat for address parsing

I'm currently developing a project to standardize and correct a dataset of inconsistently formatted addresses using spaCy-LLM and spaCy.SpanCat.v3. The goal is to train a model on examples of ...
Hammad Javaid's user avatar
-1 votes
0 answers
16 views

AssertionError: Unexpected kwargs: {'use_flash_attention_2': False}

I'm using EvolvingLMMs-Lab/lmms-eval to evaluate LLaVa model after running accelerate launch --num_processes=8 -m lmms_eval --model llava --model_args pretrained="liuhaotian/llava-v1.5-7b" ...
ahmad's user avatar
  • 41
-1 votes
0 answers
20 views

BERT: how to get a quoted string as token

I eventually managed to train a model, based on BERT (bert-base-uncased) and TensorFlow, to extract intents and slots for texts like this: create a doc document named doc1 For this text, my model ...
Fab's user avatar
  • 1,526
-2 votes
0 answers
27 views

Free speech-to-text APIs, libraries, or open-source solutions for Python? [closed]

I'm developing a Python application that requires speech-to-text functionality, preferably for multiple languages including Turkish. I'm looking for free or open-source solutions that can be easily ...
Illia's user avatar
  • 1
0 votes
1 answer
19 views

How is coherence score calculated in Mallet?

I do understand how the diagnostics output shows the coherence values for each topic but my values range between -150 and -600 and other posts that I have seen where Mallet was used show coherence ...
Glorifier's user avatar
-1 votes
0 answers
28 views

Word Prediction to complete sentence [closed]

The visuals in this movie are _________, but the story is paper-thin. It's all flash and no substance. How can I better recommend a word to complete the sentence? I tried some code and got the output: The ...
NIM's user avatar
  • 7
-3 votes
0 answers
17 views

AI deep learning code generator for code vectorization [closed]

Python Hello, I have language and code headings consisting of json structure and there are codes under the code headings. Can you help me clean and vectorize the codes? Thank you. Note: I want my ...
Fatih Ceylan's user avatar
0 votes
0 answers
18 views

C# English word stemming and lemmatizing using Catalyst - how to do that

I added the Catalyst NuGet package to my C# project to help me lemmatize English words. However, its documentation is not clear and lacks examples. I tried to lemmatize/stem only one word: Catalyst....
Zoltan Hernyak's user avatar
-1 votes
0 answers
15 views

Checking semantic meaning of 2 texts while considering the order of the texts

I am doing a task related to checking the semantic meaning similarity between 2 texts. There I used BERT sentence-transformers/all-MiniLM-L6-v2 model. Input 1 - "Object moves in uniform ...
Anjana Pathirana's user avatar
1 vote
1 answer
43 views

Not able to use END with "add_conditional_edges" in LangGraph

This is my code: import os from dotenv import load_dotenv load_dotenv() from langchain_openai import ChatOpenAI from langgraph.graph import StateGraph, END from langgraph.graph import Graph, ...
Sai3554's user avatar
  • 13
-5 votes
0 answers
34 views

Is there any way to apply NLP to my survey responses about stress? [closed]

Is there any way to apply NLP techniques to my survey responses about stress? I have survey data for stress. My column data look like: Stress_Frequency Low_Energy_Frequency \ 0 1 Arts and ...
shikha. agarwal's user avatar
-2 votes
0 answers
22 views

HuggingFaceInstructor Embedding giving error? [closed]

I am working on a chatbot to handle FAQs. I have tried the HuggingFaceInstructor embedding, but it has a lot of dependencies and I don't know why my Anaconda prompt is not able to install all the required ...
Abdul Hadi's user avatar
1 vote
1 answer
28 views

How to make dynamic API calls based on user input in a Gemini application python nlp?

I'm working on a Gemini application where I need to make dynamic API calls based on user input. Specifically, I want to perform different API requests depending on the user's query. For example, if ...
user14990172's user avatar
0 votes
0 answers
46 views

Natural Language Question Answering: How do you train and evaluate using ML.Net

I want to know how to train and evaluate Natural Language Question Answering model using ML.Net. I already have the training working but I got stuck at the Evaluation part where you ask a question and ...
Sol's user avatar
  • 71
0 votes
0 answers
17 views

Why do character-based LSTMs take more time than word-based LSTMs for next-word prediction?

I wanted to train an LSTM model for next-word prediction using both word-based and character-based approaches. I used similar data processing techniques for both. For Character, class ...
Shantanu Nath's user avatar
0 votes
0 answers
10 views

Calling ASR model in Bhasini throws - {"code":"something went wrong","message":null,"timestamp":"2024-07-03T06:08:52.407+00:00"}

I am trying to call the ASR model of the Bhasini API to get a transcription of my audio. I tried the following code. import requests import json url = "https://meity-auth.ulcacontrib.org/ulca/apis/v0/...
Vandit Tyagi's user avatar
0 votes
0 answers
26 views

Understanding and improving coherence values using Mallet

I am attempting to run an LDA topic model using Mallet. My corpus consists of user comments from news websites. It's a relatively small corpus with approx. 614k words. The first approach I took was to ...
Glorifier's user avatar
0 votes
0 answers
17 views

Named Entity Recognition using TFLite on Android

I have a named entity recognition TensorFlow Lite model that I trained in Python and that I would like to run on an Android. To make predictions, the model takes in a dictionary of two tensors, namely ...
cdekalb's user avatar
0 votes
0 answers
37 views

GLiNER fine-tuning - no validation loss is logged

I am trying to fine-tune using this notebook: GLiNER/examples/finetune.ipynb at main · urchade/GLiNER (github.com) However, the logs only show 'loss' , which I assume is the training data set loss, ...
andream's user avatar
  • 33
0 votes
0 answers
67 views

UndefinedFile: could not open extension control file "/usr/share/postgresql/14/extension/vector.control": No such file or directory

I'm trying to run postgres along with pgvector but getting this error: UndefinedFile: could not open extension control file "/usr/share/postgresql/14/extension/vector.control": No such file ...
Ayush J.'s user avatar
0 votes
1 answer
34 views

How can I make my Hugging Face fine-tuned model's config.json file reference a specific revision/commit from the original pretrained model?

I uploaded this model: https://huggingface.co/pamessina/CXRFE, which is a fine-tuned version of this model: https://huggingface.co/microsoft/BiomedVLP-CXR-BERT-specialized Unfortunately, CXR-BERT-...
Pablo Messina's user avatar
-2 votes
1 answer
23 views

Hugging Face model with large context window [closed]

I'm looking for a model that accepts an input of around 50k characters plus a prompt, and answers a question based on that text. Is there something like that available? I'm not sure how to find it; I'm new to AI.
repo's user avatar
  • 764
-1 votes
1 answer
49 views

Is English text tokenization possible in C# rather than Python? [closed]

In our software we have to analyze a plain text file. First we should break the text into paragraphs, then into sentences, then into tokens. The final steps (as far as I understand) are the stemming and ...
Zoltan Hernyak's user avatar
0 votes
0 answers
15 views

Spark code for NLP processing of Chinese data

I am using PySpark with jieba for Chinese data. I have a UDF in the code and used a broadcast, but enriching the column takes hours for 5 million records. I am clueless about what we need to do here to ...
Rajiv Singh's user avatar
  • 1,018
1 vote
1 answer
42 views

Alternative to Receptive field in Transformers and what factors impact it

I have two transformer networks. One with 3 heads per attention and 15 layers in total and second one with 5 heads per layer and 30 layers in total. Given an arbitrary set of documents (2048 tokens ...
CraZyCoDer's user avatar
0 votes
0 answers
32 views

"KeyError: 0" when calling model.fit() in Keras

I am training a simple sequential model for NLP in keras. Here is the model architecture: from keras.models import Sequential from keras import layers from keras.layers import Embedding, Flatten, ...
Alex K's user avatar
  • 113
1 vote
0 answers
19 views

spaCy: detecting GPE correctly

I have a set of strings for which I need to detect the country each one belongs to, based on the detected GPE. sentences = [ "I watched TV in germany", "Mediaset ITA canale 5", &...
user3925023's user avatar
0 votes
0 answers
23 views

Why isn't tf.keras.layers.TextVectorization accepting standardization=None?

I'm still trying to get this to work (and to learn!) so I am using a tiny corpus. I do some preprocessing on the text in order to get specific bi-gram collocations using nltk (not relevant here but I ...
DS14's user avatar
  • 131
-2 votes
0 answers
23 views

Is there a good pretrained model (spaCy, NER, or any other) for resume parsing, or is there another way to create a resume parser application?

Please, I have been working on this for the past 25 days. Is there any way or solution for this? If there is a pretrained model I can use, that would also be good for getting at least 80% accuracy for entity extraction from ...
Ganesh Ingale's user avatar
0 votes
0 answers
41 views

How to generate output of HuggingFace PEFT model with previous message history as context?

I am trying to generate text from my fine-tuned Llama3 model which uses the PEFT AutoPeftModelForCausalLM library while also passing in previous message history. This is how I am currently generating ...
Avik Malladi's user avatar
0 votes
0 answers
18 views

A prepended *Paraphrase* is shown every time after running the code instead of the actual paraphrased sentence

I am creating an Urdu text paraphrasing tool for my semester. I have used the T5 model and fine-tuned it. Now when I'm running this code: **import torch from transformers import T5ForConditionalGeneration, ...
Fozan Akbar khan's user avatar
0 votes
0 answers
10 views

Understanding time complexity for dynamic indexing

While reading the text, Introduction to Information Retrieval by Manning et al., I came across dynamic indexing (Sec 4.5, Page 79). It consists of two indices, the main index (stored on disk) and an ...
Swaroop's user avatar
  • 1,249
0 votes
0 answers
43 views

Python Rasa train not compatible with Apple M3

Unable to run "rasa train" in the Mac M3 terminal; it gives the error below. 2024-06-26 17:27:12 INFO rasa.cli.train - Started validating domain and training data... zsh: illegal hardware ...
Shamil's user avatar
  • 727
-1 votes
0 answers
51 views

How to open `gguf` models in Windows?

How to run a file with .gguf extension on a Windows machine? I tried to open the file with PyCharm or Visual, but they did not recognize this file. And, when I tried to run the command to use the ...
THANH HOÀNG's user avatar
