Newest 'nlp' Questions

0 votes

0 answers

19 views

BERT embedding cosine similarities look very random and useless

I am new to this field, so maybe I am misunderstanding something. However, I thought you can use BERT embeddings to determine semantic similarity. I was trying to group some words in categories using ...

mihovg93

93

asked 12 hours ago

0 votes

0 answers

16 views

How do I install a language model for spaCy on Kaggle?

Aloha! Everybody knows how to install a model at home: python -m spacy download ru_core_news_md But since a Python notebook on Kaggle is isolated from the internet, it does not seem possible to do so. ...

Dimas del Pablo

9

asked 19 hours ago

0 votes

0 answers

12 views

module 'keras_nlp' has no attribute 'models'

Has anyone else experienced the same error when running it locally? It runs correctly on Colab. module 'keras_nlp' has no attribute 'models' I tried to install the updated version with pip install -U ...

Mody

1

asked 23 hours ago

0 votes

0 answers

13 views

Bus Error and Resource Tracker Warning When Training PyTorch Model on GPU with MPS

I’ve built a vanilla Transformer using PyTorch for machine translation and am encountering issues while trying to train it on an Apple Mac M3 with a 12-core CPU and an 18-core GPU (18GB RAM) ...

Pratheesh Kumar

1

asked yesterday

-2 votes

0 answers

16 views

Enhancing Document Layout Analysis by Adding Positional and Character Information to CNN Inputs

I am working on document layout analysis and have been exploring CNNs and transformer-based networks for this task. Typically, images are passed as 3-channel RGB inputs to these networks. However, my ...

HARSH DEVMURARI

1

asked yesterday

-2 votes

0 answers

13 views

Using RAG to identify the source of error in a code file [closed]

I am trying to implement a small tool that automatically identifies which part of the code from a set of code files (given as input) is responsible for the error text displayed during execution. The ...

S R

7

asked 2 days ago

1 vote

0 answers

16 views

Text summarization of comments: replace duplicates with the first occurrence if the meaning of the comments is the same

Context - Doing an NLP project to analyze the comments column in a data frame. I want to replace the duplicates with the first occurrence if the meaning of the comments is the same. I want to compare all ...

Bhuvaneshwari D Raman Effect

11

asked 2 days ago

-1 votes

0 answers

14 views

Models for getting similarity scores between categories and keywords

I want to get a similarity score between a category like vehicles and a list of words like headphone, water, truck, and green. The goal would be for each score to be low on words outside the category ...

Kayla Farivar

1

asked 2 days ago

0 votes

0 answers

12 views

Feature Importance Chart in Keras API

I am implementing a series of three deep learning models (RNN, 1D-CNN, and custom transformer), all utilizing the Keras API, for an NLP binary classification problem. I would like to generate a ...

Alex K

113

asked 2 days ago

-2 votes

0 answers

14 views

What's the difference between contextual encoders like USE and OpenAI's text-embedding-ada-002? [closed]

Just came across Universal Sentence Encoder, which can also keep the context during semantic search and other operations. Is OpenAI's text-embedding-ada-002 much more advanced or more or less ...

Ujjwal Kumar Singh

1

asked 2 days ago

0 votes

0 answers

29 views

The Impact of Pretraining on Fine-tuning and Inference

I am working on a binary prediction classification task, primarily focusing on fine-tuning a BERT model to learn the association between CVEs and CWEs. I've structured my task into three phases: first,...

joehu

19

asked 2 days ago

-1 votes

0 answers

21 views

Live transcription with word-level timestamps

I'm new to machine learning. I have a task to live-transcribe audio from a microphone and also produce word-level transcripts of this live transcription instead of utterance-level ones. I tried these projects: ...

guresha

1

asked 2 days ago

0 votes

0 answers

18 views

How to Combine Semantic Search with SQL Analytical Queries?

I'm creating an LLM-agent that can provide insights from a complex database. The database includes several columns of different types (datetime, numeric, and text). For simplicity, let's assume I have ...

Pepe Moreno

43

asked 2 days ago

0 votes

0 answers

20 views

Memory usage when using spaCy Doc extensions

Issue Before preprocessing my data with spaCy, I typically have my data stored in a Pandas Series. Since I'd like to preserve the index for each document before serializing my Docs, I decided to use ...

falsum

359

asked Jul 10 at 16:09

0 votes

0 answers

26 views

Combine strings with redundant information without losing business logic in Python

I have a business entity called "Groups". These groups are linked to a PQL-like string called "rule". These rules are evaluated to check which products from an e-commerce site match with ...

LuisGT

23

asked Jul 10 at 14:39

0 votes

0 answers

11 views

Which RAG methods/concepts can I use for a benchmark?

I am writing a practical assignment for my uni. There I have to analyse different RAG methods and compare them. Since I am in my 2nd semester of information systems and I lack experience within the ...

Arian Ott

1

asked Jul 10 at 13:26

0 votes

0 answers

16 views

spacy-llm & SpanCat for address parsing

I'm currently developing a project to standardize and correct a dataset of inconsistently formatted addresses using spaCy-LLM and spaCy.SpanCat.v3. The goal is to train a model on examples of ...

Hammad Javaid

1

asked Jul 10 at 7:30

-1 votes

0 answers

16 views

AssertionError: Unexpected kwargs: {'use_flash_attention_2': False}

I'm using EvolvingLMMs-Lab/lmms-eval to evaluate LLaVa model after running accelerate launch --num_processes=8 -m lmms_eval --model llava --model_args pretrained="liuhaotian/llava-v1.5-7b" ...

ahmad

41

asked Jul 9 at 15:36

-1 votes

0 answers

20 views

BERT: how to get a quoted string as token

I eventually managed to train a model, based on BERT (bert-base-uncased) and TensorFlow, to extract intents and slots for texts like this: create a doc document named doc1 For this text, my model ...

Fab

1,526

asked Jul 8 at 14:53

-2 votes

0 answers

27 views

Free speech-to-text APIs, libraries, or open-source solutions for Python? [closed]

I'm developing a Python application that requires speech-to-text functionality, preferably for multiple languages including Turkish. I'm looking for free or open-source solutions that can be easily ...

Illia

1

asked Jul 8 at 9:23

0 votes

1 answer

19 views

How is coherence score calculated in Mallet?

I do understand how the diagnostics output shows the coherence values for each topic but my values range between -150 and -600 and other posts that I have seen where Mallet was used show coherence ...

Glorifier

31

asked Jul 6 at 14:33

-1 votes

0 answers

28 views

Word Prediction to complete sentence [closed]

The visuals in this movie are _________, but the story is paper-thin. It's all flash and no substance. How can I better recommend a word to complete the sentence? I tried some code and got the output The ...

NIM

7

asked Jul 5 at 18:05

-3 votes

0 answers

17 views

AI deep learning code generator for code vectorization [closed]

Python Hello, I have language and code headings consisting of json structure and there are codes under the code headings. Can you help me clean and vectorize the codes? Thank you. Note: I want my ...

Fatih Ceylan

1

asked Jul 5 at 15:56

0 votes

0 answers

18 views

C# English word stemming and lemmatizing using Catalyst - how to do that

I added the Catalyst NuGet package to my C# project to help me lemmatize English words. However, its documentation is not clear and lacks examples. I tried to lemmatize/stem only one word: Catalyst....

Zoltan Hernyak

1,117

asked Jul 5 at 11:35

-1 votes

0 answers

15 views

Checking semantic meaning of 2 texts while considering the order of the texts

I am doing a task related to checking the semantic meaning similarity between 2 texts. There I used BERT sentence-transformers/all-MiniLM-L6-v2 model. Input 1 - "Object moves in uniform ...

Anjana Pathirana

194

asked Jul 5 at 8:25

1 vote

1 answer

43 views

Not able to use END with "add_conditional_edges" in LangGraph

This is my code: import os from dotenv import load_dotenv load_dotenv() from langchain_openai import ChatOpenAI from langgraph.graph import StateGraph, END from langgraph.graph import Graph, ...

Sai3554

13

asked Jul 5 at 7:29

-5 votes

0 answers

34 views

Is there any way to apply NLP on my survey responses of stress [closed]

Is there any way to apply NLP technique on my survey responses of stress. I have survey data for stress. My columns data are like: Stress_Frequency Low_Energy_Frequency \ 0 1 Arts and ...

shikha. agarwal

1

asked Jul 4 at 20:02

-2 votes

0 answers

22 views

HuggingFaceInstructor Embedding giving error? [closed]

I am working on a chatbot to handle FAQs. I have tried the HuggingFaceInstructor embedding, but it has a lot of dependencies and I don't know why my Anaconda prompt is not able to install all the required ...

Abdul Hadi

1

asked Jul 4 at 16:31

1 vote

1 answer

28 views

How to make dynamic API calls based on user input in a Gemini application python nlp?

I'm working on a Gemini application where I need to make dynamic API calls based on user input. Specifically, I want to perform different API requests depending on the user's query. For example, if ...

user14990172

23

asked Jul 4 at 15:11

0 votes

0 answers

46 views

Natural Language Question Answering: How do you train and evaluate using ML.Net

I want to know how to train and evaluate Natural Language Question Answering model using ML.Net. I already have the training working but I got stuck at the Evaluation part where you ask a question and ...

Sol

71

asked Jul 4 at 14:16

0 votes

0 answers

17 views

Why do character-based LSTMs take more time than word-based LSTMs for next-word prediction?

I wanted to train an LSTM model for next-word prediction using word-based and character-based approaches. I used a similar data processing technique for both character-based and word-based. For Character, class ...

Shantanu Nath

373

asked Jul 3 at 11:44

0 votes

0 answers

10 views

Calling ASR model in Bhasini throws - {"code":"something went wrong","message":null,"timestamp":"2024-07-03T06:08:52.407+00:00"}

I am trying to call the ASR model of the Bhasini API to get a transcription of my audio. I tried the following code. import requests import json url = "https://meity-auth.ulcacontrib.org/ulca/apis/v0/...

Vandit Tyagi

1

asked Jul 3 at 6:13

0 votes

0 answers

26 views

Understanding and improving coherence values using Mallet

I am attempting to run an LDA topic model using Mallet. My corpus consists of user comments from news websites. It's a relatively small corpus with approx. 614k words. The first approach I took was to ...

Glorifier

31

asked Jul 2 at 23:22

0 votes

0 answers

17 views

Named Entity Recognition using TFLite on Android

I have a named entity recognition TensorFlow Lite model that I trained in Python and that I would like to run on an Android. To make predictions, the model takes in a dictionary of two tensors, namely ...

cdekalb

1

asked Jul 2 at 19:08

0 votes

0 answers

37 views

GliNER finetuning - no validation loss is logging

I am trying to fine-tune using this notebook: GLiNER/examples/finetune.ipynb at main · urchade/GLiNER (github.com) However, the logs only show 'loss' , which I assume is the training data set loss, ...

andream

33

asked Jul 2 at 12:44

0 votes

0 answers

67 views

UndefinedFile: could not open extension control file "/usr/share/postgresql/14/extension/vector.control": No such file or directory

I'm trying to run postgres along with pgvector but getting this error: UndefinedFile: could not open extension control file "/usr/share/postgresql/14/extension/vector.control": No such file ...

Ayush J.

1

asked Jul 1 at 17:16

0 votes

1 answer

34 views

How can I make my Hugging Face fine-tuned model's config.json file reference a specific revision/commit from the original pretrained model?

I uploaded this model: https://huggingface.co/pamessina/CXRFE, which is a fine-tuned version of this model: https://huggingface.co/microsoft/BiomedVLP-CXR-BERT-specialized Unfortunately, CXR-BERT-...

Pablo Messina

441

asked Jul 1 at 15:35

-2 votes

1 answer

23 views

Hugging Face model with large context window [closed]

I'm looking for a model that accepts input of around 50k characters plus a prompt, and answers a question based on that text. Is there something like that available? Not sure how to find it, new to AI.

repo

764

asked Jun 30 at 19:21

-1 votes

1 answer

49 views

Is English text tokenization possible in C# rather than Python? [closed]

In our software we have to analyze a plain text file. First we should break the text into paragraphs, then into sentences, then into tokens. The final steps (as far as I understand) are stemming and ...

Zoltan Hernyak

1,117

asked Jun 30 at 16:52

0 votes

0 answers

15 views

Spark code for NLP processing of Chinese data

I am using PySpark with jieba for Chinese data. I have a UDF in my code and used a broadcast, but enriching the column is taking hours and hours for 5 million records. I am clueless about what we need to do here to ...

Rajiv Singh

1,018

asked Jun 29 at 9:59

1 vote

1 answer

42 views

Alternative to Receptive field in Transformers and what factors impact it

I have two transformer networks. One with 3 heads per attention and 15 layers in total and second one with 5 heads per layer and 30 layers in total. Given an arbitrary set of documents (2048 tokens ...

CraZyCoDer

421

asked Jun 29 at 4:58

0 votes

0 answers

32 views

"KeyError: 0" when calling model.fit() in Keras

I am training a simple sequential model for NLP in keras. Here is the model architecture: from keras.models import Sequential from keras import layers from keras.layers import Embedding, Flatten, ...

Alex K

113

asked Jun 27 at 16:31

1 vote

0 answers

19 views

spaCy: detect GPE correctly

I have a set of strings where I need to detect the country each belongs to, based on the detected GPE. sentences = [ "I watched TV in germany", "Mediaset ITA canale 5", ...

user3925023

677

asked Jun 27 at 13:44

0 votes

0 answers

23 views

Why isn't tf.keras.layers.TextVectorization accepting standardization=None?

I'm still trying to get this to work (and to learn!) so I am using a tiny corpus. I do some preprocessing on the text in order to get specific bi-gram collocations using nltk (not relevant here but I ...

DS14

131

asked Jun 27 at 10:39

-2 votes

0 answers

23 views

Is there a good pretrained model (spaCy NER or any other) for resume parsing, or is there another way to create a resume parser application?

I have been working on this for the past 25 days. Is there any way or solution for this? If there is a pretrained model I can use, that would also be good, to get at least 80% accuracy for entity extraction from ...

Ganesh Ingale

1

asked Jun 27 at 5:57

0 votes

0 answers

41 views

How to generate output of HuggingFace PEFT model with previous message history as context?

I am trying to generate text from my fine-tuned Llama3 model which uses the PEFT AutoPeftModelForCausalLM library while also passing in previous message history. This is how I am currently generating ...

Avik Malladi

27

asked Jun 27 at 0:02

0 votes

0 answers

18 views

A prepended "Paraphrase" is shown every time after running the code instead of the actual paraphrased sentence

I am creating an Urdu text paraphrasing tool for my semester. I have used the T5 model and fine-tuned it. Now when I'm running this code: import torch from transformers import T5ForConditionalGeneration, ...

Fozan Akbar khan

1

asked Jun 26 at 15:53

0 votes

0 answers

10 views

Understanding time complexity for dynamic indexing

While reading the text, Introduction to Information Retrieval by Manning et al., I came across dynamic indexing (Sec 4.5, Page 79). It consists of two indices, the main index (stored on disk) and an ...

Swaroop

1,249

asked Jun 26 at 15:50

0 votes

0 answers

43 views

Python Rasa train not compatible with M3 Apple

Unable to run "rasa train" in the Mac M3 terminal. It gives the error below. 2024-06-26 17:27:12 INFO rasa.cli.train - Started validating domain and training data... zsh: illegal hardware ...

Shamil

727

asked Jun 26 at 9:28

-1 votes

0 answers

51 views

How to open `gguf` models in Windows?

How to run a file with .gguf extension on a Windows machine? I tried to open the file with PyCharm or Visual, but they did not recognize this file. And, when I tried to run the command to use the ...

THANH HOÀNG

9

asked Jun 26 at 6:45
