Questions tagged [large-language-model]

A general tag for large language model (LLM)-related subjects. Please ALWAYS use the more specific tags if available (GPT variants, PaLM, LLaMa, BLOOM, Claude, etc.)

0 votes
0 answers
5 views

Google Canary, Docker and Gemini Nano

Can I run the latest Google Canary with the Gemini Nano model in a Docker container in headless mode and interact with the model via Selenium (execute_script)? If so, how do I do it?
aasdf1xa
-4 votes
0 answers
17 views

Want to retrain my LLM based on user questions and answers on OpenAI

We have created a solution where users can upload their PDFs and ask questions. We have used NodeJS, Langchain, and OpenAI. Currently, the app flow is that we save all the content of the PDF in our vector ...
Muhammad Mudassir
0 votes
0 answers
12 views

Apply a different learning rate for introduced tokens in the transformers library

Say I want to introduce a few new tokens into the vocabulary of an existing model, and I want these tokens to have a different learning rate compared to the rest of the model's parameters during ...
Bipolo
  • 73
0 votes
0 answers
12 views

module 'keras_nlp' has no attribute 'models'

Has anyone else experienced the same error when running it locally? It runs correctly on Colab. module 'keras_nlp' has no attribute 'models'. I tried to install the updated version with pip install -U ...
Mody
  • 1
0 votes
0 answers
15 views

DSPy: How to get the number of tokens available for the input fields?

This is a cross-post from Issue #1245 of the DSPy GitHub repo. There were no responses in the past week, and I am working on a project with a tight schedule. When running a DSPy module with a given ...
Tom Lin
  • 72
0 votes
0 answers
13 views

Bus Error and Resource Tracker Warning When Training PyTorch Model on GPU with MPS

I’ve built a vanilla Transformer using PyTorch for machine translation and am encountering issues while trying to train it on an Apple Mac M3 with a 12-core CPU and an 18-core GPU (18GB RAM) ...
Pratheesh Kumar
0 votes
0 answers
23 views

LlamaIndex Pandas Query Engine not returning all rows

I am using the LlamaIndex Pandas Query Engine to produce a pandas query that is applied to a dataframe. The query produced is correct; however, the dataframe returned is not. Specifically, a dataframe with ...
prax1telis
-1 votes
0 answers
18 views

ReAct Prompting Causing LLM Hallucination

Question: How to Prevent Hallucinations in an LLM When Using Langchain Tools like Wikipedia with a ReAct Prompt Template? I'm currently working with Langchain and using the gemini-1.5-flash model. I've ...
Satvik Jain
-2 votes
0 answers
13 views

Using RAG to identify the source of error in a code file [closed]

I am trying to implement a small tool that automatically identifies which part of the code from a set of code files (given as input) is responsible for the error text displayed during execution. The ...
S R
  • 7
0 votes
1 answer
38 views

How to use Google Vertex AI fine tuned model via Node.js

I fine-tuned a model on Google Vertex AI. Before that, I was using regular models with this code (it works): public static async SendMessage(prompt) { const vertexAI = new VertexAI({project: ...
cuneyttyler
  • 1,325
0 votes
0 answers
18 views

Dockerized OpenSearch not returning any hits for queries

I recently tried to move the chatbot project that I am working on over to OpenSearch. At first, my search function was working, but after dockerizing OpenSearch, I've run into the issue where my ...
Frank Nakasako
0 votes
0 answers
11 views

Unable to make llama.cpp on M1 Mac

When I try installing llama.cpp, I get the following error: ld: warning: ignoring file '/Users/krishparikh/Projects/LLM/llama.cpp/ggml/src/ggml-metal-embed.o': found architecture 'x86_64', required ...
Krish Parikh
0 votes
0 answers
44 views

Issues with irrelevant similarity search results using RAG and LLMs on PubMed Data

Context: I have a tab-separated values (TSV) file containing metadata and abstracts of research papers (between 800 and 1000 rows). So far, I have been working on code that uses RAG-LLMs to check the ...
Someone_1313
0 votes
0 answers
15 views

Llama-3-70B with pipeline cannot generate new tokens (texts)

I have successfully downloaded Llama-3-70B, and when I want to test its "text-generation" ability, it always outputs my prompt and no additional text. Here is my demo code (copied from ...
Martin
  • 11
0 votes
0 answers
18 views

How to Combine Semantic Search with SQL Analytical Queries?

I'm creating an LLM-agent that can provide insights from a complex database. The database includes several columns of different types (datetime, numeric, and text). For simplicity, let's assume I have ...
Pepe Moreno
0 votes
0 answers
6 views

How do I persist FAISS indexes?

The LangChain documentation for FAISS, https://python.langchain.com/v0.2/docs/integrations/vectorstores/faiss/, only talks about saving indexes to files: db.save_local("faiss_index") new_db = ...
Xiao Jing
0 votes
0 answers
11 views

Multiple headers Excel loader, LangChain, Excel chatbot

I am creating a question-and-answer system using Excel, and I am using a recursive text splitter. I am able to generate answers for simple Excel files, but the problem is with Excel files that have multiple headers like ...
anu
  • 19
1 vote
0 answers
9 views

Convert a Quantized Model to ONNX

I am new and want to try converting models to ONNX format, and I have the following issue. I have a model that has been quantized to 4-bit, and then I converted this model to ONNX. My quantized model ...
Toàn Nguyễn Phúc
-2 votes
0 answers
15 views

Time-series outlier detection using an LLM

I want to perform outlier detection on time-series data using an LLM (the use of an LLM is critical). Basically, I have a simple dataframe containing the time-series data (timestamp column, and value ...
Mohamed Sabri Belmadoui
0 votes
0 answers
23 views

Does anyone know how to run a model using only a .gguf file, without test-LLM-23B-m.safetensors.index.json or test-LLM-23B-m.safetensors?

from auto_gptq import AutoGPTQForCausalLM,BaseQuantizeConfig import torch from transformers import AutoTokenizer def run_model():     try:         #Check if CUDA is available and set device ...
goodcoder
0 votes
0 answers
11 views

Which RAG methods/concepts can I use for a benchmark?

I am writing a practical assignment for my uni. There I have to analyse different RAG methods and compare them. Since I am in my 2nd semester of information systems and I lack experience within the ...
Arian Ott
0 votes
0 answers
32 views

AI Stops Abruptly with Langchain and CTransformers

I am facing an issue with my AI application in Python. I am using the chainlit library along with langchain and CTransformers to generate AI responses. However, the AI often stops abruptly before ...
Memo
  • 1
-8 votes
0 answers
28 views

Is Hugging Face Spaces completely free? [closed]

I just started to learn Hugging Face and I don't know how it works. If I use the Spaces page on Hugging Face while using the other AI models on the website, should I pay any money after using this ...
CrystaL
-1 votes
0 answers
7 views

Issue with SQLDatabase.from_uri in langchain when including materialized views

This is not working for views (it works for tables in the same database): db = SQLDatabase.from_uri(rds_uri,include_tables=["mv_complete_dur_info_update_new"],view_support=True,schema="...
kalpesh patil
0 votes
0 answers
14 views

CrewAI tool is caching my response. How do I disable it?

I have created a custom tool in CrewAI to request information from a human. I created this tool as I found it extremely useful for human-in-the-loop situations. But I'm facing one issue. Every ...
Rohan Prasad
0 votes
0 answers
30 views

The relationship between chunk_size, context length and embedding length in a Langchain RAG Framework

I am currently working on a Langchain RAG framework using Ollama. I have a question about the chunk size in the Document Splitter. I have decided to use the qwen2:72b model as both embedding ...
Joesf.Albert
-1 votes
0 answers
16 views

ChatBot for PDFs [closed]

I just got into my first internship, and the company wants to build a program that is fed daily PDFs with hundreds of pages and is capable of answering questions based on them. The program is ...
Daniel Alpeñes De Lucca
0 votes
0 answers
16 views

spacy-llm & SpanCat for address parsing

I'm currently developing a project to standardize and correct a dataset of inconsistently formatted addresses using spaCy-LLM and spaCy.SpanCat.v3. The goal is to train a model on examples of ...
Hammad Javaid
-1 votes
0 answers
16 views

AssertionError: Unexpected kwargs: {'use_flash_attention_2': False}

I'm using EvolvingLMMs-Lab/lmms-eval to evaluate the LLaVa model after running accelerate launch --num_processes=8 -m lmms_eval --model llava --model_args pretrained="liuhaotian/llava-v1.5-7b" ...
ahmad
  • 41
0 votes
0 answers
21 views

SQL query is not correctly generated using langchain, nlp and llm

I have created an application that takes an input question and converts it into a SQL query using LangChain, an LLM, and NLP, but sometimes it creates a wrong query, especially in the beginning. Following is the ...
kalpesh patil
1 vote
0 answers
51 views

Passing Additional Information in LangChain abatch Calls

Given an abatch call for a LangChain chain, I need to pass additional information, beyond just the content, to the function so that this information is available in the callback, specifically in the ...
TantrixRobotBoy
0 votes
0 answers
19 views

ModuleNotFoundError when importing HuggingFaceLLM from llama_index.core.llms.huggingface

I’m trying to import HuggingFaceLLM using the following line of code: from llama_index.core.llms.huggingface import HuggingFaceLLM I know that LlamaIndex keeps updating, and previously this import ...
Nick
  • 343
-1 votes
0 answers
67 views

OpenWebUI + Pipelines (w/ langchain hopefully)

I'm currently at the last step of https://github.com/open-webui/pipelines, and I tried to start the server, but it shows the error in the image below. I'm not sure if the server is already running nor ...
Ryan Lutz
-1 votes
0 answers
24 views

Trying to use Llama 3 on VertexAI is throwing 400 Bad Request but the error doesn't make sense [closed]

I am trying to use Llama 3 on VertexAI to process an image and extract data from the image and put it into JSON format. I have this working with Gemini in a Jupyter Notebook hosted on Vertex, but the ...
Carlos Muentes
0 votes
0 answers
10 views

Converting PDFs to Markdown for Higher Quality Embeddings with Langchain.js

I am working on RAG LLM projects with Langchain.js using Node.js. Most of the data I retrieve are PDFs and a bit of JSON. For higher quality, I would like to convert my PDFs into Markdown before ...
Uiyoung Kim
0 votes
0 answers
18 views

I am getting this error while building a RAG model

I am getting this error while building a RAG model using the qwen2 model instead of the default llama2 that Chroma uses. My code: from langchain_community.embeddings import OllamaEmbeddings from ...
Dakshi R
-2 votes
0 answers
35 views

GPT4ALL not working after installation on Windows Version 10.0.22631.3737 Ryzen5 processor [closed]

I want to run the GPT4All chatbot locally on my laptop. I have cloned the GitHub repository into the directory I made for GPT4All. All the files and folders are downloaded and installed properly. The ...
Nandini Dasgupta
0 votes
0 answers
44 views

How to fix this error: KeyError: 'model.embed_tokens.weight'

This is the detailed error: Traceback (most recent call last): File "/home/cyq/zxc/SmartEdit/train/DS_MLLMSD11_train.py", line 769, in <module> train() File "/home/cyq/zxc/...
hshsh
  • 11
0 votes
0 answers
20 views

Glue job with Bedrock not running in parallel

I am writing a Glue job to process a pyspark dataframe using Bedrock which was recently added to boto3. The job will get sentiment from a text field in the dataframe using one of the LLMs in Bedrock, ...
ddd
  • 4,969
0 votes
1 answer
44 views

Size mismatch for embed_out.weight: copying a param with shape torch.Size([0]) from checkpoint - Huggingface PyTorch

I want to fine-tune an LLM. I am able to fine-tune the LLM successfully, but when I reload the model after saving, I get an error. Below is the code: import argparse import numpy as np import torch from datasets ...
Masthan
  • 685
-2 votes
0 answers
30 views

A chatbot that can call APIs [closed]

I want to create an LLM that can call custom-made APIs. We have already created several APIs, and the LLM should be able to make all types of HTTP requests (GET, POST, PUT, DELETE). The LLM should ...
Fawaz
  • 11
-1 votes
0 answers
22 views

I'm trying to make a chat-with-PDFs app and it shows an error: 'NoneType' object is not callable [closed]

This is the code I've been working on, and the error is most probably related to some function call, but I can't figure out which one: import streamlit as st from dotenv import load_dotenv from PyPDF2 ...
s1nghhhhh
0 votes
0 answers
32 views

How to configure llama-cpp-python to use more vCPUs for running LLM

I am using llama-cpp-python to run Mistral-7B-Instruct-v0.3-GGUF on an Azure Virtual Machine. I've tested the model Mistral-7B-Instruct-v0.3.Q4_K_M.gguf and Mistral-7B-Instruct-v0.3.fp16.gguf in the ...
blank
  • 41
0 votes
0 answers
38 views

Concurrent/parallel requests with vLLM

My question might be a bit basic, but I’m new to all of this and eager to learn. I have built an app with FastAPI. Previously I used asyncio to handle multiple requests to the LLM, but with each new ...
Volodymyr Bondarenko
0 votes
0 answers
9 views

Triton Inference Server - How to prevent echoing inputs?

I'm running Triton Inference Server and it gives me generated texts followed by input prompts. How can I configure the server to return only the generated text? Ref: https://github.com/triton-inference-...
Dorr
  • 647
0 votes
1 answer
46 views

NextJS v14 architecture to call a LLM

In a NextJS v14 application, I need to call a proxy API that interacts with an LLM. The API returns an NDJSON response, which needs to be processed using the ndjsonStream function from the can-ndjson-...
aestheticsData
0 votes
0 answers
30 views

Error: Python setup.py egg_info did not run successfully. While installing intel-extension-for-transformers

I am following the tutorial https://intel.github.io/intel-extension-for-pytorch/llm/llama3/xpu/ to run Llama 3 models locally, however I am getting the following error while setting up the environment ...
Rohit Reddy
0 votes
0 answers
29 views

OpenAI API query for an input document

I can build and chat with an OpenAI model, but I want to upload a document and ask questions based on that document. I checked some discussions like this. My code looks like: from langchain_community....
linpingta
  • 2,522
0 votes
0 answers
24 views

How to implement a Ray server with multiple GPUs?

I'm trying to implement a multi-gpu local server with ray and vllm. I have uploaded my full code and commands to this github repository. In short, I want to serve a big model that requires 2 gpus, but ...
Boyuan Chen
1 vote
0 answers
23 views

ChromaDB terminates Flask without exception

I'm creating an API with Flask. The other side will send me a file and I will save it to a Chroma database on my side. Chroma.add terminates my program without any exception. When I save a smaller ...
StaEx_G
  • 13
