Questions tagged [large-language-model]
A general tag for large language model (LLM)-related subjects. Please ALWAYS use the more specific tags if available (GPT variants, PaLM, LLaMa, BLOOM, Claude, etc.).
1,538 questions
0 votes, 0 answers, 5 views
Google Canary, Docker and Gemini Nano
Can I run the latest Google Canary with the Gemini Nano model in a Docker container in headless mode and interact with the model via Selenium (execute_script)? If so, how do I do it?
-4 votes, 0 answers, 17 views
Want to retrain my LLM based on user questions and answers on OpenAI
We have created a solution where users can upload their PDFs and ask questions. We have used NodeJS, Langchain, and OpenAI. Currently, the app flow is that we save all the content of the PDF in our vector ...
0 votes, 0 answers, 12 views
Apply a different learning rate for newly introduced tokens in the transformers library
Say I want to introduce a few new tokens into the vocabulary of an existing model, and I want these tokens to have a different learning rate compared to the rest of the model's parameters during ...
0 votes, 0 answers, 12 views
module 'keras_nlp' has no attribute 'models'
Has anyone else experienced the same error when running it locally? It runs correctly on Colab.
module 'keras_nlp' has no attribute 'models'
I tried to install the updated version of
pip install -U ...
0 votes, 0 answers, 15 views
DSPy: How to get the number of tokens available for the input fields?
This is a cross-post from Issue #1245 of the DSPy GitHub repo. There were no responses in the past week, and I am working on a project with a tight schedule.
When running a DSPy module with a given ...
0 votes, 0 answers, 13 views
Bus Error and Resource Tracker Warning When Training PyTorch Model on GPU with MPS
I’ve built a vanilla Transformer using PyTorch for machine translation and am encountering issues while trying to train it on an Apple Mac M3 with a 12-core CPU and an 18-core GPU (18GB RAM) ...
0 votes, 0 answers, 23 views
LlamaIndex Pandas Query Engine not returning all rows
I am using the LlamaIndex Pandas Query Engine to produce a pandas query that is applied to a dataframe. The query produced is correct; however, the dataframe returned is not. Specifically, a dataframe with ...
-1 votes, 0 answers, 18 views
ReAct Prompting Causing LLM Hallucination
Question: How to Prevent Hallucinations in LLM When Using Langchain Tools like Wikipedia with ReAct Prompt Template?
I'm currently working with Langchain and using the gemini-1.5-flash model. I've ...
-2 votes, 0 answers, 13 views
Using RAG to identify the source of error in a code file [closed]
I am trying to implement a small tool that automatically identifies which part of the code from a set of code files (given as input) is responsible for the error text displayed during execution.
The ...
0 votes, 1 answer, 38 views
How to use Google Vertex AI fine tuned model via Node.js
I fine-tuned a model on Google Vertex AI. Before that, I was using regular models with this code (it works):
public static async SendMessage(prompt) {
const vertexAI = new VertexAI({project: ...
0 votes, 0 answers, 18 views
Dockerized OpenSearch not returning any hits for queries
I recently tried to move the chatbot project that I am working on over to OpenSearch. At first, my search function was working, but after dockerizing OpenSearch, I've run into the issue where my ...
0 votes, 0 answers, 11 views
Unable to make llama.cpp on M1 Mac
When I try installing llama.cpp, I get the following error:
ld: warning: ignoring file '/Users/krishparikh/Projects/LLM/llama.cpp/ggml/src/ggml-metal-embed.o': found architecture 'x86_64', required ...
0 votes, 0 answers, 44 views
Issues with irrelevant similarity search results using RAG and LLMs on PubMed Data
Context
I have a tab-separated values (TSV) file containing metadata and abstracts of research papers (between 800 and 1000 rows). So far, I have been working on code that uses RAG-LLMs to check the ...
0 votes, 0 answers, 15 views
Llama-3-70B with pipeline cannot generate new tokens (text)
I have successfully downloaded Llama-3-70B, and when I test its "text-generation" ability, it always outputs only my prompt and no additional text.
Here is my demo code (copied from ...
0 votes, 0 answers, 18 views
How to Combine Semantic Search with SQL Analytical Queries?
I'm creating an LLM-agent that can provide insights from a complex database. The database includes several columns of different types (datetime, numeric, and text).
For simplicity, let's assume I have ...
0 votes, 0 answers, 6 views
How do I persist FAISS indexes?
In the langchain wiki of FAISS, https://python.langchain.com/v0.2/docs/integrations/vectorstores/faiss/, it only talks about saving indexes to files.
db.save_local("faiss_index")
new_db = ...
0 votes, 0 answers, 11 views
Multiple headers Excel loader, LangChain, Excel chatbot
I am creating a question-and-answer app using Excel, and I am using a recursive text splitter. I am able to generate answers for simple Excel files, but the problem is with Excel files with multiple headers
like ...
1 vote, 0 answers, 9 views
Convert Quantization to ONNX
I am new and want to try converting models to ONNX format, and I have the following issue. I have a model that has been quantized to 4-bit, and then I converted this model to ONNX. My quantized model ...
-2 votes, 0 answers, 15 views
Time-series outlier detection using an LLM
I want to perform outlier detection on time-series data using an LLM (the use of an LLM is critical). So basically I have a simple dataframe containing the time-series data (timestamp column, and value ...
0 votes, 0 answers, 23 views
Does anyone know how to run the model using only the .gguf file, without test-LLM-23B-m.safetensors.index.json and test-LLM-23B-m.safetensors?
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
import torch
from transformers import AutoTokenizer
def run_model():
    try:
        # Check if CUDA is available and set device ...
0 votes, 0 answers, 11 views
Which RAG methods/concepts can I use for a benchmark?
I am writing a practical assignment for my uni. There I have to analyse different RAG methods and compare them. Since I am in my 2nd semester of information systems and I lack experience within the ...
0 votes, 0 answers, 32 views
AI Stops Abruptly with Langchain and CTransformers
I am facing an issue with my AI application in Python. I am using the chainlit library along with langchain and CTransformers to generate AI responses. However, the AI often stops abruptly before ...
-8 votes, 0 answers, 28 views
Is Hugging Face Spaces completely free? [closed]
I just started to learn Hugging Face and I don't know how it works. If I use the Spaces page on Hugging Face while using the other AI models on the website, do I have to pay any money after using this ...
-1 votes, 0 answers, 7 views
Issue with SQLDatabase.from_uri in langchain when including materialized views
This is not working for views (it works for tables in the same database):
db = SQLDatabase.from_uri(rds_uri,include_tables=["mv_complete_dur_info_update_new"],view_support=True,schema="...
0 votes, 0 answers, 14 views
CrewAI tool is caching my response. How do I disable it?
I have created a custom tool in CrewAI to ask a human for information. I created this tool because I found it extremely useful for human-in-the-loop situations.
But I'm facing one issue. Every ...
0 votes, 0 answers, 30 views
The relationship between chunk_size, context length and embedding length in a Langchain RAG Framework
Currently I am working on a Langchain RAG framework using Ollama. I have a question about the chunk size in the Document Splitter.
I have decided to use the qwen2:72b model as both embedding ...
-1 votes, 0 answers, 16 views
ChatBot for PDFs [closed]
I just got into my first internship and the company wants to build a program that is fed daily PDFs with hundreds of pages and is capable of answering questions based on them. The program is ...
0 votes, 0 answers, 16 views
spacy-llm & SpanCat for address parsing
I'm currently developing a project to standardize and correct a dataset of inconsistently formatted addresses using spaCy-LLM and spaCy.SpanCat.v3. The goal is to train a model on examples of ...
-1 votes, 0 answers, 16 views
AssertionError: Unexpected kwargs: {'use_flash_attention_2': False}
I'm using EvolvingLMMs-Lab/lmms-eval to evaluate the LLaVA model.
After running accelerate launch --num_processes=8 -m lmms_eval --model llava --model_args pretrained="liuhaotian/llava-v1.5-7b" ...
0 votes, 0 answers, 21 views
SQL query is not correctly generated using LangChain, NLP and LLM
I have created an application which takes an input question and converts it into a SQL query using LangChain, LLM and NLP, but sometimes it creates a wrong query, especially in the beginning. Following is the ...
1 vote, 0 answers, 51 views
Passing Additional Information in LangChain abatch Calls
Given an abatch call for a LangChain chain, I need to pass additional information, beyond just the content, to the function so that this information is available in the callback, specifically in the ...
0 votes, 0 answers, 19 views
ModuleNotFoundError when importing HuggingFaceLLM from llama_index.core.llms.huggingface
I’m trying to import HuggingFaceLLM using the following line of code:
from llama_index.core.llms.huggingface import HuggingFaceLLM
I know that llamaindex keeps updating, and previously this import ...
-1 votes, 0 answers, 67 views
OpenWebUI + Pipelines (w/ langchain hopefully)
I'm currently at the last step of https://github.com/open-webui/pipelines, and I tried to start the server, but it shows the error in the image below. I'm not sure if the server is already running nor ...
-1 votes, 0 answers, 24 views
Trying to use Llama 3 on VertexAI is throwing 400 Bad Request but the error doesn't make sense [closed]
I am trying to use Llama 3 on VertexAI to process an image and extract data from the image and put it into JSON format. I have this working with Gemini in a Jupyter Notebook hosted on Vertex, but the ...
0 votes, 0 answers, 10 views
Converting PDFs to Markdown for Higher Quality Embeddings with Langchain.js
I am working on RAG LLM projects with Langchain.js using Node.js. Most of the data I retrieve are PDFs and a bit of JSON.
For higher quality, I would like to convert my PDFs into Markdown before ...
0 votes, 0 answers, 18 views
I am getting this error while building a RAG model
I am getting this error while building a RAG model using the qwen2 model instead of the default llama2 that Chroma uses.
My code:
from langchain_community.embeddings import OllamaEmbeddings
from ...
-2 votes, 0 answers, 35 views
GPT4ALL not working after installation on Windows Version 10.0.22631.3737 Ryzen5 processor [closed]
I want to run the GPT4All chatbot locally on my laptop. I have cloned the GitHub repository into the directory I made for GPT4All. All the files and folders are downloaded and installed properly. The ...
0 votes, 0 answers, 44 views
How to fix this error: KeyError: 'model.embed_tokens.weight'
This is the detailed error:
Traceback (most recent call last):
File "/home/cyq/zxc/SmartEdit/train/DS_MLLMSD11_train.py", line 769, in <module>
train()
File "/home/cyq/zxc/...
0 votes, 0 answers, 20 views
Glue job with Bedrock not running in parallel
I am writing a Glue job to process a pyspark dataframe using Bedrock which was recently added to boto3. The job will get sentiment from a text field in the dataframe using one of the LLMs in Bedrock, ...
0 votes, 1 answer, 44 views
Size mismatch for embed_out.weight: copying a param with shape torch.Size([0]) from checkpoint - Huggingface PyTorch
I want to fine-tune an LLM. I am able to fine-tune it successfully, but when I reload the model after saving, I get an error. Below is the code:
import argparse
import numpy as np
import torch
from datasets ...
-2 votes, 0 answers, 30 views
A chatbot that can call APIs [closed]
I want to create an LLM that can call custom-made APIs. We have already created several APIs, and the LLM should be able to make all types of HTTP requests (GET, POST, PUT, DELETE). The LLM should ...
-1 votes, 0 answers, 22 views
I'm trying to make a chat-with-PDFs app and it shows an error: 'NoneType' object is not callable [closed]
This is the code I've been working on. The error is most probably related to some function call, but I can't figure out which one.
import streamlit as st
from dotenv import load_dotenv
from PyPDF2 ...
0 votes, 0 answers, 32 views
How to configure llama-cpp-python to use more vCPUs for running LLM
I am using llama-cpp-python to run Mistral-7B-Instruct-v0.3-GGUF on an Azure Virtual Machine.
I've tested the model Mistral-7B-Instruct-v0.3.Q4_K_M.gguf and Mistral-7B-Instruct-v0.3.fp16.gguf in the ...
0 votes, 0 answers, 38 views
Concurrent/parallel requests with vLLM
My question might be a bit basic, but I’m new to all of this and eager to learn.
I have built an app with FastAPI. Previously I used an asyncio method to handle multiple requests to the LLM, but with each new ...
0 votes, 0 answers, 9 views
Triton Inference Server - How to prevent echoing inputs?
I'm running Triton Inference Server and it gives me generated texts followed by input prompts.
How can I configure the server to return only the generated texts?
Ref: https://github.com/triton-inference-...
0 votes, 1 answer, 46 views
NextJS v14 architecture to call an LLM
In a NextJS v14 application, I need to call a proxy API that interacts with an LLM. The API returns an NDJSON response, which needs to be processed using the ndjsonStream function from the can-ndjson-...
0 votes, 0 answers, 30 views
Error: Python setup.py egg_info did not run successfully. While installing intel-extension-for-transformers
I am following the tutorial https://intel.github.io/intel-extension-for-pytorch/llm/llama3/xpu/ to run Llama 3 models locally, however I am getting the following error while setting up the environment ...
0 votes, 0 answers, 29 views
OpenAI API query for an input document
I was able to build and chat with an OpenAI model, but I want to upload a document and ask questions based on that document.
I checked some discussions like this.
My code looks like this:
from langchain_community....
0 votes, 0 answers, 24 views
How to implement a Ray server with multiple GPUs?
I'm trying to implement a multi-GPU local server with Ray and vLLM. I have uploaded my full code and commands to this GitHub repository. In short, I want to serve a big model that requires 2 GPUs, but ...
1 vote, 0 answers, 23 views
ChromaDB terminates Flask without exception
I'm creating an API with Flask. The other side will send me a file and I will save it to a Chroma database on my side. Chroma.add terminates my program without any exception. When I save a smaller ...