Newest 'large-language-model' Questions

0 votes

0 answers

5 views

Google Canary, Docker and Gemini Nano

Can I run the latest Google Canary with the Gemini Nano model in a Docker container in headless mode and interact with the model via Selenium (execute_script)? If so, how do I do it?

aasdf1xa

13

asked 1 hour ago

-4 votes

0 answers

17 views

Want to retrain my LLM based on user questions and answers on OpenAI

We have created a solution where users can upload their PDFs and ask questions. We have used NodeJS, Langchain, and OpenAI. Currently, the app flow is that we save all the content of the PDF in our vector ...

Muhammad Mudassir

313

asked 19 hours ago

0 votes

0 answers

12 views

Apply a different learning rate for newly introduced tokens in the transformers library

Say I want to introduce a few new tokens into the vocabulary of an existing model, and I want these tokens to have a different learning rate compared to the rest of the model's parameters during ...

Bipolo

73

asked 22 hours ago

0 votes

0 answers

12 views

module 'keras_nlp' has no attribute 'models'

Has anyone else experienced the same error when running it locally? It runs correctly on Colab. module 'keras_nlp' has no attribute 'models'. I tried to install the updated version with pip install -U ...

Mody

1

asked 23 hours ago

0 votes

0 answers

15 views

DSPy: How to get the number of tokens available for the input fields?

This is a cross-post from Issue #1245 of DSPy GitHub Repo. There were no responses in the past week, and I am working on a project with a tight schedule. When running a DSPy module with a given ...

Tom Lin

72

asked yesterday

0 votes

0 answers

13 views

Bus Error and Resource Tracker Warning When Training PyTorch Model on GPU with MPS

I’ve built a vanilla Transformer using PyTorch for machine translation and am encountering issues while trying to train it on an Apple Mac M3 with a 12-core CPU and an 18-core GPU (18GB RAM) ...

Pratheesh Kumar

1

asked yesterday

0 votes

0 answers

23 views

LlamaIndex Pandas Query Engine not returning all rows

I am using LlamaIndex Pandas Query Engine to produce a pandas query that is applied on a dataframe. The query produced is correct; however, the dataframe returned is not. Specifically, a dataframe with ...

prax1telis

307

asked 2 days ago

-1 votes

0 answers

18 views

ReAct Prompting causing LLM Hallucination

Question: How to Prevent Hallucinations in LLM When Using Langchain Tools like Wikipedia with ReAct Prompt Template? I'm currently working with Langchain and using the gemini-1.5-flash model. I've ...

Satvik Jain

11

asked 2 days ago

-2 votes

0 answers

13 views

Using RAG to identify the source of error in a code file [closed]

I am trying to implement a small tool that automatically identifies which part of the code from a set of code files (given as input) is responsible for the error text displayed during execution. The ...

S R

7

asked 2 days ago

0 votes

1 answer

38 views

How to use Google Vertex AI fine tuned model via Node.js

I fine-tuned a model on Google Vertex AI. Before that, I was using regular models with this code (it works): public static async SendMessage(prompt) { const vertexAI = new VertexAI({project: ...

cuneyttyler

1,325

asked 2 days ago

0 votes

0 answers

18 views

Dockerized OpenSearch not returning any hits for queries

I recently tried to move the chatbot project that I am working on over to OpenSearch. At first, my search function was working, but after dockerizing OpenSearch, I've run into the issue where my ...

Frank Nakasako

1

asked 2 days ago

0 votes

0 answers

11 views

Unable to make llama.cpp on M1 Mac

When I try installing llama.cpp, I get the following error: ld: warning: ignoring file '/Users/krishparikh/Projects/LLM/llama.cpp/ggml/src/ggml-metal-embed.o': found architecture 'x86_64', required ...

Krish Parikh

21

asked 2 days ago

0 votes

0 answers

44 views

Issues with irrelevant similarity search results using RAG and LLMs on PubMed Data

Context: I have a tab-separated values (TSV) file containing metadata and abstracts of research papers (between 800-1000 rows). So far, I have been working on code that uses RAG-LLMs to check the ...

Someone_1313

450

asked 2 days ago

0 votes

0 answers

15 views

Llama-3-70B with pipeline cannot generate new tokens (texts)

I have successfully downloaded Llama-3-70B, and when I want to test its "text-generation" ability, it always outputs my prompt and no additional text. Here is my demo code (copied from ...

Martin

11

asked 2 days ago

0 votes

0 answers

18 views

How to Combine Semantic Search with SQL Analytical Queries?

I'm creating an LLM-agent that can provide insights from a complex database. The database includes several columns of different types (datetime, numeric, and text). For simplicity, let's assume I have ...

Pepe Moreno

43

asked 2 days ago

0 votes

0 answers

6 views

How do I persist FAISS indexes?

In the langchain wiki of FAISS, https://python.langchain.com/v0.2/docs/integrations/vectorstores/faiss/, it only talks about saving indexes to files. db.save_local("faiss_index") new_db = ...

Xiao Jing

1

asked 2 days ago

0 votes

0 answers

11 views

Multiple headers Excel loader, LangChain, Excel chatbot

I am creating a question-and-answer app using Excel, and I am using a recursive text splitter. I am able to generate answers for simple Excel files, but the problem is with Excel files with multiple headers like ...

anu

19

asked 2 days ago

1 vote

0 answers

9 views

Convert Quantization to Onnx

I am new and want to try converting models to Onnx format and I have the following issue. I have a model that has been quantized to 4-bit, and then I converted this model to Onnx. My quantized model ...

Toàn Nguyễn Phúc

11

asked Jul 11 at 3:03

-2 votes

0 answers

15 views

Time-series outlier detection using an LLM

I want to perform outlier detection on time-series data using an LLM (the use of an LLM is critical). So basically I have a simple dataframe containing the time-series data (timestamp column, and value ...

Mohamed Sabri Belmadoui

1

asked Jul 10 at 14:03

0 votes

0 answers

23 views

Does anyone know how to run a model using only a .gguf file, without test-LLM-23B-m.safetensors.index.json or test-LLM-23B-m.safetensors?

from auto_gptq import AutoGPTQForCausalLM,BaseQuantizeConfig import torch from transformers import AutoTokenizer def run_model(): try: #Check if CUDA is available and set device ...

goodcoder

5

asked Jul 10 at 13:50

0 votes

0 answers

11 views

Which RAG methods/concepts can I use for a benchmark?

I am writing a practical assignment for my uni. There I have to analyse different RAG methods and compare them. Since I am in my 2nd semester of information systems and I lack experience within the ...

Arian Ott

1

asked Jul 10 at 13:26

0 votes

0 answers

32 views

AI Stops Abruptly with Langchain and CTransformers

I am facing an issue with my AI application in Python. I am using the chainlit library along with langchain and CTransformers to generate AI responses. However, the AI often stops abruptly before ...

Memo

1

asked Jul 10 at 13:15

-8 votes

0 answers

28 views

Is Hugging Face Spaces completely free? [closed]

I just started to learn Hugging Face and I don't know how it works. If I use the Spaces page on Hugging Face while using the other AI models on the website, do I have to pay any money after using this ...

CrystaL

1

asked Jul 10 at 13:04

-1 votes

0 answers

7 views

Issue with SQLDatabase.from_uri in langchain when including materialized views

This is not working for views (it works for tables in the same database): db = SQLDatabase.from_uri(rds_uri,include_tables=["mv_complete_dur_info_update_new"],view_support=True,schema="...

kalpesh patil

1

asked Jul 10 at 12:01

0 votes

0 answers

14 views

CrewAI tool is caching my response. How do I disable it?

I have created a custom tool in CrewAI to request information from a human. I created this tool as I found it extremely useful in human-in-the-loop situations. But I'm facing one issue. Every ...

Rohan Prasad

1

asked Jul 10 at 10:55

0 votes

0 answers

30 views

The relationship between chunk_size, context length and embedding length in a Langchain RAG Framework

Hi everyone. Currently I am working on a Langchain RAG framework using Ollama. I have a question about the chunk size in the Document Splitter. I have decided to use the qwen2:72b model as both embedding ...

Joesf.Albert

149

asked Jul 10 at 8:34

-1 votes

0 answers

16 views

ChatBot for PDFs [closed]

I just got into my first internship and the company wants to build a program that is fed daily PDFs with hundreds of pages and is capable of answering questions based on them. The program is ...

Daniel Alpeñes De Lucca

1

asked Jul 10 at 8:13

0 votes

0 answers

16 views

spacy-llm & SpanCat for address parsing

I'm currently developing a project to standardize and correct a dataset of inconsistently formatted addresses using spaCy-LLM and spaCy.SpanCat.v3. The goal is to train a model on examples of ...

Hammad Javaid

1

asked Jul 10 at 7:30

-1 votes

0 answers

16 views

AssertionError: Unexpected kwargs: {'use_flash_attention_2': False}

I'm using EvolvingLMMs-Lab/lmms-eval to evaluate LLaVa model after running accelerate launch --num_processes=8 -m lmms_eval --model llava --model_args pretrained="liuhaotian/llava-v1.5-7b" ...

ahmad

41

asked Jul 9 at 15:36

0 votes

0 answers

21 views

SQL query is not correctly generated using langchain, nlp and llm

I have created an application which takes an input question and converts it into a SQL query using langchain, llm and nlp, but sometimes it creates a wrong query, especially in the beginning. Following is the ...

kalpesh patil

1

asked Jul 9 at 12:57

1 vote

0 answers

51 views

Passing Additional Information in LangChain abatch Calls

Given an abatch call for a LangChain chain, I need to pass additional information, beyond just the content, to the function so that this information is available in the callback, specifically in the ...

TantrixRobotBoy

631

asked Jul 9 at 12:53

0 votes

0 answers

19 views

ModuleNotFoundError when importing HuggingFaceLLM from llama_index.core.llms.huggingface

I’m trying to import HuggingFaceLLM using the following line of code: from llama_index.core.llms.huggingface import HuggingFaceLLM I know that llamaindex keeps updating, and previously this import ...

Nick

343

asked Jul 9 at 12:25

-1 votes

0 answers

67 views

OpenWebUI + Pipelines (w/ langchain hopefully)

I'm currently at the last step of https://github.com/open-webui/pipelines, and I tried to start the server, but it shows the error in the image below. I'm not sure whether the server is already running or ...

Ryan Lutz

1

asked Jul 9 at 1:03

-1 votes

0 answers

24 views

Trying to use Llama 3 on VertexAI is throwing 400 Bad Request but the error doesn't make sense [closed]

I am trying to use Llama 3 on VertexAI to process an image and extract data from the image and put it into JSON format. I have this working with Gemini in a Jupyter Notebook hosted on Vertex, but the ...

Carlos Muentes

63

asked Jul 8 at 19:22

0 votes

0 answers

10 views

Converting PDFs to Markdown for Higher Quality Embeddings with Langchain.js

I am working on RAG LLM projects with Langchain.js using Node.js. Most of the data I retrieve are PDFs and a bit of JSON. For higher quality, I would like to convert my PDFs into Markdown before ...

Uiyoung Kim

1

asked Jul 8 at 13:12

0 votes

0 answers

18 views

I am getting this error while building a RAG model

I am getting this error while building a RAG model while using qwen2 model instead of the default llama2 which chroma uses. My code: from langchain_community.embeddings import OllamaEmbeddings from ...

Dakshi R

11

asked Jul 8 at 12:29

-2 votes

0 answers

35 views

GPT4ALL not working after installation on Windows Version 10.0.22631.3737 Ryzen5 processor [closed]

I want to run the GPT4All chatbot locally on my laptop. I have cloned the github repository in the directory I made for GPT4All. All the files and folders are downloaded and installed properly. The ...

Nandini Dasgupta

1

asked Jul 7 at 19:31

0 votes

0 answers

44 views

How to fix this error: KeyError: 'model.embed_tokens.weight'

This is the detailed error: Traceback (most recent call last): File "/home/cyq/zxc/SmartEdit/train/DS_MLLMSD11_train.py", line 769, in <module> train() File "/home/cyq/zxc/...

hshsh

11

asked Jul 6 at 19:41

0 votes

0 answers

20 views

Glue job with Bedrock not running in parallel

I am writing a Glue job to process a pyspark dataframe using Bedrock which was recently added to boto3. The job will get sentiment from a text field in the dataframe using one of the LLMs in Bedrock, ...

ddd

4,969

asked Jul 6 at 15:39

0 votes

1 answer

44 views

Size mismatch for embed_out.weight: copying a param with shape torch.Size([0]) from checkpoint - Huggingface PyTorch

I want to finetune an LLM. I am able to finetune the LLM successfully, but when I reload the model after saving, I get an error. Below is the code: import argparse import numpy as np import torch from datasets ...

Masthan

685

asked Jul 5 at 18:29

-2 votes

0 answers

30 views

A chatbot that can call APIs [closed]

I want to create an LLM that can call custom-made APIs. We have already created several APIs, and the LLM should be able to make all types of HTTP requests (GET, POST, PUT, DELETE). The LLM should ...

Fawaz

11

asked Jul 5 at 15:36

-1 votes

0 answers

22 views

I'm trying to make a chat with PDFs app and it shows an error: "None" type object is not callable [closed]

This is the code I've been working on, and the error is most probably related to some function call, but I can't figure out which one: import streamlit as st from dotenv import load_dotenv from PyPDF2 ...

s1nghhhhh

1

asked Jul 5 at 13:35

0 votes

0 answers

32 views

How to configure llama-cpp-python to use more vCPUs for running LLM

I am using llama-cpp-python to run Mistral-7B-Instruct-v0.3-GGUF on an Azure Virtual Machine. I've tested the model Mistral-7B-Instruct-v0.3.Q4_K_M.gguf and Mistral-7B-Instruct-v0.3.fp16.gguf in the ...

blank

41

asked Jul 5 at 11:54

0 votes

0 answers

38 views

Concurrent/parallel requests with vLLM

My question might be a bit basic, but I’m new to all of this and eager to learn. I have built an app with FastAPI. Previously I used asyncio to handle multiple requests to the LLM, but with each new ...

Volodymyr Bondarenko

1

asked Jul 5 at 6:46

0 votes

0 answers

9 views

triton inference server - How to prevent echoing inputs?

I'm running triton inference server and it gives me generated texts followed by input prompts. How can I configure the server to return only the generated texts? Ref: https://github.com/triton-inference-...

Dorr

647

asked Jul 5 at 0:06

0 votes

1 answer

46 views

NextJS v14 architecture to call an LLM

In a NextJS v14 application, I need to call a proxy API that interacts with an LLM. The API returns an NDJSON response, which needs to be processed using the ndjsonStream function from the can-ndjson-...

aestheticsData

677

asked Jul 4 at 20:52

0 votes

0 answers

30 views

Error: Python setup.py egg_info did not run successfully. While installing intel-extension-for-transformers

I am following the tutorial https://intel.github.io/intel-extension-for-pytorch/llm/llama3/xpu/ to run Llama 3 models locally, however I am getting the following error while setting up the environment ...

Rohit Reddy

1

asked Jul 4 at 13:07

0 votes

0 answers

29 views

OpenAI API query for an input document

I can build and chat with an OpenAI model, but I wish to upload a document and ask questions based on that doc. I checked some discussions like this. My code looks like: from langchain_community....

linpingta

2,522

asked Jul 4 at 9:09

0 votes

0 answers

24 views

How to implement ray server with multiple gpus?

I'm trying to implement a multi-gpu local server with ray and vllm. I have uploaded my full code and commands to this github repository. In short, I want to serve a big model that requires 2 gpus, but ...

Boyuan Chen

33

asked Jul 3 at 13:19

1 vote

0 answers

23 views

ChromaDB terminates Flask without exception

I'm creating an API with Flask. The other side will send me a file and I will save it to a Chroma database on my side. Chroma.add terminates my program without any exception. When I save a smaller ...

StaEx_G

13

asked Jul 3 at 2:45
