I occasionally play with LLMs (large language models). Last time, I tried Simon Willison's llm software to search over my notes. Today I tried amaiya/onprem to ask questions about my notes. The documentation is pretty good, but I made a few changes for my configuration, so I decided to write them down here. I'm running these on my Mac M1, not on my Linux or Windows machines.
First step, get the thing installed:
cd ~/Projects/machine-learning/
mkdir onprem
cd onprem
# need to use python 3.9 through 3.11 for torch support; 3.12 doesn't support it yet
# check the version I have installed
python3 --version
# if python3 isn't a reasonable version, then install or choose a different python
# before doing the next step
python3.11 -m venv venv
source venv/bin/activate
# install onprem itself, which also installs torch
pip install onprem
mkdir data
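Before downloading any models, I like to sanity-check that the install worked. This is a minimal sketch; the MPS check is specific to Apple Silicon and only tells me whether torch can see the GPU:

import torch
import onprem

print(torch.__version__)
print(torch.backends.mps.is_available())  # True if torch can use the M1 GPU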
The main change I'm making is that I want to store the data in ./data instead of the default ~/onprem_data; that's why I made the data directory above. I don't want to store a bunch of things in my home directory.
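To double-check that this actually works, here's a quick sketch that confirms files land in ./data and nothing gets created in the default location (run it after the first model download in the next step):

from pathlib import Path

# the default location should stay empty if the path arguments worked
print("default dir exists:", (Path.home() / "onprem_data").exists())
print("local files:", [p.name for p in Path("data").iterdir()])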
Second step, test it out using the code from their website. Instead of the default LLM(), I set several parameters:
- use_larger will download the 13B model instead of the 7B model; omit this if you only have 8GB of RAM
- n_gpu_layers will tell it to use the GPU, but I can't tell if it's actually any faster
- the path arguments point at my local ./data directory
from onprem import LLM

llm = LLM(use_larger=True, n_gpu_layers=64,
          model_download_path="./data", vectordb_path="./data")
prompt = """Extract the names of people in the supplied sentences. Here is an example:
Sentence: James Gandolfini and Paul Newman were great actors.
People: James Gandolfini, Paul Newman
Sentence: I like Cillian Murphy's acting. Florence Pugh is great, too.
People:"""
saved_output = llm.prompt(prompt)
This should print the answer: Cillian Murphy, Florence Pugh. That tells me the software is installed correctly and the model data downloaded correctly.
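The prompt() call also returned the generated text, so I can save the answer instead of just reading it off the screen. A small sketch, assuming saved_output is a plain string (that's what I got back):

# reuse the model's answer instead of just reading it in the terminal
print(saved_output.strip())
with open("people.txt", "w") as f:
    f.write(saved_output)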
Third step, I fed it my documents. I keep them in Emacs org-mode format, but I converted them to plain text:
mkdir input
cd input
cp ~/Documents/notes/* .
for f in *.org; do pandoc "$f" -o "${f%.org}.txt"; done
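In case pandoc skipped or failed on a file, here's a quick sketch (run inside input/) to confirm every .org note got a matching .txt:

from pathlib import Path

# compare the stems of the org files and the converted text files
org = {p.stem for p in Path(".").glob("*.org")}
txt = {p.stem for p in Path(".").glob("*.txt")}
print("missing conversions:", sorted(org - txt))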
Ok, now that there are files in input/, I ran the Python code again. Previously I used llm.prompt() to ask questions of the LLM directly. This time, following the docs, I used llm.ingest() to load my input files and then llm.ask() to ask questions about them:
from onprem import LLM

llm = LLM(use_larger=True, n_gpu_layers=64,
          model_download_path="./data", vectordb_path="./data")
llm.ingest("./input/")
result = llm.ask("What is carbon dioxide? Remember to only use the provided context.")
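The docs show that ask() returns more than just the answer. If I'm reading the return value correctly (a dict with the answer plus the source documents; treat the key names as assumptions for your version), this prints the answer along with which notes it was grounded on:

# print the answer plus the notes it was pulled from
print(result["answer"])
for doc in result["source_documents"]:
    print(doc.metadata.get("source"))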
That's it! I tried asking a bunch of questions, had mixed results, and then ran out of ideas for what to ask.
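To make trying out questions less tedious, a tiny loop over the same llm object works; this is just a sketch, where an empty line quits:

# minimal question loop over the ingested notes
while True:
    question = input("question> ").strip()
    if not question:
        break
    print(llm.ask(question)["answer"])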
I like onprem's examples page and want to try out more of them. I also want to try out llamafile.