"robot asks a question holding a book" (image from Stable Diffusion)

I occasionally play with LLMs (large language models). Last time I tried Simon Willison's llm software to search over my notes. Today I tried amaiya/onprem to ask questions about my notes. The documentation is pretty good, but I made a few changes for my configuration, so I decided to write them down here. I'm running all of this on my M1 Mac, not on my Linux or Windows machines.

First step, get the thing installed:

cd ~/Projects/machine-learning/
mkdir onprem
cd onprem

# need Python 3.9 through 3.11; torch doesn't support 3.12 yet
# check the version I have installed
python3 --version
# if python3 isn't a reasonable version, then install or choose a different python
# before doing the next step

python3.11 -m venv venv
source venv/bin/activate

# install onprem itself, which also installs torch
pip install onprem
mkdir data

The main change I'm making is storing the data in ./data instead of the default ~/onprem_data, which is why I made that directory. I don't want to scatter things across my home directory.
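
Before downloading gigabytes of model weights, it's worth a quick sanity check that the install worked and that torch can see the M1 GPU at all. This check is my own addition, not from the onprem docs:

python
import torch
from onprem import LLM  # just confirming the import works
print(torch.backends.mps.is_available())  # True means torch can use the M1 GPU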

Second step, test it out using the code from their website. Instead of the default LLM(), I set several parameters:

  1. use_larger=True downloads the 13B model instead of the 7B one; omit this if you only have 8GB of RAM
  2. n_gpu_layers=64 tells it to use the GPU, though I can't tell whether it's actually any faster
  3. the two path arguments, model_download_path and vectordb_path, point at my local data directory
python
from onprem import LLM
llm = LLM(use_larger=True, n_gpu_layers=64, 
          model_download_path="./data", vectordb_path="./data")
prompt = """Extract the names of people in the supplied sentences. Here is an example:
Sentence: James Gandolfini and Paul Newman were great actors.
People:
James Gandolfini, Paul Newman
Sentence:
I like Cillian Murphy's acting. Florence Pugh is great, too.
People:"""
saved_output = llm.prompt(prompt)
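# in the version I used, prompt() streamed the answer to the terminal and
# also returned it as a string, so the result can be inspected again later
print(saved_output)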

This should print the answer "Cillian Murphy, Florence Pugh". That tells me the software is successfully installed and the model weights downloaded correctly.

Third step, I fed it my documents. I keep them in Emacs Org-mode format, so I converted them to plain text first:

mkdir input
cd input
cp ~/Documents/notes/* .
for f in *.org; do pandoc "$f" -o "${f%.org}.txt"; done
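
If you'd rather avoid the shell substitution, here's a rough Python equivalent. It's my own sketch, not from the docs; it assumes pandoc is on the PATH and that it's run from the onprem directory rather than from inside input/:

python
from pathlib import Path
import subprocess

# convert each .org file under input/ to a .txt file next to it
for org in Path("input").glob("*.org"):
    subprocess.run(["pandoc", str(org), "-o", str(org.with_suffix(".txt"))],
                   check=True)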

Ok, now that there are files in input/, I ran the llm code again. Previously I used llm.prompt() to ask questions of the LLM directly. This time, following the docs, I used llm.ingest() to load my input files and then llm.ask() to ask questions about them:

python
from onprem import LLM
llm = LLM(use_larger=True, n_gpu_layers=64,
          model_download_path="./data", vectordb_path="./data")
llm.ingest("./input/")
result = llm.ask("What is carbon dioxide? Remember to only use the provided context.")
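# ask() returns a dict in the version I used; the generated answer is under
# "answer" and the retrieved source chunks under "source_documents"
print(result["answer"])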

That's it! I tried asking a bunch of questions, had mixed results, and then ran out of ideas for what to ask.

I like onprem's examples page and want to try out more of them. I also want to try out llamafile.
