Learning Outcomes
In this post, we will learn about:
- What Ollama is
- How to install Llama models using the Ollama framework
- Running Llama models
- Different ways to access the deployed models
- Accessing the deployed models in a web browser with the Page Assist extension
- Accessing the Llama model from Python via the HTTP API
- Accessing the Llama model using the LangChain library
1. What is Ollama?
Ollama is an open-source tool/framework that lets users run large language models (LLMs) locally on their own hardware, from ordinary PCs to edge devices such as the Raspberry Pi.
2. How to install it?
Installers are available for macOS, Linux, and Windows; visit https://ollama.com/download for platform-specific instructions.
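For example, on Linux the download page offers a one-line install script, and the setup can then be verified from the terminal:
Install: curl -fsSL https://ollama.com/install.sh | sh
Verify: ollama --version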
3. Running Llama 3.2
Four versions of the Llama 3.2 model are available: 1B, 3B, 11B, and 90B, where 'B' stands for billion. For example, the 1B model has roughly 1 billion parameters. The 1B and 3B versions are text-only models, whereas 11B and 90B are multimodal (text and images).
Run 1B model: ollama run llama3.2:1b
Run 3B model (the default tag): ollama run llama3.2
Either command downloads the model on first use and then opens an interactive prompt, so we can chat with the model directly in the terminal.
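A few other commands are handy at this point (assuming a default Ollama setup):
List downloaded models: ollama list
Download a model without starting a chat: ollama pull llama3.2:1b
Exit the interactive session: type /bye at the prompt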
4. Access the deployed models using a web browser
Page Assist is an open-source browser extension that provides a sidebar and web UI for your local AI model. It allows you to interact with your model from any webpage.
5. Access the Llama model using the HTTP API in Python
While it is running, Ollama serves an HTTP API on http://127.0.0.1:11434; its /api/generate endpoint returns a completion for a given prompt. From Python, it can be called with the requests library:
import requests

# Request payload for Ollama's /api/generate endpoint
data = {
    "model": "llama3.2:1b",
    "prompt": "What is Newton's law of motion? Answer in short.",
    "stream": False,  # return the whole answer as a single JSON object
}

# Send the request to the local Ollama server
r = requests.post('http://127.0.0.1:11434/api/generate', json=data)
response_data = r.json()

# Print the user prompt and the bot's reply
print(f'\nUser: {data["prompt"]}')
bot_response = response_data['response']
print(f'\nBot: {bot_response}')
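The call above blocks until the full answer is ready. The same endpoint can also stream the answer as it is generated; below is a minimal sketch, assuming the default server address and that llama3.2:1b has already been pulled. With "stream" set to true, the server sends one JSON object per line:

import json
import requests

data = {
    "model": "llama3.2:1b",
    "prompt": "What is Newton's law of motion? Answer in short.",
    "stream": True,  # stream partial results as they are generated
}

with requests.post('http://127.0.0.1:11434/api/generate', json=data, stream=True) as r:
    for line in r.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        # Each chunk carries a fragment of the answer in "response";
        # the final chunk has "done" set to true.
        print(chunk.get("response", ""), end="", flush=True)
print()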
6. Access the Llama model using the LangChain library
Install the dependent library: pip install langchain-ollama
from langchain_ollama.llms import OllamaLLM
from langchain_core.prompts import ChatPromptTemplate

query = "What is Newton's law of motion?"

# Prompt template with placeholders for the instruction and the user query
template = """Instruction: {instruction}
Query: {query}
"""
prompt = ChatPromptTemplate.from_template(template)
model = OllamaLLM(model="llama3.2")  # use llama3.2 as the LLM

# Compose the prompt and the model into a chain (LCEL pipe syntax)
chain = prompt | model
bot_response = chain.invoke({
    "instruction": "Answer the question. If you cannot answer the question, answer with \"I don't know.\"",
    "query": query,
})

print(f'\nUser: {query}')
print(f'\nBot: {bot_response}')
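langchain-ollama also ships a chat-model wrapper. Below is a minimal sketch using ChatOllama with role-based messages, assuming the same local server and model; this interface is the natural fit for multi-turn conversations:

from langchain_ollama import ChatOllama

chat_model = ChatOllama(model="llama3.2")  # talks to the local Ollama server

# Messages are (role, content) pairs; invoke() returns an AIMessage
messages = [
    ("system", "Answer the question. If you cannot answer it, say \"I don't know.\""),
    ("human", "What is Newton's law of motion?"),
]
reply = chat_model.invoke(messages)
print(reply.content)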