Outlines

This will help you getting started with Outlines LLM. For detailed documentation of all Outlines features and configurations head to the API reference.

Outlines is a library for constrained language generation. It allows you to use large language models (LLMs) with various backends while applying constraints to the generated output.

Overview

Integration details

Class	Package	Local	Serializable	JS support	Package downloads	Package latest
Outlines	langchain-community	✅	beta	❌

Setup

To access Outlines models you'll need to have an internet connection to download the model weights from huggingface. Depending on the backend you need to install the required dependencies (see Outlines docs)

Credentials

There is no built-in auth mechanism for Outlines.

Installation

The LangChain Outlines integration lives in the langchain-community package and requires the outlines library:

%pip install -qU langchain-community outlines

Instantiation

Now we can instantiate our model object and generate chat completions:

from langchain_community.llms import Outlines

# For use with llamacpp backend
model = Outlines(model="microsoft/Phi-3-mini-4k-instruct", backend="llamacpp")

# For use with vllm backend (not available on Mac)
model = Outlines(model="microsoft/Phi-3-mini-4k-instruct", backend="vllm")

# For use with mlxlm backend (only available on Mac)
model = Outlines(model="microsoft/Phi-3-mini-4k-instruct", backend="mlxlm")

# For use with huggingface transformers backend
model = Outlines(
    model="microsoft/Phi-3-mini-4k-instruct"
)  # defaults to backend="transformers"

API Reference:Outlines

Invocation

model.invoke("Hello how are you?")

Chaining

from langchain_core.prompts import PromptTemplate

prompt = PromptTemplate.from_template("How to say {input} in {output_language}:\n")

chain = prompt | model
chain.invoke(
    {
        "output_language": "German",
        "input": "I love programming.",
    }
)

API Reference:PromptTemplate

Streaming

Outlines supports streaming of tokens:

for chunk in model.stream("Count to 10 in French:"):
    print(chunk, end="", flush=True)

Constrained Generation

Outlines allows you to apply various constraints to the generated output:

Regex Constraint

model.regex = r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"
response = model.invoke("What is the IP address of Google's DNS server?")

response

Type Constraints

model.type_constraints = int
response = model.invoke("What is the answer to life, the universe, and everything?")

JSON Schema

from pydantic import BaseModel

class Person(BaseModel):
    name: str

model.json_schema = Person
response = model.invoke("Who is the author of LangChain?")
person = Person.model_validate_json(response)

person

Grammar Constraint

model.grammar = """
?start: expression
?expression: term (("+" | "-") term)
?term: factor (("" | "/") factor)
?factor: NUMBER | "-" factor | "(" expression ")"
%import common.NUMBER
%import common.WS
%ignore WS
"""
response = model.invoke("Give me a complex arithmetic expression:")

response

API reference

For detailed documentation of all ChatOutlines features and configurations head to the API reference: https://python.langchain.com/api_reference/community/chat_models/langchain_community.chat_models.outlines.ChatOutlines.html

Outlines Documentation:

https://dottxt-ai.github.io/outlines/latest/

LLM conceptual guide
LLM how-to guides

Overview​

Integration details​

Setup​

Credentials​

Installation​

Instantiation​

Invocation​

Chaining​

Streaming​

Constrained Generation​

Regex Constraint​

Type Constraints​

JSON Schema​

Grammar Constraint​

API reference​

Outlines Documentation:​

Related​

Was this page helpful?