Liutong

Liutong is a high-performance inference platform that delivers state-of-the-art language models at a fraction of the cost of commercial API providers. Built on a custom Rust-based inference engine inspired by vLLM and SGLang, Liutong offers blazing-fast token generation with full OpenAI API compatibility.

Why Liutong?

  • Affordable inference — Self-hosted infrastructure keeps costs dramatically lower than commercial API providers, with OpenAI fallback for guaranteed availability.
  • OpenAI-compatible — Drop-in replacement for the OpenAI API. Use the same SDKs, the same code, the same workflows.
  • Purpose-built models — Four model families covering chat, reasoning, media generation, and embeddings.
  • Rust performance — Our inference engine is written in Rust for maximum throughput and minimal latency.

Models

Model             Category    Use Case
crimson-falcon-4  Chat        General-purpose text generation and conversation
indigo-owl-4      Reasoning   Complex reasoning, math, and multi-step problem solving
amber-phoenix-4   Media       Image and video generation
jade-mole-4       Embeddings  Text embeddings for search and retrieval
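For the embeddings model, the typical use is retrieval: embed a query and a set of documents, then rank documents by cosine similarity. With Liutong's OpenAI compatibility, the vectors would presumably come from `client.embeddings.create(model="jade-mole-4", input=texts)`, though this page does not document that endpoint. The sketch below shows only the ranking step, with small placeholder vectors standing in for real embeddings:

```python
import math

# Placeholder vectors stand in for embeddings that would come from
# a call like client.embeddings.create(model="jade-mole-4", input=texts)
# (assumed OpenAI-compatible endpoint, not confirmed by this page).

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query_vec = [0.1, 0.9, 0.2]
doc_vecs = {
    "doc_a": [0.1, 0.8, 0.3],  # similar direction to the query
    "doc_b": [0.9, 0.1, 0.0],  # mostly orthogonal to the query
}

# Rank documents by similarity to the query, highest first.
ranked = sorted(doc_vecs, key=lambda d: cosine_similarity(query_vec, doc_vecs[d]), reverse=True)
print(ranked)  # ['doc_a', 'doc_b']
```

The same ranking logic applies regardless of which model produced the vectors; only the vector dimensionality changes.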

Quick Example

from openai import OpenAI

# The standard OpenAI SDK works unchanged; only base_url and api_key
# point at Liutong.
client = OpenAI(
    base_url="https://api.liutong.llby.org/v1",
    api_key="lt_your_api_key",  # your Liutong API key
)

response = client.chat.completions.create(
    model="crimson-falcon-4",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
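The call above returns the full completion at once. The OpenAI SDK also supports passing `stream=True` to `chat.completions.create`, which yields chunks as tokens are generated; whether Liutong's endpoint supports streaming is an assumption here, not stated on this page. The sketch below shows how streamed deltas reassemble into the final message, with hypothetical chunk payloads standing in for real server events:

```python
# Hypothetical streamed chunks, mirroring the shape of OpenAI-compatible
# streaming responses: each chunk carries an incremental "delta".
chunks = [
    {"choices": [{"delta": {"content": "Hello"}}]},
    {"choices": [{"delta": {"content": ", "}}]},
    {"choices": [{"delta": {"content": "world!"}}]},
    {"choices": [{"delta": {}}]},  # final chunk carries no content
]

# Concatenate the deltas in arrival order to recover the full message.
text = "".join(c["choices"][0]["delta"].get("content", "") for c in chunks)
print(text)  # Hello, world!
```

With the real SDK, the loop body would read `chunk.choices[0].delta.content` from each chunk of the stream and print it with `end=""` for incremental display.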

Ready to get started? Head to the Quickstart guide.