Liutong
Fast, affordable LLM inference with a fully OpenAI-compatible API. Powered by a custom Rust engine.
Four model families, one API
Drop-in replacement for the OpenAI API. Just change the base URL.
crimson-falcon-4
General-purpose chat model. Text generation, conversation, code, and summarization.
indigo-owl-4
Reasoning model. Multi-step problem solving, math, logic, and code analysis.
amber-phoenix-4
Media generation. High-quality images and video from text prompts.
jade-mole-4
Embeddings model. Semantic search, clustering, classification, and RAG.
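Because the API mirrors OpenAI's wire format, each model family above is reached through the matching OpenAI route. As a sketch, here is an embeddings request for jade-mole-4 built with only the Python standard library; the base URL and key are placeholder assumptions, and no request is actually sent.

```python
import json
import urllib.request

BASE_URL = "https://api.liutong.example/v1"  # placeholder; use your real endpoint
API_KEY = "YOUR_LIUTONG_API_KEY"             # placeholder key

def build_embeddings_request(texts: list[str]) -> urllib.request.Request:
    """Assemble an OpenAI-style POST /embeddings request for jade-mole-4."""
    body = json.dumps({"model": "jade-mole-4", "input": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/embeddings",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_embeddings_request(["semantic search", "clustering"])
# urllib.request.urlopen(req) would return the standard OpenAI embeddings JSON.
```

The same pattern applies to the other families: swap the route (`/chat/completions`, `/images/generations`) and the model name.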
Rust-powered engine
Custom inference engine built in Rust, inspired by vLLM and SGLang. Optimized for throughput and latency.
OpenAI-compatible
Works with the official OpenAI Python and Node.js SDKs. No custom client libraries needed.
Affordable inference
Self-hosted infrastructure keeps costs low. Automatic fallback to upstream providers ensures reliability.
Ready to get started?
Explore the docs, or jump straight to the quickstart guide.