@basetenlabs

Baseten

Machine learning infrastructure for developers

Welcome to Baseten

Baseten is an AI infrastructure platform. We combine applied performance research, distributed multi-cloud infrastructure, and developer tooling to run models of all modalities in production.

Get started:

  • Deploy an open-source model in two clicks from the model library.
  • Read our docs to package and serve a fine-tuned or custom model.

Pinned

  1. truss Public

    The simplest way to serve AI/ML models in production

    Python 1.1k 100

  2. truss-examples Public

    Examples of models deployable with Truss

    Python 223 60

Repositories

Showing 10 of 94 repositories
  • truss Public

    The simplest way to serve AI/ML models in production

    Python 1,143 MIT 100 10 58 Updated Apr 21, 2026
  • prime-rl Public Forked from PrimeIntellect-ai/prime-rl

    Async RL Training at Scale

    Python 1 Apache-2.0 267 0 16 Updated Apr 21, 2026
  • genai-bench Public Forked from sgl-project/genai-bench

    Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serving systems.

    Python 2 MIT 51 0 8 Updated Apr 21, 2026
  • vllm-omni Public Forked from vllm-project/vllm-omni

    A framework for efficient model inference with omni-modality models

    Python 0 Apache-2.0 835 0 0 Updated Apr 21, 2026
  • buildkit Public Forked from moby/buildkit

    concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit

    Go 0 Apache-2.0 1,503 0 10 Updated Apr 20, 2026
  • tmp-animations Public

    a tiny gh pages for misc documents

    HTML 0 0 0 0 Updated Apr 20, 2026
  • ms-swift Public Forked from modelscope/ms-swift

    Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...) (AAAI 2025).

    Python 1 Apache-2.0 1,378 0 0 Updated Apr 20, 2026
  • Model-Optimizer Public Forked from NVIDIA/Model-Optimizer

    A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM, TensorRT, vLLM, etc. to optimize inference speed.

    Python 1 Apache-2.0 368 0 6 Updated Apr 20, 2026
  • Megatron-LM Public Forked from NVIDIA/Megatron-LM

    Ongoing research training transformer models at scale

    Python 1 3,932 0 0 Updated Apr 19, 2026
  • langchain-baseten Public

    Python 0 MIT 1 0 7 Updated Apr 17, 2026
