nanochat

Train chatbot models and explore an open-source LLM pipeline. Build AI chatbots from scratch with clear code.

LLM Training Tool

What is nanochat?

nanochat lets you train your own ChatGPT-style model. This open-source project makes it easy to build AI chatbots from scratch. Train, fine-tune, and deploy your model with this complete LLM training pipeline.


Key Features

  • Rust Tokenizer.

    A fast custom tokenizer written in Rust that uses Byte Pair Encoding (BPE). With a 65,536-token vocabulary, it compresses text to about 4.8 characters per token, so the model covers more text in the same context window and in each training step.

  • FineWeb-EDU Pretraining.

    The base model is pretrained on FineWeb-EDU, a dataset of high-quality educational web text. Pretraining gives the model a broad understanding of many topics and teaches it to generate coherent, relevant text.

  • Supervised Fine-Tuning (SFT).

    After pretraining, supervised fine-tuning (SFT) adapts the base model to specific tasks. Conversational data improves its ability to hold a chat, and mathematical reasoning examples strengthen its analytical skills.

  • Reinforcement Learning (GRPO).

    An optional reinforcement learning stage can further improve the model on specific tasks. It uses a simplified version of Group Relative Policy Optimization (GRPO).

  • KV Cache Inference.

    The inference engine uses a KV cache: it keeps the attention keys and values of tokens it has already processed in memory, so generating each new token does not recompute them. This trades extra memory for faster generation. The engine also includes a Python sandbox.

  • ChatGPT-Like Interface.

    The platform includes command-line tools for quick execution and a web interface for chatting with the model.
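
To make the tokenizer feature more concrete, here is a minimal Python sketch of byte-level BPE and of the characters-per-token metric quoted above. It is a toy illustration, not nanochat's Rust implementation: the function names (`train_bpe`, `merge`, `encode`, `chars_per_token`) are hypothetical, and a real tokenizer also pre-splits text and handles special tokens.

```python
from collections import Counter


def merge(ids, pair, new_id):
    """Replace every occurrence of `pair` in `ids` with the single id `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text: str, num_merges: int) -> list[tuple[int, int]]:
    """Learn up to `num_merges` merge rules from the UTF-8 bytes of `text`."""
    ids = list(text.encode("utf-8"))
    merges = []
    next_id = 256  # ids 0..255 are reserved for raw bytes
    for _ in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break  # no pair repeats, so another merge would not compress
        ids = merge(ids, pair, next_id)
        merges.append(pair)
        next_id += 1
    return merges


def encode(text: str, merges: list[tuple[int, int]]) -> list[int]:
    """Tokenize `text` by applying the merge rules in the order they were learned."""
    ids = list(text.encode("utf-8"))
    for offset, pair in enumerate(merges):
        ids = merge(ids, pair, 256 + offset)
    return ids


def chars_per_token(text: str, merges: list[tuple[int, int]]) -> float:
    """Compression ratio: characters of input per emitted token."""
    return len(text) / len(encode(text, merges))


if __name__ == "__main__":
    sample = "low lower lowest " * 20
    rules = train_bpe(sample, num_merges=20)
    print(len(rules), "merges learned")
    print("chars per token:", chars_per_token(sample, rules))
```

With no merges, ASCII text encodes to exactly one token per character. Each learned merge raises the ratio on text that resembles the training sample, which is how a larger vocabulary can reach several characters per token.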
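
The idea behind KV caching can be sketched in a few lines of plain Python. This is a single-head toy under simplifying assumptions, not nanochat's engine: the `attend` function and `KVCache` class are hypothetical names, and the query, key, and value vectors are passed in directly rather than computed from a model's hidden states.

```python
import math


def attend(q, keys, values):
    """Softmax attention of one query vector over lists of key and value vectors."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total for d in range(dim)]


class KVCache:
    """Keeps the keys and values of already-processed tokens for one attention head."""

    def __init__(self):
        self.keys = []
        self.values = []

    def step(self, q, k, v):
        # Only the new token's key and value are added; older ones are reused.
        self.keys.append(k)
        self.values.append(v)
        return attend(q, self.keys, self.values)


if __name__ == "__main__":
    # Each tuple is (query, key, value) for one generated token.
    tokens = [
        ([1.0, 0.0], [0.5, 0.2], [1.0, 2.0]),
        ([0.0, 1.0], [0.1, 0.9], [3.0, 4.0]),
        ([1.0, 1.0], [0.7, 0.7], [5.0, 6.0]),
    ]
    cache = KVCache()
    for q, k, v in tokens:
        print(cache.step(q, k, v))
```

Without the cache, every step would have to rebuild the keys and values of all earlier tokens before attending. With it, each step adds one key and one value, at the cost of memory that grows with the length of the conversation.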

Frequently asked questions about nanochat

  • How much does it cost to train a nanochat model?

    A quick run costs around $100, and better models cost up to about $1,000. The price depends on model size and cloud GPU prices.

  • How does nanochat compare to GPT-2?

    A nanochat model that costs about $1,000 to train outperforms GPT-2 on benchmarks, even though it costs less to train than GPT-2 did.

  • What hardware do you need to run nanochat?

    It runs best on 8xH100 GPUs with 80GB of VRAM each. It can be adapted to a single GPU or to 8xA100s, but you need to change some settings.

  • Can you train nanochat for free?

    The code is free, and you can use Google Colab for small models. Good models, however, need paid cloud computing.
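
The cost figures above follow from simple arithmetic: number of GPUs times hourly price times training hours. The sketch below shows the calculation. The rate of $3 per GPU-hour is an assumed, illustrative number and not a quoted price, and the function names are hypothetical.

```python
def training_cost(num_gpus: int, price_per_gpu_hour: float, hours: float) -> float:
    """Total rental cost in dollars for a training run."""
    return num_gpus * price_per_gpu_hour * hours


def hours_for_budget(budget: float, num_gpus: int, price_per_gpu_hour: float) -> float:
    """How many hours of training a given budget buys."""
    return budget / (num_gpus * price_per_gpu_hour)


if __name__ == "__main__":
    ASSUMED_PRICE = 3.0  # dollars per GPU-hour; an assumption, check your provider

    # A 4-hour run on 8 GPUs at the assumed rate lands near the $100 figure.
    print(training_cost(num_gpus=8, price_per_gpu_hour=ASSUMED_PRICE, hours=4))

    # At the same rate, a $1,000 budget buys a bit under two days on 8 GPUs.
    print(round(hours_for_budget(1000, num_gpus=8, price_per_gpu_hour=ASSUMED_PRICE), 1))
```

Because the cost scales linearly in all three factors, a cheaper GPU rate or a smaller model (fewer hours) lowers the bill proportionally.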
