March 17, 2024

Running Mamba Models on Oobabooga's Text-Generation-Webui

What is Mamba?

Before we dive into using Mamba models with the text-generation WebUI, let’s first understand what Mamba is. Mamba is a new type of language model architecture that offers an alternative to the widely-used Transformer models. It’s called a “linear time, state-space model,” which refers to its key properties:

Linear Time Scaling: Unlike Transformers, whose attention cost scales quadratically with the sequence length, Mamba models scale linearly. This means they can handle much longer sequences more efficiently.

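State-Space Formulation: Mamba uses a state-space formulation, which is a mathematical framework for modeling dynamic systems. The model carries a fixed-size hidden state from token to token, and its selection mechanism lets that state keep or discard information depending on the input, which helps it track long-range dependencies.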
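
To make the two properties above concrete, here is a minimal sketch of a plain linear state-space recurrence in pure Python. It is not Mamba’s actual implementation: the function name ssm_scan and the parameters a, b, and c are made up for illustration, and the real model makes these parameters depend on the input and runs a hardware-aware parallel scan. What the sketch does show is why the cost is linear: each token triggers one fixed-size state update, no matter how long the sequence already is.

```python
def ssm_scan(xs, a, b, c):
    """Run a diagonal linear state-space recurrence over a scalar sequence.

    Illustrative only (hypothetical helper, not part of any library):
        h_t = a * h_{t-1} + b * x_t   (elementwise over the state)
        y_t = sum(c * h_t)
    """
    state = [0.0] * len(a)  # fixed-size hidden state, independent of sequence length
    ys = []
    for x in xs:  # one constant-cost update per token -> linear time overall
        state = [a_i * h_i + b_i * x for a_i, h_i, b_i in zip(a, state, b)]
        ys.append(sum(c_i * h_i for c_i, h_i in zip(c, state)))
    return ys


# Feed a single impulse and watch the state decay over the following steps.
outputs = ssm_scan([1.0, 0.0, 0.0, 0.0], a=[0.5, 0.9], b=[1.0, 1.0], c=[1.0, 1.0])
print(outputs)  # approximately [2.0, 1.4, 1.06, 0.854]
```

The impulse at the first step is still visible in the output three steps later, carried entirely by the hidden state rather than by looking back at earlier tokens.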

The end result, according to the Mamba authors, is that Mamba models can match or exceed the quality of similarly sized Transformers while being significantly more computationally efficient, especially on long sequences.

Prerequisites

This tutorial assumes you already have Oobabooga’s text-generation WebUI set up and running on your system. If not, head over to the project’s GitHub repository and follow the installation instructions.

Installing the Latest Transformers Version

Since Mamba support is a relatively new addition to the Transformers library, you’ll need to install the latest version from the GitHub repository:

1. Open a terminal and navigate to your text-generation-webui directory.
2. Activate the virtual environment by running the appropriate command for your operating system (e.g., ./cmd_linux.sh on Linux).
3. Install the latest Transformers version: pip install git+https://github.com/huggingface/transformers@main
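
After the install finishes, a quick check from inside the same environment can confirm that the Transformers build you now have actually includes Mamba. The snippet below is a rough sketch: the helper name mamba_supported is made up for this post, and it only tests whether the MambaForCausalLM class is exposed, not whether any particular model will load.

```python
import importlib
import importlib.util


def mamba_supported():
    """Return True if the installed Transformers exposes MambaForCausalLM.

    Hypothetical helper for this tutorial; it returns False when
    Transformers is missing or cannot be imported.
    """
    if importlib.util.find_spec("transformers") is None:
        return False
    try:
        transformers = importlib.import_module("transformers")
        return hasattr(transformers, "MambaForCausalLM")
    except Exception:
        # Treat a broken install the same as a missing one.
        return False


if mamba_supported():
    print("This Transformers install exposes MambaForCausalLM.")
else:
    print("MambaForCausalLM not found; try reinstalling Transformers from main.")
```

Run it with the Python interpreter from the environment you activated in step 2, otherwise you may be checking a different Transformers install.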

Loading a Mamba Model

With the latest Transformers installed, the text-generation-webui will now automatically recognize and run Mamba models. You can test this by downloading a Mamba model from the Hugging Face Hub, such as the state-spaces/mamba-2.8b-hf model.

Note: Many older Mamba models (especially those created before March 2024) may not be compatible with the Transformers Mamba runtime, as support for this is still relatively new. Over time, more Mamba models will be released with full compatibility.

Once you’ve downloaded a compatible Mamba model, you can load it into the text-generation-webui just like any other model.
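
If a model refuses to load in the UI, it can help to try it directly through Transformers to see whether the problem lies with the model or with the WebUI. The following is a minimal sketch that bypasses the WebUI entirely; the function name generate_with_mamba is made up for this post, and calling it will download the model weights from the Hub if they are not already cached, which takes a while for a 2.8B model.

```python
def generate_with_mamba(prompt, model_id="state-spaces/mamba-2.8b-hf", max_new_tokens=20):
    """Load a Mamba checkpoint with Transformers and generate a short completion.

    Hypothetical helper for this tutorial. Assumes torch and a Mamba-capable
    Transformers build are installed in the active environment.
    """
    # Imported lazily so that merely defining the function has no side effects.
    from transformers import AutoTokenizer, MambaForCausalLM

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = MambaForCausalLM.from_pretrained(model_id)

    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]


# Example call (commented out because it downloads several GB of weights):
# print(generate_with_mamba("Hey how are you doing?"))
```

If this works but the WebUI still fails, the issue is more likely in the WebUI’s environment than in the model files.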

That’s it! You’re now ready to experiment with Mamba models and experience their impressive performance, especially on longer sequences.
