Run multiple LLMs on your machine and hot-swap between them as
needed. llama-swap works with any OpenAI API-compatible server, so
you can switch models without restarting your applications.

Built in Go for performance and simplicity, llama-swap has zero
dependencies and is easy to set up. You can get started in minutes
with just one binary and one configuration file.
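
As a rough illustration of what switching models can look like from the client side, the sketch below builds two OpenAI-style chat completion requests that differ only in their `model` field. It assumes the proxy picks the backend from that field, as OpenAI-compatible APIs conventionally do. The address `http://localhost:8080` and the names `model-a` and `model-b` are placeholders, not values taken from this project; substitute whatever your own configuration defines. The requests are built but not sent, so the example runs without a server.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// chatMessage and chatRequest mirror the minimal fields of an
// OpenAI-style chat completions payload.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// encodeChatRequest returns the JSON body for a single-turn chat request.
func encodeChatRequest(model, prompt string) ([]byte, error) {
	return json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
}

// newChatRequest builds (but does not send) a POST request to the
// chat completions endpoint under baseURL.
func newChatRequest(baseURL, model, prompt string) (*http.Request, error) {
	body, err := encodeChatRequest(model, prompt)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/v1/chat/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Assumed address; replace with wherever your proxy listens.
	const baseURL = "http://localhost:8080"

	// Hypothetical model names; only this field changes between requests.
	for _, model := range []string{"model-a", "model-b"} {
		req, err := newChatRequest(baseURL, model, "Hello")
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(req.Method, req.URL.String(), "model:", model)
		// To actually send it: resp, err := http.DefaultClient.Do(req)
	}
}
```

The application keeps a single base URL throughout; under the assumption above, changing the `model` string is the only client-side step needed to reach a different model.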
