This is not an official Google product
Minja is a minimalistic reimplementation of the Jinja templating engine to integrate in/with C++ LLM projects (it's used in llama.cpp and GPT4All).
It is not general purpose: it includes just what’s needed for actual chat templates (very limited set of filters, tests and language features). Users with different needs should look at third-party alternatives such as Jinja2Cpp, Jinja2CppLight, or inja (none of which we endorse).
Warning
TL;DR: use of Minja is at your own risk, and the risks are plenty! See Security & Privacy section below.
Goals:

- Support each and every major LLM found on HuggingFace
  - See `MODEL_IDS` in tests/CMakeLists.txt for the list of models currently supported
- Easy to integrate in/with projects such as llama.cpp or gemma.cpp:
  - Header-only
  - C++17
  - Only depends on nlohmann::json (no Boost)
  - Small codebase (currently 2.5k LoC), easy to understand
- Decent performance compared to Python.
- Address glaring prompt-injection risks in current Jinja chat templating practices (see Security & Privacy below)
Non-goals:

- Additional features from Jinja that aren't used by the template(s) of any major LLM (no feature creep!)
  - Please don't submit PRs with such features; they will unfortunately be rejected.
- Full Jinja compliance (neither in syntax nor in filters / tests / globals)
This library is header-only: just copy the header(s) you need, make sure your compiler supports C++17, and put nlohmann::json's json.hpp in your include path.
See API in minja/minja.hpp and minja/chat-template.hpp (experimental).
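As an illustrative sketch of the integration steps above, a minimal CMake setup could look like the following. The project name, vendored header path, and the use of `find_package` for nlohmann::json are assumptions about your build setup, not part of Minja itself:

```cmake
cmake_minimum_required(VERSION 3.14)
project(minja_demo CXX)

# Minja requires C++17
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

add_executable(demo main.cpp)

# Assumes you've vendored the minja headers under third_party/minja/include
target_include_directories(demo PRIVATE third_party/minja/include)

# nlohmann::json, e.g. installed system-wide or via a package manager
find_package(nlohmann_json REQUIRED)
target_link_libraries(demo PRIVATE nlohmann_json::nlohmann_json)
```

Alternatively, skip CMake entirely and pass the two include paths straight to your compiler with `-std=c++17 -I`.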
For raw Jinja templating (see examples/raw.cpp):
```cpp
#include <minja.hpp>

#include <iostream>

using json = nlohmann::ordered_json;

int main() {
    auto tmpl = minja::Parser::parse("Hello, {{ location }}!", /* options= */ {});
    auto context = minja::Context::make(minja::Value(json {
        {"location", "World"},
    }));
    auto result = tmpl->render(context);
    std::cout << result << std::endl;
}
```

To apply a template to a JSON array of messages and tools in the HuggingFace standard (see examples/chat-template.cpp):
```cpp
#include <chat-template.hpp>

#include <iostream>

using json = nlohmann::ordered_json;

int main() {
    minja::chat_template tmpl(
        "{% for message in messages %}"
        "{{ '<|' + message['role'] + '|>\\n' + message['content'] + '<|end|>' + '\\n' }}"
        "{% endfor %}",
        /* bos_token= */ "<|start|>",
        /* eos_token= */ "<|end|>"
    );
    std::cout << tmpl.apply(
        json::parse(R"([
            {"role": "user", "content": "Hello"},
            {"role": "assistant", "content": "Hi there"}
        ])"),
        json::parse(R"([
            {"type": "function", "function": {"name": "google_search", "arguments": {"query": "2+2"}}}
        ])"),
        /* add_generation_prompt= */ true,
        /* extra_context= */ {}) << std::endl;
}
```

(Note that some template quirks are worked around by minja/chat-template.hpp so that all templates can be used the same way.)
Models have increasingly complex templates (see some examples), so a fair number of Jinja's language constructs are required to execute their templates properly.
Minja supports the following subset of the Jinja2/3 template syntax:
- Full expression syntax
- Statement blocks `{% … %}`, variable sections `{{ … }}`, and comments `{# … #}`, with pre/post space elision (`{%- … -%}` / `{{- … -}}` / `{#- … -#}`)
- `if` / `elif` / `else` / `endif`
- `for` (recursive) (`if`) / `else` / `endfor`, w/ `loop.*` (including `loop.cycle`) and destructuring
- `break`, `continue` (aka the loop controls extension)
- `set`, w/ namespaces & destructuring
- `macro` / `endmacro`
- `filter` / `endfilter`
- Extensible filters collection: `count`, `dictsort`, `equalto`, `e` / `escape`, `items`, `join`, `joiner`, `namespace`, `raise_exception`, `range`, `reject` / `rejectattr` / `select` / `selectattr`, `tojson`, `trim`
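For instance, a template exercising several of these features at once (`set`, `for` with space elision, and the `joiner` filter) could look like the following. This is a hypothetical snippet for illustration, not taken from any model's actual chat template:

```jinja
{%- set comma = joiner(", ") -%}
{%- for tool in tools -%}
{{ comma() }}{{ tool.name }}
{%- endfor -%}
```

With `tools` bound to `[{"name": "google_search"}, {"name": "calculator"}]`, this renders `google_search, calculator`: the joiner emits nothing on its first call and `", "` on every subsequent one, while the `{%- … -%}` markers elide the whitespace between statements.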
Main limitations (non-exhaustive list):
- Not supporting most filters. Only the ones actually used in templates of major (or trendy) models are/will be implemented.
- No difference between `none` and `undefined`
- Single namespace with all filters / tests / functions / macros / variables
- No tuples (templates seem to rely on lists only)
- No `if` expressions w/o `else` (but `if` statements are fine)
- No `{% raw %}`, `{% block … %}`, `{% include … %}`, `{% extends … %}`
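To illustrate the if-expression limitation, here is a hypothetical snippet (the `is_admin` variable is assumed to be in the rendering context):

```jinja
{# Supported: if statement #}
{% if is_admin %}admin{% endif %}

{# Supported: if expression with an else branch #}
{{ "admin" if is_admin else "guest" }}

{# Not supported by Minja: if expression without else #}
{{ "admin" if is_admin }}
```

In standard Jinja the last line would render `undefined` (i.e. nothing) when `is_admin` is false; since chat templates don't seem to rely on that form, Minja leaves it out.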
TODOs:

- Fix known line-difference issues on Windows
- Document the various capabilities detectors + backfill strategies used
- Propose integration w/ https://github.com/google/gemma.cpp
- Integrate to llama.cpp: ggml-org/llama.cpp#11016 + ggml-org/llama.cpp#9639
- Improve fuzzing coverage:
- use a third-party Jinja grammar to guide exploration of inputs (or implement prettification of internal ASTs and use them to generate arbitrary values)
- fuzz each filter / test
- Measure / track test coverage
- Set up performance tests
- Simplify two-pass parsing
- Pass tokens to IfNode and such
- Macro nested set scope = global?
- Get listed in https://jbmoelker.github.io/jinja-compat-tests/