nomic-ai/minja

minja.hpp - A minimalistic C++ Jinja templating engine for LLM chat templates

This is not an official Google product

Minja is a minimalistic reimplementation of the Jinja templating engine, designed for integration into C++ LLM projects (it's used in llama.cpp and GPT4All).

It is not general purpose: it includes just what’s needed for actual chat templates (very limited set of filters, tests and language features). Users with different needs should look at third-party alternatives such as Jinja2Cpp, Jinja2CppLight, or inja (none of which we endorse).

Warning

TL;DR: use of Minja is at your own risk, and the risks are plenty! See Security & Privacy section below.

Design goals:

  • Support each and every major LLM found on HuggingFace
  • Easy to integrate into projects such as llama.cpp or gemma.cpp:
    • Header-only
    • C++17
    • Only depend on nlohmann::json (no Boost)
    • Keep codebase small (currently 2.5k LoC) and easy to understand
  • Decent performance compared to Python.

Non-goals:

  • Address glaring prompt injection risks in current Jinja chat templating practices. See Security & Privacy below
  • Additional features from Jinja that aren't used by the template(s) of any major LLM (no feature creep!)
    • Please don't submit PRs with such features; they will unfortunately be rejected.
  • Full Jinja compliance (neither syntax-wise, nor filters / tests / globals)

Usage:

This library is header-only: copy the header(s) you need into your project, compile with a C++17-capable compiler, and make sure nlohmann::json's json.hpp is in your include path.
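For example, a build invocation might look like the following. This is only a sketch: the directory layout and paths are hypothetical, so adjust the -I flags to wherever you actually copied the headers.

```shell
# Hypothetical layout: minja.hpp, chat-template.hpp and nlohmann/json.hpp
# copied into ./include; point -I at your actual header locations.
g++ -std=c++17 -I include examples/raw.cpp -o raw
./raw
```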

See API in minja/minja.hpp and minja/chat-template.hpp (experimental).

For raw Jinja templating (see examples/raw.cpp):

#include <minja.hpp>
#include <iostream>

using json = nlohmann::ordered_json;

int main() {
    auto tmpl = minja::Parser::parse("Hello, {{ location }}!", /* options= */ {});
    auto context = minja::Context::make(minja::Value(json {
        {"location", "World"},
    }));
    auto result = tmpl->render(context);
    std::cout << result << std::endl;
}

To apply a template to a JSON array of messages and tools in the HuggingFace standard (see examples/chat-template.cpp):

#include <chat-template.hpp>
#include <iostream>

using json = nlohmann::ordered_json;

int main() {
    minja::chat_template tmpl(
        "{% for message in messages %}"
        "{{ '<|' + message['role'] + '|>\\n' + message['content'] + '<|end|>' + '\\n' }}"
        "{% endfor %}",
        /* bos_token= */ "<|start|>",
        /* eos_token= */ "<|end|>"
    );
    std::cout << tmpl.apply(
        json::parse(R"([
            {"role": "user", "content": "Hello"},
            {"role": "assistant", "content": "Hi there"}
        ])"),
        json::parse(R"([
            {"type": "function", "function": {"name": "google_search", "arguments": {"query": "2+2"}}}
        ])"),
        /* add_generation_prompt= */ true,
        /* extra_context= */ {}) << std::endl;
}

(Note that minja/chat-template.hpp works around some template quirks so that all templates can be used the same way.)
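For reference, and assuming none of the quirk workarounds alter this particular template, the chat-template example above should print something close to:

```text
<|user|>
Hello<|end|>
<|assistant|>
Hi there<|end|>
```

The tools array and add_generation_prompt flag are accepted by apply() but ignored here, since this simple template never references them.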

Supported features

Models have increasingly complex templates (see some examples), so a sizeable subset of Jinja's language constructs is required to execute their templates properly.

Minja supports the following subset of the Jinja2/3 template syntax:

  • Full expression syntax
  • Statements {% … %}, variable sections {{ … }}, and comments {# … #} with pre/post space elision {%- … -%} / {{- … -}} / {#- … -#}
  • if / elif / else / endif
  • for (recursive) (if) / else / endfor w/ loop.* (including loop.cycle) and destructuring
  • break, continue (aka loop controls extensions)
  • set w/ namespaces & destructuring
  • macro / endmacro
  • filter / endfilter
  • Extensible filters collection: count, dictsort, equalto, e / escape, items, join, joiner, namespace, raise_exception, range, reject / rejectattr / select / selectattr, tojson, trim
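As an illustrative sketch (not taken from the repository's tests), a template combining several of these constructs might look like this. Here users, active, and name are hypothetical context variables:

```jinja
{%- set ns = namespace(count=0) -%}
{% for user in users if user.active %}
  {{ loop.index }}. {{ user.name | trim }}
  {%- set ns.count = ns.count + 1 -%}
{% endfor %}
Active users: {{ ns.count }}
```

This exercises set with a namespace, a for loop with an inline if condition, loop.index, and the trim filter; with a list of active users it should render each name prefixed by its index, followed by the count.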

Main limitations (non-exhaustive list):

  • Most filters are not supported. Only the ones actually used in the templates of major (or trendy) models are/will be implemented.
  • No difference between none and undefined
  • Single namespace with all filters / tests / functions / macros / variables
  • No tuples (templates seem to rely on lists only)
  • No if expressions w/o else (but if statements are fine)
  • No {% raw %}, {% block … %}, {% include … %}, {% extends … %}
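To make the if-expression limitation concrete, here is a sketch of what should and should not parse (cond is a hypothetical context variable):

```jinja
{{ "yes" if cond else "no" }}  {# OK: if expression with an else branch #}
{{ "yes" if cond }}            {# rejected: if expression without else #}
{% if cond %}yes{% endif %}    {# OK: if statements need no else #}
```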

Roadmap / TODOs

  • Fix known line difference issues on Windows
  • Document the various capabilities detectors + backfill strategies used
  • Propose integration w/ https://github.com/google/gemma.cpp
  • Integrate to llama.cpp: ggml-org/llama.cpp#11016 + ggml-org/llama.cpp#9639
  • Improve fuzzing coverage:
    • use thirdparty jinja grammar to guide exploration of inputs (or implement prettification of internal ASTs and use them to generate arbitrary values)
    • fuzz each filter / test
  • Measure / track test coverage
  • Setup performance tests
  • Simplify two-pass parsing
    • Pass tokens to IfNode and such
  • Macro nested set scope = global?
  • Get listed in https://jbmoelker.github.io/jinja-compat-tests/