
uzu-ts

Node package for uzu, a high-performance inference engine for AI models on Apple Silicon. It allows you to deploy AI directly in your app with zero latency, full data privacy, and no inference costs. You don’t need an ML team or weeks of setup - one developer can handle everything in minutes. Key features:

  • Simple, high-level API
  • Specialized configurations with significant performance boosts for common use cases like classification and summarization
  • Broad model support

Quick Start

Add the uzu dependency to your project's package.json:

"dependencies": {
    "@trymirai/uzu": "0.1.33"
}
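
Alternatively, install it from the command line (the examples below use pnpm):

pnpm add @trymirai/uzu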

Set up your project through Platform and obtain an API_KEY. Then, choose the model you want from the library and run it with the following snippet using the corresponding identifier:

const output = await Engine
    .create('API_KEY')
    .model('Qwen/Qwen3-0.6B')
    .reply('Tell me a short, funny story about a robot');

Everything from model downloading to inference configuration is handled automatically. Refer to the documentation for details on how to customize each step of the process.
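
For example, you can track model download progress by attaching the same .download hook used in the examples below (a minimal sketch; the rest of the chain is unchanged):

const output = await Engine
    .create('API_KEY')
    .model('Qwen/Qwen3-0.6B')
    .download((update) => {
        // Called as the model is fetched; update.progress reports completion
        console.log('Progress:', update.progress);
    })
    .reply('Tell me a short, funny story about a robot');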

Examples

Place the API_KEY you obtained earlier in the corresponding example file, and then run it using one of the following commands:

pnpm run tsn examples/chat.ts
pnpm run tsn examples/summarization.ts
pnpm run tsn examples/classification.ts

Chat

In this example, we will download a model and get a reply to a specific list of messages:

import Engine, { Message } from '@trymirai/uzu';

async function main() {
    const output = await Engine.create('API_KEY')
        .model('Qwen/Qwen3-0.6B')
        .download((update) => {
            console.log('Progress:', update.progress);
        })
        .replyToMessages(
            [
                Message.system('You are a helpful assistant'),
                Message.user('Tell me a short, funny story about a robot')
            ],
            (partialOutput) => {
                // Streaming callback invoked with each partial output;
                // return true to keep generating (returning false presumably
                // stops generation early)
                return true;
            },
        );
    console.log(output.text.original);
}

main().catch((error) => {
    console.error(error);
});
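
The second argument to replyToMessages is a streaming callback invoked with each partial output; returning true continues generation. Here is a sketch of early stopping, assuming partial outputs expose the same text.original field as the final output and that returning false aborts generation:

const output = await Engine.create('API_KEY')
    .model('Qwen/Qwen3-0.6B')
    .replyToMessages(
        [Message.user('Tell me a short, funny story about a robot')],
        (partialOutput) => {
            // Assumption: partial outputs carry text.original like the final
            // output, and returning false stops generation early
            return partialOutput.text.original.length < 200;
        },
    );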

Summarization

In this example, we will use the summarization preset to generate a summary of the input text:

import Engine, { Preset, SamplingMethod } from '@trymirai/uzu';

async function main() {
    const textToSummarize =
        "A Large Language Model (LLM) is a type of artificial intelligence that processes and generates human-like text. It is trained on vast datasets containing books, articles, and web content, allowing it to understand and predict language patterns. LLMs use deep learning, particularly transformer-based architectures, to analyze text, recognize context, and generate coherent responses. These models have a wide range of applications, including chatbots, content creation, translation, and code generation. One of the key strengths of LLMs is their ability to generate contextually relevant text based on prompts. They utilize self-attention mechanisms to weigh the importance of words within a sentence, improving accuracy and fluency. Examples of popular LLMs include OpenAI's GPT series, Google's BERT, and Meta's LLaMA. As these models grow in size and sophistication, they continue to enhance human-computer interactions, making AI-powered communication more natural and effective.";
    const prompt = `Text is: "${textToSummarize}". Write only the summary itself.`;

    const output = await Engine.create('API_KEY')
        .model('Qwen/Qwen3-0.6B')
        .download((update) => {
            console.log('Progress:', update.progress);
        })
        .preset(Preset.summarization())
        .session()
        .tokensLimit(256)
        .enableThinking(false)
        .samplingMethod(SamplingMethod.greedy())
        .reply(prompt);

    console.log('Summary:', output.text.original);
    console.log(
        'Model runs:',
        output.stats.prefillStats.modelRun.count + (output.stats.generateStats?.modelRun.count ?? 0),
    );
    console.log('Tokens count:', output.stats.totalStats.tokensCountOutput);
}

main().catch((error) => {
    console.error(error);
});

You will notice that the model's run count is lower than the number of generated tokens: with speculative decoding, each model run can emit several tokens at once, which significantly improves generation speed.
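
As a quick sanity check, you can compute the average number of tokens produced per model run from the same stats fields printed above:

const modelRuns =
    output.stats.prefillStats.modelRun.count +
    (output.stats.generateStats?.modelRun.count ?? 0);
const tokensPerRun = output.stats.totalStats.tokensCountOutput / modelRuns;
console.log('Average tokens per model run:', tokensPerRun.toFixed(2));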

Classification

In this example, we will use the classification preset to determine the sentiment of the user's input:

import Engine, { ClassificationFeature, Preset, SamplingMethod } from '@trymirai/uzu';

async function main() {
    const feature = new ClassificationFeature('sentiment', [
        'Happy',
        'Sad',
        'Angry',
        'Fearful',
        'Surprised',
        'Disgusted',
    ]);
    const textToDetectFeature =
        "Today's been awesome! Everything just feels right, and I can't stop smiling.";
    const prompt =
        `Text is: "${textToDetectFeature}". Choose ${feature.name} from the list: ${feature.values.join(', ')}. ` +
        "Answer with one word. Don't add a dot at the end.";

    const output = await Engine.create('API_KEY')
        .model('Qwen/Qwen3-0.6B')
        .download((update) => {
            console.log('Progress:', update.progress);
        })
        .preset(Preset.classification(feature))
        .session()
        .tokensLimit(32)
        .enableThinking(false)
        .samplingMethod(SamplingMethod.greedy())
        .reply(prompt);

    console.log('Prediction:', output.text.original);
    console.log('Stats:', output.stats);
}

main().catch((error) => {
    console.error(error);
});

Looking at the stats, you can see that the answer is ready immediately after the prefill step; the generation phase doesn't even start, thanks to speculative decoding, which significantly improves generation speed.
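
Because greedy sampling with the classification preset is expected to return exactly one of the provided labels, you can validate the prediction against feature.values (a small illustrative check):

const prediction = output.text.original.trim();
if (feature.values.includes(prediction)) {
    console.log('Detected sentiment:', prediction);
} else {
    console.warn('Unexpected prediction:', prediction);
}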

License

This project is licensed under the MIT License. See the LICENSE file for details.
