server : add props.model_alias #16943

ggerganov · 2025-11-02T16:23:23Z

Add "model_alias" to /props endpoint
Render the model alias when specified

llama-server -hf ggml-org/gpt-oss-20b-GGUF --jinja -c 0 --port 8033 --alias gpt-oss-20-example

allozaur · 2025-11-02T17:01:41Z

Looking good to me!

CISC · 2025-11-02T20:24:04Z

Not a massively helpful message, but I guess something needs to be fixed :)
https://github.com/ggml-org/llama.cpp/actions/runs/19014962680/job/54301478932?pr=16943#step:6:15

allozaur · 2025-11-02T20:46:49Z

Not a massively helpful message, but I guess something needs to be fixed :)

https://github.com/ggml-org/llama.cpp/actions/runs/19014962680/job/54301478932?pr=16943#step:6:15

I hadn't seen that check failing in CI. @ggerganov u can just run npm run format locally and push.

* origin/master: (169 commits) opencl: support imrope (ggml-org#16914) fix: Viewing multiple PDF attachments (ggml-org#16974) model-conversion : pass config to from_pretrained (ggml-org#16963) server : add props.model_alias (ggml-org#16943) ggml: CUDA: add head size 72 for flash-attn (ggml-org#16962) mtmd: add --image-min/max-tokens (ggml-org#16921) mtmd: pad mask for qwen2.5vl (ggml-org#16954) ggml : LoongArch fixes (ggml-org#16958) sync: minja (glm 4.6 & minmax m2 templates) (ggml-org#16949) SYCL: optimized repeat_back kernel (3× fewer asm instructions, 2× faster)Feature/sycl repeat back opt (ggml-org#16869) feat(webui): improve LaTeX rendering with currency detection (ggml-org#16508) test-backend-ops : fix segfault in moe-expert-reduce test in support mode and coverage (ggml-org#16936) ci : disable failing riscv cross build (ggml-org#16952) model: add Janus Pro for image understanding (ggml-org#16906) clip : use FA (ggml-org#16837) server : support unified cache across slots (ggml-org#16736) common : move gpt-oss reasoning processing to init params (ggml-org#16937) docs: remove llama_sampler_accept reference in sampling sample usage (ggml-org#16920) CUDA: add FLOOR, CEIL, ROUND, TRUNC unary ops (ggml-org#16917) devops: fix failing s390x docker build (ggml-org#16918) ...

webui: add system message in export conversation, support upload conversation with system message Webui: show upload only when in new conversation Webui: Add model name webui: increase height of chat message window when clicking editing Webui: autoclose settings dialog dropdown and maximze screen width when zoom in webui: fix date issues and add more dates webui: change error to toast.error. server: add n_past and slot_id in props_simple webui: add cache tokens, context and prompt speed in chat webui: modernize ui webui: change welcome message webui: change speed display webui: change run python icon webui: add config to use server defaults for sampler webui: put speed on left and context on right webui: recognize AsciiDoc files as valid text files (ggml-org#16850) * webui: recognize AsciiDoc files as valid text files * webui: add an updated static webui build * webui: add the updated dependency list * webui: re-add an updated static webui build Add a setting to display message generation statistics (ggml-org#16901) * feat: Add setting to display message generation statistics * chore: build static webui output webui: add HTML/JS preview support to MarkdownContent with sandboxed iframe (ggml-org#16757) * webui: add HTML/JS preview support to MarkdownContent with sandboxed iframe dialog Extended MarkdownContent to flag previewable code languages, add a preview button alongside copy controls, manage preview dialog state, and share styling for the new button group Introduced CodePreviewDialog.svelte, a sandboxed iframe modal for rendering HTML/JS previews with consistent dialog controls * webui: fullscreen HTML preview dialog using bits-ui * Update tools/server/webui/src/lib/components/app/misc/CodePreviewDialog.svelte Co-authored-by: Aleksander Grygier <[email protected]> * Update tools/server/webui/src/lib/components/app/misc/MarkdownContent.svelte Co-authored-by: Aleksander Grygier <[email protected]> * webui: pedantic style tweak for CodePreviewDialog close button * webui: remove overengineered preview language logic * chore: update webui static build --------- Co-authored-by: Aleksander Grygier <[email protected]> webui: auto-refresh /props on inference start to resync model metadata (ggml-org#16784) * webui: auto-refresh /props on inference start to resync model metadata - Add no-cache headers to /props and /slots - Throttle slot checks to 30s - Prevent concurrent fetches with promise guard - Trigger refresh from chat streaming for legacy and ModelSelector - Show dynamic serverWarning when using cached data * fix: restore proper legacy behavior in webui by using unified /props refresh Updated assistant message bubbles to show each message's stored model when available, falling back to the current server model only when the per-message value is missing When the model selector is disabled, now fetches /props and prioritizes that model name over chunk metadata, then persists it with the streamed message so legacy mode properly reflects the backend configuration * fix: detect first valid SSE chunk and refresh server props once * fix: removed the slots availability throttle constant and state * webui: purge ai-generated cruft * chore: update webui static build feat(webui): improve LaTeX rendering with currency detection (ggml-org#16508) * webui : Revised LaTeX formula recognition * webui : Further examples containg amounts * webui : vitest for maskInlineLaTeX * webui: Moved preprocessLaTeX to lib/utils * webui: LaTeX in table-cells * chore: update webui build output (use theirs) * webui: backslash in LaTeX-preprocessing * chore: update webui build output * webui: look-behind backslash-check * chore: update webui build output * Apply suggestions from code review Code maintenance (variable names, code formatting, string handling) Co-authored-by: Aleksander Grygier <[email protected]> * webui: Moved constants to lib/constants. * webui: package woff2 inside base64 data * webui: LaTeX-line-break in display formula * chore: update webui build output * webui: Bugfix (font embedding) * webui: Bugfix (font embedding) * webui: vite embeds assets * webui: don't suppress 404 (fonts) * refactor: KaTeX integration with SCSS Moves KaTeX styling to SCSS for better customization and font embedding. This change includes: - Adding `sass` as a dev dependency. - Introducing a custom SCSS file to override KaTeX variables and disable TTF/WOFF fonts, relying solely on WOFF2 for embedding. - Adjusting the Vite configuration to resolve `katex-fonts` alias and inject SCSS variables. * fix: LaTeX processing within blockquotes * webui: update webui build output --------- Co-authored-by: Aleksander Grygier <[email protected]> server : add props.model_alias (ggml-org#16943) * server : add props.model_alias webui: fix keyboard shortcuts for new chat & edit chat title (ggml-org#17007) Better UX for handling multiple attachments in WebUI (ggml-org#17246) webui: add OAI-Compat Harmony tool-call streaming visualization and persistence in chat UI (ggml-org#16618) * webui: add OAI-Compat Harmony tool-call live streaming visualization and persistence in chat UI - Purely visual and diagnostic change, no effect on model context, prompt construction, or inference behavior - Captured assistant tool call payloads during streaming and non-streaming completions, and persisted them in chat state and storage for downstream use - Exposed parsed tool call labels beneath the assistant's model info line with graceful fallback when parsing fails - Added tool call badges beneath assistant responses that expose JSON tooltips and copy their payloads when clicked, matching the existing model badge styling - Added a user-facing setting to toggle tool call visibility to the Developer settings section directly under the model selector option * webui: remove scroll listener causing unnecessary layout updates (model selector) * Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte Co-authored-by: Aleksander Grygier <[email protected]> * Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte Co-authored-by: Aleksander Grygier <[email protected]> * chore: npm run format & update webui build output * chore: update webui build output --------- Co-authored-by: Aleksander Grygier <[email protected]> webui: Fix clickability around chat processing statistics UI (ggml-org#17278) * fix: Better pointer events handling in chat processing info elements * chore: update webui build output Fix merge error webui: Add a "Continue" Action for Assistant Message (ggml-org#16971) * feat: Add "Continue" action for assistant messages * feat: Continuation logic & prompt improvements * chore: update webui build output * feat: Improve logic for continuing the assistant message * chore: update webui build output * chore: Linting * chore: update webui build output * fix: Remove synthetic prompt logic, use the prefill feature by sending the conversation payload ending with assistant message * chore: update webui build output * feat: Enable "Continue" button based on config & non-reasoning model type * chore: update webui build output * chore: Update packages with `npm audit fix` * fix: Remove redundant error * chore: update webui build output * chore: Update `.gitignore` * fix: Add missing change * feat: Add auto-resizing for Edit Assistant/User Message textareas * chore: update webui build output Improved file naming & structure for UI components (ggml-org#17405) * refactor: Component iles naming & structure * chore: update webui build output * refactor: Dialog titles + components namig * chore: update webui build output * refactor: Imports * chore: update webui build output webui: hide border of button webui: update webui: update webui: update add vision webui: minor settings reorganization and add disable autoscroll option (ggml-org#17452) * webui: added a dedicated 'Display' settings section that groups visualization options * webui: added a Display setting to toggle automatic chat scrolling * chore: update webui build output Co-authored-by: firecoperana <firecoperana>

ggerganov requested review from allozaur and ngxson as code owners November 2, 2025 16:23

github-actions bot added examples server labels Nov 2, 2025

allozaur approved these changes Nov 2, 2025

View reviewed changes

ngxson approved these changes Nov 2, 2025

View reviewed changes

ggerganov added 2 commits November 3, 2025 11:32

server : add props.model_alias

6890d72

webui : npm run format

0e42f25

ggerganov force-pushed the gg/server-use-alias branch from e63315b to 0e42f25 Compare November 3, 2025 09:33

ggerganov mentioned this pull request Nov 3, 2025

changelog : llama-server REST API #9291

Open

allozaur merged commit 48bd265 into master Nov 3, 2025
68 of 70 checks passed

ggerganov deleted the gg/server-use-alias branch November 3, 2025 13:46

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

server : add props.model_alias #16943

server : add props.model_alias #16943

Uh oh!

ggerganov commented Nov 2, 2025

Uh oh!

allozaur commented Nov 2, 2025

Uh oh!

CISC commented Nov 2, 2025

Uh oh!

allozaur commented Nov 2, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

server : add props.model_alias #16943

server : add props.model_alias #16943

Uh oh!

Conversation

ggerganov commented Nov 2, 2025

Uh oh!

allozaur commented Nov 2, 2025

Uh oh!

CISC commented Nov 2, 2025

Uh oh!

allozaur commented Nov 2, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants