@spiffcs commented Nov 5, 2025

Description

This PR follows up on #4279 by adding support for a new docker source, ocimodelsource (naming pending 😄).

With this change, users can run the following: syft -o json docker.io/ai/qwen3-vl | jq .

They'll get an SBOM with a single package that describes the GGUF model, including details for the model pulled from https://hub.docker.com/u/ai

Example of metadata extracted

 "metadata": {
        "modelFormat": "gguf",
        "modelName": "Qwen3-Vl-8B-Instruct",
        "modelVersion": "unknown",
        "hash": "321c13d3e93151b5",
        "license": "apache-2.0",
        "ggufVersion": 3,
        "architecture": "qwen3vl",
        "quantization": "Q4_K_M",
        "parameters": 8190735360,
        "tensorCount": 399,
        "header": {
          "general.base_model.0.name": "Qwen3 VL 8B Instruct",
          "general.base_model.0.organization": "Qwen",
          "general.base_model.0.repo_url": "https://huggingface.co/Qwen/Qwen3-VL-8B-Instruct",
          "general.base_model.count": 1,
          "general.basename": "Qwen3-Vl-8B-Instruct",
          "general.file_type": 15,
          "general.finetune": "Instruct",
          "general.quantization_version": 2,
          "general.quantized_by": "Unsloth",
          "general.repo_url": "https://huggingface.co/unsloth",
          "general.size_label": "8B",
          "general.tags": {
            "type": 8,
            "len": 2,
            "startOffset": 741,
            "size": 41
          },
          "general.type": "model",
          "quantize.imatrix.chunks_count": 694,
          "quantize.imatrix.dataset": "unsloth_calibration_Qwen3-VL-8B-Instruct.txt",
          "quantize.imatrix.entries_count": 252,
          "quantize.imatrix.file": "Qwen3-VL-8B-Instruct-GGUF/imatrix_unsloth.gguf",
          "qwen3vl.attention.head_count": 32,
          "qwen3vl.attention.head_count_kv": 8,
          "qwen3vl.attention.key_length": 128,
          "qwen3vl.attention.layer_norm_rms_epsilon": 0.000001,
          "qwen3vl.attention.value_length": 128,
          "qwen3vl.block_count": 36,
          "qwen3vl.context_length": 262144,
          "qwen3vl.embedding_length": 4096,
          "qwen3vl.feed_forward_length": 12288,
          "qwen3vl.n_deepstack_layers": 3,
          "qwen3vl.rope.dimension_sections": {
            "type": 5,
            "len": 4,
            "startOffset": 1268,
            "size": 16
          },
          "qwen3vl.rope.freq_base": 5000000,
          "tokenizer.ggml.add_bos_token": false,
          "tokenizer.ggml.bos_token_id": 151643,
          "tokenizer.ggml.eos_token_id": 151645,
          "tokenizer.ggml.merges": {
            "type": 8,
            "len": 151387,
            "startOffset": 3197544,
            "size": 2731548
          },

A larger Google Doc is being put together to go over the choices made in this PR and the changes we need to make so that pt1 and pt2 work together as intended.

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist:

  • I have tested my code in common scenarios and confirmed there are no regressions
  • I have added comments to my code, particularly in hard-to-understand sections

Signed-off-by: Christopher Phillips <[email protected]>
@spiffcs force-pushed the 4184-pt2-oci-model-support branch from 5853129 to 8031957 on November 13, 2025 06:19
Base automatically changed from 4184-gguf-parser to main on November 13, 2025 22:43
* main: (76 commits)
  feat: snap can be queried by revision and ```track/risk/branch``` (#4439)
  fix: 4423 dotnet-deps cataloger skips project type by def
  signpost to docs site (#4483)
  chore(deps): bump github/codeql-action from 4.31.8 to 4.31.9 (#4481)
  chore(deps): bump github.com/goccy/go-yaml from 1.19.0 to 1.19.1 (#4482)
  Detect embedded deps.json in .NET binaries (#4375)
  chore(deps): bump actions/cache from 5.0.0 to 5.0.1 (#4476)
  chore(deps): bump actions/cache in /.github/actions/bootstrap (#4477)
  chore(deps): update tools to latest versions (#4473)
  unapply base path for resolver inbound requests (#4478)
  fix: golang PURL should include full module (#4395)
  fix:best effort to get the os info of an ELF binary (#4438)
  Improve PR template (#4472)
  feat: add support for Gemfile.next.lock (#4457)
  chore:cancel in-progress workflows for new commits on same PR (#4465)
  chore(deps): update tools to latest versions (#4466)
  chore(deps): bump github/codeql-action from 4.31.7 to 4.31.8 (#4468)
  chore(deps): bump actions/cache from 4.3.0 to 5.0.0 (#4469)
  chore(deps): bump github.com/anchore/stereoscope from 0.1.14 to 0.1.16 (#4470)
  chore(deps): bump actions/cache in /.github/actions/bootstrap (#4471)
  ...

Signed-off-by: Christopher Phillips <[email protected]>
li, err := fetchSingleGGUFHeader(ctx, client, artifact.Reference, layer, tempDir)
if err != nil {
return nil, fmt.Errorf("failed to create temp file: %w", err)
os.RemoveAll(tempDir)

We should not be ignoring an error here (we should at least log it).

// Fetch GGUF layer headers via range-GET
tempFiles := make(map[string]string)
ggufLayers := make([]GGUFLayerInfo, 0, len(artifact.GGUFLayers))
id := deriveID(cfg.Reference, cfg.Alias, metadata.ManifestDigest)

We should have a test that ensures the ID generated for an artifact here is the same as the ID generated for a stereoscope image.

Well, we should really try to move the deriveIDFromStereoscopeImage function from the stereoscopesource package to the source/internal package and share the invocation. We should be careful not to change existing behavior. It could be worth capturing an ID for each case we care about before refactoring, adding a test for that (if one does not already exist), then doing the refactor and ensuring the IDs don't change for stereoscope images.
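The "capture IDs first, refactor, then compare" workflow suggested in this thread could look roughly like the sketch below. Everything in it is hypothetical: idInputs, sharedDeriveID, idForImage, idForOCIModel, and the hashing scheme are illustrative stand-ins and do not reflect how deriveIDFromStereoscopeImage actually computes IDs. The digest values are made up as well.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// idInputs is a hypothetical stand-in for the values an ID is derived from.
type idInputs struct {
	Reference      string
	Alias          string
	ManifestDigest string
}

// sharedDeriveID stands in for a single implementation in a shared package.
// The hashing scheme is illustrative only.
func sharedDeriveID(in idInputs) string {
	sum := sha256.Sum256([]byte(in.Reference + "\x00" + in.Alias + "\x00" + in.ManifestDigest))
	return fmt.Sprintf("%x", sum)
}

// Both source types delegate to the shared helper, so their IDs agree.
func idForImage(in idInputs) string    { return sharedDeriveID(in) }
func idForOCIModel(in idInputs) string { return sharedDeriveID(in) }

// captureIDs records the ID that derive produces for each named case.
func captureIDs(cases map[string]idInputs, derive func(idInputs) string) map[string]string {
	out := make(map[string]string, len(cases))
	for name, in := range cases {
		out[name] = derive(in)
	}
	return out
}

// changedCases returns the sorted names of cases whose ID differs.
func changedCases(before, after map[string]string) []string {
	var changed []string
	for name, id := range before {
		if after[name] != id {
			changed = append(changed, name)
		}
	}
	sort.Strings(changed)
	return changed
}

func main() {
	cases := map[string]idInputs{
		"digest only": {ManifestDigest: "sha256:aaaa"}, // made-up digest
		"with alias":  {Reference: "docker.io/ai/qwen3-vl", Alias: "qwen", ManifestDigest: "sha256:aaaa"},
	}
	before := captureIDs(cases, idForImage)
	after := captureIDs(cases, idForOCIModel)
	fmt.Println("changed cases:", changedCases(before, after))
}
```

In a real test the "before" map would be captured from the current code and committed as fixed expected values, so that the refactor is checked against a snapshot rather than against itself.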

Signed-off-by: Christopher Phillips <[email protected]>

3 participants