Work Unit 004: Ref Packaging and Publishing in libs/refs
Status: Specification
Estimated Effort: 3-4 days
Dependencies: Work Unit 002 (CUE Schema), Work Unit 003 (OCI Client)
Behavioral Goal
As a ref author,
I need to package my documentation directory as an OCI image and publish it to a registry,
So that consumers can install my ref via sow refs add, inspect its contents without downloading, and benefit from digest-based versioning for reproducible installations.
Success Criteria
- A `Packager` interface exists in `libs/refs` that transforms directories into OCI images
- Publishers receive validation errors before packaging if `.sow-ref.yaml` is invalid or missing
- Exclusion patterns from `packaging.exclude` are applied correctly (e.g., `*.draft.md` files are not packaged)
- Default exclusions (`.git/`, `.DS_Store`, `node_modules/`) are applied automatically
- Metadata from `.sow-ref.yaml` is mapped to OCI annotations, enabling registry-level querying
- Published images use estargz format (NOT standard tar.gz), enabling TOC-based inspection
- Publishing a 10MB ref completes in under 30 seconds
- Integration tests verify round-trip: package → push → pull → verify content matches
Existing Code Context
Explanatory Context
The packaging functionality builds upon the libs/refs module created in Work Unit 003. The Client interface defined there provides the low-level Push operation, but the packaging layer adds:
- Manifest validation: Before packaging, validate `.sow-ref.yaml` against the CUE schema from Work Unit 002
- Content filtering: Apply exclusion patterns to skip files that shouldn't be distributed
- Annotation mapping: Transform manifest fields into OCI annotations for registry-level metadata
- Archive creation: Build estargz-format archives with proper directory structure
The existing refs system in cli/internal/refs/ shows how ref types handle caching and content management. The GitType implementation (cli/internal/refs/git.go) demonstrates the pattern of wrapping an external library (github.com/jmgilman/go/git/cache) with sow-specific validation and error handling. The Packager will follow a similar pattern with the OCI client.
The design document (lines 253-270) defines the Ref Packager component's responsibilities:
- Validate `.sow-ref.yaml` schema before packaging
- Apply exclusion patterns from `packaging.exclude` in manifest
- Create estargz archive (not standard tar.gz)
- Generate OCI annotations from `.sow-ref.yaml` fields
- Calculate content digest
The annotation mapping (design doc lines 424-437) provides a complete mapping from .sow-ref.yaml fields to OCI annotation keys, enabling consumers to query metadata without downloading ref contents.
Key Files
| File | Lines | Purpose |
|---|---|---|
| `libs/refs/client.go` | (WU003) | Client interface with Push method that packager will use |
| `libs/refs/url.go` | (WU003) | OCI URL parsing for ref destination |
| `libs/schemas/ref_manifest.cue` | (WU002) | CUE schema for validation |
| `libs/schemas/cue_types_gen.go` | (WU002) | Generated RefManifest Go type |
| `libs/project/state/validate.go` | 46-68 | CUE validation pattern to follow |
| `cli/internal/refs/git.go` | 1-226 | Pattern: wrapping library with sow-specific logic |
| `libs/exec/executor.go` | 1-47 | Interface definition pattern |
Existing Documentation Context
Design Document (Primary Implementation Reference)
The OCI Refs Design Document (.sow/knowledge/designs/oci-refs/oci-refs-design.md) provides the complete specification:
Ref Packager Component (lines 253-270):
- Validate `.sow-ref.yaml` schema before packaging
- Apply exclusion patterns from `packaging.exclude` in manifest
- Create estargz archive (NOT standard tar.gz) - this format is required for TOC-based inspection
- Generate OCI annotations from `.sow-ref.yaml` fields
- Calculate content digest
- Default exclusions: `.git/`, `.DS_Store`, `node_modules/`
- Preserve Unix permissions (sanitized: no setuid/setgid)
- Include `.sow-ref.yaml` in root of image
OCI Annotation Mapping (lines 424-437): This mapping is contractual - consumers rely on these annotations being set correctly:
| .sow-ref.yaml Field | OCI Annotation Key |
|---|---|
| `ref.title` | `org.opencontainers.image.title` |
| `content.description` | `org.opencontainers.image.description` |
| `provenance.authors` | `org.opencontainers.image.authors` (JSON array) |
| `provenance.created` | `org.opencontainers.image.created` |
| `provenance.source` | `org.opencontainers.image.source` |
| `provenance.license` | `org.opencontainers.image.licenses` |
| `content.classifications` | `com.sow.ref.classifications` (JSON) |
| `content.tags` | `com.sow.ref.tags` (comma-separated) |
| `ref.link` | `com.sow.ref.link` |
Performance Requirements (lines 26, 100):
- NFR1: Publishing 10MB ref completes in < 30 seconds
- The estargz format is non-negotiable - it enables selective extraction
Discovery Analysis (Integration Guidance)
Section 6 of the discovery analysis (.sow/project/discovery/analysis.md) confirms:
- The `github.com/jmgilman/go/oci` library provides estargz support (lines 217-224)
- Expected API includes `Push` for publishing images
- Security features are built into the library: path traversal protection, size limits
Section 9.3 shows the error handling pattern to follow:
if err != nil {
return fmt.Errorf("failed to <operation>: %w", err)
}

Arc42 Building Blocks (Architecture Context)
The arc42-05 document (.sow/knowledge/designs/oci-refs/arc42-05-building-blocks-refs.md) places Ref Packager in the architecture:
- Packager is a sub-component of OCI Client Wrapper (Level 2)
- Receives validated directory input
- Produces OCI image with estargz format
- Maps metadata to OCI annotations
- Packager depends on Schema Validator for manifest validation
Detailed Requirements
Module Structure
Extend libs/refs/ module with packaging functionality:
libs/refs/
├── ... (existing from WU003)
├── packager.go # Packager interface definition (port)
├── packager_impl.go # Implementation (adapter)
├── packager_test.go # Unit tests
├── annotations.go # Annotation mapping logic
├── annotations_test.go # Annotation mapping tests
├── exclusions.go # File exclusion logic
├── exclusions_test.go # Exclusion pattern tests
└── mocks/
    └── packager.go # Generated mock (add to existing generate directive)
Interface Definitions
Packager Interface (packager.go):
//go:generate go run github.com/matryer/moq@latest -out mocks/packager.go -pkg mocks . Packager
// Packager transforms local directories into publishable OCI refs.
//
// The packaging workflow:
// 1. Validate .sow-ref.yaml exists and passes schema validation
// 2. Read exclusion patterns from manifest + apply defaults
// 3. Create estargz archive with filtered contents
// 4. Generate OCI annotations from manifest fields
// 5. Calculate content digest
// 6. Delegate to Client.Push for registry upload
//
// Use NewPackager() to create an instance.
type Packager interface {
// Publish packages a directory and pushes to registry.
//
// The directory must contain a valid .sow-ref.yaml at root.
// Returns the digest of the published image.
//
// Example:
// digest, err := packager.Publish(ctx, "./docs", "ghcr.io/org/ref:v1.0.0")
Publish(ctx context.Context, dir string, ref string, opts ...PublishOption) (digest string, err error)
// Validate checks if a directory can be packaged.
//
// Returns nil if the directory contains a valid .sow-ref.yaml.
// Use this for pre-flight validation before publishing.
Validate(ctx context.Context, dir string) error
// Package creates an estargz archive from directory without pushing.
//
// Returns path to temporary archive file. Caller must clean up.
// Used for local inspection or alternative upload methods.
Package(ctx context.Context, dir string, opts ...PublishOption) (archivePath string, annotations map[string]string, err error)
}

PublishOption Type:
// PublishOption configures a publish operation.
type PublishOption func(*publishOptions)
// WithAdditionalExclusions adds exclusion patterns beyond manifest defaults.
func WithAdditionalExclusions(patterns []string) PublishOption
// WithAlsoTagLatest also pushes the image with :latest tag.
func WithAlsoTagLatest() PublishOption
// WithDryRun validates and packages but does not push.
func WithDryRun() PublishOption
// WithProgressCallback reports packaging progress.
func WithProgressCallback(fn func(stage string, current, total int64)) PublishOption

Exclusion Logic
Create exclusions.go with file filtering:
// DefaultExclusions are always applied when packaging.
var DefaultExclusions = []string{
".git/",
".git", // If .git is a file (submodule)
".DS_Store",
"node_modules/",
"*.swp", // Vim swap files
"*~", // Backup files
".sow/", // Don't include sow metadata
}
// Exclusions handles file filtering for packaging.
type Exclusions struct {
patterns []string
}
// NewExclusions creates an exclusion matcher from manifest + defaults.
func NewExclusions(manifestExcludes []string) *Exclusions
// ShouldExclude returns true if the path should be excluded.
// The path is relative to the package root.
func (e *Exclusions) ShouldExclude(path string) bool
// MatchedPattern returns which pattern matched, if any.
// Used for debugging/logging.
func (e *Exclusions) MatchedPattern(path string) (pattern string, matched bool)

Glob Pattern Support:
- `*.md` - matches any .md file in current directory
- `**/*.draft.md` - matches draft files in any subdirectory
- `docs/internal/` - matches directory and all contents
- `!important.md` - negation patterns (NOT excluded even if matched)
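One possible reading of these pattern rules is sketched below using only the standard library's `path.Match`. Several points are assumptions rather than settled behavior: patterns are anchored at the package root unless prefixed with `**/`, the last matching pattern wins (so a later `!` pattern re-includes a path), paths are slash-separated file paths, and invalid patterns are silently treated as non-matching, whereas a real implementation would presumably surface `ErrExclusionPattern`. The helper names `matchPattern` and `matchAnchored` are illustrative.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// defaultExclusions mirrors the three defaults named in the success criteria.
var defaultExclusions = []string{".git/", ".DS_Store", "node_modules/"}

// Exclusions holds patterns in evaluation order; the last matching pattern wins.
type Exclusions struct {
	patterns []string
}

// NewExclusions combines the defaults with manifest patterns (defaults first).
func NewExclusions(manifestExcludes []string) *Exclusions {
	patterns := append([]string{}, defaultExclusions...)
	patterns = append(patterns, manifestExcludes...)
	return &Exclusions{patterns: patterns}
}

// ShouldExclude reports whether a slash-separated file path, relative to the
// package root, is excluded. A leading "!" re-includes a matching path.
func (e *Exclusions) ShouldExclude(p string) bool {
	p = path.Clean(p)
	excluded := false
	for _, pattern := range e.patterns {
		negated := strings.HasPrefix(pattern, "!")
		if matchPattern(strings.TrimPrefix(pattern, "!"), p) {
			excluded = !negated
		}
	}
	return excluded
}

// matchPattern handles the "**/" prefix by trying every trailing sub-path.
func matchPattern(pattern, p string) bool {
	if !strings.HasPrefix(pattern, "**/") {
		return matchAnchored(pattern, p)
	}
	rest := strings.TrimPrefix(pattern, "**/")
	segments := strings.Split(p, "/")
	for i := range segments {
		if matchAnchored(rest, strings.Join(segments[i:], "/")) {
			return true
		}
	}
	return false
}

// matchAnchored matches a pattern against a path starting at its first segment.
func matchAnchored(pattern, p string) bool {
	if strings.HasSuffix(pattern, "/") {
		// Directory pattern: compare it with the same number of leading segments.
		dir := strings.TrimSuffix(pattern, "/")
		depth := strings.Count(dir, "/") + 1
		segments := strings.Split(p, "/")
		if len(segments) <= depth {
			return false // p does not sit inside a directory at that depth
		}
		ok, err := path.Match(dir, strings.Join(segments[:depth], "/"))
		return err == nil && ok
	}
	ok, err := path.Match(pattern, p)
	return err == nil && ok
}

func main() {
	e := NewExclusions([]string{"*.draft.md", "**/*.tmp", "!keep.tmp"})
	paths := []string{".git/config", "guide.draft.md", "docs/cache/file.tmp", "keep.tmp", "README.md"}
	for _, p := range paths {
		fmt.Printf("%-22s excluded=%v\n", p, e.ShouldExclude(p))
	}
}
```

Because `path.Match` never lets `*` cross a `/`, a bare `*.md` only matches files at the root, which is consistent with the "current directory" wording above.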
Annotation Mapping
Create annotations.go with OCI annotation logic:
// StandardAnnotations are OCI-standard annotation keys.
const (
AnnotationTitle = "org.opencontainers.image.title"
AnnotationDescription = "org.opencontainers.image.description"
AnnotationAuthors = "org.opencontainers.image.authors"
AnnotationCreated = "org.opencontainers.image.created"
AnnotationSource = "org.opencontainers.image.source"
AnnotationLicenses = "org.opencontainers.image.licenses"
)
// SowAnnotations are sow-specific annotation keys.
const (
AnnotationClassifications = "com.sow.ref.classifications"
AnnotationTags = "com.sow.ref.tags"
AnnotationLink = "com.sow.ref.link"
AnnotationSchemaVersion = "com.sow.ref.schema_version"
)
// MapManifestToAnnotations converts RefManifest fields to OCI annotations.
//
// The mapping follows the design document specification:
// - ref.title → org.opencontainers.image.title
// - content.description → org.opencontainers.image.description
// - provenance.authors → org.opencontainers.image.authors (JSON array)
// - provenance.created → org.opencontainers.image.created
// - provenance.source → org.opencontainers.image.source
// - provenance.license → org.opencontainers.image.licenses
// - content.classifications → com.sow.ref.classifications (JSON)
// - content.tags → com.sow.ref.tags (comma-separated)
// - ref.link → com.sow.ref.link
//
// Returns map of annotation key to value. Empty/nil fields are omitted.
func MapManifestToAnnotations(manifest *schemas.RefManifest) map[string]string

Packager Implementation
Factory Function (packager_impl.go):
// NewPackager creates a Packager with the given OCI client.
//
// The packager validates manifests, applies exclusions, and delegates
// to the client for registry operations.
func NewPackager(client Client, opts ...PackagerOption) Packager
// PackagerOption configures packager behavior.
type PackagerOption func(*packagerOptions)
// WithDefaultExclusions sets custom default exclusions.
// If not set, DefaultExclusions are used.
func WithDefaultExclusions(patterns []string) PackagerOption

Implementation Flow:
func (p *packagerImpl) Publish(ctx context.Context, dir string, ref string, opts ...PublishOption) (string, error) {
	// 0. Resolve the functional options into a concrete struct.
	//    opts is a slice of functions, so its fields cannot be read directly.
	var options publishOptions
	for _, opt := range opts {
		opt(&options)
	}

	// 1. Validate directory exists
	if _, err := os.Stat(dir); err != nil {
		return "", fmt.Errorf("failed to access directory %s: %w", dir, err)
	}

	// 2. Read and validate .sow-ref.yaml
	manifestPath := filepath.Join(dir, ".sow-ref.yaml")
	manifest, err := p.loadAndValidateManifest(manifestPath)
	if err != nil {
		return "", err // Already wrapped with context
	}

	// 3. Build exclusions (defaults + manifest + options)
	exclusions := p.buildExclusions(manifest, options)

	// 4. Map manifest to OCI annotations
	annotations := MapManifestToAnnotations(manifest)

	// 5. Delegate to client.Push with exclusions and annotations
	digest, err := p.client.Push(ctx, dir, ref,
		WithExclusions(exclusions.Patterns()),
		WithAnnotations(annotations),
	)
	if err != nil {
		return "", fmt.Errorf("failed to push to registry: %w", err)
	}

	// 6. Optionally push :latest tag
	if options.alsoTagLatest {
		latestRef := replaceTag(ref, "latest")
		if _, err := p.client.Push(ctx, dir, latestRef,
			WithExclusions(exclusions.Patterns()),
			WithAnnotations(annotations),
		); err != nil {
			// Log warning but don't fail the primary push
			p.log.Warn("failed to push :latest tag", "error", err)
		}
	}

	return digest, nil
}

Error Types
// ErrManifestMissing indicates .sow-ref.yaml not found in directory.
var ErrManifestMissing = errors.New("manifest .sow-ref.yaml not found")
// ErrManifestInvalid indicates schema validation failed.
type ErrManifestInvalid struct {
Path string // Path to manifest file
Errors []string // Validation error messages
}
func (e *ErrManifestInvalid) Error() string {
return fmt.Sprintf("manifest %s is invalid: %s", e.Path, strings.Join(e.Errors, "; "))
}
// ErrEmptyPackage indicates all files were excluded.
type ErrEmptyPackage struct {
Dir string
Exclusions []string
}
func (e *ErrEmptyPackage) Error() string {
return fmt.Sprintf("no files to package in %s (all excluded)", e.Dir)
}
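// ErrExclusionPattern indicates an invalid glob pattern.
type ErrExclusionPattern struct {
	Pattern string
	Err     error
}

func (e *ErrExclusionPattern) Error() string {
	return fmt.Sprintf("invalid exclusion pattern %q: %v", e.Pattern, e.Err)
}

// Unwrap exposes the underlying error to errors.Is and errors.As.
func (e *ErrExclusionPattern) Unwrap() error { return e.Err }

Testing Requirements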
Unit Tests
1. Exclusion Pattern Tests (exclusions_test.go):
| Test Case | Input Pattern | Input Path | Expected |
|---|---|---|---|
| Default git | `.git/` | `.git/config` | excluded |
| Default DS_Store | `.DS_Store` | `.DS_Store` | excluded |
| Default node_modules | `node_modules/` | `node_modules/pkg/index.js` | excluded |
| Manifest pattern | `*.draft.md` | `guide.draft.md` | excluded |
| Manifest pattern | `*.draft.md` | `guide.md` | NOT excluded |
| Recursive glob | `**/*.tmp` | `docs/cache/file.tmp` | excluded |
| Directory pattern | `internal/` | `internal/secrets.md` | excluded |
| Negation pattern | `!keep.tmp` | `keep.tmp` | NOT excluded |
| No match | `*.log` | `README.md` | NOT excluded |
2. Annotation Mapping Tests (annotations_test.go):
- Full manifest → all annotations present
- Minimal manifest → only required annotations
- Authors array → JSON array format (`["author1","author2"]`)
- Tags array → comma-separated (`"golang,testing,docs"`)
- Classifications array → JSON array format
- Empty optional fields → annotations omitted (not empty strings)
- Special characters in values → properly escaped
3. Packager Validation Tests (packager_test.go):
- Valid manifest → no error
- Missing manifest → `ErrManifestMissing`
- Invalid manifest schema → `ErrManifestInvalid` with field details
- Directory doesn't exist → clear error
- All files excluded → `ErrEmptyPackage`
4. Packager Integration Tests (with mocked client):
- Publish calls client.Push with correct arguments
- Exclusions are passed to client
- Annotations are passed to client
- WithAlsoTagLatest triggers second Push with :latest
- WithDryRun doesn't call client.Push
- WithProgressCallback receives updates
Integration Tests
Test with Real Registry (using Docker local registry):
func TestPublishRoundTrip(t *testing.T) {
ctx := context.Background()
// 1. Start local registry container
registry := testutil.StartRegistry(t)
defer registry.Stop()
// 2. Create test directory with valid manifest
dir := t.TempDir()
createTestRef(t, dir, "Test Ref", []string{"docs/guide.md", "examples/demo.go"})
// 3. Publish to local registry
packager := refs.NewPackager(refs.NewClient())
ref := fmt.Sprintf("%s/test-ref:v1.0.0", registry.Addr)
digest, err := packager.Publish(ctx, dir, ref)
require.NoError(t, err)
require.NotEmpty(t, digest)
// 4. Pull and verify contents match
client := refs.NewClient()
pullDir := t.TempDir()
err = client.Pull(ctx, ref, pullDir, nil)
require.NoError(t, err)
// 5. Verify files
assertFileExists(t, filepath.Join(pullDir, ".sow-ref.yaml"))
assertFileExists(t, filepath.Join(pullDir, "docs/guide.md"))
assertFileExists(t, filepath.Join(pullDir, "examples/demo.go"))
}
func TestExclusionsApplied(t *testing.T) {
// Create dir with files that should be excluded
// Publish
// Pull
// Verify excluded files are NOT present
}
func TestAnnotationsPreserved(t *testing.T) {
// Publish with manifest having all fields
// Use ListFiles or GetManifest to verify annotations
}

Performance Tests
func BenchmarkPublish10MB(b *testing.B) {
ctx := context.Background()
// Create 10MB test directory
dir := createLargeTestDir(b, 10*1024*1024)
packager := refs.NewPackager(refs.NewClient())
ref := "localhost:5000/bench:latest"
b.ResetTimer()
for i := 0; i < b.N; i++ {
_, err := packager.Publish(ctx, dir, ref)
require.NoError(b, err)
}
// Assert: average time < 30 seconds
}

Implementation Notes
Dependency Chain
- Work Unit 002 (CUE Schema) must be complete for:
  - `libs/schemas/ref_manifest.cue` schema file
  - Generated `RefManifest` Go type
  - Validation function availability
- Work Unit 003 (OCI Client) must be complete for:
  - `Client` interface with `Push` method
  - `WithExclusions` and `WithAnnotations` push options
  - Authentication via Docker credential chain
estargz Format Requirement
The github.com/jmgilman/go/oci library produces estargz-format archives automatically. This is critical because:
- Work Unit 005 (Inspection) uses TOC-only download via `ListFiles`
- Work Unit 006 (Installation) uses selective extraction via glob patterns
- Standard tar.gz would require full download for any operation
The packager MUST NOT use standard tar.gz or any non-estargz format.
Manifest Location
The .sow-ref.yaml file MUST be at the root of the packaged image. This enables:
- Quick extraction for inspection (`GetManifest`)
- Consistent location for consumers
- Schema validation at extraction time
Permission Handling
The OCI client handles permission sanitization:
- setuid/setgid bits are stripped
- All other permission bits are preserved (e.g., 644 for files, 755 for directories)
- No special handling needed in packager
Progress Reporting
For large refs, progress reporting improves UX. The WithProgressCallback option receives:
- `stage`: "scanning", "packaging", "pushing"
- `current`/`total`: bytes processed (during packaging/pushing)
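A minimal sketch of how the functional options, including the progress callback, might be resolved before use. The `publishOptions` field names and the `resolveOptions` helper are assumptions made for illustration; only the option constructors' signatures come from this specification.

```go
package main

import "fmt"

// publishOptions is a hypothetical concrete form of the resolved options.
type publishOptions struct {
	alsoTagLatest bool
	dryRun        bool
	progress      func(stage string, current, total int64)
}

// PublishOption configures a publish operation.
type PublishOption func(*publishOptions)

// WithAlsoTagLatest also pushes the image with the :latest tag.
func WithAlsoTagLatest() PublishOption {
	return func(o *publishOptions) { o.alsoTagLatest = true }
}

// WithDryRun validates and packages but does not push.
func WithDryRun() PublishOption {
	return func(o *publishOptions) { o.dryRun = true }
}

// WithProgressCallback reports packaging progress. A nil callback is ignored.
func WithProgressCallback(fn func(stage string, current, total int64)) PublishOption {
	return func(o *publishOptions) {
		if fn != nil {
			o.progress = fn
		}
	}
}

// resolveOptions applies each option in order. The default progress callback
// is a no-op, so callers can invoke it without a nil check.
func resolveOptions(opts ...PublishOption) publishOptions {
	o := publishOptions{progress: func(string, int64, int64) {}}
	for _, opt := range opts {
		if opt != nil {
			opt(&o)
		}
	}
	return o
}

func main() {
	o := resolveOptions(
		WithAlsoTagLatest(),
		WithProgressCallback(func(stage string, current, total int64) {
			fmt.Printf("%s: %d/%d bytes\n", stage, current, total)
		}),
	)
	o.progress("scanning", 0, 0)
	o.progress("packaging", 512, 1024)
	o.progress("pushing", 1024, 1024)
	fmt.Println("alsoTagLatest:", o.alsoTagLatest, "dryRun:", o.dryRun)
}
```

Defaulting the callback to a no-op keeps the packaging code free of nil checks at every reporting point.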
Out of Scope
- CLI command implementation: Work Unit 007 implements `sow refs publish`
- Inspection of packaged refs: Work Unit 005 implements `sow refs inspect`
- Installation of refs: Work Unit 006 implements `sow refs add`
- Index updates after publishing: Not part of packaging (publishing is independent)
- Registry authentication UI: Docker credential chain handles this transparently
Implementation Standards
All code produced in this work unit MUST adhere to the following standards:
Code Quality Standards
- STYLE.md Compliance: All Go code must follow the conventions documented in `.standards/STYLE.md`
- TESTING.md Compliance: All tests must follow the patterns documented in `.standards/TESTING.md`
- golangci-lint: Code must pass `golangci-lint run` with zero errors before completion
Required Dependencies
- OCI Operations: Use `github.com/jmgilman/go/oci` for all OCI registry operations
  - Push operations use `client.Push(ctx, sourceDir, reference, opts...)`
  - Annotations via `oci.WithAnnotations(map[string]string)`
  - The OCI library handles estargz format automatically
- Filesystem Abstractions: Use `github.com/jmgilman/go/fs/core` and `github.com/jmgilman/go/fs/billy` for all file system operations
  - Use `core.FS` interface for filesystem operations requiring abstraction
  - Pass `oci.WithFilesystem(fsys)` for testability
  - Use `billy.NewMemoryFS()` in unit tests
  - Use `billy.NewLocalFS()` for production
Verification Checklist
Before marking this work unit complete, verify:
- `golangci-lint run ./libs/refs/...` passes with zero errors
- All code follows STYLE.md conventions (functional options, error wrapping, etc.)
- All tests follow TESTING.md patterns (table-driven tests, test helpers, etc.)
- Unit tests use memory filesystem via `billy.NewMemoryFS()` where applicable
Acceptance Criteria
- `Packager` interface is defined in `libs/refs/packager.go`
- `//go:generate` directive produces `mocks/packager.go`
- `Publish` returns error if `.sow-ref.yaml` is missing
- `Publish` returns error with field details if manifest is invalid
- Default exclusions (`.git/`, `.DS_Store`, `node_modules/`) are applied
- Manifest `packaging.exclude` patterns are applied
- OCI annotations are correctly mapped from manifest fields
- `org.opencontainers.image.*` standard annotations are set
- `com.sow.ref.*` custom annotations are set
- Published images use estargz format (verified by successful `ListFiles`)
- Round-trip test passes: package → push → pull → verify contents
- Publishing 10MB ref completes in < 30 seconds (benchmark test)
- Unit tests pass for exclusion patterns (all cases from requirements)
- Unit tests pass for annotation mapping (all fields)
- Unit tests pass with mocked OCI client