Work Unit 003: libs/refs Module with OCI Backend
Status: Specification
Estimated Effort: 3-4 days
Dependencies: Work Unit 002 (CUE Schema for .sow-ref.yaml Manifest)
Behavioral Goal
As a ref system consumer (CLI commands, packaging, inspection, installation),
I need a libs/refs module providing OCI registry operations with a clean interface,
So that I can interact with OCI registries to list files, pull images, and push refs without understanding the underlying OCI protocol complexity, while having the flexibility to mock these operations in tests.
Success Criteria
- A `libs/refs` Go module exists with `Client` and `Registry` interfaces following the ports/adapters pattern
- OCI operations are abstracted behind interfaces, enabling unit testing with mocks
- The OCI client wrapper successfully authenticates via Docker credential chain (transparent to caller)
- URL detection correctly identifies OCI registry URLs (both explicit `oci://` prefix and auto-detected known registries)
- Security constraints are enforced: max file size 100MB, max total size 1GB, 10k file limit
- Retry with exponential backoff is configured for transient failures
- Mock generation via `go:generate` produces usable test doubles
- Unit tests pass using mocked OCI client
Existing Code Context
Explanatory Context
The sow codebase has recently undergone significant architectural changes (PRs #119-#123) establishing a consistent ports and adapters pattern in libs/. This pattern separates interface definitions (ports) from implementations (adapters), enabling testability and flexibility.
The existing RefType interface in cli/internal/refs/types.go (lines 23-79) defines how ref types (git, file) are registered and used. The new OCI functionality will be implemented as an OCIType that implements this interface, but the core OCI client operations belong in a new libs/refs module to enable reuse and testing.
The libs/exec module demonstrates the interface pattern we'll follow: Executor interface defines operations, local.go provides the concrete implementation, and mocks/executor.go is auto-generated for testing. Similarly, libs/git shows the factory pattern (NewGitHubClient()) for environment-aware client creation.
The URL parsing logic in cli/internal/refs/url.go currently handles git+ and file:// schemes. OCI URLs need different detection:
- Explicit: `oci://ghcr.io/org/repo:tag` (new scheme prefix)
- Auto-detect: `ghcr.io/org/repo:tag`, `docker.io/library/image`, `*.azurecr.io/path` (known registry patterns)
- Digest pinning: `registry/path@sha256:abc123...`
- Version tags: `registry/path:v1.0.0`, `registry/path:latest`
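The detection rules above can be sketched roughly as follows. This is a minimal illustration, not the final `url.go`: the helper `matchesKnownRegistry` is hypothetical, the registry list is a subset, and it assumes that an unknown host counts as OCI only when the reference is digest-pinned.

```go
package main

import (
	"fmt"
	"strings"
)

// knownOCIRegistries lists a subset of the registry patterns named in this
// spec. A leading "*." matches any subdomain.
var knownOCIRegistries = []string{
	"ghcr.io",
	"docker.io",
	"*.azurecr.io",
}

// matchesKnownRegistry (hypothetical helper) reports whether host matches
// one of the known registry patterns.
func matchesKnownRegistry(host string) bool {
	for _, pattern := range knownOCIRegistries {
		if strings.HasPrefix(pattern, "*.") {
			suffix := pattern[1:] // "*.azurecr.io" -> ".azurecr.io"
			// Require a non-empty subdomain in front of the suffix.
			if len(host) > len(suffix) && strings.HasSuffix(host, suffix) {
				return true
			}
			continue
		}
		if host == pattern {
			return true
		}
	}
	return false
}

// IsOCIRef is a sketch of the detection order: explicit prefix first,
// then known registries, then digest pinning.
func IsOCIRef(rawURL string) bool {
	if strings.HasPrefix(rawURL, "oci://") {
		return true
	}
	// Any other explicit scheme (git+https://, file://) is not OCI.
	if strings.Contains(rawURL, "://") {
		return false
	}
	host, rest, ok := strings.Cut(rawURL, "/")
	if !ok || rest == "" {
		return false
	}
	if matchesKnownRegistry(host) {
		return true
	}
	// Assumption: an unknown host is only treated as OCI when digest-pinned.
	return strings.Contains(rest, "@sha256:")
}

func main() {
	refs := []string{
		"oci://ghcr.io/org/repo:tag",
		"myregistry.azurecr.io/path:v1",
		"registry/path@sha256:abc123",
		"git+https://github.com/org/repo",
		"github.com/org/repo",
	}
	for _, ref := range refs {
		fmt.Printf("%-36s %v\n", ref, IsOCIRef(ref))
	}
}
```

Checking the explicit scheme before anything else keeps `git+` and `file://` URLs from ever reaching the registry matcher.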
The github.com/jmgilman/go/oci library provides OCI operations with estargz support. This library was selected per ADR-003 for its security features and selective extraction capabilities. The library provides the following key APIs:
// Core client creation
oci.New() (*Client, error)
oci.NewWithOptions(opts ...ClientOption) (*Client, error)
// Registry operations
client.Push(ctx, sourceDir, reference, opts...) error
client.Pull(ctx, reference, targetDir, opts...) error
client.PullWithCache(ctx, reference, targetDir, cacheDir, opts...) error
client.ListFiles(ctx, reference) (*ListFilesResult, error)
client.ListFilesWithFilter(ctx, reference, patterns...) (*ListFilesResult, error)
// Key functional options
oci.WithFilesToExtract(patterns...) // Selective extraction via glob patterns
oci.WithAnnotations(map[string]string) // OCI annotations for metadata
oci.WithFilesystem(fsys core.FS) // Filesystem abstraction for testability
The library already uses `github.com/jmgilman/go/fs/core` for filesystem abstraction, enabling tests to use in-memory filesystems via `github.com/jmgilman/go/fs/billy`.
Key Files
| File | Lines | Purpose |
|---|---|---|
| `libs/exec/executor.go` | 1-47 | Interface definition pattern with `//go:generate` for mocks |
| `libs/exec/local.go` | full | Concrete implementation pattern |
| `libs/exec/mocks/executor.go` | full | Generated mock pattern |
| `libs/git/client.go` | 1-45 | Interface with method documentation pattern |
| `libs/git/factory.go` | 1-25 | Factory pattern for environment-aware instantiation |
| `libs/git/errors.go` | full | Dedicated error types pattern |
| `cli/internal/refs/types.go` | 23-79 | `RefType` interface that `OCIType` will implement |
| `cli/internal/refs/git.go` | 1-226 | Reference implementation of a `RefType` |
| `cli/internal/refs/url.go` | 1-199 | URL parsing and type inference (needs extension) |
| `cli/internal/refs/registry.go` | full | Type registry pattern |
Existing Documentation Context
ADR-003 (Decision Rationale)
ADR-003 (.sow/knowledge/adrs/003-oci-refs-distribution.md) documents why OCI was chosen over alternatives (git status quo, custom HTTP, npm-style). Key technical decisions:
- Use `github.com/jmgilman/go/oci` library for estargz support
- Docker credential chain for transparent authentication
- estargz format requirement (NOT standard tar.gz) for selective extraction
- Standard OCI registries (ghcr.io, Docker Hub, Harbor) without custom server modifications
The implementation notes (lines 143-159) outline the integration approach: OCI client wrapper with estargz support, security constraints (file/total size limits), and registry recommendations.
Design Document (Implementation Details)
The OCI Refs Design Document (.sow/knowledge/designs/oci-refs/oci-refs-design.md) provides detailed specifications:
Component Breakdown (lines 233-270): Defines OCI Client Wrapper responsibilities:
- Initialize OCI client with Docker credential chain
- Push estargz images to registries
- Pull images (full or selective) from registries
- List files via estargz TOC without full download
- Query image metadata (annotations) without download
Security Configuration (line 251): Configure github.com/jmgilman/go/oci with:
- Max file size: 100MB
- Max total size: 1GB
- Max file count: 10,000
- Retry with exponential backoff (library built-in)
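One way these limits could be wired up as defaults that callers override through functional options is sketched below. The option and constant names follow this work unit; the `clientOptions` fields and the `resolveOptions` helper are hypothetical, and retry is left out because the spec treats it as built into the library.

```go
package main

import "fmt"

// Default security limits from the design document.
const (
	DefaultMaxFileSize  = 100 * 1024 * 1024  // 100MB
	DefaultMaxTotalSize = 1024 * 1024 * 1024 // 1GB
	DefaultMaxFileCount = 10000
)

// clientOptions holds resolved configuration (field names are assumptions).
type clientOptions struct {
	maxFileSize  int64
	maxTotalSize int64
	maxFileCount int
}

// ClientOption configures the OCI client.
type ClientOption func(*clientOptions)

// WithMaxFileSize sets the maximum size for a single file.
func WithMaxFileSize(size int64) ClientOption {
	return func(o *clientOptions) { o.maxFileSize = size }
}

// WithMaxTotalSize sets the maximum total extraction size.
func WithMaxTotalSize(size int64) ClientOption {
	return func(o *clientOptions) { o.maxTotalSize = size }
}

// WithMaxFileCount sets the maximum number of files.
func WithMaxFileCount(count int) ClientOption {
	return func(o *clientOptions) { o.maxFileCount = count }
}

// resolveOptions (hypothetical helper) starts from the defaults and applies
// each option in order, so later options win.
func resolveOptions(opts ...ClientOption) clientOptions {
	o := clientOptions{
		maxFileSize:  DefaultMaxFileSize,
		maxTotalSize: DefaultMaxTotalSize,
		maxFileCount: DefaultMaxFileCount,
	}
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	// Tighten only the per-file limit; the other limits keep their defaults.
	o := resolveOptions(WithMaxFileSize(10 * 1024 * 1024))
	fmt.Printf("%+v\n", o)
}
```

Starting from the defaults means a caller who passes no options still gets the full set of security limits.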
URL Detection (lines 36-42 in task description, lines 351-358 in design): Support patterns:
- Explicit `oci://` prefix (recommended)
- Known registry auto-detection: `ghcr.io/`, `docker.io/`, `*.azurecr.io/`
- Digest pinning: `@sha256:...`
- Version tags: `:v1.0.0`, `:latest`
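Splitting a reference of these shapes into registry, repository, and tag or digest might look like the sketch below. It is an illustration under stated assumptions rather than the final parser: only `sha256` digests are accepted, a missing tag leaves `Tag` empty instead of defaulting, and references carrying both a tag and a digest are not handled.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// OCIRef holds the structured components of an OCI reference.
type OCIRef struct {
	Registry   string
	Repository string
	Tag        string // empty when a digest is used
	Digest     string // empty when a tag is used
}

// ErrNotOCIRef indicates the URL is not an OCI reference.
var ErrNotOCIRef = errors.New("not an OCI reference")

// ParseOCIRef is a sketch: strip the optional oci:// prefix, take the first
// path segment as the registry, then look for a digest or a tag.
func ParseOCIRef(rawURL string) (*OCIRef, error) {
	s := strings.TrimPrefix(rawURL, "oci://")
	registry, rest, ok := strings.Cut(s, "/")
	if !ok || registry == "" || rest == "" {
		return nil, fmt.Errorf("%w: %q", ErrNotOCIRef, rawURL)
	}
	ref := &OCIRef{Registry: registry}

	// Digest form: everything after '@' is the digest.
	if repo, digest, found := strings.Cut(rest, "@"); found {
		if repo == "" || !strings.HasPrefix(digest, "sha256:") {
			return nil, fmt.Errorf("%w: %q", ErrNotOCIRef, rawURL)
		}
		ref.Repository, ref.Digest = repo, digest
		return ref, nil
	}

	// Tag form: the tag follows the last ':' in the repository path.
	if i := strings.LastIndex(rest, ":"); i >= 0 {
		ref.Repository, ref.Tag = rest[:i], rest[i+1:]
		if ref.Repository == "" || ref.Tag == "" {
			return nil, fmt.Errorf("%w: %q", ErrNotOCIRef, rawURL)
		}
		return ref, nil
	}

	// No tag or digest: leave both empty (assumption: no implicit default).
	ref.Repository = rest
	return ref, nil
}

func main() {
	for _, raw := range []string{
		"oci://ghcr.io/org/repo:v1.0.0",
		"registry/repo@sha256:abc123",
		"not-a-ref",
	} {
		ref, err := ParseOCIRef(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%+v\n", *ref)
	}
}
```

Cutting at the first `/` before looking for `:` keeps a registry port such as `localhost:5000` from being mistaken for a tag.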
Discovery Analysis (Context)
Section 6 of discovery analysis (.sow/project/discovery/analysis.md) confirms:
- `github.com/jmgilman/go/oci` library is NOT currently in codebase (lines 201-215)
- Need to add as dependency to CLI `go.mod`
- Expected API: `Push`, `Pull`, `ListFiles`, `ExtractSelective`
- Section 6.3 recommends either `cli/internal/refs/` or new `libs/oci/` module
Section 10.3 provides module structure decision:
- Recommended: Start in `libs/refs/` following recent patterns
- Can extract to `libs/oci/` later if broader reuse emerges
Detailed Requirements
Module Structure
Create libs/refs/ module with the following structure:
libs/refs/
├── doc.go # Package documentation
├── client.go # Client interface definition (port)
├── client_oci.go # OCI implementation (adapter)
├── client_oci_test.go # Unit tests with mocked OCI library
├── registry.go # Registry interface for registry-specific operations
├── url.go # OCI URL parsing and detection
├── url_test.go # URL parsing tests
├── options.go # Functional options for configuration
├── errors.go # Dedicated error types
└── mocks/
└── client.go # Generated mocks
Interface Definitions
Client Interface (client.go):
//go:generate go run github.com/matryer/moq@latest -out mocks/client.go -pkg mocks . Client
// Client defines operations for interacting with OCI refs.
//
// This interface abstracts OCI registry operations, enabling:
// - Unit testing with mocked implementations
// - Future alternative backends if needed
// - Consistent error handling across operations
type Client interface {
// ListFiles returns file metadata from OCI image TOC without downloading content.
// Uses estargz table-of-contents for efficient inspection.
ListFiles(ctx context.Context, ref string) ([]FileEntry, error)
// Pull downloads an OCI image and extracts to destination directory.
// For full extraction, pass nil for globs.
Pull(ctx context.Context, ref string, dest string, globs []string) error
// Push packages a directory as estargz OCI image and pushes to registry.
// The manifest is read from .sow-ref.yaml in srcDir.
Push(ctx context.Context, srcDir string, ref string, opts ...PushOption) error
// GetManifest retrieves .sow-ref.yaml content without full download.
// Uses selective extraction to fetch only the manifest file.
GetManifest(ctx context.Context, ref string) ([]byte, error)
// GetDigest returns the digest of the image at ref.
GetDigest(ctx context.Context, ref string) (string, error)
}
FileEntry Type:
// FileEntry represents a file in an OCI image.
type FileEntry struct {
Path string // Relative path within image
Size int64 // File size in bytes
Mode os.FileMode
ModTime time.Time
IsDir bool
}
Registry Interface (optional, for registry-specific operations):
// Registry defines operations for querying OCI registries.
type Registry interface {
// ListTags returns available tags for a repository.
ListTags(ctx context.Context, repo string) ([]string, error)
// ResolveTag resolves a tag to a digest.
ResolveTag(ctx context.Context, ref string) (string, error)
// CheckAuth verifies authentication is valid for registry.
CheckAuth(ctx context.Context, registry string) error
}
URL Detection and Parsing
Create url.go with OCI-specific URL handling:
// IsOCIRef determines if a URL refers to an OCI registry.
//
// Returns true for:
// - oci://ghcr.io/org/repo:tag (explicit prefix)
// - ghcr.io/org/repo:tag (known registry)
// - docker.io/library/image:latest (Docker Hub)
// - myregistry.azurecr.io/path:v1 (Azure CR)
// - registry/path@sha256:abc123... (digest)
func IsOCIRef(rawURL string) bool
// OCIRef holds the structured components of an OCI reference:
// registry, repository, and tag or digest.
type OCIRef struct {
Registry string // e.g., "ghcr.io"
Repository string // e.g., "org/repo"
Tag string // e.g., "v1.0.0" (empty if digest)
Digest string // e.g., "sha256:abc123..." (empty if tag)
}
// ParseOCIRef parses an OCI reference string into its components.
func ParseOCIRef(rawURL string) (*OCIRef, error)
// NormalizeOCIRef normalizes an OCI reference to canonical form.
// Strips oci:// prefix, normalizes docker.io references.
func NormalizeOCIRef(rawURL string) (string, error)
Known Registry Detection:
var knownOCIRegistries = []string{
"ghcr.io",
"docker.io",
"registry.hub.docker.com",
"index.docker.io",
"*.azurecr.io",
"*.gcr.io",
"*.amazonaws.com", // ECR
"quay.io",
}
OCI Client Implementation
Factory Function (client_oci.go):
// NewClient creates an OCI client with Docker credential chain.
//
// The client is configured with security limits:
// - Max file size: 100MB
// - Max total size: 1GB
// - Max file count: 10,000
// - Retry with exponential backoff
func NewClient(opts ...ClientOption) (Client, error)
Security Configuration:
const (
DefaultMaxFileSize = 100 * 1024 * 1024 // 100MB
DefaultMaxTotalSize = 1024 * 1024 * 1024 // 1GB
DefaultMaxFileCount = 10000
)
Functional Options
// ClientOption configures the OCI client.
type ClientOption func(*clientOptions)
// WithMaxFileSize sets the maximum size for a single file.
func WithMaxFileSize(size int64) ClientOption
// WithMaxTotalSize sets the maximum total extraction size.
func WithMaxTotalSize(size int64) ClientOption
// WithMaxFileCount sets the maximum number of files.
func WithMaxFileCount(count int) ClientOption
// WithInsecure allows insecure (HTTP) registry connections.
func WithInsecure(insecure bool) ClientOption
// PushOption configures a push operation.
type PushOption func(*pushOptions)
// WithExclusions sets glob patterns to exclude from packaging.
func WithExclusions(patterns []string) PushOption
// WithAnnotations sets additional OCI annotations.
func WithAnnotations(annotations map[string]string) PushOption
Error Types
// ErrNotOCIRef indicates the URL is not an OCI reference.
var ErrNotOCIRef = errors.New("not an OCI reference")
// ErrManifestNotFound indicates .sow-ref.yaml is missing.
var ErrManifestNotFound = errors.New("manifest .sow-ref.yaml not found")
// ErrFileTooLarge indicates a file exceeds size limit.
type ErrFileTooLarge struct {
Path string
Size int64
MaxSize int64
}
// ErrTotalSizeExceeded indicates extraction exceeds total size limit.
type ErrTotalSizeExceeded struct {
TotalSize int64
MaxSize int64
}
// ErrAuthFailed indicates registry authentication failed.
type ErrAuthFailed struct {
Registry string
Err error
}
// ErrRegistryNotFound indicates the registry is unreachable.
type ErrRegistryNotFound struct {
Registry string
Err error
}
go.mod Update
Add to cli/go.mod:
github.com/jmgilman/go/oci v0.x.x // Use latest stable version
Testing Requirements
Unit Tests
- URL Detection Tests (`url_test.go`):
  - `oci://ghcr.io/org/repo:tag` → OCI type detected
  - `ghcr.io/org/repo:tag` → OCI type detected (known registry)
  - `docker.io/library/nginx:latest` → OCI type detected
  - `myregistry.azurecr.io/path:v1` → OCI type detected (wildcard match)
  - `registry/path@sha256:abc...` → OCI type detected (digest)
  - `git+https://github.com/org/repo` → NOT OCI type
  - `file:///path/to/dir` → NOT OCI type
  - `github.com/org/repo` (no tag) → NOT OCI type (ambiguous)
- URL Parsing Tests (`url_test.go`):
  - Parse `ghcr.io/org/repo:v1.0.0` → registry="ghcr.io", repo="org/repo", tag="v1.0.0"
  - Parse `docker.io/library/nginx:latest` → normalize to canonical form
  - Parse `registry/repo@sha256:abc123...` → extract digest correctly
  - Parse `oci://ghcr.io/org/repo:tag` → strip `oci://` prefix
- Client Mock Tests (`client_oci_test.go`):
  - ListFiles returns expected entries
  - Pull extracts to correct destination
  - Pull with globs extracts only matching files
  - Push packages directory correctly
  - GetManifest retrieves only manifest file
  - GetDigest returns correct digest format
- Error Handling Tests:
  - ErrFileTooLarge triggered at 100MB limit
  - ErrTotalSizeExceeded triggered at 1GB limit
  - ErrManifestNotFound when .sow-ref.yaml missing
  - ErrAuthFailed with clear message
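To make the error-handling cases concrete, the sketch below gives the typed errors `Error` (and where relevant `Unwrap`) methods and exercises them with a hypothetical `checkLimits` helper. The message wording, the helper itself, and the choice that a file exactly at the limit is still accepted are all assumptions, not decisions made by this spec.

```go
package main

import (
	"errors"
	"fmt"
)

// FileEntry is trimmed to the fields this sketch needs.
type FileEntry struct {
	Path string
	Size int64
}

// ErrFileTooLarge indicates a file exceeds the size limit.
type ErrFileTooLarge struct {
	Path    string
	Size    int64
	MaxSize int64
}

func (e *ErrFileTooLarge) Error() string {
	return fmt.Sprintf("file %q is %d bytes, limit is %d", e.Path, e.Size, e.MaxSize)
}

// ErrTotalSizeExceeded indicates extraction exceeds the total size limit.
type ErrTotalSizeExceeded struct {
	TotalSize int64
	MaxSize   int64
}

func (e *ErrTotalSizeExceeded) Error() string {
	return fmt.Sprintf("total size %d bytes exceeds limit of %d", e.TotalSize, e.MaxSize)
}

// ErrAuthFailed wraps the underlying cause so errors.Is/As can reach it.
type ErrAuthFailed struct {
	Registry string
	Err      error
}

func (e *ErrAuthFailed) Error() string {
	return fmt.Sprintf("authentication to %s failed: %v", e.Registry, e.Err)
}

func (e *ErrAuthFailed) Unwrap() error { return e.Err }

// checkLimits (hypothetical helper) rejects the first entry that breaks a
// limit. Assumption: sizes equal to a limit are allowed.
func checkLimits(entries []FileEntry, maxFileSize, maxTotalSize int64) error {
	var total int64
	for _, entry := range entries {
		if entry.Size > maxFileSize {
			return &ErrFileTooLarge{Path: entry.Path, Size: entry.Size, MaxSize: maxFileSize}
		}
		total += entry.Size
		if total > maxTotalSize {
			return &ErrTotalSizeExceeded{TotalSize: total, MaxSize: maxTotalSize}
		}
	}
	return nil
}

func main() {
	entries := []FileEntry{{Path: "docs/a.md", Size: 40}, {Path: "big.bin", Size: 200}}
	err := checkLimits(entries, 100, 1000)

	// Callers can branch on the concrete type to build actionable messages.
	var tooLarge *ErrFileTooLarge
	if errors.As(err, &tooLarge) {
		fmt.Println("rejected:", tooLarge.Path)
	}

	cause := errors.New("401 unauthorized")
	var authErr error = &ErrAuthFailed{Registry: "ghcr.io", Err: cause}
	fmt.Println(authErr)
	fmt.Println("wraps cause:", errors.Is(authErr, cause))
}
```

Small limits are used here so a test can trigger each error without allocating 100MB of data.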
Integration Tests (Future, in consuming work units)
The actual OCI registry integration will be tested in Work Units 004-006 using a test registry (Docker local registry or mock registry).
Implementation Notes
Dependency on Work Unit 002
This work unit depends on Work Unit 002 (CUE Schema) for:
- `libs/schemas/ref_manifest.cue` schema definition
- Generated Go types (`RefManifest`, etc.)
- Validation function for `.sow-ref.yaml`
The Push operation needs schema validation before packaging. However, the interface definition and URL parsing can proceed independently.
Integration with RefType System
After libs/refs is complete, Work Unit 007 (CLI Integration) will:
- Implement `OCIType` in `cli/internal/refs/oci.go`
- Register with `refs.Register(&OCIType{})`
- The `OCIType` will delegate to `libs/refs.Client`
This separation enables:
- Clean testing of OCI operations independent of CLI
- Potential reuse in other tools (marketplace, etc.)
- Consistent architecture with other libs/ modules
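The testing benefit of this separation can be illustrated with a consumer that depends only on an interface. In the sketch below, `ManifestGetter` is a hypothetical one-method slice of `Client`, `fakeClient` is a hand-written stand-in for the generated mock, and `describeRef` is an invented consumer; none of these names are part of the planned API.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// ErrManifestNotFound indicates .sow-ref.yaml is missing.
var ErrManifestNotFound = errors.New("manifest .sow-ref.yaml not found")

// ManifestGetter (hypothetical) is the part of Client this consumer needs.
type ManifestGetter interface {
	GetManifest(ctx context.Context, ref string) ([]byte, error)
}

// fakeClient is a hand-written test double; the real module would use the
// generated mock instead.
type fakeClient struct {
	manifests map[string][]byte
	calls     []string // refs requested, recorded for assertions
}

func (f *fakeClient) GetManifest(_ context.Context, ref string) ([]byte, error) {
	f.calls = append(f.calls, ref)
	data, ok := f.manifests[ref]
	if !ok {
		return nil, ErrManifestNotFound
	}
	return data, nil
}

// describeRef (hypothetical consumer) never touches a registry directly,
// so it can be tested without network access.
func describeRef(ctx context.Context, c ManifestGetter, ref string) (string, error) {
	data, err := c.GetManifest(ctx, ref)
	if err != nil {
		return "", fmt.Errorf("inspect %s: %w", ref, err)
	}
	return fmt.Sprintf("%s: %d-byte manifest", ref, len(data)), nil
}

func main() {
	fake := &fakeClient{manifests: map[string][]byte{
		"ghcr.io/org/repo:v1": []byte("name: example\n"),
	}}
	ctx := context.Background()

	summary, err := describeRef(ctx, fake, "ghcr.io/org/repo:v1")
	fmt.Println(summary, err)

	_, err = describeRef(ctx, fake, "ghcr.io/org/missing:v1")
	fmt.Println("missing manifest:", errors.Is(err, ErrManifestNotFound))
}
```

Wrapping with `%w` preserves the sentinel error, so callers further up the stack can still match it with `errors.Is`.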
Docker Credential Chain
The `github.com/jmgilman/go/oci` library handles the Docker credential chain automatically. Users who have logged in via `docker login`, or who have credentials in `~/.docker/config.json`, will authenticate transparently. No explicit credential handling is needed in our code.
estargz Format
The OCI library produces estargz-format images automatically. This is critical for:
- `ListFiles` operation (reads TOC without downloading content)
- Selective extraction via glob patterns
- The format is NOT optional - standard tar.gz won't work
Out of Scope
- Packaging logic: Handled in Work Unit 004 (uses this module's `Push`)
- Inspection commands: Handled in Work Unit 005 (uses this module's `ListFiles`, `GetManifest`)
- Installation logic: Handled in Work Unit 006 (uses this module's `Pull`)
- CLI commands: Handled in Work Unit 007 (wires everything together)
- RefType implementation: Work Unit 007 implements `OCIType` using this module
- Index schema updates: Work Unit 007 updates index to include OCI-specific fields
Implementation Standards
All code produced in this work unit MUST adhere to the following standards:
Code Quality Standards
- STYLE.md Compliance: All Go code must follow the conventions documented in `.standards/STYLE.md`
- TESTING.md Compliance: All tests must follow the patterns documented in `.standards/TESTING.md`
- golangci-lint: Code must pass `golangci-lint run` with zero errors before completion
Required Dependencies
- OCI Operations: Use `github.com/jmgilman/go/oci` for all OCI registry operations
  - Client creation: `oci.New()` or `oci.NewWithOptions()`
  - Push: `client.Push(ctx, sourceDir, reference, opts...)`
  - Pull: `client.Pull(ctx, reference, targetDir, opts...)`
  - List files: `client.ListFiles(ctx, reference)` (TOC-only, bandwidth efficient)
  - Selective extraction: `oci.WithFilesToExtract(patterns...)`
- Filesystem Abstractions: Use `github.com/jmgilman/go/fs/core` and `github.com/jmgilman/go/fs/billy` for all file system operations
  - Pass `oci.WithFilesystem(fsys)` to enable testability
  - Use `billy.NewMemoryFS()` in unit tests
  - Use `billy.NewLocalFS()` for production
Verification Checklist
Before marking this work unit complete, verify:
- `golangci-lint run ./libs/refs/...` passes with zero errors
- All code follows STYLE.md conventions (functional options, error wrapping, etc.)
- All tests follow TESTING.md patterns (table-driven tests, test helpers, etc.)
- Unit tests use memory filesystem via `billy.NewMemoryFS()` where applicable
Acceptance Criteria
- `libs/refs/` Go module exists with `go.mod`
- `Client` interface is defined with `ListFiles`, `Pull`, `Push`, `GetManifest`, `GetDigest`
- `//go:generate` directive produces `mocks/client.go`
- `IsOCIRef()` correctly identifies OCI URLs (explicit and auto-detect)
- `ParseOCIRef()` extracts registry, repository, tag/digest components
- `NewClient()` factory creates client with Docker credential chain
- Security limits are configurable: max file size, max total size, max file count
- Functional options pattern used for configuration
- Dedicated error types provide actionable error messages
- Unit tests pass for URL detection (OCI vs non-OCI)
- Unit tests pass for URL parsing (various formats)
- Unit tests pass with mocked OCI client
- `github.com/jmgilman/go/oci` added to `cli/go.mod`