Symlink input files into temporary work folder to make auxiliary output paths work #161

@samuell

Problem

Running a command like bwa index on a file like seq.fa produces a number of files named after the input file, such as:

seq.fa.bwt
seq.fa.pac
seq.fa.ann
seq.fa.amb
seq.fa.sa

To let SciPipe detect whether this step has already been run, one can declare one of these files (here the .bwt file) as an output of the process, like so:

idx := wf.NewProc("index-ref", "bwa index {i:ref} # Output: {o:bwt}")
idx.In("ref").From(unpack.Out("ungzipped")) // Given that unpack is some upstream process
idx.SetOut("bwt", "{i:ref}.bwt")

However, when the input file is outside of the temporary run directory, these files are written next to the input file rather than in the temp dir. SciPipe then does not recognize that the outputs have been created, and gives an error like:

ERROR   2025/09/25 09:48:05 [Task:index-ref] Missing output temp-file (_scipipe_tmp.index-ref.d50a9c348f87ae9ecbd3d14652cde9fa8691426c/data/efaecium.fna.bwt) for ip with path (data/efaecium.fna.bwt)

This should be fixed by first symlinking the input file into the temp dir, and using the local path to the symlink in the command.
