Author: Alice Braun [email protected]
Last update: 2025-07-11
Note
This short README is designed to help you quickly set up and use the RICOPILI pipeline with centrally deployed dependencies and imputation references on the national Dutch supercomputer SURFsnellius.
For more comprehensive documentation on how to install RICOPILI and its dependencies, run the modules, and interpret their output, please visit: sites.google.com/a/broadinstitute.org/ricopili/
Download RICOPILI from GitHub via ssh from https://github.com/Ripkelab/ricopili
git clone git@github.com:Ripkelab/ricopili.git
mv ricopili/rp_bin/ ~
Trouble running Git on a cluster?
If you are unable to download from GitHub, you need to copy your SSH key to GitHub first:
ssh-keygen -t rsa -b 4096
When prompted, save the key in ~/.ssh/id_rsa. Then start the SSH agent and add your key to it:
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa
Now log into GitHub in your browser and navigate to Settings > SSH and GPG keys.
Click New SSH key and paste the contents of your public key file:
cat ~/.ssh/id_rsa.pub
Finally, verify the SSH config file:
vim ~/.ssh/config
Add the following lines:
Host github.com
    HostName github.com
    User git
    IdentityFile ~/.ssh/id_rsa
Test the connection:
ssh -T git@github.com
Alternatively, download RICOPILI via wget:
wget https://personal.broadinstitute.org/braun/sharing/rp_bin.2025_Feb_20.001.tar.gz
wget https://personal.broadinstitute.org/braun/sharing/rp_bin.2025_Feb_20.001.md5.cksum
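# verify checksum: compare the digest of the tarball with the contents of the .md5.cksum file
md5sum rp_bin.2025_Feb_20.001.tar.gz
cat rp_bin.2025_Feb_20.001.md5.cksum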
tar -xvzf rp_bin.2025_Feb_20.001.tar.gz
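The checksum comparison can also be automated. The following is a minimal sketch, assuming the .md5.cksum file contains the 32-character MD5 digest somewhere in its text (its exact layout is not described in this README). The function name verify_md5 is hypothetical and not part of RICOPILI.

```shell
#!/usr/bin/env bash
# verify_md5 TARBALL CKSUM_FILE  (hypothetical helper, not part of RICOPILI)
# Succeeds only if the MD5 digest of TARBALL appears in CKSUM_FILE.
# Assumption: CKSUM_FILE contains the hex digest somewhere in its text.
verify_md5() {
    local tarball="$1" cksum_file="$2" digest
    # md5sum prints "<digest>  <filename>"; keep the first field only
    digest="$(md5sum "$tarball" | awk '{print $1}')"
    # an empty digest means md5sum could not read the tarball
    [ -n "$digest" ] || return 2
    # -i: ignore case, -F: fixed string rather than a regular expression
    grep -qiF -- "$digest" "$cksum_file"
}

# Example (commented out; extracts only when the digest matches):
# verify_md5 rp_bin.2025_Feb_20.001.tar.gz rp_bin.2025_Feb_20.001.md5.cksum \
#     && tar -xvzf rp_bin.2025_Feb_20.001.tar.gz
```

The same function should work for the dependency tarball and its .md5.cksum file, under the same assumption about the file contents.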
Note
We recommend installing an additional set of software (mostly R packages) via conda/mamba:
# download conda yaml file to build environment with all necessary R packages
wget https://personal.broadinstitute.org/braun/sharing/rp_env_0225b.yaml
mamba env create -n rp_env -f rp_env_0225b.yaml
Note
If you'd like to install RICOPILI on a cluster other than the Broad UGER or the SURFsnellius supercomputer, we recommend downloading the dependency tarball:
wget https://personal.broadinstitute.org/braun/sharing/ricopili_dependencies_0225b.tar.gz
wget https://personal.broadinstitute.org/braun/sharing/ricopili_dependencies_0225b.md5.cksum
# verify checksum
md5sum ricopili_dependencies_0225b.tar.gz
tar -xvzf ricopili_dependencies_0225b.tar.gz
On SURFsnellius you may use the centrally installed dependencies on PGC DAC:
/gpfs/work5/0/pgcdac/ricopili_download/dependencies/
To install RICOPILI quickly, create a file called ricopili.conf in your home directory.
Edit ricopili.conf with your preferred text editor and paste the following contents, replacing the init and email values with your personal information.
eloc /home/$USER/.conda/envs/rp_env/bin/ # conda environment installation
i2loc /gpfs/work5/0/pgcdac/ricopili_download/dependencies/impute_v2
i4loc /gpfs/work5/0/pgcdac/ricopili_download/dependencies/impute_v4
hmloc /gpfs/work5/0/pgcdac/ricopili_download/dependencies/hapmap_ref/
minimac3loc /gpfs/work5/0/pgcdac/ricopili_download/dependencies/Minimac3/
minimac4loc /gpfs/work5/0/pgcdac/ricopili_download/dependencies/Minimac4/minimac4-4.1.2-Linux-x86_64/bin/
gmloc /gpfs/work5/0/pgcdac/ricopili_download/dependencies/genetic_map_files
sh5loc /gpfs/work5/0/pgcdac/ricopili_download/dependencies/shapeit5
plink2loc /gpfs/work5/0/pgcdac/ricopili_download/dependencies/plink2/plink2
rloc /home/$USER/.conda/envs/rp_env/bin/R # conda environment installation
ldsc_start /home/$USER/.conda/envs/ldsc/bin/ # conda ldsc environment which runs on python 2.7
sh3loc /gpfs/work5/0/pgcdac/ricopili_download/dependencies/shapeit3
tabixloc /gpfs/work5/0/pgcdac/ricopili_download/dependencies/tabix/
bcloc /gpfs/work5/0/pgcdac/ricopili_download/dependencies/bcftools/bcftools-1.18
bcloc_plugins /gpfs/work5/0/pgcdac/ricopili_download/dependencies/bcftools/bcftools-1.18/plugins/
ealoc /gpfs/work5/0/pgcdac/ricopili_download/dependencies/eagle
bgziploc /gpfs/work5/0/pgcdac/ricopili_download/dependencies/bgzip/
ldsc_ref /gpfs/work5/0/pgcdac/ricopili_download/dependencies/ldsc/
liloc /gpfs/work5/0/pgcdac/ricopili_download/dependencies/liftover/
rpac /gpfs/home1/$USER/.conda/envs/rp_env/lib/R/library # conda environment installation
p2loc /gpfs/work5/0/pgcdac/ricopili_download/dependencies/plink
shloc /gpfs/work5/0/pgcdac/ricopili_download/dependencies/shapeit
meloc /gpfs/work5/0/pgcdac/ricopili_download/dependencies/metal/
bcrloc /gpfs/work5/0/pgcdac/ricopili_download/dependencies/bcftools/resources/
home /home/$USER/
sloc /scratch-local/$USER/
init ab
email <YOUR-EMAIL>
loloc /home/$USER/
batch_jobcommand sbatch
batch_name -J_SPACE_XXX
batch_jobfile XXX
batch_memory_request NONE
batch_walltime --time=0-HH:MM:SS
batch_array --array=1-XXX%YYY
batch_stdout -o_SPACE_XXX/%x-%j.out
batch_stderr -e_SPACE_XXX/%x-%j.out
batch_job_dependency --dependency=afterany:XXX
batch_array_task_id $SLURM_ARRAY_TASK_ID
batch_max_parallel_jobs_per_one_array added_to_array
batch_other_job_flags --partition=genoa_SPACE_--cpus-per-task=32
batch_job_output_jid Submitted_SPACE_batch_SPACE_job_SPACE_XXX
batch_ncores_per_node 32
batch_mem_per_node 56
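Before running the configuration script, it can help to confirm that the paths listed in ricopili.conf exist for your account. The following is a minimal sketch, not part of RICOPILI: check_conf_paths is a hypothetical helper, and it assumes one "key value" pair per line, optional trailing "# ..." comments, and that a literal $USER in a value stands for your user name.

```shell
#!/usr/bin/env bash
# check_conf_paths CONF_FILE  (hypothetical helper, not part of RICOPILI)
# Prints every absolute path in CONF_FILE that does not exist on disk and
# returns non-zero if any are missing.
# Assumptions: "key value" per line, anything after the value is a comment,
# and a literal $USER in a value stands for the current user name.
check_conf_paths() {
    local conf="$1" key value missing=0
    while read -r key value _ || [ -n "$key" ]; do
        case "$value" in
            /*) ;;            # only absolute paths are checked
            *) continue ;;    # skip initials, email, batch templates, blanks
        esac
        value="${value//\$USER/$USER}"   # expand the literal $USER placeholder
        if [ ! -e "$value" ]; then
            printf 'missing: %s -> %s\n' "$key" "$value"
            missing=1
        fi
    done < "$conf"
    return "$missing"
}

# Example (commented out):
# check_conf_paths ~/ricopili.conf
```

A missing conda path (for example the rp_env or ldsc environment) usually just means the environment has not been created yet under that name.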
Start the script ./rp_config from within the rp_bin directory.
This is an interactive script that will take care of the installation in your computer cluster environment.
If RICOPILI is already installed under your account, the script asks whether you wish to unset the RICOPILI PATH settings first. For a first-time custom installation, doing so is highly recommended.
The configuration script will give you the two commands you have to issue. You just need to copy/paste them into the command line.
In addition, you need to add the following lines to your .bashrc file, if they haven't been added automatically by the rp_config module:
export rp_perlpackages="/gpfs/work5/0/pgcdac/ricopili_download/dependencies/perl_modules"
export RPHOME="/home/$USER"
export PATH="/gpfs/work5/0/pgcdac/ricopili_download/dependencies/plink:$PATH"
export PATH="/home/$USER/rp_bin/pdfjam:$PATH"
export PATH="/home/$USER/rp_bin/pdfjam/pdfjoin:$PATH"
export PATH="/gpfs/work5/0/pgcdac/ricopili_download/dependencies/texlive/2025/2025/bin/x86_64-linux:$PATH"
export MANPATH="/gpfs/work5/0/pgcdac/ricopili_download/dependencies/texlive/2025/2025/texmf-dist/doc/man:$MANPATH"
export INFOPATH="/gpfs/work5/0/pgcdac/ricopili_download/dependencies/texlive/2025/2025/texmf-dist/doc/info:$INFOPATH"
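Because rp_config may already have added some of these lines, appending them blindly can leave duplicates in .bashrc. The following is a minimal sketch of adding a line only when it is not yet present; append_once is a hypothetical helper name, not something RICOPILI provides.

```shell
#!/usr/bin/env bash
# append_once LINE FILE  (hypothetical helper, not part of RICOPILI)
# Appends LINE to FILE unless an identical line is already present,
# so the step can be repeated safely after rp_config has run.
append_once() {
    local line="$1" file="$2"
    touch "$file"
    # -x: match the whole line, -F: treat LINE as a fixed string, not a regex
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example (commented out so that nothing is modified by accident):
# append_once 'export RPHOME="/home/$USER"' ~/.bashrc
```

Single quotes around the line keep $USER and $PATH unexpanded, so they are written to .bashrc literally and resolved at login.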
If the configuration script cannot find a configuration file (by default it looks for a file named rp_config.custom.txt), it creates an empty one, which you and/or a system administrator need to fill in.
This file follows a two-column structure, with variable names in the first column and variable values in the second. "###" marks a comment.
The whitespace between the two columns can be as long as necessary, but spaces are not allowed within a value. Use the term _SPACE_ instead where needed.
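To illustrate the two-column format and the _SPACE_ convention, the following minimal sketch reads one value from such a file and turns _SPACE_ back into real spaces. It is only an illustration of the file format: rp_conf_value is a hypothetical helper, and RICOPILI's own parser may behave differently.

```shell
#!/usr/bin/env bash
# rp_conf_value FILE NAME  (hypothetical helper, not part of RICOPILI)
# Prints the value stored for NAME in a two-column config file, with every
# _SPACE_ token turned back into a real space.
# Assumption: lines starting with ### are comments.
rp_conf_value() {
    local file="$1" name="$2" key value
    while read -r key value _ || [ -n "$key" ]; do
        case "$key" in '###'*) continue ;; esac   # skip comment lines
        if [ "$key" = "$name" ]; then
            printf '%s\n' "${value//_SPACE_/ }"
            return 0
        fi
    done < "$file"
    return 1   # NAME not found
}

# Example (commented out):
# rp_conf_value rp_config.custom.txt batch_other_job_flags
```

For a line such as "batch_name -J_SPACE_XXX", the function prints "-J XXX", which shows why a value containing spaces has to be written with _SPACE_ to stay within a single column.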
To run the next step of the configuration on SURFsnellius, you can copy and paste the following into the rp_config.custom.txt file in the rp_bin directory, replacing the values of rp_user_initials, rp_user_email, and rp_logfiles:
### for details please refer to https://docs.google.com/document/d/14aa-oeT5hF541I8hHsDAL_42oyvlHRC5FWR7gir4xco/edit?usp=sharing
### and https://docs.google.com/spreadsheets/d/1LhNYIXhFi7yXBC17UkjI1KMzHhKYz0j2hwnJECBGZk4/edit?usp=sharing
variable_name variable_value
----------------------------------------------
rp_dependencies_dir /gpfs/work5/0/pgcdac/ricopili_download/dependencies
R_packages_dir /home/$USER/R/x86_64-pc-linux-gnu-library/4.3/
starting_R module_SPACE_load_SPACE_2023;module_SPACE_load_SPACE_R/4.3.2-gfbf-2023a;_SPACE_R
path_to_Perlmodules /gpfs/work5/0/pgcdac/ricopili_download/dependencies/perl_modules
path_to_scratchdir /scratch-local
starting_ldsc /home/$USER/.conda/envs/rp_env/bin/
ldsc_reference /gpfs/work5/0/pgcdac/ricopili_download/dependencies/ldsc
rp_user_initials <YOUR_INITIALS>
rp_user_email <YOUR_MAIL>
rp_logfiles /home/$USER/
----------------------------------------
----------------------------------------
---- jobarray and queueing parameters:
----------------------------------------
----------------------------------------
batch_jobcommand sbatch
batch_memory_request NONE
batch_walltime --time=0-HH:MM:SS
batch_array --array=1-XXX%YYY
batch_max_parallel_jobs_per_one_array added_to_array
batch_jobfile XXX
batch_name -J_SPACE_XXX
batch_stdout -o_SPACE_XXX/%x-%j.out
batch_stderr -e_SPACE_XXX/%x-%j.out
batch_job_dependency --dependency=afterany:XXX
batch_array_task_id $SLURM_ARRAY_TASK_ID
batch_other_job_flags --partition=thin_SPACE_--cpus-per-task=32
batch_job_output_jid Submitted_SPACE_batch_SPACE_job_SPACE_XXX
batch_ncores_per_node 32
batch_mem_per_node 56
After creating these files, run ./rp_config again.
Follow the instructions, but do not replace the config file you have just copy-pasted.
Warning
- Currently, the libgsl.so.23 dependency for EIGENSOFT is not available on SURFsnellius.
You can install EIGENSOFT through conda/mamba:
mamba install bioconda::eigensoft
and add the following to your ricopili.conf (assuming your environment is called "rp_env"):
eloc /home/$USER/.conda/envs/rp_env/bin/
- LDSC is not available as a module on SURFsnellius.
As it uses Python 2.7, you can install LDSC into a new environment through conda/mamba:
mamba create -n ldsc python=2.7.15 -c conda-forge -c bioconda ldsc
Try to start LDSC manually to see if it runs, and then add the following to your ricopili.conf:
ldsc_start /home/$USER/.conda/envs/ldsc/bin/
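The manual start-up test can be sketched as follows. This assumes that the bioconda ldsc package places an executable named ldsc.py in the environment's bin directory. Check your own environment if the name differs. check_ldsc_dir is a hypothetical helper, not part of RICOPILI.

```shell
#!/usr/bin/env bash
# check_ldsc_dir DIR  (hypothetical helper, not part of RICOPILI)
# Succeeds if DIR contains an executable ldsc.py that exits cleanly on -h.
# Assumption: the conda package installs ldsc.py into the environment's bin/.
check_ldsc_dir() {
    local dir="${1%/}"   # drop a trailing slash, if any
    if [ ! -x "$dir/ldsc.py" ]; then
        echo "no executable ldsc.py in $dir" >&2
        return 1
    fi
    # run the help screen only; discard its output and keep the exit status
    "$dir/ldsc.py" -h > /dev/null 2>&1
}

# Example (commented out):
# check_ldsc_dir "/home/$USER/.conda/envs/ldsc/bin/"
```

If the check succeeds, the directory passed to it is the value to use for ldsc_start.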