Assume Bob has a file named tree.py. This piece of code accepts command line parameters specifying the data and the target space, performs three-fold cross validation and prints the result. A call such as:
python tree.py --super_big_data somedata.dat --super_multitarget_space targets.dat
returns something along the lines of:
RESULT10fCV 1 0.89
RESULT10fCV 2 0.85
RESULT10fCV 3 0.81
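To make the interface concrete, here is a minimal sketch of how the command line handling and the result formatting of such a script could look. Only the two option names and the output format are taken from the example above; the function names `parse_args`, `fold_indices` and `format_results` are hypothetical, the split is a plain unshuffled one, and the actual model training in tree.py is not shown.

```python
import argparse


def parse_args(argv=None):
    """Parse the two options used in the example call (sketch, not the real tree.py)."""
    parser = argparse.ArgumentParser(description="Hypothetical sketch of tree.py")
    parser.add_argument("--super_big_data", required=True, help="path to the data file")
    parser.add_argument("--super_multitarget_space", required=True, help="path to the targets file")
    return parser.parse_args(argv)


def fold_indices(n_samples, n_folds=3):
    """Yield (train, test) index lists for a simple unshuffled k-fold split."""
    indices = list(range(n_samples))
    for k in range(n_folds):
        test = indices[k::n_folds]  # every n_folds-th sample, starting at k
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


def format_results(scores):
    """Return one line per fold in the format shown above."""
    return [f"RESULT10fCV {fold} {score:.2f}" for fold, score in enumerate(scores, start=1)]
```

A real script would load both files, train and score a model on each split from `fold_indices`, and print the lines returned by `format_results`.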
Of course, Bob is interested in obtaining the above cross validation results for some new data. Suppose Bob is not able to exactly replicate the environment on his own, for one of the following reasons:
- Some driver conflict prohibits installation of one of the libraries
- He's unable to install everything listed in requirements.txt directly, for whatever reason
- He's lazy
In that case, we can offer him a simple solution: the Singularity container. The idea is the following:
- Precompile a Python environment with minimal spatial overhead and all required libraries
- Share the environment as a single file
Singularity files are images, just like, e.g., your favourite Star Wars movie DVD. Given a Singularity image, e.g., OurSingularityImage.sif, the above piece of code can be run as:
singularity exec OurSingularityImage.sif python tree.py --super_big_data Bob_somedata.dat --super_multitarget_space Bob_targets.dat
Here:
- singularity = call of the tool
- exec = subcommand to execute a given command inside the image (there is also run)
- OurSingularityImage.sif = our environment image
The output should, of course, be different, e.g.:
RESULT10fCV 1 0.31
RESULT10fCV 2 0.32
RESULT10fCV 3 0.33
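If Bob wants to repeat this for several data sets, the same call can be scripted. The following is a minimal sketch, assuming the singularity executable is on the PATH and tree.py sits in the current directory; the helper names `build_exec_command` and `run_in_container` are hypothetical and not part of Singularity itself.

```python
import shlex
import subprocess


def build_exec_command(image, data, targets, script="tree.py"):
    """Return the argument list for running the script inside the image."""
    return [
        "singularity", "exec", image,
        "python", script,
        "--super_big_data", data,
        "--super_multitarget_space", targets,
    ]


def run_in_container(image, data, targets):
    """Run the command and return its standard output (requires Singularity)."""
    command = build_exec_command(image, data, targets)
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return completed.stdout


# Show the command line that would be executed, without running it.
command = build_exec_command("OurSingularityImage.sif", "Bob_somedata.dat", "Bob_targets.dat")
print(shlex.join(command))
```

Passing the arguments as a list rather than as one shell string avoids quoting problems if a file name contains spaces.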
That's it. Bob was able to replicate our environment and run the codebase.