One of the issues we face in running large-scale distributed simulations is building an automated workflow system that sends jobs out to available workers without human intervention. FireWorks is one tool designed to do this, and it is particularly suited to dynamic workflows, where the output of one stage of the workflow influences the parameters of the next stage.
In this image, FireWorks can fetch a PhoSim task from an external database and run it. FireWorks can also be run in "rapidfire" mode, where it continually pulls down tasks and executes them sequentially until it runs out of tasks or the container is killed.
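As a rough illustration of the rapidfire behaviour described above, the loop below pulls commands from a queue and runs them one after another until none are left. It is a conceptual sketch only, not FireWorks code: the names rapidfire_loop, task_queue and run are hypothetical, and an in-memory queue stands in for the external database.

```python
import subprocess
from collections import deque


def rapidfire_loop(task_queue, run=None):
    """Run queued shell commands sequentially until the queue is empty.

    Conceptual stand-in for rapidfire mode. `task_queue` (a deque of
    command strings) is a hypothetical substitute for the task database.
    """
    if run is None:
        # Default runner (an assumption): execute the command through the
        # shell and report its exit code.
        def run(command):
            return subprocess.run(command, shell=True).returncode

    results = []
    while task_queue:
        command = task_queue.popleft()  # fetch the next pending task
        exit_code = run(command)        # finish it before fetching another
        results.append((command, exit_code))
    return results
```

Calling rapidfire_loop(deque(["echo first", "echo second"])) would run the two commands in order and then return, because the queue is empty.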
Specific instructions for running on Edison @ NERSC
Note that you need to be using a bash shell at NERSC; otherwise, there will be a conflict between the bash environment of the batch job and your own environment setup.
To add jobs to the LaunchPad (the FireWorks DB), use a script similar to this:

    fw_id: -1
    name: Unnamed FW
    spec:
      _tasks:
      - _fw_name: ScriptTask
        script:
        - aprun -n 1 ./home/phosim-3.4.2/phosim -c /home/phosim3.4.2/examples/nobackground /home/phosim-3.4.2/examples/star
        use_shell: true

- The full paths to everything within the image must be used.
- Note that aprun -n 1 launches a single task, so the job runs on only one node.
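
When many PhoSim runs need to be queued, writing one such file per job by hand gets tedious. The sketch below generates files with the same layout as the script above, using only the Python standard library. The helper names firework_yaml and write_fireworks and the fw_N.yaml file naming are assumptions made for illustration, not part of FireWorks or of this image.

```python
import json
from pathlib import Path


def firework_yaml(command, name="Unnamed FW"):
    """Return a single-task Firework definition as YAML text.

    The name and command are written as double-quoted strings (via
    json.dumps) so that characters such as ':' or '#' cannot break the YAML.
    """
    return (
        "fw_id: -1\n"
        f"name: {json.dumps(name)}\n"
        "spec:\n"
        "  _tasks:\n"
        "  - _fw_name: ScriptTask\n"
        "    script:\n"
        f"    - {json.dumps(command)}\n"
        "    use_shell: true\n"
    )


def write_fireworks(commands, out_dir):
    """Write one YAML file per command into out_dir and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, command in enumerate(commands):
        path = out / f"fw_{index}.yaml"  # hypothetical naming scheme
        path.write_text(firework_yaml(command))
        paths.append(path)
    return paths
```

Each generated file can then be added to the LaunchPad with lpad add, for example lpad add fw_0.yaml. As noted above, the commands passed in should use full paths inside the image.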