submit¶
bgqmap submit
submits a set of commands as jobs to the workload manager.
The commands to be executed come from a file with the following format:
# pre-command 1
# pre-command 2
...
# pre-command l
## job parameters
job 1
job 2 ## job specific parameters
...
job m
# post-command 1
# post-command 2
...
# post-command n
- Job pre-commands
- Commands to be executed before any job
- Job parameters
- Resources requested from the workload manager (e.g. memory or cores)
- Job command
- Bash command to be executed. One command corresponds to one job unless groups are made
- Job post-commands
- Commands to be executed after all jobs
An example of such a file:
# module load anaconda3
# source activate oncodrivefml
## cores=6, memory=25G
oncodrivefml -i acc.txt -e cds.txt -o acc.out
oncodrivefml -i blca.txt -e cds.txt -o blca.out
bgqmap submit
is a tool intended not only to ease job submission,
but also to limit the number of jobs that a single user submits at once,
preventing any one user from taking over the whole cluster.
General features¶
It uses the current directory as the working directory for the jobs.
The command line parameters override general job parameters but not specific job parameters (see below).
There is a limit of 1000 commands per submission without grouping.
Grouping means that a set of x
commands is executed sequentially, one after the
other, as part of the same job. If one command fails, the job is terminated.
Warning
Job specific parameters are ignored in grouped submissions.
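The grouping behavior described above can be sketched as a simple chunking of the command list; each chunk then becomes a single job whose commands run sequentially (stopping at the first failure, e.g. via `set -e` in the generated script). This is an illustrative sketch, not bgqmap's actual code:

```python
# Hypothetical sketch of grouping: split the job commands into chunks of
# at most g commands; each chunk is submitted as one job.
def group_commands(commands, g):
    """Return the commands split into groups of at most g."""
    return [commands[i:i + g] for i in range(0, len(commands), g)]

commands = ["cmd1", "cmd2", "cmd3", "cmd4", "cmd5"]
print(group_commands(commands, 2))
# → [['cmd1', 'cmd2'], ['cmd3', 'cmd4'], ['cmd5']]
```

With 1000 commands and `-g 4`, for example, only 250 jobs would be submitted.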
If a job is killed due to high memory usage, it is resubmitted automatically (as long as bgqmap is active), requesting twice the memory of the previous execution. This self-resubmission happens at most twice.
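The resubmission policy above can be sketched as follows. The function name and retry bookkeeping are illustrative assumptions, not bgqmap's API:

```python
# Hypothetical sketch of the self-resubmission policy: a job killed for
# exceeding its memory request is retried with double the memory, at most
# twice; after that bgqmap gives up on it.
MAX_RESUBMISSIONS = 2

def next_memory_request(memory_gb, resubmissions):
    """Return the new memory request in GB, or None if the retry budget is spent."""
    if resubmissions >= MAX_RESUBMISSIONS:
        return None  # already resubmitted twice: do not retry again
    return memory_gb * 2

print(next_memory_request(4, 0))   # → 8
print(next_memory_request(8, 1))   # → 16
print(next_memory_request(16, 2))  # → None
```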
How does it work?¶
Reading the jobs file¶
The lines at the head of the file starting with a single #
are interpreted as job pre-commands.
Any non-empty line that does not start with #
is considered a job command.
If a job command contains ##,
anything from that point on is interpreted as
specific job parameters.
Any line starting with #
after the first job command is interpreted as a
job post-command.
Any line starting with ##
is assumed to contain the general job parameters.
That is, the parameters (memory, cores…) for all the jobs.
Warning
To increase readability, we highly recommend placing all post-commands at the end of the file and the general parameters right after the pre-commands.
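The parsing rules above can be sketched in a few lines. This is a minimal illustration of the described format, not bgqmap's actual parser:

```python
# Hypothetical sketch of the jobs-file parsing rules:
#   '##' lines hold the general job parameters,
#   '#' lines before the first job command are pre-commands,
#   '#' lines after it are post-commands,
#   any other non-empty line is a job command, optionally followed by
#   '##' and its specific job parameters.
def parse_jobs_file(lines):
    pre, post, jobs = [], [], []
    general = None
    seen_job = False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("##"):
            general = line[2:].strip()
        elif line.startswith("#"):
            (post if seen_job else pre).append(line[1:].strip())
        else:
            seen_job = True
            cmd, _, params = line.partition("##")
            jobs.append((cmd.strip(), params.strip() or None))
    return pre, general, jobs, post
```

Applied to the oncodrivefml example file above, this yields two pre-commands, the general parameters `cores=6, memory=25G`, two job commands, and no post-commands.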
Generating the jobs¶
Once the jobs file is parsed, the jobs are created. This process involves:
creating an output directory and copying the jobs file
Note
If the output directory is not empty,
bgqmap
will fail.
each job receives an id that corresponds to its line in the jobs file
for each job command, one file with the job metadata is created (named as <job id>.info)
Warning
To prevent lots of writing to the
.info
file, bgqmap
only writes to disk in special cases: when explicitly asked or before exiting.
for each job, a bash script file with the commands to be executed is created. The file is named as <job id>.sh and consists of:
- all the pre-commands
- all the commands in the group, or a single command if no groups are made
- all the post-commands
Note
The job commands can contain two wildcards that are expanded before job submission:
- ${JOBID}: identifier of the job (the same for all the commands in a group)
- ${LINE}: identifier of the line of the job (unique for each command)
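The wildcard expansion can be sketched as a plain string substitution. The function name is illustrative, not bgqmap's code:

```python
# Hypothetical sketch of wildcard expansion before job submission:
# ${JOBID} is shared by all commands in a group, ${LINE} is unique
# per command.
def expand_wildcards(command, job_id, line_no):
    """Replace the ${JOBID} and ${LINE} wildcards in a job command."""
    return (command
            .replace("${JOBID}", str(job_id))
            .replace("${LINE}", str(line_no)))

print(expand_wildcards("myscript --out result_${JOBID}_${LINE}.txt", 0, 3))
# → myscript --out result_0_3.txt
```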
Running the jobs¶
The jobs start to be submitted to the workload manager. Only a certain amount of jobs is submitted at once, according to the
--max-running
parameter. This parameter accounts for both running and pending jobs.
Each job requests certain resources from the workload manager. The order of priority is: command line parameters, general job parameters from the jobs file, and default parameters.
Note
If no grouping is performed and the job contains specific job parameters, these have the highest priority.
The job output to the standard output is logged in a file named <job id>.out, and the output to the standard error in <job id>.err.
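The priority rules for job resources can be sketched as a series of dictionary merges, with later sources overriding earlier ones. Names and the dictionary representation are illustrative assumptions:

```python
# Hypothetical sketch of resource resolution: defaults < general (jobs
# file) < command line; specific job parameters win only when no
# grouping is performed.
DEFAULTS = {"cores": 2, "memory": "4G"}

def resolve_params(defaults, general, cli, specific=None, grouped=False):
    """Merge job parameters by increasing priority."""
    merged = {**defaults, **(general or {}), **(cli or {})}
    if specific and not grouped:
        merged.update(specific)  # specific parameters ignored when grouped
    return merged

print(resolve_params(DEFAULTS, {"memory": "8G"}, {"cores": 4},
                     specific={"memory": "10G"}))
# → {'cores': 4, 'memory': '10G'}
```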
Usage¶
bgqmap submit -m <memory> -c <cores> <jobs file>
Usage: bgqmap submit [OPTIONS] JOBS_FILE
Submit a set of jobs
- The following values will be extended
- ${JOBID}: job id
- ${LINE}: line number in the input file
- Options:
-l, --logs PATH                 Output folder for the bgqmap log files. Default: a folder created in the current directory.
-r, --max-running INTEGER       Maximum number of jobs running/waiting. Default: 4.
-g, --group INTEGER             Group several commands into one job. Default: no grouping.
--no-console                    Show simple terminal console.
-c, --cores INTEGER             Number of cores to use. Default: 2.
-m, --memory TEXT               Max memory. Default: 4G. Units: K|M|G|T. Default units: G.
-t, --wall_time TEXT            Wall time for the job. Default: no wall time.
-w, --working_directory TEXT    Working directory. Default: current.
-h, --help                      Show this message and exit.
Examples¶
Using this jobs file:
sleep 5 && echo 'hello world after 5'
sleep 10 && echo 'hello world after 10'
Basic example:
$ bgqmap submit -m 1 -c 1 examples/input/hello.map --no-console
Finished vs. total: [0/2]
Job 0 done. [1/2]
Job 1 done. [2/2]
Execution finished
In the output directory of bgqmap, you can find a copy
of the input file (as bgqmap_input
) and, for each
job, up to 4 different files, as explained above:
$ ls bgqmap_output_20170905
0.err 0.info 0.out 0.sh 1.err 1.info 1.out 1.sh bgqmap_input
The output directory must be empty before the submission:
$ bgqmap submit examples/input/hello.map --no-console
BgQmapError: Output folder [bgqmap_output_20170905] is not empty. Please give a different folder to write the output files.
Grouping reduces the number of jobs, but specific job execution parameters are ignored:
$ bgqmap submit examples/input/hello.map -g 2 --no-console
Specific job execution parameters ignored
Finished vs. total: [0/1]
Job 0 done. [1/1]
Execution finished
The following examples make use of this other jobs file:
# module load anaconda3/4.4.0
## memory=8G
python memory.py 8
python memory.py 10 ## memory=10G
The working directory option is helpful when your jobs file does not contain the full path to your script:
$ bgqmap submit examples/input/memory.map --no-console
Finished vs. total: [0/2]
Job 2 failed. [1/2]
Job 3 failed. [2/2]
Execution finished
$ bgqmap submit examples/input/memory.map -w test/python_scripts/ --no-console
Finished vs. total: [0/2]
Job 2 done. [1/2]
Job 3 done. [2/2]
Execution finished