submit

bgqmap submit submits a set of commands as jobs to the workload manager. The commands to be executed are read from a file with the following format:

# pre-command 1
# pre-command 2
...
# pre-command l
## job parameters

job 1
job 2  ## job specific parameters
...
job m

# post-command 1
# post-command 2
...
# post-command n
Job pre-commands
Commands to be executed before any job
Job parameters
Resources requested from the workload manager (e.g. memory or cores)
Job command
Bash command to be executed. One command corresponds to one job unless grouping is used
Job post-commands
Commands to be executed after all jobs

An example of such file:

# module load anaconda3
# source activate oncodrivefml
## cores=6, memory=25G

oncodrivefml -i acc.txt -e cds.txt -o acc.out
oncodrivefml -i blca.txt -e cds.txt -o blca.out

bgqmap submit is a tool that is not only intended to ease job submission, but also tries to limit the number of jobs that a user submits at once, preventing any single user from taking over the whole cluster.

General features

It uses the current directory as the working directory for the jobs.

The command line parameters override general job parameters but not specific job parameters (see below).

There is a limit of 1000 commands per submission without grouping. Grouping means that a set of commands is executed one after the other as part of the same job. If one command fails, the job is terminated.
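Grouping can be pictured as chunking the command list so that each chunk becomes a single job. A minimal sketch, assuming a simple fixed-size split (the function name and behavior are illustrative, not bgqmap's actual internals):

```python
def group_commands(commands, group_size):
    """Split the list of job commands into groups; each group becomes one job.
    Hypothetical helper, not bgqmap's actual code."""
    return [commands[i:i + group_size]
            for i in range(0, len(commands), group_size)]

# 5 commands submitted with -g 2 yield 3 jobs instead of 5:
jobs = group_commands(["cmd1", "cmd2", "cmd3", "cmd4", "cmd5"], 2)
# jobs -> [["cmd1", "cmd2"], ["cmd3", "cmd4"], ["cmd5"]]
```

Inside a job, running the grouped commands "one after the other, terminating on failure" matches the semantics of chaining them with && in bash.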

Warning

Job specific parameters are ignored in grouped submissions.

If a job is killed due to high memory usage, it is resubmitted automatically (as long as bgqmap is active), requesting twice the memory of the previous execution. A job is resubmitted this way at most twice.
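The resubmission policy can be sketched as follows; this is an assumed model of the behavior described above (memory expressed in gigabytes for simplicity), not bgqmap's actual code:

```python
MAX_RESUBMISSIONS = 2  # "will only be done twice"

def next_memory_request(memory_gb, resubmissions_so_far):
    """Return the memory to request on the next automatic resubmission
    after an out-of-memory kill, or None when the retry budget is spent.
    Hypothetical sketch of the policy, not bgqmap's implementation."""
    if resubmissions_so_far >= MAX_RESUBMISSIONS:
        return None  # no further automatic resubmission
    return memory_gb * 2

# A job first run with 4G would be retried with 8G, then 16G, then left failed.
```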

How does it work?

Reading the jobs file

The lines at the head of the file with a single # are interpreted as job pre-commands.

Any non-empty line that does not start with # is considered a job command. If a job command contains ##, anything from that point on is interpreted as specific job parameters.

Any line starting with # after the first job command is interpreted as a job post-command.

Any line starting with ## is assumed to contain the general job parameters. That is, the parameters (memory, cores…) for all the jobs.
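The classification rules above can be sketched with a small parser. This is a simplified illustration of the rules, not bgqmap's actual parser:

```python
def classify_lines(lines):
    """Classify jobs-file lines into pre-commands, general parameters,
    job commands (with optional specific parameters) and post-commands."""
    pre, post, jobs = [], [], []
    general = None
    seen_job = False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("##"):           # general job parameters
            general = line[2:].strip()
        elif line.startswith("#"):          # pre- or post-command
            (post if seen_job else pre).append(line[1:].strip())
        else:                               # job command
            seen_job = True
            command, _, params = line.partition("##")
            jobs.append((command.strip(), params.strip() or None))
    return pre, general, jobs, post
```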

Warning

To increase readability, we highly recommend placing all post-commands at the end of the file and the general parameters right after the pre-commands.

Generating the jobs

Once the jobs file is parsed, the jobs are created. This process involves:

  • creating an output directory and copying the jobs file

    Note

    If the output directory is not empty, bgqmap will fail

  • each job receives an id that corresponds to its line number in the jobs file

  • for each job command one file with the job metadata is created (named <job id>.info)

    Warning

    To prevent frequent writes to the .info file, bgqmap only writes it to disk in special cases: when explicitly asked or before exiting.

  • for each job, a bash script file with the commands to be executed is created. The file is named <job id>.sh and consists of:

    • all the pre-commands
    • all the commands in the group, or a single command if no grouping is used
    • all the post-commands

    Note

    The job commands can contain two wildcards that are expanded before job submission:

    • ${JOBID}: identifier of the job (the same for all the commands in a group)
    • ${LINE}: line number of the command in the jobs file (unique for each command)
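Assembling a job script and expanding the wildcards can be sketched as below. This is an illustrative model of the script layout described above, not bgqmap's actual code, and the exact file contents may differ:

```python
def build_script(pre_commands, commands, post_commands, job_id):
    """Assemble the text of a <job id>.sh file: pre-commands, the job's
    command(s) with wildcards expanded, then post-commands.
    Hypothetical sketch; commands is a list of (line_number, command)."""
    lines = list(pre_commands)
    for line_number, command in commands:
        lines.append(command.replace("${JOBID}", str(job_id))
                            .replace("${LINE}", str(line_number)))
    lines.extend(post_commands)
    return "\n".join(lines)

script = build_script(
    ["module load anaconda3"],
    [(3, "run.sh --out out_${LINE}.txt  # job ${JOBID}")],
    ["echo done"],
    job_id=3,
)
```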

Running the jobs

  • the jobs start to be submitted to the workload manager. Only a certain number of jobs is submitted at once, according to the --max-running parameter. This parameter accounts for both running and pending jobs.

  • each job requests certain resources from the workload manager. The order of priority is: command line parameters, then general job parameters from the jobs file, then default parameters.

    Note

    If no grouping is performed and the job contains specific job parameters, those have the highest priority.

  • the job's standard output is logged in a file named <job id>.out and its standard error in <job id>.err.
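The parameter priority described above can be sketched as successive dictionary merges. This is an assumed model, not bgqmap's actual code; the default values are taken from the usage help below:

```python
DEFAULTS = {"cores": 2, "memory": "4G"}  # defaults from the usage help

def resolve_parameters(defaults, general, command_line,
                       specific=None, grouped=False):
    """Merge job parameters by increasing priority:
    defaults < general (jobs file) < command line < specific job parameters.
    Specific parameters only apply when no grouping is used.
    Hypothetical helper, not bgqmap's implementation."""
    params = dict(defaults)
    params.update(general or {})
    params.update(command_line or {})
    if specific and not grouped:
        params.update(specific)  # specific job parameters win when not grouping
    return params
```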

Usage

bgqmap submit -m <memory> -c <cores> <jobs file>

Usage: bgqmap submit [OPTIONS] JOBS_FILE

Submit a set of jobs

The following values will be extended
  ${JOB}: for id
  ${LINE}: for line number in the input file
Options:
-l, --logs PATH
 Output folder for the bgqmap log files. Default a folder is created in the current directory.
-r, --max-running INTEGER
 Maximum number of job running/waiting. Default: 4.
-g, --group INTEGER
 Group several commands into one job. Default: no grouping.
--no-console Show terminal simple console
-c, --cores INTEGER
 Number of cores to use. Default: 2
-m, --memory TEXT
 Max memory. Default: 4G. Units: K|M|G|T. Default units: G
-t, --wall_time TEXT
 Wall time for the job. Default: no wall time.
-w, --working_directory TEXT
 Working directory. Default: current.
-h, --help Show this message and exit.

Examples

Using this jobs file:

sleep 5 && echo 'hello world after 5'
sleep 10 && echo 'hello world after 10'

Basic example:

$ bgqmap submit -m 1 -c 1 examples/input/hello.map --no-console
Finished vs. total: [0/2]
Job 0 done. [1/2]
Job 1 done. [2/2]
Execution finished

In the output directory of bgqmap, you can find a copy of the input file (as bgqmap_input) and up to 4 different files for each job, as explained above:

$ ls bgqmap_output_20170905
0.err  0.info  0.out  0.sh  1.err  1.info  1.out  1.sh  bgqmap_input

The output directory must be empty before the submission:

$ bgqmap submit examples/input/hello.map --no-console
BgQmapError: Output folder [bgqmap_output_20170905] is not empty. Please give a different folder to write the output files.

Grouping reduces the number of jobs, but specific job execution parameters are ignored:

$ bgqmap submit examples/input/hello.map -g 2 --no-console
Specific job execution parameters ignored
Finished vs. total: [0/1]
Job 0 done. [1/1]
Execution finished

The following examples make use of this other jobs file:

# module load anaconda3/4.4.0
## memory=8G
python memory.py 8
python memory.py 10 ## memory=10G

The working directory option is helpful when your jobs file does not contain the full path to your script:

$ bgqmap submit examples/input/memory.map --no-console
Finished vs. total: [0/2]
Job 2 failed. [1/2]
Job 3 failed. [2/2]
Execution finished

$ bgqmap submit examples/input/memory.map -w test/python_scripts/ --no-console
Finished vs. total: [0/2]
Job 2 done. [1/2]
Job 3 done. [2/2]
Execution finished