LSF provides a structure called a job array that allows a sequence of jobs that share the same executable and resource requirements, but have different input files, to be submitted, controlled, and monitored as a single unit. Using the standard LSF commands, you can also control and monitor individual jobs and groups of jobs submitted from a job array.
After the job array is submitted, LSF independently schedules and dispatches the individual jobs. Each job submitted from a job array shares the same job ID as the job array and is uniquely referenced using an array index. The dimension and structure of a job array are defined when the job array is created.
- Creating a Job Array
- Handling Input and Output Files
- Job Array Dependencies
- Monitoring Job Arrays
- Controlling Job Arrays
- Requeuing a Job Array
- Job Array Job Slot Limit
Creating a Job Array
A job array is created at job submission time using the -J option of bsub. For example, the following command creates a job array named myArray made up of 1000 jobs.
% bsub -J "myArray[1-1000]" myJob
Job <123> is submitted to default queue <normal>.
Syntax
The bsub syntax used to create a job array follows:
% bsub -J "arrayName[indexList,...]" myJob
Where:
-J "arrayName[indexList, ...]"
Names and creates the job array. The square brackets, [ ], around indexList must be entered exactly as shown and the job array name specification must be enclosed in quotes. Commas (,) are used to separate multiple indexList entries. The maximum length of this specification is 255 characters.
arrayName
User-specified string used to identify the job array. Valid values are any combination of the following characters:
a-z | A-Z | 0-9 | . | - | _
indexList = start[-end[:step]]
Specifies the size and dimension of the job array, where:
start
Used with end to specify the start of a range of indices. Can also be used to specify an individual index. Valid values are unique positive integers. For example, [1-5] and [1, 2, 3, 4, 5] specify 5 jobs with indices 1 through 5.
end
Specifies the end of a range of indices. Valid values are unique positive integers.
step
Specifies the value to increment the indices in a range. Indices begin at start, increment by the value of step, and do not increment past the value of end. The default value is 1. Valid values are positive integers. For example, [1-10:2] specifies a range of 1-10 with step value 2 creating indices 1, 3, 5, 7, and 9.
After the job array is created (submitted), individual jobs are referenced using the job array name or job ID and an index value. For example, both of the following series of job array statements refer to jobs submitted from a job array named myArray which is made up of 1000 jobs and has a job ID of 123:
myArray[1], myArray[2], myArray[3], ..., myArray[1000]
123[1], 123[2], 123[3], ..., 123[1000]
Maximum size of a job array
A large job array allows a user to submit a large number of jobs to the system with a single job submission.
By default, the maximum number of jobs in a job array is 1000.
To change this limit, set MAX_JOB_ARRAY_SIZE in lsb.params to any number up to 65534. The maximum number of jobs in a job array cannot exceed this value.
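As a rough illustration of how an indexList of the form start[-end[:step]] maps to individual indices, the following Python sketch expands a specification into the list of indices it names. The function name expand_index_list and its max_size parameter are hypothetical and not part of LSF. The default of 1000 mirrors the default maximum array size, and rejecting an end smaller than start is an assumption of this sketch, not documented LSF behavior.

```python
import re


def expand_index_list(spec, max_size=1000):
    """Expand an indexList such as "1-10:2" or "1-5, 239, 487" into indices.

    Hypothetical helper for illustration only; LSF performs its own parsing.
    """
    indices = []
    for entry in spec.split(","):
        # Each entry has the form start[-end[:step]].
        match = re.fullmatch(r"\s*(\d+)(?:-(\d+)(?::(\d+))?)?\s*", entry)
        if match is None:
            raise ValueError(f"invalid indexList entry: {entry!r}")
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) else start
        step = int(match.group(3)) if match.group(3) else 1  # default step is 1
        # Indices and steps must be positive; end < start is rejected here
        # as an assumption of this sketch.
        if start < 1 or step < 1 or end < start:
            raise ValueError(f"invalid range in entry: {entry!r}")
        # Begin at start, increment by step, never go past end.
        indices.extend(range(start, end + 1, step))
    if len(set(indices)) != len(indices):
        raise ValueError("indices must be unique")
    if len(indices) > max_size:
        raise ValueError(f"job array exceeds {max_size} jobs")
    return indices


if __name__ == "__main__":
    print(expand_index_list("1-10:2"))
```

Calling expand_index_list("1-10:2") yields the indices 1, 3, 5, 7, and 9 from the step example, and passing a larger max_size stands in for a raised MAX_JOB_ARRAY_SIZE.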
Handling Input and Output Files
LSF provides methods for coordinating individual input and output files for the multiple jobs created when submitting a job array. These methods require your input files to be prepared uniformly. To accommodate an executable that uses standard input and standard output, LSF provides runtime variables (%I and %J) that are expanded at runtime. To accommodate an executable that reads command line arguments, LSF provides an environment variable (LSB_JOBINDEX) that is set in the execution environment.
Preparing input files
LSF needs all the input files for the jobs in your job array to be located in the same directory. By default LSF assumes the current working directory (CWD), the directory from which bsub was issued. To override CWD, specify an absolute path when submitting the job array.
Each file name consists of two parts, a consistent name string and a variable integer that corresponds directly to an array index. For example, the following file names are valid input file names for a job array. They are made up of the consistent name input and integers that correspond to job array indices from 1 to 1000:
input.1, input.2, input.3, ..., input.1000
Redirecting Standard Input and Output
The variables %I and %J are used as substitution strings to support file redirection for jobs submitted from a job array. At execution time, %I is expanded to provide the job array index value of the current job, and %J is expanded to provide the job ID of the job array.
Standard input
Use the -i option of bsub and the %I variable when your executable reads from standard input. To use %I, all the input files must be named consistently with a variable part that corresponds to the indices of the job array. For example:
input.1, input.2, input.3, ..., input.N
For example, the following command submits a job array of 1000 jobs whose input files are named input.1, input.2, input.3, ..., input.1000 and located in the current working directory:
% bsub -J "myArray[1-1000]" -i "input.%I" myJob
Standard output and error
Use the -o option of bsub and the %I and %J variables when your executable writes to standard output and error.
To create an output file that corresponds to each job submitted from a job array, specify %I as part of the output file name. For example, the following command submits a job array of 1000 jobs whose output files will be located in CWD and named output.1, output.2, output.3, ..., output.1000:
% bsub -J "myArray[1-1000]" -o "output.%I" myJob
To create output files that include the job array job ID as part of the file name specify %J. For example, the following command submits a job array of 1000 jobs whose output files will be located in CWD and named output.123.1, output.123.2, output.123.3, ..., output.123.1000. The job ID of the job array is 123.
% bsub -J "myArray[1-1000]" -o "output.%J.%I" myJob
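To preview which file names a given -i or -o pattern will produce, the %I and %J substitution can be imitated with a small Python sketch. LSF performs the real expansion at execution time; the function expand_pattern below is a hypothetical helper that only mirrors the naming rule described in this section.

```python
def expand_pattern(pattern, job_id, index):
    """Imitate the %J (job array job ID) and %I (array index) substitution.

    Hypothetical helper for illustration; LSF expands these variables itself.
    """
    return pattern.replace("%J", str(job_id)).replace("%I", str(index))


if __name__ == "__main__":
    # File names for the first three jobs of a job array with job ID 123.
    for index in (1, 2, 3):
        print(expand_pattern("output.%J.%I", 123, index))
```

A helper like this can be used before submission to check that every expected input file, such as input.1 through input.1000, actually exists in the submission directory.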
Passing Arguments on the Command Line
The environment variable LSB_JOBINDEX is used as a substitution string to support passing job array indices on the command line. When the job is dispatched, LSF sets LSB_JOBINDEX in the execution environment to the job array index of the current job. LSB_JOBINDEX is set for all jobs. For non-array jobs, LSB_JOBINDEX is set to zero (0).
To use LSB_JOBINDEX, all the input files must be named consistently and with a variable part that corresponds to the indices of the job array. For example:
input.1, input.2, input.3, ..., input.N
You must escape the dollar sign of $LSB_JOBINDEX with a backslash, \, to prevent the shell interpreting bsub from expanding the variable. For example, the following command submits a job array of 1000 jobs whose input files are named input.1, input.2, input.3, ..., input.1000 and located in the current working directory. The executable is being passed an argument that specifies the name of the input files:
% bsub -J "myArray[1-1000]" myJob -f input.\$LSB_JOBINDEX
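Because LSB_JOBINDEX is set in the execution environment, an executable can also read it directly instead of receiving the file name as an argument. The following is a minimal Python sketch of that approach. The function name input_file_for_current_job and the input prefix are assumptions for illustration, and treating an index of 0 as an error is a choice of this sketch.

```python
import os


def input_file_for_current_job(prefix="input", environ=os.environ):
    """Build the input file name for the current job from LSB_JOBINDEX.

    Hypothetical helper; `environ` is a parameter so the logic can be tried
    outside of LSF with a plain dictionary.
    """
    # LSB_JOBINDEX is 0 for non-array jobs; default to 0 if it is not set.
    index = int(environ.get("LSB_JOBINDEX", "0"))
    if index == 0:
        raise RuntimeError("not running as a job submitted from a job array")
    return f"{prefix}.{index}"


if __name__ == "__main__":
    # Simulate the environment of the 7th job in a job array.
    print(input_file_for_current_job(environ={"LSB_JOBINDEX": "7"}))
```

With this approach the job array can be submitted without the escaped variable on the command line, at the cost of tying the executable to the LSF environment.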
Job Array Dependencies
Like all jobs in LSF, a job array can be dependent on the completion or partial completion of a job or another job array. A number of job-array-specific dependency conditions are provided by LSF.
Whole array dependency
To make a job array dependent on the completion of a job or another job array use the -w "dependency_condition" option of bsub. For example, to have an array dependent on the completion of a job or job array with job ID 123, you would use the following command:
% bsub -w "done(123)" -J "myArray2[1-1000]" myJob
Partial array dependency
To make a job or job array dependent on an existing job array you would use one of the following dependency conditions.
Use one of the following operators (op) combined with a positive integer (num) to build a condition:
== | > | < | >= | <= | !=
Optionally, an asterisk (*) can be used in place of num to mean all jobs submitted from the job array.
For example, to start a job named myJob when 100 or more elements in a job array with job ID 123 have completed successfully:
% bsub -w "numdone(123, >= 100)" myJob
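The meaning of an op and num pair, and of the asterisk, can be illustrated with a short Python sketch. This only models the comparison semantics described above; how LSF evaluates dependency conditions internally is not covered here, and the names OPERATORS and condition_met are hypothetical.

```python
import operator

# The six comparison operators accepted in a partial array dependency.
OPERATORS = {
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "!=": operator.ne,
}


def condition_met(count, total, op, num):
    """Return True if `count` jobs satisfy the condition `op num`.

    count: number of jobs currently in the state being tested (e.g. done)
    total: number of jobs submitted from the job array
    op:    one of the operator strings above (ignored when num is "*")
    num:   a positive integer, or "*" to mean all jobs in the job array
    """
    if num == "*":
        return count == total
    return OPERATORS[op](count, int(num))


if __name__ == "__main__":
    # Models numdone(123, >= 100) when 120 of 1000 jobs are done.
    print(condition_met(120, 1000, ">=", 100))
```

In this model the condition from the example above becomes true once count reaches 100, while the asterisk form stays false until every job in the array has reached the state.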
Monitoring Job Arrays
Use bjobs and bhist to monitor the current and past status of job arrays.
Job array status
To display summary information about the currently running jobs submitted from a job array, use the -A option of bjobs. For example, a job array of 10 jobs with job ID 123:
% bjobs -A 123
JOBID ARRAY_SPEC    OWNER NJOBS PEND DONE RUN EXIT SSUSP USUSP PSUSP
123   myArra[1-10]  user1    10    3    3   4    0     0     0     0
Individual job status
To display the status of the individual jobs submitted from a job array, specify the job array job ID with bjobs. For jobs submitted from a job array, JOBID displays the job array job ID, and JOBNAME displays the job array name and the index value of each job. For example, to view a job array with job ID 123:
% bjobs 123
JOBID USER  STAT QUEUE   FROM_HOST EXEC_HOST JOB_NAME    SUBMIT_TIME
123   user1 DONE default hostA     hostC     myArray[1]  Feb 29 12:34
123   user1 DONE default hostA     hostQ     myArray[2]  Feb 29 12:34
123   user1 DONE default hostA     hostB     myArray[3]  Feb 29 12:34
123   user1 RUN  default hostA     hostC     myArray[4]  Feb 29 12:34
123   user1 RUN  default hostA     hostL     myArray[5]  Feb 29 12:34
123   user1 RUN  default hostA     hostB     myArray[6]  Feb 29 12:34
123   user1 RUN  default hostA     hostQ     myArray[7]  Feb 29 12:34
123   user1 PEND default hostA               myArray[8]  Feb 29 12:34
123   user1 PEND default hostA               myArray[9]  Feb 29 12:34
123   user1 PEND default hostA               myArray[10] Feb 29 12:34
To display the past status of the individual jobs submitted from a job array, specify the job array job ID with bhist. For example, to view the history of a job array with job ID 456:
% bhist 456
Summary of time in seconds spent in various states:
JOBID   USER  JOB_NAME PEND PSUSP RUN USUSP SSUSP UNKWN TOTAL
456[1]  user1 *rray[1]   14     0  65     0     0     0    79
456[2]  user1 *rray[2]   74     0  25     0     0     0    99
456[3]  user1 *rray[3]  121     0  26     0     0     0   147
456[4]  user1 *rray[4]  167     0  30     0     0     0   197
456[5]  user1 *rray[5]  214     0  29     0     0     0   243
456[6]  user1 *rray[6]  250     0  35     0     0     0   285
456[7]  user1 *rray[7]  295     0  33     0     0     0   328
456[8]  user1 *rray[8]  339     0  29     0     0     0   368
456[9]  user1 *rray[9]  356     0  26     0     0     0   382
456[10] user1 *ray[10]  375     0  24     0     0     0   399
Specific job status
To display the current status of a specific job submitted from a job array, specify, in quotes, the job array job ID and an index value with bjobs. For example, the status of the 5th job in a job array with job ID 123:
% bjobs "123[5]"
JOBID USER  STAT QUEUE   FROM_HOST EXEC_HOST JOB_NAME    SUBMIT_TIME
123   user1 RUN  default hostA     hostL     myArray[5]  Feb 29 12:34
To display the past status of a specific job submitted from a job array, specify, in quotes, the job array job ID and an index value with bhist. For example, the status of the 5th job in a job array with job ID 456:
% bhist "456[5]"
Summary of time in seconds spent in various states:
JOBID   USER  JOB_NAME PEND PSUSP RUN USUSP SSUSP UNKWN TOTAL
456[5]  user1 *rray[5]  214     0  29     0     0     0   243
Controlling Job Arrays
You can control the whole array, all the jobs submitted from the job array, with a single command. LSF also provides the ability to control individual jobs and groups of jobs submitted from a job array. When issuing commands against a job array, use the job array job ID instead of the job array name. Job names are not unique in LSF, and issuing a command using a job array name may result in unpredictable behavior.
Most LSF commands allow operation on the whole job array, on individual jobs, and on groups of jobs. These commands include bkill, bstop, bresume, and bmod.
Some commands only allow operation on individual jobs submitted from a job array. These commands include btop, bbot, and bswitch.
Whole array
To control the whole job array, specify the command as you would for a single job using only the job ID. For example, to kill a job array with job ID 123:
% bkill 123
Individual jobs
To control an individual job submitted from a job array, specify the command using the job ID of the job array and the index value of the corresponding job. The job ID and index value must be enclosed in quotes. For example, to kill the 5th job in a job array with job ID 123:
% bkill "123[5]"
Groups of jobs
To control a group of jobs submitted from a job array, specify the command as you would for an individual job and use indexList syntax to indicate the jobs. For example, to kill jobs 1-5, 239, and 487 in a job array with job ID 123:
% bkill "123[1-5, 239, 487]"
Requeuing a Job Array
Use brequeue to requeue a job array. When the job is requeued, it is assigned the PEND status and the job's new position in the queue is after other jobs of the same priority. You can requeue:
- Jobs in DONE job state
- Jobs in EXIT job state
- All jobs in a job array, regardless of job state
- EXIT, RUN, DONE jobs to PSUSP state
- Jobs in RUN job state
brequeue is not supported across clusters.
Requeuing jobs in DONE state
To requeue DONE jobs use the -d option of brequeue. For example, the command brequeue -J "myarray[1-10]" -d 123 requeues jobs with job ID 123 and DONE status.
Requeuing jobs in EXIT state
To requeue EXIT jobs use the -e option of brequeue. For example, the command brequeue -J "myarray[1-10]" -e 123 requeues jobs with job ID 123 and EXIT status.
Requeuing all jobs in an array regardless of job state
A submitted job array can have jobs that have different job states. To requeue all the jobs in an array regardless of any job's state, use the -a option of brequeue. For example, the command brequeue -J "myarray[1-10]" -a 123 requeues all jobs in a job array with job ID 123 regardless of their job state.
Requeuing RUN jobs to PSUSP state
To requeue RUN jobs to PSUSP state, use the -H option of brequeue. For example, the command brequeue -J "myarray[1-10]" -H 123 requeues RUN jobs with job ID 123 to PSUSP state.
Requeuing jobs in RUN state
To requeue RUN jobs use the -r option of brequeue. For example, the command brequeue -J "myarray[1-10]" -r 123 requeues jobs with job ID 123 and RUN status.
Job Array Job Slot Limit
The job array job slot limit is used to specify the maximum number of jobs submitted from a job array that are allowed to run at any one time. A job array allows a large number of jobs to be submitted with one command, potentially flooding a system, and job slot limits provide a way to limit the impact a job array may have on a system. Job array job slot limits are specified using the following syntax:
% bsub -J "arrayName[indexList]%jobLimit" myJob
where:
%jobLimit
Specifies the maximum number of jobs allowed to run at any one time. The percent sign, %, must be entered exactly as shown. Valid values are positive integers less than the maximum index value of the job array.
Setting a job array job slot limit
A job array job slot limit can be set at the time of submission using bsub, or after submission using bmod.
For example, to set a job array job slot limit of 100 jobs for a job array of 1000 jobs:
% bsub -J "jobArrayName[1-1000]%100" myJob
For example, to set a job array job slot limit of 100 jobs for an array with job ID 123:
% bmod -J "%100" 123
Changing a job array job slot limit
Changing a job array job slot limit is the same as setting it after submission. For example, to change a job array job slot limit to 250 for a job array with job ID 123:
% bmod -J "%250" 123
Viewing a job array job slot limit
To view job array job slot limits use the -A and -l options of bjobs. The job array job slot limit is displayed in the Job Name field in the same format in which it was set. For example, the following output displays the job array job slot limit of 100 for a job array with job ID 123:
% bjobs -A -l 123
Job <123>, Job Name <myArray[1-1000]%100>, User <user1>, Project <default>, Sta
                     tus <PEND>, Queue <normal>, Job Priority <20>, Command <my
                     Job>
Wed Feb 29 12:34:56: Submitted from host <hostA>, CWD <$HOME>;

COUNTERS:
NJOBS PEND DONE RUN EXIT SSUSP USUSP PSUSP
   10    9    0   1    0     0     0     0
Date Modified: January 12, 2004
Platform Computing: www.platform.com
Platform Support: support@platform.com
Platform Information Development: doc@platform.com
Copyright © 1994-2004 Platform Computing Corporation. All rights reserved.