Jobs

 


Jobs are the basic unit of work in LSF. Most of what you will do with LSF involves submitting, monitoring or controlling jobs using the following commands:

bsub, bsas, ...          submitting jobs
bjobs, bhist, bpeek      monitoring jobs
bkill, bstop, bresume    controlling jobs

 

Submitting jobs with bsub

You use the bsub command to submit a job in LSF.  Here is a very simple bsub command:

% bsub uname -a
Job <125532> is submitted to default queue <batch>.

Each LSF job runs in a queue. If you don't give LSF a queue name, your job will go to the batch queue, as it did in the above example.
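Scripts that submit jobs usually need the job ID from the confirmation line. The following is a minimal sketch, not part of LSF: extract_job_id is a hypothetical helper name, and it assumes the confirmation line has the form shown in the example above and is written to standard output.

```shell
# Print the job ID found in a bsub confirmation line.
# Assumed input format (from the example above): "Job <ID> is submitted to ...".
extract_job_id() {
    printf '%s\n' "$1" | sed -n 's/^Job <\([0-9][0-9]*\)> is submitted to .*/\1/p'
}

# Try it on the confirmation line shown above (no LSF needed):
extract_job_id 'Job <125532> is submitted to default queue <batch>.'

# On a system with LSF, something like this would capture the ID of a new job:
#   job_id=$(extract_job_id "$(bsub uname -a)")
```

The captured ID can then be passed to the monitoring and control commands described in the rest of this page.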

Each LSF job will be dispatched to a server.  If you don't specify the server, LSF will choose one for you. To find the name of the server and the current status of the job, use the bjobs command:

% bjobs 125532
JOBID  USER    STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
125532 joeuser DONE batch sunny     sunny     uname -a Mar 28 13:37

This was a rather trivial job consisting of only one command, so it ran very quickly. Its status (STAT) is DONE, which means it completed successfully. If a job returns anything other than a normal completion code, its status will be EXIT.
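If you check job status from a script, you can pick the STAT column out of the bjobs listing. This is a rough sketch that assumes the default column layout shown above, where STAT is the third whitespace-separated field; job_status is a hypothetical helper name.

```shell
# Print the STAT field for one job ID, reading a bjobs-style listing on stdin.
# Assumes the default layout shown above: JOBID is field 1, STAT is field 3.
job_status() {
    awk -v id="$1" '$1 == id { print $3 }'
}

# The listing from the example above, fed in as plain text:
job_status 125532 <<'EOF'
JOBID  USER    STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
125532 joeuser DONE batch sunny     sunny     uname -a Mar 28 13:37
EOF

# On a system with LSF, the listing would come from bjobs itself:
#   bjobs 125532 | job_status 125532
```

The header line is skipped automatically because its first field is JOBID rather than a job number.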

This job executed on sunny, the same host from which it was submitted. Unless told otherwise, LSF will choose an execution host with the same architecture as the submission host, in this case Solaris SPARC. If more than one server meets that criterion, LSF will choose the most powerful host with the lightest load.

If you want your job to run on a specific host, use the -m option:

% bsub -m zephyr ...

Where is the output from this job? By default, LSF will send you email containing the standard output (stdout) and standard error (stderr) from your job, as well as some basic information about the execution of the job. If your program produces additional output files, they are separate and are not included in this email.

Jobs submitted to the batch queue run in the background, but sometimes you need to run a job interactively in the foreground. To do this, you need to request the interactive queue explicitly and also use the -Ip (interactive pseudo-terminal) option. Here is a simple example:

% bsub -m chastity -q interactive -Ip uname -a
Job <125682> is submitted to queue <interactive>.
<<Waiting for dispatch ...>>
<<Starting on chastity>>
IRIX64 chastity 6.5 01091820 IP35

Note: When you run an interactive job, LSF does not send you email.

Examples of programs that are often run in the interactive queue include SAS, Stata, and Mathematica.

 

More bsub options

Here are some more bsub options that you may find useful.

-b

To force your job to begin at a specific time, use the -b option on the bsub command:

bsub -b 11:00

This tells LSF to start your job at 11:00 a.m. If the current time is already past 11:00 a.m., the job will be held until 11:00 a.m. the next day.

bsub -b 2:15:23:15

This tells LSF to start the job at 11:15 p.m. on February 15. The fields are month, day, hour and minute, separated by colons.
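To show how the four fields of the longer form line up, here is a small sketch that assembles a begin time in month:day:hour:minute order. lsf_begin_time is a hypothetical helper name, not an LSF command. Pass plain integers without leading zeros, because printf would read a leading zero as octal.

```shell
# Build a -b argument in month:day:hour:minute form.
# Arguments: month day hour minute (hour on a 24-hour clock).
lsf_begin_time() {
    printf '%d:%d:%d:%02d\n' "$1" "$2" "$3" "$4"
}

# February 15 at 11:15 p.m., matching the example above:
lsf_begin_time 2 15 23 15

# Possible use on a system with LSF:
#   bsub -b "$(lsf_begin_time 2 15 23 15)" ...
```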

-o, -e

To save your job's output in a file instead of receiving it in email, use the -o option on the bsub command:

bsub -o my_output ...

You can put stdout and stderr in different files if you wish:

bsub -o my_out -e my_err ...

To make it easier to keep track of the output from multiple runs of the same program, you can use the special %J variable in your file names. LSF will substitute the job number for the %J variable:

bsub -o out.%J 
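The substitution is easy to picture with a short simulation. The sketch below does not call LSF; it only mimics the replacement of %J with a job number so you can see which file name a given pattern would produce. expand_job_pattern is a hypothetical helper name.

```shell
# Show the file name that a pattern containing %J would yield for a job number.
# This only imitates the substitution LSF performs; it does not contact LSF.
expand_job_pattern() {
    printf '%s\n' "$1" | sed "s/%J/$2/g"
}

# For job 125532, the pattern out.%J becomes out.125532:
expand_job_pattern 'out.%J' 125532
```

Because each job gets its own number, repeated runs of the same program write to different files instead of sharing one.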

-u

If LSF sends you email with your job output, it uses the address onyen@email.unc.edu (with your own onyen substituted).

If you want the email sent to another address, use the -u option on the bsub command:

bsub -u job_user@unc.edu ...

 

Submitting jobs with bsas and other scripts

Some programs are used so frequently that ATN has created special "wrapper" scripts you can use for submitting those programs to LSF. These scripts construct the bsub command and submit the job for you.

More information about these wrapper scripts can be found here.
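To give a feel for what such a wrapper does, here is a purely illustrative sketch. It is not one of ATN's scripts, and build_interactive_cmd is a made-up name. It only prints the bsub command it would run, using the interactive queue and -Ip option described earlier, and it does no quoting of arguments that contain spaces.

```shell
# Hypothetical wrapper sketch: print the bsub command for running a program
# in the interactive queue. It prints the command instead of submitting it.
build_interactive_cmd() {
    printf 'bsub -q interactive -Ip'
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# Show the command that would be submitted for "uname -a":
build_interactive_cmd uname -a
```

A real wrapper would run the command rather than print it, and would typically add options suited to the specific program.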

 

Monitoring jobs with bjobs and bhist 

 

bjobs

The bjobs command displays the current status of one or more jobs. If used without any options, it displays all of your own pending, running or suspended jobs.

% bjobs
JOBID  USER    STAT  QUEUE FROM_HOST  EXEC_HOST JOB_NAME  SUBMIT_TIME
123456 jobuser RUN   batch chastity   zephyr    myprog01  Mar 25 14:13

Useful options for the bjobs command include:

job_ID      information about a particular job ID
-a          information about all of your jobs, including those that finished recently
-l          long format; multiple lines giving additional information
-u onyen    information about a particular user, identified by onyen; "-u all" displays information on all jobs in the system
-w          wide format; doesn't truncate host names

Note that you can use more than one option at a time. For example:

bjobs -l -w -a -u joeuser

 

bhist

The bhist command displays historical information about jobs.

 

bpeek

The bpeek command displays the stdout and stderr of a job while it is running.

 

Controlling jobs with bkill, bstop and bresume 

 

bkill

The bkill command is usually used to kill a running, pending or suspended job. More precisely, bkill causes LSF to send SIGINT and SIGTERM to a job to give it a chance to clean up, then LSF sends SIGKILL to kill the job.

You can only kill your own jobs.
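A job script can use the warning signals to tidy up before SIGKILL arrives. The following is a minimal sketch of that idea, assuming the job is a shell script; the scratch file name and the cleanup function are placeholders invented for illustration.

```shell
#!/bin/sh
# Sketch of a job script that cleans up when it receives SIGINT or SIGTERM.
# SCRATCH is a placeholder file name used only for this illustration.
SCRATCH="${TMPDIR:-/tmp}/lsf_job_scratch.$$"

cleanup() {
    # Runs when the job is signalled, before any follow-up SIGKILL.
    rm -f "$SCRATCH"
    echo "job interrupted; removed $SCRATCH" >&2
    exit 1    # a non-zero code marks the job as not completed normally
}

trap cleanup INT TERM

: > "$SCRATCH"      # create the scratch file
# ... the real work of the job would go here ...
rm -f "$SCRATCH"    # normal completion: remove the scratch file
```

SIGKILL cannot be trapped, so the cleanup has to be quick enough to finish before it is sent.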

 

bstop

The bstop command suspends a job by sending it the SIGSTOP signal.

 

bresume

The bresume command resumes a suspended job by sending it the SIGCONT signal.

 

 

Last updated Thursday, May 23, 2002 04:35 PM