Submitting and Monitoring Jobs

Submitting Your Job:

Once your script is complete, it is ready to be submitted to the cluster:

$ qsub my_script 
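
When a submission succeeds, qsub prints the new job's identifier, which can be captured for later use with qstat. The following is a minimal sketch, assuming an identifier of the form 123461.login00 or 123461[].login00; numeric_job_id is a hypothetical helper name, not part of PBS.

```shell
# numeric_job_id ID: strip the server suffix and any array brackets.
# (Hypothetical helper for illustration; not a PBS command.)
numeric_job_id() {
    id=${1%%.*}     # drop everything from the first "." onward
    id=${id%%\[*}   # drop array brackets such as "[]"
    printf '%s\n' "$id"
}

# Only attempt a real submission where qsub and the script are available.
if command -v qsub >/dev/null 2>&1 && [ -f my_script ]; then
    job_id=$(qsub my_script)
    echo "Submitted job $(numeric_job_id "$job_id")"
fi
```

The numeric part is the same number that appears in the names of the job's standard output and error files.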

How to Check Job Status Using qstat:

To verify that your job has been successfully submitted (or to check its progress), enter the following command:

$ qstat 

qstat displays all current jobs that are running on the cluster (R), queued to run (Q), or recently completed (C). A simple way to identify your job is to match the name you provided in the script (#PBS -N name) against the Name column, then read its unique number in the Job id column (e.g., 123461[].login00).

Note: Brackets following a job ID number indicate a job array (multiple jobs generated by a single script).

Note: As per row seven, interactive mode sessions are displayed in qstat.
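
Matching a job by name can also be done with a small filter instead of reading the table by eye. This is a sketch only: it assumes the default qstat layout, with two header lines followed by rows whose fields are job ID, name, user, time used, state, and queue. job_status is a hypothetical helper name.

```shell
# job_status NAME: read default-format qstat output on stdin and
# print "jobid state description" for every job with that name.
# (Hypothetical helper; column positions are an assumption.)
job_status() {
    awk -v name="$1" '
        NR > 2 && $2 == name {
            state = $5
            if (state == "R") desc = "running"
            else if (state == "Q") desc = "queued"
            else if (state == "C") desc = "completed"
            else desc = "other"
            print $1, state, desc
        }'
}

# Typical use on the cluster:
#   qstat | job_status blast
```

If your site's qstat prints different columns, adjust the field numbers ($2 and $5) accordingly.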

Additional arguments can be added to the qstat command to further specify the status of your job:

$ qstat -at  # Lists every running or queued job individually. For example, if you submitted a job containing the directive -t 1-48, the status of each of the 48 array jobs would display on a separate row.
 
$ qstat -atu username # Same as above, but only displays jobs for the designated username.
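
To avoid retyping your username, the command above can be wrapped in a short function. This is a sketch, assuming the $USER environment variable holds your cluster login name; my_jobs and the QSTAT override variable are hypothetical names used only for illustration.

```shell
# my_jobs [USERNAME]: list each job, including array members, for one user.
# Defaults to the current login name when no argument is given.
# QSTAT is a hypothetical override that allows a dry run without a cluster.
my_jobs() {
    "${QSTAT:-qstat}" -atu "${1:-$USER}"
}

# Typical use on the cluster:
#   my_jobs            # your own jobs
#   my_jobs username   # jobs for another user
```

Setting QSTAT=echo prints the arguments that would be passed to qstat, which is a quick way to check the command before running it.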

Monitoring Job Status via Standard Output and Error Files:

Once a job is submitted, its standard output and error files will continue to update throughout its runtime. Depending on the size and scope of the job, this could take anywhere from a few seconds to the official 30-day limit. These files are stored in the same directory as the script used to create the job. Again, they can be located by the name provided in the #PBS -N name directive. If you do not specify a name with this directive, your error and output files will take the name of the script itself.

Locating Standard Output and Error Files:

$ cd /directory  # change the directory to where your script is housed. 
$ ls             # list directory contents

Search through the directory for the standard error (e) and output (o) files:

filename.e######  # standard error from your script
filename.o######  # standard output from your script
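
In a crowded directory, a glob pattern can pick out just these files. The sketch below assumes the naming scheme shown above (the job name, then .e or .o, then the job number); list_job_files is a hypothetical helper name.

```shell
# list_job_files DIR NAME: print NAME.e<number> and NAME.o<number>
# files found in DIR. (Hypothetical helper for illustration.)
list_job_files() {
    for f in "$1/$2".[eo][0-9]*; do
        # An unmatched glob stays literal, so confirm the file exists.
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Typical use:
#   list_job_files /directory blast
```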

Viewing Standard Output and Error Files:

The less command opens these files so that you can identify and troubleshoot issues that may have occurred while the job was running:

$ less filename.e######
$ less filename.o######
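
For a quick check that does not open a pager, it can help to test whether the standard error file contains anything at all. This is a minimal sketch using only standard shell tests and tail; report_errors is a hypothetical helper name, and the choice of five lines is arbitrary.

```shell
# report_errors FILE: summarize a standard error file.
# (Hypothetical helper for illustration.)
report_errors() {
    if [ ! -e "$1" ]; then
        echo "not found: $1"
    elif [ -s "$1" ]; then
        # -s is true when the file exists and is not empty.
        echo "errors logged in $1:"
        tail -n 5 "$1"
    else
        echo "no errors logged in $1"
    fi
}

# Typical use:
#   report_errors blast.e123461
```

An empty error file does not guarantee that the job succeeded, so the standard output file is still worth reading.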

Per the graphic above, the standard error and output files for the job named blast would appear in the script directory as follows:

blast.e123461
blast.o123461