The National Institute for Computational Sciences

Running Jobs - SIP


General Information

When you log in, you will be directed to one of the login nodes. The login nodes should only be used for basic tasks such as file editing, code compilation, and job submission.

The login nodes should not be used to run production jobs. Production work should be performed on the system's compute resources. Serial jobs (pre- and post-processing, etc.) may be run on the compute nodes. Access to compute resources is managed by Torque (a PBS-like system). Job scheduling is handled by Moab, which interacts with Torque and system software.

This page provides information for getting started with the batch facilities of Torque with Moab, as well as basic job execution. Sometimes you may want to chain your submissions so that a full simulation completes without manual resubmission; job chaining is documented separately.


Encrypted Project Space Mounting

For sensitive projects that require encryption, EncFS is used to securely store and access data. For additional information on EncFS, see the Arch Linux wiki entry. The ACF provides tools for simplifying and centrally managing the use of EncFS. To use the secure storage, you must first mount the encrypted folder for your project by following these steps.

  1. Connect to an ACF-SIP login node, if you have not already.
  2. To mount an encrypted project folder, type the following command: sudo sipmount <projectname> (where <projectname> is the project ID whose encrypted space you want to access).
  3. The sudo part of the command will require you to authenticate with your NetID password and Duo TFA. The ACF recommends you use the “Duo Push” option.
  4. Verify that the project folder was mounted.
    1. The df command shows mounted filesystems. The project directory will be mounted at /projects/<projectname>.
    2. Type ls -l /projects/<projectname> to list the contents of the project folder.

After 15 minutes of inactivity, the encrypted space will be closed and require you to repeat this process.
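
The verification in step 4 can also be scripted. The following is a minimal sketch, not an ACF-provided tool: the helper name mount_listed and the project name myproject are hypothetical, and the check assumes that mount points contain no spaces. It reads df -P output and reports whether the project directory is listed as a mount point.

```shell
#!/bin/sh
# Hypothetical helper: read `df -P` output on stdin and succeed if the
# given path appears as a mount point (the last column of each row).
mount_listed() {
    awk -v mp="$1" 'NR > 1 && $NF == mp { found = 1 } END { exit !found }'
}

# "myproject" is a placeholder for your own project ID.
project="myproject"
if df -P 2>/dev/null | mount_listed "/projects/$project"; then
    echo "/projects/$project is mounted"
else
    echo "/projects/$project is not mounted; run: sudo sipmount $project"
fi
```

Because the encrypted space closes after a period of inactivity, a check like this is most useful immediately before you start working with the project folder.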

Because the secure space is made available to running jobs (see the Job Directories section for details), you must authenticate when you submit a job. When you use the qsub command, you will be prompted for your NetID password and Duo TFA, just as when logging in. When you submit multiple jobs in quick succession (such as from a script), a single authentication is stored for up to 5 minutes. This uses the same mechanism as the sudo command used to mount the secure project storage, so one authentication is sufficient for both types of commands within the 5-minute window.

Consider the number of jobs you intend to submit. If you attempt to submit more jobs than can be processed within the five-minute window, you will be required to re-authenticate.
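
Submitting several jobs from one loop keeps the submissions close together in time. The following is a minimal sketch under stated assumptions: the function name submit_all, the QSUB override variable, and the job script names are all hypothetical. Setting QSUB="echo qsub" gives a dry run that only prints the commands.

```shell
#!/bin/sh
# QSUB is a hypothetical override so the loop can be dry-run,
# for example with QSUB="echo qsub". By default the real qsub is called.
QSUB=${QSUB:-qsub}

# Submit every job script given as an argument, stopping at the first failure.
submit_all() {
    for script in "$@"; do
        $QSUB "$script" || return 1
    done
}

# Usage (job script names are placeholders):
#   submit_all step1.pbs step2.pbs step3.pbs
```

If the loop stops with an error partway through, check whether the authentication window has expired before resubmitting the remaining scripts.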

In cases where the secure project space is unnecessary for a job, you may submit jobs to the nosecurespace queue to avoid authentication. This queue does not automatically mount a secured project space and will not require you to authenticate. To run a job in the nosecurespace queue, use a command like the following.

qsub -I -A <projectname> -l nodes=<nodes> -q nosecurespace

In this example, the qsub command is used to submit an interactive job to the nosecurespace queue. Change the <projectname> and <nodes> arguments to your own project name and node specification. For more information on the qsub command, type man qsub to read its manual pages, or type qsub -? to view the options available to the command.


Job Directories

Jobs should be submitted from within a directory in the Lustre file system. Always execute cd $PBS_O_WORKDIR as the first command in your shell script. An example of this appears in the Batch Script section. Please refer to the PBS Environment Variables section for further details. Documentation that describes PBS options can be used for more complex job scripts. The following storage spaces are available to running jobs:

  • /projects/<projectname> - This is the encrypted space for the project under which the job is being run. It is mounted automatically on the head node of the job and is unmounted automatically at job completion. Due to the encryption layer, read and write performance to this space is very poor. It is recommended that jobs requiring many reads or writes utilize the scratch space described next, copying initial data from the secure space to the scratch space at the start, and copying results back into either the secured or unsecured (depending on the nature of the data) spaces before the job completes.

  • /lustre/sip/scratch/<jobid> - Stored in the SCRATCHDIR environment variable available to a running job, this temporary scratch space is created automatically as the job starts and is renamed to /lustre/sip/scratch/<jobid>.completed when the job is complete. <jobid> is the full name of the job as reported by qsub when the job was submitted (e.g., 1234.sip-mgmt1). The directory is deleted automatically 24 hours after the job completes and the directory is renamed, to help protect any sensitive data stored there.

  • /lustre/sip/proj/ - This is the unsecured scratch space for the project under which the job is being run. You will have your own folder under this space with your username which you can use to store non-sensitive data.
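
The copy-in, compute, copy-out pattern recommended for the encrypted space can be sketched as follows. This is an illustration only: the helper name stage and the input and results subdirectories are hypothetical, and the commented calls assume that SCRATCHDIR is set as described above.

```shell
#!/bin/sh
# Hypothetical helper: copy the contents of one directory into another,
# preserving file attributes. Both directories must already exist.
stage() {
    src=$1
    dst=$2
    cp -Rp "$src"/. "$dst"/
}

# Inside a job script the calls might look like this (paths are placeholders):
#   stage /projects/<projectname>/input "$SCRATCHDIR"
#   cd "$SCRATCHDIR" && mpirun -n 16 ./a.out
#   mkdir -p /projects/<projectname>/results
#   stage "$SCRATCHDIR" /projects/<projectname>/results
```

Copying results back before the job ends matters because the encrypted space is unmounted at job completion and the scratch directory is deleted automatically afterward.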

If your project was previously given an encrypted space on the login nodes using the LUKS encryption mechanism, you will need to migrate any data stored there to your EncFS space. The LUKS spaces are deprecated and will eventually be retired. To simplify this process, the sudo sipmount --migrate <projectname> command is provided for use on the login node containing your project’s LUKS encrypted space. This command does the following.

  1. Mount the LUKS and EncFS spaces simultaneously.
  2. Report the location of the LUKS space.
  3. Report the location of the EncFS space.
  4. Prompt you to migrate data from LUKS to EncFS.

Once the data has been migrated, type sipmount --migrate-complete <projectname> to complete the migration process and close the encrypted spaces.


Batch Scripts

Batch scripts can be used to run a set of commands on a system's compute partition. Batch scripts allow users to run non-interactive batch jobs, which are useful for submitting a group of commands, allowing them to run through the queue, and then viewing the results. However, it is sometimes useful to run a job interactively (primarily for debugging purposes). Please refer to the Interactive Batch Jobs section for more information on how to run batch jobs interactively.

All non-interactive jobs must be submitted on Beacon using job scripts via the qsub command. The batch script is a shell script containing PBS flags and commands to be interpreted by a shell. The batch script is submitted to the resource manager, Torque, where it is parsed. Based on the parsed data, Torque places the script in the queue as a job. Once the job makes its way through the queue, the script will be executed on the head node of the allocated resources.

Jobs are submitted to the batch job scheduler in units of nodes via the -l nodes=# option. By default, MPI jobs place one task per node. The default behavior can be overridden by adding the '-ppn=# -f $PBS_NODEFILE' options to the mpirun command. Nodes can be oversubscribed (i.e., running more MPI ranks than the node has cores); however, the default behavior is to fill all cores on all nodes before adding the additional MPI ranks. The additional ranks are then added to each node again, up to the number of cores available per node, and this process repeats until all MPI ranks have been allocated. For example, a job that requests 3 nodes (-l nodes=3) with 16 cores each and launches 144 ranks (mpirun -n 144) first places 16 MPI ranks on each of the 3 nodes (48 ranks over 3 nodes), then places an additional set of 48 ranks in the same way (16 ranks per node over 3 nodes). Finally, the remaining 48 ranks are allocated to the nodes in the same way (16 ranks per node over 3 nodes).

If MPI ranks remain unallocated after a pass, the same number of ranks is placed on each node again, starting with the first node, until all MPI ranks have been allocated. In cases where the number of MPI ranks per node is less than the number of available cores per node, the ranks are spread evenly across processors. For example, if 8 MPI ranks are placed on a 16-core node (2 processors of 8 cores each), four MPI ranks will land on the first processor and the other four will land on the second.
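
The placement arithmetic described above can be worked through with a few shell functions. This is a sketch of the arithmetic only, not the MPI launcher's actual algorithm; the function names are hypothetical, ranks are numbered from 0, and nodes are numbered from 1.

```shell
#!/bin/sh
# Default placement: ranks are dealt out one per node, round robin.
# Prints the 1-based node number for a 0-based rank.
node_round_robin() {
    rank=$1 nodes=$2
    echo $(( rank % nodes + 1 ))
}

# Placement with -ppn: consecutive blocks of ppn ranks go to successive
# nodes, wrapping around once every node has received a block.
node_with_ppn() {
    rank=$1 ppn=$2 nodes=$3
    echo $(( (rank / ppn) % nodes + 1 ))
}

# Ranks per node when the total is spread evenly over the nodes
# (assumes the total is a multiple of the node count).
ranks_per_node() {
    total=$1 nodes=$2
    echo $(( total / nodes ))
}

node_round_robin 4 3    # rank 4 of a 3-node job: prints 2
node_with_ppn 8 8 3     # rank 8 with 8 ranks per node on 3 nodes: prints 2
ranks_per_node 144 3    # 144 ranks on 3 nodes: prints 48
```

The last call matches the 144-rank example: 48 ranks per node, which is three passes of 16 ranks on each 16-core node.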

All job scripts start with an interpreter line, followed by a series of #PBS declarations that describe requirements of the job to the scheduler. The rest is a shell script, which sets up and runs the executable.

Batch scripts are divided into the following three sections:

  1. Shell interpreter (one line)
    • The first line of a script can be used to specify the script's interpreter.
    • This line is optional.
    • If not used, the submitter's default shell will be used.
    • The line uses the syntax #!/path/to/shell, where the path to the shell may be
      • /usr/bin/csh
      • /usr/bin/ksh
      • /bin/bash
      • /bin/sh
  2. PBS submission options
    • The PBS submission options are preceded by #PBS, making them appear as comments to a shell.
    • PBS will look for #PBS options in a batch script from the script's first line through the first non-comment line. A comment line begins with #.
    • #PBS options entered after the first non-comment line will not be read by PBS.
  3. Shell commands
    • The shell commands follow the last #PBS option and represent the executable content of the batch job.
    • If any #PBS lines follow executable statements, they will be treated as comments only. The exception to this rule is shell specification on the first line of the script.
    • The execution section of a script will be interpreted by a shell and can contain multiple lines of executables, shell commands, and comments.
    • During normal execution, the batch script will end and exit the queue after the last line of the script.

The following examples show a typical job script header with various mpirun commands for a parallel job that executes ./a.out on 3 nodes with a wall clock limit of two hours:

#PBS -S /bin/bash
#PBS -l nodes=3,walltime=02:00:00


Option 1:
mpirun -n 48 ./a.out    
Places 48 MPI ranks (16 per node, placed 1 per node round robin)

Option 2:
mpirun -n 96 ./a.out   
Places 96 MPI ranks (32 ranks per node).  

Option 3:
mpirun -n 96 -ppn=32 -f $PBS_NODEFILE  ./a.out
Places 96 MPI ranks (32 ranks per node).

Option 4:
mpirun -n 24 -ppn=8 -f $PBS_NODEFILE  ./a.out
Places 24 MPI ranks (8 per node in groups of 8).  Ranks 0-7 will be on node 1, Ranks 8-15 will be on node 2, and Ranks 16-23 will be on node 3.
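
Putting the three sections of a batch script together, a complete script built around Option 1 might look like the sketch below. The file name job.pbs and the job name example_job are placeholders, and <account> must be replaced with your own project ID before submission. The snippet writes the script to a file and checks it for shell syntax errors without running it.

```shell
#!/bin/sh
# Write a complete job script to job.pbs. The quoted 'EOF' keeps
# $PBS_O_WORKDIR unexpanded so it is evaluated when the job runs.
cat > job.pbs <<'EOF'
#!/bin/bash
#PBS -S /bin/bash
#PBS -A <account>
#PBS -N example_job
#PBS -l nodes=3,walltime=02:00:00

# Start in the directory the job was submitted from (should be on Lustre).
cd "$PBS_O_WORKDIR"

# Option 1: 48 MPI ranks across the 3 requested nodes.
mpirun -n 48 ./a.out
EOF

# Parse the script without executing it to catch syntax errors early.
sh -n job.pbs && echo "job.pbs: syntax OK"

# Submit with: qsub job.pbs
```

All #PBS lines appear before the first executable command, since options placed after it are treated as comments.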

Jobs should be submitted from within a directory in the Lustre file system. It is best to always execute cd $PBS_O_WORKDIR as the first command. Please refer to the PBS Environment Variables section for further details.

Documentation that describes PBS options can be used for more complex job scripts.

Unless otherwise specified, your default shell interpreter will be used to execute shell commands in job scripts. If the job script should use a different interpreter, then specify the correct interpreter using:

 #PBS -S /bin/XXXX


Altering Batch Jobs

This section shows how to remove or alter batch jobs.

Remove Batch Job from the Queue

Jobs in the queue in any state can be stopped and removed from the queue using the command qdel.

For example, to remove a job with a PBS ID of 1234, use the following command:

> qdel 1234

More details on the qdel utility can be found on the qdel man page.

Hold Queued Job

Jobs in the queue in a non-running state may be placed on hold using the qhold command. Jobs placed on hold will not be removed from the queue, but they will not be eligible for execution.

For example, to move a currently queued job with a PBS ID of 1234 to a hold state, use the following command:

> qhold 1234

More details on the qhold utility can be found on the qhold man page.

Release Held Job

Once on hold the job will not be eligible to run until it is released to return to a queued state. The qrls command can be used to remove a job from the held state.

For example, to release job 1234 from a held state, use the following command:

> qrls 1234

More details on the qrls utility can be found on the qrls man page.

Modify Job Details

Only non-running (or held) jobs can be modified with the qalter PBS command. For example, this command can be used to:

Modify the job's name:

$ qalter -N <newname> <jobid>

Modify the number of requested nodes:

$ qalter -l nodes=<NumNodes> <jobid>

Modify the job's wall time:

$ qalter -l walltime=<hh:mm:ss> <jobid>

Set the job's dependencies:

$ qalter -W depend=type:argument <jobid>

Remove a job's dependency (omit :argument):

$ qalter -W depend=type <jobid>


  • Use qstat -f <jobid> to gather all the information about a job, including job dependencies.
  • Use qstat -a <jobid> to verify the changes afterward.
  • You cannot specify a new walltime that exceeds the maximum walltime of the queue your job is in.
  • If you need to modify a running job, please contact us. Certain alterations can only be performed by administrators.


Interactive Batch Jobs

Interactive batch jobs give users interactive access to compute resources. A common use for interactive batch jobs is debugging. This section demonstrates how to run interactive jobs through the batch system and provides common usage tips.

Users are not allowed to run interactive jobs from login nodes. A batch-interactive PBS job is started by using the -I option with qsub. After the interactive job starts, run computationally intensive applications from the Lustre scratch space, placing the executable after the mpirun command.

Interactive Batch Example

For interactive batch jobs, PBS options are passed through qsub on the command line. Refer to the following example:

qsub -I -A UT-NTNL0121 -l nodes=1,walltime=1:00:00



-I    Start an interactive session
-A    Charge to the “UT-NTNL0121” project
-l    Request 1 physical compute node (16 cores) for one hour

After running this command, you will have to wait until enough compute nodes are available, just as with any other batch job. Once the job starts, the standard input and standard output of your terminal are linked directly to the head node of your allocated resources. The executable should be placed on the same line after the mpirun command, just as in a batch script.

> cd /lustre/medusa/$USER
> mpirun -n 16 ./a.out

Issuing the exit command will end the interactive job.


Common PBS Options

This section gives a quick overview of common PBS options.

Necessary PBS options




Option  Use                      Description
A       #PBS -A <account>        Causes the job time to be charged to <account>. The account string is typically composed of three letters followed by three digits and optionally followed by a subproject identifier. The utility showusage can be used to list your valid assigned project ID(s). This is the only option required by all jobs.
l       #PBS -l nodes=<nodes>    Number of requested nodes.
        #PBS -l walltime=<time>  Maximum wall-clock time. <time> is in the format HH:MM:SS. Default is 1 hour.

Other PBS Options




Option  Use                        Description
o       #PBS -o <name>             Writes standard output to <name> instead of <job script>.o$PBS_JOBID. $PBS_JOBID is an environment variable created by PBS that contains the PBS job identifier.
e       #PBS -e <name>             Writes standard error to <name> instead of <job script>.e$PBS_JOBID.
j       #PBS -j {oe,eo}            Combines standard output and standard error into the standard error file (eo) or the standard output file (oe).
m       #PBS -m a                  Sends email to the submitter when the job aborts.
        #PBS -m b                  Sends email to the submitter when the job begins.
        #PBS -m e                  Sends email to the submitter when the job ends.
M       #PBS -M <address>          Specifies the email address to use for -m options.
N       #PBS -N <name>             Sets the job name to <name> instead of the name of the job script.
S       #PBS -S <shell>            Sets the shell to interpret the job script.
qos     #PBS -q <queue>            Directs the job to run under the specified QoS. This option is not required to run in the default QoS.
l       #PBS -l feature=<feature>  Selects the desired node feature set.

Note: Please do not use the PBS -V option. It can propagate large numbers of environment variable settings from the submitting shell into a job, which may cause problems for the batch environment. Instead, pass only the necessary environment variables using -v <comma_separated_list_of_needed_envars>. You can also include module load statements in the job script.
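
As an illustration of the -v form, the sketch below builds the comma-separated list from individual variable names. The helper name join_by_comma, the variable names, and the script name job.pbs are examples only; the snippet prints the resulting qsub command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: join all arguments with commas, which is the
# list format that the -v option takes. Runs in a subshell so the
# change to IFS does not leak into the caller.
join_by_comma() (
    IFS=,
    printf '%s\n' "$*"
)

# Variable names are examples; list only the ones your job needs.
vars=$(join_by_comma OMP_NUM_THREADS INPUT_FILE)
echo "qsub -v $vars job.pbs"
```

Listing the variables explicitly keeps the job environment small and makes it clear which settings a job depends on.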



Further details and other PBS options may be found using the man qsub command.


PBS Environment Variables

This section gives a quick overview of useful environment variables that are set within PBS jobs.

    • PBS sets the environment variable PBS_O_WORKDIR to the directory from which the batch job was submitted.
    • By default, a job starts in your home directory. Often, you would want to do cd $PBS_O_WORKDIR to move back to the directory you were in. The current working directory when you start mpirun should be on Lustre Space.

Include the following command in your script if you want it to start in the submission directory:

cd $PBS_O_WORKDIR

    • PBS sets the environment variable PBS_JOBID to the job's ID.
    • A common use for PBS_JOBID is to append the job's ID to the standard output and error file(s).

Include the following option in your script to append the job's ID to the standard output and error file(s):

#PBS -o scriptname.o$PBS_JOBID

    • PBS sets the environment variable PBS_NNODES to the number of logical cores requested (not nodes). Given that Beacon has 16 physical cores per node, the number of nodes would be given by $PBS_NNODES/16.
    • For example, a standard MPI program is generally started with mpirun -n $PBS_NNODES ./a.out. See the Job Execution section for more details.
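
The division described above can be done with shell arithmetic. In this sketch the function name nodes_from_cores is hypothetical, and the fallback value of 48 is used only so the snippet also runs outside a job, where PBS_NNODES is not set.

```shell
#!/bin/sh
# Cores per node; 16 is the figure given above.
cores_per_node=16

# Hypothetical helper: convert a core count (as held in PBS_NNODES)
# into a node count using integer division.
nodes_from_cores() {
    echo $(( $1 / cores_per_node ))
}

# Outside a job PBS_NNODES is unset, so fall back to 48 for illustration.
echo "nodes: $(nodes_from_cores "${PBS_NNODES:-48}")"
```

With the fallback value this reports 3 nodes, matching a request of -l nodes=3 on 16-core nodes.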


Monitoring Job Status

This section describes some ways to monitor jobs in the ACF batch environment. The Torque resource manager and Moab scheduler provide multiple tools to view the queues, batch system, job status, and scheduler information. The most common and useful of these tools are described below.


Use qstat -a to check the status of submitted jobs.

> qstat -a 
Job ID    Username  Queue  Jobname  SessID  NDS  TSK  Memory  Time      S  Time
--------  --------  -----  -------  ------  ---  ---  ------  --------  -  --------
102903    lucio     batch  STDIN    9317    --   16   --      01:00:00  C  00:06:17
102904    lucio     batch  STDIN    9590    --   16   --      01:00:00  R  --

The qstat output shows the following:

Job ID        The first column gives the PBS-assigned job ID.
Username      The second column gives the submitting user's login name.
Queue         The third column gives the queue into which the job has been submitted.
Jobname       The fourth column gives the PBS job name. This is specified by the PBS -N option in the PBS batch script. If the -N option is not used, PBS uses the name of the batch script.
SessID        The fifth column gives the associated session ID.
NDS           The sixth column gives the PBS node count. Not accurate; will be one.
Tasks         The seventh column gives the number of logical cores requested by the job's -size option.
Req’d Memory  The eighth column gives the job's requested memory.
Req’d Time    The ninth column gives the job's requested wall time.
S             The tenth column gives the job's current status. See the status listings below.
Elap Time     The eleventh column gives the job's time spent in a running status. If a job is not currently in, and has not been in, a run state, the field will be blank.

The job's current status is reported by the qstat command. The possible values are listed in the table below.

Status value  Meaning

E             Exiting after having run
T             Being moved to new location
W             Waiting for its execution time
C             Recently completed (within the last 5 minutes)
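
To summarize a long job listing, the status column can be tallied with awk. This is a minimal sketch: the function name count_by_state is hypothetical, and it assumes the column layout shown above (status in the tenth column) and job names without spaces. The sample rows are taken from the qstat -a example.

```shell
#!/bin/sh
# Hypothetical helper: read `qstat -a` output on stdin and print
# "<status> <count>" for each status value. Job rows are recognized
# by a job ID that starts with a digit; the status is column 10.
count_by_state() {
    awk '$1 ~ /^[0-9]/ { n[$10]++ } END { for (s in n) print s, n[s] }' | sort
}

# Demonstration using the two sample rows shown earlier.
count_by_state <<'EOF'
102903  lucio  batch  STDIN  9317  --  16  --  01:00:00  C  00:06:17
102904  lucio  batch  STDIN  9590  --  16  --  01:00:00  R  --
EOF

# Usage on a live system: qstat -a | count_by_state
```

For the sample rows this prints one completed job (C 1) and one running job (R 1).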


The Moab showq utility shows the scheduler's view of jobs in the queue. The utility shows the state of jobs from the scheduler's point of view, including:

Running    These jobs are currently running.
Idle       These jobs are currently queued, waiting to be assigned resources by the scheduler. A user is allowed five jobs in the Idle state to be considered by the Moab scheduler.
Blocked    Blocked jobs are those that are ineligible to be considered by the scheduler. Common reasons are that the specified resources are not available or that the user or system has put a hold on the job.
BatchHold  These jobs are currently in the queue but are on hold from being considered by the scheduler, usually because the requested resources are not available in the system or because the resource manager has repeatedly failed in attempts to start the job.


The Moab checkjob utility can be used to view details of a job in the queue. For example, if job 736 is currently in a blocked state, the following can be used to view the reason:

> checkjob 736

The return may contain a line similar to the following:

BLOCK MSG: job 736 violates idle HARD MAXIJOB limit of 5 for user <your_username>  partition ALL (Req: 1  InUse: 5) 

This line indicates the job is in the blocked state because the owning user has reached the limit of five jobs currently in the eligible state.
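
When checkjob output is long, the block reason can be pulled out with sed. This is a small sketch; the function name block_reason is hypothetical, and it assumes the reason appears on a line containing "BLOCK MSG:" as in the example above.

```shell
#!/bin/sh
# Hypothetical helper: read checkjob output on stdin and print only the
# text that follows "BLOCK MSG:". Prints nothing if no such line exists.
block_reason() {
    sed -n 's/^.*BLOCK MSG: *//p'
}

# Usage on a live system: checkjob 736 | block_reason
```

An empty result means the output contained no block message, which is expected for jobs that are not in the blocked state.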


The Moab showstart utility gives an estimate of when the job will start.

> showstart 100315
job 100315 requires 16384 procs for 00:40:00

Estimated Rsv based start in 15:26:41 on Fri Sep 26 23:41:12
Estimated Rsv based completion in 16:06:41 on Sat Sep 27 00:21:12

The start time may change dramatically as new jobs with higher priority are submitted, so you need to periodically rerun the command.


The Moab showbf utility shows the resources currently available for backfill. This can help you create a job that can be backfilled immediately. As such, it is primarily useful for short jobs.


Scheduling Policy

The ACF uses TORQUE as the resource manager and Moab as the scheduler to schedule jobs. The ACF has been divided into logical units known as condos. There are institutional and individual condos in the ACF. Institutional condos are available for any faculty, staff, or student at that institution. Individual condos are available to projects that have invested in the ACF.

The scheduler gives preference to large core count jobs. Moab is configured to do “first fit” backfill. Backfilling allows smaller, shorter jobs to use otherwise idle resources.

Users can alter certain attributes of queued jobs until they start running. The order in which jobs are run depends on the following factors:

  • number of nodes requested - jobs that request more nodes get a higher priority.
  • queue wait time - a job's priority increases along with its queue wait time (not counting blocked jobs as they are not considered "queued.")
  • number of jobs - a maximum of ten jobs per user, at a time, will be eligible to run.

Currently, single core jobs by the same user will get scheduled on the same node. Users on the same project can share nodes with written permission of the PI.

In certain special cases, the priority of a job may be manually increased upon request. To request a priority change, contact ACF User Support and provide the job ID and the reason for the request.


Known Issues

When a user successfully authenticates for the sipmount or qsub command, the system reports that the user has successfully logged in, even though the user is already logged in. This is because the authentication mechanism uses the same security controls that are used to authenticate a login attempt.


Last Updated: 10 / 15 / 2019