TACC Instructions for Running CIG Software
The Texas Advanced Computing Center (TACC) at The University of Texas at Austin offers four queue levels that vary in priority and number of processors. NOTE: For CitcomS, the parallel HDF5 library is not available on the TACC TeraGrid; you must use ASCII output.
See specifications of TACC's hardware resources and other details at Texas Advanced Computing Center (TACC).
CIG is able to offer small allocations of time at TACC and a few other locations to get you started using CIG software; fill out the application to request an allocation.
Log in to TACC
Once you receive your allocation login (it arrives in a postal-mail package a few weeks after you get your portal login), you can log in to TACC:
$ ssh username@tg-login.tacc.teragrid.org
Gale
You will have to write a batch script, and then submit that script to the queuing system. For example, this script
#!/bin/bash
# first line specifies shell
#BSUB -J Gale #name the job "Gale"
#BSUB -o Gale.o%J #output-> Gale.o<jobID>
#BSUB -e Gale.e%J #error -> Gale.e<jobID>
#BSUB -n 4 -W 1:00 #4 CPUs and 1hr
#BSUB -q normal #Use normal queue.
set -x #Echo all commands (bash; "set echo" is csh syntax).
cd $LS_SUBCWD #cd to directory of submission
ibrun /projects/tg/CIG/Gale/bin/Gale /projects/tg/CIG/Gale/input/cookbook/yielding.xml
will run the cookbook yielding input file for up to 1 hour on 4 CPUs. To submit this script, save it as a file (e.g. Gale_script), and then submit it with the command
$ bsub < Gale_script
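When submissions are scripted, it can help to capture the job ID that bsub reports. The sketch below assumes bsub prints the usual LSF confirmation line of the form "Job <12345> is submitted to queue <normal>."; the helper name parse_job_id is hypothetical, and the message format should be checked against the output on your system.

```shell
#!/bin/bash
# Hypothetical helper: read a bsub confirmation line on stdin and print the
# numeric job ID. Assumes the form: Job <12345> is submitted to queue <normal>.
# Prints nothing if the line does not match.
parse_job_id() {
    sed -n 's/^Job <\([0-9][0-9]*\)> is submitted.*/\1/p'
}

# Usage on a system with LSF (left commented so the block runs anywhere):
#   jobid=$(bsub < Gale_script | parse_job_id)
#   echo "Submitted job $jobid"
```

The captured ID can then be passed to the monitoring and kill commands described later on this page.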
CitcomS
Set up your environment
TeraGrid uses SoftEnv to manage your software environment, including the PATH and
LD_LIBRARY_PATH environment variables. These settings are stored in the file
~/.soft. To run CitcomS, create ~/.soft and add the following lines.
Note that the order of the lines is important.
CIGHOME = /projects/tg/CIG
PATH += $CIGHOME/CitcomS/bin
PYTHONPATH += $CIGHOME/Exchanger-1.0.0/lib/python2.4/site-packages
PATH += $CIGHOME/python-2.4.4/bin
PATH += $CIGHOME/Gale/bin
### uncomment the next line if you want to compile CitcomS from
### CIG's subversion repository
#PATH += $CIGHOME/autotools/bin
@teragrid-basic
@teragrid-dev
After modifying your ~/.soft, remember to run "resoft" to update your settings.
$ resoft
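A quick way to confirm that the new settings took effect is to check that the CitcomS directory listed in ~/.soft now appears in PATH. The helper below is a minimal sketch; the function name path_contains is hypothetical, and the directory is assumed to be the CitcomS bin directory under /projects/tg/CIG given above.

```shell
#!/bin/bash
# Hypothetical helper: succeed if directory $1 is an entry of the
# colon-separated list $2 (defaults to the current PATH).
path_contains() {
    local dir=$1 list=${2:-$PATH}
    case ":$list:" in
        *":$dir:"*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Report whether the CitcomS bin directory (assumed location) is on PATH.
if path_contains /projects/tg/CIG/CitcomS/bin; then
    echo "CitcomS bin directory is on PATH"
else
    echo "CitcomS bin directory is missing from PATH; check ~/.soft and rerun resoft"
fi
```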
Submit your job
CitcomS is able to interact directly with the queuing system. All you have to do is add some input parameters. For example, adding these parameters to your .cfg input file will submit the job, named hello, to the "normal" queue with a maximum run-time of 2 hours.
[CitcomS.job]
queue = normal
name = hello
walltime = 2*hour
On TACC, available queues include:
- normal
- high (if you need to run your simulation with higher priority, use this queue)
- hero (if you need more than 256 processors)
- development (30-minute runtime and 16 CPUs maximum, for debugging and development purposes)
Note: The "walltime" parameter is required; otherwise, you will get the error message "bsub: exit 255." You can use "minute," "hour," and "day" when specifying walltime.
Monitor your job
After you've submitted your job, you can monitor its status with the "bjobs" command. You can view the status of the queue with the "showq" command. More information can be found in the TACC Lonestar User Guide.
Sometimes you might need to remove a pending job from the queue or kill a running job:
$ bkill <jobid> # Removes pending or running job.
$ bkill -s 9 <jobid> # Sends (sig)kill immediately to running job.
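For unattended runs, the status check can be scripted. The sketch below assumes the default bjobs layout, in which a one-line header is followed by one line per job with the job state (for example PEND or RUN) in the third column. The helper names job_state and wait_for_job are hypothetical, and the column position should be verified against your own bjobs output before relying on it.

```shell
#!/bin/bash
# Hypothetical helper: read bjobs output on stdin and print the STAT column
# of the first job line (assumes a one-line header and the state in column 3).
job_state() {
    awk 'NR == 2 { print $3 }'
}

# Hypothetical helper: poll bjobs every 60 seconds until the job is no longer
# in the PEND or RUN state, then print the last state seen (UNKNOWN if bjobs
# returned nothing for that job ID).
wait_for_job() {
    local jobid=$1 state
    while true; do
        state=$(bjobs "$jobid" 2>/dev/null | job_state)
        case "$state" in
            PEND|RUN) sleep 60 ;;
            *)        echo "${state:-UNKNOWN}"; return 0 ;;
        esac
    done
}

# Usage on a system with LSF (left commented so the block runs anywhere):
#   wait_for_job 12345
```

Polling once a minute keeps the load on the scheduler low; a shorter interval is rarely useful for jobs that run for hours.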