SGE has three server components: commd, qmaster and schedd.
The qmaster and the schedd are self-explanatory, but the commd is not.
The commd is the communication daemon, "the communicator of all running jobs": the other SGE components exchange their messages through it. It does not execute jobs itself.
On each node there is an execd that communicates with the server's commd and places a job into execution when it receives a copy of the job from the qmaster.

How to submit a Job


I. Using the command qsub users can submit a script file.

II. Example of a script file:


vi hello.sh

#!/bin/sh
# This is a simple example of an SGE script
#$ -N sample
cd $HOME/
./hello-world
:wq
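
If you prefer not to use an editor, the same file can be written from the shell. This is only a sketch: it assumes the SGE directive prefix is "#$" and that hello-world lives in your home directory, as in the listing above.

```shell
#!/bin/sh
# Sketch: create hello.sh without an interactive editor.
# The quoted 'EOF' delimiter keeps $HOME literal, so it is expanded
# when the job runs, not when the file is written.
cat > hello.sh <<'EOF'
#!/bin/sh
# This is a simple example of an SGE script
#$ -N sample
cd $HOME/
./hello-world
EOF

# Parse the script without executing it to catch shell syntax errors.
sh -n hello.sh && echo "hello.sh: syntax OK"
```

The syntax check only validates the shell code; lines starting with "#$" are comments to the shell and are read by SGE at submission time.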

III. To submit this script, type:
          hostname >  qsub hello.sh  

The output on the screen shows the job ID in the form number.hostname (e.g. 0.hostname).
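
When a script needs only the numeric part of the job ID, it can be split off with shell parameter expansion. A minimal sketch, assuming the ID has the number.hostname form described above; job_number is a hypothetical helper name, not an SGE command.

```shell
#!/bin/sh
# job_number is a hypothetical helper, not part of SGE.
# It removes everything from the first dot onward,
# so "0.hostname" becomes "0".
job_number() {
    printf '%s\n' "${1%%.*}"
}

# Demonstration with the example ID from this page.
job_number "0.hostname"
```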

How to check the status of my job/host


I. Using the command qstat, users can check the status of their jobs. Look for the job ID number.

  hostname >  qstat 

Job id Name User Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
0.hostname hello.sh jackie 00:00:00 R batch

Note: See the man pages for further options to qstat
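
On a busy cluster the listing can be long. The sketch below filters it down to one user's jobs with awk. It assumes the six-column layout of the sample listing above (job ID, name, user, time, state, queue); jobs_for_user is a hypothetical helper name.

```shell
#!/bin/sh
# jobs_for_user is a hypothetical helper. It reads a qstat listing on
# standard input and prints "job-id state" for each job owned by user $1.
# Assumed fields per data row: id, name, user, time, state, queue.
# NR > 2 skips the header line and the dashed rule.
jobs_for_user() {
    awk -v user="$1" 'NR > 2 && $3 == user { print $1, $5 }'
}

# Demonstration on the sample listing from this page.
jobs_for_user jackie <<'EOF'
Job id Name User Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
0.hostname hello.sh jackie 00:00:00 R batch
EOF
```

On a live system the helper would be fed from the real command, for example: qstat | jobs_for_user jackie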

II. Using the command qhost, users can check the status of the execution hosts.


  hostname > qhost 

HOSTNAME             ARCH       NPROC  LOAD   MEMTOT   MEMUSE   SWAPTO   SWAPUS
-------------------------------------------------------------------------------
global               -              -     -        -        -        -        -
node001              glinux         2     -     2.0G        -     1.9G        -
node002              glinux         2  0.01     2.0G    28.9M     1.9G     2.4M
node003              glinux         2  0.02     2.0G    29.8M     1.9G     2.4M
node004              glinux         2  0.00     2.0G    27.2M     1.9G      0.0
node005              glinux         2  0.00     2.0G    27.6M     1.9G      0.0
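
In the listing above, node001 shows "-" in the LOAD column, which may indicate that the host's execd is not reporting. A minimal sketch for picking out such hosts, assuming the column order shown (LOAD is the fourth field); hosts_without_load is a hypothetical helper name.

```shell
#!/bin/sh
# hosts_without_load is a hypothetical helper. It reads qhost output on
# standard input and prints every host whose LOAD column (field 4) is "-".
# The header, the dashed rule and the "global" pseudo-host are skipped.
hosts_without_load() {
    awk 'NR > 2 && $1 != "global" && $4 == "-" { print $1 }'
}

# Demonstration on part of the sample listing from this page.
hosts_without_load <<'EOF'
HOSTNAME             ARCH       NPROC  LOAD   MEMTOT   MEMUSE   SWAPTO   SWAPUS
-------------------------------------------------------------------------------
global               -              -     -        -        -        -        -
node001              glinux         2     -     2.0G        -     1.9G        -
node002              glinux         2  0.01     2.0G    28.9M     1.9G     2.4M
EOF
```

On a live system: qhost | hosts_without_load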

How to script for a Parallel Run



I. Example of a parallel script file:
   vi mpi-hello.sh
   #!/bin/sh
   #$ -N MPI_Job
   #$ -pe mpich 20

   /usr/local/mpich-pgi/bin/mpirun -np $NSLOTS -machinefile $TMPDIR/machines /home/scoggins/mpi-hello

:wq

Submit this script the same way as the first one. The line "#$ -pe mpich 20" tells SGE that this is a parallel
run: it requests 20 slots in the mpich parallel environment, and SGE runs that environment's start-up procedure so that the job can use MPI.
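
To see where a parallel job was placed, the job script can summarize the machine file before calling mpirun. This is a sketch under an assumption: that $TMPDIR/machines contains one hostname per line, repeated once for every slot granted on that host. slots_per_host is a hypothetical helper name.

```shell
#!/bin/sh
# slots_per_host is a hypothetical helper. It prints "host count" for each
# distinct hostname in the file given as $1.
# Assumption: one hostname per line, one line per granted slot.
slots_per_host() {
    sort "$1" | uniq -c | awk '{ print $2, $1 }'
}

# Demonstration with a stand-in for $TMPDIR/machines.
demo_file=$(mktemp)
printf '%s\n' node002 node003 node002 node003 node004 > "$demo_file"
slots_per_host "$demo_file"
rm -f "$demo_file"
```

Inside a job script the call would be: slots_per_host "$TMPDIR/machines"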

If you want to see what parallel environments are available run:
  qconf -spl      

  hostname> qconf -spl
     mpi
     mpich

If you want to see how a parallel environment is configured, run:
  qconf -sp <pe_name>   (e.g. mpich, lam, etc.)

       hostname > qconf -sp mpich 

	pe_name           mpich
	queue_list        all
	slots             44
	user_lists        NONE
	xuser_lists       NONE
	start_proc_args   /sge/mpi/startmpi.sh -catch_rsh $pe_hostfile
	stop_proc_args    /sge/mpi/stopmpi.sh
	allocation_rule   $round_robin
	control_slaves    TRUE
	job_is_first_task FALSE
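
A single setting, such as the total number of slots, can be pulled out of this output with awk. A minimal sketch, assuming the two-column "name value" layout shown above; pe_field is a hypothetical helper name.

```shell
#!/bin/sh
# pe_field is a hypothetical helper. It reads "qconf -sp" output on
# standard input and prints the first word of the value for setting $1.
# For multi-word values (e.g. start_proc_args) only the first word is printed.
pe_field() {
    awk -v key="$1" '$1 == key { print $2 }'
}

# Demonstration on part of the sample output from this page.
pe_field slots <<'EOF'
	pe_name           mpich
	queue_list        all
	slots             44
	allocation_rule   $round_robin
EOF
```

On a live system: qconf -sp mpich | pe_field slots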