Using Complexes with GridEngine
Last Modified: 07/02 10:49
Complexes let you make very specific requests to the scheduler, such as which interconnect your application prefers, how much free memory you require, or whether you mind running on low-priority machines. To get the most out of the queuing environment, it helps to understand how complexes work and which ones you should request for your type of application.
Important Complexes
The following table lists the custom-defined complexes that are needed to get your job onto the right hardware. As rules of thumb:
- Complexes which begin with i_ are used to specify the interconnect
- Those that begin with p_ are used to specify a priority
- Those that begin with lic_ are used to check out a license for a licensed application
| Complex Name | Value | Description | Default |
| h_rt | Time | Request and set a certain amount of run-time for your job. Jobs are killed when they reach this limit | 10 Minutes |
| h_vmem, h_stack, mem_free | Memory | Manage job memory. h_vmem sets an upper limit on available virtual memory. h_stack sets an upper limit on the stack space available for binary execution; it must be set whenever you use h_vmem, or your job will have NO stack space available (64M is a reasonable value). mem_free requests nodes that have at least the specified amount of memory free. | None |
| i_ib | true/false | If you request that this complex be true, your job will only be dispatched to nodes interconnected with InfiniBand | False |
| i_mx | true/false | If you request that this complex be true, your job will only be dispatched to nodes interconnected with Myrinet | False |
| i_numa | true/false | This complex is used to request resources on the multi-processor large-memory nodes | False |
| p_low | true/false | Tells the scheduler to allow your job to run in low-priority queues. Low-priority queues do not guarantee complete execution, since your job may be preempted by higher-priority jobs. | False |
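The h_vmem/h_stack rule in the table is easy to forget, so it can help to check a submit script before submitting it. The following is a minimal sketch, not a site-provided tool: the function name `check_stack_request` and the demo file are hypothetical, and it assumes requests are written on `#$` directive lines as described under Requesting Complexes.

```shell
#!/bin/bash
# Hypothetical pre-submission check: warn when a submit script requests
# h_vmem without also requesting h_stack.
check_stack_request() {
  script="$1"
  # Only GridEngine directive lines (those starting with "#$") are inspected.
  if grep -E '^#\$.*h_vmem=' "$script" >/dev/null &&
     ! grep -E '^#\$.*h_stack=' "$script" >/dev/null; then
    echo "warning: $script sets h_vmem but not h_stack"
    return 1
  fi
  return 0
}

# Demonstration on a throwaway script that sets h_vmem only.
demo_script=$(mktemp)
printf '%s\n' '#!/bin/bash' '#$ -l h_vmem=2G' > "$demo_script"
if check_result=$(check_stack_request "$demo_script"); then
  check_status=0
else
  check_status=1
fi
echo "$check_result"
rm -f "$demo_script"
```

The check is purely textual, so it will not notice limits passed on the command line instead of in the script.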
Requesting Complexes
To request a complex, you must add the following line to your submit script:
#$ -l <complex_name>=<value>
where <complex_name> is one of the complexes defined above (or one of the standard complexes described in the GridEngine User's Guide and shown below), and <value> is the desired value of the complex.
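As an illustration of the request syntax, here is a sketch of a submit script that asks for a longer run-time than the 10-minute default and sets memory limits, including the h_stack value that must accompany h_vmem. The job name and the specific values (one hour, 2G, 64M) are assumptions chosen for the example, not site recommendations. Several complexes can share one `-l` line, separated by commas, or be split across several `-l` lines.

```shell
#!/bin/bash
#$ -N my_job_with_limits
#$ -cwd
#$ -l h_rt=01:00:00
#$ -l h_vmem=2G,h_stack=64M
#$ -l mem_free=2G
#$ -j y
#$ -o output.$JOB_ID

# Record the limits this shell is running under. Inside a job these are
# expected (an assumption about the local setup) to reflect the h_vmem and
# h_stack requests; outside a job they show your login shell's limits.
vmem_limit=$(ulimit -v)
stack_limit=$(ulimit -s)
echo "virtual memory limit (units of 1024 bytes): ${vmem_limit}"
echo "stack limit (units of 1024 bytes): ${stack_limit}"
```

Printing the limits at the top of a job's output is a cheap way to confirm that the requests took effect before the real application starts.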
Information for SPARC Users
To run code on the SPARC machines, you must set an architecture flag in your job submission script. To run only on SPARC machines, add the following resource request line to your submit script:
#$ -l arch=sol-sparc64
A couple of examples:
#!/bin/bash
#$ -N my_job_needs_myrinet
#$ -cwd
#$ -pe ompi*
#$ -l i_mx=true,p_low=true
#$ -j y
#$ -o output.$JOB_ID
sge_mpirun /opt/apps/my_app/my_binary
- This script will run in the first available queue that supports Myrinet. (i_mx=true)
- This script will run in the first available queue regardless of priority. (p_low=true)
#!/bin/bash
#$ -N my_job_needs_infiniband
#$ -cwd
#$ -pe ompi*
#$ -l i_ib=true
#$ -j y
#$ -o output.$JOB_ID
sge_mpirun /opt/apps/my_app/my_binary
- This script will run in the first available queue with InfiniBand. (i_ib=true)
- This script will run only in normal-priority queues, where it cannot be preempted. (p_low=true is not requested)
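The two examples above cover the interconnect and priority complexes. A third case, the multi-processor large-memory nodes, might be requested along the following lines. This is a sketch under stated assumptions: the job name, the 16G and two-hour values, and the application path are placeholders, and whether a node with that much free memory exists depends on the cluster.

```shell
#!/bin/bash
#$ -N my_job_needs_large_memory
#$ -cwd
#$ -l i_numa=true,mem_free=16G
#$ -l h_rt=02:00:00
#$ -j y
#$ -o output.$JOB_ID

# Hypothetical application path; replace with your own binary.
app=/opt/apps/my_app/my_binary
if [ -x "$app" ]; then
  "$app"
else
  echo "application not found: $app"
fi
```

- This script will run only on the large-memory nodes. (i_numa=true)
- This script will wait until a node reports at least the requested free memory. (mem_free=16G)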
Complex Table
This is a table of all GridEngine complexes. You can request machines with a variety of characteristics such as machines with a certain amount of memory or a particular architecture type.
| Complex | Description | Short-cut | Data Type | Relational Operator | Requestable | Consumable | Default | Urgency |
| a_hadoop | Use only queues configured to support Hadoop | hadoop | BOOL | == | YES | NO | 0 | 0 |
| arch | System architecture (lx24-x86, lx24-amd64, sol-sparc64) | a | RESTRING | == | YES | NO | NONE | 0 |
| calendar | Job calendar (see SGE User's Guide) | c | RESTRING | == | YES | NO | NONE | 0 |
| cpu | n/a | cpu | DOUBLE | >= | YES | NO | 0 | 0 |
| h_core | Hard core file size | h_core | MEMORY | <= | YES | NO | 0 | 0 |
| h_cpu | Hard max CPU time | h_cpu | TIME | <= | YES | NO | 0:0:0 | 0 |
| h_data | Hard max heap size | h_data | MEMORY | <= | YES | NO | 0 | 0 |
| h_fsize | Hard max file size | h_fsize | MEMORY | <= | YES | NO | 0 | 0 |
| h_rss | Hard max resident memory | h_rss | MEMORY | <= | YES | NO | 0 | 0 |
| h_rt | Hard max runtime | h_rt | TIME | <= | YES | NO | 0:0:0 | 0 |
| h_stack | Hard max stack size | h_stack | MEMORY | <= | YES | NO | 0 | 0 |
| h_vmem | Hard max Virtual Memory (includes swap space) | h_vmem | MEMORY | <= | YES | NO | 0 | 0 |
| hostname | Machine hostname | h | HOST | == | YES | NO | NONE | 0 |
| i_ib | Has InfiniBand | i_ib | BOOL | == | YES | NO | 0 | 0 |
| i_mx | Has Myrinet | i_mx | BOOL | == | YES | NO | 0 | 0 |
| i_numa | Shared memory/NUMA | i_numa | BOOL | == | YES | NO | 0 | 0 |
| lic_schrod | Request Schrodinger license | lic_schrod | INT | <= | YES | YES | 1 | 0 |
| load_avg | System load average | la | DOUBLE | >= | NO | NO | 0 | 0 |
| load_long | 15-minute load average | ll | DOUBLE | >= | NO | NO | 0 | 0 |
| load_medium | 5-minute load average | lm | DOUBLE | >= | NO | NO | 0 | 0 |
| load_short | 1-minute load average | ls | DOUBLE | >= | NO | NO | 0 | 0 |
| mem_free | Free Memory | mf | MEMORY | <= | YES | NO | 0 | 0 |
| mem_total | Total Memory | mt | MEMORY | <= | YES | NO | 0 | 0 |
| mem_used | Used Memory | mu | MEMORY | >= | YES | NO | 0 | 0 |
| min_cpu_interval | n/a | mci | TIME | <= | NO | NO | 0:0:0 | 0 |
| np_load_avg | Per-processor load average | nla | DOUBLE | >= | NO | NO | 0 | 0 |
| np_load_long | 15-minute per-processor load average | nll | DOUBLE | >= | NO | NO | 0 | 0 |
| np_load_medium | 5-minute per-processor load average | nlm | DOUBLE | >= | NO | NO | 0 | 0 |
| np_load_short | 1-minute per-processor load average | nls | DOUBLE | >= | NO | NO | 0 | 0 |
| num_proc | Number of processors | p | INT | == | YES | NO | 0 | 0 |
| p_low | Low Priority | p_low | BOOL | == | YES | NO | 0 | 0 |
| qname | Queue name | q | RESTRING | == | YES | NO | NONE | 0 |
| rerun | Queue re-runnable | re | BOOL | == | NO | NO | 0 | 0 |
| s_core | Soft max core file size | s_core | MEMORY | <= | YES | NO | 0 | 0 |
| s_cpu | Soft max CPU time | s_cpu | TIME | <= | YES | NO | 0:0:0 | 0 |
| s_data | Soft max heap size | s_data | MEMORY | <= | YES | NO | 0 | 0 |
| s_fsize | Soft max file size | s_fsize | MEMORY | <= | YES | NO | 0 | 0 |
| s_rss | Soft max resident memory size | s_rss | MEMORY | <= | YES | NO | 0 | 0 |
| s_rt | Soft max runtime | s_rt | TIME | <= | YES | NO | 0:0:0 | 0 |
| s_stack | Soft max stack size | s_stack | MEMORY | <= | YES | NO | 0 | 0 |
| s_vmem | Soft max virtual memory | s_vmem | MEMORY | <= | YES | NO | 0 | 0 |
| seq_no | Queue sequence number | seq | INT | == | NO | NO | 0 | 0 |
| slots | Total slots | s | INT | <= | YES | YES | 1 | 1000 |
| sse2 | Select only nodes with sse2 (and above) processor feature | sse2 | BOOL | == | YES | NO | FALSE | 0 |
| sse3 | Select only nodes with sse3 (and above) processor feature | sse3 | BOOL | == | YES | NO | FALSE | 0 |
| sse4 | Select only nodes with sse4 (and above) processor feature | sse4 | BOOL | == | YES | NO | FALSE | 0 |
| sse41 | Select only nodes with sse41 (and above) processor feature | sse41 | BOOL | == | YES | NO | FALSE | 0 |
| sse4a | Select only nodes with sse4a (and above) processor feature | sse4a | BOOL | == | YES | NO | FALSE | 0 |
| swap_free | Free swap space | sf | MEMORY | <= | YES | NO | 0 | 0 |
| swap_rate | Swap pages in and out | sr | MEMORY | >= | YES | NO | 0 | 0 |
| swap_rsvd | Reserved swap | srsv | MEMORY | >= | YES | NO | 0 | 0 |
| swap_total | Total swap | st | MEMORY | <= | YES | NO | 0 | 0 |
| swap_used | Swap used | su | MEMORY | >= | YES | NO | 0 | 0 |
| tmpdir | Temporary directory path (/opt/sge/tmp on all hosts) | tmp | RESTRING | == | NO | NO | NONE | 0 |
| virtual_free | Virtual memory free | vf | MEMORY | <= | YES | NO | 0 | 0 |
| virtual_total | Virtual memory total | vt | MEMORY | <= | YES | NO | 0 | 0 |
| virtual_used | Virtual memory used | vu | MEMORY | >= | YES | NO | 0 | 0 |
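Only the complexes marked YES in the Requestable column can be used with `-l`. The sketch below filters a complex listing down to those names. It assumes the whitespace-separated column order printed by `qconf -sc` (name, shortcut, type, relop, requestable, consumable, default, urgency); the function name `list_requestable` is hypothetical, and the sample rows are copied from the table above so the example runs without a cluster.

```shell
#!/bin/bash
# Print the names of requestable complexes from `qconf -sc`-style input.
# Assumed columns: name shortcut type relop requestable consumable default urgency
list_requestable() {
  awk '$1 !~ /^#/ && $5 == "YES" { print $1 }'
}

# Sample rows taken from the table above.
sample_complexes='#name shortcut type relop requestable consumable default urgency
h_rt h_rt TIME <= YES NO 0:0:0 0
i_ib i_ib BOOL == YES NO 0 0
load_avg la DOUBLE >= NO NO 0 0
slots s INT <= YES YES 1 1000'

requestable=$(printf '%s\n' "$sample_complexes" | list_requestable)
echo "$requestable"

# On a host with GridEngine installed, the same filter could be applied to
# live data:
#   qconf -sc | list_requestable
```

With the sample rows, load_avg is dropped because it is not requestable, while h_rt, i_ib, and slots are kept.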