Job Run Time
Last Modified: 07/26 8:24
All jobs are required to have a hard run-time specification. Jobs that do not have this specification will have a default run-time of 10 minutes and will be stopped at that point.
Users should tell the scheduler the estimated run time of their jobs by including the following option in their submit scripts:
#$ -l h_rt=XX:XX:XX
where XX:XX:XX is the hours, minutes and seconds the job is expected to run. This time can also be expressed as a single integer of seconds.
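As a small illustration of the two accepted forms, the sketch below converts an HH:MM:SS value into the equivalent integer number of seconds. The function name hms_to_seconds is a hypothetical helper made up for this example; it is not part of Grid Engine.

```shell
#!/bin/sh
# Hypothetical helper (not a Grid Engine command): convert HH:MM:SS to seconds.
# awk treats fields such as "08" as decimal, which avoids shell octal pitfalls.
hms_to_seconds() {
    printf '%s\n' "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# In a submit script, these two directives request the same 2.5-hour limit:
#   #$ -l h_rt=02:30:00
#   #$ -l h_rt=9000
hms_to_seconds 02:30:00   # prints 9000
```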
Rules of Thumb
It is generally better to overestimate your run time than to underestimate it, especially when your job cannot be restarted or is not checkpointed. For shorter jobs (up to about 24 hours), it's good to overestimate by around 10 percent. Doing so will not incur a penalty with the scheduler: the more time you actually use, the lower your priority becomes in the Share Tree, regardless of the requested run time. See the GridEngine Policy page for more info.
For jobs lasting longer than 24 hours, you might want to lower this percentage by 1 percentage point for each additional day, but no lower than 2%.
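The rule of thumb above can be sketched as a small calculation. This is only one reading of the rule: it assumes that every started day beyond the first 24 hours counts as an "additional day", and the function name padded_h_rt is hypothetical. The input and output are in seconds, which h_rt accepts directly.

```shell
#!/bin/sh
# Hypothetical helper: pad an estimated run time (in seconds) per the rule of thumb.
# Assumption: each started day beyond the first 24 hours lowers the padding
# by one percentage point, from 10% down to a floor of 2%.
padded_h_rt() (
    est=$1
    extra_days=$(( (est - 1) / 86400 ))
    pct=$(( 10 - extra_days ))
    if [ "$pct" -lt 2 ]; then
        pct=2
    fi
    echo $(( est + est * pct / 100 ))
)

padded_h_rt 36000     # 10-hour estimate, 10% padding: prints 39600
padded_h_rt 172800    # 48-hour estimate, 9% padding: prints 188352
padded_h_rt 864000    # 10-day estimate, 2% floor: prints 881280
```

The result can be used as the integer value in the h_rt request, for example h_rt=39600 for the first case.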
Determining Job Run Time
Only benchmarking or profiling your code will provide a reasonable time estimate, but there are some rules you can go by when attempting to make an educated guess:
- Embarrassingly Parallel or Coarse-Grained tasks tend to scale linearly or near-linearly. Generally, you can divide the time it takes to run on one processor by the number of processors you plan to run on.
- Fine-Grained computations generally follow a scaling curve where, past some point, adding resources does not yield any appreciable speedup. Depending on the parameters passed to the program, there may be no established point of reference for a reasonable time estimate. The best thing to do in this case is to benchmark the code with the same input parameters but with fewer repeated iterations, time steps, etc. Get an idea of how the application behaves with smaller, shorter-running jobs, then make a reasonable estimate of how the run time will change as you increase your iterations, time steps, etc.
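The two estimates described above can be sketched as simple arithmetic. Both function names are hypothetical. The second function assumes that run time grows linearly with the number of iterations, which should be checked against your own benchmarks since real codes may not behave that way. All times are in seconds and results are rounded up.

```shell
#!/bin/sh
# Hypothetical helper: ideal linear scaling for embarrassingly parallel work.
# Divides the single-processor run time by the processor count, rounding up.
parallel_estimate() (
    t1=$1
    nproc=$2
    echo $(( (t1 + nproc - 1) / nproc ))
)

# Hypothetical helper: extrapolate from a short benchmark run.
# Assumption: run time is proportional to the iteration count.
scale_from_benchmark() (
    bench_seconds=$1
    bench_iters=$2
    full_iters=$3
    echo $(( (bench_seconds * full_iters + bench_iters - 1) / bench_iters ))
)

parallel_estimate 36000 16           # 10 hours on 1 processor, 16 processors: prints 2250
scale_from_benchmark 120 100 10000   # 120 s for 100 iterations, 10000 planned: prints 12000
```

Either figure is only a starting point; adding some padding on top of it before setting h_rt is still advisable.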
Interactive Jobs
For interactive jobs using qrsh or qlogin, users should specify a run time of either infinity or however long they plan to use the session. This is accomplished with the following:
#$ -l h_rt=INFINITY
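Because qrsh and qlogin are usually started from the command line rather than from a submit script, the same resource request can also be passed with the -l option directly. The sketch below only prints the command line instead of running it, so it can be tried on a machine without Grid Engine. The function name interactive_cmd and the four-hour value are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build (but do not run) an interactive session command.
# First argument: qrsh or qlogin (default qrsh).
# Second argument: h_rt value (default INFINITY).
interactive_cmd() (
    tool=${1:-qrsh}
    limit=${2:-INFINITY}
    echo "$tool -l h_rt=$limit"
)

interactive_cmd                    # prints: qrsh -l h_rt=INFINITY
interactive_cmd qlogin 04:00:00    # prints: qlogin -l h_rt=04:00:00
```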