GridEngine Scheduling & Dispatch Policy
Last Modified: 07/26 8:25
This guide explains how jobs are scheduled and dispatched, and why each job receives the priority it does.
Fair-Share Scheduling
The GridEngine installation at USF implements a fair-share scheduling system in which each user's recorded utilization determines the queue priority of that user's next job relative to other users' jobs. In principle, this plays out as in the following scenario:
- Over the course of two weeks, user johnq submits over 100 jobs and consumes about 1000 hours of CPU time. The system records this usage.
- In the third week, janeq logs in and sees that johnq has submitted 1000 jobs to the queue that will require 10000 hours of CPU time to complete. janeq has just returned from a conference and is eager to get started on her work, but is upset that the system is completely full.
- Fortunately for janeq, she will not have to wait for all of johnq's jobs to finish first. Because he has been using the system heavily, johnq has consumed most of his relative Share Tree Tickets. The consumption of these tickets is factored into his job priority. Since greater consumption translates into lower relative priority, all of johnq's pending jobs are pushed to the back of the queue so that janeq can get started.
- Over the next two weeks, janeq shows that she too can be a very heavy user of the system. As her recorded utilization grows, her priority begins to drop so that other users can have access to the system.
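The scenario above can be sketched as a toy model. This is an illustration only and is not GridEngine's actual share-tree calculation: the function name fair_share_order, the job names, and the assumption that priority depends on recorded CPU hours alone are all made up for the example.

```python
# Toy illustration of fair-share ordering. This is NOT GridEngine's share-tree
# algorithm; it only shows the idea that more recorded usage means lower priority.

def fair_share_order(usage_hours, pending_jobs):
    """Order pending jobs so that users with less recorded usage go first.

    usage_hours:  dict mapping user -> CPU hours already consumed (hypothetical input)
    pending_jobs: list of (user, job_name) tuples in submission order
    """
    # sorted() is stable, so jobs from the same user keep their submission order.
    return sorted(pending_jobs, key=lambda job: usage_hours.get(job[0], 0.0))


# johnq has used about 1000 CPU hours; janeq has just returned and has used none.
usage = {"johnq": 1000.0, "janeq": 0.0}
pending = [("johnq", "sim-001"), ("johnq", "sim-002"), ("janeq", "analysis-1")]

for user, job in fair_share_order(usage, pending):
    print(user, job)
```

Although johnq submitted first, janeq's job is listed ahead of his because her recorded usage is lower. The real scheduler also weighs factors such as job size and requested resources.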
This is how the scheduler works in the broadest terms. It also takes other factors into account, such as job sizes and requested resources, and it dispatches jobs to different queues.
Dispatch Policy
Most users will use the default policy, which is a normal-priority, best-queue-first approach. Some users belong to groups with their own hardware. The scheduler first tries to run a job in the highest-priority queue that the user has access to, then moves down the list of available queues until it finds free resources.
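The best-queue-first idea can be sketched as a simple loop over queues in priority order. This is a minimal sketch under stated assumptions: the queue names, slot counts, access lists, and the function pick_queue are hypothetical and do not reflect the actual queue configuration.

```python
# Sketch of a best-queue-first pass. Queue names, slot counts, and access
# lists are hypothetical; the real scheduler considers much more.

def pick_queue(user, slots_needed, queues):
    """Return the name of the first queue, in priority order, that the user
    may use and that has enough free slots; return None if the job must wait.

    queues: list of dicts ordered from highest to lowest priority, each with
            'name', 'users' (set of allowed users, or None for everyone),
            and 'free_slots'.
    """
    for queue in queues:
        allowed = queue["users"] is None or user in queue["users"]
        if allowed and queue["free_slots"] >= slots_needed:
            return queue["name"]
    return None


queues = [
    {"name": "group_hw.q", "users": {"janeq"}, "free_slots": 0},  # group-owned hardware, currently full
    {"name": "general.q", "users": None, "free_slots": 16},       # open to everyone
]

print(pick_queue("janeq", 8, queues))   # general.q
print(pick_queue("johnq", 32, queues))  # None
```

In this sketch janeq's group queue is full, so her job falls through to the next queue she can use, while johnq's larger job stays pending because no accessible queue has enough free slots.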
These high-priority queues also have attached low-priority queues. Users who specify the complex
#$ -l p_low=true
acknowledge that their jobs may be terminated mid-run; they forgo the safety of guaranteed job execution in exchange for some time on higher-end hardware. Only jobs that specify this complex will be executed in low-priority queues. We are working on allowing p_low jobs to run in normal-priority queues as well (which would improve throughput), but currently they cannot.
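The current eligibility rule can be summarized in a few lines. This is only a sketch of the rule as described above; the queue names and the function eligible_queues are hypothetical.

```python
# Sketch of the p_low eligibility rule described above.
# Queue names are hypothetical.

def eligible_queues(requests_p_low, queues):
    """Return the names of the queues a job may be dispatched to.

    queues: list of (name, is_low_priority) tuples.
    Jobs that request p_low=true run only in low-priority queues;
    all other jobs run only in normal-priority queues.
    """
    return [name for name, is_low in queues if is_low == requests_p_low]


queues = [("group_hw.q", False), ("group_hw_low.q", True)]

print(eligible_queues(True, queues))   # ['group_hw_low.q']
print(eligible_queues(False, queues))  # ['group_hw.q']
```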
- Note: More details on the scheduler will be added soon.