Implemetation Details

Next: Experiments Up: Lottery Scheduling Previous: Basic Approach to Implementation

Implemetation Details

The following procedures and data structures have been appropriated from the linux scheduler for use by the lottery scheduler.

task_struct.priority - stores the number of tickets that a task has been allocated. Currently, the value can range from 1 to 40.
task_struct.counter - stores the number of tickets this task currently holds, taking into consideration compensation tickets.
task_struct.avg_slice - stores the length of time that this task ran the last time it was scheduled. This is used to compute compensation tickets.
setpriority(..,..,int priority) - This system call is designed to allow user level programs to modify a task's scheduling priority. It accepts values from -20 to 19 where lower numbers represent higher priority. In the standard linux kernel, higher values in the task_struct.priority field represent higher priorities. Therefore a priority argument of -20 is mapped to a task_struct.priority value of 40, 19 is mapped to 1, and the intermediate values are mapped as you would expect.
We use the setpriority() system call with no modification to set the number of tickets that a task holds.

We've left most of the code in the linux scheduler unmodified. Jobs such as handling the bottom half, handling interrupt logic, handling timers, etc. need no modification. Most of our changes are in the piece of code that is responsible for choosing the next task to run from the list of runnable tasks. This is now handled via the simplest possible implemenation of lottery scheduling.

First we run through the list of runnable tasks, and grant them all compensation tickets according to how much of their quantum they used last time they ran. Once the ticket counts have been updated, we sum the total number of tickets held by all runnable tasks. We then choose a random number between 1 and the total number of tickets. Finally, we again walk down the list of runnable tasks, subtracting the number of tickets held by the current task from our random number at each step. We select that task that takes us to zero or below.

The mechanism described above is sufficient for choosing tasks to run. However one other small modification is required in order to be sure that the scheduler is run at all. In the standard linux scheduler, the scheduling routine is only called when the task_struct.need_resched field of the currently running task is set to 1. This most commonly happens in the update_process_times() routine which is called every 10ms by the timer interrupt. update_process_times() normally only resets the need_resched when the counter for the current task has dropped below zero. We have modified this code so that the field is set every time the timer goes off. This guarantees that the scheduler will be called at least once every 10ms.

The end result of this modification is that our lottery scheduler causes many more context switches than the standard linux scheduler. (Which frequently waits as long as 200 ms before rescheduling.) This has some negative impact on throughput, but it increases system responsiveness.

Next: Experiments Up: Lottery Scheduling Previous: Basic Approach to Implementation

Brandon Sanders
12/17/1999