When scheduling takes place only under circumstances one and four, we say the scheduling scheme is non-preemptive; otherwise, the scheduling scheme is preemptive. Under non-preemptive scheduling, once the CPU has been allocated to a process, the process keeps the CPU until it releases it, either by terminating or by switching to the waiting state. It is the only method that can be used on certain hardware platforms, because it does not require the special hardware (for example, a timer) needed for preemptive scheduling.
Shortest Remaining Time (SRT) is also known as preemptive SJF scheduling. In this method, the CPU is allocated to the process that is closest to its completion, which prevents a newly arrived process from delaying the completion of an almost-finished older one. Priority scheduling is a method of scheduling processes based on priority.
In this method, the scheduler selects tasks to run according to their priority. Processes with higher priority are carried out first, whereas jobs with equal priorities are carried out on a round-robin or FCFS basis.
Priority can be decided based on memory requirements, time requirements, and so on. Round robin is one of the oldest and simplest scheduling algorithms.
The name of this algorithm comes from the round-robin principle, where each person gets an equal share of something in turn. It is mostly used for CPU scheduling in multitasking systems.
This method allows starvation-free execution of processes. SJF (Shortest Job First) is a scheduling algorithm in which the process with the shortest execution time is selected for execution next. SJF can be preemptive or non-preemptive; a sketch of the non-preemptive variant follows.
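As an illustration, here is a minimal, self-contained sketch of non-preemptive SJF over a hypothetical process table (the arrival and burst times are made-up example data):

```c
#include <stdio.h>

/* Non-preemptive SJF: among the processes that have already arrived,
   always dispatch the one with the shortest burst time. */
typedef struct { const char *name; int arrival; int burst; } proc_t;

int main(void) {
    proc_t p[] = { {"P1", 0, 7}, {"P2", 1, 3}, {"P3", 2, 5} };
    int n = 3, done = 0, clock = 0;
    int finished[3] = {0};

    while (done < n) {
        int pick = -1;
        for (int i = 0; i < n; i++)          /* shortest burst among arrived */
            if (!finished[i] && p[i].arrival <= clock &&
                (pick < 0 || p[i].burst < p[pick].burst))
                pick = i;
        if (pick < 0) { clock++; continue; } /* CPU idle until next arrival */
        clock += p[pick].burst;              /* run the picked job to completion */
        finished[pick] = 1;
        done++;
        printf("%s finishes at %d (waited %d)\n",
               p[pick].name, clock, clock - p[pick].arrival - p[pick].burst);
    }
    return 0;
}
```

With this workload, P1 runs first (it is the only arrival at time 0), after which the shorter P2 is chosen ahead of P3, which is exactly the behavior described above.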
On a uniprocessor system, scheduling is relatively simple: the highest-priority thread that wants to run is always running. On a multiprocessor system, matters are more complicated: while Windows attempts to schedule the highest-priority runnable threads on all available CPUs, it guarantees only that the single highest-priority thread is running somewhere.
Multiprocessor Considerations in the Dispatcher Database
In addition to the ready queues and the ready summary, Windows maintains two bitmasks that track the state of the processors on the system. How these bitmasks are used is explained in the upcoming section Multiprocessor Thread-Scheduling Algorithms.
Following are the two bitmasks that Windows maintains: the active processor mask (KeActiveProcessors), which has a bit set for each usable processor on the system. This might be fewer than the number of actual processors if the licensing limits of the version of Windows running support fewer than the number of available physical processors.
The idle summary (KiIdleSummary), in which each set bit represents an idle processor. Access to these bitmasks, like the rest of the dispatcher database, must be synchronized; this is true for any systemwide structure accessed from high IRQL. See Chapter 3 for a general description of kernel synchronization and spinlocks. There is also a per-CPU list of threads in the deferred ready state. These represent threads that are ready to run but have not yet been readied for execution; the actual ready operation has been deferred to a more appropriate time.
Because each processor manipulates only its own per-processor deferred ready list, this list is not synchronized by the PRCB spinlock. The deferred ready thread list is processed before exiting the thread dispatcher, before performing a context switch, and after processing a DPC. Threads on the deferred ready list are either dispatched immediately or are moved to the per-processor ready queue for their priority level. Note that the systemwide dispatcher spinlock still exists and is used, but it is held only for the time needed to modify systemwide state that might affect which thread runs next.
For example, changes to synchronization objects (mutexes, events, and semaphores) and their wait queues require holding the dispatcher lock to prevent more than one processor from changing the state of such objects and the consequential action of possibly readying threads for execution.
Other examples include changing the priority of a thread, timer expiration, and swapping of thread kernel stacks. Thread context switching is also synchronized by using a finer-grained per-thread spinlock, whereas in Windows XP context switching was synchronized by holding a systemwide context-swap spinlock. As described in the Symmetric Multiprocessing section in Chapter 2, Windows supports hyperthreaded and multicore multiprocessor systems in two primary ways:
Logical processors as well as per-package cores do not count against physical processor licensing limits. For example, Windows Vista Home Basic, which has a licensed processor limit of 1, will use all four cores on a single processor system.
When choosing a processor for a thread, if there is a physical processor with all logical processors idle, a logical processor from that physical processor will be selected, as opposed to choosing an idle logical processor on a physical processor that has another logical processor running a thread. You can examine the information Windows maintains for hyperthreaded processors with the !smt command in the kernel debugger; on a dual-processor hyperthreaded Xeon system, for example, it reports four logical processors.
Another type of multiprocessor system supported by Windows is one with a nonuniform memory access (NUMA) architecture. In a NUMA system, processors are grouped together in smaller units called nodes. Each node has its own processors and memory and is connected to the larger system through a cache-coherent interconnect bus. Although any processor in any node can access all of memory, node-local memory is much faster to access. The format of the KNODE structure can be shown using the dt command in the kernel debugger.
Applications that want to gain the most performance out of NUMA systems can set the affinity mask to restrict a process to the processors in a specific node. This information can be obtained by using the following Windows functions (functions that can alter thread affinity are covered in the next section):
GetNumaHighestNodeNumber: Retrieves the node that currently has the highest number.
GetNumaNodeProcessorMask: Retrieves the processor mask for the specified node.
GetNumaProcessorNode: Retrieves the node number for the specified processor.
How the scheduling algorithms take NUMA systems into account will be covered in the upcoming section Multiprocessor Thread-Scheduling Algorithms, and the optimizations in the memory manager to take advantage of node-local memory are covered in Chapter 9. A short user-mode sketch using these functions follows.
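As a minimal user-mode illustration (error handling abbreviated), the following sketch enumerates the NUMA nodes on a system and prints each node's processor mask using the functions above:

```c
#include <windows.h>
#include <stdio.h>

int main(void) {
    ULONG highest;
    if (!GetNumaHighestNodeNumber(&highest))
        return 1;

    for (ULONG n = 0; n <= highest; n++) {
        ULONGLONG mask;
        /* Each set bit in the mask is a processor belonging to node n. */
        if (GetNumaNodeProcessorMask((UCHAR)n, &mask))
            printf("node %lu: processor mask 0x%llx\n", n, mask);
    }
    return 0;
}
```

A process that wants node-local scheduling could then pass one of these masks to SetProcessAffinityMask, as discussed next.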
Each thread has an affinity mask that specifies the processors on which the thread is allowed to run. The thread affinity mask is inherited from the process affinity mask.
By default, all processes and therefore all threads begin with an affinity mask that is equal to the set of active processors on the system—in other words, the system is free to schedule all threads on any available processor.
However, applications can alter the affinity mask for their threads, to optimize throughput or to partition work onto a specific set of processors. This can be done at several levels: by calling the SetThreadAffinityMask function to set the affinity for an individual thread, or by calling the SetProcessAffinityMask function to set the affinity for all the threads in a process.
The Psexec tool from Sysinternals provides a command-line interface to this function (see the -a switch). Affinity can also be set by making a process a member of a job that has a jobwide affinity mask set using the SetInformationJobObject function (jobs are described in the upcoming Job Objects section).
There is also an image header flag that marks a program to run on only one CPU. If this flag is set, the system chooses a single processor at process creation time and assigns it as the process affinity mask, starting with the first processor and then going round-robin across all the processors. Running on a single processor in this way can hide race conditions in multithreaded code; this has actually saved the authors of this book on two different occasions. A short sketch of the per-thread and per-process affinity calls follows.
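Here is a minimal sketch of the first two levels (the masks 0x3 and 0x1 are arbitrary example values; on a single-processor system the first call would fail):

```c
#include <windows.h>
#include <stdio.h>

int main(void) {
    /* Restrict every thread in this process to CPUs 0 and 1. */
    if (!SetProcessAffinityMask(GetCurrentProcess(), 0x3))
        printf("SetProcessAffinityMask failed: %lu\n", GetLastError());

    /* Further restrict the current thread to CPU 0 only; a thread's
       mask must be a subset of its process's affinity mask. */
    if (SetThreadAffinityMask(GetCurrentThread(), 0x1) == 0)
        printf("SetThreadAffinityMask failed: %lu\n", GetLastError());

    return 0;
}
```

SetThreadAffinityMask returns the thread's previous mask (or zero on failure), so a caller can save and later restore the original affinity.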
In this experiment, you will modify the affinity settings for a process and see that process affinity is inherited by new processes: Start a command prompt (Cmd.exe), and then open Task Manager or Process Explorer and find the Cmd.exe process. Right-click the process, and select Set Affinity. A list of processors should be displayed; for example, on a dual-processor system you will see two processors, CPU 0 and CPU 1. Select a subset of the available processors on the system, and click OK.
Now run Notepad from that command prompt. Find the new Notepad process, right-click it, and choose Set Affinity. You should see the same list of processors you chose for the command-prompt process. This is because processes inherit their affinity settings from their parent.
For example, consider this scenario: CPU 0 is running a priority 8 thread that can run on any processor, and CPU 1 is running a priority 4 thread that can run on any processor. A priority 6 thread that can run on only CPU 0 becomes ready. What happens? Windows will not move the priority 8 thread from CPU 0 to CPU 1 (preempting the priority 4 thread) so that the priority 6 thread can run; the priority 6 thread has to wait for CPU 0. Therefore, changing the affinity mask for a process or a thread can result in threads getting less CPU time than they normally would, because Windows is restricted from running a thread on certain processors. Setting affinity should therefore be done with extreme care; in most cases, it is optimal to let Windows decide which threads run where.
Each thread has two CPU numbers stored in the kernel thread block: the ideal processor, or the preferred processor that this thread should run on, and the last processor, or the processor on which the thread last ran. The ideal processor for a thread is chosen when the thread is created, using a seed in the process block. The seed is incremented each time a thread is created, so that the ideal processor for each new thread in the process rotates through the available processors on the system.
For example, the first thread in the first process on the system is assigned an ideal processor of 0. The second thread in that process is assigned an ideal processor of 1. In that way, the threads within each process are spread evenly across the processors. Note that this assumes the threads within a process are doing an equal amount of work.
This is typically not the case in a multithreaded process, which normally has one or more housekeeping threads and then a number of worker threads.
Therefore, a multithreaded application that wants to take full advantage of the platform might find it advantageous to specify the ideal processor numbers for its threads by using the SetThreadIdealProcessor function (a short sketch follows below). On hyperthreaded systems, the next ideal processor is the first logical processor on the next physical processor. For example, on a dual-processor hyperthreaded system with four logical processors, if the ideal processor for the first thread is assigned to logical processor 0, the second thread would be assigned to logical processor 2, the third thread to logical processor 1, the fourth thread to logical processor 3, and so forth.
In this way, the threads are spread evenly across the physical processors. On NUMA systems, when a process is created, an ideal node for the process is selected. The first process is assigned to node 0, the second process to node 1, and so on. The ideal processor for the first thread in a process is assigned to the first processor in the node.
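For example, a thread pool that wants to spread its workers explicitly might use a sketch like the following (the simple round-robin policy here is illustrative, not what Windows itself does internally):

```c
#include <windows.h>

/* Give each worker thread a distinct ideal processor. The scheduler
   treats the ideal processor as a preference, not a hard binding. */
void assign_ideal_processors(HANDLE *threads, DWORD count) {
    SYSTEM_INFO si;
    GetSystemInfo(&si);

    for (DWORD i = 0; i < count; i++) {
        /* Returns the previous ideal processor, or -1 on failure. */
        SetThreadIdealProcessor(threads[i], i % si.dwNumberOfProcessors);
    }
}
```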
All of this works fine on systems that have a constant number of processors during their run time (for example, desktop machines require shutting down the computer to make any sort of hardware change to the processors or their count).
In fact, adding a CPU to a server is often needed precisely at times of high load, when demand exceeds what the machine can support at its current level of performance. Having to shut down the server during a period of peak usage would defeat the purpose. To meet this requirement, the latest generation of server motherboards and systems support the addition of processors, as well as their replacement, while the machine is still running.
The ACPI BIOS and related hardware on the machine have been specifically built to allow and be aware of this need, but operating system participation is required for full support. Dynamic processor support is provided through the HAL, which will notify the kernel of a new processor on the system through the function KeStartDynamicProcessor.
This routine does similar work to that performed when the system detects more than one processor at startup and needs to initialize the structures related to them. When a dynamic processor is added, a variety of system components perform some additional work. For example, the memory manager allocates new pages and memory structures optimized for the CPU.
Other executive parts of the kernel are also called, mostly to initialize the per-processor lookaside lists for the processor that was added. Finally, the kernel initializes threaded DPC support for the processor and adjusts exported kernel variables to report the new processor. Different memory manager masks and process seeds based on processor counts are also updated, and processor features need to be updated for the new processor to match the rest of the system for example, enabling virtualization support on the newly added processor.
The HAL is also involved in this process. It is called once to start the dynamic processor after the kernel is aware of it, and called again after the kernel has finished initializing the processor. However, these notifications and callbacks only make the kernel itself aware of, and responsive to, processor changes. Although an additional processor increases the throughput of the kernel, it does nothing to help drivers.
To handle drivers, the system has a new default executive callback, the processor add callback, that drivers can register with for notifications. Similar to the callbacks that notify drivers of power state or system time changes, this callback allows driver code to, for example, create a new worker thread if desirable so that it can handle more work at the same time.
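In kernel mode, this registration is exposed through KeRegisterProcessorChangeCallback. The sketch below assumes the WDK interface for that function and its notification context; what the driver actually does on a completed add (for example, creating an extra worker thread) is left as a comment:

```c
#include <ntddk.h>

static PVOID g_ProcessorCallbackHandle;

/* Called by the system as a dynamic processor add progresses. */
static VOID
ProcessorChangeCallback(
    _In_ PVOID CallbackContext,
    _In_ PKE_PROCESSOR_CHANGE_NOTIFY_CONTEXT ChangeContext,
    _Inout_ PNTSTATUS OperationStatus)
{
    UNREFERENCED_PARAMETER(CallbackContext);
    UNREFERENCED_PARAMETER(OperationStatus);

    if (ChangeContext->State == KeProcessorAddCompleteNotify) {
        /* Processor ChangeContext->NtNumber is now available:
           e.g., create an additional worker thread here. */
    }
}

NTSTATUS RegisterForProcessorAdd(VOID)
{
    /* KE_PROCESSOR_CHANGE_ADD_EXISTING replays notifications for
       processors that were already present at registration time. */
    g_ProcessorCallbackHandle = KeRegisterProcessorChangeCallback(
        ProcessorChangeCallback, NULL, KE_PROCESSOR_CHANGE_ADD_EXISTING);

    return g_ProcessorCallbackHandle != NULL ?
        STATUS_SUCCESS : STATUS_UNSUCCESSFUL;
}
```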
Unfortunately, until now, CPU-hungry applications have still been left out of this process, but Windows Server 2008 and Windows Vista Service Pack 1 have improved the process so that applications can take advantage of newly added processors as well. However, a sudden change of affinity can have potentially breaking effects on a running application (especially when going from a single-processor to a multiprocessor environment), through the appearance of race conditions or simply the misdistribution of work, since the process might have calculated the perfect ratios at startup based on the number of CPUs it was aware of.
As a result, applications do not take advantage of a dynamically added processor by default; they must request it. Once an application has told the system that its affinity is permanent, it cannot later change its mind and request affinity updates, so this is a one-time change. As part of KeStartDynamicProcessor, a new step has been added after interrupts are rebalanced: calling the process manager to perform affinity updates through PsUpdateActiveProcessAffinity. A user-mode sketch of the opt-in follows.
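From user mode, the opt-in is a single call; the sketch below assumes that SetProcessAffinityUpdateMode (added in Windows Vista SP1) is the request mechanism the text refers to:

```c
#define _WIN32_WINNT 0x0600   /* Vista SP1-era API */
#include <windows.h>
#include <stdio.h>

int main(void) {
    /* Ask the system to grow this process's affinity mask automatically
       when processors are added dynamically; the default is off. */
    if (!SetProcessAffinityUpdateMode(GetCurrentProcess(),
                                      PROCESS_AFFINITY_ENABLE_AUTO_UPDATE))
        printf("affinity update opt-in failed: %lu\n", GetLastError());

    return 0;
}
```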
Some Windows core processes and services already have affinity updates enabled, while third-party software will need to be recompiled to take advantage of the new API call. The System process, Svchost processes, and Smss are all compatible with dynamic processor addition.
Multiprocessor Thread-Scheduling Algorithms
There are two basic decisions to describe: choosing a processor for a thread that wants to run, and choosing a thread on a processor that needs something to do. When a thread becomes ready to run, Windows first tries to schedule it on an idle processor, starting from the set of idle processors that the thread's affinity mask permits. On NUMA systems, if there are idle CPUs in the node containing the thread's ideal processor, the list of idle processors is reduced to that set; if this eliminates all idle processors, the reduction is not done.
Next, if the system is running hyperthreaded processors and there is a physical processor with all logical processors idle, the list of idle processors is reduced to that set. If that results in an empty set of processors, the reduction is not done. If the current processor (the processor trying to determine what to do with the thread that wants to run) is in the remaining idle processor set, the thread is scheduled on it. If the current processor is not in the remaining set of idle processors, and it is a hyperthreaded system, and there is an idle logical processor on the physical processor containing the ideal processor for the thread, the idle processors are reduced to that set.
If that set is not empty, the idle processors are reduced to that list. Finally, the lowest-numbered CPU in the remaining set is selected as the processor to run the thread on. The sketch below illustrates the shape of these successive reductions.
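The following sketch mirrors those reductions over plain 64-bit masks. All of the names and helpers here are hypothetical illustrations, not the kernel's actual variables:

```c
#include <stdint.h>

typedef uint64_t cpu_set_t64;

/* Apply a reduction only if it leaves at least one idle processor. */
static cpu_set_t64 reduce(cpu_set_t64 set, cpu_set_t64 candidate) {
    return (set & candidate) ? (set & candidate) : set;
}

int pick_idle_processor(cpu_set_t64 idle, cpu_set_t64 affinity,
                        cpu_set_t64 ideal_node, cpu_set_t64 fully_idle_cores,
                        int current_cpu) {
    cpu_set_t64 set = idle & affinity;       /* respect the affinity mask  */
    if (set == 0)
        return -1;                           /* no usable idle processor   */

    set = reduce(set, ideal_node);           /* prefer the ideal NUMA node */
    set = reduce(set, fully_idle_cores);     /* prefer fully idle cores    */

    if (set & ((cpu_set_t64)1 << current_cpu))
        return current_cpu;                  /* prefer the current CPU     */

    for (int i = 0; i < 64; i++)             /* else lowest-numbered CPU   */
        if (set & ((cpu_set_t64)1 << i))
            return i;
    return -1;
}
```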
When the idle loop on that processor runs, it will see that a thread has been selected to run and will dispatch that thread. If there are no idle processors, Windows looks at the thread already running on the chosen CPU and checks whether its priority is less than that of the thread being readied for execution. If so, the currently running thread is marked to be preempted, and Windows queues an interprocessor interrupt to the target processor to preempt the currently running thread in favor of this new thread.
If no thread can be preempted on that CPU, the new thread is put in the ready queue for its priority level, where it awaits its turn to get scheduled. Therefore, Windows does not guarantee to be running all of the highest-priority threads, but it will always run the single highest-priority thread somewhere. If a ready thread cannot be run right away, it is moved into the ready state, where it awaits its turn to run.
Because each processor has its own list of threads waiting to run on that processor, when a thread finishes running, the processor can simply check its per-processor ready queue for the next thread to run. If the per-processor ready queues are empty, the idle thread for that processor is scheduled.
As part of the new hard quota management system added in Windows Vista (which builds on previous quota support present since the first version of Windows NT, but adds hard limits instead of soft hints), support for limiting CPU usage was added to the system in three different ways: per-session, per-user, or per-system.
Unfortunately, information on enabling these new limits has not yet been documented, and no tool that is part of the operating system allows you to set these limits: you must modify the registry settings manually. Because all the quotas—save one—are memory quotas, we will cover those in Chapter 9, which deals with the memory manager, and focus our attention on the CPU rate limit.
CPU rate limits can therefore be set in one of three ways, per-session, per-user, or per-system, in each case by creating a new value called CpuRateLimit under the corresponding registry key and entering the rate information. Part of the value encodes the enforcement interval, and the rest of the bits are used for the actual rate, as a value representing a percentage of maximum CPU usage.
Because any number from 0 to 100 can be represented with only 7 bits, the rest of the rate bits are unused. A rate limit of 40 percent every 2 seconds, for example, would be defined by packing those two fields into a single CpuRateLimit value. The process manager, which is responsible for enforcing the CPU rate limit, uses a variety of system mechanisms to do its job. First of all, rate limiting is able to work reliably because of the CPU cycle count improvements discussed earlier, which allow the process manager to accurately determine how much CPU time a process has taken and know whether the limit should be enforced.
Finally, the main mechanism through which rate limiting works is by creating an artificial wait on a kernel gate object (making the thread uniquely bound to this object) and putting the thread in a wait state, which does not consume CPU cycles. This mechanism operates through the normal routine of an APC object queued to the thread or threads inside the process currently responsible for the work. The gate is signaled by an internal worker thread inside the process manager responsible for replenishment of the CPU usage, which is queued by a DPC responsible for replenishing systemwide CPU usage requests.
Preemptive scheduling, by contrast, is used mainly when a process switches either from the running state to the ready state or from the waiting state to the ready state.
The resources (that is, CPU cycles) are allocated to the process for a limited amount of time and then taken away; the process is placed back in the ready queue if it still has CPU burst time remaining, and it stays there until it gets its next chance to execute. There are many different criteria to check when considering the "best" scheduling algorithm:
Throughput: the total number of processes completed per unit of time, or rather the total amount of work done in a unit of time.
Turnaround time: the amount of time taken to execute a particular process, i.e., the interval from the time of submission of the process to the time of its completion (wall-clock time).
Waiting time: the sum of the periods spent waiting in the ready queue, i.e., the amount of time a process spends waiting in the ready queue to acquire control of the CPU.
Load average: the average number of processes residing in the ready queue waiting for their turn to get onto the CPU.
Response time: the amount of time from when a request was submitted until the first response is produced. Remember, it is the time until the first response, not the completion of process execution (the final response).
In general, CPU utilization and throughput are maximized and the other factors are reduced for proper optimization. To decide which process to execute first and which to execute last so as to achieve maximum CPU utilization, computer scientists have defined some algorithms:
First Come First Serve (FCFS) Scheduling.
Shortest Job First (SJF) Scheduling.
Priority Scheduling.
Round Robin (RR) Scheduling.
A small worked example below computes the criteria above for an FCFS workload.
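To make the criteria concrete, here is a sketch that computes average waiting and turnaround time for a hypothetical workload under FCFS, the simplest of these algorithms:

```c
#include <stdio.h>

/* Three example processes, already sorted by arrival time.
   Under FCFS each one runs to completion in arrival order. */
int main(void) {
    int arrival[] = {0, 1, 2}, burst[] = {10, 4, 6};
    int n = 3, clock = 0, total_wait = 0, total_turnaround = 0;

    for (int i = 0; i < n; i++) {
        if (clock < arrival[i])
            clock = arrival[i];                /* CPU idle gap */
        int wait = clock - arrival[i];         /* time spent in ready queue */
        clock += burst[i];                     /* run to completion */
        int turnaround = clock - arrival[i];   /* submission to completion */
        total_wait += wait;
        total_turnaround += turnaround;
    }
    printf("average waiting time = %.2f, average turnaround time = %.2f\n",
           (double)total_wait / n, (double)total_turnaround / n);
    return 0;
}
```

For this workload the program prints an average waiting time of 7.00 and an average turnaround time of 13.67.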