IT Questions and Answers :)

Thursday, July 4, 2019

What does Windows use IRQL for?

What does Windows use IRQL for?

  • Allow Windows to create BSoD error
  • Set levels for hardware priority
  • Interpret requests for the CPU
  • Send messages to other computers 


EXPLANATION

IRQL stands for "Interrupt Request Level". It is a number, ranging from 0 through 31 on Windows x86 systems and 0 through 15 on x64 systems. It represents the "importance" of a kernel mode task relative to other kernel mode tasks.

IRQL is a Windows-defined state of the processor - not of a process or thread - that indicates to Windows whether or not whatever that processor is doing can be interrupted by other tasks. If a new task (such as an interrupt service routine) has a higher IRQL than the processor's current IRQL, then yes, it can interrupt the current task; otherwise no. On a multiprocessor system each processor has its own IRQL. This includes the "Logical Processors" created by hyperthreading. 
(I use the word "importance" rather than "priority" because "priority" in Windows refers to thread priorities, and IRQLs are something different. Unlike thread priorities, kernel tasks at the same IRQL are not time-sliced, and IRQLs aren't subject to automatic boost and decay.)
(I should also mention that the term "kernel task" here is not official. Windows does not really call these things "kernel tasks"; they are not managed objects the way processes and threads are, and there is no relation to x86 "task gates" or to anything shown in "Task Manager". As I (and others) use the term here, "kernel mode task" really covers "anything with a defined beginning and end that needs to be done in kernel mode at IRQL 2 or above." An interrupt service routine is one example of a "kernel mode task"; so is a DPC routine. Another example is code in a kernel mode thread: such threads start at IRQL 0, but if part of the code raises to IRQL 2 or above, does something, and then returns to its previous IRQL, the high-IRQL part of that code is also what I'm calling a "kernel task" here.)

Performance Monitor shows time spent at IRQL 2 as "% DPC time" and time at IRQL > 2 as "% interrupt time", regardless of whether the time was actually spent in a DPC routine or ISR or was the result of raising IRQL from a lower value. Each is a subset of what PerfMon shows as "% privileged time" - which should have been labeled "kernel mode time".
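
To make that accounting concrete, here is a minimal Python sketch. It is not anything PerfMon actually runs: the sample format (mode, IRQL, milliseconds), the sample values, and the function name `perfmon_breakdown` are all made up for illustration. Only the classification rule comes from the paragraph above.

```python
def perfmon_breakdown(samples):
    """Classify time samples the way the text says PerfMon labels them.

    Each sample is a hypothetical (mode, irql, milliseconds) tuple.
    """
    total = sum(ms for _, _, ms in samples)
    kernel = [(irql, ms) for mode, irql, ms in samples if mode == "kernel"]

    privileged = sum(ms for _, ms in kernel)               # all kernel mode time
    dpc = sum(ms for irql, ms in kernel if irql == 2)      # exactly IRQL 2
    interrupt = sum(ms for irql, ms in kernel if irql > 2)  # anything above 2

    return {
        "% privileged time": 100.0 * privileged / total,
        "% DPC time": 100.0 * dpc / total,
        "% interrupt time": 100.0 * interrupt / total,
    }


# Invented sample data: 100 ms of processor time in total.
samples = [
    ("user", 0, 50),    # user mode is always IRQL 0
    ("kernel", 0, 30),  # ordinary kernel mode thread code
    ("kernel", 2, 15),  # counted as DPC time, whatever actually ran
    ("kernel", 5, 5),   # counted as interrupt time, whatever actually ran
]
breakdown = perfmon_breakdown(samples)
```

Note that the DPC and interrupt figures are both included in the privileged figure, which is why the three numbers do not add up to the kernel total.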
Once a kernel task is started at IRQL 2 or above, it runs to completion before anything else at the same IRQL will be started on the same processor. It may be interrupted by a higher-IRQL task (which could in turn be interrupted by a yet higher-IRQL task, etc.), but when the higher-IRQL tasks complete, control returns to the task it interrupted.
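
The interruption rule can be sketched as a toy model of a single processor. This is plain Python, not Windows code; the `Processor` class, its method names, and the IRQL values 5 and 9 used for the two ISRs are assumptions chosen only to show the ordering.

```python
class Processor:
    """Toy model of one processor's IRQL state (not real Windows code)."""

    def __init__(self):
        self.irql = 0       # current IRQL of this processor
        self.pending = []   # requests that could not interrupt yet
        self.log = []       # order of events, for inspection

    def request(self, name, irql, body=None):
        """Ask to run a task at `irql`; it interrupts only if irql is higher."""
        if irql > self.irql:
            self._run(name, irql, body)
        else:
            self.pending.append((name, irql, body))

    def _run(self, name, irql, body):
        previous = self.irql
        self.irql = irql                 # raise IRQL while the task runs
        self.log.append(f"start {name}")
        if body:
            body(self)                   # the task may be interrupted in here
        self.log.append(f"end {name}")
        self.irql = previous             # control returns to the interrupted level
        self._drain()

    def _drain(self):
        """Start deferred tasks that now outrank the current IRQL, highest first."""
        while True:
            ready = [t for t in self.pending if t[1] > self.irql]
            if not ready:
                return
            task = max(ready, key=lambda t: t[1])
            self.pending.remove(task)
            self._run(*task)


def isr_body(p):
    p.request("higher ISR", 9)           # higher IRQL: interrupts the ISR


def dpc_body(p):
    p.request("DPC B", 2)                # same IRQL: has to wait
    p.request("ISR", 5, isr_body)        # higher IRQL: interrupts DPC A


cpu = Processor()
cpu.request("DPC A", 2, dpc_body)
```

After this runs, `cpu.log` shows "DPC A" being interrupted by "ISR", which is interrupted by "higher ISR"; each one finishes and hands control back to the task it interrupted, and "DPC B" starts only after "DPC A" has ended.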

IRQL is primarily a serialization mechanism. (Many say "synchronization", but I prefer this word as it more exactly describes the result.) Its purpose is to help guarantee that multiple tasks on the same CPU that access certain shared resources - mostly shared data structures in the OS kernel space - are not allowed to interrupt each other in ways that could corrupt those structures.
For example, a great deal of data in the Windows kernel, particularly the memory management data and the data used by the thread scheduler, is "serialized" at IRQL 2. That means that any task that wants to modify such data must be running at IRQL 2 when it does so. If a higher-IRQL task attempts to write such data, that could cause corruption, because it might have interrupted an IRQL 2 task which might be in the middle of a read-modify-write cycle on that same data. So higher-IRQL tasks are simply not allowed to do that.
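
The corruption described here is a classic lost update. The sketch below is a simplified Python illustration, not kernel code: the `count` field, the `read_modify_write` helper, and the `deferred` list (standing in for work postponed until it can run at IRQL 2) are all hypothetical.

```python
def read_modify_write(data, interrupt=None):
    """Increment data["count"], optionally being interrupted mid-cycle."""
    value = data["count"]          # read
    if interrupt is not None:
        interrupt()                # a higher-IRQL task runs at this point
    data["count"] = value + 1      # modify and write, using the earlier read


# Case 1: the higher-IRQL task breaks the rule and writes the data itself.
data1 = {"count": 0}
read_modify_write(data1, interrupt=lambda: read_modify_write(data1))
broken_result = data1["count"]     # one of the two increments is lost

# Case 2: the higher-IRQL task obeys the rule and only records that work is
# needed; the work runs later, after the interrupted task has finished.
data2 = {"count": 0}
deferred = []
read_modify_write(data2, interrupt=lambda: deferred.append(read_modify_write))
for work in deferred:
    work(data2)
safe_result = data2["count"]       # both increments survive
```

In the first case two increments ran but the counter ends at 1, because the interrupted task overwrote the interrupter's update with a value computed from a stale read.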
Higher-IRQL tasks are mostly the interrupt service routines of device drivers, because all devices' interrupts occur at IRQL > 2. This includes the interrupt from the timer chip on the motherboard that drives timekeeping and time-driven activity in the OS. Its IRQL is above that of all "ordinary" hardware devices.

IRQL 2 is used for kernel tasks that are not triggered by hardware interrupts but during which normal thread scheduling - including waiting - cannot occur. More generally, once a processor is at IRQL 2 or above, no thread context switches can happen on that processor until IRQL drops below 2.
User mode code is always at IRQL 0. Kernel mode code can run at any IRQL from 0 through whatever the max is. IRQL 1 is a special case; it is kernel mode only but has no impact on scheduling, and is really more a state of a thread than of the processor - it is saved and restored during thread context switches, for example.

In order to maintain various serialization guarantees, most exceptions (things like divide by zero, page faults, or memory access violations) are simply not handle-able at IRQL 2 or above. (IRQL 2 btw is commonly called "dispatch level" or "DPC level".)
And now we can finally explain the IRQL_NOT_LESS_OR_EQUAL bugcheck code!

The most common case of IRQL_NOT_LESS_OR_EQUAL is due to a page fault (attempt to access a "not resident" virtual address), or a memory access violation (attempt to write to a read-only page, or to access a page that is not defined at all), that occurs at IRQL 2 or above.
If such exceptions are raised at IRQL 0 or 1, they can be "handled" either by system-supplied code (like the page fault handler) or by an exception handler provided by the developer. However, most exceptions cannot be handled at all if they occurred at IRQL 2 or above.
So... the bugcheck code means "an exception of a type that can only be handled at IRQL 0 or 1 occurred when IRQL was at 2 or higher", i.e. the IRQL was "not less than or equal to 1". Strange wording, but there it is.
There are a few other things that can trigger this bugcheck, and the value that the IRQL is not less or equal to is not always 1, but they occur only rarely. The WinDbg documentation lists them.
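
The common case can be summarized in a few lines of Python. This is only a sketch of the decision described above, not how Windows implements it: the `BugCheck` class, the function name, and the threshold constant are invented for illustration, and the rarer triggers are not modeled.

```python
class BugCheck(Exception):
    """Stands in for a system crash (BSoD) in this toy model."""


# Per the explanation above: these exceptions are handleable at IRQL 0 or 1.
MAX_HANDLEABLE_IRQL = 1


def raise_memory_exception(current_irql, kind="page fault"):
    """Model what happens when a page fault or access violation occurs."""
    if current_irql <= MAX_HANDLEABLE_IRQL:
        # System code or a developer-supplied handler can deal with it.
        return f"{kind} handled at IRQL {current_irql}"
    # IRQL was "not less than or equal to" 1, so nothing can handle it.
    raise BugCheck("IRQL_NOT_LESS_OR_EQUAL")


result_low = raise_memory_exception(0)

try:
    raise_memory_exception(2)
    crashed = False
except BugCheck as stop:
    crashed = True
    stop_code = str(stop)
```

The same fault is harmless at IRQL 0 and fatal at IRQL 2, which is exactly the condition the bugcheck name describes.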

 
