We mentioned in passing that the client-server runtime subsystem, affectionately known as CSRSS, handles graphics on Windows NT. In fact, it handles all window manipulation as well as graphics, and so accounts for a significant portion of processor activity on the system. This architecture is illustrated in Figure 3.24.
Figure 3.24 Graphics architecture on Windows NT
The Windows NT SDK contains a graphical device interface demonstration program called Gdidemo. As shipped in the SDK, Gdidemo pauses between drawings. For this experiment, we modified Gdidemo to remove that pause so that it spends all its time drawing. Figure 3.25 shows processor utilization for the system as a whole, for the modified Gdidemo program bouncing balls around the screen, and for the CSRSS process.
Figure 3.25 Processor utilization by a graphics program pumping pixels
The processor is 100% busy, and most of that time is spent in CSRSS, which makes sense because CSRSS is doing most of the work. On Windows NT you need to think beyond the application process itself and look at other processes in the system that the application may be using. CSRSS is a primary candidate for consuming processor cycles on behalf of an application. Usually this is pretty obvious, because a rapidly changing display is a primary clue. But some tasks that manipulate windows do not change the visible display: they may be operating on windows that are hidden behind others. So taking a look at CSRSS is a good basic policy.
The graphics application communicates with CSRSS using a fast form of the local interprocess procedure call. What makes it fast is dedicating one thread in CSRSS to each application thread that communicates with CSRSS. So you'll see lots of CSRSS threads. An application sends graphics commands to CSRSS in batches to amortize the cost of the context switch over a number of graphical operations. Each such context switch is counted by System: Context Switches, and by Thread: Context Switches as well. You can see from the report in Figure 3.26 that the context switches between Gdidemo and a thread in CSRSS account for nearly all the context switches in the system. (Remember Heisenberg: Performance Monitor is logging at one-second intervals here, and that logging is itself activity on the system. You can see its communication with CSRSS in the two threads at the right of Figure 3.26.)
Figure 3.26 Thread context switching during graphics processing
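To see why batching pays off, it can help to put rough numbers on it. The sketch below is a toy cost model, not a measurement: the function names and both cost constants (100 microseconds per context switch, 5 microseconds per graphics operation) are assumptions chosen only for illustration, and it assumes each batch costs one switch into the CSRSS thread and one switch back.

```python
import math

# Hypothetical costs in microseconds; illustrative only, not measured on Windows NT.
SWITCH_COST_US = 100  # assumed cost of one context switch
OP_COST_US = 5        # assumed cost of executing one graphics operation


def total_cost_us(num_ops, batch_size):
    """Total time to run num_ops operations sent in batches of batch_size."""
    round_trips = math.ceil(num_ops / batch_size)
    # Assumption: each batch switches to the server thread and back again.
    switches = 2 * round_trips
    return switches * SWITCH_COST_US + num_ops * OP_COST_US


def cost_per_op_us(num_ops, batch_size):
    """Average cost of one operation, including its share of the switches."""
    return total_cost_us(num_ops, batch_size) / num_ops


if __name__ == "__main__":
    for batch in (1, 10, 50):
        print(batch, total_cost_us(1000, batch), cost_per_op_us(1000, batch))
```

Under these assumed costs, sending 1000 operations one at a time is dominated by switching, while batches of 50 bring the per-operation cost close to the cost of the operation itself. The real ratio depends on the hardware and on how the application draws.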
Thread 7 of CSRSS is waking up about 70 times per second to do some housekeeping, but shows no processor activity. Each burst of work is so short that this thread slips through the cracks of our processor usage sampling. Context switches are a more reliable indication of activity than processor utilization because every one of them is counted. Look at them if you want to know for certain whether a thread is active. We used this technique in Figure 3.11.
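A small simulation can illustrate how a thread can be active and still show no processor time. This is a simplified model, not a description of how Windows NT actually accounts for processor usage: it assumes usage is sampled once per clock tick, and the 10-millisecond tick, the 100-microsecond burst length, and the 500-microsecond wake offset are all invented for the example. Only the rate of about 70 wakeups per second comes from the text above.

```python
def simulate(duration_us=1_000_000, tick_us=10_000,
             wake_period_us=14_285, wake_offset_us=500, run_us=100):
    """Return (context_switches, busy_samples, total_samples, true_busy_us).

    All default values except the ~70 wakeups per second are assumptions.
    """
    # Times at which the housekeeping thread wakes up (about 70 per second).
    wake_times = range(wake_offset_us, duration_us, wake_period_us)

    # Every wakeup is one switch into the thread, and each one is counted.
    context_switches = len(wake_times)

    # Sampled view: at each clock tick, check whether the thread is running.
    busy_samples = 0
    total_samples = 0
    for tick in range(0, duration_us, tick_us):
        total_samples += 1
        if any(w <= tick < w + run_us for w in wake_times):
            busy_samples += 1

    # Ground truth: processor time the thread really used.
    true_busy_us = context_switches * run_us
    return context_switches, busy_samples, total_samples, true_busy_us


if __name__ == "__main__":
    switches, busy, samples, true_us = simulate()
    print("context switches:", switches)
    print("samples that caught the thread:", busy, "of", samples)
    print("processor time actually used (us):", true_us)
```

With these assumed timings, none of the sampling points happens to land inside a burst, so the sampled utilization reads zero even though the thread ran 70 times. The context switch count does not depend on where the samples fall, which is why it is the safer indicator.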