If you are writing an application that draws on the display, there is a new facility in the Win32® API set that can really speed things up: the CreateDIBSection call. (DIB stands for "device-independent bitmap.") This call lets you share a memory section directly with CSRSS, and so avoid having the bitmap copied from your process to CSRSS each time it changes. In the old days you might have called GetDIBits, made the required changes, and then called SetDIBits. You might have had to do this several times on different scan lines of the bitmap before the image was ready for updating. The new call avoids all that. To use the bitmap with the GDI APIs, you first call CreateCompatibleDC to get a device context, and then select the bitmap into it. You can then make your changes directly in the memory section holding the bits, and call BitBlt or StretchBlt to transfer the changes to the display.
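The sequence just described might look roughly like the sketch below. It is an illustration under stated assumptions, not a definitive implementation: the helper names dib_stride, fill_rows32, and draw_with_dib_section are hypothetical, the bitmap is assumed to be 32 bits per pixel and top-down, and error handling is kept to a minimum. The Win32 part is guarded so that the row-arithmetic helpers can be compiled anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes per scan line of a DIB: every row is padded to a 32-bit boundary. */
size_t dib_stride(int width, int bits_per_pixel)
{
    return (((size_t)width * (size_t)bits_per_pixel + 31u) / 32u) * 4u;
}

/* Write one color into every pixel of a 32-bpp DIB, directly in memory. */
void fill_rows32(void *bits, int width, int height, uint32_t color)
{
    size_t stride = dib_stride(width, 32);
    unsigned char *row = bits;

    assert(bits != NULL);
    for (int y = 0; y < height; ++y, row += stride) {
        uint32_t *px = (uint32_t *)row;
        for (int x = 0; x < width; ++x)
            px[x] = color;
    }
}

#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper: create a DIB section, fill it, and blt it to hdcWindow. */
BOOL draw_with_dib_section(HDC hdcWindow, int width, int height)
{
    BITMAPINFO bmi;
    void *bits = NULL;
    HBITMAP hbm, hbmOld;
    HDC hdcMem;
    BOOL ok;

    ZeroMemory(&bmi, sizeof bmi);
    bmi.bmiHeader.biSize = sizeof(BITMAPINFOHEADER);
    bmi.bmiHeader.biWidth = width;
    bmi.bmiHeader.biHeight = -height;   /* negative height: top-down rows */
    bmi.bmiHeader.biPlanes = 1;
    bmi.bmiHeader.biBitCount = 32;      /* assumption: 32 bits per pixel */
    bmi.bmiHeader.biCompression = BI_RGB;

    /* Memory DC used to reach the bitmap through the GDI calls. */
    hdcMem = CreateCompatibleDC(hdcWindow);
    if (hdcMem == NULL)
        return FALSE;

    /* 'bits' receives a pointer to the shared memory holding the pixels. */
    hbm = CreateDIBSection(hdcMem, &bmi, DIB_RGB_COLORS, &bits, NULL, 0);
    if (hbm == NULL) {
        DeleteDC(hdcMem);
        return FALSE;
    }
    hbmOld = (HBITMAP)SelectObject(hdcMem, hbm);

    /* Change the bits in place; no GetDIBits/SetDIBits round trip. */
    fill_rows32(bits, width, height, 0x00FF0000u);   /* red as 0x00RRGGBB */

    /* Transfer the result to the display. */
    ok = BitBlt(hdcWindow, 0, 0, width, height, hdcMem, 0, 0, SRCCOPY);

    SelectObject(hdcMem, hbmOld);
    DeleteObject(hbm);
    DeleteDC(hdcMem);
    return ok;
}
#endif
```

A real application would more likely create the DIB section once and keep it around between updates, rather than creating and destroying it on every draw as this sketch does.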
One word of caution if you decide to use CreateDIBSection: you need to be sure that any calls that might affect your bitmap have completed before you start to draw in it. This is because GDI calls are batched, so their execution may be delayed. Suppose you make a PatBlt call to clear your bitmap, and then start to change the bits in your DIB section. If the PatBlt call was batched, it might not actually get to CSRSS until after you have started to make the bitmap changes. So, before you start to twiddle the bits on your own side of the fence, be sure to call GdiFlush if you have made changes to the bitmap with earlier GDI calls.
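The ordering might be sketched as follows. The helper names set_pixel32 and clear_then_poke are hypothetical, and the sketch assumes that hdcMem already has a top-down, 32-bits-per-pixel DIB section selected into it, with bits pointing at its memory. As before, the Win32 part is guarded so the pixel helper compiles anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write one pixel of a top-down 32-bpp bitmap; stride is the bytes per row. */
void set_pixel32(void *bits, size_t stride, int x, int y, uint32_t color)
{
    unsigned char *row;

    assert(bits != NULL);
    row = (unsigned char *)bits + (size_t)y * stride;
    ((uint32_t *)row)[x] = color;
}

#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper: clear the bitmap with GDI, then write to it directly. */
void clear_then_poke(HDC hdcMem, void *bits, int width, int height)
{
    /* This call may be batched and executed later. */
    PatBlt(hdcMem, 0, 0, width, height, BLACKNESS);

    /* Make sure the batched PatBlt has completed before touching the bits. */
    GdiFlush();

    /* Now it is safe to change the bits on our own side of the fence. */
    set_pixel32(bits, (size_t)width * 4u, 0, 0, 0x00FFFFFFu);
}
#endif
```

Without the GdiFlush call, the delayed PatBlt could run after the direct write and wipe out the pixel that was just set.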