Bill Hill
Microsoft Corporation
January 2000
Summary: Discusses what UI and Web page designers can learn from the old-fashioned book. (6 printed pages) Includes a discussion on ClearType™, a font display technology developed by Microsoft.
"I designed an application whose user interface was so natural that people would work in it for hours without fatigue. In fact, it was so transparent that it 'just disappeared,'—it was as if it were part of the user's own brain, and they forgot they were using an application at all…"
Sounds like the Holy Grail of UI design, right? A Utopian future, in which we're hooked up to a virtual reality world? Wrong! All of us work with a UI that is that natural, almost every day of our lives. It's called "The Book."
The "Book," when you boil it right down to its essence, is nothing but sooty marks on shredded trees. Yet it can capture our attention and hold it for hours. Not only that, but as we read a book, it disappears. We're no longer aware of "the book" we're holding out there; the "real book" is happening inside our heads.
As developers responsible for UI and Web page design, there's a lot we can learn from how books work to make our applications and Web sites more attractive and easier for people to learn and use.
I'm a researcher in the Electronic Books (eBooks) group at Microsoft, and I hope this article will pass on some "lessons learned" from our study of reading that you can apply to your own work.
How in the heck does the book do what it does? What's the magic? These two questions were the starting point of a research project I undertook last year, which helped lead us to develop a number of new technologies, including ClearType™, the font display technology we announced at Comdex last year, which dramatically improves the readability of text on LCD screens.
A detailed analysis of how books work reveals that they use an extremely mature and sophisticated technology. It's hard to sell this idea inside Microsoft, or to developers anywhere. There are no buttons, bells, whistles, or flashing lights. There's no obvious code running. Yet there is real technology there—a technology whose whole purpose is to be invisible, to allow the user to engage with and become immersed in the content.
The basic task of reading is word recognition. When we learn to read, we might start with the alphabet, learning individual letters, but at some point most of us cross a threshold at which the shape of a word like "door" is imprinted in our memory. From that time on, we never read the letters d-o-o-r, we just recognize the whole word. When we become more adept readers, we may read larger units of meaning, like phrases, sentences, even paragraphs. But the word is the basic building block.
And therein lies the first problem as far as computers are concerned. The science and art of typography has evolved over thousands of years (because it's a development from writing systems, which predate Gutenberg by a long way) to create word shapes on paper that are instantly recognizable to us. There's no effort involved.
On the screen, however, we have a problem: low resolution.
Even 15 years ago, the resolution of an average laser printer was 300 dots per inch. Nowadays, it's more likely to be 600dpi, or even 1200dpi. With dots that small, it's possible to reproduce the tiny features of type that have developed over the centuries to help word recognition.
Humans need (not like, need) to read text at relatively small sizes. Legibility work carried out by folks like Professor Miles Tinker and his colleague Donald Paterson found that readable text (for readers with no visual impairment) starts around 9 point. These letters are one-eighth of an inch high (72 points to the inch, so 9/72 = 1/8). Readability improves as text goes up to 10 point, gets even better at 11 point, starts to fall off slightly at 12 point, and gets a lot worse as you move past 14 point.
This isn't negotiable. It's a function of the human visual system, and the fact that the high-acuity area of our eye, the fovea, is only about 0.2 mm in diameter.
Serif typefaces, like Times New Roman or Palatino, are better for sustained reading than sans serif faces like Arial or Helvetica. Serifs are the tiny strokes at the end of main strokes of letters. They help to tie the letters together to form word "gestalts" and make them easier to recognize. They also help to lead the eye horizontally along lines of text.
At the same time that laser printers with 300dpi resolution were being introduced, the resolution even of so-called graphical user interface PCs, like Windows and Macintosh machines, was only 72dpi. Today, the average has made it only to about 96dpi, because there are technical problems in manufacturing to greater precision.
Contrast that to printer resolution improvement over the same period, and it's obvious there are hard problems to solve before dramatic improvements in screen resolution can take place. The good news is, big improvements are around the corner; new technologies like low-temperature polycrystalline silicon TFT-LCDs (try saying that with poorly fitting dentures) will take us to 200dpi screens over the next two years.
Unfortunately, even 200dpi won't be enough by itself. Most of us, and our customers, will be stuck with legacy equipment for many years to come.
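To put numbers on the gap between screen and printer, here is a minimal Python sketch. It is an illustration only, not part of the original research: the function name em_height_in_dots is hypothetical, and it relies on nothing more than the 72-points-to-the-inch convention and the resolutions quoted above.

```python
POINTS_PER_INCH = 72  # typographic points per inch, as stated in the text


def em_height_in_dots(point_size, dpi):
    """Return how many device dots span the nominal height of type at point_size."""
    return point_size / POINTS_PER_INCH * dpi


if __name__ == "__main__":
    # Resolutions mentioned in the article: screens at 72, 96 and (promised) 200dpi,
    # laser printers at 300 and 600dpi.
    for dpi in (72, 96, 200, 300, 600):
        dots = em_height_in_dots(10, dpi)
        print(f"10 point type at {dpi}dpi: about {dots:.1f} dots tall")
```

Run as a script, it shows 10 point type getting roughly 13 dots of nominal height on a 96dpi screen, against roughly 83 on a 600dpi printer. Every serif, curve, and stroke has to be drawn inside that budget.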
There's another problem in reproducing type at small sizes: letter spacing. Letter spacing is just as critical to transparent word recognition as letter shape. Try recognizing the word d o o r. That's an exaggeration to prove a point, but the effect is there even with much less obvious spacing flaws. And it's more critical on certain pairs of letters (t and o, for instance), hence the development of pair kerning, which removes extra space between certain characters to make them a better optical "fit."
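Pair kerning is, at heart, a lookup table consulted between each pair of neighboring letters. The sketch below is a simplified illustration under stated assumptions: the function name text_width, the advance widths, and the kerning value are all hypothetical, and real fonts store this data in their own table formats.

```python
def text_width(text, advances, kerning, default_advance=500):
    """Sum the advance widths of text, adjusting for each kerned pair.

    All values are in font units; a negative kerning value pulls a pair closer.
    """
    width = 0
    for i, ch in enumerate(text):
        width += advances.get(ch, default_advance)
        if i + 1 < len(text):
            # Look up the adjustment for this letter and the one that follows it.
            width += kerning.get((ch, text[i + 1]), 0)
    return width


if __name__ == "__main__":
    advances = {"t": 300, "o": 500}   # hypothetical advance widths
    kerning = {("t", "o"): -20}       # hypothetical adjustment for the pair t, o
    print("unkerned:", text_width("to", advances, {}))
    print("kerned:  ", text_width("to", advances, kerning))
```

The adjustments are small compared with the letters themselves, which is exactly why a coarse pixel grid has trouble honoring them.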
The problem is, the pixel is way too coarse a "brush" to re-create the subtlety of type. It's as if someone asks you to paint the Mona Lisa, and then hands you a paint roller.
Technologies like conventional anti-aliasing, or grayscaling, which attempt to smooth out the "jaggies" at the edges of letters, don't work at these sizes. They still use the same size paint roller; you just get to dip it in a few different shades of gray and smear it around the edges. The effect on small (readable) text sizes is that the type is blurred, and loses contrast. Readability is badly affected: it's like trying to read a book wearing someone else's eyeglasses.
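The loss of contrast is easy to reproduce. The following Python sketch models grayscale anti-aliasing as simple box averaging of an oversampled scanline. That model, the 4x oversampling factor, and the function names are assumptions made for illustration; real rasterizers differ in detail.

```python
def box_downsample_row(hi_res_row, factor):
    """Average each run of `factor` samples into one pixel of ink coverage (0.0 to 1.0)."""
    return [
        sum(hi_res_row[i:i + factor]) / factor
        for i in range(0, len(hi_res_row), factor)
    ]


def coverage_to_gray(coverage):
    """Map ink coverage to an 8-bit gray level: 0 is black, 255 is white."""
    return round(255 * (1.0 - coverage))


if __name__ == "__main__":
    factor = 4  # hypothetical oversampling: 4 ideal samples per screen pixel
    # One scanline through a vertical stem exactly one pixel wide (4 samples of ink),
    # positioned so that it straddles two screen pixels.
    row = [0, 0, 0, 0,  0, 0, 1, 1,  1, 1, 0, 0,  0, 0, 0, 0]
    print([coverage_to_gray(c) for c in box_downsample_row(row, factor)])
```

A stem that should be one solid black pixel comes out as two half-covered, mid-gray pixels: wider and paler than the designer intended.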
At Microsoft, we've tried probably harder than anyone else to defeat this pixel problem. For instance, we commissioned world-renowned type designer Matthew Carter to create two new typefaces—a serif and a sans serif—specifically for reading large amounts of text on the screen, by designing them around pixel characteristics. These are the Georgia and Verdana faces, which ship with all versions of Microsoft® Windows® and Microsoft Internet Explorer.
Today, these are probably the best choices you can make to make life easier for your users until ClearType becomes widely available.
Verdana and Georgia were big improvements, but not big enough for sustained reading on the screen. So, Bert Keely and I turned the magnifying glass on the pixel, to see if anything could be done. We focused especially on LCDs.
As developers, we've always treated pixels as pixels, ignoring whether they were LCD- or CRT-generated. Fact is, they're very different. On a CRT there's no such thing as a "pixel grid." We create one by using software to drive the graphics, because it's much more convenient. But on an LCD screen, the pixel grid is very real. The wiring makes hard edges to each pixel. That's one reason conventional anti-aliasing is even worse on LCDs than on CRTs.
If you want an eye opener, take a 10x magnifier and look at a standard laptop LCD. Look at a white area. Of course it's not white; it's red, green, and blue sub-pixels. We've always treated the pixel as the smallest usable unit. But with RGB sub-pixels the theoretical potential resolution is actually three times higher (in one dimension, but it turns out that that dimension is the most important for display of most Latin-based alphabets).
It's not nearly that simple, of course. If you treat each sub-pixel as a separate unit, you end up with three times the resolution, but text that's full of so many color artifacts you couldn't read it (unless, of course, you were color-blind).
The trick with ClearType consists of some very clever signal processing that removes the color fringing—actually, it doesn't need to be removed altogether, merely taken down below the level at which we perceive it, which is a different thing altogether. We were lucky; Microsoft Research has some world-class graphics folks and mathematicians who did a mathematical analysis of what we'd done and came up with ways to optimize it, make it faster, and so on. The result is a huge improvement in text resolution on existing hardware—and less computing than required for conventional anti-aliasing.
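The difference between naive sub-pixel addressing and a filtered version can be sketched in a few lines of Python. This is emphatically not ClearType's actual signal processing, which is not described here. The 3-tap box filter, the left-to-right R, G, B stripe order, and all function names are assumptions chosen only to show the principle of trading a little sharpness for less color fringing.

```python
def to_rgb_pixels(subpixel_coverage):
    """Group ink coverage per sub-pixel (assumed R, G, B order) into 8-bit RGB pixels."""
    levels = [round(255 * (1.0 - c)) for c in subpixel_coverage]
    return [tuple(levels[i:i + 3]) for i in range(0, len(levels), 3)]


def box_filter(samples):
    """Spread each sample over itself and its two neighbours (assumed 3-tap box filter)."""
    padded = [0.0] + list(samples) + [0.0]
    return [sum(padded[i:i + 3]) / 3 for i in range(len(samples))]


def worst_fringe(pixels):
    """Largest gap between channels in any pixel; 0 means every pixel is neutral gray."""
    return max(max(p) - min(p) for p in pixels)


if __name__ == "__main__":
    # A stem three sub-pixels (one pixel) wide, shifted one sub-pixel to the right.
    stem = [0, 1, 1,  1, 0, 0]
    naive = to_rgb_pixels(stem)
    filtered = to_rgb_pixels(box_filter(stem))
    print("naive:   ", naive, "fringe", worst_fringe(naive))
    print("filtered:", filtered, "fringe", worst_fringe(filtered))
```

In the naive version the shifted stem turns into one pure red pixel next to one pure cyan pixel. After filtering, the channels within each pixel sit closer together, so the fringing is reduced rather than eliminated, at the cost of a softer edge.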
The illustrations to follow zoom in to show the differences.
Figure 1. An 18x magnification of black text on a white background on a laptop LCD screen. You'd never believe the human eye sees all that color as white!
Figure 2. Conventional anti-aliasing. As you can see, almost all of the solid black in the character stems has been grayed out. Result? Blurry text, lowered contrast, hard to read.
Figure 3. ClearType smooths the edges, yet retains high contrast. All three samples are Gill Sans Italic typeface at 11 point.
ClearType multiplies resolution. Apply that multiplier to a 200dpi polysilicon TFT-LCD screen, and you get text that is easier to read than most of the print we read today.
It turns out that the way we're using color also makes for better anti-aliasing on CRTs, although the improvement isn't as dramatic as on LCDs, in which the sub-pixels are capable of being individually addressed, digitally.
ClearType will ship first in our Microsoft Reader early next year. It will eventually ship in all versions of Windows, including Windows CE, where it makes an amazing difference to readability on small handheld devices like palm-size PCs.
The additional precision that sub-pixel addressing gives us can be used to improve letter spacing as well as letter shapes.
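A rough Python sketch shows why finer positioning helps spacing. The advance widths below are hypothetical values for the letters of "door", and snapping to thirds of a pixel is a simplification of what sub-pixel addressing allows; the function names are likewise made up for this illustration.

```python
def snap(position, steps_per_pixel):
    """Round a position (in pixels) to the nearest 1/steps_per_pixel of a pixel."""
    return round(position * steps_per_pixel) / steps_per_pixel


def pen_positions(advances):
    """Return the ideal left edge of each letter, given the advance widths in pixels."""
    positions, pen = [], 0.0
    for advance in advances:
        positions.append(pen)
        pen += advance
    return positions


def worst_error(positions, steps_per_pixel):
    """Largest distance any letter is moved from its ideal position by snapping."""
    return max(abs(snap(p, steps_per_pixel) - p) for p in positions)


if __name__ == "__main__":
    advances = [5.4, 5.7, 5.7, 3.9]  # hypothetical advances for d, o, o, r
    ideal = pen_positions(advances)
    print("whole pixels:", round(worst_error(ideal, 1), 3))
    print("thirds:      ", round(worst_error(ideal, 3), 3))
```

Snapping to whole pixels can move a letter by up to half a pixel; snapping to thirds caps the error at one-sixth of a pixel. At reading sizes, where a letter is only a handful of pixels wide, that difference is visible in the evenness of the word shape.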
As developers, the book has a lot more to teach us than "more resolution is good."
One thing about the book that's not obvious is that it focuses our high-acuity vision on the content, and allows our peripheral vision to remain at rest. This has some important lessons for application and Web page design.
The first is: there's nothing on the page but content. One of the tricks I always use when reading on the Web (as opposed to browsing, or searching for new content) is to put Internet Explorer into full-screen mode (F11 key in Internet Explorer on Windows). I also have both the menu bar and Windows taskbar set to AutoHide, so both disappear until I mouse over them.
To hide the menu bar in Internet Explorer, first use F11 to go full screen, and then right-click the menu bar. To set the Start menu and taskbar in Windows to AutoHide, right-click the taskbar to bring up the Properties dialog box, and then check the AutoHide box.
When reading a longer document, you'll find the removal of all this menu clutter helps you or your users focus on the content.
Another great trick is to use linked "pages" rather than having a single, long, scrolling page. Users don't like to read scrolling text for long periods.
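One way to produce linked pages from a long document is to group whole paragraphs until a word budget is reached. The Python sketch below is one possible approach, not a prescribed method: the function name paginate and the words-per-page limit are hypothetical, and generating the links between the resulting pages is left out.

```python
def paginate(paragraphs, max_words_per_page):
    """Group whole paragraphs into pages of at most max_words_per_page words.

    A paragraph longer than the limit gets a page to itself rather than being split.
    """
    pages, current, count = [], [], 0
    for paragraph in paragraphs:
        words = len(paragraph.split())
        # Start a new page if this paragraph would push the current one over budget.
        if current and count + words > max_words_per_page:
            pages.append(current)
            current, count = [], 0
        current.append(paragraph)
        count += words
    if current:
        pages.append(current)
    return pages


if __name__ == "__main__":
    text = ["one two three", "four five", "six seven eight nine"]
    for number, page in enumerate(paginate(text, 5), start=1):
        print(f"page {number}: {page}")
```

Breaking only at paragraph boundaries keeps each page a complete unit of thought, much as a well-set book avoids stranding a single line at the top of a page.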
One of the great tragedies of the enormous growth of the Web over the past few years is how often people print out content because it's so painful to read on the screen.
Some Web page designers go to the opposite extreme: Web pages full of Java applets that animate almost everything. This is very counterproductive. When we read (or focus on content) our peripheral vision is at rest, right? In fact, our peripheral vision is a survival trait, and it is low-resolution because it's really only interested in movement, which in primitive times told us we needed to focus on a threat. Flashing applets or animation distracts our attention from content. To paraphrase the words of a good friend, UK typographer and designer John Miles, "It's like shouting all the time. In the end, nobody listens."
If you want to see a great example of shouting all the time, visit http://www.mirabilis.com/ … but not right after lunch!
Books are an "extreme case" of reading. We immerse ourselves in them for hours at a time. But if you solve the "extreme case," you can apply some of the techniques to all kinds of applications, documents, Web pages, and so on.
As far as typography is concerned, the devil is in the details. There are a lot of variables to juggle, and non-experts can easily dismiss the attention to detail as unnecessary fussiness—but the end results are definitely worth it.
For those who want to read more, here are some good books: