To get the most out of our site, we need to meet the needs of our visitors. This might mean changing the site to better adapt it to our existing visitor profile, or changing it so that it attracts different kinds of visitors. To be able to do either of these things, we need to measure our site's usage in a range of different ways. This will usually include the number of hits and the volume of traffic, the number of visitor sessions, the browsers and operating systems that visitors use, when and from where they visit, and which sites refer them to us.
We'll look at each of these in turn, and see how the logging system we set up in the previous chapter can help to provide this information. The following diagram shows the way we are collecting and summarizing the data from the two main log files—the file access (or hit count) log, and the user session log. The different summary tables provide the various kinds of information that we need:
Measuring the number of hits your site receives is a good way to get a warm glow. You can even use the results to try and bluff your financial director into parting with the cash for a couple of new Web servers. You often see Web sites that proudly display this figure on their Home page as well: 'Congratulations, you are visitor number twenty four million and seventy three!'
Don’t believe a word of it. Counting hits doesn’t equate in any way to counting the number of visitors. Our Web-Developer site gets over 250,000 hits a week, but from our measurements this only equates to around 7,500 visitors. However, some weeks we might get 6,000 visitors for 350,000 hits, while other weeks we get 8,000 visitors for 200,000 hits.
You can’t accurately and directly equate the number of hits with the number of visitors, because you don’t know how many pages (together with all their assorted graphics and other files) that person will visit. And of course, our site is only a minnow compared to the 'big boys' out there. I might spend an hour in one visit to the Microsoft or Netscape sites, causing many hundreds of hits.
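To see how loosely hits and visitors are related, it helps to work out the hits-per-visitor ratio for the three sets of weekly figures quoted above. The following is only a minimal sketch of that arithmetic; the week labels and the function name are illustrative and not part of the logging system.

```python
# Weekly figures quoted in the text, as (hits, visitors).
# The labels are made up for this example.
weeks = {
    "typical week": (250_000, 7_500),
    "heavy-browsing week": (350_000, 6_000),
    "light-browsing week": (200_000, 8_000),
}


def hits_per_visitor(hits, visitors):
    """Average number of file requests generated by each visitor."""
    return hits / visitors


ratios = {label: hits_per_visitor(h, v) for label, (h, v) in weeks.items()}

for label, ratio in ratios.items():
    print(f"{label}: {ratio:.1f} hits per visitor")
```

The ratio more than doubles between the lightest and heaviest weeks, which is why a hit count on its own says little about how many people actually visited.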
However, hit counting is useful for looking at overall traffic patterns. You can see whether your site gets more hits on particular days, which may indicate whether it's mainly business or home users that you attract. We provide a publicly available breakdown showing the number of hits on our server (and the traffic volume) by day, over the previous four weeks. Go to http://webdev.wrox.co.uk/resources/ and follow the links to the Gallery and then Graphical Traffic Reports. You can see that our site is far busier during the week than at weekends:
You'll see how this and other types of graphical charts are created later in this chapter.
Being able to measure traffic volume (in Kbytes) is less universally useful, but can provide some useful information. For example, you can get an indication of how much spare capacity there is on your 'Net connection. If the amount you pay for your connection is based on traffic volume, it helps when you come to check the charges.
It's also useful to be able to measure the relationship between the number of hits and the volume of data sent out by your server. As it changes week by week, you can see how the number and popularity of non-HTML pages (i.e. downloadable files), and the size of the graphics, affect traffic volume on your site.
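The daily breakdown of hits and traffic volume described above amounts to a simple grouping of hit-log records. The sketch below shows one way it might be done, assuming each record holds only the request date and the number of bytes sent. The record layout, the sample data, and the function names are hypothetical and do not reflect the actual log tables.

```python
from collections import Counter
from datetime import date

# Hypothetical hit-log records: (date of request, bytes sent).
hit_log = [
    (date(1998, 3, 2), 12_400),  # Monday
    (date(1998, 3, 2), 3_100),   # Monday
    (date(1998, 3, 4), 48_000),  # Wednesday
    (date(1998, 3, 7), 2_250),   # Saturday
]

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def summarize_by_weekday(records):
    """Return (hits per weekday, bytes per weekday) as two Counters."""
    hits = Counter()
    volume = Counter()
    for day, size in records:
        name = WEEKDAYS[day.weekday()]
        hits[name] += 1
        volume[name] += size
    return hits, volume


def bytes_per_hit(records):
    """Average bytes sent per request; 0.0 for an empty log."""
    sizes = [size for _, size in records]
    return sum(sizes) / len(sizes) if sizes else 0.0


hits, volume = summarize_by_weekday(hit_log)
average = bytes_per_hit(hit_log)
```

Tracking the bytes-per-hit figure from week to week is one way to notice when large downloads or heavier graphics start to dominate the traffic.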
Far more useful for measuring the real number of visitors to our site is the ability of ASP to count user sessions. In general, we'll get just one session start per visitor. Only after they fail to load any file from our site within 20 minutes (the default, see Chapter 5) will the session end. Any further accesses to the site will then start a new session. So, we have a reasonably foolproof way of counting visitors.
This isn’t perfect, however, because ASP sessions depend on the browser's ability to accept and return cookies—as we saw in Chapter 5. Some user agents (particularly spiders, search engines, robots, etc.) won’t accept cookies, and so won’t start a session. However, we probably shouldn't class these as visitors anyway, any more than you would class the person who comes to read your electricity meter as a visitor to your home.
There are some issues with sessions, in that sometimes ASP can lose track of them. This might be caused by a link on your site, or by the browser itself (the user may have set it to reject cookies). The point is that, in general, session counting provides a good guide to the number of visitors the site gets, independent of the number of hits they make on the site.
Some sites count the number of accesses to a single page to measure the number of visitors. This is reasonable when your entire site appears in a frameset that doesn’t change as visitors navigate the site, because the frameset page will then only be loaded once per visit. However, this falls down with browsers that don't support frames. Counting the number of times that your Home page is loaded is not ideal either, because visitors will tend to load this many times if they are exploring your site.
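The effect of the 20-minute timeout on visitor counts can be illustrated with a small simulation. This is not how ASP tracks sessions internally; it only imitates the timeout rule described above, on the assumption that each request can be tied to a visitor identifier (in practice, the session cookie). The sample requests and all names are hypothetical.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=20)  # the default quoted in the text


def count_sessions(requests, timeout=SESSION_TIMEOUT):
    """Count sessions in an iterable of (visitor_id, timestamp) pairs.

    A visitor's request starts a new session if it is their first, or if
    at least `timeout` has passed since their previous request.
    """
    last_seen = {}
    sessions = 0
    for visitor, when in sorted(requests, key=lambda r: r[1]):
        previous = last_seen.get(visitor)
        if previous is None or when - previous >= timeout:
            sessions += 1
        last_seen[visitor] = when
    return sessions


# Hypothetical requests: visitor A returns after a 35-minute gap.
requests = [
    ("A", datetime(1998, 3, 2, 9, 0)),
    ("A", datetime(1998, 3, 2, 9, 5)),
    ("B", datetime(1998, 3, 2, 9, 10)),
    ("A", datetime(1998, 3, 2, 9, 40)),
]

total_hits = len(requests)
total_sessions = count_sessions(requests)
print(f"{total_hits} hits, {total_sessions} sessions")
```

Here two people account for three sessions, which shows both why sessions track visitors far better than hits do and why a returning visitor can still be counted twice.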
In Chapter 4 we looked at the issues involved in creating content that works on different kinds of browsers. We also considered the market share of different browsers, and the kinds of support you should be considering for the various types. Being able to measure the proportion of each browser type that hits your site, rather than depending on the average for the Web as a whole, can provide useful insights into the mix of visitors you get.
And if, like us, you provide material that is connected with computing generally, the types of operating system that your visitors run can provide useful extra information. For example, our Web-Developer site focuses heavily on Windows NT Server programming issues, so we would expect to get a large proportion of visitors using that system. However, bear in mind that just because visitors access your site using, say, Windows 95, doesn’t necessarily mean that this is the development environment they use in their workplace.
Overall, though, the combination of browser and operating system does provide another way of measuring the kinds of visitors you get. The variations from site to site are evident in our case, because—as we saw in Chapter 4—the market share for Netscape Navigator in the majority of colleges is as high as 90%, while on our site it's only around 25%:
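A browser breakdown like this can be derived from the user agent strings recorded for each session. The sketch below uses a deliberately crude heuristic: anything containing "MSIE" is treated as Internet Explorer, and any other string starting with "Mozilla/" as Netscape Navigator. Both the heuristic and the sample strings are assumptions made for illustration; a real classifier would need many more rules.

```python
from collections import Counter


def classify_browser(user_agent):
    """Very rough browser detection from a user agent string."""
    ua = user_agent.lower()
    if "msie" in ua:
        return "Internet Explorer"
    if ua.startswith("mozilla/"):
        return "Netscape Navigator"
    return "Other"


def browser_share(user_agents):
    """Return each browser's share of the given sessions, in percent."""
    counts = Counter(classify_browser(ua) for ua in user_agents)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100 * n / total for name, n in counts.items()}


# Hypothetical user agent strings, one per session.
sample = [
    "Mozilla/4.0 (compatible; MSIE 4.01; Windows NT)",
    "Mozilla/4.0 (compatible; MSIE 4.01; Windows 95)",
    "Mozilla/2.0 (compatible; MSIE 3.02; Windows 95)",
    "Mozilla/4.04 [en] (WinNT; I)",
]

share = browser_share(sample)
```

The order of the checks matters, because Internet Explorer also identifies itself with a "Mozilla/" prefix.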
It's also valuable to see how traffic volume changes by the hour—useful if you plan major updates to the site, because you can choose a time when it's quiet. When combined with a measurement of where your visitors are located in the world, i.e. which time zones, you can start to get a picture of the spread of your site's popularity. It also provides an indication of whether you should perhaps consider devoting areas of your site (or even building specially located sites) to meet the needs of visitors in other countries.
The Graphical Traffic Reports section of our Web site provides an hourly breakdown of traffic volume. We know that we get most hits during the working week, and that the largest volume comes from English-speaking countries—particularly the US and the UK. What this breakdown also indicates is that the loading is highest in the period from 2:00 PM to 11:00 PM GMT. This reinforces our assumption that a large proportion of our visitors are in the US and use computers at work, because this period encompasses their working day:
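The claim that the busy period covers the US working day is easy to check by converting its start and end from GMT to US local time. The sketch below assumes fixed standard-time offsets of GMT-5 (Eastern) and GMT-8 (Pacific) and ignores daylight saving time. The date used is arbitrary.

```python
from datetime import datetime, timedelta, timezone

GMT = timezone.utc
# Assumed fixed offsets for US standard time; daylight saving is ignored.
US_EASTERN = timezone(timedelta(hours=-5), "EST")
US_PACIFIC = timezone(timedelta(hours=-8), "PST")


def local_hour(gmt_hour, zone):
    """Convert an hour of the day in GMT to the hour in the given zone."""
    moment = datetime(1998, 3, 2, gmt_hour, 0, tzinfo=GMT)
    return moment.astimezone(zone).hour


busy_start, busy_end = 14, 23  # 2:00 PM to 11:00 PM GMT, from the text

for zone in (US_EASTERN, US_PACIFIC):
    start = local_hour(busy_start, zone)
    end = local_hour(busy_end, zone)
    print(f"{zone.tzname(None)}: {start:02d}:00 to {end:02d}:00")
```

Under these assumptions the busy period runs from 09:00 to 18:00 on the East Coast and from 06:00 to 15:00 on the West Coast, which is consistent with visitors browsing from work.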
The final area where we collect information is about sites that provide links to us. The number of visitors you are likely to get can be highly dependent on this (depending on how you publicize your site), so knowing what's happening is vital. By measuring the number of referrals, and comparing it with where and how you spend any advertising budget you may have, you can measure the return on that expenditure.
More to the point, perhaps, is that you can directly see what kinds of people are interested in your site, and who considers it useful enough to provide a link to it from their own site. You often find that referrers also comment on what they think, maybe telling their visitors that this is a really great site that they shouldn’t miss (good for the ego!). Alternatively, if you find your site listed on the 'Worst of the Web' pages or as an example of a poorly constructed site, it's probably time for a redesign exercise.
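Counting referrals comes down to grouping the referring URLs recorded for each session by host name. The following is a minimal sketch, assuming the referrer is stored as a full URL and that requests arriving from our own pages, or with no referrer at all, should be ignored. The sample URLs and the function name are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse

OWN_HOST = "webdev.wrox.co.uk"  # links from our own pages are not referrals


def referring_hosts(referrers, own_host=OWN_HOST):
    """Count external referring hosts in a list of referrer URLs."""
    counts = Counter()
    for url in referrers:
        host = urlparse(url).hostname  # None when the referrer is empty
        if host and host != own_host:
            counts[host] += 1
    return counts


# Hypothetical referrer values, one per session start.
sample_referrers = [
    "http://www.example.com/links.html",
    "http://search.example.org/results?q=asp",
    "http://www.example.com/reviews/best-sites.html",
    "http://webdev.wrox.co.uk/resources/",
    "",
]

referrals = referring_hosts(sample_referrers)
for host, count in referrals.most_common():
    print(f"{host}: {count}")
```

Grouping by host rather than by full URL keeps the summary short, at the cost of hiding which particular page carries the link.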