This article may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. To maintain the flow of the article, we've left these URLs in the text, but disabled the links.
Tracking and Fighting Spam: A Primer for Postmasters
R'ykandar Korra'ti
Unauthorized email floods are one of the biggest headaches faced by system administrators of Internet-connected networks. There are some effective strategies for reducing their drain on your resources.
Spammers are thieves. They can argue about this all they want. They're hijacking your system to deliver their unrequested, unwanted advertising. If you pay for your bandwidth consumption, then they're charging you for the privilege. If you don't, then they're charging your ISP for the privilege, and you can be sure that's included in your monthly fee, one way or another.
The allegedly "legitimate" spammers have a bit more of a case; they don't hide where their mail is coming from, and they at least pretend to offer a way off their lists. But I'm not talking about that sort. I'm talking about the run-of-the-mill spammers: the pedestrian make-money-fast/mortgage fraud/gambling/pornmeisters who forge everything they can in the header and dump email on unsuspecting third parties to deliver for them, thus stealing from even more people than do "legitimate" spammers. These are the people who've forced site administrators to shut down relay services on their machines to stem the flow, thereby defeating a useful design function of the Internet.
What does all this mean to you? If you want to track down spammers and get them shut down, you have the moral right, perhaps even the duty, to do so. OK, maybe it's just a desire to see a spammer get hammered. Whatever. It's all right; they're the villains here.
So where do you start? With a few specific tools available on the Web and a bit of analysis of the header, you'll be ready to fight back.
The Tools
The important tools (or, more properly, the Internet protocol services that various tools implement) are ping, NSLookup, and WhoIs. Of these, WhoIs is most important. Traceroute, which I won't discuss here, is occasionally handy as well.
The Header
First, you need to know a few basic rules about spam mail and its header lines:
Decoding Sample Spam
Figure 1 shows the header from a piece of mail I got just recently, as displayed in Outlook® 97.
Figure 1: Spam Header in Outlook
This is actually a fairly restrained piece of spam. There are only a few pieces of outright goofiness in the header, most notably a From line which appears to be Aztec in origin, but as noted before, you can ignore that. Start at the first (top) Received line.
Received: from euromar-travel.com (root@[194.65.2.129])
by anvilite.murkworks.net (8.6.12/8.6.9) with ESMTP id VAA29070 for
<kiki@murkworks.net>; Thu, 29 Jul 1999 21:11:13 -0700
This much I know is true: it was received by me (anvilite.murkworks.net). That part was placed by the SMTP handler on my machine. Where it's from is another matter completely. Fortunately, part of that information was also placed by my machine: specifically, the portion in parentheses. Broken down and printed in color (red meaning provided by someone else, green meaning provided by my machine, and black meaning uninteresting in this scenario), it looks like this:
Received: from euromar-travel.com (root@[194.65.2.129])
by anvilite.murkworks.net (8.6.12/8.6.9) with ESMTP id VAA29070 for
<kiki@murkworks.net>; Thu, 29 Jul 1999 21:11:13 -0700
Here's how it looks generalized into a template:
Received from <sending machine, as provided by that machine; untrustworthy>
(<sending machine's name (sometimes) and IP address (always),
as provided by receiving machine; trustworthy>)
by <receiving machine's name, provided by receiving machine> for
<some random user, sometimes accurate and sometimes not;
provided by sending machine but uninteresting>;
<date and time, provided by receiving machine, sometimes interesting>
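The template above can be turned into a small parser. What follows is only a minimal sketch in Python, assuming the sendmail-style layout of the two sample Received lines in this article; other mailers format the line differently. The function name parse_received and the field names are hypothetical, chosen for illustration.

```python
import re

# Assumed layout (taken from the sample headers in this article):
#   from <claimed name> (<optional user@><optional looked-up name> [<ip>]) by <receiver>
RECEIVED_RE = re.compile(
    r"from\s+(?P<claimed>\S+)\s+"
    r"\((?:(?P<user>[^\s@\[\]]+)@)?"        # optional "user@" prefix
    r"(?:(?P<looked_up>[^\s@\[\]]+)\s+)?"   # name found by the receiver, if any
    r"\[(?P<ip>[0-9.]+)\]\)\s+"             # IP address recorded by the receiver
    r"by\s+(?P<receiver>\S+)",
    re.IGNORECASE,
)


def parse_received(raw_line):
    """Split one Received line into claimed (untrustworthy) and recorded (trustworthy) parts.

    Returns a dict with the keys claimed, user, looked_up, ip and receiver,
    or None when the line does not follow the assumed layout.
    """
    # Header lines are often folded across several physical lines; unfold first.
    unfolded = " ".join(raw_line.split())
    match = RECEIVED_RE.search(unfolded)
    return match.groupdict() if match else None


sample = """Received: from euromar-travel.com (root@[194.65.2.129])
 by anvilite.murkworks.net (8.6.12/8.6.9) with ESMTP id VAA29070 for
 <kiki@murkworks.net>; Thu, 29 Jul 1999 21:11:13 -0700"""

hop = parse_received(sample)
print(hop)
```

Only the ip and looked_up fields come from the receiving machine; claimed is whatever the sender chose to say about itself.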
Note that in this case, the receiving machine (anvilite.murkworks.net) did not fill in the name of the sending machine, and instead left only the IP address, 194.65.2.129. This used to be standard; hosts would assume that other hosts would correctly identify themselves. The IP was provided for other reasons. These days the sending host often lies, particularly where spammers are involved, and most administrators turn on a feature called reverse authentification. Reverse authentification causes the receiving mailer to look up the name belonging to the IP address of the machine handing it the mail (a reverse DNS lookup). It then puts this name in the header, next to the IP address, inside the parentheses rather than outside. This gives you a handy and quick check to see if a sending machine lied.
In this example the reverse authentification failed. This does not automatically mean that the sending host was lying about its name. It could simply be flaky, it could be running DHCP and therefore have a changing IP address, or it could be a machine within a domain that doesn't have a specific name and therefore doesn't have a specific DNS entry. Now is when you need the tools I mentioned earlier.
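The same reverse lookup can be repeated by hand. The sketch below is a rough Python illustration, not what any particular mailer does internally: it uses the standard library's socket.gethostbyaddr to ask for the name registered for an IP address and compares it with the name the sender claimed. The function names reverse_name and check_claim are hypothetical.

```python
import socket


def reverse_name(ip, resolver=socket.gethostbyaddr):
    """Return the host name registered for ip, or None when the lookup fails."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None
    return hostname


def check_claim(claimed, ip, resolver=socket.gethostbyaddr):
    """Compare the name a sender claimed with the name registered for its IP.

    Returns "match", "mismatch", or "unverified" (no reverse entry found).
    """
    looked_up = reverse_name(ip, resolver)
    if looked_up is None:
        return "unverified"
    if looked_up.lower().rstrip(".") == claimed.lower().rstrip("."):
        return "match"
    return "mismatch"
```

Calling check_claim("euromar-travel.com", "194.65.2.129") performs a live lookup, so the answer depends on what name service says today. Treat "mismatch" and "unverified" as hints rather than proof, for the reasons given above. The resolver parameter exists only so the check can be exercised without a network.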
Figure 2: NSLookup of IP Address
First, let's see what NSLookup has to say about 194.65.2.129 (see Figure 2). As you can see, name service knows about this host, and thinks that it's a machine called dns.madinfo.pt, which is not what the sending host told us. Suspicious, but not necessarily a problem. One host can have many names. Besides, almost no spammers hand you the mail directly. Let's ping euromar-travel.com and see whether they're alive (see Figure 3).
Figure 3: Pinging euromar-travel.com
Well, they aren't. But still, this isn't fatal; they might not be a host that is up full time. Let's use the second tool, WhoIs. WhoIs provides access to the various domain registration databases, such as those maintained by the InterNIC and other organizations. And as you can see in Figure 4, this yields better results. Euromar-travel.com is a registered domain with the InterNIC, and all its domain name servers are in the madinfo.pt domain. This means that euromar-travel.com is probably a single machine leasing an address from a larger organization. It also means that the host has legitimately identified itself; it gave a correct name.
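If you have no WhoIs tool at hand, the protocol itself is simple enough to speak directly: open a TCP connection to port 43, send the name followed by CRLF, and read until the server closes the connection. Here is a minimal Python sketch under those assumptions. The function names are hypothetical, which server to ask is left to the caller, and the "Name Server:" and "nserver:" labels are assumptions about the reply format, which varies from registry to registry.

```python
import socket


def whois_query(name, server, port=43, timeout=10.0):
    """Send one WhoIs query to server and return the reply as text."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(name.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Labels assumed to introduce a name server in the reply; adjust per registry.
NAME_SERVER_LABELS = ("name server", "nserver")


def name_servers(whois_text):
    """Collect name server host names from a WhoIs reply, in order, without repeats."""
    servers = []
    for line in whois_text.splitlines():
        label, colon, value = line.partition(":")
        fields = value.split()
        if colon and fields and label.strip().lower() in NAME_SERVER_LABELS:
            host = fields[0].lower().rstrip(".")
            if host not in servers:
                servers.append(host)
    return servers
```

Passing the text returned by whois_query to name_servers gives a list you can compare against the name that NSLookup reported for the sending IP address.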
Figure 4: Detailed Information from WhoIs
And what does this mean? The upshot is that this Received line is valid. So the host that handed this mail to me was almost certainly not the originating site of the spam. Spammers don't generally play that way. Plus, there's another, older Received line below this one, which means that euromar-travel.com was simply a victim: someone a spammer picked on to deliver the spam mail for free. When you have a valid Received line, it's generally safe to assume that the previous Received line was filled in by an honest host. This isn't always the case, but it's a reasonable rule of thumb. It also doesn't mean that everything in that Received line is valid, any more than everything in the previous Received line was correct. Spammers lie to everyone. So let's look at the Received line filled in by the previous hop.
Received: from tgnfg.nada.kth.se (zeus.host4u.net [216.71.64.21])
by euromar-travel.com (8.8.8/8.8.8) with SMTP id FAA03578;
Tue, 30 Mar 1999 05:08:56 GMT
This example shows that euromar-travel.com has reverse-authentification turned on; instead of just the IP address in parentheses (216.71.64.21), there's a host name (zeus.host4u.net) as well. Validating this using either NSLookup (see Figure 5) or WhoIs (see Figure 6) tells you that indeed euromar-travel.com found the correct name.
Figure 5: Validating the IP Address in NSLookup
Figure 6: IP Address Lookup in WhoIs
And, as the Received line makes clear, that correct name differs wildly from the name handed to them by the sending system. The use of the word "nada" was the other warning flag. People do give hosts and domains names like that in real life, but more often it's a spammer making something up. So the spamming host was almost certainly host4u.net. But before sending mail, it's important to verify that tgnfg.nada.kth.se isn't real, and a quick scan against the major WhoIs servers shows that it's not. Neither is the listed subdomain, nada.kth.se. kth.se does exist (see Figure 7), but that doesn't mean much at this point. I can be fairly sure that the spam came from some random user at host4u.net. Which specific user, I don't know and can't find out; but they can, with their log files.
Figure 7: Domain Name Lookup in WhoIs
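The hop-by-hop reasoning used so far can be written down as a short routine: walk the Received lines from the top, and stop at the first one whose claimed name does not check out. This is only a sketch in Python of that rule of thumb, not a reliable detector. The function names are hypothetical, and the claim_checks_out callable stands in for the manual NSLookup and WhoIs checks described above.

```python
def first_dishonest_hop(hops, claim_checks_out):
    """Return (index, hop) for the first hop whose claimed name fails the check.

    hops are ordered as they appear in the header, top (most recent) first.
    claim_checks_out is supplied by the caller and returns True when the
    claimed name has been verified. Returns None if every hop checks out.
    """
    for index, hop in enumerate(hops):
        if not claim_checks_out(hop):
            return index, hop
    return None


def recorded_sender(hop):
    """Identify a sender by what the receiving machine recorded, not by its claim."""
    return hop["looked_up"] or hop["ip"]


# The two hops from the sample header in this article.
hops = [
    {"claimed": "euromar-travel.com", "looked_up": None, "ip": "194.65.2.129"},
    {"claimed": "tgnfg.nada.kth.se", "looked_up": "zeus.host4u.net", "ip": "216.71.64.21"},
]

# Names confirmed by hand with WhoIs, as in the walkthrough above.
verified_by_hand = {"euromar-travel.com"}

found = first_dishonest_hop(hops, lambda hop: hop["claimed"] in verified_by_hand)
if found is not None:
    index, hop = found
    print(f"Received line {index + 1}: claimed {hop['claimed']}, "
          f"recorded as {recorded_sender(hop)}")
```

The design choice worth noting is that the result names the sender by the recorded name or IP address, since those are the parts written by a machine you have reason to trust.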
Often, there will be more Received lines following the first dishonest host. Ignore them. They would almost always have been placed by the spammer to confuse you: more aluminum foil in your radar.
Taking Action
At this point, it's time to send mail. You need to send a note saying what happened, and you must include the spam itself, with its entire header intact. Without that header, the sending system can't track down the individual user who sent the spam, and your complaint will generally just be ignored.
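One way to keep the header intact is to attach the spam as a message in its own right rather than pasting it into the body. The following is a minimal Python sketch using the standard email package. The function name build_complaint, the subject text, and the addresses in the usage note are hypothetical, and actually sending the result is left out.

```python
from email import message_from_string
from email.message import EmailMessage


def build_complaint(raw_spam, sender, recipient, note):
    """Build a complaint that carries the original spam as a message/rfc822 attachment."""
    complaint = EmailMessage()
    complaint["From"] = sender
    complaint["To"] = recipient
    complaint["Subject"] = "Spam sent from or relayed through your system"
    complaint.set_content(note)
    # Attaching the parsed message (not just its body) keeps every header
    # line, including all the Received lines, with the complaint.
    complaint.add_attachment(message_from_string(raw_spam))
    return complaint
```

A call might look like build_complaint(raw_spam, "postmaster@example.org", "complaints@example.net", "Details of what happened..."), where raw_spam is the full source of the message as your mail client saved it. Both addresses here are placeholders; which mailbox at the offending site handles complaints is something you have to find out separately.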
http://msdn.microsoft.com/library/partbook/asp20/aspinternetmail.htm and http://msdn.microsoft.com/library/psdk/cdosys/cdosysguide_transportevts_examples_6ycf.htm
From the December 1999 issue of Microsoft Internet Developer.