Bruce: As you know, this is my first work as a professional programmer—although at this point I’m only a sample programmer. This is also my introduction to parsing. I was assigned to fix REMLINE.BAS as a sample program for QuickBasic.
Joe: What the heck is a sample programmer?
Bruce: Well, there are real programmers, and there are sample programmers. I’m just a sample.
Mary: Bruce has been hired by the documentation department as a programmer. In the past, all the sample programs in the manuals and on the disk have been just thrown together by whoever happened to have time. Coding style and quality varied a lot. Bruce’s job is to make all the samples use a consistent style and have consistent quality.
Joe: Well, if he works for documentation, why don’t the documenters review his code?
Mary: We thought it’d be good experience for him to have his code reviewed by programmers from Basic development. A sort of baptism by fire.
Jane: It might also be nice for C programmers writing the Basic language to see what it’s like in the trenches. How long has it been since you tried to write serious code in Basic?
Joe: Serious code in Basic? Hmph! Can’t be done.
Bruce: We’ll see. Back to REMLINE. I didn’t write it, and I don’t know the original author. The version I got was lightly commented, had a clumsy user interface, didn’t take full advantage of new Basic features, and had undocumented limitations that some people might call design bugs. My job was to clean up this private code for public consumption.
Joe: So this isn’t even your code?
Bruce: Well, what we’re going to talk about today is mine. I revised it heavily, and I take responsibility for it. Now, the purpose of REMLINE is to remove all unnecessary line numbers from GW-BASIC and BASICA programs so that they look semistructured. In theory, you can then go through and change the remaining line numbers to labels as a first step toward making the code readable by humans as well as by modern versions of Basic.
Joe: You’re better off just throwing away your BASICA code and starting over from scratch.
Jane: Oh, sure! That’s a good marketing line. Put that on the box. “QuickBasic: Now you can rewrite all your old code.”
Bruce: When I started on REMLINE, I thought it was a pretty cool program, and I had all kinds of ideas for making it even cooler. Now I tend to agree with Joe. Dump your BASICA code in the trash. In any case, we’re not reviewing REMLINE; we’re reviewing the parsing code in it.
REMLINE is a very simple compiler that tokenizes a source file. It recognizes certain Basic keywords associated with line numbers—Goto, Gosub, Then, Else—and indexes each one with its target line number. When it’s finished indexing, it strips out all the line numbers that haven’t been targeted by a keyword. A good portion of the code deals with tokenizing keywords, variables, and line numbers. REMLINE does this with imitations of the C parsing functions strtok, strspn, and strcspn.
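The two passes described above (index the targeted line numbers, then strip the rest) can be sketched roughly as follows. This is an illustration in Python, not the REMLINE source. The helper names (`tokens`, `find_targets`, `strip_unused`) and the separator set are assumptions made for the example, and `strspn` and `strcspn` here are simple imitations of the C functions that return a length for a given string. The sketch ignores quoted strings and REM comments, which a real tool would have to handle.

```python
# Keywords that may be followed by a target line number (from the text above).
KEYWORDS = {"GOTO", "GOSUB", "THEN", "ELSE"}
SEPARATORS = " ,:\t"      # assumed separator set, not REMLINE's actual one
DIGITS = "0123456789"

def strspn(s, chars):
    """Length of the leading run of s made up only of characters in chars."""
    n = 0
    while n < len(s) and s[n] in chars:
        n += 1
    return n

def strcspn(s, chars):
    """Length of the leading run of s that contains no character in chars."""
    n = 0
    while n < len(s) and s[n] not in chars:
        n += 1
    return n

def tokens(line):
    """Yield separator-delimited tokens, strtok-style, without modifying line."""
    pos = 0
    while True:
        pos += strspn(line[pos:], SEPARATORS)      # skip leading separators
        if pos >= len(line):
            return
        length = strcspn(line[pos:], SEPARATORS)   # measure the token
        yield line[pos:pos + length]
        pos += length

def is_number(tok):
    """True if tok is non-empty and consists only of digits."""
    return tok != "" and strspn(tok, DIGITS) == len(tok)

def find_targets(lines):
    """Pass 1: collect every line number that follows a jump keyword."""
    targets = set()
    for line in lines:
        after_keyword = False
        for i, tok in enumerate(tokens(line)):
            if i == 0 and is_number(tok):
                continue                  # the line's own number is not a target
            if tok.upper() in KEYWORDS:
                after_keyword = True
            elif after_keyword and is_number(tok):
                targets.add(tok)          # also covers lists such as ON X GOTO 10, 20
            else:
                after_keyword = False
    return targets

def strip_unused(lines):
    """Pass 2: blank out leading line numbers that no keyword targets."""
    targets = find_targets(lines)
    result = []
    for line in lines:
        start = strspn(line, " \t")
        width = strspn(line[start:], DIGITS)
        number = line[start:start + width]
        if width and number not in targets:
            line = line[:start] + " " * width + line[start + width:]
        result.append(line)
    return result
```

Given the four lines `10 PRINT X`, `20 IF X THEN 40`, `30 GOTO 10`, and `40 END`, `strip_unused` keeps the numbers 10 and 40 because they are targeted, and replaces 20 and 30 with spaces. Replacing rather than deleting is a choice made for this sketch so that the statements stay aligned.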
Joe: This is crazy. I don’t have time to review sample code that no one will see. Even if you ship this crap, no one will pay any attention to it, and you won’t bother to ship it with the next version.
Mary: You don’t have to like this assignment, but you do have to do it. Try to have a positive attitude.
Archaeologist’s Note: REMLINE went on to become the most widely distributed source code in the history of the planet. It shipped not only with various versions of Microsoft Basic but also with MS-DOS 5, and it still ships with every copy of Windows NT version 4. For reasons unknown, the problems discovered in the code review described here were never fixed. Fortunately, most of the millions of people who saw this code didn’t look at it.