YAHU, or Yet Another Header Utility

Ruediger R. Asche
Microsoft Developer Network Technology Group

January 10, 1995

The files for the YAHU sample application accompany this technical article.

Abstract

This article describes the architecture and implementation of YAHU.EXE, a utility that allows users to analyze executable files for all platforms supported by the Microsoft® Windows®, Windows NT™, and MS-DOS® operating systems. Special consideration is given to code recycling and interface design for applications based on the Microsoft Foundation Class Library (MFC).

Introduction

Let's face it: Header utilities are the giveaways of the early 1990s. If your company has run out of ideas for freebies to send to your valued customers (with propeller beanies being the only possibility left, but the Developer Network team is about the only group of nutheads who can get away with something dorky like that), your marketing staff turns to you, the peon developer, and asks you, "What little tool can you crank out that we can give to our customers as a value-added bonus for our product?" So you dig out the product documentation, come across the structure definitions for the header file format, and bingo! "Let's write a header utility!"

Well, YAHU is a header utility, pretty much like the utility you developed as a stocking stuffer for version 2.0 of your product, but with two exceptions:

  1. Except for the user interface and selection logic, all the code in YAHU has been shamelessly plagiarized from various sources in the Microsoft® Development Library:
  2. YAHU is sort of the ultimate header utility, because it incorporates header file information for almost all known executable file formats, and it can be extended fairly easily for other formats. Also, YAHU includes a number of additional features that I copied and pasted from other sources, as I listed above. Thus, after YAHU, you should look for another tool to stuff your product box with. Sorry.

In this article, I will not elaborate on the code that I pasted from other sources, which is already described elsewhere. The article will focus on the user interface and on any code changes significant enough to merit discussion.

The entire application took about two weeks to crank out, mostly due to the awesome abstraction mechanisms that the Microsoft Foundation Class Library (MFC) provides. (Don't ask what ate up the rest of my time. . .) If, after reading this article, you are not convinced that MFC is the coolest thing since sliced bread, you are probably more interested in heli-skiing or reading comic strips than you are in working with computers, which is perfectly fine with me.

What Is the Tool Good for?

Let us first play with YAHU a little bit to see what it actually does. From the File menu, choose the Open command and select any executable file. YAHU determines whether the file is an MS-DOS, a 16-bit Windows®, or a 32-bit Windows executable file or DLL, a VxD, or none of the above. Depending on the outcome, YAHU creates one or more multiple-document interface (MDI) child windows in its client area.

A second pane in the status bar displays the name and type of the file that is associated with the active window.

Finally, YAHU displays a permanent window that shows a list of currently running processes. This window tells you which processes are currently available and allows you to kill individual processes if necessary.

Application Architecture

YAHU is an MFC application that includes many classes. The figure below shows the class hierarchy within YAHU.

YAHU class hierarchy

In the figure above, all the base classes provided by MFC are shown with a grey background. To add support for other file formats, you need to add the following:

We will look at extending the application in the final section of this article.

The classes do the following:

The CViewfileDoc class, being a derivative of the CDocument class, is the "container" of a document—whenever you open a file, the application creates a new document. The document maps the executable file into the application's address space and exports member functions to access the file mapping. Also, the code in the OnOpenDocument member function determines the document's file type and creates the appropriate view(s) by calling CreateNewFrame on one or more document templates.

Each file type is associated with a frame class that is derived from CChildFrame, which is a frame class that associates a splitter window with the frame. Each pane in the splitter window is associated with a view that is derived from CFileView, which is a base class that roughly associates a scrollable list box with its client area. An entry in one of the panes is basically an item in the associated list box.

The CRawView and CRawFrame classes implement the hex dump windows that are automatically opened for each document. There is nothing interesting about these classes, really. The only difference between CRawFrame and CMDIChildWnd is that CRawFrame displays its client window minimized by default; it accomplishes this by overriding the OnCreateClient member function:

BOOL CRawFrame::OnCreateClient(LPCREATESTRUCT lpcs, CCreateContext* pContext)
{
  if (!CMDIChildWnd::OnCreateClient(lpcs, pContext)) return FALSE;
  ShowWindow(SW_MINIMIZE);
  return TRUE;
}

Ninety-nine percent of CRawView is provided courtesy of Nigel Thompson, and 1 percent is cut-and-paste on my side, honestly. I don't have the slightest idea what Nigel did—I had no inclination to rethink the process for converting bytes to ASCII representations of their hex values and displaying them in a window, so I recycled Nigel's code. Had it taken me any longer than 15 minutes to generate the classes and paste Nigel's code into the classes, I wouldn't have done it.
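For readers who would rather see the byte-to-hex conversion than take it on faith, here is a minimal, portable sketch of the idea. It is not Nigel's code and not what CRawView does internally; the names FormatHexLine and HexDump are hypothetical, and the line layout (offset, hex pairs, ASCII column) is simply a common convention I am assuming.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: formats one dump line as
// "<8-digit offset>  <hex pairs>  <printable ASCII>".
std::string FormatHexLine(const std::uint8_t* data, std::size_t size,
                          std::size_t offset, std::size_t bytesPerLine = 16)
{
    assert(bytesPerLine > 0 && offset <= size);
    char buf[32];
    std::snprintf(buf, sizeof buf, "%08zX ", offset);
    std::string line(buf);
    std::string ascii;
    for (std::size_t i = 0; i < bytesPerLine; ++i) {
        if (offset + i < size) {
            const unsigned b = data[offset + i];
            std::snprintf(buf, sizeof buf, " %02X", b);
            line += buf;
            // Non-printable bytes are shown as '.' in the ASCII column.
            ascii += (b >= 0x20 && b < 0x7F) ? static_cast<char>(b) : '.';
        } else {
            line += "   ";  // pad a short final line so the ASCII column aligns
        }
    }
    return line + "  " + ascii;
}

// Hypothetical helper: dumps a whole buffer, one string per line.
std::vector<std::string> HexDump(const std::vector<std::uint8_t>& bytes,
                                 std::size_t bytesPerLine = 16)
{
    std::vector<std::string> lines;
    for (std::size_t off = 0; off < bytes.size(); off += bytesPerLine)
        lines.push_back(FormatHexLine(bytes.data(), bytes.size(), off, bytesPerLine));
    return lines;
}
```

In a view like CRawView, each returned string would presumably become one line of the display; how the real class paints its lines is not covered here.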

Finally, the CPViewDoc, CPView and CNoSysFrame classes are the building blocks of the process list. CPViewDoc is almost identical to CDocument; the only difference is that the OnNewDocument member function of CPViewDoc sets the title of the document to "Process List." CNoSysFrame is almost identical to CMDIChildWnd; the only difference is that CNoSysFrame overrides the PreCreateWindow member function to ensure that the window is displayed without a System menu (because I wanted to make the process list window permanent).

Modularity Considerations

The key to the modularity of YAHU is that each view is responsible for one logical section of the file header. For example, if YAHU detects that a file is a 32-bit PE file, it generates a splitter window consisting of six panes (therefore, six CPEView instances, because each pane accommodates one view). One view displays the resources of a PE file, another view displays the import and export tables, and so on. The associated frame is responsible for "sorting out" the sections of the executable file and dispatching them to the appropriate views. That is, the frame "knows" the sections that a file format comprises and how to obtain the relative starting address of each section, and then calls a member function of one of the panes. The member function dissects that section and displays the information in the list box associated with the pane.

Thus, YAHU has a three-layer "granularity" that consists of the document, frame, and view classes:

The logic that sorts out the file types and dispatches the appropriate frames is shown below. Note that almost every executable file that is known to a Microsoft operating system begins with an MS-DOS file header. Thus, for portable executables, linear executables (LE), new executables, and MS-DOS executables, YAHU generates an MS-DOS header window as well as a window that contains specific header information. The logic in CViewfileDoc that dispatches windows depending on the file type creates one CDOSFrame object for every document that is created (code from CViewfileDoc::OnOpenDocument):

CFrameWnd *pNewFrame=NULL;

   // First check to see if there is an MS-DOS header.
   if (CDOSView::IsMyKindOfFile(m_lpImage))
      {
        wsprintf(szStatusMessage,"DOS signature found for %s",lpszFileName);
        pNewFrame = theApp.m_pDOSTemplate->CreateNewFrame(this,NULL);
        if (pNewFrame)
          theApp.m_pDOSTemplate->InitialUpdateFrame(pNewFrame, this, TRUE);
      }

   // Now look for additional info. Note that an MS-DOS header has already been 
   // created here...
   if (CPEView::IsMyKindOfFile(m_lpImage))
      {
        wsprintf(szStatusMessage,"PE signature found for %s",lpszFileName);
        pNewFrame = theApp.m_pPETemplate->CreateNewFrame(this,NULL);
        if (pNewFrame)
          theApp.m_pPETemplate->InitialUpdateFrame(pNewFrame, this, TRUE);
      }
   else if (CNEView::IsMyKindOfFile(m_lpImage)) 
      {
        wsprintf(szStatusMessage,"NE signature found for %s",lpszFileName);
        pNewFrame = theApp.m_pNETemplate->CreateNewFrame(this,NULL);
        if (pNewFrame)
          theApp.m_pNETemplate->InitialUpdateFrame(pNewFrame, this, TRUE);
      }
   else if (CLEView::IsMyKindOfFile(m_lpImage))
      {
        wsprintf(szStatusMessage,"LE signature found for %s",lpszFileName);
        pNewFrame = theApp.m_pLETemplate->CreateNewFrame(this,NULL);
        if (pNewFrame)
          theApp.m_pLETemplate->InitialUpdateFrame(pNewFrame, this, TRUE);
      }
   else if (!pNewFrame)
      {
        wsprintf(szStatusMessage,"No known signature found for %s",lpszFileName);
        pNewFrame = NULL;
      }
   m_PriorityFrame=pNewFrame; // last frame created...

m_PriorityFrame is a member variable of the CViewfileDoc class. If there are multiple frames, this member variable indicates which of these should receive the initial focus. When a new document has been created, the custom member function CViewfileDoc::ActivateTheRightFrame is invoked, which activates the priority frame. This is necessary because MFC will, by default, always create one frame to go with the document, and, when the application returns from OnOpenDocument, will activate this first frame. This frame is, conveniently enough, CRawFrame, because in the list of document templates generated during startup, CRawFrame comes first. If we didn't use m_PriorityFrame, the active view would always be the hex dump view, which is not what I wanted, with all due respect to Nigel's code.

The other advantage of CRawFrame coming first in the list of generated templates is that if the file does not have any known signature, YAHU displays only the hex dump window. Once again, YAHU provides this functionality without a single additional line of code.

Note that because the MS-DOS header, by definition, always starts at file offset 0, the logic in CDOSFrame can accommodate other file types with no modification. In other words, once the logic for the MS-DOS frame was implemented, associating a PE, NE, or LE file with the appropriate MS-DOS frame came for free—I did nothing but add two lines to the code above. This is software designer's heaven, I think—making use of the code that is already there with a single function call—no tricks, no traps, no sales pitches.
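The signature checks themselves are simple enough to sketch in portable C++. The following is an illustration of the general technique, not the code behind the IsMyKindOfFile members: ExeKind, ReadLE, and ClassifyImage are hypothetical names, and the image is assumed to be an in-memory byte vector rather than YAHU's file mapping. The signature values and the position of the e_lfanew field (offset 0x3C of the MS-DOS header) are those of the documented header formats.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class ExeKind { Unknown, DosOnly, NE, LE, PE };

// Reads an n-byte little-endian value without assuming alignment.
static std::uint32_t ReadLE(const std::vector<std::uint8_t>& img,
                            std::size_t off, std::size_t n)
{
    assert(n <= 4 && off + n <= img.size());
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < n; ++i)
        v |= static_cast<std::uint32_t>(img[off + i]) << (8 * i);
    return v;
}

ExeKind ClassifyImage(const std::vector<std::uint8_t>& img)
{
    const std::size_t kDosHeaderSize = 0x40;  // e_lfanew occupies 0x3C..0x3F
    if (img.size() < kDosHeaderSize || ReadLE(img, 0, 2) != 0x5A4D)  // "MZ"
        return ExeKind::Unknown;

    // e_lfanew points at the "new" header; plain MS-DOS files may hold
    // zero or garbage here, so the offset is bounds-checked before use.
    const std::uint32_t newHdr = ReadLE(img, 0x3C, 4);
    if (newHdr == 0 || newHdr > img.size() || img.size() - newHdr < 2)
        return ExeKind::DosOnly;

    const std::uint32_t sig2 = ReadLE(img, newHdr, 2);
    if (sig2 == 0x454E) return ExeKind::NE;   // "NE"
    if (sig2 == 0x454C) return ExeKind::LE;   // "LE"
    if (img.size() - newHdr >= 4 && ReadLE(img, newHdr, 4) == 0x00004550)
        return ExeKind::PE;                   // "PE\0\0"
    return ExeKind::DosOnly;
}
```

Every result other than Unknown implies an MS-DOS header at offset 0, which mirrors why the MS-DOS frame comes along for free with the PE, NE, and LE cases.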

If you don't like the order in which the information is presented in the panes, you can simply change the code in the respective OnCreateClient member functions of the frame classes. We will look at that in more detail later in this article.

Process Monitoring

One of the features I had put on the wish list for YAHU was the ability to monitor dynamic loading and unloading of DLLs in a given process. One of the most common problems you encounter when running Windows-based applications is that you may have several versions of a DLL on your disk, and it is not always easy to figure out which version of the DLL was loaded by your application. If your application faults or behaves erratically, knowing the DLLs that your application loads and their paths can save you a significant amount of time. Ideally, YAHU should be able to tell you exactly which DLL is loaded or unloaded from a process, at what time, and where the image of the DLL resides.

Two problems hinder YAHU's (or any similar utility's) progress in accomplishing this goal:

So how do we monitor DLL loading and unloading activity in 16-bit processes under Windows NT? I have good news and bad news. In the Development Library, under Unsupported Tools and Utilities, Windows Tools, you will find a tool called WPS, which was written for Windows 3.x. WPS scans the module list and reports the full path names of all loaded DLLs. In Windows 3.x, DLLs share a single address space, so WPS can grab all DLLs in one global scan.

The bad news is that WPS uses undocumented information about internal system data structures, so I cannot give you the code for WPS, nor can I use the information myself in YAHU. Sorry.

The good news is that the copy of the Windows 3.x kernel executed in NTVDM.EXE so closely resembles the real Windows 3.x kernel that the same undocumented data structures still exist under Windows NT, and WPS appears to work fine in an NTVDM process.

Now I have more bad news. Under Windows NT 3.1, WPS was sufficient to display all 16-bit DLLs because all 16-bit applications were executed in (and, therefore, all 16-bit DLLs were loaded into) a single NTVDM process. Under Windows NT 3.5, however, you can run a 16-bit application in its own separate address space, which creates a new instance of an NTVDM/WOW process dedicated to executing the 16-bit application. Thus, WPS cannot monitor a 16-bit application that is executed in its own address space unless the 16-bit application has a built-in facility for launching other executables. If that facility exists, you can use it to load WPS into the address space of that process. If that facility does not exist, the only way for you to determine the DLLs loaded by a 16-bit application under Windows NT 3.5 is to launch that application and WPS in the same "standard" NTVDM.

Extending YAHU to Custom File Formats

As I mentioned earlier, you can extend YAHU's functionality fairly easily to work on other file formats such as MS-DOS device drivers, POSIX executables, RIFF files (for example, .WAV or .MID files), OLE structured storage files, and so on. In a fully operational, OLE-based operating system, you could build the functionality for retrieving header information in a custom OLE object that interacts with YAHU in a well-defined way, and simply design a custom control for the desired header format.

For the time being, however, extending YAHU to other file formats requires adding code to YAHU, as I will explain below.

Let's assume that you wish to add support for a custom file format used by document files with a .DCT extension. Follow the steps below.

Step 1. Using ClassWizard, create a new class (CDCTView, or whatever you wish to call it) derived from CFileView. (Include FILEVIEW.H in the source file for the new class.)

Step 2. Use ClassWizard again to create a new class (for example, CDCTFrame) derived from CChildFrame. (Include CHILDFRM.H in the implementation file for that class.)

Step 3. The class derived from CFileView should support the following member functions:

In addition to these members, the view typically contains functions that decode file contents and display them in the window. If you wish to display anything, simply use the CFileView members m_szBuf and AddStringandAdjust; for example, as follows:

void CDCTView::DisplayDummy()
{
  wsprintf(m_szBuf,"Dummy string");
  AddStringandAdjust(m_szBuf);
}

This will add the entry "Dummy string" to CDCTView and readjust the scroll bars of the list box that is associated with the view.

Step 4. The CDCTFrame class should support the following member function that is generated with ClassWizard:

BOOL OnCreateClient(LPCREATESTRUCT lpcs, CCreateContext* pContext)

The OnCreateClient function should perform the following operations:

We will look at a typical implementation of the OnCreateClient member function after we complete the last two steps.

Step 5. In VIEWFILE.H, add a member variable to accommodate your template, and in VIEWFILE.CPP, change the InitInstance member function by adding a new document template, as shown below. (Do not put the code at the beginning of the function; m_pRawTemplate must be registered first.)

VIEWFILE.H file:

public:
.
.
.
CMultiDocTemplate *m_pViewTemplate;
CMultiDocTemplate *m_pCDCTTemplate;   // ...Or whatever name you choose...
.
.
.

VIEWFILE.CPP file:

.
.
.
m_pCDCTTemplate = new CMultiDocTemplate(
  IDR_CDCTTYPE,                 // Associate this with menus, icons, etc.
  RUNTIME_CLASS(CViewfileDoc),  // Our file document
  RUNTIME_CLASS(CDCTFrame),     // The new frame type
  RUNTIME_CLASS(CDCTView));     // Our new view, but this is pretty bogus anyway.
AddDocTemplate(m_pCDCTTemplate);   // Register the new template.

Step 6. In VIEWFDOC.CPP, edit the CViewfileDoc::OnOpenDocument member function to spawn a new frame when it encounters a document of the appropriate type. The code will probably look something like this:

if (CDCTView::IsMyKindOfFile(m_lpImage))
{
  wsprintf(szStatusMessage,"My custom signature found for %s",lpszFileName);
  pNewFrame = theApp.m_pCDCTTemplate->CreateNewFrame(this,NULL);
  if (pNewFrame)
    theApp.m_pCDCTTemplate->InitialUpdateFrame(pNewFrame,this,TRUE);
}

The szStatusMessage member holds a string that is displayed whenever a window that is associated with the current document is activated.

If you have followed the instructions up to this point, you should be able to run YAHU and select a file of your custom type. YAHU will display a hex view of that file and a new empty frame with as many splitter panes as you specified. Displaying the information and reacting to user input is up to the logic in your view and frame classes. You don't have to edit any more existing files after this point.

Displaying Information

Let's look at the CNECFrame::OnCreateClient member function to see what it does to dump header information for custom classes. I will comment on the code as we go along.

As we discussed before, the code sets up the appropriate number of rows and columns in your splitter window and has the base class create the frame for you:

BOOL CNECFrame::OnCreateClient(LPCREATESTRUCT lpcs, CCreateContext* pContext) 
{  
 m_iNumberRows = 3;
 m_iNumberCols = 2;
 m_ViewClass = RUNTIME_CLASS(CNEView);
 if (!CChildFrame::OnCreateClient(lpcs,pContext)) return FALSE;

In the next code fragment, we use structured exception handling because the new executable header file format contains indirect references. If the file is corrupt, or looks like a new executable but actually isn't, trying to dereference those relative structures may cause GP faults. By wrapping the entire code into structured exception handling and catching GP faults in the __except clause, we allow the code to recover gracefully from working on bad file images.

 __try {

Let's display the new header information (as opposed to the resources, the import and export information, and so on) in the upper left pane (pane (0, 0)):

CNEView *cfMyPointer = (CNEView *)m_wndSplitter.GetPane(0,0);

Below, we call a member function in the CFileView class that knows how to take a data structure and disassemble it into its members. The FillInTaggedData function internally calls AddStringandAdjust to fill its list box. Note that some functions that are fairly generic are defined in the CFileView base class so that all derived classes can use them, whereas some of the more specialized functions are defined in the derived classes. You might want to play with the DisplayDummy function we defined earlier to see what happens here. Note that we decide which pane the output goes to by selecting a pane and then calling a display member function of the view associated with that pane. This can be changed easily at any time. Note also that the CViewfileDoc member functions AdjustPointerAbsolute and AdjustPointerRelative provide us with the pointers into the file image:

       PIMAGE_DOS_HEADER lpImage = (PIMAGE_DOS_HEADER)m_AssociatedDocument
                                    ->AdjustPointerAbsolute(0);
       PIMAGE_OS2_HEADER lpNewHeader = (PIMAGE_OS2_HEADER)(unsigned char *)
                  m_AssociatedDocument->AdjustPointerAbsolute(lpImage->e_lfanew);
       cfMyPointer->FillInTaggedData((unsigned char *)lpNewHeader,&tlNEHeader);
       int iModuleEntries = lpNewHeader->ne_cmod;
       HEADERTEMPLATE hT = {"Imported Names: ","%s"};
       unsigned char *pImportTable;
       pImportTable = m_AssociatedDocument->AdjustPointerRelative
                      (lpNewHeader->ne_imptab);
       unsigned char *pModuleTable;
       pModuleTable = (unsigned char *)lpNewHeader;
       pModuleTable+=((PIMAGE_OS2_HEADER)pModuleTable)->ne_modtab;
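To make the idea of "taking a data structure and disassembling it into its members" concrete, here is a minimal, portable sketch of a table-driven dump. It is only an illustration of the technique: FieldTag and DumpTaggedFields are hypothetical names, and I am not claiming that FillInTaggedData or the tlNEHeader table are laid out this way. The sketch returns strings where a view would presumably call AddStringandAdjust.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical tag: a label plus the position and width of one field.
struct FieldTag {
    const char* label;
    std::size_t offset;  // byte offset from the start of the structure
    std::size_t size;    // field width in bytes: 1, 2, or 4
};

// Walks the tag table and renders each little-endian field as
// "label: 0x...". Fields that run past the available bytes are flagged
// instead of being read.
std::vector<std::string> DumpTaggedFields(const std::uint8_t* base,
                                          std::size_t available,
                                          const std::vector<FieldTag>& tags)
{
    std::vector<std::string> lines;
    for (const FieldTag& tag : tags) {
        assert(tag.size == 1 || tag.size == 2 || tag.size == 4);
        if (tag.offset > available || available - tag.offset < tag.size) {
            lines.push_back(std::string(tag.label) + ": <truncated>");
            continue;
        }
        std::uint32_t value = 0;
        for (std::size_t i = 0; i < tag.size; ++i)
            value |= static_cast<std::uint32_t>(base[tag.offset + i]) << (8 * i);
        char buf[16];
        std::snprintf(buf, sizeof buf, "0x%0*X",
                      static_cast<int>(tag.size * 2),
                      static_cast<unsigned>(value));
        lines.push_back(std::string(tag.label) + ": " + buf);
    }
    return lines;
}
```

The appeal of this design is that supporting another header format means writing another table, not another decoding function.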

Now we move on to the next pane and display some stuff here. You get the idea, so I won't go into details. It's more or less a repeating game of "select a pane, move the file pointer to the next section, and call a member function in the view to display the relevant data."

       cfMyPointer = (CNEView *)m_wndSplitter.GetPane(0,1);
       cfMyPointer->FillInChainedStructures(iModuleEntries, &hT,
                                            (WORD *)pModuleTable,pImportTable);

       // ... display more stuff until all the information has been extracted ...
      }
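The module table and imported-names table used above have a simple layout in the new executable format: the module-reference table is an array of 16-bit offsets into the imported-names table, and each name is stored as a length byte followed by that many characters (no terminating zero). The sketch below walks those two tables in portable C++. ReadImportedModuleNames is a hypothetical function; I am assuming FillInChainedStructures does something broadly similar, but this is not its code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// moduleTable must hold moduleCount little-endian WORD entries; each entry
// is an offset relative to the start of importTable. Decoding stops at the
// first entry that points outside the table, so a corrupt image yields a
// shorter list instead of an out-of-bounds read.
std::vector<std::string> ReadImportedModuleNames(const std::uint8_t* moduleTable,
                                                 std::size_t moduleCount,
                                                 const std::uint8_t* importTable,
                                                 std::size_t importTableSize)
{
    assert(moduleTable != nullptr && importTable != nullptr);
    std::vector<std::string> names;
    for (std::size_t i = 0; i < moduleCount; ++i) {
        const std::size_t off =
            moduleTable[2 * i] | (static_cast<std::size_t>(moduleTable[2 * i + 1]) << 8);
        if (off >= importTableSize)
            break;                               // offset outside the table
        const std::size_t len = importTable[off];  // length-prefixed name
        if (importTableSize - off - 1 < len)
            break;                               // name runs past the table
        names.emplace_back(reinterpret_cast<const char*>(importTable + off + 1), len);
    }
    return names;
}
```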

Finally, the __except clause at the end, as I explained before, allows us to recover gracefully from corrupted file images:

 __except (GetExceptionCode() == EXCEPTION_ACCESS_VIOLATION
               ? EXCEPTION_EXECUTE_HANDLER : EXCEPTION_CONTINUE_SEARCH)
   { 
    AfxMessageBox("corrupted file; cannot display all information");
   }
 return TRUE;
};
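Structured exception handling is specific to Microsoft compilers. If you wanted the same graceful recovery in portable code, one alternative (a different technique from the one YAHU uses) is to bounds-check every access to the file image and report failures with a C++ exception. The sketch below assumes the image is held in a byte vector; FileImage, At, ReadWord, and TryReadWord are hypothetical names and have nothing to do with the AdjustPointerAbsolute and AdjustPointerRelative members of CViewfileDoc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

class FileImage {
public:
    explicit FileImage(std::vector<std::uint8_t> bytes) : m_bytes(std::move(bytes)) {}

    // Returns a pointer to `length` readable bytes at `offset`, or throws
    // if the request does not fit inside the image.
    const std::uint8_t* At(std::size_t offset, std::size_t length) const
    {
        assert(length > 0);  // a zero-length request is a caller bug
        if (offset > m_bytes.size() || m_bytes.size() - offset < length)
            throw std::out_of_range("file image is truncated or corrupt");
        return m_bytes.data() + offset;
    }

    // Reads one little-endian 16-bit value through the checked accessor.
    std::uint16_t ReadWord(std::size_t offset) const
    {
        const std::uint8_t* p = At(offset, 2);
        return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
    }

private:
    std::vector<std::uint8_t> m_bytes;
};

// Plays the role of the exception filter above: a bad offset becomes a
// `false` return instead of a fault.
bool TryReadWord(const FileImage& image, std::size_t offset, std::uint16_t& out)
{
    try {
        out = image.ReadWord(offset);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

The trade-off is that every indirect reference has to go through the checked accessor, whereas the structured exception handling approach lets the decoding code dereference pointers freely and catches the fault once at the top.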

Value-Added Functionality

Now we have all the tools we need to display header information in a custom window. What if we want more? Where, for example, do we add the functionality that allows YAHU to execute an application and monitor its DLL usage? How do we implement custom extensions? If we wanted to extend YAHU to display header information for RIFF files, we might want to add a Play command to invoke a sound playback or video display routine. Where would we do it?

One of the really cool things about Visual C++™ (and inherited by MFC) is that you can keep all of these features strictly local. The changes you make to add new functionality need only go into the view class (in our example, CDCTView).

Let us assume that your document type is designed to be processed by your application, BESTAPP.EXE (which, unfortunately, doesn't ship with a free .EXE header utility any longer, remember?). What if we wanted to add the custom command "Run BESTAPP" to the File menu to load a copy of the file that is currently displayed? Nothing easier than that. First, load VIEWFILE.RC into Visual C++, and add a custom menu to the identifier that goes with your document template. (Referring to our earlier example, that menu would be associated with the identifier IDR_CDCTTYPE.) Add the Run BESTAPP command to the menu. (Now that you're in the .RC file, you might also want to add an icon for an IDR_CDCTTYPE frame and extend the IDR_RAWTYPE string to add your custom .DCT files to the file list in the MFC Open File dialog. Remember, it's all free—you don't need to do anything but modify the resources. I just love MFC!)

Next, add a member function to the CDCTView class that handles the Run BESTAPP command. That function would probably call CreateProcess on BESTAPP.EXE with the current file name as a command-line argument, or call ShellExecute on the file name itself (if there is an association between a .DCT file and BESTAPP.EXE in the Registry).

If you want the user to be able to initiate further processing by double-clicking an entry in either pane, simply add code to the DispatchDoubleClick member function of the view class. You might want to check the code in CPEView::DispatchDoubleClick for an example of how to do this.

Summary

YAHU is a good example of how MFC lets you do a number of very impressive things in a very short time (especially if you take other people's code and collapse several different applications into one). By sorting out functionalities between views, frames, and documents, you can account for the similarities and differences between file formats in an easy, clean, and extensible way.

You can extend YAHU in many ways. Items I have on my list (but haven't bothered to implement yet) include in-line display and editing of resources, and a more generous set of value-added options for existing file formats (for example, gathering more profile information at run time). Other features you can implement include extending the set of file formats that can be processed, for example, to POSIX executables or RIFF files.

The YAHU user interface is currently not very user-friendly. Ideally, YAHU should not have multiple panes, and all the information about the executables should be displayed in a tree view control. At the time I wrote this article, MFC did not support tree view controls, and I didn't really feel like reinventing the wheel. However, to implement a different user interface, you would need to make changes only to the view and frame base classes CFileView and CChildFrame.

As a final note, please be aware that Windows 95 will provide built-in functionality (such as file viewers) that will make a few of YAHU's tasks obsolete. For more information, see "Creating File Viewers in Windows 95" by Nancy Cluts in the MSDN Library.