Algorithms

Significant Linkage: Implementing Linked Lists in VBA

By Rod Stephens

Visual Basic for Applications features arrays that are convenient for storing small lists of items. You can easily add and remove items from an array and, using the ReDim statement, you can even change the number of items the array holds.

Rearranging the items in an array can be cumbersome, however. If you want to add a new item at the beginning of an array, you need to move all the other items over one position to make room. If the array is large, this can be a time-consuming operation. Similarly, removing an item from the front of an array can be awkward.

This article describes a flexible data structure called a linked list. Using a linked list, you can easily add and remove items at any position.

Cell Structure

The data items in a linked list are stored in structures called cells. The easiest way to implement cells in VBA is to use a class defined in a class module. Public variables declared by the class provide the basic structure of a cell.

The cell class must define some sort of variable to hold the data represented by the cell. For example, a cell that used to store a list of strings might define a StringValue variable.

The cell class must also define a variable that is a reference to the next cell in the linked list. This variable provides the link between one cell and the next. The following code shows the variables declared by a simple StringCell class:

Public StringValue As String   ' The cell's data.
Public NextCell As StringCell  ' The next cell in the list.

Using the StringCell class, a VBA program can build a list of strings. It uses the New keyword to create new instances of the class. It sets the cells' StringValues and uses their NextCell values to link the cells. The example code shown in FIGURE 1 builds a linked list containing three cells and displays the list in a message box.

Public Sub MakeList()

  Dim top_cell As StringCell
  Dim new_cell As StringCell
  Dim txt As String

  ' Start with an empty list.
  Set top_cell = Nothing
  
  ' Add string 1 to the top of the list.
  Set new_cell = New StringCell
  new_cell.StringValue = "Dog"
  Set new_cell.NextCell = top_cell
  Set top_cell = new_cell
  
  ' Add string 2 to the top of the list.
  Set new_cell = New StringCell
  new_cell.StringValue = "Cat"
  Set new_cell.NextCell = top_cell
  Set top_cell = new_cell
  
  ' Add string 3 to the top of the list.
  Set new_cell = New StringCell
  new_cell.StringValue = "Horse"
  Set new_cell.NextCell = top_cell
  Set top_cell = new_cell
  
  ' Display the list.
  txt = ""
  Set new_cell = top_cell
  Do While Not (new_cell Is Nothing)
    txt = txt & new_cell.StringValue & vbCrLf
    Set new_cell = new_cell.NextCell
  Loop
  MsgBox txt
End Sub

FIGURE 1: This code creates a linked list of StringCell objects and then displays the list.

The code in FIGURE 1 is typical of many programs that use linked lists. A reference to the top of the list is initialized to Nothing. To add a new cell to the list, the code sets the new cell's NextCell variable to point to the top item. It then resets top so it points to the new item. This makes new items appear at the top of the list. When the code displays the items, they are listed in the reverse of the order in which they were added to the list.

FIGURE 2: A graphic representation of the linked list produced by the code in FIGURE 1.

The Life and Death of a List

Any program that builds complex structures using object references must always keep a handle on the data structure. In this case, that means the program must keep a reference to the top of the list. If the program loses its reference to the top, it cannot access the items in the list. The code in FIGURE 1 uses the reference top_cell to keep track of the top of the list. If the code set top_cell to Nothing, it would no longer be able to find the cells in the list.

Dropping a list like this is bad if the program still needs it, but it's a convenient way for the program to destroy the list when it's no longer required. Every object in VBA has a reference count. When a variable like new_cell or top_cell is set to point to a particular object, VBA increases that object's count by one. When a reference is removed, as happens when the program sets new_cell to Nothing, VBA decreases the object's count by one. At that time, if the object's count is zero, VBA knows the program can no longer reference the object, so VBA destroys the object.

VBA is also smart enough to know that any variable declared locally within a subroutine is removed when the subroutine ends. For example, when the subroutine in the code shown in FIGURE 1 ends, the top_cell reference is automatically removed.

Now, consider how reference counting applies to a linked list. Most of the list's cells are pointed to by another cell. When the program's last reference to the list is set to Nothing, however, the top-most cell has no references. Because the top-most cell's reference count is now zero, VBA destroys it.

The top-most cell contained the only reference to the second cell in the list. When VBA destroys the top-most cell, it reduces the second cell's reference count to zero, so it destroys that cell too. This process continues with VBA destroying one cell after another until the whole list has been destroyed.

This process is important. It means a program can simply remove its last reference to a linked list to make the entire list go away. If the list were not properly destroyed, it would continue to occupy memory even though the program could no longer access it. The program would have a "memory leak" and, over time, it might tie up a large amount of the system's memory.

You can verify that the list cells are correctly created and destroyed by adding code to the list cell class. The Class_Initialize event occurs when a cell is created. The Class_Terminate event occurs when a cell is destroyed. The following code prints statements into the Immediate window to indicate when cells are created and destroyed. If the number of created and destroyed statements are not the same, the program is not correctly freeing all of its memory.

Private Sub Class_Initialize()
  Debug.Print "Cell created"
End Sub

Private Sub Class_Terminate()
  Debug.Print "Cell destroyed"
End Sub

List Management

Adding or removing a cell from a linked list is easy. The code in FIGURE 1 shows how a program can add new cells to the top of a linked list in just a few statements.

It is just as easy to add a new cell after a particular cell in a linked list. The program simply sets the new cell's NextCell value so it points after the existing cell. It then sets the existing cell's NextCell value so it points to the new cell. FIGURE 3 shows the process graphically. The following code inserts the cell new_cell after the cell after_me:

FIGURE 3: Inserting a cell in a linked list is easy.

Set new_cell.NextCell = after_me.NextCell
Set after_me.NextCell = new_cell

Removing a cell from the top of a linked list is just as easy. The program simply sets the top_cell variable to point to the second cell as in the following code.

Set new_cell.NextCell = after_me.NextCell
Set after_me.NextCell = new_cell

When top_cell no longer references the first cell in the list, that cell's reference count is reduced to zero, so VBA automatically destroys it. FIGURE 4 shows the process graphically.

FIGURE 4: Removing a cell from the top of a linked list is also easy.

Removing a cell from the middle of a linked list is also simple. The program sets the NextCell value of the cell, before the target cell, to the target cell's NextCell. The following code shows how a program can remove the cell after the cell after_me.

Set after_me.NextCell = after_me.NextCell.NextCell

In this case, the target cell's reference count is reduced to zero, so VBA automatically deletes it. The process is shown graphically in FIGURE 5.

FIGURE 5: Removing a cell from the middle of a linked list.

Sentinels

The previous code had to handle the top and the middle of the list differently. Because there is no item in front of the top of the list, removing the top item is different from removing an item after another item in the middle of the list.

The program can use a sentinel cell at the top of the list to allow it to treat both situations the same way. The sentinel item takes the place of the top_cell pointer. The sentinel's NextCell value references the first real item in the list.

Using the sentinel, the program can treat the special cases of adding or removing an item at the top of the list the same way it treats these operations in the middle of the list. It can add or remove the item after the sentinel.

The code in FIGURE 6 shows the code from FIGURE 1 modified to use a sentinel instead of a top_cell variable.

Public Sub MakeList()

  Dim sentinel As New StringCell
  Dim new_cell As StringCell
  Dim txt As String

  ' Start with an empty list.
  Set sentinel.NextCell = Nothing
  
  ' Add string 1 to the top of the list.
  Set new_cell = New StringCell
  new_cell.StringValue = "Dog"
  Set new_cell.NextCell = sentinel.NextCell
  Set sentinel.NextCell = new_cell
  
  ' Add string 2 to the top of the list.
  Set new_cell = New StringCell
  new_cell.StringValue = "Cat"
  Set new_cell.NextCell = sentinel.NextCell
  Set sentinel.NextCell = new_cell
  
  ' Add string 3 to the top of the list.
  Set new_cell = New StringCell
  new_cell.StringValue = "Horse"
  Set new_cell.NextCell = sentinel.NextCell
  Set sentinel.NextCell = new_cell
  
  ' Display the list.
  txt = ""
  Set new_cell = sentinel.NextCell
  Do While Not (new_cell Is Nothing)
    txt = txt & new_cell.StringValue & vbCrLf
    Set new_cell = new_cell.NextCell
  Loop
  MsgBox txt

End Sub

FIGURE 6: This code creates a linked list using a sentinel.

A Classy List

To hide some of a linked list's details, a VBA program can create a linked-list class. This class contains a sentinel that it uses to manage a linked list. The main program creates a list object and uses its procedures to manage the list indirectly. This makes list management easier for the main program.

It also makes it easier to have more than one linked list. The main program simply creates two linked-list objects, and they manage their lists separately. If the program managed the lists itself, it would need two sentinels and two sets of routines for managing the lists. A separate list class makes the process much simpler. FIGURE 7 shows the code used by the StringList class to implement a linked list of strings.

Option Explicit

Private sentinel As New StringCell

' Add a new value to the top of the list.
Public Sub Add(new_value As String)
  Dim new_cell As New StringCell

  new_cell.StringValue = new_value
  Set new_cell.NextCell = sentinel.NextCell
  Set sentinel.NextCell = new_cell
End Sub

' Remove an item from the list.
Public Sub Remove(target_value As String)
  Dim after_me As StringCell
  Dim next_cell As StringCell

  ' Find the target cell.
  Set after_me = sentinel
  Set next_cell = after_me.NextCell
  Do
    ' If we reach the bottom, it's not here.
    If next_cell Is Nothing Then Exit Sub
    If next_cell.StringValue = target_value Then Exit Do
    Set after_me = next_cell
    Set next_cell = after_me.NextCell
  Loop
  
  ' Remove the target cell.
  Set after_me.NextCell = next_cell.NextCell
End Sub

' Return a textual representation of the list.
Public Function TextValue() As String
  Dim txt As String
  Dim cell As StringCell

  Set cell = sentinel.NextCell
  Do While Not (cell Is Nothing)
    txt = txt & cell.StringValue & vbCrLf
    Set cell = cell.NextCell
  Loop
  TextValue = txt
End Function

FIGURE 7: A linked list class.

The StringListForm dialog box, shown in FIGURE 8 and available for download, uses a StringList to let you manage a list of strings. Enter a string and click the Add button to add the string to the list. Enter a string and click the Remove button to remove the string.

Sorted Lists

With just a little more work, a program can turn a linked list into a sorted list. Instead of adding items at the top of the list, the program searches the list to see where new items belong. The program searches until it finds a cell with a value greater than or equal to the new value. It then inserts the new item before that cell.

To make the search easier, the list can include a sentinel at the bottom of the list with a value greater than any that might be added to the list. For text strings, the bottom sentinel can be given the value Chr$(255) since all normal strings come alphabetically before the character 255. When it searches for a new item's position, the program knows it will eventually find a cell with a value greater than the new value so it will not run past the end of the list.

Sorting the list also makes searching for items that are not in the list a bit faster. In an unordered list, the program must search all the way to the end of the list before it can conclude the target item is not present. In the sorted linked list, it only needs to search until it finds an item with value greater than the target item. Since the list is sorted, the target must come before that point.

FIGURE 9 shows the VBA code for the Add and Remove subroutines used by the SortedList class. The TextValue function is the same as the one used by the StringList class so it is not repeated here.

The SortedListForm dialog box demonstrates this class. It is similar to the StringListForm dialog box shown in FIGURE 8, except it stores strings in a SortedList object instead of a StringList. This keeps the items in the list sorted.

FIGURE 8: The StringListForm dialog box.

Option Explicit
' Sorted string linked list class.

Private sentinel As New StringCell
Private bottom_sentinel As New StringCell

' Add a new value to the top of the list.
Public Sub Add(new_value As String)
  Dim after_me As StringCell
  Dim new_cell As New StringCell

  ' See where the new value belongs.
  Set after_me = sentinel
  Do While (after_me.NextCell.StringValue < new_value)
    Set after_me = after_me.NextCell
  Loop
  
  ' Insert the item.
  new_cell.StringValue = new_value
  Set new_cell.NextCell = after_me.NextCell
  Set after_me.NextCell = new_cell
End Sub

' Remove an item from the list.
Public Sub Remove(target_value As String)
  Dim after_me As StringCell

  ' Find the target cell.
  Set after_me = sentinel
  Do While (after_me.NextCell.StringValue < target_value)
    Set after_me = after_me.NextCell
  Loop
  
  ' See if we found it.
  If after_me.NextCell.StringValue <> target_value Then
    Exit Sub
  End If 
  ' Remove the target cell.
  Set after_me.NextCell = after_me.NextCell.NextCell
End Sub

FIGURE 9: A sorted linked list class.

Doubly-Linked Lists

So far, all the list management routines work with the cell after another cell; they can add a cell after a given cell and they can remove the cell after a given cell. If the linked list cells include a PrevCell value indicating the previous item in the linked list, the program can manipulate cells either before or after a given cell. This makes certain operations easier. Lists that use cells with NextCell and PrevCell links are called doubly-linked lists.

For example, a program might want to move items up and down through the list. This is a bit easier using a doubly-linked list. The program finds the target cell, and then uses the NextCell and PrevCell links to move the item. Figure 10 shows VBA code that moves an item one position closer to the top of the list. The DoubleListForm dialog box shown in FIGURE 11 demonstrates a doubly-linked list class.

' Move an item towards the top of the list.
Public Sub MoveUp(target_value As String)
  Dim cell As DoubleStringCell
  Dim before As DoubleStringCell
  Dim after As DoubleStringCell

  ' Find the target cell.
  Set cell = sentinel.NextCell
  Do
    ' If we reach the bottom it's not here.
    If cell Is bottom_sentinel Then Exit Sub
    If cell.StringValue = target_value Then Exit Do
    Set cell = cell.NextCell
  Loop
  
  ' If cell.PrevCell is sentinel,
  ' this is the first item in the list.
  If cell.PrevCell Is sentinel Then Exit Sub
  
  ' Remove the target cell from its current position.
  Set before = cell.PrevCell
  Set after = cell.NextCell
  Set before.NextCell = after
  Set after.PrevCell = before
  
  ' Insert the cell in its new position.
  Set after = before
  Set before = before.PrevCell
  Set cell.PrevCell = before
  Set cell.NextCell = after
  Set before.NextCell = cell
  Set after.PrevCell = cell
End Sub

Figure 10: Code that moves an item closer to the top of a doubly linked list.

FIGURE 11: DoubleListForm uses the DoubleList class to let the user rearrange a list of items.

More Memory Management

Singly-linked lists die gracefully. If the program removes its last reference to the list, VBA automatically reclaims the entire list. That is not the case with a doubly-linked list. Since two adjacent cells hold links to each other, neither has a reference count of zero, so VBA will not destroy either.

The program must break this deadlock before it discards a doubly-linked list or the memory will be wasted. One way to break the reference deadlock is to run through the list setting every cell's PrevCell link to Nothing. Then, the list becomes a singly-linked list so it can be destroyed normally. The following code shows how the DoubleList class prepares its cells for destruction when it's destroyed.

' Break reference deadlocks so the cells can be destroyed.
Private Sub Class_Terminate()

  Dim cell As DoubleStringCell

  Set cell = sentinel
  Do While Not (cell Is Nothing)
    Set cell.PrevCell = Nothing
    Set cell = cell.NextCell
  Loop

End Sub
Conclusion

Arrays are fine for storing small, fixed-size lists that are seldom rearranged. When the data changes frequently or must often be reorganized, however, linked lists offer much greater flexibility. Using linked lists, you can add, remove, and rearrange items in a list quickly and easily.

Download source code for this article here.

Rod Stephens is the author of several books including Visual Basic Algorithms [John Wiley & Sons, 1996], Visual Basic Graphics Programming [John Wiley & Sons, 1997], and Custom Controls Library [John Wiley & Sons, 1998]. He also writes algorithm columns in Visual Basic Developer and Delphi Informant. See his programming tips and tricks at www.vb-helper.com or contact him at RodStephens@vb-helper.com.