Programming Best Practices with Microsoft Message Queuing Services (MSMQ)

Charles Sterling
Microsoft Corporation

June 1999

Summary: Discusses the best practices for building, troubleshooting, and testing distributed applications with Microsoft® Message Queuing Services (MSMQ). (11 printed pages)

Contents

Introduction
Eleven Guidelines for Writing Better MSMQ Applications
Troubleshooting Common MSMQ Problems

Introduction

Microsoft Message Queuing Services (MSMQ) enables applications running at different times to communicate across heterogeneous networks and systems that may be temporarily offline. To do so, it offers many ways of sending and receiving messages. However, this flexibility makes it easy to write inefficient applications. This article outlines efficient ways of writing MSMQ applications, testing MSMQ use, and troubleshooting issues encountered while writing the application.

This article assumes that the reader is familiar with MSMQ and has some MSMQ programming experience. All the sample code is in Microsoft Visual Basic®, but the principles apply to other languages as well.

Eleven Guidelines for Writing Better MSMQ Applications

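Here is a quick summary of the programming guidelines in this section. You should follow them when writing an MSMQ application.

  1. Do only local receives.

  2. Avoid functions that query the MQIS.

  3. Implement timeouts.

  4. Understand the limits of asynchronous notification.

  5. Know when and where to use transactions.

  6. Know when to use persistable COM objects.

  7. Understand what security context to use.

  8. Use queues smartly.

  9. Request acknowledgements or nonacknowledgements.

  10. Remember case sensitivity.

  11. Test your application with a full reboot while offline.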

Do Only Local Receives

MSMQ allows programmers to write code with complete location independence. This feature is powerful and useful, but receiving from a remote queue can be very costly to performance, and it fails outright if the receiving computer is disconnected from the computer hosting the queue. The failure occurs because MSMQ cannot receive messages from a remote queue while disconnected from the host computer or the site controller.

The performance dichotomy is a result of two factors:

For more information on MSMQ performance, see "Optimizing Performance in a Microsoft Message Queue Server Environment," available at http://www.microsoft.com/ntserver/appservice/deployment/planguide/msmqperformance.asp.

Another consideration to keep in mind for doing only local receives: In MSMQ 1.0, you cannot receive from a remote queue while in a transaction. For details, see the chapter "Transactional Remote Receive Semantics in MSMQ" in the article "Microsoft Message Queuing Services (MSMQ) Tips," available at http://msdn.microsoft.com/library/backgrnd/html/msmqtips.htm.

Avoid Functions that Query the MQIS

All information about public queues is stored in a data repository named MQIS. (In MSMQ version 1.0, this repository is housed in a SQL Server™ database.)

Most of the functions that can be used to open queues also query the data repository to verify the existence of the queue or to validate the permissions for the type of access being requested. (For example, by default a queue allows everyone "send" access, while only the owner has "receive" access.)

Many functions need data repository access while others do not. The key is to minimize the traffic for the functionality you need. A list of some of the strategies for opening queues follows, with the cost and some of the advantages and disadvantages associated with each strategy.

Implement Timeouts

The default timeouts in MSMQ are set to infinite, which can have disastrous effects. At first glance, this seems to be an easy issue to identify, but infinite timeouts cause problems in subtle ways. These timeouts include:

One symptom of an incorrectly set timeout (for either time to reach the queue or time to be received) is a machine that reacts increasingly slowly over time. Depending on the type of messages, the situation may or may not get better after a reboot. When a computer doesn't get better after a reboot, the problem involves recoverable or transacted messages, which are stored on disk and survive a restart. Also, in those situations, the MSMQ service takes much longer to start, since it needs to initialize more messages on each restart. To confirm that this is the problem, look at the Performance Monitor (PerfMon) counters for the MSMQ Queue object and verify that the queue is holding messages and that the number of messages agrees with what should be there.
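As a minimal sketch of setting these timeouts explicitly, the following code sets the time to reach the queue and the time to be received on an outgoing message, and passes a timeout to a receive call. The queue name reuses the sample server name from this article, and the timeout values and procedure names are illustrative assumptions, not recommendations. MaxTimeToReachQueue and MaxTimeToReceive are expressed in seconds; ReceiveTimeout is expressed in milliseconds.

```vb
' Requires a project reference to the Microsoft Message Queue Object Library.

Private Sub SendWithTimeouts()
    Dim qi As New MSMQQueueInfo
    Dim q As MSMQQueue
    Dim msg As New MSMQMessage

    qi.FormatName = "direct=os:eastway\test"
    Set q = qi.Open(MQ_SEND_ACCESS, MQ_DENY_NONE)

    msg.Label = "Timeout sample"
    msg.Body = "Hello"
    msg.MaxTimeToReachQueue = 60    ' Seconds allowed to reach the destination queue
    msg.MaxTimeToReceive = 300      ' Seconds allowed until the message is received
    msg.Send q
    q.Close
End Sub

' q is assumed to be a queue already opened with MQ_RECEIVE_ACCESS.
Private Sub ReceiveWithTimeout(ByVal q As MSMQQueue)
    Dim msg As MSMQMessage

    ' Wait at most 5 seconds instead of the infinite default.
    Set msg = q.Receive(ReceiveTimeout:=5000)
    If msg Is Nothing Then
        Debug.Print "No message arrived within the timeout."
    Else
        Debug.Print "Received: " & msg.Label
    End If
End Sub
```

A message that exceeds either limit is discarded instead of accumulating on the computer, which is the condition that the Performance Monitor check described above is meant to detect.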

Understand the Limits of Asynchronous Notification

Asynchronous notifications via WithEvents in Visual Basic can be a powerful feature. The idea of running code only in response to an event is quite attractive. The MSMQ issues with this method are minor, but noteworthy:

If you find additional problems within multithreaded Visual Basic applications, you should upgrade to Windows NT 4.0 Service Pack 4, which includes enhancements specific to this area.
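The event-driven pattern described at the start of this section can be sketched as follows. This is a minimal example under stated assumptions: the private queue .\PRIVATE$\test is hypothetical and must already exist, and the 10-second notification timeout is arbitrary. The sketch also illustrates one commonly noted behavior of this method: each call to EnableNotification produces a single event, so the handler must call it again to keep receiving notifications.

```vb
' Form module. Requires a project reference to the
' Microsoft Message Queue Object Library.
Private WithEvents qEvent As MSMQEvent
Private q As MSMQQueue

Private Sub Form_Load()
    Dim qi As New MSMQQueueInfo

    qi.PathName = ".\PRIVATE$\test"     ' Hypothetical local private queue
    Set q = qi.Open(MQ_RECEIVE_ACCESS, MQ_DENY_NONE)

    Set qEvent = New MSMQEvent
    ' Ask for one notification; give up after 10 seconds.
    q.EnableNotification qEvent, MQMSG_FIRST, 10000
End Sub

Private Sub qEvent_Arrived(ByVal Queue As Object, ByVal Cursor As Long)
    Dim msg As MSMQMessage

    ' Another receiver may already have taken the message, so do not block.
    Set msg = q.Receive(ReceiveTimeout:=0)
    If Not msg Is Nothing Then Debug.Print "Received: " & msg.Label

    ' Re-arm: a notification request fires only once.
    q.EnableNotification qEvent, MQMSG_FIRST, 10000
End Sub

Private Sub qEvent_ArrivedError(ByVal Queue As Object, _
                                ByVal ErrorCode As Long, _
                                ByVal Cursor As Long)
    If ErrorCode = MQ_ERROR_IO_TIMEOUT Then
        ' Nothing arrived in time; request another notification.
        q.EnableNotification qEvent, MQMSG_FIRST, 10000
    Else
        Debug.Print "Notification error: &H" & Hex$(ErrorCode)
    End If
End Sub

Private Sub Form_Unload(Cancel As Integer)
    If Not q Is Nothing Then q.Close
End Sub
```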

Know When and Where to Use Transactions

There are two types of transactions in MSMQ:

Transactions in MSMQ can be quite useful, guaranteeing that only one copy of a message is delivered. If multiple messages are sent in a single transaction, those messages remain in order, and failures are always logged to the XactDeadletter queue.

The primary disadvantage of transactions in MSMQ is performance. A secondary consideration is that a remote receive cannot be performed from inside a transaction.

For more information, review the chapter "Transactional Remote Receive Semantics in MSMQ" in the article "Microsoft Message Queuing Services (MSMQ) Tips" (http://msdn.microsoft.com/library/backgrnd/html/msmqtips.htm) and "Optimizing Performance in a Microsoft Message Queue Server Environment," available at http://www.microsoft.com/ntserver/appservice/deployment/planguide/msmqperformance.asp.
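As a minimal sketch, the following code sends two messages in one MSMQ internal transaction, so that both are delivered in order or neither is sent. The queue name direct=os:eastway\xacttest is a hypothetical transactional queue on the sample server used in this article, and the procedure name is illustrative.

```vb
' Requires a project reference to the Microsoft Message Queue Object Library.
Private Sub SendInTransaction()
    Dim qi As New MSMQQueueInfo
    Dim q As MSMQQueue
    Dim disp As New MSMQTransactionDispenser
    Dim xact As MSMQTransaction
    Dim msg1 As New MSMQMessage
    Dim msg2 As New MSMQMessage
    Dim errNum As Long
    Dim errDesc As String

    qi.FormatName = "direct=os:eastway\xacttest"   ' Hypothetical transactional queue
    Set q = qi.Open(MQ_SEND_ACCESS, MQ_DENY_NONE)

    On Error GoTo Failed
    Set xact = disp.BeginTransaction

    msg1.Body = "first"
    msg1.Send q, xact
    msg2.Body = "second"
    msg2.Send q, xact

    xact.Commit                 ' Both messages are sent only at this point
    Set xact = Nothing
    q.Close
    Exit Sub

Failed:
    ' Save the error, discard the unsent messages, and report the failure.
    errNum = Err.Number
    errDesc = Err.Description
    If Not xact Is Nothing Then xact.Abort
    Err.Raise errNum, "SendInTransaction", errDesc
End Sub
```

When only a single message needs transactional delivery, passing MQ_SINGLE_MESSAGE as the second argument of Send avoids creating an explicit transaction object.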

Know When to Use Persistable COM Objects

Sending COM objects can be convenient, but this convenience can also be expensive, because a generic COM object (such as one from ADO or Microsoft Word) cannot know in advance what it will need to deliver. Thus, its developer must write contingency code for every possible circumstance, and that generality adds to the size of the message. The following sample retrieves an ADO recordset that has only one column and one row. The commented-out line sends the recordset itself, and the active line sends the same value as text. The recordset message is 394 bytes; the same information sent as text is only 22 bytes.

Private Sub Form_Load()
    ' Requires project references to the ADO and MSMQ object libraries.
    Dim con As ADODB.Connection
    Dim strQuery As String
    Dim rs As ADODB.Recordset
    Dim msg As New MSMQMessage
    Dim q As MSMQQueue
    Dim qi As New MSMQQueueInfo

    '***********ADO Code****************
    Set con = New ADODB.Connection
    con.CursorLocation = adUseClient ' Required to implement IPersist
    con.Open "Driver={SQL Server};Server=eastway;Database=pubs;Uid=sa;Pwd="
    strQuery = "select max(au_id) from authors"
    Set rs = con.Execute(strQuery)

    '***********MSMQ Code***************
    qi.FormatName = "direct=os:eastway\test"
    Set q = qi.Open(MQ_SEND_ACCESS, MQ_DENY_NONE)
    'msg.Body = rs                  ' Sends the persisted recordset (394 bytes)
    msg.Body = "" & rs.Fields(0)    ' Sends the same value as text (22 bytes)
    msg.Send q

    ' Release the queue handle and the database resources.
    q.Close
    rs.Close
    con.Close
End Sub

Understand What Security Context to Use

MSMQ validates permissions based on the security context where the work is being done. This typically affects the following: services, Microsoft Transaction Server (MTS) objects, ASP scripts, and users who inadvertently log on to their computer as a local user rather than with an authenticated user account.

By default, all services run in the context of the local system account, which is valid only on the computer hosting the service. With this default, MSMQ permission checks fail all receive operations on remote computers. Unfortunately, sends still work, because the default for send access is "Everyone," so the problem is easy to miss. The overhead inherent in attempting to validate an invalid user on the domain shows up as huge delays in sends reaching the destination queue. The classic symptom of this problem is sends that succeed immediately but leave the sending computer in 30-second intervals (viewable in the performance monitor).

By default, Active Server Pages (ASP) run in the context of Internet Information Server (IIS), which runs as the local system account. (This setting is not configurable in IIS 4.0 and was ignored in IIS 3.0.) Because ASP runs as the local system account, it has the same problem as described above. There are a couple of solutions to this problem:

The default security for a package of MTS objects is "Interactive User." This default works well when the MTS package is on the same computer as the client application calling the MTS component. However, in classic Windows DNA architecture, business objects are placed on a dedicated computer. The design for this is as follows: the client computer calls the business object computer, which calls the server computer, which hosts the queue.

Since Windows NT 4.0 does not support delegation, this model has the same problems as running in the context of the local system account. The security identifier that is passed is the same as that for the local system account.

Logging onto a workstation as the local administrator is frequently an issue for developers who are used to services with the idea of "standard security." MSMQ installed into a domain does not have this concept. The attempt to access a site controller generates the error: "No connection with this site's controller(s). C00E0013L." While mirrored accounts (local machine accounts that use the same name and password) can be used to work around this issue to a certain degree, they are not recommended, for the following reasons:

Smart Queue Usage

Queue creation has all of the overhead of other MQIS calls (see Avoid Functions that Query the MQIS), as well as, in some circumstances, the addition of the InterSite replication duration (the default is 10 seconds).

MSMQ actively finds the closest site controller after a restart. This site controller serves that client for such information requests as queue existence and access. However, the original site controller is the only site controller that services object creation.

An example of the logic flow:

  1. Start an application by creating a queue.

  2. Open that queue for either doing "sends" or "receives."

If the site controller used for information retrieval is the original parent site controller, then this process will always succeed. If the client has been moved and is now using a different site controller, the open will fail until the queue creation information has been replicated from the original installation site controller to the site controller used for information retrieval. Therefore, you should have "retry logic" built into all opens.
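The retry logic recommended above can be sketched as a small wrapper around MSMQQueueInfo.Open. This is one possible approach, not a prescribed one: the function name OpenWithRetry, the attempt count, and the delay are assumptions. The default delay of 10 seconds simply mirrors the default replication interval mentioned in this section.

```vb
' Requires a project reference to the Microsoft Message Queue Object Library.
' Place the Declare statement at the top of the module.
Private Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)

' Tries to open the queue, pausing between attempts so that queue
' creation information has time to replicate between site controllers.
Private Function OpenWithRetry(ByVal qi As MSMQQueueInfo, _
                               ByVal Access As Long, _
                               Optional ByVal MaxAttempts As Long = 5, _
                               Optional ByVal DelaySeconds As Long = 10) As MSMQQueue
    Dim attempt As Long
    Dim lastErr As Long
    Dim lastDesc As String

    For attempt = 1 To MaxAttempts
        On Error Resume Next
        Set OpenWithRetry = qi.Open(Access, MQ_DENY_NONE)
        lastErr = Err.Number
        lastDesc = Err.Description
        On Error GoTo 0

        If lastErr = 0 Then Exit Function       ' Opened successfully

        ' Wait before the next attempt, but not after the last one.
        If attempt < MaxAttempts Then Sleep DelaySeconds * 1000
    Next attempt

    ' Every attempt failed; report the last error to the caller.
    Err.Raise lastErr, "OpenWithRetry", lastDesc
End Function
```

A caller would use it in place of a direct open, for example Set q = OpenWithRetry(qi, MQ_SEND_ACCESS). Because Sleep blocks the calling thread, a production version might retry only on specific error codes or use a shorter delay in interactive applications.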

Request Acknowledgements or Nonacknowledgements

By default, MSMQ gives no notification of either success or failure when a message is delivered. This is fine for messages that are expendable, but for messages that need verification, you need to request notification.

Transactional messages have this notification property set for messages that fail, so all transactional message failures (but not successes) are reported to the XactDeadletter queue. As a result, the XactDeadletter queue can accumulate a lot of messages without the programmer being aware of it.
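As a minimal sketch of requesting notification, the following code asks MSMQ to report both success and failure of delivery and retrieval to an administration queue. The procedure name is illustrative, and the administration queue is passed in by the caller; it is assumed to be an existing nontransactional queue.

```vb
' Requires a project reference to the Microsoft Message Queue Object Library.
' qiAdmin is assumed to describe an existing nontransactional queue.
Private Sub SendWithAcknowledgement(ByVal qiAdmin As MSMQQueueInfo)
    Dim qi As New MSMQQueueInfo
    Dim q As MSMQQueue
    Dim msg As New MSMQMessage

    qi.FormatName = "direct=os:eastway\test"
    Set q = qi.Open(MQ_SEND_ACCESS, MQ_DENY_NONE)

    msg.Label = "Needs verification"
    msg.Body = "Hello"

    ' Request positive and negative acknowledgements, up to the point
    ' where the message is received from the destination queue.
    msg.Ack = MQMSG_ACKNOWLEDGMENT_FULL_RECEIVE
    Set msg.AdminQueueInfo = qiAdmin

    msg.Send q
    q.Close
End Sub
```

The acknowledgement messages arrive in the administration queue, where the Class property of each message indicates the outcome and its CorrelationId matches the Id of the original message. Other Ack values, such as MQMSG_ACKNOWLEDGMENT_NACK_REACH_QUEUE, request only failure notifications and so generate less traffic.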

Remember Case Sensitivity

This is more of a heads-up warning than a programming guideline: MSMQ queue names can be case sensitive. You may find it doubly difficult to track down an error caused by case sensitivity because the MSMQ Explorer shows all queues in lower case.

Test Your Application with a Full Reboot while Offline

This, too, is a heads-up warning rather than a programming guideline. Because both MSMQ and Microsoft Windows NT 4.0 cache information, it is always a good idea to test your application after a full reboot in a disconnected mode, to verify that the program is going to behave as expected in production. Also, note that the MSMQ service typically takes longer than normal to come online in a disconnected state. (Keep in mind that only some send syntaxes, and no remote receives, are allowed offline.)

Troubleshooting Common MSMQ Problems

Many MSMQ problems can be isolated and resolved using a few simple tests. Problems are listed here in order of percentage of cases created for support at Microsoft, with connectivity representing the largest proportion.

Connectivity

No matter what the connectivity symptom or problem, using the ping utility to test the computer having problems is always a good idea. The amount of time that a ping test takes to respond can indicate a problem, as can the fact that such a test succeeds only intermittently. (The latter can indicate network saturation, or name resolution failing and the computer falling back to broadcasts for name resolution.)

When running the ping utility to test a computer for MSMQ, the name used must be the Network Basic Input/Output System (NetBIOS) machine name, not a fully qualified DNS name or the IP address, because MSMQ 1.0 uses only the NetBIOS name.

The ping utility is based on the ICMP protocol of TCP/IP. ICMP is a poor choice to verify firewall issues for two reasons:

For firewall and port issues, Telnet is a very good tool. With Telnet, you can establish sessions with the host computer on ports 135, 1801, 2101, and 3527. For more information on the ports required for firewalls, see the article "HOWTO: Configure a Firewall for MSMQ Access" (article Q183293, available at http://support.microsoft.com/support/default.asp) and "Using Distributed COM with Firewalls," available at http://www.microsoft.com/com/wpaper/dcomfw.asp.

Additional connectivity problems can be isolated through a Network Monitor (NetMon) trace. A NetMon trace can help to determine which computer MSMQ is attempting to establish a session with and which part of the connectivity process is failing. NetMon can also help to find situations where the connection between the two computers is succeeding but validation to a domain controller is failing.

Security

Many security problems occur when users log on locally and try to use MSMQ across nontrusting domains. (See Understand What Security Context to Use for more information.)

The easiest test to verify proper security is to connect to the default admin share on the other computer:

\\Machine_Name\C$

If credentials are requested to make this connection, MSMQ will not connect. Unlike SQL Server, MSMQ will not use the credentials supplied to establish that connection. Note that making this connection by itself is not a valid connectivity test, since a share may be accessed by some protocol other than TCP/IP.

MQIS Connectivity

MQIS connectivity problems only affect MSMQ servers, since these servers are the only ones to have an MQIS data store. However, connectivity seems to be a tenuous beast on many development computers. Here are several key points for troubleshooting MQIS connectivity issues:

Slowness and Resource Depletion

For slowness and resource depletion issues, the MSMQ counters in the Performance Monitor (PerfMon) utility are extremely useful, since they can show the following problems:

Miscellaneous Problems and Other Issues

For other miscellaneous problems and unknown issues, the debug version of MSMQ is an invaluable troubleshooting tool. All MSMQ error conditions must return one of the errors defined in MQ.h, but this can lead to situations where errors are too generic to be useful. (For example, there is an actual error with the message text "GenericError.") However, in the debug version of MSMQ, you have the ability to see the actual causes of errors rather than predefined error categories from MQ.h, including comments from the original MSMQ developers.

For more information on using the MSMQ debug version, see the article "HOWTO: Use Debug DLLs to Troubleshoot MSMQ Issues" (article Q195141, available at http://support.microsoft.com/support/default.asp).