VB6 String anomalies

Tuesday, February 2, 2010

One of the best-known differences between VB6 and VB.NET is the way strings are handled. Under .NET a String is a reference type, which can therefore be Nothing, while in VB6 strings always contain a valid value, by default “”, the empty string. No news there. More surprisingly, I recently found out that VB6 strings can sometimes also contain (sort of) Nothing! Surprised as well? Read on.

The case of the string that was Nothing

I recently decided to harness the built-in .NET SMTP client in a Visual Basic 6 application, in an attempt to get rid of MAPI (shudder) and mailto-links. It turned out to be as easy as it was supposed to be: create a new project, add a COM-visible class MailSender, add a SendMail method that sends out an email using the built-in SmtpClient class, compile it, register it, and Bob was my uncle once again. It worked like a charm from VB6: create an instance of the COM class using CreateObject(), and call the SendMail method.

Then something strange happened. Under certain circumstances, the .NET class would throw an exception, indicating that one of the String arguments to the SendMail method was Nothing. Since the argument involved was a simple String, this didn’t make sense. In VB6, a string always has a value, right? When you declare a string, it has the value of “”, i.e. the empty string, consisting of zero characters, right? Wrong.

Some more digging revealed that the problem was caused by the CC argument, a String containing a list of recipients that should be CC-ed when sending the mail message. In effect, this was the problem:

Sub SendEmail(..., ByVal CC As String, ...)
  Dim Sender As Object

  Set Sender = CreateObject(...)
  Sender.SendMail ..., CC, ...
End Sub

This caused the exception from within the SendMail method.

At first, I suspected one of the many layers involved in the call. I was using late binding, which complicated things, so I added a reference to the type library of the .NET DLL and replaced the CreateObject-call with

Dim Sender As New MailSender

Same problem.

Next, I suspected COM/.NET marshaling. During the call, the parameters are passed ByRef to COM, and then they’re marshaled from unmanaged COM code to managed .NET code. I started to believe that somewhere in the COM-to-.NET transition some layer would decide that empty strings should be converted to Nothing.

To verify that, I tried replacing the use of the variable CC with the empty string literal:

Sender.SendMail ..., "", ...

Same problem! This was unexpected. How would any part of the marshaling process distinguish between a string variable containing the empty string, and the empty string literal? Next, I tried:

Sender.SendMail ..., "" + CC, ...

To my amazement, this worked. No exception.

At this point, I was stumped.

Retreating to the drawing board, I realized that the problem only occurred under certain circumstances. There was another scenario, in which sending out an email using an empty CC list actually worked. Analyzing the difference, I found that the problem occurred when the CC list was set to the Text property of a TextBox:

Dim CCList As String
CCList = txtCC.Text
SendEmail ..., CCList, ...

It seemed that the Text property contained some kind of special string, and the Property-ness of that string stuck until the call into COM. I didn’t understand how this could happen, let alone why. How could a string be “”, and still be different from the empty string? I tried:

Dim S As String
Debug.Print S = ""

Sure enough: True. I brought in the property:

Dim S As String
S = txtCC.Text
Debug.Print S = ""

Once again, True.

StrPtr to the rescue

Then I remembered two undocumented VB6 functions: VarPtr() and StrPtr(). The latter is the interesting one in this case: it returns the memory address of (i.e. a pointer to) the first character of a string. This put me on the trail of a phenomenon that I hadn’t expected, even with my twenty years of VB6 experience:

Debug.Print StrPtr(S)
Debug.Print StrPtr(txtCC.Text)

Both of these printed 0! Some more investigating led to an even simpler example that demonstrates the problem:

Dim S As String
Debug.Print S, StrPtr(S)
S = ""
Debug.Print S, StrPtr(S)

The first value of the string pointer is 0, the second is not. Yet, in both cases the value of the string is “”! Even the test

Debug.Print S = ""

did not distinguish between an uninitialized string and a string set to “”.

My conclusion: when a VB6 string variable is created, it in fact contains a null pointer (which makes it sort of Nothing, but not quite). When a string is explicitly set to “”, it holds a valid, non-null pointer to a zero-length string. When you use an uninitialized string, VB treats its value as the empty string. VB tries very hard to never let you ‘see’ the null pointer, presenting it to you as an empty string instead.
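
The same distinction can be reproduced without any controls or COM objects by using VB6's built-in vbNullString constant, which is a string whose pointer is 0. What follows is a minimal sketch, meant to be pasted into a standard module and called from the Immediate window. The procedure name ShowNullVersusEmpty is made up for illustration.

```vb
Option Explicit

' Hypothetical demo procedure: compares a null string with an empty one.
Public Sub ShowNullVersusEmpty()
    Dim NullStr As String
    Dim EmptyStr As String

    NullStr = vbNullString   ' null pointer, like a freshly declared String
    EmptyStr = ""            ' allocated string of length zero

    ' VB's own string operations do not distinguish the two
    Debug.Print NullStr = EmptyStr
    Debug.Print Len(NullStr), Len(EmptyStr)

    ' Only the pointer value reveals the difference
    Debug.Print StrPtr(NullStr) = 0
    Debug.Print StrPtr(EmptyStr) = 0
End Sub
```

If the behavior described here holds, the comparison and the lengths should look identical for both variables, and only the two StrPtr checks should give different answers.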

This, I did not expect. In fact, I felt it violated one of the very basic principles I thought I knew about VB6: a String always has a valid value. Period.

How relevant is this, in real life? Why should we care what the StrPtr() of a string is, when VB converts it to a nice empty string when we use it? We just saw the answer to that one: when passing a string to COM (and I suspect: also to a Windows API-function like SendMessage()) strings with a StrPtr() of 0 are passed as real null-pointers, which .NET interprets as strings set to Nothing.

So the initial workaround (prefixing the CC argument with “” in the call to COM) turns out to be a valid one: it makes sure uninitialized strings are not passed as Nothing.

But did we indeed pass an uninitialized string in our case? The answer is no – we passed a Text property that happened to be empty! Another strange behavior of VB6 turns out to be that the Text property of a TextBox returns an uninitialized string when the text in the TextBox is empty. This occurs even when text was entered and then removed, for example by pressing Backspace.

So the moral of this story is: make sure you don’t pass uninitialized strings or empty control properties of type String to Windows API calls or COM objects. I suspect the problem is limited to COM objects not written in VB6: although the string argument may have been passed as a null pointer, a receiving end written in VB will turn it back into a temporary empty string. The fix is simple: prefix string arguments in these cases with “”, no matter how dumb that looks in your code!
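
If prefixing every argument by hand feels too easy to forget, the check could be wrapped in a small helper. This is only a sketch: the function name NonNullString is invented here, and it relies on the observation made earlier that assigning the literal "" produces a string with a non-zero StrPtr. It is worth confirming that with StrPtr in your own project before depending on it.

```vb
Option Explicit

' Hypothetical helper: returns Value unchanged unless it is a null string
' (StrPtr = 0), in which case an explicit empty string is returned instead.
Public Function NonNullString(ByVal Value As String) As String
    If StrPtr(Value) = 0 Then
        ' Assumption: assigning "" yields an allocated zero-length string
        NonNullString = ""
    Else
        NonNullString = Value
    End If
End Function
```

At the call site, the CC argument would then be passed as NonNullString(CC) rather than as CC. The advantage over the bare “” prefix is mainly that the name documents why the extra step is there.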