Bug in .NET Command Line Arguments

Update - Slawomir Brzezinski pointed out to me that the 'bug' mentioned below is actually a feature: .NET tries to allow for double quotes in command line arguments. I'll return to the subject soon, but in the meantime I still wonder whether this is a desirable feature. When an executable is started from the command line, or from Explorer through shell integration, you will run into the problem discussed below. To be continued!

Introduction

When writing RegName, I stumbled upon subtle, unexpected (by me, anyway) behavior in the way .NET handles command line arguments. More specifically: when I expected a command line argument to end in a backslash, it would end in a double quote instead!

The cause seems to have to do with escaping double quotes inside command line arguments, because the string

"c:\temp\"

is (incorrectly) interpreted as

c:\temp"

when surrounded by double quotes. That reeks of some C#-like string syntax bug, since the sequence

\"

at the end is replaced with

"

which is what is supposed to happen inside a literal C# string, but obviously not in command line arguments...

Fortunately, the fix is relatively easy. If you know you're expecting a file or folder name, you could use the following class (RegName does, anyway!):

using System;

public class CommandLineParser
{
  /// <summary> 
  /// This method fixes a bug in .NET command line handling.
  /// When an argument on the command line is surrounded by
  /// double quotes *and* the argument ends in a backslash,
  /// the ending backslash is replaced by a double quote.
  /// 
  /// For example, when a program is started with:
  /// 
  /// test.exe "c:\"
  ///
  /// args[1] will be c:" while it should be c:\
  /// </summary>
  public static string[] GetCommandLineArgs()
  {
    // First get the regular list of arguments (#0 is
    // the path to our own executable)
    string[] args = Environment.GetCommandLineArgs();
    // Loop over all arguments except #0
    for (int i = 1; i < args.Length; i++)
    {
      // If the argument ends in a double quote:
      if (args[i].EndsWith("\""))
      {
        // Remove the ending double quote, add a backslash
        args[i] = args[i].Substring(0, args[i].Length - 1) + "\\";
      }
    }
    // Return the (possibly altered) argument list
    return args;
  }
}
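The class above patches the symptom on the receiving side. If you also control the program that launches the executable, another option is to escape the argument before it is put on the command line. The following is only a sketch, assuming the callee parses its command line with the backslash and quote rules Microsoft documents for C/C++ programs; ArgQuoting and Quote are names made up for this example, and characters that are special to cmd.exe are not handled at all.

```csharp
using System.Text;

// Hypothetical helper, not part of .NET: quotes a single argument so that
// a parser following the documented backslash/quote rules returns the
// original text unchanged.
public static class ArgQuoting
{
  public static string Quote(string arg)
  {
    var sb = new StringBuilder();
    sb.Append('"');
    int backslashes = 0; // backslashes seen but not yet written

    foreach (char c in arg)
    {
      if (c == '\\')
      {
        backslashes++;
      }
      else if (c == '"')
      {
        // Double the pending backslashes, then add one more
        // backslash to escape the quote itself
        sb.Append('\\', backslashes * 2 + 1);
        sb.Append('"');
        backslashes = 0;
      }
      else
      {
        // Backslashes not followed by a quote are literal
        sb.Append('\\', backslashes);
        sb.Append(c);
        backslashes = 0;
      }
    }

    // Double trailing backslashes so that they cannot escape
    // the closing quote
    sb.Append('\\', backslashes * 2);
    sb.Append('"');
    return sb.ToString();
  }
}
```

With this sketch, the folder name c:\temp\ would be passed as "c:\temp\\", so the closing quote is no longer escaped and the argument arrives with its trailing backslash intact.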

Update: the plot thickens!

I dug around some more for the cause of this weird problem and wrote the following mini-program to test the command line:

using System;
namespace ArgTester
{
  class Program
  {
    static void Main(string[] args)
    {
      int n = 0;
      foreach (string arg in args)
        Console.WriteLine("{0}: {1}", ++n, arg);
    }
  }
}

First of all, the bug is still present in the .NET Framework 3.5, because running

ArgTester "c:\test\"

yields:

1: c:\test"

which should of course have been:

1: c:\test\

The opening quote is stripped as a delimiter, but the final \" is read as an escaped quote: the backslash is dropped and the quote is kept.

Even more interesting is the output of:

ArgTester a"b"c

This results in:

1: abc

What's that, then? The double quotes are removed altogether when used as part of an argument! Presumably, this makes constructions like

dir "c:\temp"\temp.txt

possible. I verified that the command line was not touched by cmd.exe by printing Environment.CommandLine, which shows that the command line is passed to ArgTester verbatim. (cmd.exe does, however, interfere with things like

> NUL

in the command line, but that's another story.)
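Both results are consistent with the backslash and quote rules that Microsoft documents for parsing C/C++ command-line arguments. Here is a minimal sketch of those rules, on the assumption that .NET follows them for these cases. ArgRuleSketch and Split are made-up names, not part of the framework; the sketch expects only the text after the program name and deliberately leaves out any special treatment of doubled quotes.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical illustration of the documented rules:
//  - spaces and tabs separate arguments, except inside double quotes
//  - 2n backslashes + quote   -> n backslashes, quote is a delimiter
//  - 2n+1 backslashes + quote -> n backslashes, quote is literal
//  - backslashes not followed by a quote are kept as they are
public static class ArgRuleSketch
{
  public static List<string> Split(string commandLine)
  {
    var result = new List<string>();
    var current = new StringBuilder();
    bool inQuotes = false;
    bool hasArg = false; // true once the current argument has started
    int i = 0;

    while (i < commandLine.Length)
    {
      char c = commandLine[i];
      if (c == '\\')
      {
        // Count the whole run of backslashes
        int start = i;
        while (i < commandLine.Length && commandLine[i] == '\\') i++;
        int count = i - start;

        if (i < commandLine.Length && commandLine[i] == '"')
        {
          current.Append('\\', count / 2);
          if (count % 2 == 1) current.Append('"'); // escaped quote
          else inQuotes = !inQuotes;               // delimiter quote
          i++; // consume the quote
        }
        else
        {
          current.Append('\\', count);
        }
        hasArg = true;
      }
      else if (c == '"')
      {
        // A bare quote only switches quoted mode on or off
        inQuotes = !inQuotes;
        hasArg = true;
        i++;
      }
      else if ((c == ' ' || c == '\t') && !inQuotes)
      {
        if (hasArg)
        {
          result.Add(current.ToString());
          current.Clear();
          hasArg = false;
        }
        i++;
      }
      else
      {
        current.Append(c);
        hasArg = true;
        i++;
      }
    }

    if (hasArg) result.Add(current.ToString());
    return result;
  }
}
```

Under these assumptions, Split turns "c:\test\" into c:\test" and a"b"c into abc, matching the ArgTester output, while "c:\test\\" comes out as c:\test\.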

So, the .NET environment is removing quotes in our command line. But what if we need a quote in it, as may well be the case when dealing with regular expressions on the command line (as RegFind needs to)? It turns out that both the VB.NET and the C# way of escaping double quotes in strings work inside command line arguments. The proof:

ArgTester a""b\"c

prints:

1: a"b"c