CmdLiner - a break from Windows Forms

This is the third time I've tried to start this post. Let's see if I get derailed yet again, or if I can actually finish it this time and get it up on the blog. To break from the monotony of Windows Forms, I'm uploading a snippet that wasn't originally intended to be a code sample. But, in my frantic grasp for content, it has become one.

What is CmdLiner? It's a tool I wrote for myself, to help parse command lines. (It will be used in larger projects, eventually.) I decided my command-line options loosely follow these rules:

  1. switches are immediately preceded by a hyphen (-), a forward slash (/), or a backslash (\).
  2. non-switch arguments are anything that isn't in category 1, including the program name itself.
  3. switches can contain arguments, which immediately follow the switch, and start with a colon (:) or equals (=). There can be multiple arguments, delimited by commas (,).

So, a valid command line might look like the following:
myprog /v -outfile:c:\temp.txt \r=200,13 infile.txt
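To make rules 1 and 2 concrete, here's a minimal sketch in plain standard C++ (not the managed code from the sample). The helper name is_switch is my own invention for illustration and does not appear in CmdLiner.

```cpp
#include <string>

// Hypothetical helper, not part of CmdLiner: true when a token counts as a
// switch under rule 1, i.e. it starts with '-', '/' or '\'.
bool is_switch(const std::string& token) {
    if (token.empty()) return false;   // an empty token is never a switch
    const char first = token[0];
    return first == '-' || first == '/' || first == '\\';
}
```

Applied to the sample command line, myprog and infile.txt come out as non-switch arguments, while /v, -outfile:c:\temp.txt and \r=200,13 come out as switches.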

How does it, ehm, how does it work? Consult the book of armaments! No, no. Actually, I compile it /LD, into a .NET assembly. Then, I can #using it from any program I want, to take advantage of its capabilities. I went back to the old C++ standard of defining the CmdLiner object in a .h file, and having the implementation in a .cpp file, rather than jumbling it all up in one source file. Being a C++ programmer at heart, that's really what makes the most sense to me.

Let's get into the code! I'll start with the .h file, and proceed from there. As always, my explanation is in Whidbey C++ syntax (old syntax code is also included in the sample, however), and the link to the sample download is way at the bottom.


//cmdliner.h - managed command-line argument parsing tool

using namespace System;
using namespace stdcli::language;

public ref class CmdLiner{
public:
    CmdLiner(array<String^>^ args){ //requires the paramarray of main() to be passed in
        loadargs(args);
    }

We start with some typical using namespace declarations (the second is for the array<> template) and a constructor definition. .NET programs receive their command-line options as an array of System::Strings. We simply take that array and pass it along to the private loadargs function, which actually performs the job of parsing the command line. (Most of the rest of the class is just methods to access that information in a meaningful way.)

    CmdLiner(int argc, char **argv){
        array<String^>^ args=gcnew array<String^>(argc);
        for(int i=0; i<argc; i++){
            args[i]=gcnew String(argv[i]); //explicit char* to String^ conversion
        }
        loadargs(args);
    }

Here's a second constructor. It takes the traditional C++ command-line arguments, makes an array of Strings, fills it, and passes it along to loadargs.
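For comparison, the same "copy argc/argv into a string container" step looks like this in plain standard C++. This is only a sketch of the idea, not code from the sample, and to_string_vector is a made-up name.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: copy the C-style argument list into owned strings.
std::vector<std::string> to_string_vector(int argc, char** argv) {
    // The iterator-range constructor converts each char* to a std::string.
    return std::vector<std::string>(argv, argv + argc);
}
```

The managed constructor needs the explicit loop because it is filling an array<String^>, but the intent is the same: get away from raw char pointers before any parsing happens.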

    int argc(){ return _args->Length; } //argument count (inlined)
    bool switchOn(String^ sw); //is a particular switch thrown?
    array<String^>^ switchArgs(String^ sw); //return any arguments for a switch /s:arg1,arg2
    array<String^>^ nonswitchArgs(); //return all non-switch arguments
    int nswc(); //number of non-switch arguments
    String^ getarg(int index){ return _args[index]; }
    String^ getdearg(int index){ return _deargs[index]; }

Most of these methods are used to access the already-parsed command line information. These are safe to call pretty much any time after the constructor runs. The switchOn method is probably the most useful, with switchArgs and nonswitchArgs also being useful. There's some reference in the inlined functions to private members, like _args. We'll get to those in a second. I think I chose some slightly confusing names. Were the sample not already uploaded (uploading a sample has an average 12-hour turnaround time), I might actually go fix them. The switchArgs method returns an array of the arguments for a particular switch (see rule 3 up at top), but nonswitchArgs returns an array containing all arguments that aren't switches (rule 2).

private:
    array<String^>^ _args;
    array<String^>^ _deargs;
    String^ dearg(String^ arg); //removes /-\=: and anything following =:
    void loadargs(array<String^>^ args);
    int getSwIndex(String^ sw); //gets the array index of sw, -1 if not found
};

Here are those private members, and a few private methods. The arrays _args and _deargs store our parsed arguments. The _args array stores them complete with slashes and all the option arguments, and _deargs has just the switch name. (In the case of a non-switch argument, the _deargs entry contains nullptr.) Also here are the loadargs method we saw earlier, plus dearg and getSwIndex, both of which should be clear from their comments.

Now, onto the meat. Let's get to the real point of this one - the implementation.


//cmdliner.cpp - requires cmdliner.h, build /LD

#include "cmdliner.h"

using namespace System;
using namespace stdcli::language;

#define STRARR(x) ((String^)x)->ToCharArray()

This is all old hat - we've seen all this before. (If you don't remember STRARR, go check out an old sample, loc.)

bool CmdLiner::switchOn(String^ sw){ //is a particular switch thrown?
    return getSwIndex(sw)>=0;
}

Here, we're determining if a switch the user requested has been thrown, and we use the private method getSwIndex, which returns -1 if the switch isn't found, and returns the index to the switch otherwise.

array<String^>^ CmdLiner::switchArgs(String^ sw){
    //return any arguments for a switch /s:arg1,arg2 or -s=arg1,arg2
    int index=getSwIndex(sw);
    int argindex=0;
    if(index>=0){
        if(_args[index]->IndexOf(':')>=0) argindex=_args[index]->IndexOf(':');
        else if (_args[index]->IndexOf('=')>=0) argindex=_args[index]->IndexOf('=');
        else return nullptr;
        String^ justargs=(_args[index]->Substring(argindex+1));
        return justargs->Split( STRARR(",") );
    }
    return nullptr;
}

The switchArgs method returns the arguments for a switch. It returns an array of arguments because there could potentially be multiple arguments to any one switch. Basically, we first use the String::Substring method to get everything after the = or :, and then use String::Split to divide up the strings by the comma delimiter. Looking over this code, I see a flaw: because it looks for a colon before it looks for an equals sign, a user could break this method by specifying an argument that starts with an equals but has a colon in it (like a path): /d=c:\windows\system32. The split then happens at the colon, and the returned argument is \windows\system32. Oops! Well, watch out for that bug. Or fix it, if you so desire. Typically, it only has negative effects when the path is not on the same drive as the one the program is being run on, since the bug basically has the effect of removing the drive specification (for paths, anyhow).
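If you do want to fix it, one possible approach is to split at whichever of ':' or '=' comes first, rather than preferring the colon. Here's a sketch of that idea in plain standard C++. It is not the code from the sample, switch_args is a hypothetical name, and it returns an empty vector where the original returns nullptr. In the managed version, String::IndexOfAny would presumably do the job that find_first_of does here.

```cpp
#include <string>
#include <vector>

// Hypothetical portable version of the argument extraction in switchArgs.
// It splits at whichever of ':' or '=' comes FIRST, so "/d=c:\windows"
// keeps its drive letter. Returns an empty vector when no delimiter exists
// (the original returns nullptr in that case).
std::vector<std::string> switch_args(const std::string& raw) {
    std::vector<std::string> result;
    const std::string::size_type delim = raw.find_first_of(":=");
    if (delim == std::string::npos) return result;

    // Everything after the delimiter is a comma-separated argument list.
    const std::string rest = raw.substr(delim + 1);
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type comma = rest.find(',', start);
        if (comma == std::string::npos) {
            result.push_back(rest.substr(start));   // last (or only) argument
            break;
        }
        result.push_back(rest.substr(start, comma - start));
        start = comma + 1;
    }
    return result;
}
```

With this version, /d=c:\windows\system32 yields the single argument c:\windows\system32, and \r=200,13 still yields 200 and 13.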

array<String^>^ CmdLiner::nonswitchArgs(){ //return all non-switch arguments in order
    if(nswc()<=0) return nullptr;
    array<String^>^ retarr = gcnew array<String^>(nswc());
    int arrindex=0;
    for(int i=0; i<argc(); i++)
        if(!_deargs[i]) retarr[arrindex++]=safe_cast<String^>(_args[i]->Clone());
    return retarr;
}

This method returns only the non-switch arguments. I do this by creating a new array of the right size (I figure that out using the nswc method), and then looping through all the arguments, checking whether the matching _deargs entry is null. If it is, we copy the corresponding _args entry into our temporary array. Since retarr contains only the non-switch arguments, it has to keep and increment a separate index (arrindex) from the loop variable.
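The same filtering idea, sketched in plain standard C++ for illustration (again, not the sample's code). The name nonswitch_args and the is_switch flag vector are assumptions of mine: the flags stand in for "the _deargs entry is not nullptr", and both vectors are assumed to have the same length.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for nonswitchArgs. `args` plays the role of _args and
// `is_switch[i]` plays the role of "_deargs[i] is not nullptr".
std::vector<std::string> nonswitch_args(const std::vector<std::string>& args,
                                        const std::vector<bool>& is_switch) {
    std::vector<std::string> result;
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (!is_switch[i]) result.push_back(args[i]);  // keeps original order
    }
    return result;
}
```

Because the vector grows as needed, this version needs neither the counting pass nor the separate arrindex. The managed array has a fixed size, which is why the original counts first.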

int CmdLiner::nswc(){ //number of non-switch arguments
    int count=0;
    for(int i=0; i<argc(); i++)
        if(!_deargs[i]) count++;
    return count;
}

Here's that nswc method. It basically just loops through _deargs, counting the number of null entries. Pretty straightforward.

String^ CmdLiner::dearg(String^ arg){ //removes /-\=: and anything following =:
    //only if it IS an argument (leads with a /-\)
    if(arg[0]=='\\' || arg[0]=='/' || arg[0]=='-'){
        arg=arg->Trim( STRARR("\\/-") );
        return (arg->Split(STRARR(":=")))[0];
    }
    else return nullptr;
}

This method takes an individual argument, and strips off all the characters that don't represent the actual switch. If it isn't a switch, it returns nullptr.
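Here's roughly what that stripping looks like in plain standard C++, as a sketch rather than a port. The name switch_name and the is_switch out-parameter are hypothetical: since std::string has no null value, the flag stands in for the nullptr return. One deliberate difference is that this sketch only strips leading prefix characters, whereas Trim in the original also strips them from the end of the argument.

```cpp
#include <string>

// Hypothetical portable analogue of dearg. Sets `is_switch` to false and
// returns "" for a non-switch token (the original returns nullptr instead).
std::string switch_name(const std::string& token, bool& is_switch) {
    const std::string prefix = "\\/-";
    is_switch = !token.empty() && prefix.find(token[0]) != std::string::npos;
    if (!is_switch) return "";

    // Skip every leading prefix character, so "--v" and "/v" both give "v".
    const std::string::size_type begin = token.find_first_not_of(prefix);
    if (begin == std::string::npos) return "";  // token was only prefix chars

    // The name ends at the first ':' or '=' (or at the end of the token).
    const std::string::size_type end = token.find_first_of(":=", begin);
    if (end == std::string::npos) return token.substr(begin);
    return token.substr(begin, end - begin);
}
```

For the sample command line, -outfile:c:\temp.txt gives outfile, \r=200,13 gives r, and infile.txt is reported as a non-switch.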

void CmdLiner::loadargs(array<String^>^ args){
    _args = safe_cast<array<String^>^>(args->Clone()); //get a copy of the args array
    _deargs=gcnew array<String^>(argc());
    for(int i=0; i<args->Length; i++) //clone the array, dearg'ed
        _deargs[i]=dearg(_args[i]);
}

The loadargs method is the one called from the two constructors. Basically, it just fills the _args and the _deargs arrays, using the dearg method. I perform the argument stripping once, and for all the arguments up front, so I incur that cost once, rather than distributing it out among several calls (which could potentially be more costly, in the long run).

int CmdLiner::getSwIndex(String^ sw){ //gets the array index of sw, -1 if not found
    for(int i=0; i<_deargs->Length; i++)
        if(String::Compare(sw, _deargs[i], true)==0) return i;
    return -1;
}

Finally, the getSwIndex function. This is the method I'm least proud of, because it is a simple linear search, rather than something possibly more efficient. However, the array is sorted by argument order, rather than alphabetically. A more efficient search algorithm would probably need to have the array sorted somehow, and still remember the original order of the switches. Each call to switchOn is O(n) in the number of arguments, so checking every switch on the command line adds up to roughly O(n^2) - but since the calls to switchOn are spread out, it isn't so bad.
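One way to get faster lookups without re-sorting anything is to leave the arrays alone and build a separate name-to-position table once, up front. The sketch below shows that idea in plain standard C++. It is not part of the sample, all three function names are hypothetical, an empty name stands in for nullptr, and the case folding only handles plain ASCII (the original's String::Compare is culture-aware).

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Lower-case a switch name so lookups ignore case, as the original
// String::Compare(..., true) call does (ASCII only in this sketch).
std::string to_lower(std::string s) {
    for (char& c : s)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return s;
}

// Hypothetical index: maps each lower-cased switch name to its position in
// the argument list. Empty names (standing in for nullptr) are skipped.
std::unordered_map<std::string, int> build_switch_index(
        const std::vector<std::string>& names) {
    std::unordered_map<std::string, int> index;
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (!names[i].empty())
            index.emplace(to_lower(names[i]), static_cast<int>(i));  // first wins
    }
    return index;
}

// Same contract as getSwIndex: position of the switch, or -1 if absent.
int get_sw_index(const std::unordered_map<std::string, int>& index,
                 const std::string& sw) {
    const auto it = index.find(to_lower(sw));
    return it == index.end() ? -1 : it->second;
}
```

The table stores the original position, so argument order is still recoverable, and emplace keeps the first occurrence of a repeated switch, which matches what the linear search returns. For the handful of arguments a typical command line has, the linear search is probably fine as it stands.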


Get the files! You can find them here. Please drop me a line if you try this out and find it useful, and especially if you find that the Managed Extensions version doesn't compile under VS.Net 2003. Stay tuned - a future sample that I'm working on will make use of CmdLiner.