Give Me Everything

Article
05/31/2007

Even though its only been two blog posts since I first wrote it, I'm ready to change how the Select task was implemented. One thing that bothered me when I wrote it was that I made it mandatory to pass the columns the user wanted. As a result, if someone wanted all columns, they had to explicitly list them all. Of course, if we were writing the SQL statement from scratch instead of using a task, we would probably do it differently:

SELECT * FROM ...

I would like a way to capture this possibility. My plan is to no longer have the columns be required: if the Columns attribute is not set, all columns will be selected.

Of course, there was one thing I liked about forcing someone to list all of the columns they wanted: they knew what they were getting back. That is, it should have been obvious what the metadata values available were for these items. Now, it will be less obvious, especially because MSBuild has a default set of metadata values for all task items.

In thinking of ways on how to correct this, I was tempted to simply revert to one of my earlier ideas for what to make the ItemSpec for each row that was returned: a semi-colon delimited list for all of the columns found. But I also realize this loses critical information, especially the column index and the type. So instead, I think this information will be returned via another output item: ColumnInfo, which will have the following pieces of metadata:

ItemSpec: The name of the Table this was selected from
Name: The name of this column
Index: the column index (1-based) based upon the selected columns
Type: The type of information stored in the column. Possible values are (for now): String or Integer

One important note about the index: this value will be based on the selection index, rather than the actual index into the table. I'm going to copy this ItemSpec idea for the Records as well and set the identity of each record to the TableName (I wasn't overly fond of the old method). None of this ColumnInfo stuff should be terribly hard to do: I did something similar as part of ModifyTableData. It would be great to be able to use these column names to access the values from a collection of records. I'm not sure how to do this yet; I need to ask the MSBuild experts how it can be done.

Okay, let's get started.

Step one: Stop requiring the Columns task parameter

 public string[] Columns
{
    get { return columns; }
    set { columns = value; }
}

That was easy: I just removed the [Required] attribute from the Columns property.

Step two: Update the `SELECT` statement to account for an empty list of columns

 private string CreateSelectStatement()
{
    StringBuilder sql = new StringBuilder("SELECT");
    if (Columns == null || Columns.Length == 0)<br>    {<br>        sql.Append(" * ");<br>    }<br>    else<br>    { 
        for (int i = 0; i < Columns.Length; i++)
        {
            sql.Append(" `");
            sql.Append(Columns[i]);
            sql.Append("`");
            if (i != Columns.Length - 1)
            {
                sql.Append(",");
            }
        }
     } 

    sql.Append(" FROM `");
    sql.Append(TableName);
    sql.Append("`");

    if (!string.IsNullOrEmpty(Where))
    {
        sql.Append(" WHERE ");
        sql.Append(Where);
    }

    return sql.ToString();
}

Step three: Populate the column info

Okay, so this one is going to be a little bit harder. The first thing we need is a output parameter. Let's copy what we did with the Records, but re-do it for ColumnInfo:

  [Output]<br>public ITaskItem[] ColumnInfo<br>{<br>    get { return columnInfo.ToArray(); }<br>    set { }<br>}

This necessitates the creation of a member variable:

 protected List<ITaskItem> columnInfo = new List<ITaskItem>();

And a call to populate the member variable:

 protected override bool ExecuteTask()
{
    string sql = CreateSelectStatement();

    View view = Msi.OpenView(sql);
    view.Execute(null);

    FillColumnInfo(view); 

    Record record = view.Fetch();
    while (record != null)
    {
        ITaskItem item = new TaskItem(record.get_StringData(1));
        for (int i = 0; i < Columns.Length; i++)
        {
            item.SetMetadata(Columns[i], record.get_StringData(i + 1));
        }

        records.Add(item);
        record = view.Fetch();
    }

    view.Close();
    System.Runtime.InteropServices.Marshal.FinalReleaseComObject(view);

    return true;
}

So far this has also been pretty easy. Now it's just a matter of writing the FillColumnInfo function. Note I passed in the view for our SQL statement. This is based upon the code written for the IsIntegerData function and the ModifyTableData task. I would expect this helper function to use the IsIntegerData function from our Utilities area. Unfortunately, this was not exposed. Let's fix that now:

 // Utilities.cs

public static bool IsIntegerData(View view, int column)
{
    Record columnInfo = view.get_ColumnInfo(MsiColumnInfo.msiColumnInfoTypes);
    return columnInfo.get_StringData(1).StartsWith("i", StringComparison.OrdinalIgnoreCase) ||
           columnInfo.get_StringData(1).StartsWith("j", StringComparison.OrdinalIgnoreCase);
}

I think I see another error in there, but let's wait and see what happens when the FillColumnInfo method starts using this:

 private void FillColumnInfo(View view)
{
    Record columnInfoNames = view.get_ColumnInfo(MsiColumnInfo.msiColumnInfoNames);
    for (int i = 1; i <= columnInfoNames.FieldCount; i++)
    {
        ITaskItem item = new TaskItem(TableName);
        item.SetMetadata("Name", columnInfoNames.get_StringData(i));
        item.SetMetadata("Index", i.ToString());
        if (Utilities.IsIntegerData(view, i))
        {
            item.SetMetadata("Type", "Integer");
        }
        else
        {
            item.SetMetadata("Type", "String");
        }

        columnInfo.Add(item);
    }
}

The function starts by getting the names of all of the columns in this view. It then iterates over the number of items in the field (note: I chose to do the iteration as 1-based, which also means comparison is done via < = FieldCount). For each field, create a new TaskItem (with ItemSpec set to the TableName), and add metadata for Name, Index, and Type (utilitizing our helper function from earlier). Finally, add the ItemSpec parity to the Records output via the following:

 protected override bool ExecuteTask()
{
    string sql = CreateSelectStatement();

    View view = Msi.OpenView(sql);
    view.Execute(null);

    FillColumnInfo(view);

    Record record = view.Fetch();
    while (record != null)
    {
        ITaskItem item = new TaskItem(TableName);
        for (int i = 0; i < Columns.Length; i++)
        {
            item.SetMetadata(Columns[i], record.get_StringData(i + 1));
        }

        records.Add(item);
        record = view.Fetch();
    }

    view.Close();
    System.Runtime.InteropServices.Marshal.FinalReleaseComObject(view);

    return true;
}

Verification

Now that there is code for the improved Select task, let's go ahead and test it. I never wrote tests last time around, and I'd like to be a lttle more vigorous about it this time. Here are some of the things I want to test:

Selecting items from an empty table gives correct results and column info
specifying all columns gives correct results and column info
Specifying all columns with a Where clause gives correct results and column info
Selecting all columns (i.e. *) gives correct results and column info
Selecting all columns (i.e. *) with a where clause gives correct results and column info

For now,there are not going to be actual code-driven tests for these. Instead, I will use a post-build step and perform visual inspection of the results to make sure things are okay.

We had success with the Error table, so let's run our tests using this table. I'll put all of the tests into a single setup project, "SetupSelectTests". Add our usual post build template to the setup project:

 <Project xmlns="https://schemas.microsoft.com/developer/msbuild/2003">
    <Import Project="$(MSBuildExtensionsPath)\SetupProjects\SetupProjects.Targets" />
</Project>

and modify the PostBuildEvent property

 msbuild.exe /p:Configuration="$(Configuration)" /p:BuiltOutputPath="$(BuiltOuputPath)" /p:ProjectDir="$(ProjectDir)." /p:BuiltOutputDir="$(ProjectDir)$(Configuration)" $(ProjectDir)\PostBuild.proj"

This is actually a little different from the usual argument: I didn't pass in which Target to run. Instead, I will be using the DefaultTargets attribute within the Project node:

 <Project xmlns="https://schemas.microsoft.com/developer/msbuild/2003" DefaultTargets="RunTests" >
    <Import Project="$(MSBuildExtensionsPath)\SetupProjects\SetupProjects.Targets" />
</Project>

There are two approaches I could take to the RunTests target:

Add all my tests into this target
Write individual Targets for each test to run and say that the RunTests target depends on these

I'm going to go with the latter. Using a property to define the list of dependencies is a normal MSBuild construct (although I'm not so sure that it is common for a Target to perform no tasks other than forcing other tasks to be run). You then specify which targets your Target depends on via the DependsOnTargets attribute, like this:

 <Project xmlns="https://schemas.microsoft.com/developer/msbuild/2003" DefaultTargets="RunTests">
    <Import Project="$(MSBuildExtensionsPath)\SetupProjects\SetupProjects.Targets" />

    <PropertyGroup>
        <RunTestsDependsOn>
            SelectFromEmptyTest;
        </RunTestsDependsOn>
    </PropertyGroup>

    <Target Name="SelectFromEmptyTest">
        <Message Text="********** Begin SelectFromEmpty Test **********" />
    </Target>
    
    <Target Name="RunTests" DependsOnTargets="$(RunTestsDependsOn)" />
</Project>

Building the setup project shows:

 Starting post-build events...
Microsoft (R) Build Engine Version 2.0.50727.42
[Microsoft .NET Framework, Version 2.0.50727.42]
Copyright (C) Microsoft Corporation 2005. All rights reserved.

__________________________________________________
Project "E:\MWadeBlog\src\Tests\SetupSelectTests\PostBuild.proj" (default targets):

Target SelectFromEmptyTest:
    ********** Begin SelectFromEmpty Test **********

Build succeeded.

Let's actually put some meat into the first test. I know I want to run the Select task, selecting both the Error and Message columns from the Error table. Store the Records output in the "ErrorRecords" item group, and the ColumnInfo output in the "ErrorColumnInfo" item group:

 <Target Name="SelectFromEmptyTest">
    <Message Text="********** Begin SelectFromEmpty Test **********" />
     <Select MsiFileName="$(BuiltOutputPath)"<br>            TableName="Error" <br>            Columns="Error;Message"><br>        <Output TaskParameter="Records" ItemName="ErrorRecords" /><br>        <Output TaskParameter="ColumnInfo" ItemName="ErrorColumnInfo" /><br>    </Select> 
</Target>

Now that I have these items, let's actually do something with them. Print out relevant information so that Ican visually inspect it for correctness:

 <Target Name="SelectFromEmptyTest">
    <Message Text="********** Begin SelectFromEmpty Test **********" />
    <Select MsiFileName="$(BuiltOutputPath)"
            TableName="Error" 
            Columns="Error;Message">
        <Output TaskParameter="Records" ItemName="ErrorRecords" />
        <Output TaskParameter="ColumnInfo" ItemName="ErrorColumnInfo" />
    </Select>
     <Message Text="---------- Error Table Information ------------" /><br>    <Message Text="Table: %(ErrorColumnInfo.Identity)   Name: %(ErrorColumnInfo.Name)    Type: %(ErrorColumnInfo.Type)   Index: %(ErrorColumnInfo.Index)" /><br>    <Message Text="---------- Error Records ------------" /><br>    <Message Text="Table: %(ErrorRecords.Identity)   Error: %(ErrorRecords.Error)    Message: %(ErrorRecords.Message)" <br>             Condition="'%(ErrorRecords.Identity)'!=''"/> 
</Target>

I print out all the relevant values for the ErrorColumnInfo and all of the relevevant values for the ErrorRecord. Like I said earlier, I wish there was a way to avoid hard-coding all of the expected column names in our records, but I'm afraid this may be an MSBuild limitation. Note that I added a condition to the last Message task. I added this because without it, the build printed one line with blank values.

Building gives us the following output:

 Project "E:\MWadeBlog\src\Tests\SetupSelectTests\PostBuild.proj" (default targets):

Target SelectFromEmptyTest:
    ********** Begin SelectFromEmpty Test **********

    ---------- Error Table Information ------------
    Table: Error   Name: Error    Type: Integer   Index: 1
    Table: Error   Name: Message    Type: Integer   Index: 2
    ---------- Error Records ------------

Build succeeded.
    0 Warning(s)
    0 Error(s)

This looks mostly correct except for one little issue: type for the Message column is listed as Integer rather than String. Like I said, I thought there was a bug in Utilities.IsIntegerData. I was passing the column index but always using "1". Let's fix that:

 public static bool IsIntegerData(View view, int column)
{
    Record columnInfo = view.get_ColumnInfo(MsiColumnInfo.msiColumnInfoTypes);
    return columnInfo.get_StringData(column).StartsWith("i", StringComparison.OrdinalIgnoreCase) ||
           columnInfo.get_StringData(column).StartsWith("j", StringComparison.OrdinalIgnoreCase);
}

Re-running with this change gives:

 Target SelectFromEmptyTest:
    ********** Begin SelectFromEmpty Test **********

    ---------- Error Table Information ------------
    Table: Error   Name: Error    Type: Integer   Index: 1
    Table: Error   Name: Message    Type: String   Index: 2
    ---------- Error Records ------------

which looks a lot better.

Writing this test, though, made me think that I missed another group of tests: specifying a subset of columns as well as specifying columns out of order. Let's modify the list of tests:

Selecting items from an empty table gives correct results and column info
Specifying all columns gives correct results and column info
Specifying all columns with a Where clause gives correct results and column info****
Specifying some columns gives correct results and column info
Specifying some columns with a Where clause gives correct results and column info
specifying all columns but out of order gives correct results and column info
Specifying all columns but out of order with a Where clause gives correct results and column info
Selecting all columns (i.e. *) gives correct results and column info
Selecting all columns (i.e. *) with a where clause gives correct results and column info

Let's write the next test. This will have the same shape as the previous test. The primary difference is that I want to add some values to the Error table before I select from it. To do that, I'll just use the ExecuteSql task. The full Target is given below:

 <Target Name="SelectAllColumnsTest">
    <Message Text="********** Begin SelectAllColumnsTest Test **********" />
    <ExecuteSql MsiFileName="$(BuiltOutputPath)"
                Sql="INSERT INTO `Error` (`Error`, `Message`) VALUES ('1', 'One')"
    />
    <ExecuteSql MsiFileName="$(BuiltOutputPath)"
                Sql="INSERT INTO `Error` (`Error`, `Message`) VALUES ('2', 'Two')"
    />
    <Select MsiFileName="$(BuiltOutputPath)"
            TableName="Error" 
            Columns="Error;Message">
        <Output TaskParameter="Records" ItemName="ErrorRecords" />
        <Output TaskParameter="ColumnInfo" ItemName="ErrorColumnInfo" />
    </Select>
    <Message Text="---------- Error Table Information ------------" />
    <Message Text="Table: %(ErrorColumnInfo.Identity)   Name: %(ErrorColumnInfo.Name)    Type: %(ErrorColumnInfo.Type)   Index: %(ErrorColumnInfo.Index)" />
    <Message Text="---------- Error Records ------------" />
    <Message Text="Table: %(ErrorRecords.Identity)   Error: %(ErrorRecords.Error)    Message: %(ErrorRecords.Message)" 
             Condition="'%(ErrorRecords.Identity)'!=''"/>
</Target>

I have to update the depends on property to make sure this test is run:

 <PropertyGroup>
    <RunTestsDependsOn>
        SelectFromEmptyTest;
        SelectAllColumnsTest; 
    </RunTestsDependsOn>
</PropertyGroup>

Now building the project yields:

 Project "E:\MWadeBlog\src\Tests\SetupSelectTests\PostBuild.proj" (default targets):

Target SelectFromEmptyTest:
    ********** Begin SelectFromEmpty Test **********

    ---------- Error Table Information ------------
    Table: Error   Name: Error    Type: Integer   Index: 1
    Table: Error   Name: Message    Type: String   Index: 2
    ---------- Error Records ------------
Target SelectAllColumnsTest:
    ********** Begin SelectAllColumnsTest Test **********

    ---------- Error Table Information ------------
    Table: Error   Name: Error    Type: Integer   Index: 1
    Table: Error   Name: Message    Type: String   Index: 2
    ---------- Error Records ------------
    Table: Error   Error: 1    Message: One
    Table: Error   Error: 2    Message: Two

This looks right. Let's add a new test with a Where clause. This will be almost exactly the same as before, I just need to modify the Select task parameters a little:

 <Target Name="SelectAllColumnsWithWhereTest">
    <Message Text="********** Begin SelectAllColumnsWithWhereTest Test **********" />
    <ExecuteSql MsiFileName="$(BuiltOutputPath)"
                Sql="INSERT INTO `Error` (`Error`, `Message`) VALUES ('1', 'One')"
    />
    <ExecuteSql MsiFileName="$(BuiltOutputPath)"
                Sql="INSERT INTO `Error` (`Error`, `Message`) VALUES ('2', 'Two')"
    />
    <Select MsiFileName="$(BuiltOutputPath)"
            TableName="Error" 
            Columns="Error;Message"
            Where="`Error`=1" >
        <Output TaskParameter="Records" ItemName="ErrorRecords" />
        <Output TaskParameter="ColumnInfo" ItemName="ErrorColumnInfo" />
    </Select>
    <Message Text="---------- Error Table Information ------------" />
    <Message Text="Table: %(ErrorColumnInfo.Identity)   Name: %(ErrorColumnInfo.Name)    Type: %(ErrorColumnInfo.Type)   Index: %(ErrorColumnInfo.Index)" />
    <Message Text="---------- Error Records ------------" />
    <Message Text="Table: %(ErrorRecords.Identity)   Error: %(ErrorRecords.Error)    Message: %(ErrorRecords.Message)" 
             Condition="'%(ErrorRecords.Identity)'!=''"/>
</Target>

Of course, I also added SelectAllColumnsWithWhereTest to RunTestsDependsOn. Unfortunately, building the project caused a slight error:

 Project "E:\MWadeBlog\src\Tests\SetupSelectTests\PostBuild.proj" (default targets):

Target SelectFromEmptyTest:
    ********** Begin SelectFromEmpty Test **********

    ---------- Error Table Information ------------
    Table: Error   Name: Error    Type: Integer   Index: 1
    Table: Error   Name: Message    Type: String   Index: 2
    ---------- Error Records ------------
Target SelectAllColumnsTest:
    ********** Begin SelectAllColumnsTest Test **********

    ---------- Error Table Information ------------
    Table: Error   Name: Error    Type: Integer   Index: 1
    Table: Error   Name: Message    Type: String   Index: 2
    ---------- Error Records ------------
    Table: Error   Error: 1    Message: One
    Table: Error   Error: 2    Message: Two
Target SelectAllColumnsWithWhereTest:
    ********** Begin SelectAllColumnsWithWhereTest Test **********
    E:\MWadeBlog\src\Tests\SetupSelectTests\PostBuild.proj(54,9): error : Execute,Params
Done building target "SelectAllColumnsWithWhereTest" in project "PostBuild.proj" -- FAILED.

Done building project "PostBuild.proj" -- FAILED.

Build FAILED.
E:\MWadeBlog\src\Tests\SetupSelectTests\PostBuild.proj(54,9): error : Execute,Params

For some reason this is failing. It took me a little while to figure it out, but maybe its obvious to you. Lets see if you can figure it out, especially after I give this hint. Suppose I only run this one test by changing my command line to

 msbuild.exe  /t:SelectAllColumnsWithWhereTest /p:Configuration="$(Configuration)" /p:BuiltOutputPath="$(BuiltOuputPath)" /p:ProjectDir="$(ProjectDir)." /p:BuiltOutputDir="$(ProjectDir)$(Configuration)" $(ProjectDir)\PostBuild.proj"

This works just fine:

 Project "E:\MWadeBlog\src\Tests\SetupSelectTests\PostBuild.proj" (SelectAllColumnsWithWhereTest target(s)):

Target SelectAllColumnsWithWhereTest:
    ********** Begin SelectAllColumnsWithWhereTest Test **********

    ---------- Error Table Information ------------
    Table: Error   Name: Error    Type: Integer   Index: 1
    Table: Error   Name: Message    Type: String   Index: 2
    ---------- Error Records ------------
    Table: Error   Error: 1    Message: One

Build succeeded.

The problem was that I was attempting to add to the database twice. Windows Installer doesn't like this and issues an error (albeit not a very good one).

So I need a way to work around this. Let's start by breaking the adding to the database as a new Target:

 <Target Name="FillErrorTable"
    <ExecuteSql MsiFileName="$(BuiltOutputPath)"
                Sql="INSERT INTO `Error` (`Error`, `Message`) VALUES ('1', 'One')"
    />
    <ExecuteSql MsiFileName="$(BuiltOutputPath)"
                Sql="INSERT INTO `Error` (`Error`, `Message`) VALUES ('2', 'Two')"
    />
</Target>

Now we need a way to ensure that this gets called prior to running our other tests. One way would be to add the FillErrorTable to the RunTestsDependsOn. But there is a drawback to that: what if I only want to run one of the tests (like i did up above)? Then the target won't get called. The answer to this seems obvious: say that these other tasks depends FillErrorTable. But this could lead to the same problem: after running SelectAllColumnsTest the error table has the information in it. Calling SelectAllColumnsWithWhereTest will force FillErrorTable again which will cause the problem all over again. Ideally, FillErrorTable would not run unless it needed to. (Actually, what I really need is to keep the tests isolated by creating separate projects for each test or (some other means). But let's not deal with that now: I'm in the zone).

Of course, there are multiple ways to do that: I could first query the database to see if the entries have already been added, or I could have FillErrorTable keep track of whether or not it has already run. The first seems more robust (after all, it's possible that other tests could have a side-effect of removing our entries), but I'm going to go with the second because that will cause fewer database hits.

The solution is pretty simple: at the end of the task, simply create a property and set it to 'true'. Then condition the ExecuteSql statements to not run if this property is true. Something like this:

 <Target Name="FillErrorTable">
    <ExecuteSql MsiFileName="$(BuiltOutputPath)"
                Sql="INSERT INTO `Error` (`Error`, `Message`) VALUES ('1', 'One')"
                Condition="'$(_FillErrorTableCalled)'!='true'" 
    />
    <ExecuteSql MsiFileName="$(BuiltOutputPath)"
                Sql="INSERT INTO `Error` (`Error`, `Message`) VALUES ('2', 'Two')"
                Condition="'$(_FillErrorTableCalled)'!='true'" 
    />
     <CreateProperty Value="true"><br>        <Output TaskParameter="Value" PropertyName="_FillErrorTableCalled" /><br>    </CreateProperty> 
</Target>

The property is created by a call to the CreateProperty task. The property getting set is "_FillErrorTableCalled". The underscore at the beginning is MSBuild convention to signify that the property is considered private: that is, it is not a value that should be set in a project file within a PropertyGroup node.

After modifying the tests to depend on this new Target:

 <Target Name="SelectAllColumnsTest"
        DependsOnTargets="FillErrorTable">

Running the build shows:

 Project "E:\MWadeBlog\src\Tests\SetupSelectTests\PostBuild.proj" (default targets):

Target SelectFromEmptyTest:
    ********** Begin SelectFromEmpty Test **********

    ---------- Error Table Information ------------
    Table: Error   Name: Error    Type: Integer   Index: 1
    Table: Error   Name: Message    Type: String   Index: 2
    ---------- Error Records ------------

Target SelectAllColumnsTest:
    ********** Begin SelectAllColumnsTest Test **********
    ---------- Error Table Information ------------
    Table: Error   Name: Error    Type: Integer   Index: 1
    Table: Error   Name: Message    Type: String   Index: 2
    ---------- Error Records ------------
    Table: Error   Error: 1    Message: One
    Table: Error   Error: 2    Message: Two
Target SelectAllColumnsWithWhereTest:
    ********** Begin SelectAllColumnsWithWhereTest Test **********

    ---------- Error Table Information ------------
    Table: Error   Name: Error    Type: Integer   Index: 1
    Table: Error   Name: Message    Type: String   Index: 2
    ---------- Error Records ------------
    Table: Error   Error: 1    Message: One
    Table: Error   Error: 2    Message: Two

Build succeeded.

That doesn't look right: the SelectAllColumnsWithTest still returned Error 2. If I run this test by itself, I still get the correct result. So what gives? I think MSBuild is being a little too helpful here by appending the output items rather than clearing out the old ones. I don't think there's an easy way around this (at least my inital scan of MSDN didn't find any) aside from changing the names of the output parameters. But when you get two big signals like this, I think its pretty clear that maybe its time to change our approach.

(To be continued...)

SetupProjects.Tasks-1.0.20531.0.msi

Give Me Everything

Step one: Stop requiring the Columns task parameter

Step two: Update the SELECT statement to account for an empty list of columns

Step three: Populate the column info

Verification

Additional resources

Step two: Update the `SELECT` statement to account for an empty list of columns