Cool Linq Query

Now that Visual Studio 2008 Beta 2 is out the door, I can write about some cool Linq Queries, and you can run them.

Foxpro has had Language Integrated Query (Linq) in the form of embedded SQL statements (Create Table/Select/Update/Insert/Delete/Alter Table) for almost two decades (I believe it started with Fox 2.0, around 1989?).

Many moons ago, I was working on Zorro/Tesla, the predecessors to Linq. When we found a way to query arrays, I came up with a cool query: use System.Diagnostics.Process.GetProcesses() as a query source.
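
Here is a minimal sketch of that idea as it might look in today's VB syntax. It is not the original Zorro/Tesla code; the module name and the choice of columns (ProcessName and Id) are just for illustration, and it assumes a Console Application with Option Infer On.

```vb
Imports System
Imports System.Diagnostics
Imports System.Linq

Module ProcessQueryDemo
    Sub Main()
        ' The Process() array returned by GetProcesses() is the query source.
        Dim q = From proc In Process.GetProcesses() _
                Order By proc.ProcessName, proc.Id _
                Select proc.ProcessName, proc.Id

        ' Each result is an anonymous type with ProcessName and Id members.
        For Each p In q
            Console.WriteLine("{0,-40} {1,8}", p.ProcessName, p.Id)
        Next
    End Sub
End Module
```

Any method that returns an array can be used as a query source the same way.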

Linq lets you use set-based operations over a data source, rather than procedural ones.
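
To make the contrast concrete, here is a small sketch that counts words in an in-memory array two ways: once with a hand-written loop, once with a query. The module and function names are made up for this example, and the sample array stands in for real source text.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module SetBasedVsProcedural
    ' Procedural: walk the array and maintain the counts by hand.
    Function CountWithLoop(ByVal words As String()) As Dictionary(Of String, Integer)
        Dim counts As New Dictionary(Of String, Integer)
        For Each w As String In words
            If counts.ContainsKey(w) Then
                counts(w) += 1
            Else
                counts(w) = 1
            End If
        Next
        Return counts
    End Function

    ' Set-based: describe the result and let Linq do the grouping.
    Function CountWithQuery(ByVal words As String()) As Dictionary(Of String, Integer)
        Dim q = From w In words _
                Group By w Into Occur = Count() _
                Select w, Occur
        Return q.ToDictionary(Function(g) g.w, Function(g) g.Occur)
    End Function

    Sub Main()
        Dim sample As String() = New String() {"if", "else", "if", "return", "if", "else"}
        For Each pair In CountWithQuery(sample)
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value)
        Next
    End Sub
End Module
```

Both functions return the same counts; the query version just states what is wanted (group by word, count each group) instead of how to accumulate it.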

Linq in VB has way better intellisense than Fox, and can query more types of data (XML, objects and arrays for Linq vs. just tables and cursors for Fox). Additionally, the .Net framework has many methods that return arrays.

Suppose I want to search all “words” in a bunch of C++ source files in a folder (and subfolders) to find which words occur the most. This can be done with one Linq statement!

For example, I found that in a fairly large project with almost 1 million lines of code, “if” occurs more than twice as often as any other word.

The query executes surprisingly quickly, and uses many of the Linq clauses: From, Let, Where, Take While, Group By, Order By and Select.

Just start VS2008, choose File->New->Project->VB->Windows Forms Application, View Code, then paste in the code below.

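The “Take While” in the subquery filters out words after a C++ line comment (“//”), but the query doesn’t filter out multi-line comments (/* comment */) or code excluded by the preprocessor (#ifdef).

Take While stops returning items the first time its predicate is false, so once it sees “//” it drops everything that follows. That is why it has to go in a subquery over the words of a single line: in the main query it would end the whole result at the first comment.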
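
A small sketch of the difference, using one hard-coded line instead of a file. The module and function names are hypothetical, and the line is split only on spaces and tabs to keep the example short. Note that, as in the big query, “//” is only recognized when it stands alone as a word.

```vb
Imports System
Imports System.Linq

Module TakeWhileDemo
    ' Words on one line that appear before a standalone "//" token.
    Function WordsBeforeComment(ByVal line As String) As String()
        Dim q = From w In line.Split(New Char() {" "c, ControlChars.Tab}, _
                                     StringSplitOptions.RemoveEmptyEntries) _
                Take While w <> "//"
        Return q.ToArray()
    End Function

    ' For comparison: Where only drops the "//" token itself
    ' and keeps the words of the comment that follow it.
    Function WordsWithoutSlashes(ByVal line As String) As String()
        Dim q = From w In line.Split(New Char() {" "c, ControlChars.Tab}, _
                                     StringSplitOptions.RemoveEmptyEntries) _
                Where w <> "//"
        Return q.ToArray()
    End Function

    Sub Main()
        Dim line As String = "if (x) return y; // if all else fails"
        Console.WriteLine(String.Join(",", WordsBeforeComment(line)))
        ' prints: if,(x),return,y;
        Console.WriteLine(String.Join(",", WordsWithoutSlashes(line)))
        ' prints: if,(x),return,y;,if,all,else,fails
    End Sub
End Module
```

With Where, the “if” inside the comment would still be counted; with Take While it is not.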

The Browse method below is very useful: it will take the result of a query and show it in a grid.

Now imagine how much code it is to write this query in another language! If you come up with other similar code in various languages, please share your equivalent.

A Foxpro version of the same query would take a couple of procedures and non-query code to manipulate arrays.

I’ll post my C# and Fox versions next to compare …

See also:

VS Beta 2: Bug Fixes, Final Features, Polish and Shine

Use temporary projects in Visual Studio

Remove double spaces from pasted code samples in blog

Code starts here:

Imports System.IO

Public Class Form1

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        Me.Width = 1024
        Me.Height = 768

        ' It sure would be nice to have comments after line continuation!
        ' Comment about query below:
        ' Get All Files matching "*.*" on path with extension ".cpp" or ".h"
        ' Read all lines, then split into array of lines LineArray for each file
        ' Iterate through each line, split into array of words WordArray using specific chars as word delimiters
        ' use a subquery and "Take While" to eliminate any words after a C++ comment: "//"
        ' Iterate through each word, make sure it starts with a letter
        ' get the word and its occurrence count

        Dim q = From FileName In Directory.GetFiles( _
                      "D:\dd\VB03_s2\src\vb\bc\", "*.*", _
                      SearchOption.AllDirectories) _
                  Let ext = Path.GetExtension(FileName).ToLower _
                  Where ext = ".cpp" Or ext = ".h" _
                  Let LineArray = File.ReadAllText(FileName).Split(New Char() {vbLf, vbCr}) _
                  From SingleLine In LineArray _
                  Let WordArray = _
                    (From word In SingleLine.Split( _
                            New Char() {" ", vbTab, "@", "*", ",", ".", "(", ")", "<", ">", ":", ";", "'", """"} _
                                    ) _
                      Take While word <> "//") _
                  From Word In WordArray _
                  Where Char.IsLetter(Word) _
                  Group By Word Into Occur = Count() _
                  Order By Occur Descending _
                  Select Word, Occur

        ' Show the query result in a grid
        Browse(q)

        '' Similar query: all the file extensions in Windows\system32 directory
        'Dim r = From FileName In IO.Directory.GetFiles( _
        ' My.Application.GetEnvironmentVariable("windir") + "\system32", "*.*", SearchOption.TopDirectoryOnly) _
        ' Select FileExt = IO.Path.GetExtension(FileName) _
        ' Group By FileExt Into Occur = Count() _
        ' Order By FileExt _
        ' Select FileExt, Occur

    End Sub

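    Sub Browse(Of t)(ByVal seq As IEnumerable(Of t))
        Dim GridView As New DataGridView
        GridView.Width = Me.Width
        GridView.Height = Me.Height
        ' The grid has to be added to the form's controls, or it is never displayed
        Me.Controls.Add(GridView)

        ' Run the query once and bind the grid to the resulting list
        Dim pl = New List(Of t)(seq)
        GridView.DataSource = pl
        Me.Text = pl.Count.ToString ' show the row count in the title bar
        GridView.Dock = DockStyle.Fill
    End Sub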

End Class


End of code