Dallas on PHP (via WebMatrix)

Microsoft Codename "Dallas"Frequent readers know that I’ve been jazzed about Microsoft Codename “Dallas” ever since it was announced at PDC last year (cf. my now dated blog series), so with the release of CTP3 and my upcoming talk at the Vermont Code Camp, I thought it was time for another visit.

Catching Up With CTP3

I typically introduce “Dallas” with an alliterative (ok, cheesy) elevator pitch: “Dallas Democratizes Data,” and while that characterization has always been true – the data transport mechanism is HTTP and the Atom Publishing Protocol – it’s becoming ‘truer’ with the migration of “Dallas” services to OData (the Open Data Protocol).  OData is an open specification for data interoperability on the web, so it’s a natural fit for “Dallas,” which itself is a marketplace for data content providers like NASA, Zillow, the Associated Press, Data.gov, and many others.

Exposing these datasets as OData in CTP3 is a significant development on two fronts:

  1. It enables a more flexible query capability.  For example, you can now use the OData $select system query option to indicate what columns of data you’re interested in, $orderby to sort, and $inlinecount to return the number of rows retrieved.  All of that happens now on the server, reducing bandwidth and the time it takes to get just the specific data you want. 

  2. The developer experience is vastly improved.  If you’re using Visual Studio, you can leverage the Add Service Reference functionality and work with the LINQ abstractions over the OData (WCF Data Services) protocol.  As you might recall from my previous post, previously there was a rather manual process of generating C# service classes that, to boot, didn’t automatically support the asynchronous invocation required by Silverlight.

    If you’re not using .NET or Visual Studio, the presence of numerous OData SDKs (Java, Ruby, PHP, iPhone, etc.) now makes accessing “Dallas” data a snap from many other applications and platforms as well.  That’s the premise of this post, where I’ll share my experiences in accessing a “Dallas” data service via a PHP web page.

Getting Your “Dallas” Account

“Dallas” is in a CTP phase, which stands for “Community Technology Preview” (but which means ‘free’!), and all you need is a Live ID to get started.  If you don’t yet have a “Dallas” account, go to http://www.sqlazureservices.com, and click on the Subscriptions tab.  You’ll be prompted to login via a Live ID and then see a short form to fill out and submit.  You’ll immediately get an account key and be able to subscribe to the various data services. 

In the CTP all the feeds are free, so it’s a good time to explore what’s out there via the Catalog tab.  You’ll see both CTP3 and CTP2 services in the catalog; the CTP3 ones are OData-enabled, and the CTP2 ones are being updated to support OData, after which they’ll be decommissioned.  The bulk of this article applies only to CTP3 services.

The CTP3 dataset I’ll use in this blog post comes from Infogroup and contains data about new businesses that have formed in the past two years in the US.  Actually, since the CTP is gated, the data available now is only for Seattle, and query results are limited to 100 records, but it’s good enough for our purposes.

"Dallas" data set

As with any of the datasets in “Dallas,” once you subscribe, you can query the data directly via the “Dallas” Explorer by selecting the Click here to explore the dataset link found with each dataset listed on your Subscriptions tab..   Each dataset exposes the same generic grid display and query parameter interface to give you a quick glimpse into the fields and data values that are part of the dataset.   This functionality, by the way, has been a part of “Dallas” since day one.

Service URIsWith CTP3, as you experiment with your queries, the OData URI corresponding to the query is shown in the bottom left of the “Dallas” Explorer, and you can view that URL in a browser to see the raw Atom format! Basic Authentication is now used, so you’ll need to supply your “Dallas” account key as the password (and anything you want as the username) when prompted.

Setting Up PHP and the OData SDK

Web Platform InstallerIf you aren’t familiar with the Web Platform Installer (aka WebPI), it’s time you were!  It’s a lightweight, smart client application that you can use to discover, install, and configure all sorts of web development tools and technologies, from ASP.NET to Visual Web Developer Express, from Windows Server AppFabric to WCF RIA Services, and from PHP to numerous open source offerings, like WordPress, Joomla!, Drupal, Umbraco, Moodle, and more.   I use it whenever I refresh my machine – which is frequently in order to get rid of numerous trial and beta bits!

In my case, I already had PHP set up via WebPI, so I went straight to the instructions for installing the OData PHP SDK.  The ZIP file contains everything you need, including a handy README file to set things up.

Caveat: One of the requirements (Step 6 in the README) is the installation of the PHP XSL extension, but that extension is not installed via the Web Platform Installer.  If you don’t find php_xsl.dll in your PHP extensions directory, you can grab just that file from the distribution at http://windows.php.net and update your php.ini file to enable the extension.  As of this writing, the Web Platform Installer installs the non-thread safe version 5.2.14.   It’s important you get that right, or you’ll get some subtle failures later on.

Generating the PHP Proxy Class

With CTP3 services, .NET developers can use the familiar Add Service Reference functionality to generate the proxy plumbing needed to invoke “Dallas” services.  The proxy code hides the underlying HTTP request/response goo and allows the developer to access the service and its data via a data context and LINQ operators, just as they might use LINQ to SQL or the ADO.NET Entity Framework. 

Visual Studio’s “Add Service Reference” UI is actually just a nice wrapper for the WCF Data Service Client Utility, so it stands to reason that there be a similar utility for generating analogous PHP classes.  It’s called PHPDataSvcUtil.php and is included with the PHP OData SDK you installed in the step above.  The utility has a number of switches, but the most important one is /uri via which you specify the “Dallas” service’s metadata endpoint (shown in the image above).  For the service we’re interested in, the following command line (sans line breaks) will generate the desired PHP proxy class:

php PHPDataSvcUtil.php      /uri=https://api.sqlazureservices.com/Data.ashx/InfoUSA/HistoricalBusinesses

Since I didn’t specify an output file, the file is placed in the same directory as PHPDataSvcUtil, and that file includes two significant class types:

  • A container class that serves the same function as a DataContext or ObjectContext in .NET.  It handles the connection and authentication to the service and exposes query properties for the various data series exposed by the “Dallas” service.  For the selected Infogroup service, the container class is InfoUSAHistoricalBusinessesContainer
  • Classes for each series exposed by the service.  Each class include properties corresponding to the ‘columns’ of data in that series as well as a few utility methods accessed by the container context.   The service being used here has three series, and so three classes:
    • Historical_Business,
    • NewBusiness, and
    • Nixie

Writing Some PHP Code

Until yesterday, the extent of my PHP coding was slapping   <? echo phpinfo(); ?>  into a file, browsing to it, and silently proclaiming, ‘Yup, that’s PHP!”  This then seemed like a perfect opportunity to cut my teeth on Microsoft’s newest entry in the Web development arena – WebMatrix.  Currently in beta, WebMatrix offers a light and simple mechanism for building web sites, with both ASP.NET (and a new view syntax called Razor) and PHP support.   It’s a quick download, and even included in the 3.0 version of the Web Platform Installer, so installing is a breeze.

When you start WebMatrix, you’re not overwhelmed with options; there are precisely four!

WebMatrix start screen 
We’re starting from scratch, so “Site From Template” sounds good.  There’s a few ASP.NET application templates to pick from – by the way, some are pretty amazing in terms of leading you a well-designed and aesthetically pleasing site in no time.  We’re not in this for looks though, we just want data, so I’ll opt for the Empty Site template.

WebMatrix template options

By default, we get a single page – with a .cshtml extension, implying we obviously want to use ASP.NET and Razor, right?   In this (rare) case we don’t, so I’ll just rename the page to index.php.  That bare bones, default page is really just an HTML document with head and body tags, nothing else, so it’s up to us to write some code!

It’s time now to pull in the PHP proxy class file generated earlier, since it includes the container class as well as the classes representing the entities that the “Dallas” service will be returning.  The “Add Existing File…” option is the quickest way to add that PHP class file (InfoUSAHistoricalBusinessesContainer.php) to the project.

Add Existing File option

It’s all down to code now!  Below is an implementation that results in a pretty stark rendering of the company name and address in a table (ack!) format, but it’s enough to get the idea across.  Lines 12-18 are where the service is accessed and queried, the rest is just data rendering; a line-by-line analysis follows the code snippet.

   1: <!DOCTYPE html>
  2: <html>
  3:     <head>
  4:         <title>Dallas PHP Example</title>
  5:     </head>
  6:     <body>
  8:     <?php
 10:     require_once "InfoUSAHistoricalBusinessesContainer.php";
 12:     $svc = new InfoUSAHistoricalBusinessesContainer();
 13:     $svc->Credential = new 
 14:        WindowsCredential("key", "<YOUR ACCOUNT KEY HERE>");
 16:     $qry = $svc->NewBusiness()
 17:                ->Filter("ZipCode eq 98101");
 18:     $response = $qry->Execute();
 20:     echo "<table>";
 21:     foreach ($response->Result as $row)
 22:     {                   
 23:         echo "<td>" . htmlspecialchars($row->CompanyName) . "</td>";
 24:         echo "<td>" . htmlspecialchars($row->Street) . "</td>";
 25:         echo "<td>" . htmlspecialchars($row->City) . "</td>";
 26:         echo "<td>" . htmlspecialchars($row->State) . "</td>";
 27:         echo "</tr>";
 28:     }
 29:     echo "</table>";
 31:     ?>
 32:     </body>
 33: </html>

Here’s the play-by-play for the highlighted code above:

  • Line 12:   instantiate the container (context) class generated by PHPDataSvcUtil.php.  The endpoint URI is encapsulated in this class, but you can pass it in via the constructor as well.

  • Line 13-14:   provide the credentials.  "Dallas” CTP3 uses Basic Authentication in lieu of custom HTTP headers used in previous CTPs.

  • Line 16-17: formulate the query.  The resemblance to LINQ is no accident; the ‘extension’ methods available include:

    • Filter is the equivalent of a SQL WHERE clause (cf. $filter OData system query option).
    • OrderBy is the equivalent of the SQL ORDER BY clause (cf. $orderby).
    • Select facilitates a projection, naming the subset of entity properties (columns) to be included in the result set (cf. $select).
    • Top(n) returns the first n entities (rows) in the result set (cf. $top).
    • Skip(n) returns all but the first n entities in the result set (cf. $skip).
    • IncludeTotalCount includes the number of entities in the result stream along with the results themselves (cf. $inlinecount).  The count value is referenced via a TotalCount method on the query response.
    • Count returns only the number of entities satisfying the query parameters (cf. $count on the URI path).
  • Line 18 – execute the query, returning a collection of objects which are enumerated in the rest of the script to display the data on the page.


Running Your Code (No, it doesn’t work the first time!)

WebMatrix Run optionSo, you’ve done everything right, and can’t wait to hit the Run option in WebMatrix, only to be met with: 

HTTP 500 Internal Server Error

As when playing with any new technology, you pretty much figure it’s your fault, and you’ve screwed up somewhere along the line.  Well, not this time!  I spent an hour or so of debugging (PHP’s error_log() is my friend!) and comparing the execution of a “Dallas” service to my own operational local Northwind WCF Data Service. 

I eventually discovered a subtle difference in the Atom feed coming back as part of each entity in the result.  The successful service included a link tag like:

 <link rel="edit" title="Customer" href="Customers('ALFKI')" />

and the unsuccessful one from “Dallas” had:

 <link rel="edit" title="CityCrime" href="/CityCrime(123889)" />

Note the initial “/” in the latter’s XML href attribute.   I’m not conversant enough with OData to know if the inclusion of the “/” is valid, but I do know that the PHP OData SDK can’t get past it!   The “fix” turns out to be pretty simple though, and since the PHP OData SDK is open source, you can patch it on your own:

  1. Navigate to the directory containing the PHP OData SDK files (mine is at C:\Program Files (x86)\PHP\OData).
  2. Edit the file AtomParser.php in the Parser subdirectory.  You’ll probably need to run notepad or your favorite editor as Administrator and make sure that file isn’t marked as Read-only.
  3. Look for the EnumerateEntry function – in my copy it starts on Line 192,
  4. Add the following highlighted code in the context of the EnumerateEntry method to strip of an initial “/” if there is one.  Note, to save space, the entire method isn’t replicated below.
     protected function EnumerateEntry($entry, &$entityType, $parentObject = null)
        $xPath = self::GetXPathForNode($entry);
        $ids = $xPath->query(self::$QUERY_ID);
        $uri = $ids->item(0)->nodeValue;
        //Try to get EntitySet From edit link
        $entitySet = null;
        $editLinks = $xPath->query(self::$QUERY_EDIT_LINK);
            $href = $this->GetAttribute($editLinks->item(0), 'href');
                if(($pos = strpos($href, '(')) !== FALSE)
                    $entitySet = substr($href, 0, $pos);
                    if (substr($entitySet, 0, 1) == '/')
                        $entitySet = substr($entitySet, 1);        

And that’s a wrap!  I’ll update the post when I hear back from the “Dallas” and/or WCF Data Services teams in terms of where the bug actually lies.