Tutorial: Charting with Excel from F#
Applies to: Functional Programming
Summary: The tutorial shows how to use Excel automation from F#. It demonstrates how to write data to an Excel worksheet and generate a chart from the data.
This topic contains the following sections.
- Using Excel for F# Charting
- Creating a Worksheet
- Writing Data to Excel Worksheets
- Adding a Chart Using Chart Wizard
- Additional Resources
- See Also
This article is is associated with Real World Functional Programming: With Examples in F# and C# by Tomas Petricek with Jon Skeet from Manning Publications (ISBN 9781933988924, copyright Manning Publications 2009, all rights reserved). No part of these chapters may be reproduced, stored in a retrieval system, or transmitted in any form or by any means—electronic, electrostatic, mechanical, photocopying, recording, or otherwise—without the prior written permission of the publisher, except in the case of brief quotations embodied in critical articles or reviews.
Using Excel for F# Charting
When working with data in F#, it may be desirable to export some part of the data to Excel for further processing and analysis or to generate Excel files that can be shared with other people. This can be done using managed wrappers for the API exposed by Excel. This tutorial shows how to write a script that starts the Excel application and manipulates it from the interactive environment. You will learn how to:
Reference the Excel API and start the application from F#.
Write arrays of data to an Excel Worksheet.
Use the ChartWizard method to easily create plots in Excel.
All code in this tutorial is written as an F# script, which means that it can be executed step-by-step using F# Interactive. However, the same code would also work in a compiled application. The only difference is that Excel libraries would be referenced from the project instead of using the #r directive.
Creating a Worksheet
When working with Excel, the first step is to create a new Excel document, which is called a workbook. A workbook consists of one or more worksheets (in Excel, these appear as tabs below the main working area). A newly created workbook automatically contains a single empty worksheet, so it doesn't have to be added explicitly. However, the script needs to locate it so that it can write data to it. The following snippet loads Excel, creates a workbook, and finds the main worksheet. The snippet also needs to reference assemblies that wrap the Excel API (for more information about this topic see How to: Install Office Primary Interop Assemblies):
// Reference the Excel interop assembly #r "Microsoft.Office.Interop.Excel.dll" open Microsoft.Office.Interop.Excel // Run Excel as a visible application let app = new ApplicationClass(Visible = true) // Create new file and get the first worksheet let workbook = app.Workbooks.Add(XlWBATemplate.xlWBATWorksheet) // Note that worksheets are indexed from one instead of zero let worksheet = (workbook.Worksheets. :?> Worksheet)
The first two lines add reference to the main Excel API assembly and open a namespace with Excel types. Once the reference is added, it is possible to create an instance of the ApplicationClass type to start Excel. The snippet makes the application visible so that the results are immediately visible in Excel. A compiled application would more likely set the Visible property to false and save the generated Excel document into a file.
A new document is created using the Add method of the Workbooks property. The argument specifies a template with a single worksheet that can contain data (another option is to create a separate worksheet with a chart only). Next, the snippet gets the first worksheet from the workbook and downcasts it to a type that exposes the functionality in a strongly typed fashion.
Writing Data to Excel Worksheets
A worksheet consists of individual cells that can contain data. The cells can be indexed using ranges. For example, the range A1:B5 specifies a section with two columns and five rows. When writing data to Excel, the script can access individual cells, but it can also specify an array of values that should be populated to a specified range. A range is obtained by calling the Range member of the worksheet with two corners of the range as arguments.
The value of a range can be set using Value2 property. A value of an individual cell can be set to values of many standard F# types including strings, integers, floating point numbers, and dates. When setting a value for a range, the argument should be two-dimensional arrays (for an arbitrary range) or one-dimensional arrays (for a single row of data). The following listing shows how to generate a random table with 10 rows and three columns and add it to the Excel worksheet. It also adds textual titles to the table:
open System // Store data in arrays of strings or floats let rnd = new Random() let titles = [| "No"; "Maybe"; "Yes" |] let names = Array2D.init 10 1 (fun i _ -> string('A' + char(i))) let data = Array2D.init 10 3 (fun _ _ -> rnd.NextDouble()) // Populate data into Excel worksheet worksheet.Range("C2", "E2").Value2 <- titles worksheet.Range("B3", "B12").Value2 <- names worksheet.Range("C3", "E12").Value2 <- data
The result of calling the above code is shown in Figure 1. The code snippet first creates the titles for the data. The one-dimensional array titles provides a title row, and the two-dimensional array names with a single column provides a title column. The data is stored in the 2D array of floating point numbers data. To write the data, the snippet first uses the Range property to get the ranges of compatible sizes and then assigns the created arrays as values for each range.
Figure 1. A dataset written to an Excel worksheet
Once the worksheet contains data, it is easy to create a chart using the Excel application. However, charts can be created programmatically using F#. This can be done using a method that behaves like the chart wizard tool in Excel.
Adding a Chart Using Chart Wizard
To create a chart in Excel, it is first necessary to construct a new chart object (of CharatObject type). All important properties of the chart can be configured using a single method call. The previous sections already added data to the Excel worksheet, so this section just needs to specify the range where the data can be found. If Excel recognizes that the range includes an initial row or a column with titles (because they contain strings), it will use the string values as titles. Otherwise, it will use automatically generated titles that can be easily hidden.
The following snippet creates a 3D Stacked Column chart. This means that the three values in a single row will be drawn as three parts of a single column. The chart assumes that the values add up to 100%, so all the columns will occupy the entire area. This type of a chart can be used, for example, to present the choices made by a group of subjects.
// Add new item to the charts collection let chartobjects = (worksheet.ChartObjects() :?> ChartObjects) let chartobject = chartobjects.Add(400.0, 20.0, 550.0, 350.0) // Configure the chart using the wizard chartobject.Chart.ChartWizard ( Title = "Stacked column chart", Source = worksheet.Range("B2", "E12"), Gallery = XlChartType.xl3DColumnStacked100, PlotBy = XlRowCol.xlColumns) // Set graphical style of the chart chartobject.Chart.ChartStyle <- 2
The snippet first creates an empty chart object and then configures it using the ChartWizard method. The method takes a large number of optional parameters. The most important one is the Source parameter, which should be set to a range with the data (and optionally also titles). The code also needs to set the Gallery parameter, which specifies the type of the chart to use—other options available in the XlChartType enumeration can be explored using IntelliSense. Finally, the snippet also specifies the title and the orientation of the chart. In the case of a bar chart, the data can be interpreted either as (vertical) columns or as (horizontal) rows.
The last line sets the ChartStyle property of the chart. This is one of the predefined graphical styles of charts that are available in Office 2007 or newer and it sets the colors and other visual properties of the chart to make it look like the example in Figure 2. Various predefined styles appear in the Design tab after selecting the chart in Excel.
Figure 2. 3D stacked column chart generated using the ChartWizard method
The chart can be also saved to file by calling the Export method of the chartobject.Chart object. This method allows you to save the chart in a standard graphical format such as PNG.
This tutorial looked at a brief F# script that uses Excel to create a new workbook, write data to the worksheet, and add a stacked column chart. This option can be attractive, especially if Excel is already part of a project tool-chain. For example, data may be loaded and preprocessed in F# but then saved back to Excel. An analyst who is comfortable with Excel can then further work with the data. Another benefit of using Excel is that this approach makes it easy to combine working with data programmatically in F# and performing analysis with Excel tools.
This tutorial demonstrated how to use Excel for creating charts. Excel is good in specific situations but it may not be the best tool in general. However, F# can access a wide range of other charting tools. The following list provides links to tutorials for other technologies discussed in this section. Consult the overview article for comparison of various options:
To download the code snippets shown in this article, go to http://code.msdn.microsoft.com/Chapter-6-Visualizing-Data-c68a2296
This article is based on Real World Functional Programming: With Examples in F# and C#. Book chapters related to the content of this article are:
Book Chapter 4: “Exploring F# and .NET libraries by example” demonstrates how to create a simple charting application from scratch. This chapter is useful for learning F# programming when focusing on working with data.
Book Chapter 12: “Sequence expressions and alternative workflows” explains how to work with in-memory data sets in F# using sequence expressions and higher-order functions.
Book Chapter 13: “Asynchronous and data-driven programming” shows how to use asynchronous workflows to obtain data from the internet, how to convert data to a structured format, and how to chart it using Excel.
The following MSDN documents are related to the topic of this article:
Excel Object Model Overview provides documentation for the objects exposed by the Excel .NET API that can be called from F#.
Arrays (F#) discusses how to create and work with arrays in F#. Data stored in arrays can be easily written to Excel worksheets.
Parameters and Arguments (F#) contains useful information about optional arguments and named parameters.
ChartSheet.ChartWizard Method lists arguments that can be provided when calling a method for adding charts to Excel.
Previous article: Tutorial: Visualizing Stock Prices Using F# Charts
Next article: Tutorial: Charting with Gnuplot from F#