Determining analytical goals


From: Developing big data solutions on Microsoft Azure HDInsight

Before embarking on a big data project it is generally useful to think about what you hope to achieve, and clearly define the analytical goals of the project. In some projects there may be a specific question that a business wants to answer, such as “where should we open our new store?” In other projects the goal may be more open-ended; for example, to examine website traffic and try to detect patterns and trends in visitor numbers. Understanding the goals of the analysis can help you make decisions about the design and implementation of the solution, including the specific technologies to use and the level of integration with existing BI infrastructure.

Historical and predictive data analysis

Most organizations already collect data from multiple sources. These might include line of business applications such as websites, accounting systems, and office productivity applications. Other data may come from interaction with customers, such as sales transactions, feedback, and reports from company sales staff. The data is typically held in one or more data stores, and is often consolidated into a specially designed data warehouse system.

Having collected this data, organizations typically use it to perform the following types of analysis and reporting:

  • Historical analysis and reporting, which is concerned with summarizing data to make sense of what happened in the past. For example, a business might summarize sales transactions by fiscal quarter and sales region, and use the results to create a report for shareholders. Additionally, business analysts within the organization might explore the aggregated data by drilling down into individual months to determine periods of high and low sales revenue, or drilling down into cities to find out if there are marked differences in sales volumes across geographic locations. The results of this analysis can help to inform business decisions, such as when to conduct sales promotions or where to open a new store.
  • Predictive analysis and reporting, which is concerned with detecting data patterns and trends to determine what’s likely to happen in the future. For example, a business might derive statistics from historical sales data and apply them to known customer profile information to predict which customers are most likely to respond to a direct-mail campaign, or which products a particular customer is likely to want to purchase. This analysis can help improve the cost-effectiveness of a direct-mail campaign, or increase sales while building closer customer relationships through relevant targeted recommendations.

Both kinds of analysis and reporting involve taking source data, applying an analytical model to that data, and using the output to inform business decision making. In the case of historical analysis and reporting, the model is usually designed to summarize and aggregate a large volume of data to determine meaningful business measures—for example, the total sales revenue aggregated by various aspects of the business, such as fiscal period and sales region.
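
As a rough illustration of the historical case, the following Python sketch totals sales revenue by fiscal quarter and sales region, then drills down into the months of a single quarter. The transaction rows, the field names, and the `total_revenue` helper are all hypothetical and exist only for this example; a real solution would run an equivalent aggregation over a much larger data store.

```python
from collections import defaultdict

# Hypothetical sales transactions: (quarter, month, region, city, revenue).
SALES = [
    ("Q1", "Jan", "North", "Seattle", 1200.0),
    ("Q1", "Feb", "North", "Seattle", 800.0),
    ("Q1", "Mar", "North", "Portland", 500.0),
    ("Q1", "Jan", "South", "Austin", 700.0),
    ("Q2", "Apr", "North", "Seattle", 900.0),
    ("Q2", "May", "South", "Austin", 1100.0),
]

# Field names, in the same order as the values in each row above.
FIELDS = ("quarter", "month", "region", "city", "revenue")
REVENUE = FIELDS.index("revenue")


def total_revenue(rows, *group_by):
    """Sum revenue for every distinct combination of the group_by fields."""
    positions = [FIELDS.index(name) for name in group_by]
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[p] for p in positions)
        totals[key] += row[REVENUE]
    return dict(totals)


# Summary level: revenue by fiscal quarter and sales region.
summary = total_revenue(SALES, "quarter", "region")

# Drill down: revenue by month within the first quarter only.
q1_rows = [row for row in SALES if row[FIELDS.index("quarter")] == "Q1"]
drill_down = total_revenue(q1_rows, "month")

if __name__ == "__main__":
    for key, value in sorted(summary.items()):
        print(key, value)
    for key, value in drill_down.items():
        print(key, value)
```

The same helper serves both the summary and the drill-down because the only thing that changes is which rows are included and which fields form the grouping key.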

For predictive analysis the model is usually based on a statistical algorithm that groups similar data into clusters, or that correlates data attributes that may influence one another in order to reveal trends and their likely causes. Examples include classifying customers based on demographic attributes, or identifying a relationship between customer age and the purchase of specific products.
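
The age example can be sketched in a few lines of Python. This is a deliberately simplified illustration, not a production predictive model: the customer records are invented, the age bands are fixed by hand rather than learned from the data, and the `pearson`, `age_band`, and `purchase_rate_by_band` helpers are hypothetical names. It computes a Pearson correlation coefficient between age and purchase, then groups customers into age bands and reports the purchase rate for each band.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical customers: (age, purchased), where 1 = bought the product.
CUSTOMERS = [
    (22, 0), (25, 0), (29, 1), (31, 0),
    (38, 1), (45, 1), (52, 1), (60, 1),
]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def age_band(age):
    """Assign a customer to a hand-picked demographic band (an assumption)."""
    if age < 30:
        return "under 30"
    if age < 50:
        return "30-49"
    return "50+"


def purchase_rate_by_band(customers):
    """Fraction of customers in each age band who bought the product."""
    bought = defaultdict(int)
    seen = defaultdict(int)
    for age, purchased in customers:
        band = age_band(age)
        seen[band] += 1
        bought[band] += purchased
    return {band: bought[band] / seen[band] for band in seen}


ages = [age for age, _ in CUSTOMERS]
purchases = [purchased for _, purchased in CUSTOMERS]
correlation = pearson(ages, purchases)
rates = purchase_rate_by_band(CUSTOMERS)

if __name__ == "__main__":
    print(f"age/purchase correlation: {correlation:.2f}")
    for band, rate in rates.items():
        print(f"{band}: {rate:.0%} purchased")
```

A positive coefficient suggests that older customers in this sample were more likely to buy, and the per-band rates could be used to decide which group to target. A statistical algorithm of the kind described above would typically derive the groupings from the data itself rather than from fixed thresholds.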

Databases are the core of most organizations’ data processing, and in most cases their purpose is simply to “run the operation” by, for example, storing and manipulating data to manage stock and create invoices. However, analytics and reporting is one of the fastest-growing sectors in business IT as managers strive to learn more about their organization.

Analytical goals

Although every project has its own specific requirements, big data projects generally fall into one of the following categories:

  • One-time analysis for a specific business decision. For example, a company planning to expand by opening a new physical store might use big data techniques to analyze demographic data for a shortlist of proposed store sites in order to determine the location that is likely to result in the highest revenue for the store. Alternatively, a charity planning to build water supply infrastructure in a drought-stricken area might use a combination of geographic, geological, health, and demographic statistics to identify the best locations.
  • Open “blue sky” exploration of “interesting” data. Sometimes the goal of big data analysis is simply to find out what you don’t already know from the available data. For example, a business might be aware that customers are using Twitter to discuss its products and services, and want to explore the tweets to determine if any patterns or trends can be found that relate to brand visibility or customer sentiment. There may be no specific business decision that needs to be made based on the data, but gaining a better understanding of how customers perceive the business might inform decision-making in the future.
  • Ongoing reporting and BI. In some cases a big data solution will be used to support ongoing reporting and analytics, either in isolation or integrated with an existing enterprise BI solution. For example, a real estate business that already has a BI solution, which enables analysis and reporting of its own property transactions across time periods, property types, and locations, might extend it to include demographic and population statistics data from external sources.

Note

In many respects, data analysis is an iterative process. It is not uncommon for an initial project based on open exploration of data to uncover trends or patterns that form the basis for a new project to support a specific business decision, or to extend an existing BI solution.

The results of the analysis are typically consumed and visualized in the following ways:

  • Custom application interfaces. For example, a custom application might display the data as a chart, or generate a set of product recommendations for a customer.
  • Business performance dashboards. For example, you could use the PerformancePoint Services component of SharePoint Server to display key performance indicators (KPIs) as scorecards, and display summarized business metrics in a SharePoint Server site.
  • Reporting solutions such as SQL Server Reporting Services. For example, business reports can be generated in a variety of formats and distributed automatically by email, or viewed on demand through a web browser.
  • Analytical tools such as Excel. Information workers can explore analytical data models through PivotTables and charts. Business analysts can use advanced Excel capabilities such as Power Query, Power Pivot, Power View, and Power Map to create their own personal data models and visualizations, or use add-ins to apply predictive models to data and view the results in Excel.
