SharePoint and the Emergence of the Data Scientist


  By Geoff Evelyn, SharePoint MVP owner of

As the use of content management systems evolves, with users adding more, ahem, "content", the organizations accountable for those systems will need to build in people who can manage that content, and particularly people who can find insights in it for the benefit of the organization.

Business intelligence requirements and implementations are growing faster than ever before, particularly due to the rise of cloud computing and the growing number of cloud services. There is now much more pressure to ensure that customer interactions are tracked as a key aspect of business intelligence data gathering. This is one of the most important ways of working out the value that cloud services provide.

Examples of this are everywhere. Azure Media Services and partners provided cloud-based components for the London Olympics, delivering VOD content to more than 30 countries across three continents. This was provided again for the 2014 Winter Olympics using a combination of Microsoft Dynamics and Windows Azure. Platform as a Service (PaaS) is being driven by mobile application developers, which means more push notification and geo-location services, for example. IBM announced plans to expand its global cloud footprint, committing $1.2 billion.

Due to this and other factors, there is already a requirement for people who can help manage customer data, usage data, behavioral analysis and so on. The big problems are: who knows how to gather the data, what is the actual skill-set required, and what will be the impact on IT services, particularly the roles?

Data Scientists are needed. On this front, demand has raced ahead of supply. Indeed, the shortage of data scientists is becoming a serious constraint in some sectors. Greylock Partners, an early-stage venture firm that has backed companies such as Facebook, LinkedIn, Palo Alto Networks, and Workday, is worried enough about the tight labor pool that it has built its own specialized recruiting team to channel talent to businesses in its portfolio.

Data is becoming so important!
Much of the current enthusiasm for big data focuses on technologies that make taming it possible, including Hadoop (the most widely used framework for distributed file system processing) and related open-source tools, cloud computing, and data visualization. While those are important breakthroughs, it is worth noting that getting the correct people with the skill set (and the mind-set) to put them to good use is just as important. The emergence of integrated app development will also ensure that data scientists are required. Devices with sensors are already upon us, and with them the reality of the term 'Internet of Things', where more devices will have Internet connectivity woven into them. This means that adoption and app development will rise at consumer level, and that apps driving sensors will be defined by the data they provide.

Why is data so important? Three reasons:

1. Data is the center, not the application.
Just ten years ago, there was no such thing as 'customer support' or 'customer analysis' when it came to analytics surrounding 'metadata' or site usage, or extracting value from data. The application (in other words, the product used by the customer to create the content) was deemed far more important in the eyes of those who provided or provisioned the product to the customer. Analysis of the data was secondary. Back then, a person in I.T. was known as an individual who did not need to get close to the customer, did not need to have business acumen, and did not need to get 'close' to the data created by the application. Instead, all they needed to worry about was the actual software and hardware.

Nowadays, organizations need to understand whether the products they produce, and therefore the data they provide, are deemed valuable, so they can judge whether their services are useful (and continue to be useful) to a broad range of consumers. That helps them answer key usage and business questions, run the business, and thus make the platform (which surfaces that data) better. There are now requirements to present findings and recommendations to key decision makers at various management levels, so that data provides clarity on ambiguous projects with multiple stakeholders and unclear requirements. Doing that puts a 'machine resource' in place to aid quality control.

2. Job roles are evolving.
An example of a job role evolving is the business analyst, who is typically responsible for gathering business requirements, defining metrics, aiding dashboard designs and producing product executive reports, as well as ensuring ROI assessment is mapped to an ever-evolving SharePoint platform. I suspect that Data Scientists are there to help the business analyst, by identifying the correct data to help fulfil the requirements. So why doesn’t a company simply stick with the business analyst? The key reason, again, is simple. Organizations today are starting to solve the issues concerning data silos by storing large volumes of decision support data in warehouses. This data is becoming more and more complex. All organizations wish to centralize their data, in many cases need to carry out real-time analysis, and will require people with analytic skills. Some organizations take on a business analyst whose role then starts to overlap with that of a data scientist: the value and experience of the business analyst is built up first, and they are then able to analyse the data further, provided the data does not get too complex!

3. Data Convergence is a strategic reality.
The SharePoint objective is to manage and centralize data - that is clear. This means bringing data together in a central place, instead of relevant users having to visit multiple places, which would impact productivity. So, if you have data in, say, IBM Cognos, then it is probable that you would want to see that data presented as, say, dashboards in a SharePoint site. If you have data in SQL, then again, you may want to see that presented through SharePoint. Without talking about the technical requirements to do this, surely the business requirement is to identify which data has value and should be exposed first? So, you would say 'easy, get someone to do that'. So who? Get a longtime Business Intelligence power user to mine the data? You could do that, except then they are in fact very close to being a data scientist... The point here is that the more there is a requirement to understand the various data silos, the harder it will be to say that you do not need a resource to understand them.

Let us now put into context why data is so important from a personal perspective. Acronis, the backup / restore company, carried out a survey in February 2014, which polled 818 respondents. This revealed two interesting statistics:

· 53 percent say their personal pictures are the most important things to them versus their videos, music, etc.

· 74 percent say the value of their personal files is more important than the devices themselves.

What is a Data Scientist?
Before you say 'Data Scientist is not an IT Role' - think again. Data analysts generally have a strong foundation in applications and computer science. They are able to communicate analysis results to IT and the business. By analysing organizational data and selecting insights, they can identify problems and solutions. They look at the data from multiple angles and recommend ways to apply it. Data Scientist is a strategic position: it helps by defining the data model and examining it for consistency and reliability. It is also a technical position, because Data Scientists understand and use various business intelligence systems, tools, reports and data sets to generate business insights.

A Data Scientist is effectively an amalgamation of a number of older roles such as Statistical Analyst, Data Miner, Predictive Modeller or even Analytical Analyst. Add to those some of the newer and now very important areas of customer analysis on sites, such as behavioural analysis and sentiment analysis. To put this into context, Data Scientists are in fact used by Microsoft Office 365 teams to analyse the terabytes of data being collected per day. That data will not simply be limited to usage; it will be based on support up/down, performance, helpdesk tickets, time to resolution and more.

A client working with SharePoint is extremely concerned about data analysis. They have systems which are connected to sensors. "The amount of data we have coming off our sensors produces data every ten minutes. This we are counting as Big Data. It is very important to us that we are able to analyse that data from a business perspective, and not simply rely on technical information."

The SharePoint impact
In the early days of SharePoint, the roles were those of Engineer and Administrator; these have altered and grown over time to include all the roles required to deploy SharePoint to an organization. Generally, they are technical roles. However, they are still deemed non-data-centric in the eyes of the business, which focuses on the value of the data to help meet its business imperative - cash flow. The critical business-facing roles, particularly those of Business Analyst and Content Strategist, are now heading into data analysis land to extract value from that data, again to focus on the business imperative.

What has been emerging, and will continue to emerge, is the requirement to analyse the data. This has been going on for some time - just looking at the requirements for usage statistics on SharePoint solutions is a sign. Take that, and add on the requirement to analyse usage from an app provided, say, in Office 365, and that then raises questions about the other data that can be extracted from the app. It is not just the technical log information either; the behavioral aspect of the users working with the app is also data.

Another aspect of data management is the focus of business intelligence and the connectivity of disparate data sources to provide information. In a small organization, it will be prudent to analyse the data to identify the best fit; however, this is simply an aspect of business intelligence gathering which 'ends' when the dashboard is defined. As the data gathering becomes more complex, and the analysis requirements become more focused on customer usage and behaviour, that analysis no longer ends with the dashboard. An impact on SharePoint is therefore ensuring there is a clear distinction between analysing the data which is surfaced and the data which is not. Another impact could be that the focus of SharePoint data analysis still seems to be on the tools, so that one can focus on how to present and visualize the data; however, there still needs to be work done to identify the connections between multiple sets of data in the enterprise.

The emergence of the Data Scientist is not altogether new, but is definitely gathering pace. Particularly when there is data trend spotting to be done, Data Scientists will be needed to help bring change to the organization. Irrespective of the platform, and I have mentioned SharePoint only because it provides the means to surface that data, organizations will need the skill-set and the mind-set to be able to make sense of that data as the organization evolves.

As we working humans harvest more data, we will need data scientists who are able to determine, using machines, which data will be of critical value to the organization - it is that machine intelligence which will become vitally important as we steer into the wealth of customer-centric cloud services.

As I researched the items for this article, I was amazed at the mass of information concerning what it means to be a Data Scientist. Most notable were ‘What is a Data Scientist?’ by Forbes, and an interesting PowerPoint from Network World on the roles available out there for a Data Scientist.


Microsoft Accelerate Your Insights
Interested in finding out more on Microsoft Business Intelligence? Join us on May 1st for an event that will explore the opportunities for organisations wishing to accelerate their use of data insights. To find out more about this free online event, check out the article ‘Can big data be used to help your customers?’ by Anthony Saxby (Microsoft Data Platform lead in the UK).

Find out more about Microsoft SharePoint and how it could rapidly respond to your business needs.
