Team Data Science Process for Developer Operations

This article explores the Developer Operations (DevOps) functions that are specific to an Advanced Analytics and Cognitive Services solution implementation. These training materials relate to the Team Data Science Process (TDSP) and to Microsoft and open-source software and toolkits, which are helpful for envisioning, executing, and delivering data science solutions. The article references topics that cover the DevOps toolchain that is specific to Data Science and AI projects and solutions.

Lesson Path

The following table provides guidance at specified levels to help complete the DevOps objectives that are needed to implement data science solutions with Azure technologies.

Objective Topic Resource Technologies Level Prerequisites
Understand Advanced Analytics The Team Data Science Process Lifecycle This technical walkthrough describes the Team Data Science Process Data Science Intermediate General technology background, familiarity with data solutions, Familiarity with IT projects and solution implementation
Understand the Microsoft Azure Platform for Advanced Analytics Information Management
This reference gives an overview of using Azure Data Factory to build pipelines that collect and orchestrate data from the services you use for analysis Microsoft Azure Data Factory Experienced General technology background, familiarity with data solutions, Familiarity with IT projects and solution implementation
This reference covers an overview of the Azure Data Catalog which you can use to document and manage metadata on your data sources Microsoft Azure Data Catalog Intermediate General technology background, familiarity with data solutions, familiarity with Relational Database Management Systems (RDBMS) and NoSQL data sources
This reference covers an overview of the Azure Event Hubs system and how you can use it to ingest data into your solution Azure Event Hubs Intermediate General technology background, familiarity with data solutions, familiarity with Relational Database Management Systems (RDBMS) and NoSQL data sources, familiarity with the Internet of Things (IoT) terminology and use
Big Data Stores
This reference covers an overview of using the Azure SQL Data Warehouse to store and process large amounts of data Azure SQL Data Warehouse Experienced General technology background, familiarity with data solutions, familiarity with Relational Database Management Systems (RDBMS) and NoSQL data sources, familiarity with HDFS terminology and use
This reference covers an overview of using Azure Data Lake to capture data of any size, type, and ingestion speed in one single place for operational and exploratory analytics Azure Data Lake Store Intermediate General technology background, familiarity with data solutions, familiarity with NoSQL data sources, familiarity with HDFS
Machine Learning and Analytics This reference covers an introduction to Machine Learning, predictive analytics, and Artificial Intelligence systems Azure Machine Learning Intermediate General technology background, familiarity with data solutions, familiarity with Data Science terms, familiarity with Machine Learning and Artificial Intelligence terms
This article provides an introduction to Azure HDInsight, a cloud distribution of the Hadoop technology stack. It also covers what a Hadoop cluster is and when you would use it Azure HDInsight Intermediate General technology background, familiarity with data solutions, familiarity with NoSQL data sources
This reference covers an overview of the Azure Data Lake Analytics job service Azure Data Lake Analytics Intermediate General technology background, familiarity with data solutions, familiarity with NoSQL data sources
This overview covers using Azure Stream Analytics as a fully managed event-processing engine to set up real-time analytic computations on streaming data Azure Stream Analytics Intermediate General technology background, familiarity with data solutions, familiarity with structured and unstructured data concepts
Intelligence This reference covers an overview of the available Cognitive Services (such as vision, text, and search) and how to get started using them Cognitive Services Experienced General technology background, familiarity with data solutions, software development
This reference covers an introduction to the Microsoft Bot Framework and how to get started using it Bot Framework Experienced General technology background, familiarity with data solutions
Visualization This self-paced, online course covers the Power BI system, and how to create and publish reports Microsoft Power BI Beginner General technology background, familiarity with data solutions
Solutions This resource page covers multiple applications you can review, test, and implement to see a complete solution from start to finish Microsoft Azure, Azure Machine Learning, Cognitive Services, Microsoft R, Azure Search, Python, Azure Data Factory, Power BI, Azure Document DB, Application Insights, Azure SQL DB, Azure SQL Data Warehouse, Microsoft SQL Server, Azure Data Lake, Bot Framework, Azure Batch Intermediate General technology background, familiarity with data solutions
Understand and Implement DevOps Processes DevOps Fundamentals This video series covers the fundamentals of DevOps, helps you understand how they map to DevOps practices, and shows how they can be implemented by a variety of products and tools DevOps, Microsoft Azure Platform, Visual Studio Team Services Experienced Used an SDLC, familiarity with Agile and other Development Frameworks, IT Operations Familiarity
Use the DevOps Toolchain for Data Science Configure This reference covers the basics of choosing the proper visualization in Visio to communicate your project design Visio Intermediate General technology background, familiarity with data solutions
This reference describes the Azure Resource Manager, terms, and serves as the primary root source for samples, getting started, and other references Azure Resource Manager, Azure PowerShell, Azure CLI Intermediate General technology background, familiarity with data solutions
This reference explains the Azure Data Science Virtual Machines for Linux and Windows Data Science Virtual Machine Experienced Familiarity with Data Science Workloads, Linux
This walkthrough explains configuring Azure cloud service roles with Visual Studio - pay close attention to the connection strings specifically for storage accounts Visual Studio Intermediate Software Development
This series teaches you how to use Microsoft Project to schedule time, resources, and goals for an Advanced Analytics project Microsoft Project Intermediate Understand Project Management Fundamentals
This Microsoft Project template provides time, resource, and goal tracking for an Advanced Analytics project Microsoft Project Intermediate Understand Project Management Fundamentals
This tutorial helps you get started with Azure Data Catalog, a fully managed cloud service that serves as a system of registration and system of discovery for enterprise data assets Azure Data Catalog Beginner Familiarity with Data Sources and Structures
This Microsoft Virtual Academy course explains how to set up Dev-Test with Visual Studio Online and Microsoft Azure Visual Studio Online Experienced Software Development, familiarity with Dev/Test environments
This Management Pack download for Microsoft System Center contains a Guidelines Document to assist in working with Azure assets System Center Intermediate Experience with System Center for IT Management
This document is intended for developer and operations teams to understand the benefits of PowerShell Desired State Configuration PowerShell DSC Intermediate Experience with PowerShell coding, enterprise architectures, scripting
Code This download also contains documentation on using Visual Studio Online Code for creating Data Science and AI applications Visual Studio Online Intermediate Software Development
This getting started site teaches you about DevOps and Visual Studio Visual Studio Beginner Software Development
You can write code directly from the Azure Portal using the App Service Editor. Learn more at this resource about Continuous Integration with this tool Azure Portal Highly Experienced Data Science background - but read this anyway
This resource explains how to code and create Predictive Analytics experiments using the web-based Azure ML Studio tool Azure ML Studio Experienced Software Development
This reference contains a list and a study link to all of the development tools on the Data Science Virtual Machine in Azure Data Science Virtual Machine Experienced Software Development, Data Science
Read and understand each of the references in this Azure Security Trust Center for Security, Privacy, and Compliance - VERY important Azure Security Intermediate System Architecture Experience, Security Development experience
Build This course teaches you about enabling DevOps Practices with Visual Studio Online Build Visual Studio Online Experienced Software Development, Familiarity with an SDLC
This reference explains compiling and building using Visual Studio Visual Studio Intermediate Software Development, Familiarity with an SDLC
This reference explains how to orchestrate processes such as software builds with Runbooks System Center Experienced Experience with System Center Orchestrator
Test Use this reference to understand how to use Visual Studio Online for Test Case Management Visual Studio Online Experienced Software Development, Familiarity with an SDLC
Use this previous reference for Runbooks to automate tests using System Center System Center Experienced Experience with System Center Orchestrator
Security should be built in during development as well as testing. The Microsoft SDL Threat Modeling Tool can help in all phases. Learn more and download it here Threat Modeling Tool Experienced Familiarity with security concepts, software development
This article explains how to use the Microsoft Attack Surface Analyzer to test your Advanced Analytics solution Attack Surface Analyzer Experienced Familiarity with security concepts, software development
Package This reference explains the concepts of working with Packages in TFS and VSO Visual Studio Online Experienced Software development, familiarity with an SDLC
Use this previous reference for Runbooks to automate packaging using System Center System Center Experienced Experience with System Center Orchestrator
This reference explains how to create a data pipeline for your solution, which you can save as a JSON template as a "package" Azure Data Factory Intermediate General computing background, data project experience
This topic describes the structure of an Azure Resource Manager template Azure Resource Manager Intermediate Familiarity with the Microsoft Azure Platform
DSC is a management platform in PowerShell that enables you to manage your IT and development infrastructure with configuration as code, saved as a package. This reference is an overview for that topic PowerShell Desired State Configuration Intermediate PowerShell coding, familiarity with enterprise architectures, scripting
Release This head-reference article contains concepts for build, test, and release for CI/CD environments Visual Studio Online Experienced Software development, familiarity with CI/CD environments, familiarity with an SDLC
Use this previous reference for Runbooks to automate release management using System Center System Center Experienced Experience with System Center Orchestrator
This article helps you determine the best option to deploy the files for your web app, mobile app backend, or API app to Azure App Service, and then guides you to appropriate resources with instructions specific to your preferred option Microsoft Azure Deployment Intermediate Software development, experience with the Microsoft Azure platform
Monitor This reference explains Application Insights and how you can add it to your Advanced Analytics Solutions Application Insights Intermediate Software Development, familiarity with the Microsoft Azure platform
This topic explains basic concepts about Operations Manager for the administrator who manages the Operations Manager infrastructure and the operator who monitors and supports the Advanced Analytics Solution System Center Experienced Familiarity with enterprise monitoring, System Center Operations Manager
This blog entry explains how to use the Azure Data Factory to monitor and manage the Advanced Analytics pipeline Azure Data Factory Intermediate Familiarity with Azure Data Factory
This video shows how to monitor a log with Azure Log Analytics Azure Logs, PowerShell Experienced Familiarity with the Azure Platform
Understand how to use Open Source Tools with DevOps on Azure Open Source DevOps Tools and Azure This reference page contains two videos and a whitepaper on using Chef with Azure deployments Chef Experienced Familiarity with the Azure Platform, Familiarity with DevOps
This site has a toolchain selection path DevOps, Microsoft Azure Platform, Visual Studio Team Services, Open Source Software Experienced Used an SDLC, familiarity with Agile and other Development Frameworks, IT Operations Familiarity
This tutorial explains how to automate the build and test phase of application development, using a continuous integration and deployment (CI/CD) pipeline Jenkins Experienced Familiarity with the Azure Platform, Familiarity with DevOps, Familiarity with Jenkins
This contains an overview of working with Docker and Azure as well as additional references for implementation for Data Science applications Docker Intermediate Familiarity with the Azure Platform, Familiarity with Server Operating Systems
This installation guide explains how to use Visual Studio Code with Azure assets VSCODE Intermediate Software Development, familiarity with the Microsoft Azure Platform
This blog entry explains how to use R Studio with Microsoft R R Studio Intermediate R Language experience
This blog entry shows how to use continuous integration with Azure and GitHub git, github Intermediate Software Development
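
The Package rows above point to the structure of an Azure Resource Manager template and to saving configuration as a JSON "package". As a rough illustration only, the Python sketch below assembles a dictionary with the top-level elements of such a template and serializes it to JSON. The helper names (make_template, missing_keys), the storage account resource, its apiVersion, and the storageName parameter are hypothetical choices made for this example and do not come from the referenced articles.

```python
import json

# Top-level elements of an Azure Resource Manager template.
# "$schema", "contentVersion", and "resources" are required;
# "parameters", "variables", and "outputs" are optional.
REQUIRED_KEYS = ("$schema", "contentVersion", "resources")


def make_template(resources, parameters=None, variables=None, outputs=None):
    """Return a dict shaped like an ARM deployment template (illustrative helper)."""
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "parameters": parameters or {},
        "variables": variables or {},
        "resources": resources,
        "outputs": outputs or {},
    }


def missing_keys(template):
    """List the required top-level elements that the template lacks."""
    return [key for key in REQUIRED_KEYS if key not in template]


# Hypothetical resource: a storage account whose name is supplied as a parameter.
storage = {
    "type": "Microsoft.Storage/storageAccounts",
    "apiVersion": "2022-09-01",  # assumed API version, chosen for illustration
    "name": "[parameters('storageName')]",
    "location": "[resourceGroup().location]",
    "sku": {"name": "Standard_LRS"},
    "kind": "StorageV2",
}

template = make_template([storage], parameters={"storageName": {"type": "string"}})

# The serialized JSON text is what would be saved and versioned as the "package".
package = json.dumps(template, indent=2)
```

Keeping the template as plain data makes it easy to check for required elements before it is saved or handed to a deployment tool; the actual deployment step is outside the scope of this sketch.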

Next steps

Team Data Science Process for data scientists: This article provides guidance on a set of objectives that are typically used to implement comprehensive data science solutions with Azure technologies.