Data Science with Microsoft R Hands-on Labs

In this post I list the most important publicly available Data Science with Microsoft R hands-on labs that we use at MTC New York for Microsoft R workshops.

Before starting the labs below, it’s a good idea to have a general knowledge of statistics for prediction and classification, and a basic understanding of machine learning and the open-source R language. (To get there, you may use DAT204x Introduction to R for Data Science, DAT209x Programming in R for Data Science, and other courses from the Microsoft Data Science specialization.)

Microsoft R Hands-on Labs

  1. Exploring SQL Server 2016 R Services and Microsoft R Client with R Tools for Visual Studio. (3 hours; manual is available, all necessary tools and files are included; uses New York Taxi dataset; when you see “Times Squire” in the code, change it to “New York” and save)
  2. MTC Microsoft R training by Jarek Kazmierczak. (1-2 hours; contains source file and R scripts)
  3. edX: DAT213x Analyzing Big Data with Microsoft R Server by Seth Mottaghinejad. (16 hours; contains videos and scripts; you may also earn a Microsoft certificate; uses New York Taxi dataset; please let me know if you experience any issues with ggplot2 and ggrepel)
  4. Flight delay prediction with Azure ML (90 minutes; exercise 1 from Cortana Intelligence Suite End-to-End Training by Todd Kitta)
  5. Text Mining with R with Azure ML by Seayoung Rhee. (1 hour)
  6. edX: DAT203.1x Data Science Essentials
  7. edX: DAT203.2x Principles of Machine Learning
  8. edX: DAT203.3x Applied Machine Learning
  9. HDInsight Spark MLlib (placeholder)
  10. Cognitive Toolkit (CNTK) Deep Dive and Hands-on (tutorial; video).

Here is one of the screenshots from the first (highly recommended) training, based on the New York Taxi dataset.

[Screenshot: SQL Server R Services lab, New York Taxi dataset]

Prerequisites for using the Data Science Virtual Machine

The Data Science Virtual Machine has all of the tools you will need to work with the materials. You will need a Microsoft Azure subscription for this.

  1. To get a Microsoft Azure subscription, you can sign up for a free account here, or you can use your MSDN subscription.
  2. To create the Data Science Virtual Machine in Azure, please log in to the Azure Portal and create the virtual machine (New -> Search for “data science” -> select “Data Science Virtual Machine” -> Create).
  3. Optionally, you may test your Microsoft R code on an HDInsight Spark cluster created in the Azure Portal.

Prerequisites for using your local machine

If you would like to work with some of the tools locally, please install the following components.

  1. Visual Studio (the free Community Edition is acceptable; version 2015 is preferable).
  2. R Tools for Visual Studio.
  3. Optionally, RStudio.
  4. Optionally, SQL Server Developer Edition for SQL Server related content.

Additional materials

Materials from Mission Critical Performance Workshop

Today at MTC New York I gave the workshop “Always On: Mission Critical Performance”, dedicated to some new features of SQL Server 2016. (And this time SQL Server AlwaysOn technology was actually covered, but it was only a fraction of the whole content 😉 ).

Here you can find the presentation decks from this workshop:

  1. SQL Server 2016 Evolution
  2. SQL Server 2016 Performance (here I also included slides on In-Memory OLTP and columnstore from SQL Server 2014)
  3. SQL Server 2016 Security and Compliance
  4. SQL Server 2016 Availability
  5. SQL Server 2016 Scalability
  6. SQL Server 2016 Cloud Service (Bonus topic)

Additional materials are available on the official site of SQL Server 2016.

You may also try Virtual Labs. (Please filter by “SQL Server 2016”.)
