1 Data Science VM

This is an optional way of getting most things configured for use.

1.1 Intro

Using either an existing subscription, a trial, or a voucher, we’ll need a Microsoft Data Science Virtual Machine.

  • There are Windows and Linux variants - use your preferred option!

1.2 Provision the VM

In Azure Portal:

  1. Hit New
  2. Search for data science
  3. Select the Data Science Virtual Machine
  4. You can do a Next, Next, Next install, but you may wish to:
    • Give the VM a sensible name
    • Use security best practices like a strong password
    • Base the VM in the North Europe region
    • Use an A3 size (don’t forget to turn it off or downsize afterwards)

1.3 Configure the VM

Use the Connect option in the portal to open an RDP connection to the VM, then:

  • Disable Enhanced Security by going to Server Manager > Local Server > IE Enhanced Security
  • Install RStudio
  • Using either Visual Studio 2015 Community Edition or RStudio, install the devtools package

2 Installation instructions

This section outlines general installation notes to follow when configuring an R environment yourself. The Data Science VM generally makes these steps unnecessary.

2.1 ODBC Driver

Get the relevant SQL Server ODBC 11 Preview driver for your OS and install it. You can use the native driver instead, as it’s ODBC-compatible, but the connection strings in the training material will then need amending.
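As a rough sketch of what such an amendment might look like, the snippet below swaps only the Driver clause of a connection string. The server and database names are placeholders, and the exact strings used in the training material are an assumption here. It also assumes a POSIX-style shell with sed (for example Git Bash on Windows).

```shell
# Hypothetical connection string using the ODBC driver (placeholder server/database).
odbc_conn='Driver={ODBC Driver 11 for SQL Server};Server=myserver.example.com;Database=mydb;Trusted_Connection=yes;'

# Swap only the Driver clause so the string targets the native client instead.
native_conn=$(printf '%s' "$odbc_conn" |
  sed 's/Driver={ODBC Driver 11 for SQL Server}/Driver={SQL Server Native Client 11.0}/')

printf '%s\n' "$native_conn"
```

The rest of the string (server, database, authentication) is left untouched.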

2.2 R

  • Download latest base R exe file from cran.rstudio.com
  • Also download Rtools latest exe (if using Windows)
  • Install R, then Rtools. On a 64-bit machine, install both the 32-bit and 64-bit versions, as this saves hassle with other drivers

2.3 Git

  • Download latest git version from msysgit.github.io
  • Install
  • Open a command prompt
    • Run git config --global user.name "Your name"
    • Run git config --global user.email "email@addre.ss"
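A minimal sketch of setting the identity and reading it back to confirm it was stored. Note that git config takes the key and the value as separate arguments, so there is a space rather than an equals sign between them. The name and email are the same placeholders as above.

```shell
# Set the identity Git records on each commit (key and value are separate arguments).
git config --global user.name "Your name"
git config --global user.email "email@addre.ss"

# Read the values back to confirm they were stored.
git config --global --get user.name
git config --global --get user.email
```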

2.4 Use RStudio

  • Go to rstudio.com
  • Select the correct install file for your PC, and install it
  • Open RStudio
  • Go to Tools > Global Options > Git/SVN and point RStudio at the location of your git.exe file
  • Restart RStudio
    • If the Git pane (top RHS) does not appear automatically, create a new project and tick the source control checkbox
  • Go to the Packages area (bottom RHS) and select Install. Install devtools
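Before moving on, it can help to confirm that the tools installed in this section are reachable from a shell. The sketch below is one way to do that. The helper names check_tool and install_devtools are hypothetical, and it assumes a POSIX-style shell (for example Git Bash) with R and Git added to the PATH.

```shell
# Report whether a command is available on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
    return 1
  fi
}

# Command-line alternative to the Packages pane: install devtools
# from the CRAN mirror mentioned above (needs network access).
install_devtools() {
  Rscript -e 'install.packages("devtools", repos = "https://cran.rstudio.com")'
}

# Check each tool; "|| true" keeps the loop going so every result is printed.
for tool in R Rscript git; do
  check_tool "$tool" || true
done
```

Once all three tools report as found, running install_devtools should have the same effect as installing devtools from the Packages area.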

3 The RMSFTDP package

3.1 Purpose

This section introduces the package “R in the Microsoft Data Platform”, or RMSFTDP for short.

3.2 Get it

  • Open RStudio
  • Get the project from source control by going to New Project in RStudio, selecting Version Control, then Git, and pasting in the URL https://github.com/stephlocke/RMSFTDP.git
  • Run devtools::install() to install the package

3.3 Contents

  • vignettes/ - use devtools::build_vignettes() to generate a local copy of the training materials
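For those who prefer a shell to the RStudio dialogs, the steps in 3.2 and the vignette build could be sketched roughly as follows. The function name get_rmsftdp is hypothetical, and the sketch assumes git, Rscript, and devtools are already installed and that network access is available.

```shell
# Clone the project, install the package, and build the training vignettes.
# The body runs in a subshell, so the caller's working directory is unchanged.
get_rmsftdp() (
  git clone https://github.com/stephlocke/RMSFTDP.git &&
    cd RMSFTDP &&
    Rscript -e 'devtools::install()' &&
    Rscript -e 'devtools::build_vignettes()'
)
```

Call get_rmsftdp from the directory where you want the project folder created. Each step runs only if the previous one succeeded.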