Contact Us

Who                At
Matteo Cereda      matteo.cereda1@unimi.it
Fabio Iannelli     fabio.iannelli@ifom.eu
Uberto Pozzoli     uberto.pozzoli@lanostrafamiglia.it

For any questions, ideas, or comments, reach us at the addresses above.





Introduction

“R for Data Science” will teach you how to work with data in R: you’ll learn how to get your data into R, get it into the most useful structure, transform it, visualise it, and model it.



What you will see

Data science is a huge field. Our model of the tools needed in a typical data science project looks something like this:

  1. First you must IMPORT your data into R. This typically means that you take data stored in a file, database, or web API, and load it into a data frame in R.

  2. Once you’ve imported your data, it is a good idea to TIDY it. Tidying your data means storing it in a consistent form, so that the structure of the stored data matches the semantics of the dataset: each column is a variable, and each row is an observation.

  3. Once you have tidy data, a common first step is to TRANSFORM it. Transformation includes narrowing in on observations of interest, creating new variables that are functions of existing variables, and calculating a set of summary statistics.

  4. Once you have tidy data with the variables you need, there are two main engines of knowledge generation: VISUALISATION and MODELLING.

  5. The last step of data science is COMMUNICATION.
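
The first four steps above can be sketched in a few lines of R. This is only a minimal illustration, assuming the readr, tidyr, dplyr, and ggplot2 packages are installed (readr 2.0 or later for the `I()` literal input). The inline table, its column names (`sample`, `day1` to `day3`), and all of its values are made up for the example and do not come from the course material.

```r
library(readr)
library(tidyr)
library(dplyr)
library(ggplot2)

# IMPORT: parse a small inline CSV (a stand-in for a file, database, or web API).
# The values are invented purely for illustration.
raw <- read_csv(I("sample,day1,day2,day3
A,1.0,2.1,2.9
B,2.0,3.9,6.1"), show_col_types = FALSE)

# TIDY: one column per variable (sample, day, value), one row per observation.
tidy <- raw %>%
  pivot_longer(
    cols = starts_with("day"),
    names_to = "day",
    names_prefix = "day",
    names_transform = list(day = as.integer),
    values_to = "value"
  )

# TRANSFORM: narrow in on observations, create a new variable, summarise.
transformed <- tidy %>%
  filter(day >= 2) %>%
  mutate(log2_value = log2(value))

summary_tbl <- tidy %>%
  group_by(sample) %>%
  summarise(mean_value = mean(value))

# VISUALISE: build a line plot of value over time, one line per sample.
p <- ggplot(tidy, aes(x = day, y = value, colour = sample)) +
  geom_line() +
  geom_point()

# MODEL: fit a simple linear trend for sample A.
fit <- lm(value ~ day, data = filter(tidy, sample == "A"))
```

Printing `p` draws the plot, and `coef(fit)` returns the fitted intercept and slope. COMMUNICATION, the last step, would then mean presenting the plot and the model in a report or a talk.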

A work by Matteo Cereda and Fabio Iannelli