| Who | At |
| --- | --- |
| Matteo Cereda | matteo.cereda1@unimi.it |
| Fabio Iannelli | fabio.iannelli@ifom.eu |
| Uberto Pozzoli | uberto.pozzoli@lanostrafamiglia.it |

For any questions, ideas, or comments, reach us at these addresses.

# Introduction

“R for Data Science” will teach you how to work with data in R: you’ll learn how to get your data into R, get it into the most useful structure, transform it, visualise it, and model it.

## What you will see

Data science is a huge field. Our model of the tools needed in a typical data science project looks something like this:

1. First you must IMPORT your data into R. This typically means that you take data stored in a file, database, or web API, and load it into a data frame in R.

2. Once you’ve imported your data, it is a good idea to TIDY it. Tidying your data means storing it in a consistent form, so that the way the dataset is stored matches its semantics. In tidy data, each column is a variable and each row is an observation.

3. Once you have tidy data, a common first step is to TRANSFORM it. Transformation includes narrowing in on observations of interest, creating new variables that are functions of existing variables, and calculating a set of summary statistics.

4. Once you have tidy data with the variables you need, there are two main engines of knowledge generation: VISUALISATION and MODELLING.

5. The last step of data science is COMMUNICATION.
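
The steps above can be sketched end to end in a few lines of R. This is only a minimal illustration under stated assumptions: the inline CSV text, the column names (`sample`, `treated_1`, `control_1`, and so on), and the measurements are all invented for the example, and base R is used instead of add-on packages simply so the snippet runs without installing anything.

```r
# Hypothetical data: this CSV text stands in for a real file, database, or web API.
csv_text <- "sample,treated_1,treated_2,control_1,control_2
A,5.1,5.3,4.0,4.2
B,6.0,6.2,5.9,6.1
C,7.5,7.1,5.0,5.4"

# 1. IMPORT: load the data into a data frame.
wide <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# 2. TIDY: reshape so each row is one observation and each column one variable.
measure_cols <- setdiff(names(wide), "sample")
tidy <- data.frame(
  sample = rep(wide$sample, times = length(measure_cols)),
  column = rep(measure_cols, each = nrow(wide)),
  value  = unlist(wide[measure_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)
# The original column names encode two variables: condition and replicate.
tidy$condition <- sub("_.*$", "", tidy$column)
tidy$replicate <- as.integer(sub("^.*_", "", tidy$column))
tidy$column <- NULL

# 3. TRANSFORM: summarise the replicates into one mean per sample and condition.
means <- aggregate(value ~ sample + condition, data = tidy, FUN = mean)

# 4a. VISUALISE: compare the distribution of values between conditions.
boxplot(value ~ condition, data = tidy, ylab = "value")

# 4b. MODEL: estimate the average difference between conditions.
fit <- lm(value ~ condition, data = tidy)

# 5. COMMUNICATE: report the summary table and the fitted coefficients.
print(means)
print(coef(fit))
```

In a real project each step would usually be more involved, but the order stays the same: the tidy data frame produced in step 2 is the common input for the transformation, the plot, and the model.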

A work by Matteo Cereda and Fabio Iannelli