A dataset records the results of experiments or surveys, measured or observed on a set of individuals. The variables can be
- Numbers: quantitative variables (e.g. salary)
- Codes: qualitative variables (e.g. sex)
Data analysis consists of searching for relations between the individuals, or observations, in these datasets.
No statistical modelling is involved: data analysis is an advanced form of descriptive statistics. Three types of methods can be distinguished
- Classical descriptive statistics: study of one or two observed variables
- Analysis of a point cloud in higher dimensions: principal component analysis, etc.
- Classification: grouping individuals into homogeneous categories according to a certain criterion.
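As a rough illustration of the three families of methods, here is a minimal sketch in R. It assumes the `USArrests` dataset that ships with R, as well as Ward's criterion and four groups for the classification step; these choices are ours for the sake of the example and are not prescribed by the project.

```r
# Illustrative only: USArrests is bundled with R and was chosen as an assumption.
data(USArrests)

# 1. Classical descriptive statistics: one or two variables at a time
summary(USArrests$Murder)
cor(USArrests$Murder, USArrests$Assault)

# 2. Principal component analysis on standardized variables
pca <- prcomp(USArrests, scale. = TRUE)
summary(pca)            # proportion of variance carried by each component
scores <- pca$x[, 1:2]  # coordinates of the individuals on the first two axes

# 3. Classification: hierarchical clustering (Ward's criterion, an assumption)
tree <- hclust(dist(scale(USArrests)), method = "ward.D2")
groups <- cutree(tree, k = 4)  # k = 4 is an arbitrary choice for this sketch
table(groups)
```

Standardizing the variables before the PCA and the clustering keeps a variable measured on a large scale from dominating the distances; whether that is appropriate depends on the dataset chosen.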
The goal of this project is to study a dataset with these three methods using the software R. Several datasets are available; students are also welcome to choose their own dataset.
Note: for GU students, the project counts as a project in Mathematical Statistics (MSG900/MSG910).
3-6 Special prerequisites:
basic knowledge of descriptive statistics and of R
Examiner: Maria Roginskaya
Department: Mathematical Sciences