Overview
A 2-day workshop on statistical methods for high-dimensional datasets. It covers “modern” methods, but not “basic stats” or “machine learning”, and it is not a programming course.
Prerequisites
Basic R programming (e.g., Data Carpentry or Software Carpentry lessons).
Basic statistics (e.g., a linear models course).
Learning goals
Based on a survey of the target audience, we narrowed the content down to a set of goals (and thus lessons):
- Go beyond classical tests
- Understand and apply statistical methods for high-dimensional data
- Understand and appraise/compare methods
Content
Intro (1/4 day)
- Notation
- Some assumptions (briefly), e.g. linear model assumptions
- Perhaps some techniques (e.g. cross-validation)
High-dimensional regression
- Fitting many linear models
- Multiple testing
- Information sharing between features
- glmnet: lasso/ridge/elastic net
- Cross-validation, selecting variables
Dimensionality reduction