Audience

  • Minimal background in mathematics and statistics
    • analysis and calculus (integrals, derivatives, study of functions, …)
    • basic statistical concepts
  • Minimal knowledge of statistical modeling
  • Basic expertise with Python and Jupyter Notebook
    • installing new packages
    • writing basic code and running pipelines
    • knowledge of standard libraries (numpy, pandas, scikit-learn)

Evaluation

Assignments: three in total, worth 10 points.
Completed assignments are due within two weeks of the release date. Rename the notebook <candidate name>_<candidate surname>_<assignment_number>.ipynb and send it to my email address.

Final exam: 10 points

Course Material

All the material is available on GitLab.
Day   | Support slides | Notebook                                           | Assignments        | References
Day 1 | Introduction   | Basic Probability Theory                           | /                  | /
Day 2 | /              | Probability Distributions                          | /                  | [BDA]: Ch. 1
Day 3 | /              | Inference in Bayesian models: theory and practice  | Assignment 1       | [BDA]: Ch. 1; [McE]: Ch. 3
Day 4 | /              | Practical Exercises 1, Data                        | /                  | [BDA]: Ch. 1; [McE]: Ch. 3
Day 5 | /              | Laplace approximation, tools.py, Data (from [McE]) | /                  | [McE]: Ch. 4
Day 6 | /              | Bayesian linear regression                         | /                  | [McE]: Ch. 4
Day 7 | /              | Practical exercises 2                              | Assignment 2, data | [McE]: Ch. 8, Ch. 10
Day 8 | /              | MCMC                                               | /                  | [McE]: Ch. 8; [Bet2018]; [Stan]
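
To give a flavor of the hands-on sessions (e.g. Bayesian inference on Day 3, MCMC on Day 8), the sketch below implements a random-walk Metropolis sampler for the posterior of a Gaussian mean in plain numpy. It is a minimal illustration, not an excerpt from the course notebooks; the toy data and tuning constants are made up.

    import numpy as np

    # Toy data: 50 draws from N(2, 1); we infer the mean mu with a N(0, 10^2) prior.
    rng = np.random.default_rng(0)
    data = rng.normal(loc=2.0, scale=1.0, size=50)

    def log_posterior(mu):
        log_prior = -0.5 * (mu / 10.0) ** 2        # Gaussian prior, up to a constant
        log_lik = -0.5 * np.sum((data - mu) ** 2)  # Gaussian likelihood, sigma = 1
        return log_prior + log_lik

    samples, mu = [], 0.0
    for _ in range(5000):
        proposal = mu + rng.normal(scale=0.5)      # symmetric random-walk proposal
        # Accept with probability min(1, posterior ratio); otherwise keep current mu.
        if np.log(rng.uniform()) < log_posterior(proposal) - log_posterior(mu):
            mu = proposal
        samples.append(mu)

    print(np.mean(samples[1000:]))                 # posterior mean after burn-in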

References

[BDA] Bayesian Data Analysis. A. Gelman, J.B. Carlin, H.S. Stern, D.B. Dunson, A. Vehtari, D.B. Rubin; Chapman and Hall/CRC, 2014, 3rd Edition.

[McE] Statistical Rethinking: A Bayesian Course with Examples in R and Stan. R. McElreath; Chapman and Hall/CRC, 2016.

[Bet2018] A Conceptual Introduction to Hamiltonian Monte Carlo. M. Betancourt; arXiv:1701.02434, 2018.

[Stan] Stan Documentation. Stan Development Team.


Audience

  • Minimal background in mathematics and statistics
    • analysis and calculus (integrals, derivatives, study of functions, …)
    • basic statistical concepts (expectation, median, covariance, distributions, …)
  • Minimal knowledge of statistical modeling
    • regression
    • expectation, variance/covariance, statistical descriptors
  • Basic expertise with Python and Jupyter Notebook
    • installing new packages
    • writing basic code and running pipelines
    • knowledge of standard libraries (numpy, pandas, scikit-learn)

Evaluation

Assignments: three sessions in total, worth 10 points.
Completed assignments are due within two weeks of the release date. Rename the notebook <candidate name>_<candidate surname>_<assignment_number>.ipynb and send it to my email address.

Final exam: 10 points

Course Material

Day   | Support slides | Notebook                                         | Assignments | References
Day 1 | Introduction   | Basic Probability Models and Sampling in Python  | /           | /
Day 2 | /              | Data Generation - Regression & Classification    | /           | [Gu2003]; [Gu2007]
Day 3 | /              | Bias and Variance                                 | /           | [HTF2001]: Ch. 7; [Bis2006]: Ch. 1, Ch. 3; [Gem1992]
Day 4 | /              | Bootstrap and bagging                             | /           | [Brei1996]; [Efr1986]; [HTF2001]: Ch. 7
Day 5 | /              | Cross-validation I                                | /           | [HTF2001]: Ch. 7
Day 6 | /              | Cross-validation II                               | /           | [HTF2001]: Ch. 7; [Koh1995]; [Rao2008]
Day 7 | /              | Information Criteria I                            | /           | [McE2016]: Ch. 6
Day 8 | /              | Information Criteria II                           | /           | [McE2016]: Ch. 6; [Wag2004]; [Sym2011]
Day 9 | /              | Class exercises                                   | /           | /
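
As a pointer to the style of the practical sessions, here is a minimal scikit-learn sketch of the two resampling ideas at the core of Days 4-6: the bootstrap and K-fold cross-validation. It is an illustrative sketch on synthetic data, not an excerpt from the course notebooks.

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.utils import resample

    # Synthetic regression problem; any (X, y) pair would do here.
    X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

    # 5-fold cross-validation: fit on 4 folds, score (R^2 by default) on the held-out one.
    cv_scores = cross_val_score(LinearRegression(), X, y, cv=5)
    print("CV R^2: %.3f +/- %.3f" % (cv_scores.mean(), cv_scores.std()))

    # Bootstrap: refit on resampled data to gauge the variability of a coefficient.
    coefs = [LinearRegression().fit(*resample(X, y, random_state=s)).coef_[0]
             for s in range(200)]
    print("bootstrap sd of first coefficient: %.3f" % np.std(coefs))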

References

[Gu2003] Design of experiments for the NIPS 2003 variable selection benchmark. I. Guyon, 2003.

[Gu2007] Competitive baseline methods set new standards for the NIPS 2003 feature selection benchmark. I. Guyon, J. Li, T. Mader, P.A. Pletscher, G. Schneider, M. Uhr. Pattern Recognition Letters, 28, 1438-1444, 2007.

[HTF2001] The Elements of Statistical Learning. T. Hastie, R. Tibshirani, J. Friedman. Springer Series in Statistics. Springer, New York, NY, USA, 2001.

[Bis2006] Pattern Recognition and Machine Learning. C.M. Bishop. Springer-Verlag Berlin, Heidelberg, DE, 2006.

[Gem1992] Neural networks and the bias/variance dilemma. S. Geman, E. Bienenstock, R. Doursat. Neural Computation, 4(2), 1-58, 1992.

[Brei1996] Bagging predictors. L. Breiman. Machine Learning, 24(2), 123-140, 1996.

[Efr1986] Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy. B. Efron, R. Tibshirani. Statistical Science, 1(1), 54-75, 1986.

[Koh1995] A study of cross-validation and bootstrap for accuracy estimation and model selection. R. Kohavi. IJCAI, 14(2), 1995.

[Rao2008] On the dangers of cross-validation: an experimental evaluation. R. Bharat Rao, G. Fung, R. Rosales. Proceedings of the 2008 SIAM International Conference on Data Mining. Society for Industrial and Applied Mathematics, 2008.

[McE2016] Statistical Rethinking: A Bayesian Course with Examples in R and Stan. R. McElreath. Chapman and Hall/CRC Press, 2016.

[Wag2004] AIC model selection using Akaike weights. E.J. Wagenmakers, S. Farrell. Psychonomic Bulletin & Review, 11(1), 192-196, 2004.

[Sym2011] A brief guide to model selection, multimodel inference and model averaging in behavioural ecology using Akaike’s information criterion. M.R. Symonds, A. Moussalli. Behavioral Ecology and Sociobiology, 65(1), 13-21, 2011.


Support material for the Winter School in Imaging Genetics 2019, University of Verona, Italy

This lecture covers the statistical background required to perform association analysis in typical imaging-genetics studies. We will introduce the notion of statistical association and highlight the standard analysis paradigm in univariate modeling. We will then explore multivariate association models, generalizing the notion of statistical association to high-dimensional data. In particular, we will focus on standard paradigms such as Canonical Correlation Analysis (CCA), Partial Least Squares (PLS), and Reduced Rank Regression (RRR). We will finally introduce more advanced analysis frameworks, such as Bayesian and deep association methods. Within this context, we will present the Multi-Channel Variational Autoencoder recently developed by our group.
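
As a concrete anchor for the multivariate part, the sketch below runs CCA with scikit-learn on synthetic data sharing one latent factor; in a real study, X could hold genetic variants and Y imaging-derived phenotypes. This is an illustrative sketch, not code from the lecture material.

    import numpy as np
    from sklearn.cross_decomposition import CCA

    # Two synthetic views driven by a shared one-dimensional latent factor.
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(100, 1))
    X = latent @ rng.normal(size=(1, 20)) + 0.5 * rng.normal(size=(100, 20))
    Y = latent @ rng.normal(size=(1, 5)) + 0.5 * rng.normal(size=(100, 5))

    # Find the pair of projections with maximal correlation between the two views.
    cca = CCA(n_components=1)
    Xc, Yc = cca.fit_transform(X, Y)
    print(np.corrcoef(Xc[:, 0], Yc[:, 0])[0, 1])   # first canonical correlation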



References

P. Geladi and B. Kowalski. Partial least-squares regression: a tutorial, Analytica Chimica Acta, 1985.

S. Szymczak, J.M. Biernacka, H.J. Cordell, O. González-Recio, I.R. König, H. Zhang, Y.V. Sun. Machine learning in genome‐wide association studies, Genetic Epidemiology 2009.

M. Silver, E. Janousova, X. Hua, P.M. Thompson, G. Montana. Identification of gene pathways implicated in Alzheimer's disease using longitudinal imaging phenotypes with sparse regression, NeuroImage 2012.

E. Le Floch, V. Guillemot, V. Frouin, P. Pinel, C. Lalanne, L. Trinchera, A. Tenenhaus, A. Moreno, M. Zilbovicius, T. Bourgeron, S. Dehaene, B. Thirion, J.B. Poline, E. Duchesnay. Significant correlation between a set of genetic polymorphisms and a functional brain network revealed by feature selection and sparse Partial Least Squares. NeuroImage, 63(1), 11-24, 2012.

J. Liu and V. Calhoun. A review of multivariate analyses in imaging genetics, Frontiers in Neuroinformatics, 2014.

M. Lorenzi, A. Altmann, B. Gutman, S. Wray, C. Arber, D.P. Hibar, N. Jahanshad, J.M. Schott, D.C. Alexander, P.M. Thompson, S. Ourselin. Susceptibility of brain atrophy to TRIB3 in Alzheimer’s disease, evidence from functional prioritization in imaging genetics. PNAS, 115(12), 3162-3167, 2018.

L. Antelmi, N. Ayache, P. Robert, M. Lorenzi. Sparse Multi-Channel Variational Autoencoder for the Joint Analysis of Heterogeneous Data. 36th International Conference on Machine Learning (ICML), 2019.

L. Shen and P.M. Thompson. Brain Imaging Genomics: Integrated Analysis and Machine Learning. Proceedings of the IEEE, 2019.