PHYSICS

Master's Degree

MACHINE LEARNING FOR PHYSICS AND THE NATURAL SCIENCES

Teachers: 
Bianco Federica
Credits: 
6
Site: 
PARMA
Academic year: 
2020/2021
Unit Coordinator: 
Bianco Federica
Disciplinary Sector: 
EXPERIMENTAL PHYSICS
Semester: 
Second semester
Language of instruction: 

English

Learning outcomes of the course unit

This course will teach the basics of data-driven inference in the physical sciences. Students will acquire computational skills, knowledge of statistical analysis and error analysis, good practices for handling, processing, and analyzing data (including big data) programmatically, and communication and visualization skills.

Prerequisites

Coding experience, preferably in Python. Basic statistical knowledge (descriptive statistics). Basic linear algebra knowledge (matrices, vectors, matrix multiplication and transformations).

Course contents summary

The course will be organized in a modular fashion, with some guest lectures. Each machine learning method will be studied as it is applied to a physical problem, based on open data and examples from the literature. Students will learn from examples of machine learning methods applied to current problems in Physics and the Natural Sciences. Some of the simpler algorithms will be explored in detail and implemented from scratch; others will be implemented using dedicated Python libraries.

Course contents

The course will review traditional Null Hypothesis Testing concepts as well as modern applied statistics and machine learning methods, including: Bayesian Statistics, Markov Chain Monte Carlo, Principal Component Analysis, Support Vector Machines, Tree methods, Clustering, and Neural Networks (including Autoencoders, Convolutional, and Recurrent Neural Networks).
You will learn from examples of machine learning methods applied to current problems in Physics and the Natural Sciences. You will acquire basic computational skills, knowledge of statistical analysis and error analysis, good practices for handling, processing, and analyzing data (including big data) programmatically, and communication and visualization skills.

Recommended readings

No textbook is required, but several textbooks may be helpful throughout the class, including:
• Elements of Statistical Learning, Hastie, Tibshirani, Friedman, Springer 2001
• Statistics, Data Mining, and Machine Learning in Astronomy, Ivezić, Connolly, VanderPlas, Gray, Princeton Press, 2nd edition
• ML in Python: Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow. This is probably the book closest to the syllabus in terms of techniques, but don't buy it: the second edition is due to come out imminently, and the deep learning chapters of the previous edition are now out of date.

Additional textbooks, particularly helpful for students with less experience in coding or Python, include:
• Python Data Science Handbook, Jake VanderPlas, O'Reilly Media [https://www.oreilly.com/library/view/python-data-science/9781491912126/]
• Computing and coding: Beginning Python Visualization, 2009
• Data analysis: Statistics in a Nutshell, S. Boslaugh, O'Reilly Media
• Visualization: Visualization Analysis and Design, T. Munzner, 2014

Most of the content of the listed books that will be referred to in lectures can be found online.

Teaching methods

Google Colaboratory will be used for the class. Homework can be developed on any platform as long as the computational setup is consistent across the entire class: the class assistants and I need to be able to reproduce your work and obtain the same results. The modules and libraries used in your work need to be accessible to me, the graders, and your classmates. We may also provide a Docker image and a virtual environment, along with instructions on how to set up your environment, so that you can work offline.
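
As an illustration of what a reproducible setup can look like, here is a minimal sketch, not a class requirement. It records the versions of the libraries in use and fixes the seed of a random number generator so that a rerun gives the same numbers. The helper names `environment_report` and `seeded_draws` and the package list are assumptions made for this example; only the Python standard library is used.

```python
import platform
import random
from importlib import metadata


def environment_report(packages):
    """Map 'python' and each requested package name to a version string."""
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            report[name] = "not installed"
    return report


def seeded_draws(seed, n):
    """Return n uniform random numbers from a generator with a fixed seed."""
    rng = random.Random(seed)  # local generator, global state is untouched
    return [rng.random() for _ in range(n)]


if __name__ == "__main__":
    # Hypothetical package list: replace with the libraries your work imports.
    for name, version in environment_report(["numpy", "scikit-learn"]).items():
        print(f"{name}: {version}")
    # The same seed yields the same draws on every run.
    print(seeded_draws(seed=42, n=3))
```

Pasting a report like this at the top of a notebook, and seeding every random step, makes it much easier for someone else to rerun the work and compare results.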

Assessment methods and criteria

Homework (reproducing analyses from the literature) done in groups, a real-time midterm exam, a final group project, and quizzes to assess ongoing understanding. Participation is also included in the grade.

Other information