Exploit your data: machine learning for multivariate data analysis
Participant profile
Doctoral students of the UPV/EHU
Calendar
Bizkaia Campus: March-April 2025
Duration / Timetable
20 hours (five 4-hour classes, one per week over 5 weeks)
Time: 09:30 to 13:30
Attendance Requirement
Students are expected to attend at least 90% of the classes and to submit a final practical assignment (see points 3 and 5 of the Basic regulations for participation in transversal training activities organised by the Doctoral School).
Language
English
Modality
Face-to-face
Pre-requisites
Participants must bring their own laptops with R and the RStudio IDE installed.
Location and dates
Campus | Dates | Venue |
---|---|---|
Bizkaia Campus (Leioa) | March: 7, 14, 21, 28; April: 4 | Biblioteca building, Classroom 6A (1st floor) |
Speaker, Trainer and Profile
Giulia Gorla. I hold a Bachelor's degree in Chemistry and Industrial Chemistry and a Master's degree in Chemistry from the University of Insubria (Como). In 2023, I obtained my Ph.D. in Chemical and Environmental Sciences, specializing in analytical chemistry and graduating with honors (Cum Laude), with a doctoral thesis titled "Infrared spectroscopy and Chemometrics: facing analytical chemistry issues through data." My scientific interests encompass the analysis of spectroscopic data, hyperspectral imaging, and the application and development of chemometrics, including Machine Learning and Deep Learning techniques, on several types of data. I am currently working as a postdoctoral fellow in the IBeA Research Group (Ikerkuntza eta Berrikuntza Analitikoa - Analytical Research and Innovation) within the Department of Analytical Chemistry at the University of the Basque Country (UPV/EHU). You can find more about my work experience on my LinkedIn profile (www.linkedin.com/in/giuliagorla95) and about my research interests on my ORCID record (0000-0002-2311-9333).
Group size
20
Registration
From 3 February
Objectives
The main objective of the course is to train students in the use and understanding of machine learning techniques applied to research. Students will learn to use these techniques autonomously and apply them to their research projects. In terms of the competencies and skills to be developed, students will gain the ability to manage data acquisition, structuring, analysis, and visualization, and to critically evaluate the results obtained. Their capacity for critical, logical, and mathematical reasoning will be fostered, along with their ability to solve problems and create models that reflect real-world situations. Additionally, students will learn to design and implement experiments and to analyze and interpret the results.
Self-directed learning will be promoted, as well as the ability to communicate conclusions and knowledge clearly to both specialized and non-specialized audiences. Students will work in teams during the practical sessions and learn to select the most appropriate technique for each problem. They will be trained in the use of statistical software and in the critical evaluation of the results obtained, considering their applicability and possible limitations.
The assessment will be based on the practical work completed during the course and will culminate in a final presentation where students will apply the techniques learned to their own field of study. Mastery of the techniques and the ability to apply them effectively in their areas of interest will be expected.
Competences to be acquired by the doctoral student
- Ability to conceive, design or create, implement and adopt a substantial process of research or creation.
- Ability to critically analyse, evaluate and synthesise new and complex ideas.
- Ability to promote, in academic and professional contexts, scientific, technological, social, artistic or cultural progress within a knowledge-based society.
Format
The course will combine theoretical lectures with practical sessions where students will work in groups or individually on real datasets to apply the techniques they have learned. Active participation and discussion of results will be encouraged to enhance learning, so that students can use these machine learning tools to extract as much information as possible from their data.
Content
This course is designed to provide doctoral students from various disciplines with both a theoretical and a practical understanding of multivariate analysis tools. The course will cover the fundamentals of experimental design, data acquisition strategies, preprocessing, and models for exploration, prediction, and classification. The course structure includes both theoretical lectures and practical sessions focused on solving real-world problems using datasets and machine learning tools in R. The foundational concepts needed for multivariate analysis and for using R will be introduced. The course is divided into the following modules:
- Introduction to the Multivariate Approach and Its Advantages
Presentation of basic concepts and the importance of multivariate analysis across various disciplines. Discussion of the benefits of using multivariate techniques for extracting complex information.
- Exploratory Data Analysis
Principal Component Analysis (PCA) for dimensionality reduction and data interpretation. Detection of outliers, types of outliers, and their impact on analysis. Residual inspection to assess model quality. Clustering techniques for identifying natural groups within the data.
- Data Preprocessing
Visual inspection of data using visualization tools. Preprocessing techniques.
- Regression and Prediction Methods
Introduction to multiple linear regression and multivariate regression methods: Principal Component Regression (PCR) and Partial Least Squares Regression (PLSR). Evaluation of prediction models through internal and external validation.
- Classification Methods
Classification techniques. Comparison and evaluation of classification models using metrics such as accuracy, precision, and ROC curves.
- Experimental Design
Strategies for designing effective experiments and analyzing the results obtained.
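
As a rough illustration of two of the classification metrics named in the Classification Methods module, the sketch below computes accuracy and precision from true and predicted binary labels. It is not course material: the course itself works in R, and plain Python is used here only for illustration. The helper name `accuracy_and_precision` and the toy labels are hypothetical.

```python
def accuracy_and_precision(y_true, y_pred, positive=1):
    """Return (accuracy, precision) for binary labels.

    Hypothetical helper for illustration only; not taken from the course.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    # Accuracy: fraction of samples whose predicted label matches the true one.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)

    # Precision: among samples predicted positive, the fraction truly positive.
    true_pos = sum(t == positive and p == positive
                   for t, p in zip(y_true, y_pred))
    pred_pos = sum(p == positive for p in y_pred)
    # Precision is undefined when nothing is predicted positive.
    precision = true_pos / pred_pos if pred_pos else float("nan")

    return accuracy, precision


# Toy labels, invented for this example.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
acc, prec = accuracy_and_precision(y_true, y_pred)
print(f"accuracy = {acc:.2f}, precision = {prec:.2f}")
```

Accuracy alone can look good on imbalanced data, which is one reason to compare models on more than one metric.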