Day 1: Data Processing and Unsupervised Methods
Morning Session:
Data Processing and Preparation
1. Course Introduction and Official Commencement
2. Lecture – Data Collection and Processing
- The importance of data quality in the analysis of chemical data.
- An overview of data sources and data collection methods.
- Data processing techniques for handling missing values, duplicates, and outliers.
3. Code-Along Session with the Instructor (Python)
- Techniques for data processing and preparation in Python.
- Utilizing specialized libraries for data science.
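As an illustration of the kind of material the morning code-along could cover, here is a minimal sketch of the three cleaning steps named in the lecture (duplicates, missing values, outliers). It assumes pandas and NumPy as the data science libraries. The column names, the toy values, the median imputation, and the 1.5 x IQR outlier rule are all assumptions chosen for the example, not the course's prescribed method.

```python
import numpy as np
import pandas as pd

# Hypothetical toy table; compound labels and values are invented for illustration.
raw = pd.DataFrame({
    "compound": ["A", "B", "B", "C", "D", "E", "F"],
    "mol_weight": [180.2, 46.1, 46.1, np.nan, 58.1, 60.1, 5000.0],
    "logp": [0.5, -0.3, -0.3, 1.5, np.nan, 0.9, 0.2],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicate rows, impute missing numbers, and filter IQR outliers."""
    # 1. Duplicates: keep the first copy of each fully identical row.
    out = df.drop_duplicates().copy()

    # 2. Missing values: fill numeric gaps with the column median
    #    (one simple choice among several; assumed here for illustration).
    num_cols = out.select_dtypes(include="number").columns
    out[num_cols] = out[num_cols].fillna(out[num_cols].median())

    # 3. Outliers: keep rows whose numeric values all lie within
    #    [Q1 - 1.5*IQR, Q3 + 1.5*IQR] for their column.
    q1 = out[num_cols].quantile(0.25)
    q3 = out[num_cols].quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = ((out[num_cols] >= lower) & (out[num_cols] <= upper)).all(axis=1)
    return out[inside].reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)
```

The order of the steps matters in this sketch: duplicates are removed first so that repeated rows do not bias the median used for imputation, and outliers are filtered last so that the quartiles are computed on complete columns.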
Afternoon Session:
Introduction to Chemical Data Analysis and Unsupervised Machine Learning
1. Lecture – Introduction to Chemical Data Analysis
- The specifics of chemical data analysis and the types of chemical data (chemical structure descriptors, properties).
- Popular chemical databases.
- Unsupervised machine learning methods in chemistry with examples.
2. Code-Along Session with the Instructor (Python)
- Using Python and public database APIs to automate the generation of chemical datasets.
- Applying the generated datasets to multidimensional data analysis, with examples of unsupervised machine learning algorithms (clustering, principal component analysis).
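The unsupervised methods listed for the afternoon session can be sketched roughly as follows. This example assumes scikit-learn and uses a synthetic descriptor matrix as a stand-in for a dataset generated from a chemical database; the two artificial groups, the five descriptor columns, and the choice of k-means with two clusters are assumptions made only for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for a descriptor matrix: 60 "molecules" x 5 descriptors,
# drawn from two well-separated groups. These are not real chemical data.
group_a = rng.normal(loc=0.0, scale=1.0, size=(30, 5))
group_b = rng.normal(loc=6.0, scale=1.0, size=(30, 5))
X = np.vstack([group_a, group_b])

# Standardize each descriptor so that no single scale dominates the analysis.
X_scaled = StandardScaler().fit_transform(X)

# Principal component analysis: project onto the two directions of largest variance.
pca = PCA(n_components=2)
scores = pca.fit_transform(X_scaled)

# Clustering: group the projected points with k-means (k = 2 is assumed here).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(scores)

print("Explained variance ratio:", pca.explained_variance_ratio_)
print("Cluster sizes:", np.bincount(labels))
```

With real descriptors the number of clusters is usually not known in advance, so in practice it would be chosen by inspecting the PCA score plot or a cluster quality measure rather than fixed up front as it is here.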