This one-day workshop, led by Nikolay Oskolkov from Lund University, introduces machine learning techniques for data analysis, combining theoretical knowledge with practical coding skills in R and Python and using ChatGPT as a research assistant for coding and interpreting results. Participants will learn to implement algorithms such as neural networks, random forest, k-means clustering, Gaussian Mixture Models (GMM), and Markov Chain Monte Carlo (MCMC) from scratch and to optimize them, making the workshop a practical resource for advancing research in statistics and data science.
Machine learning has become an indispensable tool in computational data analysis, offering powerful techniques for analyzing and interpreting complex data. As data volumes continue to grow, the ability to apply machine learning algorithms effectively is crucial for advancing research across the biological, environmental, health, and social sciences, as well as statistics and engineering. Nikolay Oskolkov from Molecular Biosciences, Lund University, brings his expertise to this one-day workshop, showing how to use AI tools like ChatGPT alongside R and Python for advanced data analysis and equipping participants with both theoretical knowledge and practical skills.
This workshop is particularly valuable for PhD students, academics, and professional researchers who want to strengthen their analytical capabilities with machine learning. By focusing on practical applications in R and Python, with coding and interpretation assistance from ChatGPT, participants will learn the theoretical underpinnings of various machine learning algorithms and gain hands-on experience coding these algorithms from scratch in an intuitive way. This dual approach means attendees can apply what they learn directly to their own research projects in computational data analysis.
Workshop Topics and Learning Objectives:
Introduction to machine learning in data science and computational biology
Limitations of traditional statistics and the need for a machine learning approach
Understanding the principles of neural networks and their applications
Coding gradient descent and a neural network from scratch in R and Python and optimizing them with ChatGPT
Choosing a machine learning algorithm for tabular, image, text, and time series data
Implementing the random forest algorithm from scratch in R and Python and improving it with ChatGPT
K-means clustering and Gaussian Mixture Models (GMM) in R and Python
Markov Chain Monte Carlo (MCMC) methods in R and Python for bioinformatics and genomics
Applications of autoencoder neural networks for integrating heterogeneous data
Case studies and real-world applications with live coding and interpretation with ChatGPT
Troubleshooting and optimizing machine learning models, including hyperparameter tuning
Ethical considerations and best practices in machine learning research
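To give a flavor of what coding gradient descent "from scratch" can look like, here is a minimal Python sketch. It is an illustration only and is not taken from the workshop material: the function name `fit_line_gd`, the toy data, and the learning rate and step count are all assumptions chosen for this example. It fits a straight line by repeatedly stepping against the gradient of the mean squared error.

```python
def fit_line_gd(xs, ys, learning_rate=0.05, n_steps=2000):
    """Fit y = w*x + b by gradient descent on the mean squared error.

    Hypothetical helper for illustration; not part of the workshop code.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_steps):
        # Prediction errors under the current parameters.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error w.r.t. w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (made up for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line_gd(xs, ys)
print(round(w, 3), round(b, 3))
```

On this noise-free data the fitted slope and intercept should end up very close to 2 and 1. With a learning rate that is too large the updates would diverge instead, which is the kind of behavior the troubleshooting and tuning topics address.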
PhD students and academic researchers will benefit in particular, because the workshop addresses both the theoretical and the practical side of machine learning. By the end of the workshop, participants will understand how to implement and apply various machine learning algorithms in their own research. The hands-on coding sessions build the skills needed to develop and optimize models, so that attendees can approach their own research challenges with confidence.
Participants can expect to gain a working understanding of machine learning algorithms such as neural networks, random forest, k-means clustering, Gaussian Mixture Models (GMM), and Markov Chain Monte Carlo (MCMC). The workshop also covers practical coding techniques in R and Python, enabling attendees to implement these algorithms from scratch, and provides insight into troubleshooting and optimizing models as well as ethical considerations in machine learning research. By the end of the day, participants will be equipped to apply these tools to their own research projects.
Live-streaming seminars are taught via Zoom and feature take-home skill challenges. All Zoom recordings and material (including program input, output, data, and slides) are available online for 30 days after the seminar concludes, in case you would prefer to attend asynchronously or would like to revisit the seminar content afterwards. The instructor also monitors an online seminar chat forum for 30 days after the seminar concludes, so that you can ask questions about the seminar content outside of the live sessions. For on-demand seminars, all videos and material (including program input, output, data, and slides) are available for 30 days after you activate your enrollment, which you can do at any time after purchase. An official Instats certificate of completion is provided at the conclusion of all seminars. For European students, our seminars offer ECTS Equivalent points, which are indicated on the certificate of completion provided at the conclusion of each seminar (see the Instats FAQ for details).