Week 8 | Progress feedback one-on-one

Class Details

📅 Date: 14 May, 2025
⏰ Time: 14:00h - 16:00h
📖 Synopsis: One-on-one meeting focused on receiving feedback on the manuscript draft, the R code used for data analysis, interpretation of results, and adherence to best practices for reproducible research.

Done in Class
To Do at Home

Books suggested to help with data analysis:

The Art of Data Science. Roger D. Peng and Elizabeth Matsui. 2018. https://leanpub.com/artofdatascience
R for Data Science (2e). Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023. https://r4ds.hadley.nz/
Tidy Modeling with R. Max Kuhn and Julia Silge. 2023. https://www.tmwr.org/
Modern Statistics for Modern Biology. Susan Holmes and Wolfgang Huber. 2025. https://www.huber.embl.de/msmb/

Overview of Model Classifications

How Does Modeling Fit into the Data Analysis Process?

Based on: R for Data Science. Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023. | Tidy Modeling with R. Max Kuhn and Julia Silge. 2023.

Specific tasks for BB

Refactor the data analysis by splitting the current single script into two scripts: one for descriptive analysis and one for modeling.
Apply the following changes to the data analysis scripts:
- Add the site variable to the tidy dataset.
- Exclude the id variable from the MCA calculation.
- Color the points in the PCA plots by relevant categorical variables to help identify groupings.
- Move the PCA visualization with variable vectors to the modeling script.
- Combine the individual histograms into a single multi-panel figure.
- Stop printing descriptive statistics for each variable separately; keep only the summary table that describes all variables.
Draft the Scientific Data descriptor manuscript: prepare a clean first draft to share with the tutor for feedback.
Prepare final questions or suggestions related to reproducible data analysis for discussion.
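
Several of the script changes listed above (excluding the id variable, coloring PCA points by a categorical variable, combining histograms into one figure, and keeping a single summary table) could look roughly like the base R sketch below. It is only an illustration under stated assumptions: the built-in iris dataset stands in for the real tidy dataset, the id and site columns are invented placeholders, and PCA is used to show the column exclusion that the notes request for MCA. The actual scripts may use different packages and variable names.

```r
# Stand-in data: the built-in iris dataset plays the role of the tidy dataset.
# The `id` and `site` columns are hypothetical and exist only for illustration.
tidy_data <- iris
tidy_data$id <- seq_len(nrow(tidy_data))
tidy_data$site <- factor(
  rep(c("site_a", "site_b", "site_c"), length.out = nrow(tidy_data))
)

# Drop the identifier before any multivariate analysis.
# The same column selection applies before an MCA on categorical variables.
analysis_vars <- tidy_data[, setdiff(names(tidy_data), "id")]

# PCA on the numeric variables only, centered and scaled to unit variance.
numeric_vars <- analysis_vars[, sapply(analysis_vars, is.numeric)]
pca <- prcomp(numeric_vars, center = TRUE, scale. = TRUE)

# Color the PCA scores by a categorical variable to reveal groupings.
group <- analysis_vars$Species
plot(pca$x[, 1], pca$x[, 2],
     col = as.integer(group), pch = 19,
     xlab = "PC1", ylab = "PC2")
legend("topright", legend = levels(group),
       col = seq_along(levels(group)), pch = 19)

# Combine the individual histograms into one multi-panel figure (2 x 2 grid).
old_par <- par(mfrow = c(2, 2))
for (v in names(numeric_vars)) {
  hist(numeric_vars[[v]], main = v, xlab = v)
}
par(old_par)  # restore the previous plotting layout

# One summary table for all numeric variables instead of per-variable printing.
summary_table <- data.frame(
  variable = names(numeric_vars),
  mean = sapply(numeric_vars, mean),
  sd = sapply(numeric_vars, sd),
  min = sapply(numeric_vars, min),
  max = sapply(numeric_vars, max),
  row.names = NULL
)
print(summary_table)
```

In a two-script layout, the histogram and summary-table parts would sit in the descriptive script and the PCA part in the modeling script.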