Week 8 | Progress feedback one-on-one

Class Details

📅 Date: 14 May, 2025
Time: 14:00h - 16:00h
📖 Synopsis: One-on-one meeting focused on receiving feedback on the manuscript draft, the R code used for data analysis, interpretation of results, and adherence to best practices for reproducible research.

Books suggested to help with data analysis:

  1. The Art of Data Science.
    Roger D. Peng and Elizabeth Matsui. 2018.
    https://leanpub.com/artofdatascience

  2. R for Data Science (2e).
    Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023.
    https://r4ds.hadley.nz/

  3. Tidy Modeling with R.
    Max Kuhn and Julia Silge. 2023.
    https://www.tmwr.org/

  4. Modern Statistics for Modern Biology.
    Susan Holmes and Wolfgang Huber. 2025.
    https://www.huber.embl.de/msmb/

Overview of Model Classifications

How Does Modeling Fit into the Data Analysis Process?

Based on: R for Data Science. Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023; and Tidy Modeling with R. Max Kuhn and Julia Silge. 2023.

Specific tasks for BB

  1. Refactor the data analysis by splitting the current single script into two scripts: one for descriptive analysis and one for modeling.

  2. Apply the following changes to the data analysis scripts:

    • Add the site variable to the tidy dataset.
    • Exclude the id variable from the MCA calculation.
    • Color the PCA plots using relevant categorical variables to help identify groupings.
    • Move the PCA visualization with variable vectors to the modeling script.
    • Combine the individual histograms into a single multi-panel figure.
    • Stop printing descriptive statistics for each variable separately; keep only the summary table that describes all variables.

  3. Draft the Scientific Data descriptor manuscript: prepare a clean first draft to share with the tutor for feedback.

  4. Prepare final questions or suggestions related to reproducible data analysis for discussion.
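
As a starting point for the script changes listed in task 2, here is a minimal base R sketch of four of them: dropping id before the MCA, a single summary table, a multi-panel histogram figure, and a PCA plot colored by a categorical variable. The toy dataset and every column name in it (id, site, habitat, height, weight, length) are hypothetical placeholders, not the real project data, and base R is used only so the sketch runs without extra packages; the same steps could be written with the tidyverse tools the scripts may already use.

```r
# Hypothetical toy data standing in for the tidy dataset.
# All column names and values are assumptions made for illustration.
set.seed(1)
tidy_data <- data.frame(
  id      = sprintf("P%02d", 1:60),
  site    = rep(c("A", "B", "C"), each = 20),
  habitat = sample(c("forest", "field"), 60, replace = TRUE),
  height  = rnorm(60, mean = rep(c(10, 12, 15), each = 20)),
  weight  = rnorm(60, mean = rep(c(50, 55, 65), each = 20), sd = 5),
  length  = rnorm(60, mean = 30, sd = 3)
)

is_num   <- sapply(tidy_data, is.numeric)
num_vars <- setdiff(names(tidy_data)[is_num], "id")
cat_vars <- setdiff(names(tidy_data)[!is_num], "id")

# MCA input: categorical variables only, with the identifier excluded.
# Left in, id would contribute one level per row and dominate the result.
mca_input <- tidy_data[cat_vars]
mca_input[] <- lapply(mca_input, factor)
# mca_input can now be passed to whichever MCA function the script uses.

# One summary table for all numeric variables, replacing per-variable printing.
summary_table <- t(sapply(tidy_data[num_vars], function(x) {
  c(n      = sum(!is.na(x)),
    mean   = mean(x, na.rm = TRUE),
    sd     = sd(x, na.rm = TRUE),
    min    = min(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    max    = max(x, na.rm = TRUE))
}))
print(round(summary_table, 2))

# All histograms in a single multi-panel figure (one panel per variable).
old_par <- par(mfrow = c(1, length(num_vars)))
for (v in num_vars) {
  hist(tidy_data[[v]], main = v, xlab = v)
}
par(old_par)

# PCA on the scaled numeric variables, with points colored by site.
pca  <- prcomp(tidy_data[num_vars], center = TRUE, scale. = TRUE)
site <- factor(tidy_data$site)
plot(pca$x[, 1], pca$x[, 2],
     col = as.integer(site), pch = 19,
     xlab = "PC1", ylab = "PC2")
legend("topright", legend = levels(site),
       col = seq_along(levels(site)), pch = 19, title = "site")
```

Selecting variables by type and then removing id by name keeps the identifier out of both the MCA and the PCA even if its type changes later (for example, if it is read in as an integer).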
