How to analyse and publish health data research using R Markdown (Rmd)
Objectives: To demonstrate the importance of reproducible research tools for collaboration and
communication using an open-source programming language. To provide a brief overview of data
wrangling, descriptive statistics and publishing options using Rmd.
Methods: Data from the Growing Up in Ireland survey (child cohort) were used to illustrate the
main steps of a data analysis project: import, cleaning and exploratory data analysis.
The analysis of dental health-related variables is presented as R code, text chunks and output,
including graphs and tables from an Rmd report.
Results: The importance of a reproducible workflow is presented, including data import, renaming
variables, summary statistics and data visualisation. Key R packages and functions are highlighted.
Interactive changes to the data analysis will be demonstrated and the code re-executed to
generate new output documents. The final output is rendered, using the knitr package, as a LaTeX
document and as ioslides for presentation. Emphasis is placed on the benefits of using Rmd
for data analysis compared with commercial statistical packages that use a graphical user interface.
Conclusions: R is a versatile open-source language that can be used for generating executable
code, high-quality data visualisation and interactive or static documents. These can be produced in
multiple formats such as HTML, PDF, Word or a LaTeX document for direct publication.
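
As a minimal sketch of how several output formats might be requested from a single Rmd file, a YAML header along the following lines could be placed at the top of the report. The title is a placeholder and is not taken from the study. The format names are standard rmarkdown output formats, and keep_tex asks for the intermediate LaTeX source to be retained alongside the PDF.

```yaml
---
title: "Placeholder report title"   # hypothetical, for illustration only
output:
  html_document: default            # HTML page
  word_document: default            # Word (.docx) file
  ioslides_presentation: default    # ioslides slide deck
  pdf_document:
    keep_tex: true                  # keep the .tex file used to build the PDF
---
```

With a header like this, calling rmarkdown::render("report.Rmd", output_format = "all") would render every listed format in one step; the file name report.Rmd is assumed here for illustration.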