2A Lab 2 Week 3

This is the pair coding activity related to Chapter 2.

We will continue working with the data from Binfet et al. (2021), focusing on the randomised controlled trial of therapy dog interventions. Today, our goal is to calculate an average Flourishing score for each participant at time point 1 (pre-intervention) using the raw data file dog_data_raw. Currently, the data looks like this:

RID	F1_1	F1_2	F1_3	F1_4	F1_5	F1_6	F1_7	F1_8
1	6	7	5	5	7	7	6	6
2	5	7	6	5	5	5	5	4
3	5	5	5	6	6	6	5	5
4	7	6	7	7	7	6	7	4
5	5	5	4	6	7	7	7	6

However, we want the data to look like this:

RID	Flourishing_pre
1	6.125
2	5.250
3	5.375
4	6.375
5	5.875
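
As a quick check on the target table, the value for participant 1 is just the mean of that participant's eight item responses shown above. A minimal base R sketch (the object name `responses_p1` is made up for illustration):

```r
# Item responses F1_1 to F1_8 for participant 1, copied from the first table
responses_p1 <- c(6, 7, 5, 5, 7, 7, 6, 6)

# Sum of the items divided by the number of items
sum(responses_p1) / length(responses_p1)

# The same calculation using mean()
mean(responses_p1)
```

Both lines return 6.125, which matches the first row of the target table.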

Task 1: Open the R project you created last week

If you haven’t created an R project for the lab yet, please do so now. If you already have one set up, go ahead and open it.

Task 2: Open your `.Rmd` file from last week

Since we haven’t used it much yet, feel free to continue using the .Rmd file you created last week in Task 2.

Task 3: Load in the library and read in the data

The data should be in your project folder. If you didn’t download it last week, or if you’d like a fresh copy, you can download the data again here: data_pair_coding.

We will be using the tidyverse package today, and the data file we need to read in is dog_data_raw.csv.

Hint

# loading tidyverse into the library
library(???)

# reading in `dog_data_raw.csv`
dog_data_raw <- read_csv("???")

Task 4: Calculating the mean for `Flourishing_pre`

Step 1: Select all relevant columns from dog_data_raw, including participant ID and all items from the Flourishing questionnaire completed before the intervention. Store this data in an object called data_flourishing.

Hint

Look at the codebook. Try to determine:

The variable name of the column where the participant ID is stored.
The items related to the Flourishing scale at the pre-intervention stage.

More concrete hint

From the codebook, we know that:

The participant ID column is called RID.
The Flourishing items at the pre-intervention stage start with F1_.

data_flourishing <- ??? %>% 
  select(???, F1_???:F1_???)
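
If the colon notation inside select() is unfamiliar, the sketch below shows the same pattern on a small made-up tibble. The object and column names (`toy_wide`, `id`, `age`, `q_1` to `q_3`) are hypothetical and are not part of the lab data.

```r
library(tidyverse)

# Hypothetical toy data, not the lab data
toy_wide <- tibble(
  id  = 1:2,
  age = c(19, 21),
  q_1 = c(3, 4),
  q_2 = c(5, 2),
  q_3 = c(4, 4)
)

# Keep the id column plus every column from q_1 through q_3;
# the age column is dropped because it is not named in select()
toy_selected <- toy_wide %>%
  select(id, q_1:q_3)

names(toy_selected)
```

The colon selects a run of adjacent columns by position, so it only works as intended when the items sit next to each other in the data.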

Step 2: Pivot the data from wide format to long format so we can calculate the average score more easily (in step 3).

Hint

Which pivot function should you use? We have pivot_wider() and pivot_longer() to choose from.

We also need 3 arguments in that function:

The columns you want to select (e.g., all the Flourishing items),
The name of the column where the current column headings will be stored (e.g., “Questionnaire”),
The name of the column that should store all the values (e.g., “Responses”).

More concrete hint

We need pivot_longer(). You already encountered pivot_longer() in first year (or in the individual walkthrough, if you have already completed this chapter). The three arguments were also a giveaway: pivot_wider() only requires two.

  pivot_longer(cols = ???, names_to = "???", values_to = "???")
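
To see what the three arguments do, here is a minimal sketch on made-up data. All names (`toy_wide`, `id`, `q_1`, `q_2`, `item`, `score`) are hypothetical; swap in the lab's own column names when you fill in the template above.

```r
library(tidyverse)

# Hypothetical wide data: one row per person, one column per item
toy_wide <- tibble(
  id  = 1:2,
  q_1 = c(3, 4),
  q_2 = c(5, 2)
)

toy_long <- toy_wide %>%
  pivot_longer(
    cols = q_1:q_2,      # the columns to stack
    names_to = "item",   # new column holding the old column headings
    values_to = "score"  # new column holding the cell values
  )

toy_long
```

The result has one row per person per item (four rows here), with the old headings `q_1` and `q_2` now stored as values in `item`.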

Step 3: Calculate the average Flourishing score per participant and name this column Flourishing_pre to match the table above.

Hint

Before summarising the mean, you may need to group the data.

More concrete hint

To compute an average score per participant, we would need to group by participant ID first.

  group_by(???) %>% 
  summarise(Flourishing_pre = mean(???)) %>% 
  ungroup()
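
The grouping step can be tried out on a small long-format example first. This is only a sketch with made-up names (`toy_long`, `id`, `score`, `mean_score`), not the lab data.

```r
library(tidyverse)

# Hypothetical long data: two people with two responses each
toy_long <- tibble(
  id    = c(1, 1, 2, 2),
  score = c(3, 5, 4, 2)
)

toy_means <- toy_long %>%
  group_by(id) %>%                       # one group per person
  summarise(mean_score = mean(score)) %>% # one mean per group
  ungroup()                              # safeguard against leftover grouping

toy_means
```

Without group_by(), summarise() would return a single overall mean rather than one row per person.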

Solution

# loading tidyverse into the library
library(tidyverse)

# reading in `dog_data_raw.csv`
dog_data_raw <- read_csv("dog_data_raw.csv")

# Task 4: Tidying 
data_flourishing <- dog_data_raw %>% 
  # Step 1
  select(RID, F1_1:F1_8) %>% 
  # Step 2
  pivot_longer(cols = -RID, names_to = "Questionnaire", values_to = "Responses") %>% 
  # Step 3
  group_by(RID) %>% 
  summarise(Flourishing_pre = mean(Responses)) %>% 
  ungroup()
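
One way to convince yourself the pipeline works is to run it on the five rows shown at the top of this activity and compare the result with the target table. The sketch below types those rows in by hand; the object names `sample_raw` and `sample_flourishing` are made up for this check.

```r
library(tidyverse)

# First five participants, copied from the table at the top of this activity
sample_raw <- tribble(
  ~RID, ~F1_1, ~F1_2, ~F1_3, ~F1_4, ~F1_5, ~F1_6, ~F1_7, ~F1_8,
     1,     6,     7,     5,     5,     7,     7,     6,     6,
     2,     5,     7,     6,     5,     5,     5,     5,     4,
     3,     5,     5,     5,     6,     6,     6,     5,     5,
     4,     7,     6,     7,     7,     7,     6,     7,     4,
     5,     5,     5,     4,     6,     7,     7,     7,     6
)

# Same steps as the solution, applied to the five sample rows
sample_flourishing <- sample_raw %>%
  select(RID, F1_1:F1_8) %>%
  pivot_longer(cols = -RID, names_to = "Questionnaire", values_to = "Responses") %>%
  group_by(RID) %>%
  summarise(Flourishing_pre = mean(Responses)) %>%
  ungroup()

sample_flourishing
```

The `Flourishing_pre` column should read 6.125, 5.250, 5.375, 6.375 and 5.875, the same values as in the target table.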
