2A Lab 3 Week 4

This is the pair coding activity related to Chapter 3.

We will once again be working with data from Binfet et al. (2021), a randomised controlled trial of therapy dog interventions. Today, our goal is to calculate the average Loneliness score for each participant measured at time point 1 (pre-intervention) using the raw data file dog_data_raw. Currently, the data looks like this:

RID L1_1 L1_2 L1_3 L1_4 L1_5 L1_6 L1_7 L1_8 L1_9 L1_10 L1_11 L1_12 L1_13 L1_14 L1_15 L1_16 L1_17 L1_18 L1_19 L1_20
1 3 3 4 3 2 3 1 2 3 4 3 1 3 1 2 3 2 3 2 4
2 3 2 3 3 4 3 2 2 4 3 2 2 1 2 4 3 3 2 4 3
3 3 3 2 3 3 4 2 3 3 3 2 2 2 2 3 3 4 3 3 3
4 4 2 2 3 4 4 1 3 3 4 2 1 2 2 4 4 3 3 4 3
5 2 3 3 3 4 3 2 2 3 2 4 4 4 3 2 2 3 4 3 2

But we want the data to look like this:

RID Loneliness_pre
1 2.25
2 1.90
3 2.25
4 1.75
5 2.85

This task is a bit more challenging than last week’s lab activity, as the Loneliness scale includes some reverse-coded items.

Task 1: Open the R project for the lab

Task 2: Open your .Rmd file from last week or create a new .Rmd file

You could continue the .Rmd file you used last week, or create a new .Rmd. If you need some guidance, have a look at Section 1.3.

Task 3: Load in the library and read in the data

The data should already be in your project folder. If you want a fresh copy, you can download the data again here: data_pair_coding.

We are using the package tidyverse today, and the datafile we should read in is dog_data_raw.csv.

# loading tidyverse into the library
library(???)

# reading in `dog_data_raw.csv`
dog_data_raw <- read_csv("???")

Task 4: Calculating the mean for Loneliness_pre

  • Step 1: Select all relevant columns, namely the participant ID and all 20 items of the Loneliness questionnaire completed by participants before the intervention. Store this data in an object called data_loneliness.

Look at the codebook. Try to figure out

  • the variable name of the column in which the participant ID is stored, and
  • which items relate to the Loneliness scale at Stage “pre”.

You should find that

  • the participant ID column is called RID, and
  • the Loneliness items at the pre-intervention stage start with L1_.
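
As a minimal sketch of this kind of column selection, the example below uses a made-up tibble rather than the lab data. The object and column names (toy_wide, ID, A1_1, A1_2, Age) are hypothetical and only serve to show how select() and starts_with() work together.

```r
library(tidyverse)

# Hypothetical toy data: an ID column, two "A1_" items and one unrelated column
toy_wide <- tibble(
  ID   = 1:3,
  A1_1 = c(1, 2, 3),
  A1_2 = c(4, 3, 2),
  Age  = c(19, 21, 20)
)

# Keep the ID column plus every column whose name starts with "A1_"
toy_items <- toy_wide %>%
  select(ID, starts_with("A1_"))

names(toy_items)
```

The unrelated Age column is dropped, while the ID column and both item columns are kept.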

  • Step 2: Pivot the data from wide format to long format so we can reverse-score and calculate the average score more easily (in Step 3).

We need one of the pivot_ functions here. We also need 3 arguments in that function:

  • the columns we want to select (e.g., all the loneliness items),
  • the name of the column in which the current column headings will be stored (e.g., “Qs”), and
  • the name of the column that should store all the values (e.g., “Responses”).

  pivot_longer(cols = ???, names_to = "???", values_to = "???")
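
To see what the three arguments do, here is a small sketch on made-up data. The names toy_wide, ID, Item and Value are hypothetical and are not part of the lab data.

```r
library(tidyverse)

# Hypothetical wide data: one row per participant, one column per item
toy_wide <- tibble(
  ID   = 1:2,
  A1_1 = c(1, 2),
  A1_2 = c(4, 3),
  A1_3 = c(2, 2)
)

toy_long <- toy_wide %>%
  pivot_longer(
    cols      = -ID,      # pivot every column except the ID column
    names_to  = "Item",   # the old column headings end up in this column
    values_to = "Value"   # the cell values end up in this column
  )

toy_long
```

The result has one row per participant and item (2 participants x 3 items = 6 rows), which is the shape we need for reverse-scoring and averaging.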

  • Step 3: Reverse-scoring

Identify the items on the Loneliness scale that are reverse-coded, and then reverse-score them accordingly.

We need to figure out:

  • which items of the Loneliness scale we need to reverse-score,
  • what the measuring scale of Loneliness is, so we can determine the new values,
  • which function to use to create a new column that has the corrected scores in it, and
  • which one of the case_ functions will get us there.

You should find that

  • the reverse-coded items can be found in the codebook: L1_1, L1_5, L1_6, L1_9, L1_10, L1_15, L1_16, L1_19, L1_20,
  • the Loneliness scale ranges from 1 to 4, so we need to replace 1 with 4, 2 with 3, 3 with 2, and 4 with 1 (that is, the corrected score is 5 minus the response),
  • the function to create a new column is mutate(), and
  • it’s a conditional statement rather than “just” replacing values, hence we need case_when().

  mutate(Score_corrected = case_when(
    ??? ~ ???,
    .default = ???
    ))
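
The logic of reverse-scoring on a 1 to 4 scale can be sketched on made-up data as follows. The names toy_long, Item, Value and Value_corrected are hypothetical, and we simply assume that item "Q2" is the reverse-coded one. Note that the .default argument needs a reasonably recent version of dplyr (1.1.0 or later).

```r
library(tidyverse)

# Hypothetical long-format data: item "Q2" is assumed to be reverse-coded
toy_long <- tibble(
  ID    = c(1, 1, 2, 2),
  Item  = c("Q1", "Q2", "Q1", "Q2"),
  Value = c(1, 4, 3, 2)
)

toy_scored <- toy_long %>%
  mutate(Value_corrected = case_when(
    Item %in% c("Q2") ~ 5 - Value,  # on a 1-4 scale: 1 <-> 4, 2 <-> 3
    .default = Value                # all other items stay unchanged
  ))

toy_scored
```

Subtracting the response from 5 (the scale minimum plus the scale maximum) gives the same result as recoding each value by hand, but in a single line.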

  • Step 4: Calculate the average Loneliness score per participant. To match the table above, we want to call this column Loneliness_pre.

This requires grouping and summarising.

  group_by(???) %>% 
  summarise(Loneliness_pre = ???(???)) %>% 
  ungroup()
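
A minimal sketch of grouping and summarising, again on made-up data (toy_scores, ID, Value_corrected and Mean_score are hypothetical names):

```r
library(tidyverse)

# Hypothetical corrected scores in long format, with one missing response
toy_scores <- tibble(
  ID              = c(1, 1, 1, 2, 2, 2),
  Value_corrected = c(1, 2, 3, 4, 4, NA)
)

toy_means <- toy_scores %>%
  group_by(ID) %>%                                              # one group per participant
  summarise(Mean_score = mean(Value_corrected, na.rm = TRUE)) %>%  # mean per group, ignoring NA
  ungroup()

toy_means
```

With na.rm = TRUE, the mean for participant 2 is based on the two available responses only. Without it, that mean would be NA.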

Putting Tasks 3 and 4 together, the full solution is:

# loading tidyverse into the library
library(tidyverse)

# reading in `dog_data_raw.csv`
dog_data_raw <- read_csv("dog_data_raw.csv")

# Task 4: Tidying 
loneliness_tidy <- dog_data_raw %>% 
  # Step 1
  select(RID, starts_with("L1")) %>% # select(RID, L1_1:L1_20) also works
  # Step 2
  pivot_longer(cols = -RID, names_to = "Qs", values_to = "Response") %>% 
  # Step 3
  mutate(Score_corrected = case_when(
    # reverse-score these items on the 1-4 scale (1 <-> 4, 2 <-> 3)
    Qs %in% c("L1_1", "L1_5", "L1_6", "L1_9", "L1_10", "L1_15", "L1_16", "L1_19", "L1_20") ~ 5 - Response,
    .default = Response
    )) %>% 
  # Step 4
  group_by(RID) %>% 
  summarise(Loneliness_pre = mean(Score_corrected, na.rm = TRUE)) %>% 
  ungroup()
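
If you want to check the pipeline without the data file, the sketch below rebuilds the five participants from the first table of this activity by hand and runs the same steps on them. The object names first_five, item_names, reverse_items and check are made up for this illustration.

```r
library(tidyverse)

# The responses of participants 1 to 5, copied from the first table above
item_names <- paste0("L1_", 1:20)
responses <- matrix(
  c(3, 3, 4, 3, 2, 3, 1, 2, 3, 4, 3, 1, 3, 1, 2, 3, 2, 3, 2, 4,
    3, 2, 3, 3, 4, 3, 2, 2, 4, 3, 2, 2, 1, 2, 4, 3, 3, 2, 4, 3,
    3, 3, 2, 3, 3, 4, 2, 3, 3, 3, 2, 2, 2, 2, 3, 3, 4, 3, 3, 3,
    4, 2, 2, 3, 4, 4, 1, 3, 3, 4, 2, 1, 2, 2, 4, 4, 3, 3, 4, 3,
    2, 3, 3, 3, 4, 3, 2, 2, 3, 2, 4, 4, 4, 3, 2, 2, 3, 4, 3, 2),
  nrow = 5, byrow = TRUE,
  dimnames = list(NULL, item_names)
)

first_five <- as_tibble(responses) %>%
  mutate(RID = 1:5, .before = 1)

# The reverse-coded items listed in Step 3
reverse_items <- paste0("L1_", c(1, 5, 6, 9, 10, 15, 16, 19, 20))

check <- first_five %>%
  pivot_longer(cols = -RID, names_to = "Qs", values_to = "Response") %>%
  mutate(Score_corrected = case_when(
    Qs %in% reverse_items ~ 5 - Response,
    .default = Response
  )) %>%
  group_by(RID) %>%
  summarise(Loneliness_pre = mean(Score_corrected)) %>%
  ungroup()

check
```

The Loneliness_pre values for these five participants should be 2.25, 1.90, 2.25, 1.75 and 2.85, matching the target table at the top of this activity.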

If you’d like to practise your data wrangling skills further, you can try the “Challenge yourself” scenarios at the end of Chapter 3.