`library(splithalfr)`

This vignette describes a scoring method similar to Heuer, Rinck, and Becker (2007): the double difference of median reaction times (RTs) for correct responses on Approach Avoidance Task (AAT) data. The score compares the approach bias towards test stimuli with the approach bias towards control stimuli: (avoid_test - approach_test) - (avoid_control - approach_control).
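
As a quick illustration of the arithmetic, the sketch below applies the double difference to four made-up median RTs. The values and variable names are hypothetical and are not taken from `ds_aat`.

```r
# Hypothetical median RTs in milliseconds (illustrative values only)
avoid_test       <- 700
approach_test    <- 650
avoid_control    <- 680
approach_control <- 670

# Approach bias for test stimuli minus approach bias for control stimuli
score <- (avoid_test - approach_test) - (avoid_control - approach_control)
score
```

With these numbers, a positive score means that avoiding was slower than approaching by a larger margin for test stimuli than for control stimuli.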

Load the included AAT dataset and inspect its documentation.

```
data("ds_aat", package = "splithalfr")
?ds_aat
```

The columns used in this example are:

- UserID, which identifies participants
- block_type, in order to select only assessment blocks
- appr, approach or avoid trial
- cat, in order to compare test and control stimuli
- response, in order to select only correct first responses
- rt, in order to calculate medians for avoid_test, approach_test, avoid_control, and approach_control
- stim, stimulus ID

Only select trials from assessment blocks.

`ds_aat <- subset(ds_aat, block_type == "assess")`

The variables `appr` and `stim` were counterbalanced. Below we illustrate this for the first participant.

```
ds_1 <- subset(ds_aat, UserID == 1)
table(ds_1$appr, ds_1$stim)
```
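
What a counterbalanced cross-table looks like can be sketched with a small made-up design. The `toy` data frame below is hypothetical; it only assumes that every stimulus is presented equally often in approach and avoid trials.

```r
# Hypothetical design: 4 stimuli, each shown twice per movement type
toy <- expand.grid(
  stim = 1:4,
  appr = c("yes", "no"),
  repetition = 1:2,
  stringsAsFactors = FALSE
)

# Cross-tabulate movement type by stimulus
counts <- table(toy$appr, toy$stim)
counts

# In a fully counterbalanced design every cell holds the same count
all(counts == counts[1])
```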

The scoring function calculates the score of a single participant as follows:

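- selects only correct responses
- calculates the median RT of the remaining responses for each combination of movement (avoid or approach) and stimulus category (test or control)
- subtracts the approach bias for control stimuli from the approach bias for test stimuli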

```
fn_score <- function (ds) {
  median_avoid_test <- median(
    ds[ds$appr == "no" & ds$cat == "test" & ds$response == 1, ]$rt
  )
  median_approach_test <- median(
    ds[ds$appr == "yes" & ds$cat == "test" & ds$response == 1, ]$rt
  )
  median_avoid_control <- median(
    ds[ds$appr == "no" & ds$cat == "control" & ds$response == 1, ]$rt
  )
  median_approach_control <- median(
    ds[ds$appr == "yes" & ds$cat == "control" & ds$response == 1, ]$rt
  )
  return (
    (median_avoid_test - median_approach_test) -
    (median_avoid_control - median_approach_control)
  )
}
```
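
The same calculation can be cross-checked on a small made-up dataset. The sketch below is an alternative formulation using `tapply`, not the vignette's scoring function; the `toy` data frame and its values are hypothetical, with column names chosen to match `ds_aat`.

```r
# Hypothetical trials for one participant; row 3 is an incorrect response
toy <- data.frame(
  appr     = c("no", "no", "no", "yes", "yes", "no", "no", "yes", "yes"),
  cat      = c("test", "test", "test", "test", "test",
               "control", "control", "control", "control"),
  response = c(1, 1, 0, 1, 1, 1, 1, 1, 1),
  rt       = c(700, 720, 2000, 640, 660, 680, 690, 670, 680)
)

# Keep correct responses only, then take the median RT per cell
correct <- toy[toy$response == 1, ]
medians <- tapply(correct$rt, list(correct$appr, correct$cat), median)

# Double difference: test approach bias minus control approach bias
score <- (medians["no", "test"] - medians["yes", "test"]) -
  (medians["no", "control"] - medians["yes", "control"])
score
```

Passing the same `toy` data frame to `fn_score` should give an identical value, since the slow incorrect trial is excluded in both cases.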

Let’s calculate the AAT score for the participant with UserID 14. Note that this score has also been calculated manually in Excel; the spreadsheet is available in the splithalfr repository.

`fn_score(subset(ds_aat, UserID == 14))`

To calculate the AAT score for each participant, we will use R’s native `by` function and convert the result to a data frame.

```
scores <- by(
  ds_aat,
  ds_aat$UserID,
  fn_score
)
data.frame(
  UserID = names(scores),
  score = as.vector(scores)
)
```
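
The `by` pattern can be tried on a tiny made-up dataset first. In the sketch below, the `toy` data frame is hypothetical and a plain median RT stands in for the scoring function.

```r
# Hypothetical data: two participants with three trials each
toy <- data.frame(
  UserID = c(1, 1, 1, 2, 2, 2),
  rt     = c(500, 600, 700, 550, 650, 900)
)

# Apply a stand-in scoring function to each participant's rows
toy_scores <- by(toy, toy$UserID, function(ds) median(ds$rt))

# names() holds the grouping values; as.vector() drops the "by" structure
result <- data.frame(
  UserID = names(toy_scores),
  score  = as.vector(toy_scores)
)
result
```

Note that `names(toy_scores)` returns the participant IDs as character strings, so `UserID` in the resulting data frame is a character column.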

To calculate split-half scores for each participant, use the function `by_split`. The first three arguments of this function are the same as for `by`. An additional set of arguments allows you to specify how to split the data and how many times. In this vignette we will calculate scores of 1000 permutated splits. The trial properties `appr` and `stim` were counterbalanced in the AAT design, so we will stratify splits by these trial properties. See the vignette on splitting methods for more ways to split the data.
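
The idea behind a stratified split can be pictured with base R. This is a minimal sketch of the concept, not the implementation used by `by_split`; the `toy` data frame, the `half` variable, and the seed are made up for illustration.

```r
set.seed(1)

# Hypothetical trials: 2 movement types x 2 stimuli, 2 trials per combination
toy <- data.frame(
  appr = rep(c("yes", "no"), each = 4),
  stim = rep(1:2, times = 4)
)

# One stratum per combination of movement type and stimulus
strata <- paste(toy$appr, toy$stim)

# Within each stratum, randomly assign trials to half 1 or half 2 in equal numbers
half <- ave(
  seq_len(nrow(toy)),
  strata,
  FUN = function(i) sample(rep(1:2, length.out = length(i)))
)

# Every stratum contributes equally to both halves
table(strata, half)
```

Stratifying this way keeps both halves balanced on the counterbalanced trial properties, so a split cannot end up with, say, all approach trials for one stimulus in the same half.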

The `by_split` function returns a data frame with the following columns:

- `participant`, which identifies participants
- `replication`, which counts replications
- `score_1` and `score_2`, which are the scores calculated for each of the split datasets

*Calculating the split scores may take a while. By default, by_split uses all available CPU cores, but no progress bar is displayed. Setting ncores = 1 will display a progress bar, but processing will be slower.*

```
split_scores <- by_split(
  ds_aat,
  ds_aat$UserID,
  fn_score,
  replications = 1000,
  stratification = paste(ds_aat$appr, ds_aat$stim)
)
```

Next, the output of `by_split` can be analyzed in order to estimate reliability. The package provides functions that calculate Spearman-Brown adjusted Pearson correlations (`spearman_brown`), Flanagan-Rulon (`flanagan_rulon`), Angoff-Feldt (`angoff_feldt`), and Intraclass Correlation (`short_icc`) coefficients. Each of these coefficient functions can be used with `split_coefs` to calculate the corresponding coefficients per split, which can then be plotted or averaged via a simple `mean`. A bias-corrected and accelerated bootstrap confidence interval can be calculated via `split_ci`. Note that estimating the confidence interval is computationally intensive, so it can take a long time to complete.

```
# Spearman-Brown adjusted Pearson correlations per replication
coefs <- split_coefs(split_scores, spearman_brown)
# Distribution of coefficients
hist(coefs)
# Mean of coefficients
mean(coefs)
# Confidence interval of coefficients
split_ci(split_scores, spearman_brown)
```
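
To make the per-split calculation concrete, the sketch below computes a Spearman-Brown adjusted correlation by hand for each replication of a small made-up data frame. It uses the textbook formula 2r / (1 + r) and is not necessarily how `spearman_brown` is implemented in the package; `toy_split`, `sb`, and all values are hypothetical.

```r
# Hypothetical split scores: 5 participants, 2 replications
toy_split <- data.frame(
  participant = rep(1:5, times = 2),
  replication = rep(1:2, each = 5),
  score_1     = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
  score_2     = c(2, 1, 4, 3, 5, 1, 3, 2, 5, 4)
)

# Textbook Spearman-Brown adjustment of a Pearson correlation
sb <- function(x, y) {
  r <- cor(x, y)
  2 * r / (1 + r)
}

# One coefficient per replication, then the mean across replications
toy_coefs <- sapply(
  split(toy_split, toy_split$replication),
  function(d) sb(d$score_1, d$score_2)
)
toy_coefs
mean(toy_coefs)
```

In both replications the Pearson correlation between the halves is 0.8, which the adjustment raises to 8/9 (about 0.89).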