###------------------------------------------------------------------###
### BRM Session 7 - Hypothesis testing: Analysis of Variance (ANOVA) ###
###------------------------------------------------------------------###
# Dennis Abel and Lukas Birkenmaier
# Intro and setup ---------------------------------------------------------
# Based on exercises in chapter 6 of Sarstedt & Mooi (2019)
# Load required packages
library(tidyverse) # Many functions including ggplot
library(car) # For Levene's test
library(lsr) # several useful functions, especially post hoc and effect estimations
# Working directory
setwd("[insert directory here]")
# Use read_rds-function to load rds data
oddjob <- read_rds("oddjob.rds")
# Research design ---------------------------------------------------------
# Our aim is to identify the factors that influence customers' overall price/performance
# satisfaction with the airline and explore the relevant target groups for future
# advertising campaigns. The following question should be answered:
# 1. Does the overall price/performance satisfaction differ according to the
# traveler's status?
# In order to address this question, we need the following variables:
# overall price/performance satisfaction: overall_sat
hist(oddjob$overall_sat)
# Traveler's status: status
counts_status <- table(oddjob$status)
barplot(counts_status)
# Formulate hypothesis
# 1. H0: Overall satisfaction means are the same between status groups
# 1. H1: Overall satisfaction means differ at least for two status groups
# Choose significance level
# Use significance level (alpha) of 0.05, which means that we allow a maximum
# chance of 5% of mistakenly rejecting a true null hypothesis
# Select appropriate test
# We start by defining the testing situation, which is to compare the mean overall
# price/performance satisfaction scores across three status groups.
# [The textbook classifies these scores as measured on a ratio scale, I assume this is a typo
# and it means interval].
# We know that a comparison of three or more groups involve a one-way ANOVA or Kruskal-Wallis test.
# What we know about this sample is that it is a random subset and we also know
# that these are independent observations
# Check assumptions -------------------------------------------------------
# Next, we need to check if our dependent variable is normally distributed
# Shapiro-Wilk test for whole sample
shapiro.test(oddjob$overall_sat)
# Shapiro-Wilk test for each status group
by(oddjob$overall_sat, oddjob$status, shapiro.test)
# The results show that the p-values of the Shapiro-Wilk test are smaller than .05, indicating that the
# normality assumption is violated for all three status groups.
# Visual inspection with quantile plots (qqplots)
ggplot(oddjob, aes(sample = overall_sat, colour = status)) +
stat_qq() +
stat_qq_line()+
facet_wrap(~status)
# The dots appear to follow the line reasonably well but deviations are
# detectable. Visual inspection unclear but Shapiro-Wilk test is definitive.
# As we find no support for normality, we may have to use the a one-way ANOVA F-test when
# the variances are equal or a Kruskal-Wallis rank test when the variances are not equal.
# So the decision on which to choose depends on whether the variances are equal.
# We inspect this with Levene's test
leveneTest(oddjob$overall_sat, oddjob$status)
leveneTest(oddjob$overall_sat, oddjob$status, center=mean) ## By default, the test utilizes
# the median. We can switch to mean with the option "center=mean"
# The Levene's test statistic clearly suggests that the population variances are equal,
# as the test's p-value (0.404) is well above 0.05. Thus, we can utilize the one-way
# Anova F-test.
# Calculate test statistic and effect size --------------------------------
# We will use the aov-function to calculate the test
anova <- aov(formula = overall_sat ~ status,
data = oddjob)
summary(anova)
# R doesn't use the terms "between-group" and "within-group". The between groups variance
# corresponds to the effect that the status has on the outcome variable and the within
# groups variance corresponds to the ``leftover'' variability, so it calls that the
# residuals.
# Between group sum of squares SSb = 51.76
# Within group sum of squares SSw = 2758.46
# The model has an F-value of 9.963. We also learn that the p-value is smaller than the
# significance level (0.001). Therefore,
# we can reject the null hypothesis that there is no difference
# in satisfaction between status groups and conclude that the overall
# price/performance satisfaction differs significantly across at least two of the
# three status groups.
# Post-hoc test to identify which groups differ
# The textbook suggests various post hoc tests depending on whether population variances
# and sample sizes per group differ. The available post-hoc tests in SPPS exceed the
# available approaches in R. For our purposes, we can use the widely used Holm
# correction or the conservative Tukey's honestly significant difference test (Tukey's HSD)
# We use the posthocPairwiseT-function from the lsr-package
posthocPairwiseT(anova, p.adjust.method="holm")
# Writing a conclusion on pairwise comparisons:
# Post hoc tests (using the Holm correction to adjust p) indicated that Blue status group
# members have a significantly larger satisfaction with the price/performance ratio of
# Oddjob Airways compared to Silver group members (p=.0008) and Gold group members (p=.0021).
# We found no evidence that Silver and Gold members differ (p=.78).
# Alternative TukeyHSD
TukeyHSD(anova)
plot(TukeyHSD(anova, conf.level=.95), las = 2)
# Furthermore, we can calculate the effect size.
# For this, we use the etaSquared-function from the lsr-package
etaSquared(x=anova)
# The first output corresponds to the eta-squared statistic. The second one is a partial
# eta-squared and only relevant for more complicated ANOVAs (not covered in this course).
# Remember, the eta-squared is the between sum of squares divided by the total sum of squares.
# The interpretation is straightforward: it refers to the proportion of the variability
# in the outcome variable than can be explained in terms of the predictor.
# Eta-squared is closely related to the squared correlation and the square-root of eta-squared
# can be interpreted as if it referred to the magnitude of a Pearson correlation.
sqrt(0.0184)
## 0.13 equals a rather small relationship.
# Interpret the results:
# 1. Gold and Blue members, as well as Silver and Blue members, differ significantly in
# their mean overall price/performance satisfaction.
# 2. There is no difference between Gold and Silver member in their mean overall price/
# performance satisfaction.
# 3. Membership status explains only a minimal share of the customer's price/performance
# satisfaction. Hence, other factors - presently not included in the model - explain
# the remaining variation in the outcome variable.