###--------------------------------------------------------------------###
### BRM Session 6 - Hypothesis testing: two independent samples t-test ###
###--------------------------------------------------------------------###
# Dennis Abel and Lukas Birkenmaier
# Intro and setup ---------------------------------------------------------
# Based on exercises in chapter 6 of Sarstedt & Mooi (2019)
# Load required packages
library(tidyverse) # Many functions including ggplot
library(car) # For Levene's test
library(lsr) # several useful functions, we use it for calculating Cohen's D
# Working directory
setwd("[insert directory here]")
# Use read_rds-function to load rds data
oddjob <- read_rds("oddjob.rds")
# Research design ---------------------------------------------------------
# Our aim is to identify the factors that influence customers' overall price/performance
# satisfaction with the airline and explore the relevant target groups for future
# advertising campaigns. The following questions should be answered:
# 1. Does the overall price/performance satisfaction differ by gender?
# In order to address this question, we need the following variables:
# overall price/performance satisfaction: overall_sat
ggplot(oddjob, aes(x = overall_sat)) +
  geom_histogram()
# Gender: gender
counts_gender <- table(oddjob$gender)
barplot(counts_gender)
# Formulate hypothesis
# 1. H0: Overall satisfaction means of male and female travelers are the same
# 1. H1: Overall satisfaction means of male and female travelers differ
# Choose significance level
# Use significance level (alpha) of 0.05, which means that we allow a maximum
# chance of 5% of mistakenly rejecting a true null hypothesis
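# Illustration (a sketch on simulated data, not part of the textbook exercise):
# what alpha = 0.05 means in practice. We draw two groups from the SAME
# population many times, so H0 is true by construction, and count how often a
# t-test rejects it. Sample size, number of repetitions, and seed are arbitrary
# assumptions chosen for this demo; all demo_* names are hypothetical.
set.seed(123)
demo_alpha <- 0.05
demo_pvals <- replicate(2000, {
  demo_a <- rnorm(50, mean = 5, sd = 1)
  demo_b <- rnorm(50, mean = 5, sd = 1)
  t.test(demo_a, demo_b, var.equal = TRUE)$p.value
})
# Share of false rejections of a true H0; this should land close to alpha
demo_type1_rate <- mean(demo_pvals < demo_alpha)
demo_type1_rate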
# Select appropriate test
# We start by defining the testing situation, which is to compare the mean overall
# price/performance satisfaction scores of female and male customers.
# [The textbook classifies these scores as measured on a ratio scale;
# I assume this is a typo and that an interval scale is meant.]
# What we know about this sample is that it is a random subset and we also know
# that these are independent observations
# Check assumptions -------------------------------------------------------
# Next, we need to check if our dependent variable is normally distributed
# Shapiro-Wilk test for whole sample
shapiro.test(oddjob$overall_sat)
# Shapiro-Wilk test for each gender group
by(oddjob$overall_sat, oddjob$gender, shapiro.test)
# The results show that the p-values of the Shapiro-Wilk test are smaller than .05, indicating that the
# normality assumption is violated for both gender groups.
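# Illustration (a sketch on simulated data): extracting only the p-values from
# group-wise Shapiro-Wilk tests, which is easier to scan than the full by()
# output. demo_sat and demo_group are hypothetical stand-ins for
# oddjob$overall_sat and oddjob$gender.
set.seed(1)
demo_sat   <- c(rnorm(40, mean = 4.5, sd = 1.5), rexp(40, rate = 0.25))
demo_group <- rep(c("female", "male"), each = 40)
demo_shapiro_p <- tapply(demo_sat, demo_group,
                         function(v) shapiro.test(v)$p.value)
round(demo_shapiro_p, 3)
# With the real data, the same pattern would presumably be:
# tapply(oddjob$overall_sat, oddjob$gender, function(v) shapiro.test(v)$p.value)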
# Visual inspection with quantile plots (qqplots)
ggplot(oddjob, aes(sample = overall_sat, colour = gender)) +
  stat_qq() +
  stat_qq_line() +
  facet_wrap(~gender)
# The dots follow the line reasonably well, but deviations are detectable.
# The visual inspection alone is inconclusive, whereas the Shapiro-Wilk test
# rejects normality for both groups.
# As we find no support for normality, the choice is between the independent
# samples t-test and the Mann-Whitney U test. The decision on which to choose
# depends on whether the variances are equal.
# We inspect this with Levene's test
leveneTest(oddjob$overall_sat, oddjob$gender)
leveneTest(oddjob$overall_sat, oddjob$gender, center=mean) ## By default, the test utilizes
# the median. We can switch to mean with the option "center=mean"
# The low F-value of 0.418 and the correspondingly large p-value of 0.518, which
# lies far above 0.05, mean that we cannot reject the null hypothesis that the
# population variances are equal. Failing to reject is not proof of equality, but
# since we found no evidence of unequal variances, we treat the independent
# samples t-test as appropriate.
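# Illustration (a sketch on simulated data): what Levene's test computes.
# It is a one-way ANOVA on the absolute deviations of each observation from its
# group centre (the median by default in car::leveneTest, the mean with
# center = mean). demo_y and demo_g are hypothetical stand-ins for the real data.
set.seed(42)
demo_y <- c(rnorm(60, mean = 4, sd = 1.5), rnorm(60, mean = 4.3, sd = 1.5))
demo_g <- factor(rep(c("female", "male"), each = 60))
demo_centre  <- tapply(demo_y, demo_g, median)                 # group medians
demo_abs_dev <- abs(demo_y - as.numeric(demo_centre[demo_g]))  # |y - group median|
demo_levene  <- anova(lm(demo_abs_dev ~ demo_g))
demo_levene[1, c("F value", "Pr(>F)")]
# This should match the output of leveneTest(demo_y, demo_g) from the car package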
# Calculate test statistic and effect size --------------------------------
# Remember: t = (xbar1 - xbar2) / SE(of the difference between means)
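# Illustration (a sketch on made-up numbers): the pooled-variance t statistic
# computed step by step. demo_x1 and demo_x2 are hypothetical scores, not taken
# from the oddjob data.
demo_x1 <- c(4, 5, 6, 5, 7, 3, 6, 5)
demo_x2 <- c(5, 6, 7, 6, 7, 5, 8, 6)
demo_n1 <- length(demo_x1)
demo_n2 <- length(demo_x2)
# Pooled variance: weighted average of the two sample variances
demo_sp2 <- ((demo_n1 - 1) * var(demo_x1) + (demo_n2 - 1) * var(demo_x2)) /
  (demo_n1 + demo_n2 - 2)
# Standard error of the difference between means
demo_se <- sqrt(demo_sp2 * (1 / demo_n1 + 1 / demo_n2))
demo_t  <- (mean(demo_x1) - mean(demo_x2)) / demo_se
demo_t
# Check against the t value reported by t.test() with var.equal = TRUE
all.equal(demo_t, unname(t.test(demo_x1, demo_x2, var.equal = TRUE)$statistic))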
t.test(overall_sat ~ gender, var.equal=TRUE, data=oddjob)
# The output is reported in the console: the t-value, degrees of freedom (df), statistical
# significance (p-value), 95% confidence interval (CI) of the mean difference, and
# the mean score for each of the two groups.
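# The reported 95% confidence interval of the mean difference ranges from 0.04
# to 0.48. Strictly speaking, this means that if we repeated the sampling many
# times, about 95% of the intervals constructed this way would contain the true
# difference between means. Because the interval excludes 0, it is consistent
# with a significant difference at the 5% level.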
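# Illustration (a sketch on made-up numbers): where the 95% confidence interval
# comes from. CI = difference in means -/+ critical t value * SE of the
# difference. demo_g1 and demo_g2 are hypothetical scores.
demo_g1 <- c(5.1, 4.8, 6.0, 5.5, 4.9, 5.7)
demo_g2 <- c(4.2, 4.9, 4.4, 5.0, 4.1, 4.6)
demo_fit  <- t.test(demo_g1, demo_g2, var.equal = TRUE)
demo_diff <- unname(demo_fit$estimate[1] - demo_fit$estimate[2])  # mean difference
demo_df   <- unname(demo_fit$parameter)                           # n1 + n2 - 2
# Recover the standard error from t = difference / SE
demo_se_diff <- demo_diff / unname(demo_fit$statistic)
demo_ci <- demo_diff + c(-1, 1) * qt(0.975, demo_df) * demo_se_diff
demo_ci
# Should agree with the interval stored in the t.test() result
all.equal(demo_ci, as.numeric(demo_fit$conf.int))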
# We learn that the p-value (0.020) is smaller than the significance level (0.05). Therefore,
# we can reject the independent samples t-test's null hypothesis that there is no difference
# in satisfaction between female and male customers and conclude that the overall
# price/performance satisfaction differs significantly between female and male travelers.
# Furthermore, we can calculate the effect size: Cohen's D
cohensD(overall_sat ~ gender, data=oddjob)
# How to interpret Cohen's D: a d of 1 indicates that the group means differ by
# 1 standard deviation, a d of 2 that they differ by 2 standard deviations, and
# so on. So an effect size of 0.5 means that the mean of group 1 lies 0.5
# standard deviations above the mean of group 2.
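# Illustration (a sketch on made-up numbers): Cohen's d for two independent
# groups is the mean difference divided by the pooled standard deviation.
# demo_grp1 and demo_grp2 are hypothetical scores.
demo_grp1 <- c(2, 4, 6)   # mean 4, variance 4
demo_grp2 <- c(3, 5, 7)   # mean 5, variance 4
demo_k1 <- length(demo_grp1)
demo_k2 <- length(demo_grp2)
demo_sd_pooled <- sqrt(((demo_k1 - 1) * var(demo_grp1) +
                          (demo_k2 - 1) * var(demo_grp2)) /
                         (demo_k1 + demo_k2 - 2))
demo_d <- abs(mean(demo_grp1) - mean(demo_grp2)) / demo_sd_pooled
demo_d   # 0.5: the means are half a pooled standard deviation apart
# cohensD(demo_grp1, demo_grp2) from the lsr package should return the same
# value, assuming its default pooled method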
# Rule of thumb conventions:
# < 0.2: negligible
# 0.2 to < 0.5: small effect
# 0.5 to < 0.8: medium effect
# >= 0.8: large effect
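# Illustration (a sketch): turning the rule-of-thumb conventions into labels
# with cut(). The d values below are hypothetical examples, and treating the
# benchmarks as lower bounds of ranges is an assumption of this sketch.
demo_d_values <- c(0.16, 0.35, 0.62, 1.10)
demo_labels <- cut(abs(demo_d_values),
                   breaks = c(0, 0.2, 0.5, 0.8, Inf),
                   labels = c("negligible", "small", "medium", "large"),
                   right = FALSE)   # intervals are [lower, upper)
data.frame(d = demo_d_values, label = demo_labels)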
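# An effect size of 0.16 for gender on overall satisfaction lies below the 0.2
# benchmark for a small effect. The difference is statistically significant, but
# negligible to small in practical terms.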