bugs_long provides information on the extent to which men and women want to kill arthropods that vary in disgustingness (low, high) and freighteningness (low, high) (four groups in total). Each participant rated their attitude towards all four kinds of anthropods. bugs_long is a subset of the data reported by Ryan et al.(2013) .
bugs_long is included in the ggstatsplot package.
Note that this is a repeated measures design because the same participant gave four different ratings across four different conditions (LDLF, LDHF, HDLF, HDHF).
desire - The desire to kill an arthropod was indicated on a scale from 0 to 10
gender Male/Female
region
condition
LDLF: low disgustingness and low freighteningness
LDHF: low disgustingness and high freighteningness
HDLF: high disgustingness and low freighteningness
HDHF: high disgustingness and high freighteningness
Picture from Ryan et al. (2013) https://doi.org/10.1016/j.chb.2013.01.024
In bugs_long, is there a difference within the participants in their desire to kill bugs from the four different conditions?
Should you use ggwithinstats() or ggbetweenstats() when comparing
Is it reasonable to assume normality?
Show the code
bugs_long |>group_by(condition) |>shapiro_test(desire)# qqplotbugs_long |>ggplot(aes(sample = desire, group = condition)) +geom_qq_line(color ="red", linewidth =1)+geom_qq()+facet_wrap(~condition)# Density plotbugs_long |>ggplot(aes(x = desire, fill = condition)) +geom_density(alpha =0.2)
bugs_long |># Friedman test only works with complete data# Some subjects have missing data in desire# We need to remove thesedrop_na(desire, condition, subject) |>add_count(subject) |>filter(n ==4) |># Send the complete datafriedman_test(desire ~ condition | subject)
# This is not straightforward to do in gtsummary # because you are looking at a repeated measures test (each subject has 4 rows)## However, it can be done. See this link for how the author of the gtsummary package solves it## https://stackoverflow.com/questions/75003882/how-to-do-repeated-measures-anova-and-friedman-test-in-gtsummary
What is your interpretation?
What is the consequence if you change the type of test?
Is there a difference between men and women in the desire to kill bugs that are LDHF (low disgustingness and high freighteningness).
Create a filtered data frame of bugs_long ::: {.cell}
What is the consequence if you change the type of test?
Confused by all the different names in this example? Its not you. This test has many synonyms. Quote from wikipedia: “Mann–Whitney U test (also called the Mann–Whitney–Wilcoxon (MWW/MWU), Wilcoxon rank-sum test, or Wilcoxon–Mann–Whitney test)…”
Is there a difference in the frequency of men and women between North America and the remaining regions?.
First, lump region togehter to two levels (fct_lump())
Reduce the data to one row pr subject ID, and discuss why this is a good idea. ::: {.cell}
# rstatix is somewhat complicated here# First we need to make table# table is not pipe friendlytable(bugs_long_region_lump$region , bugs_long_region_lump$gender) |># Then pass the table to the testchisq_test()
# Notice that gtsummary is switching to Fishers exact test.# Fishers exact test is actually the right tests to use if one of the groups has 5 or fewer observationsbugs_long_region_lump |>select(gender, region) |>tbl_summary(by = gender) |>add_p()# If we absolutely want to run the chisquared test we do it like thisbugs_long_region_lump |>select(gender, region) |>tbl_summary(by = gender) |>add_p(test =list(all_categorical() ~"chisq.test"))# If you want to see a contigency table# With row and column sum countsbugs_long_region_lump |>drop_na() |># Remove observations with missing valuestbl_cross(gender, region) |>add_p()
What is your interpretation?
2.2 The ToothGrowth dataset
ToothGrowth gives information on tooth length in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods, orange juice or ascorbic acid (a form of vitamin C and coded as VC).
len Tooth length
supp Supplement type
VC Vitamin C as ascorbic acid
OJ Orange Juice
dose Dose in milligrams/day (0.5, 1, or 2)
Is there a difference in Tooth length based on the type of supplement?
Should you use ggwithinstats() or ggbetweenstats() when comparing
Is it reasonable to assume normality?
Show the code
ToothGrowth |>group_by(supp) |>shapiro_test(len)# qqplotToothGrowth |>ggplot(aes(sample = len, color = supp)) +geom_qq()+geom_qq_line()# Density plotToothGrowth |>ggplot(aes(x = len, fill = supp)) +geom_density(alpha =0.2)
Make the appropriate test ::: {.cell}
ggstatsplot
ToothGrowth |>ggbetweenstats(x = supp,y = len,type ="parametric")# It is likely completely fine to run this as a "parametric" test, but you could consider type = "robust"
:::
rstatix
ToothGrowth |>t_test(len ~ supp,paired =FALSE)# We run this as a "parametric" test
statsExpressions
ToothGrowth |>two_sample_test(x = supp, y = len, paired =FALSE, type ="parametric")# It is likely completely fine to run this as a "parametric" test, but you could consider type = "robust".
---title: "Hypothesis tests"author: "Steen Flammild Harsted & Søren O´Neill"date: todayformat: html: toc: true toc-depth: 2 number-sections: true number-depth: 2 code-fold: true code-summary: "Show the code" code-tools: trueexecute: eval: false message: false warning: false---<br><br># PresentationYou can download the course slides for this section <a href="./presentation_hypothesis_tests.html" download>here</a><div>```{=html}<iframe class="slide-deck" src="presentation_hypothesis_tests.html" width=90% ></iframe>```</div># Roadmap to your test```{r}#| column: page-inset-right#| eval: true#| echo: false#| code-fold: false#| lightbox:#| group: r-graph#| description: Roadmap to your test (r4phd.sdu.dk)knitr::include_graphics(here::here("img", "hypo_tests.png"))``````{r setup}#| include: false#| echo: false#| eval: truelibrary(tidyverse)library(here)library(gtsummary)library(ggstatsplot)library(rstatix)library(statsExpressions)#library(AER)#library(easystats)#library(GGally)#library(gt)#library(gtsummary)#data("PhDPublications")#library(palmerpenguins)#source(here("scripts", "01_import.R"))``````{r}#| eval: true#| column: margin#| echo: falseknitr::include_graphics(here::here("img", "sherif.png"))```## Getting Started {.unnumbered}* Make sure that you are working in your course project* Create a new quarto document and name it "hypothesis_tests.qmd"* Insert a code chunk and load 2 important libraries* Insert a new code chunk- Write `source(here("scripts", "01_import.R"))` in the chunk* Write a short headline to each code chunk* Change the YAML header to style your document output.::: {.callout-tip collapse="true"}### The YAML header can look like this````{verbatim}---title: "TITLE"subtitle: "SUBTITLE"author: "ME"date: todayformat: html: toc: true toc-depth: 2 embed-resources: true number-sections: true number-depth: 2 code-fold: true code-summary: "Show the code" code-tools: trueexecute: message: false warning: false---````:::<br><br><br>#### Add the `gtsummary`, `ggstatsplot`, `statsExpressions` and `rstatix` packages to the code chunk where you have your library calls.If you need to install any of the packages:* use `install.packages(c("package_name", "package_name"))` to download the packages. * This is done in the console and NOT in your script.```{r}#| eval: falselibrary(tidyverse) # import, tidy, transform, visualizelibrary(here) # filepath controllibrary(gtsummary) # tables (with stats)library(ggstatsplot) # statistical plotslibrary(rstatix) # statistical testslibrary(statsExpressions) # statistical tests```<br><br>## The `bugs_long` dataset`bugs_long` provides information on the extent to which men and women want to kill arthropods that vary in disgustingness (low, high) and freighteningness (low, high) (four groups in total). Each participant rated their attitude towards all four kinds of anthropods. `bugs_long` is a subset of the data reported by [Ryan et al.(2013)](https://www.sciencedirect.com/science/article/abs/pii/S0747563213000277) . `bugs_long` is included in the `ggstatsplot` package.Note that this is a repeated measures design because the same participant gave four different ratings across four different conditions (LDLF, LDHF, HDLF, HDHF).* `desire` - The desire to kill an arthropod was indicated on a scale from 0 to 10* `gender` Male/Female* `region`* `condition` - **LDLF**: low disgustingness and low freighteningness - **LDHF**: low disgustingness and high freighteningness - **HDLF**: high disgustingness and low freighteningness - **HDHF**: high disgustingness and high freighteningness ```{r}#| echo: false#| eval: true#| fig-cap: "Picture from Ryan et al. (2013) https://doi.org/10.1016/j.chb.2013.01.024"knitr::include_graphics(here("img" , "bugs_conditions.jpg"))```<br><br>### In `bugs_long`, is there a difference within the participants in their `desire` to kill bugs from the four different `conditions`? * Should you use `ggwithinstats()` or `ggbetweenstats()` when comparing * Is it reasonable to assume normality?```{r}bugs_long |>group_by(condition) |>shapiro_test(desire)# qqplotbugs_long |>ggplot(aes(sample = desire, group = condition)) +geom_qq_line(color ="red", linewidth =1)+geom_qq()+facet_wrap(~condition)# Density plotbugs_long |>ggplot(aes(x = desire, fill = condition)) +geom_density(alpha =0.2)```* Make the appropriate test```{r}#| code-summary: "`ggstatsplot`"bugs_long |>ggwithinstats(x = condition,y = desire,type ="nonparametric")``````{r}#| code-summary: "`rstatix`"bugs_long |># Friedman test only works with complete data# Some subjects have missing data in desire# We need to remove thesedrop_na(desire, condition, subject) |>add_count(subject) |>filter(n ==4) |># Send the complete datafriedman_test(desire ~ condition | subject)``````{r}#| code-summary: "`statsExpressions`"bugs_long |>oneway_anova(x = condition,y = desire,type ="nonparametric",paired =TRUE )``````{r}#| code-summary: "`gtsummary`"# This is not straightforward to do in gtsummary # because you are looking at a repeated measures test (each subject has 4 rows)## However, it can be done. See this link for how the author of the gtsummary package solves it## https://stackoverflow.com/questions/75003882/how-to-do-repeated-measures-anova-and-friedman-test-in-gtsummary```* What is your interpretation?* What is the consequence if you change the type of test?<br><br>### Is there a difference between men and women in the `desire` to kill bugs that are **LDHF** (low disgustingness and high freighteningness).* Create a filtered data frame of bugs_long```{r}bugs_long_LDHF_only <- bugs_long |>filter(condition =="LDHF")```* Should you use `ggwithinstats()` or `ggbetweenstats()` for this test?* Is it reasonable to assume normality?```{r}bugs_long_LDHF_only |>filter(!is.na(gender), !is.na(desire)) |>group_by(gender) |>shapiro_test(desire)# qqplotbugs_long_LDHF_only |>ggplot(aes(sample = desire, color = gender)) +geom_qq_line(color ="red", linewidth =1) +geom_qq()+facet_wrap(~gender)# Density plotbugs_long_LDHF_only |>ggplot(aes(x = desire, fill = gender)) +geom_density(alpha =0.2)```* Make the appropriate test```{r}#| code-summary: "ggstatsplot"bugs_long_LDHF_only |>ggbetweenstats(x = gender,y = desire,type ="nonparametric")``````{r}#| code-summary: "rstatix"bugs_long_LDHF_only |>wilcox_test(desire ~ gender)``````{r}#| code-summary: "statsExpressions"bugs_long_LDHF_only |>two_sample_test(x = gender, y = desire, type ="nonparametric", paired =FALSE )``````{r}#| code-summary: "gtsummary"bugs_long_LDHF_only |>select(gender, desire) |>tbl_summary(by ="gender") |>add_p()```* What is your interpretation?* What is the consequence if you change the type of test?* Confused by all the different names in this example? Its not you. This test has many synonyms. Quote from [wikipedia](https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test#:~:text=The%20Mann%E2%80%93Whitney%20U%20test%20%2F%20Wilcoxon%20rank%2Dsum%20test,to%20matched%20or%20dependent%20samples.): *"Mann–Whitney U test (also called the Mann–Whitney–Wilcoxon (MWW/MWU), Wilcoxon rank-sum test, or Wilcoxon–Mann–Whitney test)..."*<br><br>### Is there a difference in the frequency of men and women between `North America` and the remaining regions?.* First, lump `region` togehter to two levels (`fct_lump()`)* Reduce the data to one row pr subject ID, and discuss why this is a good idea.```{r}bugs_long_region_lump <- bugs_long |>mutate(region =fct_lump(region, 1)) |>group_by(subject) |>slice(1) |>ungroup()```* Should you use `ggwithinstats()` or `ggbetweenstats()` or perhaps `ggbarstats()` for this test?* Should you asses normality?```{r}# Both variables are factors (categorical/nominal). Normality has to do with continuous data```* Make the appropriate test```{r}#| code-summary: "ggstatsplot"bugs_long_region_lump |>ggbarstats(x = gender,y = region)``````{r}#| code-summary: "rstatix"# rstatix is somewhat complicated here# First we need to make table# table is not pipe friendlytable(bugs_long_region_lump$region , bugs_long_region_lump$gender) |># Then pass the table to the testchisq_test()``````{r}#| code-summary: "statsExpressions"bugs_long_region_lump |>contingency_table(gender, region)``````{r}#| code-summary: "gtsummary"# Notice that gtsummary is switching to Fishers exact test.# Fishers exact test is actually the right tests to use if one of the groups has 5 or fewer observationsbugs_long_region_lump |>select(gender, region) |>tbl_summary(by = gender) |>add_p()# If we absolutely want to run the chisquared test we do it like thisbugs_long_region_lump |>select(gender, region) |>tbl_summary(by = gender) |>add_p(test =list(all_categorical() ~"chisq.test"))# If you want to see a contigency table# With row and column sum countsbugs_long_region_lump |>drop_na() |># Remove observations with missing valuestbl_cross(gender, region) |>add_p()```* What is your interpretation?<br><br>## The `ToothGrowth` dataset`ToothGrowth` gives information on tooth length in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods, orange juice or ascorbic acid (a form of vitamin C and coded as VC).* `len` Tooth length* `supp` Supplement type - **VC** Vitamin C as ascorbic acid - **OJ** Orange Juice* `dose` Dose in milligrams/day (0.5, 1, or 2)<br><br>### Is there a difference in Tooth length based on the type of supplement?* Should you use `ggwithinstats()` or `ggbetweenstats()` when comparing * Is it reasonable to assume normality?```{r}ToothGrowth |>group_by(supp) |>shapiro_test(len)# qqplotToothGrowth |>ggplot(aes(sample = len, color = supp)) +geom_qq()+geom_qq_line()# Density plotToothGrowth |>ggplot(aes(x = len, fill = supp)) +geom_density(alpha =0.2)```* Make the appropriate test```{r}#| code-summary: "ggstatsplot"ToothGrowth |>ggbetweenstats(x = supp,y = len,type ="parametric")# It is likely completely fine to run this as a "parametric" test, but you could consider type = "robust"``````{r}#| code-summary: "rstatix"ToothGrowth |>t_test(len ~ supp,paired =FALSE)# We run this as a "parametric" test``````{r}#| code-summary: "statsExpressions"ToothGrowth |>two_sample_test(x = supp, y = len, paired =FALSE, type ="parametric")# It is likely completely fine to run this as a "parametric" test, but you could consider type = "robust".``````{r}#| code-summary: "ggsummary"ToothGrowth |>select(supp, len) |>tbl_summary(by = supp, statistic =list(all_continuous() ~"{mean} ({sd})")) |>add_p(test =list(all_continuous() ~"t.test"))```* What is your interpretation?<br><br>### Improve your home assignment * Return to your home assignment and add some statistical tests if and where it makes sense* Remember to change project (top right corner in Rstudio)* Using the menu in the top right corner, you can switch between your course project and your home assignment