# Preface

This book should help you get familiar with analysis of variance (ANOVA) and
mixed models in `R` (R Core Team 2021). From a methodological point of view, we
build upon the knowledge of an introductory course to probability and statistics
covering the basic concepts of statistical inference (estimation, hypothesis
tests, confidence intervals) up to the two-sample \(t\)-test. See for example
Dalgaard (2008) for an introduction to both the theory and the corresponding
functions in `R`. A more theoretical reference is Rice (2007).

There are, of course, already excellent, well-established textbooks that cover
ANOVA and experimental design in great detail. Examples are
Oehlert (2000), Kuehl (2000), Montgomery (2019) and many more. We build upon
these great books. From a mathematical point of view, we use similar notation
as Oehlert (2000). The goal of this book is to provide a *compact* overview of
the most important topics including the corresponding applications in `R`
using flexible mixed model approaches. We also use examples from the classical
textbooks and redo the corresponding statistical analyses in `R`.

As this is an introductory text, the focus is on getting to know multiple
experimental design types, when they are used and what a proper analysis in
`R` looks like. This is why we will not cover all the details, especially for
the more advanced topics. The idea is that if the reader is familiar with the
basic concepts and their applications in `R`, this knowledge can be extended
(and applied) to other areas.

Besides discussing the theory and the corresponding `R` functions, we also try
to give you an intuition for when and how things can go wrong and what aspects
have to be considered in practice. This is not only useful when planning an
experiment on your own, but also when analyzing data from other sources or when
reading a research paper.

From a statistical point of view, an ANOVA model is nothing more than a special
case of a linear regression model. Note that no prior knowledge of linear
regression is needed for this book. For the basic models, we mostly use the
function `aov` in `R` to get the “classical” outputs. In fact, `aov` simply
calls `lm` (the linear regression model fitting function) and adjusts the
output accordingly. We sometimes mention extensions to more general linear
regression models. However, this book is not meant to be an introductory text to
linear regression. See for example Fox and Weisberg (2019) or Faraway (2005) for
applied introductions.
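
The relationship between `aov` and `lm` can be illustrated with a minimal sketch. We assume here the built-in data set `PlantGrowth`, which is not part of the examples discussed later and only serves as a convenient stand-in; the object names `fit.aov` and `fit.lm` are our own choice.

```r
# Illustration only: PlantGrowth is a built-in data set with a response
# (weight) and a single factor (group) with three levels.
data(PlantGrowth)

# The same one-factor model, fitted with both functions
fit.aov <- aov(weight ~ group, data = PlantGrowth)
fit.lm  <- lm(weight ~ group, data = PlantGrowth)

# An aov object also inherits from the class "lm"
class(fit.aov)

# The "classical" ANOVA table from aov ...
summary(fit.aov)

# ... and the same table obtained from the lm fit
anova(fit.lm)

# The estimated coefficients of the two fits agree
all.equal(coef(fit.aov), coef(fit.lm))
```

Both calls fit the same model; they differ mainly in how the result is presented by default.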

If not stated otherwise, we use a significance level of 5% if we make statements about statistical significance, or equivalently, a coverage level of 95% for the corresponding confidence intervals.
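
As a small sketch of this correspondence, consider the two-sample \(t\)-test. We assume here the built-in data set `sleep` purely for illustration; the object names are our own. The test rejects at the 5% level exactly when the 95% confidence interval for the difference in means does not contain zero.

```r
# Illustration only: sleep is a built-in data set with two groups.
data(sleep)

# Two-sample t-test with a 95% confidence interval
# for the difference in means
tt <- t.test(extra ~ group, data = sleep, conf.level = 0.95)

# Decision based on the p-value at the 5% level
reject.p <- tt$p.value < 0.05

# Decision based on the confidence interval: reject if zero is not covered
reject.ci <- tt$conf.int[1] > 0 || tt$conf.int[2] < 0

# Both decisions agree
reject.p == reject.ci
```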

If you find any errors, inconsistencies or if you miss something, please e-mail me or fill out the anonymous feedback form at https://goo.gl/ZBvjj9.

The most recent version of this book and a list of errors can be found at https://stat.ethz.ch/~meier/teaching/book-anova/.

## Structure of the Book

We begin with a non-technical introduction to the general principles of experimental design in Chapter 1. Chapter 2 then introduces the first models for designs with only one factor. More specific questions regarding these models are then discussed in Chapter 3, including the problem of multiple testing. Chapter 4 introduces factorial designs, which arise if a treatment is a combination of multiple factors. A short introduction to complete block designs, which are a great way to increase power or precision, can be found in Chapter 5. Chapter 6 introduces a new class of models including random and fixed effects, the so-called mixed models, which are very popular in many applied areas. Some more special designs follow: Chapter 7 introduces a new class of designs that can deal with experimental units of different sizes, the so-called split-plot designs. We conclude with Chapter 8 about block designs with small blocks that cannot accommodate all treatments, so-called incomplete block designs.

## Software Information and Conventions

This book uses a lot of `R` code. If you are completely new to `R`, you can get
more information for example at
https://cran.r-project.org/manuals.html
or https://education.rstudio.com/.

The `R` code and output have the following form:

```
text <- "Let's get started ..."
paste(text, "now!", sep = " ")
```

`## [1] "Let's get started ... now!"`

This means that output lines start with two comment signs “`##`”. For better
readability, we sometimes shorten the `R` output a bit. If we remove multiple
lines, this will be indicated with the symbol “`## ...`”, i.e., two comment
signs and three dots, in the output.

Regarding plots, we mostly use base `R` graphics. For more complex plots we
switch to `ggplot2` (Wickham 2016).

We often load data directly from the web, either in tabular format using the
function `read.table`, or already as an `R` object, using the function
`readRDS`.
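
The difference between the two formats can be sketched as follows. To keep the example self-contained, we assume a small made-up data frame and write it to temporary local files, which stand in for web addresses here; all object and file names are hypothetical.

```r
# Made-up example data (for illustration only)
d <- data.frame(treatment = c("A", "A", "B", "B"),
                y = c(4.1, 3.8, 5.2, 5.5))

# Temporary local files standing in for files on the web
txt.file <- tempfile(fileext = ".txt")
rds.file <- tempfile(fileext = ".rds")

write.table(d, file = txt.file, row.names = FALSE)
saveRDS(d, file = rds.file)

# Tabular text format: the first line contains the column names
d.txt <- read.table(txt.file, header = TRUE)

# Serialized R object: returned as it was saved
d.rds <- readRDS(rds.file)

str(d.txt)
str(d.rds)
```

For a remote file, `read.table` accepts a web address in place of the file name, whereas for `readRDS` the address is typically wrapped in a connection via `url()`.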

The packages `knitr` (Xie 2015) and `bookdown` (Xie 2021) were used
to compile this book.

## Acknowledgments

First, I’d like to thank all members of the Seminar für Statistik at ETH Zürich for such a nice and fun working and research environment and for making it possible to work on this project. I learned a lot a long time ago from a wise man nicknamed “Puma” while working in a building named “LEO”. Hans-Rudolf Roth, you are missed!

Many people contributed in various ways to this book. Special thanks go to Peter Bühlmann, Markus Kalisch, Marloes Maathuis, Christoph Buck, Claude Renaux, Camilla Gerboth, Tanja Finger, Michael Zellinger, Reto Zihlmann and Bill Perry.

I also want to thank Rob Calver from Chapman & Hall/CRC Press for the support and patience.

Finally, and most importantly, I would like to thank my family for all the support.

Lukas Meier

Zürich, Switzerland