Load libraries:

```
library(fitdistrplus)
library(vcd)
library(tidyverse)
```

R includes a whole range of distributions: Here is a list of them.

Letâ€™s, for example, look at the help for the Gaussian functions:

`?dnorm`

We see that there are four different calls:

- `dnorm`

: density

- `pnorm`

: cumulative distribution function (percentage of
values smaller than)

- `qnorm`

: quantile function (inverse of cumulative
distribution)

- `rnorm`

: generates random numbers

These four functions are available for most of the distributions. The first letter specifies if we want to look at the density, probability distribution/mass function, quantile or random numbers. The suffix specifies the distribution.

The arguments depend on the distribution we are looking at, but always include the parameters of that function.

The functions starting with `d`

give us densities /
probabilities for certain values to occur in a distribution that we
specify ourselves.

Letâ€™s use the example where we caught 10 frogs and count how many of them are light-colored. For known parameters, we can calculate the the chances of counting exactly 5 light-colored frogs:

```
n = 10 # number of frogs we catch
p = 0.3 # true fraction of light frogs
dbinom(x=5, size=n, prob=p)
```

`## [1] 0.1029193`

We can ask for the probability of catching at most (or at least) 5
light frogs. In this case, we need the cumulative probability
distribution starting with `p`

:

`pbinom(q=5, size=n, prob=p) # at most`

`## [1] 0.952651`

`pbinom(q=5, size=n,prob=p, lower.tail=FALSE) # larger than`

`## [1] 0.04734899`

Catching at least 5 light frogs is a rare event.

Discrete random variables have probability mass functions, which are only defined for integer values:

`dpois(1.5, lambda=4)`

`## Warning in dpois(1.5, lambda = 4): non-integer x = 1.500000`

`## [1] 0`

The probability here is \(0\), and comes with a warning: Your calculations potentially make no sense.

You can draw random numbers from a certain distribution. We use this a lot for simulating random processes. If we do so, itâ€™s advised to set a seed for reproducibility:

`set.seed(59)`

Now we can simulate frog counts:

```
frog_counts <-rpois(n = 200, lambda = 4)
head(frog_counts)
```

`## [1] 1 4 6 5 4 6`

Letâ€™s also simulate an experiment where each frog count was done in a different lake.

For this we use the gamma-poisson distribution:

`frog_counts_different_lakes <- rnbinom(n=200, size=2, mu=4) # the smaller size, the more spread out`

**Exercise:** Simulate 200 frog sizes that follow a
Gaussian distribution with an average of 7 cm, and a standard deviation
of 2 cm.

**Solution**:

```
frog_sizes <- rnorm(n = 200, mean = 7, sd = 2)
head(frog_sizes)
```

`## [1] 8.281580 8.384858 8.695852 7.667760 9.216826 7.191485`

If you draw random numbers from a certain distribution, the histogram will have a shape that is specific for the distribution:

```
data.frame(frog_counts) %>%
ggplot(aes(x=frog_counts))+
geom_histogram(binwidth=1)
```

**Exercise**: Draw a histogram of the frog sizes:

```
data.frame(frog_sizes) %>%
ggplot(aes(x=frog_sizes))+
geom_histogram()
```

The cumulative distribution is the integral of the distribution/the
histogram: It gives you the percentage of values that are smaller than a
certain value.

We can visualize it with `stat_ecdf`

in
`ggplot2`

.

```
data.frame(frog_sizes) %>%
ggplot(aes(x=frog_sizes))+
stat_ecdf()
```