Central limit theorem

General Idea of the Central Limit Theorem

[!summary] Central Limit Theorem For i.i.d. $X_{1}, \dots, X_{n}$ with mean $\mu$ and finite variance $\sigma^{2}$: \(\lim_{ n \to \infty } P\left( a < \frac{(X_{1}+X_{2}+\dots+X_{n})-n\cdot\mu}{\sigma \sqrt{ n }} < b\right) =\int_{a}^{b} \frac{1}{\sqrt{ 2\pi }} e^{-x^{2}/2} \, dx\)


The 68-95-99.7 rule
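
As a quick check of the rule's three numbers, the probabilities $P(-k < Z < k)$ for a standard normal $Z$ can be computed from the error function. This is only a sketch; the helper name `prob_within` is made up for illustration.

```python
import math

def prob_within(k: float) -> float:
    """P(-k < Z < k) for a standard normal Z, using P = erf(k / sqrt(2))."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"P(|Z| < {k}) = {prob_within(k):.4f}")
# P(|Z| < 1) = 0.6827
# P(|Z| < 2) = 0.9545
# P(|Z| < 3) = 0.9973
```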

! [[ sample_means_coins.png#invert ]] ! [[ sample_means_movies.png#invert ]]

[!caution] Note The CLT applies only to the sample mean $\bar{x}$; it says nothing about the individual $X_{i}$ (the population). A large $n$ does not mean the population is normal.
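
A small simulation can illustrate this point. The sketch below assumes a strongly skewed Exponential(1) population (so $\mu = \sigma = 1$); the sample size, number of repetitions, and seed are arbitrary choices, not values from these notes. The population stays skewed, yet the sample means behave approximately like $\mathcal{N}(\mu, \sigma^{2}/n)$.

```python
import random
import statistics
from math import sqrt

rng = random.Random(0)   # fixed seed so the run is reproducible

n = 50                   # size of each sample (arbitrary choice)
reps = 5000              # number of repeated samples (arbitrary choice)
mu = sigma = 1.0         # mean and standard deviation of Exponential(1)

# Draw `reps` samples of size n from the skewed population
# and record the mean of each sample.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

mean_of_means = statistics.fmean(sample_means)   # should be close to mu
sd_of_means = statistics.stdev(sample_means)     # should be close to sigma / sqrt(n)

# Share of sample means within 1.96 standard errors of mu (roughly 95% if normal).
se = sigma / sqrt(n)
coverage = sum(abs(m - mu) < 1.96 * se for m in sample_means) / reps

print(f"mean of sample means: {mean_of_means:.3f} (mu = {mu})")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {se:.3f})")
print(f"share within 1.96 SE: {coverage:.3f}")
```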

https://youtu.be/jvoxEYmQHNM

\[\begin{align} \text{Large } n \implies \bar{x} \sim \mathcal{N}\left( \mu, \frac{\sigma^{2}}{n} \right) \\ \\ Z =\left( \frac{\bar{x}-\mu}{\sigma/{\sqrt{ n }}} \right) \sim \mathcal{N}\left( 0, 1 \right) & \text{ by standardizing sample mean} \end{align}\]

\[\begin{align} P(-a<Z<a) = 95 \% \\ \\ P\left( -a<\left( \frac{\bar{x} - \mu }{\sigma/{\sqrt{ n }}} \right) < a \right) = 95\% \\ \\ P \left( -\frac{a\sigma}{\sqrt{ n }}-\bar{x} < -\mu < \frac{a\sigma}{\sqrt{ n }}-\bar{x} \right) = 95\% \end{align}\]

\[P\left[ \mu \in \underbrace{ \left( \bar{x} \pm a \frac{\sigma}{\sqrt{ n }} \right) }_{ \text{interval} } \right] = 95\%\]

\[P\left[ \mu \in \left( \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{ n }} \right) \right] = 100(1-\alpha)\%\]
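
The final interval $\bar{x} \pm z_{\alpha/2}\, \sigma/\sqrt{n}$ can be sketched in a few lines, assuming $\sigma$ is known. The function name `z_interval` and the input numbers are hypothetical, chosen only to show the calculation.

```python
import math
from statistics import NormalDist

def z_interval(x_bar: float, sigma: float, n: int, alpha: float = 0.05):
    """Return (lower, upper) for x_bar +/- z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}, about 1.96 for alpha = 0.05
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical inputs: sample mean 50, known sigma 10, sample size 100.
lower, upper = z_interval(x_bar=50.0, sigma=10.0, n=100)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")   # 95% CI: (48.04, 51.96)
```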

Disney's *Hannah Montana: The Movie* opened on Easter weekend in April 2009. Over the three-day weekend, the movie became the number-one box office attraction (The Wall Street Journal, April 13, 2009). The ticket sales revenue in dollars for a sample of 25 theatres is as follows.

Ticket sales (in $)
20,200
8,350
10,750
13,900
13,185
10,150
7,300
6,240
4,200
9,200
13,000
14,000
12,700
6,750
21,400
11,320
9,940
7,430
6,700
11,380
9,700
11,200
13,500
9,330
10,800

Problem A

What is the 95% confidence interval estimate for the mean ticket sales revenue per theatre? Interpret this result.

[!check] Solution https://www.vaia.com/en-us/textbooks/economics/economics-for-today-6-edition/chapter-8/problem-22-disneys-hannah-montana-the-movie-opened-on-easter/

Let $\bar{x}$ be the sample mean and $n = 25$ the sample size. We need $1-\alpha = 0.95 \implies \alpha=0.05$. The population is not known to be normal, and $\sigma$ is also unknown, so strictly speaking non-parametric methods must be used to construct a 95% confidence interval. Here, however, we explicitly assume that the population is normal and use the $t$-confidence interval with $n-1$ degrees of freedom. For 95% confidence, \(\bar{x}-t_{n-1,\alpha/2} \frac{s}{\sqrt{ n }} < \mu < \bar{x}+t_{n-1,\alpha/2} \frac{s}{\sqrt{ n }}\)

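  • $\bar{x} = \frac{1}{n} \sum_{i=1}^{n}X_{i} = 10905$
  • $s = \sqrt{ \frac{1}{n-1}\sum_{i=1}^{n}(X_{i} - \bar{x})^{2} } \approx 3962.11$
  • Use the spreadsheet function T.INV to calculate the $t$-score as T.INV(1 - 0.05/2, 25 - 1) $\approx 2.0639$, then multiply it by $s/\sqrt{ n }$ to get $t_{n-1,\alpha/2} \frac{s}{\sqrt{ n }} = 1635.48$
  • Hence, with 95% confidence, $\mu \in 10905 \pm 1635.48 \approx \left[ 9270, 12540 \right]$; that is, we are 95% confident that the mean ticket sales revenue per theatre lies between about 9,270 and 12,540 dollars.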
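
The steps above can be reproduced with the standard library as a sanity check. In this sketch the critical value $t_{24,\,0.025} \approx 2.0639$ is hard-coded from a $t$-table (the same number T.INV(0.975, 24) returns) rather than computed; the variable names are illustrative only.

```python
import math
import statistics

# Ticket sales revenue (in $) for the 25 sampled theatres.
sales = [
    20200, 8350, 10750, 13900, 13185,
    10150, 7300, 6240, 4200, 9200,
    13000, 14000, 12700, 6750, 21400,
    11320, 9940, 7430, 6700, 11380,
    9700, 11200, 13500, 9330, 10800,
]

n = len(sales)
x_bar = statistics.mean(sales)    # sample mean
s = statistics.stdev(sales)       # sample standard deviation (divides by n - 1)

# Assumption: t_{n-1, alpha/2} for n - 1 = 24 and alpha = 0.05, taken from a t-table.
# With SciPy installed, scipy.stats.t.ppf(0.975, 24) would give the same value.
t_crit = 2.0639

margin = t_crit * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"n = {n}, x_bar = {x_bar}, s = {s:.2f}")
print(f"margin = {margin:.2f}")
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
```

Hard-coding the critical value keeps the sketch dependency-free, at the cost of being tied to $n = 25$ and 95% confidence.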

Problem B

Using the movie ticket price of $7.16 per ticket, what is the estimate of the mean number of customers per theatre?
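
One possible approach, assuming every theatre charged exactly the stated price of 7.16 dollars for every ticket, is to divide the revenue estimates from Problem A by the price. The sketch below reuses $\bar{x} = 10905$ and the margin $1635.48$ from the solution above; the variable names are illustrative.

```python
ticket_price = 7.16        # dollars per ticket, as given in the problem
mean_revenue = 10905.0     # sample mean revenue per theatre (Problem A)
margin = 1635.48           # 95% margin of error for the mean revenue (Problem A)

# Point estimate of the mean number of customers per theatre.
mean_customers = mean_revenue / ticket_price

# Dividing by a positive constant rescales the interval endpoints as well.
ci_customers = (
    (mean_revenue - margin) / ticket_price,
    (mean_revenue + margin) / ticket_price,
)

print(f"mean customers per theatre: about {mean_customers:.0f}")
print(f"95% CI: about [{ci_customers[0]:.0f}, {ci_customers[1]:.0f}]")
```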

[[ 2025-01-16|16 January 2025 ]]

Confidence Interval

! [[ Drawing 2025-01-16 11.44.06.excalidraw|100% ]]


[[ 2025-01-21|21 January 2025 ]]

Problem

[[ CLT Problem ]]


\[\underbrace{ \hat{p} }_{ \text{sample proportion} } \sim N\left( \underbrace{p}_{ \text{mean, population proportion} }, \underbrace{ \frac{ p(1-p)}{n} }_{ \text{variance of } \hat{p} } \right), n \to \infty\]
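
The same normal approximation gives an interval for a proportion. The sketch below plugs $\hat{p}$ in place of the unknown $p$ in the variance, which is an additional approximation; the function name and the counts (120 successes out of 400) are hypothetical.

```python
import math
from statistics import NormalDist

def proportion_interval(successes: int, n: int, alpha: float = 0.05):
    """Normal-approximation interval p_hat +/- z_{alpha/2} * sqrt(p_hat (1 - p_hat) / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # estimated standard deviation of p_hat
    return p_hat - z * se, p_hat + z * se

# Hypothetical data: 120 successes in a sample of 400, so p_hat = 0.3.
p_lower, p_upper = proportion_interval(120, 400)
print(f"95% CI for p: ({p_lower:.3f}, {p_upper:.3f})")   # 95% CI for p: (0.255, 0.345)
```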

Recap

\[\begin{align} E = z_{\alpha/2} \frac{\sigma}{\sqrt{ n }} \\ \\ \implies n = \left( z_{\alpha/2}\cdot \frac{\sigma}{E} \right)^{2} \\ \\ \boxed{n = \frac{z_{\alpha/2}^{2} \sigma^{2}}{E^{2}}} \end{align}\]

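Since $\sigma$ is usually unknown before sampling, we replace it with a planning value $\sigma^{*}$. So, our equation becomes: \(n = \frac{z_{\alpha/2}^{2} {\sigma ^{*}}^{2}}{E^{2}} \; \text{for mean}\)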

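Similarly, for a proportion, $\hat{p}$ is unknown before sampling, so a planning value $p^{*}$ takes its place: \(\begin{align} E = z_{\alpha/2} \sqrt{ \frac{\hat{p}(1-\hat{p})}{n} } \\ \\ \implies n =\frac{z_{\alpha/2}^{2}\cdot p^{*}(1-p ^{*})}{E^{2}} \; \text{for proportion} \end{align}\)

- To choose the planning values ($\sigma^{*}$, $p^{*}$), use previous studies, pilot studies, etc.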
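
Both sample-size formulas are easy to evaluate numerically. The sketch below rounds up to the next whole observation; the function names and the inputs (a planning value of 3962.11 with $E = 1000$, and $p^{*} = 0.5$ with $E = 0.03$) are illustrative assumptions, not values prescribed by these notes.

```python
import math
from statistics import NormalDist

def z_crit(alpha: float) -> float:
    """z_{alpha/2}: upper alpha/2 quantile of the standard normal."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def n_for_mean(sigma_star: float, E: float, alpha: float = 0.05) -> int:
    """Smallest n with z_{alpha/2} * sigma_star / sqrt(n) <= E."""
    return math.ceil((z_crit(alpha) * sigma_star / E) ** 2)

def n_for_proportion(p_star: float, E: float, alpha: float = 0.05) -> int:
    """Smallest n with z_{alpha/2} * sqrt(p_star (1 - p_star) / n) <= E."""
    return math.ceil(z_crit(alpha) ** 2 * p_star * (1 - p_star) / E ** 2)

# Hypothetical planning values.
print(n_for_mean(sigma_star=3962.11, E=1000))   # 61
print(n_for_proportion(p_star=0.5, E=0.03))     # 1068
```

Rounding up rather than to the nearest integer guarantees the margin of error does not exceed $E$.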

Problem

Observations about $n$