What is the probability the sample mean is more than 10 words?

$\begingroup$

The distribution of the number of words in text messages between employees at a large company is skewed right with a mean of $8.6$ words and a standard deviation of $4.3$ words. If a random sample of $39$ messages is selected, what is the probability the sample mean is more than $10$ words?

  1. First, I used the z-score formula, substituting $8.6$ as the population mean, $10$ as the observed value, and $4.3$ as the standard deviation. The expression is set up as follows: $(10-8.6)/4.3$.
  2. Second, the expression evaluates to approx. $0.3255$, and the corresponding probability from a z-table is approx. $0.6276$. I subtracted this value from $1$ because the prompt asks for the probability of "more than 10 words", not less than. The final value is then approx. $0.3724$.

My answer is $0.3724$ as the probability of the sample mean being more than $10$ words. However, I believe there is an extra step using the sample size. If this is correct, how do I use the sample size to find the correct answer? Should I, instead of substituting $4.3$ as the standard deviation in the z-score formula, substitute $4.3/\sqrt{39}$, which, after going through the listed steps, gives a probability of approx. $0.0210$?

$\endgroup$

2 Answers

$\begingroup$

Let $\bar X_{39}$ be the average number of words in $n = 39$ text messages. Then $\mu = E(\bar X_{39}) = 8.6$ and $\sigma = SD(\bar X_{39}) = 4.3/\sqrt{39} \approx 0.6886.$

Moreover, even though the distribution of the number of words in an individual message is not normal, we can assume that $\bar X_{39} \stackrel{aprx}{\sim}\mathsf{Norm}(\mu=8.6, \sigma=0.6886).$
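A quick simulation can make this approximation concrete. The sketch below is not part of the original answer: it assumes, purely for illustration, that message lengths follow a gamma distribution with shape 4 and scale 2.15, which is right-skewed with mean 8.6 and standard deviation 4.3. Real word counts are discrete, so this is only a stand-in, and the seed and number of replications are arbitrary.

```r
# Hypothetical right-skewed population: Gamma(shape = 4, scale = 2.15)
# has mean 4 * 2.15 = 8.6 and SD sqrt(4) * 2.15 = 4.3 (an assumption,
# chosen only to match the stated mean and SD).
set.seed(39)                      # arbitrary seed for reproducibility
n    <- 39                        # messages per sample
reps <- 20000                     # number of simulated samples (arbitrary)

# Each row is one sample of n messages; rowMeans gives the sample means.
x    <- matrix(rgamma(reps * n, shape = 4, scale = 2.15), nrow = reps)
xbar <- rowMeans(x)

sim_mean <- mean(xbar)            # should be close to 8.6
sim_sd   <- sd(xbar)              # should be close to 4.3 / sqrt(39)
sim_tail <- mean(xbar > 10)       # simulated P(sample mean > 10)

print(c(mean = sim_mean, sd = sim_sd, tail = sim_tail))
```

A histogram of the simulated means, `hist(xbar)`, should look roughly bell-shaped even though the individual values are skewed. The simulated tail probability depends on the assumed population, so it need not match the normal approximation exactly.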

Then $P(\bar X_{39} > 10) \approx 0.0210,$ as computed in R, where pnorm is the normal CDF.

round(1 - pnorm(10, 8.6, 0.6886), 4)
[1] 0.021

You can get a good approximation to this result by standardizing and using printed tables of the standard normal distribution.
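The standardization mentioned above can be sketched as follows. The variable names are illustrative, and the standard error is computed from $4.3/\sqrt{39}$ directly rather than from a rounded value.

```r
mu    <- 8.6                 # population mean
sigma <- 4.3                 # population SD
n     <- 39                  # sample size

se <- sigma / sqrt(n)        # SD of the sample mean (standard error)
z  <- (10 - mu) / se         # standardized value of 10

# Upper-tail area of the standard normal beyond z;
# equivalent to 1 - pnorm(z).
p_upper <- pnorm(z, lower.tail = FALSE)

print(round(c(se = se, z = z, p = p_upper), 4))
```

Here `z` comes out near 2.03, which is the value to look up in a printed table; the table gives the area to the left, so the upper tail is one minus that entry.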

In the figure below, the desired probability is the area under the density curve to the right of the vertical dotted line.

[Figure: density of the approximate normal distribution of the sample mean, with a dotted vertical line at 10.]

R code for the figure:

hdr = "Density of NORM(8.6, 0.6886)"
curve(dnorm(x, 8.6, 0.6886), 6, 12, lwd=2, col="blue", main=hdr)
abline(h=0, col="green2")
abline(v=10, col="red", lwd=2, lty="dotted")
$\endgroup$

$\begingroup$

The formula for a $z$-score is the following: $$z = \dfrac{\text{Quantity} - \mu_{\text{Quantity}}}{\sigma_{\text{Quantity}}}\text{.}$$ In particular, when $\text{Quantity} = \bar{X}$, we have $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$.

So, the appropriate $z$-score should actually be (invoking the Central Limit Theorem, with the $n \geq 30$ approximation) $$z = \dfrac{10 - 8.6}{4.3/\sqrt{39}} \approx 2.03\text{.}$$ Each $z$ table differs in how it is structured, but if I look up $2.03$ using this table, I get $0.97882$, which is the left-tailed probability, thus yielding $1-0.97882\approx 0.02$.

If, instead, the question had been worded as

The distribution of the number of words in text messages between employees at a large company is normally distributed with a mean of 8.6 words and a standard deviation of 4.3 words. What is the probability of obtaining a text message with more than 10 words?

your answer would have been correct.
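The difference between the two questions comes down to which standard deviation goes into the calculation. A small sketch, with variable names that are only illustrative:

```r
mu    <- 8.6
sigma <- 4.3
n     <- 39

# Reworded question: one message from a normal population,
# so the SD is sigma itself.
p_single <- pnorm(10, mean = mu, sd = sigma, lower.tail = FALSE)

# Original question: mean of n messages, so the SD is sigma / sqrt(n).
p_mean <- pnorm(10, mean = mu, sd = sigma / sqrt(n), lower.tail = FALSE)

print(round(c(single_message = p_single, sample_mean = p_mean), 4))
```

The first value is about 0.37, in line with the $0.3724$ found in the question; the second is about 0.02.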

The central limit theorem states that, under reasonable conditions on the distribution of the population (if you are so inclined to read them), $$\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$ follows an approximately standard normal distribution for $n$ sufficiently large, which is why we can ignore the "skewed right" condition on the population.[a] Many introductory stats textbooks interpret "$n$ sufficiently large" to mean $n \geq 30$.

[a] Though I have to admit, this oversimplifies the actual story and is not how I would approach this question in practice. For a deep dive, see the Berry-Esseen theorem.
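To get a feel for how large $n$ needs to be, one can compare the normal approximation with an exact answer for a population where the exact answer is available. The sketch below assumes a hypothetical Gamma(shape 4, scale 2.15) population (right-skewed, mean 8.6, SD 4.3), for which the mean of $n$ independent draws is again gamma distributed with shape $4n$ and scale $2.15/n$. Nothing in the question says the word counts are gamma distributed; this is purely an illustration.

```r
n <- c(1, 5, 10, 30, 39, 100)    # sample sizes to compare (arbitrary choices)

# Exact P(mean > 10) under the assumed (hypothetical) gamma population.
p_exact  <- pgamma(10, shape = 4 * n, scale = 2.15 / n, lower.tail = FALSE)

# CLT-based normal approximation with SD 4.3 / sqrt(n).
p_normal <- pnorm(10, mean = 8.6, sd = 4.3 / sqrt(n), lower.tail = FALSE)

result <- data.frame(n, p_exact, p_normal,
                     abs_error = abs(p_exact - p_normal))
print(round(result, 4))
```

Under this assumption the two columns can be compared directly: the normal value for a single draw is well off, while at $n = 39$ the two agree to within about 0.01. The relative error in a small tail probability can still be noticeable, which is part of what the footnote is hinting at.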

$\endgroup$