This is an ongoing dialogue between a student and his statistics professor.

Student: Prof., I am having some difficulty comprehending the relationship between a sample mean and the population mean. I was taught that, in the long run and through repeated sampling, one could show that the sample mean (x̄) tends towards the population mean (m). Why should this be true?

Prof.: You are asking a tough question, but I will try to answer it. In probability theory there is a theorem called the Law of Large Numbers. It says that when the same experiment is performed a large number of times, the average of the results gets closer to the expected value (the population mean) as the number of trials grows.
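For reference, the professor's statement can be written down as the weak form of the theorem. This is a sketch under the usual textbook assumptions, which the dialogue does not spell out: the observations $x_1, \dots, x_n$ are independent draws from the same population, and that population has a finite mean $m$.

$$
\bar{x}_n = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
\lim_{n \to \infty} P\left(\left|\bar{x}_n - m\right| > \varepsilon\right) = 0
\quad \text{for every } \varepsilon > 0 .
$$

In words: for any tolerance $\varepsilon$, the chance that the sample mean misses $m$ by more than $\varepsilon$ shrinks to zero as the sample size $n$ grows.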

Student: Can you give me an example?

Prof.: Yes, I can. Take a page full of random numbers, say 500 of them. Draw a random sample of 50 several times and compute the mean of each sample. You already know the population mean, which is the average of all the numbers on the page. If you now compare the sample means with the population mean, the theory says that the sample mean will be a good predictor of the population mean.
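The professor's experiment can be tried on a computer. The following is a minimal sketch, not part of the original conversation: the range of the random numbers (0 to 99), the number of repeated samples (1,000), the seeds, and the helper name `sample_means` are all assumptions made for illustration.

```python
import random
import statistics


def sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples random samples (without replacement) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# The "page" of 500 random numbers (assumed here to be integers from 0 to 99).
page_rng = random.Random(42)
population = [page_rng.randint(0, 99) for _ in range(500)]
population_mean = statistics.mean(population)

# Repeat the professor's experiment: samples of 50, drawn many times.
means = sample_means(population, sample_size=50, n_samples=1000)

print("population mean:        ", population_mean)
print("average of sample means:", statistics.mean(means))
print("smallest sample mean:   ", min(means))
print("largest sample mean:    ", max(means))
```

Individual sample means scatter around the population mean, while their average over many repetitions lands very close to it. The spread between the smallest and largest sample mean gives a feel for how far a single sample of 50 can be off.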

Student: Prof., this assumes that the population is known, so you can compute the population mean. What if the population size is unknown and the population is also non-homogeneous? Take the voting population of India, for example. You are aware that in the last Indian general election, exit polls were conducted by at least a dozen daily journals and agencies. In predicting the final tally, none of them came anywhere near the final outcome. What is this due to?

Prof.: This is called sampling error. Unfortunately, in this case it turned out to be very large. Normally, predictions based on sample surveys turn out to be correct. However, I need to address your question. Let me try.

The poor prediction could be due to a host of reasons.

- The sample chosen was very small and did not reflect several traits and characteristics of the population.
- The respondents to the survey may have intentionally misled the interviewers by lying to them.
- Since the polls are conducted in only some of the many precincts, the selected precincts' voting patterns could have been different from those of the others.
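
The last reason in the list above can be illustrated with a small simulation. This is a hedged sketch on made-up data, not a model of any real election: the 20 precincts, the 1,000 voters in each, the support rates from 30% to 70%, and the poll size of 2,000 are all hypothetical values chosen for illustration.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical electorate: 20 precincts of 1,000 voters each. Support for
# a party rises from 30% in the first precinct to 70% in the last one.
precincts = []
for i in range(20):
    support = 0.30 + 0.40 * i / 19
    precincts.append([1 if rng.random() < support else 0 for _ in range(1000)])

all_voters = [vote for precinct in precincts for vote in precinct]
true_share = statistics.mean(all_voters)

# Poll A: 2,000 voters drawn at random from the whole electorate.
random_poll = statistics.mean(rng.sample(all_voters, 2000))

# Poll B: the same number of voters, but drawn only from the four
# precincts with the highest support (a non-representative selection).
selected = [vote for precinct in precincts[-4:] for vote in precinct]
skewed_poll = statistics.mean(rng.sample(selected, 2000))

print(f"true share:             {true_share:.3f}")
print(f"random poll:            {random_poll:.3f}")
print(f"selected-precinct poll: {skewed_poll:.3f}")
```

In this sketch the random poll stays close to the true share, while the poll restricted to a few precincts misses badly even though both polls interview the same number of voters. A larger sample does not repair a sample that does not reflect the population.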

Student: Prof., your explanation is not at all convincing. I will not trust the sample mean to be a good predictor. I have seen the non-farm payroll and quarterly GDP figures revised several times after the initial announcement, which was based on a sample survey. I have also noticed that consumer confidence predictions vary significantly over short periods of time. I am tempted to conclude that nothing can really be done with sample surveys, because the estimates derived from them could be far from the truth. Thanks, Prof.