The Bell Curve

Problem #16

Tags: statistics instructional




The Normal Distribution Bell Curve, with mean = 0 and stdev = 1.

Let's imagine we have a process which produces something with some measurable property - for example, a pencil factory which produces pencils of a certain length x. Our goal is for the pencils to be exactly x_tgt in length, but real-world processes aren't perfect, and so the length of each individual pencil deviates by a certain amount. If we ran this process indefinitely and recorded each length, we'd hope to see most results close to x_tgt and fewer results far from x_tgt. If our data is Normally Distributed, then we say that our data fits the Bell Curve.

The Bell Curve is a very powerful tool for statisticians to make predictions and analyses using real-world datasets. If the values in a dataset are distributed per the Bell Curve, then we can utilize the mean and standard deviation of the dataset to answer questions like "What is the probability that the process will produce a pencil with length less than x_min?".
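As a small illustration of the two statistics mentioned above, the sketch below computes the mean and standard deviation of the five pencil lengths from the Example section at the end of this problem. It assumes the population form of the standard deviation (dividing by N rather than N - 1), since the dataset is described as fully representative of the process; that choice is an assumption, though it appears to agree with the example answer.

```python
import statistics

# The five pencil lengths from the Example section of this problem.
lengths = [5.86, 8.35, 8.88, 4.6, 6.04]

# Mean of the dataset.
mu = statistics.fmean(lengths)

# Population standard deviation (divides by N). Assumption: the dataset
# is treated as fully representative, so no N - 1 correction is applied.
sigma = statistics.pstdev(lengths)

print(f"mean = {mu:.6f}, stdev = {sigma:.6f}")
```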

The Probability Density Function (PDF) of normally-distributed datasets is

$$\Huge f(x) = \frac{1}{\sqrt{2 \pi \sigma ^ {2}}}e^{-\frac{(x-\mu)^{2}}{2 \sigma^{2}}}$$

given the mean μ and standard deviation σ of the data.
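The formula translates almost directly into code. The following is a minimal sketch; the function name `normal_pdf` and its default arguments are illustrative choices, not part of the problem.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the bell curve at x for the given mean and stdev."""
    coeff = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

# The curve peaks at the mean and is symmetric around it.
print(normal_pdf(0.0))
print(normal_pdf(-1.0), normal_pdf(1.0))
```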

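A few notes about Probability Density Functions in the above form that are critical to understand: the value f(x) is a density, not a probability in itself; the total area under the curve is equal to 1; and the probability that a value falls between a and b is the area under the curve between a and b.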

And so it is within the Probability Density Function that we can find the answer to our original question. A Numerical Integration Method may be necessary to find the area under the curve, though.
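One way to find that area is the trapezoidal rule, sketched below. This is only one possible numerical method; the helper names, the step count, and the choice of μ - 10σ as a stand-in for minus infinity are all assumptions made for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Probability density of the normal distribution at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / math.sqrt(
        2.0 * math.pi * sigma ** 2
    )

def area_under_pdf(a, b, mu, sigma, steps=10_000):
    """Trapezoidal-rule estimate of the area under the PDF from a to b."""
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * h, mu, sigma)
    return total * h

def prob_below(x_min, mu, sigma):
    """Estimate P(x < x_min). The lower limit of minus infinity is
    replaced by mu - 10 * sigma, below which the area is negligible."""
    return area_under_pdf(mu - 10.0 * sigma, x_min, mu, sigma)

# About 68.27% of values lie within one standard deviation of the mean.
print(area_under_pdf(-1.0, 1.0, 0.0, 1.0))
```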

It's also worth mentioning that this is usually done by calculating the z-score for some value x, equal to z = (x - μ) / σ. Then you would go to a lookup table of recorded values and look up your z-score to find the probability of any value less than x. This is convenient in that it greatly reduces the math required to find the solution, but at the cost of always requiring a lookup table nearby...
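In code, the lookup table can be replaced by the error function from Python's standard library, using the identity Φ(z) = (1 + erf(z / √2)) / 2 for the standard normal distribution. The function names below are illustrative.

```python
import math

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above the mean."""
    return (x - mu) / sigma

def standard_normal_cdf(z):
    """P(Z < z) for a standard normal variable, computed via math.erf.
    This plays the role of the z-score lookup table."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Example with hypothetical values: mean 10, stdev 2, x = 13.
z = z_score(13.0, 10.0, 2.0)
print(z, standard_normal_cdf(z))
```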

Problem Statement

You will be given a dataset of N pencil lengths sampled at random from our process, which is fully representative of a process producing a Normal Distribution. You will then be given Q testcases, which will have the following possible formats:

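BT x
Below Tolerance: the length is less than x.

AT x
Above Tolerance: the length is greater than x.

IT x y
In Tolerance: the length is between x and y.

OT x y
Out of Tolerance: the length is less than x or greater than y.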

Some examples for x = -1.0 and y = 0.5.
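
Assuming these examples refer to the standard curve pictured at the top (μ = 0, σ = 1), the four cases can be written in terms of the cumulative distribution Φ, the area under the standard bell curve to the left of a z-score. The notation below is introduced here for illustration, and the numbers are rounded to six decimals.

$$P(\text{BT } x) = \Phi\left(\frac{x-\mu}{\sigma}\right) = \Phi(-1.0) \approx 0.158655$$

$$P(\text{AT } x) = 1 - \Phi\left(\frac{x-\mu}{\sigma}\right) = 1 - \Phi(-1.0) \approx 0.841345$$

$$P(\text{IT } x\ y) = \Phi\left(\frac{y-\mu}{\sigma}\right) - \Phi\left(\frac{x-\mu}{\sigma}\right) = \Phi(0.5) - \Phi(-1.0) \approx 0.532807$$

$$P(\text{OT } x\ y) = 1 - P(\text{IT } x\ y) \approx 0.467193$$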

Input Data
First line is an integer N, the size of a dataset produced by our process.
N lines will then follow, each containing one pencil length.
The next line will then be Q, the quantity of testcases.
Q lines will then follow, each containing one testcase in the format described above.

Answer
Should be Q space-separated values corresponding to the probability that any given pencil from the given process would have a length per the conditions described in the testcases.
Error should be less than 1e-6.

Example

input data:
5
5.86
8.35
8.88
4.6
6.04
4
AT 9.92
IT 3.74 5.44
BT 5.12
OT 4.43 5.61

answer:
0.024577 0.177902 0.156775 0.834891
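
The whole task can be sketched in a few lines of Python. This is one possible approach, not a reference solution: it assumes the population standard deviation (dividing by N), uses `math.erf` instead of numerical integration, and the names `normal_cdf` and `solve` are made up for this example.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(value < x) for a normal distribution with the given mean/stdev."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def solve(text):
    """Parse the input format described above; return the probabilities."""
    tokens = text.split()
    n = int(tokens[0])
    lengths = [float(t) for t in tokens[1:1 + n]]
    mu = sum(lengths) / n
    # Assumption: population standard deviation (divide by N).
    sigma = math.sqrt(sum((v - mu) ** 2 for v in lengths) / n)
    q = int(tokens[1 + n])
    pos = 2 + n
    answers = []
    for _ in range(q):
        kind = tokens[pos]
        if kind in ("BT", "AT"):
            below = normal_cdf(float(tokens[pos + 1]), mu, sigma)
            answers.append(below if kind == "BT" else 1.0 - below)
            pos += 2
        else:  # "IT" or "OT", each followed by two limits
            lo, hi = float(tokens[pos + 1]), float(tokens[pos + 2])
            inside = normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)
            answers.append(inside if kind == "IT" else 1.0 - inside)
            pos += 3
    return answers

example = """5
5.86
8.35
8.88
4.6
6.04
4
AT 9.92
IT 3.74 5.44
BT 5.12
OT 4.43 5.61"""

print(" ".join(f"{p:.6f}" for p in solve(example)))
```

Under these assumptions, the printed values should agree with the example answer above to within the stated error.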