The Normal Distribution Bell Curve, with mean = 0 and stdev = 1.
Let's imagine we have a process which produces something with some measurable property - for example, a pencil factory which produces
pencils of a certain length x. Our goal is for the pencils to be exactly x_tgt in length, but real-world processes aren't perfect
and so the length of each individual pencil deviates by a certain amount. If we ran this process indefinitely and recorded
each length, we'd hope to see most results close to x_tgt, and fewer results far from x_tgt. If our data is
Normally Distributed then we say that our data fits the Bell Curve.
The Bell Curve is a very powerful tool for statisticians to make predictions and analyses using real-world datasets. If the values
in a dataset are distributed per the Bell Curve, then we can utilize the mean and standard deviation of the dataset to answer questions
like "What is the probability that the process will produce a pencil with length less than x_min?".
The Probability Density Function (PDF) of normally-distributed datasets is
$$\Huge f(x) = \frac{1}{\sqrt{2 \pi \sigma ^ {2}}}e^{-\frac{(x-\mu)^{2}}{2 \sigma^{2}}}$$
given the mean μ and standard deviation σ of the data.
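As a quick worked case, substituting μ = 0 and σ = 1 (the standard curve described in the caption at the top) reduces the formula to

$$f(x) = \frac{1}{\sqrt{2 \pi}}e^{-\frac{x^{2}}{2}}, \qquad f(0) = \frac{1}{\sqrt{2 \pi}} \approx 0.3989$$

so the peak of the standard Bell Curve sits at roughly 0.399, directly above the mean.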
A few notes about Probability Density Functions in the above form that are critical to understand:
- f(x) approaches 0 as x approaches +infinity or -infinity.
- The total area between the curve and the x-axis is equal to 1.
- The area under the curve between x=a and x=b is equal to the probability of any given datapoint having value a < x < b.

And so it is within the Probability Density Function that we can find the answer to our original question. A Numerical Integration Method may be necessary to find the area under the curve, though.
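One way to approximate the area under the curve is composite Simpson's rule. The sketch below is only an illustration: the choice of Simpson's rule, the function names `normal_pdf` and `simpson`, and the default of 1000 subintervals are assumptions, not something the problem prescribes.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Probability density of a normal distribution at x."""
    coeff = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def simpson(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] with composite Simpson's rule."""
    if n % 2:                      # Simpson's rule needs an even number of subintervals
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and weight 2 (even i)
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

# Probability that a standard-normal value falls within one standard deviation of the mean
print(simpson(normal_pdf, -1.0, 1.0))
```

For a one-sided question such as "less than x_min", the lower limit of -infinity has to be replaced by something finite; a bound many standard deviations below the mean is a common practical stand-in, since the density there is negligible.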
It's also worth mentioning that this probability is usually found by calculating the z-score for some value x, equal to z = (x - μ) / σ. Then
you would go to a lookup table of recorded values and look up your z-score to find the probability for any value less than x. This
is convenient in that it greatly reduces the math required to find the solution, but at the cost of always requiring a lookup table
nearby...
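In code, the lookup table can be replaced by the error function, since the standard normal cumulative probability satisfies Φ(z) = (1 + erf(z / √2)) / 2. The following is a minimal sketch using Python's `math.erf`; the function names are illustrative only.

```python
import math

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above the mean."""
    return (x - mu) / sigma

def standard_normal_cdf(z):
    """P(Z < z) for a standard normal variable, computed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Probability of a value below x = 7 when mu = 5 and sigma = 2 (z = 1)
print(standard_normal_cdf(z_score(7.0, 5.0, 2.0)))
```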
You will be given a dataset of N pencil lengths randomly drawn from our process, which is fully representative of a
process producing a Normal Distribution. You will then be given Q testcases, each in one of the following formats:
- BT x asks the probability that any given pencil created by the process will have length less than x. (Below Tolerance)
- AT x asks the probability that any given pencil created by the process will have length greater than x. (Above Tolerance)
- IT x y asks the probability that any given pencil created by the process will have length greater than x and less than y. (In Tolerance)
- OT x y asks the probability that any given pencil created by the process will have length less than x or greater than y. (Out of Tolerance)
Figures: shaded regions of the Bell Curve for Below Tolerance, Above Tolerance, In Tolerance and Out of Tolerance, shown for x = -1.0 and y = 0.5.
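The four testcase types all reduce to one or two evaluations of the cumulative probability P(X < x). Here is a rough sketch, assuming the mean and standard deviation are already known; `normal_cdf` and `tolerance_probability` are hypothetical helper names.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(X < x) for a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def tolerance_probability(kind, mu, sigma, x, y=None):
    """Probability for one testcase; y is only used by IT and OT."""
    below_x = normal_cdf(x, mu, sigma)
    if kind == "BT":               # length < x
        return below_x
    if kind == "AT":               # length > x
        return 1.0 - below_x
    inside = normal_cdf(y, mu, sigma) - below_x
    if kind == "IT":               # x < length < y
        return inside
    if kind == "OT":               # length < x or length > y
        return 1.0 - inside
    raise ValueError(f"unknown testcase type: {kind}")

# The x = -1.0, y = 0.5 case on the standard curve (mean 0, stdev 1)
for kind in ("BT", "AT", "IT", "OT"):
    print(kind, tolerance_probability(kind, 0.0, 1.0, -1.0, 0.5))
```

Because IT and OT cover complementary regions, their probabilities for the same x and y always sum to 1, as do BT and AT for the same x.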
Input Data
First line is an integer N, the size of a dataset produced by our process.
N lines will then follow, each containing one pencil length.
The next line will then be Q, the quantity of testcases.
Q lines will then follow, each containing one testcase in the format described above.
Answer
Should be Q space-separated values corresponding to the probability that any given pencil from the given process would have a
length per the conditions described in the testcases.
Error should be less than 1e-6.
Example
input data:
5
5.86
8.35
8.88
4.6
6.04
4
AT 9.92
IT 3.74 5.44
BT 5.12
OT 4.43 5.61
answer:
0.024577 0.177902 0.156775 0.834891
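The example can be checked with a short end-to-end sketch. It rests on one assumption inferred from the sample answer rather than stated in the text: the standard deviation is the population form (dividing by N, here `statistics.pstdev`), which appears to reproduce the values above. The names `solve` and `EXAMPLE_INPUT` are illustrative.

```python
import math
import statistics

def normal_cdf(x, mu, sigma):
    """P(X < x) for a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def solve(text):
    """Parse the input format described above and return one probability per testcase."""
    tokens = text.split()
    n = int(tokens[0])
    lengths = [float(t) for t in tokens[1:1 + n]]
    mu = statistics.fmean(lengths)
    sigma = statistics.pstdev(lengths)   # assumption: population standard deviation
    pos = 1 + n
    q = int(tokens[pos])
    pos += 1
    answers = []
    for _ in range(q):
        kind = tokens[pos]
        if kind in ("BT", "AT"):
            below = normal_cdf(float(tokens[pos + 1]), mu, sigma)
            answers.append(below if kind == "BT" else 1.0 - below)
            pos += 2
        else:                            # IT or OT, each with two bounds
            x, y = float(tokens[pos + 1]), float(tokens[pos + 2])
            inside = normal_cdf(y, mu, sigma) - normal_cdf(x, mu, sigma)
            answers.append(inside if kind == "IT" else 1.0 - inside)
            pos += 3
    return answers

EXAMPLE_INPUT = """5
5.86
8.35
8.88
4.6
6.04
4
AT 9.92
IT 3.74 5.44
BT 5.12
OT 4.43 5.61
"""

print(" ".join(f"{p:.6f}" for p in solve(EXAMPLE_INPUT)))
```

The printed line can be compared directly against the answer line in the example.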