The Mean Caps Exceedance Fraction
This fact is wild to me: If you know the average exposure of a SEG, you know the maximum exceedance fraction. For example, if the arithmetic mean of a SEG is less than 25% of the exposure standard, then your exceedance fraction will be less than 5%. The explanation falls out of the maths for calculating a percentile of a lognormal distribution - there are no tricks here, and I’ll explain later. In short, the SEG average is inherently intertwined with the SEG variance. But first, some visualisations may help explain what’s going on.
In this first chart, I have created many lognormal distributions. All have an arithmetic mean of 25% of the OEL, and a GSD according to the X-axis. The Y-axis is the exceedance fraction as a decimal (i.e. 0.05 = 5% EF).
Notice two things: 1) The exceedance fraction never goes above 5% regardless of the GSD. This means, as said before, that if the arithmetic mean (AM) is less than 25% of the OEL, then the exceedance fraction must be less than 5% whatever the GSD is. 2) Beyond a certain point, the exceedance fraction goes down as the GSD increases.
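The numbers behind a chart like this can be reproduced with a few lines of Python. This is a minimal sketch, assuming the standard lognormal relationship AM = exp(mu + sigma^2 / 2), where mu and sigma are the mean and standard deviation of the logged exposures and sigma = ln(GSD). The function name `exceedance_fraction` and the OEL of 100 are made up for illustration.

```python
import math

def exceedance_fraction(am, oel, gsd):
    """Fraction of a lognormal distribution above the OEL, given its AM and GSD."""
    sigma = math.log(gsd)                  # standard deviation of ln(exposure)
    mu = math.log(am) - 0.5 * sigma ** 2   # mean of ln(exposure), from AM = exp(mu + sigma^2 / 2)
    z = (math.log(oel) - mu) / sigma       # distance from the log-mean to the OEL, in sigmas
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail area of the standard normal

# AM fixed at 25% of a hypothetical OEL of 100, GSD varied
gsds = [1.5, 2, 3, 4, 5, 7, 10, 20]
efs = {gsd: exceedance_fraction(am=25, oel=100, gsd=gsd) for gsd in gsds}
for gsd, ef in efs.items():
    print(f"GSD {gsd:>4}: EF = {ef:.4f}")
```

Under these assumptions, every value printed stays below 0.05, and the values rise and then fall again as the GSD grows.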
The first point is ‘fun’, but the second point, to me, is very confusing and counter-intuitive. Shouldn’t it be that a higher GSD means more spread of data, and a higher exceedance fraction??
By graphing actual distributions with AM = 25% of the OEL and a selection of GSDs, we can see that the shapes of the distributions vary wildly on the left-hand side. But the areas under the curves above the OEL (which are the exceedance fractions) are all quite similar. That is, all very small.
Zooming in, we can see the area under the curve (exceedance fraction) does increase from GSD = 1.5 to 4, but then decreases when GSD = 7 - as shown by the first graph.
Using the formula outlined in the next section we can show the relationship between the arithmetic mean and the maximum exceedance fraction:
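As a rough sketch, the relationship can be tabulated in Python. It assumes the maximum exceedance fraction is 1 - Φ(sqrt(2 ln(OEL/AM))), which is what the lognormal maths gives when the AM is below the OEL. The function name `max_exceedance_fraction` and the chosen AM/OEL ratios are illustrative only.

```python
import math

def max_exceedance_fraction(am_over_oel):
    """Largest possible exceedance fraction for a lognormal SEG whose AM is below the OEL."""
    if not 0 < am_over_oel < 1:
        raise ValueError("this sketch only covers 0 < AM/OEL < 1")
    z_min = math.sqrt(2 * math.log(1 / am_over_oel))  # smallest possible z over all GSDs
    return 0.5 * math.erfc(z_min / math.sqrt(2))      # upper-tail area of the standard normal

for ratio in (0.05, 0.10, 0.25, 0.50, 0.75):
    print(f"AM = {ratio:.0%} of OEL -> max EF = {max_exceedance_fraction(ratio):.2%}")
```

The lower the AM sits relative to the OEL, the lower the cap on the exceedance fraction.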
So what?
Does this matter if we could just calculate the exceedance fraction directly? Maybe not. Maybe it’s just a “fun fact”. One possible use is if your decision criterion uses the exceedance fraction but all your data is censored (every result is below the limit of reporting). In this case, you can’t calculate an exceedance fraction. However, if the limit of reporting was less than 25% of the OEL, you can be pretty sure the AM is also less than 25% of the OEL. Therefore, you can be pretty dang sure that the exceedance fraction is less than 5%!
Maths Explanation
The maths is confusing if you haven’t studied probability. I find it confusing and will probably do a poor job at explaining but will try anyway.
In general, normal distributions are much easier to work with than lognormal ones. To convert a lognormal distribution to a normal distribution, we take the natural log of the values (hence the name). That’s why in all the equations, almost everything is written as log(x).
The equation below is our exceedance fraction. The funny symbol in the middle, and all the stuff to the right, is a ‘cumulative normal distribution’ function. Basically, how much of the distribution is below the OEL. If we then take 1 minus that number, we have everything above the OEL… which is our exceedance fraction.
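Written out, the exceedance fraction (EF) in terms of the OEL, AM and GSD can be sketched as follows. This assumes the usual lognormal relationship ln(GM) = ln(AM) - ½[ln(GSD)]², with Φ as the standard normal cumulative distribution function and ln as the natural log.

```latex
\[
EF = 1 - \Phi\!\left(
  \frac{\ln(\mathrm{OEL}) - \ln(\mathrm{AM}) + \tfrac{1}{2}\,[\ln(\mathrm{GSD})]^{2}}
       {\ln(\mathrm{GSD})}
\right)
\]
```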
Say we know what our OEL and AM are. Say OEL = 100, and AM = 25. If GSD is really small, like 1.1, then the value inside the brackets will be large -> the cumulative distribution function will be close to 1 -> our exceedance fraction is tiny.
But it is also true that as the GSD gets huge, say 100, then the value is also large, and our exceedance fraction is small.
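How? Notice that there is a ‘GSD’ both on the top and bottom of the fraction? When the GSD is small, the bottom of the fraction is tiny, so the value is large. When the GSD is huge, the top grows faster than the bottom, so the value is large again.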
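To make this concrete, here is a small sketch that evaluates the value in the brackets (called z here) for OEL = 100 and AM = 25 at a small, a middling and a huge GSD. It assumes the bracketed value is ln(OEL/AM)/ln(GSD) + ln(GSD)/2, which follows from the standard lognormal relationships. The name `z_value` is just for illustration.

```python
import math

def z_value(am, oel, gsd):
    """Value inside the brackets of the exceedance fraction formula."""
    sigma = math.log(gsd)  # standard deviation of ln(exposure)
    return math.log(oel / am) / sigma + sigma / 2

for gsd in (1.1, 5.3, 100):
    z = z_value(am=25, oel=100, gsd=gsd)
    ef = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail area of the standard normal
    print(f"GSD {gsd:>5}: z = {z:.3f}, EF = {ef:.5f}")
```

The first term dominates when the GSD is small and the second term dominates when it is huge, so z is large at both ends and dips in between.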
So, for any OEL - AM combination there must be a GSD that minimises this value in the brackets, and the smallest value in the brackets gives the largest exceedance fraction. To cut things short, the AM is calculated using the GSD. They are “linked”. It turns out that the smallest value possible in the brackets is equal to:
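A short derivation, under the standard lognormal assumptions and for an AM below the OEL. The shorthand σ = ln(GSD), L = ln(OEL/AM) and z for the value in the brackets is introduced here purely for convenience.

```latex
\[
z(\sigma) = \frac{L}{\sigma} + \frac{\sigma}{2},
\qquad
\frac{dz}{d\sigma} = -\frac{L}{\sigma^{2}} + \frac{1}{2} = 0
\;\Rightarrow\;
\sigma^{*} = \sqrt{2L}
\]
\[
z_{\min} = \frac{L}{\sqrt{2L}} + \frac{\sqrt{2L}}{2}
         = \sqrt{2L}
         = \sqrt{2\,\ln\!\left(\frac{\mathrm{OEL}}{\mathrm{AM}}\right)}
\]
```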
Substituting this back into the first equation we get the maximum exceedance fraction:
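In symbols, assuming the AM is below the OEL and with Φ as the standard normal cumulative distribution function, the result can be sketched as follows. The worked number uses AM = 25% of the OEL.

```latex
\[
EF_{\max} = 1 - \Phi\!\left( \sqrt{2\,\ln\!\left(\frac{\mathrm{OEL}}{\mathrm{AM}}\right)} \right)
\]
\[
\frac{\mathrm{AM}}{\mathrm{OEL}} = 0.25:
\quad
\sqrt{2\,\ln 4} \approx 1.665,
\quad
EF_{\max} \approx 1 - \Phi(1.665) \approx 0.048
\]
```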
Notice how the maximum exceedance fraction now does not directly rely on the GSD? For a given OEL and AM, the GSD could be anything and we will still know what the maximum possible exceedance fraction is.
If that didn’t make sense, it’s more my fault than yours. It’s confusing, I’m sorry.
If you want to learn more, I strongly suggest Rappaport’s book which is where I found this idea.
For Ian’s Comment