Yes, unless you choose the right parameters of the log-normal, you can get the situation where the variance is much larger than the square of the mean. This would make it much more bi-lobal than the exponential, i.e. a huge number of small depths and also lots of deep depths. Unfortunately using the log-normal also makes it hard to derive a closed form solution, or something that doesn't require you to quote an Erf(). You could get lucky and derive it analytically but I am not sure.

I think log-normal is typically reserved for statistical pool sizes, if you want to get the biggest benefit from using it. It has something to do with aggregation of clusters. If clusters get big enough they start to agglomerate so you end up moving toward two extremes. Lots of small ones, and many more than average big ones. It may work for depths too. The log-normal certainly has the nice property that it prevents negative values.

A Gamma distribution of N=2 would also work. This requires the integration of x^2*exp(-kx) for figuring the E[]. (Gamma of N=1 is the exponential. ) I could see this working as the convolution of two exponentials. The first exponential defines the depth distribution at which you start to hit oil, and the second exponential defines the width of the sweet spot. This also has the nice property of not going negative, and I bet it has a closed-form solution as well.

I will definitely scribble on this derivation today. (Unless you get there before I do)

I agree regarding the log-normal. There will be no closed form solution (not that I think this is a problem really, we can easily evaluate the integral numerically).

"I could see this working as the convolution of two exponentials. The first exponential defines the depth distribution at which you start to hit oil, and the second exponential defines the width of the sweet spot."

This is a nice idea for motivating another distribution. Here is what I get. (If we get the same, there is a good chance it's correct ;)

Let S ~ Gamma(2,s), L ~ Gamma(2,l) (and S,L independent as usal). Then S has density x/(2*s^2)exp(-x/s), for x >= 0.

First, integrating over the distribution of S, conditional on L:
E[D(S,L) | L = y] = alpha*( s - s*exp(-y/s) - (1/2)*y*exp(-y/s) ).

Second, integrate over the distribution of L:
E[D(S,L)] = E[ E[D(S,L)|L] ] = int_0^\infty E[D(S,L) | L = y] y/(2*l^2)exp(-y/l) dy =

alpha*( s - (s/l^2)/(1/l + 1/s)^2 - 3/(2*l^2)/(1/l + 1/s)^3 ).

Moreover, s and l are determined by lambda and Lo, since E[S] = 2*s = lambda and E[L] = 2*l = Lo. Note that Var[S] = 2*s^2 = lambda*s and E[L] = 2*l = Lo, Var[L] = 2*l^2 = Lo*l, so the variance "scales" differently than in the case when we use exponential distributions.

What parameters do you use for drawing the graphs above? (i.e., Lo, and the paramters for lambda(t) ).

There is an interesting property if you convolve two exponentials with L0 and L1 characteristic lengths. The result is essentially proportional to "exp(-x/L0)-exp(-x/L1)" where L1>L0.

This means that the expected value should be
1/(1/L1+1/x) - 1/(1/L0+1/x)

If I pick something like L1=8 units and L0=7 units then the incremental dispersion looks like:

(this is the derivative of the above function)

What I like about it, is that it generates the "cusp" that Hubbert saw on his cumulative footage return graph. Look at the superimposed blue curve:

The non-sweet spot return shows a plateau at zero (constant positive rate) whereas the sweet spot definitely shows a cusp, which may be reflected by the actual data that Hubbert plotted.

I will try to reconcile the difference between your formula and this one. Admittedly, I did this my derivation in my head because of the convolution identity that dawned on me.

Yes, if L is a sum of independent exponentials with means Lo and L1, then L has density

(*) 1/(L1-L0)*( exp(-x/L1)-exp(-x/L0) ), for x >= 0.

(Of course, when L0->L1 this will collapse to a Gamma distribution with 2 degrees of freedom).

In the formulas above, I assumed that S is also Gamma(2,s) distributed, which perhaps is harder to give a physical interpretation than for L. If we assume S ~ Exp(lambda) and L density (*), then

E[D(S,L)] = lambda*(1 - 1/(L1-Lo)*(1/(1/lambda + 1/L1) - 1/(1/lambda + 1/Lo)) ).

But perhaps all this fiddling around with different distibutions is of moderate interest: it is probably most important to recognice that the most important thing is to have some kind of uncertainty ("dispersion") in lambda and L. Whether this is the case, I would say, comes down to how sensitive the "output" is to the various assumptions on the distributions, the "output" being whatever the model help us to say about the real world. (In this case, perhaps the expected volume of discoveries in the next few years.)

Another thing we could try to adress is the sample variability: we have focused on the mean of D(S,L), but it should be straight forward to compute condfidence intervals for future discoveries, given assumptions on the distributions of S and L. (Maybe you already did this).

Another (minor) comment: i don't think the single dispersive model corresponds to a uniformly distributed L --- in that model, L = Lo deterministically (i.e. point mass 1 at Lo in probability lingo).

In fact, L uniformly distributed on [0,2*L0] (to make E[L] = L0) gives

E[D(S,L)] = labmda*( 1 - 1/(2*Lo) + 1/(2*Lo)*exp(-2*Lo/lambda) ).