As a very intriguing corollary to this finding, the fact that we can use a Logistic to model discovery means that we cannot use only a Logistic to model production.

Given the vast number of variables that we have to deal with, I have tried to go with the simplest quantitative modeling tool that appears to provide some plausible results. I think that a good way to evaluate the HL method is to generate some predicted production curves for regions that have peaked, using only production data through the peak date to generate the predicted curve--and then compare the predicted post-peak cumulative production to the actual post-peak cumulative production for a given region. This is what we (my idea, Khebab's hard work) did in the following article:

http://graphoilogy.blogspot.com/2007/06/in-defense-of-hubbert-linearizat...
In Defense of the Hubbert Linearization Method (June, 2007)
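
The backtest described above can be sketched on synthetic data. This is only a minimal sketch, not the code behind the linked article: the logistic parameters (a URR of 200 Gb, k = 0.06 per year, a 1975 peak), the 5% noise level and the function names `logistic_Q`, `hl_fit` and `backtest` are all assumptions chosen for illustration, and no real production data is used.

```python
import numpy as np

def logistic_Q(t, urr, k, t_peak):
    """Cumulative production Q(t) of a logistic (Verhulst) curve."""
    return urr / (1.0 + np.exp(-k * (t - t_peak)))

def hl_fit(P, Q):
    """Fit a straight line to P/Q versus Q; return (k, URR estimate)."""
    slope, intercept = np.polyfit(Q, P / Q, 1)
    return intercept, -intercept / slope  # URR is where the line reaches P/Q = 0

def backtest(years, P, Q, t_peak):
    """Fit on data through the peak year only, then compare post-peak totals."""
    pre = years <= t_peak
    _, urr_hat = hl_fit(P[pre], Q[pre])
    q_peak = Q[years == t_peak][0]
    predicted_post = urr_hat - q_peak  # what the HL fit says is still to come
    actual_post = Q[-1] - q_peak       # what was produced after the peak
    return predicted_post, actual_post

# Hypothetical region (all numbers invented for illustration).
years = np.arange(1900, 2051)
Q = logistic_Q(years, 200.0, 0.06, 1975)
P_clean = 0.06 * Q * (1.0 - Q / 200.0)  # logistic production rate
rng = np.random.default_rng(0)
# Assumption: noise is applied to P only; Q is left as the true cumulative.
P_noisy = P_clean * (1.0 + 0.05 * rng.standard_normal(len(years)))

for label, P in [("clean", P_clean), ("noisy", P_noisy)]:
    pred, actual = backtest(years, P, Q, 1975)
    print(f"{label}: predicted post-peak {pred:.1f} Gb, actual {actual:.1f} Gb")
```

On clean logistic data the fit recovers the post-peak total almost by construction, so the interesting case is the noisy one, and more so real regions that are not exactly logistic.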

BTW, I should add a fairly self-evident point, to wit, that the HL method can't "see" the production from immature and/or undeveloped basins. Of course, the problem is that there are fewer and fewer basins that are in this category, and then the question is how material they will be to a given region and to the world.

I think the procedure you reference in the other post is really providing a false sense of accuracy -- at least there are some implicit assumptions that are never explicitly addressed. The biggest influence on your outcome is:

How do you choose the points to which the model is fitted? In the post above, you say "...using only production data through the peak date to generate the predicted curve"; however, in the link you clearly state that only the "green" points are used to fit the model. Clearly, there are several points prior to the "green" points that are not included in the modeling process. My questions remain:

1. How do you choose the green points?
2. How much does the answer vary if you choose a different range of green points?
3. A true estimate of the variance of the curve could be gleaned if you randomly chose x points prior to the peak, doing this several hundred times and getting an empirical confidence interval. Have you done this?
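
The resampling check proposed in point 3 could look roughly like the sketch below. It is a sketch under stated assumptions: the data are synthetic (a logistic with a URR of 200 Gb and 5% noise on P), the subset size of 20 points and the 500 trials are arbitrary, and `random_subset_interval` is a hypothetical helper name, not anything used in the linked article.

```python
import numpy as np

def hl_urr(P, Q):
    """URR estimate from a straight-line fit of P/Q versus Q."""
    slope, intercept = np.polyfit(Q, P / Q, 1)
    return -intercept / slope

def random_subset_interval(P, Q, n_points, n_trials=500, seed=0):
    """Refit HL on random subsets of the pre-peak points.

    Returns the 2.5th, 50th and 97.5th percentiles of the URR estimates.
    """
    rng = np.random.default_rng(seed)
    estimates = np.empty(n_trials)
    for i in range(n_trials):
        idx = rng.choice(len(Q), size=n_points, replace=False)
        estimates[i] = hl_urr(P[idx], Q[idx])
    return np.percentile(estimates, [2.5, 50.0, 97.5])

# Synthetic pre-peak data (all numbers invented for illustration).
years = np.arange(1900, 1976)
Q = 200.0 / (1.0 + np.exp(-0.06 * (years - 1975)))
P = 0.06 * Q * (1.0 - Q / 200.0)
noise = np.random.default_rng(1).standard_normal(len(years))
P_noisy = P * (1.0 + 0.05 * noise)

low, median, high = random_subset_interval(P_noisy, Q, n_points=20)
print(f"URR 95% interval: {low:.0f} to {high:.0f} Gb (median {median:.0f})")
```

The width of the resulting interval is a direct measure of how much the answer depends on which points are fed to the fit, which is the concern raised in points 1 and 2.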

While I am a statistician, I have worked extensively with physicists, which it appears the OP is (the love for the power law also gives it away a little :) ). The point is that the model needs to be chosen for a defensible reason rather than for quantitative convenience. In my opinion, this becomes more of a necessity as Peak Oil becomes more "mainstream" and people begin to investigate some of the claims. It becomes fairly easy to establish numerous counterexamples where the HL procedure is shown to be quite ineffective or exhibits a lack of robustness.

I'm actually a physicist, and I agree with your requirement that a defensible reason is more important than quantitative convenience, but for a physicist there is some wiggle room. It depends on the quality of the data and the stage of development of the theory. For example, in Verhulst's time, there was no data with which one could do a reliable study of the effect of starvation alone, as opposed to starvation and disease, or starvation and war, etc. So Verhulst chose to simply posit that population tended toward a saturation number that was a new parameter in the Malthus model. In the absence of any real data, this was little more than an intellectual placeholder for the idea that this model can't possibly be complete.

Then a century later Hubbert needs a simple formula for a time-dependent quantity that starts very small, grows to a peak and then declines, ultimately to zero. He sees that the Verhulst equation meets his criteria and uses it.
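
For reference, the link between the Verhulst equation and the HL plot can be written out as follows. The symbols are assumptions introduced here rather than notation from the thread: Q is cumulative production, P = dQ/dt is the production rate, k is the intrinsic growth rate, URR is the saturation value, and t_m is the peak date.

```latex
\begin{align}
\frac{dQ}{dt} &= P = k\,Q\left(1 - \frac{Q}{\mathrm{URR}}\right) \\
\frac{P}{Q} &= k - \frac{k}{\mathrm{URR}}\,Q \\
Q(t) &= \frac{\mathrm{URR}}{1 + e^{-k(t - t_m)}} \\
P(t) &= \frac{k\,\mathrm{URR}}{4}\,\operatorname{sech}^2\!\left(\frac{k\,(t - t_m)}{2}\right)
\end{align}
```

The second line is the straight line that HL fits: its intercept on the P/Q axis is k and it reaches P/Q = 0 at Q = URR. The last line shows that the production curve implied by this model is symmetric about t_m, since sech is an even function.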

The Gauss normal curve also meets these theoretical criteria. I tried the Gauss curve when first learning about PO. It leads to very messy algebra. There was no theory that supported using Gauss in this situation. The central limit theorem applies to large numbers of statistically independent events. Since I didn't want to use Gauss because the algebra was messy, it was easy for me to convince myself that there were surely not a large number of independent events in this situation.

He, like a physicist, doesn't have to justify trying it. Using it only needs justification if it works, and then the justification is more a discussion of what work needs to be done to develop a proper theory. Among other things, one needs to develop a good procedure for selecting data.

Elsewhere in the discussion I've posted a comment about how troubling it is that Hubbert linearization requires that the Hubbert peak be symmetric.

I think that we may very well witness the post-peak decline in real life before we have an adequate theory of how to predict it. What we have now is good enough for economic hand waving, but as soon as decline is real there will be rapid changes that will lead to big forced changes in human behavior.

The Gauss Normal curve only kicks in when you apply the Shock Model to the discovery curve. The shock model places convolutions of slight production shifts corresponding to the fallow, construction, maturation, and extraction phases after the initial discovery (i.e. a sequence of statistically independent events). This trends the production curve to look more Gaussian.
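
The trend toward a Gaussian shape under repeated convolution can be checked numerically. The sketch below is not the Oil Shock model itself: it assumes that each of the four phases is an exponential lag with the same 5-year mean (an invented figure), starts from a single discovery impulse, and uses skewness as a rough measure of distance from a symmetric, Gaussian-like curve.

```python
import numpy as np

def exp_kernel(mean_lag, dt, t_max):
    """Discretized exponential lag distribution, normalized to sum to 1."""
    t = np.arange(0.0, t_max, dt)
    k = np.exp(-t / mean_lag)
    return k / k.sum()

def skewness(pdf, dt):
    """Skewness of a discretized distribution sampled every dt."""
    t = np.arange(len(pdf)) * dt
    w = pdf / pdf.sum()
    mu = (w * t).sum()
    var = (w * (t - mu) ** 2).sum()
    return (w * (t - mu) ** 3).sum() / var ** 1.5

dt = 0.1
kernel = exp_kernel(mean_lag=5.0, dt=dt, t_max=100.0)

# One stage, then convolve in three more (fallow, construction,
# maturation, extraction: four stages in total).
curve = kernel.copy()
skews = [skewness(curve, dt)]
for _ in range(3):
    curve = np.convolve(curve, kernel)
    skews.append(skewness(curve, dt))

for n, s in enumerate(skews, start=1):
    print(f"{n} stage(s): skewness {s:.2f}")
```

For n identical exponential stages the skewness in the continuous limit is 2/sqrt(n), so four stages halve the asymmetry of a single one. The curve moves toward a Gaussian but is still visibly skewed after only four convolutions.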

You should look at this post http://mobjectivist.blogspot.com/2008/03/street-lamp-understanding-of-sh... to see how this all works in the context of the Oil Shock model. Convolutions of gaussians result in gaussians and all curves trend toward this property as a consequence of the CLT:

The only minor issue I have is this statement of yours:
"He, like a physicist, doesn't have to justify trying it. Using it only needs justification if it works, and then the justification is more a discussion of what work needs to be done to develop a proper theory. Among other things, one needs to develop a good procedure for selecting data."
Without the theory, this becomes the definition of a heuristic and it prevents us from making as fast a headway as possible. Can you imagine how slowly we would have advanced technologically if everything was based on heuristics instead of fundamental explainable laws such as Maxwell-Boltzmann and Fermi-Dirac statistics? If it wasn't for F-D in particular, we would still be wondering why a semiconductor transistor works at all!!

Otherwise I agree with everything you say and consider Verhulst's approach a deterministic trajectory and not the stochastic trajectory that we really should be using, à la the Dispersive Discovery model.

The key point is to determine whether the data set shows a steady linear progression with a P/Q intercept generally, but not always, in the 5% to 10% range (there are some outliers, such as the North Sea, an exclusively offshore region with a rapid decline rate). That is how Khebab chose the green points.

We don't have that many large producing regions to study. We can say that our available case histories--Texas (total plot, pre-peak is noisy), the total Lower 48, Russia, Mexico, North Sea, Saudi Arabia etc.--broadly fit the HL model. These regions account for about half of the oil that has been produced to date worldwide.

Meanwhile, what I first warned about in January, 2006--based on an HL analysis of the top net oil exporters--is unfolding in front of our very eyes: an accelerating decline in net oil exports. In fact, based on the HL analysis of Russia, in January, 2006, I gave Russia another one to two years of rising production before they resumed their production decline. And while Saudi Arabia has shown a rebound in production, it is a near certainty that they will show three straight years of annual production below their 2005 annual rate, at about the same stage of depletion at which the prior swing producer, Texas, started declining (all based on HL).

Recent headline:

Declining Russian Oil Production Could Lead to $200 Oil and “Global Recession,” Says Deutsche Bank

You are exactly correct in your observations and the choice of the time interval for the fit is the weakness of this empirical approach.

"A true estimate of the variance of the curve could be gleaned if you randomly chose x points prior to the peak, doing this several hundred times and getting an empirical confidence interval. Have you done this?"

Yes, but I use a bootstrap technique instead to derive a confidence interval, which is very often quite large. For instance, for Saudi Arabia:

Thanks for the response, Khebab -- this is basically what I was asking. Do points prior to 50 Gb ever get used in the estimation? I guess I am just curious as to how long you have to wait before you can choose a "linear" portion of the profile. Also, how effective is this if the peak hasn't occurred yet?

For instance, what if you started estimating with the first two observations and then updated your estimates as new observations came in? You would get wildly varying answers. Then, eventually, you would have to decide where to quit using early points to capture the linear part of the profile. In the graph above, if the estimation was done using the points between 40 Gb and 70 Gb, we would have estimated the cumulative Gb to be around 85.
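
The thought experiment of starting with two observations and refitting as each new one arrives can be sketched as follows. The data are synthetic (a logistic with a URR of 200 Gb and 5% noise on P, both assumptions), and `expanding_window_urr` is a hypothetical helper, so the printed numbers say nothing about Saudi Arabia or any real region.

```python
import numpy as np

def hl_urr(P, Q):
    """URR estimate from a straight-line fit of P/Q versus Q."""
    slope, intercept = np.polyfit(Q, P / Q, 1)
    return -intercept / slope

def expanding_window_urr(P, Q, start=2):
    """URR estimate from the first n points, for n = start .. len(Q)."""
    return np.array([hl_urr(P[:n], Q[:n]) for n in range(start, len(Q) + 1)])

# Synthetic pre-peak data (all numbers invented for illustration).
years = np.arange(1900, 1976)
Q = 200.0 / (1.0 + np.exp(-0.06 * (years - 1975)))
P = 0.06 * Q * (1.0 - Q / 200.0)
noise = np.random.default_rng(2).standard_normal(len(years))
P_noisy = P * (1.0 + 0.05 * noise)

estimates = expanding_window_urr(P_noisy, Q)
for n in (2, 5, 10, 25, 50, len(Q)):
    # estimates[0] corresponds to n = 2 points
    print(f"first {n:2d} points: URR estimate {estimates[n - 2]:10.1f} Gb")
```

Early estimates can be far off in either direction, or even negative, because the fitted slope is close to zero and noisy. How many points are needed before the estimate settles is exactly the judgment call being questioned here.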

In the end, I think this is a decent way to model existing data, but may be poor in making predictions of any accuracy. I just think we have to be careful about how this is presented -- especially when addressing scientifically-minded non-believers.

"Do points prior to 50 Gb ever get used in the estimation"

Look at this mash-up of Khebab's SA data and the USA data from my post above. The data points show more fluctuation for SA below 50 Gb, but they both show that curious hyperbolic curvature indicated by the solid blue line.

As I said in the post, this has to do with the use of power-law discovery as opposed to the exponential-law; the latter gives a perfectly flat HL.

I usually discard the first points because low cumulative production values (Q) will boost small fluctuations in production (P). I usually take P/Q<10% as a cut-off value. Because of integration, the noise on P does not affect Q after a while and fluctuations in P are dampened as Q increases.
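
One possible reading of this cut-off rule is sketched below: drop every point whose P/Q is at or above 10%, then fit the HL line to what remains. The production series is synthetic (yearly increments of a logistic with a URR of 200 Gb), and `hl_fit_with_cutoff` is a hypothetical helper name, so this is an illustration of the idea rather than the actual procedure.

```python
import numpy as np

def hl_fit_with_cutoff(P, Q, cutoff=0.10):
    """Drop points with P/Q >= cutoff, then fit the HL line on the rest.

    Returns the boolean keep-mask and the URR estimate.
    """
    ratio = P / Q
    keep = ratio < cutoff
    if keep.sum() < 2:
        raise ValueError("need at least two points below the cutoff")
    slope, intercept = np.polyfit(Q[keep], ratio[keep], 1)
    return keep, -intercept / slope

# Synthetic annual production: yearly increments of a logistic curve.
edges = np.arange(1899, 1976)
Q_true = 200.0 / (1.0 + np.exp(-0.06 * (edges - 1975)))
P = np.diff(Q_true)  # production in each year 1900..1975
Q = np.cumsum(P)     # running total as an analyst would build it from P

ratio = P / Q
print("P/Q in the first three years:", np.round(ratio[:3], 3))
keep, urr = hl_fit_with_cutoff(P, Q)
print(f"kept {keep.sum()} of {len(P)} points, URR estimate {urr:.0f} Gb")

# A fixed fluctuation dP shifts P/Q by dP/Q, so it matters most at small Q.
dP = 0.1
print("shift in P/Q at first and last point:", dP / Q[0], dP / Q[-1])
```

Because Q is a running sum, the very first point always has P/Q = 1, and the ratio only drops into the range where a line is plausible after a number of years of accumulation.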