Links to tutorial material on Hubbert Linearization

Some people who haven't been reading this site since the beginning might be getting a bit lost in all of the discussion of the Hubbert Linearization technique, so I thought I would post some links to some tutorial material.

According to one of the links below, the technique was introduced by Hubbert in his 1982 review paper "Techniques of Prediction as Applied to the Production of Oil and Gas", which appears in the collection Oil and Gas Supply Modeling, edited by Saul I. Gass (published as NBS Special Publication 631). I haven't found this paper on the web, but I plan to peruse the microfiche version at the library of a nearby university soon.

There is a good introduction to the technique in Kenneth Deffeyes's book Beyond Oil: The View from Hubbert's Peak.

On the web, there's a great tutorial at Wolf at the Door (which also has a lot of other tutorial material on peak oil) and there's a piece by Jean Laherrère, which is a version of a paper that appeared in Oil and Gas Journal on April 17, 2000.

Stuart introduced the technique to the TOD community here. Interestingly, it appears that Stuart actually coined the phrase "Hubbert linearization" in that post. (It appears in the figure titles.)

If you know of any other references or links, please leave them in the comments.

Hubbert Linearization is an approximation. That makes the results somewhat less reliable. From a mathematical perspective, the problem is simply one of fitting a Gaussian curve (or "bell curve") to the annual production data, P, vs the year, T. The problem can be simplified by noting that the natural log of the production, Log(P), has a simple quadratic dependence on T:

Log(P) = A + B*T + C*T^2.

The constants A, B, and C are obtained by fitting. Therefore, the problem reduces to fitting a quadratic rather than a straight line. This is only slightly more difficult and can be done with spreadsheet programs like Excel. The advantage is that, unlike the Hubbert Linearization, the relationship is exact. I've done this fitting myself and it works quite well. I'll see if I can paste the chart in this post:
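For readers who would rather try the quadratic fit of Log(P) in code than in a spreadsheet, here is a minimal sketch using only the Python standard library. The function names and the synthetic production figures (peak year 1975, width 25 years, peak rate 3.5) are assumptions made up for illustration and are not real production data. The years are centred before fitting for numerical stability, so the returned A, B, C refer to the centred time variable.

```python
import math

def solve_linear(a, b):
    """Solve the small dense system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_log_quadratic(years, production):
    """Least-squares fit of Log(P) = A + B*t + C*t^2, where t = year - mean(years)."""
    t0 = sum(years) / len(years)
    ts = [t - t0 for t in years]
    ys = [math.log(p) for p in production]
    # Normal equations for the basis 1, t, t^2.
    power_sums = [sum(t ** k for t in ts) for k in range(5)]
    rhs = [sum(y * t ** k for t, y in zip(ts, ys)) for k in range(3)]
    normal = [[power_sums[i + j] for j in range(3)] for i in range(3)]
    a, b, c = solve_linear(normal, rhs)
    return a, b, c, t0

def gaussian_parameters(a, b, c, t0):
    """Convert the quadratic coefficients to (peak year, sigma, peak rate); needs c < 0."""
    sigma = math.sqrt(-1.0 / (2.0 * c))
    peak_year = t0 - b / (2.0 * c)
    peak_rate = math.exp(a - b * b / (4.0 * c))
    return peak_year, sigma, peak_rate

# Hypothetical, noise-free pre-peak data generated from a known Gaussian.
years = list(range(1900, 1971))
production = [3.5 * math.exp(-((y - 1975) ** 2) / (2 * 25.0 ** 2)) for y in years]

peak_year, sigma, peak_rate = gaussian_parameters(*fit_log_quadratic(years, production))
print(f"peak year {peak_year:.2f}, sigma {sigma:.2f}, peak rate {peak_rate:.3f}")
```

With noise-free input the fit recovers the generating parameters; with real, noisy data the estimate of C (and therefore the peak year) can be quite sensitive to the early years, which is the kind of issue discussed in the replies.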

The situation actually seems to be more interesting than you might think.  I explored the US case here, and while it's true that the US production curve is better fit across the full history by a Gaussian than a logistic, Hubbert linearization is the most robust predictor, and is the only one to give very reliable results before the peak.  Quality of model fit and quality of prediction are often very different things.
You've looked at this issue more than I had realized. I was commenting more on the use of the HL technique to approximate a Gaussian curve. From that limited mathematical perspective, it's not a particularly good approximation.

The problem is that:

(1) The data has noise.

(2) The correct model may not be a Gaussian.

In the U.S. case, HL may have given better early predictions of the peak than the Gaussian, but that might not be true of world data. I suppose the best we can do is to try different fitting methods and get a "rough estimate" of world peak oil production. The estimate should improve over time. As you point out in your discussion, error bars and sensitivity analysis are important.

Correct me if I am wrong, but whenever someone says 'the data has noise', the noise is almost always data that doesn't fit the person's predetermined opinion, from what I have seen.
No. Noise has particular characteristics that are independent of the model. Noise is random. There is no reasonable model that could account for the small, short-term fluctuations in the data that we call noise. However, a bad model simply doesn't fit the data very well. An example is the HL technique in the very early part of the oil production history. A straight line doesn't fit that part of the curve, and it's not because of noise.
The best way that I could think of to test the post-peak validity of the HL technique was the exercise that Khebab did with the Lower 48 data.  Post-1970 cumulative Lower 48 production, through 2004, was 99% of what the HL model predicted that it would be--using only 1970 and earlier production data to generate a predicted production profile.

The same exercise for Russia showed that post-1984 cumulative Russian production was 95% of what the HL model predicted, using only production data through 1984 to generate the predicted production profile.

One problem with this is that the person fitting the data using the HL technique makes a (subjective) choice about where to start the linear fit. To quote Stuart Staniford, "Long experience has taught us that the linearization generally does a bad job in the early part of the history...". Therefore, it's possible, after the fact, to choose a starting point for the fit that gives good agreement with the known post-peak data.
 "Therefore, it's possible, after the fact, to choose a starting point for the fit that gives good agreement with the known post-peak data."

I proposed the Lower 48/Russian experiment to Khebab, and he chose all of the technical parameters.  If you have read any of his posts, you can tell that Khebab is an objective scientist. IMO, he is a genius.

In any case, Khebab had zero preconceived expectations of how the results would turn out.  When you look at the actual 1970 and earlier Lower 48 data and actual 1984 and earlier Russian data, they both show very strong HL patterns.

The following link will take you to several Energy Bulletin articles:  http://www.energybulletin.net/news.php?author=jeffrey+brown&keywords=&cat=0&action=search

"M. King Hubbert's Lower 48 Prediction Revisited" has the HL modeling of the Lower 48.

Approximations are often more useful than precision.

For example, back when I was a teenager I used to grind and polish astronomical mirrors for reflecting telescopes (Newtonian and Herschel-style off-axis). The goal was to get to the Rayleigh Limit--one-eighth of a wavelength of sodium light, and the figure desired was a parabola.

Guess what: for a 4 1/4" diameter F-20 mirror I figured it to a SPHERE, which is well within the Rayleigh Limit for a parabola, even when using the off-axis style to avoid the diffraction from a Newtonian diagonal and its support.

Even at F-12 or thereabouts for a Newtonian style, a sphere is within the Rayleigh Limit for a little mirror, such as 6".

BTW, for observing planets, most nights the atmosphere is too turbulent to get much if any benefit from a telescope over 6" or 8". And on many nights a 60 mm lens on a good refractor actually shows better images than you get from a big scope because of the nature of atmospheric turbulence, which is (to put it mildly) complex.

Anyway, what matters and costs the most in amateur scopes is usually the stability of the mounting rather than the quality of the lenses and mirrors.

IFeelFree,
perhaps you understand this and you are making another point when you say Hubbert Linearization is an approximation but if so, I think you may have confused some readers that are less familiar with the subject.

At the risk of teaching the many Grannies that live here to suck eggs, Hubbert proposed that cumulative  production with time (Q) followed the logistic curve (or more precisely the sigmoid curve, a special case of the logistic curve). The rate of production (P) is thus the differential of this with respect to time which gives the familiar Bell shaped curve that is like a Gaussian curve but not quite the same.

Q = Qmax/(1 + exp((th - t)/k))

P = (Qmax/k) * exp((th - t)/k) / (1 + exp((th - t)/k))²

Where:-
Qmax = the ultimate cumulative production
th = the time at which half of this production is extracted
k = a constant with units of time that determines the rate of depletion, the larger k the slower the extraction.
t = time

This function gives the mathematically exact relationship
P/Q = (1/k)*(1 - Q/Qmax)
This shows that, were production to exactly follow Hubbert's curve, a plot of P/Q against Q would give a straight line sloping down to intersect the Q axis at Qmax. Cumulative production reaches Qmax/2 at time th, and this is the time of peak production.
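The linear relationship between P/Q and Q can be turned into a fitting procedure directly. The sketch below is one way to do it in plain Python, under stated assumptions: the function names are made up for illustration, the input series are synthetic values computed from the logistic formulas (Qmax = 200, th = 1970, k = 9 are hypothetical numbers, not real production data), and the fit is an ordinary least-squares line whose intercept is 1/k and whose slope is -1/(k*Qmax).

```python
import math

def logistic_cumulative(t, qmax, th, k):
    """Cumulative production Q(t) for the logistic model."""
    return qmax / (1.0 + math.exp((th - t) / k))

def logistic_rate(t, qmax, th, k):
    """Production rate P(t) = dQ/dt for the logistic model."""
    e = math.exp((th - t) / k)
    return (qmax / k) * e / (1.0 + e) ** 2

def hubbert_linearization(production, cumulative, start=0):
    """Fit P/Q = a + b*Q by least squares, using points from index `start` onward.

    Returns (qmax, k), where qmax = -a/b is the Q-axis intercept and k = 1/a.
    """
    xs = list(cumulative[start:])
    ys = [p / q for p, q in zip(production[start:], cumulative[start:])]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -intercept / slope, 1.0 / intercept

# Hypothetical pre-peak history generated from a known logistic curve.
years = range(1930, 1966)
rates = [logistic_rate(t, 200.0, 1970.0, 9.0) for t in years]
totals = [logistic_cumulative(t, 200.0, 1970.0, 9.0) for t in years]

qmax_est, k_est = hubbert_linearization(rates, totals)
print(f"estimated Qmax {qmax_est:.3f}, estimated k {k_est:.3f}")
```

The `start` argument makes the choice of where to begin the linear fit explicit. With real annual data the early points usually sit well off the line, so the result depends on that choice.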

What I don't understand is why we don't try to plot these curves (rate vs cumulative) instead of (rate/cumulative vs cumulative).  If you do it the first way, then you can see the "upside down" symmetric parabola fairly clearly, IF it is a logistic curve. However, since the data rarely shows the symmetry, I guess that's why the linearization technique is favored.  The weirdness at the start is then swept under the rug.
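For the logistic model, plotting rate against cumulative gives the parabola P = (1/k)*Q*(1 - Q/Qmax), which passes through the origin and through Q = Qmax. A minimal sketch of fitting that parabola directly, rather than the P/Q line, is shown here. The function names and the synthetic inputs (Qmax = 200, th = 1970, k = 9) are assumptions for illustration only, and the fit is a plain least-squares fit of P = a*Q + b*Q^2 with no constant term.

```python
import math

def logistic_cumulative(t, qmax, th, k):
    """Cumulative production Q(t) for the logistic model."""
    return qmax / (1.0 + math.exp((th - t) / k))

def logistic_rate(t, qmax, th, k):
    """Production rate P(t) = dQ/dt for the logistic model."""
    e = math.exp((th - t) / k)
    return (qmax / k) * e / (1.0 + e) ** 2

def fit_rate_vs_cumulative(production, cumulative):
    """Least-squares fit of P = a*Q + b*Q^2 (a parabola through the origin).

    Returns (qmax, k), where qmax = -a/b is the second zero of the parabola
    and k = 1/a.
    """
    s22 = sum(q ** 2 for q in cumulative)
    s23 = sum(q ** 3 for q in cumulative)
    s24 = sum(q ** 4 for q in cumulative)
    r1 = sum(p * q for p, q in zip(production, cumulative))
    r2 = sum(p * q ** 2 for p, q in zip(production, cumulative))
    # Solve the 2x2 normal equations by Cramer's rule.
    det = s22 * s24 - s23 * s23
    a = (r1 * s24 - r2 * s23) / det
    b = (s22 * r2 - s23 * r1) / det
    return -a / b, 1.0 / a

# Hypothetical pre-peak history generated from a known logistic curve.
years = range(1930, 1966)
rates = [logistic_rate(t, 200.0, 1970.0, 9.0) for t in years]
totals = [logistic_cumulative(t, 200.0, 1970.0, 9.0) for t in years]

qmax_est, k_est = fit_rate_vs_cumulative(rates, totals)
print(f"estimated Qmax {qmax_est:.3f}, estimated k {k_est:.3f}")
```

Because this fit weights residuals in P rather than in P/Q, misfits at small Q are not magnified by the 1/Q factor, at the cost of giving the largest-Q points the most influence.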

Plotted the first way, the majority of the curves look like this:

You can see the general shape of the upside-down parabola but since it is not symmetric, it can't be the logistic curve.

Not a bad idea, Web. Here's the US curve done that way:

Makes a nice parabola and actually this looks much better than the linearization (since it's not blowing up the misfit in the tails with a 1/Q factor). The predicted URR is within error bars of what the linearization says. I only fooled around a tiny bit, but it seems that it does ok at robust prediction too:

(BTW, as a process/communication issue I would prefer that you not accuse others of acting in bad faith ("swept under the rug"), without good evidence. It's more polite to assume that the other people simply see things differently, which is usually the case, certainly around here.)

Nice results, this representation seems to have good asymptotic properties.
I don't think so. The fit with more recent data pushes the asymmetric peak to the left giving longer tails to the right.  This will continue to happen as you get more and more data. The properties of temporal causality make it so.

I guess I am more into the understanding rather than the predictive properties.  I just don't get how this stuff works out without any kind of forcing function included.  I am categorizing this set of formulations under the heading "Immaculate Conception Hubbert Peak Analyses".  Without a forcing function, I might as well look into causes of Spontaneous Human Combustion.

A thought experiment for us to engage in. Say from now out, time=T, the quantity of discoveries followed as:
 K/(C + (time-T))

In this case, we would still have the peak centered at the same point but the URR will blow up to infinity.  Until this is discussed by someone other than me, I can say that it is "swept under the rug", which is a mildly-offensive euphemism for "ignored".  And for all I know someone has discussed this, but I don't know about it.
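The divergence in this thought experiment can be checked with a few lines of Python. Integrating K/(C + (time-T)) from time = T onward gives K*ln(1 + (time-T)/C), which grows without bound, though only logarithmically. The sketch below uses made-up values K = 1 and C = 10 (assumptions for illustration, in arbitrary units) and compares the closed form with a simple midpoint-rule integration.

```python
import math

def cumulative_discovery(horizon, K, C):
    """Closed-form integral of K/(C + s) for s from 0 to `horizon` (s = time - T)."""
    return K * math.log(1.0 + horizon / C)

def midpoint_integral(horizon, K, C, steps=100000):
    """Numerical check of the same integral using the midpoint rule."""
    h = horizon / steps
    return sum(K / (C + (i + 0.5) * h) for i in range(steps)) * h

# Hypothetical constants; the cumulative total keeps growing as the horizon extends.
K, C = 1.0, 10.0
for horizon in (10, 100, 1000, 10000, 100000):
    print(f"horizon {horizon:>6}: cumulative discoveries {cumulative_discovery(horizon, K, C):.3f}")
```

Each tenfold extension of the horizon eventually adds about K*ln(10) to the total, so there is no finite ultimate quantity for a fit to converge to.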

You might want to spend some time contemplating whether there's likely to be any relationship between the frequency with which you insult other people, and the amount of effort they are likely to put into thinking about your ideas.
I usually try to limit my insults to ideas, which last time I looked are inanimate objects and pretty immune to such things as feelings.

Oops, maybe that was an insult as well.  I will try to stifle myself, and just consider anything outside the bounds of decent behavior as my attempts at snark (people such as Michael Lynch, George W. Bush, and Michael Crichton excluded).

The second graph corresponds to a prediction in 1976. After another 30 years of evidence, it's moved by about 10%. In my book, that makes it a pretty useful prediction method (for this case - as I've discussed elsewhere I think there are a lot of caveats elsewhere). There are very few ways to predict anything in the far future that won't have moved a lot more than 10% after 30 years. A-priori, I would only have expected the logistic to be a very rough approximation to something like oil production and it still surprises me that it does as well as it does.

I agree, as I've said repeatedly, that we lack a theoretical understanding of why the US production curve is so Gaussian and that's unsatisfactory. However, I view that as an interesting challenge that we should try to solve rather than a reason to dismiss the fact that it has been so up to now (modulo some noise).

I should also point out that the shift is pretty much due to Alaska coming on line as a late chunk of discovery - the Lower 48 prediction would be significantly better I imagine (as Wes and Khebab showed for linearization a while back).
Do you know why economics textbooks usually show supply and demand curves as straight lines? They did not used to be (say prior to about 1954) shown as such but instead were often shown as rectangular hyperbolas (i.e. price elasticity of exactly one). Well, you hardly ever have a price elasticity of demand or supply of exactly one; it happens, but rarely.

It was George Stigler (I think) in his first edition (late 1940s I believe) who pointed out for the first time that BECAUSE we economists do not know the empirical shape of most supply and demand curves that we should draw them as straight lines TO EMPHASIZE THAT THEY ARE ARBITRARY AND DO NOT REFER TO THE REAL WORLD. Unfortunately, after Stigler, relatively few authors make his point, thus needlessly confusing generations of miserable and bewildered and hostile students.

HOWEVER, I shout;-)
     For a small change in price for a well-behaved supply or demand function a straight line is often a pretty darn good approximation to the real world.

It is almost never a decent approximation for a large (say more than 20%) change in price.

When I taught economics I explained these nuts and bolts, and guess what: Almost half my students got a fairly good understanding of supply and demand. In the typical introductory and even intermediate microeconomics classes at U.C., Berkeley, my guess is that fewer than ten percent grasped the most basic fundamentals of supply and demand.

Now elementary need not be hard at all. No! The problem is that most teachers of lower division classes do not give a darn about teaching and couldn't care less that 90% of the students are ignorant of fundamentals.

Grump.

Don Sailorman,
The linearization talked about here concerns a technique to turn a highly non-linear function into a straight line. It has nothing to do with small perturbations affecting linearity to the first order, as a Taylor series approximation does.

That property I do believe in but that does not influence my disagreement with the original premise of using a logistic curve or gaussian curve formulation to describe the stochastic behavior.

I understand what you are saying; no analogy is perfect. However, IMO my main point is valid and does apply.

I think (but do not know) that Stuart agrees with my line of reasoning; he has expressed it himself in somewhat different words--just a few days ago.

But for the US, didn't we hit the peak around 1970?  So we already knew the fit would work and the subsequent 30 years that have passed haven't really added much insight to the peak position.  More to the point is how the depletion tails will work out. This I think is a work in progress and something that has yet to be verified due to issues such as reserve growth and future discoveries.  And the discovery profile is something that is not included in any of these immaculate conception models.
"But for the US, didn't we hit the peak around 1970?  So we already knew the fit would work and the subsequent 30 years that have passed haven't really added much insight to the peak position"

The key point is that Hubbert, in 1956, accurately predicted the Lower 48 peak.  

What Khebab and I attempted to address was how good the HL model was at predicting post-peak cumulative production, using only Lower 48 production data through 1970.  The answer was that actual cumulative Lower 48 production was 99% of what the HL model predicted that it would be.  

Assuming that Deffeyes is right that we are past the peak of conventional crude + condensate production, the HL model should therefore offer us a very accurate prediction for post-peak world production.

I think you mean this post a week prior is when he first used it.
Ah, very astute peakguy! The phrase appears in the figure titles of that post. I didn't notice them because I was only searching on text. I will correct the story.
Prof. David Roper has studied various population and depletion modeling problems using curve fitting techniques:

Projection of World Population

Where Have All the Metals Gone? (PDF)

Depletion Theory (PDF)

Crude Oil Depletion

Other papers here.

Interesting. Roper uses a function to fit the U.S. crude oil data which is asymmetric pre/post peak. It gives a significantly better fit to the data than a simple Gaussian (which is symmetric). He uses the same type of function to fit other natural resources, including natural gas, and precious and base metals.

His asymmetric fit of world crude oil extraction data suggests that we are already past peak oil.

He's using the Verhulst model which can be seen as a generalization of the logistic model. However, there is an additional parameter that controls the curve asymmetry that is difficult to set without side information.
The Verhulst model appears to have a total of 4 free parameters. It can be fit solely by the data, without the use of "side information", but it requires non-linear regression techniques. I've programmed non-linear regression models and it's not trivial, but quite do-able.

It's just another model, of course, but if the asymmetric function gives a better fit to the data over the entire data set, and it works for many different kinds of natural resource production data, then it seems to me it's a very worthwhile approach.

Not to pound on a point but model fit is not the correct metric.  Model fit always improves if you add more parameters, but prediction may be dreadful due to overfitting.  Metrics should be based on the ability to predict data that weren't used in the model fit.
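The difference between in-sample fit and out-of-sample prediction can be illustrated with a small experiment. This is a sketch under stated assumptions, not an analysis of real production data: the series is synthetic (a quadratic trend plus seeded random noise), the models are polynomials of increasing degree rather than logistic or Verhulst curves, and all names are made up for the example. Each model is fit to the first 40 points only and then scored on the 15 points that were held back.

```python
import random

def solve_linear(a, b):
    """Solve the small dense system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest power first) via the normal equations."""
    n = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    return solve_linear(a, b)

def rss(coeffs, xs, ys):
    """Residual sum of squares of the polynomial on the given points."""
    def predict(x):
        return sum(c * x ** i for i, c in enumerate(coeffs))
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))

# Synthetic series: quadratic trend plus seeded noise (hypothetical values).
rng = random.Random(0)
xs = [(i - 20) / 20.0 for i in range(55)]
ys = [1.0 + 0.8 * x - 0.5 * x * x + rng.gauss(0.0, 0.05) for x in xs]
train_x, train_y = xs[:40], ys[:40]    # used for fitting
test_x, test_y = xs[40:], ys[40:]      # held out, never seen by the fit

results = []
for degree in range(1, 6):
    coeffs = polyfit(train_x, train_y, degree)
    results.append((degree, rss(coeffs, train_x, train_y), rss(coeffs, test_x, test_y)))

for degree, fit_err, holdout_err in results:
    print(f"degree {degree}: in-sample RSS {fit_err:.4f}, held-out RSS {holdout_err:.4f}")
```

The in-sample error can only go down as parameters are added, because each model contains the previous one as a special case. The held-out error has no such guarantee, and for extrapolation beyond the fitted range it typically gets worse once the model has more freedom than the data can pin down.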
Strictly Verhulst is using the logistic equation. He developed it and named it equation logistique in 1838 in his studies of limited population growth after reading Malthus' work.

This equation does allow asymmetrical peaks. The simplified version used by Hubbert that only allows symmetrical peaks should strictly be called the sigmoid curve although it is often referred to as the logistic curve.

Here's his fit to the world production history:

I'm pretty sceptical.  What it looks like to me is that the fit routine is using the asymmetry parameter to try to bend the logistic into the Gaussian tail shape of the data.  It sees no cost to doing this because it's unconstrained by any post-peak data.  If you look at his fit to the US data it looks more symmetric because it can't get away with that trick there to the same extent.  I bet if you did a stability analysis of how the asymmetry parameter varies with the length of history included, you'd find it wasn't stabilizing.

I agree, there are too many parameters in the Verhulst curve, making it a poor predictor. Basically, there is no way to tell from one half of a curve if the overall curve will be asymmetric. We played with this model a while ago on PO.com:

Updated Verhulst model

Where is the discovery data in these models?

How can analysts throw this information away when there is a concern over too many model parameters?  Having additional information does wonders for establishing a model's utility.

I was talking strictly about the curve fitting approach; too many parameters produce overfitted results and poor extrapolation reliability.
Hi Everyone.

This is my first time on this site. I am a PhD. in Quantum Field theory who now trades shares for a living based in Hong Kong. I have almost 10 years experience in the markets. As such I feel uniquely positioned to sympathise/understand both the approaches of the people at TOD and economists who assume higher prices will stimulate exploration/production of oil.

The economists get it wrong all the time. In this case I personally believe production will be around 85mbpd depending mostly on geopolitical events over the next couple years before we enter permanent decline. Prices will soar either soon depending on geopolitical events or in any event within 5 years. The EXACT details of exactly how many bpds and in which year is largely irrelevant and can't be known now anyway as is the ultimate price of oil (it will be a multiple of where we are today).

I do think that the "die-off" and other scenarios are a bit overdone and that the economists do have a point. When TSHTF (I learned that abbreviation here - thanks!) it will suit the politicians to "send out the cavalry". I think you will see MASSIVE efforts on GTL, Tar Sands, Ethanol and conservation etc. I think the economic environment will be really tough for 10 years but that we ultimately get through it. The seventies were rough but ultimately we survived.

I think the Chinese will lead the way with the alternatives. The government there doesn't have to face elections and can (and does) make tough decisions. They literally plan on the basis of ensuring commodities supplies 50 years into the future. They already have Sasol looking at GTL to produce 1mbpd. Feasibility to take 2 years. Scale that up in the major countries and you will have enough.

I just want to say thanks for all your efforts, especially to Stuart for helping me better understand the issues.

Regards.

Welcome! It's the first time I see a trader with a PhD in Quantum Field theory!
Welcome also. My PhD was doing simulations in lattice gauge theory, but then I went into Computer Science. So I used to know some of the same things you used to know :-)
With PRC primary energy consumption increasing at a stable 15% per year, it's going to be the other guys that have to make the tough decisions, like what to do without :-)
Of course the economists are correct that much higher prices will stimulate much more drilling for oil. Now the economists may or may not think they will get much more oil, say twice as much production from twenty times the current expenditure (in inflation-adjusted dollars).

The impression I get from most petroleum geologists who post here is that even if spending is increased 50 fold (5,000%), output probably would not go up much or for long.

Time will tell.

Pretty soon, IMO.