When I look at the NHC forecast that HO cited, they appear to have picked parameters from all over the world that have been correlated in the past with hurricane number/intensity and used those to make the prediction. They don't say so, but one suspects that the choice of model parameters is the result of computer data-mining. There are grave dangers in doing prediction by data-mining for things with high correlations. If one looks at enough candidate variables, something will look correlated by chance even if there are no true causal associations at all. To avoid this problem, it's important to do cross-validation: the samples used for finding the correlated variables and building the model must not overlap with the samples used to test the prediction skill of the resulting model.
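To make the danger concrete, here is a toy sketch in Python. It is not the forecasters' method, which I don't know. Everything in it is made up for illustration: the "hurricane activity" series and all 500 candidate predictors are pure random noise, and the names `pearson` and `mine_best_predictor` are my own. It picks the candidate that correlates best with the target over the first half of the years, then checks that same candidate on the held-out second half.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def mine_best_predictor(n_years=40, n_candidates=500, seed=0):
    """Data-mine noise for a 'predictor', then test it out of sample.

    Returns (r_train, r_test): the correlation of the selected candidate
    with the target on the selection years and on the held-out years.
    """
    rng = random.Random(seed)

    # Hypothetical target (e.g. yearly hurricane activity): pure noise.
    target = [rng.gauss(0, 1) for _ in range(n_years)]

    # Hypothetical candidate predictors: also pure noise, so no candidate
    # has any real relationship with the target.
    candidates = [
        [rng.gauss(0, 1) for _ in range(n_years)]
        for _ in range(n_candidates)
    ]

    half = n_years // 2
    train = slice(0, half)        # years used to pick the predictor
    test = slice(half, n_years)   # years held back, never used for picking

    # The data-mining step: keep whichever candidate looks best in training.
    best = max(candidates, key=lambda c: abs(pearson(c[train], target[train])))

    r_train = pearson(best[train], target[train])
    r_test = pearson(best[test], target[test])
    return r_train, r_test


if __name__ == "__main__":
    for seed in range(5):
        r_train, r_test = mine_best_predictor(seed=seed)
        print(f"seed {seed}: in-sample r = {r_train:+.2f}, "
              f"held-out r = {r_test:+.2f}")
```

With this many candidates and only 20 selection years, the winner will typically show a strong in-sample correlation even though nothing real is there. The held-out correlation is the honest number, and it only stays honest if those years played no part in choosing the variable.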

It's not obvious to me that these guys understand the dangers here. But maybe they do and I just don't have enough detail on their methodology. Does anyone have more information on it?

Hmmm. In December 2004, they predicted a 69% chance of a Cat 3-4-5 hurricane hitting the US coast in 2005. We got Dennis, Katrina, Rita, and Wilma. Wilma had the lowest central pressure ever recorded in an Atlantic hurricane, and Katrina caused the most monetary damage of any Atlantic hurricane.

In December 2003, they predicted a 68% probability of a Cat 3-4-5 hurricane hitting the US coast. We got Charley, Ivan, and Jeanne. (Frances was only a Cat 2 in Florida, though she had hit the Bahamas as a Cat 3.)

So, when they say the probability of a Cat 3-4-5 strike on the US is 81% this year (versus 69% in 2005 and 68% in 2004), you should probably quake in fear :-).