Wednesday, April 30, 2008

Satanic statistics?

Over on Big News, Dave Crampton echoes the Kiwi Party in querying the Government Statistician's analysis of signatures in the childbeating petition. The Government Statistician checked 29,501, and found that 25,754 (87.3%) were valid. Multiply that proportion by the 324,511 signatures submitted and you get 283,294 - just 1,733 signatures short of the number required. So why does the Government Statistician say they need 18,000 more signatures? An evil plot to subvert god's will and prevent spanking through Satanic statistics?

No. Instead, it's about the duplicates. The Government Statistician found 160 multiple signatures in the sample - 158 duplicates and 2 triplicates. The sample was 1/11th of the total, so this suggests that there will be a further 160 x 10 = 1,600 replicates in the sample where the other match is in the rest of the population - and therefore a further 1,600 x 10 = 16,000 hidden replicates in the population as a whole. Which pretty clearly gets us in the right ballpark. The problem is slightly more complicated than that, since signatures can be both invalid and duplicated; statisticians have a number of different ways of estimating this, but that's about the stage I start seeing tentacles. The important thing is that this is not a satanic statistical plot, but a problem of childbeaters being idiots who think that signing a petition multiple times helps their cause. We can only hope that they don't think the same about voting.
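For anyone who wants to check the arithmetic, here is a rough sketch of the back-of-envelope reasoning above in Python. It is not the Government Statistician's actual method. The variable names are mine, the required total of 285,027 is simply what the figures above imply (283,294 + 1,733), and knocking one valid signature off for every hidden replicate is a simplifying assumption that ignores signatures which are both invalid and duplicated.

```python
# Figures quoted in the post
sample_size = 29_501
valid_in_sample = 25_754
total_signatures = 324_511
multiples_in_sample = 160  # 158 duplicates + 2 triplicates

# Implied by the post (283,294 + 1,733); not quoted directly
required = 285_027

# Naive estimate: scale the valid proportion up to the whole petition
naive_valid = round(valid_in_sample * total_signatures / sample_size)
naive_shortfall = required - naive_valid

# The sample is 1/11th of the petition, so for every sampled
# signature there are 10 unsampled ones
unsampled_per_sampled = total_signatures // sample_size - 1

# The post's heuristic: scale the within-sample replicates by 10 to get
# sample signatures matched outside the sample, then by 10 again to get
# hidden replicates across the whole petition
cross_matches = multiples_in_sample * unsampled_per_sampled
hidden_replicates = cross_matches * unsampled_per_sampled

# Assumption: each hidden replicate removes one otherwise valid signature
adjusted_valid = naive_valid - hidden_replicates
adjusted_shortfall = required - adjusted_valid

print(f"naive valid estimate:  {naive_valid:,}")
print(f"naive shortfall:       {naive_shortfall:,}")
print(f"hidden replicates:     {hidden_replicates:,}")
print(f"adjusted shortfall:    {adjusted_shortfall:,}")
```

Under those assumptions the shortfall comes out at 17,733, which is about the 18,000 the Government Statistician is asking for.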

(With thanks to Mary Whiteside & Mark Eakin, A Better Estimate of the Number of Valid Signatures on a Petition [PDF]).

Update: After further puzzled emails, I've added a sentence there which makes it clearer what is going on. There are further useful comments from Graeme Edgeler at Big News.