A few months ago, some researchers found that women who ate a lot of breakfast cereal were more likely to give birth to boys. It got a brief mention in the media and the blogosphere, and a lot of people questioned how that could be possible.
It turns out that it was a basic methodological error.
The authors had performed 132 tests on the same data set, and by doing so, had increased the odds that at least one of them would turn up a strange result purely by chance. You can read more about it here.
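To get a feel for how much 132 tests inflate the odds, here is a rough sketch. It assumes each test uses the conventional 5% significance threshold and that the tests are independent of one another; neither assumption comes from the study itself, and the function name is just for illustration.

```python
def chance_of_at_least_one_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `num_tests` independent tests
    comes up "significant" by chance alone.

    Assumptions (not from the study): every test uses the same
    threshold `alpha`, and the tests are independent.
    """
    # Chance that a single test does NOT give a false positive: 1 - alpha.
    # Chance that none of the tests do: (1 - alpha) ** num_tests.
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    for n in (1, 10, 132):
        p = chance_of_at_least_one_false_positive(n)
        print(f"{n:>3} tests -> {p:.1%} chance of at least one fluke")
```

Under those assumptions, a single test has a 5% chance of a fluke, but with 132 tests the chance of at least one fluke is above 99%. Tests run on the same data are rarely fully independent, so the real figure would differ, but the direction of the effect is the same.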
Statistics is a funny thing. And it matters.

DNA's Dirty Little Secret
A man named John Puckett is arrested for a murder committed decades earlier. He has no known connection to the victim apart from the DNA in her mouth, which was discovered to match his after law enforcement did a database search. The jury is told that the chance of a false match is less than one in a million.
Typically, law enforcement and prosecutors rely on FBI estimates for the rarity of a given DNA profile—a figure can be as remote as one in many trillions when investigators have all thirteen markers to work with. In Puckett’s case, where there were only five and a half markers available, the San Francisco crime lab put the figure at one in 1.1 million—still remote enough to erase any reasonable doubt of his guilt. The problem is that, according to most scientists, this statistic is only relevant when DNA material is used to link a crime directly to a suspect identified through eyewitness testimony or other evidence. In cases where a suspect is found by searching through large databases, the chances of accidentally hitting on the wrong person are orders of magnitude higher.
In fact, due to the age of the DNA evidence, and the method used to identify Puckett--searching a database for matches, rather than testing people already connected to the crime--the actual odds of a false match were closer to one in three.
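The arithmetic behind a jump like that can be sketched in a few lines. The one in 1.1 million figure is from the passage above; the database size of 338,000 profiles is an assumed figure used here only for illustration, and the sketch treats every profile in the database as an independent, unrelated comparison.

```python
def expected_coincidental_matches(random_match_probability: float, database_size: int) -> float:
    """Average number of innocent profiles that would match by chance
    when the evidence is compared against the whole database."""
    return random_match_probability * database_size


def chance_of_any_coincidental_match(random_match_probability: float, database_size: int) -> float:
    """Probability that at least one innocent profile matches by chance,
    assuming each comparison is independent."""
    return 1.0 - (1.0 - random_match_probability) ** database_size


if __name__ == "__main__":
    p = 1 / 1_100_000        # lab's figure for a single comparison (from the text)
    db_size = 338_000        # ASSUMED database size, for illustration only

    print(f"Expected chance matches: {expected_coincidental_matches(p, db_size):.2f}")
    print(f"Chance of at least one:  {chance_of_any_coincidental_match(p, db_size):.1%}")
```

With those inputs the expected number of coincidental matches is about 0.31, which is roughly one in three, and the chance of at least one coincidental match is about 26%. The point is not the exact number but that a one-in-a-million figure for a single comparison stops being remote once it is multiplied across hundreds of thousands of comparisons.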
Puckett's lawyers, one of whom has a master's degree in biology and molecular genetics, are barred from arguing this in court; the judge calls it "essentially irrelevant." The jury never hears the corrected statistic, never hears about the high rate of coincidental matches in database searches, and never even learns that Puckett is identified through a cold hit. This is not unusual. Similar things happen in courts across the country.
Puckett is convicted for the murder.
Meanwhile, the agencies in charge of these databases, including the FBI, actively block academics and defense counsel from access to them, making it difficult to research the true odds of matches. The FBI cites privacy concerns, but many researchers believe the real reason is that such research might undermine faith in database matches that rely on the FBI's figures.
The article has much more and is worth reading.