Thursday, December 20, 2012

The Pest

When I was 22 I worked for a year as a secretary in the emergency room of a Boston psychiatric hospital. There was a patient there who used to come in and bother me. He would talk about nonsense when I was trying to do my work and sometimes he would even sit on the corner of my desk. He did this on purpose. He knew he was a nuisance, and he thought it was funny. He even told me once with some pride that a woman on another floor just called him "The Pest".

When I complained to my friend about what a pain in the neck he was, she told me more about him. She said that he lived in a group home now, and pretty much had his illness under control, taking a pill that tethered him to the reality that the rest of us lived in. It was an improvement that was hard won. When he was younger he had spent about a decade in a secure mental hospital because he had killed his three-year-old son.

The Pest wasn't evil. This seems like an impossible thing to say about a man who killed any child, especially his own. But I met him and I believe unequivocally that in his heart he loved his son absolutely as much as I love mine. But reality became filtered through a fractured prism inside his head, so much so that he listened to a horrible voice that did not exist and instead of showing his son love he murdered him.

The Pest is different than most mentally ill people. Most mentally ill people do not hurt anyone. In general people with mental illness are more likely to be the victims of crime than the perpetrator, and most people who are hurt are the victims of someone who is perfectly sane.

But people like the Pest do exist, people who are so sick that they become so genuinely disconnected from this world that they are at once capable of nothing and anything. In our daily lives they are invisible. Our neighbors who suffer from illness are sequestered in psychiatric hospitals and group homes. People that live among us do not discuss their problems for fear of being stigmatized. We have sterilized mental illness from our society so much that the average person does not even seem to know what real mental illness looks like.

We can see this in the way that the truly common human experiences associated with mental illness have become novel. Events like a celebrity having a very mundane manic episode become objects of absolute fascination. And so mental illness becomes an abstraction, a position on a continuum with normal instead of, sometimes, a place on a different plane altogether. And we search for motives and reasons why someone behaved like they did when there is no reason why. There is just a mess.

I was thinking about this this week. When we become less paralyzed by our sadness, we will as a country begin the process of making laws to make our children safer. When we do this, I hope we do not begin with the belief that the laws will be followed by the good and broken by the bad. This binary understanding is wrong. There is another group of people who are capable of doing neither, and the laws need to account for them too, knowing that these people cannot be easily sorted.

This protects not only our children but also people like the Pest, who would give anything now not to have killed his son.

Sunday, September 30, 2012

Why Rat Studies Won't Prove the Safety Of GM Food

There was a recent study that came out showing that GM food causes cancer. It was pretty outlandish and was quickly and correctly dismissed based on dodgy stats/design, what have you.

But while this particular study didn't prove the harm of GM food, is it possible to use a rat study to prove the safety of a GM food?

I don't think so. The level of risk that people consider acceptable is too small to be feasibly assessed with a rat study.

Because GM foods aren't labeled in America a rough estimate of the number of people who eat GM food is everyone, so about 300 million people. About 41% of Americans get cancer in their lifetime according to the first response on google, which we'll out of laziness assume is right. If that risk increased to 42% that would represent an extra 3 million people getting cancer. Surely something that caused 3 million extra people to get cancer would clearly be an unacceptable safety risk by any standard.

If we pretend that rats get cancer and respond to GM food in the same way as humans, how large a sample is required to assess a cancer risk that goes from 41% to 42%? Well, let's look at it in a chi square test. If you observed those ratios with 10 thousand individuals in each group is that significant? (Excuse the cheesy formatting, also out of laziness:)

              GM    No GM    Total
Cancer      4200     4100     8300
No Cancer   5800     5900    11700
Total      10000    10000    20000

By a simple two tailed chi square test this still isn't significant (p=0.15). So even a study with tens of thousands of rats really wouldn't be able to see a small but meaningful increase in cancer risk, which, I think, is exactly the sort of cancer risk that people who are afraid of GM food are afraid of.
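For anyone who wants to check that p-value, here is a minimal Python sketch of the test on the table above. It assumes a plain Pearson chi square test with one degree of freedom and no continuity correction (the post does not say which variant was used), and the function name `chi_square_2x2` is just an illustrative choice.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and two-tailed p-value for the table
    [[a, b], [c, d]], with 1 degree of freedom and no continuity correction."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: cancer / no cancer. Columns: GM / no GM. 10,000 rats per group.
chi2, p_value = chi_square_2x2(4200, 4100, 5800, 5900)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2f}")  # chi2 = 2.06, p = 0.15
```

Using a continuity correction nudges the p-value up slightly, so the conclusion does not change.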

So while those in favor of GM food are complaining there weren't enough rats to prove harm in the study that came out last week, it's useful to remember that there will likely never be enough rats to prove safety because rat studies with tens of thousands of rats just aren't normally done. Realistically, rat studies, which usually only have hundreds not thousands of rats, are only able to see things that are quite, quite bad. Or, as the statisticians say, have large effect sizes. Studies that don't find harm should be interpreted as failing to find an increased risk larger than X, with X being whatever percentage increase they would have the power to detect. With a hundred rats in each group findings become significant when the risk goes from about 41% to 55%, but less dramatic changes will not be significant.
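The 41% to 55% figure can be found by brute force. The sketch below is only an illustration under the same assumptions as the test above (Pearson chi square, one degree of freedom, no continuity correction, a 0.05 cutoff); the helper names `chi_square_p` and `smallest_detectable` are made up for this example.

```python
import math


def chi_square_p(cancer_a, cancer_b, n_per_group):
    """Two-tailed p-value (Pearson chi-square, 1 df, no continuity correction)
    comparing cancer counts in two groups of equal size."""
    a, b = cancer_a, cancer_b
    c, d = n_per_group - a, n_per_group - b
    n = 2 * n_per_group
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * n_per_group ** 2)
    return math.erfc(math.sqrt(chi2 / 2))


def smallest_detectable(control_cases, n_per_group, alpha=0.05):
    """Smallest number of cancers in the treated group that tests as
    significantly higher than the control group, or None if there is none."""
    for treated_cases in range(control_cases + 1, n_per_group + 1):
        if chi_square_p(treated_cases, control_cases, n_per_group) < alpha:
            return treated_cases
    return None


# 100 rats per group, 41 of the controls get cancer.
threshold = smallest_detectable(control_cases=41, n_per_group=100)
print(threshold)  # 55
```

With 54 cancers in the treated group the p-value is still a little above 0.05, so 55 is the first count that crosses the line.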

Does this mean GM food causes cancer? No. I can't think of any reason why it would.

But the math is what it is.

I think this is also why we see toxicology studies with tumor-prone rats. If you, for instance, have 100 rats in each group and 10 of your controls get cancer and 15 of your GM food rats get cancer then that isn't significant. But if you expect 30 and see 45 that is significant, even though both are a 1.5X increase in cancer. If the tumor-prone rats are still a good model for humans you increase your statistical power by using cancer-prone rats. This is probably also why they use massive doses of whatever is being tested.
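Here is a quick sketch of that comparison, assuming 100 rats per group and the same uncorrected Pearson chi square test with a 0.05 cutoff. The helper name `chi_square_p` is hypothetical, not from any particular library.

```python
import math


def chi_square_p(cancer_a, cancer_b, n_per_group):
    """Two-tailed p-value (Pearson chi-square, 1 df, no continuity correction)
    comparing cancer counts in two groups of equal size."""
    a, b = cancer_a, cancer_b
    c, d = n_per_group - a, n_per_group - b
    n = 2 * n_per_group
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * n_per_group ** 2)
    return math.erfc(math.sqrt(chi2 / 2))


# Normal rats: 10 cancers in controls versus 15 in the GM group.
p_normal_rats = chi_square_p(15, 10, 100)

# Tumor-prone rats: 30 cancers in controls versus 45 in the GM group.
p_tumor_prone_rats = chi_square_p(45, 30, 100)

# Same 1.5X relative increase, but only the second comparison falls below 0.05.
print(f"normal rats:      p = {p_normal_rats:.3f}")
print(f"tumor-prone rats: p = {p_tumor_prone_rats:.3f}")
```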

And so you have a Catch-22 that favors toxins. If you design your experiment "properly" with comparable doses and normal rats you will miss huge risks using the number of rats that are feasible. If you rejigger your experiment to get more power the toxin's proponents will argue that your experiment wasn't done properly, but they are unlikely to pony up the money to buy thirty thousand rats.

Monday, August 6, 2012

Choosing to be Excited

I woke up this morning with a knot in my stomach.

Tomorrow is my PhD defense, the culmination of five years of work. I will speak in front of dozens of people for an hour, and then take a two hour oral exam. It will be grueling. It is supposed to be.

There are lots of people in this world who choose paths that put them in situations where they are under a lot of pressure. A few days ago I heard someone talking on the radio about the athletes, a lot of them kids, who are in the Olympics. He said that all of them experience tension: the physiological symptoms of the knots in the stomach, the sleepless nights.

Some of them call these feelings anxiety. Others call these feelings excitement.

So when I got up this morning and my son looked up at me with his big blue eyes and said, "Are you excited about tomorrow?" I gave him a hug and said, "Yes!"

I'm not anxious.

I'm excited.

I choose to be.


Saturday, June 30, 2012

The Science of Potted Plants and Hermann Hesse

On the recommendation of one of the professors in our department, I have been reading Hermann Hesse's novel Steppenwolf. As far as I can tell, it is about a man who feels that he has a dual personality. He is at once a wild wolf of the steppes, and at the same time an intellectual man who is submitting to the confines of bourgeois conventions.

I admit that I don't get very much of it. Perhaps I am not bohemian or intellectual enough. But there is one bit of symbolism that I think I've finally wrapped my head around.

The man in the story, Harry, lives in a rooming house and a rather large point is made of the fact that there is an araucaria plant on the staircase. I was a bit confused by the specific and repeated use of the araucaria as it is not a particularly common or remarkable houseplant. An araucaria is one of those pine trees that people grow as houseplants. They are the ones sold in grocery stores at Christmas to people who want a live tree but maybe not a big one. We had one around here for a while. It's not a very nice looking plant, so we put it in the back room until finally it degraded to a condition worthy only of the compost heap. Here's a picture of one, in better shape than mine.

The plant on the steps in Steppenwolf was prized. It was well-tended and dotingly cared for with all of its branches meticulously dusted.

However, when grown inside an araucaria will never grow more than five or six feet tall. By contrast, an araucaria in the wild is known as a Norfolk Island Pine, and in its native warm climate it can reach up to 200 feet, a really truly massive tree.

And so it was with the Steppenwolf. He felt he was truly a wolf, stifled by the bourgeois environment he was stuck in. And the araucaria is truly a Norfolk Island Pine, stifled by the bourgeois environment it was stuck in.

I didn't understand this meaning of the araucaria at first. But I had a eureka moment when I read this article about a study that used MRIs to examine how plants grow. They tested many different species of plants. Initially, the plants send roots out as far as they can go. But then, when they hit the sides of the pot, the plant will notice that there is a problem and send a signal to the rest of the plant that tells it to stop growing. This is how the araucaria on the Steppenwolf's staircase knew that it would never be able to reach its 200 foot height. It wasn't because there was not enough light or water. The confined araucaria wouldn't even try.

Hermann Hesse of course knew this in the 1920s when he wrote Steppenwolf, without any need for an MRI. And I was pretty chuffed with myself when I worked it out. (It's a pretty difficult novel.) Hesse himself has said that Steppenwolf is often misunderstood. I guess I am ahead of the game, then, because it is better to not understand something at all rather than to misunderstand it. But I think I finally caught on to this part. Unless I misunderstood it...Hmmmm...

Monday, June 25, 2012

How Underpowered Experiments Make Good Methods Look Bad

In biology we often process large datasets to identify things that meet certain criteria. These things could be anything: genes that are differentially expressed, nucleotides that are mutated, regions of the genome that are missing DNA. In many cases, there is more than one method available to do the same thing. And often times any two methods will produce a different set of "things" that they identify as meeting the same criteria.

This can be a source of consternation for people using the methods. For example, if a biologist has an experiment measuring differential gene expression he is not going to be too happy if he applies two different methods to the same dataset and gets two very different lists of differentially expressed genes. He will naturally assume there is something wrong with one of the methods, shake his fist at his computer, feel that these bioinformaticists have let him down, and perhaps post a pithy Facebook status bemoaning how his life had not turned out as he had hoped.

However, it need not be the case that either of the methods is incorrect. As a first pass, we can define "correct" simply by saying that the method is identifying genes that are differentially expressed with a false positive rate equal to the calculated p-value. That is, if there are 10,000 genes being tested, and the biologist applies Method A to his dataset using a p-value cutoff of 0.01, then he will identify some number of truly differentially expressed genes as well as roughly 100 genes that are identified incorrectly. These proportions can be confirmed in validation experiments, and he can use this metric to assess whether the method is "correct".
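The figure of about 100 false positives is just the number of genes times the cutoff. The small simulation below illustrates it under the simplifying assumption that nearly all 10,000 genes are truly unchanged, so that their p-values are uniform between 0 and 1. The function name `count_false_positives` and the seed are arbitrary choices for this sketch.

```python
import random


def count_false_positives(n_null_genes, cutoff, seed=0):
    """Draw one uniform p-value per unchanged gene and count how many fall
    below the cutoff purely by chance."""
    rng = random.Random(seed)
    return sum(rng.random() < cutoff for _ in range(n_null_genes))


n_genes = 10_000
cutoff = 0.01

expected = n_genes * cutoff  # 100 on average
observed = count_false_positives(n_genes, cutoff)

print(f"expected about {expected:.0f}, observed {observed} in this run")
```

The observed count bounces around 100 from run to run, which is why the definition of "correct" is about the rate and not an exact number.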

So imagine that the biologist applied both Method A and Method B to his dataset, obtained two very different lists of differentially expressed genes, and then confirmed with validation experiments that both methods are correct by this definition. How can this be? It may simply be that the biologist just did not give the methods enough data to work with.

With any method, you will identify some differentially expressed genes, but you may not have enough power to identify ALL differentially expressed genes. The data may only have enough inherent information in it to identify about a third of the differentially expressed genes. To get at this third the method may have to make certain assumptions about the data. For instance, it may assume that the data is distributed according to a certain shape or that the variance of the data exists within certain boundaries. These assumptions may be adequate to describe the data and to identify differentially expressed genes, but Method A and Method B may use different assumptions. The differences in these assumptions can result in different gene lists.




If the biologist had asked his bioinformaticist friends ahead of time how to design his experiment (and bioinformaticists always welcome the opportunity to boss wet lab people around in these sorts of matters) the bioinformaticist could have run some sort of power analysis for him (or directed him to our soon-to-be-released software to use himself). If there was enough money, the biologist would have been told to run a greater number of replicates. Then, both methods, having more information, should (from first principles) perform better, identifying a larger portion of the truly differentially expressed genes. Since the number of incorrectly identified genes will remain constant if you use the same p-value cutoff, both methods will have to converge on a similar list.
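A toy simulation can make the convergence concrete. This is only a sketch under loud assumptions that are not in the argument above: 1,000 of the 10,000 genes are truly differentially expressed, each method finds each true gene with probability equal to its power, each method flags each unchanged gene with probability equal to the p-value cutoff, and the two methods behave independently of each other. Real methods run on the same data will be correlated, so this exaggerates the disagreement. All names here (`call_list`, `overlap`) are hypothetical.

```python
import random


def call_list(true_genes, null_genes, power, cutoff, rng):
    """One method's gene list: true genes found with probability `power`,
    unchanged genes flagged by chance with probability `cutoff`."""
    calls = {gene for gene in true_genes if rng.random() < power}
    calls |= {gene for gene in null_genes if rng.random() < cutoff}
    return calls


def overlap(power, n_true=1000, n_null=9000, cutoff=0.01, seed=1):
    """Fraction of all called genes that both methods agree on
    (size of the intersection divided by size of the union)."""
    rng = random.Random(seed)
    true_genes = range(n_true)
    null_genes = range(n_true, n_true + n_null)
    list_a = call_list(true_genes, null_genes, power, cutoff, rng)
    list_b = call_list(true_genes, null_genes, power, cutoff, rng)
    return len(list_a & list_b) / len(list_a | list_b)


# Underpowered: each method only finds about a third of the true genes.
low_power_overlap = overlap(power=0.33)

# Well powered: more replicates, each method finds nearly all of them.
high_power_overlap = overlap(power=0.95)

print(f"overlap at 33% power: {low_power_overlap:.2f}")
print(f"overlap at 95% power: {high_power_overlap:.2f}")
```

The number of false positives per list stays at roughly 90 in both cases. Only the number of true genes found changes, and that is what pulls the two lists together.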



I think that, while this effect is obvious when you think about it, it is also somewhat counter-intuitive to say that two different answers to the same question are both correct. For this reason, this effect is not always in the front of people's minds when they are confronted with two different methods yielding two different result sets. Instead it seems that this kind of discordance is blamed on inadequacies in the methods, when really it's caused by inadequacies in the data.

Now of course, any two methods, even if they are correct, usually aren't truly equal. Method A and Method B may achieve a different power to identify genes, introduce different biases in the call sets, or may work better with some types of data than others. All of these things should be considered in choosing a statistical method.

But first and foremost, you need to make sure that your experiment will have adequate power to achieve your experimental goals. As an added side effect this can also sometimes make the experiment more robust to the biases of the statistical method that is used to analyze the data.

In the end, in the example above, the biologist shaking his fist at the computer screen really had only himself to blame.

He should probably update his Facebook status.


Monday, May 21, 2012

Birds That I Will Mock You If You Cannot Identify

I have had the good fortune of having an aunt who became exasperated when we were ignorant of birding and gardening basics and taught us the difference between a starling and a blue jay. It has come to my attention that not everyone has had the advantage of this education. This has occurred mostly via Twitter when intelligent people have posted pictures asking for help identifying the obscure species of "robin" and "daffodil".

Thus, I present to you, a compendium of photos of Birds That I Will Mock You If You Cannot Identify. (I'll post flowers in a follow-up post.)

As it happened, and thanks to the google algorithm, all of the photos are from my google+ friend and Michael's brother Thomas Stromberg who has assembled a good collection, except for the starling which is from the Cornell Ornithology lab. Thomas's only starling was some fancy west coast thing...

Don't worry. There will be no test. Well, there will, but it's a low stakes pass/fail where the only consequence of a fail is me snickering quietly under my breath.

Sooooo...here we go.

ROBIN - red breast, with the red breast.



STARLING - this is an invasive species that started its invasion in Central Park in New York with only about a hundred birds. The sun reflects off of its feathers and gives them an iridescent shine. It's considered a nuisance bird but don't tell my son because he thinks they're pretty.



SPARROW - cute little things


BLUE JAY - nasty things that fight everything at the bird feeder



CARDINAL - male - red like a cardinal's robes



CARDINAL - female, less fashion forward than her husband whom she secretly finds flamboyant


MOURNING DOVE - like a pigeon, but more brown. It's better known by its call, which you've surely heard on summer evenings. It goes Cooo Ahhhh, Cooo, Cooo, Cooo....


And that's that. There are loads of other species but these are the most common in the bird feeder, besides the pigeons which make a huge mess and are best chased off ASAP.





Sunday, May 13, 2012

Hortense and Paul

My opinion about the controversial Time magazine cover with the mother breastfeeding her child is that it just wasn't very good. It was designed to shock, but didn't do anything else. It's boring.

Here is something to cleanse your eyes with.

This is Cezanne's Hortense Breastfeeding Paul. I love it.

Hortense looks so glamorous. It makes me think that she used to be a party girl, and would have collapsed this way after many evenings out with friends. But now she is collapsed with her new little friend, cozy and attached, exhausted but content without needing to make a point saying she is so. She's beautiful, naturally sexy without needing to be sexualized. She is the artist's wife (or maybe she was his mistress then - they had a rocky relationship) and she is seen as he saw her, lovingly. There is nothing cloying about it, nothing artificial or staged.

Fantastic.