Bastrop State Park, four years later

Four years ago I wrote about the wildfires in Bastrop, Texas. Here’s a photo from the time by Kerri West, used by permission.

Today I visited Bastrop State Park on the way home from Austin. Some trees, particularly oaks, survived the fires. Pines have come back on their own in parts of the park. A volunteer working in the park told me that some of these new trees are 10 feet tall, though I didn’t see these myself. In other parts volunteers have planted pines. Here’s a photo I took this morning.

Most of the new growth in the forest is underbrush, in some places thicker than in the photo above. The volunteer I spoke with said that the park is already planning prescribed burns in some areas to clear the underbrush and protect the viable trees.

Interpreting scientific literature about your product

A medical device company approached me with the following problem. Scientists had written academic journal articles about their product, but the sales force couldn’t understand what they said. My task was to read the articles, then tell the people in sales what the articles were saying in layman’s terms.

One of the questions that came up was how to compare two studies with different sample sizes. Of course there are many factors involved, but I said that as a general rule of thumb, a study with four times the sample size will give confidence intervals that are half as wide, since the width of an interval typically shrinks in proportion to the square root of the sample size. They loved that. In the midst of what to them was a sea of statistical mumbo jumbo, here was something they could grab onto. I also pointed out a few things I thought doctors would want to hear and two or three buzzwords the salespeople should learn.
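Here is a minimal sketch of that rule of thumb in Python. It assumes the simplest textbook case, a normal-approximation interval for a mean with a known standard deviation. The function name `ci_width` and the numbers plugged in are made up for illustration and don't come from the client's studies.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(sigma, n, level=0.95):
    """Width of a normal-approximation confidence interval for a mean.

    Assumes a known standard deviation sigma and n independent samples.
    """
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return 2 * z * sigma / sqrt(n)


# Hypothetical studies: same variability, one with four times the sample size.
w_small = ci_width(sigma=1.0, n=100)
w_large = ci_width(sigma=1.0, n=400)

# The ratio depends only on the sample sizes: sqrt(100 / 400) = 1/2.
ratio = w_large / w_small
print(f"n=100: {w_small:.3f}, n=400: {w_large:.3f}, ratio: {ratio:.2f}")
```

The standard deviation and the confidence level cancel in the ratio, which is why the rule can be stated without knowing anything else about the two studies. Real studies differ in more than sample size, so this is only a first approximation.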

The scientific literature on their product was favorable, but the company was not able to convey this because the sales reps didn’t have the words to use. I gave them the words by translating scientific jargon into simple language.

If you’d like me to give your sales team the words they need, please contact me.

Skin in the game for observational studies

The article Deming, data and observational studies by S. Stanley Young and Alan Karr opens with

Any claim coming from an observational study is most likely to be wrong.

They back up this assertion with data about observational studies later contradicted by prospective studies.

Much has been said lately about the assertion that most published results are false, particularly observational studies in medicine, and I won’t rehash that discussion here. Instead I want to cut to the process Young and Karr propose for improving the quality of observational studies. They summarize their proposal as follows.

The main technical idea is to split the data into two data sets, a modelling data set and a holdout data set. The main operational idea is to require the journal to accept or reject the paper based on an analysis of the modelling data set without knowing the results of applying the methods used for the modelling set on the holdout set and to publish an addendum to the paper giving the results of the analysis of the holdout set.

They then describe an eight-step process in detail. One step is that cleaning the data and dividing it into a modelling set and a holdout set would be done by different people than those doing the modelling and analysis. They then explain why this would lead to more truthful publications.

The holdout set is the key. Both the author and the journal know there is a sword of Damocles over their heads. Both stand to be embarrassed if the holdout set does not support the original claims of the author.
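The split at the heart of this proposal is easy to sketch. The Python below is my own illustration, not code from Young and Karr: the function name `split_holdout`, the 30% holdout fraction, and the seed are all assumptions, since the quoted passages don't say how the data should be divided.

```python
import random


def split_holdout(records, holdout_fraction=0.5, seed=0):
    """Randomly partition records into (modelling, holdout) lists.

    Illustrative only: the fraction and seed are assumptions,
    not values taken from the article.
    """
    if not 0 < holdout_fraction < 1:
        raise ValueError("holdout_fraction must be strictly between 0 and 1")
    shuffled = list(records)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


# Hypothetical data set of 1000 record IDs.
records = list(range(1000))
modelling, holdout = split_holdout(records, holdout_fraction=0.3, seed=42)

# Every record lands in exactly one of the two sets.
print(len(modelling), len(holdout))
```

In the process described above, someone other than the analyst would run this step and keep the holdout set out of the analyst's reach until the journal has made its decision.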

* * *

The full title of the article is Deming, data and observational studies: A process out of control and needing fixing. It appeared in the September 2011 issue of Significance.

Update: The article can be found here.

Intellectual property is hard to steal

It’s hard to transfer intellectual property. When I was managing software projects, it would take months to fully transfer a project from one person to another. This was with full access to and encouragement from the original developer. This was a transfer between peers, both part of the same environment with all its institutional memory. If it’s this hard to transfer a project to a colleague, how hard must it be for a competitor to make sense of stolen files?

I’m most familiar with intellectual property in the form of software, but I imagine the same applies to many other forms of intellectual property. Some forms of data are easy to understand, such as a list of passwords. But others, such as source code, require a large amount of context beyond the data. One reason acquisitions fail so often is that the physical assets of a company are not enough. The most valuable assets a company has are often intangible.

Of course companies should protect their intellectual property, but a breach is not necessarily a disaster. On the other hand, the loss of institutional memory may be a disaster.