Galen and clinical trials

Here’s a quote from the Greek physician Galen (c. 130-210 A.D.):

All who drink of this remedy recover in a short time, except those whom it does not help, who all die. Therefore, it is obvious that it fails only in incurable cases.

Imagine a dialog between Galen and a modern statistician.

Stat: You say your new treatment is better than the previous one?

Galen: Yes.

Stat: But more people died on the new treatment.

Galen: Those patients don’t count because they were incurable. They would have died anyway.

The problem with Galen’s line of reasoning is that it is not falsifiable: no experiment could disprove it. He could call any treatment superior by claiming that evidence against it doesn’t count. Still, Galen might have been right.

Now suppose our statistician has a long talk with Galen and tells him about modern statistical technique.

Galen: Can’t you look back at my notes and see whether there was something different about the patients who didn’t respond to the new treatment? There’s got to be some explanation. Maybe my new treatment isn’t better for everyone, but there must be a group for whom it’s better.

Stat: Well, that’s tricky business. Advocates call that “subset analysis.” Critics call it “data dredging.” The problem is that the more clever you are with generating after-the-fact explanations, the more likely you’ll come up with one that seems true but isn’t.

Galen: I’ll have to think about that one. What do you propose we do?

Stat: We’ll have to do a randomized experiment. When each patient arrives, we’ll flip a coin to decide whether to give them the old or the new treatment. That way we expect about the same number of incurable patients to receive each treatment.

Galen: But the new treatment is better. Why should I give half my patients the worse treatment?

Stat: We don’t really know that the new treatment is better. Maybe it’s not. A randomized experiment will give us more confidence one way or another.

Galen: But couldn’t we be unlucky and assign more incurable patients to the better treatment?

Stat: Yes, that’s possible. But it’s not likely we will assign too many more incurable patients to either treatment. That’s just a chance we’ll have to take.

The issues in these imaginary dialogs come up all the time. There are people who believe their treatment is superior despite evidence to the contrary. But sometimes they’re right. New treatments are often tested on patients with poor prognosis, so complaints about receiving more incurable patients can be justified. And yet until there’s some evidence that a new treatment may be at least as good as the standard one, it’s unethical to give that treatment to patients with better prognosis. Sometimes post-hoc analysis finds a smoking gun, and sometimes it’s data dredging. Sometimes randomized trials fail to balance on important patient characteristics. There are no simple answers. Context is critical, and dilemmas remain despite our best efforts. That’s what makes biostatistics interesting.
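The statistician’s claim that a bad imbalance is “not likely” can be made concrete with a quick simulation. This is only an illustrative sketch, not part of the original dialog; the 100-patient trial size and the 30% incurable rate are made-up numbers.

```python
import random

def prob_big_imbalance(n_patients=100, p_incurable=0.3, gap=10,
                       trials=10_000, seed=1):
    """Estimate the chance that coin-flip randomization puts more than
    `gap` extra incurable patients in one arm than in the other."""
    random.seed(seed)
    big_gaps = 0
    for _ in range(trials):
        counts = [0, 0]  # incurable patients assigned to each arm
        for _ in range(n_patients):
            if random.random() < p_incurable:      # patient is incurable
                counts[random.randrange(2)] += 1   # coin flip picks the arm
        if abs(counts[0] - counts[1]) > gap:
            big_gaps += 1
    return big_gaps / trials
```

With these assumed numbers the simulation suggests a gap of more than ten incurable patients is uncommon but not impossible, which is exactly the chance the statistician says we have to take.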

Related: Adaptive clinical trial design


If you’re going to do XHTML, you’d better do it right

XHTML is essentially a stricter form of HTML, but not exactly. For the most part, you can satisfy the requirements of both standards at the same time. When it comes to closing tags, however, the two standards are incompatible. For example, the line break tag is <br> in HTML but <br/> in XHTML. Most browsers will tolerate the extra slash in an HTML document, especially if you put a space before it. But it’s not strictly correct.
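To make the difference concrete, here is the same markup written both ways. The <img> line is a made-up example; only the <br> case appears in the text above.

```html
<!-- HTML: empty elements take no closing slash -->
<br>
<img src="logo.png" alt="logo">

<!-- XHTML: every element must be closed; the space before the slash
     helps older HTML parsers tolerate the markup -->
<br />
<img src="logo.png" alt="logo" />
```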

So is this just a pedantic point of markup language grammar? Chris Maunder says an error with closing tags caused Google to stop indexing his web site. He had XHTML-style end tags but had set his DOCTYPE to HTML.

I’ve also heard of browsers refusing to render a page at all because it had DOCTYPE set to XHTML but contained an HTML entity not supported in XHTML. I believe the person reporting this said that he had run the XHTML page through a validator that failed to find the error. Unfortunately I’ve forgotten where I saw this. Does anyone know about this?

Houston Deco

This weekend I stumbled across the book Houston Deco at the library. The book is filled with photos of Art Deco and Art Moderne architecture in Houston and the surrounding area. I had no idea how much Art Deco architecture there was in Houston until I read the book. Some of the photos were of buildings I’ve seen or even been inside without paying much attention to the architecture. More photos are available at the Houston Deco website.

Random number generator controversy

I submitted an article to Code Project yesterday, Simple Random Number Generation, describing a small C# class called SimpleRNG that uses George Marsaglia’s MWC (multiply-with-carry) algorithm. The article was posted around 5 PM (central US time) and comments started pouring in right away. I didn’t expect any feedback on a Friday afternoon or Saturday morning. But as I write this post, there have been 580 page views and 11 comments.

There have been three basic questions raised in the comments.

  1. Why not just use the random number generator that comes with .NET?
  2. Is this code suitable for cryptography?
  3. Is this code suitable for Monte Carlo applications?

Why not use the built-in generator? For many applications, the simplest thing would be to use the .NET random number generator. But there are instances where this might not be best. There are questions about the statistical quality of the .NET generator; I’ll get to that in a minute. The primary advantages I see to the SimpleRNG class are transparency and portability.

By transparency I mean that the internal state of the generator is simple and easy to access. When you’re trying to reproduce a result, say while debugging, it’s convenient to have full access to the internal state of the random generator. If you’re using your own generator, you can see everything. You can even temporarily change it: for debugging, it may be convenient to temporarily have the “random” generator return a very regular, predictable sequence.

By portability I do not necessarily mean moving the code between operating systems. The primary application I have in mind is moving the algorithm between languages. For example, in my work we often have prototype code written in R that needs to be rewritten in C++ for efficiency. If the code involves random number generation, the output of the prototype and the rewrite cannot be directly compared, only compared on average. Then you have to judge whether the differences are to be expected or whether they indicate a bug. But if both the R and the C++ code use the same RNG algorithm and the same seed, the results may be directly comparable. (They still may not be directly comparable due to other factors, but at least this way the results are often comparable.)
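As an illustration of this kind of porting, here is a sketch of Marsaglia’s multiply-with-carry scheme in Python. The constants 36969 and 18000 and the default seeds are Marsaglia’s published values; check them against the actual SimpleRNG source before relying on stream-for-stream agreement across languages.

```python
class SimpleMWC:
    """Two coupled multiply-with-carry generators (Marsaglia's scheme)."""

    def __init__(self, z=362436069, w=521288629):
        self.z, self.w = z, w

    def next_u32(self):
        # Each step keeps the low 16 bits as the multiplier input and
        # the high 16 bits as the carry; combine the two streams.
        self.z = (36969 * (self.z & 0xFFFF) + (self.z >> 16)) & 0xFFFFFFFF
        self.w = (18000 * (self.w & 0xFFFF) + (self.w >> 16)) & 0xFFFFFFFF
        return ((self.z << 16) + self.w) & 0xFFFFFFFF

    def next_double(self):
        # Uniform in the open interval (0, 1); 2.328...e-10 is 1/(2^32 + 2)
        return (self.next_u32() + 1.0) * 2.328306435454494e-10
```

Because the state is just two integers, reproducing a run in another language only requires copying the seeds, which is the transparency and portability argument above.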

As for cryptography, no, SimpleRNG is not appropriate for cryptography.

As for Monte Carlo applications, not all Monte Carlo applications are created equal. Some applications do not require high quality random number generators. Or more accurately, different applications require different kinds of quality. Some random number generators break down when used for high-dimensional integration. I suspect SimpleRNG is appropriate for moderate dimensions. I use the Mersenne Twister generator for numerical integration. However, SimpleRNG is faster and much simpler; the MT generator has a very large internal state.

Someone commented on the CodeProject article that the random number generator in .NET is not appropriate for Monte Carlo simulation because it does not pass Marsaglia’s DIEHARD tests while SimpleRNG does. I don’t know what algorithm the .NET generator uses, so I can’t comment on its quality. Before I’d use it in statistical applications, I’d want to find out.

Text reviews for software

When users find spelling and grammar errors in your software, your credibility takes a hit. But apparently very few software projects review the text their software displays. I imagine the ones that do review their text use a combination of two leaky methods: asking execution testers to take note of prose errors, and requiring that all text displayed to users be stored in a string table.

There are a couple problems with asking execution testers to be copy editors. First, they’re not copy editors. They may not recognize a grammatical error when they see it. Second, they only see the text that their path through the software exposes. Messages displayed to the user under unusual circumstances slip through testing.

String tables are a good idea. They can be reviewed by a professional editor. (Or a translator, if your application is internationalized.) But it’s difficult to make sure that every string the user might see is in the string table. When you need to add a few quick lines of error-handling code, it’s so easy to just include the text right there in the code rather than adding an entry to the string table. After all, you say to yourself, the code’s probably not going to run anyway.

My solution was to write a script that extracts all the quoted text from a source tree so it can be reviewed separately. The script tries to only pick out strings that a user could see, filtering out, for example, code quoted inside code. Doing this perfectly would be very hard, but by tolerating a small error rate, the problem can be solved quickly in a few lines of code. I’ve used this script for years. Nearly every time I run it I discover potentially embarrassing errors.
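My actual script is in PowerShell (linked below), but the idea fits in a few lines of any language. Here is a minimal Python sketch; the file extensions and the regular expression are assumptions, and a real version would also filter out strings a user could never see.

```python
import re
from pathlib import Path

# Matches a double-quoted string literal, allowing escaped characters.
STRING_LITERAL = re.compile(r'"((?:[^"\\]|\\.)*)"')

def extract_strings(root, extensions=(".cs", ".cpp", ".h")):
    """Collect every double-quoted literal under `root` for proofreading.
    Deliberately crude: tolerating a small error rate keeps it short."""
    found = []
    for path in sorted(Path(root).rglob("*")):
        if path.suffix in extensions:
            text = path.read_text(errors="ignore")
            found.extend(STRING_LITERAL.findall(text))
    return found
```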

In addition to helping with copy editing, an extract of all the string literals in a project gives an interesting perspective on the source code. For example, it could help uncover security risks such as SQL injection vulnerabilities.

I’ve posted an article on CodeProject along with the script I wrote.

PowerShell Script for Reviewing Text Shown to Users

The script on CodeProject is written for Microsoft’s PowerShell. If anyone would like a Perl version of the script, just let me know. I first wrote the script in Perl, but then moved it to PowerShell as my team was moving to PowerShell for all administrative scripting.

A little simplicity goes a long way

Sometimes making a task just a little simpler can make a huge difference. Making something 5% easier might make you 20% more productive. Or 100% more productive.

To see how valuable a little simplification can be, turn it around and think about making things more complicated. A small increase in complexity might go unnoticed. But as complexity increases, your subjective perception of complexity increases even more. As you start to become stressed out, small increases in objective complexity produce big increases in perceived complexity. Eventually any further increase in complexity is fatal to creativity because it pushes you over your complexity limit.

graph of perceived complexity versus actual complexity

Clay Shirky discusses how this applies to information overload. He points out that we can feel like the amount of information coming in has greatly increased when it actually hasn’t. He did a little experiment to quantify this. When he thought that the amount of spam he was receiving had doubled, he would find that it had actually increased by about 25%. Turning this around, you may be able to feel like you’ve cut your amount of spam in half by just filtering out 20% of it.

A small decrease in complexity can be a big relief if you’re under stress. It may be enough to make the difference between a frazzled mental state and a calm one (moving out of F-state into C-state). If you’re up against your maximum complexity, a small simplification could make the difference between a problem being solvable or unsolvable.

Small simplifications are often dismissed as unimportant when they’re evaluated in the small. Maybe a new term makes it possible to refer to an idea in three syllables rather than six. No big deal if it’s a term you don’t use much. But if it’s a term you use all the time, it makes a difference. That’s why every group has its own jargon.

Suppose one application takes five mouse clicks to do what another can do in three. Maybe that’s no big deal. But if you’re under stress, those two mouse clicks might make the difference between deciding a finishing touch is worthwhile versus not worthwhile.

Suppose one programming language takes five lines of code to do what another language can do in four lines. So what? How long does it take to type one line of code? But multiply that by 10. Maybe you see 40 lines of code on your laptop at once but you can’t see 50. Or multiply by 10 again. Maybe you can hold 400 lines of code in your head but you can’t hold 500. Language features dismissed as “syntactic sugar” can make a real difference.

When you’re stressed and feel like only a radical change will do any good, think again. A small simplification might be enough to give you some breathing room by pulling you back down below your complexity limit.

Related post: What happens when you add another teller?

Three quotes on simplicity

It’s easy to decide what you’re going to do.  The hard thing is deciding what you’re not going to do.
Michael Dell

Clutter kills WOW.
Tom Peters

Any intelligent fool can make things bigger, more complex, and more violent. It takes a touch of genius — and a lot of courage — to move in the opposite direction.
Albert Einstein

Are Covey’s quadrants correlated?

I was reading a statistical article the other day that used the word “important” when I thought the author should have said “urgent.” Since I was in a statistical frame of mind, I wondered whether importance and urgency are positively or negatively correlated.

Stephen Covey is known for his four quadrant diagram and his advice that we should spend as much time as we can in quadrant 2, working on things that are important but not urgent.

The four-quadrant matrix for importance and urgency.

Are urgent tasks more likely, less likely, or equally likely to be important? In statistical jargon, are they positively correlated, negatively correlated, or uncorrelated?

I believe Covey’s assumption is that urgency is negatively correlated with importance, that is, urgent tasks are less likely to be important. That’s probably true of life in general, but there are contexts where the correlation is reversed. In the paper that prompted this musing, I believe urgency and importance were positively correlated.

In what areas of life are urgency and importance most positively correlated? Most negatively correlated?

Why Unicode is subtle

On its surface, Unicode is simple. It’s a replacement for ASCII to make room for more characters. Joel Spolsky assures us that it’s not that hard. But then how did Jukka Korpela have enough to say to fill his 678-page book Unicode Explained? Why is the Unicode standard 1472 printed pages?

It’s hard to say anything pithy about Unicode that is entirely correct. The best way to approach Unicode may be through a sequence of partially true statements.

The first approximation to a description of Unicode is that it is a 16-bit character set. Sixteen bits are enough to represent the union of all previous character set standards. It’s enough to contain nearly 30,000 CJK (Chinese-Japanese-Korean) characters with space left for mathematical symbols, braille, dingbats, etc.

Actually, Unicode is a 32-bit character set. It started out as a 16-bit character set. The first 16-bit range of the Unicode standard is called the Basic Multilingual Plane (BMP), and is complete for most purposes. The regions outside the BMP contain characters for archaic and fictional languages, rare CJK characters, and various symbols.
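Python makes the BMP boundary easy to see. A character inside the BMP fits in one 16-bit code unit; a character outside it takes two (a surrogate pair). The particular characters here are just examples.

```python
a = "\u03a9"      # GREEK CAPITAL LETTER OMEGA, U+03A9, inside the BMP
b = "\U0001D538"  # MATHEMATICAL DOUBLE-STRUCK CAPITAL A, outside the BMP

print(hex(ord(a)))                 # 0x3a9
print(hex(ord(b)))                 # 0x1d538
print(len(a.encode("utf-16-le")))  # 2 bytes: one 16-bit code unit
print(len(b.encode("utf-16-le")))  # 4 bytes: a surrogate pair
```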

So essentially Unicode is just a catalog of characters with each character assigned a number and a standard name. What could be so complicated about that?

Well, for starters there’s the issue of just what constitutes a character. For example, Greek writes the letter sigma as σ in the middle of a word but as ς at the end of a word. Are σ and ς two representations of one character or two characters? (Unicode says two characters.) Should the Greek letter π and the mathematical constant π be the same character? (Unicode says yes.) Should the Greek letter Ω and the symbol for electrical resistance in Ohms Ω be the same character? (Unicode says no.) The difficulties get more subtle (and politically charged) when considering Asian ideographs.

Once we have agreement on how to catalog tens of thousands of characters, there’s still the question of how to map the Unicode characters to bytes. You could think of each byte representation as a compression or compatibility scheme. The most commonly used encodings are UTF-8 and UTF-16. The former is more compact (for Western languages) and compatible with ASCII. The latter is simpler to process. Once you agree on a byte representation, there’s the issue of how to order the bytes (endianness).
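A quick Python illustration of the encoding choices above, using Ω (U+03A9) as an example character:

```python
s = "\u03a9"  # GREEK CAPITAL LETTER OMEGA, U+03A9

print(s.encode("utf-8"))      # b'\xce\xa9' -- compact, ASCII-compatible scheme
print(s.encode("utf-16-be"))  # b'\x03\xa9' -- one big-endian 16-bit code unit
print(s.encode("utf-16-le"))  # b'\xa9\x03' -- same code unit, bytes swapped
print("A".encode("utf-8"))    # b'A'        -- ASCII text is already valid UTF-8
```

The last two lines are the endianness issue in miniature: the same UTF-16 code unit produces different byte streams depending on byte order.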

Once you’ve resolved character sets and encoding, there remain issues of software compatibility. For example, which web browsers and operating systems support which representations of Unicode? Which operating systems supply fonts for which characters? How do they behave when the desired font is unavailable? How do various programming languages support Unicode? What software can be used to produce Unicode? What happens when you copy a Unicode string from one program and paste it into another?

Things get even more complicated when you want to process Unicode text because this brings up internationalization and localization issues. These are extremely complex, though they’re not complexities with Unicode per se.

For more links, see my Unicode resources.

Learning is not the same as gaining information

Learning is not the same as just gaining information. Sometimes learning means letting go of previously held beliefs. While this is true of life in general, my point here is to show that it also holds under the mathematical definition of information.

The information content of a probability density function p(x) is given by

$$\int p(x) \log p(x) \, dx$$

Suppose we have a Beta(2, 6) prior on the probability of success for a binary outcome.

plot of beta(2,6) density

The prior density has information content 0.597. Then suppose we observe a success. The posterior density is distributed as Beta(3, 6). The posterior density has information 0.516, less information than the prior density.

plot of beta(3,6) density

Observing a success pulled the posterior density toward the right. The posterior density is a little more diffuse than the prior and so has lower information content. In that sense, we know less than before we observed the data! Actually, we’re less certain than we were before observing the data. But if the true probability of response is larger than our prior would indicate, we’re closer to the truth by becoming less confident of our prior belief, and we’ve learned something.
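The two information values can be reproduced by integrating p log p numerically. Here is a sketch using only the standard library: the Beta density via the log-gamma function, and the midpoint rule for the integral.

```python
import math

def beta_pdf(x, a, b):
    # Beta(a, b) density, computed via log-gamma for numerical stability
    log_B = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp((a - 1) * math.log(x) + (b - 1) * math.log(1 - x) - log_B)

def information(a, b, n=100_000):
    # Midpoint-rule approximation of the integral of p(x) log p(x) on (0, 1);
    # midpoints avoid evaluating log at the endpoints
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        p = beta_pdf(x, a, b)
        if p > 0:
            total += p * math.log(p) * h
    return total

print(information(2, 6))  # ≈ 0.597, the Beta(2, 6) prior
print(information(3, 6))  # ≈ 0.516, the Beta(3, 6) posterior
```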

Related: Use information theory to clarify and quantify goals

Contrasting Microsoft Word and LaTeX

Here’s an interesting graph from Marko Pinteric comparing Microsoft Word and Donald Knuth’s LaTeX.

comparing Word and Latex. Image by Marko Pinteric.

According to the graph, LaTeX becomes easier to use relative to Microsoft Word as the task becomes more complex. That matches my experience, though I’d add a few footnotes.

  1. Most people spend most of their time working with documents whose complexity is to the left of the crossover point.
  2. Your first LaTeX document will take much longer to write than your first Word document.
  3. Word is much easier to use if you need to paste in figures.
  4. LaTeX documents look better, especially if they contain mathematics.

See Charles Petzold’s notes about the lengths he went to in order to produce his upcoming book in Word. I imagine someone of less talent and persistence than Petzold could not have pulled it off using Word, though they would have stood a better chance using LaTeX.

Before the 2007 version, Word documents were stored in an opaque binary format. This made it harder to compare two documents. A version control system, for example, could not diff two Word documents the same way it could diff two text files. It also made Word documents difficult to troubleshoot since you had no way to look beneath the WYSIWYG surface.

However, a Word 2007 document is a zip file containing a directory of XML files and embedded resources. You can change the extension of any Office 2007 file to .zip, unzip it, inspect and possibly change the contents, then re-zip it. This opens up many new possibilities.
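Because a Word 2007 document is an ordinary zip archive, any zip library can look inside it. A minimal Python sketch (the file name is hypothetical):

```python
import zipfile

def list_docx_parts(path):
    # A Word 2007 document is an ordinary zip archive; its parts are
    # XML files plus embedded resources such as images.
    with zipfile.ZipFile(path) as z:
        return z.namelist()

# Hypothetical usage:
# list_docx_parts("report.docx")
# typically includes entries such as word/document.xml and word/styles.xml
```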

I’ve written some notes that may be useful for people wanting to try out LaTeX on Windows.

* * *

For daily tips on LaTeX and typography, follow @TeXtip on Twitter.


New spin on the cathedral and the bazaar

Eric Raymond’s famous essay The Cathedral and the Bazaar compares commercial software projects to cathedrals and open source software projects to bazaars. Cathedrals are carefully planned. Bazaars are not. The organizational structure of a bazaar emerges without deliberate coordination among its participants. The open source community has embraced the metaphor of the bazaar and the informality and spontaneity it implies.

Shmork wrote the following observation in the comments to a Coding Horror post yesterday that discussed the difficulties of using Linux software.

Almost nobody in the Western world shops at real-life bazaars either, because they are dodgy, unsafe, and unregulated. And in the Western world, we like things to be reliable, working, safe. So cathedral it is. Even our flea markets aren’t bazaars, really, they’re just knock-off cathedrals.

Being busy

From A Bias for Action:

The simple fact is that being busy is easier than not.  Most managers cannot admit that a fragmented day is actually the laziest day, the day that requires the least mental discipline and the most nervous energy.  Responding to each new request, chasing an answer to the latest question, and complaining about overwhelming demands are easier than setting priorities.