In my previous post, cohort assignments in clinical trials, I mentioned in passing how you could calculate cohort numbers from accrual numbers if the world were simpler than it really is.
Suppose you want to treat patients in groups of 3. If you count patients and cohorts starting from 1, then patients 1, 2, and 3 are in cohort 1. Patients 4, 5, and 6 are in cohort 2. Patients 7, 8, and 9 are in cohort 3, etc. In general patient n is in cohort 1 + ⌊(n-1)/3⌋.
If you start counting patients and cohorts from 0, then patients 0, 1, and 2 are in cohort 0. Patients 3, 4, and 5 are in cohort 1. Patients 6, 7, and 8 are in cohort 2, etc. In general patient n is in cohort ⌊n/3⌋.
These kinds of calculations, common in computer science, are often simpler when you start counting from 0. If you want to divide things (patients, memory locations, etc.) into groups of size k, the nth item is in group ⌊n/k⌋. In C notation, integer division truncates to an integer and so the expression is even simpler:
Counting centuries is confusing because we count from 1. That’s why the 1900’s were the 20th century etc. If we called the century immediately following the birth of Christ the 0th century, then the 1900’s would be the 19th century.
Because computer scientists usually count from 0, most programming languages also count from zero. Fortran and Visual Basic are notable exceptions.
The vast majority of humanity finds counting from 0 unnatural and so there is a conflict between how software producers and consumers count. Demanding that average users learn to count from zero is absurd. So the programmer must either use one-based counting internally, and risk confusing his peers, or use zero-based counting internally, and risk forgetting to do a conversion for input or output. I prefer the latter. The worst option is to vacillate between the two approaches.
* * *
For a daily dose of computer science and related topics, follow @CompSciFact on Twitter.