Yesterday I said on Twitter “Time to see whether practice agrees with theory, moving from LaTeX to Python. Wish me luck.” I got a lot of responses to that because it describes the experience of a lot of people. Someone asked if I’d blog about this. The content is confidential, but I’ll talk about the process. It’s a common pattern.
I’m writing code based on a theoretical result in a journal article.
First the program gets absurd results. Theory pinpoints a bug in the code.
Then the code confirms my suspicions of an error in the paper.
The code also uncovers, not an error per se, but an important missing detail in the paper.
Then code and theory differ by about 1%. I'm uncertain whether this is theoretical approximation error or a subtle bug.
Then I try the code with different arguments, ones for which theory predicts less approximation error, and now code and theory differ by 0.1%.
Then I have more confidence in (my understanding of) the theory and in my code.
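The actual paper and code are confidential, but the same pattern can be illustrated with a toy example of my own choosing: comparing Stirling's approximation of log n! against an exact computation. The theory predicts the approximation improves as n grows, so if the discrepancy shrinks at larger arguments, it looks like approximation error rather than a bug.

```python
import math

def stirling_log_factorial(n):
    # Stirling's approximation: ln n! ~ 0.5*ln(2*pi*n) + n*ln(n) - n
    return 0.5 * math.log(2 * math.pi * n) + n * math.log(n) - n

def relative_error(n):
    exact = math.lgamma(n + 1)          # exact ln n!
    approx = stirling_log_factorial(n)
    return abs(exact - approx) / exact

# If the gap shrinks as n grows, as theory predicts here, the
# disagreement at small n is approximation error, not a bug.
for n in (5, 50, 500):
    print(n, relative_error(n))
```

Seeing the error fall by roughly the predicted rate is what builds confidence in both the code and one's reading of the theory.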
7 thoughts on “Iterating between theory and code”
This describes my entire Ph.D. work in Computational Chemical Physics – spend weeks writing code, more weeks testing it with toy problems to verify that the code is doing the right thing, more weeks debugging, and then a day to run the actual parameters you want for the real problem.
Let’s say I speculated that this was the journal article you were testing: “https://www.researchgate.net/publication/322766604_M_-periodogram_for_the_analysis_of_long-range-dependent_time_series”. Would you be breaching your confidentiality agreement if you told me it was not that one? How much information about a scenario constitutes a breach?
Haven’t seen that article before. But I’ve played out the sequence of events in the blog post many times with little variations on the theme.
A concrete example of, “In theory, theory and practice are the same. In practice, they are not.”
Having a practice of my own, I find that, practically speaking, theory is never practical, but it can extend my options, somewhat as my skeleton allows me to extend my reach.
The story you described has a happy ending. I guess math papers are harder to fake than p-value ones? In other words, how do you deal with academic fraud without knowing it’s there? How much time would one waste doubting the code and results against publish-or-perish desperation wrapped in the finest 100% genuine LaTeX?
Any advice on how to spot those, especially in Data Science?
Looking forward to your reply. Thanks John.
Sometimes simple tests are enough to uncover a problem, such as a back-of-the-envelope calculation or a dimensional analysis check.
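A dimensional analysis check can even be mechanized. Here is a minimal sketch of the idea; the unit encoding and the pendulum example are my own illustration, not something from the post. Units are tracked as exponents of SI base units (meters, kilograms, seconds), and a formula whose units don't come out right is wrong before you run a single numerical test.

```python
# Track SI base-unit exponents (meters, kilograms, seconds) through a formula.

def mul(u, v):
    # units of a product: exponents add
    return tuple(a + b for a, b in zip(u, v))

def div(u, v):
    # units of a quotient: exponents subtract
    return tuple(a - b for a, b in zip(u, v))

def sqrt(u):
    # units of a square root: exponents halve (must be even)
    assert all(a % 2 == 0 for a in u), "square root of non-square units"
    return tuple(a // 2 for a in u)

METER  = (1, 0, 0)
SECOND = (0, 0, 1)
ACCEL  = (1, 0, -2)  # m / s^2

# Pendulum period T = 2*pi*sqrt(L/g) should come out in seconds.
period_units = sqrt(div(METER, ACCEL))
assert period_units == SECOND
```

A full unit-tracking library (such as Pint in the Python world) does this more thoroughly, but even a crude check like this catches transposed formulas cheaply.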
Another technique is to imagine what the researcher might have done with different data, how they might have come up with a post hoc explanation of contrary data. François Jacob said “A theory that explains too much ultimately explains very little.” If you have an extremely flexible theory, conclusions aren’t very convincing.