Software development | John D. Cook

An AI Odyssey, Part 4: Astounding Coding Agents

Wayne Joubert — Tue, 21 Apr 2026 20:14:40 +0000

AI coding agents improved greatly last summer, and again last December-January. Here are my experiences since my last post on the subject.

The models feel subjectively much smarter. They can accomplish a much broader range of tasks. They seem to have a larger, more comprehensive in-depth view of the code base and what you are trying to accomplish. They can locate more details in obscure parts of the code related to the specific task at hand. By rough personal estimate, I would say that they helped me with 20% of my coding effort last August and about 60% now. Both of these figures may be below what is possible, I may not be using them to fullest capability.

This being said, they are not a panacea. Sometimes they need help and direction to where to look for a problem. Sometimes they myopically look at the trees and don’t see the forest, don’t step back to see the big picture to understand something is wrong at a high level. Also they can overoptimize to the test harness. They can also generate code that is not conceptually consistent with existing code.

They can also generate much larger amounts of code than necessary. Some warn that that this will lead to explosion of coding debt. On the other hand, you can also guide the coding agent to refactor and improve its own code—quickly. In one case I’ve gotten it to reduce the size of one piece of code to less than half its original size with no change in behavior.

I use OpenAI Codex rather than Claude Code. I’m glad to hear some technically credible people think this a good choice. Though maybe I should try both.

My work is a research project for which the code itself is the research product, so I can’t give specifications of everything in advance; writing the code itself is a process of discovery. Also, I want a code base that remains human-readable. So I am deeply involved in discussion with the coding agent that will sometimes go off to do a task for some period of time. It is not my desire to treat the agent as what some have called a dark software factory.

Some say they fear that using a coding agent will result in forgetting how to write code without one. I have felt this, but from having exercised this muscle for such a very long time I don’t think it is a skill I would easily forget. The flip side of this argument is, you might even learn new coding idioms from observing what the coding agent writes, that is a good thing.

Some say they haven’t written a line of code in weeks because the coding agent does it for them. I don’t think I’ll ever stop writing code, any more than I will stop scribbling random ideas on the back of an envelope, or typing out some number of lines of code by which I discover some new idea in real time while I am typing. Learning is multisensory.

My hat’s off to the developers who are able to keep many different agents in flight at the same time. Personally I have difficulty handling the cognitive load of thinking deeply about each one and doing a mental context switch across many agents. Though I am experimenting with running more than one agent so I can be doing something while another agent is working.

I continue to be astounded by productivity gains in some situations. I recently added a new capability, which normally I think would’ve taken me two months to learn a whole new algorithmic method and library to use. With the coding agent, it took me four days to get this done, on the order of 10X productivity increase. Though admittedly things are not always that rosy.

In short, the tools are getting better and better. I’m looking forward to what they will be like a few months or a year from now.

The post An AI Odyssey, Part 4: Astounding Coding Agents first appeared on John D. Cook.

Two cheers for ugly code

John — Mon, 19 Jan 2026 17:56:39 +0000

Ugly code may be very valuable, depending on why it’s ugly. I’m not saying that it’s good for code to be ugly, but that code that is already ugly may be valuable.

Some of the ugliest code was started by someone who knew the problem domain well but did not know how to write maintainable code. It may implicitly contain information that is not explicitly codified anywhere else. It may contain information the original programmer isn’t even consciously aware of. It’s often easier to clean up the code than to surface the information it contains using any other source.

Another way code gets ugly is undisciplined modification by multiple programmers over a long period of time. In that case the code has proved to be useful. It’s the opposite of Field of Dreams code that you hope will be used if you build it.

Working effectively with legacy code is hard. It may be easier, and certainly more pleasant, to start from scratch. Even so, there may be more to learn from the old code than is immediately obvious.

***

This post was motivated by looking to extend some code I use in my business. It wouldn’t win a beauty pageant, but it’s very useful.

Writing this post reminded me of a post Productive Productivity that I wrote a while back. From that post:

The scripts that have been most useful are of zero interest to anyone else because they are very specific to my work. I imagine that’s true of most scripts ever written.

The post Two cheers for ugly code first appeared on John D. Cook.

Experiences with GPT-5-Codex

Wayne Joubert — Thu, 16 Oct 2025 18:47:04 +0000

OpenAI Codex is now generally available (see here, here). I’m using the Codex extension in the Cursor code editor with my OpenAI account.

Codex is very helpful for some tasks, such as complex code refactoring, implementing skeleton code for an operation, or writing a single small self-contained piece of code. Models have come a long way from a year ago when bugs in a 30-line piece of generated code were not uncommon.

Some have reported 40-60% overall productivity increase from coding agents. I used Codex recently for a complicated code refactoring that I estimate would have taken over 10x more time to plan and execute without an assistant.

The coding agent is less effective for some other tasks. For example, I asked it to ensure that all Python function signatures in my code had type hints, and it missed many cases.

Also, some have reported that the new Claude Sonnet 4.5 runs much faster, though Codex is being continually improved.

Obviously to be effective, these models must have access to adequate test case coverage, to enable the models to debug against. Without this, the coding agent can get really lost.

My approach to using the agent is very hands-on. Before letting it make any changes, I discuss the change plan in detail and make corrections as needed. (I sometimes need to remind the agent repeatedly to wait until I say “start” before commencing changes). Also when appropriate, I ask the model to make changes one step at a time instead of all in one go. This not only makes for a result that is more understandable and maintainable by humans, but also is more likely to give a good quality result.

Some have cautioned of the hazards of using coding agents. One concern is that a rogue coding agent could do something drastic like delete your code base or data files (both theoretically, and for some models, actually). One remedy is to set up your own sandbox to run the agent in, for example, a virtual machine, that has very locked-down access and no access to sensitive data. This may be cumbersome for some workflows, but for others may be a good security measure.

Also, some have warned that an agent can introduce dangerous security bugs in code. A remedy for this is to manually review every piece of code that the agent produces. This introduces some added developer overhead, though still in my experience it is much faster than writing the same code without the agent. And it is much better than just pushing a button to generate a big blob of incomprehensible code.

Coding agents have greatly improved over the last several months. Software development practices are presently passing through a point of no return, permanently changed by the new AI-enabled coding assistants. Even very bright people, who are already extremely skilled, are benefitting from using these tools.

The post Experiences with GPT-5-Codex first appeared on John D. Cook.

Experiences with Nvidia

Wayne Joubert — Wed, 09 Jul 2025 16:44:26 +0000

Our team started working within Nvidia in early 2009 at the beginning of the ORNL Titan project. Our Nvidia contacts dealt with applications, libraries, programming environment and performance optimization. First impressions were that their technical stance on issues was very reasonable. One obscure example: in a C++ CUDA kernel were you allowed to use “enums,” and the answer would be, of course, yes, we would allow that. This was unlike some other companies that might have odd and cumbersome programming restrictions in their parallel programming models (though by now this has become a harder problem for Nvidia since there are so many software products a user might want to interoperate).

Another example, with a colleague at Nvidia on the C++ standards committee, to whom I mentioned, it might be too early to lock this certain feature design into the standard since hardware designs are still rapidly changing. His response was, Oh, yes, we think exactly the same thing. So in short, their software judgments and decisions generally seem to be well thought out, reasonable and well informed. It sounds simple, but it is amazing how many companies have gotten this wrong.

Nvidia has made good strategic decisions. In the 2013 time frame, Intel was becoming a competitive threat with the Xeon Phi processor. Intel was several times larger than Nvidia with huge market dominance. In response, Nvidia formed a partnership with IBM–itself several times larger than Intel at the time. This came to fruition in the ORNL Summit system in 2018. In the meantime, the Xeon Phi’s OpenMP programming model, though standards-based, turned out to be difficult to write optimized code for, and Nvidia CUDA captured market share dominance of accelerated user software. Intel eventually dropped the Xeon Phi product line.

In the early 2000s, Nvidia went all-in on CUDA. I’ve heard some project teams say they would never use CUDA, because it is nonstandard and too low-level. Many have turned back on this decision. Of course, it is often possible to write an abstraction layer on top of CUDA to make it easier to use and maintain. Also newer programming models like Kokkos can be helpful.

Nvidia also made a prescient decision early to bet big on AI. A little later they decided to go all in on developing a huge number of software libraries is to enable access to many new markets. A huge moat. AMD is trying hard to improve their software processes and catch up.

On the downside, Nvidia high prices are upsetting to many, from gamers to buyers of the world’s largest HPC systems. Competition from AMD and others is a good thing.

And Nvidia marketing speak is sometimes confusing. A comparison was once made claiming that a four GPU system was more powerful than one of the world’s top CPU-only supercomputers on a very specific science problem. I’d like to see the details of that comparison. Also, different figures are being given on how long it took to stand up xAI’s Colossus supercomputer, from 19 days to 122 days. One has to dig a little to find out what these figures mean. Also it was widely reported last year that the GB200 NVL72 GPU was “30 times faster” than H100, but this only refers to certain operations, not key performance measures like flops per watt.

Those are my takes. For more perspectives, see Tae Kim’s excellent book, The Nvidia Way, or this interview.

Thoughts on Nvidia? Please leave in the comments.

The post Experiences with Nvidia first appeared on John D. Cook.

On Making Databases Run Faster

Wayne Joubert — Wed, 05 Mar 2025 19:51:52 +0000

Database technology is a mature field, and techniques for optimizing databases are well understood. However, surprises can still happen.

Certain performance optimizations you might expect to be automatic are not really. I’m working with a legacy code developed some time ago, before modern notions of separation of concerns between code business logic and data storage. The code runs slower than you’d expect, and some have wondered as to why this is.

Profiling of the code revealed that the slowdown was not in the core computation, but rather in the reading and writing of the backend database, which occurs frequently when this code executes.

My first thought was to run with the database in RAM disk, which would give higher bandwidth and lower latency than spinning disk or SSD. This helped a little, but not much.

As a short term fix I ended up writing code for (in-memory) hash tables as an interposer between the code and the database. This can cache commonly accessed values and thus reduce database access.

I would’ve thought high-speed RAM caching of values would be default behavior for a database manager. A principle of interface design is to make the defaults as useful as possible. But in this case apparently not.

Thankfully my fix gave over 10X speedup in application launch time and 6X speedup in the core operation of the code.

The project team is moving toward SQLite for database management in the near future. SQLite has perhaps a dozen or so available methods for optimizations like this. However early experiments with SQLite for this case show that more fundamental structural code modifications will also be needed, to improve database access patterns.

As with general code optimization, sometimes you’d be inclined to think the system (compiler, database manager, etc.) will “do it all for you.” But often not.

The post On Making Databases Run Faster first appeared on John D. Cook.

Code Profiling Without a Profiler

Wayne Joubert — Thu, 27 Feb 2025 14:34:55 +0000

Making your code to run faster starts with understanding where in the code the runtime is actually spent. But suppose, for whatever reason, the code profiling tools won’t work?

I recently used MS Visual Studio on a legacy C++ code. The code crashed shortly after startup when attempting to profile, though otherwise the code ran fine for both release and debug build targets. The cause of the problem was not immediately visible.

If all else fails, using manual timers can help. The idea is to find a high-accuracy system wallclock timer function and use this to read the time before and after some part of the code you want to time. One can essentially apply “bisection search” to the code base to look for the code hot spots. See below for an example.

This can be useful in various situations. Codes in complex languages (or even mixed languages in the code base) can have unusual constructs that break debuggers or profilers. Also, exotic hardware like embedded systems, GPUs or FPGAs may lack full profiler support. Additionally, brand new hardware releases often lack mature tool support, at least initially.

Furthermore, profiling tools themselves, though helpful for getting a quick snapshot of the performance breakdown of each function in the code, have their own limitations. Profilers work either by instrumenting the executable code or sampling. Instrumenting can cause timing inaccuracies by adding overhead from calling the system timer on entrance and exit to every function called. Also it breaks function inlining, often reducing performance.

Sampling on the other hand can be inaccurate if the sample rate is too low, or can distort runtime when sampling at too high a frequency. In contrast, manual timers can circumvent these problems by a very surgical application to specific parts of the code (though some profilers let you turn the profiler on and off at different parts of the code).

Resorting to manual timing of code sections is a messy business. But sometimes it’s the only thing that will work.

Visual Studio C++ Code Example

// mycode.h

#include "cstdio"
#include "cstdarg"

// Get time of day - elapsed seconds
static double gtod() {   
    LARGE_INTEGER ctr, freq;
    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&ctr);
    return static_cast(ctr.QuadPart) / static_cast(freq.QuadPart);
}   
    
// Convenience function for printing to file
static void FilePrintf(const char* format, ...) {   
    char buffer[1024];
    va_list args;
    va_start(args, format);
    vsnprintf(buffer, sizeof(buffer), format, args);
    va_end(args);
    FILE* myoutfile = fopen("mytimingsfile.txt", "a");
    fprintf(myoutfile, "%s", buffer);
    fclose(myoutfile);
}   
    
// Storage for timer
extern double g_timer;

// mycode.cpp

#include "mycode.h"

// Initialization for timer
double g_timer = 0;

int main() {

    // ...
    g_timer = 0;

    for (int i=0; iThe post Code Profiling Without a Profiler first appeared on John D. Cook.

How hard is constraint programming?

Wayne Joubert — Thu, 31 Oct 2024 11:52:13 +0000

I’ve been writing code for the Z3 SMT solver for several months now. Here are my findings.

Python is used here as the base language. Python/Z3 feels like a two-layer programming model—declarative code for Z3, imperative code for Python. In this it seems reminiscent of C++/CUDA programming for NVIDIA GPUs—in that case, mixed CPU and GPU imperative code. Either case is a clever combination of methodologies that is surprisingly fluent and versatile, albeit not a perfect blend of seamless conceptual cohesion.

Other comparisons:

Both have two separate memory spaces (CUDA CPU/GPU memories for one; pure Python variables and Z3 variables for the other).
Both can be tricky to debug. In earlier days, CUDA had no debugger, so one had to fall back to the trusty “printf” statement (for a while it didn’t even have that!). If the code crashed, you might get no output at all. To my knowledge, Z3 has no dedicated debugger. If the problem being solved comes back as satisfiable, you can print out the discovered model variables, but if satisfiability fails, you get very little information. Like some other novel platforms, something of a “black box.”
In both cases, programmer productivity can be well-served by developing custom abstractions. I developed a Python class to manage multidimensional arrays of Z3 variables, this was a huge time saver.

There are differences too, of course.

In Python, “=” is assignment, but in Z3, one only has “==”, logical or numeric equality, not assignment per se. Variables are set once and can’t be changed—sort of a “write-once variables” programming model—as is natural to logic programming.
Code speed optimization is challenging. Code modifications for Z3 constraints/variables can have extreme and unpredictable runtime effects, so it’s hard to optimize. Z3 is solving an NP-complete problem after all, so runtimes can theoretically increase massively. Speedups can be massive also; one round of changes I made gave 2000X speedup on a test problem. Runtime of CUDA code can be unpredictable to a lesser degree, depending on the PTX and SASS code generation phases and the aggressive code optimizations of the CUDA compiler. However, it seems easier to “see through” CUDA code, down to the metal, to understand expected performance, at least for smaller code fragments. The Z3 solver can output statistics of the solve, but these are hard to actionably interpret for a non-expert.
Z3 provides many, many algorithmic tuning parameters (“tactics”), though it’s hard to reason about which ones to pick. Autotuners like FastSMT might help. Also there have been some efforts to develop tools to visualize the solve process, this might be of help.

It would be great to see more modern tooling support and development of community best practices to help support Z3 code developers.

The post How hard is constraint programming? first appeared on John D. Cook.

Choosing a Computer Language for a Project

Wayne Joubert — Wed, 24 Apr 2024 00:19:22 +0000

Julia. Scala. Lua. TypeScript. Haskell. Go. Dart. Various computer languages new and old are sometimes proposed as better alternatives to mainstream languages. But compared to mainstream choices like Python, C, C++ and Java (cf. Tiobe Index)—are they worth using?

Certainly it depends a lot on the planned use: is it a one-off small project, or a large industrial-scale software application?

Yet even a one-off project can quickly grow to production-scale, with accompanying growing pains. Startups sometimes face a growth crisis when the nascent code base becomes unwieldy and must be refactored or fully rewritten (or you could do what Facebook/Meta did and just write a new compiler to make your existing code base run better).

The scope of different types of software projects and their requirements is so incredibly diverse that any single viewpoint from experience runs a risk of being myopic and thus inaccurate for other kinds of projects. With this caveat, I’ll share some of my own experience from observing projects for many dozens of production-scale software applications written for leadership-scale high performance computing systems. These are generally on a scale of 20,000 to 500,000 lines of code and often require support of mathematical and scientific libraries and middleware for build support, parallelism, visualization, I/O, data management and machine learning.

Here are some of the main issues regarding choice of programming languages and compilers for these codes:

1. Language and compiler sustainability. While the lifetime of computing systems is measured in years, the lifetime of an application code base can sometimes be measured in decades. Is the language it is written in likely to survive and be well-supported long into the future? For example, Fortran, though still used and frequently supported, is a less common language thus requiring special effort from vendors, with fewer developer resources than more popular languages. Is there a diversity of multiple compilers from different providers to mitigate risk? A single provider means a single point of failure, a high risk; what happens if the supplier loses funding? Are the language and compilers likely to be adaptable for future computer hardware trends (though sometimes this is hard to predict)? Is there a large customer base to help ensure future support? Similarly, is there an adequate pool of available programmers deeply skilled in the language? Does the language have a well-featured standard library ecosystem and good support for third-party libraries and frameworks? Is there good tool support (debuggers, profilers, build tools)?

2. Related to this is the question of language governance. How are decisions about future updates to the language made? Is there broad participation from the user community and responsiveness to their needs? I’ve known members of the C++ language committee; from my limited experience they seem very reasonable and thoughtful about future directions of the language. On the other hand, some standards have introduced features that scarcely anyone ever uses—a waste of time and more clutter for the standard.

3. Productivity. It is said that programmer productivity is limited by the ability of a few lines of code to express high level abstractions that can do a lot with minimal syntax. Does the language permit this? Does the language standard make sense (coherent, cohesive) and follow the principle of least surprise? At the same time, the language should not engulf what might better be handled modularly by a library. For example, a matrix-matrix product that is bound up with the language might be highly productive for simple cases but have difficulty supporting the many variants of matrix-matrix product provided for example by the NVIDIA CUTLASS library. Also, in-language support for CUDA GPU operations, for example, would make it hard for the language not to lag behind in support of the frequent new releases of CUDA.

4. Strategic advantage. The 10X improvement rule states that an innovation is only worth adopting if it offers 10X improvement compared to existing practice according to some metric . If switching to a given new language doesn’t bring significant improvement, it may not be worth doing. This is particularly true if there is an existing code base of some size. A related question is whether the new language offers an incremental transition path for an existing code to the new language (in many cases this is difficult or impossible).

5. Performance (execution speed). Does the language allow one to get down to bare-metal performance rather than going through costly abstractions or an interpreter layer? Are the features of the underlying hardware exposed for the user to access? Is performance predictable? Can one get a sense of the performance of each line of code just by inspection, or is this occluded by abstractions or a complex compilation process? Is the use of just-in-time compilation or garbage collection unpredictable, which could be a problem for parallel computing wherein unexpected “hangs” can be caused by one process unexpectedly performing one of these operations? Do the compiler developers provide good support for effective and accurate code optimization options? Have results from standardized non-cherry-picked benchmarks been published (kernel benchmarks, proxy apps, full applications)?

Early adopters provide a vibrant “early alert” system for new language and tool developments that are useful for small projects and may be broadly impactful. Python was recognized early in the scientific computing community for its potential complementary use with standard languages for large scientific computations. When it comes to planning large-scale software projects, however, a range of factors based on project requirements must be considered to ensure highest likelihood of success.

The post Choosing a Computer Language for a Project first appeared on John D. Cook.

Experiences with Thread Programming in Microsoft Windows

Wayne Joubert — Fri, 15 Mar 2024 10:35:29 +0000

Lately I’ve been helping a colleague to add worker threads to his GUI-based Windows application.

Thread programming can be tricky. Here are a few things I’ve learned along the way.

Performance. This app does compute-intensive work. It is helpful to offload this very compute-heavy work to a worker thread. Doing this frees the main thread to service GUI requests better.

Thread libraries. Windows has multiple thread libraries, for example Microsoft Foundation Class library threads and C++ standard library threads. It is hazardous to use different thread libraries in the same app. In the extreme case, different thread libraries, such as GOMP vs. LOMP, used in resp. the GCC and LLVM compiler families, have different threading runtimes which keep track of threads in different ways. Mixing them in the same code can cause hazardous silent errors.

Memory fences are a thing. Different threads can run on different processor cores and hold variables in different respective L1 caches that are not flushed (this to maintain high performance). An update to a variable by one thread is not guaranteed to be visible to other threads without special measures. For example, one could safely transfer information using ::PostMessage coupled with a handler function on the receiver thread. Or one could send a signal using an MFC CEvent on one thread and read its Lock on the other. Also, a thread launch implicitly does a memory fence, so that, at least then, the new thread is guaranteed to correctly see the state of all memory locations.

GUI access should be done from the master thread only, not a worker thread. Doing so can result in deadlock. A worker thread can instead ::PostMessage to ask the master thread to do a GUI action.

Thread launch. By default AfxBeginThread returns a thread handle which MFC takes care of deleting when no longer needed. If you want to manage the life cycle of the handle yourself, you can do something like:

myWorkerThreadHandle = AfxBeginThread(myFunc, myParams,
  THREAD_PRIORITY_NORMAL, 0, CREATE_SUSPENDED);
myWorkerThreadHandle->m_bAutoDelete = false;
myWorkerThreadHandle->ResumeThread();

Joint use of a shared library like the DAO database library has hazards. One should beware of using the library to allocate something in one thread and deallocating in another, as this will likely allocate in a thread-local heap or stack instead of a shared thread-safe heap, this resulting in a crash.

Initialization. One should call CoInitializeEx(NULL, COINIT_APARTMENTTHREADED) and AfxDaoInit() (if using DAO) at thread initialization on both master and worker threads, and correspondingly CoUninitialize() and AfxDaoTerm() at completion of the thread.

Monitoring of thread state can be done with
WaitForSingleObject(myWorkerThreadHandle->m_hThread, 0) to determine if the thread has completed or WaitForSingleObject(myWorkerThreadHandle->m_hThread, INFINITE) for a blocking wait until completion.

Race conditions are always a risk but can be avoided by careful reasoning about execution. Someone once said up to 90% of code errors can be found by desk checking [1]. Race conditions are notoriously hard to debug, partly because they can occur nondeterministically. There are tools for trying to find race condition errors, though I’ve never tried them.

So far I find no rigorous specification of the MFC threading model online that touches on all these concerns. Hopefully this post is useful to someone else working through these issues.

References

[1] Dasso, Aristides., Funes, Ana. Verification, Validation and Testing in Software Engineering. United Kingdom: Idea Group Pub., 2007, p. 163.

The post Experiences with Thread Programming in Microsoft Windows first appeared on John D. Cook.

What’s the Best Code Editor?

Wayne Joubert — Sat, 02 Mar 2024 00:29:33 +0000

Emacs, vi, TextEdit, nano, Sublime, Notepad, Wordpad, Visual Studio, Eclipse, etc., etc.—everyone’s got a favorite.

I used Visual Studio previously and liked the integrated debugger. Recently I started using VS again and found the code editing windows rather cluttered. Thankfully you can tone this down, if you can locate the right options.

Eclipse for Java has instantaneous checking for syntax errors. I have mixed feelings on this. Perhaps you could type a little more code before getting a glaring error message?

Concerning IDEs (integrated development environments) like this—I’ve met people who think that a full GUI-based IDE is the only way to go. Maybe so. However , there’s another view.

You’d think if anyone would know how to write code quickly, accurately and effectively, it would be world-class competitive programmers. They’re the best, right?

One of the very top people is Gennady Korotkevich. He’s won many international competitions.

What does he use? Far Manager, a text-based user interface tool with a mere two panels and command prompt. It’s based on 1980s pre-GUI file manager methodologies that were implemented under DOS.

It reminds me of a conversation I had with our admin when I was in grad school. I asked, “Why do you use vi instead of MS Word for editing documents?” Answer: “I like vi because it’s faster—your fingers never need to leave the keyboard.”

Admittedly, not all developer workflows would necessarily find this approach optimal. But still it makes you think. Sometimes the conventional answer is not the best one.

Do you have a favorite code editor? Please let us know in the comments.

The post What’s the Best Code Editor? first appeared on John D. Cook.

Avoiding Multiprocessing Errors in Bash Shell

Wayne Joubert — Mon, 12 Feb 2024 13:03:53 +0000

Suppose you have two Linux processes trying to modify a file at the same time and you don’t want them stepping on each other’s work and making a mess. A common solution is to use a “lock” mechanism (a.k.a. “mutex”). One process “locks the lock” and by this action has sole ownership of a resource in order to make updates, until it unlocks the lock to allow other processes access.

Writing a custom lock in Linux bash shell is tricky. Here’s an example that DOESN’T work right:

#/bin/bash
let is_locked=1 # helper variable to denote locked state
mylockvariable=$(cat mylockfile 2>/dev/null)  # read the lock
while [ "$mylockvariable" != $is_locked ]  # loop until unlocked
do
    sleep 5 # wait 5 seconds to try again 
    mylockvariable=$(cat mylockfile 2>/dev/null)  # read again
done
echo $is_locked > mylockfile  # lock the lock
# >>> do critical work safely here <<<
# >>> ERROR: NOT SAFE <<<
rm mylockfile  # unlock the lock

Here the lock value is stored in a shared resource, the file “mylockfile”. If the file exists and contains the character “1”, the lock is considered locked; otherwise, it is considered unlocked. The code will loop until the lock is unlocked, then acquire the lock, do the required single-process work, and then release the lock.

However, this code can fail without warning: suppose two processes A and B execute this code concurrently. Initially the lock is in an unlocked state. Process A reads the lockfile. Then suppose immediately after this, Process A is temporarily interrupted, perhaps to give CPU cycles to run Process B. Then, suppose Process B begins, reads the lock, locks the lock and starts doing its critical work. Suppose now Process B is put into wait state and Process A is restarted. Process A, since it previously read the lockfile, wrongly believes the lock is unlocked, thus proceeds to also lock the lock and do the critical work—resulting in a mess.

This is an example of a classic race condition, in which the order of execution of threads or processes can affect the final outcome of execution.

A solution to this conundrum is found in the excellent book, Unix Power Tools [1,2]. This is a hefty tome but very accessibly written, for some people well worth a read-through to pick up a slew of time-saving tips.

The problem with the example code is the need to both read and set the lock in a single, indivisible (atomic) operation. Here’s a trick to do it:

#/bin/bash
until (umask 222; echo > mylockfile) 2>/dev/null  # check and lock
do  # keep trying if failed
    sleep 5 # wait 5 seconds to try again 
done
# >>> do critical work safely here <<<
rm -f mylockfile  # unlock the lock

Here, the existence of the lockfile itself is the indicator that the lock is set. Setting the umask makes this file creation fail if the file already exists, triggering the loop to activate to keep trying. This works because the existence of a file can either be true or false and nothing else; the existence of a file is guaranteed atomicity by the OS and the filesystem. Thus, assuming the system is working correctly, this code is guaranteed to produce the desired behavior.

Race conditions can be a nuisance to find since their occurrence is nondeterministic and can be rare but devastating. Writing correct code for multiple threads of execution can be confusing to those who haven’t done it before. But with experience it becomes easier to reason about correctness and spot such errors.

References:

[1] Peek, Jerry D., Shelley Powers, Tim O’Reilly and Mike Loukides. “Unix Power Tools, Third Edition.” (2002), https://learning.oreilly.com/library/view/unix-power-tools/0596003307/

[2] https://learning.oreilly.com/library/view/unix-power-tools/0596003307/ch36.html#:-:text=Shell%20Lockfile

The post Avoiding Multiprocessing Errors in Bash Shell first appeared on John D. Cook.

Is Low Precision Arithmetic Safe?

Wayne Joubert — Tue, 06 Feb 2024 15:20:34 +0000

The popularity of low precision arithmetic for computing has exploded since the 2017 release of the Nvidia Volta GPU. The half precision tensor cores of Volta offered a massive 16X performance gain over double precision for key operations. The “race to the bottom” for lower precision computations continues: some have even solved significant problems using 1-bit precision arithmetic hardware ([1], [2]). And hardware performance is getting even better: the Nvidia H100 tensor core-enabled FP16 is a full 58X faster than standard FP64, and 1-bit precision is yet another 16X faster than this, for total speedup of over 900X for algorithms that can use it [3].

This eye-popping speedup certainly draws attention. However, in scientific computing, low precision arithmetic has typically been seen as unsafe for modeling and simulation codes. Indeed, lower precision can sometimes be used to advantage [4], commonly in a “mixed precision” setting in which only parts of the calculation are done in low precision. However, in general anything less than double precision is considered inadequate to model complex physical phenomena with fidelity (see, e.g., [5]).

In response, developers have created tools to measure the safety of reduced precision arithmetic in application codes [6]. Some tools can even identify which variables or arrays can be safely demoted to lower precision without loss of accuracy in the final result. However, use of these tools in a blind fashion, not backed by some kind of reasoning process, can be hazardous.

An example will illustrate this. The conjugate gradient method for linear system solving and optimization [7] and the closely related Lanczos method for eigenvalue problem solving [8] showed great promise following their invention in the early 1950s. However, they were considered unsafe due to catastrophic roundoff errors under floating point arithmetic—even more pronounced as floating point precision is reduced. Nonetheless, Chris Paige showed in his pioneering work in the 1970s [9] that the roundoff error, though substantial, did not preclude the usefulness of the methods when properly used. The conjugate gradient method has gone on to become a mainstay in scientific computing.

Notice that no tool could possibly arrive at this finding, without a careful mathematical analysis of the methods. A tool would detect inaccuracy in the calculation but could not certify that these errors could cause no harm to the final result.

Some might propose instead a purely data-driven approach: just try low precision on some test cases, if it works then use low precision in production. This approach is fraught with peril, however: the test cases may not capture all situations that could be encountered in production.

For example, one might test an aerodynamics code only on smooth flow regimes, but production runs may encounter complex flows with steep gradients—that low precision arithmetic cannot correctly model. Academic papers that test low precision methods and tools must rigorously evaluate in challenging real-world scenarios like this.

Sadly, computational science teams frequently don’t have the time to evaluate their codes for potential use of lower precision arithmetic. Tools could certainly help. Also, libraries that encapsulate mixed precision methods can provide benefits to many users. A great success story here is mixed precision dense linear solvers, founded on the solid theoretical work of Nick Highnam and colleagues [10], which has found its way into libraries such as [11].

So the final answer is, “it depends.” Each new case must be looked at carefully, and a determination made based on some combination of analysis and testing.

References

[1] Zhang, Y., Garg, A., Cao, Y., Lew, Ł., Ghorbani, B., Zhang, Z. and Firat, O., 2023. Binarized Neural Machine Translation. arXiv preprint arXiv:2302.04907, https://openreview.net/forum?id=XAyPlfmWpu

[2] Lagergren, J., Cashman, M., Vergara, V.G.M., Eller, P.R., Gazolla, J.G.F.M., Chhetri, H.B., Streich, J., Climer, S., Thornton, P., Joubert, W. and Jacobson, D., 2023. Climatic clustering and longitudinal analysis with impacts on food, bioenergy, and pandemics. Phytobiomes Journal, 7(1), pp.65-77, https://apsjournals.apsnet.org/doi/10.1094/PBIOMES-02-22-0007-R.

[3] “NVIDIA H100 Tensor Core GPU Datasheet,” https://resources.nvidia.com/en-us-tensor-core/nvidia-tensor-core-gpu-datasheet.

[4] G. Alvarez et al., “New algorithm to enable 400+ TFlop/s sustained performance in simulations of disorder effects in high-Tc superconductors,” SC ’08: Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, Austin, TX, USA, 2008, pp. 1-10, doi: 10.1109/SC.2008.5218119.

[5] Spafford, K., Meredith, J., Vetter, J., Chen, J., Grout, R., Sankaran, R. (2010). Accelerating S3D: A GPGPU Case Study. In: Lin, HX., et al. Euro-Par 2009 – Parallel Processing Workshops. Euro-Par 2009. Lecture Notes in Computer Science, vol 6043. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14122-5_16.

[6] “Mixed precision analysis tools,” https://scholar.google.com/scholar?q=mixed+precision+analysis+tools

[7] Hestenes, M.R. and Stiefel, E., 1952. Methods of conjugate gradients for solving linear systems. Journal of research of the National Bureau of Standards, 49(6), pp.409-436, https://nvlpubs.nist.gov/nistpubs/jres/049/jresv49n6p409_a1b.pdf.

[8] Cornelius Lanczos, An Iteration Method for the Solution of the Eigenvalue Problem of Linear Differential and Integral Operators, Journal of Research of the National Bureau of Standards Vol. 45, No. 4, October 1950, https://nvlpubs.nist.gov/nistpubs/jres/045/4/V45.N04.A01.pdf.

[9] Paige, Christopher C.. “The computation of eigenvalues and eigenvectors of very large sparse matrices.” (1971), https://www.cs.mcgill.ca/~chris/pubClassic/PaigeThesis.pdf.

[10] Higham, N.J., Pranesh, S. and Zounon, M., 2019. Squeezing a matrix into half precision, with an application to solving linear systems. SIAM Journal on Scientific Computing, 41(4), pp.A2536-A2551, https://epubs.siam.org/doi/abs/10.1137/18M1229511.

[11] Lu, Hao; Matheson, Michael; Wang, Feiyi; Joubert, Wayne; Ellis, Austin; Oles, Vladyslav. “OpenMxP-Opensource Mixed Precision Computing,” https://www.osti.gov/biblio/1961398.

The post Is Low Precision Arithmetic Safe? first appeared on John D. Cook.

When High Performance Computing Is Not High Performance

Wayne Joubert — Wed, 10 Jan 2024 15:12:04 +0000

Everybody cares about codes running fast on their computers. Hardware improvements over recent decades have made this possible. But how well are we taking advantage of hardware speedups?

Consider these two C++ code examples. Assume here n = 10000000.

void sub(int* a, int* b) {
    for (int i=0; i
void sub(int* a, int* b) {
    for (int i=0; i
Which runs faster? Both are simple and give identical results (assuming no aliasing). However on modern architectures, depending on the compilation setup, one will generally run significantly faster than the other.
In particular, Snippet 2 would be expected to run faster than Snippet 1. In Snippet 1, elements of the array “a”, which is too large to be cached, must be retrieved from memory after being written, but this is not required for Snippet 2. The trend for over two decades has been for compute speed of newly delivered systems to grow much faster than memory speed, and the disparity is extreme today. The performance of these kernels is bound almost entirely by memory bandwidth speed. Thus Snippet 2, a fused loop version of Snippet 1, improves speed by reducing main memory access.
Libraries like C++ STL are unlikely to help, since this operation is too specialized to expect a library to support it (especially the fused loop version). Also, the compiler cannot safely fuse the loops automatically without specific instructions that the pointers are unaliased, and even then is not guaranteed to do so.
Thankfully, high level computer languages since the 1950s have raised the programming abstraction level for all of us. Naturally, many of us would like to just implement the required business logic in our codes and let the compiler and the hardware do the rest. But sadly, one can’t always just throw the code on a computer and expect it to run fast. Increasingly, as hardware becomes more complex, giving attention to the underlying architecture is critical to getting high performance.
The post When High Performance Computing Is Not High Performance first appeared on John D. Cook.

Leading zeros

John — Tue, 09 Jan 2024 16:04:28 +0000

The confusion between numbers such as 7 and 007 comes up everywhere. We know they’re different—James Bond isn’t Agent 7—and yet the distinction isn’t quite trivial.

How should software handle the two kinds of numbers? The answer isn’t as simple as “Do what the user expects” because different users have different expectations.

Excel

If you type 007 into Excel, by default the software will respond as if to say “Got it. Seven.” If you configure a cell to be text, then it will retain the leading zeros. Many people find this surprising, myself included.

But you can be sure that Microsoft has good reasons for the default behaviors it chooses. These are often business reasons rather than technical reasons. Microsoft wants to please the majority of its user base, not tech wizards. Not only are wizards an unprofitable minority, wizards can take care of themselves.

Zip codes

Someone relayed the following conversation to me recently.

“It took me longer than I thought, but I got the zip codes wrangled.”

“Leading zeros trip you up?”

“Yeah, how did you guess?”

“This isn’t my 01st rodeo.”

I’ve run into this, as has almost everyone who has ever worked with zip codes. The Boston zip code 02134 is not the number 2,134.

Octal

In Perl the expression (02134 > 2000) evaluates to false. That is because in some software, including the perl interpreter, a leading zero indicates that a number is written in octal, i.e. base 8. So 02134 represents 2134_eight = 1116_ten, which is less than 2000_ten.

Update: I’d forgotten that C acts the same way until Wayne reminded me in the comments. I don’t think I’ve ever (deliberately) used that feature in C.

Dates

I’m an American, and I use American-style dates in public correspondence. But privately I use YYYY-MM-DD dates so that dates always sort as intended, regardless of whether a particular piece of software interprets these symbols as numbers, text, or dates.

Computer science versus software engineering

From a computer science perspective, the root of the problem is not being explicit about data types. In computer science lingo, 7 and 2134 are integers, while 007 and 02134 are “words” built on the “alphabet” consisting of the digits 0 through 9. Integers and words have different data types. Furthermore, 007 and 02134 are not just words but representations of different data types: one is a serial number and the other is a postal code. And neither is not an octal number.

Objects of different data types have may have similar text representations, but these representations are to be interpreted differently. And they have different sort orders, which may not correspond to their sort order as text. End of discussion.

This is fine for computer science, but it doesn’t address the software engineering problem of meeting user expectations. It will not do to say “Just make the user specify his types.” The average user doesn’t know what that means.

So what do you do? The software could make educated guesses, but then what? Ask the user for confirmation that the software guessed correctly? Or presume the guess was correct but provide a way to fix the assumption in case it was not? Demand that the user be more specific? The solution depends on context.

Even if you want to meet the expectations of a particular group, such as Excel users or Perl programmers, those expectations may evolve over time. We expect different behavior from software than we did a generation ago. But we also expect backward compatibility! So even within an individual you have conflicting expectations. There is no simple solution, even for such a simple problem of how to handle leading zeros.

The post Leading zeros first appeared on John D. Cook.

Naming Awk

John — Thu, 21 Sep 2023 22:51:37 +0000

The Awk programming language was named after the initials of its creators. In the preface to a book that just came out, The AWK Programming Language, Second Edition, the authors give a little background on this.

Naming a language after its creators shows a certain paucity of imagination. In our defense, we didn’t have a better idea, and by coincidence, at some point in the process we were in three adjacent offices in the order Aho, Weinberger, and Kernighan.

By the way, here’s a nice line from near the end of the book.

Realistically, if you’re going to learn only one programming language, Python is the one. But for small programs typed at the command line, Awk is hard to beat.

The post Naming Awk first appeared on John D. Cook.

A small programming language

John — Sat, 26 Aug 2023 12:05:06 +0000

Paul Graham said “Programming languages teach you not to want what they don’t provide.” He meant that as a negative: programmers using less expressive languages don’t know what they’re missing. But you could also take that as a positive: using a simple language can teach you that you don’t need features you thought you needed.

Awk

I read the original awk book recently, published in 1988. It’s a small book for a small language. The language has grown since 1988, especially the Gnu implementation gawk, and yet from the beginning the language had a useful set of features. Most of what has been added since then is of no use to me.

How I use awk

It has been years since I’ve written an awk program that is more than one line. If something would require more than one line of awk, I probably wouldn’t use awk. I’m not morally opposed to writing longer awk programs, but awk’s sweet spot is very short programs typed at the command line.

At one point when I was saying how I like little awk programs, someone suggested I use Perl one-liners instead because then I’d have access to Perl’s much richer set of features, in particular Perl regular expressions. Along those lines, see these notes on how to write Perl one-liners to mimic sed, grep, and awk.

But when I was reading the awk book I thought about how I rarely need the the features awk doesn’t have, not for the way I use awk. If I were writing a large program, not only would I want more features, I’d want a different language.

Now my response to the suggestion to use Perl one-liners would be that the simplicity of awk helps me focus by limiting my options. Awk is a jig. In Paul Graham’s terms, awk teaches me not to want what it doesn’t provide.

Regular expressions

At first I wished awk were more expressive is in its regular expression implementation. But awk’s minimal regex syntax is consistent with the aesthetic of the rest of the language. Awk has managed to maintain its elegant simplicity by resisting calls to add minor conveniences that would complicate the language. The maintainers are right not to add the regex features I miss.

Awk does not support, for example, \d for digits. You have to type [0-9] instead. In exchange for such minor inconveniences you get a simple but adequate regular expression implementation that you could learn quickly. See notes on awk’s regex features here.

The awk book describes regular expressions in four leisurely pages. Perl regular expressions are an order of magnitude more complex, but not an order of magnitude more useful.

The post A small programming language first appeared on John D. Cook.

Simple example of Kleisli composition

John — Fri, 18 Nov 2022 12:50:25 +0000

When a program needs to work with different systems of units, it’s best to consistently use one system for all internal calculations and convert to another system for output if necessary. Rigidly following this convention can prevent bugs, such as the one that caused the crash of the Mars Climate Orbiter.

For example, maybe you need to work in degrees and radians. It would be sensible to do all calculations in radians, because that’s what software libraries expect, and output results in degrees, because that’s what humans expect.

Now suppose you have a function that takes in a length and doubles it, and another function takes in a length and triples it. Both functions take in length in kilometers but print the result in miles.

You would like the composition of the two functions to multiply a length by six. And as before, the composition would take in a speed in kilometers and return a speed in miles.

Here’s how we could implement this badly.

    miles_per_km = 5/8 # approx

    def double(length_km): 
        return 2*length_km*miles_per_km

    def triple(length_km): 
        return 3*length_km*miles_per_km

    length_km = 8
    d = double(length_km)
    print("Double: ", d)
    t = triple(d)
    print("Triple: ", t)

This prints

    Double: 10.0
    Triple: 18.75

The second output should be 30, not 18.5. The result is wrong because we converted from kilometers to miles twice. The correct implementation would be something like the following.

    miles_per_km = 0.6213712

    def double(length_km): 
        d = 2*length_km
        print("Double: ", d*miles_per_km)
        return d

    def triple(length_km): 
        t = 3*length_km
        print("Triple: ", t*miles_per_km)
        return t

    length_km = 8
    d = double(length_km)
    t = triple(d)

This prints the right result.

    Double: 10.0 
    Triple: 30.0

In abstract terms, we don’t want the composition of f and g to be simply g ∘ f.

We have a function f from X to Y that we think of as our core function, and a function T that translates the output. Say f doubles its input and T translates from kilometers to miles. Let f* be the function that takes X to TY, i.e. the combination of f and translation.

Now take another function g from Y to Z and define g* as the function that takes Y to TZ. We want the composition of f* and g* to be

g* ∘ f* = T ∘ g ∘ f.

In the example above, we only want to convert from kilometers to miles once. This is exactly what Kleisli composition does. (“Kleisli” rhymes with “highly.”)

Kleisli composition is conceptually simple. Once you understand what it is, you can probably think of times when it’s what you wanted but you didn’t have a name for it.

Writing code to encapsulate Kleisli composition takes some infrastructure (i.e. monads), and that’s a little complicated, but the idea of what you’re trying to achieve is not. Notice in the example above, what the functions print is not what they return; the print statements are a sort of side channel. That’s the mark of a monad.

Kleisli categories

The things we’ve been talking about are formalized in terms of Kleisli categories. You start with a category C and define another category that has the same objects as C does but has a different notion of composition, i.e. Kleisli composition.

Given a monad T on C, the Kleisli category C_T has the same objects as C. An arrow f* from X to Y in C_T corresponds to an arrow f from X to TY in C. In symbols,

Hom_{C_T}(X, Y) = Hom_C(X, TY).

Mr. Kleisli’s motivation for defining his categories was to answer a more theoretical question—whether all monads arise from adjunctions—but more practically we can think of Kleisli categories as a way of formalizing a variation on function composition.

The post Simple example of Kleisli composition first appeared on John D. Cook.

Getting pulled back in

John — Thu, 17 Nov 2022 18:51:27 +0000

“Just when I thought I was out, they pull me back in.” — Michael Corleone, The Godfather, Part 3

My interest in category theory goes in cycles. Something will spark my interest in it, and I’ll dig a little further. Then I reach my abstraction tolerance and put it back on the shelf. Then sometime later something else comes up and the cycle repeats. Each time I get a little further.

A conversation with a client this morning brought me back to the top of the cycle: category theory may be helpful in solving a concrete problem they’re working on.

I’m skeptical of applied category theory that starts with categories. I’m more bullish on applications that start from the problem domain, a discussion something like this.

“Here’s a pattern that we’re trying to codify and exploit.”

“Category theory has a name for that, and it suggests you might also have this other pattern or constraint.”

“Hmm. That sounds plausible. Let me check.”

I think of category theory as a pattern description language, a way to turn vague analogies into precise statements. Starting from category theory and looking for applications is less likely to succeed.

When I left academia the first time, I got a job as a programmer. My first assignment was to make some change to an old Fortran program, and I started asking a lot of questions about context. My manager cut me off saying “You’ll never get here from there.” I had to work bottom-up, starting from the immediate problem. That lesson has stuck with me ever since.

Sometimes you do need to start from the top and work your way down, going from abstract to concrete, but less often that I imagined early in my career.

The post Getting pulled back in first appeared on John D. Cook.

Code katas taken more literally

John — Mon, 08 Aug 2022 15:22:13 +0000

Code katas are programming exercises intended to develop programming skills, analogous to the way katas develop martial art skills.

But literal katas are choreographed. They are rituals rather than problem-solving exercises. There may be an element of problem solving, such as figuring how to better execute the prescribed movements, but katas are rehearsal rather than improvisation.

CodeKata.com brings up the analogy to musical practice in the opening paragraph of the home page. But musical practice is also more ritual than problem-solving, at least for classical music. A musician might go through major and minor scales in all 12 keys, then maybe a chromatic scale over the range of the instrument, then two different whole-tone scales, etc.

A code kata would be more like a jazz musician improvising a different melody to the same chord changes every day. (Richie Cole would show off by improvising over the chord changes to Cherokee in all twelve keys. I don’t know whether this was a ritual for him or something he would pull out for performances.)

This brings up a couple questions. What would a more literal analog of katas look like for programming? Would these be useful?

I could imagine someone going through a prescribed sequence of keystrokes that exercise a set of software features that they wanted to keep top of mind, sorta like practicing penmanship by writing out a pangram.

This is admittedly a kind of an odd idea. It makes sense that the kinds of exercises programmers are interested in require problem solving rather than recall. But maybe it would appeal to some people.

***

Image “karate training” by Genista is licensed under CC BY-SA 2.0 .

The post Code katas taken more literally first appeared on John D. Cook.

Visualizing C operator precedence

John — Fri, 03 Jun 2022 17:10:30 +0000

Here’s an idea for visualizing C operator precedence. You snake your way through the diagram starting from left to right.

Operators at the same precedence level are on the same horizontal level.

Following the arrows for changing directions, you move from left-to-right through the operators that associate left-to-right and you move right-to-left through the operators that associate right-to-left.

Although this diagram is specifically for C, many languages follow the same precedence with minor exceptions. For example, all operators that Perl shares with C follow the same precedence as C.

The post Visualizing C operator precedence first appeared on John D. Cook.

Tool recursion

John — Thu, 12 May 2022 15:58:20 +0000

“Literature about Lisp rarely resists that narcissistic pleasure of describing Lisp in Lisp.” — Christian Queinnec, Lisp in Small Pieces

Applying software development tools to themselves has a dark side and a light side.

There’s a danger of becoming obsessed with one’s tools and never getting around to using them. If it’s your job to cut down a tree, there’s some benefit to sharpening your ax, but you can’t only sharpen your ax. At some point you hit diminishing return and it’s time to start chopping.

But there are benefits to self-referential systems, such as macros that use Lisp to generate Lisp, or writing a C compiler in C, or using Emacs to tweak Emacs. There’s a kind of consistency that results, and there can be a compound return on effort. But as with writing a recursive procedure, there has to be a base case, a point at which you stop recursing. Otherwise you go down the black hole of becoming absorbed in your tool and never using it for work.

Even though I’ve used Emacs for a long time, I’ve never understood the recursive fascination some people have with it. For example, part of the elevator pitch for Emacs is that it’s self-documenting. You can pull up help on Emacs from inside Emacs. But you can also type your questions into a search engine without having to learn an arcane help API. What’s the advantage to the former?

For one thing, using Emacs help inside Emacs works without a network connection. For another, you avoid the risk of being distracted by something you see when you’re using your web browser. But the most subtle benefit is the compound effect of self-reference. You can use the same navigation commands in the help system that you can when editing text, you can execute code snippets in place, etc.

When I hear “Isn’t it cool that you can do X in X?” my first thought is “Yeah, that sounds cool” but my second thought is “But why would you want to do that? Sounds like it could be really hard.” I’m starting to appreciate that there are sometimes long-term benefits to these sort of recursive tool uses even if they’re not optimal in the short run.

The post Tool recursion first appeared on John D. Cook.

Nota bene

John — Sat, 23 Oct 2021 14:33:33 +0000

NB

I was looking at the J programming language yesterday and I was amused to see that it uses “NB.” to mark the rest of a line of source code as a comment, just like # in Python or // in C++. This makes comments in J look like comments in English prose.

“NB” abbreviates the Latin phrase nota bene meaning “note well.” It’s been used to mark side notes in English for about three centuries.

Most programming languages couldn’t use “NB” or “NB.” as a comment marker because it would inconsistent with conventions for identifiers, but J’s unconventional code syntax allows it to use conventional English notation for comments.

Why J?

I was looking at J because I have a project looking at its younger sister Remora. As described in this paper,

Remora is a higher-order, rank-polymorphic array-processing programming language, in the same general class of languages as APL and J. It is intended for writing programs to be executed on parallel hardware.

J keeps the array-oriented core of APL but drops its infamous symbols. Remora syntax is even closer to the mainstream, being written like a Lisp. (Some might object that Lisp isn’t mainstream, but it sure is compared to APL or J.)

APL comment symbol

Learning about J’s comment marker made me curious what its APL counterpart was. APL had custom symbols for everything, including comments. Comments began with ⍝ (U+235D), the idea being that the symbol looked like a lamp, giving light to the poor soul mentally parsing code.

The full name for the lamp symbol is “APL FUNCTIONAL SYMBOL UP SHOE JOT.” Since this section of code is explicitly for APL symbols, why not call the symbol COMMENT or LAMP rather than UP SHOE JOT?

I suppose the comment symbol looks like the bottom of a shoe. There’s also a symbol ⍦ (U+2366) [1] with the name “APL FUNCTIONAL SYMBOL DOWN SHOE STILE”

and so “up” and “down” must refer to the orientation of the part of the symbol that looks like ∩ and ∪. But what about “jot” and “stile”?

A jot is a small character. The name is related to the Greek letter iota (ι) and the Hebrew letter yod (י). But if ∩ and ∪ are a shoe, the “jot” is a fairly large circle. Does “jot” have some other meaning?

A “stile” is a step or a rung, as in a turnstile. I suppose the vertical bar on top of ∪ is a stile.

[1] What is this character for in APL? Unicode includes it as an APL symbol, but it’s not included in Wikipedia’s list of APL symbols.

The post Nota bene first appeared on John D. Cook.

Monads and Macros

John — Sun, 26 Sep 2021 18:14:21 +0000

There are two techniques in software development that have an almost gnostic mystique about them: monads and macros.

Pride and pragmatism

As with everything people do, monads and macros are used with mixed motives, for pride and for pragmatism.

As for pride, monads and macros have just the right barrier to entry: high enough to keep out most programmers, but not so high as to be unsurmountable with a reasonable amount of effort. They’re challenging to learn, but not so challenging that you can’t show off what you’ve learned to a wide audience.

As for pragmatism, both monads and macros can be powerful in the right setting. They’re both a form of information hiding.

Monads let you concentrate on functions by providing a sort of side channel for information associated with those functions. For example, you may have a drawing program and want to compose a rotation and stretching. These may be ostensibly pure functions, but they also effect a log of operations that lets you undo actions. It would be burdensome to consider this log as an explicit argument passed into and returned from every operation. so you might keep this log information in a monad.

Macros let you hide the fact that your programming language doesn’t have features that you would like. Operator overloading is an example of adding the appearance of a new feature to a programming language. Macros take this much further, for better or for worse. If you think operator overloading is good because of its potential to clarify code, you’ll like macros. If you think operator overload is bad because of its potential for misuse, you definitely won’t like macros.

Mutually exclusive

Few people are excited about both monads and macros; only one person that I know comes to mind.

Monads and macros appeal to opposite urges: the urge to impose rules and the urge to get around rules. There is a time for both, a time to build structure and a time to tear structure down.

Monads are most popular in Haskell, and macros in Lisp. These are very different languages and their communities have very different values [1].

The ideal Haskell program has such rigid structure that only correct code will compile. The ideal Lisp program is so flexible that it is essentially a custom programming language.

A Haskell programmer would like to say that a program is easy to reason about because of its mathematical structure. A Lisp programmer would like to say a program is easy to read because it maps well onto the problem at hand.

Lisp enthusiast Doug Hoyte says in his book Let Over Lambda

As we now know, nobody truly understands macros.

A Haskell programmer would find this horrifying, but a Lisp programmer would consider it an acceptable price to pay or even something fun to explore.

[1] Here I’m referring to archetypes, generalizations but not exaggerations. Of course no language community is monolithic, and an individual programmer will have different approaches to different tasks. But as a sweeping generalization, Haskell programmers value structure and Lisp programmers value flexibility.

The post Monads and Macros first appeared on John D. Cook.

Pareto and Pandas

John — Thu, 11 Mar 2021 16:44:09 +0000

This post muses about what it means to learn a software library. I’ll use Pandas as an example, but the post isn’t just about Pandas.

Suppose you say “I want to learn Pandas.” That implicitly assumes Pandas one thing, and in a sense it is. In another sense Pandas is hundreds of things.

At the top level, the pandas module (version 1.2.0) has 142 things inside.

    >>> import pandas as pd
    >>> len(dir(pd))
    142

The two most important things inside are the Series and DataFrame objects. They each in turn contain hundreds of things.

    >>> len(dir(pd.Series))
    434
    >>> len(dir(pd.DataFrame))
    441

That’s evidence Pandas’ diversity. But here’s evidence of it’s unity: most of the things inside these two objects have the same names.

    >>> s = set(dir(pd.Series))
    >>> d = set(dir(pd.DataFrame))
    >>> len(s.union(d))
    491
    >>> len(s - d)
    50
    >>> len(d - s)
    57

Pandas kinda has a fractal dimension, having both complexity and unity. The best way to think about it is not as one monolithic thing, or as hundreds of isolated things. It’s a coherent, but not perfectly coherent, collection of related things. This is true of all software libraries. Pandas is more coherent than most libraries because it was initially the product of one mind, that of Wes McKinney.

This has a couple implications for what it means to “learn Pandas.” Because Pandas is big, you have to explore it strategically, not exhaustively. And because Pandas is coherent, part of what it means to learn Pandas is to develop a feel for the way Pandas does things.

No one is going to learn Pandas by studying every object, every method on every object, and every argument to every method on every object. It’s too big. That’s also unnecessary.

There’s probably something like a Pareto distribution on the usefulness of features. The most commonly used features are used far, far more often than the most obscure features.

It would be interesting to do some kind of survey to see which features are actually used and how often. But I don’t think that’s practical. The easiest thing to do would be to find some large code base that heavily uses Pandas. But that’s not typical of how Pandas is used. Probably most lines of code using Pandas are scattered over millions of small scripts, much of it not in production code.

A well-designed library makes it possible to make good guesses about functionality you haven’t used. You learn the gestalt of the library. You can always look up API documentation as needed, but you can’t develop an intuition for a library just-in-time.

“Learn Pandas” is a daunting goal, and maybe an impossible goal if by “learn” you mean explore exhaustively. But “learn how to do my common tasks quickly in Pandas” and “develop a feel for how to do things in Pandas” are much smaller tasks.

The post Pareto and Pandas first appeared on John D. Cook.

Expressiveness

John — Mon, 20 Jul 2020 15:45:09 +0000

Programmers like highly expressive programming languages, but programming managers do not. I wrote about this on Twitter a few months ago.

Q: Why do people like Lisp so much?

A: Because Lisp is so expressive.

Q: Why don’t teams use Lisp much?

A: Because Lisp is so expressive.

Q: Why do programmers complain about Java?

A: Because it’s not that expressive.

Q: Why do businesses use Java?

A: Because it’s not that expressive.

A highly expressive programming language offers lots of options. This can be a good thing. It makes programming more fun, and it can lead to better code. But it can also lead to more idiosyncratic code.

A large programming language like Perl allows developers to carve out language subsets that hardly overlap. A team member has to learn not only the parts of the language he understands and wants to use, but also all the parts that his colleagues might use. And those parts that he might accidentally use.

While Perl has maximal syntax, Lisp has minimal syntax. But Lisp is also very expressive, albeit in a different way. Lisp makes it very easy to extend the language via macros. While Perl is a big language, Lisp is an extensible language. This can also lead to each programmer practically having their own language.

With great expressiveness comes great responsibility. A team using a highly expressive language needs to develop conventions for how the language will be used in order to avoid fracturing into multiple de facto languages.

But what if you’re a team of one? Now you don’t need to be as concerned how other people use your language. You still may need to care somewhat. You want to be able to grab sample code online, and you may want to share code or ask others for help. It pays not to be entirely idiosyncratic, though you’re free to wander further from the mainstream.

Even when you’re working in a team, you still may have code that only you use. If your team is producing C# code, and you secretively use a Perl script to help you find things in the code, no one needs to know. On the other hand, there’s a tendency for personal code to become production code, and so personal tools in a team environment are tricky.

But if you’re truly working by yourself, you have great freedom in your choice of tools. This can take a long time to sort out when you leave a team environment to strike out on your own. You may labor under your previous restrictions for a while before realizing they’re no longer necessary. At the same time, you may choose to stick to your old tools, not because they’re optimal for your new situation, but because it’s not worth the effort to retool.

(Regarding the last link, think myth as in Joseph Campbell, not myth as in Myth Busters.)

The post Expressiveness first appeared on John D. Cook.

From shell to system

John — Mon, 20 Jul 2020 14:01:18 +0000

Routine computer tasks and system programming require different tools, though I’m not entirely sure why.

Many people have thought about how inconsistent shells and system programming languages are and tried to unite them. Wouldn’t it be nice to use one language for everything? But attempts to bring system languages down to the shell, or to push shell programming up to large programs, have not been very successful.

I learned Perl in college so I wouldn’t have to learn shell programming. That’s what Perl was initially designed to be: an alternative to shell scripting. Larry Wall called Perl a “distillation of Unix culture.”

Perl is the most disliked programming language according to Stack Overflow. And yet I imagine many who complain about Perl gladly use the menagerie of quirky tools that Perl was created to unify. Bash is popular while Perl is unpopular, and yet the quirkiest parts of Perl are precisely those it shares with bash.

I expect much of the frustration with Perl comes from using it as a language for writing larger programs. Perl is very terse and expressive. These features are assets for one-liners and individual use. They are liabilities for large programs and team development.

Compared to a system programming language like Java, Perl is complex, inconsistent, and unsafe. But compared to shell scripting, Perl is simple, consistent, and safe!

The post From shell to system first appeared on John D. Cook.

Software analysis and synthesis

John — Thu, 09 Jul 2020 17:21:45 +0000

People who haven’t written large programs think that writing software is easy. All you have to do is break a big problem into smaller problems until you have something so small that it’s easy to program.

The problem is putting the pieces back together. If you’ve only written small programs, you haven’t had many pieces to put together. It’s harder to put the pieces together when you write a large program by yourself. It’s even harder when you work on a large program with other people.

Synthesis is harder than analysis. Or as Perdita Stevens put it, integration is harder than separation.

The image above is a screenshot from her keynote at the RC2020 conference on reversible computation.

The post Software analysis and synthesis first appeared on John D. Cook.

Randomization audit

John — Mon, 08 Jun 2020 16:02:13 +0000

“How would you go about drawing a random sample?”

I thought that was kind of a silly question. I was in my first probability class in college, and the professor started the course with this. You just take a sample, right?

As with many things in life, it gets more complicated when you get down to business. Random sample of what? Sampled according to what distribution? What kind of constraints are there?

How would you select a random sample of your employees in a way that you could demonstrate was not designed to include any particular employee? How can you not only be fair, but convince people who may not understand probability that you were fair?

When there appear to be patterns in your random selections, how can you decide whether there has been an error or whether the results that you didn’t expect actually should have been expected?

How do you sample data that isn’t simply a list but instead has some sort of structure? Maybe you have data split into tables in a relational database. Maybe you have data that naturally forms a network. How can you sample from a relational database or a network in a way that respects the underlying structure?

How would you decide whether the software you’re using for randomization is adequate for your purposes? If it is adequate, are you using it the right way?

These are all questions that I’ve answered for clients. Sometimes they want me to develop randomization procedures. Sometimes they want me to audit the procedures they’re already using. They not only want good procedures, they want procedures that can stand up to critical review.

If you’d like me to design or audit randomization procedures for your company, let’s talk. You will get the assurance that you’re doing the right thing, and the ability to demonstrate that you’re doing the right thing.

The post Randomization audit first appeared on John D. Cook.

Pretending OOP never happened

John — Fri, 15 May 2020 12:26:32 +0000

I ran across someone recently who says the way to move past object oriented programming (OOP) is to go back to simply telling the computer what to do, to clear OOP from your mind like it never happened. I don’t think that’s a good idea, but I also don’t think it’s possible.

Object oriented programming, for all its later excesses, was a big step forward in software engineering. It made it possible to develop much larger programs than before, maybe 10x larger. Someone might counter by saying that programs had to be 10x larger because of all the overhead of OOP, but that’s not the case. OOP does add some overhead, and the amount of overhead grew over time as tools and frameworks became more complicated, but OOP made it possible to write programs that could not have been written before.

OOP provides a way for programmers to organize their code. It may not be the best way, depending on the problem, but the way to move past OOP is to replace it with another discipline. And I imagine most people who have learned and then rejected OOP do just that, whether they realize it or not. Maybe they retain some organizational patterns that they learned in the context of OOP.

That has been my experience. I hardly ever write classes anymore; I write functions. But I don’t write functions quite the way I did before I spent years writing classes.

And while I don’t often write classes, I do often use classes that come from libraries. Sometimes these objects seem like they’d be better off as bare functions, but I imagine the same libraries would be harder to use if no functions were wrapped in objects.

There are many alternatives to OOP for organizing code. I generally like functional programming, but in my experience there’s a hockey stick effort curve as you try to push the purity to 100%. James Hague said it well:

100% pure functional programming doesn’t work. Even 98% pure functional programming doesn’t work. But if the slider between functional purity and 1980s BASIC-style imperative messiness is kicked down a few notches — say to 85% — then it really does work. You get all the advantages of functional programming, but without the extreme mental effort and unmaintainability that increases as you get closer and closer to perfectly pure.

It’s possible, and a good idea, to develop large parts of a system in purely functional code. But someone has to write the messy parts that interact with the outside world.

Short essays on programming languages

John — Fri, 24 Apr 2020 15:40:34 +0000

I saw a link to So You Think You Know C? by Oleksandr Kaleniuk on Hacker News and was pleasantly surprised. I expected a few comments about tricky parts of C, and found them, but there’s much more. The subtitle of the free book is And Ten More Short Essays on Programming Languages. Good reads.

This post gives a few of my reactions to the essays, my even shorter essays on Kaleniuk’s short essays.

My C

The first essay is about undefined parts of C. That essay, along with this primer on C obfuscation that I also found on Hacker News today, is enough to make anyone run screaming away from the language. And yet, in practice I don’t run into any of these pitfalls and find writing C kinda pleasant.

I have an atypical amount of freedom, and that colors my experience. I don’t maintain code that someone else has written—I paid my dues doing that years ago—and so I simply avoid using any features I don’t fully understand. And I usually have my choice of languages, so I use C only when there’s a good reason to use C.

I would expect that all these dark corners of C would be accidents waiting to happen. Even if I don’t intentionally use undefined or misleading features of the language, I could use them accidentally. And yet in practice that doesn’t seem to happen. C, or at least my personal subset of C, is safer in practice than in theory.

APL

The second essay is on APL. It seems that everyone who programs long enough eventually explores APL. I downloaded Iverson’s ACM lecture Notation as a Tool of Thought years ago and keep intending to read it. Maybe if things slow down I’ll finally get around to it. Kaleniuk said something about APL I hadn’t heard before:

[APL] didn’t originate as a computer language at all. It was proposed as a better notation for tensor algebra by Harvard mathematician Kenneth E. Iverson. It was meant to be written by hand on a blackboard to transfer mathematical ideas from one person to another.

There’s one bit of notation that Iverson introduced that I use fairly often, his indicator function notation described here. I used it a report for a client just recently where it greatly simplified the write-up. Maybe there’s something else I should borrow from Iverson.

Fortran

I last wrote Fortran during the Clinton administration and never thought I’d see it again, and yet I expect to need to use it on a project later this year. The language has modernized quite a bit since I last saw it, and I expect it won’t be that bad to use.

Apparently Fortran programmers are part of the dark matter of programmers, far more numerous than you’d expect based on visibility. Kaleniuk tells the story of a NASA programming competition in which submissions had to be written in Fortran. NASA cancelled the competition because they were overwhelmed by submissions.

Syntax

In his last essay, Kaleniuk gives some ideas for what he would do if he were to design a new language. His first point is that our syntax is arbitrarily constrained. We still use the small collection of symbols that were easy to input 50 years ago. As a result, symbols are highly overloaded. Regular expressions are a prime example of this, where the same character has to play multiple roles in various contexts.

I agree with Kaleniuk in principle that we should be able to expand our vocabulary of symbols, and yet in practice this hasn’t worked out well. It’s possible now, for example, to use λ than lambda in source code, but I never do that.

I suspect the reason we stick to the old symbols is that we’re stuck at a local maximum: small changes are not improvements. A former client had a Haskell codebase that used one non-ASCII character, a Greek or Russian letter if I remember correctly. The character was used fairly often and it did made the code slightly easier to read. But it wreaked havoc with the tool chain and eventually they removed it.

Maybe a wholehearted commitment using more symbols would be worth it; it would take no more effort to allow 100 non-ASCII characters than to allow one. For that matter, source code doesn’t even need to be limited to text files, ASCII or Unicode. But efforts along those lines have failed too. It may be another local maximum problem. A radical departure from the status quo might be worthwhile, but there’s not a way to get there incrementally. And radical departures nearly always fail because they violate Gall’s law: A complex system that works is invariably found to have evolved from a simple system that worked.

The post Short essays on programming languages first appeared on John D. Cook.

Software development | John D. Cook

An AI Odyssey, Part 4: Astounding Coding Agents

Two cheers for ugly code

Experiences with GPT-5-Codex

Experiences with Nvidia

On Making Databases Run Faster

Code Profiling Without a Profiler

Visual Studio C++ Code Example

How hard is constraint programming?

Choosing a Computer Language for a Project

Experiences with Thread Programming in Microsoft Windows

References

What’s the Best Code Editor?

Avoiding Multiprocessing Errors in Bash Shell

References:

Is Low Precision Arithmetic Safe?

References

When High Performance Computing Is Not High Performance

Leading zeros

Excel

Zip codes

Octal

Dates

Computer science versus software engineering

Naming Awk

A small programming language

Awk

How I use awk

Regular expressions

Simple example of Kleisli composition

Kleisli categories

Related posts

Getting pulled back in

Code katas taken more literally

Visualizing C operator precedence

Related posts

Tool recursion

Nota bene

NB

Why J?

APL comment symbol

Related posts

Monads and Macros

Pride and pragmatism

Mutually exclusive

Related posts

Pareto and Pandas

Related posts

Expressiveness

Related posts

From shell to system

Related posts

Software analysis and synthesis

Randomization audit

Pretending OOP never happened

More programming posts

Short essays on programming languages

My C

APL

Fortran

Syntax

Related posts