On Making Databases Run Faster

Database technology is a mature field, and techniques for optimizing databases are well understood. However, surprises can still happen.

Certain performance optimizations you might expect to be automatic are not. I’m working with legacy code developed some time ago, before modern notions of separating business logic from data storage. The code runs slower than you’d expect, and some have wondered why.

Profiling revealed that the slowdown was not in the core computation but in reads from and writes to the backend database, which happen frequently as the code executes.

My first thought was to run with the database on a RAM disk, which gives higher bandwidth and lower latency than spinning disk or SSD. This helped a little, but not much.

As a short-term fix I ended up writing an in-memory hash table to sit as an interposer between the code and the database. It caches commonly accessed values and so reduces database accesses.
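A minimal sketch of the idea, in Python for illustration (the backend.get / backend.put calls are placeholders standing in for the real database reads and writes, not the actual DAO API):

    class CachedStore:
        """Read-through / write-through hash table in front of a slow database.

        `backend` is any object exposing get(key) and put(key, value); those
        names are placeholders, not the real database calls."""

        def __init__(self, backend):
            self.backend = backend
            self.cache = {}  # in-memory hash table of recently used values

        def read(self, key):
            # Serve repeated reads from RAM; hit the database only on a miss.
            if key not in self.cache:
                self.cache[key] = self.backend.get(key)
            return self.cache[key]

        def write(self, key, value):
            # Write-through: keep the cache and the database consistent.
            self.cache[key] = value
            self.backend.put(key, value)

A real interposer would also want a size bound or an invalidation policy, but even this much captures the win: repeated reads are served from RAM instead of round-tripping to the database.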

I would’ve thought high-speed RAM caching of values would be default behavior for a database manager. A principle of interface design is to make the defaults as useful as possible. But in this case apparently not.

Thankfully my fix gave over 10X speedup in application launch time and 6X speedup in the core operation of the code.

The project team is moving toward SQLite for database management in the near future. SQLite offers a dozen or so tuning knobs for optimizations like this; a sketch of a few follows. However, early experiments with SQLite on this code show that more fundamental structural modifications will also be needed to improve its database access patterns.
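For example, SQLite’s PRAGMA statements let you enlarge the in-RAM page cache, switch to write-ahead logging, and relax how often it syncs to disk. A rough sketch using Python’s built-in sqlite3 module (the file name and the specific numbers are illustrative, not values tuned for this project):

    import sqlite3

    conn = sqlite3.connect("app.db")   # hypothetical database file

    # Enlarge the page cache held in RAM (a negative value is a size in KiB).
    conn.execute("PRAGMA cache_size = -65536")     # ~64 MiB
    # Write-ahead logging: append to a log instead of rewriting pages in place.
    conn.execute("PRAGMA journal_mode = WAL")
    # Sync to disk less often; generally considered safe under WAL.
    conn.execute("PRAGMA synchronous = NORMAL")
    # Keep temporary tables and indices in memory.
    conn.execute("PRAGMA temp_store = MEMORY")
    # Memory-map the database file so reads avoid extra copies.
    conn.execute("PRAGMA mmap_size = 268435456")   # 256 MiB

None of this replaces fixing the access patterns themselves; pragmas can only hide so much of a chatty read/modify/write loop.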

As with general code optimization, sometimes you’d be inclined to think the system (compiler, database manager, etc.) will “do it all for you.” But often it doesn’t.

4 thoughts on “On Making Databases Run Faster”

  1. The database systems I have most experience with (MySQL, SQL Server) most definitely cache frequently accessed data in RAM. In my experience as much work goes into managing those caching mechanisms as goes into managing the schema.

  2. Thanks for the comment. The existing library being used in the code is the Microsoft DAO Object Library. I understood from some sources that this library is deprecated. Also according to some sources it has limited RAM caching capabilities. Perhaps a better choice should have been made earlier in the project for a capable database management library —

  3. Love posts like these — plenty of rocks to turn over…

    I’ll go back and read again, but was the fix basically “fixing what an ORM got wrong”?

    Modern DBs have many — sometimes too many! — switches to flip for optimization…and that’s all the way at the end of chain, after the query optimizer and the ORM and the external (2PC? multi-client? strong-consistency-requiring? etc) clients kick it off.

    Vis-à-vis SQLite — good call. N.B. AWS is trying to roughly co-opt that for certain use cases with their ‘new’ DSQL for Postgres/Aurora.
