I recently heard the term data swamp, a humorous take on data lakes. I thought about data swamps yesterday when I hiked past the literal swamp in the photo above.
Swamps are a better metaphor than lakes for heterogeneous data collections because a lake is a homogeneous body of water. What makes a swamp a swamp is its heterogeneity.
The term “data lake” may bring up images of clear water, maybe an alpine lake like Lake Tahoe straddling California and Nevada. “Data swamp” makes me think of Louisiana’s Atchafalaya Basin, which is probably a more appropriate image for most data lakes.