Big data is easy; big models are hard.
If all you wanted was to fit simple models to tons of data, that would be easy. You could subsample the data, throwing some of it away until you had a quantity you could comfortably manage.
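Here is a minimal sketch of throwing data away, as described above. Everything in it is an assumption made for illustration: the synthetic data (a straight line with intercept 2, slope 3, and unit Gaussian noise), the sample sizes, and the hypothetical helper `fit_line`. It fits a two-parameter line to the full data set and to a random 1% of it.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Synthetic "big" data set; the true intercept and slope are assumed values.
rng = random.Random(0)
N = 200_000
xs = [rng.uniform(0, 10) for _ in range(N)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 1) for x in xs]

# Keep a random 1% of the rows, without replacement, and discard the rest.
keep = rng.sample(range(N), 2_000)
sub_xs = [xs[i] for i in keep]
sub_ys = [ys[i] for i in keep]

full_a, full_b = fit_line(xs, ys)
sub_a, sub_b = fit_line(sub_xs, sub_ys)

print(f"full data: intercept={full_a:.3f}, slope={full_b:.3f}")
print(f"1% sample: intercept={sub_a:.3f}, slope={sub_b:.3f}")
```

With only two parameters to estimate, the 1% sample should land close to the full-data fit. A model with many parameters would not tolerate discarding 99% of its data so gracefully.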
But when you have tons of data, you want to take advantage of it and ask questions that simple models cannot answer. (“Big” data is often indirect data.) So the problem isn’t that you have a lot of data; it’s that you’re using models that require a lot of data. And that can be very hard.
I am not saying people should just use simple models. No, people are right to want to take advantage of their data, and often that does require complex models. (See Brad Efron’s explanation of why.) But the primary challenge is not the volume of data.
Related post: Big data and humility