DataXpresso

Data lakehouse: overhyped or here to stay?

Data lakehouse is a term that’s increasingly working its way into the consciousness of those of us in the data industry. But what does it actually mean? Is there any substance to it? And are there use cases where a lakehouse makes sense?

These are the questions any data professional needs to answer first to understand whether the lakehouse is worth exploring for their business. In the latest episode of DataXpresso, Helena Schwenk digs into the topic with Graham Sharpe, director of strategic solutions at Exasol. Check out the podcast here:

Where has the phrase ‘data lakehouse’ come from?

If you want to learn more about the origins of the term and see how relevant it may be to you, check out Helena’s recent blog on the topic.

Why is there confusion?

Put simply, the concept is still in its early stages, and vendors with different backgrounds are defining it in different ways.

Why is this being discussed now?

The data lakehouse terminology seems to have come about because of issues experienced at the high end of data use cases, those with large volumes and high throughput. In this scenario, throwing more compute power at the problem is beginning to show diminishing returns. Some are pushing the lakehouse as a way to satisfy small and medium use cases while buying time to work out how to take on the big problems involving really large volumes of data.

If you want to hear more on the debate, download the podcast and, while you’re at it, check out our back catalog here.