Today’s “A:360” podcast answers the question “What is a data warehouse?”. Learn what makes a data warehouse different from a “regular database” and what one could do for your organization.
Watch and Listen
Listen to the Podcast
Read the Transcribed Audio
Hey everyone. Welcome to today’s A:360. My name is Brewster Knowlton, and today we’re going to be answering the question: “What is a data warehouse?”
For starters, a data warehouse is just a special type of database. It has a few unique features in the way that it’s designed and setup that make it really valuable to us for the purposes of business intelligence and analytics.
The first major difference is that a data warehouse integrates data from multiple data sources. Let’s say you have a core, or a loan origination system, or a CRM system – for each of those applications there is going to be a separate database. And those databases don’t communicate with each other unless you’ve done some advanced integrations or custom development.
A data warehouse takes all of the data from all those different applications and integrates it into one, central data repository. [It is] one central location, a single source of truth if you will, that houses information about, in the case of a credit union, all of your members, products and services they have, their online banking interactions, their non-monetary and monetary interactions from the CRM system, their debit and credit card data… all of these different, currently disparate applications. A data warehouse takes all of that data and brings it into one central place where you can get that 360-degree view of your membership.
Another key feature of a data warehouse is that it is designed to read data out of it as opposed to write data into it. What this means is that is it optimized for you to retrieve results, retrieve datasets out of this database, as opposed to writing data to it. In a data warehouse, you’re only updating or adding new information every night for the most part. Whereas an operational database – like one that would sit behind a CRM system or any other on premise technology that has to be written to frequently – has to be designed and optimized for write-based operations.
So, for you to be able to run complex queries in a data warehouse – because those operational databases are designed to write instead of read data – some of your queries that you’re going to run are going to take a pretty good amount of time to load. And if you have business users that you’re trying to deploy reports to, they’re not going to be too happy if they have to wait 2, 3, or 4 minutes for a report to render.
Though this point is a little bit more technical in nature, the fact that a data warehouse is designed to read data and retrieve results very quickly (as opposed to writing them because that only happens once a night) is a pretty important feature especially as you’re trying to deploy analytics and reports throughout the organization.
The next important feature of a data warehouse is that it is designed for historical reporting. It’s designed to, instead of just track what a balance is today, you’ll be able to ask questions like, “What was it today? What was it yesterday? Last month? Last year? Two years ago?” And so on. You have this historical analysis. And as we start to go into a world where we want to ask questions like, “What will happen next month?” As opposed to, “What happened last month?” We need to be able to have this historical information for the purposes of trending, for the purposes of predictive analytics, and a lot of the more advanced features that come as a byproduct of having this analytics platform built from a data warehouse.
The last feature of a data warehouse that I’m going to talk about, at least for this podcast, is that it can enforce the consistency of data definitions. I alluded to this point in the previous podcast talking about what it means for an organization to be data-driven.
It is incredibly important to have consistent definitions for key terms like member, product or service. A data warehouse can do a really good job of enforcing those definitions by having those definitions already built in, so, when users pull reports – like a current member report – it will provide the same information whether person A from department A pulls it, or person B from department C pulls it. A data warehouse does a really nice job of enforcing those data definitions that we want to have be consistent. This is a critical component of being a data-driven organization.
I could spend 20 or 30 minutes going on talking about what makes a data warehouse unique and different from a regular database. But the four key points I want you to walk away with are this:
- A data warehouse integrates data from multiple data sources
- A data warehouse is designed for read operations as opposed to write operations making it faster and more efficient for reporting and analytics
- It aggregates historical data and captures historical data so that we can do trending analysis and other historical analysis whereas operational databases have more current information and less historical
- A data warehouse enforces and ensures consistency of important data definitions
- Terms like member, product, service, or household. Tt helps enforce those key terms that are really critical to having a strong analytics foundation.
Thanks for tuning in to today’s A:360!
Subscribe to have new content sent directly to your email!