I get this question all the time: what makes a data warehouse different from a regular database?
To those in organizations with a less robust business intelligence and analytics maturity, it may be difficult to differentiate between the two. However, there are very distinct differences between a data warehouse and a regular database. In this post, I will quickly address the major differences.
Read vs. Write
A “regular” database is generally the destination for data modified from a front-end application. In banking, the core system will typically record all changes and data to a standard database. A CRM system (like Dynamics CRM) will write to a standard SQL Server database. These databases are specifically designed to write data (update or insert data) as quickly as possible. They are not optimized for read (data retrieval) operations. In other words, the database is not designed to efficiently get data out of it. A “regular” database is often normalized; this structure optimizes those write operations.
A data warehouse, on the other hand, is designed for read operations. The structure of the data warehouse is designed to accommodate the fact that the majority of a user’s interaction with the warehouse will be retrieving data from it. The only time a “write” operation – any operation that modifies existing data or adds new data – occurs in the warehouse is at night during the ETL process (in most data warehouses anyway).
Unlike a “regular” database, a data warehouse is denormalized. You will often hear this structure referred to as a star schema when discussed in a business intelligence context.
Point in Time vs. Historical
A credit union’s core banking database will very easily tell you what a member’s current share balance is. But will it just as easily tell you what that member’s balance was yesterday? Last week? Last month? A year ago?
Another major difference between a “regular” database and a data warehouse is that a data warehouse is designed for historical reporting. Data warehouses accrue information nightly from various source systems; a “regular” database typically only stores current data (or an extremely limited amount of historical data). Some historical data might be buried in log files or audit trails in a “regular” database, but the database is by no means designed to efficiently or easily access this data. A data warehouse, whose primary purpose is for analytics and intelligence, makes accessing this historical data incredibly easy.
Think of what you ask your front-end applications versus what you might ask a data warehouse. A front-end application (whose data is stored in a “regular” database) might be asked “what sales prospects are in our system today? A data warehouse might ask “what is the average number of sales prospects we have had in our pipeline for the last twelve months?”.
Single Source vs. Multiple Sources
A “regular” database is usually the backend to a front-end application often used by business users or customers. Each backend database is designed to only store data for that particular application.
A data warehouse, on the other hand, is designed to integrate data from various application sources. A healthcare example might be that the data warehouse integrates the main EMR system database, a 3rd party surgical scheduling application database, a billing database, and a logistics application’s database. A credit union’s data warehouse might integrate their core platform, consumer loan origination system, mortgage origination system, and CRM application. Each of these data warehouses have integrated data from various “regular” databases. The ability for a data warehouse to integrate data from various data sources is probably the most important difference between a “regular database” and a data warehouse.
There are several other differences between a data warehouse and a “regular” database, but these few points represent the major differences in structure and design. As big data, analytics, and business intelligence become more and more critical to an organization’s sustained success, becoming familiar with some of these key business intelligence concepts will become a necessity. Take a look at some of our other business intelligence posts to learn more!