Understanding the Benefits and Problems of Data Lakes

As more credit unions want to up their competitive game through data and analytics, the debate between data warehouses and data lakes continues. While solution providers and analysts line up on both sides of the discussion, understanding the advantages and drawbacks of a data lake can help your credit union determine if it’s truly the best fit for your needs.

To clear up any confusion, let’s recap the main distinctions between a data warehouse and a data lake from our previous article. Both serve as data repositories. The data warehouse, however, integrates primarily structured data from multiple sources into one centralized, single source of truth, which is then available for running complex queries quickly and efficiently. The data lake, on the other hand, lets credit unions store vast amounts of raw and unstructured data in its native form until it is needed, at which point it is transformed for analytics, reporting and visualization.

The Data Lake Upsides:

Boosts Competitive Advantage: As a tool, the data lake is helping to redefine the way credit unions analyze heaps of unstructured data for business decision-making. With the tremendous increase in competition, the ability to analyze and use member data from all sources will be crucial. The data lake facilitates quick decision-making, advanced predictive analytics, and agile data-backed determinations.

Converges Data Sources: Data lakes can help resolve the nagging problem of accessibility and data integration. Credit unions can start to pull together massive volumes of data from various sources for analytics or to store for undetermined future uses. Rather than having dozens of independently managed collections of data, you can combine these sources into the unmanaged data lake.

Delivers Fast Results: Data lakes provide a platform to transform mountains of information for business benefits in near real-time. Data extracted from the data lake can then be queried and analyzed to support further decision-making.

Reduces Expense: A data lake built in a public or hybrid cloud environment can help reduce some of the cost required to store raw data. Additionally, the data lake can potentially help cut costs through server and license reduction.

The Data Lake Drawbacks:

Lacks Compatibility: A data lake does not, on its own, store data in a form that is consistently retrievable and queryable. That capability must be built into the data lake through unique metadata tags. Without these tags, the data lake quickly dissolves into what many have dubbed the data swamp.
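To make the role of metadata tags concrete, here is a minimal sketch in Python of a tag-based catalog. It is an illustration only, not a description of any particular data lake product: the class names, file paths, sources and tags are all hypothetical. The idea is simply that every raw file is registered with tags when it lands, so it can be found again later.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One raw object in the lake plus the tags that make it findable."""
    path: str
    source: str
    tags: set = field(default_factory=set)


class LakeCatalog:
    """Hypothetical in-memory catalog; a real lake would persist this."""

    def __init__(self):
        self._entries = {}

    def register(self, path, source, tags):
        # Refuse untagged data: untagged files are how a lake becomes a swamp.
        if not tags:
            raise ValueError(f"{path} must carry at least one metadata tag")
        normalized = {tag.lower() for tag in tags}
        self._entries[path] = CatalogEntry(path, source, normalized)

    def find(self, *tags):
        """Return the paths of all entries that carry every requested tag."""
        wanted = {tag.lower() for tag in tags}
        return sorted(
            entry.path
            for entry in self._entries.values()
            if wanted <= entry.tags
        )


# Example usage with made-up files and tags.
catalog = LakeCatalog()
catalog.register("raw/core/loans_2023.csv", "core system", ["loans", "member"])
catalog.register("raw/web/clicks.json", "online banking", ["web", "member"])
print(catalog.find("member"))
print(catalog.find("loans", "member"))
```

Rejecting untagged files at registration time is a deliberate choice in this sketch: it is much cheaper to require a tag when data arrives than to work out later what an unlabeled file contains.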

Requires Expertise: Data lakes are only as good as the person fishing in them. Someone with extensive skills must be tasked with ingesting the data, cleansing it, analyzing it and acting upon it. A data lake, at this point in its maturity, is best suited for trained data scientists. Effectively transforming the raw data into useful information requires expertise that many credit unions do not have in-house today.

Hinders Security: By its definition, a data lake accepts any data, from any source, without oversight or governance. Data lakes focus on storing disparate data but do not focus on how or why data is governed, defined and secured. Experts agree that data lakes are a target for hackers. Since the technology and security capabilities are still emerging, a data lake could put the credit union at risk and pose compliance problems.

Skews Results: Since the data stored in a data lake is unstructured and has potential data quality issues, the credit union runs the risk of the analytics being misinterpreted, inaccurate or imprecise.
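One way to reduce this risk is to profile raw data before anyone analyzes it. The Python sketch below is a simple illustration under stated assumptions: the function name, the field names (member_id, balance) and the sample records are hypothetical, and a real quality check would cover far more than missing values and duplicate keys.

```python
from collections import Counter


def quality_report(records, required_fields, key_field):
    """Count missing required values and list duplicated keys.

    `records` is a list of dicts, one per raw row pulled from the lake.
    """
    missing = Counter()
    key_counts = Counter()
    for record in records:
        for name in required_fields:
            value = record.get(name)
            # Treat absent fields and blank strings as missing.
            if value is None or str(value).strip() == "":
                missing[name] += 1
        key_counts[record.get(key_field)] += 1
    duplicate_keys = sorted(
        key for key, count in key_counts.items()
        if key is not None and count > 1
    )
    return {
        "total": len(records),
        "missing": dict(missing),
        "duplicate_keys": duplicate_keys,
    }


# Example usage with made-up member rows.
rows = [
    {"member_id": "1001", "balance": "250.00"},
    {"member_id": "1001", "balance": ""},
    {"member_id": "1002"},
]
print(quality_report(rows, ["member_id", "balance"], "member_id"))
```

A report like this does not fix the data, but it tells the analyst how much to trust a result before it reaches a decision-maker.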

Creates Data Graveyards: The reality for many credit unions is that data lakes are becoming data retention ponds. A credit union may well discover that it is simply storing heaps of raw data without being able to use that data for problem-solving and business growth. Data lakes require solid cleaning and archiving practices. Without a data analytics roadmap for how to use the data and a solid business intelligence strategy, the data lake can quickly become an expensive repository.

Consumes Time: Since data lakes hold mountains of unstructured data, they can squander the valuable time of data scientists, who may spend most of their initial effort preparing and cleaning the data before any analysis can even begin.

Know Your End Goal
Credit unions should enter any new technology investment armed with questions and answers. In today’s fiercely competitive financial environment, where every single scrap of data matters, it’s important to stay abreast of all the data analytics tools available. However, we caution our readers to do their homework before diving in. Credit unions should be wary of jumping right into data lakes and using them as the main integration source for analytics. While the vision for data lakes has focused on making large amounts of data available quickly, the credit union first needs to strategically assess its current and future business goals, consider the pros and cons of the data lake, and then determine the best tool for the job.

Are you looking for more tips and helpful advice on data management? At the Knowlton Group, we believe that the best data and analytics program starts with a great strategy and clearly defined roadmap and implementation plan. Our personalized approach to each engagement ensures that the specific needs and goals of your financial institution are captured for maximum results. Contact us today to get started.