Understanding The Benefits & Problems with Data Lakes

As more credit unions want to up their competitive game through data and analytics, the debate between data warehouses and data lakes continues. While solution providers and analysts line up on both sides of the discussion, understanding the advantages and drawbacks of a data lake can help your credit union determine if it’s truly the best fit for your needs.

To clear up any confusion, let’s recap the main distinctions between a data warehouse and a data lake from our previous article. Both serve as data repositories. The data warehouse, however, integrates primarily structured data from multiple data sources into one centralized, single source of truth, which is then made available to run complex queries quickly and efficiently. The data lake, on the other hand, lets credit unions store vast amounts of raw and unstructured data in its native form until it is needed, at which point it is transformed for analytics, reporting and visualization.

The Data Lake Upsides:

Boosts Competitive Advantage: As a tool, the data lake is helping to redefine the way credit unions analyze heaps of unstructured data for business decision-making. With the tremendous increase in competition, the need to analyze and utilize member data from all sources will be crucial. The data lake facilitates quick decision-making, advanced predictive analytics, and agile data-backed determinations.

Converges Data Sources: Data lakes can help resolve the nagging problem of accessibility and data integration. Credit unions can start to pull together massive volumes of data from various sources for analytics or to store for undetermined future uses. Rather than having dozens of independently managed collections of data, you can combine these sources into the unmanaged data lake.

Delivers Fast Results: Data lakes provide a platform to transform mountains of information for business benefits in near real-time. The data extracted from the data lake can be queried for information and analysis and further decision-making.

Reduces Expense: A data lake built in a public or hybrid cloud environment can help reduce some of the cost required to store raw data. Additionally, the data lake can potentially help cut costs through server and license reduction.

The Data Lake Drawbacks:

Lacks Compatibility: A data lake does not make data retrievable and queryable on its own. That capability must be built into the data lake through unique metadata tags. Without these tags, the data lake quickly dissolves into what many have dubbed the data swamp.
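To make the idea of metadata tags concrete, here is a minimal sketch in Python. It assumes a hypothetical in-memory `Catalog` class, invented file paths, and an invented set of required tags (`source`, `subject`, `loaded_on`). A real data lake would rely on a dedicated catalog service, so treat this as an illustration of the principle rather than a recommended design.

```python
from dataclasses import dataclass, field


@dataclass
class LakeObject:
    """A raw file in the lake plus the metadata tags that make it findable."""
    path: str
    tags: dict = field(default_factory=dict)


class Catalog:
    """Hypothetical in-memory catalog, used here only for illustration."""

    REQUIRED_TAGS = {"source", "subject", "loaded_on"}  # assumed tag set

    def __init__(self):
        self._objects = []

    def register(self, path, **tags):
        # Refuse untagged data: this is the guard against a "data swamp".
        missing = self.REQUIRED_TAGS - tags.keys()
        if missing:
            raise ValueError(f"missing required tags: {sorted(missing)}")
        obj = LakeObject(path, dict(tags))
        self._objects.append(obj)
        return obj

    def find(self, **criteria):
        # Return every object whose tags match all of the given criteria.
        return [
            o for o in self._objects
            if all(o.tags.get(k) == v for k, v in criteria.items())
        ]


catalog = Catalog()
catalog.register("raw/core/members_2024-01-31.csv",
                 source="core", subject="members", loaded_on="2024-01-31")
catalog.register("raw/crm/interactions_2024-01-31.json",
                 source="crm", subject="interactions", loaded_on="2024-01-31")

# Only tagged data can be located again later.
core_files = [o.path for o in catalog.find(source="core")]
```

The point of the sketch is the `register` check: data that arrives without tags is rejected up front instead of being stored where nobody can find it again.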

Requires Expertise: Data lakes are only as good as the person fishing in them. Someone with extensive skills must be tasked with ingesting the data, cleansing it, analyzing it and acting upon it. A data lake, at this point in its maturity, is best suited for the trained data scientists. To effectively transform the raw data into useful information, it requires the expertise that many credit unions do not have in-house today.

Hinders Security: By its definition, a data lake accepts any data, from any source, without oversight or governance. Data lakes focus on storing disparate data but do not focus on how or why data is governed, defined and secured. Experts agree that data lakes are a target for hackers. Since the technology and security capabilities are still emerging, it could put the credit union at risk and pose compliance problems.

Skews Results: Since the data stored in a data lake is unstructured and has potential data quality issues, the credit union runs the risk of the analytics being misinterpreted, inaccurate or imprecise.

Creates Data Graveyards: The reality for many credit unions is that data lakes are becoming data retention ponds. A credit union may well discover that it is simply storing heaps of raw data, unable to make use of that data for problem-solving and business growth. Data lakes require solid cleaning and archiving practices. Without a data analytics roadmap for how to use the data and a solid business intelligence strategy, the data lake can quickly become an expensive repository.

Consumes Time: Since data lakes hold mountains of unstructured data, they can squander the valuable time of data scientists if most of their initial effort is spent preparing and cleaning the data before any analysis can even begin.

Know Your End Goal
Credit unions should enter any new technology investment armed with questions and answers. In today’s fiercely competitive financial environment, where every single scrap of data matters, it’s important to stay abreast of all the data analytics tools available. However, we caution our readers to do their homework before diving in. Credit unions should be careful about jumping right into data lakes and using them as the main integration source for analytics. While the vision for data lakes has been focused on making large amounts of data available quickly, the credit union needs to first strategically assess its current and future business goals, consider the pros and cons of the data lake, and then determine the best tool for the job.

Are you looking for more tips and helpful advice on data management? At the Knowlton Group, we believe that the best data and analytics program starts with a great strategy and clearly defined roadmap and implementation plan. Our personalized approach to each engagement ensures that the specific needs and goals of your financial institution are captured for maximum results. Contact us today to get started.

Today’s “A:360” podcast answers the question “What is a data warehouse?” Learn what makes a data warehouse different from a “regular database” and what one could do for your organization.

Read the Transcribed Audio

Hey everyone. Welcome to today’s A:360. My name is Brewster Knowlton, and today we’re going to be answering the question: “What is a data warehouse?”

For starters, a data warehouse is just a special type of database. It has a few unique features in the way that it’s designed and set up that make it really valuable to us for the purposes of business intelligence and analytics.

The first major difference is that a data warehouse integrates data from multiple data sources. Let’s say you have a core, or a loan origination system, or a CRM system – for each of those applications there is going to be a separate database. And those databases don’t communicate with each other unless you’ve done some advanced integrations or custom development.

A data warehouse takes all of the data from all those different applications and integrates it into one, central data repository. [It is] one central location, a single source of truth if you will, that houses information about, in the case of a credit union, all of your members, products and services they have, their online banking interactions, their non-monetary and monetary interactions from the CRM system, their debit and credit card data… all of these different, currently disparate applications. A data warehouse takes all of that data and brings it into one central place where you can get that 360-degree view of your membership.
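As a rough illustration of that integration step, here is a small Python sketch using the standard-library `sqlite3` module. The three source tables (`core_members`, `los_loans`, `crm_interactions`), their columns, and the sample rows are all hypothetical stand-ins for separate application databases. A real warehouse load would be far more involved, so this only shows the shape of the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stand-ins for three separate application databases (hypothetical schemas).
    CREATE TABLE core_members (member_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE los_loans (loan_id INTEGER PRIMARY KEY, member_id INTEGER, balance REAL);
    CREATE TABLE crm_interactions (interaction_id INTEGER PRIMARY KEY, member_id INTEGER, channel TEXT);

    INSERT INTO core_members VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO los_loans VALUES (10, 1, 5000.0), (11, 1, 1200.0);
    INSERT INTO crm_interactions VALUES (100, 1, 'phone'), (101, 2, 'branch'), (102, 2, 'email');

    -- The integrated table: one row per member, drawing on every source.
    CREATE TABLE member_360 AS
    SELECT m.member_id,
           m.name,
           (SELECT COUNT(*) FROM los_loans l
             WHERE l.member_id = m.member_id) AS loan_count,
           (SELECT COALESCE(SUM(l.balance), 0) FROM los_loans l
             WHERE l.member_id = m.member_id) AS loan_balance,
           (SELECT COUNT(*) FROM crm_interactions c
             WHERE c.member_id = m.member_id) AS interaction_count
    FROM core_members m;
""")

# One query now answers questions that used to span three systems.
rows = conn.execute(
    "SELECT member_id, name, loan_count, loan_balance, interaction_count "
    "FROM member_360 ORDER BY member_id"
).fetchall()
```

Once the data sits in one place, a single query against `member_360` replaces three separate lookups in three separate applications.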

Another key feature of a data warehouse is that it is designed to read data out of it as opposed to write data into it. What this means is that it is optimized for you to retrieve results, retrieve datasets out of this database, as opposed to writing data to it. In a data warehouse, you’re only updating or adding new information every night for the most part. Whereas an operational database – like one that would sit behind a CRM system or any other on-premises technology that has to be written to frequently – has to be designed and optimized for write-based operations.

So, if you try to run complex queries against those operational databases – because they are designed to write data instead of read it – some of the queries that you’re going to run are going to take a pretty good amount of time to load. And if you have business users that you’re trying to deploy reports to, they’re not going to be too happy if they have to wait 2, 3, or 4 minutes for a report to render.

Though this point is a little bit more technical in nature, the fact that a data warehouse is designed to read data and retrieve results very quickly (as opposed to writing them because that only happens once a night) is a pretty important feature especially as you’re trying to deploy analytics and reports throughout the organization.

The next important feature of a data warehouse is that it is designed for historical reporting. It’s designed to, instead of just track what a balance is today, you’ll be able to ask questions like, “What was it today? What was it yesterday? Last month? Last year? Two years ago?” And so on. You have this historical analysis. And as we start to go into a world where we want to ask questions like, “What will happen next month?” As opposed to, “What happened last month?” We need to be able to have this historical information for the purposes of trending, for the purposes of predictive analytics, and a lot of the more advanced features that come as a byproduct of having this analytics platform built from a data warehouse.
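One common way to support this kind of historical question is to keep a dated snapshot of each balance from every nightly load. The sketch below assumes a hypothetical `balance_snapshot` table, month-end dates, and made-up balances, and uses Python's standard `sqlite3` module. It is one possible approach, not the only way a warehouse can store history.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per account per load, rather than a single current row.
    CREATE TABLE balance_snapshot (
        snapshot_date TEXT,     -- ISO date of the load (hypothetical)
        account_id    INTEGER,
        balance       REAL,
        PRIMARY KEY (snapshot_date, account_id)
    );
    INSERT INTO balance_snapshot VALUES
        ('2024-01-31', 1, 900.0),
        ('2024-02-29', 1, 1100.0),
        ('2024-03-31', 1, 1250.0);
""")


def balance_as_of(account_id, as_of_date):
    """Return the most recent snapshot balance on or before as_of_date."""
    row = conn.execute(
        "SELECT balance FROM balance_snapshot "
        "WHERE account_id = ? AND snapshot_date <= ? "
        "ORDER BY snapshot_date DESC LIMIT 1",
        (account_id, as_of_date),
    ).fetchone()
    return row[0] if row else None


latest = balance_as_of(1, "2024-03-31")
previous = balance_as_of(1, "2024-03-15")   # falls back to the February load
change = latest - previous
```

Because every load is kept, "What was it last month?" becomes a simple lookup, and the same history can later feed trending and predictive work.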

The last feature of a data warehouse that I’m going to talk about, at least for this podcast, is that it can enforce the consistency of data definitions. I alluded to this point in the previous podcast talking about what it means for an organization to be data-driven.

It is incredibly important to have consistent definitions for key terms like member, product or service. A data warehouse can do a really good job of enforcing those definitions by having those definitions already built in, so, when users pull reports – like a current member report – it will provide the same information whether person A from department A pulls it, or person B from department C pulls it. A data warehouse does a really nice job of enforcing those data definitions that we want to have be consistent. This is a critical component of being a data-driven organization.
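A simple way to picture a built-in definition is a database view that every report reads from. In this sketch the `member` table, the `current_member` view, and the rule itself (status is open and at least one open account) are all assumptions made up for illustration. Your organization's actual definition of a current member will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member (member_id INTEGER PRIMARY KEY, status TEXT, open_accounts INTEGER);
    INSERT INTO member VALUES (1, 'open', 2), (2, 'open', 0), (3, 'closed', 0);

    -- The agreed definition lives in one place. The rule is only an example.
    CREATE VIEW current_member AS
    SELECT member_id FROM member
    WHERE status = 'open' AND open_accounts > 0;
""")


def current_member_count():
    # Every report calls this, so no department re-implements the rule.
    return conn.execute("SELECT COUNT(*) FROM current_member").fetchone()[0]


marketing_count = current_member_count()   # pulled by department A
lending_count = current_member_count()     # pulled by department C
```

Both departments get the same number because neither of them wrote its own version of the rule.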

I could spend 20 or 30 minutes going on talking about what makes a data warehouse unique and different from a regular database. But the four key points I want you to walk away with are these:

  • A data warehouse integrates data from multiple data sources
  • A data warehouse is designed for read operations as opposed to write operations making it faster and more efficient for reporting and analytics
  • It aggregates historical data and captures historical data so that we can do trending analysis and other historical analysis whereas operational databases have more current information and less historical
  • A data warehouse enforces and ensures consistency of important data definitions
    • Terms like member, product, service, or household. It helps enforce those key terms that are really critical to having a strong analytics foundation.

Thanks for tuning in to today’s A:360!

There are plenty of data and analytics consultants in the market. The choice of a data and analytics consultant can be critical to the success of your project, so how can you be sure you have chosen the best team to partner with?

At The Knowlton Group, we specialize in working with financial institutions. We know your challenges, the technology you utilize, and can provide solutions backed by expertise and experience.

But so can a handful of other qualified consultancies.

What the other consultancies don’t have is what we call “The Knowlton Group Advantage”.

The Knowlton Group Advantage and Why It Matters

“The Knowlton Group Advantage” is the closest thing to a full service data and analytics solution that can be provided to financial institutions. With our partner ecosystem and in-house expertise, we can help you manage and implement each component of a data and analytics program.

There are several key steps involved with any data and analytics program.

Data and Analytics Strategy

You must have a plan of action and a strategic direction defined before commencing any data and analytics project. A data and analytics strategy is required to drive the strategic direction of the program for the next several years. Combining execution and strategy, this offering is critical to long-term success.

Data and Analytics Talent Acquisition and Staffing

Third party vendors and consultants can only do so much. Eventually, an organization must have the right internal talent to drive the strategic direction of the data and analytics program. The Knowlton Group can source, assess, and recommend the best data and analytics talent based on the cultural fit and skills required for the role.

Data Warehouse Implementation and Customization

Organizations that have a data warehouse in place often require additional integrations or customizations after the initial implementation. Need to integrate your new loan origination system or other third-party application into your data warehouse? No problem.

If you don’t have a data warehouse currently implemented, we have existing relationships with vendors who have been handpicked based on their ability to provide the highest quality product.

Data-Driven Business Strategies

You have a data warehouse in place and a data and analytics team staffed. Now what?

Let our team and partners work with you to define how best to leverage the data and analytics platform in place. We have helped each and every business unit within the bank or credit union drive business decisions and strategies through the use of data and analytics.

We can help you put your data to use.

Statistical Modeling and Analytics

There are times when advanced statistical modeling and predictive analytics require a highly specialized skill set and expertise. From segmentation to product propensity analysis to advanced machine learning applications, we can recommend the top experts in these areas (and more!) to ensure seamless delivery and execution.

Why it matters

There are many moving parts in a data and analytics program. This can present significant challenges to organizations without the in-house experience and expertise. Through our own internal capabilities and the strengths of our relationships with the best vendors in the market, we can fully support and manage the entire data and analytics lifecycle.

Take the stress out of your data and analytics program with The Knowlton Group Advantage.

Yesterday, LinkedIn reminded me that The Knowlton Group has been around for three years! It certainly doesn’t feel like three years have passed since filing all the documentation and getting the company officially registered.

Reflecting on this, the only thing more surprising than how fast time has flown is how much has changed in that time. Many of my preconceived notions about starting a consulting business were challenged – forcing me to adapt. Things that I thought would be easy proved to be much more challenging. In this spirit, I have put together a brief list of some of the lessons that I learned starting The Knowlton Group.

Focus, focus, and focus

The first day after leaving my position at State Employees’ Federal Credit Union, I woke up with a ton of excitement. After all, it was my first day of being completely dedicated full-time to The Knowlton Group and surviving (or failing!) on my own. I sat down in my office, opened up my computer…and aimlessly wasted the first couple hours reading various news sites, blogs, and perusing social media.

Especially working from home, one of the biggest challenges I dealt with was staying focused early in the business’ maturity. When starting out I had a couple clients and some prospects I was trying to pursue. By no means did I have a full day’s worth of work. I found that it was very easy to kill a couple hours doing…well…nothing.

It’s easy to stay glued to my screen working tirelessly now that opportunities abound. In fact, the hardest challenge is now pulling myself away from the computer to maintain some sense of work-life balance!

The biggest lesson I learned from all this was to create a to-do list and prioritize tasks every day. Before I wrap up each day’s work, I put together a to-do list and an outline for tomorrow’s day. This enables me to immediately start work the next day with a plan of action. This helps me stay focused and achieve specific objectives throughout the day.

Gif Credit: Ms-Blake

Focus, focus, focus.

Appreciate Everything – Positive and Negative

It’s obvious that one would be appreciative of their clients and partners. After all, without them, you would have no business. But over the past three years, I learned not only to appreciate the positives but to embrace and appreciate the negatives.

Not every prospect needs your services or is interested in them. Maybe they don’t think you’re qualified to solve their business need. Perhaps they don’t have the business need you’re trying to solve at the moment. Other times, an existing relationship with another vendor might take precedence over you. It was easy to be discouraged when I was just starting out, but I soon realized that every negative could be learned from and built into a positive.

The client that chooses another vendor enables me to learn how to better promote, communicate, and describe our services. Every negative opens up the possibility to improve – from business development to marketing to delivering services. Learning to embrace negatives and turn them into positives has been one of the best lessons learned over the past few years.

Appreciate and learn from everything – both the positives and negatives

Don’t Sell – Solve Problems

One of the preconceived notions I had before starting my business was that clients would be lining up at the door trying to book business. I felt that because I had the skills and qualifications, getting business and projects would be an easy sell.

The flaw in that line of thinking lay in the word “sell”.

Entrepreneurial naiveté led me to believe I could sell my skills and services. But that’s the worst way I could have approached business development. My focus needed to be on solving problems. Nobody cares that I can build a data warehouse, design advanced reports, develop SSIS/ETL packages or design a data strategy. What they care about is which business problem I can help them solve, and how I can make their life and their team’s life easier.

This mentality has been one of the biggest factors for our recent successes. Our marketing efforts and content development reflect this goal: let’s help organizations solve problems first and foremost.

Don’t sell – solve problems.

The Pipeline is Never Too Full

As a consultant, my day is a constant balancing act between execution (completing deliverables and making progress on projects) and business development (communication with prospects, content development, etc.). Earlier in the business’ life, I was able to line up a few projects that kept me busy all day. I dedicated 100% of my time to completing those projects. That worked out well…until the projects were finished and nothing else was in the pipeline!

Marketing and business development can NEVER stop. I may have six months of work lined up, but you can bet that I am dedicating an appropriate amount of time each week to developing new opportunities. Expectations and timelines must be clearly articulated to any potential client. If you aren’t going to be free until two months from now, be clear and upfront. I would much rather a prospect decline to do business because the timelines don’t work out than take on a project where I can’t dedicate the proper amount of time to it.

Many companies face this challenge: the implementation team needs to be able to support the new business generated, while the business development team must ensure that the implementation team is at a high utilization. This challenge becomes amplified when the business development team and implementation teams are small in number.

The pipeline is never too full.

Your Reputation is Everything

Luckily, this is not a lesson I had to learn the hard way! I have always prided myself on building a positive reputation on the principles for which I stand.

If I believe a client is asking for something outside of my qualifications, skills, or scope of operations, I have no problem letting them know that I am not the right person for the job. I would rather lose out on business than accept a project for which I would not provide the absolute highest quality work.

There have been instances, when speaking with prospects, where the services we offer might not be a good fit for them. I’m going to be 100% up-front and let the prospect know that I don’t believe they need this service.

Why lose out on that business? Because my reputation, and the reputation of the business, is of the utmost importance – more important than simply adding dollars to the bottom line. I want this reputation to be based on up-front honesty and delivering high-quality, high-value services. Forcing a project just to onboard another client and get more business would most certainly not uphold the reputation we are striving for.

Your reputation is everything.

Summary

It’s been a great ride so far. Starting and owning a business that you completely dedicate yourself to is the most nerve-racking, scary, stressful, exciting, fun, and rewarding endeavor. I’ve learned a lot in the last three years, and I expect to learn even more over the next three.

Thank you to everyone who has helped the company get to where it is today, and thank you to all who will help it get to where it will be in the future.

-Brewster Knowlton