Overcoming Challenges in Data Warehouse Modernization
Mar 2, 2022 12:00 PM - 1:00 PM EST
Are you having trouble maintaining or scaling your data warehousing? If you built your database five, 10, or even 20 years ago, it’s probably time to modernize.
Upgrading to a modern stack has great benefits for analytics — but there are a few critical variables to consider before you get started. Should you go to the cloud, and if so, what would that look like? Where and how can you automate? What are the EDW costs and how can you manage them successfully? This is where the Teknion Data Solutions team comes in. They have the expert wisdom and advice you need to modernize your data warehousing and tackle any obstacles along the way.
In this virtual event, Greg Irwin has a conversation with Michael Tantrum and David Brown from Teknion Data Solutions about modernizing your data warehousing. They explain what you need to know about automation, how to overcome common obstacles when upgrading your stack, and the key to managing cloud EDW costs.
Resultant is a modern consulting firm, focused on technology, data analytics, and digital transformation, with a passion for problem solving.
National Sales Director at Teknion
Michael Tantrum is the National Sales Director at Teknion Data Solutions, a team of data professionals that specializes in designing, building, and implementing data and analytic solutions for global organizations. He’s also the National Sales Director at Validatar, an automated data quality and data testing platform under the Teknion brand. Michael has 30 years of data experience and has worked with some of the largest data warehouses across industries. He earned his bachelor’s degree from the University of Auckland and his MBA from The University of Manchester.
Co-Founder, Co-CEO at BWG Strategy LLC
BWG Strategy is a research platform that provides market intelligence through Event Services, Business Development initiatives, and Market Research services. BWG hosts over 1,800 interactive executive strategy sessions (conference calls and in-person forums) annually that allow senior industry professionals across all sectors to debate fundamental business topics with peers, build brand awareness, gather market intelligence, network with customers/suppliers/partners, and pursue business development opportunities.
Chief Revenue Officer at Teknion
David Brown is the Chief Revenue Officer at Teknion Data Solutions. He has worked with Teknion for almost a decade, previously serving as the company’s COO and EVP of Sales. Before this, David held positions as a COO and CEO for multiple companies, including Gibson Products, The Noble Group, and Precise Equipment Company. He has a degree in marketing from Texas McCombs School of Business and an MBA from the University of Georgia’s Terry College of Business.
Senior Digital Strategist at BWG Connect
BWG Connect provides executive strategy & networking sessions that help brands from any industry with their overall business planning and execution.
Senior Digital Strategist Tiffany Serbus-Gustaveson runs the group & connects with dozens of brand executives every week, always for free.
Greg Irwin 0:18
We've been very, very lucky to be going through this series with the team over at Teknion. We're talking about all things data warehousing. And really that means data pipelines, data movement, data governance, data quality, and it can go a lot of different directions. What I love about these forums is everyone's different, and it's really dictated by the group. The one thing I will reinforce is, we don't do sales pitches here. So there's no PowerPoint presentation ready to go, and there's no sales pitch on a product. Fortunately, you know, Michael and Dave are real SMEs and can talk on a lot of topics. And they can be very helpful for people thinking through challenges. And we're going to get to some of that today. But I'll just, you know, put it out there: if you have opportunities to talk more deeply about your projects, that's what they're doing here. It's to drive awareness and to demonstrate their leadership around EDWs. So my name is Greg Irwin. I'm one of the partners at BWG. I moderate these for a living. What I'd like to do is give Michael and Dave the opportunity to introduce themselves. And then we're going to jump right into it in terms of our session around modernizing EDWs. Michael, do us a favor, why don't you go first with a quick intro, please?
Michael Tantrum 1:51
Sure. Thanks, Greg. So, data warehousing is something that gives me a huge amount of passion. I was talking to one of our junior colleagues yesterday at dinner, and we were talking about, you know, how did you get into the career, and I've always said I wanted to work in a field where I could make real changes to the way a company runs. And for me that felt like analytics and data. And so I actually started my very first analytic database project back in 1989. So that dates me a little bit. I've been a practitioner in the field for about 30 years and have worked with some of the largest data warehouses in the country, across pretty much every industry. And so my passion is helping people get from nothing to something, or, if you've already got a version one of a data warehouse, you know, working through the challenges there and how we get to something modern. So I'm full of ideas, lots of experience, and I'd just love to help you guys, you know, throw ideas out and reflect back things we've seen, and learn from you as well, because you guys all have your own unique experiences across all sorts of industries.
Greg Irwin 2:59
We'll get into it. Michael, thank you very much. Dave Brown, let me go to you next here. Do us a favor and give a little intro.
David Brown 3:07
Greg, thanks so much. And everyone, it's nice to meet you. My name is David Brown. I've been with this particular company for nine years as Chief Revenue Officer, and the bottom line is I get the opportunity, along with Michael, to talk day in, day out to companies that, you know, either have dreams they're trying to achieve or pains they're trying to alleviate in and around data. And so we help them organize that data, understand that data, make decisions from that data, and get ROI from that data. And that's what we do. So I really appreciate you all being here today. Looking forward to a lively discussion.
Greg Irwin 3:39
Guys, I think many of you have been on my forums before. In fact, I look across and I see some familiar faces. So thank you all for coming back. You know how we do it. The best way to push along the idea is to learn from each other's experiences and ground on the important things, the relevant things that everyone's thinking about. So I'm going to declare this a full participation forum and ask people for, yeah, hey, maybe it's something very specific around governance, maybe it's something about cloud, maybe it's something about cost. Let's talk in some specifics about what different organizations are doing. But I'm going to ask Michael to get the ball rolling here in the context of modern data warehousing, because a lot of people have lots of databases, and they ping them for lots of analytics, but it's not necessarily a modern architecture or a modern process. So what does it mean to be modern, at least from the vantage point of the clients that you're supporting?
Michael Tantrum 4:44
Yeah, and I think hopefully everybody in this room has already done a first attempt. It's relatively rare that we get a greenfield company that's never done a data warehouse before. So most of you will have done what I call version 1.0. It's probably an on-prem database; it could have been something you built five years ago, or 20 years ago. And what people say is that it gets to a point where what we've got becomes difficult to scale or to maintain, and we'd like to do version 2.0. And what does that mean? And so they want to take the opportunity to say, if we're going to look at a modern stack, what are the things to consider? And I would say I probably see two themes. The first is cloud: should we go to the cloud? And if we do, what does that look like for us? And the second is automation: where can we automate? We've tried the offshore model, and it really doesn't work well for analytics, because analytics is about talking to users. What are you trying to deliver? What's your business unit trying to do? You're asking me for data; how do I best put that together? And this is an interactive thing. And so the focus of automation is around saying, my critical resource is my human brains: my analysts, people who are subject matter experts, people who can ask questions around what are my requirements? What are my source data? What sort of models am I going to put together? What sort of business rules do I need? What am I going to do about data quality? All of that I need a human brain for. But what people are saying we don't need human brains for, what we can use machinery for, is writing code, creating tables, maintaining documentation, maintaining lineage, creating data catalogs.
And so those two things, automation and cloud, are the two things that I think people are asking the question about, to say, if I want to change what I'm doing at the moment and try and make something more scalable, more maintainable, more responsive when a business user says, hey, can I have another piece of data? I want the turnaround from that request, as it goes through my IT team and back to the desktop, to be short, and how can I get there? So those are the things I see a lot of clients talking about today.
Greg Irwin 6:57
Let's get everyone involved here. We have this chat window. It provides a nice sounding board. So do me a favor, everyone: drop in there one thing that you're looking for. Maybe it's in the context of modern, maybe it's our spend, maybe it's storage, maybe it's scalability, maybe it's automation, maybe it's "I don't care whether it helps or not, we're going to cloud." Everyone do me a favor and put in there one thing that you'd like to see out of a modern database. Enter it now, and then we'll have some concepts to talk about here.
Michael Tantrum 7:34
You know, business units say, we need data right now, I have a business problem to solve, and IT, why are you so slow? And IT says, well, you want it right now; we want to get it right. And so we've got the tension between how do I do something that is repeatable and predictable and maintainable, as against something that needs to be immediate? And that's one of the challenges of modern, and I see a few of you talking about this: how do we get stuff turned around quickly? In fact, maybe I'll throw out a nugget for some of you guys, and I think someone was talking about the project portfolio process and things like that. The traditional way of doing a project is you have users give you requirements, you take their requirements, you build it, you bring it back to the users and say, do you accept it? In analytics, that doesn't work very well, because people really struggle to define the requirements well up front. And if you try and force them to do that, it creates this tension, because they say, we don't know what we don't know. I can tell you my first question, but the answer that gives me is going to create six new questions, and I don't know what those are yet. And so can I suggest a way to tackle this: rather than that traditional requirements, build, acceptance sequence, think prototype and iterate. Build a prototype really quickly, with as much data as you can get your hands on quickly, throw it up on the big screen, have that user or a representative of the users look at it, and let that data allow them to refine the requirements. And as they say, can you try this, try and do that in real time with them. And that polishing of the prototype means the requirements are refined with real data. And it becomes a collaborative exercise with your users, not this confrontational thing that we're used to from the past.
Greg Irwin 9:13
Michael, do us a favor: tell us about a large client who's successfully managed their cloud EDW costs.
Michael Tantrum 9:25
Yeah, and the interesting thing is that, with your example, what you didn't have before but do have now is visibility over who is creating the cost. Whereas before, when you had a Teradata box, you couldn't allocate it and say this department is consuming this much. And it was also capex, right? You fronted up with a million-dollar check to buy the appliance. With Snowflake, you know, the chit comes in every month, but they add up really fast. And now the good thing is you've got people adopting, which is great. Okay, now we're caught. There's the old adage, I think from a management consultant, you know: what gets measured gets noticed gets done. And so what we see a lot of, in how people manage that when they get to your situation, right, victims of your own success, is to create dashboards which say, you know, to each cost center, this is the value you're getting from our data warehouse, data lake, in your case data lakehouse, and it comes with this cost. And, not making any judgment, but here's the leaderboard of the people who are consuming it. And they will self-police. They will say, yes, it's worth the investment for us. Or, whoa, okay, we didn't realize that we're doing 20% of the cost; what are we going to do to bring it down? And it's interesting how peer pressure helps you manage that. But in a modern data warehouse, you should also be considering what data quality looks like. We focus a lot on data engineering, building this beautiful thing. But as data engineers, we don't tend to care too much about the content; we say that's a user problem. And what you're just saying is, right, it's my problem now. Data quality is often something you do at the 11th hour. It's like, I've built this thing, now I'm gonna run my data quality checks. But in actual fact, your data quality needs to be part of the DNA of your project and needs to be there from the start.
And you need to apply the same thinking to it as you do for the rest of your data engineering: think about automation. What does it look like, you know, for you to say, okay, I found this problem in the data, I don't ever want to see that again, I want some automated process to say, now that I've found this problem, you check it every single run. And so we haven't talked a lot about that. But I would encourage people to make it part of your data governance program; a lot of what you're talking about are data governance type problems. But yeah, data quality, you really want to be thinking about what that looks like. And I can talk all day on that. That's one of my... I'm not going to now. But yeah, that's a big thing I see.
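The "now that I've found this problem, check it every single run" idea can be sketched roughly as a registry of checks that grows each time a new problem is discovered. This is only a minimal Python illustration, not a tool mentioned in the session: the names `register`, `run_all`, and the sample `total` column are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Check:
    name: str
    predicate: Callable[[list], bool]  # returns True when the data passes

# Every problem found once gets a permanent check added here (hypothetical registry).
REGISTRY: list = []

def register(name: str):
    """Decorator that adds a check function to the registry under a readable name."""
    def wrap(fn):
        REGISTRY.append(Check(name, fn))
        return fn
    return wrap

@register("no negative order totals")
def no_negative_totals(rows):
    # Example rule: assumes each row is a dict with a numeric "total" field.
    return all(r["total"] >= 0 for r in rows)

def run_all(rows):
    """Run every registered check and return the names of the ones that failed."""
    return [c.name for c in REGISTRY if not c.predicate(rows)]
```

Calling `run_all(rows)` at the end of each load would surface any previously seen problem the moment it reappears; a real pipeline would more likely express these checks as SQL against the warehouse.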
Greg Irwin 11:52
Can it practically be automated? Like, I understand some things, like, hey, I want this format, or not having this format can create issues when I, you know, ingest it into this algorithm. But across the scope of data quality problems, missing data, or unreliable data, or changing data, Michael, how straightforward is it to automate some of these data quality processes?
Michael Tantrum 12:23
So there's actually a bunch of tools out there. Someone in the chat, I think, mentioned dbt. They have some of it. There's a tool out there called Validatar. Let me just put that in the chat there. It's actually designed specifically for data quality in data warehousing, and it's a tool which should go along with your data engineering. But yeah, whether it's things like, as you're saying: Is my data stable? Is what I thought was stable yesterday changed today? Do I have all the data? Is my data unique? Have I introduced Cartesian products? What are the other sorts of problems we see? You know, like you were saying, when I thought this customer, you know, I've got them in my CRM system in Salesforce, I've got them over here in my ERP: are they the same person? I thought I had that rule solved; has that changed? And then down to data engineering data quality problems, like the volume of records I'm bringing in: have they spiked, have they dropped? All sorts of data quality. I actually give a whole seminar on this and all the different phases of it. But yes, it is very possible to automate. But it's something that we don't tend to think about, and it falls in the crack between the business user and the data engineer.
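A few of the checks Michael lists (uniqueness, spikes or drops in record volume, accidental Cartesian products) are simple enough to sketch in plain Python. The function names, the 50% tolerance, and the assumption that the join is many-to-one are all illustrative choices, not something taken from dbt or Validatar.

```python
from collections import Counter

def duplicate_keys(rows, key):
    """Return the key values that appear more than once (uniqueness check)."""
    counts = Counter(r[key] for r in rows)
    return sorted(k for k, n in counts.items() if n > 1)

def volume_out_of_range(today_count, recent_counts, tolerance=0.5):
    """Flag a load whose row count deviates from the recent average
    by more than `tolerance` (a fraction of that average)."""
    if not recent_counts:
        return False  # no history yet, so nothing to compare against
    baseline = sum(recent_counts) / len(recent_counts)
    return abs(today_count - baseline) > tolerance * baseline

def join_fans_out(left_rows, joined_rows):
    """Assuming a many-to-one join, more output rows than input rows
    suggests an unintended fan-out (a partial Cartesian product)."""
    return len(joined_rows) > len(left_rows)
```

Each function returns a plain value, so a scheduler could run them after every load and raise an alert on any non-empty list or `True` result.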
Greg Irwin 13:36
But Michael, there are a couple of questions in here. We might be able to handle them here, and maybe we'll do it as a follow-up. I see everyone needs to drop. So while people are dropping, let me just thank you all for joining. Let's keep this dialogue going. We're going to continue our series. And please make one new connection out of this group, whether it's Teknion, BWG, or somebody else here in the grid. Let's build this community. Michael and Dave, let me give you the final word here on a closing comment for the group.
Michael Tantrum 14:09
Thirty seconds from me, and then I'll let Dave run with it. There are lots of comments here about how do we pick the right sort of software. We help companies with that all the time. And because we touch many places, we've got lots of good opinions for you that might save you a lot of legwork and a lot of thinking. So feel free to use us to help you do that. And yeah, we've seen other people do POCs and things like that, so we can give you the benefit of that.
Greg Irwin 14:37
Thank you. Thank you all for joining and I look forward to speaking with everybody on a future call. Great conversation. And thank you all. Bye bye.