BWG and ScienceLogic are teaming up to cover some topical points in the enterprise technology world! Join our conversation with information technology professionals to share and learn best practices around AIOps, tool consolidation, business services, and hybrid cloud.
BWG Connect and ScienceLogic invite you to participate in an interactive discussion with your peers.
As always, there will be no sales pitches and there is no cost to join.
Powering the Intelligent Enterprise
Vice President, North American Sales Engineering at ScienceLogic
Nicholas Hassell is the Vice President of North American Sales Engineering at ScienceLogic, a company that empowers intelligent and automated IT operations. Nicholas is skilled in product development, product management, customer support, professional services, and sales engineering. He has a demonstrated history of working with leading technology companies worldwide, including Infoblox, Netcordia, and Xangati.
COO at BWG Strategy LLC
BWG Strategy is a research platform that provides market intelligence through Event Services, Business Development initiatives, and Market Research services. BWG hosts over 1,800 interactive executive strategy sessions (conference calls and in-person forums) annually that allow senior industry professionals across all sectors to debate fundamental business topics with peers, build brand awareness, gather market intelligence, network with customers/suppliers/partners, and pursue business development opportunities.
What challenges are companies facing with regard to IT automation? How is AIOps serving the industry and reducing pain points?
Companies want to grow geometrically, but it can be difficult when you’re trying to dig through data overload and reduce alerts. Organizations like ScienceLogic are implementing intelligent automation to help fix these issues. According to Nicholas Hassell, there’s more than one tool involved in the automation process — it’s an entire ecosystem. So, where do you begin? Nicholas says it’s important not to think of the problem in its totality but instead pick a use case to start in order to troubleshoot the issues.
In this virtual event, Greg Irwin sits down with Nicholas Hassell, Vice President of North American Sales Engineering at ScienceLogic, to gain insight into IT operations and AIOps. Nicholas shares ScienceLogic’s process of identifying and simplifying complex problems through automation, how the organization focuses on contextualization and reduces mean time to repair (MTTR), and where to begin with AIOps implementation.
Greg Irwin 0:19
Nick, do us a favor: give a quick intro to our Hollywood Squares, please.
Nicholas Hassell 0:25
Thanks, Greg. Very much appreciate being here; excited to be able to talk with everybody. I'll introduce myself first: Nick Hassell. I work at ScienceLogic and have been here about six years. I run sales engineering for ScienceLogic for the Americas, so my team supports everything: enterprise, MSP, US federal, our global system integrators, our channel partners, and SLED. We have a broad range of support that my team does. I mentioned I've been here for six years; I've been in the industry about 30 years, though. Maybe just by way of a little background, Greg: I started my career managing and monitoring one of the larger internal enterprise networks in the world at the time, for what is today's AT&T, SBC at the time. I got the gig not because I knew a lot about networks, or even network management at the time; I got the gig because I was a developer. The challenges, as they were presented to me, were things like needing to integrate datasets, better leverage all of the siloed data from the monitoring tools deployed to manage tens of thousands of infrastructure devices, be smarter with what we do with the data, and start to introduce automations to take out some of those big chunks of mean time to repair and help us focus on the things we needed to focus on. So I started building solutions. Fast forward: for the last 20 years I've been on the sales side of things, running either product management or sales engineering for various companies. And as we talk to customers, it's the same set of problems, the exact same set of problems.
That's not to say that no one has introduced any smart solutions between then and now; everybody's pretty good in this market, I would say, both from a customer and a vendor perspective. The challenge is that the variables involved in those problems just become increasingly more complex. The environments become more complex, and they reintroduce the challenges around those same problems. When I was helping run the network at SBC, there was no cloud, right? There was very little virtualization; it was all mainframe stuff. We were managing 35,000 routers and switches and some key servers; that was all we had to worry about. But as we move forward with this transient network, and containerization, and on-prem, off-prem, we just introduce more and more complexity into those same challenges. So that's where ScienceLogic comes in, Greg, and that's why I've been here for six years: we think we can help solve that problem. And I'm hoping we can talk a little bit about that today and how we approach it.
Greg Irwin 3:41
Give us a quick elevator pitch: what does ScienceLogic do?
Nicholas Hassell 3:45
ScienceLogic, we're focused on reducing the complexity of managing and monitoring that environment I just described. Our SL1 platform is based on three key pillars: to be able to see (visibility), to add some contextualization to what we see, and then that enables the third pillar, which is automation. We hear in the market about intelligent, automated IT operations; we all know that as AIOps, right? But the end goal of AIOps is automation. How can I get to some automated state? An automation task itself is usually pretty easy; we've probably all written code to automate tasks. Did I lose somebody?
Greg Irwin 4:34
Yeah, no, you just glitched for a second. My end? I don't know, I could have glitched, but I think you just clipped out for a moment. Okay, wait. So I'm just gonna say: you're not a ticketing system, you're not an agent system. So what is your core?
Nicholas Hassell 4:52
Our core is managing and monitoring, right? It's intelligent discovery, and discovery of all things; it's not discriminatory. We want to integrate datasets, so we manage and monitor routing, switching, compute, storage, virtualization, public cloud, private cloud, etc., all in one non-modular platform. That allows us, when we do discovery and start to add things into the system, to do more than just ask "are you there?" and "what are you?", but also add some context: what are you doing, and how are you related to the things around you? So now we can flip that monitoring paradigm upside down. Instead of "hey, this device has an issue, I wonder what problems that CPU or memory or packet congestion issue is causing," we can start looking from service impact and work our way down to causality: "hey, I'm having an issue processing credit card transactions, there's a health issue there, how much risk is involved to my business?" and work my way through causality more quickly than the other way around. And ultimately, Greg, that leads to automation. Once the ScienceLogic platform has context around why the infrastructure is constructed the way it is, what a certain device is doing, why it's there, what its purpose is, whether it's redundant or not, virtualized or physical, then we can start to get to intelligent automation. Without that context, it's almost impossible to bridge that gap to automation.
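To make the "flip the paradigm" idea concrete, here is a minimal Python sketch. This is not ScienceLogic's data model; the classes, fields, and names are invented for illustration. The point is the direction of reasoning: start from a degraded business service and walk down its discovered dependencies to causality candidates, rather than starting from a device alert and guessing what it impacts.

```python
# Toy model: walk from a degraded business service down to root-cause
# candidates, instead of starting from a device alert.

class Device:
    def __init__(self, name, role, healthy=True):
        self.name = name
        self.role = role          # context: what is this device for?
        self.healthy = healthy

class Service:
    def __init__(self, name, devices):
        self.name = name
        self.devices = devices    # dependency map built during discovery

    def health(self):
        # Service health is derived from the devices it depends on.
        return all(d.healthy for d in self.devices)

    def causality_candidates(self):
        # Work down from service impact to likely causes.
        return [d.name for d in self.devices if not d.healthy]

payments = Service("credit-card-processing", [
    Device("edge-rtr-1", "router"),
    Device("db-vm-7", "database", healthy=False),
])

if not payments.health():
    print(payments.causality_candidates())  # ['db-vm-7']
```

An operator watching this model never has to triage `db-vm-7`'s raw alerts first; the degraded service surfaces it as a suspect.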
Greg Irwin 6:25
Got it, got it. Hey, you know what, you and I were brainstorming earlier, and I think I want to go ahead with a little bit of a prize for our session today. So let's do this. Everyone's busy, everyone's distracted, so let's try and do something a little fun. We've got this thing called a Solo Stove. These things are awesome; it's an outdoor stove, it's about 250 bucks, and it's great for the winter. So let's do this: everybody who's on at the end of our call, we'll do a random drawing. You're gonna have to trust us; I'll have Matt pick somebody out of the hat, and somebody will get a Solo Stove, or at least a credit for it. Yeah, Adam, you're right. I think I was gonna go straight to solostove.com. There you go, here's the ad for it; you can click on that. So a little bit of incentive, we'll try and keep it fun. But now, in return, do me a favor: I want us to focus and have a conversation about business. Because the reason I'm doing this isn't to sell Solo Stoves, or give away Solo Stoves. I'm doing this because we actually really like the guys at ScienceLogic, and we understand the importance of IT automation. So there's my quid pro quo: stick around, let's have a real conversation, and in return we'll do a drawing here at the end. We'll see how that works. Alright, let's do this, Nick, let's go. As crisp as you can, let's get into a customer story. Tell me about one customer who had a problem in IT ops, and what they did about it. Maybe it was MTTR, maybe it was a team, maybe it was data, I don't know. Give us one story, and then I'm going to go around the group and we're going to share some other stories. So I'm going to put people on the spot.
Nicholas Hassell 8:34
Okay, great. I'll give you two very short ones. The first is a financial technology company. Their challenge was incident overload: how do I know what to look at? There are too many things on that event screen. So when we deployed, we started to apply context and do some correlation, even before we got to the automation piece: 39% fewer incidents across the enterprise, and MTTR lowered from three to six hours to one to five minutes once we had context.
Greg Irwin 9:07
Right. Oh, hold on, hold on. I'm sorry, I'm gonna keep interrupting you. Did you reduce the overload? Was it just finding duplicates? What happened that you reduced the overload?
Nicholas Hassell 9:19
It's a combination. Yes, finding duplicates, doing correlation, understanding the more important things to look at.
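As a rough illustration of the noise reduction described here, deduplication plus correlation, consider this toy sketch. The event fields and the topology map are invented for illustration; real correlation uses discovered relationships rather than a hard-coded dictionary.

```python
# Sketch of event noise reduction: deduplicate repeats, then collapse
# correlated symptoms under a single root-cause incident.

from collections import defaultdict

events = [
    {"device": "sw-3", "type": "link-down"},
    {"device": "sw-3", "type": "link-down"},      # duplicate
    {"device": "app-1", "type": "unreachable"},   # symptom of sw-3
    {"device": "app-2", "type": "unreachable"},   # symptom of sw-3
]

# Topology context: which devices sit behind which switch.
behind = {"app-1": "sw-3", "app-2": "sw-3"}

# Deduplicate by (device, type).
deduped = {(e["device"], e["type"]): e for e in events}.values()

incidents = defaultdict(list)
for e in deduped:
    # Correlate: fold symptoms into the upstream root-cause incident.
    root = behind.get(e["device"], e["device"])
    incidents[root].append(e)

print(len(events), "->", len(incidents), "incident(s)")  # 4 -> 1
```

Four raw events become one actionable incident against the switch, which is the "more important thing to look at."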
Greg Irwin 9:35
Nick, your internet glitched again there for half a second.
Nicholas Hassell 9:40
Is everybody seeing that? Yeah, I'm sorry. Maybe I can go video-off if it keeps happening, but I'm not sure what you missed. I'll just say: yes, there is masking of symptoms. We're focusing on causality rather than symptoms from an event perspective, but it's also saying I don't necessarily care about a server CPU. What I care about is the health of the service that server is providing. Whether the CPU is normal or abnormal, I don't even care about that. I just want to know if the service is healthy; I focus on that. I never have to know that the CPU is bad, unless we predict that it is going to cause a problem at some point.
Greg Irwin 10:30
Alright, so Nick, I'm going to drive you to the business metrics on this situation. This company was having a problem discovering the issue: too many alerts,
Nicholas Hassell 10:43
and three to six hours MTTR, on average. Yeah, they told us their cost was about $65 an hour in lost revenue. We were able to get them to one to five minutes MTTR, on average, just through adding context.
Greg Irwin 11:03
Hold on, so MTTR: is that just the discovery part? Was it three to six hours down to one to five minutes on discovery?
Nicholas Hassell 11:11
That was the total: mean time to identify and mean time to repair. So it was both; it was all the way from incident to resolution.
Greg Irwin 11:19
That sounds too good. And I'm not saying that as a lead-in, but a five-minute resolution doesn't sound realistic for a complicated environment.
Nicholas Hassell 11:31
Right, and that's in this environment. Maybe that's an outlier; maybe their environment just isn't doing as much cloud-based complexity. But the issue is the same one I stated earlier: automation steps are easy, and they run fast. It's identifying which ones to run that is the challenge. If I don't have context, how can I know how to troubleshoot a server if I don't know whether it sits in a data center rack, or is virtualized, or is in Amazon or Azure, or is a container on a Linux box somewhere? How do I know how to do any automation?
Greg Irwin 12:15
Alright, so now, how long did it take to do the training, the learning, the implementation and deployment? Even if you told me you went from three to six hours, call it an average of four hours, down to three hours, I would still say that could be a really worthwhile project. So I'm not even going to debate with you on the actual MTTR. But the question is: how long until you put this system in place that meaningfully reduced the MTTR, with discovery as one major piece? How long did it take to put that system in place?
Nicholas Hassell 12:51
It's a great question, and it's not an overnight sort of thing. To talk about AIOps: it is a journey. I think there is oftentimes a misunderstanding in the market that AIOps is a product that you can buy and install and turn on. AIOps, at least as we understand it and present it, is an adoption of methodologies. There are probably going to be more tools involved than just our solution; it's going to be an entire ecosystem, and there's going to be a journey along the way. So for this customer, they spent a year implementing: number one, starting to move processes to be more automated, and number two, building up business services. So instead of looking at 5,000 events, they're looking at a single service incident. That helps take out massive amounts of hours and reduce mean time to repair; I'm not looking at things I shouldn't look at.
Greg Irwin 13:57
That sounds realistic: a year to put in a new systems response process. Yeah, that's really what it takes. And what's underneath? What's this customer using for their ticketing? What's this customer using for incident management?
Nicholas Hassell 14:23
ServiceNow; they're a ServiceNow customer. So there are two pieces of that that help enable that reduced mean time. One is incident automation: being selective about which events can result in an incident. And also, a huge piece of it is cleaning up the CMDB, adding the context that other products (for instance, ScienceLogic) discover, and enriching the CMDB with that context, so that I start to have CI alignment. That helps me get to impact faster.
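Here is a minimal sketch of the CI-alignment idea, assuming a simplified in-memory CMDB lookup. The output field names mirror common ServiceNow incident fields (`cmdb_ci`, `business_service`, `assignment_group`), but the lookup itself and every value here are invented for illustration; a real integration would go through the ServiceNow API and your instance's schema.

```python
# Sketch: enrich an event with CMDB/CI context before it becomes an
# incident, so the ticket lands already aligned to a CI.

# Hypothetical CMDB records keyed by device name.
cmdb = {
    "db-vm-7": {"sys_id": "abc123", "service": "credit-card-processing",
                "environment": "prod", "owner": "dba-team"},
}

def build_incident(event):
    ci = cmdb.get(event["device"], {})
    return {
        "short_description": f'{event["type"]} on {event["device"]}',
        "cmdb_ci": ci.get("sys_id"),             # CI alignment
        "business_service": ci.get("service"),   # impact context
        "assignment_group": ci.get("owner"),     # route to the right team
    }

incident = build_incident({"device": "db-vm-7", "type": "high-io-latency"})
print(incident["business_service"])  # credit-card-processing
```

Because the CI and business service ride along with the ticket, the responder starts from impact rather than from a bare device name.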
Greg Irwin 14:56
So David asked a great question, thank you for that: what's the MTTR for those events that actually required action?
Nicholas Hassell 15:05
That's a good question. I don't want to give the politician answer, but it obviously depends on the type of event. When we're talking about simple network-related events, that MTTR is obviously smaller than something more complex, like a distributed workload over, say, a Cisco ACI deployment. The easiest are the network events. We have a lot of out-of-the-box solutions that use automations to do what we call incident enrichment, which basically takes the first 20 to 30 minutes, on the conservative side, out of the MTTR window in a matter of one or two seconds. We've worked with our industry partners to say: hey, if I have an OSPF routing issue, for instance, what do I need to do? What is the engineer going to do? They're going to sit down, log in, go to the router, and start running show commands to understand how the thing is configured and what its current state is. How about if we just do all that for you and cut the first 30 minutes out? And then take all of that and put it in an incident in ServiceNow, aligned to a CI; now we've cut another 30 minutes out. There's an hour of that process gone, starting with an enriched incident as opposed to an event that requires a technician to log into a device and start gathering more information.
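The enrichment workflow described above can be sketched in a few lines. The diagnostic commands are typical Cisco IOS show commands for an OSPF adjacency problem, but the rule table is illustrative, and `run_on_device` is a hypothetical stand-in stub for a real SSH/automation layer, not any vendor's API.

```python
# Sketch of "incident enrichment": when an OSPF event fires, gather the
# usual first-30-minutes troubleshooting output automatically and attach
# it to the ticket as work notes.

DIAG_COMMANDS = {
    "ospf-neighbor-down": [
        "show ip ospf neighbor",
        "show ip ospf interface brief",
        "show logging | include OSPF",
    ],
}

def run_on_device(device, command):
    # Stub: a real implementation would SSH to the device and capture
    # the command output.
    return f"[{device}] output of '{command}'"

def enrich(event):
    cmds = DIAG_COMMANDS.get(event["type"], [])
    transcript = [run_on_device(event["device"], c) for c in cmds]
    return {"event": event, "work_notes": "\n".join(transcript)}

incident = enrich({"device": "edge-rtr-1", "type": "ospf-neighbor-down"})
print(len(incident["work_notes"].splitlines()))  # 3
```

The engineer opens a ticket that already contains the state of the neighbor table and interfaces, instead of logging in to collect it by hand.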
Greg Irwin 16:48
You know, let's do this, let's pause. Brian, you raised a great scenario, and I'd like to dig in with you, if that's okay. And then Nick, we can come back to the second story a little bit later. Brian, it's nice to see you, not in the flesh, but at least in 2D here. Give a little intro, please.
So I run security infrastructure at a major hospital system in Atlanta. We have five hospitals all together, and it's always growing: cancer centers, you name it, we have all of these; we're even doing affiliates. We're doing security operations and we're doing projects. I have five people right now. Billions of alerts are coming in from 17 different systems into our SIEM. I even have an MSSP, and they're struggling to keep up
Greg Irwin 17:50
with the broader feeds. What are the primary data feeds that are coming in?
Firewall and network stuff, IDS, EDR, NVR, server health, you name it, it's out there. Even DC logs from domain controllers. We're just trying to feed everything to the SIEM, to the data lake, and extract the data out of it that's useful, hitting the top-level stuff. That's basically all we're filtering right now: getting the highest of the critical-level stuff and trying to narrow that down to actionable events. There are so many false positives that the guys doing managed services are just struggling to keep up. So what's the tactical approach? That's kind of my question: where does ScienceLogic fall in when you're talking about doing this automation? Where do I even start with automation when I only have five guys and we're struggling just to look at the systems once a day at this point, with 17 different monitors to look at? It's overload.
Greg Irwin 19:15
Oh, well, you know what, let's not have Nick respond first. Let's have somebody else. Let's put it to the group: where do you start? It's absolute data and alert overload. Who wants to give, what's it called, a little self-help here? Who can offer a word of advice to Brian?
Nicholas Hassell 19:35
We're talking like in the billions of alerts, billions.
Greg Irwin 19:40
Who here can offer a first step?
I mean, you're trying to dig through tons of data that's coming in asynchronously. So you're going to use some sort of pattern recognition. It's a really tough problem, right? And that's kind of what I'm here to hear about, too. You know, we've thought about it; we've taken a couple of runs at it. There's no good answer yet. The machine learning stuff that we've played with so far is better than what we've got, I'll tell you that, but nobody does it well yet.
Greg Irwin 20:22
You know, your day-to-day, just to make sure we're talking about the same thing: is it SecOps? Is it regular ops?
Both. Everybody's got a similar set of problems, right? The application teams, the product owners of each service, all have too much data to dig through and not enough understanding of what the person that created the alert intended. So there are all these wrinkles to the data as it's coming in, and interpreting it is super challenging. The volume makes it ten times more challenging.
I mean, even the basic question of who did what, when, is a struggle.
Or another one; I don't disagree with what you're saying. The other one is: suddenly something stops barking. The thing that had been throwing 100 alerts per minute for the past six months just went down to 10 alerts per minute. Why did that happen?
Greg Irwin 21:38
Rob, what's on your plate for 2021? One initiative you're gonna try.
So we're rolling out a new APM, a new everything. As we do that, we're trying to connect everything up to a centralized stack of data streams and data analytics tools to try to take another run at this. That's what we're working on right now; our APM solution kind of failed us, and that was on me, so now we've got to come up with another solution there. That's kind of what I was hoping to get from this: how far down the stack are folks doing their APM? As folks go into Kubernetes and these new cluster technologies, the challenge is growing geometrically, if not bigger than exponentially, as you introduce more of these components into the pile of technologies. Now they all become suspects every time there's a problem, every time there's one of these alert storms. All of these components automatically become suspects when you start running into problems. Things like, if you're running Kubernetes with Calico or something like that: what does Calico do under the hood? Why is this monitoring tool telling me I'm getting 10,000 retransmits per minute out of this container? There's nothing wrong with the environment, but I'm getting 10,000 retransmits. Well, when I look at the host it's running on, the host is only doing 2,000 retransmits. Will the real metric please stand up? And then all of a sudden, the thing that was doing 10,000 retransmits per minute goes down to 200, so something's fixed, and then it goes right back to 10,000 again. Alright, so what are you trying to tell me?
All right, this is us: I'm at Major League Baseball, in case you can't recognize Ichiro's arm up there; you've probably never seen it from this angle. I'm in a conference room. But, you know, we do have an established SRE team. While they're small, they're a bunch of sharps; they know this space. And they operate in a consulting role: they have their own platform responsibilities that they maintain, but they also operate in a consultative role, recognizing that each team is at its own maturity level and has to deal with its own technology stack. A few years ago I might have believed in the whole bimodal Gartner stuff; I don't believe in that anymore. Every team, enterprise or consumer-facing, has to move fast. They're all doing containerization, all the cloud at this point. Yeah, there are tails of legacy around, sure, but for the most part, the last few years, it's certainly here, and I imagine at other companies too: a lot more emphasis on modernizing the enterprise side. It's no longer just your consumer web page that needs to adopt these practices; this move toward a services model now has to deal with the sprawl of infrastructure notifications. So again, I can't say it's necessarily the solution, but I do think the SRE model was modeled by the company that probably to this day has more infrastructure than anybody, Google, probably has more noise than anybody, and needs to operate at a scale that we certainly won't ever see (maybe some of the bigger banks will). With the sheer volume of stuff you have to instrument, I certainly can't say it's a solved problem.
But since we've established that it's a practice for us: focusing strictly on networking, for example, we not only have core networking in our data centers and our branch offices, but, specific to baseball, we've got networking established in all the ballparks for ball tracking, with VPNs and firewalls. And we're able to capture and aggregate every single event and state change in that with the instrumentation we've put in place. We've also moved more toward best-of-breed open source rather than trying to cobble things together; we've gone down the APM path with a few painful APM products we've had out there. But I would just say, in conclusion: look at your culture a bit. If your culture is that you've got a lot of very sharp, engineering-specific groups that are truly capable of innovating and moving fast, the SRE model might be a good one for you to consider. But it can't be a part-time role. It has to be a handful of people whose core responsibility is educating others on best practices and maintaining whatever their own core platforms are, to keep advancing the cause.
Greg Irwin 27:00
Thank you. Let's bring Nick back for a second. Nick, you said you had two case studies. Let's go to the second one, or if this has driven a new idea for you, let's talk about some stories of successes at whatever level. Hold on, we're missing your audio there. Thank you.
Nicholas Hassell 27:27
There we go. Yeah, and I'm glad we're having a good conversation here; that was really interesting to listen in on. I can give another case study. It's a technology company, and I have some different numbers, but it's built around the same value propositions: the operational savings due to incident automation, and that same incident enrichment. In this case their costs were a little higher, and they're a very large company, so they had an astronomically larger number of incidents they were dealing with. But they told us: 86% productivity gains, and they were saving on average two hours per incident through the automated ticket enrichment and triage data that we provide them.
Greg Irwin 28:22
And how do they do it?
Nicholas Hassell 28:25
In this environment, actually, we don't directly monitor. In this case, they are sending syslog to us and just allowing us to put that through
Greg Irwin 28:37
Explain that.
Nicholas Hassell 28:42
And from that correlation, we discern whether or not we should or shouldn't trigger an event, and then whether that event should or shouldn't trigger an automation.
Greg Irwin 28:55
I'm sorry, I'm gonna have you repeat it; you cut out for a second in between.
Nicholas Hassell 29:02
Okay. So we don't directly monitor their environment, which for most customers we do. Instead, we're in place receiving syslog, largely around the network infrastructure. As we get that syslog, we correlate and determine whether or not it meets the criteria for an event, so there's noise reduction there; that's my point. Then we'll take that event and determine whether or not we can apply automation, and what automation should be applied. For them specifically, it's not remediation; it is incident enrichment. It is gathering more troubleshooting data, the first mile. We stick it in the incident, and that's where, as I mentioned, they are saving about two hours per incident.
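The pipeline described here, raw syslog in, event criteria applied for noise reduction, then a decision about which automation (if any) to trigger, can be sketched as follows. The message mnemonics are real Cisco-style syslog formats, but the rule table and decision logic are invented for illustration, not ScienceLogic's implementation.

```python
# Sketch: syslog -> event criteria -> automation decision.
import re

EVENT_RULES = [
    # (pattern, event type, automation to run)
    (re.compile(r"%OSPF-5-ADJCHG.*Down"), "ospf-neighbor-down", "enrich-incident"),
    (re.compile(r"%LINK-3-UPDOWN.*down"), "link-down", "enrich-incident"),
]

def process(line):
    for pattern, etype, automation in EVENT_RULES:
        if pattern.search(line):
            return {"event": etype, "automation": automation}
    return None  # noise: no event, no incident, no automation

msgs = [
    "%OSPF-5-ADJCHG: Process 1, Nbr 10.0.0.2 on Gi0/1 from FULL to Down",
    "%SYS-6-LOGGINGHOST_STARTSTOP: Logging to host 10.9.9.9 started",
]
results = [process(m) for m in msgs]
print([r["event"] if r else "dropped" for r in results])
```

The first message survives the criteria and is routed to an enrichment automation; the second is dropped as noise before anyone sees it.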
Greg Irwin 30:00
I see. So how much of this is AI, real training and learning, versus machine learning and algorithms and automation?
Nicholas Hassell 30:12
In that case, and that's why I wanted to talk about that use case if we had time: there's no AI in it. As Robert mentioned, with machine learning for the market, once we set aside the Googles of the world, as was also referenced on infrastructure size, and when we think about the investments those billion-dollar companies make in predictive analytics via AI and ML: nobody selling software solutions is even close to there. Most of us spent the last two years building the plumbing to enable more sophisticated machine learning. It's not like we had scalable compute and storage built into the solutions two years ago to do machine learning at scale. So I think what we see industry-wide from the software providers is the beginning of machine learning and AI being introduced into this workflow. I don't think anyone is even close to the endgame yet, and I'm just being transparent there. Instead, I think there's a lot of non-AI/ML-driven analytics and contextualization that can enable the automation. AI will make that better, but it's not required on day one.
Greg Irwin 31:47
You're saying it's not necessary, the AI, the training and the learning? How far can you get without it? Because I think we all understand AI has that vibe of "you've got to prove it to me."
Nicholas Hassell 32:05
Yeah, absolutely. So how far can we get? We can get through our entire model without the machine learning. And maybe the reason I say that, I'll be very transparent: the machine learning piece of our solution today is focused on anomaly detection. Anomaly detection is an important piece of understanding the problem, but it's not the problem identifier. It's just telling you something abnormal happened; it could be good, it could be bad, you don't really know at that point. So for us, our intelligent automation comes from the contextualization, from understanding how things are related.
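To illustrate the point that anomaly detection only says "abnormal," not "good" or "bad" or "why": a rolling z-score is one of the simplest forms of it. This is not ScienceLogic's algorithm, just a minimal sketch; it flags a point that deviates far from the recent baseline, and everything after that (cause, impact, action) still requires context.

```python
# Minimal anomaly detection: flag points far from the recent baseline.
import statistics

def anomalies(series, window=10, threshold=3.0):
    flagged = []
    for i in range(window, len(series)):
        base = series[i - window:i]
        mu = statistics.mean(base)
        sigma = statistics.stdev(base) or 1e-9  # avoid divide-by-zero
        if abs(series[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

cpu = [20, 21, 19, 20, 22, 21, 20, 19, 21, 20, 95]  # sudden spike
print(anomalies(cpu))  # [10]
```

The spike at index 10 is flagged, but whether it matters (is that server behind a critical service, or an idle lab box?) is exactly the contextualization question.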
Greg Irwin 32:51
Cool.
Nicholas Hassell 32:52
Now, there are other products in the market that approach it differently, and they're viable, good products as well. That's our approach to this. So I'm definitely not here to say that AI and ML don't have their place. I'm saying that I don't think anybody has it all the way to where it needs to be.
Greg Irwin 33:12
I appreciate that candid view, because I've heard other stories about how AI works and how it can train and find these things in your patterns. So I really appreciate that. Let's keep mixing it up. Do me a favor: I think it's important that you all ask the questions that matter, and not just of ScienceLogic but of everybody. So let's go to Guy. Guy, I'd like to dig in a little bit into your scenario. And everyone, in the chat, add your own comments and questions. I promise this group will be more valuable if you're all asking the questions, instead of relying on a newbie like me to try and guide it; I want to make sure we're hitting this point. So get your fingers going and ask your questions to the group. And I encourage everybody to respond: if you have a thought or an opinion on something somebody else is sharing, or a question they've got, just @ whoever asked it and share the comment. And please remember the one goal for today: make one new contact. So Guy, let's do it. First, a little bit of an intro and your focus.
Yeah, I'm the director of infrastructure platform services at VMC. That basically means I take care of all the infrastructure: all the endpoints, all the servers, the virtualization layer, virtualized desktops. It's a large hospital, a medium-sized hospital of 35,000 employees, so we have a lot of things going on as well. And I'm starting down this journey of AIOps because I know our mean time to resolution is abysmal, and we've been told how much money we could save if we could reach the goals. Nick, first, I appreciate you saying "seek first to understand the problem and understand the company," because that is what most vendors forget: each place is different, and each problem is different. If you don't understand my problem and you try to give me a solution, you're just causing me more nightmares, and you become a problem rather than a solution. So thank you for saying that; it is very, very nice to hear. The other thing: David was talking about the billions of alerts that you get. Like Nick, I've been around this business for 40 years, though I don't look as good as Nick does. I've done it both ways. We turned on all the alerts and got billions and billions. Then we turned off the email and started picking and choosing, saying, this is what we're going to focus on; let's focus on networking, let's trim those down. We basically had to turn everything off. That's the way we did it five years ago when we broke away from our university sister organization: we turned everything off and said, okay, let's start focusing and trim it down, just so we could understand the problem and not be flooded with a whole bunch of alerts. I have a gentleman on my team who used to be in the Navy.
And his thing is, "I only want to see alerts that are actionable. If it's not an emergency, if it's not actionable, I really don't want to see it," because that's the way they were trained: an alert has to be an alert, not just information. I say all that to say my problem is basically the same. We're starting very small, and the question was where to start. We're starting with the low-hanging fruit on the virtualization side of the house, looking at the virtualization infrastructure, because it was not being monitored that well. That's where we started to get into monitoring, discovery, and alerting. We've currently looked at five different products. I had not looked at ScienceLogic, but I'll have to put it on my list now, just to see what is actionable and what information is given. So that's my big story. But my question, and this is where I'm going to ask the whole group: how do you set those expectations? Because if it takes a year of discovery, deployment, and rolling out, and I have a vice president or a doctor saying, "I need information now," how do you deal with that situation of senior management asking for results when it's going to take at least three to six months to get a baseline and an understanding of what's actionable?
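Guy's "only show me actionable alerts" policy can be sketched in a few lines. This is a toy illustration, not any monitoring product's logic; the severity levels and the `actionable()` rule are made up for the example:

```python
# Toy sketch of an "actionable alerts only" policy: surface alerts
# that demand action, keep the rest as background information.
# The severity names and runbook field are illustrative assumptions.

ACTIONABLE_SEVERITIES = {"critical", "major"}

def actionable(alert):
    """Treat an alert as actionable if it is severe and has a known
    remediation (a runbook) attached."""
    return alert["severity"] in ACTIONABLE_SEVERITIES and bool(alert.get("runbook"))

def triage(alerts):
    """Split a raw alert stream into what the on-call engineer sees
    versus what is logged for later analysis."""
    show = [a for a in alerts if actionable(a)]
    log_only = [a for a in alerts if not actionable(a)]
    return show, log_only

alerts = [
    {"severity": "info", "msg": "backup finished"},
    {"severity": "critical", "msg": "VM host down", "runbook": "RB-12"},
    {"severity": "major", "msg": "datastore 95% full", "runbook": "RB-07"},
]
show, log_only = triage(alerts)
print(len(show), len(log_only))  # -> 2 1
```

The hard part in practice, as Guy says, is deciding the rule itself, which is why his team started by turning everything off and adding alerts back deliberately.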
Greg Irwin 39:12
I think that's a really tough question: how do you get buy-in from your CIO, set proper expectations for how long this takes and how hard a problem it is, and show some progress and steps along the way? David, you raised your hand. I'd love to hear about your experiences.
Yeah, not to be trite at all, Guy, but it sounds like you're in the right industry to talk about why diagnostics don't necessarily mean a specific outcome, right? The entire medical industry has reams and reams of data on all of us, but it can't necessarily guess when I was going to get diabetes or have a heart attack. So what's happening there? I think it is a common struggle, certainly one that I've been going through. The shiny brochures really made it look like everything was going to be self-healing, we were going to be 100% reliable, and everything would be automated, right? There's this business case for us as ops guys to always be up and always be cheap. And Brian, I think you nailed it: the more tools I get, the more humans I actually need to use the tools. It's almost like we opened our eyes wide to the world and then just shut our eyes, because there was so much out there it was complete overload. So we had to go way back the other direction, trim things down to just the worst of the alerts, and then embed that in our problem management process. When we did actually have outages and things went down, we asked questions and started building from the ground up. It grew a lot more organically that way.
Greg Irwin 41:03
Hey, but David, give us the punch line. How did it work out?
Well, we got really bad for a while; that's kind of where the mean-time-to-resolution comment came from. And then we tweaked it, and we got a lot better. We figured out a lot more about what we needed to do. But the biggest thing we did was embed a problem management process, and we found a much more comprehensive approach there. It got us to where we were actually addressing real outages and learning the right things. We started with things as simple as UPSs: putting monitoring on UPSs as standard, which gave us a little more information, and we put in redundancies in a couple of different places. So it's not so much AIOps. It's almost like what Adam was talking about: you can get all kinds of information about where the ball is at any given moment in a ballpark, but we still have the umpire. Humans still have to tweak every one of these things; it's probably best described as semi-automated. Our network leader and our data center leader both made it part of their jobs to tweak and build from the ground up. And we got a lot better: I'd say we're at probably a third the number of major incidents on the network and data center side that we had three years ago. And just to give them a shout-out, we did a lot of that with ScienceLogic; we made a transition from CA to ScienceLogic. We also had Splunk, and went to Sumo on the SIEM side of things.
Greg Irwin 43:06
Oh, that's interesting. You went over to Sumo, huh? For their cloud SIEM, for their observability, or for the security?
For the security side of ours.
Greg Irwin 43:27
That's interesting; I find that really interesting. It's Sumo versus Splunk. Was it just cost that they beat them on, or did they beat them on performance?
I think where Sumo beat Splunk was the way they came to market with us. The Sumo team really kind of kicked ass when they came in: they spent time with the team and actually worked with them. In the trenches is where those sales really get made.
Greg Irwin 44:04
We're hearing a lot about both IT ops and security ops. Are they really the same in terms of how you approach handling alerts and resolution time?
Nicholas Hassell 44:21
I don't think so, I'll tell you that. From a security perspective, I would consider ScienceLogic just part of a larger ecosystem. As David laid out, I think it's really important in all these scenarios to consider the investments that have already been made in the tools that are in place, and how you can make them better. And then just think about scale. Robert mentioned it when this conversation started, I think it was Robert, talking about machine learning not being there yet: so how am I going to handle scale without that machine learning? And I think it was Brian who mentioned the billions of alerts they were dealing with. From an IT management perspective, as opposed to security, you just don't have that volume, so your workflows are a little different. We normally keep them separate in our conversations. Did that answer your question?
Greg Irwin 45:26
Nicholas Hassell 45:28
Trying to be transparent, not to sidestep. I'll say this to the group: we oftentimes end up in that Sumo slash Splunk slash whoever-else-fits-in-that-space conversation from a competitive perspective, like, should we go with that, or with ScienceLogic? And I think more times than not, the reality is, again, depending on the use case and how mature the environment is, it's probably a little bit of both.
Greg Irwin 46:06
Folks, we've got about 10 minutes. Let's try to do a quick sprint of comments, bring some other folks in, and hear a little about different people's issues. And I'm going to ask one question: what's the one real thing you want to tackle over the next 12 months? Maybe it's process, maybe you're going to deploy a new APM, whatever, but it's around IT ops. So I'm going to bring some other people in, but I have one question for Nick before we do that. Nick, let's say somebody is listening here and they like it, but they say, "All right, you just told me it's going to take a year to get this thing all together. How do you prove it to me? How do I do a POC with you so I know this thing is actually going to work?" Is there any kind of quick hit somebody can do that can prove, and give me the confidence, that in three weeks or six weeks we're on the right track, and this thing has a shot at really improving my response to issues?
Nicholas Hassell 47:23
Yeah, no, that's a great question. I think the conversation has laid it out, and forgive me if I forget everybody that contributed, but David was speaking to this in his situation. It's important not to think of the problem in its totality, as automating everything; you have to find a place to start. It's too overwhelming to try to do everything, and it's unrealistic. That one year for that company was to show the full benefits that I mentioned, not to show any benefit at all. So I think it's important to think in terms of day zero: you have to have like-for-like functionality if you're going to forklift something out and put a new monitoring platform in, for instance. So there's day zero, there's day one, and then there's day two and onward. Whether it's ScienceLogic or any other vendor, you should understand what your day-zero and day-one value is going to be, because those are the shorter-term, immediate gains. And then it's: are we heading in the same direction? Do we have the same goals as a vendor and a consumer? Do we have input into the process? That's where the relationship comes into play. In the most common cases, the team was there, plugged into the business, trying to understand the problems, versus just selling a software solution. So I'm not trying to sidestep it, but it really is a matter of picking a couple of smaller use cases to show value; then you have the scale and workflow configurability to reach the larger use cases, and a larger volume of use cases, as I mentioned in the session before. Does that help?
Greg Irwin 49:14
100%. You're saying you can pick a couple of use cases and take a shot at it. You're not solving everything, but you can at least demonstrate on a couple of use cases. Absolutely. Cool. Let's do a quick sprint, a speed round, guys. Chuck, you've been sitting and listening. Nice to meet you. A real quick intro and a question: what are you working on in terms of improving IT ops for your organization?
Yeah, it's Chuck. Thanks for having me on. I'm an advisory consultant; my background is infrastructure operations, and I've been doing that since the 1980s. I've been listening very carefully, and it is very hard to keep up with the data coming at me. In my last engagement, I came out of manufacturing, a retail company with roughly 40 different sub-manufacturing sites and over 1,000 retail stores. We had data coming in from all sides: from IoT, from devices, from manufacturing, from everything else. And we just could not keep up with the amount of data.
Greg Irwin 50:28
What's one thing you're doing about it?
The one thing I'm thinking about is automation. Automation is basically trying to understand the data, trying to understand what's coming at you, how you're segmenting it, and how you're going to make sense out of it. We use ServiceNow; I've been working with ServiceNow quite a bit, especially the last several years, trying to figure out how we can put workflows around the data that's coming in. So when the data is ingested, we look at it, make sense out of it, and actually try to categorize it in a different way so we can send it to the right people. Because sec ops wants to see one thing, your infrastructure folks want something else, and your business wants something else. So what we tried to do was figure out how to automate that through workflows, so when we're getting the data it systematically goes in different directions, with a limited amount of data for some people, because one team is not going to be able to see it all. That's what we're trying to consolidate, and trying to figure out how to make it work. It's a challenge in security operations, because we're getting threat intelligence from feeds all over the world, and trying to make sense of it is crazy. But that's what it is, that's what we do, and that's what we're trying to figure out. ServiceNow has been very helpful for us in automating some of those workflows.
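The categorize-and-route workflow Chuck describes can be sketched abstractly. This is a hypothetical illustration, not ServiceNow's API; the source names, categories, and queue names are all invented for the example:

```python
# Hypothetical sketch of a categorize-and-route workflow: classify each
# incoming event and fan it out to the right team's queue, so no single
# team has to see everything. All names here are illustrative.
from collections import defaultdict

ROUTES = {
    "security": "secops_queue",
    "infrastructure": "infra_queue",
    "business": "business_queue",
}

def categorize(event):
    """Simple rule-based categorization of a raw event dict."""
    source = event.get("source", "")
    if source in ("ids", "threat_feed", "firewall"):
        return "security"
    if source in ("server", "network", "storage"):
        return "infrastructure"
    return "business"

def route(events):
    """Fan events out to per-team queues based on category."""
    queues = defaultdict(list)
    for event in events:
        queues[ROUTES[categorize(event)]].append(event)
    return dict(queues)

events = [
    {"source": "threat_feed", "msg": "suspicious IP"},
    {"source": "server", "msg": "disk 90% full"},
    {"source": "pos", "msg": "store 214 offline"},
]
print(route(events))
```

In a real deployment the rules would live in the workflow platform itself, but the shape of the problem is the same: classify once on ingestion, then deliver each team only its slice.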
Greg Irwin 51:51
Great, great feedback, Chuck, and thanks for taking some time and joining. Samuel, nice to speak with you. Same question: a real quick intro, and what's one fix you're working on?
Yeah, I'm on the application development side, not really the infrastructure side, but some of the challenges are similar. We have deployment automation: all changes go to production and to test environments automatically, and we keep expanding that. We primarily use the Microsoft platform, which was not supporting containers and things like that until recently. Now, at least theoretically, we're hearing that with .NET 6 we should be able to run Microsoft stuff in containers, though not the vendor stuff, because vendors would have to recompile it with each new version. So that's our next thing to try. We also use Splunk with ServiceNow, and we have alerts. It's all similar: similar challenges, similar direction.
Greg Irwin 53:02
Excellent. Thank you, Samuel. Let's go one or two more. Boris, nice to see you. I like the sandy beach behind you.
Nice to see you as well.
Greg Irwin 53:15
I wish. It's the New Jersey Turnpike right behind me. Boris, give us a little intro, and what's one thing you're working on?
Sure. I represent event management and monitoring; that's what we do across the company. I've already solved the situation with the billions of alerts. I've heard quite a few of these challenges here, and I've really solved it, and I can basically confirm what Nick said: his description of how to approach it is exactly the way to solve it. It doesn't necessarily have to be ScienceLogic, but you have to put a system in between; you have to have an open telemetry stream. We have maybe 100 tools, maybe more, and I'm happy to not have to integrate yet another tool, another signal. We're making sense out of them in a very powerful way, at the level of topology and at the level of correlation. We're actually doing artificial intelligence, and we started machine learning recently, so based on what I've heard, we're very mature on our side right now, probably even competing with this business in a way, but we're obviously an internal shop. I fully support this, and I like this company very much, by the way; we're having interactions with ScienceLogic as well, and consider it a viable platform.
Greg Irwin 54:53
All right, Boris, thank you very much. Now let's play the game. You all stuck around, and it was a great conversation; I want to thank you for that. We stayed on point, so now let's have some fun. Let me see if you can all see this: you'll see the wheel. I've got everybody's names in here. Let's see if this thing works. One shot, and the winner gets it. Let's see how we do. You cannot beat the Wheel of Names. And we go... Adam it is! Adam, are you on with us? Oh wait, Adam's not here. We lost Adam, so we get to re-roll; you must be present to win. Remove him, and let's go again. Ready? Everybody gets another life. Here we go. Don't tell him if you see him. All right, Rica, congratulations! Enjoy yourself. Okay, in all seriousness, thank you all for joining. Rica, we'll follow up with you on that. And hey, thank you all for joining, and a big thanks to Nick and the team over at ScienceLogic. Obviously, if these folks can help you, that would be wonderful; please take us up on trying to connect. LinkedIn is brilliant, a great way to connect, so try it. If you need some help from us, we'll do the best we can, but we're not going to publish anybody's email. With that, let's wrap it up. Thank you all for joining, and everybody have a great day.
Nicholas Hassell 56:53
Thank you so much.
Greg Irwin 56:54
Thanks, Nick. Thanks. Take care everybody.