Leveraging Large Language Models In Financial Services

Nov 9, 2022 3:00 PM - 4:00 PM EST

Key Discussion Takeaways:

Transformers have become a large part of AI. Their capabilities include natural language processing, chatbots, voice recognition, reinforcement learning, and much more. But how can they aid financial service organizations?

Companies like JP Morgan are utilizing transformers and AI to create and summarize documents and articles, predict data, and offer chatbots for customers. But AI doesn’t just have to be for the big corporations. There are open-source tools and technologies that can help financial services companies find cost-effective solutions to simplify processes, analyze data, and program languages.

In this virtual event, Greg Irwin is joined by Justin Hodgson, the Financial Services Business Manager at NVIDIA, to talk about transformers, AI, and software that can help your business scale large language models. Justin uses demos to break down the benefits of transformers, describes the different capabilities of transformers, and talks about how NVIDIA’s AI supercomputers can help enterprises thrive.

Here’s a glimpse of what you’ll learn:

  • Justin Hodgson talks about transformers’ current role in AI
  • Justin shares live demos of transformers’ capabilities
  • How can you build transformer functions into your financial organization? 
  • Justin explains how transformers can analyze video
  • What else can transformers be used for?
  • How financial services companies can analyze large datasets
  • What does it take to train models, and how can you scale on a budget?
  • How JP Morgan uses AI to improve performance
  • Justin describes how NVIDIA’s supercomputers work for enterprises

Event Partners

NVIDIA

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics and ignited the era of modern AI. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.

Guest Speaker

Justin Hodgson

Financial Services Business Manager at NVIDIA

Justin Hodgson is the Financial Services Business Manager at NVIDIA, a multinational technology company that invented the GPU (graphics processing unit). Justin is an experienced technology executive, having served as the Business Development Manager at Cisco for nearly 12 years. He’s skilled at creating and selling $100 million-plus technology and “as a service” solutions for customers who want to align end-to-end IT solutions with their business needs.

Event Moderator

Greg Irwin

Co-Founder, Co-CEO at BWG Strategy LLC

BWG Strategy is a research platform that provides market intelligence through Event Services, Business Development initiatives, and Market Research services. BWG hosts over 1,800 interactive executive strategy sessions (conference calls and in-person forums) annually that allow senior industry professionals across all sectors to debate fundamental business topics with peers, build brand awareness, gather market intelligence, network with customers/suppliers/partners, and pursue business development opportunities.

Discussion Transcription

Greg Irwin  0:18

Thank you all for joining; more people will be working their way into our Zoom. Again, I'm Greg Irwin. Justin Hodgson is joining from NVIDIA, and we are your co-emcees today, running the show and doing our best to make it interesting and informative. A couple of rules of the road, or things to think about. I think these types of forums work really well when you think of them as a networking opportunity. You'll learn and you'll pick up some tidbits in this session, no doubt, but if there's one thing we might be able to provide as extra value, it's actually the grid that hopefully you all have in front of you. So be proactive: I'll ask you to make the effort to try and connect with one person across this grid. If you need help finding them, come right back to us at BWG, and we'll make the intro. I always find that those references and the experience of people doing similar things are incredibly valuable. So please, the goal is to make one new contact. It doesn't have to be Justin or me; it can be anybody else. Then, obviously, we're doing this for awareness here with Justin and NVIDIA. Justin is the true subject matter expert, so use him as such and ask questions. Another goal: I'm giving you some work to do in the chat as we go. It doesn't have to be right up front. I'd like everybody to share either a question or the one thing you'd most like to hear about with regard to AI/ML models, performance, and infrastructure. Let's keep it roughly on topic. It could be one standard deviation away, but keep it roughly on topic: one thing you'd like to hear about, to make sure that this is interesting and relevant for you, so that you get good use of the time that you're putting in here today. Enough from me. Justin, do us a favor and give a little intro, please.

 

Justin Hodgson  2:35

Awesome. Greg, thank you so much. Really appreciate everybody's time this afternoon. How's my audio sound? Great. Crystal clear. Thank you to everybody for joining. My name is Justin Hodgson. I've been at NVIDIA for six years. I specialize in the AI space and NVIDIA's chips that make that possible. Many of you on this call, and I've seen the names of the companies you work for, are customers of ours already. We're very grateful for your business. The intent of today is to take you through an agenda that talks about the impact of large language models on AI; they are predominant at the moment. I'd love the questions, as Greg mentioned, but I'll take you through the agenda in a second. And then I'm going to take a risk, guys, and do some live demos. There's nothing like a risk on a, what are we, Tuesday, Wednesday afternoon? Wednesday afternoon. And assuming I survive the live demos, we'll take you through where I think the industry is going and what data scientists are doing in some of the accounts at NVIDIA. I've got a real-life example from JP Morgan and what they're up to, and I'll talk about some of the technology going forward and some of the things that you might consider for your organization. This is relevant whether you consume AI in the public cloud, like AWS, whether you do it on your own premises, or whether you do both. So with that in hand, I'm not going to go full screen for a second, because in a minute I'm going to swap to the live demos. Are you seeing the agenda slide now, Greg? Yes, we've got it. Okay, so

 

I'm going to talk you through some of the large language models, and I'm going to do a live demonstration of some. I'm going to talk about some of the supercomputers that were built for the enterprise, focused not on universities and government institutions, but on how AI supercomputers are being used in the enterprise today. I'm going to talk about the difference between doing it yourself and doing it in the cloud, and some of the pros and cons. I'd love some questions on that. And then we happen to be at a juncture right now where there's a new selection of NVIDIA chips coming out. The first ones are being shipped in January, February. So I'm going to give you a little insight into something called Hopper, the H100 GPU. Let's talk about the AI software that NVIDIA provides; nearly all of it is open source and free of charge. I'm going to talk about how that software specifically relates to large language models: how we help you scale large language models in the cloud or on premises, and the two main tools that make that possible. We're going to talk about networking, because these large language models span many nodes, many servers. I keep a list myself of what I consider the 16 state-of-the-art transformers, or large language models, at the moment, and I'll share that list. And if we have time, we'll talk about some of the other announcements that are coming out next year and give you a little insight into where NVIDIA is going and what we're doing. There are some interesting things happening in 2023. So with that, we're going to move on. Okay, Greg, I'm going to let you handle questions, because I can't do both things at once. My wife has told me that many times.

 

Greg Irwin  6:08

By the way, we're off and running. Chris, Bridget, by the way, thank you for the direction here. We're going to try and layer these in. Justin, I might prompt some of these questions as we go. But as you said, we're going to try and walk and chew bubble gum at the same time. If folks have some stories or questions on top of others, use the chat. It doesn't have to be static. Please, please jump in with your own comments on top of others. But Justin, the show is yours.

 

Justin Hodgson  6:41

Thank you. So there are a few things on this slide I want to cover. The first one I want to talk about is the bottom left-hand corner. If you look at the AI papers that are being written by the thousands every year, from all around the world, 70% of them in the last two years have been about transformers. For those of you that don't know what a transformer is, it's a type of model, typically built with PyTorch or TensorFlow. It's been associated with natural language processing: predicting text, creating text, creating chatbots, that kind of thing. But it does much, much more than that. Transformers have become a dominant force in AI at the moment, and I think they will be for the foreseeable future. My team are on this call right now, and two of the customers that we look after in New York are actually training two BLOOM models. Now, they don't know about each other, and I'm not going to reveal the names of the companies; that would be inappropriate. But one of them is training a BLOOM model from scratch with their own dataset. The other one has taken the checkpoint, which is a free checkpoint from the organization that created BLOOM, and they're fine-tuning it, again with a private dataset. This is a very large language model: 176 billion parameters. I'll give you some insight into why they picked it in a second. But essentially, just to talk about the size of it: the company that is training it from their own dataset has 500 A100 GPUs, which is the very latest GPU at the moment, and they've calculated it'll take 108 days running 24/7. They are right now about a third of the way through that, as of today. We've been helping them with some of the software, the open source software, to make sure that it is running and that the model is going in the right direction. And they're going to get some great results from it.
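To put the training run described above in perspective, the quoted figures can be turned into a rough compute budget. This is only back-of-the-envelope arithmetic on the numbers mentioned in the talk (500 GPUs, 108 days, running around the clock, about a third complete); it assumes every GPU is busy the whole time, which the talk does not state.

```python
# Back-of-the-envelope sizing of the training run described in the talk.
# Inputs are the figures quoted by the speaker; nothing here is measured.
NUM_GPUS = 500        # A100 GPUs quoted by the speaker
TRAINING_DAYS = 108   # estimated wall-clock days, running around the clock


def gpu_hours(num_gpus: int, days: float) -> float:
    """Total GPU-hours, assuming every GPU is busy 24 hours a day."""
    return num_gpus * days * 24


total = gpu_hours(NUM_GPUS, TRAINING_DAYS)

# "About a third of the way through" at the time of the talk.
days_elapsed = TRAINING_DAYS / 3
days_remaining = TRAINING_DAYS - days_elapsed

print(f"Total GPU-hours: {total:,.0f}")
print(f"Days elapsed: {days_elapsed:.0f}, days remaining: {days_remaining:.0f}")
```

Multiplying the GPU-hours by whatever hourly rate applies to your own environment gives a first estimate of what a from-scratch run of this size would cost.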
So I tell you the story because we're speaking from experience within NVIDIA with big financial services customers, like many of the people on this call. This is really happening right now. On the right-hand side of this screen are really the state-of-the-art AI deep learning projects that I see going on across our customer base. You see the first six are dominated by transformers. I'm going to give you some examples of that, both in a demo and in a couple of slides I've got on projects that are ongoing. The other two big, interesting areas: one that we constantly get badgered about is, where are we with graph neural networks? I'm not going to get into that today. But if you have questions, if that's important to you and your organization, please put something in the chat, because one of my colleagues will capture that and we'll follow up with you. Graph neural networks: 2023 is going to be their year. I'll explain why a little later in the conversation, but they will really come into force, and they do phenomenally accurate predictions. So that's a great area. And then the final area is reinforcement learning, again, if any of you are doing projects in this space. I've put an example on here, but there are lots of ways to use reinforcement learning in finance to create better scenarios and better predictions for the future. So these are the key areas that are happening right now. Now, the demos. By the way, for all of these slides, you don't need to make notes, and you don't need to cut and paste these particular URLs; you will get a PDF copy of these through Greg and his team. I'm not going to do them all, don't worry, but I love to leave people with something they can do themselves. All of these demos are public; you just might not be aware of them. You can click on these links yourself at your own leisure, and you can play with them. I'm just going to give you an insight into two or three of them right now, but I encourage you to try them. They're all transformers.

 

Greg Irwin  10:59

Justin, can somebody on your team just scrape that and put it into the chat? I think that would be fun for some people to play with these after the fact.

 

Justin Hodgson  11:14

I'm sure Rob will do it. I will let you guys get on with it. So what I'm going to do now is move over to my browser, and we're going to try one or two of these live, if I can pull them up; there are so many things going on on my screen. So

 

Justin Hodgson  11:53

let's go to the first. I'm sure a number of you use Hugging Face for your large language models. I don't know if there's a way on this to put your hands up. Who uses Hugging Face? There's a little icon in Zoom that allows... there you go. Okay, good. So Hugging Face have a wonderful repository of large language models, and this is one of them. It's one of the small ones, or small in today's world. This is GPT-2, and this particular one, I think, is about one and a half billion parameters. Remember that BLOOM one I was just talking about? It's 176 billion. I'll put some of these numbers into perspective in a little while. This is a fairly baby one. This one's been trained by Hugging Face, and you can type stuff in. I can put in something like "Tiger Woods" or "What I do at weekends is", and then if I hit the Tab key, it'll do an inference for me. I picked this one because it'll actually give you three inferences, three to choose from. Okay, and this is what transformers do: transformers try to predict the next word. Some of these transformers are intelligent enough to understand information beyond your sentence or before your sentence, but it's giving me an option of three. So I'm going to select "the" and hit Tab again. Okay, "greatest golfer". Tab again: "in the history". I'm not going to go any further; you get the point here. It's basically making a prediction. It's offering me three alternatives based on percentage likelihood in the AI, and it can construct intelligent sentences based on what it's been trained on. There's a dataset that's trained this. It's a public dataset; there are numerous public datasets, and if you need direction to those, just come to us and we'll show you where to get them. You can train this. You can actually download this directly from Hugging Face, so you can have all of this completely free of charge.
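The idea of offering the three most likely next words, each with a percentage likelihood, can be illustrated with a deliberately tiny stand-in. The sketch below is not GPT-2 and not a transformer: it is a word-pair (bigram) counter over a made-up corpus, written only to show what "predict the next word and rank the alternatives" means. The corpus and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; a real model such as GPT-2 is trained on a far larger dataset.
CORPUS = (
    "tiger woods is the greatest golfer in history . "
    "tiger woods is the best golfer of his era . "
    "tiger woods is a golfer . "
    "what i do at weekends is play golf . "
    "what i do at weekends is watch the golf ."
)


def build_bigram_counts(text):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    words = text.split()
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1
    return counts


def top_k_next_words(counts, word, k=3):
    """Return up to k (next_word, probability) pairs, most likely first."""
    followers = counts[word]
    total = sum(followers.values())
    if total == 0:
        return []  # the word never appeared with a follower in the corpus
    return [(w, c / total) for w, c in followers.most_common(k)]


counts = build_bigram_counts(CORPUS)
suggestions = top_k_next_words(counts, "is")
for word, prob in suggestions:
    print(f"{word}: {prob:.0%}")
```

A transformer replaces the simple counts with a learned model that conditions on the whole preceding text rather than a single word, but the output has the same shape: a ranked list of candidate next words with probabilities.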
And I'd encourage you to play with it. Now, what people have then said is, "Well, this is great, and it demos well, but what's the use to my bank? What's the use to my financial services institution?" So I'm going to take you now to one called BlenderBot. BlenderBot was created by Facebook; it's part of Meta. This is not live in any of their products. This is purely in Facebook research. And I'm going to start with BlenderBot. In the top right-hand corner, slightly out of view because of everything that's open on my screen, there's an FAQ that you can go to, and it'll tell you how this is built. You can also download everything from here: the weights of the model, the dataset, all these things. This is a very useful place to keep an eye on. And BlenderBot will allow me to have a conversation. So I'm going to scroll down to the bottom. It says it's busy... here we go, start chat. Now, this bot knows me, so it's actually remembering something I talked to it about. Last week, I was asking about 401(k)s. So rather than me pick something, I'm going to pick... who was it? Was it Wei who said you've used Hugging Face? Wei, who do you work for, if you don't mind me asking?

 

Greg Irwin  16:01

Wei may be muted. Let's see if we can unmute him.

 

Justin Hodgson  16:03

Wei, you can type in the chat if you like. Who do you work for? Citi. Citi. Citibank. Okay, great. So I'm going to ask this chatbot what it knows about Citi.

 

Justin Hodgson  16:20

This model was trained on 70,000 interactions between people. We don't know exactly, because Meta won't tell us, but they're interactions between call centers and people, and they're getting more; it's something like another 70,000 a month. So this particular interaction we're doing now is being sucked into Meta, and it will be used to train the model and make it better and better. It's done a pretty good job already: it recognized Citigroup. So, Wei, are you happy in your job at Citi? Type in the chat for me. Do you like your job? Are you happy? Is it a good job? Do they pay you enough?

 

Write anything you like. And while you're doing that, I'm going to ask the chatbot, "What is it like working at Citigroup?" This chatbot will actually take on the role as if it does work at Citigroup. "Are you happy?"

 

Justin Hodgson  18:10

I didn't see Wei's answer, because I don't have the chat open on my screen, but hopefully Wei's having a better experience than the cashier that I'm talking to in this chatbot. So, Wei, I appreciate you taking part in this. I would urge you all to try this, because everything you see here, you can incorporate in your organization. There is nothing I'm showing you that can't be downloaded or built into your organization. And the data scientists that I work with at NVIDIA can even show you how to incorporate your data into this to make it specific. So we could take, for example, all the Citigroup documentation around equities, around 401(k)s, or whatever you wish, and we can fine-tune this model from Facebook to be Citigroup-specific. You would then have your own chatbot that is this good, with this intelligence, based on a very large language model. So I will quit the demo there. But please, if this kind of thing is of interest, come back to the site. And if your projects aren't chatbots, if your projects are, "What we'd love to do is use this intelligence to summarize a document, or to write a document for us, or to understand a piece of financial information, like when the consumer price index is announced, and write an article about it," all of those variants can be done with this. Basically, the transformer is intelligent enough to understand English in this situation; we'll talk about foreign languages in a second. It's able to do all of these exercises, and now, at this size, at 176 billion parameters, it's able to do it in a very intelligent, accurate way. Wei and I don't know each other; I don't think we've ever met. This wasn't a setup. It was your hand that prompted me to pick you. Now, I'm going to show you a couple of other things. We're going to come away from natural language processing, and we're going to do something in video.
Okay, so this is another one; it's on the list that I provided. This uses a transformer to analyze videos. You can try it for free, and all you need to do is pick any YouTube link you like. The only caveat is that it has to be less than five minutes. Go on YouTube, find your favorite video on any topic you like, and you just paste the URL. I'm going to pick one that I know is already there, for time's sake, this bottom one, and this is one of my passions beyond golf: Formula One racing. In Formula One, there was a race recently in Japan, and it was pouring with rain. This is a clip from that race featuring a gentleman called Pierre Gasly, who's one of the Ferrari drivers. Let me see if I can get rid of the bar at the top. I'd like to show you the video first, if I can do that; that'll mean more to you. Right, so there's a red Ferrari, and you're going to see it spin, and then other cars come by. We're going to see the other cars flying by.

 

Justin Hodgson  22:08

And why this hit the news is that a tractor comes out and rescues the car, but there are still these cars flying along at 100 miles an hour. It was a very dangerous situation. So that's why this clip became somewhat famous. I'm going to pause this for a second, and we're going to go back to

 

Greg Irwin  22:26

Justin, just so you know, based on the way it's rendering through Zoom, we're seeing snapshots of it. I don't think we're seeing a full-resolution rolling video.

 

Justin Hodgson  22:39

No problem. You got a sense of what it is; that's what's important. Now, you're like, "What is a transformer going to do in this world?" Well, the transformer basically analyzes the video. It's already done that for me when I clicked on the URL. Now I'm going to say to it, I want to find all

 

red cars. And what the transformer is going to do is identify, hopefully, all red cars in this video.

 

Justin Hodgson  23:15

It'll look a little slow, because I've got a whole bunch of stuff running at the same time. But you can see at the one minute and 43 second mark, this is the red car. If I scroll down, you'll see other cases; it's very good at this. So then you're like, "Wow, that's clever. What about the tractor that you mentioned, Justin? Can we get to the tractor?" So let's go and have a look for the tractor. I'll type that in and see if we can find the tractors. And there's one: there's the yellow tractor at two minutes and 10 seconds. If I scroll down, you'll see there are other images of it. So this transformer has learned, basically, what video images look like, and it's able to identify them. Now, this is not very businesslike so far, but let's imagine that one of the folks on this call works for Rolex. Rolex are a big sponsor of Formula One, and you want to know how often Rolex get a look. Okay, so if I type in Rolex, it'll show me if there are any images of Rolex that are making it onto the TV screen. You can see it starting to find where the Rolex images are. And if I scroll down, there we go. It's remarkably good at this. Now, by all means, play with it. Try the YouTube video of your choice. But what I think's going to happen, in fact, I know it's going to happen, is in the business world. On a Zoom meeting like this, I'm going to look at the gentleman at the top and pick on Stephen. Stephen, a pleasure to meet you; thank you for joining us. I've been watching your face. You've been paying attention, you've been leaning forward, you're interested. Thankfully, you're smiling at the moment; that's a good sign. You are engaged in this conversation. However, there are another 30 or 40 people on this call, and I can only see three. Eric, thank you for smiling. But I can only see three of you.
Wouldn't it be nice if an AI was watching this, studying your faces and deciding who's engaged, who's on their phone, who's talking to the dog, who's not got video on, and gave me an idea of your reaction to what I'm saying? Particularly in the financial services market, where nonverbal communication is critical. If we can identify Formula One cars and tractors and Rolex signs, we can do that with people's faces. In fact, we're doing it at NVIDIA, within the self-driving car community, where the cameras in the car actually watch the human to make sure they're alert and they're driving the car. So there's an awful lot of opportunities with video and still images to use transformers in this space. I'm going to come out of this now, and I'm going to show you a couple of other demos, but they're just going to be slides. They're not going to be live demos.
I think, like any new technology, it takes a little bit of getting used to. This call, for example, is being recorded. You probably got a message when you logged in: "This call is being recorded. Do you want to join?" And you joined. We have apps on our phones that monitor where we are in the world all the time, and we put up with them. I think as long as people are informed and have been told up front, "This is what's happening," it's appropriate. Obviously, I don't think it's appropriate if it's being done secretly. But it depends: am I helping you? Basically, if that monitoring of you, of what you do and what you think, is useful, and I then follow up and say, "Hey, I saw that you're very interested in X. Is there something I can help you with?", I think you'd like that. If I'm doing it surreptitiously, I agree with you, that would be inappropriate. Any other questions right now?

 

Greg Irwin  27:28

It speaks to a comment Eric made earlier, Eric Cartman, on the trade-offs between performance and executing all of this. And here, in particular for financial services, there's the compliance overlay of that step: can I get your approval? Can I audit the process? Can I anonymize some of the data? The compliance checks, I think, make this whole exercise more complicated, for sure.

 

Justin Hodgson  28:00

So I'm going to go into full-screen mode now, because from now on I'm going to stay on slides. You just tell me if this looks okay. Yeah, looks good. Okay, so one other area I had on the previous slide was using transformers for other things. Of the two things they do extremely well, one is predicting time series data. This is an example of a project we did about seven or eight months ago; it was presented publicly. If you're interested in the video, I can send you a link to that from the customer that did this. Essentially, what I challenged them with at the time was to predict some data into the future. And I said, "The caveat is NVIDIA would like you to predict inflation." Inflation is a big topic. So whatever you choose to predict yourself for your own purposes, we want you to predict inflation. That was the one in green here. He picked a particular type of CPI, the one for all urban consumers, less food and energy; that's the one he picked. He works for an asset manager called Cohen & Steers in New York. You may know them; they're famous for trading REITs. And he said, "Well, I like the VIX." I said, "Why?" And with a big smile on his face he said, "I know how to make money out of trading the VIX." So those are the two we'll predict. And I just said to him, "Look, you've been a quant for 30 years in New York. What do you think are the other pieces of data that are relevant to inflation and to the VIX?" And this is the list he picked. My data scientist and I had no input into this, other than that we wanted inflation to be one of them. The gentleman provided us (we did it jointly as a project) with a dataset dating back to December 29, 1989, so just over 30 years' worth, and you'll see that the categories he picked are across the top here. So this is the S&P information technology index.
This is the financial services industry index, this is the industrials, and so on across the top. I won't read them all out, but it's that previous list. These are the dates down the left, this is a daily close, and these are the actual numbers from those time periods. Okay. He gave us the data up until December the 16th, 2021, so about 11 months ago, and he said, "Predict for me December the 17th, 18th, out to one week for the VIX. If you can predict the VIX accurately seven days in advance, I can make money. And I need you to predict inflation monthly, because it's a monthly number." This transformer was trained on this dataset, and then it goes into inference mode and it creates predictions. It doesn't create just one prediction per cell; it can predict as many as you like for a cell. What we agreed on with this customer is that we would do 100 predictions per cell, and then we would create a distribution chart of those predictions. Just like the Tiger Woods example had three different alternatives, this has 100 alternatives. You create a distribution, and then you pick an average or a median or whatever the quant chose at the time. And the predictions were remarkably good, to the point that he took myself and three colleagues out and bought very expensive bottles of wine for us. So we know he's happy. We're doing a follow-up to this project, and we're going to release the results in March. That's to predict the price of a barrel of oil. So look out for Cohen & Steers and NVIDIA presenting the price of a barrel of oil. All of this we'll make available to anybody on this call who wishes. We have a training course on how to do this. You can do it with any time series data; we'll use this as an example. The code that is used to create this model, and how we use PyTorch to do it, we will train you on how to do it. And we'll give you the code: no charge, no catch. We know if you use transformers, you'll use our GPUs.
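The "100 predictions per cell, then a distribution, then an average or a median" step described above can be sketched in a few lines. In this sketch the trained transformer is replaced by a random number generator, purely as a stand-in, and the VIX-like level of 20 with a spread of 2 is a hypothetical value chosen for illustration; neither comes from the project itself.

```python
import random
import statistics


def sample_predictions(model_mean, model_spread, n=100, seed=0):
    """Stand-in for running a trained model n times on one cell.

    A real model would produce these samples at inference time; here they
    are drawn from a normal distribution purely for illustration.
    """
    rng = random.Random(seed)
    return [rng.gauss(model_mean, model_spread) for _ in range(n)]


def summarize(samples):
    """Collapse many sampled predictions into point estimates and a range."""
    ordered = sorted(samples)
    # quantiles(n=20) returns 19 cut points; the first and last are the 5% and 95% marks.
    cuts = statistics.quantiles(ordered, n=20)
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p05": cuts[0],
        "p95": cuts[-1],
    }


# Hypothetical numbers: a VIX-like value centred on 20 with a spread of 2.
samples = sample_predictions(model_mean=20.0, model_spread=2.0)
summary = summarize(samples)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Keeping the whole distribution, rather than a single number, lets the person using the forecast choose the point estimate (mean or median) and also see how wide the range of plausible outcomes is.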
That's what's in it for us. So hopefully that will give you a flavor of some of the use cases. As the quant pointed out, in this world of prediction, I don't have to be right 100% of the time. In fact, if I'm right sort of 50 to 53% of the time, I'm going to make money if I'm a disciplined investor.
I think where you're going with your question is something that reinforcement learning addresses. We've done a project with one of the customers on this call, so I'm not going to mention the name, where they created a sort of synthetic stock exchange. Instead of humans trading on it, they created reinforcement learning bots, and those bots were told to go make a profit and to trade. There were certain caveats put around them. But to your point, once a trade happens, that trade in its own right affects the price of an underlying asset. Reinforcement learning addresses that issue; it takes it to a whole new level of complexity. So you're going to address it in reinforcement learning. In terms of what we did with this quant from Cohen & Steers, we didn't take into account the observability of the trade. It was purely that every Friday, we would create a prediction for the following Friday for the VIX and for inflation. Nobody was trading on that; we were just looking at it out of interest. If you come on to the training, we'll show you the charts of what was predicted and what actually happened. So now we've got about a year's worth of data on what we predicted and what actually happened, and you can see how good it was.

 

Greg Irwin  34:17

Why don't we stay with this. Thanks so much, Sandeep and Justin. I'm going to bring forward a couple of other questions. Steve asked a great one: he's interested in whether anyone is training their own LLM or doing significant fine-tuning for deployed applications. What's your experience in terms of how some of your clients are tackling LLMs?

 

Justin Hodgson  34:42

So I would say 90% of people are doing fine-tuning. They're taking an established LLM where the weights have been provided for free, from Hugging Face or OpenAI or whoever has provided them, and I've got a list of them. BLOOM was certainly picked by one of the customers for that very reason: they didn't have to do the initial training, which is expensive in terms of compute. However, for the 10% of people who have budgeted for this, doing it with your own dataset always gives you the most accurate results. The particular customer I'm thinking of has huge datasets, like many of you do, and they're investing in that because they've seen incredible results. So they're not doing BLOOM from scratch just hoping for success; they have been building up this chart along the way. And I'll explain this chart. Each of these models on here is a transformer, and transformers started really around BERT-Large, in sort of late 2019. That's the first really well-known transformer; you can see it was 340 million parameters. People like the customer I'm thinking of started with this model, and they trained it, and they were like, wow, this is formidable. And as new models have come out, they've trained on bigger and bigger versions of these models. So they went to, you know, 8 billion, 15 billion, etc. And as they built bigger models with the same data, they just got more and more accurate results. So for the people who can afford to train from scratch, that's the best approach. Many customers can't or won't initially, because they just want to prove this will work. And taking a model like BLOOM, which already provides a free checkpoint, I totally recommend it. I would tell you to do it with BlenderBot as well: take an established model with weights and fine-tune it with your data. And if you don't know how to do that, come to us, we'll show you how to do it, no charge.
We'll even give you the code for how to do it. As you can see, these models are getting bigger and bigger and bigger. And if I build up this slide, you'll see that in the first couple of years these models grew 340 times, 1,500 times in the last three years. This Megatron-Turing is one that NVIDIA did with Microsoft: 530 billion. I have yet to see an enterprise do this; the biggest enterprises are doing sort of 175 billion, right around here, right now. But this is the next step. Google have put out Switch Transformer, and there's one out of China, the Beijing Academy put one out called, I think, Wu Dao. It's Chinese and English, and it's public as well. And that's 1.75 trillion. Our CEO, Jensen Huang, believes that this growth is going to continue, so we think models will be 1,000 times bigger than they are right now by 2025. The reason they keep getting bigger is because big equates to accuracy. Then the question is, can I afford to do big? Where can I be along this trajectory to make something valuable for my organization? And how can I use it to predict time series, to create synthetic data, to look at video, to look at images, etc.? There are 100 use cases. So we would love to engage and help you in that. And I'm going to share a couple of things about the technology in a second that will help with this in 2023 and make it more affordable.
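As a rough check on the growth figures, the parameter counts quoted in this answer can be compared directly. The sketch below assumes the "1,500 times" figure compares BERT-Large (340 million) with Megatron-Turing (530 billion); that pairing is an inference from the numbers mentioned, not something the slide confirms.

```python
# Parameter counts as quoted in the talk.
models = {
    "BERT-Large": 340e6,
    "175B model (largest enterprise runs mentioned)": 175e9,
    "Megatron-Turing": 530e9,
    "Wu Dao": 1.75e12,
}

baseline = models["BERT-Large"]
# Growth factor of each model relative to BERT-Large.
growth = {name: params / baseline for name, params in models.items()}

for name, factor in growth.items():
    print(f"{name}: {factor:,.0f}x BERT-Large")
```

Under that assumption, Megatron-Turing comes out at roughly 1,559 times the size of BERT-Large, which is in line with the "1,500 times" quoted.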

 

Greg Irwin  38:44

So I have a question here, Justin. I think we understand the scale. I'm really glad you started with the demos, because that kind of drove home the capability. My question is, what kinds of things that are not possible today will be possible in two years because of the expanded models and performance?

 

Justin Hodgson  39:11

So I think there are enough really, really good transformers available right now that have already been designed. They're on the list on the final slide in this deck, which you're all going to get. I think the problem for most customers is that it's too expensive to train them, or it's too expensive to adopt that model and fine-tune it with their data. And that's where we're trying to help. We've got open source tools. We've got some new technology coming imminently that's going to make it a lot more cost-effective to do that. I'd love to spend a few minutes on that, but I'm conscious that we are also going to run out of time. So you tell me, Greg, do you...

 

Greg Irwin  39:59

Go on, I love this. If others want to redirect us, raise a hand or drop it in the chat. Otherwise, Justin, you have the pace on the march. Okay, great.

 

Justin Hodgson  40:13

So these numbers in the two red boxes are just to illustrate, with today's technology, how much compute you need to train some of these models. It's not exactly the same number, but there's one here, in this case 145 billion parameters, so very similar to BLOOM, very similar to BlenderBot. You can see that if you want to train that in about a month, it would take you 1,500 GPUs. That's a lot. We'd love you to buy 1,500 GPUs or rent them, but that is a lot, and we acknowledge that there are relatively few companies that will do that. The company that I mentioned that's doing it is doing it with 500 GPUs. It'll still work, it just takes three times as long, so they're taking 108 days instead of one month. But a lot of customers could get tremendous value out of a model that is somewhere around the 20 to 40 billion parameter mark, like this one: 500 GPUs in 30 days. But we can make it smaller. I'm going to talk about a piece of software called Megatron that allows you to scale this size to meet your constraints, your budget, your ability to train this in a reasonable amount of time. And I'm also going to tell you about the next generation of GPU, which is imminently out, which is four to five times faster than these numbers. So let's say, for the bottom one, instead of needing 512 GPUs, you'd only need 100 of the new one to do this in a 30-day period. And I think, Greg, that's what NVIDIA is trying to do: we're trying to bring this tech that already exists to the market in a more affordable way. Whether you buy 100 of the new GPUs or you go to AWS or Azure and rent them doesn't matter. They both work great. And we create scale as you go up this curve.
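The trade-off between GPU count and training time described above can be sketched with simple arithmetic. This is a back-of-the-envelope model only: it assumes the total GPU-days for a job stay constant (perfectly linear scaling), and it assumes a 36-day baseline for the 1,500-GPU run, a figure inferred from "108 days" being three times as long rather than one stated in the talk.

```python
import math


def days_to_train(baseline_gpus, baseline_days, gpus, speedup=1.0):
    """Training days on `gpus` GPUs, assuming total GPU-days stay constant."""
    return baseline_days * baseline_gpus / (gpus * speedup)


def gpus_needed(baseline_gpus, baseline_days, target_days, speedup=1.0):
    """GPUs required to hit `target_days`, rounded up to a whole GPU."""
    return math.ceil(baseline_gpus * baseline_days / (target_days * speedup))


# 145B-parameter example: 1,500 GPUs for an assumed 36 days, rerun on 500 GPUs.
print(days_to_train(1500, 36, 500))            # 108.0

# Smaller model: 512 current-generation GPUs for 30 days,
# rerun on a GPU that is 4x to 5x faster.
print(gpus_needed(512, 30, 30, speedup=4.0))   # 128
print(gpus_needed(512, 30, 30, speedup=5.0))   # 103
```

The 103 to 128 range is where the "about 100 of the new one" figure comes from. Real jobs rarely scale perfectly linearly, so these numbers are a lower bound rather than a sizing guide.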
If we don't get to Megatron in this conversation because we run out of time: if you're building large language models and you want an efficient way to scale a large language model across 100 GPUs, or 50, or 20, whatever the number is, talk to your NVIDIA representative or come back through Greg to me, and I'll make sure you get hooked up with a data scientist who will show you how to use Megatron to do this. Think of Megatron as a framework of software that allows me to automatically scale these large language models across multiple GPUs and multiple nodes. It does it extremely well, and it does all the parallel processing and all the communication automatically for you. And there's also one for inference called FasterTransformer. So inference is scaled by FasterTransformer, training is scaled by Megatron. They're both open source, you can download them from NVIDIA, but get one of the data scientists at NVIDIA to train you on how to use them. They work with PyTorch or TensorFlow. Very briefly on JP Morgan: I'm not sharing anything that isn't public already. These are just two slides from the latest investor report. But AI has made their investor report. Next year they will be spending a billion dollars on AI, and they're seeing a $2 billion return; that's a projection for next year. So spend a billion, make a billion. And they promoted two particular use cases. We know, because Jamie Dimon has talked about this publicly, that they have 500 AI projects live, and he says that will double to 1,000 by 2024. You can see some of the improvements in performance of equities trading and modeling risk. And they've improved their cycle of generating AI by about 70% with a bunch of things I'd be happy to talk to you about another time. The other slide here is how JP Morgan is doing this. Well, initially, they were all on premises; they did all their AI on premises on their own computers. Then they moved entirely into the cloud.
They've talked publicly about their very good relationship with AWS. But their strategy right now is to do both. They want a hybrid infrastructure where they do public cloud, and they want to go beyond just AWS: they want access to Google, Azure, etc. At the same time, they have their own infrastructure on prem, and they determine where the jobs are going based on the profile of the job: training, inference, latency, that kind of thing. So they're an interesting example. I would say JP Morgan, for anybody on this call who works for them, is probably one of the leaders in the financial services space, right? They're not the only ones, but they are one of the biggest, and they're a good example to follow. Now, let me just do a quick time check and take a moment for some water.

 

Greg Irwin  45:29

We've got about 10 minutes left. So any last questions, drop them in here, and I'll do my best to bottle them up. Also, of course, there's room for follow-up. Just a reminder, not to hide the lede: if you do have models, or projects, or just general questions, let us connect you here with Justin and his team, and we'll make sure they get answered. And again, we want to make sure this is useful for you.

 

But we've got 10 minutes left, so let's finish strong.

 

Justin Hodgson  46:02

Great. So five or six minutes, and then I'll leave the last four minutes for questions. AI supercomputers have always been the preserve of big government bodies, universities, that kind of thing. Not anymore. AI supercomputers are now affordable for the enterprise, and they're being deployed actively. I'll mention a couple. Why is NVIDIA the best place to build these? Well, 355 of the top 500 in the world are built on NVIDIA technologies, 70% of them. We've got very, very good at this. And now, in our new technology, which I'm going to show you in a second and which launches in February, we're able to scale it to pretty much any size you want. This is a rendering of one of our AI supercomputers. We actually have six within NVIDIA that are all used entirely by us. We have our own research team of about 1,000 people, and we have about four or five thousand other engineers who use these for everything from robotics through to self-driving cars through to transformers. You can see this particular one is the sixth fastest in the world. It's called Selene, and it's built on the current A100 technology. These are two that we built: one for Meta, for Facebook, earlier this year, and one we built for Microsoft late last year.

 

Justin Hodgson  47:33

And the Microsoft one is interesting because Microsoft obviously have their own cloud, so their engineers can use Azure as they wish. But they realised that the way these supercomputers are designed gives much better performance, and I'll show you why in a second. Essentially, they built an AI supercomputer using our architecture, just like Facebook did, for their own purposes. Because once you get into transformers and these huge models, the architecture of how these are designed is critical to performance. Another one is Johnson & Johnson, a classic enterprise based in New Jersey. They'd been a big, big user of AWS, and they still are; they love AWS. They also realized that having their own infrastructure was going to save them money and make it more economical for certain types of jobs, these very large transformers that span multiple nodes and GPUs. And if you're interested in the J&J example, there are some videos where they've interviewed the CIO of J&J, and he tells you a little bit more about how they use them. Public clouds can struggle in terms of scalability. The blue line on here is an NVIDIA AI supercomputer. You can see along the bottom here we've got how many GPUs, with the y-axis showing the performance of the platforms. Has somebody just muted me for a second?

 

Greg Irwin  49:10

I think the control they're just law.

 

Justin Hodgson  49:32

Because that was a test, Greg. Was it a pass?

 

Greg Irwin  49:35

You pass.

 

Justin Hodgson  49:36

Thank you very much. So the way these supercomputers are designed is they're designed for linear scale. As I add more and more components, I maintain the scale of performance. The cloud unfortunately doesn't work that way. It wasn't designed to work that way. The cloud was designed for cost: how can I rent a GPU or rent a server and do it as economically as possible? So the way the cloud is networked together isn't as optimal as a supercomputer. And that's why supercomputers built at the right scale, or the right economies, can be a very economical way of doing this work within your organization, and you should certainly consider it. Let's go on for a second. So I'm going to talk about the H100. H100 is the new GPU that was announced about six months ago, and it's coming through the OEMs, like Dell and HP, in the spring. We're launching a range of supercomputers based on our server called the DGX, and we provide it as a pod. I'll show you what that means in a second. But really, what makes this crazy fast is the super-fast networking between the nodes. This is the technology itself; this is the new H100 GPU. Just to give you a relative value: Intel are just about to launch a CPU called Sapphire Rapids. It's been delayed a few times, and it looks like it's coming out in March. Sapphire Rapids does four teraflops, 4 trillion floating point operations per second. This GPU performs 60 trillion floating point operations per second, just to give you some idea of the performance of this chip. I'm not going to bore you with chip stuff, but this is the fastest chip in the world by a mile. It is going to significantly outperform the current generation. The current generation is the A100; you can see the grey bars are the A100. For AI training, on the left here, is GPT-3. This is four times faster on the new chip. These chips were designed for transformers; they have a transformer matrix generator on them that makes them go fast.
And you can see it's somewhere in the range of three and a half to 5x the performance, depending on what you're doing. These are different types of transformers. And for inference, it's even faster. So H100 is going to allow you to do some of these models that you can't afford to now, because you're getting three and a half to five times the performance of the previous generation. So whether you love the cloud or whether you love on premises, take a look at H100. You can start reading about it; they're coming imminently. Supercomputing is next week, and often there are announcements at Supercomputing, so keep an eye out for that as well. These next two slides are where I want to leave you in this presentation, and then we'll take a few questions. What happens if you want to consume your own AI supercomputer? You want to buy one, rent one, whatever? Well, what we did very cleverly this time is we were building our new one based on the H100 GPUs. Our internal code name for it is Eos. You won't see that in the press, but that's just what it's called: the supercomputer Eos. The way it will be built, there'll be 4,608 GPUs, all interconnected at 3,600 gigabits per second. If you think about your home network, you're probably at 100 megabits; maybe you're on a superfast one gigabit. This is 3,600 times faster than the network coming into your house. And it's that that makes it super performant compared to renting GPUs in the cloud. Now, when we're building them, you can see multiple rows down in the right-hand corner here: rows and rows and rows. These are all interconnected at the same speed. But what we've done very cleverly this time is said, you don't have to buy all these rows. You can buy one row, or half a row, or a quarter of a row, based on your budget, and we will package it up for you. Here's one which has 31 of the DGXs in it. It's connected to high-speed storage from NetApp.
With this particular one, you don't have to use NetApp, but they build a great product. You can see this has 248 GPUs in it, and it's equivalent to 1,000 of the current generation or 3,000 of the generation before that. And I bet a bunch of you on this call are still using the V100s. So if you go from V100, skip A100, and go to H100, 248 of these is equivalent to 3,000 of the V100. That's what's making large language models affordable. And, of course, this is a shared resource; I've got a diagram in the bottom right-hand corner of users.
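The pod and network figures quoted above can be cross-checked with a few lines of arithmetic. The sketch assumes 8 GPUs per DGX system, which is consistent with 31 systems giving 248 GPUs; the variable names are illustrative only.

```python
GPUS_PER_DGX = 8  # assumption: 8 GPUs per DGX, matching 31 * 8 = 248

# Pod sizing quoted in the talk.
pod_gpus = 31 * GPUS_PER_DGX               # GPUs in the 31-DGX pod
full_system_dgx = 4608 // GPUS_PER_DGX     # DGX systems in the full 4,608-GPU build

# Quoted equivalences: 248 H100 ~ 1,000 A100 ~ 3,000 V100.
h100_vs_a100 = 1000 / pod_gpus             # per-GPU ratio implied for A100
h100_vs_v100 = 3000 / pod_gpus             # per-GPU ratio implied for V100

# Interconnect compared with home broadband, in gigabits per second.
interconnect_gbps = 3600
vs_gigabit_home = interconnect_gbps / 1.0
vs_100_megabit_home = interconnect_gbps / 0.1

print(pod_gpus, full_system_dgx)
print(f"H100 vs A100: {h100_vs_a100:.1f}x, H100 vs V100: {h100_vs_v100:.1f}x")
print(f"{vs_gigabit_home:.0f}x a 1 Gb/s line, {vs_100_megabit_home:.0f}x a 100 Mb/s line")
```

The implied ratio of about 4x over the A100 sits inside the "three and a half to 5x" range quoted for the H100, and the "3,600 times faster" comparison corresponds to the one-gigabit home connection rather than the 100-megabit one.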

 

Greg Irwin  55:05

Justin, we're going to be losing people here in a moment, so maybe we use this as our point to wrap up. Great. It's okay. There's certainly more to cover, and I thought it was fantastic in terms of the content. So, you know, just applause for really covering great, interesting content. Let's do this: back to the original goal, make a contact. If you want to connect across this group, absolutely. If you want to connect with Justin or the team here, you know where to find us. And Justin, a big thanks for a great session.

 

Justin Hodgson  55:40

Thank you very much. Really appreciate everybody's time. We look forward to getting to know some of you offline. Take care, everybody, have a good rest of the day.

 

Greg Irwin  55:47

Thank you. Thank you all. Take care.


What is BWG Connect?

BWG Connect provides executive strategy & networking sessions that help brands from any industry with their overall business planning and execution. BWG has built an exclusive network of 125,000+ senior professionals and hosts over 2,000 virtual and in-person networking events on an annual basis.