Jonathan Allen:
Now, you and I both work with a lot of customers who have iterated their way through generative artificial intelligence to solve real problems. I thought Andy spoke really eloquently yesterday about the iteration you have to go through. But I think you and I have both seen customers get to the point of, "I'm 99% of the way there, and now I'm going to hit pause because of hallucinations."
Dr. Swami Sivasubramanian:
Yeah.
Jonathan Allen:
So obviously, we do a lot of science at HAQM. And I'm truly interested to hear you talk about some of the progress we've made there.
Dr. Swami Sivasubramanian:
I think this is a fascinating area when you think about what LLMs are capable of. But what many of us forget once in a while is that, at the end of the day, these are highly probabilistic models. Even if what they generate is wrong or inaccurate just 1% of the time, that can be a real production blocker, especially in regulated industries.
So when we looked at this problem, we said, "What is the best way for us to ground these with more certainty?" And this is where my team has invested in a science called automated reasoning. If your engineers are using our IAM policy authoring tools, those are already powered by automated reasoning capabilities, which check whether a policy is enforced the way you think it is enforced.
It's a mathematically verifiable construct using formal logic. And we thought, why not combine the mathematical proof provided by automated reasoning with the creativity and expressiveness of GenAI, to produce something that is extremely creative but also grounded in formal logic?
And this is one of the groundbreaking innovations that I think can change the landscape, because hallucinations are going to be one of the big blockers to putting GenAI innovations into regulated industries. We have been working with a few customers in these spaces, and the early results are really, really promising. You give these models a document that says, "Here are the rules, the grounding principles you can never violate," and provide it as a guardrail. It then constructs a semantic model that acts as the guardrail the LLM cannot violate.
And when the LLM generates an inaccurate response, it goes back and tells the LLM, "You've got to try again, you are violating this rule." And then suddenly everything becomes much better. These are the kinds of techniques we used even in our Java upgrades. When you think about it, formal logic is not new; it is what powers our computers today in many ways, with things like compilers and so forth.
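The check-and-retry loop described here can be sketched roughly as follows. This is a toy illustration, not the actual Bedrock automated reasoning service: the rule checker below is a simple keyword blocklist standing in for a formal-logic semantic model, and `generate_with_guardrail`, `check_rules`, and the `llm` callable are hypothetical names.

```python
def check_rules(response: str, rules: list[str]) -> list[str]:
    """Return the rules the response violates. A real automated
    reasoning checker would evaluate formal logic over a semantic
    model; here a forbidden-keyword check stands in for it."""
    return [rule for rule in rules if rule.lower() in response.lower()]


def generate_with_guardrail(prompt, llm, rules, max_retries=3):
    """Ask the LLM, check its answer against the rules, and feed
    any violation back so the model can try again."""
    feedback = ""
    for _ in range(max_retries):
        response = llm(prompt + feedback)
        violations = check_rules(response, rules)
        if not violations:
            return response  # grounded answer, no rule violated
        feedback = f"\nYour previous answer violated: {violations}. Try again."
    raise RuntimeError("Could not produce a rule-compliant answer")
```

The key design point is that the checker's verdict is fed back into the next generation attempt, so the model's creativity is steered rather than simply filtered.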
And that's what we're doing, and we are very excited about it. Of course, there is a lot more machine learning-based innovation happening, and I can get into how we built Nova and various other things too. But this is something I don't think the industry has woken up to yet.
I almost view it as the yin and yang, so to speak, because formal logic is all about certainty, whereas GenAI is all about unleashing creativity and accepting imperfection. If you marry these two well, I think the results could be amazing.
Jonathan Allen:
I think certainly in the regulated industries, mathematical proofing is actually pretty well established. But the automated reasoning element coming in, I think having listened to customers, it's something that they're going to rapidly embrace. I'm super excited about it.
Now when I was a customer in this room before I joined HAQM, I wanted to hear about the lessons learned. You've been at HAQM a long time. You've worked on many machine learning and generative artificial intelligence initiatives across HAQM.
For this group here, what are the two or three most pronounced lessons you've learned as you've helped those initiatives come to life? Lessons you could share with this audience so they don't make the same mistakes without realizing it?
Dr. Swami Sivasubramanian:
Yeah, it's a tough one. The two or three biggest ones, I would say: when the technology is leading edge, one of the mistakes we make as organization owners and product owners is not investing enough time, from the engineers all the way up to the CXO level, to get hands-on and play with it.
To me, I actually think this is one of those where unless you really know and embrace and really experiment with this technology, you don't realize the power and potential of what it can do. And to me, that is often overlooked. So that's why I almost make it a point every week I have a few rules that I set for myself.
I at least meet with three to four customers personally, and I at least read three to four research papers personally. Those are the two rules, and I tell my teams that. And then over the weekend I either write code for fun or work on a robotics project with my daughter, one of those two, so that I stay grounded.
Jonathan Allen:
Yeah, these are good forcing functions, right?
Dr. Swami Sivasubramanian:
Yeah, exactly. Otherwise, you end up operating at a level where you are not grounded.
Now, the second thing I'll say is that we had to really be attuned to the fact that when the pace of innovation is this rapid, the mechanisms we typically use to manage at scale really do not work. The pace of innovation in technology has suddenly become 10X what it was five years ago.
All our existing mechanisms for managing programs and engineering projects are simply not fast enough for the pace at which you want to innovate. So it requires a different style of management, where sometimes you don't need to engage every layer of managers before you get to the person who's actually designing the product and writing the code. I have changed my style pretty drastically in the past three years for that reason.
And then the third one goes without saying; it's in HAQM's DNA. Work backwards from the customer problem and the business objective you want to accomplish. At the end of the day, as much as I'm passionate about GenAI, AI, and data, it's a means to an end, to solve something for your customer and for your business.
This is why, in the early days of deep learning and also the early days of GenAI, we all built a lot of proofs of concept, and they all went to the graveyard to die a slow, painful death, because we really didn't ask which ones were going to make an impact on the business in terms of return on investment.
And that is more important than ever, especially in the GenAI world, where the cost is really, really high. And you saw my launches; a lot of these are all about how you manage things at scale. If you saw SageMaker HyperPod task governance, that came from a big lesson at HAQM. Last year we had a lot of GenAI projects.
You can categorize them as inference, training, fine-tuning, and experimentation. And we internally built a system to automatically move resources between them. Because the challenging thing about inference is, let's say something like our shopping agent requires 1,000 Trainium2 instances at its peak. At nighttime, not that many people shop, but you can't have those instances sitting idle. Ideally, you would want to give them to workloads that are not time sensitive, like training or fine-tuning experiments.
This required active resource management, so we ended up building that system. And when we talked to many of our customers, especially CEOs, they said, "I am pouring so much money into this, but my utilization is not that high. Why is that?" We said, "This is what we do internally." And they said, "Why can't I have that in SageMaker?"
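The shape of the idea, giving inference what it needs at peak and lending spare capacity to non-time-sensitive work, can be sketched in a few lines. This is an illustrative toy, not SageMaker HyperPod task governance itself; the function name, the `instances_per_job` parameter, and the allocation policy are all assumptions.

```python
def allocate(total_instances: int, inference_demand: int,
             pending_batch_jobs: int, instances_per_job: int = 10) -> dict:
    """Serve current inference demand first; lend everything left
    over to non-time-sensitive work like training and fine-tuning."""
    inference = min(inference_demand, total_instances)
    spare = total_instances - inference
    batch = min(spare, pending_batch_jobs * instances_per_job)
    return {"inference": inference, "batch": batch, "idle": spare - batch}


# Daytime peak: shopping traffic takes almost everything.
print(allocate(1000, 950, 20))  # {'inference': 950, 'batch': 50, 'idle': 0}
# Nighttime: demand drops, training soaks up the spare capacity.
print(allocate(1000, 200, 20))  # {'inference': 200, 'batch': 200, 'idle': 600}
```

Re-running this policy as demand shifts over the day is what keeps utilization high instead of leaving peak-sized capacity idle overnight.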
That was a big lesson. To do GenAI at scale, you've got to be exceptionally good at managing costs. That's why you saw so much attention on HyperPod task governance, but also on scaling inference with model distillation, and things like intelligent prompt routing. These are techniques we use internally, because our GenAI projects are reaching a scale where every percentage improvement in utilization and cost really matters.
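Intelligent prompt routing, as mentioned here, sends easy prompts to a cheaper model and hard ones to a stronger model. A minimal sketch follows; the complexity heuristic, the threshold, and the model tier names are invented for illustration and are not the Bedrock routing logic.

```python
def estimate_complexity(prompt: str) -> float:
    """Crude proxy for prompt difficulty: longer prompts and
    reasoning keywords push the score up."""
    keywords = ("explain", "prove", "compare", "analyze", "step by step")
    score = min(len(prompt) / 500, 1.0)
    score += 0.5 * sum(k in prompt.lower() for k in keywords)
    return score


def route(prompt: str, threshold: float = 0.5) -> str:
    """Pick a model tier: cheap small model for easy prompts,
    expensive large model only when the score crosses the threshold."""
    return "large-model" if estimate_complexity(prompt) >= threshold else "small-model"
```

The cost win comes from the fact that most production traffic is simple, so only a minority of requests ever pay the large-model price.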
Jonathan Allen:
I was meeting with a customer yesterday, and they had identified, and I've seen this a few times now, 100 candidate GenAI use cases they want to pursue. They zero in on a few, and then they have trouble getting going with those one or two. There are blockers, as we know.