Las Vegas 2020

Optimizing at Scale: Using ML to Optimize All Applications Across the Service Delivery Platform

While DevOps has increased software release velocity, traditional performance optimization has not been able to keep up. The result is the need to overprovision systems with CPU, memory, and other resources, all of which drives up costs unnecessarily. This problem is exacerbated as enterprises shift their software production onto a service delivery platform, like Kubernetes.


The answer is performance tuning automation. Opsani, the leader in ML-driven workload configuration tuning, allows companies to tune a single service or all services across the service delivery platform autonomously. It works every time there is a code release, load profile change, or infrastructure upgrade. By discovering the service level objective, then measuring, learning, and tuning to give the right resources to address the system's needs predictively, Opsani continuously delivers value through higher performance, improved availability, and lower costs.


Come and learn how some of the largest enterprises have autonomously optimized thousands of their workloads across their service delivery platforms with Opsani, saving countless human hours and budget dollars while delivering a better customer experience.


This session is presented by Opsani.

Peter Nikolov

Founder and CTO, Opsani

Transcript

00:00:12

Hello, and welcome to our session. My name is Peter Nikolov, and I'm the chief technology officer of Opsani. In this session, you will learn what continuous optimization as a service is, and how to keep your applications at high performance with low cloud spend, and at the same time have more time to implement value-adding features. Let me share with you an interesting fact: there are 7.5 quintillion grains of sand on earth. Keep this in mind as we look at the complexity of optimizing cloud applications. The Google Online Boutique is a reference application that Google built in order to demonstrate the key concepts of microservice architectures and to help train engineers on cloud-native applications. It is an open source application. It consists of 11 different microservices, each taking on different functions of the application, and it is mostly used to demonstrate best practices and to understand how to operate and run applications in the cloud.

00:01:25

Now, how do you deploy this application in production and tune it? To see the complexity of that tuning process, let's look at the number of parameters that need to be tuned. There are 11 services. Each of these services has at least two tunable parameters: the CPU resource that will be allocated to it, and the memory that will be given to it. And each of the eleven services needs a specific value of the CPU and memory resource assigned to it. So there are 22 such configuration values. If we go with just eight possible values for each of them (and this is conservative; there can be more), then we will see that there are eight to the power of 22, or about 75 quintillion, different configurations this application can take.
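To make the arithmetic concrete, here is a quick sketch using the numbers from the talk (11 services, 2 parameters each, 8 candidate values per parameter):

```python
# Size of the configuration space for the Online Boutique example.
services = 11
params_per_service = 2    # CPU and memory for each service
values_per_param = 8      # a conservative number of candidate settings

total_parameters = services * params_per_service        # 22
configurations = values_per_param ** total_parameters   # 8 ** 22

print(f"{configurations:,}")   # 73,786,976,294,838,206,464 (~74 quintillion)
```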

00:02:26

And only one of them is optimal. Note that this is ten times more configurations than there are grains of sand on earth. So how would you come up with the right configuration? How do we find the right configuration? Our answer is machine learning. This is the way to get to it. To demonstrate this, we connected our continuous optimization service to the Online Boutique. We took the Online Boutique as it is published, as an open source application running on a Kubernetes cluster, and we looked at its performance and at how many resources, what footprint, it takes. It was pretty straightforward to see that it was not running optimally. So we connected it to our continuous optimization as a service system, and I'm going to give you the results that we got.

00:03:23

We saw an 80% reduction in cloud costs. So you actually needed to pay only about 20% of the cost for it, and at the same time we were able to increase its performance. The result of that optimization is that, after it's optimized, you can get eight times more transactions per dollar in running this application. And we did this with about 20 minutes of setup by one of our engineers, and it took approximately two days for our system to find that unique configuration among the 75 quintillion. We did this with machine learning. Why do we need machine learning for this? Well, let's look at how you would do it differently. Let's say brute force: if it takes you one second to try one of those configurations, it will still take over two trillion years to go through the full configuration space. And even if in your business you can afford to wait that long,

00:04:41

you actually can't, because the sun will go supernova in only 5 billion years. I hope this gives you an understanding of the depth and the complexity of the problem. And even if you go with more advanced approaches, say a binary search, the scale of the problem is so big that it cannot be solved through traditional means. The takeaway here is that tuning cloud workloads manually is just not feasible. Why? How is a cloud-native application different from the traditional applications for which manual tuning was done? Well, the main thing is DevOps. The concept of DevOps brings tremendous velocity, and this is a benefit that we specifically look for; that's why we do DevOps.
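The brute-force estimate can be checked with the same numbers; a rough sketch, assuming one configuration tried per second:

```python
# How long a brute-force search of the full space would take
# at one configuration tried per second.
configurations = 8 ** 22                 # ~7.4e19 candidates
seconds_per_year = 365 * 24 * 60 * 60    # 31,536,000

years = configurations / seconds_per_year
print(f"{years:.1e} years")              # 2.3e+12, i.e. over two trillion years
```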

00:05:42

That brings rapid change to your application on a weekly, daily, or even hourly basis. So your code changes constantly. Your application runs on multiple platforms; you may have two or three different cloud providers. And you have a vast configuration space: the example that we just gave, with only 22 parameters, just two parameters per service, was actually very basic. The moment you add Java or any other sort of middleware, the number of parameters per service explodes; I'd guess to 20 to 200 parameters per service. So there is a tremendous amount of complexity in that configuration space, while traditional manual performance optimization is reactive: it applies to a past configuration, and has you spend several weeks trying to tune the configuration and the code that was there two weeks or eight weeks ago. Most teams don't even have a way to measure how efficiently their applications are running. And then the typical solution is that applications are over-provisioned. That is how most applications run today: they are just tremendously over-provisioned in order to compensate for not being able to do tuning at the scale that is needed. And the result is predictable: massive waste of resources, performance that definitely needs improvement, and very frequently lower availability of the application as a result of poor performance and under-provisioned resources.
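To see how middleware blows up the search space, here is the same arithmetic again with the speaker's 20-to-200-parameters-per-service estimate (the 8 candidate values per parameter carried over from the earlier example):

```python
# Configuration space growth as middleware adds tunable parameters.
services = 11
values_per_param = 8

for params_per_service in (2, 20, 200):
    configurations = values_per_param ** (services * params_per_service)
    digits = len(str(configurations))
    print(f"{params_per_service:>3} params/service -> ~10^{digits - 1} configurations")
```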

00:07:38

Let's zoom in a little further and get a little more concrete. When we look at how an application is developed, starting with the definition: the product owner defines the application, developers build the code and commit it to version control, and then you use a good CI/CD pipeline to go through build, test, and integration, through the release workflow, through staging, and put the application into production. The result, very frequently, is that the performance in production is not adequate. There is performance lag; it is not working as well as it should. So what are the best practices? Best practice is to add real user performance monitoring, seeing what your customers actually experience as performance. Your team can build synthetic load generators to create various load profiles to test your application with, and then you use sophisticated application performance monitoring systems to understand how your code works, where the bottlenecks are, and try to improve that performance.

00:08:54

That is a very long process. So what happens in the meantime is you over-provision: you add a lot more resources so that the application works today and delivers quality behavior to the customers. Then, as that process of application performance management continues, you have a tiger team that works to solve the particular performance problems that show up, and you feed that back. There is a lot of learning about your applications, a lot of guesswork, and a lot of trial and error involved in feeding that back into the configuration and testing. And in some cases you can take that information all the way to the beginning of the process and inform the developers as to what they could have done differently three months ago. So this process is beginning to reduce velocity, and it's just not working very efficiently. Why? Because it is reactive. It's looking backwards. This is like driving a car on the freeway by looking in the rear-view mirror.

00:10:09

Now let's add continuous optimization as a service and see how it changes this process. The setup is the same: you have exactly the same initial product development process, integration process, and CI/CD pipeline. The main difference is that now you add continuous optimization as a service into your production system. The result of this is that your application, as it runs in production, is being constantly tuned for the current load profile, using the current application behavior. So today's release, or yesterday's release, is the one that is being tuned, on the load that it is experiencing at the moment. So the changes in configuration, that tuning and cost adjustment, happen practically in real time; they happen on the application that's running today. In addition, you can take that process and shift it left, bringing it also to the development team, so that the developers can experience the performance behavior and take it into account in development. That is optional, but it is an improvement that you can make to the development process. And the results are predictable: with autonomous workload tuning, you get higher performance for your application, you get significantly lower cloud costs, and many of our customers report significantly better availability.
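The measure-learn-tune cycle described here can be sketched as a toy loop. This is purely illustrative and not Opsani's actual algorithm or API: `measure` is a made-up model standing in for real APM and billing data, and simple hill climbing stands in for the ML optimizer; only the shape of the loop (try a candidate in a sandbox, promote it if it scores better) reflects the talk.

```python
import random

def measure(config):
    """Placeholder for real metrics: returns (throughput, hourly cost)
    for a given CPU/memory allocation. A toy model, not real data."""
    throughput = min(config["cpu"], config["memory"] / 4) * 100
    cost = config["cpu"] * 0.04 + config["memory"] * 0.005
    return throughput, cost

def score(config):
    """Transactions per dollar, the metric highlighted in the talk."""
    throughput, cost = measure(config)
    return throughput / cost

def tune(config, steps=200):
    """Toy optimizer: perturb the config in a 'sandbox', and promote
    the candidate only if it beats the current configuration's score."""
    best, best_score = config, score(config)
    for _ in range(steps):
        candidate = {k: max(1, v + random.choice([-1, 0, 1]))
                     for k, v in best.items()}
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best

baseline = {"cpu": 8, "memory": 64}   # deliberately over-provisioned
tuned = tune(baseline)
print(tuned, round(score(tuned) / score(baseline), 2))
```

Because promotions only ever happen on improvement, the tuned configuration never scores worse than the over-provisioned baseline under this toy model.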

00:11:50

So what is continuous optimization as a service? Let's try to define it. What it does is maximize the performance and efficiency of cloud workloads, and it does that by using machine learning to optimize these workloads continuously. Let's dig in a little deeper. What does continuous mean? When does it work? How does it do this optimization? Continuous optimization can be applied to individual applications, whether you have one or several large applications, or your company has a service delivery platform with thousands of applications; continuous optimization can be applied in both of these cases. What does continuous mean? Continuous tuning triggers the process on every code release: every time there is a commit, a new release coming out and being deployed, tuning is performed. Load profile changes: if the mix of requests coming to your application changes, this can serve as a trigger to reoptimize, so that your application continues to perform effectively as its load changes.

00:13:17

And then any time your cloud provider changes how it runs your application, or even when you just change an instance type, or make any sort of infrastructure change. The result of doing that continuously is also continuously delivering value: as we discussed before, that value being higher and more consistent performance, improved availability, and significantly lower cloud costs. How is that different? Let's look at how continuous optimization is different from cloud optimization or cloud cost management generally, as it has been practiced in the last few years. Cloud optimization started with cloud cost governance tools: where are you spending money? How much are you spending? Can you break it down by department, by application, by region? That analysis allows you to understand better how you're spending money on cloud applications, but it doesn't necessarily help you change or improve that ratio, or the performance at all.

00:14:34

The next phase of this is tools; several tools and companies exist that can make suggestions. They say, well, you're obviously under-utilizing this particular virtual machine; maybe you can squeeze that onto a lower instance type. But the thing is, it just gives you suggestions, as in maybe you should try that; it doesn't do it for you. You still have to decide when you want to do it, pick a time, allocate a resource to do it, and also evaluate whether and how that proposed change affects performance, not only at the time when you are testing it, but across your typical cycle of load, which may vary day to day, week to week. What continuous optimization does differently is, first, it's continuous. It just happens all the time. There are no phases; it just runs, every day, every hour.

00:15:40

The second important thing is that it's autonomous. When it needs to make a change, and there is a better way to run this application, continuous optimization applies it autonomously. There's no need for anybody to look at this and say, well, I'm not sure whether this is going to make it better or worse; let me try. Instead, our continuous optimization system tries that in a kind of contained sandbox environment and then promotes it to the application as a whole. So the impact is always there: your application is always working at its best, without requiring any manual handholding from your engineers to keep an eye on it. Again, with continuous optimization as a service, you can innovate more efficiently. You can outperform your competition by having your application well tuned and delivering better end-user performance. You outsmart your competition because you can have your engineering team work on adding new features instead of doing trial-and-error configuration changes, and you can outclass them by using your resources more efficiently. Thank you for attending our session. If you want to learn how to use continuous optimization as a service for your cloud applications or your service delivery platform, please visit us at Opsani. Thank you.