Las Vegas 2018

You Wouldn’t Drive Kids to School in a Golf Cart. Why Run Business Applications with a Build Tool?

Back in the day – five to ten years ago – business applications were orchestrated with purpose-built enterprise solutions owned and managed exclusively by IT Operations.


Then along comes the concept of DevOps, and the SDLC transitions to the fully automated delivery pipeline you know today, where everything is expected to be embedded in the code. Since developers know and love scripting, whether cron, Jenkins, or whatever automation tool they have access to, those became the tools they used to build their operational instrumentation, and they called it a day!


How would DevOps folks react if you suggested they should manage their delivery pipeline with cron?


Yeah, that’s how everyone should react to managing payroll or inventory or payments or any sophisticated business service with Jenkins or Puppet or Chef or ANY tool not purpose-built for such functionality.


Come learn how PayPal, Amadeus, and similar enterprises are orchestrating their critical business applications with a DevOps-enabled, Jobs-as-Code approach.


Joe Goldberg is an IT professional with several decades of experience in the design, development, implementation, sales and marketing of enterprise solutions to Global 2000 organizations. Joe has been active in helping BMC products leverage new technology to deliver market-leading solutions with a focus on Workload Automation, Big Data, Cloud and DevOps.


Joe Goldberg

Innovation Evangelist, BMC

Transcript

00:00:05

If you look at this title, you might think that I have a slightly different intention than I do. So I want to be clear: I am not here to bash build tools. I think build tools are great. What I want to talk about, however, is domain specificity <laugh>. There are a lot of different categories of automation, and if we look at our SDLC or our pipelines, however you want to look at the set of tools you have, you've got build tools, config management tools, security tools, testing tools, networking tools, database tools, tons and tons of different tools. All of them provide some measure, probably a great deal, of automation; if we're talking about an automated pipeline, obviously a great deal of automation. But they are all a little bit different, and in some cases a lot different. So there is certainly some measure of overlap when you're dealing with automation, but I would argue that each one of them exists because it has a set of functions that are specific to its particular domain.

00:01:14

And if you ever have questions about how specific they are: if you're a build person, imagine having to do builds with, let's say, Ansible; or if you're a config management person, imagine managing your infrastructure with Jenkins, and so forth. So each one of these tools has its place. And I would argue that in the operational world there's a great deal of tool availability as well. They all have their place because they have specific operational capabilities, such as the ability to visualize and perform operational activities and interact with things in your environment. And we'll talk about those. So, in case you're wondering, in the operational world you can see I just selected a few. I put Control-M first, that's our particular solution, but there are a bunch. And if you have any doubts about how many: every single cloud vendor, lots of commercial tools, and lots of organizations have built and open-sourced their own tools.

00:02:20

So there are just a ton of operational workflow management tools. Now, I think the reason this is perhaps a subject of conversation today when it wasn't in the past (whether that's the recent past, or you're still taking a traditional approach to how you manage your entire SDLC) is that there have been these silos that kind of imposed the domain specificity I mentioned. Not that it was a good thing that these silos existed, but if you were a build person, you kind of threw stuff over the wall to your system administrators; the config management folks used their tools, and they didn't even have access to your tools if they wanted it. Whether you were in config management or security management or database management, et cetera, you had your own tools, and you probably didn't have access to anybody else's tools.

00:03:16

That was certainly the case in the operational world. The kinds of tools we're talking about, the ones that manage workflows and orchestrate applications, still today in many organizations are the domain and the ownership of operations and the operations folks, and there's a request mechanism of some kind. And this is kind of the nature of the situation as it existed for quite a long time, as these walls began to sort of disappear. I think we today suffer from what I would describe as the hammer syndrome: it's really who's wielding the hammer that determines what tool they want to use. So if you're a build person, everything looks like a build problem. If you're a config management person, everything looks like a config management problem, and so forth and so on. And so this is the argument for domain specificity. There are good reasons why we have these different tools.

00:04:11

There's a great set of capabilities that they each have for their particular domain, and the argument is simply: choose the right tool for the job. So if we're talking about orchestrating applications in a production environment, what are some of the capabilities that I think are critical to have in this kind of environment? You want to have a view that is abstracted and kind of elevated above technology for the purposes of running applications. You're running business services; you have to deliver or support business services. And when your customers, or your customers' customers, or whoever it is that's consuming those services interact with them, they have either little or possibly no interest in the underlying technology. They don't care how complex your stack is, they don't care whether you are on-prem or in the cloud, they don't care whether you're using containerization. They have a service that they want to consume, and you, from an operational perspective, have to be able to support it. If there's a problem anywhere in that arguably, and ever increasingly, complex environment, you need to be able to find out where it is. You need to understand the impact of a problem in one place on things downstream or upstream. So this notion of an end-to-end view across a complex technology stack becomes really, really important.

00:05:42

Another characteristic is that once you get into production, or into the data center, or however you're exposing your services to the world, there are certain things that are taken for granted and expected. Things like auditing and security and scale just have to be there. And so a tool that may be great for relatively low-volume activity still has certain expectations to meet when you get to high volume in a production scenario. One that I think is frequently overlooked, especially for a technical audience and from a technical perspective, is the diversity of users that exists in the operational production environment. So going back to that hammer picture: if you are a developer, an engineer, or even an operations analyst, your tool, your technology, is something that you work with and live with all the time, and it's really highly technical and complex, but you're perfectly comfortable with it.

00:06:57

That may be great at some stage in the SDLC, but when that facility is exposed to the world, when you get into an enterprise or a web-scale kind of deployment, you've got people who may be highly technical and really savvy and can consume whatever you put out there. And you can have people all the way at the opposite end of the scale who are not very technically savvy, or even if they are, they're not interested in becoming expert in the particular technology. Their goal, their need, is to consume a particular service for the purposes of the business or the transaction they're performing. And it is this diversity that frequently is not apparent until you get to at least some kind of user testing, fairly far down the SDLC. So that's a really important consideration. I mentioned cloud and containers because, you know, those are the conversations, those are the environments and technology stacks that we talk about today.

00:08:02

And so obviously whatever tooling you use has to be able to support the complex environments and the technology stacks we have today. But it really has to be open to a degree that it can evolve and support future stuff. Because today we're talking about cloud and containers; a couple of years ago we were talking about virtualization; a couple of years before that it was internet and web technology; and before that it was something else. And I am sure that a couple of years from now we'll be talking about other things, and containers and cloud will be old hat. Everybody will be doing it, and there will be a need to support other things in addition. Another characteristic of the enterprise is that rarely do things disappear entirely. And so this need to orchestrate, and to have dependencies and visualization across complexity, means that complexity is not just the stuff you have today, but probably stuff that you've had in the past that you may be bringing forward, and certainly stuff that you're going to have in the future.

00:09:08

The final thing, depending on your perspective, is either "well, duh, yeah" or a major challenge. If you're looking at traditional operational tools, the ones that have lived and still exist today in the operational world, many of them were built before, and today are still challenged by, integration into an SDLC. Given that we're talking about DevOps and CI/CD, whatever tooling you have must be something that can be embedded in an automated pipeline. And so the term that I'm using here is "jobs as code." But if you expand that thinking, what we're talking about here is operational instrumentation, application workflows, however you want to define it, that are a logical part of the application. And what that means is that just like you have Java or Python or whatever language of choice you're coding your business logic in, and whatever tool or language of choice you're coding your infrastructure in, you need to be able to code your operational instrumentation.

00:10:31

Similarly, it goes in code committed to version control, submitted to all of the facets of your automated delivery pipeline. To realize the real benefits of DevOps or CI/CD, where eventually, however you choose to deploy into production, deployment becomes a non-event, the entire application, including this kind of instrumentation, has to have been riding along: built together, tested together, embedded into whatever test environments you've constructed as you move down the line. And so that becomes a really critical component. And I think, coming at this world from an operations perspective, the tools that in the past have lived in operations are challenged by this. I would argue that this is absolutely mandatory: you have to have these kinds of characteristics in whatever tooling you're going to select.
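To make the "jobs as code" idea concrete, here is a minimal sketch of an operational workflow defined as data and serialized to JSON so it can be committed to version control alongside the business logic. The field names (`Type`, `Command`, `RunAs`) are loosely modeled on the Jobs-as-Code style described in the talk, but they are illustrative assumptions, not the exact Control-M schema.

```python
import json

# Illustrative workflow-as-code definition; field names are assumptions,
# not the exact Control-M JSON schema.
workflow = {
    "DemoFolder": {
        "Type": "Folder",
        "ExtractJob": {"Type": "Job:Command", "Command": "extract.sh", "RunAs": "etl"},
        "LoadJob": {"Type": "Job:Command", "Command": "load.sh", "RunAs": "etl"},
        "Flow": {"Type": "Flow", "Sequence": ["ExtractJob", "LoadJob"]},
    }
}

# Serialize so the definition can live in Git next to the application code
# and ride through the same build/test/deploy pipeline.
definition = json.dumps(workflow, indent=2)
print(definition)
```

Because the workflow is just a text artifact, every pipeline stage that handles source code (diffing, review, build, test) can handle it the same way.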

00:11:32

So, a couple of stories from customers that have been using our tooling; I think they reflect this kind of journey. Both of these are companies that are fairly large. Hopefully at least one is a household name; the other is perhaps a little less known, but certainly a huge enterprise. In the case of PayPal, they are really a financial services organization, so a lot of their applications still include things like moving money around and doing reconciliation and payment processing. Fundamentally, that's what they are as a company. They've been using this kind of tooling to run those kinds of applications for a very long time. And several years back, they embarked on a journey to move from the traditional ways they were developing applications, which, as you can see, were time-consuming and slow, to a position where they now have a developer portal that lets them automate every aspect of how they build applications, including how they build their workflows.

00:12:41

And those workflows now are constructed in code. They ride along with the entire application and are completely tested as they go from inception and construction, or development if you will, all the way through production. Similarly, Amadeus is a company that provides IT services to the travel industry. And they too have embarked on, and are well along on, a DevOps journey that has seen them move from a traditional data center to a private cloud with a mix of public cloud, with the need to also manage and move all of their workloads, and the workflows, dynamically among those environments. The only way they could meet that test was to move to a DevOps model where the workflows were constructed in code and rigorously tested throughout, as they get deployed to production. Because if they didn't do that, most of us probably wouldn't be here <laugh>; we'd be somewhere else, stuck in an airport.

00:13:38

Amadeus touches about 95% of all commercial airline traffic in the world. If they have any kind of outage, it's reflected in airline traffic literally around the world. So it's a highly demanding environment. What I'd like to do for the rest of the time is give you a little bit of a demo and a flavor of what I'm talking about. So let's say (sorry <laugh>, my mic is right here), let's say I'm a developer, or an architect, or part of a team that's embarking on creating a new application. We kind of sketch out, obviously, a very simplistic flow, and hopefully this will give you a little bit of insight into what kind of instrumentation we're talking about here and where it fits in. So I'm talking about, sorry, predictive maintenance for trucks.

00:14:29

In this case, this is actually based on a real customer use case. So I've got IoT data streaming from my vehicles, all in real time. It lands in a public cloud, serviced and delivered by a telematics company. The goal that I have here is to identify potential vehicle failures and get them repaired while the vehicle is still mobile, so I don't have to find it and tow it and all this other stuff, obviously with a goal of significantly reducing the amount of time it takes to maintain my vehicles. So these are some of the things that I have to think about. I'm getting the data in. I have a regression model that's going to determine whether a particular set of information is indicating a possible failure. However, I've got a model that I need to train, so I've got to do that every so often.

00:15:25

And as my predictive capabilities change, I need to train that model once a week, once a month, or when I see that the prediction quality is dropping. That's something I need to do on kind of a deferred basis, kind of standard housekeeping and maintenance. In addition, once I get an indication of potential failure, I have got to enrich it and marry it with a whole bunch of traditional system-of-record data. I need to find out who the customer is, what the vehicle warranty information is, what kind of parts I need. I need to find out where my service center is. I can determine where the truck is. I could book a service appointment, but maybe the distribution or repair center doesn't have the part, and I need to order it and get it to that particular location.

00:16:13

Maybe that service center doesn't have space for a few hours or a few days. So these are all the things that I have to take into account. I may have to order the part using my inventory system, and I've got to wait until it becomes available, all while keeping in mind how long I am predicting the vehicle is still going to be operational before it fails. So you can see that there are a lot of different components to such an application, many of which happen in real time, based on data that is streaming from a vehicle and possibly what's going on with the vehicle's location, but also other things that are deferred, or long running, or interacting with other applications and systems. So let's get out of this for a second and just take a quick look at the components that I'm dealing with.
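The scheduling constraints described here (part availability, service-center capacity, predicted time to failure) can be sketched as a small decision function. Everything below, names included, is a hypothetical illustration of the logic, not the customer's actual system:

```python
def plan_repair(part, hours_left, stock, centers):
    """Pick a service center that can repair the vehicle before the
    predicted failure; flag that the part must be ordered if the
    center does not have it on hand."""
    for center in centers:
        wait = center["next_free_in_hours"]  # center capacity constraint
        has_part = part in center["parts"]
        # If the part must be ordered, the inventory lead time applies.
        lead = 0 if has_part else stock.get(part, {}).get("lead_time_hours", 9999)
        if max(wait, lead) < hours_left:
            return {"center": center["name"], "order_part": not has_part}
    return {"center": None, "order_part": False}  # no feasible repair window

centers = [{"name": "Dallas", "next_free_in_hours": 6, "parts": {"fuel-pump"}}]
stock = {"fuel-pump": {"lead_time_hours": 48}}
print(plan_repair("fuel-pump", 24, stock, centers))
# → {'center': 'Dallas', 'order_part': False}
```

In the real application this decision would be one node in the workflow, fed by the enrichment jobs and followed by the booking and parts-ordering jobs.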

00:17:03

So I'm using Eclipse here as a model, but it really doesn't matter which IDE or development environment I'm using. You can see in the number of tabs that I'm dealing with a whole bunch of different components to my application. I've got some Scala code, because I've got a linear regression model. I've got some makefiles or build information, and I have this JSON here that is my workflow. And this workflow is that diagram, that flow chart, that you just saw: when I get some data, I may move some data, I need to extract from my traditional systems, I need to marry and enrich that data, I need to do some file transfers, I may need some additional resource, and so forth. So it's all here. Now, as a developer, I'm probably not super familiar with this kind of stuff.

00:17:58

So, like any other language, I need something that will allow me to validate this. There's a service that we provide that will let me validate this stuff. We'll see momentarily how that actually runs and what other kinds of services we have available, but I can validate my syntax, and this is telling me that it's correct. Just to check this, let's make an error, to highlight that we have the potential to make errors and see how they fall out. Okay, so I just introduced an error, and I get told that I have an error; I can fix it and make sure that I get it correct. So I iterate through this process of building this. Now, even in this process of building it, one of the challenges that exists in large organizations is that I need to understand what my standards are.

00:18:57

And so there's a whole set of standards and rules definitions under the covers that tell me what my standards in the operational environment are and help me along once I validate the syntax. If I'm a developer, I want to make sure that this thing actually works. I mean, I wouldn't just code Java or Python and then, as soon as I had no errors, assume it was correct; I need to execute it to make sure that it's correct. Similarly here. So you may have seen that I have an option to actually test. What will happen with that is that this is going to get submitted to an environment to actually run this stuff. So the flow chart that you saw, which we have now built via this JSON, I can run to make sure that it executes successfully. And this is kind of my unit testing.
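The validate-then-fix loop described here can be approximated locally. The checker below is a toy stand-in for the kind of build/validation service the talk describes; it enforces a couple of structural rules on a workflow definition rather than the real Control-M syntax:

```python
def validate_workflow(defn):
    """Toy validator: every folder must be an object, and every job inside
    it must declare a 'Type'. A real syntax service checks far more."""
    errors = []
    for folder, body in defn.items():
        if not isinstance(body, dict):
            errors.append(f"{folder}: folder body must be an object")
            continue
        for name, job in body.items():
            if isinstance(job, dict) and "Type" not in job:
                errors.append(f"{folder}/{name}: missing 'Type'")
    return errors

good = {"Flow": {"Type": "Folder", "Score": {"Type": "Job:Command"}}}
bad = {"Flow": {"Type": "Folder", "Score": {"Command": "score.sh"}}}
print(validate_workflow(good))  # → []
print(validate_workflow(bad))   # → ["Flow/Score: missing 'Type'"]
```

The point is the shape of the loop: define, validate, read the error, fix, revalidate, exactly as you would with a compiler.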

00:19:44

Now, as far as running, I hope that I still have a connection. Yeah, it's kind of slow <laugh>, but I have a personal test environment. I can run that personal test environment as a virtual appliance in VMware or VirtualBox; in my case I'm using an environment on AWS. And you can see that I got a response with a run ID here. I don't know why this is not going away. Go away. We'll ignore it for a moment. I can even look at this environment and see what that set of jobs looks like. So there's my flow, and this is actually going to run, so I can get a test. Now, in this case, this thing is waiting for user confirmation; I didn't want it to take off and run before I got here.

00:20:39

So you can see it's waiting for me to confirm and let it run. I'm not going to bother doing that because I don't have the time. But the point is that I, as a developer, can not only validate this syntactically but also logically: execute it and make sure that it's correct. And then at that point I can pass it on. The way I would pass it on is to commit to my version control. I'm using Git in this case, but use whichever one you happen to use. So my developer work is done; I can pass it on. Once I commit, the things that you would normally expect are going to happen: whether it's just updating the flow itself or whether I'm updating other parts of my application, I am then going to trigger a build. And everything here is done a little bit slowly so that we can take a look at it.

00:21:29

But, you know, here is a very simplistic Jenkins pipeline; take a look at what we're doing here. First I'm going to build my application, using the Scala build tool because I've got Scala code. I am then going to use a service to do the validation of the syntax, as I showed you interactively. I would then deploy it to my first stage of testing, and then I might use a testing framework like Robot to run tests against it. And if that was successful, I may then deploy it, or push it, to my next environment, and so forth down the line. So all the kinds of things that you would expect to do with any other component of your application, I can do here. Now, in the interest of time, I'm not running any of that stuff, but I'll show you some of the bits and pieces. In our particular tool, we have exposed all this functionality both via RESTful web services and via a Node.js CLI, which is a kind of thin wrapper implementing those same REST APIs.
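The Jenkins pipeline described here boils down to an ordered list of stages. The sketch below models those stages as a dry run in Python; the `ctm` verbs are assumptions based on the talk (check the Automation API CLI documentation for the exact commands), and `sbt` and `robot` stand in for the Scala build and Robot Framework test steps:

```python
# Pipeline stages mirroring the demo: build, validate the job syntax,
# deploy to a test environment, then exercise it with tests.
STAGES = [
    ("compile", ["sbt", "compile"]),                      # build the Scala code
    ("validate", ["ctm", "build", "workflow.json"]),      # syntax-check the workflow (assumed verb)
    ("deploy-test", ["ctm", "deploy", "workflow.json"]),  # push jobs to the test env (assumed verb)
    ("test", ["robot", "tests/"]),                        # Robot Framework scenarios
]

def dry_run(stages):
    """Render each stage as text instead of executing it; swap in
    subprocess.run per stage to execute the pipeline for real."""
    return [f"{name}: {' '.join(cmd)}" for name, cmd in stages]

for line in dry_run(STAGES):
    print(line)
```

The design point is that the workflow definition passes through exactly the same gates as the business logic, one stage per phase of the SDLC.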

00:22:39

And what I'd like to do is speak a little bit about the different services and how they map to different phases in your pipeline. So again, our thinking here, and I think this is the way you must think about this, is to deconstruct your pipeline and then make sure that you have services for the tooling that you're using that will support each one of those phases. So we saw, well, you may not have realized it, but we saw the build service. The build service validates syntax: I create my objects, and I can validate them. The next service that we used was the run service, which lets me execute. And in addition, just to show you some of the other things that you can do: it's not just simply running the jobs, but being able to interact with them, so I can retrieve their output and I can perform operational actions. This is of course important in an operational world, but even in the context of running a pipeline. Let me just bounce over real quick to that Jenkins machine. Okay, so I'm using Robot in this example

00:24:06

for my test scenarios, and in these tests I'm performing the testing using those functions that I described. So when I talk about running stuff over here, I want to be able to perform those same kinds of tests, those same kinds of operations, in my test environment. I want to be able to run a job; I want to be able to look at the output to make sure it's okay; I may want to, sort of haphazardly or randomly, kill some of my jobs and see what the effect is going to be downstream. So whatever complex scenario you want to construct, you have the ability in your testing framework to use these services to perform those kinds of operations. And the intent, again, is to make sure that as you move down the pipeline, just as you would with your business logic, you have the same kinds of capabilities to perform the same kinds of operations, and the same level of testing, on your operational instrumentation, because it is an equal participant in your application.
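The "kill a job and watch what happens downstream" idea is essentially a negative test on the workflow. Here is a self-contained toy simulation of that check; in practice you would drive the real run service and query job status through its API instead:

```python
def run_flow(sequence, kill=None):
    """Simulate a linear job flow: a killed job stops everything downstream,
    which is exactly what a negative test on the workflow should verify."""
    results = {}
    for job in sequence:
        if job == kill:
            results[job] = "KILLED"
            break
        results[job] = "OK"
    for job in sequence[len(results):]:  # downstream jobs never started
        results[job] = "NOT RUN"
    return results

# A happy-path run and a chaos-style run of the same flow:
print(run_flow(["extract", "enrich", "load"]))
print(run_flow(["extract", "enrich", "load"], kill="enrich"))
```

Assertions like these belong in the same test suite (Robot Framework, in the demo) that exercises the rest of the application.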

00:25:13

And this, by the way, contrasts significantly with an approach we see lots of people take, where frequently the first time this instrumentation is actually invoked is when jobs get into either production or very close to production. So, you know, imagine if you were a developer writing a bunch of Java code, and then before you pushed to production, somebody changed half of it. You would say that's absolute insanity. But when it comes to the instrumentation that runs the application, those are the kinds of things that are frequently done. And so this is the argument against that. If you do want to find out more: some of the other things that I think are important are not just having the capabilities, but also being able to access them. So, for example, there is some information available on GitHub for this virtual appliance. If you want to get familiar with this particular solution, you can download what we call the workbench. It is about a one-gig download, but once you get it downloaded, from the time that you fire it up, you can be writing and running jobs and testing this stuff in a matter of minutes. In addition to the actual appliance itself, you have a bunch of samples. So if you go to github.com/control, there are a couple of, well, you should spell it right,

00:26:54

Okay. And there are several repos with some samples, everything from a hello world to much more complex and sophisticated examples. And if you go back to that first page that I showed you, there are a bunch of other resources here, such as all of the code reference and documentation. There's a Swagger UI reference to show you all of the API calls and functions that are available. And so the intent is to provide you a rich set of capabilities that perform operational actions but can be consumed by developers and engineers within the SDLC or their CI/CD pipelines, just like they do for any other component of the application. At this point, I have, I think, a couple of minutes before we end. I'll open it up for any questions if anyone has any, and if not, I'll let you take off for lunch. Oh, sorry, go ahead.

00:28:02

So the problem that we have, a lot of...

00:28:07

ETL jobs, so... Oh, okay. So we have a lot of ETL-related jobs, and departments that are geographically distributed. So in the job space, what we're concerned about is: okay, my application runs at this stage, and there will be a job coordinator, someone who's defining that process flow. But with this flow, it looks like the person who writes the application also owns the flow?

00:28:45

Absolutely, yeah. So what you're describing, I think, is a much more traditional operational environment, where there is a central group, IT ops owns it, and you submit some kind of request. If that works, that's great, but everything we have been hearing over the last several years from customers is that it slows down the entire process. We had an internal event just last week where a large health insurance company spoke about their experience moving to this kind of model. When they did their analysis of what was slowing down their ability to deliver, they found there was sometimes a gap of one to three weeks between when their application was ready to go and when operations could get to it, because operations was backed up. They're a health insurer that has, I don't know, tens of thousands of developers, but a central group of like 10 people doing this; it was just simply slowing things down. So this is really one of the aspects of DevOps and CI/CD and democratizing this kind of functionality: if your organization can sustain it, this is certainly an alternative. And again, I would argue the reason you're probably doing what you're doing is standards, compliance, and governance, and all of these can be addressed in an automated world. It shouldn't really be any different for the operational instrumentation than it is for the business logic and all the other components of that application.

00:30:22

Yeah.

00:30:23

How do you handle database changes, specifically DDL changes?

00:30:29

So the question was how we handle database changes, specifically DDL changes. From our perspective, we don't have to deal with that, right? That's part of the application that they have to take care of; what we deal with is just the execution layer. But I would say, in general, the way that has to be addressed is that your test environment has to have all of the same components that your production environment has, and you have to be able to apply those changes and then execute whatever queries or applications may be dependent on those changes, to make sure that they've been done correctly as you move down the pipeline. Okay. Thank you. Sure. Well, thank you very much. I think we're out of time, so thank you. Bye.
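On the DDL question: the principle in the answer (mirror production components in test, apply the change, then run a dependent check to prove it took effect) can be shown with an in-memory database. This is a generic illustration, not specific to any tool mentioned in the talk:

```python
import sqlite3

# A throwaway database standing in for the test environment.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vehicles (id INTEGER PRIMARY KEY)")

# The DDL change riding along with this release:
conn.execute("ALTER TABLE vehicles ADD COLUMN warranty_until TEXT")

# A check the pipeline runs after applying the DDL: a dependent query
# (here, a schema inspection) fails loudly if the change did not land.
cols = [row[1] for row in conn.execute("PRAGMA table_info(vehicles)")]
assert "warranty_until" in cols
print(cols)  # → ['id', 'warranty_until']
```

The same apply-then-verify step would run in each environment as the release moves down the pipeline.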