Las Vegas 2018

Have Your Cake and Eat It: VMware IT Adoption of PKS

VMware IT has been on the journey with containers for more than two years, with over 4,000 containers running. Having gone through various container orchestration frameworks, from hand-rolled ones to proprietary ones, the team found that orchestration was limiting its ability to broadly adopt container technology. With Kubernetes established as the de facto standard container orchestration platform, VMware IT adopted it as the go-forward container platform.


In this session, you will discover how VMware IT super-charged container adoption, with all the benefits Kubernetes brings to container workloads, by deploying Pivotal Container Service (PKS) with VMware NSX-T integration instead of vanilla Kubernetes.


You can have your cake and eat it too.


Eric Rong leads the strategy and architecture of the next-gen Cloud Native Platform and application transformation for VMware Business IT. He has worked on architecture for various business areas within VMware IT, supporting functions from Marketing and Sales to Support. Prior to VMware, Eric worked at Franklin Templeton Investments as a technical consultant in web technology. Eric has 20 years of industry experience in complex IT application delivery.


Eric Rong

Technical Architect, VMware

Transcript

00:00:05

Today my talk is to share with you VMware IT's journey with cloud-native, container-based applications: how and why we chose PKS, and how we deployed it. Hopefully our experience can be of some help to you. In terms of agenda, we'll do brief introductions, which we already did; we'll share VMware IT's cloud-native journey; we'll talk about VMware IT's PKS deployment; and then we'll talk about the takeaways. All right, so a little bit about VMware IT. My group is called the Business IT group in VMware. Our group is responsible for all of the applications VMware uses to run its business, from the internal intranet and HR systems all the way to the portals and services that customers access. We support all of them, and we have about 400-plus applications in our landscape.

00:01:10

We have about 10 million-plus lines of code that we wrote ourselves on top of that. And because VMware is already a 20-year-old, billion-dollar company, we have a very heterogeneous environment with a mix of applications: packaged applications, SaaS applications, and custom applications. Pretty much any big-name application you can think of, we have in one form or another. It is a very diverse portfolio, and we manage a lot of things in the group. One of the key challenges with this heterogeneous environment is that making changes becomes more and more difficult as you bring more and more applications together and integrate them, and the speed of change becomes slower and slower. At the same time, the business is asking us to do things as fast as we possibly can, and as cheaply as we possibly can.

00:02:08

To meet those challenges, we went on this journey to look at how we develop our applications, how we make changes to the packaged applications, and what the best way is to achieve the velocity and agility the business is looking for. Our journey truly started toward the end of 2014, when we first started looking at containers; at that time, Docker had just become popular. Our initial motivation for looking at Docker was quite simple, actually. In our traditional process, we hand off developed code to operations. We do those once-a-month release parties: everybody comes in on Friday and sits together during the release, and the release can last from 12 to 24 hours, depending on what's happening. The biggest challenge we found in that process is the handoff from dev to ops.

00:03:03

Developers do write release notes, but developers are lazy people; I come from development, I'm lazy too, so I don't like to write those things. Our ops friends are busy: they have to run systems, monitor systems, fix systems that break, and release new things, so they don't have time to read the release notes either. So what you see on the weekend is that everybody shows up Friday night and starts pointing fingers at each other: you didn't read my document, you didn't write it in the document. There must be a better way of doing that. When we looked at Docker, it seemed to make sense: instead of writing my release notes, I write everything into the Docker container. I hand that off to operations, and operations has one way of dealing with whatever container somebody hands them: just make it run.

00:03:50

It was great, so we started. We picked a simple LAMP-based blog system, put it into containers, and tried it out. Both the development and ops guys loved it, because it simplified the handover: you just give them a container to run. We actually pushed that container all the way to production. Then we showed our learnings and presented our findings to our executives. Like any other good executives, ours like to challenge their team, so the next thing our executive asked us to do was to go find the most complex, most difficult application and try to containerize that. We said, okay, let's go do that. So at the beginning of 2015, we started on that journey. The application we picked was our customer support portal, the portal that serves all VMware customers' needs after the sale.

00:04:41

As you can imagine, with our heterogeneous environment, that application pretty much has tentacles into 60% of the applications in the back that run the company, in one form or another. We started from the web layer and went layer by layer down, trying to package them into containers. The web layer was easy; the app layer was a little more difficult; but by the time we got to the integration layer, we were looking at about 36 different images, and if you run them at full scale, that's about 250 containers. Manually running them is just not an option. The ops guys looked at it and said, I'm not going to run 250 containers by hand. We also learned that those legacy applications require, for example, clustering, shared sessions, and replication, all things that Docker and containers don't natively make conducive.

00:05:33

It's very difficult, and there are a lot of manual steps you have to do to make it work. In the end, the efficiency gain on legacy applications really doesn't warrant the effort that goes into trying to shoehorn them into containers. So we decided to abandon the lift-and-shift approach for those complex applications, because it's just not worth the effort, to be honest. But we did realize the value of containers. We allowed the teams doing greenfield application development to continue using containers, which we continued to run on VMs, and we sourced a third-party tool to orchestrate small sets of containers together. That actually worked out quite well. The reason is that at that time, Docker Compose was not a thing and Kubernetes was not a thing; there really was no open, standards-based orchestration framework for containers.

00:06:30

So that's why we did it that way. Coming into 2016, we really looked at our journey, our priorities, and where we were. We said this should not be the end of the journey, because we need to make changes. We decided to move forward by adopting cloud-native applications and start developing applications truly in the microservices fashion. For the legacy applications, we decided to leave them alone for the time being; once we mastered the skill of developing microservices, we would refactor them into microservices and make them run in containers. But as we built those things out, we still hadn't solved the platform problem: how do we get those containers running in an orchestrated way? In 2016, when we looked at the market, we couldn't find a mature tool for orchestrating Docker containers, but we did find that Pivotal Cloud Foundry gave us quite a good platform to run those containers.

00:07:28

The reason is that we're primarily a Java shop: most of our applications are developed in Java using the Spring framework, and Pivotal Cloud Foundry seemed a logical place to run that. So we started a dojo with Pivotal, deployed Cloud Foundry, and got the team running on the platform. In 2017, we rolled the platform out to a much larger team. Right now, between the non-production and production environments, we have about 2,500 containers running on that platform, with all kinds of microservices that serve internal, external, and partner integration needs. But as we went through this journey, we found that while Pivotal Cloud Foundry is great, and it does solve a problem, it also introduces a different set of problems, because Pivotal Cloud Foundry is a very opinionated platform: you have to do things exactly its way.

00:08:21

If your application is slightly wonky and different and cannot fit neatly into that, then shoehorning it in is actually quite a difficult task. The second part is that even though Cloud Foundry gives us this very nice service broker and BOSH-managed service construct, the availability of services on the platform is a challenge, and packaging our own systems into services is quite difficult too. We did package one, a MariaDB cross-data-center cluster, but just packaging that service took us three months to make it work consistently and solidly. We're not going to embark on doing that for every single possible database and service we're going to use, because that's just not a viable option. And as we moved to refactor the legacy applications, certain legacy applications require access to, for example, storage.

00:09:19

Shared storage, applications that require access to the actual file system, and all that stuff, which Pivotal is not able to do. So we needed to look at alternatives; that's where we stopped. The good thing is that in 2018, Kubernetes really took off; it is now basically the de facto orchestration platform for containers. Once we looked at Kubernetes, it solved the majority of the problems we had with PCF and allowed us to move a lot of the workloads that don't fit natively into PCF into Kubernetes, into containers. But as we looked at Kubernetes, it does introduce its own overhead. Coming from the BOSH-managed Pivotal Cloud Foundry world, manually rolling out and managing Kubernetes clusters just seemed way too complicated: there are too many moving pieces, and they have to be manually orchestrated in a consistent way.

00:10:17

We did set up plain Kubernetes clusters ourselves. It took a while, there was a lot of trial and error in it, and making it run at large scale is quite difficult, to be honest. That's why we didn't go with that and started looking for a better alternative. Then Pivotal Container Service became an officially declared product, and when we looked at the features it provides, it fits into our ecosystem nicely and gives us a lot of the things we were looking for. So what were we looking for from a Kubernetes offering? Kubernetes takes care of the containers, but who takes care of Kubernetes? In the Cloud Foundry world, PCF takes care of the containers and BOSH takes care of Cloud Foundry itself, and that provides a very neat day-one, day-two provisioning and operations model that is very consistent and very easy for people to understand and practice.

00:11:20

We would like to have the same kind of operational model for Kubernetes as well. From a network topology perspective, because VMware IT is a very heterogeneous environment, we need a network plane that is not just native to the containers but can stretch across legacy applications and traditional workloads as well. Traditionally, what you do is use NAT: from the container to the host, then from the host to the physical network. The challenge with that is that you obscure the container traffic: you cannot really tell whether the traffic is coming from a container, or from which container, when you look at it from the host, so applying network policy at that level becomes very difficult, and managing it becomes very tedious. Also, from a storage perspective, we need to be able to support persistent volumes in the containers.

00:12:14

And we don't want a band-aid solution; we want a storage solution that is consistent with our storage policies and the way we provision and run storage in our ecosystem. We also want to make sure we take security into consideration: we want a policy-driven way to restrict what containers developers can push into the systems. We don't want any developer to be able to pull a random container off the internet and run it, because that is a security risk. We also want to make sure that, operationally, it fits neatly into our support infrastructure, for example how we monitor our systems and how we do logging; we want to make sure Kubernetes can fit into that neatly. And last but not least, we want to make sure our developers can get access to Kubernetes through self-service capabilities, because we don't want to be there handling tickets, answering phone calls, or watching somebody send us a Slack request for a cluster, right?

00:13:22

So that's where PKS solves a lot of the things we were looking for. This is basically the PKS architecture: it sits on top of our vSphere-based infrastructure, with NSX-T as the network layer, and on top of that sit the BOSH-managed Kubernetes clusters. It's also integrated with Harbor, which is VMware's open source image registry; I think it was just accepted into the Cloud Native Computing Foundation as an official project. Along with that, it provides a lot of out-of-the-box integration with the tools we use: for example, we use vRealize Operations (vROps) to monitor the infrastructure, and we use Wavefront and Log Insight to monitor our applications, and it fits nicely into that.

00:14:22

Let's talk about how NSX-T truly helps us. Basically, our deployment is laid out in this diagram. We have separate subnets: a management network for the whole infrastructure, an administrative network for Kubernetes itself, and each Kubernetes cluster that gets provisioned gets its own subnet as well. They are network-segregated so that we can ensure unauthorized access cannot cross between clusters, and so that the workloads cannot get access to the administration network and the management network. From an ingress/egress perspective, we use the NSX Edge, which serves as the ingress point, a load balancer, and a firewall in one; we can define firewall policy at that level as well.

00:15:23

One of the neat features in PKS with NSX-T integration is that when we push a clustered workload into Kubernetes, an NSX-T virtual load balancer is automatically created for that workload, and it keeps track of the workload, so we don't have to manage load balancing separately from a network perspective. Also, in the NSX-T landscape, every pod actually gets a real IP address that is visible on the network, so when we define firewall policies, we can truly see which container is talking to which container and which container has access to what; we can define firewall policy at that level of granularity to control the traffic.
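For illustration, here is a minimal sketch of exposing such a workload with the official Kubernetes Python client. Under PKS with NSX-T, a Service of type LoadBalancer is what triggers the automatic creation of the NSX-T virtual load balancer; the names and ports here are hypothetical.

```python
# Minimal sketch: expose a workload via a Service of type LoadBalancer.
# Under PKS + NSX-T, creating this Service provisions an NSX-T virtual
# load balancer that tracks the selected pods. Names/ports are hypothetical.
from kubernetes import client, config

config.load_kube_config()  # use the cluster credentials PKS handed out
v1 = client.CoreV1Api()

service = client.V1Service(
    metadata=client.V1ObjectMeta(name="web"),
    spec=client.V1ServiceSpec(
        type="LoadBalancer",         # NSX-T provisions a virtual LB for this
        selector={"app": "web"},     # the pods the load balancer should track
        ports=[client.V1ServicePort(port=80, target_port=8080)],
    ),
)
v1.create_namespaced_service(namespace="default", body=service)

# The external IP assigned by the platform shows up in the Service status:
svc = v1.read_namespaced_service(name="web", namespace="default")
print(svc.status.load_balancer.ingress)
```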

00:16:15

From a storage perspective, we are heavy users of vSAN. vSAN gives us a way to virtualize storage and allows us to provide different tiers and policy-driven storage provisioning. In our PKS deployment, we use local drives in combination with network-attached storage as well; depending on what kind of performance characteristics the user is looking for, we intelligently pick the right storage to serve that. On top of that, PKS integrates the open source Project Hatchway, which provides a storage plugin into the underlying physical infrastructure and allows us to do things like persistent volumes. In the PKS world, when we declare a persistent volume attached to a container, that persistent volume is actually created as a VMDK file on the storage. So from an operations perspective, even though they are persistent volumes, these files are no different from any other VMDK files the team already manages today, so backup, restore, and all those existing capabilities can be leveraged against the VMDK file. We can then specify policy on those persistent volumes, in terms of what kind of backup and restore needs to be applied or what kind of performance they need to guarantee, and all of that can be managed in a centralized way.
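As a sketch of what this looks like from the developer side, here is a persistent volume claim created with the Kubernetes Python client; under PKS with the vSphere storage plugin, the bound volume is materialized as a VMDK on the backing datastore. The storage class name is a hypothetical stand-in for a vSAN-policy-backed class.

```python
# Minimal sketch: declare a persistent volume claim. With the vSphere
# storage plugin (Project Hatchway), the bound volume is created as a VMDK
# on the datastore. "vsan-gold" is a hypothetical storage class that would
# map to a vSAN storage policy defined by the platform admins.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="orders-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        storage_class_name="vsan-gold",  # hypothetical vSphere-backed class
        resources=client.V1ResourceRequirements(requests={"storage": "10Gi"}),
    ),
)
v1.create_namespaced_persistent_volume_claim(namespace="default", body=pvc)
```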

00:17:55

And we also use Harbor heavily. The reason we use Harbor rather than the open source Docker registry is, first, that it's integrated; second, Harbor provides role-based access control, which the open source Docker registry doesn't have, and in an enterprise you really need that: you need to make sure the right people have access to update and change the right Docker images and artifacts in the registry. Harbor is also integrated with Notary and Clair, which gives us a way to explicitly define what can run in our clusters. In our build pipelines, we use Clair to scan the images to make sure there are no vulnerabilities, then we use Notary to sign the image, and our Kubernetes clusters are configured not to run any container that isn't signed through the registry. That's how we control what is allowed to run. In VMware IT, we create our own golden base images based on the public golden images; we harden them, add the VMware-mandated daemons and agent processes, and then build the rest from them, and we make sure every image that we run in our clusters is built on that base, so we can ensure the security and integrity of the containers from that perspective.
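A minimal sketch of the signing step in such a pipeline, using Docker content trust, which signs through Notary on push. The registry and Notary hostnames are hypothetical, and the Clair scan is assumed to have already run in Harbor before the image is promoted.

```python
# Minimal sketch: the sign-then-push step of a build pipeline. Enabling
# Docker content trust makes `docker push` sign the image via Notary.
# Hostnames and the image name are hypothetical.
import os
import subprocess

IMAGE = "harbor.example.internal/prod/order-service:1.4.2"  # hypothetical

env = dict(os.environ)
env["DOCKER_CONTENT_TRUST"] = "1"  # sign on push via Notary
env["DOCKER_CONTENT_TRUST_SERVER"] = "https://notary.example.internal"  # hypothetical

# check=True fails the pipeline if signing or pushing fails; clusters that
# require signed images will refuse anything that skipped this step.
subprocess.run(["docker", "push", IMAGE], env=env, check=True)
```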

00:19:23

And from an operations perspective, in VMware we are heavy users of Log Insight, which provides log ingestion, log analytics, alerting, and reporting for us. PKS integrates with Log Insight out of the box, but it can be fairly easily integrated with any logging system that supports a syslog endpoint, because internally it utilizes Fluentd: all the log shipping is done through Fluentd, and the logs get pushed to whatever logging system you choose. In our case, we chose Log Insight because we use it to monitor the rest of our infrastructure as well, so it fits in neatly and allows us to look at the logs end to end; today the majority of our workloads still run in the legacy stack, which is still not fully integrated with containers. It also automatically populates things like cluster tags, pod IDs, and namespace IDs, which makes it very easy to look at the logs and figure out where a problem comes from.
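For applications that want to ship their own logs to the same place, here is a minimal sketch using Python's standard syslog handler; the Log Insight hostname is hypothetical, and PKS itself ships cluster and container logs through Fluentd, so this is only the application-side complement.

```python
# Minimal sketch: an application shipping logs straight to a syslog
# endpoint (Log Insight exposes one). The hostname is hypothetical.
import logging
from logging.handlers import SysLogHandler

handler = SysLogHandler(address=("loginsight.example.internal", 514))
handler.setFormatter(logging.Formatter("order-service: %(levelname)s %(message)s"))

log = logging.getLogger("order-service")
log.addHandler(handler)
log.setLevel(logging.INFO)

log.info("started")  # arrives tagged so it can be correlated end to end
```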

00:20:40

And from a monitoring perspective, Kubernetes out of the box basically uses Heapster and Telegraf, and we use those to push metrics to Wavefront using the Wavefront proxy that is integrated into PKS. Wavefront provides a very good out-of-the-box Kubernetes monitoring dashboard, which pretty much gives you a 360-degree view of your Kubernetes deployment. The reason we use Wavefront is that we use it to monitor our PCF platform and our applications as well, so the metrics from our whole ecosystem go in there. The benefit of having a single monitoring tool where all these things come together is that it's easier to troubleshoot, to correlate the stats and data points, and to look at trends and system behaviors.

00:21:41

And we actually deploy our platform in two different data centers in an active-active fashion. The reason we do it this way is that we want to provide higher availability to our developers. Basically, we're saying that in each data center we only guarantee 99 to 99.5% availability from the infrastructure perspective, but with this setup we expect applications to be able to achieve four-nines to five-nines availability. Applications have the responsibility to bridge the last mile: by routing their traffic correctly themselves, and by correctly sending signals to the infrastructure that they're in trouble, so the infrastructure can help them with automatic failover and reroute traffic to the other data center.
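A minimal sketch of that "signal that you're in trouble" contract, assuming a simple HTTP health endpoint that a probe or load balancer polls; the health check itself is a stand-in, since a real application would test its critical dependencies.

```python
# Minimal sketch: a health endpoint the platform (probe or load balancer)
# can poll; returning 503 asks the infrastructure to route traffic away.
from http.server import BaseHTTPRequestHandler, HTTPServer

def app_is_healthy() -> bool:
    return True  # stand-in: check DB connections, queue depth, etc.

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/healthz":
            self.send_response(404)
            self.end_headers()
            return
        ok = app_is_healthy()
        self.send_response(200 if ok else 503)  # 503 => fail over / reroute
        self.end_headers()
        self.wfile.write(b"ok" if ok else b"failing")

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()
```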

00:22:34

Okay, so takeaways. Why is Pivotal Container Service different from other, plain-vanilla Kubernetes clusters, and why is it good for an enterprise to use? One is that PKS is committed to providing constant mainline compatibility, which means you'll get the latest features Kubernetes brings on board in no more than 30 days. The second part is that the automated provisioning, scaling, and patching through BOSH is very valuable from an operations perspective: it gives your operations team peace of mind that whenever they run a deployment, it will be consistent and repeatable, and all the operational best practices are already ingrained into BOSH itself, so you don't have to reinvent them. The NSX-T integration obviously simplifies network and traffic management greatly and allows us to create a much flatter and more secure network that gives applications access to all the components they need.

00:23:44

From a storage perspective, vSAN gives us a storage solution that works across both container workloads and traditional workloads, with much more efficiency and better utilization of our storage. On top of that, it ships with Harbor, which gives you a secure image store where you can manage your image blueprints very securely and deploy them securely. And the out-of-the-box integration with Log Insight and Wavefront is just icing on the cake to make your operations even better. So that's why the title of this talk is "Have your cake and eat it." All right, I think that's all I have to talk about, and there are five minutes left, so if you have any questions, feel free. Yeah.

00:24:38

VMware and the Dell <inaudible>

00:24:43

Mm-hmm, <affirmative>,

00:24:44

Did that influence your decision at all? Strictly <inaudible>?

00:24:51

I would say it's about 80/20. So yes, we are all part of the Dell family, but you'd be surprised how much competition there is between VMware and Pivotal. We are encouraged to explore the products from our sister companies, but we're not mandated to use them. Pivotal or Dell are, I would say, the first place we go to look for solutions, but the solution has to meet our requirements, fit into VMware's ecosystem, and ultimately benefit VMware. If it doesn't, it's not helping, because look at it this way: I'm a VMware stockholder, I'm not a Pivotal stockholder. Making Pivotal stock go up is not helping me. So from that perspective, yes, we do evaluate them on a common scale with other solutions. For example, right now we have a very big conversation going on with Pivotal, because Pivotal Cloud Foundry serves us well, but the cost of it is ridiculous; it's very, very expensive. It helps you go fast and go big, but the faster you go and the bigger you go, the price tag just goes up with it. So we are having these conversations: if the cost is not going to benefit VMware, we'll get rid of it and go back to Kubernetes.

00:26:18

Alright, any other questions? Yeah,

00:26:21

So VMware, <inaudible>

00:26:29

So to be honest, I have had some of those

00:26:35

conversations with <inaudible>, right?

00:26:37

But I don't have a complete view of what their plans are. If you ask me what my view is, I think what Pivotal, and not just Pivotal but Cloud Foundry, should do is port that experience over onto Kubernetes: basically replace Diego with Kubernetes and keep all the rest, like the Gorouters and the service brokers. And you actually see that happening: look at the Cloud Native Computing Foundation; they're starting to adopt service brokers, buildpacks, all these contracts that were introduced by Cloud Foundry. So I wouldn't be surprised if, a year and a half or two years from now, Cloud Foundry is just an experience layer that deploys on top of Kubernetes. That's how I think it will happen. It may or may not, I don't know, and it will take some time, because there are still some unique capabilities that Diego and all of that provide; I think they're waiting on Kubernetes to catch up, from their perspective. But that's my view. Any other questions? Yeah. Mm-hmm, <affirmative>.

00:27:48

And capabilities that <inaudible> this way?

00:28:00

Yeah, a lot. A lot. One of the biggest missions of VMware IT, besides keeping the company running and making money, is to be customer zero for all products. That started about three or four years ago. What happens is that generally, when a product gets to the beta stage, VMware IT gets a build of it and we start running experimental workloads on it, and generally we are in production with the actual product about two months before it becomes GA, actually running our real production workloads on it. The reason we do that is that we want to validate our products in real-world scenarios, and our assumption is that if a product works for VMware IT, it probably works for 95% of the companies and customers on the planet, because VMware is very typical of a large enterprise. So that's how we do it. In the case of PKS, we were actually very, very early adopters: we started the conversation while the product was just a concept, we got the first build, and we started running it. A lot of the things you see in the latest 1.2 release come directly from the work we've been doing, the parts we told them were missing for us.

00:29:22

Yep.

00:29:25

Yep.

00:29:26

You mentioned earlier that after <inaudible> you tried to <inaudible> applications. Can you talk a little bit more about that?

00:29:38

Oh, sure. Yeah, it's a very interesting set of challenges. Our legacy applications are your traditional multi-tier applications: you have a web tier, an application tier, an integration tier, your database, your ERP system, whatever packaged applications sit underneath. So there are a couple of challenges. One is that our legacy applications run in things like Tomcat and WebLogic, and they rely on the clustering capabilities of those systems. That clustering depends on multicast and network constructs that containers don't readily give you; once you put them into containers, you literally cut that out. Is there a solution to that? Yes, there is, but it requires me to make changes to the legacy applications. I can externalize all these sessions to an external component, but then that's an external dependency I have to introduce, another set of containers I have to manage.

00:30:36

The other thing is that in our workloads, because of the way they're integrated, the biggest challenge is dependencies: one service cannot come up until the other one has come up. Those dependencies start as a simple A-to-B and gradually become a mishmash of super-spaghetti startup sequences. Dealing with that startup sequencing in the container world is very difficult; as far as I know, even today the only way we can do it is manually, and there's no reliable way. The way we solved it was by writing a custom script in the dependent container: when the container first starts, the script pings the container it depends on to see if it's up, and unless that one is up, it will not try to start itself (see the sketch after the next paragraph). But, one, that is very custom and proprietary.

00:31:31

Second, it is not very reliable. There are a lot of these challenges you will find as you take on bigger, more complex legacy applications. If you have a traditional, simple legacy application, like the sample they always like to give, an Apache server with a memcached server and a MySQL database underneath, no problem, it's easy: you can just stick that into a pod or into a Docker Compose file and it will work just fine. But if your application involves a certain degree of complexity, it becomes very challenging from that perspective.
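Here is a minimal sketch of the kind of startup-ordering script described above: the dependent container's entrypoint runs it first and refuses to start until the dependency's TCP port answers. Host and port are hypothetical, and, as the speaker notes, this approach is custom and not very reliable; it is shown to illustrate the pain, not as a recommendation.

```python
# Minimal sketch: a dependent container's startup gate. Polls the TCP port
# of the container it depends on and refuses to start until it answers.
import socket
import sys
import time

def wait_for(host: str, port: int, timeout_s: float = 300.0) -> bool:
    """Poll a dependency's port until it accepts connections or we time out."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=2):
                return True
        except OSError:
            time.sleep(5)  # dependency not up yet; back off and retry
    return False

if __name__ == "__main__":
    # Hypothetical dependency: the app tier's AJP/HTTP port.
    if not wait_for("app-tier.example.internal", 8009):
        sys.exit(1)  # do not start; let the orchestrator restart us later
    # ...exec the real service here...
```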

00:32:06

And

00:32:10

No, it doesn't. No, it doesn't. So that's why we didn't try to lift and shift those applications; we said we'll leave them where they are. The approach we take to them, and we've already started on the effort, is basically what I call "choke and release." What we do is take a piece of the legacy application as a module; we first rewrite the interface, and we rewrite the module into microservices. Then we take that module that's still inside the legacy application, take out all the backend code, and replace it with a proxy to the microservices. And then we keep doing that, slice by slice; eventually, our hope is that we've sliced away enough that there's nothing left. Yep, into microservices with managed dependencies, with clearly controlled dependencies. You know, all these nice things said about containers, at the end of the day what you realize is that your agility comes down to how you manage your dependencies. You have to be very thoughtful about which dependencies you pick, which dependencies you want to have, and which you don't want to have. All right, I think that's it. Thank you.
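Finally, a minimal sketch of one "choke and release" slice as described above: the legacy module's entry point keeps its signature, but the backend logic is replaced by a proxy call to the extracted microservice. The service URL, endpoint, and function name are hypothetical.

```python
# Minimal sketch of one "choke and release" slice: the legacy entry point
# stays in place, its in-process implementation replaced by a proxy call.
import json
import urllib.request

# Hypothetical endpoint of the microservice that replaced the module.
ENTITLEMENT_SERVICE = "https://entitlements.example.internal/api/v1"

def get_entitlements(customer_id: str) -> dict:
    """Legacy signature preserved so existing callers are untouched;
    the original backend code has been removed in favor of this proxy."""
    url = f"{ENTITLEMENT_SERVICE}/customers/{customer_id}/entitlements"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)
```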