Industrializing your Data Science Capabilities (Europe 2021)

Data Science and AI are huge buzzwords nowadays. Data Scientists are creating insights and predictions with huge potential to transform our daily work. Unfortunately, many of these approaches get stuck after a few meters in the mud of operations. At Continental Tires we started out by creating an environment and accompanying processes from day one. Following the agile approach of continuous delivery and continuous deployment, the Data Science Factory was built as an infrastructure to support Data Scientists in industrializing AI and Machine Learning use cases. Today this environment is used by Data Scientists across the different parts of Conti Tires and is even highly recognized by players like AWS, Microsoft or Google. In this talk we will present the architecture and the approach we followed to implement this provider-independent environment. Built as infrastructure as code and aligned with processes that follow a CI/CD pipeline, it allows Data Science at Tires to develop real products in-house. DevOps and MLOps collaborate with the Data Scientists to bring the old industrial company into the new world of software and data-driven products.

europelondonbreakout2021

Dubravko Dolic

Head of Advanced Analytics & AI, Competence Center eCommerce, Fleet Solutions & Data Science, Continental Tires

TRANSCRIPT

00:00:13

Hi, my name is Dubravko Dolic. I want to show you how we at Tires industrialize our AI and data science use cases. Over the years, we had quite a way to travel to come to the place where we can do so. And before I show you how we went through this, I would like to show you some recent news, which shows where we are today. Maybe you have heard, if you're interested in tires, about news like this here: recently Conti unfortunately had to recall some of its tires from a plant in the US. Not very nice, but unfortunately, things like this happen. The challenge we had here is that from our huge production, we needed to find out exactly those tires which were concerned by this recall. That was the point in time when I was approached by some data scientists who had created a small Python script which was able to identify the correct tires, but it was on a local computer.

00:01:24

So what to do there? Luckily, over the last years we created an infrastructure called the data science factory, and this data science factory is exactly the place where we can industrialize such use cases. As you can see from the slide here, we created this data science factory to be able to very quickly set up an environment where we can scale out such programs. So we set up a new project there, and in really less than a day we were able to do all the setup, connect to the data source and bring tons of data, three years of production, into the database to analyze this problem. Over the weekend, we were able to identify the tires concerned by the problem and deliver an answer to the management. Quite a success story. So how did we get there? In the beginning, we started very small, which is nowadays a very common approach, but for a huge company like Conti Tires, that was not usual.

00:02:33

When I was approached with a question, "Hey, we would like to do some machine learning, some AI stuff in demand forecasting," the first impulse of management was: let's call IBM, SAP or whatsoever, they can solve the problem. As data scientists, we stressed the point that we can do it, and we showed the management what is possible. So we just extracted some data from the data source, loaded it into R, and did some analysis on a specific market for specific articles. So we broke the problem down very much and showed the management: this is what's possible. And that led to our first bigger data science project there. And from the beginning, as we were looking at this from a certain IT perspective, we were looking for a possibility to create some infrastructure which we can reuse for other use cases later on. So we came up with some kind of sketch of what we wanted to create there, where we thought the process of doing data science is more or less included.

00:03:45

What was very important to us, as you can see here on the slide, is that we created, on the one hand, some lab environment where the data scientists can really create freely, completely independent from any IT infrastructure, something with the latest libraries, the language you want, whatsoever. And on the other hand, an environment where we can work like in a factory, where we can industrialize any data artifact that the data scientist comes up with. That was the basic idea there. And these were the first components which we used to create this data science factory.

00:04:26

We looked at the market. We invested some time to see and find out what was there. That was four years ago. We did some use cases with some providers there, but nothing on the market was really good enough to satisfy all the requirements that the data scientists really had. Sometimes it was too strong on the business perspective, so not flexible enough on coding. Sometimes it was too strong on the IT perspective. So there was not a good mixture. What we found out at that time for sure was that there is a common tool stack, which nowadays is very common and more or less everywhere. At that time it was not that common, but it was already known that you can use Git for doing collaboration on code, that you need something like Docker and Jenkins to deploy stuff.

00:05:24

And also Kubernetes to scale out the things there. So we looked deeper into that tool stack and found out that this is exactly what we need when we would like to go on this journey from classical waterfall IT projects, which are still in place at Conti up to today, going via agile and really meeting the requirements of continuous integration and continuous delivery, where we are able to find new features and integrate them into the software very quickly and also give a lot of steering into the hands of the data scientist. So this was the basic idea of where we wanted to go there. And the data science factory was the outcome. What is very important is that with the data science factory, we created processes alongside. So we did not just concentrate on delivering some piece of technology, something which runs somewhere.

00:06:28

We also wanted to help the data scientists to go along the process to really deliver their results. So we started really very early. And as you can see here, we use some templates to help people find use cases at all. How can you start? What is the first step that you take? Do you look for the right data and then think about the use case? Do you look for business value? What is it that you are looking for? So we also did workshops with the business areas, with the domain experts, to identify use cases. And we also established this way of working where a data scientist goes along with a business expert, a domain expert, and helps understand the problem, shows the data, gives insight into the data, creates maybe something very isolated, maybe a Shiny dashboard where you can look into the data.

00:07:23

And that helped the business people very much also to understand what data can deliver, what data science can deliver. That was really a very good first step, but that was not the step where we wanted to stop. So the data science factory was the logical next step, because it comes to the point where the business domain says: well, that's great, I want to have it on a daily basis, run automatically. And next week, maybe I have a new idea. I want to give input to you, and please integrate it there. That is exactly the point when we need to go one step further and integrate the data science factory in there. So what you can see here is also the process, how the data scientists can really go to these further steps and go along the development process to create something which we then make productive.

00:08:18

We recommended using these classical areas, dev, QA and prod, as you can see here. But what we did there in the data science factory is that we abstracted everything from the data scientists. The data scientist does not need to take care of what is in there. So all the load balancing, networking, proxies, whatsoever is taken away; that is done by the data science factory so that the data scientist can concentrate on these steps. He can create an artifact, a nice thing, which could be a dashboard or a classification algorithm which delivers its results using an API, whatsoever. And he can develop this in his lab, bring it to the dev area, test it, do a handover, talk with the business: hey, is this what you want? He can deliver the results there, move it to a QA environment and move it to the prod environment.

00:09:14

All this is in the data science factory. So as I said before, it is the classical tool stack: what you find in the data science factory are basically components that are well known to the audience here, I guess. You can find Git and Jenkins. We put things in Docker and use Kubernetes to scale it out. And we have Airflow when the process, the workflow, is more complex. For simple workflows we still use something easy like crontab, but if it's getting more complex to schedule all these things, we have Airflow in place. And we have monitoring on that. And what's very important is that from the beginning we set everything up as infrastructure as code. In the beginning, it wasn't clear whether we would stick to a specific environment, whether it's on-premise or whether it's some AWS or Azure cloud. So we went to Azure because there was already an account there and stuff like that, but we didn't rely on the services there.

00:10:18

So we really did everything as infrastructure as code. And you see here the components, Terraform and Ansible. So we automated almost everything. Many templates are also available nowadays. If you want to have a new, let's say, database in place, most of the time we have Postgres, we have some templates to automate the process there. So automation was a very important paradigm for the data science factory, to abstract really a lot of stuff from the data scientists and make it possible to run the stuff as independently as possible. So here you see the workflow that we created alongside this process. We have the data lake, which I will show you in a second. We call it a data lake, but it's not really a data lake. It's where you have a lab to work in, and then you have the stages.

00:11:13

And the data scientist goes through the stages to, in each stage, test his model, deliver the results, check the data, check all the process behind it, promote it with a defined process and go on. What's also important: as this area was new to Tires and to many developers, many developers were not from IT but came from something like manufacturing or else, we also had a kind of role-model function there and showed these data scientists in the different areas how you can work with such a process, how you can do software development, what the best way is, where you need to hand over, how you work with Git and the Git flow there and stuff like that. Here you can see a very technical view on this, and you see how this works.
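
The staged promotion described here can be sketched in a few lines of Python. This is only an illustration, not the data science factory's actual implementation: the stage names follow the talk, while `Artifact`, `promote` and the idea of per-stage check functions are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Stage order as described in the talk: lab first, then dev, QA, prod.
STAGES = ["lab", "dev", "qa", "prod"]


@dataclass
class Artifact:
    """Hypothetical stand-in for a dashboard or model being promoted."""
    name: str
    stage: str = "lab"
    history: List[str] = field(default_factory=lambda: ["lab"])


# A check inspects the artifact and returns True if the gate is passed.
Check = Callable[[Artifact], bool]


def promote(artifact: Artifact, checks: Dict[str, List[Check]]) -> Artifact:
    """Move the artifact one stage forward if all checks for the target stage pass."""
    index = STAGES.index(artifact.stage)
    if index == len(STAGES) - 1:
        raise ValueError(f"{artifact.name} is already in {artifact.stage}")
    target = STAGES[index + 1]
    failed = [check.__name__ for check in checks.get(target, []) if not check(artifact)]
    if failed:
        raise RuntimeError(f"promotion to {target} blocked by: {', '.join(failed)}")
    artifact.stage = target
    artifact.history.append(target)
    return artifact
```

An artifact can only move one stage at a time, and a failing check leaves it where it is, which mirrors the idea of testing and handing over in each stage before going on.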

00:12:06

We have code there. You check this code in, in Git, and you have tags on there. And these tags are later used by the data science factory for the right deployment. Let's assume you have an R Shiny dashboard which you would like to deploy. There is a config file which steers this in a central way. And then you can do your release management, and from the build, from Git, the data science factory knows, so to say, which area you are deploying into and how you would like to work with it. I will show you this now in a few slides here. Let's do a quick run through this process.
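
To make the idea of tags plus a central config file concrete, here is a minimal Python sketch. The talk does not state the tag format or the config layout, so the `<stage>-<major.minor.patch>` convention, the JSON structure, the app name and the `replicas` setting are all assumptions for illustration.

```python
import json
import re

# Hypothetical central config: one entry per stage the app may be deployed to.
CONFIG_JSON = """
{
  "app": "mining-dashboard",
  "stages": {
    "dev":  {"replicas": 1},
    "qa":   {"replicas": 1},
    "prod": {"replicas": 3}
  }
}
"""

# Assumed tag convention: "<stage>-<major>.<minor>.<patch>", e.g. "qa-1.4.0".
TAG_PATTERN = re.compile(r"^(?P<stage>[a-z]+)-(?P<version>\d+\.\d+\.\d+)$")


def resolve_deployment(tag: str, config: dict) -> dict:
    """Derive the deployment target and settings from a Git tag and the config."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"tag {tag!r} does not follow <stage>-<major.minor.patch>")
    stage = match.group("stage")
    if stage not in config["stages"]:
        raise ValueError(f"stage {stage!r} is not defined in the config")
    return {
        "app": config["app"],
        "stage": stage,
        "version": match.group("version"),
        **config["stages"][stage],
    }
```

With this convention, `resolve_deployment("prod-1.4.0", json.loads(CONFIG_JSON))` yields the prod settings for version 1.4.0, and a malformed or unknown tag is rejected before anything is deployed.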

00:12:56

The starting point is always the so-called data lake. It's not a very lucky name because it's not a data lake in the classical sense, but it's a data lab. And I really like it because, as a data scientist, this is a very nice entry point. You start with a web page, as you can see here, and behind this is the full process that many of you definitely know, like creating an EC2 instance, creating S3 buckets to store your data and stuff like that. But as a data scientist, you don't care. You don't need to look into that because you just click on this web page with your usual single sign-on and you choose whatever you want for an environment, maybe an RStudio or a deep learning environment, whatsoever. You choose your configuration.

00:13:44

Let's say you would like to do some deep learning. Then you can choose how much power you need. If you want to have a GPU in there, go for it. And there you go. Behind that, a specific EC2 instance starts, and the EC2 instance is pre-installed with all the necessary Python flavors. You have different flavors in there, like Anaconda and a plain Python. You can have your virtual environments already pre-installed, with TensorFlow, PyTorch, MXNet and Caffe all in there. You have a Jupyter notebook, and really a minute later you can start working. You just need to upload your data into an S3 bucket. Also, the S3 bucket is not really visible for you. It's mapped to a directory in your EC2 instance, and you just start working there.

00:14:36

So this is the lab environment where we work. It's very helpful, very useful, and this is also maintained by our corporate division. So it's really usable all over the place at Continental. Then we use GitHub. GitHub has a very central role, as you can imagine. If you are ready with your code, you can create a repository where you put all your code. And as a data scientist, you also have to containerize. So you put the Dockerfile into GitHub as well. You can see here, we do all the documentation in GitHub. We have a meta documentation which relies on GitHub and extracts the README files, the .md files, from GitHub every night. So we have an up-to-date documentation. Every data scientist is asked to do a decent documentation in there and put a Dockerfile in there so that the Dockerization is already done.

00:15:37

And we have a Jenkins template where you need to enter your specific project. And that's it. Then you can publish your stuff on GitHub. And in your lab environment, you have a configuration file, like you know from other applications. This configuration file is really specific to your project. And with this configuration file, you are able to use our data science factory control. The data science factory control, as you can see here, is a specific application that we have made available to the lab and to every data scientist. You can easily install it from our internal Git. It's just a pip install. Then you have this DSF control, and with the DSF control, you can do the whole process. If you want to have an overview, you just do a list. Then you have an overview of all the builds that are already available.
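
A command-line tool of this kind could look roughly like the following Python sketch. It is not the real DSF control: the program name `dsfctl-sketch`, the `list` and `release` subcommands with their arguments, and the in-memory `BUILDS` registry are hypothetical and only illustrate the workflow of listing builds and releasing one to a stage.

```python
import argparse
from typing import Dict, List

# In-memory stand-in for the build registry a real tool would query (hypothetical data).
BUILDS: List[Dict[str, str]] = [
    {"build": "41", "app": "demo-dashboard", "stage": "prod"},
    {"build": "42", "app": "demo-dashboard", "stage": "dev"},
]


def build_parser() -> argparse.ArgumentParser:
    """Define the two illustrative subcommands: list and release."""
    parser = argparse.ArgumentParser(prog="dsfctl-sketch")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("list", help="show all available builds")
    release = sub.add_parser("release", help="release a build to a stage")
    release.add_argument("build", help="build identifier")
    release.add_argument("--stage", choices=["dev", "qa", "prod"], required=True)
    return parser


def run(argv: List[str], builds: List[Dict[str, str]]) -> List[str]:
    """Execute one command and return the lines that would be printed."""
    args = build_parser().parse_args(argv)
    if args.command == "list":
        return [f"{b['build']}\t{b['app']}\t{b['stage']}" for b in builds]
    for b in builds:
        if b["build"] == args.build:
            b["stage"] = args.stage
            return [f"released build {b['build']} of {b['app']} to {args.stage}"]
    raise SystemExit(f"unknown build {args.build}")
```

For example, `run(["release", "42", "--stage", "qa"], BUILDS)` marks build 42 as released to QA; a real tool would trigger the deployment pipeline at that point instead of updating a list.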

00:16:36

If you want to deploy your specific application to a specific stage, you go here. As you can see in this example, you release it to a specific stage, on dev or on QA. You can release your stuff there with this control. And this is basically the process that a data scientist has to follow. After this, you can control your environment. Sure enough, we have monitoring in here. And with this monitoring, you as a data scientist can see how this works, how many resources you use, how many jobs you have running, what the performance is behind this, and whether there are any problems in there. As you can already see here, for instance, this example shows that there are more CPUs reserved than are necessary.
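
The kind of finding mentioned here, more CPUs reserved than necessary, can be expressed as a simple rule. The following Python sketch is an assumption about how such a check might look: the `ProjectUsage` fields, the project names and the headroom factor of 1.5 are illustrative and not taken from the talk.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ProjectUsage:
    """Hypothetical monitoring record for one project."""
    project: str
    cpus_reserved: float
    cpus_used_peak: float


def over_reserved(usages: Iterable[ProjectUsage], headroom: float = 1.5) -> List[ProjectUsage]:
    """Return projects that reserve more than headroom times their peak CPU use.

    The result is sorted by the unused reservation, largest first.
    """
    flagged = [u for u in usages if u.cpus_reserved > u.cpus_used_peak * headroom]
    return sorted(flagged, key=lambda u: u.cpus_reserved - u.cpus_used_peak, reverse=True)
```

A report built on such a rule gives the operations team a short list of projects to talk to, which is the optimization help described in the talk.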

00:17:33

These are also possibilities for us as an IT organization to look into this and help the data scientists to optimize their work. We have this monitoring in place, where you can really control all the different projects in the data science factory. Currently, there are over 30 projects running in there, from small projects with simple dashboards to very complex ones with many interfaces to many different data sources and APIs behind that, where more teams are involved. So everything is in there. Here, I give you an overview. We don't stop with the data science factory. The data science factory is a very central and important one, and the first infrastructure that we created; you can see it here. But from there, we started to spread out because, after the first success of the data science factory, many people were interested in doing more, in getting more data available as well.

00:18:32

So we got connected to different data sources. We have some lightweight open source data warehouse in here, because it's always the case that you find use cases where it's still necessary to have not big data or unstructured data, but the good old traditional dimensional data. So we put this in there and have access to these data. Very interesting also is the telemetric backbone, as we call it here. The telemetric backbone is an infrastructure where, as the name already says, you collect fast data, so data which comes in very small pieces of information, but many of them. So here we collect data from, let's say, vehicle tests where our tires are tested out there. And we get information on pressure or temperature and so on. GPS information, everything like that is collected in this telemetry backbone.

00:19:30

In this telemetric backbone, we have stages and a Cassandra database, but also here, we abstracted all the stuff for the data scientists. The data scientist has an API again, which is the layer of access for the data scientist, where you can easily grab the data from Cassandra without knowing any details about Cassandra and just use this optimized view. Like: I want to query a certain kind of vehicle, a certain date range, maybe also a region, some GPS information or something like that. All this we abstracted from the technology so that the data scientist can really quickly use their mainly used languages. Yeah, this is the overview here of the whole technology stack that we use. So you see, from the start four years ago, we grew a lot, but still the data science factory is the central point where we industrialize use cases and use them for production.
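
The access layer described here can be pictured as a single query function that hides the storage engine. The Python sketch below is an assumption for illustration: it filters an in-memory list instead of talking to Cassandra, and the `Reading` fields, the `query_readings` signature and the bounding-box format are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Iterable, List, Optional, Set, Tuple

# Bounding box as (min_lat, min_lon, max_lat, max_lon); an assumed convention.
BBox = Tuple[float, float, float, float]


@dataclass(frozen=True)
class Reading:
    """Hypothetical telemetry record from a tire sensor."""
    vehicle_id: str
    timestamp: datetime
    lat: float
    lon: float
    pressure_bar: float
    temperature_c: float


def query_readings(
    readings: Iterable[Reading],
    vehicle_ids: Optional[Set[str]] = None,
    start: Optional[date] = None,
    end: Optional[date] = None,
    bbox: Optional[BBox] = None,
) -> List[Reading]:
    """Filter readings by vehicle, inclusive date range and region.

    In a real access layer the same arguments would be translated into a
    database query; here an in-memory list stands in for the store.
    """
    result = []
    for r in readings:
        if vehicle_ids is not None and r.vehicle_id not in vehicle_ids:
            continue
        day = r.timestamp.date()
        if start is not None and day < start:
            continue
        if end is not None and day > end:
            continue
        if bbox is not None:
            min_lat, min_lon, max_lat, max_lon = bbox
            if not (min_lat <= r.lat <= max_lat and min_lon <= r.lon <= max_lon):
                continue
        result.append(r)
    return result
```

The point of such a layer is that the caller only states what data is wanted (vehicles, dates, region) and never sees table layouts or query language details.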

00:20:35

I would like to show you, oh, this is a very new slide. Very recently, we were also challenged like: well, is this still the right infrastructure? As I told you, it's four years old. And so we were asked: let's compare it to what is there now. Over the years something developed, and yes, there is something there. So we did a comparison with SageMaker. Very interesting, very nice insights that we had there into the software. And what we saw is that, with the clear distinction between what MLOps needs to do and what the data scientist needs to do in the data science factory, our process was slightly more adapted to the needs that we have as a company than SageMaker, because this distinction between these roles of data scientists and MLOps is very common for us. We have many data scientists, as I said, who did not come from an IT background, computer science or something, but learned their domain, like sales, manufacturing, process engineering, something like that.

00:21:43

And now they learn things like R or Python, or do TensorFlow or else. So they don't want to have too much insight into what MLOps does. And this distinction is made with the data science factory. And we use this here, and that was clearly something that was stated after our evaluation, that we have it in the data science factory and it is not there in SageMaker. Also, what we are quite skeptical about is this lock-in. If we go to something like AWS, we have a vendor lock-in there, which we don't want to have. But still, we are completely aware that a company like Amazon can develop these things with a vast number of developers. And maybe we are not that quick because, in the end, we are a tire manufacturer, not an IT company.

00:22:33

So we need to keep our eyes open. We keep our mind open and always compare the things that we do to what's out there to find out whether we still have the right balance. Currently, we clearly place ourselves between the more business-driven tools, which give you a point-and-click possibility to employ some AI, and things like you see here, strongly driven from the IT perspective, which is something like Google or Amazon or also Azure. These tools really clearly have a strong IT background and focus. And working with these tools, you are always reminded of that. And we are more in the middle of that, and also adapted very much to the process we have at Conti Tires.

00:23:30

Let me show you some of the use cases we have at the data science factory. I think after these insights into the data science factory, you might be interested in what we are doing with it. Yeah, I'll show you some things here. A nice use case, for instance, concerns our specialty tires, so tires which are not mounted to a usual vehicle you see on the street, but to some mining vehicle. These are tires which are maybe two meters high, and they have a specific behavior. And it's important that you really handle these tires in this specific way as well; they build up heat and stuff like that. So we created this monitoring tool together with our engineers, and now these engineers have a tool which they can use in the field, go out there and mount a logger to the vehicles and tell the, let's say, mining operator: well, we observed your behavior with this mining vehicle, and you could change it here and there.

00:24:33

If you go this way, this path with a longer curve, for instance, then your tire will last longer. And behind this tool, what you see here, this is completely built in R Shiny and scaled out with the data science factory. Several customers and engineers are working on this today and they can use it. The software automatically detects loading and dumping areas in the mining area, detects paths which are not recommendable, does a lateral force analysis and things like that. Another use case here is from our plants. Here we investigated the mixing process. So the first process, when the raw materials are put in a huge, huge mixer to be prepared for creating tires. In this process, we also got some insights and created some algorithms to optimize this process to be quicker in the turnaround times and stuff like that.

00:25:34

Also behind that there are really mechanics which were created in Python. We developed a small Dash dashboard and made it available to the operators. They use it now on a daily basis. And up to now, we rolled this out in three different plants, using the data science factory again, where the whole software is running. Another very interesting use case is the extrusion process. So when you really have the right mass for creating tires, you need to stretch it out, sort of like a spread on bread. And with this process, this is where it might happen that you create scrap, which is not very good. So we were on the way to predict this scrap. So we try to predict as early as possible whether this extrusion process happened in the right way or whether it went wrong. And in the beginning, we did this on a very low level.

00:26:40

We didn't have the right data. So we had a low accuracy, but over time we evolved there and we are on a good way to be even better and predict the scrap with a very high accuracy. And this process is also really running on the data science factory. And here it's very useful because we have installed this in one plant on one machine, and the operator can give feedback and the data scientists can really optimize the algorithm behind it and just deploy it very quickly, as we have seen before in the slides, so that the operator sees the results of his feedback quite quickly. That's a very good process here for the data science factory. Also image recognition is something we turned to recently. We have a TensorFlow model, or even some TensorFlow models, to detect serial numbers which are printed on some vehicle tires.

00:27:36

And we have an app running to use the results of this model here, and the code itself is really running on the data science factory again. So the phones need to connect to the data science factory and get the inference from there. This is still a well-working model. In future, we will also deploy things to the mobile device. Edge devices are also planned. And with that, the data science factory might evolve and also do device management, mobile management and stuff like that. This is an ongoing process. We are currently into it and really plan to stretch out to these areas as well. A very important area, we saw the first example already, is the whole fleet service area. We have little sensors in our tires, specifically in truck tires.

00:28:35

And the sensors can measure the pressure and the temperature. And from there, we can use these data to do some predictions. How long will the tire last? How many miles can you go with a tire? When is the next service necessary? And this is something we are currently developing for our customers, so that they can use services alongside the tire. A very important service. And this is also based on the development and things like the telemetric backbone I showed before. And it's then deployed to the data science factory, where the algorithm works and delivers the results using APIs. That was it. Thanks for listening. If you have questions, I'm happy to hear them. Thank you.