Scaling DevSecOps adoption in a Large IT Services Firm

The session would highlight the experience of TCS, a large IT services organization, in deploying DevSecOps at scale. We have designed a comprehensive DevSecOps framework. This is abstracted into an assessment method for benchmarking the DevSecOps maturity of various projects. The assessment provides the quantified maturity score in various DevSecOps domains and dimensions. The assessment report includes the project-specific recommendations to elevate DevSecOps maturity. At the project level, the assessment findings are contextualized to come up with the improvement roadmap for implementation.


At the organization level, 530+ projects have already been assessed. The projects are on their DevSecOps improvement journey.

The session would describe the TCS journey, an overview of the assessment framework, and the approach of promoting improvements in the organization.

Dhruba Chaudhuri

DevSecOps DevOps SRE Site Reliability Engineering Process Design Specialist Functional Lead, TCS

Leena Pradhan

Delivery Excellence, TCS

Transcript

00:00:00

<silence>

00:00:14

Good morning, good afternoon, and good evening everyone across the globe. It's a great pleasure for us to be sharing our experience in this great forum, the DevOps Enterprise Summit. We are from Tata Consultancy Services, a large IT services firm. I'm Leena Pradhan, leading service reliability engineering practices within the Corporate Delivery Excellence Group. The charter for our group is to enable our engagements to deliver reliability in their services to our end customers. We look at the best-in-class engineering practices which are prevalent in the industry. We look at the implementation of these practices across our different customer engagements and then scale these practices across the organization.

00:00:56

Hello, Dhruba here, having 28 years of IT experience, and I'm currently working in the Corporate Delivery Excellence function. My focus areas are DevSecOps and SRE practices. Tata Consultancy Services is an IT services company serving a very large customer base across geographies, countries, and industry verticals. Each of our projects is unique and a different use case in terms of domain, technology, methodology, culture, maturity, and complexity. Today, we would like to share our experience study on how we are driving DevSecOps adoption at scale, embracing diversity.

00:01:38

Before we go into the details of how we are scaling these DevSecOps practices across the organization, we would like to share what triggered this thought process. In the year 2017, our leadership team set one vision for the entire organization: to be enterprise agile by 2020. We were not looking at Agile only in the IT projects, but agile in every function. So the entire organization was driven by this one vision, which was set by our leadership team. We went about doing the transformation. We looked at the process transformation, and we leaned processes as required. We did a people transformation to enable the teams to imbibe agile behavior in all their practices. And we also did a workplace transformation, which enabled a seamless and collaborative working experience for our associates. With a strong deployment focus, rigor, and continuous leadership direction, we achieved our vision.

00:02:37

And 2020 was the enterprise aha moment for us. With this solid foundation on agile, we now have more than 80% of engagements being executed in Agile. This transformation brought another objective in front of us: to reinforce the engineering practices, to go deeper into DevSecOps practices. DevOps complements agile from a technology standpoint, and thus this transformation brought DevOps into more prominence across the organization. This transformation also paved the path for us to a faster realization of business value. We understood what it takes to be able to deliver faster and quicker. We are now able to help our customers achieve their business objectives and stay competitive against the other players in the industry. Our customers are able to respond to the industry demand with much more agility. And this also brought a lot of efficiency into the various processes.

00:03:38

Now, that was all on the customer space. If we look at what is happening in the technology space, in the last few years we have seen that there is a significant rise in the digital footprint in every business, because the industries have gone mobile, there is a lot of web presence in all the industries, and the organizations are also looking at a multi-channel approach for delivering their services to the end customers. And perhaps the pandemic has accelerated this movement across all the industries. This also saw an increased exposure to various cyber threats. So what it meant was that we require a very fine balance between speed, reliability, resilience, and quality. And hence, DevSecOps is becoming a default ask now. So what was the need in certain pockets, or in certain industries or projects,

00:04:36

has now become a default need for every other industry. Around the same time, our customers were also getting more interested in DevSecOps, and they were asking, what are the TCS capabilities in implementing DevSecOps? And with this agile transformation, we also found that the accounts which were the early adopters of Agile had already done multiple cycles of transformation for faster service delivery. They had done transformation in terms of people, process, and the different technology or automation enablers. They have now moved from the earlier larger and less frequent releases of, let's say, yearly or half-yearly, to much smaller and much more frequent releases of quarterly, monthly, fortnightly. And some even have release-on-demand capabilities. But when we looked deeper, we found that these are all diverse implementations.

00:05:35

And, to some extent, there is a difference in the understanding of what DevSecOps is among the different teams. The experience and learning also mostly stayed within the teams, and teams were kind of relearning and reinventing the wheel, even though the same practice had already been implemented elsewhere in the organization. So essentially, it means that we were faced with the challenge of scalability and multiplicity in such a big organization like us. So for a process group like us, it was very imperative that we have a standard framework to scale DevSecOps across our organization. That's when we came up with the TCS DevSecOps framework. It's an exhaustive process framework which helps to implement these DevSecOps practices across the different engagements. It helps us to institutionalize these practices across various teams with different diversity. And it also addresses the challenge of the higher incubation period, or the higher readiness cycle time. It also gave us another opportunity: to be able to benchmark our different accounts in terms of their DevSecOps maturity. This is an exhaustive and comprehensive process framework which helps us to strengthen our core capabilities in the engineering space.

00:07:03

The framework is a four-layered, structured framework. There are five domains, practice areas within them, then 40-plus themes and more than 80 practices. The five domains are continuous planning, continuous development, continuous testing, continuous release and deployment, and continuous monitoring and improvement. The continuous planning domain deals with the practices which are related to product conceptualization and product planning. It also looks at the team and the foundational readiness in terms of skilling and the automation capabilities. Continuous development looks at the practices which are related to development, peer reviews, security practices, and continuous integration. Continuous testing looks at the practices related to test design, test strategy, test data enablement, and test environment enablement for the various functional and non-functional testing. Continuous release and deployment deals with the practices which are related to product validation, the release approval and release process, and also the different zero-downtime strategies for deployment. And then there is continuous monitoring and improvement.

00:08:13

This looks at the practices which are related to monitoring and observability, and it also looks at what kind of insights can be derived based on the data which is collected throughout the delivery lifecycle. This framework has been designed with the best practices from TCS's mature implementations, and also the practices which are more prevalent, and the upcoming ones, in the industry. It's a multidimensional framework. It addresses the culture, it addresses processes, and it addresses people. The practices address the various aspects of quality, security, reliability, and resilience, and the framework has a special focus on the automation aspects for different practices, such as testing, integration, build, deployment, monitoring, and so on. The framework also covers in depth the various KPIs: the pipeline KPIs while the software is being developed, the application performance, infrastructure performance and health, and also the product performance in terms of the business KPIs. These dimensions have been very carefully and thoughtfully selected as the ones which are essential for achieving the goals of DevSecOps.
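Editor's note: the four layers described here (domains, practice areas, themes, practices) can be pictured as a simple nested data structure. The sketch below is only an illustration under assumptions. The five domain names come from the talk, while the class names and the example practice area and theme are hypothetical and are not taken from the TCS framework.

```python
from dataclasses import dataclass, field


@dataclass
class Practice:
    name: str


@dataclass
class Theme:
    name: str
    practices: list[Practice] = field(default_factory=list)


@dataclass
class PracticeArea:
    name: str
    themes: list[Theme] = field(default_factory=list)


@dataclass
class Domain:
    name: str
    practice_areas: list[PracticeArea] = field(default_factory=list)

    def practice_count(self) -> int:
        """Count the practices across all practice areas and themes."""
        return sum(
            len(theme.practices)
            for area in self.practice_areas
            for theme in area.themes
        )


# The five domain names as listed in the talk.
DOMAIN_NAMES = [
    "Continuous planning",
    "Continuous development",
    "Continuous testing",
    "Continuous release and deployment",
    "Continuous monitoring and improvement",
]

framework = [Domain(name) for name in DOMAIN_NAMES]

# Hypothetical placement: the area and theme names below are invented
# for illustration; only the practice (SAST) is mentioned in the talk.
framework[1].practice_areas.append(
    PracticeArea(
        "Security practices",
        [Theme("Static analysis", [Practice("Static application security testing")])],
    )
)
```

A structure like this makes it straightforward to roll practice-level findings up to theme, practice area, and domain level.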

00:09:28

The framework covers the entire breadth and depth of the service delivery. All those 80 practices that I mentioned earlier cover the entire breadth of the service delivery lifecycle. And for each of those 80 practices, we are going into the depth of each of them. When I say depth, it means who is practicing this practice: whether it is the product team, or it is some other team who is helping the product team in executing that practice. So that tells us about the self-sufficiency or cross-functional ability of the product team. We are looking at what is the readiness in terms of the skills which are required, and in terms of the automation capabilities: whether the practice is being implemented offline, or it is a practice integrated as a part of the pipeline itself. We are looking at how early the practice is being done, to understand the left-shift culture of the team, and if it is an automated practice, whether it is fully automated or what is the extent of automation which is there. As an example, for static application security testing (SAST), we are looking at whether the product development team is capable enough to implement this practice, or there is an InfoSec team who is outside of the product team and they are executing SAST.

00:10:44

We look at whether the product team is enabled to do the setup and the configuration of the tool, and whether they are also enabled to understand the tool output and infer from it; whether SAST is being triggered as a part of the CI builds, or it is executed as an offline process. And when we look at the left-shift aspect, we are checking whether these SAST tools are integrated as a part of the developer's IDE, or are they a part of the CI builds, or it is done just before the release. The framework also looks at whether the entire code base is being used for doing the SAST, or it is done only on a sampling basis. So the breadth and depth of each of these practices determine the maturity of the practice. And then, in an aggregated way, these help us to determine the maturity of the team.
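Editor's note: the depth questions asked about SAST (who executes it, whether the team is enabled on the tool, whether it runs in the pipeline, how early it runs, and how much of the code base it covers) can be sketched as a small scoring model. This is a minimal sketch under stated assumptions. The talk does not describe the actual scoring scheme, so the equal weighting, the stage scores, and all names below are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean

# Assumed scores for how early the practice runs (shift-left aspect).
SHIFT_LEFT_SCORE = {"ide": 1.0, "ci_build": 0.7, "pre_release": 0.3}


@dataclass
class PracticeDepth:
    done_by_product_team: bool              # product team vs. an outside team
    team_can_configure_and_interpret: bool  # tool setup and reading its output
    integrated_in_pipeline: bool            # CI-triggered vs. offline
    earliest_stage: str                     # "ide", "ci_build" or "pre_release"
    full_codebase: bool                     # whole code base vs. sampling

    def score(self) -> float:
        """Equal-weight average of the depth answers (an assumption)."""
        parts = [
            1.0 if self.done_by_product_team else 0.0,
            1.0 if self.team_can_configure_and_interpret else 0.0,
            1.0 if self.integrated_in_pipeline else 0.0,
            SHIFT_LEFT_SCORE[self.earliest_stage],
            1.0 if self.full_codebase else 0.0,
        ]
        return mean(parts)


def team_score(practices: dict[str, PracticeDepth]) -> float:
    """Aggregate practice scores into one team-level score."""
    return mean(p.score() for p in practices.values())


sast_mature = PracticeDepth(True, True, True, "ide", True)
sast_offline = PracticeDepth(False, False, False, "pre_release", False)
```

Here `sast_mature` describes a team that runs SAST itself, from the IDE, on the whole code base, while `sast_offline` describes an outside team running it on a sample just before release.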

00:11:40

Just to summarize whatever we had so far: for a service industry like TCS, the DevSecOps adoption scenario is different. Why? Due to the huge scale and diversity, each project is a different instance and not comparable with others. So we have come up with this standard framework that integrates DevSecOps core principles, practices, and culture into a software engineering assembly line to deliver reliable, resilient, and secure product increments at speed or on demand. The framework defines the TCS way of approaching DevSecOps projects, aimed at achieving DevSecOps goals and objectives. After the framework, what next? Now we wanted to benchmark our projects against this framework baseline to understand their current state of adoption and further opportunities for improvement, because we want all of them to elevate their DevSecOps maturity. The reason being, we observed that maturity plays a very critical role in attaining and maximizing desired benefits from DevOps adoption. So we started doing manual assessments to gauge the current state and opportunities. The method includes, as you can see on my left-hand side, identification of stakeholders and the assessment team, conducting interviews and pipeline demo sessions, manual benchmarking against the framework elements, preparing findings, conducting draft validation sessions with the team, and sharing the final assessment report with detailed findings.

00:13:28

One sample report is shown as the outcome of this manual benchmarking method, which captures the adoption of practices against 200-plus practice elements and their state of adoption using different color codes. Do you think this was scalable? No, because we could complete only 20 such assessments in seven months, whereas we target to cover multiple thousands. So needless to say, the method was neither scalable nor sustainable due to many challenges, some of which you can see here: uniformity, stakeholders' availability, being very effort-intensive, consuming a very high cycle time of four to six weeks, and so on. So how are we overcoming this? As a next step, we abstracted the framework into a technology-agnostic, self-paced, and digitized assessment method having four maturity levels, which you can see as basic, standard, advanced, and best in class. It has over 500 responses designed to capture each practice element across the five DevSecOps domains and the various associated dimensions such as people, process, technology, culture, automation, security, integration, and so on. The maturity is based on breadth and depth of practice implementation and rigor, as explained earlier in a T-shaped model. It may also depend on the practice gradation. For infrastructure as code, for example, the deployment of this practice can vary widely.
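Editor's note: the four maturity levels named here (basic, standard, advanced, best in class) imply some mapping from a quantified score to a level. The sketch below shows one way such a mapping could look. The 0-to-1 score range and the cutoff values are assumptions for illustration only; the talk does not state how TCS sets the boundaries.

```python
from bisect import bisect_right

# The four level names are from the talk; the cutoffs are assumed.
LEVELS = ["Basic", "Standard", "Advanced", "Best in class"]
CUTOFFS = [0.25, 0.50, 0.75]


def maturity_level(score: float) -> str:
    """Map a normalized score in [0, 1] to one of the four levels."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    # bisect_right counts how many cutoffs are <= score.
    return LEVELS[bisect_right(CUTOFFS, score)]


# Hypothetical domain-wise scores for one project.
domain_scores = {
    "Continuous planning": 0.40,
    "Continuous development": 0.80,
    "Continuous testing": 0.10,
}
domain_levels = {name: maturity_level(s) for name, s in domain_scores.items()}
```

With these assumed cutoffs, the hypothetical project would be reported as Standard in planning, Best in class in development, and Basic in testing.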

00:15:20

The starters can do this using IaC for the infra configurations only, whereas others can start provisioning infrastructure components. Some are able to provision an entire environment, like spin up or tear down the environments at will. And some are even more mature, orchestrating the entire production deployment automatically using auto-rollback facilities. This assessment has an integrated knowledge base to translate the responses and come up with a detailed report that is automated and sent to the team within a one-day cycle time. You can see one sample report, which mainly has three parts. The first part, on your left, is the domain, sub-domain, and dimension-wise maturity scores and their levels. The second part, the middle one, provides an executive summary on key aspects such as deployment frequency, lead time, automation, security integration in the pipeline, and similar. The third part is the detailed findings on each and every strength and opportunity, which are classified as observations and related recommendations based on the framework, and the suggestions for some industry-leading best practices to explore, such as hypothesis-based testing, often known as chaos engineering, red teaming, and so on. Now we are showing one sample question, just to show the kind of rigor we have in this assessment method and the responses that we have designed.
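Editor's note: the infrastructure-as-code gradation described at the start of this passage is an ordered scale, from configuration only up to fully orchestrated production deployment with auto rollback. A minimal sketch of that ordering follows. The four steps are taken from the talk, while the enum names and the helper function are hypothetical, and no mapping from gradation to maturity level is implied.

```python
from enum import IntEnum
from typing import Iterable, Optional


class IacGradation(IntEnum):
    """Ordered gradation of infrastructure-as-code usage."""
    CONFIGURATION_ONLY = 1        # infra configurations only
    COMPONENT_PROVISIONING = 2    # provisioning infrastructure components
    ENVIRONMENT_PROVISIONING = 3  # spin up or tear down whole environments
    ORCHESTRATED_DEPLOYMENT = 4   # automated production deploy, auto rollback


def highest_gradation(observed: Iterable[IacGradation]) -> Optional[IacGradation]:
    """Return the most advanced gradation a team demonstrates, if any."""
    return max(observed, default=None)


team_capabilities = [
    IacGradation.CONFIGURATION_ONLY,
    IacGradation.ENVIRONMENT_PROVISIONING,
]
```

Because `IntEnum` members compare by value, `max` picks the most advanced step the team has reached.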

00:17:12

You can see one continuous integration build check, with options vis-a-vis possible responses: the expectation versus the responses. This is what we have experienced in various projects, and we have seen both the extremes: someone is doing continuous integration only for compilation and linking different libraries, while some others, at the other extreme, are able to integrate all these steps and have a robust check. Also, the assessment and the reports are technology agnostic. So what are we doing with these reports? What is the flow? The report findings are then interpreted in the context of the customer to come up with a detailed action plan with priority, action owner, and timeline. The context could be different: technology stack and limitations, the customer's existing investment in technology and tools, agile maturity, organizational and cultural rigidity, their alignment towards traditional and other technologies, and so on.

00:18:23

It is then shared with the customer to further elevate the DevSecOps maturity as the actions are being taken. Reassessment is recommended in four to six months or so, to see if the maturity elevation for these projects has happened and if that is reflected in achieving the DevSecOps goals. If the maturity elevation has not happened, then they need to continue with these actions and look at the rigor of implementation. And if it has happened and the reflection within the goals is also visible, then they would go for the next maturity level. And this is how the journey continues for every project.
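Editor's note: the reassessment step described here is a small decision rule. The sketch below encodes the two branches the speakers state. The third branch, where maturity has risen but the goals do not yet reflect it, is not covered in the talk and is labeled as an assumption. The function name and return strings are hypothetical.

```python
def next_step(previous_score: float, current_score: float,
              goals_reflected: bool) -> str:
    """Decide the follow-up after a reassessment (four to six months later)."""
    elevated = current_score > previous_score
    if not elevated:
        # Stated in the talk: keep going and check implementation rigor.
        return "continue actions and review rigor of implementation"
    if goals_reflected:
        # Stated in the talk: move on to the next maturity level.
        return "target the next maturity level"
    # Assumption: this case is not described by the speakers.
    return "review why goals do not yet reflect the elevation"
```

For example, `next_step(0.4, 0.6, True)` returns the "target the next maturity level" outcome, and the cycle then repeats.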

00:19:08

So the next question is: we have a huge scale, and that is what the framework and the manual assessment alone could not sustain. So how are we reaching our different engagements? We have a strong ecosystem and enablement team in the form of delivery excellence heads, delivery excellence partners, unit agile leaders, and a community of DevSecOps practitioners as a neighborhood to facilitate this scaled adoption. Apart from this, we have been evangelizing DevSecOps adoption through various forums such as webinars, focused weeks, account speak, knowledge-sharing sessions, and so on. We also conduct orientation sessions at regular intervals, both at business unit and organization level. Now, to show a sample governance mechanism by different business units: this is how the specific single point of contact, or SPOC, is driving rigor in DevSecOps adoption within his or her unit. They often adopt various means such as a coverage dashboard, a daily sync-up to respond to different queries and remove impediments, and a monthly town hall to answer all queries and to showcase to the management. They often collaborate with us to get any help that is required to sustain this initiative. The focus is to bring everyone into this improvement journey to maximize both customer and organizational benefits.

00:20:45

Now, talking about the impact: we have just explained how the framework and the assessment method are helping us to benchmark and to scale DevSecOps adoption at organization level. As we speak, we have already achieved 700-plus benchmarking assessments for 200-plus customers across business units and geographies. Customers are also very positive to see this proactive engagement as growth and transformation partners towards improving their DevSecOps maturity, and thus helping and enabling them to realize their business benefits. This is reflected in their feedback too. You can see a couple of samples given here. As a derived benefit at organization level, we are also having additional benefits, because we have a rich information data set of 500-plus responses from each of these 700-plus assessments. So we are coming up with various additional organizational insights on each of these practice elements and corresponding dimensions. This is helping us to identify specific focus areas to drive excellence in software engineering practices across the organization, and even beyond DevSecOps projects.

00:22:11

To summarize the behavior being exhibited by projects at different maturity levels: at a very high level, we have been observing advanced and best-in-class projects exhibiting better ability in continuous delivery, enabled by their maturity in test automation, infra codification, security integration, and so on. They are also found to be more mature in zero-downtime and zero-handoff production deployments, often one-click, followed by real-time observability and proactive event management. Additionally, best-in-class projects also show better adoption of advanced capabilities such as AIOps, testing with focused user groups like A/B testing, dark launching, failure injection to test infra resilience, and so on. Whereas the projects in standard maturity are capable of doing continuous integration and are now reaching for the next level by improving in different areas, such as adoption of best practices like test-driven development, behavior-driven development, and threat modeling, building cross-functional skills in test automation, integrating security tools within the pipeline, improving real-time pipeline visibility, more integration, proactive monitoring, and so on. We have often seen that cloud adoption is helping projects to reach the higher maturity levels, especially from an infrastructure automation and continuous monitoring perspective.

00:23:56

So now, how is DevSecOps adoption helping TCS and customers? There has been customer delight and appreciation everywhere, as it is helping them to realize their DevSecOps goals towards maximizing business benefits such as faster delivery, high-frequency delivery, improved agility and responsiveness towards technology changes, business changes, events and disaster management, faster recovery from incidents and outages, improved reliability, and so on. Now the question could be: how are we refining the framework? Are we fine-tuning or enriching this framework and assessment method? The answer is yes. Obviously, we are continuously working on upgrading our knowledge of how the industry is trending, what new practices are emerging, what our mature engagements are practicing, how they are practicing, and what the additional need is. And that is how we are feeding back all our experiences into this framework and assessment method to continuously enrich it. With this, we come to the end of our presentation. So finally, thank you all for having the patience and listening to our experience of scaled DevSecOps adoption and how this is different from a very large IT services company's perspective. We are sure you'll be able to replicate this story in your organization as applicable. Thank you.

00:25:35

Thank you and have a good day.