In a company the size of Capital One, where over 8,000 software engineers are tasked with developing secure and compliant systems for a leading financial institution, challenges naturally arise.
From managing complex deployment pipelines to ensuring compliance with hundreds of regulatory requirements, the landscape is nothing short of demanding. As a backend deployment engineer specializing in AWS infrastructure for Capital One, I’ve spent years working within the nuances of this environment.
What I can say is that despite the hurdles, Capital One’s engineering teams have found ways to successfully Frankenstein together third-party tools, custom solutions, and internal workflows that keep things running. Yet the developer experience could be much better, and that's a challenge many large financial institutions like ours face every day.
TL;DR
Capital One's internal development pipeline is extensive and complex because of compliance requirements and scale: more than 8,000 engineers work within it.
Custom tools and platforms are developed internally, using Jenkins, Ansible, and various programming languages like Groovy, Python, and Go.
The internal development experience can be challenging because systems are fragmented, and it calls for more cohesive collaboration tools.
AI integration is a growing challenge due to security and compliance concerns.
Moving back to on-prem for AI-related tasks is being considered due to privacy and control issues, though the cloud remains a core resource.
Developer Stack: Building on Flexibility
At Capital One, flexibility is key to our developer stack. Engineers are given the freedom to choose their stack depending on their team’s focus, leading to a variety of tools, languages, and workflows across the company. My team, for example, focuses on Ansible, Jenkins, and GitHub, while working primarily with Groovy, Bash, and Python to manage deployments. But this flexibility comes with its own set of problems.
“No one’s got the same setup,” I often say when explaining the inevitable quirks that arise when 8,000 engineers are working with different tools, languages, and configurations. While our stack covers everything from Java to Go, this wide array of tools sometimes creates more chaos than convenience.
In large-scale financial institutions, speed and compliance often clash. For example, a payment processing service may need to execute in microseconds, yet every deployment pipeline has to bake in hundreds of compliance checks. These checks often require specific tools, and it’s here that flexibility starts to turn into a logistical challenge.
Baked-in Compliance: Automation for Scale
One of the key features of our deployment pipelines is that compliance is baked directly into the system. Using OPA (Open Policy Agent) policies, we scan code for security issues before it even gets deployed. This allows us to provide feedback to engineers in real time, rather than making them wait 25 minutes for a deployment to fail.
OPA allows us to scan user inputs before any job really gets going. We use OPA to set parameters, ensuring things like security groups in AWS are configured correctly before a developer moves too far into the deployment.
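To make the fail-fast idea concrete, here is a minimal sketch in Python. It is not our actual policy code: real OPA policies are written in Rego, and the rule shown here (no SSH or RDP ports open to the whole internet) along with the names `IngressRule`, `RESTRICTED_PORTS`, and `check_security_group` are assumptions made purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical rule set: ports that must never be open to the whole internet.
RESTRICTED_PORTS = {22, 3389}
OPEN_CIDR = "0.0.0.0/0"


@dataclass(frozen=True)
class IngressRule:
    """One requested inbound rule for a security group (illustrative shape)."""
    port: int
    cidr: str


def check_security_group(rules: list[IngressRule]) -> list[str]:
    """Return violation messages; an empty list means the input passes."""
    violations = []
    for rule in rules:
        if rule.cidr == OPEN_CIDR and rule.port in RESTRICTED_PORTS:
            violations.append(
                f"port {rule.port} must not be open to {OPEN_CIDR}"
            )
    return violations


if __name__ == "__main__":
    requested = [IngressRule(443, "0.0.0.0/0"), IngressRule(22, "0.0.0.0/0")]
    problems = check_security_group(requested)
    if problems:
        # Fail fast, before any deployment job is started.
        for message in problems:
            print(f"DENY: {message}")
    else:
        print("ALLOW")
```

The point of the sketch is the ordering: the check runs on the user's inputs alone, so a bad configuration is rejected in seconds instead of surfacing as a failed deployment much later.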
Catching these problems up front has significantly sped up our development process. However, the sheer number of checks (hundreds in some cases) makes this a major effort. We have an entire dedicated team just to maintain and coordinate them, and even so, policies sometimes get implemented twice by different teams.
The ideal solution would be a tool that allows for better developer collaboration. While we do have internal tools and documentation, the documentation is often so vast that finding relevant information can feel like searching for a needle in a haystack. A more cohesive platform could make this process less cumbersome.
Challenges of Scaling Developer Tools: A Fragmented Experience
Despite our successes in automating compliance, developer experience at Capital One is not without its challenges. We’re using a Frankenstein system, pulling together components from over 15 different third-party resources to manage the development pipeline.
“Our documentation is so extensive, it’s become unwieldy,” is a sentiment shared by many. Our engineers have to follow 45-minute video tutorials just to get their local environments set up. This setup includes multiple versions of Groovy, Go, Python, and other dependencies.
This fragmentation makes it difficult for engineers, especially new hires, to contribute quickly and efficiently. Ideally, we’d have a single-click setup for development environments, but that’s far from the current reality. Internal teams dedicated to maintaining these tools are often overwhelmed by the sheer volume of requests and contributions from other teams.
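A single-click setup is a big project, but a first step could be one command that tells a new hire what is missing from their machine. The sketch below is an assumption about what that might look like, not an existing internal tool: the `REQUIRED_TOOLS` list and the `find_missing_tools` helper are hypothetical, and it only checks that executables are on the PATH, not which versions are installed.

```python
import shutil

# Hypothetical list of executables a team might require; adjust per team.
REQUIRED_TOOLS = ["git", "python3", "go", "groovy", "ansible"]


def find_missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools found.")
```

Even a report this simple replaces part of a long video walkthrough with something an engineer can run and rerun until it comes back clean.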
Our fragmented setup demonstrates that even in large enterprises, internal tools can sometimes lag behind the desired agility and integration.
Internal Open Source: Capital One’s Contribution Model
Despite the hurdles, one area where Capital One excels is in fostering an open-source culture internally. Roughly two-thirds of the new features for our internal tools are contributed by engineers outside of the core developer tools team. This model allows for a more dynamic and responsive development environment where improvements come from across the company.
It’s like having an internal open-source ecosystem. Engineers contribute features and work collaboratively to improve tools that everyone uses.
The challenge is ensuring that contributions don’t lead to duplication of effort or misaligned integrations. That’s why we have technical reviews and meetings to ensure that new features are valuable and don’t repeat existing functionality.
AI Integration: The Future of Internal Tools?
One of the most exciting developments we’re exploring is the potential for AI integration. However, integrating AI into our systems brings up significant compliance and security concerns. While AI can automate many tedious tasks, there’s no way we can allow an AI bot to have full access to our banking systems.
For now, we’re taking a cautious approach, building custom versions of AI tools that can only access specific, limited files within the IDE (Integrated Development Environment). Even these tools are still in the beta testing phase, and we are evaluating whether they can be safely rolled out company-wide.
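As an illustration of what "specific, limited files" can mean in practice, here is a minimal Python sketch of an allowlist check that a tool could run before reading any file on an assistant's behalf. The function name `is_allowed` and the example paths are hypothetical; this is a sketch of the general technique, not a description of our internal tooling.

```python
from pathlib import Path


def is_allowed(requested: str, allowed_roots: list[Path]) -> bool:
    """True only if `requested` resolves to a location inside an allowed root.

    Paths are resolved first so that `..` segments and symlinks cannot be
    used to escape the allowed directories.
    """
    target = Path(requested).resolve()
    for root in allowed_roots:
        root = root.resolve()
        if target == root or root in target.parents:
            return True
    return False


if __name__ == "__main__":
    roots = [Path("/workspace/project/src")]  # hypothetical allowed directory
    for candidate in (
        "/workspace/project/src/app/main.py",
        "/workspace/project/src/../../secrets/key.pem",
    ):
        verdict = "allow" if is_allowed(candidate, roots) else "deny"
        print(f"{verdict}: {candidate}")
```

Comparing resolved paths, rather than raw strings, is the important design choice: a plain prefix check would accept both the traversal path above and a sibling directory whose name merely starts with the same characters.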
“ChatGPT is like a very good eighth grader at programming.”
While tools like ChatGPT can handle simple tasks, they are nowhere near ready to handle the complexities of a financial institution’s software.
The reality is, financial institutions like ours are navigating a delicate balance between innovation and security. Moving back to on-prem solutions for AI may be necessary to maintain control over sensitive data, but the cloud continues to offer unparalleled speed and flexibility.
The Cloud vs. On-Prem Debate: Where Do We Go from Here?
While some companies are considering a move back to on-premises solutions due to AI and data privacy concerns, Capital One is not making that shift, at least not yet. We’ve invested billions into cloud infrastructure, and the efficiency gains from using cloud services like AWS are undeniable.
However, to mitigate the risks of AI, we’re looking into hybrid models, where AI tools are hosted locally or in locked-down environments to ensure that sensitive data is kept secure.
“Once the cat’s out of the bag, there’s no putting it back in.”
This sentiment rings true when discussing security breaches, which is why we’re cautious about how AI tools are integrated.
Key Takeaways
Compliance is a key factor in how we build our internal tools, often creating additional layers of complexity.
Developer experience needs improvement, particularly around ease of setup and collaboration.
Internal open-source contributions from non-core teams are essential to keeping up with new feature demands.
AI is both an opportunity and a challenge, especially when considering security and compliance.
The cloud offers incredible advantages, but we are cautious about how much control is given to AI models, considering potential privacy risks.
Conclusion: The Future of Developer Experience
In large-scale companies like Capital One, there’s no one-size-fits-all solution for developer tools. However, the need for a more unified developer experience is evident.
Tools that streamline compliance, make collaboration easier, and integrate AI responsibly will be critical in the coming years.