Do you still need GitOps in the era of gen AI?

2025-09-10 ~1 min read www.redhat.com #kubernetes

⚡ TL;DR

If you've ever worked for or with enterprise companies, you know that when it comes to software, AI-powered or not, the stakes could not be higher. That is why they invest heavily in making their production environments as bulletproof as possible.

📝 Summary

If you've ever worked for or with enterprise companies, you know that when it comes to software, AI-powered or not, the stakes could not be higher. That is why they invest heavily in making their production environments as bulletproof as possible. They architect for high availability and disaster recovery, enforce strict service level agreements (SLAs), and build redundancy into every possible layer. But if their architecture doesn't also account for the potential for human error, is any of it worth the effort?

Time and again, we've seen catastrophic outages traced back to a wrong mouse click, a badly written command, or a rushed deployment. Let's review a few of them:

Amazon S3 outage (Feb 2017): An engineer mistyped a command intended to remove a few servers and instead took down critical S3 subsystems. The disruption affected GitHub, Slack, and other big players, lasted several hours, and caused losses estimated in the hundreds of millions of dollars.

Facebook global outage (Oct 2021): Engineers withdrew critical Border Gateway Protocol (BGP) routes during a backbone configuration change, causing Facebook, Instagram, and WhatsApp to vanish from the internet for nearly six hours. It was a lesson in how a single manual change can break world-class redundancy.

CrowdStrike Falcon update meltdown (July 2024): A single defective configuration file shipped in an automated security-agent update sent millions of Windows machines into a reboot loop, grounding airlines and disrupting banks and hospitals worldwide. Again, a preventable human mistake, just at internet scale.

As a personal example, I once worked with a customer whose datacenter cleaning staff unplugged a production rack server to plug in a vacuum cleaner. That predictable and preventable mistake caused six hours of serious downtime (I'm being vague only because the incident would otherwise be recognizable). The solution is not to fire the cleaner, but to install lockable power outlets so that critical equipment cannot be disconnected without a key.