DevOps September meetup


Wednesday, September 24, 2025

18:00 - 21:00

This event has already taken place.


About the event

Talk 1️⃣ Sirius Ivlev 🔗 LinkedIn
Sirius, Lead of the DevTools SRE Team 🛠️
AI in Engineering: The Trial-and-Error Method 🤖⚡

For almost three years now, we’ve been waiting for AI to replace us. But until that glorious day arrives ✨, what real value can it deliver? Let’s take a closer look at the pros and cons ⚖️, review research findings 📑, and share the results of our own experiments 🧪 in integrating AI tooling into our daily workflows.


Talk 2️⃣ Evgeny Arhipov 🔗
Head of scheduler services at Nebius 🌐
Managed Soperator: A modern, democratic approach to Slurm-based supercomputing 💻🚀

Pretraining and fine-tuning tasks often require a significant amount of interconnected processing power 🔋, also known as supercomputing. The go-to tool in the industry is Slurm 🧩, a project dating as far back as 1994 📅. We will discuss how we turned the traditionally very expensive 💰 and complex endeavour of running a Slurm cluster at scale into a breeze 🌬️. The result is a managed solution built on a modern tech stack 🏗️ using open source tools 👐, some existing and some newly built and contributed back to the community by Nebius 🤝.

Talk 3️⃣ Amrita Nair 🔗
ML, DevOps and SRE at Tricentis 🌎
🚀🔍 Building an LLM Observability Stack on AWS ☁️

In this talk, I’ll share how we built the foundational infrastructure 🏗️ for integrating an LLM observability SaaS 📊 into our internal co-pilot product 🤖. I’ll walk through how we designed the stack from scratch ✨ using fully AWS-native services: VPC networking 🌐, load balancers ⚖️, MSK (Kafka) 📨, RDS (PostgreSQL) 🗄️, and ClickHouse Cloud ☁️ with cross-region access 🌍 via AWS PrivateLink 🔒.

I’ll cover how we automated Kafka topic creation ⚡ with Lambda 🐑, provisioned secure 🔐, production-ready infra 🛡️ with Terraform 🌱, and connected it all to enable real-time ⏱️ prompt tracing 🧵. We will discuss pitfalls ⚠️ and lessons 📘; if you're interested in building AI-adjacent systems 🤝 without drowning in abstraction 🌊, this is a practical 🛠️, honest look 👀 at what it takes.

This was a greenfield project 🌱 developed in pre-production 🧪, offering insights 💡 into designing for scale 📈, reliability 🛠️, and future extensibility 🔮 without the pressure of live customer data 📦. Ideal for DevOps engineers 👨‍💻👩‍💻 curious about AI system architecture 🏛️ in the cloud ☁️.

Venue

WeWork DRN