CI/CD

CI/CD Evolution: From Pipelines to AI-Powered DevOps • Olaf Molenveld & Julian Wood • GOTO 2025

Olaf Molenveld (CircleCI) and Julian Wood (AWS) discuss the evolution of CI/CD practices. They draw parallels between managing production code and the 'factory' that produces it, covering optimization strategies, local vs. remote development, the rise of platform engineering, and how AI is reshaping DevOps by acting as both an expert system and a guarded collaborator.

Evals Are Not Unit Tests — Ido Pesok, Vercel v0

Ido Pesok from Vercel explains why LLM-based applications often fail in production despite successful demos, and presents a systematic framework for building reliable AI systems using application-layer evaluations ("evals").

The Cloud Native Attitude • Anne Currie & Sarah Wells

Authors Anne Currie and Sarah Wells discuss the core principles of "The Cloud Native Attitude", defining it not as a specific technology stack but as a cultural mindset focused on removing bottlenecks and enabling rapid, iterative change. The summary covers the primacy of CI/CD, the evolution of orchestrators like Kubernetes, and how a cloud native approach is a critical enabler for building sustainable, green software.

Ship Agents that Ship: A Hands-On Workshop - Kyle Penfound, Jeremy Adams, Dagger

This summary covers a workshop on building and deploying production-minded AI coding agents with Dagger. The session shows how to create controlled, observable, and test-driven agent workflows and integrate them into CI/CD systems such as GitHub Actions for automated, reliable software development.

Strategies for LLM Evals (GuideLLM, lm-eval-harness, OpenAI Evals Workshop) — Taylor Jordan Smith

Traditional benchmarks and leaderboards are insufficient for production AI. This summary details a practical, multi-layered evaluation strategy, moving from foundational system performance to factual accuracy and finally to safety and bias, using open-source tools like GuideLLM, lm-eval-harness, and Promptfoo.