Tokenless

Machine Learning

Frontier results, on device - RL Nabors, Arize

RL Nabors discusses the significant costs associated with using frontier AI models, covering security, latency, and financial implications. She introduces a framework for right-sizing AI solutions by leveraging smaller, task-specific models and Small Language Models (SLMs). The framework details how to prove task feasibility, establish success criteria with golden datasets, conduct capability evaluations (using tools like Phoenix), and select the most appropriate "Small And Good Enough" (SAGE) model. Nabors further demonstrates how prompt engineering, particularly few-shot prompting, and post-processing can close performance gaps with larger models, while advocating for continuous regression evaluations to maintain performance integrity. The overarching message is to "prototype big, deploy small" to optimize AI deployments.
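The selection step can be sketched roughly as follows: score each candidate against a golden dataset, then take the smallest model that clears the success threshold. Everything named here is hypothetical (the model names, the relative sizes, the `accuracy` and `pick_sage` helpers, the 0.9 threshold), and the "models" are stand-in functions rather than real inference calls. The sketch does not use the Phoenix API.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Golden dataset: (input, expected label) pairs. Hypothetical examples.
GOLDEN: List[Tuple[str, str]] = [
    ("refund my order", "billing"),
    ("app crashes on login", "bug"),
    ("charge appeared twice", "billing"),
    ("button does nothing", "bug"),
]

Model = Callable[[str], str]


def accuracy(model: Model, dataset: List[Tuple[str, str]]) -> float:
    """Fraction of golden examples the model labels correctly."""
    correct = sum(1 for text, expected in dataset if model(text) == expected)
    return correct / len(dataset)


def pick_sage(
    candidates: Dict[str, Tuple[int, Model]],
    dataset: List[Tuple[str, str]],
    threshold: float,
) -> Optional[str]:
    """Return the smallest candidate whose accuracy meets the threshold."""
    by_size = sorted(candidates.items(), key=lambda item: item[1][0])
    for name, (_size, model) in by_size:
        if accuracy(model, dataset) >= threshold:
            return name
    return None


# Stand-in "models" (assumptions for illustration only).
def tiny(text: str) -> str:
    return "billing"  # always answers the same label


def small(text: str) -> str:
    return "billing" if ("refund" in text or "charge" in text) else "bug"


# Name -> (relative size, model). Sizes are made-up ordering keys.
CANDIDATES: Dict[str, Tuple[int, Model]] = {
    "frontier": (1000, small),
    "small-3b": (3, small),
    "tiny-1b": (1, tiny),
}

chosen = pick_sage(CANDIDATES, GOLDEN, threshold=0.9)
```

Rerunning the same `accuracy` check after every prompt or model change is one simple way to get the regression evaluation the talk advocates.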

Research to Reality: Bringing Frontier ML Research to Production - Vaidas Razgaitis, Higharc

Vaidas Razgaitis, Senior Research Engineer at Higharc, shares three tactical tips to accelerate the transition of novel AI/ML research into production-ready features. He emphasizes addressing the critical handoff challenge between ML researchers and software engineers through structured documentation (Research Prototype Taxonomy Document), a well-organized monorepo utilizing decoupled microservices, and a systematic approach to code decomposition and PR review. These strategies aim to improve legibility, maintainability, and delivery speed for ML-driven products.

Uncertainty-Guided Data Augmentation for Engineers | Deep Dive - Yongmin Kwon

This session details a data-efficient method for training engineering surrogate models by using uncertainty quantification (UQ) to guide geometric data augmentation. Instead of random deformations, the approach lets the deep ensemble model identify its own knowledge gaps (epistemic uncertainty), then uses Free-Form Deformation (FFD) to generate new shapes specifically in those uncertain regions. This ensures every expensive simulation run yields maximally informative data, significantly improving model accuracy for a fixed computational budget across domains like structural mechanics and aerodynamics.
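A minimal sketch of the selection idea, assuming the variance of predictions across ensemble members is used as the proxy for epistemic uncertainty. The toy ensemble, the scalar candidates, and the helper names are hypothetical. In the method described, each candidate would be a set of FFD control-point displacements and each model a trained surrogate; the deformation and simulation steps are not implemented here.

```python
import statistics
from typing import Callable, List, Sequence

Surrogate = Callable[[float], float]


def ensemble_uncertainty(models: Sequence[Surrogate], x: float) -> float:
    """Uncertainty proxy: variance of the ensemble members' predictions."""
    return statistics.pvariance([model(x) for model in models])


def select_most_uncertain(
    models: Sequence[Surrogate], candidates: Sequence[float], k: int
) -> List[float]:
    """Pick the k candidates on which the ensemble disagrees most."""
    ranked = sorted(
        candidates,
        key=lambda x: ensemble_uncertainty(models, x),
        reverse=True,
    )
    return ranked[:k]


# Toy ensemble: members agree at x = 0 and diverge as |x| grows.
ENSEMBLE: List[Surrogate] = [
    lambda x: 1.0 * x,
    lambda x: 1.5 * x,
    lambda x: 0.5 * x,
]

candidates = [0.0, 1.0, -3.0, 2.0]

# Only these candidates would be sent to the expensive simulator.
picked = select_most_uncertain(ENSEMBLE, candidates, k=2)
```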

Artificial Intelligence

Grant Sanderson (@3blue1brown) – AI and the future of math

Grant Sanderson and Dwarkesh Patel discuss AI's rapid but uneven progress in mathematics, exploring whether AI can achieve true conceptual breakthroughs, the challenge of measuring creativity, and the long-term implications for human understanding and the future roles of mathematicians. They delve into the unique 'grindability' of math for AI training, the potential of formalization, and why AI currently struggles with 'theory of mind' in writing, offering advice for students navigating an AI-transformed world.

Sustainable Augmented Development • Kent Beck • YOW! 2025

Kent Beck's presentation at YOW! Australia 2025 explores the transformative impact of augmented development ('the genie') on software engineering. He argues that AI shifts programming into an exploratory phase, requiring a re-evaluation of traditional practices. Beck introduces the concept of 'resting between the notes' to foster optionality over mere feature churn, and argues that junior developers become more valuable as AI tools become powerful learning aids. He stresses that 'nobody knows' the future, so we must 'take our time' to adapt effectively.

How KV Cache Speeds Up LLMs for Faster AI Models on GPUs

LLMs often slow down under heavy traffic due to inefficient GPU memory management during inference. This overview explains how the KV cache and PagedAttention, as implemented in vLLM, optimize memory usage across the prefill and decode phases. The result is higher LLM throughput, lower latency, and better GPU utilization, helped further by tuning techniques such as prefix caching and speculative decoding.
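To make the caching idea concrete, here is a toy single-head attention decoder in plain Python. It is only a sketch under stated assumptions: the "projections" are stand-ins (identity for keys, doubling for values), the `KVCache` class and function names are hypothetical, and nothing here models PagedAttention's block allocation or the vLLM API. It shows just one thing: with a cache, each token's key and value are computed once instead of once per decode step.

```python
import math
from typing import List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def attend(query: Vector, keys: List[Vector], values: List[Vector]) -> Vector:
    """Softmax attention of one query over all keys and values."""
    scores = [dot(query, key) / math.sqrt(len(query)) for key in keys]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(score - peak) for score in scores]
    total = sum(weights)
    dim = len(values[0])
    return [
        sum(w * value[i] for w, value in zip(weights, values)) / total
        for i in range(dim)
    ]


class KVCache:
    """Stores the key and value of every token seen so far."""

    def __init__(self) -> None:
        self.keys: List[Vector] = []
        self.values: List[Vector] = []
        self.projections = 0  # number of K/V computations performed

    def append(self, token: Vector) -> None:
        # Toy projections (assumption): identity keys, doubled values.
        self.keys.append(list(token))
        self.values.append([2.0 * x for x in token])
        self.projections += 1


def decode_with_cache(tokens: List[Vector]) -> Tuple[List[Vector], int]:
    cache = KVCache()
    outputs = []
    for token in tokens:
        cache.append(token)  # only the new token is projected
        outputs.append(attend(token, cache.keys, cache.values))
    return outputs, cache.projections


def decode_without_cache(tokens: List[Vector]) -> Tuple[List[Vector], int]:
    outputs = []
    projections = 0
    for step in range(1, len(tokens) + 1):
        cache = KVCache()
        for token in tokens[:step]:  # recompute K/V for the whole prefix
            cache.append(token)
        projections += cache.projections
        outputs.append(attend(tokens[step - 1], cache.keys, cache.values))
    return outputs, projections


TOKENS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
cached_out, cached_cost = decode_with_cache(TOKENS)
naive_out, naive_cost = decode_without_cache(TOKENS)
```

Both paths produce the same outputs, but the uncached path's projection count grows quadratically with sequence length while the cached path's grows linearly. The price is memory: the cache holds one key and one value per token, which is the memory that paging schemes aim to manage efficiently.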

Technology

Platforms: Build Abstractions, not Illusions • Gregor Hohpe • GOTO 2025

Gregor Hohpe explains the critical role of platforms in managing the growing cognitive load on developers due to complex distributed systems. He contrasts platforms, driven by "economies of speed" and fostering innovation through diversity, with traditional IT services and oversimplified abstractions that create dangerous illusions. Hohpe emphasizes building platforms that provide intuitive, domain-specific abstractions to solve real business problems, rather than just repackaging existing cloud services.

Full Stack Greenfield Projects: Are they still relevant?

Bharat Goenka, co-founder of Tally, discusses the company's unconventional approach to software development through "Full Stack Greenfield" projects. He explains why building every component from scratch, despite being a high-risk strategy, has been crucial for Tally's success in serving the SMB market, fostering extreme customer loyalty, and aspiring to connect 200 million businesses. The talk delves into the historical context, the philosophy of questioning and choosing constraints, and the distinction between product and custom engineering.

3‑2‑1 Backup Rule Explained: Protect Your Data from Disaster

Jeff Crume outlines essential data resiliency strategies, starting with the 3-2-1 backup rule—three copies, two media types, one offsite—and expanding to include immutable or air-gapped backups, rigorous testing, and encryption. He emphasizes these principles for robust disaster recovery, ransomware protection, and minimizing costly downtime, highlighting the trade-offs in achieving high availability.
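The base rule is simple enough to express as a check. The sketch below is illustrative only: the `BackupCopy` type, its fields, and the media labels are hypothetical, and it covers just the three-two-one counts, not immutability, testing, or encryption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BackupCopy:
    media: str     # hypothetical labels, e.g. "disk", "tape", "cloud"
    offsite: bool  # True if stored away from the primary site


def satisfies_3_2_1(copies: List[BackupCopy]) -> bool:
    """True if there are >= 3 copies, on >= 2 media types, with >= 1 offsite."""
    enough_copies = len(copies) >= 3
    enough_media = len({copy.media for copy in copies}) >= 2
    has_offsite = any(copy.offsite for copy in copies)
    return enough_copies and enough_media and has_offsite


compliant = [
    BackupCopy(media="disk", offsite=False),
    BackupCopy(media="tape", offsite=False),
    BackupCopy(media="cloud", offsite=True),
]

# Three copies on two media types, but nothing stored offsite.
all_onsite = [
    BackupCopy(media="disk", offsite=False),
    BackupCopy(media="disk", offsite=False),
    BackupCopy(media="tape", offsite=False),
]

ok = satisfies_3_2_1(compliant)
not_ok = satisfies_3_2_1(all_onsite)
```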


Recent Posts

Judge the Judge: Building LLM Evaluators That Actually Work with GEPA — Mahmoud Mabrouk, Agenta AI

This workshop by Mahmoud Mabrouk, CEO of Agenta AI, delves into building calibrated LLM-as-a-judge evaluations that reliably align with human judgment. It highlights how miscalibrated judges lead to false confidence and presents a practical workflow, including designing use-case specific metrics, detailed data annotation, and optimizing judge prompts using the GEPA algorithm. The talk emphasizes the importance of iterative debugging, model selection, and custom reflection templates for achieving trustworthy and effective LLM evaluations.
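One way to quantify how well a judge aligns with human judgment is chance-corrected agreement on an annotated set. The summary does not say which metric the workshop uses, so Cohen's kappa is swapped in here as a common choice; the labels and the `cohens_kappa` helper are hypothetical, and this sketch does not implement GEPA.

```python
from collections import Counter
from typing import List


def cohens_kappa(human: List[str], judge: List[str]) -> float:
    """Chance-corrected agreement between human labels and judge labels."""
    if not human or len(human) != len(judge):
        raise ValueError("label lists must be non-empty and equal in length")
    n = len(human)
    observed = sum(h == j for h, j in zip(human, judge)) / n
    human_counts = Counter(human)
    judge_counts = Counter(judge)
    labels = set(human) | set(judge)
    # Agreement expected if both raters labeled at random with these rates.
    expected = sum(human_counts[l] * judge_counts[l] for l in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical annotated sample: human verdicts versus LLM-judge verdicts.
human_labels = ["pass", "pass", "fail", "fail"]
judge_labels = ["pass", "pass", "fail", "pass"]

kappa = cohens_kappa(human_labels, judge_labels)
```

Raw agreement here is 75%, but kappa is lower because a judge that leans toward "pass" would match some human labels by chance. Tracking a score like this across judge prompt revisions gives a rough signal of whether calibration is improving.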

Ideas: Steering AI toward the work future we want

Microsoft researchers unpack the New Future of Work Report 2025, exploring AI's real-world impact. They discuss adoption trends, the shift in job tasks, and the crucial distinction between viewing AI as a simple tool versus a collaborator. The conversation emphasizes moving beyond pure efficiency to consciously design a future where AI supports human flourishing and meaningful work.

Building Agentic Applications with Spring AI • Matthew Meckes • GOTO 2025

Matthew Meckes from AWS makes a compelling case for Java's central role in the future of enterprise AI. This talk explores how Spring AI empowers developers to build robust, production-ready agentic applications by integrating LLMs with existing Java services, moving beyond proofs-of-concept to solve real-world business problems.

How AI Agents Will Transform the Financial System with Circle Co-Founder and CEO Jeremy Allaire

Circle CEO Jeremy Allaire delves into how programmable money and the Arc blockchain will power the emerging AI agentic economy. He explains how stablecoins like USDC offer an internet-native financial infrastructure for micro-transactions and large settlements, addressing the limitations of traditional banking for AI agents. The discussion covers the foundational principles of full-reserve banking, the unique attributes of Arc blockchain for machine-driven economic activity, the tokenization of real-world assets, and a bold vision for AI's potential to drive double-digit GDP growth and foster new on-chain organizational structures within the next decade.

From Neural Networks to Digital Brains: The Next Leap in AI • Daniel Lütgehetmann • GOTO 2025

Daniel Lütgehetmann of inait introduces "digital brains," biologically accurate computational models of real brains, as a solution to current AI's limitations in physical world interaction. Unlike traditional AI that struggles with dynamic environments and skill accumulation, these digital brains leverage biologically inspired learning rules to achieve dramatically faster learning in robotics and complex systems, demonstrating potential for real-world adaptability and efficiency.

From Chaos to Choreography: Multi-Agent Orchestration Patterns That Actually Work — Sandipan Bhaumik

Sandipan Bhaumik from Databricks explains that scaling from one to many AI agents is a distributed systems problem, not an AI one. He details common architectural anti-patterns like shared mutable state that cause race conditions and silent failures. The talk provides a practical framework based on distributed systems engineering, covering crucial patterns like choreography vs. orchestration, immutable state management with versioning, data contracts, and failure recovery using circuit breakers and compensation (Saga) patterns. Bhaumik illustrates how to build a robust, production-grade multi-agent architecture using tools like Databricks, LangGraph, and MLflow.
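The compensation (Saga) pattern mentioned above can be sketched in a few lines: each step pairs an action with an undo, and a failure triggers the undos of all completed steps in reverse order. This is a generic illustration with hypothetical step names and a hypothetical `run_saga` helper. It does not use Databricks, LangGraph, or MLflow, and a production version would also need retries, persistence, and idempotent compensations.

```python
from typing import Callable, List, Tuple

# A step is (name, action, compensation).
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step], log: List[str]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[Step] = []
    for step in steps:
        name, action, _compensate = step
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            for done_name, _action, undo in reversed(completed):
                undo()
                log.append(f"undone:{done_name}")
            return False
        log.append(f"done:{name}")
        completed.append(step)
    return True


def noop() -> None:
    """Stand-in for a successful agent call or its undo."""


def fail() -> None:
    """Stand-in for an agent call that errors out."""
    raise RuntimeError("agent timed out")


events: List[str] = []
succeeded = run_saga(
    [
        ("reserve", noop, noop),
        ("enrich", noop, noop),
        ("publish", fail, noop),
    ],
    events,
)
```

Because the log records every completion, failure, and undo, a partial failure is visible rather than silent, which is the failure mode the talk warns about.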
