Tokenless

Interactive discovery

Explore the topic map

Follow the connections between themes, people, and ideas across the Tokenless archive in an interactive topic modeling map.

Machine Learning

Frontier results, on device - RL Nabors, Arize

RL Nabors discusses the significant costs associated with using frontier AI models, covering security, latency, and financial implications. She introduces a framework for right-sizing AI solutions by leveraging smaller, task-specific models and Small Language Models (SLMs). The framework details how to prove task feasibility, establish success criteria with golden datasets, conduct capability evaluations (using tools like Phoenix), and select the most appropriate "Small And Good Enough" (SAGE) model. Nabors further demonstrates how prompt engineering, particularly few-shot prompting, and post-processing can close performance gaps with larger models, while advocating for continuous regression evaluations to maintain performance integrity. The overarching message is to "prototype big, deploy small" to optimize AI deployments.
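
The evaluation loop described in this summary (golden dataset, few-shot prompt, post-processing, accuracy score) can be sketched roughly as follows. This is a minimal illustration, not the speaker's code and not the Phoenix API: `build_few_shot_prompt`, `normalize`, `accuracy`, and `stub_model` are hypothetical names, and `stub_model` stands in for a real small language model.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (input text, expected label)


def build_few_shot_prompt(task: str, shots: List[Example], query: str) -> str:
    """Assemble a prompt: task description, worked examples, then the query."""
    lines = [task, ""]
    for text, label in shots:
        lines += [f"Input: {text}", f"Label: {label}", ""]
    lines += [f"Input: {query}", "Label:"]
    return "\n".join(lines)


def normalize(raw: str) -> str:
    """Post-processing: keep the first word, lowercased, without punctuation."""
    words = raw.strip().split()
    return words[0].strip(".,!").lower() if words else ""


def accuracy(model: Callable[[str], str], task: str,
             shots: List[Example], golden: List[Example]) -> float:
    """Fraction of golden examples the model labels correctly after cleanup."""
    hits = sum(
        normalize(model(build_few_shot_prompt(task, shots, text))) == label
        for text, label in golden
    )
    return hits / len(golden)


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for a small model: a keyword rule with noisy output."""
    query = prompt.rsplit("Input: ", 1)[1]
    return "Positive." if "great" in query else "Negative, I think."


task = "Classify the review as positive or negative."
shots = [("love it", "positive"), ("broke on day one", "negative")]
golden = [
    ("great battery life", "positive"),
    ("screen cracked in a week", "negative"),
    ("not great at all", "negative"),  # the keyword rule gets this one wrong
    ("arrived late", "negative"),
]
score = accuracy(stub_model, task, shots, golden)
print(f"accuracy on golden set: {score:.2f}")
```

Re-running the same `accuracy` call after every prompt or model change is one simple way to get the regression evaluation the talk advocates.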

Research to Reality: Bringing Frontier ML Research to Production - Vaidas Razgaitis, Higharc

Vaidas Razgaitis, Senior Research Engineer at Higharc, shares three tactical tips to accelerate the transition of novel AI/ML research into production-ready features. He emphasizes addressing the critical handoff challenge between ML researchers and software engineers through structured documentation (Research Prototype Taxonomy Document), a well-organized monorepo utilizing decoupled microservices, and a systematic approach to code decomposition and PR review. These strategies aim to improve legibility, maintainability, and delivery speed for ML-driven products.

Uncertainty-Guided Data Augmentation for Engineers | Deep Dive - Yongmin Kwon

This session details a data-efficient method for training engineering surrogate models by using uncertainty quantification (UQ) to guide geometric data augmentation. Instead of random deformations, the approach lets the deep ensemble model identify its own knowledge gaps (epistemic uncertainty), then uses Free-Form Deformation (FFD) to generate new shapes specifically in those uncertain regions. This ensures every expensive simulation run yields maximally informative data, significantly improving model accuracy for a fixed computational budget across domains like structural mechanics and aerodynamics.
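
The selection step in this summary can be sketched as below, assuming that disagreement between ensemble members (prediction variance) serves as the epistemic-uncertainty signal. The toy linear surrogates, the candidate list, and the function names are hypothetical. The Free-Form Deformation step is not shown; in practice each selected parameter vector would drive an FFD of the baseline geometry before simulation.

```python
from statistics import pvariance
from typing import Callable, List, Sequence, Tuple

Model = Callable[[Sequence[float]], float]


def ensemble_disagreement(models: List[Model], x: Sequence[float]) -> float:
    """Variance of the members' predictions, a proxy for epistemic uncertainty."""
    return pvariance([m(x) for m in models])


def pick_most_uncertain(models: List[Model],
                        candidates: List[Tuple[float, ...]],
                        k: int) -> List[Tuple[float, ...]]:
    """Return the k candidate designs on which the ensemble disagrees most."""
    ranked = sorted(candidates,
                    key=lambda x: ensemble_disagreement(models, x),
                    reverse=True)
    return ranked[:k]


# Toy ensemble: three surrogates that agree near x = 0 and diverge away from it.
models = [lambda x, s=s: s * x[0] for s in (0.9, 1.0, 1.1)]

# Hypothetical deformation parameters proposed for the next simulation batch.
candidates = [(0.0,), (1.0,), (5.0,), (-3.0,)]

selected = pick_most_uncertain(models, candidates, k=2)
print(selected)  # the two designs worth spending simulation budget on
```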

Artificial Intelligence

Grant Sanderson (@3blue1brown) – AI and the future of math

Grant Sanderson and Dwarkesh Patel discuss AI's rapid but uneven progress in mathematics, exploring whether AI can achieve true conceptual breakthroughs, the challenge of measuring creativity, and the long-term implications for human understanding and the future roles of mathematicians. They delve into the unique 'grindability' of math for AI training, the potential of formalization, and why AI currently struggles with 'theory of mind' in writing, offering advice for students navigating an AI-transformed world.

Sustainable Augmented Development • Kent Beck • YOW! 2025

Kent Beck's presentation at YOW! Australia 2025 explores the transformative impact of augmented development ('the genie') on software engineering. He argues that AI shifts programming into an exploratory phase, requiring a re-evaluation of traditional practices. Beck introduces the concept of 'resting between the notes' to foster optionality over mere feature churn, and argues that junior developers become more valuable as AI tools turn into powerful learning aids. He stresses that 'nobody knows' the future, but that we must 'take our time' to adapt effectively.

How KV Cache Speeds Up LLMs for Faster AI Models on GPUs

LLMs often slow down under heavy traffic due to inefficient GPU memory management during inference. This overview explains how KV caching and PagedAttention, as implemented in vLLM, optimize memory usage across the prefill and decode phases. It also covers tuning techniques such as prefix caching and speculative decoding, which together raise throughput, reduce latency, and improve GPU utilization.
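
To make the memory pressure concrete, here is a back-of-the-envelope sketch of KV cache sizing and block-based allocation. The model shape (32 layers, 32 KV heads, head dimension 128, 16-bit values) is a hypothetical 7B-class configuration, and the 16-token block size is an assumed value for illustration; none of these numbers come from the talk.

```python
import math


def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, batch_size: int = 1,
                   bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for every layer and token."""
    # Leading 2: one key tensor plus one value tensor per layer.
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)


def paged_blocks(seq_len: int, block_size: int = 16) -> tuple:
    """Blocks needed for a sequence, and the unused slots in the last block."""
    blocks = math.ceil(seq_len / block_size)
    unused = blocks * block_size - seq_len
    return blocks, unused


per_token = kv_cache_bytes(32, 32, 128, seq_len=1)
full_context = kv_cache_bytes(32, 32, 128, seq_len=4096)
blocks, unused = paged_blocks(1000)

print(f"{per_token / 2**20:.1f} MiB per token")
print(f"{full_context / 2**30:.1f} GiB for a 4096-token sequence")
print(f"{blocks} blocks for 1000 tokens, {unused} slots unused")
```

With fixed-size blocks, waste is bounded by one partially filled block per sequence, whereas reserving a contiguous region for the maximum context length would leave most of it empty for short requests.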

Technology

Platforms: Build Abstractions, not Illusions • Gregor Hohpe • GOTO 2025

Gregor Hohpe explains the critical role of platforms in managing the growing cognitive load on developers due to complex distributed systems. He contrasts platforms, driven by "economies of speed" and fostering innovation through diversity, with traditional IT services and oversimplified abstractions that create dangerous illusions. Hohpe emphasizes building platforms that provide intuitive, domain-specific abstractions to solve real business problems, rather than just repackaging existing cloud services.

Full Stack Greenfield Projects: Are they still relevant?

Bharat Goenka, co-founder of Tally, discusses the company's unconventional approach to software development through "Full Stack Greenfield" projects. He explains why building every component from scratch, despite being a high-risk strategy, has been crucial for Tally's success in serving the SMB market, fostering extreme customer loyalty, and aspiring to connect 200 million businesses. The talk delves into the historical context, the philosophy of questioning and choosing constraints, and the distinction between product and custom engineering.

3‑2‑1 Backup Rule Explained: Protect Your Data from Disaster

Jeff Crume outlines essential data resiliency strategies, starting with the 3-2-1 backup rule—three copies, two media types, one offsite—and expanding to include immutable or air-gapped backups, rigorous testing, and encryption. He emphasizes these principles for robust disaster recovery, ransomware protection, and minimizing costly downtime, highlighting the trade-offs in achieving high availability.
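
The rule as stated in this summary (three copies, two media types, one offsite) can be expressed as a simple check over a backup inventory. This is a minimal sketch: the `BackupCopy` record and `check_321` function are hypothetical, and the extra measures mentioned (immutability, restore testing, encryption) are not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BackupCopy:
    name: str
    media: str     # e.g. "disk", "tape", "object-storage"
    offsite: bool  # stored away from the primary site


def check_321(copies: List[BackupCopy]) -> List[str]:
    """Return the list of 3-2-1 violations; an empty list means the plan passes."""
    problems = []
    if len(copies) < 3:
        problems.append("fewer than three copies")
    if len({c.media for c in copies}) < 2:
        problems.append("fewer than two media types")
    if not any(c.offsite for c in copies):
        problems.append("no offsite copy")
    return problems


good_plan = [
    BackupCopy("primary", "disk", offsite=False),
    BackupCopy("local-backup", "tape", offsite=False),
    BackupCopy("cloud-backup", "object-storage", offsite=True),
]
weak_plan = [
    BackupCopy("primary", "disk", offsite=False),
    BackupCopy("nas-mirror", "disk", offsite=False),
]

print(check_321(good_plan))
print(check_321(weak_plan))
```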


Recent Posts

Cognitive Exhaust Fumes, or: Read-Only AI Is Underrated — Šimon Podhajský, Head of AI, Waypoint

A deep dive into a "read-only" personal AI system that analyzes your digital footprint—or "cognitive exhaust fumes"—from sources like email, notes, and browsing history. The author argues that this observer approach provides more profound insights and is inherently safer than action-oriented AI agents, by preventing data contamination and mitigating the high-stakes risks of write-access errors.

Platforms for Humans and Machines: Engineering for the Age of Agents — Juan Herreros Elorza

This talk by Juan Herreros Elorza explores how to design internal developer platforms for a future where AI coding agents are first-class users. It argues that the same best practices that make platforms accessible to humans—self-service interfaces, well-defined APIs, local-first workflows, and rich observability—are now critical prerequisites for agents to autonomously build, debug, and ship software. The session provides concrete principles for platform design, discusses how to manage AI-assisted contributions, and emphasizes the need to measure the impact of these changes on developer productivity and system reliability.

Your Insecure MCP Server Won't Survive Production — Tun Shwe, Lenses

Lenses.io experts Tun Shwe and Jeremy Frenay discuss the significant security and design hurdles in transitioning Model Context Protocol (MCP) servers from local development to enterprise production. They introduce five core principles for secure agentic design, including shrinking the attack surface and constraining inputs, and detail the necessity of remote MCP servers with robust authentication. The talk provides an in-depth comparison of OAuth 2.1's Dynamic Client Registration (DCR) and the more secure Client ID Metadata Document (CIMD) approaches for managing agent identities, offering a roadmap for building enterprise-grade agentic AI systems with MCP.

Agentic Engineering & PINNs: AI for Simulation Engineers - James Shaw | Podcast #172

James Shaw, a mechanical engineer and Ansys channel partner, delves into the current and future impact of agentic AI and physics-informed neural networks (PINNs) on simulation workflows. He explores how AI is changing everything from tech support and model setup to the solver itself, particularly in CFD. The discussion also covers the implications for the engineering job market, the 'senior-junior inversion crisis', and the continued irreplaceability of skilled engineers due to the inherent physicality of the world, emphasizing the need for robust, trustworthy data to train AI.

Bending a Public MCP Server Without Breaking It — Nimrod Hauser, Baz

Learn practical strategies to adapt third-party MCP server tools for production AI applications. This talk covers five key practices: curating tools, enhancing descriptions, implementing deterministic guardrails, composing new tools from existing ones, and leveraging tools as simple functions, all demonstrated through a real-world "Spec Reviewer" example.

Extreme Harness Engineering for the 1B token/day Dark Factory — Ryan Lopopolo, OpenAI Frontier

Ryan Lopopolo of OpenAI's Frontier team discusses "Harness Engineering," a new paradigm where AI agents manage the entire software development lifecycle. He details an experiment building a 1M LOC product with zero human-written code, shifting the engineer's role from coding to designing systems and context for agents. The conversation covers the Symphony orchestration framework, the concept of "agent-legible" software, and the future of AI-driven development.
