Tokenless

Machine Learning

Frontier results, on device - RL Nabors, Arize

RL Nabors discusses the significant costs associated with using frontier AI models, covering security, latency, and financial implications. She introduces a framework for right-sizing AI solutions by leveraging smaller, task-specific models and Small Language Models (SLMs). The framework details how to prove task feasibility, establish success criteria with golden datasets, conduct capability evaluations (using tools like Phoenix), and select the most appropriate "Small And Good Enough" (SAGE) model. Nabors further demonstrates how prompt engineering, particularly few-shot prompting, and post-processing can close performance gaps with larger models, while advocating for continuous regression evaluations to maintain performance integrity. The overarching message is to "prototype big, deploy small" to optimize AI deployments.

Research to Reality: Bringing Frontier ML Research to Production - Vaidas Razgaitis, Higharc

Vaidas Razgaitis, Senior Research Engineer at Higharc, shares three tactical tips to accelerate the transition of novel AI/ML research into production-ready features. He emphasizes addressing the critical handoff challenge between ML researchers and software engineers through structured documentation (Research Prototype Taxonomy Document), a well-organized monorepo utilizing decoupled microservices, and a systematic approach to code decomposition and PR review. These strategies aim to improve legibility, maintainability, and delivery speed for ML-driven products.

Uncertainty-Guided Data Augmentation for Engineers | Deep Dive - Yongmin Kwon

This session details a data-efficient method for training engineering surrogate models by using uncertainty quantification (UQ) to guide geometric data augmentation. Instead of random deformations, the approach lets the deep ensemble model identify its own knowledge gaps (epistemic uncertainty), then uses Free-Form Deformation (FFD) to generate new shapes specifically in those uncertain regions. This ensures every expensive simulation run yields maximally informative data, significantly improving model accuracy for a fixed computational budget across domains like structural mechanics and aerodynamics.
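
The selection step in this summary can be sketched roughly as follows, assuming disagreement between ensemble members (variance of their predictions) serves as the proxy for epistemic uncertainty. The function name, data layout, and numbers are hypothetical illustrations, not taken from the session; the Free-Form Deformation step and the simulator are not shown.

```python
from statistics import pvariance


def select_most_uncertain(ensemble_predictions, k):
    """Return indices of the k candidates where ensemble members disagree most.

    ensemble_predictions[m][i] is member m's prediction for candidate shape i
    (hypothetical layout chosen for this sketch).
    """
    n_candidates = len(ensemble_predictions[0])
    # Variance across members for each candidate: high variance = knowledge gap.
    variances = [
        pvariance([member[i] for member in ensemble_predictions])
        for i in range(n_candidates)
    ]
    ranked = sorted(range(n_candidates), key=lambda i: variances[i], reverse=True)
    return ranked[:k]


# Three ensemble members scoring four candidate shapes (made-up values).
preds = [
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 2.5, 3.0, 6.0],
    [1.0, 1.5, 3.0, 2.0],
]
chosen = select_most_uncertain(preds, k=2)
```

Only the chosen candidates would then be sent to the expensive simulation, which is where the fixed compute budget is spent.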

Artificial Intelligence

Sustainable Augmented Development • Kent Beck • YOW! 2025

Kent Beck's presentation at YOW! Australia 2025 explores the transformative impact of augmented development ('the genie') on software engineering. He argues that AI shifts programming into an exploratory phase, requiring a re-evaluation of traditional practices. Beck introduces the concept of 'resting between the notes' to foster optionality over mere feature churn, and makes a compelling case for the increased value of junior developers as AI tools become powerful learning aids, emphasizing that 'nobody knows' the future but we must 'take our time' to adapt effectively.

How KV Cache Speeds Up LLMs for Faster AI Models on GPUs

LLMs often slow down under heavy traffic due to inefficient GPU memory management during inference. This overview explains how the KV cache and PagedAttention, as implemented in vLLM, optimize memory usage across the prefill and decode phases. The result is higher throughput, lower latency, and better GPU utilization, achieved through improved context handling and tuning techniques such as prefix caching and speculative decoding.
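
To make the memory pressure concrete, here is a back-of-the-envelope sketch of KV cache sizing. It uses the common estimate of one key and one value vector per layer, per KV head, per token. The model dimensions and the block size of 16 tokens are illustrative assumptions, not figures from the video.

```python
import math


def kv_cache_bytes(num_tokens, num_layers, num_kv_heads, head_dim, bytes_per_element):
    """Estimate KV cache size: 2 tensors (K and V) per layer for every token."""
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_element
    return per_token * num_tokens


def blocks_needed(num_tokens, block_size):
    """Number of fixed-size cache blocks a sequence occupies in a paged layout."""
    return math.ceil(num_tokens / block_size)


# Hypothetical model: 32 layers, 8 KV heads, head_dim 128, fp16 (2 bytes).
per_token = kv_cache_bytes(1, 32, 8, 128, 2)
full_context = kv_cache_bytes(4096, 32, 8, 128, 2)
blocks = blocks_needed(100, block_size=16)
```

Under these assumptions each token costs 128 KiB and a 4096-token sequence costs 512 MiB, which is why allocating cache in small blocks on demand, rather than reserving the maximum context up front, leaves room for more concurrent requests.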

The AI Agents Helping Home Services Book More Jobs

Avoca (YC W23) has achieved eight-figure revenue and a $1 billion valuation by building an AI workforce for home services, turning missed calls into revenue. Founders Apurva Shrivastava and Tyson Chen explain how AI expands software's market share beyond 1% by automating labor and operational costs, leading to a 15x larger opportunity. They emphasize that their AI agents augment human workers, reducing attrition in challenging CSR roles and creating new positions for training AI, driven by a deep customer obsession learned at YC.

Technology

Platforms: Build Abstractions, not Illusions • Gregor Hohpe • GOTO 2025

Gregor Hohpe explains the critical role of platforms in managing the growing cognitive load on developers due to complex distributed systems. He contrasts platforms, driven by "economies of speed" and fostering innovation through diversity, with traditional IT services and oversimplified abstractions that create dangerous illusions. Hohpe emphasizes building platforms that provide intuitive, domain-specific abstractions to solve real business problems, rather than just repackaging existing cloud services.

Full Stack Greenfield Projects: Are they still relevant?

Bharat Goenka, co-founder of Tally, discusses the company's unconventional approach to software development through "Full Stack Greenfield" projects. He explains why building every component from scratch, despite being a high-risk strategy, has been crucial for Tally's success in serving the SMB market, fostering extreme customer loyalty, and aspiring to connect 200 million businesses. The talk delves into the historical context, the philosophy of questioning and choosing constraints, and the distinction between product and custom engineering.

3‑2‑1 Backup Rule Explained: Protect Your Data from Disaster

Jeff Crume outlines essential data resiliency strategies, starting with the 3-2-1 backup rule—three copies, two media types, one offsite—and expanding to include immutable or air-gapped backups, rigorous testing, and encryption. He emphasizes these principles for robust disaster recovery, ransomware protection, and minimizing costly downtime, highlighting the trade-offs in achieving high availability.
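
The core rule is simple enough to check mechanically. The following is a minimal sketch, assuming a backup inventory is available as a list of records; the `BackupCopy` type and its fields are hypothetical, and the additional practices mentioned above (immutability, testing, encryption) are not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str      # label for the copy (hypothetical field)
    media: str     # e.g. "disk", "tape", "cloud"
    offsite: bool  # True if stored away from the primary site


def satisfies_3_2_1(copies):
    """Check three copies, two media types, and at least one offsite copy."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


inventory = [
    BackupCopy("primary", "disk", offsite=False),
    BackupCopy("local-nas", "disk", offsite=False),
    BackupCopy("cloud-archive", "cloud", offsite=True),
]
ok = satisfies_3_2_1(inventory)
```

Dropping the offsite copy, or keeping all three copies on the same media type, makes the check fail.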


Recent Posts

Paperclip: Open Source Human Control Plane for AI Labor — Dotta Bippa

Dotta, the creator of Paperclip, introduces it as an open-source orchestrator for building "zero-human companies." This talk demonstrates how to set up an organization of AI agents, leverage skills and custom instructions for reliable work, and automate business processes. Through a live demo, Dotta showcases creating a company from scratch, managing agent workflows with QA and routines, and outlines the exciting future roadmap for the platform.

Jensen Huang – Will Nvidia’s moat persist?

Nvidia CEO Jensen Huang discusses the company's core strategy, which he defines as transforming electrons into tokens by orchestrating a vast supply chain. He details how Nvidia's true moat lies in its ecosystem and its ability to manage supply bottlenecks. Huang contrasts Nvidia's versatile 'accelerated computing' platform with competitors like TPUs, arguing programmability via CUDA is key to AI innovation. He also presents a strong case against broad AI chip export controls on China, warning it could backfire by forcing the creation of a competing tech stack. Finally, he explains why Nvidia invests in the ecosystem rather than becoming a hyperscaler itself.

Why Uber, Nissan, and Mercedes Chose This Self-Driving Startup | Alex Kendall, Wayve

Wayve CEO Alex Kendall discusses their contrarian, AI-first approach to autonomous driving. He explains their journey from a garage prototype using reinforcement learning to developing a generalizable AI driver that has driven zero-shot in over 500 cities. Kendall emphasizes a strategy focused on licensing this embodied AI for mass-market consumer vehicles—a 100-million-unit-per-year opportunity—rather than building bespoke robotaxis, arguing that the future is an AI that can drive any car, anywhere.

From Renting Machines by the Hour to Renting Capabilities by the MSeconds • Dhaval Nagar • GOTO 2025

Dhaval Nagar chronicles the evolution of cloud economics from hourly-billed virtual machines a decade ago to the current 'capability economy.' The talk is structured in three acts, detailing the journey from the initial launch of AWS Lambda, through the maturation of the serverless ecosystem with frameworks and new platforms, to the present day where complex capabilities like AI models are consumed as millisecond-metered APIs. This shift demands a new developer mindset focused on composing services, event-driven architecture, and eliminating infrastructure management.

Enter the Matrix • Conor Hoekstra • YOW! 2025

Conor Hoekstra demonstrates how to achieve exponential productivity by combining AI-assisted development, array programming, and high-performance computing. Using a financial dashboard app built entirely with AI (Vibe Coding), he showcases a custom array-based DSL with a dual backend (interpreted BQN and compiled NVIDIA Parrot for GPUs), urging developers to fully embrace modern tools and elevate their expectations of what is possible.

Judge the Judge: Building LLM Evaluators That Actually Work with GEPA — Mahmoud Mabrouk, Agenta AI

This workshop by Mahmoud Mabrouk, CEO of Agenta AI, delves into building calibrated LLM-as-a-judge evaluations that reliably align with human judgment. It highlights how miscalibrated judges lead to false confidence and presents a practical workflow, including designing use-case specific metrics, detailed data annotation, and optimizing judge prompts using the GEPA algorithm. The talk emphasizes the importance of iterative debugging, model selection, and custom reflection templates for achieving trustworthy and effective LLM evaluations.
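
One way to quantify how well a judge aligns with human annotations is chance-corrected agreement. The sketch below computes Cohen's kappa between human labels and judge labels. It is an illustration of the measurement idea only: the workshop's own metrics are not specified in this summary, the labels are made up, and the GEPA prompt-optimization step is not shown.

```python
from collections import Counter


def cohen_kappa(human, judge):
    """Chance-corrected agreement between two label sequences of equal length."""
    if len(human) != len(judge) or not human:
        raise ValueError("label lists must be non-empty and the same length")
    n = len(human)
    observed = sum(h == j for h, j in zip(human, judge)) / n
    human_counts = Counter(human)
    judge_counts = Counter(judge)
    # Agreement expected if both raters labeled independently at their own rates.
    expected = sum(human_counts[label] * judge_counts[label] for label in human_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


human_labels = ["pass", "pass", "fail", "fail"]
judge_labels = ["pass", "fail", "fail", "fail"]
kappa = cohen_kappa(human_labels, judge_labels)
```

Here raw agreement is 75%, but kappa is noticeably lower because a judge that leans toward "fail" would agree with the humans fairly often by chance alone.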
