Hardware acceleration

Accelerating AI on Edge — Chintan Parikh and Weiyi Wang, Google DeepMind

A deep dive into Google's AI Edge stack for on-device AI, covering the new Gemma 4 models, the LiteRT framework for cross-platform deployment, and practical use cases in agent skills, tool calling, and hardware acceleration on CPUs, GPUs, and NPUs.

CROSS — Leveraging AI ASICs for Homomorphic Encryption

The talk presents CROSS and Morph, two compiler frameworks that let existing AI accelerators, such as Google's TPUs, execute cryptographic workloads efficiently. CROSS targets Homomorphic Encryption (HE) and Morph targets Zero-Knowledge Proofs (ZKP). Both transform high-precision modular arithmetic into the low-precision matrix operations that TPUs excel at, achieving state-of-the-art performance and energy efficiency without any hardware modifications.
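To make the idea of turning high-precision modular arithmetic into low-precision matrix operations concrete, here is a minimal sketch. It is not the CROSS or Morph algorithm: the limb width, the function names (`to_limbs`, `toeplitz_from_limbs`, `mulmod_via_matmul`) and the use of NumPy are all assumptions chosen for illustration. A 32-bit operand is split into 8-bit limbs, the limb convolution is written as a matrix-vector product, and the wide result is recombined and reduced afterwards.

```python
import numpy as np

LIMB_BITS = 8   # assumed limb width; a real accelerator mapping may differ
NUM_LIMBS = 4   # 4 x 8-bit limbs cover one 32-bit operand


def to_limbs(x, n=NUM_LIMBS):
    """Split a non-negative integer into n little-endian 8-bit limbs."""
    return np.array([(x >> (LIMB_BITS * i)) & 0xFF for i in range(n)],
                    dtype=np.int64)


def toeplitz_from_limbs(a_limbs):
    """Build the (2n-1) x n matrix whose product with b's limbs is the
    limb-wise convolution of a and b."""
    n = len(a_limbs)
    T = np.zeros((2 * n - 1, n), dtype=np.int64)
    for j in range(n):
        T[j:j + n, j] = a_limbs  # column j holds a's limbs shifted down by j
    return T


def mulmod_via_matmul(a, b, q):
    """Compute (a * b) mod q using a small-entry matrix product."""
    T = toeplitz_from_limbs(to_limbs(a))
    partial = T @ to_limbs(b)  # entries are 8-bit; each sum stays below 2**18
    # Recombine the partial sums with their limb weights, then reduce.
    product = sum(int(p) << (LIMB_BITS * k) for k, p in enumerate(partial))
    return product % q


if __name__ == "__main__":
    q = 4294967291  # 2**32 - 5, used here only as an example modulus
    print(mulmod_via_matmul(0xDEADBEEF, 0x12345678, q))
```

Stacking the limb vectors of many operands as columns turns the matrix-vector product into a matrix-matrix product, which is the shape of work a matrix unit is built for. The recombination and reduction steps are done here with Python integers for clarity.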

Hardware Realization and Implementation Security Evaluation of HQC, A NIST PQC Standard

This talk by Sanjay Deshpande from Northwestern University explores the critical transition to Post-Quantum Cryptography (PQC) in response to the threat quantum computers pose to current public-key algorithms. It provides a deep dive into the Hamming Quasi-Cyclic (HQC) algorithm, a code-based scheme selected by NIST for standardization. The session focuses on the challenges and innovations in creating efficient and secure hardware implementations of HQC, covering performance optimization for polynomial multiplication and countermeasures against side-channel attacks.

Efficient Homomorphic Integer Computer from CKKS

A deep dive into the hardware design and implementation of HQC, a post-quantum cryptography scheme. The talk covers performance and security bottlenecks, detailing novel solutions for efficient polynomial multiplication by leveraging sparsity and constant-time methods for generating fixed-weight vectors to thwart side-channel attacks.
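As an illustration of how sparsity can speed up polynomial multiplication, the sketch below multiplies a dense polynomial by a sparse one in GF(2)[x]/(x^n - 1), the kind of ring HQC works in. It is not the design presented in the talk: the function names (`rotl`, `sparse_mul`) and the representation (a Python integer as a bit vector, the sparse operand as a list of exponents) are assumptions made for this example. Each nonzero term of the sparse operand contributes one cyclic shift of the dense operand, so the cost scales with the weight rather than with n.

```python
def rotl(x, s, n):
    """Cyclically rotate the n-bit value x left by s positions,
    i.e. multiply the polynomial x by X**s modulo X**n - 1."""
    s %= n
    mask = (1 << n) - 1
    return ((x << s) | (x >> (n - s))) & mask


def sparse_mul(dense, support, n):
    """Multiply `dense` (bit i = coefficient of X**i) by the sparse
    polynomial whose nonzero exponents are listed in `support`."""
    acc = 0
    for pos in support:
        acc ^= rotl(dense, pos, n)  # addition in GF(2) is XOR
    return acc


if __name__ == "__main__":
    # (1 + X) * X**4 in GF(2)[X]/(X**5 - 1) equals X**4 + 1
    print(bin(sparse_mul(0b00011, [4], 5)))  # 0b10001
```

This sketch makes no attempt at constant-time behavior: Python integer operations and shifts indexed by secret positions can leak timing information, which is exactly the kind of side channel a hardened hardware implementation has to close.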