Boekhandel Douwes Den Haag

Synthesis Lectures on Computer Science

Minimizing Data Movement and Parameter Count Across the Machine Learning Stack

Everything is a Matrix

Andrew Sabot

New


No delivery time is known.

€ 49,00

Please check whether the book is available before you order.


Description: Synthesis Lectures on Computer Science: Minimizing Data Movement and Parameter Count Across the Machine Learning Stack

This book provides a focused, research-forward guide to making large AI models efficient in practice and presents an array of novel techniques to reduce memory footprint, accelerate computation, and improve overall hardware utilization. The author demonstrates that substantial efficiency gains can be achieved by rethinking how data is computed, stored, and compressed, with a special focus on matrices, the core computational structure underpinning both scientific computing and neural networks. Modern AI models run on huge grids of numbers (matrices/tensors), and their speed and affordability depend on how those numbers are arranged and processed on real hardware (GPUs/TPUs/CPUs). This book explains practical methods to skip unnecessary work (structured sparsity), move data efficiently (gather/scatter), and shrink models without losing accuracy (block distillation), so that AI systems can use less memory, less time, and less energy without sacrificing quality. In addition, the book shows how to turn algorithmic ideas into hardware-aware speedups on GPUs/TPUs. Readers will learn when sparsity pays off, how to schedule irregular workloads, and how to recover accuracy in compressed models. Case studies illustrate end-to-end design choices, evaluation, and pitfalls. The result is a coherent perspective that bridges theory, compilers/runtimes, and real-world deployment.

In addition, this book:

  • Integrates dense blocking, structured sparsity, gather/scatter scheduling, and block distillation/low-rank SVD
  • Provides reproducible benchmarking templates, guidance on when sparsity pays off, and a discussion of common pitfalls
  • Connects theory to compilers/runtimes and real deployment across scientific computing and state-of-the-art AI models


ISBN: 9783032230997
Pages: 110
Published:
Series: Synthesis Lectures on Computer Science
Category: Computer Science
Edition: 1
Format: Hardback
Language: English
Publisher: Springer Nature Switzerland AG
