Library

Curated reading list with quick filters and links to notes.

A living bookshelf of papers, kernels, and system guides that influence my research. Use the search box to filter by title, author, venue, or tag.

NeurIPS · 2017

Attention Is All You Need

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin

Introduces the Transformer architecture, demonstrating that self-attention alone can outperform recurrent and convolutional models for sequence transduction.

transformer · attention · nlp

arXiv · 2022

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré

Proposes an IO-aware tiled attention kernel that reduces memory traffic and speeds up training while remaining exact.

attention · efficiency · gpu

arXiv · 2025

Efficient Attention Mechanisms for Large Language Models: A Survey

Yutao Sun, Zhenyu Li, Yike Zhang, Tengyu Pan, Bowen Dong, Yuyi Guo, Jianyong Wang

Surveys the design space of efficient attention variants for LLMs, covering algorithmic approaches and hardware implications.

attention · survey · efficiency


Bookshelf

Whatever I am, I am because of them

Manning Publications · 2020

Deep Learning for Vision Systems

Mohamed Elgendy

O'Reilly · 2023

AI Engineering

Chip Huyen

Springer · 2024

Deep Learning: Foundations and Concepts

Chris Bishop

Manning Publications · 2024

Build a Large Language Model (From Scratch)

Sebastian Raschka

Manning Publications · 2024

Ensemble Methods for Machine Learning

Gautam Kunapuli

Manning Publications · 2024

A Simple Guide to Retrieval Augmented Generation

Abhinav Kimothi

Manning Publications · 2024

Machine Learning for Tabular Data

Mark Ryan

Manning Publications · 2024

Math and Architectures of Deep Learning

Krishnendu Chaudhury et al.

Manning Publications · 2022

Inside Deep Learning: Math, Algorithms, Models

Edward Raff

Manning Publications · 2024

Getting Started with Natural Language Processing

Ekaterina Kochmar

Manning Publications · 2025

Natural Language Processing in Action

Hobson Lane, Maria Dyshel

O'Reilly · 2024

Hands-On Large Language Models

Jay Alammar, Maarten Grootendorst

O'Reilly · 2025

Designing Large Language Model Applications

Suhas Pai

O'Reilly · 2025

How Large Language Models Work

Drew Farris, Stella Biderman, Edward Raff

Manning Publications · 2025

LLMs in Production

Christopher Brousseau, Matthew Sharp

Packt Publishing · 2025

GPU Programming with C++ and CUDA

Paulo Motta

Morgan Kaufmann · 2024

Programming Massively Parallel Processors

Wen-mei Hwu, David Kirk

O'Reilly · 2025

Fluent Python

Luciano Ramalho

Manning Publications · 2025

AI Agents in Action

Micheal Lanham
