Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
Online Convex Optimization and Accelerated Gradient Descent Methods for Efficient Training
Published:
TBC
Triton Notes
Published:
TBC
Non-convex optimization for Over-parameterized Neural Nets: Reproducing Kernel Hilbert Space and Neural Tangent Kernel
Published:
This blog is based on Real Analysis by Elias M. Stein and Rami Shakarchi, and Learning Theory from First Principles by Francis Bach.
Note on Submodular Function Optimization, Minimization and Maximization, Lazy Greedy
Published:
This blog is based on week 10 of PKU Algorithms for Big Data Analysis.
Efficient Methods for Generative Models 3: Sparse and Adaptive Attention, Dynamic Token Pooling
Published:
Introduction to Recurrent Neural Networks (RNNs)
Efficient Methods for Generative Models 2: KV Cache, FlashAttention, vLLM
Published:
Introduction to Recurrent Neural Networks (RNNs)
Efficient Methods for Generative Models 1: Linear Attention, State-Space Models, and Linear RNNs
Published:
Modern sequence modeling has evolved from recurrent architectures to attention-based models and, more recently, state-space approaches. Traditional RNNs introduced an efficient way to process sequential data but struggled with long-term dependencies. Transformers later revolutionized the field with attention mechanisms, though their cost, which is quadratic in sequence length, limits scalability to long contexts. This has driven research into more efficient alternatives, such as linear attention, state-space models like S4 and Mamba, and newer architectures like DeltaNet, which aim to combine scalability, stability, and strong modeling capacity for long-range sequence tasks.
portfolio
Portfolio item number 1
Short description of portfolio item number 1
Portfolio item number 2
Short description of portfolio item number 2 
publications
Paper Title Number 1
Published in Journal 1, 2009
This paper is about the number 1. The number 2 is left for future work.
Recommended citation: Your Name, You. (2009). "Paper Title Number 1." Journal 1. 1(1).
Download Paper | Download Slides | Download Bibtex
Paper Title Number 2
Published in Journal 1, 2010
This paper is about the number 2. The number 3 is left for future work.
Recommended citation: Your Name, You. (2010). "Paper Title Number 2." Journal 1. 1(2).
Download Paper | Download Slides
Paper Title Number 3
Published in Journal 1, 2015
This paper is about the number 3. The number 4 is left for future work.
Recommended citation: Your Name, You. (2015). "Paper Title Number 3." Journal 1. 1(3).
Download Paper | Download Slides
Paper Title Number 4
Published in GitHub Journal of Bugs, 2024
This paper is about fixing template issue #693.
Recommended citation: Your Name, You. (2024). "Paper Title Number 4." GitHub Journal of Bugs. 1(3).
Download Paper
Paper Title Number 5, with math \(E=mc^2\)
Published in GitHub Journal of Bugs, 2024
This paper is about a famous math equation, \(E=mc^2\).
Recommended citation: Your Name, You. (2024). "Paper Title Number 5, with math \(E=mc^2\)." GitHub Journal of Bugs. 1(3).
Download Paper