FAI-Seminar: Previous Talks

2024 R02

Time	Speaker	Talk Title	Talk Info	Paper	Video
07/19	张博航 (Peking University)	Beyond Weisfeiler-Lehman: A Quantitative Framework for GNN Expressiveness	Talk Info	[1], [2], [3]	Bilibili
08/09	黎善达 (CMU)	Inference Scaling Law of Large Language Models and Second-Prize Winning Solution of AIMO	Talk Info	[1], [2]	Bilibili
08/16	王天浩 (TTIC)	Tractable training dynamics of transformers for in-context learning	Talk Info	[1], [2]	Bilibili
08/23	吴京风 (Berkeley)	Reimagining Gradient Descent: Large Stepsize, Oscillation, and Acceleration	Talk Info	[1]	Bilibili
08/30	马梓业 (City University of Hong Kong)	Navigating the non-convex landscape via amplifying escape directions of saddle points	Talk Info	[1], [2], [3]	Bilibili
11/01	刘勇 (Renmin University of China)	Can Retrieval Augmented Generation (RAG) Enhance the LLM’s Reasoning Capabilities?	Talk Info		Bilibili

2024 R01

Time	Speaker	Talk Title	Talk Info	Paper	Video
Special Talk 05/31	李建 (Tsinghua University)	Generalization Error and Implicit Bias of Gradient Methods in Deep Learning	Talk Info		Bilibili
03/08	翟润天 (CMU)	On the Generalization of Representation Learning and Big Foundation Models	Talk Info	[1, 2]	Bilibili
03/15	罗胜杰 (Peking University)	Enabling Efficient Equivariant Operations in the Fourier Basis via Gaunt Tensor Products	Talk Info	[1]	Bilibili
03/22	高天宇 (Princeton)	Long-Context Language Modeling with Parallel Context Encoding	Talk Info		Bilibili
03/29	邹荻凡 (The University of Hong Kong)	Faster Sampling without Isoperimetry via Diffusion-based Monte Carlo	Talk Info	[1]	Bilibili
04/05	陆一平 (NYU)	Simulation-Calibrated Scientific Machine Learning	Talk Info	[1]	Bilibili
04/12	俞鼎力 (Princeton)	Tensor Programs VI: Feature Learning in Infinite-Depth Neural Networks	Talk Info	[1]	Bilibili
04/19	吕凯风 (Princeton)	Understanding the Limitations of Neural Networks on Algorithmic Reasoning	Talk Info	[1, 2]	Bilibili
04/26	李禹辰 (CMU)	Towards Mathematical Understanding of Modern Language Models	Talk Info	[1, 2, 3, 4]	Bilibili

2023 R03

Time	Speaker	Talk Title	Talk Info	Paper	Video
Special Talk 02/16	胡威 (UMich)	Hidden Structures in Neural Network Representations	Talk Info	[1, 2]	Bilibili
11/10	陈乐偲 (Tsinghua University)	Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully First-Order Oracles	Talk Info	[1]	Bilibili
11/17	张博航 (Peking University)	Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective	Talk Info	[1]	Bilibili
11/24	顾欣然 (Tsinghua University)	A Quadratic Synchronization Rule for Distributed Deep Learning	Talk Info	[1]	Bilibili
12/01	石佳欣 (DeepMind)	MultiresConv: From Wavelet Theory to Long Context Modeling with Neural Networks	Talk Info	[1]	Bilibili
12/08	范凤磊 (The Chinese University of Hong Kong)	In Pursuit of Deciphering ReLU Networks and Beyond	Talk Info	[1]	Bilibili
12/15	NeurIPS break
12/22	刘冰彬 (CMU)	Thinking Fast with Transformers: algorithmic reasoning with shortcuts	Talk Info	[1] (ICLR '23 oral), [2] (NeurIPS '23 spotlight)	Bilibili
12/29	温凯越 (Tsinghua University)	Transformers are uninterpretable with myopic methods: a case study with bounded Dyck grammars	Talk Info	[1]	Bilibili
01/12	游凯超 (Tsinghua University)	Understand, Learn, and Adopt the PyTorch compiler (torch.compile)	Talk Info	[1, 2, 3]	Bilibili

2023 R02

Time	Speaker	Talk Title	Paper	Video
(Special) 09/15	李志远 (Stanford)	The Generalization Benefit of Flatness Regularization	[1], [2]	Bilibili
06/23	张博航 (Peking University)	Understanding the Expressivity of Subgraph-based GNNs for Graph Learning	[1]	Bilibili
06/30	罗胜杰 (Peking University)	One Transformer Can Understand Both 2D & 3D Molecular Data	[1]	Bilibili
07/07	刘子鸣 (MIT)	Intelligence from hunger	[1], [2]	Bilibili
07/14	马鉴昊 (UMich)	Robust Sparse Mean Estimation	[1]	Bilibili
07/21	金及凯 (Peking University)	Minimax optimal operator learning	[1]	Bilibili
07/28	ICML break
08/04	王博涵 (University of Science and Technology of China)	When and Why Momentum Accelerates SGD	[1]	Bilibili
08/11	滕佳烨 (Tsinghua University)	Predictive inference with feature conformal prediction	[1]	Bilibili
08/18	蔡天乐 (Princeton)	Large Language Models as Tool Makers	[1]	Bilibili

2023 R01

Time	Speaker	Talk Title	Paper	Video
(Special) 05/26	张景昭 (Tsinghua University)	Two Phases of Scaling Laws for Nearest Neighbor Classifiers	[1]	Bilibili
03/03	张鼎怀 (Mila)	GFlowNets: Exploration for Probabilistic Inference	[1], [2], [3], [4]	Bilibili
03/10	顾欣然 (Tsinghua University)	Why (and When) does Local SGD Generalize Better than SGD	[1]	Bilibili
03/17	王博涵 (University of Science and Technology of China)	Provable Benefit of Adaptivity in ADAM	[1]	Bilibili
03/24	温凯越 (Tsinghua University)	How Does Sharpness-Aware Minimization Minimize Sharpness?	[1]	Bilibili
03/31	张博航 (Peking University)	Rethinking the Expressive Power of GNNs via Graph Biconnectivity	[1] (ICLR 2023 Outstanding Paper)	Bilibili
04/07	马鉴昊 (UMich)	Escaping Saddle Points Or Not?	[1], [2]	Bilibili
04/14	陈乐偲 (Fudan University)	On Bilevel Optimization without Lower-level Strong Convexity	[1]	Bilibili
04/21	黄凯旋 (Princeton)	Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data	[1]	Bilibili
04/28	戴言 (Tsinghua University)	Variance-Aware Sparse Linear Bandits	[1]	Bilibili