I'm currently a Research Scientist at Meta AI. I hold a Ph.D. in Computer Science from Stanford University, advised by Peter Bailis and Gregory Valiant. While at Stanford, I was a member of the Future Data Systems group and the DAWN project.

My Ph.D. research focused on statistical machine learning under resource constraints such as limited memory and limited labeled data. In this space, I've worked on methods that improve sample complexity by building prior knowledge of transformation invariance directly into neural network architectures, and on semi-supervised learning algorithms that improve accuracy by also leveraging large collections of unlabeled data.

Before starting my Ph.D., I spent a year in industry at MetaMind (acquired by Salesforce in April 2016), where I worked on computer vision for medical imaging. I received an M.S. in Computer Science from Stanford in 2015. During my Master's, I was a member of the Stanford NLP Group. I also hold an A.B. in Physics from Princeton University.


An End-to-End Earthquake Monitoring Method for Joint Earthquake Detection and Association using Deep Learning
Weiqiang Zhu*, Kai Sheng Tai*, S. Mostafa Mousavi, Peter Bailis, Gregory C. Beroza
* Equal contribution
Journal of Geophysical Research: Solid Earth

Sinkhorn Label Allocation: Semi-Supervised Classification via Annealed Self-Training
Kai Sheng Tai, Peter Bailis, and Gregory Valiant
ICML 2021

Equivariant Transformer Networks
Kai Sheng Tai, Peter Bailis, and Gregory Valiant
ICML 2019

Compressed Factorization: Fast and Accurate Low-Rank Factorization of Compressively-Sensed Data
Vatsal Sharan*, Kai Sheng Tai*, Peter Bailis, and Gregory Valiant
* Equal contribution
ICML 2019

Moment-Based Quantile Sketches for Efficient High Cardinality Aggregation Queries
Edward Gan, Jialin Ding, Kai Sheng Tai, Vatsal Sharan, and Peter Bailis
VLDB 2018

Sketching Linear Classifiers over Data Streams
Kai Sheng Tai, Vatsal Sharan, Peter Bailis, and Gregory Valiant

Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks
Kai Sheng Tai, Richard Socher, and Christopher D. Manning
ACL 2015

Detecting gravitational waves from highly eccentric compact binaries
Kai Sheng Tai, Sean T. McWilliams, and Frans Pretorius
Physical Review D, 2014


index-baselines: Comparing learned index structures to classical data structures like cuckoo hashing

neuralart: An implementation of the paper 'A Neural Algorithm of Artistic Style' by Gatys et al.

torch-ntm: A Neural Turing Machine implementation using Torch