Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
Collin Burns†, Pavel Izmailov†, Jan Hendrik Kirchner†, Bowen Baker†, Leo Gao†, Leopold Aschenbrenner†, Yining Chen†, Adrien Ecoffet†, Manas Joglekar†, Jan Leike, Ilya Sutskever, Jeff Wu†
ICML 2024 (Oral)
Discovering Latent Knowledge in Language Models Without Supervision
Collin Burns*, Haotian Ye*, Dan Klein, Jacob Steinhardt
ICLR 2023
Measuring Mathematical Problem Solving With the MATH Dataset
Dan Hendrycks, Collin Burns, Saurav Kadavath, Akul Arora, Steven Basart, Eric Tang, Dawn Song, Jacob Steinhardt
NeurIPS 2021 (Datasets and Benchmarks Track)
Measuring Coding Challenge Competence With APPS
Dan Hendrycks*, Steven Basart*, Saurav Kadavath, Mantas Mazeika, Akul Arora, Ethan Guo, Collin Burns, Samir Puranik, Horace He, Dawn Song, Jacob Steinhardt
NeurIPS 2021 (Datasets and Benchmarks Track)
CUAD: An Expert-Annotated NLP Dataset for Legal Contract Review
Dan Hendrycks*, Collin Burns*, Anya Chen, Spencer Ball
NeurIPS 2021 (Datasets and Benchmarks Track)
Limitations of Post-Hoc Feature Alignment for Robustness
Collin Burns, Jacob Steinhardt
CVPR 2021
Measuring Massive Multitask Language Understanding
Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt
ICLR 2021
Aligning AI with Shared Human Values
Dan Hendrycks*, Collin Burns*, Steven Basart, Andrew Critch, Jerry Li, Dawn Song, Jacob Steinhardt
ICLR 2021
Streaming Complexity of SVMs
Alexandr Andoni*, Collin Burns*, Yi Li*, Sepideh Mahabadi*, David P. Woodruff*
APPROX/RANDOM 2020
Interpreting Black Box Models via Hypothesis Testing
Collin Burns, Jesse Thomason, Wesley Tansey
ACM FODS 2020
(*: equal contribution, †: primary contributor)
Speedcubing. I used to be very involved in competitive Rubik's Cube solving ("speedcubing"). In 2015, I broke the official world record for a single 3x3 solve with a time of 5.25 seconds. I've also held a national championship title, a continental record, and four national records.