Daniel Campos

Publications

2023

Daniel Campos, Surya Kallumadi, Corby Rosset, Cheng Xiang Zhai, Alessandro Magnani - Overview of the TREC 2023 Product Product Search Track - TREC 2023

EFFICIENT AND ROBUST WEB SCALE LANGUAGE MODEL BASED RETRIEVAL, GENERATION, AND UNDERSTANDING - University of Illinois Urbana-Champaign Computer Science Doctoral Thesis

Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, and Jimmy Lin - OVERVIEW OF THE TREC 2022 DEEP LEARNING TRACK - TREC 2022

Daniel Campos, Cheng Xiang Zhai - To Asymmetry and Beyond: Structured Pruning of Sequence to Sequence Models for Improved Inference Efficiency - 4th Workshop on Simple and Efficient Natural Language Processing (SustaiNLP 2023) @ ACL 2023

Daniel Campos, Alessandro Magnani, Cheng Xiang Zhai - Quick Dense Retrievers Consume KALE: Post Training Kullback Leibler Alignment of Embeddings for Asymmetrical dual encoders - 4th Workshop on Simple and Efficient Natural Language Processing (SustaiNLP 2023) @ ACL 2023

Daniel Campos, Cheng Xiang Zhai - Dense Sparse Retrieval: Using Sparse Language Models for Inference Efficient Dense Retrieval - arXiv Preprint

Daniel Campos, Alessandro Magnani, Cheng Xiang Zhai - CAPOT: Creating Robust Dense Query Encoders using Post Training Contrastive Alignment - arXiv Preprint

Daniel Campos, Alexandre Marques, Tuan Nguyen, Mark Kurtz, Cheng Xiang Zhai - oBERTa: Improving Sparse Transfer Learning via improved initialization, distillation, and pruning regimes - 4th Workshop on Simple and Efficient Natural Language Processing (SustaiNLP 2023) @ ACL 2023

Daniel Campos, Daniel Perry, Samir Joshi, Yashmeet Gambhir, Wei Du, Zhengzheng Xing and Aaron Colak - Compressing Cross-Lingual Multi-task Models at Qualtrics - The Thirty-Fifth Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-23)

2022

Daniel Campos, Daniel Perry, Samir Joshi, Yashmeet Gambhir, Wei Du, Zhengzheng Xing and Aaron Colak - Compressing Cross-Lingual Multi-task Models at Qualtrics - arXiv Preprint

Daniel Campos, Alexandre Marques, Tuan Nguyen, Mark Kurtz, Cheng Xiang Zhai - Sparse*BERT: Sparse Models are Robust - Sparsity in Neural Networks Workshop at ICML 2022

Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, and Jimmy Lin - OVERVIEW OF THE TREC 2021 DEEP LEARNING TRACK - TREC 2021

Jimmy Lin, Daniel Campos, Nick Craswell, Bhaskar Mitra, Emine Yilmaz - Fostering Coopetition While Plugging Leaks: The Design and Implementation of the MS MARCO Leaderboards - In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval

Eldar Kurtic, Daniel Campos, Tuan Nguyen, Elias Frantar, Mark Kurtz, Benjamin Fineran, Michael Goin, Dan Alistarh - The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models - 2nd workshop on Efficient ML

2021

Daniel Campos, Heng Ji - IMG2SMI: Translating Molecular Structure Images to Simplified Molecular-input Line-entry System

Daniel Campos - Curriculum learning for language modeling

Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, Ellen Voorhees and Ian Soboroff - TREC Deep Learning Track: Reusable Test Collections in the Large Data Regime - In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval

Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, Jimmy Lin - MS MARCO: Benchmarking Ranking Models in the Large-Data Regime - In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval

Jimmy Lin, Daniel Campos, Nick Craswell, Bhaskar Mitra, Emine Yilmaz - Significant Improvements over the State of the Art? A Case Study of the MS MARCO Document Ranking Leaderboard - In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval

Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos - Overview of the TREC 2020 deep learning track - TREC 2020

2020

Explorations In Curriculum Learning Methods For Training Language Models - University of Washington Computational Linguistics Master's Thesis

Manling Li, Ying Lin, Tuan Manh Lai, Xiaoman Pan, Haoyang Wen, Sha Li, Zhenhailong Wang, Pengfei Yu, Lifu Huang, Di Lu, Qingyun Wang, Haoran Zhang, Qi Zeng, Chi Han, Zixuan Zhang, Yujia Qin, Xiaodan Hu, Nikolaus Parulian, Daniel Campos, Heng Ji, Brian Chen, Xudong Lin, Alireza Zareian, Amith Ananthram, Emily Allaway, Shih-Fu Chang, Kathleen McKeown, Yixiang Yao, Michael Spector, Mitchell DeHaven, Daniel Napierski, Marjorie Freedman, Pedro Szekely, Haidong Zhu, Ram Nevatia, Yang Bai, Yifan Wang, Ali Sadeghian, Haodi Ma, Daisy Zhe Wang - GAIA at SMKBP 2020 - a dockerlized multi-media multi-lingual knowledge extraction, clustering, temporal tracking and hypothesis generation system - Proceedings of Thirteenth Text Analysis Conference 2020

Nick Craswell, Daniel Campos, Bhaskar Mitra, Emine Yilmaz, Bodo Billerbeck - ORCAS: 18 Million Clicked Query-Document Pairs for Analyzing Search - CIKM 2020

Yaobo Liang, Nan Duan, Yeyun Gong, Ning Wu, Fenfei Guo, Weizhen Qi, Ming Gong, Linjun Shou, Daxin Jiang, Guihong Cao, Xiaodong Fan, Ruofei Zhang, Rahul Agrawal, Edward Cui, Sining Wei, Taroon Bharti, Ying Qiao, Jiun-Hung Chen, Winnie Wu, Shuguang Liu, Fan Yang, Daniel Campos, Rangan Majumder, Ming Zhou - XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation - In Proceedings of EMNLP 2020

Emine Yilmaz, Nick Craswell, Bhaskar Mitra and Daniel Campos - On the Reliability of Test Collections for Evaluating Systems of Different Types - In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, Ellen M. Voorhees - Overview of the TREC 2019 deep learning track - TREC 2019

Corbin Rosset, Chenyan Xiong, Xia Song, Daniel Campos, Nick Craswell, Saurabh Tiwary and Paul Bennett - Leading Conversational Search by Suggesting Useful Questions - In Proceedings of The Web Conference 2020

2019

Lee Xiong, Chuan Hu, Chenyan Xiong, Daniel Campos, Arnold Overwijk and Xiayu Huang - Open Domain Web Keyphrase Extraction Beyond Language Modeling - In Proceedings of EMNLP 2019, Github

2018

Payal Bajaj, Daniel Campos, Nick Craswell, Li Deng, Jianfeng Gao, Xiaodong Liu, Rangan Majumder, Andrew McNamara, Bhaskar Mitra, Tri Nguyen, Mir Rosenberg, Xia Song, Alina Stoica, Saurabh Tiwary, Tong Wang - MS MARCO: A Human Generated MAchine Reading COmprehension Dataset, Website, and Github

2015

Daniel Campos, Zoe Konrad - Experiments in Inferring Social Networks of Diffusion

Education

University of Illinois Urbana-Champaign (UIUC) - PhD Computer Science 2023

University of Washington - MS Computational Linguistics 2020

Rensselaer Polytechnic Institute - BS Computer Science 2014

Jobs

Research Scientist - Snowflake May 2023 -

Research Scientist - Neeva AI (acquired by Snowflake) Dec 2022 - May 2023

Applied Scientist Consultant - Walmart Labs June 2022 - Dec 2022

Applied Scientist Consultant - Qualtrics March 2022 - June 2022

Research Scientist Consultant - Mendel AI Oct 2021 - March 2022

Research Scientist Consultant - Neural Magic Oct 2020 - March 2023

Teaching Assistant - University of Illinois Urbana-Champaign - Jan 2021 - May 2023

Research Assistant - University of Illinois Urbana-Champaign - June 2020 - Dec 2021

Senior PM - Microsoft Bing - July 2020 - Oct 2020

PM II - Microsoft Bing - Nov 2017 - July 2020

PM II - Microsoft Azure - Jan 2017 - Nov 2017

PM - Microsoft Azure - Aug 2015 - Dec 2016

Awards & Fellowships

Ripple X Fellow - Summer 2022

Z Fellow - Jan 2022 Cohort

Computer Science Excellence Fellowship at the University of Illinois at Urbana-Champaign - 2020-2021

UIUC Summer Predoctoral Institute Fellow - 2020

Patents

Using a Multi-Task-Trained Neural Network to Guide Interaction with a Query-Processing System via Useful Suggestions - 408364-US-NP - Filed 4/16/2020.

Keyphrase Extraction Beyond Language Modeling - U.S. Appln. No. 16/460,853 - Filed July 2nd, 2019

Activity

The 2023 SIGIR Workshop On eCommerce Invited Talk: Benchmarking End To End Product Retrieval

LLMs in Production Conference II Invited Talk - Making LLM Inference Affordable - 2023

LLMs in Production Conference - Invited Panelist Cost Optimization and Performance - 2023

NIST TREC Product Search Track Principal Coordinator. 2023-2024

NIST TREC Deep Learning Track Coordinator. 2023

Invited Talk @ Neeva: Scaling Language Model Inference to Web-Scale Workloads - 10/5/2022

Invited Talk @ Walmart Labs: Efficient Language Model Inference - 01/19/2022

NIST TREC Deep Learning Track Coordinator. 2022

Invited Talk @ You.com: Efficient Language Model Inference - 08/21/2021

Invited Talk @ Qualtrics Research: Efficient Language Model Inference - 08/12/2021

NIST TREC Deep Learning Track Coordinator. 2021

ACM SIGIR/SIGKDD Africa Summer School on Machine Learning for Data Mining and Search. 2020

NIST TREC Deep Learning Track Coordinator. 2020

ACM SIGIR/SIGKDD Africa Summer School on Machine Learning for Data Mining and Search. 2019

NIST TREC Deep Learning Track Coordinator. 2019