Leaderboards are widely used in NLP and push the field forward. While leaderboards are a straightforward ranking of NLP models, this simplicity can mask nuances in evaluation items (examples) and subjects (NLP models). Rather than replace leaderboards, we advocate a re-imagining so that they better highlight if and where progress is made. Building on educational testing, we create a Bayesian leaderboard model where latent subject skill and latent item difficulty predict correct responses. Using this model, we analyze the reliability of leaderboards. Afterwards, we show the model can guide what to annotate, identify annotation errors, detect overfitting, and identify informative examples.
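The description above hinges on one modeling idea: the probability that a subject answers an item correctly is driven by the gap between the subject's latent skill and the item's latent difficulty. The sketch below is only a rough illustration of that kind of item-response model, not the authors' code; the function name `fit_irt`, the MAP-style fit with standard-normal priors, and all hyperparameters are assumptions.

```python
# Illustrative sketch (not the paper's implementation): fit latent subject skill
# and latent item difficulty so that sigmoid(skill - difficulty) predicts
# correct responses, with standard-normal priors (MAP estimate).
import torch
import torch.nn.functional as F

def fit_irt(responses, num_subjects, num_items, steps=2000, lr=0.05):
    """responses: list of (subject_id, item_id, correct) triples."""
    subj = torch.tensor([r[0] for r in responses])
    item = torch.tensor([r[1] for r in responses])
    y = torch.tensor([float(r[2]) for r in responses])

    skill = torch.zeros(num_subjects, requires_grad=True)      # latent subject skill
    difficulty = torch.zeros(num_items, requires_grad=True)    # latent item difficulty
    opt = torch.optim.Adam([skill, difficulty], lr=lr)

    for _ in range(steps):
        opt.zero_grad()
        logits = skill[subj] - difficulty[item]                # p(correct) = sigmoid(skill - difficulty)
        nll = F.binary_cross_entropy_with_logits(logits, y, reduction="sum")
        prior = 0.5 * (skill.pow(2).sum() + difficulty.pow(2).sum())  # N(0, 1) priors
        (nll + prior).backward()
        opt.step()
    return skill.detach(), difficulty.detach()

# Toy usage: 3 subjects (models) answering 4 items (examples).
data = [(0, 0, 1), (0, 1, 1), (0, 2, 0), (1, 0, 1), (1, 1, 0), (1, 2, 0), (2, 3, 1)]
skill, difficulty = fit_irt(data, num_subjects=3, num_items=4)
```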
PyTorch implementation of MoCo v3: https://arxiv.org/abs/2104.02057
The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the *cocktail party effect* from an augmented-reality (AR)-motivated multi-sensor egocentric world view.
Write PyTorch controllers, test them in simulation, and seamlessly transfer to real-time hardware.
Per-Pixel Classification is Not All You Need for Semantic Segmentation (NeurIPS 2021, spotlight)
A thin, highly portable toolkit for efficiently compiling dense loop-based computation.
ML models often mispredict, and it is hard to tell when and why. We present a data-mining-based approach to discover whether there is a certain form of data that particularly causes the model to mispredict.
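One simple way to make that idea concrete, offered here only as an assumed illustration and not necessarily this repository's method, is to group evaluation examples by feature values and flag groups whose error rate is far above the overall rate. The helper `find_suspect_slices` and its thresholds below are purely hypothetical.

```python
# Illustrative sketch (an assumption, not this repo's method): flag data slices
# whose error rate is much higher than the overall error rate.
from collections import defaultdict

def find_suspect_slices(features, correct, min_size=20, ratio=2.0):
    """features: one dict of feature -> value per example; correct: parallel list of 0/1."""
    overall_err = 1.0 - sum(correct) / len(correct)
    groups = defaultdict(list)
    for feats, ok in zip(features, correct):
        for key, value in feats.items():
            groups[(key, value)].append(ok)
    suspects = []
    for (key, value), outcomes in groups.items():
        if len(outcomes) < min_size:
            continue  # ignore slices too small to be reliable
        err = 1.0 - sum(outcomes) / len(outcomes)
        if err > ratio * overall_err:
            suspects.append((key, value, err, len(outcomes)))
    return sorted(suspects, key=lambda s: -s[2])
```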
This code provides a PyTorch implementation for OTTER (Optimal Transport distillation for Efficient zero-shot Recognition), as described in the paper.
DialogStitch: Synthetic Deeper and Multi-Context Task-Oriented Dialogs
We present a framework for training multi-modal deep learning models on unlabelled video data by forcing the network to learn invariances to transformations applied to both the audio and video streams.
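As a rough sketch of that general recipe, and not this repository's code, the snippet below shows a symmetric InfoNCE-style loss between embeddings of augmented, paired audio and video clips; the function name `cross_modal_nce`, the temperature, and the embedding shapes are assumptions.

```python
# Illustrative sketch (not this repo's implementation): encode augmented audio and
# video clips, then pull matching pairs together with a symmetric contrastive loss
# so the embeddings become invariant to the applied transformations.
import torch
import torch.nn.functional as F

def cross_modal_nce(video_emb, audio_emb, temperature=0.07):
    """video_emb, audio_emb: (batch, dim) embeddings of augmented, paired clips."""
    v = F.normalize(video_emb, dim=1)
    a = F.normalize(audio_emb, dim=1)
    logits = v @ a.t() / temperature          # similarity of every video to every audio clip
    targets = torch.arange(v.size(0))         # matching pairs lie on the diagonal
    # symmetric loss: video -> audio and audio -> video
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

# Toy usage with random embeddings standing in for encoder outputs.
loss = cross_modal_nce(torch.randn(8, 128), torch.randn(8, 128))
```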
Code & Models for 3DETR - an End-to-end transformer model for 3D object detection
A collection of tools for neural compression enthusiasts.
Official code accompanying the arXiv paper Compressing Multisets with Large Alphabets