# lm-calibration **Repository Path**: mcgrady164/lm-calibration ## Basic Information - **Project Name**: lm-calibration - **Description**: No description available - **Primary Language**: Unknown - **License**: Not specified - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2021-08-04 - **Last Updated**: 2021-08-04 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # LM Calibration This repository contains code for the paper [How Can We Know When Language Models Know? On the Calibration of Language Models for Question Answering](https://arxiv.org/abs/2012.00955) ## Install Our code is mainly based on [T5][t5] and [mesh-tensorflow][mesh] and runs on TPUs. Please follow the original [T5][t5] repository to properly setup TPUs. To install required packages, download [T5][t5] (version 0.6.4) and [mesh-tensorflow][mesh] (version 0.1.16) and copy source files into the `t5` and `mesh_tensorflow` folder. Don't replace files already in these folders because those files are the files we modified for calibration purpose. ## Fine-tune Run the following commands to fine-tune the [UnifiedQA][uq] models with `softmax` or `margin` objective functions. `$tpu` specifies the name of the TPU, `$model_output` specifies the output location to save the fine-tuned model, `$objective` specifies the objective function to use. ```shell ./finetune.sh $tpu 3B $model_output $objective uq_clean_train_ol_mix train mc ``` ## Evaluate candidate answers Run the following commands to evaluate the probabilities of candidate answers. `$score_output` specifies the location to save the output, and `1103000` specifies the checkpoint to use. ```shell ./score.sh $tpu $score_output $model_output 1103000 uq_clean_test dev ``` ## Compute ECE Run the following commands to compute the ECE metric given the probabilities of candidate answers. ```shell python cal.py --mix uq_clean_test --split dev --score $score_output ``` [t5]: https://github.com/google-research/text-to-text-transfer-transformer [mesh]: https://github.com/tensorflow/mesh [uq]: https://github.com/allenai/unifiedqa