# signature-marking-detector

**Repository Path**: mirrors_microsoft/signature-marking-detector

## Basic Information

- **Project Name**: signature-marking-detector
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-08-12
- **Last Updated**: 2025-08-23

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# Repository Description

This repository contains code used to extract and label handwritten markings on elephant tusks from images taken at seizures. More information about the project can be found at: >link paper here>.

There are three notebooks:

1. **Extract Markings**

   Runs inference using the model weights published at `https://zenodo.org/records/16423661`. If this is your first time using the repository, the model weights will be downloaded to your local environment. After inference, the predictions are post-processed into de-duplicated handwritten marking extractions using the functions in `extraction_utils.py`. The processing functions can easily be modified to ingest predictions from any object detection model.

   Initializing the inferencer object relies on code from the open-mmlab/mmdetection package*. We recommend you fork this package. To run the inference code, you will need the following files:

   - `configs/_base_/datasets/coco_detection.py`
   - `configs/_base_/default_runtime.py`
   - `configs/_base_/schedules/schedule_1x.py`
   - `configs/grounding_dino/grounding_dino_swin-b_finetune_16xb2_1x_coco.py`
   - `configs/grounding_dino/grounding_dino_swin-t_finetune_16xb2_1x_coco.py`

2. **Automated Marking Labeling**

   After a human reviewer has manually labeled some markings, these can be fed into the `Automated Marking Labeling` notebook to provide additional labels to extractions with strong visual resemblance to the labeled markings.
   This is done by encoding the images with Open-CLIP and passing the embeddings through a support vector machine. Any classifier could in principle be used, but it should be tested before implementation.

3. **Label Markings With LLM**

   The third notebook provides informative labels and descriptions for the extracted markings, using the prompts and functions found in `gpt_utils.py`. Within the notebook, labeling settings have to be provided for each existing class of marking. For example, you might want markings labeled "black text" to be submitted for text parsing, but markings labeled "post-seizure marking" only to be described. More information is available in the notebook. You will need credentials with which you can access an Azure OpenAI client.

# Installation Notes

Two .yml files are provided to aid installation. `labeling_environment.yml` should be used to create a kernel for the `Automated Marking Labeling` and `Label Markings With LLM` notebooks (i.e. via `conda env create -f ...`). For the extraction environment, the following installation steps need to be followed due to requirements of the `mmcv` package:

```bash
conda env create -f extraction_environment.yml
conda activate openmmlab
export CUDA_HOME="/opt/conda/envs/openmmlab"
pip install -r requirements.txt
```

You will also need to fork https://github.com/open-mmlab/mmdetection and clone the files listed above.

*Chen, K., Wang, J., Pang, J., Cao, Y., Xiong, Y., Li, X., Sun, S., Feng, W., Liu, Z., Xu, J., Zhang, Z., Cheng, D., Zhu, C., Cheng, T., Zhao, Q., Li, B., Lu, X., Zhu, R., Wu, Y., Dai, J., Wang, J., Shi, J., Ouyang, W., Loy, C. C., Lin, D. MMDetection: Open MMLab Detection Toolbox and Benchmark. arXiv 2019, arXiv:1906.07155.
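The embed-then-classify workflow of the `Automated Marking Labeling` notebook (step 2 above) can be sketched as follows. This is only an illustration, not the repository's actual code: the class names, margin threshold, and all variable names are assumptions, and random vectors stand in for the Open-CLIP embeddings so the sketch runs on its own.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in 512-d "embeddings" for two marking classes. In the real notebook
# these would come from Open-CLIP, e.g.
#   model, _, preprocess = open_clip.create_model_and_transforms(
#       "ViT-B-32", pretrained="openai")
# followed by model.encode_image(...) on each extracted marking crop.
labeled = np.vstack([rng.normal(0.0, 1.0, (20, 512)),
                     rng.normal(3.0, 1.0, (20, 512))])
labels = np.array(["black text"] * 20 + ["post-seizure marking"] * 20)

# Fit an SVM on the human-labeled markings.
clf = SVC(kernel="linear")
clf.fit(labeled, labels)

# Propagate labels to unlabeled extractions; flag low-margin (visually
# ambiguous) cases for human review instead of auto-labeling them.
unlabeled = rng.normal(3.0, 1.0, (5, 512))
preds = clf.predict(unlabeled)
margins = np.abs(clf.decision_function(unlabeled))
for pred, margin in zip(preds, margins):
    print(pred if margin > 0.5 else "needs human review")
```

As noted above, the `SVC` could be swapped for any scikit-learn-style classifier with the same `fit`/`predict` interface, but any replacement should be validated on held-out labeled markings first.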