# surface_normal_integration
**Repository Path**: mirrors_facebookresearch/surface_normal_integration
## Basic Information
- **Project Name**: surface_normal_integration
- **Description**: Code for 'Discontinuity-aware Normal Integration for Generic Central Camera Models'
- **Primary Language**: Unknown
- **License**: GPL-3.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-10-01
- **Last Updated**: 2025-10-04
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
Discontinuity-aware Normal Integration for Generic Central Camera Models
Francesco Milano, Manuel López-Antequera, Naina Dhingra, Roland Siegwart, Robert Thiel
ICCV 2025 (Highlight)
Recovering a 3D surface from its surface normal map, a problem known as normal integration, is a key component of photometric shape reconstruction techniques such as shape-from-shading and photometric stereo. The vast majority of existing approaches to normal integration only implicitly handle depth discontinuities and are limited to orthographic or ideal pinhole cameras. In this paper, we propose a novel formulation that allows modeling discontinuities explicitly and handling generic central cameras. Our key idea is based on a local planarity assumption, which we model through constraints between surface normals and ray directions. Compared to existing methods, our approach more accurately approximates the relation between depth and surface normals, achieves state-of-the-art results on the standard normal integration benchmark, and is the first to directly handle generic central camera models.
## Installation
The reference code in this repository was tested on Ubuntu 22.04, using a Python 3.11.6 virtual environment with the packages specified in the [`requirements.txt`](./requirements.txt) file installed.
### DiLiGenT download
To run experiments on the DiLiGenT dataset, download and set up the dataset in a folder `${DILIGENT_ROOT}` of your choice as follows:
1. Download the per-object folders containing the camera intrinsics, the PNG-encoded normal maps, and the normal masks from [this folder](https://github.com/xucao-42/bilateral_normal_integration/tree/main/data/Fig7_diligent) in the BiNI repository. Extract the folders into a subfolder named `normals` in `${DILIGENT_ROOT}`.
2. Download the ground-truth depth, provided by the BiNI repository (_cf._ [here](https://github.com/xucao-42/bilateral_normal_integration/tree/main?tab=readme-ov-file#evaluation-on-diligent-benchmark)), from [here](https://www.dropbox.com/scl/fi/uhg8538lr43h2g2arim2l/diligent_depth_GT.zip?dl=0&e=1&rlkey=4qdmse25e0lmrtbo9bsfvifk1). Extract the `.mat` files in a subfolder named `depth_gt` in `${DILIGENT_ROOT}`.
3. Download the high-resolution normal maps from the [link](https://drive.google.com/file/d/1EgC3x8daOWL4uQmc6c4nXVe4mdAMJVfg/view) provided on the original [DiLiGenT dataset webpage](https://sites.google.com/site/photometricstereodata/single). Extract the `pmsData` subfolder into `${DILIGENT_ROOT}`.
Your `${DILIGENT_ROOT}` folder should now look as follows:
```
${DILIGENT_ROOT}
├── depth_gt
| ├── bear_gt.mat
| ├── buddha_gt.mat
| └── ...
├── normals
| ├── bear
| | ├── K.txt
| | ├── mask.png
| | └── normal_map.png
| ├── buddha
| | └── ...
| └── ...
└── pmsData
├── bearPNG
| ├── Normal_gt.mat
| └── ...
├── buddhaPNG
| └── ...
└── ...
```
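To sanity-check the download, the expected layout can be verified with a short Python sketch (the helper name `check_diligent_layout` is ours, and the spot-checked file names simply follow the tree above):

```python
from pathlib import Path


def check_diligent_layout(diligent_root: str) -> list[str]:
    """Return a list of entries missing from the expected DiLiGenT layout."""
    root = Path(diligent_root)
    missing = []
    # Top-level subfolders from the tree above.
    for sub in ("depth_gt", "normals", "pmsData"):
        if not (root / sub).is_dir():
            missing.append(sub)
    # Spot-check one object's files (names as in the tree above).
    for rel in (
        "depth_gt/bear_gt.mat",
        "normals/bear/K.txt",
        "normals/bear/mask.png",
        "normals/bear/normal_map.png",
        "pmsData/bearPNG/Normal_gt.mat",
    ):
        if not (root / rel).is_file():
            missing.append(rel)
    return missing
```

An empty return value means the spot-checked structure is in place; any listed entries point to folders or files that still need to be downloaded or moved.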
### DiLiGenT-MV download
To run experiments on DiLiGenT-MV, download and extract the dataset from the [link](https://drive.google.com/file/d/18dheWmAxCNaBpYoH3usuFeH9vGlhODvx/view?usp=sharing) provided on the [official DiLiGenT-MV dataset website](https://sites.google.com/site/photometricstereodata/mv).
Optionally, we provide a [representative script](./render_depth_DiLiGenT-MV.py), based on BlenderProc, to render ground-truth depth maps from the ground-truth meshes for use in evaluation. After installing BlenderProc (for instance, with `pip install blenderproc~=2.8.0`), you can render the depth maps by running the following command from the root of this repository:
```
blenderproc run render_depth_DiLiGenT-MV.py -- \
--mvpmsData-folder ${DILIGENT_MV_ROOT}/mvpmsData \
--output-folder ${DILIGENT_MV_ROOT}/depth_from_mesh \
--obj-name ${OBJ_NAME} \
--view-idx ${VIEW_IDX}
```
where `${DILIGENT_MV_ROOT}` is the folder where you have extracted the DiLiGenT-MV dataset, `${OBJ_NAME}` is the name of one of the objects in the dataset (`bear`, `buddha`, `cow`, `pot2`, or `reading`), and `${VIEW_IDX}` is the index of the view to render (from `1` to `20`).
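Since the script renders one object/view pair at a time, rendering the full dataset can be driven from a small Python wrapper. The sketch below builds one `blenderproc` invocation per pair (the helper names are ours, and it assumes `blenderproc` is on your `PATH`):

```python
import subprocess

# Object names and view range as listed above.
OBJECTS = ("bear", "buddha", "cow", "pot2", "reading")


def build_render_commands(diligent_mv_root: str) -> list[list[str]]:
    """Build one blenderproc invocation per (object, view) pair."""
    commands = []
    for obj_name in OBJECTS:
        for view_idx in range(1, 21):  # views are indexed 1..20
            commands.append([
                "blenderproc", "run", "render_depth_DiLiGenT-MV.py", "--",
                "--mvpmsData-folder", f"{diligent_mv_root}/mvpmsData",
                "--output-folder", f"{diligent_mv_root}/depth_from_mesh",
                "--obj-name", obj_name,
                "--view-idx", str(view_idx),
            ])
    return commands


def render_all(diligent_mv_root: str) -> None:
    """Run every rendering command sequentially, failing fast on errors."""
    for cmd in build_render_commands(diligent_mv_root):
        subprocess.run(cmd, check=True)
```

Calling `render_all("${DILIGENT_MV_ROOT}")` then renders all 100 depth maps (5 objects, 20 views each) in sequence.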
## Basic usage
Results similar to those reported in the paper can be obtained with the following command:
```
python main.py \
--data-dir ${DATASET_ROOT} \
--dataset-type ${DATASET_TYPE} \
--obj-name ${OBJ_NAME} \
--num-shifts ${NUM_SHIFTS} \
--normal-type ${NORMAL_TYPE} \
--q-beta ${Q_BETA} \
--rho-beta ${RHO_BETA} \
--gamma-type ${GAMMA_TYPE} \
--lambda-m ${LAMBDA_M} \
--w-b-to-a-outlier-th ${W_B_TO_A_OUTLIER_TH} \
--num-ms ${NUM_MS} \
--max-iter ${MAX_ITER} \
--force-all-iters \
--output-folder ${OUTPUT_FOLDER}
```
where:
- `${DATASET_ROOT}` is `${DILIGENT_ROOT}` for the experiments on DiLiGenT and `${DILIGENT_MV_ROOT}` for the experiments on DiLiGenT-MV;
- `${DATASET_TYPE}` is `diligent` for the experiments on DiLiGenT and `diligent_mv` for the experiments on DiLiGenT-MV;
- `${OBJ_NAME}` is the object name (_e.g.,_ `bear`) for the experiments on DiLiGenT and the object name followed by the view index for the experiments on DiLiGenT-MV (_e.g.,_ `bear_view_01`);
- `${NUM_SHIFTS}` is either `4`, `4_diag`, or `8` (we use `4` in our main experiments);
- `${NORMAL_TYPE}` is `gt` for the experiments on DiLiGenT and either `gt` or `TIP19Li` for the experiments on DiLiGenT-MV (respectively, for ground-truth normals and for normals from photometric stereo);
- `${Q_BETA}` and `${RHO_BETA}` control the _discontinuity activation_ term $\beta_{b\rightarrow a}$ in our approach (_cf._ eq. (16) in our paper), which we set respectively to `50` and `0.25`. Set both to `None` to run our method without $\alpha_{b\rightarrow a}$ computation;
- `${GAMMA_TYPE}` controls the $\gamma_{b\rightarrow a}$ factor in our method. Set it to `bini` to use the default term from BiNI and to `ours` to use our more generic term (_cf._ eq. (13) in our paper);
- `${LAMBDA_M}` is the $\lambda_m$ factor that controls the sub-pixel interpolation of $\boldsymbol{\tau_m}$ in our method (_cf._ Appendix D in our paper). We set this to `0.5`;
- `${W_B_TO_A_OUTLIER_TH}` and `${NUM_MS}` control our outlier filtering strategy through $\boldsymbol{\tau_m}$ (_cf._ Appendix H in our paper), which is useful for DiLiGenT-MV (we set `${W_B_TO_A_OUTLIER_TH}` to `1.1` and `${NUM_MS}` to `15`);
- `${MAX_ITER}` is the maximum number of iterations to use. We set this to `1200` in our experiments. The flag `--force-all-iters` can be included to force the method to be run for all the `${MAX_ITER}` iterations;
- `${OUTPUT_FOLDER}` is the path to the folder that should store the output of the experiments. For each experiment, a subfolder is created that is indexed by the experiment's starting time and by the object name.
Example run on DiLiGenT:
```
python main.py \
--data-dir ${DILIGENT_ROOT} \
--dataset-type diligent \
--obj-name buddha \
--num-shifts 4 \
--normal-type gt \
--q-beta 50 \
--rho-beta 0.25 \
--gamma-type bini \
--lambda-m 0.5 \
--w-b-to-a-outlier-th 1.1 \
--num-ms 15 \
--max-iter 1200 \
--force-all-iters \
--output-folder ${OUTPUT_FOLDER}
```
Example run on DiLiGenT-MV:
```
python main.py \
--data-dir ${DILIGENT_MV_ROOT} \
--dataset-type diligent_mv \
--obj-name buddha_view_01 \
--num-shifts 4 \
--normal-type gt \
--q-beta None \
--rho-beta None \
--gamma-type bini \
--lambda-m 0.5 \
--w-b-to-a-outlier-th 1.1 \
--num-ms 15 \
--max-iter 1200 \
--force-all-iters \
--output-folder ${OUTPUT_FOLDER}
```
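To run the same settings over several objects, the invocation above can be scripted. The sketch below assembles the DiLiGenT command line from the first example (the helper names and the illustrative object subset are ours; extend `OBJECTS` with the remaining DiLiGenT object names as needed):

```python
import subprocess

# Illustrative subset; extend with the remaining DiLiGenT object names.
OBJECTS = ("bear", "buddha")


def build_main_command(data_dir: str, obj_name: str, output_folder: str) -> list[str]:
    """Assemble a main.py invocation with the DiLiGenT settings used above."""
    return [
        "python", "main.py",
        "--data-dir", data_dir,
        "--dataset-type", "diligent",
        "--obj-name", obj_name,
        "--num-shifts", "4",
        "--normal-type", "gt",
        "--q-beta", "50",
        "--rho-beta", "0.25",
        "--gamma-type", "bini",
        "--lambda-m", "0.5",
        "--w-b-to-a-outlier-th", "1.1",
        "--num-ms", "15",
        "--max-iter", "1200",
        "--force-all-iters",
        "--output-folder", output_folder,
    ]


def run_all(data_dir: str, output_folder: str) -> None:
    """Run main.py once per object, failing fast on errors."""
    for obj_name in OBJECTS:
        subprocess.run(build_main_command(data_dir, obj_name, output_folder), check=True)
```

Since each experiment's output subfolder is indexed by start time and object name, all runs can share the same `${OUTPUT_FOLDER}` without clashing.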
## Citation
If you find our code or paper useful, please cite:
```bibtex
@inproceedings{Milano2025DiscontinuityAwareNormalIntegration,
author = {Milano, Francesco and López-Antequera, Manuel and Dhingra, Naina and Siegwart, Roland and Thiel, Robert},
title = {{Discontinuity-aware Normal Integration for Generic Central Camera Models}},
booktitle = {IEEE/CVF International Conference on Computer Vision (ICCV)},
year = {2025}
}
```
## Contributions
See the [CONTRIBUTING](CONTRIBUTING.md) file for how to help out.
## License
surface_normal_integration is GPLv3 licensed, as found in the LICENSE file.
## Acknowledgements
Parts of the code are based on BiNI. This includes in particular the files in the folder [`bini`](./bini/), which were refactored and partly restructured to comply with standard camera conventions, the basic structure of the optimization procedure in the [main script](./main.py), and the computation of the MADE metric in [`utils/metrics.py`](./utils/metrics.py).