# research_semantic_localization

**Repository Path**: pi-lab/research_semantic_localization

## Basic Information

- **Project Name**: research_semantic_localization
- **Description**: Research materials on semantic localization
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 6
- **Forks**: 7
- **Created**: 2019-11-06
- **Last Updated**: 2025-01-03

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

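# Autonomous Localization of UAV Swarms Based on Semantic Maps

## 1. Research Goals

Human perception does not compute a precise position within a scene. People localize by scene similarity alone: the scene currently in view resembles one passed through earlier, and that resemblance is enough to establish location. Using deep learning to judge scene similarity and to compare the current view against the scenes stored in a map database can therefore mimic this cognitive process and enable vision-based localization that does not rely on geometric computation.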

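The main research goals are:
1. Similarity judgment for single images is already well studied, whereas scene similarity for image sequences is still lacking. Study how to model the scene captured by multiple images, for example as a graph, and then perform subgraph matching on the map (a graph network) to find the best match.

2. How to generate better features for the objects in a scene. Candidates are NetVLAD or the FE-Net from "Image Matching Based on Deep Feature and Spatial Correlation Graph".

3. How to build Graph Convolutional Networks to compute similarity.

4. How to extract features from the scenes passed through and store them in a graph, thereby building the semantic map.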
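
As a starting point for goal 1, the following minimal sketch matches a short query sequence against a map by averaging per-frame cosine similarity over a sliding window. It assumes the map is a simple linear chain of global descriptors rather than a general graph, so it is a simplification of subgraph matching, and the names `cosine` and `match_sequence` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_sequence(query, map_descs):
    """Return (best_start, score) for the map window that best matches the query.

    query and map_descs are lists of descriptors (lists of floats).
    Assumption: the map is an ordered chain of places, not a general graph.
    """
    n, m = len(query), len(map_descs)
    if n == 0 or n > m:
        raise ValueError("query must be non-empty and no longer than the map")
    best_start, best_score = -1, -math.inf
    for start in range(m - n + 1):
        # Mean similarity of the query frames aligned with this map window.
        score = sum(cosine(q, map_descs[start + i]) for i, q in enumerate(query)) / n
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score
```

Averaging over the window means that a single visually ambiguous frame has less influence than it would in single-image retrieval, which is the motivation for matching sequences in the first place.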


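## 2. Main Approach

Research approach:
1. Read through the main references first to build a basic understanding of the problem and of the basic methods.
2. Find and run some existing code to build intuition and become familiar with the datasets.
3. Start with basic deep-learning feature extraction, then move on to graph convolutional networks.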


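Specific methods (to be tried):

1. Use EdgeBox or another method to find regions of interest, then extract deep-learning features for the objects
    - EdgeBox only extracts regions of interest: [paper](https://www.microsoft.com/en-us/research/wp-content/uploads/2014/09/ZitnickDollarECCV14edgeBoxes.pdf), [explanation in Chinese](https://blog.csdn.net/wsj998689aa/article/details/39476551)
2. Study Siamese networks to improve the discriminative power of the features (see the FE-Net in "Image Matching Based on Deep Feature and Spatial Correlation Graph")
3. Study Graph Convolutional Networks: how to extract node features, perform graph matching, and so on.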
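
To make the node-feature step concrete, here is a minimal sketch of one graph convolution layer in the commonly used form ReLU(D^(-1/2) (A + I) D^(-1/2) H W). It is written in plain Python for readability; the function names are hypothetical, and a real experiment would more likely use a library such as the pygcn code listed in the references.

```python
import math


def normalize_adjacency(adj):
    """Add self-loops, then apply symmetric degree normalization."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Degrees are at least 1 because of the self-loops (assumes non-negative weights).
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Dense matrix product of two lists-of-lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, feats, weight):
    """One propagation step: aggregate neighbor features, project, apply ReLU."""
    propagated = matmul(matmul(normalize_adjacency(adj), feats), weight)
    return [[max(0.0, v) for v in row] for row in propagated]
```

Each output row mixes a node's own feature with those of its neighbors, so stacking layers lets a node's embedding reflect a larger neighborhood of the scene graph.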


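## 3. Key Techniques
1. Matching accuracy, which measures how accurately images are matched.
2. Precision-Recall curve, which measures the relevance of the images retrieved for a place.
3. Memory usage and algorithmic complexity, which measure memory efficiency and algorithm performance.
4. Improvements over the problems of previous work, including handling false matches between places that look similar but are not the same, scene-modeling performance, and subgraph matching.
5. Soundness of the network design and of the loss function design.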
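
The Precision-Recall curve in item 2 can be computed from a ranked list of candidate matches. The sketch below assumes each candidate has a similarity score and a ground-truth label (1 for a correct place match, 0 otherwise); the function name is hypothetical, and tied scores are treated as separate operating points for simplicity.

```python
def precision_recall_curve(scores, labels):
    """Return a list of (threshold, precision, recall), highest score first.

    scores: similarity score of each candidate match.
    labels: 1 if the candidate is a true match, else 0.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    curve = []
    true_pos = 0
    for k, (score, label) in enumerate(ranked, start=1):
        true_pos += label
        # Accepting the top-k candidates corresponds to thresholding at this score.
        curve.append((score, true_pos / k, true_pos / total_pos))
    return curve
```

For place recognition, the recall reached while precision is still 1.0 is often the figure of interest, since a single false match can corrupt a localization estimate.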


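## 4. Research Plan
Weeks 1 to 2: Course project

Weeks 3 to 4: Write the course project report

Week 5: Survey the literature related to the topic; write the thesis proposal and task statement

Weeks 6 to 7: Analyze the requirements, draw up the design, and divide the program into basic functional modules

Weeks 8 to 11: Study vision algorithms, deep-learning network design, program structure, etc.

Weeks 12 to 13: Write the graduation thesis

Weeks 14 to 15: Prepare for the thesis defense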


### 5.0 References
* Improved version of DBoW2 (https://github.com/rmsalinas/DBow3)
* FBOW (Fast Bag of Words) is an extremely optimized version of the DBoW2/DBoW3 libraries (https://github.com/rmsalinas/fbow)
* Robust Visual Robot Localization Across Seasons using Network Flows (https://github.com/MHassanNadeem/localization-network-flows)
* NetVLAD: CNN architecture for weakly supervised place recognition (https://www.di.ens.fr/willow/research/netvlad/)
* Graph Convolutional Networks in PyTorch (https://github.com/tkipf/pygcn)
* Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering (https://github.com/mdeff/cnn_graph)
* Graph Convolutional Networks (GCNs) (https://github.com/sungyongs/graph-based-nn)

* [2015 Visual Place Recognition: A Survey](<references/survey/2015 Visual Place Recognition: A Survey.pdf>)
* [Visual Localization Based on Visual Maps (Zhihu column, in Chinese)](https://www.zhihu.com/column/c_1287353014030585856)

### 5.1 VLAD/NetVLAD
* NetVLAD: CNN architecture for weakly supervised place recognition
	- https://towardsdatascience.com/netvlad-cnn-architecture-for-weakly-supervised-place-recognition-ce64b08bebaf
	
* VLAD
	- https://ameyajoshi005.wordpress.com/2014/03/29/vlad-an-extension-of-bag-of-words/

* PatchNetVLAD
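
For orientation, here is a minimal sketch of the classic VLAD aggregation that NetVLAD builds on: each local descriptor is assigned to its nearest cluster center, the residuals are summed per center, and the concatenated result is L2-normalized. The function name is hypothetical, the centers are assumed to come from a prior k-means step, and the intra-normalization used in some variants is omitted.

```python
import math


def vlad(descriptors, centers):
    """Aggregate local descriptors into one L2-normalized VLAD vector.

    descriptors: list of local feature vectors from one image.
    centers: list of cluster centers (assumed precomputed, e.g. by k-means).
    """
    k, dim = len(centers), len(centers[0])
    acc = [[0.0] * dim for _ in range(k)]
    for desc in descriptors:
        # Hard assignment to the nearest center by squared Euclidean distance.
        nearest = min(
            range(k),
            key=lambda c: sum((x - y) ** 2 for x, y in zip(desc, centers[c])),
        )
        for j in range(dim):
            acc[nearest][j] += desc[j] - centers[nearest][j]
    flat = [v for row in acc for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat
```

NetVLAD replaces the hard nearest-center assignment with a soft, differentiable one so that the whole pipeline can be trained end to end.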


### 5.2 Feature Extraction

* Place Recognition with ConvNet Landmarks: Viewpoint-Robust, Condition-Robust, Training-Free

* 2020 Visual search over billions of aerial and satellite images

### 5.3 Network Methods
* Image Matching Based on Deep Feature and Spatial Correlation Graph
* Location Graphs for Visual Place Recognition
* Learning Convolutional Neural Networks for Graphs
* Robust Visual Semi-Semantic Loop Closure Detection by a Covisibility Graph and CNN Features
* Siamese Network
    - https://github.com/delijati/pytorch-siamese
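
A Siamese network is typically trained with a pairwise objective. The sketch below shows one common choice, the contrastive loss, on a single pair of embeddings; it is an illustration under that assumption and not necessarily the loss used by the repository linked above. The function name and the `margin` default are hypothetical.

```python
import math


def contrastive_loss(emb_a, emb_b, same_place, margin=1.0):
    """Contrastive loss for one pair of embeddings.

    Pairs from the same place are pulled together (squared distance);
    pairs from different places are pushed apart until they are at
    least `margin` away from each other.
    """
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(emb_a, emb_b)))
    if same_place:
        return dist ** 2
    return max(0.0, margin - dist) ** 2
```

Different-place pairs that are already farther apart than the margin contribute zero loss, so training concentrates on the visually similar but distinct places that cause false matches.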

    
### 5.4 Graph Neural Networks
* Learning Convolutional Neural Networks for Graphs
* Image Matching Based on Deep Feature and Spatial Correlation Graph


### 5.5 Cross-View Retrieval
* Lending Orientation to Neural Networks for Cross-view Geo-localization  https://github.com/Liumouliu/OriCNN
* Optimal Feature Transport for Cross-View Image Geo-Localization https://github.com/shiyujiao/cross_view_localization_CVFT


### 5.6 Indexing Methods
* Tree-based indexing for real-time ConvNet landmark-based visual place recognition
* 2020 Visual search over billions of aerial and satellite images


### 5.7 Codes
* NetVLAD - PyTorch Version: https://gitee.com/hu_jinsong/pytorch_-net-vlad

* Keras implementation of NetVLAD for visual place recognition https://github.com/crlz182/Netvlad-Keras

* LoST - Visual Place Recognition using Visual Semantics for Opposite Viewpoints across Day and Night https://github.com/oravus/lostX
* Visual place recognition in changing environments https://github.com/PRBonn/vpr_relocalization
* PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition, CVPR 2018 https://github.com/mikacuy/pointnetvlad
* NetVLAD: CNN architecture for weakly supervised place recognition https://github.com/Relja/netvlad

* Visual place recognition from opposing viewpoints under extreme appearance variations https://github.com/oravus/seq2single
* Optimal Feature Transport for Cross-View Image Geo-Localization https://github.com/shiyujiao/cross_view_localization_CVFT

* Neural Subgraph Learning Library https://github.com/snap-stanford/neural-subgraph-learning-GNN
* Neural Subgraph Matching http://snap.stanford.edu/subgraph-matching/


### 5.8 Dataset
* University1652-Baseline https://github.com/layumi/University1652-Baseline
* Places: A 10 million image database for scene recognition
* 24/7 Place Recognition by View Synthesis http://www.ok.ctrl.titech.ac.jp/~torii/project/247/