These are my personal notes on local and global descriptors, written to make it easier for anyone to get into these fields. If you find anything you would like to add, feel free to open an issue or email me.
This repo will be constantly updated.
Author: Tsun-Yi Yang ([email protected])
This section reviews sparse keypoint matching and its pipeline.
This subsection reviews keypoint detection and the estimation of keypoint orientation, scale, or affine transformation.
Year | Paper | link | Code |
---|---|---|---|
[ICCV19] | Key.Net: Keypoint Detection by Handcrafted and Learned CNN Filters | Github | |
[ECCV18] | Repeatability Is Not Enough: Learning Discriminative Affine Regions via Discriminability | arXiv | Github |
[CVPR17] | Quad-networks: unsupervised learning to rank for interest point detection | - | |
[CVPR16] | Learning to Assign Orientations to Feature Points | - | Github |
[CVPR15] | TILDE: a Temporally Invariant Learned DEtector | arXiv | Github |
Over the last few decades, a lot of work has focused on patch descriptors.
- Hand-crafted
Year | Paper | link | Code |
---|---|---|---|
[CVPR16] | Accumulated Stability Voting: A Robust Descriptor from Descriptors of Multiple Scales | Github | |
[CVPR15] | Domain-Size Pooling in Local Descriptors: DSP-SIFT | - | |
[CVPR15] | BOLD - Binary Online Learned Descriptor For Efficient Image Matching | Github | |
[CVPR13] | Boosting binary keypoint descriptors | - | - |
[CVPR12] | FREAK: Fast retina keypoint | - | - |
[CVPR12] | Three things everyone should know to improve object retrieval | - | |
[IPOL11] | ASIFT: An Algorithm for Fully Affine Invariant Comparison | - | - |
[ICCV11] | BRISK: Binary robust invariant scalable keypoints | - | - |
[ICCV11] | ORB: An efficient alternative to SIFT or SURF | - | - |
[ICCV11] | Local intensity order pattern for feature description | - | - |
[CVIU06] | Speeded-up robust features (SURF) | - | - |
[ECCV06] | SURF: Speeded up robust features | - | - |
[IJCV04] | Distinctive image features from scale-invariant keypoints | - | Github |
- Deep learning
Year | Paper | link | Code |
---|---|---|---|
[ICCV19] | Beyond Cartesian Representations for Local Descriptors | - | |
[CVPR19] | SOSNet: Second Order Similarity Regularization for Local Descriptor Learning | arXiv,Page | Github |
[ECCV18] | GeoDesc: Learning Local Descriptors by Integrating Geometry Constraints | - | Github |
[CVPR18] | Local Descriptors Optimized for Average Precision | Page | - |
[NIPS17] | Working hard to know your neighbor's margins: Local descriptor learning loss | arXiv | Github |
[ICCV17] | DeepCD: Learning Deep Complementary Descriptors for Patch Representations | Github | |
[CVPR17] | L2-Net: Deep Learning of Discriminative Patch Descriptor in Euclidean Space | Github | |
[arXiv16] | PN-Net: Conjoined Triple Deep Network for Learning Local Image Descriptors | arXiv | Github |
[BMVC16] | Learning local feature descriptors with triplets and shallow convolutional neural networks | Github | |
[ICCV15] | Discriminative Learning of Deep Convolutional Feature Point Descriptors | Page | Github |
[CVPR15] | MatchNet: Unifying Feature and Metric Learning for Patch-Based Matching | - | |
[CVPR15] | Learning to compare image patches via convolutional neural networks | Github |
Recently, more and more papers try to embed the whole matching pipeline (keypoint detection, keypoint description) into one framework.
Year | Paper | link | Code |
---|---|---|---|
[arXiv19] | Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task | arXiv | - |
[NIPS19] | R2D2: Repeatable and Reliable Detector and Descriptor | arXiv,Page | - |
[ICCV19] | ELF: Embedded Localisation of Features in Pre-Trained CNN | Github | |
[CVPR19] | D2-Net: A Trainable CNN for Joint Description and Detection of Local Features | arXiv,Page | Github |
[CVPRW18] | SuperPoint: Self-Supervised Interest Point Detection and Description | arXiv | Github |
[NIPS18] | LF-Net: Learning Local Features from Images | Github | |
[ECCV16] | LIFT: Learned Invariant Feature Points | - | Github |
After matching, standard RANSAC and its variants are usually adopted for outlier removal.
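
To make the outlier-removal step concrete, here is a minimal RANSAC sketch in plain Python. It is an illustration under simplifying assumptions, not the method of any paper listed here: the geometric model is a pure 2D translation (real pipelines usually fit a homography, fundamental matrix, or essential matrix), and the function name `ransac_translation`, the threshold, and the toy data are all hypothetical.

```python
import random

def ransac_translation(matches, threshold=2.0, iterations=100, seed=0):
    """Estimate a 2D translation from putative matches and flag inliers.

    matches: list of ((x1, y1), (x2, y2)) putative correspondences.
    Returns (best_translation, inlier_mask).
    """
    rng = random.Random(seed)
    best_t, best_mask, best_count = None, [False] * len(matches), -1
    for _ in range(iterations):
        # Minimal sample: one correspondence fully determines a translation.
        (x1, y1), (x2, y2) = rng.choice(matches)
        tx, ty = x2 - x1, y2 - y1
        # Score the hypothesis by counting the matches that agree with it.
        mask = [
            (a[0] + tx - b[0]) ** 2 + (a[1] + ty - b[1]) ** 2 <= threshold ** 2
            for a, b in matches
        ]
        count = sum(mask)
        if count > best_count:
            best_t, best_mask, best_count = (tx, ty), mask, count
    return best_t, best_mask

# Toy data: six matches shifted by (5, -3), plus two gross outliers.
points = [(0, 0), (10, 2), (4, 7), (8, 8), (1, 9), (6, 3)]
matches = [((x, y), (x + 5, y - 3)) for x, y in points]
matches += [((2, 2), (40, 40)), ((9, 1), (-30, 12))]

translation, inliers = ransac_translation(matches)
print(translation, inliers)
```

The same sample, score, and keep-the-best loop carries over to richer models; only the minimal sample size and the residual function change. Full implementations typically also refit the model on all inliers at the end, which this sketch skips.
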
- Algorithm based
Year | Paper | link | Code |
---|---|---|---|
[CVPR19] | MAGSAC: Marginalizing Sample Consensus | Github | |
[ECCV12] | Improving Image-Based Localization by Active Correspondence Search | - | |
[CVPR05] | Matching with PROSAC – Progressive Sample Consensus | - |
- Deep learning based
Year | Paper | link | Code |
---|---|---|---|
[arXiv19] | SuperGlue: Learning Feature Matching with Graph Neural Networks | arXiv | - |
[ICCV19] | NG-RANSAC for Epipolar Geometry from Sparse Correspondences | arXiv | Github |
[ICCV19] | Learning Two-View Correspondences and Geometry Using Order-Aware Network | arXiv | Github |
[CVPR18] | Learning to Find Good Correspondences | - | Github |
Because global retrieval usually searches over a large number of candidates, there are several ways to generate a single description for each image.
- Hand-crafted
When only hand-crafted local descriptors were available, a set of local descriptors was usually aggregated into a single description of the image.
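
As a rough illustration of this kind of aggregation, the sketch below builds a VLAD-style vector in plain Python: each local descriptor is hard-assigned to its nearest codebook centroid, the residuals are summed per centroid, and the concatenation is L2-normalized. The two-word codebook and the 2D descriptors are made-up toy values; in practice the codebook would be learned (for example with k-means) on real descriptors, and the function name `vlad` is only a label for this example.

```python
import math

def vlad(descriptors, centroids):
    """Aggregate local descriptors into one VLAD-style global vector."""
    dim = len(centroids[0])
    residuals = [[0.0] * dim for _ in centroids]
    for d in descriptors:
        # Hard-assign the descriptor to its nearest centroid (visual word).
        k = min(
            range(len(centroids)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(d, centroids[i])),
        )
        # Accumulate the residual between the descriptor and that centroid.
        for j in range(dim):
            residuals[k][j] += d[j] - centroids[k][j]
    # Concatenate the per-centroid residual sums and L2-normalize.
    flat = [v for row in residuals for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

centroids = [[0.0, 0.0], [10.0, 10.0]]  # hypothetical 2-word codebook
descriptors = [[1.0, 0.0], [0.0, 2.0], [9.0, 10.0]]
global_vec = vlad(descriptors, centroids)
print(len(global_vec), global_vec)
```

The output length is always (number of centroids) x (descriptor dimension), regardless of how many local descriptors the image has, which is what makes the result usable as a fixed-size global description.
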
Year | Paper | link | Code |
---|---|---|---|
[CVPR13] | All about VLAD | - | |
[ECCV10] | Improving the fisher kernel for large-scale image classification | - | |
[CVPR07] | Object retrieval with large vocabularies and fast spatial matching | - | |
[CVPR06] | Fisher kernels on visual vocabularies for image categorization | - |
- Deep learning
A similar idea, but using deep learning to adapt the classical algorithms.
Year | Paper | link | Code |
---|---|---|---|
[ECCV16] | CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples | - | |
[CVPR16] | NetVLAD: CNN architecture for weakly supervised place recognition | Page | Github |
A single representation computed directly from the whole image.
Year | Paper | link | Code |
---|---|---|---|
[ICCV19] | Learning with Average Precision: Training Image Retrieval with a Listwise Loss | arXiv | Github |
[CVPR19] | Detect-to-Retrieve: Efficient Regional Aggregation for Image Search | Github | |
[TPAMI18] | Fine-tuning CNN Image Retrieval with No Human Annotation | arXiv | Github |
[IJCV17] | End-to-end Learning of Deep Visual Representations for Image Retrieval | arXiv | Github |
[ICCV17] | Large-Scale Image Retrieval with Attentive Deep Local Features | - | Github |
[ECCV16] | CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples | arXiv | Github |
For a more compact representation, a binary descriptor can be generated by hashing or thresholding. Quantization is also very popular in large-scale image retrieval.
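
The thresholding route can be sketched in a few lines of plain Python. This example assumes the simplest possible scheme, a fixed threshold of zero per dimension, and compares the resulting codes with the Hamming distance. The learned methods in the table below (for example Iterative Quantization) first learn a transform before binarizing, which is not shown here. The descriptor values and the names `binarize` and `hamming` are hypothetical.

```python
def binarize(descriptor, threshold=0.0):
    """Threshold each dimension to one bit and pack the bits into an int."""
    code = 0
    for value in descriptor:
        code = (code << 1) | (1 if value > threshold else 0)
    return code

def hamming(code_a, code_b):
    """Number of bit positions where the two binary codes differ."""
    return bin(code_a ^ code_b).count("1")

query = [0.9, -0.2, 0.4, -0.7]
database = {
    "img_a": [0.8, -0.1, 0.5, -0.9],  # similar to the query
    "img_b": [-0.6, 0.3, -0.2, 0.5],  # roughly the opposite of the query
}

q = binarize(query)
# Rank database images by Hamming distance to the query code.
ranking = sorted(database, key=lambda name: hamming(q, binarize(database[name])))
print(format(q, "04b"), ranking)
```

Each float dimension becomes a single bit, and the comparison reduces to an XOR followed by a bit count, which is why binary codes are attractive for large databases.
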
Year | Paper | link | Code |
---|---|---|---|
[ICCVW19] | DAME WEB: DynAmic MEan with Whitening Ensemble Binarization for Landmark Retrieval without Human Annotation | - | |
[AAAI18] | Deep Region Hashing for Generic Instance Search from Image | - | - |
[TPAMI18] | Supervised Learning of Semantics-Preserving Hash via Deep Convolutional Neural Networks | - | - |
[TPAMI13] | Iterative Quantization: A Procrustean Approach to Learning Binary Codes for Large-Scale Image Retrieval | - | |
[TPAMI10] | Product quantization for nearest neighbor search | - |
Anything that boosts performance in the post-processing stage, such as re-ranking or query expansion.
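
As one example of such post-processing, the sketch below shows a simple form of average query expansion in plain Python: the query descriptor is averaged with its top-ranked results and the search is issued again with the averaged vector. This is a generic illustration, not the procedure of the papers listed below; the global descriptors are toy 3D vectors, and `rank`, `expand_query`, and `top_k` are names chosen for this example.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rank(query, database):
    """Return database keys sorted by decreasing cosine similarity."""
    return sorted(database, key=lambda k: dot(query, database[k]), reverse=True)

def expand_query(query, database, top_k=2):
    """Average the query with its top-k results and re-normalize."""
    top = rank(query, database)[:top_k]
    vectors = [query] + [database[k] for k in top]
    mean = [sum(col) / len(vectors) for col in zip(*vectors)]
    return normalize(mean)

database = {
    "a": normalize([1.0, 0.1, 0.0]),
    "b": normalize([0.8, 0.6, 0.0]),
    "c": normalize([0.3, 0.9, 0.3]),
    "d": normalize([0.0, 0.1, 1.0]),
}
query = normalize([1.0, 0.0, 0.0])

expanded = expand_query(query, database)
print(rank(query, database))
print(rank(expanded, database))
```

The expanded query moves toward the images it already retrieved, so images that resemble those results (here `c`) get a higher similarity than they had with the original query. The flip side is that a wrong top result pulls the query in the wrong direction, which is why expansion is often combined with a verification step.
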
Year | Paper | link | Code |
---|---|---|---|
[CVPR19] | Local features and visual words emerge in activations | - | |
[CVPR12] | Object retrieval and localization with spatially-constrained similarity measure and k-NN re-ranking | - |
Some works cover both local descriptors and global retrieval, since the two share similar activations and applications.
Year | Paper | link | Code |
---|---|---|---|
[CVPR19] | ContextDesc: Local Descriptor Augmentation with Cross-Modality Context | - | Github |
[CVPR19] | From Coarse to Fine: Robust Hierarchical Localization at Large Scale with HF-Net | arXiv | Github |
[ICCV17] | Large-Scale Image Retrieval with Attentive Deep Local Features | - | Github |
Year | Paper | link | Code | Note |
---|---|---|---|---|
[CVPR17] | HPatches: A benchmark and evaluation of handcrafted and learned local descriptors | arXiv | Github | Hpatches |
[TPAMI11] | Discriminative learning of local image descriptors | Page | - | UBC/Brown dataset (subsets: Liberty (New York), Notre Dame (Paris) and Half Dome (Yosemite)) |
[CVPR08] | On Benchmarking Camera Calibration and Multi-View Stereo for High Resolution Imagery |
Year | Paper | link | Code | Note |
---|---|---|---|---|
[CVPR18] | Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking | Page | Github | ROxford5k, RParis6k |
[CVPR07] | Object retrieval with large vocabularies and fast spatial matching | Page | - | Oxford5k |
[CVPR08] | Lost in Quantization: Improving Particular Object Retrieval in Large Scale Image Databases | Page | - | Paris6k |
Year | Paper | link | Code | Note |
---|---|---|---|---|
[CVPR18] | Benchmarking 6DOF Outdoor Visual Localization in Changing Conditions | PDF,Page | Github | Aachen-day-night, Robotcar, CMU-seasons |