This project implements an image retrieval system using hypergraph-based manifold ranking, as described in the paper "Multimedia Retrieval through Unsupervised Hypergraph-based Manifold Ranking" (IEEE Transactions on Image Processing, vol. 28, no. 12, December 2019). The system retrieves similar images based on content using a combination of feature extraction, hypergraph construction, and ranking algorithms.
The system extracts features from images with a pre-trained ResNet50 model, constructs a hypergraph to model relationships between images, and ranks images by their similarity to a query image. Details on the project can be found in the documentation folder.
Key steps in the algorithm include:
- Feature Extraction: Using ResNet50 to extract high-level features from images.
- Distance and Similarity Calculation: Computing cosine distances and converting them to similarity scores.
- Rank Normalization: Normalizing ranks based on reciprocal rank positions.
- Hypergraph Construction: Building a hypergraph to model relationships between images.
- Hypergraph-Based Similarity: Computing similarity matrices using hypergraph structures.
- Iterative Ranking: Refining rankings iteratively (optional).
- Evaluation: Measuring retrieval accuracy using precision, MAP, and NDCG.
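The distance and rank normalization steps can be sketched as follows. This is a minimal illustration, not the repository's code: the function names are hypothetical, the similarity is assumed to be `1 - cosine distance`, and the normalization follows one reading of the cited paper, where the score for a pair combines the position of each image in the other's ranked list (`2L - (tau_i(j) + tau_j(i))`).

```python
import numpy as np

def cosine_similarity_matrix(features):
    """Return (distance, similarity) matrices for row-wise feature vectors."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)  # guard against zero vectors
    similarity = unit @ unit.T
    distance = 1.0 - similarity  # assumed conversion between the two
    return distance, similarity

def rank_positions(distance):
    """tau[i, j] = 1-based position of image j in the ranked list of image i."""
    n = distance.shape[0]
    order = np.argsort(distance, axis=1, kind="stable")
    tau = np.empty_like(order)
    rows = np.arange(n)[:, None]
    tau[rows, order] = np.arange(1, n + 1)[None, :]
    return tau

def reciprocal_rank_normalization(tau, L):
    """Symmetric score 2L - (tau_i(j) + tau_j(i)); higher means more similar."""
    return 2 * L - (tau + tau.T)
```

Here `L` stands for the ranked-list length; using the full list (`L = n`) keeps the sketch simple, while a shorter `L` would truncate each list.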
- Feature Extraction:
  - Using ResNet50 to extract deep feature representations from each image.
  - Feature vectors capture meaningful visual information for similarity comparison.
- Similarity Calculation:
  - Compute pairwise cosine similarity between images to build an initial similarity matrix.
- Ranking and Refinement:
  - Use a hypergraph-based ranking approach to capture local and global relationships between images.
  - Optionally, apply iterative refinement to update rankings using feedback.
- Evaluation:
  - Evaluate the effectiveness using Precision@k, Mean Average Precision (MAP), and NDCG metrics.
Metric | Top-5 |
---|---|
Precision | 0.6929 |
MAP | 0.7274 |
NDCG | 0.7383 |
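For reference, the three metrics can be computed for a single query as in the sketch below. It is an illustration under stated assumptions, not the evaluation code that produced the table: relevance is taken to be binary (1 if a retrieved image shares the query's class, 0 otherwise), average precision is normalized by the number of relevant items found in the top-k, and the ideal ranking for NDCG is built from the retrieved list only. MAP is then the mean of the average precision over all queries.

```python
import math

def precision_at_k(relevant, k):
    """Fraction of the top-k results that are relevant (relevant holds 0/1 flags)."""
    return sum(relevant[:k]) / k

def average_precision_at_k(relevant, k):
    """Mean of the precision values taken at each relevant position in the top-k."""
    hits, total = 0, 0.0
    for position, flag in enumerate(relevant[:k], start=1):
        if flag:
            hits += 1
            total += hits / position
    return total / hits if hits else 0.0

def ndcg_at_k(relevant, k):
    """Discounted cumulative gain of the top-k, divided by the best achievable DCG."""
    def dcg(flags):
        return sum(f / math.log2(pos + 1) for pos, f in enumerate(flags, start=1))
    ideal = sorted(relevant, reverse=True)[:k]
    best = dcg(ideal)
    return dcg(relevant[:k]) / best if best else 0.0

def mean_average_precision(relevance_lists, k):
    """Average of average_precision_at_k over several queries."""
    return sum(average_precision_at_k(r, k) for r in relevance_lists) / len(relevance_lists)
```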
- Clone the repository:
git clone https://github.com/ioannisCC/image-analysis.git
cd image-analysis
- Create and activate a virtual environment:
For Linux/macOS:
python3 -m venv venv
source venv/bin/activate
For Windows:
python -m venv venv
venv\Scripts\activate
- Install Dependencies:
pip install -r requirements.txt
- Prepare the Dataset:
- Download the Oxford-IIIT Pet Dataset.
- Extract the dataset and place the images and annotations folders inside the project's data directory (as shown in the image above).
- Extract Features:
Run the feature extraction script to extract features from the images using the ResNet50 model:
python scripts/final_working_scripts/feature_extraction.py
This generates two files in the artifacts folder: features.npy and all_features.pkl.
- Run the algorithm:
Run the manifold ranking script to build the hypergraph and compute similarity matrices:
python scripts/final_working_scripts/manifold_ranking.py
This will generate the following files in the artifacts folder:
- hypergraph_data.npz: Contains hypergraph data (H, Sh, Sv, W).
- combined_data.csv: Contains metadata and feature vectors for all images.
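To give a rough idea of what `H`, `Sh`, and `Sv` represent, the sketch below builds a small hypergraph from rank positions. It is a simplified illustration, not the contents of `manifold_ranking.py`: the function names are hypothetical, one hyperedge is created per image from its `k` nearest neighbours, and the membership weight `1 - log(position) / log(k + 1)` is an assumed choice that gives closer neighbours values nearer to 1. How `W` is derived is not shown here.

```python
import numpy as np

def build_incidence_matrix(tau, k):
    """Build H with one hyperedge (row) per image and one vertex (column) per image.

    tau[i, j] is the 1-based position of image j in the ranked list of image i.
    Hyperedge i contains the k top-ranked images of image i.
    """
    n = tau.shape[0]
    H = np.zeros((n, n))
    inside = tau <= k  # membership mask: j is among the k nearest of i
    H[inside] = 1.0 - np.log(tau[inside]) / np.log(k + 1)  # assumed weighting
    return H

def hypergraph_similarities(H):
    """Return hyperedge similarity, vertex similarity, and their element-wise product."""
    Sh = H @ H.T  # two hyperedges are similar when they share vertices
    Sv = H.T @ H  # two vertices are similar when they share hyperedges
    return Sh, Sv, Sh * Sv

# Tiny example with three images and k = 2.
tau = np.array([[1, 2, 3], [2, 1, 3], [3, 2, 1]])
H = build_incidence_matrix(tau, k=2)
Sh, Sv, S = hypergraph_similarities(H)
```

With this weighting each image belongs to its own hyperedge with weight 1, and images outside the top-k get weight 0.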
- Run the Image Retrieval System GUI:
To retrieve similar images for a query image, run the following script:
python scripts/final_working_scripts/image_retrieval_system.py
The GUI allows you to:
- Enter an image ID or index.
- View the top-k similar images.
- Display metadata and similarity scores.
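Outside the GUI, the same kind of lookup can be approximated with a few lines of NumPy. The sketch below assumes a square similarity matrix indexed by image position; the function name `top_k_similar` and its parameters are hypothetical and are not part of the repository's scripts.

```python
import numpy as np

def top_k_similar(similarity, query_index, k=5, exclude_query=True):
    """Return [(image_index, score), ...] for the k most similar images to the query."""
    scores = np.asarray(similarity, dtype=float)[query_index].copy()
    if exclude_query:
        scores[query_index] = -np.inf  # never return the query itself
    k = min(k, scores.size - int(exclude_query))
    top = np.argsort(-scores, kind="stable")[:k]  # highest scores first
    return [(int(i), float(scores[i])) for i in top]
```

The returned indices can then be mapped to image IDs and metadata, for example through the rows of combined_data.csv.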
- Ensure you have sufficient computational resources (e.g., GPU) for faster processing, especially during feature extraction.
- Modify the scripts/final_working_scripts/manifold_ranking.py script to adjust hyperparameters (e.g., k, L) for better results.
- The artifacts folder stores precomputed data to avoid redundant computations. Delete these files if you want to recompute them.
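The caching behaviour described in the last note can be pictured as a load-or-compute pattern. This is a generic sketch, assuming the cached data is a single NumPy array stored as a `.npy` file; the helper name `load_or_compute` is hypothetical and the repository's scripts may handle caching differently.

```python
from pathlib import Path

import numpy as np

def load_or_compute(path, compute):
    """Load an array from `path` if it exists; otherwise compute, save, and return it."""
    path = Path(path)
    if path.exists():
        return np.load(path)  # reuse the precomputed artifact
    result = compute()
    path.parent.mkdir(parents=True, exist_ok=True)
    np.save(path, result)  # store it for the next run
    return result
```

Deleting the file forces the next call to run `compute` again, which matches the advice to delete artifacts when a recomputation is needed.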