Image Captioning Using CNN and LSTM

Caption generation is a challenging artificial intelligence problem where a textual description must be generated for a given photograph. It requires:

Computer Vision techniques to understand the content of the image.
A Language Model from the field of Natural Language Processing (NLP) to turn the understanding into words in the correct order.

Recently, deep learning methods have achieved state-of-the-art results in caption generation problems. The most impressive part of these methods is that a single end-to-end model can predict a caption for a given photo without requiring sophisticated data preparation or multiple specifically designed models.

Dataset

Flickr 8k: Download from Kaggle
Description File: Flickr8k Text

Model Architecture

Model Diagram

Training Diagram

Final Results

Successful Examples

Failure Examples

Image-Captioning-Using-CNN-and-LSTM

This repository implements a deep learning approach combining CNN and LSTM for generating image captions. The method leverages both visual and textual data to provide accurate, context-aware captions for images.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
Image Captioning .ipynb		Image Captioning .ipynb
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Image Captioning Using CNN and LSTM

Dataset

Model Architecture

Model Diagram

Training Diagram

Final Results

Successful Examples

Failure Examples

Image-Captioning-Using-CNN-and-LSTM

About

Uh oh!

Releases

Packages

Languages

RichieOnData/Image-Captioning-Using-CNN-and-LSTM

Folders and files

Latest commit

History

Repository files navigation

Image Captioning Using CNN and LSTM

Dataset

Model Architecture

Model Diagram

Training Diagram

Final Results

Successful Examples

Failure Examples

Image-Captioning-Using-CNN-and-LSTM

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages