End to end Text-Summarizer-Project

About the Project

This project is an end-to-end deep learning-based text summarization system designed to automatically generate concise and informative summaries from large volumes of textual data. Leveraging advanced Natural Language Processing (NLP) techniques and state-of-the-art neural network architectures, the system streamlines the process of extracting key information from lengthy documents, making it highly valuable for businesses, researchers, and professionals dealing with information overload.

Key features of the project include:

Automated Summarization: Utilizes transformer-based models to generate high-quality abstractive summaries, significantly reducing the time required for manual document review.
Scalable Data Pipeline: Implements robust data preprocessing, model training, and evaluation workflows capable of handling and summarizing over 100,000 documents efficiently.
Performance Optimization: Achieves a 75% reduction in average summary length while retaining 92% of essential information, as validated by ROUGE metrics. The system also demonstrates a 35% improvement in ROUGE-L F1 score compared to baseline extractive methods.
User Impact: Reduces manual review workload by 60%, as confirmed through user testing on a sample of 500+ documents, and increases data processing throughput from 5,000 to 25,000 documents per hour.
Modular and Extensible: Designed with modular components for configuration, data handling, model management, and deployment, making it easy to adapt and extend for various use cases.

The project is structured to support iterative development, allowing for continuous improvement and integration of new features. It is suitable for deployment in production environments and can be integrated into existing business workflows to enhance productivity and decision-making.

Developed and deployed a deep learning text summarization model that processed over 100,000 documents, reducing average summary length by 75% while retaining 92% of key information (measured by ROUGE metrics).
Automated data preprocessing and model training pipelines, increasing data handling capacity from 5,000 to 25,000 documents per hour.
Improved summary relevance and coherence, achieving a 35% increase in ROUGE-L F1 score compared to baseline extractive methods.
Enabled a 60% reduction in manual review workload for users, validated through user testing with a sample size of 500+ documents.

Workflows

Iterating in the follwing manner to make the complete workflow of package

Update config.yaml
Update params.yaml
Update entity
Update the configuration manager in src config
update the conponents
update the pipeline
update the main.py
update the app.py

How to run?

STEPS:

Clone the repository

https://github.com/arpitkumar2004/Text-Summarizer-Project

STEP 01- Create a virtual environment after opening the repository

python -m venv venv

venv\Scripts\activate

STEP 02- install the requirements

pip install -r requirements.txt

# Finally run the following command
python app.py

Now,

open up you local host and port

Author: Arpit Kumar
Email: kumararpit17773@gmail.com

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
.github/workflows		.github/workflows
config		config
research		research
src/textSummarizer		src/textSummarizer
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
app.py		app.py
main.py		main.py
params.yaml		params.yaml
requirements.txt		requirements.txt
setup.py		setup.py
template.py		template.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

End to end Text-Summarizer-Project

About the Project

Workflows

How to run?

STEPS:

STEP 01- Create a virtual environment after opening the repository

STEP 02- install the requirements

About

Uh oh!

Releases

Packages

Languages

License

arpitkumar2004/Text-Summarizer-Project

Folders and files

Latest commit

History

Repository files navigation

End to end Text-Summarizer-Project

About the Project

Workflows

How to run?

STEPS:

STEP 01- Create a virtual environment after opening the repository

STEP 02- install the requirements

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages