


riodevnet/pyhttrack


🕸️ PyHttrack — Mirror Your Favorite Web to Your Computer!

PyHttrack is a lightweight Python tool that downloads entire websites to your local computer for offline access, archiving, or content analysis. Inspired by HTTrack, PyHttrack takes a modern approach, is easy to customize, and can be integrated into automated workflows.


🔍 Top Features:

  • 🌐 Download Full Websites - HTML, CSS, JS, images, and other media are saved to a local directory.
  • ⚙️ Flexible Configuration - Specify crawl depth, file extensions, domain limits, and more.
  • 🖥️ Simple CLI Interface - Run and monitor downloads with easy-to-understand commands.
  • 📁 Organized Directory Structure - Keeps the original structure of the site so the offline copy matches the original.
  • 🧩 Easy to Customize - Suitable for developers, researchers, and digital archivists.
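
To illustrate what "keeps the original structure of the site" can mean in practice, here is a minimal sketch of mapping a URL to a local file path. It is not taken from PyHttrack's source: the function name `local_path_for`, the `mirror` root folder, and the `index.html` convention are all assumptions made for this example.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def local_path_for(url, root="mirror"):
    """Map a URL to a relative file path that mirrors the site's layout.

    Hypothetical helper for illustration; PyHttrack may lay files out differently.
    """
    parts = urlparse(url)
    path = parts.path
    # Directory-style URLs (empty path or trailing slash) get an index.html file.
    if not path or path.endswith("/"):
        path += "index.html"
    # Drop empty, "." and ".." segments so the result stays inside root.
    segments = [s for s in path.split("/") if s not in ("", ".", "..")]
    # The query string and fragment are ignored in this sketch.
    return str(PurePosixPath(root, parts.netloc, *segments))
```

With this mapping, `https://example.com/css/site.css` is stored at `mirror/example.com/css/site.css`, so relative links between pages and assets keep working offline.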

🛠️ Use Cases:

  • Save important site documentation before it goes offline
  • Run SEO crawls and analysis locally
  • Learn how sites are built from real examples
  • Back up personal content or public blogs

🚀 Get Started

Installation

pip install -r requirements.txt

Configuration

Edit the web.json file and add the URL of the website you want to download, for example:

["https://example.com/xxx/xxx"]

To download several websites, list each URL in the array:

[
  "https://example.com/xxx/xxx",
  "https://example.com/xxx/xxx",
  "https://example.com/xxx/xxx"
]
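
Because web.json is a plain JSON array of URL strings, it can be checked before a long download starts. The following is a minimal sketch of such a check; `load_urls` is a hypothetical helper written for this example and is not part of PyHttrack.

```python
import json
from pathlib import Path
from urllib.parse import urlparse


def load_urls(config_path="web.json"):
    """Read a JSON array of URLs and return it as a list of strings.

    Hypothetical helper; raises ValueError if the file has another shape.
    """
    data = json.loads(Path(config_path).read_text(encoding="utf-8"))
    if not isinstance(data, list):
        raise ValueError("web.json must contain a JSON array of URLs")
    for entry in data:
        if not isinstance(entry, str):
            raise ValueError(f"not a string: {entry!r}")
        parts = urlparse(entry)
        # Accept only absolute http(s) URLs with a host name.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not an http(s) URL: {entry!r}")
    return data
```

Calling `load_urls()` from the project folder returns the configured URLs, or raises a `ValueError` that names the offending entry.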

Start Download

Run the following command to start the download:

python pyhttrack.py
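
For automated workflows, the two steps above (write web.json, then run the script) can be driven from another Python program. This is only a sketch under stated assumptions: `write_url_list` and `run_pyhttrack` are hypothetical names, and the code assumes pyhttrack.py reads web.json from the folder it is run in, as the instructions above suggest.

```python
import json
import subprocess
import sys
from pathlib import Path


def write_url_list(urls, project_dir="."):
    """Write the URLs to web.json as a JSON array and return the file path."""
    config_path = Path(project_dir) / "web.json"
    config_path.write_text(json.dumps(list(urls), indent=2), encoding="utf-8")
    return config_path


def run_pyhttrack(project_dir="."):
    """Run `python pyhttrack.py` inside project_dir and return its exit code."""
    # sys.executable reuses the interpreter that runs this script.
    result = subprocess.run([sys.executable, "pyhttrack.py"], cwd=project_dir)
    return result.returncode
```

A scheduled job could call `write_url_list(["https://example.com/xxx/xxx"])` followed by `run_pyhttrack()` and treat a non-zero return value as a failed run.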

📥 Latest Release

Click here to get the latest version of PyHttrack.

🤝 Contribution

Contributions are very welcome! Feel free to fork this repo, open an issue, or submit a pull request for new features or performance improvements 🚀
