Web Crawler REST API

A web crawler that fetches data from HTML pages, up to a provided depth and up to a maximum number of pages. The crawling service is exposed as REST endpoints. A web UI is also available to fill in the crawl request form, start the crawler, and download the statistics as a CSV file.
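The depth and page limits can be pictured as a bounded breadth-first traversal. The following is a minimal sketch under stated assumptions, not the project's actual implementation: the class `BoundedCrawlSketch`, the method `crawl`, and the `linksOf` function (a stand-in for fetching a page and extracting its links) are hypothetical names, the seed page is assumed to count as depth 0, and the in-memory link graph exists only so the example runs without network access.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class BoundedCrawlSketch {

    /**
     * Visits pages breadth-first from the seed, stopping once maxPages pages
     * have been visited. Links are not followed from pages at maxDepth.
     * linksOf is a hypothetical stand-in for "fetch the page, extract links".
     */
    static List<String> crawl(String seed, int maxDepth, int maxPages,
                              Function<String, List<String>> linksOf) {
        List<String> visited = new ArrayList<>();
        Map<String, Integer> depthOf = new HashMap<>(); // also marks URLs as seen
        Deque<String> queue = new ArrayDeque<>();
        depthOf.put(seed, 0); // assumption: the seed is depth 0
        queue.add(seed);

        while (!queue.isEmpty() && visited.size() < maxPages) {
            String url = queue.poll();
            visited.add(url);
            int depth = depthOf.get(url);
            if (depth >= maxDepth) {
                continue; // depth limit reached: do not expand this page
            }
            for (String link : linksOf.apply(url)) {
                // putIfAbsent returns null only the first time a URL is seen
                if (depthOf.putIfAbsent(link, depth + 1) == null) {
                    queue.add(link);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // Toy link graph used instead of real HTTP requests.
        Map<String, List<String>> graph = Map.of(
            "a", List.of("b", "c"),
            "b", List.of("d"),
            "c", List.of("d"),
            "d", List.of("e"));
        Function<String, List<String>> linksOf =
            url -> graph.getOrDefault(url, List.of());

        System.out.println(crawl("a", 1, 10, linksOf)); // prints [a, b, c]
        System.out.println(crawl("a", 3, 2, linksOf));  // prints [a, b]
    }
}
```

Breadth-first order is chosen here because it visits pages closest to the seed first, so a page limit cuts off the deepest pages rather than an arbitrary branch. The real crawler may order or count pages differently.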

Once the application is started, the web UI is available at http://localhost:8080/main or http://localhost:8080/

The API supports the following operations:

Start web crawler - POST - /api/webcrawler
Get all records in JSON format - GET - /api/webcrawler
Get n records in JSON format - GET - /api/webcrawler/{count}
Get all records in CSV format - GET - /api/webcrawler/csv
Get n sorted records in CSV format - GET - /api/webcrawler/csv/{count}
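As an illustration of calling these endpoints from code, here is a minimal sketch using the JDK's built-in `java.net.http` client (Java 11+). The endpoint paths come from the list above. Everything else is an assumption: the class and method names are hypothetical, the API is assumed to be served from the same host and port as the web UI, and the POST body is passed in by the caller because the request fields are not documented here. Sending it as JSON is a guess.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class WebCrawlerApiClientSketch {

    // Assumption: the API shares host and port with the web UI.
    static final String BASE = "http://localhost:8080/api/webcrawler";

    /** GET /api/webcrawler : all records as JSON. */
    static HttpRequest allRecordsJson() {
        return HttpRequest.newBuilder(URI.create(BASE)).GET().build();
    }

    /** GET /api/webcrawler/{count} : n records as JSON. */
    static HttpRequest recordsJson(int count) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + count)).GET().build();
    }

    /** GET /api/webcrawler/csv : all records as CSV. */
    static HttpRequest allRecordsCsv() {
        return HttpRequest.newBuilder(URI.create(BASE + "/csv")).GET().build();
    }

    /** GET /api/webcrawler/csv/{count} : n sorted records as CSV. */
    static HttpRequest recordsCsv(int count) {
        return HttpRequest.newBuilder(URI.create(BASE + "/csv/" + count)).GET().build();
    }

    /**
     * POST /api/webcrawler : start a crawl. The body format is not specified
     * in this README, so the caller supplies it; JSON is an assumption.
     */
    static HttpRequest startCrawl(String body) {
        return HttpRequest.newBuilder(URI.create(BASE))
            .header("Content-Type", "application/json") // assumed content type
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    /** Sends the request and returns the response body as text. */
    static String send(HttpRequest request) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

With the application running, `WebCrawlerApiClientSketch.send(WebCrawlerApiClientSketch.recordsCsv(10))` would return the CSV text for ten records, provided the assumptions above hold.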

Link to Postman test data samples

Installation

Download the code as a ZIP archive or clone the repository with git clone https://github.com/hmurij/Web-Crawler.git. Then either import it as an existing Maven project and run com.webcrawler.WebCrawlerApplication.java, or start the application by executing startCrawler.bat. Note that webCrawler.jar must be in the same directory as the bat file.
