🚀 High-performance server for concurrent fetching from multiple web resources 🌐, built with Python's 🐍 async capabilities.

ddoroshev/multifetcher

multifetcher

Multifetcher is an HTTP server that makes parallel requests to external resources on behalf of a client. It is lightweight and uses Python's asynchronous I/O to fetch data from multiple URLs concurrently rather than one after another.

Features

  • Asynchronous HTTP requests to external resources
  • Dockerized application for easy setup and isolation
  • HTTP POST API to receive multiple request details
  • Streamed responses for real-time results
  • Timeout handling for each request
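
The features above describe a fan-out pattern: start every request at once, bound each one with its own timeout, and emit each result as soon as it is ready. The sketch below is not Multifetcher's actual code. It illustrates the pattern with the standard library only, and every name in it (`fake_fetch`, `fetch_one`, `stream_results`, `collect`, the `delay` field, and the `error` field) is a hypothetical chosen for the example. `fake_fetch` sleeps instead of doing network I/O so the block runs anywhere.

```python
import asyncio
import json


async def fake_fetch(url: str, delay: float) -> str:
    """Hypothetical stand-in for a real HTTP call: sleeps, then returns a body."""
    await asyncio.sleep(delay)
    return f"body of {url}"


async def fetch_one(req: dict, timeout: float) -> dict:
    """Run one request under its own timeout and never raise to the caller."""
    try:
        body = await asyncio.wait_for(fake_fetch(req["url"], req["delay"]), timeout)
        return {"id": req["id"], "url": req["url"], "response": body}
    except asyncio.TimeoutError:
        # Assumption: this "error" field is illustrative, not documented output.
        return {"id": req["id"], "url": req["url"], "error": "timeout"}


async def stream_results(requests: list, timeout: float):
    """Yield one JSON line per request, in completion order."""
    tasks = [asyncio.create_task(fetch_one(r, timeout)) for r in requests]
    for finished in asyncio.as_completed(tasks):
        yield json.dumps(await finished)


async def collect(requests: list, timeout: float) -> list:
    """Gather the streamed lines into a list (handy for demos and tests)."""
    return [line async for line in stream_results(requests, timeout)]


demo_requests = [
    {"id": "1", "url": "https://example.com/slow", "delay": 0.2},
    {"id": "2", "url": "https://example.com/fast", "delay": 0.01},
    {"id": "3", "url": "https://example.com/stuck", "delay": 5.0},
]

if __name__ == "__main__":
    for line in asyncio.run(collect(demo_requests, timeout=0.5)):
        print(line)
```

Because results are yielded in completion order, the fast request is emitted first and the stuck one is reported last, once its timeout expires. A slow resource therefore does not hold back the others.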

Getting Started

These instructions will get you a copy of the project up and running on your local machine.

Prerequisites

  • Docker

Installation & Running

  1. Clone this repository:
git clone https://github.com/ddoroshev/multifetcher.git
cd multifetcher
  2. Build the Docker image:
docker build -t multifetcher .
  3. Run the Docker container:
docker run -d -p 8000:8000 multifetcher

The server is now running at http://localhost:8000.

Usage

Multifetcher listens for POST requests at its root URL. The request body must be a JSON array of objects, each describing one HTTP request to make: an id, a method, a url, and optionally headers. An example POST request body might look like this:

[
    {
        "id": "1",
        "method": "GET",
        "url": "https://google.com"
    },
    {
        "id": "2",
        "method": "GET",
        "headers": {"Cookie": "foo=bar"},
        "url": "https://yandex.ru"
    },
    {
        "id": "3",
        "method": "GET",
        "headers": {"Foo": "Bar"},
        "url": "https://httpbin.org/json"
    }
]
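
A body like the one above can be assembled from Python instead of written by hand. The following is a minimal sketch using only the standard library. The helper names `make_request`, `build_body`, `build_post`, and `send` are hypothetical, the address `http://localhost:8000/` assumes the Docker command from the installation steps, and the `Content-Type` header is an assumption, since the README does not say whether the server requires it.

```python
import json
import urllib.request


def make_request(req_id, url, method="GET", headers=None):
    """Build one request object with the fields shown in the example body."""
    item = {"id": req_id, "method": method, "url": url}
    if headers:
        item["headers"] = headers
    return item


def build_body(items):
    """Serialize the list of request objects to UTF-8 encoded JSON."""
    return json.dumps(items).encode("utf-8")


def build_post(server_url, items):
    """Prepare (but do not send) the POST request for the server."""
    return urllib.request.Request(
        server_url,
        data=build_body(items),
        # Assumption: the server may not need this header; set for clarity.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(post):
    """Send the request and return the raw response lines (needs a running server)."""
    with urllib.request.urlopen(post) as resp:
        return [raw.decode("utf-8").rstrip("\n") for raw in resp]


items = [
    make_request("1", "https://google.com"),
    make_request("2", "https://yandex.ru", headers={"Cookie": "foo=bar"}),
    make_request("3", "https://httpbin.org/json", headers={"Foo": "Bar"}),
]
post = build_post("http://localhost:8000/", items)
```

With the container running, `send(post)` would perform the call. Nothing is sent until then, so the payload can be inspected or logged first.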

The server responds with a stream of newline-separated JSON objects. Each object holds the response to one of the submitted requests and repeats that request's id and url so the client can match them up:

{"id": "1", "url": "https://google.com", "response": "<!doctype html><html itemscope=\"\"<...>"}
{"id": "2", "url": "https://yandex.ru", "response": "<!DOCTYPE html><html class=\"i-ua_js_<...>"}
{"id": "3", "url": "https://httpbin.org/json", "response": "{\n  \"slideshow\": {\n<...>"}

Testing

You can test Multifetcher by running the provided test.py script:

python test.py

This script sends a series of test HTTP requests to the server and prints the responses.

License

This project is licensed under the MIT License. See the LICENSE file for details.
