dtuit/twitter_scraper

Twitter_Scraper

NOTE: currently under development

Scrapes tweets from twitter.com and inserts them into a SQL Server database.
Uses Celery, the asynchronous task queue, as a framework.
Tested on Ubuntu 14.04 with Python 3.4.

Install requirements

  • Python
  • Celery
    • pip install Celery
  • pymssql
    • sudo apt-get install freetds-dev freetds-bin
    • pip install pymssql
  • requests
  • lxml
    • sudo apt-get install python3-lxml
  • cssselect
    • pip install cssselect
  • RabbitMQ
    • sudo apt-get install rabbitmq-server

Create a keys.json file containing the SQL Server connection parameters:

{
    "server":  "SERVER.database.windows.net",
    "user": "USER@SERVER",
    "password": "password",
    "database": "databasename"
}
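
A minimal sketch of how keys.json could be read and handed to pymssql. The helper names `load_keys` and `connect` are hypothetical and do not come from this repository; the scraper itself may load the file differently.

```python
import json

# The four fields shown in the keys.json example above.
REQUIRED_KEYS = ("server", "user", "password", "database")


def load_keys(path="keys.json"):
    """Read the connection parameters and check that all four are present."""
    with open(path, encoding="utf-8") as f:
        keys = json.load(f)
    missing = [k for k in REQUIRED_KEYS if k not in keys]
    if missing:
        raise KeyError("keys.json is missing: " + ", ".join(missing))
    return keys


def connect(keys):
    """Open a SQL Server connection from the loaded parameters (hypothetical helper)."""
    # Imported here so load_keys can be used without the driver installed.
    import pymssql

    return pymssql.connect(
        server=keys["server"],
        user=keys["user"],
        password=keys["password"],
        database=keys["database"],
    )
```

Calling `connect(load_keys())` from the directory containing keys.json would then return a pymssql connection. Since the file holds a password, it is probably best kept out of version control.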

Note: use the --recursive option when cloning to also clone the submodule.
