Data Warehouse and Data Lake Systems Design

Data Warehouse and Data Lake Systems Design - An Application for Searching Tourism Attractions in Switzerland

The project aims to build a travel data warehouse and develop an insight dashboard that provides information about the most popular attractions and destinations in Switzerland.

Database Architecture

The goal of this database is to support a user interface that helps travelers make decisions and to show what impact social media and influencers have.

The figure below shows the architecture of the database. The data sources consist of 3 APIs and 2 datasheets. Apart from static data such as Swiss cities, cantons, destinations, and geographic location information, the data are dynamic and are fetched via the Instagram Graph API and the pytrends API. The data lake is built on Amazon Relational Database Service (RDS). Apache Airflow runs the data pipeline processes automatically, after which all required datasets are stored in the data lake.
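As a rough illustration of the dynamic-data fetch described above, the sketch below queries Google Trends through pytrends for a list of city names. The function names `batches` and `fetch_city_trends`, the timeframe, and the `geo="CH"` filter are assumptions made for illustration and are not taken from the project's code. The batching reflects the Google Trends limit of five keywords per request.

```python
from typing import Iterator, List


def batches(items: List[str], size: int = 5) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_city_trends(cities: List[str], timeframe: str = "today 3-m") -> list:
    """Return one interest-over-time DataFrame per batch of cities.

    Hypothetical helper: needs the third-party `pytrends` package and
    network access, so it is defined here but not called.
    """
    # Imported lazily so the module loads even without pytrends installed.
    from pytrends.request import TrendReq

    # hl/tz are the example values from the pytrends README, not project settings.
    client = TrendReq(hl="en-US", tz=360)
    frames = []
    for group in batches(cities):
        # geo="CH" restricts results to searches made in Switzerland (assumption).
        client.build_payload(group, timeframe=timeframe, geo="CH")
        frames.append(client.interest_over_time())
    return frames
```

Google Trends scales values relative to the keywords in each request, so figures from different batches are not directly comparable without a shared reference keyword.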

Traveler-insight Dashboard

The results of data analysis are visualized and presented as follows.

The interactive dashboard is available at the URL below. https://public.tableau.com/app/profile/yang7231/viz/TravelerInsightDashboard/TravelerInsightDashboard

Description of Files

trends_all.py : Airflow DAG file that fetches the Google Trends data of all cities in Switzerland and the trends of all hashtags from Instagram.

upload_city_list_to RDS_PostgreSQL.py : Python script that uploads the datasheet to AWS RDS.
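The upload step described above could look roughly like the following minimal sketch. It is not the project's script: the table name `city_list`, the `city` and `canton` columns, and the sample rows are hypothetical. An in-memory SQLite database stands in for the PostgreSQL instance on RDS so the example runs without credentials.

```python
import csv
import io
import sqlite3
from typing import Iterable, Iterator, TextIO, Tuple


def read_city_rows(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Parse a datasheet with (hypothetical) `city` and `canton` columns."""
    for record in csv.DictReader(handle):
        yield record["city"].strip(), record["canton"].strip()


def upload_cities(conn: sqlite3.Connection, rows: Iterable[Tuple[str, str]]) -> int:
    """Create the table if needed, insert the rows, return how many were inserted."""
    rows = list(rows)
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("CREATE TABLE IF NOT EXISTS city_list (city TEXT, canton TEXT)")
        conn.executemany("INSERT INTO city_list (city, canton) VALUES (?, ?)", rows)
    return len(rows)


# Illustrative two-row datasheet and an in-memory stand-in for the RDS database.
sample = io.StringIO("city,canton\nLuzern,Luzern\nZermatt,Valais\n")
connection = sqlite3.connect(":memory:")
inserted = upload_cities(connection, read_city_rows(sample))
```

Against a real PostgreSQL database, the connection would come from a PostgreSQL driver instead of `sqlite3`, and the parameter placeholder style would change accordingly (for example `%s` rather than `?` with psycopg2).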