prisms-air-quality-modeling

In this project, we build an accurate fine-scale air quality prediction model. The approach is described in the paper "Mining Public Datasets for Modeling Intra-City PM2.5 Concentrations at a Fine Spatial Resolution".

Data Source

Air Quality Data

We collect air quality data, including O3, PM2.5, PM10, CO, NO2, and SO2 concentration/AQI observations, from the monitoring stations in Los Angeles County through the EPA’s AirNow web service. The air quality table (los_angeles_air_quality) has been initialized in the JonSnow database (SQL code). We query the web service automatically every hour (Python code) using crontab (Appendix II).
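As a rough illustration of the hourly query, the sketch below builds a request URL for AirNow's current-observations-by-location endpoint and flattens one returned observation into a row. The endpoint path, the response field names, and the row layout are assumptions made for this example; they are not taken from the project's Python code or from the los_angeles_air_quality schema.

```python
from urllib.parse import urlencode

# Assumed endpoint: AirNow current observations by latitude/longitude.
AIRNOW_URL = "https://www.airnowapi.org/aq/observation/latLong/current/"


def build_airnow_url(lat, lon, api_key, distance_miles=25):
    """Build the query URL for one location (parameter names are assumptions)."""
    params = {
        "format": "application/json",
        "latitude": lat,
        "longitude": lon,
        "distance": distance_miles,
        "API_KEY": api_key,
    }
    return AIRNOW_URL + "?" + urlencode(params)


def to_row(obs):
    """Flatten one observation dict into a tuple (hypothetical column order)."""
    return (
        obs["ReportingArea"],
        obs["DateObserved"].strip(),  # strip stray whitespace around the date
        obs["HourObserved"],
        obs["ParameterName"],
        obs["AQI"],
    )
```

A script scheduled every hour would fetch the URL, decode the JSON list, and insert one row per observation.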

Weather Data

We collect meteorological data through the Dark Sky API. The weather table (los_angeles_meteorology) has been initialized in the JonSnow database (SQL code). We query the web service automatically every hour for a given location (or for the sensor locations) (Python code), scheduled with crontab in the same way as the air quality query. We can also query historical data for a given time range (Python code).
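For the time-range query, one possible approach is to issue one request per day between the start and end times. The sketch below only builds the request URLs; it assumes the Dark Sky Time Machine URL pattern (key, then latitude, longitude, and a UNIX timestamp), and the function name is hypothetical rather than taken from the project's Python code.

```python
from datetime import datetime, timedelta, timezone


def time_machine_urls(api_key, lat, lon, start, end):
    """Return one request URL per day from start to end (inclusive).

    start and end should be timezone-aware datetimes; naive datetimes
    would be interpreted in the machine's local time zone.
    """
    urls = []
    day = start
    while day <= end:
        ts = int(day.timestamp())  # UNIX timestamp for this day
        urls.append(
            f"https://api.darksky.net/forecast/{api_key}/{lat},{lon},{ts}"
        )
        day += timedelta(days=1)
    return urls
```

For example, a range from 1 January to 3 January yields three URLs, one for each day.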

Geographic Data

We use OpenStreetMap to generate geographic features for our model. For a given location, the script creates buffers around it (by default with radii from 100 m to 3000 m at 100 m intervals) and computes the intersected area, length, or count between those buffers and the various geographic categories in the OpenStreetMap data (see figure below). (Python code)
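The buffer idea can be illustrated with the simplest of the three feature types, the count. The sketch below counts point features within each buffer radius around a location. It is a simplified stand-in, not the project's code: it handles point features only, uses great-circle (haversine) distance instead of polygon intersection, and the function names are hypothetical.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def buffer_counts(location, points, min_r=100, max_r=3000, step=100):
    """Count the points that fall inside each buffer radius (in meters)."""
    lat, lon = location
    dists = [haversine_m(lat, lon, plat, plon) for plat, plon in points]
    return {
        r: sum(d <= r for d in dists)
        for r in range(min_r, max_r + step, step)
    }
```

With the default settings this produces 30 count features per geographic category, one for each radius. Area and length features would need a geometry library to intersect polygons and lines with each buffer.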

Other data sources

Purple Air

We collect data from Purple Air by querying the Purple Air web service. Each sensor updates its air quality data roughly every minute, including PM2.5, PM10, PM1, temperature, and humidity. Each machine has two channels (A and B) at the same location, so that if one channel produces noisy readings, the other can still work properly. Each unique "sensor" has three ID numbers:

id - Each sensor (channel A or B) has its own unique id.
sensor_id - Each sensor (channel A or B) has its own unique sensor_id.
parent_id - Each channel A sensor has a unique parent_id; parent_id = null indicates a channel B sensor. If the id of a channel B sensor equals the parent_id of a channel A sensor, the two sensors share the same machine and location.
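The pairing rule for the three IDs can be sketched as follows. The dictionary layout of each sensor record and the function name are assumptions for illustration; only the id and parent_id semantics come from the description above.

```python
def pair_channels(sensors):
    """Return (channel_a_id, channel_b_id) pairs that share one machine.

    Per the rule above: a channel A record carries a parent_id, a channel B
    record has parent_id = None, and the two match when B's id equals A's
    parent_id.
    """
    by_id = {s["id"]: s for s in sensors}
    pairs = []
    for s in sensors:
        parent = s.get("parent_id")
        if parent is not None and parent in by_id:
            pairs.append((s["id"], parent))
    return pairs
```

Records whose parent_id points at an id that is not in the input are left unpaired rather than raising an error.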

Fishnet Data

A grid of points over Los Angeles County (around 3000 points), used for fine-scale prediction.
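A fishnet of this kind can be generated as the cell centers of a regular grid over a bounding box. The sketch below shows one way to do that; the function name, the bounding box in the usage note, and the grid dimensions are all hypothetical and do not describe how the project's fishnet was actually built.

```python
def fishnet(min_lat, min_lon, max_lat, max_lon, n_rows, n_cols):
    """Return the (lat, lon) center of every cell in an n_rows x n_cols grid."""
    dlat = (max_lat - min_lat) / n_rows
    dlon = (max_lon - min_lon) / n_cols
    points = []
    for i in range(n_rows):
        for j in range(n_cols):
            points.append(
                (min_lat + (i + 0.5) * dlat, min_lon + (j + 0.5) * dlon)
            )
    return points
```

For instance, `fishnet(33.7, -118.7, 34.8, -117.6, 50, 60)` returns 3000 points over an illustrative box. A real fishnet would also drop the points that fall outside the county boundary.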

Algorithm

Edit configuration in config.json.

High Level Architecture

Model Evaluation

Cross Validation: Run CrossValidation.scala to evaluate the model on its own dataset.
Validation: Run Validation.scala to evaluate the model on another dataset.
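The splitting scheme used by CrossValidation.scala is not described here. As a generic illustration only, the sketch below shows one common way to cross-validate a spatial model: split the monitoring stations into k folds so that each station is held out exactly once. The function name and the choice of k-fold splitting by station are assumptions, not the project's method.

```python
def k_fold_by_station(station_ids, k=5):
    """Split unique station ids into k (train, test) folds.

    Every station appears in the test set of exactly one fold.
    """
    stations = sorted(set(station_ids))
    folds = []
    for i in range(k):
        test = set(stations[i::k])  # every k-th station, starting at i
        train = [s for s in stations if s not in test]
        folds.append((train, sorted(test)))
    return folds
```

Holding out whole stations, rather than individual readings, tests how well the model predicts at locations it has never seen.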

Fishnet Prediction

Run FishnetPrediction.scala to get the prediction results for the fishnet, either for the current time or for a given time range.

Appendix

I. Access JonSnow Database

You need the username and password for both the JonSnow server and the database.

Log in to the server with the server username and password:

ssh -L [your local port]:localhost:5432 [your username]@jonsnow.usc.edu

Use Postico (Mac only) or pgAdmin to log in with the database username and password, as shown in the figure below. [port] is [your local port].

II. crontab

List the current crontab entries

crontab -l

Edit the user crontab file

crontab -e

Restart the cron service to apply crontab changes

sudo /bin/systemctl restart crond.service
