tensorsum/pandas_test

pandas_test

Test for Airbnb

We want answers to the following questions:

(1) Using unique_host_id, how many unique hosts are there in Vancouver?
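A minimal sketch of question (1), assuming the listings are already loaded into a DataFrame that contains only Vancouver rows. The `listings` frame, its sample values, and the `listing_id` column are hypothetical; only `unique_host_id` comes from the task description.

```python
import pandas as pd

# Hypothetical sample data; the real listings file is not shown in this README.
listings = pd.DataFrame({
    "listing_id": [1, 2, 3, 4],
    "unique_host_id": ["h1", "h1", "h2", None],
})

# nunique() skips missing values by default (dropna=True),
# so a listing without a host does not count as an extra host.
n_hosts = listings["unique_host_id"].nunique()
print(n_hosts)
```

If the real data also covers other cities, the frame would need to be filtered on whatever city column it has before counting.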

(2) How many unique hosts are there by neighborhood (i.e. how many distinct unique_host_id values by neighborhood)?
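Question (2) could be answered with a grouped distinct count. This is a sketch only: the column name `neighborhood` and the sample values are assumptions, since the README does not give the actual schema.

```python
import pandas as pd

# Hypothetical sample data; "neighborhood" is an assumed column name.
listings = pd.DataFrame({
    "neighborhood": ["Kitsilano", "Kitsilano", "Kitsilano",
                     "Downtown", "Downtown", "Downtown"],
    "unique_host_id": ["h1", "h1", "h2", "h2", "h3", "h4"],
})

# Distinct hosts per neighborhood, largest first.
hosts_by_neighborhood = (
    listings.groupby("neighborhood")["unique_host_id"]
    .nunique()
    .sort_values(ascending=False)
)
print(hosts_by_neighborhood)
```

A host with listings in several neighborhoods is counted once in each of them, so the per-neighborhood counts can sum to more than the citywide total from question (1).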

(3) We will consider any two listings with the same unique_host_id or the same external_property_id to be managed by the same host. How many unique hosts are there? (Build a graph where the nodes are the listings and the edges are defined as having the same unique_host_id OR the same external_property_id then count the connected components.)
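One way to count the connected components in question (3) without building every pairwise edge is a union-find over the listings, merging all listings that share a key value. The sketch below is an illustration under assumptions, not necessarily the notebook's method: `count_host_clusters` and the sample data are hypothetical, and a missing key is assumed to link nothing.

```python
import pandas as pd

def count_host_clusters(df, keys=("unique_host_id", "external_property_id")):
    """Count connected components where listings sharing any key value are linked."""
    parent = list(range(len(df)))

    def find(i):
        # Follow parent pointers to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    # Row positions 0..n-1, aligned with the frame's index.
    positions = pd.Series(range(len(df)), index=df.index)
    for key in keys:
        # groupby drops missing keys by default, so listings without a
        # host (or without a property id) are not linked through that key.
        for _, members in positions.groupby(df[key]):
            rows = members.tolist()
            for other in rows[1:]:
                union(rows[0], other)

    return len({find(i) for i in range(len(df))})

# Hypothetical sample data.
listings = pd.DataFrame({
    "unique_host_id":       ["h1", "h1", "h2", "h3", None, "h4"],
    "external_property_id": ["p1", "p2", "p2", "p3", "p3", None],
})
n_clusters = count_host_clusters(listings)
print(n_clusters)
```

In this sample the first three listings chain together through `h1` and `p2`, the next two share `p3`, and the last listing stands alone.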

Please write this up as a jupyter notebook using pandas.

#########################

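Approach: treat each listing as a host->property link and split the listings into clusters, where two listings belong to the same cluster if they share a host OR a property.

The test data contained chained clusters and edge cases such as properties without a host and vice versa.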

Chained:

A->X (A=X)

B->Y (B=Y)

A->Y (A=Y=B=X): all in the same cluster
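The chained case above can be reproduced with a small pandas-only label propagation: each listing starts with its own label, and labels are repeatedly replaced by the minimum label within each host group and each property group until nothing changes. This is a sketch under assumptions. The `links` frame is hypothetical, an unrelated C->Z link is added for contrast, and no key is missing (missing keys would need separate handling).

```python
import pandas as pd

# The three chained links from the example, plus one unrelated link (C->Z).
links = pd.DataFrame({
    "unique_host_id":       ["A", "B", "A", "C"],
    "external_property_id": ["X", "Y", "Y", "Z"],
})

# Start with one label per listing, then merge labels until they are stable.
labels = pd.Series(range(len(links)), index=links.index)
changed = True
while changed:
    previous = labels
    for key in ("unique_host_id", "external_property_id"):
        # Every listing takes the smallest label found in its group.
        labels = labels.groupby(links[key]).transform("min")
    changed = not labels.equals(previous)

n_clusters = labels.nunique()
print(labels.tolist(), n_clusters)
```

A->X and A->Y merge through host A, and B->Y joins them through property Y, so the first three listings end up with one label while C->Z keeps its own.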
