Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Elex 1235 create default aggregates #53

Merged
merged 11 commits into from
Jun 28, 2023

Conversation

lennybronner
Copy link
Collaborator

Description

We now set the default aggregates dynamically based on whether the election is statewide, in which case the default aggregate is postal_code or districtwide (e.g. house elections), in which case the default aggregate is (postal_code, district).

The second change is that we no longer hard-code postal_code as the main aggregate when generating the aggregate predictions. Previously we always passed just postal_code as the largest aggregate to the model, even when were were generating county_classification predictions for House races. This meant that we wouldn't be able to create county_classification predictions for each state, district, but only for each state. This is now resolved, since we use the default aggregate for that also.

Jira Ticket

https://arcpublishing.atlassian.net/browse/ELEX-1235

Test Steps

Added unit tests to run tox. To see the new functionality run this in develop:

elexmodel 2017-11-07_VA_G --estimands=dem --office_id=Y --geographic_unit_type=precinct-district --percent_reporting 10 --unexpected_units=10 --aggregates=postal_code --aggregates=district --aggregates=county_fips

The county_fips predictions do not take into account the district, since we are aggregating over postal_code, county_fips instead of postal_code, district, county_fips. If you run the same invocation in this branch, this will be resolved.

Note

The model now forces the user to input district for district wide races (ie. when office id is H, Y or Z since otherwise the model may break when dealing with unexpected units. This is because district was not in the passed in defaults (so we do not create a district column for the unexpected units) but it's expected as part of the default aggregates when creating the list to generate the aggregate predictions. Here is an example:

elexmodel 2017-11-07_VA_G --estimands=dem --office_id=Y --geographic_unit_type=precinct-district --percent_reporting 10 --unexpected_units=10 --aggregates=postal_code --aggregates=county_fips

This can likely be fixed, if we think that is necessary.

…lections). Also can now aggregate lower than district for districtwide elections
src/elexmodel/client.py Outdated Show resolved Hide resolved
@@ -142,7 +160,8 @@ def get_estimates(
column_values = current_data[0]
current_data = pd.DataFrame(current_data[1:], columns=column_values)
features = kwargs.get("features", [])
aggregates = kwargs.get("aggregates", ["postal_code", "unit"])
default_aggregates = self.get_default_aggregates(office)
aggregates = kwargs.get("aggregates", default_aggregates)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you feel about doing this in one line like:

aggregates = kwargs.get("aggregates", self.get_default_aggregates(office))

The reason I'm suggesting that is that if we have a new office that we want to include, we don't have to make a code change to add it to the defaults list, we can just include the aggregates arg.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done!

src/elexmodel/client.py Show resolved Hide resolved
@lennybronner lennybronner merged commit d46130c into develop Jun 28, 2023
5 checks passed
@lennybronner lennybronner deleted the elex-1235-create-default-aggregates branch June 28, 2023 01:50
@jchaskell jchaskell mentioned this pull request Aug 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants