potential performance improvements #377

Open
bstabler opened this issue Feb 4, 2021 · 6 comments
Labels
Performance Changes that improve performance

Comments

@bstabler
Contributor

bstabler commented Feb 4, 2021

This issue is for keeping track of potential performance improvement ideas:

  • reduce expression solving time and memory needs by better handling string data as pandas categorical data
  • improve parallelization by taking advantage of updates to Python 3’s multiprocessing library
  • continue to improve chunksize calculations for more optimized multiprocessing setups
  • review ct-ramp and daysim performance ideas

Please add other ideas, thanks
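To illustrate the first idea above, here is a minimal sketch of how much memory a low-cardinality string column can save when stored as a pandas categorical. The column name `tour_mode` and the mode labels are made up for the example and are not taken from the ActivitySim code base.

```python
import numpy as np
import pandas as pd

# Hypothetical data: many rows, only a handful of distinct string values.
rng = np.random.default_rng(0)
modes = np.array(["DRIVEALONE", "SHARED2", "WALK", "BIKE", "TRANSIT"])
trips = pd.DataFrame({"tour_mode": rng.choice(modes, size=100_000)})

# Same values, two storage strategies.
as_object = trips["tour_mode"].astype(object)        # one Python str per row
as_category = trips["tour_mode"].astype("category")  # small integer codes + lookup table

bytes_object = as_object.memory_usage(deep=True)
bytes_category = as_category.memory_usage(deep=True)
print(f"object: {bytes_object:,} bytes, category: {bytes_category:,} bytes")

# Comparisons give the same answer either way.
is_walk_object = as_object == "WALK"
is_walk_category = as_category == "WALK"
```

How much this helps expression solving in practice would depend on how the expressions use the column, so it would need profiling on a real model.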

@jpn--
Member

jpn-- commented Feb 5, 2021

A configuration file switch that can disable trip-level processing for tours based on tour mode. That way you can skip stop_frequency, trip_purpose, trip_destination, trip_scheduling, and trip_mode_choice for walk and bike tours if you don't care about those trips (e.g. inside a global feedback loop iteration, I don't care about walk or bike trips, as they don't impact congestion).

Bonus points: the ability to easily flip the switch the other way and restart only the filtered tours (e.g. once I have finished all my global feedback loops and want those non-motorized trips back).
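A rough sketch of what such a switch could look like. The setting name `skip_trip_models_for_tour_modes` and the helper `split_tours_for_trip_models` are hypothetical, invented for this example; nothing here is an existing ActivitySim setting or function.

```python
import pandas as pd

# Hypothetical setting, e.g. loaded from a YAML config file.
settings = {"skip_trip_models_for_tour_modes": ["WALK", "BIKE"]}

tours = pd.DataFrame(
    {"tour_id": [1, 2, 3, 4], "tour_mode": ["DRIVEALONE", "WALK", "BIKE", "TRANSIT"]}
).set_index("tour_id")


def split_tours_for_trip_models(tours, settings):
    """Return (active, deferred) tours based on the hypothetical skip list."""
    skip_modes = settings.get("skip_trip_models_for_tour_modes", [])
    deferred = tours["tour_mode"].isin(skip_modes)
    return tours[~deferred], tours[deferred]


# Inside the feedback loop: only motorized tours go through the trip models.
active_tours, deferred_tours = split_tours_for_trip_models(tours, settings)

# After the last iteration: the deferred tours are the ones to restart on.
print(list(active_tours.index), list(deferred_tours.index))
```

Keeping the deferred tours as a separate table is one way to get the "bonus points" behavior, since the trip models could later be run on just that table.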

@bstabler
Contributor Author

@stefancoe - add reading skim data from disk on-demand as opposed to reading every skim into RAM at the start as a way to trade runtime for RAM. @toliwaga implemented an undocumented version of this during the TVPB caching research and it runs slower but uses a lot less RAM. We may want to complete this feature for general use.
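As a generic illustration of the runtime-for-RAM trade (not the undocumented implementation mentioned above, which is not shown in this thread), a skim matrix can be memory-mapped with NumPy so that only the cells actually looked up are read from disk. The file name, matrix size, and zone indexes are made up for the example.

```python
import os
import tempfile

import numpy as np

# Hypothetical skim: a zone-to-zone matrix written to disk once.
n_zones = 200
rng = np.random.default_rng(0)
skim = rng.random((n_zones, n_zones), dtype=np.float32)

path = os.path.join(tempfile.mkdtemp(), "skim.npy")
np.save(path, skim)

# Option A: read the whole matrix into RAM up front.
in_ram = np.load(path)

# Option B: memory-map the file; data is paged in only when indexed.
on_disk = np.load(path, mmap_mode="r")

# Look up a few origin-destination pairs with both approaches.
orig = np.array([0, 5, 17])
dest = np.array([3, 5, 199])
values_ram = in_ram[orig, dest]
values_disk = np.asarray(on_disk[orig, dest])
```

Both lookups return the same values; the memory-mapped version avoids holding the full matrix in process memory at the cost of disk reads on access.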

@bstabler
Contributor Author

bstabler commented Apr 27, 2021

Some more ideas from discussions with SANDAG:

  • Move from strings to factors
  • Exponentiate TAP to TAP utilities ahead of time, and pre-compute access/egress costs
  • Smarter binary search / picking of an alternative from a large choice set (such as for location choice)
  • Make trip destination (i.e. intermediate stop location choice) aware of the tour mode so that:
    o For bike, walk, and transit, the set of possible MAZs is reduced ahead of time
    o For auto, TAZ to TAZ total utilities are pre-computed to avoid duplicate calculations
  • Smarter chunking calculations to get more throughput (see "Improve RAM/chunking settings", #406)
  • Continued expression review/tidying up to reduce redundancy of calculations (i.e. optimization of written expressions)
  • Buy a bigger / faster server and test ahead of time in the cloud what’s possible with respect to runtime reductions
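Two of the ideas above (exponentiating ahead of time, and binary search over a large choice set) can be sketched together. This is a simplified illustration that assumes every chooser shares the same utilities, which is generally not true for location choice; the array sizes and random utilities are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
n_alts = 10_000      # hypothetical size of the choice set
n_choosers = 1_000   # hypothetical number of choosers

# Assumption: one utility vector shared by all choosers.
utilities = rng.normal(size=n_alts)

# Exponentiate once, ahead of time (shifted by the max for numerical stability).
exp_utils = np.exp(utilities - utilities.max())

# Cumulative probabilities, computed once and reused for every chooser.
cum_probs = np.cumsum(exp_utils) / exp_utils.sum()
cum_probs[-1] = 1.0  # guard against floating-point round-off

# One uniform draw per chooser, then a binary search per draw:
# O(log n_alts) per chooser instead of a linear scan.
draws = rng.random(n_choosers)
choices = np.searchsorted(cum_probs, draws, side="right")
```

`np.searchsorted` with `side="right"` returns the first alternative whose cumulative probability exceeds the draw, which is the same alternative a linear scan would pick.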

@stefancoe
Contributor

Some good ideas here to increase pandas performance. The Pandas eval function looks interesting. Could it replace/substitute python eval in some cases?
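For reference, a minimal sketch of the difference being asked about: the same string expression evaluated with Python's built-in `eval` and with `pandas.eval`. The column names `ivt` and `dist` and the expression are made up; this is not an ActivitySim expression.

```python
import pandas as pd

# Hypothetical columns.
df = pd.DataFrame({"dist": [1.0, 2.5, 4.0], "ivt": [3.0, 6.0, 9.0]})
variables = {"ivt": df["ivt"], "dist": df["dist"]}
expression = "ivt + 2 * dist"

# Built-in eval: Python evaluates each operator, creating a temporary per step.
via_python = eval(expression, {}, variables)

# pandas.eval: pandas parses the expression itself and can hand it to
# numexpr when that package is installed, falling back to Python otherwise.
via_pandas = pd.eval(expression, local_dict=variables)

print(via_pandas.tolist())
```

Any speedup from `pandas.eval` tends to show up only on large frames, and it supports a restricted expression syntax, so it would not be a drop-in replacement for every expression.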

@jpn--
Member

jpn-- commented Apr 27, 2021

Could it replace/substitute python eval in some cases?

Not that we couldn't do it more, but we're already using pandas.eval in several places, for example:

@stefancoe
Contributor

Oh, good to know. Thanks for pointing that out!

@jfdman jfdman added the Performance Changes that improve performance label Dec 9, 2021
4 participants