
Remove GEOS dependency in favor of Shapely #2083

Merged: 10 commits merged into SciTools:main on Jul 14, 2023

Conversation

@greglucas (Contributor)

This removes the GEOS dependency and pushes the geometry handling into Shapely. It currently takes ~2x longer, so I'm opening this for discussion.

ping: @QuLogic, @jorisvandenbossche

For simple benchmarking I've been making the project_linear benchmark executable for quick runs by adding this to the bottom:

if __name__ == "__main__":
    s = Suite()
    p, r = "PlateCarree", "50m"
    s.setup(p, r)
    s.time_project_linear(s, r)

Then profiling:

python -m cProfile -o nogeos.prof benchmarks/cases/project_linear.py
snakeviz nogeos.prof

@greglucas (Contributor, Author)

  1. Most of the time spent in Shapely seems to be in getting coordinates,

It might be easier to comment inline on a draft PR, but one idea to make this more efficient is to do src_coords = np.asarray(geometry.coords) (inside project_linear) and then access further coordinates from that numpy array, instead of from the coords object (since that object has a __getitem__ in Python that adds some overhead).

Good idea! That reduced it by ~20%. But, we are still creating a lot of Points and Linestrings to do the covers test further on, so we need to think about creating the array of all geometries as soon as possible if we can. This would also greatly help the pyproj transforms too...
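
For illustration, a minimal sketch of the coordinate-array access pattern discussed above (the geometry here is invented; the point is only the np.asarray(geometry.coords) step):

import numpy as np
import shapely.geometry as sgeom

# Illustrative geometry; in cartopy this would be the source LineString
# handed to project_linear.
geometry = sgeom.LineString([(0, 0), (10, 20), (30, 40)])

# Pull all coordinates out once as an (n, 2) array, then use plain NumPy
# indexing instead of the coords object's per-item __getitem__.
src_coords = np.asarray(geometry.coords)
xs, ys = src_coords[:, 0], src_coords[:, 1]
first_point = src_coords[0]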

@dopplershift (Contributor)

I'd be willing to pay a significant speed penalty (with the hopes of getting it back later) in order to remove the headache that is depending on Shapely while simultaneously linking to GEOS.

Comment on lines 203 to 247
-    # TODO: Avoid create-destroy
-    coords = GEOSCoordSeq_create_r(handle, 1, 2)
-    GEOSCoordSeq_setX_r(handle, coords, 0, point.x)
-    GEOSCoordSeq_setY_r(handle, coords, 0, point.y)
-    g_point = GEOSGeom_createPoint_r(handle, coords)
-    state = (POINT_IN
-             if GEOSPreparedCovers_r(handle, gp_domain, g_point)
-             else POINT_OUT)
-    GEOSGeom_destroy_r(handle, g_point)
+    g_point = sgeom.Point((point.x, point.y))
+    state = POINT_IN if gp_domain.covers(g_point) else POINT_OUT
+    del g_point
@jorisvandenbossche (Contributor) commented Sep 24, 2022

To avoid this create-destroy Point overhead, shapely could have some predicates specialized for taking x, y coordinates instead of a Point.
Coincidentally, GEOS recently added this to their C API as well, so I experimented with this a bit in shapely: shapely/shapely#1548 (and this improvement wouldn't be tied to the GEOS version).

With that, if the user has shapely 2.0 installed, the above could be replaced with:

state = POINT_IN if shapely.intersects_xy(gp_domain.context, point.x, point.y) else POINT_OUT

(intersects and covers should give the same result for a polygon/point predicate)

Testing this seems to give around a 15-20% speedup on the time_project_linear benchmark (on my laptop, going from around 6s to 5s).

Contributor

That'd be nice to have.

Contributor

So this actually made it into Shapely 2.0 and could thus be used, conditionally on the Shapely version, if you are OK with adding some different code branches for different versions.
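
A rough sketch of what that version gate could look like (gp_domain and point follow the snippet above but are defined with made-up values here; POINT_IN/POINT_OUT are placeholders for cartopy's internal state flags, and the hasattr check is just one possible way to branch on the Shapely version):

import shapely
import shapely.geometry as sgeom
from shapely.prepared import prep

POINT_IN, POINT_OUT = 1, 0                   # placeholders for cartopy's state flags
gp_domain = prep(sgeom.box(0, 0, 10, 10))    # illustrative prepared projection domain
point = sgeom.Point(5, 5)                    # illustrative vertex being tested

if hasattr(shapely, "intersects_xy"):
    # Shapely 2.0+: test the raw coordinates directly, no temporary Point needed.
    state = (POINT_IN
             if shapely.intersects_xy(gp_domain.context, point.x, point.y)
             else POINT_OUT)
else:
    # Older Shapely: fall back to creating and discarding a Point geometry.
    g_point = sgeom.Point((point.x, point.y))
    state = POINT_IN if gp_domain.covers(g_point) else POINT_OUT
    del g_point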

@varchasgopalaswamy

If it means I don't have to build GEOS, I would take a 2x slowdown any day :)

@greglucas (Contributor, Author)

After a long delay, I finally got around to pushing up some commits I've had around for a while.

I added a new commit that transforms all of the source points in one call immediately, to remove some of the repetitive transform(single_point) calls. However, this still only improves performance by about 20% or so, reducing the number of calls from 1.2M to 900K, which indicates that many of the calls come from within the bisect() routine itself.
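
For context, the single-call transform pattern looks roughly like this with pyproj (the CRSs and coordinates below are invented; cartopy's own transform_points follows a similarly vectorized path):

import numpy as np
from pyproj import Transformer

# Illustrative source points in lon/lat.
lons = np.array([-120.0, -60.0, 0.0, 60.0, 120.0])
lats = np.array([-45.0, -20.0, 0.0, 20.0, 45.0])

# One vectorized call instead of transforming each vertex individually.
transformer = Transformer.from_crs("EPSG:4326", "EPSG:3857", always_xy=True)
xs, ys = transformer.transform(lons, lats)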

If we want to be more like basemap and transform everything immediately and not interpolate between the transformed points, we can get a factor of 2x improvement. See this quick commit I was using for testing: greglucas@c052f63

My project_linear timings were:
Removing GEOS and only using Shapely: 9 seconds
Transforming the points as early as possible: 8 seconds
Removing interpolation/bisection between points: 4 seconds

On the final one, where we've removed all interpolation/bisection, the slowest functions are all within Shapely, so we could then focus our effort on refactoring the Shapely geometry handling.

@greglucas greglucas marked this pull request as ready for review November 16, 2022 17:01
@dopplershift (Contributor)

I think the bisection/interpolation was an intentional feature of Cartopy, so we shouldn't just blindly remove it. How hard would it be to add an option (somewhere) to disable it?

The 20% gain sounds great, and honestly the win in terms of making this project more sustainable, simplifying packaging, and eliminating a solid 50% of the support questions is worth the current performance impact.

@greglucas (Contributor, Author)

greglucas commented Nov 16, 2022

💯 agree on all fronts. I'm hopeful that this can be a short-term hit in performance but lead to a more sustainable project that is easier for people to contribute to.

I think the bisection/interpolation was an intentional feature of Cartopy, so we shouldn't just blindly remove it. How hard would it be to add an option (somewhere) to disable it?

I agree we should not remove it (I have some ideas for how to do it better still), that commit is just a demonstration of how it could work and the speed we can gain. Thinking more about it, it is analogous to the transform_first keyword argument we added to the contour functions.

def _add_transform_first(func):
    """
    A decorator that adds and validates the transform_first keyword argument.

    This handles a fast-path optimization that projects the points before
    creating any patches or lines. This means that the lines/patches will be
    calculated in projected-space, not data-space. It requires the first
    three arguments to be x, y, and z and all must be two-dimensional to use
    the fast-path option.

    This should be added after the _add_transform wrapper so that a transform
    is guaranteed to be present.
    """

So, I can see adding that decorator to more functions, or even bringing that up a level and adding a property to Projection that would signal us to transform things immediately for the user.
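
For reference, a minimal usage sketch of that existing fast path on the contour functions (the gridded data is invented; transform_first requires 2D x/y arrays, as the docstring above notes):

import numpy as np
import matplotlib.pyplot as plt
import cartopy.crs as ccrs

lon, lat = np.meshgrid(np.linspace(-180, 180, 73), np.linspace(-90, 90, 37))
data = np.cos(np.radians(lat)) * np.sin(np.radians(lon))

ax = plt.axes(projection=ccrs.Robinson())
# With transform_first=True the 2D x/y arrays are projected up front,
# so contouring happens in projected space rather than data space.
ax.contourf(lon, lat, data, transform=ccrs.PlateCarree(), transform_first=True)
ax.coastlines()
plt.show()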

@dopplershift (Contributor)

That seems reasonable.

@greglucas (Contributor, Author)

Any other thoughts from people on this PR and the trade-off of the speed decrease vs ease of installation?

@QuLogic this was mostly your work, so did you have a preference one way or the other on it?

@dopplershift (Contributor)

Just a ping to any interested parties for comment (@QuLogic ) since the release of Shapely 2.0 is imminent.

@jorisvandenbossche (Contributor)

As another speed-up idea, I am wondering: how many of the projection/CRS classes have a domain that is rectangular (a bounding box)? If a significant portion is limited to that case, I think it would be relatively straightforward to add a fast path for some of the Shapely interactions (e.g. checking whether a point is in a bbox is easy and could be done without Shapely, avoiding its overhead on that path).
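
A hypothetical sketch of such a bounding-box fast path (points_in_bbox is an invented helper, not cartopy or Shapely API):

import numpy as np

def points_in_bbox(x, y, xmin, ymin, xmax, ymax):
    # Vectorized point-in-bounding-box check; cheap enough that Shapely
    # never needs to be involved for rectangular domains.
    x = np.asarray(x)
    y = np.asarray(y)
    return (x >= xmin) & (x <= xmax) & (y >= ymin) & (y <= ymax)

# Example: which of these projected vertices fall inside the domain?
inside = points_in_bbox([0.5, 3.0, 12.0], [0.5, 9.0, 1.0], 0, 0, 10, 10)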

@greglucas (Contributor, Author)

I like all of these thoughts for speed-ups! My question is do we want to keep implementing the speed-ups before accepting a PR, or do we merge something a bit slower for now and hope that people open PRs with improvements in the future?

My take is that it would be good to get this in and then start whittling away on the speed-ups later (don't let perfect be the enemy of good). There are quite a few speed-ups to be had and I think they can be done in parallel by people with expertise in different areas too.

(I've convinced myself that our bisection routines are bad performance-wise and not true bisection and that is the root of the problem. We go back and forth bisecting between points until a straight-enough segment is found, then set that to the new start or end point and restart bisecting. This means if we've already overshot in a previous iteration, we could try that same point again in the next iteration. I think we should really do recursive bisection adding points as we go down the recursive path avoiding any duplicate project_point + overlaps calls.)
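
To make the recursive idea concrete, here is a hypothetical sketch (not cartopy's actual code; project_point and straight_enough stand in for the real projection and tolerance checks). Each midpoint is projected exactly once, so no duplicate project_point + overlaps work is repeated for the same location:

def refine(src0, proj0, src1, proj1, project_point, straight_enough,
           depth=0, max_depth=16):
    # Return the projected points to insert between proj0 and proj1.
    if depth >= max_depth:
        return []
    src_mid = ((src0[0] + src1[0]) / 2, (src0[1] + src1[1]) / 2)
    proj_mid = project_point(src_mid)
    if straight_enough(proj0, proj_mid, proj1):
        return []
    # Recurse into each half, keeping the midpoint between the two results.
    left = refine(src0, proj0, src_mid, proj_mid,
                  project_point, straight_enough, depth + 1, max_depth)
    right = refine(src_mid, proj_mid, src1, proj1,
                   project_point, straight_enough, depth + 1, max_depth)
    return left + [proj_mid] + right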

@dopplershift (Contributor)

If someone has plans to immediately implement some speed-ups, I'd entertain waiting. Otherwise, while I'm sure we'll catch some flak for being even slower, I think finally being able to upload wheels and eliminating whole classes of installation issues outweighs that. Maybe being released slow will motivate some contributions. 😉

@greglucas (Contributor, Author)

Maybe it being released slow will motivate some contributions

One can dream 😄 (that is my hope too)

@ocefpaf (Member)

ocefpaf commented Dec 8, 2022

I think finally being able to upload wheels

We, conda users, underestimate the value of this. Cartopy really needs wheels ASAP!

@greglucas (Contributor, Author)

Latest commit adds the version gate to avoid the create/destroy Point geometry as @jorisvandenbossche suggested. I wasn't able to see much of a speed-up locally with that though.

There is another LineString create/destroy for the covers()/disjoint() calls that is likely still the slow path.

@jorisvandenbossche (Contributor)

My question is do we want to keep implementing the speed-ups before accepting a PR

My suggestions were certainly just ideas from looking at the code that I wanted to note, not things that have to be done in this PR (and anyway it's not up to me to decide on that ;))

I would personally also say that being able to drop the build dependency on GEOS will be a huge benefit for installation, and seems worth some (temporary) speed degradation.

@greglucas (Contributor, Author)

I've taken a stab at upstreaming some performance updates to pyproj, which would be able to more than offset this (i.e. faster than current main). We are definitely taking a performance hit here, but I still think the benefits are worth it and then we can start whittling away on other performance improvements.

Anyone willing to review/approve and hit the merge button ;) ?

@dopplershift (Contributor)

I'm willing to hit it as soon as I carve out some cycles to actually review the code.


# Each line resulting from the split should start and end with a
# segment that crosses the boundary when extended to double length.
def assert_close_to_boundary(xy):
Contributor Author

The latest commit changes this. As far as I can tell, we don't extend a line segment by 2x the distance to attach to a boundary anywhere; rather, we just project along the boundary to find the closest point. So really we just want to be "close" and not be in the middle of the domain. This is relevant because some segments may be very close to the boundary, but when extended they don't go very far because the points are too closely spaced.

@greglucas (Contributor, Author)

@bjlittle does IRIS have any issues with removing GEOS? Does this slow things down too much, and are there any additional failures that this introduces for all of you?

@bjlittle (Member)

bjlittle commented Mar 12, 2023

@bjlittle does IRIS have any issues with removing GEOS? Does this slow things down too much, and are there any additional failures that this introduces for all of you?

@greglucas We also have asv benchmarking enabled on SciTools/iris, so we'll soon see whether there are any performance regressions.

@trexfeathers Would it be possible to benchmark against this PR pre-merge? That might be pretty informative. Fancy making that happen and reporting the outcome, if you have capacity?

INSTALL (Outdated)
written in C++.
**Matplotlib** 3.2 or later (https://matplotlib.org/)
Python package for 2D plotting. Python package required for any
graphical capabilities.

matplotlib is duplicated. This changeset can be removed.

Contributor Author

Yep, thanks for catching that!

@greglucas (Contributor, Author)

Does anyone have other thoughts on this PR? What performance penalty is the maximum we would take for easier installations, or what else we would need to consider to keep this moving forward?

@varchasgopalaswamy

As a user, I would say a < 2x performance hit is more than acceptable given the ease of installation. Also, once it's out in the field it will be easier to notice where the performance hits really matter, right?

@dopplershift (Contributor)

Does anyone have other thoughts on this PR? What performance penalty is the maximum we would take for easier installations, or what else we would need to consider to keep this moving forward?

In all honesty, given the resources we have for maintaining this project, I think the only requirement I have is that it still works. This is a huge win for packaging and maintainability.

@greglucas (Contributor, Author)

OK, some updated numbers for everyone then.

Running the benchmark from the initial comment (project_linear), I get 1.85s on upstream/main and 2.35s on this branch. So, much less than a factor of 2 even. Curiously, it is a bit slower on shapely 2.0 compared to 1.8.5, which probably means there are even more speed-ups to be had if we look at refactoring some of the array coordinate usage throughout the library.

Running the test suite locally, the two are within a second of each other. On CI it looks like there is a ~50% difference between this branch and the most recent merge to main. (Look at the testing time specifically, because the installation time takes forever.)
main: https://github.com/SciTools/cartopy/actions/runs/5205628895/jobs/9391295722
this branch: https://github.com/SciTools/cartopy/actions/runs/5217703945/jobs/9417803995

@varchasgopalaswamy

Sounds more than acceptable to me!

@greglucas (Contributor, Author)

In all honesty, given the resources we have for maintaining this project, I think the only requirement I have is that it still works. This is a huge win for packaging and maintainability.

It looks like I do have the ability to merge without any approvals. With the lack of people and time to review this PR, I think we should push forward and merge it, since there have been new numpy, scipy, and matplotlib releases recently that it would be nice to push an update out for.

I'll give this two more weeks; someone can submit a "request changes" review to block me from self-merging if they have strong opinions against that. Otherwise, since no one has spoken up adamantly against this, I think it is time to move forward with getting it in.

QuLogic and others added 10 commits July 14, 2023 17:02
- This avoids __getitem__ calls from the shapely geometry and converts to numpy arrays and memory views within cython.
- This transforms all points in the source geometry right away, rather than one-by-one later on. This saves some calls to pyproj's transform() method, which added some overhead.
- With Shapely 2.0 we can avoid the create/destroy point for checking whether a point is within a geometry.
- If we are going to a rectangular domain and all initial destination points are within the domain, then we can avoid checks to see if each individual line segment is within the domain.
- We don't "project" by a factor of two to find the boundary intersection; rather, we project along the boundary to the nearest point. So really we just want to make sure that our cuts are "close" to the boundary, but we don't care about the segments being extended by a given fraction. This is relevant when the final two coordinates are close to each other and thus that segment is not projected very far, yet it is quite close to the boundary.
@greglucas greglucas merged commit 9006591 into SciTools:main Jul 14, 2023
8 checks passed
@greglucas greglucas deleted the qulogic-delete-proj-geos branch July 14, 2023 23:24
@greglucas greglucas added this to the 0.22 milestone Jul 22, 2023