I'm trying to clean up memory use in the new overlayng code, on the overlay-sr branch, and have come to an impasse.
The valgrind report is here: https://gist.github.com/pramsey/d4be398473ea49ff4e241f5e7d4b855b As I see it, there's one big set of NodedSegmenentString that are created by EdgeNodingBuilder and represent the un-noded input edges and stored in the member variable "inputEdges": https://github.com/libgeos/geos/blob/overlay-sr/src/operation/overlayng/EdgeNodingBuilder.cpp#L78-L79 And those are cleaned up here: https://github.com/libgeos/geos/blob/overlay-sr/include/geos/operation/overlayng/EdgeNodingBuilder.h#L203-L208 There's a second transient set of NodedSegmentString that are generated by the Noder, and come into existence here: https://github.com/libgeos/geos/blob/overlay-sr/src/operation/overlayng/EdgeNodingBuilder.cpp#L98 And then disappear almost immediately afterwards here: https://github.com/libgeos/geos/blob/overlay-sr/src/operation/overlayng/EdgeNodingBuilder.cpp#L103-L105 The valgrind report seems to implicate this second lifecycle in a huge leak and yet, it seems really self-contained. The MCIndexNoder generates some new NodedSegmentString, the ValidatingNoder just holds onto the list for a while, then passes it on to the EdgeNodingBuilder that uses them to generate some Edges, then deletes them. The latest iteration of the branch is here. https://github.com/libgeos/geos/tree/overlay-sr/ Any and all feedbackk most appreciated, P _______________________________________________ geos-devel mailing list [hidden email] https://lists.osgeo.org/mailman/listinfo/geos-devel |
On Thu, Jul 23, 2020 at 08:22:44PM -0700, Paul Ramsey wrote:
> I'm trying to clean up memory use in the new overlayng code, on the overlay-sr branch, and have come to an impasse. > > The valgrind report is here: https://gist.github.com/pramsey/d4be398473ea49ff4e241f5e7d4b855b That report contains multiple leaks, which are reported from smaller to bigger, so I'd start at the end of it: ==6519== 59,784,992 (40,104 direct, 59,744,888 indirect) bytes in 1,671 blocks are definitely lost in loss record 8,846 of 8,847 ==6519== at 0x4C29203: operator new(unsigned long) (vg_replace_malloc.c:334) ==6519== by 0x528EBAA: geos::noding::NodedSegmentString::getNodedSubstrings(std::vector<geos::noding::SegmentString*, std::allocator<geos::noding::SegmentString*> > const&) (NodedSegmentString.cpp:148) The ownership of those SegmentStrings is not documented in NodedSegmentString.h, which would help. My impression is that those segment strings should be shared pointers, to overcome this long standing issue (it was a problem before snaprounding as well). I think the caller should take ownership of those objects. In this case, the caller is ValidatingNoder, which is storing those into its 'nodedSS' member, which is an heap-allocated vector. That vector is returned by ValidatingNoder::getNodedSubstrings() which is also undocumented but is probably expected to pass ownership back to its own caller. I suggest you destroy the vector and its contents IFF getNodedSubstring is never called, which may be the case here. Or (bigger change) use shared pointers. PS: I like the DEVELOPER-NOTES.md file --strk; _______________________________________________ geos-devel mailing list [hidden email] https://lists.osgeo.org/mailman/listinfo/geos-devel |
Paul, FYI: I've pushed some fixes to that branch. Keep this in mind
before you change your local copy too much. It's just some override keywords and a fix to autotools scripts at time of writing. The code, in its current form, passes `make check` (not sure how much of the new code `make check` actually tests, but I saw some unit test). --strk; On Fri, Jul 24, 2020 at 12:07:44PM +0200, Sandro Santilli wrote: > On Thu, Jul 23, 2020 at 08:22:44PM -0700, Paul Ramsey wrote: > > I'm trying to clean up memory use in the new overlayng code, on the overlay-sr branch, and have come to an impasse. > > > > The valgrind report is here: https://gist.github.com/pramsey/d4be398473ea49ff4e241f5e7d4b855b > > That report contains multiple leaks, which are reported from smaller > to bigger, so I'd start at the end of it: > > ==6519== 59,784,992 (40,104 direct, 59,744,888 indirect) bytes in 1,671 blocks are definitely lost in loss record 8,846 of 8,847 > ==6519== at 0x4C29203: operator new(unsigned long) (vg_replace_malloc.c:334) > ==6519== by 0x528EBAA: geos::noding::NodedSegmentString::getNodedSubstrings(std::vector<geos::noding::SegmentString*, std::allocator<geos::noding::SegmentString*> > const&) (NodedSegmentString.cpp:148) > > The ownership of those SegmentStrings is not documented in > NodedSegmentString.h, which would help. My impression is that > those segment strings should be shared pointers, to overcome this > long standing issue (it was a problem before snaprounding as well). > > I think the caller should take ownership of those objects. > In this case, the caller is ValidatingNoder, which is storing > those into its 'nodedSS' member, which is an heap-allocated > vector. That vector is returned by > ValidatingNoder::getNodedSubstrings() which is also undocumented > but is probably expected to pass ownership back to its own caller. > > I suggest you destroy the vector and its contents IFF > getNodedSubstring is never called, which may be the case here. > > Or (bigger change) use shared pointers. > > PS: I like the DEVELOPER-NOTES.md file > > --strk; > _______________________________________________ > geos-devel mailing list > [hidden email] > https://lists.osgeo.org/mailman/listinfo/geos-devel geos-devel mailing list [hidden email] https://lists.osgeo.org/mailman/listinfo/geos-devel |
In reply to this post by Sandro Santilli-4
On Fri, Jul 24, 2020 at 12:07:44PM +0200, Sandro Santilli wrote:
> > The ownership of those SegmentStrings is not documented in > NodedSegmentString.h, which would help. My impression is that > those segment strings should be shared pointers, to overcome this > long standing issue (it was a problem before snaprounding as well). More information. I think the primary problem with this API is that the ownership of SegmentString in Noder.h is not properly documented. Comments are even clearly expressing the confusion: * Some Noders may add all these nodes to the input SegmentStrings; * others may only add some or none at all. The methods defined for a Noder are: virtual void computeNodes(std::vector<SegmentString*>* segStrings) = 0; virtual std::vector<SegmentString*>* getNodedSubstrings() const = 0; What isn't clear is: - Who owns the SegmentString passed to computeNodes ? - Who owns the SegmentString returned by getNodedSubstrings ? - Can the same SegmentString be present in both input and output containers ? I think these questions should be answered with proper documentation in that class, and then we need to make sure all subclass implement the documented semantic properly. Using shared pointers of those SegmentStrings may help simplifying the implementations, with the downside of some more overhead. But other implementations could be made to work, as long as it is clear who is responsible for what... --strk; _______________________________________________ geos-devel mailing list [hidden email] https://lists.osgeo.org/mailman/listinfo/geos-devel |
Good observations, Sandro. My thoughts are below. On Fri, Jul 24, 2020 at 8:01 AM Sandro Santilli <[hidden email]> wrote:
- Noders are a process, not a container. As such, they should not own anything. -- The caller owns the NodedSegmentStrings passed in to the Noder. -- The caller owns the SegmentStrings returned by getNodedSubstrings - getNodedSubstrings should always return new SegmentStrings, even if they are just a copy of an input SegmentString (i.e. no nodes were found and added to the original NodedSegmentString (This will make it easier to handle the lifecycle of the inputs and outputs I think?) I'm not sure if JTS/GEOS obeys these semantics. In JTS it doesn't matter much, but GEOS should be fixed to have this contract. Furthermore: The fact that getNodedSubstrings returns NodedSegmentStrings is an mistake caused by an unfortunate shortcut in JTS a long time ago. In fact, almost all usage of getNodedSubstrings only requires a BasicSegmentString to be returned (which is a much simpler/smaller object). Paul & I have discussed fixing this in JTS and GEOS, and will likely do so during or after the delivery of OverlayNG. Also, the name of the method is then confusing, and will be changed as well (perhaps to getSubstrings or splitSubstrings or some such) _______________________________________________ geos-devel mailing list [hidden email] https://lists.osgeo.org/mailman/listinfo/geos-devel |
On Sat, Jul 25, 2020 at 07:12:08AM -0700, Martin Davis wrote:
> Good observations, Sandro. My thoughts are below. > > On Fri, Jul 24, 2020 at 8:01 AM Sandro Santilli <[hidden email]> wrote: > > > > > The methods defined for a Noder are: > > > > virtual void computeNodes(std::vector<SegmentString*>* segStrings) = 0; > > virtual std::vector<SegmentString*>* getNodedSubstrings() const = 0; > > > > What isn't clear is: > > > > - Who owns the SegmentString passed to computeNodes ? > > - Who owns the SegmentString returned by getNodedSubstrings ? > > - Can the same SegmentString be present in both input and output > > containers ? > > > > - Noders are a process, not a container. As such, they should not own > anything. > -- The caller owns the NodedSegmentStrings passed in to the Noder. > -- The caller owns the SegmentStrings returned by getNodedSubstrings > > - getNodedSubstrings should always return new SegmentStrings, even if they > are just a copy of an input SegmentString (i.e. no nodes were found and > added to the original NodedSegmentString (This will make it easier to > handle the lifecycle of the inputs and outputs I think?) > > I'm not sure if JTS/GEOS obeys these semantics. In JTS it doesn't matter > much, but GEOS should be fixed to have this contract. For sure it should be fixed to DOCUMENT this contract. I think current (the ones reachable from C-API) do respect it, which means that some of them have to DESTROY SegmentString objects that only exist temporary (think recursive noder) because those objects would never be returned back to caller, thus woule be left leaking. --strk; _______________________________________________ geos-devel mailing list [hidden email] https://lists.osgeo.org/mailman/listinfo/geos-devel |
On Sat, Jul 25, 2020 at 9:41 AM Sandro Santilli <[hidden email]> wrote: On Sat, Jul 25, 2020 at 07:12:08AM -0700, Martin Davis wrote: Definitely. Using C++ language constructs, if possible. I think current (the ones reachable from C-API) do respect it, Well, yes. The new noders (SnapRouningNoder and SnappingNoder) are not recursive, so don't have this problem. The goal I think is to eliminate other kinds of noding, since they should be superseded. But I realize that might have to be a long-term goal. Anyway, if there are noders with that behaviour, don't they already handle their own memory? Or are there leaks in old code? _______________________________________________ geos-devel mailing list [hidden email] https://lists.osgeo.org/mailman/listinfo/geos-devel |
In reply to this post by Sandro Santilli-4
> On Jul 24, 2020, at 6:35 AM, Sandro Santilli <[hidden email]> wrote: > > Paul, FYI: I've pushed some fixes to that branch. Keep this in mind > before you change your local copy too much. It's just some override > keywords and a fix to autotools scripts at time of writing. Pushed to github or osgeo? I'm working on github. I don't see any changes. > The code, in its current form, passes `make check` (not sure how much > of the new code `make check` actually tests, but I saw some unit test). Autotools make check might be missing some of the new XML files, as they have to be manually added still. All unit tests should run in both builds. P > > --strk; > > On Fri, Jul 24, 2020 at 12:07:44PM +0200, Sandro Santilli wrote: >> On Thu, Jul 23, 2020 at 08:22:44PM -0700, Paul Ramsey wrote: >>> I'm trying to clean up memory use in the new overlayng code, on the overlay-sr branch, and have come to an impasse. >>> >>> The valgrind report is here: https://gist.github.com/pramsey/d4be398473ea49ff4e241f5e7d4b855b >> >> That report contains multiple leaks, which are reported from smaller >> to bigger, so I'd start at the end of it: >> >> ==6519== 59,784,992 (40,104 direct, 59,744,888 indirect) bytes in 1,671 blocks are definitely lost in loss record 8,846 of 8,847 >> ==6519== at 0x4C29203: operator new(unsigned long) (vg_replace_malloc.c:334) >> ==6519== by 0x528EBAA: geos::noding::NodedSegmentString::getNodedSubstrings(std::vector<geos::noding::SegmentString*, std::allocator<geos::noding::SegmentString*> > const&) (NodedSegmentString.cpp:148) >> >> The ownership of those SegmentStrings is not documented in >> NodedSegmentString.h, which would help. My impression is that >> those segment strings should be shared pointers, to overcome this >> long standing issue (it was a problem before snaprounding as well). >> >> I think the caller should take ownership of those objects. >> In this case, the caller is ValidatingNoder, which is storing >> those into its 'nodedSS' member, which is an heap-allocated >> vector. That vector is returned by >> ValidatingNoder::getNodedSubstrings() which is also undocumented >> but is probably expected to pass ownership back to its own caller. >> >> I suggest you destroy the vector and its contents IFF >> getNodedSubstring is never called, which may be the case here. >> >> Or (bigger change) use shared pointers. >> >> PS: I like the DEVELOPER-NOTES.md file >> >> --strk; >> _______________________________________________ >> geos-devel mailing list >> [hidden email] >> https://lists.osgeo.org/mailman/listinfo/geos-devel > _______________________________________________ > geos-devel mailing list > [hidden email] > https://lists.osgeo.org/mailman/listinfo/geos-devel _______________________________________________ geos-devel mailing list [hidden email] https://lists.osgeo.org/mailman/listinfo/geos-devel |
In reply to this post by Sandro Santilli-4
I'm not sure moving the std::vector<SegmentString*> to shared pointers will be necessary, simply using std::vector<unique_ptr<SegmentString>> is probably sufficient. The "real problem" as you note is just that the contract isn't explicit anywhere. I think that the move the unique_ptr across other parts of the API avoided this chunk of code because it was and remains pretty gnarly. I did not want to refactor it because I'm dealing with enough mess just bringing in the new code. A refactor is on my list now though, including some work to reduce some of the huge amount of copying that goes on under the covers, some of it unnecessary In other news, I have identified and removed the memory leaks, everything I run through valgrind is now clean. So, onwards towards merging the branch to master, once I get CI green. P. On Fri, Jul 24, 2020 at 8:00 AM Sandro Santilli <[hidden email]> wrote: On Fri, Jul 24, 2020 at 12:07:44PM +0200, Sandro Santilli wrote: _______________________________________________ geos-devel mailing list [hidden email] https://lists.osgeo.org/mailman/listinfo/geos-devel |
In reply to this post by Paul Ramsey
On Mon, Jul 27, 2020 at 12:46:55PM -0700, Paul Ramsey wrote:
> > > > On Jul 24, 2020, at 6:35 AM, Sandro Santilli <[hidden email]> wrote: > > > > Paul, FYI: I've pushed some fixes to that branch. Keep this in mind > > before you change your local copy too much. It's just some override > > keywords and a fix to autotools scripts at time of writing. > > Pushed to github or osgeo? I'm working on github. I don't see any changes. To github/overlay-sr You should see changes when you `git pull` (I suggest adding a `--rebase` switch to reduce noise) --strk; _______________________________________________ geos-devel mailing list [hidden email] https://lists.osgeo.org/mailman/listinfo/geos-devel |
On Tue, Jul 28, 2020 at 11:08:13PM +0200, Sandro Santilli wrote:
> On Mon, Jul 27, 2020 at 12:46:55PM -0700, Paul Ramsey wrote: > > > > > > > On Jul 24, 2020, at 6:35 AM, Sandro Santilli <[hidden email]> wrote: > > > > > > Paul, FYI: I've pushed some fixes to that branch. Keep this in mind > > > before you change your local copy too much. It's just some override > > > keywords and a fix to autotools scripts at time of writing. > > > > Pushed to github or osgeo? I'm working on github. I don't see any changes. > > To github/overlay-sr > You should see changes when you `git pull` > (I suggest adding a `--rebase` switch to reduce noise) I just realized I did not really push the changes yet. Now I did (thanks `git reflog` for helping me find those changes after many branch switching). --strk; _______________________________________________ geos-devel mailing list [hidden email] https://lists.osgeo.org/mailman/listinfo/geos-devel |
Free forum by Nabble | Edit this page |