[gdal-dev] port/md5 cvs_MD5*

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
17 messages Options
Reply | Threaded
Open this post in threaded view
|

[gdal-dev] port/md5 cvs_MD5*

Kurt Schwehr-2
Hi all,

Can we discuss cvs_MD5* from https://trac.osgeo.org/gdal/changeset/41086 ?

I very much appreciate the work that bishop is doing and I hate to slow down a contributor

I am really worried about code like this going into GDAL, especially into port/

But my worries are:

- this in no way conforms to other code in GDAL
- has the potential to collide with other code that imports this code
- it has a very awkward C style
- the commit message did not point to where it came from (at least it's not hard to guess)
- the file naming is different than (most) of the rest of the directory
- and...

Yes, we have other libraries included, but it seems like a road we don't want to go down

-kurt

_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: port/md5 cvs_MD5*

Dmitry Baryshnikov-2

Hi Kurt.

The md5.cpp and md5.h were in frmst/wms driver folder. As I see from header this files are not initially from GDAL project.

I moved this files to port with minimum changes - only macros for suppress Doxygen parsing.

The motivation of this moving was:

1. md5 functions used in several drivers (WMS and WFS) and may be others in future.

2. need API function to get the same md5 hash as used in WMS for autotesting functionality.

3. downloading area of interest into the WMS file based cache by external programs. May be some other utilization.

I think ideally the OpenSSL MD5 functions must be using. But this will be one more dependency. Do we need it?


Best regards,
    Dmitry
20.12.2017 16:43, Kurt Schwehr пишет:
Hi all,

Can we discuss cvs_MD5* from https://trac.osgeo.org/gdal/changeset/41086 ?

I very much appreciate the work that bishop is doing and I hate to slow
down a contributor

I am really worried about code like this going into GDAL, especially into
port/

But my worries are:

- this in no way conforms to other code in GDAL
- has the potential to collide with other code that imports this code
- it has a very awkward C style
- the commit message did not point to where it came from (at least it's not
hard to guess)
- the file naming is different than (most) of the rest of the directory
- and...

Yes, we have other libraries included, but it seems like a road we don't
want to go down

-kurt



_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: port/md5 cvs_MD5*

Kurt Schwehr-2
I have not looked in the wms code much, but my comment was not about adding an external dependency. It is about code quality in the core of GDAL.

On Dec 20, 2017 6:07 AM, "Dmitry Baryshnikov" <[hidden email]> wrote:

Hi Kurt.

The md5.cpp and md5.h were in frmst/wms driver folder. As I see from header this files are not initially from GDAL project.

I moved this files to port with minimum changes - only macros for suppress Doxygen parsing.

The motivation of this moving was:

1. md5 functions used in several drivers (WMS and WFS) and may be others in future.

2. need API function to get the same md5 hash as used in WMS for autotesting functionality.

3. downloading area of interest into the WMS file based cache by external programs. May be some other utilization.

I think ideally the OpenSSL MD5 functions must be using. But this will be one more dependency. Do we need it?


Best regards,
    Dmitry
20.12.2017 16:43, Kurt Schwehr пишет:
Hi all,

Can we discuss cvs_MD5* from https://trac.osgeo.org/gdal/changeset/41086 ?

I very much appreciate the work that bishop is doing and I hate to slow
down a contributor

I am really worried about code like this going into GDAL, especially into
port/

But my worries are:

- this in no way conforms to other code in GDAL
- has the potential to collide with other code that imports this code
- it has a very awkward C style
- the commit message did not point to where it came from (at least it's not
hard to guess)
- the file naming is different than (most) of the rest of the directory
- and...

Yes, we have other libraries included, but it seems like a road we don't
want to go down

-kurt



_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev

_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: port/md5 cvs_MD5*

Dmitry Baryshnikov-2
In fact it was part of core of GDAL but hidden in WMS driver folder. The
same or similar is in:

1. ogr/ogrsf_frmts/generic

2. ogr/ogrsf_frmts/geojson

What solution can be suitable here (I mean md5 case)?


Best regards,
     Dmitry

20.12.2017 17:19, Kurt Schwehr пишет:

> I have not looked in the wms code much, but my comment was not about adding
> an external dependency. It is about code quality in the core of GDAL.
>
> On Dec 20, 2017 6:07 AM, "Dmitry Baryshnikov" <[hidden email]> wrote:
>
>> Hi Kurt.
>>
>> The md5.cpp and md5.h were in frmst/wms driver folder. As I see from
>> header this files are not initially from GDAL project.
>>
>> I moved this files to port with minimum changes - only macros for suppress
>> Doxygen parsing.
>>
>> The motivation of this moving was:
>>
>> 1. md5 functions used in several drivers (WMS and WFS) and may be others
>> in future.
>>
>> 2. need API function to get the same md5 hash as used in WMS for
>> autotesting functionality.
>>
>> 3. downloading area of interest into the WMS file based cache by external
>> programs. May be some other utilization.
>>
>> I think ideally the OpenSSL MD5 functions must be using. But this will be
>> one more dependency. Do we need it?
>>
>>
>> Best regards,
>>      Dmitry
>>
>> 20.12.2017 16:43, Kurt Schwehr пишет:
>>
>> Hi all,
>>
>> Can we discuss cvs_MD5* from https://trac.osgeo.org/gdal/changeset/41086 ?
>>
>> I very much appreciate the work that bishop is doing and I hate to slow
>> down a contributor
>>
>> I am really worried about code like this going into GDAL, especially into
>> port/
>>
>> But my worries are:
>>
>> - this in no way conforms to other code in GDAL
>> - has the potential to collide with other code that imports this code
>> - it has a very awkward C style
>> - the commit message did not point to where it came from (at least it's not
>> hard to guess)
>> - the file naming is different than (most) of the rest of the directory
>> - and...
>>
>> Yes, we have other libraries included, but it seems like a road we don't
>> want to go down
>>
>> -kurt
>>
>>
>>
>>
>> _______________________________________________
>> gdal-dev mailing [hidden email]://lists.osgeo.org/mailman/listinfo/gdal-dev
>>
>>
>>
>> _______________________________________________
>> gdal-dev mailing list
>> [hidden email]
>> https://lists.osgeo.org/mailman/listinfo/gdal-dev
>>

_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: port/md5 cvs_MD5*

Kurt Schwehr-2
My personal opinion...

It was not in fact in the core of GDAL for all users.  My primary environment does not include the WMS *driver*.  My take is that only the memory and gtiff raster drivers are really required for GDAL (vector, I'm not so sure).

What do you mean by the same or similar for generic and geojson?  And because something isn't great elsewhere, that does not mean we want to introduce more similar code that is hard to impossible to opt out of.

Dmitry's question about a reasonable solution is a good one (if you buy my being seriously uncomfortable with cvs_MD5*).  Possibilities include:

- Make everything from md5.cpp and md5.h be internal to md5.cpp (aka static or inline) and move CPLMD5String to md5.{cpp,h} at least keeping the trouble localize
- Refactor md5 to be cleaner
- Reimplement md5 hashing for gdal
- Find some other code that is license compatible that is cleaner

And why are you exposing this to python?  Python has md5 hashing already built in.

Can others please weigh in? 

On Wed, Dec 20, 2017 at 6:25 AM, Dmitry Baryshnikov <[hidden email]> wrote:
In fact it was part of core of GDAL but hidden in WMS driver folder. The same or similar is in:

1. ogr/ogrsf_frmts/generic

2. ogr/ogrsf_frmts/geojson

What solution can be suitable here (I mean md5 case)?


Best regards,
    Dmitry

<a href="tel:20.12.2017%2017" value="+12012201717" target="_blank">20.12.2017 17:19, Kurt Schwehr пишет:
I have not looked in the wms code much, but my comment was not about adding
an external dependency. It is about code quality in the core of GDAL.

On Dec 20, 2017 6:07 AM, "Dmitry Baryshnikov" <[hidden email]> wrote:

Hi Kurt.

The md5.cpp and md5.h were in frmst/wms driver folder. As I see from
header this files are not initially from GDAL project.

I moved this files to port with minimum changes - only macros for suppress
Doxygen parsing.

The motivation of this moving was:

1. md5 functions used in several drivers (WMS and WFS) and may be others
in future.

2. need API function to get the same md5 hash as used in WMS for
autotesting functionality.

3. downloading area of interest into the WMS file based cache by external
programs. May be some other utilization.

I think ideally the OpenSSL MD5 functions must be using. But this will be
one more dependency. Do we need it?


Best regards,
     Dmitry

<a href="tel:20.12.2017%2016" value="+12012201716" target="_blank">20.12.2017 16:43, Kurt Schwehr пишет:

Hi all,

Can we discuss cvs_MD5* from https://trac.osgeo.org/gdal/changeset/41086 ?

I very much appreciate the work that bishop is doing and I hate to slow
down a contributor

I am really worried about code like this going into GDAL, especially into
port/

But my worries are:

- this in no way conforms to other code in GDAL
- has the potential to collide with other code that imports this code
- it has a very awkward C style
- the commit message did not point to where it came from (at least it's not
hard to guess)
- the file naming is different than (most) of the rest of the directory
- and...

Yes, we have other libraries included, but it seems like a road we don't
want to go down

-kurt




_______________________________________________
gdal-dev mailing [hidden email]tps://lists.osgeo.org/mailman/listinfo/gdal-dev



_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev





--

_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: port/md5 cvs_MD5*

Even Rouault-2

On mercredi 20 décembre 2017 06:39:37 CET Kurt Schwehr wrote:

> My personal opinion...

>

> It was not in fact in the core of GDAL for all users. My primary

> environment does not include the WMS *driver*. My take is that only the

> memory and gtiff raster drivers are really required for GDAL (vector, I'm

> not so sure).

>

> What do you mean by the same or similar for generic and geojson? And

> because something isn't great elsewhere, that does not mean we want to

> introduce more similar code that is hard to impossible to opt out of.

>

> Dmitry's question about a reasonable solution is a good one (if you buy my

> being seriously uncomfortable with cvs_MD5*). Possibilities include:

>

> - Make everything from md5.cpp and md5.h be internal to md5.cpp (aka static

> or inline) and move CPLMD5String to md5.{cpp,h} at least keeping the

> trouble localize

> - Refactor md5 to be cleaner

> - Reimplement md5 hashing for gdal

> - Find some other code that is license compatible that is cleaner

>

> And why are you exposing this to python? Python has md5 hashing already

> built in.

>

> Can others please weigh in?

 

For consistency with the rest, I'd suggest:

- rename md5.* to cpl_md5.*

- rename cvs_MD5* to CPLMD5*

- include cpl_port.h in cpl_md5.h and remove the #if defined(CPL_BASE_H_INCLUDED) stuff. Currently if someone includes md5.h as the first header, cvs_uint32 will expand to a unsigned long, whereas the md5.cpp file has been compiled with GUInt32. Not good

 

Other follow-up cleanups can then be done by interested parties.

 

--

Spatialys - Geospatial professional services

http://www.spatialys.com


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: port/md5 cvs_MD5*

Dmitry Baryshnikov-2
In reply to this post by Kurt Schwehr-2
Will answer in letter body

20.12.2017 17:39, Kurt Schwehr пишет:

> My personal opinion...
>
> It was not in fact in the core of GDAL for all users.  My primary
> environment does not include the WMS *driver*.  My take is that only the
> memory and gtiff raster drivers are really required for GDAL (vector, I'm
> not so sure).
>
> What do you mean by the same or similar for generic and geojson?  And
> because something isn't great elsewhere, that does not mean we want to
> introduce more similar code that is hard to impossible to opt out of.
For example we have such method - char * OGRGeometry::exportToJson() const

In this method executed OGR_G_ExportToJson from
ogr/ogrsf_frmts/geojson/ogrgeojsonwriter.cpp

So the core OGR using some code from GeoJSON driver.

The same way I can move md5 back to WMS and use it from port function
CPLMD5String.

But I think ogr and gdal must use port (and port shouldn't depends on
org or gcore, etc.),
also drivers can use ogr or gdal, but ogr and gdal shouldn't depends on
some driver.

This is only my opinion.

>
> Dmitry's question about a reasonable solution is a good one (if you buy my
> being seriously uncomfortable with cvs_MD5*).  Possibilities include:
>
> - Make everything from md5.cpp and md5.h be internal to md5.cpp (aka static
> or inline) and move CPLMD5String to md5.{cpp,h} at least keeping the
> trouble localize
> - Refactor md5 to be cleaner
> - Reimplement md5 hashing for gdal
> - Find some other code that is license compatible that is cleaner
>
> And why are you exposing this to python?  Python has md5 hashing already
> built in.
I'm not sure that Python produce the same hash than cvs_MD5*.
Also I'm not sure what in future they will be the same.
Finally not only Python use CPLMD5String - any program utilizing API may
need the hash formed the same way as WMS/WFS does.

>
> Can others please weigh in?
>
> On Wed, Dec 20, 2017 at 6:25 AM, Dmitry Baryshnikov <[hidden email]>
> wrote:
>
>> In fact it was part of core of GDAL but hidden in WMS driver folder. The
>> same or similar is in:
>>
>> 1. ogr/ogrsf_frmts/generic
>>
>> 2. ogr/ogrsf_frmts/geojson
>>
>> What solution can be suitable here (I mean md5 case)?
>>
>>
>> Best regards,
>>      Dmitry
>>
>> 20.12.2017 17:19, Kurt Schwehr пишет:
>>
>>> I have not looked in the wms code much, but my comment was not about
>>> adding
>>> an external dependency. It is about code quality in the core of GDAL.
>>>
>>> On Dec 20, 2017 6:07 AM, "Dmitry Baryshnikov" <[hidden email]>
>>> wrote:
>>>
>>> Hi Kurt.
>>>> The md5.cpp and md5.h were in frmst/wms driver folder. As I see from
>>>> header this files are not initially from GDAL project.
>>>>
>>>> I moved this files to port with minimum changes - only macros for
>>>> suppress
>>>> Doxygen parsing.
>>>>
>>>> The motivation of this moving was:
>>>>
>>>> 1. md5 functions used in several drivers (WMS and WFS) and may be others
>>>> in future.
>>>>
>>>> 2. need API function to get the same md5 hash as used in WMS for
>>>> autotesting functionality.
>>>>
>>>> 3. downloading area of interest into the WMS file based cache by external
>>>> programs. May be some other utilization.
>>>>
>>>> I think ideally the OpenSSL MD5 functions must be using. But this will be
>>>> one more dependency. Do we need it?
>>>>
>>>>
>>>> Best regards,
>>>>       Dmitry
>>>>
>>>> 20.12.2017 16:43, Kurt Schwehr пишет:
>>>>
>>>> Hi all,
>>>>
>>>> Can we discuss cvs_MD5* from https://trac.osgeo.org/gdal/changeset/41086
>>>> ?
>>>>
>>>> I very much appreciate the work that bishop is doing and I hate to slow
>>>> down a contributor
>>>>
>>>> I am really worried about code like this going into GDAL, especially into
>>>> port/
>>>>
>>>> But my worries are:
>>>>
>>>> - this in no way conforms to other code in GDAL
>>>> - has the potential to collide with other code that imports this code
>>>> - it has a very awkward C style
>>>> - the commit message did not point to where it came from (at least it's
>>>> not
>>>> hard to guess)
>>>> - the file naming is different than (most) of the rest of the directory
>>>> - and...
>>>>
>>>> Yes, we have other libraries included, but it seems like a road we don't
>>>> want to go down
>>>>
>>>> -kurt
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> gdal-dev mailing [hidden email]://
>>>> lists.osgeo.org/mailman/listinfo/gdal-dev
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> gdal-dev mailing list
>>>> [hidden email]
>>>> https://lists.osgeo.org/mailman/listinfo/gdal-dev
>>>>
>>>>
>

Best regards,
     Dmitry

_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: port/md5 cvs_MD5*

Even Rouault-2

 

> > And why are you exposing this to python? Python has md5 hashing already

> > built in.

>

> I'm not sure that Python produce the same hash than cvs_MD5*.

 

I do hope they return the same thing. MD5 is a standardized algorithm:

https://tools.ietf.org/html/rfc1321

 

I'm not sure we really need to expose it our CPL MD5 through Python.

 

--

Spatialys - Geospatial professional services

http://www.spatialys.com


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: port/md5 cvs_MD5*

Dmitry Baryshnikov-2
Just the note that CPLMD5String not only for Python, but any other API
clients: C/C++, Java, Perl, etc.

But, I agree, it can be easily removed.

Best regards,
     Dmitry

20.12.2017 18:35, Even Rouault пишет:
>>> And why are you exposing this to python?  Python has md5 hashing
> already
>>> built in.
>> I'm not sure that Python produce the same hash than cvs_MD5*.
> I do hope they return the same thing. MD5 is a standardized algorithm:
> https://tools.ietf.org/html/rfc1321
>
> I'm not sure we really need to expose it our CPL MD5 through Python.
>

_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: port/md5 cvs_MD5*

Dmitry Baryshnikov-2
In reply to this post by Even Rouault-2
Just the note that CPLMD5String not only for Python, but any other API
clients: C/C++, Java, Perl, etc.

But, I agree, it can be easily removed.

Best regards,
     Dmitry

20.12.2017 18:35, Even Rouault пишет:
>>> And why are you exposing this to python?  Python has md5 hashing
> already
>>> built in.
>> I'm not sure that Python produce the same hash than cvs_MD5*.
> I do hope they return the same thing. MD5 is a standardized algorithm:
> https://tools.ietf.org/html/rfc1321
>
> I'm not sure we really need to expose it our CPL MD5 through Python.
>

_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: port/md5 cvs_MD5*

Kurt Schwehr-2
Dmitry,

Statements like this indicate a world of trouble:

I'm not sure that Python produce the same hash than cvs_MD5*.
> Also I'm not sure what in future they will be the same.

If you remove the swig and python stuff, move the CPLMD5String to cpl_md5.{cpp,h}, and follow Even's suggestions, I will be much happier with the change.  A C++ test would be good too.  I'll be happy to do a quick fix up of a few things that I see in the md5 code once it is to that point.  I do think it is a good thing that there be md5 support in port along with the existing sha{1,256} code.

I see what you are saying about the OGRGeometry::exportTo{json,kml,gml}, but that doesn't (to me) support your argument.  My personal opinion is that GDAL would have been stronger if those had separate and not been a part of OGRGeometry.

If there is a critical difference between CPLMD5String or a language and use case where md5 support in that language from GDAL is really required, you need to first document that issue.

On Wed, Dec 20, 2017 at 7:46 AM, Dmitry Baryshnikov <[hidden email]> wrote:
Just the note that CPLMD5String not only for Python, but any other API clients: C/C++, Java, Perl, etc.

But, I agree, it can be easily removed.

Best regards,
    Dmitry

<a href="tel:20.12.2017%2018" value="+12012201718" target="_blank">20.12.2017 18:35, Even Rouault пишет:
And why are you exposing this to python?  Python has md5 hashing
already
built in.
I'm not sure that Python produce the same hash than cvs_MD5*.
I do hope they return the same thing. MD5 is a standardized algorithm:
https://tools.ietf.org/html/rfc1321

I'm not sure we really need to expose it our CPL MD5 through Python.


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev



--

_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: port/md5 cvs_MD5*

Ari Jolma-2

My comments on this:

1) The MD5 function seems to be needed only(?) for the WMS cache and probably also WFS. To me that highlights the need for some integration work regarding caching in GDAL. I also implemented yet another caching code for the WCS (without MD5). Looking at my $HOME/.gdal I see also gmlas_xsd_cache and srs_cache.

2) This discussion thread also highlights the need to go towards more pull request style development at least for bigger changes.

Ari


Kurt Schwehr kirjoitti 21.12.2017 klo 02:56:
Dmitry,

Statements like this indicate a world of trouble:

I'm not sure that Python produce the same hash than cvs_MD5*.
> Also I'm not sure what in future they will be the same.

If you remove the swig and python stuff, move the CPLMD5String to cpl_md5.{cpp,h}, and follow Even's suggestions, I will be much happier with the change.  A C++ test would be good too.  I'll be happy to do a quick fix up of a few things that I see in the md5 code once it is to that point.  I do think it is a good thing that there be md5 support in port along with the existing sha{1,256} code.

I see what you are saying about the OGRGeometry::exportTo{json,kml,gml}, but that doesn't (to me) support your argument.  My personal opinion is that GDAL would have been stronger if those had separate and not been a part of OGRGeometry.

If there is a critical difference between CPLMD5String or a language and use case where md5 support in that language from GDAL is really required, you need to first document that issue.

On Wed, Dec 20, 2017 at 7:46 AM, Dmitry Baryshnikov <[hidden email]> wrote:
Just the note that CPLMD5String not only for Python, but any other API clients: C/C++, Java, Perl, etc.

But, I agree, it can be easily removed.

Best regards,
    Dmitry

<a href="tel:20.12.2017%2018" value="+12012201718" target="_blank" moz-do-not-send="true">20.12.2017 18:35, Even Rouault пишет:
And why are you exposing this to python?  Python has md5 hashing
already
built in.
I'm not sure that Python produce the same hash than cvs_MD5*.
I do hope they return the same thing. MD5 is a standardized algorithm:
https://tools.ietf.org/html/rfc1321

I'm not sure we really need to expose it our CPL MD5 through Python.


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev



--


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: port/md5 cvs_MD5*

Dmitry Baryshnikov-2

Hi Ari,

The pull request and discussion can be found here: https://github.com/OSGeo/gdal/pull/280

I cannot imagine that this improvements will affect something else. MD5 used for cache paths initially, I only added some improvements for WMS cache size limits, expire time and split cache per dataset base (i.e. to delete cached files when dataset deletes).

What about your idea about common caching classes/functions - this is topic to discuss.

Best regards,
    Dmitry
21.12.2017 0:32, Ari Jolma пишет:
My comments on this:

1) The MD5 function seems to be needed only(?) for the WMS cache and probably also WFS. To me that highlights the need for some integration work regarding caching in GDAL. I also implemented yet another caching code for the WCS (without MD5). Looking at my $HOME/.gdal I see also gmlas_xsd_cache and srs_cache.

2) This discussion thread also highlights the need to go towards more pull request style development at least for bigger changes.

Ari


Kurt Schwehr kirjoitti 21.12.2017 klo 02:56:
Dmitry,

Statements like this indicate a world of trouble:

> I'm not sure that Python produce the same hash than cvs_MD5*.
> Also I'm not sure what in future they will be the same.

If you remove the swig and python stuff, move the CPLMD5String to cpl_md5.{cpp,h}, and follow Even's suggestions, I will be much happier with the change.  A C++ test would be good too.  I'll be happy to do a quick fix up of a few things that I see in the md5 code once it is to that point.  I do think it is a good thing that there be md5 support in port along with the existing sha{1,256} code.

I see what you are saying about the OGRGeometry::exportTo{json,kml,gml}, but that doesn't (to me) support your argument.  My personal opinion is that GDAL would have been stronger if those had separate and not been a part of OGRGeometry.

If there is a critical difference between CPLMD5String or a language and use case where md5 support in that language from GDAL is really required, you need to first document that issue.

On Wed, Dec 20, 2017 at 7:46 AM, Dmitry Baryshnikov <[hidden email] [hidden email]> wrote:

    Just the note that CPLMD5String not only for Python, but any other
    API clients: C/C++, Java, Perl, etc.

    But, I agree, it can be easily removed.

    Best regards,
        Dmitry

    20.12.2017 18 <a class="moz-txt-link-rfc2396E" href="tel:20.12.2017%2018"><tel:20.12.2017%2018>:35, Even Rouault пишет:

                And why are you exposing this to python?  Python has
                md5 hashing

        already

                built in.

            I'm not sure that Python produce the same hash than cvs_MD5*.

        I do hope they return the same thing. MD5 is a standardized
        algorithm:
        https://tools.ietf.org/html/rfc1321
        <https://tools.ietf.org/html/rfc1321>

        I'm not sure we really need to expose it our CPL MD5 through
        Python.


    _______________________________________________
    gdal-dev mailing list
    [hidden email] [hidden email]
    https://lists.osgeo.org/mailman/listinfo/gdal-dev
    <https://lists.osgeo.org/mailman/listinfo/gdal-dev>




-- 
-- 
http://schwehr.org


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev




_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: port/md5 cvs_MD5*

Even Rouault-2

On jeudi 21 décembre 2017 00:58:53 CET Dmitry Baryshnikov wrote:

> Hi Ari,

>

> The pull request and discussion can be found here:

> https://github.com/OSGeo/gdal/pull/280

>

> I cannot imagine that this improvements will affect something else. MD5

> used for cache paths initially, I only added some improvements for WMS

> cache size limits, expire time and split cache per dataset base (i.e. to

> delete cached files when dataset deletes).

>

> What about your idea about common caching classes/functions - this is

> topic to discuss.

 

Ah, and in a driver I'm going to commit in a few hours, I also have a (primitive) on-disk tile caching.

 

For in-memory caching, I found this useful class that implements a LRU cache:

https://github.com/rouault/gdal2/blob/idaho_driver/gdal/port/cpl_mem_cache.h

 

Even

 

--

Spatialys - Geospatial professional services

http://www.spatialys.com


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: port/md5 cvs_MD5*

Ari Jolma-2
In reply to this post by Dmitry Baryshnikov-2

Dmitry Baryshnikov kirjoitti 21.12.2017 klo 08:58:

Hi Ari,

The pull request and discussion can be found here: https://github.com/OSGeo/gdal/pull/280


Yes. Sorry. I didn't see that. Maybe Kurt also didn't see that - or the aspect of it that he brought up here. Maybe a place for some institutional improvements.

I cannot imagine that this improvements will affect something else. MD5 used for cache paths initially, I only added some improvements for WMS cache size limits, expire time and split cache per dataset base (i.e. to delete cached files when dataset deletes).

What about your idea about common caching classes/functions - this is topic to discuss.


I needed a simple URL -> a few files cache for the WCS. Not much code but probably other drivers need such functionality too. I didn't do good research although I knew WMS has similar needs. The code is not much but issues such as unique filenames and simultaneous access from threads/processes add complication.

Ari


Best regards,
    Dmitry
21.12.2017 0:32, Ari Jolma пишет:
My comments on this:

1) The MD5 function seems to be needed only(?) for the WMS cache and probably also WFS. To me that highlights the need for some integration work regarding caching in GDAL. I also implemented yet another caching code for the WCS (without MD5). Looking at my $HOME/.gdal I see also gmlas_xsd_cache and srs_cache.

2) This discussion thread also highlights the need to go towards more pull request style development at least for bigger changes.

Ari


Kurt Schwehr kirjoitti 21.12.2017 klo 02:56:
Dmitry,

Statements like this indicate a world of trouble:

> I'm not sure that Python produce the same hash than cvs_MD5*.
> Also I'm not sure what in future they will be the same.

If you remove the swig and python stuff, move the CPLMD5String to cpl_md5.{cpp,h}, and follow Even's suggestions, I will be much happier with the change.  A C++ test would be good too.  I'll be happy to do a quick fix up of a few things that I see in the md5 code once it is to that point.  I do think it is a good thing that there be md5 support in port along with the existing sha{1,256} code.

I see what you are saying about the OGRGeometry::exportTo{json,kml,gml}, but that doesn't (to me) support your argument.  My personal opinion is that GDAL would have been stronger if those had separate and not been a part of OGRGeometry.

If there is a critical difference between CPLMD5String or a language and use case where md5 support in that language from GDAL is really required, you need to first document that issue.

On Wed, Dec 20, 2017 at 7:46 AM, Dmitry Baryshnikov <[hidden email] [hidden email]> wrote:

    Just the note that CPLMD5String not only for Python, but any other
    API clients: C/C++, Java, Perl, etc.

    But, I agree, it can be easily removed.

    Best regards,
        Dmitry

    20.12.2017 18 <a class="moz-txt-link-rfc2396E" href="tel:20.12.2017%2018" moz-do-not-send="true"><tel:20.12.2017%2018>:35, Even Rouault пишет:

                And why are you exposing this to python?  Python has
                md5 hashing

        already

                built in.

            I'm not sure that Python produce the same hash than cvs_MD5*.

        I do hope they return the same thing. MD5 is a standardized
        algorithm:
        https://tools.ietf.org/html/rfc1321
        <https://tools.ietf.org/html/rfc1321>

        I'm not sure we really need to expose it our CPL MD5 through
        Python.


    _______________________________________________
    gdal-dev mailing list
    [hidden email] [hidden email]
    https://lists.osgeo.org/mailman/listinfo/gdal-dev
    <https://lists.osgeo.org/mailman/listinfo/gdal-dev>




-- 
-- 
http://schwehr.org


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev




_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev



_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: port/md5 cvs_MD5*

Kurt Schwehr-2
In reply to this post by Dmitry Baryshnikov-2
Dmitry,


Sorry, I didn't know about the pull request.  I get spammed by github notifications and have yet to figure out how make the 99.99% I don't care about go away. :|

I think commits should always reference relevant issues and pull requests.  The commit message here https://trac.osgeo.org/gdal/changeset/41086 was not particularly useful.  In reading that patch, I assumed that this was a code brand new to GDAL.  I filter out the changes for drivers that I don't use and the diffs never go through my repo (0.8M lines locally versus 1.8M in svn).  The description for r41085 never mentions migrating code from wms to port or the pr.


On Wed, Dec 20, 2017 at 1:58 PM, Dmitry Baryshnikov <[hidden email]> wrote:

Hi Ari,

The pull request and discussion can be found here: https://github.com/OSGeo/gdal/pull/280

I cannot imagine that this improvements will affect something else. MD5 used for cache paths initially, I only added some improvements for WMS cache size limits, expire time and split cache per dataset base (i.e. to delete cached files when dataset deletes).

What about your idea about common caching classes/functions - this is topic to discuss.

Best regards,
    Dmitry
21.12.2017 0:32, Ari Jolma пишет:
My comments on this:

1) The MD5 function seems to be needed only(?) for the WMS cache and probably also WFS. To me that highlights the need for some integration work regarding caching in GDAL. I also implemented yet another caching code for the WCS (without MD5). Looking at my $HOME/.gdal I see also gmlas_xsd_cache and srs_cache.

2) This discussion thread also highlights the need to go towards more pull request style development at least for bigger changes.

Ari


Kurt Schwehr kirjoitti 21.12.2017 klo 02:56:
Dmitry,

Statements like this indicate a world of trouble:

> I'm not sure that Python produce the same hash than cvs_MD5*.
> Also I'm not sure what in future they will be the same.

If you remove the swig and python stuff, move the CPLMD5String to cpl_md5.{cpp,h}, and follow Even's suggestions, I will be much happier with the change.  A C++ test would be good too.  I'll be happy to do a quick fix up of a few things that I see in the md5 code once it is to that point.  I do think it is a good thing that there be md5 support in port along with the existing sha{1,256} code.

I see what you are saying about the OGRGeometry::exportTo{json,kml,gml}, but that doesn't (to me) support your argument.  My personal opinion is that GDAL would have been stronger if those had separate and not been a part of OGRGeometry.

If there is a critical difference between CPLMD5String or a language and use case where md5 support in that language from GDAL is really required, you need to first document that issue.

On Wed, Dec 20, 2017 at 7:46 AM, Dmitry Baryshnikov <[hidden email] [hidden email]> wrote:

    Just the note that CPLMD5String not only for Python, but any other
    API clients: C/C++, Java, Perl, etc.

    But, I agree, it can be easily removed.

    Best regards,
        Dmitry

    20.12.2017 18 <a class="m_-8462408375501910622moz-txt-link-rfc2396E" href="tel:20.12.2017%2018" target="_blank"><tel:20.12.2017%2018>:35, Even Rouault пишет:

                And why are you exposing this to python?  Python has
                md5 hashing

        already

                built in.

            I'm not sure that Python produce the same hash than cvs_MD5*.

        I do hope they return the same thing. MD5 is a standardized
        algorithm:
        https://tools.ietf.org/html/rfc1321
        <https://tools.ietf.org/html/rfc1321>

        I'm not sure we really need to expose it our CPL MD5 through
        Python.


    _______________________________________________
    gdal-dev mailing list
    [hidden email] [hidden email]
    https://lists.osgeo.org/mailman/listinfo/gdal-dev
    <https://lists.osgeo.org/mailman/listinfo/gdal-dev>




-- 
-- 
http://schwehr.org


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev




_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev


_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev



--

_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Reply | Threaded
Open this post in threaded view
|

Re: port/md5 cvs_MD5*

Dmitry Baryshnikov-2
Hi Kurt,

In future I will try to write more detailed commit messages, and of
course, will point to the osgeo/gdal pull request on github.

By the way this situation shows the one more reason to switch to github,
as Even proposed.

Best regards,
     Dmitry

21.12.2017 16:57, Kurt Schwehr пишет:

> Dmitry,
>
> Thank you for doing https://trac.osgeo.org/gdal/changeset/41095
>
> Sorry, I didn't know about the pull request.  I get spammed by github
> notifications and have yet to figure out how make the 99.99% I don't care
> about go away. :|
>
> I think commits should always reference relevant issues and pull requests.
> The commit message here https://trac.osgeo.org/gdal/changeset/41086 was not
> particularly useful.  In reading that patch, I assumed that this was a code
> brand new to GDAL.  I filter out the changes for drivers that I don't use
> and the diffs never go through my repo (0.8M lines locally versus 1.8M in
> svn).  The description for r41085 never mentions migrating code from wms to
> port or the pr.
>
>
> On Wed, Dec 20, 2017 at 1:58 PM, Dmitry Baryshnikov <[hidden email]>
> wrote:
>
>> Hi Ari,
>>
>> The pull request and discussion can be found here:
>> https://github.com/OSGeo/gdal/pull/280
>>
>> I cannot imagine that this improvements will affect something else. MD5
>> used for cache paths initially, I only added some improvements for WMS
>> cache size limits, expire time and split cache per dataset base (i.e. to
>> delete cached files when dataset deletes).
>>
>> What about your idea about common caching classes/functions - this is
>> topic to discuss.
>>
>> Best regards,
>>      Dmitry
>>
>> 21.12.2017 0:32, Ari Jolma пишет:
>>
>> My comments on this:
>>
>> 1) The MD5 function seems to be needed only(?) for the WMS cache and
>> probably also WFS. To me that highlights the need for some integration work
>> regarding caching in GDAL. I also implemented yet another caching code for
>> the WCS (without MD5). Looking at my $HOME/.gdal I see also gmlas_xsd_cache
>> and srs_cache.
>>
>> 2) This discussion thread also highlights the need to go towards more pull
>> request style development at least for bigger changes.
>>
>> Ari
>>
>>
>> Kurt Schwehr kirjoitti 21.12.2017 klo 02:56:
>>
>> Dmitry,
>>
>> Statements like this indicate a world of trouble:
>>
>>> I'm not sure that Python produce the same hash than cvs_MD5*.
>>> Also I'm not sure what in future they will be the same.
>> If you remove the swig and python stuff, move the CPLMD5String to
>> cpl_md5.{cpp,h}, and follow Even's suggestions, I will be much happier with
>> the change.  A C++ test would be good too.  I'll be happy to do a quick fix
>> up of a few things that I see in the md5 code once it is to that point.  I
>> do think it is a good thing that there be md5 support in port along with
>> the existing sha{1,256} code.
>>
>> I see what you are saying about the OGRGeometry::exportTo{json,kml,gml},
>> but that doesn't (to me) support your argument.  My personal opinion is
>> that GDAL would have been stronger if those had separate and not been a
>> part of OGRGeometry.
>>
>> If there is a critical difference between CPLMD5String or a language and
>> use case where md5 support in that language from GDAL is really required,
>> you need to first document that issue.
>>
>> On Wed, Dec 20, 2017 at 7:46 AM, Dmitry Baryshnikov <[hidden email]
>> <mailto:[hidden email]> <[hidden email]>> wrote:
>>
>>      Just the note that CPLMD5String not only for Python, but any other
>>      API clients: C/C++, Java, Perl, etc.
>>
>>      But, I agree, it can be easily removed.
>>
>>      Best regards,
>>          Dmitry
>>
>>      20.12.2017 18 <tel:20.12.2017%2018> <20.12.2017%2018>:35, Even
>> Rouault пишет:
>>
>>                  And why are you exposing this to python?  Python has
>>                  md5 hashing
>>
>>          already
>>
>>                  built in.
>>
>>              I'm not sure that Python produce the same hash than cvs_MD5*.
>>
>>          I do hope they return the same thing. MD5 is a standardized
>>          algorithm:
>>          https://tools.ietf.org/html/rfc1321
>>          <https://tools.ietf.org/html/rfc1321>
>> <https://tools.ietf.org/html/rfc1321>
>>
>>          I'm not sure we really need to expose it our CPL MD5 through
>>          Python.
>>
>>
>>      _______________________________________________
>>      gdal-dev mailing list
>>      [hidden email] <mailto:[hidden email]>
>> <[hidden email]>
>>      https://lists.osgeo.org/mailman/listinfo/gdal-dev
>>      <https://lists.osgeo.org/mailman/listinfo/gdal-dev>
>> <https://lists.osgeo.org/mailman/listinfo/gdal-dev>
>>
>>
>>
>>
>> --
>> --
>> http://schwehr.org
>>
>>
>> _______________________________________________
>> gdal-dev mailing list
>> [hidden email]
>> https://lists.osgeo.org/mailman/listinfo/gdal-dev
>>
>>
>>
>>
>>
>> _______________________________________________
>> gdal-dev mailing [hidden email]://lists.osgeo.org/mailman/listinfo/gdal-dev
>>
>>
>>
>> _______________________________________________
>> gdal-dev mailing list
>> [hidden email]
>> https://lists.osgeo.org/mailman/listinfo/gdal-dev
>>
>
>

_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev