[Analysis] Geospatial server deployment statistics

[Analysis] Geospatial server deployment statistics

Jonathan Moules-4
Hi All,
At the risk of engaging in self-promotion, this may be of interest to
the community.

I've just finished an analysis of what geospatial server software's
behind the ~2.2 million WMS/WFS/WCS/WMTS datasets that GeoSeer has in
its search-engine index.

The extremely short version of things likely of interest here:

* ArcGIS has by far the most deployments: 2,755 (53.70%); other
proprietary is a rounding error.

* GeoServer is the second most popular for deployments (964 (18.79%)),
and hosts by far the most datasets: 963,603 (43.26%).

* MapServer has a very healthy deployment count too: 544 (10.6%), and
serves a considerable number of datasets: 389,709 (17.49%).

* Put another way, at least 2/3rds of the world's geospatial data that's
served via OGC standards is served by Open Source software (mostly
OSGeo). And over 60% between GeoServer and MapServer alone.

* So basically it looks like many cities/counties/provinces have an ArcGIS
Server install and use that for (occasionally token!) compliance with
"open data" edicts, but the full-on SDI data warehouses almost all go
for Open Source.
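
A quick arithmetic check of the figures above, as a minimal Python sketch. The counts and percentages are the ones quoted in this post; the variable names and the `implied_total` helper are illustrative only and are not part of the GeoSeer analysis. Because the percentages are rounded, the implied totals are approximate.

```python
# (count, percentage share) pairs as quoted in the post above.
DEPLOYMENTS = {
    "ArcGIS": (2755, 53.70),
    "GeoServer": (964, 18.79),
    "MapServer": (544, 10.6),
}
DATASETS = {
    "GeoServer": (963603, 43.26),
    "MapServer": (389709, 17.49),
}


def implied_total(count, share_percent):
    """Return the overall total implied by one count and its percentage share."""
    return count / (share_percent / 100.0)


# Combined dataset share of GeoServer and MapServer (the "over 60%" claim).
combined_share = DATASETS["GeoServer"][1] + DATASETS["MapServer"][1]

# Totals implied by the rounded percentages (approximate).
total_deployments = implied_total(*DEPLOYMENTS["ArcGIS"])
total_datasets = implied_total(*DATASETS["GeoServer"])

# Average number of datasets served per deployment, per product.
datasets_per_deployment = {
    name: DATASETS[name][0] / DEPLOYMENTS[name][0] for name in DATASETS
}

if __name__ == "__main__":
    print(f"GeoServer + MapServer dataset share: {combined_share:.2f}%")
    print(f"Implied total deployments: ~{total_deployments:.0f}")
    print(f"Implied total datasets: ~{total_datasets:.0f}")
    for name, average in datasets_per_deployment.items():
        print(f"{name}: ~{average:.0f} datasets per deployment")
```

The implied dataset total comes out at roughly 2.2 million, in line with the index size mentioned at the top of the post.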

You can find (much) more detail (+ numbers for a bunch of the other
OSGeo projects) in the (ad-free, tracking-free, cookie-free,
javascript-free, in fact both free and Free!) blog post:
https://www.geoseer.net/blog/?p=2020-06-04_geospatial_server_software


So yes, good job to everyone who contributes in any way to all these
projects! Hopefully this reinforces how useful they are; maybe you can
use it in future work-bids too (it's the sort of thing that reassures
management). Could also be useful when it comes to figuring out where
limited OSGeo funds will have most impact.

Comments/thoughts/discussion/feedback welcome (on or off list).
Cheers,
Jonathan


_______________________________________________
Discuss mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/discuss

Re: [Analysis] Geospatial server deployment statistics

Sergio Acosta y Lara
Hi Jonathan. This is great work, well done! We were sure that FOSS4G was doing a good job, but when you have numbers things get clearer and the message stronger.
Best,

Sergio Acosta y Lara
Departamento de Geomática
Dirección Nacional de Topografía
Ministerio de Transporte y Obras Públicas
URUGUAY
(598)29157933 ints. 20329/20330
http://geoportal.mtop.gub.uy/

________________________________________
From: Discuss <[hidden email]> on behalf of Jonathan Moules <[hidden email]>
Sent: Thursday, 4 June 2020 15:16
To: OSGeo Discussions
Subject: [OSGeo-Discuss] [Analysis] Geospatial server deployment statistics

Re: [Analysis] Geospatial server deployment statistics

jody.garnett
In reply to this post by Jonathan Moules-4
That is really interesting, Jonathan. If you are open to cross-posting, it would be nice to reference this from a GeoServer blog post.

I especially liked the fingerprinting:
> A ridiculously long, 5000+ item list of default projections that the server supports that 1 in 6 GeoServer administrators hasn't culled

Surprisingly, nobody has made a motion to start with a smaller list, and I think we found that if we provided a smaller list, folks assume GeoServer is less capable.
What do other WMS implementations do?
--
Jody Garnett


Re: [Analysis] Geospatial server deployment statistics

Jonathan Moules-4

Hi Jody,
Thanks for the feedback. You're very welcome to cross-post it; the blog-content is all CC-BY-SA 4.0 by default so share as you wish.

> What do other WMS implementations do?

Projection list - I can't comment on how other software handles this from an administration perspective (I've only ever administered GeoServer). From the GetCapabilities documents I've looked at, though, I don't remember seeing long lists, and no server apart from GeoServer triggered the "> 5000 projections" score item (itself an arbitrary cut-off; I didn't test for a lower bound).
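
For anyone who wants to check their own server, here is a minimal sketch of counting the projections declared in a GetCapabilities document. It assumes the WMS convention of `<CRS>` elements (WMS 1.3.0) or `<SRS>` elements (WMS 1.1.1). The function names are hypothetical, and this is not the GeoSeer scoring code, only an illustration of the "> 5000 projections" check.

```python
import xml.etree.ElementTree as ET

# Cut-off quoted in the thread; described there as arbitrary.
PROJECTION_THRESHOLD = 5000


def declared_projections(capabilities_xml):
    """Return the set of distinct CRS/SRS identifiers found anywhere in the document."""
    root = ET.fromstring(capabilities_xml)
    found = set()
    for element in root.iter():
        # Strip any "{namespace}" prefix so namespaced and plain tags both match.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name in ("CRS", "SRS") and element.text:
            # Older documents may list several codes, space-separated, in one element.
            found.update(element.text.split())
    return found


def exceeds_threshold(capabilities_xml, threshold=PROJECTION_THRESHOLD):
    """True if the document declares more distinct projections than the threshold."""
    return len(declared_projections(capabilities_xml)) > threshold
```

Counting distinct identifiers across the whole document is a simplification: it does not distinguish between projections declared on the root layer and those declared on nested layers.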

(One SQL query later...)
Below is the number of projections per dataset, across all server types.

The right column is the number of declared projections (merging the layer and nested-layer projections); the left column is how many datasets that applies to.
Note: if a service declares more than 30 projections, GeoSeer culls it down to 0.

Count     Num Projections
14        30
200       29
1168      28
1160      27
672       26
3198      25
2037      24
1745      23
5207      22
5175      21
760       20
2348      19
3588      18
4967      17
8254      16
2680      15
14317     14
20274     13
13806     12
17194     11
54545     10
39030     9
43833     8
25608     7
55328     6
50072     5
234676    4
173938    3
149098    2
572727    1
720060    0
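
The table can be summarised with a few lines of Python. This is only a sketch: the dictionary is transcribed from the table above, and `share_with_at_most` is an illustrative helper name. Note that the 0 row also contains the services culled for declaring more than 30 projections, so it overstates the number of datasets that truly declare none.

```python
# Number of declared projections -> number of datasets, transcribed from the table.
DATASETS_BY_PROJECTION_COUNT = {
    30: 14, 29: 200, 28: 1168, 27: 1160, 26: 672, 25: 3198, 24: 2037,
    23: 1745, 22: 5207, 21: 5175, 20: 760, 19: 2348, 18: 3588, 17: 4967,
    16: 8254, 15: 2680, 14: 14317, 13: 20274, 12: 13806, 11: 17194,
    10: 54545, 9: 39030, 8: 43833, 7: 25608, 6: 55328, 5: 50072,
    4: 234676, 3: 173938, 2: 149098, 1: 572727, 0: 720060,
}

total_datasets = sum(DATASETS_BY_PROJECTION_COUNT.values())


def share_with_at_most(limit):
    """Fraction of datasets that declare `limit` projections or fewer."""
    matching = sum(
        datasets
        for projections, datasets in DATASETS_BY_PROJECTION_COUNT.items()
        if projections <= limit
    )
    return matching / total_datasets


if __name__ == "__main__":
    print(f"Total datasets in table: {total_datasets}")
    print(f"Share declaring 4 or fewer projections: {share_with_at_most(4):.1%}")
```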

I don't imagine there would be big resource savings - it's only around 120kB uncompressed.
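
A rough plausibility check of that size, assuming each entry is serialised on its own line as `<CRS>EPSG:nnnn</CRS>` with four spaces of indentation. The EPSG codes generated here are synthetic placeholders, not GeoServer's actual default list, and the 5000 figure is simply the cut-off mentioned earlier.

```python
def estimated_crs_list_bytes(entry_count, indent=4):
    """Estimate the uncompressed size of a list of <CRS> entries, one per line.

    The codes are synthetic four-digit placeholders starting at 2000, so every
    line has the same length (valid for up to 8000 entries).
    """
    lines = [
        f"{' ' * indent}<CRS>EPSG:{2000 + index}</CRS>\n"
        for index in range(entry_count)
    ]
    return sum(len(line.encode("utf-8")) for line in lines)


if __name__ == "__main__":
    size = estimated_crs_list_bytes(5000)
    print(f"Estimated size of 5000 entries: {size} bytes")
```

With these assumptions the list comes to 125,000 bytes, the same order as the figure quoted above.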

Cheers,
Jonathan

Re: [Analysis] Geospatial server deployment statistics

jody.garnett
Thanks for the quick feedback :)

> Thanks for the feedback. You're very welcome to cross-post it; the blog-content is all CC-BY-SA 4.0 by default so share as you wish.

I would prefer to link; the idea is for your article to get a wider reach.

> > What do other WMS implementations do?
>
> Projection list - I can't comment on how the other software deals with this from an administration perspective (I've only ever administered GeoServer), but from when I've looked at the GetCaps I don't remember seeing long lists, and no server apart from GeoServer ended up triggering the " > 5000 projections" score item (itself an arbitrary cut-off, didn't test for a low bound).
> ...
> I don't imagine there would be big resource savings - it's only around 120kB uncompressed.

That is good to know. It still may be worth having a short list by default (since the number of SRS items is often held against GeoServer in performance shootouts).

How often do you collect these stats? Or is this the first time? It would be interesting to know how market share changes over time.

Re: [Analysis] Geospatial server deployment statistics

Jonathan Moules-4
Links to the blog are welcome too. :-)

These particular stats were done as a one-off. There's a bunch of other stats on: https://www.geoseer.net/stats/ - those are generated monthly.

I was planning on doing a "change over time" thing at some point as I have data going back over two years (although with much less coverage of services at the start of that). I'll let you know if/when I get around to it.

Cheers,
Jonathan


Re: [Analysis] Geospatial server deployment statistics

Even Rouault-2
In reply to this post by jody.garnett

On Friday 5 June 2020 08:30:39 CEST Jody Garnett wrote:
> That is really interesting Jonathan, if you are open to cross posting it
> would be nice to reference this from a GeoServer blog post.
>
> I especially liked the fingerprinting:
> > A ridiculously long, 5000+ item list of default projections that the
> > server supports that 1 in 6 GeoServer administrators hasn't culled
>
> Surprisingly nobody has made a motion to start with a smaller list, and I
> think we found that if we provided a smaller list folks assume GeoServer is
> less capable.
> What do other WMS implementations do?

For MapServer, it is up to the service administrator (aka mapfile guru) to define the list of CRSs they want to expose. Otherwise just the global (default) mapfile CRS is exposed.

--
Spatialys - Geospatial professional services
http://www.spatialys.com


Re: [Analysis] Geospatial server deployment statistics

jody.garnett
Interesting, so some of the 5000+ WMS services attributed to GeoServer may be MapServer instances?
--
Jody Garnett


Re: [Analysis] Geospatial server deployment statistics

Even Rouault-2

On Friday 5 June 2020 14:18:42 CEST Jody Garnett wrote:
> Interesting so some of the 5000+ WMS services attributed to geoserver may
> be mapserver instances?

I imagine someone with sufficient time and determination could do much more than say "this is a GeoServer, this is a MapServer". You could probably even tell the precise version number. One advantage of being open source is that our bug trackers are public too, so by looking at which bugs a deployment does or does not have, you could really refine the identification. But that would indeed be time-consuming and cumbersome to do. I guess only black hat hackers would be interested in that :-)

But probably more easily: triggering exception situations in the WMS/WFS/etc. protocols should help to identify the implementation, since error messages are really implementation-specific. But maybe part of that was actually done.

Even

--
Spatialys - Geospatial professional services
http://www.spatialys.com


Re: [Analysis] Geospatial server deployment statistics

James Klassen
I'm not sure about GeoServer, but MapServer is very easy to identify from its responses. The GetCapabilities response even includes the version number in an XML comment (which reminds me, I need to update my MapServer installs).

No parameters:

    https://www.example.org/datasets/wms.map?

    No query information to decode. QUERY_STRING is set, but empty.

or, depending on how the server is set up:

    mapserv(): Web application error. Traditional BROWSE mode requires a TEMPLATE in the WEB section, but none was provided.

Missing WMS REQUEST parameter:

    https://www.example.org/datasets/wms.map?SERVICE=WMS&VERSION=1.3.0

    <ServiceExceptionReport version="1.3.0" xsi:schemaLocation="http://www.opengis.net/ogc http://schemas.opengis.net/wms/1.3.0/exceptions_1_3_0.xsd"><ServiceException>
    msWMSDispatch(): WMS server error. Incomplete WMS request: REQUEST parameter missing
    </ServiceException></ServiceExceptionReport>

GetCapabilities:

    https://www.example.org/datasets/wms.map?SERVICE=WMS&VERSION=1.3.0&Request=GetCapabilities

    <WMS_Capabilities version="1.3.0" xsi:schemaLocation="http://www.opengis.net/wms http://schemas.opengis.net/wms/1.3.0/capabilities_1_3_0.xsd  http://www.opengis.net/sld http://schemas.opengis.net/sld/1.1.0/sld_capabilities.xsd  http://mapserver.gis.umn.edu/mapserver http://www.example.org/datasets/wms.map?SERVICE=WMS&service=WMS&version=1.3.0&request=GetSchemaExtension"><!--
     MapServer version 7.0.5 OUTPUT=PNG OUTPUT=JPEG OUTPUT=KML SUPPORTS=PROJ SUPPORTS=AGG SUPPORTS=FREETYPE SUPPORTS=CAIRO SUPPORTS=SVG_SYMBOLS SUPPORTS=RSVG SUPPORTS=ICONV SUPPORTS=WMS_SERVER SUPPORTS=WMS_CLIENT SUPPORTS=WFS_SERVER SUPPORTS=WFS_CLIENT SUPPORTS=WCS_SERVER SUPPORTS=SOS_SERVER SUPPORTS=FASTCGI SUPPORTS=THREADS SUPPORTS=GEOS INPUT=JPEG INPUT=POSTGIS INPUT=OGR INPUT=GDAL INPUT=SHAPEFILE
    --><Service><Name>WMS</Name>...
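
Based on the sample responses above, here is a minimal sketch of spotting MapServer from a response body. The regular expression and the two error prefixes are taken from the output shown in this message; the function names are hypothetical, and other MapServer versions or configurations may word things differently.

```python
import re

# Matches the "<!-- MapServer version 7.0.5 ..." comment seen in GetCapabilities.
VERSION_COMMENT = re.compile(
    r"<!--\s*MapServer version\s+(\d+(?:\.\d+)*)", re.IGNORECASE
)

# Function-name prefixes that appear in the MapServer error messages shown above.
ERROR_SIGNATURES = ("msWMSDispatch():", "mapserv():")


def mapserver_version(body):
    """Return the version string from the MapServer comment, or None if absent."""
    match = VERSION_COMMENT.search(body)
    return match.group(1) if match else None


def looks_like_mapserver_error(body):
    """True if the body contains one of the MapServer-style error prefixes."""
    return any(signature in body for signature in ERROR_SIGNATURES)
```

A missing version comment proves nothing on its own, so the error-message check is a useful second signal.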

Re: [Analysis] Geospatial server deployment statistics

Jonathan Moules-4
In reply to this post by Even Rouault-2
> Interesting so some of the 5000+ WMS services attributed to geoserver
> may be mapserver instances?

Extremely unlikely. Multiple fingerprints were used for each piece of
software. For GeoServer I have 9 different fingerprints any one of which
is almost always unique to GeoServer, for MapServer there are 5
fingerprints including a unique XML namespace, and they have a
"MapServer Version" comment too. If there was even one conflict then it
was filed under "UNSURE" (which is mostly MapBender). I leant toward
False Negatives over False Positives.
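
The approach described above (several fingerprints per product, any conflict filed as UNSURE) can be sketched as follows. Only the MapServer namespace, the MapServer version comment and the 5000+ projection list come from this thread; the real analysis used more fingerprints than are shown here, and the `UNKNOWN` label and all function names are assumptions made for illustration.

```python
import re

PROJECTION_TAG = re.compile(r"<(?:CRS|SRS)>")

# Each product maps to a list of checks; any single match counts as a hit.
FINGERPRINTS = {
    "MapServer": [
        lambda doc: "http://mapserver.gis.umn.edu/mapserver" in doc,  # XML namespace
        lambda doc: "mapserver version" in doc.lower(),  # version comment
    ],
    "GeoServer": [
        lambda doc: len(PROJECTION_TAG.findall(doc)) > 5000,  # unculled list
    ],
}


def identify_server(doc):
    """Classify a capabilities document, preferring false negatives.

    Returns the product name if exactly one product matches, "UNSURE" if the
    fingerprints of more than one product match, and "UNKNOWN" if none do.
    """
    matches = [
        name
        for name, checks in FINGERPRINTS.items()
        if any(check(doc) for check in checks)
    ]
    if not matches:
        return "UNKNOWN"
    if len(matches) > 1:
        return "UNSURE"
    return matches[0]
```

Treating any cross-product match as UNSURE is what biases the result toward false negatives rather than false positives.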

> You could actually probably tell the precise version number.

I did this with GeoServer 3 years ago:
http://osgeo-org.1560.x6.nabble.com/Stats-on-GeoServer-versions-as-currently-deployed-td5314686.html
That used a very different method, though: a custom scrape of the
GeoServer admin websites, as they show the version number explicitly
(example: http://demo.geo-solutions.it/geoserver/web/).

As Jim said, this would be easy to do with MapServer too as a result of
the "<!--MapServer Version ..." comment, but only about 40% of MapServer
services have that comment enabled (I don't know if that's because it's
version specific or admins can disable it).

