[gdal-dev] Whitespace in WKT

[gdal-dev] Whitespace in WKT

andrew.bell.ia@gmail.com
Hi,

I have some WKT where the X and Y of a point are separated by newline characters rather than spaces.  From a look at the code, OGRWktReadToken seems to eat spaces and tabs, but not newlines or other whitespace.  My reading of the OGC Simple Features BNF doesn't help much since, AFAICT, the separator between coordinates is an "implied" space:

OGC 06-103r4
<point z> ::= <x> <y> <z>
I would have expected to see spacing specified something like:
<point z> ::= <x> <space> <y> <space> <z>
So I'm confused.  Are only tabs and spaces allowed?  Only a single space?  Is this defined somewhere I'm not seeing?
Thanks,
--
Andrew Bell
[hidden email]

_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
Re: Whitespace in WKT

Even Rouault-2
On Tuesday 8 January 2019 11:38:32 CET Andrew Bell wrote:

> Hi,
>
> I have some WKT where point X and Y are separated by newline characters
> rather than spaces.  A look at OGRWktReadToken seems to eat spaces and
> tabs, but not newlines or other whitespace.  My reading of the OGC simple
> feature BNF doesn't help much, as AFAICT, the separator between is an
> "implied" space:
>
> OGC 06-103r4
>
> <point z> ::= <x> <y> <z>
>
> I would have expected to see spacing specified something like:
>
> <point z> ::= <x> <space> <y> <space> <z>
>
> So I'm confused.  Are only tabs and spaces allowed?  Only a single
> space?  Is this defined somewhere I'm not seeing?

The BNF only defines <space> as " " (Unicode U+0020), but indeed doesn't use
it in a rigorous way in the productions.
SQL/MM Part 3 (at least the publicly available draft of it, or the extract at
https://github.com/postgis/postgis/blob/svn-trunk/doc/bnf-wkt.txt ) doesn't
even mention it...

The general practice in other implementations I've seen (PostGIS, Spatialite),
on the write side, is to just use a single space to separate the coordinates
of a tuple. Some implementations might add an extra space between the geometry
type keyword and the opening parenthesis: "POINT (1 2)" vs "POINT(1 2)". I've
never seen tabs or newlines.
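
To make the write-side convention concrete, here is a minimal Python sketch
that rewrites arbitrary whitespace runs as a single space. It is only an
illustration, not code from GDAL, PostGIS or Spatialite, and
normalize_wkt_whitespace is a hypothetical helper name. It leaves the optional
space between the keyword and "(" as it found it.

```python
def normalize_wkt_whitespace(wkt: str) -> str:
    """Collapse every run of whitespace into one space.

    Hypothetical helper for illustration only: it does not validate the
    WKT, it only rewrites the separators.
    """
    # str.split() with no argument splits on any run of whitespace
    # (spaces, tabs, CR, LF, ...) and drops leading/trailing whitespace.
    return " ".join(wkt.split())


if __name__ == "__main__":
    print(normalize_wkt_whitespace("POINT (1\n2)"))
    print(normalize_wkt_whitespace("POINT(1\t\t2)"))
```

With this, input that uses newlines or tabs between X and Y comes out in the
single-space form that writers usually emit.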

That said, from a quick test, the PostGIS WKT parser seems to support tabs
and newlines, and several occurrences of those separators.

Confirmed by
https://github.com/postgis/postgis/blob/1ba28a8ea39e8be0eabc322c992a315d9c09528e/liblwgeom/lwin_wkt_lex.l#L92

We might be more tolerant on the read side. I can't think right now of any
potential issues in doing so.
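
A rough Python sketch of what "more tolerant on the read side" could mean for
a token reader. This is not the actual OGRWktReadToken implementation: the
function names, the punctuation set and the choice of CR/LF as extra
separators are all assumptions made for illustration. STRICT_WS mirrors the
behaviour described earlier in the thread (spaces and tabs only).

```python
STRICT_WS = " \t"        # behaviour described in the thread: spaces and tabs
TOLERANT_WS = " \t\r\n"  # assumption: also accept CR and LF when reading
PUNCTUATION = "(),"      # assumption: each of these is a token by itself


def read_token(text, pos, whitespace=TOLERANT_WS):
    """Return (token, next_pos); token is "" once the input is exhausted."""
    n = len(text)
    # Skip any run of separator characters before the token.
    while pos < n and text[pos] in whitespace:
        pos += 1
    if pos >= n:
        return "", pos
    # A punctuation character is returned as a one-character token.
    if text[pos] in PUNCTUATION:
        return text[pos], pos + 1
    # Otherwise consume up to the next separator or punctuation character.
    start = pos
    while pos < n and text[pos] not in whitespace and text[pos] not in PUNCTUATION:
        pos += 1
    return text[start:pos], pos


def tokenize(text, whitespace=TOLERANT_WS):
    """Split a WKT string into tokens using the given separator set."""
    tokens = []
    pos = 0
    while True:
        token, pos = read_token(text, pos, whitespace)
        if not token:
            return tokens
        tokens.append(token)


if __name__ == "__main__":
    wkt = "POINT (1\n2)"
    print(tokenize(wkt))             # newline treated as a separator
    print(tokenize(wkt, STRICT_WS))  # newline ends up inside a token
```

With the strict separator set the newline stays glued between the two
coordinates, which is the situation Andrew describes. Widening the set is a
local change to the skipping loop and does not affect what gets written.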

Even

--
Spatialys - Geospatial professional services
http://www.spatialys.com