[gdal-dev] CPLReadLine2L and control characters


Martin Landa
Hi,

when working on VFK driver improvements [0] I am dealing with various
problems. I found lines in sample data which contain 0x0x control
character [1] inside. CPLReadLine2L reads all characters:

(gdb) p pszRawLine[68]
$4 = 0 '\000'
(gdb) p pszRawLine[69]
$5 = 63 '?'
(gdb) p pszRawLine[70]
$6 = 4 '\004'
(gdb) p pszRawLine[71]
$8 = 0 '\000'
(gdb) p pszRawLine[72]
$7 = 34 '"'

But then strlen() stops at position 68, the first NUL byte, and the
rest of the line is simply lost. Do you have any idea how to deal with
such data in a reasonable way?
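
To make the truncation concrete, here is a minimal sketch in plain C of how many bytes of a raw line never reach code that relies on strlen(). The helper name BytesHiddenFromStrlen and the nRawLen parameter are made up for illustration; CPLReadLine2L() does not report the raw length, so the sketch assumes it is known some other way.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GDAL.
 * pszRawLine holds nRawLen bytes that may contain embedded NUL bytes.
 * Returns the number of bytes that string functions relying on the
 * first NUL terminator (strlen, strcpy, ...) would never see. */
size_t BytesHiddenFromStrlen(const char *pszRawLine, size_t nRawLen)
{
    /* Position of the first NUL within the raw data, i.e. the value
     * strlen() would report for this buffer. */
    const char *pszFirstNul = (const char *)memchr(pszRawLine, '\0', nRawLen);
    size_t nVisible = pszFirstNul ? (size_t)(pszFirstNul - pszRawLine) : nRawLen;

    return nRawLen - nVisible;
}
```

For a line like the one above, where the first NUL sits at index 68, everything from index 68 up to the real end of the line counts as hidden.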

Thanks a lot for pointers in advance! Martin

[0] https://github.com/OSGeo/gdal/issues/365
[1] https://en.wikipedia.org/wiki/ISO/IEC_8859-2
[2] http://geo102.fsv.cvut.cz/~landa/tmp/0xline.txt

--
Martin Landa
http://geo.fsv.cvut.cz/gwiki/Landa
http://gismentors.cz/mentors/landa
_______________________________________________
gdal-dev mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/gdal-dev

Re: CPLReadLine2L and control characters

Even Rouault-2

Martin,

> when working on VFK driver improvements [0] I am dealing with various
> problems. I found lines in sample data which contain 0x0x control
> character [1] inside.

You mean 0x00, right?

I would not expect those to be valid in ISO-8859-2. How are they supposed to be interpreted? Are you sure the dataset is valid?

> CPLReadLine2L reads all characters:
>
> (gdb) p pszRawLine[68]
> $4 = 0 '\000'
> (gdb) p pszRawLine[69]
> $5 = 63 '?'
> (gdb) p pszRawLine[70]
> $6 = 4 '\004'
> (gdb) p pszRawLine[71]
> $8 = 0 '\000'
> (gdb) p pszRawLine[72]
> $7 = 34 '"'
>
> But then strlen() stops on position 68 and rest of line is simply
> lost. Do you have any idea how to deal with such data in reasonable
> way?

I guess you could transform CPLReadLine2L() into CPLReadLine3L() with an extra int* pnBufLength output argument that would return the final value of nBufLength, and then you would loop over the string to remove those annoying NUL characters.
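
The clean-up loop could look roughly like the sketch below. It is plain C and deliberately does not call any GDAL function. StripEmbeddedNuls is a hypothetical helper name, and the code assumes the caller already knows the real buffer length (for instance from the proposed pnBufLength output) and that the buffer has room for one terminating byte after those nBufLength bytes.

```c
#include <stddef.h>

/* Hypothetical helper, not part of GDAL.
 * Removes every embedded NUL byte from the first nBufLength bytes of
 * pszBuf, compacting the remaining bytes in place.
 * Assumption: pszBuf has capacity for at least nBufLength + 1 bytes,
 * so the final terminator always fits.
 * Returns the new length, which strlen() will now agree with. */
size_t StripEmbeddedNuls(char *pszBuf, size_t nBufLength)
{
    size_t iDst = 0;

    for (size_t iSrc = 0; iSrc < nBufLength; ++iSrc)
    {
        /* Copy forward only the bytes that are not NUL. */
        if (pszBuf[iSrc] != '\0')
            pszBuf[iDst++] = pszBuf[iSrc];
    }

    pszBuf[iDst] = '\0'; /* single terminator at the new end */
    return iDst;
}
```

This drops only 0x00 bytes. Other control characters, such as the 0x04 in the sample line, are kept, so whether they should be filtered too is a separate decision for the driver.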

 

Even

--
Spatialys - Geospatial professional services
http://www.spatialys.com

Re: CPLReadLine2L and control characters

Martin Landa
Hi,

2018-04-20 23:06 GMT+02:00 Even Rouault <[hidden email]>:
> You mean 0x00, right?

right.

> I would not expect those to be valid in ISO-8859-2. How are they supposed to
> be interpreted? Are you sure the dataset is valid?

Right, it's not valid in the sense of ISO-8859-2. Anyway, I would
like to skip such characters so that such lines can still be read.

> I guess you could transform CPLReadLine2L() into CPLReadLine3L() with an
> extra int* pnBufLength output argument that would return the final value of
> nBufLength, and then you would loop over the string to remove those annoying
> NUL characters.

Right, I was just wondering whether it's reasonable to create a new
method just for my case. Thanks for feedback, Ma

--
Martin Landa
http://geo.fsv.cvut.cz/gwiki/Landa
http://gismentors.cz/mentors/landa