FDO OGR 4.1 still has problems with Latin1/ISO-8859-1 when using MapInfo TAB

Hans Milling
Back in 2012 I posted this question: http://osgeo-org.1560.x6.nabble.com/FDO-OGR-3-6-3-7-and-UTF-8-problem-td3898674.html
When reading MapInfo TAB files, the OGR provider treats the data as if it were UTF-8. However, MapInfo never supported UTF-8 or Unicode characters, and for Western European countries it uses Latin1/ISO-8859-1/Windows-1252.
This problem seems to persist in FDO OGR 4.1, used by MapGuide 3.1.
From MapInfo 15.2 onward, MapInfo does support Unicode characters, but I have not checked whether the FDO OGR provider can read the new format at all.
The solution back then was for me to recompile the provider, forcing it to use ISO-8859-1 and removing support for UTF-8/Unicode from all the other file formats as well.

The right thing would be for the provider to read from the TAB file (which is written in clear text) which character set is used for the data in the DAT file. Are there any plans to implement this, or do I need to create a ticket to get it fixed?
Before C# took over, I wrote a lot of C++ code. I could update the code myself and post a fix, but I have absolutely no clue where to start.
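
As a rough illustration of that idea, the Python sketch below pulls the charset name out of a TAB header and maps it to a codec. The sample header, the function names and the mapping table are assumptions made for illustration (the table is only a small subset, and the cp1252 fallback is an arbitrary choice); this is not how the FDO OGR provider or GDAL actually handles it.

```python
import re

# Hypothetical TAB header, modelled on what MapInfo typically writes.
SAMPLE_TAB = '''!table
!version 300
!charset WindowsLatin1

Definition Table
  Type NATIVE Charset "WindowsLatin1"
  Fields 1
    name Char (50) ;
'''

# Illustrative subset only: MapInfo defines many more charset names.
MAPINFO_TO_PYTHON_CODEC = {
    "WindowsLatin1": "cp1252",
    "WindowsLatin2": "cp1250",
    "WindowsCyrillic": "cp1251",
    "ISO8859_1": "iso-8859-1",
}


def tab_charset(tab_text):
    """Return the charset name declared in the TAB header, or None."""
    # Prefer the "!charset NAME" directive near the top of the file.
    m = re.search(r'^!charset\s+(\S+)', tab_text, re.IGNORECASE | re.MULTILINE)
    if m is None:
        # Fall back to the quoted name on the "Type NATIVE Charset" line.
        m = re.search(r'\bCharset\s+"([^"]+)"', tab_text, re.IGNORECASE)
    return m.group(1) if m else None


def codec_for_tab(tab_text, default="cp1252"):
    """Map the declared charset to a Python codec (default is an assumption)."""
    return MAPINFO_TO_PYTHON_CODEC.get(tab_charset(tab_text), default)


# Raw bytes as they would sit in the DAT file for the name "Næsborg".
raw = b"N\xe6sborg"
print(raw.decode(codec_for_tab(SAMPLE_TAB)))
```

The point is only that the character set is declared in the TAB file and can be used to pick the decoder, instead of assuming UTF-8 for every string field.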

Best regards
  Hans Milling...

Re: FDO OGR 4.1 still has problems with Latin1/ISO-8859-1 when using MapInfo TAB

Jackie Ng
Does this happen with vanilla GDAL/OGR without going through the FDO provider?

- Jackie

Re: FDO OGR 4.1 still has problems with Latin1/ISO-8859-1 when using MapInfo TAB

Hans Milling
Hi Jackie

I am not familiar with "vanilla" (a Google search did not reveal anything); however, I tried a simple console application using the 4.0 binaries.
A name with a Danish/Latin letter like:
Næsborg
comes out as:
N波org

      // Create a connection through the FDO OGR provider (4.0 binaries).
      OSGeo.FDO.IConnectionManager mConnMgr = OSGeo.FDO.ClientServices.FeatureAccessManager.GetConnectionManager();
      IConnection mProvConn = mConnMgr.CreateConnection("OSGeo.OGR.4.0");
      IConnectionInfo connInfo = mProvConn.ConnectionInfo;
      IConnectionPropertyDictionary connPropDict = connInfo.ConnectionProperties;
      // Point the provider at the MapInfo TAB file and open it read-only.
      connPropDict.SetProperty("DataSource", @"C:\test\names.tab");
      connPropDict.SetProperty("ReadOnly", "TRUE");
      mProvConn.Open();
      // Describe the schema (result not used further here).
      IDescribeSchema schemaCmd = mProvConn.CreateCommand(OSGeo.FDO.Commands.CommandType.CommandType_DescribeSchema) as IDescribeSchema;
      FeatureSchemaCollection schemaCol = schemaCmd.Execute();
      // Select every feature in the "names" class.
      ISelect selCmd = (ISelect)mProvConn.CreateCommand(OSGeo.FDO.Commands.CommandType.CommandType_Select);
      selCmd.SetFeatureClassName("names");
      IReader myReader = selCmd.Execute();
      // Print the "name" attribute of each feature.
      while (myReader.ReadNext())
      {
        string name = myReader.GetString("name");
        Console.WriteLine(name);
      }
      // Release the reader and the connection when done.
      myReader.Close();
      mProvConn.Close();

If I encode the name as UTF-8 bytes and then decode those bytes as ISO-8859-1:
        byte[] bytes = System.Text.Encoding.UTF8.GetBytes(name);
        string newname = System.Text.Encoding.GetEncoding("ISO-8859-1").GetString(bytes);
I get this output:
Næ³¢org

Seems like the data is corrupted somehow.
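
Both garbled outputs above are consistent with one specific failure mode, sketched here in Python. This assumes (nothing in this thread confirms it) that somewhere in the chain a UTF-8 decoder accepts a lead byte without checking that the following bytes are valid continuation bytes. `lenient_utf8_decode` is a hypothetical stand-in for such a decoder, not actual FDO or OGR code.

```python
def lenient_utf8_decode(data):
    """Decode UTF-8 WITHOUT validating continuation bytes (hypothetical)."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:                 # plain ASCII
            cp, n = b, 1
        elif b >> 5 == 0b110:        # lead byte of a 2-byte sequence
            cp, n = b & 0x1F, 2
        elif b >> 4 == 0b1110:       # lead byte of a 3-byte sequence
            cp, n = b & 0x0F, 3
        elif b >> 3 == 0b11110:      # lead byte of a 4-byte sequence
            cp, n = b & 0x07, 4
        else:                        # stray continuation or invalid byte
            cp, n = 0xFFFD, 1
        # Blindly take the low 6 bits of each following byte.
        for t in data[i + 1:i + n]:
            cp = (cp << 6) | (t & 0x3F)
        if cp > 0x10FFFF:
            cp = 0xFFFD
        out.append(chr(cp))
        i += n
    return "".join(out)


# "Næsborg" as stored in the DAT file (Latin1): 4E E6 73 62 6F 72 67.
raw = "Næsborg".encode("latin-1")

# 0xE6 looks like a 3-byte lead, so "æsb" collapses into U+6CE2.
misread = lenient_utf8_decode(raw)
print(misread)

# The C# round trip: encode as UTF-8, decode as ISO-8859-1.
print(misread.encode("utf-8").decode("latin-1"))

# Decoding the raw bytes with the right character set gives the name back.
print(raw.decode("cp1252"))
```

If this is indeed what happens, the top two bits of `s` and `b` are thrown away during decoding, so the original text cannot be reliably recovered afterwards; the bytes have to be decoded with the right character set in the first place.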

Re: FDO OGR 4.1 still has problems with Latin1/ISO-8859-1 when using MapInfo TAB

Hans Milling
In reply to this post by Jackie Ng
Here is a test MapInfo file, if you want to add it to MapGuide and see for yourself or run some other test:
http://www.geograf.com/filer/names.zip

The problem has existed since 2012 (and before) in all versions of the OGR provider.

The coordinate system used by the TAB file is EUREF89 UTM 32N (EPSG:25832)

Re: FDO OGR 4.1 still has problems with Latin1/ISO-8859-1 when using MapInfo TAB

Jackie Ng
Is this the 2012 thread you are referring to?

http://osgeo-org.1560.x6.nabble.com/FDO-OGR-3-6-3-7-and-UTF-8-problem-td3898674.html

- Jackie

Re: FDO OGR 4.1 still has problems with Latin1/ISO-8859-1 when using MapInfo TAB

Hans Milling
Yes