[gdal-dev] VectorTranslate(ogr2ogr) batch dataset import to postgis
I'm running into some problems importing datasets into a PostGIS
database using Python and multiprocessing.
There's one consumer process per CPU, each running conversion tasks; every
task imports an S-57 dataset through the gdal.VectorTranslate Python API.
Each S-57 (layer, geometry type) pair goes into a separate layer, for example
lndare_point, lndare_polygon, etc. I'm using -append to avoid creating the
database structure in advance, leaving that burden to gdal.VectorTranslate
(ultimately CreateLayer), because S-57 has a very complex schema and I'd
like to skip that part of the job for the moment.
There's no process synchronization of any kind: each task just calls
gdal.VectorTranslate against its dataset. I assumed (maybe wrongly) that
Postgres would cope with the concurrent connections.
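Roughly, each conversion task does something like the sketch below (the
connection string, paths, and option list are simplified placeholders, not
my real code):

```python
# Sketch of one conversion task. PG_DSN and the dataset path are
# hypothetical placeholders.
PG_DSN = "PG:dbname=charts schemas=s57"

def translate_options():
    # ogr2ogr-style options handed to gdal.VectorTranslate:
    # -append leaves table creation to CreateLayer instead of
    # requiring a pre-built schema.
    return "-f PostgreSQL -append"

def import_s57(dataset_path):
    from osgeo import gdal  # requires GDAL's Python bindings
    gdal.UseExceptions()
    out = gdal.VectorTranslate(PG_DSN, dataset_path,
                               options=translate_options())
    del out  # dereference to close/flush the output dataset
```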
I'm getting a few errors like this:
CREATE TABLE "s57"."lndare_polygon" ( ... ) | ERROR: duplicate key value
violates unique constraint "pg_type_typname_nsp_index" | DETAIL: Key
(typname, typnamespace)=(lndare_polygon_ogc_fid_seq, 30188621) already exists.
which I guess is a race condition between two processes calling CreateLayer
on the same layer.
The errors only come up in the early stage of the conversion, while not all
tables have been created yet; once they all exist, no more errors.
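The kind of synchronization I was hoping to avoid would look roughly like
this (sketched with threads for brevity; with separate processes it would
need a multiprocessing.Lock or a database-side lock instead, and
ensure_layer/catalog are hypothetical stand-ins for CreateLayer and the
set of existing tables):

```python
import threading

catalog = set()        # stands in for the tables already in PostGIS
created_log = []       # records each actual "CREATE TABLE"
catalog_lock = threading.Lock()

def ensure_layer(name):
    # Without the lock, two workers can both see the layer as missing and
    # both issue CREATE TABLE -- the duplicate-key race described above.
    with catalog_lock:
        if name not in catalog:
            catalog.add(name)        # stands in for CreateLayer()
            created_log.append(name)

def worker(name):
    ensure_layer(name)
    # ... the -append feature inserts would then run concurrently ...

threads = [threading.Thread(target=worker, args=("lndare_polygon",))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All eight workers targeted the same layer, but it was created exactly once.
```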
Are there ogr2ogr options (-gt, -ds_transaction, -doo with
PRELUDE_STATEMENTS/CLOSING_STATEMENTS, ...) that could help get rid of those
errors in my use case?
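For instance, I was wondering whether something along these lines could
serialize whole imports at the database level with a PostgreSQL advisory
lock (the lock key, connection string, and file name are arbitrary
placeholders):

```shell
# Hedged sketch: take a session-level advisory lock when the destination
# is opened and release it when it is closed, so only one import runs
# against the database at a time (this serializes the feature inserts too,
# not just layer creation).
ogr2ogr -f PostgreSQL "PG:dbname=charts schemas=s57" input.000 -append \
  -doo "PRELUDE_STATEMENTS=SELECT pg_advisory_lock(12345)" \
  -doo "CLOSING_STATEMENTS=SELECT pg_advisory_unlock(12345)"
```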
Or do you think there's no other way than creating the database structure in
advance?