RDF export options

RDF export options

Jo Cook
Hi All,

I'm trying to harvest a subset of records from a GeoNetwork 3.4.x
installation into CKAN (data.gov.uk) as RDF. There's almost no guidance on
their site about what they accept, apart from this:
https://guidance.data.gov.uk/publish_and_manage_data/harvest_or_add_data/harvest_data/#harvest-data.
I've tried a couple of approaches so far, with varying results:

1) a virtual CSW endpoint with the following CSW GetRecords options:
?SERVICE=CSW&VERSION=2.0.2&REQUEST=GetRecords&typeNames=dcat&ElementSetName=full&resultType=results

This produces an error in CKAN about needing a plugin installed for xml, so
I looked at providing an outputFormat=application/json parameter to the
above request, but that simply produced an error about an invalid parameter
value.

2) rdf.search?_cat=mysubset produces what seems to be a valid output but
CKAN imports a single record with multiple attached datasets rather than
multiple records. It also doesn't seem to bring through the actual metadata.
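For anyone wanting to reproduce the two requests above, here is a minimal sketch of how the URLs could be assembled. The host name and the virtual endpoint name `csw-mysubset` are placeholders, not values from this thread; only the query parameters are taken from the message.

```python
from urllib.parse import urlencode

# Placeholder base URL; substitute the real GeoNetwork service path.
BASE = "https://example.org/geonetwork/srv/eng"


def csw_getrecords_url(endpoint="csw-mysubset", output_format=None):
    """Build the CSW GetRecords URL from approach 1.

    `endpoint` is a hypothetical virtual CSW name. `output_format`
    is only added when given, to mirror the outputFormat experiment.
    """
    params = {
        "SERVICE": "CSW",
        "VERSION": "2.0.2",
        "REQUEST": "GetRecords",
        "typeNames": "dcat",
        "ElementSetName": "full",
        "resultType": "results",
    }
    if output_format is not None:
        params["outputFormat"] = output_format
    return f"{BASE}/{endpoint}?{urlencode(params)}"


def rdf_search_url(category):
    """Build the rdf.search URL from approach 2 for one category."""
    return f"{BASE}/rdf.search?{urlencode({'_cat': category})}"


if __name__ == "__main__":
    print(csw_getrecords_url())
    print(csw_getrecords_url(output_format="application/json"))
    print(rdf_search_url("mysubset"))
```

Building the query string with `urlencode` keeps values such as `application/json` correctly escaped.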

Firstly, can I configure my schema to accept more output formats?
Secondly, am I missing anything with these two approaches?

My final plan is to use the DCAT schema plugin:
https://github.com/metadata101/dcat-ap1.1/tree/3.4.x but I don't know much
about it and whether it's going to help at all.

Can anyone provide me with any advice?

I would be happy to contribute to the documentation about this if I can
figure it out!

Jo

--
*Jo Cook*
t:+44 7930 524 155/twitter:@archaeogeek
Please note that currently I do not work on Friday afternoons. For urgent
responses at that time, please visit support.astuntechnology.com or phone
our office on 01372 744009

--
*Sign up to our mailing list
<https://astuntechnology.com/company/#email-updates> for updates on news,
products, conferences, events and training*

Astun Technology Ltd, The Coach House, 17 West Street, Epsom, Surrey, KT18 7RL, UK
t: +44 1372 744009 w: astuntechnology.com <http://astuntechnology.com/> twitter: @astuntech <https://twitter.com/astuntech>

iShare - enterprise geographic intelligence platform <https://astuntechnology.com/ishare/>
GeoServer, PostGIS and QGIS training <https://astuntechnology.com/services/#training>
Helpdesk and customer portal <http://support.astuntechnology.com/support/login>

Company registration no. 5410695. Registered in England and Wales. Registered office: 120 Manor
Green Road, Epsom, Surrey, KT19 8LN. VAT no. 864201149.

_______________________________________________
GeoNetwork-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/geonetwork-users
GeoNetwork OpenSource is maintained at http://sourceforge.net/projects/geonetwork

Re: RDF export options

Jose Garcia
Hi Jo

See feedback inline.

Regards,
Jose García

On Tue, Apr 2, 2019 at 4:37 PM Jo Cook <[hidden email]> wrote:

> Hi All,
>
> I'm trying to harvest a subset of records from a Geonetwork 3.4.x
> installation into CKAN (data.gov.uk) as rdf. There's almost no guidance on
> their site about what they accept, apart from this:
>
> https://guidance.data.gov.uk/publish_and_manage_data/harvest_or_add_data/harvest_data/#harvest-data
> .
> I've tried a couple of approaches so far, with varying results:
>
> 1) a virtual CSW endpoint with the following CSW GetRecords options:
>
> ?SERVICE=CSW&VERSION=2.0.2&REQUEST=GetRecords&typeNames=dcat&ElementSetName=full&resultType=results
>
> This produces an error in CKAN about needing a plugin installed for xml, so
> I looked at providing an outputFormat=application/json parameter to the
> above request, but that simply produced an error about an invalid parameter
> value.
>

See comment in next item.


>
> 2) rdf.search?_cat=mysubset produces what seems to be a valid output but
> CKAN imports a single record with multiple attached datasets rather than
> multiple records. It also doesn't seem to bring through the actual
> metadata.
>
>
About the single record: have you checked whether it corresponds to the first
Dataset or perhaps to the Catalog element in the RDF?
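One way to check this is to count the Catalog and Dataset elements in the RDF/XML. The following is a minimal sketch using only the Python standard library. The sample document is a hand-written, trimmed illustration of a DCAT catalogue, not actual output from rdf.search, and the URIs in it are placeholders.

```python
import xml.etree.ElementTree as ET

DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"

# Hypothetical, trimmed RDF/XML: one catalogue that references two datasets.
SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcat="http://www.w3.org/ns/dcat#"
         xmlns:dct="http://purl.org/dc/terms/">
  <dcat:Catalog rdf:about="https://example.org/geonetwork">
    <dct:title>Example catalogue</dct:title>
    <dcat:dataset rdf:resource="https://example.org/geonetwork/resource/a"/>
    <dcat:dataset rdf:resource="https://example.org/geonetwork/resource/b"/>
  </dcat:Catalog>
  <dcat:Dataset rdf:about="https://example.org/geonetwork/resource/a">
    <dct:title>Record A</dct:title>
  </dcat:Dataset>
  <dcat:Dataset rdf:about="https://example.org/geonetwork/resource/b">
    <dct:title>Record B</dct:title>
  </dcat:Dataset>
</rdf:RDF>
""".strip()


def summarise(rdf_xml):
    """Count top-level dcat:Catalog and dcat:Dataset elements and list dataset titles."""
    root = ET.fromstring(rdf_xml)
    catalogs = root.findall(f"{{{DCAT}}}Catalog")
    datasets = root.findall(f"{{{DCAT}}}Dataset")
    titles = [d.findtext(f"{{{DCT}}}title") for d in datasets]
    return {"catalogs": len(catalogs), "datasets": len(datasets), "titles": titles}


if __name__ == "__main__":
    print(summarise(SAMPLE))
```

If the title of the single record imported by CKAN matches the Catalog title rather than any Dataset title, that would suggest the harvester treated the catalogue itself as the dataset.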


> Firstly, can I configure my schema to accept more outputformats?
>

GeoNetwork seems to support only application/xml; see
https://github.com/geonetwork/core-geonetwork/blob/master/csw-server/src/main/java/org/fao/geonet/component/csw/GetRecords.java#L152-L153

I guess that if the CSW spec supports other formats, GeoNetwork could be
extended to support them, but that would need some analysis of the changes required.
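On the client side, a rejected outputFormat can be detected by inspecting the service's exception report. This is a minimal sketch that assumes the error comes back as an OWS ExceptionReport in the `http://www.opengis.net/ows` namespace, as is usual for CSW 2.0.2. The sample response and its wording are illustrative, not copied from a GeoNetwork instance.

```python
import xml.etree.ElementTree as ET

OWS = "http://www.opengis.net/ows"

# Illustrative response for an unsupported outputFormat (assumed shape).
SAMPLE_ERROR = """
<ows:ExceptionReport xmlns:ows="http://www.opengis.net/ows" version="1.2.0">
  <ows:Exception exceptionCode="InvalidParameterValue" locator="outputFormat">
    <ows:ExceptionText>Example text: unsupported value</ows:ExceptionText>
  </ows:Exception>
</ows:ExceptionReport>
""".strip()


def ows_exceptions(xml_text):
    """Return (exceptionCode, locator) pairs, or [] if this is not an exception report."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{OWS}}}ExceptionReport":
        return []
    return [
        (exc.get("exceptionCode"), exc.get("locator"))
        for exc in root.findall(f"{{{OWS}}}Exception")
    ]


if __name__ == "__main__":
    print(ows_exceptions(SAMPLE_ERROR))
```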


> Secondly, am I missing anything with these two approaches?
>

There's also a service, rdf.metadata.public.get, that returns an RDF file
(also XML) with all published metadata, but I'm not sure it will make any
difference, as the format should be the same as when requesting CSW or a
single metadata RDF.


> My final plan is to use the DCAT schema plugin:
> https://github.com/metadata101/dcat-ap1.1/tree/3.4.x but I don't know much
> about it and whether it's going to help at all.
>
> Can anyone provide me with any advice?
>

Some colleagues used this schema plugin for a project, but I'm not sure
whether it was integrated with CKAN. I'll ask them to provide further
feedback in case they dealt with this.




--
*Vriendelijke groeten / Kind regards,
Jose García
<http://www.geocat.net/>
Veenderweg 13
6721 WD Bennekom
The Netherlands
T: +31 (0)318 416664 <+31318416664>
<https://www.facebook.com/geocatbv>
<https://twitter.com/geocat_bv>
<https://plus.google.com/u/1/+GeocatNetbv/posts>
Please consider the environment before printing this email.*


Re: RDF export options

Paul van Genuchten
In reply to this post by Jo Cook
Hi Jo, it would be good to put this question to the CKAN list as well. As far as I know, CSW is not a core service type in CKAN; my impression is that the iso19139 schema is hardcoded in the CSW harvester.

I added Stijn to this thread, who contributed the DCAT work. Note that the DCAT schema can result in quite complex RDF; I'm not sure CKAN will be able to ingest it. This type of metadata would be better ingested by a triple store (I did some experiments with Virtuoso, and it works quite nicely).

We generally use (the csw-iso19139 harvester or) a push mechanism to push records to CKAN using the CKAN API; I hope to have some shareable code available soon.
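Until that code is shared, here is a rough idea of what such a push could look like. This is a minimal sketch and not the code referred to above. It builds a request for CKAN's `package_create` action without sending it. The CKAN URL, API key, organisation and field mapping are all placeholders, and a real mapping from GeoNetwork records would carry many more fields.

```python
import json
import re
import urllib.request


def slugify(title):
    """Derive a CKAN dataset name: lowercase letters, digits and hyphens, max 100 chars."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug[:100]


def build_package_create(ckan_url, api_key, title, description, owner_org):
    """Prepare (but do not send) a POST to the CKAN action API."""
    payload = {
        "name": slugify(title),
        "title": title,
        "notes": description,    # CKAN stores the description in 'notes'
        "owner_org": owner_org,  # hypothetical organisation id
    }
    return urllib.request.Request(
        url=f"{ckan_url.rstrip('/')}/api/3/action/package_create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": api_key, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_package_create(
        "https://ckan.example.org",  # placeholder CKAN instance
        "hypothetical-api-key",      # placeholder key
        "Test DEFRA ISO19139 sample record",
        "Description taken from the source metadata record.",
        "example-org",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # Sending would be: urllib.request.urlopen(req)
```

With a push like this, each GeoNetwork record becomes its own CKAN dataset, so it does not depend on how a harvester interprets the RDF.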

Note also the recent work of Francois at https://github.com/geonetwork/core-geonetwork/pull/3212, which provides a DataCite formatter in JSON that could be relevant in this scenario.

Regards, Paul.



Re: RDF export options

Jo Cook
In reply to this post by Jose Garcia
Hi Jose,

The single record that is produced by the CKAN harvester is here:
https://ckan.publishing.service.gov.uk/dataset/test-defra-iso19139-sample-record
and this is the URL that I harvested from:
https://public.eametadata.com/geonetwork/srv/eng/rdf.search?_cat=DSP_TEST
so maybe it is a record for the catalog and not for each dataset.

I'll think about testing the DCAT plugin but I may need a different
approach.

Thanks for your responses

Jo


Re: RDF export options

Jo Cook
In reply to this post by Paul van Genuchten
Thanks Paul,

I'll give the DCAT plugin a test to see if it gets me what I need.

Thanks again

Jo
