Harvest only new records, maintain modifications to old ones

Chiara Scaini-2
Hi all! I'm setting up a GeoNetwork portal to provide meteorological data. The aim is to enrich the metadata entries with data coming from other sources. I have a few harvesters that get the metadata from a THREDDS catalog. I then modify the metadata using a Python script based on the OWSLib library.

I'd like to harvest and update only the 'new' records (e.g. the new simulations for the day) using the same harvester, without moving the files in the backend or creating folders.
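
On the script side, one way to touch only the 'new' records is to remember which identifiers have already been enriched. The following is a minimal sketch under assumptions: the function names, the JSON state file and the sample identifiers are all hypothetical, and it only decides which records the enrichment script should process. It does not change what the harvester itself fetches.

```python
import json
import tempfile
from pathlib import Path


def load_processed(state_file):
    """Return the set of identifiers recorded in the (hypothetical) state file."""
    path = Path(state_file)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text()))


def select_new(identifiers, state_file):
    """Keep only identifiers that were not processed in an earlier run."""
    seen = load_processed(state_file)
    return [ident for ident in identifiers if ident not in seen]


def mark_processed(identifiers, state_file):
    """Add identifiers to the state file so later runs skip them."""
    seen = load_processed(state_file)
    seen.update(identifiers)
    Path(state_file).write_text(json.dumps(sorted(seen)))


# Demo with made-up identifiers: two harvesting runs on consecutive days.
with tempfile.TemporaryDirectory() as tmp:
    state = Path(tmp) / "processed.json"
    first_run = select_new(["sim-a", "sim-b"], state)
    mark_processed(first_run, state)
    second_run = select_new(["sim-a", "sim-b", "sim-c"], state)
```

In the second run only the identifier that appeared after the first run is selected, so modifications already made to the older records would be left alone by the script.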

To stop the old data from being harvested again, I am considering the following ideas:
1) modify the <gmd:dateStamp> when updating and set it to a date in the past, so that the harvester does not update the record.
2) use the 'harvesting attribute', although I would have to modify it in the THREDDS catalog.
3) move the items to the local catalog after each harvest. Can that be done from the command line, so that every time the harvester finds new data they are harvested and moved?
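
Idea 1 could be sketched as follows, using only the Python standard library. This is an illustration under assumptions: the sample record and the helper name `set_date_stamp` are made up, and whether the harvester actually skips a record because of an older date stamp is exactly the open question, not something the code confirms.

```python
import xml.etree.ElementTree as ET

# ISO 19139 namespaces used by gmd:MD_Metadata records.
GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)


def set_date_stamp(record_xml, new_stamp):
    """Return the record with its gmd:dateStamp replaced by new_stamp."""
    root = ET.fromstring(record_xml)
    # gmd:dateStamp is a direct child of the gmd:MD_Metadata root element.
    stamp = root.find(f"{{{GMD}}}dateStamp")
    if stamp is None:
        raise ValueError("record has no gmd:dateStamp element")
    # Drop the existing gco:Date or gco:DateTime child, then write a new one.
    for child in list(stamp):
        stamp.remove(child)
    value = ET.SubElement(stamp, f"{{{GCO}}}DateTime")
    value.text = new_stamp
    return ET.tostring(root, encoding="unicode")


# Made-up sample record, only for illustration.
SAMPLE = f"""<gmd:MD_Metadata xmlns:gmd="{GMD}" xmlns:gco="{GCO}">
  <gmd:fileIdentifier>
    <gco:CharacterString>sim-2017-01-10</gco:CharacterString>
  </gmd:fileIdentifier>
  <gmd:dateStamp>
    <gco:DateTime>2017-01-10T06:00:00</gco:DateTime>
  </gmd:dateStamp>
</gmd:MD_Metadata>"""

updated = set_date_stamp(SAMPLE, "2000-01-01T00:00:00")
```

The modified XML would still have to be written back to the catalog, for example through the OWSLib-based script mentioned above.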

Does anyone have suggestions? 

Regarding point 3), would it make sense in the future to add a harvester option for moving items to the local catalog?

Many thanks,

Chiara Scaini

GeoNetwork-devel mailing list
GeoNetwork OpenSource is maintained at http://sourceforge.net/projects/geonetwork