Harvest only new records, maintain modifications to old ones
Hi all! I'm setting up a GeoNetwork portal to provide meteorological data. The intention is to enrich the metadata entries with data coming from other sources. I have a few harvesters that get the metadata from a THREDDS catalog. I then modify the harvested metadata using a Python script based on the OWSLib library.
I'd like to harvest and update only the 'new' records (e.g. the new simulations for the day) using the same harvester, without moving the files in the backend or creating folders.
To keep the harvester from re-harvesting old data (and overwriting my modifications), I am considering the following ideas:
1) Modify the <gmd:dateStamp> when updating a record and set it to a date in the past, so that the harvester does not update the record again.
2) Use the 'harvesting attribute', but then I would have to modify it in the THREDDS catalog.
3) Move the items to the local catalog after each harvesting run. Can that be done from the command line, so that every time the harvester finds new data, it is harvested and then moved?
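For idea 1), this is roughly what I have in mind, as a minimal sketch only. It assumes the records are ISO 19139 documents with <gmd:dateStamp> as a direct child of the root element; the function name set_date_stamp and the sample record are made up for illustration. It only rewrites the XML locally and does not show pushing the record back to GeoNetwork. Whether the THREDDS harvester actually compares this value when deciding to update a record is exactly what I am unsure about.

```python
import xml.etree.ElementTree as ET

# Standard ISO 19139 namespaces.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}


def set_date_stamp(record_xml: str, new_stamp: str) -> str:
    """Return a copy of the record with gmd:dateStamp set to new_stamp.

    Hypothetical helper: assumes gmd:dateStamp is a direct child of the
    root element and contains either gco:DateTime or gco:Date.
    """
    # Keep the gmd:/gco: prefixes when serializing.
    for prefix, uri in NS.items():
        ET.register_namespace(prefix, uri)

    root = ET.fromstring(record_xml)
    stamp = root.find("gmd:dateStamp", NS)
    if stamp is None:
        raise ValueError("record has no gmd:dateStamp element")

    # Explicit None checks: an Element without children is falsy,
    # so "a or b" would pick the wrong branch here.
    value = stamp.find("gco:DateTime", NS)
    if value is None:
        value = stamp.find("gco:Date", NS)
    if value is None:
        raise ValueError("gmd:dateStamp has no gco:DateTime or gco:Date child")

    # The caller must pass a value matching the child type
    # (date-time for gco:DateTime, plain date for gco:Date).
    value.text = new_stamp
    return ET.tostring(root, encoding="unicode")


# Made-up sample record, for illustration only.
SAMPLE = """<gmd:MD_Metadata
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:fileIdentifier>
    <gco:CharacterString>demo-record</gco:CharacterString>
  </gmd:fileIdentifier>
  <gmd:dateStamp>
    <gco:DateTime>2024-05-01T06:00:00</gco:DateTime>
  </gmd:dateStamp>
</gmd:MD_Metadata>"""

if __name__ == "__main__":
    print(set_date_stamp(SAMPLE, "2000-01-01T00:00:00"))
```

The updated XML would then still have to be written back to the catalog by the same script that does the enrichment.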
Does anyone have suggestions?
Regarding point 3), would it make sense in the future to add a harvester option for moving harvested items to the local catalog?