I reported an apparent harvester bug on Feb 7 (#3552). GN harvests
most, but not all, of the filtered records from a remote server. Records
that fail to harvest are logged by the harvester with what appears to be
a false error message (often an invalid 0x0 character at a certain line,
although examining the vicinity with a hex editor reveals no such
character). The remote records being harvested had been validated.
I recently revisited the problem, using the metadata record "Strait of
Georgia Synoptic Bottom Trawl Survey, 2012-2015" from the remote server
http://soggy.zoology.ubc.ca:8080/geonetwork, with the harvester's title
filter set to "synoptic; trawl" (without the quotes). The relevant
portion of the harvester log file states:
ERROR [psfDataCentre] - Error occurred while trying to load an xml file: /d0e11093-8990-4552-8616-a1169e5c50ee/metadata/metadata.xml: Error on line 999: An invalid XML character (Unicode: 0x0) was found in the element content of the document.
org.jdom.JDOMException: Error occurred while trying to load an xml file: /d0e11093-8990-4552-8616-a1169e5c50ee/metadata/metadata.xml: Error on line 999: An invalid XML character (Unicode: 0x0) was found in the element content of the document.
If I download the same record as a MEF file, it also fails to import
directly into my local GN, with the same error message.
However, if I download the metadata as XML, or if I extract
metadata.xml from the MEF file that would not load, the record loads
into my local GN without problems. So the problem may lie in how the MEF
file is handled, but I cannot identify anything specific.
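
One way to narrow this down is to read metadata.xml straight out of the MEF archive, without extracting it first, and check those exact bytes. The following is only a minimal sketch: it assumes the MEF file is an ordinary zip archive whose record is stored in an entry ending in "metadata.xml" (as the path in the log suggests), the name check_mef is hypothetical, and Python's XML parser is not the one GN uses, so a clean result here does not prove that GN would accept the file.

```python
import zipfile
import xml.etree.ElementTree as ET


def check_mef(mef):
    """Inspect every entry ending in 'metadata.xml' inside a MEF (zip) archive.

    `mef` may be a file path or a binary file-like object.
    Returns {entry_name: (nul_byte_count, parse_error_or_None)}.
    """
    results = {}
    with zipfile.ZipFile(mef) as archive:
        for name in archive.namelist():
            if not name.endswith("metadata.xml"):
                continue  # skip info.xml and any other entries
            data = archive.read(name)  # raw bytes as stored in the archive
            try:
                ET.fromstring(data)
                error = None
            except ET.ParseError as exc:
                error = str(exc)
            results[name] = (data.count(b"\x00"), error)
    return results
```

Calling check_mef("record.mef") (a placeholder file name) returns one entry per metadata.xml found. A NUL count of zero and no parse error would suggest that the stored bytes are fine and that the problem arises while GN reads the archive.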
I have examined both the downloaded and the extracted metadata record
with a hex editor, and there is no 0x0 byte near line 999, or anywhere
else in the file. The metadata record also validates without errors.
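
The hex-editor check can also be scripted, which helps when a dozen or more records need the same inspection. This is a rough sketch rather than anything GN provides: find_nul_bytes and scan_file are hypothetical helper names, and lines are counted by LF bytes, which may differ slightly from how the XML parser counts them.

```python
from typing import List, Tuple


def find_nul_bytes(data: bytes) -> List[Tuple[int, int]]:
    """Return a 1-based (line, column) pair for every 0x00 byte in data."""
    hits = []
    line = 1
    line_start = 0  # byte offset at which the current line begins
    for offset, byte in enumerate(data):
        if byte == 0x00:
            hits.append((line, offset - line_start + 1))
        elif byte == 0x0A:  # LF: the next byte starts a new line
            line += 1
            line_start = offset + 1
    return hits


def scan_file(path: str) -> List[Tuple[int, int]]:
    """Read a file in binary mode and report the positions of NUL bytes."""
    with open(path, "rb") as handle:
        return find_nul_bytes(handle.read())
```

An empty list from scan_file("metadata.xml") would confirm the hex-editor finding that the file contains no 0x0 byte at all.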
I have another dozen or more records (about 3% of about 650 records)
that behave similarly. All the rest harvest normally.
The remote GN server is version 3.4.3, and the local server is version
3.8.1.