Exact Match queries in geonetwork search

Jo Cook
Hi All,

A user of GeoNetwork 3.10 has asked me why, when they search for a
phrase in the standard search (e.g. a record title), the exact title
is not the first result returned, even when the results are ordered
by relevance. You can see this happening here:
https://spatialdata.gov.scot/geonetwork/srv/eng/catalog.search#/home
if you search for "Community Council Boundaries - Scotland" (I have
tested this with and without the '-' in the record title). We've tried
adding quotes around the title in the search, but nothing seems to
work.

I have looked at the documentation here:
https://geonetwork-opensource.org/manuals/trunk/en/customizing-application/configuring-search-fields.html#
and I understand that the text is being tokenised, but I'm confused
about why the exact title match isn't treated as the most relevant
result. It feels as if it should be.

If that's not the expected behaviour, is there anything I can do in
the Lucene configuration to get the desired result?

Thanks

Jo

--
Jo Cook
t:+44 7930 524 155/twitter:@archaeogeek
Please note that currently I do not work on Friday afternoons. For
urgent responses at that time, please visit
support.astuntechnology.com or phone our office on 01372 744009

-- 
Sign up to our mailing list <https://astuntechnology.com/company/#email-updates> for updates on news, products, conferences, events and training

Astun Technology Ltd, Epsom Square Centre, 6-7 The Derby Square, Epsom, Surrey, KT19 8AG, UK
t:+44 1372 744 009 w: astuntechnology.com <http://astuntechnology.com/> twitter:@astuntech <https://twitter.com/astuntech>

iShare - enterprise geographic intelligence platform <https://astuntechnology.com/ishare/>
GeoServer, PostGIS and QGIS training <https://astuntechnology.com/training-courses/>
Helpdesk and customer portal <https://astuntech.atlassian.net/wiki/spaces/ISHAREHELP/pages/364970043/Astun+Technology+Support+Portal>

Company registration no. 5410695. Registered in England and Wales. Registered office: 120 Manor Green Road, Epsom, Surrey, KT19 8LN VAT no. 864201149.

_______________________________________________
GeoNetwork-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/geonetwork-devel
GeoNetwork OpenSource is maintained at http://sourceforge.net/projects/geonetwork

Re: Exact Match queries in geonetwork search

Jose Garcia
Hi Jo

On relevance: by default the search runs against a special field named any, which contains all of the metadata text, so it's not possible to give a higher score to results that have the text in the title or in other specific fields.
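
To make the point above concrete, here is a deliberately simplified sketch. It is not Lucene's scoring formula; it just counts query-term hits in a single catch-all text, and both records and their contents are made up for illustration. It shows how a record whose title matches exactly can still be outscored by another record that merely mentions the same words more often.

```javascript
// Toy illustration only: this is NOT how Lucene computes relevance.
// Both records are hypothetical; "any" stands in for the catch-all field.
const records = [
  {
    title: "Community Council Boundaries - Scotland",
    any: "Community Council Boundaries - Scotland polygon dataset",
  },
  {
    title: "Local Authority Areas",
    any:
      "Local Authority Areas council boundaries community council wards " +
      "Scotland community planning boundaries Scotland council",
  },
];

// Crude tokeniser: lowercase, split on anything that is not a letter or digit.
function tokenise(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Count how many tokens of `text` are also query terms.
function termHits(query, text) {
  const terms = new Set(tokenise(query));
  return tokenise(text).filter((t) => terms.has(t)).length;
}

// Rank using only the catch-all text, highest score first.
function rankByAny(query) {
  return records
    .map((r) => ({ title: r.title, score: termHits(query, r.any) }))
    .sort((a, b) => b.score - a.score);
}

// "Local Authority Areas" comes first (9 hits against 4), even though
// the other record's title is an exact match for the query.
console.log(rankByAny("Community Council Boundaries - Scotland"));
```

With only one field to score against, there is no way to tell "these words are the title" apart from "these words appear somewhere in the record".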

It's possible to improve this, but it requires some changes to the UI and the Lucene configuration. Here is a summary of the steps:

1) Configure the title (or any other field you want to favour) with a boost higher than 1:


<Field name="title" boost="1.5F"/>


2) Change the search field used by the search form so that it queries both any and the title field, using the format title_OR_any. This should search in both fields and, with the previous configuration, boost results that have the text in the title.

https://github.com/geonetwork/core-geonetwork/blob/master/web-ui/src/main/resources/catalog/views/default/templates/searchForm.html#L16

data-ng-model="searchObj.params.title_OR_any"


$location.path('/search').search({'title_OR_any': any});


data-ng-href="#/search?title_OR_any={{homeAnyField}}">

3)  Restart / reindex 
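
As a rough sketch of why the steps above should help, the toy scorer below adds a separately weighted title score to the catch-all score. It is not Lucene's formula and not GeoNetwork code: the records are hypothetical, the way the two scores are summed is an assumption made for illustration, and the only value taken from this thread is the 1.5 boost.

```javascript
// Toy illustration only: NOT Lucene's scoring and NOT GeoNetwork code.
// TITLE_BOOST mirrors the boost="1.5F" value from step 1; summing the two
// field scores is a simplifying assumption.
const TITLE_BOOST = 1.5;

// Hypothetical records; "any" stands in for the catch-all field.
const records = [
  {
    title: "Community Council Boundaries - Scotland",
    any: "Community Council Boundaries - Scotland polygon dataset",
  },
  {
    title: "Local Authority Areas",
    any:
      "Local Authority Areas council boundaries community council wards " +
      "Scotland community planning boundaries Scotland council",
  },
];

// Crude tokeniser: lowercase, split on anything that is not a letter or digit.
function tokenise(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Count how many tokens of `text` are also query terms.
function termHits(query, text) {
  const terms = new Set(tokenise(query));
  return tokenise(text).filter((t) => terms.has(t)).length;
}

// Score both fields, weighting title hits by the boost.
function scoreTitleOrAny(query, record) {
  return TITLE_BOOST * termHits(query, record.title) + termHits(query, record.any);
}

// Rank by the combined score, highest first.
function rankByTitleOrAny(query) {
  return records
    .map((r) => ({ title: r.title, score: scoreTitleOrAny(query, r) }))
    .sort((a, b) => b.score - a.score);
}

// The exact-title record now comes first: 1.5 * 4 + 4 = 10 against 0 + 9 = 9.
console.log(rankByTitleOrAny("Community Council Boundaries - Scotland"));
```

In this toy data the margin is small, so a larger boost may be needed in a real catalogue where other records repeat the query words many times.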


Regards,
Jose García


On Wed, Sep 9, 2020 at 12:39 PM Jo Cook <[hidden email]> wrote:
[...]


--
Vriendelijke groeten / Kind regards,

Jose García


Veenderweg 13
6721 WD Bennekom
The Netherlands
T: +31 (0)318 416664

Please consider the environment before printing this email.


Re: Exact Match queries in geonetwork search

Jo Cook
Hi Jose,

Thanks for this; it's exactly what I'm after. I will test your changes and get them deployed.

Thanks again,

Jo

On Thu, Sep 17, 2020 at 8:53 AM Jose Garcia <[hidden email]> wrote:
[...]


Re: Exact Match queries in geonetwork search

Jo Cook
Hi Jose,

I think I have this working for searches made from the search tab, but not for searches made from the contribute tab. I have edited https://github.com/geonetwork/core-geonetwork/blob/master/web-ui/src/main/resources/catalog/templates/editor/editorboard.html#L27 so that it also uses "searchObj.params.title_OR_any", but now nothing seems to happen when I run a search. That makes me think either I've got it wrong or there is some JavaScript to change as well?

Thanks

Jo

On Thu, Sep 24, 2020 at 3:12 PM Jo Cook <[hidden email]> wrote:
[...]