fix(search): support identifier aliases (cds, cdsrn, aleph, doi) #743

Merged
kpsherva merged 1 commit into CERNDocumentServer:master from TahaKhan998:fix/identifier-search-alias
Apr 2, 2026

@TahaKhan998

Closes #703

This PR simplifies identifier search by introducing aliases (identifier, cds, cdsrn, aleph, doi), allowing queries like cds:12345 instead of metadata.identifiers.identifier:12345.

Aliases are mapped via SearchFieldTransformer in invenio.cfg, and tests are added to verify correct query transformation.
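
As a rough illustration of the alias idea (not the actual SearchFieldTransformer used in the PR), the sketch below rewrites alias prefixes in a query string into full field paths. The two mappings are the ones visible in the reviewed diff; `ALIASES`, `TERM`, and `expand_aliases` are hypothetical names, and the regex rewriting is a simplified stand-in for a real query-tree transformer (it does not handle quoted phrases, for example).

```python
import re

# Alias -> full field path. These two entries appear in the reviewed diff;
# everything else in this block is an illustrative assumption.
ALIASES = {
    "inspire": "metadata.related_identifiers.identifier",
    "cds": "metadata.identifiers.identifier",
}

# A "field:" prefix that is not preceded by a word character or a dot.
TERM = re.compile(r"(?<![\w.])([A-Za-z_][\w.]*):")


def expand_aliases(query, aliases=ALIASES):
    """Replace known alias prefixes with their full field paths."""

    def _rewrite(match):
        field = match.group(1)
        # Unknown field names are passed through unchanged.
        return aliases.get(field, field) + ":"

    return TERM.sub(_rewrite, query)


print(expand_aliases("cds:12345"))
# metadata.identifiers.identifier:12345
print(expand_aliases("title:higgs AND inspire:42"))
# title:higgs AND metadata.related_identifiers.identifier:42
```

A test for the transformation could, under the same assumptions, compare the rewritten string for each alias against the expected full field path.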

@TahaKhan998 force-pushed the fix/identifier-search-alias branch 12 times, most recently from 6b493bd to 2ac540e (March 23, 2026 13:09)
@TahaKhan998 force-pushed the fix/identifier-search-alias branch from 2ac540e to 6b20af8 (March 24, 2026 09:00)
Member

@palkerecsenyi left a comment

Looks perfect, just a quick question

Contributor

@zubeydecivelek left a comment

Looks great 🚀

@TahaKhan998 force-pushed the fix/identifier-search-alias branch from 6b20af8 to fa32222 (March 24, 2026 16:01)
Contributor

@sakshamarora1 left a comment

Clean 🚀

Contributor

@kpsherva left a comment

LGTM for the identifiers part! I am happy to merge it (note that there is one minor comment left).
What other fields would you propose to simplify?

Comment on lines +392 to +393
"inspire": "metadata.related_identifiers.identifier",
"cds": "metadata.identifiers.identifier",
Contributor

Both cds (legacy) and inspire identifier values are integers. How can we ensure that the query does not return both cds- and inspire-matching records when a user searches for cds:12345?

Author

Yeah, this is actually something I tried to handle earlier with an AND clause to enforce both the scheme and the identifier value.

The idea was that something like cds:12345 should translate to “find an identifier where scheme = cds AND value = 12345”, so we don’t get cross-matches with other identifier types.
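
To make the intended translation concrete, here is a minimal sketch of how an alias term might be expanded into a scheme-plus-value query string. The helper name `expand_with_scheme` and the field paths `metadata.identifiers.scheme` and `metadata.identifiers.identifier` are assumptions for illustration; this is not the code from the earlier attempt.

```python
def expand_with_scheme(term, prefix="metadata.identifiers"):
    """Turn 'cds:12345' into a query that constrains scheme AND value.

    The '<prefix>.scheme' field name is an assumption for this sketch.
    """
    scheme, sep, value = term.partition(":")
    if not sep or not scheme or not value:
        raise ValueError(f"expected '<scheme>:<value>', got {term!r}")
    return f'({prefix}.scheme:"{scheme}" AND {prefix}.identifier:"{value}")'


print(expand_with_scheme("cds:12345"))
# (metadata.identifiers.scheme:"cds" AND metadata.identifiers.identifier:"12345")
```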

However, the issue was in how the transformer builds the query. The AND clause was effectively applied across the whole record instead of within the same identifier entry. So it behaved like:

“record has some identifier with scheme = cds AND record has some identifier with value = 12345”

instead of enforcing both conditions on the same identifier object.

Because of that, a record with cds:263303 and inspire:12345 could still match a query like inspire:263303, since the scheme and value conditions were satisfied by different identifiers.

So the issue wasn’t really with the idea of restricting by scheme, but with how the transformer applies those conditions. Right now the mapping only targets the value, so we don’t yet strictly guarantee scheme-level isolation.
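
The cross-match described above can be reproduced with plain Python, independent of any search backend. This is only a sketch of the two semantics: `RECORD`, `matches_record_wide`, and `matches_same_entry` are hypothetical names, and the record is the example from this comment (one cds identifier 263303 and one inspire identifier 12345).

```python
# Example record from the discussion: two identifiers with different schemes.
RECORD = {
    "identifiers": [
        {"scheme": "cds", "identifier": "263303"},
        {"scheme": "inspire", "identifier": "12345"},
    ]
}


def matches_record_wide(record, scheme, value):
    """AND applied across the record: each condition may be
    satisfied by a different identifier entry."""
    ids = record["identifiers"]
    has_scheme = any(i["scheme"] == scheme for i in ids)
    has_value = any(i["identifier"] == value for i in ids)
    return has_scheme and has_value


def matches_same_entry(record, scheme, value):
    """Both conditions must hold on the same identifier entry."""
    return any(
        i["scheme"] == scheme and i["identifier"] == value
        for i in record["identifiers"]
    )


print(matches_record_wide(RECORD, "inspire", "263303"))  # True (cross-match)
print(matches_same_entry(RECORD, "inspire", "263303"))   # False
print(matches_same_entry(RECORD, "cds", "263303"))       # True
```

The first function mirrors the behaviour observed with the AND clause; the second is the per-entry semantics that strict scheme isolation would need.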

@TahaKhan998 force-pushed the fix/identifier-search-alias branch from fa32222 to 9d80d2d (March 27, 2026 10:47)
@TahaKhan998 force-pushed the fix/identifier-search-alias branch from 9d80d2d to 4ea108f (March 27, 2026 13:57)
@kpsherva merged commit 3293da8 into CERNDocumentServer:master on Apr 2, 2026
3 checks passed

Development

Successfully merging this pull request may close these issues.

search phrases: simplify the most commonly used, like identifiers

5 participants