
LexBib bibliodata workflow overview: Difference between revisions

From LexBib
Line 47: Line 47:


* For Elexifinder version 2 (spring 2021), we reduced the roughly 5,000 distinct person names then present in the database to around 4,000 unique person items, using the clustering algorithms in [http://openrefine.org OpenRefine]. Persons in LexBib have up to six name variants (see the query at [[Main_Page#See_what.27s_in_the_database|Main Page]]).
* For subsequent updates, we use our own [https://github.com/wetneb/openrefine-wikibase Wikibase reconciliation service with OpenRefine]. This means that person name literals are matched against the person items existing in the LexBib Wikibase, where all name literals previously matched to a person item are stored. [https://github.com/elexis-eu/elexifinder/blob/master/wikibase/sparql/authorliteralsforopenrefine.rq This query] exports Wikibase statements pointing to unmatched persons, and [https://github.com/elexis-eu/elexifinder/blob/master/wikibase/maintenance/newcreatorsfromopenrefine.py newcreatorsfromopenrefine.py] re-imports matched person items or creates new items for those that have remained unmatched, and updates the statements.
* For subsequent updates, we use our own [https://github.com/wetneb/openrefine-wikibase Wikibase reconciliation service with OpenRefine]. This means that person name literals are matched against the person items existing in the LexBib Wikibase, where all name literals previously matched to a person item are stored. [https://github.com/elexis-eu/elexifinder/blob/master/wikibase/sparql/authorliteralsforopenrefine.rq This query] exports Wikibase statements pointing to unmatched persons, and [https://github.com/elexis-eu/elexifinder/blob/master/wikibase/maintenance/newcreatorsfromopenrefine.py newcreatorsfromopenrefine.py] processes the reconciliation results, creates new items for those names that have remained unmatched, and updates the statements and the literals associated with persons.
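
The matching step described above can be illustrated with a minimal Python sketch. It is not the code of newcreatorsfromopenrefine.py or of the reconciliation service. It assumes a simplified key-collision fingerprint (loosely modelled on the kind of clustering OpenRefine offers), and the function names <code>fingerprint</code> and <code>reconcile</code>, the item IDs (<code>Q1</code>, <code>NEW1</code>) and the sample names are hypothetical.

```python
import re
import unicodedata


def fingerprint(name):
    """Simplified key-collision fingerprint (an assumption, not OpenRefine's exact method)."""
    # Strip diacritics, lowercase, replace punctuation with spaces,
    # then sort the unique tokens so that word order does not matter.
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    tokens = re.sub(r"[^\w\s]", " ", ascii_name.lower()).split()
    return " ".join(sorted(set(tokens)))


def reconcile(literals, known_persons):
    """Match name literals against known person items.

    known_persons maps a (hypothetical) item ID to the set of name
    variants already stored for that person. Returns a pair
    (matches, new_items): matches maps each literal to an item ID,
    new_items maps each newly created placeholder ID to its variants.
    """
    # Build a lookup from fingerprint to item ID for all stored variants.
    index = {}
    for item_id, variants in known_persons.items():
        for variant in variants:
            index[fingerprint(variant)] = item_id

    matches = {}
    new_items = {}
    for literal in literals:
        key = fingerprint(literal)
        if key not in index:
            # No existing person matches: create a placeholder item.
            new_id = "NEW{}".format(len(new_items) + 1)
            index[key] = new_id
            new_items[new_id] = set()
        item_id = index[key]
        matches[literal] = item_id
        if item_id in new_items:
            new_items[item_id].add(literal)
    return matches, new_items


if __name__ == "__main__":
    known = {"Q1": {"Kilgarriff, Adam"}, "Q2": {"Gouws, Rufus H."}}
    literals = [
        "Adam Kilgarriff",
        "Rufus H. Gouws",
        "María José Domínguez",
        "Domínguez, María José",
    ]
    matches, new_items = reconcile(literals, known)
    for literal, item_id in matches.items():
        print(literal, "->", item_id)
```

In this sketch, two spellings of an unknown name end up in the same new placeholder item, which mirrors the idea that every literal matched to a person is kept as a name variant of that person. A real run would additionally need manual review of uncertain matches, which the sketch does not model.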


==Indexation of bibliographical items with [[LexVoc]] terms==