LexBib bibliodata workflow overview

From LexBib
==Author Disambiguation: Open Refine==

* For Elexifinder version 2 (spring 2021), we reduced the roughly 5,000 distinct person names then present in the database to about 4,000 unique person items, using the clustering algorithms in [http://openrefine.org Open Refine]. Persons in LexBib have up to six name variants (see the query at [[Main_Page#See_what.27s_in_the_database|Main Page]]).
* For subsequent updates, we use our own [https://github.com/wetneb/openrefine-wikibase Wikibase reconciliation service with Open Refine]. That is, person name literals are matched against the person items that already exist in the LexBib Wikibase. [https://github.com/elexis-eu/elexifinder/blob/master/wikibase/sparql/authorliteralsforopenrefine.rq This query] exports the Wikibase statements that point to unmatched persons, and [https://github.com/elexis-eu/elexifinder/blob/master/wikibase/maintenance/newcreatorsfromopenrefine.py newcreatorsfromopenrefine.py] re-imports the matched person items, creates new items for the names that remain unmatched, and updates the statements.
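
The name clustering mentioned in the first point can be illustrated with a small sketch. It only approximates the general idea behind key-collision clustering (reduce each name to a normalised key, then group the names that share a key). It is not the Open Refine implementation and not code from the LexBib repository; the function names <code>fingerprint</code> and <code>cluster_names</code> and the sample names are hypothetical.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(name: str) -> str:
    """Reduce a person name to a normalised key (illustrative approximation)."""
    # Decompose accented characters and drop the combining marks (ü -> u).
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Lowercase, then turn punctuation into spaces.
    cleaned = re.sub(r"[^\w\s]", " ", stripped.lower())
    # Sort the unique tokens so that word order does not matter.
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster_names(names):
    """Group name literals whose keys collide."""
    clusters = defaultdict(list)
    for name in names:
        clusters[fingerprint(name)].append(name)
    return dict(clusters)


# Hypothetical name literals, not taken from the LexBib database.
sample = ["Müller, Anna", "Anna Muller", "MULLER, ANNA", "Smith, John"]
clusters = cluster_names(sample)
```

Each cluster with more than one member is a candidate set of variants for a single person item. A key like this is deliberately coarse, so candidate clusters still need manual review before names are merged.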