LexBib bibliodata workflow overview



The following tasks are planned and await a detailed workflow design:
* Indexation of bibliographical items written in languages other than English and Spanish
** This will start as soon as the LexVoc translation is completed and a lemmatization procedure for other languages is implemented.
* Evaluation of [[LexVoc terms]] as content-describing indicators
** Idea: Authors rate the content descriptors (LexVoc terms) assigned to their articles. The rating can be used to improve the indexation process (e.g. discard descriptors repeatedly marked as irrelevant, or prioritize descriptors according to a certain frequency threshold).
* Alignment of [[Item:Q5|person]] (and [[Item:Q11|organization]]) items to Wikidata and VIAF:
** This can be done using OpenRefine. An experiment using a subset of LexBib showed that about 25% of LexBib persons are found on Wikidata, and around 40% on VIAF. Person entity data on Wikidata contains ORCID identifiers, among other person metadata such as birth (and death) dates and affiliations. Person entity data on VIAF contains references to authored publications (of all domains), birth (and death) dates, etc.
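
The author-rating idea listed above (discarding descriptors repeatedly marked as irrelevant) could be sketched roughly as follows. This is only an illustration, not a designed workflow: the function name <code>filter_descriptors</code>, the <code>(term, is_relevant)</code> input format, and the <code>min_irrelevant</code> threshold are all hypothetical assumptions, since the page does not yet specify how ratings would be collected or stored.

```python
from collections import Counter


def filter_descriptors(ratings, min_irrelevant=2):
    """Split rated LexVoc terms into kept and discarded lists.

    ratings: iterable of (term, is_relevant) pairs, one per author rating
             (hypothetical input format, assumed for this sketch).
    min_irrelevant: hypothetical threshold; a term is discarded once it has
             been marked irrelevant at least this many times.
    """
    relevant = Counter()
    irrelevant = Counter()
    for term, is_relevant in ratings:
        if is_relevant:
            relevant[term] += 1
        else:
            irrelevant[term] += 1

    terms = set(relevant) | set(irrelevant)
    discarded = sorted(t for t in terms if irrelevant[t] >= min_irrelevant)
    # Kept terms are ordered by number of "relevant" votes, then alphabetically.
    kept = sorted(
        (t for t in terms if irrelevant[t] < min_irrelevant),
        key=lambda t: (-relevant[t], t),
    )
    return kept, discarded
```

Under these assumptions, a term rated irrelevant twice would be dropped, while the remaining terms would be ranked by how often authors confirmed them. The actual threshold and ranking criterion would need to be decided as part of the workflow design.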