Anonymous

LexVoc: Difference between revisions

From LexBib
4 bytes removed ,  2 years ago
no edit summary
No edit summary
Line 61: Line 61:
Full text indexation with LexVoc terms, version 3, has recently been done for articles written in English and Spanish. You can see in how many articles a term (i.e., at least one of the labels of that term) has been found in LexBib full texts (see below). You can also see single articles and the 10 most frequent associated LexVoc terms on a single bibliographical item's wikipage ([[Item:Q13936|example]]).
Full text indexation with LexVoc terms, version 3, has recently been done for articles written in English and Spanish. You can see in how many articles a term (i.e., at least one of the labels of that term) has been found in LexBib full texts (see below). You can also see single articles and the 10 most frequent associated LexVoc terms on a single bibliographical item's wikipage ([[Item:Q13936|example]]).


In how many articles have we found LexVoc terms (narrowers of "[[Item:Q1|Lexicography]]")? [https://lexbib.elex.is/query/#PREFIX%20lwb%3A%20%3Chttps%3A%2F%2Flexbib.elex.is%2Fentity%2F%3E%0APREFIX%20ldp%3A%20%3Chttps%3A%2F%2Flexbib.elex.is%2Fprop%2Fdirect%2F%3E%0APREFIX%20lp%3A%20%3Chttps%3A%2F%2Flexbib.elex.is%2Fprop%2F%3E%0APREFIX%20lps%3A%20%3Chttps%3A%2F%2Flexbib.elex.is%2Fprop%2Fstatement%2F%3E%0APREFIX%20lpq%3A%20%3Chttps%3A%2F%2Flexbib.elex.is%2Fprop%2Fqualifier%2F%3E%0APREFIX%20lpr%3A%20%3Chttps%3A%2F%2Flexbib.elex.is%2Fprop%2Freference%2F%3E%0APREFIX%20lno%3A%20%3Chttps%3A%2F%2Flexbib.elex.is%2Fprop%2Fnovalue%2F%3E%0A%0Aselect%20%3FTerm%20%3FenPrefLabel%20%28group_concat%28%3FenAltLabel%3BSEPARATOR%3D%22%3B%22%29%20as%20%3FenAltLabels%29%20%3Fcorpus_hits%0A%0Awhere%20%7B%0A%20%20%3FTerm%20ldp%3AP5%20lwb%3AQ7%3B%0A%20%20rdfs%3Alabel%20%3FenPrefLabel.%20filter%28lang%28%3FenPrefLabel%29%3D%22en%22%29%0A%20%7B%20%3FTerm%20ldp%3AP72%2a%20%3Ffacet%20.%20%3Ffacet%20ldp%3AP131%20lwb%3AQ1.%7D%20%23%20present%20in%20narrower-broader-tree%20with%20LexVoc%20facet%20as%20broader%0A%20%20%20UNION%0A%20%7B%20%3FTerm%20ldp%3AP77%20%3FcloseMatch.%20%3FcloseMatch%20ldp%3AP72%2a%20lwb%3AQ1%20.%20%7D%20%23%20includes%20closeMatch%20items%20without%20own%20broader-rels%0A%20%0A%20OPTIONAL%20%7B%20%3FTerm%20skos%3AaltLabel%20%3FenAltLabel.%0A%20%20%20%20%20%20%20%20%20%20%20%20filter%28lang%28%3FenAltLabel%29%3D%22en%22%29%20%7D%0A%20OPTIONAL%20%7B%20%3FTerm%20lp%3AP109%20%5Blps%3AP109%20%3Fcorpus_hits%20%3B%20lpq%3AP84%20%22LexBib%20Oct%202021%20stopterms%22%5D.%7D%0A%0A%20%20%7D%20group%20by%20%3FTerm%20%3FenPrefLabel%20%3Fcorpus_hits%0A%20%20%20%20order%20by%20DESC%20%28xsd%3Ainteger%28%3Fcorpus_hits%29%29 Query].
In how many articles have we found LexVoc terms (narrowers of "[[Item:Q1|Lexicography]]")? [https://lexbib.elex.is/query/#PREFIX%20lwb%3A%20%3Chttps%3A%2F%2Flexbib.elex.is%2Fentity%2F%3E%0APREFIX%20ldp%3A%20%3Chttps%3A%2F%2Flexbib.elex.is%2Fprop%2Fdirect%2F%3E%0APREFIX%20lp%3A%20%3Chttps%3A%2F%2Flexbib.elex.is%2Fprop%2F%3E%0APREFIX%20lps%3A%20%3Chttps%3A%2F%2Flexbib.elex.is%2Fprop%2Fstatement%2F%3E%0APREFIX%20lpq%3A%20%3Chttps%3A%2F%2Flexbib.elex.is%2Fprop%2Fqualifier%2F%3E%0APREFIX%20lpr%3A%20%3Chttps%3A%2F%2Flexbib.elex.is%2Fprop%2Freference%2F%3E%0APREFIX%20lno%3A%20%3Chttps%3A%2F%2Flexbib.elex.is%2Fprop%2Fnovalue%2F%3E%0A%0Aselect%20%3FTerm%20%3FenPrefLabel%20%28group_concat%28%3FenAltLabel%3BSEPARATOR%3D%22%3B%22%29%20as%20%3FenAltLabels%29%20%3Fcorpus_hits%0A%0Awhere%20%7B%0A%20%20%3FTerm%20ldp%3AP5%20lwb%3AQ7%3B%0A%20%20rdfs%3Alabel%20%3FenPrefLabel.%20filter%28lang%28%3FenPrefLabel%29%3D%22en%22%29%0A%20%7B%20%3FTerm%20ldp%3AP72%2a%20%3Ffacet%20.%20%3Ffacet%20ldp%3AP131%20lwb%3AQ1.%7D%20%23%20present%20in%20narrower-broader-tree%20with%20LexVoc%20facet%20as%20broader%0A%20%20%20UNION%0A%20%7B%20%3FTerm%20ldp%3AP77%20%3FcloseMatch.%20%3FcloseMatch%20ldp%3AP72%2a%20lwb%3AQ1%20.%20%7D%20%23%20includes%20closeMatch%20items%20without%20own%20broader-rels%0A%20%0A%20OPTIONAL%20%7B%20%3FTerm%20skos%3AaltLabel%20%3FenAltLabel.%0A%20%20%20%20%20%20%20%20%20%20%20%20filter%28lang%28%3FenAltLabel%29%3D%22en%22%29%20%7D%0A%20OPTIONAL%20%7B%20%3FTerm%20lp%3AP109%20%5Blps%3AP109%20%3Fcorpus_hits%20%3B%20lpq%3AP84%20%22LexBib%20en%2Fes%2012-2021%22%5D.%7D%0A%0A%20%20%7D%20group%20by%20%3FTerm%20%3FenPrefLabel%20%3Fcorpus_hits%0A%20%20%20%20order%20by%20DESC%20%28xsd%3Ainteger%28%3Fcorpus_hits%29%299 Query].


== Stop-labels ==
== Stop-labels ==


Some term lexicalizations (SKOS ''labels'') are very ambiguous, and do not yield proper results (many false positives and/or too many distinct word senses relevant in the domain of Lexicography). At the time being, the approach towards that problem is a label stop-list, which is accessible [https://github.com/elexis-eu/elexifinder/blob/master/wikibase/stopterms.py here]. These term labels are excluded in article full text indexation (if a term has several labels, the other labels are not affected). This is currently our only approach towards ambiguous term labels; we assume that articles will contain a sufficient amount of other indexable terms, and that siblings, narrowers and broaders of the stoplist term will be among them, so that term indexation will still be meaningful.
Some term lexicalizations (SKOS ''labels'') are very ambiguous, and do not yield proper results (many false positives and/or too many distinct word senses relevant in the domain of Lexicography). At the time being, the approach towards that problem is a label stop-list, which is accessible [https://github.com/elexis-eu/elexifinder/blob/master/wikibase/stopterms.py here]. These term labels are excluded in article full text indexation (if a term has several labels, the other labels are not affected). This is currently our only approach towards ambiguous term labels; we assume that articles will contain a sufficient amount of other indexable terms, and that siblings, narrowers and broaders of the stoplist term will be among them, so that term indexation will still be meaningful.