DMLEX on Wikibase
Revision as of 17:49, 28 April 2024
A serialization of the DMLEX model, for LexBib Wikibase
This page describes how lexical resource datasets that follow the DMLEX model are represented on this Wikibase instance. The following sections cover each DMLEX core class in turn. The aim is to present DMLEX datasets to the user on collaboratively editable entity pages, and to allow SPARQL queries over them.
This model is heavily inspired by the DMLEX Ontology (the RDF serialization of DMLEX deploying OntoLex-Lemon).
DMLEX on LexBib Wikibase
Global properties
- "id": P186 (string) - the entry, sense, or form ID in the source dataset
- "sameAs": P57 (url) - should be a proper URI
- "listingOrder": P33 (string) - integer is converted to string
- "langCode": P56 (item) - the IETF language code is mapped to the Wikibase item representing the language
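The global properties above can be read back with a query along the following lines. This is only a sketch: it assumes that P186, P57 and P33 are available as direct ("truthy") claims under the `ldp:` prefix used in the SPARQL section of this page, and the variable names are illustrative. Because "listingOrder" is stored as a string, it is cast to an integer before sorting.

```sparql
PREFIX ldp: <https://lexbib.elex.is/prop/direct/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

select ?entity ?source_id ?same_as ?order
where {
  ?entity ldp:P186 ?source_id.                 # "id": ID in the source dataset
  optional { ?entity ldp:P57 ?same_as. }       # "sameAs": URI, if present
  optional { ?entity ldp:P33 ?order_str.       # "listingOrder": stored as string
             bind (xsd:integer(?order_str) as ?order) }
}
order by ?order
limit 100
```

Where "listingOrder" is attached as a qualifier rather than as a main statement, it will not appear under `ldp:` and has to be reached through the statement node instead.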
Lexicographical Resource
Entities of this class are modelled as Q-entities of the class Lexicographical Resource.
Object properties
Some properties attached to entities of this class that belong to the DMLEX Controlled Values module point to Q-items belonging to the following classes:
- Definition Type Tag
- Inflected Form Tag
- Label Tag
- Label Type Tag
- Part of Speech Tag - should contain a P202 statement pointing to the Wikibase item describing the corresponding LexInfo 3.0 POS
- Source Identity Tag
- Transcription Scheme Tag
DMLEX controlled values are fully reified: they are Q-entities rather than blank nodes. As a result, a statement that carries a literal dmlex:tag value in the dictionary content can be qualified with the corresponding controlled value entity.
Datatype properties
Entry
Datatype properties
- "headword" is mapped to wikibase:lemma, to which the language code corresponding to the Lexicographical Resource's "langCode" property value is attached.
- "homographNumber": P187 (string)
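As a sketch of how these two datatype properties surface in a query, the following lists headwords with their language tag and homograph number. It assumes, as in the query in the SPARQL section, that `ldp:P207 lwb:Q34165` selects the entries of one particular Lexicographical Resource, and that P187 is available as a direct claim; the variable names are illustrative.

```sparql
PREFIX lwb: <https://lexbib.elex.is/entity/>
PREFIX ldp: <https://lexbib.elex.is/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>

select ?lexeme ?lemma (lang(?lemma) as ?lemma_lang) ?homograph_nr
where {
  ?lexeme ldp:P207 lwb:Q34165;                    # entries of one resource (assumption, see above)
          wikibase:lemma ?lemma.                  # "headword", a language-tagged literal
  optional { ?lexeme ldp:P187 ?homograph_nr. }    # "homographNumber", stored as string
}
order by ?lemma ?homograph_nr
```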
Object properties, represented using Wikibase shallow reification (using qualifiers)
- "partOfSpeech": the "tag" value of the dmlex:PartOfSpeech object is mapped to P195 (string). If that string matches one of the controlled values specified for the Lexicographical Resource, the statement is qualified with the matching controlled value item using P201 (string). A "listingOrder" value is also attached as a qualifier.
- "label": the "tag" value of the dmlex:Label object is mapped to P195 (string). If that string matches one of the controlled values specified for the Lexicographical Resource, the statement is qualified with the matching controlled value item using P203 (string). A "listingOrder" value is also attached as a qualifier.
- "pronunciation": the "text" value of the dmlex:Transcription object is mapped to P204 (string). The "scheme" value (an IETF language tag) is attached as a qualifier using P205 (string). If the literal value matches one of the controlled values specified for the Lexicographical Resource, a P206 (item) qualifier is attached as well.
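Qualifiers are not visible through the direct (`ldp:`) predicates; they hang off the statement node. A minimal sketch for the "pronunciation" mapping described above, assuming the standard Wikibase RDF layout behind the `lp:`, `lps:` and `lpq:` prefixes declared in the SPARQL section (variable names are illustrative):

```sparql
PREFIX lp: <https://lexbib.elex.is/prop/>
PREFIX lps: <https://lexbib.elex.is/prop/statement/>
PREFIX lpq: <https://lexbib.elex.is/prop/qualifier/>

select ?lexeme ?transcription ?scheme ?scheme_item
where {
  ?lexeme lp:P204 ?statement.                       # entity -> statement node
  ?statement lps:P204 ?transcription.               # main value: "text" of the transcription
  optional { ?statement lpq:P205 ?scheme. }         # qualifier: "scheme" as string
  optional { ?statement lpq:P206 ?scheme_item. }    # qualifier: matching controlled value item
}
limit 100
```

The same pattern should apply to "partOfSpeech" and "label" by swapping in P195 for the statement and P201 or P203 for the item qualifier.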
Sense
Inflected Form
SPARQL
#title: Slovar slovenskih členkov entries
PREFIX lwb: <https://lexbib.elex.is/entity/>
PREFIX ldp: <https://lexbib.elex.is/prop/direct/>
PREFIX lp: <https://lexbib.elex.is/prop/>
PREFIX lps: <https://lexbib.elex.is/prop/statement/>
PREFIX lpq: <https://lexbib.elex.is/prop/qualifier/>
PREFIX lpr: <https://lexbib.elex.is/prop/reference/>
PREFIX lno: <https://lexbib.elex.is/prop/novalue/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
select ?lexeme ?lexeme_nr ?lemma (count (distinct ?sense) as ?num_of_senses) (count (distinct ?def) as ?num_of_defs) (count (distinct ?expl) as ?num_of_examples)
where {
?lexeme ldp:P207 lwb:Q34165; wikibase:lemma ?lemma; ontolex:sense ?sense.
optional {?sense ldp:P209 ?def.} optional {?sense ldp:P208 ?expl.}
bind (xsd:integer(strafter(str(?lexeme),"https://lexbib.elex.is/entity/L")) as ?lexeme_nr)
filter (?lexeme_nr > 34) # this is because of bug https://phabricator.wikimedia.org/T363312
}
group by ?lexeme ?lexeme_nr ?lemma
order by ?lexeme_nr