DMLEX on Wikibase

'''A serialization of the DMLEX model, for LexBib Wikibase'''


This page describes how lexical resource datasets following the [https://docs.oasis-open.org/lexidma/dmlex/v1.0/csd02/dmlex-v1.0-csd02.pdf DMLEX model] are represented on this Wikibase instance. The following sections describe how the DMLEX core classes are represented on LexBib Wikibase. The aim is to present DMLEX datasets to the user on collaboratively editable entity pages, and to allow SPARQL queries over them.
This model is, of course, heavily inspired by the [https://github.com/oasis-tcs/lexidma/blob/master/dmlex-v1.0/specification/serializations/RDF/ontology/dmlex.ttl DMLEX Ontology] (the RDF serialization of DMLEX deploying Ontolex-Lemon).


= DMLEX on LexBib Wikibase =


= Global properties =
 
* "id": [[Property:P186|P186]] (string) - the entry, sense, or form ID in the source dataset
* "sameAs": [[Property:P57|P57]] (url) - should be a proper URI
* "listingOrder": [[Property:P33|P33]] (string) - integer is converted to string
* "langCode": [[Property:P56|P56]] (item) - the IETF language code is mapped to the Wikibase item representing the language
 
== Lexicographical Resource ==
 
Entities of this class are modelled as Q-entities of the class [[Item:Q100|Lexicographical Resource]].
 
=== Object properties ===


Some properties attached to entities of this class that belong to the DMLEX Controlled Values module point to Q-items belonging to the following classes:
* [[Item:Q103|Label Tag]]
* [[Item:Q104|Label Type Tag]]
* [[Item:Q105|Part of Speech Tag]] - should contain a [[Property:P202|P202]] statement pointing to the Wikibase item describing the corresponding LexInfo 3.0 POS
* [[Item:Q106|Source Identity Tag]]
* [[Item:Q107|Transcription Scheme Tag]]


Because DMLEX controlled values are fully reified (i.e., they are Q-entities, not blank nodes), statements that carry a literal ''dmlex:tag'' value attached to dictionary content can be qualified with the corresponding controlled value entity ([[Lexeme:L170#S3|example]]).
 
=== Datatype properties ===
 
* "title": [[Property:P6|P6]] (string)
* "uri": [[Property:P112|P112]] (url)
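
As a minimal sketch, the two datatype properties above could be read back for a single resource as follows. It assumes that [[Item:Q34165|Q34165]] carries both statements and that direct claims are exposed under the ''ldp:'' prefix, as in the query in the SPARQL section; neither assumption has been checked against the live data.

<sparql tryit="1">
#title: Title and URI of a Lexicographical Resource (sketch)

PREFIX lwb: <https://lexbib.elex.is/entity/>
PREFIX ldp: <https://lexbib.elex.is/prop/direct/>

select ?title ?uri
where {
  # P6 = "title", P112 = "uri" (see the list above)
  lwb:Q34165 ldp:P6 ?title.
  optional {lwb:Q34165 ldp:P112 ?uri.}
}
</sparql>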
 
== Entry ==
 
=== Datatype properties ===
 
* "headword" is mapped to ''wikibase:lemma'', to which the language code corresponding to the Lexicographical Resource's "langCode" property value is attached.
* "homographNumber": [[Property:P187|P187]] (string)
 
=== Object properties, represented using Wikibase shallow reification (using qualifiers) ===
 
* "partOfSpeech": the "tag" value of the ''dmlex:PartOfSpeech'' object is mapped to [[Property:P195|P195]] (string). If that string matches one of the controlled values specified for the Lexicographical Resource, the statement is qualified with the matching controlled value item using [[Property:P201|P201]] (string). The "listingOrder" value is also attached as a qualifier.
* "label": the "tag" value of the ''dmlex:Label'' object is mapped to [[Property:P195|P195]] (string). If that string matches one of the controlled values specified for the Lexicographical Resource, the statement is qualified with the matching controlled value item using [[Property:P203|P203]] (string). The "listingOrder" value is also attached as a qualifier.
* "pronunciation": the "text" value of the ''dmlex:Transcription'' object is mapped to [[Property:P204|P204]] (string). The "scheme" value (an IETF language tag) is attached as a qualifier using [[Property:P205|P205]] (string); if that value matches one of the controlled values specified for the Lexicographical Resource, a [[Property:P206|P206]] (item) qualifier is attached as well.
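
The shallow reification described above can be queried through the statement nodes. The following is a hedged sketch, not a tested query: it assumes the standard Wikibase RDF layout (''lp:'' to the statement node, ''lps:'' for its value, ''lpq:'' for its qualifiers) and reuses the [[Property:P207|P207]] link from lexeme to resource that appears in the query in the SPARQL section. Because the list above maps both "partOfSpeech" and "label" tags to P195, rows without a P201 value may be labels or part-of-speech tags that did not match a controlled value.

<sparql tryit="1">
#title: Tag statements with controlled-value and listingOrder qualifiers (sketch)

PREFIX lwb: <https://lexbib.elex.is/entity/>
PREFIX ldp: <https://lexbib.elex.is/prop/direct/>
PREFIX lp: <https://lexbib.elex.is/prop/>
PREFIX lps: <https://lexbib.elex.is/prop/statement/>
PREFIX lpq: <https://lexbib.elex.is/prop/qualifier/>

select ?lexeme ?lemma ?tag ?pos_item ?listing_order
where {
  ?lexeme ldp:P207 lwb:Q34165; wikibase:lemma ?lemma; lp:P195 ?statement.
  ?statement lps:P195 ?tag.                          # literal "tag" value
  optional {?statement lpq:P201 ?pos_item.}          # matching controlled value, if any
  optional {?statement lpq:P33 ?listing_order.}      # "listingOrder" qualifier
}
order by ?lemma
</sparql>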
 
== Sense ==
 
A lexeme sense on Wikibase is modeled by default as an instance of ''ontolex:LexicalSense'', and the DMLex class ''dmlex:Sense'' is mapped to it. '''Note: in [https://github.com/oasis-tcs/lexidma/blob/master/dmlex-v1.0/specification/serializations/RDF/ontology/dmlex.ttl dmlex.ttl], ''dmlex:Sense'' is declared a subclass of ''ontolex:LexicalConcept'', and not of ''ontolex:LexicalSense''.'''
 
=== Datatype properties ===


* "definition" is mapped to [[Property:P209|P209]], datatype "string".
* "example" is mapped to [[Property:P208|P208]], datatype "string".
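
As an illustration of these two mappings, the sketch below lists the definition and example texts per sense instead of counting them. It assumes the same lexeme-to-resource link ([[Property:P207|P207]]) and the same predefined ''ontolex:'' and ''wikibase:'' prefixes as the query in the SPARQL section. A sense with several definitions and several examples will yield one row per combination.

<sparql tryit="1">
#title: Definitions and examples per sense (sketch)

PREFIX lwb: <https://lexbib.elex.is/entity/>
PREFIX ldp: <https://lexbib.elex.is/prop/direct/>

select ?lexeme ?lemma ?sense ?definition ?example
where {
  ?lexeme ldp:P207 lwb:Q34165; wikibase:lemma ?lemma; ontolex:sense ?sense.
  optional {?sense ldp:P209 ?definition.}   # P209 = "definition"
  optional {?sense ldp:P208 ?example.}      # P208 = "example"
}
order by ?lemma ?sense
</sparql>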


== Inflected Form ==


= SPARQL =
== Slovar slovenskih členkov ([[Item:Q34165|Q34165]]) ==
<sparql tryit="1">
#title: Slovar slovenskih členkov entries


PREFIX lwb: <https://lexbib.elex.is/entity/>
PREFIX ldp: <https://lexbib.elex.is/prop/direct/>
PREFIX lp: <https://lexbib.elex.is/prop/>
PREFIX lps: <https://lexbib.elex.is/prop/statement/>
PREFIX lpq: <https://lexbib.elex.is/prop/qualifier/>
PREFIX lpr: <https://lexbib.elex.is/prop/reference/>
PREFIX lno: <https://lexbib.elex.is/prop/novalue/>


select ?lexeme ?lexeme_nr ?lemma (count (distinct ?sense) as ?num_of_senses) (count (distinct ?def) as ?num_of_defs) (count (distinct ?expl) as ?num_of_examples)
where {
  ?lexeme ldp:P207 lwb:Q34165; wikibase:lemma ?lemma; ontolex:sense ?sense.
  optional {?sense ldp:P209 ?def.} optional {?sense ldp:P208 ?expl.}
  bind (xsd:integer(strafter(str(?lexeme),"https://lexbib.elex.is/entity/L")) as ?lexeme_nr)
  filter (?lexeme_nr > 34) # this is because of bug https://phabricator.wikimedia.org/T363312
}
group by ?lexeme ?lexeme_nr ?lemma
order by ?lexeme_nr
</sparql>