Publication:
A learning-based dependency to constituency conversion algorithm for the Turkish language

dc.contributor.author: Marşan, B.
dc.contributor.author: Yıldız, O. K.
dc.contributor.author: Kuzgun, A.
dc.contributor.author: Yenice, A
dc.contributor.author: Cesur, N.
dc.contributor.author: Yenice, A. B.
dc.contributor.author: Sanıyar, E.
dc.contributor.author: Kuyrukçu, O.
dc.contributor.author: Arıcan, B. N.
dc.contributor.author: Yıldız, Olcay Taner
dc.contributor.department: Computer Science
dc.contributor.ozuauthor: YILDIZ, Olcay Taner
dc.date.accessioned: 2023-08-08T08:17:29Z
dc.date.available: 2023-08-08T08:17:29Z
dc.date.issued: 2022
dc.description.abstract: This study aims to create the first dependency-to-constituency conversion algorithm optimised for the Turkish language. For this purpose, a state-of-the-art morphological analyser (Yıldız et al., 2019) and a feature-based machine learning model were used. To enhance the performance of the conversion algorithm, the bootstrap aggregating meta-algorithm was integrated. While creating the conversion algorithm, the typological properties of Turkish were carefully considered. A comprehensive, manually annotated UD-style dependency treebank was the input, and constituency trees were the output of the conversion algorithm. A team of linguists manually annotated a set of constituency trees, which were used as the gold standard to assess the performance of the algorithm. The conversion process yielded more than 8000 constituency trees, whose UD-style dependency trees are also available on GitHub. In addition to its contribution to Turkish treebank resources, this study offers a viable and easy-to-implement conversion algorithm that can be used to generate new constituency treebanks and training data for NLP resources such as constituency parsers. (en_US)
dc.identifier.endpage: 5062 (en_US)
dc.identifier.isbn: 979-109554672-6
dc.identifier.scopus: 2-s2.0-85144336878
dc.identifier.startpage: 5054 (en_US)
dc.identifier.uri: http://hdl.handle.net/10679/8587
dc.identifier.wos: 000889371705018
dc.language.iso: eng (en_US)
dc.publicationstatus: Published (en_US)
dc.publisher: European Language Resources Association (ELRA) (en_US)
dc.relation.ispartof: 2022 Language Resources and Evaluation Conference, LREC 2022
dc.relation.publicationcategory: International
dc.rights: Attribution-NonCommercial 4.0 International (*)
dc.rights: openAccess
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/ (*)
dc.subject.keywords: Constituency parsing (en_US)
dc.subject.keywords: Constituency dependency conversion (en_US)
dc.subject.keywords: Dependency parsing (en_US)
dc.title: A learning-based dependency to constituency conversion algorithm for the Turkish language (en_US)
dc.type: conferenceObject (en_US)
dspace.entity.type: Publication
relation.isOrgUnitOfPublication: 85662e71-2a61-492a-b407-df4d38ab90d7
relation.isOrgUnitOfPublication.latestForDiscovery: 85662e71-2a61-492a-b407-df4d38ab90d7

Files

Original bundle

Name:
A learning-based dependency to constituency conversion algorithm for the turkish language.pdf
Size:
339.19 KB
Format:
Adobe Portable Document Format

License bundle

Name:
license.txt
Size:
1.45 KB
Format:
Item-specific license agreed upon to submission
