Publication: A learning-based dependency to constituency conversion algorithm for the Turkish language
dc.contributor.author | Marşan, B. | |
dc.contributor.author | Yıldız, O. K. | |
dc.contributor.author | Kuzgun, A. | |
dc.contributor.author | Yenice, A | |
dc.contributor.author | Cesur, N. | |
dc.contributor.author | Yenice, A. B. | |
dc.contributor.author | Sanıyar, E. | |
dc.contributor.author | Kuyrukçu, O. | |
dc.contributor.author | Arıcan, B. N. | |
dc.contributor.author | Yıldız, Olcay Taner | |
dc.contributor.department | Computer Science | |
dc.contributor.ozuauthor | YILDIZ, Olcay Taner | |
dc.date.accessioned | 2023-08-08T08:17:29Z | |
dc.date.available | 2023-08-08T08:17:29Z | |
dc.date.issued | 2022 | |
dc.description.abstract | This study aims to create the very first dependency-to-constituency conversion algorithm optimised for the Turkish language. For this purpose, a state-of-the-art morphologic analyser (Yıldız et al., 2019) and a feature-based machine learning model were used. To enhance the performance of the conversion algorithm, the bootstrap aggregating meta-algorithm was integrated. While creating the conversion algorithm, the typological properties of Turkish were carefully considered. A comprehensive, manually annotated UD-style dependency treebank was the input, and constituency trees were the output of the conversion algorithm. A team of linguists manually annotated a set of constituency trees, which were used as the gold standard to assess the performance of the algorithm. The conversion process yielded more than 8000 constituency trees whose UD-style dependency trees are also available on GitHub. In addition to its contribution to Turkish treebank resources, this study also offers a viable and easy-to-implement conversion algorithm that can be used to generate new constituency treebanks and training data for NLP resources like constituency parsers. | en_US |
dc.identifier.endpage | 5062 | en_US |
dc.identifier.isbn | 979-109554672-6 | |
dc.identifier.scopus | 2-s2.0-85144336878 | |
dc.identifier.startpage | 5054 | en_US |
dc.identifier.uri | http://hdl.handle.net/10679/8587 | |
dc.identifier.wos | 000889371705018 | |
dc.language.iso | eng | en_US |
dc.publicationstatus | Published | en_US |
dc.publisher | European Language Resources Association (ELRA) | en_US |
dc.relation.ispartof | 2022 Language Resources and Evaluation Conference, LREC 2022 | |
dc.relation.publicationcategory | International | |
dc.rights | Attribution-NonCommercial 4.0 International | * |
dc.rights | openAccess | |
dc.rights.uri | http://creativecommons.org/licenses/by-nc/4.0/ | * |
dc.subject.keywords | Constituency parsing | en_US |
dc.subject.keywords | Constituency dependency conversion | en_US |
dc.subject.keywords | Dependency parsing | en_US |
dc.title | A learning-based dependency to constituency conversion algorithm for the Turkish language | en_US |
dc.type | conferenceObject | en_US |
dspace.entity.type | Publication | |
relation.isOrgUnitOfPublication | 85662e71-2a61-492a-b407-df4d38ab90d7 | |
relation.isOrgUnitOfPublication.latestForDiscovery | 85662e71-2a61-492a-b407-df4d38ab90d7 |
Files
Original bundle
- Name:
- A learning-based dependency to constituency conversion algorithm for the turkish language.pdf
- Size:
- 339.19 KB
- Format:
- Adobe Portable Document Format
- Description:
License bundle
- Name:
- license.txt
- Size:
- 1.45 KB
- Format:
- Item-specific license agreed upon to submission
- Description: