dc.contributor.author: Eghbalzadeh, H.
dc.contributor.author: Hosseini, B
dc.contributor.author: Khadivi, S.
dc.contributor.author: Khodabakhsh, Ali
dc.date.accessioned: 2016-06-30T12:33:36Z
dc.date.available: 2016-06-30T12:33:36Z
dc.date.issued: 2012
dc.identifier.uri: http://hdl.handle.net/10679/4245
dc.identifier.uri: http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=6483172
dc.description.abstract: Lack of multi-application text corpus despite of the surging text data is a serious bottleneck in the text mining and natural language processing especially in Persian language. This paper presents a new corpus for NEWS articles analysis in Persian called Persica. NEWS analysis includes NEWS classification, topic discovery and classification, trend discovery, category classification and many more procedures. Dealing with NEWS has special requirements. First of all it needs a valid and NEWS-content-enriched corpus to perform the experiments. Our Approach is based on a modified category classification and data normalization over Persian NEWS articles which has led to creation of a multipurpose Persian corpus which shows reasonable results in text mining outcomes. In the literature, regarding to our knowledge there are few Persian corpuses but none of them have Persian NEWS time trend characteristics. Empirical results on our benchmark indicate that in addition to reducing the problem dimensions and useless content, Persica keeps admissible validity and reliability in comparison with standard corpuses in the literature.
dc.language.iso: eng (en_US)
dc.publisher: IEEE
dc.relation.ispartof: Telecommunications (IST), 2012 Sixth International Symposium on
dc.rights: restrictedAccess
dc.title: Persica: A Persian corpus for multi-purpose text mining and natural language processing (en_US)
dc.type: Conference paper (en_US)
dc.peerreviewed: yes
dc.publicationstatus: published (en_US)
dc.contributor.department: Özyeğin University
dc.identifier.startpage: 1207
dc.identifier.endpage: 1214
dc.identifier.wos: WOS:000317974700225
dc.identifier.doi: 10.1109/ISTEL.2012.6483172
dc.subject.keywords: Text classification
dc.subject.keywords: Text mining
dc.subject.keywords: Categorization
dc.subject.keywords: Subject and trend detection
dc.identifier.scopus: SCOPUS:2-s2.0-84876395015
dc.contributor.ozugradstudent: Khodabakhsh, Ali
dc.contributor.authorMale: 1
dc.relation.publicationcategory: Conference Paper - International - Institutional Graduate Student

