dc.contributor.author: Astekin, M.
dc.contributor.author: Zengin, H.
dc.contributor.author: Sözer, Hasan
dc.date.accessioned: 2020-06-10T09:36:07Z
dc.date.available: 2020-06-10T09:36:07Z
dc.date.issued: 2018
dc.identifier.isbn: 978-153865035-6
dc.identifier.issn: 2639-1589 [en_US]
dc.identifier.uri: http://hdl.handle.net/10679/6600
dc.identifier.uri: https://ieeexplore.ieee.org/document/8621967
dc.description.abstract: Anomaly detection is a valuable feature for detecting and diagnosing faults in large-scale, distributed systems. These systems usually provide tens of millions of lines of logs that can be exploited for this purpose. However, centralized implementations of traditional machine learning algorithms fall short of analyzing this data in a scalable manner. One way to address this challenge is to employ distributed systems to analyze the immense amount of logs generated by other distributed systems. We conducted a case study to evaluate two unsupervised machine learning algorithms for this purpose on a benchmark dataset. In particular, we evaluated distributed implementations of the PCA and K-means algorithms. We compared the accuracy and performance of these algorithms both with respect to each other and with respect to their centralized implementations. Results showed that the distributed versions can achieve the same accuracy and provide a performance improvement by orders of magnitude when compared to their centralized versions. The performance of PCA turns out to be better than that of K-means, although we observed that the difference between the two tends to decrease as the degree of parallelism increases. [en_US]
dc.description.sponsorship: Cloud Computing and Big Data Laboratory (B3LAB) of TUBITAK-BILGEM; Software Research Laboratory (SRL) of Ozyegin University
dc.language.iso: eng [en_US]
dc.publisher: IEEE [en_US]
dc.relation.ispartof: 2018 IEEE International Conference on Big Data (Big Data)
dc.rights: restrictedAccess
dc.title: Evaluation of distributed machine learning algorithms for anomaly detection from large-scale system logs: a case study [en_US]
dc.type: Conference paper [en_US]
dc.publicationstatus: Published [en_US]
dc.contributor.department: Özyeğin University
dc.contributor.authorID: (ORCID 0000-0002-2968-4763 & YÖK ID 23178) Sözer, Hasan
dc.contributor.ozuauthor: Sözer, Hasan
dc.identifier.startpage: 2071 [en_US]
dc.identifier.endpage: 2077 [en_US]
dc.identifier.wos: WOS:000468499302022
dc.identifier.doi: https://doi.org/10.1109/BigData.2018.8621967 [en_US]
dc.subject.keywords: Log analysis [en_US]
dc.subject.keywords: Distributed systems [en_US]
dc.subject.keywords: Parallel processing [en_US]
dc.subject.keywords: Anomaly detection [en_US]
dc.subject.keywords: Big data [en_US]
dc.subject.keywords: Machine learning [en_US]
dc.identifier.scopus: SCOPUS:2-s2.0-85062634825
dc.relation.publicationcategory: Conference Paper - International - Institutional Academic Staff

