Browsing by Author "Koca, Melih"
Master Thesis (metadata only)
The costs and benefits of turning data into information using big data systems (2014-08)
Koca, Melih; Arı, İsmail; Sözer, Hasan; Ercan, Ali Özer; Department of Computer Science
This thesis explains the problems of, and solutions for, storing and processing large volumes and streaming types of data cost-effectively for enterprise companies. These new Big Data concepts raise operational, infrastructural and usability problems. The basic data processing concepts are not new, but data generation volumes and velocities are pushing the limits of centralized architectures. Distributed systems using distributed programming models such as the Hadoop framework are used today to handle big data problems. This thesis combines the structural, architectural and financial issues to address big data storage and processing problems, and gives practical examples based on real-life experience from several big data applications in different sectors, including mobile telecommunications, finance and oil-gas fields.

Conference Object (metadata only)
Yüksek-ölçekli mobil iletişim verilerinin açık-kaynak hadoop çerçevesi kullanılarak paralel ve iş-hatlı işlenmesi [Parallel and pipelined processing of large-scale mobile communication data using the open-source Hadoop framework] (IEEE, 2012)
Koca, Melih; Arı, İsmail; Koçak, Uğur; Çalıkuş, O.; Sezgin, C.; Computer Science
The fast increase in mobile device and bandwidth usage is generating large workloads on the IT infrastructures of mobile service providers and increasing management costs. These providers collect log files continuously and use them for billing, operational and marketing purposes. In this paper, we describe the design, implementation and efficient parallel processing of large-scale mobile logs using an open-source, Hadoop-based, low-cost private cloud system for near real-time analytics. We find that batching small files, parallel loading, and pipelining different workloads by overlapping their disk- and CPU-intensive phases can have significant performance benefits. Optimizations were performed in light of these findings. Our web-based interface helps users explore the progress and performance of their workloads.