Computer Science
Permanent URI for this collection: https://hdl.handle.net/10679/43
Browsing by Rights "Attribution 4.0 International"

Article | Publication | Open Access
Actor-critic reinforcement learning for bidding in bilateral negotiation (TÜBİTAK, 2022)
Arslan, Furkan; Aydoğan, Reyhan; Computer Science; AYDOĞAN, Reyhan; Arslan, Furkan
Designing an effective and intelligent bidding strategy is one of the most compelling research challenges in automated negotiation, where software agents negotiate with each other to find a mutual agreement when there is a conflict of interests. Instead of designing a hand-crafted decision-making module, this work proposes a novel bidding strategy adopting an actor-critic reinforcement learning approach, which learns what to offer in a bilateral negotiation. An entropy reinforcement learning framework called Soft Actor-Critic (SAC) is applied to the bidding problem, and a self-play approach is employed to train the model. Our model learns to produce the target utility of the coming offer based on previous offer exchanges and remaining time. Furthermore, an imitation learning approach called behavior cloning is adopted to speed up the learning process. Also, a novel reward function is introduced that takes into account not only the agent’s own utility but also the opponent’s utility at the end of the negotiation. The developed agent is empirically evaluated: a large number of negotiation sessions are run against a variety of opponents selected in different domains varying in size and opposition. The agent’s performance is compared with its opponents and the performance of the baseline agents negotiating with the same opponents. The empirical results show that our agent successfully negotiates against challenging opponents in different negotiation scenarios without requiring any prior information about the opponent or domain.
Furthermore, it achieves better results than the baseline agents regarding the received utility at the end of the successful negotiations.

Article | Publication | Open Access
Artificial intelligence techniques for conflict resolution (Springer, 2021-08)
Aydoğan, Reyhan; Baarslag, T.; Gerding, E.; Computer Science; AYDOĞAN, Reyhan
Conflict resolution is essential to obtain cooperation in many scenarios such as politics and business, as well as our day to day life. The importance of conflict resolution has driven research in many fields like anthropology, social science, psychology, mathematics, biology and, more recently, in artificial intelligence. Computer science and artificial intelligence have, in turn, been inspired by theories and techniques from these disciplines, which has led to a variety of computational models and approaches, such as automated negotiation, group decision making, argumentation, preference aggregation, and human-machine interaction. To bring together the different research strands and disciplines in conflict resolution, the Workshop on Conflict Resolution in Decision Making (COREDEMA) was organized. This special issue benefited from the workshop series, and consists of significantly extended and revised selected papers from the ECAI 2016 COREDEMA workshop, as well as completely new contributions.

Conference Object | Publication | Open Access
Bandwidth prediction in low-latency media transport (ACM, 2023-06-16)
Bentaleb, A.; Akçay, Mehmet Necmettin; Lim, M.; Beğen, Ali Cengiz; Zimmermann, R.; Computer Science; BEĞEN, Ali Cengiz; Akçay, Mehmet Necmettin
Designing a robust bandwidth prediction algorithm for low-latency media transport that can quickly adapt to varying network conditions is challenging.
In this paper, we present the working principles of a hybrid bandwidth predictor (termed BoB, Bang-on-Bandwidth) we developed recently for real-time communications and discuss its use with the new Media-over-QUIC (MOQ) protocol proposals.

Conference Object | Publication | Open Access
The benefits of server hinting when DASHing or HLSing (ACM, 2022-03-17)
Lim, M.; Akçay, Mehmet Necmettin; Bentaleb, A.; Beğen, Ali Cengiz; Zimmermann, R.; Computer Science; BEĞEN, Ali Cengiz; Akçay, Mehmet Necmettin
Streaming clients almost always compete for the available bandwidth and server capacity. Not every client's playback buffer conditions will be the same, though, nor should be the priority with which the server processes the individual requests coming from these clients. In an earlier work, we demonstrated that if clients conveyed their buffer statuses to the server using a Common Media Client Data (CMCD) query argument, the server could allocate its output capacity among all the requests more wisely, which could significantly reduce the rebufferings experienced by the clients. In this paper, we address the same problem using the Common Media Server Data (CMSD) standard that is work-in-progress at the Consumer Technology Association (CTA). In this case, the incoming requests are scheduled based on their CMCD information. For example, the response to a request indicating a healthy buffer status is held/delayed until more urgent requests are handled. When the delayed response is eventually transmitted, the server attaches a new CMSD parameter to indicate how long the delay was. This parameter avoids misinterpretations and subsequent miscalculations by the client's rate-adaptation logic. We implemented the server and client understanding/processing CMCD and CMSD, respectively.
Our experiments show that the proposed CMSD parameter effectively eliminates unnecessary downshifting while reducing both the rebuffering rate and duration.

Article | Publication | Open Access
BoB: Bandwidth prediction for real-time communications using heuristic and reinforcement learning (IEEE, 2023)
Bentaleb, A.; Akçay, Mehmet Necmettin; Lim, M.; Beğen, Ali Cengiz; Zimmermann, R.; Computer Science; BEĞEN, Ali Cengiz; Akçay, Mehmet Necmettin
Bandwidth prediction is critical in any Real-time Communication (RTC) service or application. This component decides how much media data can be sent in real time. Subsequently, the video and audio encoder dynamically adapts the bitrate to achieve the best quality without congesting the network and causing packets to be lost or delayed. To date, several RTC services have deployed the heuristic-based Google Congestion Control (GCC), which performs well under certain circumstances and falls short in some others. In this paper, we leverage the advancements in reinforcement learning and propose BoB (Bang-on-Bandwidth), a hybrid bandwidth predictor for RTC. At the beginning of the RTC session, BoB uses a heuristic-based approach. It then switches to a learning-based approach. BoB predicts the available bandwidth accurately and improves bandwidth utilization under diverse network conditions compared to the two winning solutions of the ACM MMSys'21 grand challenge on bandwidth estimation in RTC.
An open-source implementation of BoB is publicly available for further testing and research.

Article | Publication | Open Access
Catching the moment with LoL+ in Twitch-like low-latency live streaming platforms (IEEE, 2022)
Bentaleb, A.; Akçay, Mehmet Necmettin; Lim, M.; Beğen, Ali Cengiz; Zimmermann, R.; Computer Science; BEĞEN, Ali Cengiz; Akçay, Mehmet Necmettin
Our earlier Low-on-Latency (dubbed as LoL) solution offered an accurate bandwidth prediction and rate adaptation algorithm tailored for live streaming applications that targeted an end-to-end latency of up to two seconds. While LoL was a significant step forward in multi-bitrate low-latency live streaming, further experimentation and testing showed that there was room for improvement in three areas. First, LoL used hard-coded parameters computed from an offline training process in the rate adaptation algorithm, and this was seen as a significant barrier to LoL's wide deployment. Second, LoL's objective was to maximize a collective QoE function. Yet, certain use cases have specific objectives besides the singular QoE, and this had to be accommodated. Third, the adaptive playback speed control failed to produce satisfying results in some scenarios. Our goal in this paper is to address these areas and make LoL sufficiently robust to deploy. We refer to the enhanced solution as LoL+, which has been integrated into the official dash.js player in v3.2.0.

Conference Object | Publication | Open Access
Common media client data (CMCD): Initial findings (Association for Computing Machinery, Inc, 2021-07-16)
Bentaleb, A.; Lim, M.; Akçay, Mehmet Necmettin; Beğen, Ali Cengiz; Zimmermann, R.; Computer Science; BEĞEN, Ali Cengiz; Akçay, Mehmet Necmettin
In September 2020, the Consumer Technology Association (CTA) published the CTA-5004: Common Media Client Data (CMCD) specification. Using this specification, a media client can convey certain information to the content delivery network servers with object requests.
This information is useful in log association/analysis, quality of service/experience monitoring and delivery enhancements. This paper is the first step toward investigating the feasibility of CMCD in addressing one of the most common problems in the streaming domain: efficient use of shared bandwidth by multiple clients. To that effect, we implemented CMCD functions on an HTTP server and built a proof-of-concept system with CMCD-Aware dash.js clients. We show that even a basic bandwidth allocation scheme enabled by CMCD reduces rebuffering rate and duration without noticeably sacrificing the video quality.

Conference Object | Publication | Open Access
Common media server data (CMSD) - update on implementations and validation of key use cases (ACM, 2023-06-16)
Pham, S.; Law, W.; Beğen, Ali Cengiz; Silhavy, D.; Berthelot, B.; Arbanowski, S.; Steglich, S.; Computer Science; BEĞEN, Ali Cengiz
The CTA-5006 (Common Media Server Data, CMSD) specification establishes a uniform method for media servers to exchange data with each media object response. The aim is to enhance distribution efficiency, performance, and ultimately, the user experience. We provide an overview of CMSD implementations and focus on integrating CMSD into the dash.js reference player. Three use cases are evaluated to demonstrate the advantages of CMSD, including leveraging edge server throughput estimates to improve initial bitrate selection and low-latency live streaming, prefetching manifests and segments to improve startup delay, and allowing an edge server to suggest a playback bitrate to improve the collective experience.
The outcomes from the initial implementations confirm the benefits of using CMSD.

Article | Publication | Open Access
Conflict-based negotiation strategy for human-agent negotiation (Springer, 2023-12)
Keskin, Mehmet Onur; Buzcu, Berk; Aydoğan, Reyhan; Computer Science; AYDOĞAN, Reyhan; Keskin, Mehmet Onur; Buzcu, Berk
Day by day, human-agent negotiation becomes more and more vital to reach a socially beneficial agreement when stakeholders need to make a joint decision together. Developing agents who understand not only human preferences but also attitudes is a significant prerequisite for this kind of interaction. Studies on opponent modeling are predominantly based on automated negotiation and may yield good predictions after exchanging hundreds of offers. However, this is not the case in human-agent negotiation, in which the total number of rounds does not usually exceed tens. For this reason, an opponent modeling technique is needed to extract the maximum information gained with limited interaction. This study presents a conflict-based opponent modeling technique and compares its prediction performance with the well-known approaches in human-agent and automated negotiation experimental settings. According to the results of human-agent studies, the proposed model outperforms them despite the diversity of participants’ negotiation behaviors. Besides, the conflict-based opponent model estimates the entire bid space much more successfully than its competitors in automated negotiation sessions when a small portion of the outcome space was explored.
This study may contribute to developing agents that can perceive their human counterparts’ preferences and behaviors more accurately, acting cooperatively and reaching an admissible settlement for joint interests.

Review | Publication | Open Access
Deep learning-based expressive speech synthesis: a systematic review of approaches, challenges, and resources (Springer, 2024-02-12)
Barakat, Huda Mohammed Mohammed; Turk, O.; Demiroğlu, Cenk; Electrical & Electronics Engineering; DEMİROĞLU, Cenk; Barakat, Huda Mohammed Mohammed
Speech synthesis has made significant strides thanks to the transition from machine learning to deep learning models. Contemporary text-to-speech (TTS) models possess the capability to generate speech of exceptionally high quality, closely mimicking human speech. Nevertheless, given the wide array of applications now employing TTS models, mere high-quality speech generation is no longer sufficient. Present-day TTS models must also excel at producing expressive speech that can convey various speaking styles and emotions, akin to human speech. Consequently, researchers have concentrated their efforts on developing more efficient models for expressive speech synthesis in recent years. This paper presents a systematic review of the literature on expressive speech synthesis models published within the last 5 years, with a particular emphasis on approaches based on deep learning. We offer a comprehensive classification scheme for these models and provide concise descriptions of models falling into each category. Additionally, we summarize the principal challenges encountered in this research domain and outline the strategies employed to tackle these challenges as documented in the literature. In Section 8, we pinpoint some research gaps in this field that necessitate further exploration.
Our objective with this work is to give an all-encompassing overview of this hot research area to offer guidance to interested researchers and future endeavors in this field.

Conference Object | Publication | Open Access
Dynamic CDN switching - dash-if content steering in dash.js (ACM, 2023-06-16)
Silhavy, D.; Law, W.; Pham, S.; Beğen, Ali Cengiz; Giladi, A.; Balk, A.; Computer Science; BEĞEN, Ali Cengiz
This paper overviews the content steering specification currently being developed in DASH Industry Forum and first implemented in the dash.js reference player.

Article | Publication | Open Access
The effect of appearance of virtual agents in human-agent negotiation (MDPI, 2022-09)
Türkgeldi, Berkay; Özden, Cana Su; Aydoğan, Reyhan; Computer Science; AYDOĞAN, Reyhan; Türkgeldi, Berkay; Özden, Cana Su
Artificial Intelligence (AI) has changed our world in various ways. People have started to interact with a variety of intelligent systems frequently. As the interaction between human and AI systems increases day by day, the factors influencing their communication have become more and more important, especially in the field of human-agent negotiation. In this study, our aim is to investigate the effect of knowing your negotiation partner (i.e., opponent) with limited knowledge, particularly the effect of familiarity with the opponent during human-agent negotiation so that we can design more effective negotiation systems. As far as we are aware, this is the first study investigating this research question in human-agent negotiation settings. Accordingly, we present a human-agent negotiation framework and conduct a user experiment in which participants negotiate with an avatar whose appearance and voice are a replica of a celebrity of their choice and with an avatar whose appearance and voice are not familiar.
The results of the within-subject design experiment show that human participants tend to be more collaborative when their opponent is a celebrity avatar towards whom they have a positive feeling rather than a non-celebrity avatar.

Article | Publication | Open Access
Evaluating the English-Turkish parallel treebank for machine translation (TÜBİTAK, 2022)
Görgün, O.; Yıldız, Olcay Taner; Computer Science; YILDIZ, Olcay Taner
This study extends our initial efforts in building an English-Turkish parallel treebank corpus for statistical machine translation tasks. We manually generated parallel trees for about 17K sentences selected from the Penn Treebank corpus. English sentences vary in length: 15 to 50 tokens including punctuation. We constrained the translation of trees by (i) reordering of leaf nodes based on suffixation rules in Turkish, and (ii) gloss replacement. We aim to mimic a human annotator’s behavior in a real translation task. In order to fill the morphological and syntactic gap between languages, we do morphological annotation and disambiguation. We also apply our heuristics by creating Nokia English-Turkish Treebank (NTB) to address technical document translation tasks. NTB also includes 8.3K sentences in varying lengths. We validate the corpus both extrinsically and intrinsically, and report our evaluation results regarding perplexity analysis and translation task results.
Results prove that our heuristics yield promising results in terms of perplexity and are suitable for translation tasks in terms of BLEU scores.

Article | Publication | Open Access
Evaluation of linguistic and prosodic features for detection of Alzheimer’s disease in Turkish conversational speech (Springer Science+Business Media, 2015-12)
Khodabakhsh, Ali; Yesil, Fatih; Guner, Ekrem; Demiroğlu, Cenk; Electrical & Electronics Engineering; DEMİROĞLU, Cenk; Khodabakhsh, Ali; Yesil, Fatih; Guner, Ekrem
Automatic diagnosis and monitoring of Alzheimer’s disease can have a significant impact on society as well as the well-being of patients. The part of the brain cortex that processes language abilities is one of the earliest parts to be affected by the disease. Therefore, detection of Alzheimer’s disease using speech-based features is gaining increasing attention. Here, we investigated an extensive set of features based on speech prosody as well as linguistic features derived from transcriptions of Turkish conversations with subjects with and without Alzheimer’s disease. Unlike most standardized tests that focus on memory recall or structured conversations, spontaneous unstructured conversations are conducted with the subjects in informal settings. Age-, education-, and gender-controlled experiments are performed to eliminate the effects of those three variables. Experimental results show that the proposed features extracted from the speech signal can be used to discriminate between the control group and the patients with Alzheimer’s disease. Prosodic features performed significantly better than the linguistic features.
Classification accuracy over 80% was obtained with three of the prosodic features, but experiments with feature fusion did not further improve the classification performance.

Article | Publication | Open Access
Hybrid statistical/unit-selection Turkish speech synthesis using suffix units (Springer International Publishing, 2016-12)
Demiroğlu, Cenk; Güner, Ekrem; Electrical & Electronics Engineering; DEMİROĞLU, Cenk; Güner, Ekrem
Unit selection based text-to-speech synthesis (TTS) has been the dominant TTS approach of the last decade. Despite its success, the unit selection approach has its disadvantages. One of the most significant disadvantages is the sudden discontinuities in speech that distract the listeners (Speech Commun 51:1039-1064, 2009). The second disadvantage is that significant expertise and large amounts of data are needed for building a high-quality synthesis system, which is costly and time-consuming. The statistical speech synthesis (SSS) approach is a promising alternative synthesis technique. Not only are the spurious errors that are observed in the unit selection system mostly absent in SSS, but building voice models is also far less expensive and faster compared to the unit selection system. However, the resulting speech is typically not as natural-sounding as speech that is synthesized with a high-quality unit selection system. There are hybrid methods that attempt to take advantage of both SSS and unit selection systems. However, existing hybrid methods still require development of a high-quality unit selection system. Here, we propose a novel hybrid statistical/unit selection system for Turkish that aims at improving the quality of the baseline SSS system by improving the prosodic parameters such as intonation and stress. Commonly occurring suffixes in Turkish are stored in the unit selection database and used in the proposed system.
As opposed to existing hybrid systems, the proposed system was developed without building a complete unit selection synthesis system. Therefore, the proposed method can be used without collecting large amounts of data or utilizing substantial expertise or time-consuming tuning that is typically required in building unit selection systems. Listeners preferred the hybrid system over the baseline system in the AB preference tests.

Article | Publication | Unknown
Machine learning to predict junction temperature based on optical characteristics in solid-state lighting devices: A test on WLEDs (MDPI, 2022-08)
Azarifar, Mohammad; Ocaksönmez, Kerem; Cengiz, Ceren; Aydoğan, Reyhan; Arık, Mehmet; Computer Science; Mechanical Engineering; AYDOĞAN, Reyhan; ARIK, Mehmet; Azarifar, Mohammad; Cengiz, Ceren
While junction temperature control is an indispensable part of having reliable solid-state lighting, there is no direct method to measure its quantity. Among various methods, temperature-sensitive optical parameter-based junction temperature measurement techniques have been used in practice. Researchers calibrate different spectral power distribution behaviors to a specific temperature and then use that to predict the junction temperature. White light in white LEDs is composed of blue chip emission and down-converted emission from photoluminescent particles, each with its own behavior at different temperatures. These two emissions can be combined in an unlimited number of ways to produce diverse white colors at different brightness levels. The shape of the spectral power distribution can, in essence, be compressed into a correlated color temperature (CCT). The intensity level of the spectral power distribution can be inferred from the luminous flux as it is the special weighted integration of the spectral power distribution.
This paper demonstrates that knowing the color characteristics and power level provides enough information for possible regressor trainings to predict any white LED junction temperature. A database from manufacturer datasheets is utilized to develop four machine learning-based models, viz., k-Nearest Neighbor (KNN), Radius Near Neighbors (RNN), Random Forest (RF), and Extreme Gradient Booster (XGB). The models were used to predict the junction temperatures from a set of dynamic opto-thermal measurements. This study shows that machine learning algorithms can be employed as reliable novel prediction tools for junction temperature estimation, particularly where measuring equipment limitations exist, as in wafer-level probing or phosphor-coated chips.

Article | Publication | Unknown
Towards interactive explanation-based nutrition virtual coaching systems (Springer, 2024-01)
Buzcu, Berk; Tessa, M.; Tchappi, I.; Najjar, A.; Hulstijn, J.; Calvaresi, D.; Aydoğan, Reyhan; Computer Science; AYDOĞAN, Reyhan; Buzcu, Berk
The awareness about healthy lifestyles is increasing, opening the way to personalized intelligent health coaching applications. A demand for more than mere suggestions and mechanistic interactions has driven attention to nutrition virtual coaching systems (NVC) as a bridge between human–machine interaction and recommender, informative, persuasive, and argumentation systems. NVC can rely on data-driven opaque mechanisms. Therefore, it is crucial to enable NVC to explain their actions (i.e., engaging the user in discussions (via arguments) about dietary solutions/alternatives). By doing so, transparency, user acceptance, and engagement are expected to be boosted. This study focuses on NVC agents generating personalized food recommendations based on user-specific factors such as allergies, eating habits, lifestyles, and ingredient preferences.
In particular, we propose a user-agent negotiation process entailing run-time feedback mechanisms to react to both recommendations and related explanations. Lastly, the study presents the findings obtained by the experiments conducted with multi-background participants to evaluate the acceptability and effectiveness of the proposed system. The results indicate that most participants value the opportunity to provide feedback and receive explanations for recommendations. Additionally, the users are fond of receiving information tailored to their needs. Furthermore, our interactive recommendation system performed better than the corresponding traditional recommendation system in terms of effectiveness regarding the number of agreements and rounds.

Article | Publication | Unknown
Trust in robot–robot scaffolding (IEEE, 2023-12-01)
Kırtay, M.; Hafner, V. V. V.; Asada, Minoru; Öztop, Erhan; Computer Science; ÖZTOP, Erhan
The study of robot trust in humans and other agents has not been explored widely despite its importance for the near-future human-robot symbiotic societies. Here, we propose that robots should trust partners that tend to reduce their computational load, which is analogous to human cognitive load. We test this idea by adopting an interactive visual recalling task. In the first set of experiments, the robot can get help from online instructors with different guiding strategies to decide which one it should trust based on the computational load it experiences during the experiments. The second set of experiments involves robot-robot interactions. Akin to the robot-online instructor case, the Pepper robot is asked to scaffold the learning of a less capable 'infant' robot (Nao) with or without being equipped with the cognitive abilities of theory of mind and task experience memory to assess the contribution of these cognitive abilities to scaffolding performance.
Overall, the results show that robot trust based on computational/cognitive load within a sequential decision-making framework leads to effective partner selection and robot-robot scaffolding. Thus, using the computational load incurred by the cognitive processing of a robot may serve as an internal signal for assessing the trustworthiness of interaction partners.