Computer Science
Permanent URI for this collection: https://hdl.handle.net/10679/43
Browsing by Issue Date
Conference Object | Publication | Metadata only
Improving automatic emotion recognition from speech signals (International Speech Communication Association, 2009)
Bozkurt, E.; Erzin, E.; Eroğlu Erdem, Ç.; Erdem, Tanju; Computer Science; ERDEM, Arif Tanju
We present a speech signal driven emotion recognition system. Our system is trained and tested with the INTERSPEECH 2009 Emotion Challenge corpus, which includes spontaneous and emotionally rich recordings. The challenge includes classifier and feature sub-challenges with five-class and two-class classification problems. We investigate prosody-related, spectral and HMM-based features for the evaluation of emotion recognition with Gaussian mixture model (GMM) based classifiers. Spectral features consist of mel-scale cepstral coefficients (MFCC), line spectral frequency (LSF) features and their derivatives, whereas prosody-related features consist of mean normalized values of pitch, first derivative of pitch and intensity. Unsupervised training of HMM structures is employed to define prosody-related temporal features for the emotion recognition problem. We also investigate data fusion of different features and decision fusion of different classifiers, which are not well studied in the emotion recognition framework. Experimental results of automatic emotion recognition with the INTERSPEECH 2009 Emotion Challenge corpus are presented.

Conference Object | Publication | Metadata only
Gauss karışım modeli tabanlı konuşmacı belirleme sistemlerinde klasik MAP uyarlanması yönteminin performans analizi [Performance analysis of the classical MAP adaptation method in Gaussian mixture model based speaker identification systems] (IEEE, 2010)
Erdoğan, A.; Demiroğlu, Cenk; Electrical & Electronics Engineering; DEMİROĞLU, Cenk
The Gaussian mixture model (GMM) is one of the most commonly used methods in text-independent speaker identification systems. In this paper, the performance of the GMM approach has been measured with different parameters and settings. The voice activity detection (VAD) component has been found to have a significant impact on the performance.
Therefore, VAD algorithms that are robust to background noise have been proposed. Significant differences in performance have been observed between male and female speakers and between GSM and PSTN channels. Moreover, the single-stream GMM approach has been found to perform significantly better than the multi-stream GMM approach. It has been observed under all conditions that data duration is critical for good performance.

Conference Object | Publication | Metadata only
E-Cube: multi-dimensional event sequence processing using concept and pattern hierarchies (IEEE, 2010)
Liu, M.; Rundensteiner, E. A.; Greenfield, K.; Gupta, C.; Wang, S.; Arı, İsmail; Mehta, A.; Computer Science; ARI, Ismail
Many modern applications, including tag-based mass transit systems, RFID-based supply chain management systems and online financial feeds, require special-purpose event stream processing technology to analyze vast amounts of sequential multi-dimensional data available in real-time data feeds. Traditional online analytical processing (OLAP) systems are not designed for real-time pattern-based operations, while Complex Event Processing (CEP) systems are designed for sequence detection and do not support OLAP operations. We will demonstrate a novel E-Cube model that combines CEP and OLAP techniques for multi-dimensional event pattern analysis at different abstraction levels. A London transit scenario will be given to demonstrate the utility and performance of this proposed technology.

Conference Object | Publication | Open Access
Use of line spectral frequencies for emotion recognition from speech (IEEE, 2010)
Bozkurt, E.; Erzin, E.; Eroğlu Erdem, Ç.; Erdem, Tanju; Computer Science; ERDEM, Arif Tanju
We propose the use of line spectral frequency (LSF) features for emotion recognition from speech; to the best of our knowledge, these features have not previously been employed for emotion recognition.
Spectral features such as mel-scaled cepstral coefficients have already been successfully used for the parameterization of speech signals for emotion recognition. The LSF features also offer a spectral representation of speech; moreover, they carry intrinsic information on the formant structure, which is related to the emotional state of the speaker. We use the Gaussian mixture model (GMM) classifier architecture, which captures the static color of the spectral features. Experimental studies performed over the Berlin Emotional Speech Database and the FAU Aibo Emotion Corpus demonstrate that decision fusion configurations with LSF features bring a consistent improvement over the MFCC-based emotion classification rates.

Conference Object | Publication | Metadata only
RANSAC-based training data selection for emotion recognition from spontaneous speech (ACM, 2010)
Eroğlu Erdem, Ç.; Bozkurt, E.; Erzin, E.; Erdem, Tanju; Computer Science; ERDEM, Arif Tanju
Training datasets containing spontaneous emotional expressions are often imperfect due to the ambiguities and difficulties of labeling such data by human observers. In this paper, we present a Random Sampling Consensus (RANSAC) based training approach for the problem of emotion recognition from spontaneous speech recordings. Our motivation is to insert a data cleaning process into the training phase of the Hidden Markov Models (HMMs) for the purpose of removing some suspicious instances of labels that may exist in the training dataset. Our experiments using HMMs with various numbers of states and Gaussian mixtures per state indicate that utilization of RANSAC in the training phase provides an improvement of up to 2.84% in the unweighted recall rates on the test set.
This improvement in the accuracy of the classifier is shown to be statistically significant using McNemar's test.

Conference Object | Publication | Metadata only
INTERSPEECH 2009 duygu tanıma yarışması değerlendirmesi [Evaluation of the INTERSPEECH 2009 Emotion Challenge] (IEEE, 2010)
Bozkurt, E.; Erzin, E.; Eroğlu Erdem, Ç.; Erdem, Tanju; Computer Science; ERDEM, Arif Tanju
In this paper, we evaluate the results of the INTERSPEECH 2009 Emotion Challenge. The problem posed by the challenge is to classify the natural and emotionally rich FAU Aibo speech recordings into five and two emotion classes as accurately as possible. To solve this problem, we investigate prosody-related, spectral and HMM-based (hidden Markov model) features with Gaussian mixture model (GMM) classifiers. The spectral features consist of mel-frequency cepstral coefficients (MFCC), line spectral frequency (LSF) coefficients and their derivatives, while the prosody features consist of pitch, the first derivative of pitch and energy. We obtain the HMM features, which describe the temporal variation of the prosody-related features, with HMM structures trained in an unsupervised manner. In addition, to improve the results of emotion recognition from speech, we investigate data fusion of different features and decision fusion of different classifiers. Our two-stage decision fusion method achieved recognition rates of 41.59% and 67.90% for the five-class and two-class problems, respectively, ranking 2nd and 4th among all challenge results.

Conference Object | Publication | Metadata only
Authoring and presentation tools for distance learning over interactive TV (ACM, 2010)
Gürel, T. C.; Erdem, Tanju; Kermen, A.; Özkan, M. K.; Eroğlu Erdem, Ç.; Computer Science; ERDEM, Arif Tanju
We present a complete system for distance learning over interactive TV with novel tools for authoring and presentation of lectures and exams, and evaluation of student and system performance.
The main technological contributions of the paper include the development of plug-in software so that PowerPoint can be used to prepare presentations for the set-top box, a software tool to convert PDF documents containing multiple-choice questions into interactive exams, and a virtual teacher whose facial animation is automatically generated from speech.

Conference Object | Publication | Open Access
Processing nested complex sequence pattern queries over event streams (ACM, 2010)
Liu, M.; Ray, M.; Rundensteiner, E. A.; Dougherty, D. J.; Gupta, C.; Wang, S.; Arı, İsmail; Mehta, A.; Computer Science; ARI, Ismail
Complex event processing (CEP) has become increasingly important for tracking and monitoring applications ranging from healthcare and supply chain management to surveillance. These monitoring applications submit complex event queries to track sequences of events that match a given pattern. As these systems mature, the need for increasingly complex nested sequence queries arises, while state-of-the-art CEP systems mostly focus on the execution of flat sequence queries only. In this paper, we introduce an iterative execution strategy for nested CEP queries composed of sequence, negation, AND and OR operators. Lastly, we examine the promise of applying selective caching of intermediate results to optimize the execution. Our experimental study using real-world stock trades evaluates the performance of our proposed iterative execution strategy for different query types.

Technical report | Publication | Open Access
Relating Staged Computation to the Record Calculus (Özyeğin University, 2010-09-06)
Aktemur, Tankut Barış; Choi, W.; Computer Science; AKTEMUR, Tankut Bariş
It has been previously shown that there is a close relation between record calculus and program generation (e.g.
Lisp-like quasiquotations): a translation has been defined to convert staged expressions to record calculus expressions, and it has been shown that the call-by-value semantics of the staged and the record calculi are equivalent modulo the translation and admin reductions. In this work, we investigate the relation further. The contributions are twofold: (1) We fine-tune the previously shown relation between the two operational semantics and obtain more precise results. In particular, we show that only two kinds of admin reductions suffice, and these reductions can be applied exhaustively. (2) We define a reverse translation that converts record calculus expressions back to the staged calculus, allowing us to go back and forth between the two calculi. We believe that these results provide an important step towards reusing already-existing record calculus static analyses to reason about staged expressions.

Book Part | Publication | Metadata only
Runtime verification of component-based embedded software (Springer, 2011)
Sözer, Hasan; Hofmann, C.; Tekinerdoğan, B.; Akşit, M.; Computer Science; SÖZER, Hasan
To deal with increasing size and complexity, component-based software development has been employed in embedded systems. Due to several faults, components can make wrong assumptions about the working mode of the system and the working modes of the other components. To detect mode inconsistencies at runtime, we propose a "lightweight" error detection mechanism, which can be integrated with component-based embedded systems. We define links among three levels of abstraction: the runtime behavior of components, the working mode specifications of components and the specification of the working modes of the system. This allows us to detect user-observable runtime errors.
The effectiveness of the approach is demonstrated by implementing a software monitor integrated into a TV system.

Conference Object | Publication | Metadata only
Towards subtyped program generation in F# (ACM, 2011)
Aktemur, Tankut Barış; Computer Science; AKTEMUR, Tankut Bariş
Program generation is the technique of combining code fragments to construct a program. In this work, we report on our progress in extending F# with program generation constructs. Our prototype implementation uses a translation that allows simulating program generators by regular programs. The translation enables fast implementation and experimentation. We state how a further extension with subtyping can be integrated by benefiting from the translation.

Conference Object | Publication | Metadata only
Static analysis of multi-staged programs via unstaging translation (ACM, 2011)
Choi, W.; Aktemur, Tankut Barış; Yi, K.; Tatsuta, M.; Computer Science; AKTEMUR, Tankut Bariş
Static analysis of multi-staged programs is challenging because the basic assumption of conventional static analysis no longer holds: the program text itself is no longer a fixed static entity, but rather a dynamically constructed value. This article presents a semantic-preserving translation of multi-staged call-by-value programs into unstaged programs and a static analysis framework based on this translation. The translation is semantic-preserving in that every small-step reduction of a multi-staged program is simulated by the evaluation of its unstaged version. Thanks to this translation, we can analyze multi-staged programs with existing static analysis techniques that have been developed for conventional unstaged programs: we first apply the unstaging translation, then we apply conventional static analysis to the unstaged version, and finally we cast the analysis results back in terms of the original staged program.
Our translation handles staging constructs that have been evolved to be useful in practice (typified in Lisp's quasi-quotation): open code as values, unrestricted operations on references and intentional variable-capturing substitutions. This article omits references, for which we refer the reader to our companion technical report.

Conference Object | Publication | Metadata only
ÖZÜ konuşmacı doğrulama sisteminin çok sınıflı senaryoda NIST 2010 veritabanı ile başarımı [Performance of the ÖZÜ speaker verification system in a multi-class scenario with the NIST 2010 database] (IEEE, 2011)
Yeşil, Fatih; Demiroğlu, Cenk; Electrical & Electronics Engineering; DEMİROĞLU, Cenk; Yeşil, Fatih
The performance of speaker verification systems is typically measured based on their binary decision accuracy. However, in speaker verification applications where close to 100% accuracy is required, such as the systems used in the call centers of finance companies, it is not possible to rely on the binary decisions of the existing verification systems. Still, in such cases, multi-class verification outputs (for example, high, medium and low verification scores) returned by the speaker verification systems can be used by a human agent to reduce the verification time and/or increase the verification accuracy compared to a human-only scenario. In this work, we compare the multi-class output performance of some of the most popular speaker verification systems when a human agent is assumed to be in the verification loop. Performance is measured by the reduction in the number of questions used by the human agent for verifying the identity of the caller without compromising security.
Experiments are performed using the NIST 2010 database under the condition of 8 conversation sides (5 minutes each) of enrollment data and 10 seconds of verification data.

Conference Object | Publication | Metadata only
Konuşmacı aradeğerlemeli SMM tabanlı metinden konuşma sentezleme sistemi [HMM-based text-to-speech synthesis system with speaker interpolation] (IEEE, 2011)
Orhan, Mustafa Cem; Demiroğlu, Cenk; Electrical & Electronics Engineering; DEMİROĞLU, Cenk; Orhan, Mustafa Cem
Hidden Markov Model (HMM) based text-to-speech (TTS) systems offer many advantages compared to the concatenative approach. One of those advantages is the ability to interpolate between different speakers to generate new voices. In this paper, speaker interpolation for HMM-based TTS (HTS) is described, and listening test results for the interpolation of English and Turkish voices are presented. As with English, we obtained Turkish speech that strongly reflects the interpolation ratio in perceptual similarity. Some insight into the interpolation process is also provided by analysing the spectra of the reference and final voices.

Conference Object | Publication | Metadata only
NEEL: The nested complex event language for real-time event analytics (Springer International Publishing, 2011)
Liu, M.; Rundensteiner, E. A.; Dougherty, D.; Gupta, C.; Wang, S.; Arı, İsmail; Mehta, A.; Computer Science; ARI, Ismail
Complex event processing (CEP) over event streams has become increasingly important for real-time applications ranging from health care and supply chain management to business intelligence. These monitoring applications submit complex event queries to track sequences of events that match a given pattern. As these systems mature, the need for increasingly complex nested sequence query support arises, while state-of-the-art CEP systems mostly support the execution of only flat sequence queries. In this paper, we introduce our nested CEP query language NEEL for expressing nested queries composed of sequence, negation, AND and OR operators. Thereafter, we also define its formal semantics.
Subtle issues with negation and predicates within the nested sequence context are discussed. An E-Analytics system for processing nested CEP queries expressed in the NEEL language has been developed. Lastly, we demonstrate the utility of this technology by describing a case study of applying it to a real-world application in health care.

Conference Object | Publication | Metadata only
Defining architectural viewpoints for quality concerns (Springer Science+Business Media, 2011)
Tekinerdogan, B.; Sözer, Hasan; Computer Science; SÖZER, Hasan
A common practice in software architecture design is to apply architectural views to model the design decisions for the various stakeholder concerns. When dealing with quality concerns, however, it is more difficult to address these explicitly in the architectural views. This is because quality concerns do not easily match the architectural elements that seem to be primarily functional in nature. As a result, the communication and analysis of these quality concerns become more problematic in practice. We introduce a general and practical approach for supporting architects to model quality concerns by extending the architectural viewpoints of the so-called V&B approach. We illustrate the approach for defining recoverability and adaptability viewpoints for an open source software architecture.

Conference Object | Publication | Metadata only
Gauss karışım modeli tabanlı konuşmacı doğrulama sistemlerinde kişiye ve kanala uyarlanmada klasik MAP tabanlı yöntemlerin performans analizi [Performance analysis of classical MAP-based methods for speaker and channel adaptation in Gaussian mixture model based speaker verification systems] (IEEE, 2011)
Koşunda, Serol; Yeşil, Fatih; Ayazoğlu, Yaprak; Demiroğlu, Cenk; Electrical & Electronics Engineering; DEMİROĞLU, Cenk; Koşunda, Serol; Yeşil, Fatih; Ayazoğlu, Yaprak
In this paper, we report the performance of Gaussian mixture model (GMM) based algorithms implemented in the Speech Processing Laboratory at Ozyegin University on the NIST SRE2004 and 2006 databases.
The Gaussian mixture model (GMM) is one of the most commonly used methods in text-independent speaker verification systems. In this paper, the performance of the GMM approach has been measured with different parameters and settings. It has also been observed that the eigenchannel-MAP and JFA methods both increase the performance of the system against session variability, which is one of the most challenging problems in text-independent speaker verification systems.

Editorial | Publication | Metadata only
Preface (2011)
Kim, S.; Uchitel, S.; Garbervetsky, D.; Aktemur, Tankut Barış; Kroening, D.; Orso, A.; Nagappan, N.; Xie, T.; Mueller, P.; Cataldo, M.; Tillmann, N.; Margaria-Steffen, T.; Tonetta, S.; Bradley, A.; Chen, N.; Caso, G. de; Ferrara, P.; He, N.; Kassios, I.; Kicillof, N.; Lewis, M.; Meyer, D.; Nagel, R.; Nimal, V.; Pandita, R.; Pavese, E.; Rajan, A.; Roveri, M.; Sawadsky, N.; Schapachnik, F.; Seo, H.; Shakya, K.; Song, Y.; Summers, A.; Xiao, X.; Yilmaz, Buse; Zhang, L.; Bishop, J.; Breitman, K.; Notkin, D.; Computer Science; AKTEMUR, Tankut Bariş; Yilmaz, Buse

Conference Object | Publication | Metadata only
FPGA bitstream protection with PUFs, obfuscation, and multi-boot (IEEE, 2011)
Gören, S.; Özkurt, Ö.; Yıldız, Abdullah; Uğurdağ, Hasan Fatih; Electrical & Electronics Engineering; UĞURDAĞ, Hasan Fatih; Yıldız, Abdullah
With the combination of PUFs, obfuscation, and multi-boot, we are able to do the equivalent of partial bitstream encryption on low-cost FPGAs, a capability that is otherwise featured only on high-end FPGAs. Low-cost FPGAs do not even have built-in support for encrypted (full) bitstreams. With the help of multi-boot, our particular PUF implementation does not steal valuable FPGA real estate from the actual design.
We favor multi-boot over self partial reconfiguration as it is easier to implement.

Book Part | Publication | Metadata only
Guiding architects in selecting architectural evolution alternatives (Springer Science+Business Media, 2011)
Ciraci, S.; Sözer, Hasan; Aksit, M.; Computer Science; SÖZER, Hasan
Although there exist methods and tools to support architecture evolution, the derivation and evaluation of alternative evolution paths are realized manually. In this paper, we introduce an approach in which the architecture specification is converted to a graph representation. Based on this representation, we automatically generate possible evolution paths, evaluate quality attributes for different architectural configurations, and optimize the selection of a particular path accordingly. We illustrate our approach by modeling the software architecture evolution of a crisis management system.
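
Illustrative note on the last abstract above: the idea of generating evolution paths over a graph of architectural configurations and selecting one by quality can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the authors' method or tooling: the configuration names, the `GRAPH` and `QUALITY` tables, and the average-quality scoring rule are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, List

# Hypothetical evolution graph (an assumption, not taken from the paper):
# each key is an architectural configuration, each listed value is a
# configuration reachable from it in one evolution step.
GRAPH: Dict[str, List[str]] = {
    "monolith": ["layered", "client_server"],
    "layered": ["services"],
    "client_server": ["services"],
    "services": [],
}

# Hypothetical quality score per configuration (higher is better).
QUALITY: Dict[str, float] = {
    "monolith": 0.2,
    "layered": 0.6,
    "client_server": 0.4,
    "services": 0.9,
}


def evolution_paths(graph: Dict[str, List[str]], start: str, goal: str) -> List[List[str]]:
    """Enumerate all cycle-free paths from start to goal by depth-first search."""
    paths: List[List[str]] = []
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            paths.append(path)
            continue
        for nxt in graph.get(node, []):
            if nxt not in path:  # skip configurations already on this path
                stack.append((nxt, path + [nxt]))
    return paths


def path_score(path: List[str], quality: Dict[str, float]) -> float:
    """Average quality of the configurations visited along a path (assumed rule)."""
    return sum(quality[c] for c in path) / len(path)


def best_path(graph: Dict[str, List[str]], quality: Dict[str, float],
              start: str, goal: str) -> List[str]:
    """Pick the candidate path with the highest average quality."""
    candidates = evolution_paths(graph, start, goal)
    if not candidates:
        raise ValueError("no evolution path from start to goal")
    return max(candidates, key=lambda p: path_score(p, quality))


if __name__ == "__main__":
    print(best_path(GRAPH, QUALITY, "monolith", "services"))
```

In this toy setting the selection is an exhaustive search over all simple paths, which is only practical for small graphs; a real evaluation would presumably use richer quality models and a more scalable optimization step.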