Person:
UĞURDAĞ, Hasan Fatih

Name

First Name

Hasan Fatih

Last Name

UĞURDAĞ

Publication Search Results

Now showing 1 - 10 of 60
  • Article
    Defect-aware nanocrossbar logic mapping through matrix canonization using two-dimensional radix sort
    (ACM, 2011-08) Gören, S.; Uğurdağ, Hasan Fatih; Palaz, O.; Electrical & Electronics Engineering; UĞURDAĞ, Hasan Fatih
    Nanocrossbars (i.e., nanowire crossbars) offer extreme logic densities but come with very high defect rates (stuck-open/closed faults, broken nanowires). Achieving reasonable yield and utilization requires logic mapping that is defect-aware even at the crosspoint level. Such logic mapping works with a defect map for each manufactured chip. The problem can be expressed as the matching of two bipartite graphs: one for the logic to be implemented and the other for the nanocrossbar. This article shows that the problem becomes a Bipartite SubGraph Isomorphism (BSGI) problem within sub-nanocrossbars free of stuck-closed faults. Our heuristic KNS-2DS is an iterative rough canonizer with approximately O(N^2) complexity followed by an O(N^3) matching algorithm. Canonization brings a partial or full order to graph nodes. It is normally used for solving the regular Graph Isomorphism (GI) problem, while we apply it to BSGI. KNS stands for K-Neighbor Sort and is used for initializing our main contribution, 2-Dimensional-Sort (2DS). 2DS operates on the adjacency matrix of a bipartite graph. Radix-2 2DS solves the problem in the absence of stuck-closed faults. With the addition of Radix-3 and our novel Radix-2.5 sort, we solve problems that also have stuck-closed faults. We offer very short runtimes (due to canonization) compared to previous work and have success on all benchmarks. KNS-2DS is also novel from the perspective of the BSGI problem, as it is based on canonization rather than on a search tree with backtracking.
  • Article
    Fast two-pick n2n round-robin arbiter circuit
    (IEEE, 2012-06) Uğurdağ, Hasan Fatih; Temizkan, Fatih; Baskirt, O.; Yuce, B.; Electrical & Electronics Engineering; UĞURDAĞ, Hasan Fatih; Temizkan, Fatih
    A regular (one-pick) round-robin arbiter circuit picks one active requester (if any) out of n requesters. A two-pick round-robin arbiter selects up to two requesters. An n2n two-pick round-robin arbiter indicates the picked requests with an (at most) two-hot n-bit output. A round-robin arbiter is fair to its requesters and achieves this by repeatedly moving its highest-priority pointer to the position immediately next to the second requester picked. Presented is the circuit architecture and VLSI implementation of a new scalable two-pick round-robin arbiter with low latency, which is compared with previous work based on logic synthesis results.
  • Conference Object
    A new method for no-reference image blockiness measurement (Referanssız görüntü bloklanma ölçümü için yeni bir yöntem)
    (IEEE, 2014) Ozansoy, Koray; Özer, N.; Dönmez, F.; Uğurdağ, Hasan Fatih; Electrical & Electronics Engineering; UĞURDAĞ, Hasan Fatih; Ozansoy, Koray
    Today, with video traffic peaking on the Internet and in service provider networks, the benefits of automatic image quality measurement are obvious. In many applications these measurements must be made in real time, which calls for "no-reference" measurement, i.e., measurement that does not use the uncompressed (raw) images. The majority of the world's video streams are now digital. Digital video streams use compressed video transmission, and most of the compression methods in use are DCT-based. In such streams, image quality degradation usually results from reducing the bit rate too far, and because the DCT algorithm is block-based, the quality loss shows up as "blockiness". In this work we present a blockiness measurement method, named RED, that gives results closer to human perception than the methods in the literature. One of RED's most important contributions is that it establishes an analytical relationship between the blockiness values it computes automatically and the scores given by human testers. RED optimizes the parameters of this relationship through regression. Also unlike the literature, RED uses edge detection to exclude some blocks from the computation before calculating blockiness.
  • Conference Object
    Hardware implementation of field oriented control for three phase machine drives
    (IEEE, 2020-10-05) Tüfekçi, B.; Önal, B.; Önal, H.; Uğurdağ, Hasan Fatih; Electrical & Electronics Engineering; UĞURDAĞ, Hasan Fatih
    This paper presents a high switching frequency FPGA implementation of Maximum Torque Per Ampere (MTPA) and Flux Weakening, which are branches of the Field Oriented Control (FOC) method for 3-phase machine drives. A common architecture has been constructed for both BrushLess DC motors (BLDC) and Permanent Magnet Synchronous Motors (PMSM). For this purpose, the controller module was implemented using the Space Vector Modulation (SVM) technique. The user interface module was designed to provide real-time torque-time, speed-time, and current-time plots for the user. This interface runs on the PS part of the FPGA and interacts with the user through a UART. The entire system has been verified through simulation.
  • Article
    Efficient combinational circuits for division by small integer constants
    (IEEE, 2016) Uğurdağ, Hasan Fatih; Bayram, A.; Levent, Vecdi Levent; Gören, S.; Electrical & Electronics Engineering; UĞURDAĞ, Hasan Fatih; Levent, Vecdi Levent
    Division of an integer by an integer constant is a widely used operation and hence justifies a customized efficient implementation. There are various versions of this operation. This paper attacks a particular version of this problem, where the divisor is small and the circuit outputs a quotient and remainder. We propose a fast (low-latency) yet area-efficient combinational circuit topology, which we call Binary Tree based Constant Division (BTCD). BTCD uses a collection of small LUTs wired to each other to form a binary tree. The circuit also has a number of adders, whose latencies are almost hidden as they operate in parallel with the binary tree. We wrote RTL code generators for BTCD and two previous works in the literature, then generated circuits for dividends of up to 128 bits and divisors of 3, 5, 11, and 23. We synthesized the generated RTL designs using a commercial ASIC synthesis tool. BTCD strikes a good balance between timing (latency) and area. It is up to 3.3 times better in Area-Timing Product (ATP) compared to the best alternative. ATP has a good correlation with energy consumption.
  • Conference Object
    FPGA-based minimal Latency HEFT scheduler for heterogeneous computing
    (IEEE, 2021) Aliyev, Ilkin; Mack, J.; Kumbhare, N.; Akoglu, A.; Uğurdağ, Hasan Fatih; Electrical & Electronics Engineering; UĞURDAĞ, Hasan Fatih; Aliyev, Ilkin
    This paper proposes a new hardware scheduler. As heterogeneous computing becomes prevalent, mapping applications onto multiple processing elements (PEs) proves to be nontrivial. The Heterogeneous Earliest Finish Time (HEFT) algorithm is an existing scheduler that aims to minimize the total execution time of an application. HEFT accepts an acyclic task graph as input at run-time and assigns/schedules the precompiled atomic tasks to PEs. HEFT stands out among many such schedulers not only in terms of producing shorter schedules but also in terms of its own short execution time. However, in real-time applications, the lower the latency, the better. To the best of our knowledge, this work is the only one that implements HEFT in hardware (on FPGA), further lowering its latency from milliseconds to, in the best case, less than a microsecond. Porting HEFT to hardware has been challenging as data dependencies limit the amount of parallelism. Design of an efficient memory access pattern as well as an “incremental sorter” were key enablers in reducing the latency of the hardware implementation. We also integrated our FPGA-HEFT into an ARM-based SoC and validated its functionality using a realistic workload.
  • Article
    Fast multiplier generator for FPGAs with LUT based partial product generation and column/row compression
    (Elsevier, 2017) Kakacak, Ahmet; Guzel, Aydın Emre; Cihangir, Ozan; Gören, S.; Uğurdağ, Hasan Fatih; Electrical & Electronics Engineering; UĞURDAĞ, Hasan Fatih; Kakacak, Ahmet; Guzel, Aydın Emre; Cihangir, Ozan
    We present a new parallel integer multiplier generator for FPGAs. It combines (i) a new Generalized Parallel Counter (GPC) grouping algorithm for column compression with (ii) a LUT based partial product generation, is (iii) unique as it automatically generates placement pragmas, (iv) uses a ternary adder as a final adder to exploit FPGA's internal carry-chains, and (v) employs a novel GPC based row compression, which aims to reduce the width of the final adder. We wrote Verilog generators for our method as well as one leading work in the literature. For synthesis, we wrote a script that can do “binary search” for the optimum latency. Our extensive implementation results on Xilinx Virtex-6 FPGAs show that we almost always produce circuits with smaller latency (i.e., timing) and Area-Timing Product (ATP) compared to the state-of-the-art in the literature, by 18% and 12% (on the average), respectively. We also offer smaller latency compared to the HDL * operator by 9% on the average at a cost of 12% larger ATP on the average. We are worse in latency in 6 cases out of 33, in all of which synthesis maps * to DSP slices. We also include area and energy results on Virtex-6 as well as a limited amount of latency, area, and ATP results on Virtex-5 and Altera Stratix III.
  • Editorial
    Welcome note from the general chairs
    (IEEE, 2017-12-13) Elfadel, I. A. M.; Uğurdağ, Hasan Fatih; Electrical & Electronics Engineering; UĞURDAĞ, Hasan Fatih
    The following topics are dealt with: low-power electronics; system-on-chip; integrated circuit design; CMOS integrated circuits; microprocessor chips; SRAM chips; logic design; flip-flops; power aware computing; MOSFET circuits.
  • Conference Object
    FPGA implementation of a low latency and high SFDR direct digital synthesizer for resource-efficient quantum-enhanced communication
    (IEEE, 2020-09) Annafianto, Nur Fajar Rizqi; Jabir, M. V.; Burenkov, I. A.; Uğurdağ, Hasan Fatih; Battou, A.; Polyakov, S. V.; Electrical & Electronics Engineering; UĞURDAĞ, Hasan Fatih; Annafianto, Nur Fajar Rizqi
    A Direct Digital Synthesizer (DDS) generates a sinusoidal signal, which is a significant component of many communication systems using modulation schemes. A CORDIC algorithm offers low latency and lower memory requirements than look-up-based methods. The latency depends on the number of iterations, which is determined by the number of angles in the rotation set. However, it is necessary to maintain high spectral purity to optimize the overall system performance. To make the most of the quantum measurement, a sine wave generator with low latency and high spectral purity is essential. The implementation of this design generates output with a 64% latency reduction compared to that of the conventional CORDIC design and an SFDR value of 72.2 dB.
  • Conference Object
    Software defined VLC system: implementation and performance evaluation
    (IEEE, 2015) Hussain, Waqas; Uğurdağ, Hasan Fatih; Uysal, Murat; Electrical & Electronics Engineering; UĞURDAĞ, Hasan Fatih; UYSAL, Murat; Hussain, Waqas
    This paper presents the implementation of an IEEE standard-based Visible Light Communication (VLC) system using a software defined radio (SDR) approach. Based on the widely used SDR platform Universal Software Radio Peripheral (USRP) and the visual programming language LabVIEW, we present a fully standard-compliant implementation of all PHY I modes of the IEEE 802.15.7 standard. The rest of the equipment used in the experimental setup consists of low-cost, commercial off-the-shelf devices. We successfully demonstrate audio streaming through our software defined VLC system, which can transmit and receive data successfully at distances of up to 2 meters. We also present bit error rate results of all PHY I modes of IEEE 802.15.7 running on our VLC system, which operates at a distance of 1 meter.
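
The two-pick round-robin arbiter entry listed above describes its behavior precisely enough to sketch in software. The following is a minimal behavioral model only, not the published circuit architecture: the function name `two_pick_round_robin` and its parameters are hypothetical, and the pointer update for cycles with fewer than two picks is an assumption, since the abstract only states where the pointer goes after the second pick.

```python
from typing import List, Tuple


def two_pick_round_robin(requests: List[bool], pointer: int) -> Tuple[List[bool], int]:
    """Behavioral model of one arbitration cycle (hypothetical helper).

    Scans the n requesters circularly, starting at the highest-priority
    position `pointer`, and grants up to two active requesters.
    Returns the (at most) two-hot grant vector and the next pointer.
    """
    n = len(requests)
    grants = [False] * n
    picked = []
    for offset in range(n):
        idx = (pointer + offset) % n
        if requests[idx]:
            grants[idx] = True
            picked.append(idx)
            if len(picked) == 2:
                break
    # As in the abstract: the pointer moves to the position immediately
    # after the second requester picked.
    # Assumption (not stated in the abstract): with a single pick the
    # pointer moves past that pick; with no picks it stays where it is.
    next_pointer = (picked[-1] + 1) % n if picked else pointer
    return grants, next_pointer


if __name__ == "__main__":
    reqs = [True, False, True, True]
    grants, ptr = two_pick_round_robin(reqs, pointer=1)
    print(grants, ptr)
    grants, ptr = two_pick_round_robin(reqs, pointer=ptr)
    print(grants, ptr)
```

Calling the function repeatedly with the returned pointer models successive cycles; because the pointer advances past the last granted requester, a requester that was just served gets the lowest priority in the next cycle, which is the fairness property the abstract refers to.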