Browsing by Author "Gener, Y. S."

Hardware division by small integer constants
(IEEE, 2017-12) Uğurdağ, Hasan Fatih; Dinechin, F. de; Gener, Y. S.; Electrical & Electronics Engineering; UĞURDAĞ, Hasan Fatih
This article studies the design of custom circuits for division by a small positive constant. Such circuits can be useful for specific FPGA and ASIC applications. The first problem studied is the Euclidean division of an unsigned integer by a constant, computing a quotient and a remainder. Several new solutions are proposed and compared against the state of the art. As the proposed solutions use small look-up tables, they match well with the hardware resources of an FPGA. The article then studies whether division by the product of two constants is better implemented as two successive dividers or as one atomic divider. It also considers the case when only a quotient or only a remainder is needed. Finally, it addresses the correct rounding of the division of a floating-point number by a small integer constant. All these solutions, and the previous state of the art, are compared in terms of timing, area, and area-timing product. In general, the relevance domains of the various techniques are different on FPGA and on ASIC.
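As an illustration of how small look-up tables can carry out Euclidean division by a constant, the following is a minimal software sketch of one table-based digit recurrence. It is not taken from the article and does not claim to reproduce any of its architectures. The function names and the chunk size `k` are assumptions made for this example. Each table entry maps a running remainder and the next `k` dividend bits to a quotient digit and a new remainder.

```python
def build_table(d, k):
    """Table indexed by [remainder r < d][k-bit chunk x] -> divmod(r * 2**k + x, d)."""
    return [[divmod((r << k) + x, d) for x in range(1 << k)] for r in range(d)]


def divide_by_constant(n, d, k, width):
    """Divide a `width`-bit unsigned integer n by the constant d.

    The dividend is consumed k bits at a time, most significant chunk first.
    Because the running remainder r is always < d, each quotient digit fits
    in k bits, so one small table lookup per chunk is enough.
    """
    table = build_table(d, k)
    chunks = -(-width // k)          # ceil(width / k)
    mask = (1 << k) - 1
    q, r = 0, 0
    for i in reversed(range(chunks)):
        x = (n >> (i * k)) & mask    # next k-bit chunk of the dividend
        digit, r = table[r][x]       # one lookup: quotient digit, new remainder
        q = (q << k) | digit
    return q, r
```

In hardware terms, each loop iteration would correspond to one small table whose input is the concatenation of the remainder bits and a dividend chunk; the software loop only models that chain sequentially.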
Lossless look-up table compression for hardware implementation of transcendental functions
(IEEE, 2019) Gener, Y. S.; Gören, S.; Uğurdağ, Hasan Fatih; Electrical & Electronics Engineering; UĞURDAĞ, Hasan Fatih
Look-Up Table (LUT) implementation of transcendental functions often offers lower latency than algebraic implementations, at the expense of a significant area penalty. The MultiPartite table method (MP) can circumvent the area problem by breaking up the implementation into multiple smaller LUTs. However, even these smaller LUTs may be large in high-accuracy MP designs. Lossless LUT compression can be applied to one or more of these LUTs to further improve area and, in some cases, even timing. The state-of-the-art 2T-TIV and 3T-TIV methods decompose the Table of Initial Values (TIV) of MP into a table of pivots and tables of differences from the pivots. Our technique, which we call Fully Random Access differential LUT (FR-dLUT), instead uses differences of consecutive elements, which results in a smaller range of differences. We also propose a variant of FR-dLUT with variable-length (Huffman) coding called FR-dLUT-VL, which introduces don't-cares into the difference tables and lets logic synthesis optimize them out. We implemented Verilog generators of MP for sine and exponential, where the TIV is implemented as a conventional LUT as well as with 2T-TIV, 3T-TIV, FR-dLUT, and FR-dLUT-VL. We synthesized the generated designs on FPGA and found that our techniques produce around 10% improvement in area and timing beyond the state of the art at large bit widths.
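The claim that differences of consecutive elements span a smaller range than differences from a pivot can be checked with a small software experiment. The sketch below is an assumption-laden illustration and not the FR-dLUT hardware architecture: the quantized sine table, the block size, and all function names are hypothetical choices made for this example. It compares the largest difference under each scheme and shows that a table stored as per-block pivots plus consecutive differences can be decoded losslessly.

```python
import math


def sine_table(addr_bits, out_bits):
    """Quantized sin(x) for x in [0, pi/2), one entry per address."""
    n = 1 << addr_bits
    scale = (1 << out_bits) - 1
    return [round(math.sin(math.pi / 2 * i / n) * scale) for i in range(n)]


def pivot_diff_max(table, block):
    """Largest difference between an entry and the first entry (pivot) of its block."""
    return max(table[i] - table[i - i % block] for i in range(len(table)))


def consecutive_diff_max(table):
    """Largest difference between neighbouring entries."""
    return max(b - a for a, b in zip(table, table[1:]))


def encode_consecutive(table, block):
    """Keep one absolute value per block and consecutive differences inside each block."""
    pivots = table[::block]
    diffs = [table[i] - table[i - 1] if i % block else 0 for i in range(len(table))]
    return pivots, diffs


def decode(pivots, diffs, block, i):
    """Recover entry i by adding the differences between its block pivot and i."""
    base = i - i % block
    return pivots[i // block] + sum(diffs[base + 1 : i + 1])
```

For a smooth, monotonic table such as this one, `consecutive_diff_max(table).bit_length()` is smaller than `pivot_diff_max(table, block).bit_length()`, which is the narrower difference width the abstract refers to. The cost is that decoding an entry needs an addition of several differences rather than a single one.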
Semi- and fully-random access LUTs for smooth functions
(Springer, 2020) Gener, Y. S.; Aydın, F.; Gören, S.; Uğurdağ, Hasan Fatih; Electrical & Electronics Engineering; Metzler, C.; Gaillardon, P.-E.; Micheli, G. de; Silva-Cardenas, C.; Reis, R.; UĞURDAĞ, Hasan Fatih
Look-Up Table (LUT) implementation of complicated functions often offers lower latency than algebraic implementations, at the expense of a significant area penalty. If the function is smooth, the MultiPartite table method (MP) can circumvent the area problem by breaking up the implementation into multiple smaller LUTs. However, even some of these smaller LUTs may be large in high-accuracy MP implementations. Lossless LUT compression can be applied to these LUTs to further improve area and, in some cases, even timing. The state of the art in the literature decomposes the Table of Initial Values (TIV) of MP into a table of pivots and tables of differences from the pivots. Our technique instead places differences of consecutive elements in the difference tables, which results in a smaller range of differences that fit in fewer bits. Constraining the difference between consecutive input values, hence semi-random access, allows us to further optimize designs. We also propose variants of our techniques with variable-length coding. We implemented Verilog generators of MP for sine and exponential using a conventional LUT as well as different versions of the state of the art and our technique. We synthesized the generated designs on FPGA and found that our techniques produce up to 29% improvement in area, 11% improvement in timing, and 26% improvement in area-time product over the state of the art.