

## Deep Neural Learning based Deming Regression Adder Enhancement on Digital Multiplier for 3D Graphical Applications in VLSI Circuits

M. Renuka<sup>1\*</sup> and G. Mary Valantina<sup>2</sup>

<sup>1</sup>Research Scholar, Department of ETCE, Sathyabama Institute of Science and Technology, Chennai, mannerenu1@gmail.com

<sup>2</sup>Associate Professor, Department of CSE, Saveetha School of Engineering, Chennai

\*Correspondence: M. Renuka; mannrenu1@gmail.com

**ABSTRACT-** This work investigates 3D technology as a means of sustaining performance improvements over several technology generations. Three-dimensional integrated circuits allow higher integration density, faster on-chip communication, and heterogeneous integration. The goal of this research is to reduce time and power consumption by introducing the Deep Neural Learnt Deming Regression Based Ladner-Fischer Adder Enhancement (DNLDR-LFAE) technique in VLSI circuits. The input data (carry inputs) are fed to the input layer and transmitted to hidden layer 1. Deming regression analysis is performed in hidden layer 1 to pre-process the input data, which are then sent to hidden layer 2. Hidden layer 2 performs carry generation and post-processing, and its outputs are combined using convolution. Finally, the output layer delivers the results, achieving an efficient adder enhancement of the digital multiplier with lower power and time consumption. The DNLDR-LFAE technique is evaluated in terms of power, area, and time consumption. The results show that the DNLDR-LFAE technique reduces the time and power consumption of adder enhancement compared with other methods. The hardware complexity of the proposed technique is reduced by 52% to 63% when designed with the Ladner-Fischer adder.

Keywords: Very Large Scale Integration, Integrated Circuit, Ladner-Fischer, Adder Enhancement, Carry Generation.

#### **ARTICLE INFORMATION**

Author(s): M. Renuka and G. Mary Valantina; Received: 13/02/2023; Accepted: 14/04/2023; Published: 30/04/2023; e-ISSN: 2347-470X; Paper Id: IJEER 1302-06; Citation: 10.37391/IJEER.110202 Webpage-link: https://ijeer.forexjournal.co.in/archive/volume-11/ijeer-110202.html

**Publisher's Note:** FOREX Publication stays neutral with regard to Jurisdictional claims in Published maps and institutional affiliations.

### **1. INTRODUCTION**

For three-dimensional graphical applications, area and power consumption are key issues in VLSI design. Low-power, area-efficient, high-performance VLSI designs play a vital part in advanced DSP. Addition speed is limited by the carry, which is the main bottleneck in digital adders. In a basic digital adder, the sum at each bit position is produced sequentially, only after the previous bit position has been summed and its carry has propagated to the next position; this carry propagation limits the addition speed. Y. Mounica et al., 2021 [1] introduced an energy-efficient radix-16 Booth multiplier design for combined signed/unsigned numbers. The design employed a partial product generation unit to reduce delay and energy considerably. However, the energy-efficient radix-16 Booth multiplier design did not minimize area consumption.

Dina M. Ellaithy et al., 2019 [2] introduced a dual-channel multiplier (DCM) for energy-efficient second-order piecewise-polynomial function evaluation in 3-D graphics. The evaluation process was highly dependent on the multiplication and squaring structure. However, DCM did not minimize area consumption.

Michael Opoku Agyeman et al., 2018 [3] analyzed an energy- and performance-aware application mapping algorithm for inhomogeneous 3D Networks-on-Chip. The algorithm was evaluated with different realistic traffic patterns. However, it failed to reduce the complexity level. Pankaj Kumar and Rajender Kumar Sharma, 2016 [4] proposed a low-voltage, high-performance 1-bit full adder whose internal logic yielded a minimal power-delay product. However, the complexity level was not reduced.

S. Rakesh and K. S. Vijula Grace, 2019 [5] considered Parallel Prefix Adders (PPAs) to be the fastest adders. Parallel prefix adders are efficient circuits for binary addition, and carry tree adders attain better performance in VLSI designs. However, parallel prefix adders did not minimize the computational cost.

High-performance VLSI integer adders are essential elements in the design of arithmetic-logic units for digital signal processing (Bala Sindhuri Kandula, 2016 [6]). In an adder, speed, delay, and area are the essential factors. Speed was achieved with a Square Root Carry Select Adder (SQRT-CSLA). However, the overhead of the VLSI integer adders was not minimized.

Based on the Gate Diffusion Input (GDI) technique, a full-swing, high-speed hybrid full adder cell was introduced in Mehedi Hasan et al., 2020 [7]. However, the GDI technique did not improve the efficiency level.



Research Article | Volume 11, Issue 2 | Pages 242-252 | e-ISSN: 2347-470X

An energy-efficient column compression multiplication of partial products with little overhead was introduced in B. Ramkumar and Harish M. Kittur, 2013 [8]. However, the energy-efficient column compression multiplication did not reduce the computation overhead.

For intelligent system applications, complementary metal-oxide-semiconductor (CMOS) design topologies of a 1-bit full adder (FA) were introduced in Rekib Uddin Ahmed and Prabir Saha, 2019 [9]. However, the CMOS design topologies did not minimize power consumption.

Three low-power full adders built from AND, OR, and XOR gates with a reduced threshold-voltage problem were introduced in Mohan Shoba and Rangaswamy Nakkeeran, 2016 [10]. This problem prevented the FA circuits from functioning without additional inverters. However, the full adder circuits did not minimize delay.

The problems identified above include high power consumption, high delay, high cost, and high area consumption. To overcome these problems, the Deep Neural Learnt Deming Regression Based Ladner-Fischer Adder Enhancement (DNLDR-LFAE) technique is presented.

Contributions of DNLDR-LFAE are:

- To decrease area and power consumption, the Deep Neural Learnt Deming Regression Based Ladner-Fischer Adder Enhancement (DNLDR-LFAE) technique is introduced for adder enhancement in VLSI circuits.
- The DNLDR-LFAE technique comprises four layers. The input data are fed to the input layer and transmitted to hidden layer 1.
- The novelty of the Deming regression process is that it finds a line of best fit to select the optimal pre-processed data for adder enhancement. In addition, a maximum likelihood estimate is obtained in reduced time.
- The proposed DNLDR-LFAE technique uses the Ladner-Fischer adder in hidden layer 2. It comprises pre-processing, carry propagation and generation, and post-processing stages. This adder is used because of its highly efficient interconnection.
- To pre-process the input data, Deming regression analysis is performed in hidden layer 1, and the result is transmitted to hidden layer 2, which executes carry generation and post-processing. Convolution is then used to combine the outputs.

Lastly, the results are obtained at the output layer. Power and time consumption are minimized through the efficient adder enhancement of the digital multiplier.

### **2. RELATED WORKS**

An upgraded FIR digital filter framework for software-to-hardware implementation was introduced in Sumbal Zahoor et al., 2017 [11]. The framework comprised an organized design and cost-effective hardware use. However, it failed to reduce complexity.

An 8-bit multiplier design was implemented in CMOS in Mansi Jhamb et al., 2016 [12]. The DPL adder avoided the noise margin issue and the speed degradation at low supply voltages seen in CPL logic circuits. However, the 8-bit multiplier design did not improve computation speed.

An electronic-photonic computing architecture for a wavelength division multiplexing-based electronic-photonic arithmetic logic unit was introduced in Zhoufeng Ying et al., 2020 [13]. The architecture decoupled the exponential relationship between power and clock rate, resulting in higher speed and power efficiency. Although power efficiency was enhanced, the electronic-photonic computing architecture had higher time complexity.

Hybrid full adder designs based on pass transistors, transmission gates, and conventional CMOS are discussed in Mehedi Hasan et al., 2020 [14]. However, the hybrid FA design did not minimize circuit design complexity.

In floating-point multipliers, an approximate comparator was used in Samar Ghabraei et al., 2020 [15] to compute mantissa products. Power, area, and delay were improved compared with precise comparators. However, the approximate comparator did not minimize power consumption.

High-speed parallel-prefix adders such as Kogge-Stone and Brent-Kung were used in Deepa Yagain et al., 2012 [16]. Kogge-Stone Ling adders and ripple adders were included to verify their operation. However, the approach failed to reduce time.

A reconfigurable approximate ripple carry adder based on a Gate Diffusion Input (GDI) logic passing cell was introduced in Sakthivel B and Padma A, 2020 [17]. The GDI cell acted as a reconfigurable cell in the adder chain, passing either the carry value or an approximated value. The adder was configured by choosing the operating mode. However, the reconfigurable approximate ripple carry adder did not minimize energy consumption.

To speed up the transformation process and reduce processing time, a parallel and pipelined architecture for the Affine Transform (AT) was introduced in Pulak Mondal et al., 2015 [18]. For verification, the architecture was implemented on a Field-Programmable Gate Array. However, the AT algorithm did not reduce computational complexity.

A minimal-cost, maximum-performance special function unit (SFU) architecture with reduced delay and area was introduced in Avni Agarwal et al., 2014 [19]. Its hybrid number system reduced operation complexity by switching between the logarithmic and binary systems.

A high-speed special function unit supporting the single-precision IEEE-754 floating-point standard was introduced in Davide De Caro et al., 2009 [20]. The designed system implemented faithfully rounded reciprocal and exponential functions. However, it failed to reduce area consumption.

An adaptive power gating technique was introduced in Alexander E. Shapiro et al., 2016 [21] for a 32-bit Kogge-Stone adder and evaluated at the 16 nm FinFET technology node. The high-granularity adaptive power gating method used a local controller with minimal energy and circuit overhead.

# International Journal of Electrical and Electronics Research (IJEER)

Digital signal processing (DSP) is an advanced technology used across engineering disciplines. The multiplier is an essential block in many digital systems. Adder enhancement in digital multipliers is carried out to improve the performance of VLSI circuits for 3D applications. Many methods have been proposed to improve digital multipliers for 3D applications. However, they do not reduce time and power consumption in VLSI circuits. Therefore, the DNLDR-LFAE technique is introduced.

The current study considers the deep learning concept for 3D graphical applications in VLSI. The research aims to develop DNLDR-LFAE, which decreases power and time consumption using Deming regression analysis, carry generation, and post-processing.

### **3. RESEARCH METHODOLOGY**

VLSI integer adders are essential elements in DSP, as they are used in arithmetic-logic units, floating-point arithmetic datapaths, and address generation units. DSP applications implement digital filters either directly in hardware or within DSP processors.

#### 3.1 Three Dimensional Technology

Recent developments bring 3D technology wherever it is reasonable and practical, and it has attracted growing attention in the chip industry. 3D integration is an emerging technology that forms highly integrated systems by stacking and interconnecting dissimilar materials and functional components. Depending on the application, its key advantages include multi-functionality, enhanced performance, reduced power, small form factor, reduced packaging, improved yield and reliability, flexible heterogeneous integration, and lower cost.

#### **3.2 Graphical Processing Unit**

The world is digitally connected by high-speed technologies in products such as automobiles, computers, and smartphones. Graphical processing units (GPUs) are used in a large number of applications in diverse fields such as computer art, engineering, and the military. Developments in GPUs have led to developments in hardware design to deliver the required boost in performance. GPUs are employed to execute intensive computation on parallel hardware. A graphical user interface (GUI) is a type of user interface in which users interact with an electronic device through visual indicators. GUIs were introduced in response to the steep learning curve of command-line interfaces, which require commands to be typed at a keyboard. Actions in a GUI are performed through direct manipulation of graphical elements. GUIs are used in many mobile devices such as MP3 players, portable media players, gaming devices, and smartphones. A GUI employs a combination of technologies and devices to provide a platform for gathering and producing information.

#### 3.3 Dual Channel Multiplier

The Dual Channel Multiplier (DCM) targets handheld devices and directly reduces power and area. The simplicity of serial multipliers makes them well suited to VLSI, because the multiplier hardware is built by repeating a single cell. DCM improves the savings in area and power-delay product (PDP). The logarithmic number system (LNS) is mainly beneficial for large input operand sizes. This work reduces the propagation of logarithmic conversion. The multiplier architecture attains lower power dissipation, smaller area, and fewer transition errors. Figure 1 shows the architecture diagram of DCM-GPU. The DCM is simple and regular. 'x' is the standard input and 'y' is the complementary input; 'p' represents the final result. Each clock cycle consumes two serial input bits. The even-indexed serial input bits $(x_6, x_4, x_2, x_0)$ are routed to the upper channel, while the lower channel processes the odd-indexed bits $(x_7, x_5, x_3, x_1)$.



Figure 1: Architecture Diagram of DCM-GPU



Open Access | Rapid and quality publishing

The bit pairs are shifted and processed simultaneously. The partial products $(PP_0)$ are generated starting from the first clock cycle and are formulated as,

$$PP_0 = \begin{cases} Y_0 X_0, \ Y_1 X_0 \\ Y_0 X_1 \end{cases} \tag{1}$$

From (1), partial products are generated. The block diagram of 8-bit DCM is illustrated in *figure 2*.



Figure 2: 8-bit DCM Block Diagram

The partial product $(Y_0X_1)$ is added to the partial product $(Y_1X_0)$ and propagated. The partial product $(Y_0X_0)$ is passed directly to the output. The two lowest-order parts of the product $(P_0 \text{ and } P_1)$ are generated simultaneously as,

$$P_0 = Y_0 X_0 \tag{2}$$

$$P_1 = Y_0 X_1 + Y_1 X_0 \tag{3}$$

In each following clock cycle, the serial input data are shifted one position to the right through the delay elements. The last sum obtained is $(P_3)$. The carry bits produced by the full adders are propagated and used in the next partial products. The multiplication is repeated, giving,

$$P_2 = Y_0 X_2 + Y_1 X_1 + Y_2 X_0 \tag{4}$$

$$P_3 = Y_0 X_3 + Y_1 X_2 + Y_2 X_1 + Y_3 X_0 \tag{5}$$

The partial products are thus generated. The DCM uses parallel-to-serial and serial-to-parallel converters for input handling and parallel result computation. 8-bit and 16-bit DCMs are implemented in a 90 nm CMOS process to improve structural performance.
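The column sums of equations (2) to (5) can be illustrated with a short Python sketch. It is a behavioural illustration only: it does not model the two serial channels, the clocking, or the carry-save hardware of the DCM, and the function names `column_sums` and `assemble_product` are hypothetical names chosen for this example.

```python
from typing import List


def column_sums(x: int, y: int, n_bits: int) -> List[int]:
    """Return P_k = sum_i Y_i * X_(k-i) for every column k, as in eqs. (2)-(5)."""
    x_bits = [(x >> i) & 1 for i in range(n_bits)]
    y_bits = [(y >> i) & 1 for i in range(n_bits)]
    sums = []
    for k in range(2 * n_bits - 1):
        # Only index pairs (i, k - i) that fall inside both operands contribute.
        p_k = sum(
            y_bits[i] * x_bits[k - i]
            for i in range(n_bits)
            if 0 <= k - i < n_bits
        )
        sums.append(p_k)
    return sums


def assemble_product(sums: List[int]) -> int:
    """Weight each column sum by 2**k; the carries are absorbed by the integer sum."""
    return sum(p_k << k for k, p_k in enumerate(sums))


if __name__ == "__main__":
    cols = column_sums(13, 11, n_bits=4)
    print(cols, assemble_product(cols))
```

Weighting each column sum by its bit position and adding the results reproduces the ordinary product, which is a quick way to check that the column equations are consistent.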

#### **3.4 Deep Neural Learnt Deming Regression Based Ladner-Fischer Adder Enhancement**

Deep learning (DL) is a subfield of machine learning (ML) methods inspired by the structure and function of artificial neural networks (ANNs). Data processing and computation on large data sets are executed by multiple layers. DL is modeled on the operation of the human brain. *Figure 3* shows the structure of the deep neural learning classifier.

A deep neural learning classifier with one input layer, two hidden layers, and one output layer is shown in *figure 3*. In the deep neural learning classifier, the carry inputs are taken as input data and loaded into the input layer at time 't', then forwarded to hidden layer 1. Deming regression analysis is performed there to pre-process the input data, which are transmitted to hidden layer 2. Hidden layer 2 implements carry generation and post-processing, and its outputs are combined using convolution. Every neuron in one layer is connected to every neuron in the next layer, which makes the network susceptible to overfitting. Each neuron receives input from every element of the previous layer and computes its output by applying a specific function to the input values from the previous layer. Finally, the results (*i.e.*, optimal carry generation) are obtained at the output layer.



Figure 3: Structure of Deep Neural learning Classifier

Addition is a major operation in DSP and control systems. The adder determines the speed and accuracy of a processor or system. In general-purpose and DSP processors, addition is traditionally performed with a simple ripple carry adder. Adders usually lie on the critical path of several building blocks of microprocessors and DSP chips. Their purpose is to form the arithmetic sum of two binary numbers. The most important metrics for evaluating adder designs are delay and area.

Hybrid parallel prefix adders are employed to avoid high power consumption in the system. They offer an improved trade-off between delay and power consumption. The Efficient Ladner-Fischer Adder is a new design offering higher speed and lower memory use. It has a tree structure, and the number of cells in the carry generation stage is reduced for binary addition. This adder provides a large benefit in reducing delay.

The Ladner-Fischer adder is a PPA. It uses a tree structure to perform arithmetic operations, which eliminates the carry propagation delay of the ripple carry adder. It comprises pre-processing, carry propagation and generation, and post-processing stages, and it is used because of its highly efficient interconnection. Binary operator cells form the building blocks of the design. In ripple carry adders, every bit must wait for the preceding bit operation to complete. In parallel prefix adders, the idea is to overlap the carry propagation of the first addition with the computation of the second addition. Repetitive additions are carried out by a multi-operand adder. The Ladner-Fischer adder is illustrated in *figure 4*.
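The sequential carry dependence of the ripple carry adder mentioned above can be made concrete with a small behavioural model. This is a minimal reference sketch rather than part of the proposed technique; the function names `full_adder` and `ripple_carry_add` are hypothetical.

```python
from typing import Tuple


def full_adder(a: int, b: int, c_in: int) -> Tuple[int, int]:
    """One-bit full adder: returns (sum bit, carry out)."""
    s = a ^ b ^ c_in
    c_out = (a & b) | (c_in & (a ^ b))
    return s, c_out


def ripple_carry_add(a: int, b: int, n_bits: int = 8, c_in: int = 0) -> Tuple[int, int]:
    """Add two n-bit numbers; bit i cannot finish until the carry of bit i-1 is known."""
    carry = c_in
    total = 0
    for i in range(n_bits):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry


if __name__ == "__main__":
    print(ripple_carry_add(200, 100))
```

The loop body depends on the carry from the previous iteration, so the carry chain grows linearly with the word size; this is the delay that parallel prefix adders are designed to shorten.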




Consider the data inputs $d_k = d_1, d_2, d_3, \dots, d_m$, which are loaded into the input layer. The input values are combined with a weight and a bias as,

$$I(t) = \sum_{k=1}^{m} d_k * w_{ih} + Bias \tag{6}$$

In eq. (6), 'I(t)' is the input-layer output that gathers the data at time 't', and ' $w_{ih}$ ' denotes the initial weight at the input layer. The result is then transmitted to the first hidden layer, where Deming regression analysis is carried out to perform the pre-processing.



Figure 4: Three stages of Ladner-Fischer Adder

*Deming Regressed Pre-Processing Stage:* Generate and propagate signals are computed for every input bit. The adder consists of black and gray cells. Every black cell contains two AND gates and one OR gate. ' $P_i$ ' denotes the propagate signal, obtained with a single XOR gate as in eq. (7), and ' $G_i$ ' denotes the generate signal, obtained with an AND gate.

$$P_i = A_i \ XOR \ B_i \tag{7}$$

$$G_i = A_i \ AND \ B_i \quad \text{(or)} \tag{8}$$

$$G_i = P_i \ OR \ [G_i \ AND \ C_{in}] \tag{9}$$

The time required to create the carry signals in a prefix adder is O(log n). It is the fastest adder in terms of design time and is the common choice for high-performance adders in industry. The Ladner-Fischer adder gives better performance because of its low logic depth.

On the other hand, it occupies a large silicon area. Deming regression (DR) is introduced to select data in hidden layer 1. It is an errors-in-variables model that finds the line of best fit to pick suitable carry inputs. The maximum likelihood estimate of the errors-in-variables model is obtained. The DR process is described in *figure 5*.

*Figure 5* shows the algorithmic process diagram of Deming regression analysis. A large number of input data are collected, where 'k' denotes the number of data items. The DR process assumes that the available data $(a_i, d_i)$ are observations of true values $(a_i^*, d_i^*)$ that lie on the regression line,

$$a_i = a_i^* + e_i \tag{10}$$

$$d_i = d_i^* + v_i \tag{11}$$



Figure 5: Deming Regression Process

In eqs. (10) and (11), $e_i$ and $v_i$ are the error values, which are characterized by the ratio of their variances and are independent of each other. The intercept ' $I_0$ ' and the slope ' $s_1$ ' are computed from,

$$\widehat{a}_i = I_0 + s_1 \widehat{d}_i \tag{12}$$

In eq. (12),  $\hat{d}_i$  and  $\hat{a}_i$  are the estimates of the true values of ' $d_i$ ' and ' $a_i$ ', respectively. To obtain the best fit, the DR process minimizes the weighted sum of squared residuals. The DR algorithm reduces the time consumed in adder enhancement in VLSI design for 3D applications compared with conventional works. The pre-processed inputs are transmitted to hidden layer 2, where the carry generation and post-processing stages are carried out in VLSI circuits for 3D applications. The algorithmic process of DR is given in Algorithm 1.

#### Algorithm 1: Deming Regression Analysis

| // Deming Regression |
|----------------------|
| **Input:** Number of data ' $d_1, d_2, d_3, \dots, d_m$ ' |
| **Output:** Select the preprocessed optimal data for adder enhancement |
| Step 1: Begin |
| Step 2: For each data |
| Step 3: Apply Deming regression analysis |
| Step 4: Find best fit line |
| Step 5: Choose the preprocessed data for adder enhancement |
| Step 6: End For |
| Step 7: End |
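Steps 3 and 4 of Algorithm 1 can be sketched with the standard closed-form Deming estimator. The sketch below is illustrative and rests on assumptions not stated in the paper: `delta` is taken as the ratio of the variance of the errors in $a$ to the variance of the errors in $d$ and defaults to 1 (orthogonal regression), and the function names are hypothetical. The rule used to choose the pre-processed data in Step 5 is not specified in the text, so it is not modelled here.

```python
import math
from typing import List, Sequence, Tuple


def deming_fit(d: Sequence[float], a: Sequence[float], delta: float = 1.0) -> Tuple[float, float]:
    """Return (I0, s1) for the line a = I0 + s1 * d under the Deming model.

    delta is the assumed ratio var(e) / var(v) of the error variances.
    """
    n = len(d)
    if n < 2 or n != len(a):
        raise ValueError("need two equally sized samples with at least two points")
    d_mean = sum(d) / n
    a_mean = sum(a) / n
    s_dd = sum((x - d_mean) ** 2 for x in d) / (n - 1)
    s_aa = sum((y - a_mean) ** 2 for y in a) / (n - 1)
    s_da = sum((x - d_mean) * (y - a_mean) for x, y in zip(d, a)) / (n - 1)
    if s_da == 0:
        raise ValueError("slope is undefined when the sample covariance is zero")
    diff = s_aa - delta * s_dd
    s1 = (diff + math.sqrt(diff * diff + 4.0 * delta * s_da * s_da)) / (2.0 * s_da)
    i0 = a_mean - s1 * d_mean
    return i0, s1


def estimate_true_values(
    d: Sequence[float], a: Sequence[float], i0: float, s1: float, delta: float = 1.0
) -> Tuple[List[float], List[float]]:
    """Project each observation onto the fitted line, giving (d_hat, a_hat) as in eq. (12)."""
    d_hat = [x + s1 / (s1 * s1 + delta) * (y - i0 - s1 * x) for x, y in zip(d, a)]
    a_hat = [i0 + s1 * x for x in d_hat]
    return d_hat, a_hat


if __name__ == "__main__":
    d_obs = [0.0, 1.0, 2.0, 3.0]
    a_obs = [1.0, 3.0, 5.0, 7.0]
    i0, s1 = deming_fit(d_obs, a_obs)
    print(i0, s1)
```

Unlike ordinary least squares, this estimator allows for errors in both variables, which is the property the text attributes to the errors-in-variables model.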




*Carry Generation Stage:* In this stage, a carry is created for every bit, termed the carry generate  $(C_g)$ . The carry propagate and carry generate signals are produced for further processing. The last cell of each bit operation provides the carry. The carry of the final bit assists in producing the sum of the next bit simultaneously, up to the last bit.

$$C_p = P_i \ AND \ P_{i-1} \tag{13}$$

$$C_g = G_i \ OR \ [P_i \ AND \ G_{i-1}] \tag{14}$$

where the carry propagate  $C_p$  and the carry generate  $C_g$  of eqs. (13) and (14) are both computed by a black cell, while a gray cell computes only the carry generate of eq. (14). The carry is used in the sum operation of the next bit.

*Post Processing Stage:* In the post-processing stage, the propagate and generate signals from the carry generation stage are used to produce the final sum. The fast and accurate performance of such an adder is exploited in VLSI and DSP.



Figure 6: 8-Bit Efficient Ladner-Fischer Adder

It is also used for two 16-bit addition operations. The carry of every bit is post-processed with the propagate signal to give the final sum.

*Figure 6* shows an 8-bit efficient Ladner-Fischer adder, which improves speed and decreases area for 8-bit addition. The input bits ' $A_i$ ' and ' $B_i$ ' go through generation and propagation using AND and XOR gates, respectively. They are then processed by black and gray cells to produce the carry ' $C_i$ '. The carry is XORed with the propagate signal of the next bit to give the sum.
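The three stages described above can be sketched as a bit-level Python model. The prefix tree used here is the minimum-depth (Sklansky-style) member of the Ladner-Fischer family, which is an assumption, since the exact cell layout of the efficient variant in figure 6 is not recoverable from the text. The function name `ladner_fischer_add` is hypothetical, and the carry-in is folded into bit 0 with the usual generate/propagate relation.

```python
from typing import Tuple


def ladner_fischer_add(a: int, b: int, n_bits: int = 8, c_in: int = 0) -> Tuple[int, int]:
    """Behavioural model of a parallel prefix adder; returns (sum, carry out)."""
    # Pre-processing stage: propagate (eq. 7) and generate (eq. 8) per bit.
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(n_bits)]
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(n_bits)]

    # Group signals; the carry-in is absorbed into the generate of bit 0.
    grp_g = g[:]
    grp_p = p[:]
    grp_g[0] = g[0] | (p[0] & c_in)

    # Carry generation stage: log2(n) levels of prefix cells.
    level = 0
    while (1 << level) < n_bits:
        next_g, next_p = grp_g[:], grp_p[:]
        for i in range(n_bits):
            if (i >> level) & 1:
                j = ((i >> level) << level) - 1  # top bit of the lower block
                next_g[i] = grp_g[i] | (grp_p[i] & grp_g[j])  # eq. (14)
                next_p[i] = grp_p[i] & grp_p[j]               # eq. (13)
        grp_g, grp_p = next_g, next_p
        level += 1

    # Post-processing stage: sum bit = propagate XOR incoming carry.
    carries = [c_in] + grp_g[:-1]
    total = 0
    for i in range(n_bits):
        total |= (p[i] ^ carries[i]) << i
    return total, grp_g[n_bits - 1]


if __name__ == "__main__":
    print(ladner_fischer_add(100, 27))
```

For an 8-bit word the loop runs three levels, in line with the logarithmic depth quoted for prefix adders. A gray cell corresponds to evaluating only the `next_g` line; the model computes both signals everywhere for simplicity.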

*Figure 7* shows the 16-bit efficient Ladner-Fischer adder. A multi-bit adder is a logic circuit that computes the sum of N-bit numbers. In every addition operation, the carry input  $(C_{in})$  is the carry output  $(C_{out})$  of the preceding bit. The hidden layer output 'H(t)' is computed as,

$$H(t) = \sum_{k=1}^{n} d_k * w_i + [w_{ih} * H(t-1)] \tag{16}$$



Figure 7: 16-bit Efficient Ladner-Fischer Adder

In (16), ' $w_{ih}$ ' is the weight between the input and hidden layers, and H(t-1) denotes the previous hidden layer output. '\*' is the convolution operator. The hidden layer output is sent to the output layer.

$$O(t) = [w_{ho} * H(t)] \tag{17}$$

In (17), 'O(t)' is the final classification and ' $w_{ho}$ ' denotes the weight between the hidden and output layers. The Deep Neural Learnt Deming Regression Based Ladner-Fischer Adder Enhancement algorithm is given below.

Algorithm 2: Deep Neural Learnt Deming Regression Based Ladner-Fischer Adder Enhancement

| // Deep Neural Learnt Deming Regression Based Ladner-Fischer Adder Enhancement Algorithm |
|---------------------------------------------------------------------|
| **Input:** Data |
| **Output:** Adder enhancement in digital multiplier |
| Step 1: Begin |
| Step 2: For each data at input layer |
| Step 3: The input layer transmits data to the hidden layer 1 |
| Step 4: Hidden layer 1 uses Deming regression analysis to preprocess the input information |
| Step 5: Hidden layer 2 performs carry generation and post processing stage for adder enhancement |
| Step 6: The output layer displays result |
| Step 7: End for |
| Step 8: End |

The DNLDR-LFAE technique takes the data as input. The input layer conveys the data to hidden layer 1, where Deming regression analysis is carried out to pre-process them. The pre-processed data are then conveyed to hidden layer 2, where the carry generation and post-processing stages are executed to perform adder enhancement. Finally, the results of adder enhancement in the digital multiplier for 3D applications in VLSI circuits are output.
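The layer equations (6), (16), and (17) can be traced numerically with a scalar sketch. This is only an illustration under stated assumptions: the operator '\*' is treated as ordinary multiplication for scalar weights, all weight and bias values are hypothetical, and the Deming regression and adder stages performed inside the hidden layers are not modelled here.

```python
from typing import Sequence


def input_layer(d: Sequence[float], w_ih: float, bias: float) -> float:
    """Eq. (6): I(t) = sum_k d_k * w_ih + Bias."""
    return sum(d_k * w_ih for d_k in d) + bias


def hidden_layer(d: Sequence[float], w_i: float, w_ih: float, h_prev: float) -> float:
    """Eq. (16): H(t) = sum_k d_k * w_i + w_ih * H(t-1)."""
    return sum(d_k * w_i for d_k in d) + w_ih * h_prev


def output_layer(w_ho: float, h_t: float) -> float:
    """Eq. (17): O(t) = w_ho * H(t)."""
    return w_ho * h_t


if __name__ == "__main__":
    carry_inputs = [1.0, 0.0, 1.0, 1.0]   # hypothetical carry-input data
    i_t = input_layer(carry_inputs, w_ih=0.5, bias=0.1)
    h_t = hidden_layer(carry_inputs, w_i=0.25, w_ih=0.5, h_prev=0.0)
    o_t = output_layer(w_ho=2.0, h_t=h_t)
    print(i_t, h_t, o_t)
```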



### **4. SOFTWARE USED AND RESULTS**

The DNLDR-LFAE technique is implemented for digital multiplier design using the Xilinx ISE design tool. The Xilinx Integrated Synthesis Environment (ISE) is a software tool from Xilinx for the synthesis and analysis of HDL designs, which primarily targets the development of embedded firmware for Xilinx FPGA and CPLD IC product families. ISE enables the developer to synthesize designs, perform timing analysis, examine RTL diagrams, simulate a design's reaction to different stimuli, and configure the target device with the programmer. The design is written in Verilog, a hardware description language (HDL). With the Xilinx ISE design tool, results are obtained in three forms: waveform, RTL schematic, and design utilization summary.

Table 1: Device Utilization of Existing Energy-Efficient Radix-16 Booth Multiplier Design

| Device Utilization Summary        |      |           |             |
|-----------------------------------|------|-----------|-------------|
| Logic Utilization                 | Used | Available | Utilization |
| Number of Slice Registers         | 600  | 54576     | 1%          |
| Number of fully used LUT-FF pairs | 301  | 1293      | 23%         |
| Number of bonded IOBs             | 16   | 296       | 5%          |
| Number of BUFG/BUFGCTRLs          | 1    | 32        | 3%          |
| Number of Slice LUTs              | 994  | 27288     | 3%          |
| Number of Block RAM/FIFO          | 2    | 116       | 1%          |

Table 2: Device Utilization of Existing Dual-Channel Multiplier (DCM)

| Device Utilization Summary (Estimated values) |      |           |             |
|-----------------------------------------------|------|-----------|-------------|
| Logic Utilization                             | Used | Available | Utilization |
| Number of Slice Registers                     | 604  | 93120     | 0%          |
| Number of Slice LUTs                          | 904  | 46560     | 1%          |
| Number of fully used LUT-FF pairs             | 280  | 1228      | 22%         |
| Number of bonded IOBs                         | 16   | 240       | 6%          |
| Number of Block/FIFO                          | 1    | 156       | 0%          |
| Number of BUFG/BUFGCTRLs                      | 3    | 32        | 9%          |

In VLSI, the area is smaller than that of traditional adder circuits. The transistor sizes of the full adders are optimized for minimal delay without higher power consumption. *Tables 1, 2*, and *3* give the device utilization of the two existing methods and the proposed method.

The device utilization of the Energy-Efficient Radix-16 Booth Multiplier Design and of DCM is shown in *tables 1* and *2*. The number of slice registers, the number of slice LUTs, and the number of fully used LUT-FF pairs are presented. In DCM, the PWP function evaluation organization consists mostly of multipliers. A control unit identifies the beginning and end of the evaluation to handle the function evaluation. The quadratic PWP evaluation hardware design is streamlined compared with the other two methods. In a microprocessor, the designed method is not limited to the adder and can be extended to other main units. A global adaptive power gating controller enables high-granularity local power gating that would otherwise be inaccessible. High granularity yields additional power savings when the circuit is only moderately active. Adaptive power gating shows significant energy savings in the range of 12% to 27% with a delay overhead of 13%, which is reduced by 3% with an area increase of 5% to 17%.

The device utilization of the DNLDR-LFAE technique is given in *table 3*. As the table shows, the proposed DNLDR-LFAE technique uses 4-input LUTs, occupied slices, slices containing related and unrelated logic, and bonded IOBs. The adder is a basic building block in DSP and is employed in microprocessors and in signal processing operations such as filtering and convolution. Fast adders are employed to speed up operations across numerous domains. The adder architecture is designed in Verilog and synthesized with a full ASIC flow to obtain the IC layout.

The area and power consumption performance of DNLDR-LFAE is examined. Registers were placed at the input and output of each adder to obtain valid timing data. The experimental results of the existing and proposed techniques in terms of area, delay, speed, and power are given in *table 4*. Area is defined as the total cell area of the VLSI design after adder enhancement. Total power is the sum of dynamic, internal, net, and leakage power. Delay is defined as the critical path delay of the adder circuit. Speed is described as the rate at which the VLSI circuits are designed with adder enhancement.

| Device Utilization Summary (Estimated values)   |      |           |             |
|-------------------------------------------------|------|-----------|-------------|
| Logic Utilization                               | Used | Available | Utilization |
| Number of 4 input LUTs                          | 16   | 11,776    | 1%          |
| Number of occupied slices                       | 12   | 5,888     | 1%          |
| Number of Slices containing only related logic  | 12   | 12        | 100%        |
| Number of Slices containing unrelated logic     | 0    | 12        | 0%          |
| Number used as AND/OR logics                    | 0    |           |             |
| Total Number of 4 input LUTs                    | 16   | 11,776    | 1%          |
| Number of bonded IOBs                           | 26   | 372       | 6%          |
| Average Fanout of Non-clock Nets                | 1.73 |           |             |

Table 3: Device Utilization of proposed DNLDR-LFAE model




Table 4: Comparison of Existing and Proposed Adder Enhancement Methods in terms of Area, Delay, Power and Speed

| Word-Size    | Adder                                             | Area (µm<sup>2</sup>) | Delay (ns) | Power (mW) | Speed (ns) |
|--------------|---------------------------------------------------|-----------------------|------------|------------|------------|
| 16-bit Adder | Energy-Efficient Radix-16 Booth Multiplier Design | 954.19                | 0.98       | 0.512      | 8.94       |
|              | DCM                                               | 1045.21               | 1.32       | 0.601      | 9.68       |
|              | Proposed DNLDR-LFAE technique                     | 745.354               | 0.71       | 0.435      | 6.47       |

The results in *table 4* show that the proposed DNLDR-LFAE technique attains improved area, delay, speed, and power. The figures below give a graphical comparison of these four parameters of the VLSI circuits.

With a 16-bit adder, the proposed DNLDR-LFAE technique achieves an area, delay, power, and speed of 745.354 µm<sup>2</sup>, 0.71 ns, 0.435 mW, and 6.47 ns, respectively. The existing Energy-Efficient Radix-16 Booth Multiplier Design [1] achieves 954.19 µm<sup>2</sup>, 0.98 ns, 0.512 mW, and 8.94 ns, and DCM [2] achieves 1045.21 µm<sup>2</sup>, 1.32 ns, 0.601 mW, and 9.68 ns. All four values of the proposed DNLDR-LFAE technique are lower than those of [1] and [2]. The comparison of the four parameters of the VLSI circuits is given as follows.



Figure 8: Measurement Analysis of Area

*Figures 8, 9, 10* and *11* show the experimental area, delay, power, and speed of the compared adders. The adders are evaluated based on their power and delay values in *figure 11*. The results show that the area, delay, power, and speed of the DNLDR-LFAE technique are lower than those of the other methods. The reason is the application of Deming regression and Ladner-Fischer adder enhancement with a deep learning concept. The Deming regression analysis is executed to select the pre-processed optimal data. Adder enhancement is then performed through carry generation and post-processing. In addition, the adder enhancement of the digital multiplier is performed efficiently in terms of power and speed.
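For reference, Deming regression fits a straight line when both variables carry measurement error. The following is a minimal sketch of the standard closed-form estimate, not the authors' implementation. The function name `deming_fit`, the error-variance ratio `delta` (y-error variance divided by x-error variance, with `delta = 1` giving orthogonal regression), and the sample data are assumptions made for illustration; this section does not specify how the carry inputs are mapped to the two variables.

```python
import math


def deming_fit(x, y, delta=1.0):
    """Return (slope, intercept) of the Deming regression line.

    delta is the assumed ratio of the y-error variance to the x-error
    variance; delta = 1.0 corresponds to orthogonal regression.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sample variances and covariance.
    s_xx = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
    s_yy = sum((yi - mean_y) ** 2 for yi in y) / (n - 1)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    if s_xy == 0:
        raise ValueError("slope is undefined when the covariance is zero")

    # Closed-form Deming slope, then intercept through the centroid.
    d = s_yy - delta * s_xx
    slope = (d + math.sqrt(d * d + 4.0 * delta * s_xy * s_xy)) / (2.0 * s_xy)
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Illustrative data lying exactly on y = 2x + 1 (hypothetical values).
    print(deming_fit([0, 1, 2, 3], [1, 3, 5, 7]))
```

For points that lie exactly on a line, the fit recovers that line regardless of `delta`, which is a convenient sanity check before applying it to noisy data.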



Figure 9: Measurement Analysis of Delay



Figure 10: Measurement Analysis of Power



Figure 11: Measurement Analysis of Speed




The area of the proposed DNLDR-LFAE technique is reduced by 22% and 29% compared to [1] and [2], respectively. The delay of the proposed DNLDR-LFAE technique is reduced by 28% and 46% compared to [1] and [2], respectively. The power consumption of the proposed DNLDR-LFAE technique is reduced by 15% and 28% compared to [1] and [2], respectively. The speed value of the proposed DNLDR-LFAE technique is reduced by 28% and 33% compared to [1] and [2], respectively.
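The reported percentages can be checked directly against the values in *table 4*. The sketch below is only an arithmetic check; the dictionary keys and function names are illustrative choices, and the reduction is assumed to be computed as (baseline - proposed) / baseline.

```python
# Values copied from table 4 (16-bit adder).
RESULTS = {
    "Radix-16 Booth [1]": {"area": 954.19, "delay": 0.98, "power": 0.512, "speed": 8.94},
    "DCM [2]": {"area": 1045.21, "delay": 1.32, "power": 0.601, "speed": 9.68},
    "DNLDR-LFAE": {"area": 745.354, "delay": 0.71, "power": 0.435, "speed": 6.47},
}


def reduction_percent(baseline, proposed):
    """Relative reduction of `proposed` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline


def reductions(results, proposed_key="DNLDR-LFAE"):
    """Rounded percentage reduction of the proposed method against each baseline."""
    proposed = results[proposed_key]
    table = {}
    for name, metrics in results.items():
        if name == proposed_key:
            continue
        table[name] = {
            metric: round(reduction_percent(value, proposed[metric]))
            for metric, value in metrics.items()
        }
    return table


if __name__ == "__main__":
    for name, row in reductions(RESULTS).items():
        print(name, row)
```

Rounded to the nearest integer, this gives 22%, 28%, 15%, and 28% against [1], and 29%, 46%, 28%, and 33% against [2], for area, delay, power, and speed respectively.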

Example 1

Input A = 0 $\rightarrow$ Equivalent Binary Number = 00000000; //0

Input B = 65 $\rightarrow$ Equivalent Binary Number = 01000001; //65

SUM = 01000001 $\rightarrow$ Output value
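The three adder stages named in this paper (pre-processing, carry generation, post-processing) can be illustrated in software on the operands of Example 1. The sketch below is a behavioural model, not the authors' hardware design: it assumes a divide-and-conquer parallel-prefix network of the minimum-depth kind commonly associated with Ladner and Fischer, and the function name `prefix_adder`, the 8-bit default width, and the carry-in handling are illustrative assumptions.

```python
def prefix_adder(a, b, width=8, carry_in=0):
    """Add two unsigned integers with a parallel-prefix structure.

    Returns (sum modulo 2**width, carry_out).
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    # Pre-processing: per-bit generate (g) and propagate (p) signals.
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(width)]
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(width)]

    # Group signals, updated in place by the prefix network.
    # The carry-in is folded into the generate signal of bit 0.
    G = g[:]
    P = p[:]
    G[0] = g[0] | (p[0] & carry_in)

    # Carry generation: divide-and-conquer prefix over bits [lo, hi).
    def combine(lo, hi):
        if hi - lo <= 1:
            return
        mid = (lo + hi) // 2
        combine(lo, mid)
        combine(mid, hi)
        # Every bit in the upper half absorbs the prefix of the lower half.
        for i in range(mid, hi):
            G[i] = G[i] | (P[i] & G[mid - 1])
            P[i] = P[i] & P[mid - 1]

    combine(0, width)

    # Post-processing: sum bit = propagate XOR carry into that bit.
    carries = [carry_in] + G[:-1]
    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i
    return total, G[width - 1]


if __name__ == "__main__":
    s, c = prefix_adder(0, 65)
    print(format(s, "08b"), c)
```

With A = 0 and B = 65, the model returns the 8-bit sum 01000001 and a carry-out of 0, matching the output value of Example 1.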



Figure 12: Simulation Results of Proposed DNLDR-LFAE technique



Figure 13: VHDL design for Proposed DNLDR-LFAE technique




*Figures 12* and *13* show the simulation results and the VHDL design of the proposed DNLDR-LFAE technique. Based on the area obtained from the layouts, the full adders that require the smallest area are considered for minimum delay and power consumption. An experimental setup is used to compare the full adders. The full adder is designed using the logic structure and the DPL logic approach. The input buffers are sized so that the input signals experience some degradation, and their load equals that of small inverters in the target technology.

A key benefit of this simulation setup is that it accounts for the power consumed by the small inverters connected to the device under test (DUT) inputs. This power varies with the capacitive load that the DUT presents at its inputs. When the DUT has no direct power supply connections, the setup also covers the energy needed to charge and discharge the DUT internal nodes. The short-circuit consumption of the DUT is included as well, because the DUT receives signals with finite slopes from the buffers connected to its inputs rather than from ideal voltage sources.

#### 4.1 Real Time Application

Reducing the power consumption of 3D computer graphics processing is vital for faster migration to power-constrained devices such as PDAs, cell phones, and wearables. Texture mapping improves the realism of graphics processing by mapping a function, such as a 1D, 2D, or 3D array or a mathematical function, onto the surface of a 3D object. Higher-quality texture mapping requires an interpolation computation, which involves a large number of multiply-accumulate (MAC) computations and memory accesses. Power consumption can be reduced significantly at the highest abstraction levels, including the architectural level. Effort has been made to optimize multiply and MAC operations. Reconfigurable multiplier structures reduce power consumption based on zero inputs and input-rate variations. Since texture mapping inputs are non-zero, an alternative source of optimization is required. One architectural-level technique is data path width reduction in a hardwired and zero-multiplexed manner.
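To make the MAC and memory-access cost concrete, the sketch below samples a 2D texture with bilinear interpolation. Bilinear filtering is an assumption chosen for illustration, since the text does not name the interpolation scheme, and the function name `bilinear_sample` and the tiny texture are hypothetical. Each sample reads four texels and accumulates four weighted products, so the cost grows quickly with the number of pixels rendered.

```python
import math


def bilinear_sample(texture, u, v):
    """Sample a 2D texture (list of rows) at continuous texel coordinates (u, v)."""
    height, width = len(texture), len(texture[0])

    # Clamp the coordinates to the valid texel range.
    u = min(max(u, 0.0), width - 1)
    v = min(max(v, 0.0), height - 1)

    # Integer corners of the enclosing texel cell and fractional offsets.
    x0, y0 = int(math.floor(u)), int(math.floor(v))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = u - x0, v - y0

    # Four memory accesses, each paired with its interpolation weight.
    taps = [
        ((1 - fx) * (1 - fy), texture[y0][x0]),
        (fx * (1 - fy), texture[y0][x1]),
        ((1 - fx) * fy, texture[y1][x0]),
        (fx * fy, texture[y1][x1]),
    ]

    # Four multiply-accumulate operations per sample.
    acc = 0.0
    for weight, texel in taps:
        acc += weight * texel
    return acc


if __name__ == "__main__":
    tex = [[0, 10], [20, 30]]
    print(bilinear_sample(tex, 0.5, 0.5))
```

Sampling the centre of the 2x2 texture above averages the four texels, which is why fast, low-power multipliers and adders matter for this workload.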

### **5. CONCLUSION**

This paper shows that the effectiveness of the proposed DNLDR-LFAE technique increases significantly in the assessment. The work was carried out with advanced approaches that ensure stability. In the DNLDR-LFAE technique, Deming regression analysis pre-processes the input data. Carry generation and post-processing are then carried out, and the outputs are combined via convolution. Finally, the results are obtained at the output layer, reducing the power and time needed for adder enhancement of the digital multiplier. The digital multiplier is examined to reduce the error and to compare parameters such as area, delay, and power. The designed technique includes a Ladner-Fischer adder parallel-prefix network with parallel operation. The DNLDR-LFAE technique is fast and exact, and its VLSI hardware implementation is effective with less delay. The hardware complexity of the proposed DNLDR-LFAE technique is reduced by 52% to 63% when designed with the Ladner-Fischer adder.

In future work, the proposed method will be extended to achieve efficient adder enhancement of the digital multiplier with minimal power and time consumption by using a Deep Fully Connected Convolutional Neural Network with a Brent-Kung adder circuit.

#### REFERENCES

- [1] Y Mounica, K Naresh Kumar, Sreehari Veeramachaneni and Noor Mahammad S, "Energy efficient signed and unsigned radix 16 booth multiplier design," Computers & Electrical Engineering, Elsevier, vol. 90, pp. 1-8, 2021. https://doi.org/10.1016/j.compeleceng.2020.106892
- [2] Dina M. Ellaithy, Magdy A. El-Moursy, Amal Zaki and Abdelhalim Zekry, "Dual-Channel Multiplier for Piecewise-Polynomial Function Evaluation for Low-Power 3-D Graphics," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 27, no. 4, pp. 790-798, 2019. DOI: 10.1109/TVLSI.2018.2889769
- [3] Michael Opoku Agyeman, Ali Ahmadinia and Nader Bagherzadeh, "Energy and performance-aware application mapping for inhomogeneous 3D networks-on-chip," Journal of Systems Architecture, Elsevier, vol. 89, pp. 103-117, 2018. https://doi.org/10.1016/j.sysarc.2018.08.002
- [4] Pankaj Kumar and Rajender Kumar Sharma, "Low voltage high performance hybrid full adder," Engineering Science and Technology, an International Journal, Elsevier, vol. 19, no. 1, pp. 559-565, 2016. https://doi.org/10.1016/j.jestch.2015.10.001
- [5] S. Rakesh and K. S. Vijula Grace, "A comprehensive review on the VLSI design performance of different Parallel Prefix Adders," Materials Today: Proceedings, Elsevier, vol. 11, no. 3, pp. 1001-1009, 2019. https://doi.org/10.1016/j.matpr.2018.12.030
- [6] Bala Sindhuri Kandula, K. Padma Vasavi and I. Santi Prabha, "Area Efficient VLSI Architecture for Square Root Carry Select Adder Using Zero Finding Logic," Procedia Computer Science, Elsevier, vol. 89, pp. 640-650, 2016. https://doi.org/10.1016/j.procs.2016.06.028
- [7] Mehedi Hasan et al., "Gate Diffusion Input technique based full swing and scalable 1-bit hybrid Full Adder for high performance applications," Engineering Science and Technology, an International Journal, Elsevier, vol. 23, no. 6, pp. 1364-1373, 2020. https://doi.org/10.1016/j.jestch.2020.05.008
- [8] B. Ramkumar and Harish M. Kittur, "Faster and Energy-Efficient Signed Multipliers," VLSI Design, Hindawi Publishing Corporation, vol. 2013, pp. 1-18, 2013. https://doi.org/10.1155/2013/495354
- [9] Rekib Uddin Ahmed and Prabir Saha, "Implementation Topology of Full Adder Cells," Procedia Computer Science, Elsevier, vol. 165, pp. 676-683, 2019. https://doi.org/10.1016/j.procs.2020.01.063
- [10] Mohan Shoba and Rangaswamy Nakkeeran, "GDI based full adders for energy efficient arithmetic applications," Engineering Science and Technology, an International Journal, Elsevier, vol. 19, no. 1, pp. 485-496, 2016. https://doi.org/10.1016/j.jestch.2015.09.006
- [11] Sumbal Zahoor, Shahzad Naseem and Wei Meng, "Design and implementation of an efficient FIR digital filter," Cogent Engineering, Taylor & Francis, vol. 4, no. 1, pp. 1-15, 2017. https://doi.org/10.1080/23311916.2017.1323373
- [12] Mansi Jhamb, Garima and Himanshu Lohani, "Design, implementation and performance comparison of multiplier topologies in power-delay space," Engineering Science and Technology, an International Journal, Elsevier, vol. 19, no. 1, pp. 355-363, 2016. https://doi.org/10.1016/j.jestch.2015.08.006
- [13] Zhoufeng Ying et al., "Electronic-photonic arithmetic logic unit for high-speed computing," Nature Communications, vol. 11, no. 2154, pp. 1-15, 2020. DOI: 10.1038/s41467-020-16057-3
- [14] Mehedi Hasan et al., "Design of a Scalable Low-Power 1-Bit Hybrid Full Adder for Fast Computation," IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 67, no. 8, pp. 1464-1468, 2020. DOI: 10.1109/TCSII.2019.2940558
- [15] Samar Ghabraei, Morteza Rezaalipour, Masoud Dehyadegari and Mahdi Nazm Bojnordi, "AxCEM: Designing Approximate Comparator-Enabled Multipliers," Journal of Low Power Electronics and Applications, vol. 10, no. 1, pp. 1-9, 2020. DOI: 10.3390/jlpea10010009

- [16] Deepa Yagain, Vijaya Krishna A, and Akansha Baliga, "Design of High-Speed Adders for Efficient Digital Design Blocks," International Scholarly Research Notices, Hindawi Publishing Corporation, vol. 2012, pp. 1-18, 2012. https://doi.org/10.5402/2012/253742
- [17] Sakthivel B and Padma A, "Area and delay efficient GDI based accuracy configurable adder design," Microprocessors and Microsystems, Elsevier, vol. 73, pp. 1-15, 2020. https://doi.org/10.1016/j.micpro.2019.102958
- [18] Pulak Mondal, Pradyut Kumar Biswal and Swapna Banerjee, "FPGA based accelerated 3D affine transform for real-time image processing applications," Computers and Electrical Engineering, vol. 49, pp. 1-15, 2015. https://doi.org/10.1016/j.compeleceng.2015.04.017
- [19] Avni Agarwal, P. Harsha, Swati Vasishta, and S. Sivanantham, "Implementation of Special Function Unit for Vertex Shader Processor Using Hybrid Number System," Journal of Computer Networks and Communications, Hindawi Publishing Corporation, vol. 2014, pp. 1-18, 2014. https://doi.org/10.1155/2014/890354
- [20] Davide De Caro, Nicola Petra and Antonio G. M. Strollo, "High-performance special function unit for programmable 3-D graphics processors," IEEE Transactions on Circuits and Systems Part I: Regular Papers, vol. 56, no. 9, pp. 1968-1978, 2009. DOI: 10.1109/TCSI.2008.2010150
- [21] Alexander E. Shapiro, Francois Atallah, Kyugseok Kim, Jihoon Jeong, Jeff Fischer and Eby G. Friedman, "Adaptive power gating of 32-bit Kogge Stone adder," Integration, the VLSI Journal, Elsevier, vol. 53, no. C, pp. 80-87, 2016. https://doi.org/10.1016/j.vlsi.2015.12.001



© 2023 by M. Renuka and G. Mary Valantina. Submitted for possible open access publication under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).