

# Approximate Computing Using Voltage Over Scaling Technique for Image Compression

**Junqi Huang<sup>1\*</sup>, T. Nandha Kumar<sup>2</sup>, and Haider A. F. Almurib<sup>2</sup>**

<sup>1</sup>*School of Opto-Electronic & Comm. Eng., Xiamen University of Technology, Xiamen, 361024, China*

<sup>2</sup>*University of Nottingham Malaysia, Semenyih, 43500, Malaysia*

\*Correspondence: [huangjunqi@xmut.edu.cn](mailto:huangjunqi@xmut.edu.cn)

**ABSTRACT** - Approximate computing has been extensively adopted as a fault-tolerant method to achieve energy-efficient designs in image processing. This paper introduces a novel, integrated approximate approach for implementing runtime-based voltage over scaling (VOS) at both the circuit and algorithmic levels, specifically for approximate discrete cosine transform (ADCT) and zigzag low-complexity approximate DCT (ZLCADCT) in image compression. In the proposed VOS scheme, the supply voltage of exact and approximate adder cells is reduced below the nominal level, so that the output delay exceeds the worst-case delay and errors are generated in the addition results, while energy consumption is lowered. A mathematical model applicable to both exact and approximate adder cells using VOS is first presented. The results from this model align closely with simulation outcomes, validating its accuracy. Subsequently, an exhaustive simulation of 4-bit and 8-bit subtraction, followed by ADCT and ZLCADCT, is conducted using VOS. The error rate (ER), normalized mean error distance (NMED), and mean relative error distance (MRED) for the subtractor with approximate cells are significantly lower than those for the subtractor with exact cells under VOS conditions. In ADCT, approximate full adders can operate at lower supply voltages (around 0.77V) than exact full adders (around 0.83V) without a significant loss in Peak Signal-to-Noise Ratio (PSNR). As the number of approximate bits (NAB) increases, the total energy dissipation of ADCT decreases by 33.2%, with an additional 20% reduction achieved through the application of ZLCADCT with VOS.

**Keywords:** Approximate Computing, Voltage Over-Scaling, Approximate Full Adder, Subtractor, Approximate DCT.

## ARTICLE INFORMATION

**Author(s):** Junqi Huang, T. Nandha Kumar, and Haider A. F. Almurib; **Received:** 07/02/2025; **Accepted:** 06/11/2025; **Published:** 10/12/2025;

**E-ISSN:** 2347-470X;

**Paper Id:** IJEER250114;

**Citation:** 10.37391/ijeer.130408

**Webpage-link:**

<https://ijeer.forexjournal.co.in/archive/volume-13/ijeer-130408.html>

**Publisher's Note:** FOREX Publication stays neutral with regard to jurisdictional claims in Published maps and institutional affiliations.



## 1. INTRODUCTION

Approximate computing has been extensively used in error-resilient design to enhance energy efficiency by reducing circuit complexity and enabling circuits to deliver acceptable errors (approximation) [1-3]. Typically, approximate computing techniques have been applied in image processing at the algorithmic, logic, or circuit levels, without support for on-the-fly or runtime adjustments of the approximation [4,5]. At the circuit level, the primary approach is to reduce the number of transistors per adder cell, resulting in tolerable errors while lowering energy consumption and area [6-9]. Additionally, techniques such as voltage over scaling (VOS) [10-13], [39-41] and frequency scaling [14,15] are proposed to reduce energy dissipation and boost throughput in adder circuits. At the logic level, efforts focus on simplifying multi-bit adders and multipliers [16,18] and on enhancing their parallelism [19-21]. At the algorithmic level, research primarily focuses on the approximate discrete cosine transform (ADCT) [22-26] and algorithmic noise tolerance (ANT) [27].

The VOS technique is introduced as an energy-efficient method that lowers the supply voltage of circuits to reduce energy dissipation [27]. At the same time, it increases the delay beyond the worst-case delay, which can result in errors in the outputs [11]. In [11], VOS is initially applied to an approximate adder cell to achieve optimized approximation. Exact full adder (ExactFA)/approximate mirror adder 1 (AMA1) cells from [28] at various feature sizes are simulated and analyzed in detail in [11] using the VOS technique. The energy-efficient approximation is achieved at runtime by applying VOS to the approximate full adder; the approximation level (number of errors) in a given full adder circuit can be adjusted by controlling the supply voltage, without reconfiguring the design or adding extra circuits. In [11], the effects of process variations in transistors (particularly gate length) and of input frequency are evaluated under VOS, and a voltage over scaled Ripple Carry Adder (RCA) is assessed. The application of addition between two images is also performed. The results show that the energy dissipation of both adder cells decreases significantly as the supply voltage is lowered. An approximate adder cell, compared to the exact adder cell, reduces energy dissipation by 30% at maximum approximation.

In the present manuscript, compared with [11], a substantial amount of new work is presented to further investigate the VOS technique for exact/approximate adder cells. Specifically:

- A theoretical framework using mathematical modelling of VOS for exact/approximate adder cells is proposed. The results of the developed mathematical models are validated against simulation results, and they are found to be in close agreement.
- The 4-bit and 8-bit subtraction operations using VOS are exhaustively assessed in terms of their ER, NMED and MRED values. The results show that, under VOS, the ER, NMED, and MRED for the approximate adder cell (AMA1)-based subtractor are significantly lower than those of the exact full adder-based subtractor. The AMA1-based subtractor can sustain a much lower supply voltage (0.69V) at the maximum ER value than the exact full adder-based subtractor (0.92V).
- VOS on exact/approximate adder cells is finally applied to real-time applications of image compression techniques, such as approximate DCT (ADCT) and zigzag low-complexity approximate DCT (ZLCADCT), at the algorithmic level to realize a runtime-based approximate computing technique for image compression. Exact/approximate full adder-based RCAs and subtractors are applied to the addition and subtraction operations in ADCT and ZLCADCT. ZLCADCT is proposed in [23] as a deterministic technique that accurately configures the size of the transform matrix (T) according to the number of retained coefficients in the zigzag scanning process. When compared with ADCT, ZLCADCT decreases the number of addition operations and the energy consumption while retaining the PSNR of the compressed image [23]. In this paper, ZLCADCT using VOS is applied to further reduce the energy dissipation of voltage-over-scaled ADCTs. The results show that for both ADCT and ZLCADCT, an approximate full adder can sustain a significantly lower supply voltage than an exact full adder without a significant decrease in PSNR. Also, independent of the type of adder cells, the total energy dissipation for voltage over scaled ZLCADCT is lower than that for voltage over scaled ADCT, while maintaining the same PSNR.

The rest of this paper is organized as follows. *Section 2* reviews VOS on exact/approximate full adders. *Section 3* describes the mathematical model for VOS. *Section 4* presents the exhaustive simulation results for exact/approximate full adder-based subtractors using the VOS technique. *Section 5* presents the results for ADCT and ZLCADCT using VOS. *Section 6* concludes this paper.

## 2. REVIEW OF APPROXIMATE ADDERS USING VOS

VOS is applied to an exact full adder (ExactFA) and an approximate mirror adder 1 (AMA1) in *figure 1* at different feature sizes of transistors (16nm, 22nm, 32nm, and 45nm) in LTSPICE by using a low-power PTM model[29] for performance evaluation in [11].

The simulation process of using VOS is described as follows. All possible vectors (from 000 to 111) are applied as input signal transitions on the three input ports (A, B and C) under nominal input frequency. The initial values of the eight vector cases are chosen to vary the input voltage levels across all input ports, so that the voltage levels of A, B, and C change for all cases. The output threshold voltage (50% of the nominal supply voltage) is used as the determining level. If the output voltage does not reach the determining level, errors are introduced. The worst-case delay is evaluated when the first error appears at the Sum or Carry outputs by applying all input vectors. Then, the number of errors is recorded as the supply voltage is gradually reduced below the nominal supply voltage.

*Figure 2* shows the simulation results in [11] for a 32 nm exact full adder and AMA1. The so-called effective error is defined as the total number of errors on both Sum and Carry when all possible input vectors are applied [11]. For both the exact full adder and AMA1, the number of errors generally increases as the supply voltage decreases. AMA1 can sustain a lower supply voltage than the exact full adder for the same number of errors. The number of effective errors and the energy dissipation for different feature sizes (16nm, 22nm, 32nm, and 45nm) are also measured in [11] by using VOS. For both the exact and approximate full adders, the supply voltage can be scaled down to a lower level without introducing more effective errors as the feature size decreases. The energy dissipation also decreases as the supply voltage is reduced. The approximate adder cell, when compared with the exact adder cell, reduces energy dissipation by 30% at the maximum approximation. In addition, with VOS, the approximation of the approximate adder cell can be varied from a minimum value to a maximum value at runtime without incurring any additional hardware, while saving between 31.1% and 87% of the energy dissipation when compared with the exact adder cell.



**Figure 1.** (1) Single bit mirror exact full adder circuit (ExactFA) [31] and (2) single bit approximate mirror adder circuit (AMA1) [28].



**Figure 2.** The number of errors in Sum and Carry as well as effective errors for exact full adder and AMA1 at 32nm technology node under VOS[11].

Moreover, the proposed method is further validated in [11] by analyzing the effect of process variations (such as gate length) and input frequency on applying VOS to the adder cell. The results show that, irrespective of the type of adder, the supply voltage required for a given number of errors increases with increasing input frequency/transistor gate length. When applying VOS, the energy variation due to changes in gate length for the approximate full adder is significantly lower than that for the exact full adder. VOS is then applied to 4-bit and 8-bit RCAs for performance evaluation through exhaustive simulation. The results show that, under VOS, the ER and NMED for the approximate mirror adder cell 1 (AMA1)-based RCA are significantly lower than those for the exact full adder-based RCA; an energy saving of 62% is achieved by using the approximate adder-based RCA when compared with the exact full adder-based RCA at the maximum ER level. The following metrics [30][11] are used to evaluate the voltage over scaled RCA.

**Error Rate (ER):** The proportion of inaccurate results for all possible input cases.

**Error Distance (ED):** The absolute value of difference between accurate results ( $R$ ) and approximate results ( $\hat{R}$ ), i.e.,

$$ED = |R - \hat{R}| \quad (1)$$

**Mean Error Distance (MED):** The average ED value in the outputs for all input cases.

**Normalized Mean Error Distance (NMED):** normalized value of MED by the largest magnitude ( $R_{max}$ ) in the output of exact adder, i.e.

$$NMED = MED/R_{max} \quad (2)$$

**Mean Relative Error Distance (MRED):** mean value of the Relative Error Distance (RED) in the outputs for all input cases, where RED is defined as:

$$RED = ED/R \quad (3)$$
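As an illustration only, the metrics in eq. (1) to eq. (3) can be computed over an exhaustive set of input cases as in the sketch below. The function name `error_metrics` and the toy approximate adder `toy_approx_add` are hypothetical and are not circuits from this paper; the sketch also assumes that RED uses the magnitude of $R$ and that cases with $R = 0$ are skipped when averaging RED.

```python
def error_metrics(exact, approx):
    """Compute ER, MED, NMED and MRED from paired exact/approximate results."""
    n = len(exact)
    # Error distance per input case, eq. (1).
    eds = [abs(r - r_hat) for r, r_hat in zip(exact, approx)]
    er = sum(1 for ed in eds if ed != 0) / n       # proportion of wrong results
    med = sum(eds) / n                             # mean error distance
    r_max = max(abs(r) for r in exact)             # largest exact magnitude
    nmed = med / r_max                             # eq. (2)
    # Relative error distance, eq. (3); R = 0 cases are skipped (assumption).
    reds = [ed / abs(r) for ed, r in zip(eds, exact) if r != 0]
    mred = sum(reds) / len(reds)
    return {"ER": er, "MED": med, "NMED": nmed, "MRED": mred}


def toy_approx_add(a, b):
    """Hypothetical approximate adder: the lowest bit is OR-ed instead of added."""
    return (((a >> 1) + (b >> 1)) << 1) | ((a | b) & 1)


# Exhaustive 4-bit evaluation: all 256 input pairs.
pairs = [(a, b) for a in range(16) for b in range(16)]
exact_results = [a + b for a, b in pairs]
approx_results = [toy_approx_add(a, b) for a, b in pairs]
metrics = error_metrics(exact_results, approx_results)
```

The toy adder is wrong only when both least significant bits are 1, so one quarter of the input cases are in error, each with an error distance of 1.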

In [11], the method is further validated by applying it to image addition using exact and approximate full adder-based RCAs under VOS. The results indicate that, under VOS, the approximate circuit operates at a lower supply voltage and produces higher image quality than the exact circuit. The subsequent sections of this paper present the mathematical modeling of VOS, assess the performance of subtractors under VOS using the ER, NMED, and MRED metrics, and finally apply VOS to ADCT and ZLCADCT for further performance evaluation.

## 3. MATHEMATICAL MODELLING

This section presents the mathematical modeling of voltage over scaled full adder cells to confirm that the simulation results of the exact full adder and AMA1 under VOS match the theoretical framework results.

### 3.1. Equivalent full adder circuit

The equivalent full adder circuit proposed in [14] (figure 3) is used to design the proposed model. For the exact full adder and AMA1 shown in figure 1, PMOS and NMOS transistors are assumed to be electrically identical, with widths of  $2w$  and  $w$ , respectively. Thus, when an equivalent circuit is applied, the equivalent width for the equivalent Carry circuit of the exact full adder is  $13w/3$  for PMOS and  $13w/6$  for NMOS, while that for the equivalent Sum circuit of the exact full adder is  $39w/6$  for PMOS and  $39w/12$  for NMOS [14]. As for AMA1, the equivalent widths of the Carry circuit are  $4w$  for PMOS and  $5w/2$  for NMOS, while those of the Sum circuit are  $22w/3$  and  $10w/3$  for PMOS and NMOS, respectively.



**Figure 3.** Equivalent full adder circuit for exact full adder (ExactFA) /AMA1 cell [14].

In figure 3,  $v_g(t)$ ,  $i_2(t)$ ,  $v_2(t)$  and  $V_{DD}$  represent the gate-source voltage, output current, output voltage and supply voltage, respectively [14]. According to [32, 33],  $i_2(t)$  can be described by eq. (4) as a generic non-linear scalar function made up of pull-up  $I_H(t)$  and pull-down  $I_L(t)$  currents that depend on  $v_g(t)$  and  $v_2(t)$  [14].  $w_j(v_g(t))$  (where  $j=H,L$ ) describes the nonlinear characteristics of the up and down transitions caused by the step input  $v_g(t)$  [14, 33]. The nonlinear dynamic term  $i_j(v_2(t), d/dt)$  represents the behavior of the output voltage  $v_2(t)$  during either the pull-up or the pull-down process [14, 33], and it can be further expressed as eq. (5).

$$\begin{aligned} i_2(t) &= I_H(v_g(t), v_2(t)) + I_L(v_g(t), v_2(t)) \\ &= w_H(v_g(t)) \cdot i_H\left(v_2(t), \frac{d}{dt}\right) + w_L(v_g(t)) \cdot i_L\left(v_2(t), \frac{d}{dt}\right) \end{aligned} \quad (4)$$

$$i_j(v_2(t), d/dt) = G_j(v_2(t)) + C_j(v_2(t)) \frac{dv_2(t)}{dt}; j = H, L \quad (5)$$

In eq. (5),  $G_j(v_2(t))$  and  $C_j(v_2(t)) dv_2(t)/dt$  are the conduction current and the displacement current, respectively [14, 33]. The conduction current of the transistor is the drain current created by the conduction of charge carriers from the drain to the source [14]. In the proposed model, the charge carriers are assumed to be at saturation velocity, and the displacement current depends on the nonlinear output capacitance  $C_j(v_2(t))$  [14].  $C_j$  is attributed to the gate capacitance, the overlap capacitance, and the depletion capacitance [14]. Thus, the output current through  $C_j$  is given by eq. (6) with respect to the output voltage  $v_2$  [14].

$$C_j \frac{dv_2}{dt} = i_j \quad (6)$$

For the pull-up process (PMOS) or the pull-down process (NMOS),  $i_j$  consists of the saturation current (eq. (7) for PMOS or eq. (9) for NMOS) and the current in the non-saturation state (eq. (8) for PMOS or eq. (10) for NMOS) [42]. When  $v_2 = |V_{T,p}|$  at time  $t_1$  (or  $v_2 = V_{DD} - V_{T,n}$  at time  $t_0$ ), the PMOS (NMOS) transistor changes from the saturation state to the non-saturation state [14, 42].  $V_{T,p}$  and  $V_{T,n}$  are the threshold voltages of the transistors, and  $\beta_p$  and  $\beta_n$  are the transistor transconductances [14].

$$i_j = \frac{\beta_p}{2} \cdot (V_{DD} - |V_{T,p}|)^2 \quad (7)$$

$$i_j = \frac{\beta_p}{2} [2(V_{DD} - |V_{T,p}|)v_2 - v_2^2] \quad (8)$$

$$i_j = -\frac{\beta_n}{2} \cdot (V_{DD} - V_{T,n})^2 \quad (9)$$

$$i_j = -\frac{\beta_n}{2} [2(V_{DD} - V_{T,n})v_2 - v_2^2] \quad (10)$$

For a PMOS during pull-up process,  $v_2$  is given by *eq. (11)* when  $t < t_1$  by using *eq. (6)* and *eq. (7)* [14, 42].

$$v_2(t) = \frac{\beta_p(V_{DD} - |V_{T,p}|)^2 t}{2C_j} \quad (11)$$

For  $t \geq t_1$ ,  $v_2$  is given by *eq. (12)*, obtained from *eq. (6)* and *eq. (8)* for PMOS [14, 42].

$$v_2(t) = V_{DD} - (V_{DD} - |V_{T,p}|) \left[ \frac{2e^{-(t-t_1)/\tau_p}}{1+e^{-(t-t_1)/\tau_p}} \right] \quad (12)$$

$$\text{Where } t_1 = \frac{2C_j|V_{T,p}|}{\beta_p(V_{DD} - |V_{T,p}|)^2} \quad (13)$$

and the charging time constant  $\tau_p$  for PMOS is given by *eq. (14)*

$$\tau_p = \frac{C_j}{\beta_p(V_{DD} - |V_{T,p}|)} \quad (14)$$

Next, for a NMOS transistor during pull-down process, when  $t < t_0$ ,  $v_2$  is given by *eq. (15)* according to *eq. (6)* and *eq. (9)* [42].

$$v_2(t) = V_{DD} - \frac{\beta_n(V_{DD} - V_{T,n})^2 t}{2C_j} \quad (15)$$

For  $t \geq t_0$  for NMOS,  $v_2$  is given by *eq. (16)*, obtained from *eq. (6)* and *eq. (10)* [42].

$$v_2(t) = (V_{DD} - V_{T,n}) \left[ \frac{2e^{-(t-t_0)/\tau_n}}{1+e^{-(t-t_0)/\tau_n}} \right] \quad (16)$$

$$\text{Where } t_0 = \frac{2C_j V_{T,n}}{\beta_n(V_{DD} - V_{T,n})^2} \quad (17)$$

and the discharging time constant  $\tau_n$  for NMOS is given by *eq. (18)*

$$\tau_n = \frac{C_j}{\beta_n(V_{DD} - V_{T,n})} \quad (18)$$
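The piecewise output voltage in eq. (11) to eq. (18) can be evaluated numerically as in the following sketch. It is a direct transcription of those equations under stated assumptions: the function names are hypothetical, and the parameter values used in the example (threshold voltage, transconductance, capacitance) are illustrative placeholders rather than values from the PTM model used in this paper.

```python
import math


def pull_up_v2(t, vdd, vtp_abs, beta_p, c_j):
    """Output voltage while charging through the PMOS, eq. (11) to eq. (14)."""
    v_ov = vdd - vtp_abs                                # V_DD - |V_T,p|
    t1 = 2 * c_j * vtp_abs / (beta_p * v_ov ** 2)       # eq. (13)
    if t < t1:                                          # saturation, eq. (11)
        return beta_p * v_ov ** 2 * t / (2 * c_j)
    tau_p = c_j / (beta_p * v_ov)                       # eq. (14)
    e = math.exp(-(t - t1) / tau_p)
    return vdd - v_ov * (2 * e / (1 + e))               # eq. (12)


def pull_down_v2(t, vdd, vtn, beta_n, c_j):
    """Output voltage while discharging through the NMOS, eq. (15) to eq. (18)."""
    v_ov = vdd - vtn                                    # V_DD - V_T,n
    t0 = 2 * c_j * vtn / (beta_n * v_ov ** 2)           # eq. (17)
    if t < t0:                                          # saturation, eq. (15)
        return vdd - beta_n * v_ov ** 2 * t / (2 * c_j)
    tau_n = c_j / (beta_n * v_ov)                       # eq. (18)
    e = math.exp(-(t - t0) / tau_n)
    return v_ov * (2 * e / (1 + e))                     # eq. (16)


# Illustrative placeholder parameters (assumptions, not PTM values).
VT, BETA, CJ = 0.3, 1e-4, 1e-15
SAMPLE_TIME = 2e-11  # fixed observation instant in seconds

v_nominal = pull_up_v2(SAMPLE_TIME, 1.0, VT, BETA, CJ)
v_scaled = pull_up_v2(SAMPLE_TIME, 0.8, VT, BETA, CJ)
```

With these placeholder values, the output sampled at the same instant is lower at the reduced supply voltage, which is the mechanism by which VOS causes an output to miss the determining level and produce an error.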

### 3.2. Evaluation of Voltage-Scaled Adders

Using the VOS technique on the equivalent circuit, as the applied output voltage varies from low to high, the NMOS is in cutoff, and the output capacitance is charged through the PMOS. Hence, the average output voltage can be estimated through *eq. (11)* to *eq. (14)*. Meanwhile, if the output voltage levels of applied cases in Sum or Carry decrease from high to low, the PMOS is in cutoff and the output capacitance is discharging through NMOS; thus at this moment, *eq. (15)* to *eq. (18)* are used to estimate their average output voltages.

When the proposed model is applied at the circuit level by using VOS,  $C_j$  is calculated using only the gate capacitance. To account for the overlap capacitance and the depletion capacitance,  $C_j$  in *eq. (6)* is multiplied by a voltage-dependent fitting parameter  $R_x(V_{DD})$ . The final output capacitance  $C$  is the product of  $C_j$  and  $R_x(V_{DD})$ , where  $R_x(V_{DD})$  changes with the supply voltage  $V_{DD}$ .

The fitting parameters of the pull-up process (Low-to-High in the output) by PMOS for the Sum and Carry of the exact full adder ( $\text{ExactFA}_{LH\text{sum}}$  and  $\text{ExactFA}_{LH\text{carry}}$ ) as well as AMA1 ( $\text{AMA1}_{LH\text{sum}}$  and  $\text{AMA1}_{LH\text{carry}}$ ) are given by *eq. (22)* through *eq. (25)*. Also, the fitting parameters of the pull-down process (High-to-Low in the output) by NMOS for the Sum and Carry of the exact full adder ( $\text{ExactFA}_{HL\text{sum}}$  and  $\text{ExactFA}_{HL\text{carry}}$ ) as well as AMA1 ( $\text{AMA1}_{HL\text{sum}}$  and  $\text{AMA1}_{HL\text{carry}}$ ) are given by *eq. (26)* through *eq. (29)*. Hence the output capacitance is given by *eq. (19)*. Parameter  $a$  in *eq. (22)* to *eq. (29)* is set as  $a = \lceil \sqrt{\lambda \cdot 10^9} \rceil$ , where  $\lceil \cdot \rceil$  denotes the ceiling function (the least integer that is no less than its argument) and  $\lambda$  is the channel length of a transistor.

$$C = C_j \cdot R_x(V_{DD}) \quad (19)$$

Where for the pull-up process by PMOS:

$$x \in \left\{ \text{ExactFA}_{LH\text{sum}}, \text{ExactFA}_{LH\text{carry}}, \text{AMA1}_{LH\text{sum}}, \text{AMA1}_{LH\text{carry}} \right\} \quad (20)$$

And for the pull-down process by NMOS:

$$x \in \left\{ \text{ExactFA}_{HL\text{sum}}, \text{ExactFA}_{HL\text{carry}}, \text{AMA1}_{HL\text{sum}}, \text{AMA1}_{HL\text{carry}} \right\} \quad (21)$$

$$R_{\text{ExactFA}_{LH\text{sum}}}(V_{DD}) = \begin{cases} 0.5a + 1, & V_{DD} \geq V_{ELHs} \\ 1 + 0.5a + \sum_{i=1}^{100(V_{DD}-V_{ELHs})} (0.06a - 0.14)i, & V_{DD} < V_{ELHs} \end{cases} \quad (22)$$

Where  $V_{ELHs} = 0.88 + 0.02a$  for pull-up process by PMOS of exact full adder in Sum

$$R_{\text{ExactFA}_{LH\text{carry}}}(V_{DD}) = \begin{cases} 0.1a - 0.3, & V_{DD} \geq V_{ELHc} \\ 0.1a - 0.3 + \sum_{i=1}^{100(V_{DD}-V_{ELHc})} (0.1a - 0.3)i, & V_{DD} < V_{ELHc} \end{cases} \quad (23)$$

Where  $V_{ELHc} = 0.81 - 0.03a$  for pull-up process by PMOS of exact full adder in Carry

$$R_{\text{AMA1}_{LH\text{sum}}}(V_{DD}) = \begin{cases} a - 0.5, & V_{DD} \geq V_{ALHs} \\ a - 0.5 + \sum_{i=1}^{100(V_{DD}-V_{ALHs})} (0.01a - 0.03)i, & V_{DD} < V_{ALHs} \end{cases} \quad (24)$$

Where  $V_{ALHs} = 0.8 + 0.06a$  for pull-up process by PMOS of AMA1 in Sum

$$R_{\text{AMA1}_{LH\text{carry}}}(V_{DD}) = \begin{cases} 0.1a - 0.3, & V_{DD} \geq V_{ALHc} \\ 0.1a - 0.3 + \sum_{i=1}^{100(V_{DD}-V_{ALHc})} (0.01a + 0.01)i, & V_{DD} < V_{ALHc} \end{cases} \quad (25)$$

Where  $V_{ALHc} = 0.88 - 0.04a$  for pull-up process by PMOS of AMA1 in Carry

$$R_{\text{ExactFA}_{HL\text{sum}}}(V_{DD}) = \begin{cases} 1.5a - 2, & V_{DD} \geq V_{EHLs} \\ 1.5a - 2 + \sum_{i=1}^{100(V_{DD}-V_{EHLs})} (0.05a + 0.15)i, & V_{DD} < V_{EHLs} \end{cases} \quad (26)$$

Where  $V_{EHLs} = 0.86 + 0.01a$  for pull-down process by NMOS of exact full adder in Sum

$$R_{\text{ExactFA}_{HL\text{carry}}}(V_{DD}) = \begin{cases} 0.5a - 1, & V_{DD} \geq V_{EHLc} \\ 0.5a - 1 + \sum_{i=1}^{100(V_{DD}-V_{EHLc})} (0.04a - 0.06)i, & V_{DD} < V_{EHLc} \end{cases} \quad (27)$$

Where  $V_{EHLc} = 0.98 - 0.02a$  for pull-down process by NMOS of exact full adder in Carry

$$R_{\text{AMA1}_{HL\text{sum}}}(V_{DD}) = \begin{cases} 0.3a - 1, & V_{DD} \geq V_{AHLs} \\ 0.3a - 1 + \sum_{i=1}^{100(V_{DD}-V_{AHLs})} (0.1a - 0.1)i, & V_{DD} < V_{AHLs} \end{cases} \quad (28)$$

Where  $V_{AHLs} = 0.65 + 0.01a$  for pull-down process by NMOS of AMA1 in Sum

$$R_{\text{AMA1}_{HL\text{carry}}}(V_{DD}) = \begin{cases} a - 2, & V_{DD} \geq V_{AHLc} \\ a - 2 + \sum_{i=1}^{100(V_{DD}-V_{AHLc})} (0.06a - 0.12)i, & V_{DD} < V_{AHLc} \end{cases} \quad (29)$$

Where  $V_{AHLc} = 0.81 + 0.01a$  for pull-down process by NMOS of AMA1 in Carry
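To make the constants in eq. (22) to eq. (29) easier to inspect, the sketch below tabulates the threshold voltages and the fitting parameters for the upper branch of each equation ( $V_{DD}$  at or above the threshold) for a given feature size. It assumes that  $a$  is the ceiling of the square root of the channel length expressed in nanometres, as the description of the ceiling function suggests. The dictionary keys and function names are hypothetical, and the summation branch for  $V_{DD}$  below the threshold is deliberately not covered.

```python
import math

# (intercept, slope) of each threshold voltage V = intercept + slope * a.
THRESHOLD_COEFFS = {
    "ExactFA_LH_sum": (0.88, 0.02),
    "ExactFA_LH_carry": (0.81, -0.03),
    "AMA1_LH_sum": (0.80, 0.06),
    "AMA1_LH_carry": (0.88, -0.04),
    "ExactFA_HL_sum": (0.86, 0.01),
    "ExactFA_HL_carry": (0.98, -0.02),
    "AMA1_HL_sum": (0.65, 0.01),
    "AMA1_HL_carry": (0.81, 0.01),
}

# (slope, intercept) of R = slope * a + intercept for V_DD >= threshold.
UPPER_BRANCH_COEFFS = {
    "ExactFA_LH_sum": (0.5, 1.0),
    "ExactFA_LH_carry": (0.1, -0.3),
    "AMA1_LH_sum": (1.0, -0.5),
    "AMA1_LH_carry": (0.1, -0.3),
    "ExactFA_HL_sum": (1.5, -2.0),
    "ExactFA_HL_carry": (0.5, -1.0),
    "AMA1_HL_sum": (0.3, -1.0),
    "AMA1_HL_carry": (1.0, -2.0),
}


def fitting_index(feature_nm):
    """Parameter a, assumed to be ceil(sqrt(channel length in nm))."""
    return math.ceil(math.sqrt(feature_nm))


def threshold_voltage(case, feature_nm):
    """Threshold supply voltage below which the summation branch applies."""
    intercept, slope = THRESHOLD_COEFFS[case]
    return intercept + slope * fitting_index(feature_nm)


def upper_branch_parameter(case, feature_nm):
    """Fitting parameter R_x for V_DD at or above the threshold voltage."""
    slope, intercept = UPPER_BRANCH_COEFFS[case]
    return slope * fitting_index(feature_nm) + intercept


for case in THRESHOLD_COEFFS:
    print(case, round(threshold_voltage(case, 32), 2),
          round(upper_branch_parameter(case, 32), 2))
```

Under this assumption, the 32nm and 45nm nodes give  $a = 6$  and  $a = 7$ , respectively.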

Figure 4 presents the comparison analysis of the estimated results against the simulated average results, including the maximum and minimum voltage values. This evaluation focuses on the output voltage of the exact full adder and AMA1 using the VOS technique at the 32nm technology node. Additionally, Figure 5 presents the comparison based on 45nm voltage over scaled versions of the exact cell and AMA1.

It can be seen from Figures 4 and 5 that for both the pull-up process and the pull-down process, the estimated results obtained from the proposed model are in close agreement with the simulated average results and lie well within the range of  $v_2$  between its largest and smallest simulated values. Irrespective of the type of adder, during the pull-up process by PMOS, the average output voltage of both Sum and Carry decreases non-linearly when the supply voltage is reduced.



**Figure 4.** The estimated results and simulated average results of the output voltage  $v_2$  ((a) for the pull-up process by PMOS and (b) for the pull-down process by NMOS) of 32nm exact full adder and AMA1 by using VOS



**Figure 5.** The estimated results and simulated average results of the output voltage  $v_2$  ((a) for the pull-up process by PMOS and (b) for the pull-down process by NMOS) of 45nm exact full adder and AMA1 by using VOS technique

However, during the pull-down process by NMOS for both the exact full adder and AMA1, the average output voltage of both Sum and Carry increases non-linearly as the supply voltage is continuously scaled down, until the supply voltage falls to around 0.7V (approximately 0.73V to 0.67V). When the supply voltage for the pull-down process is reduced further, below around 0.7V, there is a slight decrease in the output voltages of both the exact full adder and AMA1 in Sum and Carry. This is mainly because the adder's output voltage is also affected by its supply voltage, and the output voltage cannot exceed the supply voltage. Meanwhile, independent of the type of adder, the output capacitance decreases when the supply voltage of the adder circuits is reduced, as per *eq. (19)* through *eq. (29)*.

## 4. VOS ON SUBTRACTION

In this section, 4-bit and 8-bit subtraction are assessed by using 32nm & 45nm exact full adders and AMA1 when subjected to VOS. During the process of subtraction ' $a - b$ ', the ' $- b$ ' is converted into 2's complement format for performing additions between ' $a$ ' and ' $- b$ ' by using RCAs. An exhaustive process is employed for the input vectors.
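The subtraction scheme described above (adding ' $a$ ' to the 2's complement of ' $b$ ' with an RCA) can be sketched at the bit level as follows. This is a functional model with exact full adder logic only; it does not model VOS timing errors or the AMA1 cell. The function names are hypothetical, and the extra sign bit added to the word width is an assumption made here so that negative differences of unsigned operands can be represented.

```python
def full_adder(a, b, cin):
    """Exact single-bit full adder: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ cin
    carry_out = (a & b) | (cin & (a ^ b))
    return sum_bit, carry_out


def ripple_carry_add(x, y, width, cin=0):
    """Add two width-bit words, least significant bit first."""
    result, carry = 0, cin
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry


def subtract(a, b, n_bits):
    """Compute a - b for unsigned n_bits operands via 2's complement."""
    width = n_bits + 1                       # one extra sign bit (assumption)
    mask = (1 << width) - 1
    b_inverted = ~b & mask                   # 1's complement of b
    # Adding the inverted word with carry-in 1 forms a + (2's complement of b).
    raw, _ = ripple_carry_add(a, b_inverted, width, cin=1)
    if raw >> (width - 1):                   # sign bit set: negative result
        raw -= 1 << width
    return raw


# Exhaustive check over all 4-bit input pairs.
mismatches = [(a, b) for a in range(16) for b in range(16)
              if subtract(a, b, 4) != a - b]
```

In an exhaustive evaluation of this kind, replacing `full_adder` with a model of a voltage over scaled or approximate cell would turn some of these cases into errors, which is what the ER, NMED, and MRED metrics quantify.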



**Figure 6.** Number of errors and variation in Sum for 4-bit and 8-bit subtraction using voltage over scaled exact full adder and AMA1

*Figure 6* illustrates the number of errors and the variations in Sum for 4-bit and 8-bit subtractors using the exact full adder and AMA1 when subjected to VOS. It can be seen from *figure 6* that for both adders, the number of errors increases as the supply voltage decreases. Independent of the type of adder, the VOS operation range becomes larger as the feature size of the transistors decreases. Irrespective of the feature size, the supply voltage of the AMA1-based subtractor can be scaled down to a lower level while keeping a lower number of errors than the exact full adder-based subtractor. For example, for 4-bit subtraction, the number of errors using the 32nm exact full adder rises sharply at 0.97V VDD and reaches the maximum (260 errors) at 0.92V VDD. However, the number of errors for the 32nm AMA1 remains stable at 215 until the supply voltage falls below 0.77V. Also, the variations in Sum between exact results and approximate results for the AMA1-based subtractor are significantly lower than those for the exact full adder-based subtractor when the supply voltage varies between around 0.98V and 0.8V. Moreover, *figure 7* and *figure 8* illustrate the ER, NMED, and MRED results of 4-bit and 8-bit voltage over scaled subtractors using a 32nm/45nm exact full adder and AMA1. Since the exact value  $R$  is the denominator in *eq. (3)*,  $R$  cannot be zero. Thus, all cases of  $R=0$  are excluded from the calculation of MRED. The results show that the ER, NMED, and MRED increase as the supply voltage decreases. For any feature size, the ER, NMED, and MRED for the AMA1-based subtractor are significantly lower than those for the exact full adder-based subtractor when the supply voltage falls below around 0.98V. Independent of the type of adder, the 32nm subtractor can be operated at a lower voltage than the 45nm subtractor without increasing the ER. Meanwhile, the ER results for the exact full adder-based subtractor and the AMA1-based subtractor reach their maximum levels at around 0.92V and 0.69V, respectively. In contrast, the NMED for the exact full adder-based subtractor decreases noticeably at around 0.84V VDD.

## 5. APPROXIMATE DCT AND ZLCADCT USING VOS

In this section, the VOS technique is applied to a traditional ADCT [22] and an improved ADCT technique (ZLCADCT [23]) to realize optimized approximation from the circuit level to the algorithmic level by using 32nm/45nm exact full adders and AMA1. The ADCT matrix  $T_a$  from [22] (with  $a = 0$ ) is initially selected for evaluation using the VOS technique, and the number of retained coefficients in zigzag scanning is 10. The NAB represents the number of bits (starting from the lowest bit) to which voltage over scaling is applied in an RCA. Next, ZLCADCT [23] is applied to improve the energy performance of ADCT [22] using the VOS technique. ZLCADCT is a deterministic technique that accurately configures the size of the transform matrix ( $T$ ) of ADCT according to the number of retained coefficients, such that all computations of unused coefficients can be totally avoided [23]. Meanwhile, ZLCADCT [23] does not cause degradation in output image quality, and the PSNR of ZLCADCT is unchanged when compared with ADCT. The PSNR results of ADCT [22] and ZLCADCT [23] for NAB values of 2, 4 and 6 are plotted in *figure 9*. *Figure 10* shows the Sum in pixel variation for ADCT [22] and ZLCADCT [23], obtained by comparing the input image with the output images at different NAB values.
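The link between the number of retained zigzag coefficients and the part of the transform that is actually needed can be illustrated with the short sketch below. It assumes an 8×8 block and the conventional JPEG-style zigzag order; both are assumptions made for illustration, and the exact matrix configuration used by ZLCADCT is defined in [23], not here. The function names are hypothetical.

```python
def zigzag_order(n=8):
    """Return (row, col) index pairs of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):                    # s = row + col (anti-diagonal)
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:                            # even diagonals run upward
            rows = reversed(rows)
        order.extend((i, s - i) for i in rows)
    return order


def retained_extent(num_retained, n=8):
    """Rows and columns of the coefficient block touched by the retained set."""
    retained = zigzag_order(n)[:num_retained]
    rows_needed = max(i for i, _ in retained) + 1
    cols_needed = max(j for _, j in retained) + 1
    return rows_needed, cols_needed


first_ten = zigzag_order(8)[:10]
rows_needed, cols_needed = retained_extent(10)
```

With 10 retained coefficients, only the top-left 4×4 corner of the 8×8 coefficient block is touched, so under these assumptions the remaining coefficients never need to be computed.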

As shown in *figure 9*, the PSNRs for ADCT [22] and ZLCADCT [23] are the same when using the VOS technique. It can also be seen from *figure 9* that, irrespective of the feature size of the transistors, the PSNR of DCT for both adders decreases overall as the supply voltage decreases, although there is a slight increase in the PSNR of the exact full adder at low voltage levels ( $VDD < 0.68V$ ). AMA1 keeps its PSNR essentially unchanged over a wider voltage range (from 1.1V down to around 0.78V for both 32nm and 45nm adders at NAB=2) than the exact full adder (from 1.1V down to around 0.84V for both 32nm and 45nm adders at NAB=2). This means that AMA1 can be scaled down to a lower supply voltage than the exact full adder without causing a significant degradation in output image quality. Regardless of the type of adder, DCT using 32nm adders can operate at a slightly lower supply voltage than DCT using 45nm adders while maintaining the same PSNR value. In addition, the decrease in the PSNR of DCT for the exact full adder is much sharper than that for AMA1. The PSNR of the exact cell drops suddenly at a significantly higher supply voltage (around 0.83V), whereas the PSNR reduction for AMA1 is less severe and occurs at a lower supply voltage (around 0.77V). The sudden decrease in the PSNR of DCT for the exact full adder is mainly because the pixel variation for Sum increases sharply at a specific supply voltage (shown in figure 10).
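For reference, PSNR between an input image and a processed output image is conventionally computed from the mean squared error as  $10 \log_{10}(peak^2 / MSE)$ . The sketch below assumes 8-bit grayscale images stored as nested lists with a peak value of 255; the function name and the tiny test images are hypothetical and are not data from this paper.

```python
import math


def psnr(original, processed, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    flat_original = [p for row in original for p in row]
    flat_processed = [p for row in processed for p in row]
    if len(flat_original) != len(flat_processed):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2
              for a, b in zip(flat_original, flat_processed)) / len(flat_original)
    if mse == 0:
        return math.inf                      # identical images
    return 10 * math.log10(peak ** 2 / mse)


# Tiny hypothetical 2x2 images: every pixel differs by exactly 1.
reference = [[10, 20], [30, 40]]
degraded = [[11, 19], [31, 39]]
psnr_value = psnr(reference, degraded)
```

A larger pixel variation between input and output raises the mean squared error and therefore lowers the PSNR, which is the relationship between figure 9 and figure 10 described above.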



**Figure 7.** ER, NMED and MRED of 4-bit subtraction using voltage over scaled exact full adder and AMA1



**Figure 8.** ER, NMED and MRED of 8-bit subtraction using voltage over scaled exact full adder and AMA1



**Figure 9.** PSNR(in dB) of DCT for ADCT[22] and ZLCADCT[23] by applying the cases of NAB=2 for (1), NAB=4 for (2) and NAB=6 for (3) by using VOS



**Figure 10.** Sum in pixel variation of DCT for ADCT[22] and ZLCADCT[23] by applying the cases of NAB=2 for (1), NAB=4 for (2) and NAB=6 for (3) by using VOS

Meanwhile, figure 9 and figure 10 show that, as the NAB value increases, the range over which the supply voltage can be scaled down at the same PSNR level becomes smaller for both adders. Under VOS, increasing the NAB value raises the sum in pixel variation and lowers the PSNR significantly.

**Table 1.** Applied supply voltage from  $vo_1$  to  $vo_3$  for 32nm ExactFA and AMA1

| Supply Voltage | 32nm ExactFA(V) | 32nm AMA1(V) |
|----------------|-----------------|--------------|
| ( $vo_1$ )     | 0.97            | 0.93         |
| ( $vo_2$ )     | 0.81            | 0.76         |
| ( $vo_3$ )     | 0.7             | 0.63         |

The output images using the 32nm exact full adder (ExactFA)/AMA1 under three specific supply voltages ( $vo_1$ ,  $vo_2$  and  $vo_3$ ) for ADCT [22] and ZLCADCT [23] across NAB values of 2, 4 and 6 are illustrated in *figure 11*, and their PSNR values are shown in *table 2*. The three scaled-down supply voltages ( $vo_1$ ,  $vo_2$  and  $vo_3$ ) applied to the 32nm exact full adder and AMA1 are given in *table 1*. Every voltage applied to the 32nm AMA1 (e.g. 0.93V for  $vo_1$ ) is lower than the corresponding voltage applied to the 32nm exact full adder (e.g. 0.97V for  $vo_1$ ). When the supply voltage decreases from  $vo_1$  to  $vo_3$ , new errors are introduced gradually. In *figure 11*, for any NAB value, the output images of AMA1-based DCT are significantly better than those of exact full adder-based DCT at each of the supply voltage levels  $vo_1$ ,  $vo_2$  and  $vo_3$ . The difference in output image quality between the exact full adder-based DCT and the AMA1-based DCT becomes more obvious as the NAB value increases.



**Figure 11.** Output images of ADCT[22] and ZLCADCT[23] when NAB=2, 4 and 6 for 32nm exact full adder and AMA1 by using VOS technique (from  $vo_1$  to  $vo_3$ )

As the PSNR values in *table 2* show, for either approximate DCT, the PSNR of the DCT using AMA1 is higher than that of the DCT using the exact full adder at every NAB value and for all three supply voltage levels  $vo_1$ ,  $vo_2$  and  $vo_3$ . Since each AMA1 voltage level is lower than the corresponding exact full adder level (*table 1*), this means that the AMA1-based DCT produces better output images even when operated at a lower supply voltage.
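For reference, PSNR follows the standard definition $10\log_{10}(\mathrm{MAX}^2/\mathrm{MSE})$. The sketch below is a minimal illustration that assumes 8-bit grayscale images stored as lists of rows; the function name `psnr` and the toy pixel values are hypothetical and are not taken from the simulation setup of this work.

```python
import math


def psnr(reference, test, peak=255):
    """Return the PSNR in dB between two equally sized grayscale images.

    Images are given as lists of rows of integer pixel values.
    `peak` is the maximum pixel value (255 is assumed for 8-bit images).
    """
    if len(reference) != len(test) or any(
        len(r) != len(t) for r, t in zip(reference, test)
    ):
        raise ValueError("images must have the same dimensions")

    squared_error = 0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            squared_error += (r - t) ** 2
            count += 1

    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10 * math.log10(peak ** 2 / mse)


# Toy 2x2 example (hypothetical values): one pixel differs by 4 grey levels.
original = [[52, 55], [61, 59]]
decoded = [[52, 55], [61, 63]]
value = psnr(original, decoded)
print(f"PSNR = {value:.2f} dB")
```

A lower PSNR corresponds to a larger mean squared error between the reference image and the image reconstructed after the voltage-overscaled DCT.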

The energy dissipation of a completed DCT operation on the “Lena” image and the corresponding PSNR values are reported in *table 2* for ADCT[22] and ZLCADCT[23] at three scaled-down voltages ( $vo_1$ ,  $vo_2$  and  $vo_3$ ) using the 32nm exact full adder and AMA1 across NAB values of 2, 4 and 6. *Table 2* shows that, irrespective of the ADCT type, the total energy dissipation of the DCT decreases as the supply voltage decreases. For any NAB value, the energy dissipation of the DCT with AMA1 is lower than that with the exact full adder at every supply voltage level.

**Table 2.** Total energy dissipation and PSNR (NAB=2, 4 and 6) for ADCT[22] and ZLCADCT[23] by using VOS technique (from  $vo_1$  to  $vo_3$ ) for a completed DCT operation using 32nm exact full adder and AMA1

|              | Adder            | NAB | Total energy dissipation (J) for a completed DCT operation |            |            | PSNR(dB)   |            |            |
|--------------|------------------|-----|------------------------------------------------------------|------------|------------|------------|------------|------------|
|              |                  |     | ( $vo_1$ )                                                 | ( $vo_2$ ) | ( $vo_3$ ) | ( $vo_1$ ) | ( $vo_2$ ) | ( $vo_3$ ) |
| ADCT [22]    | Exact full adder | 2   | 1.50E-09                                                   | 1.39E-09   | 1.35E-09   | 27.11      | 26.22      | 25.74      |
|              |                  | 4   | 1.53E-09                                                   | 1.33E-09   | 1.18E-09   | 26.37      | 22.73      | 22.45      |
|              |                  | 6   | 1.60E-09                                                   | 1.25E-09   | 1.02E-09   | 21.73      | 16.14      | 16.03      |
|              | AMA1             | 2   | 1.29E-09                                                   | 1.21E-09   | 1.20E-09   | 27.16      | 27.13      | 26.22      |
|              |                  | 4   | 1.06E-09                                                   | 9.87E-10   | 9.62E-10   | 26.95      | 26.20      | 23.77      |
|              |                  | 6   | 8.40E-10                                                   | 8.08E-10   | 7.14E-10   | 24.85      | 20.67      | 17.40      |
| ZLCADCT [23] | Exact full adder | 2   | 1.13E-09                                                   | 1.04E-09   | 1.00E-09   | 27.11      | 26.22      | 25.74      |
|              |                  | 4   | 1.13E-09                                                   | 9.63E-10   | 8.65E-10   | 26.37      | 22.73      | 22.45      |
|              |                  | 6   | 1.15E-09                                                   | 8.96E-10   | 7.51E-10   | 21.73      | 16.14      | 16.03      |
|              | AMA1             | 2   | 1.01E-09                                                   | 9.54E-10   | 9.45E-10   | 27.16      | 27.13      | 26.22      |
|              |                  | 4   | 8.52E-10                                                   | 7.91E-10   | 7.51E-10   | 26.95      | 26.20      | 23.77      |
|              |                  | 6   | 6.90E-10                                                   | 6.52E-10   | 5.72E-10   | 24.85      | 20.67      | 17.40      |

Furthermore, *table 2* shows that the difference in energy dissipation between AMA1 and the exact full adder increases with NAB. In addition, independent of the adder type, the total energy dissipation for ZLCADCT[23] is significantly lower than that for ADCT[22] at any given supply voltage. For instance, for AMA1 with NAB=4 and the supply voltage reduced to  $vo_2$ (0.76V), the total energy dissipation of ADCT[22] and ZLCADCT[23] is 9.87E-10J and 7.91E-10J, respectively. As NAB increases from 2 to 6, the total energy dissipation of ZLCADCT[23] using AMA1 under  $vo_2$  falls from 9.54E-10J to 6.52E-10J (a reduction of 31.44%), which is 21.16% to 19.3% lower than the corresponding 12.1E-10J to 8.08E-10J (a reduction of 33.2%) for ADCT[22]. Moreover, the PSNR values for ZLCADCT[23] and ADCT[22] are the same.
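The 33.2% reduction for ADCT and the 21.16% to 19.3% gap between ZLCADCT and ADCT follow directly from the rounded entries of Table 2. The short sketch below recomputes them; the dictionary layout and the helper `percent_reduction` are illustrative names introduced here, not part of the original evaluation flow.

```python
# Total energy (J) per completed DCT at vo2 using AMA1, copied from Table 2.
ENERGY_VO2_AMA1 = {
    "ADCT": {2: 1.21e-09, 4: 9.87e-10, 6: 8.08e-10},
    "ZLCADCT": {2: 9.54e-10, 4: 7.91e-10, 6: 6.52e-10},
}


def percent_reduction(baseline, reduced):
    """Relative saving of `reduced` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - reduced) / baseline


adct = ENERGY_VO2_AMA1["ADCT"]
zlc = ENERGY_VO2_AMA1["ZLCADCT"]

# Saving obtained within ADCT when NAB grows from 2 to 6.
adct_nab_saving = percent_reduction(adct[2], adct[6])

# Saving of ZLCADCT relative to ADCT at each NAB value.
zlc_vs_adct = {nab: percent_reduction(adct[nab], zlc[nab]) for nab in (2, 4, 6)}

print(f"ADCT, NAB 2 -> 6: {adct_nab_saving:.2f}%")
for nab, saving in zlc_vs_adct.items():
    print(f"ZLCADCT vs ADCT at NAB={nab}: {saving:.2f}%")
```

Because the inputs are the rounded table entries, the recomputed percentages are only as precise as those entries.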

*Table 3* compares the advantages and disadvantages of some approximate methods discussed in the literature. For example, the methods in [34]–[38] often require circuit modifications to implement the respective approximations. While VOS methods ([39]–[41]) are energy-efficient and power-saving, they tend to be limited to simple adder applications. Additionally, no circuit-level (mathematical) or theoretical framework is presented for them. Unlike the methods mentioned above, the work presented in this paper provides a detailed circuit-level analysis of the VOS technique. The results from the developed mathematical models are validated against simulation results and are found to be in close agreement. Furthermore, VOS on exact/approximate adder cells is applied at the algorithmic level to real-time image compression techniques, namely approximate DCT (ADCT) and zigzag low-complexity approximate DCT (ZLCADCT), to realize a runtime-based approximate computing method for image compression. It has been demonstrated that ZLCADCT using VOS reduces energy dissipation compared to voltage-overscaled ADCT. The results show that for both ADCT and ZLCADCT, an approximate full adder can operate at a significantly lower supply voltage than an exact full adder without a substantial decrease in PSNR. Additionally, regardless of the type of adder cells, the total energy dissipation for voltage-overscaled ZLCADCT is lower than for voltage-overscaled ADCT, while maintaining the same PSNR.
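As an illustration of how the Table 2 results could drive a runtime choice, the following sketch picks the configuration with the lowest energy that still meets a PSNR floor. The selection rule, the `OperatingPoint` type and the `cheapest_point` helper are assumptions made for this example; this work does not prescribe a specific selection policy.

```python
from typing import NamedTuple, Optional


class OperatingPoint(NamedTuple):
    transform: str   # "ADCT" or "ZLCADCT"
    adder: str       # "ExactFA" or "AMA1"
    nab: int         # number of approximate bits
    level: str       # "vo1", "vo2" or "vo3" (voltages listed in Table 1)
    energy_j: float  # total energy per completed DCT (Table 2)
    psnr_db: float   # PSNR (Table 2)


# Rows of Table 2: transform, adder, NAB, energies at vo1..vo3, PSNRs at vo1..vo3.
TABLE_2 = [
    ("ADCT", "ExactFA", 2, (1.50e-09, 1.39e-09, 1.35e-09), (27.11, 26.22, 25.74)),
    ("ADCT", "ExactFA", 4, (1.53e-09, 1.33e-09, 1.18e-09), (26.37, 22.73, 22.45)),
    ("ADCT", "ExactFA", 6, (1.60e-09, 1.25e-09, 1.02e-09), (21.73, 16.14, 16.03)),
    ("ADCT", "AMA1", 2, (1.29e-09, 1.21e-09, 1.20e-09), (27.16, 27.13, 26.22)),
    ("ADCT", "AMA1", 4, (1.06e-09, 9.87e-10, 9.62e-10), (26.95, 26.20, 23.77)),
    ("ADCT", "AMA1", 6, (8.40e-10, 8.08e-10, 7.14e-10), (24.85, 20.67, 17.40)),
    ("ZLCADCT", "ExactFA", 2, (1.13e-09, 1.04e-09, 1.00e-09), (27.11, 26.22, 25.74)),
    ("ZLCADCT", "ExactFA", 4, (1.13e-09, 9.63e-10, 8.65e-10), (26.37, 22.73, 22.45)),
    ("ZLCADCT", "ExactFA", 6, (1.15e-09, 8.96e-10, 7.51e-10), (21.73, 16.14, 16.03)),
    ("ZLCADCT", "AMA1", 2, (1.01e-09, 9.54e-10, 9.45e-10), (27.16, 27.13, 26.22)),
    ("ZLCADCT", "AMA1", 4, (8.52e-10, 7.91e-10, 7.51e-10), (26.95, 26.20, 23.77)),
    ("ZLCADCT", "AMA1", 6, (6.90e-10, 6.52e-10, 5.72e-10), (24.85, 20.67, 17.40)),
]


def operating_points():
    """Expand each table row into one OperatingPoint per voltage level."""
    for transform, adder, nab, energies, psnrs in TABLE_2:
        for level, energy, psnr in zip(("vo1", "vo2", "vo3"), energies, psnrs):
            yield OperatingPoint(transform, adder, nab, level, energy, psnr)


def cheapest_point(min_psnr_db: float) -> Optional[OperatingPoint]:
    """Return the lowest-energy point whose PSNR is at least `min_psnr_db`.

    Returns None when no tabulated point satisfies the PSNR floor.
    """
    candidates = [p for p in operating_points() if p.psnr_db >= min_psnr_db]
    return min(candidates, key=lambda p: p.energy_j, default=None)


best = cheapest_point(26.0)
print(best)
```

With a 26 dB floor, this rule lands on ZLCADCT with AMA1, which is consistent with the observation that ZLCADCT lowers energy at unchanged PSNR.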

**Table 3.** Comparison of the proposed method with existing methods in the literature

| Ref. | Method                                                                                   | Advantage                                             | Disadvantage                                                     |
|------|------------------------------------------------------------------------------------------|-------------------------------------------------------|------------------------------------------------------------------|
| [34] | Reviewed AxC techniques at circuit, architecture, application, and algorithmic levels    | Comprehensive review of AxC and future directions.    | No implementation or performance validation.                     |
| [35] | NCFET 6T-SRAM CiM accurate and approximate full adders.                                  | Best power, delay, and PDP efficiency.                | High transistor count.                                           |
| [36] | Implemented 8-T approximate full adder, inexact subtractor, and RCA in CNFET technology. | Compact transistor design.                            | High power, delay, and PDP.                                      |
| [37] | TACA-ACFA with accuracy-configurable adders.                                             | Low power, less delay, and minimum PDP.               | Very high transistor count.                                      |
| [38] | Self-Adjusting Multi-Cycle Approximate Adders (SAMA and SAMA-F).                         | Adaptive multi-cycle design.                          | High power and area due to controller logic.                     |
| [39] | VOS-based SDC for motion estimation in video encoding.                                   | Power savings at different VOS levels.                | Application-specific focus.                                      |
| [40] | VOS-based 8-bit and 16-bit RCAs and BKAs.                                                | Energy-efficient operation at scaled supply voltages. | Limited to specific adder designs and no application.            |
| [41] | VOS-based Gate-Diffusion Input Approximate Full Adder, implemented on an 8-bit RCA.      | Low power and area achieved.                          | Limited to GDI technique and no application for image processing. |

## 6. CONCLUSIONS

This paper presents an integrated approximate method for implementing the VOS technique at both circuit and algorithmic levels, enabling runtime-based energy-efficient image compression using ADCT and ZLCADCT. A mathematical model is proposed to analyze the exact full adder and AMA1 with varying feature sizes under VOS. Results indicate that, regardless of adder type, output capacitance decreases as supply voltage drops. The model's estimates align well with simulation data. The Sum and Carry output voltages of both adders decrease during the pull-up process as the supply voltage is lowered, while during pull-down, output voltages increase at high supply levels and decrease at low ones. Subtraction results using voltage-overscaled exact and approximate adder cells reveal that, for both transistor sizes, the ER, NMED, and MRED of subtractors with AMA1 are lower than those with the exact full adder when the supply voltage is below about 0.98V. Additionally, the 32nm subtractor can be powered at lower voltages than the 45nm version at the same ER. Finally, integrating VOS with ADCT and ZLCADCT shows that AMA1's supply voltage can be lowered to around 0.77V (for 32nm) while preserving PSNR and reducing energy consumption compared to the exact full adder (at 0.83V). Moreover, regardless of supply voltage and adder type, the total energy of a completed DCT operation is substantially lower for ZLCADCT (e.g., 9.54E-10J to 6.52E-10J for AMA1 at  $vo_2$ ) than for ADCT (12.1E-10J to 8.08E-10J), with PSNR maintained.

**Acknowledgments:** A preliminary version of this work has been published as [11]: H. Junqi, T. N. Kumar, and H. Abbas, "Simulation-Based Evaluation of Approximate Adders for Image Processing Using Voltage Overscaling Method," in 2020 IEEE 5th International Conference on Signal and Image Processing (ICSIP), Nanjing, China, 2020, pp. 499–505.

**Conflicts of Interest:** The authors declare no conflict of interest.

## 9. FUNDING STATEMENT

This work was supported in part by the Natural Science Foundation of Fujian Province of China (No. 2022J05290), in part by the Xiamen University of Technology Initiative Scientific Research Foundation for Advanced Talents (No. YKJ22033R), in part by the Xiamen Scientific Research Project for Overseas Returnees (No. XIARENSHE[2024]241HAO-04), and in part by the Education and Research Project for Middle-aged and Young Teachers of Fujian Province (No. JAT241123).

## REFERENCES

- [1] Z. Ebrahimi, M. Zaid, M. Wijtvliet, and A. Kumar, "RAPID: Approximate Pipelined Soft Multipliers and Dividers for High Throughput and Energy Efficiency," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 42, no. 3, pp. 712–725, 2023.
- [2] K. Cao, M. Chen, S. Karnouskos, and S. Hu, "Reliability-Aware Personalized Deployment of Approximate Computation IoT Applications in Serverless Mobile Edge Computing," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 44, no. 2, pp. 430–443, 2025.

- [3] H. Su and N. Wu, "Deoxys: Defensive Approximate Computing for Secure Graph Neural Networks," in *2024 IEEE 35th International Conference on Application-specific Systems, Architectures and Processors (ASAP)*, 2024, pp. 54–60.
- [4] H. Junqi, H. Abbas, T. N. Kumar, and F. Lombardi, "An inexact Newton method for unconstrained total variation-based image denoising by approximate addition," *IEEE Transactions on Emerging Topics in Computing*, vol. 10, no. 2, pp. 1192–1207, 2022.
- [5] H. Junqi, T. N. Kumar, H. Abbas, and F. Lombardi, "On the Commutative Operation of Approximate CMOS Ripple Carry Adders (RCAs)," *IEEE Transactions on Nanotechnology*, vol. 23, pp. 265–273, 2024.
- [6] H. Junqi, T. N. Kumar, H. Abbas, and F. Lombardi, "Commutative Approximate Adders: Analysis and Evaluation," in *2021 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)*, AB, Canada, 2021, pp. 1–6.
- [7] S. Asgari, M. R. Reshadinezhad, and S. E. Fatemieh, "Energy-efficient and fast IMPLY-based approximate full adder applying NAND gates for image processing," *Computers and Electrical Engineering*, vol. 113, p. 109053, 2024.
- [8] K. M. Roodbali, E. Abiri, and K. Hassanli, "Highly efficient low-area gate-diffusion-input-based approximate full adders for image processing computing," *The Journal of Supercomputing*, vol. 80, no. 6, pp. 8129–8155, 2024.
- [9] Y. Safaei Mehrabani and R. Faghih Mirzaee, "DAFA: Dynamic approximate full adders for high area and energy efficiency," *Integration*, vol. 97, p. 102191, 2024.
- [10] B. Moons, R. Uytterhoeven, W. Dehaene, and M. Verhelst, "DVAFS: Trading computational accuracy for energy through dynamic-voltage-accuracy-frequency-scaling," in *2017 Design, Automation & Test in Europe Conference & Exhibition (DATE)*, Lausanne, Switzerland, 2017, pp. 488–493.
- [11] H. Junqi, T. N. Kumar, and H. Abbas, "Simulation-Based Evaluation of Approximate Adders for Image Processing Using Voltage Overscaling Method," in *2020 IEEE 5th International Conference on Signal and Image Processing (ICSIP)*, Nanjing, China, 2020, pp. 499–505.
- [12] P. Reviriego, P. Junsangsr, S. Liu, and F. Lombardi, "Error-Tolerant Data Sketches Using Approximate Nanoscale Memories and Voltage Scaling," *IEEE Transactions on Nanotechnology*, vol. 21, pp. 16–22, 2022.
- [13] X. Zhao, Y. Cui, F. Lyu, C. Gu, C. Wang, and W. Liu, "High Reliable Processor-Based PUF on Voltage Over-Scaling Technique," in *2024 Asian Hardware Oriented Security and Trust Symposium (AsianHOST)*, 2024, pp. 1–6.
- [14] H. Junqi, T. N. Kumar, H. Abbas, and F. Lombardi, "Approximate Computing using Frequency Upscaling," *IET Circuits, Devices & Systems*, vol. 13, no. 7, pp. 1018–1026, 2019.
- [15] H. Junqi, T. N. Kumar, and H. Abbas, "Approximate Newton method using frequency upscaling for total variation-based image denoising," in *2023 IEEE 3rd International Conference on Computer Systems (ICCS)*, Qingdao, China, 2023, pp. 85–90.
- [16] T. Zhang *et al.*, "Design of Majority Logic-Based Approximate Booth Multipliers for Error-Tolerant Applications," *IEEE Transactions on Nanotechnology*, vol. 21, pp. 81–89, 2022.
- [17] B. Perumal, A. Balamanikandan, S. Jayakumar, A. Kumar, and K. Saranya, "Exact Computing Multiplier Design using 5-to-3 Counters for Image Processing," *International Journal of Electrical and Electronics Research*, vol. 12, no. 2, pp. 435–442, 2024.
- [18] M. R. Raja, R. Naveen, C. A. D. Durai, M. Usman, N. K. Shukla, and M. A. Muqeet, "Energy efficient enhanced all pass transformation fostered variable digital filter design based on approximate adder and approximate multiplier for eradicating sensor nodes noise," *Analog Integrated Circuits and Signal Processing*, vol. 118, no. 3, pp. 399–413, 2024.
- [19] S. Swetha and N. S. S. Reddy, "Performance Enhancement of CNFET-based Approximate Compressor for Error Resilient Image Processing," *International Journal of Electrical and Electronics Research*, vol. 11, no. 3, pp. 851–858, 2023.
- [20] B. Rashidi, "APPAs: fast and efficient approximate parallel prefix adders and multipliers," *The Journal of Supercomputing*, vol. 80, no. 16, pp. 24269–24296, 2024.
- [21] Y. Wu *et al.*, "A Survey on Approximate Multiplier Designs for Energy Efficiency: From Algorithms to Circuits," *ACM Transactions on Design Automation of Electronic Systems*, vol. 29, no. 1, pp. 1–37, 2024.
- [22] S. Bouguezel, M. O. Ahmad, and M. Swamy, "A low-complexity parametric transform for image compression," in *2011 IEEE International Symposium on Circuits and Systems (ISCAS)*, Rio de Janeiro, Brazil, 2011, pp. 2145–2148.
- [23] H. Junqi, T. N. Kumar, H. Abbas, and F. Lombardi, "A Deterministic Low-Complexity Approximate (Multiplier-Less) Technique for DCT Computation," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 66, no. 8, pp. 3001–3014, 2019.
- [24] H. Junqi, T. N. Kumar, and H. Abbas, "Zigzag low-complexity approximate DCT using frequency upscaling technique," *Journal of Physics: Conference Series*, vol. 1962, no. 1, p. 012050, 2021.
- [25] M. C. Li, A. Ghosh, and S. Sen, "Approximate DCT and Quantization Techniques for Energy-Constrained Image Sensors," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 44, no. 1, pp. 11–24, 2025.
- [26] E. Esmaili, N. Shiri, M. Rafiee, and A. Sadeghi, "A Multiplier-Free Discrete Cosine Transform Architecture Using Approximate Full Adder and Subtractor," *IEEE Embedded Systems Letters*, vol. 16, no. 4, pp. 441–444, 2024.
- [27] R. Hegde and N. R. Shanbhag, "A voltage overscaled low-power digital filter IC," *IEEE Journal of Solid-State Circuits*, vol. 39, no. 2, pp. 388–391, 2004.
- [28] V. Gupta, D. Mohapatra, A. Raghunathan, and K. Roy, "Low-power digital signal processing using approximate adders," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 32, no. 1, pp. 124–137, 2013.
- [29] N. I. M. Group. (2008, November 15). *Predictive Technology Model for Low-Power applications (PTM LP)*. Available: <http://ptm.asu.edu>. [Accessed Jan. 15, 2019]
- [30] H. A. Almurib, T. N. Kumar, and F. Lombardi, "Inexact designs for approximate low power addition by cell replacement," in *2016 Design, Automation & Test in Europe Conference & Exhibition (DATE)*, Dresden, Germany, 2016, pp. 660–665.
- [31] S. M. Kang, Y. Leblebici, and C. Kim, *CMOS Digital Integrated Circuits: Analysis & Design*, 4th ed. New York, US: McGraw-Hill Higher Education, 2014.
- [32] W. Dghais, T. Cunha, and J. Pedro, "Behavioral model for high-speed digital buffer/driver," in *2010 Workshop on Integrated Nonlinear Microwave and Millimeter-Wave Circuits*, Goteborg, Sweden, 2010, pp. 110–113.
- [33] W. Dghais, H. M. Teixeira, T. R. Cunha, and J. C. Pedro, "Novel extraction of a table-based I-Q behavioral model for high-speed digital buffers/drivers," *IEEE Transactions on Components, Packaging and Manufacturing Technology*, vol. 3, no. 3, pp. 500–507, 2013.
- [34] H. J. Damsgaard *et al.*, "Adaptive Approximate Computing in Edge AI and IoT Applications: A Review," *Journal of Systems Architecture*, vol. 150, 2024.
- [35] V. Birudu *et al.*, "Computing In-Memory Reconfigurable (Accurate/Approximate) Adder Design with Negative Capacitance FET 6T SRAM for Energy Efficient AI Edge Devices," *Semiconductor Science and Technology*, vol. 39, no. 5, 2024.
- [36] F. Bahrami, N. Shiri, and F. Pesaran, "Imprecise Subtractor Using a New Efficient Approximate-Based Gate Diffusion Input Full Adder for Bioimages Processing," *Computers and Electrical Engineering*, vol. 108, 2023.
- [37] X. Fan *et al.*, "A Timing-Aware Configurable Adder Based on Timing Detection for Low-Voltage Computing," *IEEE Journal on Emerging and Selected Topics in Circuits and Systems*, vol. 13, no. 1, pp. 237–248, 2023.
- [38] E. Baratalipour and A. Kamran, "SAMA: Self-Adjusting Multi-Cycle Approximate Adder," *Microelectronics Journal*, vol. 134, 2023.
- [39] D. Mohapatra, G. Karakonstantis, and K. Roy, "Significance Driven Computation: A Voltage-Scalable, Variation Aware, Quality-Tuning Motion Estimator," in *Proceedings of the 2009 ACM/IEEE International Symposium on Low Power Electronics and Design*, 2009, pp. 195–200.
- [40] R. Ragavan *et al.*, "Pushing the Limits of Voltage Over-Scaling for Error-Resilient Applications," in *Design, Automation & Test in Europe Conference & Exhibition (DATE)*, Lausanne, Switzerland, 2017, pp. 476–481.
- [41] G. R. Mahendra Babu and K. P. Sridhar, "Voltage Over Scaling Based GDI Approximate Adder," *International Journal of Electrical and Electronics Engineering*, vol. 12, no. 1, pp. 47–62, 2025.
- [42] J. P. Uyemura, *CMOS logic circuit design*, 1st ed. New York, US: Springer Science & Business Media, 1999.



© 2025 by Junqi Huang, T. Nandha Kumar, and Haider A. F. Almurib. Submitted for possible open access publication under the terms and conditions of the Creative Commons Attribution (CC BY) license (<http://creativecommons.org/licenses/by/4.0/>).