# High Speed Reverse Converter Design Using Modified Hybrid Parallel Prefix Adders For DSP Applications

Sradha P.S, Asha Sunil

Abstract - Residue number system is an unconventional number system that uses residues of a number in particular modulus for its representation. Major parts of RNS include forward converter, arithmetic unit and reverse converter. The reverse converter in the existing system is based on regular and modular adders which show significant power consumption and superior delay. New high speed implementation of residue number system reverse converter is based on regular and modular Parallel prefix adders, which is used to tradeoff between power, delay, speed and area. A comparison of different prefix structures are performed among that Brent Kung prefix structure have minimum fan out which is used for the design. By using a shift operation in the design a high speed multiplier is designed. Residue number system can be applied to the filter design application by using the modified reverse converter and a forward converter because of the energy efficient and high speed characteristics.

*Index Terms -* Parallel prefix adder, Residue number system, Reverse converter.

# I. INTRODUCTION

These days with the extensive use of wireless devices, power becomes one of the primary design constraints. Residue arithmetic's based on residue number systems provide carry-free arithmetic's and offer potential for high-speed and parallel computation. Arithmetic operations such as addition, subtraction, multiplication can be carried out more efficiently. The adoption of RNS has provided significant efficiency improvements for different types of digital signal processing applications. Today RNS is also regarded as one of the most popular techniques for reducing the power dissipation and the computation load in VLSI system design. RNS consist of three parts forward converter, modulo arithmetic units and reverse converters. The former two parts have modular architecture and it can be implemented as parallel as other parts other parts of RNS. The forward converter decomposes a weighted binary number into a residue number with respect to the moduli set. The residue arithmetic unit depends on the application. The reverse converter transforms a residue represented number into its equivalent weighted binary number. But the reverse converter design is more complex and non modular that it compares to the forward converter design. Reverse converter is an important part of RNS because the conversion delay should

not counteract the speed gain of arithmetic unit. Most research of RNS is focused on designing efficient reverse converter.

To achieve high energy efficiency, voltage over scaling technique is applied to RNS but soft errors introduced by this technique degrade the performance of the system. Reverse conversion formulation can be done with ripple carry adder but linear increase of delay in ripple carry adder with increasing number of bits reduces speed. Chinese remainder theorem (CRT), mixed radix conversion (MRC), new Chinese remainder theorems (New CRTs) are the different types of conversion algorithms for reverse conversion. Use of parallel prefix adder in RNS reverse converter design provides higher consumption because the bit length of adder is large.

In this brief, a hybrid parallel prefix structure is used to design fast reverse converter. The usage of parallel prefix structure improves speed and increases power and area. The use of new hybrid parallel prefix structure provides a tradeoff between speed, area and power consumption. This new design provides significant reduction in power delay product and area than traditional converters. The new design can be converted into a multiplier by performing a shift operation in the design.

#### II. RESIDUE NUMBER SYSTEM

Residue number system is an unconventional number system. Its carry-limited computations provide higher speed and reduce power consumption. First moduli set which includes some pair wise prime numbers should be selected. Forward converter converts the weighted binary numbers to residues corresponding to moduli. Arithmetic operations such as addition, subtraction, multiplication are performed on RNS in parallel without carry propagation between residue digits.

In RNS large numbers can be encoded into smaller numbers so it reduces the complexity of individual blocks. Residue number systems have less probability of errors because error in one channel does not propagate to other channels. RNS can tolerate large reduction in supply voltage compared to corresponding binary architecture. RNS has many applications in digital signal processing. RNS can be applied to the implementation of high speed FIR filter, RNS image coding can offer high speed and low power VLSI implementation for secure image processing, redundant RNS offers new approaches to the design of error detection and correction codes. Reverse converter design is very complex because it requires lot of regular and modular adders. In order to get higher speed parallel prefix adders are used to realize reverse converter. However parallel prefix adders consume higher power. In order to reduce power and to achieve a tradeoff between power, speed and area a hybrid parallel prefix structure is used.

**SRADHA P.S**, Department of ECE, Kerala University, Younus College of Engineering and Technology, Kollam, Kerala, India,

ASHA SUNIL, Department of ECE, Kerala University, Younus College of Engineering and Technology, Kollam, Kerala, India,.



Fig.1. Block diagram of Residue Number System

# III. REVERSE CONVERTER DESIGN

Moduli set selection is the first step in designing reverse converter. It can play a significant role in dynamic range, speed and hardware realization of RNS.The values of the moduli are substituted in the new Chinese remainder theorem conversion algorithm. The resulting equations are simplified and realized using adder components like CSA-EAC, CPA-EAC and CPA. To increase the speed and to reduce power consumption the existing adder components are replaced by new hybrid prefix adder components.

Firstly the operands preparation unit 1 (OPU 1) prepares the operands v1, v2, v3 and v4 based on New CRT algorithm for  $(2^{n}+1, 2^{n}, 2^{n}-1, 2^{2n+1}-1)$ . Modulo  $2^{2n+1}-1$  addition of  $v_1$  and v2 is performed using (2n+1) bit CPA1 with EAC and H represents its output. Modulo (2<sup>n</sup>-1) addition of v<sub>3</sub> and v<sub>4</sub> are performed by (n) bit CPA2 with EAC and K represents its output. The second operand preparation unit (OPU 2) prepares the operands  $v_5$ ,  $v_6$ ,  $v_{81}$ ,  $v_{07}$  based on New CRT algorithm for  $(2^{n}+1, 2^{n}, 2^{n}-1, 2^{2n+1}-1)$ . Modulo  $(2^{2n+1}-1)$ addition of v<sub>5</sub>, v<sub>6</sub>, v<sub>81</sub>, v<sub>07</sub> are performed by (2n)-bit CSA1 with EAC, 2n-bit CSA2 with EAC and finally 2n-bit CPA3 with EAC performs the modulo  $(2^{2n} - 1)$  addition and T represents the output. The operand s can be calculated using (4n+1) bit carry propagate adder CPA and x can be obtained by routing the bits of s and  $x_1$ . The resulting equations are realized by CPA with EAC, CSA with EAC and CPA. In modified reverse converter design hybrid and regular parallel prefix adder components are used to replace CPA with EAC and CPA in existing reverse converter design.

Xı

Xэ



Fig.2. Existing adder based reverse converter

# IV. PARALLEL PREFIX ADDER COMPONETS

The three stages of parallel prefix structure consist of pre –processing stage, prefix carry tree, post processing stage. Pre-processing stage computes carry generate, carry propagate and partial sum. Prefix carry tree get proceed with the previous signal to yield all carry bit signal. Prefix graph contain black cell and buffer cell. Black cell is the processing component. Carry bits from the second stage get passed to the post processing stage where sum is computed.

There are different architectures such as Sklansky conditional adder (SK), Kogge-Stone adder (KS), Brent-Kung adder (BK). Among these Brent-Kung model is chosen because of minimum fan out and higher speed of operation.



Fig.3.Block Diagram parallel prefix structure

#### V. NEW HYBRID PARALLEL PREFIX COMPONETS

# A. Hybrid Regular Parallel Prefix XOR/OR adder structure (HRPX)

The first part of addition is performed using a regular parallel prefix adder and simplified RCA logic is used to do the second part. Because of the constant operand full adder is designed using XOR/OR gate. For most modulo sets  $(2^{n}-1)$ addition is necessary. For  $2^{n}-1$  addition end around carry ECA produce double representation of zero. To correct this problem a detector circuit is needed but it produces additional delay. Binary to excess one converter is modified and used to provide single representation of zero.



Fig.4. HRPX structure

B. Hybrid Modular Parallel Prefix Excess one Adder Structure (HMPE)



Fig.5. HMPE structure

# VI. RESIDUE NUMBER SYSTEM FIR FILTER

The carry free and high speed computation feature of Residue number system can be used for the design of FIR filter. For a FIR filter of order N, the output is the weighted sum of the current and previous values of the input. It is defined by the equation :

K=1



$$y_{mp}(n) = <\Sigma^{N} < a_{mp}(k) . x_{mp} (n-k) > mp > mp$$
  
<sub>K=1</sub>



#### Fig.6. RNS FIR Filter

Here FIR filter is decomposed into p FIR filter working in parallel. In RNS FIR filter, the input is first converted into residues based on the set of moduli. Then FIR filter operation is performed on each residue. Operations are performed in parallel it offer large performance improvements than conventional FIR filter.

## VII. RESULTS AND DISCUSSION

The prefix structures are designed in VHDL. Synthesis is done in Xilinx ISE design suite 13.2 and ModelSim SE 6.3f is used for simulation. Compared to fully prefix adders, circuit using hybrid parallel prefix adder improved area, power, speed.

| -> 1         | reverse_converter/x1  |                  | 1000                |      |
|--------------|-----------------------|------------------|---------------------|------|
|              | /reverse_converter/x2 |                  | 000001001           |      |
| 1            | reverse_converter/x3  |                  | 01010               |      |
| 41           | reverse_converter/x4  |                  | 1010                |      |
| -> 1         | 'reverse_converter/x  | 0000000000010000 | 0000000001000001000 |      |
| 1 🔶          | reverse_converter/v3  |                  | 0101                |      |
| 🔶 l          | reverse_converter/    |                  | 0111                |      |
|              | reverse_converter/    |                  | 1010                |      |
| 🔶 I          | 'reverse_converter/   |                  | 0111                |      |
| -> 1         | reverse_converter/v4  |                  | 1010                |      |
| -> 1         | 'reverse_converter/k0 |                  |                     |      |
| <b>\</b> 1   | 'reverse_converter/k1 | 0000             | 0000                |      |
| 🔶 l          | 'reverse_converter/c2 |                  | 0000                |      |
|              | reverse_converter/c3  |                  |                     |      |
|              | 'reverse_converter/k  |                  | 0000                |      |
| 🔶 /          | reverse_converter/    |                  | 10101               |      |
| 🔶 I          | 'reverse_converter/v1 | 100100000        | 100100000           |      |
| 🔷 l          | 'reverse_converter/v2 |                  | 011/11111           |      |
| <u> 18</u> 1 | Now                   | 1100 ns          |                     | 1000 |
|              | Cursor 1              | Û ns             | 200 400 600 800     | 1000 |

Fig.7. Simulation result of conventional reverse converter

Our main goal is to achieve better tradeoff between delay and power consumption. On the increase of number of bits PDP

$$y(n) = \Sigma^N ak .x(n-k)$$

for our design improved. Delay occurs in multiplier only due to shift operation. Comparison of conventional and modified reverse converter is described in the table.

| Messages                             |                                         |        |
|--------------------------------------|-----------------------------------------|--------|
| Interverse_converter 1000            | 1000                                    |        |
| /reverse_converter 000001001         | 000001001                               |        |
|                                      | 01010                                   |        |
| /reverse_converter 1010              | 10 10                                   |        |
| 🔷 /reverse_converter 00000000000 100 | 000000000000000000000000000000000000000 |        |
| -🔷 /reverse_converter 0101           | 0101                                    |        |
|                                      | 0111                                    |        |
| 🔷 /reverse_converter 1010            | 10 10                                   |        |
| /reverse_converter 0111              | 0111                                    |        |
| -🔷 /reverse_converter 1010           | 10 10                                   |        |
| /reverse_converter 1111              | 1111                                    |        |
|                                      | 0000                                    |        |
| /reverse_converter 0000              | 0000                                    |        |
| /reverse_converter 1111              | 1111                                    |        |
| neverse_converter 0000               | 0000                                    |        |
| /reverse_converter 10101             | 10 10 1                                 |        |
| /reverse_converter 100 100000        | 100100000                               |        |
| -🔷 /reverse_converter 011111111      | 011111111                               |        |
| Ireverse_converter 000011111         | 000011111                               |        |
| /reverse_converter 111100000         | 111100000                               |        |
| Now 2036854775807 n                  | s 15 500                                | ns 100 |

Fig.8. Simulation result of HMPE-BK



Fig.9. Simulation result of RNS FIR filter

#### TABLE I SIMULATION RESULT

| Demonstern | Reverse converter              |                               |  |  |  |
|------------|--------------------------------|-------------------------------|--|--|--|
| Parameters | Conventional reverse converter | Modified reverse<br>converter |  |  |  |
| Power      | 65mW                           | 62mW                          |  |  |  |
|            |                                |                               |  |  |  |
| Delay      | 65ns                           | 60ns                          |  |  |  |

# VIII. CONCLUSION

A reverse converter using a Brent-Kung parallel prefix adder network is presented. For faster operation and reduced power consumption the prefix adder is implemented using new hybrid adder components. Using the same adder structure a multiplier is designed by adding a shift operation. The implementation result shows that the tradeoff between speed and power consumption can be achieved using the new components. The HMPE reverse converter and forward converter for  $(2^n+1, 2^n, 2^{n-1}, 2^{2n+1}-1)$  set in residue number system can be used for filter design applications because of the energy efficient and high speed characteristics of the reverse converter and a forward converter because of the energy efficient and high speed characteristics.

## ACKNOWLEDGMENT

The authors would like to thank anonymous reviewers for their constructive comments and valuable suggestions that helped in the improvement of this paper.

## REFERENCES

[1] Azadeh Alsadat Emrani Zarandi, Amir Sabbagh Molahosseini, Mehdi Hosseinzadeh, Saeid sorouri, Samuel Antao, and Leonel Sousa, "Reverse Converter Design via Parallel- Prefix Adders: Nove Components, Methodology, and Implementations" IEEE Trans. Very

Large Scale Integr. (VLSI) Syst., vol. 23, no. 2, pp. 1063-8210, Feb. 2015.

- [2] J. Chen and J. Hu, "Energy-efficient digital signal processing via voltage-over scaling-based residue number system," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 21, no. 7, pp. 1322–1332, Jul. 2013.
- [3] K. Navi, A. S. Molahosseini, and M. Esmaeildoust, "How to teach residue number system to computer scientists and engineers," *IEEE Trans. Educ.*, vol. 54, no. 1, pp. 156–163, Feb. 2011.
- [4] A. S. Molahosseini, K. Navi, C. Dadkhah, O. Kavehei, and S. Timarchi, "Efficient reverse converter designs for the new 4-moduli sets {2<sup>n</sup> 1, 2<sup>n</sup>, 2<sup>n</sup> + 1, 2<sup>2n+1</sup> 1} and {2<sup>n</sup> 1, 2<sup>n</sup> + 1, 2<sup>2n</sup>, 2<sup>2n</sup> + 1} based on new CRTs," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 57, no. 4, pp. 823–835, Apr. 2010.
- [5] R. A. Patel, M. Benaissa, and S. Boussakta, "Fast parallel-prefix

architectures for modulo  $2^n - 1$  addition with a single representation of zero," *IEEE Trans. Comput.*, vol. 56, no. 11, pp. 1484–1492, Nov. 2007.



**SRADHA P.S** received B. Tech bachelor degree in ECE from Younus College of Engineering and Technology under Kerala University in 2007. Currently she is pursuing M. Tech in Applied Electronics and Instrumentation from Younus College of Engineering and Technology under Kerala University, Kollam, Kerala. Her areas of interest in research are VLSI and Signal processing



ASHA SUNIL received B.Tech bachelor degree in ECE from Travancore Engineering College under Kerala University and received her master's degree M.Tech in VLSI and Embedded system from TKM Institute of Technology under CUSAT. She is working as Asst. Professor in Dept. of ECE , Younus College of Engineering and Technology, Kollam, Kerala. Her areas of interest are VLSI and embedded systems, signal processing