# Structural adders reduction in fixed coefficient transposed direct form FIR filters

# Raval Jay Manoj, S. Umadevi

Abstract- Over the last two decades, fixed coefficient FIR filters have generally been optimized by minimizing the number of adders required to implement the multiplier block in the transposed direct form filter structure. In this paper, an optimization method for the structural adders in the transposed tapped delay line is proposed. Although additional registers are required, a trade-off can be made such that the overall combinational logic is reduced. For the majority of taps, the delay through the structural adder is shortened; the exception is the last optimized tap, whose delay increases by one full adder delay. This increase is tolerable as it does not fall in the critical path in most cases. Area reductions of 4.5% to 9.5% and power reductions of 10% to 30% are estimated theoretically for the structural adder block of three benchmark filters. The saving is more prominent as the number of taps grows. The reductions in the number of LUTs, the number of bonded IOBs, and the number of slices are also derived. Actual synthesis results are obtained with Xilinx design ISE suite 14.3 (Spartan 3E family, device XC3S100) and Cadence RTL compiler with 0.18µm TSMC CMOS libraries.

*Index Terms*- FIR filter, normal structural adder, proposed structural adder reduction, Xilinx design ISE suite 14.3 (Spartan 3E family, device XC3S100), Cadence RTL compiler with 0.18µm TSMC CMOS libraries, area and power reduction.

#### I. INTRODUCTION

The inherent stability of FIR filters makes them a preferred choice in digital signal processing. As wireless technology advances, FIR filters with shorter transition bands, more stringent stopband attenuation requirements and higher sampling rates are in great demand. To achieve these goals, ASIC implementation is necessary. The Transposed Direct Form (TDF) structure is preferred over the direct form structure for higher order ASIC filters due to its shorter critical path delay. In the direct form structure, the input is delayed before the coefficient multiplication and the register length of each tap is fixed by the input bit width. In the TDF structure, the partial sums generated from the outputs of the coefficient multiplier are delayed. Thus, the lengths of the registers increase monotonically along the taps to hold the correct precision of the partial sums. Consequently, the number of registers needed for the TDF structure is larger than that for the direct form.

Fig. 1 shows a generic TDF fixed coefficient FIR filter. For long filters, the shorter critical path of the TDF is more significant than the costs of the registers.

Manuscript received February 06, 2015.

Raval Jay Manoj, M.Tech, SENSE Department, VIT University, Tamilnadu, India.



Fig. 1 Transposed direct form FIR Filter.
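The behaviour of the TDF structure can be illustrated with a short behavioural model. The following is a minimal Python sketch, not the RTL used in this work; the function name `tdf_fir` and the coefficient values are hypothetical. It assumes integer samples and coefficients, and it models the registers as holding the delayed partial sums rather than the delayed inputs.

```python
def tdf_fir(samples, coeffs):
    """Behavioural model of a transposed direct form FIR filter.

    Computes y[n] = sum_k coeffs[k] * x[n - k]. The registers hold delayed
    partial sums; regs[0] feeds the structural adder at the output.
    """
    n_taps = len(coeffs)
    regs = [0] * (n_taps - 1)
    out = []
    for x in samples:
        # Multiplier block: every coefficient multiplies the same input sample.
        products = [c * x for c in coeffs]
        # Output structural adder: product of tap 0 plus the delayed partial sum.
        out.append(products[0] + (regs[0] if regs else 0))
        # Clock edge: regs[i] takes product i+1 plus the old value of regs[i+1].
        # Ascending order keeps regs[i+1] at its old value when it is read.
        for i in range(n_taps - 1):
            upstream = regs[i + 1] if i + 1 < n_taps - 1 else 0
            regs[i] = products[i + 1] + upstream
    return out


# An impulse at the input reproduces the (hypothetical) coefficients in order.
print(tdf_fir([1, 0, 0, 0, 0], [3, -1, 4, 2]))  # [3, -1, 4, 2, 0]
```

Because each register stores a partial sum, its width has to follow the dynamic range of that sum, which is the property the proposed optimization exploits.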

For fixed coefficient FIR filters, the bit widths of the input and all coefficients are known. This enables the bit width of the coefficient multiplier to be determined from its dynamic range. As the partial sums are delayed before they are added with the coefficient multiplier outputs in the structural adders, the bit widths of the structural adders increase monotonically from the first structural adder towards the output. Careful analysis revealed that for most filters, the bit width of the adder increases only from coefficient N-1 to about N/2, after which the bit width stays relatively constant and increases by no more than two bits. As the bit width of the coefficient multiplier output reduces towards the last tap, longer sign extension is required for these structural adders. This paper proposes an addition scheme to reduce the bit widths of these structural adders so that the total combinational logic is reduced at the expense of some register overhead. To determine if the area reduction is able to offset the overheads of additional adders and registers, a lower bound for the difference between the adder bit width and the coefficient multiplier output bit width is established analytically.
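The growth of the structural adder bit widths can be estimated from the coefficients alone. The sketch below is an illustration under stated assumptions, not the analysis used in this paper: it assumes a two's complement input of `input_bits` bits, integer coefficients, and a conservative worst-case bound in which the partial sum at tap i is limited by the largest input magnitude times the sum of |c_j| for j >= i. The coefficient set `COEFFS` and all function names are hypothetical.

```python
def signed_width(max_magnitude):
    """Bits of a two's complement word covering -max_magnitude..+max_magnitude."""
    return max_magnitude.bit_length() + 1


def multiplier_widths(coeffs, input_bits):
    """Worst-case width of each coefficient multiplier output."""
    max_in = 1 << (input_bits - 1)  # largest input magnitude
    return [signed_width(max_in * abs(c)) for c in coeffs]


def structural_adder_widths(coeffs, input_bits):
    """Worst-case width of the partial sum at each tap (tap 0 is the output)."""
    max_in = 1 << (input_bits - 1)
    widths, tail = [], 0
    # Walk from tap N-1 towards the output, accumulating coefficient magnitudes.
    for c in reversed(coeffs):
        tail += abs(c)
        widths.append(signed_width(max_in * tail))
    return widths[::-1]


# Hypothetical symmetric low-pass style coefficients, 8-bit input.
COEFFS = [2, -5, 12, 40, 90, 127, 127, 90, 40, 12, -5, 2]
adder = structural_adder_widths(COEFFS, 8)
mult = multiplier_widths(COEFFS, 8)
# Sign extension needed to align each multiplier output with its adder.
sign_ext = [a - m for a, m in zip(adder, mult)]
print(adder)
print(sign_ext)
```

For this hypothetical coefficient set the adder width changes by only one bit between the middle tap and the output, while the sign extension is longest at tap 0, which is the situation the proposed scheme targets.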

### II. PROPOSED STRUCTURAL ADDER OPTIMIZATION

The fundamental concept of our proposed method can be illustrated by an example in decimal. Let  $\{610, -274, 2, 258\}$  be a set of coefficient multiplier outputs to be accumulated to a large partial sum 1234567 by the structural adders in a tapped delay line. A straightforward approach is to add one number at a time from the set of smaller integers to the large integer. Alternatively, the integers in the set are summed first and then added to the large integer. The latter accumulation scheme, when implemented in hardware, requires both the large integer and the smaller integers to be stored at each tap. This incurs a large register overhead, which can be reduced if the large number is split into two smaller integers as shown in Fig. 2.

S. Umadevi, SENSE Department, VIT University, Tamilnadu, India.




Fig. 2 Example of adders size reductions for decimal number accumulation.

By partitioning the large integer into two halves, the register overhead is greatly reduced as only the fourth, overlapping digit has to be saved twice. The additional adder at the last step needs only a four-digit addition as the three least significant digits are all zeros. In addition, the reduction of the dynamic ranges of the operands simplifies the structural adder implementation and reduces the length of the sign extension.
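The decimal example can be checked numerically. The sketch below is a minimal Python illustration, assuming the large partial sum is split after its three least significant digits (the parameter `low_digits` and both function names are hypothetical); it is not a description of the hardware.

```python
def direct_accumulate(partial_sum, products):
    """Add the small integers to the large partial sum one at a time."""
    total = partial_sum
    for p in products:
        total += p  # every addition is as wide as the large partial sum
    return total


def split_accumulate(partial_sum, products, low_digits=3):
    """Accumulate onto the low part only, then merge with the high part."""
    base = 10 ** low_digits
    high, low = divmod(partial_sum, base)  # 1234567 -> (1234, 567)
    for p in products:
        low += p  # short additions on the low part
    # The low result may spill into one extra digit, which overlaps the
    # lowest digit of the high part and is merged by one short addition.
    carry, low_rest = divmod(low, base)
    return (high + carry) * base + low_rest


products = [610, -274, 2, 258]
print(direct_accumulate(1234567, products))  # 1235163
print(split_accumulate(1234567, products))   # 1235163
```

Both schemes give the same result; in the split scheme only the final merge touches the high part, and that merge is a four-digit addition.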



Fig. 3 Binary example on the optimization of last three adders of Fig. 2.

Fig. 3 shows the binary implementation of the proposed scheme on the last three coefficients of the filter example from Fig. 2. The reduction in adder lengths is observed to be several times larger than the register and adder overheads it incurs. Furthermore, the delays through the structural adders a2 and a1 have been reduced, while the delay through a0 is increased by one full adder delay. The slight increase in the delay through a0 is not an issue as, in most cases, there exists at least one tap (i > 0) for which delay(x·c0) < delay(x·ci). The full adder reduction for the structural adders can, however, be offset by the increase in flip-flop overhead. Therefore, information about the minimal difference between the addends of the structural adders is of interest.

### III. NORMAL & PROPOSED ADDERS IMPLEMENTATION:

Consider a large partial sum, 1234567, and the coefficient multiplier outputs {610, 214, 306, 3} of a TDF FIR filter. The accumulation by the normal and proposed methods is shown in Fig. 4.







Fig. 5 RTL Schematic of proposed & normal adders

Fig. 5 shows the RTL schematic of the proposed and normal adders. The simulation output for these adders is shown below.

B. Simulation output of adder

| Name      | Value (2,999,995 ps to 2,999,999 ps) |
|-----------|--------------------------------------|
| a[23:0]   | 1234567                              |
| b[11:0]   | 610                                  |
| c[11:0]   | 214                                  |
| d[11:0]   | 306                                  |
| e[11:0]   | 3                                    |
| sum[23:0] | 1235700                              |

Fig. 6 Simulation output of normal & proposed adder

C. Simulation report of normal & proposed adder:

The simulation report of the above methods, produced by Xilinx design ISE suite 14.3 for the Spartan 3E family, is shown in Table 1.

#### TABLE 1

Number of LUTs, number of bonded IOBs, and number of slices for the normal and proposed adders

| Logic utilization     | Used (Normal) | Used (Proposed) | Available | Utilization (Normal) | Utilization (Proposed) |
|-----------------------|---------------|-----------------|-----------|----------------------|------------------------|
| Number of slices      | 54            | 33              | 2448      | 2%                   | 1%                     |
| Number of LUTs        | 103           | 60              | 4896      | 2%                   | 1%                     |
| Number of bonded IOBs | 104           | 96              | 108       | 96%                  | 88%                    |

### IV. TRANSPOSED DIRECT FORM FIR FILTER (TDF) BY USING NORMAL & PROPOSED ADDER METHOD:

#### A. RTL Schematic of TDF FIR filter

The RTL schematic of the TDF FIR filter is shown in Fig. 7.



Fig. 7 RTL Schematic of TDF FIR filter

#### B. Simulation output of TDF FIR filter

1. Simulation output of TDF FIR filter by using normal adder method:

| Name            | Value (1,900 ns to 1,980 ns)              |
|-----------------|-------------------------------------------|
| b[0:3,11:0]     | [610,214,306,3]                           |
| data_out[65:0]  | 1235700                                   |
| data_in[31:0]   | 1234567                                   |
| clock           | 0                                         |
| reset           | z                                         |
| samples[0:4,31: | [1234567,1235177,1235391,1235697,1235700] |
| k[31:0]         | 5                                         |
| order[31:0]     | 4                                         |
| word_size_in[3: | 32                                        |
| word_size_out[  | 66                                        |

Fig. 8 Simulation output of TDF FIR filter by using normal adder method

2. Simulation output of TDF FIR filter by using proposed adder method:

| Name            | Value (1,999,995 ps to 1,999,999 ps) |
|-----------------|--------------------------------------|
| b[0:4,11:0]     | [610,214,306,3,567]                  |
| data_out[33:0]  | 1235700                              |
| data_in[31:0]   | 1234567                              |
| clock           | 1                                    |
| reset           | Z                                    |
| samples[0:5,31: | [567,1177,1391,1697,1700,1235700]    |
| k[31:0]         | 5                                    |
| order[31:0]     | 5                                    |
| word_size_in[3: | 32                                   |
| word_size_out[  | 34                                   |

Fig. 9 Simulation output of TDF FIR filter by using proposed adder method
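The `samples` traces in Figs. 8 and 9 can be reproduced with a few lines of Python. This is a hedged sketch of the arithmetic only: the function names are hypothetical, the split point of 1000 is inferred from the value 567 that appears in the proposed trace, and the widths reported are those of the unsigned magnitudes, not the word sizes of the synthesized design.

```python
from itertools import accumulate

PARTIAL_SUM = 1234567
COEFF_OUTPUTS = [610, 214, 306, 3]


def normal_trace(partial_sum, products):
    """Running sums when every product is added to the full partial sum."""
    return list(accumulate(products, initial=partial_sum))


def proposed_trace(partial_sum, products, split=1000):
    """Running sums on the low part, followed by one merge with the high part."""
    high, low = divmod(partial_sum, split)
    trace = list(accumulate(products, initial=low))
    trace.append(high * split + trace[-1])  # final merged addition
    return trace


normal = normal_trace(PARTIAL_SUM, COEFF_OUTPUTS)
proposed = proposed_trace(PARTIAL_SUM, COEFF_OUTPUTS)
print(normal)    # [1234567, 1235177, 1235391, 1235697, 1235700]
print(proposed)  # [567, 1177, 1391, 1697, 1700, 1235700]

# Widest value each chain of structural adders has to carry.
print(max(v.bit_length() for v in normal))         # 21
print(max(v.bit_length() for v in proposed[:-1]))  # 11
```

In this example the intermediate sums of the proposed method fit in 11 bits instead of 21, and only the last addition operates on the full-width value.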

The simulation report of the above methods, produced by Xilinx design ISE suite 14.3 for the Spartan 3E family, is shown in Table 2.

TABLE 2

Number of LUTs, number of bonded IOBs, and number of slices for the TDF FIR filter

| Logic utilization     | Used (Normal) | Used (Proposed) | Available | Utilization (Normal) | Utilization (Proposed) |
|-----------------------|---------------|-----------------|-----------|----------------------|------------------------|
| Number of slices      | 247           | 81              | 2448      | 16%                  | 4%                     |
| Number of LUTs        | 124           | 40              | 4896      | 6%                   | 2%                     |
| Number of bonded IOBs | 100           | 36              | 108       | 151%                 | 54%                    |

TABLE 3

Area & Power Calculation of TDF FIR Filter

| Area analysis (micro-meter), Normal | Area analysis (micro-meter), Proposed | Power analysis (nW), Normal | Power analysis (nW), Proposed |
|-------------------------------------|---------------------------------------|-----------------------------|-------------------------------|
| 11249                               | 5430                                  | 405235.5                    | 99615.7                       |

### V. IMPLEMENTATION RESULTS

Fig. 6 shows the simulation output of the normal and proposed adders; these results were verified with Xilinx design ISE suite 14.3 (Spartan 3E family, device XC3S100). Table 1 compares the normal and proposed adders. Figs. 8 and 9 show the simulation outputs of the TDF FIR filter using the normal method and the proposed method, respectively. These results were verified with Xilinx design ISE suite 14.3 for the Spartan 3E family, and the area and power analysis was carried out with Cadence RTL compiler using 0.18µm TSMC CMOS libraries.

#### VI. CONCLUSION

This paper presents a new method to reduce the total area and power of fixed coefficient transposed direct form FIR filters with a large number of taps by minimizing the bit widths of the structural adders. Sign extensions have been shortened and the delays through the structural adders have been reduced at the expense of some register overhead and a reduced-size merged adder for each bisection of a long partial sum. Theoretical estimates show an area reduction of 4.5% to 9% and a power reduction of 10% to 30% for the structural adder block of the benchmark filters.

