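## D16-082 Design Group

An Over Gb/s LDPC-CC Decoder Design Using the Overlapped Architecture (使用重疊架構之超越 Gb/s 低密度同位元檢查迴旋碼解碼器設計)

2016 Macronix Golden Silicon Awards (旺宏金矽獎), Semiconductor Design and Application Competition

Team name: 海底總動員 / OCEAN

Team leader: 林謙, Institute of Electronics, National Chiao Tung University

Team members: 劉榮傑, Institute of Electronics, National Chiao Tung University; 張博珣, Institute of Electronics, National Chiao Tung University

## Advisor

張錫嘉, Department of Electronics Engineering, National Chiao Tung University

He received his Ph.D. from the Institute of Electronics, National Chiao Tung University, and currently serves as Deputy Director of the National Chip Implementation Center, National Applied Research Laboratories. He previously worked at MediaTek as a deputy manager and returned to National Chiao Tung University to teach in February 2003. His honors include the university's annual Outstanding Teaching Award, the Outstanding Young Electrical Engineer Award of the Chinese Institute of Electrical Engineering, and the Outstanding Young Scholar Award of the Taiwan IC Design Society. To date he has authored 1 book, 36 SCI/EI journal papers, and 79 papers at IEEE and other domestic and international conferences, and holds 48 domestic and international patents. He was promoted to full professor in August 2010 and has since served as Associate Vice President for Research and Development at National Chiao Tung University, as an IEEE A-SSCC TPC member, and as an Associate Editor of IEEE Trans. on Circuits and Systems.

Research areas: integrated circuits and systems, signal design, encoding/decoding, signal detection, information theory, and quantum communication.

## Project Summary

In modern wireless communication systems, electromagnetic waves propagating through the air suffer from interference and path loss, and in more complex environments multipath propagation and the Doppler effect must also be considered. In communication systems these phenomena are collectively referred to as the channel, or noise, and they cause the data received at the receiver to contain errors (that is, to differ from what was transmitted). To avoid repeatedly asking the transmitter to resend the data, error-correcting codes are introduced. This mechanism adds protection data to the original data, and with this redundancy the receiver gains some ability to correct errors. Low-Density Parity-Check codes (LDPC) have been studied extensively in recent years because of their strong correcting capability and highly parallel computation, and they have been adopted or discussed in many standards, such as IEEE 802.11n and IEEE 802.11ac.

Future communication systems, however, will have to cope with more complex channel environments and more forms of data transmission. Error-correcting codes therefore need to offer more choices of code rate and data block length to suit different usage scenarios. For example, when following a hit TV series on the high-speed rail, a low-rate code with a long block length is needed to provide enough correcting capability and to support large file transfers. When using short-range IoT devices at home, where channel quality is good and the amount of data is small, a high-rate code with a short block length is needed instead. Among the many error-correcting codes, Low-Density Parity-Check Convolutional Codes (LDPC-CC) are a strong option for meeting these varied future demands.

In recent years, mobile communication devices, wearable devices, and the Internet of Things have risen rapidly and set off a trend, and all of these applications require low-power ICs. This competition entry therefore designs an LDPC-CC decoder with a data rate above 5 Gbps and lower power consumption than the decoders in the literature. Because storage elements account for a large share of a decoder's area and power, we propose overlapped decoding at the algorithm level, which greatly reduces the amount of storage. For the storage units, an SRAM-register hybrid FIFO (first-in, first-out) architecture achieves an excellent trade-off between throughput and power. To reach a throughput above 5 Gbps, the folding technique is further adopted to increase computational parallelism. In the data path, we redesigned the allocation of pipeline stages so that the critical path is shortened further, which raises throughput. The work is implemented in a UMC 65-nm CMOS process, and the proposed LDPC-CC decoder occupies 1.19 mm<sup>2</sup>. Operating at a 322 MHz clock, it reaches a throughput of 7.728 Gbps, with a maximum energy consumption of 8.87 pJ/bit/iteration. The entry supports four code rates and data blocks of arbitrary length.

Fig 1. Hardware diagram of the overlapped architecture

Fig 2. Decoder chip

## **Abstract**

Wireless communication now plays an important role in daily life, so handling the signal distortion that occurs during over-the-air transmission has become a major concern.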
Besides interference and spatial degradation, effects such as Doppler shift and multipath propagation also cause errors in the received data, so that it differs from what was transmitted. Because simply asking the transmitter to re-transmit is inefficient, Error Control Codes (ECCs) are introduced. By adding redundancy to a message, these techniques enable the receiver to detect and correct most errors, so the original information is protected. Low-Density Parity-Check (LDPC) codes, one family of error control codes, have received wide attention recently because of their strong correcting capability, and they have been adopted by many communication standards such as IEEE 802.11n and IEEE 802.11ac.

With new technological advances, channel conditions and users' needs will become more complicated in the future. For example, a low-rate code with a long block length is needed when watching a popular TV series on a high-speed rail, because it provides better correcting ability and supports larger data transfers. In contrast, when using IoT (Internet of Things) devices at home, a high-rate code with a short block length is preferred because the distance is short, the channel has less interference, and the amount of data is small. Next-generation ECCs must therefore support flexible code rates and data lengths for different scenarios. Among the many ECCs, Low-Density Parity-Check Convolutional Codes (LDPC-CCs) are a promising candidate for satisfying users' diverse requirements in the future.

In recent years, as portable devices and the IoT have become widespread, the relevant designs are expected to be low-power. We propose an LDPC-CC decoder that consumes less power than previous LDPC-CC works and is designed for a throughput above 5 Gbps. In this work, an overlapped architecture is proposed to reduce the storage elements, which consume a large share of the area and power in traditional decoders.
We also use an SRAM-register hybrid-partitioned FIFO to strike a good balance between throughput and power consumption. Aiming at a throughput above 5 Gbps, we employ the folding technique to increase parallelism and redesign the pipeline stages to shorten the critical path. The proposed decoder is implemented in a UMC 65-nm CMOS process and achieves 7.728 Gbps at a clock rate of 322 MHz with an area of 1.19 mm<sup>2</sup>. The maximum energy consumption is 8.87 pJ/bit/iteration. The work supports four code rates and variable data block lengths.

| Parameter | Value |
|------------|------------------------------------------------------------------|
| Technology | UMC 65nm LL |
| Code | (215,3,6) LDPC-CC with T=3 |
| Code rates | 1/2, 2/3, 3/4, 4/5 |
| | 1135 |
| | 6 bits |
| | 73.3% |
| | 6 |
| | 6 |
| | 47.04 Kbits |
| Area | 1.19 mm<sup>2</sup> |
| Clock rate | 346 MHz<sup>(1)</sup>, 322 MHz<sup>(2)</sup> |
| Throughput | 2.076 Gb/s<sup>(1)</sup>, 7.728 Gb/s<sup>(2)</sup> |
| | 372.2<sup>(1)</sup>, 410.5<sup>(2)</sup> |

<sup>(1)</sup> Measured with code rate 1/2

<sup>(2)</sup> Measured with code rate 4/5

Fig 3. Chip measurement results
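
The clock rates and throughputs reported in the chip measurement results can be cross-checked with a little arithmetic. The sketch below divides each measured throughput by its clock rate to get the number of bits delivered per clock cycle. The names `MEASUREMENTS` and `bits_per_cycle` are illustrative and do not come from the original work, and reading the result as bits per cycle assumes the decoder produces output on every clock cycle.

```python
from fractions import Fraction

# Operating points copied from the chip measurement results:
# code rate -> (clock rate in Hz, throughput in bit/s).
# Integers are used so the division below is exact.
MEASUREMENTS = {
    "1/2": (346_000_000, 2_076_000_000),
    "4/5": (322_000_000, 7_728_000_000),
}


def bits_per_cycle(clock_hz: int, throughput_bps: int) -> Fraction:
    """Return throughput divided by clock rate, i.e. bits delivered per cycle."""
    return Fraction(throughput_bps, clock_hz)


if __name__ == "__main__":
    for rate, (clock_hz, throughput_bps) in MEASUREMENTS.items():
        per_cycle = bits_per_cycle(clock_hz, throughput_bps)
        print(f"code rate {rate}: {per_cycle} bits per clock cycle")
```

Under that assumption the two operating points work out to 6 and 24 bits per cycle. The factor of four between them matches the ratio of the numerators of the two code rates (1/2 and 4/5), which would be consistent with the throughput being counted in information bits, although the source does not state this.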