

# Light-Field Factorization Processor and Dual-Layer Factored 3D Display

Team name: We Are Back !? (我們回來啦!?)

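Team leader: 翁笠群 / Graduate Institute of Electrical Engineering, National Tsing Hua University

Team members: 陳立得 / Graduate Institute of Electrical Engineering, National Tsing Hua University

林楷平 / Graduate Institute of Electrical Engineering, National Tsing Hua University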

### Abstract

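Now that flat-panel display resolution has surpassed the limit of human vision, we are passionately pursuing more immersive visual experiences, pinning our imagination on display applications such as VR/AR and dreaming of stepping into the science-fiction world of the Metaverse. Meanwhile, the rapid rise of computational photography on mobile platforms has prompted us to ask how algorithms and computing power could open up new possibilities for future display technology. Among the many approaches to 3D display, the factored light-field display benefits from its multi-layer LCD structure and from algorithmic computation: it shows 3D images without sacrificing spatial resolution, supports full parallax (horizontal and vertical), and resolves the vergence-accommodation conflict common in head-mounted displays, providing a more immersive audiovisual experience. However, the algorithm it relies on is demanding: even on a high-end GPU, each image takes tens of seconds to process and consumes hundreds of watts.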

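This work, "Light-Field Factorization Processor and Dual-Layer Factored 3D Display", solves the computation pain point of factored light-field displays. A purpose-built light-field factorization processor accelerates the computation, and we built the accompanying display-driving peripherals and a dual-layer factored display prototype as a proof of concept. Our processor is the world's first light-field factorization accelerator chip and supports light-field factorization output at up to HD 30 fps. Implemented in TSMC 40nm technology, it completes the factorization of a 7x7-view light field at an energy efficiency of 4.4-16.2 nJ/pixel. The finished display shows full-parallax light-field images and provides a natural, immersive 3D experience. Every element of a computational display (display panel, display driver IC, and logic computation IP design) is closely tied to Taiwan's industry. Our team connects these resources to explore new applications of immersive display, and we hope that demonstrating this chip and its accompanying system will promote the development and adoption of 3D displays.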



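Fig. 1 Algorithm flow and core components. The light-field factorization algorithm converts multi-view light-field images into the display images for a factored display. This work consists of a dual-layer factored display, display peripheral circuits, and a light-field factorization processor.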

[Die micrograph, 2.69 mm × 2.69 mm. Labeled blocks: DV Buffer, MV Buffer, Light-Field Factorization Engine]

| Specification        | Value            |
|----------------------|------------------|
| Technology           | 40nm CMOS        |
| Chip Area            | 2.7 mm × 2.7 mm  |
| Core Area            | 2.2 mm × 2.2 mm  |
| Gate Count           | 5.9 M            |
| SRAM Size            | 75.1 KB          |
| IO V<sub>DD</sub>    | 3.5 V            |
| Core V<sub>DD</sub>  | 0.61 - 1.06 V    |
| Frequency            | 25 - 200 MHz     |

| Performance                                        | Configuration 1  | Configuration 2  | Configuration 3 |
|----------------------------------------------------|------------------|------------------|-----------------|
| Operating Condition (Core V<sub>DD</sub>, °C, MHz) | (1.04, 25, 200)  | (1.06, 25, …)    | (1.06, 25, 180) |
| Chip Configuration (Rank, Iteration)               | (1, 25)          | (2, 50)          | (4, 100)        |
| Power Consumption (mW)                             | 283              | 442              | 971             |
| Normalized Energy (nJ/pixel)                       | 13               | 33               | 128             |
| Throughput (Mpixels/s)                             | 30.9 (HD 34fps)  | 15.6 (HD 17fps)  | 7.9 (HD 9fps)   |

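Fig. 2 Chip micrograph and specification summary. The world's first light-field factorization accelerator chip, implemented in TSMC 40nm technology.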
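The frame rates quoted in parentheses in the throughput row can be cross-checked against the Mpixels/s figures. The short sketch below does that arithmetic under one assumption that the source does not state: that "HD" means 1280 × 720 pixels per frame. The helper name `frames_per_second` is illustrative only.

```python
# Assumption: "HD" is taken to mean 1280 x 720 pixels per frame.
HD_PIXELS_PER_FRAME = 1280 * 720

# (rank, iterations) -> throughput in Mpixels/s, copied from the specification table.
THROUGHPUT_MPIXELS = {(1, 25): 30.9, (2, 50): 15.6, (4, 100): 7.9}


def frames_per_second(mpixels_per_s, pixels_per_frame=HD_PIXELS_PER_FRAME):
    """Convert a pixel throughput in Mpixels/s into frames per second."""
    return mpixels_per_s * 1e6 / pixels_per_frame


if __name__ == "__main__":
    for (rank, iterations), mpixels in THROUGHPUT_MPIXELS.items():
        fps = frames_per_second(mpixels)
        print(f"rank={rank}, iterations={iterations}: {fps:.1f} fps")
```

Under this assumption the three configurations come out at about 33.5, 16.9, and 8.6 frames per second, which round to the 34, 17, and 9 fps listed in the table.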

## Advisor

#### 黃朝宗, Department of Electrical Engineering, National Tsing Hua University

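He received his Ph.D. from the Graduate Institute of Electronics Engineering, National Taiwan University, and is now an associate professor in the Department of Electrical Engineering, National Tsing Hua University. He previously worked at Novatek Microelectronics and was a postdoctoral researcher at the Massachusetts Institute of Technology. His honors include the Young Scholar Innovation Award from the Foundation for the Advancement of Outstanding Scholarship, the FutureTech Award, the Outstanding Young Electrical Engineer Award from the Chinese Institute of Electrical Engineering, the National Tsing Hua University Outstanding Teaching Award, and the Best Advisor Award of the Macronix Golden Silicon Awards.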

#### Research Areas

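His recent research focuses on high-performance, high-quality computer vision and computational photography applications, including convolutional neural network processors, stereoscopic 3D light-field displays, and light-field cameras. He is one of the very few researchers in Taiwan who publish top-tier papers across three active research fields: computer architecture (ISCA/MICRO), chip design (ISSCC/VLSIC/ESSCIRC), and computer vision (CVPR/ICCV/TPAMI).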



We are always after a realistic, immersive visual experience. As the spatial resolution of displays gradually exceeds the limit of human vision, our enthusiasm has shifted to more advanced display technologies such as VR and AR, and we dream of someday living in the Metaverse. At the same time, computational photography has achieved great success on mobile platforms, which points to the potential of combining algorithms with display technology. One such computational display, the factored light field display, stands out among 3D displays for its immersive 3D experience. Taking advantage of a multi-layer structure and a factorization algorithm, it provides full-parallax light fields without sacrificing spatial resolution. It can also resolve the vergence-accommodation conflict widely observed in VR displays. However, the required algorithm creates a computation bottleneck: complex computation and excessive memory access. On high-end GPU platforms, it takes tens of seconds to process an image while consuming hundreds of watts.

Our work, "Light-Field Factorization Processor and Dual-Layer Factored 3D Display", solves this computation pain point with a software-hardware co-optimized algorithm and delivers a 3D display with real-time processing. We address the computation issue with a specialized light field factorization chip and build a dual-layer factored display with display driving peripherals as a proof of concept. Our light field factorization chip is the first chip for factored displays. It is fabricated in TSMC 40nm CMOS technology and offers light field factorization throughput of up to 30 fps. Across different application scenarios, it processes a 7x7 light field at an energy efficiency of 4.4-16.2 nJ/pixel. The factored display refreshes at 240 Hz and plays light fields at up to 30 fps. Combining these modules, our work delivers a natural, immersive 3D experience by displaying a full-parallax light field. All essential elements of a computational display (panel, driving IC, and computational logic IC) have strong connections with industries in Taiwan. We integrate these resources to explore new display applications, demonstrate an efficient 3D display system, and hope it can facilitate the development of 3D displays.
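To make "light field factorization" concrete, here is a minimal sketch of one textbook formulation: a weighted non-negative matrix factorization solved with multiplicative updates. It is not the co-optimized algorithm that runs on the chip, which this summary does not describe. The sketch assumes a one-dimensional "flatland" slice (one row of pixels, 7 views), two multiplicative layers, and a ray in view `v` that passes front pixel `x` and rear pixel `x + v - 3`. All function names are hypothetical, and the sketch omits the clamping to the panel's transmittance range and the 1/rank time-averaging that a physical display would need.

```python
import math
import random


def valid_rays(num_views, n):
    """List (view, front_pixel, rear_pixel) for every ray that hits both layers."""
    half = num_views // 2
    return [(v, x, x + v - half)
            for v in range(num_views)
            for x in range(n)
            if 0 <= x + v - half < n]


def reconstruct(front, rear, x, y):
    """Ray intensity: sum over time-multiplexed frames of front[x] * rear[y]."""
    return sum(f[x] * b[y] for f, b in zip(front, rear))


def reconstruction_error(light_field, front, rear):
    """Sum of squared errors over all valid rays."""
    n = len(light_field[0])
    return sum((light_field[v][x] - reconstruct(front, rear, x, y)) ** 2
               for v, x, y in valid_rays(len(light_field), n))


def factorize_dual_layer(light_field, rank, iterations, seed=0, eps=1e-9):
    """Weighted NMF by multiplicative updates; returns rank patterns per layer."""
    num_views, n = len(light_field), len(light_field[0])
    rays = valid_rays(num_views, n)
    rng = random.Random(seed)
    front = [[rng.uniform(0.1, 1.0) for _ in range(n)] for _ in range(rank)]
    rear = [[rng.uniform(0.1, 1.0) for _ in range(n)] for _ in range(rank)]
    for _ in range(iterations):
        # Alternate: update the front layer with the rear fixed, then the reverse.
        for update_front in (True, False):
            num = [[0.0] * n for _ in range(rank)]
            den = [[0.0] * n for _ in range(rank)]
            for v, x, y in rays:
                target = light_field[v][x]
                approx = reconstruct(front, rear, x, y)
                for r in range(rank):
                    if update_front:
                        num[r][x] += target * rear[r][y]
                        den[r][x] += approx * rear[r][y]
                    else:
                        num[r][y] += target * front[r][x]
                        den[r][y] += approx * front[r][x]
            layer = front if update_front else rear
            for r in range(rank):
                for i in range(n):
                    layer[r][i] *= num[r][i] / (den[r][i] + eps)
    return front, rear


def demo_light_field(num_views=7, n=16):
    """Synthetic target: a cosine pattern that shifts by half a pixel per view."""
    half = num_views // 2
    return [[0.5 + 0.5 * math.cos(2 * math.pi * (x + 0.5 * (v - half)) / n)
             for x in range(n)] for v in range(num_views)]


if __name__ == "__main__":
    lf = demo_light_field()
    # (rank, iterations) pairs mirror the chip configurations in the spec table.
    for rank, iterations in [(1, 25), (2, 50), (4, 100)]:
        front, rear = factorize_dual_layer(lf, rank, iterations)
        print(rank, iterations, reconstruction_error(lf, front, rear))
```

Each iteration visits every ray twice and touches every rank term, so the work grows with the number of views, the rank, and the iteration count. That scaling is one way to read the abstract's remark about complex computation and heavy memory access.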



Fig. 3 Overview of the 3D display system. The system consists of (a) a dual-layer factored display, (b) driving peripherals, and (c) a light field factorization processor.
