# Thermal-Aware Placement and Routing for 3D Optical Networks-on-Chips

Fengxian Jiao<sup>1</sup>, Sheqin Dong<sup>1</sup>, Bei Yu<sup>2</sup>, Bing Li<sup>3</sup>, and Ulf Schlichtmann<sup>3</sup>

<sup>1</sup>Department of Computer Science & Technology, Tsinghua University, Beijing, China <sup>2</sup>CSE Department, The Chinese University of Hong Kong, Hong Kong <sup>3</sup>Chair of Electronic Design Automation, Technical University of Munich (TUM), Munich, Germany

Abstract—Many-core chip architectures integrate tens to hundreds of processor cores on a single chip. Recent development of photonic interconnects has made Optical Networks-on-Chips (ONoCs) an attractive technology to overcome the drawbacks of electrical networks-on-chips. With ultra-high bandwidth, low latency, and great energy efficiency, ONoCs enable the designer to build scalable systems. However, photonic devices are sensitive to temperature fluctuations, and hence, require proactive management. This paper first calculates the thermal distribution from cell distribution using an approximated Green's function and proposes a post-placement algorithm to reduce the number of photonic devices in the hotspots. The paper then improves the routing algorithm considering bending loss and temperature variations. Experimental results also verify the efficiency and effectiveness of our algorithm.

## I. INTRODUCTION

As the number of cores on Networks-on-Chips (NoCs) continues to climb, electrical interconnects increasingly fall short in both communication bandwidth and power consumption. Photonic technology has been proposed to overcome these drawbacks of electrical links, offering high bandwidth and low latency.

Usually, photonic devices need to operate at a specific temperature, since the wavelength that an individual microring responds to is highly sensitive to temperature variations [1]. For example, Fig. 1 shows a Photonic Switching Element (PSE) with resonant wavelength $\lambda_1$. A signal with wavelength $\lambda_1$ is coupled into the ring, intensified by resonance and decoupled into another waveguide. However, as the temperature increases, the resonant wavelength shifts to $\lambda_1 + \Delta \lambda$ and the signal fails to couple. In a typical many-core system, on-chip temperatures commonly reach 90°C due to increasing power densities [2]. Such high operating temperatures also influence the propagation performance of waveguides, since the refractive index of silicon is very sensitive to temperature fluctuations [1].

In this paper, we propose PROTON+T, a thermal-aware post-placement and routing algorithm for 3D ONoCs. The architecture is composed of two layers, an electronic layer and a photonic layer, which are connected by arrays of Through-Silicon-Vias (TSVs), as shown in Fig. 2. The processor cores are organized into clusters on the electronic layer. Each cluster contains an electronic router for intra-cluster communication. Signals of inter-cluster communication are guided to the optical layer via the TSV array. The optical layer consists of memory controllers and hubs, which are responsible for the communication between off-chip memory and clusters. An $8 \times 8$ Wavelength Routed Optical NoC (WRONoC) is used to connect the memory controllers and hubs. Fig. 3 shows the logic scheme of the $8 \times 8$ $\lambda$-Router. However, power consumption varies significantly between different possible layout implementations for this logic scheme. Hence, there is significant potential to minimize the power consumption by optimizing the layout. Note that although there are some previous works [3], [4] on 2D optical interconnect synthesis, none of them considers the temperature issue for 3D ONoCs at network level. Our work is an extension of PROTON [5], [6] and PLATON [7] with thermal-aware post-placement and an improved routing algorithm.

Fig. 1: Impact of temperature increase in Photonic Switching Element (PSE).

Fig. 2: Target architecture [6].

With the force-directed method, we place the optical devices on the chip, avoiding overheated areas. More specifically, our algorithm first calculates the thermal distribution from the cell distribution using an approximated Green's function and then separates the force into four components, one of which is derived from the generated heat flux. For the routing algorithm, we improved the algorithm of [6] by taking the bending loss into consideration. Additionally, we propose an approach to route waveguides while avoiding the hotspots.

Fig. 3: Logic scheme of the $8 \times 8$ $\lambda$-Router [8].

This work is supported in part by NSFC 61176022, and The Research Grants Council of Hong Kong SAR (Project No. CUHK24209017).

The rest of this paper is organized as follows: Section II describes the details of the post-placement algorithm. Section III gives the routing algorithm. The experimental results are presented in Section IV and the conclusion is provided in Section V.

## II. POST-PLACEMENT

As mentioned above, PLATON [7] is the first force-directed placement algorithm for 3D optical NoCs. PROTON+T is based on the PLATON placement result and works in the post-placement stage. Similar to other quadratic force-directed placers [9], we place the PSEs with a virtual force in each iteration. In this paper, we separate the force into the following four components: a net force, a move force, a hold force and a thermal force. The thermal force moves the PSEs away from high-temperature regions to diminish effects of thermal variation on performance.

Since the heat is caused mainly by the electrical layer, we calculate the thermal distribution based on the distribution of the cores on the electrical layer, using Green's function. Then we take the obtained thermal distribution as the thermal profile of the placement on the photonic layer. The heat flux vector is generated by taking the derivatives of the thermal profile. After that, a thermal move force based on the heat flux spreads the PSEs over the optical layer.

### A. Thermal Distribution

Since the PSEs obviously cannot be moved outside of the die, we define the heat flux at the die boundaries to be zero. Hence, the thermal equation can be written in the following form [10]:

$$\nabla^2 T(x,y) + \frac{p(x,y)}{k} = \rho c_p \frac{\partial T(x,y,t)}{\partial t}, \tag{1}$$

$$\begin{aligned} \frac{\partial T(x,y)}{\partial x} &= 0, \qquad x = 0,\ x = w, \\ \frac{\partial T(x,y)}{\partial y} &= 0, \qquad y = 0,\ y = h, \end{aligned} \tag{2}$$

where T(x, y) denotes the temperature at position (x, y), p(x, y) denotes the power density, k denotes the thermal conductivity, $c_p$ denotes the specific heat, $\rho$ denotes the material density, and w, h represent the dimensions of the die. The heat flux vector can be computed from the thermal distribution directly.

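The steady state of Equation (1) is given in the form of the well-known Poisson's equation, $\nabla^2 T(x,y) = -p(x,y)/k$. This problem can be solved with a Green's function $G(x,y,x_0,y_0)$, which satisfies:

$$\nabla^2 G(x,y,x_0,y_0) = -\delta(x - x_0)\,\delta(y - y_0). \tag{3}$$

Then, the temperature distribution can be written as: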

$$T(x,y) = \int_0^h dy_0 \int_0^w G(x,y,x_0,y_0) \frac{p(x_0,y_0)}{k} dx_0. \tag{4}$$

The expression (4) implies that T(x, y) can be obtained by Green's function G.
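As a rough illustration of expression (4), the following sketch evaluates the integral by midpoint quadrature on a coarse power grid. The paper does not specify which approximation of Green's function it uses; the truncated cosine series below is an assumption (a standard eigenfunction expansion for a rectangle with zero-flux boundaries), and the names `neumann_green` and `temperature_profile` are hypothetical. The constant mode is dropped, so the resulting temperatures are relative to the die average.

```python
import math

def neumann_green(x, y, x0, y0, w, h, modes=8):
    """Truncated cosine-series Green's function on a w-by-h rectangle with
    zero-flux boundaries (assumed form; the constant (0, 0) mode is dropped)."""
    total = 0.0
    for m in range(modes):
        for n in range(modes):
            if m == 0 and n == 0:
                continue
            lam = (m * math.pi / w) ** 2 + (n * math.pi / h) ** 2
            norm = (1 if m == 0 else 2) * (1 if n == 0 else 2) / (w * h)
            total += (norm / lam
                      * math.cos(m * math.pi * x / w) * math.cos(m * math.pi * x0 / w)
                      * math.cos(n * math.pi * y / h) * math.cos(n * math.pi * y0 / h))
    return total

def temperature_profile(power, w, h, k, modes=8):
    """Midpoint-rule evaluation of (4): T = sum over cells of G * p / k * area."""
    rows, cols = len(power), len(power[0])
    dx, dy = w / cols, h / rows
    area = dx * dy
    T = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            x, y = (j + 0.5) * dx, (i + 0.5) * dy
            for i0 in range(rows):
                for j0 in range(cols):
                    if power[i0][j0] == 0.0:
                        continue  # cells without power do not contribute
                    x0, y0 = (j0 + 0.5) * dx, (i0 + 0.5) * dy
                    g = neumann_green(x, y, x0, y0, w, h, modes)
                    T[i][j] += g * power[i0][j0] / k * area
    return T

# A single hot cell in one corner of a 4 x 4 grid (illustrative values only).
power = [[0.0] * 4 for _ in range(4)]
power[0][0] = 1.0
T = temperature_profile(power, w=1.0, h=1.0, k=1.0)
```

The quadruple loop is only practical for small grids; it is meant to show the structure of (4), not an efficient solver.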

### B. Thermal Force

The thermal force moves modules from high-temperature areas to low-temperature areas. To this end, the heat flux vector $\mathbf{T}'_x$ is computed by taking the derivative of T(x, y) with respect to x:

$$\mathbf{T}_{x}^{'} = \frac{\partial}{\partial x}T(x,y). \tag{5}$$

Then the thermal force $F_{x,i}^T$ is given as $F_{x,i}^T = -\theta_i T_{x,i}'$, where $T_{x,i}'$ is the thermal flux of module *i* and the $\theta_i$ are positive weights, so that the thermal force is opposite in sign to the temperature derivative. The total thermal force vector can be written as:

$$\mathbf{F}_{x}^{T} = (F_{x,1}^{T}, \dots, F_{x,M}^{T}) = -\boldsymbol{\theta} \mathbf{T}_{x}^{'}, \tag{6}$$

where the weights are collected in the diagonal matrix $\boldsymbol{\theta} = \mathrm{diag}(\theta_i)$.
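A minimal sketch of how (5) and the per-module thermal force could be evaluated on a sampled temperature grid is given below. The finite-difference scheme, the function names, and the mapping of modules to grid bins are assumptions for illustration; the grid is assumed to have at least two columns.

```python
def thermal_gradient_x(T, dx):
    """Approximate dT/dx with central differences (one-sided at the die edges)."""
    rows, cols = len(T), len(T[0])
    grad = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            lo, hi = max(j - 1, 0), min(j + 1, cols - 1)
            grad[i][j] = (T[i][hi] - T[i][lo]) / ((hi - lo) * dx)
    return grad

def thermal_force_x(T, dx, modules, theta):
    """F^T_{x,i} = -theta_i * T'_{x,i}, with module i located in bin (row, col)."""
    grad = thermal_gradient_x(T, dx)
    return [-theta[i] * grad[r][c] for i, (r, c) in enumerate(modules)]

# One module in the middle bin of a 1 x 3 temperature strip (illustrative values).
forces = thermal_force_x([[0.0, 1.0, 4.0]], 1.0, [(0, 1)], [0.5])
```

The y-component follows in the same way from the derivative along the rows.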

### C. Other Forces

Similar to PLATON [7], the net force, the move force and the hold force can be written in the following forms:

$$\mathbf{F}_{x}^{net} = \nabla_{x} \Gamma_{x} = \mathbf{C}_{x} \mathbf{x} + \mathbf{d}_{x}, \tag{7}$$

$$\begin{aligned} \mathbf{F}_{x}^{move} &= \mathbf{Q} \mathring{\mathbf{C}}_{netm,x} (\mathbf{X}_{netm} - \mathring{\mathbf{X}}_{netm}) \\ &= \frac{1}{2} \mathbf{Q} \mathring{\mathbf{C}}_{netm,x} \mathbf{Q}^{T} (\mathbf{x} - \mathbf{x}') + \mathbf{Q} \mathring{\mathbf{C}}_{netm,x} \mathbf{\Phi}_{netm,x}, \end{aligned} \tag{8}$$

$$\mathbf{F}_{x}^{hold} = -\left(\mathbf{C}_{x}\mathbf{x}' + \mathbf{d}_{x}\right). \tag{9}$$

Due to space limitations, the reader is referred to [7] for a detailed explanation of all symbols. Finally, we set the sum of these four forces to zero:

$$\mathbf{F}_{x}^{net} + \mathbf{F}_{x}^{T} + \mathbf{F}_{x}^{move} + \mathbf{F}_{x}^{hold} = \mathbf{0}. \tag{10}$$

Substituting the force expressions into (10) and writing $\Delta \mathbf{x} = \mathbf{x} - \mathbf{x}'$ yields:

$$(\mathbf{C}_{x} + \frac{1}{2}\mathbf{Q}\mathring{\mathbf{C}}_{netm,x}\mathbf{Q}^{T})(\Delta \mathbf{x}) = \boldsymbol{\theta} \mathbf{T}_{x}^{'} - \mathbf{Q}\mathring{\mathbf{C}}_{netm,x}\boldsymbol{\Phi}_{netm,x}. \tag{11}$$

We solve the linear equation system (11) to determine the displacement of x and repeat the procedure for a predefined number of iterations. The process of post-placement is summarized in Algorithm 1.
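The iteration can be sketched as follows. This is not the authors' MATLAB implementation: the dense Gaussian elimination stands in for whatever solver is actually used, and `build_system` is a hypothetical callback that assembles the left-hand matrix and right-hand side of (11) for the current positions.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def post_place(x, build_system, num_iters):
    """Repeat: assemble (11) at the current positions, solve for the
    displacement, and move the PSEs by that displacement."""
    x = list(x)
    for _ in range(num_iters):
        # Hypothetical callback: A = C_x + 0.5 * Q C Q^T, b = theta T'_x - Q C Phi.
        A, b = build_system(x)
        delta = solve_linear(A, b)
        x = [xi + di for xi, di in zip(x, delta)]
    return x
```

For realistic problem sizes a sparse iterative solver would likely be preferable to dense elimination; the dense version is used here only to keep the sketch self-contained.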

## III. ROUTING

For simplicity, we only allow horizontal and vertical routing of waveguides. The objective of routing is to reduce the maximum insertion loss of waveguides. There are three major types of waveguide loss: propagation loss, crossing loss and $90^{\circ}$ bending loss. In [6], the authors only considered the first two types. However, TABLE I shows that the $90^{\circ}$ bending loss exceeds the crossing loss in the worst case (0.5 dB versus 0.2 dB).
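To make the trade-off between the three loss types concrete, the sketch below sums them for a single waveguide path. The function name and the example path values are hypothetical; the default coefficients are the ones used in the experiments of Section IV (1.5 dB/cm, 0.15 dB per crossing, 0.15 dB per bend), and other loss contributions such as drop loss are ignored.

```python
def insertion_loss_db(length_cm, crossings, bends,
                      prop_db_per_cm=1.5, crossing_db=0.15, bend_db=0.15):
    """Waveguide loss of one path: propagation + crossing + bending loss."""
    return prop_db_per_cm * length_cm + crossing_db * crossings + bend_db * bends

def max_insertion_loss_db(paths):
    """Maximum loss over (length_cm, crossings, bends) tuples, i.e. il_max."""
    return max(insertion_loss_db(*p) for p in paths)

# Two illustrative paths: a short one with many bends and a long straight one.
paths = [(1.0, 2, 3), (2.0, 0, 0)]
worst = max_insertion_loss_db(paths)
```

With equal per-crossing and per-bend losses, each avoided bend is worth as much as an avoided crossing, which is why ignoring bends can leave noticeable loss on the table.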

In this paper, we use a modified version of $A^*$-search routing. On one hand, we balance the propagation loss, crossing loss and bending loss by enabling, but penalizing, waveguide crossing and bending. On the other hand, we adapt the routing algorithm to avoid hotspots by increasing the penalty value of high-temperature areas.

**Algorithm 1** Post-Placement

**Require:** positions of fixed modules, position of PSEs, module dimensions.

**Ensure:** positions of PSEs after post-placement.

- 1: Calculate T(x, y) from distribution of cores on electrical layer;
- 2: Calculate $\mathbf{T}'_{x}$;
- 3: $iter \leftarrow 0$;
- 4: while ($iter < num$) do
- 5: Calculate $\theta_i$, $\mathbf{C}_x$, $\mathring{\mathbf{C}}_{netm,x}$, $\mathbf{Q}$ and $\mathbf{\Phi}_{netm,x}$;
- 6: Solve linear equation system (11);
- 7: Update $\mathbf{x}'$ with $\Delta \mathbf{x}$;
- 8: $iter \leftarrow iter + 1$;
- 9: end while

TABLE I: Optical loss parameters published in [1]

| Parameters       | Minimum    | Mode    | Maximum   |
|------------------|------------|---------|-----------|
| Propagation loss | 0.1 dB/cm  | 1 dB/cm | 3.6 dB/cm |
| Crossing loss    | 0.05 dB    | 0.05 dB | 0.2 dB    |
| Bending loss     | 0.00215 dB |         | 0.5 dB    |

Fig. 4: A\*-search routing example: (a) without penalty; (b) with penalty term 5. (Legend: obstacles, net $n_1$, net $n_2$, net $n_i$.)

### A. A\*-Search Routing for ONoCs

The A\*-search algorithm is one of the most popular methods to find the shortest path between two locations. The estimated distance from the starting point to the destination passing through vertex n is:

$$f(n) = g(n) + h(n), \tag{12}$$

where g(n) represents the exact cost of the path from the starting point to any vertex n, and h(n) represents the heuristic estimated cost from vertex n to the destination.

In this paper, the chip is split into a grid of $900 \times 900$ bins with a bin size of $10\,\mu m \times 10\,\mu m$. PSEs, hubs and memory controllers are marked as obstacles. As shown in Fig. 4(a), our algorithm calculates g(n) and h(n) of the bins until it finds a path to the destination. The top value and bottom value in a bin are g(n) and h(n), respectively. In this paper, the Manhattan distance is used for h(n).
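The basic search without any penalty terms can be sketched as below: a textbook A\* on a grid of bins with 4-neighbour moves, unit step cost, and the Manhattan distance as heuristic. The function name and the boolean obstacle grid are assumptions for illustration, not the authors' C++ code.

```python
import heapq

def a_star(grid, start, goal):
    """Shortest rectilinear path on a bin grid; grid[r][c] is True for obstacles.
    Returns the list of bins from start to goal, or None if no path exists."""
    rows, cols = len(grid), len(grid[0])

    def h(p):  # Manhattan distance to the destination
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    open_heap = [(h(start), 0, start)]  # entries are (f, g, bin)
    g_cost = {start: 0}
    parent = {}
    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if g > g_cost[node]:
            continue  # stale entry, a cheaper route to this bin was found
        if node == goal:
            path = [node]
            while node in parent:
                node = parent[node]
                path.append(node)
            return path[::-1]
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and not grid[nr][nc]:
                ng = g + 1
                if ng < g_cost.get((nr, nc), float("inf")):
                    g_cost[(nr, nc)] = ng
                    parent[(nr, nc)] = node
                    heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return None

# 3 x 3 grid with an obstacle in the centre bin.
grid = [[False] * 3 for _ in range(3)]
grid[1][1] = True
path = a_star(grid, (0, 0), (2, 2))
```

Because the Manhattan distance never overestimates the remaining cost on a unit-cost grid, the returned path is a shortest one.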

TABLE II: Results of thermal-aware post-placement

| Benchmark | $il_{max}$ (dB) | total #C | #CPU (s) |
|-----------|-----------------|----------|----------|
| case1     | 28.2            | 105      | 243.1    |
| case2     | 19.4            | 76       | 232.3    |
| case3     | 21.8            | 74       | 221.8    |
| case4     | 17.9            | 72       | 231.2    |
| case5     | 17.5            | 71       | 215.3    |

To minimize crossing loss, we mark the used bins and set a penalty term to increase the cost of further usage. Fig. 4(b) is an example of penalizing an already used bin with a penalty term of 5. The top value, center value and bottom value in a bin are g(n), h(n) and the penalty term p(n), respectively. Similar to PROTON [6], we use a dynamic penalty term p(n, k):

$$p(n,k) = \begin{cases} 5, & k \le \max(0, \#net - 10), \\ p(n,k-1) + 5k, & k > \max(0, \#net - 10), \end{cases} \tag{13}$$

where k is the index of the net currently being routed and #net is the total number of nets.
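The recursion in (13) can be unrolled into a simple loop, as in the sketch below. The function name is hypothetical, and the sketch assumes that p(n, 0) takes the value 5 from the first case when #net is at most 10, which is how the formula reads literally.

```python
def dynamic_penalty(k, num_nets):
    """Penalty of an already used bin when routing net number k, following (13)."""
    threshold = max(0, num_nets - 10)
    p = 5  # value for all k <= threshold
    for j in range(threshold + 1, k + 1):
        p += 5 * j  # p(n, j) = p(n, j - 1) + 5 j
    return p

# With 12 nets, the penalty stays at 5 for the first two nets and then grows.
penalties = [dynamic_penalty(k, 12) for k in range(1, 5)]
```

The penalty therefore stays flat for the early nets and rises quickly for the last ten, so the nets routed late are pushed hardest away from bins that are already occupied.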

For each net to be routed, we add the penalty term p(n,k) to the evaluated distance f(n):

$$f(n) = g(n) + h(n) + p(n,k) \tag{14}$$

and search for the path from the starting point to the destination.

To minimize bending loss, we penalize the cost of bending waveguides. When we explore the adjacent bins of bin n, if the direction towards the next bin differs from the direction from the prior bin, we mark the bin as a bend bin. Since the bends of different nets are independent of each other, the penalty term can be set to a constant c. The best value of c depends on the optical loss parameters.
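One way to detect and charge a bend during expansion is sketched below: the step from the current bin to a neighbour costs an extra c whenever its direction differs from the direction of the previous step. The function names are hypothetical, and the unit base cost per step is an assumption.

```python
def step_cost(prev, cur, nxt, c):
    """Cost of moving from cur to nxt, plus penalty c if the direction changes."""
    cost = 1
    if prev is not None:
        d_in = (cur[0] - prev[0], cur[1] - prev[1])
        d_out = (nxt[0] - cur[0], nxt[1] - cur[1])
        if d_in != d_out:
            cost += c  # cur becomes a bend bin
    return cost

def path_cost(path, c):
    """Total cost of a bin sequence under the bend penalty c."""
    total, prev = 0, None
    for cur, nxt in zip(path, path[1:]):
        total += step_cost(prev, cur, nxt, c)
        prev = cur
    return total

# Two paths of equal length between the same bins: an L-shape and a staircase.
l_shape = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]
staircase = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 2)]
costs = (path_cost(l_shape, 3), path_cost(staircase, 3))
```

Both paths have the same length, so a router without the bend penalty cannot tell them apart, while the penalized cost prefers the L-shape with a single bend.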

### B. Thermal-aware A\*-Search Routing

To consider thermal effects during routing, we sample the temperature profile into a  $900 \times 900$  grid T(n), and then define a temperature penalty term p(n):

$$p(n) = \gamma \frac{T(n)}{\max(T(x,y))},\tag{15}$$

where  $\gamma$  is a positive weight. Our algorithm can avoid the hotspots when searching for the shortest path by adding the penalty term to the evaluated distance of each bin.
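The penalty grid of (15) can be computed in one pass over the sampled temperatures, as in this sketch. The function name and the sample values are hypothetical, and the maximum temperature is assumed to be positive.

```python
def thermal_penalty(T, gamma):
    """Per-bin penalty p(n) = gamma * T(n) / max(T), following (15)."""
    t_max = max(max(row) for row in T)
    return [[gamma * t / t_max for t in row] for row in T]

# Illustrative 2 x 2 temperature samples in degrees Celsius.
penalty = thermal_penalty([[30.0, 60.0], [90.0, 45.0]], gamma=3.0)
```

The hottest bin receives exactly the penalty $\gamma$, so $\gamma$ directly sets how many bins of detour the router accepts to avoid it.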

## IV. EXPERIMENTAL RESULTS

The post-placement algorithm of PROTON+T is implemented in MATLAB and the routing algorithm is implemented in C++. We obtain the placement result from PLATON [7], and re-implement the routing algorithms of PROTON [6]. We test all the algorithms on a computer with an Intel Xeon 2.4 GHz CPU and 8 GB of physical memory. The bending loss, the propagation loss and the crossing loss are set to 0.15 dB, 1.5 dB/cm and 0.15 dB, respectively. The drop loss is set to 0 (i.e., it is not considered in the experiment).

For the rest of this section,  $il_{max}$  denotes the maximum insertion loss in dB, #C is the number of crossings, #B is the number of bending corners, and #CPU is the runtime.

TABLE II shows the results of the proposed thermal-aware post-placement algorithm. The post-placement and routing result of case5 is given in Fig. 5. Its temperature profile is shown in Fig. 5(c), and the results with and without the temperature profile are shown in Fig. 5(b) and Fig. 5(a), respectively. We can see that our approach spreads the PSEs away from hotspots.

TABLE III compares the two routing algorithms. Although PROTON+T requires roughly seven to ten times the runtime, it achieves better routing quality than PROTON: the maximum insertion loss is reduced in every case (for example, from 15.41 dB to 10.95 dB for case5), the average number of crossings changes only slightly, and the average number of bending corners is reduced by about 35%.

Fig. 5: Results of post-placement and routing algorithm: (a) Post-placement result without temperature profile; (b) PROTON+T post-placement result; (c) The temperature profile of case5.

TABLE III: Experimental results of PROTON router and our A\*-Search routing algorithm

| Benchmark | average #B (PROTON) | average #B (PROTON+T) | average #C (PROTON) | average #C (PROTON+T) | $il_{max}$ (dB) (PROTON) | $il_{max}$ (dB) (PROTON+T) | #CPU (s) (PROTON) | #CPU (s) (PROTON+T) |
|-----------|---------------------|-----------------------|---------------------|-----------------------|--------------------------|----------------------------|-------------------|---------------------|
| case1     | 39.02               | 26.95                 | 28.80               | 28.84                 | 15.69                    | 13.31                      | 43.11             | 378.21              |
| case2     | 37.07               | 23.32                 | 26.77               | 25.57                 | 16.16                    | 12.02                      | 38.77             | 397.70              |
| case3     | 35.07               | 22.18                 | 23.91               | 21.34                 | 15.59                    | 11.06                      | 36.57             | 338.89              |
| case4     | 39.07               | 24.14                 | 27.60               | 25.68                 | 15.94                    | 11.89                      | 38.85             | 284.28              |
| case5     | 37.55               | 23.52                 | 20.09               | 20.34                 | 15.41                    | 10.95                      | 37.20             | 277.75              |

Fig. 6: Results of thermal-aware routing algorithm: (a) Routing result without temperature profile; (b) PROTON+T routing result; (c) The temperature profile.

TABLE IV: Results of thermal-aware routing algorithm

| Benchmark | $il_{max1}$ (dB) | $il_{max2}$ (dB) |
|-----------|------------------|------------------|
| case1     | 13.310           | 15.251           |
| case2     | 12.02            | 13.488           |
| case3     | 11.058           | 12.874           |
| case4     | 11.886           | 13.06            |
| case5     | 10.955           | 10.95            |

For the thermal-aware routing algorithm, we compare the maximum insertion loss of the routing algorithm with and without the temperature profile. The results are listed in TABLE IV, where "$il_{max1}$" denotes the result without the temperature profile and "$il_{max2}$" denotes the result with the temperature profile. We can see that the maximum insertion loss increases slightly, by at most about 1.9 dB (case1), and is essentially unchanged for case5. Fig. 6 shows the result of case5. A comparison of Fig. 6(a) with Fig. 6(b) shows that PROTON+T can effectively adjust the waveguide path to avoid the hotspots in Fig. 6(c).
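The summary figures can be checked directly against the tabulated values. The short script below recomputes the mean relative reduction in bending corners from TABLE III and the largest increase in maximum insertion loss from TABLE IV; the variable and function names are hypothetical, and the numbers are copied from the tables.

```python
# average #B per benchmark from TABLE III: (PROTON, PROTON+T)
bends = [(39.02, 26.95), (37.07, 23.32), (35.07, 22.18),
         (39.07, 24.14), (37.55, 23.52)]

# il_max in dB from TABLE IV: (without, with temperature profile)
il_max = [(13.310, 15.251), (12.02, 13.488), (11.058, 12.874),
          (11.886, 13.06), (10.955, 10.95)]

def mean_relative_reduction(pairs):
    """Average of (before - after) / before over all benchmarks."""
    return sum((before - after) / before for before, after in pairs) / len(pairs)

bend_reduction = mean_relative_reduction(bends)
max_il_increase = max(after - before for before, after in il_max)

print(f"mean bend reduction: {bend_reduction:.1%}")
print(f"largest il_max increase: {max_il_increase:.3f} dB")
```

The per-benchmark bend reductions range from about 31% (case1) to about 38% (case4), which is consistent with the roughly 35% stated for TABLE III.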

## V. CONCLUSION

We proposed PROTON+T, a thermal-aware placement and routing algorithm for 3D ONoCs. The algorithm calculates the temperature profile using Green's function and places the PSEs away from hotspots with a force-directed approach. In addition, we improve the routing algorithm by considering bending loss and temperature variation.

## REFERENCES

- [1] C. J. Nitta, M. K. Farrens, and V. Akella, "On-chip photonic interconnects: A computer architect's perspective," *Synthesis Lectures on Computer Architecture*, vol. 8, no. 5, pp. 1–111, 2013.
- [2] S. V. R. Chittamuru and S. Pasricha, "Spectra: A framework for thermal reliability management in silicon-photonic networks-on-chip," in *Proc. VLSI Design*, 2016, pp. 86–91.
- [3] D. Ding, B. Yu, and D. Z. Pan, "GLOW: A global router for low-power thermal-reliable interconnect synthesis using photonic wavelength multiplexing," in *Proc. ASPDAC*, 2012, pp. 621–626.
- [4] M. Yang and P. Ampadu, "Thermal-aware adaptive fault-tolerant routing for hybrid photonic-electronic NoC," in *International Workshop on Network on Chip Architectures*, 2016, pp. 33–38.
- [5] A. Boos, L. Ramini, U. Schlichtmann, and D. Bertozzi, "PROTON: An automatic place-and-route tool for optical networks-on-chip," in *Proc. ICCAD*, 2013, pp. 138–145.
- [6] A. von Beuningen, L. Ramini, D. Bertozzi, and U. Schlichtmann, "PROTON+: A placement and routing tool for 3D optical networks-on-chip with a single optical layer," *ACM JETC*, vol. 12, no. 4, p. 44, 2016.
- [7] A. von Beuningen and U. Schlichtmann, "PLATON: A force-directed placement algorithm for 3D optical networks-on-chip," in *Proc. ISPD*, 2016, pp. 27–34.
- [8] A. Scandurra and I. O'Connor, "Scalable CMOS-compatible photonic routing topologies for versatile networks on chip," *Network on Chip Architecture*, pp. 121–128, 2008.
- [9] P. Spindler, U. Schlichtmann, and F. M. Johannes, "Kraftwerk2 - A fast force-directed quadratic placement approach using an accurate net model," *IEEE TCAD*, vol. 27, no. 8, pp. 1398–1411, 2008.
- [10] S. S.-Y. Liu, R.-G. Luo, S. Aroonsantidecha, C.-Y. Chin, and H.-M. Chen, "Fast thermal aware placement with accurate thermal analysis based on Green function," *IEEE TVLSI*, vol. 22, no. 6, pp. 1404–1415, 2014.