

#### **ASP-DAC 2015**



# Machine Learning and Pattern Matching in Physical Design

Bei Yu<sup>1</sup>, David Z. Pan<sup>1</sup>, Tetsuaki Matsunawa<sup>2</sup>, and Xuan Zeng<sup>3</sup>

<sup>1</sup>UT Austin; <sup>2</sup>Toshiba; <sup>3</sup>Fudan University

<u>http://www.cerc.utexas.edu/utda</u>

Supported in part by NSF, SRC, IBM, Toshiba, NSFC, SSTC



- Modern VLSI Challenges
- Machine Learning and Pattern Matching 101
- Applications in VLSI Design and Verification
- Some Advanced Issues
- Conclusion



The industry is forced to extend 193nm lithography

- > Feature size is much smaller than the wavelength
- > Deep sub-wavelength design and manufacturing

# **Machine Learning 101**

Study of algorithms that can learn from data

$$y=f(x)$$

- y : output
- x : input data
- f : function

- Supervised learning (labels y are given)
  - > Classification: y is categorical data
  - > Regression: y is continuous data
- Unsupervised learning (no labels are given)
  - > Clustering, etc.

### **Machine Learning 101 (cont'd)**



# **Pattern Matching 101**



- Exact Pattern Matching
  - > Detected pattern = template
- Fuzzy Pattern Matching
  - > Detected pattern ≈ template
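As a minimal sketch of the distinction, assume layout clips are rasterized into small binary grids (an assumption; the slides do not specify a representation). Exact matching requires identical grids, while fuzzy matching accepts a clip whose pixel agreement with the template reaches a hypothetical threshold `min_similarity`.

```python
import numpy as np

def exact_match(clip, template):
    """Return True only if the clip equals the template pixel for pixel."""
    return bool(np.array_equal(clip, template))

def fuzzy_match(clip, template, min_similarity=0.9):
    """Return True if the fraction of agreeing pixels reaches min_similarity."""
    clip = np.asarray(clip)
    template = np.asarray(template)
    if clip.shape != template.shape:
        return False
    similarity = np.mean(clip == template)
    return bool(similarity >= min_similarity)

# Hypothetical 4x4 template and a variant that differs in one pixel.
template = np.array([[1, 1, 0, 0],
                     [1, 1, 0, 0],
                     [0, 0, 1, 1],
                     [0, 0, 1, 1]])
variant = template.copy()
variant[0, 2] = 1  # 15 of 16 pixels still agree

print(exact_match(variant, template))  # exact matching rejects the variant
print(fuzzy_match(variant, template))  # fuzzy matching accepts it at 0.9
```

Pixel agreement is only one possible similarity measure; it is used here because it is easy to verify by hand.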






- Modern VLSI Challenges
- Machine Learning and Pattern Matching 101
- Applications in VLSI Design and Verification
  - > Lithography Hotspot Detection
  - > Lithography Friendly Routing
  - > Datapath Extraction and Placement
- Some Advanced Issues
- Conclusion



#### Lithographic hotspots

- What you see (at design) is NOT what you get (at fab)
- > Hotspots mean poor printability
- > Highly dependent on manufacturing conditions
- > Exist after resolution enhancement techniques

#### Litho-simulations are extremely CPU intensive

- > Full-blown OPC could take a week
- > Impossible to use in the inner design loop

# **Various Approaches**

[Xu+ ICCAD07] [Yao+ ICCAD08] [Kahng SPIE06], etc.



Pattern/Graph Matching

#### Pros and cons

- Accurate and fast for known patterns
- But too many possible patterns to enumerate
- Sensitive to changing manufacturing conditions
- High overshoot (false alarms)



- SVM [J. Wuu+ SPIE09] [Drmanac+ DAC09]
- Neural Network Model [Norimasa+ SPIE07] [Ding+ ICICDT09]
- Regression Model [Torres+ SPIE09]

**Data Mining/Machine Learning** 

#### Pros and cons

- Good to detect unknown or unseen hotspots
- Accuracy may not be good for "seen" patterns (cf. PM)
- Hard to trade off accuracy against false alarms



#### Layout Fragmentation

- > With a pre-defined set of measurement operators
- > Accurate and very fast to apply (e.g., link to CALIBRE API)
- > Full detection covers the whole layout without sampling (cf. window-based approach)
- > Complexity and runtime scale as O(n)

# **Machine Learning Kernel - SVM**

Support Vector Machine – A linear separation demo



**To maximize** the separation margin

- Convert the hotspot detection problem into a *binary* classification problem (hotspot vs. non-hotspot)
- A Support Vector Machine finds a set of support vectors that define a boundary plane maximizing the separation between two distinct sets of data
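The margin-maximization idea can be sketched with a tiny linear SVM trained by subgradient descent on the regularized hinge loss. This is an illustrative stand-in, not the solver used in the cited work; the 2-D feature points, labels, and hyperparameters (`lam`, `lr`, `epochs`) are made up for the demo.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the L2-regularized hinge loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    n = len(y)
    for _ in range(epochs):
        margins = y * (X @ w + b)
        violated = margins < 1.0  # samples inside the margin or misclassified
        # Subgradient of lam/2 * |w|^2 + mean(max(0, 1 - y * (w.x + b)))
        grad_w = lam * w - (y[violated][:, None] * X[violated]).sum(axis=0) / n
        grad_b = -y[violated].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical features: +1 = hotspot, -1 = non-hotspot.
X = np.array([[2.0, 2.0], [3.0, 3.0], [3.0, 2.0],
              [-2.0, -2.0], [-3.0, -3.0], [-2.0, -3.0]])
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])

w, b = train_linear_svm(X, y)
predictions = np.sign(X @ w + b)
print(w, b, predictions)
```

The samples whose margin stays close to 1 at the end of training play the role of the support vectors.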

#### A Naïve Combination of ML and Pattern Matching



### **Meta-Classification**

- Pattern Matching Methods: good for detecting previously known types of hotspots
- Machine Learning Methods: good for detecting new/previously unknown hotspots
- A New Unified Formulation (EPIC): good for detecting all types of hotspots with an advantageous accuracy/false-alarm trade-off (Meta-Classifier)

Meta-Classification combines the strengths of different types of hotspot detection techniques

[Ding et al, ASPDAC 2012 BPA]

### **An Illustrative Example**

| Detection<br>Sub-block | Detection Results<br>(H: hotspot, N: non-hotspot, X: Don't Care) |   |   |   |   |
|------------------------|-------------------------------------------------------------------|---|---|---|---|
| Machine<br>Learning 1  | X                                                                 | H | N | H | N |
| Machine<br>Learning 2  | X                                                                 | H | H | N | N |
| Pattern<br>Matching    | H                                                                 | N | N | N | N |
| Final<br>Decision      | H                                                                 | H | N | H | N |

# **Components of Meta-Classifier Core**



Base classifier results are first collected

- Weighting functions make the overall meta decision (e.g., minimize the mean square error over all samples in the data set)
  - > Quadratic Programming (QP) formulation
- Accuracy and false-alarm trade-off
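The weighting step can be illustrated with the five samples from the illustrative example. The sketch below is a simplification: it fits the weights by unconstrained least squares (minimum mean square error) instead of the QP formulation named above, and the numeric encoding (H = +1, N = -1, X = 0) and the decision `threshold` are assumptions made for this demo.

```python
import numpy as np

ENCODE = {"H": 1.0, "N": -1.0, "X": 0.0}  # assumed numeric encoding

# Base classifier outputs and final decisions from the illustrative example.
base_results = {
    "Machine Learning 1": "XHNHN",
    "Machine Learning 2": "XHHNN",
    "Pattern Matching":   "HNNNN",
}
final_decision = "HHNHN"

# One row per sample, one column per base classifier.
A = np.array([[ENCODE[c] for c in row] for row in base_results.values()]).T
t = np.array([ENCODE[c] for c in final_decision])

# Weights that minimize the mean square error between A @ weights and t.
weights, *_ = np.linalg.lstsq(A, t, rcond=None)

def meta_decide(scores, threshold=0.0):
    """Label a sample as hotspot when its weighted score exceeds threshold."""
    return ["H" if s > threshold else "N" for s in scores]

decisions = meta_decide(A @ weights)
print(dict(zip(base_results, weights.round(3))))
print("".join(decisions))
```

Raising `threshold` makes the meta decision more conservative, which is one simple way to trade detection accuracy for fewer false alarms.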

#### **False-alarm Rate and Accuracy**



# **ICCAD'12 Contest**

- Benchmark released by Mentor: 2D structures on metal layers in 32nm to 28nm processes
- Desired target performance
  - > Low false alarm: <100 false hits/mm<sup>2</sup>
  - > Fast run time: < 1 CPU-hr/mm<sup>2</sup>
  - > Detection accuracy: > 80%
  - > Portability: General calibration strategy
- Publications
  - > [Lin et al. DAC 2013]
  - > [Yu et al. DAC 2013]
  - > [Gao et al. SPIE 2014]

  - > ...

#### **Lithography-Friendly Detailed Routing**

 [DAC'11] AENEID: Hotspot learning models in early design stage, used to guide routing



#### **AENEID Overall Flow**



 Machine learning models to guide AENEID to avoid hotspot patterns in the early design stages

# **Cost Function For Detailed Routing-I**

$$\mathrm{litho}(e) = \mathrm{litho}(e)^{HD} + \mathrm{litho}(e)^{RPP}$$
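To show how such an additive edge cost can steer a router, here is a small sketch that runs Dijkstra's algorithm on a routing grid. It treats the HD and RPP terms as opaque per-edge penalties stored in hypothetical lookup tables (`hd_penalty`, `rpp_penalty`); the unit wirelength cost per edge and the weight `alpha` are assumptions, not AENEID's actual cost model.

```python
import heapq
import math

def edge_key(u, v):
    """Undirected edge identifier, independent of traversal direction."""
    return (min(u, v), max(u, v))

def litho(edge, hd_penalty, rpp_penalty):
    """litho(e) = litho(e)^HD + litho(e)^RPP, read from lookup tables."""
    return hd_penalty.get(edge, 0.0) + rpp_penalty.get(edge, 0.0)

def route(width, height, src, dst, hd_penalty, rpp_penalty, alpha=1.0):
    """Shortest path on a grid where each edge costs 1 + alpha * litho(e)."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        x, y = u
        for v in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= v[0] < width and 0 <= v[1] < height):
                continue
            cost = 1.0 + alpha * litho(edge_key(u, v), hd_penalty, rpp_penalty)
            if d + cost < dist.get(v, math.inf):
                dist[v] = d + cost
                prev[v] = u
                heapq.heappush(heap, (dist[v], v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1], dist[dst]

# 3x2 grid: the direct route uses edge (1,0)-(2,0), flagged as litho-unfriendly.
hot_edge = edge_key((1, 0), (2, 0))
plain_path, plain_cost = route(3, 2, (0, 0), (2, 0), {}, {})
litho_path, litho_cost = route(3, 2, (0, 0), (2, 0),
                               {hot_edge: 2.0}, {hot_edge: 1.5})
print(plain_path, plain_cost)
print(litho_path, litho_cost)
```

With the penalties in place the router accepts a longer detour because it is cheaper than crossing the flagged edge.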





#### **Testing Benchmarks and Simulation Results**

| Benchmarks    | CK1                   | CK2                     | CK3                     |
|---------------|-----------------------|-------------------------|-------------------------|
| Layout Size   | 50 × 50 µm<sup>2</sup> | 100 × 100 µm<sup>2</sup> | 160 × 160 µm<sup>2</sup> |
| Nets to Route | 0.45K                 | 1.48K                   | 3.4K                    |
| M1 Blockage#  | 1K                    | 8.8K                    | 13.1K                   |
| M1 Fragment#  | 12.2K                 | 41K                     | 152.6K                  |
| M2 Blockage#  | 0.14K                 | 0.47K                   | 2K                      |
| M2 Fragment#  | 0.56K                 | 1.9K                    | 8.3K                    |

Compared with ELIAD (Minsik Cho, et al. TCAD09), AENEID shows 23%-64% hotspot reduction at the cost of 30% extra run-time without penalty on total wirelength

| AENEID mode | Circuit | Circuit size (µm<sup>2</sup>) | Wire-length (µm) | Run-time (s) | Run-time overhead (%) | M1 hotspot# | M2 hotspot# | M1 hotspot reduc. (%) | M2 hotspot reduc. (%) |
|-------------|---------|------------------|---------|------|----|----|----|----|----|
| HD          | CK1     | 50<sup>2</sup>   | 859.3   | 8    | 33 | 11 | 2  | 35 | 33 |
| HD          | CK2     | 100<sup>2</sup>  | 5502.0  | 409  | 38 | 34 | 7  | 48 | 30 |
| HD          | CK3     | 160<sup>2</sup>  | 24797.0 | 3291 | 19 | 90 | 17 | 44 | 26 |
| HD + RPP    | CK1     | 50<sup>2</sup>   | 859.1   | 8    | 33 | 8  | 2  | 53 | 33 |
| HD + RPP    | CK2     | 100<sup>2</sup>  | 5502.0  | 400  | 35 | 22 | 5  | 66 | 50 |
| HD + RPP    | CK3     | 160<sup>2</sup>  | 24797.5 | 3279 | 18 | 58 | 15 | 64 | 35 |

| AENEID mode | Avg. hotspot reduc. (%) | Avg. extra run-time (%) |
|-------------|-------------------------|-------------------------|
| HD          | 36                      | 30                      |
| HD + RPP    | 50                      | 29                      |

### **Machine Learning for Placement**

- Data mining and extraction based on not just graph but also physical information
- We can extract datapath-like structures even from "random" logic
- Use them to explicitly guide placement
- Very good results compared with other leading placers such as simPL, NTUPlace, mPL, CAPO



[Ward+, DAC'12]

### **Datapath Placement Techniques**

- Steiner wirelength (StWL) improvement through bit-stack alignment
  - > Significantly improves total StWL and routing congestion
- Datapath placement techniques
  - > Skewed Weighting with Step Size Scheduling
  - > Fixed-Point Alignment Constraint
  - > Bit-Stack Aligned Cell Swapping
  - > Datapath Group Repartitioning
- Integrate alignment constraints into force-directed placement
- Simultaneously place datapath & random logic

[Ward+, ISPD'12]

# **PADE: Hybrid Experimental Results**

- All numbers are average wirelength ratios relative to PADE
- Without datapath extraction, PADE produces the same wirelength results as simPL
- HPWL: 7%+ better
- StWL: 12%+ better



#### **PADE: ISPD2005 Results**

- PADE wirelength results on the ISPD 2005 placement benchmarks
- At least 2% better in HPWL
- At least 3% better in StWL
- Highlights the effectiveness of structure-aware extraction

|          | CAPO    | mPL6   | FastPlace3.1 | NTUPlace3 | simPL  | PADE   |
|----------|---------|--------|--------------|-----------|--------|--------|
| Adaptec1 | 97.22   | 86.2   | 88.75        | 91.06     | 87.05  | 85.12  |
| Adaptec2 | 114.54  | 100.64 | 104.03       | 99.06     | 102.13 | 98.92  |
| Adaptec3 | 296.22  | 235.06 | 239.7        | 234.52    | 228.32 | 222.08 |
| Adaptec4 | 257.47  | 208.85 | 215.02       | 211.86    | 201.82 | 196.23 |
| Bigblue1 | 127.72  | 108.31 | 105.24       | 110.02    | 109.94 | 106.98 |
| Bigblue2 | 189.6   | 174.69 | 178.44       | 175.27    | 168.65 | 164.33 |
| Bigblue3 | 452.91  | 370.7  | 421.31       | 389.39    | 369.61 | 361.96 |
| Bigblue4 | 1105.52 | 930.63 | 911.64       | 974.44    | 901.85 | 883.82 |
| AVE      | 1.22    | 1.04   | 1.07         | 1.06      | 1.03   | 1.00   |
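The AVE row appears to be the mean, over the eight benchmarks, of each placer's wirelength divided by PADE's wirelength. The short check below recomputes it from the table values under that assumption; rounded to two decimals, the results agree with the AVE row.

```python
placers = ["CAPO", "mPL6", "FastPlace3.1", "NTUPlace3", "simPL", "PADE"]

# Wirelength values copied from the table above, one row per benchmark.
wirelength = {
    "Adaptec1": [97.22, 86.2, 88.75, 91.06, 87.05, 85.12],
    "Adaptec2": [114.54, 100.64, 104.03, 99.06, 102.13, 98.92],
    "Adaptec3": [296.22, 235.06, 239.7, 234.52, 228.32, 222.08],
    "Adaptec4": [257.47, 208.85, 215.02, 211.86, 201.82, 196.23],
    "Bigblue1": [127.72, 108.31, 105.24, 110.02, 109.94, 106.98],
    "Bigblue2": [189.6, 174.69, 178.44, 175.27, 168.65, 164.33],
    "Bigblue3": [452.91, 370.7, 421.31, 389.39, 369.61, 361.96],
    "Bigblue4": [1105.52, 930.63, 911.64, 974.44, 901.85, 883.82],
}

ref = placers.index("PADE")

# Mean of per-benchmark ratios, normalized to PADE.
average_ratio = {
    placer: sum(row[i] / row[ref] for row in wirelength.values()) / len(wirelength)
    for i, placer in enumerate(placers)
}

for placer, ratio in average_ratio.items():
    print(f"{placer:>13}: {ratio:.2f}")
```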




# **Issue 1: ML or PM?**

#### Machine Learning:

- > (+) Good on unseen data
- > (-) Longer training time



#### Pattern Matching:

- > (+) Easy to implement, fast
- > (-) Sensitive to process change

Hybrid approaches are desirable!



#### **Issue 2: Feature Extraction**

Fragmentation-based feature [ASPDAC'11, SPIE'14]



- $V_F$ is the feature vector associated with fragment *F*
- $\Theta$ is a concatenate function; $\oplus$ is a sort-n-combine function
- $\delta_r^F$ includes both proximity and some peripheral information

#### **Issue 2: Feature Extraction (cont.)**

Density-based feature [Wuu+, ASPDAC'11; Matsunawa+, SPIE'15]
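A common way to build a density-based feature is to divide a layout clip into a regular grid and record the fraction of each cell covered by pattern. The sketch below assumes the clip is already rasterized to a binary array and uses a hypothetical `grid` parameter; the cited papers may define windows and normalization differently.

```python
import numpy as np

def density_feature(clip, grid=(2, 2)):
    """Return the pattern density of each grid cell as a flat feature vector."""
    clip = np.asarray(clip, dtype=float)
    rows, cols = grid
    height, width = clip.shape
    if height % rows or width % cols:
        raise ValueError("clip size must be a multiple of the grid size")
    # Split the clip into (rows x cols) cells and average inside each cell.
    cells = clip.reshape(rows, height // rows, cols, width // cols)
    return cells.mean(axis=(1, 3)).ravel()

# Hypothetical 4x4 rasterized clip (1 = metal, 0 = empty).
clip = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [1, 0, 0, 0],
                 [0, 0, 0, 1]])

feature = density_feature(clip)
print(feature)  # densities of the top-left, top-right, bottom-left, bottom-right cells
```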



#### **Issue 2: Feature Extraction (cont.)**

- HLAC-based feature [Nosato+, JM3'14]
  - > Higher-order local autocorrelation (HLAC)
  - > 25 local masks => 25-dimensional feature vector
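To make the mask idea concrete, the sketch below computes only the order-0 and order-1 HLAC terms on a binary image, which account for 5 of the 25 masks in a 3x3 window. The remaining 20 order-2 masks follow the same product-and-sum pattern with two displacements and are omitted here for brevity. The function name and the test image are illustrative.

```python
import numpy as np

def hlac_low_order(image):
    """Order-0 and order-1 HLAC features over all interior 3x3 windows."""
    f = np.asarray(image, dtype=np.int64)
    height, width = f.shape
    center = f[1:-1, 1:-1]  # reference pixel of every 3x3 window

    def shifted(dy, dx):
        # Neighbor at displacement (dy, dx) from each reference pixel.
        return f[1 + dy:height - 1 + dy, 1 + dx:width - 1 + dx]

    features = [center.sum()]  # order 0: sum of f(r)
    # Order 1: sum of f(r) * f(r + a) for the four distinct displacements.
    for dy, dx in [(0, 1), (1, 0), (1, 1), (1, -1)]:
        features.append((center * shifted(dy, dx)).sum())
    return np.array(features)

# Hypothetical binary clip containing a single horizontal wire.
image = np.array([[0, 0, 0, 0],
                  [1, 1, 1, 1],
                  [0, 0, 0, 0],
                  [0, 0, 0, 0]])

print(hlac_low_order(image))
```

Only the horizontal displacement responds to the horizontal wire, which shows how each mask encodes a local orientation.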



#### **Issue 2: Feature Evaluation**

#### Analyze Feature Space



Measure feature distances [Matsunawa+, SPIE'15]

$$d_{i} = \frac{\sqrt{(x_{i} - \mu)^{T} V^{-1}(x_{i} - \mu)} - d_{NHS_{min}}}{d_{NHS_{max}} - d_{NHS_{min}}}$$
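The distance above is a Mahalanobis distance rescaled by its minimum and maximum over non-hotspot (NHS) samples. The sketch below assumes that $\mu$ and $V$ are the mean and covariance of the non-hotspot feature vectors, which the slide does not state explicitly; the sample data are made up.

```python
import numpy as np

def mahalanobis(X, mu, V_inv):
    """Mahalanobis distance of each row of X from mu."""
    diff = np.atleast_2d(X) - mu
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, V_inv, diff))

def normalized_distance(X, nhs):
    """Rescale distances so non-hotspot samples span the range [0, 1]."""
    mu = nhs.mean(axis=0)
    V_inv = np.linalg.inv(np.cov(nhs, rowvar=False))
    d_nhs = mahalanobis(nhs, mu, V_inv)
    d_min, d_max = d_nhs.min(), d_nhs.max()
    return (mahalanobis(X, mu, V_inv) - d_min) / (d_max - d_min)

# Hypothetical 2-D feature vectors of non-hotspot samples.
nhs = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0], [1.0, 1.0]])

d_inside = normalized_distance(nhs, nhs)
d_outlier = normalized_distance(np.array([5.0, 5.0]), nhs)
print(d_inside, d_outlier)
```

A sample whose normalized distance is well above 1 lies outside the region covered by the non-hotspot samples in feature space.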

#### **Issue 3: Overcome Overfitting**

Overfitting: good performance on training data, but poor performance on test data




Possible Solutions:

- Regularization (additional constraints or objective terms)
- Cross validation
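Both remedies can be combined: add an L2 penalty to the fitting objective and pick its strength by k-fold cross validation. The sketch below uses ridge regression on a degree-9 polynomial as a stand-in model; the synthetic data, the candidate values in `lambdas`, and the fold count are assumptions for illustration.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Minimize |X w - y|^2 + lam * |w|^2 via an augmented least-squares system."""
    n_features = X.shape[1]
    X_aug = np.vstack([X, np.sqrt(lam) * np.eye(n_features)])
    y_aug = np.concatenate([y, np.zeros(n_features)])
    w, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return w

def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))

def cross_val_mse(X, y, lam, k=5):
    """Average held-out error over k folds."""
    folds = np.array_split(np.arange(len(y)), k)
    errors = []
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        w = ridge_fit(X[train], y[train], lam)
        errors.append(mse(X[test_idx], y[test_idx], w))
    return float(np.mean(errors))

# Synthetic noisy data fitted with a deliberately flexible polynomial.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 30)
y = np.sin(3.0 * x) + 0.2 * rng.normal(size=30)
X = np.vander(x, 10, increasing=True)  # columns 1, x, ..., x^9

lambdas = [0.0, 1e-3, 1e-2, 1e-1, 1.0]
cv_errors = {lam: cross_val_mse(X, y, lam) for lam in lambdas}
best_lam = min(cv_errors, key=cv_errors.get)

train_error_unregularized = mse(X, y, ridge_fit(X, y, 0.0))
train_error_regularized = mse(X, y, ridge_fit(X, y, 1.0))
print(cv_errors, best_lam)
```

The unregularized fit always has the lowest training error, so training error alone cannot reveal overfitting; the held-out error from cross validation is what guides the choice of `lam`.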

# Conclusion

- Machine learning and pattern matching 101
- Applications in VLSI design and verification
  - > Lithography hotspot detection
  - > Lithography friendly routing
  - > Datapath-like circuit extraction and placement
- Still many open problems and opportunities
  - > Hybrid machine learning and pattern matching
  - > Feature extraction and classification
  - > Overfitting in machine learning
  - > Cross-layer applications