Fast and Scalable Physics-Based Electromigration Checking for Power
Grids in Integrated Circuits
by
Sandeep Chatterjee
A thesis submitted in conformity with the requirements
for the degree of Doctor of Philosophy
Graduate Department of Electrical & Computer Engineering
University of Toronto
© Copyright 2017 by Sandeep Chatterjee
Abstract
Fast and Scalable Physics-Based Electromigration Checking for Power Grids in Integrated
Circuits
Sandeep Chatterjee
Doctor of Philosophy
Graduate Department of Electrical & Computer Engineering
University of Toronto
2017
Electromigration (EM) is a key reliability concern in chip power/ground (p/g) grids, and it
has been exacerbated by the high current levels and narrow metal lines in modern grids. EM
checking is expensive due to the large sizes of modern p/g grids and is also inherently difficult
due to the complex nature of the EM phenomenon. Traditional EM checking is based on
empirical models, but better models are needed for accurate prediction due to the very small
margins between the allowed failure rates (spec) and the failure rates at which the chips actually
operate in the field. More accurate physics-based EM models have recently been proposed,
but they remain computationally expensive because they require the solution of a system of partial
differential equations (PDEs). In this work, we extend the existing physics-based models for EM
in metal branches to track EM degradation in multi-branch interconnect trees and propose a fast
and scalable methodology for power grid EM verification. We speed up our implementation by
using filtering schemes (that focus the computation only on the most EM-susceptible trees) and
by developing optimized numerical methods to solve the PDE system arising from the
physics-based EM models. The lifetimes found using our physics-based approach are on average 2.35x
longer than those based on a (calibrated) Black’s model, as extended to handle mesh power
grids. With a runtime of only 10 minutes for a 4.1M node grid, our approach is extremely fast
and should scale well for large integrated circuits.
Acknowledgements
When people congratulate me on completing my final defense, I cannot help but look back
at the last 4 years of my life and at how rewarding and enriching this journey has been. It would
not have been possible without the help and support of many people, to whom I would like
to express my sincere gratitude in this acknowledgment.
First and foremost, I would like to thank my supervisor Professor Farid N. Najm, because
without his support and encouragement this work would not have been possible. I have learned
a lot of things from him, which have helped make me a better person overall. I am truly thankful
for his brilliant technical (and non-technical) advice and his thoughtful suggestions. He is the
best supervisor one could hope for, and I consider myself extremely lucky that he chose me as
one of his students.
I would like to thank my committee members Professor Vaughn Betz, Professor Paul Chow,
Professor Sean Hum and Professor Peng Li for taking time to review this work and for providing
me with constructive comments, which have definitely improved the quality of this work. I would
also like to thank Dr. Valeriy Sukharev for providing me with the opportunity to collaborate
with him; I learned a lot from him about the industry and about Armenia! I appreciate the
financial support for this project provided by the University of Toronto, the Natural Sciences and
Engineering Research Council (NSERC) of Canada, Mentor Graphics (a Siemens business) and
the Semiconductor Research Corporation (SRC).
I consider myself lucky to have such a good set of friends, whose support and encouragement
made the last 4 years of my life so easy and memorable. I would like to thank Mohammad
Fawaz, my friend and colleague, with whom I shared my master’s at the University of Toronto
and with whom I am now finishing my Ph.D. As it turns out, we are also joining the
same company after graduation; let’s hope this path continues in the future too. Many thanks
to Zahi Moudallal, who is a wonderful guy and an excellent person to talk to if you are
having problems with mathematical proofs or notation, or in general too. And how can I forget
Abdul-Amir (Abed) Yassine, who is my cubicle neighbor and a fellow geek? We share a
common love for TV series and comic book movies, and I have enjoyed our long and “fruitful”
discussions on all related topics. I hope one day he gets the cubicle he deserves! This
acknowledgment would be incomplete without mentioning my friends: Genevieve Hayden,
Aakar Gupta, Aakash Nigam, Dikshant Sharma, Divyam Beniwal, Balsher Singh Sidhu, Vipin
Mathew, Aapar Agarwal, Ajay Thomas, Monika Patel, Noha Sinno, Mehul Srivastava, Nihal
Anand, Rajeev Acharya, Venkatesh Medabalimi, Hari Sridhar and countless others who have
made this journey exciting. I will never forget the numerous Toronto adventures, hikes, camping
trips, dinners, barbecues, board game nights, late night walks and discussions I had with them.
Also, many thanks to my friends in India for their support and motivation. I wish you all the
best for the future.
My biggest gratitude goes to my parents, Mr. Jitendra Kr. Chatterjee and Mrs. Soma
Chatterjee, for their continued support and encouragement throughout my Ph.D. and for wishing
only the best for me. Thank you, mom and dad, for believing in me, for making me what I am
today, and for all that you have done for me, for which I am forever indebted to you. Thank
you again for all the support; this work is dedicated to you.
Lastly, I offer my regards to those whom I might have missed but who supported me in any
respect during the completion of this work.
Contents
1 Introduction
1.1 Motivation
1.2 Contribution
1.3 Organization
2 Background
2.1 Electromigration Basics
2.1.1 Atomic Flux
2.1.2 Void Nucleation Phase
2.1.3 Void Growth Phase
2.1.4 Effective-EM Current
2.2 EM failure Models
2.2.1 Black’s model
2.2.2 Physics-based EM models
2.3 Korhonen’s Model and its adaptations
2.3.1 The Korhonen’s model
2.3.2 Solution for blocking boundary at both ends
2.3.3 Riege Thompson Model
2.3.4 CTHKS Model
2.4 Review of Power Grid EM checking approaches
2.4.1 Industrial EM checking approach
2.4.2 Recent approaches
2.5 Power Grid model
2.6 Partial Differential Equations (PDE)
2.7 Ordinary Differential Equations (ODE)
2.7.1 Runge-Kutta Methods
2.7.2 Linear Multi-Step Methods
2.7.3 Error estimates
2.7.4 Variable time-stepping
2.8 Compact Thermal Models
2.9 State Space Models
2.10 Mean estimation using Monte Carlo random sampling
3 Extended Korhonen’s model
3.1 Introduction
3.2 Interconnect Tree EM analysis
3.2.1 Assigning reference directions
3.2.2 Incorporating thermal stress
3.3 Extending Korhonen’s model to trees
3.3.1 Boundary Laws for junctions
3.3.2 PDE system for a general interconnect tree
3.3.3 Void growth and resistance change
3.4 Solving EKM using IVP formulation
3.4.1 Scaling
3.4.2 Discretization for a tree branch
3.4.3 Boundary Conditions at Diffusion Barrier
3.4.4 Boundary Conditions at Dotted-I junction
3.4.5 Boundary Conditions at T junction
3.4.6 Boundary Conditions at Plus junction
3.5 Verifying EKM and the IVP formulation
3.5.1 Verifying the numerical approach
3.5.2 Verifying the model
3.6 Comparison between EKM and Black’s model
3.7 Importance of Temperature distribution
4 LTI Models for trees
4.1 Introduction
4.2 State Space representation for a tree
4.2.1 Subtrees and Time-spans
4.2.2 LTI system for a subtree
4.2.3 LTI system for pre-void phase
4.2.4 Final State Space representation
4.3 Choosing the value of N
4.4 Justification for the use of effective-EM currents
5 Solution Techniques
5.1 Introduction
5.2 Equivalent Homogeneous LTI system for EKM
5.3 Using BDF formulas
5.3.1 Review of BDF with fixed time-step
5.3.2 Variable coefficient BDF methods
5.4 Applying VCBDF to solve the Homogeneous LTI system
5.5 Computing Matrix Exponential using the Arnoldi process
5.5.1 Motivation
5.5.2 The Arnoldi process
5.5.3 Solving the Homogeneous LTI system
5.6 Solvers that use the matrix exponential
5.6.1 Newton Solver
5.6.2 Predictor
5.7 Experimental Results
6 Power Grid EM Checking
6.1 Introduction
6.2 Early Failures
6.3 Determining Branch Temperatures
6.4 Power Grid EM analysis approaches
6.4.1 Power Grid Model
6.4.2 The Main Approach
6.4.3 Improved performance with Filtering
6.4.4 Parallelization using shared memory
6.5 Experimental Results
6.5.1 Main Approach vs Filtering Approach
6.5.2 Comparison of Performance and Accuracy between the solvers
6.5.3 Black’s Model vs. EKM for grid MTF estimation
6.5.4 Effect of Early Failures
6.5.5 Speed-up due to parallelization
6.5.6 Break-up of time consumed by different tasks in the code
6.5.7 Overall scalability of the approach
7 Conclusions and Future Work
Appendices
A Properties of system matrix A
A.1 Proof of theorem 1
A.2 Special Case
B The math behind the Filtering approach
B.1 Integration details
B.2 Deriving confidence bound on µ
B.2.1 Finding δκζ
B.2.2 Finding δµ′ζ
Bibliography
List of Tables
2.1 Butcher tableau characterizing an m-stage RK formula with built-in error estimates
3.1 Comparison of upstream-to-downstream MTF ratio as reported in [1] and as estimated using EKM.
5.1 Comparison of solver metrics and runtime
6.1 Details of Power Grids used in experiments.
6.2 Table of Physical constants
6.3 Configuration parameters to be used for evaluating all power grid benchmarks
6.4 Notation used to simplify presentation
6.5 Comparison of Power grid MTF obtained using the Main Approach and the Filtering Approach.
6.6 Comparing the performance and accuracy of VCBDF2-VCBDF4 methods for power grid EM checking using RK45 as reference
6.7 Comparison of the RK45 solver (run on the first machine) and the Predictor+Newton solver on the second machine (Quad-core i7@3.4GHz)
6.8 Comparison of power grid MTF as estimated using Black’s model and Extended Korhonen’s model (with VCBDF2 solver).
List of Figures
1.1 Wire lifetime and current density scaling. Figure taken from [2].
2.1 (a) A conventional or late failure, (b) early failure and (c) simple schematic representation for both failures. (a) and (b) taken from [3] and [4], respectively.
2.2 A simple volume element with flux divergence.
2.3 3D stress tensor on a small volume element. For each component, the first subscript/index denotes the direction of the outward normal from the face and the second subscript/index is the direction of the stress acting on that face.
2.4 Schematic for a confined metal line, showing a volume element.
2.5 (a) Stress evolution at different points along the line and (b) stress profile along the line at different time points.
2.6 Comparison of stress evolution at cathode of a finite line calculated using Riege-Thompson model and the reference solution (2.11).
2.7 Simple multi-branch interconnect structures.
2.8 Schematic for a typical on-die power grid.
2.9 DC model of a power grid.
2.10 (a) Cuboids resulting from spatial discretization along x, y and z axis with their indices (note that we have not shown cuboids with indices (i, j − 1, k) and (i, j + 1, k) for clarity) and (b) the equivalent electrothermal model for each cuboid. The conductances gxT, gyT and gzT are shared by the neighbouring cuboids.
3.1 Cross sectional schematic of Cu dual damascene interconnects.
3.2 A typical interconnect tree structure.
3.3 A simple 3-terminal tree Td. Dashed arrows denote reference directions.
3.4 Stress profile around a junction immediately after void nucleation.
3.5 For Td, (a) evolution of stress at junctions with time and (b) stress profile with time.
3.6 Tree with a (a) dotted-I junction and (b) T junction.
3.7 (a) Comparing stress evolution for a dotted-I structure as obtained using EKM and the CTHKS model, and (b) the error rate plot with respect to the CTHKS solution.
3.8 (a) Comparing stress evolution for a T-structure as obtained using EKM and the CTHKS model, and (b) the error rate plot with respect to the CTHKS solution.
3.9 Stress profile across the T-structure with time.
3.10 Schematic of a finite line.
3.11 (a) Comparing stress evolution for a finite-line as obtained using EKM and the reference solution, and (b) the error rate plot with respect to the reference solution.
3.12 Comparing the estimated MTF and its 95% confidence bounds as obtained using EKM with the ones reported by Gan et al. [5]. Note that the confidence bounds get tighter as the number of TTF samples is increased.
3.13 (a) Schematic view of the test structure used in [1], and (b) Upstream and downstream configurations as defined with respect to the left via. Both figures taken from [1]. Here, TiN (Titanium Nitride) is used for barrier liner and SiN (Silicon Nitride) is used for capping.
3.14 (a) Initial current density profile for T1 and heat map showing MTFs estimated using (b) Extended Korhonen’s model (MTFekm), (c) Black’s model (MTFblk) and (d) MTFblk − MTFekm. All MTF values are in years.
3.15 (a) Initial current density profile for T2 and heat map showing MTFs estimated using (b) Extended Korhonen’s model (MTFekm), (c) Black’s model (MTFblk) and (d) MTFblk − MTFekm. All MTF values are in years.
3.16 (a) The actual temperature profile and the assumed nominal temperature distribution. Heat map showing MTFs estimated with (b) actual temperature profile (MTFT ), (c) assuming Tm,k = 327.6K for all branches (MTF T ) and (d) MTFT − MTF T . All MTF values are in years.
3.17 Estimated MTF as per EKM using (a) the actual temperature profile, and assuming the temperature to be (b) 315K, (c) 327.6K and (d) 340K for all branches. The x-axis for all plots represents the junction IDs. Junctions with MTF ≥ 100 years have not been shown.
4.1 Notion of subtrees and time-spans.
4.2 Error rate plots for LTI models M8-M50 with respect to the reference solution obtained using M64.
4.3 (a) Runtime vs. accuracy trade-off for LTI models with different discretizations and (b) Percentage error in estimated junction void nucleation times for LTI models M8-M50 with respect to M64. Smaller is better.
4.4 The stress evolution at junctions in response to periodic pulsed branch currents and their average (effective) values. The time-periods are (a) 2 months, (b) 1 month, (c) 2 weeks, (d) 1 week and (e) is a random waveform.
4.5 Frequency response of the pre-void LTI system for Td using Bode plots. The LTI system of Td has three outputs and three inputs for the pre-void phase.
4.6 Frequency response of the post-void LTI system for Td using Bode plots. Here, n2 has a void, and is thus a part of both branches b1 and b2. Also, now there are only two inputs because a voided diffusion barrier has no inputs.
5.1 Obtaining the next void nucleation time using the Newton solver.
5.2 Obtaining the next void nucleation time using Predictor.
5.3 Showing part of trees (a) T1 and (b) T2 used for comparing solvers. The orange dots show the junctions.
5.4 (a) Error rate plot for stress evolution at junctions as obtained using VCBDF2-VCBDF6 solvers and expm approximation and (b) the average absolute error with respect to RK45 solver.
5.5 Percentage error in the estimated TTFs of (a) T1 and (b) T2 using the proposed solvers and RK45 solver.
5.6 Empirical complexity of VCBDF2 solver for trees (a) T1 and (b) T2, and VCBDF3 solver for trees (c) T1 and (d) T2, computed by using the fitting function time = a N^b, where b is the complexity.
6.1 (a) An arrangement of two trees connected by a via taken from the power grid and (b) the corresponding schematic showing early and conventional failures.
6.2 Thermal modelling of power grid using CTMs.
6.3 (a) Heat map for Pself heating + Plogic and (b) temperature profile (in Kelvin) for the M1 layer in ibmpgnew2.
6.4 (a) Heat map for Pself heating + Plogic and (b) temperature profile (in Kelvin) for the M1 layer in PG7.
6.5 (a) Goodness-of-fit plot for normal distribution and (b) probability density function (pdf) using 200 mesh TTF samples from ibmpg2 main approach.
6.6 The idea for expm filtering scheme. The dotted lines show the would-be stress evolution if the boundary conditions are not updated when stress reaches σth. Junction 1 fails before t = tm, Junction 2 fails after.
6.7 Variation of p2 with sample number.
6.8 Flow chart showing the MTF estimation using the Filtering approach. EF stands for early failure.
6.9 Workflow for each process in our parallel implementation.
6.10 Comparing the main approach with the filtering approach for the first 5 grids showing (a) 95% confidence bounds on the estimated MTF, and the TTF samples obtained by each for (b) ibmpg2 and (c) ibmpg5.
6.11 Impact of early failures (EF) on (a) the maximum voltage drop (shown for one sample grid) and (b) estimated mesh MTF for ibmpg2. Maximum voltage drop at t = 0 is 3.8% vdd, and vth = 5% vdd.
6.12 Statistics of mesh TTF samples for ibmpg2 grid shows an underlying bimodal distribution for different modes of grid failure. MTFA = 6.67 yrs, MTFB = 7.99 yrs, MTFall = 7.66 yrs.
6.13 Bar chart comparing speed-ups obtained using 4, 8 and 12 parallel processes with respect to sequential code. Higher is better.
6.14 The figure shows how tm is updated for (a) ibmpg2 and (b) ibmpg5 with MC iterations for P parallel processes.
6.15 Showing a breakdown of the total runtime (in terms of percentages) consumed by different tasks in the code.
6.16 (a) tBDF212 vs. branch count for all test grids and (b) scalability analysis for grids that only have straight trees.
A.1 (a) A typical interconnect tree T with its corresponding graphs (b) G(T), (c) the converse G′(T) and (d) Part of graph Γ(A) for any two adjacent points i and k. Here, N = 4 and the vertex at n1 is the root.
A.2 All paths starting from the root and ending in a diffusion barrier for (a) G(T) and (b) the corresponding converse paths in G′(T).
List of Symbols
Symbol Description
σ Hydrostatic stress
t Time
x Distance along the length of branch from some reference point
σth Critical Stress threshold for void nucleation
σT Thermal stress
Ja Atomic flux
B Bulk modulus
C Concentration of atoms
Cv Vacancy concentration
Ω Atomic volume
kb Boltzmann’s constant
Tm Temperature of the metal
q∗ Effective charge
Da Atomic diffusion coefficient or diffusivity
Q Activation energy for vacancy formation
Ea Activation energy in Black’s model
n Current exponent in Black’s model
G Conductance matrix
v Vector of node voltage drops across the power grid
Tamb Ambient temperature
η Dimensionless (scaled) hydrostatic stress
τ Dimensionless (scaled) time
ξ Dimensionless (scaled) distance along the length of branch from some reference point
δ Thickness of void interface
L, w, h Length, width and height of a branch
j Current density of a branch
N Number of discretizations per branch
vth Voltage drop threshold vector for mesh model
ρm Resistivity of metal (Copper)
ρb Resistivity of barrier metal (Tantalum)
GT Thermal conductance matrix
CT Thermal capacitance matrix
gxT, gyT, gzT Thermal conductances in the x, y and z directions
Tzs Stress free annealing temperature
x State vector for state space representation of a system
A System matrix for state space representation of a system
B Input matrix for state space representation of a system
L Output matrix for state space representation of a system
u Input vector for state space representation of a system
y Output vector for state space representation of a system
h Time step taken by the numerical method
ai, bi Scalar coefficients of a numerical method
εPLTE Principal local truncation error
T Random variable that represents the statistics of time to failure of a grid
F (t) Cumulative Distribution Function of a Random variable
µ Mean Time to Failure
v Unbiased estimator of variance
zζ/2 The (1− ζ/2)-percentile of the standard normal distribution
Φ(t) The cdf of standard normal distribution
φ(t) Probability density function (pdf) of the standard normal distribution
tm Active set cutoff threshold
Chapter 1
Introduction
1.1 Motivation
On-die power/ground (p/g) grids are subjected to a wide variety of degradation mechanisms.
For example, the p/g grid must be designed to withstand the deterioration resulting from Time-
Dependent Dielectric Breakdown (TDDB), current crowding at corners in the metal structure,
and stresses generated due to non-uniform temperature distribution and electromigration. As a
result of these ongoing phenomena, the capacity of the grid to deliver the required power to the
underlying logic circuits reduces over time until it finally fails. Accurately accounting for these
degradation mechanisms is key to optimally designing a power grid that is fast and reliable in
the field for a desired amount of time.
As a result of continued scaling of integrated circuit technology, electromigration (EM) has
become a major reliability concern for the design of on-die power grids in large integrated
circuits. Electromigration is the mass transport of metal atoms due to momentum transfer
between electrons and the atoms in a metal line. This ‘mass transport’ of metal atoms eventually
leads to void formation in the metal line, which degrades its conductivity. If multiple lines
experience failure due to EM, a grid might not be able to provide enough voltage to the
underlying logic blocks, which will result in timing violations and failure of the whole IC.
While it is next to impossible to avoid EM degradation in narrow metal lines, one can design
the power grids to withstand EM damage for a target lifetime. This is where EM models and
CAD tools come into play: their main purpose is to estimate EM damage in a given layout so
that the designer can judiciously use metal resources. While signal and clock lines also suffer
from EM degradation, it is often the case that these lines carry bidirectional current. As a
result, the damage caused by EM is partially reversed and these lines have a longer lifetime. In
contrast, p/g lines carry mostly unidirectional current, with no benefit of healing. Moreover,
the signal lines are more likely to degrade due to thermal fatigue, rather than electromigration
damage [6].
Electromigration is a complex phenomenon and its study, spanning several decades, includes
theoretical analysis, empirical and physical models and full-chip EM checking techniques. When
1
Chapter 1. Introduction 2
Figure 1.1: Wire lifetime and current density scaling. Figure taken from [2].
EM was first discovered to be a failure mechanism for commercial IC designs in 1966 [7], the
initial solution was to make the lines wider. However, wider lines entail less area for routing,
which leads to more design iterations and longer time-to-market, that ultimately results in
less return on investment. Hence, a lot of research has been conducted since 1966 on the
reliability of metal lines under the influence of EM, with the purpose of understanding and
controlling EM damage. Some of this research was focused on improving the resilience of metal
lines to EM failures by improving the fabrication processes and the materials involved. Other
researchers focused on estimating the EM degradation using mathematical models. Simple
empirical EM models, such as the Black’s model [8], were proposed that helped in understanding
the dependence of EM on the current density, line microstructure and a host of other factors. A
series model was proposed to estimate the reliability of the whole power grid from the reliability
of its individual metal lines [9], where it was conservatively assumed that one line failure would
cause the whole system to fail. Based on the series model, and some simplifying assumptions,
Statistical Electromigration Budgeting (SEB) was proposed [10] to allow for reliability trade-offs
between different parts of the grid. Black’s model for line failure, combined with the series
model for grid failure, is used in state-of-the-art industrial tools today for EM checking.
Industrial EM tools, based on simple failure models, have been adequate for the past 40 years.
However, over the last decade, technology scaling has exacerbated EM [2, 7]. It is now becoming
much harder to sign off on chip designs using state-of-the-art EM checking tools, as there is
no margin left between the predicted EM stress (obtained from the EM tools) and the EM
design rules (formulated based on a target lifetime) [11]. There are at least two reasons for
the loss of the safety margins. First, the EM lifetime itself is becoming progressively worse
due to technology scaling. Fig. 1.1 shows the lifetime and current density trends as the metal
pitch is reduced due to technology scaling. As the interconnect dimensions are scaled down
in smaller technology nodes, their lifetime under the influence of EM decreases even under
constant current density [12]. Moreover, since the supply voltages are not scaling down by
the same factor as the line widths, the current densities keep on increasing, which further
reduces the EM lifetimes. Second, the loss of safety margins can also be traced back to the
simplicity and pessimism built into the EM models used by the industrial tools. This simplicity
and pessimism are often rationalized on the grounds of necessity (the actual physical system is
too complex to be analyzed, and modern power grids are very large with up to a billion nodes)
and conservatism (the analyzed system is worse than the actual one). But, as the IC designs
become more complex and new factors come to bear, this simplicity and pessimism, combined
with reduction in EM lifetimes, leave no breathing room for designers who are now forced
to over-design the grids. Thus, there is a need to reconsider the traditional approaches and
develop better EM models that can accurately assess EM degradation so that we can eliminate
the pessimism built into state-of-the-art EM tools and accurately estimate EM lifetime.
1.2 Contribution
The goal of this research is twofold: first to develop an EM model that can accurately estimate
EM lifetime and second, to use that EM model for the verification of on-die power grids. Given
that it is hard to model all the complexity of the EM phenomenon using empirical models, we
will use physics-based EM models for our work. Several physics-based EM models have been
proposed in the literature [13, 14, 15, 16, 17, 18, 19], some of which have been used for power
grid EM checking [20, 21, 22, 23], but as we will explain in the next chapter, these approaches
are either so slow that they are not scalable to large grids or they are simplified in a way that
prevents them from taking into account all the factors that affect EM in real designs, so that
they are inaccurate.
In this work, we propose a fast and scalable finite-difference based physical EM checking
approach that accounts for process and temperature variations across the die. Our major
contributions are:
1. We propose a new physics-based EM model that builds on Korhonen’s one-dimensional
(1D) physical model [16] and augments it by introducing boundary laws at junctions
(where multiple branches meet) to track the material flow and stress evolution in multi-branch
metal segments (for arbitrarily complex geometries). We also account for the thermal
stresses generated by the non-uniform temperature distribution across the grid. We refer
to this as the Extended Korhonen’s Model, or EKM.
2. For each tree, EKM starts out as a system of partial differential equations (PDEs) coupled
by boundary laws. We show that this PDE system can be expressed as a succession of
Linear Time Invariant (LTI) systems, where each state represents the hydrostatic stress
at some point on the tree. We study the properties of this linear system to justify the
use of some well-known practices in the field, such as the use of effective DC currents in
EM analysis.
3. We develop new numerical approaches, based on Backward Differentiation Formulas
(BDFs) and model order reduction techniques, that are very fast and efficient as com-
pared to the traditional solvers for solving the LTI systems resulting from EKM. These
approaches are optimized by eliminating the Newton iteration step usually associated with
BDFs, and by using customized error control for the problem at hand. These optimized
solvers are partly the reason that our approach is scalable to large grids.
4. We propose a Power Grid EM checking scheme that uses
a) Compact Thermal Models (CTMs) [24] to determine the temperature distribution
of the grid,
b) Extended Korhonen’s Model to track EM degradation in the metal segments and
c) the mesh model [25], as opposed to the series model, to determine grid failure.
The mesh model factors in the inherent redundancy of modern power grids while estimat-
ing its reliability, and gives an accurate estimate of the grid lifetime. The random nature
of EM degradation, caused by process variation, is taken care of by using a Monte Carlo
method, in which successive samples of the grid time to failure (TTF) are found, until
the estimate of the overall Mean Time to Failure (MTF) has converged. We improve our
runtime and scalability by using several filtering schemes that estimate up-front the active
set of trees that are most likely to impact the MTF assessment of the grid. We show that
the filtering schemes have a minimal impact on the accuracy of MTF estimation. Since
EKM provides a natural way to account for early failures (big voids that disconnect the
via above), we also detect early failures and update the state of the system accordingly.
On the implementation side, we parallelize our code using a multi-process architecture to
take advantage of all available cores in a machine.
Testing our approach on the IBM grid benchmarks [26] and internal benchmarks, with the
largest grid having 4.1M nodes, shows that the MTFs estimated using our physics-based approach
are on average 2.35x longer than those based on a (calibrated) Black’s model. This supports
the claim that Black’s model can be highly inaccurate for modern power grids and confirms the
need for physical models. With a run-time of only around 16.2 minutes for the most
difficult-to-solve grid and 10.3 minutes for the largest (4.1M) grid, our approach is fast and
should scale well for large integrated circuits.
1.3 Organization
The thesis is organized as follows: Chapter 2 provides the necessary background material on
electromigration and the prior art regarding the EM models and power grid EM checking
approaches. It also covers the basics of ODE solvers, mean estimation of distributions and
LTI models. Chapter 3 presents the Extended Korhonen’s Model and verifies it by comparing
its results with data from experiments published in the literature. In Chapter 4, we study
in detail the LTI models arising out of EKM and introduce the concept of state stamps, that
can be used to quickly and efficiently assemble the LTI system. Chapter 5 develops fast and
scalable numerical approaches that are used to obtain the stress evolution in trees over time
and to determine the time and location of the next void nucleation in a tree. In Chapter 6, we
describe in detail our power grid EM checking approaches that use the physics-based EM model
we proposed in Chapter 3. We also compare the MTF estimates obtained using a calibrated
Black’s model and EKM to show the inherent limitations of Black’s model. We conclude
and give future research directions in Chapter 7.
Chapter 2
Background
In this chapter, we will review the required background material. We will start by reviewing
the basics of Electromigration in Section 2.1, followed by the mathematical models that have
been proposed to explain the process of EM degradation in Section 2.2. We will then focus
on one particular physics-based EM model, namely Korhonen’s model and its adaptations in
Section 2.3. In Section 2.4 we will review the industrial EM checking approaches for power grids,
with some recently proposed enhancements and, in Section 2.5, we will present the power grid
model that is used in the field to perform EM checks. In Sections 2.6 and 2.7, we will review the
numerical methods for solving Partial Differential Equations (PDE) and Ordinary Differential
Equations (ODE), respectively. We will then apply one of the numerical methods (method of
lines) to the heat transfer PDE in Section 2.8 and show the electro-thermal equivalence. In
Section 2.9, we will review the state space models and finally in Section 2.10, we will review
the Monte Carlo random sampling approach for estimating the mean of a distribution within
user specified error tolerances.
2.1 Electromigration Basics
Electromigration is the mass transport of metal atoms due to momentum transfer between
electrons (driven by an electric field) and the atoms in a metal line. Equivalently, one can
also say that EM is the diffusive motion of vacancies in a metal segment under the influence
of an applied electric field and/or stress gradients. A vacancy is the absence of a metal atom
in a crystal lattice. As we will see a little later, the movement of atoms/vacancies generates
mechanical stress within a metal segment, which is used as a measure of EM degradation. EM
is highly dependent on the specific microstructure of a given line. As such, due to random
manufacturing variations, the time to failure (TTF) due to EM is a random variable. For a
given microstructure, the rate of EM degradation depends on the type of metal, geometry,
temperature and current density of the given line segment.
2.1.1 Atomic Flux
Under conditions of high current density, metal atoms are pushed in the direction of the electron
flow. The number of atoms moving across a cross-section of a metal line per second per unit
area is known as the atomic flux. The total atomic flux in a metal segment is the result of
fluxes generated by two different phenomena:
i) electronic flux, which is generated by the applied electric field and is always opposite to the
direction of the applied electric field (i.e. the atoms are pushed in a direction opposite to
the applied electric field), and
ii) gradient flux, which is generated by the stress gradient itself and always flows from points of low
vacancy concentration (i.e. compressive stress) to points of high vacancy concentration (i.e. tensile
stress).
Note that the gradient flux counteracts the electronic flux. For example, consider a finite metal
line embedded in a rigid dielectric material. Then, the metal atoms and the atomic flux are
confined within the line. We express this by saying the atomic flux is blocked at the boundary
and cannot escape. Now, if we apply a strong electric field in the line, the electric current
will flow from anode to cathode (recall that by convention, electric current always flows from
anode to cathode). Then, the electronic flux will push the metal atoms from cathode to anode.
Correspondingly, the vacancies will move towards the cathode, and will generate tensile stress
there. The anode end of the line will develop compressive stress. This stress gradient in turn
generates the gradient flux that flows from anode to cathode, and opposes the electronic flux. A
higher spatial stress derivative leads to a higher gradient flux and vice versa. The phenomenon
of gradient flux opposing the electronic flux is often referred to as the back-stress effect in the
literature [27]. The process of EM degradation can be divided into two phases: void nucleation
and void growth.
2.1.2 Void Nucleation Phase
If the in-flow of metal atoms is equal to the out-flow at every point on a line segment, then
clearly no deformation or failure will occur. On the other hand, if the in-flow is not equal to
the out-flow, atomic flux divergence (AFD) is said to occur. AFD is a necessary prerequisite
for EM degradation and is typically observed in locations with some sort of barrier to atomic
movement, such as at the end of a line, at locations where the width of the metal segment
changes or around grain boundaries where the microstructure changes. Flux divergence at
these locations generates points of high tensile and compressive stresses within the segment.
The amount of compressive stress needed to cause a pile-up of metal atoms (a hillock) leading
to a short circuit is very high in modern metal systems, hence failure due to short circuit is
not usually observed. However, the build up of tensile stress eventually leads to formation of
a void when the stress reaches a pre-determined critical threshold. This initial phase of EM
Figure 2.1: (a) A conventional or late failure, (b) early failure and (c) simple schematic representation for both failures. (a) and (b) taken from [3] and [4], respectively.
degradation, when stress is increasing over time but the void has not yet nucleated, is known
as the void nucleation phase.
If the critical stress threshold for void nucleation cannot be reached, the stress profile settles
at some steady state value. This happens because as the tensile and compressive stresses in
a metal segment increase with time, the gradient flux also increases. On the other hand, the
electronic flux remains constant because it depends on the applied electric field. When the
gradient flux becomes equal to the electronic flux, the net atomic flux becomes zero and the
system reaches a steady state. For a given metal segment, the steady state stress profile is
primarily determined by the applied electric field.
2.1.3 Void Growth Phase
Once a void nucleates, the void growth phase begins. In some cases, depending on the geometry
and the location of the void, nucleation by itself might be enough to cause failure due to open
circuit by disconnecting the via [28], as shown in Fig. 2.1b and the schematic of Fig. 2.1c. These
failures are often observed in testing and are typically referred to as early failures. Early failures
give rise to bimodal TTF distributions [29]. On the other hand, a line may still continue to
conduct current even after void nucleation, so that it is not quite an open circuit. This situation
is shown in Fig. 2.1a and the schematic of Fig. 2.1c, and is referred to as a conventional failure.
In this case, the void grows in the direction of the electronic flux and the line resistance increases
towards some finite steady-state value. Even if the void spans the whole cross-section of the
line, conduction remains possible through the high resistance barrier metal liner surrounding
the metal, as shown in Fig. 2.1c. In testing of single isolated lines, failure is deemed to happen
when the increase in resistance is 10%–20% of the initial resistance value.
2.1.4 Effective-EM Current
EM is a long-term failure mechanism. As such, short-term transients typically experienced
in chip workloads do not play a significant role in EM degradation. Thus, standard practice
in the field is to use an effective-EM current model [30] to estimate EM degradation, so that
the lifetime of a metal line when carrying the constant effective current and the time-varying
transient current is the same. The effective-EM current is often computed based on some
assumed periodic current waveform with period tp. If the waveform is unidirectional, then the
effective-EM current is equal to the time-average current density [31]
\[
j_{dc,eff} = j_{avg} = \frac{1}{t_p} \int_0^{t_p} j(\tau)\, d\tau. \tag{2.1}
\]
For the case of bidirectional currents, let j+(t) and j−(t) denote the current waveforms in the
chosen positive and negative directions, respectively. Then, the effective-EM current density is
given as [30, 32]
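\[
j_{ac,eff} = \frac{1}{t_p} \left( \int_0^{t_p} j^{+}(\tau)\, d\tau - \varphi \int_0^{t_p} |j^{-}(\tau)|\, d\tau \right), \tag{2.2}
\]
where ϕ is the EM recovery factor that is determined experimentally. The positive direction is
chosen such that $\int_0^{t_p} j^{+}(\tau)\, d\tau \ge \int_0^{t_p} |j^{-}(\tau)|\, d\tau$.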
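As a minimal sketch of (2.1) and (2.2), the effective-EM current density can be approximated from one period of a uniformly sampled waveform using a rectangle rule. The function name, the sampling scheme and the recovery factor passed in by the caller are assumptions made for illustration; they are not taken from [30, 32].

```python
from typing import Sequence


def effective_em_current_density(samples: Sequence[float], dt: float,
                                 recovery_factor: float) -> float:
    """Approximate (2.2) for one period of a uniformly sampled waveform.

    samples: current densities j(k*dt) over one period, t_p = len(samples) * dt.
    recovery_factor: the EM recovery factor (hypothetical value chosen by the caller).
    """
    t_p = len(samples) * dt
    # Rectangle-rule integrals of the two directions of current flow.
    forward = sum(j for j in samples if j > 0.0) * dt
    reverse = sum(-j for j in samples if j < 0.0) * dt
    # The positive direction is the one with the larger integral.
    major, minor = max(forward, reverse), min(forward, reverse)
    return (major - recovery_factor * minor) / t_p
```

For a unidirectional waveform the smaller integral is zero, so the result reduces to the time average in (2.1).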
2.2 EM failure Models
Many empirical and physics-based models have been proposed to explain EM degradation in a
line. We will now review some of these models, focusing on EM models that are important to
understand the contribution of this work.
2.2.1 Black’s model
One of the earliest empirical models for estimating the EM mean time to failure (MTF) was
proposed by J. R. Black in 1969 [8]. As per his model, the time to failure (TTF) of an isolated
metal line has a lognormal distribution (to account for the randomness due to microstructure)
with mean time to failure given as
\[
MTF = \frac{A_{bl}}{j^{n}} \exp\left( \frac{E_a}{k_b T_m} \right), \tag{2.3}
\]
where Abl is a proportionality constant, j is the constant current density (current per unit
cross-sectional area) in the line, kb is Boltzmann’s constant, Tm is the temperature of the line,
n is the so-called current density exponent and Ea is the activation energy. The parameters
Abl, n and Ea are determined experimentally using accelerated testing: isolated metal lines are
tested with high current densities at higher than typical operating temperatures. The TTFs
thus obtained are fitted to a lognormal distribution using goodness of fit methods to estimate
the MTF under testing conditions. The parameters Abl, n and Ea are then determined using
regression analysis [33], and are used for extrapolating the results back to typical operating
conditions.
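The extrapolation step can be sketched as follows. Because Abl cancels in the ratio of two MTFs computed from (2.3), the acceleration factor between test and operating conditions depends only on n and Ea. The function names are illustrative, and any numerical values supplied to them are hypothetical rather than fitted data.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann's constant in eV/K


def black_mtf(a_bl: float, j: float, n: float, e_a_ev: float,
              temp_k: float) -> float:
    """Mean time to failure as per (2.3); e_a_ev in eV, temp_k in Kelvin."""
    return a_bl / j ** n * math.exp(e_a_ev / (K_B_EV * temp_k))


def acceleration_factor(j_test: float, temp_test_k: float,
                        j_use: float, temp_use_k: float,
                        n: float, e_a_ev: float) -> float:
    """Ratio MTF(use) / MTF(test) implied by (2.3); A_bl cancels out."""
    current_term = (j_test / j_use) ** n
    thermal_term = math.exp(e_a_ev / K_B_EV
                            * (1.0 / temp_use_k - 1.0 / temp_test_k))
    return current_term * thermal_term
```

Multiplying the MTF measured under accelerated testing by this factor gives the extrapolated MTF at operating conditions, which is valid only if n and Ea are the same under both sets of conditions.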
Later, Blech et al. [34, 35, 36] discovered that not all lines fail due to EM: an isolated
metal line (that has not already failed) is immune to EM failure if the product of its length and
current density is less than the critical Blech product (jL)c, defined as [37]
\[
(jL)_c = \frac{\Omega \Delta\sigma_{max}}{q^{*}\rho}, \tag{2.4}
\]
where Ω is the atomic volume, ∆σmax > 0 is the maximum stress difference between the cathode
and the anode before void nucleation occurs, q∗ is the absolute value of the effective charge of
the migrating atoms and ρ is the resistivity of the metal. This phenomenon later came to be
known as the Blech effect.
Equation (2.3), combined with the Blech effect (2.4), is known as Black’s model and
is currently the EM model used in state-of-the-art commercial tools. The benefit of
using Black’s model is that it is computationally very fast and scales well as the problem size
increases. However, Lloyd [38] pointed out that the fitting parameters Abl, n and Ea obtained
under accelerated testing conditions are not valid at actual operating conditions, and this
leads to significant errors in lifetime extrapolations. Further, Hauschildt et al. [39] conducted
experiments which demonstrated that n depends on the temperature and thermal stress and Ea
depends on the current density of the line. These observations make the use of Black’s model
controversial.
2.2.2 Physics-based EM models
To remedy the shortcomings of Black’s model, many physics-based EM models have been
proposed. These physics-based models are often presented in the form of partial differential
equations (PDE), that express how a physical quantity of interest, which provides a measure
of EM degradation, is influenced by factors such as the material properties, geometry, current
density and temperature of the metal structure. The PDE, coupled with appropriate boundary
Figure 2.2: A simple volume element with flux divergence.
Figure 2.3: 3D stress tensor on a small volume element. For each component, the first subscript/index denotes the direction of the outward normal from the face and the second subscript/index is the direction of the stress acting on that face.
conditions, can track the EM degradation of a metal structure. Physics-based EM models are
versatile and can be easily adapted to handle different configurations, as opposed to Black’s
model where the fitting parameters are usually valid only for the range of conditions under
which they were obtained.
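Most physics-based EM models are based on the following continuity equation
\[
\frac{\partial C_v}{\partial t} = \nabla \cdot J_a + \gamma(t), \tag{2.5}
\]
where Cv is the vacancy concentration, i.e. the number of vacancies per unit volume, Ja is the atomic
flux, γ(t) is a sink/source term that models the recombination/generation of vacancies at grain
boundaries and ∇· is the divergence operator, which in Cartesian coordinates, with
Ja = (Ja,x, Ja,y, Ja,z), can be stated as:
\[
\nabla \cdot J_a = \frac{\partial J_{a,x}}{\partial x} + \frac{\partial J_{a,y}}{\partial y} + \frac{\partial J_{a,z}}{\partial z}. \tag{2.6}
\]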
Simply put, (2.5) states that for a small volume element, the time rate of change of vacancy
concentration is equal to the sum of the spatial gradient (derivative) of atomic flux and the rate
of recombination/generation of vacancies (higher flux gradient means higher flux divergence
and vice-versa). For example, consider a small volume element, for which the out-flow of atoms
is greater than the in-flow (Fig. 2.2), which means a positive gradient for Ja. If we ignore γ(t)
for simplicity, then we can see that the vacancy concentration in the volume element increases
with time, which generates tensile stress and may eventually lead to a void nucleation. The
physics-based EM models proposed in the literature differ in what they use as a measure of EM
degradation, and how they account for the recombination/generation of vacancies.
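To make the sign convention of (2.5) concrete, the sketch below evaluates its one-dimensional form on a row of cells with γ(t) ignored. The discretization, the function name and the flux values are assumptions chosen for illustration only.

```python
from typing import List, Sequence


def vacancy_rate_1d(face_flux: Sequence[float], dx: float) -> List[float]:
    """Discrete 1D form of (2.5) with the source term ignored.

    face_flux[i] is the atomic flux (positive along +x) through the face
    between cell i-1 and cell i, so n cells need n + 1 face values.
    Returns dCv/dt for each cell: (out-flow - in-flow) / dx.
    """
    return [(face_flux[i + 1] - face_flux[i]) / dx
            for i in range(len(face_flux) - 1)]


# A line with blocked ends (zero flux at both boundaries) and a uniform
# atomic flux of 1.0 along +x in the interior.
rates = vacancy_rate_1d([0.0, 1.0, 1.0, 1.0, 0.0], dx=1.0)
```

Only the two end cells see flux divergence in this example: vacancies accumulate in the first cell, which atoms leave, and are depleted in the last cell, where atoms pile up. The interior cells have equal in-flow and out-flow and are unaffected.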
The earliest physics-based models [13, 14] used vacancy concentration as a measure of EM
degradation and their failure criterion was based on a critical vacancy concentration, i.e. if the
vacancy concentration at any point along the metal line reaches a critical value, a void nucleates
at that point. However, when this model was applied to isolated metal lines, it was found that
the predicted failure times were orders of magnitude smaller than the observed failure times.
This anomaly was corrected by Kirchheim [15], who proposed the first EM model which used
hydrostatic stress σ as a measure of EM degradation. Here, hydrostatic stress is the average
of all normal components of the full stress tensor (see Fig. 2.3), i.e. σ = (σxx + σyy + σzz)/3.
Kirchheim’s model used the relationship between vacancy concentration and stress to “track”
the evolution of stress in a line. A void nucleates when stress along any point on the line reaches
a critical stress threshold. Kirchheim’s model was later simplified by Korhonen et al. [16] using
Hooke’s Law. Further, Kirchheim and Korhonen et al. solved their respective models to obtain
a closed form expression for σ(x, t) (stress as a function of position x on the line at time t)
for a simple configuration: a single metal line embedded in a rigid dielectric with atomic flux
blocked at the line ends. We will study Korhonen’s model in detail in the next section.
All EM models presented up to this point are one-dimensional (1D) models, i.e. at any
given point along the line (x axis), the stress gradients along the y and z axes are ignored
by assuming that the stress is uniform over the whole cross sectional area. These 1D models
require more computation than Black’s model, but scale moderately well as the problem size
increases. Sarychev et al. [17] proposed the first three dimensional (3D) EM model that can
track stress along the x, y and z axes. Later, Sukharev et al. [18] introduced the concept of
‘plated’ atoms to capture generation/annihilation of vacancies at grain boundaries and Orio
[19] introduced the notion of a 3D diffusion coefficient to model EM degradation in greater
detail. These 3D EM models, though accurate, are computationally expensive and do not scale
well. As such, they are not suitable for full-chip p/g grid EM checking.
2.3 Korhonen’s Model and its adaptations
In this section, we will review the 1D EM model proposed by Korhonen [16], which will be
referred to as Korhonen’s model throughout this work. We will then focus on some of its
adaptations proposed in the literature.
2.3.1 The Korhonen’s model
Consider a metal line confined in a rigid dielectric material with line length along the x axis, as
shown in Fig. 2.4. If it is assumed that stress is uniform across the cross section of the line, then
for any volume element within the line, the relative change in C(x, t), the number of metal atoms
per unit volume, corresponds to the increment in hydrostatic stress σ(x, t) as per Hooke’s Law
\[
\frac{dC}{C} = -\frac{d\sigma}{B}, \tag{2.7}
\]
Figure 2.4: Schematic for a confined metal line, showing a volume element.
where B is the bulk modulus and C is often referred to as the concentration of atoms. In
an ideal lattice, C = 1/Ω, where Ω is the atomic volume. The atomic flux Ja in the volume
element is a combination of the gradient flux, generated when ∂σ/∂x ≠ 0, and the electronic
flux, generated when the current density j ≠ 0. It can be stated as
\[
J_a = \frac{D_a C \Omega}{k_b T_m} \left( \frac{\partial \sigma}{\partial x} - \frac{q^{*}\rho}{\Omega} j \right), \tag{2.8}
\]
where Da is the coefficient of atomic diffusion (also called the diffusivity), kb is Boltzmann’s
constant, Tm is the temperature in Kelvin, q∗ is the absolute value of the effective charge of
the migrating atoms and ρ is the resistivity of the conductor. Using (2.7) and (2.8) in (2.5),
assuming γ(t) to be proportional to −∂C/∂t and applying some simplifying approximations,
Korhonen proposed that the hydrostatic stress σ(x, t), at location x from some reference point
and at time t, can be found by solving the following PDE
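\[
\frac{\partial \sigma}{\partial t} = \frac{B\Omega}{k_b T_m} \frac{\partial}{\partial x} \left[ D_a \left( \frac{\partial \sigma}{\partial x} - \frac{q^{*}\rho}{\Omega} j \right) \right]. \tag{2.9}
\]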
In Korhonen’s formulation, σ is positive for tensile stress and negative for compressive stress.
If the stress at any point along the line reaches the critical stress threshold σth > 0, a void
nucleates at that point. Korhonen’s model captures the dynamics of stress evolution within a
volume element, and as with any PDE, one needs to specify boundary conditions and initial
conditions in order to obtain a solution. Note that in (2.9), it is implicitly assumed that stress,
diffusivity and current density are differentiable with respect to x.
Diffusivity of metal lines
The atomic diffusion coefficient Da is usually expressed using the Arrhenius law
\[
D_a = D_0 \exp\left( -\frac{Q}{k_b T_m} \right), \tag{2.10}
\]
Figure 2.5: (a) Stress evolution at different points along the line (x = 0, x = L/2 and x = L) and (b) stress profile along the line at different time points (t = 0, 0.20, 0.80, 1.80, 3.80 and 10.00). Both panels plot stress (MPa), from −250 to 250, against time (0 to 10 yrs) in (a) and against length (0 to 50 × 10−6 m) in (b).
where D0 is a constant and Q is the activation energy for vacancy formation and diffusion. The
randomness in TTF due to EM is primarily accounted for by the corresponding randomness
in Da, which has been shown to be lognormally distributed [40]. Strictly speaking, Da also
depends on the stress value at a given point. However, it has been reported that the numerical
results with stress dependent Da are “not too different” from constant Da [16]. Hence, as in
many previous works [20, 21, 22, 23, 41], we will assume that Da is stress-independent.
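A short sketch of how (2.10) and the lognormal randomness of Da might be combined when drawing samples is given below. Tying the median of the distribution to the Arrhenius value, the spread parameter sigma_ln and the function names are all assumptions made for illustration.

```python
import math
import random

K_B_EV = 8.617333262e-5  # Boltzmann's constant in eV/K


def arrhenius_diffusivity(d0: float, q_ev: float, temp_k: float) -> float:
    """Nominal diffusivity as per (2.10); q_ev in eV, temp_k in Kelvin."""
    return d0 * math.exp(-q_ev / (K_B_EV * temp_k))


def sample_diffusivity(d_nominal: float, sigma_ln: float,
                       rng: random.Random) -> float:
    """Draw one lognormal sample whose median is d_nominal.

    sigma_ln, the standard deviation of ln(Da), is a hypothetical
    spread parameter supplied by the caller.
    """
    return rng.lognormvariate(math.log(d_nominal), sigma_ln)
```

Because the temperature appears in the exponent, a modest temperature increase raises the nominal diffusivity substantially, while the lognormal draw models the line-to-line variation around that nominal value.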
2.3.2 Solution for blocking boundary at both ends
Korhonen provided an analytical solution for a finite line with flux blocked at both ends.
Consider a finite metal segment of length L that carries a current density j and has a constant
diffusivity Da throughout the line. Korhonen assumed blocked boundary conditions (flux was
blocked at both ends), i.e. Ja(0, t) = Ja(L, t) = 0 and zero initial stress in the metal segment.
Then, as per (2.9), the stress can be found as
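\[
\sigma(x,t) = \frac{q^{*}\rho j L}{\Omega} \left[ -\frac{1}{2} + \frac{x}{L} + 4 \sum_{n=0}^{\infty} m_n^{-2} \exp\left( -\frac{m_n^{2} \nu t}{L^{2}} \right) \cos\left( \frac{m_n x}{L} \right) \right], \tag{2.11}
\]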
where mn = (2n+1)π and ν = DaBΩ/(kbTm). We will refer to (2.11) as the reference solution
for the finite line. Fig. 2.5 shows the stress evolution for a finite line as per (2.11) with L = 50 µm
and j = 6 × 10^9 A/m^2 flowing from x = 0 to x = L. Since the current flows from x = 0 (anode)
to x = L (cathode), the electron flow pushes the metal atoms in the opposite direction. This
results in development of tensile stress at x = L (cathode) and compressive stress at x = 0
(anode), as shown in Fig. 2.5.
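The reference solution can be evaluated numerically by truncating the series. The sketch below lumps q∗ρ/Ω into a single argument and is intended for normalized units; the function name, the argument names and the truncation length are assumptions for illustration. The series term is added with the sign for which the stress vanishes at t = 0, as the zero-initial-stress assumption requires.

```python
import math


def korhonen_stress(x: float, t: float, length: float, j: float,
                    nu: float, q_rho_over_omega: float,
                    terms: int = 200) -> float:
    """Truncated-series evaluation of the finite-line reference solution.

    x: position along the line (0 = anode, length = cathode).
    nu: Da * B * Omega / (kb * Tm), as defined in the text.
    terms: number of series terms kept (an illustrative choice).
    """
    series = 0.0
    for n in range(terms):
        m = (2 * n + 1) * math.pi
        series += (math.exp(-m * m * nu * t / length ** 2)
                   * math.cos(m * x / length) / (m * m))
    return q_rho_over_omega * j * length * (-0.5 + x / length + 4.0 * series)
```

With all parameters set to 1, the stress at the cathode (x = length) rises from approximately zero at t = 0 towards the steady-state value of 0.5, while the anode (x = 0) approaches −0.5. The slow 1/n² convergence of the series at t = 0 is why a few hundred terms are kept here.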
Role of j, L and Da
The final steady state stress profile across the line can be easily obtained by setting t = ∞ in
(2.11), and is given by
\[
\sigma(x,\infty) = \frac{q^{*}\rho j L}{\Omega} \left( -\frac{1}{2} + \frac{x}{L} \right). \tag{2.12}
\]
The stress profile at t = 10 yrs, as shown in Fig. 2.5b, is almost the steady state stress profile.
As per (2.12), the steady state stress profile depends on the product of current density j and
line length L. Note that the steady state tensile stress at the cathode is the maximum tensile
stress that can be achieved in the line. Thus, for a finite line to be EM immune, we must have
\[
\max[\sigma(x,\infty)] = \sigma(L,\infty) < \sigma_{th} \implies jL < \frac{2\Omega\sigma_{th}}{q^{*}\rho}, \tag{2.13}
\]
which is the same as the critical Blech product (2.4) with ∆σmax = 2σth (the stress difference
between the cathode and the anode is maximum during the steady state).
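The immunity condition (2.13) translates directly into a simple check for an isolated line with blocked ends. The function names below are illustrative, and the caller is assumed to supply all quantities in consistent units.

```python
def critical_blech_product(sigma_th: float, omega: float,
                           q_star: float, rho: float) -> float:
    """Critical jL product from (2.13), i.e. (2.4) with a stress range of 2*sigma_th."""
    return 2.0 * omega * sigma_th / (q_star * rho)


def is_em_immune(j: float, length: float, sigma_th: float,
                 omega: float, q_star: float, rho: float) -> bool:
    """True if the steady-state cathode stress stays below sigma_th."""
    return abs(j) * length < critical_blech_product(sigma_th, omega, q_star, rho)
```

A line that passes this check never reaches the critical stress, so no void nucleates regardless of how long the current is applied. The check applies only to the single-line configuration for which (2.12) was derived.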
As mentioned before, the atomic flux should be zero at steady state, and this is also readily
observable from Korhonen’s model. From (2.12), it is easy to see that at t = ∞
\[
\frac{\partial \sigma}{\partial x} = \frac{q^{*}\rho j}{\Omega}, \tag{2.14}
\]
which when used in (2.8), gives
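\[
J_a = \frac{D_a C \Omega}{k_b T_m} \left( \frac{\partial \sigma}{\partial x} - \frac{q^{*}\rho}{\Omega} j \right) = \frac{D_a C \Omega}{k_b T_m} \left( \frac{q^{*}\rho}{\Omega} j - \frac{q^{*}\rho}{\Omega} j \right) = 0. \tag{2.15}
\]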
For a given current density, the time rate of change of stress depends on the atomic diffusion
coefficient Da: a higher value of Da leads to a higher rate of EM degradation and vice versa.
Since Da has an exponential dependence on temperature, it becomes important to include
temperature in EM analysis. The observations that the steady state stress profile depends on
j and L and that the derivative of stress with respect to time depends on Da are applicable for
complex interconnect structures as well.
2.3.3 Riege-Thompson Model
Korhonen’s analytical solution for stress evolution in case of a finite line is theoretically inter-
esting, but is not practically useful as modern ICs are made of connected metal segments that
have complex geometries. Thus, many authors have made efforts to adapt Korhonen’s model
to track stress in multi-branch interconnect structures.
S. P. Hau-Riege and C. V. Thompson [42] developed a closed form analytical expression for
stress evolution at a junction (a point where multiple metal lines meet). They supplemented
Korhonen’s model with boundary conditions that model the interaction of atomic flux at the
junction and conceptually replaced connected branches with semi-infinite limbs. Further, they
Figure 2.6: Comparison of stress evolution at the cathode of a finite line calculated using the Riege-Thompson model and the reference solution (2.11). The plot shows stress (MPa), from 0 to 500, against time (0 to 10 yrs) for the Riege-Thompson model and the exact solution for the finite line.
Figure 2.7: Simple multi-branch interconnect structures.
assumed that the stress at the other end of the limbs is constant and is always equal to the
initial stress σ0. With these simplifying assumptions, the stress evolution at the junction is
given by
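\[
\sigma_{jn}(t) = \sigma_0 + \sqrt{\frac{4t}{\pi}}\, \frac{\rho q^{*}}{\Omega} \sqrt{\frac{B\Omega}{k_b T_m}}\, \frac{\sum_k D_{a,k}\, j_k}{\sum_k \sqrt{D_{a,k}}}. \tag{2.16}
\]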
Fig. 2.6 compares the stress evolution at the cathode of a finite line using (2.11) and (2.16). Because
the Riege-Thompson model replaces branches with semi-infinite limbs, it cannot account for the
back-stress developed due to the blocking flux boundary at the anode end of the finite line. This is
why, in the Riege-Thompson model, the junction stress exceeds the steady state stress value and
the solution discrepancy increases with time. Nevertheless, it is accurate for small time-spans,
it provides an upper bound on the stress value at a junction and it has been used in some
works for power grid EM checking [21].
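The unbounded growth of the junction stress can be seen by evaluating (2.16) directly. In the sketch below, the function and argument names are illustrative, the factors BΩ/(kbTm) and q∗ρ/Ω are lumped into single arguments, and the sign convention of each jk is assumed to be such that a positive value contributes tensile stress at the junction.

```python
import math
from typing import Sequence


def junction_stress(t: float, sigma0: float,
                    diffusivities: Sequence[float],
                    current_densities: Sequence[float],
                    b_omega_over_kt: float,
                    q_rho_over_omega: float) -> float:
    """Junction stress as per (2.16) for limbs k with Da,k and jk."""
    weighted = sum(d * j for d, j in zip(diffusivities, current_densities))
    norm = sum(math.sqrt(d) for d in diffusivities)
    return (sigma0 + math.sqrt(4.0 * t / math.pi) * q_rho_over_omega
            * math.sqrt(b_omega_over_kt) * weighted / norm)
```

For a single limb, (2.16) reduces to σ0 + (q∗ρ/Ω) j √(4νt/π) with ν = DaBΩ/(kbTm). With σ0 = 0, this passes the finite-line steady-state cathode stress q∗ρjL/(2Ω) at t = πL²/(16ν) and keeps growing as √t afterwards.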
2.3.4 CTHKS Model
Chen et al. [43, 44] recently developed analytical closed form expressions for stress evolution
in simple multi-branch segments shown in Fig. 2.7. In doing so, they made the following
simplifying assumptions:
i) All branch lengths are equal, assumed to be L.
ii) All branches have the same constant diffusivity Da and temperature Tm.
iii) The initial stress at t = 0 is zero everywhere.
iv) There are no voids at t = 0.
We will refer to their model as the CTHKS model, after the initials of the authors. As per this
model, the stress evolution in branch b1 of a 3-terminal tree as shown in Fig. 2.7a is
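\[
\begin{aligned}
\sigma_1(x,t) = \frac{q^*\rho}{2\Omega}\sum_{n=0}^{\infty}\Big[\;
 & 2j_1\big\{ g(3L+4nL-x,\,t) + g(L+4nL+x,\,t) \big\} \\
 & - 2j_2\big\{ g(L+4nL-x,\,t) + g(3L+4nL+x,\,t) \big\} \\
 & + (j_2-j_1)\big\{ g(2L+4nL-x,\,t) + g(4nL-x,\,t) \\
 & \qquad\qquad\quad + g(4L+4nL+x,\,t) + g(2L+4nL+x,\,t) \big\} \Big],
\end{aligned}
\tag{2.17}
\]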
where g(u, t) is defined as
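\[
g(u,t) \triangleq 2\sqrt{\frac{\nu t}{\pi}}\,\exp\!\left(-\frac{u^2}{4\nu t}\right) - u\,\operatorname{erfc}\!\left(\frac{u}{2\sqrt{\nu t}}\right), \tag{2.18}
\]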
with erfc being the complementary error function and ν = DaBΩ/(kbTm). The authors provided
similar analytical expressions for all interconnect trees shown in Fig. 2.7, which can be found in [44].
They compared their solutions to the results obtained using COMSOL Multiphysics software
and reported a maximum percentage error of 0.5%.
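To make the structure of (2.17) and (2.18) concrete, the following Python sketch evaluates the kernel g(u, t) and a truncated version of the series for branch b1. It is only an illustration and is not taken from [43, 44]: the function names, the lumped argument `prefactor` (standing for q*ρ/Ω), the truncation length `n_terms` and the dimensionless values in the usage lines are assumptions made here, and the grouping of terms follows (2.17) as it is read in this chapter.

```python
import math


def g(u, t, nu):
    """Kernel g(u, t) of (2.18); nu = Da*B*Omega/(kb*Tm), with t > 0."""
    root = math.sqrt(nu * t)
    return (2.0 * root / math.sqrt(math.pi) * math.exp(-u * u / (4.0 * nu * t))
            - u * math.erfc(u / (2.0 * root)))


def sigma1(x, t, j1, j2, L, nu, prefactor, n_terms=20):
    """Truncated series (2.17) for branch b1 of a 3-terminal tree.

    prefactor is an assumed lumped constant standing for q*rho/Omega.
    """
    total = 0.0
    for n in range(n_terms):
        s = 4 * n * L
        total += 2 * j1 * (g(3 * L + s - x, t, nu) + g(L + s + x, t, nu))
        total -= 2 * j2 * (g(L + s - x, t, nu) + g(3 * L + s + x, t, nu))
        total += (j2 - j1) * (g(2 * L + s - x, t, nu) + g(s - x, t, nu)
                              + g(4 * L + s + x, t, nu)
                              + g(2 * L + s + x, t, nu))
    return 0.5 * prefactor * total


# Hypothetical, dimensionless inputs chosen only to exercise the functions.
print(g(0.0, 0.5, 2.0))
print(sigma1(0.3, 0.1, 1.0, 2.0, 1.0, 1.0, 1.0))
```

Because g(u, t) decays very quickly with u, the terms with large n contribute almost nothing at moderate t, so a short truncation is usually enough in this sketch.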
There are numerous shortcomings in the Riege-Thompson and CTHKS models. Neither
model is directly applicable to the complex interconnect layouts found in modern power
grids. The Riege-Thompson model allows for different diffusivities and temperatures for the
branches connected to a junction, which the CTHKS model does not. On the other hand, the CTHKS
model can account for the back-stress generated due to EM, which the Riege-Thompson model
cannot. Neither model can be applied during the void growth phase of EM. All these factors
greatly limit their usefulness for power grid EM checking.
2.4 Review of Power Grid EM checking approaches
2.4.1 Industrial EM checking approach
The state-of-the-art approach for p/g grid EM checking is to break up the grid into isolated
branches, assess the reliability of each branch separately using Black’s model and use the earliest
branch failure time as the failure time for the whole grid. Thus, it is assumed that the grid
fails as soon as any one of its branches fails. This is known as the series model of grid failure,
which was first proposed in [9]. Under the series model, the failure rate of the system is the
sum of failure rates of its individual components. Some industrial EM tools use this concept to
budget EM reliability among various parts of the grid. In other words, this allows designers to
re-balance metal usage in different parts of the grid (e.g. widening some lines to improve their
reliability while narrowing others) in a way that doesn’t impact the overall reliability of the
grid. This idea of EM budgeting was first introduced by J. Kitchin [10] in 1995 and is known
as Statistical Electromigration Budgeting (SEB).
As mentioned before, the reliability assessment for each individual branch is done using
Black’s model. Recall that as per Black’s model, 1) a line is immune to EM failure if the
product of its current density and length is less than the critical Blech product and 2) the MTF
of a branch is inversely proportional to its current density, raised to some power. For branches
that are deemed not to be EM immune as per Blech’s criterion, a maximum allowed current
density limit jmax is calculated based on a target (series model) MTF, denoted as µtarget, using
the following relation [45], which is derived from Black’s equation
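\[
j_{max} = j_{acc}\left(\frac{\mu_{acc}}{\mu_{target}}\right)^{1/n}\exp\!\left[\frac{E_a}{n k_b}\left(\frac{1}{T_{m,use}} - \frac{1}{T_{m,acc}}\right)\right], \tag{2.19}
\]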
where µacc is the observed MTF under accelerated testing conditions using current density jacc
and temperature Tm,acc, while Tm,use is the actual operating temperature at which the chip will
be used and the other symbols are as defined before.
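As a quick numerical illustration of (2.19), the Python sketch below computes jmax and checks it against Black's equation written as MTF = A j^(-n) exp(Ea/(kb Tm)). The function names and all numerical values (activation energy, current exponent, temperatures, accelerated current density, and a target MTF ten times the accelerated MTF) are hypothetical and are chosen only to exercise the formula; they are not taken from [45] or from any process data.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def black_mtf(j, temp_k, e_a, n, a_const=1.0):
    """Black's equation: MTF = A * j**(-n) * exp(Ea / (kb * Tm))."""
    return a_const * j ** (-n) * math.exp(e_a / (K_B_EV * temp_k))


def j_max(j_acc, mtf_acc, mtf_target, e_a, n, t_use, t_acc):
    """Maximum allowed current density according to (2.19)."""
    mtf_ratio = (mtf_acc / mtf_target) ** (1.0 / n)
    thermal = math.exp(e_a / (n * K_B_EV) * (1.0 / t_use - 1.0 / t_acc))
    return j_acc * mtf_ratio * thermal


# Hypothetical accelerated-test conditions and use conditions.
E_A, N_EXP = 0.8, 2.0            # eV, current exponent (assumed)
J_ACC, T_ACC = 2.0e10, 573.0     # A/m^2, K (assumed)
T_USE = 378.0                    # K (assumed)

mtf_acc = black_mtf(J_ACC, T_ACC, E_A, N_EXP)
limit = j_max(J_ACC, mtf_acc, 10.0 * mtf_acc, E_A, N_EXP, T_USE, T_ACC)
print(f"j_max = {limit:.3e} A/m^2")
```

Substituting the returned jmax and Tm,use back into Black's equation reproduces the target MTF, which is a convenient sanity check on the sign of the temperature term.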
This industrial EM checking approach is highly inaccurate for at least two reasons:
1. Ignoring Material Flow :
In order to apply Black’s model, it is implicitly assumed that the connected neighboring
branches have no impact on the lifetime of a given branch. This is incorrect because in
today’s mesh-structured power grids, many branches within the same layer are connected
as part of what is called an interconnect tree, and the atomic flux can flow freely between
them. Indeed, because connected lines influence each other, two identical connected
branches that carry the same current density can in practice have quite different values
of MTF, as Gan et al. [5] and Wei et al. [46] have demonstrated in their experiments.
2. Series System Assumption:
The second problem lies with the series system model of the power grid failure. Modern
power grids use a mesh structure. As such, there are many paths for the current to flow
from the C4 bumps to the underlying logic, a characteristic that we refer to as redundancy.
Mesh power grids are in fact closer to (but not quite) a parallel system. As such, it is
highly pessimistic to assume that a single branch failure will always cause the whole grid
to fail.
Over the last few years, many approaches have been proposed that overcome some of these
shortcomings. We will review them next.
2.4.2 Recent approaches
Chatterjee et al. [25, 47] proposed the mesh model as an alternative to the series model. In the
mesh model, a grid is deemed to have failed not when the first branch fails, but when enough
branches have failed so that the voltage drop at some grid node(s) has exceeded a pre-defined
threshold that is chosen so as to avoid causing errors in the underlying logic. However, [25, 47]
still used Black’s model to find the reliability of individual branches, which as we saw before is
inaccurate.
Huang et al. [20] proposed a compact EM model for approximating the TTF of a branch
within an interconnect tree by using a modified version of Korhonen’s solution for a finite line
(2.11). The modification accounted for the material flow and was based on the steady state
stress analysis for the whole tree. Huang et al. approximated the kinetics of branch resistance
change due to void growth using a drift velocity model and used the mesh model to determine
grid failure. The authors later extended their work to incorporate thermal stresses in the grid
[22]. However, their approach was very slow, requiring up to 32 hours to estimate the failure
time of a 400K node grid. The modification based on steady state analysis can determine the
potential void locations in a tree, but the actual time and sequence of void nucleations might
vary considerably from the predicted ones. Moreover, in their approach, only one power grid
TTF sample was obtained and thus, the random nature of EM degradation was not accounted
for.
Li et al. [21, 23] used the Riege-Thompson model (2.16) to drive their EM verification
tool. In [21], the authors also proposed a heuristic greedy approach to increase the tree widths
in order to meet power grid integrity and reliability constraints. But, their approach suffers
from all the drawbacks of Riege-Thompson model. In addition, the authors assumed atomic
diffusivity to be the same throughout the whole tree, which is not true. Atomic diffusivity Da
can be assumed to be the same over short distances, but it varies across the whole tree due
to random grain boundary orientations [48, 41]. Thus, there is a need for a new EM checking
approach that accurately models EM degradation using physics-based models, combined with
a mesh model to account for redundancy, while being fast enough to be practically useful.
Figure 2.8: Schematic for a typical on-die power grid.
2.5 Power Grid model
An on-die power/ground (p/g) grid is a multi-layered metal structure that is used to deliver
power from the external package to the underlying logic. A typical power grid structure is
as shown in Fig. 2.8. Each metal layer mostly consists of a set of alternating parallel power
and ground stripes, that are respectively connected to the power and ground stripes of the
immediate upper and lower neighboring layers by vias. This gives rise to the mesh structure
in modern grids. These metal stripes are the multi-branch structures that are referred to as
interconnect trees. Note that the stripes are not necessarily straight lines: they may have bends
or orthogonal branches. However, they do not have loops. The top layer is connected to the
external package through C4 bumps, while the bottom layer is connected to the underlying
logic. The metal stripes are embedded in a rigid dielectric material, such as Silicon Dioxide.
The minimum spacing between the stripes is determined by the technology node. Usually, some
power or ground stripes are removed from a layer to make room for signal lines, which means
that the stripes in power grids are not uniformly placed. The width and height of the metal
stripes increase as we go from the bottom layer to the top layer.
There are three types of parasitic effects on a p/g grid: resistive, capacitive and inductive.
The resistive parasitics are responsible for the voltage drop across the grid under DC currents,
which is typically referred to as the IR drop. The capacitive effects arise due to the proximity
of metal wires, MOSFET capacitances and de-coupling capacitances. The inductive effects
are mostly due to the connections to the package through the C4 bumps, and are referred to
as L di/dt drops. However, when it comes to EM, only the resistive parasitics are important
because EM analysis is based on effective-EM (DC) current densities. A p/g grid is a linear
system, with current sources (modeling the effects of the underlying logic circuits) as inputs
and node voltage drops as outputs. Since p/g grids carry mostly unidirectional currents, the
effective-EM currents are the same as average currents. In this work, we will use the mesh
model [25, 47] for p/g grid reliability checks, in which user-provided thresholds on average
voltage drops are used to determine the grid lifetime. In this framework, it becomes sufficient
to perform DC analysis of the power grid, driven by average source currents. Thus, a DC model
of the grid as shown in Fig. 2.9, devoid of any capacitances and inductances, is sufficient for EM
verification.
Figure 2.9: DC model of a power grid.
The power grid nodes, excluding the nodes connected to the voltage sources, are numbered
1, 2, . . ., m with the ground node being 0. Let $i = [i_k] \in \mathbb{R}^m$ be the vector of non-negative
average source currents tied to the grid, such that $i_k = 0$ if node k has no current source. Let
$u_k(t)$ be the voltage at node k, and $u(t) = [u_k(t)] \in \mathbb{R}^m$ be the vector of all node voltage signals.
The voltage vector u(t) is a function of time t because it varies over large time-scales as the
grid degrades due to EM. Applying Kirchhoff’s current law (KCL) at every node leads to the
following nodal analysis (NA) formulation
\[
G(t)\,u(t) = -i + G_v(t)\,u_{dd}, \tag{2.20}
\]
where $G(t)$ and $G_v(t)$ are $m \times m$ conductance matrices that vary over large time-scales and $u_{dd}$ is
a constant vector each entry of which is equal to $v_{dd}$. $G_v$ is a matrix of conductances connected
to the voltage sources and the matrix $G = [g_{j,k}]$ can be easily constructed using element stamps
[49]. If we set $i_k = 0\ \forall k$, then clearly $u(t) = u_{dd}$ for all time, so that $G(t)u_{dd} = G_v(t)u_{dd}$ from
(2.20). Define $v_k(t) \triangleq v_{dd} - u_k(t)$ to be the voltage drop at node k, and let $v(t) = [v_k(t)] \in \mathbb{R}^m$
be the vector of voltage drops. Then, the NA formulation can be re-written in terms of the
voltage drop vector as
\[
G(t)\,v(t) = i. \tag{2.21}
\]
We will use this revised system to obtain the voltage drops directly for a given power grid.
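The voltage-drop system (2.21) and the mesh-model failure check can be illustrated on a toy example. The Python sketch below stamps the conductance matrix G for a single stripe with three nodes tied to supply pads at both ends, solves G v = i, and repeats the solve after one branch has been removed. The topology, conductances, source currents, the 50 mV threshold and the helper names (`build_G`, `solve`, `PAD`) are all assumptions made for illustration; a real grid would use a sparse solver rather than the dense elimination shown here.

```python
PAD = 0  # index used here for supply pads, where the voltage drop is zero


def build_G(m, branches):
    """Stamp an m x m conductance matrix from (node_a, node_b, g) branches."""
    G = [[0.0] * m for _ in range(m)]
    for a, b, g in branches:
        for p, q in ((a, b), (b, a)):
            if p != PAD:
                G[p - 1][p - 1] += g
                if q != PAD:
                    G[p - 1][q - 1] -= g
    return G


def solve(A, b):
    """Dense Gaussian elimination with partial pivoting (small systems only)."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


# Hypothetical stripe: pad - 1 - 2 - 3 - pad, every branch 10 S.
branches = [(PAD, 1, 10.0), (1, 2, 10.0), (2, 3, 10.0), (3, PAD, 10.0)]
currents = [0.1, 0.2, 0.1]   # average source currents in A (assumed)
threshold = 0.05             # allowed voltage drop in V (assumed)

v_fresh = solve(build_G(3, branches), currents)
v_aged = solve(build_G(3, branches[1:]), currents)  # first branch open
print(max(v_fresh) <= threshold, max(v_aged) <= threshold)
```

In this toy case the grid meets the threshold while all branches are intact and violates it once the first branch is open, which is the kind of event the mesh model uses to declare grid failure.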
2.6 Partial Differential Equations (PDE)
A PDE is an equation for some quantity z (dependent variable) that depends on two or more
independent variables and involves derivatives of z with respect to at least some of the inde-
pendent variables. A second order PDE for z(x, t) in two independent variables t and x is of
the general form
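\[
A\frac{\partial^2 z}{\partial t^2} + 2B\frac{\partial^2 z}{\partial t\,\partial x} + C\frac{\partial^2 z}{\partial x^2} + D\frac{\partial z}{\partial t} + E\frac{\partial z}{\partial x} + Fz = G(t,x), \tag{2.22}
\]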
where A, B and C cannot all be zero. In order to solve the PDE, one needs to specify the
boundary conditions (conditions to be satisfied at the boundary of the domain of an independent
variable, say x, for all t) and the initial conditions (e.g. the value of z is specified ∀x at some
t = t0). A second order PDE is said to be linear if the equation and its boundary and initial
conditions do not include any non-linear combination of the dependent variable or its
derivatives. A second order PDE of the form (2.22) is said to be parabolic if B^2 − AC = 0. Korhonen’s model
(2.9) is a parabolic PDE if Da is assumed to be independent of the stress.
For any given boundary and initial conditions, the objective of solving a PDE is to find the
value of z for all x at some t = tf . There are many ways to solve a PDE, and the solution method
to be used depends on the problem itself. Laplace transform is a powerful technique to obtain an
analytical closed form solution, or the exact solution of a PDE. However, for complex systems,
it is often not possible to derive a closed form solution. For such systems, numerical solution
approaches such as the finite difference method [50], finite element method [51], finite volume
method [52], gradient discretization method(s) [53] or spectral method(s) [54] are preferred.
For numerically solving a linear parabolic PDE, the method of lines is a particularly useful
technique. The method of lines (MoL) [55] is a special finite-difference based technique, where
the basic idea is to discretize the PDE in all but one independent variable, so that we are left
with a set of Ordinary Differential Equations (ODE) that approximate the PDE. As we will
see, there are many well-established methods for solving an ODE system. We can use them to
solve the ODE system approximating the PDE, giving us the solution of the PDE system.
Discretizing the PDE along any variable requires us to approximate the partial derivatives.
For a sufficiently smooth function, one can approximate the partial derivatives using difference
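formulas obtained from the Taylor series. Consider a sufficiently smooth function $z(x,t) : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$,
then using the Taylor series we can write
\[
\begin{aligned}
z(x+\Delta x,\, t+\Delta t) = z(x,t) &+ \frac{\partial z}{\partial t}\Delta t + \frac{\partial z}{\partial x}\Delta x \\
&+ \frac{1}{2!}\left[\frac{\partial^2 z}{\partial t^2}(\Delta t)^2 + 2\frac{\partial^2 z}{\partial x\,\partial t}\Delta t\,\Delta x + \frac{\partial^2 z}{\partial x^2}(\Delta x)^2\right] + \dots
\end{aligned}
\tag{2.23}
\]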
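To approximate the partial derivative of z with respect to, say x, we set ∆t = 0 in (2.23) and
re-arrange to get
\[
\frac{\partial z}{\partial x} = \frac{z(x+\Delta x,\,t) - z(x,t)}{\Delta x} - \frac{1}{2!}\frac{\partial^2 z}{\partial x^2}\Delta x - \frac{1}{3!}\frac{\partial^3 z}{\partial x^3}(\Delta x)^2 - \dots \tag{2.24a}
\]
\[
\approx \frac{z(x+\Delta x,\,t) - z(x,t)}{\Delta x}. \tag{2.24b}
\]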
Equation (2.24b) is known as the forward difference approximation and is accurate up to the first
order, i.e. the norm of all terms ignored in (2.24b) (which is essentially the error) is bounded
from above by K∆x, where K is a constant. Similarly, if ∆x is replaced by −∆x in (2.23), we
obtain the backward difference approximation, which is also first order accurate
\[
\frac{\partial z}{\partial x} = \frac{z(x,t) - z(x-\Delta x,\,t)}{\Delta x} + \frac{1}{2!}\frac{\partial^2 z}{\partial x^2}\Delta x - \frac{1}{3!}\frac{\partial^3 z}{\partial x^3}(\Delta x)^2 + \dots \tag{2.25a}
\]
\[
\approx \frac{z(x,t) - z(x-\Delta x,\,t)}{\Delta x}. \tag{2.25b}
\]
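Adding (2.24a) and (2.25a) and dividing by two, we get the central difference formula, which is second order accurate
\[
\frac{\partial z}{\partial x} = \frac{z(x+\Delta x,\,t) - z(x-\Delta x,\,t)}{2\Delta x} - \frac{1}{3!}\frac{\partial^3 z}{\partial x^3}(\Delta x)^2 - \dots \tag{2.26a}
\]
\[
\approx \frac{z(x+\Delta x,\,t) - z(x-\Delta x,\,t)}{2\Delta x}. \tag{2.26b}
\]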
Higher order partial derivatives can be similarly obtained. The central difference formula
approximating the second order partial derivative can be stated as
\[
\frac{\partial^2 z}{\partial x^2} \approx \frac{z(x+\Delta x,\,t) + z(x-\Delta x,\,t) - 2z(x,t)}{(\Delta x)^2}. \tag{2.27}
\]
We will use the central difference formulas (2.26b) and (2.27) for approximating the partial
derivatives.
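The orders of accuracy quoted above can be observed numerically. The following Python sketch applies the forward difference (2.24b), the central difference (2.26b) and the second-derivative formula (2.27) to a smooth test function and compares the errors for two step sizes. The test function sin(x), the evaluation point and the step sizes are arbitrary choices made here for illustration.

```python
import math


def forward_diff(f, x, dx):
    """First-order forward difference, as in (2.24b)."""
    return (f(x + dx) - f(x)) / dx


def central_diff(f, x, dx):
    """Second-order central difference, as in (2.26b)."""
    return (f(x + dx) - f(x - dx)) / (2.0 * dx)


def second_central(f, x, dx):
    """Second-order central difference for the second derivative, as in (2.27)."""
    return (f(x + dx) + f(x - dx) - 2.0 * f(x)) / dx ** 2


x0 = 1.0
for dx in (1e-2, 5e-3):
    err_fwd = abs(forward_diff(math.sin, x0, dx) - math.cos(x0))
    err_cen = abs(central_diff(math.sin, x0, dx) - math.cos(x0))
    err_sec = abs(second_central(math.sin, x0, dx) + math.sin(x0))
    print(f"dx={dx:g}  forward={err_fwd:.3e}  central={err_cen:.3e}  second={err_sec:.3e}")
```

Halving the step should roughly halve the forward-difference error and reduce the two central-difference errors by about a factor of four, consistent with first and second order accuracy.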
2.7 Ordinary Differential Equations (ODE)
An ordinary differential equation (ODE) is an equation for some quantity z (dependent variable)
that depends on one independent variable and involves ordinary derivatives (as opposed to
partial) of z with respect to the independent variable. A first order ODE can be written as
\[
\frac{dz}{dt} = f(z(t), t), \tag{2.28}
\]
where f : R × R → R, z : R → R and t is an independent scalar variable. It is of first order
because the highest derivative is only the first derivative. If z and f are vectors in R^n, then we
get a system of ordinary differential equations, or simply an ODE system. In order to solve an
ODE system, one needs to specify the initial condition(s), i.e. the value of z at t = t0. An ODE
system with an initial condition is generally referred to as an Initial Value Problem or IVP
\[
\frac{dz}{dt} = f(z(t), t), \quad z(t_0) = z_0, \quad t \in [t_0, t_f]. \tag{2.29}
\]
A sufficient condition for this IVP to have a unique solution is that f(z(t), t) be continuous on
[t0, tf] × R^n and that it satisfies the Lipschitz condition [49] with respect to z. An IVP is said
to be well-posed if for a given finite perturbation in the initial condition z0, the perturbation
in the solution of the IVP is bounded. We will assume that all IVPs we are trying to solve are
well-posed. The ODE system (2.28) with z ∈ R^n is said to be linear if f(z, t) takes the form
\[
f(z, t) = A(t)\,z(t) + u(t), \tag{2.30}
\]
where $A(t) \in \mathbb{R}^{n \times n}$ and $u(t) \in \mathbb{R}^{n}$. Further, if A(t) is independent of time, then we end up
with a Linear Time Invariant (LTI) system
\[
\frac{dz}{dt} = A\,z(t) + u(t). \tag{2.31}
\]
We will return to linear ODEs and LTI systems when we discuss the state-space representation
of a system.
There are many well-known techniques for numerically solving an IVP, i.e. an ODE with a
given initial condition. All these techniques involve discretization of the independent variable,
usually time t, and extending the known initial solution at t = t0 in a step-by-step fashion such
that dz/ dt = f(z, t) is (approximately) satisfied for all time-steps t0 < t1 < t2 < . . . < tn−1 <
tn < . . . up to the final time point t = tf . We will denote the true solution at time tn by z(tn),
and the approximate numerical solution obtained by the numerical method as zn. Obviously
for a good numerical method, zn ≈ z(tn) within some user-specified error bound. The solution
between two time-points tn−1 and tn is obtained by using an interpolation polynomial, which
depends on the numerical method being used. Numerical methods for solving IVPs can be
broadly classified into two types [49, 56, 57]:
1. One-step methods: These methods make use of the previously computed solution at
time-point tn to compute the solution at the next time-point tn+1. Some examples of such
methods would be Forward Euler (FE), Backward Euler (BE), Trapezoidal (TR) and
Runge-Kutta (RK) methods. The Runge-Kutta methods often evaluate the function f(·)
at intermediate time-points between tn and tn+1 in order to improve the accuracy of the
solution.
2. Multi-step methods: A k-step method makes use of previously computed solutions at k
time points tn, tn−1, . . ., tn−k+1 to compute the solution at the next time-point tn+1.
Multi-step methods (k > 1) require some start-up scheme to compute the first k solutions
before the method can be applied. These methods are particularly suitable for stiff
systems. We will focus mainly on linear multi-step methods, because circuit equations are
often stiff. Backward Differentiation Formulas (BDF) and Adams-Moulton methods are
some examples of multi-step methods.

Table 2.1: Butcher tableau characterizing an m-stage RK formula with built-in error estimates

  a_1 | b_11  b_12  ...  b_1m
  a_2 | b_21  b_22  ...  b_2m
  ... | ...   ...   ...  ...
  a_m | b_m1  b_m2  ...  b_mm
  ----+----------------------
      | w_1   w_2   ...  w_m
      | w*_1  w*_2  ...  w*_m

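The practical difference between explicit and implicit formulas on a stiff problem can be seen on the scalar test equation dz/dt = −λz. The Python sketch below applies Forward Euler, Backward Euler and the two-step BDF formula (started with a single Backward Euler step) with a step size that is large relative to 1/λ. The test equation, the values of λ and h, the closed-form implicit updates (possible only because f is linear here) and the function names are assumptions made for illustration.

```python
import math


def forward_euler(lam, z0, h, steps):
    """Explicit one-step method: z_{n+1} = z_n + h*f(z_n), with f = -lam*z."""
    z = z0
    for _ in range(steps):
        z = z + h * (-lam * z)
    return z


def backward_euler(lam, z0, h, steps):
    """Implicit one-step method: z_{n+1} = z_n + h*f(z_{n+1}), solved in closed form."""
    z = z0
    for _ in range(steps):
        z = z / (1.0 + h * lam)
    return z


def bdf2(lam, z0, h, steps):
    """Two-step BDF: z_{n+1} - 4/3 z_n + 1/3 z_{n-1} = 2/3 h f(z_{n+1})."""
    z_prev = z0
    z = z0 / (1.0 + h * lam)  # start-up scheme: one Backward Euler step
    for _ in range(steps - 1):
        z_next = (4.0 / 3.0 * z - 1.0 / 3.0 * z_prev) / (1.0 + 2.0 / 3.0 * h * lam)
        z_prev, z = z, z_next
    return z


# Stiff setting (assumed values): h is five times larger than 1/lam.
lam, h, steps = 100.0, 0.05, 20
print("exact:", math.exp(-lam * h * steps))
print("FE   :", forward_euler(lam, 1.0, h, steps))
print("BE   :", backward_euler(lam, 1.0, h, steps))
print("BDF2 :", bdf2(lam, 1.0, h, steps))
```

With these values the Forward Euler iterate grows without bound while both implicit formulas decay, and for a small step size the second-order BDF formula is noticeably more accurate than Backward Euler.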
In general, almost all numerical methods for solving IVPs can be written in the following
general form
\[
\sum_{j=-1}^{k-1} a_j z_{n-j} = h\,\phi_f(z_{n+1}, z_n, \dots, z_{n-k+1}, t_n;\, h), \tag{2.32}
\]
where k ≥ 1, aj are scalar coefficients, h = tn+1 − tn is the time-step (assumed to be fixed) and
φf(·) is a function that depends on f(·). The objective is to solve (2.32) for zn+1. A numerical
method is said to be convergent if, for a well-posed IVP satisfying the Lipschitz condition, we
have
\[
\lim_{h \to 0}\left(\max_{t_n \in [t_0, t_f]} \left\| z(t_n) - z_n \right\|\right) = 0. \tag{2.33}
\]
Convergence guarantees that for a well-behaved IVP, any desired level of accuracy can be achieved
by choosing a small enough fixed step-size h. A numerical method is convergent if and only if it
is both zero-stable and consistent. A numerical method is said to be zero-stable if there exists
a constant h0 > 0 such that for a well-posed IVP, the change in its initial condition by a finite
amount produces a bounded change in the (discrete) solution of the IVP obtained by applying
the numerical method with fixed step-size h < h0. Since a numerical method approximates
the underlying solution, the LHS and RHS of (2.32) when applied to the true solution differ
by O(h^p), where O(h^p) denotes that the discrepancy between the LHS and RHS of (2.32) has an upper
bound of Kh^p for some constant K. This discrepancy is referred to as the residual. A numerical
method is said to be consistent if its residual is O(h^p) with p ≥ 2.
2.7.1 Runge-Kutta Methods
An m-stage Runge-Kutta (RK) method evaluates the function f(·) at m points in the interval
[tn, tn+1] and φf(·) is the weighted average of these sampled values. Specifically, an m-stage RK
formula to evaluate zn+1 can be stated as
\[
z_{n+1} = z_n + h\,(w_1 k_1 + \dots + w_m k_m), \tag{2.34}
\]
where
\[
k_j = f\!\left(z_n + h\sum_{r=1}^{m} b_{jr} k_r,\; t_n + h a_j\right). \tag{2.35}
\]
This formula is characterized by the table of m^2 + 3m parameters as shown in Table 2.1, with
the last row mainly used for computing error estimates [see (2.48)]. This table of parameters
is usually referred to as the Butcher tableau. The parameters are usually chosen to make the
implementation easier or to improve the accuracy of the method. An RK method is said to be
explicit if bjr = 0 for r ≥ j, otherwise it is implicit. Explicit RK methods are numerically less
expensive because the kj can be computed sequentially. On the other hand, implicit RK methods may
end up being a fully coupled non-linear system that can be hard to solve.
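For an explicit method, the sequential evaluation of the stages in (2.35) maps directly onto a loop over the rows of the Butcher tableau. The Python sketch below is a minimal scalar implementation, shown with the classical four-stage RK tableau as an example. The function names, the scalar restriction and the test problem dz/dt = −z are assumptions made for illustration; the arguments of f follow the f(z, t) convention of (2.28).

```python
import math


def rk_step(f, z, t, h, a, b, w):
    """One explicit RK step (2.34)-(2.35); b must be strictly lower triangular."""
    k = []
    for j in range(len(a)):
        z_stage = z + h * sum(b[j][r] * k[r] for r in range(j))
        k.append(f(z_stage, t + h * a[j]))
    return z + h * sum(wj * kj for wj, kj in zip(w, k))


def integrate(f, z0, t0, tf, steps, a, b, w):
    """Fixed-step integration from t0 to tf using the given tableau."""
    h = (tf - t0) / steps
    z, t = z0, t0
    for _ in range(steps):
        z = rk_step(f, z, t, h, a, b, w)
        t += h
    return z


# Classical four-stage, fourth-order tableau.
RK4_A = [0.0, 0.5, 0.5, 1.0]
RK4_B = [[0.0, 0.0, 0.0, 0.0],
         [0.5, 0.0, 0.0, 0.0],
         [0.0, 0.5, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0]]
RK4_W = [1.0 / 6.0, 1.0 / 3.0, 1.0 / 3.0, 1.0 / 6.0]

z_end = integrate(lambda z, t: -z, 1.0, 0.0, 1.0, 10, RK4_A, RK4_B, RK4_W)
print(z_end, math.exp(-1.0))
```

Swapping in a different explicit tableau only requires changing the three arrays, which is the main appeal of organizing the method this way.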
2.7.2 Linear Multi-Step Methods
For linear multi-step (LMS) methods, φf (·) is linear so that (2.32) becomes
\[
\sum_{j=-1}^{k-1} a_j z_{n-j} = h\sum_{j=-1}^{k-1} b_j f(z_{n-j}, t_{n-j}), \tag{2.36}
\]
where the bj are also scalar coefficients. If b−1 = 0, the method is said to be explicit, otherwise
it is implicit and one may need to solve a non-linear equation to compute the value of zn+1.
For many LMS methods, the scalar coefficients aj and bj can be determined using the
linear difference operator. The linear difference operator is also useful in defining the order of
an LMS method, which determines its accuracy and will be useful later in understanding the
error estimates. The linear difference operator of an LMS method with fixed time-step h is an
operator that takes an arbitrary function s(t) and produces the following time function
\[
D[s(t); h] \triangleq \sum_{j=-1}^{k-1} a_j s(t - jh) - h\sum_{j=-1}^{k-1} b_j s^{(1)}(t - jh), \tag{2.37}
\]
where s(t) is assumed to be differentiable as often as desired and s^{(1)}(·) is the 1st derivative of
s(·). If D is applied to the true solution z(t) and evaluated at tn, we get the so-called residual
\[
R_{n+1} = \sum_{j=-1}^{k-1} a_j z(t_n - jh) - h\sum_{j=-1}^{k-1} b_j z^{(1)}(t_n - jh). \tag{2.38}
\]
Using the Taylor series expansion of s(t) in (2.37), evaluating the derivatives and collecting
similar terms, we can write
\[
D[s(t); h] = C_0 s(t) + C_1 h s^{(1)}(t) + \dots + C_q h^q s^{(q)}(t) + \dots, \tag{2.39}
\]
where s^{(q)} denotes the qth derivative of s with respect to t and
\[
C_0 = \sum_{j=-1}^{k-1} a_j, \tag{2.40}
\]
\[
C_1 = -\sum_{j=-1}^{k-1} j\,a_j - \sum_{j=-1}^{k-1} b_j, \tag{2.41}
\]
\[
\vdots
\]
\[
C_q = \frac{(-1)^q}{q!}\sum_{j=-1}^{k-1} j^q a_j - \frac{(-1)^{q-1}}{(q-1)!}\sum_{j=-1}^{k-1} j^{q-1} b_j. \tag{2.42}
\]
An LMS method is said to be of order p if we have C0 = C1 = . . . = Cp = 0 and Cp+1 ≠ 0, so
that the residual is given by
\[
R_{n+1} = C_{p+1} h^{p+1} z^{(p+1)}(t_n) + O(h^{p+2}), \tag{2.43}
\]
where O(h^{p+2}) denotes that the norm of all following terms is bounded from above by Kh^{p+2},
where K is a constant. Thus, an order p LMS method has a residual of the order of h^{p+1}.
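The coefficients (2.40) to (2.42) are easy to evaluate exactly, which gives a mechanical way of finding the order and the leading coefficient Cp+1 of a given LMS formula. The Python sketch below does this with rational arithmetic for three familiar formulas written in the form (2.36) with a−1 = 1. The function names, the dictionary-based encoding of aj and bj, and the cut-off `q_max` are assumptions made for illustration.

```python
from fractions import Fraction
from math import factorial


def lms_constants(a, b, q_max=8):
    """Return [C_0, ..., C_q_max] of (2.40)-(2.42); a, b map index j to a_j, b_j."""
    consts = [sum(a.values(), Fraction(0))]
    for q in range(1, q_max + 1):
        term_a = Fraction((-1) ** q, factorial(q)) * sum(
            Fraction(j) ** q * aj for j, aj in a.items())
        term_b = Fraction((-1) ** (q - 1), factorial(q - 1)) * sum(
            Fraction(j) ** (q - 1) * bj for j, bj in b.items())
        consts.append(term_a - term_b)
    return consts


def order_and_leading_constant(a, b):
    """Order p and C_{p+1}, i.e. the first non-zero coefficient in (2.39)."""
    for q, c in enumerate(lms_constants(a, b)):
        if c != 0:
            return q - 1, c
    raise ValueError("no non-zero coefficient found up to q_max")


half = Fraction(1, 2)
backward_euler = ({-1: 1, 0: -1}, {-1: 1, 0: 0})
trapezoidal = ({-1: 1, 0: -1}, {-1: half, 0: half})
bdf2 = ({-1: 1, 0: Fraction(-4, 3), 1: Fraction(1, 3)},
        {-1: Fraction(2, 3), 0: 0, 1: 0})

for name, (a, b) in [("BE", backward_euler), ("TR", trapezoidal), ("BDF2", bdf2)]:
    print(name, order_and_leading_constant(a, b))
```

Here the index j = −1 refers to the new time-point tn+1, as in (2.36), and exact fractions avoid any ambiguity about whether a coefficient is truly zero.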
2.7.3 Error estimates
Error in the computed solution depends on the stability of the IVP we are trying to solve and
on the stability of the numerical method used to solve the IVP. Well-posed IVPs are stable and
do not introduce significant errors. For any numerical method, the total error consists of two
components: a local error which is introduced in the present step (moving from tn to tn+1)
and a global error which is propagated from all previous steps (from t0 to tn+1). For stable
LMS numerical methods, it can be shown that the local error of an order p method is O(h^{p+1})
and its global error is O(h^p) [49]. In general, it is hard to estimate the global error; it
is computationally expensive and is of limited applicability [58, 59]. Instead, in practice the
accuracy of the computed solution is often determined by analyzing the error introduced in a
single integration step: it is ensured that the local error per integration step is bounded as per
some user provided tolerance. The local truncation error (LTE) is often used as a measure of
the local error and is defined as follows: For a k-step method, let zn+1 be the value returned
when we artificially set the previous k computed solutions to their true solutions, i.e. we set
zn−j = z(tn−j) for j = 0, 1, . . . , k − 1. Then the LTE is defined as
ǫLTE(h) = z(tn+1)− zn+1. (2.44)
Chapter 2. Background 28
The LTE can be thought of as a direct measure of how well the discrete numerical formula
approximates the true solution. Lambert [60] shows that for an LMS method
ǫLTE(h) = Rn+1 +O(hp+2). (2.45)
Motivated by this, the principal local truncation error or PLTE for an order p LMS method
with fixed time-step h is defined as
ǫPLTE = Cp+1hp+1z(p+1)(tn), (2.46)
where the expression of residual from (2.43) was used. In most cases, ǫPLTE is used as a proxy
for local error estimation in LMS methods, with Cp+1 often referred to as the error constant.
For RK methods, the LTE can be estimated using Richardson error estimates [61], or the estimate can
be built into the RK method. Such RK methods combine two methods, usually one of order
p and another of order p − 1, so that they share the intermediate evaluations within a
single step but use different output coefficient values w*_r. Specifically, the lower order method is
defined as
\[
z^*_{n+1} = z_n + h\,(w^*_1 k_1 + \dots + w^*_m k_m), \tag{2.47}
\]
where kj are the same as in (2.35). The coefficients w*_r form the bottom row of Table 2.1. The
error estimate is then calculated as
\[
\epsilon_{RK} = z_{n+1} - z^*_{n+1} = h\sum_{r=1}^{m} (w_r - w^*_r)\,k_r, \tag{2.48}
\]
which can be shown to be O(h^p).
2.7.4 Variable time-stepping
Error estimates are useful not only to judge the accuracy of a solution, but also to imple-
ment variable step-size numerical solvers. Almost all modern ODE solver implementations use
variable time-stepping: they monitor the accuracy of the solution using error estimates and
adaptively change the step-size in the course of the computation. The decision to increase or
decrease the step-size is primarily taken based on user-specified error tolerances: the step-size
is decreased if the user-specified tolerance is not met (decreasing step-size should decrease error
for convergent numerical methods), and it is increased if the error estimate for the past few
time-steps is very low as compared to the user provided error tolerances. This usually results in
larger step sizes when the solution is varying slowly and smaller step sizes when the solution is
changing rapidly. In absence of variable time stepping, a fixed time-step ODE solver is forced
to take the smallest time step that satisfies the user tolerances at the steepest part of the solu-
tion. As a result, variable time-step solvers are considerably faster as compared to their fixed
time-step variants.
Implementing a change in time-step requires the use of some heuristics to decide how much
the time-step should change. There is no universal best way of changing the time-step; it
depends on the type of IVP and the numerical method being used. Some strategies include
scaling the time step by a constant factor for every increase or decrease (i.e. the next time-step
is sh or h/s where h is the previous time-step), while in others the scaling factor depends on the
error estimate itself. Changing the time-step is easier for single step methods: they can just
evaluate the solution at the next time-point and move on. However, changing the time-step is
a little harder for multi-step methods as all the previous time-points were obtained using the
older time-step value. This is usually overcome using interpolation methods or using variable
coefficient formulas. Finally, changing the time-step in implicit ODE solvers, which often
require solving non-linear coupled equations (or at the very least a linear-system solve), can slow
down the solver as it prevents re-using the previously computed factorizations.
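The interplay between a built-in error estimate and step-size control can be sketched with the simplest embedded RK pair: Heun's second-order formula paired with Forward Euler, for which w = (1/2, 1/2) and w* = (1, 0) in the notation of (2.47) and (2.48). The Python sketch below accepts or rejects each step based on the estimate and rescales h accordingly. The controller (safety factor 0.9, growth limit 2, shrink limit 0.2), the absolute tolerance, the test problem and the function name are all assumptions made for illustration and represent just one of many possible heuristics.

```python
import math


def adaptive_heun_euler(f, z0, t0, t_end, tol, h0=0.1):
    """Variable-step integration with an embedded order-2/order-1 RK pair."""
    t, z, h = t0, z0, h0
    accepted = []  # sizes of the accepted steps
    while t_end - t > 1e-12:
        h = min(h, t_end - t)
        k1 = f(z, t)
        k2 = f(z + h * k1, t + h)
        z_high = z + 0.5 * h * (k1 + k2)  # order-2 result (Heun)
        z_low = z + h * k1                # order-1 result (Forward Euler)
        err = abs(z_high - z_low)         # built-in estimate, as in (2.48)
        if err <= tol:
            t, z = t + h, z_high
            accepted.append(h)
        # Assumed heuristic: scale h by the error ratio, within fixed limits.
        factor = 2.0 if err == 0.0 else 0.9 * (tol / err) ** 0.5
        h *= min(2.0, max(0.2, factor))
    return z, accepted


z_end, steps = adaptive_heun_euler(lambda z, t: -z, 1.0, 0.0, 5.0, 1e-5)
print(z_end, math.exp(-5.0), len(steps), steps[0], max(steps))
```

For this decaying solution the accepted step size grows as the solution flattens out, which is the behaviour described above for variable time-stepping.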
2.8 Compact Thermal Models
Since temperature plays an important role in EM degradation, we will need a way to determine
the power grid temperature distribution. This can be done by using Compact Thermal Models
[62], as shown below. This section also provides a small case study for the application of MoL
to a PDE.
Assuming isotropic thermal conductivity (independent of position and temperature), the
heat transfer/diffusion equation for solids in Cartesian coordinates can be written as [24, 63]
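\[
D c_p \frac{\partial T_m}{\partial t} = \kappa_T\left(\frac{\partial^2 T_m}{\partial x^2} + \frac{\partial^2 T_m}{\partial y^2} + \frac{\partial^2 T_m}{\partial z^2}\right) + \gamma_T, \tag{2.49}
\]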
where Tm is the time and space dependent temperature profile in kelvin (K), γT is the power
density of the heat source(s) (W/m^3), κT is the thermal conductivity (W/(m·K)), D
is the material density (kg/m^3) and cp is the specific heat (J/(kg·K)). The PDE (2.49) can
be converted to a system of ODEs by using the Method of Lines, i.e. by discretizing along the spatial
domain (x, y and z axes). Let ∆x, ∆y and ∆z be the discretization steps along the x, y and z axes,
respectively. Then, we will end up with small cuboid shaped volume elements of dimension
∆x × ∆y × ∆z as shown in Fig. 2.10a. Each cuboid is identified by a unique triplet index
(i, j, k) and is isothermal with temperature Tm(i, j, k). The index i increases as we move along
the x direction, j increases along the y direction and k increases along the z direction. Then,
we can write using the central difference formula
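\[
\begin{aligned}
D c_p \frac{dT_m(i,j,k)}{dt} ={}& \kappa_T\,\frac{T_m(i+1,j,k) - 2T_m(i,j,k) + T_m(i-1,j,k)}{\Delta x^2} \\
&+ \kappa_T\,\frac{T_m(i,j+1,k) - 2T_m(i,j,k) + T_m(i,j-1,k)}{\Delta y^2} \\
&+ \kappa_T\,\frac{T_m(i,j,k+1) - 2T_m(i,j,k) + T_m(i,j,k-1)}{\Delta z^2} + \gamma_T.
\end{aligned}
\tag{2.50}
\]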
Figure 2.10: (a) Cuboids resulting from spatial discretization along the x, y and z axes with their indices (note that we have not shown cuboids with indices (i, j−1, k) and (i, j+1, k) for clarity) and (b) the equivalent electrothermal model for each cuboid. The conductances gxT, gyT and gzT are shared by the neighbouring cuboids.
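After multiplying both sides by ∆x∆y∆z and some re-arranging, we get
\[
\begin{aligned}
c_T \frac{dT_m(i,j,k)}{dt} &+ 2\,(g_{xT} + g_{yT} + g_{zT})\,T_m(i,j,k) \\
&- g_{xT}\big[T_m(i+1,j,k) + T_m(i-1,j,k)\big] - g_{yT}\big[T_m(i,j+1,k) + T_m(i,j-1,k)\big] \\
&- g_{zT}\big[T_m(i,j,k+1) + T_m(i,j,k-1)\big] = i_T,
\end{aligned}
\tag{2.51}
\]
where
\[
g_{xT} = \kappa_T\frac{\Delta y\,\Delta z}{\Delta x}, \quad g_{yT} = \kappa_T\frac{\Delta x\,\Delta z}{\Delta y}, \quad g_{zT} = \kappa_T\frac{\Delta x\,\Delta y}{\Delta z}, \tag{2.52}
\]
\[
c_T = D c_p\,\Delta x\,\Delta y\,\Delta z, \quad i_T = \gamma_T\,\Delta x\,\Delta y\,\Delta z. \tag{2.53}
\]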
Clearly, an equivalence can be drawn between (2.51) and electric circuits, where Tm(i, j, k) is
equivalent to voltage at node (i, j, k), κT is equivalent to electrical conductivity, gxT , gyT and
gzT are equivalent to electrical conductances, cT is equivalent to a capacitor to the ground and
iT is equivalent to a current source. The cuboid is thus equivalent to a thermal node that has a
current source, a capacitor and 6 resistors connected to neighbouring thermal nodes, as shown
in Fig. 2.10b, and so (2.51) can be obtained by applying KCL at this node. This arrangement
is known as a Compact Thermal Model (CTM) [62]. Using CTMs, we can express the system
of ODEs ∀i, j, k in a given volume concisely as
\[
C_T \frac{dT_m(t)}{dt} + G_T\,T_m(t) = i_{Ts}(t) + G_{T,0}\,T_{amb}, \tag{2.54}
\]
where GT is the thermal conductance matrix, CT is the diagonal thermal capacitance matrix,
Tm(t) is the vector of temperatures at all thermal nodes, iTs(t) is the vector of iT values for each
thermal node, Tamb is the surrounding ambient temperature and GT,0 is a matrix consisting of
thermal conductances from boundary nodes (at the top, bottom and sides) to the surroundings
that model the heat transfer between the given volume and the surroundings. These boundary
conditions can be isothermal (fixed temperature), insulated (no heat transfer) or convective
(heat loss due to difference in ambient and boundary temperatures) [24]. Equation (2.54) is
equivalent to the RC power grid model obtained using NA.
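As a minimal sketch of how (2.51)-(2.53) map onto a single thermal node, the following Python fragment computes the lumped elements of one cuboid and the resulting time derivative of its temperature. The function names and the argument layout are assumptions made for illustration; the symbols D, c_p, κT and γT are taken as given inputs, and any numerical values passed in are hypothetical.

```python
def cuboid_elements(kappa_t, d, c_p, gamma_t, dx, dy, dz):
    """Lumped thermal elements of one cuboid, following (2.52)-(2.53)."""
    g_x = kappa_t * dy * dz / dx   # conductance to the x neighbours
    g_y = kappa_t * dx * dz / dy   # conductance to the y neighbours
    g_z = kappa_t * dx * dy / dz   # conductance to the z neighbours
    c_t = d * c_p * dx * dy * dz   # capacitor to ground
    i_t = gamma_t * dx * dy * dz   # current source
    return g_x, g_y, g_z, c_t, i_t


def node_dtdt(t_center, t_x, t_y, t_z, elements):
    """dT/dt at one thermal node, obtained by rearranging (2.51).

    t_x, t_y and t_z are (minus, plus) pairs of neighbour temperatures.
    """
    g_x, g_y, g_z, c_t, i_t = elements
    net_in = (
        g_x * (t_x[0] + t_x[1] - 2.0 * t_center)
        + g_y * (t_y[0] + t_y[1] - 2.0 * t_center)
        + g_z * (t_z[0] + t_z[1] - 2.0 * t_center)
        + i_t
    )
    return net_in / c_t
```

With all six neighbours at the same temperature as the node, only the source term remains, so the node heats at the rate iT/cT.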
2.9 State Space Models
A state space model (SSM) of a linear system with n state variables, ni inputs and no outputs,
can be written as
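\[
\begin{aligned}
\dot{x}(t) &= A(t)\,x(t) + B(t)\,u(t), && \text{(2.55a)} \\
y(t) &= L(t)\,x(t) + D(t)\,u(t), && \text{(2.55b)}
\end{aligned}
\]
where $\dot{x}(t) \triangleq dx/dt$, $x(t) = [x_i(t)] \in \mathbb{R}^{n}$ is the state vector of the $n$ states $x_i$, $A(t) = [a_{i,j}(t)] \in \mathbb{R}^{n \times n}$ is
the system matrix, $B(t) = [b_{i,j}(t)] \in \mathbb{R}^{n \times n_i}$ is the input matrix, $L(t) = [l_{i,j}(t)] \in \mathbb{R}^{n_o \times n}$ is the
output matrix, $D(t) = [d_{i,j}(t)] \in \mathbb{R}^{n_o \times n_i}$ is the feedforward matrix, $u(t) = [u_i(t)] \in \mathbb{R}^{n_i}$ is the
vector of inputs to the system and $y(t) = [y_i(t)] \in \mathbb{R}^{n_o}$ is the output of the system. At any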
given time t, the state vector describes the linear system completely. Note that (2.55a) is a first
order ODE and is the same as (2.30). If all the matrices in (2.55) are independent of time t,
then we end up with a linear time invariant (LTI) system of the form
\[
\begin{aligned}
\dot{x}(t) &= A\,x(t) + B\,u(t), && \text{(2.56a)} \\
y(t) &= L\,x(t) + D\,u(t). && \text{(2.56b)}
\end{aligned}
\]
An example of an LTI system is (2.54), with $x(t) = T_m(t)$, $A = -C_T^{-1} G_T$, $B = C_T^{-1}$ and
$u(t) = i_{Ts}(t) + G_{T,0} T_{amb}$. Similar to an ODE, one needs to specify the initial condition at some
time $t = t_0$ in order to solve an SSM. Consider a simple homogeneous (no input) LTI system of
the form
\[
\dot{x}(t) = A\,x(t), \tag{2.57}
\]
with given initial condition $x(0)$. Then, its solution can be shown to be
\[
x(t) = e^{At} x(0), \tag{2.58}
\]
where $e^{At}$ is the matrix exponential that can be expressed as
\[
e^{At} = \sum_{k=0}^{\infty} \frac{A^k t^k}{k!}. \tag{2.59}
\]
Moreover, if the system matrix A has distinct eigenvalues λ0, λ1, . . ., λn−1 or is diagonalizable,
then each state variable xi can be expressed as a weighted sum of n exponential components
\[
x_i(t) = \sum_{j=0}^{n-1} m_{i,j}\, e^{\lambda_j t}, \quad i = 0, 1, \ldots, n-1, \tag{2.60}
\]
where the $m_{i,j}$ are constant coefficients that depend on the initial conditions and the eigenvectors
of $A$. A homogeneous system, as given by (2.57), is asymptotically stable if and only if all
eigenvalues of $A$ have strictly negative real parts, and is unstable if any eigenvalue of $A$ has a
positive real part. From (2.60), if all eigenvalues have negative real parts, then $x_i(\infty) = 0$. So
an asymptotically stable system decays to zero in the absence of any input. On the other hand,
if any eigenvalue $\lambda_i$ has a positive real part, then $e^{\lambda_i t}$ blows up as $t \to \infty$, which results in an
unstable system.
In the presence of inputs u(t), the solution is the sum of a homogeneous response that
depends on the initial condition (given by (2.58)) and a forced response which is calculated
using a convolution integral, as shown here
\[
x(t) = e^{At} x(0) + \int_0^t e^{A(t-\tau)} B\,u(\tau)\, d\tau. \tag{2.61}
\]
In most cases, computing the matrix exponential is a very expensive operation. As such,
numerical integration methods presented in Section 2.7 are often used to solve a state space
model.
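To make (2.58) and (2.59) concrete, the following sketch evaluates a truncated version of the series for a small system. The helper names (`mat_mul`, `expm_series`, `homogeneous_response`) and the default number of terms are assumptions made for illustration. A truncated series of this kind is only reasonable for small, well-scaled matrices, which is in line with the remark that numerical integration is usually preferred.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def expm_series(a, t, terms=30):
    """Truncated power series (2.59) for exp(A t); a is a square list of rows."""
    n = len(a)
    at = [[a[i][j] * t for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # holds (A t)^k / k!
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, at)]
        result = [
            [result[i][j] + term[i][j] for j in range(n)] for i in range(n)
        ]
    return result


def homogeneous_response(a, x0, t):
    """x(t) = exp(A t) x(0), as in (2.58)."""
    e = expm_series(a, t)
    return [sum(e[i][j] * x0[j] for j in range(len(x0))) for i in range(len(x0))]
```

For a diagonal A with negative entries, each state decays as its own exponential, which matches (2.60) with one component per state.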
2.10 Mean estimation using Monte Carlo random sampling
Consider a continuous random variable (RV) T that has a certain distribution. The RV T is
a function that maps the outcome of a random process (e.g. the microstructure of a line or a
grid) to a real number (e.g. the TTF). The sets $\{T \le t\}$ represent the events where the RV
T has a value less than or equal to t, and have assigned probability values. A Cumulative Distribution
Function or cdf of a RV T, denoted by $F_T(t)$, is defined as
\[
F_T(t) \triangleq P\{T \le t\}, \tag{2.62}
\]
where the right hand side is equal to the probability that the RV T takes a value less than or
equal to t. A RV is completely characterized by its cdf.
Random sampling refers to the process of iteratively generating sample values from an
underlying distribution of a RV. Monte Carlo methods estimate a quantity of interest based on
repeated random sampling. A classic example is the problem of estimating the mean of a RV,
denoted by µ or E[T]. Ideally, if we draw a very large number of samples from the underlying
distribution of the RV and calculate the arithmetic average, we can estimate E[T]. However,
it is often practically not possible to obtain a large number of samples. Ideally, one would like
to know how close the estimated mean $\hat{\mu}$ is to the true mean $\mu$ of the distribution and stop
when the estimated mean is close enough to the true mean. This is where Monte Carlo random
sampling comes into play.
Suppose we are sampling from a RV that has a normal distribution and the variance of the
distribution is not known. Then, in order to ensure an upper bound $\epsilon_{mc}$ on the relative error
between $\hat{\mu}$ and $\mu$ with a confidence of $(1-\zeta) \times 100\%$, the number of samples $s$ needed is given
by [64]
\[
s \ge \left[ \frac{z_{\zeta/2}\, v}{|\hat{\mu}|\, \epsilon_{mc}/(1+\epsilon_{mc})} \right]^2, \tag{2.63}
\]
where zζ/2 is the (1 − ζ/2)-percentile of the standard normal distribution (i.e. a normal dis-
tribution with mean 0 and variance 1) and v is the unbiased estimator of variance calculated
as
\[
v = \frac{1}{s-1} \sum_{i=1}^{s} (T_i - \hat{\mu})^2, \tag{2.64}
\]
with Ti being the ith sample obtained. The usage of v instead of the true variance (which
is unknown) is acceptable only when s is large enough. As per [64], (2.63) can be used only
when s ≥ 30. Roughly speaking, a confidence of (1 − ζ) × 100% means that the estimation
procedure using (2.63) as a stopping criterion satisfies the relative error bound $|\hat{\mu} - \mu|/\mu \le \epsilon_{mc}$
(1 − ζ) × 100% of the time.
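The stopping rule can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `estimate_mean`, its default arguments and the cap `max_samples` are assumptions. In addition, the spread term in (2.63) is taken here to be the sample standard deviation (the square root of the quantity in (2.64)), which is an assumption made for dimensional consistency rather than something stated in the text.

```python
import math
from statistics import NormalDist


def estimate_mean(draw, eps_mc=0.05, zeta=0.05, min_samples=30, max_samples=100_000):
    """Sample until the count satisfies (2.63); returns (estimated mean, samples used).

    draw is a zero-argument callable returning one sample T_i.
    """
    z = NormalDist().inv_cdf(1.0 - zeta / 2.0)  # z_{zeta/2}
    count, mu_hat, m2 = 0, 0.0, 0.0  # running mean and sum of squared deviations
    while count < max_samples:
        t_i = draw()
        count += 1
        delta = t_i - mu_hat
        mu_hat += delta / count
        m2 += delta * (t_i - mu_hat)
        if count < min_samples or mu_hat == 0.0:
            continue  # (2.63) is used only for s >= 30 and needs a nonzero mean
        spread = math.sqrt(m2 / (count - 1))  # assumption: standard deviation
        needed = (z * spread / (abs(mu_hat) * eps_mc / (1.0 + eps_mc))) ** 2
        if count >= needed:
            break
    return mu_hat, count
```

A typical call would pass a closure that generates one TTF sample per invocation, for example `estimate_mean(lambda: rng.gauss(10.0, 2.0))` with `rng = random.Random(1)`.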
Chapter 3
Extended Korhonen’s model
3.1 Introduction
In this chapter, we will present the first main contribution of our work: the Extended Korhonen’s
model (EKM). We will begin by formally defining interconnect trees and explaining why they
are important for EM checking. We will then introduce EKM by using a simple interconnect
tree as an example. After this, we will state the boundary laws to model the material transfer
between the connected branches and state EKM as a PDE system. We will then describe our
numerical approach for converting the PDE system to an ODE system by using the Method of
Lines. We will verify our numerical approach and the proposed model itself by comparing its
results with prior art. Finally, we will compare the estimated junction MTFs in a tree obtained
using (calibrated) Black’s model and EKM to show the inherent inaccuracy of Black’s model
and investigate the effect of temperature on EM lifetime. A preliminary version of this work
appeared in [65].
3.2 Interconnect Tree EM analysis
Figure 3.1: Cross sectional schematic of Cu dual damascene interconnects.
As mentioned before, breaking up a tree into individual branches for EM analysis is not accurate because it ignores the material flow between the connected branches. However, in modern p/g grids, one does not have to treat the whole p/g grid as a connected structure when it comes to material flow. Modern p/g grids are made of Copper (Cu) and are fabricated using a dual damascene process [21]. In a dual damascene process, the metal line and via are formed simultaneously using copper. A barrier metal liner (usually Tantalum) must completely surround all Cu interconnects to prevent
Figure 3.2: A typical interconnect tree structure.
the Copper from diffusing into the surrounding dielectric. The cross section of a typical metal
via structure in a Cu dual damascene process is as shown in Fig. 3.1. Due to the presence of
the barrier metal liner around the vias and branches, Cu atoms from one tree cannot diffuse
to another tree. As a result, the metal atoms are confined within a tree and we only need to
account for the material flow between the branches of a tree while conducting EM analysis.
An interconnect tree is a continuously connected acyclic structure of straight metal lines
within one layer of metalization such that atomic flux can flow freely within it. Fig. 3.2 shows a
typical interconnect tree structure. Formally, an interconnect tree is a graph $\mathcal{T} = (\mathcal{N}, \mathcal{B})$ with no cycles, where $\mathcal{N}$ is a set of grid junctions and $\mathcal{B}$ is a set of resistive branches. A branch is
defined to be a continuous straight metal line of uniform width that has the same current density
along its length. A junction is any point on the interconnect tree where a branch ends or where
a via is located. Usually, but not always, current density around a junction is discontinuous,
as different branches in a tree are allowed to have different widths. This discontinuity can be
caused either by differences in the widths of connected branches, or by a change in the currents
due to the presence of a via. We define the degree of a junction to be the number of branches
connected to it. Note that a via does not contribute to the degree of a junction. In this work, a
junction with degree 1 will be referred to as a diffusion barrier, a junction with degree 2 will be
referred to as a dotted-I junction, a junction with degree 3 will be referred to as a T junction
and a junction with degree 4 will be referred to as a plus junction. We treat corners in a tree
as dotted-I junctions. Due to the planar nature of interconnect trees, junctions with degrees
higher than 4 are rarely found in practice.
Many previous works [20, 22, 21] assumed atomic diffusivity Da to be constant throughout
the tree. In our case, we will assume Da to be constant within a branch, but it may vary across
different branches within a tree. Thus, we end up with a piecewise constant Da throughout the
tree. This is done for two reasons: 1) it allows for a more general framework that can easily
fall back to constant Da for the whole tree if required and 2) it is physically more accurate
to assume an effective diffusivity (at a macroscopic level) that varies over short distances [41, 66]
due to random grain boundary orientations. If required, a long branch can be broken down
into smaller branches with different diffusivities.
There are two consequences of assuming fixed branch diffusivities. First, atomic flux di-
vergence (AFD) is now higher at branch ends, i.e. junctions, as compared to branch interior.
Higher AFD leads to higher (positive or negative) time-rate of change of stress. Thus, in our
model, voids will nucleate only at junctions of a tree. This is not a problem since it is much
more common in the field to find voids around via locations and grain boundaries [5, 46, 28]
as compared to the branch interior. Second, branches cannot have temperature gradients be-
cause Da depends on temperature [as shown in (2.10)]. Practically, temperatures cannot change
drastically over short distances, hence assuming a branch to be isothermal is a very mild as-
sumption. The branch diffusivities can vary over time if the temperature changes, but at any
given time, the entire branch has the same diffusivity.
3.2.1 Assigning reference directions
Before doing any analysis, we need to assign reference directions to all branches. This is
necessary to consistently track the directions of branch currents and atomic flux throughout
the tree.
An interconnect tree is equivalent to a graph, with grid junctions as vertices and branches as
edges. With this analogy, there are many ways to assign reference direction to the branches. We
choose the following way: starting from any diffusion barrier, we traverse the whole interconnect
tree using a breadth-first search on the graph. This creates predecessor-successor relationships
between the junctions. The reference direction for each branch is then assigned from predecessor
to successor. The branch current (and atomic flux) is positive if it flows in the reference
direction, otherwise it is negative. Likewise, the reference point for distance is the predecessor
junction, so that for any branch bk, xk = 0 is the predecessor and xk = Lk (line length) is the
successor. In Fig. 3.2, if we choose to start from the leftmost diffusion barrier (labelled as n1),
then the reference directions for each branch would be as shown by the dashed arrow lines.
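The traversal described above can be sketched as follows. The function name, the input format (a list of undirected junction pairs) and the junction labels used in the usage note are assumptions made for illustration; they do not reproduce the tree of Fig. 3.2.

```python
from collections import deque


def assign_reference_directions(branches, start):
    """Orient every branch from predecessor to successor using breadth-first search.

    branches: list of (junction_a, junction_b) pairs that form a tree.
    start: a diffusion barrier (junction of degree 1) at which the search begins.
    Returns a list of (predecessor, successor) pairs in the same order as branches.
    """
    adjacency = {}
    for idx, (a, b) in enumerate(branches):
        adjacency.setdefault(a, []).append((b, idx))
        adjacency.setdefault(b, []).append((a, idx))
    if len(adjacency[start]) != 1:
        raise ValueError("start junction must be a diffusion barrier (degree 1)")

    directed = [None] * len(branches)
    visited = {start}
    queue = deque([start])
    while queue:
        pred = queue.popleft()
        for succ, idx in adjacency[pred]:
            if succ not in visited:
                visited.add(succ)
                directed[idx] = (pred, succ)  # x_k = 0 at pred, x_k = L_k at succ
                queue.append(succ)
    return directed
```

For a hypothetical tree given as `[("n2", "n1"), ("n2", "n3"), ("n3", "n4"), ("n2", "n5")]` and started at `"n1"`, the first branch is flipped to `("n1", "n2")` and the others keep their stored order. A branch current is then positive when it flows from the first to the second junction of its pair.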
3.2.2 Incorporating thermal stress
In the case of on chip interconnects, the metal lines are embedded in a rigid confinement.
Because of the difference in the coefficients of thermal expansion (CTE) of the metal (Copper), $a_m$,
and the confinement (Silicon Dioxide), $a_{si}$, stress is generated as the metal cools down after
deposition. This so-called thermal stress can be expressed as [67]
\[
\sigma_{T,k}(t) = B\,(a_m - a_{si})\,(T_{zs} - T_{m,k}(t)), \tag{3.1}
\]
where $B$ is the bulk modulus, $\sigma_{T,k}$ is the thermal stress, $T_{m,k}$ is the temperature of branch $b_k$
and $T_{zs} > T_{m,k}$ is the stress free annealing temperature. The initial stress $\sigma_k(x_k, 0)$ in branch
Figure 3.3: A simple 3-terminal tree Td. Dashed arrows denote reference directions.
bk at t = 0 is equal to its thermal stress so that
σk(xk, 0) = σT,k(0). (3.2)
If σk(xk, 0) > σth, thermally induced voids will nucleate in a tree. However, that would require
the branch temperatures to be much lower than room temperature, which is a highly
unlikely scenario. In fact, the supported temperature range for commercial devices is 0°C to
70°C [68]. Thus, thermally induced voids are ignored in this work.
3.3 Extending Korhonen’s model to trees
To find the level of EM degradation in an interconnect tree, we will extend Korhonen’s model
to account for the coupling of stress between the tree branches. For better understanding, we
will first illustrate our approach with a simple interconnect tree as shown in Fig. 3.3. We will
then generalize the scheme into a set of boundary laws and state the PDE system for the whole
tree.
Consider a simple tree $T_d = (\mathcal{N}, \mathcal{B})$, with $\mathcal{N} = \{n_1, n_2, n_3\}$ and $\mathcal{B} = \{b_1, b_2\}$, as shown in
Fig. 3.3. Branch bk has dimensions Lk × wk × hk (length × width × height), carries a current
density jk, has an atomic diffusivity of Da,k and temperature Tm,k, where k is 1 or 2 in this
case. The reference direction for both branches is from left to right, so that n1 is the reference
point for b1 and n2 is the reference point for b2. Within branch bk, the distance from its
reference point is denoted by xk. Note that x1 = L1 and x2 = 0 denote the same
point: the location of junction n2. We are interested in stress as a function of position and
time, i.e. σ1(x1, t) and σ2(x2, t) for branches b1 and b2, respectively. Once σ1 and σ2 are known,
we can easily determine the EM degradation in the branches.
For any point within a branch, Korhonen’s model (2.9) captures the dynamics of stress
evolution. Since atomic diffusivity is assumed to be constant for a branch, we can re-write (2.9)
for branches b1 and b2 as
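\[
\frac{\partial \sigma_k}{\partial t} = \frac{B\,\Omega\, D_{a,k}}{k_b T_{m,k}} \frac{\partial}{\partial x_k}\left( \frac{\partial \sigma_k}{\partial x_k} - \frac{q^*\rho}{\Omega} j_k \right), \quad x_k \in (0, L_k) \text{ and } k = 1, 2. \tag{3.3}
\]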
At junctions, the diffusivity and current density change abruptly. As such, their spatial deriva-
tives are undefined and Korhonen’s model cannot be applied at junctions. Instead, we need to
Figure 3.4: Stress profile around a junction immediately after void nucleation.
state the boundary conditions to describe the behaviour of stress and atomic flux at the junc-
tions. For example, in Fig. 3.3, we need to state the boundary conditions at the two diffusion
barriers n1 and n3 and the dotted-I junction n2.
Diffusion Barrier
Junctions n1 and n3 are diffusion barriers, where the atomic flux is blocked. Considering the
nucleation phase first, Ja is zero at the barrier so that from (2.8)
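\[
\begin{aligned}
J_{a,1}(0, t) = 0 &\implies \frac{\partial \sigma_1(0, t)}{\partial x_1} = \frac{q^*\rho}{\Omega} j_1, && \text{(3.4a)} \\
J_{a,2}(L_2, t) = 0 &\implies \frac{\partial \sigma_2(L_2, t)}{\partial x_2} = \frac{q^*\rho}{\Omega} j_2. && \text{(3.4b)}
\end{aligned}
\]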
We next move to the void growth phase. For a void to nucleate at n1 (n3), we must have
j1 < 0 (j2 > 0), so that the electron flow pushes the metal atoms away from n1 (n3). Exactly
what happens around a void is somewhat complicated and cannot be fully captured in a 1D
model. Sukharev et al. [67] provide a simplified extension of the Korhonen 1D model to describe
behaviour of stress around a void, which we will use in our work. When the stress value at any
junction reaches σth, a void nucleates at that point. Just after the void nucleation, stress falls
to zero inside the void and at the void surface, but remains at its original value σth at a very
short distance of δ ≈ 1 nm from the void surface. We refer to δ as the thickness of the void
interface. For example, the stress profile at n1 just after void nucleation is shown in Fig. 3.4.
Recall that stress gradient gives rise to gradient flux that flows from points with lower stress
towards points of higher stress. In this case, the high spatial stress gradient gives rise to a
high gradient flux that always flows away from the void. This flux is responsible for the void
growth. The net flux at the junction is now the sum of this gradient flux and the electronic flux.
However, the magnitude of the electronic flux is very small as compared to the gradient flux.
We thus ignore the electronic flux and state the boundary conditions at the diffusion barriers
during the void growth phase as
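\[
\begin{aligned}
\frac{\partial \sigma_1(0, t)}{\partial x_1} &= \frac{\sigma_1(0, t)}{\delta}, && \text{(3.5a)} \\
\frac{\partial \sigma_2(L_2, t)}{\partial x_2} &= -\frac{\sigma_2(L_2, t)}{\delta}, && \text{(3.5b)}
\end{aligned}
\]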
where σ1(0, t) = σth and σ2(L2, t) = σth at the time of void nucleation. A growing void presents
a moving boundary problem for a PDE that is computationally very expensive to solve. However,
as we will see, the steady state void length is very small (≈ 0.5% of line length), so that (3.5)
is a good approximation for stress gradient around the void.
Dotted-I Junction
The interaction of atomic flux at dotted-I junction n2 is the key to describing the coupling of
stresses in branches b1 and b2. Considering the nucleation phase first, the junction n2 is the
same physical point of both b1 and b2, so that
σ1(L1, t) = σ2(0, t). (3.6)
In other words, stress is continuous across the junction. This makes sense because if stress is
discontinuous across a junction and abruptly jumps from one value to another within a short
distance, the high stress gradient would quickly equalize the stress across the junction simply
because the atomic flux can flow freely between b1 and b2 when there is no void at n2. This
brings us to our second boundary condition, which can be stated mathematically as
w1h1Ja,1(L1, t) = w2h2Ja,2(0, t). (3.7)
Note that (3.7) is applicable to an infinitesimal cross-section at n2, and states that the material
flow across an infinitesimal cross-section is conserved. This is true for any infinitesimal cross-
section inside the branch as well, and is implicitly accounted for in Korhonen’s model. However,
over a finite region or volume element, the net atomic flux entering may not be equal to the
net atomic flux leaving, which gives rise to flux divergence and generates stress in the line.
Next we will consider the void growth phase. Once a void nucleates at n2, it is shared by
both branches b1 and b2. For our 1D model, we make the reasonable assumption that the void
completely covers the entire cross-sectional area of the junction. As a result, there would be
no flow of atomic flux between b1 and b2. Hence, during the void growth phase, we effectively
treat n2 as a diffusion barrier for both branches b1 and b2, so that
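\[
\frac{\partial \sigma_1(L_1, t)}{\partial x_1} = -\frac{\sigma_1(L_1, t)}{\delta}, \qquad
\frac{\partial \sigma_2(0, t)}{\partial x_2} = \frac{\sigma_2(0, t)}{\delta}, \tag{3.8}
\]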
where σ1(L1, t) = σ2(0, t) = σth at the time of void nucleation. The alternate assumption, that
a void partially covers the cross-section at a junction is hard to model in a 1D scenario where
every location is essentially treated as a point. Note that the branches are still electrically
connected as the current can flow through the barrier metal liner.
As we will see a little later, (3.3) combined with the boundary conditions obtained from
(3.4)-(3.8) and the initial condition as stated in (3.2), is the PDE system that completely
determines σ1 and σ2. We will next generalize the above schemes for capturing flux interactions
at junctions, into a set of laws that forms the basis for our approach.
3.3.1 Boundary Laws for junctions
Consider a junction np, and let Bp be the set of branches connected to np. Let tf,p be the time
of void nucleation for this junction. Then, the boundary laws, motivated mainly by the law of
conservation of mass and physical observations, can be stated as:
Law 1. Until a void nucleates at np, the stress values in any two branches where they meet at
np are equal.
Law 2. For t < tf,p, the number of metal atoms flowing into np per unit time is the same as
the number of metal atoms flowing out from it
\[
\sum_{b_k \in \mathcal{B}_{p,in}} w_k h_k J_{a,k} = \sum_{b_k \in \mathcal{B}_{p,out}} w_k h_k J_{a,k}, \tag{3.9}
\]
where wk (hk) is the width (height) of the branch, Bp,in is the set of branches for which the
reference direction is going into np, and Bp,out is the set of branches for which the reference
direction is going out from np.
Law 3. For t ≥ tf,p, there is no flow of atomic flux between the connected branches Bp. The
stress gradient at the junction, generalizing from (3.5) and (3.8), is
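\[
\frac{\partial \sigma_{k,p}}{\partial x_k} = \pm \frac{\sigma_{k,p}}{\delta}, \tag{3.10}
\]
where $\sigma_{k,p}$ is the value of stress at end-point $n_p$ of branch $b_k$. The sign is positive for $b_k \in \mathcal{B}_{p,out}$ and negative for $b_k \in \mathcal{B}_{p,in}$.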
3.3.2 PDE system for a general interconnect tree
We are now ready to state the complete PDE system that describes the stress evolution for an
interconnect tree of arbitrarily complex geometry over time. We refer to this PDE system as the
Extended Korhonen’s model.
Consider a tree $\mathcal{T} = (\mathcal{N}, \mathcal{B})$. A branch $b_k \in \mathcal{B}$ has dimensions $L_k \times w_k \times h_k$ and carries
a current density jk. Let Da,k and Tm,k represent the atomic diffusivity and temperature of
branch bk. Let xk denote the distance from the reference point (predecessor junction) in branch
bk with 0 ≤ xk ≤ Lk. For any junction np ∈ N , let Bp,in (Bp,out) be the set of connected
branches for which the reference direction is going into (out of) the junction, and let tf,p > 0
be its time of void nucleation. Then, the Extended Korhonen’s model can be stated as
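\[
\begin{aligned}
\text{PDE:}\quad & \frac{\partial \sigma_k}{\partial t} = \frac{B\,\Omega\, D_{a,k}}{k_b T_{m,k}} \frac{\partial}{\partial x_k}\left( \frac{\partial \sigma_k}{\partial x_k} - \frac{q^*\rho}{\Omega} j_k \right), \quad \forall b_k \in \mathcal{B},\ x_k \in (0, L_k), && \text{(3.11a)} \\
\text{BC:}\quad & \forall n_p \in \mathcal{N} \text{ s.t. } t < t_{f,p}: \\
& \sum_{b_k \in \mathcal{B}_{p,in}} w_k h_k J_{a,k}(L_k, t) = \sum_{b_k \in \mathcal{B}_{p,out}} w_k h_k J_{a,k}(0, t), && \text{(3.11b)} \\
& \sigma_k(L_k, t) = \sigma_i(0, t), \quad \forall (b_k, b_i) \in \mathcal{B}_{p,in} \times \mathcal{B}_{p,out}, && \text{(3.11c)} \\
& \forall n_p \in \mathcal{N} \text{ s.t. } t \ge t_{f,p}: \\
& \frac{\partial \sigma_{k,p}}{\partial x_k} =
\begin{cases}
-\dfrac{\sigma_{k,p}(L_k, t)}{\delta} & \forall b_k \in \mathcal{B}_{p,in}, \\[2ex]
\dfrac{\sigma_{k,p}(0, t)}{\delta} & \forall b_k \in \mathcal{B}_{p,out},
\end{cases} && \text{(3.11d)} \\
\text{IC:}\quad & \sigma_k(x_k, 0) = \sigma_{T,k}(0) \quad \forall b_k \in \mathcal{B}. && \text{(3.11e)}
\end{aligned}
\]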
3.3.3 Void growth and resistance change
Once the stress at any point on the tree reaches σth, a void nucleates at that point. As noted
before, in EKM, void nucleation occurs only at junctions and not within the branches. Once
a void nucleates at a junction, it is shared by all the branches connected to that junction, i.e.
it affects the resistance of all connected branches. Tracking void growth is useful in order to
determine the change in branch resistances and the corresponding current densities. However,
void growth dynamics in dual damascene copper interconnects is a complex phenomenon and
involves void migration (movement of void within a branch), healing (void size reduction due
to change in the current direction) and saturation (steady state void volume for given branch
current densities) [69, 70]. Since p/g grid branches carry mostly unidirectional current, void
healing rarely happens. Also, there is no change in void size during migration [69], which means
that void migration has no effect on the branch resistance. Thus, we will ignore void migration
and healing in this work.
Sukharev et al. [67] show that the initial void growth rate for an EM induced void is very
high. This is attributed to the high initial gradient flux as explained in Section 3.3. Hence,
as a conservative approximation, we assume that once a void nucleates at any junction np, the
void lengths for all branches bk connected to np reach their steady state values in a very short
period of time. As a result, the line resistance rises immediately to its steady state value for
all connected branches. The steady state void volume for branch bk can be calculated as
\[
V_{k,sat} = L_k w_k h_k \left( \frac{\sigma_{T,k}}{B} + \frac{q^*\rho\,|j_k|\,L_k}{2 B \Omega} \right). \tag{3.12}
\]
In our case, since we assume that a void covers the entire cross-section area, the void length is
simply given by lk,v = Vk,sat/(wkhk). In the presence of a void, the branch current is forced to
take the high resistance path through the metal liner. Correspondingly, the branch resistance
Rk becomes
Rk = ρblk,v/Ab + ρm(Lk − lk,v)/Am , (3.13)
where ρm(ρb) and Am(Ab) are the resistivity and cross-sectional area of the metal (liner),
respectively. For any branch bk, Vk,sat and jk are interdependent. As such,
we iteratively find jk and Vk,sat using modified Richardson iteration. It should be noted that
although we assume a saturated volume for the void, the boundary conditions for any junction
where a void has nucleated is the same as the one used for transient void growth. Thus, in
assuming immediate steady state void volume, we have replaced the actual transient current
densities by their respective conservative steady state values.
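The two closed-form expressions (3.12) and (3.13) are easy to evaluate directly, as the sketch below shows. The function names and argument names are assumptions made for illustration, and the iteration that makes jk and Vk,sat mutually consistent is deliberately left out because its details are not given here.

```python
def saturated_void_volume(length, width, height, sigma_t, bulk_modulus,
                          q_star, rho, omega, j):
    """Steady state void volume of one branch, following (3.12)."""
    return length * width * height * (
        sigma_t / bulk_modulus
        + q_star * rho * abs(j) * length / (2.0 * bulk_modulus * omega)
    )


def void_length(volume, width, height):
    """Void length when the void spans the whole cross-section."""
    return volume / (width * height)


def voided_resistance(length, l_void, rho_m, area_m, rho_b, area_b):
    """Branch resistance with the voided section bridged by the liner, following (3.13)."""
    return rho_b * l_void / area_b + rho_m * (length - l_void) / area_m
```

In a full flow, the resistance returned by `voided_resistance` would be fed back into the grid solve to update the branch current density, and the three functions would be re-evaluated until the values stop changing.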
3.4 Solving EKM using IVP formulation
In this section, we will describe our approach for solving the Extended Korhonen’s model using
method of lines (MoL). First, for points within a branch, we will use MoL to convert the PDE
system into an ODE system by discretizing along the spatial domain. Then, using the laws
proposed in Section 3.3.1, we will derive the boundary conditions at the junctions. Finally, we
will merge the two and state the IVP formulation that describes the stress evolution for a given
tree.
Since we will deal with power grids that are composed of trees, the IVP (as well as the LTI
systems in the next chapter) is formulated to solve EKM for trees. However, EKM as shown in
(3.11), is applicable to non-tree interconnect structures (that have loops) as well and one can
formulate an equivalent IVP (or LTI system) for general graphs.
3.4.1 Scaling
Before proceeding with MoL, we will scale stress, distance and time by introducing their di-
mensionless variants. This leads to stable PDEs that are easier to solve numerically. We define
the following scaling factors for any branch bk ∈ B
\[
\tau = \frac{B\,\Omega}{k_b T_m^\star} \frac{D_a^\star\, t}{L_c^2}, \quad
\eta_k = \frac{\Omega\, \sigma_k}{k_b T_m^\star}, \quad
\xi_k = \frac{x_k}{L_k}, \tag{3.14}
\]
where $D_a^\star$ is the atomic diffusivity at some chosen nominal temperature $T_m^\star$ and $L_c$ is some
chosen characteristic length. The new variables $\tau$, $\eta$ and $\xi$ are referred to as reduced time,
stress and distance, respectively. Using (3.14) in (3.11a) and applying the chain rule, we get
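\[
\frac{\partial \eta_k}{\partial \tau} = \theta_k \frac{\partial}{\partial \xi_k}\left( \frac{\partial \eta_k}{\partial \xi_k} - \alpha_k \right), \tag{3.15}
\]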
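where $\theta_k = (L_c^2 D_{a,k} T_m^\star)/(L_k^2 D_a^\star T_{m,k})$ and $\alpha_k = (q^*\rho\, j_k L_k)/(k_b T_m^\star)$. Since, for any given branch,
$\alpha_k$ is not a function of distance $\xi_k$, we have $\partial \alpha_k/\partial \xi_k = 0$, so that
\[
\frac{\partial \eta_k}{\partial \tau} = \theta_k \frac{\partial^2 \eta_k}{\partial \xi_k^2}. \tag{3.16}
\]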
Equation (3.16) constitutes the scaled PDE to be solved for ∀bk ∈ B. Also, the atomic flux in
bk can be restated in terms of the reduced variables as
\[
J_{a,k} = \frac{D_{a,k}\, C\, T_m^\star}{L_k T_{m,k}} \left( \frac{\partial \eta_k}{\partial \xi_k} - \alpha_k \right). \tag{3.17}
\]
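The scaling step amounts to a handful of ratios, sketched below. The function names and argument names are assumptions made for illustration; the formulas themselves follow (3.14) and the definitions of θk and αk given above.

```python
def to_reduced(sigma, x, t, length, l_c, d_a_star, temp_star, bulk_modulus, omega, k_b):
    """Reduced time, stress and distance (tau, eta, xi) following (3.14)."""
    tau = (bulk_modulus * omega / (k_b * temp_star)) * (d_a_star * t / l_c ** 2)
    eta = omega * sigma / (k_b * temp_star)
    xi = x / length
    return tau, eta, xi


def scaling_coefficients(length, d_a, temp, j, l_c, d_a_star, temp_star, q_star, rho, k_b):
    """Per-branch coefficients theta_k and alpha_k of the scaled PDE (3.15)."""
    theta = (l_c ** 2 * d_a * temp_star) / (length ** 2 * d_a_star * temp)
    alpha = (q_star * rho * j * length) / (k_b * temp_star)
    return theta, alpha
```

A branch whose length equals Lc and which sits at the nominal temperature and diffusivity gets θk = 1, so θk can be read as the diffusion speed of a branch relative to that nominal case.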
3.4.2 Discretization for a tree branch
We uniformly discretize branch bk into N segments, where N is the same for all branches
because we have scaled all branch lengths to 1 as in (3.14). The reduced stress at each of the
N + 1 discrete spatial points 0, 1, . . . N in branch bk is denoted by ηk,i and the time rate of
change of ηk,i is [from (3.16)]
\[
\frac{\partial \eta_{k,i}}{\partial \tau} = \theta_k \frac{\partial^2 \eta_{k,i}}{\partial \xi_k^2}, \quad i = 0, 1, \ldots, N. \tag{3.18}
\]
Further, we approximate the partial derivatives with respect to ξk using the central difference
formula, so that (3.18) leads to
\[
\frac{d\eta_{k,i}}{d\tau} = \theta_k \left( \frac{\eta_{k,i+1} + \eta_{k,i-1} - 2\eta_{k,i}}{(\Delta\xi)^2} \right), \quad i = 0, 1, \ldots, N, \tag{3.19}
\]
where ∆ξ = ∆ξk = 1/N , ∀k. The corresponding atomic flux Ja,k,i at the ith point in branch bk
is given as
\[
J_{a,k,i} = \frac{D_{a,k}\, C\, T_m^\star}{L_k T_{m,k}} \left( \frac{\eta_{k,i+1} - \eta_{k,i-1}}{2\Delta\xi} - \alpha_k \right). \tag{3.20}
\]
The ODE system given by (3.19) ∀bk ∈ B, combined with the initial condition (3.2), approx-
imates the PDE system (3.11); so that the solution of the ODE system gives us the solution
of (3.11). However, the formulation is not yet complete because the ODEs at all junctions
(i = 0, N ∀bk ∈ B) require the values of ηk,−1 and ηk,N+1, which are not part of the ξk
domain. The values at these ghost points are obtained by solving the respective boundary
condition(s), as we next explain.
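To simplify the presentation going forward, we define the following for any two branches
$b_i, b_k \in \mathcal{B}$:
\[
r_{ik} \triangleq \frac{L_i}{L_k}, \quad
p_{ik} \triangleq \frac{D_{a,i} T_{m,k}}{D_{a,k} T_{m,i}}, \quad
w_{ik} \triangleq \frac{w_i}{w_k}, \quad
\gamma_{ik} \triangleq r_{ki}\, w_{ik}\, p_{ik}, \quad
\Upsilon_k \triangleq \frac{\theta_k}{(\Delta\xi)^2}. \tag{3.21}
\]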
3.4.3 Boundary Conditions at Diffusion Barrier
Consider a diffusion barrier np connected to branch bk. We have two cases, one where np is the
predecessor junction (at ξk = 0, start of the branch) and one where it is the successor junction
(at ξk = 1, branch end). We will first obtain the boundary conditions for np at ξk = 0. Let τf
be the time of void nucleation at this barrier. Then, the corresponding boundary condition is
[using (3.9) and (3.10)]
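\[
\frac{\partial \eta_{k,0}}{\partial \xi_k} =
\begin{cases}
\alpha_k & \tau < \tau_f, \\
\eta_{k,0}\,(L_k/\delta) & \tau \ge \tau_f,
\end{cases} \tag{3.22}
\]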
where ηk,0 corresponds to σk,p in (3.10), with ηk,0 = ηth = Ωσth/(kbT⋆m) at τ = τf . Using the
central difference approximation, we get
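\[
\frac{\eta_{k,1} - \eta_{k,-1}}{2\Delta\xi} =
\begin{cases}
\alpha_k & \tau < \tau_f, \\
\eta_{k,0}\,(L_k/\delta) & \tau \ge \tau_f,
\end{cases} \tag{3.23}
\]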
which can be easily solved for ηk,−1
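\[
\eta_{k,-1} =
\begin{cases}
\eta_{k,1} - 2\Delta\xi\,\alpha_k & \tau < \tau_f, \\
\eta_{k,1} - 2\Delta\xi\,\eta_{k,0}\,(L_k/\delta) & \tau \ge \tau_f.
\end{cases} \tag{3.24}
\]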
Similarly, for a diffusion barrier at ξk = 1, we get
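\[
\eta_{k,N+1} =
\begin{cases}
\eta_{k,N-1} + 2\Delta\xi\,\alpha_k & \tau < \tau_f, \\
\eta_{k,N-1} - 2\Delta\xi\,\eta_{k,N}\,(L_k/\delta) & \tau \ge \tau_f.
\end{cases} \tag{3.25}
\]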
3.4.4 Boundary Conditions at Dotted-I junction
Consider a dotted-I junction np. Without loss of generality, we will assume that np is at the
end of branch 1 and at the beginning of branch 2. To formulate the ODE at np, we need the
value of at least one of the ghost points (η1,N+1 or η2,−1). Let τf be the time of void nucleation
at this junction. Then, using (3.9), we have for τ < τf (h1 = h2 within a metal layer)
w1Ja,1,N − w2Ja,2,0 = 0. (3.26)
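Define $\Delta\eta_{1,N} \triangleq \eta_{1,N+1} - \eta_{1,N-1}$ and $\Delta\eta_{2,0} \triangleq \eta_{2,1} - \eta_{2,-1}$. Then, substituting the expression for
atomic flux from (3.20) in (3.26), we get
\[
\begin{aligned}
& \frac{w_1 D_{a,1}\, C\, T_m^\star}{L_1 T_{m,1}} \left( \frac{\Delta\eta_{1,N}}{2\Delta\xi} - \alpha_1 \right)
 - \frac{w_2 D_{a,2}\, C\, T_m^\star}{L_2 T_{m,2}} \left( \frac{\Delta\eta_{2,0}}{2\Delta\xi} - \alpha_2 \right) = 0 \\
\implies\ & \Delta\eta_{1,N} - \gamma_{21}\, \Delta\eta_{2,0} = u_1,
\end{aligned}
\tag{3.27}
\]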
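where $u_1 = 2\Delta\xi\,(\alpha_1 - \gamma_{21}\alpha_2)$. Also, from law 1, $\eta_{1,N} = \eta_{2,0}$ when $\tau < \tau_f$. Hence, the time rate
of change of stress should also be the same, so that using (3.16)
\[
\frac{\partial \eta_{1,N}}{\partial \tau} = \frac{\partial \eta_{2,0}}{\partial \tau}
\implies \frac{\partial^2 \eta_{1,N}}{\partial \xi_1^2} = \frac{\theta_2}{\theta_1} \frac{\partial^2 \eta_{2,0}}{\partial \xi_2^2}
\quad \text{for } \tau < \tau_f. \tag{3.28}
\]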
Applying the central difference formula in (3.28), we get
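\[
\begin{aligned}
& \frac{\eta_{1,N+1} + \eta_{1,N-1} - 2\eta_{1,N}}{(\Delta\xi)^2}
 = \frac{\theta_2}{\theta_1} \left( \frac{\eta_{2,1} + \eta_{2,-1} - 2\eta_{2,0}}{(\Delta\xi)^2} \right) \\
\implies\ & \Delta\eta_{1,N} + 2(\eta_{1,N-1} - \eta_{1,N}) = r_{12}^2\, p_{21} \left( -\Delta\eta_{2,0} + 2(\eta_{2,1} - \eta_{2,0}) \right) \\
\implies\ & \Delta\eta_{1,N} + r_{12}^2\, p_{21}\, \Delta\eta_{2,0} = u_2,
\end{aligned}
\tag{3.29}
\]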
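where $u_2 = 2\left( r_{12}^2\, p_{21}\, \eta_{2,1} - \eta_{1,N-1} + (1 - r_{12}^2\, p_{21})\, \eta_{1,N} \right)$. Solving for $\Delta\eta_{1,N}$ and $\Delta\eta_{2,0}$ from (3.27)
and (3.29), we get
\[
\Delta\eta_{1,N} = \frac{r_{12} u_1 + w_{21} u_2}{r_{12} + w_{21}}, \qquad
\Delta\eta_{2,0} = -\frac{u_1 - u_2}{r_{12}\, p_{21}\, (r_{12} + w_{21})}. \tag{3.30}
\]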
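Thus, the final expressions for the ghost points $\eta_{1,N+1}$ and $\eta_{2,-1}$ are
\[
\begin{aligned}
\eta_{1,N+1} &= \eta_{1,N-1} + \frac{r_{12} u_1 + w_{21} u_2}{r_{12} + w_{21}}, && \text{(3.31a)} \\
\eta_{2,-1} &= \eta_{2,1} + \frac{u_1 - u_2}{r_{12}\, p_{21}\, (r_{12} + w_{21})}. && \text{(3.31b)}
\end{aligned}
\]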
Once a void nucleates at np, it is treated as a diffusion barrier for all connected branches.
Thus, for τ ≥ τf , the boundary conditions are given by
η1,N+1 = η1,N−1 − 2∆ξη1,N (L1/δ), (3.32a)
η2,−1 = η2,1 − 2∆ξη2,0(L2/δ). (3.32b)
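The closed-form ghost points (3.31) can be checked numerically against the two conditions they were derived from. The sketch below is illustrative: the function name and argument order are assumptions, and the junction stress is passed once because law 1 makes η1,N and η2,0 equal before nucleation.

```python
def dotted_i_ghost_points(eta1_nm1, eta_junction, eta2_1,
                          alpha1, alpha2, r12, w21, p21, d_xi):
    """Ghost points (eta_{1,N+1}, eta_{2,-1}) at a dotted-I junction for tau < tau_f.

    eta1_nm1 is eta_{1,N-1}, eta_junction is eta_{1,N} = eta_{2,0} (law 1)
    and eta2_1 is eta_{2,1}.
    """
    gamma21 = r12 * w21 * p21                      # gamma_21 from (3.21)
    u1 = 2.0 * d_xi * (alpha1 - gamma21 * alpha2)  # right-hand side of (3.27)
    ratio = r12 ** 2 * p21                         # theta_2 / theta_1
    u2 = 2.0 * (ratio * eta2_1 - eta1_nm1 + (1.0 - ratio) * eta_junction)
    eta1_np1 = eta1_nm1 + (r12 * u1 + w21 * u2) / (r12 + w21)      # (3.31a)
    eta2_m1 = eta2_1 + (u1 - u2) / (r12 * p21 * (r12 + w21))       # (3.31b)
    return eta1_np1, eta2_m1
```

When the two branches are identical (r12 = w21 = p21 = 1 and α1 = α2), the function returns η1,N+1 = η2,1 and η2,−1 = η1,N−1, so the junction reduces to an ordinary interior grid point, as one would expect.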
3.4.5 Boundary Conditions at T junction
Consider a T junction np. Similar to the dotted-I junction, we will assume that np is at the
end of branch 1 and at the beginning of branches 2 and 3. To complete the ODE formulation
at np, we need the value of at least one of the ghost points (η1,N+1, η2,−1 or η3,−1). Let τf be
the time of void nucleation at this junction. Then, using (3.9), we get (h1 = h2 = h3 within a
metal layer)
w1Ja,1,N − w2Ja,2,0 − w3Ja,3,0 = 0 for τ < τf . (3.33)
Also, for τ < τf , stress should be continuous across np (law 1), so that η1,N = η2,0 = η3,0, which
gives [using (3.16)]
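\[
\frac{\partial \eta_{1,N}}{\partial \tau} = \frac{\partial \eta_{k,0}}{\partial \tau}
\implies \frac{\partial^2 \eta_{1,N}}{\partial \xi_1^2} = \frac{\theta_k}{\theta_1} \frac{\partial^2 \eta_{k,0}}{\partial \xi_k^2}
\quad \text{for } \tau < \tau_f,\ k = 2, 3. \tag{3.34}
\]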
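As before, we substitute the expression of atomic flux from (3.20) in (3.33) and apply the central difference formula in (3.34) to obtain the value of ghost points. We omit the complete derivation and only present the final values
\[
\begin{aligned}
\eta_{1,N+1} &= \eta_{1,N-1} + \frac{u_1 r_{12} r_{13} + u_2 r_{13} w_{21} + u_3 r_{12} w_{31}}{r_{12} r_{13} + r_{13} w_{21} + r_{12} w_{31}}, && \text{(3.35a)} \\
\eta_{2,-1} &= \eta_{2,1} + \frac{u_1 r_{13} - u_2 (r_{13} + w_{31}) + u_3 w_{31}}{r_{12}\, p_{21}\, (r_{12} r_{13} + r_{13} w_{21} + r_{12} w_{31})}, && \text{(3.35b)} \\
\eta_{3,-1} &= \eta_{3,1} + \frac{u_1 r_{12} + u_2 w_{21} - u_3 (r_{12} + w_{21})}{r_{13}\, p_{31}\, (r_{12} r_{13} + r_{13} w_{21} + r_{12} w_{31})}, && \text{(3.35c)}
\end{aligned}
\]
where $u_1 = 2\Delta\xi\,(\alpha_1 - \gamma_{21}\alpha_2 - \gamma_{31}\alpha_3)$, and $u_k = 2\left(r_{1k}^2 p_{k1}\eta_{k,1} - \eta_{1,N-1} + (1 - r_{1k}^2 p_{k1})\eta_{1,N}\right)$ for $k = 2, 3$. Using law 3, np is treated as a diffusion barrier during the void growth phase, so that for
τ ≥ τf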
η1,N+1 = η1,N−1 − 2∆ξη1,N (L1/δ), (3.36a)
ηk,−1 = ηk,1 − 2∆ξηk,0(Lk/δ), k = 2, 3. (3.36b)
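The derivation behind (3.35) can be sketched by analogy with the dotted-I case. The steps below assume that (3.33) and (3.34) discretize into the same form as (3.27) and (3.29), with $\Delta\eta_{1,N} = \eta_{1,N+1} - \eta_{1,N-1}$ and $\Delta\eta_{k,0} = \eta_{k,1} - \eta_{k,-1}$; this is an inferred reconstruction, not the author's own derivation.

\[
\begin{aligned}
\Delta\eta_{1,N} - \gamma_{21}\Delta\eta_{2,0} - \gamma_{31}\Delta\eta_{3,0} &= u_1, \\
\Delta\eta_{1,N} + r_{1k}^2\, p_{k1}\,\Delta\eta_{k,0} &= u_k, \qquad k = 2, 3.
\end{aligned}
\]

Solving the second relation for $\Delta\eta_{k,0}$, substituting into the first and using $\gamma_{k1} = r_{1k} w_{k1} p_{k1}$ gives

\[
\Delta\eta_{1,N}\left(1 + \frac{w_{21}}{r_{12}} + \frac{w_{31}}{r_{13}}\right) = u_1 + \frac{w_{21}}{r_{12}}u_2 + \frac{w_{31}}{r_{13}}u_3 .
\]

Multiplying through by $r_{12}r_{13}$ yields the fraction in (3.35a). The remaining ghost points follow from $\Delta\eta_{k,0} = (u_k - \Delta\eta_{1,N})/(r_{1k}^2 p_{k1})$ and $\eta_{k,-1} = \eta_{k,1} - \Delta\eta_{k,0}$, which reduce to (3.35b) and (3.35c).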
3.4.6 Boundary Conditions at Plus junction
The boundary conditions for the plus junction can be obtained by following the same procedure
as before. Consider a plus junction np, which is at the end of branch 1 and at the beginning
of branches 2, 3 and 4. Let τf be the time of void nucleation at this junction. Then, using law
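1 and equations (3.9) and (3.16), we have for τ < τf
\[
w_1 J_{a,1,N} - w_2 J_{a,2,0} - w_3 J_{a,3,0} - w_4 J_{a,4,0} = 0, \tag{3.37}
\]
\[
\frac{\partial \eta_{1,N}}{\partial \tau} = \frac{\partial \eta_{k,0}}{\partial \tau}
\;\Longrightarrow\;
\frac{\partial^2 \eta_{1,N}}{\partial \xi_1^2} = \frac{\theta_k}{\theta_1}\,\frac{\partial^2 \eta_{k,0}}{\partial \xi_k^2},
\qquad k = 2, 3, 4. \tag{3.38}
\]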
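Solving as before, we can obtain the value of ghost points, which are as shown here
\[
\begin{aligned}
\eta_{1,N+1} &= \eta_{1,N-1} + \frac{u_1 r_{12} r_{13} r_{14} + u_2 r_{13} r_{14} w_{21} + u_3 r_{12} r_{14} w_{31} + u_4 r_{12} r_{13} w_{41}}{r_{12} r_{13} r_{14} + r_{13} r_{14} w_{21} + r_{12} r_{14} w_{31} + r_{12} r_{13} w_{41}}, && \text{(3.39a)} \\
\eta_{2,-1} &= \eta_{2,1} + \frac{u_1 r_{13} r_{14} - u_2 (r_{13} r_{14} + r_{14} w_{31} + r_{13} w_{41}) + u_3 r_{14} w_{31} + u_4 r_{13} w_{41}}{r_{12}\, p_{21}\, (r_{12} r_{13} r_{14} + r_{13} r_{14} w_{21} + r_{12} r_{14} w_{31} + r_{12} r_{13} w_{41})}, && \text{(3.39b)} \\
\eta_{3,-1} &= \eta_{3,1} + \frac{u_1 r_{12} r_{14} + u_2 r_{14} w_{21} - u_3 (r_{12} r_{14} + r_{14} w_{21} + r_{12} w_{41}) + u_4 r_{12} w_{41}}{r_{13}\, p_{31}\, (r_{12} r_{13} r_{14} + r_{13} r_{14} w_{21} + r_{12} r_{14} w_{31} + r_{12} r_{13} w_{41})}, && \text{(3.39c)} \\
\eta_{4,-1} &= \eta_{4,1} + \frac{u_1 r_{12} r_{13} + u_2 r_{13} w_{21} + u_3 r_{12} w_{31} - u_4 (r_{12} r_{13} + r_{13} w_{21} + r_{12} w_{31})}{r_{14}\, p_{41}\, (r_{12} r_{13} r_{14} + r_{13} r_{14} w_{21} + r_{12} r_{14} w_{31} + r_{12} r_{13} w_{41})}, && \text{(3.39d)}
\end{aligned}
\]
where $u_1 = 2\Delta\xi\,(\alpha_1 - \gamma_{21}\alpha_2 - \gamma_{31}\alpha_3 - \gamma_{41}\alpha_4)$, and $u_k = 2\left(r_{1k}^2 p_{k1}\eta_{k,1} - \eta_{1,N-1} + (1 - r_{1k}^2 p_{k1})\eta_{1,N}\right)$ for $k = 2, 3, 4$.

Figure 3.5: For Td, (a) evolution of stress at junctions with time and (b) stress profile with time. [Panel (a): stress (MPa) versus time (yrs). Panel (b): stress (MPa) versus x (10^-6 m) at 0.00, 0.59, 2.98, 4.76, 4.79 and 6.00 yrs.]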
Using law 3, np is treated as a diffusion barrier during the void growth phase. Thus, for
τ ≥ τf
η1,N+1 = η1,N−1 − 2∆ξη1,N (L1/δ), (3.40a)
ηk,−1 = ηk,1 − 2∆ξηk,0(Lk/δ), k = 2, 3, 4. (3.40b)
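The closed forms (3.31), (3.35) and (3.39) share one pattern, which the following sketch solves for a junction of any degree d. It is an illustration under the same assumption as above, namely that the junction conditions reduce to Δη1 − Σ γk1·Δηk = u1 and Δη1 + r1k²·pk1·Δηk = uk; the function names and list-based interface are hypothetical.

```python
def junction_deltas(u1, uk, r1k, wk1, pk1):
    """Ghost-point differences at an un-voided junction of degree d.

    uk, r1k, wk1, pk1 are lists over the outgoing branches k = 2..d.
    Returns (d_eta1, [d_eta_k]) with d_eta1 = eta_{1,N+1} - eta_{1,N-1}
    and d_eta_k = eta_{k,1} - eta_{k,-1}.
    """
    weights = [w / r for w, r in zip(wk1, r1k)]          # w_k1 / r_1k
    d_eta1 = (u1 + sum(c * u for c, u in zip(weights, uk))) / (1.0 + sum(weights))
    d_etak = [(u - d_eta1) / (r * r * p) for u, r, p in zip(uk, r1k, pk1)]
    return d_eta1, d_etak


def ghost_points(eta1_Nm1, etak_1, d_eta1, d_etak):
    """Apply the differences: eta_{1,N+1} and the list of eta_{k,-1}."""
    eta1_Np1 = eta1_Nm1 + d_eta1
    etak_m1 = [e - d for e, d in zip(etak_1, d_etak)]
    return eta1_Np1, etak_m1
```

For d = 4 the result can be compared term by term against (3.39a) and (3.39d); for d = 2 it reduces to (3.30).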
The IVP formulation is completed by eliminating the ghost points from the ODEs at junctions
by using (3.24), (3.25), (3.31), (3.32), (3.35), (3.36), (3.39) and (3.40). Fig. 3.5 shows the
solution obtained using the IVP formulation for tree Td of Fig. 3.3, with L1 = L2 = 50 µm, and
j1 = −j2 = 6 × 10^9 A/m^2. In this scenario, since the electronic flux moves away from junction
n2 in both branches, it develops tensile stress which ultimately leads to void nucleation.
3.5 Verifying EKM and the IVP formulation
In this section, we will first verify the IVP formulation of EKM by comparing our numerical
results with known analytical solutions for simple interconnect trees. Then, we will compare
the lifetime estimates of Extended Korhonen’s model with experimental results published in
the literature to verify the model itself. We will use a standard variable time step Runge-Kutta
method with the Butcher tableau as given by Dormand and Prince [71] for integrating the IVPs
obtained.
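For readers unfamiliar with the integrator, the following is a minimal pure-Python sketch of an adaptive Runge-Kutta step using the Dormand-Prince 5(4) tableau. The tableau coefficients are the standard published ones; the step-size controller (safety factor 0.9, growth limits 0.2 and 5) and all names are assumptions made for this illustration and are not taken from the thesis implementation.

```python
# Dormand-Prince 5(4) Butcher tableau (nodes, stage matrix, 5th/4th order weights)
DP_C = [0.0, 1/5, 3/10, 4/5, 8/9, 1.0, 1.0]
DP_A = [
    [],
    [1/5],
    [3/40, 9/40],
    [44/45, -56/15, 32/9],
    [19372/6561, -25360/2187, 64448/6561, -212/729],
    [9017/3168, -355/33, 46732/5247, 49/176, -5103/18656],
    [35/384, 0.0, 500/1113, 125/192, -2187/6784, 11/84],
]
DP_B5 = [35/384, 0.0, 500/1113, 125/192, -2187/6784, 11/84, 0.0]
DP_B4 = [5179/57600, 0.0, 7571/16695, 393/640, -92097/339200, 187/2100, 1/40]


def dopri5(f, t0, y0, t_end, rtol=1e-6, atol=1e-9, h0=1e-3):
    """Integrate y' = f(t, y) from t0 to t_end; y is a list of floats."""
    t, y, h = t0, list(y0), h0
    n = len(y)
    while t < t_end:
        h = min(h, t_end - t)
        k = []
        for i in range(7):  # evaluate the seven stages
            yi = [y[j] + h * sum(DP_A[i][m] * k[m][j] for m in range(i))
                  for j in range(n)]
            k.append(f(t + DP_C[i] * h, yi))
        y5 = [y[j] + h * sum(DP_B5[i] * k[i][j] for i in range(7)) for j in range(n)]
        y4 = [y[j] + h * sum(DP_B4[i] * k[i][j] for i in range(7)) for j in range(n)]
        # Scaled error estimate from the embedded 4th order solution
        err = max(abs(y5[j] - y4[j]) / (atol + rtol * max(abs(y[j]), abs(y5[j])))
                  for j in range(n))
        if err <= 1.0:          # accept the step, keep the 5th order result
            t, y = t + h, y5
        factor = 0.9 * err ** -0.2 if err > 0.0 else 5.0
        h *= min(5.0, max(0.2, factor))
    return y
```

In this setting `f` would return the right-hand side of the junction and branch ODEs after the ghost points have been eliminated.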
Figure 3.6: Tree with a (a) dotted-I junction and (b) T junction.
3.5.1 Verifying the numerical approach
As mentioned in the background, analytical solutions are known for 1) a finite line with blocked
boundary conditions, as provided by Korhonen [16] and 2) for simple interconnect trees (with
some simplifying assumptions) as given by the CTHKS model [44]. We will compare our
numerical solution with both analytical solutions.
First, we compare our solution with the analytical solution of the CTHKS model. Chen et
al. [44] compared the solution of CTHKS model to the solution obtained by using COMSOL,
an industry standard PDE solver, and reported a maximum error of 0.5%. Thus, a comparison
with the CTHKS model would provide an indirect comparison of our numerical method with
COMSOL.
CTHKS model makes some simplifying assumptions to derive the analytical solution, which
are listed in Section 2.3.4. In order to compare EKM and CTHKS model, we make the same
simplifying assumptions. We will use the simple interconnect trees shown in Fig. 3.6, with all
branch lengths being L = 50 µm. We will use N = 20 discretizations per branch to formulate
the IVP (a higher value of N gives a more accurate solution and vice versa). The initial
branch current densities for the dotted-I structure Td are assumed to be j1 = 1 × 10^9 A/m^2
and j2 = −2 × 10^9 A/m^2. Fig. 3.7a compares the stress evolution at the junctions of tree Td with time as obtained using EKM and the CTHKS model. In Fig. 3.7b, we plot the percent error
between the stress values against the CTHKS solution, i.e. if σEKM(x, t) and σCTHKS(x, t) represent
the solutions obtained using EKM and the CTHKS model, respectively, at some discrete point x
at time t, then a blue dot is the point (σCTHKS(x, t), 100 × (σEKM(x, t) − σCTHKS(x, t))/σCTHKS(x, t)). The maximum absolute error between the solutions obtained using the CTHKS model and
EKM is 0.5 MPa, i.e. max |σCTHKS(x, t) − σEKM(x, t)| = 0.5 MPa. The red and black lines show
the contour for the maximum absolute error. This kind of plot is known as the error rate plot,
and we will frequently use it to show the error between two quantities. The percentage errors
are high when the stress values are close to 0, which is to be expected. For the T-structure T⊥, the current densities are j1 = 0.9 × 10^9 A/m^2, j2 = −2 × 10^9 A/m^2 and j3 = −0.8 × 10^9 A/m^2.
The comparison of stress evolution at the junctions is shown in Fig. 3.8a. The error rate plot
in Fig. 3.8b shows a maximum absolute error of 0.9 MPa. This demonstrates that the results
obtained from the EKM are in excellent agreement with the CTHKS model, and by extension
COMSOL. For a better comprehension of how the stress profile in T⊥ varies over time, we show
a 3D plot of stress evolution in Fig. 3.9.

Figure 3.7: (a) Comparing stress evolution for a dotted-I structure as obtained using EKM and the CTHKS model, and (b) the error rate plot with respect to the CTHKS solution. [Panel (a): stress (MPa) versus time (yrs) at junctions n1, n2 and n3. Panel (b): percent error versus stress (MPa), with contours at ±0.502 MPa.]

Figure 3.8: (a) Comparing stress evolution for a T-structure as obtained using EKM and the CTHKS model, and (b) the error rate plot with respect to the CTHKS solution. [Panel (a): stress (MPa) versus time (yrs) at junctions n1, n2, n3 and n4. Panel (b): percent error versus stress (MPa), with contours at ±0.916 MPa.]
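The data behind an error rate plot, as defined above, can be assembled as in the following sketch. The function names are hypothetical, and plotting is left out so that the example stays self-contained.

```python
def error_rate_points(ref, est):
    """Points (reference value, percent error) and the maximum absolute error.

    ref and est are equal-length sequences, e.g. CTHKS and EKM stress values
    sampled at the same positions and times.
    """
    points = []
    for s_ref, s_est in zip(ref, est):
        if s_ref != 0.0:  # percent error is undefined at zero reference stress
            points.append((s_ref, 100.0 * (s_est - s_ref) / s_ref))
    max_abs = max(abs(s_est - s_ref) for s_ref, s_est in zip(ref, est))
    return points, max_abs


def max_error_contour(s_ref, max_abs):
    """Percent error that corresponds to an absolute error of max_abs at s_ref."""
    return 100.0 * max_abs / abs(s_ref)
```

Because the contour scales as 1/|σ|, a fixed absolute error maps to a large percentage near zero stress, which is why the percentage errors grow there.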
For comparison with the reference solution proposed by Korhonen, we use the finite line as
shown in Fig. 3.10 with L = 40 µm and j = 2 × 10^9 A/m^2. The initial thermal stress is assumed
to be 434.7 MPa. We use N=40 discretizations per branch. The stress values are computed
for each discretized point in the line for 200 equidistant time points between 0-50 years. In
Fig. 3.11a, we compare the stress evolution at the junctions and at x = L/2 (with time) as
obtained using the reference solution (2.11) and the numerical solution of the IVP formulation.
The error rate plot (Fig. 3.11b) for stress values at all discretized points over time with respect
to the stress values obtained using the reference solution shows a maximum absolute error of
∼1.37 MPa. In this case, the maximum percent error is approximately 0.34%, which shows
that our numerical solution for a finite line is very close to the reference solution.
Figure 3.9: Stress profile across the T-structure with time. [Axes: x (µm), y (µm) and stress (MPa); profiles at t = 0.20, 0.80, 1.80, 3.80, 5.80 and 10.00 yrs.]

Figure 3.10: Schematic of a finite line.

Figure 3.11: (a) Comparing stress evolution for a finite line as obtained using EKM and the reference solution, and (b) the error rate plot with respect to the reference solution. [Panel (a): stress (MPa) versus time (yrs) at n1, n2 and L/2. Panel (b): percent error versus stress (MPa), with contours at ±1.368 MPa.]
3.5.2 Verifying the model
We will now verify the model itself using previously published experimental data by Gan et al. [5]
and Moreau et al. [1].
Gan et al. conducted experiments using the dotted-I structure shown in Fig. 3.6a, with each
branch being L = 250 µm long. They used 5 different current density configurations, which are
listed as follows:
i) j1 = 2.5 × 10^10 A/m^2, j2 = 2.5 × 10^10 A/m^2.
ii) j1 = 2.5 × 10^10 A/m^2, j2 = 0 A/m^2.
iii) j1 = 2.5 × 10^10 A/m^2, j2 = 0.5 × 10^10 A/m^2.
iv) j1 = 2.5 × 10^10 A/m^2, j2 = −0.5 × 10^10 A/m^2.
v) j1 = 2.5 × 10^10 A/m^2, j2 = −2.5 × 10^10 A/m^2.
The failure was determined based on a 30% increase in the branch resistance.

Figure 3.12: Comparing the estimated MTF and its 95% confidence bounds as obtained using EKM with the ones reported by Gan et al. [5]. Note that the confidence bounds get tighter as the number of TTF samples is increased. [Plot: time (hours) versus experiment number (i)-(v); series: Experiment (Gan et al.), Simulation (EKM), Used for calibration, Black’s model MTF.]
We use data from configuration (i) to calibrate EKM: since EKM assumes that voids im-
mediately reach their steady state after nucleation, we empirically choose an appropriate value
for the mean diffusivity, so that the mean void nucleation times (as estimated using EKM) are
equal to the mean failure times reported in [5]. We then use the calibrated model to estimate
the MTF for all the remaining configurations. The MTF and confidence bounds estimated
using EKM are based on 100 TTF samples, where each TTF sample is obtained by assign-
ing lognormally generated diffusivities to the branches of the tree. The results are shown in
Fig. 3.12, where the comparison is made in terms of the MTF and its 95% confidence bounds.
As can be seen, EKM gives a conservative estimate for configuration (ii) and a close enough
MTF estimate for configurations (iii), (iv) and (v), which is well within the experimentally
determined 95% confidence bounds. Since branch b2 has zero current density in configuration
(ii), it takes longer to reach the 30% resistance increase, which is not accounted for in EKM
because it assumes that the voids immediately achieve their steady state volume. Simulating
with EKM, we found that for configurations (ii), (iv) and (v), the voids nucleate only at junction
n2, and for configuration (iii), the voids nucleate at junctions n2 and n3, which is the same
as observed by Gan et al. in [5]. On the other hand, if Black’s model were calibrated using
configuration (i), it would predict the same MTF for b1 in all cases (shown by the dashed line
in Fig. 3.12) regardless of the current density j2, which is pessimistic in this case.

Figure 3.13: (a) Schematic view of the test structure used in [1], and (b) upstream and downstream configurations as defined with respect to the left via. Both figures taken from [1]. Here, TiN (Titanium Nitride) is used for barrier liner and SiN (Silicon Nitride) is used for capping.

Table 3.1: Comparison of upstream-to-downstream MTF ratio as reported in [1] and as estimated using EKM. Here µ denotes the MTF and stdev. its standard deviation; subscripts u and d denote upstream and downstream; the ratio columns are µu/µd.

Temp.  ib    | Upstream                        | Downstream                      | Ratio µu/µd
(C)    (mA)  | Experiment [1]   EKM            | Experiment [1]   EKM            | Exp.   EKM
             | µ       stdev.   µ       stdev. | µ       stdev.   µ       stdev. |
250    15    | 1294.5  0.33     1292.3  0.34   | –       –        480.43  0.37   | –      2.67
250    25    | 979.3   0.28     617.2   0.34   | 378.9   0.25     223.35  0.37   | 2.58   2.76
300    10    | 748.7   0.34     834.8   0.29   | 203.9   0.36     351.25  0.33   | 3.67   2.38
300    20    | 348.5   0.22     332.3   0.30   | –       –        120.53  0.31   | –      2.76
350    15    | 184.6   0.25     211.3   0.26   | 48.5    0.28     84.51   0.28   | 3.81   2.50
350    25    | 98.3    0.18     110.6   0.28   | –       –        40.05   0.30   | –      2.76
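The statistical step used for the EKM estimates, an MTF with 95% confidence bounds from a set of TTF samples, can be sketched as follows. This is an illustration only: the normal-approximation interval is an assumption (the thesis does not state here how its bounds are computed), and `placeholder_ttf_samples` is a hypothetical stand-in that draws lognormal TTFs directly instead of running EKM with lognormally generated diffusivities.

```python
import math
import random
import statistics


def mtf_with_bounds(ttf_samples, z=1.96):
    """Sample mean of the TTFs and a normal-approximation 95% interval."""
    n = len(ttf_samples)
    mean = statistics.mean(ttf_samples)
    half_width = z * statistics.stdev(ttf_samples) / math.sqrt(n)
    return mean, mean - half_width, mean + half_width


def placeholder_ttf_samples(n, rng, median_ttf=100.0, sigma=0.3):
    """Hypothetical replacement for n EKM runs (values are made up)."""
    return [rng.lognormvariate(math.log(median_ttf), sigma) for _ in range(n)]


rng = random.Random(0)
for n in (100, 1600):
    mtf, low, high = mtf_with_bounds(placeholder_ttf_samples(n, rng))
    print(f"n={n}: MTF={mtf:.1f}, 95% bounds=({low:.1f}, {high:.1f})")
```

The interval half-width shrinks roughly as 1/√n, which matches the remark in the caption of Fig. 3.12 that the bounds tighten as more TTF samples are used.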
Moreau et al. [1] conducted experiments to find out the impact of redundant through silicon
vias or TSVs on EM lifetimes. Redundant TSVs are often used as simple design solutions to
increase the EM lifetime. The schematic view of their test structure is shown in Fig. 3.13a.
They used two configurations, an upstream configuration where the direction of the electron
flow was towards the RDL (redistribution layer) and a downstream configuration in which the
direction of the electron flow was away from the RDL, as shown in Fig. 3.13b. For the up-
stream configuration, voids will nucleate below the right hand side vias and for the downstream
configuration, voids will nucleate below the left hand side via. Let ib represent the magnitude
of current flowing in the 500 µm branch. In their experiments, it was observed that even if ib
is kept at the same magnitude in both configurations, the presence of redundant TSVs on the
right side improved the EM lifetime by 2-4x in the upstream configuration due to the metal
reservoir effect. To simulate this metal structure using EKM, we first calibrate it based on data
available for the upstream configuration at T = 250 °C and ib = 15 mA. Using this calibrated
model, we estimate the MTF for all the other configurations. Ideally one should use the data
at different temperature points to obtain a more accurate calibration, but in this case we are
interested in the ratio of MTF as observed in the upstream and downstream configurations,
rather than the actual MTF values. The results are tabulated in Table 3.1. As can be seen,
EKM consistently predicts that the MTF in the upstream configuration is 2-3x longer than in the
downstream configuration, which is similar to the reported results. Since there are 4 redundant
TSVs, a ratio close to 4 was to be expected. This effect cannot be modelled by only using
Black’s model as it simply depends on the current density of a given line, and thus would give
the same MTF in both configurations.

Figure 3.14: (a) Initial current density profile for T1 and heat map showing MTFs estimated using (b) Extended Korhonen’s model (MTFekm), (c) Black’s model (MTFblk) and (d) MTFblk − MTFekm. All MTF values are in years. [Panel (a): j (10^9 A/m^2) versus branch number. Panel (d): MTF difference (yrs) versus branch number.]
3.6 Comparison between EKM and Black’s model
In this section, we will compare the MTFs estimated using Black’s model and EKM for two
interconnect trees denoted as T1 and T2, extracted from the IBM power grid benchmarks [26].
Figure 3.15: (a) Initial current density profile for T2 and heat map showing MTFs estimated using (b) Extended Korhonen’s model (MTFekm), (c) Black’s model (MTFblk) and (d) MTFblk − MTFekm. All MTF values are in years. [Panel (a): j (10^9 A/m^2) versus branch number. Panel (d): MTF difference (yrs) versus branch number.]
Both trees are straight metal stripes, consisting of 193 junctions (2 diffusion barriers and 191
dotted-I junctions) and 192 branches each. For a fair comparison, we calibrate Black’s model
based on data obtained from Korhonen’s model, so that for a finite line, the MTF predicted
by Black’s model and EKM are the same. Since Black’s model gives branch MTFs and EKM
computes junction MTFs, we report the junction MTFs as the MTF for all connected branches
for this comparison. The MTF estimate from EKM is the arithmetic average of 100 TTF
samples, where each TTF sample is obtained by assigning lognormally generated diffusivities
to all branches in the tree and simulating the tree up to 100 years.
Tree T1 has a high current density profile, with maximum initial branch current density being
5.1 × 10^9 A/m^2 (Fig. 3.14a). In this case, the calibrated Black’s model estimates the smallest
MTF to be around 6 yrs, whereas the smallest MTF found using the Extended Korhonen’s
model is around 24 yrs, which is ∼ 4x longer. Fig. 3.14b and 3.14c show the heat map of MTFs
of all branches within the tree as estimated using EKM and Black’s model and Fig. 3.14d shows
the difference in the estimated values. This scenario clearly shows that Black’s model can be
highly pessimistic.
Next, consider tree T2 which has a low current density profile, with maximum initial branch
current density being 1.5 × 10^9 A/m^2 (Fig. 3.15a). Here, due to the Blech effect, Black’s model
predicts that no failure should occur. However, due to the material flow between the branches,
we found that the smallest MTF would be around 2.2 yrs. This test case shows that Black’s
model can also be highly optimistic for a tree, especially when it has a low current density
profile. Similar to the previous figure, Fig. 3.15b and 3.15c show the heat map of MTFs of all
branches within the tree as estimated using EKM and Black’s model and Fig. 3.15d shows the
difference in the estimated values.

Figure 3.16: (a) The actual temperature profile and the assumed nominal temperature distribution. Heat map showing MTFs estimated with (b) actual temperature profile ($MTF_T$), (c) assuming Tm,k = 327.6K for all branches ($MTF_{\bar{T}}$) and (d) $MTF_T − MTF_{\bar{T}}$. All MTF values are in years. [Panel (a): temperature (K) versus branch number. Panel (d): MTF difference (yrs) versus junction ID.]
3.7 Importance of Temperature distribution
In this section, we will explore the effect of temperature on the lifetimes estimated using EKM.
For this study, we will use tree T1. The junction MTFs are obtained by taking average of
100 TTF samples, where each TTF sample is obtained by assigning lognormally generated
diffusivities to all branches in the tree and simulating the tree up to 100 years.
We first estimate the MTFs using the actual temperature distribution, as shown in Fig. 3.16a.
This temperature distribution was obtained by using compact thermal models (the detailed pro-
cedure is presented in Section 6.3). For this case, the smallest MTF was observed to be around
24 years. Now, we artificially assume a constant temperature of 327.6K throughout the tree,
i.e. Tm,k = 327.6K ∀k. Note that 327.6K is the average of the actual branch temperatures. In
this case, the first failure happens around 22.5 yrs, which is close enough to the actual smallest
MTF. However, the similarity ends here, with all subsequent junction MTFs being different
from each other. In particular, the actual MTFs are lower for branches 1-50 and 140-192 where
the actual temperature is more than 327.6K and are higher for branches 51-140 where the
actual temperature is less than 327.6K (see Fig. 3.16b, 3.16c and 3.16d). This shows that a
single nominal temperature cannot model the effect of an uneven temperature distribution. A
higher nominal temperature would result in lower MTF values for all junctions and vice versa.
This is shown in Fig. 3.17, where we show the MTF computed with different values of nominal
temperature. The minimum MTFs for test cases with Tm,k as 315K, 327.6K and 340K ∀k are
60.1 years, 22.5 years and 9.07 years, respectively. Hence, temperature distribution plays a very
important role and should be taken into account while doing EM lifetime analysis.

Figure 3.17: Estimated MTF as per EKM using (a) the actual temperature profile, and assuming the temperature to be (b) 315K, (c) 327.6K and (d) 340K for all branches. The x-axis for all plots represents the junction IDs. Junctions with MTF ≥ 100 years have not been shown. [Each panel: MTF (yr) versus junction ID.]
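As a rough sanity check on the temperature sensitivity quoted above, one can compute the apparent Arrhenius slope between two of the reported (temperature, minimum MTF) pairs. This is an illustrative calculation only: it assumes MTF ∝ exp(Ea/(kb·T)), a simplification that ignores the other temperature-dependent factors in EKM, and the function name is hypothetical.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant in eV/K


def apparent_activation_energy(temp1, mtf1, temp2, mtf2):
    """Apparent Ea in eV, assuming MTF is proportional to exp(Ea / (k_b T))."""
    return K_B_EV * math.log(mtf1 / mtf2) / (1.0 / temp1 - 1.0 / temp2)


# Minimum MTFs reported in the text for uniform nominal temperatures
reported = [(315.0, 60.1), (327.6, 22.5), (340.0, 9.07)]
for (t_a, m_a), (t_b, m_b) in zip(reported, reported[1:]):
    ea = apparent_activation_energy(t_a, m_a, t_b, m_b)
    print(f"{t_a} K -> {t_b} K: apparent Ea = {ea:.2f} eV")
```

Both temperature intervals give a similar apparent slope, which is consistent with an exponential dependence of the minimum MTF on the nominal temperature over this range.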
Chapter 4
LTI Models for trees
4.1 Introduction
In the last chapter, we described in detail the Extended Korhonen’s model (EKM) and also
verified it against known analytical solutions for simple cases and published experimental results
in the literature. In this chapter, we will dig deeper into EKM to show that it has a state space
representation, which is a succession of Linear Time Invariant (LTI) systems. Expressing
EKM as an LTI system has at least two advantages. First, it allows us to analyze EKM better
using well-known LTI system concepts, which we will do in this chapter. Second, we can now
develop optimized numerical methods to solve EKM, which we will do in the next chapter. A
preliminary version of this work appeared in [72].
In this chapter, we will show that EKM is an asymptotically stable system with all eigenval-
ues being negative real numbers. We also investigate the accuracy vs. speed trade off for LTI
models obtained using different values of N . Finally, we will justify the use of average currents
for EKM by studying the frequency response of tree LTI systems.
4.2 State Space representation for a tree
A PDE system is said to be linear if the equation, its boundary and initial conditions do not
include any non-linear combination of the variables or their derivatives. From (3.11), it is clear
that for EKM, there are no non-linear combinations for the variables (σ, x and t) and the
derivatives involved. Thus, EKM is a linear PDE system. To illustrate this point, we will
revisit the example with dotted-I structure Td presented in Section 3.3. Specifically, we will
show that when no voids are present in Td, the IVP formulation obtained after eliminating the
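ghost points is essentially an LTI system. For clarity, we repeat the following definitions
\[
\begin{aligned}
r_{ik} &\triangleq L_i/L_k, & p_{ik} &\triangleq \frac{D_{a,i}\,T_{m,k}}{D_{a,k}\,T_{m,i}}, & w_{ik} &\triangleq w_i/w_k, \\
\gamma_{ik} &\triangleq r_{ki}\,w_{ik}\,p_{ik}, & \Upsilon_k &\triangleq \theta_k/(\Delta\xi)^2.
\end{aligned} \tag{4.1}
\]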
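At the diffusion barrier n1, the stress evolution is given by the ODE
\[
\frac{d\eta_{1,0}}{d\tau} = \Upsilon_1\left(\eta_{1,-1} + \eta_{1,1} - 2\eta_{1,0}\right). \tag{4.2}
\]
Substituting η1,−1 from (3.24) in (4.2), we can eliminate the ghost point
\[
\frac{d\eta_{1,0}}{d\tau} = \Upsilon_1\left(\eta_{1,1} - 2\Delta\xi\,\alpha_1 + \eta_{1,1} - 2\eta_{1,0}\right) = -2\Upsilon_1(\eta_{1,0} - \eta_{1,1}) - 2\Upsilon_1\Delta\xi\,\alpha_1.
\]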
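Similarly, we can eliminate the ghost points in ODEs at n2 and n3. The final IVP can be
written as
\[
\begin{aligned}
\frac{d\eta_{1,0}}{d\tau} &= -2\Upsilon_1(\eta_{1,0} - \eta_{1,1}) - 2\Delta\xi\,\Upsilon_1\alpha_1, && \text{(4.3a)} \\
\frac{d\eta_{1,i}}{d\tau} &= \Upsilon_1(\eta_{1,i-1} - 2\eta_{1,i} + \eta_{1,i+1}), \quad i \in \{1, 2, \ldots, N-1\}, && \text{(4.3b)} \\
\frac{d\eta_{1,N}}{d\tau} &= 2\varrho_{12}\Upsilon_1\left(\eta_{1,N-1} - (1 + \gamma_{21})\eta_{1,N} + \gamma_{21}\eta_{2,1}\right) + 2\Delta\xi\,\varrho_{12}\Upsilon_1(\alpha_1 - \gamma_{21}\alpha_2), && \text{(4.3c)} \\
\frac{d\eta_{2,i}}{d\tau} &= \Upsilon_2(\eta_{2,i-1} - 2\eta_{2,i} + \eta_{2,i+1}), \quad i \in \{1, 2, \ldots, N-1\}, && \text{(4.3d)} \\
\frac{d\eta_{2,N}}{d\tau} &= 2\Upsilon_2(\eta_{2,N-1} - \eta_{2,N}) + 2\Delta\xi\,\Upsilon_2\alpha_2, && \text{(4.3e)} \\
\eta_{k,i}(0) &= \frac{\Omega\,\sigma_{T,k}(0)}{k_b T_m^\star}, \quad k \in \{1, 2\} \text{ and } i \in \{1, 2, \ldots, N-1\}, && \text{(4.3f)}
\end{aligned}
\]
where $\varrho_{12} = r_{12}/(r_{12} + w_{21})$. Clearly, (4.3) can be written as an LTI system
\[
\frac{d}{d\tau}
\begin{bmatrix} \eta_{1,0}(\tau) \\ \eta_{1,1}(\tau) \\ \vdots \\ \eta_{1,N}(\tau) \\ \eta_{2,1}(\tau) \\ \vdots \\ \eta_{2,N}(\tau) \end{bmatrix}
=
\begin{bmatrix}
-2\Upsilon_1 & 2\Upsilon_1 & 0 & \cdots & & & \\
\Upsilon_1 & -2\Upsilon_1 & \Upsilon_1 & \cdots & & & \\
 & \ddots & \ddots & \ddots & & & \\
\cdots & 2\varrho_{12}\Upsilon_1 & -2\varrho_{12}\Upsilon_1(1+\gamma_{21}) & 2\varrho_{12}\Upsilon_1\gamma_{21} & 0 & \cdots & \\
 & 0 & \Upsilon_2 & -2\Upsilon_2 & \Upsilon_2 & \cdots & \\
 & & & \ddots & \ddots & \ddots & \\
 & & & & \cdots & 2\Upsilon_2 & -2\Upsilon_2
\end{bmatrix}
\begin{bmatrix} \eta_{1,0}(\tau) \\ \eta_{1,1}(\tau) \\ \vdots \\ \eta_{1,N}(\tau) \\ \eta_{2,1}(\tau) \\ \vdots \\ \eta_{2,N}(\tau) \end{bmatrix}
+
\begin{bmatrix}
2\Delta\xi\,\Upsilon_1 & 0 & 0 \\
0 & 0 & 0 \\
\vdots & \vdots & \vdots \\
0 & 2\Delta\xi\,\varrho_{12}\Upsilon_1 & 0 \\
0 & 0 & 0 \\
\vdots & \vdots & \vdots \\
0 & 0 & 2\Delta\xi\,\Upsilon_2
\end{bmatrix}
\begin{bmatrix} -\alpha_1 \\ \alpha_1 - \gamma_{21}\alpha_2 \\ \alpha_2 \end{bmatrix}. \tag{4.4}
\]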
Following a similar procedure, it can be shown that the IVP formulation of Td in the presence
of voids is also an LTI system. Note that (4.4) is not the final LTI system for a tree with no
voids, because it can be shown that the system matrix is singular. We will discuss this case in
detail in Section 4.2.3.

Figure 4.1: Notion of subtrees and time-spans.
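The singularity of the system matrix in (4.4) can be seen directly: every row sums to zero, so a uniform shift of all stresses lies in the null space. The sketch below assembles the matrix for Td from (4.3) in plain Python; the function names are hypothetical, `rho12` stands for the junction coefficient r12/(r12 + w21), and the numerical parameter values used with it are arbitrary.

```python
def build_td_system_matrix(N, ups1, ups2, r12, w21, p21):
    """System matrix of (4.4) for the un-voided dotted-I tree T_d.

    State order: eta_{1,0..N} followed by eta_{2,1..N} (size 2N + 1);
    index N holds the junction stress eta_{1,N} = eta_{2,0}.
    """
    gamma21 = r12 * w21 * p21
    rho12 = r12 / (r12 + w21)
    n = 2 * N + 1
    A = [[0.0] * n for _ in range(n)]
    A[0][0], A[0][1] = -2.0 * ups1, 2.0 * ups1                   # barrier n1, (4.3a)
    for i in range(1, N):                                        # branch 1, (4.3b)
        A[i][i - 1], A[i][i], A[i][i + 1] = ups1, -2.0 * ups1, ups1
    A[N][N - 1] = 2.0 * rho12 * ups1                             # junction n2, (4.3c)
    A[N][N] = -2.0 * rho12 * ups1 * (1.0 + gamma21)
    A[N][N + 1] = 2.0 * rho12 * ups1 * gamma21
    for i in range(N + 1, 2 * N):                                # branch 2, (4.3d)
        A[i][i - 1], A[i][i], A[i][i + 1] = ups2, -2.0 * ups2, ups2
    A[2 * N][2 * N - 1], A[2 * N][2 * N] = 2.0 * ups2, -2.0 * ups2   # barrier n3, (4.3e)
    return A


def row_sums(A):
    """A times the all-ones vector; zero rows sums mean A is singular."""
    return [sum(row) for row in A]
```

Physically, the zero row sums say that adding the same constant to every stress value leaves all time derivatives unchanged, which is the dependency that Section 4.2.3 removes.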
4.2.1 Subtrees and Time-spans
Before going any further, we need to introduce the concept of subtrees and time-spans. When a
void nucleates at a junction, EKM conceptually treats it as a diffusion barrier for all connected
branches, so that there is no material flow between them. Thus, the tree is effectively divided
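into separate subtrees. A subtree of tree $\mathcal{T} = \{\mathcal{N}, \mathcal{B}\}$ is a graph $\hat{\mathcal{T}} = \{\hat{\mathcal{N}}, \hat{\mathcal{B}}\}$ with $\hat{\mathcal{N}} \subseteq \mathcal{N}$ and
$\hat{\mathcal{B}} \subseteq \mathcal{B}$. Fig. 4.1 illustrates the notion of subtrees. Let τp be the time of the pth void nucleation,
with τ0 = 0. For the time-span [τ0, τ1), a tree has no voids. Thus, $\hat{\mathcal{T}} = \mathcal{T}$ and $\mathcal{N}_f$, the set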
of failed junctions, is empty. We will refer to this time-span as the pre-void phase. For all
subsequent time-spans, the subtrees will have at least one failed junction that has a void. We
will refer to these time-spans as the post-void phase. At τ = τ1, the first void nucleates, say at
n2. Since n2 is a dotted-I junction, it divides the tree into two subtrees T1 and T2, as shown
in Fig. 4.1. Note that n2 appears in both subtrees, and as per EKM is treated as a diffusion
barrier with a void, which we will refer to as a voided diffusion barrier. Similarly, at τ = τ2,
junction n6 fails creating three new subtrees. The whole tree is now divided into four subtrees.
In general, if Nf is the set of failed junctions in the whole tree, then the number of subtrees ns
can be found using
\[
n_s = 1 + \sum_{n_p \in \mathcal{N}_f} \deg(n_p) - |\mathcal{N}_f|, \tag{4.5}
\]
where deg(np) is the degree of junction np before void nucleation and |Nf | is the number of
junctions that have failed in the tree.
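Equation (4.5) is simple to evaluate; a minimal sketch follows, with a hypothetical function name. The worked values assume that n2 has degree 2 (a dotted-I junction, as stated) and that n6 has degree 3, which is what the text implies when it says that its failure creates three new subtrees.

```python
def count_subtrees(failed_degrees):
    """Number of subtrees per (4.5), given the degrees of the failed junctions."""
    return 1 + sum(failed_degrees) - len(failed_degrees)


print(count_subtrees([]))       # pre-void phase: the whole tree
print(count_subtrees([2]))      # after n2 (degree 2) fails
print(count_subtrees([2, 3]))   # after n6 (assumed degree 3) also fails
```

Each failed junction of degree d replaces one connected piece by d pieces, so it adds d − 1 subtrees, which is exactly what the sum in (4.5) accumulates.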
4.2.2 LTI system for a subtree
Consider a subtree $\hat{\mathcal{T}} = \{\hat{\mathcal{N}}, \hat{\mathcal{B}}\}$ of tree $\mathcal{T}$ and let $\hat{\mathcal{N}}_f$ be the set of failed junctions in the subtree
[e.g. $\hat{\mathcal{N}}_f = \{n_2\}$ for $\hat{\mathcal{T}}_2$ in the time-span [τ1, τ2)]. Similar to the IVP formulation, we uniformly
discretize each branch bk into N segments, where N is the same for all branches. Then, there
would be a total of q + 1 discretized points, where $q = N|\hat{\mathcal{B}}|$. Each discretized point is given a
unique index $i \in i_0 + \{0, 1, 2, \ldots, q\}$, where the offset i0 ensures unique indices for all discretized
points within the tree $\mathcal{T}$. Let xi represent the reduced stress at the ith discretized point in the
tree. Then, using (3.18), the time rate of change of xi in branch bk is
\[
\frac{\partial x_i}{\partial \tau} = \theta_k\,\frac{\partial^2 x_i}{\partial \xi_k^2}. \tag{4.6}
\]
Replacing the partial spatial derivative with respect to ξk in (4.6) with central difference ap-
proximation and solving the boundary conditions at junctions to eliminate the ghost points
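leads to the following translated LTI system for a subtree in the time-span [τp, τp+1)
\[
\begin{aligned}
\dot{x}(\tau - \tau_p) &= A\,x(\tau - \tau_p) + B\,u, && \text{(4.7a)} \\
y(\tau - \tau_p) &= L\,x(\tau - \tau_p), && \text{(4.7b)} \\
x(0) &= x_0, && \text{(4.7c)}
\end{aligned}
\]
where $x = [x_i] \in \mathbb{R}^{q+1}$ is the state vector of the subtree, $A = [a_{i,j}] \in \mathbb{R}^{(q+1)\times(q+1)}$ is the system
matrix, $B = [b_{i,j}] \in \mathbb{R}^{(q+1)\times(|\hat{\mathcal{N}}| - |\hat{\mathcal{N}}_f|)}$ is the input matrix, $u = [u_i] \in \mathbb{R}^{|\hat{\mathcal{N}}| - |\hat{\mathcal{N}}_f|}$ is the input
vector, $L = [l_{i,j}] \in \mathbb{R}^{|\hat{\mathcal{N}}|\times(q+1)}$ is the output matrix and $y = [y_i] \in \mathbb{R}^{|\hat{\mathcal{N}}|}$ is the output vector that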
consists of stress values at all junctions. The initial condition x0 is easily obtained from the
stress profile of the tree at τ = τp computed using the LTI models of the previous time-span,
or it is given by the residual thermal stress at τ = 0.
Each state xi contributes some non-zero entries to the ith row of A, B, u and L, which
we refer to as a state stamp. State stamps are conceptually similar to element stamps used
in SPICE for generating circuit matrices. The notion of stamps is useful to assemble the LTI
system for a given subtree: we start by initializing all matrices and vectors to zeros and add
the stamps as we traverse through the tree. The state stamps are determined based on the
location, the adjacent points and the presence or absence of a void at point i. Two points are
said to be adjacent to each other if they are physically next to each other in a subtree. We will
use A(i) to denote the set of indices for points adjacent to i.
State Stamps for A
Diffusion barrier Consider state xi for a diffusion barrier np at the beginning or at the end
of branch bk, with A(i) = {i1}. Let τf be the time of void nucleation at this barrier. Then, the
non-zero entries in the ith row are given as
\[
a_{i,i} =
\begin{cases}
-2\Upsilon_k, & \tau < \tau_f, \\
-2\Upsilon_k\,(1 + \Delta\xi\,L_k/\delta), & \tau \geq \tau_f,
\end{cases} \tag{4.8a}
\]
\[
a_{i,i_1} = 2\Upsilon_k \quad \forall \tau. \tag{4.8b}
\]
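Higher degree junctions Consider a state xi for a junction np with degree d (d is 2, 3 or
4) and A(i) = {i1, i2, . . . , id}. Without loss of generality, we will assume that np is at the end
of branch 1 and at the beginning of branches 2, . . . , d. Let τf be the time of void nucleation at
np. Then, the state stamp corresponding to np for τ < τf is
\[
\begin{aligned}
a_{i,i} &= -2\varrho_{1d}\Upsilon_1 \sum_{k=1}^{d} \gamma_{k1}, && \text{(4.9a)} \\
a_{i,i_k} &= 2\varrho_{1d}\Upsilon_1\gamma_{k1}, \quad k = 1, \ldots, d, && \text{(4.9b)}
\end{aligned}
\]
where
\[
\varrho_{12} = \frac{r_{12}}{r_{12} + w_{21}}, \qquad
\varrho_{13} = \frac{r_{12} r_{13}}{r_{12} r_{13} + r_{13} w_{21} + r_{12} w_{31}}, \tag{4.10a}
\]
\[
\varrho_{14} = \frac{r_{12} r_{13} r_{14}}{r_{12} r_{13} r_{14} + r_{13} r_{14} w_{21} + r_{12} r_{14} w_{31} + r_{12} r_{13} w_{41}}. \tag{4.10b}
\]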
As mentioned before, when a void nucleates at junction np, it generates new subtrees.
Clearly, each subtree will have at least one void located at the newly created voided diffusion
barrier. For any subtree, let i be the index of the discretized point at the beginning or the
end of branch bk where a void is present, and let A(i) = {i1} be its only adjacent point in the
subtree. Then the state stamp for A is simply given by
\[
a_{i,i} = -2\Upsilon_k\,(1 + \Delta\xi\,L_k/\delta), \qquad a_{i,i_1} = 2\Upsilon_k. \tag{4.11}
\]
Branch interior Consider state xi for a discretized point within branch bk, with A(i) =
{i1, i2}. Then, the non-zero entries of the ith row of A are
\[
a_{i,i} = -2\Upsilon_k, \qquad a_{i,i_1} = a_{i,i_2} = \Upsilon_k. \tag{4.12}
\]
In EKM, a void cannot nucleate inside a branch. Hence, there are no state stamps for the
corresponding case.
Theorem 1. (properties of A) For a subtree $\hat{\mathcal{T}}$, let A be the system matrix obtained using
stamps (4.8)-(4.12). Then:
(a) For the pre-void phase, all eigenvalues of A are real and non-positive, with exactly one
eigenvalue being 0.
(b) For the post-void phase, A is non-singular, with all eigenvalues being real and negative.
The proof of this theorem is given in Appendix A. From Theorem 1, A is singular in the
pre-void phase. This is problematic because a singular matrix is not invertible, and hence we
cannot find the steady state solution for the LTI system (which will be required later) and we
also cannot apply any model order reduction techniques. Thus, we will derive a non-singular
LTI system for the pre-void phase in the next section. But before we do that, we will complete
this section by presenting the state stamps for B, L and u.
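Theorem 1 can be illustrated on the smallest useful case, a single branch with a diffusion barrier at each end, using the stamps (4.8) and (4.12). The sketch below is a numerical illustration under assumed parameter values, not a proof; the function names are hypothetical, and the determinant routine is a plain Gaussian elimination written out to avoid external dependencies.

```python
def finite_line_matrix(N, ups, void_term=None):
    """System matrix for one branch with barriers at both ends.

    void_term is delta_xi * L_k / delta; when given, the barrier at index 0
    is treated as voided, i.e. the tau >= tau_f case of (4.8a).
    """
    n = N + 1
    A = [[0.0] * n for _ in range(n)]
    for i in range(1, N):                                   # interior, (4.12)
        A[i][i - 1], A[i][i], A[i][i + 1] = ups, -2.0 * ups, ups
    A[0][0], A[0][1] = -2.0 * ups, 2.0 * ups                # barrier, (4.8)
    A[N][N], A[N][N - 1] = -2.0 * ups, 2.0 * ups
    if void_term is not None:
        A[0][0] = -2.0 * ups * (1.0 + void_term)
    return A


def determinant(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in M]
    n, det = len(a), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[piv][c]) < 1e-14:
            return 0.0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return det


pre_void = determinant(finite_line_matrix(4, 1.0))
post_void = determinant(finite_line_matrix(4, 1.0, void_term=0.5))
print(pre_void, post_void)
```

The pre-void determinant vanishes, while the voided barrier's extra diagonal term makes the matrix non-singular. For this 5 × 5 case the post-void determinant is negative, as expected for an odd number of negative eigenvalues.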
State Stamps for B
Similar to A, each state xi contributes some non-zero entries to the input matrix B. By the
nature of the ODE system, the inputs are present only at junctions that have no voids, so that
number of inputs is equal to |N | − |Nf |, the number of un-voided junctions in a subtree. Thus,
B is a $(q + 1) \times (|\hat{\mathcal{N}}| - |\hat{\mathcal{N}}_f|)$ matrix. Let all the un-voided junctions in a subtree be
represented as np, with $p \in \{0, 1, 2, \ldots, |\hat{\mathcal{N}}| - (|\hat{\mathcal{N}}_f| + 1)\}$. Then, the state stamps for B are as
follows:
Branch interior and voided diffusion barrier Any state xi that lies within a branch or
is at a voided diffusion barrier does not contribute anything to the ith row of B. Thus, the
corresponding row in B is all zeros.
Diffusion barrier For a diffusion barrier at junction np with state xi at the beginning or the
end of branch bk, the non-zero entry in the ith row of B is
bi,p = 2∆ξΥk. (4.13)
Higher degree junctions For a junction np with degree d ∈ {2, 3, 4}, which is at the end of
branch 1 and at the beginning of branches 2, . . . , d, the state stamp is given as
\[
b_{i,p} = 2\Delta\xi\,\varrho_{1d}\Upsilon_1, \tag{4.14}
\]
where $\varrho_{1d}$ is as given in (4.10).
Overall, the structure of B is such that the pth column corresponding to the un-voided
junction np with state xi has a non-zero entry at the ith row. All other entries are 0.
State Stamps for u
The input vector $u = [u_p] \in \mathbb{R}^{|\hat{\mathcal{N}}| - |\hat{\mathcal{N}}_f|}$ is the vector of inputs at un-voided tree junctions. Here,
the value of up corresponds to junction np, and is determined as follows:
Diffusion barrier For a diffusion barrier np located at the beginning or end of branch bk, we
have
up = ±αk, (4.15)
where the sign is positive for a diffusion barrier at the end of a branch and negative for a
diffusion barrier at the beginning of a branch.
Higher degree junctions For a junction np with degree d ∈ 2, 3, 4, which is at the end of
branch 1 and at the beginning of branches 2, . . . , d, we have
$$u_p = \alpha_1 - \sum_{k=2}^{d} \gamma_{k1}\,\alpha_k. \tag{4.16}$$
State Stamps for L
The output matrix L = [lp,i] ∈ R|N |×(q+1) is just a matrix of 1’s and 0’s that selects the states
at junctions to be the output of the system
$$l_{p,i} = \begin{cases} 1 & x_i \text{ is a state at junction } n_p,\\ 0 & \text{otherwise.} \end{cases} \tag{4.17}$$
4.2.3 LTI system for pre-void phase
From theorem 1, A is singular in the pre-void phase. This happens because the corresponding
boundary conditions model it as a closed system, i.e. there is no exchange of atoms with
other trees. This creates a dependency among the states xi of the whole tree, which leads to a
singular system matrix. In this subsection, we will state that dependency, which is essentially
an alternate form of conservation of mass, and use it to get a corresponding non-singular LTI
system. Since T = T in the pre-void phase, we will consider the whole tree while applying the
conservation of mass.
From Hooke’s law (2.7), we can write for branch bk ∈ B
$$C(\xi_k, \tau) = C_0\, e^{-\eta_k k_b T_m^{\star}/(B\Omega)}, \tag{4.18}$$
where ηk ≡ ηk(ξk, τ), C is the concentration of atoms and C0 is its equilibrium value in the
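absence of stress. Then, the total number of atoms Ntot in the tree at any time τ can be written
as (h, the height of the tree, is the same for all branches in the tree)
$$\begin{aligned}
N_{tot} &= C_0 h \sum_{b_k \in \mathcal{B}} w_k L_k \int_0^1 e^{-\eta_k k_b T_m^{\star}/(B\Omega)}\, d\xi_k \\
&\approx C_0 h \sum_{b_k \in \mathcal{B}} w_k L_k \int_0^1 \left( 1 - \frac{\eta_k k_b T_m^{\star}}{B\Omega} \right) d\xi_k \\
&= C_0 h \left[ \sum_{b_k \in \mathcal{B}} w_k L_k - \frac{k_b T_m^{\star}}{B\Omega} \sum_{b_k \in \mathcal{B}} w_k L_k \int_0^1 \eta_k\, d\xi_k \right],
\end{aligned} \tag{4.19}$$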
where we used the approximation e^x ≈ 1 + x for |x| ≪ 1, because ηk kb T⋆m ≪ BΩ, ∀τ. Since
the number of atoms in the tree is the same at any time τ, the tensile/compressive stresses
generated by the movement of atoms can only vary in a way that conserves the number of
atoms in the tree. Thus, the second summation term on the right hand side of (4.19) should
be constant. Define
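$$\beta(\tau) \triangleq \sum_{b_k \in \mathcal{B}} w_k L_k \int_0^1 \eta_k(\xi_k, \tau)\, d\xi_k = \sum_{i=0}^{q} c_i x_i(\tau), \tag{4.20}$$
where q = N|B| and the integral was evaluated using the trapezoidal rule. The values of the ci
coefficients are
$$c_i = \begin{cases} L_k w_k h_k \Delta\xi & x_i \text{ is inside branch } b_k,\\ (\Delta\xi/2) \sum_{b_k \in \mathcal{B}_p} L_k w_k h_k & x_i \text{ is at junction } n_p. \end{cases} \tag{4.21}$$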
Bp is the set of branches connected to np. Since the residual thermal stress values for all points
at τ = 0 are known from (3.2), $\beta(0) = \beta_0 = \sum_{i=0}^{q} c_i x_i(0)$ is a known quantity. Then, in order to
satisfy the conservation of mass, we must have
$$\beta_0 = \beta(\tau) = \sum_{i=0}^{q} c_i x_i(\tau) \quad \forall \tau. \tag{4.22}$$
This gives us a linear dependence between the states so that one state can be eliminated from
(4.7), which will make the system matrix non-singular as it removes the (only) zero eigenvalue.
Note that we can only eliminate a non-output state from the system. Without loss of generality,
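let x0 be the non-output state to be eliminated. If we denote by $\hat{x} = [x_i] \in \mathbb{R}^q$, for 1 ≤ i ≤ q,
the new state vector for the pre-void phase, we can write from (4.22)
$$x_0(\tau) = -\hat{c}^T \hat{x}(\tau) + \beta_0/c_0, \tag{4.23}$$
where $\hat{c} = c_0^{-1} [\, c_1 \; c_2 \; \ldots \; c_q \,]^T \in \mathbb{R}^q$. Now, the singular LTI system for the pre-void phase can be
written as
$$\begin{bmatrix} \dot{x}_0(\tau) \\ \dot{\hat{x}}(\tau) \end{bmatrix} = \underbrace{\begin{bmatrix} a_{0,0} & a_{1q} \\ a_{q1} & A_q \end{bmatrix}}_{A} \begin{bmatrix} x_0(\tau) \\ \hat{x}(\tau) \end{bmatrix} + \underbrace{\begin{bmatrix} b_{0|N|} \\ B_q \end{bmatrix}}_{B} u, \tag{4.24a}$$
$$y(\tau) = \underbrace{\begin{bmatrix} l_{|N|0} & L_q \end{bmatrix}}_{L} \begin{bmatrix} x_0(\tau) \\ \hat{x}(\tau) \end{bmatrix}, \tag{4.24b}$$
where $u \in \mathbb{R}^{|N|}$ is as obtained using the state stamps for the pre-void phase and
$$\begin{aligned}
a_{1q} &= [a_{i,k}]^T \in \mathbb{R}^{q} && \text{for } i = 0,\ 1 \le k \le q,\\
a_{q1} &= [a_{i,k}] \in \mathbb{R}^{q} && \text{for } 1 \le i \le q,\ k = 0,\\
A_q &= [a_{i,k}] \in \mathbb{R}^{q \times q} && \text{for } 1 \le i, k \le q,\\
b_{0|N|} &= [b_{i,k}]^T \in \mathbb{R}^{|N|} && \text{for } i = 0,\ 0 \le k \le |N| - 1,\\
B_q &= [b_{i,k}] \in \mathbb{R}^{q \times |N|} && \text{for } 1 \le i \le q,\ 0 \le k \le |N| - 1,\\
l_{|N|0} &= [l_{i,k}] \in \mathbb{R}^{|N|} && \text{for } 0 \le i \le |N| - 1,\ k = 0,\\
L_q &= [l_{i,k}] \in \mathbb{R}^{|N| \times q} && \text{for } 0 \le i \le |N| - 1,\ 1 \le k \le q.
\end{aligned}$$
Since we are eliminating x0, the first row in (4.24a) is removed, and we are left with
$$\dot{\hat{x}}(\tau) = a_{q1}\, x_0(\tau) + A_q \hat{x}(\tau) + B_q u. \tag{4.25}$$
Using (4.23) in the LTI system (4.25), we get
$$\dot{\hat{x}}(\tau) = (A_q - a_{q1} \hat{c}^T)\, \hat{x}(\tau) + B_q u + (\beta_0/c_0)\, a_{q1}. \tag{4.26}$$
Define
$$\hat{A} \triangleq A_q - a_{q1} \hat{c}^T, \tag{4.27a}$$
$$\hat{B} \triangleq B_q + (\beta_0/c_0)\, a_{q1} \hat{u}^T, \quad \hat{u} \in \mathbb{R}^{|N|} \text{ and } \hat{u} \cdot u = 1, \tag{4.27b}$$
$$\hat{L} \triangleq L_q. \tag{4.27c}$$
Then, the non-singular LTI system for the pre-void phase can be stated as
$$\dot{\hat{x}}(\tau) = \hat{A} \hat{x}(\tau) + \hat{B} u, \tag{4.28a}$$
$$y(\tau) = \hat{L} \hat{x}(\tau), \tag{4.28b}$$
$$\hat{x}(0) = [\, \eta_{T,1}(0) \;\; \eta_{T,2}(0) \;\; \ldots \;\; \eta_{T,q}(0) \,], \tag{4.28c}$$
where ηT,i(0) is the initial reduced thermal stress at point i.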
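The elimination step of (4.23)-(4.27) can be illustrated numerically. The following is a minimal sketch on a toy closed chain, not the thesis implementation: the matrix `A`, the inputs `u`, the weights `c` and the helper name `eliminate_state` are all assumptions chosen so that the weighted sum of the states is conserved. It checks that the full matrix is singular, that the reduced matrix is not, and that the steady state recovered from the reduced system satisfies both the original equations and the conservation constraint.

```python
import numpy as np

def eliminate_state(A, B, c, beta0):
    """Eliminate state 0 of x' = A x + B u using the constraint c @ x = beta0."""
    c_hat = c[1:] / c[0]                      # scaled coefficient vector of (4.23)
    a_q1, A_q, B_q = A[1:, 0], A[1:, 1:], B[1:, :]
    A_hat = A_q - np.outer(a_q1, c_hat)       # reduced system matrix, as in (4.27a)
    offset = (beta0 / c[0]) * a_q1            # constant forcing term of (4.26)
    return A_hat, B_q, offset, c_hat

# Toy closed chain of 5 states: every column of A sums to zero, so sum(x) is conserved.
n = 5
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
A[0, 0] = A[-1, -1] = -1.0
B = np.eye(n)[:, [0, n - 1]]
u = np.array([1.0, -1.0])                     # inputs that add no net mass
c = np.ones(n)
x_init = np.array([3.0, 1.0, 0.0, 2.0, 4.0])
beta0 = c @ x_init

A_hat, B_q, offset, c_hat = eliminate_state(A, B, c, beta0)

# Steady state of the reduced system, then recover the eliminated state via (4.23).
x_hat_ss = np.linalg.solve(A_hat, -(B_q @ u + offset))
x0_ss = -c_hat @ x_hat_ss + beta0 / c[0]
x_ss = np.concatenate(([x0_ss], x_hat_ss))
```

In this toy case the steady state cannot be obtained from `A` directly because `A` is singular, which is the situation described for the pre-void phase.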
4.2.4 Final State Space representation
From the previous discussion, it is clear that the state space representation for a tree has to be
updated whenever a void nucleates at any of its junctions. In addition, the size of the model
changes as voids nucleate in the tree. For any given time-span [τp, τp+1), the system matrix,
input matrix and the output matrix are fixed (independent of time), which gives us an LTI
system. Once a void nucleates, EKM assumes that it reaches its steady state volume in a
negligible amount of time. Correspondingly, the branch resistances change fairly quickly and
the current densities also change to their new effective values in a negligible amount of time.
As such, the input vector u is also fixed for a given time-span. Overall, for each time-span,
EKM is an LTI system with step inputs.
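To state the succession of LTI systems, we first define the following, where the hatted quantities
for p = 0 are those of the pre-void system (4.28)
$$x(\tau) \triangleq \begin{cases} \hat{x}(\tau) & p = 0,\\ \left[\, x_1(\tau - \tau_p)^T \; \ldots \; x_{n_s}(\tau - \tau_p)^T \,\right]^T & p > 0, \end{cases} \tag{4.29a}$$
$$A(\tau) \triangleq \begin{cases} \hat{A} & p = 0,\\ \begin{bmatrix} A_1 & & \\ & \ddots & \\ & & A_{n_s} \end{bmatrix} & p > 0, \end{cases} \tag{4.29b}$$
$$B(\tau) \triangleq \begin{cases} \hat{B} & p = 0,\\ \begin{bmatrix} B_1 & & \\ & \ddots & \\ & & B_{n_s} \end{bmatrix} & p > 0, \end{cases} \tag{4.29c}$$
$$u(\tau) \triangleq \begin{cases} u & p = 0,\\ \left[\, u_1^T \; \ldots \; u_{n_s}^T \,\right]^T & p > 0, \end{cases} \tag{4.29d}$$
$$L(\tau) \triangleq \begin{cases} \hat{L} & p = 0,\\ \begin{bmatrix} L_1 & & \\ & \ddots & \\ & & L_{n_s} \end{bmatrix} & p > 0, \end{cases} \tag{4.29e}$$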
where the subtrees are numbered 1 to ns [ns is obtained from (4.5)] and it is assumed that the indices
of all points within a subtree are contiguous. Then, the complete state space representation of a
tree can simply be stated as
$$\dot{x}(\tau) = A(\tau) x(\tau) + B(\tau) u(\tau), \tag{4.30a}$$
$$y(\tau) = L(\tau) x(\tau), \tag{4.30b}$$
$$x(\tau_p) = x_{p,0}. \tag{4.30c}$$
Here, xp,0 is the initial condition of the LTI system for the time-span [τp, τp+1), and is given by
$$x_{p,0} = \begin{cases} \left[\, \eta_{T,1}(0) \; \ldots \; \eta_{T,N|\mathcal{B}|}(0) \,\right] & p = 0,\\ P \left[\, -\hat{c}^T \hat{x}(\tau_1^-) + \beta_0/c_0 \;\;\; \hat{x}(\tau_1^-)^T \,\right]^T & p = 1,\\ P\, x(\tau_p^-) & p \ge 2, \end{cases} \tag{4.31}$$
with x(τp−) being the solution obtained at τ = τp using the LTI system of the previous time-span
[τp−1, τp) and P is just an incidence matrix of 1s and 0s that maps the stress values from the
old indices to the new ones, taking care of the fact that the newly voided junction can now be
a part of multiple subtrees. The size (order) of the state space representation of a tree is given
by
$$q = \begin{cases} N|\mathcal{B}| & 0 \le \tau < \tau_1,\\ \sum_{i=1}^{n_s} \left( N|\mathcal{B}_i| + 1 \right) & \tau_p \le \tau < \tau_{p+1} \text{ and } p \ge 1, \end{cases} \tag{4.32}$$
where Bi is the set of branches in the ith subtree.
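As a small worked example of (4.32), the sketch below evaluates the model order before and after the first void. The function name `model_order` and the post-void split of a 193-branch stripe into subtrees of 100 and 93 branches are hypothetical; only the formula itself comes from the text.

```python
def model_order(n_disc, branches_per_subtree, pre_void):
    """Order q of the state space model, following (4.32).

    n_disc               -- N, the number of discretizations per branch
    branches_per_subtree -- list with |B_i| for each subtree (one entry pre-void)
    pre_void             -- True for 0 <= tau < tau_1
    """
    if pre_void:
        return n_disc * sum(branches_per_subtree)
    return sum(n_disc * n_branches + 1 for n_branches in branches_per_subtree)

q_pre = model_order(16, [193], pre_void=True)        # whole tree, one state eliminated
q_post = model_order(16, [100, 93], pre_void=False)  # hypothetical split into two subtrees
```

The post-void order is slightly larger because the voided junction contributes a state to each subtree it belongs to.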
From theorem 1 and Section 4.2.3, it is clear that all eigenvalues of the system matrix
A are negative real numbers for all time-spans. Thus, the corresponding LTI systems are
asymptotically stable, such that the forced response for step inputs grows towards some steady
state value. This is to be expected because the steady-state stress in a confined finite metal line has
been studied and reported in the literature [34, 35, 46, 18], and it is natural to expect it to
generalize to interconnect trees as well.
4.3 Choosing the value of N
The accuracy of our numerical approach heavily depends on how well the LTI model approx-
imates the PDE system. A finer discretization leads to a larger LTI system that results in a
more accurate approximation but takes longer to solve and vice versa. As such, it becomes
imperative to study what value of N gives a good accuracy-speed trade-off.
For this study, we will again use tree T1, which was used in Chapter 3 as well. Recall that
this tree is a straight metal stripe with 193 branches and 192 junctions. We will denote the
LTI model generated with N discretizations per branch as M_N. In this study, we will generate
the following LTI models for T1: M8, M10, M16, M20, M25, M32, M40, M50 and M64, with
M64 being the reference solution as it is the most accurate. For each LTI model, we simulate
T1 for a time-period of 15 years, and store the stress values at all the outputs in the tree for
100 equidistant time-points. We also store the void nucleation times and the sequence of void
nucleations as estimated by the different LTI models. We use the 2nd order variable coefficient
Backward Differentiation Formula (VCBDF2) solver, to be presented in the next chapter, to
simulate these LTI models¹.
[Figure 4.2, eight panels (N = 8, 10, 16, 20, 25, 32, 40, 50): percent error (%) versus stress (MPa, 0 to 600), with lines marking the absolute error bounds ±11.372, ±7.310, ±2.873, ±2.362, ±1.206, ±0.858, ±1.207 and ±0.331 MPa, respectively.]
Figure 4.2: Error rate plots for LTI models M8-M50 with respect to the reference solution obtained using M64.
Fig. 4.2 shows the error rate plot for all models (M8-M50) with respect to the reference
solution obtained using M64. In each plot, the red and black lines show ε_abs,ub, the upper
bound on the absolute error between the two solutions. An LTI model with a smaller ε_abs,ub has
a more accurate solution. As expected, ε_abs,ub decreases overall as we increase N, from
11.372 MPa for M8 to 0.331 MPa for M50. Fig. 4.3a shows the trade-off between runtime and
accuracy (i.e. ε_abs,ub): increasing N decreases ε_abs,ub at the cost of runtime. However, note that
increasing N beyond 16 gives diminishing returns: the decrease in ε_abs,ub gets slower and the
runtime increases rapidly. A similar trend can be seen for other trees as well.
Another way of reporting the accuracy is to compare the void nucleation times and the
sequence of junction failures as obtained using the different LTI models. For all LTI models
(M8-M64), we found 3 junction failures and the sequence of junction failures obtained was
exactly the same, but the estimated failure times varied slightly. The percentage error in the
estimated failure times is shown in Fig. 4.3b. In this case (i.e. for tree T1 using the VCBDF2
solver), there is clearly no benefit in moving beyond N = 16. In fact, the errors in M20 and
M25 are larger than those in M16 for this case. The general trend, however, is
similar to what we observed before: diminishing returns after N = 16. Hence, we will use
N = 16 to generate the tree LTI models in all future experiments.
¹ We use VCBDF2 because it will be the main numerical solver used for obtaining our final results.
[Figure 4.3: (a) maximum absolute error (MPa) and time taken (secs) versus N; (b) error (%) in the 1st, 2nd and 3rd failure times for N = 8, 10, 16, 20, 25, 32, 40 and 50.]
Figure 4.3: (a) Runtime vs. accuracy trade-off for LTI models with different discretizations and (b) Percentage error in estimated junction void nucleation times for LTI models M8-M50 with respect to M64. Smaller is better.
4.4 Justification for the use of effective-EM currents
Because EM is a long term failure mechanism, short-term transients do not play a significant
role in EM dynamics. Thus, the standard practice in the field is to use an effective-EM current,
essentially a DC current, for doing EM analysis. For power grid lines, which carry mostly
unidirectional currents, the effective-EM current is the time-average of the current waveform.
However, we need to verify that average currents are indeed sufficient for EM analysis, and that
we are not missing out on anything. In this section, we will provide a theoretical basis and an
experimental justification for the use of effective-EM currents.
Although the motivation to use effective currents comes from experimental evidence [30, 32,
73], one can also understand it in terms of Korhonen’s model and EKM. A simple integration
of (3.11a) for any branch bk gives
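$$\sigma_k(x_k, t_p) = \sigma_k(x_k, 0) + \frac{B\Omega}{k_b T_{m,k}} \frac{\partial}{\partial x_k} D_{a,k} \left( \int_0^{t_p} \frac{\partial \sigma_k}{\partial x_k}\, dt - \frac{q^{*}\rho}{\Omega} \int_0^{t_p} j_k(t)\, dt \right). \tag{4.33}$$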
From (4.33), it can be seen that the stress evolution with time is determined by the integral
of the current density waveform jk(t), and thus the cumulative behaviour of the current density
is more important than short time transients. Given that EKM is a linear system, if jk(t) is
replaced by an effective-EM (DC) current density jeff such that the integration up to time tp is
the same as that obtained using jk(t), the stress values obtained at time tp would be the same.
In other words, if
$$j_{eff} = \frac{1}{t_p} \int_0^{t_p} j_k(t)\, dt, \tag{4.34}$$
the system response at tp using the time-varying (transient) and effective current densities will
be the same. This provides the basis for using average currents for EM lifetime analysis.
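The averaging argument can be tried on the simplest possible stand-in. The sketch below is an assumption-laden toy, not EKM: it uses a single first-order system x' = -lam*x + j(t) with a pulsed input of duty ratio 0.5, replaces the pulses by their time average as in (4.34), and compares the two responses at the end of the simulation. The names `final_state` and `gap_between_pulsed_and_effective` and all numeric values are illustrative.

```python
import math

def final_state(j_samples, dt, lam=1.0):
    """Exact update of x' = -lam*x + j for piecewise-constant j, starting from x = 0."""
    decay = math.exp(-lam * dt)
    x = 0.0
    for j in j_samples:
        x = decay * x + (1.0 - decay) * j / lam
    return x

def gap_between_pulsed_and_effective(period, t_end=5.0, amplitude=2.0):
    """Absolute difference at t_end between the pulsed and the averaged response."""
    n_periods = round(t_end / period)
    pulsed = [amplitude, 0.0] * n_periods      # on for half a period, then off
    j_eff = sum(pulsed) / len(pulsed)          # time average of the waveform
    dt = period / 2.0
    x_pulsed = final_state(pulsed, dt)
    x_eff = final_state([j_eff] * len(pulsed), dt)
    return abs(x_pulsed - x_eff)

short_gap = gap_between_pulsed_and_effective(0.01)  # period much shorter than 1/lam
long_gap = gap_between_pulsed_and_effective(1.0)    # period comparable to 1/lam
```

In this toy system the gap shrinks as the pulse period becomes short relative to the time constant 1/lam, which mirrors the trend reported for the tree experiments.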
In order to justify the use of effective-EM currents, we conduct a small experiment using
the three-junction tree Td with two diffusion barriers n1 and n3 and a dotted-I junction n2
(see Fig. 3.3). The experiment consists of five tests. In each test, we compare the system
response, i.e. the stress evolution at junctions for the pre-void phase, as obtained using a transient
branch current waveform and its effective (average) value. The results are shown in Fig. 4.4.
The transient current waveforms for j1 and j2 are periodic unidirectional DC pulses with a
duty-ratio of 0.5 for the first four tests and are randomly generated for the fifth test case. The
time-period of the pulse waveforms is chosen to be large enough so that its effect on the stress
evolution is visible. In all cases, it can be clearly observed that the stress evolution computed
using jeff closely tracks the stress evolution obtained using the pulsed or the random waveform.
A similar observation can be made for the post-void phase as well.
The transient and effective system responses become nearly identical as the time-period is
reduced from 2 months (Fig. 4.4a) to 1 week (Fig. 4.4d). This agreement between the system
responses obtained using transient and effective current densities becomes more pronounced as
the time-period is reduced further. This observation can be readily explained by the frequency
response of the LTI system. Recall that a frequency response determines the gain of the output
with respect to the input as the input frequency is varied in a given spectrum. For a multi-input
multi-output system (as is the case with our LTI system), the frequency response of each output
with respect to each input has to be considered.
In Fig. 4.5 and Fig. 4.6, we show the frequency response of all outputs with respect to each
input for the pre-void and post-void phase, respectively, using Bode plots. Each plot also shows
the bandwidth, defined as the first frequency where the gain drops below 70.79% (-3 dB) of its
DC value. Clearly, the Bode plots for all outputs show a frequency response similar to a low
pass filter, where only the low frequency input components are allowed to pass through and the
high frequency components are attenuated. The computed bandwidths for all outputs lie between
roughly 0.2 Hz and 25 Hz, which is very small when compared to the operating frequency of modern
logic circuits. Thus, the use of average currents for EM analysis is justified in modern on-die
power grids.
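The bandwidth definition used above can be written down directly for a single-input single-output slice of an LTI system. The sketch below assumes a system x' = A x + b u, y = l x and scans a user-supplied frequency grid; the function name `bandwidth_hz` and the one-pole example are illustrative and are not the systems of Fig. 4.5 or Fig. 4.6.

```python
import numpy as np

def bandwidth_hz(A, b, l, freqs_hz):
    """First grid frequency where |l (jwI - A)^-1 b| falls below -3 dB of the DC gain."""
    n = A.shape[0]
    dc_gain = abs(l @ np.linalg.solve(-A, b))       # gain at zero frequency
    threshold = 10 ** (-3 / 20) * dc_gain           # 70.79% of the DC value
    for f in freqs_hz:
        w = 2 * np.pi * f
        gain = abs(l @ np.linalg.solve(1j * w * np.eye(n) - A, b))
        if gain < threshold:
            return f
    return None                                     # gain never dropped on this grid

# One-pole low-pass example with its corner placed at 5 Hz.
A = np.array([[-2 * np.pi * 5.0]])
b = np.array([1.0])
l = np.array([1.0])
bw = bandwidth_hz(A, b, l, np.logspace(-2, 3, 2001))
```

For a multi-input multi-output system the same scan would be repeated for each pair of output row and input column.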
[Figure 4.4, panels (a)-(e): for each test, stress (MPa) versus time (yrs, 0 to 6) at junctions n1, n2 and n3, comparing the transient response (trans. resp.) with the effective response (eff. resp.), together with the current density waveforms j1 and j2 (in units of 10^10 A/m^2) versus time (yrs, 0 to 0.3).]
Figure 4.4: The stress evolution at junctions in response to periodic pulsed branch currents and their average (effective) values. The time-periods are (a) 2 months, (b) 1 month, (c) 2 weeks, (d) 1 week and (e) is a random waveform.
[Figure 4.5: nine Bode magnitude plots, magnitude (dB) versus frequency (Hz, 10^0 to 10^8), one for each output-input pair (outputs n1, n2, n3; inputs u1, u2, u3). Bandwidths annotated on the plots: 22.78 Hz, 1.49 Hz, 2.05 Hz; 1.14 Hz, 1.35 Hz, 1.75 Hz; 0.88 Hz, 0.96 Hz, 1.11 Hz.]
Figure 4.5: Frequency response of the pre-void LTI system for Td using Bode plots. The LTI system of Td has three outputs and three inputs for the pre-void phase.
[Figure 4.6: four Bode magnitude plots, magnitude (dB) versus frequency (Hz, 10^0 to 10^8): out n1, in u1 (bandwidth 0.25 Hz); out n2 (in b1), in u1 (bandwidth 0.24 Hz); out n2 (in b2), in u2 (bandwidth 1.46 Hz); out n3, in u2 (bandwidth 1.56 Hz).]
Figure 4.6: Frequency response of the post-void LTI system for Td using Bode plots. Here, n2 has a void, and is thus a part of both branches b1 and b2. Also, now there are only two inputs because a voided diffusion barrier has no inputs.
Chapter 5
Solution Techniques
5.1 Introduction
In the last chapter, we showed that the EKM for a tree is essentially a succession of LTI
systems. These LTI systems can be easily solved using standard numerical techniques presented
in Section 2.7. However, our final goal is to estimate the reliability of the power grid, which
might require solving large LTI systems for thousands of trees in the grid. Thus, we need faster
and scalable numerical approaches. As such, we will focus on developing optimized numerical
methods for solving tree LTI systems in this chapter.
First, we will present an equivalent homogeneous LTI system for EKM because it has the
advantage of requiring less computational work per step in the numerical methods. We then
present three numerical methods for solving the homogeneous LTI system to determine the next
void nucleation: Variable coefficient Backward Differentiation Formulas (VCBDF), Newton’s
method and a Predictor based method. Newton’s method and the Predictor based method
use model order reduction (based on the Arnoldi process) to quickly compute the analytical
solution involving the matrix exponential. Finally, we compare and report the performance
and accuracy of the numerical techniques, by using a standard variable time-step Runge-Kutta
method with Butcher tableau as given by Dormand and Prince [71] and as implemented by [56]
as the reference solution. A preliminary version of this work will appear in [74].
5.2 Equivalent Homogeneous LTI system for EKM
From (4.30), EKM is an LTI system with a fixed input vector for any given time-span [τp, τp+1)
$$\dot{x}(\tau) = A x(\tau) + B u, \tag{5.1a}$$
$$y(\tau) = L x(\tau), \tag{5.1b}$$
$$x(\tau_p) = x_{p,0}. \tag{5.1c}$$
Since the input vector is fixed, we can simplify the LTI system by using the following change
of variables
$$z(\tau) = x(\tau) - x_{ss}, \tag{5.2}$$
where $x_{ss} = -A^{-1} B u$ is the vector of the steady state stress profile of the tree for the given fixed
input u, assuming σth → ∞. Let yss = Lxss. Then, the homogeneous LTI system can be
written as
$$\dot{z}(\tau) = A z(\tau), \tag{5.3a}$$
$$y(\tau) = L z(\tau) + y_{ss}, \tag{5.3b}$$
$$z(\tau_p) = x_{p,0} - x_{ss}. \tag{5.3c}$$
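A quick numerical check of the change of variables may help. The sketch below builds an arbitrary stable system (the random matrices are assumptions, not tree data), computes the steady state for a fixed input, and confirms that the shifted state obeys the homogeneous equation, i.e. that A x + B u equals A z.

```python
import numpy as np

rng = np.random.default_rng(0)
q = 4
M = rng.standard_normal((q, q))
A = -(M @ M.T) - np.eye(q)          # symmetric negative definite, hence invertible
B = rng.standard_normal((q, 2))
u = np.array([1.0, -0.5])           # fixed (step) input
x = rng.standard_normal(q)          # an arbitrary state

x_ss = -np.linalg.solve(A, B @ u)   # steady state, solves A x_ss + B u = 0
z = x - x_ss                        # shifted state of (5.2)

rhs_forced = A @ x + B @ u          # right hand side of the forced system
rhs_homogeneous = A @ z             # right hand side of the homogeneous system
```

Because the input term disappears, each time-step of a solver only needs products or solves with A, which is the computational saving referred to in the introduction of this chapter.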
Any numerical method for solving EKM needs to integrate the above ODE system [$\dot{z}(\tau) =
f(z, \tau) = Az(\tau)$] to find $z(\tau) \in \mathbb{R}^q$, where q is given in (4.32). The main objective of numerically
solving (5.3) is to compute 1) the time and location of the next void nucleation in the tree and
2) the stress profile of the tree at the time of void nucleation. We need this information to set
up the LTI system for the next time-span. As we will see in the next chapter, this objective
fits in the larger framework of determining the power grid MTF.
In the subsequent sections, we will make use of the notation and the theory presented in
Section 2.7. Specifically, we will use zn to denote the solution computed by the numerical
method that approximates the true solution z(τn), and zn[i] to denote the ith component of zn,
which approximates the ith component zi(τn) of the true solution vector, i.e. zn ≈ z(τn) and
zn[i] ≈ zi(τn).
5.3 Using BDF formulas
We found from practical experience that (5.3) is a stiff system. An LTI system with all negative
eigenvalues (which is the case for us) is said to be stiff if the ratio of its largest to smallest
magnitude eigenvalue is very large [49]. We observed this ratio to be of the order of 10^9 to 10^10 for
many of the system matrices. Solving a stiff system is difficult because the solution consists of
a combination of rapidly varying and slowly varying components, which usually forces explicit
integration methods (like Runge-Kutta) to take smaller time-steps in order to maintain the
solution accuracy. Thus, one needs to use appropriate numerical methods while solving
(5.3). In this section, we will describe the use of variable coefficient Backward Differentiation
Formulas (VCBDFs) to numerically integrate the LTI systems.
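To make the stiffness ratio concrete, the sketch below computes it for a small diagonal matrix and also reports the largest stable step of the forward Euler method, 2/|λ|max for real negative eigenvalues. Forward Euler is used here only as the simplest explicit method (it is not the Runge-Kutta scheme mentioned above), and the matrix and function names are illustrative assumptions.

```python
import numpy as np

def stiffness_ratio(A):
    """Ratio of the largest to the smallest eigenvalue magnitude of A."""
    mags = np.abs(np.linalg.eigvals(A))
    return mags.max() / mags.min()

def forward_euler_step_limit(A):
    """Largest stable forward-Euler step when all eigenvalues are real and negative."""
    return 2.0 / np.abs(np.linalg.eigvals(A)).max()

A = np.diag([-1.0e-3, -1.0, -1.0e6])   # slow, medium and fast modes
ratio = stiffness_ratio(A)
h_limit = forward_euler_step_limit(A)
```

Here the fast mode caps the explicit step at about 2e-6 time units even though the slow mode evolves over roughly 1e3 time units, which is the behaviour that motivates an implicit method.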
5.3.1 Review of BDF with fixed time-step
BDFs are a type of linear multi-step (LMS) method that are particularly suited to solving stiff
systems. Suppose we wish to solve an ODE system $\dot{z} = f(z, \tau)$ where z(τ) is a vector function
of τ. Then, a k-step BDF method takes the following general form
$$z_{n+1} + a_0 z_n + a_1 z_{n-1} + \cdots + a_{k-1} z_{n-(k-1)} = h\, b_{-1} f(z_{n+1}, \tau_{n+1}), \tag{5.4}$$
where a−1 = 1 by convention and h = τn − τn−1 is the fixed time-step taken by the BDF method.
A k-step BDF method can be derived using the linear difference operator if its order is also k.
Recall that for an order k method, the first k + 1 coefficients of the linear difference operator
should be zero (see Section 2.7.2). In other words, we must have C0 = C1 = . . . = Ck = 0,
which gives k + 1 equations in k + 1 unknowns. These unknowns are the scalar coefficients
b−1, a0, . . . , ak−1 of the k-step BDF formula. Hence, we can solve C0 = C1 = . . . = Ck = 0
to get their value. For example, a 2-step BDF method with fixed h, obtained by setting
C0 = C1 = C2 = 0, is given by
$$z_{n+1} - \frac{4}{3} z_n + \frac{1}{3} z_{n-1} = \frac{2}{3} h f(z_{n+1}, \tau_{n+1}), \tag{5.5a}$$
$$\epsilon_{PLTE} = -\frac{2}{9} h^3 z^{(3)}(\tau_n), \tag{5.5b}$$
where z(3)(τn) is the third derivative of z(·) evaluated at τn. Similarly, fixed time-step BDFs
of order 3-6 can be derived and are given in [49]. BDF formulas of orders greater than 6 are
known to be unstable.
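The second-order behaviour of the fixed-step 2-step BDF in (5.5a) can be observed on the scalar test equation z' = lam*z. The sketch below is illustrative only: the function name, the choice lam = -1 and the use of the exact solution to supply the second starting value are assumptions. Halving the step should reduce the error at the final time by a factor close to 4.

```python
import math

def bdf2_fixed_step(lam, h, n_steps):
    """Integrate z' = lam*z, z(0) = 1, with the 2-step BDF (5.5a); returns z at t = n_steps*h."""
    z_prev, z_curr = 1.0, math.exp(lam * h)   # exact value used to start the 2-step method
    for _ in range(n_steps - 1):
        # (5.5a) with f = lam*z, solved for z_{n+1}
        z_next = (4.0 / 3.0 * z_curr - 1.0 / 3.0 * z_prev) / (1.0 - 2.0 / 3.0 * h * lam)
        z_prev, z_curr = z_curr, z_next
    return z_curr

lam, t_end = -1.0, 1.0
err_h = abs(bdf2_fixed_step(lam, 0.01, 100) - math.exp(lam * t_end))
err_half = abs(bdf2_fixed_step(lam, 0.005, 200) - math.exp(lam * t_end))
ratio = err_h / err_half
```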
As mentioned in the background, all modern ODE solver implementations use variable
time-stepping to speed-up the computation. One way to extend a fixed time-step order k BDF
formula to incorporate variable time-steps is to use interpolation methods. Here, the solution
obtained at previous time-points τn, τn − hpre, . . . , τn − (k − 1) hpre is first interpolated to
the new time-points τn, τn − hnew, . . ., τn − (k − 1) hnew by using an appropriate polynomial
before applying the BDF formula, where hpre is the present time-step of the solver and hnew
is the new time-step. This technique has a problem: if hnew is significantly larger than hpre,
the interpolation polynomial might not provide accurate solutions at the new time-points. This
might lead to stability problems, especially if the step-size is changing frequently [75, 76]. Hence,
instead of using interpolation polynomials, we can re-derive the BDF methods so that they have
built-in support for non-equidistant data points.
5.3.2 Variable coefficient BDF methods
These methods are called variable coefficient BDF (VCBDF) methods because the coefficients
b−1, a0, . . . , ak−1 are now dependent on the sequence of time-steps taken, and thus are not
fixed as before. The derivation of VCBDF methods requires us to restate the concepts of linear
difference operator and residual for the case of variable time-step methods, as we do next.
Linear Difference Operator and Residual for Variable time-step methods
The linear difference operator, as stated in (2.37), assumed a fixed time-step h. The
corresponding time function for the variable-step case of a k-step method can be stated as
$$D[\, s(\tau); \vec{h} \,] \triangleq \sum_{j=-1}^{k-1} a_j s(\tau - \Delta_j) - h_{n+1} \sum_{j=-1}^{k-1} b_j s^{(1)}(\tau - \Delta_j), \tag{5.6}$$
where hn+1 = τn+1 − τn, s(τ) is some function that can be differentiated as often as desired, s^(1)(·) is
the 1st derivative of s(·), $\vec{h}$ is the vector of time-steps hn+1, hn, . . ., hn−(k−2) and ∆j = τn − τn−j.
We assume that the ratio hn+1/hn−j is bounded. Similar to the fixed time-step case, when
(5.6) is applied to the true solution z(τ) (instead of the computed sequence zn, zn−1, . . .) and
evaluated at τn, it gives the residual Rn+1 for the variable time-step case
$$R_{n+1} = D[\, z(\tau); \vec{h} \,]_{\tau_n} = \sum_{j=-1}^{k-1} a_j z(\tau_n - \Delta_j) - h_{n+1} \sum_{j=-1}^{k-1} b_j z^{(1)}(\tau_n - \Delta_j). \tag{5.7}$$
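Using a Taylor series expansion of $s(\tilde{\tau})$ around τ, we have
$$s(\tilde{\tau}) = s(\tau) + \sum_{r=1}^{\infty} \frac{1}{r!} s^{(r)}(\tau) (\tilde{\tau} - \tau)^r, \tag{5.8}$$
where s^(r)(τ) is the rth derivative of s(·) evaluated at τ. Differentiating (5.8), we get
$$s^{(1)}(\tilde{\tau}) = \sum_{r=1}^{\infty} \frac{1}{(r-1)!} s^{(r)}(\tau) (\tilde{\tau} - \tau)^{r-1}. \tag{5.9}$$
Then, evaluating (5.8) and (5.9) at $\tilde{\tau} = \tau - \Delta_j$, we get for j = −1, 0, 1, . . . , k − 1
$$s(\tau - \Delta_j) = s(\tau) + \sum_{r=1}^{\infty} \frac{(-\Delta_j)^r}{r!} s^{(r)}(\tau) \tag{5.10}$$
and
$$s^{(1)}(\tau - \Delta_j) = \sum_{r=1}^{\infty} \frac{(-\Delta_j)^{r-1}}{(r-1)!} s^{(r)}(\tau). \tag{5.11}$$
Plugging (5.10) and (5.11) into (5.6) and collecting terms with the same order of differentiation
leads to
$$D[\, s(\tau); \vec{h} \,] = C_0 s(\tau) + C_1 s^{(1)}(\tau) + \ldots + C_r s^{(r)}(\tau) + \ldots \tag{5.12}$$
with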
$$C_0 = \sum_{j=-1}^{k-1} a_j, \tag{5.13a}$$
$$C_1 = -\sum_{j=-1}^{k-1} \Delta_j a_j - h_{n+1} \sum_{j=-1}^{k-1} b_j, \tag{5.13b}$$
$$\vdots$$
$$C_r = \frac{(-1)^r}{r!} \sum_{j=-1}^{k-1} (\Delta_j)^r a_j - \frac{(-1)^{r-1} h_{n+1}}{(r-1)!} \sum_{j=-1}^{k-1} (\Delta_j)^{r-1} b_j, \tag{5.13c}$$
$$\vdots$$
Clearly, this is very similar to the equidistant case as shown in (2.39)-(2.42), with the difference
being that the Cr values are now a function of the ∆j values, and hence of the $\vec{h}$ vector. Define
$$\hbar \triangleq \max(\vec{h}) = \max\left( h_{n+1}, h_n, h_{n-1}, \cdots, h_{n-(k-2)} \right).$$
Then $\Delta_j = \tau_n - \tau_{n-j} \le K\hbar = O(\hbar)$, which in turn gives $(\Delta_j)^r = O(\hbar^r)$ and
$h_{n+1} (\Delta_j)^{r-1} = O(\hbar^r)$. From (5.13c), we then get $C_r = O(\hbar^r)$. A variable time-step method is said to be of
order k if C0 = C1 = . . . = Ck = 0 and Ck+1 ≠ 0, so that the residual is given by
$$R_{n+1} = \sum_{r=k+1}^{\infty} C_r z^{(r)}(\tau_n) = C_{k+1} z^{(k+1)}(\tau_n) + O(\hbar^{k+2}), \tag{5.14}$$
which motivates the definition of PLTE for a variable time-step method
$$\epsilon_{PLTE} = C_{k+1} z^{(k+1)}(\tau_n). \tag{5.15}$$
Because $C_{k+1} = O(\hbar^{k+1})$, the PLTE is essentially $K \hbar^{k+1} z^{(k+1)}(\tau_n)$ for some constant K, which
is similar to what is observed in the fixed time-step case. In fact, this whole generalization
for variable time-step methods gracefully falls back to the respective equations for the fixed
time-step methods if we assume hn+1 = hn = · · · = hn−(k−2) = h.
Deriving the VCBDF methods
Generalizing from (5.4), a k-step VCBDF method will use the following difference equation for
an ODE system $\dot{z} = f(z, \tau)$
$$z_{n+1} + a_0 z_n + a_1 z_{n-1} + \cdots + a_{k-1} z_{n-(k-1)} = h_{n+1}\, b_{-1} f(z_{n+1}, \tau_{n+1}), \tag{5.16}$$
where we are trying to compute the solution zn+1 at time-point τn+1 based on previously
computed solution points (τn, zn), (τn−1, zn−1), . . ., (τn−(k−1), zn−(k−1)). Similar to the case of
fixed-step BDF methods, a k-step VCBDF method will be of order k, so that its PLTE is given
by (5.15). We are now ready to derive the VCBDF methods.
Deriving the 2-step VCBDF method The 2-step VCBDF (VCBDF2) and its PLTE are
given by
$$h_{n+1} b_{-1} f(z_{n+1}, \tau_{n+1}) = z_{n+1} + a_0 z_n + a_1 z_{n-1}, \tag{5.17a}$$
$$\epsilon_{PLTE} = C_3 z^{(3)}(\tau_n), \tag{5.17b}$$
where C3 is the error constant. In order to find the values of the coefficients, we set C0, C1 and
C2 in (5.13) to zero, so that
$$C_0 = 0 \implies a_1 + a_0 = -1, \tag{5.18a}$$
$$C_1 = 0 \implies a_1 h_n + b_{-1} h_{n+1} = h_{n+1}, \tag{5.18b}$$
$$C_2 = 0 \implies a_1 h_n^2 - 2 b_{-1} h_{n+1}^2 = -h_{n+1}^2, \tag{5.18c}$$
which gives three equations in three unknowns that can be solved to find the values of the
coefficients
$$a_0 = -\frac{(h_n + h_{n+1})^2}{h_n (h_n + 2 h_{n+1})}, \tag{5.19a}$$
$$a_1 = \frac{h_{n+1}^2}{h_n (h_n + 2 h_{n+1})}, \tag{5.19b}$$
$$b_{-1} = \frac{h_n + h_{n+1}}{h_n + 2 h_{n+1}}. \tag{5.19c}$$
Using (5.19) in (5.13c), we can find the value of the error constant as:
$$C_3 = -\frac{h_{n+1}^2 (h_n + h_{n+1})^2}{6\, (h_n + 2 h_{n+1})}, \tag{5.20}$$
which completes the derivation of the VCBDF2 method.
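The closed forms (5.19) and (5.20) are easy to sanity-check in code. The sketch below (the function name `vcbdf2_coefficients` is an assumption) evaluates them for a given pair of steps; with equal steps it should reproduce the fixed-step values of (5.5), and for unequal steps the order conditions (5.18) should hold.

```python
import math

def vcbdf2_coefficients(h_n, h_np1):
    """Coefficients (5.19) and error constant (5.20) for previous step h_n and new step h_{n+1}."""
    denom = h_n + 2.0 * h_np1
    a0 = -((h_n + h_np1) ** 2) / (h_n * denom)
    a1 = h_np1 ** 2 / (h_n * denom)
    b_m1 = (h_n + h_np1) / denom
    c3 = -(h_np1 ** 2) * (h_n + h_np1) ** 2 / (6.0 * denom)
    return a0, a1, b_m1, c3

# Equal steps: expect -4/3, 1/3, 2/3 and -(2/9) h^3.
equal = vcbdf2_coefficients(0.1, 0.1)

# Unequal steps: the new step is 2.5 times the previous one.
h_n, h_np1 = 0.1, 0.25
a0, a1, b_m1, c3 = vcbdf2_coefficients(h_n, h_np1)
```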
Deriving higher order VCBDF methods Similarly, we can find the coefficients and the
error constant Ck+1 for a k-step VCBDF (VCBDFk) method of order k, with k = 3, 4, 5, 6,
by setting C0 = C1 = . . . = Ck = 0, which gives rise to the following set of equations
$$\sum_{j=0}^{k-1} a_j = -1, \tag{5.21a}$$
$$\sum_{j=1}^{k-1} a_j (\Delta_j)^r + (-1)^{r+1}\, r\, b_{-1} h_{n+1}^r = (-1)^{r+1} h_{n+1}^r, \quad r = 1, 2, \ldots, k. \tag{5.21b}$$
Before presenting the solution of (5.21), we will define the sum operator Ψ(u, v), where u and
v are integers, as follows
$$\Psi(u, v) \triangleq \begin{cases} \sum_{j=u}^{v} h_{n-j} & \text{if } u \le v,\\ 0 & \text{otherwise.} \end{cases} \tag{5.22}$$
Note that Ψ(u, u) = hn−u and ∆j = Ψ(0, j − 1) when j > 0. The solution to (5.21) can now be
stated as
$$b_{-1} = \frac{1}{h_{n+1}} \left[ \sum_{i=-1}^{k-2} \frac{1}{\Psi(-1, i)} \right]^{-1}, \tag{5.23a}$$
$$a_j = \frac{(-1)^{j+1} h_{n+1} b_{-1} \prod_{i=-1,\, i \ne j-1}^{k-2} \Psi(-1, i)}{\prod_{i=-1}^{j-1} \Psi(i, j-1) \prod_{i=j}^{k-2} \Psi(j, i)}, \quad j = 0, 1, \ldots, k-1, \tag{5.23b}$$
with the error constant being
$$C_{k+1} = -\frac{h_{n+1} b_{-1}}{(k+1)!} \prod_{i=-1}^{k-2} \Psi(-1, i). \tag{5.24}$$
Using (5.22)-(5.24), we can compute the coefficients for all VCBDF methods.
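The general formulas can be exercised with a short routine. The sketch below is one possible reading of (5.22)-(5.24), with the numerator product of each a_j taken over i = -1, ..., k-2 (skipping i = j-1), which is the range that reproduces (5.19) for k = 2. The function name `vcbdf_coefficients` and the convention of passing the steps as a list starting with h_{n+1} are assumptions.

```python
import math

def vcbdf_coefficients(steps):
    """steps = [h_{n+1}, h_n, ..., h_{n-(k-2)}]; returns (a, b_m1, c_next) for the k-step VCBDF."""
    k = len(steps)

    def psi(u, v):
        # Sum operator (5.22): h_{n-u} + ... + h_{n-v}; h_{n-j} is stored at steps[j + 1].
        return sum(steps[j + 1] for j in range(u, v + 1)) if u <= v else 0.0

    idx = range(-1, k - 1)                                  # i = -1, ..., k-2
    b_m1 = 1.0 / (steps[0] * sum(1.0 / psi(-1, i) for i in idx))
    a = []
    for j in range(k):
        num = ((-1) ** (j + 1) * steps[0] * b_m1
               * math.prod(psi(-1, i) for i in idx if i != j - 1))
        den = (math.prod(psi(i, j - 1) for i in range(-1, j))
               * math.prod(psi(j, i) for i in range(j, k - 1)))
        a.append(num / den)
    # Error constant C_{k+1}, following (5.24)
    c_next = -steps[0] * b_m1 / math.factorial(k + 1) * math.prod(psi(-1, i) for i in idx)
    return a, b_m1, c_next

a3, b3, c4 = vcbdf_coefficients([1.0, 1.0, 1.0])     # k = 3 with equal unit steps
a2, b2, c3 = vcbdf_coefficients([0.25, 0.1])         # k = 2 with h_{n+1} = 0.25, h_n = 0.1
a4, b4, c5 = vcbdf_coefficients([0.3, 0.2, 0.5, 0.1])
```

With equal steps the k = 3 case should return the familiar fixed-step BDF3 coefficients (-18/11, 9/11, -2/11 and 6/11), and for any step sequence the a_j should sum to -1 as required by (5.21a).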
5.4 Applying VCBDF to solve the Homogeneous LTI system
Eliminating the Newton iteration step A major drawback of an implicit method like
VCBDF is the requirement of a computationally expensive Newton iteration step to solve (5.16)
for a non-linear f(·). Fortunately, in our case, the ODE system we are trying to solve is a
homogeneous LTI system, as shown in (5.3a). Thus, the solution zn+1 at the next time-point
τn+1 is easily obtained by doing a linear system solve of the following equation
$$(h_{n+1} b_{-1} A - I)\, z_{n+1} = \sum_{i=0}^{k-1} a_i z_{n-i}, \tag{5.25}$$
where I is the identity matrix and the coefficients b−1, a0, . . . correspond to the chosen k-step
BDF method. There is no need to re-factor the LHS if hn+1 and b−1 are unchanged.
Estimating the PLTE Another factor that usually affects the performance of a k-step
VCBDF method is the estimation of $\epsilon_{PLTE}$, as it requires the calculation of the (k+1)th
derivative of z(τ), which is usually not available or is difficult to compute. However, given that we
have a homogeneous LTI system, the (k + 1)th derivative is simply
\[ z^{(k+1)}(\tau_n) = A^{k+1} z(\tau_n) \approx A^{k+1} z_n. \tag{5.26} \]
Hence, it is straightforward to compute the PLTE
\[ \epsilon_{PLTE} \approx C_{k+1} A^{k+1} z_n. \tag{5.27} \]
Note that $\epsilon_{PLTE} = [\epsilon_{PLTE,i}]$ is a (q × 1) vector, with the ith value denoting the error in the ith
solution component of $z_n$.
Error Control and Variable time stepping We use $\epsilon_{PLTE}$ for error control, i.e. keeping
the solution within $\epsilon_{abs}$ and $\epsilon_{rel}$, the absolute and relative error bounds provided by the user,
respectively, and for deciding the value of the next time-step. As with many modern ODE
implementations, we use a weighted root-mean-square norm to compute a scalar error metric
from (5.27), as shown below
\[ \epsilon_s = \sqrt{\frac{\sum_{i=0}^{q-1} (w_i\, \epsilon_{PLTE,i})^2}{q}}, \tag{5.28} \]
where the weight $w_i$ is based on the value of the current solution $z_n[i]$ and on the tolerances
provided by the user
\[ w_i = \frac{1}{\epsilon_{abs} + |z_n[i]|\, \epsilon_{rel}}. \tag{5.29} \]
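The weighted norm in (5.28)-(5.29) is short enough to sketch directly. The Python below is illustrative only; the function name and the plain-list inputs are assumptions, and the sample numbers are made up to keep the arithmetic easy to follow.

```python
import math


def weighted_rms_error(plte, z, eps_abs, eps_rel):
    """Scalar error metric of (5.28) using the per-component weights of (5.29)."""
    q = len(z)
    total = 0.0
    for err_i, z_i in zip(plte, z):
        w_i = 1.0 / (eps_abs + abs(z_i) * eps_rel)  # weight (5.29)
        total += (w_i * err_i) ** 2
    return math.sqrt(total / q)


# Made-up example: with z = 0 the weights reduce to 1/eps_abs = 1000,
# so the weighted errors are (3, 4) and the metric is sqrt(25/2).
eps_s = weighted_rms_error([3e-3, 4e-3], [0.0, 0.0], eps_abs=1e-3, eps_rel=0.0)
```

A value of the metric at or below 1 means the estimated error fits inside the user tolerances on average across the components.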
We accept a step when $\epsilon_s \le 1$, otherwise we reject it. For determining the new step-size for a
k-step VCBDF method, we empirically found the following to be a good heuristic
\[ h_{new} = \begin{cases}
\min\left(0.6\, \epsilon_s^{-1/(k+1)} h_{pre},\; 10 h_{pre},\; h_{max}\right) & \epsilon_s \le 0.1 \text{ and } n_{last} \ge k + 4, \\
\max\left(0.6\, \epsilon_s^{-1/k} h_{pre},\; 0.2 h_{pre}\right) & \epsilon_s \ge 1, \\
h_{pre} & \text{otherwise,}
\end{cases} \tag{5.30} \]
where $n_{last}$ is the number of steps taken since the last change in step-size and $h_{max}$ is the
maximum allowed step-size. This heuristic works due to the following reasons:
• Stability: We have an upper bound of 10 and a lower bound of 0.2 for the ratio hnew/hpre.
We found this necessary in empirical testing to ensure stability of the method. If hnew
becomes less than a pre-defined minimum step-size, we stop the integration.
• Lazy Time-step Changes: Even if a change in step-size is warranted by $\epsilon_s$, we defer it
until at least k + 4 steps have been taken by the k-step VCBDF method since the last
change in time-step. This condition seeks to balance the trade-off between re-factorizing
the LHS of (5.25) to take a larger time-step versus using the previous factorization to
quickly compute the new solution using the present but smaller time-step. Frequently
re-factoring the LHS of (5.25) can considerably slow down the method and hence we avoid
it.
• Upper limit on time-step: We have an absolute upper limit on the value of step-size h.
This is particularly useful due to the nature of the problem: stress evolution due to EM
has a gently decreasing slope in time. As such, the time-steps taken by the solver tend to
keep increasing. Also, we only need to integrate the LTI system for a comparatively short
time-span until the next void nucleates. As such, at some point, re-factoring the LHS
of (5.25) in order to take a larger time-step becomes less efficient than using the already
computed factorization to take the smaller step. To illustrate this point, let's take
a simple example. Suppose we have to integrate for a time-span of ∆τ, with $h_{pre}$ being
the present time-step and $h_{new} = 10 h_{pre}$ being the larger time-step that could be taken.
Also, let $t_{fac}$ and $t_{bf}$ be the CPU time required for LU-factorization and backward-forward
solve, respectively. Typically, we have $t_{fac}/t_{bf} \ge 50$. Then, changing the time-step to the
new value will not be preferable if
\[ t_{fac} + \frac{\Delta\tau}{10 h_{pre}} t_{bf} > \frac{\Delta\tau}{h_{pre}} t_{bf} \implies h_{pre} > \frac{9 t_{bf}}{10 t_{fac}} \Delta\tau \implies h_{pre} > 0.018\, \Delta\tau, \]
where the last step uses $t_{fac}/t_{bf} = 50$; a larger ratio only lowers this threshold.
Note that for simplicity, we assumed in the above analysis that the time-step is fixed
(with h equal to either $h_{pre}$ or $10 h_{pre}$) when we integrate in the time-span ∆τ. However, a
similar conclusion can be drawn for variable time-steps as well, mainly because factoring
a matrix is costly as compared to doing a backward-forward solve.
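The step-size rule (5.30) and the break-even argument in the last bullet can be sketched as two small Python functions. This is a hedged illustration rather than the thesis code: the function names are hypothetical, and the guard for an error metric of exactly zero is an added assumption (the formula itself would divide by zero there).

```python
def next_step_size(eps_s, h_pre, k, n_last, h_max):
    """Step-size heuristic of (5.30) for a k-step VCBDF method."""
    if eps_s <= 0.1 and n_last >= k + 4:
        # Assumption: treat eps_s == 0 as "grow as much as the caps allow".
        grown = 10.0 * h_pre if eps_s == 0 else 0.6 * eps_s ** (-1.0 / (k + 1)) * h_pre
        return min(grown, 10.0 * h_pre, h_max)
    if eps_s >= 1.0:
        return max(0.6 * eps_s ** (-1.0 / k) * h_pre, 0.2 * h_pre)
    return h_pre


def refactor_break_even_fraction(t_fac_over_t_bf):
    """Fraction of the remaining span above which keeping h_pre beats re-factoring.

    From t_fac + (dtau / (10 h_pre)) t_bf > (dtau / h_pre) t_bf, i.e.
    h_pre > 9 t_bf / (10 t_fac) * dtau.
    """
    return 9.0 / (10.0 * t_fac_over_t_bf)


deferred = next_step_size(0.05, 1.0, k=2, n_last=3, h_max=100.0)  # too soon: keep h_pre
capped = next_step_size(1e-9, 1.0, k=2, n_last=10, h_max=100.0)   # growth capped at 10x
shrunk = next_step_size(1e6, 1.0, k=2, n_last=0, h_max=100.0)     # shrink floored at 0.2x
threshold = refactor_break_even_fraction(50.0)                     # 0.018 of the span
```

The three sample calls exercise the deferral, the growth cap and the shrink floor, in that order.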
Determining void nucleation time A void nucleates at a junction when its stress value
reaches the critical threshold $x_{th} = \Omega\sigma_{th}/(k_b T_m^\star) > 0$. Let i be the index of the discretized
point located at a junction and let $x_{ss,i}$ be its steady state value. Then, while stepping through
time, if $z_n[i] + x_{ss,i} < x_{th}$ and $z_{n+1}[i] + x_{ss,i} \ge x_{th}$, this junction will fail at $\tau = \tau_f$ such that
$\tau_n < \tau_f \le \tau_{n+1}$. The value of $\tau_f$ can be determined using linear interpolation, or by using a
Newton divided-difference formula [49] of an appropriate order. In practice, linear interpolation
works quite well because we limit the maximum step-size taken by the solver. Once $\tau_f$ is
known, the stress values at all discretized points are computed for τ = τf , so that we have all
the information required to set up the next LTI system.
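The linear-interpolation option described above amounts to one line of arithmetic. The Python sketch below is illustrative; the function name is hypothetical and the inputs are the total stress values (z plus the steady state value) at the two bracketing time-points.

```python
def nucleation_time_linear(tau_n, tau_n1, x_n, x_n1, x_th):
    """Linearly interpolate the time at which the stress crosses x_th.

    x_n and x_n1 are the junction stress values at tau_n and tau_n1,
    and must bracket the threshold: x_n < x_th <= x_n1.
    """
    if not (x_n < x_th <= x_n1):
        raise ValueError("threshold is not crossed in this step")
    return tau_n + (x_th - x_n) * (tau_n1 - tau_n) / (x_n1 - x_n)


# Made-up numbers: stress rises from 1.0 to 3.0 over [0, 2], threshold 2.0.
tau_f = nucleation_time_linear(0.0, 2.0, 1.0, 3.0, 2.0)
```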
Until now, we were solving the LTI system by stepping through time using numerical methods
that use difference equations [like (5.16)] to approximate the underlying function. The next two
numerical methods take a different approach: instead of using difference equations,
they use the analytical solution of the LTI system itself to determine the next void nucleation.
However, the analytical solution requires computation of the matrix exponential, which can be
a very expensive operation. In the following sections, we will first present a fast and efficient
technique for approximating the matrix exponential using model order reduction, specifically
the Arnoldi process [77]. Then, we will present the two final numerical approaches that use the
matrix exponential computation to find the time and location of the next void nucleation in
the tree.
5.5 Computing Matrix Exponential using the Arnoldi process
5.5.1 Motivation
The homogeneous LTI system for time span [τp, τp+1), as presented in (5.3), has a closed form
analytical solution, given by
\[ z(\tau) = e^{A(\tau - \tau_p)} z(\tau_p), \tag{5.31} \]
which, using the change of variables, gives the solution for the original LTI system (5.1)
\[ x(\tau) = x_{ss} + e^{A(\tau - \tau_p)} (x_{p,0} - x_{ss}), \tag{5.32} \]
where $e^{A(\tau - \tau_p)}$ is the matrix exponential and $x_{ss} = -A^{-1} B u$. However, the full size of the state
space representation of a tree, as given in (4.32), becomes very large for finer discretizations
(i.e. large N) or for large trees (larger branch count) and computing the matrix exponential for
such a big system is computationally expensive. Hence, we will now present a way of computing
the matrix exponential using a reduced order model obtained by the Arnoldi Process.
5.5.2 The Arnoldi process
The Arnoldi process [77] for some matrix $M = [m_{i,k}] \in \mathbb{R}^{q \times q}$ attempts to compute an upper
Hessenberg matrix $H = [h_{i,k}] \in \mathbb{R}^{q \times q}$ and an orthonormal basis $V = [v_{i,k}] \in \mathbb{R}^{q \times q}$ such that
\[ V^T M V = H \iff M V = V H. \tag{5.33} \]
Note that orthonormality implies $V^T V = I$, where I is the identity matrix. Also, an upper
Hessenberg matrix is a matrix in which all entries below the first sub-diagonal are zero. An
example of a 4 × 4 upper Hessenberg matrix would be
\[ \begin{bmatrix} 1 & 9 & 8 & 4 \\ 2 & 16 & 5 & -1 \\ 0 & 7 & 6 & 22 \\ 0 & 0 & 2 & 4 \end{bmatrix}. \tag{5.34} \]
If $V = [v_1\ v_2\ \ldots\ v_q]$ with $v_i \in \mathbb{R}^q$, then H is the orthogonal projection of M onto
$\mathrm{span}\{v_1, v_2, \ldots, v_q\}$. Equating the kth columns in MV = VH gives
\[ M v_k = \sum_{i=1}^{k+1} h_{i,k} v_i, \quad k = 1, 2, \ldots, q - 1. \tag{5.35} \]
Algorithm 1 Arnoldi Process
Input: L_A, U_A, q, y, s
Output: H_s, V_s, s_invr
 1: s_invr ← q                              ⊲ Initially, assume size of invariant subspace to be the size of A
 2: H_s ← zeros(s + 1, s)
 3: V_s ← zeros(q, s)
 4: v_1 ← y/‖y‖_2
 5: for j = 1 → s do
 6:     x ← BF_Substitution(L_A, U_A, v_j)  ⊲ backward-forward substitution
 7:     for i = 1 → j do
 8:         h_{i,j} = v_i^T x
 9:         x ← x − h_{i,j} v_i             ⊲ modified Gram-Schmidt orthogonalization
10:     end for
11:     h_{j+1,j} = ‖x‖_2
12:     if h_{j+1,j} == 0 then
13:         s_invr ← j                      ⊲ Size of invariant sub-space is j
14:         return
15:     end if
16:     v_{j+1} ← x/h_{j+1,j}
17: end for
If we define $V_s \triangleq [v_1\ v_2\ \ldots\ v_s]$ and $H_s \triangleq [h_{i,k}] \in \mathbb{R}^{s \times s}$ with 1 ≤ i, k ≤ s < q, then it can
be shown that (5.35) is equivalent to [78]
\[ M V_s = V_s H_s + h_{s+1,s}\, v_{s+1} e_s^T, \tag{5.36} \]
where $e_s \in \mathbb{R}^s$ is the sth unit vector (a vector that has 1 in position s and 0 elsewhere). The
eigenvalues of $H_s$ converge to the s extreme (largest magnitude) eigenvalues of M. Hence, if we
stop the Arnoldi process after obtaining s columns, we end up with the projection $V_s^T M V_s \approx H_s$
that approximates the s largest magnitude eigenvalues of M.
5.5.3 Solving the Homogeneous LTI system
Recall that all eigenvalues of A are negative real numbers. Because the large magnitude eigen-
values of A die out quickly, the dynamics of stress evolution is primarily governed by the set
of smallest magnitude eigenvalues of A, which we refer to as the dominant modes. Hence, in
our case, we want to approximate the smallest magnitude eigenvalues of the system matrix A,
which can be done by applying the Arnoldi process to $M = A^{-1}$, because the smallest magnitude
eigenvalues of A correspond to the largest magnitude eigenvalues of $A^{-1}$ (if λ is an eigenvalue
of A, then 1/λ is an eigenvalue of $A^{-1}$). Algorithm 1 gives the procedure we use to compute
$H_s$ and $V_s$ such that
\[ V_s^T A^{-1} V_s \approx H_s, \tag{5.37} \]
which gives
\[ V_s^T A^{-1} V_s H_s^{-1} \approx I \implies A^{-1} V_s H_s^{-1} \approx V_s \implies H_s^{-1} \approx V_s^T A V_s. \tag{5.38} \]
The inputs to the algorithm are $L_A$ and $U_A$, respectively the lower triangular and the upper
triangular matrix obtained using the LU factorization of A, the size of the original system q,
an arbitrary starting vector (seed) $y \in \mathbb{R}^q$ and s, the desired number of extreme eigenvalues to
approximate. The output is the Hessenberg matrix $H_s$, the orthonormal matrix $V_s$, and the size
of invariant sub-space $s_{invr}$ [78] in case it is less than s. Note that for computational efficiency,
we avoid explicitly computing $A^{-1}$ and use backward-forward substitution to compute the
matrix-vector product $A^{-1} v_j$ in line 6. This algorithm costs s backward-forward substitutions,
and $s^2/2 + O(s)$ inner products and scale-add operations.
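If we define
\[ \hat{z} = V_s^T z \iff z = V_s \hat{z}, \tag{5.39} \]
then we can write the homogeneous LTI system (5.3) as
\begin{align}
\dot{\hat{z}}(\tau) &= V_s^T A z(\tau) = V_s^T A V_s \hat{z}(\tau) \approx H_s^{-1} \hat{z}(\tau), \tag{5.40a}\\
y(\tau) &\approx L V_s \hat{z}(\tau) + y_{ss}, \tag{5.40b}\\
\hat{z}(\tau_p) &= V_s^T (x_{p,0} - x_{ss}). \tag{5.40c}
\end{align}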
Here, using the orthonormal basis $V_s$, we project the original state vector of size q onto the
reduced state vector of size s to generate a reduced order model that captures the dominant
modes. The solution to (5.40a) is simply
\[ \hat{z}(\tau) \approx e^{H_s^{-1}(\tau - \tau_p)} \hat{z}(\tau_p). \tag{5.41} \]
From (5.39) and (5.40c), we can re-write (5.41) as
\[ V_s^T z(\tau) \approx e^{H_s^{-1}(\tau - \tau_p)} V_s^T (x_{p,0} - x_{ss}) \implies z(\tau) \approx V_s e^{H_s^{-1}(\tau - \tau_p)} V_s^T (x_{p,0} - x_{ss}). \]
Finally, using the change of variables (5.2), we obtain x(τ)
\[ x(\tau) \approx x_{ss} + V_s e^{H_s^{-1}(\tau - \tau_p)} V_s^T (x_{p,0} - x_{ss}). \tag{5.42} \]
Equation (5.42) is similar to (5.32), with the exception that here, we need to compute the
matrix exponential of a much smaller matrix (s ≪ q), which can be done efficiently using the
scaling and squaring method given in [78]. Note that $e^{A(\tau - \tau_p)} \approx V_s e^{H_s^{-1}(\tau - \tau_p)} V_s^T$, with
equality only when s = q or s = $s_{invr}$.
There is another way to estimate (5.32) using the reduced order model, which is slightly
more optimized. Note that we need to compute the matrix vector product $e^{A(\tau - \tau_p)}(x_{p,0} - x_{ss})$,
for which computing $e^{A(\tau - \tau_p)}$ explicitly is not required. The product can be computed directly
if, instead of input y being an arbitrary vector in Algorithm 1, we use $y = (x_{p,0} - x_{ss})$ [78].
Then, x(τ) can be obtained using
\[ x(\tau) \approx x_{ss} + \|x_{p,0} - x_{ss}\|_2\, V_s e^{H_s^{-1}(\tau - \tau_p)} e_1, \tag{5.43} \]
where $e_1 = [1\ 0\ \ldots\ 0]^T \in \mathbb{R}^s$. We will refer to (5.43) as the expm approximation.
5.6 Solvers that use the matrix exponential
The unique characteristic of the expm approximation is that it can directly compute the stress
profile of a tree/subtree at any given time-point τ ∈ [τp, τp+1), without stepping through time.
We will now present two numerical methods that utilize this unique characteristic to compute
the location and time of the next void nucleation in the tree.
5.6.1 Newton Solver
Let m be the set of indices assigned to all unfailed junctions (junctions with no void) in the
tree. Define $g_i(\tau) : \mathbb{R} \to \mathbb{R}$, $\forall i \in m$,
\[ g_i(\tau) \triangleq x_i(\tau) - x_{th}, \tag{5.44} \]
where $x_{th} = \Omega\sigma_{th}/(k_b T_m^\star)$. Another equivalent definition would be $g_i(\tau) \triangleq z_i(\tau) - z_{th,i}$, where
$z_{th,i} = x_{th} - x_{ss,i}$. Clearly $g_i(\tau) \le 0$, with it being 0 only when a junction fails at $\tau = \tau_f$ so
that $x_i(\tau_f) = x_{th}$. Then, for the time span $[\tau_p, \tau_{p+1})$, we can state the objective of finding the
next void nucleation time in the tree as the following problem
\[ \text{Find the minimum } \tau_f > \tau_p \text{ s.t. } g_i(\tau_f) = 0 \text{ for some } i \in m. \tag{5.45} \]
The index i associated with the minimum τf for which gi(τf ) = 0 gives the location of the
newly failed junction.
One way of solving (5.45) is to apply Newton's method to $g_i(\tau_f) = 0$ for
every unfailed junction in the tree. Newton's method is an iterative method in which the
function to be solved is linearized in the neighborhood of the present candidate solution using
the gradient (slope) to find the next candidate solution. If we use $\tau_f^k$ to denote the present
candidate solution, then the next candidate solution $\tau_f^{k+1}$ is obtained using
\[ \tau_f^{k+1} = \tau_f^k - \frac{g_i(\tau_f^k)}{\dot{g}_i(\tau_f^k)}. \tag{5.46} \]
To apply Newton's method, we have to evaluate $g_i(\tau)$ $\forall i \in m$, which can be done using the expm
approximation, and $\dot{g}_i(\tau) = \dot{x}_i(\tau)$, which is already known from the LTI system formulation.
Figure 5.1: Obtaining the next void nucleation time using the Newton solver. [Plot of stress (MPa) versus time (yrs); annotated points: [0.0 yrs, 434.7 MPa], [0.7 yrs, 498.9 MPa], [2.7 yrs, 559.8 MPa], [4.8 yrs, 587.7 MPa], [6.1 yrs, 598.3 MPa], [6.4 yrs, 600.0 MPa].]
The Newton iterations are terminated when the following two conditions are satisfied
\begin{align*}
|g_i(\tau_f^{k+1})| &\le \epsilon_{nt,abs}, \\
|\tau_f^{k+1} - \tau_f^k| &\le \epsilon_{nt,rel}\, \tau_f^k + \epsilon_{nt,abs},
\end{align*}
where $\epsilon_{nt,abs}$ and $\epsilon_{nt,rel}$ are the absolute and relative error tolerances provided by the user. A
typical Newton iteration to find the next void nucleation time is shown in Fig. 5.1. The blue
curve shows the actual stress evolution and the linearized models used by Newton's method are
shown by dashed orange lines. For this case, the solution was obtained in 5 iterations.
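The iteration and its two stopping conditions can be sketched for a single junction as follows. This is an illustration only: in the solver the stress and its derivative would come from the expm approximation and the LTI formulation, whereas here a made-up scalar exponential relaxation (the values 700, 400, 0.2 and the threshold 600 are arbitrary) stands in for them, and the function name is hypothetical.

```python
import math


def newton_nucleation_time(g, g_dot, tau0, eps_abs=1e-9, eps_rel=1e-9, max_iter=50):
    """Newton iteration tau <- tau - g(tau)/g_dot(tau) with the two stopping tests."""
    tau = tau0
    for _ in range(max_iter):
        tau_next = tau - g(tau) / g_dot(tau)
        converged_value = abs(g(tau_next)) <= eps_abs
        converged_step = abs(tau_next - tau) <= eps_rel * abs(tau) + eps_abs
        if converged_value and converged_step:
            return tau_next
        tau = tau_next
    raise RuntimeError("Newton iteration did not converge")


# Toy stress curve (assumed, not from the thesis): x(tau) = 700 - 300 exp(-0.2 tau),
# with threshold x_th = 600, so the exact crossing time is ln(3) / 0.2.
x_th = 600.0
g = lambda tau: 700.0 - 300.0 * math.exp(-0.2 * tau) - x_th
g_dot = lambda tau: 60.0 * math.exp(-0.2 * tau)
tau_f = newton_nucleation_time(g, g_dot, tau0=0.0)
```

Because the toy curve is concave and increasing, the iterates approach the root from the left, which is the same qualitative behaviour as the sequence of points annotated in Fig. 5.1.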
5.6.2 Predictor
Newton’s method uses a linear model to approximate the function gi(τ), or xi(τ), around the
candidate solution. However, once the stress values of a junction are determined for a few
time-points using the expm approximation, we can also use other higher order (possibly non-
linear) models for extrapolating the rest of the trend for the nearby time-points. This works in
practice because from experience, we know that (except for a small time-interval after the void
nucleation) stress is a slowly varying function of time, so that the dynamics of stress near the
known solutions can be approximated well enough. While various exponential or log functions
may be suitable, we have found empirically that the following power function template provides
a very good local temporal approximation
\[ x_i(\tau) = c\, \tau^{\,b + a \ln \tau}, \tag{5.47} \]
where a, b and c are parameters to be determined. Taking ln on both sides of (5.47), we get
\[ \ln(x_i(\tau)) = \ln c + (b + a \ln \tau) \ln \tau = \ln c + b \ln \tau + a (\ln \tau)^2. \tag{5.48} \]
Figure 5.2: Obtaining the next void nucleation time using Predictor. [Plot of stress (MPa) versus time (yrs); legend: actual stress evolution, TTF predictor fit, points used, estimated TTF; annotated points: [2.5 yrs, 556.6 MPa], [4.0 yrs, 579.6 MPa], [5.5 yrs, 593.7 MPa], [6.4 yrs, 600.0 MPa].]
Thus, $\ln(x_i(\tau))$ is a simple quadratic in $\ln \tau$, with a, b and $\ln c$ as the three coefficients. The
coefficients can be easily determined using regression analysis and least-squares fitting if
the value of $x_i(\tau)$ is computed for at least three time-points. Once the coefficients are known,
$\tau_f$ can be computed using the roots of the quadratic polynomial obtained by setting $x_i(\tau_f) = x_{th}$ in (5.48)
\[ \tau_f = \exp\left( \min\left( \frac{-b + \sqrt{b^2 - 4a \ln(c/x_{th})}}{2a},\; \frac{-b - \sqrt{b^2 - 4a \ln(c/x_{th})}}{2a} \right) \right). \tag{5.49} \]
We will refer to this technique as the Predictor because we are essentially using curve-fitting to
predict the junction failure time. The accuracy of the Predictor approach heavily depends on
the time-points chosen: if the actual junction TTF is close to the chosen time-points, then the
Predictor gives accurate results. Given that we do not know the failure times beforehand, we
use heuristics to choose the time-points at which to evaluate the stress profile using the expm
technique. Fig. 5.2 shows how (5.47) provides a local approximation to the stress evolution at
the given junction, which can then be solved to find the failure time.
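A minimal Python sketch of the Predictor idea follows, under simplifying assumptions: it fits the quadratic in ln τ through exactly three points (where least-squares fitting reduces to interpolation) and then applies (5.49). The function names and the synthetic template parameters (c = 500, b = 0.1, a = -0.005, threshold 600) are made up for illustration and are not taken from the thesis.

```python
import math


def fit_log_quadratic(taus, xs):
    """Fit ln x = ln c + b ln tau + a (ln tau)^2 through three (tau, x) points."""
    u0, u1, u2 = (math.log(t) for t in taus)
    y0, y1, y2 = (math.log(x) for x in xs)
    d01 = (y1 - y0) / (u1 - u0)          # first divided differences
    d12 = (y2 - y1) / (u2 - u1)
    a = (d12 - d01) / (u2 - u0)          # second divided difference
    b = d01 - a * (u0 + u1)
    ln_c = y0 - b * u0 - a * u0 * u0
    return a, b, ln_c


def predict_failure_time(a, b, ln_c, x_th):
    """Evaluate (5.49): exponential of the smaller root of the quadratic in ln tau."""
    disc = b * b - 4.0 * a * (ln_c - math.log(x_th))
    if a == 0.0 or disc < 0.0:
        raise ValueError("template does not reach the threshold")
    root = math.sqrt(disc)
    return math.exp(min((-b + root) / (2.0 * a), (-b - root) / (2.0 * a)))


# Synthetic data generated from the template itself (assumed parameters).
template = lambda tau: 500.0 * tau ** (0.1 - 0.005 * math.log(tau))
sample_taus = (2.0, 3.0, 4.0)
a, b, ln_c = fit_log_quadratic(sample_taus, [template(t) for t in sample_taus])
tau_f = predict_failure_time(a, b, ln_c, x_th=600.0)
```

On data that follows the template exactly, the fit recovers the parameters and the predicted time lands on the threshold crossing; on real stress curves the result depends on how close the sampled time-points are to the actual failure time, as noted above.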
5.7 Experimental Results
In this section, we will report the performance and accuracy of the proposed numerical methods,
by comparing them to a standard variable time-step Runge-Kutta method with the Butcher
tableau as given by Dormand and Prince [71] and as implemented in [56]. We will refer to this
solver as RK45, as it computes fourth- and fifth-order accurate solutions. The performance
comparison will be done in terms of run-time, and accuracy will be compared using error rate
plots and the estimated time and sequence of void nucleations.
Figure 5.3: Showing part of trees (a) T1 and (b) T2 used for comparing solvers. The orange dots show the junctions.
C++ implementations were written for all the proposed methods: VCBDF2-VCBDF6, the
Newton solver and the Predictor based solver. The size of all reduced order models computed
using the Arnoldi Process was chosen to be s = min(0.05q, 100), where q is the original size of
the LTI system, as we empirically found that this gave the best accuracy-speed trade-off. For
the comparison, we choose two trees from IBM power grid benchmarks. The first tree T1, shown
in Fig. 5.3a, is a structurally simple straight metal stripe, with 192 branches and 193 junctions
(2 diffusion barriers and 191 dotted-I junctions). The second tree T2, shown in Fig. 5.3b, has a
more complex structure and consists of 540 branches and 541 junctions (26 diffusion barriers,
494 dotted-I, 18 T and 3 plus junctions). The LTI models for both trees were generated with
N = 16 discretizations per branch. We used two machines with two different CPU architectures
for carrying out the simulations, as the relative performance of the proposed solvers seems to
vary depending on the CPU architecture.
We will first report the accuracy of solvers VCBDF2-VCBDF6 and the expm approximation
by computing the stress values at all junctions in T1 for specific time-points and comparing
the results with the reference solution obtained from the RK45 solver. The comparison is
done using error rate plots and is shown in Fig. 5.4a. The maximum percentage error is less
than 0.07% and 0.2% for all VCBDF solvers and the expm approximation, respectively, which
clearly demonstrates their accuracy. Fig. 5.4b shows the average absolute error between the
solutions. All the errors are of the order of $10^{-4}$ to $10^{-2}$ MPa, which is relatively small. For
VCBDF methods, the average error decreases as the order increases, which shows that higher
order VCBDF methods are more accurate. In terms of the average error, accuracy of expm
approximation is similar to VCBDF4 method. A similar trend is observed for tree T2, and some
other trees we tested. The accuracy of a given solver is independent of the CPU architecture.
For the next comparison, we simulate both trees using all solvers for a period of 20 years,
and collect various performance metrics as well as the time and sequence of junction failures.
The sequence of junction failures obtained using all solvers are identical. Fig. 5.5 shows the
percentage error between the junction TTFs estimated using the proposed solvers and the RK45
solver. Clearly, the errors are very small for all solvers, except for the Predictor, that has the
highest percentage error for both trees. This shows that the VCBDF solvers and the Newton
solver have high accuracy, with the Predictor having medium to low accuracy.
Figure 5.4: (a) Error rate plot for stress evolution at junctions as obtained using VCBDF2-VCBDF6 solvers and expm approximation and (b) the average absolute error with respect to RK45 solver. [Panels in (a): percent error (%) versus stress (MPa) for VCBDF2, VCBDF3, VCBDF4, VCBDF5, VCBDF6 and expm, with legend bounds of ±0.289, ±0.077, ±0.045, ±0.020, ±0.099 and ±0.807 MPa, respectively. Bar values in (b), in MPa: VCBDF2 1.3435e-02, VCBDF3 2.5455e-03, VCBDF4 1.3663e-03, VCBDF5 1.2880e-04, VCBDF6 1.1936e-04, expm 1.2950e-03.]
Figure 5.5: Percentage error in the estimated TTFs of (a) T1 and (b) T2 using the proposed solvers and RK45 solver. [Bar charts of error (%) per solver for successive junction failures: 1st to 3rd for T1 and 1st to 4th for T2; panel (b) includes a close-up view of the error in the VCBDF methods.]
Table 5.1 compares the performance of the proposed solvers and RK45 using the following
metrics:
- Total Steps: the number of successful time-steps taken by the solver to simulate the tree
up to 20 years.
- Failed Steps: the number of time-steps rejected by the solver due to high error.
- f(z, τ) evals: the number of derivative evaluations $f(z_n, \tau_n) = A z_n$ at the present time
point $\tau_n$.
- LU's: The number of LU factorizations computed for A and $(h_{n+1} b_{-1} A - I)$.
- BF subs.: The number of backward-forward substitutions done using already computed
factorization of either A or $(h_{n+1} b_{-1} A - I)$.
- expm solves: The number of times expm approximation was used for computing x(τ).
- time taken: The time taken by the solver, in seconds, to simulate the corresponding tree
up to 20 years.
- speed-up: The speed-up obtained by the solver as compared to RK45 solver.
From the data, it is clear that except for one scenario (VCBDF6 for T1 on Core i7), the proposed
solvers are faster than RK45, sometimes by orders of magnitude. This is to be expected, as
RK45 is not optimized for the problem at hand while our VCBDF solvers benefit from the
time-stepping and other optimizations in Section 5.4. Among all VCBDF solvers, VCBDF2 is
the fastest solver, with VCBDF3 and VCBDF4 being a close second and third, respectively.
The solvers VCBDF5 and VCBDF6 are comparatively slow. The reason for the slowdown of
the higher order solvers can be attributed to the calculation of PLTE using (5.27): A VCBDFk
solver requires $A^{k+1}$ to compute the PLTE, which results in higher error norms for larger k
values (in our case, $\|A^{k_1} z\| \ge \|A^{k_2} z\|$ iff $k_1 \ge k_2$). This forces the higher order VCBDF solvers
to take smaller time-steps in order to maintain the solution accuracy. As a by-product, we get
more accurate solutions, as evident from the preceding accuracy comparison. Also, $A^{k+1}$ becomes
dense as k increases, so that calculation of the PLTE itself takes more time.
For a given problem, the performance of the VCBDF solvers is better on the Xeon CPU than
on the Core i7. Overall, the VCBDF solvers are ∼2.5x faster on the Xeon CPU as
compared to the Core i7 CPU. Given that all the performance metrics (number of steps, LU
factorizations etc.) are almost identical on both CPUs, the difference in performance stems
from faster LU factorization of $(h_{n+1} b_{-1} A - I)$ and correspondingly faster backward-forward
substitution on Xeon CPUs. For the given simulations, LU factorization of $(h_{n+1} b_{-1} A - I)$ and
the corresponding backward-forward substitution are respectively 5.5x and 8x faster on Xeon
as compared to Core i7, even though we use SuiteSparse [79, 80, 81, 82] to perform all sparse
matrix operations on both machines.
The performance of the Newton solver and the Predictor varies depending on the structure
of the tree and the machine architecture. For Xeon CPU, Newton solver outperforms all the
VCBDF solvers for T1 but is slower than VCBDF2-VCBDF5 for T2. On the other hand, on
the Core i7 CPU, the Newton solver is at least 1.5x faster than all VCBDF solvers for both
trees T1 and T2. The performance of the Predictor also follows a similar trend. Curiously
enough, this difference in performance comes from the difference in speed of backward-forward
substitution in the Arnoldi process on both machines: backward-forward substitution using
the LU factorization of A is 10x faster on Core i7 as compared to Xeon CPU. This makes the
Newton solver and the Predictor a viable option for machines with Core i7 CPU.
We also compute the empirical complexity of the fastest numerical methods. This is done
by increasing the value of N for a given tree (we do not use the runtimes from different trees to
compute the complexity because there are other factors that might affect the runtime, such as
the structure of the tree). The results are shown in Fig. 5.6. With almost linear complexities,
the VCBDF2 and VCBDF3 solvers appear to be scalable for large problem sizes.
Figure 5.6: Empirical complexity of VCBDF2 solver for trees (a) T1 and (b) T2, and VCBDF3 solver for trees (c) T1 and (d) T2, computed by using the fitting function time = $aN^b$, where b is the complexity. [Each panel plots time (secs) versus N.]
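One common way to obtain the exponent b in a fit of the form time = aN^b is ordinary least squares on the log-log data. The Python sketch below shows that approach; whether the thesis used exactly this procedure is not stated, the function name is hypothetical, and the runtimes are synthetic values generated from a known power law rather than measurements from Fig. 5.6.

```python
import math


def fit_power_law(ns, times):
    """Least-squares fit of log(time) = log(a) + b log(N); returns (a, b)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(t) for t in times]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx                       # slope in log-log space = empirical complexity
    a = math.exp(y_bar - b * x_bar)
    return a, b


# Synthetic runtimes (assumed): time = 0.01 * N^1.1 for a few discretization sizes.
ns = [8, 16, 32, 64]
times = [0.01 * n ** 1.1 for n in ns]
a_fit, b_fit = fit_power_law(ns, times)
```

An exponent close to 1 in such a fit corresponds to the near-linear scaling described above.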
Table 5.1: Comparison of solver metrics and runtime

Host CPU              Tree  Solver     Total    Failed  f(z,τ)    LU's   BF     expm    Time       Speed-
                                       steps    steps   evals            subs.  solves  taken (s)  up
Xeon E5-2687W 3GHz    T1    RK45       248246   15      1490252   –      –      –       43.38      –
                            VCBDF2     388      7       –         104    395    –       0.46       93.64x
                            VCBDF3     431      8       –         148    439    –       0.60       72.32x
                            VCBDF4     1048     15      –         212    1063   –       0.92       46.94x
                            VCBDF5     3261     28      –         1078   3289   –       5.47       7.93x
                            VCBDF6     7167     49      –         3874   7216   –       14.51      2.99x
                            Newton     –        –       24        1      489    28      0.36       120.51x
                            Predictor  –        –       –         1      1241   40      0.49       87.84x
                      T2    RK45       233907   8       1404550   –      –      –       151.31     –
                            VCBDF2     548      16      –         153    564    –       3.51       43.09x
                            VCBDF3     656      15      –         220    671    –       3.97       38.12x
                            VCBDF4     1034     22      –         365    1056   –       5.23       28.92x
                            VCBDF5     2301     42      –         683    2343   –       8.54       17.71x
                            VCBDF6     5815     66      –         2488   5881   –       24.88      6.08x
                            Newton     –        –       29        1      758    35      8.68       17.43x
                            Predictor  –        –       –         1      896    31      8.97       16.86x
Core i7 4770 3.4GHz   T1    RK45       248246   15      1490252   –      –      –       42.26      –
                            VCBDF2     388      7       –         104    395    –       1.34       31.63x
                            VCBDF3     429      8       –         148    437    –       1.49       28.35x
                            VCBDF4     1027     15      –         215    1042   –       3.09       13.68x
                            VCBDF5     3101     27      –         1084   3128   –       27.33      1.55x
                            VCBDF6     7044     48      –         3777   7092   –       59.73      0.71x
                            Newton     –        –       24        1      489    28      0.31       138.28x
                            Predictor  –        –       –         1      1241   40      0.47       89.85x
                      T2    RK45       233907   8       1404550   –      –      –       161.03     –
                            VCBDF2     548      16      –         153    564    –       4.96       32.45x
                            VCBDF3     662      15      –         220    677    –       6.10       26.41x
                            VCBDF4     1021     23      –         394    1044   –       8.09       19.90x
                            VCBDF5     2319     46      –         798    2365   –       14.45      11.14x
                            VCBDF6     5916     77      –         2764   5993   –       40.82      3.94x
                            Newton     –        –       29        1      758    35      3.07       52.48x
                            Predictor  –        –       –         1      896    31      3.06       52.69x
Chapter 6
Power Grid EM Checking
6.1 Introduction
In this chapter, we will present two approaches for estimating the mean time to failure (MTF) of
a power grid under the influence of electromigration. We will start by explaining what an early
failure is and why it impacts the power grid reliability, which will be followed by our approach
for determining branch temperatures using compact thermal models. Then, we will present
the two power grid EM checking approaches: the main approach and the filtering approach,
where the second approach improves over the first one by focusing the computation only on
the EM-susceptible trees in the grid. Finally, we will present the experimental results where,
among other things, we will compare 1) the power grid MTFs obtained using a calibrated
Black’s model and the Extended Korhonen Model and 2) the performance of our solvers in the
context of power grid EM checking.
6.2 Early Failures
A void nucleation at a junction typically increases the resistance of all connected branches.
However, in a power grid, which consists of multiple trees electrically connected to each other
by vias, a void nucleation may have another effect. Consider two trees in two consecutive
metal layers connected by a via as shown in Fig. 6.1a, the schematic representation of which
is shown in Fig. 6.1b. In this case, we have two junctions, one above and one below the via.
Depending on the direction of the current densities, a void might form above or below the via.
If a large enough void forms below a via, it might in some cases cause an open circuit failure
by disconnecting the via. This phenomenon is known as early failure and has been reported in
the literature [29]. It happens because the capping layer is not conductive; hence if the void
covers the entire cross-section of a via (as shown in Fig. 6.1b), there is no conductive path left
between the via and the tree below and the current in the via completely falls to 0. On the
other hand, voids that form above the via generally happen at the top of the line away from the
via, and so take a long time to completely fill the cross-section, and even then do not translate
to an open circuit because the current can continue to flow through the (high resistance) metal
liner. We will refer to these kinds of failure as conventional failures. Removal of a via, as it
happens during early failures, can have a severe impact on grid reliability and thus should be
accounted for in the EM analysis.
Figure 6.1: (a) An arrangement of two trees connected by a via taken from the power grid and (b) the corresponding schematic showing early and conventional failures.
6.3 Determining Branch Temperatures
Temperature affects EKM on the following three fronts:
1. The initial stress at t = 0 for any given tree is mainly due to the thermal stress, which is
strongly dependent on the initial temperature [see (3.2)]. A higher thermal stress often
leads to a smaller void nucleation time and vice-versa.
2. The diffusivity of branch bk, which primarily determines the time rate of change of stress,
depends on its temperature Tm,k [see (2.10)]. Diffusivity increases with increase in temper-
ature, so that the time rate of change of stress also increases with increase in temperature
and results in smaller void nucleation times.
3. The steady state void length depends on the thermal stress: higher thermal stress leads
to larger voids.
We have already seen in Section 3.7 that it is important to account for temperature variation
across a tree while estimating its EM degradation using EKM. We also saw that there is no
‘nominal temperature’ that can capture the effect of the actual temperature variation. As
such, it becomes important to determine the temperature profile of all trees across all layers
in the power grid for realistic EM assessment. We do this using the compact thermal models
(CTM) obtained using electro-thermal equivalence, as detailed in Section 2.8. We will now
briefly summarize the procedure for applying the CTM approach to determine the temperature
distribution of the whole power grid.
Each layer in the power grid is discretized into uniform volume elements called thermal
blocks [24]. Each thermal block represents an isothermal volume within a layer, and as such all
branches and junctions that reside within a thermal block have the same temperature. Since we
assume the atomic diffusivity to be the same throughout a branch, there can be no temperature
gradient within a branch. Hence, each branch is associated with only one thermal block. For
each block, we perform thermal analysis using CTMs [62] based on electro-thermal equivalence.
Recall that a CTM is a lumped thermal RC network, with heat dissipation modelled as a current
source, as shown in Fig. 6.2. Specifically, each thermal block is represented as a thermal node
connected to 6 resistors, a current source and a capacitor, and their values can be calculated
using (2.52).
Figure 6.2: Thermal modelling of power grid using CTMs.
Every layer has the same number of thermal blocks, which is chosen based on the required
resolution of the temperature distribution. In addition, we assume a convective boundary condition
[24] at the top and insulated boundary conditions at the four sides to model the heat transfer
between the power grid and the surroundings. The CTMs for the thermal blocks, combined with
the boundary conditions, give us a thermal grid that can be solved to find the temperature
distribution of the power grid [see (2.54)]. In our case, we are only interested in the steady-state
temperature distribution because temperature transients occur on a time scale that is
short compared to that of EM. Thus, we ignore the thermal capacitance and use the steady-state
temperature distribution in our analysis, which can be obtained by solving
$$G_T T_m(t) = i_{Ts}(t) + G_{T,0} T_{amb} \tag{6.1}$$
for Tm(t). This gives the temperature at every thermal node, and correspondingly for all
branches. All symbols in (6.1) are explained in Section 2.8. The total power dissipated in the
kth thermal block (iTs,k) is calculated as iTs,k = Pself heating + Plogic, where Pself heating is
the average power dissipated by Joule heating of the metal branches within the thermal block
and Plogic is the average heat dissipated by the underlying logic due to switching activity
and leakage currents. Note that Plogic contributes to the power dissipation of thermal blocks in the
lowest layer only.
Figure 6.3: (a) Heat map for Pself heating + Plogic and (b) temperature profile (in Kelvin) for the M1 layer in ibmpgnew2. Both panels are plotted over xcoord (mm) and ycoord (mm).
Figure 6.4: (a) Heat map for Pself heating + Plogic and (b) temperature profile (in Kelvin) for the M1 layer in PG7. Both panels are plotted over xcoord (mm) and ycoord (mm).
Figs. 6.3 and 6.4 show the heat maps for power consumption and the temperature
profiles computed using CTMs for the lowest metal layer M1 of power grids ibmpgnew2 and PG7,
respectively. The specifications for these grids are provided in Table 6.1. For PG7, the four bottom
layers are divided into 3×3 sub-grid islands, of which one island is switched off. This gives
rise to the power heat map shown in Fig. 6.4a. The corresponding temperature profile in
Fig. 6.4b also reflects this, with the temperature being lowest above the switched-off island.
6.4 Power Grid EM analysis approaches
6.4.1 Power Grid Model
As we saw in Chapter 4, the cutoff frequency for tree LTI models is less than 25 Hz. As
such, short-term transients with frequencies in MHz or GHz range typically experienced in
chip workloads do not play a significant role in EM degradation. Hence, and consistent with
standard practice in the field, we use an effective-current model [30], so that the grid currents
are assumed to be constant at some average (effective) value, at least during the void nucleation
phase. As per EKM, once a void nucleates, branch resistances change fairly quickly and the
currents change, also fairly quickly, to new effective values. Thus, between any two successive
void nucleations, the power grid has fixed currents, voltages, and conductances and so can be
modelled using a DC model as given in (2.21), which we restate here
G(t)v(t) = i, (6.2)
where G(t) is the time-varying (but piecewise-constant) conductance matrix, v(t) is the corre-
sponding time-varying (but piecewise constant) vector of node voltage drops and i is the vector
of average (effective) values of the current sources tied to the grid.
6.4.2 The Main Approach
We will use the mesh model [25, 47] to find the Mean Time to Failure (MTF), in which the
grid is deemed to fail not when the first void nucleates, but when enough voids have nucleated
so that the user-provided voltage drop threshold value has been exceeded at some grid node. The
voltage-drop threshold value for every grid node (or a subset of grid nodes) is captured in the
vector vth and ensures that there are no timing violations in the underlying logic as long as node
voltage drops are below the threshold. As a byproduct, however, this process also produces the
time when the first void nucleates, which helps us generate the MTF under a series model, in
which a grid is deemed to fail when the first void nucleates. We report the series model MTF
for comparison purposes.
Obtaining one grid TTF sample
We assume that the grid is undamaged (no voids) at t = 0 and that all node voltage drops are
less than vth, i.e. v(0) < vth. We calculate the initial temperature distribution at t = 0, which
gives the initial thermal stress profile for the trees and the branch diffusivities. A power grid
is a collection of interconnect trees. As such, to estimate the EM degradation of the grid, we
formulate the LTI system for every tree as shown in Section 4.2.4 and numerically integrate
them to obtain the stress at all junctions as a function of time. At this point our main objective
is to find the time and location of the next void nucleation among all junctions in all the trees.
Let nf be the next junction that fails and tf be its time of void nucleation. Then, to determine
nf and tf efficiently, we propose the following 3 step approach:
1. Sort : For every unfailed junction in a tree, we calculate a crude estimate of the junction
nucleation time by using a simple linear model with slope equal to the present time-rate
of change of stress at that junction. The trees are then sorted in ascending order by their
minimum junction nucleation time.
2. Simulate: We set nf = ∅ and tf = ∞, and start simulating the trees one by one as
determined by the ordering of the previous step, up to either the first junction failure or
tf , whichever is earlier. If a junction in a tree fails before tf , we update nf and tf so that
they always store the best estimate of next junction to fail and its TTF. When we finish
simulating all trees, nf and tf have the correct final values.
3. Synchronize: Since the sorting step is based on a crude estimate, we might have trees
that have been simulated to a time point greater than tf . The solutions for these trees are
no longer valid because the void nucleation at nf will change the current densities in the
power grid. Thus, we re-simulate all such trees to determine their stress profile at t = tf .
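The three steps above can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the class ToyTree, its closed-form exponential stress model, the fixed time step and the horizon t_max are all hypothetical stand-ins for the tree LTI models and solvers used in this work. Only the control flow (sort, simulate up to the current best tf, then re-simulate any tree that overshot tf) mirrors the description.

```python
import math

SIGMA_TH = 600.0  # critical stress threshold in MPa (Table 6.2)

class ToyTree:
    """Hypothetical tree: junction stress is s_ss - (s_ss - s0) * exp(-t / tau)."""

    def __init__(self, junctions):
        self.junctions = junctions   # list of (s0, s_ss, tau) tuples
        self.t_sim = 0.0             # time up to which this tree has been simulated

    def stress(self, j, t):
        s0, s_ss, tau = self.junctions[j]
        return s_ss - (s_ss - s0) * math.exp(-t / tau)

    def crude_ttf(self, t_now):
        """Linear extrapolation using the present stress and its time derivative."""
        best = math.inf
        for j, (_, s_ss, tau) in enumerate(self.junctions):
            s = self.stress(j, t_now)
            slope = (s_ss - s) / tau
            if slope > 0.0:
                best = min(best, t_now + (SIGMA_TH - s) / slope)
        return best

    def simulate(self, t_start, t_stop, dt=0.01):
        """Step forward until the first junction failure or t_stop."""
        t = t_start
        while t < t_stop:
            t = min(t + dt, t_stop)
            for j in range(len(self.junctions)):
                if self.stress(j, t) >= SIGMA_TH:
                    self.t_sim = t
                    return j, t
        self.t_sim = t
        return None, math.inf

def next_failure(trees, t_now, t_max):
    """Return ((tree index, junction index), t_f) of the next void nucleation."""
    # Sort: order the trees by their crude nucleation-time estimates.
    order = sorted(range(len(trees)), key=lambda i: trees[i].crude_ttf(t_now))
    # Simulate: never go past the best failure time found so far.
    n_f, t_f = None, math.inf
    for i in order:
        j, t = trees[i].simulate(t_now, min(t_f, t_max))
        if j is not None and t < t_f:
            n_f, t_f = (i, j), t
    # Synchronize: bring trees that overshot t_f back to t = t_f.
    if n_f is not None:
        for tree in trees:
            if tree.t_sim > t_f:
                tree.simulate(t_now, t_f)
    return n_f, t_f

trees = [ToyTree([(400.0, 700.0, 10.0)]),
         ToyTree([(400.0, 800.0, 5.0)]),
         ToyTree([(400.0, 550.0, 1.0)])]
print(next_failure(trees, 0.0, 50.0))
```

In this toy input the third tree has the smallest crude estimate but never reaches the threshold, so it is simulated to the horizon first and then brought back to tf in the synchronize step.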
In principle, the simulate and synchronize steps can be done using any of the proposed
solvers from the previous chapter. While the RK45 solver and all VCBDFk solvers perform
well at simulating a tree, the Newton solver and the Predictor have some shortcomings.
The Newton solver suffers from the same drawback as a general Newton method: if we do not
start sufficiently close to the true solution, convergence is not guaranteed. On the other
hand, it is difficult to determine the time points for which the Predictor gives a good
junction TTF estimate. Thus, the junction failure times obtained by the Predictor are usually
not accurate enough, which was evident in the experimental results of the last chapter. To
overcome these shortcomings, we combine the two solvers as follows: we first use the Predictor
to find the time of the next junction failure and then use the Newton solver to refine the estimated
failure time. This works well in practice because the solution of the Predictor is always
close to the true solution, and evaluating expm and the derivatives is cheap compared to
generating the reduced model itself. Hence, for power grid EM assessment, we will combine the
Newton and Predictor methods into one method, which we will refer to as the Predictor+Newton
method.
Once nf and tf are known, we calculate the steady state volume of the void using (3.12),
update the resistances of connected branches using (3.13) and compute the new voltage drops
and current density values. We then examine to see if the recently nucleated void leads to an
early failure, by checking the following two conditions: i) is the void located below a via (this
is determined using the power grid structure) and ii) is the void large enough to disconnect the
via. If both conditions are met, the void at nf leads to an early failure, so that we remove the
via from the power grid and update the voltage drops and the current density values. Then we
re-calculate the power dissipation for all thermal nodes, find the new temperature distribution
Figure 6.5: (a) Goodness-of-fit plot for normal distribution and (b) probability distribution function (pdf) using 200 mesh TTF samples from the ibmpg2 main approach. Panel (a) plots probability against time (yrs) for the TTF samples and the goodness-of-fit line; panel (b) plots the PDF against time (yrs), from simulation and from the fitted parameters.
and update the branch diffusivities. The LTI systems for all trees are updated as given in
Section 4.2.4. At this stage, we again need to find nf and tf , which can be done by repeating
the sort-simulate-synchronize steps.
The time of first void nucleation gives the TTF of the grid as per the series model. Due
to the increase in branch resistances, the voltage drops in the grid continue to increase as we
move forward in time. Each time we update the voltage drops, we check whether a voltage drop
violation has occurred anywhere. The earliest time when the voltage drop at any node exceeds
vth is the TTF of the grid as per the mesh model. As the power grid size increases, updating
voltage drops due to changing branch resistances becomes more computationally expensive,
which can limit the scalability of our approach. To overcome this problem, we update the voltage
drops using the Preconditioned Conjugate Gradient (PCG) method. At t = 0, we have to factorize
the conductance matrix in order to find the initial voltage drops. We use this initial factorization
as a preconditioner for the Conjugate Gradient (CG) method. This makes the voltage drop
updates efficient because the perturbation in G(t) due to void nucleations is small,
so the factorization of G(0) acts as an excellent approximate factorization of G(t) and
results in convergence within a few iterations.
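The effect of reusing the factorization of G(0) can be illustrated with a minimal, dense, pure-Python sketch. The resistor chain, its conductance values and all function names below are assumptions made for this example; a real power grid would use sparse matrices and a sparse factorization. The sketch factorizes G(0) once, halves one branch conductance to mimic a void, and solves the perturbed system with PCG, starting from the old voltage drops and using the old Cholesky factor as the preconditioner.

```python
import math

def cholesky(A):
    """Dense Cholesky factor L (lower triangular) with A = L L^T."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(r + 1):
            s = A[r][c] - sum(L[r][k] * L[c][k] for k in range(c))
            L[r][c] = math.sqrt(s) if r == c else s / L[c][c]
    return L

def chol_solve(L, b):
    """Solve (L L^T) x = b by forward and backward substitution."""
    n = len(b)
    y = [0.0] * n
    for r in range(n):
        y[r] = (b[r] - sum(L[r][k] * y[k] for k in range(r))) / L[r][r]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (y[r] - sum(L[k][r] * x[k] for k in range(r + 1, n))) / L[r][r]
    return x

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pcg(A, b, L, x0, tol=1e-10, max_iter=100):
    """PCG for A x = b, preconditioned with the Cholesky factor L of a nearby matrix."""
    x = list(x0)
    r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
    z = chol_solve(L, r)
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b))
    for it in range(max_iter):
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, it
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = chol_solve(L, r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

def chain_conductance(g_branches, g_pad):
    """Conductance matrix of a resistor chain tied to the supply at node 0."""
    n = len(g_branches) + 1
    G = [[0.0] * n for _ in range(n)]
    G[0][0] += g_pad
    for k, g in enumerate(g_branches):
        G[k][k] += g
        G[k + 1][k + 1] += g
        G[k][k + 1] -= g
        G[k + 1][k] -= g
    return G

g = [1.0] * 19                      # branch conductances (S), hypothetical values
i_src = [0.01] * 20                 # current sources (A), hypothetical values
L0 = cholesky(chain_conductance(g, 1.0))
v0 = chol_solve(L0, i_src)          # initial voltage drops from the direct solve
g[7] = 0.5                          # a void raises the resistance of one branch
v1, iters = pcg(chain_conductance(g, 1.0), i_src, L0, v0)
print(iters, v1[-1] - v0[-1])
```

Because a single void changes G by a low-rank term, the preconditioned matrix has very few eigenvalues different from one, which is why so few CG iterations are needed.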
Estimating grid MTF
To account for the random nature of EM degradation, we perform Monte Carlo random sampling
to estimate the MTF. In each Monte Carlo iteration, we assign new lognormally generated
diffusivities to all the branches in the grid. This effectively produces a new instance of the
whole power grid, which we refer to as a sample grid. Then, as stated in the previous section,
we simulate the sample grid to generate a TTF value based on the series model and another
based on the mesh model. With enough samples, we form two averages as our estimates of the
series MTF and the mesh MTF.
Let T be the RV that represents the statistics of the mesh TTF for this approach, then
the expected value of T, denoted by E[T], is the mesh MTF of the grid. Using goodness of fit
methods, it was found that the normal distribution is a good fit for T (see Fig. 6.5). Therefore,
we can use standard statistical sampling (Monte Carlo) [64] to find the value of E[T] to within a
user-specified error tolerance. The number of samples required for the Monte Carlo process to terminate
is given in (2.63). This stopping criterion ensures that we have (1 − ζ) × 100% confidence (e.g.
ζ = 0.05 for 95% confidence) that the relative error in the MTF estimate is less than a user-provided
relative error threshold ϵmc (e.g. ϵmc = 0.05 for a 5% relative error threshold).
Though this is the most accurate approach, numerically solving all the trees in the power
grid using the EKM can be computationally expensive. In this work, we use this approach only
on smaller grids and refer to it as the main approach. The results from this approach serve as
a benchmark of comparison for more optimized approaches.
6.4.3 Improved performance with Filtering
We will now present a method that drastically reduces the run-time with almost no impact on
accuracy. We will refer to this as the Filtering approach. For each sample grid, solving all the
trees up to the time of grid failure yields a specific sequence of void nucleation times in certain
trees that are of interest. In particular, all trees that nucleate their first void before the time
of grid failure are of interest to us. All trees that nucleate their first void after the grid failure
are inconsequential to the analysis, and we would do well to filter them out in the first place.
Unfortunately, we don’t know up-front which set of trees should be solved, and which can be
discarded. However, we can devise an approximate filtering scheme that indicates which subset
of trees will most likely nucleate a void before all the rest, and focus our computation on those
trees.
Finding the Active set
For a given sample grid, we restrict our attention to a subset of trees whose estimated first void
nucleation times are smaller than some threshold t = tm. We call this subset of trees the
active set and tm as the active set cutoff threshold. tm is a part of the Monte Carlo process. We
start with a sufficiently high value of tm, that is reduced as more TTF samples are obtained.
We will provide more details on tm later. Note that we don’t need to know the actual void
nucleation times for junctions in a tree, rather we only need to know if the first void in a tree
nucleates before tm. In addition, any filtering approach needs to be quick, or at least it should
be faster than simulating all trees in the grid to be considered viable. We now present some
filtering approaches that can be employed to speed-up the MTF estimation.
Steady State Filter In this approach, we compute the steady state stress profile of all the
trees using the respective LTI models. Any tree with a junction that has a maximum steady
state stress value larger than the critical stress threshold σth is included in the active set.
Figure 6.6: The idea for the expm filtering scheme, shown as stress (MPa) versus time (yrs) for Junction 1 and Junction 2. The dotted lines show the would-be stress evolution if the boundary conditions are not updated when stress reaches σth. Junction 1 fails before t = tm, Junction 2 fails after.
Riege-Thompson Filter As mentioned in Section 2.3.3, Riege and Thompson proposed an
analytical expression for stress evolution at a junction by replacing all its connected branches
with semi-infinite limbs [42]. After some algebraic manipulation, the TTF of a junction as
estimated by their model can be stated as
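$$t_f = \begin{cases} \left(\dfrac{\sigma_{th} - \sigma_0}{\rho q^*}\right)^2 \dfrac{\Omega \pi k_b T_m}{4B} \left(\dfrac{\sum_{b_k \in \mathcal{B}_p} \sqrt{D_{a,k}}}{\sum_{b_k \in \mathcal{B}_p} D_{a,k}\, j_k}\right)^2 & \text{if } \sum_{b_k \in \mathcal{B}_p} D_{a,k}\, j_k > 0, \\[2ex] \infty & \text{if } \sum_{b_k \in \mathcal{B}_p} D_{a,k}\, j_k \le 0. \end{cases} \tag{6.3}$$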
Recall that due to the assumption of semi-infinite limbs, Riege-Thompson model cannot account
for back stress generated due to EM, and thus it gives a conservative TTF estimate for a
junction. Based on this conservative approximation, the trees that are likely to nucleate their
first void before tm are declared to be part of the active set.
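As a rough illustration of how the junction TTF estimate of (6.3) could be evaluated, the following Python sketch uses the constants of Table 6.2 in SI units. The function name, the argument layout (a list of (Da,k, jk) pairs for the branches meeting at the junction) and the sample values in the usage line are assumptions made for this example. The ratio of sums is squared here so that the result has units of time and reduces to the single-branch semi-infinite-line result.

```python
import math

# Constants from Table 6.2, converted to SI units.
B_MODULUS = 135.21e9   # bulk modulus B (Pa)
OMEGA = 1.66e-29       # atomic volume (m^3)
K_B = 1.38e-23         # Boltzmann's constant (J/K)
Q_STAR = 8.0109e-19    # effective charge q* (C)
SIGMA_TH = 600e6       # critical stress threshold (Pa)
RHO = 2.1991e-8        # resistivity of copper (ohm*m)

def riege_thompson_ttf(sigma0, temp, branches):
    """Conservative junction TTF in seconds.

    sigma0   -- initial junction stress (Pa)
    temp     -- junction temperature Tm (K)
    branches -- list of (D_a, j) pairs: atomic diffusivity (m^2/s) and signed
                current density (A/m^2) of each branch connected to the junction
    """
    drive = sum(d * j for d, j in branches)
    if drive <= 0.0:
        return math.inf          # stress at this junction does not build up
    sqrt_sum = sum(math.sqrt(d) for d, _ in branches)
    prefactor = ((SIGMA_TH - sigma0) / (RHO * Q_STAR)) ** 2
    prefactor *= OMEGA * math.pi * K_B * temp / (4.0 * B_MODULUS)
    return prefactor * (sqrt_sum / drive) ** 2

# Hypothetical junction with two branches; all input values are illustrative only.
print(riege_thompson_ttf(400e6, 350.0, [(1e-20, 1e10), (2e-20, 5e9)]))
```

A tree would then be placed in the active set if the smallest such estimate over its junctions is below the cutoff threshold.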
expm based filter If the stress evolution at a junction is to cause void nucleation before time
tm, then that junction’s would-be stress value at tm should be higher than σth (see Fig. 6.6).
Here, the would-be stress value at tm denotes the hypothetical stress value at a junction if
the boundary conditions are not updated at the time of void nucleation. We use the expm
approximation to calculate the would-be stress profile of every tree at t = tm, and any tree with
junction stresses greater than σth at t = tm is included in the active set.
VCBDF2 based filter In this approach, we generate LTI models for all trees using a smaller
value of N , usually 8 or 10, and then use the VCBDF2 solver with relaxed tolerances to integrate
these coarse tree LTI models up to tm, or up to the time of its first void nucleation, whichever
is earlier. We include a tree in the active set if its first void nucleation time is less than tm.
We have already shown in Section 4.3 that the LTI models generated using smaller values of N
give less accurate but correct results.
The steady state filter and the Riege-Thompson filter are computationally very efficient, but
they are also very conservative so that the active set obtained using them usually consists of
many trees that will not fail before tm. The expm filter has very few false positives. However,
it assumes that the tensile stress at junctions is a monotonic function of time, which is not
always true and this makes the expm filter exclude trees where the stress overshoots σth before
dropping down to a value less than σth at t = tm. Finally, the VCBDF2 based filter provides a
good estimate of the active set, but it might not be as fast as the expm based filter because it
still has to step through time. Also, the performance of the VCBDF2 based filter depends on
tm: a higher value of tm usually slows down the VCBDF2 based filter. The performance of the
expm filter is independent of tm, as it does only one expm evaluation.
Estimating mesh MTF from limited samples
If the sample grid fails before tm, we obtain a sample TTF. On the other hand, it might be
the case that the sample grid hasn’t failed up to t = tm. In this case, we set the TTF sample
equal to tm, and such a sample is called a limited sample. Thus, in the Filtering approach, we
effectively sample from an RV T′ that has a maximum value of tm and the same normal distribution
as T for all t ≤ tm, where T is the RV that represents the statistics of the mesh TTF as obtained
from the main approach. The RV T′ is a limited RV, and has the following definition.
Definition 1. Let Y be a random variable (RV) with cumulative distribution function (cdf)
FY (t) and let l and u be two scalars with l < u and at least one of them finite. Then, RV Y′
is called a limited RV between limits l and u, with Y being the underlying RV, if it has the
following cdf [83]
$$F_{Y'}(t) = \begin{cases} 0 & t < l, \\ F_Y(t) & l \le t < u, \\ 1 & t \ge u. \end{cases} \tag{6.4}$$
In our case, T′ has a limited normal distribution with l = −∞ and u = tm, and the
underlying normal RV is T. A straightforward averaging of obtained TTF samples would give
us E[T′], which is not the mesh MTF of the power grid. Thus, in this section, we will derive a
relation to estimate E[T], the actual mesh MTF from the obtained samples.
Using the law of total expectation [84], we can write for T
E[T] = E[T|T ≤ tm]F (tm) + E[T|T > tm](1− F (tm)), (6.5)
where F (t) is the cdf of the normal RV T. We can also express E[T′] in similar terms
E[T′] = E[T′|T′ ≤ tm]F ′(tm) + E[T′|T′ > tm](1− F ′(tm)), (6.6)
where F ′(t) is the cdf of RV T′. From the definition of a limited RV, we know that F ′(tm) =
F (tm), E[T′|T′ ≤ tm] = E[T|T ≤ tm] and E[T′|T′ > tm] = tm. Hence, we can re-write (6.6)
as
E[T′] = E[T|T ≤ tm]F (tm) + tm(1− F (tm)). (6.7)
Subtracting (6.7) from (6.5), we get
E[T] = E[T′] + (E[T|T > tm]− tm)(1− F (tm))
= E[T′] + E[T− tm|T > tm](1− F (tm)). (6.8)
The term E[T − tm|T > tm] is the Mean Residual Life (MRL) of the power grid at t = tm, and
it can be shown that [85]
$$E[T - t_m \mid T > t_m] = \frac{1}{1 - F(t_m)} \int_{t_m}^{\infty} [1 - F(z)]\,dz. \tag{6.9}$$
Combining (6.8) and (6.9), we get
$$E[T] - E[T'] = \int_{t_m}^{\infty} [1 - F(z)]\,dz. \tag{6.10}$$
Define µ ≜ E[T], v² ≜ Var[T], µ′ ≜ E[T′] and (v′)² ≜ Var[T′]. Also, let Φ(t) be the cdf of a
standard normal distribution N (0, 1) (normal distribution with mean 0 and variance 1), so that
$$\Phi(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t} e^{-z^2/2}\,dz = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{t}{\sqrt{2}}\right)\right], \tag{6.11}$$
which can be computed using the erf() function available in most standard math libraries. Then, from the
definition of the cdf of a normal distribution, we can re-write (6.10) as
$$\mu - \mu' = \int_{t_m}^{\infty} \left(1 - \Phi\left(\frac{z - \mu}{v}\right)\right) dz. \tag{6.12}$$
The RHS of (6.12) can be integrated to give (see Appendix B for a step-by-step derivation)
$$\mu - \mu' = (\mu - t_m)\left(1 - \Phi\left(\frac{t_m - \mu}{v}\right)\right) + v\,\phi\left(\frac{t_m - \mu}{v}\right), \tag{6.13}$$
where φ(t) = e^(−t²/2)/√(2π) is the pdf of the standard normal distribution. Using (6.13), we could
have estimated µ from µ′ if the standard deviation v of the underlying normal were known. Unfortunately,
that is not the case. However, as we will show in the next subsection, we can estimate (with
some confidence) the value of F (tm), the cdf at t = tm, from the Monte Carlo experiments.
Thus, F (tm) is a known quantity. Let pf ≜ F (tm). Then, we can write
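$$p_f = \Phi\left(\frac{t_m - \mu}{v}\right) \implies v = \frac{t_m - \mu}{\Phi^{-1}(p_f)}, \tag{6.14}$$
where Φ−1 denotes the inverse cdf of a standard normal distribution, which can be evaluated
using the erfinv() function provided by many numerical libraries. Now, using (6.14) in (6.13), we get
$$\mu - \mu' = (\mu - t_m)\left(1 - p_f - \frac{\phi\left(\Phi^{-1}(p_f)\right)}{\Phi^{-1}(p_f)}\right),$$
which can be easily solved for µ
$$\mu = \frac{\mu' + (\kappa - 1)\,t_m}{\kappa}, \tag{6.15}$$
where κ is a function of pf
$$\kappa = p_f + \frac{\phi\left(\Phi^{-1}(p_f)\right)}{\Phi^{-1}(p_f)}. \tag{6.16}$$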
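The correction in (6.15) and (6.16) is easy to check numerically. The sketch below is a minimal illustration using Python's standard statistics.NormalDist; the function names and the numbers in the usage lines are assumptions for this example. It builds µ′ and pf from a known normal distribution using (6.13) and (6.14), and then recovers µ from them. The special case pf = 0.5, where Φ−1(pf) = 0 and tm coincides with µ, is handled separately because κ is undefined there.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal N(0, 1)

def kappa(p_f):
    """kappa from (6.16); requires 0 < p_f < 1 and p_f != 0.5."""
    z = STD_NORMAL.inv_cdf(p_f)
    return p_f + STD_NORMAL.pdf(z) / z

def mesh_mtf(mu_limited, p_f, t_m):
    """Recover mu = E[T] from mu' = E[T'] and p_f = F(t_m) using (6.15)."""
    if p_f >= 1.0:
        return mu_limited      # no sample was limited, so T' behaves like T
    if p_f <= 0.0:
        raise ValueError("all samples are limited; only mu > t_m is known")
    if p_f == 0.5:
        return t_m             # Phi^-1(0.5) = 0 means t_m coincides with mu
    k = kappa(p_f)
    return (mu_limited + (k - 1.0) * t_m) / k

# Illustrative check: start from a known normal T and a cutoff t_m (in years).
mu, v, t_m = 10.0, 1.5, 11.0
z = (t_m - mu) / v
p_f = STD_NORMAL.cdf(z)                                          # (6.14)
mu_limited = mu - ((mu - t_m) * (1.0 - p_f) + v * STD_NORMAL.pdf(z))  # (6.13)
print(mu_limited, mesh_mtf(mu_limited, p_f, t_m))
```

Since κ grows without bound as pf approaches 0.5, estimates of pf very close to 0.5 make the correction sensitive to sampling noise in pf.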
Modifying the Monte Carlo stopping criteria
In addition to finding µ from µ′, we also need to derive a new stopping criterion for the Monte
Carlo random sampling process to ensure that µ computed using (6.15) is estimated within
the user-specified tolerances, as in the main approach. In order to do this, we first
introduce the notions of a true value, an estimated value and the error in estimation. Let T′1, T′2, . . . , T′s
be s samples obtained from RV T′ using a Monte Carlo process, of which slim are limited
samples. Then, define
$$\hat{\mu}' \triangleq \frac{1}{s} \sum_{k=1}^{s} T'_k, \qquad \hat{p}_f \triangleq \frac{s - s_{lim}}{s}, \tag{6.17}$$
where µ̂′ is the estimated value of µ′ and p̂f is the estimated value of pf obtained using the
TTF samples. Using µ̂′ and p̂f in (6.15) we can calculate µ̂, the estimated value of µ. Note
that µ′, pf and µ are the true values, so that
$$\lim_{s \to \infty} \hat{\mu}' = \mu', \quad \lim_{s \to \infty} \hat{p}_f = p_f, \quad \text{and} \quad \lim_{s \to \infty} \hat{\mu} = \mu. \tag{6.18}$$
Then, the error in estimation can be written as
$$\delta\mu = |\hat{\mu} - \mu|, \quad \delta\mu' = |\hat{\mu}' - \mu'|, \quad \text{and} \quad \delta p_f = |\hat{p}_f - p_f|. \tag{6.19}$$
Similar to the main approach, we would like to stop the Monte Carlo process when we are
(1−ζ)×100% confident that the relative error in estimated MTF is less than some user provided
threshold ϵmc. This can be achieved if
$$\frac{\delta\mu_\zeta}{\mu} \le \epsilon_{mc} \iff \frac{\delta\mu_\zeta}{\hat{\mu}} \le \frac{\epsilon_{mc}}{1 + \epsilon_{mc}}, \tag{6.20}$$
where δµζ is the (1 − ζ) × 100% confidence bound on the estimation error δµ. In other words, this
means that the interval [µ̂ − δµζ , µ̂ + δµζ ] will contain µ (the true value) (1 − ζ) × 100% of the
time. To find δµζ , we apply propagation of errors [86] to (6.15)
$$\delta\mu_\zeta = \sqrt{\left(\frac{\partial\mu}{\partial\mu'}\,\delta\mu'_\zeta\right)^2 + \left(\frac{\partial\mu}{\partial\kappa}\,\delta\kappa_\zeta\right)^2}, \tag{6.21}$$
where δµ′ζ and δκζ are the (1 − ζ) × 100% confidence bounds on the estimation errors of µ′ and κ, respectively. δµ′ζ is
obtained from simulation, using the technique given in [83], and δpfζ , the corresponding bound for pf , can be calculated from the
TTF samples using [87]. The complete details are given in Appendix B. Here, we present
the final expression
$$(\delta\mu_\zeta)^2 = \frac{(\delta\mu'_\zeta)^2}{\hat{\kappa}^2} + \frac{z_{\zeta/2}^2\,(t_m - \hat{\mu}')^2\,\hat{p}_f (1 - \hat{p}_f)}{\hat{\kappa}^4 s}\left[1 + \left(1 + \frac{1}{2y^2}\right)^2\right], \tag{6.22}$$
when sp̂f ≥ 5 and s(1 − p̂f ) ≥ 5. Here, zζ/2 is the (1 − ζ/2)-quantile of N (0, 1), κ̂ is the
estimated value of κ using p̂f and y = Φ−1(p̂f )/√2. Similar to the main approach, we obtain
at least 30 TTF samples before starting to check the stopping criterion (6.20).
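The stopping check built from (6.15), (6.17), (6.20) and (6.22) can be sketched as follows. This is a hedged illustration, not the implementation used in this work: the function name and argument names are hypothetical, and the bound δµ′ζ on the mean of the limited samples is taken as an input (delta_mu_lim) because the technique of [83] used to obtain it is not reproduced here. Samples equal to tm are counted as limited samples.

```python
import math
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()  # standard normal N(0, 1)

def filtered_mc_check(samples, t_m, delta_mu_lim, zeta=0.05, eps_mc=0.05):
    """Return (converged, mu_hat, delta_mu) for the Filtering approach.

    samples      -- TTF samples of T' (limited samples are equal to t_m)
    delta_mu_lim -- assumed-given confidence bound on the error of the mean of T'
    """
    s = len(samples)
    if s < 30:
        return False, None, None             # too few samples to start checking
    s_lim = sum(1 for t in samples if t >= t_m)
    p_hat = (s - s_lim) / s                   # (6.17)
    if s * p_hat < 5 or s * (1.0 - p_hat) < 5:
        return False, None, None             # (6.22) is not applicable
    mu_lim_hat = fmean(samples)               # (6.17)
    z = STD_NORMAL.inv_cdf(p_hat)
    if z == 0.0:
        return False, None, None             # kappa is undefined at p_hat = 0.5
    kappa_hat = p_hat + STD_NORMAL.pdf(z) / z                     # (6.16)
    mu_hat = (mu_lim_hat + (kappa_hat - 1.0) * t_m) / kappa_hat   # (6.15)
    y = z / math.sqrt(2.0)
    z_half = STD_NORMAL.inv_cdf(1.0 - zeta / 2.0)
    var = (delta_mu_lim / kappa_hat) ** 2                         # (6.22), first term
    var += (z_half ** 2 * (t_m - mu_lim_hat) ** 2 * p_hat * (1.0 - p_hat)
            / (kappa_hat ** 4 * s)) * (1.0 + (1.0 + 1.0 / (2.0 * y * y)) ** 2)
    delta_mu = math.sqrt(var)
    converged = delta_mu / mu_hat <= eps_mc / (1.0 + eps_mc)      # (6.20)
    return converged, mu_hat, delta_mu

# Hypothetical samples (years): 30 failures before t_m = 11 and 10 limited samples.
samples = [8.0 + 0.1 * k for k in range(30)] + [11.0] * 10
print(filtered_mc_check(samples, 11.0, 0.3))
```

The sketch returns no estimate when no samples are limited, since in that case the standard stopping rule of the main approach applies instead.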
Final workflow of the Filtering approach
The workflow of the Filtering approach is very similar to that of the main approach, but with a few
key differences. First, instead of simulating all the trees, we determine the active set at t = 0
using the previously presented filters, and simulate only the trees in the active set
to obtain a grid TTF sample. In practice, we use a combination of filters. For example, we first
use the Riege-Thompson filter to remove trees that have their first failure later than ktm for
some k > 1, and then apply either the expm or the VCBDF2 based filter to finalize the active set.
Also, we usually add a slack ∆tm > 0 to tm in deciding the active set, so that any tree that has
a first failure time less than tm + ∆tm becomes a part of the active set. The slack ∆tm not
only results in a conservative active set, but it also ensures that we do not miss any tree
due to the error incurred by using a reduced order or coarse LTI model. The slack also serves
another important purpose: trees that were not included in the active set
at t = 0 may become eligible to be a part of it due to the change in current densities caused by
previous junction failures. Usually, this behaviour is observed for trees that were excluded
from the active set by a small margin¹. Hence, including them in the active set safeguards
our approach against potential pitfalls.
¹It is rare for a tree whose first junction failure time was previously much greater than tm to suddenly
become eligible to be a part of the active set.
Second, the mesh MTF is calculated using (6.15), and the stopping criterion given
in (6.20) and (6.22) is used. If the final value of tm is such that 0 < p̂f ≤ 1, then (6.15)
is used to calculate the mesh MTF. If p̂f = 0, (6.15) cannot be used to estimate the MTF µ
because p̂f = 0 =⇒ Φ−1(p̂f ) = −∞, so that κ̂ = 0 and (6.15) gives µ̂ = ∞. This scenario
can happen if the value of tm is so small to begin with that all mesh TTF samples are limited
samples. In this case, we can only say with certainty that µ > tm. On the other hand, if p̂f = 1,
none of the samples obtained are limited by tm, and thus the standard Monte
Carlo stopping criterion (2.63) can also be applied.
Figure 6.7: Variation of p2 with sample number. The plot shows p2 (vertical axis, 0 to 1) against sample number (horizontal axis).
Third, we now have an ‘extra’ Monte
Carlo parameter tm, whose value needs to be
decided. The value for tm should be chosen
carefully. If the value of tm is high, then we
will waste a lot of time simulating trees that
would never fail before grid failure and our
performance will suffer. On the other hand,
a lower value of tm reduces the run-time, but
we found from experiments that it results in
slower Monte Carlo convergence and would
require more iterations. The sweet spot for
tm lies somewhere in between. Indeed, from
experiments we found that we get the best
runtime if tm is close to, but greater than the MTF µ. Alas, we obviously don’t know the MTF
beforehand. Hence, we use an adaptive strategy for determining tm. For the first few iterations,
we choose a sufficiently high value, so that the TTF samples are not limited by tm. Then, based
on the mean of the samples obtained so far we decrease the value of tm so that it gets closer to
the estimated mean. While many strategies for reducing tm based on the obtained samples are
possible, we found the following to be the most effective
$$t_{m,k+1} = \min\left\{ t_{m,k},\; p_1\left[ t_{m,k} + (\hat{\mu} - t_{m,k})\left(0.3 + \frac{0.4}{1 + e^{-p_2 (s - 10)}}\right)\right]\right\}, \tag{6.23}$$
where µ̂ is the MTF estimated from the samples obtained so far, p1 ≥ 1 is a safety factor and p2 is
the steepness of the logistic function, which varies with
the number of samples obtained as shown in Fig. 6.7. The overall flow of the Filtering approach
is shown in Fig. 6.8.
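The update rule (6.23) translates directly into a small function. In the sketch below the function and argument names are hypothetical, and p2 is passed in as a plain number because its dependence on the sample count (Fig. 6.7) is only available graphically; the values in the usage line are illustrative.

```python
import math

def update_tm(t_m, mu_hat, s, p1=1.0, p2=1.0):
    """One step of the adaptive cutoff update in (6.23).

    t_m    -- current active set cutoff threshold t_{m,k}
    mu_hat -- MTF estimated from the samples obtained so far
    s      -- number of TTF samples obtained so far
    p1     -- safety factor (p1 >= 1)
    p2     -- steepness of the logistic function (assumed supplied by the caller)
    """
    # The logistic term moves the blend weight from about 0.3 to about 0.7
    # as the sample count s passes 10.
    blend = 0.3 + 0.4 / (1.0 + math.exp(-p2 * (s - 10)))
    candidate = p1 * (t_m + (mu_hat - t_m) * blend)
    return min(t_m, candidate)   # the threshold is never increased

# Example: halfway point of the logistic (s = 10) moves t_m halfway towards mu_hat.
print(update_tm(20.0, 10.0, 10))
```

With few samples the threshold moves only about 30% of the way towards the estimated MTF, which limits the influence of an unreliable early estimate; with many samples it moves about 70% of the way.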
6.4.4 Parallelization using shared memory
The TTF estimates of the sample grids in different Monte Carlo iterations are independent
of each other, and thus the iterations can be parallelized. In our implementation, we use a multi-process
architecture, with each process bound to a separate core, to carry out different MC iterations
Figure 6.8: Flow chart showing the MTF estimation using the Filtering approach. EF stands for early failure.
in parallel (see Fig. 6.9). These processes use shared memory for inter-process communication.
The first process allocates and initializes the shared memory object, which contains 1) a ran-
dom number generator to be used for generating sample grids, 2) a table to store the results
generated by all MC iterations, 3) the time threshold tm required for finding the active set, 4)
the process IDs of all active processes using the shared memory and 5) several read-write locks
to synchronize the read/write accesses to the shared memory. All subsequent processes map
this memory space to their own address space.
At the beginning of every MC iteration, each process uses the shared random number
generator to generate a sample grid. When a process completes its MC iteration and obtains
a TTF sample, it writes the results to the shared table, updates the estimated MTF and tm
based on the TTF samples obtained so far from all the processes and checks if the stopping
criterion has been satisfied. If it is not satisfied, this process starts a new MC iteration. On the
other hand, if the stopping criterion is satisfied, this process sends an interrupt to all the other
Figure 6.9: Workflow for each process in our parallel implementation.
processes to signal the end of the task and then stops. Any time a process receives an interrupt,
it assumes that the stopping criterion has been satisfied in some other process, and thus stops.
The last process to stop deallocates the shared memory object.
6.5 Experimental Results
A C++ implementation was written to test the proposed electromigration assessment method-
ologies. Two types of test grids were used to verify our approach: IBM power grids [26] and
internal grids. The IBM power grids are a suite of 8 power grid benchmarks drawn from real
industrial designs. The largest IBM grid has around 720K electrical nodes. In order to sim-
ulate larger benchmarks, we use internally generated power grids. The internal grids are our
own non-uniform grids, synthesized as per user specifications, including grid dimensions, metal
layers, pitch, and width per layer. The current sources are randomly placed on the grid. The
technology specifications are consistent with 1V 45 nm CMOS technology. The details for
the grids are as given in Table 6.1. The grid names prefixed with ‘ibm’ are IBM power grid
benchmarks and the grids PG1-PG7 are internal grids, which go up to 4.1M nodes.
The interconnect material is assumed to be Copper, and the physical constants used for
simulation are listed in Table 6.2. The configuration parameters for the numerical solvers and
the Monte Carlo MTF estimation procedure are as given in Table 6.3. The size of the reduced
order models generated using the Arnoldi Process was chosen to be min(0.05q, 100), where q is
the size of the original LTI system. To make the presentation clear, Table 6.4 lists the notation
we will use in the later sections for presenting the experimental results. Similar to the last
chapter, we will use two different machines to carry out the experiments. The first machine
Chapter 6. Power Grid EM Checking 110
Table 6.1: Details of power grids used in experiments.
Grid Name   Metal Layers   Nodes       Branches    Junctions   Trees    C4s     Current Sources   v0,max (% vdd)
ibmpg1      2              6,085       10,853      11,562      709      100     5,387             4.4
ibmpg2      4              61,677      61,143      61,605      462      120     18,419            4.7
ibmpg3      5              410,011     401,412     409,601     8,189    494     100,527           4.4
ibmpg4      6              474,524     465,416     475,069     9,653    312     132,916           4.9
ibmpg5      3              248,838     495,656     497,658     2,002    100     236,600           4.7
ibmpg6      3              403,915     797,579     807,825     10,246   132     380,742           9.95
ibmpgnew1   6              315,951     698,101     717,629     19,528   494     178,965           4.8
ibmpgnew2   6              717,754     698,101     717,629     19,528   494     178,965           4.4
PG1⋆        8              36,862      36,189      36,862      673      9       2,448             4.9
PG2⋆        8              146,112     144,755     146,112     1,357    64      30,010            4.9
PG3         8              560,468     557,816     560,468     2,652    100     40,254            4.9
PG4         8              1,232,260   1,226,703   1,232,260   5,557    225     89,508            4.9
PG5         8              1,643,814   1,636,888   1,643,814   6,926    668     188,161           4.8
PG6         8              2,629,448   2,617,216   2,629,431   12,215   944     566,736           4.8
PG7         8              4,094,704   4,082,039   4,094,704   12,665   1,471   886,124           4.75
⋆ PG1 and PG2 will only be used for validating the filtering approach.
is a 12-core 3 GHz Linux machine with a Xeon CPU and 128 GB of RAM. The second machine
is a quad-core 3.4 GHz Linux machine with a Core i7 CPU and 32 GB of RAM. Unless stated
otherwise, all simulations are run on the first machine.
We carried out many experiments with the following objectives in mind: a) To validate the
Filtering approach as presented in Section 6.4.3 by comparing its results with the main approach
of Section 6.4.2, b) To verify the correctness of proposed numerical methods (VCBDFk and
Predictor+Newton) in the context of power grid EM checking by comparing their results to the
results obtained from the RK45 solver, c) To check the accuracy of Black’s model for power grid
EM verification and d) To show that accounting for early failures in a power grid EM verification
is important. Finally, we will report the speed-up obtained by our tool due to parallelization,
the break-up of time consumed by various parts of the code and study the overall scalability of
our approach.
Table 6.2: Table of physical constants
Symbol                   Description                                        Value
B                        Bulk modulus                                       135.21 GPa [88]
Ω                        Atomic volume                                      $1.66 \times 10^{-29}$ m$^3$
$k_b$                    Boltzmann’s constant                               $1.38 \times 10^{-23}$ J/K
$q^*$                    Effective charge                                   $8.0109 \times 10^{-19}$ C [89]
$\sigma_{th}$            Critical stress threshold                          600 MPa [20]
δ                        Thickness of void interface                        $10^{-9}$ m [67]
$T_{amb}$                Ambient temperature                                300 K (27 °C)
$T_{zs}$                 Stress free annealing temperature                  623 K [22, 90]
$\alpha_m - \alpha_{si}$ Difference in coefficients of thermal expansion    $1.068 \times 10^{-5}$ K$^{-1}$
$\rho_m$                 Resistivity of metal (Copper)                      $2.1991 \times 10^{-8}$ ohm·m
$\rho_b$                 Resistivity of barrier metal (Tantalum)            $1.7082 \times 10^{-7}$ ohm·m
Table 6.3: Configuration parameters to be used for evaluating all power grid benchmarks
Symbol                 Description                                                Value
N                      Number of discretizations per branch                       16
$\epsilon_{abs}$       Absolute error tolerance for ODE                           $10^{-6}$
$\epsilon_{rel}$       Relative error tolerance for ODE                           $10^{-3}$
ζ                      To ensure a (1 − ζ) × 100% confidence bound on MTF         0.05 (95% confidence)
$\epsilon_{mc}$        Maximum relative error bound on estimated MTF              0.05 (max. 5% error)
$v_{th}$               Voltage drop threshold for all nodes                       5% of vdd †
$\epsilon_{nt,abs}$    Absolute error tolerance for stopping Newton iteration     $10^{-8}$
$\epsilon_{nt,rel}$    Relative error tolerance for stopping Newton iteration     $10^{-5}$
$t_{m,0}$              Initial value of active set cut-off threshold              20 years
† For ibmpg6, we use 10% of vdd.
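The parameters ζ and ε_mc in Table 6.3 control when the Monte Carlo MTF estimation stops. The following is a minimal sketch of such a stopping rule, not the thesis implementation: it assumes a normal-approximation confidence interval, and the names `estimate_mtf`, `draw_ttf`, `min_samples` and `max_samples`, as well as the lognormal stand-in for a grid simulation, are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def estimate_mtf(draw_ttf, zeta=0.05, eps_mc=0.05, min_samples=30, max_samples=100_000):
    """Draw TTF samples until the (1 - zeta) confidence half-width is within eps_mc of the mean.

    Assumption: a normal-approximation confidence interval is used.
    """
    z = NormalDist().inv_cdf(1.0 - zeta / 2.0)  # about 1.96 for zeta = 0.05
    samples = []
    while True:
        samples.append(draw_ttf())
        n = len(samples)
        if n < min_samples:
            continue
        mtf = mean(samples)
        half_width = z * stdev(samples) / math.sqrt(n)
        # Stop once the relative error bound is met (or the sample budget is exhausted).
        if half_width <= eps_mc * mtf or n >= max_samples:
            return mtf, half_width, n


# Hypothetical stand-in for one Monte Carlo grid simulation: a lognormal TTF in years.
rng = random.Random(0)
mtf, half_width, n = estimate_mtf(lambda: rng.lognormvariate(math.log(10.0), 0.3))
print(f"MTF ~ {mtf:.2f} yrs +/- {half_width:.2f} yrs after {n} samples")
```

With ζ = 0.05 and ε_mc = 0.05, the loop ends once the 95% confidence half-width is at most 5% of the running mean.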
Table 6.4: Notation used to simplify presentation
Symbol         Description
$\mu_s^{x}$    Series MTF for a given grid estimated using x, where x is either a numerical method or an EM model.
$\mu_m^{x}$    Mesh MTF for a given grid estimated using x.
$t_P^{x}$      The time taken to estimate the mesh MTF using x with P parallel processes.
Figure 6.10: Comparing the main approach with the filtering approach for the first 5 grids (ibmpg1, ibmpg2, ibmpg5, PG1 and PG2), showing (a) 95% confidence bounds on the estimated MTF (yrs), and the TTF samples (yrs, in ascending order) obtained by each for (b) ibmpg2 and (c) ibmpg5.
6.5.1 Main Approach vs Filtering Approach
We will first verify that the filtering approach indeed leads to significant speed-ups with minimal
loss of accuracy as compared to the main approach. Table 6.5 compares the series and mesh
MTF as estimated using the main approach ($\mu_s^{\mathrm{main}}$ and $\mu_m^{\mathrm{main}}$, respectively) and the filtering
approach ($\mu_s^{\mathrm{flt}}$ and $\mu_m^{\mathrm{flt}}$, respectively). We could only test the main approach on six of the
smallest benchmarks using the VCBDF2 solver due to memory and runtime constraints. From
Table 6.5, it is clear that as the grid size increases, the filtering approach leads to larger speed-
ups with negligible loss in accuracy. Note that both these approaches were parallelized by using
12 processes to carry out the different Monte Carlo iterations simultaneously. In Fig. 6.10a, we
show the 95% confidence bounds on MTF as obtained using the main approach and the filtering
approach, which can be seen to be almost identical. In Fig. 6.10b and 6.10c, we show the sample
TTF values, sorted in ascending order, obtained by both approaches for ibmpg2 and ibmpg5
grids, respectively, in the process of estimating the MTF. Clearly, for these grids, the filtering
approach does a good job of identifying the EM-susceptible trees because the TTF samples
Table 6.5: Comparison of power grid MTF obtained using the Main Approach and the Filtering Approach.
            Main Approach                     Filtering Approach                Error (%)         Speed-up
Grid Name   µs (yrs)   µm (yrs)   t12 (mins)  µs (yrs)   µm (yrs)   t12 (mins)  Series   Mesh     $t_{12}^{\mathrm{main}} / t_{12}^{\mathrm{flt}}$
ibmpg1      3.39       7.08       2.80        3.39       6.99       0.87        0.03     1.30     3.23x
ibmpg2      6.60       11.94      9.36        6.63       11.98      1.36        0.53     0.35     6.86x
ibmpg5      4.44       6.35       43.14       4.45       6.34       2.44        0.29     0.14     17.65x
PG1         6.81       11.73      1.14        6.81       11.69      0.21        0.08     0.36     5.52x
PG2         2.55       6.07       58.50       2.57       6.17       2.75        0.56     1.64     21.26x
PG3∗        4.39       15.19      469.37      4.36       16.50      2.62        0.65     8.64     179.45x
∗ The MC process for the main approach could not converge within the set time limit for PG3.
from both approaches are almost the same. This demonstrates the value of the filtering approach, as
it makes MTF estimation using physics-based EM models scalable by focusing the computation
only on EM-susceptible trees. For all subsequent sections, we will use the filtering approach for
obtaining the MTF estimates.
6.5.2 Comparison of Performance and Accuracy between the solvers
Table 6.6 compares the performance and accuracy of the VCBDF2-VCBDF4 solvers presented
in Chapter 5 in the context of power grid EM checking by comparing their run-time and the
estimated MTFs with those of the RK45 solver. The MTF estimation for all simulations is
parallelized using 12 processes, one running on each core. We use only the three fastest solvers,
VCBDF2-VCBDF4, for this comparison. Since the VCBDF solvers have been optimized for
the problem at hand, they are very fast as compared to the standard RK45 solver. Overall, for
the given benchmarks, VCBDF2 is 39.6x faster, VCBDF3 is 31.9x faster and VCBDF4 is 22.2x
faster than RK45. The VCBDF solvers are also accurate, with the average percentage error
in MTF estimation across the board being only around 1%. As expected, the error in MTF
estimation decreases as we move towards the higher order solvers.
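As a check on how the average speed-up can be read, the short sketch below recomputes it for VCBDF2 from the runtimes in Table 6.6. It assumes the reported average is the arithmetic mean of the per-grid ratios over the 11 grids for which RK45 results exist; under that assumption it reproduces the 39.6x figure. The variable names are illustrative only.

```python
# Runtimes in minutes copied from Table 6.6 (the 11 grids for which RK45 results exist).
t_rk45 = [6.12, 35.93, 326.34, 336.43, 15.28, 237.72, 39.56, 62.56, 369.42, 426.61, 236.71]
t_vcbdf2 = [0.87, 1.36, 4.56, 8.01, 2.44, 16.21, 5.67, 4.88, 2.62, 9.47, 3.80]

# Per-grid speed-up of VCBDF2 over RK45, then the arithmetic mean across grids.
speedups = [rk / bdf for rk, bdf in zip(t_rk45, t_vcbdf2)]
avg_speedup = sum(speedups) / len(speedups)

print(f"average VCBDF2 speed-up over RK45: {avg_speedup:.1f}x")
print(f"largest per-grid speed-up: {max(speedups):.1f}x")
```

The per-grid ratios vary widely (from about 6x to about 141x), so the average is dominated by the grids on which RK45 is slowest.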
We test the Predictor+Newton solver on the second machine which has a Core i7 CPU
because, as we saw in the last chapter, these solvers had consistently better performance on
the Core i7 CPU as compared to the Xeon CPU. For the baseline, we will again use the results
obtained using the RK45 solver. Table 6.7 shows the comparison. In spite of using reduced
order models, the Predictor+Newton method is an accurate method, with the error in the series
and mesh MTF being only 1% and 1.84%, respectively. We also compare the speed-up obtained
by the Predictor+Newton solver with respect to the RK45 solver. We realize that this is not a
good comparison, since the runtimes are obtained on different machines and the RK45 solver
Table 6.6: Comparing the performance and accuracy of VCBDF2-VCBDF4 methods for power grid EM checking using RK45 as reference
                         RK45                       VCBDF2                     VCBDF3                     VCBDF4
Grid Name                µs      µm      t12        µs      µm      t12        µs      µm      t12        µs      µm      t12
                         (yrs)   (yrs)   (mins)     (yrs)   (yrs)   (mins)     (yrs)   (yrs)   (mins)     (yrs)   (yrs)   (mins)
ibmpg1                   3.51    7.04    6.12       3.39    6.99    0.87       3.55    7.06    1.26       3.53    7.06    1.42
ibmpg2                   6.71    11.91   35.93      6.63    11.98   1.36       6.63    11.98   1.43       6.63    11.98   1.86
ibmpg3                   4.56    7.02    326.34     4.57    6.96    4.56       4.56    6.83    6.45       4.54    6.99    10.58
ibmpg4                   8.82    17.05   336.43     8.83    16.83   8.01       8.65    17.08   10.64      8.65    17.08   15.45
ibmpg5                   4.52    6.17    15.28      4.45    6.34    2.44       4.43    6.33    2.95       4.43    6.33    3.42
ibmpg6                   5.58    11.27   237.72     5.61    11.40   16.21      5.61    11.25   21.03      5.61    11.24   29.13
ibmpgnew1                4.01    13.28   39.56      3.97    13.18   5.67       3.99    13.18   6.78       3.99    13.18   8.16
ibmpgnew2                4.58    7.18    62.56      4.62    7.21    4.88       4.63    7.20    5.51       4.62    7.22    5.97
PG3                      4.35    16.87   369.42     4.36    16.50   2.62       4.35    16.46   3.45       4.40    16.92   5.60
PG4                      3.60    10.43   426.61     3.60    10.36   9.47       3.61    10.29   10.34      3.64    10.46   13.85
PG5                      3.91    8.55    236.71     4.00    8.58    3.80       4.00    8.66    4.18       3.96    8.66    5.49
PG6                      –       –       –          3.23    14.87   10.95      3.28    14.57   13.75      3.22    14.89   22.48
PG7                      –       –       –          4.31    9.10    10.35      4.24    9.20    11.78      4.13    9.11    15.23
Average error/speed-up                              1.05%   1.08%   39.6x      1.01%   1.16%   31.9x      1.01%   0.69%   22.2x
Table 6.7: Comparison of the RK45 solver (run on the first machine) and the Predictor+Newton solver on the second machine (Quad-core i7@3.4GHz)
            RK45                              Predictor+Newton                  Error (%)         Speed-up
Grid Name   µs (yrs)   µm (yrs)   t12 (mins)  µs (yrs)   µm (yrs)   t4 (mins)   Series   Mesh     $t_{12}^{\mathrm{RK}} / t_{4}^{\mathrm{prnew}}$
ibmpg1      3.51       7.04       6.12        3.51       7.04       2.46        0.07     0.09     2.49x
ibmpg2      6.71       11.91      35.93       6.62       11.92      5.28        1.42     0.05     6.80x
ibmpg3      4.56       7.02       326.34      4.51       7.31       14.59       1.13     4.17     22.37x
ibmpg4      8.82       17.05      336.43      8.79       17.37      23.37       0.32     1.86     14.40x
ibmpg5      4.52       6.17       15.28       4.47       6.34       4.46        0.98     2.77     3.43x
ibmpg6      5.58       11.27      237.72      5.53       10.52      60.81       0.87     6.66     3.91x
ibmpgnew1   4.01       13.28      39.56       4.01       13.27      15.45       0.06     0.06     2.56x
ibmpgnew2   4.58       7.18       62.56       4.64       7.18       12.38       1.32     0.07     5.05x
PG3         4.35       16.87      369.42      4.30       17.29      8.42        1.29     2.52     43.87x
PG4         3.60       10.43      426.61      3.65       10.35      31.66       1.43     0.77     13.47x
PG5         3.91       8.55       236.71      3.99       8.66       12.67       2.09     1.24     18.68x
Average                                                                         1.00%    1.84%    12.46x
Table 6.8: Comparison of power grid MTF as estimated using Black’s model and Extended Korhonen’s model (with VCBDF2 solver).
            Black’s model                     EKM (using VCBDF2)                Comparison
Grid Name   µs (yrs)   µm (yrs)   t1 (mins)   µs (yrs)   µm (yrs)   t1 (mins)   $\mu_s^{\mathrm{ekm}}/\mu_s^{\mathrm{blk}}$   $\mu_m^{\mathrm{ekm}}/\mu_m^{\mathrm{blk}}$   $t_1^{\mathrm{blk}}/t_1^{\mathrm{ekm}}$
ibmpg2      2.33       5.07       2.620       6.63       11.98      11.12       2.85x    2.36x    0.24x
ibmpg3      2.58       5.72       28.292      4.57       6.96       46.56       1.77x    1.22x    0.61x
ibmpg4      2.50       5.28       36.812      8.83       16.83      64.40       3.53x    3.19x    0.57x
ibmpg5      2.25       3.58       2.249       4.45       6.34       20.52       1.98x    1.77x    0.11x
ibmpg6      1.37       1.54       4.818       5.61       11.40      121.28      4.08x    7.42x    0.04x
ibmpgnew1   1.63       3.33       6.557       3.97       13.18      50.58       2.44x    3.95x    0.13x
ibmpgnew2   1.78       6.10       46.387      4.62       7.21       31.26       2.60x    1.18x    1.48x
PG3         8.65       15.49      7.717       4.36       16.50      29.15       0.50x    1.07x    0.26x
PG4         3.25       6.01       16.935      3.60       10.36      75.76       1.11x    1.72x    0.22x
PG5         3.83       8.69       44.499      4.00       8.58       27.87       1.04x    0.99x    1.60x
PG6         3.70       9.10       105.294     3.23       14.87      76.67       0.87x    1.63x    1.37x
PG7         2.43       5.23       62.652      4.31       9.10       95.70       1.77x    1.74x    0.65x
Average                                                                         2.05x    2.35x    0.60x
is parallelized with 12 processes whereas the Predictor+Newton solver is only parallelized with 4
processes. Nevertheless, the Predictor+Newton solver is still 12.5x faster than the RK45 solver on average.
If we extrapolate the runtimes of the Predictor+Newton solver to 12 cores by dividing them
by a conservative scaling factor of 2.5, then this solver will be about as fast as the VCBDF3 solver.
Also, on the Core i7 CPU, we found that the Predictor+Newton method is on average 2.7x
faster than the VCBDF2 method.
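The extrapolation to 12 cores can be checked with a few lines of arithmetic. The sketch below uses the per-grid speed-ups from Table 6.7 and the scaling factor of 2.5 quoted in the text; dividing every Predictor+Newton runtime by 2.5 multiplies every speed-up, and hence the average, by 2.5. The variable names are illustrative only.

```python
# Per-grid speed-ups of Predictor+Newton (4 processes) over RK45 (12 processes), from Table 6.7.
speedups_4proc = [2.49, 6.80, 22.37, 14.40, 3.43, 3.91, 2.56, 5.05, 43.87, 13.47, 18.68]
SCALING_4_TO_12 = 2.5  # conservative 4-to-12 process scaling factor quoted in the text

avg_4proc = sum(speedups_4proc) / len(speedups_4proc)
avg_12proc_estimate = avg_4proc * SCALING_4_TO_12

print(f"average speed-up with 4 processes: {avg_4proc:.2f}x")
print(f"extrapolated speed-up with 12 processes: {avg_12proc_estimate:.1f}x")
```

The extrapolated value is close to the 31.9x average speed-up reported for VCBDF3 in Table 6.6, which is the basis for the comparison made above.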
From the results, we can see that using the VCBDF2 solver, the run-time for the most
difficult to solve grid (ibmpg6) is only around 16.2 minutes and the run time for the largest grid
(PG7) is around 10.4 minutes. This shows the scalability of our approach for large grids, which
has been made possible by a combination of optimized numerical methods and good filtering
techniques.
6.5.3 Black’s Model vs. EKM for grid MTF estimation
Table 6.8 lists the series and mesh MTFs obtained using Black’s model and the Extended
Korhonen’s model (EKM) proposed in this work. Columns $\mu_s^{\mathrm{blk}}$ and $\mu_m^{\mathrm{blk}}$ denote respectively
the estimated series and mesh MTFs when Black’s model was used to determine branch TTFs
[25, 47]. We calibrate Black’s model based on data obtained from Korhonen’s model, so that
for a finite line, the MTFs predicted by Black’s model and EKM are the same. The columns $\mu_s^{\mathrm{ekm}}$
and $\mu_m^{\mathrm{ekm}}$ list the series and mesh MTFs, respectively, estimated using the filtering approach
given in Section 6.4.3 with the VCBDF2 solver. From the table, we note that $\mu_s^{\mathrm{ekm}} > \mu_s^{\mathrm{blk}}$
for all grids except PG3 and PG6, and $\mu_m^{\mathrm{ekm}} > \mu_m^{\mathrm{blk}}$ for all grids except PG5 (for which it is
almost equal). On average, the mesh (series) MTF estimated using EKM is 2.35x (2.05x) longer
than that found using Black’s model, with the ratio $\mu_m^{\mathrm{ekm}}/\mu_m^{\mathrm{blk}}$ exceeding 3x for
several grids and reaching 7.42x for ibmpg6. This serves as evidence of how Black’s model can lead to over-design of grids.
Suppose that for grid ibmpg4, the target mesh MTF is 10 years. If the grid EM sign-off was
done using Black’s model, ibmpg4 would have failed the sign-off because its mesh MTF
estimated using Black’s model is 5.28 years. Thus, a designer would conclude that the lines need to be
widened in order to achieve the target MTF. However, taking into account the material
flow between connected branches, we can see that the grid survives for 16.8 years. In fact, the
designer can even reduce the cross-sectional area of branches to reduce metal usage, because
there is an extra reserve of more than 6 years. Thus, the use of Black’s model leads to over-design of grids
and leaves a lot of margin on the table.
Next, we compare the performance of the Black’s mesh MTF estimation engine with the
EKM engine. Since the code that estimates Black’s model is not parallelized, we report the
sequential run-times for our approach, whereby all Monte-Carlo iterations are performed in a
single process. Based on the comparison shown in the last column of Table 6.8, the runtime ratio
$t_1^{\mathrm{blk}}/t_1^{\mathrm{ekm}}$ averages 0.60x, so our approach is slower overall. However, the fact that our approach, which solves large PDE systems
for several trees, stays within this factor of the Black’s-model-based approach, which is only a simple
empirical model, is quite encouraging.
6.5.4 Effect of Early Failures
In order to assess the impact of early failures on the grid lifetime, we present a case study
using the ibmpg2 grid; we estimate its mesh MTF under two settings, one where early failure
detection is on and the other where early failure detection is turned off. As can be seen from
Fig. 6.11b, turning off early failures gives an optimistic MTF estimate which is 34% longer
than the actual MTF. Thus, if the target product lifetime is set as 15 yrs, this grid will fail
EM sign-off due to the impact of early failures, but would erroneously succeed if early failures
are ignored. The difference in MTFs stems from the influence of early failures on node voltage
drops. In Fig. 6.11a, we show how the maximum node voltage drop changes with time (for one
sample grid) as the voids nucleate due to EM. Since early failures lead to removal of a via, their
impact on voltage drops is more severe, which ultimately leads to shorter lifetimes. In general,
the effect of early failures gets more pronounced as the difference between the maximum initial
voltage drop and vth increases.
Statistical analysis of EM failures in copper interconnects often shows bimodal distributions
due to the presence of early failures [29]. A similar bimodal distribution can be observed in
Figure 6.11: Impact of early failures (EF) on (a) the maximum voltage drop (shown for one sample grid) and (b) estimated mesh MTF for ibmpg2. Maximum voltage drop at t = 0 is 3.8% vdd, and vth = 5% vdd. Annotated values in (b): MTF = 12.46 yrs with EF detection ON and MTF = 16.74 yrs with EF detection OFF.
Figure 6.12: Statistics of mesh TTF samples for the ibmpg2 grid show an underlying bimodal distribution for different modes of grid failure: (a) probability plot and (b) empirical pdf for Mode A, Mode B and all TTF samples. MTFA = 6.67 yrs, MTFB = 7.99 yrs, MTFall = 7.66 yrs.
the statistics for mesh TTF samples obtained using our power grid EM analysis. Consider the
following two failure modes for a given sample grid : Mode A, in which all junction failures
that lead to grid failure are early failures and Mode B, where at least one junction failure is
a conventional failure. Fig. 6.12a and Fig. 6.12b show, respectively, the probability plot and the
empirical pdf for the two failure modes obtained using 2500 mesh TTF samples from ibmpg2.
Since the pdfs for failure modes A and B have a lot of overlap, the overall distribution is almost
normal.
Figure 6.13: Bar chart comparing speed-ups obtained using 4, 8 and 12 parallel processes with respect to sequential code, for ibmpg1-ibmpg6, ibmpgnew1, ibmpgnew2 and PG3-PG7. Higher is better.
Figure 6.14: The figure shows how tm is updated for (a) ibmpg2 and (b) ibmpg5 with MC iterations for P = 1, 4, 8 and 12 parallel processes.
6.5.5 Speed-up due to parallelization
The speed-ups obtained with the multi-process architecture are shown in Fig. 6.13. All
speedups are calculated based on the sequential runtime in which all MC iterations are per-
formed in a single process. For 4 parallel processes, we obtained an average speed-up of 3.4x, for
8 parallel processes, we got a speed up of 5.7x and for 12 parallel processes, we got an average
speed up of 8.5x. The reason for this sub-linear speedup is the ‘slow’ update of tm in the parallelized
version as compared to the sequential version. Fig. 6.14 shows this phenomenon. Recall
that the initial value of tm is 20 years, and it is updated as more TTF samples are obtained.
A higher value of tm leads to a longer runtime for a given MC iteration, because more trees
are included in the active set. In the sequential version, by design only the first 5 Monte Carlo
iterations run with tm = 20 years, after which all subsequent iterations use updated values
of tm. On the other hand, for a parallelized version with P processes, the first P Monte Carlo
iterations will run with tm = 20 years. This affects the scalability of our parallel version and
results in sub-linear speed-ups.
Figure 6.15: Breakdown of the total runtime (in percent) consumed by different tasks in the code, shown for ibmpg2-ibmpg6, ibmpgnew1, ibmpgnew2 and PG5-PG7. Tasks: extract power grid trees; calculate v0 using cholmod; compute initial temperature distribution and prepare trees; generate sample grid and re-initialize data structures; find the active set; sort; simulate; synchronize; update voltage drops using PCG; other.
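To make the sub-linear scaling concrete, the reported average speed-ups can be converted into parallel efficiencies (speed-up divided by the number of processes). This is a simple illustrative calculation on the numbers quoted in this section; the variable names are hypothetical.

```python
# Average speed-ups reported in Section 6.5.5, keyed by the number of parallel processes P.
reported_speedup = {4: 3.4, 8: 5.7, 12: 8.5}

# Parallel efficiency: fraction of the ideal P-fold speed-up that is actually achieved.
efficiency = {p: s / p for p, s in reported_speedup.items()}

for p in sorted(efficiency):
    print(f"P = {p:2d}: speed-up {reported_speedup[p]:.1f}x, efficiency {100 * efficiency[p]:.0f}%")
```

The efficiency is 85% for 4 processes and roughly 71% for both 8 and 12 processes.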
6.5.6 Break-up of time consumed by different tasks in the code
In Fig. 6.15, we show the percentage of time consumed by different tasks in the code while
estimating the MTF using VCBDF2 solver with 12 parallel processes. These percentages are
based on the total time taken by the respective task across multiple calls. From the figure, it
can be seen that overall, the majority of the runtime is spent in doing the following three tasks:
1) finding the next junction failure using the Sort-Simulate-Synchronize steps (consumes ∼36%
on average across all grids), 2) finding the active set (consumes ∼35% on average) and
3) updating the node voltage drops using PCG after void nucleation(s) (consumes ∼16.5% on
average). The synchronize step consumes only around 0.19% of the runtime, which shows
that the sort step does a good job of ordering the trees. The other tasks (in Fig. 6.15) mainly
consist of updating the temperature distribution and the MTF estimate, checking the stopping
criteria and updating the power grid structure after void nucleation.
Figure 6.16: (a) $t_{12}^{\mathrm{BDF2}}$ (mins) vs. branch count for all test grids and (b) scalability analysis for grids that only have straight trees.
6.5.7 Overall scalability of the approach
Fig. 6.16a shows the run-times for all the grids we tested, plotted in ascending order of their
branch count. We use branch count as a measure of problem size because from (4.32), a
higher branch count leads to a larger LTI system. Overall, the runtime increases as we move
towards grids with higher branch count, but this increase is erratic because a host of other
factors, such as the geometry of the tree, the stiffness of the tree LTI system, the difference
between the maximum initial voltage drop and vth and the sensitivity of node voltage drops to
branch resistance values, also influence the runtime. If we select grids that have only straight
metal stripes as trees (i.e. no T or plus junctions), then we get a more consistent trend, as
shown in Fig. 6.16b. Computing the empirical complexity for these grids using the function
$t_{12}^{\mathrm{BDF2}} = a n^{b}$, with n being the number of branches and exponent b being the scalability factor, gives
a = 0.0052 and b = 0.5045. The reason for this sub-linear scalability can be mainly attributed
to the observation that the percentage of trees which become a part of the active set reduces
as the grid size increases, so that the increment in computation is less than the corresponding
increment in the problem size. If this trend continues for larger grids as well, and we ignore the
effect of other factors, then a simple calculation shows that a power grid with a billion branches
can be solved in around 3 hours.
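The "simple calculation" can be reproduced by evaluating the fitted power law at n = 10^9 branches. The sketch below uses the fitted constants quoted above, with the runtime in minutes as in Fig. 6.16; the function name is illustrative, and the result is only as reliable as the assumption that the trend continues.

```python
A_FIT = 0.0052   # fitted coefficient a (runtime in minutes)
B_FIT = 0.5045   # fitted exponent b (scalability factor)


def predicted_runtime_minutes(num_branches):
    """Evaluate the empirical model t = a * n**b for a grid with num_branches branches."""
    return A_FIT * num_branches ** B_FIT


minutes = predicted_runtime_minutes(1e9)
hours = minutes / 60.0
print(f"predicted runtime for 1e9 branches: {minutes:.0f} mins (about {hours:.1f} hours)")
```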
Chapter 7
Conclusions and Future Work
A well-designed power grid in Integrated Circuits (ICs) not only must perform as desired, but it
should also survive and function as intended for a target lifetime before failing. As the modern
designs become more complex and the structural dimensions of electronic interconnects become
ever-smaller due to technology scaling, electromigration (EM) has emerged as a major reliability
concern for modern on-die power grids. Modern power grids are huge, and can have up to a
billion nodes. Due to the scale of the problem, only the simplest EM methods have been used
so far for power grid EM analysis. State-of-the-art industrial EM checking tools use Black’s model
for branch failure combined with a series model for grid failure to determine system reliability
under the influence of EM. While this has served the purpose for the last 40 years, we are now at
a stage where the simplicity and pessimism of the EM tools, which were once their virtues, are
now acting against them. Technology scaling has increased branch current densities, which has
drastically reduced the EM lifetimes. Thus, the industrial EM tools, due to their pessimistic
approach, are now unable to provide any breathing room for designers, who are forced to overuse
metal resources in designing the power grids.
This necessitates an EM checking tool that moves away from the overly simplistic EM models
and is scalable so that it can be applied to large power grids. In this work, we developed such
an approach. We proposed the Extended Korhonen’s Model (EKM) that can track stress in
multi-branch interconnect trees of arbitrary geometry. We then showed that this model can be
expressed as an LTI system and developed fast and scalable numerical methods for solving these
LTI systems. Finally, we developed a scalable approach for power grid EM checking using a
filtering scheme that determines upfront the set of trees that are most likely to impact the failure
of a grid, and then focuses the computation on those trees. The techniques developed in this
work have allowed the EM verification of a 4.1M node grid using physics-based models in only
∼10 minutes, which has not been done before. The results and the studies done in this work
clearly demonstrate that Black’s model is inaccurate, and one should move to physics-based
models to estimate EM degradation accurately.
There are many avenues to further extend this work. A desirable extension would be to
develop a budgeting framework (akin to SEB for the series model) that will enable the chip
Chapter 7. Conclusions and Future Work 122
level reliability to be traded between different parts of the grid using the Extended Korhonen’s
model. Also, since we have an LTI system representation, there might be a way to bypass
the Monte Carlo iterations altogether by developing an effective statistical model that directly
evaluates the mesh model based MTF. Some other proposed extensions include incorporating
current crowding in our EM tool and solving the reverse problem: given a power grid and a
target MTF, how can we generate current constraints that guarantee the grid survival up to
the target MTF?
Appendices
Appendix A
Properties of system matrix A
In this section, we will provide the proof for Theorem 1. The proof depends on the properties of
1) irreducible and 2) non-negative matrices. As such, we will start with the definitions of such
matrices, and state the supporting theorems and lemmas that we will use to prove Theorem 1.
The proof also appeals to some simple graph theory concepts, which we will review a little later.
Wherever possible, we have grouped the definitions/concepts so that they precede the theorem
or lemma where they will be used, to make the presentation clear.
Definition 2. A square matrix $M = [m_{i,k}] \in \mathbb{R}^{n \times n}$ induces a directed graph Γ(M) whose
vertices are 0, 1, 2, . . . , n − 1, and whose directed edges are i → k if $m_{i,k} \neq 0$. We call Γ(M)
the directed graph of matrix M.
Definition 3. If there is a directed path in the graph from every vertex to every other vertex,
then the graph is said to be strongly connected.
Definition 4. A matrix $M = [m_{i,k}]$ is said to be irreducible if Γ(M) is strongly connected [91].
Lemma 1. A is an irreducible matrix for both pre-void and post-void phase.
Proof. For any subtree T, consider two weighted directed graphs G(T) and G′(T), where the
discretized points are the vertices and any two adjacent points have an edge between them. In
G(T), the direction of each edge is the same as the reference direction assigned to the branches
and in G′(T), the direction of each edge is opposite to the assigned reference direction, so that
G′(T) is the converse of G(T). For G(T) (G′(T)), the weight of a directed edge i → k (k → i)
between adjacent vertices is equal to $a_{i,k}$ ($a_{k,i}$), which can be determined using (4.8)-(4.12).
Fig. A.1b and A.1c show the graphs G(T) and G′(T) for the interconnect tree of Fig. A.1a. A
weighted adjacency matrix can be used to represent the connectivity of a graph. If a graph has
n nodes, then the adjacency matrix will be of size n × n. The entry in the ith row and kth
column of a weighted adjacency matrix is equal to the weight of edge i → k if it exists, otherwise it
is 0. Let $W = [w_{i,k}]$ and $W' = [w'_{k,i}]$ be the weighted adjacency matrices for graphs G(T) and
G′(T), respectively. Then, $w_{i,k} = a_{i,k}$ and $w'_{k,i} = a_{k,i}$. Now, we can write A as
\[ A = W + A_d + W', \tag{A.1} \]
Appendix A. Properties of system matrix A 125
Figure A.1: (a) A typical interconnect tree T with its corresponding graphs (b) G(T), (c) the converse G′(T) and (d) part of graph Γ(A) for any two adjacent points i and k. Here, N = 4 and the vertex at n1 is the root.
where Ad is simply a diagonal matrix whose ith diagonal entry is equal to ai,i, the ith diagonal
entry of A. From (A.1), it is clear that Γ(A) = G(T ) ∪ G′(T ) ∪ Γ(Ad), so that for any two
adjacent vertices i and k, Γ(A) has both edges i→ k and k → i, as shown in Fig. A.1d. Thus,
in Γ(A) there is always a path from every vertex to every other vertex. Hence Γ(A) is strongly
connected and A is irreducible.
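Lemma 1 can be checked numerically on a small example by testing whether the directed graph of a matrix is strongly connected. The sketch below is only an illustration: the 3 × 3 matrix is a hypothetical chain-like example with zero row sums (not taken from the state stamps (4.8)-(4.12)), and the helper names are assumptions. It uses the fact that a graph is strongly connected if and only if every vertex is reachable from vertex 0 in both the graph and its converse.

```python
from collections import deque


def reachable(matrix, start, converse=False):
    """Return the set of vertices reachable from start in the directed graph of matrix.

    An edge i -> k exists when matrix[i][k] != 0; with converse=True all edges are reversed.
    """
    n = len(matrix)
    seen = {start}
    queue = deque([start])
    while queue:
        i = queue.popleft()
        for k in range(n):
            entry = matrix[k][i] if converse else matrix[i][k]
            if entry != 0 and k not in seen:
                seen.add(k)
                queue.append(k)
    return seen


def is_irreducible(matrix):
    """A matrix is irreducible when its directed graph is strongly connected."""
    n = len(matrix)
    return len(reachable(matrix, 0)) == n and len(reachable(matrix, 0, converse=True)) == n


# Hypothetical chain of three discretized points with edges in both directions.
a_chain = [[-2.0, 2.0, 0.0],
           [1.0, -3.0, 2.0],
           [0.0, 4.0, -4.0]]
# Upper-triangular counterexample: vertex 0 cannot be reached from vertex 1.
a_triangular = [[1.0, 1.0],
                [0.0, 1.0]]

print(is_irreducible(a_chain), is_irreducible(a_triangular))
```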
Definition 5. A matrix $M = [m_{i,k}]$ is said to be diagonally dominant if $|m_{i,i}| \geq \sum_{k \neq i} |m_{i,k}| \ \forall i$.
Lemma 2. A is diagonally dominant for both pre-void and post-void phase.
Proof. From the state stamps (4.8)-(4.12), we have for the ith row in A
\[ |a_{i,i}| > \sum_{k=0,\,k \neq i}^{q} |a_{i,k}|, \quad \text{for a voided diffusion barrier}, \tag{A.2a} \]
\[ |a_{i,i}| = \sum_{k=0,\,k \neq i}^{q} |a_{i,k}|, \quad \text{otherwise}. \tag{A.2b} \]
Thus, A is diagonally dominant.
Figure A.2: All paths starting from the root and ending in a diffusion barrier for (a) G(T) and (b) the corresponding converse paths in G′(T).
We will now review some simple graph theory concepts that are required to state the proof.
For any two vertices i and k, if there exists a directed path from i to k, then i is said to be an
ancestor of k and k is a descendant of i. In addition, if i and k are adjacent points, then i is
the parent of k and k is the child of i. In G(T ), each vertex has at most one parent while in
G′(T), each vertex has at most one child. Note that there is only one vertex in G(T) that has
no parents. The same vertex in G′(T) has no children. We will designate this vertex as the root
for both G(T ) and G′(T ). Another concept that we will appeal to is a linear graph or a path.
A path is a tree where each vertex has at most one child or equivalently at most one parent.
Consider all paths in G(T ) that start from the root and end at a diffusion barrier, as shown in
Fig. A.2a. Clearly, the union of all such paths is equal to the graph itself. Since G′(T ) is the
converse of G(T ), the paths in G′(T ) are the converse of the paths in G(T ) (see Fig. A.2b).
We will now state all the remaining definitions and theorems, followed by the final proof.
Definition 6. A matrix M = [mi,k] is said to be non-negative if mi,k ≥ 0 ∀i, k.
Definition 7. Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the (real or complex) eigenvalues of a matrix $M = [m_{i,k}] \in \mathbb{R}^{n \times n}$. Then the spectral radius κ(M) is defined as
\[ \kappa(M) = \max\{|\lambda_1|, |\lambda_2|, \ldots, |\lambda_n|\}. \tag{A.3} \]
Theorem 2. (Perron-Frobenius theorem) Let M ∈ Rn×n and suppose that M is irreducible
and non-negative. Then κ(M) > 0 is a simple eigenvalue of M with an associated positive
eigenvector.
Definition 8. A matrix $M = [m_{i,k}] \in \mathbb{R}^{n \times n}$ is said to be irreducibly diagonally dominant if M
is irreducible, all its rows are diagonally dominant and there is at least one row i that satisfies
$|m_{i,i}| > \sum_{k=0,\,k \neq i}^{n-1} |m_{i,k}|$.
Theorem 3. An irreducibly diagonally dominant matrix is non-singular.
The proofs for theorems 2 and 3 have been provided in [91].
A.1 Proof of theorem 1
Part (a). We will first prove part 1(a), by proving the following statements for the system
matrix A of subtree $T = \{N, B\}$ in the pre-void phase:
(i) All eigenvalues of A have non-positive real parts.
(ii) There is exactly one eigenvalue at 0.
(iii) All eigenvalues of A are real.
Proving (i). From the Gershgorin disc theorem [78], all eigenvalues of A are located in the
union of q + 1 discs
\[ \bigcup_{i=0}^{q} \left\{ z \in \mathbb{C} : |z - a_{i,i}| \leq \sum_{k=0,\,k \neq i}^{q} |a_{i,k}| \right\} \equiv G(A). \tag{A.4} \]
From diagonal dominance, we always have $|a_{i,i}| \geq \sum_{k=0,\,k \neq i}^{q} |a_{i,k}|$ and $a_{i,i} < 0 \ \forall i$. Thus, G(A)
lies in the left half of the complex plane, touching the imaginary axis at the origin. Hence,
all eigenvalues of A have non-positive real parts.
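The Gershgorin argument can be illustrated on a small matrix: each disc is centred at a diagonal entry with radius equal to the sum of the absolute off-diagonal entries in that row, and for a real matrix the rightmost point of a disc is its centre plus its radius. The sketch below applies this to a hypothetical 3 × 3 pre-void-like matrix (negative diagonal, zero row sums); the matrix and helper names are assumptions made for illustration.

```python
def gershgorin_discs(matrix):
    """Return (centre, radius) for each row's Gershgorin disc of a real square matrix."""
    discs = []
    for i, row in enumerate(matrix):
        radius = sum(abs(value) for k, value in enumerate(row) if k != i)
        discs.append((row[i], radius))
    return discs


def discs_in_closed_left_half_plane(matrix):
    """True when no disc extends to the right of the imaginary axis."""
    return all(centre + radius <= 0 for centre, radius in gershgorin_discs(matrix))


# Hypothetical example: negative diagonal, diagonally dominant, zero row sums.
a_example = [[-2.0, 2.0, 0.0],
             [1.0, -3.0, 2.0],
             [0.0, 4.0, -4.0]]

discs = gershgorin_discs(a_example)
print(discs)
print(discs_in_closed_left_half_plane(a_example))
```

Here every disc touches the imaginary axis at the origin, which matches the pre-void case of equality in (A.2b).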
Proving (ii). In the pre-void phase, all the row sums in A are zero. Thus, we must have
at least one eigenvalue at 0. Indeed, we have $Ay = 0$ for $y = [1 \ 1 \ \ldots \ 1]^T$ or a multiple
thereof. Thus, y is an eigenvector for the 0 eigenvalue. Define
$$A_c = A + cI, \tag{A.5}$$
where $c = \max_i |a_{i,i}|$. Then, clearly $A_c$ is non-negative and irreducible because it is obtained by
only adding c to the diagonal entries of A (non-diagonal elements are unaffected). Also, if $\lambda_0 \ge \lambda_1 \ge \ldots \ge \lambda_q$
are the eigenvalues of A (including multiplicities; they are real, as shown in (iii)), then
$\lambda_0 + c \ge \lambda_1 + c \ge \ldots \ge \lambda_q + c$ are the eigenvalues of $A_c$. From (i), we know that all eigenvalues of A are non-positive, thus
$\lambda_0 = 0$ is the largest eigenvalue of A and $\lambda_0 + c = c$ is the largest eigenvalue of $A_c$. But, since
$A_c$ is non-negative and irreducible, we must have $\kappa(A_c) = c$. By the Perron-Frobenius theorem, c
is a simple eigenvalue of $A_c$. Hence, 0 is a simple eigenvalue of A.
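The shift in (A.5) can be made concrete with a short sketch. The matrix below is again a hypothetical example with zero row sums and is not derived from the state-stamps. The code forms $A_c$, confirms that it is non-negative, and confirms that the all-ones vector is an eigenvector of $A_c$ with eigenvalue c, which is the positive eigenvector that the Perron-Frobenius theorem associates with the spectral radius.

```python
# Hypothetical pre-void-like matrix: negative diagonal, zero row sums.
A = [[-2.0, 2.0, 0.0],
     [1.0, -3.0, 2.0],
     [0.0, 4.0, -4.0]]
n = len(A)

# c is the largest diagonal magnitude, as in (A.5).
c = max(abs(A[i][i]) for i in range(n))
A_c = [[A[i][k] + (c if i == k else 0.0) for k in range(n)] for i in range(n)]

# Multiply A_c by the all-ones vector; each entry is a row sum of A_c.
ones = [1.0] * n
A_c_times_ones = [sum(A_c[i][k] * ones[k] for k in range(n)) for i in range(n)]

is_nonnegative = all(A_c[i][k] >= 0.0 for i in range(n) for k in range(n))
ones_is_eigenvector = all(abs(value - c) < 1e-12 for value in A_c_times_ones)
print(is_nonnegative, ones_is_eigenvector)
```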
Proving (iii). We will prove this by showing that A is similar to a symmetric matrix. For
this, we restate (A.1)
$$A = W + A_d + W'. \tag{A.6}$$
Let $D = [d_{i,k}] \in \mathbb{R}^{(q+1) \times (q+1)}$ be a diagonal matrix ($d_{i,k} = 0$ if $k \ne i$) and $S \triangleq D^{-1}AD$ be a
matrix similar to A. Then
$$S = D^{-1}WD + D^{-1}A_dD + D^{-1}W'D = D^{-1}WD + A_d + D^{-1}W'D. \tag{A.7}$$
For S to be symmetric, we must have $S = S^T$, so that
$$D^{-1}WD + A_d + D^{-1}W'D = (D^{-1}WD + A_d + D^{-1}W'D)^T = DW^TD^{-1} + A_d + D(W')^TD^{-1}. \tag{A.8}$$
Note that by construction, we have $w_{k,i} \ne 0 \iff w'_{i,k} \ne 0$. Thus, the structure (sparsity
pattern) of W and $(W')^T$ is the same. This is to be expected because G′(T ) is the converse of
G(T ). Then, from (A.8), S will be symmetric if we can find a diagonal matrix D such that
$$D^{-1}WD = D(W')^TD^{-1}, \tag{A.9}$$
which in turn requires
$$w_{i,k}\frac{d_{k,k}}{d_{i,i}} = w'_{k,i}\frac{d_{i,i}}{d_{k,k}} \implies a_{i,k}\frac{d_{k,k}}{d_{i,i}} = a_{k,i}\frac{d_{i,i}}{d_{k,k}} \implies (d_{k,k})^2 = \frac{a_{k,i}}{a_{i,k}}(d_{i,i})^2. \tag{A.10}$$
Thus, if (A.10) is satisfied for all edges i → k in G(T ) and k → i in G′(T ), S will be
symmetric. To show that such a satisfying assignment is possible, consider a path in G(T ) that
starts from the root and ends at any diffusion barrier. For every edge P (k) → k in the path,
where P (k) denotes the (only) parent of vertex k in the path, we enforce the following condition
$$(d_{k,k})^2 = \left(\frac{a_{k,P(k)}}{a_{P(k),k}}\right)(d_{P(k),P(k)})^2, \tag{A.11}$$
that leads to the following transitive relation for any vertex k in the path
$$(d_{k,k})^2 = \left(\frac{a_{k,P(k)}}{a_{P(k),k}}\right)\left(\frac{a_{P(k),P(P(k))}}{a_{P(P(k)),P(k)}}\right) \cdots \left(\frac{a_{C(r),r}}{a_{r,C(r)}}\right)(d_{r,r})^2. \tag{A.12}$$
Here, r is the index of the root vertex and C(k) denotes the (only) child of vertex k in the path.
If we choose $d_{r,r} \ne 0$, then we can uniquely determine all $d_{k,k}$ values, corresponding to vertex k
in the path, as we traverse it starting from the root. By traversing all the paths starting from
the root and ending at diffusion barriers, we can determine the D matrix, such that S is symmetric.
A being similar to a symmetric matrix will have real eigenvalues.
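The construction of D can be sketched in a few lines of Python. The tree, the `parent` map and the entries of A below are hypothetical values chosen only for illustration; the sketch also assumes that $a_{k,P(k)}$ and $a_{P(k),k}$ have the same sign, so that the square root in (A.11) is real. Starting from $d_{r,r} = 1$ at the root, each $d_{k,k}$ follows from its parent's value, and the resulting $S = D^{-1}AD$ is checked for symmetry.

```python
import math

# Hypothetical 4-vertex tree: vertex 0 is the root, parent[k] is P(k).
parent = {1: 0, 2: 1, 3: 1}

# Hypothetical unsymmetric matrix whose sparsity pattern follows the tree.
A = [[-2.0, 2.0, 0.0, 0.0],
     [1.0, -4.0, 2.0, 1.0],
     [0.0, 3.0, -3.0, 0.0],
     [0.0, 5.0, 0.0, -5.0]]
n = len(A)

d = [0.0] * n
d[0] = 1.0  # any nonzero choice of d_{r,r} works
# In this example every parent index is smaller than its child's index,
# so visiting vertices in increasing order traverses root-to-leaf paths.
for k in sorted(parent):
    p = parent[k]
    d[k] = math.sqrt(A[k][p] / A[p][k]) * d[p]  # condition (A.11)

# Entry (i, k) of D^{-1} A D is a_{i,k} * d_k / d_i.
S = [[A[i][k] * d[k] / d[i] for k in range(n)] for i in range(n)]
is_symmetric = all(abs(S[i][k] - S[k][i]) < 1e-12
                   for i in range(n) for k in range(n))
print(is_symmetric)
```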
Part (b). In the post-void phase, the system matrix A will have at least one voided
diffusion barrier. Hence, there will be at least one row i that satisfies $|a_{i,i}| > \sum_{k=0, k \ne i}^{q} |a_{i,k}|$
[from (4.11)]. Thus, as per definition 8, A is irreducibly diagonally dominant and hence,
non-singular. Also, as we did for the pre-void phase, we can prove that all eigenvalues of A are
real and non-positive. However, since 0 cannot be an eigenvalue of A in the post-void phase, all
eigenvalues of A are negative real numbers.
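A small worked example may make Part (b) concrete. The 2×2 matrix below is a hypothetical illustration, not one generated by the state-stamps: its first row is only weakly diagonally dominant, while its second row is strictly dominant, so it is irreducibly diagonally dominant in the sense of definition 8.

```latex
A = \begin{bmatrix} -1 & 1 \\ 1 & -2 \end{bmatrix}, \qquad
\det A = (-1)(-2) - (1)(1) = 1 \neq 0, \qquad
\lambda_{1,2} = \frac{-3 \pm \sqrt{5}}{2} \approx -0.38,\ -2.62 .
```

Both eigenvalues are real and strictly negative, as the argument above predicts for this kind of matrix.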
A.2 Special Case
For a subtree that only has dotted-I junctions and diffusion barriers, we can also prove that
system matrix A has distinct eigenvalues. The proof relies on the following theorem, which has
been stated and proved in [92].
Theorem 4. Let $A \in \mathbb{R}^{n \times n}$ be a tridiagonal matrix
$$A = \begin{bmatrix}
a_1 & b_2 & & & \\
c_2 & a_2 & b_3 & & \\
 & \ddots & \ddots & \ddots & \\
 & & \ddots & \ddots & b_n \\
 & & & c_n & a_n
\end{bmatrix}.$$
Then A has n real and distinct eigenvalues if it satisfies the following three conditions for
i = 2, 3, . . . , n:
i) A is irreducible, i.e. $b_i c_i \ne 0$.
ii) A is diagonally dominant, i.e. $|a_i| \ge |b_i| + |c_i|$.
iii) $\mathrm{sign}(b_i c_i) = \mathrm{sign}(a_{i-1} a_i)$.
Lemma 3. If a subtree only has diffusion barriers and dotted-I junctions, all eigenvalues of
system matrix $A = [a_{i,k}]$ obtained using state-stamps (4.8)-(4.12) are real and distinct for both
pre-void and post-void phase.
Proof. Consider a subtree T that has only diffusion barriers and dotted-I junctions. Clearly, T
will have only two diffusion barriers at the two ends with multiple dotted-I junctions in between.
Without loss of generality, we will assume that the indices are ordered, so that the index of
a parent is always less than the index of its child. This imposes a complete ordering on the
assigned indices to the discretized points, so that if the leftmost diffusion barrier was chosen
as the root, the indices would increase as we go from left to right, with the rightmost diffusion
barrier having the largest assigned index. For such a case, the system matrix A = [ai,k] obtained
using state-stamps (4.8)-(4.12) will be tridiagonal. In addition it satisfies all conditions stated
in theorem 4:
i. A is irreducible (see Lemma 1)
ii. A is diagonally dominant (see Lemma 2).
iii. Since $a_{i,i} < 0 \ \forall i$ and $a_{i,k} > 0$ for $k \ne i$, we always have $a_{k,k} a_{k-1,k-1} > 0$ and $a_{k,k-1} a_{k-1,k} > 0$.
Thus, from theorem 4, all eigenvalues of A are real and distinct.
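The three conditions of theorem 4 are simple to test mechanically. The sketch below transcribes them literally from the statement above, using the theorem's 1-based indices; the helper names and the two test matrices are hypothetical and serve only as an illustration.

```python
def sign(x):
    return (x > 0) - (x < 0)


def tridiagonal_bands(A):
    """Extract the bands of a dense tridiagonal matrix as dicts keyed by the
    1-based indices of theorem 4: a[1..n], b[2..n], c[2..n]."""
    n = len(A)
    a = {j: A[j - 1][j - 1] for j in range(1, n + 1)}
    b = {j: A[j - 2][j - 1] for j in range(2, n + 1)}  # b_j: row j-1, column j
    c = {j: A[j - 1][j - 2] for j in range(2, n + 1)}  # c_j: row j, column j-1
    return a, b, c


def satisfies_theorem4(A):
    """Check conditions (i)-(iii) for i = 2, ..., n, exactly as stated."""
    a, b, c = tridiagonal_bands(A)
    for i in range(2, len(A) + 1):
        irreducible = b[i] * c[i] != 0
        dominant = abs(a[i]) >= abs(b[i]) + abs(c[i])
        same_sign = sign(b[i] * c[i]) == sign(a[i - 1] * a[i])
        if not (irreducible and dominant and same_sign):
            return False
    return True


# Hypothetical matrices: the second one violates condition (iii).
A_ok = [[-3.0, 1.0, 0.0],
        [1.0, -3.0, 1.0],
        [0.0, 1.0, -3.0]]
A_bad = [[-3.0, -1.0, 0.0],
         [1.0, -3.0, 1.0],
         [0.0, 1.0, -3.0]]
print(satisfies_theorem4(A_ok), satisfies_theorem4(A_bad))
```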
Appendix B
The math behind the Filtering approach
In this section, we will provide step by step details of the integration of (6.12) that leads to
(6.13). We will also show how we get the expression for $\delta\mu_\zeta$, the $(1 - \zeta) \times 100\%$ confidence
bound on µ.
B.1 Integration details
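Let us denote the RHS of (6.12) by I. Now using (6.11) in (6.12), we can write
$$I = \frac{1}{2}\int_{t_m}^{\infty}\left(1 - \mathrm{erf}\left(\frac{z - \mu}{v\sqrt{2}}\right)\right)dz. \tag{B.1}$$
Let $y \triangleq \frac{z - \mu}{v\sqrt{2}} \implies dz = v\sqrt{2}\,dy$ and $h \triangleq \frac{t_m - \mu}{v\sqrt{2}}$. Then, (B.1) becomes
$$I = \frac{v}{\sqrt{2}}\int_{h}^{\infty}\mathrm{erfc}(y)\,dy, \tag{B.2}$$
where we used $\mathrm{erfc}(y) = 1 - \mathrm{erf}(y)$. From the definition of erfc, we get
$$\frac{I\sqrt{2}}{v} = \int_{h}^{\infty}\left(\frac{2}{\sqrt{\pi}}\int_{y}^{\infty}e^{-u^2}du\right)dy. \tag{B.3}$$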
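Using integration by parts, we get:
$$\begin{aligned}
\frac{I\sqrt{2}}{v} &= \left[y\,\frac{2}{\sqrt{\pi}}\int_{y}^{\infty}e^{-u^2}du\right]_{h}^{\infty} - \int_{h}^{\infty}y\,\frac{d}{dy}\left(\frac{2}{\sqrt{\pi}}\int_{y}^{\infty}e^{-u^2}du\right)dy \\
&= \frac{2}{\sqrt{\pi}}\left[y\int_{y}^{\infty}e^{-u^2}du\right]_{h}^{\infty} - \int_{h}^{\infty}y\left(-\frac{2}{\sqrt{\pi}}e^{-y^2}\right)dy \quad \text{(using the Leibniz rule)} \\
&= \frac{2}{\sqrt{\pi}}\left[y\int_{y}^{\infty}e^{-u^2}du\right]_{h}^{\infty} - \frac{1}{\sqrt{\pi}}\left[e^{-y^2}\right]_{h}^{\infty} \quad \text{(integrated using the substitution } w = y^2\text{)} \\
&= \frac{2}{\sqrt{\pi}}\left[0 - h\int_{h}^{\infty}e^{-u^2}du\right] - \frac{1}{\sqrt{\pi}}\left[0 - e^{-h^2}\right] \\
&= -h\,\mathrm{erfc}(h) + \frac{1}{\sqrt{\pi}}e^{-h^2}.
\end{aligned}$$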
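By substituting h, and replacing the corresponding expressions for the standard normal cdf Φ(·)
and pdf φ(·), we get the final expression
$$\begin{aligned}
I &= -\frac{t_m - \mu}{2}\left[1 - \mathrm{erf}\left(\frac{t_m - \mu}{v\sqrt{2}}\right)\right] + \frac{v}{\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{t_m - \mu}{v}\right)^2\right] \\
&= (\mu - t_m)\left[1 - \frac{1}{2}\left(1 + \mathrm{erf}\left(\frac{t_m - \mu}{v\sqrt{2}}\right)\right)\right] + v\,\phi\left(\frac{t_m - \mu}{v}\right) \\
&= (\mu - t_m)\left[1 - \Phi\left(\frac{t_m - \mu}{v}\right)\right] + v\,\phi\left(\frac{t_m - \mu}{v}\right).
\end{aligned} \tag{B.4}$$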
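The closed form (B.4) can be sanity-checked against direct numerical integration of (B.1). The sketch below uses only the Python standard library; the parameter values, the truncation of the upper limit and the step count are arbitrary assumptions made for illustration.

```python
import math


def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def closed_form(mu, v, t_m):
    """Right-hand side of (B.4)."""
    a = (t_m - mu) / v
    return (mu - t_m) * (1.0 - Phi(a)) + v * phi(a)


def numerical(mu, v, t_m, steps=100_000, width=12.0):
    """Trapezoidal approximation of (B.1); the infinite upper limit is
    truncated at max(t_m, mu) + width * v, where the integrand is negligible."""
    upper = max(t_m, mu) + width * v
    dz = (upper - t_m) / steps

    def integrand(z):
        # 0.5 * (1 - erf(x)) written with erfc for accuracy in the tail.
        return 0.5 * math.erfc((z - mu) / (v * math.sqrt(2.0)))

    total = 0.5 * (integrand(t_m) + integrand(upper))
    for j in range(1, steps):
        total += integrand(t_m + j * dz)
    return total * dz


print(closed_form(10.0, 2.0, 8.0), numerical(10.0, 2.0, 8.0))
```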
B.2 Deriving confidence bound on µ
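Given that µ is a function of µ′ and κ, the estimation errors in µ′ and κ propagate to µ. Thus,
we can write using propagation of errors [86]
$$\delta\mu_\zeta = \sqrt{\left(\frac{\partial\mu}{\partial\mu'}\,\delta\mu'_\zeta\right)^2 + \left(\frac{\partial\mu}{\partial\kappa}\,\delta\kappa_\zeta\right)^2}, \tag{B.5}$$
where $\delta\mu_\zeta$, $\delta\mu'_\zeta$ and $\delta\kappa_\zeta$ are the $(1 - \zeta) \times 100\%$ confidence bounds for µ, µ′ and κ, respectively.
The partial derivatives in (B.5) can be easily calculated
$$\frac{\partial\mu}{\partial\mu'} = \frac{1}{\kappa} \quad \text{and} \quad \frac{\partial\mu}{\partial\kappa} = \frac{t_m - \mu'}{\kappa^2}. \tag{B.6}$$
Thus, if we determine $\delta\mu'_\zeta$ and $\delta\kappa_\zeta$, we can determine $\delta\mu_\zeta$.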
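Equations (B.5) and (B.6) translate directly into a short function. The sketch below is a minimal illustration; the function name, argument names and sample values are hypothetical.

```python
import math


def delta_mu(mu_prime, kappa, t_m, d_mu_prime, d_kappa):
    """Confidence bound on mu from (B.5), using the partial derivatives (B.6)."""
    dmu_dmu_prime = 1.0 / kappa
    dmu_dkappa = (t_m - mu_prime) / kappa ** 2
    # hypot(x, y) returns sqrt(x^2 + y^2).
    return math.hypot(dmu_dmu_prime * d_mu_prime, dmu_dkappa * d_kappa)


# Hypothetical values, for illustration only.
bound = delta_mu(mu_prime=9.0, kappa=0.5, t_m=10.0, d_mu_prime=0.1, d_kappa=0.01)
print(bound)
```

Since both partial derivatives scale with inverse powers of κ, a small κ amplifies the errors in both µ′ and κ.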
B.2.1 Finding δκζ
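Let $y = \Phi^{-1}(p_f)/\sqrt{2}$. Then, κ can be written as
$$\kappa = p_f + \frac{\phi(y\sqrt{2})}{y\sqrt{2}} = p_f + \frac{e^{-y^2}}{2y\sqrt{\pi}}. \tag{B.7}$$
Define $g(y) \triangleq e^{-y^2}/(2y\sqrt{\pi})$. Then, we can write $\kappa = p_f + g(y)$, with error $\delta\kappa_\zeta$ as
$$\delta\kappa_\zeta = \sqrt{(\delta p_{f\zeta})^2 + \left(\frac{\partial g}{\partial y}\,\delta y_\zeta\right)^2}, \tag{B.8}$$
where $\delta p_{f\zeta}$ is the $(1 - \zeta) \times 100\%$ confidence bound on $\delta p_f$. Now
$$\frac{\partial g}{\partial y} = \left(\frac{1}{2\sqrt{\pi}}\right)\frac{\partial(e^{-y^2}/y)}{\partial y} = -\frac{e^{-y^2}(2y^2 + 1)}{2y^2\sqrt{\pi}} \tag{B.9}$$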
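and
$$\begin{aligned}
y = \frac{\Phi^{-1}(p_f)}{\sqrt{2}} &\implies p_f = \Phi(y\sqrt{2}) \\
&\implies \delta p_{f\zeta} = \frac{d}{dy}\Phi(y\sqrt{2})\,\delta y_\zeta = \frac{1}{2}\frac{d}{dy}\left(1 + \mathrm{erf}(y)\right)\delta y_\zeta = \frac{1}{2}\left(\frac{2e^{-y^2}}{\sqrt{\pi}}\right)\delta y_\zeta \\
&\implies \delta y_\zeta = \frac{\sqrt{\pi}}{e^{-y^2}}\,\delta p_{f\zeta}.
\end{aligned} \tag{B.10}$$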
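Using (B.9) and (B.10) back in (B.8), we get
$$\delta\kappa_\zeta = \sqrt{(\delta p_{f\zeta})^2 + \left(\frac{e^{-y^2}(2y^2 + 1)}{2y^2\sqrt{\pi}} \times \frac{\sqrt{\pi}\,\delta p_{f\zeta}}{e^{-y^2}}\right)^2} = \delta p_{f\zeta}\sqrt{1 + \left(1 + \frac{1}{2y^2}\right)^2}. \tag{B.11}$$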
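From [87], the $(1 - \zeta) \times 100\%$ confidence bound on $\delta p_f$ after obtaining s samples is
$$\delta p_{f\zeta} \le z_{\zeta/2}\sqrt{\frac{\hat{p}_f(1 - \hat{p}_f)}{s}} \quad \text{when } s\hat{p}_f \ge 5, \ s(1 - \hat{p}_f) \ge 5, \tag{B.12}$$
where $z_{\zeta/2}$ is the $(1 - \zeta/2)$-percentile of the standard normal distribution and $\hat{p}_f$ is the estimated
value of $p_f$. Hence, for s samples, we can state the $(1 - \zeta) \times 100\%$ confidence bound on $\delta\kappa$ as
$$\delta\kappa_\zeta \le z_{\zeta/2}\sqrt{\frac{\hat{p}_f(1 - \hat{p}_f)}{s}\left[1 + \left(1 + \frac{1}{2y^2}\right)^2\right]}. \tag{B.13}$$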
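The bound (B.13) is straightforward to evaluate once an estimate of $p_f$ is available. The sketch below relies on `statistics.NormalDist` from the Python standard library for the inverse normal cdf; the function name and the sample inputs are hypothetical, and y is computed from the estimate of $p_f$, which is an assumption of this illustration. The bound is undefined when the estimate equals 0.5, because y is then zero.

```python
import math
from statistics import NormalDist


def delta_kappa_bound(p_f_hat, s, zeta):
    """Evaluate the right-hand side of (B.13) for s samples."""
    if s * p_f_hat < 5 or s * (1.0 - p_f_hat) < 5:
        raise ValueError("(B.12) requires s*p_f >= 5 and s*(1 - p_f) >= 5")
    std_normal = NormalDist()  # zero mean, unit standard deviation
    z = std_normal.inv_cdf(1.0 - zeta / 2.0)
    y = std_normal.inv_cdf(p_f_hat) / math.sqrt(2.0)
    d_pf = z * math.sqrt(p_f_hat * (1.0 - p_f_hat) / s)  # bound (B.12)
    return d_pf * math.sqrt(1.0 + (1.0 + 1.0 / (2.0 * y * y)) ** 2)


# Hypothetical inputs: 1000 samples, estimated p_f of 0.1, 95% confidence.
kappa_bound = delta_kappa_bound(p_f_hat=0.1, s=1000, zeta=0.05)
print(kappa_bound)
```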
B.2.2 Finding δµ′ζ
Because RV T′ has a limited normal distribution, the confidence intervals for the normal distribution
cannot be applied directly to calculate the confidence bounds in this case. Hence,
we use the technique presented in [83], which uses the notion of generalized confidence intervals
[93]. The procedure requires calculating the percentiles of generalized pivotal quantities (GPQ)
using simulation. The steps are as follows:
1. After obtaining s samples, estimate the mean µ using (6.15) and the standard deviation
v using
$$v = \frac{t_m - \mu'}{\Phi^{-1}(p_f)\,p_f + \phi\left(\Phi^{-1}(p_f)\right)}, \tag{B.14}$$
where (B.14) was found using (6.13)-(6.16).
2. Generate a large number of $(Z, U^2)$ samples, where Z is a sample from the standard normal
distribution and $U^2$ is a sample from the chi-squared distribution with s − 1 degrees of freedom.
3. For each sample $(Z, U^2)$, calculate the GPQs $Q_\mu$ and $Q_v$ for µ and v, respectively, using
$$Q_\mu = \mu - \frac{Z}{U/\sqrt{s - 1}}\,\frac{v}{\sqrt{s}}, \qquad Q_v = \frac{v}{U/\sqrt{s - 1}}. \tag{B.15}$$
4. For each $(Q_\mu, Q_v)$, calculate µ′ by substituting all occurrences of µ by $Q_\mu$ and v by $Q_v$ in
(6.13).
5. Sort all values obtained in the previous step in ascending order. The $100\frac{\zeta}{2}$th and $100(1 - \frac{\zeta}{2})$th
percentiles of the sorted values give us the $(1 - \zeta) \times 100\%$ confidence bounds $\mu'_{lb}$ and $\mu'_{ub}$,
respectively, from which $\delta\mu'_\zeta$ can be estimated.
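The five steps above can be sketched with the Python standard library. Two points in this sketch are assumptions rather than statements from the thesis: the function `limited_mean` is a stand-in for (6.13), taken here as the mean of min(T, t_m) for a normal T, which is consistent with (B.14) when $\Phi^{-1}(p_f) = (t_m - \mu)/v$; and the number of draws, the seed and all names are hypothetical. The chi-squared sample is drawn as a gamma variate with shape (s − 1)/2 and scale 2.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def limited_mean(mu, v, t_m):
    """ASSUMED form of (6.13): mean of min(T, t_m) for T ~ N(mu, v^2)."""
    a = (t_m - mu) / v
    return t_m - v * (a * _STD_NORMAL.cdf(a) + _STD_NORMAL.pdf(a))


def gpq_bounds(mu_hat, v_hat, s, t_m, zeta, draws=20000, seed=0):
    """Simulated (1 - zeta) * 100% confidence bounds on mu' (steps 2 to 5)."""
    rng = random.Random(seed)
    values = []
    for _ in range(draws):
        z = rng.gauss(0.0, 1.0)
        # U^2 ~ chi-squared(s - 1), i.e. Gamma(shape=(s - 1)/2, scale=2).
        u = math.sqrt(rng.gammavariate((s - 1) / 2.0, 2.0))
        scaled_u = u / math.sqrt(s - 1)
        q_mu = mu_hat - (z / scaled_u) * v_hat / math.sqrt(s)  # (B.15)
        q_v = v_hat / scaled_u                                  # (B.15)
        values.append(limited_mean(q_mu, q_v, t_m))
    values.sort()
    lower = values[int(math.floor(draws * zeta / 2.0))]
    upper_index = int(math.ceil(draws * (1.0 - zeta / 2.0))) - 1
    upper = values[min(draws - 1, upper_index)]
    return lower, upper


# Hypothetical estimates, for illustration only.
mu_lb, mu_ub = gpq_bounds(mu_hat=10.0, v_hat=2.0, s=50, t_m=11.0, zeta=0.05)
print(mu_lb, mu_ub)
```

Fixing the seed makes the simulated bounds reproducible; in practice the number of draws trades run time against the stability of the estimated percentiles.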
Bibliography
[1] S. Moreau and D. Bouchu, “Reliability of dual damascene tsv for high density integration:
The electromigration issue,” in 2013 IEEE International Reliability Physics Symposium
(IRPS), April 2013, pp. CP.1.1–CP.1.5.
[2] J. Warnock, “Circuit design challenges at the 14nm technology node,” in ACM/IEEE 48th
Design Automation Conference (DAC-2011), San Diego, CA, June 5-9 2011, pp. 464–467.
[3] M. Hauschildt, M. Gall, S. Thrasher, P. Justison, R. Hernandez, H. Kawasaki,
and P. S. Ho, “Statistical analysis of electromigration lifetimes and void evolution,”
Journal of Applied Physics, vol. 101, no. 4, p. 043523, 2007. [Online]. Available:
http://dx.doi.org/10.1063/1.2655531
[4] C. S. Hau-Riege, “An introduction to Cu electromigration,” Microelectronics
Reliability, vol. 44, no. 2, pp. 195–205, 2004. [Online]. Available:
https://doi.org/10.1016/j.microrel.2003.10.020
[5] C. L. Gan, C. V. Thompson, K. L. Pey, and W. K. Choi, “Experimental characterization
and modeling of the reliability of three-terminal dual-damascene Cu interconnect trees,”
J. Appl. Phys., vol. 94, no. 2, pp. 1222–1228, 2003.
[6] R. Monig, R. R. Keller, and C. A. Volkert, “Thermal fatigue testing of thin metal films,”
Review of Scientific Instruments, vol. 75, no. 11, pp. 4997–5004, 2004.
[7] B. Geden, “Understand and avoid electromigration (EM) and IR-drop in custom
IP blocks,” Synopsys, White Paper, November 2011. [Online]. Available:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.443.498&rep=rep1&type=pdf
[8] J. R. Black, “Electromigration - a brief survey and some recent results,” IEEE Transactions
on Electron Devices, vol. 16, no. 4, pp. 338–347, 1969.
[9] D. Frost and K. Poole, “A method for predicting VLSI-device reliability using series models
for failure mechanisms,” Reliability, IEEE Transactions on, vol. R-36, no. 2, pp. 234–242,
June 1987.
[10] J. Kitchin, “Statistical electromigration budgeting for reliable design and verification in
a 300-MHz microprocessor,” in VLSI Circuits, 1995. Digest of Technical Papers., 1995
Symposium on, June 1995, pp. 115–116.
[11] A. S. Oates, “Interconnect reliability challenges for technology scaling: A circuit focus,” in
2016 IEEE Int. Interconnect Tech. Conf. / Adv. Metallization Conf. (IITC/AMC), May
2016, pp. 59–59.
[12] C. K. Hu, D. Canaperi, S. T. Chen, L. M. Gignac, B. Herbst, S. Kaldor, M. Krishnan,
E. Liniger, D. L. Rath, D. Restaino, R. Rosenberg, J. Rubino, S. C. Seo, A. Simon, S. Smith,
and W. T. Tseng, “Effects of overlayers on electromigration reliability improvement for
Cu/low k interconnects,” in 2004 IEEE International Reliability Physics Symposium.
Proceedings, April 2004, pp. 222–228.
[13] R. Rosenberg and M. Ohring, “Void formation and growth during electromigration in
thin films,” Journal of Applied Physics, vol. 42, no. 13, pp. 5671–5679, 1971. [Online].
Available: http://scitation.aip.org/content/aip/journal/jap/42/13/10.1063/1.1659998
[14] M. Shatzkes and J. R. Lloyd, “A model for conductor failure considering diffusion
concurrently with electromigration resulting in a current exponent of 2,” Journal
of Applied Physics, vol. 59, no. 11, pp. 3890–3893, 1986. [Online]. Available:
http://scitation.aip.org/content/aip/journal/jap/59/11/10.1063/1.336731
[15] R. Kirchheim, “Stress and electromigration in Al-lines of integrated circuits,” Acta
Metallurgica et Materialia, vol. 40, no. 2, pp. 309 – 323, 1992. [Online]. Available:
http://www.sciencedirect.com/science/article/pii/095671519290305X
[16] M. A. Korhonen, P. Borgesen, K. N. Tu, and C.-Y. Li, “Stress evolution due to
electromigration in confined metal lines,” J. Appl. Phys., vol. 73, no. 8, pp. 3790–3799, Apr 1993.
[17] M. E. Sarychev, Y. V. Zhitnikov, L. Borucki, C.-L. Liu, and T. M. Makhviladze,
“General model for mechanical stress evolution during electromigration,” Journal
of Applied Physics, vol. 86, no. 6, pp. 3068–3075, 1999. [Online]. Available:
http://scitation.aip.org/content/aip/journal/jap/86/6/10.1063/1.371169
[18] V. Sukharev, E. Zschech, and W. D. Nix, “A model for electromigration-induced
degradation mechanisms in dual-inlaid copper interconnects: Effect of microstructure,”
Journal of Applied Physics, vol. 102, no. 5, pp. –, 2007. [Online]. Available:
http://scitation.aip.org/content/aip/journal/jap/102/5/10.1063/1.2775538
[19] R. de Orio, H. Ceric, and S. Selberherr, “Physically based models of
electromigration: From Black’s equation to modern TCAD models,” Microelectronics
Reliability, vol. 50, no. 6, pp. 775–789, 2010. [Online]. Available:
http://www.sciencedirect.com/science/article/pii/S0026271410000193
[20] X. Huang, T. Yu, V. Sukharev, and S. X.-D. Tan, “Physics-based Electromigration
Assessment for Power Grid Networks,” in ACM/EDAC/IEEE Design Automation Conf., June
2014, pp. 1–6.
[21] D.-A. Li, M. Marek-Sadowska, and S. Nassif, “A method for improving power grid resilience
to electromigration-caused via failures,” IEEE Trans. Very Large Scale Integr. (VLSI)
Syst., vol. 23, no. 1, pp. 118–130, Jan 2015.
[22] X. Huang, V. Sukharev, J.-H. Choy, M. Chew, T. Kim, and S. X.-D. Tan,
“Electromigration assessment for power grid networks considering temperature and
thermal stress effects,” Integration, the VLSI Journal, vol. 55, pp. 307–315, 2016. [Online].
Available: https://doi.org/10.1016/j.vlsi.2016.04.001
[23] D. A. Li, M. Marek-Sadowska, and S. R. Nassif, “T-VEMA: A temperature- and variation-
aware electromigration power grid analysis tool,” IEEE Transactions on Very Large Scale
Integration (VLSI) Systems, vol. 23, no. 10, pp. 2327–2331, Oct 2015.
[24] Y.-K. Cheng, P. Raha et al., “ILLIADS-T: an electrothermal timing simulator for
temperature sensitive reliability diagnosis of CMOS VLSI chips,” IEEE Trans. on Computer-Aided
Design of Integrated Circuits and Systems, vol. 17, no. 8, pp. 668–681, Aug 1998.
[25] S. Chatterjee, M. Fawaz, and F. N. Najm, “Redundancy-Aware Electromigration Checking
for Mesh Power Grids,” in IEEE/ACM Int. Conf. on Comput. Aided Design, San Jose,
CA, Nov. 2013, pp. 540–547.
[26] S. R. Nassif, “Power grid analysis benchmarks,” in ASP-DAC, 2008, pp. 376–381.
[27] Y.-L. Cheng, S. Y. Lee, C. C. Chiu, and K. Wu, “Back stress model on electromigration
lifetime prediction in short length copper interconnects,” in 2008 IEEE International
Reliability Physics Symposium, April 2008, pp. 685–686.
[28] B. Li, J. Gill, C. Christiansen, T. Sullivan, and P. S. McLaughlin, “Impact of via-line
contact on cu interconnect electromigration performance,” in IEEE Int. Rel. Phys. Symp.,
April 2005, pp. 24–30.
[29] E. T. Ogawa, K. D. Lee, H. Matsuhashi, K. S. Ko, P. R. Justison, A. N. Ramamurthi, A. J.
Bierwag, P. S. Ho, V. A. Blaschke, and R. H. Havemann, “Statistics of electromigration
early failures in Cu/oxide dual-damascene interconnects,” in 39th Annual IEEE Int. Rel.
Physics Symp. Proc., 2001, pp. 341–349.
[30] L. M. Ting, J. S. May, W. R. Hunter, and J. W. McPherson, “AC electromigration
characterization and modeling of multilayered interconnects,” in IEEE Int. Rel. Phys. Symp.,
March 1993, pp. 311–316.
[31] V. Sukharev, X. Huang, and S. X.-D. Tan, “Electromigration induced stress evolution
under alternate current and pulse current loads,” Journal of Applied Physics, vol. 118,
no. 3, p. 034504, 2015.
[32] K. Lee, “Electromigration recovery and short lead effect under bipolar- and unipolar-pulse
current,” in IEEE International Reliability Physics Symposium (IRPS), April 2012,
pp. 6B.3.1–6B.3.4.
[33] “Standard method for calculating the electromigration model parameters for current
density and temperature,” JEDEC Solid State Technology Association, Arlington, VA,
Standard, Feb 1998.
[34] I. A. Blech, “Electromigration in thin aluminium on titanium nitride,” Journal of Applied
Physics, vol. 47, no. 4, pp. 1203–1208, 1976, doi: 10.1063/1.322842.
[35] I. A. Blech and C. Herring, “Stress generation by electromigration,” Applied Physics
Letters, vol. 29, no. 3, pp. 131–133, 1976.
[36] I. A. Blech and K. L. Tai, “Measurement of stress gradients generated by electromigration,”
Applied Physics Letters, vol. 30, no. 8, pp. 387–389, 1977.
[37] A. Abbasinasab and M. Marek-Sadowska, “Blech effect in interconnects: Applications and
design guidelines,” in Proceedings of the 2015 Symposium on International Symposium on
Physical Design, ser. ISPD ’15. New York, NY, USA: ACM, 2015, pp. 111–118. [Online].
Available: http://doi.acm.org/10.1145/2717764.2717772
[38] J. Lloyd, “Black’s law revisited-Nucleation and Growth in Electromigration failure,”
Microelectronics Reliability, vol. 47, no. 9-11, pp. 1468–1472, 2007. [Online]. Available:
http://www.sciencedirect.com/science/article/pii/S0026271407003630
[39] M. Hauschildt, C. Hennesthal, G. Talut, O. Aubel, M. Gall, K. B. Yeap, and E. Zschech,
“Electromigration early failure void nucleation and growth phenomena in Cu and Cu(Mn)
interconnects,” in Reliability Physics Symposium (IRPS), 2013 IEEE International, April
2013, pp. 2C.1.1–2C.1.6.
[40] J. Lloyd and J. Kitchin, “The electromigration failure distribution: The fine-line case,” J.
Appl. Phys., vol. 69, no. 4, pp. 2117–2127, Feb 1991.
[41] V. Mishra and S. S. Sapatnekar, “The impact of electromigration in copper interconnects
on power grid integrity,” in Proceedings of the 50th Annual Design Automation Conference,
2013, pp. 88:1–88:6. [Online]. Available: http://doi.acm.org/10.1145/2463209.2488842
[42] S. P. Hau-Riege and C. V. Thompson, “Experimental characterization and modeling of
the reliability of interconnect trees,” J. Appl. Phys., vol. 89, no. 1, pp. 601–609, 2001.
[43] H.-B. Chen, S.-D. Tan, V. Sukharev, X. Huang, and T. Kim, “Interconnect reliability
modeling and analysis for multi-branch interconnect trees,” in ACM/EDAC/IEEE Design
Automation Conf., June 2015, pp. 1–6.
[44] H. B. Chen, S. X. D. Tan, X. Huang, T. Kim, and V. Sukharev, “Analytical modeling
and characterization of electromigration effects for multibranch interconnect trees,” IEEE
Trans. on Comput.-Aided Design of Integrated Circuits and Systems, vol. 35, no. 11, pp.
1811–1824, Nov 2016.
[45] B. Li, P. S. McLaughlin, J. P. Bickford, P. Habitz, D. Netrabile, and T. D. Sullivan,
“Statistical evaluation of electromigration reliability at chip level,” IEEE Transactions on
Device and Materials Reliability, vol. 11, no. 1, pp. 86–91, March 2011.
[46] F. L. Wei, C. S. Hau-Riege, A. P. Marathe, and C. V. Thompson, “Effects
of active atomic sinks and reservoirs on the reliability of Cu low-k interconnects,”
Journal of Applied Physics, vol. 103, no. 8, 2008. [Online]. Available:
http://scitation.aip.org/content/aip/journal/jap/103/8/10.1063/1.2907962
[47] S. Chatterjee, M. Fawaz, and F. N. Najm, “Redundancy-aware power grid electromigration
checking under workload uncertainties,” IEEE Transactions on Computer-Aided Design of
Integrated Circuits and Systems, vol. 34, no. 9, pp. 1509–1522, Sept 2015.
[48] J. Lloyd and J. Kitchin, “The electromigration failure distribution: The fineline case,”
Journal of Applied Physics, vol. 69, no. 4, pp. 2117–2127, 1991.
[49] F. N. Najm, Circuit Simulation. John Wiley and Sons, 2010.
[50] J. Thomas, Numerical Partial Differential Equations: Finite Difference Methods.
Springer-Verlag New York, 1995.
[51] J. N. Reddy, An Introduction to the Finite Element Method, 3rd ed. McGraw-Hill, 2006.
[52] T. Barth and M. Ohlberger, Finite Volume Methods: Foundation and Analysis. John Wiley
& Sons, Ltd, 2004. [Online]. Available: http://dx.doi.org/10.1002/0470091355.ecm010
[53] J. Droniou, R. Eymard, T. Gallouet, and R. Herbin, “Gradient schemes: a generic
framework for the discretisation of linear, nonlinear and nonlocal elliptic and parabolic
equations,” Mathematical Models and Methods in Applied Sciences, vol. 23, no. 13,
pp. 2395–2432, 2013.
[54] C. Canuto, M. Y. Hussaini, A. Quarteroni, and T. A. Zang, Spectral Methods:
Fundamentals in Single Domains. Springer-Verlag Berlin Heidelberg, 2006.
[55] W. Schiesser, Computational Mathematics in Engineering and Applied Science: ODEs,
DAEs, and PDEs. Taylor & Francis, 1993.
[56] E. Hairer, S. P. Norsett, and G. Wanner, Solving ordinary differential equations, 2nd ed.
Springer-Verlag Berlin Heidelberg, 1993.
[57] E. Hairer and G. Wanner, Solving Ordinary Differential Equations II: Stiff and Differential-
Algebraic Problems, 2nd ed. Springer-Verlag Berlin Heidelberg, 1996.
[58] L. F. Shampine and H. A. Watts, “Global error estimates for ordinary differential
equations,” ACM Trans. Math. Softw., vol. 2, no. 2, pp. 172–186, Jun 1976. [Online].
Available: http://doi.acm.org/10.1145/355681.355687
[59] L. Shampine, What everyone solving differential equations numerically should know, Jan
1978. [Online]. Available: https://www.osti.gov/scitech/biblio/6219108
[60] J. D. Lambert, Numerical Methods for Ordinary Differential Systems: The Initial Value
Problem. Wiley, 1991.
[61] J. C. Butcher, Numerical Methods for Ordinary Differential Equations, 2nd ed. John
Wiley and Sons, 2003.
[62] W. Huang, S. Ghosh, S. Velusamy, K. Sankaranarayanan, K. Skadron, and M. R. Stan,
“Hotspot: a compact thermal modeling methodology for early-stage VLSI design,” IEEE
Trans. Very Large Scale Integr. (VLSI) Syst., vol. 14, no. 5, pp. 501–513, May 2006.
[63] M. N. Ozisik, Boundary Value Problems of Heat Conduction. Mineola, New York: Dover
Publications, Inc., 2002.
[64] I. Miller, J. E. Freund, and R. Johnson, Probability and Statistics for Engineers.
Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1990.
[65] S. Chatterjee, V. Sukharev, and F. N. Najm, “Fast physics-based electromigration checking
for on-die power grids,” in 2016 IEEE/ACM International Conference on Computer-Aided
Design (ICCAD), Nov 2016, pp. 1–8.
[66] R. de Orio, H. Ceric, and S. Selberherr, “A compact model for early electromigration
failures of copper dual-damascene interconnects,” Microelectronics Reliability, vol. 51, pp.
1573 – 1577, 2011. [Online]. Available: https://doi.org/10.1016/j.microrel.2011.07.049
[67] V. Sukharev, A. Kteyan, and X. Huang, “Postvoiding stress evolution in confined metal
lines,” IEEE Transactions on Device and Materials Reliability, vol. 16, no. 1, pp. 50–60,
March 2016.
[68] J. M. Rabaey, A. Chandrakasan, and B. Nikolic, Digital Integrated Circuits: A Design
Perspective, 2nd ed. Pearson, Dec 2002.
[69] Z.-S. Choi, J. Lee, M. K. Lim, C. L. Gan, and C. V. Thompson, “Void dynamics in
copper-based interconnects,” Journal of Applied Physics, vol. 110, no. 3, p. 033505, 2011.
[Online]. Available: http://dx.doi.org/10.1063/1.3611408
[70] L. Arnaud, F. Cacho, L. Doyen, F. Terrier, D. Galpin, and C. Monget, “Analysis
of electromigration induced early failures in cu interconnects for 45 nm node,”
Microelectronic Engineering, vol. 87, no. 3, pp. 355 – 360, 2010. [Online]. Available:
https://doi.org/10.1016/j.mee.2009.06.014
[71] J. Dormand and P. Prince, “A family of embedded runge-kutta formulae,” Journal of
Computational and Applied Mathematics, vol. 6, no. 1, pp. 19 – 26, 1980. [Online].
Available: http://dx.doi.org/10.1016/0771-050X(80)90013-3
[72] S. Chatterjee, V. Sukharev, and F. N. Najm, “Power grid electromigration checking using
physics-based models,” IEEE Trans. on Comput.-Aided Design of Integrated Circuits and
Systems, no. 99, 2017.
[73] J. J. Clement, “Reliability analysis for encapsulated interconnect lines under dc and
pulsed dc current using a continuum electromigration transport model,” Journal of Applied
Physics, vol. 82, no. 12, pp. 5991–6000, 1997.
[74] S. Chatterjee, V. Sukharev, and F. N. Najm, “Fast physics-based electromigration
assessment by efficient solution of linear time-invariant (LTI) systems,” in 2017 IEEE/ACM
International Conference on Computer-Aided Design (ICCAD), Nov 2017, pp. 1–1, to appear.
[75] K. W. Tu, “Stability and convergence of general multistep and multivariate methods with
variable step size,” Ph.D. dissertation, Univ. of Illinois at Urbana-Champaign, 1972, dept.
of Computer Science.
[76] R. K. Brayton, F. G. Gustavson, and G. D. Hachtel, “A new efficient algorithm for solving
differential-algebraic systems using implicit backward differentiation formulas,”
Proceedings of the IEEE, vol. 60, no. 1, pp. 98–108, Jan 1972.
[77] W. E. Arnoldi, “The principle of minimized iterations in the solution of the matrix
eigenvalue problem,” Quarterly of Applied Mathematics, vol. 9, no. 1, pp. 17–29, 1951.
[Online]. Available: http://www.jstor.org/stable/43633863
[78] N. J. Higham, Functions of Matrices: Theory and Computation. Society for Industrial
and Applied Mathematics, 2008.
[79] T. A. Davis, “A column pre-ordering strategy for the unsymmetric-pattern multifrontal
method,” ACM Trans. Math. Softw., vol. 30, no. 2, pp. 165–195, Jun. 2004. [Online].
Available: http://doi.acm.org/10.1145/992200.992205
[80] ——, “Algorithm 832: Umfpack v4.3—an unsymmetric-pattern multifrontal method,”
ACM Trans. Math. Softw., vol. 30, no. 2, pp. 196–199, Jun. 2004. [Online]. Available:
http://doi.acm.org/10.1145/992200.992206
[81] T. A. Davis and I. S. Duff, “A combined unifrontal/multifrontal method for unsymmetric
sparse matrices,” ACM Trans. Math. Softw., vol. 25, no. 1, pp. 1–20, Mar. 1999. [Online].
Available: http://doi.acm.org/10.1145/305658.287640
[82] T. Davis and I. Duff, “An unsymmetric-pattern multifrontal method for sparse LU
factorization,” SIAM Journal on Matrix Analysis and Applications, vol. 18, no. 1, pp. 140–158,
1997. [Online]. Available: http://epubs.siam.org/doi/abs/10.1137/S0895479894246905
[83] I. Bebu and T. Mathew, “Confidence intervals for limited moments and truncated moments
in normal and lognormal models,” Statistics & Probability Letters, vol. 79, no. 3, pp. 375
– 380, 2009.
[84] N. Weiss, P. Holmes, and M. Hardy, A Course in Probability. Pearson Addison Wesley,
2005.
[85] E. A. Amerasekera and F. N. Najm, Failure Mechanisms in Semiconductor Devices, 2nd ed.
John Wiley and Sons, Oct. 1998.
[86] H. H. Ku, “Notes on the use of propagation of error formulas,” Journal of Research of
the National Bureau of Standards, vol. 70C, no. 4, pp. 263–273, 1966. [Online]. Available:
http://archive.org/details/jresv70Cn4p263
[87] L. D. Brown, T. T. Cai, and A. DasGupta, “Interval estimation for a binomial
proportion,” Statistical Science, vol. 16, no. 2, pp. 101–117, 2001. [Online]. Available:
http://www.jstor.org/stable/2676784
[88] V. Sukharev, E. Zschech, and W. D. Nix, “A model for electromigration-induced
degradation mechanisms in dual-inlaid copper interconnects: Effect of microstructure,”
Journal of Applied Physics, vol. 102, no. 5, p. 053505, 2007. [Online]. Available:
http://dx.doi.org/10.1063/1.2775538
[89] A. Lodder and J. P. Dekker, “The electromigration force in metallic bulk,” in Proc. of the
Stress Induced Phenomena in Metallization: 4th International Workshop, vol. 418, 1998,
pp. 315–329.
[90] A. L. S. Loke, “Process integration issues of low-permittivity dielectrics with copper for
high-performance interconnects,” Ph.D. dissertation, STANFORD UNIVERSITY, Mar
1999.
[91] R. A. Horn and C. R. Johnson, Eds., Matrix Analysis. New York, NY, USA: Cambridge
University Press, 1986.
[92] K. Veselic, “On real eigenvalues of real tridiagonal matrices,” Linear Algebra and its
Applications, vol. 27, pp. 167–171, 1979.
[93] S. Weerahandi, “Generalized confidence intervals,” Journal of the American Statistical
Association, vol. 88, no. 423, pp. 899 – 905, Sept. 1993.