The Development of a Flexible and Efficient Chip Thermal Imaging Capability

Marc Knox, Chenzhou Lian, Alan Weger, Xiaojin Wei, IBM
Agenda

- Background
- Concept Origin and Project Goals
- Advantages/Novelty
- The Development Team & System Users
- System & Hardware Description
- Case Study (Power 7 processor wafer C4 pullouts)
Background

- Thermal imaging of semiconductor die is a useful way to visualize the thermal and power relationships on a die
- The effects of various workloads, patterns and conditions can be seen in real time
- Thermal imaging is particularly useful in looking for spatial (cross chip) variations and issues
  - Design related, Process related, Defect related
- The savvy SWTW audience may ask: "So how exactly does this relate to wafer probe?"
- Answer: Thermal imaging is another tool that can provide insight into wafer probe power/thermal issues. Please stay tuned.
Thermal Imaging Concept Origination and Project Goals

• Concept Origin
  – Concept derived during collaboration between the Test Development and Packaging Development teams at IBM
  – Mutual recognition of benefits and shared vision of project goals in these organizations

• Goals:
  – Multiple functional organizations looking for a faster, easier, cheaper, more universal thermal imaging capability
  – Desirable attributes
    – Available for early silicon (new designs)
    – Available for multiple product designs
    – Minimum customization and special resources to support
Advantages/Novelty

- Thermal imaging capability is neither new nor novel
- The novelty of this implementation lies in the attributes:
  - Capability is widely available to multiple users
  - Capability is available for multiple products
  - Capability is available for use with early silicon
  - Capability is built directly on top of an existing and fully supported manufacturing test platform and infrastructure
    - Hardware support (BIBs, sockets, tools)
    - Software support (all devt & mfg infrastructure)
  - Synergy of hardware development for new packages
  - Test Development support
- Essentially “piggybacking” on an existing infrastructure
Thermal Imaging Project

**Project Collaborators**

- Packaging Thermal Development
- Research Optics and HS Test
- Test Equip Engineering

**Users**

- Packaging Thermal Devt
- Burn-In/Burn-In Test team
- Wafer/Module Test team
- Systems Teams
  - DFT
  - Design
  - Pwr Pkg & Cooling
- Silicon Process Teams
- Failure Analysis Teams
- Characterization Team

3 Sites, Parts and Pieces of ~12 Folks
Thermal Imaging Hardware Concept

Production Burn-In Tool

Figure 3-3. Simplified Side View of System

Figure Courtesy of Micro Control Corp
Burn-In Screener (before modification)

Slot for BIB - product to tool interface board (not shown for clarity)

Chip, Cooling Heat Sink

Fluid Pipes

Thermal Tray (heat sink array)

Fluid Chiller
Thermal Imaging Hardware Concept

- Modified screener
  - Parts face up, sink array modified for IR cooling and camera access
  - Socket modified to accommodate IR cooling
  - Sapphire (IR transparent) window on top of chip with cooling fluid flowing between the chip and the window
  - Objective is uniform laminar flow, Cooling fluid = FC77
  - Camera mounted above and looking down at chip
Final Assembly View

- IR Camera
- Chip, Cooling Cell
- Thermal Tray
- Fluid Pipe
- BIB
Cooling Cell Assembly on Burn-In Board

- Fluid Pipe - Inlet
- Fluid Cooling Cell
- BI Board
- Fluid Pipe - Outlet
- Heat Sink Array (Tray)
- Window Glass with Chip Underneath
Cooling Cell Close-Up
Thermal Imaging Details

- **Imaging on small areas can be enhanced by thinning chips**
  - Backside grind (thin) the chips to increase thermal contrast
  - The silicon thickness between active circuits and the backside spreads the heat laterally thereby decreasing contrast
  - Thinning allows more accurate measurement of micro-level temperature gradients
  - Chips are typically thinned for local-area work and left thick for full-die imaging
  - Camera only captures surface temperatures

- **Imaging can be enhanced by coating the chip surface**
  - Silicon is transparent to IR. Underlying structures have various emissivities
  - Without a coating, wiring and patterns are observed
  - Coating provides a consistent emissivity which enhances imaging
  - Used mainly with thin die and micron level temperature analysis (small regions)

- **Manipulations of Digital Images are performed**
  - Lens reflection is removed by taking a background IR image before and after every measurement and subtracting it out
  - Thermal calibrations are performed using on-chip thermal sensors
  - Flow directional cancellation can be performed (averaging out fluid heating effects)
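
The three image manipulations above can be sketched in a few lines of NumPy. This is a minimal sketch, not the team's actual processing chain. It assumes the two background frames are simply averaged, that the calibration against on-chip sensors is linear, and that flow directional cancellation means averaging two images taken with opposite coolant flow directions. All function names are hypothetical.

```python
import numpy as np

def subtract_background(frame, bg_before, bg_after):
    """Remove the static lens reflection.

    Assumption: the reflection is estimated as the mean of the background
    frames taken before and after the measurement.
    """
    background = 0.5 * (np.asarray(bg_before, float) + np.asarray(bg_after, float))
    return np.asarray(frame, float) - background

def fit_linear_calibration(signal_at_sensors, sensor_temps_c):
    """Least-squares fit of temperature = gain * signal + offset.

    signal_at_sensors: camera signal sampled at the on-chip sensor locations.
    sensor_temps_c:    temperatures reported by those sensors (deg C).
    Assumption: a linear relation is adequate over the range of interest.
    """
    gain, offset = np.polyfit(signal_at_sensors, sensor_temps_c, 1)
    return float(gain), float(offset)

def apply_calibration(frame, gain, offset):
    """Convert a background-corrected frame to temperature (deg C)."""
    return gain * np.asarray(frame, float) + offset

def cancel_flow_direction(image_flow_forward, image_flow_reversed):
    """Average two images taken with opposite coolant flow directions.

    Assumption: the fluid heating gradient reverses with the flow, so the
    mean of the two images suppresses it.
    """
    return 0.5 * (np.asarray(image_flow_forward, float)
                  + np.asarray(image_flow_reversed, float))
```

The functions are meant to be chained: subtract the background, calibrate, then average the forward-flow and reversed-flow results.
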
System Details

• **Power Limitations**
  – Power is limited to ~125 W (large die, >500 mm^2)
  – Limitation due to coolant thermal conductivity (FC77)
  – Limitation due to flow rate of coolant
  – The convective heat transfer coefficient h is related to the Nusselt number Nu and the fluid thermal conductivity k (h = Nu·k/L)

• **Fluid properties desired are:**
  – IR transparency, Thermal conductivity

• **Fluid flow in this system can be in the laminar, transition, or turbulent regime**
  – At a flow rate of 1.5 GPM, flow is turbulent
  – At a flow rate of 0.5 GPM, flow is laminar
  – Lower flows (~0.5 GPM) show less flow heating effects
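
The flow regime can be estimated from the Reynolds number of the coolant channel. The sketch below assumes a rectangular channel between chip and window and uses the hydraulic diameter. The deck does not give the cooling cell geometry, so the channel dimensions in the usage loop are placeholders chosen only so the example runs, and the regime thresholds (2300 and 4000) are the conventional pipe-flow values rather than values measured on this system. The function names are hypothetical.

```python
GPM_TO_M3_PER_S = 6.30902e-5  # 1 US gallon per minute in m^3/s

def reynolds_number(flow_gpm, width_m, height_m, nu_m2_per_s):
    """Reynolds number of a rectangular channel, based on hydraulic diameter."""
    q = flow_gpm * GPM_TO_M3_PER_S                 # volumetric flow, m^3/s
    area = width_m * height_m                      # flow cross-section, m^2
    velocity = q / area                            # mean velocity, m/s
    d_h = 2.0 * width_m * height_m / (width_m + height_m)  # hydraulic diameter, m
    return velocity * d_h / nu_m2_per_s

def flow_regime(re, laminar_max=2300.0, turbulent_min=4000.0):
    """Classify the regime; thresholds are conventional, not system-specific."""
    if re < laminar_max:
        return "laminar"
    if re > turbulent_min:
        return "turbulent"
    return "transition"

NU_FC77 = 0.8e-6  # kinematic viscosity of FC77, m^2/s (from the property table)

# Placeholder channel dimensions, NOT taken from the source.
CHANNEL_WIDTH_M = 0.040
CHANNEL_HEIGHT_M = 0.002

for gpm in (0.5, 1.5):
    re = reynolds_number(gpm, CHANNEL_WIDTH_M, CHANNEL_HEIGHT_M, NU_FC77)
    print(f"{gpm} GPM: Re ~ {re:.0f} ({flow_regime(re)})")
```

Because Re scales linearly with flow rate, tripling the flow from 0.5 to 1.5 GPM triples Re regardless of the geometry assumed.
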
# Property Comparison – Air, Water, FC77

<table>
<thead>
<tr>
<th>Fluid Property</th>
<th>Air</th>
<th>Water</th>
<th>FC77</th>
</tr>
</thead>
<tbody>
<tr>
<td>\( \rho \) (density), kg/m\(^3\)</td>
<td>1.2</td>
<td>997</td>
<td>1770</td>
</tr>
<tr>
<td>\( \nu \) (kinematic viscosity), m\(^2\)/s</td>
<td>15.6e-6</td>
<td>0.857e-6</td>
<td>0.8e-6</td>
</tr>
<tr>
<td>\( k \) (thermal conductivity), W/(m·K)</td>
<td>0.026</td>
<td>0.613</td>
<td>0.063</td>
</tr>
<tr>
<td>\( c_p \) (specific heat), J/(kg·°C)</td>
<td>1005</td>
<td>4180</td>
<td>1046</td>
</tr>
<tr>
<td>\( Pr \) (Prandtl number)</td>
<td>0.71</td>
<td>5.83</td>
<td>23.5</td>
</tr>
</tbody>
</table>

\[
\begin{aligned}
Pr &= \frac{\nu}{\alpha} = \frac{\text{viscous diffusion rate}}{\text{thermal diffusion rate}} = \frac{c_p \mu}{k} \\
Nu_L &\sim Pr^{1/3} \\
Nu_L &= \frac{hL}{k_f} = \frac{\text{convective heat transfer coefficient}}{\text{conductive heat transfer coefficient}}
\end{aligned}
\]
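
As a cross-check, the Prandtl numbers in the table can be recomputed from the other rows using Pr = c_p·μ/k with μ = ρ·ν. The sketch below also prints the relative scalings Pr^(1/3) (for Nu) and k·Pr^(1/3) (for h), which hold only under the assumption of equal Reynolds number and geometry for all three fluids. The dictionary and function names are hypothetical.

```python
# Property values copied from the comparison table (SI units).
FLUIDS = {
    "Air":   dict(rho=1.2,    nu=15.6e-6,  k=0.026, cp=1005.0),
    "Water": dict(rho=997.0,  nu=0.857e-6, k=0.613, cp=4180.0),
    "FC77":  dict(rho=1770.0, nu=0.8e-6,   k=0.063, cp=1046.0),
}

def prandtl(rho, nu, k, cp):
    """Pr = cp * mu / k, with dynamic viscosity mu = rho * nu."""
    mu = rho * nu
    return cp * mu / k

for name, props in FLUIDS.items():
    pr = prandtl(**props)
    nu_scale = pr ** (1.0 / 3.0)      # Nu ~ Pr^(1/3) at fixed Re and geometry
    h_scale = props["k"] * nu_scale   # h = Nu * k / L, so h ~ k * Pr^(1/3)
    print(f"{name}: Pr = {pr:.2f}, Pr^(1/3) = {nu_scale:.2f}, k*Pr^(1/3) = {h_scale:.3f}")
```

The recomputed values land close to the tabulated Pr row. The comparison shows how FC77's low thermal conductivity limits h despite its high Prandtl number.
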
Camera Description

- Camera Type: FLIR SC7600BB
- Lens: 25 mm, G1, and G3

### Technical specifications

<table>
<thead>
<tr>
<th>Specification</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Detector Material</td>
<td>InSb</td>
</tr>
<tr>
<td>Spectral Band</td>
<td>3 - 5 μm (1.5 - 5 μm optional)</td>
</tr>
<tr>
<td>Image Size</td>
<td>640x512</td>
</tr>
<tr>
<td>Pitch</td>
<td>15 μm</td>
</tr>
<tr>
<td>Aperture</td>
<td>F/3</td>
</tr>
<tr>
<td>Windowing</td>
<td>320x256 / 160x128</td>
</tr>
<tr>
<td>Random windowing (down to)</td>
<td>48x4</td>
</tr>
<tr>
<td>Max Frame Rate (full frame)</td>
<td>100 Hz</td>
</tr>
<tr>
<td>Integration Time Range</td>
<td>3 - 20000 μs</td>
</tr>
<tr>
<td>Integration Time Mode</td>
<td>ITR / IWR</td>
</tr>
<tr>
<td>Shutter</td>
<td>In place of a filter</td>
</tr>
<tr>
<td>Radiometry</td>
<td>NETD &lt;25 mK (20 mK typical)</td>
</tr>
<tr>
<td>Temperature measurement accuracy</td>
<td>± 1 °C or ± 1%</td>
</tr>
<tr>
<td>Filter wheel</td>
<td>4 slots for 1” filter 1 mm thick</td>
</tr>
<tr>
<td>Timings and Signals</td>
<td>External synchronisation: LV TTL</td>
</tr>
<tr>
<td>Analogue signal</td>
<td>1 x (-5 to 5 V) / 2 x (0 to 10 V)</td>
</tr>
<tr>
<td>Digital output</td>
<td>GigE Ethernet / Camlink</td>
</tr>
<tr>
<td>Video output</td>
<td>PAL or NTSC, Composite or S-Video</td>
</tr>
</tbody>
</table>

### Optional lenses

<table>
<thead>
<tr>
<th>Lens</th>
<th>FOV</th>
</tr>
</thead>
<tbody>
<tr>
<td>12 mm F2</td>
<td>44 x 36 °</td>
</tr>
<tr>
<td>25 mm F2</td>
<td>22 x 17 °</td>
</tr>
<tr>
<td>50 mm F2</td>
<td>11 x 8.8 °</td>
</tr>
<tr>
<td>100 mm F2</td>
<td>5.5 x 4.4 °</td>
</tr>
<tr>
<td>200 mm F2</td>
<td>2.75 x 2.2 °</td>
</tr>
<tr>
<td>Microscope lens G1 F/2</td>
<td>9.6 x 7.7 mm</td>
</tr>
<tr>
<td>Microscope lens G3 F/2</td>
<td>3.2 x 2.6 mm</td>
</tr>
</tbody>
</table>
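
The microscope lens fields of view translate directly into a per-pixel footprint on the die. The sketch below divides the listed field of view by the 640x512 array size and compares the result with the 15 μm detector pitch to infer the magnification. It gives the sampling footprint only; optical blur at 3 - 5 μm wavelengths is not modeled. The names are hypothetical.

```python
SENSOR_PIXELS = (640, 512)   # full-frame image size from the spec table
PIXEL_PITCH_UM = 15.0        # detector pitch from the spec table

# Field of view in mm (horizontal, vertical) from the optional lens table.
MICROSCOPE_FOV_MM = {"G1": (9.6, 7.7), "G3": (3.2, 2.6)}

def object_pixel_um(fov_mm, n_pixels):
    """Footprint of one pixel on the die, in micrometres."""
    return fov_mm * 1000.0 / n_pixels

def magnification(fov_mm, n_pixels, pitch_um=PIXEL_PITCH_UM):
    """Optical magnification implied by detector pitch and pixel footprint."""
    return pitch_um / object_pixel_um(fov_mm, n_pixels)

for lens, (fov_x_mm, _fov_y_mm) in MICROSCOPE_FOV_MM.items():
    px = object_pixel_um(fov_x_mm, SENSOR_PIXELS[0])
    mag = magnification(fov_x_mm, SENSOR_PIXELS[0])
    print(f"{lens}: ~{px:.1f} um per pixel, ~{mag:.1f}x magnification")
```

This works out to roughly 15 μm per pixel for the G1 lens and roughly 5 μm per pixel for the G3 lens, consistent with 1x and 3x optics.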

Case Study – Power 7 Processor C4 Pullouts
Power 7 Processor C4 Pullout Problem

- C4 “pullouts” are being seen primarily at the wafer test sector
- A C4 Pullout is loosely defined as a C4 that detaches from the underlying metallurgy after probing
  - Very low percentage of chips overall but large number of wafers impacted
  - Chips and probes are being damaged, impacting yield and productivity

Optical images of pull-out sites:

- Polyimide burned/charred
- Original footprint of Ni
- Original footprint of TiW
Root Cause Investigative Work

A Root Cause hypothesis was generated based on multiple overlapping data points:

- Pullouts are caused by highly variable wafer edge processing, which results in short channels and high leakage (certain wafers and wafer edge)
- This acts in combination with accelerated voltage screening (WC power)
- Local power-dense regions resulted in localized thermal runaway
- Runaway was self-limiting in some cases (chips test good after the fact)

Wafer Test Elevated Voltage Screen

Wafer Edge Chip Power Indications
- Higher Delta PSRO
- Higher Leakage
- Higher propensity to current clamp
- Faster Cores
- High thermal resistance at both WFT and MFT (indicating higher power density)

Highly Localized, Self Limiting Thermal Runaway - Pwr/Gnd Bumps and underlying metallurgy destroyed
Root Cause Investigative Work
Thermal Imaging

- All evidence to date indicates that this issue is caused by regional hotspots which are generated by short channels and high leakage at the edge of the wafer.

- While the investigating team was very confident in the conclusions it had drawn about the nature of the local hotspots, it was deemed necessary to provide conclusive data to support the assertion that defects were not a contributor to root cause.

- Thermal imaging was viewed as one of the most conclusive methods for confirming that the root cause conclusion was correct.

- Fortuitously, our imaging project was almost ready at this time!
Thermal Imaging Experiment and Setup

- A wafer that did have pullouts was identified
- 3 suspect chips from that wafer were identified and built into modules
  - These were chips that did not have pullouts but whose data suggested they were more susceptible

- Hence the link from wafer probe test to a module chip level imaging system

- Modules were tested after build to ensure chips were not defective
- Control modules were also provided
- Modules were prepped for imaging
  - Lids removed
  - Silicon was thinned
  - An emissivity coating was applied
Suspect Chip Wafer Detail

Wafer Outer Edge

C0  C4
C1  C5
C2  C6
C3  C7
Imaging Results

Control Chip @ Nominal Voltage +300mV, ~70W

Coolant Flow

Suspect Chip @ Nominal Voltage ~70W

Coolant Flow
Conclusions/Results Interpretation

- Control/normal part shows variation due to coolant flow (a known issue), but all of the cores heat relatively evenly, indicating well-distributed power density.
- Entire outer edge of the suspect die (labeled hot die) shows high power density corresponding to the outer edge of the wafer.
- Hotspot is regional in nature; it does not come from a single spot.
  - The entire end of the chip lights up: two end cores and the MC
- Same base power (70 W) but large cross-chip variation on the suspect part.
  - >90 °C core-to-core delta on the suspect part vs. 20 °C on control.
- Results clearly confirm the prior root cause hypothesis and match prior supporting evidence and conclusions.
  - Results are NOT indicative of a small-area defect or a point defect.
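
A core-to-core delta like the one quoted above can be extracted from a calibrated temperature map once each core's region is known. The sketch below is one possible way to do it, assuming the delta is defined as the spread between per-core mean temperatures. The deck does not say whether means or peaks were used, and the region-of-interest layout and function names are hypothetical.

```python
import numpy as np

def core_mean_temps(temp_map_c, core_rois):
    """Mean temperature (deg C) inside each core's region of interest.

    core_rois maps a core name (e.g. "C0") to a (row_slice, col_slice) pair.
    The ROI coordinates must come from the chip floorplan; none are given here.
    """
    temp_map_c = np.asarray(temp_map_c, float)
    return {name: float(temp_map_c[rows, cols].mean())
            for name, (rows, cols) in core_rois.items()}

def core_to_core_delta(temp_map_c, core_rois):
    """Spread between the hottest and coolest core (by mean temperature)."""
    means = core_mean_temps(temp_map_c, core_rois)
    return max(means.values()) - min(means.values())
```

Applying the same function to the control and suspect images gives directly comparable numbers, provided both use the same ROIs and calibration.
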
Acknowledgements

• The Authors would like to acknowledge and thank the following individuals for their contributions to this project.

  Paul Aube
  Chenzhou Lian
  Paul Bodenweber
  Cory Hinton
  Paul Gaschke
  Jeff Kutner
  Scott Boeleans
  Roger Gamache
  Ted Lewis
  Kamal Sikka