Electrical Modeling of STT-MRAM Defects

Lizhou Wu*, Mottaqallah Taouil*, Siddharth Rao†, Erik Jan Marinissen†, Said Hamdioui*

*Delft University of Technology
Mekelweg 4, 2628 CD Delft, The Netherlands
{Lizhou.Wu, M.Taouil, S.Hamdioui}@tudelft.nl

†IMEC
Kapeldreef 75, B-3001 Leuven, Belgium

Abstract—Spin-transfer-torque magnetic RAM (STT-MRAM) is one of the most promising emerging memory technologies. As various manufacturing vendors make significant efforts to push it to the market, appropriate STT-MRAM testing is of great importance. In this paper, we demonstrate that conventional STT-MRAM defect modeling, which is based on linear resistors, is too pessimistic in representing the real nature of physical defects. It may result in incorrect fault models, which in turn can lead to low-quality test solutions. In addition, we propose a generic defect modeling methodology which captures the non-linear behavior of STT-MRAM defects accurately; a defect is modeled by adjusting the affected STT-MRAM technology parameters. The methodology is illustrated by two examples, namely a pinhole defect and a sidewall redeposition defect, which are simulated for accurate fault modeling. In case of a pinhole defect, the STT-MRAM suffers from a fast transition between magnetic tunnel junction (MTJ) states with increased write current, making the MTJ more vulnerable to breakdown. However, with the conventional linear resistor as defect model, the memory shows a slow transition or even a transition failure. Similarly, a sidewall redeposition defect causes a fast transition without current elevation, which is not observed when using the conventional approach.

I. INTRODUCTION

As the downscaling of CMOS memory technology continues, existing memory types become increasingly power hungry and less reliable, while their fabrication becomes more expensive due to increased manufacturing complexity. Therefore, extensive R&D efforts focus on emerging non-volatile memories (NVMs) as alternative memory technologies [1–4]. Among NVMs, STT-MRAM stands out with many attractive features such as nearly unlimited endurance, zero standby leakage, and high density [5]. Nevertheless, several obstacles still need to be addressed before high-volume production can start. Firstly, the manufacturing process of STT-MRAM involves not only standard CMOS processing steps, but also MTJ fabrication and integration. The latter is subject to new manufacturing defects which have not been fully investigated to date [6]. Secondly, new failure mechanisms (e.g., magnetic coupling, STT switching stochasticity) [7], due to the introduction of new materials as well as novel physical phenomena, may lead to manufacturing yield loss or test escapes [8]. Hence, providing correct fault models, which enable the development of efficient test algorithms and/or Design for Testability (DfT), is of great importance. As fault models are typically abstracted from defect injection and circuit simulation, inaccurate defect modeling may lead to incorrect fault models. This in turn results in low-quality tests and/or DfT solutions, which cannot guarantee a low test escape rate, even with a high fault coverage claim. Hence, accurate defect modeling is needed as a critical and crucial step.

There are several papers on MRAM fault modeling and testing [9–15]. In [9], the authors injected ideal resistive shorts and opens into the SPICE model of an MRAM cell and subsequently identified two fault models: multi-victim and kink faults. Similarly, the authors in [10] and [11] performed circuit simulations to analyze the faulty behavior of resistive-open defects and write disturbance faults, respectively. Recently, Yoon et al. studied functional faults in STT-MRAM arrays induced by resistive and capacitive defects occurring both intra-cell and inter-cell, as well as extreme process variation [12–15]; they also proposed a test algorithm and its built-in self-test implementation. However, the limitation of all these prior publications is that they are based on circuit simulations with resistive defect injection (i.e., shorts, bridges, and opens); these resistive defects have no link to the actual physical STT-MRAM defects. The MTJ device is a non-linear bipolar device whose magnetic attributes (e.g., hysteresis loop) are as critical as its electrical ones. As a consequence, using linear electrical resistors to represent the STT-MRAM physical defects does not appropriately model the physical effects on the MTJ’s magnetic attributes, STT switching mechanism, and tunneling magneto-resistance.

In this paper, we provide a methodology for accurate and appropriate physical defect modeling. It models the STT-MRAM physical defects by modifying the affected technology parameters of the MTJ device (e.g., the resistance-area (RA) product, the tunneling magneto-resistance ratio (TMR), and the anisotropy field (Hk)). To the best of our knowledge, this is the first paper from a test perspective that accurately models and simulates STT-MRAM-specific defects instead of using ideal linear resistive shorts, bridges, and opens. The contributions of this paper are as follows.

- We demonstrate that conventional defect modeling based on resistive defect injection is too pessimistic to accurately represent the physical defects at circuit level.
- We propose a generic defect modeling methodology which captures the non-linear behavior of STT-MRAM defects accurately.
- We apply this methodology to model and simulate the pinhole and sidewall redeposition defects as examples.
- We provide an overview and classification of unique STT-MRAM manufacturing defects.

The rest of this paper is organized as follows. Section II provides a background on STT-MRAM technology. Thereafter, an overview of STT-MRAM defects is presented in Section III. Section IV elaborates the defect modeling methodology. Sections V and VI apply this methodology to model the pinhole...
II. STT-MRAM BASICS

STT-MRAM is considered as the second generation of MRAM technologies [16], as it leverages spin-transfer torque to efficiently switch between the binary magnetic states. It offers an integration density as high as DRAM and potentially matches the performance of SRAM. Therefore, STT-MRAM can serve as last-level caches in the short term and is seen as a leading candidate to be a universal memory in the long run [17]. In this section, we will first introduce the organization of the MTJ device which is the core building block of STT-MRAM. Thereafter, we will briefly explain several concepts related to MTJ states and STT switching mechanism, followed by the common 1T-1MTJ cell design.

A. MTJ Organization

The magnetic tunnel junction (MTJ), the fundamental building block of MRAMs, essentially consists of two ferromagnetic layers sandwiching an extremely thin insulating spacer layer, as illustrated in Fig. 1(a). The top ferromagnetic layer is called the free layer (FL) and is responsible for storing the binary information. This layer is often made of CoFeB material; its thickness is typically $t_{FL} = 1.5$ nm [18]. The magnetization of the FL points along its intrinsic easy axis and may flip when a spin-polarized current is applied through it. The MTJ can have in-plane magnetic anisotropy (IMA) if the easy axis lies in the horizontal cross-section, or perpendicular magnetic anisotropy (PMA) if it lies along the vertical cross-section [16]. PMA-MTJs offer many benefits over IMA-MTJs [5], including: 1) the shape of the MTJ is no longer critical, thus removing a significant bottleneck to technology downscaling; 2) the switching current to reverse the MTJ’s state is considerably reduced. Therefore, we limit our discussion to PMA-MTJs. The bottom ferromagnetic layer, referred to as the pinned layer (PL), provides a stable reference direction for the magnetization of the FL; it typically has a thickness of $t_{PL} = 2.5$ nm [18]. Although the PL is made of CoFeB as well, its anisotropy energy is large enough to avoid switching during operations. The spacer layer in the middle is called the tunnel barrier (TB); it serves as an insulating non-magnetic spacer between the FL and PL. When the TB layer is very thin (typically $t_{OX} = 1$ nm [18]), quantum-mechanical tunneling of electrons through the barrier makes the MTJ behave like a resistor, whose resistance depends exponentially on the barrier thickness.

B. MTJ Binary States And STT Switching

The resistance of the MTJ is low when the magnetization directions in FL and PL are parallel (P) and high when anti-parallel (AP). These two binary magnetic states enable the MTJ device to store a single bit. The MTJ resistance is generally derived from a parameter called the resistance-area ($RA$) product; $RA$, typically in the range of 5 to 15 $\Omega \cdot \mu m^2$, can be measured by specific characterization techniques such as current-in-plane tunneling (CIPT) and conducting atomic force microscopy (CAFM) at various processing stages [19]. The resistance difference between the P and AP states is caused by the tunneling magneto-resistance (TMR) effect [20–22]. The TMR effect means that the good band matching in the P state leads to a large tunneling conductance of the barrier, while the poor band matching in the AP state results in fewer electrons tunneling through the barrier. To quantitatively evaluate the TMR effect, the TMR ratio is widely adopted. It is defined by: $TMR = (R_{AP} - R_{P})/R_{P}$, where $R_{AP}$ and $R_{P}$ are the resistances in AP and P states, respectively. The higher the TMR, the easier it becomes for sense amplifiers to distinguish the magnetic states correctly. For commercially-feasible STT-MRAM products, a minimum TMR ratio of 150% is required [16]. $R_{P}$ can be physically modeled as:

$$R_{P} = \frac{t_{OX}}{C_{1} \cdot \sqrt{\varphi} \cdot A} \cdot \exp(C_{2} \cdot t_{OX} \cdot \sqrt{\varphi}) \tag{1}$$

where $\varphi$ is the potential barrier height of MgO, $A = \frac{1}{4}\pi d^2$ the horizontal cross-section of the MTJ device with diameter $d$, and $C_{1}$ and $C_{2}$ are fitting coefficients depending on the $RA$ product as well as the material composition of the MTJ layers. Given a $TMR$ ratio, $R_{AP}$ can be approximately calculated by:

$$R_{AP} = R_{P} \cdot (1 + TMR) \tag{2}$$
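As a quick numerical illustration of Equation (2), the sketch below estimates both resistance states from the $RA$ product and the TMR ratio. It is not the compact model used in this paper: it assumes a circular pillar of diameter $d$ and replaces Equation (1) by the simpler low-bias approximation $R_P \approx RA/A$. The function name `mtj_resistances` and the device values are hypothetical.

```python
import math

def mtj_resistances(ra_ohm_um2: float, diameter_nm: float, tmr: float) -> tuple:
    """Return (R_P, R_AP) in ohms for a circular MTJ pillar.

    Assumption: R_P ~= RA / A at low bias, instead of the fitted Equation (1).
    """
    diameter_um = diameter_nm * 1e-3
    area_um2 = math.pi * diameter_um ** 2 / 4.0  # assumed circular cross-section
    r_p = ra_ohm_um2 / area_um2                  # parallel-state resistance
    r_ap = r_p * (1.0 + tmr)                     # Equation (2)
    return r_p, r_ap

# Hypothetical device: RA = 10 ohm*um^2, d = 50 nm, TMR = 150 %
r_p, r_ap = mtj_resistances(ra_ohm_um2=10.0, diameter_nm=50.0, tmr=1.5)
print(f"R_P = {r_p:.0f} ohm, R_AP = {r_ap:.0f} ohm")
```

With these assumed values both resistances fall in the kilo-ohm range, and the ratio $R_{AP}/R_P$ equals $1 + TMR$ by construction.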

In order to switch between the AP and P states, a spin-polarized current is applied across the MTJ device to reverse the magnetization of FL by the spin-transfer torque (STT) [16,25,26]. The minimum energy required for a write operation should be larger than the energy barrier ($E_{B}$) between P and AP states. Fig. 1(b) illustrates the two states and $E_{B}$, which is given by [5]:

$$E_{B} = \frac{\mu_{0} \cdot t_{FL} \cdot M_{s} \cdot A \cdot H_{k}}{2} \tag{3}$$

where $\mu_{0}$ is the vacuum permeability, $M_{s}$ the saturation magnetization, and $H_{k}$ the magnetic anisotropy field. The magnetization dynamics in the STT switching process can be modeled by the Landau-Lifshitz-Gilbert (LLG) equation under the macrospin assumption of the FL nanomagnet [27–29]. By solving the LLG equation, the following expression for the critical switching current ($I_{c}$) is derived [30]:

$$I_{c} = 2\alpha \cdot \frac{\gamma \cdot e}{\mu_{B} \cdot g(P, \theta)} \cdot E_{B} \tag{4}$$

where $\alpha$ is the magnetic damping constant, $\gamma$ the gyromagnetic ratio, $e$ the elementary charge, $\mu_{B}$ the Bohr magneton, and $g(P, \theta)$ a function of the spin polarization ($P$) of the tunnel current and the angle ($\theta$) between the magnetizations of the FL and PL [23]. Apart from the requirement on the write current amplitude, STT switching requires a minimum duration of current application. The average switching time ($t_w$) is given by [23,26]:

\[
\frac{1}{t_w} = \frac{2}{C + \ln\left(\frac{\pi^2 \Delta}{4}\right)} \cdot \frac{\mu_B \cdot P}{e \cdot m \cdot (1 + P^2)} \cdot I_{\text{margin}} \tag{5}
\]

\[
I_{\text{margin}} = I_w - I_c \tag{6}
\]

where \(C \approx 0.577\) is Euler’s constant, \(\Delta = \frac{E_B}{k_B T}\) the thermal stability (with \(k_B\) the Boltzmann constant and \(T\) the temperature), \(P\) the spin polarization of FL and PL, \(e\) the elementary charge, \(m\) the FL magnetization, \(I_w\) the write current, and \(I_{\text{margin}}\) the write current margin. Equation (5) indicates that the switching time is inversely correlated with the write current margin: the higher the write current, the faster the magnetization switching.

In summary, RA, TMR, \(\bar{\varphi}\), \(M_s\), and \(H_k\) are critical technology parameters of the MTJ device; these may be impacted by physical defects. At the electrical level, these parameters influence the four electrical parameters that determine the MTJ behavior: \(R_P\), \(R_{AP}\), \(I_c\), and \(t_w\).
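The chain from technology parameters to \(I_c\) and \(t_w\) can be sketched numerically as follows. This is a minimal illustration rather than the compact model of [23,59]: it assumes the forms \(I_c = 2\alpha\gamma e E_B / (\mu_B g)\) and \(1/t_w = \frac{2}{C + \ln(\pi^2\Delta/4)} \cdot \frac{\mu_B P}{e m (1+P^2)} \cdot (I_w - I_c)\), takes \(m\) as the FL magnetic moment \(M_s \cdot A \cdot t_{FL}\), and uses hypothetical device values (only \(t_{FL} = 1.5\) nm comes from the text).

```python
import math

MU_0 = 4e-7 * math.pi   # vacuum permeability [T*m/A]
MU_B = 9.274e-24        # Bohr magneton [J/T]
E_CHARGE = 1.602e-19    # elementary charge [C]
GAMMA = 1.76e11         # gyromagnetic ratio [rad/(s*T)]
K_B = 1.381e-23         # Boltzmann constant [J/K]
EULER_C = 0.5772        # Euler's constant

def energy_barrier(t_fl, m_s, area, h_k):
    """Equation (3): energy barrier in joules (SI inputs, H_k in A/m)."""
    return MU_0 * t_fl * m_s * area * h_k / 2.0

def critical_current(alpha, g, e_b):
    """Equation (4): critical switching current in amperes; g stands for g(P, theta)."""
    return 2.0 * alpha * GAMMA * E_CHARGE / (MU_B * g) * e_b

def switching_time(i_w, i_c, delta, p, m):
    """Equations (5)-(6): average switching time in seconds, valid for I_w > I_c."""
    margin = i_w - i_c
    if margin <= 0.0:
        raise ValueError("write current must exceed the critical current")
    rate = (2.0 / (EULER_C + math.log(math.pi ** 2 * delta / 4.0))
            * MU_B * p / (E_CHARGE * m * (1.0 + p ** 2)) * margin)
    return 1.0 / rate

# Hypothetical device values (assumptions, not taken from the paper)
d, t_fl = 40e-9, 1.5e-9             # pillar diameter and FL thickness [m]
m_s, h_k = 1.0e6, 2.0e5             # saturation magnetization and anisotropy field [A/m]
alpha, g, p, temp = 0.01, 0.4, 0.6, 300.0

area = math.pi * d ** 2 / 4.0
e_b = energy_barrier(t_fl, m_s, area, h_k)
delta = e_b / (K_B * temp)          # thermal stability
i_c = critical_current(alpha, g, e_b)
m = m_s * area * t_fl               # FL magnetic moment [A*m^2]
t_w = switching_time(70e-6, i_c, delta, p, m)
print(f"Delta = {delta:.1f}, I_c = {i_c * 1e6:.1f} uA, t_w = {t_w * 1e9:.2f} ns")
```

Because \(E_B\) enters both \(I_c\) and \(\Delta\), a defect that changes \(M_s\) or \(H_k\) shifts the critical current and the switching time at once, which is why these are treated as the key technology parameters.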

C. 1T-1MTJ Bit-cell Design

The 1T-1MTJ bit-cell design is the most widely-adopted cell design, comprising an MTJ device connected serially with an access transistor [31,32], as shown in Fig. 2(a). The MTJ in this structure serves as a resistive storage element, while the access transistor, typically NMOS, is responsible for selective access. The NMOS gate is connected to a word line (WL), which determines whether a row is accessed or not. The other two terminals are connected to a bit line (BL) and a source line (SL), respectively. They control write and read operations on the internal MTJ device depending on the magnitude and polarity of the voltage applied across them.

Fig. 2(b)-(d) show the three basic operations: write “0”, write “1”, and read. During a write “0” operation, WL and BL are pulled up to \(V_{DD}\) and SL is grounded, thus leading to a current \((I_{w0})\) flowing from BL to SL. In contrast, a write “1” operation requires the opposite current through the MTJ device with WL and SL at \(V_{DD}\), and BL grounded. In order to avoid write failures, write currents in both directions should be greater than the critical switching current \(I_c\). However, the current during a write “1” operation \((I_{w1})\) is slightly smaller than during a write “0” operation \((I_{w0})\), due to the source degeneration of NMOS in write “1” operations [33,34]. For read operations, a read voltage \(V_{\text{read}}\) is applied; it leads to a read current \((I_{\text{rd}})\) with the same direction as \(I_{w0}\) to sense the resistive state \((\text{AP/P})\) of MTJ.

To avoid an inadvertent state change during read operations, known as read disturb, \(I_{\text{rd}}\) should be as small as possible; typically \(I_{\text{rd}} < 0.5I_c\) for MTJs with a thermal stability of \(\Delta = 65\) [35]. However, a too low \(I_{\text{rd}}\) may lead to an incorrect read fault [36]. In general, the current magnitudes must satisfy \(I_{\text{rd}} < I_c < I_{w1} < I_{w0}\), as indicated by the widths of the red arrows in Fig. 2. A read operation requires a sense amplifier to determine the resistive state. The sense amplifier may be implemented using a current sensing scheme, where the read-out value is determined by comparing the current of the accessed cell \((I_{\text{cell}})\) with the current of a reference cell \((I_{\text{ref}})\). The sensing result is logical “0” if \(I_{\text{cell}} < I_{\text{ref}}\); otherwise, it outputs logical “1”.
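The current relations above lend themselves to a simple sanity check on simulated or measured cell currents. The following sketch is only illustrative; the class `CellCurrents`, the function `check_currents`, and all current values are hypothetical, and the 0.5 read limit is the typical figure quoted above.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class CellCurrents:
    """Current magnitudes of one 1T-1MTJ cell in amperes (illustrative values)."""
    i_rd: float  # read current
    i_c: float   # critical switching current
    i_w1: float  # write "1" current
    i_w0: float  # write "0" current

def check_currents(c: CellCurrents, read_limit: float = 0.5) -> List[str]:
    """Return a list of violated relations; an empty list means all hold."""
    problems = []
    if not c.i_rd < read_limit * c.i_c:
        problems.append("read disturb risk: I_rd is not below the read limit")
    if not c.i_c < c.i_w1:
        problems.append("write failure risk: I_w1 does not exceed I_c")
    if not c.i_w1 < c.i_w0:
        problems.append("unexpected ordering: I_w1 is not below I_w0")
    return problems

nominal = CellCurrents(i_rd=15e-6, i_c=36e-6, i_w1=65e-6, i_w0=75e-6)
print(check_currents(nominal))
```

A defect that raises \(I_c\) or lowers the write currents would show up here as a non-empty list before any fault model is assigned.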

III. STT-MRAM MANUFACTURING DEFECTS

The STT-MRAM manufacturing process mainly consists of the standard CMOS fabrication steps and the integration of MTJ devices into metal layers (e.g., between M4 and M5 layers [37,38]). Fig. 3(a) shows the bottom-up manufacturing flow and Fig. 3(b) the vertical structure of STT-MRAM cells [39]. Based on the manufacturing phase, STT-MRAM defects can be classified into front-end-of-line (FEOL) and back-end-of-line (BEOL) defects. As MTJs are integrated into metal layers during BEOL processing, BEOL defects can be further categorized into MTJ fabrication defects and metalization defects. All potential defects are listed in Table I. Next, we will examine them in detail along with their corresponding processing steps, with a particular emphasis on those introduced during MTJ fabrication.

A. FEOL Defects

The first step of the STT-MRAM manufacturing process is the FEOL process where transistors are fabricated on the wafer. In this phase, typical defects may occur such as semiconductor impurities, crystal imperfections, pinholes in gate oxides, and shifting of dopants [40,41]. These are the conventional defects which have been sufficiently studied and are generally modeled by resistive opens, shorts and bridges [42–44].

B. BEOL Defects

After FEOL, M1-M4 metal layers are stacked on top of the transistors followed by a bottom electrode contact (BEC), as illustrated in the zoomed-in part of Fig. 3(b). M1-M4 metalization does not differ from traditional CMOS BEOL steps. The BEC step is used to connect bottom Cu lines with MTJ stacks [18,39]. During this phase, typical interconnect defects may take place, such as open vias/contacts, irregular shapes, big bubbles, etc. [42]. A related specific defect type is provided in [39], where an open contact between the Cu line and BEC has been observed with transmission electron microscopy (TEM) due to polymer leftovers.

To obtain a super-smooth interface between the BEC and the MTJ stack, a chemical mechanical polishing (CMP) step is required. The smoothness of the interface between layers is key to obtaining a good TMR value. CMP processing minimizes the surface roughness with a root-mean-square average of 2 Å [37]. At this stage, both under-polishing and over-polishing of the surface can introduce defects. Specifically, under-polishing causes issues such as orange peel coupling or offset fields which affect the hysteresis curve, while over-polishing may result in dishing or residual slurry particles that are left behind [45].

After the CMP step, the next critical step is the fabrication of the MTJ stack. The latest published MTJ design includes more than 10 layers for performance reasons [46]. However, the increasingly sophisticated design of the MTJ also makes it more vulnerable to manufacturing defects. For example, pinholes in the tunneling barrier (e.g., MgO) could be introduced in this phase [47]. A pinhole filled with CoFeB material forms a defective high-conductance path across the two ferromagnetic layers. It severely degrades the resistance and TMR values, and may even lead to breakdown due to the ohmic heating when an electric current passes through the barrier [48]. Furthermore, the MgO barrier thickness variation and interface roughness result in degradation of resistance and TMR values as well. TEM images in [47] show that the MgO barrier thickness varies from 0.86 nm to 1.07 nm, leading to a huge difference in resistance. In [18], a TMR degradation was observed due to increased surface roughness caused by a complicated inner synthetic anti-ferromagnetic (iSAF) pinned layer design.

After the MTJ stack deposition, annealing is applied to obtain crystallization in the MgO tunneling barrier as well as in the CoFeB PL and FL layers [49,50]. At this stage, the PMA originating from the MgO/CoFeB interface and the TMR value are strongly determined by the annealing conditions such as temperature, magnetic field, and annealing time [49]. With appropriate annealing conditions, the PMA can be considerably enhanced, leading to higher thermal stability [50]. Under-annealing can lead to lattice mismatch between the body-centered cubic (bcc) CoFeB lattice and the face-centered cubic (fcc) MgO lattice, whereas over-annealing introduces atom inter-diffusion between layers. For example, oxygen atoms can diffuse out of the MgO layer to the spacer layers, leaving behind oxygen vacancies, thus severely degrading the TMR value [51].

After MTJ multi-layer deposition and annealing, the next crucial step is to pattern individual MTJ nanopillars [52]. Ion beam etching (IBE) is widely used for this purpose [53,54]. During the MTJ etching process, it is extremely difficult to obtain MTJ nanopillars with steep sidewall edges, while avoiding sidewall redeposition and magnetic layer corrosion [47].

The redeposition phenomenon on sidewalls may significantly deteriorate the electrical properties of the MTJ device and even cause a barrier-short defect. In order to mitigate the redeposition effect, a side-etching step combined with halogen-based reactive ion etching (RIE) and inductively-coupled plasma (ICP) techniques [55–57] is needed, in which the wafer is rotated and tilted. Nevertheless, other concerns arise. For instance, the shadowing effect (limited etching coverage at the lower corner of the MTJ profile due to insufficient spacing between MTJs) [47,58] limits high-density array patterning, and magnetic layer corrosion degrades the reliability of MTJ devices due to the non-volatile chemicals attached to the CoFeB layers.

After the MTJ etch processing, encapsulation and CMP are required to separate the MTJ pillars. Thereafter, these MTJ pillars are connected to the top electrode contact, followed by M5 metalization. The remaining steps of the manufacturing process are the same as the BEOL of conventional CMOS technology. Typical defects such as open contacts/vias, small particles, etc. can occur in these steps.

IV. MODELING METHODOLOGY FOR STT-MRAM-SPECIFIC DEFECTS

As already mentioned in the introduction, traditionally researchers have been using resistive defects (i.e., shorts, bridges, and opens) as electrical models to represent physical defects for MRAM fault modeling [9,13,14]. However, none of these publications above has provided a clear justification on how real STT-MRAM physical defects can be modeled as linear resistors. Inaccurate defect modeling may result in poor fault models, thereby limiting the effectiveness of the corresponding DfT. To accurately model the physical defects, we propose a different defect modeling approach which captures the non-linear behavior of STT-MRAM physical defects accurately.

<table>
<thead>
<tr>
<th>Table I: STT-MRAM DEFECT CLASSIFICATION.</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>BEOL</strong></td>
</tr>
<tr>
<td><strong>Transistor fabrication</strong></td>
</tr>
<tr>
<td><strong>Pinholes in TB</strong></td>
</tr>
<tr>
<td><strong>Atom inter-diffusion</strong></td>
</tr>
<tr>
<td><strong>Re-deposition on MTJ sidewalls</strong></td>
</tr>
<tr>
<td><strong>Magnetic coupling</strong></td>
</tr>
<tr>
<td></td>
</tr>
</tbody>
</table>
Fig. 4. Electrical modeling flow for STT-MRAM-specific defects.

Fig. 4 illustrates the generic modeling flow for STT-MRAM-specific defects, which can be described in three steps as follows.

1) **Physical defect analysis and modeling.** Given a set of physical defects \( D = \{d_1, d_2, \ldots, d_n\} \) that may occur during MTJ fabrication, each defect \( d_i \) has to be physically analyzed and modeled. The effect of defect \( d_i \) can be reflected by a change of the key MTJ-related technology parameters: \( RA, \) \( TMR, \) \( \varphi, \) \( M_s, \) and \( H_k \) (see Section II). This results in effective technology parameters that can be denoted as:

\[
RA_{eff,i}(S_i) = f_i(RA_{df}, S_i) \tag{7}
\]
\[
TMR_{eff,i}(S_i) = g_i(TMR_{df}, S_i) \tag{8}
\]
\[
\varphi_{eff,i}(S_i) = r_i(\varphi_{df}, S_i) \tag{9}
\]
\[
M_{s,eff,i}(S_i) = k_i(M_{s,df}, S_i) \tag{10}
\]
\[
H_{k,eff,i}(S_i) = h_i(H_{k,df}, S_i) \tag{11}
\]

where \( f_i, \) \( g_i, \) \( r_i, \) \( k_i, \) and \( h_i \) are mapping functions corresponding to defect \( d_i \) \((i \in [1, n])\). \( RA_{df}, \) \( TMR_{df}, \) \( \varphi_{df}, \) \( M_{s,df}, \) and \( H_{k,df} \) are the defect-free technology parameters. \( S_i = \{x_1, x_2, \ldots, x_k\} \) is a set of parameters representing the size or strength of defect \( d_i \). It is worth noting that each defect may impact one or more technology parameters. For example, the pinhole defect mainly impacts \( RA \) and \( TMR \) parameters; the defect size is represented by the pinhole area \( A_{ph} \) (i.e., \( S_i = \{A_{ph}\} \)). We will discuss this case in detail in the next section.

2) **Electrical modeling of the defective MTJ device.** In this step, the impact of the updated technology parameters from Step 1 on the electrical parameters is identified; it reflects the way such defect \( d_i \) influences the electrical parameters of the MTJ device. This can be done for example by updating the electrical parameters of the defect-free MTJ model (e.g., the Verilog-A compact model for PMA-MTJ in [23,59]). Note that the electrical parameters are the ones needed for accurate circuit simulation for fault modeling. As discussed in Section II, the key electrical parameters that determine the MTJ electrical behavior consist of \( R_P, \) \( R_{AP}, \) \( I_c, \) and \( t_w \) (see Equations (1,2,4,5)). This step enables us to obtain a raw defective MTJ model.

3) **Fitting and model optimization.** To validate the effectiveness of the defective MTJ model, it is crucial to fit the defective model to measurement data of real defective MTJ devices. If the behavior of the defective model (either its physical or electrical parameters) does not match the characterization data, the model or its parameters must be adjusted until an acceptable accuracy is obtained. Finally, we derive an optimized defect-parameterized compact model for defective MTJ devices.

The above developed electrical model for defective MTJ devices enables accurate and appropriate defect injection and circuit simulation for each defect \( d_i \). In the next two sections, we will use the proposed methodology for two common defects (the pinhole and sidewall redeposition [47]), not only to illustrate the methodology, but also to show its superiority in terms of defect modeling.
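The first two steps of this flow can be pictured as a small data pipeline: a defect is a function that maps defect-free technology parameters and a size set \( S_i \) to effective parameters, which are then translated into electrical parameters. The sketch below only illustrates that structure and is not the Verilog-A implementation; `TechParams`, `toy_hk_defect`, `inject`, and `electrical_view` are hypothetical names, the toy mapping is invented for illustration, and \( R_P \approx RA/A \) is used as a simplifying assumption.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict

@dataclass(frozen=True)
class TechParams:
    """Key MTJ technology parameters (see Section II)."""
    ra: float   # resistance-area product [ohm*um^2]
    tmr: float  # TMR ratio (1.5 means 150 %)
    phi: float  # barrier height [eV]
    m_s: float  # saturation magnetization [A/m]
    h_k: float  # anisotropy field [A/m]

# A defect maps (defect-free parameters, size set S_i) to effective parameters.
DefectMap = Callable[[TechParams, Dict[str, float]], TechParams]

def toy_hk_defect(df: TechParams, size: Dict[str, float]) -> TechParams:
    """Purely illustrative mapping: scale H_k by (1 - z), leave the rest untouched."""
    return replace(df, h_k=df.h_k * (1.0 - size["z"]))

def inject(df: TechParams, defect: DefectMap, size: Dict[str, float]) -> TechParams:
    """Step 1: obtain the effective technology parameters for one defect."""
    return defect(df, size)

def electrical_view(p: TechParams, area_um2: float) -> Dict[str, float]:
    """Step 2 (partial): resistances only, assuming R_P ~= RA / A and Equation (2)."""
    r_p = p.ra / area_um2
    return {"R_P": r_p, "R_AP": r_p * (1.0 + p.tmr)}

defect_free = TechParams(ra=10.0, tmr=1.5, phi=0.4, m_s=1.0e6, h_k=2.0e5)  # assumed values
defective = inject(defect_free, toy_hk_defect, {"z": 0.2})
print(defective)
print(electrical_view(defective, area_um2=0.002))
```

Keeping each defect as a separate mapping function means that Step 3 (fitting) only has to adjust that function and its size parameters, not the defect-free model.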

V. MODELING OF PINHOLES

In this section, we take the pinhole defect in the MgO barrier of MTJ as an example to illustrate how our proposed defect modeling methodology is applied. Thereafter, we simulate a single 1T-1MTJ bit-cell with the injected pinhole defect and compare our proposed model with the conventional resistive defect model.

A. Defect Modeling

As already mentioned, the defect modeling consists of three steps:

1) **Physical defect analysis and modeling:** As mentioned in Section III, a pinhole defect \( d_{ph} \) in the MgO barrier has a significant impact on the electron tunneling behavior, which manifests itself as a degradation of the \( RA \) and \( TMR \) parameters [60]. Oliver et al. showed in [48,60] that pre-existing pinholes in the insulating barrier of the MTJ grow in area over time as a consequence of Joule heating and/or an electric field across the pinhole circumference. Therefore, if the pinhole is not detected, it might cause a breakdown over time. The effective RA and TMR of an MTJ with pinhole defects comply with [60]:

$$RA_{\text{eff}, \text{ph}}(A_{\text{ph}}) = \frac{A}{\frac{A \cdot (1 - A_{\text{ph}})}{RA_{\text{df}}} + \frac{A \cdot A_{\text{ph}}}{RA_{\text{bd}}}}$$  \hspace{1cm} (12)

$$TMR_{\text{eff}, \text{ph}}(A_{\text{ph}}) = TMR_{\text{df}} \cdot \frac{RA_{\text{eff}, \text{ph}}(A_{\text{ph}}) - RA_{\text{bd}}}{RA_{\text{df}} - RA_{\text{bd}}}$$  \hspace{1cm} (13)

where $A_{\text{ph}} \in [0, 1]$ is the normalized pinhole area with respect to the cross-sectional area $A$ of the MTJ device; $S_{\text{ph}} = \{A_{\text{ph}}\}$ in this case. $RA_{\text{df}}$ and $TMR_{\text{df}}$ are the defect-free MTJ’s RA and TMR parameters (i.e., when $A_{\text{ph}} = 0$), respectively. $RA_{\text{bd}}$ is the resultant RA after breakdown of the MTJ device. Note that the pinhole impact on the other technology parameters $\varphi$, $M_s$, and $H_k$ is negligible [48].

We simulated the effective technology parameters in Matlab. We replaced the initial defect-free RA and TMR parameters with Equations (12-13) to observe how they change with the pinhole defect. Fig. 5 shows the impact of pinhole defects on the TMR and RA parameters; clearly the effective RA (left y-axis) decreases exponentially with the pinhole area when less than $\sim 20\%$ of the MTJ’s cross-section. This means that the tunneling magneto-resistance dominates the MTJ resistance for small pinhole defects. When $A_{\text{ph}}$ is larger than $20\%$, the resistance of the MTJ behaves like a metal resistor. The TMR parameter (right y-axis) degrades in a similar way with the normalized pinhole area as shown in Fig. 5. This is because the pinhole defect introduces a competition between the current going through the undamaged part $(A - A_{\text{ph}})$ of the barrier and the current going through the pinhole area, and only the former accounts for the TMR effect [60].

2) Electrical modeling of the defective MTJ device: The mapping from technology parameters to electrical parameters (i.e., $R_P$, $R_{AP}$, $I_c$, $t_w$) is realized by a number of physical models, which are mainly described by Equations (1,2,4,5). For the defective MTJ model, we replaced the original RA and TMR parameters with the effective ones in Equations (12-13). Thus, we obtained a pinhole-adjustable defective Verilog-A PMA-MTJ model with an input argument $A_{\text{ph}}$. With this model, we are able to evaluate how the pinhole defect impacts the MTJ’s electrical behavior. Fig. 6(a) shows that the pinhole defect shrinks the R-V hysteresis loop, indicating that both write and read operations are affected. As the hysteresis loop shrinks below a certain threshold (depending on the pinhole area $A_{\text{ph}}$), it becomes impossible to distinguish between the two states, leading to a stuck-at fault (SAF). Fig. 6(b) illustrates that the critical switching current $I_c$ gradually increases with $A_{\text{ph}}$ as long as $A_{\text{ph}}$ is less than $\sim 80\%$; above $\sim 80\%$, $I_c$ increases exponentially. The increase in $I_c$ results from the degradation of spin polarization $P$ due to the pinhole defect. This means more current is required in order to switch the MTJ state. However, it is worth noting that for $A_{\text{ph}}$ larger than $10\%$, $I_c$ is no longer that important, since the MTJ behaves as an SAF as discussed previously. Fig. 6(c) shows the effect of pinhole defects on the STT switching time $t_w$. It can be seen that $t_w$ decreases with $A_{\text{ph}}$, and stabilizes around $A_{\text{ph}} = 10\%$. The decrease in $t_w$ is due to an increase in the write current margin (see Equations (5-6)). Note that in Equation (6), $I_c$ increases with the pinhole defect as shown in Fig. 6(b). However, the write current increases faster, as the MTJ resistance declines significantly with the pinhole defect (see Fig. 6(a)). This indicates that the writability (write latency) of the MTJ is enhanced by pinhole defects. However, it is worth noting that the increased programming current also makes the MTJ device more vulnerable to a permanent breakdown.

3) Fitting and model optimization: The above pinhole defect model is consistent with measurement results of fabricated MTJ devices with the Ta/PtMn/CoFe/Ru/CoFe/AlO_x/CoFeNiFe/Ta structure proposed in [60]. Although the MTJ design in [60] is based on an AlO_x tunneling barrier, the model presented previously is also applicable to MTJ designs with a MgO barrier. Note that the crystalline MgO barrier provides a much higher TMR value than the amorphous AlO_x barrier [61]. Therefore, most recent MTJ designs adopt a single MgO or double MgO structure [38,39,46]. Since we do not have measurement data of the RA breakdown value $RA_{\text{bd}}$ for the MTJ model of [59], we instead swept the pinhole defect size $A_{\text{ph}}$. Without loss of generality, we set $RA_{\text{bd}}$ to $0.2 \, \Omega \, \mu m^2$ in our simulations to get a similar curve shape to the one reported based on measurement data in [60].
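The pinhole mapping of Step 1 can be reproduced with a few lines of code. The sketch below assumes the parallel-conduction form of Equation (12), in which the undamaged barrier area and the pinhole area conduct side by side (the cross-section $A$ cancels), together with Equation (13). $RA_{\text{bd}} = 0.2 \, \Omega \, \mu m^2$ is the value stated above; $RA_{\text{df}} = 10 \, \Omega \, \mu m^2$ and $TMR_{\text{df}} = 150\%$ are assumed for illustration only, and the function names are hypothetical.

```python
def ra_eff_pinhole(a_ph: float, ra_df: float, ra_bd: float) -> float:
    """Equation (12), assumed form: undamaged and pinhole areas conduct in parallel."""
    if not 0.0 <= a_ph <= 1.0:
        raise ValueError("normalized pinhole area must lie in [0, 1]")
    return 1.0 / ((1.0 - a_ph) / ra_df + a_ph / ra_bd)

def tmr_eff_pinhole(a_ph: float, ra_df: float, ra_bd: float, tmr_df: float) -> float:
    """Equation (13): TMR scales with the remaining RA above the breakdown value."""
    ra_eff = ra_eff_pinhole(a_ph, ra_df, ra_bd)
    return tmr_df * (ra_eff - ra_bd) / (ra_df - ra_bd)

RA_DF = 10.0   # assumed defect-free RA [ohm*um^2]
RA_BD = 0.2    # RA after breakdown [ohm*um^2], as set in the simulations above
TMR_DF = 1.5   # assumed defect-free TMR ratio (150 %)

for a_ph in (0.0, 0.01, 0.05, 0.2, 1.0):
    ra = ra_eff_pinhole(a_ph, RA_DF, RA_BD)
    tmr = tmr_eff_pinhole(a_ph, RA_DF, RA_BD, TMR_DF)
    print(f"A_ph = {a_ph:4.2f}: RA_eff = {ra:6.3f} ohm*um^2, TMR_eff = {tmr * 100:6.1f} %")
```

Under these assumptions the two limiting cases behave as expected: $A_{\text{ph}} = 0$ returns the defect-free values, while $A_{\text{ph}} = 1$ gives $RA_{\text{bd}}$ and a vanishing TMR.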

In summary, the above observations clearly demonstrate that resistive defect models are not appropriate to do fault modeling. Our proposed defect model, by contrast, captures the non-linear behavior of STT-MRAM defects by adjusting the affected technology parameters.

VI. MODELING OF SIDEWALL REDEPOSITION

In a similar way as for the pinhole defect, we will first propose an accurate electrical model for the sidewall redeposition defect. Thereafter, we simulate a single 1T-1MTJ bit-cell with the injected sidewall redeposition defect and compare it with the resistive defect model.

A. Defect Modeling

1) Physical defect analysis and modeling: Sidewall redeposition defects may be introduced during the MTJ pillar
Eq. (14) and (15). Therefore, we consider the two-step etching process as an example, since it causes less damage to the MTJ device and therefore is preferable in practice. Fig. 9(a) shows how sidewall redeposition defects influence the R-V hysteresis loop of an MTJ device for $a=0.8$. It can be seen that the critical switching voltage reduces for both AP→P and P→AP transitions, whereas $R_P$ and $R_{AP}$ are not affected by the sidewall redeposition defects. This is because the reduced $H_k$ of the free layer makes the magnetization easier to flip, while the degradation of $RA$ and $TMR$ is negligible, as mentioned before. Fig. 9(b) shows that the critical switching current $I_c$ decreases with the sidewall redeposition due to a lower energy barrier between the P and AP states (see Equation (3)). Fig. 9(c) illustrates that the switching time $t_w$ also decreases due to the sidewall redeposition defect, as the reduced $I_c$ leads to a higher write current margin (see Equation (6)).
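
The link between a reduced $I_c$ and a shorter $t_w$ can be made concrete with a small sketch. It assumes only that $1/t_w$ scales linearly with the write current margin $I_w - I_c$ and that every other factor stays fixed; this is a simplification rather than a restatement of Equation (6). The function name and the current values are hypothetical.

```python
def relative_switching_time(i_w, i_c_ref, i_c_defect):
    """Ratio t_w(defective) / t_w(reference) at a fixed write current i_w.

    Assumption: 1/t_w is proportional to the margin (i_w - i_c), with the
    same proportionality factor for the reference and the defective device.
    """
    if i_w <= i_c_ref or i_w <= i_c_defect:
        raise ValueError("write current must exceed both critical currents")
    return (i_w - i_c_ref) / (i_w - i_c_defect)


if __name__ == "__main__":
    I_W = 100.0     # uA, hypothetical write current (unchanged by the defect)
    I_C_REF = 60.0  # uA, hypothetical defect-free critical current
    for scale in (1.0, 0.9, 0.8):
        ratio = relative_switching_time(I_W, I_C_REF, scale * I_C_REF)
        print(f"I_c scaled by {scale:.1f}: t_w ratio = {ratio:.3f}")
```

Because the write current stays the same while $I_c$ drops, the margin grows and the ratio falls below 1, which matches the trend reported for Fig. 9(c).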

3) Fitting and model optimization: The above simulation results of our proposed model for the sidewall redeposition defects are consistent with the test results in [64]. Although the MTJ design (CAP/NiFe/AlO$_x$/CoFe/Ru/CoFe/PtMn) in [64] differs from the PMA-MTJ model [59] used here, the observed impact of sidewall redeposition defects on the MTJ’s parameters is generic. By adjusting the variables $a$ and $z$, our proposed defect model can be fitted to a specific MTJ design.

B. Comparison With Resistive Defect Model

Fig. 10(a) illustrates the transient simulations of write “1” operations for 10 ns on three 1T-1MTJ bit-cells with the ideal defect-free MTJ model, our proposed defective MTJ model, and the conventional resistive defect MTJ model. The solid green curve shows that the defect-free cell undergoes a P→AP transition after $t_w = 6.05$ ns. Since we assume that a sidewall redeposition defect only degrades $H_k$ due to the two-step etching process, our model shows that the amplitude of the write current is nearly independent of the defect. Note that we already observed in Fig. 9(a) that the defect does not impact the resistance in either the AP or the P state. However, the switching time decreases to 5.63 ns, 4.91 ns, and 4.67 ns for $z = 20\%$, $z = 60\%$, and $z = 80\%$, respectively. These results cannot be obtained with the resistive defect MTJ model. The red curves (representing resistive defects) in Fig. 10(a) show that the MTJ state does not switch for a
parallel resistor $R = 1 \, k\Omega$ or $10 \, k\Omega$; only if the resistance is very high $(100 \, k\Omega)$ does the MTJ state switch, and then with a longer switching time than in the defect-free cell. In addition, the current through the MTJ depends strongly on the resistance value, which is not the case for our model.

Similarly, Fig. 10(b) compares a write “0” operation using the three MTJ models. With the conventional resistive defect, the memory cell suffers from a slow transition or even a transition failure. Our proposed model, however, leads to a fast transition without current elevation.
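
To see why a parallel resistor makes the MTJ current so sensitive to its value, consider the following minimal sketch. It replaces the bit-cell with a purely linear divider: a supply voltage, a series resistance standing in for the access transistor, and the MTJ in parallel with the defect resistor. The linearization, the function name `mtj_current`, and all component values other than the three defect resistances are assumptions for illustration and are not taken from the simulated circuit.

```python
import math


def mtj_current(v_supply, r_series, r_mtj, r_defect=math.inf):
    """Current through the MTJ when a defect resistor sits in parallel.

    Assumption: the access transistor is a fixed series resistance and
    the MTJ is a fixed resistance, i.e. a purely linear divider.
    """
    if math.isinf(r_defect):
        r_par = r_mtj  # no parallel path: defect-free case
    else:
        r_par = r_mtj * r_defect / (r_mtj + r_defect)
    v_mtj = v_supply * r_par / (r_series + r_par)
    return v_mtj / r_mtj


if __name__ == "__main__":
    V_SUPPLY = 1.0     # V, hypothetical
    R_SERIES = 2000.0  # ohm, hypothetical access-transistor resistance
    R_MTJ = 3000.0     # ohm, hypothetical MTJ resistance
    for r_def in (1e3, 1e4, 1e5, math.inf):
        i_ua = 1e6 * mtj_current(V_SUPPLY, R_SERIES, R_MTJ, r_def)
        print(f"R_defect = {r_def:>8.0f} ohm  I_MTJ = {i_ua:6.1f} uA")
```

In this simplified picture a low-valued parallel resistor diverts a large share of the write current away from the MTJ, while a $100 \, k\Omega$ resistor leaves it close to the defect-free value. This agrees with the observation above that the resistive model predicts a transition failure for small $R$ and a slow transition for large $R$.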

VII. DISCUSSION AND CONCLUSION

In this work, we have provided an alternative defect modeling methodology to the conventional one, which is simply based on a linear resistor. By applying it to pinhole and sidewall redeposition defects, we have shown how our approach outperforms the conventional approach in terms of accuracy and in dealing with the non-linear nature of STT-MRAM devices.

The results of this work clearly show that the conventional approach of resistive defect injection and circuit simulation is not accurate for STT-MRAM fault modeling; it can result in wrong fault models, which can in turn result in test algorithms and DfT solutions with low actual fault coverage. The work also demonstrated the importance of understanding and modeling the impact of defects on technology parameters, and thereafter on the electrical parameters, in order to appropriately predict the fault behavior of the STT-MRAM.

However, providing accurate models also requires data and measurements in order to tune them and make them match the real world. Hence, appropriate fault modeling for STT-MRAM requires several aspects to be explored:

- Understanding of STT-MRAM physics and technology.
- Understanding of the physics of unique STT-MRAM defect mechanisms, their occurrence probability, location, etc. and how they influence the STT-MRAM physics.
- Although several applied-physics papers [48,58,60,65] examine the physical behavior of STT-MRAM manufacturing defects and their respective electrical impact, more work needs to be done in this area.
- Understanding of how the STT-MRAM physical and technology parameters influence the electrical parameters and behavior.
- Collecting measurements/characterization data of defective STT-MRAM cells to calibrate the models, and iterating on the models if needed to get the right matching. Note that the occurrence of defects and their impact are always dependent on processing technology. Therefore, fitting and model optimization is vital to ensure the accuracy of the defect models for a specific STT-MRAM design and manufacturing process.

The interaction between the above disciplines is needed for efficient fault modeling, and this on its own is a challenge. Clearly, the paradigm of fault modeling is changing for emerging technologies such as STT-MRAM.

REFERENCES

[1] Y. Chen et al., “Recent technology advances of emerging memories,”
International Test Conference (ITC), 2004, pp. 124–133.

mally assisted switching MRAMs,” IEEE Transactions on Very Large

International Test Conference (ITC), 2006.


