# ASR005. Hierarchical LMM with nested factors and heterogeneous variances - Silicon wafers

The complete script for this example can be downloaded here:

### Dataset

The model uses the D003 dataset, and the first few rows are presented below:

source | lot | wafer | site | thick |
---|---|---|---|---|
1 | 1 | 1 | 1 | 2006 |
1 | 1 | 2 | 2 | 1988 |
1 | 1 | 3 | 3 | 2007 |
1 | 2 | 2 | 1 | 1987 |
1 | 2 | 3 | 2 | 1983 |
1 | 3 | 1 | 3 | 2004 |

### Model

The model that we will fit in this example is an extension of the hierarchical LMM with nested random factors presented here: ASR004. In this case, we will extend this model to allow for some form of heterogeneous variances. The model of interest is:

\[ y = \mu + source + lot + lot:wafer + e \]

where,

\(y\) is the thickness of the oxide layer on silicon wafers,

\(\mu\) is the population mean,

\(source\) is the fixed effect of source,

\(lot\) is the random effect of lot, with \(lot \sim \mathcal{N}(0,\,\sigma^{2}_{l_s})\), where a separate variance is estimated for each level of source, \(s=\{1,2\}\),

\(lot:wafer\) is the random effect of wafer within lot, with \(lot:wafer \sim \mathcal{N}(0,\,\sigma^{2}_{w})\),

\(e\) is the random residual effect, with \(e \sim \mathcal{N}(0,\,\sigma^{2}_{e})\).

Now, let’s take a look at how to write the model with ASReml-R. Note that before fitting the model, `source`, `lot`, and `wafer` need to be set as factors.
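The factor conversion mentioned above can be sketched in base R as follows. The data frame `d003_head` is a hypothetical stand-in built only from the six preview rows of the dataset shown in this example, not the full D003 data. With the real data, the same `lapply()` call would be applied to `d003`.

```r
# Toy stand-in for D003: only the six preview rows (assumption, not the full data)
d003_head <- data.frame(
  source = c(1, 1, 1, 1, 1, 1),
  lot    = c(1, 1, 1, 2, 2, 3),
  wafer  = c(1, 2, 3, 2, 3, 1),
  site   = c(1, 2, 3, 1, 2, 3),
  thick  = c(2006, 1988, 2007, 1987, 1983, 2004)
)

# Convert the classification variables to factors before model fitting
factor_cols <- c("source", "lot", "wafer")
d003_head[factor_cols] <- lapply(d003_head[factor_cols], factor)

# Check the resulting column types
str(d003_head)
```

Leaving these columns numeric would make the model treat them as covariates rather than as classification variables, so it is worth checking the structure before fitting.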

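```
# Fit the LMM with a separate lot variance for each level of source
asr005 <- asreml(
  fixed = thick ~ source,
  random = ~at(source):lot + lot:wafer,  # heterogeneous lot variances
  residual = ~units,                     # single residual variance
  data = d003
)
```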

The `at()` function is used for separating fixed or random terms into conditional subsets. In this case, the variance component of the conditional factor (*e.g.* `lot`) will be estimated for each level of the conditioning factor (*i.e.* `source`). The conditional factor must be a factor assumed fixed or random.

Note that the levels used by `at()` can be specified directly with `at(source, c(1, 2))`, where in this case \(1\) and \(2\) are the labels of the levels of the factor `source`. If this conditional term is part of the random model, then an equivalent alternative is to define a more complex variance structure. For example, `diag(source):lot` assumes a diagonal variance structure for lot, so that a different variance is estimated for each level of `source`.
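As a sketch of what this diagonal structure implies, the variance of the lot effects can be written as a direct product. The symbols \(u_{lot}\) (the vector of lot effects, ordered by source), \(I_{n_l}\) (an identity matrix) and \(n_l\) (the number of levels of lot) are notation introduced here for illustration and do not appear elsewhere in this example:

\[ \operatorname{var}(u_{lot}) = \begin{bmatrix} \sigma^{2}_{l_1} & 0 \\ 0 & \sigma^{2}_{l_2} \end{bmatrix} \otimes I_{n_l} \]

The zero off-diagonal elements express that lot effects from different sources are assumed uncorrelated, while \(\sigma^{2}_{l_1}\) and \(\sigma^{2}_{l_2}\) are the two source-specific lot variances from the model definition.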

### Exploring outputs

The statistical significance of the fixed effect of `source` can be evaluated with:

`wald(asr005, denDF = 'numeric')$Wald`

```
            Df denDF     F.inc           Pr
(Intercept)  1   3.8 5.914e+05 6.516776e-11
source       1   3.8 1.526e+00 2.882553e-01
```
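To make the `Pr` column easier to interpret, the p-value for `source` can be recomputed by hand as the upper-tail probability of an F distribution. This is a minimal base R sketch using the rounded values printed in the table above, so the result should be close to, but not exactly equal to, the reported value. The variable names are illustrative only.

```r
# Values copied from the Wald table above (rounded as printed)
f_inc  <- 1.526  # incremental F statistic for source
num_df <- 1      # numerator degrees of freedom (Df)
den_df <- 3.8    # approximate denominator degrees of freedom (denDF)

# Upper-tail probability of the F distribution
p_source <- pf(f_inc, df1 = num_df, df2 = den_df, lower.tail = FALSE)
print(p_source)
```

With a p-value of roughly 0.29, there is no evidence at the usual 5% level that mean oxide thickness differs between the two sources.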

And the predicted means, based on the above model, are:

`predict(asr005, classify = 'source')$pvals`

```
  source predicted.value std.error    status
1      1        1995.111  2.758078 Estimable
2      2        2005.194  7.682133 Estimable
```

The variance components estimated from this model are:

`summary(asr005)$varcomp`

```
                    component  std.error   z.ratio bound %ch
at(source, '1'):lot  17.07636  25.289347 0.6752393     P   0
at(source, '2'):lot 222.71180 192.714464 1.1556569     P   0
lot:wafer            35.86622  14.187862 2.5279509     P   0
units!R              12.56961   2.565778 4.8989468     P   0
```
```
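As a quick check on how to read this table, the `z.ratio` column is each component divided by its standard error. The base R sketch below rebuilds that column from the printed estimates and also computes the ratio of the two lot variances. The data frame `varcomp` and the variable `lot_variance_ratio` are names chosen here for illustration.

```r
# Estimates and standard errors copied from the variance component table above
varcomp <- data.frame(
  component = c(17.07636, 222.71180, 35.86622, 12.56961),
  std.error = c(25.289347, 192.714464, 14.187862, 2.565778),
  row.names = c("at(source, '1'):lot", "at(source, '2'):lot",
                "lot:wafer", "units!R")
)

# z ratio: estimate divided by its standard error
varcomp$z.ratio <- varcomp$component / varcomp$std.error
print(varcomp)

# How much larger the lot variance is for source 2 than for source 1
lot_variance_ratio <- varcomp["at(source, '2'):lot", "component"] /
  varcomp["at(source, '1'):lot", "component"]
print(lot_variance_ratio)
```

Both lot variances have large standard errors relative to their estimates, so the z ratios alone are a weak guide to whether the two variances really differ.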

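Notice that the estimated variance of lot differs considerably between the levels of source (17.08 for source 1 and 222.71 for source 2), although both estimates have large standard errors. The significance of this difference can be assessed with a likelihood ratio test that compares the residual log-likelihood of the current model with that of the equivalent model ASR004, which does not assume heterogeneous variances.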
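The comparison described above can be sketched in base R as follows. The helper `reml_lrt()` is a hypothetical function written for this illustration, and the two log-likelihood values are placeholders, not results from these models. In practice they would be taken from the two fitted models. In ASReml-R the residual log-likelihood is, to the best of our knowledge, stored in the `loglik` component of the fitted object (for example `asr005$loglik`), but treat that as an assumption to verify. The test uses 1 degree of freedom because the heterogeneous model has two lot variances where the homogeneous model has one.

```r
# REML likelihood ratio test for two nested models with the same fixed effects
reml_lrt <- function(loglik_full, loglik_reduced, df) {
  statistic <- 2 * (loglik_full - loglik_reduced)
  p_value <- pchisq(statistic, df = df, lower.tail = FALSE)
  c(statistic = statistic, df = df, p.value = p_value)
}

# Placeholder log-likelihoods for illustration only (NOT results from these models)
loglik_heterogeneous <- -100.0  # would come from the model with at(source):lot
loglik_homogeneous   <- -102.0  # would come from the ASR004-style model with lot

result <- reml_lrt(loglik_heterogeneous, loglik_homogeneous, df = 1)
print(result)
```

This comparison is only valid because both models share the same fixed effects (`thick ~ source`); residual log-likelihoods from models with different fixed terms are not comparable.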