ASR005. Hierarchical LMM with nested factors and heterogeneous variances - Silicon wafers
The complete script for this example can be downloaded here:
Dataset
The model uses the D003 dataset; the first few rows are presented below:
source | lot | wafer | site | thick |
---|---|---|---|---|
1 | 1 | 1 | 1 | 2006 |
1 | 1 | 2 | 2 | 1988 |
1 | 1 | 3 | 3 | 2007 |
1 | 2 | 2 | 1 | 1987 |
1 | 2 | 3 | 2 | 1983 |
1 | 3 | 1 | 3 | 2004 |
Model
The model that we will fit in this example is an extension of the hierarchical LMM with nested random factors presented here: ASR004. In this case, we extend that model to allow heterogeneous variances: the variance of lot is allowed to differ between the levels of source. The model of interest is:
\[ y = \mu + source + lot + lot:wafer + e \]
where,
\(y\) is the thickness of the oxide layer on silicon wafers,
\(\mu\) is the population mean,
\(source\) is the fixed effect of source,
\(lot\) is the random effect of lot, with \(lot \sim \mathcal{N}(0,\,\sigma^{2}_{l_s})\), where a separate variance is estimated for each level of source, \(s=\{1,2\}\),
\(lot:wafer\) is the random effect of wafer within lot, with \(lot:wafer \sim \mathcal{N}(0,\,\sigma^{2}_{w})\),
\(e\) is the random residual effect, with \(e \sim \mathcal{N}(0,\,\sigma^{2}_{e})\).
Now, let’s take a look at how to write the model with ASReml-R. Note that before fitting the model, source, lot, and wafer need to be set as factors.
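One possible way to do this conversion with base R is sketched below. The small data frame is only a stand-in built from the six rows shown in the Dataset section, so that the snippet runs on its own; with the real data, d003 would be read from file instead. The helper name factor_cols is introduced here for illustration.

```r
# Stand-in for d003: only the six rows shown in the Dataset section.
d003 <- data.frame(
  source = c(1, 1, 1, 1, 1, 1),
  lot    = c(1, 1, 1, 2, 2, 3),
  wafer  = c(1, 2, 3, 2, 3, 1),
  site   = c(1, 2, 3, 1, 2, 3),
  thick  = c(2006, 1988, 2007, 1987, 1983, 2004)
)

# Convert the classification variables to factors; thick stays numeric.
factor_cols <- c("source", "lot", "wafer")
d003[factor_cols] <- lapply(d003[factor_cols], factor)

# Inspect the column types to confirm the conversion.
str(d003)
```

Converting all three columns in one step keeps the list of model factors in a single place, which makes it harder to forget one of them before the model is fitted.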
asr005 <- asreml(
  fixed = thick ~ source,
  random = ~at(source):lot + lot:wafer,
  residual = ~units,
  data = d003
)
The at() function is used for separating fixed or random terms into conditional subsets. In this case, the variance component of the conditional factor (e.g. lot) will be estimated for each level of the conditioning factor (i.e. source). The conditional factor must be a factor, assumed either fixed or random.
Note that the levels used by at() can be indicated directly, e.g. at(source, c(1, 2)), where \(1\) and \(2\) are the labels of the levels of the factor source. If this conditional term is part of the random model, then an equivalent alternative is to define a complex variance structure. For example, diag(source):lot assumes a diagonal variance structure for lot, in which a different variance is estimated for each level of source.
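As a sketch of what this heterogeneous structure implies, the variance of the lot effects can be written out explicitly. The notation is introduced here for illustration and is not part of the original model statement: \(lot_{sj}\) is the effect of the \(j\)-th lot from source \(s\), \(n_s\) is the number of lots from source \(s\), and \(\mathbf{u}_{lot}\) stacks the lot effects ordered by source.

\[ \mathrm{Var}(lot_{sj}) = \sigma^{2}_{l_s}, \qquad \mathrm{Var}(\mathbf{u}_{lot}) = \begin{bmatrix} \sigma^{2}_{l_1}\mathbf{I}_{n_1} & \mathbf{0} \\ \mathbf{0} & \sigma^{2}_{l_2}\mathbf{I}_{n_2} \end{bmatrix} \]

Setting \(\sigma^{2}_{l_1} = \sigma^{2}_{l_2}\) recovers a single lot variance, that is, the homogeneous model.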
Exploring outputs
The statistical significance of the fixed effect of source can be evaluated with:
wald(asr005, denDF = 'numeric')$Wald
            Df denDF     F.inc           Pr
(Intercept)  1   3.8 5.914e+05 6.516776e-11
source       1   3.8 1.526e+00 2.882553e-01
And the predicted means, based on the above model, are:
predict(asr005, classify = 'source')$pvals
  source predicted.value std.error    status
1      1        1995.111  2.758078 Estimable
2      2        2005.194  7.682133 Estimable
The variance components estimated from this model are:
summary(asr005)$varcomp
                    component  std.error   z.ratio bound %ch
at(source, '1'):lot  17.07636  25.289347 0.6752393     P   0
at(source, '2'):lot 222.71180 192.714464 1.1556569     P   0
lot:wafer            35.86622  14.187862 2.5279509     P   0
units!R              12.56961   2.565778 4.8989468     P   0
Notice that the estimated variance of lot differs considerably between the two levels of source (17.08 versus 222.71), although both estimates have large standard errors. The significance of this difference can be verified by comparing the residual log-likelihood of the current model with that of the equivalent model ASR004, which does not assume heterogeneous variances.
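A minimal sketch of such a comparison, written in base R, is given below. It assumes that both models share the same fixed effects, so that their residual log-likelihoods are comparable, and that they differ by one variance parameter (two lot variances here versus one in ASR004), giving one degree of freedom. The function name reml_lrt is hypothetical, and the log-likelihood values are placeholders chosen for illustration, not results from the wafer data.

```r
# Likelihood ratio test for two nested REML fits with the same fixed effects.
# loglik_full:    residual log-likelihood of the larger model (e.g. ASR005)
# loglik_reduced: residual log-likelihood of the smaller model (e.g. ASR004)
# df:             difference in the number of variance parameters
reml_lrt <- function(loglik_full, loglik_reduced, df) {
  statistic <- 2 * (loglik_full - loglik_reduced)
  p_value <- pchisq(statistic, df = df, lower.tail = FALSE)
  c(statistic = statistic, df = df, p.value = p_value)
}

# Placeholder log-likelihoods, NOT values obtained from the wafer data.
result <- reml_lrt(loglik_full = -100, loglik_reduced = -102, df = 1)
print(result)
```

With fitted models, the two log-likelihoods would be taken from the fitted objects. In ASReml-R this is usually the loglik element of the fit (for example asr005$loglik), but that should be checked against the installed version.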