ASR004. Hierarchical LMM with nested factors - Silicon wafers

The complete script for this example can be downloaded here:

Dataset

In this example we will use the D003 dataset; the first few rows are presented below:

source lot wafer site thick
     1   1     1    1  2006
     1   1     2    2  1988
     1   1     3    3  2007
     1   2     2    1  1987
     1   2     3    2  1983
     1   3     1    3  2004


Model

The hierarchical model with nested factors that we will fit in this example is:

\[ y = \mu + source + lot + lot:wafer + e \] where,

    \(y\) is the thickness of oxide layer on silicon wafers,

    \(\mu\) is the overall mean,

    \(source\) is the fixed effect of source,

    \(lot\) is the random effect of lot, with \(lot \sim \mathcal{N}(0,\,\sigma^{2}_{l})\),

    \(lot:wafer\) is the random effect of wafer within lot, with \(lot:wafer \sim \mathcal{N}(0,\,\sigma^{2}_{w})\),

    \(e\) is the random residual effect, with \(e \sim \mathcal{N}(0,\,\sigma^{2}_{e})\).


Now, let’s take a look at how to write the model with ASReml-R. Note that before fitting the model, source, lot, and wafer need to be set as factors.
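
The factor conversion mentioned above could look like the following sketch. It is only an illustration: the small data frame is a stand-in built from the six rows shown in the Dataset section, and the helper name factor_cols is hypothetical. With the real data, d003 would already hold the full dataset.

```r
# Illustrative stand-in for d003: only the six rows shown in the Dataset section.
# With the real data, d003 would be the full D003 dataset, already loaded.
d003 <- data.frame(
  source = c(1, 1, 1, 1, 1, 1),
  lot    = c(1, 1, 1, 2, 2, 3),
  wafer  = c(1, 2, 3, 2, 3, 1),
  site   = c(1, 2, 3, 1, 2, 3),
  thick  = c(2006, 1988, 2007, 1987, 1983, 2004)
)

# Convert the classification variables to factors; thick stays numeric.
factor_cols <- c("source", "lot", "wafer")  # hypothetical helper name
d003[factor_cols] <- lapply(d003[factor_cols], factor)

# Inspect the column types to confirm the conversion.
str(d003)
```

Once these columns are factors, the model can be specified as follows: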

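library(asreml)

asr004 <- asreml(
  fixed = thick ~ source,       # fixed effect of source
  random = ~lot + lot:wafer,    # lot, and wafer nested within lot
  residual = ~units,            # residual variance for the individual measurements
  data = d003
)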

This is a hierarchical model because the data have a nested structure: lots, then wafers nested within lots, and finally samples (the individual thickness measurements) nested within wafers within lots. There are therefore three layers, each described by its own variance component. In the model above, lot:wafer denotes wafers nested within lot; the lowest layer is not specified explicitly because it is described by the residual variance (MSE).
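
To make the three layers concrete, the model implies the following variance and covariances for two thickness measurements \(y\) and \(y'\). This is a sketch that assumes the random effects and residuals are mutually independent, which is the usual assumption for this kind of model but is not stated explicitly above:

\[ \mathrm{Var}(y) = \sigma^{2}_{l} + \sigma^{2}_{w} + \sigma^{2}_{e} \]

\[ \mathrm{Cov}(y, y') = \begin{cases} \sigma^{2}_{l} + \sigma^{2}_{w} & \text{same wafer (within the same lot)}, \\ \sigma^{2}_{l} & \text{same lot, different wafers}, \\ 0 & \text{different lots}. \end{cases} \]

Under these assumptions, measurements from the same wafer are the most alike, measurements from different wafers of the same lot share only the lot effect, and measurements from different lots are uncorrelated.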


Exploring outputs

The statistical significance of fixed effects can be tested as:

wald(asr004, denDF='numeric')$Wald
            Df denDF     F.inc           Pr
(Intercept)  1     6 2.402e+05 4.884981e-15
source       1     6 1.526e+00 2.628690e-01
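
As a rough check on how the columns of this table relate, the p-value for source can be reproduced with base R's pf(), assuming Pr is the upper-tail probability of an F distribution with Df and denDF degrees of freedom. The sketch uses the rounded F.inc printed above, so it will match Pr only approximately; the variable names are illustrative.

```r
# Values copied from the 'source' row of the Wald table (F.inc is rounded).
f_inc  <- 1.526
df_num <- 1   # Df column
df_den <- 6   # denDF column

# Upper-tail probability of an F(df_num, df_den) distribution.
p_value <- pf(f_inc, df_num, df_den, lower.tail = FALSE)
p_value
```

The result should be close to the value of about 0.263 in the Pr column, which gives no strong evidence of a difference in mean thickness between the two sources.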


Let's obtain the predicted means for source, based on the above model, with:

predict(asr004, classify = 'source')$pvals
  source predicted.value std.error    status
1      1        1995.111  5.771576 Estimable
2      2        2005.194  5.771576 Estimable

The variance components estimated from this model are:

summary(asr004)$varcomp
          component std.error  z.ratio bound %ch
lot       119.89892 77.007089 1.556985     P 0.1
lot:wafer  35.86768 14.188727 2.527900     P 0.0
units!R    12.57012  2.565935 4.898847     P 0.0

It is clear from here that the main source of variability in this analysis is lot, followed by wafer within lot, and then sample (the residual). This might have implications for the interpretation and use of this information in future decisions.
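
One way to quantify this ranking is to express each variance component as a share of the total. The sketch below hard-codes the estimates printed above so that it runs on its own; with a fitted model, the same values are in the component column of summary(asr004)$varcomp. The names varcomp and pct are illustrative.

```r
# Variance components copied from the summary(asr004)$varcomp output above.
varcomp <- c("lot" = 119.89892, "lot:wafer" = 35.86768, "units!R" = 12.57012)

# Share of the total variance attributable to each layer, in percent.
pct <- 100 * varcomp / sum(varcomp)
round(pct, 1)
```

With these estimates, lot accounts for roughly 71% of the total variance, wafer within lot for about 21%, and the residual for about 7.5%.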


If we are interested in the random effects (BLUPs) of lot and wafer within lot, these can be obtained with:

BLUP <- summary(asr004, coef = TRUE)$coef.random
head(BLUP, 12)
                  solution std.error     z.ratio
lot_1           1.09967134  6.242120  0.17616953
lot_2          -6.59802807  6.242120 -1.05701720
lot_3           5.39838660  6.242120  0.86483226
lot_4           0.09997012  6.242120  0.01601541
lot_5           8.82236329  6.242120  1.41336012
lot_6          14.72060050  6.242120  2.35826944
lot_7         -12.67121299  6.242120 -2.02995349
lot_8         -10.87175079  6.242120 -1.74167607
lot_1:wafer_1   6.97446465  3.694863  1.88761132
lot_1:wafer_2 -11.53046673  3.694863 -3.12067529
lot_1:wafer_3   4.88519821  3.694863  1.32215960
lot_2:wafer_1   1.03291867  3.694863  0.27955536
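
Because the lot and wafer-within-lot effects are stacked in one table, it can be convenient to separate them by row name. The sketch below is an assumption-laden illustration: it builds a small stand-in matrix from a few of the rows printed above (values rounded) and assumes the real BLUP object carries the same row names and supports matrix-style indexing. The names is_wafer, lot_blups, and wafer_blups are hypothetical.

```r
# Stand-in for a few rows of BLUP, copied (rounded) from the output above.
# The real object comes from summary(asr004, coef = TRUE)$coef.random.
BLUP <- matrix(
  c(  1.0997, 6.2421,
     -6.5980, 6.2421,
      6.9745, 3.6949,
    -11.5305, 3.6949),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    c("lot_1", "lot_2", "lot_1:wafer_1", "lot_1:wafer_2"),
    c("solution", "std.error")
  )
)

# Row names containing ":" are wafer-within-lot effects; the rest are lot effects.
is_wafer    <- grepl(":", rownames(BLUP), fixed = TRUE)
lot_blups   <- BLUP[!is_wafer, , drop = FALSE]
wafer_blups <- BLUP[is_wafer, , drop = FALSE]

lot_blups
wafer_blups
```

Splitting on the ":" separator keeps the example independent of how many lots or wafers the dataset contains.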


More information:

Note that an extension of this model, using a heterogeneous variance structure, is available here: ASR005