<- ainverse(d021ped) ainv
ASR018. LMM for a multi-environmental trial - Pine clones
The complete script for this example can be downloaded here:
Dataset
In this example, we will use the D021 and the D021ped datasets, which correspond to phenotypic and pedigree data of pine clones from a multi-environmental trial. A few lines of each data frame are shown below:
clone | location | block | height_8 | dia_8 |
---|---|---|---|---|
18 | Hill | 2 | 35.9 | 7.00 |
18 | Hill | 4 | 38.3 | 6.60 |
18 | Hill | 6 | 27.7 | 5.85 |
18 | Hill | 8 | 37.0 | 6.40 |
18 | Orchard | 2 | 35.7 | 5.40 |
18 | Orchard | 3 | 37.1 | 6.05 |
clone | dam | sire |
---|---|---|
1 | 0 | 0 |
2 | 0 | 0 |
3 | 0 | 0 |
4 | 0 | 0 |
5 | 0 | 0 |
6 | 0 | 0 |
The example asr017 illustrates the procedure to investigate each site separately, something that is required before fitting and multi-environmental trial (MET) analysis.
Model
In this example we will fit an MET model. The specification of this model is: \[ y = \mu + location + location:block + clone + location:clone + e\ \]
where,\(y\) is the pine clones’ height at age 8,
\(\mu\) is the overall mean,
\(location\) is the fixed effect of site
\(location:block\) is the nested random effect of block within site, with \(block \sim \mathcal{N}(0,\,\sigma^{2}_{bi})\),
\(clone\) is the random effect of clone, with \(clone \sim \mathcal{N}(0,\,\sigma^{2}_{c})\)
\(location:clone\) is the random effect of the interaction of location with clone, with \(location:clone \sim \mathcal{N}(0,\,\sigma^{2}_{lc})\),
\(e\) is the random residual effect, with \(e \sim \mathcal{N}(0,\,\sigma^{2}_{ei})\).
The factors \(clone\) and \(location:clone\) are associated with the numerator relationship matrix (NRM) \(\mathbf{A}\). The letter \(i\) in \(\sigma^{2}_{bi}\) and \(\sigma^{2}_{ei}\) represent a different block effect and a different residual variance component for each location, respectively.
Now, let’s take a look at how to write the model with ASReml-R. Note that before fitting the model, location
, gen
, and clone
need to be set as factors and the inverse of the NRM needs to be created using the pedigree information (see ASR013 for more details).
<- d021[order(d021$location),]
d021 <- asreml(
asr018 fixed = height_8 ~ location,
random = ~at(location):block + vm(clone, ainv) + idv(location):vm(clone, ainv),
residual = ~dsum(~units|location),
data = d021
)
Note that in the above model we are using at(location):block
to allow for the estimation of a different block variance for each of the locations. In addition, the use of dsum()
is required to request a different error variance for each of the locations. Furthermore, the above model is including the inverse of the NRM by the use of the expression vm()
that stands for variance model and that links a factor to the inverse matrix.
Exploring output
The variance components are:
summary(asr018)$varcomp
component std.error z.ratio bound %ch
at(location, 'Hill'):block 1.269777e+00 0.7465103 1.700951 P 0.0
at(location, 'Orchard'):block 1.212249e+00 0.7138307 1.698230 P 0.0
at(location, 'Rail'):block 2.237795e+00 1.9372264 1.155154 P 0.0
vm(clone, ainv) 5.739307e+00 0.9317013 6.160030 P 0.1
location:vm(clone, ainv)!location 4.536677e-07 NA NA B NA
location_Hill!R 1.227888e+01 0.6556501 18.727798 P 0.0
location_Orchard!R 1.368415e+01 0.6752724 20.264636 P 0.0
location_Rail!R 1.312105e+01 0.9797690 13.391987 P 0.0
Note that there are differences on the variance components between the locations (particularly for the residual term). In addition, the term id(location):vm(clone, ainv)
is almost zero, and therefore for this model and with this dataset we do not find any genotype-by-environment.
The heritability can be calculated by site, but in the following expression this is done across all the sites, as follow:
vpredict(asr018, h2 ~ V4/((V1+V2+V3)/3+V4+V5+(V6+V7+V8)/3))
Estimate SE
h2 0.28216 0.03533191
These results indicate a heritability estimate of 0.28 which signifies that the studied trait is under important genetic control. The small standard error of 0.035 associated with the heritability estimate is indicating reliable results.
The statistical significance of fixed effects can be evaluated such as:
wald(asr018, denDF = 'numeric')$Wald
Df denDF F.inc Pr
(Intercept) 1 91.3 1958.000 0.0000000
location 2 7.0 1.622 0.2633317
According to these results, the fixed effects of location are not significantly different.
The random effects are:
<- as.data.frame(summary(asr018, coef = TRUE)$coef.random)
BLUP head(BLUP)
solution std.error z.ratio
at(location, 'Hill'):block_1 1.1324195 0.5239748 2.1612097
at(location, 'Hill'):block_2 0.2533747 0.5135070 0.4934202
at(location, 'Hill'):block_3 -0.2799310 0.5125597 -0.5461431
at(location, 'Hill'):block_4 1.0394901 0.5101069 2.0377888
at(location, 'Hill'):block_5 0.9094417 0.5084001 1.7888309
at(location, 'Hill'):block_6 -0.6220001 0.5099794 -1.2196575
tail(BLUP)
solution std.error z.ratio
location_Rail:vm(clone, ainv)_148 -4.283946e-08 0.0006735484 -6.360264e-05
location_Rail:vm(clone, ainv)_149 -4.746008e-08 0.0006735484 -7.046276e-05
location_Rail:vm(clone, ainv)_150 -1.665287e-07 0.0006735484 -2.472408e-04
location_Rail:vm(clone, ainv)_151 -7.264737e-08 0.0006735484 -1.078577e-04
location_Rail:vm(clone, ainv)_152 -1.338444e-07 0.0006735484 -1.987154e-04
location_Rail:vm(clone, ainv)_153 4.880225e-08 0.0006735484 7.245544e-05
Here, we can see that there are a large number of solutions, but the ones of more interest are the ones with the prefix vm(clone, ainv)_
, which correspond to the breeding values.