OptiDesign: Optimized Experimental Field Design for Plant Breeding
Source:R/OptiDesign-package.R
OptiDesign-package.RdOptiDesign provides tools for constructing, evaluating, and optimizing
experimental field designs for plant breeding and related agricultural
applications. The package integrates field structure, treatment grouping,
genetic relationship information, spatial modelling, and mixed-model
efficiency evaluation directly into the design stage.
Details
Design constructors
The package provides two main design construction functions:
prep_famoptg() constructs repeated-check block designs with flexible
replication, supporting three design classes: augmented repeated-check
designs (checks repeated, all entries unreplicated), partially replicated
(p-rep) repeated-check designs (mixture of replicated and unreplicated
entries), and RCBD-type repeated-check designs (all entries equally
replicated). The p-rep constraint - that no replicated entry appears twice
in the same block - is enforced by construction. Supports optional grouping
from family labels or relationship matrices and optional genetic dispersion
optimisation. Construction only; evaluation is performed separately by
evaluate_famoptg_efficiency().
alpha_rc_stream() constructs fixed-grid alpha row-column stream
designs where the field is treated as a global traversal stream. Replicates
are contiguous stream segments, incomplete blocks may be unequal in size,
checks are included in every block, and unused cells appear only at the end
of the stream. Construction only; evaluation is performed separately by
evaluate_alpha_efficiency().
Design evaluation
evaluate_famoptg_efficiency() evaluates the statistical efficiency of
a design produced by prep_famoptg(). The mixed model contains Block + Row
Column random effects (no replicate or incomplete-block nesting). Variance component
sigma_b2controls block variance. Supported criteria:
A-criterion: mean pairwise contrast variance (fixed) or mean PEV (random). Lower is better.
D-criterion: geometric mean of contrast covariance eigenvalues (fixed only). Lower is better.
CDmean: mean coefficient of determination for GEBV prediction (Rincent et al. 2012). Higher is better. Requires random treatment effects and a genomic prediction model.
evaluate_alpha_efficiency() evaluates the statistical efficiency of
a design produced by alpha_rc_stream(). The mixed model contains
Rep + IBlock(Rep) + Row + Column random effects. Variance components
sigma_rep2 and sigma_ib2 control replicate and incomplete-block
variance respectively. Supports the same A, D, and CDmean criteria as
evaluate_famoptg_efficiency().
Both evaluation functions are fully decoupled from construction so the same field book can be evaluated multiple times under different model assumptions without rebuilding the layout.
Design optimisation
optimize_famoptg() wraps prep_famoptg() and
evaluate_famoptg_efficiency() in a Random Restart (RS) optimisation
loop. RS is used exclusively because the p-rep constraint (no replicated
treatment in the same block twice) is enforced by construction at every
prep_famoptg() call - permutation-based methods such as SA and GA would
require block-aware swap logic to preserve this constraint. Every candidate
is a valid design with all constraints satisfied. Supports A, D, both, and
CDmean criteria.
optimize_alpha_rc() wraps alpha_rc_stream() and
evaluate_alpha_efficiency() in an optimisation loop with three search
strategies:
RS (Random Restart): generate many independent designs, return the best by criterion. Simple and fully safe by construction.
SA (Simulated Annealing): iteratively propose entry permutation swaps, accepting improvements always and degradations with a temperature-governed probability. Repeated across multiple restarts.
GA (Genetic Algorithm): maintain a population of entry permutations and evolve it using Order Crossover (OX1), swap mutation, and elitism.
All methods in both optimisers preserve structural constraints by construction. Both implement a four-point integrity checking strategy: post-construction validation, re-check before storing as best, final check before return, and emergency fallback if no valid design is found.
Internal helpers
alpha_rc_helpers.R provides internal functions shared across all six
exported functions: sparse incidence matrix construction, AR1 precision
matrix generation, the mixed model solver, the Hutchinson stochastic trace
estimator, Chebyshev neighbourhood enumeration, and the dispersion scoring
function. These functions are prefixed with . and are not exported.
Function overview
| Function | Role | Design family |
prep_famoptg() | Construction | Repeated-check block |
evaluate_famoptg_efficiency() | Evaluation | Repeated-check block |
optimize_famoptg() | Optimisation (RS) | Repeated-check block |
alpha_rc_stream() | Construction | Alpha row-column stream |
evaluate_alpha_efficiency() | Evaluation | Alpha row-column stream |
optimize_alpha_rc() | Optimisation (RS/SA/GA) | Alpha row-column stream |
Typical workflow - repeated-check block design
# 1. Construct
design <- prep_famoptg(
check_treatments = checks,
check_families = check_fam,
p_rep_treatments = prep_trts,
p_rep_reps = rep(2L, length(prep_trts)),
p_rep_families = prep_fam,
unreplicated_treatments = unrep_trts,
unreplicated_families = unrep_fam,
n_blocks = 5, n_rows = 15, n_cols = 20
)
# 2. Evaluate
eff <- evaluate_famoptg_efficiency(
field_book = design$field_book,
n_rows = 15,
n_cols = 20,
check_treatments = checks,
treatment_effect = "fixed",
residual_structure = "AR1xAR1",
rho_row = 0.10,
rho_col = 0.10
)
# 3. Or optimise directly
opt <- optimize_famoptg(
check_treatments = checks,
check_families = check_fam,
p_rep_treatments = prep_trts,
p_rep_reps = rep(2L, length(prep_trts)),
p_rep_families = prep_fam,
unreplicated_treatments = unrep_trts,
unreplicated_families = unrep_fam,
n_blocks = 5,
n_rows = 15,
n_cols = 20,
treatment_effect = "fixed",
residual_structure = "AR1xAR1",
rho_row = 0.10,
rho_col = 0.10,
criterion = "A",
n_restarts = 50
)Typical workflow - alpha row-column stream design
# 1. Construct
design <- alpha_rc_stream(
check_treatments = checks,
check_families = check_fam,
entry_treatments = entries,
entry_families = entry_fam,
n_reps = 3,
n_rows = 30,
n_cols = 20,
min_block_size = 19,
max_block_size = 20
)
# 2. Evaluate
eff <- evaluate_alpha_efficiency(
field_book = design$field_book,
n_rows = 30,
n_cols = 20,
check_treatments = checks,
treatment_effect = "fixed",
residual_structure = "AR1xAR1",
rho_row = 0.10,
rho_col = 0.10
)
# 3. Or optimise directly
opt <- optimize_alpha_rc(
check_treatments = checks,
check_families = check_fam,
entry_treatments = entries,
entry_families = entry_fam,
n_reps = 3,
n_rows = 30,
n_cols = 20,
min_block_size = 19,
max_block_size = 20,
treatment_effect = "fixed",
residual_structure = "AR1xAR1",
rho_row = 0.10,
rho_col = 0.10,
method = "RS",
criterion = "A",
n_restarts = 50
)Key differences between the two design families
| Feature | prep_famoptg family | alpha_rc_stream family |
| Blocking structure | Flat blocks | Replicates -> incomplete blocks |
| Replication | Flexible per-entry | Uniform across entries |
| Design types | Augmented, p-rep, RCBD-type | Alpha-lattice |
| Block variance component | sigma_b2 | sigma_rep2 + sigma_ib2 |
| Optimisation methods | RS only | RS, SA, GA |
| P-rep constraint enforced | Yes | Not applicable |
Relationship matrices and grouping
Both design families accept one of three grouping sources for adjacency scoring and clustering within blocks:
Direct family labels (
cluster_source = "Family")A genomic relationship matrix (
cluster_source = "GRM", argumentGRM)A pedigree relationship matrix (
cluster_source = "A", argumentA)
A separate relationship matrix K can be provided independently for:
GBLUP/PBLUP prediction efficiency (
prediction_type = "GBLUP")CDmean computation for genomic selection training population optimisation
Spatial dispersion optimisation (
dispersion_source = "K")
When treatment names do not match matrix rownames, mapping tables id_map
and line_id_map reconcile identifiers.
Included example dataset
The package ships with OptiDesign_example_data, containing synthetic
objects for treatment and family lists, example relationship matrices, and
example argument lists for both design families. Load with:
data("OptiDesign_example_data", package = "OptiDesign")Dependencies
Matrix: sparse matrix operations used throughout efficiency evaluation, AR1 precision matrix construction, and the mixed model coefficient matrix. Required by all six exported functions.
pracma:
mod()is imported for serpentine parity handling inprep_famoptg()(alternating row/column traversal direction).
References
Rincent, R., Laloe, D., Nicolas, S., Altmann, T., Brunel, D., Revilla, P., ... & Moreau, L. (2012). Maximizing the reliability of genomic selection by optimizing the calibration set of reference individuals: comparison of methods in two diverse groups of maize inbreds. Genetics, 192(2), 715-728.
Jones, B., Allen-Moyer, K., & Goos, P. (2021). A-optimal versus D-optimal design of screening experiments. Journal of Quality Technology, 53(4), 369-382.
Hutchinson, M.F. (1990). A stochastic estimator of the trace of the influence matrix for Laplacian smoothing splines. Communications in Statistics - Simulation and Computation, 19(2), 433-450.
Kirkpatrick, S., Gelatt, C. D., & Vecchi, M. P. (1983). Optimization by simulated annealing. Science, 220(4598), 671-680.
Holland, J. H. (1992). Adaptation in Natural and Artificial Systems. MIT Press.
Author
Maintainer: Félicien Akohoue akohoue.f@gmail.com (ORCID)