Introduction

Experimental design is a foundational component of plant breeding and agronomic research. The ability to accurately estimate genetic effects, compare treatments, and predict breeding values depends critically on how field trials are constructed.

Classical designs — randomised complete block designs (RCBD), alpha-lattice designs, and augmented designs — provide well-understood statistical properties but make assumptions that modern breeding programs increasingly cannot meet: regular block structures, balanced replication across all entries, and no prior knowledge of genetic relationships. Contemporary trials involve large numbers of candidates, limited field capacity, spatial heterogeneity, and rich genomic information that classical frameworks treat as external to the design process.

OptiDesign addresses these challenges through a unified framework that integrates:

  • flexible field layout construction for two design families
  • genetic structure from family labels, pedigree matrices, or genomic relationship matrices
  • optional spatial dispersion optimisation to reduce clustering of related entries
  • mixed-model efficiency evaluation under A, D, and CDmean optimality criteria
  • criterion-driven design search that returns the statistically best design across many randomisations

The package follows a single-responsibility architecture: construction, evaluation, and optimisation are separated into distinct functions that can be called independently or chained, making each step transparent and reproducible.


Package Architecture

OptiDesign provides six exported functions organised into two design families:

| Function | Role | Design family |
| --- | --- | --- |
| prep_famoptg() | Construction | Repeated-check block |
| evaluate_famoptg_efficiency() | Evaluation | Repeated-check block |
| optimize_famoptg() | Optimisation (RS) | Repeated-check block |
| alpha_rc_stream() | Construction | Alpha row-column stream |
| evaluate_alpha_efficiency() | Evaluation | Alpha row-column stream |
| optimize_alpha_rc() | Optimisation (RS / SA / GA) | Alpha row-column stream |

The two families differ in their blocking structure, replication model, and the depth of their optimisation capabilities:

| Feature | prep_famoptg family | alpha_rc_stream family |
| --- | --- | --- |
| Blocking | Flat blocks | Replicates → incomplete blocks |
| Replication | Flexible per-entry | Uniform across entries |
| Design types | Augmented, p-rep, RCBD-type | Alpha-lattice |
| Block variance | sigma_b2 | sigma_rep2 + sigma_ib2 |
| Optimisation | RS only | RS, SA, GA |
| P-rep constraint | Enforced by construction | Not applicable |

Statistical Framework

The mixed model

Both design families are evaluated under the same general mixed model:

$$y = X\beta + Zu + e$$

| Symbol | Description |
| --- | --- |
| $y$ | Vector of observed phenotypes |
| $X$ | Fixed effects design matrix |
| $\beta$ | Fixed effects (intercept, checks, entry effects when fixed) |
| $Z$ | Incidence matrix linking random effects to plots |
| $u$ | Random effects (blocks, rows, columns, entries when random) |
| $e$ | Residual vector |

Random effects and residuals are distributed as $u \sim N(0,\, G)$ and $e \sim N(0,\, R)$.

For the repeated-check block family:

$$G^{-1} = \text{blockdiag}(\sigma_b^{-2}I,\; \sigma_r^{-2}I,\; \sigma_c^{-2}I,\; \sigma_g^{-2}K^{-1})$$

For the alpha row-column stream family:

$$G^{-1} = \text{blockdiag}(\sigma_\text{rep}^{-2}I,\; \sigma_\text{ib}^{-2}I,\; \sigma_r^{-2}I,\; \sigma_c^{-2}I,\; \sigma_g^{-2}K^{-1})$$

Mixed model coefficient matrix

Efficiency criteria are derived from the mixed model coefficient matrix:

$$C = \begin{pmatrix} X^\top Q X & X^\top Q Z \\ Z^\top Q X & Z^\top Q Z + G^{-1} \end{pmatrix}$$

where $Q = R^{-1}$ is the residual precision matrix.
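As a minimal sketch of how the coefficient matrix is assembled, the following base R code builds $C$ for a toy layout. The layout, the variance components, and every object name are assumptions made for illustration; this is not OptiDesign's internal code.

```r
# Toy layout (assumed): 6 plots, 3 fixed treatments, 2 random blocks, IID residuals.
treatment <- factor(c("T1", "T2", "T3", "T2", "T3", "T1"))
block     <- factor(c(1, 1, 1, 2, 2, 2))

X <- model.matrix(~ 0 + treatment)   # 6 x 3 fixed-effect design (cell means)
Z <- model.matrix(~ 0 + block)       # 6 x 2 random-effect incidence matrix

sigma_e2 <- 1.0                      # residual variance (assumed)
sigma_b2 <- 0.5                      # block variance (assumed)

Q     <- diag(nrow(X)) / sigma_e2    # Q = R^{-1} for R = sigma_e2 * I
G_inv <- diag(ncol(Z)) / sigma_b2    # G^{-1} for G = sigma_b2 * I

# Assemble the four blocks of the coefficient matrix C
C <- rbind(
  cbind(t(X) %*% Q %*% X, t(X) %*% Q %*% Z),
  cbind(t(Z) %*% Q %*% X, t(Z) %*% Q %*% Z + G_inv)
)

# Treatment variance-covariance block of C^{-1}
V <- solve(C)[seq_len(ncol(X)), seq_len(ncol(X))]
```

The matrix `V` extracted at the end is the quantity the optimality criteria operate on when treatments are fixed.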

Residual structures

Three residual structures are supported. For an AR1 process of length $n$ with autocorrelation $\rho$, the precision matrix $Q_\text{AR1}$ is tridiagonal with interior diagonal entries $(1+\rho^2)/(1-\rho^2)$, edge diagonal entries $1/(1-\rho^2)$, and off-diagonal entries $-\rho/(1-\rho^2)$.

| Structure | Formula | Parameters |
| --- | --- | --- |
| IID | $R = \sigma_e^2 I$ | sigma_e2 |
| AR1 | $R^{-1} = \sigma_e^{-2}(Q_\text{AR1}(\rho_r) \otimes I_c)$ | rho_row |
| AR1×AR1 | $R^{-1} = \sigma_e^{-2}(Q_\text{AR1}(\rho_c) \otimes Q_\text{AR1}(\rho_r))$ | rho_row, rho_col |
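The tridiagonal form described above can be checked numerically. The sketch below builds the AR1 precision matrix directly and confirms it inverts the dense AR1 correlation matrix. The helper names `ar1_precision()` and `ar1_correlation()` are hypothetical, and the Kronecker ordering assumes plots are indexed with rows varying fastest within each column.

```r
# Tridiagonal precision matrix of a unit-variance AR1 process of length n.
ar1_precision <- function(n, rho) {
  stopifnot(n >= 2, abs(rho) < 1)
  Q <- diag(c(1, rep(1 + rho^2, n - 2), 1), n)        # edge and interior diagonals
  off <- cbind(seq_len(n - 1), seq_len(n - 1) + 1)    # (i, i + 1) index pairs
  Q[off] <- -rho
  Q[off[, 2:1, drop = FALSE]] <- -rho
  Q / (1 - rho^2)
}

# Dense AR1 correlation matrix, used here only to verify the inverse.
ar1_correlation <- function(n, rho) {
  rho^abs(outer(seq_len(n), seq_len(n), "-"))
}

n_rows <- 4; n_cols <- 3            # toy field dimensions (assumed)
rho_row <- 0.3; rho_col <- 0.2
sigma_e2 <- 1

Q_row <- ar1_precision(n_rows, rho_row)
max_err <- max(abs(Q_row %*% ar1_correlation(n_rows, rho_row) - diag(n_rows)))

# Separable AR1 x AR1 residual precision, built without inverting R
R_inv <- kronecker(ar1_precision(n_cols, rho_col), Q_row) / sigma_e2
```

Working with the precision matrix directly avoids forming and inverting the dense $R$, which matters as the number of plots grows.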

Optimality criteria

A-criterion (lower is better): the mean pairwise contrast variance under fixed treatment effects, or the mean prediction error variance (PEV) under random treatment effects. For $p$ treatments:

$$A_\text{criterion} = \frac{2}{p(p-1)} \sum_{i<j} \text{Var}(\hat{\tau}_i - \hat{\tau}_j)$$
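To make the A-criterion formula concrete, the sketch below evaluates it two ways for a small treatment covariance matrix: by averaging over all pairs, and through the equivalent trace identity $2\,\text{trace}(HV)/(p-1)$ with $H$ the centering matrix. The matrix `V` holds assumed illustrative values, not package output.

```r
# Illustrative treatment variance-covariance matrix (values are assumed)
V <- matrix(c(0.75, 0.25, 0.25,
              0.25, 0.75, 0.25,
              0.25, 0.25, 0.75), nrow = 3, byrow = TRUE)
p <- nrow(V)

# Direct form: average Var(tau_i - tau_j) = V_ii + V_jj - 2 V_ij over pairs i < j
pairs    <- combn(p, 2)
pair_var <- diag(V)[pairs[1, ]] + diag(V)[pairs[2, ]] - 2 * V[t(pairs)]
A_direct <- mean(pair_var)

# Equivalent closed form using the centering matrix H = I - J / p
H       <- diag(p) - matrix(1 / p, p, p)
A_trace <- 2 * sum(diag(H %*% V)) / (p - 1)

c(A_direct = A_direct, A_trace = A_trace)   # both equal 1 for this V
```

The trace form avoids the loop over pairs, which is useful when $p$ is large.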

D-criterion (lower is better): the geometric mean of the nonzero eigenvalues of the contrast covariance matrix (fixed effects only).

$$D_\text{criterion} = \exp\!\left(\frac{\log\det(HVH)}{p-1}\right)$$

where $H = I_p - p^{-1}J_p$ is the centering matrix, $J_p$ is the $p \times p$ matrix of ones, and $V$ is the treatment variance-covariance submatrix of $C^{-1}$. Because $HVH$ has rank at most $p-1$, the determinant has to be read as the product of its $p-1$ nonzero eigenvalues.
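One way to evaluate the D-criterion in practice is through the eigenvalues of $HVH$, discarding the structural zero that the centering introduces. The function below is a sketch under that reading; `d_criterion()` and its tolerance argument are hypothetical names, and the package may compute the quantity differently.

```r
# D-criterion as the geometric mean of the p - 1 nonzero eigenvalues of H V H.
d_criterion <- function(V, tol = 1e-10) {
  p  <- nrow(V)
  H  <- diag(p) - matrix(1 / p, p, p)                 # centering matrix
  ev <- eigen(H %*% V %*% H, symmetric = TRUE, only.values = TRUE)$values
  ev <- ev[ev > tol * max(ev)]                        # drop the structural zero
  stopifnot(length(ev) == p - 1)
  exp(sum(log(ev)) / (p - 1))
}

# Illustrative V (assumed): 0.75 on the diagonal, 0.25 elsewhere
V     <- 0.5 * diag(3) + 0.25
d_val <- d_criterion(V)   # 0.5, since H V H = 0.5 * H for this V
```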

CDmean (higher is better): the mean coefficient of determination for genomic estimated breeding value (GEBV) prediction (Rincent et al. 2012). It measures the proportion of genetic variance explained by prediction, averaged across lines.

$$\text{CDmean} = 1 - \frac{\text{mean PEV}}{\sigma_g^2}$$

CDmean ranges from 0 (no information) to 1 (perfect prediction). It is the primary criterion for optimising training population designs in genomic selection.
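The following sketch computes CDmean from the mixed model equations for a toy trial with an intercept and random line effects. The replication pattern, the identity relationship matrix, and the variance components are all assumptions chosen to keep the example small.

```r
sigma_g2 <- 0.4                      # genetic variance (assumed)
sigma_e2 <- 0.6                      # residual variance (assumed)

n_lines <- 4
reps    <- c(2, 2, 1, 1)             # plots per line (assumed)
line    <- rep(seq_len(n_lines), times = reps)

X <- matrix(1, nrow = length(line), ncol = 1)     # intercept only
Z <- outer(line, seq_len(n_lines), "==") * 1      # plot-to-line incidence
K <- diag(n_lines)                                # unrelated lines (assumed)

# Coefficient matrix with IID residuals: Q = I / sigma_e2
C <- rbind(
  cbind(crossprod(X), crossprod(X, Z)) / sigma_e2,
  cbind(crossprod(Z, X) / sigma_e2,
        crossprod(Z) / sigma_e2 + solve(K) / sigma_g2)
)

# PEV of each line is the matching diagonal element of C^{-1}
pev     <- diag(solve(C))[-1]
cd_mean <- 1 - mean(pev) / sigma_g2
```

In this toy setting the two replicated lines have lower PEV than the unreplicated ones, which is what drives CDmean upward as information per line increases.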

Large-design approximation

When the number of treatments exceeds eff_full_max (default 400), exact inversion of the $C$ submatrix is replaced by the Hutchinson stochastic trace estimator (Hutchinson 1990), which approximates $\text{trace}(C^{-1}[\text{idx},\text{idx}])$ using $m$ Rademacher random vectors:

$$\text{trace}(C^{-1}_\text{idx}) \approx \frac{1}{m} \sum_{k=1}^{m} z_k^\top C^{-1} z_k, \quad z_k \sim \text{Rademacher}$$

Each $z_k$ has independent entries equal to $+1$ or $-1$ with probability 1/2. Restricting the nonzero entries of $z_k$ to idx makes the estimator target that block only.

The result carries the _APPROX mode suffix and D_criterion is NA.
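A small numerical sketch of the Hutchinson estimator follows. The test matrix, the index set, and the number of probe vectors are assumptions; the point is only that linear solves replace the explicit inverse.

```r
set.seed(1)

# Toy symmetric positive definite matrix standing in for C (assumed)
n <- 60
C <- diag(3, n)
C[abs(row(C) - col(C)) == 1] <- -1

idx <- 1:20     # positions of the treatment effects (assumed)
m   <- 200      # number of Rademacher probe vectors

# Exact reference value, feasible only because the toy problem is small
trace_exact <- sum(diag(solve(C)[idx, idx]))

# Hutchinson estimate: probes are +/-1 on idx and zero elsewhere,
# so z' C^{-1} z only involves the idx block of C^{-1}
quad_forms <- replicate(m, {
  z <- numeric(n)
  z[idx] <- sample(c(-1, 1), length(idx), replace = TRUE)
  sum(z * solve(C, z))
})
trace_hutch <- mean(quad_forms)

c(exact = trace_exact, hutchinson = trace_hutch)
```

Because the estimate is stochastic, it changes with the seed and with `m`; increasing `m` reduces its variance at proportional cost.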


Design Family 1: Repeated-Check Block Designs

Construction with prep_famoptg()

prep_famoptg() builds a repeated-check block design in which checks appear in every block and non-check entries are allocated across blocks according to their specified replication levels.

Three design classes share the same function:

Augmented design — all test entries unreplicated, checks repeated in every block. Suitable for very large entry sets where resources only allow a single observation per candidate.

P-rep design — some entries replicated across multiple distinct blocks, others unreplicated. The most general case: a mixture of candidate entries with different priority levels.

RCBD-type design — all non-check entries equally replicated. When the replication number equals n_blocks, every entry appears in every block — the closest repeated-check analogue of a classical RCBD.

The p-rep constraint — the core structural rule: no replicated treatment ever appears twice in the same block. Enforced by construction at every call, not by post-hoc checking.

Total required plots:

$$\text{total} = n_\text{blocks} \times n_\text{checks} + \sum_{i=1}^{v_p} r_i + v_u$$

where $r_i$ is the replication count of p-rep entry $i$, $v_p$ is the number of p-rep entries, and $v_u$ is the number of unreplicated entries.
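The plot count above is easy to check before calling the constructor. The helper below is a hypothetical convenience function, not part of OptiDesign, and the numbers are made up for illustration.

```r
# Plots needed by a repeated-check block design (hypothetical helper).
required_plots <- function(n_blocks, n_checks, p_rep_reps, n_unreplicated) {
  # A p-rep entry can appear at most once per block, so r_i <= n_blocks
  stopifnot(all(p_rep_reps <= n_blocks))
  n_blocks * n_checks + sum(p_rep_reps) + n_unreplicated
}

total <- required_plots(
  n_blocks       = 4,
  n_checks       = 3,
  p_rep_reps     = c(2, 2, 3),
  n_unreplicated = 50
)
total                          # 4 * 3 + 7 + 50 = 69

# Compare against the field capacity (dimensions assumed)
n_rows <- 9; n_cols <- 8
fits <- total <= n_rows * n_cols
```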

Evaluation with evaluate_famoptg_efficiency()

Takes the field_book returned by prep_famoptg() and computes A, D, and CDmean criteria. Fully decoupled from construction — the same field book can be evaluated multiple times under different model assumptions.

The random effect model for this family uses sigma_b2 for the flat block structure (no replicate or incomplete-block nesting).

Optimisation with optimize_famoptg()

Runs n_restarts independent calls to prep_famoptg() with different seeds and returns the design with the best criterion value. Random Restart (RS) is the only method because the p-rep constraint is enforced by construction at every call — permutation-based methods (SA, GA) would require block-aware swap logic to preserve it.


Design Family 2: Alpha Row-Column Stream Designs

Construction with alpha_rc_stream()

alpha_rc_stream() builds a fixed-grid alpha row-column design using a stream-based layout. The field is converted to a one-dimensional planting stream, partitioned into $n_\text{reps}$ contiguous replicate segments, and each segment is divided into incomplete blocks. Checks appear in every incomplete block; each entry appears exactly once per replicate. Unused cells appear only at the end of the stream.

Block-size constraints are expressed in total block size (checks + entries) via min_block_size and max_block_size. The number of incomplete blocks per replicate $b$ must satisfy:

$$\left\lceil \frac{v}{\text{max\_block\_size} - c} \right\rceil \leq b \leq \left\lfloor \frac{v}{\text{min\_block\_size} - c} \right\rfloor$$

where $v$ is the number of entries and $c$ is the number of checks.
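The admissible range for the number of incomplete blocks can be computed directly from the inequality above. The function name and the input values below are assumptions for illustration.

```r
# Admissible numbers of incomplete blocks per replicate (hypothetical helper).
block_count_range <- function(v, n_checks, min_block_size, max_block_size) {
  stopifnot(min_block_size > n_checks, max_block_size >= min_block_size)
  lower <- ceiling(v / (max_block_size - n_checks))
  upper <- floor(v / (min_block_size - n_checks))
  if (lower > upper) integer(0) else seq.int(lower, upper)
}

# 60 entries, 3 checks, blocks of 10 to 12 plots (values assumed)
b_range <- block_count_range(v = 60, n_checks = 3,
                             min_block_size = 10, max_block_size = 12)
b_range   # 7 8
```

An empty result means no block count satisfies both size limits, so the limits or the field dimensions need to be revised.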

Evaluation with evaluate_alpha_efficiency()

Computes A, D, and CDmean criteria for a design produced by alpha_rc_stream(). The model contains Rep + IBlock(Rep) + Row + Column random effects, using sigma_rep2 and sigma_ib2 — distinct from evaluate_famoptg_efficiency() which uses sigma_b2.

Optimisation with optimize_alpha_rc()

Wraps alpha_rc_stream() and evaluate_alpha_efficiency() in an optimisation loop with three search strategies:

RS (Random Restart): generate n_restarts independent designs and return the best. Guaranteed validity, simple, easily parallelisable.

SA (Simulated Annealing): iterative entry permutation swaps with Metropolis acceptance, $P(\text{accept worse}) = \exp\!\left(-\frac{\Delta}{T_k}\right)$, where $\Delta$ is the amount by which the proposal worsens the criterion and $T_k$ cools from sa_temp_start to sa_temp_end. Better at escaping local optima than RS. Invalid swap proposals are treated as neutral events and do not affect the acceptance rate.

GA (Genetic Algorithm): a population of entry permutations evolved via Order Crossover (OX1), random swap mutation, tournament selection, and elitism. The most thorough global search of the three, at the highest computational cost.
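The building blocks named above can be sketched in a few lines of base R. The function names are hypothetical, the geometric cooling formula is one common form of exponential cooling and is an assumption about the schedule, and the crossover follows the textbook OX1 definition rather than the package source.

```r
# Exponential (geometric) cooling from temp_start to temp_end (assumed form).
sa_temperature <- function(k, max_iter, temp_start = 1, temp_end = 0.001) {
  temp_start * (temp_end / temp_start)^((k - 1) / max(max_iter - 1, 1))
}

# Metropolis rule for a lower-is-better score: improvements are always
# accepted, worse moves with probability exp(-delta / temp).
metropolis_accept <- function(delta, temp) {
  delta <= 0 || runif(1) < exp(-delta / temp)
}

# Order Crossover (OX1): keep a slice of parent 1, then fill the remaining
# positions with the missing entries in the order they occur in parent 2,
# starting just after the slice and wrapping around.
ox1_crossover <- function(p1, p2, cut1, cut2) {
  n <- length(p1)
  stopifnot(length(p2) == n, cut1 >= 1, cut1 <= cut2, cut2 <= n)
  child <- p1
  slice <- cut1:cut2
  order_after_cut <- ((cut2 + seq_len(n) - 1) %% n) + 1   # cut2 + 1, ..., n, 1, ..., cut2
  donors <- p2[order_after_cut]
  donors <- donors[!donors %in% p1[slice]]                # entries not yet placed
  open   <- order_after_cut[!order_after_cut %in% slice]  # positions still free
  child[open] <- donors
  child
}

p1 <- 1:8
p2 <- c(3, 7, 5, 1, 6, 8, 2, 4)
child <- ox1_crossover(p1, p2, cut1 = 3, cut2 = 5)
child   # 1 6 3 4 5 8 2 7
```

OX1 always returns a valid permutation, which is why it suits searches over entry orderings where each entry must appear exactly once.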


Integrity Checking

Both optimisers implement a four-point integrity checking strategy that guarantees the returned design is structurally valid:

  1. Post-construction — every candidate validated immediately after prep_famoptg() or alpha_rc_stream() returns, before scoring.
  2. Pre-storage — candidate re-checked before updating the running best.
  3. Pre-return — stored best re-checked one final time before returning to the user.
  4. Emergency fallback — if no valid design is found after all iterations, up to 10 fresh random designs are attempted before stopping with an informative error.

For the prep_famoptg family, five structural constraints are verified:

  • No non-check entry appears more than once in a single block
  • Each p-rep treatment appears in exactly p_rep_reps[i] blocks
  • Each unreplicated treatment appears exactly once
  • All checks appear in every block
  • No p-rep treatment occupies the same block twice (the core p-rep guarantee)
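The five constraints listed above can be expressed as checks on a treatment-by-block count table. The function below is a sketch with a hypothetical name and argument layout; it assumes a field book with Treatment and Block columns and is not the package's own validator.

```r
# Structural checks for a repeated-check block field book (hypothetical helper).
check_famoptg_structure <- function(field_book, checks, p_rep_reps, unreplicated) {
  expected <- c(checks, names(p_rep_reps), unreplicated)
  trt      <- factor(field_book$Treatment, levels = expected)
  counts   <- table(trt, field_book$Block)          # treatment x block counts
  non_check      <- c(names(p_rep_reps), unreplicated)
  blocks_per_trt <- rowSums(counts > 0)
  c(
    no_duplicates_in_block = all(counts[non_check, , drop = FALSE] <= 1),
    p_rep_block_counts     = all(blocks_per_trt[names(p_rep_reps)] == p_rep_reps),
    unreplicated_once      = all(rowSums(counts)[unreplicated] == 1),
    checks_in_every_block  = all(counts[checks, , drop = FALSE] >= 1),
    p_rep_distinct_blocks  = all(counts[names(p_rep_reps), , drop = FALSE] <= 1)
  )
}

# Toy field book (assumed): 2 blocks, 2 checks, 1 p-rep entry, 2 unreplicated
fb_ok <- data.frame(
  Treatment = c("C1", "C2", "P1", "U1", "C1", "C2", "P1", "U2"),
  Block     = c(1, 1, 1, 1, 2, 2, 2, 2)
)
res_ok <- check_famoptg_structure(
  fb_ok, checks = c("C1", "C2"),
  p_rep_reps = c(P1 = 2), unreplicated = c("U1", "U2")
)

# Break the design: put both copies of P1 in block 1
fb_bad <- fb_ok
fb_bad$Block[7] <- 1
res_bad <- check_famoptg_structure(
  fb_bad, checks = c("C1", "C2"),
  p_rep_reps = c(P1 = 2), unreplicated = c("U1", "U2")
)
```

Returning a named logical vector rather than a single flag makes it clear which constraint failed.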

Example Dataset

The package ships with a built-in example dataset for both design families:

data("OptiDesign_example_data", package = "OptiDesign")
x <- OptiDesign_example_data
names(x)
#>  [1] "OptiDesign_lines"               "OptiDesign_id_map"             
#>  [3] "OptiDesign_GRM"                 "OptiDesign_A"                  
#>  [5] "OptiDesign_K"                   "OptiDesign_famoptg_example"    
#>  [7] "OptiDesign_alpha_example"       "OptiDesign_famoptg_args_family"
#>  [9] "OptiDesign_famoptg_args_grm"    "OptiDesign_alpha_args_family"  
#> [11] "OptiDesign_alpha_args_grm"

The dataset contains treatment vectors, field dimensions, relationship matrices, and ready-to-use argument lists structured for do.call() workflows.


Workflow 1: Repeated-Check Block Design (Family-Based)

Step 1 — Construct

design_fam <- do.call(
  prep_famoptg,
  c(x$OptiDesign_famoptg_example, x$OptiDesign_famoptg_args_family)
)

dim(design_fam$layout_matrix)
#> [1] 16  8
head(design_fam$field_book)
#>   Treatment Family Gcluster Block Plot Row Column
#> 1      L027    F21     <NA>     1    1   1      1
#> 2      L016    F03     <NA>     1    2   2      1
#> 3      L002    F19     <NA>     1    3   3      1
#> 4      L006    F18     <NA>     1    4   4      1
#> 5      L008    F11     <NA>     1    5   5      1
#> 6      L001    F15     <NA>     1    6   6      1

The field book contains one row per assigned plot with treatment identity, family label, genomic cluster (if applicable), block, plot number, row, and column. There is no efficiency slot — evaluation is a separate step.

Step 2 — Evaluate

eff_fam <- evaluate_famoptg_efficiency(
  field_book         = design_fam$field_book,
  n_rows             = x$OptiDesign_famoptg_example$n_rows,
  n_cols             = x$OptiDesign_famoptg_example$n_cols,
  check_treatments   = x$OptiDesign_famoptg_example$check_treatments,
  treatment_effect   = "fixed",
  residual_structure = "IID"
)

cat("A-criterion (lower is better):", round(eff_fam$A_criterion, 4), "\n")
#> A-criterion (lower is better): 2.4774
cat("D-criterion (lower is better):", round(eff_fam$D_criterion, 4), "\n")
#> D-criterion (lower is better): 0.9643
cat("A-efficiency (higher is better):", round(eff_fam$A_efficiency, 4), "\n")
#> A-efficiency (higher is better): 0.4036
cat("Mode:", eff_fam$mode, "\n")
#> Mode: FIXED_TREATMENT_BLUE_CONTRAST

The same field book can be re-evaluated under a spatial model without rebuilding the design:

eff_fam_ar1 <- evaluate_famoptg_efficiency(
  field_book         = design_fam$field_book,
  n_rows             = x$OptiDesign_famoptg_example$n_rows,
  n_cols             = x$OptiDesign_famoptg_example$n_cols,
  check_treatments   = x$OptiDesign_famoptg_example$check_treatments,
  treatment_effect   = "fixed",
  residual_structure = "AR1xAR1",
  rho_row            = 0.3,
  rho_col            = 0.2
)

cat("A-criterion under AR1xAR1:", round(eff_fam_ar1$A_criterion, 4), "\n")
#> A-criterion under AR1xAR1: 2.1032

Step 3 — Optimise (optional)

opt_fam <- optimize_famoptg(
  # Construction arguments
  check_treatments        = x$OptiDesign_famoptg_example$check_treatments,
  check_families          = x$OptiDesign_famoptg_example$check_families,
  p_rep_treatments        = x$OptiDesign_famoptg_example$p_rep_treatments,
  p_rep_reps              = x$OptiDesign_famoptg_example$p_rep_reps,
  p_rep_families          = x$OptiDesign_famoptg_example$p_rep_families,
  unreplicated_treatments = x$OptiDesign_famoptg_example$unreplicated_treatments,
  unreplicated_families   = x$OptiDesign_famoptg_example$unreplicated_families,
  n_blocks                = x$OptiDesign_famoptg_example$n_blocks,
  n_rows                  = x$OptiDesign_famoptg_example$n_rows,
  n_cols                  = x$OptiDesign_famoptg_example$n_cols,
  # Evaluation arguments
  treatment_effect   = "fixed",
  residual_structure = "IID",
  # Optimiser arguments
  criterion   = "A",
  n_restarts  = 20,
  verbose_opt = FALSE
)

cat("Best A-criterion:", round(opt_fam$optimization$best_score, 4), "\n")
cat("Valid restarts:", opt_fam$optimization$n_restarts -
                        opt_fam$optimization$n_failed, "/",
                        opt_fam$optimization$n_restarts, "\n")

Workflow 2: Alpha Row-Column Stream Design (Family-Based)

Step 1 — Construct

design_alpha <- do.call(
  alpha_rc_stream,
  c(x$OptiDesign_alpha_example, x$OptiDesign_alpha_args_family)
)

dim(design_alpha$layout_matrix)
#> [1] 12 14
cat("Blocks per rep:", design_alpha$design_info$n_blocks_per_rep, "\n")
#> Blocks per rep: 8
cat("Total used plots:", design_alpha$design_info$total_used_plots, "\n")
#> Total used plots: 168
cat("Trailing NA plots:", design_alpha$design_info$trailing_na_plots, "\n")
#> Trailing NA plots: 0
head(design_alpha$field_book)
#>   Plot Row Column Rep IBlock BlockInRep Treatment Family Gcluster Check
#> 1    1   1      1   1      1          1      L146    F08     <NA> FALSE
#> 2    2   1      2   1      1          1      L118    F14     <NA> FALSE
#> 3    3   1      3   1      1          1      L134    F09     <NA> FALSE
#> 4    4   1      4   1      1          1      L153    F20     <NA> FALSE
#> 5    5   1      5   1      1          1      L145    F01     <NA> FALSE
#> 6    6   1      6   1      1          1      L117    F06     <NA> FALSE

Step 2 — Evaluate

eff_alpha <- evaluate_alpha_efficiency(
  field_book         = design_alpha$field_book,
  n_rows             = x$OptiDesign_alpha_example$n_rows,
  n_cols             = x$OptiDesign_alpha_example$n_cols,
  check_treatments   = x$OptiDesign_alpha_example$check_treatments,
  treatment_effect   = "fixed",
  residual_structure = "IID"
)

cat("A-criterion (lower is better):", round(eff_alpha$A_criterion, 4), "\n")
#> A-criterion (lower is better): 1.3017
cat("D-criterion (lower is better):", round(eff_alpha$D_criterion, 4), "\n")
#> D-criterion (lower is better): 0.5794
cat("Number of treatments evaluated:", eff_alpha$n_trt, "\n")
#> Number of treatments evaluated: 63

Step 3 — Optimise with Random Restart

opt_rs <- optimize_alpha_rc(
  check_treatments   = x$OptiDesign_alpha_example$check_treatments,
  check_families     = x$OptiDesign_alpha_example$check_families,
  entry_treatments   = x$OptiDesign_alpha_example$entry_treatments,
  entry_families     = x$OptiDesign_alpha_example$entry_families,
  n_reps             = x$OptiDesign_alpha_example$n_reps,
  n_rows             = x$OptiDesign_alpha_example$n_rows,
  n_cols             = x$OptiDesign_alpha_example$n_cols,
  min_block_size     = 10L,
  max_block_size     = 12L,
  treatment_effect   = "fixed",
  residual_structure = "IID",
  method             = "RS",
  criterion          = "A",
  n_restarts         = 20,
  verbose_opt        = FALSE
)

cat("Best A-criterion (RS):", round(opt_rs$optimization$best_score, 4), "\n")
plot(opt_rs$optimization$score_history, type = "b",
     xlab = "Restart", ylab = "A-criterion",
     main = "RS: A-criterion across restarts")

Step 4 — Optimise with Simulated Annealing

opt_sa <- optimize_alpha_rc(
  check_treatments   = x$OptiDesign_alpha_example$check_treatments,
  check_families     = x$OptiDesign_alpha_example$check_families,
  entry_treatments   = x$OptiDesign_alpha_example$entry_treatments,
  entry_families     = x$OptiDesign_alpha_example$entry_families,
  n_reps             = x$OptiDesign_alpha_example$n_reps,
  n_rows             = x$OptiDesign_alpha_example$n_rows,
  n_cols             = x$OptiDesign_alpha_example$n_cols,
  min_block_size     = 10L,
  max_block_size     = 12L,
  treatment_effect   = "fixed",
  residual_structure = "IID",
  method             = "SA",
  criterion          = "A",
  n_restarts         = 3,
  sa_max_iter        = 200,
  sa_temp_start      = 1.0,
  sa_temp_end        = 0.001,
  sa_cooling         = "exponential",
  sa_swap_scope      = "global",
  verbose_opt        = FALSE
)

cat("Best A-criterion (SA):", round(opt_sa$optimization$best_score, 4), "\n")

Workflow 3: GRM-Based Design with Dispersion

When a genomic relationship matrix is available it replaces family labels as the grouping source, and can also drive the optional dispersion optimisation:

design_grm <- do.call(
  alpha_rc_stream,
  c(x$OptiDesign_alpha_example, x$OptiDesign_alpha_args_grm)
)

# GRM clustering populates the Gcluster column for non-check entries
non_check <- design_grm$field_book[!design_grm$field_book$Check, ]
cat("Unique genomic clusters:", length(unique(non_check$Gcluster[
  !is.na(non_check$Gcluster)])), "\n")
#> Unique genomic clusters: 23
# alpha_rc_stream field books use IBlock and Rep, not Block
head(non_check[, c("Treatment", "Family", "Gcluster", "Rep",
                    "IBlock", "Row", "Column")])
#>   Treatment Family Gcluster Rep IBlock Row Column
#> 1      L145    F01      G14   1      1   1      1
#> 2      L146    F08       G7   1      1   1      2
#> 3      L154    F08       G1   1      1   1      3
#> 7      L134    F09       G2   1      1   1      7
#> 8      L104    F22      G20   1      1   1      8
#> 9      L117    F06      G22   1      1   1      9

Workflow 4: CDmean Optimisation for Genomic Selection

CDmean is the primary criterion when the objective is maximising GEBV prediction reliability rather than contrast precision. It requires random treatment effects and a genomic prediction model.

# CDmean optimisation with GBLUP requires a K matrix.
# Here we use the example K from the shipped dataset.
opt_cdmean <- optimize_alpha_rc(
  check_treatments   = x$OptiDesign_alpha_example$check_treatments,
  check_families     = x$OptiDesign_alpha_example$check_families,
  entry_treatments   = x$OptiDesign_alpha_example$entry_treatments,
  entry_families     = x$OptiDesign_alpha_example$entry_families,
  n_reps             = x$OptiDesign_alpha_example$n_reps,
  n_rows             = x$OptiDesign_alpha_example$n_rows,
  n_cols             = x$OptiDesign_alpha_example$n_cols,
  min_block_size     = 10L,
  max_block_size     = 12L,
  # Genomic prediction model
  treatment_effect   = "random",
  prediction_type    = "GBLUP",
  K                  = x$OptiDesign_K,
  line_id_map        = x$OptiDesign_id_map,
  varcomp = list(
    sigma_g2   = 0.4,
    sigma_e2   = 0.6,
    sigma_rep2 = 0.1,
    sigma_ib2  = 0.05,
    sigma_r2   = 0.02,
    sigma_c2   = 0.02
  ),
  # CDmean criterion
  method      = "RS",
  criterion   = "CDmean",
  n_restarts  = 20,
  verbose_opt = FALSE
)

cat("Best CDmean:", round(opt_cdmean$optimization$best_score, 4),
    "(higher is better)\n")
cat("CDmean from efficiency slot:",
    round(opt_cdmean$efficiency$CDmean, 4), "\n")

CDmean and A-criterion can point in different directions because they measure different objectives — contrast precision versus prediction reliability. Choose based on your trial objective.


Comparing Optimality Criteria

Both criterion types can be computed on the same design to understand the trade-off:

# Evaluate the same design under both fixed and random models
eff_fixed <- evaluate_alpha_efficiency(
  field_book         = design_alpha$field_book,
  n_rows             = x$OptiDesign_alpha_example$n_rows,
  n_cols             = x$OptiDesign_alpha_example$n_cols,
  check_treatments   = x$OptiDesign_alpha_example$check_treatments,
  treatment_effect   = "fixed",
  residual_structure = "IID"
)

eff_random <- evaluate_alpha_efficiency(
  field_book         = design_alpha$field_book,
  n_rows             = x$OptiDesign_alpha_example$n_rows,
  n_cols             = x$OptiDesign_alpha_example$n_cols,
  check_treatments   = x$OptiDesign_alpha_example$check_treatments,
  treatment_effect   = "random",
  prediction_type    = "IID",
  residual_structure = "IID"
)

results <- data.frame(
  Criterion  = c("A-criterion", "D-criterion",
                 "A-efficiency", "D-efficiency",
                 "Mean PEV", "CDmean"),
  Value      = c(
    round(eff_fixed$A_criterion,  4),
    round(eff_fixed$D_criterion,  4),
    round(eff_fixed$A_efficiency, 4),
    round(eff_fixed$D_efficiency, 4),
    round(eff_random$mean_PEV,    4),
    round(eff_random$CDmean,      4)
  ),
  Direction  = c("lower=better", "lower=better",
                 "higher=better", "higher=better",
                 "lower=better", "higher=better"),
  Model      = c("fixed", "fixed", "fixed", "fixed", "random", "random")
)
knitr::kable(results, caption = "Efficiency criteria for the example alpha design")
Efficiency criteria for the example alpha design

| Criterion | Value | Direction | Model |
| --- | --- | --- | --- |
| A-criterion | 1.3017 | lower=better | fixed |
| D-criterion | 0.5794 | lower=better | fixed |
| A-efficiency | 0.7682 | higher=better | fixed |
| D-efficiency | 1.7260 | higher=better | fixed |
| Mean PEV | 0.3883 | lower=better | random |
| CDmean | 0.6117 | higher=better | random |

Grouping and Dispersion Options

When to use each grouping strategy

| Strategy | cluster_source | Use when |
| --- | --- | --- |
| Family labels | "Family" | Family structure is meaningful and interpretable; no relationship matrix available |
| Genomic (GRM) | "GRM" | Genomic data is available; relatedness precision matters more than family labels |
| Pedigree (A) | "A" | Only pedigree information is available |

Dispersion optimisation

The dispersion step minimises the total genomic relatedness among neighbouring plots:

$$S = \sum_{(i,j)\in\mathcal{N}} K_{ij}$$

where $\mathcal{N}$ is the set of plot pairs within Chebyshev distance dispersion_radius. The local swap search accepts a swap if and only if it reduces $S$. Key parameters:

| Parameter | Effect |
| --- | --- |
| dispersion_radius | Neighbourhood size: 1 = 8-connected, 2 = 24-connected |
| dispersion_iters | Number of swap proposals; more iterations = lower $S$ at linear cost |
| dispersion_source | Which matrix to score against: "K", "GRM", or "A" |
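The dispersion score and the accept-if-better swap rule can be sketched as follows. Both function names are hypothetical, the layout and relationship matrix are toy values, and the sketch swaps any two used plots, whereas a real implementation would also have to respect block and check constraints.

```r
# Sum of K over unordered plot pairs within Chebyshev distance `radius`.
dispersion_score <- function(layout, K, radius = 1) {
  pos  <- which(!is.na(layout), arr.ind = TRUE)      # row/column of used plots
  ids  <- layout[!is.na(layout)]                     # same column-major order
  cheb <- as.matrix(dist(pos, method = "maximum"))   # Chebyshev distances
  near <- cheb <= radius & upper.tri(cheb)           # count each pair once
  sum(K[ids, ids][near])
}

# Local swap search: a proposal is kept only if it lowers the score.
disperse <- function(layout, K, radius = 1, iters = 200) {
  best <- dispersion_score(layout, K, radius)
  used <- which(!is.na(layout))
  for (k in seq_len(iters)) {
    swap <- sample(used, 2)
    cand <- layout
    cand[swap] <- layout[rev(swap)]
    score <- dispersion_score(cand, K, radius)
    if (score < best) {
      layout <- cand
      best   <- score
    }
  }
  list(layout = layout, score = best)
}

# Toy example (assumed): 9 lines in 3 families, each column starts as one family
set.seed(42)
ids <- paste0("L", 1:9)
fam <- rep(1:3, each = 3)
K   <- outer(fam, fam, "==") * 0.5
diag(K) <- 1
dimnames(K) <- list(ids, ids)

layout      <- matrix(ids, nrow = 3)
score_start <- dispersion_score(layout, K)   # 3: two adjacent pairs per column at 0.5
result      <- disperse(layout, K, iters = 200)
```

Because only improving swaps are accepted, the final score can never exceed the starting score.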

Practical Guidelines

Choosing between design families

| Situation | Recommended function |
| --- | --- |
| Many entries, not all need replication | prep_famoptg() |
| Some entries need priority replication | prep_famoptg() (p-rep) |
| All entries equally replicated, checks needed everywhere | prep_famoptg() (RCBD-type) |
| Fixed field dimensions, operational field-book order | alpha_rc_stream() |
| Alpha-lattice structure needed | alpha_rc_stream() |

Choosing an optimality criterion

| Objective | Criterion | Function argument |
| --- | --- | --- |
| Maximise contrast precision | A-criterion | criterion = "A" |
| Minimise overall estimation volume | D-criterion | criterion = "D" |
| Balance both | Combined | criterion = "both" |
| Maximise GEBV prediction reliability | CDmean | criterion = "CDmean" |

CDmean requires treatment_effect = "random" and prediction_type %in% c("IID", "GBLUP", "PBLUP"). For genomic selection training population optimisation, GBLUP with a real kinship matrix $K$ is strongly recommended.

Choosing an optimisation method (optimize_alpha_rc only)

| Method | Best for | Cost |
| --- | --- | --- |
| RS | Quick exploration, guaranteed valid designs | Low |
| SA | Moderate search depth, escaping local optima | Medium |
| GA | Thorough global search | High |

For production use, SA or GA with 5–10 restarts and 500–1000 iterations/generations typically provides a good balance of quality and computation time.

Variance component specification

Variance components have a large effect on efficiency values but a small effect on the ranking of designs. For design comparison purposes, equal variance components (sigma_* = 1) are a reasonable default. For absolute criterion values that are meaningful on the scale of real data, use heritability-consistent components:

# Example: h^2 = 0.5, moderate spatial correlation
varcomp <- list(
  sigma_g2   = 0.5,
  sigma_e2   = 0.5,
  sigma_rep2 = 0.1,
  sigma_ib2  = 0.05,
  sigma_r2   = 0.02,
  sigma_c2   = 0.02
)
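One simple way to obtain heritability-consistent values is to split a fixed total of genetic plus residual variance according to a target plot-level heritability. The helper below is hypothetical and assumes $h^2 = \sigma_g^2 / (\sigma_g^2 + \sigma_e^2)$, ignoring the block, row, and column terms.

```r
# Genetic and residual variances for a target plot-level heritability
# (hypothetical helper; assumes h2 = sigma_g2 / (sigma_g2 + sigma_e2)).
varcomp_from_h2 <- function(h2, total = 1) {
  stopifnot(h2 > 0, h2 < 1, total > 0)
  list(sigma_g2 = h2 * total, sigma_e2 = (1 - h2) * total)
}

vc <- varcomp_from_h2(0.4)

# Add the design-related components (values assumed, as in the example above)
varcomp_h2 <- c(vc, list(sigma_rep2 = 0.1, sigma_ib2 = 0.05,
                         sigma_r2 = 0.02, sigma_c2 = 0.02))

h2_check <- varcomp_h2$sigma_g2 / (varcomp_h2$sigma_g2 + varcomp_h2$sigma_e2)
```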

Session Information

sessionInfo()
#> R version 4.5.0 (2025-04-11 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 11 x64 (build 22631)
#> 
#> Matrix products: default
#>   LAPACK version 3.12.1
#> 
#> locale:
#> [1] LC_COLLATE=English_United States.utf8 
#> [2] LC_CTYPE=English_United States.utf8   
#> [3] LC_MONETARY=English_United States.utf8
#> [4] LC_NUMERIC=C                          
#> [5] LC_TIME=English_United States.utf8    
#> 
#> time zone: America/Bogota
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] OptiDesign_0.1.0
#> 
#> loaded via a namespace (and not attached):
#>  [1] cli_3.6.5         knitr_1.51        rlang_1.1.6       xfun_0.53        
#>  [5] textshaping_1.0.4 jsonlite_2.0.0    htmltools_0.5.8.1 pracma_2.4.6     
#>  [9] ragg_1.4.0        sass_0.4.10       rmarkdown_2.30    grid_4.5.0       
#> [13] evaluate_1.0.5    jquerylib_0.1.4   fastmap_1.2.0     yaml_2.3.10      
#> [17] lifecycle_1.0.5   compiler_4.5.0    fs_1.6.6          htmlwidgets_1.6.4
#> [21] rstudioapi_0.17.1 systemfonts_1.3.1 lattice_0.22-6    digest_0.6.37    
#> [25] R6_2.6.1          bslib_0.10.0      Matrix_1.7-3      tools_4.5.0      
#> [29] pkgdown_2.1.3     cachem_1.1.0      desc_1.4.3

References

Rincent, R., Laloë, D., Nicolas, S., Altmann, T., Brunel, D., Revilla, P., …, & Moreau, L. (2012). Maximizing the reliability of genomic selection by optimizing the calibration set of reference individuals: comparison of methods in two diverse groups of maize inbreds (Zea mays L.). Genetics, 192(2), 715–728. https://doi.org/10.1534/genetics.112.141473

Jones, B., Allen-Moyer, K., & Goos, P. (2021). A-optimal versus D-optimal design of screening experiments. Journal of Quality Technology, 53(4), 369–382. https://doi.org/10.1080/00224065.2020.1757391

Hutchinson, M. F. (1990). A stochastic estimator of the trace of the influence matrix for Laplacian smoothing splines. Communications in Statistics — Simulation and Computation, 19(2), 433–450. https://doi.org/10.1080/03610919008812866

Kirkpatrick, S., Gelatt, C. D., & Vecchi, M. P. (1983). Optimization by simulated annealing. Science, 220(4598), 671–680. https://doi.org/10.1126/science.220.4598.671

Holland, J. H. (1992). Adaptation in Natural and Artificial Systems. MIT Press.