OptiDesign


Overview

OptiDesign provides tools for constructing, evaluating, and optimizing experimental field designs for plant breeding and related agricultural applications. It integrates:

  • field layout construction
  • genetic structure (family, pedigree, genomic relationships)
  • optional spatial dispersion optimization
  • statistical efficiency evaluation under mixed models
  • criterion-driven design optimization

into a single, flexible workflow built around two design families.


Why OptiDesign?

In many breeding programs, experimental design is treated as a purely logistical step. However, design choices strongly affect:

  • precision of treatment estimates
  • genomic prediction accuracy
  • ability to separate genetic from environmental effects
  • robustness to field spatial heterogeneity

OptiDesign lets users control these aspects explicitly at the design stage instead of correcting for them post hoc during analysis. It also goes beyond single-design construction: criterion-driven optimisation searches across many randomisations and returns the candidate design that scores best on the criterion chosen for the trial objective.


Core Design Concepts

OptiDesign is built around three key ideas.

1. Field realism

Designs reflect how trials are actually implemented in the field:

  • fixed grid layouts (n_rows × n_cols)
  • field-book ordering and serpentine movement
  • contiguous replicate structure
  • unused cells placed at the end of the field stream
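The field-stream idea can be sketched in base R. This is an illustration only: `serpentine_stream()` is a hypothetical helper, not an OptiDesign function, and the direction of travel (left to right on odd rows, right to left on even rows) is an assumption.

```r
# Hypothetical helper (not part of OptiDesign): order the cells of an
# n_rows x n_cols grid as a single serpentine stream.
# Assumption: odd rows run left to right, even rows run right to left.
serpentine_stream <- function(n_rows, n_cols) {
  rows <- rep(seq_len(n_rows), each = n_cols)
  cols <- unlist(lapply(seq_len(n_rows), function(r) {
    if (r %% 2 == 1) seq_len(n_cols) else rev(seq_len(n_cols))
  }))
  data.frame(position = seq_len(n_rows * n_cols), row = rows, col = cols)
}

stream <- serpentine_stream(n_rows = 3, n_cols = 4)

# With 10 plots on a 12-cell grid, the two unused cells are the last two
# positions of the stream rather than gaps inside the trial.
n_plots <- 10
stream$used <- stream$position <= n_plots
stream
```

Because the unused cells are defined by stream position, they stay together at the end of the walking path whatever the grid shape.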

2. Genetic awareness

Entries can be arranged based on known genetic structure:

  • family labels
  • pedigree relationships (A matrix)
  • genomic relationships (GRM)

This enables reduction of local genetic relatedness, improved sampling of genetic diversity across spatial blocks, and better estimation of genetic effects.
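As a rough illustration of what "local genetic relatedness" can mean, the sketch below averages the relationship coefficient over all pairs of horizontally or vertically adjacent plots. The function name `adjacent_relatedness()`, the toy relationship matrix, and the choice of neighbourhood are assumptions made for this example; OptiDesign's own scoring may differ.

```r
# Hypothetical illustration (not OptiDesign's scoring function): mean
# relationship between entries in horizontally or vertically adjacent plots.
adjacent_relatedness <- function(layout, K) {
  # layout: character matrix of entry names laid out as the field grid.
  # K: relationship matrix with matching row and column names.
  nr <- nrow(layout)
  nc <- ncol(layout)
  vals <- numeric(0)
  for (i in seq_len(nr)) {
    for (j in seq_len(nc)) {
      if (j < nc) vals <- c(vals, K[layout[i, j], layout[i, j + 1]])
      if (i < nr) vals <- c(vals, K[layout[i, j], layout[i + 1, j]])
    }
  }
  mean(vals)
}

# Toy relationship matrix: two families of two sibs (0.5 within, 0 between).
ids <- c("a1", "a2", "b1", "b2")
K <- matrix(0, 4, 4, dimnames = list(ids, ids))
K[1:2, 1:2] <- 0.5
K[3:4, 3:4] <- 0.5
diag(K) <- 1

clustered <- matrix(c("a1", "a2", "b1", "b2"), nrow = 2, byrow = TRUE)
dispersed <- matrix(c("a1", "b1", "b2", "a2"), nrow = 2, byrow = TRUE)

score_clustered <- adjacent_relatedness(clustered, K)  # sibs side by side
score_dispersed <- adjacent_relatedness(dispersed, K)  # sibs only on diagonals
```

A lower score means related entries are less often neighbours, which is the direction a dispersion step would push the layout.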

3. Integrated evaluation and optimisation

Designs can be evaluated and optimised using mixed-model principles before field implementation:

  • fixed-effect precision (BLUEs) — A and D optimality criteria
  • random-effect prediction (BLUP / GBLUP / PBLUP) — mean PEV and CDmean
  • spatial residual structures (IID, AR1, AR1×AR1)
  • criterion-driven search returning the statistically best design
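For readers unfamiliar with the AR1×AR1 residual structure, the following base R sketch builds a separable residual covariance matrix for a small grid. It is not taken from the package: the helper `ar1_matrix()`, the plot ordering, and the reading of `rho_row` as the correlation between neighbouring row positions are all assumptions.

```r
# AR1 correlation matrix of order n with autocorrelation rho
# (hypothetical helper, not an OptiDesign function).
ar1_matrix <- function(n, rho) {
  rho^abs(outer(seq_len(n), seq_len(n), "-"))
}

n_rows <- 3
n_cols <- 4
rho_row <- 0.10
rho_col <- 0.10
sigma_e2 <- 0.6  # illustrative residual variance

# Separable AR1 x AR1 residual covariance. The Kronecker order used here
# assumes plots are sorted with the row index varying fastest within each
# column, so plot k sits at row ((k - 1) %% n_rows) + 1.
R <- sigma_e2 * kronecker(ar1_matrix(n_cols, rho_col),
                          ar1_matrix(n_rows, rho_row))

dim(R)   # one row and column per plot
R[1, 2]  # plots (row 1, col 1) and (row 2, col 1): sigma_e2 * rho_row
R[1, 4]  # plots (row 1, col 1) and (row 1, col 2): sigma_e2 * rho_col
```

Setting both correlations to zero reduces `R` to the IID case, `sigma_e2` times the identity matrix.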

Package Architecture

OptiDesign follows a single-responsibility architecture: construction, evaluation, and optimisation are separated into distinct functions that can be called independently or chained together.

| Function | Role | Design family |
| --- | --- | --- |
| prep_famoptg() | Construction | Repeated-check block |
| evaluate_famoptg_efficiency() | Evaluation | Repeated-check block |
| optimize_famoptg() | Optimisation (RS) | Repeated-check block |
| alpha_rc_stream() | Construction | Alpha row-column stream |
| evaluate_alpha_efficiency() | Evaluation | Alpha row-column stream |
| optimize_alpha_rc() | Optimisation (RS / SA / GA) | Alpha row-column stream |

Design Family 1: Repeated-Check Block Designs

prep_famoptg() — construction

Constructs repeated-check block designs with flexible replication, covering augmented designs, partially replicated (p-rep) designs, and RCBD-type repeated-check designs. Checks appear in every block; entries may be replicated, partially replicated, or unreplicated.

Use this function when:

  • you have many entries but limited field resources
  • checks must appear in every block
  • you need a flexible framework covering augmented, p-rep, or balanced layouts
  • you are working in early- or intermediate-stage breeding trials

Key capabilities:

| Feature | Details |
| --- | --- |
| Replication | Flexible per-entry replication |
| Block allocation | P-rep constraint: no treatment appears twice in the same block |
| Design types | Augmented, p-rep, RCBD-type repeated-check |
| Grouping | Family labels, GRM, or pedigree (A) matrix |
| Dispersion optimization | Optional; reduces clustering of related entries |

Key design rule: a treatment can appear multiple times overall, but always in distinct blocks — never twice in the same block.
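The rule is easy to verify on any field book. The sketch below uses a toy data frame with hypothetical column names `Treatment` and `Block`; the actual column names in `design$field_book` may differ.

```r
# Toy field book. The column names are assumptions made for this example.
fb <- data.frame(
  Treatment = c("CHK1", "G1", "G2", "CHK1", "G1", "G3"),
  Block     = c(1, 1, 1, 2, 2, 2)
)

# Count how often each treatment occurs in each block.
counts <- table(fb$Treatment, fb$Block)

# The rule holds when no treatment occurs more than once in any block.
rule_ok <- all(counts <= 1)

# A layout that places G1 twice in block 1 violates the rule.
fb_bad <- rbind(fb, data.frame(Treatment = "G1", Block = 1))
rule_bad <- all(table(fb_bad$Treatment, fb_bad$Block) <= 1)
```

Here `G1` is replicated twice overall, once in each block, which is the pattern the rule allows.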


evaluate_famoptg_efficiency() — evaluation

Evaluates the statistical efficiency of a design produced by prep_famoptg(). The mixed model contains Block + Row + Column random effects (no replicate or incomplete-block nesting) and uses the variance component sigma_b2 for the flat block structure. Evaluation is fully decoupled from construction, so the same field book can be evaluated multiple times under different model assumptions without rebuilding the layout.

Supported criteria:

| Criterion | Effect type | Direction |
| --- | --- | --- |
| A-criterion | Fixed or random | Lower is better |
| D-criterion | Fixed only | Lower is better |
| CDmean | Random only | Higher is better |
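To make the fixed-effect criteria concrete, here is a minimal base R sketch on a toy design, following the definitions given under Efficiency Criteria (mean pairwise contrast variance for A, geometric mean of the contrast covariance eigenvalues for D). It is not the package's implementation: the model has random blocks only, the variance components are illustrative, and the centring projector is one of several possible ways to form treatment contrasts.

```r
# Toy design: 3 treatments, each appearing once in each of 2 blocks.
trt   <- factor(c("T1", "T2", "T3", "T1", "T2", "T3"))
block <- factor(c(1, 1, 1, 2, 2, 2))

X <- model.matrix(~ 0 + trt)    # treatment incidence (fixed effects)
Z <- model.matrix(~ 0 + block)  # block incidence (random effects)

sigma_b2 <- 0.2  # illustrative block variance
sigma_e2 <- 1    # illustrative residual variance
V <- sigma_b2 * tcrossprod(Z) + sigma_e2 * diag(nrow(X))

# Covariance matrix of the generalised least squares treatment estimates.
C <- solve(t(X) %*% solve(V, X))

# A-criterion: mean variance of all pairwise treatment differences,
# var(t_i - t_j) = C_ii + C_jj - 2 * C_ij.
pairs <- combn(ncol(X), 2)
A_crit <- mean(diag(C)[pairs[1, ]] + diag(C)[pairs[2, ]] - 2 * C[t(pairs)])

# D-criterion: geometric mean of the non-zero eigenvalues of the covariance
# of centred treatment effects (contrasts formed with a centring projector).
n_trt <- ncol(X)
P <- diag(n_trt) - matrix(1 / n_trt, n_trt, n_trt)
ev <- eigen(P %*% C %*% P, symmetric = TRUE)$values
D_crit <- exp(mean(log(ev[seq_len(n_trt - 1)])))
```

For this complete-block toy layout the pairwise variance is 2 * sigma_e2 / 2 = 1, because treatment differences are free of block effects; incomplete or unbalanced layouts give larger values.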

optimize_famoptg() — optimisation

Wraps prep_famoptg() and evaluate_famoptg_efficiency() in a Random Restart (RS) loop. RS is the only method offered because the p-rep constraint is enforced by construction at every call, whereas permutation-based methods would need block-aware swap logic to preserve it. Every candidate design is therefore valid by construction. Supports A, D, both, and CDmean criteria.
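The random-restart pattern itself is simple. The sketch below is a generic version, not the body of optimize_famoptg(): `random_restart()`, `construct`, and `score` are hypothetical stand-ins for a constructor and an evaluator, lower scores are treated as better, and the history stores the running best score (an assumption about what a score history might hold).

```r
# Generic random-restart search (a sketch under the assumptions above).
random_restart <- function(construct, score, n_restarts = 50) {
  best <- NULL
  best_score <- Inf
  history <- numeric(n_restarts)
  for (i in seq_len(n_restarts)) {
    candidate <- construct()   # every candidate is built from scratch
    s <- score(candidate)
    if (s < best_score) {      # keep the candidate only if it improves
      best <- candidate
      best_score <- s
    }
    history[i] <- best_score   # running best after i restarts
  }
  list(best = best, best_score = best_score, score_history = history)
}

set.seed(1)
# Toy problem: random orderings of 1:8, scored by the number of neighbouring
# values that differ by exactly 1.
toy_score <- function(x) sum(abs(diff(x)) == 1)
res <- random_restart(
  construct  = function() sample(8),
  score      = toy_score,
  n_restarts = 50
)
res$best_score
```

Because each candidate comes straight from the constructor, any constraint the constructor guarantees holds for the returned design as well.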


Design Family 2: Alpha Row-Column Stream Designs

alpha_rc_stream() — construction

Constructs alpha row-column designs on a fixed grid using a stream-based layout. The field is converted into a single ordered stream of positions, split into contiguous replicate segments, then further divided into incomplete blocks. Checks appear in every incomplete block; entries appear once per replicate.
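The stream-then-split logic can be pictured with a few lines of base R. This sketch assumes equal-sized incomplete blocks and uses made-up dimensions and column names; it only labels stream positions and does not assign treatments.

```r
n_rows <- 5
n_cols <- 5
n_reps <- 3
blocks_per_rep <- 2
block_size <- 4  # total plots per incomplete block (illustrative)

n_cells <- n_rows * n_cols                       # 25 cells in the stream
n_used  <- n_reps * blocks_per_rep * block_size  # 24 planted cells
n_spare <- n_cells - n_used                      # 1 unused cell

# Label each stream position with its replicate and incomplete block.
# Replicates are contiguous segments; unused cells come last (NA labels).
stream <- data.frame(
  position = seq_len(n_cells),
  rep_id   = c(rep(seq_len(n_reps), each = blocks_per_rep * block_size),
               rep(NA, n_spare)),
  iblock   = c(rep(rep(seq_len(blocks_per_rep), each = block_size),
                   times = n_reps),
               rep(NA, n_spare))
)

table(stream$rep_id, stream$iblock)  # plots per replicate and block
tail(stream, 2)                      # the final position is the unused cell
```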

Block-size constraints are expressed as total block size (checks + entries) through min_block_size and max_block_size. The function translates these into entry-slot limits internally and derives or validates the number of incomplete blocks per replicate accordingly.
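One plausible reading of this translation is sketched below. The arithmetic and the helper name `feasible_blocks()` are assumptions made for illustration; the function's internal checks (for example, whether a replicate fits its field segment) are not reproduced here.

```r
# Hypothetical helper: which numbers of incomplete blocks per replicate are
# compatible with the total block-size limits?
feasible_blocks <- function(n_entries, n_checks, min_block_size, max_block_size) {
  # Every block holds all checks, so the entry slots are what is left over.
  min_slots <- min_block_size - n_checks
  max_slots <- max_block_size - n_checks
  stopifnot(min_slots >= 1, max_slots >= min_slots)
  # b blocks can hold all entries if b * min_slots <= n_entries <= b * max_slots.
  lower <- ceiling(n_entries / max_slots)
  upper <- floor(n_entries / min_slots)
  if (lower > upper) integer(0) else seq(lower, upper)
}

# Illustrative numbers: 160 entries, 4 checks, total block size 19 to 20.
b <- feasible_blocks(n_entries = 160, n_checks = 4,
                     min_block_size = 19, max_block_size = 20)
max(b)  # the largest feasible block count

# No block count satisfies the limits for 100 entries with the same settings.
none <- feasible_blocks(n_entries = 100, n_checks = 4,
                        min_block_size = 19, max_block_size = 20)
length(none)
```

With 160 entries and 4 checks, ten blocks of 16 entries plus 4 checks give 200 plots per replicate, which is why a narrow block-size window can pin the block count to a single value.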

Use this function when:

  • field dimensions are fixed and cannot change
  • planting follows field-book order
  • replicates are defined operationally rather than geometrically
  • checks must appear in every incomplete block
  • you want to control block sizes in whole-block terms

Key capabilities:

| Feature | Details |
| --- | --- |
| Grid | Fixed n_rows × n_cols |
| Replicates | Contiguous field segments |
| Block sizes | Controlled via min_block_size / max_block_size (total: checks + entries) |
| Checks | Present in every incomplete block |
| Block count | User-fixed or automatically derived as the largest feasible value |
| Grouping | Family labels, GRM, or pedigree (A) matrix |
| Dispersion optimization | Optional |

Important: unused cells are placed only at the end of the field stream rather than scattered across the grid, which is critical for practical field implementation.


evaluate_alpha_efficiency() — evaluation

Evaluates the statistical efficiency of a design produced by alpha_rc_stream(). The mixed model contains Rep + IBlock(Rep) + Row + Column random effects, with variance components sigma_rep2 and sigma_ib2 for the replicate and incomplete-block terms. Supports the same A, D, and CDmean criteria as evaluate_famoptg_efficiency().
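The random part of such a model implies a covariance matrix for the plot observations. The sketch below assembles it for a tiny layout with IID residuals; the field-book column names (`Row`, `Column`, `Rep`, `IBlock`) and the variance values are assumptions for illustration, and the package's internal computation may be organised differently.

```r
# Toy 4 x 2 grid: two replicates (rows 1-2 and rows 3-4), with two incomplete
# blocks per replicate defined by column. Column names are hypothetical.
fb <- expand.grid(Row = 1:4, Column = 1:2)
fb$Rep    <- ifelse(fb$Row <= 2, 1, 2)
fb$IBlock <- fb$Column

varcomp <- list(sigma_e2 = 0.6, sigma_rep2 = 0.1, sigma_ib2 = 0.05,
                sigma_r2 = 0.02, sigma_c2 = 0.02)

# Incidence matrix for a factor (one indicator column per level).
incidence <- function(f) model.matrix(~ 0 + f)

Z_rep <- incidence(factor(fb$Rep))
Z_ib  <- incidence(interaction(fb$Rep, fb$IBlock, drop = TRUE))  # IBlock(Rep)
Z_row <- incidence(factor(fb$Row))
Z_col <- incidence(factor(fb$Column))

# Marginal covariance of the observations: one term per random effect plus
# an IID residual term.
V <- varcomp$sigma_rep2 * tcrossprod(Z_rep) +
     varcomp$sigma_ib2  * tcrossprod(Z_ib) +
     varcomp$sigma_r2   * tcrossprod(Z_row) +
     varcomp$sigma_c2   * tcrossprod(Z_col) +
     varcomp$sigma_e2   * diag(nrow(fb))

V[1, 2]  # same replicate, incomplete block and column: 0.1 + 0.05 + 0.02
V[1, 6]  # same replicate only: 0.1
```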


optimize_alpha_rc() — optimisation

Wraps alpha_rc_stream() and evaluate_alpha_efficiency() in an optimisation loop with three search strategies:

| Method | Description | Best for |
| --- | --- | --- |
| RS (Random Restart) | Generate n_restarts independent designs, return the best | Quick exploration, guaranteed validity |
| SA (Simulated Annealing) | Iterative entry-permutation swaps with temperature-governed acceptance | Escaping local optima |
| GA (Genetic Algorithm) | Population of permutations evolved via OX1 crossover, swap mutation, and elitism | Thorough global search |

All three methods preserve all structural constraints by construction. Supports A, D, both, and CDmean criteria.
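The two permutation operators named above can be sketched generically. Neither function below is taken from the package: `simulated_annealing()` and `ox1_crossover()` are hypothetical names, the geometric cooling schedule and its defaults are assumptions, and the toy score stands in for a design criterion (lower is better).

```r
# Simulated annealing over a permutation, using pairwise swaps and the
# Metropolis acceptance rule (a sketch, not optimize_alpha_rc() itself).
simulated_annealing <- function(init, score, max_iter = 500,
                                temp0 = 1, cooling = 0.99) {
  current <- init
  current_score <- score(current)
  best <- current
  best_score <- current_score
  temp <- temp0
  for (i in seq_len(max_iter)) {
    idx <- sample(length(current), 2)        # pick two positions
    candidate <- current
    candidate[idx] <- candidate[rev(idx)]    # swap them
    delta <- score(candidate) - current_score
    # Always accept improvements; accept worse moves with probability
    # exp(-delta / temp), which shrinks as the temperature falls.
    if (delta <= 0 || runif(1) < exp(-delta / temp)) {
      current <- candidate
      current_score <- current_score + delta
      if (current_score < best_score) {
        best <- current
        best_score <- current_score
      }
    }
    temp <- temp * cooling                   # geometric cooling (assumed)
  }
  list(best = best, best_score = best_score)
}

# Order crossover (OX1): copy a slice from parent 1, then fill the remaining
# positions in the order the missing values appear in parent 2, starting
# after the slice and wrapping around.
ox1_crossover <- function(p1, p2) {
  n <- length(p1)
  cut <- sort(sample(n, 2))
  child <- rep(NA, n)
  child[cut[1]:cut[2]] <- p1[cut[1]:cut[2]]
  after_cut <- seq_len(n)[-seq_len(cut[2])]        # positions after the slice
  order_p2 <- p2[c(after_cut, seq_len(cut[2]))]    # parent 2, wrapped
  fill <- order_p2[!order_p2 %in% child]
  child[c(after_cut, seq_len(cut[1] - 1))] <- fill
  child
}

set.seed(42)
toy_score <- function(x) sum(abs(diff(x)) == 1)
sa <- simulated_annealing(1:8, toy_score, max_iter = 500)
child <- ox1_crossover(sample(8), sample(8))
```

Both operators only rearrange an existing permutation, so the result always contains each entry exactly once; this is the sense in which permutation-based moves keep a layout structurally valid.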


Key Differences Between Design Families

| Feature | prep_famoptg family | alpha_rc_stream family |
| --- | --- | --- |
| Blocking structure | Flat blocks | Replicates → incomplete blocks |
| Replication | Flexible per-entry | Uniform across entries |
| Design types | Augmented, p-rep, RCBD-type | Alpha-lattice |
| Block variance | sigma_b2 | sigma_rep2 + sigma_ib2 |
| Optimisation methods | RS only | RS, SA, GA |
| P-rep constraint | Enforced | Not applicable |

Grouping Options

Both families support three grouping strategies for adjacency scoring within blocks and genomic dispersion:

| Strategy | Source | Best for |
| --- | --- | --- |
| Family-based | User-defined labels | Simple, interpretable grouping |
| GRM-based | Genomic similarity matrix | Captures real genomic relationships |
| Pedigree-based | A matrix | When genomic data is unavailable |

Efficiency Criteria

| Criterion | Meaning | Direction | Available for |
| --- | --- | --- | --- |
| A-criterion | Mean pairwise contrast variance (fixed) or mean PEV (random) | Lower is better | Both families |
| D-criterion | Geometric mean of contrast covariance eigenvalues | Lower is better | Fixed effects only |
| CDmean | Mean coefficient of determination for GEBV prediction | Higher is better | Random effects + GBLUP/PBLUP |

CDmean is defined as:

$$\text{CDmean} = 1 - \frac{\text{mean PEV}}{\sigma_g^2}$$

and directly measures the expected reliability of genomic prediction. It is particularly useful for optimising training population designs in genomic selection (Rincent et al. 2012).
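The formula can be worked through on a toy example. The sketch below obtains the prediction error variances (PEV) from Henderson's mixed model equations for a model with an overall mean, random genetic effects with covariance sigma_g2 * K, and IID residuals. The relationship matrix, the variance values, and the two-plots-per-genotype layout are illustrative assumptions; this is not the package's code.

```r
# Toy relationship matrix for 4 genotypes (0.25 between, 1 on the diagonal).
ids <- paste0("G", 1:4)
K <- matrix(0.25, 4, 4, dimnames = list(ids, ids))
diag(K) <- 1

sigma_g2 <- 0.4  # illustrative genetic variance
sigma_e2 <- 0.6  # illustrative residual variance

# Each genotype is observed on two plots.
geno <- factor(rep(ids, times = 2), levels = ids)
X <- matrix(1, nrow = length(geno), ncol = 1)  # overall mean
Z <- model.matrix(~ 0 + geno)                  # genotype incidence

# Coefficient matrix of the mixed model equations.
lambda <- sigma_e2 / sigma_g2
C <- rbind(
  cbind(crossprod(X),    crossprod(X, Z)),
  cbind(crossprod(Z, X), crossprod(Z) + lambda * solve(K))
)

# PEV of each genetic effect: sigma_e2 times the matching diagonal element
# of the inverse coefficient matrix.
g_idx <- ncol(X) + seq_len(ncol(Z))
PEV <- sigma_e2 * diag(solve(C))[g_idx]

# CDmean as defined above (this scaling treats diag(K) as 1).
CDmean <- 1 - mean(PEV) / sigma_g2
CDmean
```

More plots per genotype, or closer relationships between genotypes, lower the PEV and so raise CDmean.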


Typical Workflows

Repeated-check block design

library(OptiDesign)

# 1. Construct
design <- prep_famoptg(
  check_treatments        = checks,
  check_families          = check_fam,
  p_rep_treatments        = prep_trts,
  p_rep_reps              = rep(2L, length(prep_trts)),
  p_rep_families          = prep_fam,
  unreplicated_treatments = unrep_trts,
  unreplicated_families   = unrep_fam,
  n_blocks = 5, n_rows = 15, n_cols = 20
)

# 2. Evaluate
eff <- evaluate_famoptg_efficiency(
  field_book         = design$field_book,
  n_rows             = 15, n_cols = 20,
  check_treatments   = checks,
  treatment_effect   = "fixed",
  residual_structure = "AR1xAR1",
  rho_row = 0.10, rho_col = 0.10
)
eff$A_criterion   # lower is better
eff$D_criterion

# 3. Or optimise directly (returns best of 50 random designs)
opt <- optimize_famoptg(
  check_treatments        = checks,
  check_families          = check_fam,
  p_rep_treatments        = prep_trts,
  p_rep_reps              = rep(2L, length(prep_trts)),
  p_rep_families          = prep_fam,
  unreplicated_treatments = unrep_trts,
  unreplicated_families   = unrep_fam,
  n_blocks           = 5, n_rows = 15, n_cols = 20,
  treatment_effect   = "fixed",
  residual_structure = "AR1xAR1",
  rho_row = 0.10, rho_col = 0.10,
  criterion = "A", n_restarts = 50
)
opt$optimization$best_score
opt$optimization$score_history

Alpha row-column stream design

library(OptiDesign)

# 1. Construct
design <- alpha_rc_stream(
  check_treatments = checks,
  check_families   = check_fam,
  entry_treatments = entries,
  entry_families   = entry_fam,
  n_reps = 3, n_rows = 30, n_cols = 20,
  min_block_size = 19, max_block_size = 20
)

# 2. Evaluate
eff <- evaluate_alpha_efficiency(
  field_book         = design$field_book,
  n_rows = 30, n_cols = 20,
  check_treatments   = checks,
  treatment_effect   = "fixed",
  residual_structure = "AR1xAR1",
  rho_row = 0.10, rho_col = 0.10
)
eff$A_criterion
eff$D_criterion

# 3. Or optimise directly using Simulated Annealing
opt <- optimize_alpha_rc(
  check_treatments   = checks,
  check_families     = check_fam,
  entry_treatments   = entries,
  entry_families     = entry_fam,
  n_reps = 3, n_rows = 30, n_cols = 20,
  min_block_size     = 19, max_block_size = 20,
  treatment_effect   = "fixed",
  residual_structure = "AR1xAR1",
  rho_row = 0.10, rho_col = 0.10,
  method = "SA", criterion = "A",
  n_restarts = 5, sa_max_iter = 500
)
opt$optimization$best_score

Genomic prediction optimisation (CDmean)

library(OptiDesign)

# Maximise CDmean for genomic selection training population
opt_cdmean <- optimize_alpha_rc(
  check_treatments   = checks,
  check_families     = check_fam,
  entry_treatments   = entries,
  entry_families     = entry_fam,
  n_reps = 3, n_rows = 30, n_cols = 20,
  min_block_size     = 19, max_block_size = 20,
  treatment_effect   = "random",
  prediction_type    = "GBLUP",
  K                  = my_kinship_matrix,
  varcomp            = list(
    sigma_g2   = 0.4, sigma_e2   = 0.6,
    sigma_rep2 = 0.1, sigma_ib2  = 0.05,
    sigma_r2   = 0.02, sigma_c2  = 0.02
  ),
  method    = "GA",
  criterion = "CDmean",
  ga_pop_size = 20, ga_n_generations = 50
)
opt_cdmean$efficiency$CDmean        # mean GEBV prediction reliability
opt_cdmean$optimization$best_score  # positive CDmean, higher is better

Installation

Install from GitHub with vignettes (recommended):

install.packages("remotes")
remotes::install_github("FAkohoue/OptiDesign",
  build_vignettes = TRUE,
  dependencies    = TRUE
)

Install without vignettes for a faster install:

remotes::install_github("FAkohoue/OptiDesign",
  build_vignettes = FALSE,
  dependencies    = TRUE
)

Documentation

Full documentation, function reference, and tutorials are available at:

https://FAkohoue.github.io/OptiDesign/

To read the vignette after installation:

vignette("OptiDesign-introduction", package = "OptiDesign")

Citation

If you use OptiDesign in published research, please cite:

Akohoue, F. (2026).
OptiDesign: Optimized Experimental Field Design for Plant Breeding.
R package version 0.1.0.
https://github.com/FAkohoue/OptiDesign

If you use CDmean-based optimisation, additionally cite:

Rincent, R., Laloë, D., Nicolas, S., et al. (2012).
Maximizing the reliability of genomic selection by optimizing the calibration
set of reference individuals. Genetics, 192(2), 715–728.
https://doi.org/10.1534/genetics.112.141473

Contributing

Issues, bug reports, and feature suggestions are welcome: https://github.com/FAkohoue/OptiDesign/issues


License

MIT License © Félicien Akohoue