Search for a criterion-optimal alpha-lattice design
Source:R/optimize_alpha_rc.R
optimize_alpha_rc.Rdoptimize_alpha_rc() wraps alpha_rc_stream() and
evaluate_alpha_efficiency() in an optimisation loop that searches for
the design with the best optimality criterion value among many candidate
randomisations. Three search strategies are available:
RS (Random Restart): generate
n_restartsindependent designs and return the best by criterion value.SA (Simulated Annealing): start from a random design and iteratively propose entry permutation swaps, accepting improvements always and degradations with probability \(\exp(-\Delta / T)\), where \(T\) cools from
sa_temp_starttosa_temp_endoversa_max_iteriterations. Repeated acrossn_restartsindependent restarts.GA (Genetic Algorithm): maintain a population of
ga_pop_sizeentry permutations and evolve it using tournament selection, Order Crossover (OX1), random swap mutation, and elitism.
Constraint preservation: all candidate designs are generated by
alpha_rc_stream(), so family/cluster adjacency structure, genetic
dispersion, check placement, and alpha-lattice block balance are preserved
by construction in all three methods - no penalty terms are required.
Integrity guarantee: every candidate design passes a structural integrity check before being scored or stored as the best. The running best is only updated when both the criterion score improves and a re-check passes. A final integrity check is performed on the stored best before returning to the user. If no valid design is found after all iterations, an emergency fallback returns a single freshly constructed valid design.
Usage
optimize_alpha_rc(
check_treatments,
check_families,
entry_treatments,
entry_families,
n_reps,
n_rows,
n_cols,
order = "column",
serpentine = FALSE,
seed = NULL,
attempts = 5000,
warn_and_correct = TRUE,
fix_rows = TRUE,
cluster_source = c("Family", "GRM", "A"),
GRM = NULL,
A = NULL,
id_map = NULL,
cluster_method = c("kmeans", "hclust"),
cluster_seed = 1,
cluster_attempts = 25,
n_pcs_use = Inf,
n_blocks_per_rep = NULL,
min_block_size = NULL,
max_block_size = NULL,
check_placement = c("systematic", "random"),
check_position_pattern = c("spread", "corners_first"),
use_dispersion = FALSE,
dispersion_source = c("K", "A", "GRM"),
dispersion_radius = 1,
dispersion_iters = 2000,
dispersion_seed = 1,
K = NULL,
line_id_map = NULL,
treatment_effect = c("random", "fixed"),
prediction_type = c("IID", "GBLUP", "PBLUP", "none"),
check_as_fixed = TRUE,
residual_structure = c("IID", "AR1", "AR1xAR1"),
rho_row = 0,
rho_col = 0,
varcomp = list(sigma_e2 = 1, sigma_g2 = 1, sigma_rep2 = 1, sigma_ib2 = 1, sigma_r2 = 1,
sigma_c2 = 1),
spatial_engine = c("auto", "sparse", "dense"),
dense_max_n = 5000,
eff_trace_samples = 80,
eff_full_max = 400,
method = c("RS", "SA", "GA"),
criterion = c("A", "D", "both", "CDmean"),
n_restarts = 50L,
sa_max_iter = 1000L,
sa_temp_start = 1,
sa_temp_end = 0.001,
sa_cooling = c("exponential", "linear"),
sa_swap_scope = c("global", "within_rep", "within_block"),
ga_pop_size = 20L,
ga_n_generations = 50L,
ga_crossover_rate = 0.7,
ga_mutation_rate = 0.2,
ga_elitism = 2L,
max_failure_rate_rs = 0.5,
max_failure_rate_sa = 0.2,
ga_invalid_retries = 3L,
verbose_opt = TRUE
)Arguments
- check_treatments, check_families, entry_treatments, entry_families, n_reps, n_rows, n_cols, order, serpentine, seed, attempts, warn_and_correct, fix_rows, cluster_source, GRM, A, id_map, cluster_method, cluster_seed, cluster_attempts, n_pcs_use, n_blocks_per_rep, min_block_size, max_block_size, check_placement, check_position_pattern, use_dispersion, dispersion_source, dispersion_radius, dispersion_iters, dispersion_seed, K, line_id_map
Passed directly to
alpha_rc_stream(). See that function for full documentation.- treatment_effect, prediction_type, check_as_fixed, residual_structure, rho_row, rho_col, varcomp, spatial_engine, dense_max_n, eff_trace_samples, eff_full_max
Passed directly to
evaluate_alpha_efficiency(). See that function for full documentation.- method
Character. Optimisation method:
"RS"Random Restart. Generates
n_restartsindependent designs and returns the one with the best criterion score. Simple, parallelisable, and robust - every candidate is a fully valid design by construction."SA"Simulated Annealing. Iteratively proposes entry permutation swaps, accepting improvements always and degradations with a temperature-governed probability. Repeated across
n_restartsindependent restarts. Better at escaping local optima than pure random restart."GA"Genetic Algorithm. Maintains a population of entry permutations and evolves it using OX1 crossover, swap mutation, and elitism. Most powerful global search but also most computationally intensive.
- criterion
Character. Optimality criterion to drive the search:
"A"Minimise
A_criterion(mean pairwise contrast variance for fixed effects; mean PEV for random effects). Valid for bothtreatment_effect = "fixed"and"random". Recommended default."D"Minimise
D_criterion(geometric mean of contrast covariance eigenvalues). Fixed effects only. Falls back toA_criterionwith a warning for random effects."both"Minimise the mean of
A_criterionandD_criterion. Fixed effects only. Falls back toA_criterionwhen D is unavailable."CDmean"Maximise CDmean (mean coefficient of determination for GEBV prediction; Rincent et al. 2012). Requires
treatment_effect = "random"andprediction_type %in% c("IID", "GBLUP", "PBLUP"). Best suited for optimising training population designs for genomic selection.
- n_restarts
Positive integer. Number of independent random restarts. Used by all three methods: RS runs
n_restartsfull designs; SA and GA each restart their searchn_restartstimes from a new random initial design.- sa_max_iter
Positive integer. Number of swap proposals per SA restart. Each proposal that produces a valid design undergoes Metropolis acceptance; invalid proposals are counted but do not affect the acceptance logic.
- sa_temp_start
Positive numeric. Initial SA temperature \(T_0\). Higher values allow more degrading moves to be accepted early in the search, improving escape from local optima at the cost of less directed search.
- sa_temp_end
Positive numeric. Final SA temperature. Must be strictly less than
sa_temp_start. Low values ensure the search converges to a local optimum by the end of each restart.- sa_cooling
Character. SA cooling schedule:
"exponential"Geometric decay: \(T_k = T_0 \cdot (T_\text{end}/T_0)^{k/\text{max\_iter}}\). Recommended - spends more iterations at low temperature where refinement occurs.
"linear"Linear decay from
sa_temp_starttosa_temp_end.
- sa_swap_scope
Character. Scope of entry swap proposals during SA:
"global"Any two non-check entries anywhere in the design may be swapped. Has the largest impact on the criterion because it changes which entries co-occur in blocks across replicates. Recommended for most use cases.
"within_rep"Swaps restricted to entries within the same replicate. Preserves rep-level balance but limits criterion improvement compared with global swaps.
"within_block"Swaps restricted to entries within the same incomplete block. Has minimal criterion impact under IID residuals because the C matrix is invariant to within-block reordering. Only marginally useful under strong spatial correlation (AR1, AR1xAR1).
- ga_pop_size
Positive integer \(\geq 4\). Number of individuals in the GA population. Larger populations improve coverage of the design space but increase the cost per generation linearly.
- ga_n_generations
Positive integer. Number of GA generations per run. Each generation evaluates up to
ga_pop_size - ga_elitismnew offspring.- ga_crossover_rate
Numeric in \([0, 1]\). Probability of applying OX1 crossover to produce an offspring. When crossover is not applied, the first parent's permutation is passed directly to mutation. Values between 0.6 and 0.9 are typical.
- ga_mutation_rate
Numeric in \([0, 1]\). Probability of applying a random swap mutation to an offspring permutation. Provides diversity; values between 0.1 and 0.3 are typical.
- ga_elitism
Non-negative integer strictly less than
ga_pop_size. Number of best individuals copied unchanged to the next generation. Prevents loss of the best solution found so far. Values of 1 or 2 are usually sufficient.- max_failure_rate_rs
Numeric in \([0, 1]\). Maximum tolerated fraction of RS restarts that may fail construction or integrity checking before the function stops with a diagnostic error. Default
0.5(stop if more than half of restarts fail). Below this threshold a warning is issued instead. Increase for very constrained field geometries where some failure is expected.- max_failure_rate_sa
Numeric in \([0, 1]\). Maximum tolerated fraction of SA swap proposals per restart that may produce invalid designs before a per-restart warning is issued. Default
0.2. This does not stop the function - it is informational. Invalid swaps are counted but treated as neutral (the current design is retained without invoking the Metropolis criterion).- ga_invalid_retries
Non-negative integer. Number of additional random permutation attempts made when a GA offspring fails integrity checking. Default
3. If all retries also fail, the population slot receivesInffitness and participates in no future tournament selection.- verbose_opt
Logical. If
TRUE, prints per-restart or per-generation progress messages including current score, running best, and failure counts. Set toFALSEfor batch or parallel runs.
Value
The return value of alpha_rc_stream() for the best valid design
found, augmented with two additional top-level components:
efficiencyThe full output of
evaluate_alpha_efficiency()for the best design, including all applicable criterion values:A_criterion,D_criterion,A_efficiency,D_efficiency,CDmean, andCD_per_line.optimizationNamed list of optimisation metadata:
methodCharacter.
"RS","SA", or"GA".criterionCharacter. As supplied.
best_scoreNumeric. Best criterion value found, reported in the natural direction for the user. For
criterion = "A","D", or"both": lower is better. Forcriterion = "CDmean": higher is better (positive value, negation is internal only).score_historyFor RS and GA: numeric vector of length
n_restartsorga_n_generationsrespectively, tracking the best criterion value at each step.NAentries indicate failed iterations. For SA: a list ofn_restartsnumeric vectors, each of lengthsa_max_iter, tracking the current score at each iteration within that restart. Forcriterion = "CDmean", all values are reported as positive CDmean.master_seedInteger. The master random seed used.
n_failedInteger. Number of failed candidates (RS: failed restarts; SA: failed initial designs across restarts; GA: total invalid offspring requiring retry or fallback to
Inffitness).NAif not tracked for the method.n_restartsInteger or
NULL. The value ofn_restarts. RS only;NULLfor SA and GA.sa_paramsNamed list or
NULL. SA parameters:n_restarts,max_iter,temp_start,temp_end,cooling,swap_scope. SA only.ga_paramsNamed list or
NULL. GA parameters:pop_size,n_generations,crossover_rate,mutation_rate,elitism,invalid_retries. GA only.
If no valid optimised design is found after all iterations, a warning is issued and the function returns a single freshly constructed valid design via the emergency fallback (see Integrity checking strategy above). If even the emergency fallback fails after 10 attempts, the function stops with an informative error message.
Details
Optimisation target
All methods minimise an internal score (lower = better internally):
criterion | Internal score | Direction |
"A" | A_criterion | Lower is better |
"D" | D_criterion | Lower is better |
"both" | Mean of A_criterion and D_criterion | Lower is better |
"CDmean" | Negated CDmean | Higher CDmean is better |
For "CDmean", the internal negation is transparent to the user:
best_score and score_history in the output are always reported as
positive CDmean values (higher = better). Requires
treatment_effect = "random".
Entry permutation encoding
For SA and GA, designs are encoded as permutations of
seq_along(entry_treatments). A permutation is applied by passing
entry_treatments[perm] and entry_families[perm] to
alpha_rc_stream(), which then randomises and allocates entries
according to its internal logic. This means the permutation influences the
treatment-to-block assignment while alpha_rc_stream() handles all
within-block arrangement, check placement, and dispersion.
Integrity checking strategy
Three structural constraints are verified for every candidate design:
No entry appears more than once within any replicate.
All expected entry treatments are present in the field book.
All check treatments are present and correctly flagged as checks.
These checks are performed at four points in the pipeline:
- Point 1 - after construction
Inside
run_pipeline(), immediately afteralpha_rc_stream()returns. Failures discard the candidate before scoring.- Point 2 - before storing as best
Inside
.update_best(), as a second independent check. Prevents a candidate that passed Point 1 from being stored if its state was corrupted between construction and comparison.- Point 3 - final check before return
After the optimisation loop ends, the stored best is re-checked one final time. If it fails, the emergency fallback is triggered.
- Point 4 - emergency fallback
If no valid design is found, up to 10 fresh random constructions are attempted. The first valid one is returned with a warning. If all 10 fail, the function stops with a diagnostic message.
Failure handling by method
RS: failed restarts (construction error or integrity failure) are
counted and reported. If the failure rate exceeds max_failure_rate_rs,
the function stops with a diagnostic message. Below the threshold a warning
is issued.
SA: invalid swap proposals (NULL candidate) are counted per restart but
are treated as neutral events - the current design is retained, the
temperature counter advances, and the Metropolis acceptance logic is
not invoked. This prevents invalid proposals from distorting the
acceptance rate. If the invalid rate within a restart exceeds
max_failure_rate_sa, a per-restart warning is issued.
GA: when an offspring fails integrity, up to ga_invalid_retries fresh
random permutations are tried as replacements. This prevents population
slots from being occupied by Inf-fitness placeholders that waste
tournament selection capacity. The total count of invalid offspring is
reported in a warning and stored in $optimization$n_failed.
Simulated Annealing temperature schedule
"exponential": \(T_k = T_0 \cdot \alpha^k\) where \(\alpha = (T_\text{end} / T_0)^{1 / \text{max\_iter}}\)"linear": \(T_k\) decreases linearly fromsa_temp_starttosa_temp_endoversa_max_itersteps
Genetic Algorithm operators
Selection: tournament of size 2 over valid (finite-fitness) individuals only
Crossover: Order Crossover (OX1) - always produces a valid permutation of
seq_along(entry_treatments)Mutation: random swap of two positions in the permutation
Elitism: top
ga_elitismindividuals copied unchanged to next generation
References
Rincent, R., Laloe, D., Nicolas, S., Altmann, T., Brunel, D., Revilla, P., ... & Moreau, L. (2012). Maximizing the reliability of genomic selection by optimizing the calibration set of reference individuals: comparison of methods in two diverse groups of maize inbreds (Zea mays L.). Genetics, 192(2), 715-728.
Jones, B., Allen-Moyer, K., & Goos, P. (2021). A-optimal versus D-optimal design of screening experiments. Journal of Quality Technology, 53(4), 369-382.
Kirkpatrick, S., Gelatt, C. D., & Vecchi, M. P. (1983). Optimization by simulated annealing. Science, 220(4598), 671-680.
Holland, J. H. (1992). Adaptation in Natural and Artificial Systems. MIT Press.
See also
alpha_rc_stream() for single-design construction without optimisation.
evaluate_alpha_efficiency() for standalone criterion evaluation on an
existing field book.
Examples
## -- Random Restart: fixed effects, A-criterion -----------------------------
result_rs <- optimize_alpha_rc(
check_treatments = c("CHK1", "CHK2", "CHK3"),
check_families = c("CHECK", "CHECK", "CHECK"),
entry_treatments = paste0("G", 1:167),
entry_families = rep(paste0("F", 1:7), length.out = 167),
n_reps = 3,
n_rows = 30,
n_cols = 20,
min_block_size = 19,
max_block_size = 20,
treatment_effect = "fixed",
residual_structure = "AR1xAR1",
rho_row = 0.10,
rho_col = 0.10,
method = "RS",
criterion = "A",
n_restarts = 50
)
#>
#> [optimize_alpha_rc] method = 'RS' | criterion = 'A' | master seed = 554108441
#> [RS] 50 restarts | criterion = 'A'
#> [RS] 1/50 | score = 0.746080 | best = 0.746080
#> [RS] 2/50 | score = 0.746095 | best = 0.746080
#> [RS] 3/50 | score = 0.744183 | best = 0.744183
#> [RS] 4/50 | score = 0.744390 | best = 0.744183
#> [RS] 5/50 | score = 0.747536 | best = 0.744183
#> [RS] 6/50 | score = 0.745166 | best = 0.744183
#> [RS] 7/50 | score = 0.748252 | best = 0.744183
#> [RS] 8/50 | score = 0.746552 | best = 0.744183
#> [RS] 9/50 | score = 0.745492 | best = 0.744183
#> [RS] 10/50 | score = 0.746225 | best = 0.744183
#> [RS] 11/50 | score = 0.743898 | best = 0.743898
#> [RS] 12/50 | score = 0.745469 | best = 0.743898
#> [RS] 13/50 | score = 0.746307 | best = 0.743898
#> [RS] 14/50 | score = 0.745892 | best = 0.743898
#> [RS] 15/50 | score = 0.745473 | best = 0.743898
#> [RS] 16/50 | score = 0.746171 | best = 0.743898
#> [RS] 17/50 | score = 0.748523 | best = 0.743898
#> [RS] 18/50 | score = 0.746380 | best = 0.743898
#> [RS] 19/50 | score = 0.746538 | best = 0.743898
#> [RS] 20/50 | score = 0.747267 | best = 0.743898
#> [RS] 21/50 | score = 0.746706 | best = 0.743898
#> [RS] 22/50 | score = 0.746524 | best = 0.743898
#> [RS] 23/50 | score = 0.747023 | best = 0.743898
#> [RS] 24/50 | score = 0.745524 | best = 0.743898
#> [RS] 25/50 | score = 0.747626 | best = 0.743898
#> [RS] 26/50 | score = 0.744816 | best = 0.743898
#> [RS] 27/50 | score = 0.745067 | best = 0.743898
#> [RS] 28/50 | score = 0.746952 | best = 0.743898
#> [RS] 29/50 | score = 0.745957 | best = 0.743898
#> [RS] 30/50 | score = 0.745420 | best = 0.743898
#> [RS] 31/50 | score = 0.744651 | best = 0.743898
#> [RS] 32/50 | score = 0.747181 | best = 0.743898
#> [RS] 33/50 | score = 0.745141 | best = 0.743898
#> [RS] 34/50 | score = 0.747325 | best = 0.743898
#> [RS] 35/50 | score = 0.744570 | best = 0.743898
#> [RS] 36/50 | score = 0.746746 | best = 0.743898
#> [RS] 37/50 | score = 0.744129 | best = 0.743898
#> [RS] 38/50 | score = 0.747262 | best = 0.743898
#> [RS] 39/50 | score = 0.746573 | best = 0.743898
#> [RS] 40/50 | score = 0.743288 | best = 0.743288
#> [RS] 41/50 | score = 0.743270 | best = 0.743270
#> [RS] 42/50 | score = 0.747472 | best = 0.743270
#> [RS] 43/50 | score = 0.747826 | best = 0.743270
#> [RS] 44/50 | score = 0.745486 | best = 0.743270
#> [RS] 45/50 | score = 0.745246 | best = 0.743270
#> [RS] 46/50 | score = 0.745594 | best = 0.743270
#> [RS] 47/50 | score = 0.743258 | best = 0.743258
#> [RS] 48/50 | score = 0.746165 | best = 0.743258
#> [RS] 49/50 | score = 0.744582 | best = 0.743258
#> [RS] 50/50 | score = 0.745752 | best = 0.743258
#>
#> [optimize_alpha_rc] Done. Best A = 0.743258 (lower is better)
result_rs$efficiency$A_criterion # best A-criterion found
#> [1] 0.7432582
result_rs$optimization$best_score # same value, lower is better
#> [1] 0.7432582
result_rs$optimization$score_history # per-restart scores
#> [1] 0.7460799 0.7460952 0.7441834 0.7443899 0.7475358 0.7451657 0.7482523
#> [8] 0.7465523 0.7454921 0.7462245 0.7438977 0.7454687 0.7463071 0.7458923
#> [15] 0.7454729 0.7461707 0.7485230 0.7463804 0.7465380 0.7472674 0.7467064
#> [22] 0.7465238 0.7470233 0.7455243 0.7476257 0.7448157 0.7450670 0.7469525
#> [29] 0.7459570 0.7454197 0.7446512 0.7471813 0.7451407 0.7473252 0.7445698
#> [36] 0.7467458 0.7441288 0.7472623 0.7465734 0.7432878 0.7432697 0.7474718
#> [43] 0.7478260 0.7454857 0.7452461 0.7455942 0.7432582 0.7461649 0.7445821
#> [50] 0.7457524
result_rs$optimization$n_failed # number of failed restarts
#> [1] 0
if (FALSE) { # \dontrun{
## -- Simulated Annealing: random effects, CDmean ----------------------------
result_sa <- optimize_alpha_rc(
check_treatments = c("CHK1", "CHK2", "CHK3"),
check_families = c("CHECK", "CHECK", "CHECK"),
entry_treatments = paste0("G", 1:167),
entry_families = rep(paste0("F", 1:7), length.out = 167),
n_reps = 3,
n_rows = 30,
n_cols = 20,
min_block_size = 19,
max_block_size = 20,
treatment_effect = "random",
prediction_type = "GBLUP",
K = my_kinship_matrix,
varcomp = list(
sigma_g2 = 0.4, sigma_e2 = 0.6,
sigma_rep2 = 0.1, sigma_ib2 = 0.05,
sigma_r2 = 0.02, sigma_c2 = 0.02
),
method = "SA",
criterion = "CDmean",
n_restarts = 5,
sa_max_iter = 500,
sa_temp_start = 0.05,
sa_temp_end = 0.001,
sa_cooling = "exponential",
sa_swap_scope = "global",
max_failure_rate_sa = 0.3
)
result_sa$efficiency$CDmean # mean prediction reliability [0, 1]
result_sa$optimization$best_score # positive CDmean, higher is better
result_sa$optimization$n_failed # failed SA initial designs
## -- Genetic Algorithm: fixed effects, D-criterion --------------------------
result_ga <- optimize_alpha_rc(
check_treatments = c("CHK1", "CHK2", "CHK3"),
check_families = c("CHECK", "CHECK", "CHECK"),
entry_treatments = paste0("G", 1:167),
entry_families = rep(paste0("F", 1:7), length.out = 167),
n_reps = 3,
n_rows = 30,
n_cols = 20,
min_block_size = 19,
max_block_size = 20,
treatment_effect = "fixed",
residual_structure = "AR1xAR1",
rho_row = 0.10,
rho_col = 0.10,
method = "GA",
criterion = "D",
n_restarts = 3,
ga_pop_size = 20,
ga_n_generations = 50,
ga_crossover_rate = 0.7,
ga_mutation_rate = 0.2,
ga_elitism = 2,
ga_invalid_retries = 3
)
result_ga$efficiency$D_criterion # best D-criterion, lower is better
result_ga$optimization$score_history # per-generation best D-criterion
result_ga$optimization$n_failed # total invalid offspring
} # }