OptiSparseMET 0.1.0
Initial release
OptiSparseMET introduces a unified framework for sparse multi-environment trial (MET) design, jointly addressing treatment allocation across environments and within-environment field design under shared statistical, genetic, and logistical constraints.
The package links allocation structure, genetic connectivity, seed availability, and spatial design assumptions in a single reproducible workflow compatible with mixed-model inference.
Across-environment allocation
- Added allocate_sparse_met() for constructing treatment-by-environment incidence matrices using sparse testing principles.
  - Supports "random_balanced" (M3) for flexible approximate balance with coverage-first guarantees.
  - Supports "balanced_incomplete" (M4) for BIBD-inspired uniform replication structure with enforced equal replication and equal environment sizes across the trial.
  - Accepts "M3" and "M4" as convenient aliases.
  - Supports unequal environment capacities under random_balanced.
  - Supports common treatments to ensure design-based connectivity across environments.
  - Returns $allocation_matrix (binary treatment-by-environment incidence matrix) from which pairwise co-occurrence can be computed post hoc as out$allocation_matrix %*% t(out$allocation_matrix).
- Added check_balanced_incomplete_feasibility() to verify the slot identity (J* × r = I × k*) before attempting balanced incomplete allocation, confirming that equal replication is achievable for the chosen dimensions.
- Added derive_allocation_groups() to construct grouping structures from:
  - family membership labels
  - genomic relationship matrix (GRM)
  - pedigree relationship matrix (A matrix)
  Group-guided allocation improves genetic connectedness and stability of cross-environment inference.
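The co-occurrence product mentioned for $allocation_matrix can be illustrated with a small base R sketch. The matrix below is a hand-made toy, not output from allocate_sparse_met(), and the object names (alloc, cooc, disconnected) are chosen here for illustration only.

```r
# Toy binary incidence matrix: 4 treatments (rows) x 3 environments (columns).
# Values are illustrative only, not produced by the package.
alloc <- matrix(
  c(1, 1, 0,
    0, 0, 1,
    0, 1, 1,
    1, 1, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("T", 1:4), paste0("E", 1:3))
)

# Treatment-by-treatment co-occurrence: entry [i, j] counts the environments
# shared by treatments i and j; the diagonal holds each treatment's replication.
cooc <- alloc %*% t(alloc)

# Treatment pairs that never share an environment (no direct comparison).
disconnected <- which(cooc == 0 & upper.tri(cooc), arr.ind = TRUE)
```

In this toy layout T1 and T2 never meet, so disconnected has a single row. Pairs like this can only be compared indirectly through treatments they share with others, which is the situation common treatments are meant to mitigate.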
Capacity and feasibility helpers
- Added suggest_safe_k() to propose a safe uniform value for n_test_entries_per_environment given treatment count, environment count, common treatments, and a user-defined buffer.
- Added min_k_for_full_coverage() to compute the minimum per-environment capacity required for every non-common treatment to be assigned at least once.
- Added warn_if_k_too_small() to provide a non-fatal diagnostic warning when the chosen capacity is insufficient for full treatment coverage.
These helpers catch the most common failure mode, a capacity too small to assign all treatments, before allocate_sparse_met() is called.
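The coverage requirement behind these helpers reduces to simple arithmetic, sketched below in base R. This is not the package's implementation: min_k_sketch() is a hypothetical function, and it assumes that the capacity k counts only non-common (test) entries and is the same in every environment.

```r
# Hypothetical sketch: smallest uniform capacity k such that the I * k test
# slots can hold every non-common treatment at least once.
# Assumption: k excludes the slots used by common treatments.
min_k_sketch <- function(n_treatments, n_environments, n_common = 0) {
  n_noncommon <- n_treatments - n_common
  ceiling(n_noncommon / n_environments)
}

k_min <- min_k_sketch(n_treatments = 120, n_environments = 4, n_common = 8)

# Compare a candidate capacity against the coverage minimum.
k_chosen <- 25
covers_all <- k_chosen >= k_min
```

With 112 non-common treatments spread over 4 environments, at least 28 test entries per environment are needed, so a capacity of 25 cannot cover every treatment.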
Seed-aware replication planning
- Added assign_replication_by_seed() to determine feasible replication levels based on available seed quantities and per-plot seed requirements.
  - Supports "augmented", "p_rep", and "rcbd_type" replication modes.
  - Supports shortage_action values "error", "downgrade", and "exclude" for handling treatments with insufficient seed.
  - Returns a role-partitioned list (p_rep_treatments, unreplicated_treatments, excluded_treatments) suitable for direct input to met_prep_famoptg().
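The idea of seed-limited replication can be sketched in base R as follows. The data frame, its column names, and the downgrade-style rule are assumptions made for illustration; the actual arguments and return structure of assign_replication_by_seed() may differ.

```r
# Illustrative seed inventory; column names are hypothetical.
seed <- data.frame(
  treatment = c("G1", "G2", "G3", "G4"),
  seed_available = c(100, 45, 20, 5),
  stringsAsFactors = FALSE
)
seed_per_plot <- 20
desired_reps <- 2

# Maximum number of plots each treatment can fill with the seed on hand.
seed$max_reps <- floor(seed$seed_available / seed_per_plot)

# A downgrade-style rule: replicate where possible, fall back to a single
# plot otherwise, and exclude treatments that cannot fill even one plot.
seed$role <- ifelse(seed$max_reps >= desired_reps, "p_rep",
             ifelse(seed$max_reps >= 1, "unreplicated", "excluded"))

# Partition treatment labels by role.
roles <- split(seed$treatment, seed$role)
```

Here G1 and G2 have enough seed for two plots, G3 is downgraded to a single plot, and G4 is excluded.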
Within-environment field design engines
Block-based repeated-check designs
- Added met_prep_famoptg() for constructing augmented, partially replicated (p-rep), and RCBD-type repeated-check block designs.
  Key structural guarantees:
  - Check treatments appear in every block.
  - Replicated (p-rep) treatments appear at most once per block.
  - Unreplicated treatments appear exactly once across the field.
  - Optional genetic dispersion optimisation using GRM or A matrix.
  - Optional within-environment efficiency evaluation (A, D, CDmean).
- Added met_evaluate_famoptg_efficiency() for evaluating the statistical efficiency of met_prep_famoptg() designs under:
  - Fixed or random treatment effects.
  - IID, AR1, or AR1×AR1 residual covariance structures.
  - A-optimality, D-efficiency, CDmean, and mean PEV criteria.
  - Requires sigma_b2 (block variance) in varcomp.
- Added met_optimize_famoptg() for criterion-driven optimisation of met_prep_famoptg() designs via Random Restart.
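The structural guarantees listed for met_prep_famoptg() can be checked on any field book with a few lines of base R. The sketch below uses a hand-made toy layout; the column names Block and Treatment and the treatment labels are assumptions for illustration, not the package's documented output.

```r
# Toy field book: 3 blocks of 4 plots each (hand-made, not package output).
fb <- data.frame(
  Block = rep(1:3, each = 4),
  Treatment = c("C1", "P1", "P2", "U1",
                "C1", "P1", "U2", "U3",
                "C1", "P2", "U4", "U5"),
  stringsAsFactors = FALSE
)

checks <- "C1"
p_rep  <- c("P1", "P2")
unrep  <- paste0("U", 1:5)

# Treatment-by-block plot counts.
counts <- table(fb$Treatment, fb$Block)

# Checks appear in every block.
checks_ok <- all(counts[checks, , drop = FALSE] >= 1)
# Replicated treatments appear at most once per block.
p_rep_ok <- all(counts[p_rep, , drop = FALSE] <= 1)
# Unreplicated treatments appear exactly once across the field.
unrep_ok <- all(rowSums(counts[unrep, , drop = FALSE]) == 1)
```

All three flags are TRUE for this layout; a FALSE value would point to the violated rule.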
Row-column alpha designs
- Added met_alpha_rc_stream() for generating alpha row-column stream designs suitable for large structured fields.
  Key features:
  - Repeated checks in every incomplete block.
  - Configurable block sizes via min_block_size / max_block_size or fixed n_blocks_per_rep.
  - Row-major, column-major, and serpentine traversal orders.
  - Optional genetic grouping from family, GRM, or A matrix.
  - Optional within-environment efficiency evaluation.
- Added met_evaluate_alpha_efficiency() for evaluating the statistical efficiency of met_alpha_rc_stream() designs under the same criteria as met_evaluate_famoptg_efficiency().
  - Requires sigma_rep2 (replicate variance) and sigma_ib2 (incomplete block within replicate variance) in varcomp, distinguishing it from the block-based evaluator.
- Added met_optimize_alpha_rc() for criterion-driven optimisation of met_alpha_rc_stream() designs via Random Restart, Simulated Annealing, or Genetic Algorithm.
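The three traversal orders named for met_alpha_rc_stream() differ only in how plots on a row-by-column grid are sequenced. The base R sketch below shows one plausible reading of each; the grid size, object names, and the choice that even-numbered rows run right to left in the serpentine order are assumptions, not the package's documented behaviour.

```r
n_rows <- 3
n_cols <- 4
grid <- expand.grid(Row = seq_len(n_rows), Column = seq_len(n_cols))

# Row-major: fill each row left to right before moving to the next row.
row_major <- grid[order(grid$Row, grid$Column), ]

# Column-major: fill each column top to bottom before moving to the next.
col_major <- grid[order(grid$Column, grid$Row), ]

# Serpentine: like row-major, but even-numbered rows run right to left,
# so consecutive plots in the stream are always physical neighbours.
serp_key <- ifelse(grid$Row %% 2 == 1, grid$Column, -grid$Column)
serpentine <- grid[order(grid$Row, serp_key), ]
```

In the serpentine stream the column index runs 1 to 4 on the first row, 4 to 1 on the second, and 1 to 4 again on the third.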
Pipeline orchestration
- Added plan_sparse_met_design() providing an end-to-end MET design workflow that integrates:
  - Across-environment allocation via allocate_sparse_met().
  - Per-environment seed feasibility via assign_replication_by_seed().
  - Environment-specific field design via met_prep_famoptg() or met_alpha_rc_stream(), selected by design in env_design_specs.
  - Mixed design strategies across environments in a single call.
  Each environment's design engine is specified via env_design_specs, a named list where design = "met_prep_famoptg" or design = "met_alpha_rc_stream" selects the constructor.
- Added combine_met_fieldbooks() to stack environment-level field books produced by met_prep_famoptg() or met_alpha_rc_stream() into a unified MET-level field book with standard metadata columns (Environment, LocalDesign, ReplicationMode, SparseMethod, IsCommonTreatment). Handles heterogeneous column sets across environments by filling missing columns with NA.
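Stacking field books whose columns differ, as described for combine_met_fieldbooks(), can be sketched in base R. The helper stack_fieldbooks() below is hypothetical and adds only an Environment column; the toy column names (Block, Rep, IBlock) are also assumptions.

```r
# Hypothetical helper: row-bind data frames with differing column sets,
# filling columns that an environment lacks with NA.
stack_fieldbooks <- function(fieldbooks) {
  all_cols <- unique(unlist(lapply(fieldbooks, names)))
  padded <- Map(function(fb, env) {
    for (col in setdiff(all_cols, names(fb))) fb[[col]] <- NA
    fb$Environment <- env
    fb[c("Environment", all_cols)]   # same column order in every piece
  }, fieldbooks, names(fieldbooks))
  do.call(rbind, c(padded, make.row.names = FALSE))
}

fb_list <- list(
  E1 = data.frame(Treatment = c("G1", "G2"), Block = c(1, 2),
                  stringsAsFactors = FALSE),
  E2 = data.frame(Treatment = c("G3", "G4"), Rep = c(1, 1), IBlock = c(1, 2),
                  stringsAsFactors = FALSE)
)
met_fb <- stack_fieldbooks(fb_list)
```

The result has four rows; Block is NA for the E2 rows, and Rep and IBlock are NA for the E1 rows.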
Statistical framework
Implements the sparse testing identity:
$$J \times r = I \times k$$
making explicit the tradeoff between number of treatments (J), number of environments (I), replication depth (r), and treatments per environment (k).
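The tradeoff in the identity J × r = I × k can be explored numerically. The base R sketch below is not the package's feasibility check: slot_identity() is a hypothetical helper, and the extra condition r ≤ I assumes each treatment appears at most once per environment.

```r
# Hypothetical helper: given J treatments, I environments, and k treatments
# per environment, report the implied replication r and whether it is a
# whole number between 1 and I.
slot_identity <- function(n_treatments, n_environments, k_per_environment) {
  total_slots <- n_environments * k_per_environment   # I * k
  whole <- total_slots %% n_treatments == 0            # does J divide I * k?
  r <- total_slots / n_treatments                      # implied replication
  list(
    total_slots = total_slots,
    replication = r,
    feasible = whole && r >= 1 && r <= n_environments
  )
}

balanced   <- slot_identity(120, 4, 60)  # 240 slots over 120 treatments
unbalanced <- slot_identity(120, 4, 50)  # 200 slots over 120 treatments
```

With 120 treatments and 4 environments, 60 treatments per environment gives r = 2 exactly, whereas 50 per environment implies a fractional replication, so equal replication is not achievable.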
Design outputs are compatible with mixed-model analysis:
$$y = X\beta + Zg + \varepsilon$$
where the covariance of g may be proportional to a GRM or pedigree A matrix, and the residual covariance structure may be IID, AR1, or AR1×AR1.
Supports design strategies that improve:
- cross-environment genetic connectivity
- G×E estimation stability
- genomic prediction performance (CDmean criterion; Rincent et al. 2012)
- precision of BLUP estimates
Efficiency diagnostics
Within-environment designs support optional efficiency evaluation, summarized via plan_sparse_met_design() across all environments:
- A-criterion and A-efficiency
- D-criterion and D-efficiency
- Mean prediction error variance (mean PEV)
- CDmean (coefficient of determination for genomic prediction)
Metrics are reported in $efficiency_summary (long format) and $environment_summary (wide format with has_efficiency, eff_A, eff_D, eff_mean_PEV columns).
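As a rough illustration of two of these metrics, the base R sketch below computes mean PEV and a simple CDmean for a toy design. It assumes fixed block effects, random treatment effects with covariance proportional to a relationship matrix K, and IID residuals. The layout, variance components, and object names are invented for the example, and the package's evaluators may define the criteria differently (for instance via contrasts).

```r
# Toy design: 6 plots in 2 blocks, 4 treatments.
block <- factor(c(1, 1, 1, 2, 2, 2))
trt   <- factor(c("A", "B", "C", "A", "D", "B"))

X <- model.matrix(~ block)      # fixed effects: intercept + block
Z <- model.matrix(~ trt - 1)    # treatment incidence matrix
K <- diag(4)                    # relationship matrix (identity in this toy)
sigma_g2 <- 1                   # assumed genetic variance
sigma_e2 <- 1                   # assumed residual variance

# Projection that absorbs the fixed effects.
M <- diag(nrow(X)) - X %*% solve(crossprod(X)) %*% t(X)

# Prediction error variance matrix of the random treatment effects.
C22 <- solve(t(Z) %*% M %*% Z + (sigma_e2 / sigma_g2) * solve(K))
PEV <- sigma_e2 * C22

mean_PEV <- mean(diag(PEV))

# Coefficient of determination per treatment, averaged over treatments.
CDmean <- mean(1 - diag(PEV) / (sigma_g2 * diag(K)))
```

Lower mean PEV and higher CDmean both indicate more precise treatment predictions; with K equal to the identity, CDmean here is simply one minus mean PEV divided by the genetic variance.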
Infrastructure
- Initial package structure with all exports documented via roxygen2.
- Bundled example dataset OptiSparseMET_example_data with 120 treatments, 4 environments, GRM, pedigree A matrix, prediction matrix K, seed availability data, pre-built allocation argument lists, and environment-specific design specifications.
- Vignette describing the statistical framework, two-stage pipeline, and worked examples.
- Unit test suite covering all 13 exported functions plus internal helpers.
- pkgdown configuration for the documentation website.
- GitHub Actions workflows for R CMD check and pkgdown deployment.