Provides a unified block ranking that works across three use cases,
depending on what data the user has available. In all cases,
min_freq filtering inside
build_haplotype_feature_matrix is applied first as a hard
population-level filter – equivalent to MAF filtering for single SNPs.
Blocks are ranked only among those that survive this filter.
Usage
rank_haplotype_blocks(
diversity,
qtl_regions = NULL,
pred_result = NULL,
He_threshold = 0.3,
top_n_blocks = NULL
)
Arguments
- diversity
Data frame from compute_haplotype_diversity. Required for all three use cases.
- qtl_regions
Optional. Data frame from define_qtl_regions. When supplied, blocks are binary-flagged as containing a GWAS hit or not. The p-value is not used for ranking. Default NULL.
- pred_result
Optional. List from run_haplotype_prediction. When supplied, blocks are ranked by Var(local GEBV). Default NULL.
- He_threshold
Minimum He to consider a block diverse enough for haplotype stacking. Default 0.3.
- top_n_blocks
Return only the top n blocks. Default NULL (return all).
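As a rough illustration of how He_threshold relates to the diversity columns, the sketch below computes He and the effective number of alleles from toy haplotype allele frequencies. It assumes the standard definitions He = 1 - sum(p^2) and n_eff_alleles = 1 / sum(p^2), and assumes a block is flagged diverse when He >= He_threshold. The frequencies and block names are made up, and the package may apply a different formula (for example a sample-size correction) or a strict inequality.

```r
# Hypothetical haplotype allele frequencies for two blocks (illustrative only)
freqs <- list(
  block_A = c(0.5, 0.3, 0.2),
  block_B = c(0.95, 0.05)
)
He_threshold <- 0.3

# Assumed definitions, not confirmed against the package source:
# expected heterozygosity and effective number of alleles
He <- vapply(freqs, function(p) 1 - sum(p^2), numeric(1))
n_eff_alleles <- vapply(freqs, function(p) 1 / sum(p^2), numeric(1))

div_toy <- data.frame(
  block_id = names(freqs),
  He = He,
  n_eff_alleles = n_eff_alleles,
  is_diverse = He >= He_threshold,  # ">=" is an assumption
  stringsAsFactors = FALSE
)
div_toy
```

In this toy input, block_A has three reasonably balanced alleles and clears the default threshold, while block_B is dominated by one allele and does not.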
Value
Data frame with one row per block, sorted by evidence strength,
with columns: block_id, CHR, start_bp,
end_bp, n_snps, He, n_eff_alleles,
freq_dominant, sweep_flag, is_diverse,
has_gwas_hit (if qtl_regions supplied),
lead_marker, lead_beta, n_sig_markers
(if qtl_regions supplied),
var_scaled, is_important
(if pred_result supplied),
use_case, rank_score, recommendation.
The three use cases
Genotype only (no GWAS, no phenotype): blocks ranked by haplotype diversity (He, effective number of alleles). Blocks with high diversity have the most potential for haplotype stacking.
Genotype + GWAS (no phenotype): blocks are binary-flagged by whether they contain a GWAS-significant marker. Within the GWAS-hit group and within the non-hit group, blocks are further ordered by He. The p-value is not used for ranking – a marker either crosses the significance threshold or it does not.
Genotype + phenotype (± GWAS): blocks ranked by scaled Var(local GEBV) from run_haplotype_prediction. When GWAS results are also available, the binary GWAS flag is added as a secondary layer via integrate_gwas_haplotypes.
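The ordering described for the genotype + GWAS case can be sketched in base R on a toy data frame. This is not the function's internal code: the data are invented, and the only logic shown is "GWAS-hit blocks first, then descending He within each group". Only the column names He and has_gwas_hit are taken from the Value section.

```r
# Toy block table (hypothetical values)
toy <- data.frame(
  block_id     = c("b1", "b2", "b3", "b4"),
  He           = c(0.62, 0.10, 0.45, 0.70),
  has_gwas_hit = c(FALSE, TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Blocks with a GWAS hit sort first (FALSE < TRUE on the negated flag),
# then blocks are ordered by decreasing He within each group
ranked_toy <- toy[order(!toy$has_gwas_hit, -toy$He), ]
ranked_toy$block_id
```

Note that b4 has the highest He overall but still ranks below both GWAS-hit blocks, because the binary flag takes precedence over diversity.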
Link to min_freq
min_freq is a hard population-level pre-filter identical in
purpose to MAF filtering for single SNPs: haplotype alleles observed
at frequency below this threshold cannot have their effects reliably
estimated regardless of trait association, and are dropped before the
dosage matrix is built. rank_haplotype_blocks operates entirely
downstream of this filter – it ranks blocks that have already passed
min_freq, not individual alleles. A block survives as long as
at least one of its alleles passes min_freq.
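A minimal sketch of the pre-filter logic, on made-up data: alleles below min_freq are dropped, and a block survives if at least one of its alleles remains. The threshold value 0.05, the long-format table, and the use of >= are assumptions for illustration. The real filtering happens inside build_haplotype_feature_matrix and may differ in detail.

```r
min_freq <- 0.05  # hypothetical threshold, not necessarily the package default

# Toy allele frequencies: b1 has two common alleles and one rare one;
# b2 consists of 25 rare alleles at frequency 0.04 each
allele_freqs <- data.frame(
  block_id = c(rep("b1", 3), rep("b2", 25)),
  allele   = c(paste0("h", 1:3), paste0("h", 1:25)),
  freq     = c(0.60, 0.37, 0.03, rep(0.04, 25)),
  stringsAsFactors = FALSE
)

# Drop alleles below the threshold (">=" is an assumption)
kept <- allele_freqs[allele_freqs$freq >= min_freq, ]

# A block survives if any of its alleles passed the filter
surviving_blocks <- unique(kept$block_id)
surviving_blocks
```

Here b1 keeps two of its three alleles and stays in the ranking, while b2 drops out entirely because none of its alleles reaches the threshold.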
References
Difabachew YF et al. (2023). Genomic prediction with haplotype blocks in wheat. Frontiers in Plant Science 14:1168547. doi:10.3389/fpls.2023.1168547
Weber SE, Frisch M, Snowdon RJ, Voss-Fels KP (2023). Haplotype blocks for genomic prediction: a comparative evaluation in multiple crop datasets. Frontiers in Plant Science 14:1217589. doi:10.3389/fpls.2023.1217589
Tong J et al. (2024). Stacking beneficial haplotypes from the Vavilov wheat collection. Theoretical and Applied Genetics 137:274. doi:10.1007/s00122-024-04784-w
Tong J et al. (2025). Haplotype stacking to improve stability of stripe rust resistance in wheat. Theoretical and Applied Genetics 138:267. doi:10.1007/s00122-025-05045-0
Examples
if (FALSE) { # \dontrun{
haps <- extract_haplotypes(geno, snp_info, blocks)
div <- compute_haplotype_diversity(haps)
# Use case 1: genotype only -- rank by diversity
ranked <- rank_haplotype_blocks(div)
# Use case 2: genotype + GWAS -- binary flag, then diversity within groups
qtl <- define_qtl_regions(gwas_df, blocks, snp_info)
ranked <- rank_haplotype_blocks(div, qtl_regions = qtl)
# Use case 3: genotype + phenotype -- rank by Var(local GEBV)
pred <- run_haplotype_prediction(geno, snp_info, blocks, blues = blues)
ranked <- rank_haplotype_blocks(div, qtl_regions = qtl, pred_result = pred)
# Top 20 blocks
ranked <- rank_haplotype_blocks(div, pred_result = pred, top_n_blocks = 20)
} # }