Integrate GWAS QTL Regions with Haplotype Prediction Results
Source: R/haplotypes.R
integrate_gwas_haplotypes.Rd
Links the output of define_qtl_regions (biological evidence
from GWAS) with the output of run_haplotype_prediction
(statistical evidence from haplotype variance) to identify blocks that
are supported by both lines of evidence. These are the priority candidates
for haplotype stacking in breeding.
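The idea of "supported by both lines of evidence" can be illustrated with a minimal sketch. It assumes both inputs can be reduced to data frames keyed by block_id; the toy frames qtl_toy and var_toy and their values are invented for illustration and are not the package's actual return structures.

```r
# Toy stand-ins for the two evidence sources (hypothetical values)
qtl_toy <- data.frame(
  block_id    = c("b1", "b2"),
  lead_marker = c("snp_12", "snp_40"),
  stringsAsFactors = FALSE
)
var_toy <- data.frame(
  block_id   = c("b2", "b3"),
  var_scaled = c(0.93, 0.20),
  stringsAsFactors = FALSE
)

# Full outer join: keep every block that appears in at least one source
both <- merge(qtl_toy, var_toy, by = "block_id", all = TRUE)

# A block has GWAS support if a lead marker was recorded for it
both$has_gwas_hit <- !is.na(both$lead_marker)
# A block has variance support if its scaled variance reaches 0.9
both$is_important <- !is.na(both$var_scaled) & both$var_scaled >= 0.9

# Blocks backed by both lines of evidence
supported <- both$block_id[both$has_gwas_hit & both$is_important]
```

In this toy input only b2 appears in both sources and passes the variance cut-off, so it is the only block left in supported.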
Arguments
- qtl_regions
Data frame from define_qtl_regions.
- pred_result
List from run_haplotype_prediction.
- diversity
Data frame from compute_haplotype_diversity. Optional. If supplied, adds He and sweep_flag to the output.
- He_threshold
Minimum expected heterozygosity to flag a block as sufficiently diverse for haplotype stacking. Default 0.3.
Value
Data frame with one row per block that appears in at least one of
the input sources, with columns:
block_id, CHR, start_bp, end_bp,
n_snps, has_gwas_hit (logical),
lead_marker, lead_p, lead_beta,
n_sig_markers, is_important (logical, scaled var >= 0.9),
var_scaled, He, sweep_flag,
is_diverse (logical, He >= He_threshold),
priority_score (0-3),
recommendation (character label).
Sorted by priority_score descending, then var_scaled
descending.
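The sort order can be reproduced on any data frame with these two columns. The sketch below uses a toy data frame with invented values and base R's order(); it is not taken from the package source.

```r
# Toy stand-in for the returned data frame (values invented for illustration)
res <- data.frame(
  block_id       = c("b1", "b2", "b3", "b4"),
  priority_score = c(2L, 3L, 2L, 3L),
  var_scaled     = c(0.95, 0.91, 0.40, 0.99),
  stringsAsFactors = FALSE
)

# Descending priority_score, ties broken by descending var_scaled
res_sorted <- res[order(-res$priority_score, -res$var_scaled), ]
res_sorted$block_id
```

Here the two blocks scoring 3 come first (b4 ahead of b2 because of its larger var_scaled), followed by b1 and b3.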
Details
Three complementary evidence layers are combined per block:
- Biological (GWAS)
Does the block contain a genome-wide significant marker? Sourced from define_qtl_regions().
- Statistical (variance)
Does the block explain substantial variance in the trait? Blocks with scaled Var(local GEBV) >= 0.9 are flagged important by run_haplotype_prediction().
- Diversity (He)
Does the block have enough haplotype diversity to stack favourable alleles? Sourced from compute_haplotype_diversity().
The output priority_score is the sum of binary flags for each layer
(0-3). Blocks scoring 3 are supported by all three lines of evidence and
are the strongest candidates for haplotype stacking. Blocks scoring 2 are
worth investigating. Blocks scoring 1 should be used with caution, and blocks scoring 0 should be excluded from stacking.
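The scoring rule can be sketched in a few lines of base R. The toy data frame and its values are invented, and treating a missing He as "not diverse" is an assumption of this sketch; how the package handles missing values is not stated here.

```r
# Toy per-block evidence (hypothetical values)
flags <- data.frame(
  block_id     = c("b1", "b2", "b3", "b4"),
  has_gwas_hit = c(TRUE, TRUE, FALSE, FALSE),
  var_scaled   = c(0.95, 0.50, 0.92, 0.10),
  He           = c(0.45, 0.35, 0.10, NA),
  stringsAsFactors = FALSE
)
He_threshold <- 0.3  # documented default

# Binary flag for the variance layer: scaled variance >= 0.9
flags$is_important <- flags$var_scaled >= 0.9
# Binary flag for the diversity layer; NA He counts as not diverse (assumption)
flags$is_diverse <- !is.na(flags$He) & flags$He >= He_threshold

# priority_score is the sum of the three binary flags (0-3)
flags$priority_score <- as.integer(flags$has_gwas_hit) +
  as.integer(flags$is_important) +
  as.integer(flags$is_diverse)
```

With these inputs, b1 passes all three layers, b2 passes GWAS and diversity, b3 passes variance only, and b4 passes none.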
Interpretation guide
| Score | Meaning | Action |
|---|---|---|
| 3 | GWAS hit + high variance + diverse | Top priority for stacking |
| 2 (GWAS + var) | Real effect, low diversity | Select across populations |
| 2 (GWAS + div) | Real locus, small effect | Include if trait is oligogenic |
| 2 (var + div) | Variance explained, no GWAS | May be pop. structure – verify |
| 1 | Single evidence only | Use with caution |
| 0 | No evidence | Exclude from stacking |
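The table can be read as a lookup from the three flags to an action. The helper below, recommend(), is a hypothetical illustration rather than a function exported by the package, and its labels are adapted from the Action column above; the actual strings in the recommendation column may differ.

```r
# Hypothetical helper: map the three binary flags to an action label
recommend <- function(gwas, important, diverse) {
  score <- gwas + important + diverse  # logicals sum to 0-3
  if (score == 3) return("Top priority for stacking")
  if (score == 2) {
    if (gwas && important) return("Select across populations")
    if (gwas && diverse) return("Include if trait is oligogenic")
    # Remaining case: variance + diversity without a GWAS hit
    return("May be pop. structure - verify")
  }
  if (score == 1) return("Use with caution")
  "Exclude from stacking"
}

# Apply to several blocks at once
labels <- mapply(
  recommend,
  gwas      = c(TRUE, TRUE, FALSE, FALSE),
  important = c(TRUE, FALSE, TRUE, FALSE),
  diverse   = c(TRUE, TRUE, TRUE, FALSE),
  USE.NAMES = FALSE
)
```

Splitting the score-2 case by which pair of flags is set matters because the three combinations call for different follow-up actions even though they share a score.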
References
Difabachew YF et al. (2023). Genomic prediction with haplotype blocks in wheat. Frontiers in Plant Science 14:1168547. doi:10.3389/fpls.2023.1168547
Weber SE, Frisch M, Snowdon RJ, Voss-Fels KP (2023). Haplotype blocks for genomic prediction: a comparative evaluation in multiple crop datasets. Frontiers in Plant Science 14:1217589. doi:10.3389/fpls.2023.1217589
Tong J et al. (2024). Stacking beneficial haplotypes from the Vavilov wheat collection. Theoretical and Applied Genetics 137:274. doi:10.1007/s00122-024-04784-w
Tong J et al. (2025). Haplotype stacking to improve stability of stripe rust resistance in wheat. Theoretical and Applied Genetics 138:267. doi:10.1007/s00122-025-05045-0
Examples
if (FALSE) { # \dontrun{
# geno, snp_info and blocks are assumed to already exist in the workspace
# 1. GWAS integration
gwas <- read.csv("gwas_results.csv") # SNP, CHR, POS, P, BETA
qtl <- define_qtl_regions(gwas, blocks, snp_info, p_threshold = 5e-8)
# 2. Haplotype prediction
blues <- read.csv("blues.csv")
pred <- run_haplotype_prediction(geno, snp_info, blocks,
                                 blues    = blues,
                                 id_col   = "id",
                                 blue_col = "YLD")
# 3. Haplotype diversity
haps <- extract_haplotypes(geno, snp_info, blocks, min_snps = 3)
div <- compute_haplotype_diversity(haps)
# 4. Integrate all three
priority <- integrate_gwas_haplotypes(qtl, pred, diversity = div)
# Top priority blocks for haplotype stacking
priority[priority$priority_score == 3, ]
} # }