Tests pairwise SNP x SNP interactions within haplotype blocks that were
identified as significant by test_block_haplotypes. For each
significant block, all C(p, 2) SNP pairs are tested for interaction using
the model:
$$y = \mu + a_i x_i + a_j x_j + aa_{ij}(x_i x_j) + \varepsilon$$
on GRM-corrected REML residuals (same null model as
test_block_haplotypes), ensuring all tests are
population-structure-corrected.
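For a single SNP pair, the model above can be sketched with base R's lm(). This is a minimal illustration, not the package's internal code: the dosage vectors x_i and x_j and the residual vector y_resid are simulated stand-ins, and the t-test on the interaction coefficient is used here on the assumption that it corresponds to the reported aa_effect, SE, t_stat and p_wald columns.

```r
set.seed(1)
n <- 200

# Hypothetical allele dosages (0/1/2) for two SNPs in the same block
x_i <- rbinom(n, 2, 0.3)
x_j <- rbinom(n, 2, 0.4)

# Simulated stand-in for the GRM-corrected REML residuals
y_resid <- 0.2 * x_i - 0.1 * x_j + 0.5 * x_i * x_j + rnorm(n)

# y = mu + a_i x_i + a_j x_j + aa_ij (x_i x_j) + e
# For numeric predictors, x_i * x_j expands to x_i + x_j + x_i:x_j,
# where x_i:x_j is the element-wise product.
fit   <- lm(y_resid ~ x_i * x_j)
coefs <- summary(fit)$coefficients

# Extract the interaction term
aa_effect <- unname(coefs["x_i:x_j", "Estimate"])
SE        <- unname(coefs["x_i:x_j", "Std. Error"])
t_stat    <- unname(coefs["x_i:x_j", "t value"])
p_wald    <- unname(coefs["x_i:x_j", "Pr(>|t|)"])
```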
Restricting to significant blocks avoids the genome-wide explosion of pairwise tests: for 15 significant blocks with ~200 SNPs each, the total number of tests is ~300,000, compared to ~4.4 billion for an unrestricted genome-wide scan.
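The test counts quoted above can be checked with choose(). The block count and block size come from the text; the genome-wide SNP count is not stated, so the value of 94,000 below is an assumption chosen because it reproduces a total of about 4.4 billion pairs.

```r
n_blocks       <- 15
snps_per_block <- 200

# Pairs within one block, then across all significant blocks
pairs_per_block  <- choose(snps_per_block, 2)
restricted_total <- n_blocks * pairs_per_block

# Assumed genome-wide SNP count (not given in the text)
n_snps_genome     <- 94000
genome_wide_total <- choose(n_snps_genome, 2)

# How many times larger the unrestricted scan is
fold_reduction <- genome_wide_total / restricted_total
```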
Usage
scan_block_epistasis(
assoc,
geno_matrix,
snp_info,
blocks,
blues,
haplotypes,
trait = NULL,
sig_blocks = NULL,
min_freq = 0.05,
max_snps_per_block = 300L,
sig_threshold = 0.05,
sig_metric = c("p_simplem_sidak", "p_simplem", "p_bonf", "p_fdr"),
meff_percent_cut = 0.995,
id_col = "id",
blue_col = "blue",
verbose = TRUE
)
Arguments
- assoc
Output of test_block_haplotypes. Used to identify significant blocks and obtain pre-computed GRM residuals.
- geno_matrix
Numeric matrix (individuals x SNPs) or LDxBlocks_backend. The imputed, MAF-filtered genotype matrix from Job 1 (res$geno_matrix).
- snp_info
Data frame with columns SNP, CHR, POS.
- blocks
LD block table from run_Big_LD_all_chr.
- blues
Pre-adjusted phenotype means (same format as test_block_haplotypes). Named numeric vector or named list.
- haplotypes
Named list from extract_haplotypes. Used to re-build the GRM for REML residual computation.
- trait
Character. Which trait to use for residual computation when blues is a named list. Default NULL uses the first trait.
- sig_blocks
Character vector. Block IDs to scan. NULL (default) uses all blocks with significant_omnibus = TRUE in assoc$block_tests.
- min_freq
Numeric. Minimum MAF for SNPs within the block. Default 0.05.
- max_snps_per_block
Integer. Maximum SNPs per block before switching to random subsampling of pairs. Default 300L (C(300,2) = 44,850 pairs per block). Set NULL to always use all SNPs.
- sig_threshold
Numeric. Significance threshold. Default 0.05.
- sig_metric
Character. Which correction drives the primary significant flag. One of:
"p_simplem_sidak" (default, recommended) – simpleM Sidak.
"p_simplem" – simpleM Bonferroni.
"p_bonf" – plain Bonferroni (p x n_pairs).
All three p-value columns are always present regardless of this choice.
- meff_percent_cut
Numeric. Variance cutoff for simpleM Meff. Default 0.995.
- id_col
Character. ID column when blues is a data frame.
- blue_col
Character. Phenotype column when blues is a data frame.
- verbose
Logical. Default TRUE.
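The role of meff_percent_cut can be illustrated with a generic simpleM-style calculation: take the eigenvalues of a correlation matrix and count how many are needed to reach the variance cutoff. This is a sketch under stated assumptions, not the package's implementation. The helper name meff_simplem is hypothetical, and it operates on an arbitrary numeric matrix X, whereas the function is documented as deriving Meff from the interaction eigenspectrum.

```r
# Hypothetical helper: effective number of tests from an eigenspectrum
meff_simplem <- function(X, percent_cut = 0.995) {
  R  <- cor(X)  # correlations among columns (columns must not be constant)
  ev <- eigen(R, symmetric = TRUE, only.values = TRUE)$values
  ev <- sort(pmax(ev, 0), decreasing = TRUE)  # clip tiny negative values
  cum_var <- cumsum(ev) / sum(ev)
  # Smallest number of components reaching the variance cutoff
  which(cum_var >= percent_cut)[1]
}

set.seed(42)
base <- matrix(rnorm(300 * 3), nrow = 300, ncol = 3)

# Six columns, but only three independent signals (each duplicated once)
X_dup  <- cbind(base, base)
meff_3 <- meff_simplem(X_dup, percent_cut = 0.995)

# Five independent columns: no redundancy to exploit
X_ind  <- matrix(rnorm(300 * 5), nrow = 300, ncol = 5)
meff_5 <- meff_simplem(X_ind, percent_cut = 0.995)
```

Perfectly redundant columns add no effective tests, which is why a correction based on Meff is less conservative than plain Bonferroni when the tested terms are strongly correlated.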
Value
A named list of class LDxBlocks_epistasis:
- results
Data frame. One row per tested SNP pair per block per trait. Columns always present: block_id, CHR, start_bp, end_bp, trait, SNP_i, SNP_j, POS_i, POS_j, dist_bp, aa_effect, SE, t_stat, p_wald, Meff (simpleM effective test count from interaction eigenspectrum), p_bonf (Bonferroni: p x n_pairs), p_simplem (simpleM Bonferroni: p x Meff), p_simplem_sidak (simpleM Sidak: 1-(1-p)^Meff), significant (driven by sig_metric), significant_bonf, significant_simplem, significant_simplem_sidak.
- scan_summary
Data frame. One row per block: number of pairs tested, number significant, minimum p-value.
- n_blocks_scanned
Integer.
- n_pairs_total
Integer. Total pairwise tests performed.
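The three corrected p-value columns follow the formulas given for results. The sketch below applies them to made-up values: the p_wald vector, n_pairs and Meff are hypothetical. Capping the Bonferroni-type values at 1 and flagging with <= are assumptions, since the text does not say how the function handles either.

```r
# Hypothetical inputs for one block
p_wald  <- c(1e-6, 4e-4, 0.02, 0.6)
n_pairs <- 19900   # C(200, 2) pairs in the block
Meff    <- 150     # assumed effective number of tests

# Formulas from the column descriptions (capped at 1 by assumption)
p_bonf          <- pmin(1, p_wald * n_pairs)
p_simplem       <- pmin(1, p_wald * Meff)
p_simplem_sidak <- 1 - (1 - p_wald)^Meff

# Primary flag driven by the chosen metric (default "p_simplem_sidak")
sig_threshold <- 0.05
significant   <- p_simplem_sidak <= sig_threshold

corrected <- data.frame(p_wald, p_bonf, p_simplem, p_simplem_sidak, significant)
```

With these inputs only the first pair passes the default metric. The second pair has a Sidak-adjusted p-value of roughly 0.058, just above the threshold.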