This function implements the Independent Metropolis-Hastings algorithm for Bayesian estimation of cancer-risk penetrance. It can run multiple chains in parallel and provides options for summarizing and visualizing the results.
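To make the term concrete, here is a minimal toy sketch of an independence Metropolis-Hastings sampler in base R. It is not the function's implementation: the Beta(2, 5) target, the Uniform(0, 1) proposal, and the helper name `independent_mh` are all assumptions chosen for illustration. The defining feature is that each candidate is drawn without reference to the current state, so the acceptance ratio compares target-to-proposal weights.

```r
# Toy independence Metropolis-Hastings sampler (illustrative only).
# Target: Beta(2, 5) on (0, 1); proposal: Uniform(0, 1), drawn
# independently of the current state. All names here are hypothetical.
set.seed(1)

independent_mh <- function(n_iter, log_target, r_proposal, log_proposal, init) {
  draws <- numeric(n_iter)
  current <- init
  n_reject <- 0
  for (i in seq_len(n_iter)) {
    candidate <- r_proposal()
    # Log acceptance ratio: [pi(y) / q(y)] / [pi(x) / q(x)]
    log_ratio <- (log_target(candidate) - log_proposal(candidate)) -
      (log_target(current) - log_proposal(current))
    if (log(runif(1)) < log_ratio) {
      current <- candidate
    } else {
      n_reject <- n_reject + 1
    }
    draws[i] <- current
  }
  list(draws = draws, rejection_rate = n_reject / n_iter)
}

fit <- independent_mh(
  n_iter = 10000,
  log_target = function(x) dbeta(x, 2, 5, log = TRUE),
  r_proposal = function() runif(1),
  log_proposal = function(x) dunif(x, log = TRUE),
  init = 0.5
)
mean(fit$draws)      # should be near the Beta(2, 5) mean, 2 / 7
fit$rejection_rate   # share of candidates that were rejected
```

The rejection rate computed here is the same kind of diagnostic that the rejection_rates option reports, although how the function itself computes it is not shown in this page.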


  twins = NULL,
  n_chains = 1,
  n_iter_per_chain = 10000,
  ncores = 6,
  max_age = 94,
  baseline_data = baseline_data_default,
  removeProband = FALSE,
  ageImputation = FALSE,
  median_max = TRUE,
  BaselineNC = TRUE,
  var = c(0.1, 0.1, 2, 2, 5, 5, 5, 5),
  burn_in = 0,
  thinning_factor = 1,
  distribution_data = distribution_data_default,
  af = 1e-04,
  max_penetrance = 1,
  sample_size = NULL,
  ratio = NULL,
  prior_params = prior_params_default,
  risk_proportion = risk_proportion_default,
  summary_stats = TRUE,
  rejection_rates = TRUE,
  density_plots = TRUE,
  plot_trace = TRUE,
  penetrance_plot = TRUE,
  penetrance_plot_pdf = TRUE,
  probCI = 0.95,
  sex_specific = TRUE



A data frame containing the pedigree data in the required format. It should include the following columns:

  • PedigreeID: A numeric value representing the unique identifier for each family. There should be no duplicated entries.

  • ID: A numeric value representing the unique identifier for each individual. There should be no duplicated entries.

  • Sex: A numeric value where 0 indicates female and 1 indicates male. Missing entries are not currently supported.

  • MotherID: A numeric value representing the unique identifier for an individual's mother.

  • FatherID: A numeric value representing the unique identifier for an individual's father.

  • isProband: A numeric value where 1 indicates the individual is a proband and 0 otherwise.

  • CurAge: A numeric value indicating the age of censoring (current age if the person is alive or age at death if the person is deceased). Allowed ages range from 1 to 94.

  • isAff: A numeric value indicating the affection status of cancer, with 1 for diagnosed individuals and 0 otherwise. Missing entries are not supported.

  • Age: A numeric value indicating the age of cancer diagnosis, encoded as NA if the individual was not diagnosed. Allowed ages range from 1 to 94.

  • Geno: A column for germline testing or tumor marker testing results. Positive results should be coded as 1, negative results as 0, and unknown results as NA or left empty.
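The column layout above can be sketched as a small data frame. The three-person family below is invented for illustration, and two points are assumptions rather than documented behavior: founders' parent IDs are coded as NA, and "no duplicated entries" for PedigreeID is read as meaning that distinct families receive distinct identifiers.

```r
# Hypothetical three-person family: two parents and one affected proband.
pedigree <- data.frame(
  PedigreeID = c(1, 1, 1),
  ID         = c(1, 2, 3),
  Sex        = c(1, 0, 0),       # 0 = female, 1 = male
  MotherID   = c(NA, NA, 2),     # assumption: founders coded as NA
  FatherID   = c(NA, NA, 1),
  isProband  = c(0, 0, 1),
  CurAge     = c(70, 68, 45),
  isAff      = c(0, 0, 1),
  Age        = c(NA, NA, 42),    # NA when not diagnosed
  Geno       = c(NA, NA, 1)      # 1 = positive, 0 = negative, NA = unknown
)

# Basic checks mirroring the constraints listed above.
required <- c("PedigreeID", "ID", "Sex", "MotherID", "FatherID",
              "isProband", "CurAge", "isAff", "Age", "Geno")
stopifnot(
  all(required %in% names(pedigree)),
  !anyDuplicated(pedigree$ID),
  all(pedigree$Sex %in% c(0, 1)),
  all(pedigree$isAff %in% c(0, 1)),
  all(pedigree$CurAge >= 1 & pedigree$CurAge <= 94),
  all(is.na(pedigree$Age) | (pedigree$Age >= 1 & pedigree$Age <= 94))
)
```

Running checks like these before fitting catches unsupported values (for example a missing Sex entry) early, rather than partway through a long sampling run.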


twins: A list specifying identical twins or triplets in the family. For example, to indicate that "ora024" and "ora027" are identical twins, and that "aey063" and "aey064" are identical twins, use twins <- list(c("ora024", "ora027"), c("aey063", "aey064")).


n_chains: Integer, the number of chains for parallel computation. Default is 1.


n_iter_per_chain: Integer, the number of iterations for each chain. Default is 10000.


ncores: Integer, the number of cores for parallel computation. Default is 6.


max_age: Integer, the maximum age considered for analysis. Default is 94.


baseline_data: Data for the baseline risk estimates (probability of developing cancer), such as population-level risk from a cancer registry. The default, provided as an example, is colorectal cancer data from the SEER database.


removeProband: Logical, indicating whether to remove probands from the analysis. Default is FALSE.


ageImputation: Logical, indicating whether to perform age imputation. Default is FALSE.


median_max: Logical, indicating whether to use the baseline median age or max_age as an upper bound for the median proposal. Default is TRUE.


BaselineNC: Logical, indicating whether the non-carrier penetrance is assumed to be the baseline penetrance. Default is TRUE.


var: Numeric vector, variances for the proposal distribution in the Metropolis-Hastings algorithm. Default is c(0.1, 0.1, 2, 2, 5, 5, 5, 5).


burn_in: Numeric, the fraction of each chain's results to discard as burn-in, between 0 and 1. Default is 0 (no burn-in).


thinning_factor: Integer, the factor by which to thin the results. Default is 1 (no thinning).
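As a rough illustration of what burn_in and thinning_factor mean, the sketch below drops the leading fraction of a chain and then keeps every k-th remaining draw. The helper `apply_burn_in_and_thinning` is hypothetical, and the exact order and rounding used inside the function are assumptions.

```r
# Hypothetical helper, not part of the documented function.
apply_burn_in_and_thinning <- function(chain, burn_in = 0, thinning_factor = 1) {
  stopifnot(burn_in >= 0, burn_in < 1, thinning_factor >= 1)
  # Number of leading draws to discard as burn-in.
  n_discard <- floor(burn_in * length(chain))
  kept <- if (n_discard > 0) chain[-seq_len(n_discard)] else chain
  # Keep every thinning_factor-th draw of what remains.
  kept[seq(1, length(kept), by = thinning_factor)]
}

chain <- 1:10000   # stand-in for 10000 sampled values
kept <- apply_burn_in_and_thinning(chain, burn_in = 0.1, thinning_factor = 5)
length(kept)    # 1800
head(kept, 3)   # 1001 1006 1011
```

With the defaults (burn_in = 0, thinning_factor = 1) the chain is returned unchanged, which matches the "no burn-in" and "no thinning" descriptions above.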


distribution_data: Data for generating prior distributions.


af: Numeric, allele frequency for the risk allele. Default is 0.0001.


max_penetrance: Numeric, the maximum penetrance considered for analysis. Default is 1.


sample_size: Optional numeric, sample size for distribution generation.


ratio: Optional numeric, ratio parameter for distribution generation.


prior_params: List, parameters for prior distributions.


risk_proportion: Numeric, proportion of risk for distribution generation.


summary_stats: Logical, indicating whether to include summary statistics in the output. Default is TRUE.


rejection_rates: Logical, indicating whether to include rejection rates in the output. Default is TRUE.


density_plots: Logical, indicating whether to include density plots in the output. Default is TRUE.


plot_trace: Logical, indicating whether to include trace plots in the output. Default is TRUE.


penetrance_plot: Logical, indicating whether to include penetrance plots in the output. Default is TRUE.


penetrance_plot_pdf: Logical, indicating whether to include PDF plots in the output. Default is TRUE.


probCI: Numeric, probability level for credible intervals in penetrance plots. Must be between 0 and 1. Default is 0.95.
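For intuition about the credible-interval level, the sketch below turns a probability level into an interval over posterior draws using quantiles. It assumes an equal-tailed interval, which this page does not confirm, and the draws are simulated stand-ins rather than output of the function.

```r
set.seed(42)
# Stand-in posterior draws; in practice these would come from the sampler.
draws <- rbeta(5000, 2, 5)

probCI <- 0.95
alpha <- 1 - probCI
# Equal-tailed interval: cut alpha / 2 from each tail (an assumption).
ci <- quantile(draws, probs = c(alpha / 2, 1 - alpha / 2))
ci
```

A larger probCI widens the interval; a value of 0.5, for example, would report the central 50% of the draws under the same assumption.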


sex_specific: Logical, indicating whether to use sex-specific parameters in the analysis. Default is TRUE.


Returns a list containing the combined results from all chains, along with any of the optional statistics and plots that were requested.