Package 'permPATH'

Title: Permutation Based Gene Expression Pathway Analysis
Description: Can be used to carry out permutation based gene expression pathway analysis. This work was supported by a National Institute of Allergy and Infectious Disease/National Institutes of Health contract (No. HHSN272200900059C).
Authors: Ivo D. Shterev [aut, cre], Kouros Owzar [aut], Gregory D. Sempowski [aut], Kenneth Wilder [ctb, cph] (wrote original version of ranker.h)
Maintainer: Ivo D. Shterev <[email protected]>
License: GPL-3
Version: 1.3
Built: 2025-03-07 04:04:13 UTC

Help Index

Permutation Based Gene Expression Pathway Analysis.



I. D. Shterev, K. Owzar and G. D. Sempowski

Maintainer: I. D. Shterev <[email protected]>


B. Efron and R. Tibshirani (2007). On Testing the Significance of Sets of Genes. The Annals of Applied Statistics, Vol. 1, No. 1, 107–129.

A. Subramanian, P. Tamayo, V. K. Mootha, S. Mukherjee, B. L. Ebert, M. A. Gillette, A. Paulovich, S. L. Pomeroy, T. R. Golub, E. S. Lander and J. P. Mesirov (2005). Gene Set Enrichment Analysis: A Knowledge-Based Approach for Interpreting Genome-Wide Expression Profiles. Proc. Natl. Acad. Sci. USA, Vol. 102, No. 43, 15545–15550.

Perform Permutation Based Pathway Analysis


This is the main function of the package.


perm.path(expr, y, local.test, global.test="wilcoxon", B, gset, min.num=2, max.num, 
imputeval=NULL, transfun=function(x){x}, sort="pval", anno=NULL)



expr: A K × n matrix of gene expression data, where K is the number of genes and n is the number of samples.

y: An outcome vector of length n.

local.test: Local test statistic computed for each gene. Current possible choices are the t-test, the Wilcoxon test, Pearson, Spearman and the JT test.

global.test: Global test statistic, used to compute the score. Current possible choices are mean, meanabs (mean of absolute values) and maxmean.

B: Specifies the number of random permutations to be performed.

gset: A list of pathways. Each element is a vector of gene names. The list element names are the pathway names.

min.num: Specifies the minimum number of genes that a pathway should have. Pathways with fewer genes will be excluded.

max.num: Specifies the maximum number of genes that a pathway should have. Pathways with more genes will be excluded.

imputeval: The gene expression value to be imputed in case of missing values. The default is NULL, in which case no imputation is done.

transfun: Specifies a transformation of the gene expression data. The default is the identity function, which leaves the gene expression data untransformed.

sort: Specifies sorting of the results. If sort="pval", sorting is done in order of increasing p-values. If sort="score", sorting is done in order of decreasing scores.

anno: If TRUE, the output contains annotation of each pathway.
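
The interplay between the local and the global test statistic can be illustrated with a short base R sketch. It does not call permPATH. The helper names local_t and global_score and the toy data are hypothetical, and the maxmean definition used here (the larger of the mean positive part and the mean negative part, following Efron and Tibshirani, 2007) is an assumption about how a pathway score may be formed, not a description of the package internals.

```r
## Toy data: 6 genes x 10 samples with a binary outcome (illustrative only)
set.seed(1)
expr <- matrix(rnorm(60), nrow = 6, ncol = 10,
               dimnames = list(paste0("g", 1:6), paste0("id", 1:10)))
y <- rep(1:0, each = 5)

## Local statistic: one two-sample t statistic per gene (row of expr)
local_t <- function(expr, y) {
  apply(expr, 1, function(x) unname(t.test(x[y == 1], x[y == 0])$statistic))
}

## Global statistic: combine the local statistics of the genes in one pathway
global_score <- function(z, type = c("mean", "meanabs", "maxmean")) {
  type <- match.arg(type)
  switch(type,
         mean    = mean(z),
         meanabs = mean(abs(z)),
         ## maxmean (assumed definition): larger of the mean positive part
         ## and the mean negative part of the local statistics
         maxmean = max(mean(pmax(z, 0)), mean(pmax(-z, 0))))
}

stats <- local_t(expr, y)
pathway <- c("g1", "g3", "g4")   # a hypothetical pathway
scores <- sapply(c("mean", "meanabs", "maxmean"),
                 function(tp) global_score(stats[pathway], tp))
print(scores)
```

With mean, positive and negative local statistics within a pathway can cancel out; meanabs and maxmean avoid this cancellation.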


This function returns a list consisting of the following elements:


Data frame consisting of the pathway names (Pathway), the genes involved in each pathway (Genes), the number of genes in each pathway (Size), the score for each pathway (Score), the permutation raw p-value (pval), the FWER-adjusted permutation p-value (pfwer), the FDR-adjusted permutation p-value, and the Bonferroni-adjusted permutation p-value (bonferroni).


The individual test statistic for each gene.


A matrix of scores. The matrix is of dimension (B+1) × K, where K here is the number of pathways (not the number of genes). The first column contains the unpermuted scores, the remaining B columns contain the scores computed after each permutation.
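
One way in which permutation p-values could be derived from a matrix of observed and permuted scores is sketched below in base R. The matrix layout (one row per pathway, observed score in the first column), the add-one p-value convention and the single-step max-statistic FWER adjustment are all assumptions made for this illustration; the package may compute these quantities differently.

```r
## Hypothetical score matrix: 3 pathways (rows), observed score in column 1,
## followed by B = 4 permutation scores (layout assumed for this sketch)
scores <- rbind(pathway1 = c(2.5, 0.3, 1.1, 0.7, 0.2),
                pathway2 = c(0.4, 0.9, 0.1, 0.6, 1.3),
                pathway3 = c(1.0, 0.8, 1.2, 0.5, 0.9))
obs  <- scores[, 1]
perm <- scores[, -1, drop = FALSE]
B    <- ncol(perm)

## Raw p-value: share of permutation scores at least as large as the observed
## score, with the observed score itself counted once (add-one convention)
pval <- (rowSums(perm >= obs) + 1) / (B + 1)

## Bonferroni adjustment: multiply by the number of pathways, capped at 1
bonferroni <- p.adjust(pval, method = "bonferroni")

## Single-step max-statistic FWER adjustment: compare each observed score
## with the largest score across pathways within each permutation
maxperm <- apply(perm, 2, max)
pfwer <- sapply(obs, function(s) (sum(maxperm >= s) + 1) / (B + 1))

print(data.frame(Score = obs, pval, pfwer, bonferroni))
```

With only B permutations the smallest attainable raw p-value under this convention is 1/(B+1), so B should be chosen large enough for the intended significance level.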





## Generate toy phenotype and gene expression data sets
## This example consists of 40 genes grouped into 5 pathways and 100 patients
## grp is a binary trait (e.g., case vs control)
## bp is a continuous trait (e.g., blood pressure)
## g is a group indicator

n = 100
K = 40
grp = rep(1:0,each=n/2)
bp = rnorm(n)
g = rep(1:(n/20), rep(20,n/20))

pdat = data.frame(grp, bp, g)
rm(grp, bp)
expdat = matrix(rnorm(K*n),K,n)

## Assign marker names g1,...,gK to the expression data set and
## patient ids id1,...,idn to the expression and phenotype data
gnames = paste("g",1:K,sep="")
rownames(expdat) = gnames
patid = paste("id",1:n,sep="")
rownames(pdat) = patid
colnames(expdat) = patid

## Build M pathways of sizes n1,...,nM; each pathway takes the first
## n_m gene names in random order, so the pathways may overlap
M = 5
p = runif(M)
p = p/sum(p)
nM = rmultinom(1, size=K, prob=p)
gset = lapply(nM, function(x){gnames[sample(x)]})
names(gset) = paste("pathway",1:M,sep="")

## Carry out permutation analysis with grp as the outcome
## using the two-sample Wilcoxon with B=100 random permutations
perm.path(expdat, y=pdat[["grp"]], local.test="wilcoxon", global.test="maxmean", B=100, 
gset=gset, min.num=2, max.num=50, sort="score")

## Carry out permutation analysis with g as the outcome
## using the JT test with B=100 random permutations
perm.path(expdat, y=pdat[["g"]], local.test="jt", global.test="maxmean", B=100, 
gset=gset, min.num=2, max.num=50, sort="score")
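
In the example above, gnames[sample(x)] returns the first x gene names in random order, so the simulated pathways overlap. If disjoint pathways are wanted for a simulation, one possible construction in base R is sketched below. The names sizes, labels and gset_disjoint are hypothetical, and fixed pathway sizes are used to keep the sketch short.

```r
set.seed(42)
K <- 40
M <- 5
gnames <- paste0("g", 1:K)

## Pathway sizes that sum to K (fixed here; rmultinom could be used instead)
sizes <- c(4, 6, 8, 10, 12)

## Shuffle the gene names once, then cut the shuffled vector into M blocks
labels <- factor(rep(paste0("pathway", 1:M), times = sizes),
                 levels = paste0("pathway", 1:M))
gset_disjoint <- split(sample(gnames), labels)

## Each gene now belongs to exactly one pathway
print(lengths(gset_disjoint))
```

The resulting list has the same shape as gset (a named list of gene-name vectors), so it can be passed in its place.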

This is a function for creating an HTML file


The function creates an HTML file.


permPATH2HTML(dat, dir, fname, title=NULL, bgcolor="#BBBBEE")



dat: A data frame.

dir: Directory in which to store the file.

fname: File name.

title: The title of the HTML file.

bgcolor: Color for the HTML background.
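
To give an idea of the kind of output such a function produces, the base R sketch below writes a data frame as a plain HTML table. It is not the implementation of permPATH2HTML: write_html_table is a hypothetical helper whose arguments merely mirror the ones listed above, and the file is written to a temporary directory.

```r
## Hypothetical helper: write a data frame as a bare HTML table
## (cell contents are not HTML-escaped in this sketch)
write_html_table <- function(dat, dir, fname, title = NULL, bgcolor = "#BBBBEE") {
  path <- file.path(dir, paste0(fname, ".html"))
  header <- paste0("<th>", names(dat), "</th>", collapse = "")
  rows <- apply(dat, 1, function(r) {
    paste0("<tr>", paste0("<td>", r, "</td>", collapse = ""), "</tr>")
  })
  html <- c("<html>",
            if (!is.null(title)) paste0("<head><title>", title, "</title></head>"),
            paste0("<body bgcolor=\"", bgcolor, "\">"),
            "<table border=\"1\">",
            paste0("<tr>", header, "</tr>"),
            rows,
            "</table>", "</body>", "</html>")
  writeLines(html, path)
  invisible(path)
}

toy <- data.frame(Pathway = c("pathway1", "pathway2"), Score = c(2.5, 0.4))
out <- write_html_table(toy, dir = tempdir(), fname = "tophits", title = "Top hits")
print(file.exists(out))
```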


## Generate toy phenotype and gene expression data sets
## This example consists of 40 genes grouped into 5 pathways and 100 patients
## grp is a binary trait (e.g., case vs control)
## bp is a continuous trait (e.g., blood pressure)
n = 100
K = 40
grp = rep(1:0,each=n/2)
bp = rnorm(n)

pdat = data.frame(grp, bp)
rm(grp, bp)
expdat = matrix(rnorm(K*n),K,n)

## Assign marker names g1,...,gK to the expression data set and
## patient ids id1,...,idn to the expression and phenotype data
gnames = paste("g",1:K,sep="")
rownames(expdat) = gnames
patid = paste("id",1:n,sep="")
rownames(pdat) = patid
colnames(expdat) = patid

## Build M pathways of sizes n1,...,nM; each pathway takes the first
## n_m gene names in random order, so the pathways may overlap
M = 5
p = runif(M)
p = p/sum(p)
nM = rmultinom(1, size=K, prob=p)
gset = lapply(nM, function(x){gnames[sample(x)]})
names(gset) = paste("pathway",1:M,sep="")

## Carry out permutation analysis with grp as the outcome
## using the two-sample Wilcoxon with B=100 random permutations
res = perm.path(expdat, y=pdat[["grp"]], local.test="wilcoxon", global.test="maxmean", 
B=100, gset=gset, min.num=2, max.num=50, sort="score")

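## Create an HTML file (left commented out because it writes to disk);
## rstab stands for the results data frame taken from res
#permPATH2HTML(rstab, dir="/dir/", fname="tophits")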