ChIPCor

Use ChIPCor

Download and installation

Click the following link to download the package:
- R package source code
Follow the standard procedure to install the package in R.
ChIPCor depends on the Bioconductor package qvalue to convert p-values into q-values.

Prepare configuration file

Prepare an input list that contains overlap information for both the target protein pair to be compared and the database of protein pairs from public data repositories under the same biological condition. Each element contains the names of a given pair of proteins and a list of two-by-two tables representing their overlap in each segment type.

To prepare the list of two-by-two tables for protein pairs under a given genome segmentation, take the following steps:

Segment the whole genome into bins of equal size (e.g. 1000 base pairs). For instance, given the BED file of the ChromHMM segmentation for cell line K562, the genome can be segmented into 1000 bp bins:

 
```
# Read in a vector recording the length of each chromosome for hg18
data(hg18.chrlen)
# Segment the genome into 1000 bp bins and
# annotate each bin according to ChromHMM states
genomeseg <- genomeSeg("wgEncodeBroadHmmK562HMM.bed", winsize = 1000, chrlen)
```
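To make the binning step concrete, here is a minimal base-R sketch on a toy chromosome. It is not the implementation of genomeSeg: the helper name `label_bins`, the toy segment table, and the rule of labelling each bin by the state that covers its midpoint are all assumptions made for illustration.

```r
# Toy segmentation for one hypothetical 5,000 bp chromosome
# (BED-like, 0-based, half-open intervals; sorted and gap-free by assumption).
segs <- data.frame(
  start = c(0, 1200, 3000),
  end   = c(1200, 3000, 5000),
  state = c("Promoter", "Enhancer", "Heterochrom"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: cut [0, chrlen) into fixed-size bins and label each
# bin with the state of the segment that contains the bin midpoint.
label_bins <- function(segs, chrlen, winsize = 1000) {
  starts <- seq(0, chrlen - 1, by = winsize)
  ends   <- pmin(starts + winsize, chrlen)
  mids   <- (starts + ends) / 2
  # For each midpoint, index of the last segment whose start is <= midpoint
  idx <- findInterval(mids, segs$start)
  data.frame(start = starts, end = ends, state = segs$state[idx],
             stringsAsFactors = FALSE)
}

bins <- label_bins(segs, chrlen = 5000, winsize = 1000)
print(bins)
```

A midpoint rule is only one possible choice. A bin that straddles two states could instead be assigned by majority overlap, and the package may handle such bins differently.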

Similarly, for each protein, each bin is annotated as 0 or 1 according to the absence or presence of its binding peaks, using the function genomeSeg with the same bin size (e.g. 1000 base pairs).
Once the peak lists for all proteins have been converted to 0/1 vectors, we obtain a table in which each row corresponds to a genomic bin and each column represents the genome-wide binding profile of a protein. The sample database table for K562 can be downloaded here.
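As an illustration of the bin-by-protein table described above, the following base-R sketch converts toy peak intervals into 0/1 vectors and binds them into a matrix. The helper `peaks_to_bins`, the protein names, and the peak coordinates are hypothetical; the package itself performs this step with genomeSeg.

```r
# Hypothetical helper: mark each fixed-size bin with 1 if any peak overlaps it.
# Peaks are half-open intervals [start, end) in 0-based coordinates.
peaks_to_bins <- function(peak_start, peak_end, chrlen, winsize = 1000) {
  nbins <- ceiling(chrlen / winsize)
  hit <- integer(nbins)
  for (i in seq_along(peak_start)) {
    first <- peak_start[i] %/% winsize + 1       # bin containing the first base
    last  <- (peak_end[i] - 1) %/% winsize + 1   # bin containing the last base
    hit[first:last] <- 1L
  }
  hit
}

# Toy peaks for two made-up proteins on a 5,000 bp chromosome.
protA <- peaks_to_bins(c(100, 2900), c(300, 3100), chrlen = 5000)
protB <- peaks_to_bins(c(150, 4200), c(400, 4500), chrlen = 5000)

# Rows are genomic bins, columns are proteins.
peak_mat <- cbind(protA = protA, protB = protB)
print(peak_mat)
```

Note that a peak spanning a bin boundary (the second peak of `protA` here) sets both neighbouring bins to 1 under this rule.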
Then, the function prepareTbls constructs a two-by-two table summarizing the co-occurrence pattern of binding sites for each pair of proteins under each genome segmentation type. Each element of the output of prepareTbls corresponds to a protein pair and records the co-occurrence patterns of that pair of proteins over each segment type.
```
data(K562peaklocmat)
tbls <- prepareTbls(peakloc.mat, genomeseg)
# Inspect the data structure of the first protein pair
str(tbls[[1]])
```
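For readers who want to see what a per-segment two-by-two table looks like, here is a small base-R sketch with made-up data. It only illustrates the idea of cross-tabulating two 0/1 binding vectors within each segment type; the exact layout produced by prepareTbls may differ and is best checked with `str(tbls[[1]])`.

```r
# Toy inputs: 0/1 binding vectors for two made-up proteins
# and a segment-type label for each bin.
protA <- c(1, 0, 1, 1, 0, 1, 0, 0)
protB <- c(1, 0, 0, 1, 0, 1, 1, 0)
state <- c("Promoter", "Promoter", "Promoter", "Promoter",
           "Enhancer", "Enhancer", "Enhancer", "Enhancer")

# Fix both factor levels so every table is 2x2 even when a cell count is zero.
a <- factor(protA, levels = c(0, 1))
b <- factor(protB, levels = c(0, 1))

# One 2x2 co-occurrence table per segment type.
pair_tbls <- lapply(split(seq_along(state), state), function(i) {
  table(protA = a[i], protB = b[i])
})
str(pair_tbls)
```

In each table, the cell `["1", "1"]` counts bins bound by both proteins, `["0", "0"]` counts bins bound by neither, and the off-diagonal cells count bins bound by only one of the two.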
Finally, the function ChIPCor can be applied to measure the spatial correlations of protein binding sites.

Examples

Using ChromHMM to segment the genome, we applied our testing procedure to transcription factors assayed in the ENCODE project cell line K562:

    
```
data(K562tbls)
# Inspect the data structure
str(tbls[[1]])
# Run the test
qval.K562 <- ChIPCor(tbls)
head(qval.K562)
```

and GM12878:

```
data(GM12878tbls)
qval.GM12878 <- ChIPCor(tbls)
head(qval.GM12878)
```

Measuring the Spatial Correlations of Protein Binding Sites
