Chromatin immunoprecipitation (ChIP) is frequently used to map the location of specific protein factors or epigenetic chromatin modifications. The Bioinformatic Analysis Shared Resource has extensive, long-standing experience with ChIPSeq and related analyses. While single-factor ChIPSeq is straightforward to process, investigators are increasingly using ChIPSeq in quantitative or comparative assays, which require more customized approaches. Talk to a Core member for advice before starting these experiments.
We generally recommend three biological replicates per sample (antibody or experimental condition). For preliminary, exploratory ChIPSeq, a single replicate may suffice, but multiple replicates are required for publication and for quantitative analysis. For vertebrate genomes, 20 million reads should be the minimum target sequencing depth. For small model organism genomes, far fewer reads are sufficient. Either paired-end or single-end sequencing may be used; paired-end sequencing gives the exact size of each chromatin fragment, while fragment size must be estimated empirically for single-end data. Whenever possible, the ChIP should be checked by quantitative PCR at known loci prior to submission.
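To illustrate what estimating fragment size empirically means for single-end data, here is a minimal Python sketch. It assumes that forward-strand and reverse-strand read 5' ends flank each fragment, so the shift that best aligns the two strands approximates the fragment size. The function name, its parameters, and the toy coordinates are hypothetical, and this is not how any particular peak caller implements the estimate.

```python
from collections import Counter

def estimate_fragment_size(forward_starts, reverse_starts, max_shift=500):
    """Return the shift (bp) that best aligns forward 5' ends with reverse 5' ends.

    Hypothetical helper for illustration only: it scores each candidate shift
    by how many forward positions, moved downstream by that shift, coincide
    with a reverse-strand position.
    """
    fwd_counts = Counter(forward_starts)
    rev_counts = Counter(reverse_starts)
    best_shift, best_score = 0, -1
    for shift in range(max_shift + 1):
        score = sum(n * rev_counts.get(pos + shift, 0)
                    for pos, n in fwd_counts.items())
        if score > best_score:  # keep the first shift with the highest score
            best_shift, best_score = shift, score
    return best_shift

# Toy data (made up): every fragment is 200 bp from forward 5' end to reverse 5' end
forward = [100, 105, 110, 1000, 1003]
reverse = [pos + 200 for pos in forward]
fragment_size = estimate_fragment_size(forward, reverse)
```

With real data the signal is noisy, so the score is usually examined as a curve over all shifts rather than read off from a single best value.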
Numerous software packages may be used to analyze ChIPSeq. The Core has the most experience with two: MACS2 and the USeq package.
MACS2 is broadly used in the field, well-cited, and fairly straightforward to use with single samples. It uses fragment-based coverage tracks to identify significantly enriched peaks based on a Poisson distribution at essentially base pair resolution. However, it does not handle multiple replicates or conditions well. One solution is to use the Irreproducible Discovery Rate (IDR) to identify consensus peaks. Alternatively, the Core has put together a custom pipeline that uses MACS2 to handle multiple replicates and/or conditions, including specialized strategies for normalizing replicates; there is detailed documentation on using the pipeline at the repository.
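The Poisson-based enrichment test can be sketched in a few lines of Python. This is a simplified illustration of the statistical idea only, not the peak caller's actual implementation: the function name is hypothetical, and the expected background count is simply passed in as an assumed value rather than estimated from control data.

```python
import math

def poisson_upper_tail(observed, expected):
    """Return P(X >= observed) for X ~ Poisson(expected).

    Hypothetical helper for illustration. It subtracts the cumulative
    probability of 0..observed-1 from 1, which is adequate for a sketch but
    loses precision for extremely small p-values.
    """
    if observed <= 0:
        return 1.0
    term = math.exp(-expected)   # P(X = 0)
    cumulative = term
    for i in range(1, observed):
        term *= expected / i     # P(X = i) from P(X = i - 1)
        cumulative += term
    return max(0.0, 1.0 - cumulative)

# Made-up example: 30 fragments cover a position where 10 are expected by chance
p_value = poisson_upper_tail(30, 10.0)
```

A small p-value indicates that the observed coverage is unlikely under the assumed background rate; in practice such values are corrected for multiple testing across the genome.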
The USeq package, developed by members of the Core several years ago, includes a ChIPSeq application for processing ChIP data. It uses a sliding-window, count-based approach in combination with the DESeq2 package to identify significantly enriched windows based on the negative binomial model. It works best with multiple replicates.
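The sliding-window counting step can be illustrated with a short Python sketch. It covers only the counting idea; the function name, the window and step sizes, and the toy positions are assumptions chosen for illustration, not USeq defaults, and the negative binomial testing done with DESeq2 is not shown.

```python
from bisect import bisect_left

def sliding_window_counts(positions, chrom_length, window=200, step=100):
    """Count read positions in overlapping half-open windows [start, end).

    Hypothetical helper for illustration: returns a list of
    (start, end, count) tuples covering one chromosome.
    """
    sorted_pos = sorted(positions)
    counts = []
    last_start = max(chrom_length - window, 0)
    for start in range(0, last_start + 1, step):
        end = start + window
        # Binary search gives the number of positions with start <= p < end
        n = bisect_left(sorted_pos, end) - bisect_left(sorted_pos, start)
        counts.append((start, end, n))
    return counts

# Toy data (made up): read midpoints on a 1,000 bp chromosome
reads = [10, 50, 150, 250, 260, 900]
windows = sliding_window_counts(reads, chrom_length=1000)
```

Per-window counts like these, collected for each replicate of the ChIP and control samples, form the count matrix on which a negative binomial test can then be run.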
For quality control, visualization, and high-level analysis, the deepTools package is recommended for many needs. The BioToolBox package, developed by a member of the Core, is also available for collecting processed data with respect to annotation for graphing purposes. Among R packages, ChIPseeker is also useful.
The HCI interactive Linux servers have these and other software installed.