Practical Guidelines for the Comprehensive Analysis of ChIP-seq Data

Timothy Bailey, Pawel Krajewski, Istvan Ladunga, Celine Lefebvre, Qunhua Li, Tao Liu, Pedro Madrigal, Cenny Taslim, Jie Zhang

Research output: Contribution to journal (Article)

97 Citations (Scopus)

Abstract

Mapping the chromosomal locations of transcription factors, nucleosomes, histone modifications, chromatin remodeling enzymes, chaperones, and polymerases is one of the key tasks of modern biology, as evidenced by the Encyclopedia of DNA Elements (ENCODE) Project. To this end, chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is the standard methodology. Mapping such protein-DNA interactions in vivo using ChIP-seq presents multiple challenges not only in sample preparation and sequencing but also for computational analysis. Here, we present step-by-step guidelines for the computational analysis of ChIP-seq data. We address all the major steps in the analysis of ChIP-seq data: sequencing depth selection, quality checking, mapping, data normalization, assessment of reproducibility, peak calling, differential binding analysis, controlling the false discovery rate, peak annotation, visualization, and motif analysis. At each step in our guidelines we discuss some of the software tools most frequently used. We also highlight the challenges and problems associated with each step in ChIP-seq data analysis. We present a concise workflow for the analysis of ChIP-seq data in Figure 1 that complements and expands on the recommendations of the ENCODE and modENCODE projects. Each step in the workflow is described in detail in the following sections.

Original language: English (US)
Article number: e1003326
Journal: PLoS Computational Biology
Volume: 9
Issue number: 11
DOI: https://doi.org/10.1371/journal.pcbi.1003326
State: Published - Nov 1 2013
Externally published: Yes

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Modeling and Simulation
  • Ecology
  • Molecular Biology
  • Genetics
  • Cellular and Molecular Neuroscience
  • Computational Theory and Mathematics

Cite this

Bailey, T, Krajewski, P, Ladunga, I, Lefebvre, C, Li, Q, Liu, T, Madrigal, P, Taslim, C & Zhang, J 2013, 'Practical Guidelines for the Comprehensive Analysis of ChIP-seq Data', PLoS Computational Biology, vol. 9, no. 11, e1003326. https://doi.org/10.1371/journal.pcbi.1003326
@article{fbe6ff336f774306b02bda4625046b25,
title = "Practical Guidelines for the Comprehensive Analysis of ChIP-seq Data",
author = "Timothy Bailey and Pawel Krajewski and Istvan Ladunga and Celine Lefebvre and Qunhua Li and Tao Liu and Pedro Madrigal and Cenny Taslim and Jie Zhang",
year = "2013",
month = "11",
day = "1",
doi = "10.1371/journal.pcbi.1003326",
language = "English (US)",
volume = "9",
journal = "PLoS Computational Biology",
issn = "1553-734X",
publisher = "Public Library of Science",
number = "11",

}
