DNA-Protein Interactions Techniques

    DNA-protein interactions: methods for detection and analysis

    Bipasha Dey · Sameer Thukral · Shruti Krishnan · Mainak Chakrobarty · Sahil Gupta · Chanchal Manghani · Vibha Rani

    Received: 24 September 2011 / Accepted: 16 February 2012 / Published online: 8 March 2012

    © Springer Science+Business Media, LLC. 2012

    Abstract DNA-binding proteins control various cellular processes such as recombination, replication and transcription. This review summarizes some of the most commonly used techniques to determine DNA-protein interactions. In vitro techniques such as footprinting assays, electrophoretic mobility shift assay, southwestern blotting, yeast one-hybrid assay, phage display and proximity ligation assay are discussed. The highly versatile in vivo techniques such as chromatin immunoprecipitation and its variants, DNA adenine methyltransferase identification, as well as 3C and ChIP-loop assay, are also summarized. In addition, some in silico tools are reviewed to provide a computational basis for determining DNA-protein interactions. Biophysical techniques such as fluorescence resonance energy transfer (FRET), FRET-FLIM, circular dichroism, atomic force microscopy, nuclear magnetic resonance and surface plasmon resonance, among others, are also highlighted.

    Keywords DNA-protein interactions · Footprinting · Electrophoretic mobility shift assay · Southwestern blotting · Phage display · Yeast one-hybrid assay · Chromatin immunoprecipitation assay · Biophysical techniques

    Introduction

    Association of DNA with proteins is a phenomenon of utmost importance. Almost all aspects of cellular function, such as transcriptional regulation, chromosome maintenance, replication and DNA repair, depend on the interaction of proteins with DNA. Activation of genes by DNA-binding proteins is a fundamental regulatory mechanism in which chromatin-modifying and transcription complexes are recruited to initiate RNA synthesis [1]. Such DNA-binding proteins have diverse roles and may function as structural proteins making up the nucleosome, as enzymes modulating chromatin structure to control gene expression, as transcription factors, and also as cofactors. Transcription factors (TFs) are among the most widely studied DNA-binding proteins. The association of TFs with DNA is critical in developmental processes and in the response to environmental stresses. In humans, TF dysfunction can also contribute to the progression of various diseases [2].

    In view of the important role played by DNA-protein interactions, various techniques have evolved over the years to elucidate them. Each technique, with its own advantages and drawbacks, serves a very specific purpose. In brief, each technique addresses one of the two partners in the interaction: the protein (molecular weight, identity, domains, etc.) or the DNA (general sequence, specific sequence, alternative sequences, etc.).

    This review summarizes some of the most important in vitro, in vivo, in silico and biophysical techniques for studying DNA-protein interactions, owing to the pivotal role played by DNA-associating proteins in various cellular processes. It is intended to help researchers understand and evaluate the various DNA-protein interaction techniques and use them appropriately in their research.

    All the authors have contributed equally.

    B. Dey · S. Thukral · S. Krishnan · M. Chakrobarty · S. Gupta · C. Manghani · V. Rani (✉)

    Department of Biotechnology, Jaypee Institute of Information

    Technology, A-10 Sector-62, Noida 201307,

    Uttar Pradesh, India

    e-mail: [email protected]

    Mol Cell Biochem (2012) 365:279-299

    DOI 10.1007/s11010-012-1269-z

    In vitro techniques to study DNA-protein interactions

    Several techniques are available to determine DNA-protein interactions experimentally in vitro. Some of the well-known in vitro techniques are the footprinting assay, southwestern assay, electrophoretic mobility shift assay, yeast one-hybrid assay, phage display and proximity ligation assay.

    Footprinting assay

    Footprinting assays are based on the principle that protein-bound DNA is protected from degradation. The technique is used to decipher the specific sequence to which a DNA-binding protein or molecule binds. The procedure employs chemical or enzymatic digestion of naked and protein-bound DNA oligomers. The two reactions are then compared using gel electrophoresis. The segment of the DNA bound by the protein appears as an empty stretch, the footprint, in the protein-bound reaction when compared to the continuous ladder of fragments produced by naked DNA digestion (Fig. 1a).
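
The logic of reading a footprint can be sketched in a few lines of Python. This is only a toy model, assuming an idealized probe that can cut every unprotected bond, a single labelled end and complete protection under the protein; the function names and the 40-nt fragment with a binding site at positions 15-24 are hypothetical.

```python
def labelled_fragment_lengths(dna_length, protected=None):
    """Lengths of end-labelled fragments expected on the gel.

    A cut after position i (1-based) leaves a labelled fragment of
    length i. Positions inside `protected` (inclusive start and end)
    are assumed to be fully shielded by the bound protein.
    """
    lengths = set()
    for cut in range(1, dna_length):
        if protected and protected[0] <= cut <= protected[1]:
            continue  # the probe cannot reach this bond
        lengths.add(cut)
    return lengths


def find_footprint(naked, bound):
    """Return (first, last) fragment length missing from the bound lane."""
    missing = sorted(naked - bound)
    if not missing:
        return None
    return missing[0], missing[-1]


naked_lane = labelled_fragment_lengths(40)
bound_lane = labelled_fragment_lengths(40, protected=(15, 24))
footprint = find_footprint(naked_lane, bound_lane)
print(footprint)  # (15, 24)
```

The gap in the ladder of fragment lengths maps directly back to the protected positions, which is what the empty stretch on the gel represents.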

    Footprinting has been a valuable tool for elucidating the sequence specificity and dissociation constants of a variety of ligands binding to DNA. The agent used to cleave DNA is called the probe. The smaller the probe, the higher the resolution it provides, but the greater its chance of cleaving DNA underneath the bound protein. Enzymatic probes include DNase I and MNase [3]; chemical probes include methidiumpropyl-EDTA-Fe(II) (MPE) [4, 5], copper phenanthroline [6], uranyl photocleavage [7, 8], hydroxyl radicals [9-13] and iron complexes [14]. Comparisons between different footprinting probes provide useful information on their relative merits and demerits [15-18].

    DNase I footprinting is the most commonly used footprinting assay. DNase I is a double-strand-specific endonuclease that binds in the minor groove and breaks phosphodiester bonds. The technique was developed by Galas and Schmitz [19] to visualize the binding of the lac repressor protein to the lac operator sequence. It employs a synthetic or natural DNA fragment radiolabelled at a single end. The fragment is incubated with either a crude or a purified protein sample under appropriate binding conditions, allowing the protein to bind to its specific DNA sequence. The protein-bound fragment and the control (i.e. the naked fragment) are then subjected to DNase I treatment in an appropriate buffer, at varying concentrations and for varying time periods. Both samples are then run on a denaturing polyacrylamide gel, processed and imaged [20].

    Fig. 1 In vitro techniques to study DNA-protein interactions

    There are several key points for this technique [21]. First, by altering the experimental conditions, DNase I is allowed to digest the fragment only partially, so that each fragment is assumed to carry a single nick. This creates a range of fragments that differ from one another by a single nucleotide, providing high resolution for the protein-binding sequence. If DNase I cleaved all sequences without any sequence-dependent specificity, all bands would be of similar intensity; experimental footprinting gels show, however, that DNase I has partial sequence specificity, so that some sites are hypersensitive and give a more intense band. Second, end labelling of the DNA serves a specific purpose. In a single reaction, DNase I cuts both strands, leading to a mixture of +ve and -ve strand fragments that are separated on the denaturing gel. The purpose of radiolabelling the DNA is to indicate to which of the two anti-parallel strands the protein under consideration binds. Thus, in a 5′-labelled reaction, only the sequence information of the 5′-3′ strand appears on the final exposed film. Consequently, it is common to digest +ve and -ve strand labelled fragments in separate tubes and then run them alongside each other. Lastly, there are a variety of methods to analyze the final footprinting image. These range from visual inspection to a differential cleavage plot based on densitometric analysis. Techniques are also available for quantitative analysis of binding affinity [22].
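
A differential cleavage plot can be computed from densitometric band intensities roughly as sketched below. The convention used here, the log ratio of fractional cleavage in the protein lane to that in the control lane, is one common choice and is an assumption rather than the procedure of any cited reference; the function names and intensity values are invented for illustration, and all intensities must be positive.

```python
import math


def fractional_cleavage(band_intensities):
    """Scale each band to its share of the total signal in the lane."""
    total = sum(band_intensities)
    return [value / total for value in band_intensities]


def differential_cleavage(control_lane, protein_lane):
    """ln(f_protein) - ln(f_control) per bond; negative means protection."""
    f_control = fractional_cleavage(control_lane)
    f_protein = fractional_cleavage(protein_lane)
    return [math.log(p) - math.log(c) for c, p in zip(f_control, f_protein)]


def protected_bonds(control_lane, protein_lane, cutoff=0.0):
    """Indices of bonds whose differential cleavage falls below the cutoff."""
    scores = differential_cleavage(control_lane, protein_lane)
    return [i for i, score in enumerate(scores) if score < cutoff]


control = [100, 120, 80, 110, 90, 100]
with_protein = [95, 115, 8, 10, 9, 98]
print(protected_bonds(control, with_protein))  # [2, 3, 4]
```

Normalizing each lane to its total signal compensates for unequal loading, at the cost of slightly inflating the scores of unprotected bonds when a large site is protected.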

    The purpose of the denaturing gel is to ensure that the fragments appear on the gel only as single-stranded DNA. Often the naked DNA is chemically sequenced and run on the same gel as a marker for finding the exact sequence of the footprint. Some precautions must be taken, however. First, it is important not to use too much DNA relative to the protein sample. An excess leaves a large amount of DNA unbound by the protein and thus susceptible to DNase I attack. The resulting fragments appear at the position of the expected footprint, making it harder to detect. Second, incubation of crude samples with the fragment requires the presence of competitor DNA to prevent non-specific binding proteins from producing a footprint. A limitation of this technique is that it does not reveal the identity of the protein [20]. Because of the large molecular weight of DNase I, its attack is easily sterically hindered by the bound protein. Other probes for nicking DNA, such as free radicals, may not be hindered so easily and hence may nick a few bases underneath the bound protein itself.

    Apart from proteins, footprinting is also used to elucidate the binding of small molecules such as drugs to DNA [22]. Modifications of the technique use automation and capillary electrophoresis along with fluorescent labelling [23, 24]. DNase-seq, in which DNase I digestion is followed by sequencing, is often used for genome-wide studies [25]. Protocols using automated infrared sequencers, which allow long-range and highly sensitive DNase I footprinting, have been developed [26]. Using streptavidin-bound oligonucleotides for protein binding and subsequent DNase I digestion also makes the technique more convenient [27]. The technique can also be used to fish out a protein of interest from a crude sample, and to measure quantitatively the dissociation constant of a protein-DNA interaction [28-30]. Though initially an in vitro method, it has been adapted for use as an in vivo footprinting assay involving permeabilization of cells followed by DNase I-mediated cleavage and ligation-mediated PCR [31, 32]. Drug-RNA footprinting methods have also been developed [33]. Gold (Au)-DNA conjugates are now known to change their surface plasmon resonance (SPR) wavelength depending on the length of the attached DNA oligo. Comparing the SPR wavelength of a control protein-bound DNA-Au conjugate with that of a protein-bound DNA-Au conjugate digested with DNase I or another probe indicates the exact number of nucleotides from the end at which the protein under consideration is bound. This recent advance provides a label-free, quantitative, real-time measurement of nuclease activity and of the footprint of a bound protein without running a gel [34].

    Electrophoretic mobility shift assay (EMSA)

    EMSA is a relatively simple in vitro technique to study DNA-protein interactions. Its value lies in its use to deduce the binding parameters and relative affinities of a protein for one or more DNA sites, or to detect protein-nucleic acid interactions with the aim of comparing the affinities of different proteins for the same sites [35]. It is based on the principle that DNA-protein complexes are larger and migrate more slowly than unbound free probe when subjected to non-denaturing polyacrylamide or agarose gel electrophoresis. Since the migration of the DNA is shifted or retarded when it is bound to protein, the assay is also referred to as a gel shift or gel retardation assay. The DNA sequence is supplied externally: the DNA probe and the crude cell protein lysate are incubated together in a binding reaction and then separated on a gel. The DNA probes may be radiolabelled, or dyes that specifically stain DNA and protein may be used to visualize the DNA-protein interaction. In general, poly(dI-dC) is added to suppress non-specific binding. A supershift assay can be performed to confirm the identity of the protein in the DNA-protein complex by using an antibody specific to the protein of interest. When the antibody is incubated with the DNA-protein sample before gel separation, the DNA-protein-antibody complex can be visualized as a supershifted band. Competition assays may also be performed using unlabelled specific and non-specific oligo duplexes (Fig. 1b).

    EMSA can be used qualitatively to identify sequence-specific DNA-binding proteins in crude lysates and, in conjunction with mutagenesis, to identify the important binding sequences within the upstream regulatory region of a given gene. EMSA can also be used quantitatively to measure thermodynamic and kinetic parameters. The technique offers several advantages. The most significant is its ability to resolve complexes of different stoichiometry or conformation. Another major advantage is that the source of the DNA-binding protein may be a crude nuclear or whole-cell extract, an in vitro transcription product or a purified preparation. In addition, the relatively low ionic strength of the electrophoresis buffer helps to stabilize transient interactions, permitting even labile complexes to be resolved and analyzed by this method [36-39].
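
As an illustration of the quantitative use of EMSA, the sketch below estimates an apparent dissociation constant from the intensities of the shifted and free bands across a protein titration. It assumes simple 1:1 binding with protein in large excess over probe, so that the fraction bound is [P] / (Kd + [P]); the band intensities are invented, the function names are hypothetical, and the grid search merely stands in for a proper non-linear regression.

```python
def fraction_bound(protein_nM, kd_nM):
    """Fraction of probe bound for simple 1:1 binding, protein in excess."""
    return protein_nM / (kd_nM + protein_nM)


def fit_kd(protein_nM, observed, lowest=0.01, highest=10000.0, steps=2000):
    """Least-squares Kd estimate by scanning a log-spaced grid of candidates."""
    best_kd, best_error = lowest, float("inf")
    for step in range(steps + 1):
        kd = lowest * (highest / lowest) ** (step / steps)
        error = sum((fraction_bound(p, kd) - y) ** 2
                    for p, y in zip(protein_nM, observed))
        if error < best_error:
            best_kd, best_error = kd, error
    return best_kd


# Hypothetical densitometry readings for each lane of the titration.
protein_nM = [1, 3, 10, 30, 100, 300]
bound_band = [48, 130, 333, 600, 833, 938]
free_band = [952, 870, 667, 400, 167, 62]

observed = [b / (b + f) for b, f in zip(bound_band, free_band)]
kd_estimate = fit_kd(protein_nM, observed)
print(f"apparent Kd of about {kd_estimate:.1f} nM")
```

Under this model the apparent Kd is simply the protein concentration at which half of the probe is shifted, which gives a quick visual check on the fitted value.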

    A variation on conventional EMSA is the capillary electrophoretic mobility shift assay (CEMSA), which allows rapid separation and quantitation of DNA-protein interactions in uncoated capillaries with no gel matrix, using high-sensitivity laser-induced fluorescence detection of fluorescein-labelled DNA. Capillary electrophoresis (CE) separates analytes on the basis of their mass-to-charge ratio and elutes them in the order free protein, protein/DNA complex and, lastly, DNA [40]. A rapid and quantitative procedure has also been developed that uses a laser-induced fluorescence detection system to permit accurate assessment of specific DNA-protein interactions at signal levels more than 100-fold below the minimum necessary for EMSA [41].

    IDEMSA is another modification of EMSA that combines immunodepletion with the traditional EMSA and supershift assays. Here, the nuclear or cytoplasmic extracts are depleted of the specific protein by incubation with the relevant antibody and protein A-sepharose. The depleted extracts are then analyzed for the presence of the protein by EMSA and supershift assay. The advantage of this technique is that it combines the results of immunodepletion and supershift to determine the protein composition of a particular protein-DNA complex and also the localization of the dimer to a specific complex [42].

    Southwestern blotting

    This technique combines the principles of Southern and western blotting and is primarily used to determine the molecular weight of the protein in a protein-DNA complex. Although a supershift assay, an extension of the EMSA experiment, provides more information on the nature, and hence the molecular weight, of the protein, often no antibodies are known for the bound protein. Thus, in cases where no preliminary knowledge of the DNA-binding protein is available, southwestern blotting provides at least some minimal information, such as molecular weight.

    The experimental procedure is a modified western blot that uses labelled oligonucleotides instead of antibodies as probes. In brief, the crude or purified cytoplasmic, nuclear or whole-cell extract containing the protein of interest is resolved by SDS-PAGE, followed by electrophoretic transfer of the proteins from the gel to a membrane under conditions favouring renaturation of the proteins. The membrane-bound proteins are then incubated with oligonucleotides to which the protein of interest putatively binds. The membrane is developed and photographed, and only the band corresponding to the bound oligo appears in the final picture (Fig. 1c). Aligning the band on the developed picture with the position of the protein on the SDS-PAGE gel identifies the protein bound to the oligo and provides information about its molecular weight [43-46]. SDS-PAGE provides the molecular weight information, while blotting allows the protein to bind to the sequence. The labelling is required to mark the position of the bound protein-DNA complex [47].
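
The step of reading a molecular weight off the aligned band can be sketched as follows. The usual approach, assumed here rather than taken from the cited protocols, is to fit a straight line to log10(molecular weight) against the relative migration (Rf) of marker proteins run on the same gel and to interpolate the band of interest; the ladder values and function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mw_kda(band_rf, ladder):
    """Interpolate a band's molecular weight from (kDa, Rf) marker pairs."""
    rfs = [rf for _, rf in ladder]
    log_mws = [math.log10(mw) for mw, _ in ladder]
    slope, intercept = fit_line(rfs, log_mws)
    return 10 ** (slope * band_rf + intercept)


# Hypothetical marker ladder: (molecular weight in kDa, relative migration).
ladder = [(100, 0.20), (70, 0.355), (50, 0.50), (35, 0.656), (25, 0.80)]

# Rf of the labelled band after aligning the blot with the gel.
band_mw = estimate_mw_kda(0.43, ladder)
print(f"estimated molecular weight of about {band_mw:.0f} kDa")
```

The estimate is only as good as the linear range of the gel, so bands migrating outside the ladder should not be extrapolated this way.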

    Using 2-D gel electrophoresis instead of SDS-PAGE, with on-blot digestion of the DNA-bound protein followed by LC-MS/MS analysis, provides better information about the molecular weight of the protein [48, 49]. Non-radioactive methods for southwestern blotting make the procedure less cumbersome [50, 51]. Moreover, using differently labelled oligos on the same blot provides information on the binding affinity of various mutants of the oligo. The same blot can be probed with different probes by using alkaline phosphatase to strip the signal of the bound probe [52]. A further modification uses the southwestern blot itself as a substrate for nuclease footprinting or for other types of footprinting, such as chemical nuclease and methylation protection, thus identifying the exact DNA sequence where the protein binds [53]. To differentiate specific from non-specific binding on the blot, a rapid dimethylsulphate (DMS) protection assay has been developed; it distinguishes between them on the basis that specific binding renders the complex impervious to DMS [54]. Though southwestern blotting is primarily a technique for determining the molecular weight of a protein binding to a known DNA sequence, it can also be used to find the DNA sequence to which a particular protein binds [55]. Caution is advised when cDNA expression libraries are screened by southwestern methodologies [56]. Southwestern histochemistry is another important modification, allowing in situ identification and localization of DNA-binding proteins. It uses oligonucleotides instead of antibodies to probe a specific protein in a histological sample. Incubation of the labelled oligonucleotide with the crude or purified cytoplasmic, nuclear or whole-cell extract, followed by UV cross-linking and subsequent resolution of the extract by gel electrophoresis, is an alternative to blotting [57-59].

    Apart from these modifications, another possibility would be to couple chromatographic separation of the proteins with SDS-PAGE of each fraction. This would provide better information on the characteristics, purification properties and molecular weight of the protein. Further, the technique is restricted to blotting because oligonucleotides cannot be made to penetrate an SDS-PAGE gel directly and bind to their cognate proteins. Hence, if protein-resolving, oligonucleotide-permeable gels are developed in future, the blotting procedure could be avoided and hybridization could take place in the gel itself.

    The disadvantage of this technique is that DNA-binding proteins composed of multiple subunits may dissociate during the SDS-PAGE step and hence evade detection. Even monomeric proteins may not renature properly on the blot to recognize their binding sequence. Proteins requiring co-factors for DNA binding are difficult to detect on the blot unless those specific co-factors are added [60].

    Yeast one-hybrid assay (Y1H)

    The Y1H, a modification of the yeast two-hybrid assay, is a sensitive technique for identifying and analyzing proteins that bind to a specific DNA fragment of interest. In 1993, Wang and Reed [61] first used the Y1H to clone the gene encoding the olfactory neuron-specific transcription factor OLF-1.

    Like the yeast two-hybrid assay, this assay exploits the finding that most eukaryotic transcription factors have two physically separable domains, the activation domain (AD) and the DNA-binding domain (DB/DBD). Separating these domains from each other results in a functionally inactive transcription factor that cannot recruit RNA polymerase to its corresponding promoter to start transcription [62].

    In the yeast two-hybrid assay, which is used to study protein-protein interactions, a protein X is translationally fused to AD, while the other protein Y is translationally fused to DB, and both are expressed in the same yeast cell. The DB-Y fusion is often referred to as the bait and the AD-X fusion as the prey. If X and Y interact within the yeast cell, AD and DB are brought into close physical proximity, reconstituting the functionally active transcription factor and allowing expression of a downstream reporter gene. Thus, yeast cells expressing the reporter gene show that the proteins X and Y interact with each other.

    In the one-hybrid system, on the other hand, the bait is replaced by a DNA sequence of interest, and the interaction of a protein X with the bait sequence is assayed. If X interacts with the bait DNA sequence, the AD-X fusion is brought close to the promoter, allowing AD to recruit RNA polymerase and drive expression of the downstream reporter gene. While AD recruits RNA polymerase, X plays the role of DB, since the DB fusion protein is absent here. Since this assay involves only one hybrid protein encoded on a vector, it is called the Y1H. In other words, the one-hybrid assay can be used to trap any protein (X) having a binding domain specific for any target DNA sequence (Fig. 1d).

    The one-hybrid assay offers maximal sensitivity because detection of the DNA-protein interaction occurs while the proteins are in their native configurations. In addition, the gene encoding the DNA-binding protein of interest is immediately available after a library screening [63].

    The procedure first involves the construction of a vector carrying the bait sequence upstream of a reporter gene promoter. Transforming yeast cells with this bait-reporter construct generates a yeast reporter strain to be used for the assay. The bait sequence and reporter gene may remain on the vector or can be integrated into the chromosome. Integration of the construct into the yeast genome is preferred and is ensured by high-frequency homologous recombination sites flanking the bait-reporter region on the vector. The transformants are screened by marker selection and are then transformed again with the vector encoding a DNA-binding protein fused to the Gal4p AD. The library of AD-DNA-binding protein fusions is screened for DNA-binding proteins specific to the bait sequence, which are reported by expression of the reporter genes [63]. The bait sequence can be an artificial site having several tandem repeats of the sequence, a partial site or a fully functional site in situ [64].

    The most commonly used reporter gene is HIS3, which allows yeast cells showing a positive interaction to grow in a medium lacking histidine. Background noise due to basal or leaky expression of HIS3 is eliminated by including a competitive inhibitor, 3-amino-1,2,4-triazole (3AT), in the medium. A higher level of HIS3 expression is then required for the yeast cells to survive in the medium. This higher level of expression is possible only in cells showing a positive interaction, and hence false positives are eliminated [64]. LacZ is another reporter gene; it encodes β-galactosidase and can be used in colorimetric assays.
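
The reasoning behind 3AT selection can be made concrete with a toy threshold model. Everything numerical here is an assumption made for illustration: HIS3 expression is given in arbitrary units, and the requirement for growth is taken to rise linearly with the 3AT concentration; the function names and colony labels are hypothetical.

```python
def required_his3_level(three_at_mM, basal_need=2.0, per_mM=0.5):
    """HIS3 expression (arbitrary units) needed for growth without histidine."""
    return basal_need + per_mM * three_at_mM


def surviving_colonies(his3_expression, three_at_mM):
    """Names of the colonies whose HIS3 expression exceeds the requirement."""
    threshold = required_his3_level(three_at_mM)
    return sorted(name for name, level in his3_expression.items()
                  if level > threshold)


colonies = {
    "true interactor": 50.0,   # AD fusion binds the bait sequence
    "leaky background": 8.0,   # basal HIS3 expression only
    "no interaction": 1.0,
}

print(surviving_colonies(colonies, 0))   # ['leaky background', 'true interactor']
print(surviving_colonies(colonies, 20))  # ['true interactor']
```

In practice the 3AT concentration is chosen empirically for each reporter strain, which corresponds to moving the threshold in this model until background growth disappears.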

    There are several modifications of the yeast one-hybrid system. First, the one-hybrid system can be used to look for interaction-defective proteins by using a reporter gene that codes for a toxic product. A positive interaction then leads to expression of the toxin and cell death, while a lack of interaction confers cell viability. This method is referred to as the reverse one-hybrid assay. When used to screen an AD fusion library of random mutations in the DNA-binding protein, it can identify mutations that disrupt a DNA-protein interaction [62]. It can also be used to assess the therapeutic potential of drugs or other small molecules that disrupt DNA-protein interactions involved in the onset of a disease. Like the reverse two-hybrid assay, the reverse one-hybrid assay can be titrated to generate a range of DNA-protein affinities [65]. Second, Y1H can be modified to screen various binding sites in a bait sequence that binds to a known DNA-binding protein. Third, Y1H can also be used to screen for specific epitopes on a known DNA-binding protein.

    Phage display for DNA-binding proteins

    Phage display refers to the method of expressing a peptide or protein domain on a bacteriophage capsid by genetically fusing its sequence to that of one of the coat proteins encoded by the phage. A wide variety of proteins can be expressed in this way, yielding a pool of variants referred to as a phage-display library. The proteins of interest can then be selected from the library by affinity purification using an appropriate ligand. The clones with the highest affinity for the target ligand can be enriched by sequential rounds of selection and amplified by passage through a bacterial host. The identity of the selected clones can be obtained by sequencing the phage genome, thus giving complete information about the protein of interest [66, 67]. Phage display is also applied to map DNA-protein interactions, because it allows a large number of protein variants to be screened simultaneously and yields their complete sequence information [68-73].

    In this modified version of phage display, phages express a DNA-binding protein domain fused to their coat protein. Affinity purification of a pool of clones expressing various DNA-binding protein domains is carried out using dsDNA oligos (carrying the binding sequence specific to a protein) bound to a solid matrix. Enrichment and amplification are done as in conventional phage-display experiments and, depending on the type of library screened, result in identification of the protein domains in the DNA-binding protein that physically interact with the DNA (Fig. 1e).

    The choice of library to be screened is dictated by the aim of the experiment. The most common type of phage-display library is the random peptide library (RPL), obtained by randomizing the DNA sequence at a selected region of the gene encoding the DNA-binding protein. Such libraries can be used to check which residue in the region is involved in the interaction, or to select rare clones with enhanced function or clones in which the displayed domain has acquired a new function as a result of mutation. Other libraries, such as cDNA or genomic libraries, are used to identify the DNA-binding proteins [67]. DNA oligos are prepared by annealing complementary oligonucleotides and biotinylating at least one of the strands so that they can be bound to streptavidin-coated matrices.

    The procedure first entails the construction of an appropriate phage-display library of DNA-binding proteins according to the protocol described previously [66]. DNA oligos bound to an appropriate matrix are then incubated with the phages. The unbound phages are removed by several rounds of washing. The bound phages are then eluted and amplified by passage through a bacterial host. These affinity-purified and amplified phages then undergo another round of affinity purification and amplification. Several rounds of affinity purification followed by amplification lead to enrichment of the phage clones expressing the DNA-binding protein domain with maximum affinity for the DNA of interest.
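
Why a few rounds of selection are enough can be seen from a simple enrichment calculation. The model below is a sketch under stated assumptions: each round retains a fixed fraction of specific binders and a much smaller fraction of background phage, and amplification restores the pool size without changing the ratio between clones. The recovery values and the starting frequency are invented for illustration.

```python
def enrich(initial_fraction, binder_recovery, background_recovery, rounds):
    """Fraction of specific binders in the pool after each panning round."""
    fraction = initial_fraction
    history = []
    for _ in range(rounds):
        kept_binders = fraction * binder_recovery
        kept_background = (1.0 - fraction) * background_recovery
        # Amplification in the bacterial host restores the pool size but
        # is assumed to leave the ratio between clones unchanged.
        fraction = kept_binders / (kept_binders + kept_background)
        history.append(fraction)
    return history


history = enrich(initial_fraction=1e-6, binder_recovery=0.1,
                 background_recovery=1e-4, rounds=3)
for round_number, fraction in enumerate(history, start=1):
    print(f"round {round_number}: {fraction:.4f} of the pool are binders")
```

With a 1000-fold preference per round, a clone present at one in a million dominates the pool after three rounds in this model; real selections are less clean because clones also differ in growth rate during amplification.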

    Once the clones are selected and enriched, their binding property is assayed by phage ELISA as a final confirmation. Here, streptavidin-coated microtiter plates are first coated with the biotinylated DNA oligos. The enriched and amplified phages are then allowed to bind in the wells. The unbound phages are washed off and an enzyme-conjugated anti-phage antibody is added. After the unbound antibodies are washed off, the colour-developing solution carrying the substrate is added and the reaction is stopped after a specified time. The intensity of the colour developed is measured at 450 nm using a plate-reader spectrophotometer. A higher intensity indicates and confirms a strong interaction between the DNA oligo and the proteins displayed on those phage clones.
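
The plate-reader values can be turned into a ranked list of clones along the following lines. The sketch assumes that each clone is also tested in a control well (for example one without the specific oligo), which the description above does not specify, and it uses an arbitrary three-fold signal-to-background cutoff; the absorbance readings and names are hypothetical.

```python
def signal_to_background(a450_target, a450_control):
    """Ratio of absorbance in the target well to that in the control well."""
    return a450_target / a450_control


def rank_clones(readings, cutoff=3.0):
    """Return (clone, ratio) pairs at or above the cutoff, strongest first."""
    ratios = {clone: signal_to_background(target, control)
              for clone, (target, control) in readings.items()}
    positives = [(clone, ratio) for clone, ratio in ratios.items()
                 if ratio >= cutoff]
    return sorted(positives, key=lambda item: item[1], reverse=True)


# Hypothetical A450 readings: (well with target oligo, control well).
readings = {
    "clone_A": (1.80, 0.10),
    "clone_B": (0.45, 0.09),
    "clone_C": (0.12, 0.10),
}

ranked = rank_clones(readings)
print([clone for clone, _ in ranked])  # ['clone_A', 'clone_B']
```

The ratio is only a relative measure of binding; it does not by itself give an affinity constant.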

    Proximity ligation assay (PLA)

    PLA is used for ultrasensitive protein analysis and for measuring DNA-protein interactions. In this technique, proteins or DNA-protein interactions can be detected directly, and DNA representations of the detected proteins are created. Following this, the amplified oligonucleotides are attached to specific protein-binding reagents (mono/polyclonal antibodies). One of the proximity probes is a partly double-stranded oligonucleotide with a single-stranded

    DNA-protein complexes are captured onto a HaloLink
    Resin. This is followed by the standard decross-linking,
    DNA purification and PCR amplification of the enriched DNA
    [78–80]. In general, there are many ways to detect
    immunoprecipitated chromatin, such as polymerase chain
    reaction (PCR), quantitative PCR (qPCR), labelling and
    hybridization to genome-wide or tiling DNA microarrays
    (ChIP-on-chip), molecular cloning and sequencing, or direct
    high-throughput sequencing (ChIP-seq) [81–87]. There are
    several variations of the ChIP assay.
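    When qPCR is the readout, enrichment is commonly expressed as a
    percentage of the input chromatin. The following is a minimal
    sketch of that arithmetic, not a procedure taken from the cited
    protocols. It assumes perfect amplification efficiency (a
    doubling per cycle); the function names and parameters are
    illustrative.

```python
import math

def percent_input(ct_ip, ct_input, input_fraction):
    """Percent-of-input for one ChIP-qPCR target.

    ct_ip          -- Ct of the immunoprecipitated sample
    ct_input       -- Ct of the saved input sample
    input_fraction -- fraction of chromatin saved as input (0.01 for 1%)
    Assumption: amplification efficiency is exactly 2 per cycle.
    """
    # Adjust the input Ct to what 100% of the chromatin would have given.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)

def fold_enrichment(ct_ip, ct_mock):
    """Signal of the specific antibody relative to a mock (no-antibody) IP."""
    return 2.0 ** (ct_mock - ct_ip)
```

    For example, an IP sample that crosses the threshold two cycles
    later than a 1% input corresponds to a quarter of that input,
    i.e. about 0.25% of the starting chromatin.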

    X-ChIP

    This method freezes all DNA-associated proteins in place
    by cross-linking with formaldehyde. Formaldehyde reacts
    with primary amines located on amino acids and on the bases of
    DNA or RNA molecules, forming covalent cross-links
    between specific proteins and the DNA on which they are
    situated. The DNA-protein complexes are then isolated
    by cell lysis, and the crude cell extracts are sonicated to
    shear the DNA to a smaller size. The protein-DNA complex
    is immunoprecipitated and the DNA-protein cross-links are
    reversed by heating. The proteins are then removed by
    treatment with proteinase K. The DNA portion of the complex
    is then purified and identified by PCR using specific
    primers. Cross-linking with formaldehyde minimizes
    nucleosome rearrangements and is an efficient way
    to analyze proteins that are weakly or indirectly associated
    with DNA. Formaldehyde also has limitations as a cross-linking
    agent: its short cross-linking arm is not well suited to
    examining proteins that associate with DNA only indirectly,
    such as those found in larger complexes. A variety of
    long-range bifunctional cross-linkers may therefore have to be
    used in combination with formaldehyde to detect such
    interactions [88]. The yield of chromatin and its resolution
    may be lower after sonication, and epitopes are sometimes
    disrupted [89, 90].

    Native-ChIP (N-ChIP)

    N-ChIP is suited to natural DNA-protein interactions in which
    the proteins are tightly associated with chromatin in their
    native state, such as histones, which have a high affinity
    for DNA. These interactions therefore do not require
    cross-linking with formaldehyde. Native chromatin is
    digested into smaller fragments with micrococcal nuclease
    (MNase), and the fragments are then immunoprecipitated using
    an antibody specific to the protein of interest. Enzymatic
    digestion is mild and does not result in loss of the antibody
    epitope during immunoprecipitation, yielding higher
    immunoprecipitation efficiencies [91]. It also provides high
    resolution, as it is possible to produce single
    mononucleosomes of about 175 base pairs. However, digestion
    by MNase is uneven, as the enzyme favours certain areas of
    genome sequence over others. To avoid over-represented or
    overlooked data, X-ChIP should be carried out as a
    comparative control [92]. Nucleosomes may also rearrange
    during digestion, and this has to be taken into
    consideration while performing N-ChIP.

    Fast ChIP

    As the name suggests, Fast ChIP is a modification of the
    ChIP technique for large cell numbers that reduces the time
    required for a conventional ChIP assay and eliminates
    multiple tube transfers, thereby preventing loss of
    output. Conventional ChIP assays require a high cell number
    to begin with because of the low recovery rate of cross-linked
    DNA from total cellular DNA. Multiple washes during the
    procedure may also cause loss of specific interactions.
    Therefore, a technique that reduces the time and the chance
    of losing cells is favoured. In this modification of the ChIP
    assay, all the steps are similar, except that the cross-links
    are reversed during a 10 min incubation at 100 °C in an
    ultrasonic bath in the presence of Chelex-100, a resin that aids
    in the extraction of DNA. After incubation, the tubes are
    spun and the DNA-containing supernatant can be used directly
    in PCR [93]. The limitation of Fast ChIP is
    that it is suitable only for large cell samples.

    Carrier ChIP

    Carrier ChIP is based on immunoprecipitation from
    very few cells (as few as 100) and is suited to examining
    histone modifications associated with developmentally
    regulated genes. Immunoprecipitation of such a small
    amount of chromatin is facilitated by adding carrier
    chromatin from Drosophila, or from any other species that is
    evolutionarily distant from the species being investigated,
    which allows efficient precipitation of the target chromatin [94].

    Here native chromatin is partially digested using MNase

    and immunoprecipitated using antibodies to modified his-

    tones. The low amount of chromatin is detected by radio-

    active PCR and phosphorimaging. This technique,

    however, requires the primers to be designed with high

    specificity to prevent any spurious amplification of carrier

    DNA instead of the target chromatin.

    Matrix ChIP

    It is a microplate-based ChIP assay in which all the steps

    are done in microplate wells without sample transfers [95].

    In this method, antibodies immobilized via protein A/G
    are coated onto each well of a 96-well plate and processed
    further. This allows 96 ChIP assays for histones and
    various DNA-bound proteins, including transiently bound


    protein kinases, in a single run. It also keeps the
    antibodies in the correct orientation, which enhances their
    binding capacity [96].

    ChIP-Chip

    As the name suggests, ChIP-Chip is a technique that combines
    chromatin immunoprecipitation with microarray technology.
    The immunoprecipitated DNA fragments are labelled with a
    fluorescent dye such as Cy5 or Alexa 647 and combined with
    genomic DNA labelled with Cy3, which serves as the reference
    DNA. This probe mixture is then applied to a microarray chip,
    ideally covering the whole genome, and allowed to hybridize.
    The results of the experiment indicate the regions of the DNA
    enriched by immunoprecipitation. The ChIP data are thus
    obtained in the form of a one-dimensional series of signals,
    with peaks identifying the regions bound by the protein of
    interest [97]. Also, since the exact location of each arrayed
    element is known, a genome-wide map of DNA-protein
    interactions can be constructed.
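    The two-colour comparison can be illustrated with a small
    sketch that converts per-probe IP and reference intensities into
    log2 ratios and reports runs of consecutive enriched probes. This
    is an illustrative simplification, not the method of any package
    cited in this review; the threshold, pseudocount and function
    names are assumptions.

```python
import math

def log2_ratios(ip_signal, ref_signal, pseudocount=1.0):
    """Per-probe log2(IP / reference); the pseudocount avoids log(0)."""
    return [math.log2((ip + pseudocount) / (ref + pseudocount))
            for ip, ref in zip(ip_signal, ref_signal)]

def enriched_regions(ratios, threshold=1.0, min_probes=2):
    """Return (start, end) probe-index ranges (end exclusive) in which at
    least min_probes consecutive probes have a log2 ratio >= threshold."""
    regions, start = [], None
    # A sentinel below any threshold closes a run that reaches the last probe.
    for i, r in enumerate(list(ratios) + [float("-inf")]):
        if r >= threshold and start is None:
            start = i
        elif r < threshold and start is not None:
            if i - start >= min_probes:
                regions.append((start, i))
            start = None
    return regions
```

    Because probes are ordered along the chromosome, each reported
    index range maps back to a genomic interval through the known
    probe coordinates.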

    Various computational and mathematical models are
    available for the analysis of regions bound by the
    proteins [98]. CisGenome is one such software package, which
    covers almost all the needs of ChIP data analysis, including
    visualization, data normalization, peak detection, false
    discovery rate computation, gene-peak association, and
    sequence and motif analysis. Many statistical approaches
    have also been used to identify regions enriched by a
    transcription factor in ChIP data, including the hidden
    Markov model, Welch's t statistic, and model-based analysis
    of tiling arrays (MAT) [99].

    The ChIP-Chip technique offers several advantages
    over traditional ChIP assays. First, it allows probing of a
    large number of genomic regions in a single experiment,
    eliminating bias and saving time. Second, commercially
    available platforms can be used to study the localization of
    protein binding, removing the need to run expensive
    large-scale quantitative PCR assays. Third, it allows
    parallel analysis of different genes, which can then be
    grouped into classes for statistical comparison [100].

    Since an ideal microarray covering all the human
    chromosomes is not possible, this technique may be most
    beneficial when combined with other high-throughput technologies.

    DIP-Chip

    DIP-Chip is a modification of ChIP-Chip that overcomes
    limitations such as interference from protein-protein
    interactions and competitive binding in vivo. DIP-Chip is
    an in vitro technique whose results are comparable to those
    of in vivo assays. The procedure involves incubating purified
    and mechanically sheared genomic DNA with the purified
    protein of interest. The DNA-protein complexes are then
    affinity-purified using appropriate resins. The purified
    genomic fragments and the whole-genome fragments are then
    amplified and fluorescently labelled separately with
    different dyes, so that their relative abundance can be
    assessed by comparative hybridization to a DNA microarray
    that covers the entire genome of the organism [101].

    ChIP sequencing

    ChIP sequencing combines chromatin
    immunoprecipitation with DNA sequencing to identify the
    binding sites of protein factors co-precipitated
    with DNA fragments during ChIP [102]. To construct a
    ChIP-seq library, the ends of the enriched DNA
    fragments obtained by immunoprecipitation with a
    conventional ChIP protocol are blunted and phosphorylated
    using T4 polynucleotide kinase. An adenine is then added
    using Taq polymerase, and an adapter is ligated to both ends
    of the fragment [103]. The library is amplified by PCR,
    and DNA fragments of 100–300 bp are selected and
    sequenced. Finally, the short sequenced fragments, called
    tags, are aligned computationally to a reference genome
    to identify the enriched sites [104].

    This technique has several advantages over ChIP-Chip,
    including lower cost, less starting material and higher peak
    resolution. However, it also has a number of issues that
    need to be addressed. First, the ChIP-seq tags represent
    the ends of the enriched fragments and not the binding sites
    of the protein factor, and estimating the site-to-tag
    distance is complicated. Second, control samples are not
    sequenced deeply enough to check for regional biases along the
    genome arising from chromatin structure and copy
    number variations [102]. Third, the lack of advanced and
    user-friendly data analysis tools makes the analysis of peaks
    difficult.
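    Because tags mark fragment ends rather than the binding site
    itself, a common first approximation is to shift each tag towards
    the fragment centre before counting. The sketch below illustrates
    the idea under the assumption of a single, fixed fragment length
    (here 200 bp, within the 100–300 bp range selected above); the
    function names and the fixed-bin counting are illustrative and do
    not reproduce any particular peak caller.

```python
from collections import Counter

def shifted_positions(tags, fragment_length=200):
    """tags: iterable of (position, strand), strand being '+' or '-'.
    Each 5' tag end is moved by half the assumed fragment length
    towards the centre of the fragment it came from."""
    half = fragment_length // 2
    return [pos + half if strand == "+" else pos - half
            for pos, strand in tags]

def bin_counts(positions, bin_size=50):
    """Count shifted tags per fixed-size genomic bin (bin index -> count)."""
    return Counter(p // bin_size for p in positions)

def top_bin(tags, fragment_length=200, bin_size=50):
    """Return (bin start coordinate, tag count) of the most populated bin."""
    counts = bin_counts(shifted_positions(tags, fragment_length), bin_size)
    bin_index, n = counts.most_common(1)[0]
    return bin_index * bin_size, n
```

    Forward-strand tags upstream of a site and reverse-strand tags
    downstream of it then pile up in the same bin, which is why the
    assumed fragment length matters for peak resolution.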

    ChIP-seq has nevertheless proved to be a useful
    tool for studying histone modifications, nucleosome
    positioning and the binding sites of various DNA-binding
    proteins. Moreover, it can distinguish alleles on the basis
    of differences in SNPs, which would not be possible using
    ChIP-Chip [103].

    ChIP display

    ChIP-Chip has been described as a potential method for

    the identification of novel transcription factor binding sites

    in the genome. But it suffers from severe limitations

    including co-precipitation of non-specific DNA fragments


    which may sometimes even overwhelm the specific ones,
    resulting in strong background noise. To overcome this
    problem, a technique called ChIP display has been devised.
    It is based on the principle of concentrating the target
    fragments via restriction digestion and then scattering the
    precipitated non-specific DNA fragments by partitioning the
    digested fragments into different families. The partitioning
    is based on the identity of the nucleotides at the ends of
    these fragments [105]. Since all the target fragments remain
    in the same family, the signal is not eroded and is separated
    from the non-specific fragments of different families.

    ChIP display is a promising tool for reducing
    non-specific DNA precipitation. However, it has some
    practical limitations. First, since non-specifically
    precipitated DNA fragments can unexpectedly bind the protein
    in vitro (but not in vivo), the utility of this approach is
    debatable. Second, ChIP display is not well suited to a
    comprehensive analysis of target sequences for proteins
    with a large number of genomic targets, such as GATA
    proteins, histone deacetylases and polycomb proteins, or to
    the mapping of histone modifications [105]. It is better
    suited to transcription factors with a more limited number of
    targets.

    Other ChIP variations

    Certain other ChIP assay setups are distinguished by the
    buffers they use, which affect the purpose and efficiency of
    the immunoprecipitation; examples are Quick and Quantitative
    ChIP (Q2 ChIP) and MicroChIP. Q2 ChIP incorporates a histone
    deacetylase inhibitor during cross-linking, which helps to
    eliminate non-specific background, and also uses different
    elution buffers and a shorter protocol. MicroChIP is a
    miniaturized ChIP protocol for 10,000 cells that is
    applicable to genome-wide studies [106, 107].

    DNA adenine methyltransferase identification

    (DAMID)

    DAMID is a novel methylation-based tagging technique

    that has emerged as a powerful tool to study chromatin

    interactions in vivo. It has been successfully used to gen-

    erate genome-wide maps of several DNA-binding factors

    including GAGA factors, Max family of transcription

    regulators, coregulators and various other chromatin pro-

    teins [108].

    In this technique, the protein of interest is fused to a
    bacterial DNA adenine methylase (DAM), a single
    32 kDa polypeptide that methylates adenine at the N6
    position in the sequence GATC [109]. This methylation
    causes few changes in DNA topology and provides a
    unique tagging system to mark the binding sites of specific
    protein factors. The fusion protein is expressed in
    mammalian cells at low levels by using a weak promoter
    [108]. Binding of the fusion protein to its target site
    results in methylation of adenine nucleotides within
    DAM recognition sequences in the close vicinity of the
    target site. These methylated sequences are then cleaved by
    the DpnI enzyme to recover fragments containing regions
    nearby or within the gene, along with the target site itself.
    The fragments obtained may then be analyzed by
    quantitative PCR or subjected to microarray studies.
    Because chromatin accessibility also affects the
    level of methylation, a control experiment is run in parallel
    that measures the methylation levels in the probed
    sequences after the expression of dam alone [108].
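    Two pieces of this workflow are easy to express in code:
    locating the GATC sites that the methylase can tag, and
    normalizing the signal from the fusion protein against the
    dam-only control. The sketch below is a minimal illustration; the
    function names, the pseudocount and the use of a log2 ratio are
    assumptions rather than a protocol from the cited work.

```python
import math

def gatc_sites(sequence):
    """0-based start positions of every GATC (the DAM recognition site)."""
    seq = sequence.upper()
    sites, i = [], seq.find("GATC")
    while i != -1:
        sites.append(i)
        i = seq.find("GATC", i + 1)
    return sites

def damid_log_ratio(fusion_signal, dam_only_signal, pseudocount=1.0):
    """log2(DAM-fusion / DAM-only) methylation signal for one probed region.
    Positive values suggest targeted methylation beyond what chromatin
    accessibility alone would give."""
    return math.log2((fusion_signal + pseudocount) /
                     (dam_only_signal + pseudocount))
```

    A region with no GATC site in its vicinity cannot be tagged at
    all, so the density of sites sets a lower bound on the mapping
    resolution.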

    DAMID has significant advantages over the conventional
    ChIP technique. First, it does not use any cross-linking
    agents to fix the chromatin and also eliminates the
    need for a protein-specific antibody. Hence, it provides a
    simpler platform to study the binding properties of
    co-factors and other proteins that bind indirectly to DNA
    [110]. There is also less chance of misidentifying
    target sequences because of accidental cross-linking, as can
    happen with ChIP. Second, it provides an easier way to study
    the effects of mutations on the targeting specificity of the
    protein of interest, which is difficult to do using
    conventional ChIP assays [108].

    The limitations of DAMID are that it requires dam to be

    bound to the protein without inducing any changes in its

    function. Also, this technique is unsuitable for the detection

    of post-translational modifications, while ChIP successfully

    detects histone modifications. It is a time consuming

    technique as it involves expression of DAM-fusion protein

    for several hours [108].

    Chromosome conformation capture (3C)

    and ChIP-loop assay

    One of the key regulators of gene expression is spatial

    organization of the eukaryotic genome. 3C is a novel

    technique that is used to detect the frequency of interaction

    between two genomic loci in the nuclear space. It is a

    powerful tool to study the link between nuclear organiza-

    tion and transcription regulation. This technique is carried

    out by initially fixing the cells with formaldehyde which

    helps in cross-linking of interacting segments of the gen-

    ome via contacts between their DNA-bound proteins. The

    resulting network of protein-DNA complexes is subjected

    to restriction digestion followed by ligation at low DNA

    concentration, such that the ligation between cross-linked

    DNA fragments is favoured. After the reversal of cross-

    links, the fragments are subjected to quantitative PCR to

    further allow for the measurement of cross-linking


    frequency of the two specific restriction fragments

    (Fig. 2c) [111].

    Although ChIP and 3C share the same basic
    principle of cross-linking protein-DNA interactions, the
    two techniques differ in the information they provide:
    ChIP provides information about the DNA-binding activity of
    a protein, whereas 3C is used to study the interaction
    between two different genomic sites looped by
    a protein factor.

    To establish a link between 3C and ChIP, a
    technique called the ChIP-loop assay has been developed [112].
    By combining the two techniques, it allows the study of
    proteins mediating the interaction between two genomic loci.
    Initially, the cells are fixed with formaldehyde
    and the cross-linked chromatin is purified from free
    proteins by urea gradient centrifugation. This is followed
    by restriction digestion of the purified cross-linked
    chromatin and precipitation with protein A/G beads and
    specific antibodies. The precipitated chromatin is then
    allowed to ligate and is further analyzed by quantitative PCR
    as in standard 3C experiments [111]. Hence, the ChIP-loop
    assay helps in studying the proteins that are involved in
    organizing DNA loops to mediate genomic interactions. This
    technique provides better insight into interactions than 3C
    or ChIP used alone. However, a major concern is that
    concentrating the DNA before ligation may lead
    to the formation of loops between bead-associated DNA
    fragments. The results obtained may therefore not accurately
    identify the loops of DNA fragments formed in the nuclear
    space. This also makes the quantification of ligation
    products very complicated [113]. Nevertheless, the
    potential of this approach for identifying proteins
    participating in long-range interactions cannot be denied.

    In silico tools for identification of DNA-protein

    interactions

    Computational, in silico approaches are an important
    part of the study of DNA-protein interactions. Diverse
    computational tools that predict DNA-protein interactions
    are freely available. Most of these are aimed at predicting
    transcription factor-based gene regulation.

    TRANSFAC

    TRANSFAC is a comprehensive knowledgebase containing
    gene regulation data from a wide variety of
    eukaryotic organisms, ranging from yeast to humans. It
    mainly comprises data on transcription factors, their
    experimentally proven binding sites and regulated genes, and
    is a versatile resource for transcription factor (TF)
    binding predictions. It has a broad compilation of binding
    sites and allows the derivation of positional weight
    matrices, which can be used with the available tools to search
    DNA sequences. Entries are grouped under different
    tables of the TRANSFAC database. One of its features
    is the assignment of a quality value describing the
    confidence with which an observed DNA-binding activity could
    be assigned to a specific factor. Nucleotide weight matrices
    are derived from a collection of binding sites for a factor,
    and these matrices are used by the tool Match™ to find
    potential binding sites in uncharacterized sequences.
    Several web programs that utilize the TRANSFAC database are
    also available, such as AliBaba2, which is used for
    predicting TF binding sites in an unknown DNA sequence
    by utilizing the binding sites collected in TRANSFAC.
    P-Match is another tool for identifying transcription
    factor binding sites in DNA sequences. It combines pattern
    matching and weight matrix approaches to provide a high
    accuracy of recognition.
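    The weight matrix idea can be illustrated generically: a
    matrix is derived from aligned binding sites and then slid along
    a sequence, scoring every window. The sketch below shows that
    principle only. It is not the scoring scheme of Match, P-Match or
    AliBaba2, and the pseudocount, the uniform background and the
    example sites are assumptions.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=0.5, background=0.25):
    """Log-odds position weight matrix from equal-length binding sites."""
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(length):
        column = [s[i] for s in sites]
        pwm.append({b: math.log2(((column.count(b) + pseudocount) / total)
                                 / background) for b in BASES})
    return pwm

def scan(sequence, pwm):
    """Score every window of the sequence; returns (position, score) pairs."""
    width = len(pwm)
    return [(i, sum(pwm[j][sequence[i + j]] for j in range(width)))
            for i in range(len(sequence) - width + 1)]

def best_hit(sequence, pwm):
    """Highest-scoring window as a (position, score) pair."""
    return max(scan(sequence, pwm), key=lambda hit: hit[1])
```

    In practice a score cut-off, rather than the single best hit,
    decides which windows are reported as potential binding sites.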

    TRANSFAC is maintained as a relational database, from
    which public releases are made available via the web,
    making it an easily accessible database. Several web-based
    tools are linked to TRANSFAC and utilize its database to
    perform unique computational functions [114, 115].

    Identification of DNA-binding proteins (iDBPs) server

    The iDBPs server was developed for the identification of

    DNA-binding proteins with known three-dimensional

    structure. In the first stage of classification, the functional

    region of the protein is predicted using the PatchFinder

    algorithm which searches for clusters or patches of evolu-

    tionarily conserved residues on the protein surface. The

    maximum-likelihood (ML) patches found by PatchFinder

    often delineate the functional regions in proteins and,
    specifically, the core of DNA-binding regions within
    DNA-binding proteins [116]. The results sent to the user
    include the prediction score of the protein and the expected
    sensitivity and precision at this score cut-off.

    DNA site prediction from a list of adjacent residues

    (DISPLAR)

    DISPLAR is a neural network method that predicts the

    residues of a protein which interact with DNA, if the

    structure of a protein known to bind DNA is provided.

    Several inputs have to be provided to the neural network

    including position-specific sequence profiles and solvent

    accessibilities of each residue and its spatial neighbours.

    The neural network is trained on known structures of

    protein-DNA complexes. DISPLAR shows prediction

    accuracy over 80% and coverage of over 60% of actual

    DNA-contacting residues [117].
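    To make the two figures concrete, the sketch below computes
    them for a set of predicted residues against a set of known
    DNA-contacting residues. It assumes that "accuracy" means the
    fraction of predicted residues that truly contact DNA and that
    "coverage" means the fraction of true contacts that were
    predicted; the exact definitions used in [117] may differ, and
    the function name is illustrative.

```python
def accuracy_and_coverage(predicted, actual):
    """predicted, actual: collections of residue indices.

    accuracy = fraction of predicted residues that really contact DNA
    coverage = fraction of real DNA-contacting residues that were predicted
    (Both definitions are assumptions made for this illustration.)
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    accuracy = true_positives / len(predicted) if predicted else 0.0
    coverage = true_positives / len(actual) if actual else 0.0
    return accuracy, coverage
```

    Reporting both numbers matters because a predictor can raise
    one at the expense of the other simply by predicting fewer or
    more residues.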


    FlyFactorSurvey

    FlyFactorSurvey is a database of DNA binding specificities

    for Drosophila TFs. It provides community access to over

    400 recognition motifs and position weight matrices for

    over 200 TFs, including many unpublished motifs. The

    primary source of recognition motifs within FlyFactorSurvey
    is TF binding site selections performed using the
    bacterial one-hybrid system. Search tools and flat file

    downloads are provided to retrieve binding site information

    (as sequences, matrices and sequence logos) for individual

    TFs, groups of TFs or for all TFs with characterized

    binding specificity. Linked analysis tools allow users to

    identify motifs within the database that share similarity to a

    query matrix or to view the distribution of occurrences of

    an individual motif throughout the Drosophila genome

    [118].

    YEAst search for transcriptional regulators

    and consensus tracking (YEASTRACT)

    The YEASTRACT information system allows the identification
    of potential transcription regulators. It is a database that
    contains over 12,346 regulatory associations between
    transcription factors and target genes in Saccharomyces
    cerevisiae [119]. It also characterizes sets of genes with a
    common expression profile obtained from microarray data
    and searches for occurrences of candidate TF binding sites.

    Multi-genome analysis of positions and patterns

    of elements of regulation (MAPPER)

    It is a search method for identifying TF
    binding sites that is based on hidden Markov models
    obtained from alignments of known sites. TF binding site
    models derived from the sites in TRANSFAC and other
    databases are used to scan the sequences of the human,
    mouse, fly, worm and yeast genomes to identify the sites.
    It is reported to have better specificity and sensitivity
    than other similar computational models. A sequence is
    uploaded as a query, and a model is built from a multiple
    sequence alignment of the binding sites of the
    transcription factor [120].

    Zinc finger binding site database (ZIFIBI)

    It is a tool that helps in identifying C2H2 zinc finger
    transcription factor binding sites in the cis-regulatory
    regions of

    the target genes. It makes use of the available data to

    predict the interactions between the nucleotides and the

    amino acids of the zinc finger domain of the protein. The

    most probable state path is calculated using a hidden

    Markov model [121].
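    The "most probable state path" of a hidden Markov model is
    conventionally found with the Viterbi algorithm. The sketch
    below is a generic, log-space implementation for illustration
    only. The two-state model used in the usage note (background
    versus binding site) and all probabilities are invented for the
    example and are not the ZIFIBI model.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most probable hidden-state path for a sequence of observations.
    Probabilities are plain dicts; scores are accumulated in log space."""
    log = math.log
    # best[t][s] = (log-probability of the best path ending in s at t,
    #               state occupied at t-1 on that path)
    best = [{s: (log(start_p[s]) + log(emit_p[s][observations[0]]), None)
             for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prev, score = max(
                ((p, best[-1][p][0] + log(trans_p[p][s])) for p in states),
                key=lambda item: item[1])
            column[s] = (score + log(emit_p[s][obs]), prev)
        best.append(column)
    # Trace back from the best final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]
```

    With a "site" state that strongly prefers G and a uniform
    "bg" state, a run of G residues inside a mixed sequence is
    labelled as a site while the flanks remain background.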

    Bioprospector

    It helps in identifying regulatory sequence motifs in the cis
    regions of target sequences by examining genes in the same
    expression pattern group. It is implemented as a C program
    and uses a Gibbs sampling strategy. The significance of each
    motif is estimated using the Monte Carlo method. It has
    been successful in identifying binding motifs for
    Saccharomyces cerevisiae repressor/activator protein 1
    (RAP1), Bacillus subtilis RNA polymerase, and Escherichia
    coli cyclic AMP receptor protein (CRP) [122].

    BindN

    It is a web-based tool that predicts DNA- and
    RNA-binding sites with the help of support vector
    machines (SVMs). The SVM models are built using
    three sequence features: side chain pKa value,
    hydrophobicity index and molecular mass of an amino
    acid. It thus helps to identify the functions of binding
    proteins from primary sequence data [123].

    BindN+

    BindN+ uses protein sequence features different from those
    of BindN to identify the binding sites in sequences, and it
    also relies on SVMs. The protein sequence features
    used in this case are the biochemical properties of the
    amino acids and evolutionary information in the form of a
    position-specific scoring matrix. The new descriptors used
    in BindN+ have shown better performance, sensitivity and
    specificity than the previous version [124].

    DP-bind

    It helps in predicting the binding sites of a protein by
    analyzing its amino acid sequence. It uses three machine
    learning methods for the prediction: support vector
    machines, kernel logistic regression and penalized logistic
    regression. Prediction can be based on the input sequence
    alone or on the profile of evolutionary conservation of the
    input sequence. The outputs of all three models are combined
    to provide a consensus result with high confidence [125].
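    How the three outputs are merged is not detailed here. Two
    simple possibilities, shown purely as an illustration and not as
    the server's documented behaviour, are a per-residue majority
    vote and a strict consensus that only labels residues on which
    all three models agree. The function names are hypothetical.

```python
def majority_vote(svm, klr, plr):
    """Per-residue majority over three binary predictions (1 = binding)."""
    return [1 if a + b + c >= 2 else 0 for a, b, c in zip(svm, klr, plr)]

def strict_consensus(svm, klr, plr):
    """Per-residue label where all three agree; None where they differ."""
    return [a if a == b == c else None for a, b, c in zip(svm, klr, plr)]
```

    The strict variant trades coverage for confidence, since
    residues with any disagreement are left unlabelled.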

    PreDs

    It is a web-based server that allows DNA-binding site

    prediction on protein molecular surfaces. The molecular

    surfaces of the proteins are generated with the help of

    atomic coordinates that are available in a .pdb format. The

    prediction is based on the evaluation of the electrostatic

    potential, local and global curvature of the protein surface

    [126].


    ProNIT

    It is a thermodynamic database that uses quantitative

    binding data rather than just structural data. It contains

    several parameters for analyzing the protein-nucleic acid

    recognition like thermodynamic parameters, experimental

    conditions and structural information of both the protein

    and the DNA. It provides various sorting output options.
    The thermodynamic parameters used are dissociation

    constant, association constant, Gibbs free energy change,

    enthalpy change and heat capacity change. A relational

    database system combines all of this information to provide

    flexible searching facilities [127].
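    Several of these parameters are linked by standard
    thermodynamic relations: the association constant is the
    reciprocal of the dissociation constant, and the standard free
    energy of binding is ΔG = RT ln Kd (equivalently −RT ln Ka). The
    sketch below applies these textbook relations; it assumes a 1 M
    standard state and a default temperature of 298.15 K, and the
    function names are illustrative rather than part of ProNIT.

```python
import math

R = 8.314462618  # gas constant in J/(mol*K)

def delta_g_from_kd(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kJ/mol from a Kd given in mol/L."""
    return R * temperature_k * math.log(kd_molar) / 1000.0

def ka_from_kd(kd_molar):
    """Association constant (L/mol) as the reciprocal of Kd."""
    return 1.0 / kd_molar
```

    A nanomolar dissociation constant, for instance, corresponds
    to a binding free energy of roughly −51 kJ/mol at room
    temperature.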

    Database for polyanion binding proteins (DB-PABP)

    Polyanion binding proteins are diverse proteins that
    interact with polyanions, which are entities carrying
    multiple negative charges. The polyanions identified for such
    interactions include actin, tubulin, DNA, heparin and heparin
    sulphate. DB-PABP is a comprehensive, manually curated and
    searchable database, implemented as a MySQL relational
    database. Searches are based on four criteria: protein names,
    polyanion names, source species and the methods used to
    discover the interactions [128].

    DNAProt

    It helps in identifying the DNA-binding proteins from the

    protein sequence. It has considerably good accuracy in

    distinguishing between the DNA-binding proteins and the

    non-DNA-binding proteins by characteristically recogniz-

    ing specific DNA chains. The random forest method is used

    to identify the DNA-binding proteins [129].

    Biophysical techniques as a potential tool for

    DNA-protein interaction studies

    Fluorescence-based techniques

    Fluorescence is a form of luminescence in which a substance
    emits electromagnetic radiation [130]. Absorption of a photon,
    or the simultaneous absorption of two photons (two-photon
    absorption), excites a molecule from its ground state to a
    higher-energy (high frequency, low stability) state, leading
    to the emission of radiation [131]. This principle and its
    modifications are used to devise different fluorescence
    detection techniques, e.g. fluorescence spectroscopy,
    fluorescence intensity, fluorescence depolarization,
    fluorescence resonance energy transfer and fluorescence
    correlation spectroscopy. In fluorescence-intensity
    distribution analysis, the fluorescence intensity of a sample
    with a heterogeneous brightness profile is monitored through
    its spatial brightness distribution and by calculating
    theoretical photon count number distributions [132].

    Capillary electrophoresis with laser-induced fluorescence

    Capillary electrophoresis coupled with laser-induced fluorescence
    polarization is a hybrid approach to ultrasensitive immunoassays
    [133]. Fluorescence polarization provides additional information for
    identification of affinity complexes. Protein-DNA interactions can be
    studied on the basis of capillary electrophoretic (CE) separation of
    bound from free fluorescent probe followed by detection with
    laser-induced fluorescence polarization (LIFP) [134]. Changes in
    electrophoretic mobility and fluorescence anisotropy upon complex
    formation can be monitored to determine binding affinity and
    stoichiometry [135]. There are two types of assays: homogeneous and
    heterogeneous. In the homogeneous assay, the free and bound tracers
    are measured together without separation, and the fluorescence
    polarization of the mixture is a quantitative measure of the
    antibody-bound tracer. The heterogeneous assay involves a baseline
    separation of the free and bound tracer using CE with a phosphate
    running buffer. Results from both assays suggest that the CE-LIFP
    approaches have a wider application than immunoassays based on
    either CE-LIF or fluorescence polarization alone [136].
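
    The quantities read out in a polarization assay can be sketched as
    follows. This is a minimal illustration, not the procedure of the
    cited studies: the function names, the G-factor correction and the
    brightness ratio q_ratio are assumptions introduced here, and the
    intensities are made-up numbers.

```python
def anisotropy(i_parallel, i_perpendicular, g_factor=1.0):
    """Anisotropy r = (I_par - G*I_perp) / (I_par + 2*G*I_perp)."""
    i_perp = g_factor * i_perpendicular
    return (i_parallel - i_perp) / (i_parallel + 2.0 * i_perp)


def polarization(i_parallel, i_perpendicular, g_factor=1.0):
    """Polarization P = (I_par - G*I_perp) / (I_par + G*I_perp)."""
    i_perp = g_factor * i_perpendicular
    return (i_parallel - i_perp) / (i_parallel + i_perp)


def fraction_bound(r_observed, r_free, r_bound, q_ratio=1.0):
    """Fraction of tracer bound, from the anisotropy of an unseparated mixture.

    q_ratio is the fluorescence intensity of bound tracer relative to free
    tracer; 1.0 assumes binding does not change the tracer's brightness.
    """
    numerator = r_observed - r_free
    return numerator / (numerator + q_ratio * (r_bound - r_observed))


# Illustrative intensities (arbitrary units), not measured data.
r = anisotropy(3.0, 1.0)
p = polarization(3.0, 1.0)
print(f"r = {r:.3f}, P = {p:.3f}")
print(f"fraction bound = {fraction_bound(0.15, 0.05, 0.25):.2f}")
```

    Repeating the fraction-bound calculation over a titration series
    gives the binding curve from which an affinity can be estimated.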

    Narrow-bore capillaries provide high-speed, high-resolution
    separations and ultrasensitive detection in a minimal sample
    volume. Improved detection limits, enhanced identification capacity
    and the potential for miniaturization add to its advantages.
    However, the free and bound tracer may have similar electrophoretic
    mobilities and then cannot be separated, which makes the technique
    inefficient for their identification and quantitation [137].

    Time-resolved fluorescence depolarization

    Time-resolved fluorescence depolarization (anisotropy) is a
    technique in which a short pulse of vertically polarized light is
    directed at the sample, and the absorbed light promotes the molecule
    to an excited singlet state [138]. After vibrational relaxation,
    fluorescence is emitted at lower energy; if the molecule rotates
    during the interval between absorption and emission, the
    polarization decreases with time at a rate that reflects the
    rotational diffusion of the molecule [139].

    Time-resolved fluorescence spectroscopy can be used to analyze the
    interaction between proteins and DNA. Fluorescence polarization
    anisotropy decay can be used as a spectroscopic handle to scrutinize
    the interaction between several site-specific DNA-binding proteins
    and their target DNA fragments. Solution conditions such as
    temperature, pH, ionic strength and the presence of effector
    molecules can be varied while the interaction is studied [140].

    A variety of DNA sequences can be tested, both for preliminary
    experiments and for evaluating base sequence-dependent effects. The
    assay is reversible, which allows manipulation of solution
    conditions so that the effects of environment or effector molecules
    on complex formation can be assessed directly. In addition, the
    rotational correlation time directly reports on molecular size and
    shape.
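
    The link between rotational correlation time and molecular size can
    be illustrated with a short sketch. It assumes a rigid sphere
    (Stokes-Einstein-Debye relation) and a single-exponential anisotropy
    decay; the radii, viscosity and limiting anisotropy below are
    hypothetical values chosen only for illustration.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def rotational_correlation_time(radius_m, viscosity_pa_s=1.0e-3,
                                temperature_k=293.15):
    """Stokes-Einstein-Debye estimate theta = eta * V / (k_B * T) for a sphere."""
    volume = (4.0 / 3.0) * math.pi * radius_m ** 3
    return viscosity_pa_s * volume / (BOLTZMANN * temperature_k)


def anisotropy_decay(t, r0, theta):
    """Single-exponential anisotropy decay r(t) = r0 * exp(-t / theta)."""
    return r0 * math.exp(-t / theta)


def fit_correlation_time(times, anisotropies):
    """Recover theta from the least-squares slope of ln r(t) against t."""
    n = len(times)
    logs = [math.log(r) for r in anisotropies]
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return -sxx / sxy  # slope is -1/theta


theta_free = rotational_correlation_time(2.0e-9)   # hypothetical 2 nm radius
theta_bound = rotational_correlation_time(3.0e-9)  # hypothetical 3 nm radius
times = [i * 1.0e-9 for i in range(1, 11)]
decay = [anisotropy_decay(t, 0.35, theta_bound) for t in times]
print(f"free: {theta_free * 1e9:.1f} ns, bound: {theta_bound * 1e9:.1f} ns")
print(f"fitted: {fit_correlation_time(times, decay) * 1e9:.1f} ns")
```

    Because the correlation time scales with hydrodynamic volume, a
    larger complex depolarizes more slowly than the free DNA or protein.
    Real complexes are rarely spherical, so the sphere model is only a
    first approximation.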

    Double labelled native gel electrophoresis and fluorescence-based imaging

    The radiolabelled DNA gel mobility shift assay is modified to use a
    DNA probe end-labelled with a Texas Red fluorophore and a
    DNA-binding protein tagged with green fluorescent protein, so that
    the DNA-protein complex can be monitored precisely by native gel
    electrophoresis [141]. Applied to DNA-binding proteins, the method
    is sensitive, permits direct visualization of both the DNA probe
    and the DNA-binding protein, and enables quantitative analysis of
    the DNA and protein in the complex, and thereby an estimation of
    the stoichiometry of protein-DNA binding [142].

    Protein array method combining a near-infrared fluorescence detection

    The protein array methodology is used to study DNA-protein and
    protein-protein interactions using probes labelled with
    near-infrared fluorescent dyes (IRDye800) that are excited near 700
    or 800 nm, so that signals from proteins immobilized on a
    nitrocellulose membrane are detected with high sensitivity [143].
    To study protein-DNA binding, the membranes are incubated in a
    DNA-binding buffer containing poly-dGdC and poly-dAdT or sonicated
    salmon DNA at room temperature for 30 min. Then, an infrared
    dye-labelled DNA probe is added to the pre-binding solution and
    incubated with slow rotation at room temperature or at 60 °C. The
    membranes are washed with PBS containing 0.1% Tween and then
    scanned for fluorescent signals with an infrared imaging system. To
    study protein-protein binding, the membranes are incubated in PBS
    with BSA at room temperature and then with the corresponding
    Cy5.5-labelled protein in PBS containing 1% BSA and 0.1% Tween 20
    at room temperature for 1 h [144].

    The longer-wavelength fluorophores used in the protein array method
    provide a high signal-to-noise ratio by reducing the background
    from membrane surfaces, thereby increasing the sensitivity of
    detection.

    Fluorescence resonance energy transfer (FRET) techniques

    FRET is a non-radiative process whereby an excited donor
    fluorophore transfers energy to a ground-state acceptor as a result
    of a coupling of their transition dipoles. With dye-labelled
    nucleic acids and proteins and high optical sensitivity, FRET
    provides structural and kinetic information on protein-DNA
    interactions. The principle of FRET relies on site-specific
    labelling with a donor and an acceptor dye, with one FRET dye on
    each interacting partner (intermolecular FRET) or both on the same
    biomolecule (intramolecular FRET) (Fig. 3a, b). Direct optical
    excitation of the donor dye results in fast energy transfer to the
    FRET acceptor, which emits fluorescence at a longer wavelength
    [145, 146].
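
    The structural information in FRET comes from the steep distance
    dependence of the transfer efficiency. A minimal sketch of the
    standard Förster relation is given below; the Förster radius of
    5 nm is a hypothetical value, since the real value depends on the
    dye pair.

```python
def fret_efficiency(distance, forster_radius):
    """Transfer efficiency E = 1 / (1 + (r / R0)^6)."""
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


def distance_from_efficiency(efficiency, forster_radius):
    """Invert the relation: r = R0 * (1/E - 1)^(1/6)."""
    return forster_radius * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


R0_NM = 5.0  # hypothetical Förster radius in nm
for r_nm in (3.0, 5.0, 7.0):
    print(f"r = {r_nm} nm -> E = {fret_efficiency(r_nm, R0_NM):.2f}")
```

    The efficiency falls from near 1 to near 0 over a few nanometres
    around the Förster radius, which is why FRET is sensitive to
    protein-induced bending or conformational change of a labelled DNA.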

    Intramolecular FRET assays, where both dyes are located on the same
    biomolecule, are extensively used to monitor protein-induced
    conformational changes in the DNA substrate and to determine the
    global structure and assembly dynamics of a variety of
    nucleoprotein complexes.

    A key strength of the FRET technique is its continuous character:
    a reaction such as DNA cleavage can be monitored in real time from
    its initial steps, with no need for extensive sample handling
    [146].

    FRET-FLIM in situ imaging for protein-DNA interactions in the cell nucleus

    This approach allows imaging of the in situ interaction between a
    GFP-fusion protein and DNA in the cell nucleus, using FRET [147].
    Fluorescence lifetime imaging microscopy (FLIM) is used as a
    reliable tool to detect protein in contact with DNA. To develop a
    FRET-based method to visualize DNA-protein interactions in situ, a
    DNA-binding fluorescent dye that is a suitable FRET acceptor when
    GFP is the donor must be used. The members of the Sytox fluorescent
    dye family have a high affinity for nucleic acids and are available
    with a broad range of excitation and emission spectra. Upon binding
    to DNA or RNA, they show a several hundred-fold enhancement of
    fluorescence intensity [148].

    Fluorescence lifetime measurements can be performed by wide-field
    frequency-domain FLIM with an argon-ion laser as the excitation
    source. Images at different phases are recorded at the image
    intensifier. The lifetimes of the emitted fluorescence, based on
    phase and on modulation depth, can then be calculated from the
    resulting set of images [149].
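
    The lifetime calculation in frequency-domain FLIM can be sketched
    with the standard single-exponential relations. The modulation
    frequency and lifetimes below are hypothetical, and the function
    names are introduced here for illustration only.

```python
import math


def phase_lifetime(phase_rad, modulation_freq_hz):
    """Phase lifetime tau_phi = tan(phi) / omega."""
    omega = 2.0 * math.pi * modulation_freq_hz
    return math.tan(phase_rad) / omega


def modulation_lifetime(demodulation, modulation_freq_hz):
    """Modulation lifetime tau_m = sqrt(1/m^2 - 1) / omega."""
    omega = 2.0 * math.pi * modulation_freq_hz
    return math.sqrt(1.0 / demodulation ** 2 - 1.0) / omega


def fret_efficiency_from_lifetimes(tau_donor_acceptor, tau_donor_only):
    """E = 1 - tau_DA / tau_D (donor lifetime with vs. without acceptor)."""
    return 1.0 - tau_donor_acceptor / tau_donor_only


FREQ_HZ = 80.0e6   # hypothetical modulation frequency
TAU_TRUE = 2.5e-9  # hypothetical single-exponential donor lifetime

# Phase shift and demodulation that such a lifetime would produce.
omega = 2.0 * math.pi * FREQ_HZ
phase = math.atan(omega * TAU_TRUE)
demod = 1.0 / math.sqrt(1.0 + (omega * TAU_TRUE) ** 2)

print(f"tau_phi = {phase_lifetime(phase, FREQ_HZ) * 1e9:.2f} ns")
print(f"tau_m   = {modulation_lifetime(demod, FREQ_HZ) * 1e9:.2f} ns")
print(f"E = {fret_efficiency_from_lifetimes(2.0e-9, TAU_TRUE):.2f}")
```

    For a single-exponential decay the two lifetimes agree. A
    shortening of the donor lifetime in the presence of the DNA-bound
    acceptor dye is then read as FRET.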

    FRET-FLIM in situ imaging for protein-DNA interactions in the cell
    nucleus is a reliable and quantitative method to measure FRET. It
    is a donor-selective FRET method, which is not influenced by
    acceptor dye molecules that are not involved in FRET.

    Nuclear magnetic resonance

    NMR is used to investigate the interactions of DNA with proteins.
    NMR provides dynamic and structural information on the changes in
    conformation and molecular flexibility and enables formulation of
    mechanistic models of DNA-protein interactions [150]. Some sample
    preparation steps need to be followed. The sample needs to be
    labelled, and various strategies may be employed: either the
    protein is 15N or 13C labelled while the DNA is unlabelled, or vice
    versa. Care must be taken to avoid sample precipitation, because
    strong electrostatic interactions are involved within the complex
    [150].

    The sample can be analyzed by chemical shift mapping, in which
    heteronuclear single quantum coherence (HSQC) spectra of the
    labelled molecule are recorded separately for the bound and free
    states. Chemical shifts are sensitive to changes in the chemical
    environment of the protein. The interaction of DNA with a protein
    alters this chemical environment, thereby causing shifts in the
    spectra compared with the unbound molecule [151]. Cross-saturation
    experiments can also be used to analyze DNA-protein and
    protein-protein interactions and the binding surfaces of ligands on
    the protein [152]. Another technique employed in NMR is the solvent
    accessibility test, which gives a quantitative analysis of the
    amide proton exchange rates of the free and the bound protein.
    Intermolecular restraints in NMR spectroscopy are obtained from the
    nuclear Overhauser effect, residual dipolar couplings and
    paramagnetic relaxation enhancement. Limitations of these
    restraints may hamper the precision and accuracy of the technique,
    and various modifications have been made to overcome them [150].
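
    Chemical shift mapping is often summarized as one combined
    perturbation value per residue. The sketch below uses a weighted
    1H/15N distance with a scaling factor alpha of 0.14 and a cutoff of
    the mean plus one standard deviation; both choices are common
    conventions assumed here, not values taken from the cited work, and
    the peak shifts are hypothetical.

```python
import math
import statistics


def combined_shift(delta_h_ppm, delta_n_ppm, alpha=0.14):
    """Combined 1H/15N perturbation: sqrt(dH^2 + (alpha * dN)^2)."""
    return math.sqrt(delta_h_ppm ** 2 + (alpha * delta_n_ppm) ** 2)


def perturbed_residues(shifts_by_residue):
    """Residues whose combined shift exceeds the mean plus one standard deviation."""
    values = list(shifts_by_residue.values())
    cutoff = statistics.mean(values) + statistics.pstdev(values)
    return sorted(res for res, value in shifts_by_residue.items()
                  if value > cutoff)


# Hypothetical (dH, dN) differences in ppm between bound and free HSQC peaks,
# keyed by residue number.
peak_shifts = {10: (0.01, 0.05), 11: (0.02, 0.10),
               12: (0.25, 1.20), 13: (0.01, 0.08)}
combined = {res: combined_shift(dh, dn) for res, (dh, dn) in peak_shifts.items()}
print(perturbed_residues(combined))
```

    Residues flagged in this way are candidates for the DNA-binding
    surface, although shifts can also arise from conformational changes
    away from the interface.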

    Circular dichroism

    Circular dichroism (CD) is a quantitative technique that helps to
    identify DNA-protein and protein-protein interactions. It provides
    additional information about prosthetic groups, bound ligands and
    attached co-factors. It also helps to identify conformational
    changes in protein molecules. A particular interaction gives
    characteristic signatures based on the asymmetry induced by the
    secondary structure of proteins, which help to identify the
    structure of the bound protein and the possible interactions
    involved [153, 154].

    Fig. 3 Biophysical techniques to study DNA-protein interactions

    There are many variations of CD, such as stopped-flow CD and CD
    using synchrotron radiation [155]. In the case of nucleic acids, CD
    measures the asymmetry among the bases. The bases themselves are
    planar, but some CD is induced by the sugar present in the backbone
    of the DNA. CD is a powerful technique for analyzing structural
    change with respect to factors such as temperature, ionic strength
    and pH. It helps in judging the extent of interaction between
    helices by analyzing the melting of peptides [156].
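
    A thermal melt followed by CD is commonly reduced to an unfolded
    fraction and a melting temperature. The sketch below assumes a
    two-state transition, flat folded and unfolded baselines, and a
    signal recorded at a single wavelength; the temperatures and
    ellipticity values are hypothetical.

```python
def fraction_unfolded(signal, folded_signal, unfolded_signal):
    """Two-state unfolded fraction from the CD signal at one wavelength."""
    return (signal - folded_signal) / (unfolded_signal - folded_signal)


def melting_temperature(temperatures, signals, folded_signal, unfolded_signal):
    """Temperature at which the unfolded fraction crosses 0.5.

    Uses linear interpolation between neighbouring data points and
    returns None if the midpoint is not bracketed by the data.
    """
    fractions = [fraction_unfolded(s, folded_signal, unfolded_signal)
                 for s in signals]
    points = list(zip(temperatures, fractions))
    for (t_low, f_low), (t_high, f_high) in zip(points, points[1:]):
        if f_low <= 0.5 <= f_high:
            if f_high == f_low:
                return t_low
            return t_low + (0.5 - f_low) * (t_high - t_low) / (f_high - f_low)
    return None


# Hypothetical melt: temperature in degrees C, ellipticity in arbitrary units.
TEMPS_C = [20.0, 40.0, 60.0, 80.0]
SIGNALS = [-30.0, -28.0, -12.0, -10.0]
tm = melting_temperature(TEMPS_C, SIGNALS, folded_signal=-30.0,
                         unfolded_signal=-10.0)
print(f"Tm is approximately {tm:.1f} degrees C")
```

    Comparing the melting temperature with and without the binding
    partner gives a simple measure of how much the interaction
    stabilizes the structure.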

    Circular dichroism is considered to have advantages over
    techniques such as nuclear magnetic resonance (NMR): it is faster
    and more economical, requires only a small amount of sample, and
    most of the sample can be recovered for further analysis.

    The limitations of CD are that it gives relatively low-resolution
    structural detail and little information about the quaternary
    structure of the protein [157].

    Atomic force microscopy (AFM)

    AFM is another powerful tool for imaging DNA-protein complexes at
    the single-molecule level [158]. It allows the mechanisms involved
    in DNA-protein complex formation to be characterized under
    different conditions with high resolution. It quantitatively
    identifies protein position along DNA molecules, DNA flexibility,
    curvature and conformational change after protein binding.

    AFM is operated in tapping mode, which eliminates permanent
    shearing forces and causes less damage to the sample surface, even
    with stiffer probes [159]. Components of the sample that differ in
    adhesive and mechanical properties show a phase contrast and
    therefore allow a compositional analysis.

    The potential of this technique for high-sensitivity,
    high-throughput operation in fluid, and for force detection, is a
    major consideration for its continued integration into mainstream
    cellular and molecular analyses [160]. It uses very small
    quantities (10^-9 to 10^-15) of DNA and proteins.

    The technique has limitations when it is applied to structural and
    functional studies of biomolecules, owing to the
    resolution-limiting motion of DNA molecules. To overcome this, the
    DNA must be tethered to the substrate surface. Because of its
    flatness, mica is the most commonly used substrate for DNA imaging
    [161, 162]. Large DNA molecules also remain difficult to image by
    AFM because of their tendency to aggregate. A modified method
    described by Lysetska et al. [163] aligns long DNA fibres in a
    single direction on unmodified mica to facilitate AFM studies.

    Surface plasmon resonance (SPR)

    SPR is a label-free optical technology and an emerging alternative
    to the conventional in vitro techniques to study DNA-protein
    interactions. It uses an evanescent wave phenomenon to measure
    changes in refractive index occurring close to the sensor chip
    surface, which cause a shift in the plasmon resonance angle that is
    detected by an imaging system.

    The general principle behind SPR is total internal reflection,
    which occurs when polarized light travelling through a medium of
    higher refractive index strikes the interface with a medium of
    lower refractive index. When this occurs, the electromagnetic field
    component penetrates a short distance into the medium of lower
    refractive index as an exponentially attenuated evanescent wave. If
    the interface is coated with a thin layer of gold, a beam projected
    at a given angle will cause resonance coupling between the light
    photons and the surface plasmons of gold as their frequencies
    match. Binding of protein to DNA changes the refractive index
    within the environment of the evanescent wave. Hence, biomolecular
    interactions can be measured in real time by measuring the
    refractive index changes corresponding to mass changes [164]. Many
    advances have been made in this technique for studying DNA-protein
    interactions. A multistep chemical modification procedure has been
    proposed to create DNA arrays on gold surfaces specifically
    tailored for the study of protein-DNA interactions [165].

    To study DNA-protein interactions, DNA is immobilized on the chip
    surface, followed by a constant flow of buffer over the surface
    (Fig. 3c). The protein analyte is allowed to bind to the
    immobilized DNA, and the resulting change in the position of the
    reflected light minimum is recorded in resonance units (RUs) to
    generate a sensorgram. A sensorgram is divided into four phases:
    association phase, steady state or equilibrium phase, dissociation
    phase and regeneration phase (Fig. 3d) [166].
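
    The association, steady-state and dissociation phases of a
    sensorgram can be sketched with a simple 1:1 binding model. This is
    an idealized illustration that ignores mass transport and the
    regeneration step; the rate constants, concentration and maximum
    response are hypothetical.

```python
import math


def equilibrium_response(conc_m, k_on, k_off, rmax):
    """Steady-state response R_eq = Rmax * C / (C + KD), with KD = k_off / k_on."""
    return rmax * conc_m / (conc_m + k_off / k_on)


def sensorgram(t_s, conc_m, k_on, k_off, rmax, injection_end_s):
    """Response (RU) at time t_s for a 1:1 binding model.

    Association (0 <= t <= injection_end_s):
        R = R_eq * (1 - exp(-(k_on * C + k_off) * t))
    Dissociation (t > injection_end_s, buffer only):
        R = R_end * exp(-k_off * (t - injection_end_s))
    """
    r_eq = equilibrium_response(conc_m, k_on, k_off, rmax)
    k_obs = k_on * conc_m + k_off
    if t_s <= injection_end_s:
        return r_eq * (1.0 - math.exp(-k_obs * t_s))
    r_end = r_eq * (1.0 - math.exp(-k_obs * injection_end_s))
    return r_end * math.exp(-k_off * (t_s - injection_end_s))


K_ON = 1.0e5     # 1/(M s), hypothetical association rate constant
K_OFF = 1.0e-3   # 1/s, hypothetical dissociation rate constant
CONC = 1.0e-8    # M, protein concentration equal to KD in this example
RMAX = 100.0     # RU, hypothetical maximum response
INJECTION_END = 5000.0  # s

for t in (0.0, 500.0, 5000.0, 6000.0):
    ru = sensorgram(t, CONC, K_ON, K_OFF, RMAX, INJECTION_END)
    print(f"t = {t:6.0f} s  R = {ru:6.2f} RU")
```

    Fitting curves of this shape at several protein concentrations is
    the usual route from a set of sensorgrams to association and
    dissociation rate constants.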

    SPR offers a variety of advantages over other techniques. First,
    the interaction can be monitored very accurately in real time.
    Since the change in refractive index corresponds to a change in
    mass, this method can also yield data on the stoichiometry of
    complexes in addition to binding kinetics [164]. Second, multiple
    interaction partners can be analysed simultaneously. Third, it is a
    label-free technology and the optical radiation does not harm the
    biomolecules.
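
    Because the response scales with bound mass, stoichiometry can be
    estimated by comparing the observed maximum response with the value
    expected for one protein per immobilized DNA. The sketch below
    assumes that DNA and protein give the same response per unit mass,
    which is only approximately true; all numbers are hypothetical.

```python
def theoretical_rmax(ligand_ru, ligand_mw, analyte_mw, sites=1):
    """Expected maximum response (RU) if each ligand binds `sites` analytes."""
    return ligand_ru * (analyte_mw / ligand_mw) * sites


def apparent_stoichiometry(observed_rmax, ligand_ru, ligand_mw, analyte_mw):
    """Analyte molecules bound per immobilized ligand at saturation."""
    return observed_rmax / theoretical_rmax(ligand_ru, ligand_mw, analyte_mw)


# Hypothetical experiment: 100 RU of a 15 kDa DNA duplex immobilized,
# saturating response of 400 RU for a 30 kDa protein.
n = apparent_stoichiometry(400.0, 100.0, 15000.0, 30000.0)
print(f"apparent stoichiometry = {n:.1f} protein per DNA")
```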

    Microcalorimetry

    Being non-invasive, microcalorimetry is a promising technique for
    studying biomolecules and their interactions. It is calorimetry
    performed on small samples and relies on the same basic principle:
    measurement of the heat energy changes that occur during any
    physical or chemical process.

    For studying protein-DNA interactions, the two most commonly used
    microcalorimetric techniques are differential scanning calorimetry
    (DSC) and isothermal titration calorimetry (ITC). DSC measures the
    heat capacity profile of proteins as a function of temperature
    during processes such as protein unfolding, and reports on thermal
    stability during complex formation, by measuring the differential
    heat energy changes between sample and reference cells [167]. A
    pair of matched calorimetric cells (sample and reference cell)
    enclosed in an adiabatic chamber and fitted with a sensitive
    thermocouple is used. Electronic or computer-controlled feedback
    circuits are used to measure the differential temperature lag
    between the cells. ITC is used to study binding proteins more
    directly by measuring not only the magnitude of the binding
    affinity but also the magnitude of the two thermodynamic terms that
    define the binding affinity: the enthalpy and entropy changes
    [168]. In a typical experiment, a solution of one biomolecule is
    titrated into a solution of its binding partner and the heat
    released upon their interaction is monitored over time. The
    temperature dependence of the enthalpy of binding can be used to
    calculate the binding heat capacity [167].
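
    The thermodynamic quantities obtained from ITC are linked by
    standard relations, sketched below. The association constant,
    enthalpies and temperatures are hypothetical values chosen for
    illustration, and the function names are introduced here.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol K)


def binding_free_energy(ka_per_molar, temperature_k=298.15):
    """Free energy of binding dG = -R * T * ln(Ka), in J/mol."""
    return -GAS_CONSTANT * temperature_k * math.log(ka_per_molar)


def entropy_term(delta_g, delta_h):
    """Entropic contribution -T*dS = dG - dH, in J/mol."""
    return delta_g - delta_h


def heat_capacity_change(temperatures_k, enthalpies):
    """Binding heat capacity dCp: least-squares slope of dH against T."""
    n = len(temperatures_k)
    mean_t = sum(temperatures_k) / n
    mean_h = sum(enthalpies) / n
    sxy = sum((t - mean_t) * (h - mean_h)
              for t, h in zip(temperatures_k, enthalpies))
    sxx = sum((t - mean_t) ** 2 for t in temperatures_k)
    return sxy / sxx


delta_g = binding_free_energy(1.0e9)            # hypothetical Ka = 1e9 per M
minus_t_delta_s = entropy_term(delta_g, -60000.0)  # hypothetical dH
delta_cp = heat_capacity_change([288.15, 298.15, 308.15],
                                [-40000.0, -45000.0, -50000.0])
print(f"dG = {delta_g / 1000:.1f} kJ/mol")
print(f"-T*dS = {minus_t_delta_s / 1000:.1f} kJ/mol")
print(f"dCp = {delta_cp:.0f} J/(mol K)")
```

    Measuring the enthalpy at several temperatures and taking the slope
    is how the binding heat capacity mentioned above is obtained in
    practice.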

    Since microcalorimetry is not constrained by the size and shape of
    the molecule and does not require any chemical modification or
    solid support, it has become an invaluable resource in laboratories
    [169]. Its high sensitivity, its ability to determine true binding
    affinities by measuring heat changes, and its ability to measure
    nanomolar to picomolar binding (binding constants of 10^9 to 10^12
    M^-1) using the competitive binding technique make it a promising
    technique in molecular biology.

    Although ITC is particularly suitable to follow the

    energetics of an association reaction between biomole-

    cules, the combination of ITC and DSC provides a more

    comprehensive description of the thermodynamics of an

    associating system [170].

    Conclusion

    DNA-protein interactions are an integral component of biological
    systems and their study is important for almost all biological
    processes. Several techniques are available to determine these
    interactions, and understanding them is imperative. At the in
    vitro level, molecular biology-based techniques such as
    footprinting assays, EMSA, southwestern blotting, Y1H, phage
    display and proximity ligation assay (PLA) screen DNA-protein
    interactions reliably. The highly dynamic in vivo tools of
    chromatin immunoprecipitation and its variants, DNA adenine methyl
    transferase identification (DamID) and ChIP-loop assay are robust
    techniques to characterize several DNA-protein interactions in
    cells. In silico approaches have also evolved drastically over the
    years to supplement the information available to researchers.
    Various recent biophysical techniques including fluorescence-based
    techniques, CD, NMR, AFM, SPR and microcalorimetry have great
    potential for the detection of protein-based interactions. Every
    technique is unique in its own way and serves a unique purpose. As
    is evident, the current state of methods leaves quite a lot to be
    desired. An ideal method would require minimal cell numbers, be
    able to detect rare interactions with high specificity and
    sensitivity, be easily modified to quantify interactions and
    provide complete information on either protein or DNA by itself.
    Thus, the techniques listed above will help researchers to assess
    the dynamics of DNA-protein interactions in cellular development
    and disease progression.

    Acknowledgments This study was supported by the research grant
    awarded to Dr. Vibha Rani by the Department of Science and
    Technology, Government of India (SR/FT/LS-006/2009: Sept 4, 2009).
    We acknowledge Jaypee Institute of Information Technology, Deemed
    to be University for providing the infrastructural support.

    References

    1. Bulyk ML, Gentalen E, Lockhart DJ, Church GM (1999) Quantifying
    DNA-protein interactions by double-stranded DNA arrays. Nat
    Biotechnol 17:573-577

    2. Bulyk ML (2006) DNA microarray technologies for measuring
    protein-DNA interactions. Curr Opin Biotechnol 17:422-430

    3. Fox KR, Waring MJ (1987) The use of micrococcal nuclease as a
    probe for drug-binding sites on DNA. Biochim Biophys Acta
    909:145-155

    4. Dyke MWV, Dervan PB (1982) Footprinting with MPE.Fe(II)
    complementary-strand analyses of distamycin-binding and
    actinomycin-binding sites on heterogeneous DNA. Cold Spring Harb
    Symp Quant Biol 47:347-353

    5. Dyke MWV, Dervan PB (1983) Methidiumpropyl-EDTA-Fe(II) and DNase
    I footprinting report different small molecule binding site sizes
    on DNA. Nucleic Acids Res 11:5555-5567

    6. Spassky A, Sigman DS (1985) Nuclease activity of
    1,10-phenanthroline-copper ion. Conformational analysis and
    footprinting of the lac operon. Biochemistry 24:8050-8056

    7. Nielsen PE, Hiort C, Sonnichsen SH, Buchardt O, Dahl O, Norden B
    (1992) DNA binding and photocleavage by uranyl(VI) (UO2^2+) salts.
    J Am Chem Soc 114:4967-4975

    8. Nielsen PE (1992) Uranyl photofootprinting of triple helical
    DNA. Nucleic Acids Res 20:2735-2739

    9. Churchill MEA, Hayes JJ, Tullius TD (1990) Detection of drug
    binding to DNA by hydroxyl radical footprinting. Relationship of
    distamycin binding sites to DNA structure and positioned
    nucleosomes on 5S RNA genes of Xenopus. Biochemistry 29:6043-6050

    10. Cons BMG, Fox KR (1989) High resolution hydroxy radical
    footprinting of the binding of mithramycin and related antibiotics
    to DNA. Nucleic Acids Res 17:5447-5460


    11. Jain SS, Tullius TD (2008) Footprinting protein-DNA complexes
    using the hydroxyl radical. Nat Protocols 3:1092-1100

    12. Shafer GE, Price MA, Tullius TD (1989) Use of the hydroxyl
    radical and gel electrophoresis to study DNA structure.
    Electrophoresis 10:397-404

    13. Price MA, Tullius TD (1992) Using hydroxyl radical to probe DNA
    structure. In: David MJ, Lilley JED (eds) DNA structures part b:
    chemical and electrophoretic analysis of DNA, 11th edn. Academic
    Press, San Diego, pp 194-219

    14. Routier S, Vezin H, Lamour E, Bernier JL, Catteau JP, Bailly C
    (1999) DNA cleavage by hydroxy-salicylidene-ethylendiamine-iron
    complexes. Nucleic Acids Res 27:4160-4166

    15. Nielsen PE (1990) Chemical and photochemical probing of DNA
    complexes. J Mol Recognit 3:1-25

    16. Bailly C, Waring MJ (1995) Comparison of different footprinting
    methodologies for detecting binding sites for a small ligand on
    DNA. J Biomol Struct Dyn 12:869-898

    17. Drew HR (1984) Structural specificities of five commonly used
    DNA nucleases. J Mol Biol 176:535-557

    18. Fox KR, Waring MJ (2001) High-resolution footprinting studies
    of drug-DNA complexes using chemical and enzymatic probes. In:
    Chaires JB (ed) Drug-nucleic acid interactions. Academic Press, San
    Diego, pp 412-430

    19. Galas DJ, Schmitz A (1978) DNAse footprinting: a simple method
    for the detection of protein-DNA binding specificity. Nucleic Acids
    Res 5:3157-3170

    20. Leblanc B, Moss T (2000) DNAse I footprinting. In: Rapley R
    (ed) The nucleic acid protocols handbook, 8th edn. Humana Press,
    Totowa, NJ, pp 729-735

    21. Fox KR (2010) DNAse I footprinting. In: Fox KR (ed) Drug-DNA
    interaction protocols: methods in molecular biology. Humana Press,
    Totowa, NJ, pp 153-172

    22. Bailly C, Kluza J, Martin C, Ellis T, Waring MJ (2005) DNase I

    footprinting of small molecule binding sites on DNA. In: Walker

    JM, Herdewijn