BioMed Central
BMC Genomics
Open Access
Research article
In silico characterization of the family of PARP-like
poly(ADP-ribosyl)transferases (pARTs)
Helge Otto1, Pedro A Reche2,3, Fernando Bazan2,4, Katharina Dittmar5,
Friedrich Haag1 and Friedrich Koch-Nolte*1
Address: 1Institute of Immunology, University Hospital Hamburg-Eppendorf, Martinistr. 52, 20246 Hamburg, Germany., 2DNAX Research
Institute, Palo Alto, CA 94304, USA., 3Dana-Farber Cancer Institute, Harvard University, Boston, MA 02115, USA., 4Depts. of Molecular Biology
and Protein Engineering, Genentech, SF, CA 94080, USA. and 5Department of Integrative Biology, Brigham Young University, Provo, UT 84602,
USA.
Email: Helge Otto - helge.otto@t-online.de; Pedro A Reche - reche@research.dfci.harvard.edu; Fernando Bazan - bazan.fernando@gene.com;
Katharina Dittmar - katharinad@gmail.com; Friedrich Haag - haag@uke.uni-hamburg.de; Friedrich Koch-Nolte* - nolte@uke.uni-hamburg.de
* Corresponding author
Received: 13 May 2005
Accepted: 04 October 2005
Published: 04 October 2005
BMC Genomics 2005, 6:139 doi:10.1186/1471-2164-6-139
This article is available from: http://www.biomedcentral.com/1471-2164/6/139
© 2005 Otto et al; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0),
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Background: ADP-ribosylation is an enzyme-catalyzed posttranslational protein modification in which
mono(ADP-ribosyl)transferases (mARTs) and poly(ADP-ribosyl)transferases (pARTs) transfer the ADP-
ribose moiety from NAD onto specific amino acid side chains and/or ADP-ribose units on target proteins.
Results: Using a combination of database search tools we identified the genes encoding recognizable
pART domains in the public genome databases. In humans, the pART family encompasses 17 members.
For 16 of these genes, an orthologue exists also in the mouse, rat, and pufferfish. Based on the degree of
amino acid sequence similarity in the catalytic domain, conserved intron positions, and fused protein
domains, pARTs can be divided into five major subgroups. All six members of groups 1 and 2 contain the
H-Y-E triad of amino acid residues found also in the active sites of Diphtheria toxin and Pseudomonas
exotoxin A, while the eleven members of groups 3-5 carry variations of this motif. The pART catalytic
domain is found associated in Lego-like fashion with a variety of domains, including nucleic acid-binding,
protein-protein interaction, and ubiquitylation domains. Some of these domain associations appear to be
very ancient since they are observed also in insects, fungi, amoebae, and plants. The recently completed
genome of the pufferfish T. nigroviridis contains recognizable orthologues for all pARTs except for pART7.
The nearly completed albeit still fragmentary chicken genome contains recognizable orthologues for
twelve pARTs. Simpler eucaryotes generally contain fewer pARTs: two in the fly D. melanogaster, three
each in the mosquito A. gambiae, the nematode C. elegans, and the ascomycete microfungus G. zeae, six in
the amoeba E. histolytica, nine in the slime mold D. discoideum, and ten in the cress plant A. thaliana.
GenBank contains two pART homologues from the large double stranded DNA viruses Chilo iridescent
virus and Bacteriophage Aeh1 and only a single entry (from V. cholerae) showing recognizable homology
to the pART-like catalytic domains of Diphtheria toxin and Pseudomonas exotoxin A.
Conclusion: The pART family, which encompasses 17 members in the human and 16 members in the
mouse, can be divided into five subgroups on the basis of sequence similarity, phylogeny, conserved intron
positions, and patterns of genetically fused protein domains.
Background
ADP-ribosylation is a posttranslational protein modification in which the ADP-ribose moiety is transferred from NAD onto specific amino acid side chains of target proteins [1-4]. ADP-ribosylation was originally discovered as the pathogenic principle of Diphtheria toxin, a multidomain secreted protein which inactivates elongation factor 2 by ADP-ribosylation after translocation into eucaryotic cells [5]. Subsequently, numerous other bacterial toxins were shown to ADP-ribosylate target proteins in host cells. Moreover, endogenous toxin-like ADP-ribosylating enzyme activities were detected in eucaryotic cells. Several of these enzymes were purified to homogeneity, sequenced, expressed as recombinant proteins, and crystallized.

Sequence and structural analyses revealed the existence of two distinct families of toxin-related ADP-ribosyltransferases in mammals [6,7]: The RT6 family of GPI-anchored and secretory mono(ADP-ribosyl)transferases (mARTs) catalyzes mono-ADP-ribosylation of cell surface and secretory proteins [8]. The PARP family of nuclear and cytoplasmic poly(ADP-ribosyl)transferases (pARTs) catalyzes poly-ADP-ribosylation of nuclear and cytosolic proteins [9-12]. While mARTs have been implicated in mediating signalling functions of extracellular NAD, pARTs have been shown to play important roles in DNA repair and maintenance of genome integrity [8,9,12].

In this paper we use the term pART (poly ADP-ribosyltransferase) rather than the more established term PARP (poly-ADP-ribosyl-polymerase) for various reasons. Firstly, to emphasize the structural and functional similarities of the poly- and mono-ADP-ribosyltransferase subfamilies. Secondly, with respect to the biochemical classification of enzymes the term transferase is more appropriate than polymerase: ADP-ribosyltransferases belong to the family of glycosyltransferases; the term polymerase is more commonly used for template-dependent DNA or RNA synthesizing enzymes. Thirdly, use of the term PARP would have confounded comparison of our results with those of the recent review by Ame et al. [11], who used the term PARP and a numbering system without regard to structural similarities among gene family members.

The 3D-structures of rat ART.2 (PDB accession number 1og3), chicken PARP-1 (1a26, 3pax), mouse PARP-2 (1gs0), and numerous ADP-ribosylating toxins uncovered a common NAD binding fold with a conserved core of five β strands arranged in two abutting β sheets [13-19]. These two β sheets form the upper and lower jaws of a Pacman-like active site crevice (Figure 1). Remarkably, only a single amino acid residue, the catalytic glutamic acid residue at the front edge of the fifth conserved β strand, is strictly conserved in all known 3D structures of enzymatically active mARTs and pARTs. In a seminal study, Collier and co-workers pinpointed the corresponding glutamic acid residue in PARP-1 (before its 3D structure was solved) on the basis of barely detectable sequence similarity to Diphtheria toxin [20,21]. More recently, the 3D structures of anthrax lethal factor, VIP2, and iota toxin have been discovered to harbour ART-domains that lack a corresponding glutamic acid residue and may represent inactivated enzymes [16,22,23].

Comparative structure and amino acid sequence analyses revealed that PARP-1 and PARP-2 share additional secondary structure and conserved amino acids with Diphtheria toxin and Pseudomonas exotoxin A, which evidently are not conserved in other mARTs (Fig. 1) [6,7]. These additional elements include a sixth β strand, an alpha helix between β strands 2 and 3, and a triad of amino acids, the so-called H-Y-E motif, encompassing a histidine residue in β strand 1, a tyrosine residue in β strand 3 and the catalytic glutamic acid residue at the front edge of β strand 5. These features, highlighted in the 3D structures of PARP-1 and Diphtheria toxin in Figure 1, clearly distinguish the structures of PARP-1, PARP-2, and DT/ETA from those of a second major ART subfamily that includes rat ART2 and the Bacillus cereus VIP2 toxin. Distinguishing features of the ART2/VIP2 subfamily include a seventh β strand that displaces β strand 6, three conserved alpha helices preceding β strand 1, and an R-S-E triad of amino acid residues in place of the H-Y-E motif of PARP-1 and DT. Interestingly, the recently reported 3D-structure of a prototype member of the family of tRNA:NAD 2' phosphotransferases (TpT) [24] revealed a striking resemblance to the structures of the PARP-1/DT subfamily rather than to those of the ART2/VIP2 subfamily, including the sixth β strand, the alpha helix between β strands 2 and 3, and a variant H-Y-E motif (H-H-V). These enzymes catalyze removal of a splice junction 2' phosphate from ligated tRNA. This reaction resembles the reaction catalyzed by ARTs but yields ADP-ribose 1"-2" cyclic phosphate rather than ADP-ribosylated proteins [25].

The remarkable degree of plasticity of ART amino acid sequences poses a challenging problem for genome database mining [7], and even the most sensitive database search programs fail to connect all known members of the ART gene family. Notwithstanding, the results of such in silico analyses can provide important insight into the structural and phylogenetic relationship of ART subfamilies. We have previously demonstrated that the known members of the mART gene family in the human and mouse could be faithfully connected with many known bacterial ADP-ribosylating toxins, but not with pARTs or Diphtheria toxin [26,27]. These analyses also pointed out the presence of mART-encoding genes in the
[Figure 1 graphic not reproduced. Panels: PARP, DT, TpT, ART2, VIP2. Labels mark the β strands (1-7), the loop, the N and C termini, and the H-Y-E and R-S-E motif residues.]
Schematic illustration of the distinguishing structural features of the PARP-1/DT vs. the ART2/VIP2 subfamilies of ADP-ribosyltransferases
Figure 1
Schematic illustration of the distinguishing structural features of the PARP-1/DT vs. the ART2/VIP2 subfamilies of ADP-ribosyltransferases. Two abutting sheets of anti-parallel β strands form the upper and lower jaws of a Pacman-like NAD-binding crevice in all known structures of ADP-ribosyltransferases. The distinguishing structural features of the PARP/DT and ART2/VIP2 subfamilies are depicted schematically on top and are highlighted in the structures of chicken PARP-1 (3pax), diphtheria toxin (DT) (1tox), an archaeal tRNA:NAD 2'-phosphotransferase (TpT) (1wfx), rat ART2 (1og3) and B. cereus VIP2 toxin (1qs2) below. The structures are depicted from the "front view" with a full view of the ligands bound in the active site crevice. The ligands NAD and 3MB are colored cyan and are depicted as stick models. The central four β strands (from top to bottom: β5, β2, β1, β3, colored orange) are conserved in all mARTs and pARTs. The β strands at the edges of the respective sheets (β4 and β6, colored pink) show greater structural variation than the central β strands. The H-Y-E motif residues are depicted in red and their side chains are shown as sticks. The glutamic acid residue at the front edge of β5 is the critical catalytic residue in both diphtheria toxin and PARP-1; a corresponding glutamic acid residue is observed also in the 3D structures of rat ART2 and numerous bacterial mARTs. Diphtheria toxin (1tox), Pseudomonas exotoxin A (1aer), PARP-1 (3pax), and PARP-2 (1gs0) share the following structural features which are not conserved in either rat ART2 (1og3) or most other bacterial mARTs: the orientation of β6, the alpha helix between β2 and β3 (colored yellow) and the conserved histidine and tyrosine amino acid residues in β1 and β3. The loop between β4 and β5 (colored magenta) is thought to play a role in the recognition of target proteins and ADP-ribose polymers. Distinguishing features of ART2, VIP2, iota toxin (1gir), and the C3 exoenzymes (1g24, 1ojz) include three conserved alpha helices upstream of β strand 1, a seventh β strand that displaces β strand 6 and an R-S-E motif instead of the H-Y-E motif of PARP-1 and DT. (Note that the depicted ART2 structure carries a site directed mutation of the catalytic glutamic acid residue, E189I.) The recently determined 3D structure of the tRNA:NAD 2'-phosphotransferase (1wfx) bears striking resemblance to that of DT and PARP-1 and carries an H-H-V variant of the H-Y-E motif. Note that the structure of the diphtheria toxin catalytic domain shown here in complex with NAD is truncated C-terminally at the proteolytic cleavage site that separates this domain from the translocation domain. The PARP-1 catalytic domain shown here is truncated N-terminally at the position of the phase 0 intron that separates this domain from a neighboring helical domain. The TpT catalytic domain is truncated N-terminally at the point of fusion to a winged-helix domain.
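The motif comparison described in this legend can be made explicit with a small sketch. The function name and the position labels below are illustrative assumptions, not part of the original analysis; the sketch only reports where a given residue triad departs from the H-Y-E reference of PARP-1 and DT.

```python
# Minimal sketch (hypothetical helper): compare a three-residue motif
# against the H-Y-E reference triad of PARP-1 and diphtheria toxin.
REFERENCE = ("H", "Y", "E")
# Strand assignment of the three motif positions as given in the text.
POSITIONS = ("beta 1", "beta 3", "beta 5")


def deviations_from_hye(triad):
    """Return (position, reference residue, observed residue) for each mismatch."""
    residues = tuple(triad.upper().replace("-", ""))
    if len(residues) != 3:
        raise ValueError("expected exactly three residues, e.g. 'H-Y-E'")
    return [
        (pos, ref, obs)
        for pos, ref, obs in zip(POSITIONS, REFERENCE, residues)
        if ref != obs
    ]


print(deviations_from_hye("H-Y-E"))  # PARP-1 / DT reference motif
print(deviations_from_hye("R-S-E"))  # ART2 / VIP2 motif
print(deviations_from_hye("H-H-V"))  # TpT variant
```

Under these assumptions the R-S-E motif differs from the reference at the first two positions while retaining the glutamic acid, and the H-H-V variant of TpT retains only the histidine.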
genomes of many but not all other model organisms. Of note, no mART-encoding genes could be detected in plants, fungi, or archaea. Here we provide an in-depth analysis of the pART gene family.

Results and discussion
Identification of human and mouse pART family members in the EST database
The human and mouse pART gene family members were identified using a combination of database search tools. The human and mouse EST databases as well as the non-redundant GenBank database (nr) were screened with tBLASTn using as queries the amino acid sequences of the catalytic domains of the known and newly identified pART family members. Whenever possible, the full coding sequence of the catalytic domain and of the adjacent regions was assembled using the sequences of published cDNAs and overlapping ESTs. Screening of the EST and nr databases was initiated in 1997 and was repeated at regular intervals. The coding sequences were extended when suitable new sequences became available. When the sequences of the human, mouse and rat genomes were published in 2000, 2001, and 2004, respectively, the EST database searches were complemented with corresponding tBLASTn and BLASTn searches of the genome sequences [28-30]. Thereby, 17 pART family members were identified in the human. These genes were designated pART1-pART17. Numbering reflects the degree of amino acid sequence similarity to PARP-1 (= pART1) and the degree of similarity within each of the pART subgroups. An orthologue for each of these genes was detected in the mouse and in the rat, with the sole exception of pART7.

A complete list of human pART family members, including the common names and aliases of known genes, is presented in Figure 2. Based on the degree of amino acid sequence similarities, conserved intron positions, and fused protein domains, the mammalian pART family can be divided into five major subgroups. Group 1 (pART1-pART4) contains PARP and its closest relatives, PARP-2, PARP-3 and VPARP. Group 2 (pART5, pART6) contains tankyrase 1 and tankyrase 2. Group 3 (pART7-pART10) contains four proteins including the recently described B-Aggressive Lymphoma Protein (BAL = pART9) [31] and a myc-interacting protein with PARP activity (PARP-10) [32]. Group 4 (pART11-pART14) contains four proteins including the recently described Zinc-finger Antiviral Protein (ZAP = pART13) [33] and TCDD-inducible PARP (TiPARP) [34]. Group 5 (pART15-pART17) contains three proteins of unknown function.

The steady growth in the number of matching ESTs obtained for each of the human pART gene family members over the past 6 years is illustrated in additional file 1 ("Representation of pART gene transcripts in the database of expressed sequence tags"). By October 2004, each human pART except pART7 was represented by more than 100 ESTs. Interestingly, each pART except pART7 is represented by more ESTs than poly(ADP-ribose) glycohydrolase (PARG), the single known enzyme capable of removing poly-ADP-ribose from pART target proteins. The large number of ESTs corresponds to a large variety of tissues found to contain pART ESTs and presumably reflects a ubiquitous pattern of gene expression, i.e. akin to that of the housekeeping enzymes hypoxanthine-guanine phosphoribosyltransferase (HPRT) and glyceraldehyde-3-phosphate dehydrogenase (GAPD). For comparison, the members of the mART gene family (ART1-ART5), which exhibit highly restricted patterns of expression, are each represented by far fewer ESTs than the pARTs. As of January 2005, the Mammalian Gene Collection (http://mgc.nci.nih.gov) contains annotated full-length cDNA sequences for 10 of the 17 human pARTs and for 12 of 16 mouse pARTs (Fig. 2).

Chromosomal localizations and exon/intron structures of the human and mouse pART gene family members
The results of tBLASTn and BLASTn searches of the human, mouse, and rat genome sequences yielded the chromosomal localization and the exon/intron structure of each pART gene family member. The chromosomal localizations of the pART genes are represented schematically in Figure 2. All human and mouse pART orthologues lie in regions of conserved synteny. There are three conserved pART gene clusters containing two related paralogues (pARTs 8 and 9; pARTs 12 and 13; pARTs 15 and 17). However, the two most closely related pairs of pARTs (pARTs 5 and 6; pARTs 16 and 17) each are located on different chromosomes. All other pARTs are distributed as single copy genes on different autosomes. In the human genome, the cluster containing pARTs 8 and 9 also contains pART7. Additional file 2 illustrates the local chromosomal environment of this pART gene cluster on human chromosome 3q and the syntenic region on mouse chromosome 16B3. The local order of genes is similar in the human and mouse. However, the region corresponding to pART7 is missing in the mouse. The corresponding region is also missing in the rat genome (not shown).

The total number of exons in each pART gene is depicted in Figure 2 and the exon structure of the catalytic domain is illustrated schematically for the human pARTs in Figure 3. All intron positions within the coding region are fully conserved in human and mouse orthologues. With the sole exception of pART4 (VPARP), the catalytic domain is encoded by the 3' terminal exons. Remarkably, in all pART genes, with the exception of pART4 (VPARP) and pART14 (TiPARP), the exons encoding the catalytic domain are separated from the rest of the respective
pART  gene     aliases     Hs protein  chrom. localization   # of exons amino acids   MGC accession #
      symbol               accession # Hs           Mm       Hs   Mm    Hs    Mm      Hs          Mm
1     PARP1    PARP        P09874      1q41-q42     1 H5     23   23    1014  1014    BC037545    BC012041
2     PARP2    PARP-2      CAB41505    14q11.2-q12  14 C1    16   16    583   559     na          BC062150
3     PARP3    PARP-3      AAM95460    3p21.1-22.2  9 F1     11   11    540   528     (BC014260)  BC014870
4     PARP4    vaultPARP   AAD47250    13q11        14 C1    34   >28   1724  >1446   na          na
5     TNKS     Tankyrase   AAC79841    8p23.1       8 A4     27   27    1327  1320    na          BC057370
6     TNKS2    Tankyrase2  NP_079511   10q23.3      19 C2    27   28    1166  1337    na          na
7     PARP15               NP_689828   3q21.1       ---      8    ---   444   ---     na          ---
8     PARP14               AAN08627    3q21.1       16 B3    12   12    1518  1535    na          (BC021340)
9     PARP9    BAL         NP_113646   3q13.3-q21   16 B3    11   11    854   830     (BC039580)  BC003281
10    PARP10   PARP-10     BAB55067    8q24.3       15 D3    11   11    1025  960     na          na
11    PARP11               AAF91391    12p13.3      6 F3     8    9     331   331     BC017569    BC040269
12    ZC3HDC1              NP_073587   7q34         6 B1     12   12    701   711     BC081541    na
13    ZC3HAV1  ZAP         NP_064504   7q34         6 B1     13   >11   902   996     (BC025308)  (BC029090)
14    TIPARP   TiPARP      NP_056323   3q25.31      3 E1     6    6     657   657     BC050350    BC068173
15    PARP16               AAH31074    15q22.2      9 C      6    7     322   322     BC006389    BC055447
16    PARP8                NP_078891   5q11.2       13 D2.3  26   26    854   852     (BC075801)  (BC021315)
17    PARP6                CAB59261    15q22.23     9 C      22   22    630   630     (BC026955)  BC062096
[Figure 2B graphic not reproduced: ideograms of the human (Hs) chromosomes 1-22, X, Y and the mouse (Mm) chromosomes 1-19, X, Y with the positions of the pART genes marked.]
Chromosomal localizations and exon compositions of the human and mouse pART family members
Figure 2
Chromosomal localizations and exon compositions of the human and mouse pART family members. A) pART
family members are sorted by subgroup on the basis of similarities in amino acid sequence, intron positions and associated pro-
tein domains. Color-coding of subgroups is as follows: 1 = red, 2 = pink, 3 = orange, 4 = green, 5 = grey. This color-coding is
used in subsequent figures. Official gene designations, common aliases and accession numbers are shown. Exon compositions
and lengths of open reading frames are given for the longest known or predicted gene transcripts. Available full length cDNAs
from the Mammalian Gene Collection (MGC) are indicated with their respective accession numbers. MGC cDNAs which
apparently do not contain the full open reading frame are indicated in parentheses. Hs = Homo sapiens, Mm = Mus musculus. B)
Chromosomal localizations of pART genes were determined by tBLASTn searches of the respective genome sequences using
the amino acid sequences of the catalytic domains of individual pARTs. Members of the five pART family subgroups are color-
coded as in A).
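As a rough cross-check of the clusters named in the text, the localizations listed in panel A can be grouped programmatically. The sketch below is an illustration only: it treats two pARTs as a candidate conserved cluster when they share both the human chromosome and the mouse chromosome band, which is a coarse proxy for adjacency and not the synteny analysis performed by the authors. The function names are hypothetical; the localizations are copied from the table.

```python
import re
from collections import defaultdict

# pART number -> (human band, mouse band), copied from Figure 2A.
# pART7 has no mouse orthologue, so its mouse band is None.
LOCI = {
    1: ("1q41-q42", "1 H5"), 2: ("14q11.2-q12", "14 C1"),
    3: ("3p21.1-22.2", "9 F1"), 4: ("13q11", "14 C1"),
    5: ("8p23.1", "8 A4"), 6: ("10q23.3", "19 C2"),
    7: ("3q21.1", None), 8: ("3q21.1", "16 B3"),
    9: ("3q13.3-q21", "16 B3"), 10: ("8q24.3", "15 D3"),
    11: ("12p13.3", "6 F3"), 12: ("7q34", "6 B1"),
    13: ("7q34", "6 B1"), 14: ("3q25.31", "3 E1"),
    15: ("15q22.2", "9 C"), 16: ("5q11.2", "13 D2.3"),
    17: ("15q22.23", "9 C"),
}


def human_chromosome(band):
    """Extract the chromosome name from a cytogenetic band such as '14q11.2-q12'."""
    return re.match(r"(\d+|[XY])[pq]", band).group(1)


def candidate_clusters(loci):
    """Group pARTs sharing a human chromosome and a mouse band (size >= 2 only)."""
    groups = defaultdict(list)
    for part, (hs_band, mm_band) in sorted(loci.items()):
        if mm_band is None:  # no mouse orthologue to compare
            continue
        groups[(human_chromosome(hs_band), mm_band)].append(part)
    return {key: members for key, members in groups.items() if len(members) > 1}


for key, members in candidate_clusters(LOCI).items():
    print(key, members)
```

With the table values as transcribed, this grouping returns the three pairs named in the text (pARTs 8 and 9, 12 and 13, 15 and 17). pARTs 2 and 4 share a mouse band but lie on different human chromosomes and are therefore not grouped.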
[Figure 3 graphic not reproduced. Panel A shows, for pARTs 1-17, the exons (sizes in bp) and introns (sizes in bp, phases 0-2) of the region encoding the catalytic domain, with the codons of the H-Y-E motif positions marked; legend entries: 200 bp scale bar, utr, CDS, intron, catalytic domain, conserved exon/intron boundaries. Panel B shows the intron positions and phases (phase 0, phase 1, phase 2) for pARTs 1-17. Distances in codons read from the panel B labels: pART1 26, pART2 53, pART3 52 (a second value, 17, also appears next to this label), pART4 86, pARTs 5-10 34 each, pARTs 11 and 12 64 each, pART13 57, pART14 169, pART15 47, pART16 37, pART17 36.]
Schematic diagram of the exon/intron structures of the regions encoding the catalytic domain of pART family members
Figure 3
Schematic diagram of the exon/intron structures of the regions encoding the catalytic domain of pART family
members. A) Exon/intron structures were determined by BLASTn searches of the human genome sequence with individual
pART cDNA sequences. Only the exons corresponding to the catalytic domain of PARP-1 are shown. The coding region is
marked in red, the 3' untranslated region (utr) is marked in white, and a blue bar marks the region corresponding to the cata-
lytic domain. Exons are represented as boxes with the width of each box reflecting the size of the respective exon (the 3' utr
is not drawn to scale). Exon numbers are given with exon 1 corresponding to the exon encoding the presumptive initiation
methionine. In all cases except pART4 (VPARP) the catalytic domain is encoded by the 3' terminal exons. Exon sizes (or size of
coding region in case of the 3' exons) in basepairs are indicated on top of the boxes. Introns are depicted as triangles and are
not drawn to scale. Intron sizes in base pairs are indicated on top of the triangles. The position of each intron with respect to
the reading frame is indicated in the triangles (0 = between codons, +1 = between codon positions 1 and 2, +2 = between
codon positions 2 and 3). Conserved exon boundaries are marked by colored arrows. Codons corresponding to the H-Y-E
motif in the NAD binding crevice of DT and PARP-1 (see Fig. 1) are marked by yellow circles. B) The catalytic domain as delin-
eated in this paper is indicated by the dashed rectangle. For each pART the cDNA coding region within the catalytic domain is
marked by a straight line, regions extending beyond this domain in the 5' direction (and in the 3' direction in case of pART4)
are marked by dashed lines. The positions of the codons corresponding to the H, Y, E residues in the NAD-binding crevice are
indicated by vertical lines. Intron phases are indicated by circles (phase 0), boxes (phase 1), and triangles (phase 2). Numbers
indicate the distance in codons between the conserved histidine in β1 and the next upstream phase 0 intron. Color-coding of
conserved introns corresponds to that shown in A). Nonconserved introns are indicated in blue (filled) icons.
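The intron phase convention used in this legend (0 = between codons, 1 = between codon positions 1 and 2, 2 = between codon positions 2 and 3) follows directly from the number of coding bases that precede each intron. A minimal sketch, assuming a list of coding-exon lengths that starts at the first base of the initiation codon; the function name and the example lengths are hypothetical and are not taken from Figure 3:

```python
def intron_phases(coding_exon_lengths):
    """Phase of the intron that follows each coding exon except the last.

    The phase is the number of bases of an incomplete codon left at the
    end of the upstream coding sequence: cumulative coding length mod 3.
    """
    phases = []
    coding_bases_so_far = 0
    for length in coding_exon_lengths[:-1]:
        coding_bases_so_far += length
        phases.append(coding_bases_so_far % 3)
    return phases


# Hypothetical gene with four coding exons of 100, 50, 30 and 20 bp.
print(intron_phases([100, 50, 30, 20]))
```

In this example the first intron follows 100 coding bases (33 complete codons plus one base) and is therefore a phase 1 intron, while the next two introns fall between codons (phase 0).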
coding exons by a phase 0 intron shortly upstream of the codon for the first residue of the conserved H-Y-E catalytic site motif, the conserved histidine in β1 (Fig. 3). For most pARTs, the amino acid sequences encoded by exons upstream of this phase 0 intron do not show any detectable similarities, except for members of a particular subgroup. We used the position of this phase 0 intron in pART1 to delineate the N-terminal border of the catalytic domain (e.g., see the green labeled end of the PARP-1 model in Figure 1 and the dashed rectangle in Figure 3B).

The exon/intron structures of the pART catalytic domains reveal a number of intriguing features (Fig. 3). The region encoding the catalytic domain is disrupted by a remarkable variety of introns, with the number of introns varying from one in subgroup 3 and in pART14 to six in pARTs 16 and 17. The catalytic domain of pART1 (PARP-1) and those of its closest relatives in subgroup 1 are disrupted by three (pARTs 3 and 4) or four (pARTs 1 and 2) introns. Strikingly, not one of these 14 intron positions is conserved. The catalytic domains of the two closely related tankyrases in subgroup 2 each are interrupted by three conserved introns. In subgroup 3, the catalytic domains of pARTs 7-10 each contain a single conserved intron. The pARTs of subgroup 4 (pARTs 11-14) share a single conserved intron in their catalytic domains; pARTs 11-13 share a second conserved intron in the catalytic domain, which is missing in pART14. The pARTs of subgroup 5 (pARTs 15-17) share two conserved introns in their catalytic domains; pARTs 16 and 17 share four additional conserved introns in the catalytic domain, which are missing in pART15.

Conserved structural features revealed by multiple amino acid sequence alignments and secondary structure predictions
PSI-BLAST is a powerful, position-sensitive iterative program designed to detect distantly related proteins in the protein database [35]. Initial matches in the first iteration correspond to those detected by classic BLASTp searches and typically reveal proteins with an amino acid sequence identity to the query sequence of > 30%. PSI-BLAST then derives a position specific scoring matrix from the aligned protein sequences obtained in the first iteration, which is then used for the subsequent search of the protein database. This process is repeated in an iterative fashion until no further matches are detected and the search 'converges'. We performed PSI-BLAST searches of the protein database using as query the amino acid sequences of the catalytic domain of each member of the pART gene family. Figure 4 schematically illustrates the tiling paths of PSI-BLAST searches obtained with the stringent default threshold setting (0.005 for the expect value) for a representative member of pART family subgroups 1, 3, 4 and 5. Typically, the other members of the same subgroup were detected in the first iteration and obtained the highest scores. The pARTs of other subgroups were usually detected within two additional iterations, except in case of pART15. Here, five iterations were required to detect all pART family members.

The amino acid sequence alignments generated by PSI-BLAST typically contained the highest degree of sequence similarity in secondary structure motifs corresponding to the NAD-binding cores in the known 3D structures of chicken PARP-1 (1a26) and mouse PARP-2 (1gs0). Separate multiple amino acid sequence alignments were generated with T-Coffee for each of the pART subgroups using the orthologous sequences from human and mouse [36]. PSIPRED was used to predict secondary structure units and GenTHREADER was used to predict the optimal alignment of pART amino acid sequences with the 3D structures of chicken PARP-1 and mouse PARP-2 [37]. In all cases, predictions and alignments yielded consistent results with respect to the sole alpha helix and five of the six β strands of the PARP-1 catalytic domain (see additional files 3, 4, 5, 6, 7: "Multiple amino acid sequence alignments, secondary structure predictions and threading results for pART subgroups 1-5"). The small β strand (β4) at the upper edge of the active site crevice was aligned and predicted congruently only for subgroups 1-4, and could not be predicted with confidence for the most distant relatives of PARP-1 (pARTs 15-17). Regions corresponding to connecting loops showed significant sequence identities only for members of a particular pART subgroup. Most likely, these regions fold similarly only in closely related pART family members.

A striking result of the alignment analyses is that the H-Y-E catalytic site motif is fully conserved only in subgroups 1 and 2 (pARTs 1-6). All other pARTs show deviations from this motif. The histidine in β1 is conserved in 9 of the 11 members of subgroups 3-5, the tyrosine in β3 is conserved in all family members, yet the presumptive catalytic glutamic acid at the N-terminal end of β6 is exchanged in each of the pARTs 7-17.

Moreover, the amino acid sequence of the loop immediately upstream of β5 and the active site glutamic acid residue deviates markedly from those of PARP-1 and PARP-2 in most other family members except for the tankyrases (pARTs 5 and 6). A growing body of evidence indicates that this region influences the target specificity of pARTs and mARTs [38-40]. In the 3D structure of PARP-1 with carba-NAD (3pax), the ligand was found to interact with this loop outside of the active site crevice, and it was proposed that this may reflect the binding of the ADP-ribose polymer in the target protein [14].
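The iterative search logic described above for PSI-BLAST (build a position specific scoring matrix from the current hits, search again, stop when no new sequence passes the threshold) can be illustrated with a deliberately simplified toy. This is not the PSI-BLAST algorithm: the sketch assumes ungapped sequences of equal length, a uniform background distribution, a small pseudocount, and a raw log-odds score cutoff in place of an expect value. All function names and all sequences are invented for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(seqs, pseudocount=0.1):
    """Log-odds scoring matrix from equal-length, ungapped sequences."""
    background = 1.0 / len(ALPHABET)  # uniform background (simplification)
    total = len(seqs) + pseudocount * len(ALPHABET)
    pssm = []
    for pos in range(len(seqs[0])):
        counts = Counter(seq[pos] for seq in seqs)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in ALPHABET
        })
    return pssm


def score(pssm, seq):
    """Sum of the per-position log-odds scores of seq against the matrix."""
    return sum(column[aa] for column, aa in zip(pssm, seq))


def iterative_search(query, database, threshold, max_iter=10):
    """Return the names newly included at each iteration until convergence."""
    included = {"query": query}
    history = []
    for _ in range(max_iter):
        pssm = build_pssm(list(included.values()))
        new_hits = sorted(
            name for name, seq in database.items()
            if name not in included and score(pssm, seq) >= threshold
        )
        if not new_hits:  # no further matches: the search has converged
            break
        history.append(new_hits)
        for name in new_hits:
            included[name] = database[name]
    return history


# Invented 8-residue sequences: a close and a distant relative, plus a non-relative.
DATABASE = {"close": "HSTYIAEV", "distant": "HSKYIQEV", "unrelated": "WWPPCCMM"}
print(iterative_search("HGTYLAEV", DATABASE, threshold=12.0))
```

In this toy run the close relative passes the threshold in the first iteration, and the distant relative is picked up only in the second iteration, once the matrix also reflects the residues contributed by the close relative. This mirrors, in miniature, how members of other subgroups appear in later iterations of the tiling paths.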
[Figure 4 graphic not reproduced: for each input query (Hs.pART1, Hs.pART9, Hs.pART12, Hs.pART15, Ci.pART), the figure lists the matching sequences at the PSI-BLAST iteration in which they first appeared above threshold (iteration 1 = traditional BLASTp search), down to the iteration at which each search converged (iterations 6-8).]
Representative tiling paths of PSI-BLAST searches initiated with the catalytic domain amino acid sequences of selected pART family members
Figure 4
Representative tiling paths of PSI-BLAST searches initiated with the catalytic domain amino acid sequences of
selected pART family members. PSI-BLAST searches were initiated with the catalytic domain amino acid sequences of the
pARTs indicated on top as query sequences with the default threshold setting for the expect value of 0.005. Matching
sequences from selected model organisms are indicated at the iteration in which they first appeared above threshold. pART
subgroups are color coded as in Figure 2. Accession numbers of the indicated pARTs are listed in Figures 2 and 9. Species of
origin is color-coded in the two letter abbreviation of the organism as follows: Homo sapiens (Hs) red, Drosophila melanogaster
(Dm) and Anopheles gambiae (Ag) purple, Caenorrhabditis elegans (Ce) blue, Chilo iridescent virus (Ci) and Bacteriophage Aeh
(Ba) brown.
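The convergence criterion and the "first appearance" bookkeeping behind such tiling paths can be sketched in a few lines of Python. This is a minimal sketch, not the authors' pipeline: `tiling_path`, `toy_search` and the four-sequence neighbour table are hypothetical names and data introduced only for illustration, and `search` merely stands in for one PSI-BLAST round (a real round rebuilds a position-specific scoring matrix from the accumulated hits and rescans the database with an expect-value threshold such as 0.005).

```python
from typing import Callable, Dict, Set


def tiling_path(query: str,
                search: Callable[[Set[str]], Set[str]],
                max_iterations: int = 20) -> Dict[str, int]:
    """Return, for every hit, the iteration in which it first appeared.

    `search` is a hypothetical stand-in for one search round: it receives the
    sequences currently in the profile and returns all database sequences
    that score above threshold against that profile.
    """
    profile = {query}
    first_seen: Dict[str, int] = {}
    for iteration in range(1, max_iterations + 1):
        new_hits = search(profile) - profile
        if not new_hits:
            # No further matches are detected: the search has converged.
            break
        for name in sorted(new_hits):
            first_seen[name] = iteration
        profile |= new_hits
    return first_seen


# Toy data (assumption): each sequence is detectable only from its neighbours.
TOY_NEIGHBOURS = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C"},
}


def toy_search(profile: Set[str]) -> Set[str]:
    """Collect every sequence adjacent to any member of the profile."""
    found: Set[str] = set()
    for member in profile:
        found |= TOY_NEIGHBOURS.get(member, set())
    return found


path = tiling_path("A", toy_search)
print(path)
```

In the toy data, B is found by the query alone, whereas C and D come within reach only after intermediate sequences have joined the profile, which is the behaviour a tiling path is meant to visualize.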
The results of the secondary structure prediction and threading analyses were used to refine a multiple amino acid sequence alignment of the catalytic domains of all human pART family members. The resulting alignment is shown in Figure 5. The conserved secondary structure units corresponding to the catalytic NAD binding core (the six beta strands and one alpha helix marked in Figure 1) are indicated schematically below the alignment. The corresponding amino acid residues are highlighted in the alignment. Intron positions are projected onto the amino acid sequence in Figure 5. The positions of conserved introns are marked by colored arrows below the alignment. Note that the alignment diverges most strongly both in length and in sequence in the loops immediately downstream and upstream of β3.

Figure 6A shows a condensed version of the alignment in which the diverging intervening loops are indicated only by the number of amino acid residues. These 66 amino acid residues can be superimposed well in the 3D structures of PARP-1, PARP-2, DT, and ETA. The respective amino acid sequences of DT, ETA and the putative Chilo iridescent virus pART are also shown for these regions. Figure 6B shows the calculated amino acid sequence identities of the pART family members in this region. The percentage amino acid sequence identity in the aligned core region is higher among members of a particular subgroup than between members of different subgroups, lending support to the subgroup assignments. For each pART, the next most closely related paralogue is a member of the same subgroup. Note that two pairs of pART paralogues show very close sequence similarity: pARTs 5 and 6 (94% identity in the aligned core region) and pARTs 16 and 17 (86% identity). This close similarity is reflected also in the conserved exon intron structures of the respective pART pairs (see Fig. 3).

Comparison of mouse and human pART orthologues shows that seven of such pairs exhibit 100% sequence identity in the aligned core region (pARTs 1, 5, 6, 11, 14, 16, and 17) and six show > 90% identity (pARTs 2, 3, 4, 10, 12, and 15). The mouse and human orthologues of pARTs 8, 9, 13 show the least degrees of sequence identity in this region (82%, 82%, and 70%, respectively) (Fig. 6B).

Phylogenetic analysis of the amino acid sequences of the catalytic cores of pARTs resulted in three very similar trees when using Maximum Parsimony (PAUP), Maximum Likelihood (PhyML), and Bayesian Markov Chain Monte Carlo (MrBayes) optimization criteria (Figure 7). All topologies showed moderate to high support values for the recovered relationships. All trees recovered five basic clades corresponding to the subgroups 1-5. The results indicate that pARTs of subgroups 1 and 2 are more closely related (sistergroups) to one another than to members of the other subgroups. A similar relationship is seen for pARTs of subgroups 3 and 4. Note that the putative Chilo iridescent virus pART clusters with the mammalian pARTs of subgroup 1, suggesting that this large double stranded DNA virus may have acquired its pART by horizontal gene transfer.

The pART catalytic domain has become genetically fused to a wide spectrum of protein domains
With the exception of closely related members within a subgroup, the amino acid sequence similarity between pART family members breaks off upstream of β1. Interestingly, loss of sequence similarity correlates well with the presence of a phase 0 intron upstream of β1. All pART family members except pART4 and pART14 contain such a phase 0 intron 26-64 codons upstream of the conserved histidine in β1 (Fig. 3B).

Using the sequences flanking the catalytic domain of each pART family member as queries, we performed further PSI-BLAST analyses and searches of the Conserved Domain Database [41]. The results, summarized in Figure 8, reveal that each of the 17 human pARTs with the possible exception of pART15 is a multi-domain protein. Strikingly, the pART catalytic domain is associated, in a Lego-like fashion, with a broad spectrum of known protein domains. In all family members except pART4 the catalytic domain represents the C-terminal domain.

A number of associated domains occur in two or more human pART family members. Note that domain sharing generally is restricted to members of a particular pART subgroup. For example, all members of subgroup 1 contain a helical domain preceding the catalytic domain, whereas this domain is missing in members of other pART subgroups. The two members of subgroup 2 share SAM and ankyrin-repeat domains. Three of four pARTs in subgroup 3 share A1pp domains [42], all members of subgroup 4 share WWE domains, and two members of subgroup 5 contain a second, truncated pART domain, reminiscent of the duplicated inactive ART domain found in the VIP2 and iota mART toxins [16,23].

Several pARTs carry recognizable zinc-fingers containing putative RNA-, DNA-, or ubiquitin-binding domains (pART1, pART2, pART10, pART12, pART13). This indicates that the genetic fusion of a pART catalytic domain with zinc-fingers has occurred repeatedly in evolution.

Representation of pARTs in other model organisms
We also used PSI-BLAST to screen the protein database for recognizable pART family members in other organisms using as queries the amino acid sequences of catalytic domains of each of the 17 human pARTs (Figure 9). The
pART1 798 T D I K V V D R D S E E A E I I R K Y V K N T H A T T H N A Y D L E V I D I F K I E R E G E C Q R Y K P F - - - - - - - - - - 1
pART2 366 C A L R P L D H E S Y E F K V I S Q Y L Q S T H A P T H S D Y T M T L L D L F E V E K D G E K E A F R - - - - - - - - - - - - 2
pART3 330 C Q L Q L L D S G A P E Y K V I Q T Y L E Q T - - - G S N H R C P T L Q H I W K V N Q E G E E D R F Q A H - - - - - - - - - - 3
pART4 379 C K I E H V E Q N T E E F L R V R K E V L Q N - - - H H S K S P V D V L Q I F R V G R V N E T T E F L - - - - - - - - - - - - 4
pART5 1112 P E D K E Y Q S V E E E M Q S T I R E H R D G G N A G G I F N R Y N V I R I Q K V V N K K L R E R F C H R Q K E V S E E N - - 5
pART6 959 P D D K E F Q S V E E E M Q S T V R E H R D G G H A G G I F N R Y N I L K I Q K V C N K K L W E R Y T H R R K E V S E E N - - 6
pART7 253 D M N H Q L F C M V Q L E P G Q S E Y N T I K D K F T R T C S S Y A I E K I E R I Q N A F L W Q S Y Q V K K R Q M D I K N - - 7
pART8 1327 D M K Q Q N F C V V E L L P S D P E Y N T V A S K F N Q T C S H F R I E K I E R I Q N P D L W N S Y Q A K K K T M D A K N - - 8
pART9 633 Q D E M K E N I I F L K C P V P P T Q E L L D Q K K Q F E K C G L Q V L K V E K I D N E V L M A A F Q R K K K M M E E K L - - 9
pART10 815 P W N N L E R L A E N T G E F Q E V V R A F Y D T L D A A R S S I R V V R V E R V S H P L L Q Q Q Y E L Y R E R L L Q R C - - 10
pART11 126 T Q V P Y Q L I P L H N Q T H E Y N E V A N L F G K T M D R - - N R I K R I Q R I Q N L D L W E F F C R K K A Q L K K K R G - 11
pART12 493 P D P G F Q K I T L S S S S E E Y Q K V W N L F N R T L P F - - Y F V Q K I E R V Q N L A L W E V Y Q W Q K G Q M Q K Q N G - 12
pART13 723 S S K K Y K L S E I H H L H P E Y V R V S E H F K A S M K N - - F K I E K I K K I E N S E L L D K F T W K K S - - - - - - - - 13
pART14 457 P S Q D F I Q V P V S A E D K S Y R I I Y N L F H K T V P E F K Y R I L Q I L R V Q N Q F L W E K Y K R K K E Y M N R K M F G 14
pART15 89 L S S K V L T I H S A G K A E F E K I Q K L T G A P H T P V P A P D F L F E I E Y F D P - A N A K F Y E T - - - - - - - - - - 15
pART16 644 I S S N R S H I V K L P V N R Q L K F M H T P H Q - - - - - - - - - - - F L L L S S P P A K E S N F R A A - - - - - - - - - - 16
pART17 421 I S S N R S H I V K L P L S R - L K F M H T S H Q - - - - - - - - - - - F L L L S S P P A K E A R F R T A - - - - - - - - - - 17
*
*
1 - K Q L H N R R L L W H G S R T T N F A G I L S Q G L R I A P P - - - - - E A P V T G Y M F G K G I Y F A D M V S K S A N Y C H T S Q G 1
2 - E D L H N R M L L W H G S R M S N W V G I L S H G L R I A H P - - - - - E A P I T G Y M F G K G I Y F A D M S S K S A N Y C F A S R L 2
3 - S K L G N R K L L W H G T N M A V V A A I L T S G L R I M - - - - - - - - - P H S G G R V G K G I Y F A S E N S K S A G Y V I G M K C 3
4 - S K L G N V R P L L H G S P V Q N I V G I L C R G L L L P K V V E D R G V Q R T D V G N L G S G I Y F S D S L S T S I K Y S H P G E T 4
5 - H N H H N E R M L F H G S P F I N - - A I I H K G F D E R H A - - - - - - - - Y I G G M F G A G I Y F A E N S S K S N Q Y V Y G I G G 5
6 - H N H A N E R M L F H G S P F V N - - A I I H K G F D E R H A - - - - - - - - Y I G G M F G A G I Y F A E N S S K S N Q Y V Y G I G G 6
7 - D H K N N E R L L F H G T D A D S V P Y V N Q H G F N R S C A - - - - - - - G K N A V S Y G K G T Y F A V D A S Y S A K D T Y S K P D 7
8 - G Q T M N E K Q L F H G T D A G S V P H V N R N G F N R S Y A - - - - - - - G K N A V A Y G K G T Y F A V N A N Y S A N D T Y S R P D 8
9 - H R Q P V S H R L F Q Q V P Y Q F C N V V C R V G F Q R M Y S - - - - - - - T P C D P K Y G A G I Y F T K N L K N L A E K A K K I S A 9
10 - E R R P V E Q V L Y H G T T A P A V P D I C A H G F N R S F C - - - - - - - G R N A T V Y G K G V Y F A K R A S L S V Q D R Y S P P N 10
11 - V P Q I N E Q M L F H G T S S E F V E A I C I H N F D W R I N - - - - - - - G I H G A V F G K G T Y F A R D A A Y S S R F C K D D I K 11
12 - G K A V D E R Q L F H G T S A I F V D A I C Q Q N F D W R V C - - - - - - - G V H G T S Y G K G S Y F A R D A A Y S H H Y S K S D T Q 12
13 - Q M K E E G K L L F Y A T S R A Y V E S I C S N N F D S F L H - - - - - - - E T H E N K Y G K G I Y F A K D A I Y S H K N C P Y D A K 13
14 R D R I I N E R H L F H G T S Q D V V D G I C K H N F D P R V C - - - - - - - G K H A T M F G Q G S Y F A K K A S Y S H N F S K K S S K 14
15 - K G E R D L I Y A F H G S R L E N F H S I I H N G L H C H - - - - - - - L N K T - - S L F G E G T Y L T S D L S L A L I Y S P H G H G 15
16 - K K L F G S T F A F H G S H I E N W H S I L R N G L V V A S N - - - - T R L Q L H G A M Y G S G I Y L S P M S S I S F G Y S G M N K K 16
17 - K K L Y G S T F A F H G S H I E N W H S I L R N G L V N A S Y - - - - T K L Q L H G A A Y G K G I Y L S P I S S I S F G Y S G M G K G 17
2
1 2
1 D P - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - I G L I L L G E V A L G N M Y E L K H A S H I S K - L P K G K - - - - - 1
2 K N - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - T G L L L L S E V A L G Q C N E L L E A N P K A E G L L Q G K - - - - - 2
3 G A H H - - - - - - - - - - - - - - - - - - - - - - - - - - - - V G Y M F L G E V A L G R E H H I N T D N P S L K S P P P G F - - - - - 3
4 D G - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - T R L L L I C D V A L G K C M D L H E K D F P L T E A P P G Y - - - - - 4
5 G T G C P T H K D R S C Y I C - - - - - - - - - - - - - - - - - H R Q M L F C R V T L G K S F - L Q F S T M K M A H A P P G H - - - - - 5
6 G T G C P V H K D R S C Y I C - - - - - - - - - - - - - - - - - H R Q L L F C R V T L G K S F - L Q F S A M K M A H S P P G H - - - - - 6
7 S N G - - - - - - - - - - - - - - - - - - - - - - - - - - - - - R K H M Y V V R V L T G V F T K G R A G L V T P P P K N P H N P T D L F 7
8 A N G - - - - - - - - - - - - - - - - - - - - - - - - - - - - - R K H V Y Y V R V L T G I Y T H G N H S L I V P P S K N P Q N P T D L Y 8
9 A D K - - - - - - - - - - - - - - - - - - - - - - - - - - - - - L I Y V F E A E V L T G F F C Q G H P L N I V P P P L S P G A I D G H - 9
10 A D G - - - - - - - - - - - - - - - - - - - - - - - - - - - - - H K A V F V A R V L T G D Y G Q G R R G L R A P P L R G P G H V L L R Y 10
11 H G N T F Q I H G V S L Q Q R H L F R T - - - - - - - - - - - - Y K S M F L A R V L I G D Y I N G D S K Y M R P P S K D G S Y V N L Y - 11
12 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - T H T M F L A R V L V G E F V R G N A S F V R P P A K E G W S N A F Y - 12
13 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - N V V M F V A Q V L V G K F T E G N I T Y T S P P P Q F - - - - - - - - 13
G
14 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - V H F M F L A K V L T G R Y T M G S H G M R R P P P V N P G S V T S D L 14
W Q
15 - - - - - - - - - - - - - - H S L L G P I L S C V A V C E V I D H P D V K C Q T K K K D S K E - - - - - - - - - - - - - - - - - - - 15
Q K V S A
16 - K D E P A S S S K S S N T - S Q S Q K K G Q Q S Q F L Q S R N L K C I A L C E V I T S - - - - - - - - - - - - - - - - - - - 16
17 Q H R M P S K D E L V Q R Y N R M N T I P Q T R S I Q S R - - F L Q S R N L N C I A L C E V I T S - - - - - - - - - - - - - - - - - - - 17
3
*
1 - H S V K G L G K T T P D P S A N - - I S L D G V D V P L G T G I S S G V - - - N D T S L L Y N E Y I V Y D I A Q V N L K Y L L K L K F 7
2 - H S T K G L G K M A P S S A H F - - V T L N G S T V P L G P A S D T G I L N P D G Y T L N Y N E Y I V Y N P N Q V R M R Y L L K V Q F 6
3 - D S V I A R G H T E P D P T Q D T E L E L D G Q Q V V V P Q G Q P V P C P E F S S S T F S Q S E Y L I Y Q E S Q C R L R Y L L E V H L 0
4 - D S V H G V S Q T A S V T T D - - - - - - - - - - - - - - - - - - - - - - - - - - - - F E D D E F V V Y K T N Q V K M K Y I I K F S M 1158
5 - H S V I G R P S V N G - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - L A Y A E Y V I Y R G E Q A Y P E Y L I T Y Q I 17
6 - H S V T G R P S V N G - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - L A L A E Y V I Y R G E Q A Y P E Y L I T Y Q I 9
7 - D S V T N N T - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - R S P K L F V V F F D N Q A Y P E Y L I T F T A 0
8 - D T V T D N V - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - H H P S L F V A F Y D Y Q A Y P E Y L I T F R K 0
9 - D S V V D N V - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - S S P E T F V I F S G M Q A I P Q Y L W T C T Q 31
10 - D S A V D C I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - C Q P S I F V I F H D T Q A L P T H L I T C E H 19
11 - D S C V D D T - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - W N P K I F V V F D A N Q I Y P E Y L I D F H 0
12 - D S C V N S V - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - S D P S I F V I F E K H Q V Y P E Y V I Q Y T T 22
13 - D S C V D T R - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - S N P S V F V I F Q K D Q V Y P Q Y V I E Y T E 7
14 Y D S C V D N F - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - F E P Q I F V I F N D D Q S Y P Y F V I Q Y E E 7
15 I D R R R A R I K - - - - - - - - - - - - - - - - - - - - - - - - - - - - - H S E G G D I P P K Y F V V T N N Q L L R V K Y L L V Y S Q 49
16 S D L H K H G E - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - I W V V P N T D H V C T R F F F V Y E D 30
17 K D L Q K H G N - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - I W V C P V S D H V C T R F F F V Y E D 30
4 6
5
Figure 5
Multiple amino acid sequence alignment of the catalytic cores of the human pART family. The multiple sequence alignment was generated with T-Coffee and manually adjusted using the results of the PSI-BLAST, PSIPRED, and GenTHREADER analyses. Numbers at the sequence ends indicate the number of additional residues upstream and downstream of the alignment shown. Residues corresponding to the H Y E motif in the NAD binding crevice of diphtheria toxin are in red and marked by asterisks. The conserved β sheets and alpha helix are shaded in green and yellow. Conserved intron positions are marked in the multiple alignment using the same color-coding as in Figure 3. Conserved intron positions are indicated also above the alignment with arrows. Non-conserved intron positions are marked in blue in the alignment.
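Intron phase, which matters when intron positions are projected onto a protein alignment, follows directly from the number of coding nucleotides that precede the intron. The sketch below is illustrative only: the function names and the coordinates are hypothetical and are not taken from any pART gene, and the way the distance to a downstream residue is counted is an assumption that may differ by one codon from the convention used for the figures.

```python
def intron_phase(coding_nt_before_intron: int) -> int:
    """Return the intron phase (0, 1 or 2).

    Phase 0: the intron lies between two codons.
    Phase 1 or 2: the intron interrupts a codon after its first or second
    nucleotide.
    """
    if coding_nt_before_intron < 0:
        raise ValueError("nucleotide count must be non-negative")
    return coding_nt_before_intron % 3


def codons_between(coding_nt_before_intron: int, residue_codon: int) -> int:
    """Count complete codons between a phase 0 intron and a downstream codon.

    `residue_codon` is the 1-based codon index of the residue of interest,
    for example a conserved histidine.
    """
    if intron_phase(coding_nt_before_intron) != 0:
        raise ValueError("this helper only handles phase 0 introns")
    last_codon_before_intron = coding_nt_before_intron // 3
    if residue_codon <= last_codon_before_intron:
        raise ValueError("residue must lie downstream of the intron")
    return residue_codon - last_codon_before_intron - 1


# Hypothetical coordinates: an intron after 2400 coding nucleotides and a
# residue of interest encoded by codon 862.
print(intron_phase(2400), codons_between(2400, 862))
```

With these made-up coordinates the intron sits between codons 800 and 801, so it is a phase 0 intron with 61 complete codons separating it from codon 862.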
A 1 2 3 4 5 6
2
** *
1tox    13  MENFSSYHGTKP  24  WKGFYSTDNKYDAAGYS  10  AGGVVKVTYPGL  45  VVLSL   7  SVEYINNWEQKAALSVELEINF  368
1aer   457  GYVFVGYHGTFL  23  WRGFYIAGDPALAYGYA  12  NGALLRVYVPRS  32  LDAIT   7  RLETILGWPLAERTVVIPSIPT  40
3pax   851  HNRQLLWHGSRT  25  GKGIYFADMVSKSANYC   7  IGLILLGEVALG  18  HSVKG  35  YNEYIVYDV--AQVNLKYLLKL  9
ci      49  KKTRLLIHGTRC  26  GEGNYFSEHVQKSLNYT   4  DQILLIYEVHVG   8  YNGDR  26  NSEIISYNE--DQSKIKYIIHI  2
h 1    854  HNRRLLWHGSRT  25  GKGIYFADMVSKSANYC   7  IGLILLGEVALG  18  HSVKG  35  YNEYIVYDI--AQVNLKYLLKL  9
m 1    854  HNRRLLWHGSRT  25  GKGIYFADMVSKSANYC   7  IGLILLGEVALG  18  HSVKG  35  YNEYIVYDI--AQVNLKYLLKL  9
h 2    420  HNRMLLWHGSRM  25  GKGIYFADMSSKSANYC   7  TGLLLLSEVALG  19  HSTKG  38  YNEYIVYNP--NQVRMRYLLKV  8
1gs0   396  PNRMLLWHGSRL  25  GKGIYFADMSSKSANYC   7  TGLLLLSEVALG  19  HSTKG  38  YNEFIVYSP--NQVRMRYLLKI  8
h 3    383  GNRKLLWHGTNM  21  GKGIYFASENSKSAGYV   9  VGYMFLGEVALG  19  DSVIA  40  QSEYLIYQE--SQCRLRYLLEV  2
m 3    371  GNRRLLWHGTNV  21  GKGIYFASENSKSAGYV   9  VGYMFLGEVALG  19  DSVIA  40  QSEYLIYKE--SQCRLRYLLEI  2
h 4    430  GNVRPLLHGSPV  30  GSGIYFSDSLSTSIKYS   7  TRLLLICDVALG  19  DSVHG  12  DDEFVVYKT--NQVKMKYIIKF  1160
m 4    542  GNVRLLFHGSPV  30  GSGIYFSDSLSTSIKYA   7  SRLLVVCDVALG  19  DSVHG  12  DDEFVVYKT--NQVKMKYIVKF  >672
h 5   1176  HNERMLFHGSPF  20  GAGIYFAENSSKSNQYV  20  HRQMLFCRVTLG  18  HSVIG   8  YAEYVIYRG--EQAYPEYLITY  19
m 5   1169  HNERMLFHGSPF  20  GAGIYFAENSSKSNQYV  20  HRQMLFCRVTLG  18  HSVIG   8  YAEYVIYRG--EQAYPEYLITY  19
h 6   1023  ANERMLFHGSPF  20  GAGIYFAENSSKSNQYV  20  HRQLLFCRVTLG  18  HSVTG   8  LAEYVIYRG--EQAYPEYLITY  11
m 6   1194  ANERMLFHGSPF  20  GAGIYFAENSSKSNQYV  20  HRQLLFCRVTLG  18  HSVTG   8  LAEYVIYRG--EQAYPEYLITY  11
h 7    317  NNERLLFHGTDA  23  GKGTYFAVDASYSAKDT   8  RKHMYVVRVLTG  24  DSVTN   4  PKLFVVFFD--NQAYPEYLITF  2
m 7
h 8   1391  MNEKQLFHGTDA  23  GKGTYFAVNANYSANDT   8  RKHVYYVRVLTG  24  DTVTD   4  PSLFVAFYD--YQAYPEYLITF  2
m 8   1408  RNEKHLFHGTEA  23  GKGTYFAVKASYSACDT   8  RKYMYYVRVLTG  24  DTVTD   4  PSIFVVFYD--NQTYPEYLITF  2
h 9    697  PVSHRLFQQVPY  23  GAGIYFTKNLKNLAEKA   8  LIYVFEAEVLTG  23  DSVVD   4  PETFVIFSG--MQAIPQYLWTC  33
m 9    668  SGSQRLFQQVPH  23  GAGIYFTKSLKNLADKV   8  LIYVFEAEVLTG  23  DSVVD   4  PETIVVFNG--MQAMPLYLWTC  38
h 10   879  PVEQVLYHGTTA  23  GKGVYFAKRASLSVQDR   8  HKAVFVARVLTG  24  DSAVD   4  PSIFVIFHD--TQALPTHLITC  21
m 10   828  PVEQVLYHGTSE  23  GQGVYFAKRASLSVLDR   8  YKAVFVAQVLTG  23  DSAVD   4  PRIFVIFHD--TQALPTHLITC  8
h 11   189  INEQMLFHGTSS  23  GKGTYFARDAAYSSRFC  25  YKSMFLARVLIG  23  DSCVD   4  PKIFVVFDA--NQIYPEYLIDF  1
m 11   189  INEQMLFHGTSS  23  GKGTYFARDAAYSSRFC  25  YKSMFLARVLIG  23  DSCVD   4  PKIFVVFDA--NQIYPEYLIDF  1
h 12   556  VDERQLFHGTSA  23  GKGSYFARDAAYSHHYS   5  THTMFLARVLVG  23  DSCVN   4  PSIFVIFEK--HQVYPEYVIQY  24
m 12   566  VDERQLFHGTSA  23  GKGSYFARDAAYSHHYS   5  SHMMFLARVLVG  23  DSCVN   4  PTIFVVFEK--HQVYPEYLIQY  24
h 13   779  EEGKLLFYATSR  23  GKGIYFAKDAIYSHKNC   5  NVVMFVAQVLVG  16  DSCVD   4  PSVFVIFQK--DQVYPQYVIEY  9
m 13   870  KTEMFLFHAVGR  23  GKGNYFTKEAMYSHKSC   5  GTVMFVARVLVG  16  DSCVD   4  PSVFVIFRK--EQIYPEYVIEY  12
h 14   524  INERHLFHGTSQ  23  GQGSYFAKKASYSHNFS   6  VHFMFLAKVLTG  25  DSCVD   4  PQIFVIFND--DQSYPYFVIQY  9
m 14   524  INERHLFHGTSQ  23  GQGSYFAKKASYSHNFS   6  VHFMFLAKVLTG  25  DSCVD   4  PQIFVIFND--DQSYPYFVIQY  9
h 15   144  RDLIYAFHGSRL  21  GEGTYLTSDLSLALIYS  23  IDHPDVKCQTKK   6  DRRRA  11  PKYFVVTNN--QLLRVKYLLVY  51
m 15   144  RDLIYAFHGSRL  21  GEGTYLTSDLSLALIYS  23  IDHPDVKCQIKK   6  DRSRA  11  PKYFVVTNN--QLLRVKYLLVY  51
h 16   689  FGSTFAFHGSHI  26  GSGIYLSPMSSISFGYS  35  LQSRNLKCIALC   6  DLHKH   0  GEIWVVPNT--DHVCTRFFFVY  32
m 16   687  FGSTFAFHGSHI  26  GSGIYLSPMSSISFGYS  35  LQSRNLKCIALC   6  DLHKH   0  GEIWVVPNT--DHVCTRFFFVY  32
h 17   465  YGSTFAFHGSHI  26  GKGIYLSPISSISFGYS  35  LQSRNLNCIALC   6  DLQKH   0  GNIWVCPVS--DHVCTRFFFVY  32
m 17   465  YGSTFAFHGSHI  26  GKGIYLSPISSISFGYS  35  LQSRNLNCIALC   6  DLQKH   0  GNIWVCPVS--DHVCTRFFFVY  32
2
1 2 3 4 5 6
B
tox aer ci g01 h01 h02 h03 h04 h05 h06 h07 h08 h09 h10 h11 h12 h13 h14 h15 h16 h17
tox *** 30 15 18 18 18 20 15 15 15 17 21 9 14 14 15 9 11 17 11 12 ddt
aer 30 *** 18 17 17 20 20 18 17 17 15 14 6 20 15 17 12 12 15 9 9 aer
ci 15 18 *** 36 38 36 32 36 30 32 26 26 15 21 21 27 24 27 18 17 15 ci
g01 18 17 36 *** 97 79 56 47 47 44 33 29 23 26 33 27 26 26 23 26 26 g01
h01 18 17 38 97 *** 79 56 49 49 46 35 29 23 24 32 29 26 27 23 26 26 h01
m01 18 17 38 97 100 79 56 49 49 46 35 29 23 24 32 29 26 27 23 26 26 m01
h02 18 20 36 79 79 *** 58 50 47 46 33 27 21 24 32 29 26 27 23 30 29 h02
m02 17 21 36 76 76 92 53 52 44 44 35 29 26 27 33 30 27 27 24 29 29 m02
h03 20 20 32 56 56 58 *** 36 44 41 36 33 29 32 33 35 36 33 21 23 24 h03
m03 20 20 35 56 58 55 95 41 46 42 38 32 29 32 33 36 33 35 21 23 24 m03
h04 15 18 36 47 49 50 36 *** 46 47 38 29 26 26 32 33 27 30 24 30 26 h04
m04 14 17 32 46 47 47 36 91 44 46 39 29 29 26 32 30 29 29 26 30 26 m04
h05 15 17 30 47 49 47 44 46 *** 94 46 41 35 38 39 41 32 38 21 24 23 h05
m05 15 17 30 47 49 48 44 46 100 94 46 41 35 38 39 41 32 38 21 24 23 m05
h06 15 17 32 44 46 46 41 47 94 *** 46 42 35 38 38 39 30 36 21 24 23 h06
m06 15 17 32 44 46 46 41 47 94 100 46 42 35 38 38 39 30 36 21 23 23 m06
h07 17 15 26 33 35 33 36 38 46 46 *** 79 36 55 62 55 47 50 29 17 17 h07
h08 21 14 26 29 29 27 33 29 41 42 79 *** 39 55 55 50 42 47 21 14 15 h08
m08 17 15 24 30 30 30 38 33 41 41 79 82 36 55 61 52 44 53 24 18 18 m08
h09 09 06 15 23 23 21 29 26 35 35 36 39 *** 46 35 33 39 36 18 15 14 h09
m09 07 05 18 26 24 24 29 27 33 33 36 36 82 41 36 30 35 35 20 20 15 m09
h10 14 20 21 26 24 24 32 26 38 38 55 55 46 *** 52 50 46 52 20 15 17 h10
m10 12 20 21 24 23 23 30 26 33 33 50 50 46 91 52 47 47 55 20 15 15 m10
h11 14 15 21 33 32 32 33 32 39 38 62 55 35 52 *** 64 53 59 24 20 20 h11
m11 14 15 21 33 32 32 33 32 39 38 62 55 35 52 100 64 53 59 24 20 20 m11
h12 15 17 27 27 29 29 35 33 41 39 55 50 33 50 64 *** 62 67 24 23 24 h12
m12 15 17 26 29 30 29 32 33 39 38 56 49 32 47 65 94 59 65 26 24 24 m12
h13 9 12 24 26 26 26 36 27 32 30 47 42 39 46 53 62 *** 56 18 17 18 h13
m13 11 7 24 21 21 23 30 26 38 36 47 44 39 47 55 61 70 53 20 15 17 m13
h14 11 12 27 26 27 27 33 30 38 36 50 47 36 52 59 67 56 *** 21 26 24 h14
m14 11 12 27 26 27 27 33 30 38 36 50 47 36 52 59 67 56 100 21 26 24 m14
h15 17 15 18 23 23 23 21 24 21 21 29 21 18 20 24 24 18 21 *** 30 26 h15
m15 17 15 18 23 23 23 21 24 20 20 29 21 18 20 24 24 18 21 97 30 26 m15
h16 11 9 17 26 26 30 23 30 24 24 17 14 15 15 20 23 17 26 30 *** 86 h16
m16 11 9 17 26 26 30 23 30 23 23 17 14 15 15 20 23 17 26 30 100 86 m16
h17 12 9 15 26 26 29 24 26 23 23 17 15 14 17 20 24 18 24 26 86 *** h17
m17 12 9 15 26 26 29 24 26 23 23 17 15 14 17 20 24 18 24 26 86 100 m17
ddt aer ci g01 h01 h02 h03 h04 h05 h06 h07 h08 h09 h10 h11 h12 h13 h14 h15 h16 h17
Figure 6
Structure based amino acid sequence alignment of the catalytic cores of the pART gene family. A) The alignment is restricted to those regions corresponding to the conserved secondary structure units of PARP-1 and DT as highlighted in Figure 1. The H Y E motif is marked by asterisks and is highlighted in red. Black numbers indicate amino acid residues from the N- and C-terminal ends of the protein and within the loops connecting the structure units shown. For proteins with known 3D structures the pdb accession number is given and the residues corresponding to respective secondary structure units are underlined. 1tox = diphtheria toxin; 1aer = Pseudomonas exotoxin A, 3pax = chicken PARP-1 (pART1), 1gs0 = mouse PARP-2 (pART2). Human and mouse pARTs are indicated by colored numbers. The sequence of the putative pART from Chilo iridescent virus is also shown for comparison (ci). B) Pairwise percentage sequence identities were calculated for the 66 amino acid residues shown in A), which correspond to the conserved core secondary structure units in Figure 1.
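The pairwise identities in panel B can be reproduced from the rows of panel A with a short calculation. The following is a minimal sketch under one assumption that the caption does not spell out, namely that alignment columns containing a gap in either sequence are skipped; `percent_identity` is a name chosen here for illustration. The four sequences are the concatenated core blocks of human pARTs 5, 6, 16 and 17 as printed in panel A.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percentage of identical residues over the gap-free aligned columns."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # assumption: columns with a gap are not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Core blocks copied from panel A (five blocks per sequence, concatenated).
H5 = ("HNERMLFHGSPF" "GAGIYFAENSSKSNQYV" "HRQMLFCRVTLG"
      "HSVIG" "YAEYVIYRG--EQAYPEYLITY")
H6 = ("ANERMLFHGSPF" "GAGIYFAENSSKSNQYV" "HRQLLFCRVTLG"
      "HSVTG" "LAEYVIYRG--EQAYPEYLITY")
H16 = ("FGSTFAFHGSHI" "GSGIYLSPMSSISFGYS" "LQSRNLKCIALC"
       "DLHKH" "GEIWVVPNT--DHVCTRFFFVY")
H17 = ("YGSTFAFHGSHI" "GKGIYLSPISSISFGYS" "LQSRNLNCIALC"
       "DLQKH" "GNIWVCPVS--DHVCTRFFFVY")

id_5_6 = round(percent_identity(H5, H6))
id_16_17 = round(percent_identity(H16, H17))
print(id_5_6, id_16_17)
```

For these two pairs the sketch gives 62 of 66 and 57 of 66 identical positions, which round to the 94% and 86% reported in panel B.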
[Figure 7: phylogram. Tip labels, top to bottom: 1tox, 1aer, Hs15, Mm15, Hs16, Mm16, Hs17, Mm17, Hs05, Mm05, Hs06, Mm06, Ci, Hs04, Mm04, Hs03, Mm03, 3PAX, Hs01, Mm01, Hs02, Mm02, Hs09, Mm09, Hs10, Mm10, Hs07, Hs08, Mm08, Hs11, Mm11, Hs13, Mm13, Hs12, Mm12, Hs14, Mm14. Branch labels are support values in percent; scale bar 0.05.]

Figure 7
Phylogram of the evolutionary relationship of the pART family. Evolutionary relationships of the amino acid sequences in the catalytic core of the pARTs shown in Figure 6 are illustrated as a maximum a posteriori phylogram (MAP) of Bayesian Markov Chain Monte Carlo analysis (pP = 0.92). Posterior probabilities were converted into percentages and are shown above the branches. Members of the five pART family subgroups are color-coded as in Figure 2: subgroup 1 = red, 2 = pink, 3 = orange, 4 = green, 5 = grey. Hs = Homo sapiens, Mm = Mus musculus.
[Figure 8: schematic domain diagrams of human pARTs 1-17 and of pARTs from other species (Ce.pARTa, Ce.pARTb, Ce.pARTc, Dm.pARTa, Dm.pARTb, Ag.pARTc, At.pARTa, At.pARTb, At.pARTe, Gz.pARTa, Gz.pARTc, Eh.pARTf, Dd.pARTg, Ci.pART, Ba.pART). The icon legend printed beside the diagrams is listed below.]

designation                                     icon   size      accession #
catalytic domains
pART catalytic                                  -      180       pfam00644
truncated pART catalytic                        -      120       na
Ubiquitin-conjugating enzyme catalytic          UBCc   140       cd00195
nucleic acid binding domains
PARP-type zinc finger                           DBD    2x 90     pfam00645
SAF/Acinus/PIAS SAP-domain                      DBD    2x 35     pfam02037
CCCH-type zinc finger                           pRBD   4x 30     pfam00642
centriole-localization                          Cen    55        na
RNA-recognition motif                           RRM    70        pfam00076
protein-interaction domains
WGR-domain (tryp/gly/arg)                       WGR    85        pfam05406
PARP regulatory                                 -      135       pfam02877
sterile alpha motif                             SAM    65        cd00166
breast cancer suppressor protein C-terminal     BRCT   75        pfam00533
Appr-1" processing                              A1pp   135       smart00506
His-Pro-Ser region                              HPS    180       na
ankyrin repeats                                 -      20 x 30   cd00204
vault protein inter alpha trypsin               VIT    130       smart00609
von Willebrand factor type A                    vWFA   160       cd00198
major vault protein interacting                 MVPI   160       na
WWE-domain (trp/trp/glu)                        WWE    75        pfam02825
ubiquitin interaction motif                     UI     18        na
C3HC4 type zinc finger (RING finger)            RF     45        pfam00097
U-box                                           UB     75        pfam04564
in between RING fingers                         IBR    65        pfam01485

Figure 8
Schematic diagram of the domain structures of human pARTs and pARTs from distantly related organisms. Recognizable protein domains in the pART family are represented by the icons defined on the right. The domain structures of human pARTs (on the left, numbered Pacman icons) and related pARTs from other species are illustrated schematically. Potential DNA binding domains are boxed in red, potential ubiquitylation motifs are boxed in green. Members of the five pART family subgroups are grouped within colored boxes using the color-coding as in Figure 2: subgroup 1 = red, 2 = pink, 3 = orange, 4 = green, 5 = grey. Amino acids corresponding to the HYE catalytic site motif of DT and PARP-1 are shown in the mouths of the Pacman icons. Black numbers indicate protein lengths in number of amino acids. Species of origin is color-coded in the two letter abbreviation of the organisms as in Figures 4 and 9: Drosophila melanogaster (Dm) and Anopheles gambiae (Ag) purple, Caenorhabditis elegans (Ce), Dictyostelium discoideum (Dd), Entamoeba histolytica (Eh), and Gibberella zeae (Gz) blue, Arabidopsis thaliana (At) green, Chilo iridescent virus (Ci) and Bacteriophage Aeh (Ba) brown. Protein database accession numbers for the illustrated pARTs are listed in Figures 4 and 9. On the right, the approximate size of each domain is indicated in number of amino acid residues. The accession numbers of the respective domain families in the pfam, cd, and smart databases are indicated. In case of zinc finger (zf) containing domains, the number of recognizable zinc fingers is indicated by colored bars within the icon.
order in which PSI-BLAST picked up putative pART sequences from the database in successive iterations was similar for different members of a particular pART subgroup but differed markedly for members of different subgroups (see additional file 8: "Representative tiling paths of PSI-BLAST searches initiated with the catalytic domain
pART  protein   aliases      human      mouse      chicken    fish      fly         mosquito
1     PARP1     PARP         P09874     NP_031441  NP_990594  CAG09179  P35875_a    XP_312938_a
2     PARP2     -            Q9UGN5     NP_033762  -          CAF92030  -           -
3     PARP3     -            AAM95460   NP_663594  -          CAG06805  -           -
4     PARP4     vaultPARP    AAD47250   XP_283217  XP_417150  CAG08214  -           -
5     TNKS      Tankyrase    AAC79841   AAH57370   NP_989671  CAG12585  -           -
5/6   -         -            -          -          -          -         AAF56487_b  XP_321116_b
6     TNKS2     Tankyrase 2  NP_079511  XP_129246  NP_989672  CAG04910  -           -
7     PARP15    -            NP_689828  ---        -          -         -           -
8     PARP14    -            AAN08627   XP_488522  XP_422113  CAF98988  -           -
9     PARP9     BAL          NP_113646  NP_084529  XP_422116  CAG12587  -           -
10    PARP10    -            BAB55067   AAH24074   -          CAG05989  -           -
11    PARP11    -            AAF91391   NP_852067  XP_416489  CAG01913  -           -
12    ZC3HDC1   -            NP_073587  NP_766481  XP_416333  CAF98285  -           -
13    ZC3HAV1   ZAP          NP_064504  BAB32047   XP_423977  CAF96305  -           -
14    TIPARP    TiPARP       NP_056323  NP_849223  XP_422828  CAF96664  -           -
15    PARP16    -            AAH31074   NP_803411  XP_413903  CAG05566  -           XP_308419_c
16    PARP8     -            NP_078891  AAH21881   XP_424786  CAG05573  -           -
17    PARP6     -            CAB59261   XP_134863  -          CAF95416  -           -
pART protein nematode slime mold fungi amoeba weed viruses bacteria
CAD59237_a
1 PARP1 AAM27195 a
NP_850165_a 1AERA
CAD58666_c EAL47198_a AAB94432
2 PARP2
EAA75569_a CAA88288_b 760286A
CAD59238_d EAL50270_b AAQ17796
3 PARP3 Q09525 b
BAB09119_c AAW80252
CAD59240_e
4 PARP4
5 TNKS CAD59239_b
AAC04454_c
6 TNKS2 AA051129 f
7 PARP15
8 PARP14 AAS38928_g
9 PARP9
10 PARP10
11 PARP11 EAL43406_c
12 ZC3HDC1 EAL50270_d NP_849739_d
13 ZC3HAV1 EAL49071_e AAC36170_e
14 TIPARP EAL45174 f
15 PARP16
16 PARP8
EAA73885_c
17 PARP6
pARTs in distantly related species
Figure 9
pARTs in distantly related species. pART relatives were identified by PSI-BLAST searches as in Figure 4. Matching
sequences from other organisms were sorted by group on the basis of sequence similarity and associated domains. Accession
numbers are given for pARTs from Homo sapiens (human), Mus musculus (mouse), Gallus gallus (chicken), Tetraodon nigroviridis
(puffer fish), Drosophila melanogaster (fruit fly), Anopheles gambiae (malaria mosquito), Caenorhabditis elegans (nematode), Dictyostelium
discoideum (slime mold), Gibberella zeae (ear rot microfungus), Entamoeba histolytica (amoeba), Arabidopsis thaliana
(cress plant), Chilo iridescent virus and Bacteriophage Aeh1 (viruses), Pseudomonas aeruginosa, Corynebacterium diphtheriae and
Vibrio cholerae (bacteria). Lower case letters in black indicate the pART designations used in Figure 8.
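The lower-case suffixes attached to accession numbers in Figure 9 can be separated from the accession itself mechanically. The following is only a minimal sketch of that bookkeeping: the helper name `split_designation` is hypothetical, and the `Dm.pART` naming simply follows the labels used for the fruit fly in Figure 10. It is not part of the search procedure used in the study.

```python
def split_designation(label):
    """Split a Figure 9 style label such as 'P35875_a' into
    (accession, designation letter). The letter is None when the label
    carries no single lower-case suffix (e.g. 'NP_031441')."""
    head, sep, tail = label.rpartition("_")
    if sep and len(tail) == 1 and tail.islower():
        return head, tail
    return label, None


# The two fruit fly entries listed in Figure 9.
fly_labels = ["P35875_a", "AAF56487_b"]

# Build names in the style of Figure 10 (Dm.pARTa, Dm.pARTb).
fly_names = {
    accession: f"Dm.pART{letter}"
    for accession, letter in map(split_designation, fly_labels)
}
```

Splitting from the right matters because many accession numbers (for example `XP_312938`) already contain an underscore.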
amino acid sequences of selected pART family members"). In many instances, PSI-BLAST detected pART sequences from distantly related organisms in earlier iterations than the human pART paralogues from other subgroups.

Figure 9 summarizes the matches of pART-related proteins found in model organisms with completed genome sequences. On the basis of amino acid sequence similarity, conserved intron positions and/or conserved associated domains, pARTs from other vertebrates including fish and chicken, generally can be assigned to a particular human pART orthologue. In contrast, pARTs of lower eucaryotes can be assigned to a subgroup but not to a particular vertebrate pART.

pART homologues were found in many model organisms from the animal, plant, fungi, and protist kingdoms. The recently completed genome of the pufferfish T. nigroviridis contains recognizable orthologues for all pARTs except for
pART7 [43]. The nearly completed albeit still fragmentary chicken genome contains recognizable orthologues for all pARTs except for pARTs 2, 3, 7, 10, and 17 [44]. Simpler eucaryotes generally contain fewer pARTs (two in the fruit fly D. melanogaster, three each in the malaria mosquito A. gambiae, the nematode C. elegans, and the ascomycete G. zeae; six in the amoeba E. histolytica, nine in the slime mold D. discoideum, and ten in the cress plant A. thaliana).

Remarkably, the yeast S. cerevisiae and the archaea lack detectable pARTs. Only two matches were found in the viral proteome: these derive from two double stranded DNA viruses: the insect virus Chilo iridescent virus and the bacteriophage Aeh1. Although PSI-BLAST initially failed to connect the pART family with Diphtheria toxin and Pseudomonas exotoxin A, these toxins were readily connected with the eucaryotic pARTs when using as query a chimera, e.g. of Diphtheria toxin and Chilo iridescent virus pART in which the sequences of three of the conserved structure units highlighted in Figures 1 and 6A were interchanged. These searches uncovered a DT/ETA-like putative ADP-ribosyltransferase in V. cholerae, but no other proteins in the microbial proteome in GenBank.

Of note, none of the known R-S-E motif bacterial or vertebrate mARTs were ever connected by PSI-BLAST with the DT/ETA/pART group. In several cases, however, we observed intriguing matches just slightly below threshold (in the region surrounding the conserved H in β1) to members of the family of RNA:NAD 2' phosphotransferases. These enzymes catalyze a reaction during tRNA splicing that is similar to the reaction catalyzed by ARTs, but in which ADP-ribose is transferred to the 2'-phosphate in immature tRNA rather than to an amino acid residue in a protein [25]. The 3D-structure of a prototype member of this gene family, indeed, reveals a structure closely resembling that of PARP-1 and Diphtheria toxin (see Fig. 1), providing strong support for the relevance of the matches detected by PSI-BLAST.

For the pART homologues shown in Figure 9 we also analyzed the sequences flanking the pART catalytic domain for associated conserved domains. The results reveal that many pARTs, even from very distantly related organisms, share domain associations found in human and mouse pARTs. Some of these are illustrated in Figure 8. For example, the association of regulatory, BRCT, and DNA binding domains observed in pART1 (PARP-1) is found also in similar proteins encoded by fruit fly, nematode, microfungi and cress plant genomes. Tankyrase-like association with ankyrin repeats is found in pARTs from the fruit fly and nematode. The association of a pART catalytic domain with an A1pp domain, as seen in human pART subgroup 3, is found also in a pART from the slime mold Dictyostelium discoideum. The combination with a WWE domain, as seen in human pART subgroup 4, is found also in putative pARTs from cress plant. A domain corresponding to the unknown upstream region of the smallest human pART (pART15) is observed also in a pART from the malaria mosquito Anopheles gambiae, and a duplicated truncated pART catalytic domain as in pARTs 16 and 17 is observed also in a pART from the microfungus Gibberella zeae. These results indicate that many of the domain combinations observed in human and mouse pARTs represent evolutionarily ancient inventions.

Some pARTs of distantly related organisms are associated with domains not found in any of the human pARTs. A striking example is that of G. zeae pARTc, which most closely resembles human pARTs 16 and 17, but is associated with a second potential catalytic, ubiquitin ligase domain (Fig. 8). A similar pART is found also in the related microfungus Aspergillus nidulans [GenBank: EAA66581]. These microfungal pARTs are the only examples found so far, in addition to vertebrate pART4, where a distinct domain(s) is genetically fused to the C-terminal end of the pART catalytic domain. The large domain(s) associated with the putative pART from bacteriophage Aeh1 does not bear any resemblance to pART-associated domains in vertebrates but shows distant similarity to viral coat proteins. The only organism containing an isolated pART domain reminiscent of the isolated ART domain found in vertebrate mARTs [27] is the Chilo iridescent insect virus. This "naked" viral pART catalytic domain contains the H-Y-E motif of PARP-1 and DT. It will be interesting to determine whether this protein exhibits the predicted pART activity.

A striking example of domain shuffling is observed in one of the three C. elegans pARTs: like the human tankyrases (pARTs 5 and 6), Ce.pARTc contains ankyrin repeats, but also harbors the regulatory and WGR domains typical of human group 1 pARTs instead of the SAM domain found in human pARTs 5 and 6 (Fig. 8). A similar variation of domains as in Ce.pARTc is found also in one of the ten pARTs of D. discoideum (Dd.pARTb).

Finally, we addressed the question whether the striking differences in exon/intron compositions of the closest PARP-1-homologues in groups 1 and 2 might be reflected in similar differences in pART orthologues of distantly related species. To this end we determined the exon/intron structures of distant pART orthologues by BLASTn searches of the respective genome databases using cDNA sequences as queries, and compared the results with those obtained for human pART genes. The results are illustrated schematically in Figure 10, with conserved intron positions highlighted. As in the case of most other genes, the pART genes of 'lower' animals, protists, and plants in general contain fewer and shorter introns than the human
[Figure 10 graphic. Panel A: exon/intron maps with intron phases and the positions of the GR and H-Y-E codons for Hs.pART1, Hs.pART2, Hs.pART5, Dm.pARTa, Dm.pARTb, Ce.pARTa, Ce.pARTb, Ce.pARTc, At.pARTa, At.pARTb, and Ci.pART. Legend: utr, CDR, intron, conserved exon-intron boundaries; domains: Zn finger, SAP, Ankyrin, SAM, BRCT, WGR, pART rd, pART cd. Panel B: conserved introns (phase 0, phase 1, phase 2) in Dm.pARTa, Hs.pART1, Ce.pARTa, Dm.pARTb, Hs.pART5, At.pARTa, Hs.pART2, Ce.pARTb, and At.pARTb; domains: Ankyrin, WGR, pART rd, pART cd.]
Figure 10
Schematic diagram of the exon/intron structures of pART family members of distantly related organisms. A)
Exon/intron structures were determined by BLASTn searches of the genome browsers using the pART cDNA sequences. The
positions of codons corresponding to the H Y E motif in the NAD-binding crevice of diphtheria toxin are marked by yellow
circles. The position of the conserved glycine and arginine pair of residues within the WGR domain is marked in blue. Coding
regions for catalytic and other domains are indicated by colored bars. Conserved introns are marked by colored arrows. B)
The diagram contains only those introns that are conserved in at least two distantly related species. Color-coding of the
introns corresponds to that shown in A). The position of codons encoding/corresponding to the H, Y, E residues in the NAD
binding crevice are indicated by vertical lines. The position of each intron with respect to the codon is indicated by circles
(phase 0 introns), boxes (phase 1 introns), and triangles (phase 2 introns). Coding regions for catalytic and other selected
domains are indicated by colored lines as in A).
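The intron phases referred to in Figure 10 follow from the lengths of the coding portions of the exons: an intron that falls between two codons is phase 0, one that falls after the first nucleotide of a codon is phase 1, and one that falls after the second nucleotide is phase 2. The sketch below illustrates this arithmetic only. The function name and the exon lengths are hypothetical, and the lengths are assumed to count coding nucleotides starting at the first position of the start codon.

```python
from itertools import accumulate


def intron_phases(coding_exon_lengths):
    """Return the phase (0, 1, or 2) of each intron, given the lengths in
    nucleotides of the coding portions of consecutive exons."""
    # An intron follows every exon except the last one, so the running
    # total of coding nucleotides before each intron determines its phase.
    coding_before_intron = accumulate(coding_exon_lengths[:-1])
    return [total % 3 for total in coding_before_intron]


# Hypothetical gene with four coding exons (not taken from Figure 10).
example_exons = [91, 64, 100, 45]
example_phases = intron_phases(example_exons)
```

Two introns in different genes count as conserved in the sense of the figure only if they interrupt the aligned coding sequence at the same position and in the same phase.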
homologues. However, some of the introns found in human pART genes are found also in homologues of distantly related organisms. For example, all six introns observed in D. melanogaster pARTb are found in corresponding positions also in human pART5 (tankyrase 1); yet human pART5 contains 14 additional introns not found in the fruit fly pART. The other pART of the fruit fly shares two of its five introns with human pART1 (PARP-1). The three pARTs of the nematode C. elegans show a different, only partially overlapping set of conserved introns: Ce.pARTa shares seven of its nine introns with human pART1, Ce.pARTb shares three of its four introns with human pART2, whereas Ce.pARTc does not seem to share any of its introns with pART5, despite the similar domain organization on the protein level (see Fig. 8). The pARTs from the model plant Arabidopsis thaliana contain a fairly high number of introns; however, only very few intron positions correspond to ones found also in human pARTs. For example, At.pARTa, which is most closely related to human PARP-1 in terms of amino acid sequence similarity and organization of conserved protein domains, evidently does not share any of its 18 introns with human pART1. Strikingly, however, the introns found in the catalytic domain of this pART exhibit conserved positions with two different human pARTs: two of the four intron positions in the catalytic domain of At.pARTa are found in corresponding positions in human pART5 (tankyrase), another intron is found at a corresponding position in human pART2 (Fig. 10), whereas the fourth intron is not found in any human pART. At.pARTb, which is most closely related to human pART2 in terms of amino acid sequence similarity and domain organization, shares one of its 17 introns with human pART2. Note further that in only two cases (Chilo iridescent virus pART and pARTa of the fruit fly), the pART catalytic domain lacks introns, i.e. is encoded by a single exon as in the case of the vertebrate mARTs [27].

Discussion
The results of our study illustrate the great power and utility of the public genome databases and database search programs. Moreover, they provide important novel insights into the molecular structure and evolution of the pART gene family.

Our results differ in some details from those of a recent report by Ame and coworkers [11]. These discrepancies can be explained by errors in the draft sequence of the human genome available at the time of the previous report. For example, the database entry AK023746 given by Ame et al. for PARP-5c evidently represents a truncated cDNA for pART6 (alias tankyrase 2 or PARP-5b). This entry contains two point mutations and a 65 bp deletion in the 3' UTR vs. the cDNA and genomic sequences of pART6. BLAST analyses of the high quality sequence of the human genome and of the EST database with the AK023746 sequence provide no evidence for a distinct copy of this gene in the human genome. We conclude that the PARP-5c gene identified by Ame et al. represents an allelic variant or cloning/sequencing error rather than a genuine pART gene family member; i.e. that the total number of human pART genes is 17 rather than the 18 suggested in the previous report. Large discrepancies exist also in the number of amino acids assigned in the two reports for pART7/PARP-15 (444 vs. 989) and for pART16/PARP-8 (854 vs. 501). The earlier database entries for PARP-8 (XM_018395) and PARP-15 (XM_093336) have since been removed as a result of standard genome annotation processing because these entries evidently contained frameshift mutations and/or fused cDNA sequences that led to erroneous amino acid assignments. Similarly, the small differences in assignments for five other PARPs/pARTs can be accounted for by differences in the draft vs. high quality sequence of the human genome (Ame et al./our study): pART2/PARP2 (583/570), pART3/PARP3 (540/533), pART10/PARP10 (1020/1025), and pART14/PARP7 (657/680).

We assigned the 17 human pARTs into five distinct subgroups (Fig. 2). This assignment is supported by several independent lines of evidence: Firstly, members of a particular subgroup show higher amino acid sequence identities to one another than to members of other subgroups (Fig. 6). This is reflected in the tiling paths of PSI-BLAST searches, where members of the same subgroup were detected in the first iteration, whereas members of other subgroups generally were detected in later iterations (Fig. 4). Secondly, members of a particular subgroup typically share one or more associated domains not found in members of other subgroups (Fig. 8); pARTs 8, 10 and 15 pose exceptions to this rule. Thirdly, members of a particular subgroup typically share one or more intron positions not found in members of other subgroups (Fig. 3); pARTs 1-4 pose notable exceptions to this rule. Fourthly, when genes of two or more pARTs are physically linked in a cluster on the same chromosome, they belong to the same subgroup, possibly reflecting regional duplications (Fig. 2). Finally, results of all phylogenetic analyses converged in topologies with clearly distinct clades for each of the subgroups (Fig. 7). Members of subgroups 1 and 2 evidently are more closely related to one another than to other subgroups (Figs. 6 and 7). Similarly, members of subgroups 3 and 4 are sister-groups to one another, indicating a close relationship.

Members of the pART family are found fused to a striking variety of associated domains (Fig. 8). It is not farfetched to hypothesize that the associated domains direct the respective pARTs to subcellular structures and/or target proteins. Genetic fusion of group 1 and group 2 pARTs
with DNA-binding domains is in line with their established roles in DNA-repair, chromosome remodeling, and mitotic spindle formation [9,11,12]. Moreover, the SAM and ankyrin domains of pARTs 5 and 6 have been shown to mediate interactions with target proteins in telomere-associated protein complexes [45]. Similarly, the C-terminal domain of pART4 evidently plays a role in targeting pART4 to the major vault particles [46]. A flurry of domains implicated in the ubiquitination pathway points to a possible connection between ubiquitination and ADP-ribosylation. Indeed, it has recently been reported that ADP-ribosylation of TRF1 by tankyrase (pART5) results in the release of the protein from telomeres and its subsequent ubiquitination [47]. Strikingly, pARTs from the microfungi G. zeae and A. nidulans provide examples for the genetic fusion of two enzyme domains catalyzing these post-translational protein modifications into a single polypeptide.

So far, only a single example of a 'naked' pART catalytic domain akin to the isolated catalytic domain of the vertebrate ecto-ARTs 1-5 [27] was recovered from the public database. This putative pART from Chilo iridescent virus clusters with the mammalian pARTs of subgroup 1 (Fig. 7), suggesting that this large double stranded DNA virus [48] may have acquired its pART by horizontal gene transfer.

The definition of the pART catalytic domain proposed in this paper is somewhat smaller than that commonly used in the field [11]. We used the position of the common phase 0 intron upstream of the first conserved β sheet to set the N-terminal end of the catalytic domain (e.g. see Figs. 1 and 3B). The pARTs of subgroup 1 are extended N-terminally of this position by an alpha helical domain (Fig. 8) which is often included as part of the PARP-1 catalytic domain. However, since other pART family members lack this region, we propose to omit it from the proper pART catalytic domain. Moreover, this N-terminal delineation of the catalytic domain corresponds well to the N-terminus of the 'naked' pART of Chilo iridescent virus as well as to those of Diphtheria toxin and Pseudomonas exotoxin A after proteolytic processing of the signal sequence or translocation domain (Fig. 1).

With the exception of pART4, the group 1 pARTs are extended upstream of this helical region by another domain named after its conserved motif of tryptophan (W) - glycine (G) - arginine (R) residues. This WGR domain is found also in poly-A-polymerases; its function is unknown. Many group 1 pARTs from distantly related organisms, e.g. plants, insects, nematodes, and microfungi, also contain these two domains. Interestingly, in Drosophila melanogaster pARTa these three domains (WGR, helical, catalytic) are encoded by a single, large exon (Fig. 10). Human pARTs 5-17 lack the WGR and helical domains. However, pART5/6 (tankyrase)-like pARTs from C. elegans (Ce.pARTc) and D. discoideum (Dd.pARTb) contain the WGR and helical domains whereas a SAM domain is found at this position in human pARTs 5 and 6 (Fig. 8).

A puzzling finding is the lack of conservation of the classic H-Y-E motif found in the catalytic cores of PARP-1, PARP-2, Diphtheria toxin and Pseudomonas Exotoxin A (Fig. 1). This motif is conserved only in members of subgroups 1 and 2. All other human pARTs carry notable variations from this motif. In particular, all other pARTs carry a replacement of the glutamic acid residue in β5, i.e. the residue that was shown to be critical for the catalytic activities of DT, PARP-1 and many other pARTs and mARTs [6,7,20,21]. In six cases, this glutamic acid is replaced by an isoleucine residue, in two cases by leucine, and in one case each by threonine, valine, or tyrosine. Enzyme activity has been reported recently for two of the six pARTs that carry an H-Y-I motif instead of the H-Y-E motif (pARTs 10 and 14) [32,34]. Thus, it is not unlikely that the four other pARTs carrying the H-Y-I motif turn out to be active enzymes (pARTs 11, 12, 16, and 17). Mouse pART8 also carries an H-Y-I motif, whereas its human orthologue, like pART7, carries an H-Y-L variant motif. H-Y-I and H-Y-L variant motifs are also found in pARTs from the slime mold (Dd.pARTg) and amoeba (Eh.pARTf) (Fig. 8). Human pART15 carries an H-Y-Y variant motif, which is conserved in its orthologues from mouse and the malaria mosquito (Fig. 8). It will be interesting to determine whether and how site-directed mutagenesis of the H-Y-E motif in pARTs 1-6 to the variant motifs of pARTs 7-17, and vice versa, affects their enzyme activities. Moreover, it remains to be determined whether the most striking variation of the H-Y-E motif, to Q-Y-T in human and mouse pART9, is compatible with enzyme activity.

The results of our PSI-BLAST and PSIPRED analyses (Figs. 4, 5, 9 and additional files 3, 4, 5, 6, 7, 8) support the conclusions that the pART gene family described here and the mART gene family described in our previous study [27] constitute two distinct ART subfamilies, and further, that the family of tRNA:NAD 2'-phosphotransferases [24,25] constitutes a branch that is more closely related to the pART subfamily than to the mART subfamily. Our results illuminate the power and limits of PSI-BLAST searches: PSI-BLAST readily connected members of the pART subfamily in many different species, while DT, ETA and TpTs were found at or below the threshold. In contrast, PSI-BLAST searches never connected pART family members with members of the mART subfamily or vice versa. The results of PSI-BLAST searches, thus, are in accord with insights gained from the known 3D structures of representative ADP-ribosyltransferases (Fig. 1), i.e. that certain
conserved structural features clearly distinguish these two subfamilies. Is it possible that some of the pART gene family members described here actually possess mono-ADP-ribosyltransferase rather than poly-ADP-ribosyltransferase activity? Given the structural similarity to DT/ETA this is a possibility. Moreover, it cannot be excluded that some family members may have lost enzyme activity and have acquired a new function. In any case, the respective proteins clearly are more closely related to the pART than to the mART gene family, in line with the nomenclature proposed here. Have all ARTs encoded in the human genome been identified? A number of ADP-ribosylation reactions have been described in mammalian cells that cannot yet be accounted for by the ARTs identified in this study or our previous study, e.g. mono-ADP-ribosylation of actin, rho, glutamate dehydrogenase, and of the alpha and beta subunits of heterotrimeric G proteins [3,4,8]. Given the fact that the pART subfamily described here and the mART subfamily described in our previous study [27] could not be interconnected by PSI-BLAST, it remains an intriguing possibility that other ART subfamilies in the human genome still await identification.

Conclusion
The family of proteins containing a PARP-like catalytic domain consists of 17 members in the human and 16 in the mouse, rat, and pufferfish. The vertebrate pART family can be divided into five subgroups on the basis of sequence similarity, phylogenetic relationships, conserved intron positions, and patterns of genetically fused protein domains. The four members of group 1 and the two members of group 2 each contain a conserved triad of residues (H-Y-E motif) also observed in Diphtheria toxin and Pseudomonas exotoxin A. The eleven other pART proteins carry variants of this motif (six H-Y-I, two H-Y-L, and one each Q-Y-T, Y-Y-V, H-Y-Y). All human pARTs are multi-domain proteins in which the pART catalytic domain is associated in a Lego-like fashion with other putative protein-protein interaction, DNA binding and ubiquitination domains. In all but one case (pART4) the catalytic domain represents the C-terminal end of the multi-domain protein. Most of the domain associations observed in human pARTs appear to be very ancient inventions since they can be found also in insects, plants, microfungi, and amoeba.

Methods
Database searches
Protein databases were searched using PSI-BLAST [35]. Genome databases were searched using BLASTn and tBLASTn [49]. Tissue distributions of pART-ESTs were analyzed using Electronic Northern calculations at the GeneCard website [50].

Structure and sequence analyses
Amino acid sequence alignments were performed with T-Coffee [36]. Secondary structure predictions were performed with PSIPRED [37]. Threading of amino acid sequences onto known 3D structures in PDB was performed with GenTHREADER [37]. Sequence analyses were performed using DNA-Star software; 3D-images were prepared with PyMol [51] software.

Phylogenetic analyses
Phylogenetic analyses were applied to the 36 catalytic core amino acid sequences using the dataset in Figure 6. Phylogenetic analyses were performed on the computational cluster of the College of Biology and Agriculture at Brigham Young University by using maximum parsimony and Bayesian Markov chain Monte Carlo approaches (http://babeast.byu.edu). The topologies were reconstructed using equally weighted maximum parsimony (MP) analysis as implemented in PAUP* 4.0b10 [52], maximum likelihood (ML) with simultaneous adjustment of topology and branch length as implemented in PhyML [53], as well as Bayesian methods coupled with Markov Chain Monte Carlo inference (BMCMC, MrBayes) [54]. The best fit likelihood model for amino acid evolution was determined based on the lowest Akaike Information Criterion (AIC) or Bayesian Information Criterion (BIC) score as implemented in ProtTest 1.2.6 [53,55,56].

The MP analysis was run using 5000 random addition replicates and tree bisection-reconnection branch swapping. Nonparametric bootstrap values were calculated for MP and ML analyses (10,000/100 bootstrap replicates, 100/1 heuristic random addition replicates) to assess confidence in the resulting relationships. ML analysis was run implementing the RtREV+I+G+F model of amino acid evolution (AIC= 4907.73; -lnL= 2800). The a priori information obtained by ProtTest 1.2.6 was incorporated into the BMCMC analysis. Bayesian phylogeny estimation was achieved using random starting trees, run for 3 × 10^6 generations, with a sample frequency of 1000, and ten chains (nine heated, temperature= 0.2). Analyses were repeated three times to check for likelihood and parameter mixing and congruence. Likelihood scores were plotted against generation time to determine stationarity levels. Sample points before reaching stationarity were discarded as "burn-in". Repeated analyses were compared for convergence on the same posterior probability distributions [57]. The maximum a posteriori tree (MAP) is presented in this paper, showing posterior probabilities converted to percentages (pP%).
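The model selection step named under Phylogenetic analyses (choosing the substitution model with the lowest AIC or BIC score) can be illustrated with the standard textbook definitions of the two criteria. The sketch below is not ProtTest's implementation; the model labels, log-likelihoods, and parameter counts are hypothetical placeholders rather than values from this study.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 lnL (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_sites):
    """Bayesian Information Criterion: k ln(n) - 2 lnL (lower is better)."""
    return n_params * math.log(n_sites) - 2 * log_likelihood


# Hypothetical candidates: label -> (lnL, number of free parameters).
candidates = {
    "model_A": (-2810.0, 70),
    "model_B": (-2800.0, 72),
    "model_C": (-2799.5, 91),
}

# Pick the candidate with the lowest AIC score.
best_by_aic = min(candidates, key=lambda name: aic(*candidates[name]))
```

Both criteria penalize extra parameters, so a model with a slightly better likelihood (here `model_C`) can still lose to a simpler one.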
Abbreviations used
ART = ADP-Ribosyltransferase, BLAST = basic local alignment search tool, 3MB = 3-methoxybenzamide, NAD =
nicotinamide adenine dinucleotide, PDB = Protein Data Bank.

Authors' contributions
This study was initiated in the summer of 1997 while FKN was a visiting scientist in FB's lab at DNAX. Initial database searches were performed by FKN and FB, later searches by HO, PAR, and FKN. KD performed the phylogenetic analyses. FKN supervised the study with essential contributions by FH. HO prepared the figures and FKN wrote the paper. The results represent the partial fulfillment of the requirements for the graduate thesis of HO.

Additional material

Additional File 1
Representation of pART gene transcripts in the database of expressed sequence tags
The public EST database was screened for ESTs encoding pARTs using tBLASTn and the amino acid sequences of the catalytic domain of known pART family members as queries at the dates indicated on top. Accession numbers of the corresponding Unigene clusters are indicated. Blank fields indicate lack of detectable ESTs encoding the respective pART catalytic domain. Tissue distribution analyses were performed for each cluster by "electronic Northern" analyses. For each family member, the two tissues with the highest numbers of ESTs are indicated. Tissue abbreviations: BMR bone marrow, BRN brain, HRT heart, MSL muscle, PNC pancreas, PST prostate, KDN kidney, LNG lung, LVR liver, LYN lymph node, SPC spinal cord, SPL spleen, TMS thymus, UTR uterus.
[http://www.biomedcentral.com/content/supplementary/1471-2164-6-139-S1.pdf]

Additional File 2
Schematic illustration of the local human and mouse chromosomal environments of the pART subgroup 3 gene cluster
The figure schematically illustrates the local chromosomal environment of the syntenic cluster of pART genes and neighboring genes on human chromosome 3q (top) and mouse chromosome 16B3 (bottom). The order and orientation of all genes in the depicted cluster is conserved. Known transcripts in GenBank are indicated schematically with their respective accession number. Exons are indicated by boxes. The direction of transcription is marked by arrows. Grey vertical bars correspond to a scale of 10,000 base pairs. The figure was modified from the respective online UCSC human and mouse genome browsers http://genome.ucsc.edu.
[http://www.biomedcentral.com/content/supplementary/1471-2164-6-139-S2.pdf]

Additional File 3
Multiple amino acid sequence alignments, secondary structure predictions, and threading results for pART subgroup 1
A multiple sequence alignment was generated for the catalytic domains of pARTs 1-4 with T-Coffee. Each residue in the sequence is reported as a single letter code. Secondary structure units in the 3D structures of chicken PARP-1 (1a26) and mouse PARP-2 (1GS0) are indicated on top of the alignment. Positions with identical residues in all sequences are marked by asterisks, similarities are marked with colons and periods below the alignment. Residues corresponding to the H-Y-E motif in the NAD binding crevice of diphtheria toxin are marked in red. Intron positions are projected onto the multiple alignment and are marked in grey (phase 0), blue (phase 1), and yellow (phase 2). Secondary structure predictions were generated for human pART1 with PSIPRED and are indicated in blue below the alignment (pr1); the confidence of the prediction is indicated in orange (highest confidence = 9). Secondary structure units are abbreviated as follows: H = helix; B = residue in isolated beta bridge; E = extended beta strand; G = 3-10 helix; I = pi helix; T = hydrogen bonded turn; S = bend.
[http://www.biomedcentral.com/content/supplementary/1471-2164-6-139-S3.pdf]

Additional File 4
Multiple amino acid sequence alignments, secondary structure predictions, and threading results for pART subgroup 2
A multiple sequence alignment was generated for the catalytic domains of pARTs 5 and 6 with T-Coffee. Residues, identities, intron positions, and secondary structure units are marked as in additional file 3. Indicated secondary structure predictions were generated for human pART5 (pr5) with PSIPRED.
[http://www.biomedcentral.com/content/supplementary/1471-2164-6-139-S4.pdf]

Additional File 5
Multiple amino acid sequence alignments, secondary structure predictions, and threading results for pART subgroup 3
A multiple sequence alignment was generated for the catalytic domains of pARTs 7-10 with T-Coffee. Residues, identities, intron positions, and secondary structure units are marked as in additional file 3. Indicated secondary structure predictions were generated for human pART7 (pr7) with PSIPRED.
[http://www.biomedcentral.com/content/supplementary/1471-2164-6-139-S5.pdf]

Additional File 6
Multiple amino acid sequence alignments, secondary structure predictions, and threading results for pART subgroup 4
A multiple sequence alignment was generated for the catalytic domains of pARTs 11-14 with T-Coffee. Residues, identities, intron positions, and secondary structure units are marked as in additional file 3. Indicated secondary structure predictions were generated for human pART11 (pr11) with PSIPRED.
[http://www.biomedcentral.com/content/supplementary/1471-2164-6-139-S6.pdf]
Additional File 7
Multiple amino acid sequence alignments, secondary structure
predictions, and threading results for pART subgroup 5. A multiple
sequence alignment was generated for the catalytic domains of pARTs
15-17 with T-Coffee. Residues, identities, intron positions, and
secondary structure units are marked as in additional file 3. Indicated
secondary structure predictions were generated for human pART15 (pr15)
and for human pART16 (pr16) with PSIPRED.
[http://www.biomedcentral.com/content/supplementary/1471-2164-6-139-S7.pdf]
Additional File 8
Representative tiling paths of PSI-BLAST searches initiated with the
catalytic domain amino acid sequences of selected pART family members.
PSI-BLAST searches were initiated with the query sequences indicated on
top at a threshold setting for the expect value of 0.005 as in Figure 4.
pART subgroups are color coded as in Figure 2. Matching sequences from
the slime mold (D. discoideum, blue) and from a model plant (A.
thaliana, green) are indicated at the iteration in which they first
appeared above threshold. The respective pART homologues from these
species were arbitrarily numbered (pARTa-j) in the order in which they
were detected in the search that was initiated with human pART1
(PARP-1). Protein data base accession numbers are listed in Figure 9.
pARTs indicated in black include short possibly truncated coding
sequences of pART homologues that could not be assigned to a particular
subgroup with certainty.
[http://www.biomedcentral.com/content/supplementary/1471-2164-6-139-S8.pdf]
Acknowledgements
This work was supported by grant No310/3 from the Deutsche
Forschungsgemeinschaft to FKN. HO was a grantee of the Studienstiftung
des Deutschen Volkes. KD is funded by the NSF grants DEB-0120718 and
DEB-9983195. DNAX is fully funded by the Schering Corporation. We thank
Sahil Adriouch, Bernhard Fleischer, Stefan Kernstock, and Stefan
Rothenburg (University Hospital Hamburg) for critical reading of the
manuscript.

