Tracking the evolution of substrate specificity of ABC transporters
Introduction
The ATP-binding cassette (ABC) transporters are present from bacteria to man and translocate a wide variety of substrates across all cell membranes, often against a concentration gradient, using energy from ATP hydrolysis. Besides their best-known function of exporting structurally unrelated drugs, which is responsible for multidrug resistance (MDR) phenomena, they are also implicated in maintenance of mitochondrial function, maturation of cytosolic Fe/S proteins, pheromone secretion, stress response, lipid bilayer homeostasis and lipid uptake [ABC Proteins: From Bacteria to Man]. They are built from combinations of conserved units: hydrophilic ATP-binding ABC domains with a set of conserved sequence motifs, and more divergent transmembrane domains (TMDs) formed by 6-11 alpha-helices. The primary sequence of the ABC domain is highly conserved and displays a typical phosphate-binding loop (Walker A) and a magnesium-binding site (Walker B). Besides these two regions, other conserved motifs are present in the ABC domain: the switch region, which contains a histidine loop (H-loop) postulated to polarize the attacking water molecule for hydrolysis; the ABC signature motif (LSGGQ), specific to ABC transporters; the Q-loop (between Walker A and the signature), which interacts with the γ-phosphate through a water bond; the D-loop, possibly involved in ABC-ABC communication and Mg2+ binding; and the A-loop, an aromatic residue interacting with the adenine ring of ATP [1]. The Walker A, Walker B, Q-loop and switch region form the nucleotide-binding site [PROSITE documentation PDOC00185, http://au.expasy.org/prosite/PDOC00185].
The eukaryotic ABC transporters are organized either as full transporters composed of two TMDs and two ABC domains, or as half transporters, which must homo- or heterodimerize to form a functional transporter. TMDs contribute to the substrate translocation events, i.e. recognition, translocation and release, while ABC domains energize the transport. It is thought that during each transport cycle a transporter undergoes significant structural movements evoked by substrate binding and ATP hydrolysis, which are required for substrate efflux. Phylogenetic analysis suggests the division of eukaryotic transporters into seven subfamilies, with an eighth discovered in the fruit fly genome [11441126].
The most variable is the B subfamily: it comprises 11 members, including both full transporters and half transporters that form functional homo- and heterodimers. The common minimal active unit of an ABCB transporter consists of two topologically symmetrical subunits, each composed of six N-terminal helices followed by an ABC domain, as seen in the recently reported X-ray structure of mouse Abcb1a (P-gp) [19325113]. With distinct tissue expression patterns and different membrane localizations (i.e. plasma membrane, endoplasmic reticulum, lysosomal and mitochondrial), ABCB transporters perform various functions in eukaryotic cells (Table 1). They differ both qualitatively and quantitatively in their ranges of substrate specificity, transporting drugs, lipids, iron, bile salts, or specific peptides.
The structural features that determine substrate selectivity in ABCB transporters are still under debate and, despite the structural similarity of these proteins, are by no means universal. Owing to considerable clinical interest, the best-characterized ABCB transporter is the multidrug pump ABCB1. In this case it is thought that drugs enter the TMD pathway from the membrane bilayer, via small openings formed by four helices [19961542]. The crystal structure of mouse P-gp confirmed a hydrophobic pocket containing the substrate-binding sites, supported by an additional area with polar amino acids [19325113], which agrees with the hypothesis of modular recognition of drugs by hydrogen-bond donors [16503657]. In contrast to the lipophilic ABCB1 substrates, the heterodimeric TAP complex transports a pool of well-defined hydrophilic peptides [9256420]. Cross-linking studies mapped the cytosolic part of the TMDs in the TAP complex as the dominant peptide-interacting regions [8955196, 17164240]. Therefore, the mechanism of substrate recognition and selection seems not to be conserved even among ABCB paralogs.
ABCB paralogs likely evolved through a series of mammalian-specific gene duplication events [15003118]. Several models of molecular evolution have been proposed to explain the preservation of duplicate genes [15831095]. They differ in the patterns of sequence evolution following gene duplication. Neofunctionalization, an adaptive process in which a duplicated gene acquires a new function, assumes that the nonsynonymous substitution rate increases after gene duplication because of positive Darwinian selection, whereas the subfunctionalization or duplication-degeneration-complementation (DDC) models predict an increase of the substitution rate because of relaxed purifying selection [10629003]. In the latter process the usual mechanism of preserving the duplicate gene involves partitioning of the ancestral expression pattern rather than of biochemical functions or acquisition of new functions [10101175]. In general, functional divergence may include a variety of evolutionary processes, such as relaxation of selective constraints, neutral evolution or positive selection. Positive selection acting on a small number of sites may be an important factor in retaining duplicated genes, promoting the acquisition of a novel or more specialized function. There are numerous examples confirming the role of positive selection in the functional evolution of duplicated genes within various gene families [15761060] [12871908] [11586358] [12832642]. Since ABCB proteins are able to transport an unusually broad range of substrates, positively selected sites might play a role in shaping the mechanism of substrate recognition.
To gain more insight into their molecular evolution we performed a comprehensive bioinformatics study based on a set of sequences from fully sequenced mammalian genomes. We predicted the evolutionary determinants of functional divergence after key duplication events by integrating comparative sequence and structure analysis. Using a similar approach, Jordan et al. successfully identified a number of residues likely to be involved in the evolutionary transition from transporter to channel function, which occurred in the C branch of ABC transporters [19020075]. Studies on G protein-coupled receptors showed that evolutionary importance may be a strong predictor of specificity determinants, and that specificity of responsiveness to different drugs is determined by residues that probably do not contact the ligand [20385837]. In our study we identify amino acids that may be important for functional divergence in the ABCB subfamily and predict their influence on the mechanism of substrate selection and transport, with reference to crystal structure models.
Methods
Sequence datasets
Genomic coding regions of human and mouse ABCB transporters were found by keyword searching in the Consensus CDS Database, maintained by the National Center for Biotechnology Information (NCBI), http://www.ncbi.nlm.nih.gov/CCDS/, and downloaded from the Ensembl database, http://www.ensembl.org/. Orthologous DNA sequences were retrieved by mining predefined Ensembl ortholog datasets using human sequences as queries. Each Ensembl orthologous group consists of genes related by a speciation event defined by the topology of the reconstructed phylogenetic tree [19029536]. Datasets were supplemented with sequences from PSI-Blast and TBlastN searches against the non-redundant protein databases at NCBI (http://www.ncbi.nlm.nih.gov/BLAST/) and UniProtKB (http://www.uniprot.org/blast/), and genomic databases at Ensembl (http://www.ensembl.org/Multi/blastview/) and NCBI. Partial and frameshifted sequences, as well as those from duplicated database submissions, were discarded. In each remaining case the alternative transcript most similar to the longest human transcript was chosen and annotated by Ensembl and GenBank. Truncated sequences with their neighboring regions were used to search for coding exons with the FGENESH+ program at the Softberry website (http://www.softberry.com/). The mammalian genomes used in the study were as follows: human, Homo sapiens v60.37e; chimpanzee, Pan troglodytes v60.21o; macaque, Macaca mulatta v60.10o; mouse, Mus musculus v60.37m; rat, Rattus norvegicus v60.34b; dog, Canis familiaris v60.2p; cow, Bos taurus v60.4i; pig, Sus scrofa v60.9d; horse, Equus caballus v60.2g; and opossum, Monodelphis domestica v60.5l.
Phylogenetic analysis
Multiple sequence alignments of the collected translated DNA sequences in each orthologous group were performed using ClustalW v.2.0.10 [17846036] (gap opening penalty = 10, gap extension penalty = 1, BLOSUM62 substitution matrix) and manually refined in SeaView [19854763]. Because the main object of our analysis was functional evolution within the common core domains, we split the sequences of full transporters (ABCB1, ABCB4, ABCB5, ABCB11) into halves according to the predicted topology of the human proteins, and disregarded the regions of the remaining transporters that correspond to additional functions. Accordingly, we removed the N-terminal domains of TAP1 and TAP2 required for tapasin binding [14679198] (residues 1-184 and 1-98, respectively), the N-terminal residues of ABCB6 (1-220) and ABCB9 (1-166) that target the proteins to the endoplasmic reticulum (ER) [18279659, 20377823], and the mitochondrial targeting signals of ABCB7 (1-43), ABCB8 (1-55) and ABCB10 (1-122) [9883897, 9878413, 15215243].
Phylogenetic trees were constructed with two distance-based methods and one maximum-likelihood method. We used the NJ [3447015] and BioNJ [9254330] algorithms with Kimura distance correction, and PhyML [14530136]. For PhyML we applied the Jones-Taylor-Thornton (JTT) amino acid substitution matrix, an eight-category discrete gamma model of among-site rate variation, an optimized fraction of invariable sites and the nearest-neighbor interchange tree search algorithm. All tree inferences were followed by bootstrap resampling with 100 replicates. Trees were visualized using Archaeopteryx [19860910]. In further analysis we used the topology and branch lengths obtained with PhyML.
To reconstruct ancestral amino acid sequences, FastML version 2.02 [12176835] was run on the selected data sets using the neighbor-joining phylogenetic trees. A gamma distribution of substitution rates and the JTT substitution model were assumed.
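The trimming of N-terminal extensions described above can be sketched as follows. This is a minimal illustration, assuming protein sequences are held as plain Python strings; the names `N_TERMINAL_CUT` and `trim_core` are hypothetical, and only the residue boundaries are taken from the text.

```python
# Last residue (1-based) of the N-terminal region removed from each transporter.
# The boundaries are the ranges quoted in the text; the dictionary itself is a
# hypothetical helper introduced for illustration.
N_TERMINAL_CUT = {
    "TAP1": 184,
    "TAP2": 98,
    "ABCB6": 220,
    "ABCB9": 166,
    "ABCB7": 43,
    "ABCB8": 55,
    "ABCB10": 122,
}


def trim_core(name: str, sequence: str) -> str:
    """Return the sequence without its listed N-terminal extension.

    Transporters that are not listed are returned unchanged.
    """
    cut = N_TERMINAL_CUT.get(name, 0)
    if cut >= len(sequence) and cut > 0:
        raise ValueError(f"{name}: sequence is shorter than the region to remove")
    # Python slices are 0-based, so sequence[cut:] starts at residue cut + 1.
    return sequence[cut:]
```

For example, `trim_core("ABCB7", seq)` would drop residues 1-43 and keep the rest of `seq`.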
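As an illustration of a distance correction of this kind, the sketch below computes the observed proportion of differing residues between two aligned protein sequences and applies Kimura's approximation d = -ln(1 - p - 0.2 p^2). It is assumed here, not stated in the text, that this is the variant of the Kimura correction that was used; the function names are hypothetical.

```python
import math


def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing residues, skipping columns with a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def kimura_protein_distance(p: float) -> float:
    """Kimura's approximate correction for multiple substitutions (assumed variant)."""
    inner = 1.0 - p - 0.2 * p * p
    if inner <= 0.0:
        raise ValueError("sequences too divergent for this correction")
    return -math.log(inner)
```

The corrected distance grows faster than the observed proportion `p`, which compensates for multiple substitutions at the same site in divergent pairs.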
Maximum-likelihood inference of positive Darwinian selection
To test the possible influence of positive selection in shaping the evolution of the ABCB subfamily, we conducted codon-based maximum-likelihood analysis with CODEML from the PAML v. 4.4 package on selected subgroups. We excluded rodent Abcb1b transporters from the B1 test, since they are not able to confer multidrug resistance and their evolution may therefore have followed a different course. We tested the random-sites models M0, M1a, M2a, M7 and M8. The M0 model was used to estimate the ratio of nonsynonymous to synonymous substitution rates (ω), which indicates the average type of selection at the protein level. Deleterious, neutral and advantageous mutations are indicated by ω < 1, ω = 1 and ω > 1, respectively. To determine whether allowing for positive selection improves the fit of the models to the data, we performed likelihood ratio tests (LRTs) for positive selection, comparing the log-likelihood values obtained under M1a versus M2a and under M7 versus M8.
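The likelihood ratio test can be sketched in a few lines. The example below assumes a chi-square reference distribution with two degrees of freedom for both model pairs, for which the upper-tail probability has the closed form exp(-x/2); the function names and the log-likelihood values in the usage lines are hypothetical and are not results of this study.

```python
import math


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Twice the log-likelihood difference between nested models."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_pvalue_df2(statistic: float) -> float:
    """Upper-tail chi-square probability for df = 2, which equals exp(-x / 2)."""
    if statistic <= 0.0:
        return 1.0
    return math.exp(-statistic / 2.0)


# Hypothetical log-likelihoods for a null model (e.g. M7) and an
# alternative model that allows positive selection (e.g. M8).
stat = lrt_statistic(lnl_null=-12345.6, lnl_alt=-12335.6)
print(f"LRT = {stat:.2f}, p = {chi2_pvalue_df2(stat):.2e}")
```

Under this assumption a statistic above about 9.21 corresponds to p < 0.01.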
Site-specific functional divergence
It is well recognized that gene duplication supplies raw genetic material for functional innovation. The evolutionary determinants of functional divergence after key duplication events were evaluated with the DIVERGE2 program [11934757]. Both site-specific shifts of amino acid property (type-II functional divergence) and of evolutionary rate (type-I functional divergence) were detected [16864604]. While type-I sites are typically highly conserved in one subset of homologs and highly variable in another, type-II functional divergence results in the fixation of amino acids mutated after the duplication event [16864604]. The program computed posterior probabilities of a site being involved in type-I functional divergence. The strength of type-II functional divergence was measured by the coefficient of functional divergence theta (posterior ratio RII), where a theta value significantly greater than zero indicates functional divergence.
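One simple way to judge whether a theta estimate is greater than zero is to compare it with its standard error. The sketch below is an illustrative approximation only, not the test implemented in DIVERGE2: it assumes a normal approximation for the estimate, and the function names are hypothetical. The input values are the theta and SE estimates reported in Table 3 for the B1-N versus B4-N comparison.

```python
import math


def theta_z_score(theta: float, se: float) -> float:
    """Ratio of the estimate to its standard error."""
    if se <= 0.0:
        raise ValueError("standard error must be positive")
    return theta / se


def one_sided_p(z: float) -> float:
    """Upper-tail probability of a standard normal variable (assumed approximation)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Estimates for B1-N vs. B4-N as listed in Table 3.
z_type1 = theta_z_score(0.341600, 0.088195)
z_type2 = theta_z_score(0.039704, 0.029548)
print(f"type I:  z = {z_type1:.2f}, p = {one_sided_p(z_type1):.1e}")
print(f"type II: z = {z_type2:.2f}, p = {one_sided_p(z_type2):.2f}")
```

With these inputs the type-I estimate lies several standard errors above zero, whereas the type-II estimate does not, which matches the qualitative pattern described for the full transporters.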
Homology modeling of human protein structures
Three-dimensional structures were modeled on the basis of the structure of mouse Abcb1a in the apo form (PDB ID: 3G5U). Its secondary structure was assigned according to the DSSP database [6667333]. Secondary structures of the human transporters were predicted with PSIPRED v3.0, http://bioinf.cs.ucl.ac.uk/psipred/, and used to guide manual refinement of the previous sequence alignment. Comparative models were constructed using MODELLER 9v8 [18429317] and displayed in PyMOL.
Results
Phylogeny of the mammalian ABCB transporters
By exhaustive keyword and homology searches we identified 98 nonredundant cDNA sequences of the MDR/TAP subfamily of ABC transporters from 10 fully sequenced mammalian genomes (see Supplementary material for details). As expected, the complement of ABCB genes is the same in all the genomes, with the exception of a duplicated variant of Abcb1 in rodent genomes and identical copies of TAP genes in pig. In the further study we used sequences of the common, minimal functional TMD-ABC units. The maximum-likelihood phylogenetic tree (lnL = -46760), constructed from 131 aligned amino acid sequences corresponding to TMD-ABC units, is shown in Figure 1. As expected, the tree contains 14 paralogous clades, which fall into four major groups: full transporters (Pgp group, with B1, B4, B5, B11 halves), peptide-specific transporters (TAP group, with TAP1, TAP2 and B9), two mitochondrial homologs of yeast MDL (B8/10 group, with B8 and B10), and homologs of yeast ATM1 (B6/7 group, with B6 and B7). Trees reconstructed by the three algorithms have nearly the same topology. The major exception occurs at the B8/10 node, which has a bootstrap value of 0.75 in the PhyML tree. In the neighbor-joining trees B10 transporters are more closely related to the TAP clade and no longer form a single clade with B8. Moreover, both closest yeast homologs, MDL1 and MDL2, belong to the B10 clade (data not shown), leaving unanswered the question of the duplication event that preceded the divergence of B8 and B10 transporters. However, the presence of MDL1 and MDL2 in the NJ tree improves the bootstrap value of the B8/10 node to 87, which supports its statistical significance. A previous phylogenetic analysis of mitochondrial ABC transporters suggested that ATM1 from Agrobacterium species is a common ancestor of B8 and B10 [19248758], but we were not able to confirm this in our study.
Dominant negative selection
To detect the evolutionary patterns involved in the formation of the ABCB family, we used maximum-likelihood methods to estimate the ratio ω of nonsynonymous to synonymous substitution rates. We computed the parameters of likelihood ratio tests for each subgroup using site models that assume: uniform selective pressure among sites (M0, one-ratio), variable selective pressure without positive selection (M1a, neutral), variable selective pressure with positive selection (M2a, selection), discrete variable selective pressure among sites (M3, discrete), beta-distributed variable selective pressure (M7, beta) or beta-distributed pressure with an extra class of positively selected sites (M8, beta&ω). The log-likelihood values of the M1a-M2a and M7-M8 model pairs were compared using a likelihood ratio test (LRT) to test for positively selected sites (with ω > 1). Finally, sites evolving under positive selection ("ω+ sites") were identified using the Bayes empirical Bayes (BEB) approach, under the assumption of a beta-distributed site-specific ω ratio with one category of positively selected sites (model M8).
As expected, the ω parameter in the one-ratio model is much less than 1 in all paralog groups (from 0.048 in B9 to 0.330 in TAP1, i.e. ABCB2). This highlights the dominant role of purifying selection, which removes nonsynonymous mutations, in the evolution of ABCB genes. However, in most cases models M2a (selection) and M8 (beta&ω) fit the data significantly better than models M1a (neutral) and M7 (beta), which do not allow for sites under positive selection. This indicates a probable role of positive selection in shaping the functional diversification of ABCB genes, although the vast majority of sites (over 97% in each case) are characterized by ω ≤ 1. Where the tests contrasting M1a against M2a and M7 against M8 resulted in nearly identical log-likelihood scores, the amino acid changes appear to be neutral or under purifying selection. Interestingly, this is the case for the C-terminal half of B4 and the N-terminal half of B5, suggesting that after the B1-B4 and B1/4-B5 duplication events strong purifying constraints were maintained over the entire coding region in only one of the duplicated copies. For the remaining subgroups the parameters indicate small fractions of sites under positive selection (with ω+ from 1.69 to 12.16, according to model M8). Sites identified under the M8 model as positively selected with posterior probability > 0.95 are listed in Table 2.
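To see how a small class of positively selected sites coexists with a low overall ω, the M8 parameter estimates can be combined into an average ω across sites. The sketch below assumes the usual M8 parameterization, in which a proportion p0 of sites draws ω from a beta(p, q) distribution with mean p / (p + q) and the remaining proportion has a single ω above 1; the function name is hypothetical, and the result is a continuous approximation that ignores the discrete categories used in the actual computation. The input values are the B1 estimates listed in Table 2.

```python
def m8_mean_omega(p0: float, p: float, q: float, omega_s: float) -> float:
    """Average omega over sites under the assumed M8 parameterization."""
    if not 0.0 <= p0 <= 1.0:
        raise ValueError("p0 must be a proportion between 0 and 1")
    if p <= 0.0 or q <= 0.0:
        raise ValueError("beta shape parameters must be positive")
    beta_mean = p / (p + q)  # mean of the beta distribution on (0, 1)
    return p0 * beta_mean + (1.0 - p0) * omega_s


# B1 estimates from Table 2: p0, p, q and omega of the selected class.
mean_b1 = m8_mean_omega(p0=0.99680, p=0.01106, q=0.08694, omega_s=8.21300)
print(f"mean omega under M8 for B1: {mean_b1:.3f}")
```

Although the selected class has ω well above 1, it covers well under 1% of sites, so the site average stays far below 1.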
Divergence between subgroups
According to the theory of sequence-based functional divergence established by Gu, there are two types of site-specific functional divergence patterns [11264396]. Type-I divergence results in a site-specific shift of evolutionary rate and characterizes sites that are highly conserved in one subgroup of duplicated genes and variable in another, which arises from relaxation of purifying constraints on a given site. Type-II functional divergence, by contrast, arises from a shift of cluster-specific amino acid properties in the early stage after duplication and relates to sites that remain highly conserved. To evaluate the pattern of functional divergence after the duplications that gave rise to specific ABCB transporters, we computed the coefficients of type-i functional divergence, denoted by θi, and the corresponding site-specific posterior probabilities. The analysis detected sites in the protein sequences with changes in selective constraints.
Overall, the signals of type-I divergence after the duplications in the ABCB subfamily are stronger than those of type-II divergence for each analyzed duplication event (Table 3). Therefore, changes of site-specific evolutionary rates occurred more frequently through relaxation of selective constraints than through adaptive fixation of variants. However, the longer the evolutionary distance between duplicated clades and the faster the rate of substitutions (as seen after the B6-B7 and B8-B10 duplications, see Figure 1), the greater the influence of site-specific fixed amino acid changes. Thus, the differences in evolutionary patterns within specific subgroups may reflect their evolutionary age and level of functional diversification after duplication events. Within the Pgp group, where type-I divergence predominates, proteins transport similar classes of substrates but have different expression patterns. The full transporters also form the most recent, mammalian-specific group, which arose between 310 and 276 million years ago (Mya). The duplication events that gave rise to the specific full transporters left most amino acid sites unchanged.
In contrast, both B8 and B10 transporters localize to the mitochondrial membrane and are ubiquitously expressed in various mammalian tissues. Thus, the mechanism of retention of the duplicated gene copies must have included relatively many radical changes at the sequence level in the early stage after duplication. This is seen in the comparable coefficients of type-I and type-II functional divergence (θI=0.55, θII=0.42), as well as in the proportion of residues changed in the early stage after duplication. A similar pattern occurs in the case of B6 and B7 transporters (θI=0.73, θII=0.48), both of which have a mitochondrial history and separated from each other before the arthropod-chordate split (993 Mya).
Despite its mammalian-specific history, the duplication event that led to the heterodimeric TAP complex was followed by many radical substitutions and therefore resembles more ancient duplications, with a relatively high impact of type-II divergence (θII=0.29). TAP1 and TAP2 are significantly more variable than other ABCB proteins, with on average 1.32 and 0.84 amino acid substitutions per site, respectively. Such variability reflects their genomic location within the Class II MHC region and their involvement in the function of the immune system.
Divergence of full transporters is not substantially guided by differences in the hydrophobic pocket
The mammalian Pgp subgroup consists of four members that evolved through three successive duplication events, preceded by the common duplication that gave rise to the full-transporter topology (Figure 1). This topology is unique to eukaryotic transporters. Yeast ABC transporters with a tandemly duplicated (TMD-ABC)2 organization localize to the plasma membrane and export exogenous drugs or xenobiotics as well as endogenous toxic metabolites [16406363]. Both the structure and the evolution of mammalian ABCB full transporters are homogeneous. Each duplication event was followed by site-specific evolutionary rate shifts (type I) rather than by shifts of cluster-specific amino acid properties (type II). Each duplicated gene copy has evolved under strong negative selection, with small proportions of sites under relaxed constraints or positive selection.
The structural locations of the selected sites differ between paralogous clades. In B1 most of the sites with relaxed evolutionary constraints cluster in the cytoplasmic part of the transporter. This variable region contains three asparagine residues that are post-translationally glycosylated in human ABCB1 but preserved only in the chimpanzee transporter, as well as all three sites of type-I divergence after the B1-B4 duplication. Two sites with shifts of amino acid properties after the B1-B4 duplication (S880 and L884 in B1; A879 and K883 in B4) are located within TM10, a part of the entrance gate to the protein cavity. Notably, the B1-B4 duplication event was not followed by divergence of residues corresponding to the central drug-binding cavity of ABCB1. It is worth mentioning that the residues forming the cavity are not fully conserved across mammalian orthologs. Ten (G64, L65, I306, I340, A342, L762, T837, L975, V981, Q990) out of 35 drug-binding residues [19325113] are variable. This proportion (about 29%) exceeds the average fraction of variable sites across the B1 subgroup (about 20%, given that about 80% of sites are invariable). However, six of them (I306, I340, A342, L762, L975, Q990) differ only in the rodent Abcb1b isoform, which is not capable of conveying the multidrug resistance phenotype [1990275].
However, N-glycosylation does not affect drug transport in ABCB1 [8096511].
http://www.jbc.org/content/274/4/2344.full.pdf+html
Recent experiments demonstrate that the function of a protein can be much more affected by the mutation of positions far from the active site than predicted by this hypothesis, explaining the minimal success of structure-based rational design of proteins in directed evolution experiments [7]. For example, when redesigning the substrate specificity of aspartate aminotransferase (from Asp to Val), only 1 out of 17 mutated residues was located in the active site [8]. The distribution of solvent accessibility for these positions is similar to the distribution for the whole protein (data not shown), which suggests that surface positions are mainly involved in this functional shift. The new paradigm should rather be that "dispersed substitutions that act synergistically improve enzyme properties and function" [7]. In other words, any position in a protein could be important for the overall function.
Interestingly, the second mitochondrial ABC transporter with high sequence similarity to TAP, Mdl2p, does not affect peptide transport across the inner membrane, and its function has escaped discovery so far.
Tables
Group (a) | Subcellular location | Expression in human tissues (b) | Substrates (c) | Locus in human genome
B1 (F) | Plasma membrane | Well expressed; brain, adrenal gland, kidney, liver, small intestine, epithelia | Glucosylceramide, platelet-activating factor, steroid hormones / amphipathic molecules: chemotherapeutic drugs, cytotoxic agents, HIV-protease inhibitors, cyclic and linear peptides | 7q21.1
TAP1 (T) | ER membrane | At high level; ubiquitous | MHC-I peptides | 6p21.3
TAP2 (T) | ER membrane | At high level; ubiquitous | MHC-I peptides | 6p21.3
B4 (F) | Plasma membrane | Moderately expressed; liver, spleen, pituitary gland, kidney | Long-chain phospholipids / some amphipathic drugs | 7q21.1
B5 (H/F) | Plasma membrane | Well expressed; skin, testis | ? / doxorubicin, rhodamine 123, camptothecin, 5-FU [15205344] | 7p15.3
B6 (H) | ER, plasma membrane [18279659], mitochondria [17006453] | At very high level; ubiquitous | Porphyrins / porphyrin-related drugs [17661442], arsenic [21266531] | 2q36
B7 (H) | Mitochondria | At high level; ubiquitous | Iron | Xq12-q13
B8 (H) | Mitochondria | At very high level; ubiquitous | Heme, peptides / doxorubicin [21046154] [19147539] | 7q36
B9 (H) | Lysosomes | At high level; testis, brain | Peptides | 12q24
B10 (H) | Mitochondria | Well expressed; ubiquitous | Iron | 1q42
B11 (F) | Plasma membrane | Moderately expressed; liver | Bile acids and bile salts / paclitaxel | 2q24
Table 1. Summary of experimental data on mammalian ABCB transporters.
(a) F, full transporter; T, heterodimer; H, homodimer
(b) AceView, UniProtKB
(c) physiological / exogenous
Group | ω (a) | LRT (b) | Parameter estimates under M8 model (c) | PSS (d) | Sites with p > 70% (d)
B1 | 0.217 | 260.9 271.48 | p0= 0.99680 p= 0.01106 q= 0.08694 (p1= 0.00320) ω= 8.21300 | G9 A11 K12 K13 F17 L19 N20 V36 V53 L56 E74 N81 A82 L85 E86 D87 L88 M89 S90 N91 I92 T93 R95 S96 D97 I98 N99 G102 F103 F104 M105 N106 D110 R113 S119 R157 R210 G324 S327 Q418 M450 A631 A641 A650 R673 L705 I742 D743 P745 A761 F851 H918 H966 K967 L968 S970 L1017 E1024 T1036 K1093 L1096 R1103 A1128 Q1142 R1147 A1156 S1160 H1195 T1277 R1279 Q1280 | 127
B4 | 0.144 | 39.15 39.30 | p0= 0.99610 p= 0.03364 q= 0.38326 (p1= 0.00390) ω= 4.55798 | R13 T15 A17 I25 K30 V37 V42 S58 I62 S389 N397 Q420 I442 D449 T452 S615 T651 R652 Q668 M676 C677 D686 V716 A862 K1026 Q1141 H1162 Q1194 Q1246 | 116
B5 | 0.264 | 9.89 13.22 | p0= 0.99779 p= 0.09870 q= 0.39129 (p1= 0.00221) ω= 4.84035 | T639 S645 V651 A662 T666 Q667 A935 | 50
B11 | 0.247 | 83.76 97.81 | p0= 0.99210 p= 0.07478 q= 0.48296 (p1= 0.00790) ω= 3.60327 | K24 Y26 D41 G42 T57 Q76 T85 D92 Q101 R128 A151 Y269 L664 D668 D676 M677 S683 Y705 V707 E709 V715 Y721 S759 L771 T887 S895 L905 R928 Q931 M935 T965 S1008 S1062 T1066 S1093 S1104 V1259 | 112
TAP1 | 0.330 | LRT(M1a-M2a) = 100.5 (p < 0.01); LRT(M7-M8) = 123.3 (p < 0.01) | p0= 0.98250 p= 0.10038 q= 0.60834 (p1= 0.01750) ω= 3.24023 | (G77 S79 V86 K143 Q150 I177 S184) A205 G225 Q226 N231 P232 T244 S248 V304 N343 E361 N362 V385 E418 V470 S472 S496 Q516 I524 R527 P550 L553 L562 R639 Q656 T664 Y752 H762 R781 G783 A804 D805 | 80
TAP2 | 0.244 | LRT(M1a-M2a) = 34.0 (p < 0.01); LRT(M7-M8) = 41.9 (p < 0.01) | p0= 0.99056 p= 0.07830 q= 0.53495 (p1= 0.00944) ω= 3.38574 | (A17 Q33 F61 P91 P98) W113 C213 T258 N262 L266 S421 S445 V467 N479 S535 H584 Q590 N649 K679 R700 M702 D703 | 61
B6 | 0.139 | LRT(M1a-M2a) = 19.3 (p < 0.01); LRT(M7-M8) = 26.8 (p < 0.01) | p0= 0.99418 p= 0.04195 q= 0.53744 (p1= 0.00582) ω= 3.50956 | (M19 Q20 T35 A47 A57 L62 T81 R99 R135 P148 A178) V230 R231 S232 A234 Q236 T426 L577 R589 A660 D694 A711 T839 E841 | 35
B7 | 0.171 | LRT(M1a-M2a) = 60.3 (p < 0.01); LRT(M7-M8) = 59.5 (p < 0.01) | p0= 0.99530 p= 0.02914 q= 0.33140 (p1= 0.00470) ω= 5.77060 | (F18 R21 H23 S24 S33 V34 S35 S37 P39 W41) H44 A54 I57 P58 E59 S63 I64 R68 G72 Q76 F77 A81 L84 V86 D680 H698 S710 | 40
B8 | 0.183 | LRT(M1a-M2a) = 55.1 (p < 0.01); LRT(M7-M8) = 60.8 (p < 0.01) | p0= 0.99486 p= 0.06701 q= 0.63108 (p1= 0.00514) ω= 4.11432 | (P21 Y37 R38 S41) R65 W66 A77 H87 S105 H109 V111 V133 N176 Q334 A346 C437 S441 K448 E471 V513 R519 C661 D667 R669 W671 A701 E709 | 56
B9 | 0.048 | LRT(M1a-M2a) = 15.6 (p < 0.01); LRT(M7-M8) = 51.9 (p < 0.01) | p0= 0.99427 p= 0.05093 q= 1.80787 (p1= 0.00573) ω= 1.96381 | (M16 A128 G163 A166) V185 L392 R562 P748 L749 P759 E763 | 21
B10 | 0.195 | LRT(M1a-M2a) = 14.1 (p < 0.01); LRT(M7-M8) = 17.6 (p < 0.01) | p0= 0.97496 p= 0.07086 q= 0.77107 (p1= 0.02504) ω= 2.34684 | (S34 V36 G38 S39 S41 P42 F43 L46 R47 A49 R50 L51 W52 W60 V62 R67 W68 R69 S70 C72 R73 G75 A79 R81 G82 V83 L84 L92 R95 G96 S99 F105 P108 G109 P111 R112 L113 R115 A116 R117 G121 A124) P125 G126 P128 R129 L130 R132 R134 G138 A141 A142 W144 D147 W150 P154 G162 R175 Y182 D226 G234 I460 K491 P525 A564 R634 A755 | 76
Table 2. PAML results for positive selection tests.
(a) calculated under the M0 model (one ω ratio for all sites)
(b) LRT, likelihood ratio test; LRT(M7-M8) = 2(lnL(M8) - lnL(M7)); LRT(M1a-M2a) = 2(lnL(M2a) - lnL(M1a)); p, p-values from the χ2 test with df = 2, for the hypothesis of the model without positive selection
(c) p0, proportion of neutral and negatively selected sites; p and q, shape parameters of the beta distribution; p1, proportion of sites under positive selection; ω, ratio for sites under positive selection
(d) PSS, positively selected sites with posterior probabilities > 95% under the M8 model, mapped on the human sequences; the last column gives the number of sites with p > 70% under the M8 model. PSS of Pgp-N/C were mapped on human ABCB1, TAP/B9 on human ABCB9, TAP1/2 on human TAP1, B6/7 on ABCB6; in parentheses, those within signal peptides or additional helices
Cluster 1 (a) | Cluster 2 (a) | θ ± SE, type I (b) | θ ± SE, type II (b) | Number of sites with no / conservative / radical change in the early stage after duplication
B1-N (0.444) | B4-N (0.292) | 0.341600 ± 0.088195 | 0.039704 ± 0.029548 | 447/56/34
B1-C (0.581) | B4-C (0.289) | 0.344709 ± 0.084268 | 0.044088 ± 0.033577 | 443/51/43
B1/4-N (0.876) | B5-N (0.432) | 0.542562 ± 0.072713 | 0.203114 ± 0.039579 | 331/119/87
B1/4-C (1.012) | B5-C (0.480) | 0.626400 ± 0.067425 | 0.199461 ± 0.043277 | 332/115/90
B1/4/5-N (1.835) | B11-N (0.675) | 0.450400 ± 0.055071 | 0.066475 ± 0.059190 | 311/114/112
B1/4/5-C (1.819) | B11-C (0.678) | 0.342400 ± 0.057718 | 0.100522 ± 0.062636 | 292/133/112
B1/4/5/11-N (2.753) | B1/4/5/11-C (2.814) | 0.302400 ± 0.032077 | -0.038633 ± 0.104084 | 242/138/157
TAP1 (1.316) | TAP2 (0.836) | 0.420000 ± 0.050178 | 0.289459 ± 0.050178 | 249/136/152
TAP1/2 (2.848) | B9 (0.192) | 0.444000 ± 0.120793 | -0.019519 ± 0.082904 | 235/146/156
B6 (0.386) | B7 (0.242) | 0.725600 ± 0.084440 | 0.477113 ± 0.028943 | 232/151/154
B8 (0.374) | B10 (0.585) | 0.550016 ± 0.074127 | 0.425990 ± 0.033339 | 229/153/155
Table 3. The pairwise coefficients of type-I and type-II functional divergence in the ABCB subfamily.
(a) in parentheses, average number of substitutions per site
(b) θ, coefficient (posterior probability) of functional divergence
Group | Type I: min / max / median value | Type I: selected sites (a) | Type II: min / max / median value / cut-off | Type II: selected sites
B1 vs. B4 | 0.032 / 0.873 / 0.319 | K324 H966 K967 vs. S325 V974 N975 (3) | 0 / 3.664 / 0 / 2.0 | 737,795,826,S880,L884 (5)
 | | 53,60,79,123,149,153,169,172,197,210,228,257,316,321,339,365,389,399,400,412,423,434,479,491,607,685,686,690,691,696,697,699,711,714,715,719,739,743,746,747,749,751,761,765,766,782,784,796,797,802,803,810,811,813,819,824,831,843,851,854,855,857,858,892,899,901,916,917,930,933,945,946,947,948,949,952,954,956,957,974,980,988,990,993,995,1008,1009,1016,1023,1032,1035,1036,1060,1061,1073,1079,1083,1092,1101,1108,1122,1127,1136,1152,1154,1155,1175,1196,1203,1205,1211,1226,1246 (113) | | 55,58,69,73,106,107,108,117,121,125,131,133,146,151,167,184,196,200,212,225,231,240,252,261,274,287,294,296,297,300,303,313,327,333,335,337,338,371,380,411,415,425,450,460,480,498,510,521,525,547,571,593,594,617,698,706,778,814,825,836,868,908,932,951,973,979,1041,1064,1094,1104,1151,1239,1242 (73)
 | | 93,146,147,148,169,174,184,227,269,288,314,352,365,378,415,433,439,446,450,453,468,476,478,480,496,574,600,604,608,629,641,890,934,966,968,1027,1066,1079,1088,1108,1205,1208,1240,1285,1303 (45) | | 185,212,214,216,297,312,325,386,584,610,612,821,823,875,938,945,960,997,1038,1063,1083,1221 (22)
N vs. C | | B1: 36,48,52,117,166,184,188,210,238,246,249,275,284,287,289,300,312,319,329,376,393,428,440,464,478,494,523,592,594 (29) | | 143,244,273 (3)
B6 vs. B7 | | B6: 251,260,264,267,269,270,271,274,275,277,284,285,288,307,308,322,325,342,344,345,346,347,348,349,351,355,357,359,372,373,378,388,390,391,392,395,398,404,408,417,419,423,429,431,433,436,437,441,453,467,469,473,474,475,478,480,481,482,483,489,491,496,503,523,530,538,542,566,571,572,573,577,578,583,586,587,590,593,600,602,603,607,612,619,632,640,641,642,643,644,646,649,652,654,660,664,665,673,682,688,694,696,697,698,701,705,707,708,711,744,746,747,748,760,761,767,770,772,774,775,776,781,790,797,798,801,802,809,810,814,817,820 (132) | | 250,253,257,258,260,262,263,264,266,268,272,274,275,276,277,278,280,281,286,287,288,289,291,294,297,307,309,313,314,315,316,317,319,320,321,322,323,324,326,327,328,329,331,332,333,334,335,336,338,340,341,342,344,346,347,348,351,355,357,359,360,365,368,369,370,371,378,379,380,381,382,383,386,394,396,397,400,401,402,403,405,406,407,408,409,412,414,416,417,418,419,420,421,425,433,440,441,444,445,446,448,449,450,451,452,454,460,466,468,474,477,478,481,482,484,486,487,488,492,493,495,499,500,504,505,506,510,512,513,514,516,517,518,519,520,521,522,524,526,532,533,534,535,536,537,541,546,547,550,553,554,555,556,557,558,562,563,564,566,570,573,574,578,585,588,591,598,603,604,605,608,612,614,615,617,618,621,624,627,633,642,644,648,650,653,655,656,658,659,664,665,666,673,677,678,681,682,684,685,688,693,694,697,699,701,702,711,712,714,740,744,746,750,756,759,760,766,768,769,770,771,773,774,775,778,780,793,795,798,799,801,809,813,814,816,819 (236)
TAP1 vs. TAP2 |
|
TAP1: 232,250,256,266,283,288,304,343,362,371,385,395,398,403,472,483,486,489,513,540,629,678,680,682,686,687,762,764,782,790 (30) |
|
240,245,257,262,272,276,287,299,302,303,306,317,318,321,322,323,327,333,340,350,358,368,369,373,402,404,408,415,419,423,433,441,442,462,465,467,478,485,493,512,520,523,529,530,531,534,535,548,653,669,689,704,705,710,746,748,760,761,767,771,794,797 (62) |
TAP vs. B9 |
|
B9: 231,253,274,366,462,562,581 (7) |
|
- |
B8 vs. B10 |
|
B8: 117,129,133,139,140,171,172,173,182,193,194,202,208,210,216,217,218,220,272,276,280,282,287,298,339,350,352,390,406,433,441,446,448,454,469,475,477,500,513,562,564,565,567,627,658,685,687 (47) |
|
121,123,127,128,130,136,137,138,139,140,141,142,146,148,150,151,154,155,156,157,159,162,163,173,182,184,185,186,188,189,190,191,192,194,195,196,197,199,200,204,206,207,208,209,213,222,223,229,230,233,235,239,242,243,244,245,246,247,250,251,252,253,255,259,261,264,265,268,269,270,271,276,278,279,281,283,285,287,289,290,292,293,297,304,308,309,316,318,320,321,331,332,335,342,344,345,348,350,352,353,355,356,357,358,360,362,364,365,367,368,369,371,372,375,376,378,379,382,385,386,387,388,389,391,393,395,399,400,401,402,403,405,408,409,413,414,415,416,417,418,419,420,423,426,428,429,431,432,434,438,440,442,452,454,456,463,466,473,474,479,480,484,486,491,504,506,511,521,525,527,532,533,536,537,545,546,547,549,553,559,561,563,565,570,574,582,584,591,594,610,615,617,629,640,641,656,657,660,662,665,666,670,688 (203) |
a Cut-off 0.75; positions refer to the human proteins
Table 4. Sites selected as being of type-I/-II functional divergence after duplication events.
Mutation of the first leucine residue to arginine in the ABC signature of human ABCB1 resulted in a very low expression yield [9169612].
| A-loop | Walker A | Q-loop | ABC signature | Walker B | D-loop | H-loop |
B1 | F[SNH]YPS | GNSGCGK | VSQEP | LSGGQKQR | KILLLDEAT | SALD | IAHRL |
| FNYPT | GSSGCGK | VSQEP | LSGGQKQR | [HRQ]ILLLDEAT | SALD | IAHRL |
B4 | FSYP[SA] | G[NS]SGCGK | VSQEP | LSGGQKQR | KILLLDEAT | SALD | IAHRL |
| FNYPT | GSSGCGK | V[SL]QEP | LSGGQKQR* | [RQK][IV]LLLDEAT | SALD | IAHRL |
B5 | F[NS]YPS | G[PL][NS]GSGK | V[SR]QEP | MSGGQKQR | KILLLDEAT | SALD | [VI]AHRL |
| F[FSV]YP[CS] | GSSGCGK | VSQEP | LSGGQKQR | KILLLDEAT | SALD | V[AT]HRL |
B11 | FHYPS | G[SP]SG[AS]GK | V[SP]QEP | MSGGQKQR | KILLLDMAT | SALD | V[AS]HRL |
| F[TK]YPS | GSSGCGK | VSQEP | LSRG[EQ]KQR | KILLLDEAT | SALD | IAHRL |
B2 | FAYP[NS] | G[PR]NG[SA]GK | V[GRS]QEP | LSGGQRQA | [RCLY][VLI]LILDDAT | SALD | IT[QH]XL |
B3 | F[AS]YP[NSRY] | GPNGSGK | VGQEP | LAVGQKQ[RC] | RVLILDEAT | SALD | I[AT]HRL |
B6 | FSY[AT][DN] | GPSGAGK | VPQDT | LSGGEKQR | [DGH]I[IV]LLDEAT | SALD | [VI]AHRL |
B7 | FEYI[EA] | GGSGSGK | VPQD[AS] | LSGGEKQR | PVILYDEAT | SSLD | IAHRL |
B8 | F[AS]YP[NSRY] | GQSGGGK | ISQEP | LSGGQKQR | [TKA]VLILDEAT | SALD | IAHRL |
B9 | FTYRT | GGSGSGK | VSQEP | LSGGQKQR | PVLILDEAT | SALD | IAHRL |
B10 | F[AT]YPA | GPSG[SA]GK | VSQEP | LSGGQKQR | KILLLDEAT | SALD | IAHRL |
* Except for canine ABCB4
Table 5. Sequence motifs in ABCB subgroups.
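The bracketed notation in Table 5 maps directly onto regular-expression character classes, so a motif can be located in a sequence with a standard regex search. The sketch below is illustrative only: the function names and the test peptide are hypothetical, and treating `X` as "any residue" and dropping the `*` footnote marker are assumptions about the table's notation.

```python
import re

# Motifs copied from the first B1 row of Table 5.
B1_MOTIFS = {
    "A-loop": "F[SNH]YPS",
    "Walker A": "GNSGCGK",
    "ABC signature": "LSGGQKQR",
}


def motif_to_regex(motif):
    """Compile a table motif; 'X' is assumed to stand for any residue."""
    return re.compile(motif.rstrip("*").replace("X", "."))


def find_motif(motif, sequence):
    """Return the 0-based start of the first match, or None if absent."""
    match = motif_to_regex(motif).search(sequence)
    return match.start() if match else None


if __name__ == "__main__":
    # Hypothetical peptide built for this example, not a real ABCB1 fragment.
    peptide = "AAFNYPSAAGNSGCGKAALSGGQKQRAA"
    for name, motif in B1_MOTIFS.items():
        print(name, find_motif(motif, peptide))
```

The same helper can be pointed at any row of the table, which makes it easy to check which subgroup-specific variant of a motif a given sequence carries.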
Figures
Figure 1. Mid-point rooted maximum-likelihood tree of mammalian ABCB transporters. Nodes of duplication events are marked with bootstrap values less than 1.
Figure 2. Relationship between the evolutionary age of sites and the phenotypic effect of mutations in ABCB1. Sites are classified according to the most recent duplication event that gave rise to their fixation in the B1 group (e.g., “B5” means the duplication preceding the speciation of the B1/4 and B5 clades). “Variable” sites are not conserved across B1 orthologs.
Figure 3. Coefficients of type-I and type-II functional divergence after the B1-B4, B1/4-B5, B1/4/5-B11 and N-C duplications.
Figure 4. Site-specific evolution of the full transporters ABCB1, ABCB4, ABCB5 and ABCB11. Sites under positive selection (model M8, ω+ > 1, p > 0.95) are shown as red balls, sites of type-I functional divergence (θI > 0.75) after specific duplication events as grey balls, and sites of type-II functional divergence (θII > 0.75) as yellow balls.
Figure 5. Site-specific evolution of the ABCB8 and ABCB10 transporters. Sites divergent after the B8/B10 duplication event are mapped onto the human B8 model (type I, θI > 0.75, grey balls; type II, θII > 0.75, yellow balls). Sites under positive selection in B8 and B10 (p > 0.95) are marked with red balls.
Figure . Grey, sites at which mutations do not affect function;
.........
To estimate the timing of specific duplication events, we supplemented the phylogenetic tree with several non-mammalian sequences. The Pgp subfamily of full transporters originated from a eukaryotic ancestral gene, with the Ste6p gene as its yeast counterpart. The duplication that led to mammalian B1 and B4 occurred after the split of mammals from reptiles and before the most recent common mammalian ancestor, that is, between 310 and ~276 million years ago (Mya) [9582070]. Although the B1 and B4 transporters are adjacent in the chicken (Gallus gallus) genome, as they are in mammalian genomes, they do not cluster with the respective mammalian sequences, which suggests that they evolved independently. Because a B5 transporter is present in the frog (Xenopus tropicalis) genome, the duplication preceding the speciation of B5 transporters occurred earlier, before the separation of amphibians from mammals, i.e. more than ~360 Mya. However, the ancestral gene has been lost in some lineages, e.g. birds and reptiles. Among the full transporters, B11 has the most ancient history, being present in all vertebrate genomes studied.
Age of the ancestors of mammalian genes:
B6 (Bacteria) > B7, B10 (Eukaryota, with yeast counterparts) > B8 (Arthropoda, with fly) > B9 (Urochordata, with Ciona intestinalis) > B11 (with fish, Danio rerio) > B5 (land vertebrates, with frog) > B1, B2, B3, B4 (mammals)
We noticed that the B1/4 duplication event may be limited to mammalian sequences. Even though the chicken B1 and B4 transporters result from a common duplication event, as indicated by phylogenetic and genome-context analysis, they do not cluster with the respective mammalian transporters. This division indicates that a detailed analysis of the functional evolution of the subfamily B transporters should be limited to mammalian genomes.
Conservation of drug-binding site in mammalian ABCB1
B1/B4
Using posterior analysis, we predicted that, among a total of x aligned sites, y sites are critical for the mutual functional divergence, with the cut-off value Pi(S1|X) = 0.5 (when these sites are removed from the alignment, θ drops to 0).
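Selecting sites by a posterior-probability cut-off amounts to a simple filter over per-site values. The following minimal sketch illustrates the idea; the posterior values are invented for illustration, and `select_sites` is a hypothetical helper, not part of any functional-divergence software.

```python
def select_sites(posteriors, cutoff=0.5):
    """Return alignment positions whose posterior probability exceeds cutoff.

    posteriors: dict mapping alignment position -> P(S1|X) for that site.
    """
    return sorted(pos for pos, p in posteriors.items() if p > cutoff)


if __name__ == "__main__":
    # Made-up posterior probabilities for five alignment positions.
    example = {12: 0.10, 57: 0.62, 88: 0.49, 140: 0.91, 203: 0.50}
    print(select_sites(example))               # sites above the 0.5 cut-off
    print(select_sites(example, cutoff=0.75))  # stricter cut-off
```

Raising the cut-off trades sensitivity for specificity: fewer sites are reported, but each is more strongly supported.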
Because of the clinical importance of ABCB1, a large body of data exists on the functional outcomes of specific mutations in this protein.
We divided amino acid substitutions into three groups based on their phenotypic effects on drug transport: (1) no or little change compared with the wild-type protein; (2) an intermediate change in transport, affecting the specificity for individual substrates; (3) severely reduced function, including altered expression of the protein at the cell surface. We did not take the type of mutation into account (for instance, we did not distinguish between Phe-to-Ala and Phe-to-Tyr mutations). In addition, we included single nucleotide polymorphisms (SNPs) of the ABCB1 gene that result in so-called non-conserved amino acid changes (excluding VILM, KR and ST polymorphisms) and classified them in group (1). As a measure of the evolutionary age of an amino acid at a specific position, we took the duplication event that gave rise to its fixation. Sites that are variable among mammalian orthologs were classified as "unconserved". Thus, we were able to evaluate the relationship between the evolutionary age of sites and their functional significance, measured by the phenotypic effects of mutations.
Previous studies of ABC genes suggested that the evolutionary approach used may be generally useful for determining the functional consequences of genetic variation [16429166] and that a stringent definition of evolutionary conservation of residues, based on alignment of mammalian orthologs, is a strong predictor of fitness and hence protein function [12719533]. However, Dorfman et al., in their study of CFTR (ABCC7), concluded that current computational methods used to predict the molecular consequences of amino acid substitutions on the basis of evolutionary conservation or protein structure cannot reliably distinguish between disease-causing and neutral mutations in this gene [20059485].
that many disease-associated mutations involving ABC transporters may be due to disruption of domain-domain binding interactions.
As noted earlier, interface mutations can disrupt ABC transporter domain interactions in several ways: by interfering with ATP binding or hydrolysis, by destabilizing or preventing proper folding and association of the domains, or by interfering with the allosteric communication between domains that is suggested by the large conformational changes seen during the transport cycle.
Our results suggest that current evolution-based methods cannot clearly predict the determinants of functional divergence in ABCB transporters. Since these proteins differ markedly from enzymes, ... However, our findings can <?!>
| | unconserved | B1 vs. B4 | B1/4 vs. B5 | B1/4/5 vs. B11 | N vs. C | Pgp vs. TAP | Pgp/TAP vs. B6/B7 | conserved | ∑ |
1 | no or little change | 3 | 8 | 11 | 5 | 5 | 6 | 5 | 2 | 45 |
2 | change | 3 | 7 | 15 | 5 | 14 | 8 | 5 | 1 | 58 |
3 | severely reduced function | 0 | 3 | 5 | 6 | 6 | 7 | 2 | 13 | 42 |
∑ | | 6 | 18 | 31 | 16 | 25 | 21 | 12 | 16 | 145 |
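The counts in the table above can be cross-checked and summarised with a few lines of Python. The sketch below recomputes the marginal totals and the fraction of mutations in each evolutionary-age category that severely reduce function; the variable and function names are illustrative, and the per-category fraction is a descriptive summary chosen here, not a statistic reported in the study.

```python
# Counts copied from the table above: rows are phenotype classes,
# columns are the evolutionary-age categories of the mutated sites.
AGES = ["unconserved", "B1 vs. B4", "B1/4 vs. B5", "B1/4/5 vs. B11",
        "N vs. C", "Pgp vs. TAP", "Pgp/TAP vs. B6/B7", "conserved"]
COUNTS = {
    "no or little change":       [3, 8, 11, 5, 5, 6, 5, 2],
    "change":                    [3, 7, 15, 5, 14, 8, 5, 1],
    "severely reduced function": [0, 3, 5, 6, 6, 7, 2, 13],
}


def row_totals(counts):
    """Total number of mutations in each phenotype class."""
    return {label: sum(values) for label, values in counts.items()}


def column_totals(counts):
    """Total number of mutations in each evolutionary-age category."""
    return [sum(col) for col in zip(*counts.values())]


def severe_fraction(counts):
    """Fraction of mutations per age category that severely reduce function."""
    severe = counts["severely reduced function"]
    totals = column_totals(counts)
    return {age: s / t for age, s, t in zip(AGES, severe, totals)}


if __name__ == "__main__":
    print(row_totals(COUNTS))
    print(column_totals(COUNTS))
    for age, frac in severe_fraction(COUNTS).items():
        print(f"{age}: {frac:.2f}")
```

The two extremes illustrate the trend: 13 of the 16 mutations at fully conserved sites severely reduce function, whereas none of the 6 mutations at unconserved sites do.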
Bibliography
1. Zaitseva, J., et al., A structural analysis of asymmetry required for catalytic activity of an ABC-ATPase domain dimer. The EMBO journal, 2006. 25(14): p. 3432-43.