Cai QC et al / Acta Pharmacol Sin 2003 Oct; 24 (10): 1051-1059

Putative caveolin-binding sites in SARS-CoV proteins1

CAI Quan-Cai2,3,6, JIANG Qing-Wu2, ZHAO Gen-Ming2, GUO Qiang4, CAO Guang-Wen3, CHEN Teng5

2School of Public Health, Fudan University, Shanghai 200032; 3Department of Epidemiology, Faculty of Health Service, Second Military Medical University, Shanghai 200433; 4Faculty of Health Service, Second Military Medical University, Shanghai 200433; 5Changzheng Hospital, Second Military Medical University, Shanghai 200003, China

1 Project supported by the Key Programs of oppugning SARS from the Ministry of Education of China (No 10) and the special programs of oppugning SARS from the Shanghai Science and Technology Commission (No NK2003-002).

6 Correspondence to Dr CAI Quan-Cai. E-mail qccai@smmu.edu.cn

Received 2003-07-02 Accepted 2003-07-03

KEY WORDS severe acute respiratory syndrome (SARS); caveolin-binding motif; replicase 1AB; spike protein; M protein; bioinformatics; molecular modeling; molecular docking

ABSTRACT

AIM: To obtain information on protein-protein interactions between SARS-CoV proteins and caveolin-1, and to identify possible caveolin-binding sites in SARS-CoV proteins. METHODS: On the basis of three related caveolin-binding motifs, an amino acid motif search was used to predict possible caveolin-1-related interaction domains in the SARS-CoV proteins. Molecular modeling and docking simulation were then used to confirm the interaction between caveolin-1 and SARS-CoV proteins. RESULTS: Thirty-six caveolin-binding motifs in the SARS-CoV proteins were mapped using bioinformatics analysis tools. Molecular modeling and simulation confirmed 8 caveolin-binding sites, located in replicase 1AB, spike protein, orf3 protein, and M protein. CONCLUSION: Caveolin-1 may serve as a possible receptor of the SARS-CoV proteins, which may be associated with SARS-CoV infection, replication, assembly, and budding.

INTRODUCTION

The severe acute respiratory syndrome (SARS) associated coronavirus (SARS-CoV) has been recognized as the causative agent of SARS[1]. The genome sequence reveals that this coronavirus is only moderately related to other known coronaviruses, including two human coronaviruses, HCoV-OC43 and HCoV-229E. Phylogenetic analysis of the predicted viral proteins indicates that the virus does not closely resemble any of the three previously known groups of coronaviruses, and it is classified as a new group[2]. However, the SARS-CoV genome shares the same overall organization as other coronaviruses: almost all of the structural proteins present in previously known coronaviruses, such as spike glycoprotein (S), envelope protein (E), membrane protein (M), and nucleocapsid protein (N), have been identified in SARS-CoV in the same order[2,3].

Efficient entry of viruses into host cells and release of the viral genome are essential steps in the initiation of the infection cycle. Viruses have adapted to use various cell surface molecules as receptors, which often seem to direct the virus into the clathrin-dependent pathway[4]. The SARS-CoV S protein plays a very important role in virus entry, and may use aminopeptidase N (APN) as its cellular receptor, like two members of coronavirus serogroup I, human respiratory coronavirus (HCoV-229E) and porcine transmissible gastroenteritis virus (TGEV)[5]. However, other entry mechanisms may also mediate the endocytosis of viruses[6]. The present understanding of non-clathrin-coated endocytosis of viruses is mainly based on the entry mechanism of simian virus 40 (SV40) through caveolae[7-9].

Caveolae are caveolin-containing specific lipid invaginations in the plasma membrane. Caveolin, a major structural protein of caveolae, contains a scaffolding region that contributes to the binding of the protein to the plasma membrane. Three caveolin genes are expressed in mammals (designated caveolin-1, -2, and -3), and they code for five different isoforms of the protein[10]. Most tissues in the body express at least one of these isoforms. Caveolin-1 and -2 are usually co-expressed and assembled into hetero-oligomers in the ER and Golgi apparatus[11]. These oligomers mature into higher molecular weight complexes once they reach caveolae. Caveolins are involved in cholesterol trafficking, assembly of the plasma membrane, endocytosis of small molecules, and virus infection[12]. Caveolae also contain molecules that play pivotal roles in intracellular signal transduction[13]. In the present paper, we studied the interaction between SARS-CoV proteins and caveolin-1, not caveolin-2 or -3, for the following reasons: (1) Caveolin-1 and -2 are most abundantly expressed in adipocytes, endothelial cells, and fibroblastic cell types, whereas the expression of caveolin-3 is muscle-specific[14]. (2) Caveolin-1 shows tissue specificity for lung and muscle[15]. (3) Caveolin-2 may function as an "accessory protein" in conjunction with caveolin-1[14].

By what mechanism does caveolin-1 function? Perhaps the caveolin-1 scaffolding domain (D82-R101) recognizes a common sequence motif within caveolin-binding molecules. To investigate this possibility, Couet J et al used the caveolin-1 scaffolding domain as a receptor to select caveolin-binding peptide ligands from random peptide sequences displayed at the surface of bacteriophage[16]. Three related caveolin-binding motifs (ΦXΦXXXXΦ, ΦXXXXΦXXΦ, and ΦXΦXXXXΦXXΦ, where Φ is an aromatic amino acid, Trp, Phe, or Tyr, and X is any amino acid) were elucidated, and these motifs exist within most caveolae-associated proteins. Thus, caveolin-binding motifs mediate the interaction of caveolin-binding proteins with the scaffolding domain of caveolin-1. Considering the relationship between caveolin-1 and virus infection, we hypothesize that certain domain(s) in SARS-CoV proteins might have caveolin-binding sites that contain caveolin-binding motifs and interact with caveolin-1.

Generally, knowledge of the interaction between virus and host cell is critical for understanding the mechanisms of virus entry, replication, and assembly, and for designing small molecules for therapeutic intervention. Accordingly, to test the above hypothesis that SARS-CoV proteins interact with caveolin-1, an amino acid motif search was performed with a bioinformatics method, and the possible caveolin-binding sites were confirmed by molecular modeling and simulation methods.

MATERIALS AND METHODS

Materials The genomic and protein sequences of SARS-CoV were retrieved from GenBank at NCBI. The sequence of SARS coronavirus TOR2 (ACCESSION AY274119, submitted by the British Columbia Centre for Disease Control and National Microbiology Laboratory of Canada) was selected for analysis. The protein sequences listed in Tab 1 and the three caveolin-binding motifs listed in Tab 2 were used in this study. The protein sequence of caveolin-1 (Swissprot: accession Q03135) was prepared for further analysis.

Tab 1. SARS-CoV protein sequences used in this study.

Tab 2. Caveolin-binding motifs used in this study.

Caveolin-binding motifs mapping In this study, the search for caveolin-binding motifs in the SARS-CoV protein sequences was carried out with DNASIS MAX 1.0 software from Hitachi Software Engineering Co, Ltd. First, a database of the caveolin-binding motifs was created. Second, the amino acid motif search function of DNASIS MAX was used to predict caveolin-related interaction domains and binding motifs in the SARS-CoV proteins.
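The motif search itself can be approximated outside DNASIS MAX with regular expressions. The following is a minimal sketch, assuming the three motifs take the form described by Couet et al (ΦXΦXXXXΦ, ΦXXXXΦXXΦ, and ΦXΦXXXXΦXXΦ, with Φ = Trp, Phe, or Tyr and X any residue)[16]. The names `MOTIFS` and `find_caveolin_motifs`, the motif labels, and the toy sequence are illustrative assumptions and are not taken from this study or from DNASIS MAX.

```python
import re

AROMATIC = "[WFY]"  # Trp, Phe, or Tyr

# Assumed motif definitions: aromatic residues separated by fixed-length gaps.
MOTIFS = {
    "motif_1": f"{AROMATIC}.{AROMATIC}.{{4}}{AROMATIC}",                  # ΦXΦXXXXΦ
    "motif_2": f"{AROMATIC}.{{4}}{AROMATIC}.{{2}}{AROMATIC}",             # ΦXXXXΦXXΦ
    "motif_3": f"{AROMATIC}.{AROMATIC}.{{4}}{AROMATIC}.{{2}}{AROMATIC}",  # ΦXΦXXXXΦXXΦ
}


def find_caveolin_motifs(sequence):
    """Return (motif name, 1-based start, 1-based end, matched text) for every hit.

    A lookahead is used so that overlapping matches are also reported.
    """
    hits = []
    for name, pattern in MOTIFS.items():
        regex = re.compile(f"(?=({pattern}))")
        for match in regex.finditer(sequence):
            start = match.start(1) + 1  # convert to 1-based residue numbering
            end = match.end(1)
            hits.append((name, start, end, match.group(1)))
    return sorted(hits, key=lambda hit: (hit[1], hit[2]))


# Toy sequence (hypothetical, not a SARS-CoV sequence).
toy_hits = find_caveolin_motifs("AAWAFAAAAYAAA")
for hit in toy_hits:
    print(hit)
```

Reporting 1-based start and end positions keeps the output comparable with residue ranges such as F4526-F4534 used elsewhere in this paper.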

Molecular modeling To confirm the above interactions between the scaffolding domain of caveolin-1 and the SARS-CoV proteins containing caveolin-binding motifs, possible three-dimensional (3D) structures of these monomer proteins were constructed by molecular modeling methods, and the protein-protein interactions were simulated by molecular docking methods.

3D model generation To construct 3D models of caveolin-1 and of the SARS-CoV proteins containing caveolin-binding motifs, the following steps were used:

(1) The sequence data of the above proteins were prepared for 3D model generation. Since it is impossible to model the whole structures of the full-length replicase 1AB and spike protein sequences, they were split into several pieces according to related knowledge.
(2) A homology search was performed to identify homologues of the target protein sequence using BLAST (available from: http://www.ncbi.nlm.nih.gov/BLAST).
(3) A multiple sequence alignment containing the target sequence and all the homologues found above was performed using CLUSTALW (available from: http://www2.ebi.ac.uk/clustalw/).
(4) If the target protein showed significant homology to another protein of known three-dimensional structure, we proceeded to comparative (homology) modeling to generate models automatically using the SWISSMODEL server (available from: http://www.expasy.ch/swissmod/SWISS-MODEL.html). Otherwise, a secondary structure prediction was needed.
(5) Three automated secondary structure predictions (PHD http://www.embl-heidelberg.de/predictprotein/, SOPMA http://www.ibcp.fr/serv_pred.html and SSPRED http://www.embl-heidelberg.de/sspred/sspred_info.html) were performed to provide the possible locations of alpha helices and beta strands within the target protein. All of the predictions were then aligned with the multiple sequence alignment to obtain a consensus picture of the structure.
(6) 3D-PSSM (http://www.bmm.icnet.uk/~3dpssm/), a fold recognition method, was used to find a suitable fold for the target protein among known 3D structures.
(7) If the target protein was predicted to adopt a particular fold within the database, an alignment of secondary structures was obtained to determine which fold the target protein belongs to and which other proteins adopt a similar fold. The secondary structures of diverse members of the fold were aligned using a structural alignment program, and the secondary structures were aligned to the core secondary structure elements.
(8) The target sequence was aligned onto the tertiary structure obtained from the fold recognition method, and the alignment was edited.
(9) The alignment was sent directly to the SWISSMODEL server to generate a 3D model of the target sequence.
(10) If no suitable fold was found, ab initio structure prediction would be tried.
(11) The modeled structure was validated with PROCHECK and WHATIF (available from: http://biotech.embl-ebi.ac.uk:8400/).

The above procedure is summarized in the flowchart in Fig 1.

Fig 1. The flow chart of 3D model generation.
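The branching in the modeling procedure can also be written down compactly. The sketch below only encodes the decision logic of steps (4), (6)-(9), and (10) as described in the text; it does not call BLAST, 3D-PSSM, or SWISSMODEL, and the function name `choose_modeling_route` and its boolean inputs are hypothetical names introduced for illustration.

```python
def choose_modeling_route(has_structural_homologue, has_suitable_fold):
    """Return the ordered modeling stages implied by the procedure in the text.

    has_structural_homologue: a homologue of known 3D structure was found (step 4).
    has_suitable_fold: fold recognition found a suitable fold (steps 6-9).
    """
    if has_structural_homologue:
        stages = ["homology modeling"]
    elif has_suitable_fold:
        stages = [
            "secondary structure prediction",
            "fold recognition",
            "alignment editing",
            "homology modeling on the edited alignment",
        ]
    else:
        stages = [
            "secondary structure prediction",
            "fold recognition",
            "ab initio structure prediction",
        ]
    # Every route ends with model validation (step 11).
    return stages + ["validation"]


# Route for a target with no homologue of known structure but a recognized fold.
route = choose_modeling_route(has_structural_homologue=False, has_suitable_fold=True)
print(" -> ".join(route))
```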

Interaction simulation To characterize the interaction between the scaffolding domain of caveolin-1 and the SARS-CoV proteins containing caveolin-binding motifs, molecular docking simulation was performed using Chemera 2.0. Chemera 2.0 integrates the BiGGER (Biomolecular complex Generation with Global Evaluation and Ranking) algorithm for automated protein-protein docking[17]. BiGGER searches the complete binding space and selects a set of candidate complexes. Each candidate is then evaluated and ranked according to the estimated probability of being an accurate model of the native complex. In the molecular docking simulation, we used the scaffolding domain (D82-R101) of caveolin-1 to automatically restrict the model candidates to those that fit the data. After the BiGGER run, we obtained several possible binding conformations. The conformation with the lowest binding energy was selected for further analysis.
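The final selection step, choosing the conformation with the lowest binding energy, amounts to a minimum over the candidate list. The sketch below illustrates only that step; it is not the Chemera or BiGGER interface, and the `DockingCandidate` class, the labels, and the energy values are hypothetical placeholders rather than results from this study.

```python
from dataclasses import dataclass


@dataclass
class DockingCandidate:
    label: str
    binding_energy: float  # arbitrary units; lower means more favorable


def select_lowest_energy(candidates):
    """Return the candidate with the lowest binding energy."""
    if not candidates:
        raise ValueError("no docking candidates to choose from")
    return min(candidates, key=lambda candidate: candidate.binding_energy)


# Placeholder candidates (made-up values, for illustration only).
candidates = [
    DockingCandidate("pose_a", -12.5),
    DockingCandidate("pose_b", -20.1),
    DockingCandidate("pose_c", -7.3),
]
best = select_lowest_energy(candidates)
print(best.label)
```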

RESULTS AND DISCUSSION

Caveolin-binding motifs in SARS-CoV proteins By amino acid motif search, we found that four SARS-CoV proteins contained caveolin-binding motifs (Tab 3).

The replicase 1AB polyprotein of SARS-CoV has 25 potential caveolin-binding motifs (Fig 2A, Tab 3). This polyprotein is autocatalytically processed to yield the mature polypeptides nsP1 to nsP13[3]. Among these peptides, nsP1 (A819-Q3240, viral proteases PLPpro) has 13 caveolin-binding motifs, and nsP5 (A3920-Q4177), nsP9 (S4370-Q5301, RNA-dependent RNA polymerase, RdRp), nsP10 (A5302-Q5902, MB, NTPase/HEL), nsP11 (A5903-Q6429), and nsP12 (S6430-Q6775) have 1, 3, 1, 3, and 4 caveolin-binding motifs, respectively. In addition to replicase 1AB, we also found 7, 3, and 1 caveolin-binding motifs in spike protein, orf3 protein, and M protein, respectively (Fig 2B-D, Tab 3).

Fig 2. Caveolin-binding motifs in SARS-CoV proteins. (A)-(D) are the caveolin-binding motifs of replicase 1AB, spike protein, M protein, and ORF3 protein, respectively.

Tab 3. Caveolin-binding motifs in SARS-CoV proteins.
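The motif counts and residue ranges given in the text can be cross-checked with a few lines of code. The sketch below uses only the numbers stated above (the per-peptide counts and the nsP residue ranges in replicase 1AB); the dictionary names and the `locate_nsp` helper are illustrative assumptions. Peptides whose ranges are not given in the text are deliberately left out, so positions outside the listed ranges return `None`.

```python
# Residue ranges of the mature peptides in replicase 1AB, as listed in the text.
NSP_RANGES = {
    "nsP1": (819, 3240),
    "nsP5": (3920, 4177),
    "nsP9": (4370, 5301),
    "nsP10": (5302, 5902),
    "nsP11": (5903, 6429),
    "nsP12": (6430, 6775),
}

# Caveolin-binding motif counts, as listed in the text.
REPLICASE_MOTIF_COUNTS = {"nsP1": 13, "nsP5": 1, "nsP9": 3, "nsP10": 1, "nsP11": 3, "nsP12": 4}
OTHER_MOTIF_COUNTS = {"spike": 7, "orf3": 3, "M": 1}


def locate_nsp(position):
    """Return the name of the listed peptide containing a replicase 1AB position, or None."""
    for name, (start, end) in NSP_RANGES.items():
        if start <= position <= end:
            return name
    return None


replicase_total = sum(REPLICASE_MOTIF_COUNTS.values())
overall_total = replicase_total + sum(OTHER_MOTIF_COUNTS.values())
print(replicase_total, overall_total)

# Start positions of the four replicase 1AB segments discussed in the docking results.
for position in (4526, 5499, 6135, 6408):
    print(position, locate_nsp(position))
```

The per-peptide counts add up to the 25 motifs reported for replicase 1AB, and adding the spike, orf3, and M protein counts gives the overall total of 36.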

3D model of caveolin-1 and SARS-CoV proteins with caveolin-binding motifs In the present paper, we selected the following sequences for 3D structure modeling: full-length caveolin-1; sequences A819-E930, A2301-C2400, F3066-E3165, A3920-Q4117, C4521-C4670, Y5101-N5150, D5461-L5520, L6381-Q6429, and D6648-F6770 of replicase 1AB; sequences M151-P210, D351-G490, and F1071-V1210 of spike protein; sequence K61-Y160 of orf3; and sequence L61-N120 of M protein. These sequences covered all the caveolin-binding motifs. Because no homologue was found by homology search, we switched to a fold recognition method and found a set of suitable folds for the target proteins. We then proceeded to homology modeling to generate models automatically using the SWISSMODEL server. Finally, we obtained all the 3D structure models (Fig 3). PROCHECK and WHATIF analysis indicated that the 3D models were reasonable.

Fig 3. (A) 3D model of full-length caveolin-1. The structure is shown in ribbon representation, colored by secondary structure. The scaffolding domain is highlighted in green. (B)-(F) show the binding conformations of caveolin-1 with sequences F4526-F4534 of nsP9, Y5499-Y5506 of nsP10, F6135-F6142 of nsP11, F6408-Y6419 of nsP11, and Y356-F364 of spike protein, respectively. (G)-(I) show the binding conformations of caveolin-1 with sequences W69-F77 of orf3 protein, Y141-W149 of orf3 protein, and Y94-F102 of M protein, respectively. The structure of caveolin-1 is shown in molecular surface representation colored by secondary structure. The structures of the caveolin-binding sites in SARS-CoV proteins are shown in VDW representation. The caveolin-binding sequences are also shown in the figure.

Caveolin-binding sites in SARS-CoV proteins To confirm the above protein-protein interactions at the 3D level, we performed molecular docking simulation using BiGGER. The 3D structure model of caveolin-1 is shown in Fig 3A. There is a cleft around the scaffolding domain (D82-R101) of caveolin-1. The binding models are shown in Fig 3B-F and Fig 3G-I; they indicate that F4526-F4534, Y5499-Y5506, F6135-F6142, and F6408-Y6419 of replicase 1AB, Y356-F364 of spike protein, W69-F77 and Y141-W149 of orf3 protein, and Y94-F102 of M protein may fit complementarily into the binding cleft on the surface of the scaffolding domain of caveolin-1. All binding affinities were estimated to be on the order of mmol/L. Tab 4 lists the hydrophobic contacts between the above SARS-CoV proteins and the scaffolding domain of caveolin-1. As an example, the interatomic distances and the hydrophobic contacts between F6135-F6142 of replicase 1AB and the scaffolding domain of caveolin-1 are described in detail in Tab 5. Together, these molecular modeling and simulation results confirm 8 caveolin-binding sites from the 36 caveolin-binding motifs in SARS-CoV proteins (Tab 4).

Tab 4. The hydrophobic contacts between SARS-CoV proteins and scaffolding domain (D82-R101) of caveolin-1.

Tab 5. The interatomic distances and the hydrophobic contacts between F6135-F6142 of SARS-CoV replicase 1AB and scaffolding domain of caveolin-1.
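To put a mmol/L-level affinity in perspective, a dissociation constant can be converted to a standard binding free energy with the textbook relation ΔG° = RT ln(Kd / 1 mol/L). The sketch below is an illustration under stated assumptions: a temperature of 298.15 K and a dissociation constant of exactly 1 mmol/L are assumed here, since the text gives neither a temperature nor the method by which the affinities were estimated, and the function name is hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal/(mol*K)


def dissociation_constant_to_delta_g(kd_molar, temperature_kelvin=298.15):
    """Standard binding free energy (kcal/mol) for a dissociation constant in mol/L.

    Uses dG = R * T * ln(Kd / 1 mol/L); a weaker binder (larger Kd) gives a less
    negative value.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return GAS_CONSTANT_KCAL * temperature_kelvin * math.log(kd_molar)


# Assumed example: Kd = 1 mmol/L at 298.15 K gives roughly -4.1 kcal/mol.
delta_g_millimolar = dissociation_constant_to_delta_g(1e-3)
print(round(delta_g_millimolar, 2))
```

Under these assumptions, a mmol/L-level affinity corresponds to a comparatively weak interaction of about -4 kcal/mol.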

Potential targets for development of new antiviral drugs and vaccines The early stages of viral infection involve the attachment of virions to the cell surface by binding to a cellular receptor, followed by entry into the cell. Enveloped viruses have two options during entry: receptor-mediated endocytosis, or direct fusion of the viral envelope with the plasma membrane to deliver the nucleocapsid to the cytoplasm[18]. SARS-CoV is an enveloped, positive-stranded RNA virus. SARS-CoV infection begins with binding of the spike protein on the viral envelope to a specific receptor on the cell membrane[2]. This receptor might be APN, class I HLA, or another cell surface molecule. The virus receptor is important in the selection of the entry route[18]. Examples of viruses that bind to class I HLA on the cell surface include simian virus 40 (SV40)[19] and HCoV-OC43[20]. After binding to class I HLA, SV40 is translocated to noncoated membrane invaginations known as caveolae. The virus then dissociates from class I HLA and enters cells through the caveolae after initiating a signal transduction cascade[19]. Caveolae have previously been demonstrated to act as the entry route for a number of pathogens[18]. Since one caveolin-binding site was found in spike protein (Fig 3F, Tab 4), and SARS-CoV might use class I HLA as its receptor, we infer that caveolae might act as an entry route for SARS-CoV, as they do for SV40. If this entry route is confirmed by experiment, molecules that inhibit the interaction between spike protein and caveolin-1 might block SARS-CoV infection. This interaction might be a useful target for the design of new drugs. The receptor-binding sites and the caveolin-binding site in spike protein might also provide clues for antigen synthesis.

Like that of other coronaviruses, SARS-CoV replication requires an RNA polymerase and a helicase, which are encoded by the replicase 1AB gene. The replicase 1AB protein contains four caveolin-binding sites, which are distributed among nsP9 (one site), nsP10 (one site), and nsP11 (two sites) (Tab 4). NsP9 (RdRp) is an RNA-dependent RNA polymerase. NsP10 is a helicase. Caveolin-1 is an unusual protein that can be both an integral membrane protein and soluble in multiple cellular compartments[14]. The interaction between caveolin-1 and these two proteins in the cytoplasm might play a part in SARS-CoV replication in the host cell. A new drug targeting this interaction might interrupt SARS-CoV RNA replication in the host cell.

The SARS-CoV membrane proteins, including the major proteins S (spike) and M (membrane), are inserted into the endoplasmic reticulum (ER) Golgi intermediate compartment, while full-length replicated RNA plus strands assemble with the N (nucleocapsid) protein. This RNA-protein complex then associates with the M protein embedded in the membranes of the ER, and virus particles form as the nucleocapsid complex buds into the lumen of the ER. The virus then migrates through the Golgi complex and eventually exits the cell, likely by exocytosis[3]. Interestingly, caveolin-1 and -2 are usually co-expressed and assemble into hetero-oligomers in the ER and Golgi apparatus[11]. Lipid rafts containing caveolin-1 have been implicated as sites of virion assembly for poliovirus and as sites of virus egress for a number of enveloped viruses, including influenza A virus, measles virus, human immunodeficiency virus type 1, and herpesviruses[18]. In this study, we found that the M protein of SARS-CoV contains one caveolin-binding site (Fig 3I, Tab 4). It is therefore reasonable to propose that caveolin-1 might be involved in the assembly and release of SARS-CoV through the ER and Golgi apparatus in vivo. The caveolin-1-related assembly and secretion of SARS-CoV might also be a potential target for drug development.

CONCLUSION

Possible caveolin-binding motifs in the SARS-CoV proteins have been mapped using bioinformatics analysis tools, and 8 of the 36 caveolin-binding motifs have been confirmed by molecular modeling and simulation methods. Although it is not yet known whether all 8 of these caveolin-binding sites interact directly with caveolin-1, our studies provide a rational and systematic basis for investigating whether these protein sequences are indeed recognized as ligands by the scaffolding domain of caveolin-1. In conclusion, caveolin-1 may serve as a possible receptor of the SARS-CoV proteins, which may be associated with SARS-CoV infection, replication, assembly, and budding.

REFERENCES