What this is
- This research investigates the emergence of the SARS-CoV-2 in Japan, focusing on specific mutations in the spike protein.
- The study involves whole genome sequencing of samples from 159 COVID-19 patients to identify variants.
- It reports familial transmission of the and its classification into global clades.
Essence
- The SARS-CoV-2 , identified in Japan, carries mutations W152L, E484K, and G769V, which may confer immune escape potential. A novel was developed to detect these mutations.
Key takeaways
- The was found in three family members in Japan, indicating household transmission. These individuals were not travelers, suggesting local spread.
- As of April 22, 2021, 2,388 cases of the were recorded globally, with 38.2% from Japan and 47.1% from the United States. This indicates a significant presence in both countries.
- The percentage of SARS-CoV-2 isolates in Japan increased rapidly, exceeding 20% by early March 2021, while remaining low in the United States.
Caveats
- The study's sample size is limited to 159 patients, which may not represent the broader population. More extensive sequencing is needed for a comprehensive understanding of variant prevalence.
- Data availability from different countries varies, potentially skewing the perceived prevalence of the globally.
Definitions
- R.1 lineage: A sublineage of SARS-CoV-2 characterized by specific mutations in the spike protein, notably E484K.
- TaqMan assay: A real-time PCR technique used to detect and quantify specific DNA sequences, including viral mutations.
Simplified
Introduction
Most mutations that occur during viral evolution are neutral and not to influence viral properties. However, some mutations are selectively propagated owing to their positive influence on viral fitness, virulence, and transmissibility [1,2]. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) "Variants of Concern" have emerged within the last few months, largely belonging to three major lineages, B.1.1.7, B.1.351, and P.1 [3 –6]. These emerging lineages are all characterized by multiple mutations in the SARS-CoV-2 spike protein, raising concerns that they may escape monoclonal antibody therapy or vaccine-elicited antibodies. The B.1.1.7 lineage is estimated to have emerged in late September 2020 and has become the dominant strain of SARS-CoV-2 in the United Kingdom [3,7,8]. The B.1.351 lineage has become the dominant strain of SARS-CoV-2 in South Africa, where it was first detected in October 2020 [4]. The P.1 lineage was first identified in four travelers from Brazil and has been associated with cases of reinfection [5,6,9].
The hallmark mutation shared by the B.1.1.7, B.1.351, and P.1 lineages is N501Y, located in the receptor-binding domain (RBD) of the spike protein [3]. SARS-CoV-2 variants with this mutation are thought to be more transmissible and possibly more virulent [7,10 –13]. The other hallmark mutation of the B.1.351 and P.1 lineages is E484K, which has been shown to reduce the neutralizing activity of antibodies, thus raising concerns about vaccine efficacy [14 –19].
In this study, we conducted a genetic surveillance and identified the SARS-CoV-2 R.1 lineage, which harbors an E484K mutation in the RBD, via whole genome sequencing and TaqMan assay. We report the first detailed genetic characterization of the SARS-CoV-2 R.1 lineage, its familial transmission, and its viral evolution as assessed by phylogenetic analysis. The SARS-CoV-2 R.1 lineage has spread worldwide and remains especially prevalent in Japan.
Materials and methods
Ethics statement
The Institutional Review Board of the Clinical Research and Genome Research Committee at Yamanashi Central Hospital approved this study and the use of an opt-out consent method (Approval No. C2019-30). The requirement for written informed consent was waived owing to it being an observational study and the urgent need to collect COVID-19 data.
Patients and sample collection
A total of 192 patients in our hospital were confirmed to have coronavirus disease 2019 (COVID-19), 159 of whom were selected to provide samples for subsequent genome analysis. Nasopharyngeal swab samples were collected by using cotton swabs and placed in 3 ml of viral transport media (VTM) purchased from Copan Diagnostics (Murrieta, CA, United States). We used 200 μl of VTM for nucleic acid extraction, performed within 2 h of sample collection.
Quantitative reverse transcription-polymerase chain reaction (RT-qPCR)
Total nucleic acid was isolated from the nasopharyngeal swab samples using the MagMAX Viral/Pathogen Nucleic Acid Isolation Kit (Thermo Fisher Scientific; Waltham, MA, United States) [20,21]. Following the protocol developed by the National Institute of Infectious Diseases in Japan [22], we performed one-step RT-qPCR to detect SARS-CoV-2 on a StepOnePlus Real-Time PCR System (Thermo Fisher Scientific). The viral load, measured as absolute copy number, was determined using a serially diluted DNA control targeting the nucleocapsid gene of SARS-CoV-2 (Integrated DNA Technologies; Coralville, IA, United States) [23,24].
Whole-genome sequencing
SARS-CoV-2 genomic RNA was reverse transcribed into cDNA and amplified by using the Ion AmpliSeq SARS-CoV-2 Research Panel (Thermo Fisher Scientific) on the Ion Torrent Genexus System in accordance with the manufacturer's instructions [25]. Sequencing reads were processed, and their quality was assessed by using Genexus Software with SARS-CoV-2 plugins. The sequencing reads were mapped and aligned by using the torrent mapping alignment program. After initial mapping, a variant call was performed by using the Torrent Variant Caller. The COVID19AnnotateSnpEff plugin was used for the annotation of variants. Assembly was performed with the Iterative Refinement Meta-Assembler [26].
Clade and lineage classification
The viral clade and lineage classifications were conducted by using the Global Initiative on Sharing Avian Influenza Data (GISAID) database [27], Nextstrain [28], and Phylogenetic Assignment of Named Global Outbreak (PANGO) Lineages [29]. The sequences of the three initially identified R.1 variants were deposited in the GISAID EpiCoV database (Accession Nos. EPI_ISL_1164927, EPI_ISL_1164928, and EPI_ISL_1164929).
Global sequencing data on the SARS-CoV-2 R.1 variant through April 22, 2021 were exported from the GISAID EpiCoV database. We searched for lineage "R.1" and found 2,388 available metadata entries for the R.1 lineage for the period from October 24, 2020 to April 18, 2021, when excluding two data entries lacking a specific collection date. We also searched for the total number of samples by applying the following parameters: host "human", location "Asia /Japan" or "North America/USA", collection date "October 24, 2020 to April 18, 2021". During this period, a total of 17,432 samples were registered in Japan, and 235,373 samples were registered in the United States.
Rapid detection of R.1 lineage SARS-CoV-2 samples with a novel TaqMan assay
We designed a Custom TaqMan assay for detecting SARS-CoV-2 spike protein with the W152L and G769V mutations (Thermo Fisher Scientific). We also used a TaqMan SARS-CoV-2 Mutation Panel for detecting spike E484K (ID: ANU7GMZ, Thermo Fisher Scientific). TaqPath 1-Step RT-qPCR Master Mix CG was used as the master mix. The TaqMan MGB probes for the wild-type and variant alleles were labelled with VIC dye and FAM dye fluorescence, respectively.
Results
Household transmission of SARS-CoV-2 harboring a spike protein with the E484K mutation
As part of our ongoing genomic surveillance of SARS-CoV-2, we began to investigate SARS-CoV-2 spike protein RBD variants in Kofu city, Japan [25]. We previously identified P.1 lineage SARS-CoV-2, which harbors the K417T/E484K/N501Y mutations, in one patient [25]. A consecutive analysis detected the W152L/E484K/G769V mutations in three patients who provided samples on January 14, 2021. All three patients belonged to the same family: a man in his 40s (the father), a boy under 10 years old, and a girl under 10 years old. This family lived in Japan and had no history of travel to foreign countries. The mother of this family was also infected with SARS-CoV-2 around the same time as the other family members, but we were unable to obtain her samples because she was tested at another hospital. These results suggest that the family members were infected with the same virus lineage and that household transmission occurred.
Genetic characterization defines the SARS-CoV-2 R.1 lineage
Our sequencing analysis of the SARS-CoV-2 isolates from the three family members identified the same 21 mutations in each isolate; these comprised 13 missense, six synonymous, and two intergenic variants. Among the missense mutations, four were in the spike protein (W152L, E484K, D614G, and G769V), four were in ORF1ab (T4692I, N6301S, L6337M, and I6525T), one was in the membrane protein (F28L), and four were in the nucleocapsid protein (S187L, R203K, G204R, and Q418H).
To examine the phylogeny based on the identified genetic variations, we analyzed SARS-CoV-2 genomic data using GISAID, Nextstrain, and Pangolin. The sequences of our SARS-CoV-2 were assigned as GR clade (GISAID), 20B clade (Nextstrain), and R.1 lineage (PANGO lineage). The R.1 lineage is a sublineage of the B.1.1.316 lineage, and the common mutations found in the R.1 lineage are listed in Table 1. For the rapid detection of SARS-CoV-2 R.1 lineage isolates, we designed a TaqMan assay that detects the hallmark spike protein mutations (W152L, E484K, and G769V) (Table 1). This TaqMan assay can successfully discriminate the wild-type and variant alleles in each position (Fig 1). Using the TaqMan assay, we identified an additional 14 patients who were infected with SARS-CoV-2 R.1 lineage. We further confirmed the results via whole genome sequencing. These findings suggest that the novel TaqMan assay targeting SARS-CoV-2 R.1 lineage hallmark mutations is a useful tool for detecting the presence of R.1 lineage isolates in SARS-CoV-2-positive PCR samples.

TaqMan assay for discriminating R.1 lineage SARS-CoV-2 by its spike variant allele. We analyzed samples of R.1 lineage SARS-CoV-2 (n = 3), which harbors a spike variant with the W152L/E484K/G769V mutations and samples of SARS-CoV-2 without these three mutations (n = 5) using a TaqMan assay. An allelic discrimination plot of the results, showing the spike variants with W152L (upper), E484K (middle), and G769V (lower). The dots indicate the mutant alleles (purple), reference allele (pink), or no amplification (orange). The dotted circles indicate the mutant alleles of each variant.
| Gene | Mutation |
|---|---|
| ORF1b | P314L |
| ORF1b | G1362R |
| ORF1b | P1936H |
| S | W152L |
| S | E484K |
| S | D614G |
| S | G769V |
| N | S187L |
| N | R203K |
| N | G204R |
| N | Q418H |
Epidemiological event of SARS-CoV-2 R.1 lineage
To investigate the global distribution of SARS-CoV-2 R.1 lineage, we next collected registration data from the EpiCoV of GISAID database [27]. As of March 5, 2021, a total of 2,388 SARS-CoV-2 samples with R.1 lineage had been registered from all over the world, with the majority being from Japan (38.2%; 913/2,388) or the United States (47.1%; 1,125/2,388) (Fig 2A and Table 2). R.1 lineage SARS-CoV-2 was first reported at the end of October 2020 in Texas, United States and was first detected in Japan at the end of November 2020. The change in the number of R.1 lineage SARS-CoV-2 samples has followed similar trends in the United States, Japan, and other countries (Fig 2A).
Regarding the period from October 24, 2020 to April 18, 2021, the number of R.1 lineage SARS-CoV-2 samples deposited in GISAID began an ongoing increase in February 2021 (Fig 2B), with the percentage of these isolates in Japan exceeding 20% since the beginning of March 2021 (Fig 2B). While the percentage of R.1 lineage SARS-CoV-2 remained low in the United States, it grew rapidly in Japan (Fig 2B). We also observed that the numbers and percentage of R.1 lineage SARS-CoV-2 isolates increased in our district of Japan (S1 Fig).

Timeline of SARS-CoV-2 R.1 lineage emergence and country-specific percentages of global cases. () Timeline of the number of confirmed cases of infection with R.1 lineage SARS-CoV-2. The plot, created based on the data in, shows the case load for Japan (pink), the United States (light blue), and other countries (gray). () The percentage of R.1 lineage SARS-CoV-2 strains relative to the total number of registered samples during the period is shown for Japan (upper panel) and the United States (lower panel). The arrow indicates the date of infection for the initial three individuals infected with R.1 lineage SARS-CoV-2 who were identified in this study. A B Table 1
| Country | Number of confirmed cases (= 2,388)n | Frequency |
|---|---|---|
| USA | 1,125 | 47.1% |
| Japan | 913 | 38.2% |
| Germany | 100 | 4.2% |
| Austria | 69 | 2.9% |
| Belgium | 35 | 1.5% |
| Nether lands | 30 | 1.3% |
| Sweden | 26 | 1.1% |
| United Kingdom | 19 | 0.8% |
| Trinidad and Tobago | 18 | 0.8% |
| Spain | 12 | 0.5% |
| France | 8 | 0.3% |
| Canada | 7 | 0.3% |
| Ghana | 6 | 0.3% |
| Turkey | 3 | 0.1% |
| Australia | 3 | 0.1% |
| Croatia | 2 | 0.1% |
| Switzer land | 2 | 0.1% |
| Nigeria | 1 | 0.04% |
| China | 1 | 0.04% |
| United Arab Emirates | 1 | 0.04% |
| Denmark | 1 | 0.04% |
| Finland | 1 | 0.04% |
| Italy | 1 | 0.04% |
| Portugal | 1 | 0.04% |
| Slovakia | 1 | 0.04% |
| New Zealand | 1 | 0.04% |
| Aruba | 1 | 0.04% |
| Total | 2,388 | 100% |
Phylogenetic analysis of SARS-CoV-2 R.1 lineage
To determine the timing of the emergence of the SARS-CoV-2 R.1 lineage and its acquisition of characteristic mutations, we analyzed a phylogeny of SARS-CoV-2 carrying the E484K mutation, generated from global data. It revealed that the SARS-CoV-2 R.1 lineage forms a monophyletic clade (Figs 3A and S2). The parental lineage harboring the E484K mutation later acquired spike protein W152L and G769V mutations, and the SARS-CoV-2 R.1 lineage was predicted to have emerged around September 9, 2020 (Fig 3A). Subsequently, a sublineage, which has been observed in Austria and the United States, diverged after acquiring the ORF1b G814C mutation around October 19, 2020 (Fig 3A). A parental lineage without the ORF1b G814C mutation has been prevalent mainly in Japan and the United States (Fig 3A). In addition to being detected in Japan, the SARS-CoV-2 R.1 lineage has been observed in 27 additional countries worldwide (Fig 3B and Table 2). Collectively, these results demonstrate that the SARS-CoV-2 R.1 lineage is spreading rapidly, especially in Japan.

Phylogenetic tree of R.1 lineage SARS-CoV-2. Global data on the SARS-CoV-2 R.1 lineage prevalence over time. The R.1 lineage acquired its spike W152L and G769V mutations at the root. The sublineage harbors an ORF1b G814C mutation.Geographic distribution of the SARS-CoV-2 R.1 lineage. World map showing the geographic distribution of SARS-CoV-2 R.1 lineage as of April 22, 2021. The size of the circle indicates the number of samples. Nextstrain,, CC-BY-4.0 license. (A) (B) https://nextstrain.org
Discussion
COVID-19 vaccines have been approved in many countries within a year of the initial appearance of this disease, which is an unprecedented scientific achievement [30]. However, the emergence of mutations in SARS-CoV-2 spike RBD and N-terminal domain (NTD) are of great concern because of their potential contribution to immune escape [31,32]. In the present study, we observed an expansion in Japan of R.1 lineage SARS-CoV-2 harboring the spike RBD E484K and spike NTD W152L mutations, which have potential significance for immune escape. Previous reports showed that COVID-19 convalescent and mRNA vaccine-elicited sera/plasma have reduced neutralizing activity against SARS-CoV-2 harboring the E484K mutation [17 –19]. Worryingly, the E484K mutation has been identified in SARS-CoV-2 "Variants of Concern", such as B.1.351 (South Africa) and P.1 (Brazil), and in SARS-CoV-2 "Variants of Interest", such as B.1.526 (New York, NY, USA), B.1.525 (United Kingdom/Nigeria), and P.2 (Brazil) [33 –35]. Furthermore, the W152L mutation is located in the N3 loop of the NTD and may be related to decreased antibody neutralization activity [36]. The new emergent lineage B.1.429, which has a substitution in the same codon 152 (W152C), was first identified in California, United States; this lineage is defined as a "Variant of Interest" by the US Centers for Disease Control and Prevention (CDC) [35]. The W152C mutation is located in the NTD antigenic supersite (designated site i) and is also associated with reduced recognition of the NTD by neutralizing monoclonal antibodies [32,37].
There is concern that SARS-CoV-2 with these immune escape mutations may evade the host immune response. In Brazil, a patient was re-infected with SARS-CoV-2 carrying the E484K mutation, indicating that this mutation is related to escape from neutralizing antibodies in recovered patients [38]. Additionally, a patient who received the second shot of mRNA-1273 vaccine (Moderna) was infected with SARS-CoV-2 harboring the E484K mutation, despite having a serum neutralizing antibody titer that is normally sufficient to prevent infection [39]. Recently, R.1 lineage SARS-CoV-2 was detected at a skilled nursing facility in Kentucky, in residents and healthcare personnel who had received BNT162b2 mRNA vaccines (Pfizer-BioNTech) [40]. These results indicate that even after COVID-19 vaccination, the risk of infection is not completely eliminated. Although COVID-19 mRNA vaccines have demonstrated high efficacy for decreasing the risk of SARS-CoV-2 transmission and severe outcomes from COVID-19 [41 –43], it is still necessary to monitor vaccinated individuals for new emergent lineages that have acquired novel escape mutations.
There are limitations to this epidemiological report on the SARS-CoV-2 R.1 lineage. The amount of SARS-CoV-2 sequencing data available from different countries does not reflect the actual prevalence of specific variants because the extent of sequencing analysis conducted varies among countries [30]. If the sequences of nearly all SARS-CoV-2-positive PCR specimens is analyzed, the results can represent the whole picture of prevailing virus lineages; unfortunately, this is not the case here. However, although the number of samples in our study is small, we analyzed all the SARS-CoV-2-positive PCR samples in our hospital and found that the percentage of samples with R.1 lineage SARS-CoV-2 increased to about 50% by the end of the study period (S1 Fig). Along with the R.1 lineage prevalence, the B.1.1.7 (United Kingdom) lineage prevalence is also increasing in Japan [44]. Further investigation is needed to determine whether these two lineages of SARS-CoV-2 will circulate dominantly in Japan.
Circulating SARS-CoV-2 acquires approximately one or two mutations per month. This seems to be very slow; however, the more the virus circulates in population, the more opportunity it has to change [45]. Therefore, genomic surveillance in real-time will provide us with important insights into the effectiveness of vaccines, the development of antibody therapy, and public health.