What this is
- This research investigates the role of and RNA modifications in esophageal squamous cell carcinoma (ESCC).
- It identifies a signature of long non-coding RNAs () related to these modifications for predicting patient prognosis and immunotherapy outcomes.
- The study utilizes transcriptomic data from public databases and develops a RiskScore model based on ten prognostic .
Essence
- A RiskScore model based on ten /-related effectively predicts prognosis and immunotherapy response in ESCC patients. Low-RiskScore patients show better survival and immune cell infiltration.
Key takeaways
- The study identifies 606 /-related in ESCC, which are classified into three clusters with distinct clinical features and immune landscapes.
- Patients in the low-RiskScore group demonstrate a better prognosis and higher abundance of immune cells, indicating a favorable immune microenvironment.
- The RiskScore model predicts immunotherapy efficacy, with low-RiskScore patients more likely to benefit from immune checkpoint inhibitors.
Caveats
- The findings rely on data from public databases, lacking validation through wet laboratory experiments.
- The prognostic value of the RiskScore model requires further validation in independent patient cohorts.
- The molecular mechanisms of the identified /-related in ESCC remain unclear and warrant further investigation.
Definitions
- m6A: N6-methyladenosine, a prevalent RNA modification affecting gene regulation and stability.
- m5C: 5-Methylcytosine, an RNA modification linked to various biological processes, including cancer progression.
- lncRNA: Long non-coding RNA, a type of RNA that regulates gene expression without encoding proteins.
AI simplified
Introduction
Esophageal carcinoma (EC) is a highly lethal gastrointestinal malignancy worldwide, which occurs an estimated 604,100 newly diagnosed cases and 544,100 individual deaths in 20201. Esophageal squamous cell carcinoma (ESCC) represents the major histological subtype, constituting approximately 90% of all incident EC cases annually. The prevailing therapeutic approach for advanced-stage patients currently contains platin-based chemotherapy, either as a standalone regimen or in conjunction with checkpoint inhibitors. Notably, the integration of PD1 immunotherapy has yielded substantial clinical benefit, particularly within the subset of patients exhibiting tumor proportion scores (TPS) more than 1% or combined positive scores (CPS) higher than 102. Nevertheless, despite breakthroughs in ESCC diagnosis and therapeutic intervention, the prognosis for advanced patients remains unsatisfactory owing to the clandestine onset of symptoms at an early stage and the propensity for metastatic dissemination. Thus, the emergence of a molecular phenotyping-based prognostic risk model offers useful information for the identification and differentiation of high-risk individuals, potentially leading to enhanced medical outcomes.
RNA modifications have been confirmed as a critical epigenetic mechanism for post-transcriptional gene regulation via affecting RNA splicing, stability, and translation. To date, over 170 different types of natural modification have been identified, and N6-methyladenosine (m6A) represents the most prevalent methylated alteration that occurs in all types of RNAs, including cording mRNA, tRNA, rRNA, and non-coding RNA. The process of m6A is determined by RNA methyltransferases (METTL3, METTL14), demethylases (FTO and ALKBH5), and recognized by methylation binding proteins (YTHDF1, YTHDC2, and IGF2BP1)3. Another type of RNA modification, 5-Methylcytosine (m5C) is also widespread in various RNAs but most abundant in eukaryotic tRNAs and rRNAs. It is mainly catalyzed by methyltransferase of DNMT2 and NSUN family, demethylated by methylcytosine dioxygenase TET24. ALYREF recognizes m5C-methylated mRNA, and YBX1 interacts with m5C methylated site to regulate maternal mRNA stabilization. Deregulation of m6A and m5C regulators is reported in diverse human cancers and is intimately linked to oncogenic or tumor-suppressive functions, such as processes of proliferation, spreading, invasion, and immunity5,6. For example, METTL3 and METTL14 have been implicated as overexpressed and act as oncogenes in most cancer types. METTL3 could methylate BCL-2 and c-Myc mRNAs to increase their stability and expression, thus suppressing cancer cells apoptosis, and promoting disease progression7. m6A demethylase FTO displays the oncogenic function via reducing the tumor suppressor mRNA level, resulting in cancer cell differentiation and tumor growth8. The m6A regulators YTHDF1 and IGF2BP1 are overexpressed in ESCC. Increased expression of YTHDF1 and HNRNPC in ESCC could be utilized as a prognostic predictor9. IGF2BP1 interacted with cofactors RPS15 to promote core proteins translation of p38 MAPK signaling in an m6A-dependent manner10. In the case of m5C, the level of m5C is closely related to carcinogenesis, and NSUN2 could promote cancer cell proliferation via up-regulating the m5C levels. In ESCC patients, NSUN2 promotes cancer development and chemo-resistance via the mRNA-m5C modification of cancer-related genes and enhances their expression, such as TIGAR11 and GRB212. Mechanically, elevated GRB2 levels increased the activation of downstream pathways PI3K and ERK in the LIN28B-dependent manner12. TIGAR activates the pentose phosphate pathway to generate more reductants, thus protecting cancer cells from ROS damage11.
Long non-coding RNAs (lncRNAs) are RNA molecules that do not encode proteins but exert a precise regulatory function via interacting with various target mRNAs, regulatory factors, and sponging microRNAs. The tissue-specific and condition-specific expression modes of lncRNAs implicated their potential use as biomarkers for early diagnosis and therapy outcome monitoring in cancer. m6A and m5C can also modify lncRNAs, thereby affecting tumor genesis and progression. In addition, risk models based on m6A or m5C-related lncRNAs were also developed for prognosis prediction. For example, Li et al. reported a prognostic risk model for low-grade glioma based on 8 m6A/m5C methylated lncRNAs signature that could predict survival and infiltration of immune cells13. By consensus clustering analysis of colorectal cancer samples, Song et al. established an m6A/m5C-related lncRNAs signature that displays accurate capability in predicting cancer prognosis, immune-stromal microenvironment, clinicopathological features, and immunotherapy efficacy14. Despite these findings, the advances in exploring the prevalence and molecular function of m6A/m5C methylated lncRNAs remain elusive.
In this study, we extracted transcriptome profiles from The Cancer Genome Atlas (TCGA) database, then conducted bioinformatics analysis to identify m6A/m5C -related lncRNAs signatures associated with ESCC prognosis. Moreover, we established the m6A/m5C lncRNAs-based RiskScore system for survival prediction and used the Gene Expression Omnibus (GEO) dataset for model validation. The association of RiskScore, immune microenvironment, and immune molecules were analyzed. Notably, we assess the predictive value of RiskScore in immunotherapy efficacy. Our findings highlight the potential role of m6A/m5C-related lncRNAs for clinical prognostic prediction in ESCC.
Methods
Data processing of transcriptomic profiles
| TCGA-ESCC | GSE53622 | ||
|---|---|---|---|
| OS | Alive | 65 | 54 |
| Dead | 31 | 66 | |
| Age | β€β60 | 61 | 60 |
| >β60 | 35 | 60 | |
| Gender | Male | 81 | 96 |
| Female | 15 | 24 | |
| Stage T | T1 | 7 | 8 |
| T2 | 34 | 14 | |
| T3 | 50 | 96 | |
| T4 | 5 | 2 | |
| Stage N | N0 | 52 | 58 |
| N1 | 29 | 40 | |
| N2 | 5 | 18 | |
| N3 | β | 4 | |
| NX | 10 | β | |
| Stage M | M0 | 85 | β |
| M1 | 5 | β | |
| MX | 5 | β | |
| Stage | I | 7 | 8 |
| II | 56 | 60 | |
| III | 27 | 52 | |
| IV | 5 | β | |
| Tobaco | Yes | β | 69 |
| No | β | 52 | |
| Alcohol | Yes | β | 64 |
| No | β | 56 |
Identification of m6A/m5C-associated lncRNAs
We obtained m6A or m5C-related genes from previously published sources16. We classified the genes from transcriptome sequencing as either lncRNAs or mRNAs based on annotation information. To assess the expressed correlations between lncRNAs and m6A/m5C-related genes, we conducted Spearman's correlation analysis. We screened for m6A/m5C-lncRNA pairs with an absolute correlation coefficient greater than 0.3 and a p-value less than 0.05, selecting the corresponding lncRNAs as candidate molecules.
LncRNA target genes prediction
LncRNAs modulate the target gene expression through co-localization and co-expression mechanisms. Here, we predicted the lncRNA target genes according to previous research17,18. We screened for protein-coding genes located within 10 kb of each lncRNA to explore their potential function. A correlation matrix was generated by computing the coefficient between all pairs of lncRNAs and mRNAs. We designated genes with an R2 > 0.95 as lncRNA target genes.
Furthermore, we utilized the online tools lncATLAS (http://lncatlas.crg.eu/β)19 and lncSLdb20 (http://bioinformatics.xidian.edu.cn/lncSLdb/index.jspβ) to determine the cellular localization of m6A/m5C-associated lncRNAs.
We performed gene ontology (GO) and Kyoto encyclopedia of genes and genomes (KEGG) analysis on these lncRNA target genes using the clusterProfile package, and the parameters were set as follows: pAdjust Methodβ=β'BH', p-value Cutoffβ=β0.05, and q-value Cutoffβ=β0.2.
Screening of prognosis-related lncRNAs
To classify ESCC samples from the TCGA dataset, we used the ConsensusClusterPlus package to generate a consistency matrix based on the expression of m6A/m5C-associated lncRNAs. Differential expression analysis of lncRNAs in the TCGA dataset was performed on log2-transformed TPM count data using the IOBR package21. Clustering consistency was evaluated using the k-means algorithm, with parameters including Euclidean distance, seed = 1100, and k.max = 10.
The survival package was employed for Cox analysis of m6A/m5C-associated lncRNAs in the TCGA-ESCC dataset, and a significance threshold of P < 0.05 was applied to identify prognostic genes.
Construction of m6A/m5C-lncRNAs RiskScore model
Using the samples and corresponding genes in the TCGA-ESCC dataset, we conducted lasso analysis with the cv.glmnet function of the lars package. In the lasso analysis, parameters were set as βfamilyβ=βcoxβ, and genes with a coefficient value not equal to zero were retained. The RiskScore was defined as the sum of the product of the lasso score coefficient and gene expression, formulated as follows:
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\text{RiskScore}}= \sum_{i=1}^{n}expression \,\,of \,\,gene\,\, i \,\,* \,\,lasoo \,\,coefficient \,\,of \,\,gene\,\, i$$\end{document}RiskScore=βi=1nexpressionofgeneiβlasoocoefficientofgenei or, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\text{RiskScore}}= \sum_{i=1}^{n}{exp}_{i}*{coef}_{i}$$\end{document}RiskScore=βi=1nexpiβcoefi, where the expi represents the ith gene expression value (log2(TPM + 1)), and coefi represents the lasso regression coefficient of the ith gene.
To validate the prediction accuracy of the RiskScore, we classified ESCC samples in the TCGA dataset into high- and low-RiskScore groups based on the median RiskScore. KaplanβMeier curves were used to predict overall survival times between these groups, and receiver operating characteristic (ROC) curves were generated with corresponding AUC values calculated to assess predictive ability. The performance of the RiskScore model was also validated using the GSE53622β dataset.
Immune landscape analysis
To evaluate the infiltration of immune cells and stromal cells in different clusters and subgroups, we calculated the ESTIMATE immune scores and stromal scores. Additionally, we utilized the ssGSEA, CIBERSORT, and xCell methods to assess the composition of the tumor microenvironment.
Statistical analysis
Correlation analysis
The Peasron correlation test was performed using the R function corr.test. The P values of the association tests were corrected using FDR. In this paper, Pearson's correlation coefficient was used for testing. Pearson's correlation coefficient is a method used to measure the linear correlation between two variables. Its value is between -1 and 1. When the correlation coefficient is positive, it indicates that there is a positive correlation between the two variables. When the correlation coefficient is negative, it indicates that there is a negative correlation between the two variables. When the correlation coefficient is 0, it indicates that there is no linear correlation between the variables. (ns: p > 0.05, *p β€ 0.05, **p β€ 0.01,***p β€ 0.001,****p β€ 0.0001).
KaplanβMeier survival estimates
KaplanβMeier survival estimation is a nonparametric method that estimates the probability of survival based on the observed survival time. KaplanβMeier curves can be used to visualize survival differences across multiple categories, and the KM curve is a step function that reflects the cumulative probability of survival over time. The curve is horizontal during periods when no events occur and then decreases vertically as the survival function changes with each event.
In conclusion, Spearman's correlation was computed using the corr.test in R, with P values corrected using FDR. Wilcoxon tests were employed for pairwise comparisons. KaplanβMeier curves were generated for prognosis analysis, with the log-rank test determining significant differences. Multivariate Cox regression analysis was conducted to identify independent predictive factors, including RiskScore, age, gender, and pathology stage. All statistical analyses were two-sided, and a significance level of P < 0.05 was applied.
Results
Screening m6A/m5C-related lncRNAs
For these lncRNAs, we predicted their potential target genes by examining coding genes located 10 km upstream and downstream of each lncRNA, obtaining 758 target genes. We performed functional analysis, and found these m6A/m5C-lncRNA target genes were significantly enriched in the olfactory transduction pathway (Fig. 1D). Moreover, these genes demonstrated relevance to multiple biological process, cellular component, and molecular function, such as metabolism-related process, fibrinogen complex, chromatoid body, olfactory receptor activity and odorant binding (Fig. 1EβG).
In addition, we utilized the LncSLdb/lncATLAS database to predict the subcellular localization of lncRNAs. Our findings indicated that 61.55% of m6A/m5C-lncRNAs were localized in the ribosome, while 33.17% of lncRNAs were situated in the cytoplasm. Moreover, 4.79% of m6A/m5C-lncRNAs were found in exosomes, and 0.5% were identified in the nucleus (Fig. 1H).

Identification of m6A/m5C-related lncRNAs in ESCC. (,) Top 30 m5C-related lncRNAs and top 30 m6A-related lncRNAs. () Overlapping lncRNAs between m5C- and m6A-related lncRNAs. () Functional enrichment analysis revealing KEGG pathways enriched by the lncRNA target genes. (β) Enrichment of GO biological process, cellular component, and molecular function. () Cellular localization of m6A/m5C-lncRNAs. A B C D E G H
Molecular classification of ESCC case based on m6A/m5C-lncRNAs expression
We further investigated the differences in various clinical features among the subtypes within the TCGA-ESCC dataset. The matrix results indicated that the m6A/m5C-lncRNAs expression in cluster 3 was relatively lower. In contrast, cluster 2 displayed moderate expression, while cluster 1 exhibited the highest expression levels of m6A/m5C-lncRNAs. Additionally, differences were observed in clinical pathological parameters, including age, grade, immune subtype, and gastric classification, across the three clusters (Fig. 2D).
Notably, significant differences were identified among the three clusters in the hallmark gene set. For instance, samples in cluster 1 displayed relatively lower GSVA enrichment scores across the 50 hallmark gene sets, while cluster 3 exhibited high scores. A total of 32 hallmark gene sets exhibited significant differences between the three clusters (Fig. 2E).
We analyzed immune landscape differences among the three distinct clusters using the xCell, CIBERSORT, and ssGSEA methods for the TCGA-ESCC samples. Our findings indicated that the infiltration ratio of CD4 + memory T cells, CD8 + T cells, and other T cell-related cell types was significantly lower in cluster 3, as evident in the xCELL results (Fig. 2F,G). Furthermore, a total of 17 immune checkpoint genes exhibited significant expression difference among the three clusters. Notably, most of these genes were markedly upregulated in cluster 1 compared to cluster 3 (Fig. 2G).

Identifying three clusters of ESCC cases by consensus clustering in the TCGA set. () Consensus clustering analysis and relative change in the area under the CDF curve. () Heat map of consensus matrix. () Prognosis analysis among the three clusters. () Heatmap of the clinical features among the three clusters in ESCC cases. () GSVA analysis displaying the key pathways enriched by 50 hallmark gene sets across three clusters. (β) Box plots illustrating differences in immunity landscape and immune checkpoint gene expression among three clusters. A B C D E F G
Construction of m6A/m5C-lncRNA-based prognostic signature
In the training set, we observed that patients in the high-RiskScore subtype had a worse prognosis than those in the low-RiskScore group (Fig. 4D,FβH). Furthermore, ROC curve analysis indicated that this model exhibited promising predictive ability for survival in TCGA-ESCC samples (AUC value at 1-, 3-, and 5- years were respectively 0.805, 0.921 and 0.903; Fig. 4E).

Survival analysis results of prognostic lncRNAs between high- and low- expression groups of ESCC samples.

Construction of lncRNAs signature-based RiskScroe system for prognosis prediction. (,) Construction of the RiskScore model using the lasso method. () Lasso coefficients of the 10 prognostic genes. () Survival analysis of ESCC patients in high- and low- RiskScore cohorts. () ROC curve of the RiskScore at 1-, 2- and 3-year follow-up. (β) Distribution of RiskScore, status, and expression of m6A/m5C-lncRNAs in the TCGA. A B C D E F H
| Gene | coef | p.value | Hazard_ratio | Lower_.95 | Upper_.95 | Log rank pvalue | Wald pvalue | Likelihood_pvalue | HR |
|---|---|---|---|---|---|---|---|---|---|
| LINC00898 | ββ0.291339 | 0.00167241 | 0.74726232 | 0.62311853 | 0.89613926 | 0.00125613 | 0.00167241 | 0.00292077 | 0.75 (0.62βββ0.9) |
| LINC00847 | 1.1596129 | 0.00240528 | 3.18869869 | 1.50795139 | 6.74278988 | 0.00221678 | 0.00240528 | 0.00105983 | 3.19 (1.51βββ6.74) |
| LINC01415 | ββ0.3823088 | 0.00261062 | 0.68228435 | 0.53193629 | 0.87512723 | 0.00308498 | 0.00261062 | 0.00876798 | 0.68 (0.53βββ0.88) |
| LINC00993 | ββ0.2139095 | 0.01601545 | 0.80742144 | 0.67842728 | 0.96094218 | 0.01004049 | 0.01601545 | 0.00552336 | 0.81 (0.68βββ0.96) |
| EGFR-AS1 | ββ0.3346057 | 0.00933494 | 0.71562018 | 0.55605549 | 0.92097327 | 0.01120361 | 0.00933494 | 0.01302588 | 0.72 (0.56βββ0.92) |
| LINC01602 | ββ0.201695 | 0.01547201 | 0.81734415 | 0.69421638 | 0.96231013 | 0.01305807 | 0.01547201 | 0.00907101 | 0.82 (0.69βββ0.96) |
| XIST | ββ0.1318115 | 0.02867726 | 0.87650617 | 0.77888553 | 0.98636197 | 0.02146775 | 0.02867726 | 0.01243166 | 0.88 (0.78βββ0.99) |
| MIFβββAS1 | ββ0.4483249 | 0.02921312 | 0.63869717 | 0.42686483 | 0.95565162 | 0.02690006 | 0.02921312 | 0.03236641 | 0.64 (0.43βββ0.96) |
| TTTY15 | 0.16123551 | 0.03157228 | 1.17496165 | 1.01433849 | 1.3610199 | 0.02879093 | 0.03157228 | 0.02035923 | 1.17 (1.01βββ1.36) |
| LINC00208 | ββ0.6352842 | 0.0648618 | 0.52978488 | 0.26989787 | 1.03991935 | 0.03195375 | 0.0648618 | 0.009065 | 0.53 (0.27βββ1.04) |
| LINC01393 | 0.36344936 | 0.02849482 | 1.43828203 | 1.03897716 | 1.99104973 | 0.03489665 | 0.02849482 | 0.01637473 | 1.44 (1.04βββ1.99) |
| LINC01037 | ββ0.1297853 | 0.04422671 | 0.87828402 | 0.77397328 | 0.99665304 | 0.03599402 | 0.04422671 | 0.02655686 | 0.88 (0.77βββ1) |
| LINC00942 | 0.12213836 | 0.04113541 | 1.12991042 | 1.00492827 | 1.27043652 | 0.03952849 | 0.04113541 | 0.04785422 | 1.13 (1βββ1.27) |
| LINC00638 | ββ0.3722211 | 0.0409009 | 0.68920183 | 0.48236903 | 0.98472153 | 0.04014923 | 0.0409009 | 0.04277191 | 0.69 (0.48βββ0.98) |
| LINC00415 | ββ0.1628923 | 0.05239957 | 0.8496827 | 0.7207415 | 1.00169158 | 0.04029506 | 0.05239957 | 0.02560894 | 0.85 (0.72βββ1) |
| LINC01310 | ββ0.3170921 | 0.05800911 | 0.72826368 | 0.52469096 | 1.01081977 | 0.04475305 | 0.05800911 | 0.02244852 | 0.73 (0.52βββ1.01) |
| LINC01036 | ββ0.1226793 | 0.05475261 | 0.88454726 | 0.78047092 | 1.00250225 | 0.0485204 | 0.05475261 | 0.04237056 | 0.88 (0.78βββ1) |
Validation of the RiskScore model
Furthermore, ESCC patients in the TCGA dataset could be classified into different subgroups based on clinical features, such as stage, grade, and age. RiskScore assessments were performed between different clinical feature groups and the previously identified three clusters. Notably, patients in the alive and female subgroups presented a low RiskScore, indicating a significant correlation between RiskScore and survival status and gender. Moreover, a significant difference in RiskScore was observed between the G1 and G2 grade subgroup (P < 0.05, Fig. 6B). Patients in cluster 1 displayed the lowest RiskScore than the other two clusters (P < 0.05, Fig. 6D). Sankey diagram showed the consistency between cluster and RiskScore for survival status prediction, which exhibited a consistent result to the RiskScore assessments (Fig. 6C,D).

Prognostic value of RiskScore models in an external validation set. () Survival analysis of two subgroups via thedataset. () ROC curve analysis of the RiskScore model in the validation set. (β) Distribution of RiskScore, status, and the expression level of lncRNAs in thedataset. A B C E GSE53622 GSE53622

Prognostic efficiency prediction of m6A/m5C-lncRNA signatures. () Predictive value of RiskScore in clinicopathological subgroups. () RiskScore assessments across different clinicopathological groups, including survival status, age, grade, gender, stage, somatic copy number alterations (SCNA) levels, and gastric classification. () Sankey diagram visualized the correlation ship of clusters, RiskScore group and survival status. () Risk score assessments between different cluster cohorts. A B C D
Identification of RiskScore as an independent prognostic factor
We constructed a nomogram model that integrated clinical characteristics and RiskScore for predicting survival in the TCGA (Fig. 7C). To gauge the predictive accuracy, we employed ROC curves, revealing AUC values of 0.518 and 0.648 for age and stage, respectively (Fig. 7G). The nomogram prediction and RiskScore possessed higher AUC values (0.886, 0.818) than other clinically relevant prognostic factors in the TCGA cohort. It is worth noting that due to limited data, we refrain from drawing generalized conclusions beyond 3 years. The calibration curves demonstrated consistency with the standard curve in predictive outcome at 0.5, 1, and 2 years (Fig. 7DβF). These findings revealed nomogram model exhibited reliable prognostic predictive power compared with other clinical characteristics in TCGA cohorts.

Cox regression analysis and nomogram model construction for m6A/m5C-lncRNAs prognostic signature. (,) Cox regression analysis for the survival time in TCGA cohorts. () The nomogram model predicting survival outcomes of ESCC. (β) ROC and calibration curves for predicting the survival time in the TCGA cohort. A B C D G
Association of RiskScore and several highly trustworthy indices
Cancer progression encompasses the transition towards a dedifferentiated oncogenic phenotype and the acquisition of stem cell-like features. A previous study employed the innovative logistic regression OCLR algorithm to introduce two stemness indices for quantifying tumor microenvironment stemness22: mRNA expression and the methylated DNA-based stemness index (mRNAsi and mDNAsi). These indices reflect the gene expression and epigenetic characteristics of stem cells. Epigenetic regulation-based mRNAsi and mDNAsi (EREG-mRNAsi and EREG-mDNAsi) were generated via reconstructing the gene regulatory network from methylation and transcriptome data22. Additionally, single nucleotide variants (SNVs) represent the most prevalent genomic mutations in tumor cells, leading to the production of variant peptides distinct from wild-type proteins, which are subsequently presented by MHC-I23.

Correlation analysis between RiskScore and several clinical trustworthy indices. *Pβ<β0.05; **Pβ<β0.01; ***Pβ<β0.001.
Correlation between lncRNA signature and immune cell proportion
Additionally, we analyzed the expression of immune checkpoint genes between the high and low RiskScore groups. Our findings revealed that five immune checkpoint genes exhibited significantly abnormal expression between the two groups, with their expression levels in the low-risk group being notably higher than those in the high-RiskScore group (Fig. 9B).
Moreover, we assessed the predictive value of RiskScore in immunotherapy efficiency using the TIDE method on TCGA-ESCC data. The results indicated that the low-RiskScore group was more likely to respond to immunotherapy compared with high- RiskScore patients (P = 0.025, Fig. 9C). Similarly, we analyzed the proportions of patients who responded to immunotherapy in the high and low-RiskScore groups using the IMvigor210 dataset. Our findings revealed a significant difference in the percentage of drug response patients between the two subgroups (P = 0.032, Fig. 9D), underscoring the reliable predictive performance of the RiskScore model in immunotherapy response.

Association of RiskScore, infiltrated immune cell and immunotherapy response via TCGA-ESCC datasets. (,) Differences in immune cell infiltration and immune checkpoint gene expression between two subgroups. *Pβ<β0.05; **Pβ<β0.01; ***Pβ<β0.001. () Differences in immunotherapy efficacy between two risk subgroups in TCGA-ESCC cohorts. () Immunotherapy response between high and low-RiskScore groups in the IMvigor210 datasets. A B C D
Discussion
In the treatment of patients with cancer, immunotherapy has emerged as a pivotal therapeutic avenue in the management of gastrointestinal cancers due to its potential to yield rapid and substantial therapeutic benefits for afflicted patients. However, it remains a huge challenge to screen the dominant populations likely to mount a robust response to immune checkpoint inhibitors (ICIs). Here, our findings unveil a strong correlation between low-RiskScores and favorable responses to immunotherapy, signifying the potential use of this system in patient stratification for enhanced therapeutic outcomes. LncRNA represents a diverse category of RNA molecules with intricate roles in regulating gene expression by engaging gene regulatory proteins and microRNAs. Similar to protein-cording RNAs, lncRNAs also undergo RNA methylation that contributes to their regulatory functions in the context of tumorigenesis.
In this study, we analyzed the transcriptomic profiles from public databases and uncovered 606 m6A/m5C-related lncRNAs in ESCC. By consensus clustering analysis, we classified these ESCC samples into three clusters based on lncRNA expression profiles and found that cluster1 was associated with a significant better prognosis than cluster 2 and cluster 3. The survival difference reason was investigated by comparing the immune landscape. As we expected, patients in cluster 1 displayed higher immune cell proportions in CD4 memory T cells and CD8 T cells. Furthermore, most of the immune checkpoint genes were notably increased in cluster 1 than the other two subgroups, indicating that immunity plays a major role in ESCC prognosis.
We employ the lasso method and survival analysis to identify ten m6A/m5C-related lncRNAs associated with ESCC prognosis, including LINC00847, TTTY15, and LINC00942. Notably, LINC00847 has been extensively studied in various cancer types. It emerges as a key player in laryngeal squamous cell carcinoma, and its overexpression has been linked to enhanced cell proliferation and cell cycle progression. Additionally, elevated LINC00847 levels are correlated with lymph node metastasis and poor differentiation, suggesting its potential as a prognostic biomarker. Our results further corroborate the significance of LINC00847 in ESCC survival, consistent with previous research. Mechanistically, LINC00847 is induced by the transcription factor E2F, exerting its influence on cancer progression through the modulation of downstream targets, including miR-147a24.
In addition, we developed a ten m6A/m5C lncRNAs-based RiskScore model, which had a promising ability in survival prediction for both TCGA cohorts and independent validation datasets. Patients stratified into high and low-RiskScore subgroups manifest conspicuous distinctions in key clinical characteristics, encompassing survival status, gender, tumor stage, and grade. Additionally, the RiskScore model unveiled significant associations with immune cell infiltration, notably affecting CD4β+βT cells and Tregs proportion. To further improve its clinical utility, we constructed a comprehensive nomogram model incorporating clinical variables and RiskScore. This model performed excellent in predicting the survival probability of ESCC.
Moreover, emerging evidence highlights that tumor-infiltrating immune cells were correlated with prognosis and immunity. The composition changes and functional activation of immune cell populations within the TME exert a profound impact on patient survival and their responsiveness to immunotherap25. For instance, augmented infiltration of CD4 + lymphocytes has consistently demonstrated a significant association with prolonged survival in ESCC26. In this study, we also characterized increased CD4 + T cell infiltration in low-RiskScore group, which displays a better prognosis. This observation aligns seamlessly with prior research, further reinforcing the notion that increased CD4 + T cell infiltration might be indicative of a more favorable prognosis.
Interestingly, we found the immune checkpoint genes were upregulated in the low-RiskScore group. Among these genes, CD226 deficiency restrains CD8 + T cell function, consequently curtailing the efficacy of cancer immunotherapy27. Aberrant expression of CD70 was linked to tumor progression and immunosuppression in the tumor microenvironment, and it can facilitate immune evasion through interacting with receptor CD2728. TNFRSF9 is an activation marker for tumor-infiltrating Tregs, and inhibition of TNFRSF9 boosts anti-cancer treatments via reducing the immune suppressive function of Tregs29. These compelling findings collectively imply a potential mechanism whereby m6A or m5C methylation on lncRNAs may exert regulatory control over immune checkpoint gene expression. Such regulation contributes to a positive response to immunotherapy in the context of ESCC.
More and more studies have confirmed that lncRNA has a close relationship with the immune microenvironment. Studies have confirmed that most m6A regulators play an important role in the immune microenvironment by activating the PI3K-AKT pathway and enriching infiltrating immune cells30. m6A phenotype-related genes were obtained by identifying the DEGs associated with m6A phenotype and overlapping them. m6A phenotype-related genes involved in immune regulation were screened out, and the enriched biological processes were significantly related to antigenic process, Fc signal transduction, immune cell proliferation regulation and cytokine-mediated pathways. Meanwhile, Zhang et al.31 searched for highly correlated gene clusters associated with immune cells through the algorithm WGCNA. The feature genes of related gene clusters were clustered, and the gene-lncRNA modules related to immune cell score and clinical features were finally determined. An additional integrative analysis of the single-cell RNA-seq dataset has unveiled a compelling close association between LINC00847 and prognosis in lung adenocarcinoma. LINC00847 positively correlates with the infiltration of various immune cell types, and its overexpression significantly down-regulates PDL1 expression in the in-vitro assay, thus casting it as a prospective candidate in tumor immunotherapy32. Furthermore, the male Y chromosome-linked lncRNAs TTTY15 assume a crucial role in carcinogenesis across diverse cancer types. Acting as an RNA sponge, TTTY15 engages several miRNAs to promote cancer progression, exemplified by its interaction with miR-29-3p in colorectal cancer33, miR-98-5p34 and miRNA let-7a-5p35 in gastric cancer. In prostate cancer, TTTY15 exhibits prominent upregulation in most tumor samples, exerting a pro-carcinogenic influence by sponging miRNA let-7, subsequently elevating CDK6 and FN1 expression36. Interestingly, TTTY15 assumes a contrasting suppressive role in NSCLC, wherein its overexpression inhibited cancer proliferation and metastasis through the modulation of TBX437.
Moreover, epigenetic modification of lncRNAs in regulating tumorigenesis and development has been reported. In breast cancer, it has been documented that LINC00942 directly engages methyltransferase METTL14, thereby facilitating METTL14-mediated RNA methylation processes downstream38. Moreover, LINC00942 has been implicated in promoting chemoresistance in gastric cancer by impeding the degradation of oncoprotein MSI2. This, in turn, enhances the stability of cβmyc mRNA, a phenomenon reliant on m6A modification39. Another lncRNA of interest, EGFR-AS1, arising from the reverse strand of lncRNA EGFR, has garnered attention for its overexpression in diverse cancer types. Elevated levels of EGFR-AS1 have been closely associated with unfavorable clinical features, encompassing pathological stage, lymph node metastasis, and overall survival40. Functionally, EGFR-AS1 serves as an oncogene, fostering cell proliferation, chemotherapy resistance, invasion, and stemness through intricate interactions with downstream miRNAs and signaling pathways. EGFR-AS1 notably stabilizes EGFR mRNA, consequently activating the PI3K/AKT pathway to promote proliferation and metastasis in renal cancer cells41.
Furthermore, EGFR activation has been linked to the up-regulation of PDL1 via the p-ERK1/2/p-c-Jun pathway, thereby inducing immune evasion in EGFR-driven cancer42. In squamous cell carcinoma, EGFR-AS1 emerges as a mediator of EGFR addiction, influencing treatment responses. Notably, EGFR-AS1 knockdown has been shown to reverse resistance to tyrosine kinase inhibitors43. In ESCC, EGFR-AS1 has been implicated in up-regulating ROCK1 expression by sponging miR-145, thus promoting cancer cell invasion and migration44. Additionally, lncMIFβAS1 has demonstrated its significance in positively regulating NDUFA4 expression in gastric cancer cells. This regulation is achieved through the sequestration of miRβ212β5p, which attenuates NDUFA4 mRNA degradation. Upregulation of NDUFA4 activates the oxidative phosphorylation pathway, ultimately promoting proliferation and inhibiting apoptosis in gastric cancer cells45. It is noteworthy that the biological roles of the remaining five lncRNAs remain unclear, thus providing new directions for future research endeavors. Collectively, these findings illuminate the indispensable role of m6A/m5C-lncRNAs in cancer development. Moreover, they posit these lncRNAs as promising candidates for novel prognostic biomarkers, holding potential implications for the prognosis of patients afflicted with ESCC.
In summary, our finding demonstrated a ten m6A/m5C-lncRNA signature implicated in ESCC progression and established a lncRNA signature-based RiskScore model for prognosis prediction. Moreover, we assessed the immune landscape and immune checkpoint gene expression for low-RiskScore patients, which might contribute to a beneficial therapeutic response from ICI. The RiskScore system might be a useful tool to determine the m6A/m5C-lncRNA signature application in clinical practice, thus promoting treatment decisions for selecting the patient subgroup that more cline benefits from ICI therapy.
Certain limitations should be mentioned in our context. Firstly, since the data were analyzed through TCGA and GEO databases, there is still a lack of verification of wet laboratory biochemical experiments. Second, our results of m6A/m5C-associated lncRNAs signature were not validated in a separate patient cohort, and the survival predictive value of the RiskScore model requires more external datasets validation for clinical application. Finally, the potential molecular mechanism of these m6A/m5C-related lncRNAs remained unclear in ESCC, and we plan to conduct in vitro or in vivo biological experiments to verify our results in future studies.
Conclusion
In this study, ten m6A/ m5C-lncRNA signals were identified to be associated with ESCC progression. Molecular subgroup analysis based on m5C- and m6A-associated lncRNAs was performed on ESCC samples, and m5C- and m6A-associated lncRNAs related molecular subtypes were obtained, which confirmed significant prognostic differences between subtypes. By establishing the RiskScore model of lncRNA associated with m5C and m6A, we found that there were significant differences between the high and low risk groups regardless of the tumor micro-environment landscape or immune-related genes. At the same time, to further improve the clinical utility of the model, a comprehensive column line model including clinical variables and risk scores was constructed. The model performs well in predicting the survival probability of ESCC. Therefore, we believe that the RiskScore model has a good prognostic power. In addition, RiskScore model also has generalized value in chemotherapy prediction. The results are remarkable, and the analysis methods are solid, diverse and comprehensive. However, this risk score model is built on a public database and needs to be validated in a clinical setting. Bias from some unmeasured clinical variables may have weakened the statistical validity of our study. In the future, we will verify our results in a clinical study.