What this is
- Cellular senescence (CS) is a key factor in aging and various diseases, characterized by permanent cell cycle arrest and inflammation.
- Identifying senescent cells is challenging due to their diverse signatures across different cell types and tissues.
- This research introduces SenePy, an open-source platform that uses single-cell transcriptomic data to analyze and score CS signatures.
Essence
- SenePy effectively identifies cell-type-specific signatures of cellular senescence using single-cell transcriptomics, enhancing our understanding of senescence across various tissues and conditions.
Key takeaways
- SenePy integrates 72 mouse and 64 human single-cell transcriptomic signatures to create a robust scoring platform for cellular senescence.
- SenePy signatures outperform traditional markers derived from in vitro studies, providing a more accurate representation of in vivo cellular senescence.
- The study identifies significant heterogeneity in senescent cell signatures across different tissues, emphasizing the need for context-specific markers.
Caveats
- The reliance on existing datasets may limit the comprehensiveness of the identified senescence signatures.
- Single-cell transcriptomics can suffer from dropout effects, potentially impacting the detection of low-abundance senescence markers.
Definitions
- Cellular senescence (CS): A state of permanent cell cycle arrest associated with aging and various diseases, often marked by inflammation.
- Senescence-associated secretory phenotype (SASP): A condition in which senescent cells secrete pro-inflammatory factors that can affect neighboring cells and tissues.
Introduction
Aging is a key risk factor for many chronic diseases1. One biological manifestation of organismal aging is cellular senescence (CS), a phenomenon characterized by permanent cell cycle arrest, impaired homeostatic cellular function, and activation of the senescence-associated secretory phenotype (SASP), which involves the release of pro-inflammatory proteins, proteases, and other bioactive paracrine factors2. Senescent cells accumulate in tissues with increasing organismal age, but senescent cells are found even in young organisms and can accrue prematurely due to exogenous stressors3,4. Accumulated senescent cells contribute to sterile inflammation, tissue remodeling, and local dysfunction, which ultimately drives various pathologies5. CS contributes to a wide array of chronic diseases, including cardiovascular disease, neurodegeneration, and diabetes6–8. The senescent cell burden in aged organisms also contributes to unchecked inflammation and poor outcomes in acute diseases, such as coronavirus infection9. Targeted clearance of senescent cells with senolytics can mitigate disease severity and increase healthspan7–10, but their elimination may exacerbate disease in some contexts11. Despite the growing understanding of the role of CS in aging and various diseases, in vivo CS remains poorly phenotypically and mechanistically characterized2,12. Most CS markers have been identified in cultured cells subjected to experimental conditions that may not accurately represent a living system. More comprehensive markers are required to robustly study CS in living systems.
One of the biggest challenges in studying in vivo CS is the high degree of heterogeneity, as CS involves a multitude of changes in cellular function2. CS has been observed in numerous cell types across all major organs. Senescent cells partially lose their pre-senescence cell identities and phenotypes, suggesting that the mechanistic paths to the senescent states vary between cell types13,14. Cells are also subject to various cell-intrinsic and extrinsic stressors that drive CS. Telomere attrition is a well-known CS trigger, but telomere-independent DNA damage, oxidative stress, and oncogenic signaling can also induce cellular senescence5. Paracrine factors released in the SASP state and cell-surface signaling from senescent cells can induce secondary CS among otherwise healthy cells in close proximity15,16. Drivers of CS shift the transcriptional landscape of senescent cells13,14, but this has been primarily studied in cultured cells. The tissue context or cell identity-specific transcriptional landscapes of CS have not been fully defined. For example, it is unclear whether the CS transcriptional programs in skin fibroblasts exposed to UV light differ from those of fibroblasts in internal organs protected from light. The heterogeneity arising from different stressors, tissues, and cell types makes it difficult to broadly apply transcriptional signatures of CS derived from in vitro cultured cells that have been removed from their in situ tissue environment. There are no universal signatures or markers of CS2. Even the cell cycle arrest inducer p16ink4a (encoded by the gene CDKN2A), which is widely accepted as one of the most specific markers of CS, is not always required for CS induction and its use as a sole marker in transcriptomics data is confounded by the fact that the corresponding CDKN2A locus encodes for multiple genes with overlapping sequence identity17–19. 
Other CS markers may be constitutively expressed in some cell types or upregulated in general with organismal age or inflammation. Recent work has utilized literature screening and transcriptomics to find a gene set that is broadly differentially abundant in senescent cells20, but this does not account for tissue-, cell-, or stress-specific heterogeneity and may not capture all programs of CS. There remains a need to identify and characterize tissue- and cell-specific CS programs21.
In this study, we aggregate and interrogate large-scale single-cell RNA-sequencing datasets across tissues and ages in mice and humans to define in vivo CS heterogeneity. We developed an algorithmic approach to identify cell-type-specific CS signatures. We used a p16ink4a reporter mouse model dataset and other transcriptomics datasets to validate our approach. We have generated the open-source Python package SenePy (https://github.com/jaleesr/senepy). SenePy allowed us to map the kinetics of CS in many tissues and cell types with respect to organismal age and in the context of disease. Using SenePy we were able to identify senescent cells across several tissues and examine similarities as well as tissue-specific and cell-type-specific signatures of cellular senescence.
Results
Known cellular senescence markers are cell-type-specific and poorly characterize in vivo cellular senescence
We examined the expression of established CS markers in comprehensive mouse and human single-cell atlases to determine their dynamics with age and their cell-type-specificity. For mouse data, we utilized the Tabula Muris Senis22 resource, which comprises 328K cells from 19 tissues collected between 1 and 30 months of age (Supplementary Fig. 1A). The human data utilized in this study are derived from 7 studies23–29 that span 37 tissues from individuals aged 1 to 92 years old, altogether comprising 1.6M cells (Supplementary Fig. 1B). We took the union of SenMayo20 (n = 125), which is a recently published list of cellular senescence-associated genes, and a self-curated CS gene set (n = 108 human, n = 110 mouse) to establish a panel of experimentally validated CS marker genes (181 human, 184 mouse). The independently generated gene set had significant overlap with SenMayo but contained many unique genes (intersection = 52 human, 51 mouse; hypergeometric p-value = 5.5 × 10−97 and 2.85 × 10−79, respectively). This panel of known or validated CS markers served as a starting point for our downstream analyses.
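The overlap test quoted above can be illustrated with a small, standard-library sketch of a one-sided hypergeometric test. The function name and the universe size of 20,000 genes are assumptions made for illustration; the text does not state the background gene universe, so the value printed here will not reproduce the reported p-values.

```python
from math import comb

def hypergeom_overlap_pvalue(universe: int, size_a: int, size_b: int, overlap: int) -> float:
    """P(X >= overlap) when size_b genes are drawn without replacement
    from a universe that contains size_a marked genes."""
    total = comb(universe, size_b)
    upper = min(size_a, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Human set sizes from the text: SenMayo (125), curated set (108), overlap 52.
# The universe of 20,000 genes is a hypothetical background, not from the source.
p = hypergeom_overlap_pvalue(universe=20_000, size_a=125, size_b=108, overlap=52)
print(f"{p:.3g}")
```

The same function applies to the pairwise signature comparisons reported later, provided the background universe matches the genes that were actually testable in each dataset.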
We next analyzed the aging dynamics of CS markers across all tissues and cell types in the mouse and human datasets with respect to the proportion of positive cells. Some of the most widely used markers of CS, such as Cdkn2a and Cxcl13, showed an overall increase in the proportion of cells expressing these genes with age (FDR p-values = 0.01, 0.03) but had tropism for specific tissue and cell types (Fig. 1C and Supplementary Fig. 3A). Human skin from the face, for example, had appreciable levels of CDKN2A + cells in young individuals and a large increase with age (Student’s t test, one-tailed, p-value = 0.002) (Supplementary Fig. 3B). However, other important markers of CS, such as CDKN1A (p21cip1 encoding gene), were more constitutively expressed in young and old mice and humans (Fig. 1D and Supplementary Fig. 3C).
We examined the dynamics of cellular senescence markers in 60 mouse and 50 human cell types (Supplementary Data 1, 2). Overall, the landscape of all CS markers was highly heterogeneous when stratified by tissue and cell type in both species (Fig. 1E, F). Only 58 of 1540 pairwise combinations of cell types showed significant CS marker gene overlap (Hypergeometric, FDR p-value < 0.05), thus highlighting how CS program transcriptional profiles differ widely between tissues and cell types. Cells from the same tissue were most likely to have significant marker overlap (Chi-square, p-value = 3.2 × 10−12). While fibroblasts from different tissues showed some overlap in senescent cell marker profiles, this was not statistically enriched (Chi-square, p-value = 0.8). The marker found to increase dynamically with age in the largest number of human cell types was CCL4, but this marker was only dynamic in 18 of 50 (36%) of the cell types tested. Ccl5 and Ccl8 were the most universally dynamic in mice but only in 33% of cell types. CDKN2A was one of the most enriched CS markers in both humans and mice, but it was only dynamic in 26% of human and 32% of mouse cell types. These data indicate that there is no universal CS marker gene set for all tissues and cell types and that each cell type within each tissue takes distinct transcriptional paths to the CS state. Instead, our data maps the suitability of known CS markers in different organisms, tissues, and cells.

Known markers are cell-type-specific and poorly characterize in vivo cellular senescence. (A) There is an insignificant overlap between a universal organismal aging signature and previously reported senescence markers (p = 0.58, Hypergeometric). The universal signature comprises genes present in at least 50% of the cell-specific gene sets previously elucidated by differential expression between young and old cells. Created in BioRender. Rehman, J. (2025) https://BioRender.com/w72f903. (B) A histogram depicting the 24-month to 3-month ratio (old to young ratio) of all cells expressing every gene in the Tabula Muris Senis dataset. Genes to the right of the dashed line have a statistically significant increase based on random permutations. Statistically significant known senescence markers are labeled. (C, D) UMAPs (right) representing CDKN2A+ (p16ink4a encoding gene) cells from the mouse and human datasets. Cells from 24- and 30-month mice are denoted old while 1- and 3-month mice are young. UMAPs (left) show all cells in the datasets and are labeled by broad cell classifications. Bar graphs show the percentage of CDKN2A+ cells relative to all cells in the respective datasets (* = FDR p-value = 0.011, n = 1000 random permutations). (E, F) Cell-specific maps of marker dynamics in mice and humans. Vertical dashed lines represent the start of a tissue, and cell types from that tissue are classified and depicted by marker shape. Multiple cell types belonging to the same class are overplotted. The gain represents the percent increase of cells expressing the marker relative to young organisms. Only statistically significant genes are shown. Bar plots colored by senescence-associated gene function depict the percentage of cell populations in which the respective gene is a suitable marker.
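The caption above refers to significance "based on random permutations". A minimal sketch of such a label-permutation test follows. It is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, cells are represented as 0/1 detection flags, and the statistic is the difference in positive fractions rather than the old-to-young ratio, which avoids division by zero when no young cell expresses the gene.

```python
import random

def positive_fraction(flags):
    """Fraction of cells in which the gene was detected (flags are 0 or 1)."""
    return sum(flags) / len(flags)

def permutation_pvalue(young, old, n_perm=1000, seed=0):
    """One-sided permutation p-value for an increase in the fraction of
    gene-positive cells in old versus young animals."""
    rng = random.Random(seed)
    observed = positive_fraction(old) - positive_fraction(young)
    pooled = list(young) + list(old)
    n_young = len(young)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign age labels at random
        diff = positive_fraction(pooled[n_young:]) - positive_fraction(pooled[:n_young])
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Toy data: the gene is detected in half of the old cells and in no young cell.
p_toy = permutation_pvalue(young=[0] * 200, old=[1] * 100 + [0] * 100)
```

With 1000 permutations the smallest attainable p-value is 1/1001, so any multiple-testing correction across genes has to work within that resolution.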
De novo cell-type-specific signatures derived algorithmically from single-cell transcriptomes
Cell-type signatures were highly heterogeneous, but known markers of CS were enriched in selected cell types, although not in a consistent manner (Fig. 2C, D). The signatures were enriched for known CS markers at a higher rate than gene sets derived from differential expression analyses (Supplementary Fig. 2d). Importantly, similar gene expression signatures (Hypergeometric FDR < 0.05) were more likely to be found between cells from the same tissue (Chi-square, p-value = 1.7 × 10−8) than between cells of the same cell type (Chi-square, p-value = 0.004) within Tabula Muris Senis. Signatures from cells within the same tissue had a higher average cosine similarity of 0.09 compared to 0.04 for those derived from different tissues (Mann-Whitney, p = 9.6 × 10−7). However, there were some exceptions in which signatures from cell types found in multiple tissues shared high similarity. For example, senescent fibroblasts found in mouse lungs were most similar to senescent fibroblasts from mouse tracheas and shared the overall highest similarity between any two mouse senescent cell type signatures. Conversely, fibroblast signatures taken from all of the tissues were not more likely to be similar to each other (Chi-square, p-value = 0.4). Many cell-type signatures contained overlapping genes despite the high overall signature heterogeneity (Fig. 2E). There were 368 of 903 (40%) mouse cell-type-signature pairs that shared significant hypergeometric overlap compared to 58 out of 1540 (4%) when using established markers. Based on their genetic profile, cell-type signatures clustered distinctly with each other but not with organismal aging signatures.
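To make the cosine similarity comparison concrete, here is a minimal sketch that treats each signature as a mapping from gene to weight. The dictionary representation, the function name, and the toy weights are assumptions for illustration; the text does not specify how signature vectors were constructed.

```python
from math import sqrt

def cosine_similarity(sig_a: dict, sig_b: dict) -> float:
    """Cosine similarity between two signatures given as gene -> weight maps.
    Genes absent from a signature are treated as weight 0."""
    shared = sig_a.keys() & sig_b.keys()
    dot = sum(sig_a[g] * sig_b[g] for g in shared)
    norm_a = sqrt(sum(w * w for w in sig_a.values()))
    norm_b = sqrt(sum(w * w for w in sig_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical toy signatures that share one of two genes.
lung = {"Cdkn2a": 1.0, "Ccl5": 1.0}
trachea = {"Cdkn2a": 1.0, "Il6": 1.0}
similarity = cosine_similarity(lung, trachea)  # close to 0.5
```

Because most gene pairs between two signatures do not overlap, average similarities as low as the 0.04 to 0.09 reported above are plausible even when a few genes are shared.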
These observations indicate that the signatures we derived share some underlying genetic characteristics despite being highly distinct and that these signatures likely represent bona fide in vivo CS programs. Therefore, we developed the open-source SenePy Python software package to score single cells based on their expression of these CS signature genes. SenePy rapidly processes thousands of cells and provides relative senescent scores for every cell within a given population based on the selected signatures (Methods: Scoring cells using SenePy).
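As a rough illustration of what scoring a single cell against a signature can look like, the sketch below computes a weighted fraction of signature genes detected in the cell. This is not SenePy's implementation or API: the function name, the dictionary inputs, and the weighting scheme are all hypothetical, chosen only to show the idea of a binary (detected or not) score.

```python
def binary_signature_score(cell_counts: dict, signature: dict) -> float:
    """Weighted fraction of signature genes detected in one cell.

    cell_counts: gene -> raw count for a single cell (hypothetical input).
    signature:   gene -> weight, e.g. a network-centrality weight (assumed).
    """
    total_weight = sum(signature.values())
    if total_weight == 0:
        return 0.0
    detected = sum(
        weight for gene, weight in signature.items()
        if cell_counts.get(gene, 0) > 0
    )
    return detected / total_weight

# Toy example: two of three signature genes are detected in the cell.
signature = {"Cdkn2a": 2.0, "Ccl5": 1.0, "Mmp9": 1.0}
cell = {"Cdkn2a": 3, "Mmp9": 1, "Alb": 40}
score = binary_signature_score(cell, signature)  # (2.0 + 1.0) / 4.0 = 0.75
```

A binary score of this kind ignores expression magnitude, which makes it less sensitive to a few highly expressed genes but more dependent on detection rates and dropout.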

De novo cell-type-specific signatures derived algorithmically from single-cell transcriptomes. (A) Overview of the algorithm used to define cell-specific signatures from mice and humans (see Methods). Created in BioRender. Rehman, J. (2025) https://BioRender.com/f97q817. (B) Example signatures derived from mouse cardiomyocytes (top) and human hippocampal choroid plexus cells (bottom). Each node represents a gene and the connections represent co-positivity in cells. Connections are weighted by Pearson’s R. The colors represent distinct hub signatures within the overall cell signatures. (C, D) Representative diagram of all derived signatures from mice and humans. Each dot represents a signature and is sized by its number of genes. The dot color is the respective enrichment for each signature compared to previously known senescence markers (−log10 Hypergeometric P-value). Each signature is connected to its most similar signature and the color of the connection is based on the cosine similarity. Signatures without significant overlap are connected with gray lines (Hypergeometric FDR P-value). (E) Network similarity analysis of mouse cell-specific novel senescence signatures and organismal aging signatures. Each shape represents a signature and lines represent significant similarity between them. Similarity (strength of connections) is defined as −log10(BH-corrected Hypergeometric P). The network is clustered and colored by Louvain’s algorithm.
Distinct modes and phenotypes of cellular senescence exist within the same cell types
We next examined senescent fibroblasts from three different tissues to determine the CS characteristics of similar cell types in different contexts. The mouse fibroblasts from hearts, lungs, and tracheas each had two distinct CS signature hubs. Most fibroblast hub signatures shared little genetic similarity and high cosine distance (Fig. 3H). The senescent cell gene hubs with the highest pairwise similarities were present in the cells of the lungs and trachea, possibly indicating that the spatial proximity and function in the respiratory system may have resulted in similar CS phenotypes. Functionally, these similar hubs in the lungs and trachea shared common biological processes, such as inflammatory response, cytokine signaling, and immune cell chemotaxis (Fig. 3J). However, the trachea hub was uniquely and highly enriched for genes involved in B-cell signaling and neutrophil activation. When the fibroblast populations were scored with SenePy, they showed distinct temporal kinetics (Fig. 3I). In all cell populations, there was a small proportion of senescent cells in young mice and a drastic increase in old mice. The biggest increase in the proportion of senescent cells occurred between 18 and 24 months. Interestingly, cells identified using the most similar trachea and lung hubs had comparable temporal kinetics and nearly identical high proportions of senescent cells in 24-month-old mice. Senescent fibroblast populations in the heart and lungs also followed parallel kinetics despite greater gene and ontological distance. Together, these results suggest multiple modes of CS even within the same populations and that these modes are temporally and phenotypically distinct.

Multiple modes of senescence exist within the same cell populations. (A) The proportion of cells from young (< 4 month) and old mice (> 20 month) expressing their respective SenePy signatures (Wilcoxon signed-rank test, one-tailed, n = 72 pairs). (B) Signature derived from mouse tongue keratinocytes. Each node represents a gene and the connections represent co-positivity in cells. Connections are weighted by Pearson’s R. Nodes are colored by Louvain-based assignment to distinct hub signatures. (C) GO and (D) KEGG gene set enrichment of the two keratinocyte hub signatures. The “senescence” gene set is the pre-defined senescence marker set used in this study. The vertical dashed line represents FDR p = 0.05. (E) Pairwise enrichment of the two keratinocyte hubs against all other SenePy senescence signatures. (F) The strip plot depicts the score of each keratinocyte determined by SenePy using the aforementioned hubs. Horizontal dashed lines represent three standard deviations above the mean. (G) Temporal kinetics of the proportion of cells scored three standard deviations above the mean by SenePy for the two keratinocyte hubs. (H) Hierarchical clustering of fibroblast hub signatures from mouse lungs, tracheas, and hearts based on cosine similarity. (I) Temporal kinetics of the proportion of lung, trachea, and heart fibroblast cells scored high by SenePy using their respective signatures. (J) GO gene set enrichment of the most similar trachea and lung fibroblast hubs. Pathways specific to the trachea hub are colored green. All enrichment plots use BH-corrected Fisher’s Exact P-values.
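The caption above describes flagging cells that score three standard deviations above the mean and tracking their proportion over age. A minimal sketch of that thresholding follows. The function names are hypothetical, and the use of the population standard deviation computed over all cells pooled across ages is an assumption; the text does not say which variant was used.

```python
from statistics import mean, pstdev

def outlier_threshold(scores, n_sd=3.0):
    """Cutoff at n_sd population standard deviations above the mean score."""
    return mean(scores) + n_sd * pstdev(scores)

def senescent_fraction_by_age(scores, ages, n_sd=3.0):
    """Fraction of cells above the cutoff within each age group.
    The cutoff is computed once from all cells pooled together."""
    cutoff = outlier_threshold(scores, n_sd)
    totals, flagged = {}, {}
    for score, age in zip(scores, ages):
        totals[age] = totals.get(age, 0) + 1
        flagged[age] = flagged.get(age, 0) + (score > cutoff)
    return {age: flagged[age] / totals[age] for age in totals}

# Toy data: 50 young cells and 50 old cells, one old cell with a high score.
toy_scores = [0.0] * 99 + [10.0]
toy_ages = [3] * 50 + [24] * 50
fractions = senescent_fraction_by_age(toy_scores, toy_ages)
```

Because the cutoff depends on the pooled score distribution, the flagged proportion is a relative measure within a population rather than an absolute count of senescent cells.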
Cell-specific signatures are unique but share common stress response and inflammatory pathways
Genes overrepresented in every mouse signature (FDR p-value < 0.05) were enriched for multiple biological processes involved in inflammation, immunity, cytokine signaling, and chemotaxis (Fig. 4B). The NF-kappa B signaling pathway, a known driver of cell senescence, was among the enriched pathways (KEGG, BH-corrected p-value = 1.3 × 10−5), emphasizing that NF-kappa B plays an important role in some programs of in vivo CS. However, only 9 of the 43 signatures were individually enriched for NF-kappa B signaling, indicating that it is far from universal (Fig. 4C). Therefore, to test for transcription factors that might be active, we tested our signatures for TF binding enrichment in their gene promoters (Fig. 4D). The most universally enriched binding motif was RREB1, which was enriched in 38 of 43 signatures. In addition, we found 46 other transcription factors enriched in over half of the mouse signatures.
The universal mouse SenePy signature had no overlap with the 330 global mouse aging genes identified from Tabula Muris Senis by Zhang et al. (Hypergeometric p-value = 2.3 × 10−5) (Supplementary Fig. 2f). However, the universal mouse gene signature does share significant overlap with known markers of cellular senescence (Hypergeometric p-value = 1.4 × 10−13). Conversely, the global aging signature derived from differential expression does not (Supplementary Fig. 2e). This supports our earlier observation that cellular senescence is distinct from organismal aging.
In human cells, only three genes were present in > 25% of the human CS signatures (Fig. 4E): matrix metallopeptidase 9 (MMP9), Myosin light chain 9 (MYL9), and Integral membrane protein 2 C (ITM2C) (FDR p-values 8.1 × 10−8, 5.0 × 10−7, 5.0 × 10−7). MMP9 is a known SASP component and is present in our curated set of CS markers. In comparison, CDKN2A was only present in 9 of 45 signatures, which is still higher than what would be expected by random chance (FDR p-value = 3.4 × 10−4). There were 734 genes overrepresented (FDR p-value < 0.01) in the human cell-type SenePy signatures (Supplementary Data 9). Genes in this universal signature were enriched for innate immune and other biological pathways (Fig. 4F). When signatures were tested individually, the most commonly enriched pathways included neutrophil-mediated immunity, platelet degranulation, cytokine signaling pathways, and other inflammation and immune pathways (Fig. 4G), consistent with SASP.
The most universal genes and pathways active in the cell-type signatures from both species shared some characteristics. There were 46 common genes that were enriched in both the mouse and human cell-type signatures. This was only a marginal overrepresentation compared to random chance (Hypergeometric, p-value = 0.09), suggesting low gene-wise concordance between species. CDKN2A, CXCR2, and CCL3 were the only common genes that were previously established CS markers. However, the pathway concordance between species was high and both sets of common genes were enriched for 32 common pathways, such as NF-kappa B signaling, AGE-RAGE signaling, and chemokine signaling. Likewise, eight of the top 20 most commonly enriched transcription factors from mouse signatures were also commonly enriched in human signatures (Fig. 4H). This suggests that core CS pathways between species are conserved but the individual genes that are enriched in senescent cells show a high degree of genetic variation. Together, these data indicate that our de novo cell-type signatures are enriched for known CS phenotypes and share some commonality between cell types and species despite their high degree of heterogeneity.

Cell-specific signatures are unique but share some genes and biological pathways. (A) Plot depicting the most commonly found genes from the novel mouse cell-specific signatures. Significance depicts how likely those genes would be found in that many signatures by random chance. (B) The 15 most enriched KEGG pathways from the universal SenePy signature. (C) The most commonly enriched KEGG and GO gene sets in every SenePy mouse signature. (D) The most commonly enriched transcription factor motifs in the promoters of SenePy mouse signature genes. (E) Plots depicting the most commonly found genes from the SenePy human cell-specific signatures. (F) The 12 most enriched KEGG pathways from the universal SenePy human signature. (G) The most commonly enriched KEGG and GO gene sets in every human SenePy signature. The bars note the percent of signatures enriched for the given pathway. (H) The most commonly enriched transcription factor motifs in the promoters of human SenePy signature genes. The human icon was created in BioRender. Sanborn, M. (2025) https://BioRender.com/l04t362.
The cell-specific kinetics of senescent cell accumulation with organismal age
The proportions of cells within populations predicted to be senescent by SenePy were not correlated to the replicative potential in their respective populations (Supplementary Fig. 5a, b). Surprisingly, replicative populations, such as large intestine enterocytes, had minimal increases in the proportion of senescent cells with age. This is corroborated by an undetectable change in the number of Cdkn2a+ enterocytes with age and the general lack of correlation between Cdkn2a+ cells in replicative populations. Likewise, we did not observe a negative correlation between the population-level expression of telomerase and the calculated senescence burden (Supplementary Fig. 5c, d). This suggests that telomere attrition, the well-known driver of in vitro CS, may not be the primary driver of CS within organisms.

The cell-specific kinetics of senescent cell accumulation with organismal age. UMAPs of (A) mouse and (B) human cells depicting broad cell classification and overlaid with cells that were outliers determined by their SenePy score. (C) The proportional increase of SenePy outlier cells in old mice (24- or 30-month) relative to 3-month-old mice. (D) The proportion of SenePy human outlier cells across age bins. Each row represents cell proportions from 0–0.16 and gray rectangles note that no data is available. (E) The fraction of SenePy outliers in individual heart tissue cells stratified by the donor. Age increases along the x-axis from left to right. The human icon was created in BioRender. Sanborn, M. (2025) https://BioRender.com/l04t362.
SenePy identifies ground-truth in vivo cellular senescence more robustly than established markers
We also examined the ability of SenePy signatures to identify transcriptional changes due to senolytic treatment as well as those seen in experimental conditions that induce CS in multiple in vivo and in vitro models. The genes downregulated in mouse lungs following therapeutic senolysis were enriched for multiple SenePy lung- or airway-specific signatures (Fig. 6D). SenePy signatures were more enriched than cell-type agnostic gene sets. SenePy does not contain a specific skeletal muscle CS signature due to data availability, yet mRNA less abundant in mouse muscle tissue following senolytic treatment was enriched for multiple SenePy signatures, including one from myocytes (Supplementary Fig. 6d). Multiple SenePy endothelial signatures were enriched in mRNA more abundant after radiation-induced CS of human endothelial cells, but the advantage of SenePy over other gene sets was diminished in this in vitro context (Supplementary Fig. 6b, e). The discrepancy between the in vitro and in vivo efficacy of SenePy was even more apparent in models of senescent fibroblasts in culture (Supplementary Fig. 6c and Fig. 6E). In these in vitro contexts, SenePy was outperformed. These data suggest that SenePy recapitulates in vivo cellular senescence and that gene sets derived primarily through previous in vitro experiments do not.
Next, we tested the marker suitability of genes that encode for p16ink4a and p21cip1 in the p16-CreERT2-tdTomato liver cells (Supplementary Fig. 6f). Only 8% of p16high cells, as indicated by tdTomato, had detectable levels of Cdkn2a RNA (p16 encoding gene) (Supplementary Fig. 6g). This suggests that either Cdkn2a expression was not detectable due to single-cell dropout or the expression of Cdkn2a is transitory in the majority of these senescent cells. Another important marker of cellular senescence, Cdkn1a (p21 encoding gene), was found in the majority of both tdTomato+ and tdTomato- cells, which makes its binary expression an inadequate metric because it is also broadly expressed in non-senescent cells (Supplementary Fig. 6h, i). These data represent a striking example of why sole reliance on known cellular senescence genes like p16 and p21 is not sufficient, especially in single-cell transcriptomics, because of low single-cell resolution, dropout, and marker-independent CS programs.
For additional in vivo validation, we analyzed flow-sorted senescent hepatocyte RNA data from a mouse model of oncogene-induced senescence33. Chan et al. induced cellular senescence in mouse hepatocytes using NRAS expression and harvested hepatocytes during peak senescence at 12 and 30 days, along with tumor and healthy tissue at 218 days (Fig. 6F). The cells follow multiple CS pseudotime trajectories from the mV (mVenus) control root (Fig. 6G). The SenePy mouse hepatocyte signature is composed of two distinct hubs (hepatocyte 0 and hepatocyte 1). Both hubs strongly correlate to distinct pseudotime trajectories (Pearson’s p-value = 2 × 10−232, p-value = 0), supporting the idea that a single cell type can have multiple CS phenotypes (Fig. 6H). The universal SenePy signature score is significantly correlated to global pseudotime (p-value = 0) (Fig. 6I). Cells scored using traditional CS markers and the SenMayo gene set were not as strongly associated with CS pseudotime (Fig. 6J). The SenePy score was inversely correlated to the hepatocyte marker, Albumin, likely due to reduced cell identity associated with cellular senescence (Fig. 6K). Independent of pseudotime, SenePy scores were higher in senescent hepatocytes relative to the control than cells scored with traditional CS markers (Fig. 6L). There was a 4.3-fold increase in the mean SenePy score at day 12 (Mann-Whitney, one-tailed, p-value = 2.9 × 10−145) and a 4.1-fold increase at day 30 (p-value = 3.7 × 10−177) relative to the control. Conversely, CS marker-based scoring did not increase at day 12 and had a 1.3-fold increase at day 30 (p-value = 1.6 × 10−17). These results indicate that SenePy cell-specific signatures and its derived universal signature robustly recapitulate in vivo CS within mouse hepatocytes.
We also tested an additional scoring method, scDRS34, which is capable of handling the network centrality weights central to SenePy signatures. Both the default binary and the normalized methods of SenePy were highly correlated to scDRS scores (Pearson’s R, p-value = 0) (Supplementary Fig. 7a). The cells identified by scDRS to have an FDR p-value less than 0.05 had very significant overlap with every SenePy outlier threshold tested (Hypergeometric, p < 1.9 × 10−116) (Supplementary Fig. 7b). Cells identified by higher SenePy thresholds had high overlap with scDRS, but the overall Jaccard Index decreases as the number of SenePy cells decreases relative to scDRS cells (Supplementary Fig. 7c). This indicates that higher thresholds may be more specific for senescent cells but less sensitive. Both the default binary SenePy method and its normalized count method had high overlap with each other and scDRS (Supplementary Fig. 7d). However, the default binary method produces multimodal score distributions, which make it simpler to empirically derive a threshold (Supplementary Fig. 7c, f). All three methods detected a similar increase in the number of senescent cells with age (Supplementary Fig. 7e). When tested in the p16 reporter cells, all three methods generated significantly higher scores in the tdTomato+ cells compared to tdTomato- cells (Supplementary Fig. 7f). The scDRS method is incorporated into the SenePy package as an additional scoring flavor.
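The relationship between threshold stringency and the Jaccard Index described above follows directly from the definition, as the short sketch below shows. The function name and the toy cell identifiers are hypothetical.

```python
def jaccard_index(cells_a: set, cells_b: set) -> float:
    """Size of the intersection divided by the size of the union."""
    union = cells_a | cells_b
    if not union:
        return 0.0
    return len(cells_a & cells_b) / len(union)

# Toy example: a strict threshold flags 2 cells, all of which are also
# among the 4 cells flagged by a second, more permissive method.
strict_calls = {"cell_1", "cell_2"}
permissive_calls = {"cell_1", "cell_2", "cell_3", "cell_4"}
overlap_score = jaccard_index(strict_calls, permissive_calls)  # 2 / 4 = 0.5
```

When one set is entirely contained in the other, the index equals the ratio of the two set sizes, so it necessarily falls as a stricter threshold shrinks one set, even though every flagged cell still agrees with the other method.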

SenePy identifies ground-truth in vivo cellular senescence more robustly than established markers. (A) UMAPs of single cells from the kidney which were enriched for td-Tomato+ cells. (B) Enrichment analysis of differentially abundant gene mRNA in the td-Tomato+ kidney cells. Gene sets were derived in this study and we also include the SenMayo signature, CS markers from the literature (senMarkers), and a CS gene set derived from Chat-GPT4 (senGPT). Only SenePy kidney epithelial signatures were tested (Fisher’s Exact FDR p-value). (C) Density plots depicting SenePy score distributions calculated in kidney cells using kidney-specific SenePy signatures. Each row label depicts the senescence signature used. The color indicates if the cells were td-Tomato+ (Mann-Whitney, two-tailed). (D) Enrichment analysis of lung tissue genes downregulated after senolytic treatment in mice. Blue labels indicate SenePy signatures (Fisher’s Exact FDR p-value). (E) Enrichment analysis of gene mRNAs more abundant in replicative fibroblast senescence (Fisher’s Exact FDR p-value). (F) UMAP of single mouse hepatocytes undergoing KRAS oncogene-induced senescence (Chan et al. 2024). mV: mVenus control, D12: day 12, D30: day 30. Tumor and non-tumor were harvested at 218 days after KRAS induction. (G) Pseudotime projected onto the hepatocytes with a mVenus root cell. (H) Two distinct trajectories of pseudotime are closely associated with the scores from both SenePy hepatocyte signatures. (I) Cells scored using the SenePy universal mouse signature. (J) Cells scored with either senMarkers or senMayo (top). Regression and density plots depicting the association between normalized signature scores and pseudotime (bottom). (K) Association between the SenePy universal signature score and the expression of Albumin. (L) Scoring based on literature marker genes and the universal SenePy signature (Mann-Whitney, one-sided). The image icons were created in BioRender. Sanborn, M. (2025) https://BioRender.com/l04t362.
SenePy predicts an elevated cellular senescence burden in severe disease
Next, we used SenePy to score spatiotemporally resolved mouse transcriptomics data following myocardial infarction36. We used the hub signatures we previously derived from mouse hearts to score the spatially resolved spots (Fig. 7F). Senescent loci were found even in the control heart, corroborating earlier observations that even young organisms have baseline levels of cellular senescence. However, since each spatially resolved spot contains multiple cells, the number of individual senescent cells in these data cannot be determined. The proportion of spots with high CS burden was highest at day 7, but the change was not statistically significant, likely due to the small sample size (ANOVA, n ≤ 3). We observed a strong spatial association between spots with high CS burden and heart fibrosis after infarction (Fig. 7G). Senescent-like spots strongly colocalized with regions of the hearts expressing fibrotic markers such as Col1a1, which is not a gene in the SenePy signature. This correlation became readily apparent by 7 days post-MI but was not observed in the control heart or in hearts shortly after MI. The 8 hubs used to score the spots contributed to the overall CS burden with distinct temporal patterns (Fig. 7H). The endothelial cell hub had the highest contribution at day 1 and then decreased continuously through day 14. The score from a myocyte hub jumped from baseline on day 7 and then dropped back down. Other CS gene programs remained unperturbed by MI.
In addition, we observed strong spatial autocorrelation among the spatially resolved spots with high CS burden (Fig. 7F, I). Only one of the 9 samples did not have highly significant spatial clustering of senescent-like spots. Unsurprisingly, this same day-14 sample also had a low association between senescent-like spots and Col1a1 expression as well as the lowest overall CS burden. To further investigate this finding of senescent cell clustering and to see if this phenomenon is apparent in other tissues, we utilized spatial transcriptomics data from mouse brains before and after inflammatory insult (Fig. 7J). The control brains had a small number of senescent-like spots. However, the brains from LPS-treated animals had amplified CS signatures and highly significant clustering of senescent-like spots. These data indicate that senescent cells tend to be found in close proximity to one another across multiple in vivo systems.
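Spatial autocorrelation of this kind is commonly summarized with Moran's I. The following is a minimal sketch of that statistic with a permutation p-value; it is not the implementation used in the study. It assumes binary spot labels (1 for a senescent-like outlier spot) and a simple square grid with rook adjacency, whereas Visium spots are arranged hexagonally. All function names are hypothetical.

```python
import random


def grid_neighbors(n_rows, n_cols):
    """Rook (up/down/left/right) adjacency lists for a row-major square grid."""
    neighbors = []
    for r in range(n_rows):
        for c in range(n_cols):
            adj = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    adj.append(rr * n_cols + cc)
            neighbors.append(adj)
    return neighbors


def morans_i(values, neighbors):
    """Moran's I with binary weights: (N / W) * sum_ij w_ij d_i d_j / sum_i d_i^2."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("Moran's I is undefined when all values are identical")
    w_sum = sum(len(adj) for adj in neighbors)
    num = sum(dev[i] * dev[j] for i, adj in enumerate(neighbors) for j in adj)
    return (n / w_sum) * num / denom


def permutation_pvalue(values, neighbors, n_perm=999, seed=0):
    """One-sided p-value: fraction of label shuffles with I >= the observed I."""
    rng = random.Random(seed)
    observed = morans_i(values, neighbors)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, neighbors) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy 4 x 4 grids: one clustered block of outlier spots vs. a checkerboard.
nbrs = grid_neighbors(4, 4)
clustered = [1, 1, 0, 0,
             1, 1, 0, 0,
             0, 0, 0, 0,
             0, 0, 0, 0]
checkerboard = [(r + c) % 2 for r in range(4) for c in range(4)]

i_clustered, p_clustered = permutation_pvalue(clustered, nbrs)
i_checker = morans_i(checkerboard, nbrs)
print(f"clustered I = {i_clustered:.3f} (p = {p_clustered:.3f}); checkerboard I = {i_checker:.3f}")
```

Positive values indicate that outlier spots sit next to other outlier spots more often than expected by chance, while a perfectly alternating pattern gives a value of −1.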
![SenePy predicts elevated senescence burden in severe disease.](https://europepmc.org/articles/PMC11846890/bin/41467_2025_57047_Fig7_HTML.jpg.jpg)
SenePy predicts elevated senescence burden in severe disease. A UMAP showing cells from uninfected control lungs (n = 7) and patients who died from COVID-19 (n = 20). B UMAP of control cells (left) and COVID-19 cells (right) with SenePy outlier cells labeled. C Proportion of cells identified as outliers with SenePy from AT1, AT2, and airway epithelial cells (p = 0.004, Mann-Whitney, two-sided). D Distributions of SenePy score from the major lung cell classes (Mann-Whitney, two-sided, ** p = 0.004). E Heatmap with the relative senescence burden of each cell type in each patient. F Representative whole heart H&E staining overlaid with spatially resolved 10x Visium spots. Yellow spots are identified as senescence outliers from their SenePy score. Box plot (right) shows the proportion of identified spots at each time point (p = 0.30, one-way ANOVA, individual replicate points shown). G The bottom images represent post-MI fibrosis via normalized Col1a1 expression. Scale bars represent 500 μm. The correlation between senescence-like spots and Col1a1 expression for each sample is shown by the right box plot (−log10[Pearson’s R p-value, two-sided]). The indicated gray-scale dots match the shown images to their respective data points in the box plots. H The relative contribution to the overall calculated senescence burden from the 8 hubs used to score the spots. I Spatial autocorrelation of the senescence-like spots (−log10[Moran’s I p-value]). The horizontal line represents p = 0.05. J Representative H&E image of coronal sections spatially resolved by 10x Visium, which were taken from mice 24 h after exposure to LPS. Yellow spots are identified as senescence outliers from their SenePy score. The summary plot (right) depicts spatial autocorrelation of spots in LPS and saline-treated mice (−log10[Moran’s I p-value], n = 3). Box plots in Fig. 7 depict the median and full range of values, with individual points shown.
Discussion
There is a paucity of tissue- and cell-specific markers for senescent cells due to the heterogeneity of cells that undergo cellular senescence (CS). This is especially challenging in single-cell transcriptomics because the high rate of dropout and the limited sequencing depth of the technology make classical CS markers such as Cdkn2a and Cdkn1a unreliable as sole indicators of cell senescence. Furthermore, the cell-specific heterogeneity of CS represents a major challenge for the development of a universal CS signature gene panel. Therefore, in this study, we took an unbiased large-data approach to identify cell-specific programs of cellular senescence and created SenePy as an open-source platform (https://github.com/jaleesr/SenePy) to identify senescent cells in single-cell transcriptomic data. We validated the SenePy approach using single-cell RNA-seq data in multiple models and applied SenePy to determine the kinetics and heterogeneity of CS across several human and mouse cell types in aging and disease.
Previous studies have generated transcriptomic signatures of CS based on induced CS in controlled in vitro environments13,20. While these signatures have helped advance the mechanistic understanding of CS, it is challenging to use them in highly variable in vivo contexts. We show that cells express many of these genes at higher rates with age at the organism or tissue level, but none of the genes obtained from such CS panels were applicable in the majority of cell types tested. Even the widely used marker CDKN2A was identified in senescent cells in less than a third of mouse and human cell types. Our analysis of the p16-CreERT2-tdTomato mouse cells highlights the limits of using a small number of markers, such as p16ink4a or Cdkn2a, in transcriptomics data. While the tdTomato+ cells likely represented bona fide senescent cells32, only a small proportion of individual cells had detectable levels of Cdkn2a at the time of tissue harvest and sequencing. This likely arose from the transient expression of p16 earlier in the CS program in combination with the gene dropout inherent in single-cell sequencing. Nevertheless, many highly visible and impactful studies have been forced to rely on a small, nonspecific set of markers because better alternatives did not exist.
Our approach focused on CS markers specific to individual cell populations that would not be confounded by transcriptional differences that are merely reflective of organismal aging. We were able to extract cellular sub-population level programs of CS by setting kinetic thresholds based on the prior knowledge that senescent cells increase with age but are present in the minority of cells31 and by examining single-cell co-expression as opposed to differential expression. Unsurprisingly, many of our computationally derived signatures contained and were statistically enriched for pre-established CS markers. However, these comprised the minority, and most genes we identified have not previously been considered CS markers. While we did find well-known CS genes like CDKN2A, BCL2-family genes, and various SASP factors to be statistically common within our signatures, the most universal markers were novel ones. For example, in mice, the most common signature gene we identified was the alpha hemoglobin subunit. Hemoglobin has been previously reported as a response gene to oxidative stress in non-erythrocytes37,38 but has yet to be reported in CS. Interestingly, in human signatures, there was no significant overrepresentation of globin genes. We observed this and other important differences between mouse and human signatures, suggesting that organism-specific CS marker panels are more specific. Some of the pathways and genes shared between signatures overlap with those from a recent single-cell analysis of senescent immune cells39. Both studies show an upregulation of NF-kappa B signaling, BCL2, and antigen presentation and processing, and our results show that this is a common feature in many cell types.
We used our signatures to map the kinetics of senescent cell accumulation in many different tissues and cell types. Recent work has mapped the abundance of senescent marker mRNA in 13 different tissues as a function of mouse age and in a progeria model40. This work, however, was agnostic of cell type and relied on a small set of CS markers. Our study comprehensively maps the increase in senescent cells in many different mouse and human tissues with respect to cell type. Furthermore, we used SenePy to examine trajectories of CS during hepatocyte tumorigenesis. SenePy identified multiple trajectories of CS with distinct phenotypes, consistent with the biological findings in the original study33. The universal SenePy signature strongly correlated to CS development in this hepatocyte-specific context. These data serve as an additional validation of SenePy and show that it can be used to study the process and kinetics of CS induction in disease settings.
Senescent cells contribute to cardiovascular pathology6,41, but their role in disease has never been spatiotemporally characterized. We show that senescent cells localize at sites of heart fibrosis. The proportion of senescent cells in each heart did not change significantly throughout the time series, and senescent cells were present even in the control heart. We also observed highly significant spatial clustering of senescent foci in the hearts and brains of mice, which may be supporting evidence for an in vivo bystander effect15. The spatial distributions of senescent cells have been previously examined from spatially resolved transcriptomics data in aged mouse brains42. Those results show that Cdkn2a+ spots are adjacent to activated microglia but show no spatial clustering of Cdkn2a+ spots. However, that methodology relies on a narrow definition of CS, which may not translate to actual p16ink4a and does not account for Cdkn2a dropout or p16-independent forms of CS. These and other data would greatly benefit from a reexamination with more comprehensive gene sets, such as those proposed herein.
By design, our methodology removes genes that are constitutively expressed at baseline or in aged cells to maintain a distinction from organismal aging. Inherently, this discounts genes that may be part of the CS program but overlap with the transcriptional shift with age. We also do not account for genes that are downregulated in senescent cells. Negative markers of CS would add extra information to better identify senescent cells, but finding negative correlations in all pairwise combinations of genes with this methodology was computationally prohibitive. The comprehensiveness of our signature panel is also limited by the data available at the time of study, and the exact set of tissues and cells tested influences any conclusions of universality or comparison between species. Single-cell noise and batch effects may also obscure a complete universal CS signature. We do not expect our signatures to negate the need for large-scale future efforts such as SenNet21. We instead expect our work to be complementary and to assist these efforts. This work may also serve as a starting point for studying how patient-specific factors such as sex and lifestyle impact distinct senescent cell phenotypes and the kinetics of senescent cell accumulation.
This work comprehensively identified gene expression programs and signatures of senescent cells that are stratified by species, tissue, and cell type and used them to broadly characterize senescent cells in mice and humans. We created SenePy: a computational platform that assigns a CS score to individual cells in single-cell transcriptomic data, which can serve as a resource to uncover cell-type and tissue-specific mechanisms of CS in vivo.
Methods
Data collection
Single-cell RNA mouse data were collected from the Tabula Muris Senis atlas22. Tabula Muris Senis consists of single cells from 30 mice from 1 to 30 months of age taken from 19 tissues. Human single-cell data were collected from 7 studies. The liver data were obtained from five donors ranging from 21 to 65 years old29. Single skin cells were obtained from 6 patients ranging in age from 18 to 48 years old25. Lung data were collected from 17 donors ranging from 21 to 72 years old28. Human heart cells were taken from 14 patients ranging from 40 to 75 years old26. Human hippocampal cells were collected from 37 patients ranging from newborn to 92 years old23. These tissue-specific datasets were given priority for downstream analysis in their respective tissues, but we used additional multi-tissue atlases. Cells from the Human Cell Landscape came from 51 donors ranging from 21 to 66 years old and from 25 different tissues27. Cells from Tabula Sapiens came from 15 patients ranging from 22 to 74 years of age24.
Data annotation
The data were available in a range of formats from fastq to processed and annotated count data. Data from Tabula Muris Senis, Tabula Sapiens, and the human heart study were provided with cell type annotations. The lung and hippocampal studies provided unannotated counts. Fastq data were processed through 10x CellRanger or the Dropseq protocol (https://mccarrolllab.org/dropseq/) depending on the technology used to prepare the libraries. All human fastqs were aligned to GRCh38. Processed single-cell counts were handled with Scanpy43. Cells were filtered out if they had a relatively low or high number of detected genes or a high relative proportion of mitochondrial reads (thresholds varied based on dataset distribution). We used a variety of methods to annotate cell types. Since the Human Cell Landscape contained many tissues and cell types, we transferred the annotations from Tabula Sapiens using scANVI44 after processing the raw data with scVI-tools45. If a cluster, called by the Leiden algorithm on the scVI embeddings, had lower than 85% cell-type agreement, its cells were not used in downstream analysis. Cell types from the liver, skin, and lung studies were annotated similarly, but clusters with poor label transfer were instead manually annotated using known cell-type markers46. For lack of a reference dataset, the hippocampal data were annotated exclusively using known markers. Annotations were harmonized across datasets (e.g., “kidney endothelial cell” changed to “endothelial cell”) and mapped back onto the raw counts. Cells lacking annotations because they failed QC or label transfer were discarded. For total dataset visualizations, the species-specific raw data were integrated using scVI, and the embeddings were projected via UMAP.
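The 85% cluster agreement rule can be sketched as follows. This is an illustration only: it assumes that "agreement" means the fraction of cells in a cluster that carry the cluster's most frequent transferred label, which the text does not state explicitly, and the function name and toy labels are hypothetical.

```python
from collections import Counter, defaultdict


def clusters_passing_agreement(cluster_ids, transferred_labels, min_agreement=0.85):
    """Return {cluster: majority label} for clusters whose majority label covers
    at least `min_agreement` of their cells; other clusters are left out."""
    labels_by_cluster = defaultdict(list)
    for cluster, label in zip(cluster_ids, transferred_labels):
        labels_by_cluster[cluster].append(label)

    passing = {}
    for cluster, labels in labels_by_cluster.items():
        top_label, top_count = Counter(labels).most_common(1)[0]
        if top_count / len(labels) >= min_agreement:
            passing[cluster] = top_label
    return passing


# Toy example: cluster "0" is 90% T cells (kept), cluster "1" is 60% (dropped).
cluster_ids = ["0"] * 10 + ["1"] * 10
transferred_labels = (
    ["T cell"] * 9 + ["B cell"]
    + ["endothelial cell"] * 6 + ["fibroblast"] * 4
)
kept = clusters_passing_agreement(cluster_ids, transferred_labels)

# Keep only the cells that belong to a passing cluster.
kept_cell_indices = [i for i, c in enumerate(cluster_ids) if c in kept]
print(kept, len(kept_cell_indices))
```

In an AnnData-based workflow the two input lists would typically come from columns of the cell metadata table, but that mapping is left out here to keep the sketch free of external dependencies.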
Mouse cell-type specific age-dynamic genes
Mouse data came from mice aged 1, 3, 18, 21, 24, and 30 months (m), but age availability varied by tissue. Cells were stratified by tissue, age, and cell type. The starting baseline was chosen as 3 m if there were at least 200 3 m cells; if not, the starting baseline was aggregated with 1 m cells. Likewise, 30 m was prioritized for old cells if at least 200 were present; otherwise the old baseline fell back to 24 m. The proportion of cells expressing one or more UMI copies of a gene was determined in each population (Eq. 1). Zero values at 3 m or 1 m were imputed with the inverse of the cell count ($p_{age} = {n_{total\ cells}}^{-1}$). Young cells were used as baselines and the ratios in old cells were determined relative to them (Eq. 2). For each cell type, an age-dynamic score was calculated for each gene as a sum of individual weighted metrics: young proportion, old proportion, gain, and ratio (Eq. 3). Each metric has an ideal range which roughly reflects the expected dynamics of senescent cells with age. We created a null distribution for each cell type by shuffling the gene by cell matrix for each cell type 1000 times, resulting in around 20 million null values for each cell type. The data were shuffled across cells to correctly model and account for the actual sparsity within each cell type. To determine if a gene was significantly dynamic, its observed value was compared to the null distribution. Multiple slopes and parameters were tested for the weighted metric functions, but the resulting comparison between observed and null values was very stable. Dynamic genes are cell-specific markers and do not account for small changes to baseline levels of constitutively expressed genes, which may be senescence-associated genes but are not specific markers.

$$p_{age} = n_{gene^{+}\ cells} / n_{total\ cells} \tag{1}$$

$p_{age}$ (Eq. 1): the proportion of cells positive (i.e., expressing a gene) for a gene at a given age, where $n_{gene^{+}\ cells}$ is the number of cells positive for a given gene and $n_{total\ cells}$ is the number of total cells in the same population.

$$r_{30m|24m} = p_{30m|24m} / p_{3m|1m} \tag{2}$$

$r_{30m|24m}$ (Eq. 2): ratio of old cells positive for a gene relative to young cells positive. Note that $p_{30m|24m}$ represents the proportion of cells positive for a gene in cells from 30- or 24-month-old mice and $p_{3m|1m}$ is the proportion of cells positive for the gene from 3- or 1-month-old mice.

$$old(x) = \begin{cases} \dfrac{1}{1+e^{-2x}} & \text{if } 0 < x \le 3 \\ 1 & \text{if } 3 < x \le 20 \\ -\dfrac{1}{4}x + 6 & \text{if } x > 20 \end{cases}$$

$$gain(x) = \begin{cases} \dfrac{x}{5} & \text{if } x < 5 \\ -\dfrac{x}{5} + 4 & \text{if } x > 15 \\ 1 & \text{otherwise} \end{cases}$$

$$young(x) = \begin{cases} 1 & \text{if } x < 5 \\ -\dfrac{x}{2} + 3.5 & \text{otherwise} \end{cases}$$

$$ratio(r_{30m|24m}) = \begin{cases} r_{30m|24m} & \text{if } r_{30m|24m} < 2.5 \\ 2.5 & \text{otherwise} \end{cases}$$

$$GAD = old(x) + gain(x) + young(x) + ratio(r_{30m|24m}) \tag{3}$$

$GAD$ (Eq. 3): an individual gene age-dynamic score is a sum of the weighted metrics. $x$ is the percent of cells positive for the given gene in old cells, young cells, or the difference between the two (gain).
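The mouse age-dynamic score (Eqs. 1–3) can be written out as a short sketch. The function names are hypothetical, and two details are assumptions rather than statements from the text: the zero-proportion imputation is applied only to the young population using that population's own cell count, and the logistic branch of the old-proportion metric is also used at x = 0, where the piecewise definition is silent.

```python
import math


def positive_proportion(n_positive, n_total, impute_zero=False):
    """Eq. 1: fraction of cells expressing the gene. With impute_zero=True a zero
    proportion is replaced by 1 / n_total (assumed here for young cells only)."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if n_positive == 0 and impute_zero:
        return 1.0 / n_total
    return n_positive / n_total


def old_metric(x):
    """Weight for the percent of old cells positive for the gene."""
    if x <= 3:  # source defines this branch for 0 < x <= 3; x == 0 is an assumption
        return 1.0 / (1.0 + math.exp(-2.0 * x))
    if x <= 20:
        return 1.0
    return -x / 4.0 + 6.0


def gain_metric(x):
    """Weight for the gain (old percent minus young percent)."""
    if x < 5:
        return x / 5.0
    if x > 15:
        return -x / 5.0 + 4.0
    return 1.0


def young_metric(x):
    """Weight for the percent of young cells positive for the gene."""
    return 1.0 if x < 5 else -x / 2.0 + 3.5


def ratio_metric(r):
    """Old-to-young ratio (Eq. 2), capped at 2.5."""
    return r if r < 2.5 else 2.5


def gad_mouse(young_pos, young_total, old_pos, old_total):
    """Eq. 3: sum of the four weighted metrics for one gene in one cell type."""
    p_young = positive_proportion(young_pos, young_total, impute_zero=True)
    p_old = positive_proportion(old_pos, old_total)
    x_young, x_old = 100.0 * p_young, 100.0 * p_old
    return (
        old_metric(x_old)
        + gain_metric(x_old - x_young)
        + young_metric(x_young)
        + ratio_metric(p_old / p_young)
    )


# A gene rising from 0.5% to 10% of cells reaches the maximum of every metric.
rare_rising = gad_mouse(young_pos=2, young_total=400, old_pos=40, old_total=400)
# A constitutively expressed gene (80% -> 90%) is heavily penalized.
constitutive = gad_mouse(young_pos=320, young_total=400, old_pos=360, old_total=400)
print(rare_rising, constitutive)
```

Under these weights, a gene scores highest when it is nearly absent in young cells and present in a minority of old cells, while genes that are widely expressed at either age are pushed strongly negative.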
Human cell-type specific age-dynamic genes
Human ages were binned into 10-year bins to account for the continuous range of human ages. Bins were indexed from 8 (8–17 years old) to 88 (88–97 years old), with 9 total bins. To be considered for further analysis a cell-type population must 1) have three unique age bins with at least 100 cells in each bin or a bin with age ≤ 28 and a bin with age ≥ 58 with at least 100 cells and 2) have a bin with age ≥ 48 with at least 100 cells. These criteria were required in individual datasets to avoid confounding effects from multiple studies. Cells were stratified by dataset, tissue, age bin, and cell type. The proportions of cells expressing genes were calculated for each age bin (Eq. 1). The young starting populations were selected from the 8, 18, or 28 year bins if one was present; otherwise the starting proportion was calculated by regressing the age and known proportion values and solving for 18 years (Eq. 4). The old ending populations were selected from the oldest age bin ($p_{old}$) in a given cell type.

Genes were considered dynamic if 1) the trend of age by proportion of cells expressing the gene was positive (Eq. 5) and 2) the gene GAD (Eq. 6) was significant when compared to a null distribution.

$$\mathrm{cov}(age, p) = \frac{\sum \left(age_{i} - \overline{age}\right)\left(p_{i} - \overline{p}\right)}{n}$$

$$\mathrm{var}(age) = \frac{\sum \left(age_{i} - \overline{age}\right)^{2}}{n}$$

$$p_{18} = \frac{18 \cdot \mathrm{cov}(age, p)}{\mathrm{var}(age)} + \overline{p} - \frac{\overline{age} \cdot \mathrm{cov}(age, p)}{\mathrm{var}(age)} \tag{4}$$

$p_{18}$ (Eq. 4): extrapolated proportion of cells positive in the 18-year bin, where $\mathrm{cov}(age, p)$ represents the covariance between age and proportion $p$ (Eq. 1), $\mathrm{var}(age)$ is the variance of age, and $n$ is the number of age bins represented in the data.

$$m = \frac{\mathrm{cov}(age, p)}{\mathrm{var}(age)} \tag{5}$$

$m$ (Eq. 5): the slope of the linear regression line for the proportions of a given gene with age. See Eq. 3.

$$\Delta Max = \mathrm{argmax}_{age}\left(p_{\max}\right) - age_{\max}$$

$$dMax(\Delta Max) = \begin{cases} 1 & \text{if } \Delta Max < 50 \\ -(\Delta Max - 50) & \text{otherwise} \end{cases}$$

$$aMax(age_{\max}) = \begin{cases} 1 & \text{if } age_{\max} \ge 38 \\ age_{\max} - 38 & \text{otherwise} \end{cases}$$

$$GAD = old(x) + gain(x) + young(x) + dMax(\Delta Max) + aMax(age_{\max}) + m \times 5 \tag{6}$$

$GAD$ (Eq. 6): an individual human gene age-dynamic score is a sum of the weighted metrics. $\Delta Max$ is the difference in years between the population with the maximum positive gene proportion and the oldest age bin. Variables $old(x)$, $gain(x)$, and $young(x)$ are from Eq. 3. Slope ($m$) is represented in percentage points and given a weight multiplier of 5.
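The regression terms that are specific to the human score (Eqs. 4–6) can be sketched as below. This is a minimal illustration with hypothetical function names; it assumes that ages are given as bin indices in years, that proportions are expressed in percentage points, and that the variance sums over all represented age bins.

```python
def regression_terms(ages, props):
    """Population covariance and variance used by Eqs. 4 and 5 (both divide by n)."""
    n = len(ages)
    mean_age = sum(ages) / n
    mean_p = sum(props) / n
    cov = sum((a - mean_age) * (p - mean_p) for a, p in zip(ages, props)) / n
    var = sum((a - mean_age) ** 2 for a in ages) / n
    if var == 0:
        raise ValueError("at least two distinct age bins are required")
    return mean_age, mean_p, cov, var


def slope(ages, props):
    """Eq. 5: slope m of the least-squares line of proportion against age."""
    _, _, cov, var = regression_terms(ages, props)
    return cov / var


def extrapolate_p18(ages, props):
    """Eq. 4: proportion predicted by the regression line at age 18."""
    mean_age, mean_p, cov, var = regression_terms(ages, props)
    return 18 * cov / var + mean_p - mean_age * cov / var


def d_max(delta_max):
    """Weight for the distance between the peak-proportion bin and the oldest bin."""
    return 1.0 if delta_max < 50 else -(delta_max - 50.0)


def a_max(age_max):
    """Weight for the oldest available age bin."""
    return 1.0 if age_max >= 38 else age_max - 38.0


# Toy cell type with no young bin: percent of positive cells in four age bins.
ages = [38, 48, 58, 68]
props = [3.0, 4.0, 5.0, 6.0]

m = slope(ages, props)            # percentage points per year
p18 = extrapolate_p18(ages, props)
print(f"m = {m:.3f}, p18 = {p18:.3f}, slope term = {5 * m:.3f}")
```

The extrapolated value stands in for the young starting proportion when no 8, 18, or 28 year bin is available, and the slope enters the human score through the `m × 5` term.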
Identifying novel senescence signatures from mice
Age-dynamic genes for each tissue cell type were found as described above. Count data were subset by these genes and further subset to only include cells from mice > 21 months of age. Subsets with fewer than 100 cells were not tested further. The count matrices were binarized to represent cells by gene positivity. Every pairwise combination of genes was tested for Pearson’s correlation (Eq. 7). To test statistical significance, each pairwise comparison was randomly permuted 500 times. Pairwise correlations were kept if they had a positive r value and if their r value was at least 0.05 higher than the respective q99 (99th percentile) r value from the random permutations (Eq. 8). The filtered correlations were used to construct networks with NetworkX (https://networkx.org/). The Louvain algorithm was used to group genes into clusters. Network clusters with fewer than 5 genes and genes with no correlations were removed. Genes loosely connected to clusters were removed if they had fewer than $\log(n_{cluster\ genes})$ connections to other genes in the network, where $n_{cluster\ genes}$ is the number of genes in a Louvain cluster.
The cleaned clusters are hereafter referred to as hubs, and the aggregated hubs for each cell type are cell-specific signatures.
$$r_{i,j}=\frac{\sum_{k=1}^{n}\left(x_{k,i}-\bar{x}_{i}\right)\left(x_{k,j}-\bar{x}_{j}\right)}{\sqrt{\sum_{k=1}^{n}\left(x_{k,i}-\bar{x}_{i}\right)^{2}}\,\sqrt{\sum_{k=1}^{n}\left(x_{k,j}-\bar{x}_{j}\right)^{2}}} \tag{7}$$
(Eq. 7) $r_{i,j}$: Pearson’s correlation coefficient for dynamic genes $i$ and $j$. Where $x_{k,i}$ represents the binary expression value of gene $i$ in cell $k$; $x_{k,j}$ is the binary expression value of gene $j$ in cell $k$; and $n$ is the total number of cells in the population.
$$t=q_{99}\left(r_{\mathrm{perm}\left(i,j\right)}\right)+0.05$$
$$r_{i,j} > t \;\text{ and }\; r_{i,j} > 0 \tag{8}$$
(Eq. 8) $t$: significance threshold. Where $r_{\mathrm{perm}(i,j)}$ represents the distribution of correlation coefficients for 500 random permutations of genes $i$ and $j$. $q_{99}$ represents the 99th percentile value of this distribution. The inequality depicts one criterion of gene selection based on $t$.
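The permutation filter in Eqs. 7 and 8 can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `significant_pairs`, the toy data, and the choice to build the null distribution by shuffling one gene's values across cells are all assumptions, since the text does not specify how the permutations were drawn.

```python
import numpy as np

def significant_pairs(binary, n_perm=500, margin=0.05, seed=0):
    """Return {(i, j): r} for gene pairs that pass the Eq. 8 criteria.

    binary: (cells x genes) array of 0/1 gene positivity.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(binary, dtype=float)
    n_genes = x.shape[1]
    kept = {}
    for i in range(n_genes):
        for j in range(i + 1, n_genes):
            xi, xj = x[:, i], x[:, j]
            if xi.std() == 0 or xj.std() == 0:
                continue  # Pearson's r is undefined for a constant gene
            r = np.corrcoef(xi, xj)[0, 1]  # Eq. 7
            # Assumed null model: shuffle gene i across cells to break pairing.
            null = np.array([
                np.corrcoef(rng.permutation(xi), xj)[0, 1]
                for _ in range(n_perm)
            ])
            t = np.quantile(null, 0.99) + margin  # Eq. 8 threshold
            if r > t and r > 0:
                kept[(i, j)] = float(r)
    return kept

# Toy data: genes 0 and 1 are co-expressed, gene 2 is independent.
rng = np.random.default_rng(1)
g0 = rng.random(300) < 0.3
g1 = g0 ^ (rng.random(300) < 0.05)  # gene 0 with about 5% of cells flipped
g2 = rng.random(300) < 0.3
toy = np.column_stack([g0, g1, g2]).astype(int)
pairs = significant_pairs(toy, n_perm=200)
```

The surviving pairs and their r values are the edges that would then be passed to a graph library for clustering. The double loop is quadratic in the number of genes, so a real dataset would likely need a vectorized or batched variant.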
Identifying novel senescence signatures from humans
Age-dynamic genes for each dataset-tissue-cell-type were found as described earlier. Count data were then subset by these genes and further subset to only include cells from patients 48 years of age or older. Significant correlations, networks, hubs, and signatures were generated similarly to those from mice.
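The hub clean-up rule shared by the mouse and human pipelines (drop clusters with fewer than 5 genes, then drop genes with fewer than log(n) connections) can be sketched as below. The function `clean_hubs` is hypothetical, the clusters are assumed to come from a prior Louvain step, the natural logarithm is assumed because the text does not give a base, and pruning is done in a single pass over whole-network degree, which is also an assumption.

```python
import math
from collections import defaultdict
from itertools import combinations

def clean_hubs(edges, clusters, min_size=5):
    """Filter Louvain clusters into hubs.

    edges: iterable of (gene_a, gene_b) significant correlations.
    clusters: list of sets of gene names (assumed Louvain output).
    """
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    hubs = []
    for cluster in clusters:
        if len(cluster) < min_size:
            continue  # clusters with fewer than 5 genes are removed
        cutoff = math.log(len(cluster))  # natural log is an assumption
        hub = {g for g in cluster if degree[g] >= cutoff}
        if hub:
            hubs.append(hub)
    return hubs

# Toy network: A-E are fully connected, F hangs off A, X-Y is a tiny cluster.
toy_edges = list(combinations("ABCDE", 2)) + [("A", "F"), ("X", "Y")]
toy_clusters = [set("ABCDEF"), {"X", "Y"}]
hubs = clean_hubs(toy_edges, toy_clusters)
```

In the toy network, gene F has one connection, which is below log(6), so it is pruned, and the two-gene cluster is dropped for being too small.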
Novel signature comparison
Each signature or hub has a set of genes and corresponding weights for how many connections a gene shares with other genes. Pairwise cosine similarity was calculated over the union of the two gene lists, imputing 0s for genes absent from a list (Eq. 9). For pairwise hypergeometric similarity between two signatures, the cumulative distribution function for two lists of genes was determined using the genes present in the original species aggregated counts as the background list (Eq. 10). For signature network analysis, all pairwise hypergeometric sf values (i.e., p-values) were corrected with the Bonferroni method, converted to $-\log_{10}{sf}_{corrected}$, and used as similarity scores between signatures if they were significant.
To find genes represented in the signatures more than expected by chance, we used a random permutation method. A set of hubs with random genes identical in size to the original signatures was generated 1000 times from the background set of expressed genes in the dataset. A distribution was created representing the number of times each gene was found in each of the 1000 permutations. The actual number of signatures a gene was found in was compared to this distribution to determine what proportion of randomly sampled genes were below it in rank.
$$\mathrm{cosine}\left(L_{i},L_{j}\right)=\frac{\sum_{k=1}^{n}a_{i,k}a_{j,k}}{\sqrt{\sum_{k=1}^{n}a_{i,k}^{2}}\,\sqrt{\sum_{k=1}^{n}a_{j,k}^{2}}},\quad\text{where } a_{i,k}=\begin{cases}w_{i,k} & \text{if } g_{k}\in L_{i}\\ 0 & \text{otherwise}\end{cases} \tag{9}$$
(Eq. 9) $\mathrm{cosine}(L_i,L_j)$: cosine similarity between two signatures $L_i$ and $L_j$. Where $a_{i,k}$ represents the weight of gene $k$ in gene list $i$. Where $n$ represents the total number of genes in the union of signatures $L_i$ and $L_j$.
$$sf=1-P\left(\left|L_{i}\cap L_{j}\right|-1,\left|N\right|,\left|L_{i}\right|,\left|L_{j}\right|\right)=1-\sum_{k=0}^{x}\frac{\binom{\left|L_{j}\right|}{k}\binom{\left|N\right|-\left|L_{j}\right|}{\left|L_{i}\right|-k}}{\binom{\left|N\right|}{\left|L_{i}\right|}} \tag{10}$$
(Eq. 10) $sf$: survival function of the hypergeometric distribution, where $P(x,|N|,|L_i|,|L_j|)$ is the cumulative distribution function. Where $L_i$ and $L_j$ represent two gene lists and $|N|$ represents the cardinality of the background gene list, which comprises all genes detected in the respective dataset. $x$ is equal to the cardinality of the $L_i$ and $L_j$ intersection minus 1.
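As a small worked illustration of Eqs. 9 and 10, the two similarity measures can be written with the Python standard library alone. The function names and the toy gene lists are assumptions for this sketch; in practice a library routine such as a hypergeometric survival function would normally replace the explicit sum.

```python
import math

def cosine_similarity(weights_i, weights_j):
    """Eq. 9. weights_*: dict mapping gene -> weight (network connections).

    Genes missing from one list contribute a weight of 0.
    Both lists must contain at least one non-zero weight.
    """
    genes = set(weights_i) | set(weights_j)
    dot = sum(weights_i.get(g, 0) * weights_j.get(g, 0) for g in genes)
    norm_i = math.sqrt(sum(w * w for w in weights_i.values()))
    norm_j = math.sqrt(sum(w * w for w in weights_j.values()))
    return dot / (norm_i * norm_j)

def hypergeom_sf(list_i, list_j, n_background):
    """Eq. 10: probability of an overlap at least as large as observed."""
    li, lj = set(list_i), set(list_j)
    overlap = len(li & lj)
    total = math.comb(n_background, len(li))
    # Sum the probability mass for overlaps of 0 .. observed - 1.
    below = sum(
        math.comb(len(lj), k) * math.comb(n_background - len(lj), len(li) - k)
        for k in range(overlap)
    )
    return 1.0 - below / total

sig_a = {"Cdkn2a": 2, "Il6": 1}
sig_b = {"Cdkn2a": 2, "Tp53": 1}
cos_ab = cosine_similarity(sig_a, sig_b)        # 4 / (sqrt(5) * sqrt(5)) = 0.8
sf_ab = hypergeom_sf(sig_a, sig_b, n_background=10)  # 1 - 28/45
```

A Bonferroni correction would multiply each sf value by the number of pairwise tests (capped at 1) before the $-\log_{10}$ transform described in the text.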
Gene set enrichment – GO, KEGG, transcription factor binding
We used the Enrichr Python API gseapy47 for gene set enrichment against the GO and KEGG databases (refs). The background gene set comprised all expressed genes from the respective dataset. Only FDR-corrected p-values below 0.05 were considered significant. A custom “senescence” gene set was added, comprising the union of all literature-based senescence markers collected for this study and SenMayo20.
For transcription factor binding analysis, the regions 1000 bp upstream and 500 bp downstream of the transcription start sites were extracted for each gene in a gene list. JASPAR 2020 core vertebrate non-redundant position frequency matrices were used as the input motifs48. The extracted regions were examined for relative motif enrichment using the MEME Suite Simple Enrichment Analysis (SEA) tool.
Only Benjamini-Hochberg-corrected p-values below 0.05 were considered significant.
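For readers unfamiliar with the Benjamini-Hochberg correction used here, a minimal sketch follows. The helper name `benjamini_hochberg` is hypothetical; the enrichment tools used in the study apply their own corrections, so this only illustrates the standard procedure.

```python
import numpy as np

def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by n / rank.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.minimum(adjusted, 1.0)
    return out

raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = adj < 0.05
```

With these four toy p-values, all adjusted values fall below the 0.05 cutoff used in the text.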
Scoring cells using SenePy
Gene signatures comprise genes and their respective number of edges in their network (termed importance value). We developed SenePy, a lightweight and fast scoring algorithm specific for our gene sets that borrows from Seurat’s AddModuleScore() and Scanpy’s tl.score_genes(). SenePy is built in Python and integrates well with scanpy and anndata. SenePy has four core functions: load_hubs(), translator(), score_hub(), and score_all_cells(). The load_hubs() function initializes the hub object, which includes the hubs themselves along with additional metadata, such as each hub’s enrichment for known senescence genes. Depending on the input data and its respective reference, the optional translator() function can be used to harmonize gene symbols based on known gene aliases. The score_hub() function takes one input hub and anndata and returns a list with one score per cell. The score_all_cells() function takes one input hub and anndata and stratifies the data based on input categories, for example, to score individual cell types separately to avoid confounding the score.
The scoring happens in multiple steps. First, the mean is calculated for each gene in the dataset across all cells (Eq. 11). All genes are ranked by their mean and split into nbins (default: 25) expression bins (Eq. 12). Next, nctrl_size (default: 50) background genes are selected for each input signature gene from its corresponding expression bin (Eq. 13). The counts data are then optionally binarized (default: True) to represent the binary senescence cellular state and the gene-cell positivity from which the underlying networks were derived. Next, the counts are optionally amplified (default: True) by their corresponding importance value from the input signature (e.g., if [Cdkn2a, 2] is in the signature, all Cdkn2a values would be multiplied by 2) (Eq. 14). Then the cell-by-signature-gene matrix is averaged across genes for each cell, and the per-cell mean of the cell-by-background matrix, averaged in the same way, is subtracted from it (Eq. 15).
$$m_{j}=\frac{1}{n}\sum_{i=1}^{n}X_{ij} \tag{11}$$
(Eq. 11) Where $X$ is a matrix which contains the expression level of gene $j$ in cell $i$ and $m_j$ is the average expression of gene $j$ across all cells.
$$B_{1},B_{2},\ldots,B_{n\_bins} \tag{12}$$
(Eq. 12) Where $n\_bins$ is the number of bins $B$ used to categorize every gene based on its mean expression.
$$s\in B_{k},\ \text{for } s\in S$$
$$BG_{s^{k}}=\left\{g \mid g\in B_{k}^{\left(n_{ctrl\ size}\right)},\ g\neq s\right\}$$
$$BG=\bigcup_{s\in S}BG_{s^{k}} \tag{13}$$
(Eq. 13) $BG$: background gene set. Where $S$ is the gene signature and $s$ is a gene within $S$. $B_k$ is a subset of genes that fall within the k-th expression bin based on their mean expression. Where $g$ is a background gene selected from the expression bin $B_k$ and $n_{ctrl\ size}$ is the number of background genes selected for each signature gene $s$ from the corresponding expression bin. Where $BG_{s^k}$ is a set of background genes randomly selected from the same expression bin $B_k$ as the signature gene $s$. $BG$ is the union of all background genes selected for each signature gene $s$. Note, $BG$ depends on $B_k$ and set $S$.
$$Y_{ij}=\begin{cases}1 & \text{if } X_{ij}>0\\ 0 & \text{otherwise}\end{cases}$$
$$Z_{ij}=Y_{ij}\times I_{j} \tag{14}$$
(Eq. 14) $Z_{ij}$: modified expression matrix. Where $Y_{ij}$ is the optionally binarized expression matrix $X_{ij}$ and $I_j$ represents the optional importance values for gene $j$. The optional importance values are obtained by the centrality of a gene in its network signature (i.e., the number of connected edges).
$$\overline{Y_{i,S}}=\frac{1}{\left|S\right|}\sum_{s\in S}Z_{i,s},\ \text{for } i\in \text{all cells}$$
$$\overline{Y_{i,BG}}=\frac{1}{\left|BG\right|}\sum_{g\in BG}Z_{i,g},\ \text{for } i\in \text{all cells}$$
$$\mathrm{Score}_{i}=\overline{Y_{i,S}}-\overline{Y_{i,BG}},\ \text{for } i\in \text{all cells} \tag{15}$$
(Eq. 15) $\mathrm{Score}_i$: SenePy score for cell $i$. Where $|S|$ and $|BG|$ are the cardinality (number of elements in the set) of the gene signature $S$ and the background gene set $BG$. Where $Z_{i,s}$ and $Z_{i,g}$ represent the optionally amplified expression values of the genes in the gene signature $S$ and background gene set $BG$, respectively.
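The scoring steps in Eqs. 11 to 15 can be sketched in NumPy as below. This is an illustrative re-implementation under stated assumptions, not the SenePy source: the function `senescence_score` and its argument names are hypothetical, all signature genes are excluded from the background pool (the text only requires $g \neq s$), bins are formed from mean-expression ranks, and background genes are given an importance of 1.

```python
import numpy as np

def senescence_score(X, gene_names, signature, n_bins=25, ctrl_size=50,
                     binarize=True, use_importance=True, seed=0):
    """X: (cells x genes) counts; signature: dict gene -> importance value."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    index = {g: k for k, g in enumerate(gene_names)}
    sig_idx = [index[g] for g in signature if g in index]

    # Eq. 11: mean expression per gene. Eq. 12: rank genes into bins.
    means = X.mean(axis=0)
    ranks = means.argsort().argsort()
    bins = ranks * n_bins // X.shape[1]

    # Eq. 13: sample background genes from each signature gene's bin.
    background = set()
    for s in sig_idx:
        pool = [g for g in np.flatnonzero(bins == bins[s]) if g not in sig_idx]
        if not pool:
            continue  # no candidate background genes in this bin
        picked = rng.choice(pool, size=min(ctrl_size, len(pool)), replace=False)
        background.update(int(g) for g in picked)
    bg_idx = sorted(background)

    # Eq. 14: optional binarization and importance weighting.
    Z = (X > 0).astype(float) if binarize else X.copy()
    if use_importance:
        for g, weight in signature.items():
            if g in index:
                Z[:, index[g]] *= weight

    # Eq. 15: per-cell signature mean minus per-cell background mean.
    return Z[:, sig_idx].mean(axis=1) - Z[:, bg_idx].mean(axis=1)

# Toy data: the first 20 cells express three signature genes.
rng = np.random.default_rng(0)
genes = [f"g{k}" for k in range(120)]
counts = rng.poisson(0.3, size=(200, 120))
counts[:20, :3] += 5
toy_signature = {"g0": 3, "g1": 2, "g2": 1}
scores = senescence_score(counts, genes, toy_signature, n_bins=5, ctrl_size=10)
```

In the toy data the first 20 cells receive higher scores on average than the remaining cells. Drawing the background from the same expression bin as each signature gene is what keeps the score from simply tracking overall expression level.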
Merging multiple signatures and identifying a universal senescence signature
The SenePy signature database comprises multiple gene sets of different sizes, and an individual gene may be found in multiple sets. The universal signature can be defined by finding genes that are overrepresented in the signature gene sets, using all genes in the respective species dataset as the background set. We can determine the probability that a gene is included in any given number of sets using a generating function (Eq. 16). The cumulative distribution function (CDF) is calculated from the cumulative sum of the resulting probabilities. Finally, the p-value for any given number of sets is obtained by subtracting the respective CDF value from 1. The resulting p-values are then corrected using the Benjamini-Hochberg procedure.

$$\begin{aligned} p_i &= \frac{\left|S_i\right|}{\left|BG\right|} \\ G(x) &= \prod_{i=1}^{k}\left[\left(1-p_i\right)+p_i x\right] \\ G(x) &= \sum_{m=0}^{k} a_m x^m \\ P(x=m) &= \frac{a_m}{\sum_{j=0}^{k} a_j} \end{aligned} \tag{16}$$
(Eq. 16) $P(x=m)$: The probability that a gene is found in exactly $m$ sets (gene signatures). $p_i$ is the probability a gene is found in each set $S_i$ based on the number of genes in the background $BG$. $G(x)$ is the overall generating function of all $k$ sets and can be defined as the product of individual generating functions for each set. $G(x)$ is expanded into a polynomial where $a_m$ is the coefficient of $x^m$ and represents the probability a gene is in $m$ sets. The probability mass function is obtained by normalizing the probabilities so the sum of probabilities is 1.
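The generating-function calculation in Eq. 16 can be illustrated with a minimal, dependency-free Python sketch. It is not the SenePy implementation: the function names (`set_membership_pmf`, `upper_tail_p_values`, `benjamini_hochberg`) and the example set sizes are hypothetical, chosen only to show how the coefficients $a_m$, the CDF-based p-values, and the multiple-testing correction could be computed.

```python
def set_membership_pmf(set_sizes, background_size):
    """PMF of the number of sets containing a gene (coefficients of G(x))."""
    # p_i = |S_i| / |BG| for each signature set
    probs = [size / background_size for size in set_sizes]
    # Expand G(x) = prod_i [(1 - p_i) + p_i * x], starting from G(x) = 1
    coeffs = [1.0]
    for p in probs:
        expanded = [0.0] * (len(coeffs) + 1)
        for m, a in enumerate(coeffs):
            expanded[m] += a * (1.0 - p)  # gene absent from this set
            expanded[m + 1] += a * p      # gene present in this set
        coeffs = expanded
    # Normalize so the probabilities sum to 1
    total = sum(coeffs)
    return [a / total for a in coeffs]


def upper_tail_p_values(pmf):
    """p-value for each m, computed as 1 - CDF(m) as described in the text."""
    p_values, running = [], 0.0
    for prob in pmf:
        running += prob
        p_values.append(max(0.0, 1.0 - running))
    return p_values


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda idx: p_values[idx])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: three signatures drawn from a 20,000-gene background
pmf = set_membership_pmf([100, 200, 50], 20000)
p_by_count = upper_tail_p_values(pmf)
```

In such a sketch, each gene would be assigned the p-value matching the number of signatures it appears in, and the per-gene p-values would then be passed to `benjamini_hochberg`.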
Senescence burden in spatially resolved transcriptomics
Data were preprocessed in Scanpy, and spots with fewer than 1000 detected genes were removed. Each spot was normalized to 10,000 counts and log-transformed. The 8 heart-specific mouse hub signatures were used to score the spatially resolved mouse hearts independently using senepy.score_hub() with a translator() and with binarize and importance set to False because Visium data have higher gene counts than single-cell data. Outlier spots were identified in each sample if they fell more than 3 standard deviations above the mean of their respective sample distribution and, in addition, of a combined sample distribution ($\text{Outlier} > \mu + 3\sigma$). The outliers from each signature were merged to determine whether any given spot was an outlier. Relative senescence burden is presented as the proportion of outlier spots. For the mouse brains, we used the 150 most common genes across all the signatures because no mouse brain-specific signatures were available. To determine spatial autocorrelation, we used the ESDA Python package (https://pysal.org/esda/). Each spot was assigned a value of 1 if it was an outlier and 0 otherwise, and the spatial weights were set to the inverse of the Euclidean distance between two spots. A maximum Euclidean distance of three was used, and the weights for distances beyond three were set to 0 (Eq. 17).

$$I=\frac{n\sum_{i}\sum_{j}\left(\frac{1}{d(i,j)}\right)\cdot\delta\left(d(i,j)\le 3\right)\left(x_i-\bar{x}\right)\left(x_j-\bar{x}\right)}{\sum_{i}\sum_{j}\left(\frac{1}{d(i,j)}\right)\cdot\delta\left(d(i,j)\le 3\right)\sum_{i}\left(x_i-\bar{x}\right)^{2}},\qquad \text{p-value}=1-\Phi(I) \tag{17}$$
(Eq. 17) $I$: Moran's I, where $n$ is the number of spots; $d(i,j)$ is the Euclidean distance between spot $i$ and spot $j$; $\delta\left(d(i,j)\le 3\right)$ is 1 if the distance is less than or equal to 3 and otherwise 0; and $x_i$ and $x_j$ are the values at spot $i$ and spot $j$. $\Phi(I)$ is the CDF of the standard normal distribution at the Moran's I value.
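Eq. 17 can be written out directly as a short Python sketch using only the standard library. This is an illustration of the formula rather than the ESDA-based analysis used in the study: the function names `morans_i` and `upper_tail_p` are hypothetical, and skipping self-pairs ($i=j$, where the inverse distance is undefined) is an assumption of this sketch.

```python
import math


def morans_i(coords, values, max_dist=3.0):
    """Moran's I with inverse-distance weights truncated at max_dist."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    weighted_cross = 0.0  # sum_ij w_ij * (x_i - mean) * (x_j - mean)
    weight_sum = 0.0      # sum_ij w_ij
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # assumption: self-pairs carry no weight
            d = math.dist(coords[i], coords[j])
            if d == 0.0 or d > max_dist:
                continue  # delta(d <= max_dist) is 0 beyond the cutoff
            w = 1.0 / d
            weight_sum += w
            weighted_cross += w * dev[i] * dev[j]
    variance_sum = sum(x * x for x in dev)
    if weight_sum == 0.0 or variance_sum == 0.0:
        raise ValueError("Moran's I is undefined for these inputs")
    return n * weighted_cross / (weight_sum * variance_sum)


def upper_tail_p(i_value):
    """1 - Phi(I), with Phi the standard normal CDF, as written in Eq. 17."""
    return 1.0 - 0.5 * (1.0 + math.erf(i_value / math.sqrt(2.0)))


# Hypothetical example: four spots in a row, outliers (1) clustered on the left
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
flags = [1, 1, 0, 0]
i_stat = morans_i(coords, flags, max_dist=1.0)
p_value = upper_tail_p(i_stat)
```

Clustered outlier flags give a positive statistic, whereas an alternating pattern gives a negative one.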
Senescence burden in COVID-19 mortality
Single-cell lung data from 20 patients who died from COVID-19 and 7 control patients were collected from an available atlas35. Doublets were removed from each individual sample using SOLO49 in combination with scvi-tools. Cells with low counts or high mitochondrial reads were removed. scvi-tools was used to integrate the 27 samples, using sample ID as a categorical covariate and mitochondrial read percent, ribosomal read percent, and total counts as continuous covariates. Cell types were manually annotated using known cell-type markers (PanglaoDB). Cell types were scored with respective cell-type hubs from SenePy (e.g., epithelial cells were scored with ciliated epithelial, basal cell, club cell, and pneumocyte hubs) using the senepy.score_all_cells() function. Cells were divided and scored as individual subtypes (e.g., AT1, AT2, airway epithelium). Cell outliers were identified in each sample if they fell more than 3 standard deviations above the mean within every respective cell-type distribution ($\text{Outlier} > \mu + 3\sigma$). Outliers were merged across hubs to identify all cells with potential senescence burden and output as a proportion of total cells.
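The outlier rule and the merging across hubs can be sketched in a few lines of standard-library Python. The helper names (`flag_outliers`, `merge_outliers`, `burden`) are hypothetical and this is not the SenePy code. Grouping cells by a (sample, cell type) key and using the population standard deviation are assumptions of the sketch, as is treating "merged" as a union, where a cell counts as an outlier if any hub flags it.

```python
from collections import defaultdict
from statistics import mean, pstdev


def flag_outliers(scores, groups, n_sd=3.0):
    """Flag cells whose score exceeds mu + n_sd * sigma within their group."""
    by_group = defaultdict(list)
    for score, group in zip(scores, groups):
        by_group[group].append(score)
    # One threshold per group, e.g. per (sample, cell type) pair
    thresholds = {
        group: mean(vals) + n_sd * pstdev(vals)
        for group, vals in by_group.items()
    }
    return [score > thresholds[group] for score, group in zip(scores, groups)]


def merge_outliers(flags_per_hub):
    """Assumed union rule: a cell is an outlier if any hub flags it."""
    return [any(cell_flags) for cell_flags in zip(*flags_per_hub)]


def burden(merged_flags):
    """Proportion of cells flagged as outliers."""
    return sum(merged_flags) / len(merged_flags)


# Hypothetical example: 20 cells from one sample and cell type, one high scorer
scores = [0.0] * 19 + [10.0]
groups = [("sample_1", "AT2")] * 20
flags = flag_outliers(scores, groups)
proportion = burden(merge_outliers([flags]))
```

The same functions apply to the spatial data if spots are grouped by sample instead of by sample and cell type.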
Validation of SenePy signatures from bulk-RNA data
Data were obtained in mixed formats due to differences in study data availability. If differential expression data were available, they were used directly in gene set enrichment. Otherwise, raw counts were processed through a standard DESeq250 pipeline implemented with PyDESeq251. Enrichment analyses were performed with the gseapy UCSD GSEA API52. Only SenePy signatures relevant to the respective context were used in each enrichment comparison. For example, only SenePy lung signatures were tested if the data came from the lungs. The senGPT signature was generated by prompting ChatGPT-4 for 100 upregulated gene markers of cellular senescence in two non-overlapping batches of 50 genes.
Pseudotime analysis
Raw cell and hashtag counts were retrieved for GSE22233833. The data were demultiplexed using HashSolo49. Cells were removed if they fell more than 5 median absolute deviations from the median of their respective log1p_total_counts, log1p_n_genes_by_count, pct_count_in_top_20_genes, and pct_counts_mt distributions. Doublets were removed with scrublet53. Pseudotime was calculated using CellOracle with a centrally located (2D space) root cell in the mVenus control group54.
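The median-absolute-deviation (MAD) filter can be illustrated with a small standard-library sketch. The function names `mad_outlier_mask` and `keep_cells` are hypothetical. The unscaled MAD and the two-sided cutoff are assumptions, since the text does not state whether a scaling constant or a one-sided rule was used.

```python
from statistics import median


def mad_outlier_mask(values, n_mads=5.0):
    """True where a value lies more than n_mads MADs from the median."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return [abs(v - med) > n_mads * mad for v in values]


def keep_cells(metrics, n_mads=5.0):
    """Keep a cell only if no QC metric flags it as an outlier."""
    masks = [mad_outlier_mask(vals, n_mads) for vals in metrics.values()]
    return [not any(cell_flags) for cell_flags in zip(*masks)]


# Hypothetical QC table: one list of per-cell values per metric
qc = {
    "log1p_total_counts": [1.0, 2.0, 3.0, 4.0, 5.0, 100.0],
    "pct_counts_mt": [2.0, 2.5, 3.0, 3.5, 4.0, 3.0],
}
kept = keep_cells(qc)
```

Only the last cell in this toy table is dropped, because its total-count value lies far outside the 5-MAD window.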
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Supplementary information
Supplementary Information; Description of Supplementary Data 1-9; Supplementary Data 1-9; Reporting Summary; Transparent Peer Review file