Disease spectrum and comorbidity patterns of malignant neoplasms: a multi-center hospital-based retrospective analysis of inpatient insurance claims data.
Citation: Liu M, Chen L, et al. (2026). Disease spectrum and comorbidity patterns of malignant neoplasms: a multi-center hospital-based retrospective analysis of inpatient insurance claims data. Frontiers in Oncology, 16, 1744397. https://doi.org/10.3389/fonc.2026.1744397
PMID: 41971438
Abstract
[BACKGROUND] Previous studies indicate a high comorbidity burden among patients with malignant neoplasms, but claims-based comorbidity patterns have not been systematically characterized at the regional multi-hospital level in China.
[METHODS] This study is a multi-center, hospital-based retrospective analysis of inpatient insurance claims. We analyzed anonymized inpatient medical insurance claims data (2016-2021) from 163 hospitals in Zhanjiang, China, focusing on hospitalized patients with a primary diagnosis of malignant neoplasms. Malignant neoplasms and comorbidities were identified using International Classification of Diseases, 10th Revision (ICD-10) codes. Disease spectrum was stratified by sex, age, and region. Comorbidity patterns were delineated using association rule mining and network analysis.
[RESULTS] Among 107,029 patients, the ten most common malignancies accounted for 75.96% of all cases. Hospitalizations were more frequent among rural populations, males, and individuals aged ≥65 years. The median number of co-diagnosed conditions across major malignancies was 5 (interquartile range [IQR]: 3-7). Network analysis revealed three major co-diagnosis clusters: 1) liver cancer with chronic viral hepatitis and hepatic fibrosis; 2) lung cancer with chronic obstructive pulmonary disease (COPD) and pneumonia; and 3) colorectal cancer with inflammatory bowel disease-related conditions and intestinal obstruction. Patterns varied across sex, age groups, and urban-rural residence.
[CONCLUSIONS] This study demonstrates a high comorbidity burden among hospitalized cancer patients, with distinct malignancy-specific co-diagnosis patterns. These findings support the need for integrated clinical management and targeted healthcare resource allocation, particularly for older, male, and rural patient populations.
Introduction
Malignant neoplasms are a major global public health issue and a leading cause of death. According to the 2021 global cancer data from the International Cancer Center, there were 19.29 million new cancer cases and nearly 10 million cancer deaths worldwide in 2020 (1). China accounted for 4.57 million new cases, representing 23.7% of the global total cases, and the global cancer burden is projected to reach 28.4 million cases by 2040. In China, cancer incidence and mortality continue to increase, with 4,824,700 new cases and 2,574,200 deaths reported in 2022 (2).
Malignant neoplasms arise from interactions of genetic, environmental, behavioral, and socioeconomic risk factors. Age significantly influences cancer development, with most (>50%) cases occurring in adults >65 years (3). Urban-rural disparities in cancer incidence reflect differential environmental exposures, lifestyle practices, and healthcare access (4), contributing to superior early diagnosis/treatment rates in urban settings due to concentrated resources (5). South China, especially Guangdong Province, exhibits a distinct epidemiologic pattern of cancer driven by HBV/HCV prevalence, EBV endemicity, and dietary exposures (6). Zhanjiang City exhibits a strikingly elevated age-standardized incidence rate (ASIR) of liver cancer, 2.1 times the national average (7), warranting targeted investigation.
Comorbidity, defined as the presence of two or more concurrent chronic conditions, is highly prevalent, especially in middle-aged and older adults (8). In cancer patients, comorbidities significantly impact quality of life (9–11), functional status (9, 12), mortality risk (13, 14), and healthcare costs (15, 16), placing considerable strain on healthcare systems. Optimizing comorbidity management in cancer patients is essential for clinical decision-making and prognosis improvement. The rising global cancer burden, coupled with population aging, underscores the increasing importance of understanding and addressing comorbidities in oncology care.
Cancer-related multimorbidity is increasingly recognized as a major determinant of treatment tolerance, prognosis, and healthcare utilization. However, evidence from China remains limited and is often constrained to single-center cohorts or predefined comorbidity indices. Using region-wide inpatient insurance claims from 163 hospitals in Zhanjiang (2016–2021), this study provides a comprehensive characterization of the inpatient malignant neoplasm spectrum and claims-based co-diagnosis patterns. By applying association rule mining and network analysis, and further exploring heterogeneity by sex, age, and urban-rural residence, our findings may inform integrated inpatient management strategies and local resource allocation for clinically complex cancer populations.
Research on cancer comorbidities in China remains nascent, limiting evidence-based clinical management strategies for malignant neoplasms. Medical insurance claims data provide an effective method for identifying comorbidities within cancer populations (17, 18), overcoming inaccuracies inherent in self-reported data (19). As comorbidities in administrative databases derive from clinical diagnoses, this approach enhances the quality of epidemiological research.
This study seeks to:
1. Characterize the cancer spectrum in Zhanjiang utilizing population-level insurance claims data (2016-2021);
2. Identify comorbidity patterns through association rule mining;
3. Investigate variations in comorbidity patterns according to sex, age, and urban-rural residence.
Materials and methods
Data source
The study utilized anonymized medical insurance claims data covering all inpatient admissions with a primary diagnosis of malignant neoplasm between January 1, 2016, and December 31, 2021. The data originated from 163 hospitals located in Zhanjiang, China (Supplementary Table 1). Available variables included age, sex, urban–rural residence, year of admission, and diagnoses coded according to the International Classification of Diseases, 10th Revision (ICD-10). The operational definitions of all comorbid conditions, including the corresponding ICD-10 codes, are detailed in Supplementary Table 2 to ensure transparency and reproducibility. Data processing was governed by the following principles: 1. ICD-10 standardization; 2. Deduplication by patient ID; 3. Temporal linkage of multi-admission records.
Study subjects
Inclusion criteria comprised: 1. Hospitalized patients with a primary diagnosis of malignant neoplasm during the study period; 2. Complete information on primary diagnosis, age, sex, residence, and admission date; 3. Malignant neoplasms with a defined primary site. Exclusion criteria were: 1. Secondary malignant neoplasms; 2. Malignant neoplasms of unspecified or unknown primary site; 3. Non-local residents. A final cohort of 107,029 patients met these criteria (Figure 1).
Comorbidity ascertainment
In this claims-based study, comorbidity was operationally defined as any additional diagnosis co-recorded with the index malignant neoplasm in inpatient claims. Accordingly, the term refers to claims-based co-diagnosed conditions during hospitalization, encompassing pre-existing chronic diseases, cancer-related manifestations, treatment-related adverse events, and in-hospital complications. Diagnoses across multiple admissions were aggregated, while conditions recorded only once and not re-documented within 6 months were excluded to reduce transient diagnoses. Accordingly, the identified patterns reflect clinically meaningful co-diagnosis clusters rather than causal relationships.
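The aggregation and exclusion rule above can be read in more than one way. The sketch below takes one possible reading, stated here as an assumption: a diagnosis is retained for a patient only when it is documented on at least two admissions that fall within roughly six months of each other (approximated as 183 days). The function name, the record layout, and the sample ICD-10 codes are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from datetime import date

# Assumption: "within 6 months" is approximated as 183 days.
WINDOW_DAYS = 183


def retained_conditions(records):
    """Return {patient_id: set of retained diagnosis codes}.

    records: iterable of (patient_id, admission_date, icd10_code).
    A code is retained when the same patient has it documented on two
    distinct admission dates no more than WINDOW_DAYS apart.
    """
    # Aggregate admission dates per (patient, diagnosis) across admissions.
    dates_by_key = defaultdict(set)
    for patient_id, admission_date, code in records:
        dates_by_key[(patient_id, code)].add(admission_date)

    kept = defaultdict(set)
    for (patient_id, code), dates in dates_by_key.items():
        ordered = sorted(dates)
        # Checking consecutive dates is enough: if any two dates fall within
        # the window, so does at least one pair of neighbours.
        if any((b - a).days <= WINDOW_DAYS for a, b in zip(ordered, ordered[1:])):
            kept[patient_id].add(code)
    return dict(kept)


# Hypothetical records, for illustration only.
records = [
    ("p1", date(2018, 1, 10), "J44"),
    ("p1", date(2018, 4, 2), "J44"),   # re-documented within the window
    ("p1", date(2018, 1, 10), "R50"),  # recorded only once
    ("p2", date(2017, 3, 1), "I10"),
    ("p2", date(2019, 3, 1), "I10"),   # re-documented, but two years later
]
result = retained_conditions(records)
```

Under this reading, only the diagnosis repeated within the window survives for patient p1, and patient p2 retains nothing. A different reading of the rule (for example, keeping any diagnosis documented twice regardless of spacing) would change the filter condition only.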
Statistical analysis
Descriptive statistics characterized the patient cohort and disease spectrum, stratified by sex, age group (0–19, 20–39, 40–64, ≥65 years), and urban-rural residence. Comorbidity burden was described using median counts and interquartile ranges (IQR), and prevalence percentages for the top malignancies. The average annual growth rate (AAGR) was calculated as:
$$\mathrm{AAGR} = \left[\left(\frac{V_t}{V_0}\right)^{1/t} - 1\right] \times 100\%$$
where $V_0$ and $V_t$ denote the annual number of hospitalizations in the first (2016) and last (2021) year (20), respectively, and $t$ denotes the number of years between them. This formulation is consistent with standard compound annual growth rate calculations used in epidemiological trend descriptions (21). Network construction parameters, including support, confidence, and lift thresholds, were fixed across analyses to ensure reproducibility.
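As a worked illustration, the compound-growth form of the AAGR, (Vt / V0)^(1/t) − 1 expressed as a percentage, can be evaluated directly. The sketch below is not the authors' code; the function name is hypothetical, and the inputs are the 2016 and 2021 annual hospitalization counts given in the Results (16,470 and 18,466), with t = 5 assumed as the number of years between them.

```python
def average_annual_growth_rate(v0, vt, t):
    """Compound average annual growth rate, in percent.

    v0: count in the first year; vt: count in the last year;
    t: number of years between the two (assumed to be 5 for 2016 -> 2021).
    """
    if v0 <= 0 or t <= 0:
        raise ValueError("v0 and t must be positive")
    return ((vt / v0) ** (1.0 / t) - 1.0) * 100.0


# Annual hospitalization counts taken from the Results section.
rate = average_annual_growth_rate(16470, 18466, 5)
print(f"AAGR: {rate:.2f}%")
```

With these inputs the result is roughly 2.3% per year, close to the 2.32% reported in the Results; the small difference may come from rounding or from a slightly different averaging convention.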
Association rule mining
We applied the Apriori algorithm to identify frequent co-diagnosis itemsets and generate association rules (22). This algorithm employs an iterative, level-wise search strategy to construct candidate itemsets and prunes those that fail to meet a predefined minimum support threshold, based on the principle that all subsets of a frequent itemset must themselves be frequent. This property reduces computational burden while ensuring robust pattern identification.
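The level-wise search and subset-based pruning described above can be sketched in a few lines. This is a minimal pure-Python illustration of the Apriori idea, not the "arules" implementation used in the study; the function name, the toy diagnosis labels, and the 0.4 support threshold are assumptions chosen for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Proportion of records that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


# Toy records (hypothetical): each set is one patient's co-recorded diagnoses.
transactions = [
    {"lung_cancer", "COPD", "pneumonia"},
    {"lung_cancer", "COPD"},
    {"liver_cancer", "hepatitis", "cirrhosis"},
    {"liver_cancer", "hepatitis"},
    {"lung_cancer", "pneumonia"},
]
frequent = apriori(transactions, min_support=0.4)
```

In this toy data the three-item candidate containing lung cancer, COPD, and pneumonia is discarded in the prune step without ever being counted, because the pair COPD and pneumonia is not frequent.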
Association rules were evaluated using three standard metrics: support, confidence, and lift. Support (range: 0–1) represents the proportion of records containing a given itemset and reflects its prevalence within the dataset. Confidence (range: 0–1) denotes the conditional probability of observing the consequent diagnosis given the antecedent diagnosis, thereby quantifying the strength of the directional association. Lift (range: >0) compares the observed co-occurrence frequency with that expected under statistical independence; lift > 1 indicates a positive association, lift = 1 indicates independence, and lift < 1 suggests a negative association.
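To make the three metrics concrete, the sketch below computes them for a single rule (antecedent → consequent) from raw counts. The function name and the eight toy records are hypothetical and serve only to show the arithmetic.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    n = len(transactions)
    n_a = sum(a <= set(t) for t in transactions)         # records containing the antecedent
    n_c = sum(c <= set(t) for t in transactions)         # records containing the consequent
    n_ac = sum((a | c) <= set(t) for t in transactions)  # records containing both
    if n == 0 or n_a == 0 or n_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    support = n_ac / n                 # P(A and C)
    confidence = n_ac / n_a            # P(C | A)
    lift = confidence / (n_c / n)      # P(C | A) / P(C)
    return support, confidence, lift


# Toy records (hypothetical).
transactions = [
    {"hepatitis", "cirrhosis"},
    {"hepatitis", "cirrhosis"},
    {"hepatitis"},
    {"cirrhosis"},
    {"anemia"},
    {"anemia"},
    {"hypertension"},
    {"hypertension"},
]
support, confidence, lift = rule_metrics(transactions, {"hepatitis"}, {"cirrhosis"})
```

Here cirrhosis appears in 3 of 8 records overall but in 2 of the 3 records with hepatitis, so the lift is (2/3) / (3/8) = 16/9, about 1.78, indicating a positive association.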
Hierarchical clustering with complete linkage was performed based on pairwise Jaccard similarity matrices to identify clusters of comorbid conditions (23). Significant association rules were visualized using directed network graphs, in which nodes represented malignant neoplasms and co-diagnosed conditions, directed edges denoted associations, edge thickness was proportional to rule confidence, and edge color intensity corresponded to lift magnitude.
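The clustering step can be illustrated with a small pure-Python sketch: conditions are compared by the Jaccard similarity of the patient sets in which they occur, and clusters are merged by complete linkage (the distance between two clusters is the largest pairwise distance between their members). The function names, the patient IDs, and the 0.6 distance cut-off are assumptions for illustration; the study's actual cut-off is not stated in this section.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| for two sets of patient IDs."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def complete_linkage(patients_by_condition, max_distance):
    """Agglomerative clustering of conditions; stops when the closest pair of
    clusters is farther apart than max_distance."""
    names = sorted(patients_by_condition)
    dist = {
        frozenset((x, y)): jaccard_distance(
            patients_by_condition[x], patients_by_condition[y]
        )
        for x, y in combinations(names, 2)
    }
    clusters = [frozenset([name]) for name in names]

    def cluster_distance(c1, c2):
        # Complete linkage: the farthest pair of members defines the distance.
        return max(dist[frozenset((x, y))] for x in c1 for y in c2)

    while len(clusters) > 1:
        d, c1, c2 = min(
            ((cluster_distance(p, q), p, q) for p, q in combinations(clusters, 2)),
            key=lambda item: item[0],
        )
        if d > max_distance:
            break
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters


# Hypothetical patient IDs per condition.
patients_by_condition = {
    "hepatitis": {1, 2, 3, 4},
    "cirrhosis": {1, 2, 3},
    "COPD": {5, 6, 7},
    "pneumonia": {5, 6, 8},
}
clusters = complete_linkage(patients_by_condition, max_distance=0.6)
```

With this toy input, hepatitis and cirrhosis merge first (distance 0.25), COPD and pneumonia merge next (distance 0.5), and the two resulting clusters stay separate because they share no patients (distance 1.0).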
Software
The Apriori algorithm was implemented using the “arules” package in R (version 4.2.0). Other statistical analyses, including descriptive statistics, were performed using SPSS (version 26.0). Statistical significance was set at P < 0.05.
Results
Patient characteristics
The study cohort comprised 107,029 patients with primary malignant neoplasms. Males constituted 55.89% (n = 59,814) and females 44.11% (n = 47,215), yielding a male-to-female ratio of 1.27:1. Urban residents accounted for 71.98% (n = 79,551) and rural residents for 28.02% (n = 30,960). Age ranged from 0 to 109 years (mean age, 62.47 years), with the largest proportion (47.94%) aged ≥65 years, followed by 40–64 years (43.65%), 20–39 years (7.06%), and 0–19 years (1.34%) (data not shown).
Disease spectrum composition
Within the cohort of 107,029 patients, a total of 110,511 primary malignant tumors were diagnosed, reflecting cases with multiple primary malignancies. Lung cancer was the most frequent diagnosis (n = 23,202, 21.00%). The next most frequent malignancies were liver, colorectal, breast, thyroid, nasopharyngeal (NPC), gastric, and cervical cancers, lymphoma, and leukemia; together with lung cancer, these ten malignancies constituted 75.96% of cases. Stratification by sex showed that lung, liver, colorectal, NPC, and gastric cancers were predominant in males (collectively 64.63% of male cases), while lung, breast, colorectal, thyroid, and cervical cancers were most frequent in females (collectively 59.95% of female cases). Among pediatric patients (<20 years), leukemia, brain/CNS neoplasms, and lymphoma were predominant (71.93%) (Table 1; Figure 2).
Temporal trends
Annual hospitalizations for malignant neoplasms increased from 16,470 in 2016 to 18,466 in 2021, representing an average annual growth rate of 2.32%. Lung cancer consistently had the highest incidence among newly diagnosed cases each year. Liver, colorectal, breast, NPC, thyroid, gastric, and cervical cancers consistently ranked among the top 10 malignancies throughout the six years. Thyroid cancer showed the most notable rise in rank, moving from 7th position in 2016 to 4th by 2021 (Table 2).
Demographic variations
Over the study period, 61,912 new malignancies were diagnosed in males compared to 48,599 in females. Females showed higher hospitalization counts than males below 50 years, whereas males predominated from 50 years onward. The peak age group was 50–59 years in females and 60–69 years in males (Figure 3). Lung cancer was the most common malignancy in both urban and rural areas. Among rural residents, liver cancer ranked second, followed by colorectal cancer. In urban areas, colorectal cancer was the second most common, followed by liver cancer (Figure 4).
Comorbidity burden
The median number of comorbidities varied significantly across cancer types. Patients with lung cancer (median 6, IQR 4–7), liver cancer (6, IQR 3–8), and NPC (6, IQR 3–10) bore the heaviest comorbidity burden. Patients with colorectal cancer, gastric cancer, and lymphoma had a median of 5 comorbidities (IQR 3–7). Cervical cancer patients had a median of 4 comorbidities (IQR 2–7). Breast cancer (3, IQR 1–6), leukemia (3, IQR 2–5), and thyroid cancer (3, IQR 2–5) had the lowest median comorbidity counts. The prevalence of at least one comorbidity among these top malignancies ranged from 69.6% in thyroid cancer to 97.3% in liver cancer (data not shown).
Comorbidity patterns
Binary comorbidity analysis identified 93 patterns exceeding 10% prevalence across ten common malignancies. Liver cancer demonstrated the highest comorbidity rate: 46.7% with concurrent liver fibrosis/cirrhosis, followed by pneumonia in lung cancer (45.6%). Other frequently co-occurring comorbidities included pleural disease, hypertension, COPD, anemia, diabetes, ischemic heart disease, and cerebral infarction. The top 20 most frequent comorbidities for each of the 10 most prevalent malignancies are presented in Figure 5 and Supplementary Table 3.
Association rule mining yielded 138 significant rules, with particularly high burdens observed in lung, liver, and colorectal cancer patients. Key associations included a robust link between COPD and pleural lesions in lung cancer (lift = 3.30), a striking association of chronic viral hepatitis with liver fibrosis/cirrhosis in liver cancer (lift = 6.95), and a pronounced association between chronic intestinal inflammatory disease and colorectal cancer (lift = 5.94). Beyond these major malignancies, NPC showed strong links to both chronic intestinal inflammatory disease and chronic sinusitis (lift = 9.76), while benign breast hyperplasia demonstrated a remarkably robust association with breast cancer (lift = 13.72). The full spectrum of comorbidity patterns across all cancers is visualized in the network graph (Figure 6). Cancer-specific network details are provided in Supplementary Figures 3–12.
Sex-specific comorbidity patterns are summarized in Table 3. Anemia and metabolic disorders constituted shared comorbidities across both sexes. Males frequently manifested hepatic fibrosis, chronic hepatitis with liver cancer, and pleural disease with lung cancer, alongside pneumonia (Figure 7A). Females predominantly presented with pleural disease in lung cancer, chronic ischemic heart disease with heart failure, and benign breast hyperplasia with breast cancer (Figure 7B).
Age-specific comorbidity profiles exhibited considerable variation (Table 4). Among patients aged <20 years, comorbidities primarily centered on leukemia and included anemia, pneumonia, and other respiratory disorders (Figure 8A). The 20–39 age group was characterized by predominant associations of liver cancer with hepatitis and thyroid cancer with goiter (Figure 8B). Common patterns in the 40–64 age cohort involved anemia with metabolic disorders, hepatitis, and pleural disease occurring in lung cancer patients (Figure 8C). For patients aged ≥65 years, prevalent comorbidities comprised heart failure with chronic ischemic heart disease, COPD (particularly in lung cancer), and diabetes with hypertension (Figure 8D).
Stratified analysis revealed distinct urban-rural disparities in comorbidity patterns (Table 5). Urban residents exhibited higher frequencies of pleural disease in lung cancer, chronic ischemic heart disease with heart failure, and benign breast hyperplasia in breast cancer (Figure 9A). Conversely, rural populations showed heightened associations of liver fibrosis with liver cancer, chronic hepatitis with liver cancer, and pneumonia with lung cancer (Figure 9B).
Patient characteristics
The study cohort comprised 107,029 patients with primary malignant neoplasms. Males constituted 55.89% (n = 59,814) and females 44.11% (n = 47,215), yielding a male-to-female ratio of 1.27:1. Urban residents accounted for 71.98% (n = 79,551) and rural residents for 28.02% (n = 30,960). Age ranged from 0 to 109 years (mean age, 62.47 years), with the largest proportion (47.94%) aged ≥65 years, followed by 40–64 years (43.65%), 20–39 years (7.06%), and 0–19 years (1.34%). (data not shown).
Disease spectrum composition
Within the cohort of 107,029 patients, a total of 110,511 primary malignant tumors were diagnosed, reflecting cases with multiple primary malignancies. Lung cancer was the most frequent diagnosis (n=23,202, 21.00%). Subsequent malignancies comprised liver, colorectal, breast, thyroid, nasopharyngeal (NPC), gastric, cervical cancers, lymphoma, and leukemia-collectively constituting 75.96% of cases. Gender stratification showed lung, liver, colorectal, NPC, and gastric cancers were predominant in males (collectively 64.63% of male cases), while lung, breast, colorectal, thyroid, and cervical cancers were most frequent in females (collectively 59.95% of female cases). Among pediatric patients (<20 years), leukemia, brain/CNS neoplasms, and lymphoma were predominant (71.93%)(Table 1; Figure 2).
Temporal trends
Annual hospitalizations for malignant neoplasms increased from 16,470 in 2016 to 18,466 in 2021, representing an average annual growth rate of 2.32%. Lung cancer consistently had the highest incidence among newly diagnosed cases each year. Liver, colorectal, breast, NPC, thyroid, gastric, and cervical cancers consistently ranked among the top 10 malignancies throughout the six years. Thyroid cancer showed the most notable rise in rank, moving from 7th position in 2016 to 4th by 2021 (Table 2).
Demographic variations
Over the study period, 61,912 new malignancies were diagnosed in males compared to 48,599 in females. Females showed higher hospitalization counts than males below 50 years, whereas males predominated from 50 years onward. The peak age group was 50–59 years in females and 60–69 years in males (Figure 3). Lung cancer was the most common malignancy in both urban and rural areas. Among rural residents, liver cancer ranked second, followed by colorectal cancer. In urban areas, colorectal cancer was the second most common, followed by liver cancer (Figure 4).
Comorbidity burden
The median number of comorbidities varied significantly across cancer types. Patients with lung cancer (median 6, IQR 4–7), liver cancer (6, IQR 3–8), and NPC (6, IQR 3–10) bore the heaviest comorbidity burden. Colorectal, gastric, and lymphoma cancers had a median of 5 comorbidities (IQR 3–7). Cervical cancer patients had a median of 4 comorbidities (IQR 2–7). Breast cancer (3, IQR 1–6), leukemia (3, IQR 2–5), and thyroid cancer (3, IQR 2–5) had the lowest median comorbidity counts. The prevalence of at least one comorbidity among these top malignancies ranged from 69.6% in thyroid cancer to 97.3% in liver cancer. (data not shown).
Comorbidity patterns
Binary comorbidity analysis identified 93 patterns exceeding 10% prevalence across ten common malignancies. Liver cancer demonstrated the highest comorbidity rate: 46.7% with concurrent liver fibrosis/cirrhosis, followed by pneumonia in lung cancer (45.6%). Other frequently co-occurring comorbidities included pleural disease, hypertension, COPD, anemia, diabetes, ischemic heart disease, and cerebral infarction. The top 20 most frequent comorbidities for each of the 10 most prevalent malignancies are presented in Figure 5 and Supplementary Table 3.
Association rule mining yielded 138 significant rules, with particularly high burdens observed in lung, liver, and colorectal cancer patients. Key associations included a robust link between COPD and pleural lesions in lung cancer (lift = 3.30), a striking association of chronic viral hepatitis with liver fibrosis/cirrhosis in liver cancer (lift = 6.95), and a pronounced association between chronic intestinal inflammatory disease and colorectal cancer (lift = 5.94). Beyond these major malignancies, NPC showed strong links to both chronic intestinal inflammatory disease and chronic sinusitis (lift = 9.76), while benign breast hyperplasia demonstrated a remarkably robust association with breast cancer (lift = 13.72). The full spectrum of comorbidity patterns across all cancers is visualized in the network graph (Figure 6). Cancer-specific network details are provided in Supplementary Figures 3–12.
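For readers unfamiliar with the lift values quoted above, lift compares how often two diagnoses are recorded together with how often they would co-occur if they were independent: lift(A, B) = P(A and B) / (P(A) × P(B)), so values above 1 indicate co-occurrence more frequent than chance. The sketch below computes this on invented toy records; the function name and data are hypothetical, and it shows only the metric itself, not the mining algorithm or the support and confidence thresholds used in the study, which are not given in this section.

```python
def lift(records, antecedent, consequent):
    """Lift of the rule antecedent -> consequent over a list of diagnosis sets.

    Computed as n * count(A and B) / (count(A) * count(B)), which equals
    P(A and B) / (P(A) * P(B)).
    """
    n = len(records)
    count_a = sum(1 for r in records if antecedent in r)
    count_b = sum(1 for r in records if consequent in r)
    count_ab = sum(1 for r in records if antecedent in r and consequent in r)
    if count_a == 0 or count_b == 0:
        raise ValueError("both diagnoses must occur at least once")
    return (count_ab * n) / (count_a * count_b)


# Toy data: each set holds the comorbidities recorded for one hypothetical admission.
toy_records = (
    [{"COPD", "pleural disease"}] * 3
    + [{"COPD"}]
    + [{"pleural disease"}]
    + [{"hypertension"}] * 5
)
print(lift(toy_records, "COPD", "pleural disease"))
```

In this toy example the two diagnoses each appear in 4 of 10 records and appear together in 3, so they co-occur more often than independence would predict.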
Sex-specific comorbidity patterns are summarized in Table 3. Anemia and metabolic disorders constituted shared comorbidities across both sexes. Males frequently manifested hepatic fibrosis, chronic hepatitis with liver cancer, and pleural disease with lung cancer, alongside pneumonia (Figure 7A). Females predominantly presented with pleural disease in lung cancer, chronic ischemic heart disease with heart failure, and benign breast hyperplasia with breast cancer (Figure 7B).
Age-specific comorbidity profiles exhibited considerable variation (Table 4). Among patients aged <20 years, comorbidities primarily centered on leukemia and included anemia, pneumonia, and other respiratory disorders (Figure 8A). The 20–39 age group was characterized by predominant associations of liver cancer with hepatitis and thyroid cancer with goiter (Figure 8B). Common patterns in the 40–64 age cohort involved anemia with metabolic disorders, hepatitis, and pleural disease occurring in lung cancer patients (Figure 8C). For patients aged ≥65 years, prevalent comorbidities comprised heart failure with chronic ischemic heart disease, COPD (particularly in lung cancer), and diabetes with hypertension (Figure 8D).
Stratified analysis revealed distinct urban-rural disparities in comorbidity patterns (Table 5). Urban residents exhibited higher frequencies of pleural disease in lung cancer, chronic ischemic heart disease with heart failure, and benign breast hyperplasia in breast cancer (Figure 9A). Conversely, rural populations showed heightened associations of liver fibrosis with liver cancer, chronic hepatitis with liver cancer, and pneumonia with lung cancer (Figure 9B).
Discussion
Key findings
This large, population-based analysis utilizing comprehensive medical insurance data provides a detailed characterization of the malignant neoplasm spectrum and associated comorbidity patterns among hospitalized patients. We identified lung, liver, and colorectal cancers as the predominant malignancies, consistent with national and global trends (1, 2), alongside significant burdens of NPC, leukemia, and lymphoma. Crude cancer incidence in Zhanjiang City rose 2.32% annually, exceeding the National Cancer Center’s 1.4% benchmark. A substantial comorbidity burden was evident, with a median of 5 comorbidities per patient, varying markedly by cancer type. Patients with liver, lung, and nasopharyngeal cancers bore the heaviest disease burden. Crucially, association rule mining revealed distinct, malignancy-specific comorbidity clusters, particularly involving liver-hepatic, lung-respiratory, and colorectal-inflammatory pathways. These patterns exhibited significant heterogeneity across sex, age groups, and urban-rural residence, reflecting underlying biological, behavioral, and healthcare access factors.
Comorbidity burden and profiles
Geriatric populations bear the greatest cancer burden, with over half of cancer diagnoses occurring at age 65 or older according to established literature (24). Our data corroborate this pattern, showing initial diagnoses at ≥60 and ≥65 years in 60.12% and 47.94% of cases, respectively. Age-related cumulative cellular damage and physiological decline predispose elderly cancer patients to comorbid noncommunicable conditions through degenerative pathways (25). Comorbidity prevalence among cancer patients exhibits substantial methodological heterogeneity globally, with reported rates ranging widely from 0.4% to 90% across studies due to variations in measurement approaches, population characteristics, and malignancy types (26). Geriatric-focused studies and clinical chart reviews consistently report higher burdens than analyses of administrative databases; for instance, SEER data indicate 68.7% prevalence while NHIS reports show 43.7% to 46.6%, dominated by hypertension, arthritis, cardiovascular disease, diabetes, and respiratory disorders (27, 28). Crucially, our investigation utilized clinician-curated diagnostic data from medical institutions, offering superior diagnostic accuracy compared to public databases or self-reported sources. This robust methodology revealed elevated comorbidity rates, with a median of 5 comorbidities per patient and a mean prevalence of 88.3% across malignancies. These findings align with established evidence of site-specific variation, where U.S. cohorts demonstrate comorbidity prevalence ranging from 40.2% to 78.7%. Notably, liver cancer demonstrated the highest comorbidity burden at 97.26%, followed by lung, prostate, and esophageal malignancies, while breast, skin, and thyroid cancers exhibited lower burdens. 
This hierarchy reflects the pathophysiology whereby visceral malignancies such as hepatic and pulmonary cancers drive age-dependent comorbidity escalation through chronic organ dysfunction, manifesting clinically as hypoproteinemia, ascites, atelectasis, and malnutrition (29, 30). In contrast, superficial carcinomas incur substantially reduced comorbidity risks due to their minimal functional impairment.
Cancer-specific comorbidity heterogeneity
Colorectal cancer
A retrospective cohort study of 5,312 patients using latent class analysis showed that 59% had at least one comorbidity, with 19% of the cohort presenting four or more comorbidities. The median comorbidity count reached five conditions, and 80% of patients exhibited three or more concurrent conditions. Cardiovascular/respiratory diseases, diabetes, metabolic disorders, anemia, and gastrointestinal pathologies emerged as the most prevalent comorbidities (31). Observed discrepancies between studies primarily stem from methodological differences, particularly variations in assessment tools such as the Charlson Index versus more comprehensive diagnostic criteria.
Breast cancer
Analysis of South African data from 2,281 participants documented a 44% comorbidity prevalence. The most frequent conditions were obesity (52.8%), hypertension (41.3%), HIV infection (22.0%), and diabetes (13.7%) (32). Our larger study of 6,956 patients demonstrated a significantly higher prevalence of 74.1%, primarily driven by benign breast dysplasia, hepatic cysts, anemia, hypertension, and diabetes. The association with benign breast disease should be interpreted cautiously, as it represents a recognized risk or precursor state rather than a strict comorbidity and may reflect antecedent disease history or surveillance-related coding in inpatient claims. Hormonal pathways such as estrogen-mediated obesity and dysplasia may contribute to this distinct clinical profile, underscoring the need for integrated management approaches.
Lung cancer
Significant prevalence heterogeneity reflected methodological diversity (33–36). Common conditions included pneumonia, cerebral infarction, hypertension, diabetes, and COPD (37–40). Our analysis identified 32 distinct comorbidities, with pneumonia, pleural diseases, atelectasis, hypertension, COPD, and diabetes predominating. Association rules revealed 23 significant combinations, notably COPD-pleural disease-hypertension and ischemic cardiomyopathy-pneumonia clusters.
Comorbidity profiles by demographic stratification
Comorbidity patterns exhibited significant age- and sex-stratified heterogeneity. Network analysis of 8.8 million discharge records established demographic-specific variations with differential clinical impacts (41). Our association rule mining confirmed divergent comorbidity patterns across sex/age groups in malignant neoplasms, attributable to multifactorial interactions including physiological dimorphism, endocrine regulation, genetic susceptibility, lifestyle behaviors, and environmental exposures.
The shared burden of anemia and metabolic disorders likely reflects systemic effects of cancer and cytotoxic therapies. The male predominance in hepatopulmonary comorbidities aligns with higher rates of smoking and hepatitis exposure among males. However, lifestyle factors such as smoking and alcohol use are not captured in our claims dataset. Given the strong sex differences in smoking prevalence and its established role in both lung cancer and respiratory infections, the observed male-specific respiratory co-diagnosis patterns may be partly influenced by unmeasured confounding. Conversely, the female predominance in thyroid/breast comorbidities suggests influences of sex hormones and gender-specific healthcare utilization patterns.
Comorbidity profiles demonstrate progressive transitions across age groups. Younger patients exhibit predominantly cancer-specific complications such as leukemia-related cytopenias and opportunistic infections. Middle-aged cohorts manifest transitional patterns featuring early chronic conditions alongside malignancy-related sequelae. Older adults develop multimorbidity reflective of cumulative organ dysfunction and age-associated chronic diseases including cardiovascular, metabolic, and respiratory pathologies.
Urban-rural divergence primarily stems from differential malignancy prevalence, particularly elevated hepatocellular carcinoma and NPC burdens in rural areas potentially associated with viral exposures, coupled with healthcare access inequities. Rural patients demonstrate stronger associations with conditions linked to diagnostic delays or advanced disease presentations, including hepatic fibrosis and pneumonia. Urban cohorts more frequently exhibit conditions amenable to chronic management such as ischemic heart disease and benign breast disorders.
Advantages and limitations
This study’s strengths include comprehensive analysis of six-year, multi-hospital insurance claims data (n=163), enhancing accuracy in characterizing cancer comorbidity patterns. Several limitations should be acknowledged. This study relied on inpatient administrative insurance claims data, which are subject to diagnostic misclassification and inter-hospital coding variability. The lack of longitudinal follow-up and outpatient records limited the assessment of disease trajectories. Moreover, key clinical variables, including cancer stage, disease severity, laboratory results, treatment details, and behavioral factors such as smoking and alcohol consumption, were unavailable, restricting clinical stratification and increasing the risk of residual confounding. As temporality between cancer and co-recorded conditions cannot be established from inpatient claims alone, the observed associations should be interpreted as co-occurrence patterns rather than causal relationships. Finally, the regional nature of the dataset may limit the generalizability of these findings.
Conclusions
This large-scale analysis provides a detailed characterization of the malignant neoplasm spectrum and reveals complex, clinically significant comorbidity patterns among hospitalized patients. We confirm a substantial comorbidity burden, particularly associated with lung, liver, and colorectal cancers, and demonstrate distinct patterns using association rule mining. The significant variations observed across sex, age, and urban-rural residence underscore the influence of demographic and potentially socio-environmental factors on comorbid disease presentation. These findings highlight the critical need for integrated clinical management strategies that address both cancer and concurrent chronic conditions. Optimizing resource allocation towards populations with the highest comorbidity burden, such as older adults, males, rural residents, and those with specific high-risk clusters, is essential. The identified comorbidity patterns offer valuable insights for developing tailored approaches to cancer care that extend beyond the specific geographic origin of the data used in this analysis. Future research should focus on validating these patterns in diverse populations, elucidating underlying mechanisms, and evaluating interventions to improve outcomes for cancer patients with comorbidities.
Source: PubMed Central (JATS). Licensing follows the original publisher's policy; please cite the original article when quoting.