Basic & Clinical Medicine ›› 2023, Vol. 43 ›› Issue (11): 1685-1692. doi: 10.16352/j.issn.1001-6325.2023.11.1685

• Research Article •

Screening and verification of key genes for colon cancer based on the GEO database

YAN Wen, LI Nan, GU Tongnan, KONG Xiangzhao*

  1. Yanjing Medical College, Capital Medical University, Beijing 101300, China
  • Received: 2023-06-21 Revised: 2023-09-20 Online: 2023-11-05 Published: 2023-10-30
  • Contact: *kongxiangzhao_123@ccmu.edu.cn
  • Supported by:
    Natural Science Foundation of Capital Medical University (PYZ19143); Scientific Research Cultivation Fund of Yanjing Medical College, Capital Medical University (22yts01)

Abstract: Objective To screen differentially expressed genes in colon cancer (CC) tissue by analysis of the Gene Expression Omnibus (GEO) database, and to verify them through in vitro experiments in order to identify potential key genes for colon cancer. Methods The human CC datasets GSE10950 and GSE74602 were obtained from the GEO database. Differentially expressed genes (DEGs) between CC tissues and adjacent normal colon tissues were screened with the GEO2R tool and an online Venn diagram tool. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses of the DEGs were performed with the DAVID online tool. A protein-protein interaction (PPI) network of the DEGs was constructed from the STRING database and analyzed with Cytoscape software to select hub genes. The GEPIA database was used to validate expression and prognostic value, and hub genes significantly associated with prognosis were regarded as key genes. Finally, the functions of the key genes were verified by cell transfection and CCK8 assays. Results A total of 515 DEGs were obtained from the two datasets, of which 223 were up-regulated and 292 were down-regulated. GO enrichment analysis showed that the up-regulated DEGs were involved in biological processes such as negative regulation of the cell cycle and transcriptional regulation. KEGG analysis showed that the up-regulated DEGs were mainly enriched in the cell cycle and DNA replication pathways. A total of 33 hub genes were screened from the PPI network. UALCAN analysis showed that CC patients with high expression of CCNB1, CCNA2, CDC20, CDKN3, DLGAP5, HMMR and NCAPG had shorter survival (P<0.05). Validation in the GEPIA database showed that the expression of these 7 genes was significantly elevated in CC tissues (P<0.05). KEGG pathway analysis indicated that CCNB1, CDC20 and CCNA2 were highly enriched in the cell cycle pathway. Knockdown of CCNB1, CDC20 or CCNA2 significantly inhibited the proliferation of colon cancer cells. Conclusions Seven key genes that may be involved in the development of CC were identified. CCNB1, CDC20 and CCNA2 are potentially related to poor prognosis in CC and may serve as diagnostic or prognostic molecular markers and as therapeutic targets for CC patients.

Key words: colon cancer, bioinformatics analysis, differentially expressed genes, protein-protein interaction network
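
The screening steps named in the abstract (thresholding each dataset's differential expression table, keeping the genes shared by both datasets as a Venn diagram would, and ranking PPI nodes to pick hub genes) can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the cutoffs (|log2FC| ≥ 1, adjusted P < 0.05), the placeholder gene names, the toy values and edges, and the helper names `call_degs`, `shared_degs` and `rank_by_degree` are all hypothetical. The abstract does not state which thresholds or which hub-selection criterion were actually used; node degree is assumed here only because it is the simplest one.

```python
# Hypothetical cutoffs; the abstract does not state the ones actually used.
LOGFC_CUTOFF = 1.0
ADJ_P_CUTOFF = 0.05


def call_degs(rows, logfc_cutoff=LOGFC_CUTOFF, adj_p_cutoff=ADJ_P_CUTOFF):
    """Split one dataset's (gene, log2FC, adj_p) rows into up/down gene sets."""
    up, down = set(), set()
    for gene, logfc, adj_p in rows:
        if adj_p >= adj_p_cutoff:
            continue  # not statistically significant under the assumed cutoff
        if logfc >= logfc_cutoff:
            up.add(gene)
        elif logfc <= -logfc_cutoff:
            down.add(gene)
    return up, down


def shared_degs(*datasets):
    """Keep genes called in the same direction in every dataset (Venn overlap)."""
    ups, downs = zip(*(call_degs(rows) for rows in datasets))
    return set.intersection(*ups), set.intersection(*downs)


def rank_by_degree(edges, top_n):
    """Rank PPI nodes by their number of distinct interaction partners."""
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    # Highest degree first; ties broken alphabetically for a stable order.
    ranked = sorted(neighbours, key=lambda g: (-len(neighbours[g]), g))
    return ranked[:top_n]


# Toy inputs with placeholder genes and made-up numbers
# (NOT values from GSE10950 or GSE74602).
toy_set_1 = [
    ("GENE_A", 2.1, 0.001),
    ("GENE_B", 1.4, 0.010),
    ("GENE_C", -1.8, 0.002),
    ("GENE_D", 0.3, 0.0001),  # significant but below the fold-change cutoff
    ("GENE_E", 1.9, 0.200),   # large change but not significant
]
toy_set_2 = [
    ("GENE_A", 1.7, 0.003),
    ("GENE_B", 0.6, 0.020),
    ("GENE_C", -2.2, 0.001),
    ("GENE_E", 2.0, 0.010),
]
toy_edges = [
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_F"),
    ("GENE_A", "GENE_G"),
    ("GENE_C", "GENE_F"),
    ("GENE_A", "GENE_C"),  # duplicate edge, counted once
]

if __name__ == "__main__":
    up, down = shared_degs(toy_set_1, toy_set_2)
    print("shared up-regulated:", sorted(up))
    print("shared down-regulated:", sorted(down))
    print("top nodes by degree:", rank_by_degree(toy_edges, 2))
```

In practice the rows would come from the tables exported for each dataset, and the edge list from an exported interaction network; other centrality measures could replace degree without changing the rest of the sketch.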
