[1] LONG Tie, FU Yu-sheng, WANG Wen-da, et al. Random Forest Model of Predicting Covid-19 with Derived Feature[J]. Computer Technology and Development, 2023, 33(08): 9-13. [doi:10.3969/j.issn.1673-629X.2023.08.002]

Random Forest Model of Predicting Covid-19 with Derived Feature

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
33
Issue:
No. 08, 2023
Pages:
9-13
Column:
Big Data and Cloud Computing
Publication date:
2023-08-10

Article Info

Title:
Random Forest Model of Predicting Covid-19 with Derived Feature
Article ID:
1673-629X(2023)08-0009-05
Author(s):
LONG Tie, FU Yu-sheng, WANG Wen-da, FEI Ning
School of Computer, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
Keywords:
Covid-19; machine learning; random forest; derived feature; regression tree
CLC number:
TP181
DOI:
10.3969/j.issn.1673-629X.2023.08.002
Abstract:
Since the outbreak of Covid-19, many studies have applied time-delay dynamics models, transmission dynamics models, and machine learning models to analyze epidemic data, with some success. However, because development levels differ widely across countries and regions and the data are imbalanced, these algorithms generalize poorly. Random forest is an ensemble learning model built on decision trees or regression trees: multiple trees are trained with the Bagging ensemble technique, and their votes are combined to produce the final result. After analyzing the characteristics of the data set, this paper transforms and combines feature values that poorly reflect differences between samples to derive new feature values, and groups the original data according to these new features. A random forest epidemic prediction model is then built, and each grouped data set is trained and predicted separately. Experiments with the random forest model show that the proposed method effectively improves the accuracy of Covid-19 prediction, adapts better to regions that differ significantly, guards well against overfitting, and tolerates noise and outliers reasonably well. It also offers a new approach to predicting similar infectious diseases in the future.
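The pipeline outlined in the abstract (derive a feature, group the data by it, train one random forest per group) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say which features are derived or how groups are formed, so the incidence-per-100k feature, the threshold of 500, the field names `population`, `cases`, and `next_cases`, and the synthetic data are all hypothetical. The forest is a small bagged ensemble of regression trees written with the standard library only, and it averages tree outputs, the usual regression counterpart of voting.

```python
import random

def avg(values):
    return sum(values) / len(values)

def build_tree(X, y, rng, depth=0, max_depth=3, min_leaf=2):
    """Grow a regression tree; each split looks at a random subset of features."""
    if depth >= max_depth or len(y) < 2 * min_leaf or len(set(y)) == 1:
        return avg(y)  # leaf: predict the mean target
    n_features = len(X[0])
    k = max(1, (n_features + 1) // 2)  # assumed subset size
    best = None
    for f in rng.sample(range(n_features), k):
        for thr in sorted({row[f] for row in X})[:-1]:
            left = [i for i, row in enumerate(X) if row[f] <= thr]
            right = [i for i, row in enumerate(X) if row[f] > thr]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            ml, mr = avg([y[i] for i in left]), avg([y[i] for i in right])
            sse = (sum((y[i] - ml) ** 2 for i in left)
                   + sum((y[i] - mr) ** 2 for i in right))
            if best is None or sse < best[0]:
                best = (sse, f, thr, left, right)
    if best is None:
        return avg(y)
    _, f, thr, left, right = best
    grow = lambda idx: build_tree([X[i] for i in idx], [y[i] for i in idx],
                                  rng, depth + 1, max_depth, min_leaf)
    return (f, thr, grow(left), grow(right))

def predict_tree(node, row):
    while isinstance(node, tuple):  # internal nodes are tuples, leaves are floats
        f, thr, left, right = node
        node = left if row[f] <= thr else right
    return node

def fit_forest(X, y, n_trees=15, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # bootstrap sample (Bagging)
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    return avg([predict_tree(tree, row) for tree in forest])

def derive(record):
    """Hypothetical derived feature: cases per 100k population."""
    return record["cases"] / record["population"] * 100_000

def group_key(record, threshold=500.0):
    """Hypothetical grouping rule based on the derived feature."""
    return "high" if derive(record) >= threshold else "low"

def to_row(record):
    return [record["population"], record["cases"], derive(record)]

def fit_grouped(records):
    groups = {}
    for r in records:
        groups.setdefault(group_key(r), []).append(r)
    return {key: fit_forest([to_row(r) for r in rs], [r["next_cases"] for r in rs])
            for key, rs in groups.items()}

def predict_grouped(models, record):
    return predict_forest(models[group_key(record)], to_row(record))

# Synthetic regions with two very different incidence regimes (made-up data).
data_rng = random.Random(42)
records = []
for _ in range(80):
    population = data_rng.randint(100_000, 5_000_000)
    rate = data_rng.choice([0.0005, 0.02])
    cases = int(population * rate * data_rng.uniform(0.5, 1.5))
    growth = 1.05 if rate < 0.01 else 1.30
    records.append({"population": population, "cases": cases,
                    "next_cases": cases * growth})

models = fit_grouped(records)
prediction = predict_grouped(models, records[0])
```

Routing each record to the model of its own group keeps regions with very different scales from being averaged together in one forest, which is the intuition the abstract gives for grouping. In practice a library implementation such as scikit-learn's `RandomForestRegressor` would normally replace the hand-written forest.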


Last Update: 2023-08-10