Abstract:
To address the problem that the traditional naive Bayes algorithm is a shallow learning method whose feature independence assumption often leads to poor classification performance, a deep ensemble naive Bayes model is proposed. Inspired by the ensemble idea of deep forest, the model integrates three shallow base classifiers, namely Gaussian naive Bayes, multinomial naive Bayes, and Bernoulli naive Bayes, into a naive Bayes model with a deep learning structure. The results show that the proposed deep ensemble naive Bayes model not only overcomes the limited feature representation ability of shallow learning but also alleviates the drawback of the feature independence assumption. Experiments on classic text classification datasets demonstrate that the proposed model achieves significantly higher precision, recall, and F1 score (the harmonic mean of precision and recall), indicating good model performance.
Keywords: naive Bayes model; shallow learning; deep forest; ensemble; text classification
Foundation: National Key Research and Development Program of China (2018YFB1404500, 2018YFB1404503)
Authors: 吴皋, 李明, 周稻祥, 岳俊宏, 肖福龙
DOI: 10.13349/j.cnki.jdxbn.20200511.003
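The deep ensemble structure described in the abstract can be illustrated with the following minimal sketch, which cascades the three naive Bayes variants in the style of deep forest by feeding each layer's class-probability outputs, concatenated with the original features, into the next layer. It assumes scikit-learn and a dense, non-negative feature matrix (e.g., term counts or TF-IDF); the layer count, cross-validation folds, and final averaging rule are illustrative assumptions rather than the authors' exact implementation.

```python
# A sketch of a deep-forest-style cascade of naive Bayes base learners.
# Assumptions: scikit-learn is available; X is a dense, non-negative array
# (e.g., term counts or TF-IDF) so that MultinomialNB is applicable.
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB
from sklearn.model_selection import cross_val_predict


def fit_cascade(X, y, n_layers=3, cv=5):
    """Fit a cascade in which each layer's class probabilities are
    concatenated with the original features to feed the next layer."""
    layers, X_aug = [], X
    for _ in range(n_layers):
        base = [GaussianNB(), MultinomialNB(), BernoulliNB()]
        # Out-of-fold probabilities avoid leaking training labels into the
        # augmented features passed to deeper layers.
        probas = [cross_val_predict(clf, X_aug, y, cv=cv, method="predict_proba")
                  for clf in base]
        for clf in base:
            clf.fit(X_aug, y)  # refit on all data for use at prediction time
        layers.append(base)
        X_aug = np.hstack([X] + probas)
    return layers


def predict_cascade(layers, X):
    """Propagate samples through the cascade and average the last layer's
    class probabilities to obtain the final prediction."""
    X_aug = X
    for base in layers:
        probas = [clf.predict_proba(X_aug) for clf in base]
        X_aug = np.hstack([X] + probas)
    mean_proba = np.mean(probas, axis=0)
    return layers[-1][0].classes_[mean_proba.argmax(axis=1)]
```

A design note on this sketch: averaging the last layer's probabilities is only one possible fusion rule; the number of layers could also be chosen adaptively by stopping when validation performance no longer improves, as is done in deep forest.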