2024, No. 04, Vol. 41, pp. 126-140
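Double-Parameter Tweedie Machine Learning Models and Their Actuarial Applications
Foundation: Key Project of the National Social Science Fund of China, "Statistical Modeling of Catastrophe Bond Pricing and Risk Management" (22ATJ005); Major Project of the Key Research Base of Humanities and Social Sciences of the Ministry of Education, "Research on Risk Management and Actuarial Models in the Digital Era" (22JJD910003)
Email: mengshw@ruc.edu.cn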
DOI: 10.19343/j.cnki.11-1302/c.2024.04.010
Abstract:

Tweedie regression is one of the main tools for loss prediction and risk pricing in insurance. To make full use of big data, the Internet of Things, machine learning, and related technologies in promoting the digital transformation of the insurance industry and achieving more accurate risk identification and risk pricing, this paper extends the traditional Tweedie generalized linear model to a double-parameter form. Combining this form with machine learning algorithms, it proposes a double-parameter Tweedie gradient boosting tree model and a double-parameter Tweedie combined neural network model. New driving behavior risk factors are extracted from the telematics data of a Chinese insurance company. An empirical study tests the effectiveness of the double-parameter Tweedie gradient boosting tree and the double-parameter Tweedie combined neural network in risk identification and risk pricing, and the results offer a new model and method for promoting the digital transformation of China's insurance industry.
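
To make the idea of a Tweedie model with two parameters more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the two parameters are the mean `mu` and the dispersion `phi` (the abstract does not state this), that the power parameter `p` lies strictly between 1 and 2, and that a log link is used for the mean. All function names are hypothetical. The sketch shows the Tweedie unit deviance, the gradient a boosting step might use for the log-mean, and the standard mapping to a compound Poisson-gamma distribution, which makes visible how the dispersion changes the probability of zero claims.

```python
import math


def unit_deviance(y, mu, p):
    """Tweedie unit deviance d(y, mu) for a power parameter 1 < p < 2."""
    if not 1.0 < p < 2.0:
        raise ValueError("this sketch only covers 1 < p < 2")
    a, b = 1.0 - p, 2.0 - p
    # y ** b is 0 when y == 0 because b > 0, so zero claims are handled.
    return 2.0 * (y ** b / (a * b) - y * mu ** a / a + mu ** b / b)


def loss_gradient_log_mu(y, mu, phi, p):
    """Gradient of d(y, mu) / (2 * phi) with respect to log(mu)."""
    return (mu ** (2.0 - p) - y * mu ** (1.0 - p)) / phi


def compound_poisson_params(mu, phi, p):
    """Map (mu, phi, p) to Poisson rate, gamma shape and gamma scale."""
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))   # expected claim count
    shape = (2.0 - p) / (p - 1.0)               # gamma shape per claim
    scale = phi * (p - 1.0) * mu ** (p - 1.0)   # gamma scale per claim
    return lam, shape, scale


if __name__ == "__main__":
    # Two hypothetical policies with the same mean but different dispersion.
    for phi in (10.0, 40.0):
        lam, shape, scale = compound_poisson_params(100.0, phi, 1.5)
        print(f"phi={phi}: P(no claim)={math.exp(-lam):.4f}, "
              f"mean={lam * shape * scale:.1f}")
```

In this parameterization two policies with the same expected loss can have very different zero-claim probabilities, which is one reason a model might let the dispersion vary with the risk factors instead of holding it constant.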


Basic information:

CLC number: F842; TP181

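Citation:

[1] Gao Yaqian, Meng Shengwang. Double-Parameter Tweedie Machine Learning Models and Their Actuarial Applications[J]. Statistical Research (统计研究), 2024, 41(04): 126-140. DOI: 10.19343/j.cnki.11-1302/c.2024.04.010.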
