2026, No. 01, Vol. 43, pp. 122-135
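Multi-level Quantile Regression Boosting Tree Model and Insurance Loss Prediction
Foundation: National Social Science Fund of China Key Project "Statistical Modeling of Catastrophe Bond Pricing and Risk Management" (22ATJ005); National Natural Science Foundation of China Young Scientists Fund Project "Dependent Risk Models in Non-life Insurance Based on Multi-task Machine Learning" (72201062)
Email: mengshw@ruc.edu.cn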
DOI: 10.19343/j.cnki.11-1302/c.2026.01.009
Submitted: 2025-04-23
Submission year: 2025
Revised: 2026-01-05
Final review: 2026-01-27
Final review year: 2026
Review cycle (years): 1
Released: 2026-01-25
Published: 2026-01-25
Abstract:

In insurance practice, the heavy-tailed nature of loss data can make the results of mean regression unreasonable. To address this, this paper combines quantile regression with gradient boosting trees and proposes a quantile regression boosting tree model, which uses a tree structure to describe the complex relationship between risk factors and the quantiles of the response variable. To account for the monotonicity of quantiles across levels, additive and multiplicative structures are used to extend the quantile regression boosting tree to a multi-level form that predicts the quantiles at several levels simultaneously. The idea of hard parameter sharing from multi-task learning is then introduced into the multi-level model, which reduces the number of parameters and lowers model complexity. The proposed method shows good predictive performance and interpretability on both simulated and real data, providing a new data analysis tool for insurance ratemaking and risk management.
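
The two ingredients named in the abstract, the quantile loss and a monotone multi-level structure, can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the softplus and exponential links, and the plain numbers standing in for boosted-tree outputs are all assumptions made for illustration. It shows the pinball loss of Koenker and Bassett and how an additive or multiplicative construction keeps predicted quantiles from crossing.

```python
import math

def pinball_loss(y, q, tau):
    """Quantile (pinball) loss for one observation y, prediction q, level tau."""
    u = y - q
    return tau * u if u >= 0 else (tau - 1.0) * u

def additive_quantiles(base, raw_increments):
    """Lowest-level quantile plus cumulative positive increments.

    softplus (an assumed link) makes every increment strictly positive,
    so the returned quantiles are strictly increasing.
    """
    out = [base]
    for r in raw_increments:
        out.append(out[-1] + math.log1p(math.exp(r)))
    return out

def multiplicative_quantiles(base, raw_factors):
    """Positive lowest-level quantile times factors greater than 1.

    1 + exp(r) (an assumed link) exceeds 1 for any real r, so the
    quantiles increase as long as base > 0.
    """
    out = [base]
    for r in raw_factors:
        out.append(out[-1] * (1.0 + math.exp(r)))
    return out

# Hypothetical raw outputs; in a real model these would come from fitted trees.
taus = [0.5, 0.9, 0.99]
add_q = additive_quantiles(100.0, [-1.0, 2.0])
mul_q = multiplicative_quantiles(100.0, [-1.0, 0.5])

# Joint loss over all levels for a single observed loss of 250.
total = sum(pinball_loss(250.0, q, t) for q, t in zip(add_q, taus))
```

Because each level is built on top of the previous one, the ordering of the quantiles holds by construction rather than being checked after fitting, which is one common way to obtain non-crossing quantile predictions.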

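(1) Due to space limitations, the out-of-sample loss and CR values of the quantile regression models on the driving behavior data are presented in Supplementary Table 1; see the attachments listed on the website of Statistical Research (统计研究).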

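Basic information:

CLC number: F842; O212.1

Citation:

[1] Gao Yaqian (高雅倩), Huang Yifan (黄一凡), Meng Shengwang (孟生旺). Multi-level Quantile Regression Boosting Tree Model and Insurance Loss Prediction [J]. Statistical Research (统计研究), 2026, 43(01): 122-135. DOI: 10.19343/j.cnki.11-1302/c.2026.01.009.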
