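2021, No. 06, Vol. 38, pp. 145-160
Risk Merchant Clustering Analysis Based on Gaussian Spectral Clustering
Foundation: National Natural Science Foundation of China (12071477, 71873137); Special Guiding Fund for Central Universities to Build World-Class Universities (Disciplines) and for Characteristic Development
Email: dyhuang89@126.com
DOI: 10.19343/j.cnki.11-1302/c.2021.06.011
Release date: 2021-06-29
Publication date: 2021-06-29
Online release date: 2021-06-29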
Abstract:

With the spread of electronic payment, a growing number of third-party payment platforms have emerged, yet research on merchant risk on these platforms remains relatively scarce. We therefore propose a risk merchant clustering method based on Gaussian spectral clustering. First, we use a Gaussian mixture model to build a transaction-trading group bipartite network. Second, drawing on the idea of information transmission in networks, we build a merchant-trading group bipartite network. Third, we apply spectral clustering for bipartite networks to cluster both types of nodes simultaneously: the clustering of merchant nodes distinguishes merchants by risk level, and the clustering of trading-group nodes further describes the transaction characteristics of risk merchants. Finally, we validate the model on simulated data and on real data from a third-party payment platform. The experimental results show that the proposed method not only accurately distinguishes merchant groups with different risk levels but also summarizes the transaction characteristics of risk merchants, providing a reference for their supervision.
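
The three steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact network construction, so the code assumes that edge weights of the transaction-trading group network are Gaussian mixture posterior probabilities, that the merchant-trading group network is obtained by summing those weights over each merchant's transactions, and that the co-clustering step is a two-way bipartite spectral partition using the second singular vector of the degree-normalized matrix. The simulated data, the function names `fit_gmm` and `cocluster_two_way`, and all parameter values are hypothetical.

```python
import numpy as np

def fit_gmm(X, k, n_iter=30):
    """EM for a diagonal-covariance Gaussian mixture; returns posteriors (n x k)."""
    n = len(X)
    # Deterministic farthest-point initialisation of the k centres (an assumption).
    centres = [X[0]]
    for _ in range(k - 1):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centres], axis=0)
        centres.append(X[np.argmax(d2)])
    mu = np.array(centres)
    d2 = ((X[:, None, :] - mu) ** 2).sum(axis=2)
    R = np.eye(k)[np.argmin(d2, axis=1)]  # hard initial assignment
    for _ in range(n_iter):
        # M-step: update weights, means and per-dimension variances.
        nk = R.sum(axis=0) + 1e-12
        pi = nk / n
        mu = (R.T @ X) / nk[:, None]
        var = (R.T @ X ** 2) / nk[:, None] - mu ** 2 + 1e-6
        # E-step: posterior probability of each component for each transaction.
        log_p = (np.log(pi) - 0.5 * np.log(2 * np.pi * var).sum(axis=1)
                 - 0.5 * ((X[:, None, :] - mu) ** 2 / var).sum(axis=2))
        log_p -= log_p.max(axis=1, keepdims=True)
        R = np.exp(log_p)
        R /= R.sum(axis=1, keepdims=True)
    return R

def cocluster_two_way(B):
    """Two-way spectral co-clustering of a bipartite weight matrix B (rows x cols)."""
    d1, d2 = B.sum(axis=1), B.sum(axis=0)  # assumes no zero-degree node
    An = B / np.sqrt(d1)[:, None] / np.sqrt(d2)[None, :]
    U, _, Vt = np.linalg.svd(An, full_matrices=False)
    # Second singular vectors, rescaled by degree, embed rows and columns jointly.
    z = np.concatenate([U[:, 1] / np.sqrt(d1), Vt[1] / np.sqrt(d2)])
    # 1-D 2-means on the embedding, initialised at the two extremes.
    c = np.array([z.min(), z.max()])
    for _ in range(20):
        lab = (np.abs(z - c[1]) < np.abs(z - c[0])).astype(int)
        c = np.array([z[lab == 0].mean(), z[lab == 1].mean()])
    return lab[:len(d1)], lab[len(d1):]

# Hypothetical simulated data: 4 transaction types, 14 normal and 6 risky merchants.
rng = np.random.default_rng(0)
type_centres = np.array([[0.0, 0.0], [8.0, 0.0], [0.0, 8.0], [8.0, 8.0]])
n_merchants, n_per = 20, 200
is_risky = np.arange(n_merchants) >= 14
mix = {False: [0.45, 0.45, 0.05, 0.05], True: [0.05, 0.05, 0.45, 0.45]}
merchant_of = np.repeat(np.arange(n_merchants), n_per)
types = np.concatenate(
    [rng.choice(4, size=n_per, p=mix[bool(r)]) for r in is_risky])
X = type_centres[types] + 0.5 * rng.standard_normal((len(types), 2))

# Step 1: transaction-trading group network (posterior weights).
R = fit_gmm(X, k=4)
# Step 2: merchant-trading group network (sum of weights per merchant).
B = np.zeros((n_merchants, R.shape[1]))
np.add.at(B, merchant_of, R)
# Step 3: cluster merchants and trading groups at the same time.
m_lab, g_lab = cocluster_two_way(B)
```

In this sketch, `m_lab` assigns each merchant to one of two clusters and `g_lab` assigns each trading group to the same pair of clusters, so the groups that share a label with a merchant cluster indicate which transaction patterns characterize it.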

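(1) Data source: China Payment and Clearing Development Report (2019), released by the Payment and Clearing Research Center of the National Institution for Finance and Development.

(1) Owing to space limits, the detailed derivation is given in the appendix listed on the Statistical Research website.

(1) At the request of the partner company, its name is withheld here.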

Basic information:

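CLC classification: F832.2; F724.6; F222

Citation:

[1] Huang Danyang (黄丹阳), Bi Boyang (毕博洋), Zhu Yingqiu (朱映秋). Risk Merchant Clustering Analysis Based on Gaussian Spectral Clustering [J]. Statistical Research (统计研究), 2021, 38(06): 145-160. DOI: 10.19343/j.cnki.11-1302/c.2021.06.011.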
