CN116934507B - Intelligent claim settlement method and system based on big data driving - Google Patents

Intelligent claim settlement method and system based on big data driving

Info

Publication number
CN116934507B
Authority
CN
China
Prior art keywords
target user
data
information
settlement
characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311203638.0A
Other languages
Chinese (zh)
Other versions
CN116934507A (en)
Inventor
房永斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guoren Property Insurance Co ltd
Original Assignee
Guoren Property Insurance Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guoren Property Insurance Co ltd filed Critical Guoren Property Insurance Co ltd
Priority to CN202311203638.0A priority Critical patent/CN116934507B/en
Publication of CN116934507A publication Critical patent/CN116934507A/en
Application granted granted Critical
Publication of CN116934507B publication Critical patent/CN116934507B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08Insurance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2477Temporal data queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29Geographical information databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • G06F16/9024Graphs; Linked lists
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/90335Query processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/906Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/907Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/909Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Software Systems (AREA)
  • Marketing (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Technology Law (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Library & Information Science (AREA)
  • General Business, Economics & Management (AREA)
  • Animal Behavior & Ethology (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Remote Sensing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to an intelligent claim settlement method and system based on big data driving. A target user feature vector is obtained according to target user information; social relationship knowledge graph data associated with the target user is acquired from a graph database according to the multidimensional data features of the target user; Meanshift clustering is performed according to the multidimensional data features of the target user to obtain a plurality of communities corresponding to each dimension of data of the target user, and suspicious communities are then evaluated, the communities serving as a reference for determining the main feature factors and secondary feature factors of the target user's feature data; different weight values are set for the factors, and the target user feature vector is updated; the risk score of the case to be settled is calculated by a preset algorithm model according to the target user feature vector and the claim settlement request information; and different claim settlement schemes are determined according to the risk score to settle the claim. The embodiments of the invention can improve the accuracy of prediction and control and the service quality of insurance claim settlement.

Description

Intelligent claim settlement method and system based on big data driving
Technical Field
The invention belongs to the field of big data, and particularly relates to an intelligent claim settlement method and system based on big data driving.
Background
With the development of the economy and society, more and more people buy insurance, paying a specified premium to obtain corresponding protection for property, person and the like. As the economy develops and people's insurance awareness improves, the demand for insurance business is also increasing. Meanwhile, health insurance is complex in terms of insured subject matter, insured events and insurance contracts, and the forms of fraud are varied, so fraud identification is very complex and highly subjective. Insurance fraud harms the insurer, the insured and society.
The risks of health insurance are increasingly complex, which poses great challenges to conventional fraud identification methods. These methods generally require communication and cooperation among personnel from several departments, involve a long and inefficient claim settlement process, are prone to human error, and have difficulty identifying fraud efficiently and accurately. Massive health and medical big data is generated continuously, and an extremely large amount of data has already been stored. Therefore, big data technology is one of the important ways to break through the bottleneck and improve the efficiency of health insurance fraud identification.
The intelligent claim settlement method and system based on big data driving aim to solve the problem of how to build a plurality of models from big data so as to accurately identify risky users and improve the efficiency, quality and accuracy of the insurance claim settlement service while ensuring risk management and control.
Disclosure of Invention
In order to overcome the defects of the prior art, the present disclosure provides a method and a system for intelligent claim settlement based on big data driving, which accurately extracts user features through a plurality of models, and performs claim settlement processing through knowledge graph, user group and graph model reasoning, thereby improving the accuracy of prediction and control and greatly improving the insurance claim settlement service quality.
The technical scheme adopted by the present disclosure is:
A first aspect of the embodiments of the invention provides a big data driven intelligent claim settlement method, comprising the following steps:
obtaining target user information according to claim settlement request information of a case to be subjected to claim settlement sent by a claim settlement terminal;
acquiring target user behavior track data according to the target user information, and extracting multi-dimensional data characteristics of a target user to form a target user characteristic vector;
acquiring social relationship knowledge graph data associated with the target user from a graph database according to the multidimensional data characteristics of the target user;
carrying out Meanshift clustering on the social relationship knowledge graph data according to the multidimensional data characteristics of the target user to obtain a plurality of communities corresponding to each dimensional data of the target user;
acquiring attribute data of each community from the social relation knowledge graph data, and judging whether suspicious communities exist in each community containing the target user according to a pre-trained risk assessment model;
if yes, extracting attribute data of the suspicious community members, and matching the attribute data with the user feature vector to obtain target user feature data with corresponding dimensions;
taking the target user characteristic data obtained by matching as a main characteristic factor, and taking the target user characteristic data which is not successfully matched as a secondary characteristic factor;
different weight values are set according to the main feature factors and the secondary feature factors, and the feature vectors of the target users are updated;
calculating the risk score of the case to be settled through a preset algorithm model according to the target user feature vector and the claim settlement request information;
and determining different claim settlement schemes according to the risk scores to settle the claims.
Optionally, performing Meanshift clustering on the social relationship knowledge graph data according to the multidimensional data features of the target user to obtain a plurality of communities corresponding to each dimension of data of the target user includes:
extracting behavior data of all user identifications according to the entity user identifications in the social relationship knowledge graph data;
acquiring the corresponding timestamp and IP position data according to the behavior data; representing the corresponding IP position data in vector form in timestamp order, $x_i$, $i = 1, \dots, n$;
arbitrarily selecting a point $x$ as the center of a circle, and calculating the mean position vector $M_h(x)$ of all vectors in the target circular region, comprising:
$S_h$ is a circular region of radius $h$, namely the set of points $y$ satisfying $S_h(x) = \{\, y : (y - x)^{\mathsf T}(y - x) \le h^2 \,\}$, and $M_h(x) = \frac{1}{k} \sum_{x_i \in S_h} (x_i - x)$; wherein $M_h(x)$ is the mean position vector in the circular region, $h$ represents the radius, $S_h$ represents the circular region of radius $h$, $x$ represents the center of the circle, $k$ represents the number of sample points in $S_h$, and $y$ represents the set of sample points falling within the circle $S_h$; the convergence center distance is the minimum distance between two drift circle regions, and when the distance between the centers of two circles is smaller than this value, the two circles are considered to belong to the same cluster; the convergence movement distance is the minimum threshold for the distance between two successive drifts, and when that distance is smaller than the threshold, the drift is considered to have converged and the center point position of the IP positions is obtained; and a plurality of communities corresponding to each dimension of data of the target user are obtained according to the center point positions.
Optionally, the acquiring attribute data of each community from the social relationship knowledge graph data, and judging whether a suspicious community exists in each community including the target user according to a pre-trained risk assessment model specifically includes:
by training on sample data, a probabilistic graphical model is established in advance as the risk assessment model; with $y_t$ denoting the state and $x_t$ the observation at time $t$, the joint probability distribution of all variables is as follows:
$P(x_1, y_1, \dots, x_n, y_n) = P(y_1)\,P(x_1 \mid y_1) \prod_{i=2}^{n} P(y_i \mid y_{i-1})\,P(x_i \mid y_i)$;
the probabilities of the model switching between states are given by the matrix $A = [a_{ij}]_{N \times N}$, wherein
$a_{ij} = P(y_{t+1} = s_j \mid y_t = s_i)$ indicates, at any time $t$, the probability that the state at the next moment is $s_j$ if the current state is $s_i$; wherein $y_{t+1}$ is the state at time $t+1$;
the model outputs the probability of each observation value according to the current state, expressed as the matrix $B = [b_{ij}]_{N \times M}$, wherein
$b_{ij} = P(x_t = o_j \mid y_t = s_i)$ indicates, at any time $t$, the probability that the observation $o_j$ is obtained if the state is $s_i$;
the probability of each state occurring at the initial moment is expressed as $\pi = (\pi_1, \pi_2, \dots, \pi_N)$, wherein $\pi_i = P(y_1 = s_i)$ represents the probability that the initial state of the model is $s_i$.
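The joint probability of one state sequence and one observation sequence under such a model can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `joint_probability`, the two-state toy parameters and the use of NumPy are all hypothetical, and states and observations are assumed to be encoded as integer indices.

```python
import numpy as np


def joint_probability(pi, A, B, states, observations):
    """Joint probability of a state path and an observation path.

    pi: initial state probabilities, shape (N,)
    A:  state transition matrix, shape (N, N)
    B:  observation (output) probability matrix, shape (N, M)
    states, observations: equal-length lists of integer indices
    """
    if len(states) != len(observations) or len(states) == 0:
        raise ValueError("states and observations must be non-empty and equal in length")
    # Initial state and its first observation.
    p = pi[states[0]] * B[states[0], observations[0]]
    # Each later step contributes one transition and one emission.
    for t in range(1, len(states)):
        p *= A[states[t - 1], states[t]] * B[states[t], observations[t]]
    return float(p)


# Hypothetical toy parameters: state 0 = "normal", state 1 = "suspicious".
pi = np.array([0.8, 0.2])
A = np.array([[0.9, 0.1],
              [0.3, 0.7]])
B = np.array([[0.7, 0.3],
              [0.2, 0.8]])

p = joint_probability(pi, A, B, states=[0, 0, 1], observations=[0, 0, 1])
```

For the toy path above the product is 0.8 × 0.7 × 0.9 × 0.7 × 0.1 × 0.8 = 0.028224.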
Optionally, in a first implementation manner of the first aspect of the embodiment of the present invention, the risk assessment model training process includes:
an initial risk assessment model is constructed, whose parameters are $\lambda = [A, B, \pi]$, and the observation sequence $\{x_1, x_2, \dots, x_n\}$ is generated as follows:
S1, setting time $t = 1$, and selecting the initial state $y_1$ according to the initial state probability $\pi$;
S2, selecting the observation variable $x_t$ according to the state $y_t$ and the output probability $B$;
S3, transitioning the model state according to the state $y_t$ and the state transition matrix $A$, that is, determining $y_{t+1}$;
S4, if $t < n$, setting $t = t + 1$ and returning to S2; otherwise stopping.
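Steps S1 to S4 can be sketched as a short sampling loop. This is an illustrative sketch only: the function name `generate_sequence` and the toy matrices are assumptions, and NumPy's `Generator.choice` is used to draw from each discrete distribution.

```python
import numpy as np


def generate_sequence(pi, A, B, n, rng):
    """Sample n (state, observation) pairs from the model parameters [A, B, pi]."""
    states, observations = [], []
    # S1: draw the initial state from the initial state probabilities.
    state = int(rng.choice(len(pi), p=pi))
    for _ in range(n):
        states.append(state)
        # S2: draw the observation from the output distribution of the current state.
        observations.append(int(rng.choice(B.shape[1], p=B[state])))
        # S3: draw the next state from the transition row of the current state.
        state = int(rng.choice(A.shape[0], p=A[state]))
        # S4: the loop bound plays the role of the "t < n" check.
    return states, observations


# Hypothetical deterministic toy model, chosen so the output is easy to check:
# the state alternates 0, 1, 0, 1 and each state always emits its own index.
pi = np.array([1.0, 0.0])
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
B = np.array([[1.0, 0.0],
              [0.0, 1.0]])

states, observations = generate_sequence(pi, A, B, n=4, rng=np.random.default_rng(0))
```

With real parameters the rows of `A` and `B` would be learned from training samples, and the sampled sequences would differ from run to run unless the random generator is seeded.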
Optionally, in a first implementation manner of the first aspect of the embodiment of the present invention, setting different weight values according to the main feature factors and the secondary feature factors and updating the target user feature vector includes:
increasing the weight proportion of the main feature factors and reducing the weight proportion of the secondary feature factors;
and multiplying each feature factor by its corresponding weight proportion, and replacing the corresponding entries of the original target user feature vector with the results.
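A minimal sketch of this reweighting step is given below. The feature names, the weight values 1.5 and 0.5, and the function name `reweight_features` are hypothetical; the source does not specify how the weight proportions are chosen, only that main factors are weighted up and secondary factors weighted down.

```python
def reweight_features(features, primary_keys, primary_weight=1.5, secondary_weight=0.5):
    """Return an updated feature vector (as a dict) with per-factor weights applied.

    features:     mapping of feature name -> numeric value
    primary_keys: names of features that matched suspicious-community attributes
    The weights are illustrative assumptions, not values from the source.
    """
    if primary_weight <= secondary_weight:
        raise ValueError("main factors should be weighted above secondary factors")
    return {
        name: value * (primary_weight if name in primary_keys else secondary_weight)
        for name, value in features.items()
    }


# Hypothetical target user features, already normalized to [0, 1].
features = {"login_ip_cluster": 0.8, "overdue_count": 0.4, "claim_frequency": 0.6}
updated = reweight_features(features, primary_keys={"login_ip_cluster"})
```

Here `login_ip_cluster` is treated as a main feature factor and scaled up, while the other two are treated as secondary factors and scaled down.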
Optionally, in a first implementation manner of the first aspect of the embodiment of the present invention, the determining different claim settlement schemes according to the risk score to settle a claim includes:
if the risk score is greater than the first risk threshold, refusing to settle the claim and outputting corresponding early warning information;
if the risk score is less than or equal to the first risk threshold and greater than the second risk threshold, inputting the information of the case to be settled into a customer service terminal for auditing, and settling the claim manually;
if the risk score is less than the second risk threshold, inputting the information of the case to be settled into a claim settlement processing model for automatic claim analysis, matching the corresponding claim settlement scheme, and settling the claim automatically.
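The three-way decision can be sketched as a small routing function. The function name, the route labels and the threshold values are hypothetical. The text does not say what happens when the score is exactly equal to the second risk threshold; this sketch assumes that case is routed to automatic settlement.

```python
def choose_settlement_route(risk_score, first_threshold, second_threshold):
    """Map a risk score to one of three settlement routes.

    Assumption: a score exactly equal to second_threshold is settled
    automatically, since the source leaves that boundary unspecified.
    """
    if second_threshold >= first_threshold:
        raise ValueError("second_threshold must be below first_threshold")
    if risk_score > first_threshold:
        return "reject"          # refuse settlement and raise an early warning
    if risk_score > second_threshold:
        return "manual_review"   # send to the customer service terminal
    return "automatic"           # hand over to the claim settlement processing model


# Hypothetical thresholds on a 0-to-1 risk scale.
routes = [choose_settlement_route(s, 0.8, 0.3) for s in (0.9, 0.8, 0.5, 0.3, 0.1)]
```

For the five sample scores the routes are reject, manual review, manual review, automatic and automatic.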
In order to achieve the above object, an embodiment of the present invention further provides a big data driven intelligent claim settlement system, including:
the information receiving module is used for obtaining target user information according to the claim settlement request information of the claim case to be settled, which is sent by the claim settlement terminal;
the information acquisition module is used for acquiring target user behavior track data according to the target user information, extracting multi-dimensional data characteristics of the target user and forming a target user characteristic vector;
the first information processing module is used for acquiring social relationship knowledge graph data associated with the target user from a graph database according to the multidimensional data characteristics of the target user;
the information clustering module is used for carrying out Meanshift clustering on the social relation knowledge graph data according to the multidimensional data characteristics of the target user to obtain a plurality of communities corresponding to the dimensional data of the target user;
the first information evaluation module is used for acquiring attribute data of each community from the social relation knowledge graph data and judging whether suspicious communities exist in each community containing the target user according to a pre-trained risk evaluation model;
the information judging module is used for extracting attribute data of the suspicious community members and matching the attribute data with the user feature vector to obtain target user feature data with corresponding dimensions if the suspicious community members exist;
the information screening module is used for taking the target user characteristic data obtained by matching as a main characteristic factor and the target user characteristic data which is not successfully matched as a secondary characteristic factor;
the second information processing module is used for setting different weight values according to the main characteristic factors and the secondary characteristic factors and updating the characteristic vectors of the target users;
the second information evaluation module is used for calculating the risk score of the case to be settled through a preset algorithm model according to the target user feature vector and the claim settlement request information;
and the claim settlement module is used for determining different claim settlement schemes according to the risk scores to settle the claims.
Optionally, the information clustering module being configured to perform Meanshift clustering on the social relationship knowledge graph data according to the multidimensional data features of the target user to obtain a plurality of communities corresponding to each dimension of data of the target user includes:
extracting behavior data of all user identifications according to the entity user identifications in the social relationship knowledge graph data;
acquiring the corresponding timestamp and IP position data according to the behavior data; representing the corresponding IP position data in vector form in timestamp order, $x_i$, $i = 1, \dots, n$;
arbitrarily selecting a point $x$ as the center of a circle, and calculating the mean position vector $M_h(x)$ of all vectors in the target circular region, comprising:
$S_h$ is a circular region of radius $h$, namely the set of points $y$ satisfying $S_h(x) = \{\, y : (y - x)^{\mathsf T}(y - x) \le h^2 \,\}$, and $M_h(x) = \frac{1}{k} \sum_{x_i \in S_h} (x_i - x)$; wherein $M_h(x)$ is the mean position vector in the circular region, $h$ represents the radius, $S_h$ represents the circular region of radius $h$, $x$ represents the center of the circle, $k$ represents the number of sample points in $S_h$, and $y$ represents the set of sample points falling within the circle $S_h$; the convergence center distance is the minimum distance between two drift circle regions, and when the distance between the centers of two circles is smaller than this value, the two circles are considered to belong to the same cluster; the convergence movement distance is the minimum threshold for the distance between two successive drifts, and when that distance is smaller than the threshold, the drift is considered to have converged and the center point position of the IP positions is obtained; and a plurality of communities corresponding to each dimension of data of the target user are obtained according to the center point positions.
To achieve the above object, an embodiment of the present invention further provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the method for intelligent claim settlement based on big data driving as described above.
To achieve the above object, embodiments of the present invention also provide a computer-readable storage medium including instructions that, when executed on a computer, cause the computer to perform the intelligent claim settlement method based on big data driving as described above.
The beneficial results of the technical scheme of the invention are as follows:
According to the intelligent claim settlement method and system based on big data driving, target user information is obtained according to the claim settlement request information of the case to be settled sent by the claim settlement terminal; target user behavior track data is acquired according to the target user information, and multidimensional data features of the target user are extracted to form a target user feature vector; social relationship knowledge graph data associated with the target user is acquired from a graph database according to the multidimensional data features of the target user; Meanshift clustering is performed on the social relationship knowledge graph data according to the multidimensional data features of the target user to obtain a plurality of communities corresponding to each dimension of data of the target user; attribute data of each community is acquired from the social relationship knowledge graph data, and whether a suspicious community exists among the communities containing the target user is judged according to a pre-trained risk assessment model; if so, attribute data of the suspicious community members is extracted and matched with the user feature vector to obtain target user feature data of the corresponding dimensions; the target user feature data obtained by matching is taken as the main feature factors, and the target user feature data that is not successfully matched is taken as the secondary feature factors; different weight values are set according to the main feature factors and the secondary feature factors, and the target user feature vector is updated; the risk score of the case to be settled is calculated through a preset algorithm model according to the target user feature vector and the claim settlement request information; and different claim settlement schemes are determined according to the risk score to settle the claim.
According to the invention, on the premise of risk management and control, the risk level of the target user's claim case is predicted by integrating social group data related to the user, which improves the accuracy of the risk prediction result. Low-risk cases can be settled in advance, directly or instantly; medium-risk cases can be settled quickly; and high-risk cases are settled cautiously after investigation, thereby improving the quality of the insurance claim settlement service.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure, illustrate embodiments of the disclosure and together with the description serve to explain the present application and do not constitute an undue limitation on the present disclosure.
FIG. 1 is a flow chart of the steps of a big data driven intelligent claim settlement method based on big data processing.
FIG. 2 is a schematic block diagram of a big data driven intelligent claim settlement system based on big data processing.
Detailed Description
The disclosure is further described below with reference to the drawings and examples.
It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the present disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments in accordance with the present application. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
Example 1
Referring to FIG. 1, a flowchart illustrating steps of a method for intelligent claim settlement based on big data driving according to one embodiment of the present invention is shown. It will be appreciated that the flow charts in the method embodiments are not intended to limit the order in which the steps are performed. Note that, in this embodiment, a computer device is described as an execution subject. The method comprises the following steps:
Step S100, obtaining target user information according to the claim settlement request information of the case to be settled sent by the claim settlement terminal.
Illustratively, a user may send a claim settlement request to a claim settlement server through a user terminal, such as a mobile phone or a computer, triggering the automatic claim settlement procedure. The claim settlement request information contains not only case information of the case to be settled, such as the type of insurance (e.g. car insurance, health insurance, critical illness insurance, accidental injury insurance, travel insurance, life insurance, personal insurance, property insurance, liability insurance, etc.), the insurance applicant and case details, but also information such as the user identification and the user terminal identification.
Step S102, collecting target user behavior track data according to target user information, extracting multi-dimensional data features of a target user, and forming a target user feature vector.
For example, during the operation of the terminal device, a large amount of user behavior data (such as social network data, operation record data and risk data included in a log file) may be generated, where the user behavior data may be derived from user behavior data generated by different applications. By acquiring the user identifier and the user terminal identifier, user behavior data local to the user terminal can be collected, and target user behavior data stored on the cloud server can be collected according to the user identifier.
Step S104, acquiring social relationship knowledge graph data associated with the target user from a graph database according to the multidimensional data features of the target user.
By way of example, by applying big data techniques such as statistical analysis and clustering to the user's behavior data, various types of social network data can be obtained, such as travel, shopping and friend-making types. A social relationship knowledge graph of the user can be constructed from the social network data, operation record data and risk data, and the constructed graph is stored in a graph database. The social network data includes social relationship information and social mode types; the social mode types include, but are not limited to, short message, telephone and online social services. The operation record data refers to records of the user's account login or service application operations, and includes the social account information used to log in, the mobile phone number used by the user or associated with the account, the identification information of the device used, the login address information and the like. The risk data includes the amount in arrears, the number of overdue payments, the number of loans under the user's name and the like. Entities and the relations among them are extracted from these data to construct the social relationship knowledge graph of the user.
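As a rough illustration of how extracted entities and relations might be assembled into a graph and queried for users related to the target user, consider the sketch below. It is an in-memory stand-in under stated assumptions, not a graph database: the triple format, the `user:` / `device:` / `ip:` prefixes, the relation names and both function names are hypothetical.

```python
from collections import defaultdict


def build_graph(triples):
    """Build an undirected adjacency map from (head, relation, tail) triples."""
    graph = defaultdict(set)
    for head, relation, tail in triples:
        graph[head].add((relation, tail))
        graph[tail].add((relation, head))
    return graph


def related_users(graph, user):
    """Users that share at least one attribute node (device, phone, IP) with `user`."""
    found = set()
    for _, node in graph[user]:
        for _, neighbour in graph[node]:
            if neighbour != user and neighbour.startswith("user:"):
                found.add(neighbour)
    return found


# Hypothetical triples extracted from operation record data.
triples = [
    ("user:u1", "uses_device", "device:d1"),
    ("user:u2", "uses_device", "device:d1"),
    ("user:u3", "logs_in_from", "ip:10.0.0.5"),
]
graph = build_graph(triples)
neighbours = related_users(graph, "user:u1")
```

In this toy data only `user:u2` shares a device with `user:u1`, so it is the only related user returned. A production system would issue the equivalent query against the graph database instead.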
Step S106, performing Meanshift clustering on the social relationship knowledge graph data according to the multidimensional data features of the target user to obtain a plurality of communities corresponding to each dimension of data of the target user.
Illustratively, Meanshift is a mean shift clustering algorithm, used here to determine the cluster centers after large-scale user points are grouped. Its principle is to find dense regions of data points with a sliding window. Mean shift is a centroid-based algorithm: it locates the center point of each cluster by repeatedly updating the candidate center point to the mean of the points within the sliding window. Near-duplicate candidate windows are then removed, finally forming a set of center points and the corresponding clusters.
The process of the Meanshift algorithm is as follows:
Step1: extract the behavior data of all user identifiers according to the entity user identifiers in the social relationship knowledge graph data;
Step2: obtain the corresponding timestamps and IP location data from the behavior data, and represent the IP location data, ordered by timestamp, as vectors $x_i$, $i = 1, \dots, n$;
Step3: select an arbitrary point as the center $x$ and generate a circle of radius $h$;
Step4: calculate the mean position vector $M_h(x)$ of all vectors in the target circular region:
$$M_h(x) = \frac{1}{k} \sum_{x_i \in S_h} (x_i - x)$$
where $S_h$ is a circular region of radius $h$, namely the set of points $y$ satisfying
$$S_h(x) = \{\, y : (y - x)^{T}(y - x) \le h^{2} \,\}$$
Here $M_h(x)$ is the mean position vector in the circular region, $h$ is the radius, $x$ is the center of the circle, $k$ is the number of sample points falling within $S_h$, and $y$ denotes the points falling within the circle $S_h$;
Step5: judge whether the convergence conditions are reached. The convergence center distance is the minimum distance between two drift circles: when the distance between the centers of two circles is smaller than this value and the convergence vector is zero, the two circles are considered to belong to the same cluster. The convergence movement distance is the minimum threshold for a single drift: when the distance between two successive drift positions is smaller than this threshold, the drift is considered to have converged, and the center point position of the IP locations is obtained. A plurality of communities corresponding to each dimension of the target user's data is then obtained according to the center point positions.
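The five steps above can be sketched as a flat-kernel mean shift over two-dimensional points. This is only a minimal illustration under stated assumptions: the function name `mean_shift`, the parameter names `move_tol` (convergence movement distance) and `merge_tol` (convergence center distance), the default `merge_tol = h / 2`, and the toy coordinates are all hypothetical and are not taken from the patent.

```python
import math


def mean_shift(points, h, move_tol=1e-6, merge_tol=None, max_iter=100):
    """Flat-kernel mean shift; returns (centers, labels).

    h         -- radius of the circular window
    move_tol  -- stop drifting once a single shift is shorter than this
    merge_tol -- converged centers closer than this share one cluster
                 (defaults to h / 2, an arbitrary illustrative choice)
    """
    if merge_tol is None:
        merge_tol = h / 2
    centers, labels = [], []
    for start in points:
        x = start
        for _ in range(max_iter):
            # Sample points that fall inside the circle of radius h around x.
            inside = [p for p in points if math.dist(p, x) <= h]
            if not inside:
                break  # guard against floating-point edge cases
            k = len(inside)
            # Mean shift vector: average offset of the inside points from x.
            shift = tuple(
                sum(p[d] - x[d] for p in inside) / k for d in range(len(x))
            )
            x = tuple(x[d] + shift[d] for d in range(len(x)))
            if math.hypot(*shift) < move_tol:
                break  # convergence movement distance reached
        # Merge with an existing center when the two circles nearly coincide.
        for idx, center in enumerate(centers):
            if math.dist(center, x) < merge_tol:
                labels.append(idx)
                break
        else:
            centers.append(x)
            labels.append(len(centers) - 1)
    return centers, labels


# Toy "IP location" vectors forming two dense regions.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centers, labels = mean_shift(points, h=1.0)
print(labels)
```

Each label identifies the community that the corresponding point drifts into. The window radius `h` strongly affects how many communities are found, so in practice it would need tuning for the data at hand.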
The communities obtained through Meanshift clustering take the user data of every dimension into account, and each has a certain social relationship with the target user, so the resulting communities are more accurate.
Step S108, attribute data of each community is obtained from the social relation knowledge graph data, and whether suspicious communities exist in each community containing the target user is judged according to a pre-trained risk assessment model.
Optionally, acquiring the attribute data of each community from the social relationship knowledge graph data, and judging whether suspicious communities exist among the communities containing the target user according to a pre-trained risk assessment model, specifically includes:
a probabilistic graphical model is established in advance as the risk assessment model by training on sample data, and the joint probability distribution of all variables is as follows:
$$P(x_1, y_1, \dots, x_n, y_n) = P(y_1)\,P(x_1 \mid y_1) \prod_{i=2}^{n} P(y_i \mid y_{i-1})\,P(x_i \mid y_i)$$
where $y_t$ is the state and $x_t$ is the observation at time $t$;
the probabilities of the model switching between states form the matrix $A = [a_{ij}]_{N \times N}$, where
$$a_{ij} = P(y_{t+1} = s_j \mid y_t = s_i), \quad 1 \le i, j \le N$$
indicates, at any time $t$, the probability that the state is $s_j$ at the next moment given that the state is $s_i$; here $y_{t+1}$ is the state at time $t+1$;
the model outputs each observation value with a probability that depends on the current state, expressed as the matrix $B = [b_{ij}]_{N \times M}$, where
$$b_{ij} = P(x_t = o_j \mid y_t = s_i), \quad 1 \le i \le N,\ 1 \le j \le M$$
indicates, at any time $t$, the probability that the observation $o_j$ is obtained given that the state is $s_i$;
the probability of each state occurring at the initial moment is expressed as $\pi = (\pi_1, \pi_2, \dots, \pi_N)$, where $\pi_i = P(y_1 = s_i)$, $1 \le i \le N$, represents the probability that the initial state of the model is $s_i$.
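To make the joint probability concrete, the sketch below evaluates it for one state and observation sequence, assuming the model is a standard hidden Markov model with transition matrix `A`, observation matrix `B`, and initial state probabilities `pi`. The function name, the two-state setup (0 for "normal", 1 for "suspicious"), and all numeric values are invented for illustration and are not taken from the patent.

```python
def joint_probability(states, observations, A, B, pi):
    """Joint probability of a state sequence and an observation sequence.

    states, observations -- equal-length lists of integer indices
    A[i][j] -- probability of moving from state i to state j
    B[i][j] -- probability of emitting observation j while in state i
    pi[i]   -- probability that the initial state is i
    """
    if not states or len(states) != len(observations):
        raise ValueError("sequences must be non-empty and of equal length")
    # Initial state and its observation.
    prob = pi[states[0]] * B[states[0]][observations[0]]
    # Each later step contributes one transition and one observation term.
    for t in range(1, len(states)):
        prob *= A[states[t - 1]][states[t]] * B[states[t]][observations[t]]
    return prob


# Hypothetical two-state model: 0 = "normal", 1 = "suspicious".
A = [[0.9, 0.1], [0.3, 0.7]]
B = [[0.8, 0.2], [0.25, 0.75]]
pi = [0.5, 0.5]
p = joint_probability([0, 0, 1], [0, 0, 1], A, B, pi)
print(p)
```

For this sequence the product is 0.5 × 0.8 × (0.9 × 0.8) × (0.1 × 0.75), one factor for the initial state and observation followed by a transition and observation pair for each later time step.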
Optionally, the risk assessment model training process includes:
constructing an initial risk assessment model with parameters $\lambda = [A, B, \pi]$, and generating the observation sequence $\{x_1, x_2, \dots, x_n\}$ as follows:
S1, set time $t = 1$, and select the initial state $y_1$ according to the initial state probability $\pi$;
S2, select the observation variable $x_t$ according to the state $y_t$ and the output probability $B$;
S3, transition the model state to $y_{t+1}$ according to the state $y_t$ and the state transition matrix $A$;
S4, if $t < n$, set $t = t + 1$ and go to S2; otherwise stop.
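Steps S1 to S4 can be sketched as a short sampling loop. This assumes a standard hidden Markov model whose states and observations are integer indices; the function name `generate_observations`, the `seed` parameter, and the example matrices are hypothetical and not specified in the patent.

```python
import random


def generate_observations(A, B, pi, n, seed=None):
    """Sample n (state, observation) pairs following steps S1 to S4.

    A  -- state transition matrix, A[i][j] = P(next state j | state i)
    B  -- output matrix, B[i][j] = P(observation j | state i)
    pi -- initial state probabilities
    """
    rng = random.Random(seed)

    def draw(weights):
        # Draw one index with probability proportional to its weight.
        return rng.choices(range(len(weights)), weights=weights, k=1)[0]

    states, observations = [], []
    state = draw(pi)                         # S1: pick the initial state
    for _ in range(n):                       # S4: repeat until t reaches n
        states.append(state)
        observations.append(draw(B[state]))  # S2: emit an observation
        state = draw(A[state])               # S3: move to the next state
    return states, observations


# Hypothetical two-state model, sampled with a fixed seed for repeatability.
A = [[0.9, 0.1], [0.3, 0.7]]
B = [[0.8, 0.2], [0.25, 0.75]]
pi = [0.5, 0.5]
states, observations = generate_observations(A, B, pi, n=10, seed=0)
print(states, observations)
```

Note that this procedure only generates sequences from given parameters; estimating `A`, `B`, and `pi` from training samples would need a separate fitting method, which the text does not specify.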
Step S110, if a suspicious community exists, attribute data of the suspicious community members is extracted and matched with the target user feature vector to obtain target user feature data of the corresponding dimensions.
Optionally, the attribute data of the suspicious community members enriches the dimensions of the predicted target user feature data, supplementing and refining it.
Step S112, the target user characteristic data obtained by matching is taken as a main characteristic factor, and the target user characteristic data which is not successfully matched is taken as a secondary characteristic factor.
Step S114, different weight values are set according to the main feature factors and the secondary feature factors, and the target user feature vector is updated.
Because the sample data for the target user feature data is limited and unevenly distributed, the accuracy of the subsequent evaluation of the user's risk level is affected. In practical applications, the screening of main and secondary feature factors in the user feature data tends to be subjective and inaccurate. Therefore, the attribute data of the suspicious community members is used to rank the importance of the target user feature data, and adjusting the weight values accordingly can substantially improve the accuracy of automatic identification.
Optionally, setting different weight values for the main feature factors and the secondary feature factors and updating the target user feature vector includes: increasing the weight proportion of the main feature factors and reducing the weight proportion of the secondary feature factors; then multiplying each feature factor by its corresponding weight proportion and replacing the corresponding entries of the original target user feature vector.
Step S116, the risk score of the case to be settled is calculated through a preset algorithm model according to the target user feature vector and the claim settlement request information.
Step S118, different claim settlement schemes are determined according to the risk score to settle the claim.
Optionally, if the risk score is greater than the first risk threshold, the claim is rejected and corresponding early warning information is output; if the risk score is smaller than or equal to the first risk threshold and greater than the second risk threshold, the information of the case to be settled is input into a customer service terminal for auditing, and the claim is settled manually; if the risk score is smaller than the second risk threshold, the information of the case to be settled is input into a claim settlement processing model, which performs automatic claim analysis, matches the corresponding claim settlement scheme, and settles the claim automatically.
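The three-way routing can be sketched as a simple threshold check. The threshold values 0.8 and 0.4, the route labels, and the function name are hypothetical. The text does not say what happens when the score exactly equals the second threshold; this sketch assumes such cases go to manual review as the more cautious choice.

```python
def choose_settlement_route(risk_score, first_threshold=0.8, second_threshold=0.4):
    """Map a risk score to a settlement route.

    Threshold values are illustrative assumptions. A score exactly equal to
    second_threshold is sent to manual review (an assumption, since the
    source leaves that case unspecified).
    """
    if second_threshold >= first_threshold:
        raise ValueError("second_threshold must be below first_threshold")
    if risk_score > first_threshold:
        return "reject_and_warn"    # high risk: reject, output early warning
    if risk_score >= second_threshold:
        return "manual_review"      # medium risk: customer service audit
    return "auto_settle"            # low risk: automatic settlement


for score in (0.95, 0.8, 0.5, 0.1):
    print(score, choose_settlement_route(score))
```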
According to the invention, under the premise of risk management and control, the risk level of the target user's claim case is predicted by integrating data on the social groups related to the user, which improves the accuracy of the risk prediction. Low-risk cases can be settled in advance, directly, or instantly; medium-risk cases can be settled quickly; and high-risk cases are settled cautiously after investigation, which greatly improves the quality of the insurance claim settlement service.
In one embodiment, a big-data-driven intelligent claim settlement system is provided, which corresponds to the intelligent claim settlement method in the above embodiment. As shown in FIG. 2, the intelligent claim settlement system includes an information receiving module 11, an information collecting module 12, a first information processing module 13, an information clustering module 14, a first information evaluating module 15, an information judging module 16, an information screening module 17, a second information processing module 18, a second information evaluating module 19, and a claim settlement module 20. The functional modules are described in detail as follows:
the information receiving module 11 is configured to obtain target user information according to the claim settlement request information of a case to be settled sent by the claim settlement terminal;
the information acquisition module 12 is configured to acquire target user behavior track data according to the target user information, extract multi-dimensional data features of the target user, and form a target user feature vector;
the first information processing module 13 is configured to acquire social relationship knowledge graph data associated with the target user from a graph database according to the multi-dimensional data features of the target user;
the information clustering module 14 is configured to perform Meanshift clustering on the social relationship knowledge graph data according to the multi-dimensional data features of the target user, so as to obtain a plurality of communities corresponding to each dimension of the target user's data;
the first information evaluation module 15 is configured to obtain attribute data of each community from the social relationship knowledge graph data, and determine whether a suspicious community exists among the communities containing the target user according to a pre-trained risk evaluation model;
the information judging module 16 is configured to, if a suspicious community exists, extract attribute data of the suspicious community members and match it with the target user feature vector to obtain target user feature data of the corresponding dimensions;
the information screening module 17 is configured to take the target user feature data obtained by matching as a main feature factor, and the target user feature data that is not successfully matched as a secondary feature factor;
the second information processing module 18 is configured to set different weight values according to the main feature factor and the secondary feature factor, and update the target user feature vector;
the second information evaluation module 19 is configured to calculate, according to the target user feature vector and the claim settlement request information, a risk score of the case to be settled through a preset algorithm model;
and the claim settlement module 20 is configured to determine different claim settlement schemes according to the risk score to settle the claim.
In one embodiment, the functions of the information clustering module 14 further include:
extracting behavior data of all user identifications according to entity user identifications in social relationship knowledge graph data;
obtaining the corresponding timestamps and IP location data from the behavior data, and representing the IP location data, ordered by timestamp, as vectors $x_i$, $i = 1, \dots, n$;
selecting an arbitrary point as the center $x$ of a circle, and calculating the mean position vector $M_h(x)$ of all vectors in the target circular region:
$$M_h(x) = \frac{1}{k} \sum_{x_i \in S_h} (x_i - x)$$
where $S_h$ is a circular region of radius $h$, namely the set of points $y$ satisfying
$$S_h(x) = \{\, y : (y - x)^{T}(y - x) \le h^{2} \,\}$$
Here $M_h(x)$ is the mean position vector in the circular region, $h$ is the radius, $x$ is the center of the circle, $k$ is the number of sample points, and $y$ denotes the points falling within the circle $S_h$. The convergence center distance is the minimum distance between two drift circle regions: when the distance between the centers of two circles is smaller than this value, the two circles are considered to belong to the same cluster. The convergence movement distance is the minimum threshold for a single drift: when the distance between two successive drift positions is smaller than this threshold, the drift is considered to have converged, and the center point position of the IP locations is obtained. A plurality of communities corresponding to each dimension of the target user's data is then obtained according to the center point positions.
The embodiment of the invention also provides electronic equipment, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the big data driving-based intelligent claim settlement method is realized when the processor executes the computer program.
The embodiment of the invention also provides a computer readable storage medium, which comprises instructions, wherein when the instructions run on a computer, the instructions cause the computer to execute the intelligent claim settlement method based on big data driving.
While the specific embodiments of the present disclosure have been described above with reference to the drawings, it should be understood that the present disclosure is not limited to the embodiments, and that various modifications and changes can be made by one skilled in the art without inventive effort on the basis of the technical solutions of the present disclosure while remaining within the scope of the present disclosure.

Claims (9)

1. A method for intelligent claim settlement based on big data driving, comprising:
obtaining target user information according to claim settlement request information of a case to be subjected to claim settlement sent by a claim settlement terminal;
acquiring target user behavior track data according to the target user information, and extracting multi-dimensional data characteristics of a target user to form a target user characteristic vector;
acquiring social relationship knowledge graph data associated with the target user from a graph database according to the multidimensional data characteristics of the target user;
carrying out Meanshift clustering on the social relationship knowledge graph data according to the multidimensional data characteristics of the target user to obtain a plurality of communities corresponding to each dimensional data of the target user;
acquiring attribute data of each community from the social relation knowledge graph data, and judging whether suspicious communities exist in each community containing the target user according to a pre-trained risk assessment model;
if yes, extracting attribute data of the suspicious community members, and matching with the target user feature vector to obtain target user feature data with corresponding dimensions;
taking the target user characteristic data obtained by matching as a main characteristic factor, and taking the target user characteristic data which is not successfully matched as a secondary characteristic factor;
different weight values are set according to the main feature factors and the secondary feature factors, and the feature vectors of the target users are updated;
calculating the risk score of the case to be settled through a preset algorithm model according to the updated target user feature vector and the claim settlement request information;
determining different claim settlement schemes according to the risk scores to settle the claims;
wherein setting different weight values according to the main feature factors and the secondary feature factors and updating the target user feature vector comprises:
increasing the weight proportion of the main feature factors and reducing the weight proportion of the secondary feature factors;
and multiplying each feature factor by its corresponding weight proportion, and correspondingly replacing the original target user feature vector.
2. The method of claim 1, wherein the performing a Meanshift clustering on the social relationship knowledge graph data according to the multi-dimensional data features of the target user to obtain a plurality of communities corresponding to each dimensional data including the target user comprises:
extracting behavior data of all user identifications according to entity user identifications in social relationship knowledge graph data;
obtaining the corresponding timestamps and IP location data from the behavior data, and representing the IP location data, ordered by timestamp, as vectors $x_i$, $i = 1, \dots, n$;
selecting an arbitrary point as the center $x$ of a circle, and calculating the mean position vector $M_h(x)$ of all vectors in the target circular region, comprising:
$$M_h(x) = \frac{1}{k} \sum_{x_i \in S_h} (x_i - x)$$
wherein $S_h$ is a circular region of radius $h$, namely the set of points $y$ satisfying $S_h(x) = \{\, y : (y - x)^{T}(y - x) \le h^{2} \,\}$; where $k$ represents the number of sample points and $y$ represents the points falling within the circle $S_h$; the convergence center distance is the minimum distance between two drift circle regions, and when the distance between the centers of two circles is smaller than the convergence center distance, the two circles are considered to belong to the same cluster; the convergence movement distance is the minimum threshold for a single drift, and when the distance between two successive drift positions is smaller than the convergence movement distance, the drift is considered to have converged, and the center point position of the IP locations is obtained; and obtaining a plurality of communities corresponding to each dimension of the target user's data according to the center point positions.
3. The method of claim 1, wherein the obtaining attribute data of each community from the social relationship knowledge graph data, and judging whether there is a suspicious community in each community including the target user according to a pre-trained risk assessment model, specifically comprises:
a risk assessment model is established in advance as a probabilistic graphical model by training on sample data, and the joint probability distribution of all variables is as follows:
$$P(x_1, y_1, \dots, x_n, y_n) = P(y_1)\,P(x_1 \mid y_1) \prod_{i=2}^{n} P(y_i \mid y_{i-1})\,P(x_i \mid y_i)$$
the probabilities of the risk assessment model switching among the states form the matrix $A = [a_{ij}]_{N \times N}$, wherein
$a_{ij} = P(y_{t+1} = s_j \mid y_t = s_i)$, $1 \le i, j \le N$, indicates, at any time $t$, the probability that the state is $s_j$ at the next moment given that the state is $s_i$; wherein $y_{t+1}$ is the state at time $t+1$;
the risk assessment model outputs each observation value with an observation probability that depends on the current state, expressed as the matrix $B = [b_{ij}]_{N \times M}$, wherein
$b_{ij} = P(x_t = o_j \mid y_t = s_i)$, $1 \le i \le N$, $1 \le j \le M$, indicates, at any time $t$, the probability that the observation value $o_j$ is obtained given that the state is $s_i$;
the probability of each state of the risk assessment model occurring at the initial moment is expressed as $\pi = (\pi_1, \pi_2, \dots, \pi_N)$, wherein $\pi_i = P(y_1 = s_i)$, $1 \le i \le N$, represents the probability that the initial state of the model is $s_i$.
4. The big data driven intelligent claim method of claim 3, wherein the risk assessment model training process comprises:
constructing an initial risk assessment model with parameters $\lambda = [A, B, \pi]$, and generating the observation sequence $\{x_1, x_2, \dots, x_n\}$ as follows:
S1, setting time $t = 1$, and selecting the initial state $y_1$ according to the initial state probability $\pi$;
S2, selecting the observation variable $x_t$ according to the state $y_t$ and the output probability $B$;
S3, transitioning the model state to $y_{t+1}$ according to the state $y_t$ and the state transition matrix $A$;
S4, if $t < n$, setting $t = t + 1$ and going to S2; otherwise stopping.
5. The method of claim 1, wherein the determining different claim settlement schemes for settlement according to risk scores comprises:
if the risk score is larger than the first risk threshold, refusing to settle the claim, and outputting corresponding early warning information;
if the risk score is smaller than or equal to the first risk threshold and larger than the second risk threshold, inputting the information of the case to be settled into a customer service terminal for auditing, and settling the claim manually;
if the risk score is smaller than the second risk threshold, inputting the information of the case to be settled into a claim settlement processing model, performing automatic claim analysis, matching the corresponding claim settlement scheme, and settling the claim automatically.
6. A big data driven intelligent claim settlement system applied to the big data driven intelligent claim settlement method of claim 1, comprising:
the information receiving module is used for obtaining target user information according to the claim settlement request information of the claim case to be settled, which is sent by the claim settlement terminal;
the information acquisition module is used for acquiring target user behavior track data according to the target user information, extracting multi-dimensional data characteristics of the target user and forming a target user characteristic vector;
the first information processing module is used for acquiring social relationship knowledge graph data associated with the target user from a graph database according to the multidimensional data characteristics of the target user;
the information clustering module is used for carrying out Meanshift clustering on the social relation knowledge graph data according to the multidimensional data characteristics of the target user to obtain a plurality of communities corresponding to the dimensional data of the target user;
the first information evaluation module is used for acquiring attribute data of each community from the social relation knowledge graph data and judging whether suspicious communities exist in each community containing the target user according to a pre-trained risk evaluation model;
the information judging module is used for extracting attribute data of the suspicious community members and matching the attribute data with the target user feature vector to obtain target user feature data with corresponding dimensions if the suspicious community members exist;
the information screening module is used for taking the target user characteristic data obtained by matching as a main characteristic factor and the target user characteristic data which is not successfully matched as a secondary characteristic factor;
the second information processing module is used for setting different weight values according to the main characteristic factors and the secondary characteristic factors and updating the characteristic vectors of the target users;
the second information evaluation module is used for calculating the risk score of the case to be settled through a preset algorithm model according to the updated target user feature vector and the claim settlement request information;
the claim settlement module is used for determining different claim settlement schemes according to the risk scores to settle the claims;
wherein setting different weight values according to the main feature factors and the secondary feature factors and updating the target user feature vector comprises:
increasing the weight proportion of the main feature factors and reducing the weight proportion of the secondary feature factors;
and multiplying each feature factor by its corresponding weight proportion, and correspondingly replacing the original target user feature vector.
7. The big data driven intelligent claim settlement system according to claim 6, wherein the information clustering module is configured to perform Meanshift clustering on the social relationship knowledge graph data according to the multidimensional data features of the target user to obtain a plurality of communities corresponding to each dimensional data including the target user, and includes:
extracting behavior data of all user identifications according to entity user identifications in social relationship knowledge graph data;
obtaining the corresponding timestamps and IP location data from the behavior data, and representing the IP location data, ordered by timestamp, as vectors $x_i$, $i = 1, \dots, n$;
selecting an arbitrary point as the center $x$ of a circle, and calculating the mean position vector $M_h(x)$ of all vectors in the target circular region, comprising:
$$M_h(x) = \frac{1}{k} \sum_{x_i \in S_h} (x_i - x)$$
wherein $S_h$ is a circular region of radius $h$, namely the set of points $y$ satisfying $S_h(x) = \{\, y : (y - x)^{T}(y - x) \le h^{2} \,\}$; where $k$ represents the number of sample points and $y$ represents the points falling within the circle $S_h$; the convergence center distance is the minimum distance between two drift circle regions, and when the distance between the centers of two circles is smaller than the convergence center distance, the two circles are considered to belong to the same cluster; the convergence movement distance is the minimum threshold for a single drift, and when the distance between two successive drift positions is smaller than the convergence movement distance, the drift is considered to have converged, and the center point position of the IP locations is obtained; and obtaining a plurality of communities corresponding to each dimension of the target user's data according to the center point positions.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the big data driven intelligent claim settlement method according to any of claims 1-5 when the computer program is executed.
9. A computer-readable storage medium comprising instructions that, when executed on a computer, cause the computer to perform the big data driven intelligent claim settlement method of any of claims 1-5.
CN202311203638.0A 2023-09-19 2023-09-19 Intelligent claim settlement method and system based on big data driving Active CN116934507B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311203638.0A CN116934507B (en) 2023-09-19 2023-09-19 Intelligent claim settlement method and system based on big data driving


Publications (2)

Publication Number Publication Date
CN116934507A CN116934507A (en) 2023-10-24
CN116934507B true CN116934507B (en) 2023-12-26

Family

ID=88386529

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311203638.0A Active CN116934507B (en) 2023-09-19 2023-09-19 Intelligent claim settlement method and system based on big data driving

Country Status (1)

Country Link
CN (1) CN116934507B (en)



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant