WO2019196552A1 - Data processing method, apparatus and device for insurance fraud identification, and server - Google Patents

Data processing method, apparatus and device for insurance fraud identification, and server

Info

Publication number
WO2019196552A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
relationship
person
identified
insurance
Prior art date
Application number
PCT/CN2019/074097
Other languages
French (fr)
Chinese (zh)
Inventor
王修坤
邹晓川
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2019196552A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining

Definitions

  • The embodiments of this specification belong to the technical field of computer data processing for insurance fraud detection, and in particular relate to a data processing method, apparatus, processing device, and server for insurance fraud identification.
  • The embodiments of this specification aim to provide a data processing method, apparatus, processing device, and server for insurance fraud identification that use both the relationship network data between persons and the persons' own characteristics, and can thereby identify fraudsters more effectively.
  • The data processing method, apparatus, processing device, and server for insurance fraud identification provided by the embodiments of this specification are implemented as follows:
  • The supervised learning algorithm includes a data relationship model obtained by training on the multi-degree relationship network data and person characteristic data of a selected target group, with the marked historical fraudsters as training data.
  • A data processing apparatus for insurance fraud identification, comprising:
  • a data acquisition module configured to acquire relationship association data of the group to be identified;
  • a feature calculation module configured to construct multi-degree relationship network graph data of the group to be identified based on the relationship association data, and to extract person characteristic data of the group to be identified;
  • a fraud identification module configured to process the multi-degree relationship network graph data and the person characteristic data of the group to be identified with the constructed supervised learning algorithm, and to determine a fraud output result for the group to be identified;
  • wherein the supervised learning algorithm includes a data relationship model obtained by training on the multi-degree relationship network data and person characteristic data of a selected target group, with the marked historical fraudsters as sample data.
  • A processing device comprising a processor and a memory for storing processor-executable instructions, wherein the processor, when executing the instructions, implements:
  • The supervised learning algorithm includes a data relationship model obtained by training on the multi-degree relationship network data and person characteristic data of a selected target group, with the marked historical fraudsters as training data.
  • A server comprising at least one processor and a memory for storing processor-executable instructions, wherein the processor, when executing the instructions, implements:
  • The supervised learning algorithm includes a data relationship model obtained by training on the multi-degree relationship network data and person characteristic data of a selected target group, with the marked historical fraudsters as training data.
  • The data processing method, apparatus, processing device, and server for insurance fraud identification provided by the embodiments of this specification construct multi-degree relationship network graph data of a group from the multi-dimensional relationship association data of policyholders and insured persons. This allows the relationship network between persons to be explored in greater depth and improves the efficiency and scope of identification.
  • A supervised learning model is established to learn both the relationship network characteristics and the personal characteristics of fraudsters.
  • Members of a fraud gang not only show pronounced and abundant relationship characteristics in the relationship network; their own characteristics also tend to be similar. The methods provided in the embodiments of this specification can therefore identify fraudsters more effectively and improve the efficiency of recognition processing.
  • FIG. 1 is a schematic flowchart of an embodiment of a data processing method for insurance fraud identification provided by the present specification.
  • FIG. 2 is a schematic diagram of a processing procedure for constructing a supervised recognition model provided by the present specification.
  • FIG. 3 is a block diagram showing the hardware structure of an insurance fraud identification processing server provided by the present specification.
  • FIG. 4 is a block diagram showing the structure of a data processing apparatus for insurance fraud identification provided by the present specification.
  • FIG. 5 is a block diagram showing the structure of a fraud identification module in a data processing apparatus for insurance fraud identification provided by the present specification.
  • This specification provides multiple embodiments that start from multi-dimensional relationship association data of a target group including insured persons and claim applicants, and construct a multi-degree relationship network from it (the data of the relationship network graph may be referred to as a multi-degree relationship graph).
  • The solution provided by the embodiments of this specification also considers the characteristic attributes of the fraudsters themselves. For example, a fraudster usually registers an account with false information, the account has been registered only for a short time, and the account was registered in order to use the insurance service.
  • The implementation scheme provided by this specification combines the relationship characteristic data and the self-characteristic data of fraud groups, uses marked historical fraudsters as labels, and trains a supervised model, so that a result indicating whether a person to be identified is committing insurance fraud can be calculated.
  • FIG. 1 is a schematic flowchart diagram of an embodiment of a data processing method for insurance fraud identification provided by the present specification.
  • Although this specification provides method operation steps or device structures as shown in the following embodiments or figures, the method or device may include more operation steps or module units, or fewer after partial merging, based on conventional practice or without inventive effort.
  • The execution order of the steps or the module structure of the device is not limited to the execution order or the module structure shown in the embodiments or the drawings.
  • When the method or module structure is applied in an actual device, server, or terminal product, it may be executed sequentially or in parallel according to the method or module structure shown in the embodiments or drawings (for example, in a parallel-processor or multi-threaded environment, or even in a distributed-processing or server-cluster environment).
  • the description of the following embodiments does not constitute a limitation on other expandable technical solutions based on the present specification.
  • the embodiments provided in this specification can also be applied to implementation scenarios of fund fraud identification, product transactions, service transactions, and the like.
  • the data processing method for insurance fraud identification provided by the present specification may include:
  • S2: constructing multi-degree relationship network graph data of the group to be identified based on the relationship association data, and extracting person characteristic data of the group to be identified;
  • wherein the supervised learning algorithm includes a data relationship model obtained by training on the multi-degree relationship network data and person characteristic data of a selected target group, with the marked historical fraudsters as sample data.
  • In general, insurance application, accounting, and claims handling mainly concern the claim applicant.
  • Some embodiments consider the situation in which the motive for fraud exists from the moment the insurance is taken out: the fraudster's main purpose is to obtain the insurance payout. There are of course also cases in which the motive arises only after the insurance has been taken out.
  • The insured person is the main subject of the insurance; for example, accomplices in a fraud gang may deliberately stage an accident involving the insured person. Therefore, in this embodiment, the set of claim applicants and insured persons is selected as the target group when identifying whether fraud is present.
  • When the target group is selected for learning the relationship feature data, it may include the set of claim applicants and insured persons.
  • In some situations the claim applicant may be the policyholder, for example a father who insures his son, names himself as beneficiary, and applies for the claim after an accident. In other cases the claim applicant may also be the insured person, for example a person who insures himself and is his own beneficiary.
  • The claim applicant and the insured person mentioned above are names for roles in the insurance business; they are not necessarily different persons.
  • The target group may also be selected as one or more of the claim applicants, policyholders, insured persons, or beneficiaries.
  • the relationship association data may include data information associated with personnel in the target group in various dimensions, such as household registration, age, relative/classmate relationship between personnel, insurance data, insurance risk data, and the like.
  • Which categories of relationship association data are used may be determined according to the actual application scenario.
  • In general, the operator may use data that may be involved in fraudulent behavior as the basis for collecting the relationship association data.
  • the relationship-related data may include at least one of the following:
  • The social relationship data may include social relationships between people in the target group, such as cousins, teachers and students, family members, classmates, and superiors and subordinates.
  • The terminal data may include the brand, model, and category of the communication device a person uses; in some fraud scenarios, several people use mobile phones of the same brand.
  • The terminal's application and application-account operation information may be used to determine whether people use the same application, or log in to applications on different terminals with the same account, to carry out insurance fraud. In some scenarios, several accomplices act on their terminals under unified command.
  • The behavior data associated with insurance behavior may include data on the target person's insurance applications, claims, compensation amounts, and the like.
  • The basic attribute data of a person may include the age, gender, occupation, household registration, and the like of the insurance applicant or claim applicant.
  • The geographic location data may include the current geographic location of the target group or information on regions its members have historically passed through or stayed in.
  • The relationship association data of each dimension described above may be defined differently or contain more or fewer data categories. Relationship association data of other dimensions may also be included, such as consumption information or even credit records or administrative penalty information. One or more of the above kinds of data may be collected in a specific implementation.
  • The person characteristic data may include data associated with a single person, such as gender, age, registration time of the insurance service account or terminal application account, credit history, and consumption status. It may also include behavior data associated with insurance behavior, such as multiple insurance applications, frequent claims, and whether the compensation amounts are normal, as well as transaction data for other goods or services, such as long-term large expenses, multiple vehicle insurance policies, multiple mobile phones, and multiple communication or social accounts.
  • The person characteristic data used in a specific calculation may combine one or more of the above to characterize the person. Therefore, in another embodiment, the person characteristic data may include feature data extracted from at least one of user account registration data, transaction data, and behavior data associated with insurance behavior.
  • The multi-dimensional relationship association data obtained above can be used to construct the multi-degree relationship network graph data of the target group.
  • The multi-degree relationship network graph data may include a relationship network graph generated from the relationship chains between different persons that the relationship association data establishes; the relationship chain data between persons on that graph is the multi-degree relationship network graph data.
  • A relationship chain can represent the relationship between any two people, for example A and B are superior and subordinate, A and C are family members, and so on.
  • The direct relationship between two persons may be referred to as a first-degree relationship. "Multi-degree" in the multi-degree relationship network graph data described in this embodiment may include associations between further persons that are established on the basis of first-degree relationships.
  • For example, a second-degree relationship between a first person and a third person may be established from the first-degree relationship between the first person and a second person and the first-degree relationship between the second person and the third person; further relationships may even be established on the basis of other relationships.
  • For example, A and B have a first-degree social relationship (B is A's brother-in-law), and A has no direct social relationship with C, the boss of B's company. In this embodiment, because B is both A's brother-in-law and a subordinate of C, A establishes a second-degree relationship with C.
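  • A minimal sketch of the A, B, C example, assuming the relationship association data has already been reduced to (person, person, relationship type) triples. The function names `build_relationship_graph` and `relationship_degree` are hypothetical, and breadth-first search is used here only as one straightforward way to compute the degree of a relationship; the specification does not prescribe a particular graph algorithm.

```python
from collections import deque


def build_relationship_graph(relations):
    """Build an undirected adjacency map from (person, person, type) triples."""
    graph = {}
    for a, b, _relation_type in relations:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def relationship_degree(graph, source, target):
    """Return the length of the shortest relationship chain, or None if none exists."""
    if source not in graph or target not in graph:
        return None
    if source == target:
        return 0
    visited = {source}
    queue = deque([(source, 0)])
    while queue:
        person, degree = queue.popleft()
        for neighbor in graph[person]:
            if neighbor == target:
                return degree + 1
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, degree + 1))
    return None


# Illustrative data only: B is A's brother-in-law and a subordinate of boss C.
relations = [
    ("A", "B", "brother-in-law"),
    ("B", "C", "subordinate-of"),
]
graph = build_relationship_graph(relations)

print(relationship_degree(graph, "A", "B"))  # 1 (first-degree relationship)
print(relationship_degree(graph, "A", "C"))  # 2 (second-degree relationship via B)
```

  • In this sketch the relationship type is stored in the input but ignored when counting degrees; a fuller version could keep it on each edge so that later feature extraction can distinguish, say, kinship from employment.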
  • This embodiment can use a supervised learning algorithm to learn the relationship characteristics and self-characteristics of fraudsters, so that an effective recognition model can be established.
  • Supervised learning is commonly used for classification. An optimal model is obtained by training on existing training samples (that is, known data and their corresponding outputs). The model belongs to a set of functions, and "optimal" means best under a certain evaluation criterion.
  • The model is then used to map all inputs to their corresponding outputs, and a simple judgment on the output achieves classification, which gives the model the ability to classify unknown data.
  • Typical examples of supervised learning are KNN (k-Nearest Neighbor) and SVM (Support Vector Machine).
  • Given a certain number of training samples, a supervised learning algorithm can obtain more accurate output results than an unsupervised algorithm.
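  • As an illustration of supervised classification with KNN, one of the typical examples named above, the following sketch labels an unknown person by majority vote among the k closest labeled samples. The two features (share of a person's neighbours who are known fraudsters, and a normalized account age) and all sample values are assumptions chosen for illustration, not data from the specification.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _distance, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical features: (share of fraudster neighbours, normalized account age).
train_x = [
    (0.9, 0.1), (0.8, 0.2), (0.7, 0.1),  # marked historical fraudsters
    (0.1, 0.9), (0.2, 0.8), (0.1, 0.7),  # normal persons
]
train_y = [1, 1, 1, 0, 0, 0]  # 1 = fraudster, 0 = normal

print(knn_predict(train_x, train_y, (0.85, 0.15)))  # 1
print(knn_predict(train_x, train_y, (0.15, 0.85)))  # 0
```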
  • The specific processing of relationship features and self-features can be designed and determined according to the type of algorithm and the recognition requirements.
  • In one embodiment, a supervised graph algorithm such as Structure2vec can be used.
  • The constructed supervised learning algorithm includes:
  • S40: using the selected supervised learning algorithm to perform first relationship-network learning on the relationship characteristics between a target person and other persons in the multi-degree relationship network data of the target group, and performing second self-attribute learning based on the self-characteristic data of the target person;
  • S44: determining the constructed supervised learning algorithm when the output of the relationship model reaches a preset accuracy rate.
  • FIG. 2 is a schematic diagram of a processing procedure of an embodiment of a supervised learning algorithm provided by the present specification.
  • The supervised graph algorithm Structure2vec can be used. First, it learns the relationship characteristics of the target person and the person's neighbours (for example, how many people the person is related to and whether the person is related to a fraudster) together with the target person's own characteristics (such as gender and age); these characteristics serve as the x variables of the model. Second, whether the person is a fraudster according to the historical marking serves as the y variable. Finally, a model relating y to x is established, so that y can be predicted from x alone.
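  • The x/y setup described above can be sketched as follows. This is not Structure2vec: a plain logistic regression trained by gradient descent stands in for it so that the example stays self-contained. The feature names, the sample records in `HISTORY`, the scaling in `to_x`, and the `preset_accuracy` parameter are all assumptions made for illustration.

```python
import math

# Hypothetical marked history: relationship features, self-features, and the y label.
HISTORY = [
    {"neighbors": 6, "linked_to_fraudster": 1, "account_age_days": 10, "is_fraudster": 1},
    {"neighbors": 8, "linked_to_fraudster": 1, "account_age_days": 5, "is_fraudster": 1},
    {"neighbors": 5, "linked_to_fraudster": 1, "account_age_days": 30, "is_fraudster": 1},
    {"neighbors": 7, "linked_to_fraudster": 1, "account_age_days": 20, "is_fraudster": 1},
    {"neighbors": 2, "linked_to_fraudster": 0, "account_age_days": 300, "is_fraudster": 0},
    {"neighbors": 1, "linked_to_fraudster": 0, "account_age_days": 365, "is_fraudster": 0},
    {"neighbors": 3, "linked_to_fraudster": 0, "account_age_days": 200, "is_fraudster": 0},
    {"neighbors": 2, "linked_to_fraudster": 0, "account_age_days": 250, "is_fraudster": 0},
]


def to_x(person):
    """Scale one person's relationship and self features into an x vector in [0, 1]."""
    return [
        min(person["neighbors"], 10) / 10,
        float(person["linked_to_fraudster"]),
        min(person["account_age_days"], 365) / 365,
    ]


def predict_probability(weights, bias, x):
    """Logistic model: probability that the person with features x is a fraudster."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def accuracy(weights, bias, xs, ys):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum(
        (predict_probability(weights, bias, x) >= 0.5) == bool(y)
        for x, y in zip(xs, ys)
    )
    return hits / len(ys)


def train(xs, ys, preset_accuracy=0.95, learning_rate=0.5, max_epochs=2000):
    """Run gradient descent until the model reaches the preset accuracy."""
    weights, bias = [0.0] * len(xs[0]), 0.0
    for _epoch in range(max_epochs):
        errors = [predict_probability(weights, bias, x) - y for x, y in zip(xs, ys)]
        for j in range(len(weights)):
            gradient = sum(e * x[j] for e, x in zip(errors, xs)) / len(xs)
            weights[j] -= learning_rate * gradient
        bias -= learning_rate * sum(errors) / len(xs)
        if accuracy(weights, bias, xs, ys) >= preset_accuracy:
            return weights, bias
    raise RuntimeError("preset accuracy not reached")


xs = [to_x(p) for p in HISTORY]
ys = [p["is_fraudster"] for p in HISTORY]
weights, bias = train(xs, ys)

# Predict y for a person to be identified, relying on x alone.
candidate = {"neighbors": 7, "linked_to_fraudster": 1, "account_age_days": 3}
print(predict_probability(weights, bias, to_x(candidate)) >= 0.5)
```

  • Accuracy is measured on the training data here only to keep the sketch short; a realistic setup would evaluate the preset accuracy on held-out marked samples.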
  • The object finally identified in the application scenario of this embodiment may be a single person. That is, after the supervised learning algorithm has learned the relationship characteristics of gang fraud and combined them with the fraudsters' own characteristics, it can directly output whether a person to be identified is a fraudster, for example by marking the person as a fraudster or a normal person, or by giving the probability that the person is a fraudster.
  • The marking described here is a recognition result based on relationship features and self-features, and can serve as a basis and reference for a preliminary determination of whether these persons are fraudsters. The final determination of whether fraud has occurred may be made by the operator's judgment or in combination with other calculation methods.
  • The data processing method for insurance fraud identification provided by this embodiment constructs multi-degree relationship network graph data of a group from the multi-dimensional relationship association data of policyholders and insured persons. This allows the relationship network between persons to be explored in greater depth and improves the efficiency and scope of identification.
  • A supervised learning model is established to learn both the relationship network characteristics and the personal characteristics of fraudsters.
  • Members of a fraud gang not only show pronounced and abundant relationship characteristics in the relationship network; their own characteristics also tend to be similar. The methods provided in the embodiments of this specification can therefore identify fraudsters more effectively and improve the efficiency of recognition processing.
  • The data of historical fraudsters may be combined with the multi-degree relationship network graph data to identify fraudsters.
  • The relationship association data may further include list data of historical fraudsters.
  • In this case the data of historical fraud groups is added, and the degree of participation of historical fraudsters is considered when the classified communities are analyzed.
  • The relationship concentration described in this embodiment may include the degree of participation of historical fraudsters, and may specifically include the number of historical fraudsters in the classified community, their proportion, and the relationships between historical fraudsters and other persons.
  • An example of relationship concentration: in a risk community of 10 people, 2 are historical fraudsters who have kinship or one or more other relationships with other members, and 2 members are classmates.
  • The relationship concentration can be calculated in different ways, for example from the number of historical fraudsters, their proportion, and the relationship network.
  • The relationship concentration may be calculated from two indicators, the size of the group to be identified and the number of historical fraudsters in it, and used as a measure of the probability of group fraud. Specifically, it may include:
  • The personal fraud probability calculated from the self-features may be combined with the group fraud probability to determine the final output probability that the group, or a single person, is fraudulent. Alternatively, the group fraud probability and the personal fraud probability may each be used separately, without combining them.
  • The probability of community fraud can be calculated as follows:
  • RiskDegree = log(total number of classified community members) × number of historical fraudsters / total number of classified community members.
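  • A direct transcription of the RiskDegree formula, offered as a sketch. The specification does not state the base of the logarithm; the natural logarithm is assumed here, and the function name `risk_degree` and the input validation are additions for illustration.

```python
import math


def risk_degree(total_members, historical_fraudsters):
    """RiskDegree = log(total members) * historical fraudsters / total members.

    The natural logarithm is an assumption; the source only writes "log".
    """
    if total_members <= 0:
        raise ValueError("a classified community needs at least one member")
    if not 0 <= historical_fraudsters <= total_members:
        raise ValueError("fraudster count must lie between 0 and the community size")
    return math.log(total_members) * historical_fraudsters / total_members


# A risk community of 10 people containing 2 historical fraudsters.
print(round(risk_degree(10, 2), 2))  # 0.46
```

  • Under this formula the proportion of historical fraudsters is weighted by the logarithm of the community size, so at the same proportion a larger community receives a higher score than a smaller one, and a community of one person always scores 0.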
  • The above embodiments provide a way of identifying fraud groups using the data of historical fraudsters.
  • In another embodiment, the relationship network features between the members of a group can be used to determine whether they are fraudsters, specifically by determining the network structure characteristics of the relationships among the persons in the crowd;
  • the crowd is marked as a fraud group.
  • The method described above can be used in training the supervised learning algorithm, in which case the crowd is the target group.
  • It can also be used in identification, in which case the crowd is the group to be identified.
  • The network structure feature may be determined from the personnel information in a crowd, the relationship network information between persons, and the like.
  • The relationship network information here may be the first-degree relationship information described above, and may also include the constructed multi-degree relationship information.
  • The relationship network in the crowd may have a structure such as a "spherical network" or a "pyramid network".
  • The "pyramid network" resembles a pyramid-scheme organization, with a layer-by-layer relationship structure, and is more likely to be fraudulent.
  • The "spherical network" is one in which the members are all related to one another; it may be a decentralized fraud organization.
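  • The specification gives no formal definition of the two structures, so the following is only a heuristic sketch under stated assumptions: a connected, tree-shaped network (exactly n - 1 relationships among n members) is treated as pyramid-like, and a network whose share of realized pairwise relationships reaches a threshold is treated as spherical. The function names and the `dense_threshold` value are hypothetical, and every relationship is assumed to connect two listed members.

```python
from collections import deque


def is_connected(members, adjacency):
    """Check by breadth-first search that every member is reachable."""
    start = next(iter(members))
    seen = {start}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for neighbor in adjacency[person]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) == len(members)


def classify_structure(members, relations, dense_threshold=0.8):
    """Label a crowd's relationship network as pyramid, spherical, or undetermined."""
    members = set(members)
    if len(members) < 3:
        return "undetermined"
    # Deduplicate undirected relationships and drop self-loops.
    edges = {frozenset(pair) for pair in relations if pair[0] != pair[1]}
    adjacency = {member: set() for member in members}
    for edge in edges:
        a, b = tuple(edge)
        adjacency[a].add(b)
        adjacency[b].add(a)
    n = len(members)
    if is_connected(members, adjacency) and len(edges) == n - 1:
        return "pyramid"  # layered, tree-shaped chain of relationships
    density = len(edges) / (n * (n - 1) / 2)
    if density >= dense_threshold:
        return "spherical"  # nearly everyone is related to everyone else
    return "undetermined"


pyramid = [("boss", "lead1"), ("boss", "lead2"),
           ("lead1", "m1"), ("lead1", "m2"), ("lead2", "m3"), ("lead2", "m4")]
sphere = [("p", "q"), ("p", "r"), ("p", "s"), ("q", "r"), ("q", "s"), ("r", "s")]

print(classify_structure({"boss", "lead1", "lead2", "m1", "m2", "m3", "m4"}, pyramid))
print(classify_structure({"p", "q", "r", "s"}, sphere))
```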
  • The data processing method for insurance fraud identification provided by the embodiments of this specification mines the relationship data to build a relationship network that is close to the actual one, and computes multi-degree relationship network data on it.
  • The multi-degree relationship network graph data of the crowd can be used to explore the relationship network between persons in greater depth and to improve the efficiency and scope of identification.
  • A supervised learning model is established to learn both the relationship network characteristics and the personal characteristics of fraudsters.
  • Members of a fraud gang not only show pronounced and abundant relationship characteristics in the relationship network; their own characteristics also tend to be similar. The methods provided in the embodiments of this specification can therefore identify fraudsters more effectively and improve the efficiency of recognition processing.
  • The method described above can be used for insurance fraud identification on the client side, for example in an anti-fraud application installed on a mobile terminal or in an insurance service provided by a payment application.
  • The client can be a PC (personal computer), a server, an industrial control computer, a smartphone, a tablet, a portable computer (such as a laptop), a personal digital assistant (PDA), or a desktop computer.
  • FIG. 3 is a block diagram showing the hardware structure of an insurance fraud identification server according to an embodiment of this specification.
  • Server 10 may include one or more (only one shown) processors 102 (processor 102 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA), a memory 104 for storing data, and a transmission module 106 for communication functions. It will be understood by those skilled in the art that the structure shown in FIG. 3 is merely illustrative.
  • For example, server 10 may also include more or fewer components than those shown in FIG. 3, may include other processing hardware such as a database or multi-level cache, or may have a configuration different from that shown in FIG. 3.
  • The memory 104 can be used to store software programs and modules of application software, such as the program instructions/modules corresponding to the data processing method in the embodiments of this specification. The processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 104,
  • that is, it implements the data processing method for insurance fraud identification described above.
  • Memory 104 may include high-speed random access memory, and may also include non-volatile memory such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory.
  • Memory 104 may further include memory located remotely from processor 102, which may be connected to server 10 via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • the transmission module 106 is configured to receive or transmit data via a network.
  • Specific examples of the network described above may include a wireless network provided by the communication provider of server 10.
  • In one example, the transmission module 106 includes a Network Interface Controller (NIC) that can connect to other network devices through a base station in order to communicate with the Internet.
  • In another example, the transmission module 106 can be a Radio Frequency (RF) module for communicating with the Internet wirelessly.
  • the present specification also provides a data processing apparatus for insurance fraud identification.
  • The apparatus may include a system (including a distributed system), software (applications), modules, components, servers, clients, and the like that use the methods described in the embodiments of this specification, combined with the necessary implementation hardware.
  • The processing apparatus in one embodiment provided by this specification is as described in the following embodiments.
  • Although the apparatus described in the following embodiments is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
  • FIG. 4 is a schematic structural diagram of a module of an embodiment of a data processing apparatus for insurance fraud identification provided by the present specification, which may include:
  • the data acquisition module 101 may be configured to acquire relationship association data of the group to be identified;
  • the feature calculation module 102 may be configured to construct multi-degree relationship network graph data of the group to be identified based on the relationship association data, and to extract person characteristic data of the group to be identified;
  • the fraud identification module 103 may be configured to process the multi-degree relationship network graph data and the person characteristic data of the group to be identified with the constructed supervised learning algorithm, and to determine a fraud output result for the group to be identified;
  • wherein the supervised learning algorithm includes a data relationship model obtained by training on the multi-degree relationship network data and person characteristic data of a selected target group, with the marked historical fraudsters as sample data.
  • the relationship association data may include at least one of the following:
  • The fraud output result that the fraud identification module 103 determines for the group to be identified includes an output that a single person to be identified is a fraudster, or the probability that the person is a fraudster.
  • the selected target population includes a collection of persons applying for claims and insured persons.
  • the person characteristic data includes feature data extracted by at least one of a user registration account number, transaction data, and behavior data associated with the insurance behavior.
  • FIG. 5 is another embodiment of the apparatus. As shown in FIG. 5, the fraud identification module 103 includes:
  • the feature learning module 1031 may be configured to perform, by using the selected supervised learning algorithm, first relationship network learning on the relationship features between a target person and other persons in the multi-degree relationship network data of the target group, and second self-attribute learning on the target person's own characteristic data;
  • the relationship establishing module 1032 may be configured to establish a relationship model by using the feature data obtained from the first relationship network learning and the second self-attribute learning as independent variables of the supervised learning algorithm and the labeled historical fraudsters as the dependent variable;
  • the model training module 1033 may be configured to determine the constructed supervised learning algorithm when the output of the relationship model reaches a preset accuracy rate.
  • after the parameters in the model have been trained iteratively, the model can be used online once its output meets the accuracy requirement.
  • the server or client provided by the embodiments of the present specification may be implemented by a processor executing corresponding program instructions in a computer, for example implemented on a PC or server in C++ under the Windows operating system, implemented with the corresponding application design languages and the necessary hardware on other systems such as Linux, or implemented as processing logic based on a quantum computer. Accordingly, the present specification also provides a data processing device for insurance fraud identification, which may specifically include a processor and a memory for storing processor-executable instructions, the processor implementing the following when executing the instructions:
  • the supervised learning algorithm includes a data relationship model obtained by training on sample data consisting of the multi-degree relationship network data and person characteristic data of a selected target group, together with labeled historical fraudsters.
  • the above instructions may be stored in a variety of computer readable storage media.
  • the computer readable storage medium may include physical means for storing information, which may be digitized and stored in a medium utilizing electrical, magnetic or optical means.
  • the computer readable storage medium of this embodiment may include: devices that store information by electrical means, such as various types of memory (RAM, ROM, etc.); devices that store information by magnetic means, such as hard disks, floppy disks, magnetic tape, magnetic core memory, bubble memory, and USB flash drives; and devices that store information optically, such as CDs or DVDs.
  • other readable storage media are also possible, such as quantum memories, graphene memories, and the like.
  • the processing device may specifically be an insurance anti-fraud identification server provided by an insurance server or a third-party service organization; the server may be a standalone server, a server cluster, a server in a distributed system, or a combination of a server that processes data with other associated system servers that perform data processing.
  • Accordingly, embodiments of the present specification also provide a specific server product, the server including at least one processor and a memory for storing processor-executable instructions, the processor implementing the following when executing the instructions:
  • the supervised learning algorithm includes a data relationship model obtained by training on sample data consisting of the multi-degree relationship network data and person characteristic data of a selected target group, together with labeled historical fraudsters.
  • the apparatus, the processing device, and the server described in the foregoing embodiments of the present specification may further include other embodiments according to the description of the related method embodiments.
  • although the embodiments of the present specification describe data acquisition, storage, interaction, calculation, and judgment operations such as the types of relationship-related data collected, the range of the target group selected during training, and the way the probability of insurance fraud is calculated,
  • embodiments of the present specification are not limited to situations that must conform to industry communication standards, standard supervised or unsupervised model processing, communication protocols, standard data models/templates, or the situations described in the embodiments of the specification.
  • Implementations slightly modified from certain industry standards, or from the embodiments described above by using a custom approach, can also achieve the same, equivalent, or similar effects as the above embodiments, or effects that are predictable after such variation.
  • Embodiments obtained by applying such modified or varied data acquisition, storage, judgment, and processing methods may still fall within the scope of alternative embodiments of the present specification.
  • PLD Programmable Logic Device
  • FPGA Field Programmable Gate Array
  • HDL Hardware Description Language
  • the controller can be implemented in any suitable manner; for example, the controller can take the form of a microprocessor or processor and a computer readable medium storing computer readable program code (e.g., software or firmware) executable by the (micro)processor.
  • examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320; a memory controller can also be implemented as part of the memory's control logic.
  • the same functions can be implemented by logically programming the controller in the form of logic gates, switches, ASICs, programmable logic controllers, and embedded microcontrollers.
  • Such a controller can therefore be considered a hardware component, and the means for implementing various functions included therein can also be considered as a structure within the hardware component.
  • alternatively, the means for implementing various functions can be considered both as software modules that implement a method and as structures within a hardware component.
  • the processing device, device, module or unit set forth in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function.
  • a typical implementation device is a computer.
  • the computer can be, for example, a personal computer, a laptop computer, a vehicle-mounted human-machine interaction device, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
  • the above devices are described as being separately divided into various modules by function.
  • the functions of the modules may be implemented in one or more pieces of software and/or hardware, or a module that implements a given function may be implemented by a combination of multiple sub-modules or sub-units.
  • the device embodiments described above are merely illustrative.
  • the division of the units is only a logical function division.
  • in actual implementation there may be other division manners; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the computer program instructions can also be stored in a computer readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture comprising an instruction apparatus that implements the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, so that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
  • a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • the memory may include non-persistent memory, random access memory (RAM), and/or non-volatile memory in a computer readable medium, such as read only memory (ROM) or flash memory.
  • Memory is an example of a computer readable medium.
  • Computer readable media includes both permanent and non-persistent, removable and non-removable media.
  • Information storage can be implemented by any method or technology.
  • the information can be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cassettes, magnetic tape storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
  • computer readable media do not include transitory computer readable media, such as modulated data signals and carrier waves.
  • embodiments of the present specification can be provided as a method, system, or computer program product.
  • embodiments of the present specification can take the form of an entirely hardware embodiment, an entirely software embodiment or a combination of software and hardware.
  • embodiments of the present specification can take the form of a computer program product embodied on one or more computer usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code.
  • Embodiments of the present description can be described in the general context of computer-executable instructions executed by a computer, such as a program module.
  • program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types.
  • Embodiments of the present specification can also be practiced in distributed computing environments where tasks are performed by remote processing devices that are connected through a communication network.
  • program modules can be located in both local and remote computer storage media including storage devices.
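The feature learning, relationship establishing, and model training modules (1031 to 1033) described above can be illustrated with a minimal sketch. The specification does not name a particular supervised algorithm, so logistic regression trained by stochastic gradient descent is used here purely as an assumption; the two features, the sample values, the learning rate, and the preset accuracy of 0.9 are likewise hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, features):
    """Probability that a person is a fraudster under the current model."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)

def accuracy(weights, bias, samples, labels):
    correct = sum((predict(weights, bias, x) >= 0.5) == (y == 1)
                  for x, y in zip(samples, labels))
    return correct / len(samples)

def train_until_accurate(samples, labels, preset_accuracy=0.9,
                         learning_rate=0.5, max_epochs=5000):
    """Iterate over the labeled samples until the preset accuracy is reached."""
    weights, bias = [0.0] * len(samples[0]), 0.0
    for epoch in range(1, max_epochs + 1):
        for x, y in zip(samples, labels):
            # Derivative of the log loss with respect to the linear score.
            error = predict(weights, bias, x) - y
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
        if accuracy(weights, bias, samples, labels) >= preset_accuracy:
            return weights, bias, epoch
    raise RuntimeError("preset accuracy not reached; the model is not ready")

# Independent variables (hypothetical): [share of labeled fraudsters among
# nearby relations, recently-registered-account flag].
samples = [[0.8, 1.0], [0.6, 1.0], [0.7, 0.0], [0.1, 0.0], [0.0, 1.0], [0.2, 0.0]]
# Dependent variable: 1 for a labeled historical fraudster, 0 otherwise.
labels = [1, 1, 1, 0, 0, 0]

weights, bias, epochs = train_until_accurate(samples, labels)
print(f"model accepted after {epochs} epochs")
```

In this sketch the relationship-network feature and the self-attribute feature are simply concatenated into one input vector; a real system would evaluate accuracy on held-out data rather than on the training samples.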

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Data Mining & Analysis (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Engineering & Computer Science (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

A data processing method, apparatus, and device for insurance fraud identification, and a server. Multi-degree relationship network graph data of a group are built on the basis of multi-dimensional relationship-related data of insurance applicants and insured persons, so that the relationship network among people can be mined more deeply, identification efficiency is improved, and identification coverage is widened. A supervised learning model is also built using the characteristic data of the fraudsters themselves and is used to learn the relationship network characteristics and personal characteristics of fraudsters. Accomplice fraudsters show obvious multi-degree relationship characteristics in the relationship network, and their personal characteristics frequently resemble one another, so fraudsters can be identified more effectively and efficiently by using the described method.

Description

Data processing method, apparatus, device and server for insurance fraud identification
Technical field
The embodiments of the present specification belong to the technical field of computer data processing for insurance anti-fraud identification, and in particular relate to a data processing method, apparatus, processing device, and server for insurance fraud identification.
Background
Insurance provides financial, personal, and other protection in return for payment of a specified premium. With economic development and growing public awareness of insurance, demand for insurance services keeps increasing.
However, because insurance has a certain economic leverage effect, a large amount of insurance fraud has appeared in the market. Fraudsters usually stage insured incidents deliberately in order to obtain payouts from insurance companies. Insurance fraud is currently trending toward specialization and organized teams, which seriously harms the healthy development of the insurance industry and damages the interests of insurers and the public. The traditional way of identifying insurance fraud relies mainly on staff manually applying a few simple rules to identify historical fraudsters and predicting fraud risk from the behavior of those historical fraudsters. As fraudsters and fraud groups become increasingly well concealed, this approach does not readily uncover group fraud quickly, the manual review workload is heavy, and identification efficiency is low.
Therefore, the industry urgently needs a way to identify insurance fraudsters more effectively and efficiently.
Summary of the invention
The embodiments of the present specification aim to provide a data processing method, apparatus, processing device, and server for insurance fraud identification that use the relationship network data between persons and the persons' own characteristics to identify insurance fraudsters more effectively.
The data processing method, apparatus, processing device, and server for insurance fraud identification provided by the embodiments of the present specification are implemented as follows:
acquiring relationship-related data of a group to be identified;
constructing multi-degree relationship network graph data of the group to be identified based on the relationship-related data, and extracting person characteristic data of the group to be identified;
identifying the multi-degree relationship network graph data and the person characteristic data of the group to be identified by using a constructed supervised learning algorithm, and determining an insurance fraud output result for the group to be identified; the supervised learning algorithm includes a data relationship model obtained by training on sample data consisting of the multi-degree relationship network data and person characteristic data of a selected target group, together with labeled historical fraudsters.
A data processing apparatus for insurance fraud identification, comprising:
a data acquisition module, configured to acquire relationship-related data of a group to be identified;
a feature calculation module, configured to construct multi-degree relationship network graph data of the group to be identified based on the relationship-related data, and to extract person characteristic data of the group to be identified;
a fraud identification module, configured to identify the multi-degree relationship network graph data and the person characteristic data of the group to be identified by using a constructed supervised learning algorithm, and to determine an insurance fraud output result for the group to be identified; the supervised learning algorithm includes a data relationship model obtained by training on sample data consisting of the multi-degree relationship network data and person characteristic data of a selected target group, together with labeled historical fraudsters.
A processing device, comprising a processor and a memory for storing processor-executable instructions, wherein the processor, when executing the instructions, implements:
acquiring relationship-related data of a group to be identified;
constructing multi-degree relationship network graph data of the group to be identified based on the relationship-related data, and extracting person characteristic data of the group to be identified;
identifying the multi-degree relationship network graph data and the person characteristic data of the group to be identified by using a constructed supervised learning algorithm, and determining an insurance fraud output result for the group to be identified; the supervised learning algorithm includes a data relationship model obtained by training on sample data consisting of the multi-degree relationship network data and person characteristic data of a selected target group, together with labeled historical fraudsters.
A server, comprising at least one processor and a memory for storing processor-executable instructions, wherein the processor, when executing the instructions, implements:
acquiring relationship-related data of a group to be identified;
constructing multi-degree relationship network graph data of the group to be identified based on the relationship-related data, and extracting person characteristic data of the group to be identified;
identifying the multi-degree relationship network graph data and the person characteristic data of the group to be identified by using a constructed supervised learning algorithm, and determining an insurance fraud output result for the group to be identified; the supervised learning algorithm includes a data relationship model obtained by training on sample data consisting of the multi-degree relationship network data and person characteristic data of a selected target group, together with labeled historical fraudsters.
The data processing method, apparatus, processing device, and server for insurance fraud identification provided by the embodiments of the present specification construct multi-degree relationship network graph data of a group based on multi-dimensional relationship-related data of insurance applicants and insured persons, which allows the relationship network between persons to be mined more deeply and improves the efficiency and coverage of identification. In addition, the fraudsters' own characteristic data are combined to build a supervised learning model that learns both the relationship network characteristics and the personal characteristics of fraudsters. Members of a fraud gang not only show fairly distinct multi-degree relationship characteristics in the relationship network; their personal characteristics also often resemble one another. The method provided by the embodiments of the present specification can therefore identify insurance fraudsters more effectively and efficiently and improve identification processing efficiency.
Description of the drawings
To describe the technical solutions in the embodiments of the present specification or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some of the embodiments recorded in the present specification, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of an embodiment of a data processing method for insurance fraud identification provided by the present specification;
FIG. 2 is a schematic diagram of a process for constructing a supervised identification model provided by the present specification;
FIG. 3 is a block diagram of the hardware structure of an insurance fraud identification processing server provided by the present specification;
FIG. 4 is a schematic diagram of the module structure of a data processing apparatus for insurance fraud identification provided by the present specification;
FIG. 5 is a schematic diagram of the module structure of the fraud identification module in a data processing apparatus for insurance fraud identification provided by the present specification.
Detailed description
To help those skilled in the art better understand the technical solutions in the present specification, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present specification. All other embodiments obtained by a person of ordinary skill in the art based on one or more embodiments of the present specification without creative effort shall fall within the scope of protection of the embodiments of the present specification.
Birds of a feather flock together. Insurance fraud usually requires several people working together to make the fraud more convincing. In many cases fraudsters also gather on the basis of acquaintance relationships, or share fairly distinct common characteristics or network relationship characteristics in some dimension. Examples include fraud carried out jointly by relatives, fraud groups organized like pyramid schemes with clear hierarchical levels, and social or student groups recruited by an experienced historical fraudster acting as ringleader. In the embodiments provided by the present specification, a multi-degree relationship network is constructed starting from multiple kinds of relationship-related data of a target group that includes insurance applicants and claim applicants (the data of the relationship network graph may be called multi-degree relationship graph data). This mines the relationship network among the target group in depth and addresses the low coverage and low identification rate of conventional approaches, which only identify historical fraudsters and the first-degree relationships directly connected to them. The solutions provided by the embodiments also take into account the fraudsters' own characteristic attributes; for example, fraudsters often register accounts with false information, have recently registered accounts, and use an account mainly for insurance business after registering it. The embodiments combine the relationship characteristic data and the personal characteristic data of fraud groups, label the historical fraudsters, and train a supervised model, so that whether insurance fraud exists in a group to be identified can be calculated or identified.
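To make the difference between first-degree and multi-degree relationships concrete, the following minimal sketch computes how many relationship hops separate each person from a labeled historical fraudster. It assumes the relationship network is given as undirected pairs of person identifiers; the function name, the `max_degree` parameter, and the sample data are hypothetical and are not taken from the specification.

```python
from collections import deque

def relationship_degrees(edges, source, max_degree=3):
    """Return {person: degree} for everyone within max_degree hops of source."""
    # Build an undirected adjacency map from (person_a, person_b) pairs.
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    # Breadth-first search visits people in order of increasing degree.
    degrees = {source: 0}
    queue = deque([source])
    while queue:
        person = queue.popleft()
        if degrees[person] == max_degree:
            continue  # do not expand beyond the requested depth
        for neighbor in adjacency.get(person, ()):
            if neighbor not in degrees:
                degrees[neighbor] = degrees[person] + 1
                queue.append(neighbor)
    del degrees[source]
    return degrees

# "A" is a labeled historical fraudster; the others form a chain of relations.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
print(relationship_degrees(edges, "A", max_degree=3))
```

A check limited to first-degree relationships would reach only B here, whereas the multi-degree search also reaches C and D and stops before E, which lies four hops away.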
An embodiment of the present specification is described below using a specific application scenario of insurance fraud identification processing as an example. Specifically, FIG. 1 is a schematic flowchart of an embodiment of the data processing method for insurance fraud identification provided by the present specification. Although the present specification provides method operation steps or apparatus structures as shown in the following embodiments or drawings, the method or apparatus may, based on routine practice or without creative effort, include more operation steps or module units, or fewer after some are merged. For steps or structures that have no logically necessary causal relationship, the execution order of the steps or the module structure of the apparatus is not limited to that shown in the embodiments or drawings of the present specification. When the method or module structure is applied in an actual apparatus, server, or terminal product, it may be executed sequentially or in parallel according to the embodiments or drawings (for example, in an environment with parallel processors or multi-threaded processing, or even in a distributed-processing or server-cluster environment).
Of course, the description of the following embodiments does not limit other technical solutions that can be extended from the present specification. For example, in other implementation scenarios, the embodiments provided by the present specification can also be applied to fund fraud identification, product transactions, service transactions, and the like. In a specific embodiment, as shown in FIG. 1, the data processing method for insurance fraud identification provided by the present specification may include:
S0: acquiring relationship-related data of a group to be identified;
S2: constructing multi-degree relationship network graph data of the group to be identified based on the relationship-related data, and extracting person characteristic data of the group to be identified;
S4: identifying the multi-degree relationship network graph data and the person characteristic data of the group to be identified by using a constructed supervised learning algorithm, and determining an insurance fraud output result for the group to be identified; the supervised learning algorithm includes a data relationship model obtained by training on sample data consisting of the multi-degree relationship network data and person characteristic data of a selected target group, together with labeled historical fraudsters.
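Steps S0, S2, and S4 can be sketched end to end as follows. This is only an illustration under stated assumptions: the in-memory sample data, the three features, the 30-day threshold for a recently registered account, and the fixed weights standing in for a trained model are all hypothetical, and a real system would load the trained data relationship model instead.

```python
import math

def acquire_relationship_data():
    """S0: return relationship-related data (a hypothetical in-memory sample)."""
    return {
        "links": [("p1", "p2"), ("p2", "p3"), ("p4", "p5")],
        "account_age_days": {"p2": 12, "p3": 5, "p4": 700, "p5": 650},
        "labeled_fraudsters": {"p1"},
    }

def build_graph_and_features(data):
    """S2: build the relationship graph and derive per-person features."""
    graph = {}
    for a, b in data["links"]:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    fraudsters = data["labeled_fraudsters"]
    features = {}
    for person, age_days in data["account_age_days"].items():
        first = graph.get(person, set())
        second = set().union(*(graph[n] for n in first)) - first - {person}
        features[person] = [
            float(len(first & fraudsters)),   # labeled fraudsters at first degree
            float(len(second & fraudsters)),  # labeled fraudsters at second degree
            1.0 if age_days < 30 else 0.0,    # recently registered account
        ]
    return graph, features

def identify(features, weights=(2.0, 1.0, 1.5), bias=-2.0):
    """S4: score each person with made-up weights standing in for a trained model."""
    scores = {}
    for person, x in features.items():
        z = sum(w * v for w, v in zip(weights, x)) + bias
        scores[person] = 1.0 / (1.0 + math.exp(-z))
    return scores

data = acquire_relationship_data()
_, features = build_graph_and_features(data)
scores = identify(features)
flagged = sorted(p for p, s in scores.items() if s >= 0.5)
print(flagged)  # ['p2', 'p3']
```

Person p3 has no direct link to the labeled fraudster p1 and is flagged only because of the second-degree relationship combined with the new-account attribute, which is the kind of case a first-degree check would miss.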
In the application scenario of this embodiment, the incident reporting, assessment, and payout stages of an insurance claim mainly concern the claim applicant. The embodiments of the present specification consider that, in some real scenarios, the motive for fraud exists from the moment the policy is taken out, the fraudster's main purpose being to claim the insurance payout; in other cases the motive arises only after the policy is taken out. The insured person is the main subject of the insured incident; for example, fraudsters in a group of people from the same hometown may deliberately stage an accident involving the insured person. This embodiment therefore selects the set of claim applicants and insured persons as the target group when identifying whether insurance fraud exists. Accordingly, in some embodiments of the method described in the present specification, when a target group is selected for acquiring and learning relationship characteristic data, the target group may include the set of claim applicants and insured persons. It should be noted that in some cases the claim applicant may be the policyholder: for example, a father insures his son and names himself as beneficiary, and after an incident the father is the claim applicant. In other cases the claim applicant may be the insured person: for example, a policyholder insures himself and is his own beneficiary. The terms claim applicant and insured person as used above are names for different roles in the insurance business and do not restrict them to being different persons; in some implementation scenarios the claim applicants and insured persons may be wholly or partly the same.
Of course, in other embodiments the target group may be selected from one or more of claim applicants, policyholders, insured persons, beneficiaries, and the like.
The relationship association data may include data of multiple dimensions associated with the persons in the target population, such as household registration, age, kinship or classmate relationships between persons, policy data, and insurance claim data. Which categories of data are used may be determined according to the actual application scenario; generally, operators can use the data that fraudulent behavior is likely to involve as the basis for collecting relationship association data. In one embodiment provided by this specification, the relationship association data may include at least one of the following:
Social relationship data, terminal data, data on terminal applications and application account operations, behavior data associated with insurance activity, basic personal attribute data, and geographic location data.
The social relationship data may include social relationships between persons in the target population, such as cousins, teacher and student, family members, classmates, or leader and subordinate. The terminal data may include the brand, model, and category of the communication devices that persons use; in some fraud scenarios the persons involved use mobile phones of the same brand. The data on terminal applications and application account operations can be used to determine whether persons use the same application, or log in to the application on different terminals with the same account to carry out insurance fraud; in some scenarios several subordinates operate the application on their terminals under the unified direction of a ringleader. The behavior data associated with insurance activity may include the target person's policy purchases, claims, payout amounts, and similar data. The basic personal attribute data may include the age, gender, occupation, household registration, and so on of the policyholder or claim applicant. The geographic location data may include the current location of persons in the target population, or areas they have visited or stayed in previously. Of course, the relationship association data of each dimension described above may be defined differently or contain more or fewer data categories, and relationship association data of other dimensions may also be included, such as consumption information, credit records, or administrative penalty information. One or more of these kinds of data may be collected in practice.
The person feature data may include data associated with an individual person, such as gender, age, the registration time of an insurance service account or terminal application account, credit record, and consumption. It may also include behavior data associated with insurance activity, such as repeated policy purchases, frequent claims, or whether payout amounts are normal. It may further include transaction data for other goods or services, such as sustained large expenditures, repeated motor insurance claims, purchases of several mobile phones, or registration of multiple communication or social accounts.
The person feature data used in a specific person feature calculation may be one of the above or a combination of several, so as to identify the person's own characteristics. Accordingly, in another embodiment, the person feature data may include feature data extracted from at least one of: user registered accounts, transaction data, and behavior data associated with insurance activity.
Members of a fraud ring are usually linked by a fairly tight relationship network. In this embodiment, the multi-dimensional relationship association data obtained above can be used to construct multi-degree relationship network graph data for the target population. The multi-degree relationship network graph data may include a relationship network graph generated from the relationship chains between different persons that are established from the relationship association data; the relationship chain data between persons on that graph is the multi-degree relationship network graph data. A relationship chain represents the relationship between two persons, for example A and B are boss and employee, or A and C are family members. A direct relationship between two persons may be called a first-degree relationship. The "multi-degree" in the multi-degree relationship network graph data may include new associations between persons built on first-degree relationships: for example, a second-degree relationship between a first person and a third person, built from the first-degree relationship between the first and second persons and the first-degree relationship between the second and third persons. Further, a third-degree relationship between the first person and a fourth person can be built from additional first-degree relationships, and so on.
For example, suppose A is an individual and B is A's brother-in-law, so A and B have a first-degree social relationship. A previously had no social relationship with C, the boss of B's company. In the embodiments of this specification, however, because B is both A's brother-in-law and a subordinate of C, a second-degree relationship is established between A and C.
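The A, B, C example above can be sketched in code. The following is a minimal illustration, assuming first-degree relationships are stored as undirected edges and that the degree of a relationship is the smallest number of first-degree links between two persons; the function names `build_graph` and `relationship_degrees` and the relation labels are hypothetical and do not come from the specification.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (person, person, relation) triples."""
    graph = {}
    for a, b, relation in edges:
        graph.setdefault(a, {})[b] = relation
        graph.setdefault(b, {})[a] = relation
    return graph


def relationship_degrees(graph, start, max_degree=3):
    """Breadth-first search from `start`.

    Returns a dict mapping each person reachable within `max_degree`
    first-degree links to the smallest number of links needed.
    """
    degrees = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if degrees[person] == max_degree:
            continue  # do not expand beyond the requested degree
        for neighbour in graph.get(person, {}):
            if neighbour not in degrees:
                degrees[neighbour] = degrees[person] + 1
                queue.append(neighbour)
    del degrees[start]  # the start person is not related to itself
    return degrees


# First-degree relationships from the example: B is A's brother-in-law,
# and B is a subordinate of company boss C.
edges = [("A", "B", "brother-in-law"), ("B", "C", "subordinate-of")]
graph = build_graph(edges)
degrees = relationship_degrees(graph, "A")
print(degrees)  # {'B': 1, 'C': 2}
```

Here C is reported at degree 2 from A, matching the second-degree relationship described in the example. Limiting `max_degree` keeps the search local, which may matter on large populations.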
Besides the social relationships described above, other types of multi-degree relationship network graph data can be formed according to the relationship association data used or the relationship-building requirements, for example whether persons come from the same hometown, use the same communication tool, or log in to a particular application on their terminals during the same fixed time period. Of course, in a specific implementation of building the relationship network from the relationship association data, the rules that determine when a relationship chain is established can be designed in advance.
Based on the constructed multi-degree relationship network graph data and the extracted person feature data, this embodiment can use a supervised learning algorithm to learn the relationship features and individual features of fraudsters, so that an effective identification model can be built.
Common machine learning methods fall mainly into supervised learning and unsupervised learning. Supervised learning, as used here for classification, typically works on a labeled data set: an optimal model is trained from existing training samples (known inputs and their corresponding outputs), where the model belongs to some set of functions and "optimal" means best under a chosen evaluation criterion. The model then maps any input to a corresponding output, and a simple decision on that output yields a classification, which gives the model the ability to classify unseen data. Typical examples of supervised learning are KNN (k-nearest neighbors) and SVM (support vector machine). Given a sufficient number of training samples, a supervised learning algorithm can produce more accurate output than an unsupervised algorithm.
Depending on the supervised learning algorithm selected, the specific processing of relationship features and individual features is designed according to the type of algorithm and the identification requirements. For example, a supervised graph algorithm such as Structure2vec may be used. In one embodiment, the constructed supervised learning algorithm includes:
S40: using the selected supervised learning algorithm, perform first relationship-network learning on the relationship features between a target person and other persons in the multi-degree relationship network data of the target population, and perform second self-attribute learning based on the target person's own feature data;
S42: use the feature data obtained from the first relationship-network learning and the second self-attribute learning as the independent variables of the supervised learning algorithm, and use the labeled historical fraudsters as the dependent variable, to build a relationship model;
S44: determine the constructed supervised learning algorithm when the output of the relationship model reaches a preset accuracy.
FIG. 2 is a schematic diagram of the process of constructing a supervised learning algorithm in an embodiment provided by this specification.
In the example shown in FIG. 2, the Structure2vec supervised graph algorithm can be used. First, it learns the relationship features of the target person and the target person's neighbors (for example, how many people the person is connected to and whether the person is connected to known fraudsters) and the target person's own features (such as gender and age); these features serve as the x variables of the model. Second, the historical label indicating whether the person is a fraudster serves as the y variable. Finally, a model relating y to x is built, so that y can be predicted from x alone.
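The way x and y are assembled can be sketched as follows. This is not Structure2vec: as a stand-in for learned graph embeddings, the sketch uses two hand-written neighbor counts (number of connected persons and number of connected known fraudsters), and all names, attributes, and toy data are assumptions made for illustration only.

```python
def neighbour_features(graph, labels, person):
    """Relationship features of one person.

    Returns [number of neighbours, number of neighbours labeled as fraudsters].
    These counts stand in for the learned relationship features.
    """
    neighbours = graph.get(person, set())
    known_fraud = sum(1 for n in neighbours if labels.get(n) == 1)
    return [len(neighbours), known_fraud]


def build_training_set(graph, attributes, labels):
    """Assemble (X, y) for a supervised learner.

    x = relationship features + the person's own attributes,
    y = historical label (1 = fraudster, 0 = normal).
    """
    X, y = [], []
    for person in sorted(labels):
        own = attributes[person]
        row = neighbour_features(graph, labels, person) + [own["age"], own["is_male"]]
        X.append(row)
        y.append(labels[person])
    return X, y


# Toy data, invented for this sketch.
graph = {"p1": {"p2", "p3"}, "p2": {"p1"}, "p3": {"p1"}, "p4": set()}
attributes = {
    "p1": {"age": 35, "is_male": 1},
    "p2": {"age": 41, "is_male": 0},
    "p3": {"age": 29, "is_male": 1},
    "p4": {"age": 52, "is_male": 0},
}
labels = {"p1": 1, "p2": 1, "p3": 0, "p4": 0}

X, y = build_training_set(graph, attributes, labels)
print(X)  # [[2, 1, 35, 1], [1, 1, 41, 0], [1, 1, 29, 1], [0, 0, 52, 0]]
print(y)  # [1, 1, 0, 0]
```

The resulting `X` and `y` could be passed to any supervised classifier. Only the neighbors' labels enter a person's features, never the person's own label, so that y is not leaked into x.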
In the application scenario of this embodiment, the final identification of fraud can be made for a single person. That is, after the supervised learning algorithm has learned the relationship features of ring fraud, combined with the fraudsters' own features, it can directly output whether a given person to be identified is a fraudster, or the probability that the person is a fraudster. For example, the person may be labeled as a fraudster or a normal person, or assigned a probability of being a fraudster.
Of course, labeling a person as a fraudster here is an identification result based on relationship features and individual features, and can serve as a preliminary basis and reference for deciding whether the person is a fraudster. The final decision may be made by an operator's judgment or in combination with other calculations.
The data processing method for insurance fraud identification provided by this embodiment can construct multi-degree relationship network graph data for a population from multi-dimensional relationship association data of policyholders and insured persons, which allows the relationship network between persons to be mined in greater depth and improves the efficiency and coverage of identification. Combined with the fraudsters' own feature data, a supervised learning model is built to learn both the relationship network features and the individual features of fraudsters. Members of a fraud ring not only show pronounced, multi-degree relationship features in the relationship network; their individual features are also often similar. The method provided by the embodiments of this specification can therefore identify fraudsters more effectively and efficiently.
In another embodiment of the method provided by this specification, data on historical fraudsters can be combined with the multi-degree relationship network graph data to identify fraudsters. Specifically, in this embodiment the relationship association data may further include: list data of historical fraudsters.
This embodiment adds data on the historical fraud population, so that the degree of involvement of historical fraudsters is considered when a classified community is analyzed. In general, the higher the relationship concentration of historical fraudsters in a classified community, the more likely the persons in that community are to commit fraud. The relationship concentration described in this embodiment may reflect the degree of involvement of historical fraudsters, and specifically may include the number of historical fraudsters in the classified community, their proportion, and how densely they are connected to the other persons. As an example of relationship density: in a risk community of 10 persons, 2 historical fraudsters are relatives (at one or more degrees) of 6 other persons and classmates of the other 2, which suggests a fraud ring that may operate like a pyramid scheme. The relationship concentration can be calculated in different ways, for example from the number of historical fraudsters, their proportion, or the relationship network. In another embodiment provided by this specification, the relationship concentration can be calculated from two indicators, the size of the population to be identified and the number of historical fraudsters, and can serve as a value measuring the probability of fraud. Specifically, this may include:
taking the logarithm of the number of persons in the population to be identified as a first factor;
taking the proportion of historical fraudsters among the persons to be identified as a second factor;
taking the product of the first factor and the second factor as the group fraud probability of the population to be identified.
The individual fraud probability calculated from a person's own features can then be combined with the group fraud probability to determine the final output probability that the group, or a single person, is committing fraud. Alternatively, the group fraud probability and the individual fraud probability may each be used separately, without being combined.
For example, in a specific implementation, the probability of community fraud can be calculated as follows:
RiskDegree = log(total number of persons in the classified community) * number of historical fraudsters / total number of persons in the classified community.
Of course, other calculation methods, variations, or transformations may also be used, such as taking the natural logarithm; these are not limited or detailed here.
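The RiskDegree formula can be written as a small function. This is a sketch under stated assumptions: the specification does not fix the base of the logarithm, so base 10 is assumed as the default here and the base is left swappable; the function name `risk_degree` and the input checks are illustrative additions.

```python
import math


def risk_degree(community_size, historical_fraudsters, log=math.log10):
    """RiskDegree = log(community size) * historical fraudsters / community size.

    `log` is the logarithm function to use; base 10 is an assumption,
    and math.log can be passed for the natural-logarithm variant.
    """
    if community_size <= 0:
        raise ValueError("community_size must be positive")
    if not 0 <= historical_fraudsters <= community_size:
        raise ValueError("historical_fraudsters must be between 0 and community_size")
    first_factor = log(community_size)                         # size term
    second_factor = historical_fraudsters / community_size     # fraudster proportion
    return first_factor * second_factor


# A community of 10 persons containing 2 historical fraudsters.
score_base10 = risk_degree(10, 2)
score_natural = risk_degree(10, 2, log=math.log)
print(score_base10)
print(score_natural)
```

With base 10, the example community scores log10(10) * 2/10 = 0.2. The value is not bounded by 1 (for instance, a community of 100 persons who are all historical fraudsters scores 2.0 in base 10), so it may be more natural to treat it as a ranking score, or to rescale it, than to read it as a probability in the strict sense.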
The embodiment above uses data on historical fraudsters to identify fraud groups. In another embodiment provided by this specification, the relationship network features among the members of a population can be used to determine whether they are fraudsters. Specifically, this may include: determining the network structure features of the relationships between persons in the population;
if the network structure features match a preset fraud network structure, marking the population as a fraud group.
The approach above can be used when training the supervised learning algorithm, in which case the population is the target population. When identifying persons to be identified, the population is the population to be identified.
The network structure features may be based on information about the persons in the population, relationship network information between persons, and so on. The relationship network information here may be the first-degree information described above and may also include the constructed multi-degree information.
An algorithm can be used to identify and analyze the characteristics of the relationship network in a community; if the network structure features match those of a fraud ring, the community can be marked as a fraud group. In one example, the relationship network in a population may have a structure such as a "spherical network" or a "pyramid network". A "pyramid network" resembles a pyramid-scheme organization, with relationships arranged layer by layer, and is more likely to be fraudulent; a "spherical network" is one in which members are connected to each other, and may be a decentralized fraud organization.
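One possible way to test for these two structures is sketched below. The specification does not say how the structures are detected, so the criteria are assumptions chosen for illustration: a connected graph with exactly n - 1 edges (a tree) is treated as "pyramid", and a connected graph whose edge density reaches a threshold is treated as "spherical". The function name and the threshold value are hypothetical.

```python
def classify_structure(graph, dense_threshold=0.8):
    """Classify an undirected relationship graph.

    `graph` maps each person to the set of directly related persons;
    every person must appear as a key and edges must be symmetric.
    Returns "pyramid", "spherical", or "unknown".
    """
    n = len(graph)
    if n < 3:
        return "unknown"  # too small to tell the structures apart

    # Each undirected edge is counted twice in the adjacency sets.
    edge_count = sum(len(neighbours) for neighbours in graph.values()) // 2

    # Depth-first search to check that everyone is reachable.
    start = next(iter(graph))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    connected = len(seen) == n

    density = edge_count / (n * (n - 1) / 2)  # share of possible edges present
    if connected and density >= dense_threshold:
        return "spherical"
    if connected and edge_count == n - 1:
        return "pyramid"
    return "unknown"


# Layered hierarchy: one head, two mid-level members, three at the bottom.
pyramid = {
    "head": {"m1", "m2"},
    "m1": {"head", "w1", "w2"},
    "m2": {"head", "w3"},
    "w1": {"m1"},
    "w2": {"m1"},
    "w3": {"m2"},
}
# Four members who are all directly related to one another.
sphere = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c"},
}
print(classify_structure(pyramid))  # pyramid
print(classify_structure(sphere))   # spherical
```

Real relationship networks are rarely exact trees or near-complete graphs, so a practical detector would likely need softer criteria; the sketch only shows that the two named structures can be told apart by simple graph statistics.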
The data processing method for insurance fraud identification provided by the embodiments of this specification uses relationship association data that approximates the actual relationship network to support mining by relationship network algorithms, enabling computation over multi-degree relationship network data. Multi-degree relationship network graph data for a population is constructed from multi-dimensional relationship association data of policyholders and insured persons, which allows the relationship network between persons to be mined in greater depth and improves the efficiency and coverage of identification. Combined with the fraudsters' own feature data, a supervised learning model is built to learn both the relationship network features and the individual features of fraudsters. Members of a fraud ring not only show pronounced, multi-degree relationship features in the relationship network; their individual features are also often similar. The method provided by the embodiments of this specification can therefore identify fraudsters more effectively and efficiently.
The method described above can be used for insurance fraud identification on the client side, for example in an anti-fraud application installed on a mobile terminal or in an insurance service provided by a payment application. The client may be a PC (personal computer), a server, an industrial control computer, a smartphone, a tablet, a portable computer (such as a laptop), a personal digital assistant (PDA), a desktop computer, a smart wearable device, or a mobile communication terminal, handheld device, in-vehicle device, wearable device, television device, or other computing device. The method can also be applied in a system server of the insurer, a service provider, or a third-party institution; the system server may include a standalone server, a server cluster, a distributed system server, or a combination of a server that handles device request data with other system servers that perform related data processing.
The method embodiments provided by this specification can be executed on a mobile terminal, a computer terminal, a server, or a similar computing device. Taking execution on a server as an example, FIG. 3 is a block diagram of the hardware structure of a server for identifying damaged vehicle components according to an embodiment of the present invention. As shown in FIG. 3, the server 10 may include one or more processors 102 (only one is shown in the figure; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission module 106 for communication. A person of ordinary skill in the art will understand that the structure shown in FIG. 3 is only illustrative and does not limit the structure of the electronic device. For example, the server 10 may include more or fewer components than shown in FIG. 3, such as additional processing hardware like a database or multi-level cache, or may have a configuration different from that shown in FIG. 3.
The memory 104 can be used to store software programs and modules of application software, such as the program instructions/modules corresponding to the search method in the embodiments of the present invention. By running the software programs and modules stored in the memory 104, the processor 102 executes various functional applications and data processing, that is, implements the processing method for displaying the content of the navigation interaction interface described above. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which can be connected to the computer terminal 10 over a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
The transmission module 106 is used to receive or send data over a network. A specific example of the network is a wireless network provided by the communication provider of the computer terminal 10. In one example, the transmission module 106 includes a network adapter (Network Interface Controller, NIC), which can connect to other network devices through a base station and thereby communicate with the Internet. In one example, the transmission module 106 may be a radio frequency (RF) module, which communicates with the Internet wirelessly.
Based on the device model identification method described above, this specification also provides a data processing apparatus for insurance fraud identification. The apparatus may include systems (including distributed systems), software (applications), modules, components, servers, clients, and the like that use the method described in the embodiments of this specification, combined with the necessary implementing hardware. Based on the same inventive concept, the processing apparatus in one embodiment provided by this specification is described in the following embodiments. Because the apparatus solves the problem in a way similar to the method, the implementation of the specific processing apparatus can refer to the implementation of the method described above, and repeated details are omitted. Although the apparatus described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated. Specifically, FIG. 4 is a schematic diagram of the module structure of an embodiment of a data processing apparatus for insurance fraud identification provided by this specification. As shown in FIG. 4, the apparatus may include:
a data acquisition module 101, which can be used to acquire relationship association data of the population to be identified;
a feature calculation module 102, which can be used to construct multi-degree relationship network graph data of the population to be identified from the relationship association data, and to extract person feature data of the population to be identified;
a fraud identification module 103, which can be used to identify the multi-degree relationship network graph data and the person feature data of the population to be identified using the constructed supervised learning algorithm, and to determine the fraud output result for the population to be identified; the supervised learning algorithm includes a data relationship model obtained by training on sample data consisting of the multi-degree relationship network data and person feature data of a selected target population and the labeled historical fraudsters.
In a specific embodiment of the apparatus, the relationship association data may include at least one of the following:
Social relationship data, terminal data, data on terminal applications and application account operations, behavior data associated with insurance activity, basic personal attribute data, and geographic location data.
In another embodiment of the apparatus, the fraud output result that the fraud identification module 103 determines for the population to be identified includes whether a single target person to be identified is a fraudster, or the probability that the person is a fraudster.
In another embodiment of the apparatus, the selected target population includes the set of claim applicants and insured persons.
In another embodiment of the apparatus, the person feature data includes feature data extracted from at least one of: user registered accounts, transaction data, and behavior data associated with insurance activity.
FIG. 5 shows another embodiment of the apparatus. As shown in FIG. 5, the fraud identification module 103 includes:
a feature learning module 1031, which can be used to perform, using the selected supervised learning algorithm, first relationship-network learning on the relationship features between a target person and other persons in the multi-degree relationship network data of the target population, and second self-attribute learning based on the target person's own feature data;
a relationship building module 1032, which can be used to build a relationship model with the feature data obtained from the first relationship-network learning and the second self-attribute learning as the independent variables of the supervised learning algorithm and the labeled historical fraudsters as the dependent variable;
a model training module 1033, which can be used to determine the constructed supervised learning algorithm when the output of the relationship model reaches a preset accuracy. The model parameters are trained iteratively, and the model can be put into online use once the output accuracy requirement is met.
The server or client provided in the embodiments of this specification may be implemented by a processor in a computer executing corresponding program instructions, for example in C++ on a Windows operating system at the PC or server side, or in the application design language of another system such as Linux combined with the necessary hardware, or in processing logic based on a quantum computer. Accordingly, this specification also provides a data processing device for insurance fraud identification, which may specifically include a processor and a memory for storing processor-executable instructions, where the processor, when executing the instructions, implements:
acquiring relationship association data of a group to be identified;
constructing multi-degree relationship network graph data of the group to be identified based on the relationship association data, and extracting person characteristic data of the group to be identified;
identifying the multi-degree relationship network graph data and the person characteristic data of the group to be identified by using a constructed supervised learning algorithm, and determining an insurance-fraud output result for the group to be identified; the supervised learning algorithm includes a data relationship model obtained by training with the multi-degree relationship network data and person characteristic data of a selected target group, together with labeled historical insurance fraudsters, as sample data.
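The graph-construction step above can be sketched as follows. The sketch assumes that relationship association data has already been reduced to pairs of related person identifiers and that "degree" means the number of hops between two persons in the relationship graph; both are simplifying assumptions, and the function and feature names are hypothetical.

```python
from collections import defaultdict, deque


def build_graph(relations):
    """Build an undirected relationship graph from (person, person) pairs."""
    graph = defaultdict(set)
    for a, b in relations:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def degrees_within(graph, start, max_degree):
    """Return {person: degree} for everyone within max_degree hops of start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if seen[person] == max_degree:
            continue  # do not expand beyond the requested degree
        for neighbour in graph.get(person, ()):
            if neighbour not in seen:
                seen[neighbour] = seen[person] + 1
                queue.append(neighbour)
    del seen[start]
    return seen


def relationship_features(graph, person, known_fraudsters, max_degree=2):
    """Count related persons, and labeled fraudsters among them, per degree."""
    reach = degrees_within(graph, person, max_degree)
    feats = {}
    for d in range(1, max_degree + 1):
        at_d = [p for p, deg in reach.items() if deg == d]
        feats[f"degree_{d}_count"] = len(at_d)
        feats[f"degree_{d}_fraud_count"] = sum(p in known_fraudsters for p in at_d)
    return feats


# Hypothetical relations, e.g. two persons sharing a device or an address.
relations = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]
graph = build_graph(relations)
features_a = relationship_features(graph, "A", known_fraudsters={"C", "E"})
```

Features of this kind would then be combined with the person characteristic data and passed to the trained supervised model for identification.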
The above instructions may be stored in a variety of computer-readable storage media. A computer-readable storage medium may include a physical device for storing information, in which the information is digitized and then stored in a medium that uses electrical, magnetic, or optical means. The computer-readable storage media described in this embodiment may include: devices that store information electrically, such as various kinds of memory, for example RAM and ROM; devices that store information magnetically, such as hard disks, floppy disks, magnetic tapes, magnetic core memories, bubble memories, and USB flash drives; and devices that store information optically, such as CDs or DVDs. There are of course other kinds of readable storage media, such as quantum memories and graphene memories. The instructions involved in the apparatus, server, client, or processing device described above are as described above.
The above processing device may specifically be a server with which an insurance server or a third-party service organization provides insurance anti-fraud identification. The server may be a single server, a server cluster, a distributed system server, or a combination of the server that handles the processing device's data requests and other associated data-processing system servers. Accordingly, the embodiments of this specification also provide a specific server product. The server includes at least one processor and a memory for storing processor-executable instructions, where the processor, when executing the instructions, implements:
acquiring relationship association data of a group to be identified;
constructing multi-degree relationship network graph data of the group to be identified based on the relationship association data, and extracting person characteristic data of the group to be identified;
identifying the multi-degree relationship network graph data and the person characteristic data of the group to be identified by using a constructed supervised learning algorithm, and determining an insurance-fraud output result for the group to be identified; the supervised learning algorithm includes a data relationship model obtained by training with the multi-degree relationship network data and person characteristic data of a selected target group, together with labeled historical insurance fraudsters, as sample data.
It should be noted that the apparatus, processing device, and server described above in the embodiments of this specification may also include other implementations in line with the description of the related method embodiments. For specific implementations, refer to the description of the method embodiments; they are not repeated here.
The embodiments in this specification are described in a progressive manner. The same or similar parts of the embodiments may be cross-referenced, and each embodiment focuses on its differences from the others. In particular, the hardware-plus-program embodiments are described relatively briefly because they are basically similar to the method embodiments; for the relevant parts, refer to the corresponding description of the method embodiments.
Specific embodiments of this specification have been described above. Other embodiments fall within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing are also possible or may be advantageous.
Although this application provides the method operation steps described in the embodiments or flowcharts, more or fewer operation steps may be included on the basis of conventional or non-inventive effort. The order of steps listed in the embodiments is only one of many possible execution orders and does not represent the only execution order. When an actual apparatus or system server product executes, the steps may be performed sequentially or in parallel according to the method shown in the embodiments or drawings (for example, in a parallel-processor or multi-threaded environment).
Although the embodiments of this specification mention operations and data descriptions such as data acquisition, storage, interaction, calculation, and judgment (for example, the types of relationship association data collected, the scope of the target group selected for training, and the way the probability of insurance fraud is calculated), the embodiments are not limited to cases that must conform to industry communication standards, standard supervised or unsupervised model processing, communication protocols, standard data models or templates, or the situations described in the embodiments. Implementations that slightly modify certain industry standards, or implementations described in a custom manner or in the embodiments, can also achieve effects that are the same as, equivalent to, or similar to those of the above embodiments, or that are predictable variations of them. Embodiments obtained by applying such modified or varied ways of acquiring, storing, judging, and processing data may still fall within the scope of the optional implementations of this specification.
In the 1990s, an improvement to a technology could be clearly classified as either a hardware improvement (for example, an improvement to circuit structures such as diodes, transistors, and switches) or a software improvement (an improvement to a method flow). As technology has developed, however, many of today's method-flow improvements can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. It therefore cannot be said that an improvement to a method flow cannot be implemented with hardware entity modules. For example, a programmable logic device (PLD), such as a field programmable gate array (FPGA), is an integrated circuit whose logic function is determined by the user programming the device. Designers program a digital system onto a single PLD themselves, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of manually fabricating integrated circuit chips, this programming is nowadays mostly done with "logic compiler" software, which is similar to the software compilers used in program development. The source code to be compiled must be written in a specific programming language called a hardware description language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); the most commonly used at present are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog. Those skilled in the art should also understand that a hardware circuit implementing a logical method flow can easily be obtained by describing the method flow in one of the above hardware description languages and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art also know that, besides implementing the controller purely as computer-readable program code, it is entirely possible to logically program the method steps so that the controller achieves the same functions in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller can therefore be regarded as a hardware component, and the means included in it for implementing various functions can be regarded as structures within the hardware component. Or the means for implementing various functions can even be regarded both as software modules implementing a method and as structures within a hardware component.
The processing devices, apparatuses, modules, or units set forth in the above embodiments may be implemented by a computer chip or entity, or by a product having a certain function. A typical implementation device is a computer. Specifically, the computer may be, for example, a personal computer, a laptop computer, an in-vehicle human-machine interaction device, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Although the embodiments of this specification provide the method operation steps described in the embodiments or flowcharts, more or fewer operation steps may be included on the basis of conventional or non-inventive means. The order of steps listed in the embodiments is only one of many possible execution orders and does not represent the only execution order. When an actual apparatus or terminal product executes, the steps may be performed sequentially or in parallel according to the method shown in the embodiments or drawings (for example, in a parallel-processor or multi-threaded environment, or even a distributed data processing environment). The terms "include", "comprise", and any other variants are intended to cover non-exclusive inclusion, so that a process, method, product, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, product, or device. Without further limitation, the presence of additional identical or equivalent elements in the process, method, product, or device that includes the stated elements is not excluded.
For convenience of description, the above apparatus is described in terms of separate modules divided by function. When implementing the embodiments of this specification, the functions of the modules may of course be implemented in the same one or more pieces of software and/or hardware, and a module implementing a given function may be implemented by a combination of multiple sub-modules or sub-units. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in another form.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture that includes instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps is performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include non-permanent memory in computer-readable media, in the form of random access memory (RAM) and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
Those skilled in the art should understand that the embodiments of this specification may be provided as a method, a system, or a computer program product. Accordingly, the embodiments of this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The embodiments of this specification may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The embodiments of this specification may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
The embodiments in this specification are described in a progressive manner. The same or similar parts of the embodiments may be cross-referenced, and each embodiment focuses on its differences from the others. In particular, the system embodiments are described relatively briefly because they are basically similar to the method embodiments; for the relevant parts, refer to the corresponding description of the method embodiments. In the description of this specification, a description that refers to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of this specification. In this specification, such references do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict each other, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
The above are merely examples of the embodiments of this specification and are not intended to limit them. Those skilled in the art may make various modifications and changes to the embodiments of this specification. Any modification, equivalent replacement, or improvement made within the spirit and principles of the embodiments of this specification shall fall within the scope of the claims.

Claims (14)

  1. A data processing method for insurance fraud identification, the method comprising:
    acquiring relationship association data of a group to be identified;
    constructing multi-degree relationship network graph data of the group to be identified based on the relationship association data, and extracting person characteristic data of the group to be identified;
    identifying the multi-degree relationship network graph data and the person characteristic data of the group to be identified by using a constructed supervised learning algorithm, and determining an insurance-fraud output result for the group to be identified; wherein the supervised learning algorithm comprises a data relationship model obtained by training with the multi-degree relationship network data and person characteristic data of a selected target group, together with labeled historical insurance fraudsters, as sample data.
  2. The method of claim 1, wherein the relationship association data comprises at least one of the following:
    social relationship data, terminal data, terminal applications and application account operation information, behavior data associated with insurance behavior, basic personal attribute data, and geographic location data.
  3. The method of claim 1, wherein determining the insurance-fraud output result for the group to be identified comprises outputting whether a single target person to be identified is a fraudster, or the probability that the person is a fraudster.
  4. The method of claim 1, wherein the selected target group comprises the set of persons who apply for claims and persons who are insured.
  5. The method of claim 1 or 3, wherein the person characteristic data comprises feature data extracted from at least one of a user registration account, transaction data, and behavior data associated with insurance behavior.
  6. The method of claim 1 or 3, wherein the supervised learning algorithm is constructed in the following manner:
    using the selected supervised learning algorithm to perform first relationship-network learning on the relationship features between a target person and other persons in the multi-degree relationship network data of the target group, and performing second own-attribute learning based on the target person's own feature data;
    establishing a relationship model with the feature data obtained from the first relationship-network learning and the second own-attribute learning as the independent variables of the supervised learning algorithm, and with the labeled historical insurance fraudsters as the dependent variable;
    determining the constructed supervised learning algorithm when the output of the relationship model reaches a preset accuracy.
  7. A data processing apparatus for insurance fraud identification, comprising:
    a data acquisition module, configured to acquire relationship association data of a group to be identified;
    a feature calculation module, configured to construct multi-degree relationship network graph data of the group to be identified based on the relationship association data, and to extract person characteristic data of the group to be identified;
    a fraud identification module, configured to identify the multi-degree relationship network graph data and the person characteristic data of the group to be identified by using a constructed supervised learning algorithm, and to determine an insurance-fraud output result for the group to be identified; wherein the supervised learning algorithm comprises a data relationship model obtained by training with the multi-degree relationship network data and person characteristic data of a selected target group, together with labeled historical insurance fraudsters, as sample data.
  8. The apparatus of claim 7, wherein the relationship association data comprises at least one of the following:
    social relationship data, terminal data, terminal applications and application account operation information, behavior data associated with insurance behavior, basic personal attribute data, and geographic location data.
  9. The apparatus of claim 7, wherein the insurance-fraud output result that the fraud identification module determines for the group to be identified comprises outputting whether a single target person to be identified is a fraudster, or the probability that the person is a fraudster.
  10. The apparatus of claim 7, wherein the selected target group comprises the set of persons who apply for claims and persons who are insured.
  11. The apparatus of claim 7 or 9, wherein the person characteristic data comprises feature data extracted from at least one of a user registration account, transaction data, and behavior data associated with insurance behavior.
  12. The apparatus of claim 7 or 9, wherein the fraud identification module comprises:
    a feature learning module, configured to use the selected supervised learning algorithm to perform first relationship-network learning on the relationship features between a target person and other persons in the multi-degree relationship network data of the target group, and to perform second own-attribute learning based on the target person's own feature data;
    a relationship establishing module, configured to establish a relationship model with the feature data obtained from the first relationship-network learning and the second own-attribute learning as the independent variables of the supervised learning algorithm, and with the labeled historical insurance fraudsters as the dependent variable;
    a model training module, configured to determine the constructed supervised learning algorithm when the output of the relationship model reaches a preset accuracy.
  13. A processing device, comprising a processor and a memory for storing processor-executable instructions, wherein the processor, when executing the instructions, implements:
    acquiring relationship-related data of a group to be identified;
    constructing multi-degree relationship network graph data of the group to be identified based on the relationship-related data, and extracting person feature data of the group to be identified;
    using a constructed supervised learning algorithm to identify the multi-degree relationship network graph data and the person feature data of the group to be identified, and determining an insurance fraud output result for the group to be identified; wherein the supervised learning algorithm comprises a data relationship model obtained by training on sample data consisting of the multi-degree relationship network data and person feature data of a selected target group together with labeled historical insurance fraudsters.
  14. A server, comprising at least one processor and a memory for storing processor-executable instructions, wherein the processor, when executing the instructions, implements:
    acquiring relationship-related data of a group to be identified;
    constructing multi-degree relationship network graph data of the group to be identified based on the relationship-related data, and extracting person feature data of the group to be identified;
    using a constructed supervised learning algorithm to identify the multi-degree relationship network graph data and the person feature data of the group to be identified, and determining an insurance fraud output result for the group to be identified; wherein the supervised learning algorithm comprises a data relationship model obtained by training on sample data consisting of the multi-degree relationship network data and person feature data of a selected target group together with labeled historical insurance fraudsters.
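For illustration only, the three steps recited in the claims above (acquiring relationship data, building multi-degree relationship network data plus person features, and scoring with a supervised model trained on labeled historical fraudsters) might be sketched as follows. Everything in the sketch is an assumption rather than the claimed implementation: the claims above do not name a particular supervised algorithm, so a small hand-written logistic regression stands in for it; the person identifiers, the edges, the single "own attribute" (a hypothetical claim count), the choice of two degrees, and a fixed epoch count in place of the preset-accuracy stopping condition are all invented for the example.

```python
import math
from collections import deque

def build_graph(edges):
    """Build an undirected adjacency map from (person_a, person_b) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def neighbors_by_degree(graph, start, max_degree=2):
    """Breadth-first search mapping each reachable person to a hop count."""
    degree = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if degree[person] == max_degree:
            continue  # do not expand beyond the requested degree
        for other in graph.get(person, ()):
            if other not in degree:
                degree[other] = degree[person] + 1
                queue.append(other)
    del degree[start]  # the person is not their own neighbor
    return degree

def features(graph, person, labeled_fraud, own):
    """Bias term, labeled-fraud contacts at degree 1 and 2, then own attributes."""
    deg = neighbors_by_degree(graph, person, 2)
    first = sum(1 for p, d in deg.items() if d == 1 and p in labeled_fraud)
    second = sum(1 for p, d in deg.items() if d == 2 and p in labeled_fraud)
    return [1.0, float(first), float(second)] + [float(v) for v in own[person]]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp well within range
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, x):
    """Probability-like score that the person with feature vector x is a fraudster."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)))

def train(samples, labels, lr=0.1, epochs=500):
    """Stochastic gradient descent on the logistic loss (stand-in algorithm)."""
    weights = [0.0] * len(samples[0])
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = predict(weights, x) - y
            weights = [w - lr * err * v for w, v in zip(weights, x)]
    return weights

# Hypothetical toy data: f* are labeled historical fraudsters, n* are not,
# and x / y are the persons to be identified.
edges = [("f1", "f2"), ("f2", "f3"), ("n1", "n2"), ("n2", "n3"),
         ("x", "f1"), ("y", "n1")]
labeled_fraud = {"f1", "f2", "f3"}
own = {"f1": (4,), "f2": (5,), "f3": (3,), "n1": (1,), "n2": (0,),
       "n3": (1,), "x": (2,), "y": (2,)}

graph = build_graph(edges)
train_people = ["f1", "f2", "f3", "n1", "n2", "n3"]
X = [features(graph, p, labeled_fraud, own) for p in train_people]
y = [1.0 if p in labeled_fraud else 0.0 for p in train_people]
weights = train(X, y)

p_x = predict(weights, features(graph, "x", labeled_fraud, own))
p_y = predict(weights, features(graph, "y", labeled_fraud, own))
```

In this toy setup `x` and `y` have the same own attribute and differ only in their position in the relationship network, so any gap between `p_x` and `p_y` comes from the network features alone. A real system would replace the stand-in model, the feature definitions, and the stopping rule with whatever the deployed algorithm requires.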
PCT/CN2019/074097 2018-04-12 2019-01-31 Data processing method, apparatus and device for insurance fraud identification, and server WO2019196552A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810327069.3A CN108334647A (en) 2018-04-12 2018-04-12 Data processing method, device, equipment and the server of Insurance Fraud identification
CN201810327069.3 2018-04-12

Publications (1)

Publication Number Publication Date
WO2019196552A1 (en) 2019-10-17

Family

ID=62934055

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/074097 WO2019196552A1 (en) 2018-04-12 2019-01-31 Data processing method, apparatus and device for insurance fraud identification, and server

Country Status (3)

Country Link
CN (1) CN108334647A (en)
TW (1) TWI686760B (en)
WO (1) WO2019196552A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4113422A3 (en) * 2021-12-08 2023-05-17 Beijing Baidu Netcom Science Technology Co., Ltd. Method and apparatus for remote damage assessment of a vehicle, electronic device and medium

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108334647A (en) * 2018-04-12 2018-07-27 阿里巴巴集团控股有限公司 Data processing method, device, equipment and the server of Insurance Fraud identification
CN109087145A (en) * 2018-08-13 2018-12-25 阿里巴巴集团控股有限公司 Target group's method for digging, device, server and readable storage medium storing program for executing
CN109325525A (en) * 2018-08-31 2019-02-12 阿里巴巴集团控股有限公司 Sample attribute assessment models training method, device and server
CN109447658A (en) * 2018-09-10 2019-03-08 平安科技(深圳)有限公司 The generation of anti-fraud model and application method, device, equipment and storage medium
CN109657890B (en) * 2018-09-14 2023-04-25 蚂蚁金服(杭州)网络技术有限公司 Method and device for determining risk of money transfer fraud
CN109614496B (en) * 2018-09-27 2022-06-17 长威信息科技发展股份有限公司 Low security identification method based on knowledge graph
CN109509106A (en) * 2018-10-30 2019-03-22 平安科技(深圳)有限公司 Flat type determines method and Related product
CN109544379A (en) * 2018-10-30 2019-03-29 平安科技(深圳)有限公司 Flat type determines method and Related product
CN109801176B (en) * 2019-02-22 2021-04-06 中科软科技股份有限公司 Method, system, electronic device and storage medium for identifying insurance fraud
CN110264371B (en) * 2019-05-10 2024-03-08 创新先进技术有限公司 Information display method, device, computing equipment and computer readable storage medium
CN110428337B (en) * 2019-06-14 2023-01-20 南京极谷人工智能有限公司 Vehicle insurance fraud group partner identification method and device
CN110363406A (en) * 2019-06-27 2019-10-22 上海淇馥信息技术有限公司 Appraisal procedure, device and the electronic equipment of a kind of client intermediary risk
CN110580260B (en) * 2019-08-07 2023-05-26 北京明智和术科技有限公司 Data mining method and device for specific group
CN111415241A (en) * 2020-02-29 2020-07-14 深圳壹账通智能科技有限公司 Method, device, equipment and storage medium for identifying cheater
CN112419074A (en) * 2020-11-13 2021-02-26 中保车服科技服务股份有限公司 Vehicle insurance fraud group identification method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107066616A (en) * 2017-05-09 2017-08-18 北京京东金融科技控股有限公司 Method, device and electronic equipment for account processing
CN107194623A (en) * 2017-07-20 2017-09-22 深圳市分期乐网络科技有限公司 A kind of discovery method and device of clique's fraud
CN107403326A (en) * 2017-08-14 2017-11-28 云数信息科技(深圳)有限公司 A kind of Insurance Fraud recognition methods and device based on teledata
CN107644098A (en) * 2017-09-29 2018-01-30 马上消费金融股份有限公司 A kind of fraud recognition methods, device, equipment and storage medium
CN108334647A (en) * 2018-04-12 2018-07-27 阿里巴巴集团控股有限公司 Data processing method, device, equipment and the server of Insurance Fraud identification

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7813944B1 (en) * 1999-08-12 2010-10-12 Fair Isaac Corporation Detection of insurance premium fraud or abuse using a predictive software system
WO2001073652A1 (en) * 2000-03-24 2001-10-04 Access Business Group International Llc System and method for detecting fraudulent transactions
CN105095238B (en) * 2014-05-04 2019-01-18 中国银联股份有限公司 For detecting the decision tree generation method of fraudulent trading
WO2015187372A1 (en) * 2014-06-02 2015-12-10 Yottamine Analytics, Llc Digital event profile filters
WO2016210122A1 (en) * 2015-06-24 2016-12-29 IGATE Global Solutions Ltd. Insurance fraud detection and prevention system
CN106600413A (en) * 2015-10-19 2017-04-26 阿里巴巴集团控股有限公司 Cheat recognition method and system
CN106600423A (en) * 2016-11-18 2017-04-26 云数信息科技(深圳)有限公司 Machine learning-based car insurance data processing method and device and car insurance fraud identification method and device
CN106803168B (en) * 2016-12-30 2021-04-16 中国银联股份有限公司 Abnormal transfer detection method and device
CN107145587A (en) * 2017-05-11 2017-09-08 成都四方伟业软件股份有限公司 A kind of anti-fake system of medical insurance excavated based on big data
CN107785058A (en) * 2017-07-24 2018-03-09 平安科技(深圳)有限公司 Anti- fraud recognition methods, storage medium and the server for carrying safety brain
CN107730262B (en) * 2017-10-23 2021-09-24 创新先进技术有限公司 Fraud identification method and device
CN107819747B (en) * 2017-10-26 2020-09-18 上海欣方智能系统有限公司 Telecommunication fraud association analysis system and method based on communication event sequence

Also Published As

Publication number Publication date
TWI686760B (en) 2020-03-01
TW201944338A (en) 2019-11-16
CN108334647A (en) 2018-07-27

Similar Documents

Publication Publication Date Title
WO2019196552A1 (en) Data processing method, apparatus and device for insurance fraud identification, and server
WO2019196545A1 (en) Data processing method, apparatus and device for insurance fraud identification, and server
TWI712981B (en) Risk identification model training method, device and server
WO2019114412A1 (en) Graphical structure model-based method for credit risk control, and device and equipment
TWI715879B (en) Method, device and equipment for controlling transaction risk based on graph structure model
Adetunji et al. House price prediction using random forest machine learning technique
US11568480B2 (en) Artificial intelligence derived anonymous marketplace
US11797844B2 (en) Neural embeddings of transaction data
WO2019196546A1 (en) Method and apparatus for determining risk probability of service request event
Alaka et al. Methodological approach of construction business failure prediction studies: a review
Figini et al. Statistical merging of rating models
US11514369B2 (en) Systems and methods for machine learning model interpretation
US11544627B1 (en) Machine learning-based methods and systems for modeling user-specific, activity specific engagement predicting scores
CN111783039B (en) Risk determination method, risk determination device, computer system and storage medium
KR20200039852A (en) Method for analysis of business management system providing machine learning algorithm for predictive modeling
CN112102006A (en) Target customer acquisition method, target customer search method and target customer search device based on big data analysis
CN109903166B (en) Data risk prediction method, device and equipment
Calabrese Optimal cut-off for rare events and unbalanced misclassification costs
Park et al. A study on improving turnover intention forecasting by solving imbalanced data problems: focusing on SMOTE and generative adversarial networks
Hong Optimal threshold from ROC and CAP curves
Kim et al. Identification of merger and acquisition waves and their macroeconomic determinants in the hospitality industry
López-Díaz et al. A stochastic comparison of customer classifiers with an application to customer attrition in commercial banking
Boz et al. Reassessment and monitoring of loan applications with machine learning
CN112163962A (en) Method and device for model training and business wind control
Cheng et al. A quarterly time-series classifier based on a reduced-dimension generated rules method for identifying financial distress

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19784644; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 19784644; Country of ref document: EP; Kind code of ref document: A1)