WO2019196552A1 - Data processing method, apparatus and device for identifying insurance fraud, and server - Google Patents

Data processing method, apparatus and device for identifying insurance fraud, and server

Info

Publication number
WO2019196552A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
relationship
person
identified
insurance
Prior art date
Application number
PCT/CN2019/074097
Other languages
English (en)
Chinese (zh)
Inventor
王修坤
邹晓川
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2019196552A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining

Definitions

  • The embodiments of this specification belong to the technical field of computer data processing for insurance fraud detection, and relate in particular to a data processing method, apparatus, processing device, and server for insurance fraud identification.
  • The embodiments of this specification aim to provide a data processing method, apparatus, processing device, and server for insurance fraud identification that use the relationship network data between persons together with each person's own characteristics, so that fraudsters can be identified more effectively.
  • The data processing method, apparatus, processing device, and server for insurance fraud identification provided by the embodiments of this specification are implemented as follows:
  • wherein the supervised learning algorithm includes a data relationship model obtained by training on the multi-degree relationship network data and person characteristic data of a selected target group, with labeled historical fraudsters as sample data.
  • A data processing apparatus for insurance fraud identification, comprising:
  • a data acquisition module configured to acquire relationship association data of the group to be identified;
  • a feature calculation module configured to construct the multi-degree relationship network graph data of the group to be identified based on the relationship association data, and to extract the person characteristic data of the group to be identified;
  • a fraud identification module configured to process the multi-degree relationship network graph data and the person characteristic data of the group to be identified with the constructed supervised learning algorithm, and to determine the insurance fraud output result for the group to be identified;
  • wherein the supervised learning algorithm includes a data relationship model obtained by training on the multi-degree relationship network data and person characteristic data of a selected target group, with labeled historical fraudsters as sample data.
  • A processing device, including a processor and a memory for storing processor-executable instructions, wherein the processor, when executing the instructions, implements:
  • wherein the supervised learning algorithm includes a data relationship model obtained by training on the multi-degree relationship network data and person characteristic data of a selected target group, with labeled historical fraudsters as sample data.
  • A server, comprising at least one processor and a memory for storing processor-executable instructions, wherein the processor, when executing the instructions, implements:
  • wherein the supervised learning algorithm includes a data relationship model obtained by training on the multi-degree relationship network data and person characteristic data of a selected target group, with labeled historical fraudsters as sample data.
  • The data processing method, apparatus, processing device, and server for insurance fraud identification provided by the embodiments of this specification construct multi-degree relationship network graph data for a group from the multi-dimensional relationship data of claim applicants and insured persons, which makes it possible to explore the relationship network between persons in greater depth and to improve the efficiency and scope of identification.
  • A supervised learning model is established to learn both the relationship network characteristics and the personal characteristics of fraudsters.
  • Members of a fraud gang not only show pronounced and rich relationship characteristics in the relationship network, but often also resemble one another in their personal characteristics. The methods provided in the embodiments of this specification can therefore identify fraudsters more effectively and improve the efficiency of identification processing.
  • FIG. 1 is a schematic flowchart of an embodiment of a data processing method for insurance fraud identification provided by this specification.
  • FIG. 2 is a schematic diagram of a processing procedure for constructing a supervised identification model provided by this specification.
  • FIG. 3 is a block diagram showing the hardware structure of an insurance fraud identification processing server provided by this specification.
  • FIG. 4 is a block diagram showing the structure of a data processing apparatus for insurance fraud identification provided by this specification.
  • FIG. 5 is a block diagram showing the structure of a fraud identification module in a data processing apparatus for insurance fraud identification provided by this specification.
  • This specification provides a number of embodiments that start from multiple kinds of relationship association data for a target group comprising insured persons and claim applicants, and construct a multi-degree relationship network from that data (the data of the relationship network graph may be referred to as a multi-degree relationship graph).
  • The solution provided by the embodiments of this specification also considers the characteristic attributes of the fraudsters themselves: for example, a fraudster typically registers an account with false information, the account has been registered for only a short time, and the account was registered in order to use the insurance service.
  • The implementation provided by this specification combines the relationship characteristic data of an insurance fraud group with each person's own characteristic data, labels historical fraudsters, and trains a supervised model on this data, so that a result indicating whether a person to be identified is committing insurance fraud can be calculated.
  • FIG. 1 is a schematic flowchart of an embodiment of a data processing method for insurance fraud identification provided by this specification.
  • Although this specification provides method operation steps or apparatus structures as shown in the following embodiments or figures, the method or apparatus may, on the basis of conventional practice or without inventive effort, include more operation steps or module units, or fewer after partial merging.
  • The execution order of the steps or the module structure of the apparatus is not limited to the execution order or the module structure shown in the embodiments or the drawings.
  • When the method or module structure is applied in an actual apparatus, server, or terminal product, it may be executed sequentially or in parallel according to the method or module structure shown in the embodiments or drawings (for example, in a parallel-processor or multi-threaded environment, or even in a distributed processing or server cluster environment).
  • The description of the following embodiments does not limit other extensible technical solutions based on this specification.
  • The embodiments provided in this specification can also be applied to implementation scenarios such as fund fraud identification, product transactions, and service transactions.
  • the data processing method for insurance fraud identification provided by the present specification may include:
  • S2: constructing the multi-degree relationship network graph data of the group to be identified based on the relationship association data, and extracting the person characteristic data of the group to be identified;
  • wherein the supervised learning algorithm includes a data relationship model obtained by training on the multi-degree relationship network data and person characteristic data of a selected target group, with labeled historical fraudsters as sample data.
  • Insurance application, review, and claims mainly involve the claim applicant.
  • Some embodiments consider the case in which the motive to defraud exists from the moment the policy is taken out, where the fraudster's main purpose is to claim the insurance benefit; of course, in other cases the motive arises only after the policy has been taken out.
  • The insured person is the main subject of the insurance; for example, fraudsters in the same gang may deliberately stage an accident involving the insured person. In this embodiment, therefore, the set of claim applicants and insured persons is selected as the target group when identifying insurance fraud.
  • When a target group is selected for collecting and learning relationship characteristic data, the target group may include the set of claim applicants and insured persons.
  • In some implementations the claim applicant may be the policyholder, for example when a father insures his son, names himself as beneficiary, and applies for the claim after an accident; in other cases the claim applicant may be the insured person, for example when a person insures himself and is his own beneficiary.
  • The claim applicant and the insured person mentioned above should be understood as names for different roles in the insurance business; they are not necessarily different people.
  • The target group may also be selected as one or more of the claim applicant, the policyholder, the insured person, or the beneficiary.
  • The relationship association data may include data associated with persons in the target group along various dimensions, such as household registration, age, kinship or classmate relationships between persons, insurance policy data, and insurance risk data.
  • Which categories of relationship association data are used may be decided according to the actual application scenario.
  • The operator may treat any data that could be involved in fraudulent behavior as a basis for collecting the relationship association data.
  • The relationship association data may include at least one of the following:
  • The social relationship data may include social relationships between persons in the target group, such as cousins, teacher and student, family members, classmates, and superior and subordinate.
  • The terminal data may include the brand, model, and category of the communication device used by a person; in some fraud scenarios, several persons use mobile phones of the same brand.
  • The terminal application and application account operation information may be used to determine whether persons use the same application, or log in to applications on different terminals with the same account, to carry out insurance fraud. In some scenarios, multiple followers act on their terminals under unified direction.
  • The behavior data associated with insurance behavior may include data on the target person's policy purchases, claims, compensation amounts, and the like.
  • The basic attribute data of a person may include the age, gender, occupation, household registration, and the like of the policyholder or claim applicant.
  • The geographic location data may include the current geographic location of the target group, or information on regions its members have historically passed through or stayed in.
  • The relationship association data of each dimension described above may be defined differently or contain more or fewer data categories, and relationship association data of other dimensions may also be included, such as consumption information or even credit records or administrative penalty information. In a specific implementation, one or more of the above kinds of data may be collected.
  • The person characteristic data may include data associated with a single person, such as gender, age, the registration time of the insurance service account or terminal application account, credit history, and consumption status. It may also include behavior data associated with insurance behavior, such as repeated policy purchases, frequent claims, and whether the compensation amount is normal. It may further include transaction data for other goods or services, such as sustained large expenses, multiple vehicle insurance policies, multiple mobile phones, and multiple communication or social accounts.
  • The person characteristic data used in a specific feature calculation may combine one or more of the above to characterize the person. In another embodiment, therefore, the person characteristic data may include feature data extracted from at least one of: a user registration account, transaction data, and behavior data associated with insurance behavior.
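  • As a minimal sketch of how such person characteristic data might be turned into a numeric feature vector, the Python example below derives a few features from a registration account, transaction data, and insurance behavior. The record fields (`account_registered`, `first_policy_date`, `claim_count`, `phone_count`) and the choice of features are illustrative assumptions, not fields defined by this specification.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PersonRecord:
    """Hypothetical raw record for one person; field names are assumptions."""
    person_id: str
    age: int
    account_registered: date   # when the insurance service account was registered
    first_policy_date: date    # when the first policy was taken out
    claim_count: int           # number of claims filed
    phone_count: int           # number of mobile phones linked to the person


def person_features(record: PersonRecord) -> list[float]:
    """Return [age, days between registration and first policy, claims, phones]."""
    days_before_policy = (record.first_policy_date - record.account_registered).days
    return [
        float(record.age),
        float(days_before_policy),  # a short gap may indicate an account opened only to insure
        float(record.claim_count),
        float(record.phone_count),
    ]


example = PersonRecord("p1", 35, date(2018, 1, 1), date(2018, 1, 11), 3, 2)
print(person_features(example))  # [35.0, 10.0, 3.0, 2.0]
```

  • In practice the features would be chosen and scaled to suit the learning algorithm; the four used here are only placeholders.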
  • The multi-dimensional relationship association data obtained above can be used to construct the multi-degree relationship network graph data of the target group.
  • The multi-degree relationship network graph data may include a relationship network graph generated from the relationship chains that the relationship association data establishes between different persons; the relationship chain data between persons on that graph constitutes the multi-degree relationship network graph data.
  • A relationship chain can represent the relationship between any two persons; for example, A and B are in a superior-subordinate relationship, and A and C are family members.
  • A relationship directly between two persons may be referred to as a one-degree relationship. The “multi-degree” in the multi-degree relationship network graph data of this embodiment may include new associations between persons established on the basis of one-degree relationships,
  • such as a second-degree relationship between a first person and a third person derived from the one-degree relationship between the first person and a second person and the one-degree relationship between the second person and the third person. Relationships of higher degree may be established in the same way from further one-degree relationships.
  • For example, A and B have a one-degree social relationship, and A has no direct social relationship with C, the company boss of A's brother-in-law B. In this embodiment, however, because B is both A's brother-in-law and a subordinate of company boss C, A has a second-degree relationship with C.
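  • The A, B, C example can be sketched in Python as a breadth-first search over an undirected relationship graph, where the degree of a relationship is the smallest number of one-degree links between two persons. The function names and the `(person, person, kind)` triple format are assumptions made for illustration; the specification does not prescribe a data structure.

```python
from collections import deque


def build_graph(relations):
    """Build an undirected adjacency map from (person, person, kind) triples."""
    graph = {}
    for a, b, _kind in relations:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def relationship_degrees(graph, source, max_degree=2):
    """Map each person within max_degree hops of source to the smallest hop count."""
    degrees = {source: 0}
    queue = deque([source])
    while queue:
        person = queue.popleft()
        if degrees[person] == max_degree:
            continue  # do not expand beyond the requested degree
        for neighbor in graph.get(person, ()):
            if neighbor not in degrees:
                degrees[neighbor] = degrees[person] + 1
                queue.append(neighbor)
    del degrees[source]  # the source is not related to itself
    return degrees


relations = [("A", "B", "brother-in-law"), ("B", "C", "subordinate")]
graph = build_graph(relations)
degrees = relationship_degrees(graph, "A")
print(degrees)  # {'B': 1, 'C': 2}
```

  • Raising `max_degree` extends the search to relationships of third degree and beyond.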
  • This embodiment can use a supervised learning algorithm to learn the relationship characteristics and personal characteristics of fraudsters, so that an effective identification model can be established.
  • Supervised learning is a kind of classification processing. An optimal model is typically obtained by training on existing samples (i.e., known data and their corresponding outputs). The model belongs to some set of functions, and "optimal" means best under a given evaluation criterion. The model is then used to map every input to a corresponding output, and a simple judgment on the output achieves classification, which gives the model the ability to classify unknown data. Typical examples of supervised learning are KNN (k-nearest neighbors) and SVM (support vector machine).
  • Given a certain number of training samples, a supervised learning algorithm can obtain more accurate output results than an unsupervised algorithm.
  • The specific processing of relationship features and personal features may be designed according to the type of algorithm and the identification requirements.
  • A supervised graph algorithm such as Structure2vec can be used.
  • Constructing the supervised learning algorithm includes:
  • S40: using the selected supervised learning algorithm, performing first relationship network learning on the relationship characteristics between a target person and other persons in the multi-degree relationship network data of the target group, and performing second self-attribute learning based on the target person's own characteristic data;
  • S44: determining the constructed supervised learning algorithm when the output of the relationship model reaches a preset accuracy rate.
  • FIG. 2 is a schematic diagram of a processing procedure of an embodiment of a supervised learning algorithm provided by the present specification.
  • The supervised graph algorithm Structure2vec can be used. First, the algorithm learns the relationship characteristics of the target person and its neighbors (such as how many persons the target is related to and whether it is related to a fraudster) and also learns the target person's own characteristics (such as gender and age); these characteristics serve as the x variables of the model. Second, the historical labels indicating whether a person is a fraudster serve as the y variable. Finally, a model relating y to x is established, so that y can be predicted from x alone.
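  • The x/y setup can be illustrated with a small Python sketch. It does not implement Structure2vec: a plain k-nearest-neighbors classifier (KNN, one of the supervised methods named in this specification) stands in for it, and the relationship features are reduced to two hand-written counts. All names, the toy data, and the single own feature (a claim count) are assumptions for illustration only.

```python
import math


def build_graph(edges):
    """Undirected adjacency map built from (person, person) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def feature_vector(graph, person, known_fraudsters, self_features):
    """x = [neighbor count, neighbors labeled as fraudsters] + the person's own features."""
    neighbors = graph.get(person, set())
    relation_part = [float(len(neighbors)), float(len(neighbors & known_fraudsters))]
    return relation_part + list(self_features[person])


def knn_fraud_score(train_x, train_y, x, k=3):
    """Fraction of the k nearest labeled samples (Euclidean distance) whose label is 1."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], x))
    nearest = ranked[:k]
    return sum(label for _, label in nearest) / len(nearest)


# Hypothetical toy data: historical labels (1 = fraudster) and one own feature (claim count).
edges = [("p1", "p2"), ("p1", "p3"), ("p2", "p3"), ("p4", "p5"), ("p7", "p1"), ("p7", "p2")]
labels = {"p1": 1, "p2": 1, "p3": 1, "p4": 0, "p5": 0, "p6": 0}
claims = {"p1": [3.0], "p2": [4.0], "p3": [3.0], "p4": [0.0],
          "p5": [1.0], "p6": [0.0], "p7": [3.0]}

graph = build_graph(edges)
fraudsters = {p for p, y in labels.items() if y == 1}
train_x = [feature_vector(graph, p, fraudsters, claims) for p in labels]  # x variables
train_y = [labels[p] for p in labels]                                     # y variable

# Score the unlabeled person p7 from x alone.
score = knn_fraud_score(train_x, train_y, feature_vector(graph, "p7", fraudsters, claims))
print(score)  # 1.0
```

  • A graph embedding method such as Structure2vec would learn the relationship representation from the graph itself rather than from fixed counts; the sketch only shows how relationship features and own features combine into x while the historical labels supply y.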
  • The final identification target in the application scenario of this embodiment may be a single person. That is, after the supervised learning algorithm has learned the relationship characteristics of gang insurance fraud and combined them with the personal characteristics of fraudsters, it can directly output an insurance fraud result for the person to be identified:
  • for example, labeling the person as a fraudster or a normal person, or giving the probability that the person is a fraudster.
  • The label described here is an identification result based on relationship features and personal features, and can serve as a basis and reference for a preliminary determination of whether these persons are fraudsters. The final determination of whether insurance fraud has occurred may be made by an operator's judgment or in combination with other calculation methods.
  • The data processing method for insurance fraud identification provided by this embodiment can construct multi-degree relationship network graph data for a group from the multi-dimensional relationship association data of claim applicants and insured persons, which makes it possible to explore the relationship network between persons in greater depth and to improve the efficiency and scope of identification.
  • Data on historical fraudsters may also be combined with the multi-degree relationship network graph data to identify fraudsters.
  • The relationship association data may further include list data of historical fraudsters.
  • With data on historical fraudsters added, the degree of participation of historical fraudsters is taken into account when the classified communities are analyzed.
  • The relationship concentration described in this embodiment may include the degree of participation of historical fraudsters, and may specifically include the number of historical fraudsters in the classified community, the proportion of historical fraudsters, and the relationships between historical fraudsters and other persons.
  • An example of relationship concentration: in a risk community of 10 persons, 2 historical fraudsters are related by kinship to one or more other members, and 2 persons are classmates.
  • Relationship concentration can be calculated in different ways, for example from the number of historical fraudsters, their proportion, or the relationship network.
  • The relationship concentration may be calculated from two indicators, the size of the group to be identified and the number of historical fraudsters, and used as a measure of the probability of insurance fraud. Specifically, this may include:
  • The personal fraud probability computed from a person's own characteristics may be combined with the group fraud probability to determine the final output: the probability that the group is a fraud group, or that a single person is a fraudster. Alternatively, the group fraud probability and the personal fraud probability may each be used separately, without combining them.
  • The probability of community fraud can be calculated as follows:
  • $\mathrm{RiskDegree} = \log(N) \times \dfrac{n_{\text{fraud}}}{N}$, where $N$ is the total number of classified community members and $n_{\text{fraud}}$ is the number of historical fraudsters in the community.
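  • A minimal Python sketch of the RiskDegree calculation follows. The specification does not state the base of the logarithm, so the natural logarithm is assumed here; the function name and the input checks are also assumptions.

```python
import math


def risk_degree(community_size: int, historical_fraudsters: int) -> float:
    """RiskDegree = log(community size) * historical fraudsters / community size.

    The natural logarithm is an assumption; the source gives no base.
    """
    if community_size <= 0:
        raise ValueError("community_size must be positive")
    if not 0 <= historical_fraudsters <= community_size:
        raise ValueError("historical_fraudsters must be between 0 and community_size")
    return math.log(community_size) * historical_fraudsters / community_size


# A community of 10 members containing 2 historical fraudsters.
print(risk_degree(10, 2))
```

  • Because of the logarithmic factor, a larger community scores higher than a smaller one with the same proportion of historical fraudsters, and a community of one person always scores zero.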
  • The above embodiments provide a way to identify fraud groups using data on historical fraudsters.
  • The relationship network features among the members of a group can also be used to determine whether they are fraudsters. Specifically, for example: determining the network structure characteristics of the relationships among persons in the group;
  • the group is marked as a fraud group.
  • The method described above can be used in training the supervised learning algorithm, in which case the group is the target group.
  • When it is instead applied to identification, the group is the group to be identified.
  • The network structure feature may be determined from information on the persons in the group, relationship network information between persons, and the like.
  • The relationship network information here may be the one-degree relationship information described above and may also include the constructed multi-degree relationship information.
  • The relationship network in the group may have a structure such as a "spherical network" or a "pyramid network."
  • The “pyramid network” resembles a pyramid scheme organization, with a layer-by-layer relationship structure, and is more likely to be fraudulent.
  • In a “spherical network”, the members are all related to one another; such a network may be a decentralized fraud organization.
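  • The specification does not say how a "pyramid network" or a "spherical network" is detected. As one possible heuristic, sketched below in Python, a connected group whose edge density is close to that of a complete graph is treated as spherical, and a connected group with exactly n - 1 edges (a tree, i.e. a strictly layered structure) is treated as a pyramid. The density threshold of 0.8 and all function names are assumptions.

```python
def from_edges(edges):
    """Undirected adjacency map built from (person, person) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def edge_count(graph):
    """Each undirected edge appears in two adjacency sets, so halve the total."""
    return sum(len(neighbors) for neighbors in graph.values()) // 2


def is_connected(graph):
    """Depth-first search from an arbitrary node; True if every node is reached."""
    if not graph:
        return False
    start = next(iter(graph))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == len(graph)


def classify_structure(graph, dense_threshold=0.8):
    """Return 'spherical', 'pyramid', or 'other' under the heuristic described above."""
    n = len(graph)
    if n < 3 or not is_connected(graph):
        return "other"
    edges = edge_count(graph)
    density = 2 * edges / (n * (n - 1))  # 1.0 for a complete graph
    if density >= dense_threshold:
        return "spherical"
    if edges == n - 1:  # connected with n - 1 edges means the graph is a tree
        return "pyramid"
    return "other"


everyone_linked = from_edges([("a", "b"), ("a", "c"), ("a", "d"),
                              ("b", "c"), ("b", "d"), ("c", "d")])
layered = from_edges([("boss", "a"), ("boss", "b"), ("a", "c"), ("a", "d")])
print(classify_structure(everyone_linked), classify_structure(layered))  # spherical pyramid
```

  • A production system would likely use richer structural features; this heuristic only makes the two named shapes concrete.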
  • The data processing method for insurance fraud identification provided by the embodiments of this specification mines relationship data with a relationship network algorithm, using a relationship network close to the actual one, and computes multi-degree relationship network data.
  • The multi-degree relationship network graph data of the group can be used to explore the relationship network between persons in greater depth and to improve the efficiency and scope of identification.
  • The method described above can be used for insurance fraud identification on the client side, such as in an anti-fraud application installed on a mobile terminal or an insurance service provided within a payment application.
  • the client can be a PC (personal computer), a server, an industrial computer (industrial control computer), a mobile smart phone, a tablet electronic device, a portable computer (such as a laptop computer, etc.), a personal digital assistant (PDA), or a desktop.
  • FIG. 3 is a block diagram showing the hardware structure of a server for insurance fraud identification according to an embodiment of this specification.
  • The server 10 may include one or more (only one is shown) processors 102 (a processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission module 106 for communication functions. It will be understood by those skilled in the art that the structure shown in FIG.
  • The server 10 may also include more or fewer components than those shown in FIG. 3, for example other processing hardware such as a database or multi-level cache, or may have a configuration different from that shown in FIG. 3.
  • The memory 104 can be used to store software programs and modules of application software, such as the program instructions/modules corresponding to the data processing method for insurance fraud identification in the embodiments of this specification. The processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 104,
  • that is, it implements the processing method described above.
  • The memory 104 may include high-speed random access memory, and may also include non-volatile memory such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory.
  • The memory 104 may further include memory located remotely from the processor 102, which may be connected to the server 10 via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • The transmission module 106 is configured to receive or transmit data via a network.
  • Specific examples of the network described above may include a wireless network provided by the communication provider of the server 10.
  • In one example, the transmission module 106 includes a Network Interface Controller (NIC) that can be connected to other network devices through a base station in order to communicate with the Internet.
  • In another example, the transmission module 106 can be a Radio Frequency (RF) module used to communicate with the Internet wirelessly.
  • the present specification also provides a data processing apparatus for insurance fraud identification.
  • The apparatus may include systems (including distributed systems), software (applications), modules, components, servers, clients, and the like that use the methods described in the embodiments of this specification, combined with the hardware necessary for implementation.
  • The processing apparatus in one embodiment provided by this specification is as described in the following embodiments.
  • Although the apparatus described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
  • FIG. 4 is a schematic diagram of the module structure of an embodiment of a data processing apparatus for insurance fraud identification provided by this specification, which may include:
  • the data acquisition module 101, which may be configured to acquire relationship association data of the group to be identified;
  • the feature calculation module 102, which may be configured to construct the multi-degree relationship network graph data of the group to be identified based on the relationship association data, and to extract the person characteristic data of the group to be identified;
  • the fraud identification module 103, which may be configured to process the multi-degree relationship network graph data and the person characteristic data of the group to be identified with the constructed supervised learning algorithm, and to determine the insurance fraud output result for the group to be identified;
  • wherein the supervised learning algorithm includes a data relationship model obtained by training on the multi-degree relationship network data and person characteristic data of a selected target group, with labeled historical fraudsters as sample data.
  • the relationship association data may include at least one of the following:
  • the fraud identification module 103 determines that the to-be-identified crowd fraud output output includes a probability of outputting a single target person to be identified as a fraudulent person or a fraudulent person.
  • the selected target population includes a collection of persons applying for claims and insured persons.
  • the person characteristic data includes feature data extracted by at least one of a user registration account number, transaction data, and behavior data associated with the insurance behavior.
  • FIG. 5 is another embodiment of the apparatus. As shown in FIG. 5, the fraud identification module 103 includes:
  • the feature learning module 1031 may be configured to use the selected supervised learning algorithm to perform first relationship network learning on the relationships between the target person and other persons in the multi-degree relationship network data of the target group, and to perform second self attribute learning based on the target person's own characteristic data;
  • the relationship establishing module 1032 may be configured to use the feature data obtained by the first relationship network learning and the second self attribute learning as the independent variables of the supervised learning algorithm, and to establish a relationship model with the labeled historical fraudsters as the dependent variable;
  • the model training module 1033 can be configured to determine the constructed supervised learning algorithm when the output of the relationship model reaches a preset accuracy rate.
  • after iterative training of its parameters, the model can be put into online use once the output accuracy requirement is met.
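The division of labor among modules 1031 to 1033 can be sketched as follows. This is a minimal illustration under stated assumptions, not the method claimed in the specification: logistic regression trained by batch gradient descent stands in for the unspecified supervised learning algorithm, and the two features, the toy samples, and the names `train_until_accurate`, `preset_accuracy`, and `max_epochs` are hypothetical.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(weights, bias, x):
    # Probability that a person with feature vector x is a fraudster.
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)

def accuracy(weights, bias, X, y):
    hits = sum((predict_proba(weights, bias, x) >= 0.5) == bool(t)
               for x, t in zip(X, y))
    return hits / len(y)

def train_until_accurate(X, y, preset_accuracy=0.9, lr=0.1, max_epochs=10000):
    """Iterate the parameters until the preset accuracy rate is reached.

    X holds the independent variables (network feature, self attribute),
    y holds the dependent variable (1 = labeled historical fraudster).
    Logistic regression is an assumption made for this sketch.
    """
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for epoch in range(1, max_epochs + 1):
        # Prediction errors for the whole batch, computed before updating.
        errors = [predict_proba(weights, bias, x) - t for x, t in zip(X, y)]
        for j in range(d):
            weights[j] -= lr * sum(e * x[j] for e, x in zip(errors, X)) / n
        bias -= lr * sum(errors) / n
        if accuracy(weights, bias, X, y) >= preset_accuracy:
            return weights, bias, epoch, True
    return weights, bias, max_epochs, False

# Hypothetical toy samples: (known fraudsters within two degrees, self risk score).
SAMPLES = [(0, 0.1), (1, 0.3), (0, 0.2), (3, 0.9), (4, 0.7), (5, 0.8)]
LABELS = [0, 0, 0, 1, 1, 1]

weights, bias, epochs, reached = train_until_accurate(SAMPLES, LABELS)
```

Here `reached` plays the role of the accuracy check performed by the model training module: only when it is true would the fitted parameters be treated as the constructed model, and `predict_proba` then yields a per-person fraud probability of the kind the fraud identification module outputs.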
  • the server or client provided by the embodiments of the present specification may be implemented by a processor executing corresponding program instructions in a computer, for example on a PC or server using C++ on a Windows operating system, or on another system such as Linux using a corresponding application design language together with the necessary hardware, or as processing logic based on a quantum computer. Accordingly, the present specification also provides a data processing device for insurance fraud identification, which may specifically include a processor and a memory for storing processor-executable instructions, the processor, when executing the instructions, implementing the following:
  • the supervised learning algorithm includes a data relationship model obtained by training with the multi-degree relationship network data and person characteristic data of the selected target population, with the labeled historical fraudsters as training data.
  • the above instructions may be stored in a variety of computer readable storage media.
  • the computer readable storage medium may include physical means for storing information, which may be digitized and stored in a medium utilizing electrical, magnetic or optical means.
  • the computer readable storage medium of this embodiment may include: means that store information using electrical energy, such as various types of memory (RAM, ROM, etc.); means that store information using magnetic energy, such as hard disks, floppy disks, magnetic tape, magnetic core memory, bubble memory, and U disks; and means that store information optically, such as CDs or DVDs.
  • other forms of readable storage media are also possible, such as quantum memories, graphene memories, and the like.
  • the processing device may specifically be an insurance anti-fraud identification server provided for an insurer's server or a third-party service organization. The server may be a standalone server, a server cluster, a distributed system server, or a combination of a server that handles data requests with other system servers that perform the data processing.
  • Accordingly, embodiments of the present specification also provide a specific server product, the server including at least one processor and a memory for storing processor-executable instructions, the processor, when executing the instructions, implementing the following:
  • the supervised learning algorithm includes a data relationship model obtained by training with the multi-degree relationship network data and person characteristic data of the selected target population, with the labeled historical fraudsters as training data.
  • the apparatus, the processing device, and the server described in the foregoing embodiments of the present specification may further include other embodiments according to the description of the related method embodiments.
  • although the embodiments of the present specification describe operations such as data acquisition, storage, interaction, calculation, and judgment, including the types of relationship association data collected, the range of the target population selected during training, and the way the probability of insurance fraud is calculated, the embodiments are not limited to situations that must conform to industry communication standards, standard supervised or unsupervised model processing, communication protocols, and standard data models/templates, or to the situations described in the embodiments of this specification.
  • Implementations that slightly modify certain industry standards, or that modify the embodiments described above in a custom manner, may also achieve effects that are the same as, equivalent to, or similar to those of the above-described embodiments, or that are predictable after such variation.
  • Embodiments obtained by applying such modified or varied data acquisition, storage, judgment, and processing methods may still fall within the scope of alternative embodiments of the present specification.
  • Programmable Logic Device (PLD)
  • Field Programmable Gate Array (FPGA)
  • Hardware Description Language (HDL)
  • the controller can be implemented in any suitable manner; for example, it can take the form of a microprocessor or processor together with a computer readable medium storing computer readable program code (e.g., software or firmware) executable by the (micro)processor.
  • examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller can also be implemented as part of the memory's control logic.
  • by logically programming the method steps, the controller can implement the same functions in the form of logic gates, switches, application-specific integrated circuits (ASICs), programmable logic controllers, embedded microcontrollers, and the like.
  • Such a controller can therefore be considered a hardware component, and the means for implementing various functions included therein can also be considered as a structure within the hardware component.
  • alternatively, the means for implementing various functions can be regarded both as software modules that implement the method and as structures within a hardware component.
  • the processing device, device, module or unit set forth in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function.
  • a typical implementation device is a computer.
  • the computer can be, for example, a personal computer, a laptop computer, a car-mounted human-machine interaction device, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
  • the above devices are described as being separately divided into various modules by function.
  • the functions of the modules may be implemented in one or more pieces of software and/or hardware, or a module that implements a given function may be implemented by a combination of multiple sub-modules or sub-units.
  • the device embodiments described above are merely illustrative.
  • for example, the division into units is only a division by logical function.
  • in actual implementation there may be other ways of dividing them; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the computer program instructions can also be stored in a computer readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture including an instruction apparatus, which implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, so that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • the memory may include non-persistent memory, random access memory (RAM), and/or non-volatile memory in a computer readable medium, such as read only memory (ROM) or flash memory.
  • Memory is an example of a computer readable medium.
  • Computer readable media include both persistent and non-persistent, removable and non-removable media.
  • Information storage can be implemented by any method or technology.
  • the information can be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disk read only memory (CD-ROM), digital versatile disk (DVD) or other optical storage, magnetic tape cartridges, magnetic tape storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
  • as defined herein, computer readable media do not include transitory computer readable media, such as modulated data signals and carrier waves.
  • embodiments of the present specification can be provided as a method, system, or computer program product.
  • embodiments of the present specification can take the form of an entirely hardware embodiment, an entirely software embodiment or a combination of software and hardware.
  • embodiments of the present specification can take the form of a computer program product embodied on one or more computer usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code.
  • Embodiments of the present description can be described in the general context of computer-executable instructions executed by a computer, such as a program module.
  • program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types.
  • Embodiments of the present specification can also be practiced in distributed computing environments where tasks are performed by remote processing devices that are connected through a communication network.
  • program modules can be located in both local and remote computer storage media including storage devices.
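As one hedged illustration of such program modules, the feature calculation described earlier for module 102 could be organized as small functions over an adjacency map. The sketch below assumes that relationship association data arrives as undirected person pairs and that "multi-degree" means hop distance in the resulting graph; the function names, the two-degree default, and the sample identifiers are all hypothetical and do not come from the specification.

```python
from collections import defaultdict, deque

def build_relationship_graph(pairs):
    # Each pair is one piece of relationship association data (assumed undirected).
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def persons_within_degrees(graph, person, max_degree):
    """Breadth-first search returning {other_person: degree} up to max_degree hops."""
    degrees = {person: 0}
    queue = deque([person])
    while queue:
        current = queue.popleft()
        if degrees[current] == max_degree:
            continue  # do not expand beyond the requested degree
        for neighbour in graph.get(current, ()):
            if neighbour not in degrees:
                degrees[neighbour] = degrees[current] + 1
                queue.append(neighbour)
    del degrees[person]  # the person is not their own relation
    return degrees

def network_features(graph, person, known_fraudsters, max_degree=2):
    # Hypothetical relationship-network features for one person.
    reachable = persons_within_degrees(graph, person, max_degree)
    fraud_hits = [p for p in reachable if p in known_fraudsters]
    return {
        "reachable_persons": len(reachable),
        "known_fraudsters_within_range": len(fraud_hits),
    }

# Hypothetical sample data: a chain of four related persons.
PAIRS = [
    ("applicant_a", "insured_b"),
    ("insured_b", "payee_c"),
    ("payee_c", "claimant_d"),
]
GRAPH = build_relationship_graph(PAIRS)
```

Calling `network_features(GRAPH, "applicant_a", {"payee_c", "claimant_d"})` counts only `payee_c` as a known fraudster, because `claimant_d` is three degrees away and falls outside the two-degree default. Features of this kind could be combined with a person's own characteristic data before being passed to the supervised model.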

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Data Mining & Analysis (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Engineering & Computer Science (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

Disclosed are a data processing method, apparatus, and device for insurance fraud identification, and a server. Multi-degree relationship network graph data of a group is constructed from multi-degree relationship association data of policyholders and insured persons, so that the relationship network among persons can be mined more deeply, identification efficiency is improved, and the identification range is widened. At the same time, a supervised learning model is built jointly from characteristic data of fraudsters and is used to learn the relationship network features and personal features of fraudsters. Colluding fraudsters show evident multi-degree relationship features in the relationship network, and the features of fraudsters frequently show similarities, so fraudsters can be identified more effectively with the disclosed method, and identification processing efficiency is improved.
PCT/CN2019/074097 2018-04-12 2019-01-31 Data processing method, apparatus, and device for insurance fraud identification, and server WO2019196552A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810327069.3 2018-04-12
CN201810327069.3A CN108334647A (zh) 2018-04-12 2018-04-12 Data processing method, apparatus, device, and server for insurance fraud identification

Publications (1)

Publication Number Publication Date
WO2019196552A1 true WO2019196552A1 (fr) 2019-10-17

Family

ID=62934055

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/074097 WO2019196552A1 (fr) 2018-04-12 2019-01-31 Data processing method, apparatus, and device for insurance fraud identification, and server

Country Status (3)

Country Link
CN (1) CN108334647A (fr)
TW (1) TWI686760B (fr)
WO (1) WO2019196552A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4113422A3 (fr) * 2021-12-08 2023-05-17 Beijing Baidu Netcom Science Technology Co., Ltd. Method and apparatus for remote vehicle damage assessment, electronic device, and medium

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
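CN108334647A (zh) * 2018-04-12 2018-07-27 阿里巴巴集团控股有限公司 Data processing method, apparatus, device, and server for insurance fraud identification
CN109087145A (zh) * 2018-08-13 2018-12-25 阿里巴巴集团控股有限公司 Target population mining method, apparatus, server, and readable storage medium
CN109325525A (zh) * 2018-08-31 2019-02-12 阿里巴巴集团控股有限公司 Sample attribute evaluation model training method, apparatus, and server
CN109447658A (zh) * 2018-09-10 2019-03-08 平安科技(深圳)有限公司 Method, apparatus, device, and storage medium for generating and applying an anti-fraud model
CN109657890B (zh) * 2018-09-14 2023-04-25 蚂蚁金服(杭州)网络技术有限公司 Method and apparatus for determining transfer fraud risk
CN109614496B (zh) * 2018-09-27 2022-06-17 长威信息科技发展股份有限公司 Knowledge-graph-based method for verifying subsistence allowance eligibility
CN109509106A (zh) * 2018-10-30 2019-03-22 平安科技(深圳)有限公司 Unit type determination method and related products
CN109544379A (zh) * 2018-10-30 2019-03-29 平安科技(深圳)有限公司 Unit type determination method and related products
CN109801176B (zh) * 2019-02-22 2021-04-06 中科软科技股份有限公司 Method, system, electronic device, and storage medium for identifying insurance fraud
CN110264371B (zh) * 2019-05-10 2024-03-08 创新先进技术有限公司 Information display method, apparatus, computing device, and computer-readable storage medium
CN110428337B (zh) * 2019-06-14 2023-01-20 南京极谷人工智能有限公司 Method and apparatus for identifying auto insurance fraud gangs
CN110363406A (zh) * 2019-06-27 2019-10-22 上海淇馥信息技术有限公司 Method, apparatus, and electronic device for assessing customer intermediary risk
CN110580260B (zh) * 2019-08-07 2023-05-26 北京明智和术科技有限公司 Data mining method and apparatus for specific groups
CN111415241A (zh) * 2020-02-29 2020-07-14 深圳壹账通智能科技有限公司 Fraudster identification method, apparatus, device, and storage medium
CN112419074A (zh) * 2020-11-13 2021-02-26 中保车服科技服务股份有限公司 Method and apparatus for identifying auto insurance fraud gangs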

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
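CN107066616A (zh) * 2017-05-09 2017-08-18 北京京东金融科技控股有限公司 Method, apparatus, and electronic device for account processing
CN107194623A (zh) * 2017-07-20 2017-09-22 深圳市分期乐网络科技有限公司 Method and apparatus for discovering gang fraud
CN107403326A (zh) * 2017-08-14 2017-11-28 云数信息科技(深圳)有限公司 Insurance fraud identification method and apparatus based on telecom data
CN107644098A (zh) * 2017-09-29 2018-01-30 马上消费金融股份有限公司 Fraud behavior identification method, apparatus, device, and storage medium
CN108334647A (zh) * 2018-04-12 2018-07-27 阿里巴巴集团控股有限公司 Data processing method, apparatus, device, and server for insurance fraud identification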

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7813944B1 (en) * 1999-08-12 2010-10-12 Fair Isaac Corporation Detection of insurance premium fraud or abuse using a predictive software system
JP2003529160A (ja) * 2000-03-24 2003-09-30 アクセス ビジネス グループ インターナショナル リミテッド ライアビリティ カンパニー System and method for detecting fraudulent transactions
CN105095238B (zh) * 2014-05-04 2019-01-18 中国银联股份有限公司 Decision tree generation method for detecting fraudulent transactions
WO2015187372A1 (fr) * 2014-06-02 2015-12-10 Yottamine Analytics, Llc Event profile filters
WO2016210122A1 (fr) * 2015-06-24 2016-12-29 IGATE Global Solutions Ltd. Insurance fraud prevention and detection system
CN106600413A (zh) * 2015-10-19 2017-04-26 阿里巴巴集团控股有限公司 Fraud identification method and system
CN106600423A (zh) * 2016-11-18 2017-04-26 云数信息科技(深圳)有限公司 Machine-learning-based auto insurance data processing method, auto insurance fraud identification method, and apparatus
CN106803168B (zh) * 2016-12-30 2021-04-16 中国银联股份有限公司 Abnormal transfer detection method and apparatus
CN107145587A (zh) * 2017-05-11 2017-09-08 成都四方伟业软件股份有限公司 Medical insurance anti-fraud system based on big data mining
CN107785058A (zh) * 2017-07-24 2018-03-09 平安科技(深圳)有限公司 Anti-fraud identification method, storage medium, and server hosting Ping An Brain
CN107730262B (zh) * 2017-10-23 2021-09-24 创新先进技术有限公司 Fraud identification method and apparatus
CN107819747B (zh) * 2017-10-26 2020-09-18 上海欣方智能系统有限公司 Telecom fraud association analysis system and method based on communication event sequences

Also Published As

Publication number Publication date
TW201944338A (zh) 2019-11-16
TWI686760B (zh) 2020-03-01
CN108334647A (zh) 2018-07-27

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19784644

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19784644

Country of ref document: EP

Kind code of ref document: A1