CN107785058A - Anti-fraud recognition method, storage medium, and server carrying the safety brain

Publication number
CN107785058A
Authority
CN
China
Prior art keywords
model
data
fraud
feature
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710605531.7A
Other languages
Chinese (zh)
Inventor
肖京
王健宗
王建明
徐亮
汪伟
周宝
李想
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201710605531.7A (publication CN107785058A)
Priority to PCT/CN2018/077230 (publication WO2019019630A1)
Publication of CN107785058A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/186 Templates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24564 Applying rules; Deductive queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an anti-fraud recognition method that addresses the insufficient anti-fraud capability of the medical field. The method provided by the invention includes: determining a target event; extracting target data related to the target event; and processing the target data using at least two of the following: a method for building a decision model, a method for identifying fraud data, and a method for identifying fraudulent behavior. The invention also provides a storage medium and a server carrying the safety brain.

Description

Anti-fraud recognition method, storage medium, and server carrying the safety brain
Technical field
The present invention relates to the medical field, and in particular to an anti-fraud recognition method, a storage medium, and a server carrying the safety brain.
Background
Fraudulent behavior is currently common in the medical field, for example "medicine mouse" behavior, claims fraud, and illegal card-swiping for reimbursement. Such behavior wastes medical resources and aggravates social conflict.
However, no complete method currently exists for identifying these fraudulent behaviors, so the anti-fraud capability of the medical field is insufficient and fraud is difficult to control. Finding an anti-fraud method that further improves the anti-fraud capability of the medical field has therefore become an urgent problem for those skilled in the art.
Summary of the invention
Embodiments of the present invention provide an anti-fraud recognition method, a storage medium, and a server carrying the safety brain, which enable more comprehensive and complete anti-fraud decision-making and identification for events in the medical field.
In a first aspect, an anti-fraud recognition method is provided, including:
determining a target event;
extracting target data related to the target event;
processing the target data using at least two of the following: a method for building a decision model, a method for identifying fraud data, and a method for identifying fraudulent behavior;
The method for building the decision model includes:
obtaining rule template data, and extracting each variable object and each template sample from the rule template data;
performing cluster analysis on the variable objects to obtain a cluster result;
matching the cluster result with each template sample according to the rule template data, and using the matched cluster result as a first feature;
calculating the black sample probability of each variable object, and using the black sample probability of each variable object as a second feature;
building the decision model from the first feature and the second feature;
The method for identifying fraud data includes:
training on a preset training data set using a preset continuous model training method to establish a continuous anti-fraud model;
training on the data to be tested based on the continuous anti-fraud model to identify the fraud data in the data to be tested;
The method for identifying fraudulent behavior includes:
establishing a relational network of doctor-patient and medicine-diagnosis relations based on social security medical-visit data, wherein the relational network includes a plurality of nodes, and the nodes are linked by different relations;
analyzing the group medical-visit behavior of each node in the relational network to extract the multi-dimensional group medical-visit features corresponding to each node;
inputting the extracted multi-dimensional group medical-visit features into a preset classification model, so as to identify the fraud probability of each node according to the classification model;
wherein the target data includes at least two of data to be tested, rule template data, and social security medical-visit data.
In a second aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the above anti-fraud recognition method.
In a third aspect, a server carrying the safety brain big data platform is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the above anti-fraud recognition method when executing the computer program.
As can be seen from the above technical solutions, the embodiments of the present invention have the following advantages:
In the embodiments of the present invention, the target data of a target event is processed using at least two of the method for building a decision model, the method for identifying fraud data, and the method for identifying fraudulent behavior, which enables more comprehensive and complete anti-fraud decision-making and identification for events in the medical field.
When the rule template data in the target data is processed using the method for building a decision model, each variable object and each template sample is extracted from the rule template data; cluster analysis is performed on the variable objects to obtain a cluster result; the cluster result is matched with each template sample according to the rule template data, and the matched cluster result is used as a first feature; the black sample probability of each variable object is calculated and used as a second feature; and the decision model is built from the first feature and the second feature. Performing cluster analysis on the variable objects reduces the dimensions and levels of the data involved, which facilitates building the decision model and reduces the impact on model performance. In addition, building the decision model from the first feature and the second feature makes the model more accurate, helps to quickly process business that requires complex rule auditing, and improves decision efficiency.
When the data to be tested in the target data is processed using the method for identifying fraud data, a continuous anti-fraud model is established using a preset continuous model training method, and the data to be tested is trained on with the established continuous anti-fraud model to identify the fraud data in it. Because the fraud data in the data to be tested is imbalanced data, analyzing and identifying it with the continuous anti-fraud model improves the precision and recall of fraud data identification compared with an ordinary single model and judges fraud cases more accurately, thereby reducing the scope and cost of manual inspection.
When the social security medical-visit data in the target data is processed using the method for identifying fraudulent behavior, a relational network of doctor-patient and medicine-diagnosis relations is first established based on the social security medical-visit data; the group medical-visit behavior of each node in the relational network is then analyzed to extract multi-dimensional group medical-visit features; finally, the extracted features are input into a preset classification model to identify the fraud probability of each node according to the classification model. Social security fraud can thus be identified from multiple dimensions and angles, with higher accuracy than conventional single-rule identification.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of an anti-fraud recognition method in an embodiment of the present invention;
Fig. 2 is a schematic flowchart of the method for building a decision model in one embodiment;
Fig. 3 is a schematic flowchart of the method for building a decision model in another embodiment;
Fig. 4 is a schematic flowchart of how the decision model is built in one embodiment;
Fig. 5 is a schematic flowchart of performing cluster analysis on variable objects in one embodiment;
Fig. 6 is a schematic structural diagram of the apparatus for building a decision model in one embodiment;
Fig. 7 is a schematic flowchart of a first embodiment of the method for identifying fraud data of the present invention;
Fig. 8 is a schematic flowchart of a second embodiment of the method for identifying fraud data of the present invention;
Fig. 9 is a functional block diagram of a first embodiment of the apparatus for identifying fraud data of the present invention;
Fig. 10 is a schematic flowchart of a first embodiment of the method for identifying social security fraud of the present invention;
Fig. 11 is a detailed flowchart of step S10 in Fig. 10;
Fig. 12 is a detailed flowchart of step S30 in Fig. 10;
Fig. 13 is a schematic flowchart of a second embodiment of the method for identifying social security fraud of the present invention;
Fig. 14 is a functional block diagram of a first embodiment of the apparatus for identifying social security fraud of the present invention;
Fig. 15 is a schematic diagram of a preferred relational network of the present invention;
Fig. 16 is a schematic diagram of the server carrying the safety brain big data platform provided by an embodiment of the present invention.
Detailed description
Embodiments of the present invention provide an anti-fraud recognition method, a storage medium, and a server carrying the safety brain, for more comprehensive and complete anti-fraud decision-making and identification for events in the medical field.
The safety brain big data platform provided by the present invention draws on the group's big data resources in financial and non-financial fields and, combined with its own technical and platform advantages, uses internationally leading big data techniques such as data mining, machine learning, and deep learning to classify and manage all kinds of structured and unstructured data in a fine-grained manner and to mine the value of the data.
Services are provided to clients mainly as analysis reports, standardized data products, and customized data products or services. The models provided to users include a marketing model, a continuous anti-fraud model, a risk early-warning model, a relational network model, a credit rating model, a vehicle insurance fraud identification model, a vehicle insurance pricing model, a guarantee value of stocks model, a risk salesperson early-warning model, an autoregressive moving-average model, and so on. The platform offers products or functions such as a recommendation engine, early-warning signal reminders, "medicine mouse" identification, a salesperson public opinion monitoring system, an enterprise public opinion monitoring system, medical insurance fund risk control, and an intelligent service robot platform. Its application scenarios cover precision marketing, risk control, fraud identification, operations optimization, intelligent finance, intelligent health, business intelligence, robotics, and other fields.
Among these, the application of the safety brain big data platform in the anti-fraud field is especially prominent. The market has many risk control models and anti-fraud models. Common models use maximum accuracy as the criterion; when fraudulent transactions are rare relative to normal transactions, such a model is biased toward making as many correct predictions as possible, which makes fraudulent transactions harder to surface. The precision and recall of such models are not high. The present invention includes methods and models such as the method for building a decision model applicable to complex rule auditing, the method for identifying fraud data, and the method for identifying fraudulent behavior; applied to products such as early-warning signal reminders and "medicine mouse" identification, they can effectively improve anti-fraud efficiency.
Several related products that apply the safety brain big data platform in the anti-fraud field are as follows:
"Medicine mouse" identification: With many thousands of medical bills submitted for reimbursement, it is difficult to screen them one by one with limited human resources. Exploiting the advantages of computers and data mining, an anti-fraud model can effectively assist manual review, predict anomalies, and provide the necessary basis for screening. Using the continuous anti-fraud model can significantly improve the precision and recall of the model.
Recommendation engine: identifies the default risk of users purchasing loan products and reduces the bad-debt rate.
Early-warning signal reminders: identify enterprise credit risk and move risk handling and intervention earlier, reducing the losses caused by default.
Salesperson public opinion monitoring system: analyzes the content of salespersons' social networks to identify whether a salesperson is engaging in violations or customer fraud, such as selling non-compliant financial products or non-Ping An products, so that the risk can be avoided.
Enterprise public opinion monitoring system: analyzes enterprise public information and financial news through natural language processing to monitor whether an enterprise shows anomalies such as violations or unusual executive changes, and assists industry research and post-loan management bodies in issuing risk warnings on investment targets.
Medical insurance fund risk control: quantitatively evaluates the quality and effect of medical care, helps the government reduce unreasonable expenditure of the medical insurance fund, and supports the government in formulating risk control policies.
Vehicle insurance claims: identifies vehicle insurance fraud.
Intelligent service robot platform: serves as a terminal of the safety brain early-warning system. It can use its own output devices, such as a display or voice, or interconnect with an existing early-warning system, to realize multi-dimensional early warning and feed back the computation results of the safety brain in real time, avoiding fraud to the greatest extent possible.
It can be seen that the safety brain big data platform is very widely applied in the anti-fraud field. The present invention describes the safety brain big data platform mainly from the perspective of anti-fraud in the medical field. The anti-fraud recognition method provided by the present invention enables comprehensive medical anti-fraud identification from three angles, namely documents, individuals, and groups, and remedies the current insufficiency of anti-fraud capability in the medical field.
To make the objectives, features, and advantages of the present invention clearer and easier to understand, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The embodiments described below are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
As shown in Fig. 1, an anti-fraud recognition method provided by the present invention includes: 101, determining a target event; 102, extracting target data related to the target event; 103, processing the target data using at least two of the following: the method for building a decision model, the method for identifying fraud data, and the method for identifying fraudulent behavior. The target data includes at least two of data to be tested, rule template data, and social security medical-visit data.
For ease of understanding and description, the method for building a decision model, the method for identifying fraud data, and the method for identifying fraudulent behavior are each described in detail below through several different embodiments.
Embodiment one:
Referring to Fig. 2, one embodiment of a method for building a decision model in the embodiments of the present invention includes:
Step S110: obtain rule template data, and extract each variable object and each template sample from the rule template data.
Specifically, a rule template is a set of standards used to help determine an audit result. Auditing one document or project may involve one or more rule templates. For example, auditing a borrower's credit rating may involve rule templates such as "at which branch did the borrower take out a loan" and "at which institutions did the borrower have a bad record". Each rule template has its own rule template data, which may include the variable objects, the template samples, and the matching relationships between variable objects and template samples. A variable object is a qualitative variable, and each variable object corresponds to a different category in the rule template. For example, if the rule template is "at which branch did the borrower take out a loan", the corresponding rule template data may include: user 1 took out a loan at A branch, user 2 took out a loan at B branch, user 3 took out a loan at C branch, and so on. Here A branch, B branch, C branch, and so on are the variable objects, and user 1, user 2, user 3, and so on are the template samples.
Step S120: perform cluster analysis on the variable objects to obtain a cluster result.
Specifically, multidimensional data can be extracted for each variable object, and cluster analysis performed on the variable objects according to that data. Multidimensional data is the data for each dimension related to a variable object; for example, if the variable objects are branches, the multidimensional data may include each branch's total number of loans, total loan amount, average loan period, branch size, geographic location, and so on. Cluster analysis is the process of grouping a set of physical or abstract objects into multiple classes of similar objects. Performing cluster analysis on the variable objects groups similar variable objects together and reduces the number of levels of the variable. For example, suppose the variable objects are A branch, B branch, C branch, D branch, and so on. After cluster analysis, A branch and B branch are found to be similar and are assigned to group A, C branch and D branch are found to be similar and are assigned to group B, and so on, so the variable is reduced from the level of individual branches to the level of groups. The cluster analysis yields a cluster result made up of the individual clusters.
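The grouping of branches described in step S120 can be sketched as follows. This is only a minimal illustration: the text at this point does not name a clustering algorithm, so k-means is an assumption, and the feature values, the initial centroids, and the names `kmeans`, `branches`, and `cluster_of` are all hypothetical.

```python
import math

def kmeans(points, initial_centroids, iterations=10):
    """Cluster points (tuples of floats) and return one cluster index per point."""
    centroids = list(initial_centroids)
    assignments = []
    for _ in range(iterations):
        # Assign every point to its nearest centroid (Euclidean distance).
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                centroids[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments

# Hypothetical, already-normalized multidimensional data per variable object:
# (total number of loans, average loan period).
branches = {
    "A branch": (0.9, 0.8),
    "B branch": (1.0, 0.9),
    "C branch": (0.1, 0.2),
    "D branch": (0.2, 0.1),
}
names = list(branches)
assignments = kmeans([branches[n] for n in names], [(1.0, 1.0), (0.0, 0.0)])
group_names = ["group A", "group B"]
cluster_of = {n: group_names[a] for n, a in zip(names, assignments)}
print(cluster_of)
```

With these illustrative values, A branch and B branch fall into one group and C branch and D branch into the other, so four branch-level categories are reduced to two group-level categories.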
Step S130: match the cluster result with each template sample according to the rule template data, and use the matched cluster result as the first feature.
Specifically, after cluster analysis is performed on the variable objects and the cluster result is obtained, the cluster result can be matched with each template sample according to the matching relationships between variable objects and template samples in the rule template data. For example, suppose the rule template is "at which institutions did the borrower have a bad record", and the rule template data includes: user 1 had a bad record at FK institution, user 2 had a bad record at CE institution, user 3 had a bad record at KD institution, and so on. Cluster analysis is performed on the variable objects FK institution, CE institution, KD institution, and so on, yielding clusters named group A, group B, group C, and so on, and the cluster result is matched with the template samples user 1, user 2, user 3, and so on. As shown in the tables below, Table 1 gives the matching relationships between variable objects and template samples in the rule template data, and Table 2 gives the matching relationships between the cluster result and each template sample. A "1" may be used to indicate that a template sample matches a variable object or a cluster, but the representation is not limited to this.
Table 1
Table 2
Performing cluster analysis on the variable objects clearly reduces the number of levels of the variable, which facilitates modeling.
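The matching in step S130 can be sketched as a lookup followed by a 0/1 encoding, in the spirit of the "1" markers described above. This is a minimal sketch: the assignment of institutions to groups is a hypothetical cluster result, and the function name `first_feature` is an illustrative choice, not part of the patent.

```python
# Matching relationships from the rule template data: template sample -> variable object.
sample_to_object = {
    "user 1": "FK institution",
    "user 2": "CE institution",
    "user 3": "KD institution",
}
# Hypothetical cluster result: variable object -> cluster name.
object_to_cluster = {
    "FK institution": "group A",
    "CE institution": "group A",
    "KD institution": "group B",
}

def first_feature(sample_to_object, object_to_cluster):
    """Return, per template sample, a 0/1 indicator for every cluster."""
    clusters = sorted(set(object_to_cluster.values()))
    rows = {}
    for sample, variable_object in sample_to_object.items():
        matched = object_to_cluster[variable_object]
        # "1" marks the cluster that the sample's variable object belongs to.
        rows[sample] = {c: int(c == matched) for c in clusters}
    return rows

feature = first_feature(sample_to_object, object_to_cluster)
for sample, row in feature.items():
    print(sample, row)
```

Each template sample now carries one indicator per cluster rather than one per institution, which is where the reduction in levels comes from.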
Step S140: calculate the black sample probability of each variable object, and use the black sample probability of each variable object as the second feature.
Specifically, the output of a decision model is usually either a black sample or a white sample. A black sample is a sample that fails the audit, and a white sample is a sample that passes the audit. For example, if the decision model is used to assess eligibility for a bank loan, black samples are users who fail the loan eligibility check and white samples are users who pass it. Calculating the black sample probability of each variable object means determining, for each variable object in the rule template data, the proportion of its template samples whose result type is black sample. For example, if the rule template is "at which institutions did the borrower have a bad record", one can calculate "among the users who had a bad record at KD institution, what is the probability that the final result is a black sample", and so on. The black sample probability of a variable object can be calculated as: black sample probability = number of black samples of the variable object / total number of template samples of the variable object. The calculated black sample probability of each variable object can be used as the second feature in the form of a continuous variable. In other embodiments, the WOE (weight of evidence) value of each variable object can also be calculated, using the formula WOE = ln((number of black samples of the variable object / total number of black samples) / (number of white samples of the variable object / total number of white samples)). With this formula, a higher WOE value indicates a higher probability that the template samples of the variable object are black samples.
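The two formulas in step S140 can be worked through on a small example. The sketch below follows the formulas as written in the text; the sample counts are invented for illustration, and returning `None` for a WOE value whose black or white count is zero is an assumption of this sketch, since the text does not say how that case is handled.

```python
import math
from collections import Counter

def black_probability_and_woe(records):
    """records: iterable of (variable_object, is_black) pairs, one per template sample."""
    black = Counter()
    white = Counter()
    for variable_object, is_black in records:
        (black if is_black else white)[variable_object] += 1
    total_black = sum(black.values())
    total_white = sum(white.values())
    stats = {}
    for obj in set(black) | set(white):
        # Black sample probability = black samples / all template samples of the object.
        probability = black[obj] / (black[obj] + white[obj])
        # WOE = ln(share of all black samples / share of all white samples).
        if black[obj] and white[obj]:
            woe = math.log((black[obj] / total_black) / (white[obj] / total_white))
        else:
            woe = None  # assumption: undefined when either count is zero
        stats[obj] = (probability, woe)
    return stats

# Hypothetical template samples: KD has 1 black and 4 white, FK has 3 black and 1 white.
records = (
    [("KD institution", True)] + [("KD institution", False)] * 4
    + [("FK institution", True)] * 3 + [("FK institution", False)]
)
stats = black_probability_and_woe(records)
for obj, (probability, woe) in sorted(stats.items()):
    print(obj, round(probability, 3), round(woe, 3))
```

Either value gives one continuous number per variable object, so a categorical variable with many institutions becomes a single numeric column for the model.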
Step S150: build the decision model from the first feature and the second feature.
Specifically, the current practice is to build a decision model by feeding in all rule template data. The rule template data is voluminous and has many complex levels, which hinders modeling and degrades model performance. Using the matched cluster result as the first feature and the black sample probability of each variable object as the second feature, in place of the original rule template data, as the input for building the decision model not only reduces the number of levels in the data but also preserves the influence of each variable object on the decision result, making the decision result more accurate. The decision model may be a machine learning model such as a decision tree, a GBDT (Gradient Boosting Decision Tree) model, or an LDA (Linear Discriminant Analysis) model. Building the audit decision model for a document or project may involve one or more rule templates; in that case the first feature and the second feature corresponding to each rule template are obtained separately and used in place of the original rule template data as the model input. When a rule template has few variable objects, its rule template data can be input directly to build the model.
In the above method for building a decision model, each variable object and each template sample is extracted from the rule template data; cluster analysis is performed on the variable objects to obtain a cluster result; the cluster result is matched with each template sample according to the rule template data, and the matched cluster result is used as the first feature; the black sample probability of each variable object is calculated and used as the second feature; and the decision model is built from the first feature and the second feature. Performing cluster analysis on the variable objects reduces the dimensions and levels of the data involved, which facilitates building the decision model and reduces the impact on model performance. In addition, building the decision model from the first feature and the second feature makes the model more accurate, helps to quickly process business that requires complex rule auditing, and improves decision efficiency.
As shown in Fig. 3, in one embodiment, the above method for building a decision model further includes:
Step S210: map each variable object to a predefined label according to a preset algorithm.
Specifically, labels can be defined in advance and the variable objects mapped to them. The preset algorithm may be a hash function such as MD5 (Message-Digest Algorithm 5) or SHA (Secure Hash Algorithm), but is not limited to these. For example, if the variable objects are A branch, B branch, C branch, and so on, the SHA algorithm might map A branch and C branch to label A, B branch to label K, and so on. The number of labels can be set according to the actual situation so that no single label covers too many variable objects. This reduces the dimensions and levels of the data involved while retaining part of the original information.
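The label mapping in step S210 can be sketched with a standard hash function. This is a minimal sketch, assuming SHA-256 as the member of the SHA family and assuming the digest is reduced modulo the number of labels; the text only says that a hash function such as MD5 or SHA may be used. The function name `map_to_label` and the numbered label names are hypothetical.

```python
import hashlib

def map_to_label(variable_object, num_labels):
    """Map a variable object name to one of num_labels predefined labels."""
    digest = hashlib.sha256(variable_object.encode("utf-8")).digest()
    # Interpret the digest as an integer and reduce it to a label index.
    index = int.from_bytes(digest, "big") % num_labels
    return f"label {index}"

variable_objects = ["A branch", "B branch", "C branch", "D branch", "E branch"]
labels = {obj: map_to_label(obj, num_labels=3) for obj in variable_objects}
for obj, label in labels.items():
    print(obj, "->", label)
```

The mapping is deterministic, so the same variable object always lands under the same label, and choosing `num_labels` controls how many variable objects share a label on average.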
Step S220: match the labels with each template sample according to the rule template data, and use the matched labels as the third feature.
Specifically, the labels can be matched with each template sample according to the matching relationships between variable objects and template samples in the rule template data, and the matched labels used as the third feature for modeling.
Step S230: build the decision model from the first feature, the second feature, and the third feature.
Specifically, the matched cluster result is used as the first feature, the black sample probability of each variable object as the second feature, and the matched labels as the third feature. Using the first feature, the second feature, and the third feature in place of all rule template data as the input for building the decision model not only reduces the number of levels in the data but also preserves the influence of each variable object on the decision result, making the decision result more accurate.
In the above method for building a decision model, the decision model is built from the first feature, the second feature, and the third feature. Performing cluster analysis on the variable objects and mapping them to predefined labels reduces the dimensions and levels of the data involved, which facilitates building the decision model and reduces the impact on model performance. It also makes the model more accurate, helps to quickly process business that requires complex rule auditing, and improves decision efficiency.
As shown in Fig. 4, in one embodiment, step S230 of building the decision model from the first feature, the second feature and the third feature comprises the following steps:
Step S302: establish a root node.
Specifically, in this embodiment the decision model may be a decision tree model, and the root node of the decision tree is established first.
Step S304: obtain the result type of each template sample from the rule template data.
Specifically, the result type of a template sample is its final outcome, such as black sample or white sample. The result type of each template sample can be obtained from the rule template data.
Step S306: traverse and read the first feature, the second feature and the third feature respectively, and generate reading records.
Specifically, the first feature, the second feature and the third feature are each traversed and read to generate reading records; that is, every possible decision tree branch is traversed. For example, traversing the first feature generates reading records such as "user 1 has a bad-loan record in group A", "user 2 has a bad-loan record in group A", and so on; traversing the second feature generates reading records such as "the black-sample probability of institution FK is 20%", "the black-sample probability of institution CE is 15%", and so on. Each reading record may become a branch of the decision tree.
Step S308: calculate the split purity of each reading record according to the result type of each template sample, and determine the split point according to the split purity.
Specifically, the split purity of each reading record can be determined by calculating the Gini impurity, entropy, information gain and the like. The Gini impurity is the expected error rate when one of the results in a set is randomly applied to a data item in that set; entropy measures the degree of disorder of a system; and information gain measures the ability of a reading record to distinguish template samples. Calculating the split purity of a reading record can be understood as asking how much the predicted result types would differ from the actual result types if the template samples were divided by that reading record. The smaller the difference, the greater the split purity and the purer the reading record. For example, the Gini impurity in its standard form can be calculated as:
$$\text{Gini impurity} = 1 - \sum_{i=1}^{m} P(i)^{2}$$
The split purity is then 1 - Gini impurity, where i ∈ {1, 2, ..., m} indexes the m final results of the decision model, and P(i) is the proportion of template samples whose result type is the i-th final result when the reading record is used as the judgment condition.
The optimal split point can be determined from the split purity of each reading record: a reading condition with greater split purity is used as a branch first, and the root node is split accordingly.
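Selecting a split point by split purity can be sketched as below. This is a minimal sketch, assuming the standard Gini impurity and assuming that the impurity of a split is the size-weighted average over the two sides of the condition (the text does not specify the weighting). The sample fields `bad_loan_in_group_a` and `branch_label`, and the record names, are hypothetical.

```python
from collections import Counter

def gini_impurity(result_types):
    """Standard Gini impurity: 1 - sum of squared class proportions."""
    total = len(result_types)
    if total == 0:
        return 0.0
    counts = Counter(result_types)
    return 1.0 - sum((count / total) ** 2 for count in counts.values())

def split_purity(samples, condition):
    """Split purity = 1 - weighted Gini impurity of the two sides (assumption)."""
    matched = [s["result"] for s in samples if condition(s)]
    rest = [s["result"] for s in samples if not condition(s)]
    weighted = (len(matched) * gini_impurity(matched)
                + len(rest) * gini_impurity(rest)) / len(samples)
    return 1.0 - weighted

# Hypothetical template samples with a result type of "black" or "white".
samples = [
    {"bad_loan_in_group_a": True, "branch_label": "A", "result": "black"},
    {"bad_loan_in_group_a": True, "branch_label": "K", "result": "black"},
    {"bad_loan_in_group_a": False, "branch_label": "A", "result": "white"},
    {"bad_loan_in_group_a": False, "branch_label": "K", "result": "white"},
]

# Each reading record is modeled as a yes/no condition on a sample.
reading_records = {
    "bad_loan": lambda s: s["bad_loan_in_group_a"],
    "label_a": lambda s: s["branch_label"] == "A",
}

purities = {name: split_purity(samples, cond) for name, cond in reading_records.items()}
best_record = max(purities, key=purities.get)  # record used for the first split
```

In this toy data the bad-loan record separates black and white samples completely, so it has the greater split purity and would be chosen as the first branch.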
Step S310: obtain the feature corresponding to the split point, and establish new nodes.
Specifically, the feature corresponding to the split point is obtained and new nodes are established. For example, after the split purity is calculated for each reading record, the record with the greatest split purity may be "user 1 has a bad-loan record in group A". The root node is then split into two branches, one for having a bad-loan record in group A and the other for having no bad-loan record in group A, and the corresponding nodes are generated. The next split point is then found for each new node, and the node is split in turn, until all reading records have been added to the decision tree. After the decision tree model is built, the tree can be pruned: nodes corresponding to reading records whose split purity is lower than a preset purity value are removed, so that every branch of the decision tree has a relatively high split purity. In other embodiments, the number of nodes of the decision tree may be set in advance, and construction stops when the number of nodes reaches the set number.
With the above method of building a decision model, the first feature, the second feature and the third feature are each traversed and read to generate reading records, the split purity of each reading record is calculated according to the result type of each template sample, and the split point is determined by the size of the split purity to build the decision model. The model becomes more accurate, which helps to quickly process services that require complex rule review and improves decision efficiency.
As shown in Fig. 5, in one embodiment, step S120 of performing cluster analysis on the variable objects to obtain cluster results comprises:
Step S402: randomly select multiple variable objects from the variable objects, each serving as the first cluster center of a cluster.
Specifically, multiple variable objects can be selected at random from all variable objects. Each selected variable object serves as the first cluster center of one cluster, and each cluster is named. Each first cluster center corresponds to one cluster, so the number of clusters equals the number of selected variable objects.
Step S404: calculate the distance from each variable object to each first cluster center.
In one embodiment, step S404 of calculating the distance from each variable object to each first cluster center comprises:
(a) Obtain the multidimensional data of each variable object from the rule template data.
Specifically, the multidimensional data of each variable object can be obtained from the rule template data. Multidimensional data refers to the data of each dimension related to a variable object. For example, if the variable objects are branches, the multidimensional data may include each branch's total number of loans, total loan amount, average loan period, branch size, geographical location and so on.
(b) Calculate the distance from each variable object to each first cluster center according to the multidimensional data of each variable object.
Specifically, based on the obtained multidimensional data of each variable object, the distance between two variable objects is calculated with a formula such as Euclidean distance or cosine similarity, and the distance from each variable object to each first cluster center is calculated in turn. For example, if there are 4 clusters with 4 corresponding first cluster centers, the distance from each variable object to the 1st first cluster center, to the 2nd first cluster center, and so on, is calculated.
Step S406: assign each variable object according to the calculation result, placing it in the cluster whose first cluster center is nearest.
Specifically, after the distance from each variable object to each first cluster center has been calculated, each variable object is assigned to the cluster corresponding to the nearest first cluster center. In other embodiments, the calculated distance may instead be compared with a preset distance threshold: when the distance between a variable object and a first cluster center is less than the distance threshold, the variable object is assigned to the cluster corresponding to that first cluster center.
Step S408: calculate the second cluster center of each cluster after the assignment.
Specifically, after the assignment is complete, each cluster may contain one or more variable objects. The second cluster center of each cluster is recalculated using the mean formula, so that the center of each cluster is selected anew.
Step S410: judge whether the distance between the first cluster center and the second cluster center of each cluster is less than a preset threshold; if so, perform step S414; if not, perform step S412.
Specifically, the distance between the first cluster center and the second cluster center of each cluster is calculated and compared with the preset threshold. If this distance is less than the preset threshold for all clusters, the clusters have stabilized and no longer change, and each cluster can be output as a cluster result. If the distance between the first cluster center and the second cluster center of a cluster is not less than the preset threshold, the variable objects need to be reassigned among the clusters.
Step S412: replace the first cluster center of each corresponding cluster with its second cluster center, and continue with step S404.
Specifically, if the distance between the first cluster center and the second cluster center of a cluster is not less than the preset threshold, the second cluster center of that cluster replaces its first cluster center, and the step of calculating the distance from each variable object to each first cluster center is executed again. Steps S404 to S412 are repeated until each cluster stabilizes and no longer changes.
Step S414: output each cluster as a cluster result.
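Steps S402 to S414 correspond closely to the k-means procedure and can be sketched as follows. This is a minimal sketch in plain Python, assuming Euclidean distance and numeric tuples as the multidimensional data; the function names, the `max_iter` safeguard and the sample points are hypothetical.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_means(points, k, threshold=1e-6, max_iter=100, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # S402: random first cluster centers
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for point in points:
            # S404 / S406: assign each object to the nearest first cluster center.
            nearest = min(range(k), key=lambda i: euclidean(point, centers[i]))
            clusters[nearest].append(point)
        new_centers = []
        for i, members in enumerate(clusters):
            # S408: the second cluster center is the mean of the cluster members.
            if members:
                new_centers.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centers.append(centers[i])  # keep the center of an empty cluster
        # S410: compare the largest center movement with the preset threshold.
        shift = max(euclidean(old, new) for old, new in zip(centers, new_centers))
        centers = new_centers  # S412: second centers replace the first centers
        if shift < threshold:
            break
    return clusters, centers  # S414: output the clusters

# Hypothetical two-dimensional data for four variable objects.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
clusters, centers = k_means(points, k=2)
```

The `max_iter` bound is an added safeguard against non-termination and is not part of the described steps.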
With the above method of building a decision model, cluster analysis is performed on the variable objects so that similar variable objects are merged into one cluster. This reduces the levels that the data involves and facilitates building the decision model.
Embodiment two:
As shown in Fig. 6, a device for building a decision model includes an extraction module 510, a cluster module 520, a first feature module 530, a second feature module 540 and a building module 550.
The extraction module 510 is configured to obtain rule template data and to extract each variable object and each template sample from the rule template data.
The cluster module 520 is configured to perform cluster analysis on the variable objects to obtain cluster results.
The first feature module 530 is configured to match the cluster results with each template sample according to the rule template data, and to use the matched cluster results as the first feature.
The second feature module 540 is configured to calculate the black-sample probability of each variable object, and to use the black-sample probability of each variable object as the second feature.
The building module 550 is configured to build the decision model from the first feature and the second feature.
Embodiment three:
Referring to Fig. 7, one embodiment of a method for identifying fraud data in the embodiments of the present invention includes:
Step K10: train on a preset training data set using a preset continuous model training method to establish a continuous anti-fraud model;
In this embodiment, a preset continuous model training method is first used, in combination with data analysis theories such as decision trees and random forests and data analysis tools such as R and SAS, to train on a preset training data set and establish the continuous anti-fraud model. For example, the preset training data set can be divided into multiple groups that are trained and tested separately to establish the continuous anti-fraud model. In one embodiment, when training with the preset continuous model training method, the preset training data set is divided into multiple groups, and model training and testing are carried out within each group. The training results of the groups are relatively independent and do not affect one another. The models obtained from training and testing in each group are then integrated to obtain the final continuous anti-fraud model.
In another embodiment, the preset training data set is divided into multiple groups, and model training and testing are carried out on each group in turn. The training and testing result of one group's model serves as the basis for training and testing the next group's model; that is, the training results of adjacent groups are interrelated. Throughout the training process the model is continually optimized and improved, yielding the final continuous anti-fraud model.
Of course, other model training methods may also be used to train on the preset training data set to establish the continuous anti-fraud model.
Step K20: train on the data to be tested based on the continuous anti-fraud model, and identify the fraud data in the data to be tested.
After the continuous anti-fraud model is established, it can be used to train on the data to be tested in order to analyze and identify the fraud data in it. For example, the data to be tested can be trained and tested with the established continuous anti-fraud model in a manner identical or similar to the way the preset training data set was tested when the model was established, and the fraud data in the data to be tested is identified from the training and testing results.
In some scenarios where fraud occurs easily, such as malicious social security reimbursement, fraud data makes up an extremely small proportion of the overall social security data; that is, the fraud data is heavily imbalanced. If a conventional single model is used to identify the fraud data, this imbalance leads to low identification precision and recall. In this embodiment, therefore, a continuous anti-fraud model is established to identify the data to be tested in view of the imbalance of the fraud data. For example, several models can vote jointly to identify fraud data. This effectively improves the precision and recall of fraud data identification and judges fraud cases more accurately, thereby reducing the scope and cost of manual review.
This embodiment establishes a continuous anti-fraud model using a preset continuous model training method and uses the established model to train on the data to be tested and identify the fraud data in it. Because the fraud data in the data to be tested is imbalanced, analyzing and identifying it with the continuous anti-fraud model improves identification precision and recall compared with a conventional single model and judges fraud cases more accurately, thereby reducing the scope and cost of manual review.
Further, in other embodiments, the continuous anti-fraud model used to analyze and identify the fraud data in the data to be tested is a direct continuous model, and step K10 above may be replaced with:
decomposing the preset training data set into a training set and a test set according to a preset ratio;
retaining the test set, and further decomposing the training set into two sub-training sets according to the preset ratio, the two sub-training sets serving respectively as the training set and the test set of the next-layer model;
repeating the division of the training set in turn until a preset number of divisions is reached;
training models with a preset classical model on each of the divided multi-layer training sets, and testing on the retained multi-layer test sets, to establish the direct continuous model.
In this embodiment, the direct continuous model can be established by training an N-fold continuous model, where N is a positive integer greater than or equal to 2. For example, the direct continuous model can be trained as follows:
Step 1: decompose the preset training data set into a training set Train_set and a test set Test_set according to a preset ratio, and retain the test set Test_set.
Step 2: further decompose the training set Train_set into two sub-training sets Train_set11 and Train_set12 according to a preset ratio, and use Train_set11 and Train_set12 respectively as the training set and the test set of the next-layer model.
Repeat step 2 to divide the training set until a preset number of divisions is reached.
Step 3: train models with a preset conventional classical model on each of the N layers of training sets and tune the parameters, test on the N layers of test sets, tune the parameters again, and retain the models. The classical model includes, but is not limited to, a decision tree model, a random forest model and the like.
Step 4: aggregate, organize and tune the retained models to obtain the direct continuous model.
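The repeated division in steps 1 and 2 can be sketched as below. This is only an illustration, assuming a random shuffle before each split and the same ratio at every layer; `split_by_ratio`, `build_layers` and the ratio of 0.7 are hypothetical names and values, and model fitting on each layer is left out.

```python
import random

def split_by_ratio(data, ratio, rng):
    """Shuffle a copy of data and cut it into (training part, test part)."""
    shuffled = list(data)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * ratio)
    return shuffled[:cut], shuffled[cut:]

def build_layers(dataset, ratio=0.7, depth=3, seed=0):
    """Return one (training set, test set) pair per layer, outermost first."""
    rng = random.Random(seed)
    train_set, test_set = split_by_ratio(dataset, ratio, rng)  # step 1
    layers = [(train_set, test_set)]
    for _ in range(depth - 1):
        # Step 2: the current training set yields the next layer's two sets.
        sub_train, sub_test = split_by_ratio(train_set, ratio, rng)
        layers.append((sub_train, sub_test))
        train_set = sub_train
    return layers

# Hypothetical data set of 100 record identifiers split into 3 layers.
layers = build_layers(list(range(100)), ratio=0.7, depth=3)
```

A classical model such as a decision tree would then be trained on each layer's training set and tested on the matching test set before the retained models are aggregated.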
Further, step K20 above may be replaced with:
dividing the data to be tested into multiple layers in the same ratio as the training-set division of the training data set, and using the direct continuous model to train on each layer of the divided data to be tested, so as to identify the fraud data in the data to be tested.
After the direct continuous model is established, it can be used to train on the data to be tested in order to analyze and identify the fraud data in it. Specifically, the data to be tested that requires fraud identification is randomly divided in the same ratio as the repeated training-set divisions used when the model was established. The established direct continuous model then performs the corresponding model training on each layer of the divided data to be tested, and the training results for all layers are collected. From these results, the fraud data identified in each layer is obtained, and the fraud data identified across all layers is aggregated to give the final fraud data in the data to be tested.
Further, in other embodiments, the continuous anti-fraud model used to analyze and identify the fraud data in the data to be tested is an optimized continuous model, and step K10 above may be replaced with:
decomposing the preset training data set into a training set and a test set according to a preset ratio;
retaining the test set, and further decomposing the training set into two sub-training sets according to the preset ratio, the two sub-training sets serving respectively as the lower-layer training set and the lower-layer test set of the next-layer model;
training a model on the lower-layer training set and testing it on the lower-layer test set, obtaining positive samples from the test result and retaining the trained model, and using the obtained positive samples as a new training set;
repeating the steps of dividing the training set and testing in turn, until the number of positive samples obtained is zero or the multi-fold training models have all been established;
aggregating and organizing the established multi-fold training models to obtain the optimized continuous model.
In this embodiment, the optimized continuous model can be established by training an N-fold continuous model. For example, the optimized continuous model can be trained as follows:
Step 1: decompose the preset training data set into a training set Train_set and a test set Test_set according to a preset ratio, and retain the test set Test_set.
Step 2: further decompose the training set Train_set into two sub-training sets Train_set11 and Train_set12 according to a preset ratio, and use Train_set11 and Train_set12 respectively as the lower-layer training set and the lower-layer test set of the next-layer model.
Step 3: train and tune a model using the lower-layer training set Train_set11 as the training set, test it on the lower-layer test set Train_set12, obtain positive samples from the test result, and retain the model.
Step 4: extract the positive samples obtained in step 3 to form a new training set.
Step 5: repeat steps 2 to 4 until the N-fold models have been built or the number of positive samples is zero, where N is a positive integer greater than or equal to 2.
Step 6: aggregate, organize and tune the N-fold models that have been built, that is, the multi-fold training models, to obtain the optimized continuous model.
Further, step K20 above may be replaced with:
performing top-down testing on the data to be tested with the optimized continuous model, and obtaining and retaining positive samples according to the test result, so as to identify the fraud data in the data to be tested from the positive samples.
After the optimized continuous model is established, it can be used to train on the data to be tested in order to analyze and identify the fraud data in it. Specifically, the established optimized continuous model can be applied directly to the data to be tested for top-down prediction. The positive samples produced when the optimized continuous model predicts on the data to be tested are retained, and the process loops through the N-fold models of the optimized continuous model. The positive samples predicted by each fold of the model on the data to be tested are aggregated to obtain the final fraud data in the data to be tested.
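The fold-by-fold training and the top-down prediction can be sketched as a cascade. This is a rough sketch under several assumptions that are not in the original text: samples are `(score, is_fraud)` pairs, each fold is a toy threshold classifier produced by the hypothetical `fit_threshold`, and prediction passes the positives of one fold on to the next fold.

```python
def fit_threshold(train):
    """Toy classifier (assumption): threshold halfway between the class means."""
    pos = [score for score, is_fraud in train if is_fraud]
    neg = [score for score, is_fraud in train if not is_fraud]
    if not pos:
        threshold = float("inf")  # nothing to learn: predict no positives
    elif not neg:
        threshold = min(pos)
    else:
        threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda score: score >= threshold

def train_cascade(train_set, fit, ratio=0.5, max_folds=3):
    """Train up to max_folds models; each fold trains on the previous positives."""
    folds = []
    current = list(train_set)
    for _ in range(max_folds):
        cut = int(len(current) * ratio)
        lower_train, lower_test = current[:cut], current[cut:]  # step 2
        if not lower_train or not lower_test:
            break
        model = fit(lower_train)  # step 3: train on the lower-layer training set
        folds.append(model)
        positives = [s for s in lower_test if model(s[0])]  # test, keep positives
        if not positives:  # step 5: stop when no positive samples remain
            break
        current = positives  # step 4: positives form the new training set
    return folds

def predict_cascade(folds, scores):
    """Top-down prediction: only positives of one fold reach the next fold."""
    remaining = list(scores)
    for model in folds:
        remaining = [score for score in remaining if model(score)]
    return remaining

# Hypothetical labelled scores; higher scores are more likely to be fraud.
train_set = [(0.1, False), (0.9, True), (0.2, False), (0.8, True),
             (0.15, False), (0.95, True), (0.3, False), (0.85, True)]
folds = train_cascade(train_set, fit_threshold)
flagged = predict_cascade(folds, [0.05, 0.6, 0.97])
```

Each later fold sees only what earlier folds flagged, which is one way to concentrate the rare positive class; a union of per-fold positives would be another reading of the aggregation step.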
As shown in Fig. 8, on the basis of embodiment three above, the following is further included after step K20:
Step K30: mark the type and/or source of the fraud data.
In this embodiment, after the established continuous anti-fraud model has identified the fraud data in the data to be tested, the type and/or source of the identified fraud data is further marked to indicate its characteristic type and/or source. The relevant review department or staff can then focus identification on other data with the same or a similar type or source as the marked fraud data, which narrows the scope of manual review. For example, malicious or illegal card swiping and reimbursement occur in the social security medical reimbursement system. After the established continuous anti-fraud model has identified the fraud data in the social security medical reimbursement data to be tested, the type and/or source of the identified fraud data can be marked, for example as traditional Chinese medicine, Western medicine, or diagnosis and treatment. The social security department can then treat traditional Chinese medicine, Western medicine, and diagnosis and treatment as high-risk areas where false reimbursement may occur and place them under strict control, which narrows the scope of review and improves the precision and efficiency of fraud data identification.
Embodiment four:
Referring to Fig. 9, one embodiment of a device for identifying fraud data in the embodiments of the present invention includes:
a modeling module 01, configured to train on a preset training data set using a preset continuous model training method to establish a continuous anti-fraud model;
an identification module 02, configured to train on the data to be tested based on the continuous anti-fraud model and to identify the fraud data in the data to be tested.
This embodiment establishes a continuous anti-fraud model using a preset continuous model training method and uses the established model to train on the data to be tested and identify the fraud data in it. Because the fraud data in the data to be tested is imbalanced, analyzing and identifying it with the continuous anti-fraud model improves identification precision and recall compared with a conventional single model and judges fraud cases more accurately, thereby reducing the scope and cost of manual review.
Embodiment five:
Referring to Fig. 10, one embodiment of a method for identifying fraud in the embodiments of the present invention includes:
establishing a relational network of doctor-patient and drug-diagnosis relationships based on social security medical-visit data, where the relational network includes nodes and different relationships exist between the nodes; analyzing the group medical-visit behavior of each node in the relational network to extract the multi-dimensional group medical-visit features corresponding to each node; and inputting each extracted multi-dimensional group medical-visit feature into a preset classification model, so as to identify the fraud rate of each node according to the classification model.
The specific steps for identifying social security fraud in this embodiment are described one by one below:
Step Y10: establish a relational network of doctor-patient and drug-diagnosis relationships based on social security medical-visit data, where the relational network includes nodes and different relationships exist between the nodes;
In this embodiment, the social security medical-visit data is first obtained from a database. Once it has been obtained, the relational network of doctor-patient and drug-diagnosis relationships can be established directly from it. The nodes of the relational network include, but are not limited to, hospitals, doctors, patients, regions, diseases and drug items.
Further, after the social security medical-visit data is obtained, sensitive-information processing may be applied to it. Sensitive-information processing means transforming the sensitive information in the data according to sensitive-information processing rules, so as to protect private and sensitive data. The relational network of doctor-patient and drug-diagnosis relationships can then be established based on the social security medical-visit data after sensitive-information processing. Preferably, the social security medical-visit data referred to hereinafter is the data after sensitive-information processing, which is not repeated below.
Specifically, referring to Fig. 11, step Y10 includes:
Step Y11: perform data processing on the social security medical-visit data;
Step Y12: establish the relational network of doctor-patient and drug-diagnosis relationships from the social security medical-visit data after data processing.
In this embodiment, after the social security medical-visit data is obtained, data processing is first performed on it. The data processing may include denoising and interference removal, so that the relational network established subsequently is more accurate. After the data processing, the relational network of doctor-patient and drug-diagnosis relationships is established from the processed social security medical-visit data.
In this embodiment, the relational network established from the social security medical-visit data is shown in Fig. 15. As shown in Fig. 15, the relational network includes multiple nodes, namely hospital, doctor, patient, region, disease, drug item and so on. Fig. 15 shows that different relationships exist between the nodes of the relational network. For example, the relationship between a doctor and a hospital is that the doctor belongs to (BELONG) the hospital; the relationship between a doctor and a disease is that the doctor diagnoses (DIAGNOSE) the disease; the relationship between a patient and a drug item is that the patient buys (BUY) the drug item; and the relationship between a patient and a disease is that the patient has (HAS) the disease. Through the relational network, the medical-visit behavior of patients can be monitored from all angles. It should be understood that the relational network diagram in Fig. 15 is only a preferred schematic for this embodiment and shows only a small part of the relational network. In Fig. 15 every node is of a different type, so every node has a different attribute. The relational network of this embodiment, however, may actually include multiple nodes with the same attribute, such as multiple doctor nodes or multiple patient nodes, and different relationships may also exist between nodes with the same attribute. The nodes in this embodiment are therefore not limited to the content illustrated above. When the social security medical-visit data changes, different relational networks and nodes may be obtained, which are not listed exhaustively here.
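A relational network with typed nodes and named relationships can be sketched as below. This is a minimal in-memory illustration; the class `RelationNetwork`, its methods and the node identifiers are hypothetical, while the relationship names BELONG, DIAGNOSE, BUY and HAS are taken from the description of Fig. 15.

```python
from collections import defaultdict

class RelationNetwork:
    """Directed graph whose nodes carry a type and whose edges carry a relation."""

    def __init__(self):
        self.node_types = {}            # node id -> node type, e.g. "doctor"
        self.edges = defaultdict(list)  # node id -> list of (relation, target id)

    def add_node(self, node_id, node_type):
        self.node_types[node_id] = node_type

    def add_relation(self, source, relation, target):
        if source not in self.node_types or target not in self.node_types:
            raise KeyError("both nodes must be added before relating them")
        self.edges[source].append((relation, target))

    def neighbors(self, node_id, relation=None):
        """Targets reachable from node_id, optionally filtered by relation."""
        return [target for rel, target in self.edges[node_id]
                if relation is None or rel == relation]

# Hypothetical records illustrating the relations named in the text.
network = RelationNetwork()
for node_id, node_type in [("doctor_1", "doctor"), ("hospital_1", "hospital"),
                           ("patient_1", "patient"), ("flu", "disease"),
                           ("drug_x", "drug_item")]:
    network.add_node(node_id, node_type)
network.add_relation("doctor_1", "BELONG", "hospital_1")
network.add_relation("doctor_1", "DIAGNOSE", "flu")
network.add_relation("patient_1", "BUY", "drug_x")
network.add_relation("patient_1", "HAS", "flu")
```

Querying `network.neighbors("patient_1")` then gathers everything one patient is linked to, which is the raw material for the group behavior analysis of a node.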
Step Y20: analyze the group medical-visit behavior of each node in the relational network to extract the multi-dimensional group medical-visit features corresponding to each node;
In this embodiment, after the relational network of doctor-patient and drug-diagnosis relationships is established from the social security medical-visit data, the group medical-visit behavior of each node in the relational network is analyzed. Taking Fig. 15 as an example again, this means analyzing the medical-visit behavior shown in the relational network, which amounts to analyzing medical-seeking behavior, treatment behavior, disease treatment methods and so on. Because different relationships exist between the nodes of the relational network, each node is affected not by a single dimension but by the combined influence of the other nodes in the relational network. Analyzing the group medical-visit behavior of each node therefore ultimately yields the multi-dimensional group medical-visit features of each node, that is, the features extracted from its medical-visit behavior. Taking the patient node in Fig. 15 as an example, the group medical-visit behavior of the patient node includes the region where the patient is located, the hospital the patient visits, the quantity and specific time of the patient's drug purchases, the disease the patient has, the doctor the patient sees and so on. Analyzing the patient's group medical-visit behavior amounts to a comprehensive analysis of the region where the patient is located, the quantity and specific time of drug purchases, the disease the patient has and so on. If the patient is found to have bought large quantities of drugs at different hospitals on multiple occasions, and the drugs are of different kinds, the group medical-visit features can be determined as: the user's drug purchase volume is large, the user buys many types of drugs, and so on.
Step Y30: input each extracted multi-dimensional group medical-visit feature into a preset classification model, so as to identify the fraud rate of each node according to the classification model.
After the multi-dimensional group medical-visit features corresponding to each node are extracted, they are input into a preset classification model, so as to identify the fraud rate of each node according to the classification model. Specifically, referring to Fig. 12, step Y30 includes:
Step Y31: according to the multi-dimensional group medical-visit features corresponding to each node, calculate the similarity of the multi-dimensional group medical-visit features of nodes with the same attribute;
Step Y32: input the calculated similarity of each node into the preset classification model, so as to calculate the fraud rate of each node according to a preset fraud detection formula in the classification model.
That is, after the multi-dimensional group medical-visit features corresponding to each node are extracted, the similarity of the multi-dimensional group medical-visit features of nodes with the same attribute is calculated. Nodes with the same attribute are, for example, one doctor node and another doctor node, or one patient node and another patient node.
In the present embodiment, the similarity of the various dimensions colony medical treatment feature of each node of same attribute is calculated, it is preferred to use Several algorithms are realized below:
1) Jaccard similarity (a generalized, set-based similarity):
Jaccard(A, B) = |A intersect B| / |A union B|
where intersect denotes the set intersection, union denotes the set union, and A and B denote nodes of the same attribute; for example, A and B both denote doctor nodes in Figure 15, or both denote patient nodes.
2) Euclidean similarity (a similarity based on Euclidean distance):
Euclidean(A, B) = 1 - euclidean_distance(A, B)
where A and B denote nodes of the same attribute.
The two algorithms listed above for calculating the similarity of the multi-dimensional group medical-visit features of nodes of the same attribute are merely exemplary. Other algorithms that those skilled in the art propose for their actual needs using the technical idea of the present invention fall within the protection scope of the present invention and are not enumerated here.
With the above similarity formulas, the similarity of the multi-dimensional group medical-visit features of any two nodes of the same attribute can be determined.
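The two similarity formulas above can be sketched in Python as follows. This is only an illustrative sketch: the function names, the representation of a node's features as a set (for Jaccard) or as a numeric vector (for Euclidean), and the sample drug identifiers are assumptions that do not appear in the original text. The Euclidean variant follows the stated formula literally, so it stays within [0, 1] only if the feature vectors have been scaled so that their distance does not exceed 1.

```python
import math


def jaccard_similarity(a, b):
    """Jaccard(A, B) = |A intersect B| / |A union B| for two feature sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention assumed here: two empty feature sets have similarity 0.
        return 0.0
    return len(a & b) / len(union)


def euclidean_similarity(a, b):
    """Euclidean(A, B) = 1 - euclidean_distance(A, B) for two feature vectors.

    Assumes both vectors were scaled beforehand so that the distance is <= 1.
    """
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    distance = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 - distance


# Hypothetical example: drugs purchased by two patient nodes.
patient_a_drugs = {"drug_1", "drug_2", "drug_3"}
patient_b_drugs = {"drug_2", "drug_3", "drug_4"}
drug_overlap = jaccard_similarity(patient_a_drugs, patient_b_drugs)
```

In this sketch the two patients share two of four distinct drugs, so the Jaccard similarity is one half.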
After the similarity of the multi-dimensional group medical-visit features of the nodes of the same attribute has been determined, the calculated similarity of each node is input into the preset classification model, so as to calculate the fraud rate of each node according to the fraud detection formula preset in the classification model. The fraud detection formula preferably includes: the formula of the KNN algorithm (k-nearest neighbor algorithm, with K set to 5); the formula of the bisecting K-means algorithm; the formula of the Shewhart method; and so on. Because the formulas of these algorithms are all existing formulas, their calculation process is not repeated here.
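The text does not spell out how a fraud rate is derived from the similarities, so the following is only one possible reading of the KNN option, offered as a sketch. It assumes that some nodes of the same attribute already carry a known label (1 for fraud, 0 for normal) and that the fraud rate of a query node is the share of fraudulent nodes among its K most similar labeled neighbors. The function name, the dictionary layout, and the sample values are hypothetical; only K = 5 comes from the text.

```python
def knn_fraud_rate(similarities, labels, k=5):
    """Estimate a fraud rate as the fraud share among the k most similar nodes.

    similarities: dict mapping node id -> similarity to the query node.
    labels: dict mapping node id -> 1 (known fraud) or 0 (known normal).
    """
    # Rank the labeled nodes from most to least similar to the query node.
    ranked = sorted(labels, key=lambda n: similarities.get(n, 0.0), reverse=True)
    neighbors = ranked[:k]
    if not neighbors:
        return 0.0
    return sum(labels[n] for n in neighbors) / len(neighbors)


# Hypothetical similarities of six labeled patient nodes to one query patient.
similarities = {"p1": 0.9, "p2": 0.8, "p3": 0.7, "p4": 0.2, "p5": 0.1, "p6": 0.05}
labels = {"p1": 1, "p2": 1, "p3": 0, "p4": 0, "p5": 1, "p6": 0}
fraud_rate = knn_fraud_rate(similarities, labels, k=5)
```

Here three of the five nearest labeled neighbors are fraudulent, which gives a fraud rate of 0.6 for the query node.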
Further, in order to improve the accuracy with which the classification model calculates the fraud rate of a node, in this embodiment, after step Y32, the method for identifying social security fraud further includes:
Step A: verify the fraud rate of each node, and add the verification conclusion to the fraud rate of each node;
Step B: input the fraud rates with the added verification conclusions back into the classification model, so as to train the classification model.
That is, after the fraud rate of each node has been calculated according to the fraud detection formula preset in the classification model, the fraud rate of each node can also be verified. In this embodiment, the verification is preferably an offline review. After the fraud rate of each node has been verified, the verification conclusion is added to the fraud rate of each node, and the fraud rates with the added verification conclusions are input back into the classification model to train it, so that the classification model subsequently identifies node fraud rates more accurately.
The social security fraud identification of this embodiment, which is based on the relational network, works at the group level: a medical-visit relational network is established from group medical-visit behavior, and an algorithmic model is designed to identify fraud at the group level, so as to obtain the fraud rate of each node and thereby identify social security behavior at the group level. It can be understood that, when the social security medical-visit data of a user is analyzed, if the fraud rates of multiple nodes are found to be high and only individual nodes have a low fraud rate, the user can be considered to have committed social security fraud. Compared with a single-rule trigger mechanism, determining from group medical-visit behavior whether a user has committed social security fraud gives a higher identification accuracy.
The method for identifying social security fraud proposed in this embodiment first establishes the relational network of doctors, patients, drugs, and diagnoses from the social security medical-visit data, then analyzes the group medical-visit behavior of each node in the relational network to extract the multi-dimensional group medical-visit features, and finally inputs each extracted feature into the preset classification model, so as to identify the fraud rate of each node according to the classification model. This scheme identifies social security fraud from multiple dimensions and angles, and is more accurate than conventional single-rule identification.
Further, in order to improve the accuracy of identifying social security fraud, the present invention proposes, on the basis of the previous embodiment, another embodiment of the method for identifying fraudulent behavior.
In this embodiment, referring to Figure 13, before step Y20, the method for identifying social security fraud further includes:
Step Y40: determine the external factor features to be supplemented in the relational network, and obtain the external factor features from the Internet;
Step Y50: generate a new node based on the obtained external factor features;
Step Y60: add the new node to the relational network, so as to update the relational network.
In this embodiment, the external factor features to be supplemented are first determined in the relational network and obtained from the Internet. An external factor feature is external information associated with a node; for example, if the node is a hospital, the external factor feature is information related to the hospital, such as its address. After the external factor features have been obtained, a new node is generated from them and added to the relational network to update it, so that the nodes in the subsequent relational network are more detailed and the subsequent identification of each node's fraud rate is more accurate.
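Steps Y50 and Y60 can be sketched as follows. This is a minimal illustration under stated assumptions: the network is held as a plain dictionary of node types plus a list of (source, relation, target) edges, the function name and the `HAS_ADDRESS` relation are hypothetical, and the external value is passed in directly because the text does not say how it is fetched from the Internet.

```python
def add_external_factor(nodes, edges, target_id, factor_type, factor_value):
    """Create a node for an external factor and link it to an existing node.

    nodes: dict mapping node id -> node type.
    edges: list of (source id, relation name, target id) tuples.
    """
    if target_id not in nodes:
        raise KeyError(f"unknown node: {target_id}")
    # Step Y50: generate the new node from the external factor feature.
    new_id = f"{factor_type}:{factor_value}"
    nodes.setdefault(new_id, factor_type)
    # Step Y60: add the node to the network by linking it to the existing node.
    edge = (target_id, "HAS_" + factor_type.upper(), new_id)
    if edge not in edges:
        edges.append(edge)
    return new_id


# Hypothetical example: supplement a hospital node with its address.
nodes = {"hospital_1": "hospital"}
edges = []
new_id = add_external_factor(nodes, edges, "hospital_1", "address", "Shenzhen")
```

Calling the function a second time with the same arguments leaves the network unchanged, which keeps repeated updates from duplicating nodes or edges.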
It is worth noting that, although each algorithm involved in the present invention is an existing algorithm, the overall process used to identify social security fraud differs from existing identification of social security fraud, and the present invention overcomes the low accuracy of existing social security fraud identification.
Embodiment six:
The present invention further provides a device for identifying fraudulent behavior.
Referring to Figure 14, Figure 14 is a functional block diagram of a first embodiment of the device 100 for identifying fraudulent behavior according to the present invention.
It should be emphasized that, as will be apparent to those skilled in the art, the functional block diagram shown in Figure 14 is merely an illustration of a preferred embodiment, and those skilled in the art can easily add new functional modules around the functional modules of the identification device 100 shown in Figure 14. The names of the functional modules are self-defined names that only help in understanding the program function blocks of the identification device 100 and do not limit the technical solution of the present invention; the core of the technical solution is the function that each named functional module is to achieve.
In this embodiment, the device 100 for identifying fraudulent behavior includes:
an establishing module 10, configured to establish the relational network of doctors, patients, drugs, and diagnoses from social security medical-visit data, wherein the relational network includes nodes and the nodes are linked by different relationships;
an analysis and extraction module 20, configured to analyze the group medical-visit behavior of each node in the relational network, so as to extract the multi-dimensional group medical-visit features corresponding to each node;
an input and identification module 30, configured to input each extracted multi-dimensional group medical-visit feature into a preset classification model, so as to identify the fraud rate of each node according to the classification model.
In this embodiment, the relational network established from the social security medical-visit data can be seen in Figure 15. As shown in Figure 15, the relational network includes multiple nodes, namely hospital, doctor, patient, region, disease, drug item, and so on. As can be seen from Figure 15, the nodes in the relational network are linked by different relationships. For example, the relationship between doctor and hospital is that the doctor belongs to (BELONG) the hospital; the relationship between doctor and disease is that the doctor diagnoses (DIAGNOSE) the disease; the relationship between patient and drug item is that the patient buys (BUY) the drug item; the relationship between patient and disease is that the patient has (HAS) the disease; and so on. Through this relational network, the medical-visit behavior of patients can be monitored from all angles.
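The relational network described above can be sketched as a small typed graph. The following is an illustrative sketch only: the `RelationalNetwork` class, its method names, and the sample node identifiers are assumptions, while the node types and the relation names BELONG, DIAGNOSE, BUY, and HAS are the ones given in the text.

```python
from collections import defaultdict


class RelationalNetwork:
    """Typed nodes connected by named, directed relationships."""

    def __init__(self):
        self.node_types = {}            # node id -> node type
        self.edges = defaultdict(list)  # source id -> [(relation, target id)]

    def add_node(self, node_id, node_type):
        self.node_types[node_id] = node_type

    def add_edge(self, source, relation, target):
        # Both endpoints must already exist as nodes in the network.
        for node_id in (source, target):
            if node_id not in self.node_types:
                raise KeyError(f"unknown node: {node_id}")
        self.edges[source].append((relation, target))

    def neighbors(self, source, relation):
        """Return the targets reachable from source through one relation."""
        return [t for r, t in self.edges.get(source, []) if r == relation]


# Hypothetical records standing in for social security medical-visit data.
network = RelationalNetwork()
network.add_node("hospital_1", "hospital")
network.add_node("doctor_1", "doctor")
network.add_node("patient_1", "patient")
network.add_node("disease_1", "disease")
network.add_node("drug_1", "drug item")

network.add_edge("doctor_1", "BELONG", "hospital_1")
network.add_edge("doctor_1", "DIAGNOSE", "disease_1")
network.add_edge("patient_1", "BUY", "drug_1")
network.add_edge("patient_1", "HAS", "disease_1")

drugs_bought = network.neighbors("patient_1", "BUY")
```

Keeping the relation name on every edge lets later steps collect, for one patient node, the separate dimensions (drugs bought, diseases, doctors) from which group medical-visit features are derived.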
It should be understood that the relational network diagram shown in Figure 15 is merely a preferred schematic of this embodiment, and that the relational network shown in Figure 15 is only a small part of the relational network of this embodiment. As can be seen from the relational network in Figure 15, each node is of a different type, so each node is a node of a different attribute. The relational network of this embodiment may, however, actually include multiple nodes of the same attribute, such as multiple doctor nodes or multiple patient nodes, and nodes of the same attribute may also be linked by different relationships. The nodes in this embodiment are therefore not limited to the content illustrated above; when the social security medical-visit data changes, different relational networks and nodes can be obtained, which are not enumerated here.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
Figure 16 is a schematic diagram of a server carrying the Ping An Brain big data platform according to an embodiment of the present invention. As shown in Figure 16, the server 21 of this embodiment includes a processor 210, a memory 211, and a computer program 212 that is stored in the memory 211 and can run on the processor 210, such as a program that performs the anti-fraud identification method. When executing the computer program 212, the processor 210 implements the steps of the above anti-fraud identification method embodiments, such as steps 101 to 103 shown in Figure 1. Alternatively, when executing the computer program 212, the processor 210 implements the functions of the modules/units of the above device embodiments, such as the functions of modules 510 to 550 shown in Figure 6, of modules 01 to 02 shown in Figure 9, and of modules 10 to 30 shown in Figure 14.
By way of example, the computer program 212 may be divided into one or more modules/units, which are stored in the memory 211 and executed by the processor 210 to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, the instruction segments being used to describe the execution of the computer program 212 in the server 21.
The server 21 may be a computing device such as a cloud server carrying the Ping An Brain big data platform. The server may include, but is not limited to, the processor 210 and the memory 211. Those skilled in the art will understand that Figure 16 is merely an example of the server 21 and does not limit it; the server may include more or fewer components than shown, combine certain components, or use different components. For example, the server may also include input/output devices, network access devices, buses, and so on.
The processor 210 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 211 may be an internal storage unit of the server 21, such as a hard disk or main memory of the server 21. The memory 211 may also be an external storage device of the server 21, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card fitted to the server 21. Further, the memory 211 may include both an internal storage unit and an external storage device of the server 21. The memory 211 is used to store the computer program and the other programs and data required by the server. The memory 211 may also be used to temporarily store data that has been output or is about to be output.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system, devices, and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the above embodiments, each embodiment is described with its own emphasis. For parts that are not detailed or recorded in one embodiment, refer to the related descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that the modules, units, and/or method steps of the embodiments described herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may implement the described functions differently for each specific application, but such implementations should not be considered beyond the scope of the present invention.
The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions described in the foregoing embodiments, or replace some of the technical features with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

  1. An anti-fraud identification method, characterized by comprising:
    determining a target event;
    extracting target data related to the target event;
    processing the target data using at least two of the following methods: a method of building a decision model, a method of identifying fraud data, and a method of identifying fraudulent behavior;
    the method of building a decision model comprises:
    obtaining rule template data, and extracting each variable object and each template sample from the rule template data;
    performing cluster analysis on the variable objects to obtain a cluster result;
    matching the cluster result with each template sample according to the rule template data, and using the matched cluster result as a first feature;
    calculating the black-sample probability of each variable object, and using the black-sample probability of each variable object as a second feature;
    building the decision model from the first feature and the second feature;
    the method of identifying fraud data comprises:
    training on a preset training data set using a preset continuous-model training method to establish a continuous anti-fraud model;
    training on data to be tested based on the continuous anti-fraud model, to identify the fraud data in the data to be tested;
    the method of identifying fraudulent behavior comprises:
    establishing a relational network of doctors, patients, drugs, and diagnoses from social security medical-visit data, wherein the relational network includes nodes and the nodes are linked by different relationships;
    analyzing the group medical-visit behavior of each node in the relational network, to extract the multi-dimensional group medical-visit features corresponding to each node;
    inputting each extracted multi-dimensional group medical-visit feature into a preset classification model, to identify the fraud rate of each node according to the classification model;
    wherein the target data includes at least two of the data to be tested, the rule template data, and the social security medical-visit data.
  2. The anti-fraud identification method according to claim 1, characterized in that, before the building of the decision model from the first feature and the second feature, the method of building a decision model further comprises:
    mapping each variable object to a predefined label according to a preset algorithm;
    matching the labels with each template sample according to the rule template data, and using the matched labels as a third feature;
    the building of the decision model from the first feature and the second feature specifically comprises:
    building the decision model from the first feature, the second feature, and the third feature.
  3. The anti-fraud identification method according to claim 2, characterized in that the building of the decision model from the first feature, the second feature, and the third feature comprises:
    establishing an original node;
    obtaining the result type of each template sample according to the rule template data;
    traversing and reading the first feature, the second feature, and the third feature respectively, and generating read records;
    calculating the split purity of each read record according to the result type of each template sample, and determining a split point according to the split purity;
    obtaining the feature corresponding to the split point, and establishing a new node.
  4. The anti-fraud identification method according to any one of claims 1 to 3, characterized in that the performing of cluster analysis on the variable objects to obtain a cluster result comprises:
    randomly selecting multiple variable objects from the variable objects as the first cluster centers of the clusters, each first cluster center corresponding to one cluster;
    calculating the distance from each variable object to each first cluster center;
    dividing the variable objects according to the calculation result, each variable object being assigned to the cluster corresponding to the first cluster center at the shortest distance;
    calculating the second cluster center of each cluster after the division;
    determining whether the distance between the first cluster center and the second cluster center of each cluster is less than a preset threshold; if so, outputting the clusters as the cluster result; if not, replacing the first cluster center of the corresponding cluster with the second cluster center, and continuing with the step of calculating the distance from each variable object to each first cluster center.
  5. The anti-fraud identification method according to claim 1, characterized in that the continuous anti-fraud model is a direct continuous model;
    the training on a preset training data set using a preset continuous-model training method to establish a continuous anti-fraud model comprises:
    splitting the preset training data set into a training set and a test set according to a preset ratio;
    retaining the test set, and further splitting the training set into two sub-training sets according to the preset ratio, the two sub-training sets serving respectively as the training set and the test set of the next-layer model;
    repeating the splitting of the training set a preset number of times;
    training a model on each of the resulting layers of training sets using a preset classical model, and testing on the retained layers of test sets, to establish the direct continuous model.
  6. The anti-fraud identification method according to claim 1, characterized in that the continuous anti-fraud model is an optimized continuous model;
    the training on a preset training data set using a preset continuous-model training method to establish a continuous anti-fraud model comprises:
    splitting the preset training data set into a training set and a test set according to a preset ratio;
    retaining the test set, and further splitting the training set into two sub-training sets according to the preset ratio, the two sub-training sets serving respectively as the lower-layer training set and the lower-layer test set of the next-layer model;
    training a model on the lower-layer training set and testing it on the lower-layer test set, obtaining positive samples from the test result while retaining the trained model, and using the obtained positive samples as a new training set;
    repeating the steps of splitting the training set and testing until the number of positive samples obtained is zero or multiple trained models have been established;
    collecting and arranging the multiple trained models that have been established, to obtain the optimized continuous model.
  7. The anti-fraud identification method according to claim 1, characterized in that the step of establishing a relational network of doctors, patients, drugs, and diagnoses from social security medical-visit data comprises:
    performing data processing on the social security medical-visit data;
    establishing the relational network of doctors, patients, drugs, and diagnoses from the processed social security medical-visit data.
  8. The anti-fraud identification method according to claim 1, characterized in that the step of inputting each extracted multi-dimensional group medical-visit feature into a preset classification model, to identify the fraud rate of each node according to the classification model, comprises:
    calculating, according to the multi-dimensional group medical-visit features corresponding to each node, the similarity of the multi-dimensional group medical-visit features of the nodes of the same attribute;
    inputting the calculated similarity of each node into the preset classification model, to calculate the fraud rate of each node according to the fraud detection formula preset in the classification model.
  9. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the steps of the anti-fraud identification method according to any one of claims 1 to 8 are implemented.
  10. A server carrying the Ping An Brain big data platform, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the computer program, the steps of the anti-fraud identification method according to any one of claims 1 to 8 are implemented.
CN201710605531.7A 2017-07-24 2017-07-24 Anti-fraud identification method, storage medium, and server carrying Ping An Brain Pending CN107785058A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201710605531.7A CN107785058A (en) 2017-07-24 2017-07-24 Anti-fraud identification method, storage medium, and server carrying Ping An Brain
PCT/CN2018/077230 WO2019019630A1 (en) 2017-07-24 2018-02-26 Anti-fraud identification method, storage medium, server carrying ping an brain and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710605531.7A CN107785058A (en) 2017-07-24 2017-07-24 Anti-fraud identification method, storage medium, and server carrying Ping An Brain

Publications (1)

Publication Number Publication Date
CN107785058A (en) 2018-03-09

Family

ID=61437479

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710605531.7A Pending CN107785058A (en) 2017-07-24 2017-07-24 Anti-fraud identification method, storage medium, and server carrying Ping An Brain

Country Status (2)

Country Link
CN (1) CN107785058A (en)
WO (1) WO2019019630A1 (en)

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108038701A (en) * 2018-03-20 2018-05-15 杭州恩牛网络技术有限公司 A kind of integrated study is counter to cheat test method and system
CN108334647A (en) * 2018-04-12 2018-07-27 阿里巴巴集团控股有限公司 Data processing method, device, equipment and the server of Insurance Fraud identification
CN108428132A (en) * 2018-03-15 2018-08-21 阿里巴巴集团控股有限公司 Fraudulent trading recognition methods, device, server and storage medium
CN108734479A (en) * 2018-04-12 2018-11-02 阿里巴巴集团控股有限公司 Data processing method, device, equipment and the server of Insurance Fraud identification
CN109003191A (en) * 2018-07-12 2018-12-14 上海金仕达卫宁软件科技有限公司 The anti-fraud template automatic generation method of medical treatment and system based on hierarchical clustering
CN109064032A (en) * 2018-08-06 2018-12-21 国网浙江杭州市临安区供电有限公司 The small micro- power honesty risk surveillance managing and control system of power supply station based on enterprise's cloud platform
CN109101562A (en) * 2018-07-13 2018-12-28 中国平安人寿保险股份有限公司 Find method, apparatus, computer equipment and the storage medium of target group
CN109166030A (en) * 2018-08-01 2019-01-08 深圳微言科技有限责任公司 A kind of anti-fraud solution and system
CN109242307A (en) * 2018-09-04 2019-01-18 中国光大银行股份有限公司信用卡中心 A kind of anti-fraudulent policies analysis method, server, electronic equipment and storage medium
CN109284371A (en) * 2018-09-03 2019-01-29 平安证券股份有限公司 Anti- fraud method, electronic device and computer readable storage medium
CN109413031A (en) * 2018-08-31 2019-03-01 深圳壹账通智能科技有限公司 Construction method, device, equipment and the readable storage medium storing program for executing of anti-fraud model
CN109409502A (en) * 2018-09-26 2019-03-01 深圳壹账通智能科技有限公司 Generation method, device, equipment and the storage medium of anti-fraud model
CN109545312A (en) * 2018-10-23 2019-03-29 平安医疗健康管理股份有限公司 A kind of pharmacy's advice of settlement risk checking method and device
CN109544150A (en) * 2018-10-09 2019-03-29 阿里巴巴集团控股有限公司 A kind of method of generating classification model and device calculate equipment and storage medium
CN109599153A (en) * 2018-11-14 2019-04-09 金色熊猫有限公司 Medical data tracking and device, storage medium, electronic equipment
CN109598628A (en) * 2018-11-30 2019-04-09 平安医疗健康管理股份有限公司 Recognition methods, device, equipment and the readable storage medium storing program for executing of medical insurance fraud
CN109816397A (en) * 2018-12-03 2019-05-28 北京奇艺世纪科技有限公司 A kind of fraud method of discrimination, device and storage medium
CN109903053A (en) * 2019-03-01 2019-06-18 成都新希望金融信息有限公司 A kind of anti-fraud method carrying out Activity recognition based on sensing data
CN109919780A (en) * 2019-01-23 2019-06-21 平安科技(深圳)有限公司 Claims Resolution based on figure computing technique is counter to cheat method, apparatus, equipment and storage medium
CN109948806A (en) * 2019-03-28 2019-06-28 医渡云(北京)技术有限公司 Decision model optimization method, device, storage medium and equipment
CN110263106A (en) * 2019-06-25 2019-09-20 中国人民解放军国防科技大学 Collaborative public opinion fraud detection method and device
WO2019200739A1 (en) * 2018-04-17 2019-10-24 平安科技(深圳)有限公司 Data fraud identification method, apparatus, computer device, and storage medium
WO2020000688A1 (en) * 2018-06-27 2020-01-02 平安科技(深圳)有限公司 Financial risk verification processing method and apparatus, computer device, and storage medium
CN111047428A (en) * 2019-12-05 2020-04-21 深圳索信达数据技术有限公司 Bank high-risk fraud client identification method based on small amount of fraud samples
CN111738747A (en) * 2020-06-24 2020-10-02 中诚信征信有限公司 Method and device for anti-fraud decision
TWI723528B (en) * 2019-02-01 2021-04-01 開曼群島商創新先進技術有限公司 Computer-executed event risk assessment method and device, computer-readable storage medium and computing equipment
CN113837874A (en) * 2021-11-22 2021-12-24 北京芯盾时代科技有限公司 Data identification method and device, storage medium and electronic equipment
CN115422016A (en) * 2022-11-05 2022-12-02 北京淇瑀信息科技有限公司 Data monitoring method and device based on server-side relation network

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113393066A (en) * 2020-03-11 2021-09-14 清华大学 Method and device for generating risk assessment model and risk assessment method and device
CN114155080A (en) * 2021-09-29 2022-03-08 东方微银科技股份有限公司 Fraud identification method, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106384282A (en) * 2016-06-14 2017-02-08 平安科技(深圳)有限公司 Method and device for building decision-making model
CN106600423A (en) * 2016-11-18 2017-04-26 云数信息科技(深圳)有限公司 Machine learning-based car insurance data processing method and device and car insurance fraud identification method and device
CN106682067A (en) * 2016-11-08 2017-05-17 浙江邦盛科技有限公司 Machine learning anti-fraud monitoring system based on transaction data

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015527660A (en) * 2012-07-24 2015-09-17 デロイッテ・ディベロップメント・エルエルシー Fraud detection system, method and apparatus
CN104408547B (en) * 2014-10-30 2017-09-15 浙江网新恒天软件有限公司 Method for detecting medical insurance fraud based on data mining
CN105279382B (en) * 2015-11-10 2017-12-22 成都数联易康科技有限公司 Online intelligent detection method for abnormal medical insurance data
CN107657453B (en) * 2016-07-25 2020-10-20 平安科技(深圳)有限公司 Method and device for identifying fraudulent data
CN107657536B (en) * 2017-02-20 2018-07-31 平安科技(深圳)有限公司 Method and device for identifying social security fraud

Cited By (40)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10970719B2 (en) 2018-03-15 2021-04-06 Advanced New Technologies Co., Ltd. Fraudulent transaction identification method and apparatus, server, and storage medium
US11276068B2 (en) 2018-03-15 2022-03-15 Advanced New Technologies Co., Ltd. Fraudulent transaction identification method and apparatus, server, and storage medium
CN108428132B (en) * 2018-03-15 2020-12-29 创新先进技术有限公司 Fraud transaction identification method, device, server and storage medium
CN108428132A (en) * 2018-03-15 2018-08-21 阿里巴巴集团控股有限公司 Fraud transaction identification method, device, server and storage medium
CN108038701A (en) * 2018-03-20 2018-05-15 杭州恩牛网络技术有限公司 Ensemble learning anti-fraud testing method and system
CN108334647A (en) * 2018-04-12 2018-07-27 阿里巴巴集团控股有限公司 Data processing method, apparatus, device and server for insurance fraud identification
CN108734479A (en) * 2018-04-12 2018-11-02 阿里巴巴集团控股有限公司 Data processing method, apparatus, device and server for insurance fraud identification
WO2019200739A1 (en) * 2018-04-17 2019-10-24 平安科技(深圳)有限公司 Data fraud identification method, apparatus, computer device, and storage medium
WO2020000688A1 (en) * 2018-06-27 2020-01-02 平安科技(深圳)有限公司 Financial risk verification processing method and apparatus, computer device, and storage medium
CN109003191A (en) * 2018-07-12 2018-12-14 上海金仕达卫宁软件科技有限公司 Method and system for automatically generating medical anti-fraud templates based on hierarchical clustering
CN109101562A (en) * 2018-07-13 2018-12-28 中国平安人寿保险股份有限公司 Method, device, computer equipment and storage medium for searching target group
CN109101562B (en) * 2018-07-13 2023-07-21 中国平安人寿保险股份有限公司 Method, device, computer equipment and storage medium for searching target group
CN109166030A (en) * 2018-08-01 2019-01-08 深圳微言科技有限责任公司 Anti-fraud solution and system
CN109064032A (en) * 2018-08-06 2018-12-21 国网浙江杭州市临安区供电有限公司 Integrity risk supervision and control system for small and micro powers of power supply stations based on an enterprise cloud platform
CN109413031A (en) * 2018-08-31 2019-03-01 深圳壹账通智能科技有限公司 Construction method, device, equipment and readable storage medium for anti-fraud model
CN109284371B (en) * 2018-09-03 2023-04-18 平安证券股份有限公司 Anti-fraud method, electronic device, and computer-readable storage medium
CN109284371A (en) * 2018-09-03 2019-01-29 平安证券股份有限公司 Anti-fraud method, electronic device, and computer-readable storage medium
CN109242307A (en) * 2018-09-04 2019-01-18 中国光大银行股份有限公司信用卡中心 Anti-fraud policy analysis method, server, electronic device and storage medium
CN109242307B (en) * 2018-09-04 2022-02-01 中国光大银行股份有限公司信用卡中心 Anti-fraud policy analysis method, server, electronic device and storage medium
CN109409502A (en) * 2018-09-26 2019-03-01 深圳壹账通智能科技有限公司 Generation method, device, equipment and storage medium for anti-fraud model
CN109544150A (en) * 2018-10-09 2019-03-29 阿里巴巴集团控股有限公司 Classification model generation method and device, computing device and storage medium
CN109545312B (en) * 2018-10-23 2023-08-08 平安医疗健康管理股份有限公司 Drug store statement risk detection method and device
CN109545312A (en) * 2018-10-23 2019-03-29 平安医疗健康管理股份有限公司 Drug store statement risk detection method and device
CN109599153A (en) * 2018-11-14 2019-04-09 金色熊猫有限公司 Medical data tracking method and device, storage medium and electronic equipment
CN109599153B (en) * 2018-11-14 2021-06-29 金色熊猫有限公司 Medical data tracking method and device, storage medium and electronic equipment
CN109598628A (en) * 2018-11-30 2019-04-09 平安医疗健康管理股份有限公司 Method, device and equipment for identifying medical insurance fraud behaviors and readable storage medium
CN109598628B (en) * 2018-11-30 2022-09-20 平安医疗健康管理股份有限公司 Method, device and equipment for identifying medical insurance fraud behaviors and readable storage medium
CN109816397A (en) * 2018-12-03 2019-05-28 北京奇艺世纪科技有限公司 Fraud discrimination method, device and storage medium
CN109816397B (en) * 2018-12-03 2021-05-25 北京奇艺世纪科技有限公司 Fraud discrimination method, device and storage medium
CN109919780A (en) * 2019-01-23 2019-06-21 平安科技(深圳)有限公司 Claim settlement anti-fraud method, apparatus, device and storage medium based on graph computing technology
TWI723528B (en) * 2019-02-01 2021-04-01 開曼群島商創新先進技術有限公司 Computer-executed event risk assessment method and device, computer-readable storage medium and computing equipment
CN109903053A (en) * 2019-03-01 2019-06-18 成都新希望金融信息有限公司 Anti-fraud method for behavior recognition based on sensor data
CN109948806A (en) * 2019-03-28 2019-06-28 医渡云(北京)技术有限公司 Decision model optimization method, device, storage medium and equipment
CN110263106A (en) * 2019-06-25 2019-09-20 中国人民解放军国防科技大学 Collaborative public opinion fraud detection method and device
CN111047428A (en) * 2019-12-05 2020-04-21 深圳索信达数据技术有限公司 Bank high-risk fraud client identification method based on small amount of fraud samples
CN111047428B (en) * 2019-12-05 2023-08-08 深圳索信达数据技术有限公司 Bank high-risk fraud customer identification method based on small amount of fraud samples
CN111738747A (en) * 2020-06-24 2020-10-02 中诚信征信有限公司 Method and device for anti-fraud decision
CN113837874B (en) * 2021-11-22 2022-04-12 北京芯盾时代科技有限公司 Data identification method and device, storage medium and electronic equipment
CN113837874A (en) * 2021-11-22 2021-12-24 北京芯盾时代科技有限公司 Data identification method and device, storage medium and electronic equipment
CN115422016A (en) * 2022-11-05 2022-12-02 北京淇瑀信息科技有限公司 Data monitoring method and device based on server-side relation network

Also Published As

Publication number Publication date
WO2019019630A1 (en) 2019-01-31

Similar Documents

Publication Publication Date Title
CN107785058A (en) Anti-fraud recognition method, storage medium and server carrying a safety brain
US7930242B2 (en) Methods and systems for multi-credit reporting agency data modeling
CN113011973B (en) Method and device for a financial transaction supervision model based on a smart contract data lake
EP4236197A2 (en) Micro-loan system
CN110807700A (en) Unsupervised fusion model personal credit scoring method based on government data
CN108898476A (en) Loan customer credit rating method and device
CN112150298B (en) Data processing method, system, device and readable medium
CN112700319A (en) Enterprise credit line determination method and device based on government affair data
CN111985937A (en) Method, system, storage medium and computer equipment for evaluating value information of transaction traders
CN112735549A (en) Data processing method and data processing system based on medical insurance data
CN111950625A (en) Risk identification method and device based on artificial intelligence, computer equipment and medium
CN107908633A (en) Financial reasoning method based on a knowledge graph
Ahmadzadeh et al. Studying the critical success factors of ERP in the banking sector: a DEMATEL approach
CN112365339A (en) Method for judging commercial value credit loan amount of small and medium-sized enterprises
Nwankwo et al. Knowledge discovery and analytics in process reengineering: a study of port clearance processes
CN117132383A (en) Credit data processing method, device, equipment and readable storage medium
CN116739764A (en) Transaction risk detection method, device, equipment and medium based on machine learning
CN115564591A (en) Financing product determination method and related equipment
CN114998022A (en) Compliance risk control method and system
WO2022143431A1 (en) Method and apparatus for training anti-money laundering model
CN114048330A (en) Risk conduction probability knowledge graph generation method, device, equipment and storage medium
Meng et al. The practice study of consumer credit risk based on random forest
Kulothungan Loan Forecast by Using Machine Learning
Srivastava et al. Hyperautomation in transforming underwriting operation in the life insurance industry
Abdolbaghi Ataabadi et al. The effectiveness of the automatic system of fuzzy logic-based technical patterns recognition: Evidence from Tehran stock exchange

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 2018-03-09