Summary of the Invention
In view of the above defects, an object of the present invention is to provide a motor vehicle insurance fraud detection method and system that prevent deliberate evasive behavior from affecting identification and detection, and that can quickly and accurately identify highly suspicious vehicle collision groups.
To achieve the above object, the present invention provides a motor vehicle insurance fraud detection method, the method comprising:
dividing a predetermined time into several equal-length time periods, and building a vehicle-to-vehicle collision relation matrix from the event data in a first time period;
computing collision networks from the collision relation matrix, and building a relation matrix between the collision networks and the vehicles;
computing the similarity between the collision networks obtained in the first time period and the collision networks in other time periods, and deleting the collision networks whose similarity does not satisfy a preset threshold;
performing rank-based sorting on the collision networks in the first time period; and
for each collision network, performing rank-based sorting against the collision networks of the time periods before the first time period, to obtain a first target group.
According to the motor vehicle insurance fraud detection method of the present invention, the method further comprises:
building a relation matrix of related personnel and vehicles;
and, after the step of obtaining the first target group:
obtaining a second target group according to the relation matrix of related personnel and vehicles.
According to the motor vehicle insurance fraud detection method of the present invention, the method further comprises:
building a relation matrix of repair shops and vehicles;
and, after the step of obtaining the first target group:
obtaining a third target group according to the relation matrix of repair shops and vehicles.
According to the motor vehicle insurance fraud detection method of the present invention, the step of dividing the predetermined time into several equal-length time periods and building the vehicle-to-vehicle collision relation matrix from the event data in the first time period comprises:
dividing the predetermined time into equal-length statistical periods, denoted 0, 1, 2, 3, ..., t, and mapping all event information in the first time period t into an n × n vehicle-to-vehicle collision relation matrix $C_t$:
$$C_t = \left(c_{i,j,t}\right)_{n \times n}$$
The step of building the relation matrix between the collision networks and the vehicles comprises:
Step A: define an empty set $M$ and the unique codes $v_i$ of all vehicles, where $i$ is an integer greater than 0; starting the loop from $v_1$, look up the row vector $C_{1,t}$ in the matrix $C_t$, record $v_1$ and add it to the set $M$, so that $M = \{v_1\}$;
Step B: traverse each element of the row vector $C_{1,t}$; if an element $c_{1,j,t}$ is 1, record $v_j$ and add it to the set $M$, so that $M = \{v_1, \ldots, v_j, \ldots\}$, $1 \le j \le n$;
Step C: after every element of $C_{1,t}$ has been traversed, set the access flag of $v_1$ to 1;
Step D: look up each element $v_i$ in the set $M$, $1 \le i \le n$; if the access flag of $v_i$ is 1, continue to the next $v_i$ whose access flag is not 1;
Step E: if the access flag of $v_i$ is not 1, look up the corresponding row vector $C_{i,t}$ of the matrix $C_t$ and traverse each element of $C_{i,t}$; for each element equal to 1, judge whether the corresponding vehicle is already contained in the set $M$, and if not, add it to the set $M$; then set the access flag of $v_i$ to 1 and return to Step D;
Step F: if the access flag of every $v_i$ in the set $M$ is 1, end this search; $M$ is then one collision relation network, each element of $M$ is the code of a vehicle contained in that network, and the set $M$ is mapped to a row vector of $G_t$, where $G_t$ is the binary collision network matrix giving the correspondence between collision networks and vehicles;
repeat Steps A to F in turn to build each row vector of $G_t$ and obtain the matrix $G_t$.
According to the motor vehicle insurance fraud detection method of the present invention, the step of building the relation matrix of related personnel and vehicles comprises:
defining the h × 1 related-personnel vector $D = (d_1, d_2, d_3, \ldots, d_{h-1}, d_h)^T$, where $d_i$ is the unique code of a related person and $1 \le i \le h$;
mapping the events in the first time period t into an h × n binary matrix $A_t$ of the relation between related personnel and vehicles;
mapping out, from the matrix $A_t$ and the matrix $C_t$, an h × h matrix between related personnel, whose element $(i, j)$ is the number of times related persons $d_i$ and $d_j$ were directly or indirectly associated through vehicle collision relations in the first time period t, $1 \le i \le h$, $1 \le j \le h$;
building θ(t) interpersonal relationship networks from that h × h matrix, and mapping out a θ(t) × h matrix of the correspondence between related-personnel networks and related personnel, whose element $(i, j)$ is the number of times, or the weight, with which a person appears in an interpersonal relationship network, $1 \le i \le \theta(t)$, $1 \le j \le h$.
According to the motor vehicle insurance fraud detection method of the present invention, the step of building the relation matrix of repair shops and vehicles comprises:
assuming that the vehicles concerned involve f repair shops in total, and building an n × f matrix $B_t$,
where $b_{i,j,t}$ indicates whether the j-th vehicle was repaired at the i-th repair shop (1 if so, otherwise 0), $1 \le i \le f$, $1 \le j \le n$;
computing the repair-shop repair count matrix of the suspect vehicles.
The present invention also provides a motor vehicle insurance fraud detection system, comprising:
a network construction module, configured to divide a predetermined time into several equal-length time periods and build a vehicle-to-vehicle collision relation matrix from the event data in a first time period, and to compute collision networks from the collision relation matrix and build a relation matrix between the collision networks and the vehicles;
a similarity computing module, configured to compute the similarity between the collision networks obtained in the first time period and the collision networks in other time periods, and to delete the collision networks whose similarity does not satisfy a preset threshold;
a sorting processing module, configured to perform rank-based sorting on the collision networks in the first time period and, for each collision network, to perform rank-based sorting against the collision networks of the time periods before the first time period, to obtain a first target group.
According to the motor vehicle insurance fraud detection system of the present invention, the network construction module is further configured to build a relation matrix of related personnel and vehicles;
the system further comprises:
a recognition processing module, configured to obtain a second target group according to the relation matrix of related personnel and vehicles.
According to the motor vehicle insurance fraud detection system of the present invention, the network construction module is further configured to build a relation matrix of repair shops and vehicles;
the recognition processing module is further configured to obtain a third target group according to the relation matrix of repair shops and vehicles.
The present invention characterizes fraud as clique behavior. In a specific application, the predetermined time is first divided into several equal-length time periods, the vehicle-to-vehicle collision relation matrix is built from the event data in the first time period, and the relation matrix between the collision networks and the vehicles is then built. Further, the invention computes the similarity between the collision networks obtained in the first time period and the collision networks in other time periods and deletes the collision networks whose similarity does not satisfy the preset threshold, which makes it possible to judge whether a narrow-sense clique is suspected of committing fraud. Preferably, the invention also performs rank-based sorting on the collision networks in the first time period and, for each collision network, performs rank-based sorting against the collision networks of the earlier time periods, obtaining the first target group, which corresponds to the network topology that several motor vehicles form through collision relations. Similarly, a personnel relation network is computed from the relation matrix of related personnel and vehicles in the first time period, and rank-based sorting of the personnel relation networks yields the second target group, that is, the network topology that several persons form through collision relations. A generalized matrix transformation algorithm gives the matrix of repair counts of the suspect vehicles at the various repair shops; accumulating and normalizing the repair counts yields the third target group, that is, the weight distribution of the suspect vehicles over all repair shops. The invention thus provides rapid identification of highly suspicious motor insurance fraud cliques and early warning on claim cases: on the basis of a large volume of claim settlement data, it can quickly and accurately identify highly suspicious vehicle collision cliques and broad-sense personnel cliques.
Detailed Description
To make the purpose, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here merely illustrate the invention and do not limit it.
The present invention is a technical solution built on a micro-level model of motor vehicle fraud based on clique behavior, so it first characterizes clique fraud. It first identifies collision behavior that has an extremely small probability of occurring and is therefore highly suspicious, uses that behavior to find highly suspicious clique vehicles, and then detects the wider set of participants behind the clique vehicles.
For ease of understanding and description, the invention defines the first target group as the narrow-sense clique, that is, the network topology that several motor vehicles form through collision relations; and defines the second and third target groups as the broad-sense clique, that is, the network topology of the related participants derived from the narrow-sense clique, including insured persons, vehicle owners, drivers, and repair shops. Note that the behavior of both narrow-sense and broad-sense cliques is characterized by network organization and repetition.
The mathematical modeling process of the invention is as follows.
Define $p$, $p > 0$, as the probability that a vehicle does not take evasive action when it meets another vehicle. Given any two of $n$ vehicles at random, the probability that the two vehicles meet once is:
Let $\xi(n, \alpha, \beta)$ denote the probability that, among $n$ vehicles, any given $\alpha$ vehicles have $\beta$ consecutive collisions. The probability that two vehicles collide once is:
This expresses that a collision results only when the two vehicles meet and neither takes evasive action.
The probability that two vehicles have $\beta$ consecutive collisions is:
If $\beta > 1$, then:
so $\xi(n, 2, \beta) < \xi(n, 2, 1)$. Because:
when neither vehicle collides deliberately, the probability that two vehicles have $\beta$ consecutive collisions becomes extremely small as the number of vehicles increases.
For the multi-vehicle collision problem, as $n$ increases, the probability $\xi(n, \alpha, \beta)$ that $\alpha$ vehicles have $\beta$ consecutive collisions is also extremely small, that is:
For a given number of vehicles $n$ and number of consecutive collisions $\beta$, if there exist:
then:
For a given number of vehicles $n$ and $\alpha$ vehicles, if $\beta_1 \ge \beta_2$, then:
The minimal-probability event above ($\alpha$ vehicles having $\beta$ consecutive collisions) closely matches the key characteristics of clique fraud, namely organization and repetition. Finding the narrow-sense cliques, or vehicle cliques, of such minimal-probability events according to formulas (1), (2), and (3) is therefore a well-founded approach.
The technical solution first uses formula (1) to find the minimal-probability events above, that is, all vehicle collision networks satisfying $\xi(n, \alpha, \beta) > 0$ with $\alpha > 1$ and $\beta > 1$. It then sorts them according to formulas (2) and (3) to obtain the narrow-sense cliques with the highest degree of suspicion.
Because participants can evade narrow-sense clique detection by changing the vehicles used (through second-hand vehicle transactions) or by switching the roles they play, the technical solution also maps out, from the narrow-sense clique, the network topology of the related participants, including insured persons, vehicle owners, drivers, and repair shops.
Referring to Fig. 1, the invention provides a motor vehicle insurance fraud detection method, which can be implemented by the motor vehicle insurance fraud detection system 100 shown in Fig. 2. The system 100 may be a software unit, a hardware unit, or a combined software and hardware unit built into a computer device. Specifically, the method comprises the following steps S101 to S107:
Step S101: divide a predetermined time into several equal-length time periods, and build a vehicle-to-vehicle collision relation matrix from the event data in a first time period.
Step S102: compute collision networks from the collision relation matrix, and build the relation matrix between the collision networks and the vehicles.
Steps S101 and S102 are carried out by the network construction module 10 of the system 100. The computation proceeds as follows.
First, the following vectors are defined:
(1) the n × 1 motor vehicle column vector $V = (v_1, v_2, v_3, \ldots, v_{n-1}, v_n)^T$, where $v_i$ is the unique code of a vehicle and $1 \le i \le n$;
(2) the h × 1 related-personnel vector $D = (d_1, d_2, d_3, \ldots, d_{h-1}, d_h)^T$, where $d_i$ is the unique code of a related person and $1 \le i \le h$; related personnel include vehicle owners, insured persons, and drivers;
(3) the f × 1 repair shop vector $R = (r_1, r_2, r_3, \ldots, r_{f-1}, r_f)^T$, where $r_i$ is the unique code of a repair shop and $1 \le i \le f$.
The invention preferably divides the time range into equal-length statistical periods, denoted 0, 1, 2, 3, ..., t. The first time period in this embodiment is the designated period t (that is, from time t-1 to t). All insurance claim case information in the first time period t is mapped into an n × n vehicle-to-vehicle collision relation matrix $C_t$:
$$C_t = \left(c_{i,j,t}\right)_{n \times n}$$
Each $c_{i,j,t}$ is 0 or 1: 0 means that the two vehicles had no collision relation in period t, and 1 means that the two vehicles had a collision relation in period t. The diagonal entries are set to 0, $1 \le i \le n$, $1 \le j \le n$, and $C_t$ is a symmetric matrix.
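As an illustration of how claim cases map into the collision relation matrix, the following minimal Python sketch builds the matrix for one period from a list of colliding vehicle pairs. The function name `build_collision_matrix`, the 0-based vehicle indices, and the pair-list input format are assumptions made for this example and are not taken from the invention.

```python
def build_collision_matrix(n, collisions):
    """Return the n x n 0/1 collision relation matrix for one time period.

    `collisions` is an iterable of (i, j) pairs of 0-based vehicle indices
    (a hypothetical input format) that collided within the period.
    """
    c = [[0] * n for _ in range(n)]
    for i, j in collisions:
        if i == j:
            continue  # diagonal entries stay 0 by definition
        c[i][j] = 1
        c[j][i] = 1  # the matrix is symmetric
    return c


# Five vehicles; the (2, 2) pair is ignored because the diagonal is fixed at 0.
c_t = build_collision_matrix(5, [(0, 1), (1, 2), (3, 4), (2, 2)])
```

A dense list of lists is used only for readability; at the scale of millions of vehicles a sparse representation would be the more practical choice.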
Finally, $m(t)$ collision networks are built from the $C_t$ of period t, and an $m(t) \times n$ binary collision network matrix $G_t$ giving the correspondence between collision networks and vehicles is mapped out:
$$G_t = \left(g_{i,j,t}\right)_{m(t) \times n}$$
Each $g_{i,j,t}$ is 0 or 1: 0 means that vehicle $v_j$ does not belong to collision network $G_{i,t}$, and 1 means that vehicle $v_j$ belongs to collision network $G_{i,t}$, $1 \le i \le m(t)$, $1 \le j \le n$.
Preferably, this embodiment obtains $G_t$ by the following algorithm.
First step: define an empty set $M$; starting the loop from $v_1$, look up the row vector $C_{1,t}$ in the matrix $C_t$, record $v_1$ and add it to the set $M$, so that $M = \{v_1\}$;
Second step: traverse each element of the row vector $C_{1,t}$; if an element $c_{1,j,t}$ is 1, record $v_j$ and add it to the set $M$, so that $M = \{v_1, \ldots, v_j, \ldots\}$, $1 \le j \le n$;
Third step: after every element of $C_{1,t}$ has been traversed, set the access flag of $v_1$ to 1;
Fourth step: look up each element $v_i$ in the set $M$, $1 \le i \le n$; if the access flag of $v_i$ is 1, continue to the next $v_i$ whose access flag is not 1;
Fifth step: if the access flag of $v_i$ is not 1, look up the corresponding row vector $C_{i,t}$ of the matrix $C_t$ and traverse each element of $C_{i,t}$; for each element equal to 1, judge whether the corresponding vehicle is already contained in $M$, and if not, add it to $M$. Set the access flag of $v_i$ to 1 and return to the fourth step;
Sixth step: if the access flag of every $v_i$ in $M$ is 1, end this search. $M$ is then one collision relation network, and each of its elements is the code of a vehicle contained in that network, so $M$ can be mapped to a row vector of $G_t$. Repeating the flow of the first to fifth steps in turn builds each row vector of $G_t$ and finally completes the matrix $G_t$;
Seventh step: build $C_1, C_2, \ldots, C_{t-1}$ and $G_1, G_2, \ldots, G_{t-1}$ in the same way.
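The search in the steps above amounts to collecting the connected groups of vehicles in the collision relation matrix. The following minimal Python sketch shows one way to do this; the function name `collision_networks`, the 0-based indices, and the decision to skip vehicles with no collisions in the period are assumptions made for the example.

```python
def collision_networks(c):
    """Group vehicles into collision networks and return the 0/1 matrix G.

    Each row of G marks the vehicles belonging to one network. Vehicles
    with no collisions in the period are skipped (an assumption).
    """
    n = len(c)
    visited = [False] * n              # the "access flag" of each vehicle
    g = []
    for start in range(n):
        if visited[start] or not any(c[start]):
            continue
        members = [start]              # the set M, in discovery order
        in_m = {start}
        pos = 0
        while pos < len(members):      # expand members not yet visited
            i = members[pos]
            pos += 1
            visited[i] = True
            for j, hit in enumerate(c[i]):
                if hit == 1 and j not in in_m:
                    members.append(j)
                    in_m.add(j)
        g.append([1 if j in in_m else 0 for j in range(n)])
    return g


# Vehicles 0-1-2 form a chain, 3-4 form a pair, 5 has no collisions.
c_t = [
    [0, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
g_t = collision_networks(c_t)
# g_t == [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 0]]
```

Each vehicle row is scanned once, so the cost is proportional to the size of the matrix.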
After the network construction module 10 has computed the above matrix network data, the data can be passed to the similarity computing module 20 for analysis.
Step S103: compute the similarity between the collision networks obtained in the first time period t and the collision networks in other time periods, and delete the collision networks whose similarity does not satisfy the preset threshold. This step is carried out by the similarity computing module 20 of the system 100.
Specifically, the invention compares the collision networks $G_t$ generated in period t with all collision network matrices $G_{t-1}, G_{t-2}, \ldots, G_2, G_1$ generated in the earlier statistical periods t-1, t-2, ..., 2, 1, and uses the number of common nodes and the similarity to judge whether a narrow-sense clique is suspected of committing fraud. A similarity threshold can be preset; collision networks that do not satisfy the threshold are deleted, so that only suspicious collision networks are retained.
The invention first makes the following definitions.
The vehicle-to-vehicle collision relation matrix in period t-1 is defined as:
$$C_{t-1} = \left(c_{i,j,t-1}\right)_{n \times n}$$
$m(t-1)$ collision networks can be built in period t-1, and the relation matrix between those collision networks and the vehicles is defined as:
$$G_{t-1} = \left(g_{i,j,t-1}\right)_{m(t-1) \times n}$$
The collision network similarity is computed in this embodiment as follows.
First step: compute $G_t \times G_{t-1}^T$ to obtain the $m(t) \times m(t-1)$ matrix $S_t$ of common node counts between pairs of collision networks:
$$S_t = G_t \times G_{t-1}^T = \left(s_{i,j,t}\right)_{m(t) \times m(t-1)}$$
where $s_{i,j,t}$ is the number of common nodes between collision network $G_{i,t}$ and collision network $G_{j,t-1}$, that is, the number of vehicles that appear in both collision networks, $1 \le i \le m(t)$, $1 \le j \le m(t-1)$.
Second step: traverse every element $s_{i,j,t}$ and compute the similarity $e_{i,j,t}$ by the following formula:
where the formula uses the number of vehicles contained in collision network $G_{i,t}$ and the number of vehicles contained in collision network $G_{j,t-1}$.
Third step: build the $l(t) \times n$ common node matrix $U_t$, where $l(t) = m(t) \times m(t-1)$. Define a similarity threshold $\varepsilon$. $\varepsilon$ can be obtained from the mathematical expectation of the similarity: the parameters $(\alpha, \beta)$ of the probability distribution Beta$(\alpha, \beta)$ are estimated from the sequence $e_{i,j,t}$ by maximum likelihood, and the mathematical expectation $\alpha / (\alpha + \beta)$ is computed. In this embodiment $\varepsilon$ is 0.5; other embodiments based on other data may use another suitable value.
Traverse $e_{i,j,t}$; if $e_{i,j,t} \ge \varepsilon$, the codes of the vehicles common to collision network $G_{i,t}$ and collision network $G_{j,t-1}$ can be located as follows.
Take the i-th row vector of the matrix $G_t$: $G_{i,t} = (g_{i,1,t}\ g_{i,2,t}\ \ldots\ g_{i,n,t})$, where each $g_{i,j,t}$ is 0 or 1.
Similarly, take the j-th row vector of the matrix $G_{t-1}$: $G_{j,t-1} = (g_{j,1,t-1}\ g_{j,2,t-1}\ \ldots\ g_{j,n,t-1})$.
Taking the bitwise AND of $G_{i,t}$ and $G_{j,t-1}$ gives the k-th row vector of $U_t$, where $k = (i-1) \times m(t-1) + j$:
$$U_{k,t} = (g_{i,1,t}\ g_{i,2,t}\ \ldots\ g_{i,n,t}) \,\&\, (g_{j,1,t-1}\ g_{j,2,t-1}\ \ldots\ g_{j,n,t-1})$$
$$U_{k,t} = (u_{k,1,t}\ u_{k,2,t}\ \ldots\ u_{k,i,t}\ \ldots\ u_{k,n,t})$$
The corresponding vehicle codes $v_i$ can then be found from the entries with $u_{k,i,t} = 1$ ($1 \le i \le n$).
Fourth step: after every element with $e_{i,j,t} \ge \varepsilon$ has been traversed, the $l(t) \times n$ common node matrix $U_t$ is complete.
Fifth step: similarly, $G_t \times G_1^T, G_t \times G_2^T, \ldots, G_t \times G_{t-2}^T$ give the matrices $S_2, \ldots, S_{t-2}, S_{t-1}$, from which $U_2, \ldots, U_{t-2}, U_{t-1}$ are finally computed.
Once the similarity computing module 20 has obtained the collision network similarities, it compares them with the threshold and deletes the collision networks whose similarity is below the threshold, retaining the narrow-sense cliques suspected of fraud.
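The common node count matrix and the thresholded common node matrix described in the steps above can be sketched in Python as follows. This is a minimal illustration, not the invention's implementation: the function names are hypothetical, the similarity values `e` are made-up inputs (the similarity formula itself is not reproduced here), and leaving all-zero rows for pairs below the threshold is an assumption.

```python
def common_node_counts(g_t, g_prev):
    """S = G_t x G_prev^T: shared-vehicle counts for every pair of networks."""
    return [[sum(a * b for a, b in zip(row_t, row_p)) for row_p in g_prev]
            for row_t in g_t]


def common_node_matrix(g_t, g_prev, e, eps):
    """Build U: row k is the bitwise AND of row i of G_t and row j of G_prev.

    Pairs with e[i][j] < eps keep an all-zero row (an assumption). With
    0-based i and j, k = i * m_prev + j corresponds to the 1-based
    k = (i - 1) * m(t-1) + j used in the text.
    """
    n = len(g_t[0])
    m_prev = len(g_prev)
    u = [[0] * n for _ in range(len(g_t) * m_prev)]
    for i, row_t in enumerate(g_t):
        for j, row_p in enumerate(g_prev):
            if e[i][j] >= eps:
                u[i * m_prev + j] = [a & b for a, b in zip(row_t, row_p)]
    return u


g_t = [[1, 1, 1, 0, 0], [0, 0, 0, 1, 1]]                        # period t
g_prev = [[1, 1, 0, 0, 0], [0, 0, 1, 1, 0], [0, 0, 0, 0, 1]]    # period t-1
s = common_node_counts(g_t, g_prev)
sizes_t = [sum(row) for row in g_t]          # vehicles per network in G_t
sizes_prev = [sum(row) for row in g_prev]    # vehicles per network in G_{t-1}
e = [[0.8, 0.2, 0.0], [0.0, 0.3, 0.6]]       # made-up similarity values
u = common_node_matrix(g_t, g_prev, e, eps=0.5)
```

Here `s` holds the pairwise common node counts and `u` keeps one row per pair of networks, so a row index can be mapped back to the pair it came from.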
Step S104: perform rank-based sorting on the collision networks in the first time period.
Step S105: for each collision network, perform rank-based sorting against the collision networks of the time periods before the first time period, to obtain the first target group.
Steps S104 and S105 can be carried out by the sorting processing module 30 of the system 100. The rank sorting algorithm for the common nodes of collision networks in this embodiment is as follows.
First step: given the $l(t) \times n$ common node matrix $U_t$, take the row vector $U_{i,t} = (u_{i,1,t}\ u_{i,2,t}\ \ldots\ u_{i,n,t})$, $1 \le i \le l(t)$;
Second step: compute the row sum of the i-th row vector $U_{i,t}$ of $U_t$, $s(i) = \sum_{j=1}^{n} u_{i,j,t}$. For every $j$ with $u_{i,j,t} > 0$, extract the j-th rows and j-th columns of $C_t$ and of $C_{t-1}$ to build two $s(i) \times s(i)$ submatrices of the collision relations among the common vehicles: one extracted from $C_t$ after collision networks $G_t$ and $G_{t-1}$ have been compared (the superscript t-1 marks that comparison), and $Q_{t-1,i}$ extracted from $C_{t-1}$:
Each element of these submatrices represents the collision relation between vehicles $v_i$ and $v_j$: 0 means no collision and 1 means a collision. $Q_{t-1,i}$ is expressed similarly, as follows:
Third step: compute the ranks of the two submatrices to obtain the $l(t) \times 2$ matrix $R_t$ of rank results for $U_t$, as follows:
Fourth step: for $1 \le k \le m(t)$, traverse as follows:
The first quantity is the number of vehicles contained in collision network $G_{k,t}$.
The second quantity, $order_2(k)$, is the accumulated value of the common node rank computations obtained by comparing collision network $G_{k,t}$ in the period-t collision network matrix $G_t$ with each collision network $G_{i,t-1}$ in the period-(t-1) collision network matrix $G_{t-1}$.
$order_3(k) = order_3(k) + 2m(t-1)$ is the number of ranks accumulated when computing $order_2(k)$.
Fifth step: similarly, repeating the first to fourth steps gives the rank result matrices $R_{t-1}, R_{t-2}, \ldots, R_2$ corresponding to $U_{t-1}, U_{t-2}, \ldots, U_2$, traversing $m(t-i)$ times each, $2 \le i \le t-1$, where $m(t-i)$ is the row dimension of the matrix $G_{t-i}$ and $m(t)$ is the row dimension of the matrix $G_t$.
$order_3(k) = order_3(k) + 2m(t-i)$, $2 \le i \le t-1$
Sixth step: let $y(k)$ be defined as:
Sorting all $order_1(k)$ and $y(k)$ in descending order gives the $m(t) \times n$ matrix $F_t^{(1)}$ of collision network suspicion ranking results.
Seventh step: extract from $F_t^{(1)}$ all collision networks with $y(k) > 0$ to obtain the $m \times n$ matrix of highly suspicious collision network ranking results.
Eighth step: for each collision network in that highly suspicious ranking matrix, use the collision network common node rank result matrices $R_t, R_{t-1}, R_{t-2}, \ldots, R_3, R_2$ to compute, for each $G_{k,t}$ in $G_t$, the average $r_{k,i,j}$ of the common node matrix ranks with $G_{j,t-i}$ in $G_{t-i}$, as follows:
Then sort the $G_{j,t-i}$ by $r_{k,i,j}$ in descending order to obtain the highly suspicious narrow-sense cliques (the first target group) in $G_{k,t}$.
By computing the rank of the vehicle relation matrix of the common nodes, the invention measures how connected the common nodes are across different collision networks. A large accumulated rank indicates that the common nodes are closely linked and highly associated, and hence that the suspicion of clique fraud is higher. Combining the number of common nodes between collision networks with the rank sorting results improves the accuracy of identifying motor insurance fraud cliques; this process applies the ordering by maximum degree of suspicion given by formulas (2) and (3) of the model above.
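The rank of a common node submatrix, as used in the second and third steps of the sorting algorithm, can be illustrated with the short Python sketch below. The helper names `submatrix` and `matrix_rank` are hypothetical, and Gaussian elimination over exact fractions is simply one convenient way to compute a rank; the source does not say how the rank is computed.

```python
from fractions import Fraction


def submatrix(c, u_row):
    """Extract the rows and columns of c for the vehicles flagged 1 in u_row."""
    idx = [j for j, bit in enumerate(u_row) if bit]
    return [[c[a][b] for b in idx] for a in idx]


def matrix_rank(m):
    """Rank of a small matrix by Gaussian elimination with exact arithmetic."""
    rows = [[Fraction(x) for x in row] for row in m]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue                     # no pivot in this column
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            if rows[r][col] != 0:
                factor = rows[r][col] / rows[rank][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


# Vehicles 0, 1, 2 all collided with one another; vehicle 3 is unrelated.
c_t = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
q = submatrix(c_t, [1, 1, 1, 0])   # common vehicles 0, 1, 2
rank_q = matrix_rank(q)            # 3: the three common vehicles are fully linked
```

A submatrix of common vehicles that never collided with each other is all zeros and has rank 0, which is consistent with the text's reading that a larger rank indicates more closely linked common nodes.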
In practice, much fraud is becoming increasingly covert: vehicles are often transferred between different persons and then insured with different insurers, or the same person is involved in accidents while driving different vehicles within a given period. Focusing only on the vehicle collision networks would therefore miss closely related personnel. To overcome this defect, the invention uses steps S106 and S107 to identify the broad-sense cliques of related personnel, which helps identify fraud and makes it easier to collect evidence in fraud cases. Broad-sense cliques are divided into related-personnel cliques (the second target group) and highly suspicious repair shops (the third target group).
Specifically:
Step S106: obtain the second target group according to the relation matrix of related personnel and vehicles.
Step S107: obtain the third target group according to the relation matrix of repair shops and vehicles.
Steps S106 and S107 are carried out by the recognition processing module 40 of the system 100. Specifically, the network construction module 10 first builds the relation matrix of related personnel and vehicles, and the recognition processing module 40 then runs the following related-personnel clique (second target group) recognition algorithm.
First step: build from the data the h × 1 related-personnel vector $D = (d_1, d_2, d_3, \ldots, d_{h-1}, d_h)^T$, where $d_i$ is the unique code of a vehicle owner, driver, insured person, or the like, $1 \le i \le h$.
Second step: map the insurance claim cases in period t into an h × n binary matrix $A_t$ of the relation between related personnel and vehicles. The matrix records the relation between motor vehicles and the owner, driver, and insured person of the reporting vehicle and the driver of the third-party vehicle:
$$A_t = \left(a_{i,j,t}\right)_{h \times n}$$
Each $a_{i,j,t}$ is 0 or 1: 0 means that person $d_i$ had no association with motor vehicle $v_j$ in period t, and 1 means that person $d_i$ was associated with motor vehicle $v_j$ in period t, $1 \le i \le h$, $1 \le j \le n$. An association means a direct relation to the motor vehicle such as owning it, driving it, benefiting from a claim payment on it, or colliding with it.
Third step: use a matrix transformation algorithm, that is, use the person-vehicle association matrix $A_t$ and the collision relation matrix $C_t$, to map out an h × h matrix between related personnel.
Element $(i, j)$ of this matrix is the number of times related persons $d_i$ and $d_j$ were directly or indirectly associated through vehicle collision relations in period t, $1 \le i \le h$, $1 \le j \le h$.
Fourth step: using the same algorithm that builds the matrix $G_t$ in the network construction module 10, build θ(t) interpersonal relationship networks from the h × h matrix and map out a θ(t) × h matrix of the correspondence between related-personnel networks and related personnel.
Element $(i, j)$ of this matrix is the number of times, or the weight, with which a person appears in an interpersonal relationship network, $1 \le i \le \theta(t)$, $1 \le j \le h$.
Fifth step: using the same algorithm as the similarity computing module 20, build the corresponding matrices and use them to compute the common-person matrix between interpersonal relationship networks.
Sixth step: using the same algorithm as the sorting processing module 30, perform rank sorting to obtain the θ(t) × d matrix $F_t^{(2)}$ of broad-sense clique suspicion ranking results and the highly suspicious ranking results.
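The third step above maps the person-vehicle matrix and the collision matrix to a person-to-person matrix, but the exact formula is not reproduced in this text. The Python sketch below assumes one plausible form, the product $A_t \times C_t \times A_t^T$, which has the required h × h shape and counts, for each pair of persons, the colliding vehicle pairs that link them. Both this formula and the helper names are assumptions made for illustration.

```python
def matmul(x, y):
    """Plain list-of-lists matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]


def transpose(x):
    return [list(col) for col in zip(*x)]


def person_links(a, c):
    """Assumed form A x C x A^T: (i, j) counts the colliding vehicle pairs
    that connect person i with person j."""
    return matmul(matmul(a, c), transpose(a))


# 3 persons x 3 vehicles: person 1 is associated with vehicles 1 and 2.
a_t = [
    [1, 0, 0],
    [0, 1, 1],
    [0, 0, 1],
]
# Only vehicles 0 and 1 collided in the period.
c_t = [
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
links = person_links(a_t, c_t)
# links == [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
```

In this toy example persons 0 and 1 are linked once through the collision between vehicles 0 and 1, while person 2, associated only with vehicle 2, has no links.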
After the network construction module 10 has built the relation matrix of repair shops and vehicles, the recognition processing module 40 runs the following suspicious repair shop (third target group) recognition algorithm.
First step: assuming that the vehicles concerned involve f repair shops in total, an n × f matrix $B_t$ can be built,
where $b_{i,j,t}$ indicates whether the j-th vehicle was repaired at the i-th repair shop (1 if so, otherwise 0), $1 \le i \le f$, $1 \le j \le n$.
Second step: compute the $m \times f$ repair-shop repair count matrix $F_t^{(3)}$ of the suspect vehicles, that is:
Third step: let the 1 × f weight vector be $W = rowp(F_t^{(3)})$, where $rowp(F_t^{(3)})$ is the vector obtained by accumulating the matrix $F_t^{(3)}$ over its rows and normalizing. Because the computation is based on the highly suspicious ranking result matrix of the collision networks, $W$ is the weight distribution over all repair shops after sorting.
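A minimal Python sketch of the repair shop weighting follows. The formula for $F_t^{(3)}$ is not reproduced in this text, so the sketch assumes it is the product of the m × n highly suspicious collision network matrix and the n × f vehicle-to-repair-shop matrix, which matches the stated m × f shape. The function name `repair_shop_weights` and the sample data are hypothetical.

```python
def repair_shop_weights(f_high, b):
    """Return (F3, W) for suspicious networks f_high (m x n) and repairs b (n x f).

    F3[k][s] counts vehicles of suspicious network k repaired at shop s
    (assumed to be the matrix product f_high x b). W sums F3 over its rows
    and normalizes the totals so that they add up to 1.
    """
    f3 = [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
          for row in f_high]
    totals = [sum(col) for col in zip(*f3)]
    grand = sum(totals)
    w = [t / grand for t in totals] if grand else [0.0] * len(totals)
    return f3, w


# Two suspicious networks over four vehicles.
f_high = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
# Vehicles 0, 1, 2 were repaired at shop 0; vehicle 3 at shop 1.
b_t = [
    [1, 0],
    [1, 0],
    [1, 0],
    [0, 1],
]
f3, w = repair_shop_weights(f_high, b_t)
# f3 == [[2, 0], [1, 1]] and w == [0.75, 0.25]
```

In this example shop 0 handled three of the four suspicious-vehicle repairs, so it carries three quarters of the weight.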
Preferably, new vehicles can be added to the model of the invention. Suppose z motor vehicles are added; the vector $V$ is then extended to $V = (v_1, v_2, v_3, \ldots, v_{n+z-1}, v_{n+z})^T$. Likewise, when x related persons are added, the vector $D$ is extended to $D = (d_1, d_2, d_3, \ldots, d_{h+x-1}, d_{h+x})^T$. All other processing is performed as in the steps above.
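When the vehicle vector grows, matrices indexed by vehicle have to grow with it. The sketch below shows one simple way to do this for a square collision matrix, by padding with zeros; the zero padding and the function name `extend_collision_matrix` are assumptions, since the text only describes extending the vectors.

```python
def extend_collision_matrix(c, z):
    """Return a copy of the n x n matrix c padded to (n + z) x (n + z) with zeros."""
    n = len(c)
    extended = [row + [0] * z for row in c]             # widen existing rows
    extended += [[0] * (n + z) for _ in range(z)]       # add rows for new vehicles
    return extended


c_old = [[0, 1], [1, 0]]
c_new = extend_collision_matrix(c_old, 1)
# c_new == [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
```

The original matrix is left unchanged, and the new vehicles start with no recorded collisions.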
The invention has the following advantages:
(1) It identifies clique behavior that has an extremely small probability of occurring yet is highly suspicious, and so finds highly suspicious fraud cliques. The method needs no confirmed fraud samples and no model training, so it can be applied directly.
(2) It uses matrix-based similarity computation, rank sorting, and transformation algorithms to identify highly suspicious cliques. Because every stage is carried out as numerical matrix computation, computational efficiency is improved.
(3) It introduces the concept of the broad-sense clique. For fraud whose methods are increasingly sophisticated and covert, it identifies cliques comprehensively and avoids the omission of clique members that results from analysing vehicles alone, so it identifies motor insurance fraud more comprehensively and accurately than existing methods.
The invention has been implemented in our company's commercial application system InsGuardV1.0, which is in commercial use in certain regions and has computed highly suspicious cliques there. Compared with a database-based computation, the system's computation is 100 to 200 times more efficient.
The invention is further explained below through a specific embodiment.
1) First obtain the required data.
Claim and policy data for the past three years were obtained from a regional motor insurance platform. The data set contains 2,979,878 claims and 12,128,776 policies. Vehicles were first coded under the principle that each vehicle is uniquely identified, giving 2,729,275 vehicles in total. The computation starts from the claim cases of March 2011, is divided by month, and generates the corresponding collision networks. The fraud identification method is illustrated below with the claim data of the period 20130601-20130630. The main data in the claim case information are shown in Fig. 3A.
2) collision relational matrix is built
Information of vehicles of reporting a case to the security authorities, three's information of vehicles in 94,618 compensation cases that in June, 2013 is in danger and has wound up the case
Construct the collision relational matrix C between vehicle (n=2,729,275)t, wherein there is collision relation to be 1, the relation of colliding is not
0, CtIt is the symmetrical matrix that scale is 2,729,275 × 2,729,275, diagonal is 0.
3) collision network matrix is extracted
The algorithm of collision network matrix is built using above-mentioned steps S101 and step S102, from relational matrix CtMiddle extraction is touched
Hit the relational matrix G of network and vehiclet, the collision network matrix of the scale of m (t) × n=26563 × 2729275 is obtained, such as
Shown in Fig. 3 B.
4) Compute the similarity between collision networks and filter by threshold.
By the algorithm of step S103 above, the collision network matrix generated for this month is multiplied by
the transpose of each earlier collision network matrix, G_t × G_{t-i}^T, giving the matrix S_{t,t-i} of
common-node counts between collision networks, from which the similarity e_{i,j,t} between each pair of
collision networks is computed.
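The product G_t × G_{t-i}^T of step 4) can be sketched as follows. The common-node count follows directly from the matrix product; the similarity formula, however, is defined in step S103 and is not repeated in this embodiment, so the normalization used here (shared vehicles divided by the size of the smaller network) and the threshold value 0.6 are assumptions for illustration only.

```python
def common_node_counts(G_a, G_b):
    """Compute S = G_a x G_b^T for 0/1 membership matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row_a, row_b)) for row_b in G_b]
            for row_a in G_a]


def similarities(G_a, G_b):
    """Assumed similarity: shared vehicles divided by the size of the smaller network."""
    S = common_node_counts(G_a, G_b)
    size_a = [sum(row) for row in G_a]
    size_b = [sum(row) for row in G_b]
    return [[S[i][j] / min(size_a[i], size_b[j]) if S[i][j] else 0.0
             for j in range(len(G_b))]
            for i in range(len(G_a))]


def filter_by_threshold(E, threshold):
    """Keep the (i, j, similarity) triples whose similarity reaches the threshold."""
    return [(i, j, e) for i, row in enumerate(E) for j, e in enumerate(row)
            if e >= threshold]


# Toy membership matrices for the current period and one earlier period.
G_t = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 0]]
G_prev = [[1, 1, 0, 0, 0, 1], [0, 0, 1, 1, 0, 0]]

S = common_node_counts(G_t, G_prev)
E = similarities(G_t, G_prev)
kept = filter_by_threshold(E, threshold=0.6)  # hypothetical threshold
```

At full scale the same product would be taken between sparse matrices, since almost all entries of G_t are zero.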
5) Rank the collision networks.
According to the algorithms of steps S104 and S105, after the 26,563 collision networks of the measurement
period are compared with the earlier collision networks, there are 42,219 nonzero values s_{i,j,t}; after
the similarity is computed and filtered by threshold, 123 elements remain. The 123 common-node relations are
grouped by main collision network code and the average rank is computed, giving a ranking of the
corresponding 100 main collision network codes, i.e. the suspiciousness ranking of the collision networks.
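The average-rank calculation of step 5) might look like the sketch below. It assumes that each remaining relation is ranked by similarity in descending order (rank 1 being the highest) and that the ranks are then averaged per main collision network code; the exact rank statistic of steps S104 and S105 is not restated here, so this reading, the function name and the network codes are hypothetical.

```python
from collections import defaultdict


def rank_main_networks(relations):
    """Order main network codes by the mean rank of their remaining relations.

    relations: list of (main_code, other_code, similarity) triples.
    Assumption: rank 1 is the most similar relation; ties keep input order.
    """
    ordered = sorted(relations, key=lambda r: r[2], reverse=True)
    ranks = defaultdict(list)
    for position, (main, _other, _similarity) in enumerate(ordered, start=1):
        ranks[main].append(position)
    mean_rank = {main: sum(p) / len(p) for main, p in ranks.items()}
    return sorted(mean_rank.items(), key=lambda item: item[1])


# Hypothetical relations that survived the threshold filter.
relations = [
    ("N1", "P3", 0.9),
    ("N2", "P1", 0.8),
    ("N1", "P7", 0.6),
    ("N3", "P2", 0.7),
]
ranking = rank_main_networks(relations)
```

Under this assumption a smaller mean rank indicates a more suspicious main collision network.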
6) Perform sub-ranking within each collision network.
For each main collision network, its relations with the other sub collision networks are sorted by the
magnitude of the average rank of their common nodes, giving the highly suspicious narrow-sense cliques. The
system can display the common-node relations between collision networks graphically.
7) Identify personnel cliques.
Related personnel are uniquely coded by ID card number. Then, following the method of step S106 in the same
way as for identifying narrow-sense cliques, the highly suspicious personnel cliques are obtained. Because
the personnel information in the current real-world base data is incomplete, the method could not be
verified on large-scale real data; however, it was implemented on a small range of complete data, where
personnel cliques could be located.
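How the relation between related personnel and vehicles can carry a vehicle clique over to a personnel clique is sketched below. This is a simplified assumption, not the method of step S106: a person is attached to a clique when at least one of that person's vehicles belongs to it. The personnel codes and vehicle codes are hypothetical.

```python
def personnel_for_group(person_vehicle, group_vehicles):
    """Return the personnel codes linked to at least one vehicle of the group.

    person_vehicle: dict mapping a personnel code to the vehicle codes related to it,
    i.e. the nonzero entries of one row of the personnel-vehicle relational matrix.
    """
    group = set(group_vehicles)
    return sorted(person for person, vehicles in person_vehicle.items()
                  if group & set(vehicles))


# Hypothetical personnel-vehicle relation and one suspicious vehicle clique.
person_vehicle = {"P001": [0], "P002": [1, 3], "P003": [5]}
suspicious_group = [0, 1, 2]
personnel_clique = personnel_for_group(person_vehicle, suspicious_group)
```

In matrix terms this corresponds to the nonzero entries of the product of the personnel-vehicle relational matrix with the membership vector of the vehicle clique.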
8) Identify highly suspicious repair shops.
All repair shops are uniquely coded. From the repair relations between repair shops and vehicles, together
with the suspiciousness ranking of the vehicle collision networks, the proportion of suspicious-clique
vehicles among the vehicles repaired by each repair shop is obtained; repair shops with a high proportion
are the highly suspicious repair shops.
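The proportion described in step 8) can be sketched as follows, assuming each repair shop is represented by the set of distinct vehicle codes it has repaired. The shop codes, vehicle codes and function name are hypothetical, and repeated repairs of the same vehicle are counted once in this simplified version.

```python
def suspicious_share(shop_vehicles, suspicious_vehicles):
    """Rank repair shops by the share of suspicious-clique vehicles they repaired."""
    suspicious = set(suspicious_vehicles)
    shares = {}
    for shop, vehicles in shop_vehicles.items():
        repaired = set(vehicles)
        shares[shop] = len(repaired & suspicious) / len(repaired) if repaired else 0.0
    # Highest share first; shops with equal shares keep their input order.
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


# Hypothetical repair relations and suspicious vehicles from the ranked collision networks.
shop_vehicles = {"S1": [0, 1, 2, 5], "S2": [3, 4], "S3": [5]}
suspicious_vehicles = [0, 1, 2]
shop_ranking = suspicious_share(shop_vehicles, suspicious_vehicles)
```

Shops at the top of `shop_ranking` are the candidates for highly suspicious repair shops.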
Compared with conventional methods, the present invention is the first to integrate clique network analysis
into matrix computation and apply it to motor vehicle insurance fraud detection, and it offers more accurate
and efficient practical value than conventional methods: 1. no fraud samples are required and no training is
needed; 2. the matrix form enables fast computation; 3. the vehicle collision relations of a suspicious
fraud clique are mapped to network relations, which avoids the influence of the various deliberate evasion
behaviors on identification and detection.
In summary, the present invention performs clique network analysis on fraud. In a specific application, the
predetermined time is first divided into several periods of equal length, a collision relational matrix
between vehicles is built from the event data in the first time period, and a relational matrix between the
collision networks and the vehicles is then built. Further, the invention computes the similarity between
the collision networks in the first time period and the collision networks in other periods, and deletes
the collision networks whose similarity does not satisfy the predetermined threshold, whereby suspicion of
offending by a narrow-sense clique can be judged. Preferably, the invention may also rank the collision
networks in the first time period, and rank each collision network against the collision networks of the
periods preceding the first time period, to obtain the first target group; the first target group
corresponds to the network topology formed by multiple motor vehicles through collision relations.
Similarly, a personnel relational network is computed from the relational matrix of related personnel and
vehicles in the first time period, and ranking the personnel relational network gives the second target
group, i.e. the network topology formed by several persons through collision relations. Through large-scale
matrix computation, the matrix of repair counts of suspicious vehicles across multiple repair shops is
obtained; accumulating and normalizing the repair counts gives the third target group, i.e. the
distribution weights of suspicious vehicles over all repair shops. The invention thus provides rapid
identification of highly suspicious motor insurance fraud cliques and early warning on claim cases: on the
basis of a large volume of claim case data, it can quickly and accurately identify highly suspicious
vehicle collision cliques and broad-sense personnel cliques.
Of course, the present invention may also have various other embodiments. Without departing from the spirit
and essence of the invention, those skilled in the art may make various corresponding changes and
modifications according to the invention, and all such changes and modifications shall fall within the
protection scope of the appended claims.