CN101187927B

CN101187927B - Criminal case joint investigation intelligent analysis method

Info

Publication number: CN101187927B
Application number: CN2007100508540A
Authority: CN
Inventors: 刘启和; 张建中; 陈雷霆; 闵帆; 何明耘
Original assignee: University of Electronic Science and Technology of China
Current assignee: University of Electronic Science and Technology of China
Priority date: 2007-12-17
Filing date: 2007-12-17
Publication date: 2010-12-15
Anticipated expiration: 2027-12-17
Also published as: CN101187927A

Abstract

The invention provides an intelligent analysis method for accurately and efficiently retrieving the text and images used to link serial and related criminal cases. The method comprises: extracting the image and text data in a database to form a multidimensional feature vector for each case; defining an operation that treats continuous data and discrete symbolic data uniformly; assigning different weights to the components of the multidimensional vector; reducing the dimension of each case's feature vector using rough-set reduction; and computing, on the reduced vectors, the similarity between the case under analysis and each case in the database, thereby finding the cases in the database that are linked to the case under analysis. The invention lets an analyst's experience and knowledge drive flexible, interactive retrieval and comparison, gives investigators more accurate information about linked cases, and improves the efficiency of solving cases.

Description

An intelligent analysis method for linking related criminal cases

Technical field

The present invention relates to an intelligent analysis method for linking related criminal cases.

Background technology

Criminal investigators collect a large amount of information at crime scenes and store it in computer systems: footprint photographs, scene photographs, descriptions of objects found at the scene, descriptions of the crime location, and so on. This information takes many forms, including discrete numbers and symbols, text, and images. Because crimes are now often committed across regions, by itinerant offenders, and in series, criminal intelligence analysts must run complex searches and comparisons over the cases already stored in the system, based on the features of the current case, to find which cases may have been committed by the same person or group, and thereby supply evidence and leads for solving the case. At present, linked cases are mainly identified by simple retrieval and manual comparison. This approach is very inefficient, and as the analyst's workload and fatigue grow, the accuracy of the manual analysis drops sharply, which slows down the solving of cases. Methods for extracting image features and for classifying text are individually common. However, the data collected at a crime scene is varied: it contains both text and image data, and both discrete and continuous data. In addition, analysts link cases in many different ways and need a tool that can analyze linkage information under many combinations of criteria. Currently known image retrieval or text retrieval techniques are therefore difficult to apply in a computer-aided linked-case analysis system and cannot meet the efficiency and accuracy that such analysis requires.

Summary of the invention

The object of the invention is to provide an intelligent analysis method for linking related criminal cases that retrieves both images and text accurately and efficiently.

The object of the invention is achieved by the following technical solution:

An intelligent analysis method for linking related criminal cases comprises the following steps:

Step 1: extract the image features and the text features of each case in the database.

Step 2: represent each feature extracted from the images and text of a case as one component (one dimension) of that case's vector; all the image and text features extracted for a case together form the multidimensional vector of the case.

Step 3: assign a weight to each component of each case's vector; compute the similarity between the cases in the database to obtain a similarity matrix; then specify a threshold and compute the neighborhood of each case, obtaining neighborhood rough set system 1 of the database.

Step 4: reduce the dimension of each case's multidimensional vector.

This is done by removing one component from the multidimensional vector of every case, assigning weights to the remaining components, computing the similarity between cases without that component, and, using the same threshold as in step 3, computing the neighborhood of each case without that component, which gives neighborhood rough set system 2 of the database. Neighborhood rough set system 1 from step 3 is then compared with neighborhood rough set system 2. If the two differ significantly, the component cannot be removed and the dimension cannot be reduced. If the two differ little, the removed component is reduced away and the dimension of every case's vector is reduced accordingly. The same procedure is repeated for each of the other components: reducible components are removed and irreducible components are kept, finally yielding the reduced multidimensional vector of each case.

Step 5: using the reduced vectors, compute the similarity between the case under analysis and each case in the database, and find the cases in the database that are linked to it.

Step 6: if step 5 does not give a satisfactory result, readjust the weights of the case vector components in step 3, adjust the threshold used to compute the case neighborhoods, and repeat steps 3 to 5 until a linked-case result is obtained. Here, in step 2, where each feature extracted from the images and text of a case is represented as one component of the case's vector, a "feature" of the image or text means an attribute-value pair as listed in the following table. The value of each attribute of a case is one component of its vector, and all attribute-value pairs together form the multidimensional vector of the case:

Attribute	Property value
Position of a feature point in the image	Pixel value of the feature point
Word in the text data	Frequency of the word in the text
Text used for a specific description	Discrete data
Number used for a specific description	Continuous data

The text used for a specific description may be the crime tool, whose property values may include knife and gun; it may be the number of offenders, whose property value is discrete data; it may be the length of a footprint at the scene, whose property value is continuous data. The method is characterized in that:

1) Each case C_i is represented as an n-dimensional vector (v_{i1}, v_{i2}, ..., v_{in}) that contains both continuous numeric data and discrete symbolic data. Let v and s be property values of the same attribute, and define the operation "·" as follows: for continuous data, v·s is the ordinary product of v and s; for discrete data, v·s = 1 when v = s and v·s = 0 when v ≠ s.

2) The image and text features of a case in step 1 are extracted as follows:

Step 1-1: compute the average squared-gradient matrix of each pixel in the image:

N(x, y) = \begin{bmatrix} \left(\frac{\partial I}{\partial x}\right)^{2} & \frac{\partial I}{\partial x} \frac{\partial I}{\partial y} \\ \frac{\partial I}{\partial x} \frac{\partial I}{\partial y} & \left(\frac{\partial I}{\partial y}\right)^{2} \end{bmatrix}

where I(x, y) is the gray value at (x, y) in the image. When both eigenvalues of the average squared-gradient matrix at a point (x, y) are large, the point (x, y) is a feature point, and the feature point response function is:

R = det(N) - k (trace(N))^2,

where det(N) is the determinant of the matrix N, trace(N) is the trace of the matrix N, and k is 0.04. The pixels of the image are sorted in descending order of R to form a sequence, a required number F of feature points is chosen, and the first F pixels of the sequence are taken as the feature points; the position information of the feature points forms the feature point vector.

Step 1-2: the text features of a case are extracted as follows:

The text is segmented into words and tagged with parts of speech, and the function words are removed. The remaining words are denoted v_1, v_2, ..., v_n. The frequency of each word v_i in the text is computed and denoted p_i; taking the words as dimensions gives a word frequency vector (p_1, p_2, ..., p_n).

3) The weight vector in step 3 is normalized as follows:

Step 3-1: the weights r_i of the components of a case's multidimensional vector make up the weights of the vector, which are therefore written as the weight vector:

R = (r_1, r_2, ..., r_n),

Step 3-2: the weight vector R is normalized by the following formula:

W = \left( \frac{r_{1}}{\sum_{i=1}^{n} r_{i}}, \frac{r_{2}}{\sum_{i=1}^{n} r_{i}}, \ldots, \frac{r_{n}}{\sum_{i=1}^{n} r_{i}} \right) = (w_{1}, w_{2}, \ldots, w_{n})

where W = (w_1, w_2, ..., w_n) is the normalized weight vector.

4) The similarity between cases in step 3 is computed from the weights of the vector by the following method:

Step 3-3: compute the similarity between two cases in the database.

Let C_1 and C_2 be two cases with corresponding vectors (v_1, v_2, ..., v_n) and (s_1, s_2, ..., s_n). The similarity between C_1 and C_2 is computed by the following formula:

S(C_{1}, C_{2}) = \frac{\sum_{i=1}^{n} w_{i} (v_{i} \cdot s_{i})}{\sqrt{\sum_{i=1}^{n} v_{i} \cdot v_{i}} \sqrt{\sum_{i=1}^{n} s_{i} \cdot s_{i}}}

Here w_i is the i-th component of the normalized weight vector, v_i·v_i and s_i·s_i are the "·" operation applied to the i-th component of the vector of C_1 and of C_2 with itself, and (v_i·s_i) is the "·" operation applied to the corresponding i-th components of the two cases C_1 and C_2.

Step 3-4: compute the similarity between all cases in the database to obtain the similarity matrix:

Suppose the database contains m cases C_1, C_2, ..., C_m, each represented as an n-dimensional vector. Applying step 3-3 to every pair of cases gives the similarity matrix of all cases:

MS = \begin{pmatrix} S(C_{1}, C_{1}) & S(C_{1}, C_{2}) & \cdots & S(C_{1}, C_{m}) \\ S(C_{2}, C_{1}) & S(C_{2}, C_{2}) & \cdots & S(C_{2}, C_{m}) \\ \vdots & \vdots & \ddots & \vdots \\ S(C_{m}, C_{1}) & S(C_{m}, C_{2}) & \cdots & S(C_{m}, C_{m}) \end{pmatrix}

Step 3-5: specify a threshold K. From K and the similarity matrix, the neighborhood N(C_i) of any case C_i is computed by the following formula:

N(C_i) = {C_j | S(C_i, C_j) ≤ K, j ∈ {1, 2, ..., m}},

Step 3-6: compute the neighborhood of every case in the database to obtain neighborhood rough set system 1:

NS = {N(C_1), N(C_2), ..., N(C_m)}

5) The dimension reduction of the multidimensional vector in step 4 is carried out as follows:

Step 4-1: suppose each case in the database has n components and let F = {1, 2, ..., n}. Let C_1 and C_2 be two cases with corresponding vectors (v_1, v_2, ..., v_n) and (s_1, s_2, ..., s_n). Remove any one component i from F, letting F = F - {i}, and recompute the similarity of the two cases after the removal by the following formula:

S(C_{1}, C_{2}) = \frac{\sum_{i \in F} w_{i} (v_{i} \cdot s_{i})}{\sqrt{\sum_{i \in F} v_{i} \cdot v_{i}} \sqrt{\sum_{i \in F} s_{i} \cdot s_{i}}},

Step 4-2: from the similarities computed in step 4-1, apply steps 3-5 and 3-6 to obtain neighborhood rough set system 2:

NS^* = {N^*(C_1), N^*(C_2), ..., N^*(C_m)}

Step 4-3: compare the neighborhood rough set systems NS and NS^*, defining:

L = \frac{1}{2n} \times \sum_{i=1}^{n} \left( \frac{|N(C_{i}) \cap N^{*}(C_{i})|}{|N(C_{i})|} + \frac{|N(C_{i}) \cap N^{*}(C_{i})|}{|N^{*}(C_{i})|} \right),

L describes the degree of difference between the neighborhood rough set systems NS and NS^*: the larger its value, the smaller the difference. When L is less than a specified threshold, let F = F ∪ {i}, that is, component i cannot be removed; otherwise component i is removed.

Step 4-4: repeat steps 4-1 to 4-3 until no further component can be removed from F, which gives the reduced multidimensional vector F.

6) The similarity analysis between the case under analysis and the data in the database in step 5 is carried out as follows:

Step 5-1: for a specified case C_p, compute on the reduced vector F the similarity between C_p and each case in the database to obtain the similarity vector:

(S(C_p, C_1), S(C_p, C_2), ..., S(C_p, C_m)),

Step 5-2: compute the neighborhood of C_p as follows:

N(C_p) = {C_j | S(C_p, C_j) ≤ K, j ∈ {1, 2, ..., m}},

where the cases in the neighborhood N(C_p) are the cases linked to case C_p.

Before step 1-1, the image is preprocessed by the following steps:

Step 1-0: acquire data from a specific region of the image, such as a footprint;

Step 1-01: determine the footprint region in the photograph, including the front edge point and the heel point of the footprint;

Step 1-02: connect the front edge point and the heel point of the footprint with a line segment, take the midpoint of this segment as the origin, the segment as the y-axis, and the line perpendicular to it as the x-axis, establish a region coordinate system, and compute the position of each pixel of the region in this coordinate system.

The invention extracts useful feature vectors from the images and text in the database and maps the user's knowledge to a weight vector. Using this weight vector together with rough set theory, it dynamically reduces and selects the components of the vector, and then computes similarity on the reduced vector to carry out the linked-case analysis. The invention defines an operation that handles continuous data and discrete symbolic data uniformly. This avoids two drawbacks of reduction when the multidimensional vector contains continuous data: either the positive region cannot be computed, or the continuous data must first be discretized before the positive region is computed, which loses a large amount of useful information. With the invention, the analyst's experience and knowledge can drive flexible, interactive retrieval and comparison, which gives investigators more accurate linked-case information and improves the efficiency of solving cases.

Description of drawings

Fig. 1 is a flowchart of the invention.

Embodiment

The invention is described in further detail below with reference to the accompanying drawing:

The database used for analysis and retrieval contains the data of each criminal case, including the existing linked-case analysis results, in which cases that have already been linked together are marked.

Step 1: extract features from the image data and the text data of each case in the database by the following steps.

Step 1-0: if the images include footprint photographs, preprocess them before extracting features.

Step 1-01: determine the footprint region in the photograph and the front edge point and heel point of the footprint.

Step 1-02: connect the front edge point and the heel point of the footprint with a line segment, take the midpoint of this segment as the origin, the segment as the y-axis, and the line perpendicular to it as the x-axis, establish a region coordinate system, and compute the position of each pixel of the region in this coordinate system.
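The coordinate system of step 1-02 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `footprint_frame` is hypothetical, the y-axis is assumed to point from the heel point toward the front edge point, and the x-axis is assumed to be the y-axis rotated clockwise by 90 degrees (the text fixes neither orientation).

```python
import math


def footprint_frame(front, heel):
    """Return a function mapping image points to footprint coordinates.

    `front` and `heel` are (x, y) image positions of the front edge point
    and the heel point. The origin is the midpoint of the segment joining
    them and the y-axis runs along that segment.
    """
    ox, oy = (front[0] + heel[0]) / 2.0, (front[1] + heel[1]) / 2.0
    dx, dy = front[0] - heel[0], front[1] - heel[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("front edge point and heel point must differ")
    y_axis = (dx / length, dy / length)   # unit vector, heel -> front (assumed)
    x_axis = (y_axis[1], -y_axis[0])      # perpendicular to the y-axis (assumed side)

    def to_local(point):
        px, py = point[0] - ox, point[1] - oy
        # Project the shifted point onto the two axes.
        return (px * x_axis[0] + py * x_axis[1],
                px * y_axis[0] + py * y_axis[1])

    return to_local


to_local = footprint_frame(front=(0.0, 10.0), heel=(0.0, 0.0))
print(to_local((3.0, 5.0)))   # a pixel 3 units to the side of the midpoint
```

Expressing pixel positions in this frame makes feature point positions comparable between photographs in which the footprint appears at different places and angles.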

Step 1-1: extract feature points from the image region as follows.

Compute the average squared-gradient matrix of each pixel in the image:

N(x, y) = \begin{bmatrix} \left(\frac{\partial I}{\partial x}\right)^{2} & \frac{\partial I}{\partial x} \frac{\partial I}{\partial y} \\ \frac{\partial I}{\partial x} \frac{\partial I}{\partial y} & \left(\frac{\partial I}{\partial y}\right)^{2} \end{bmatrix},

where I(x, y) is the gray value at position (x, y) in the image. If both eigenvalues of the average squared-gradient matrix at a point are large, the gray level changes strongly in the vicinity of that point, which indicates that the point is a feature point. The feature point response function is:

R = det(N) - k (trace(N))^2,

where det(N) is the determinant of the matrix N, trace(N) is the trace of the matrix N, and k is generally 0.04.

The pixels of the image are sorted in descending order of R to form a sequence, a required number F of feature points is chosen, and the first F pixels of the sequence are taken as the feature points; the position information of the feature points forms the feature point vector.
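The response function above has the form of the Harris corner response. The following sketch shows one way to compute it on a small gray-level image stored as a list of rows. It makes several assumptions that the text does not state: gradients are taken by central differences, the "average" is a plain mean over a 3×3 window, border pixels are skipped, and the function name `feature_points` is hypothetical.

```python
def feature_points(img, num_points, k=0.04):
    """Return the (x, y) positions of the num_points pixels with largest R."""
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]   # dI/dx, zero on the border
    iy = [[0.0] * w for _ in range(h)]   # dI/dy, zero on the border
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0

    scored = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            a = b = c = 0.0              # window sums of Ix^2, Ix*Iy, Iy^2
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    a += gx * gx
                    b += gx * gy
                    c += gy * gy
            a, b, c = a / 9.0, b / 9.0, c / 9.0
            # R = det(N) - k * trace(N)^2 for N = [[a, b], [b, c]]
            r = (a * c - b * b) - k * (a + c) ** 2
            scored.append((r, (x, y)))

    # Descending sort by R, then keep the first F positions.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pos for _, pos in scored[:num_points]]


# A 7x7 test image: a bright block whose corner lies at (3, 3).
image = [[1.0 if x >= 3 and y >= 3 else 0.0 for x in range(7)] for y in range(7)]
print(feature_points(image, 1))
```

On straight edges one eigenvalue of N is close to zero, so det(N) is small and R drops; only points where the gray level changes in both directions score highly.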

Step 1-2: extract features from the text data in the database, that is, descriptive data such as the modus operandi, the crime tool, the number of offenders, the physical characteristics of the offenders, the traces left at the scene, and the footprint length. The text is segmented into words and tagged with parts of speech, and the function words are removed. The remaining words are denoted w_1, w_2, ..., w_n. The frequency of each word w_i in the text is computed and denoted p_i; taking the words as dimensions gives a vector (p_1, p_2, ..., p_n).
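The word frequency vector of step 1-2 can be illustrated with a short sketch. Word segmentation and part-of-speech tagging are assumed to have been done already by some external tool, so the input is a list of (word, tag) pairs. The tag names in `FUNCTION_TAGS` and the function name are hypothetical, and "frequency" is taken here to mean relative frequency among the remaining words, which is one possible reading of the text.

```python
from collections import Counter

# Hypothetical tag names marking function words; a real tagger defines its own.
FUNCTION_TAGS = {"aux", "prep", "conj", "particle"}


def word_frequency_vector(tagged_words, function_tags=FUNCTION_TAGS):
    """Return (words, frequencies) after dropping function words."""
    content = [word for word, tag in tagged_words if tag not in function_tags]
    counts = Counter(content)
    words = sorted(counts)               # fixed order: one dimension per word
    total = len(content)
    return words, [counts[w] / total for w in words]


tagged = [("knife", "noun"), ("in", "prep"), ("window", "noun"), ("knife", "noun")]
words, freqs = word_frequency_vector(tagged)
print(words, freqs)
```

To compare cases, all texts would need to share one vocabulary so that the same dimension always refers to the same word; this sketch handles a single text only.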

Step 2: the image and text features extracted for each case in the database are expressed as attribute-value pairs. For example, the position of a feature point in the image is treated as an attribute and the pixel value of the feature point as the value of that attribute; a word in the text data is treated as an attribute and the frequency of the word in the text as the value of that attribute. The discrete and continuous data already present in a case can also be organized as attribute-value pairs: the text used for a specific description may be the crime tool, whose property values may include knife and gun; it may be the number of offenders, whose property value is discrete data; it may be the length of a footprint at the scene, whose property value is continuous data.

If the database contains m cases, the cases in the database are organized into an information table of the following form:

Case	A_1	A_2	…	A_n
C_1	v_{11}	v_{12}	…	v_{1n}
C_2	v_{21}	v_{22}	…	v_{2n}
…	…	…	…	…
C_m	v_{m1}	v_{m2}	…	v_{mn}

Here C_1, C_2, ..., C_m denote the cases, A_1, A_2, ..., A_n denote the n attributes, and v_{i1}, v_{i2}, ..., v_{in} denote the values of case C_i on the attributes A_1, A_2, ..., A_n respectively, so each row of the table is the data vector of one case. Each case C_i is thus represented as an n-dimensional vector (v_{i1}, v_{i2}, ..., v_{in}) that contains both continuous numeric data and discrete symbolic data. Let v and s be property values of the same attribute; the operation "·" is defined as illustrated by the following examples:

If the property value v of one footprint length is 19.85 cm and the property value s of another footprint length is 19.80 cm, both are continuous data, and the operation on these two values is defined as v·s = 19.85 × 19.80;

If the property value v of one crime tool is a knife and the property value s of another crime tool is a gun, then v ≠ s and v·s = 0;

If the property value v of one number of offenders is 3 and the property value s of another number of offenders is 3, then v = s and v·s = 1.
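The three examples above can be reproduced with a small function. This is only a sketch: the function name `mixed_product` and the explicit `continuous` flag are assumptions introduced here. The flag is needed because a value such as the number of offenders is numeric yet treated as discrete data in the text.

```python
def mixed_product(v, s, continuous):
    """The "·" operation on two values of the same attribute.

    Continuous attributes use the ordinary product; discrete attributes
    yield 1 when the two values are equal and 0 otherwise.
    """
    if continuous:
        return float(v) * float(s)
    return 1.0 if v == s else 0.0


print(mixed_product(19.85, 19.80, continuous=True))      # footprint lengths
print(mixed_product("knife", "gun", continuous=False))   # different tools
print(mixed_product(3, 3, continuous=False))             # same number of offenders
```

Because both kinds of data reduce to a single number under this operation, they can appear side by side in the same similarity formula without discretizing the continuous values first.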

Step 3: assign a weight to each dimension of the case vector; compute the similarity between the cases in the database to obtain the similarity matrix; specify a threshold and compute the neighborhood of each case to obtain neighborhood rough set system 1 of the database.

Step 3-1: drawing on experience and the chosen mode of analysis, the analyst assigns a weight to each kind of case data, such as the footprint photograph or the modus operandi. Since every component of a case's n-dimensional vector is derived from one of these kinds of data, these weights can be passed on to the components: each component receives the weight of the data from which it is derived. For example, all components derived from the footprint photograph receive the weight of the footprint photograph. In this way every component of the n-dimensional vector has a weight, and the weight vector is written as

P = (p_1, p_2, ..., p_n),

Step 3-2: the weight vector expresses how much importance is attached to each kind of data, and the analyst sets the emphasis of the analysis by adjusting it. For example, if only the footprint photograph is to be used for linking cases, the weight of the footprint photograph can be set to 1 and the weights of the other data to 0. Normalizing the weight vector P gives the normalized weight vector W:

W = \left( \frac{p_{1}}{\sum_{i=1}^{n} p_{i}}, \frac{p_{2}}{\sum_{i=1}^{n} p_{i}}, \ldots, \frac{p_{n}}{\sum_{i=1}^{n} p_{i}} \right) = (w_{1}, w_{2}, \ldots, w_{n}),
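Steps 3-1 and 3-2 can be sketched together: source-level weights chosen by the analyst are passed on to the components, then normalized. The function names and the `sources` list (which records the kind of data each component is derived from) are assumptions made for illustration.

```python
def component_weights(sources, source_weights):
    """Give every component the weight of the data it is derived from."""
    return [source_weights[src] for src in sources]


def normalize_weights(p):
    """Divide every weight by the sum of all weights (step 3-2)."""
    total = sum(p)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return [x / total for x in p]


# Hypothetical case vector: two components from the footprint photograph,
# one from the modus operandi text.
sources = ["footprint", "footprint", "modus_operandi"]
p = component_weights(sources, {"footprint": 2.0, "modus_operandi": 1.0})
print(normalize_weights(p))   # weights now sum to 1
```

Setting one source weight to 1 and all others to 0 reproduces the example in the text in which only the footprint photograph is used.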

Step 3-3: let C_1 and C_2 be two cases with corresponding vectors (v_1, v_2, ..., v_n) and (s_1, s_2, ..., s_n). The similarity between C_1 and C_2 is computed by the following formula:

S(C_{1}, C_{2}) = \frac{\sum_{i=1}^{n} w_{i} (v_{i} \cdot s_{i})}{\sqrt{\sum_{i=1}^{n} v_{i} \cdot v_{i}} \sqrt{\sum_{i=1}^{n} s_{i} \cdot s_{i}}} .
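The similarity formula of step 3-3 can be sketched directly. The names `op` and `similarity`, the per-component `continuous` flags, and the choice to return 0 when a denominator is zero are all assumptions; the text does not say how a zero denominator is handled.

```python
import math


def op(v, s, continuous):
    """The "·" operation: product if continuous, equality test if discrete."""
    if continuous:
        return float(v) * float(s)
    return 1.0 if v == s else 0.0


def similarity(c1, c2, weights, continuous):
    """Weighted similarity S(C1, C2) between two case vectors."""
    num = sum(w * op(v, s, f) for w, v, s, f in zip(weights, c1, c2, continuous))
    norm1 = math.sqrt(sum(op(v, v, f) for v, f in zip(c1, continuous)))
    norm2 = math.sqrt(sum(op(s, s, f) for s, f in zip(c2, continuous)))
    if norm1 == 0 or norm2 == 0:
        return 0.0                      # assumed convention for empty vectors
    return num / (norm1 * norm2)


flags = (True, False)                   # footprint length, crime tool
w = (0.5, 0.5)                          # normalized weights
print(similarity((2.0, "knife"), (2.0, "knife"), w, flags))
print(similarity((2.0, "knife"), (2.0, "gun"), w, flags))
```

Note that the weights appear only in the numerator, so the similarity of a case with itself is generally below 1 and depends on the weights.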

Step 3-4: suppose the database contains m cases C_1, C_2, ..., C_m; after step 2, each case is represented as an n-dimensional vector. Applying step 3-3 to every pair of cases therefore gives the similarity matrix:

MS = \begin{pmatrix} S(C_{1}, C_{1}) & S(C_{1}, C_{2}) & \cdots & S(C_{1}, C_{m}) \\ S(C_{2}, C_{1}) & S(C_{2}, C_{2}) & \cdots & S(C_{2}, C_{m}) \\ \vdots & \vdots & \ddots & \vdots \\ S(C_{m}, C_{1}) & S(C_{m}, C_{2}) & \cdots & S(C_{m}, C_{m}) \end{pmatrix}

Step 3-5: specify a threshold K. From K and the similarity matrix, the neighborhood N(C_i) of any case C_i is computed by the following formula:

N(C_i) = {C_j | S(C_i, C_j) ≤ K, j ∈ {1, 2, ..., m}}

Step 3-6: compute the neighborhood of every case in the database, which gives neighborhood rough set system 1:

NS = {N(C_1), N(C_2), ..., N(C_m)}.
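Steps 3-4 to 3-6 can be sketched as two small functions. The similarity function is passed in as a parameter so the sketch stays independent of how S is computed. The comparison direction is taken literally from the formula above (S ≤ K) and exposed as a parameter; a reader who interprets the neighborhood as "cases at least as similar as K" could pass `operator.ge` instead. Function and parameter names are assumptions.

```python
import operator


def similarity_matrix(cases, sim):
    """MS[i][j] = sim(cases[i], cases[j]) for every pair of cases."""
    return [[sim(a, b) for b in cases] for a in cases]


def neighborhoods(ms, k, compare=operator.le):
    """Neighborhood system NS: one set of case indices per case.

    Case j belongs to N(C_i) when compare(MS[i][j], k) holds; the default
    mirrors the "S <= K" condition as written in the text.
    """
    return [{j for j, s in enumerate(row) if compare(s, k)} for row in ms]


# Toy data: one-dimensional "cases" and a made-up similarity function.
cases = [1.0, 1.1, 5.0]
ms = similarity_matrix(cases, lambda a, b: 1.0 / (1.0 + abs(a - b)))
print(neighborhoods(ms, k=0.5))
print(neighborhoods(ms, k=0.5, compare=operator.ge))
```

Cases are referred to by their index in the list, so each neighborhood is a set of integers that can be intersected cheaply in the reduction step.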

Step 4: reduce the dimension of each case's multidimensional vector using a top-down reduction method. The specific steps are as follows:

Step 4-1: suppose each case in the database has n components and let F = {1, 2, ..., n}. Let C_1 and C_2 be two cases with corresponding vectors (v_1, v_2, ..., v_n) and (s_1, s_2, ..., s_n). Remove any one component i from F, letting F = F - {i}, and recompute the similarity of the two cases after the removal by the following formula:

S(C_{1}, C_{2}) = \frac{\sum_{i \in F} w_{i} (v_{i} \cdot s_{i})}{\sqrt{\sum_{i \in F} v_{i} \cdot v_{i}} \sqrt{\sum_{i \in F} s_{i} \cdot s_{i}}},

Step 4-2: from the similarities of step 4-1, apply steps 3-5 and 3-6 again to obtain neighborhood rough set system 2:

NS^* = {N^*(C_1), N^*(C_2), ..., N^*(C_m)},

Step 4-3: compare the neighborhood rough set systems NS and NS^*, defining:

L = \frac{1}{2n} \times \sum_{i=1}^{n} \left( \frac{|N(C_{i}) \cap N^{*}(C_{i})|}{|N(C_{i})|} + \frac{|N(C_{i}) \cap N^{*}(C_{i})|}{|N^{*}(C_{i})|} \right),

L describes the degree of difference between the neighborhood rough set systems NS and NS^*: the larger its value, the smaller the difference. When L is less than a specified threshold, let F = F ∪ {i}, that is, component i cannot be removed; otherwise component i is removed.

Step 4-4: repeat steps 4-1 to 4-3 until no further component can be removed from F. This gives the reduced multidimensional vector F.
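The reduction loop of steps 4-1 to 4-4 can be sketched as follows. Several points are assumptions rather than statements of the text: the neighborhood system for a given component set is supplied by a caller-provided function `build_ns`, the measure L is normalized by the number of neighborhoods being compared, two empty neighborhoods are treated as agreeing fully, at least one component is always kept, and all names are hypothetical.

```python
def agreement(ns, ns_star):
    """The measure L: larger values mean NS and NS* differ less."""
    total = 0.0
    for a, b in zip(ns, ns_star):
        if not a and not b:
            total += 2.0                 # assumed: two empty sets agree fully
            continue
        common = len(a & b)
        total += (common / len(a) if a else 0.0)
        total += (common / len(b) if b else 0.0)
    return total / (2 * len(ns))


def reduce_components(n, build_ns, threshold):
    """Drop components whose removal leaves the neighborhood system intact."""
    kept = list(range(n))
    baseline = build_ns(kept)            # neighborhood rough set system 1
    changed = True
    while changed and len(kept) > 1:
        changed = False
        for i in list(kept):
            trial = [j for j in kept if j != i]
            # L >= threshold: removing i changes little, so i is reduced away.
            if trial and agreement(baseline, build_ns(trial)) >= threshold:
                kept = trial
                changed = True
    return kept


def toy_ns(components):
    """Toy neighborhood system in which only component 0 carries information."""
    if 0 in components:
        return [{0, 1}, {0, 1}, {2}]
    return [{0, 1, 2}] * 3


print(reduce_components(3, toy_ns, threshold=0.9))
```

In the toy example, removing component 0 merges all neighborhoods and L falls below the threshold, so component 0 is kept, while components 1 and 2 are removed without changing the neighborhood system.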

Step 5: on the reduced vectors, compute the similarity between the case under analysis and each case in the database, and find the cases in the database that are linked to it.

Step 5-1: for a specified case C_p, compute on the reduced vector F the similarity between C_p and each case in the database to obtain the similarity vector:

(S(C_p, C_1), S(C_p, C_2), ..., S(C_p, C_m)),

Step 5-2: compute the neighborhood of C_p as follows:

N(C_p) = {C_j | S(C_p, C_j) ≤ K, j ∈ {1, 2, ..., m}},

where the cases in the neighborhood N(C_p) are the cases linked to case C_p.
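Step 5 can be sketched as a projection onto the kept components followed by a threshold test. The similarity function `sim` is supplied by the caller and would also need to restrict the weights to the kept components. As in the formula above, the comparison is written as S ≤ K and exposed as a parameter. All names here are assumptions.

```python
import operator


def project(vector, kept):
    """Keep only the components whose indices survived the reduction."""
    return [vector[i] for i in kept]


def linked_cases(query, database, kept, sim, k, compare=operator.le):
    """Return (indices in N(C_p), similarity vector) for the query case."""
    q = project(query, kept)
    scores = [sim(q, project(case, kept)) for case in database]
    return [j for j, s in enumerate(scores) if compare(s, k)], scores


# Toy data with a plain dot product standing in for the similarity S.
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


database = [[1.0, 0.0], [3.0, 5.0]]
print(linked_cases([1.0, 9.0], database, kept=[0], sim=dot, k=2.0))
```

Returning the full similarity vector alongside the neighborhood lets the analyst see how close each case is to the threshold before adjusting the weights or K in step 6.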

Step 6: if the analyst is not satisfied with the result of step 5, the weights of the case vector components are readjusted in step 3, the threshold K used to compute the case neighborhoods is adjusted, and steps 3 to 5 are repeated to obtain a new linked-case result, until the analyst is satisfied.

Claims

1. An intelligent analysis method for linking related criminal cases, comprising the following steps:

Step 3: assign a weight to each component of each case's multidimensional vector; compute the similarity between the cases in the database to obtain a similarity matrix; then specify a threshold and compute the neighborhood of each case, obtaining neighborhood rough set system 1 of the database;

Step 4: reduce the dimension of each case's multidimensional vector;

This is done by removing one component from the multidimensional vector of every case, assigning weights to the remaining components of each case's multidimensional vector, computing the similarity between cases without that component, and, using the same threshold as in step 3, computing the neighborhood of each case without that component, which gives neighborhood rough set system 2 of the database; neighborhood rough set system 1 from step 3 is compared with neighborhood rough set system 2; if the two differ significantly, the component cannot be removed and the dimension cannot be reduced; if the two differ little, the removed component is reduced away and the dimension of every case's vector is reduced accordingly; the same procedure is repeated for each of the other components, reducible components are removed and irreducible components are kept, finally yielding the reduced multidimensional vector of each case;

Step 5: using the reduced vectors, compute the similarity between the case under analysis and each case in the database, and find the cases in the database that are linked to it;

Step 6: if step 5 does not give a satisfactory result, readjust in step 3 the weights of the components of each case's multidimensional vector, adjust the threshold used to compute the case neighborhoods, and repeat steps 3 to 5 until a linked-case result is obtained; wherein, in step 2, where each feature extracted from the images and text of a case is represented as one component of the case's vector, a "feature" of the image or text means an attribute-value pair as listed in the following table; the value of each attribute of a case is one component of its vector, and all attribute-value pairs together form the multidimensional vector of the case:

characterized in that:

where ∂I/∂x and ∂I/∂y denote the derivatives of I(x, y) with respect to x and with respect to y respectively, and I(x, y) is the gray value at (x, y) in the image; when both eigenvalues of the average squared-gradient matrix at a point (x, y) are large, the point (x, y) is a feature point, and the feature point response function is:

A = det(N) - k (trace(N))^2,

where det(N) is the determinant of the matrix N, trace(N) is the trace of the matrix N, and k is 0.04; the pixels of the image are sorted in descending order of A to form a sequence, a required number F of feature points is chosen, and the first F pixels of the sequence are taken as the feature points; the position information of the feature points forms the feature point vector;

Step 1-2: the text features of a case are extracted as follows:

The text is segmented into words and tagged with parts of speech, and the function words are removed; the remaining words are denoted v_1, v_2, ..., v_n; the frequency of each word v_i in the text is computed and denoted p_i; taking the words as dimensions gives a word frequency vector (p_1, p_2, ..., p_n);

3) The weights assigned to the components of the multidimensional vector in step 3 are normalized as follows:

R = (r_1, r_2, ..., r_n),

where W = (w_1, w_2, ..., w_n) is the normalized weight vector;

4) The similarity between cases in step 3 is computed from the weights assigned to the components of the multidimensional vector by the following method:

Step 3-3: compute the similarity between two cases in the database;

Here w_i is the i-th component of the normalized weight vector, v_i·v_i and s_i·s_i are the "·" operation applied to the i-th component of the vector of C_1 and of C_2 with itself, and (v_i·s_i) is the "·" operation applied to the corresponding i-th components of the two cases C_1 and C_2;

Step 3-4: compute the similarity between all cases in the database to obtain the similarity matrix: suppose the database contains m cases C_1, C_2, ..., C_m, each represented as an n-dimensional vector; applying step 3-3 to every pair of cases gives the similarity matrix of all cases:

Step 3-5: specify a threshold K; from K and the similarity matrix, the neighborhood N(C_i) of any case C_i is computed by the following formula:

N(C_i) = {C_j | S(C_i, C_j) ≤ K, j ∈ {1, 2, ..., m}},

NS = {N(C_1), N(C_2), ..., N(C_m)}

NS^* = {N^*(C_1), N^*(C_2), ..., N^*(C_m)},

(S(C_p, C_1), S(C_p, C_2), ..., S(C_p, C_m)),

Step 5-2: compute the neighborhood of C_p as follows:

N(C_p) = {C_j | S(C_p, C_j) ≤ K, j ∈ {1, 2, ..., m}},

where the cases in the neighborhood N(C_p) are the cases linked to case C_p.

2. The intelligent analysis method for linking related criminal cases according to claim 1, characterized in that, before step 1-1, the image is preprocessed by the following step:

Step 1-0: acquire data from a specific region of the image.

3. The intelligent analysis method for linking related criminal cases according to claim 2, characterized in that the image is a footprint and its preprocessing steps are as follows:

Step 1-02: connect the front edge point and the heel point of the footprint with a line segment, take the midpoint of this segment as the origin, the segment as the y-axis, and the line perpendicular to it as the x-axis, establish a region coordinate system, and compute the position of each pixel of the region in this coordinate system.