CN110442606A - Data processing method, device, and computer storage medium

Info

Publication number: CN110442606A
Application number: CN201910642824.1A
Authority: CN
Inventors: 杨莉; 阮学武
Original assignee: Zhejiang Dahua Technology Co Ltd
Current assignee: Zhejiang Dahua Technology Co Ltd
Priority date: 2019-07-16
Filing date: 2019-07-16
Publication date: 2019-11-12

Abstract

This application discloses a data processing method, a device, and a computer storage medium. The data processing method includes: obtaining pending data; comparing the pending data for similarity with the data in each of multiple storage regions to obtain multiple comparison results, where the multiple storage regions store the data of a frequently-used database in shards; based on the multiple comparison results, obtaining from the frequently-used database the target data with the highest similarity to the pending data; and processing the pending data according to the similarity between the pending data and the target data. In this way, the efficiency of data processing can be improved.

Description

Data processing method, device, and computer storage medium

Technical field

This application relates to the technical field of data processing, and in particular to a data processing method, a device, and a computer storage medium.

Background art

In big data processing, data volumes surge and databases grow. When one data item must be compared serially with every item in a database, the number of comparisons is large and efficiency is correspondingly low.

For example, in the collection and comparison of identity information, as the coverage of collection points expands rapidly and more and more people are captured, the database created in real time grows accordingly. Every day, each captured subject must be compared against all persons in the archive to confirm whether the subject is an existing person in the database or a person for whom a new archive must be created. The number of comparisons is enormous. If every passer-by is compared serially, the time consumed grows exponentially as the database grows and as the number of passers-by per second increases.

Summary of the invention

To solve the above problem, this application provides a data processing method, a device, and a computer storage medium that can improve the efficiency of data processing.

One technical solution adopted by this application is to provide a data processing method, the method including: obtaining pending data; comparing the pending data for similarity, in parallel, with the data in each of multiple storage regions to obtain multiple comparison results, where the multiple storage regions store the data of a frequently-used database in shards; based on the multiple comparison results, obtaining from the frequently-used database the target data with the highest similarity to the pending data; and processing the pending data according to the similarity between the pending data and the target data.

The step of processing the pending data according to the similarity between the pending data and the target data includes: determining whether the similarity between the pending data and the target data is less than a preset similarity threshold; and if so, caching the pending data.

After the step of caching the pending data, the method further includes: performing a first deduplication on the multiple pending data items in the cache; comparing the pending data items that remain after the first deduplication with the data in a global newly created database to perform a second deduplication; and storing the pending data items that remain after the second deduplication in the global newly created database.

The step of performing a first deduplication on the multiple pending data items in the cache may include: determining whether the number of pending data items in the cache reaches a preset quantity threshold; and if so, performing the first deduplication on the pending data items in the cache.

The step of performing a first deduplication on the multiple pending data items in the cache may include: determining whether the storage time of the pending data items in the cache reaches a preset time threshold; and if so, performing the first deduplication on the pending data items in the cache.

The step of storing the pending data items that remain after the second deduplication in the global newly created database includes: obtaining the similarity comparison result between each pending data item that remains after the first deduplication and the data in the global newly created database; and when the similarity between a pending data item and the data in the global newly created database is less than the preset similarity threshold, adding that pending data item to the global newly created database.

The method further includes: when the similarity between the pending data and the data in the global newly created database is less than the preset similarity threshold, saving the pending data to one of the multiple storage regions.

The method further includes: when the similarity between the pending data and the data in the global newly created database is less than the preset similarity threshold, adding the pending data to a persistent database; and when the global newly created database becomes abnormal, reading the data within a set time period from the persistent database to build a new global newly created database.

The step of comparing the pending data for similarity, in parallel, with the data in each of the multiple storage regions to obtain multiple comparison results includes: comparing the pending data for similarity, in parallel, with the data in each of the multiple storage regions, and taking the data in each storage region with the highest similarity to the pending data as that region's comparison result.

The step of obtaining pending data includes: obtaining a face image; and extracting feature information from the face image as the pending data.

Another technical solution adopted by this application is to provide a data processing device. The data processing device includes a processor and a memory; the memory stores program data, and the processor executes the program data to implement the method described above.

Another technical solution adopted by this application is to provide a computer storage medium. The computer storage medium stores program data that, when executed by a processor, implements the method described above.

The data processing method provided by this application includes: obtaining pending data; comparing the pending data for similarity, in parallel, with the data in each of multiple storage regions to obtain multiple comparison results, where the multiple storage regions store the data of a frequently-used database in shards; based on the multiple comparison results, obtaining from the frequently-used database the target data with the highest similarity to the pending data; and processing the pending data according to the similarity between the pending data and the target data. In this way, the data is stored in partitions, and when pending data is compared, it is compared with the data of the multiple storage regions in parallel and the comparison results are then aggregated. This improves the efficiency of data comparison and improves the stability of the system when a large amount of data is compared at the same time.

Brief description of the drawings

To explain the technical solutions in the embodiments of this application more clearly, the drawings needed to describe the embodiments are briefly introduced below. The drawings described below are merely some examples of this application; a person of ordinary skill in the art can obtain other drawings from them without creative effort. In the drawings:

Fig. 1 is a schematic flowchart of the data processing method provided by an embodiment of this application;

Fig. 2 is a schematic framework diagram of the data processing method provided by an embodiment of this application;

Fig. 3 is another schematic flowchart of the data processing method provided by an embodiment of this application;

Fig. 4 is another schematic framework diagram of the data processing method provided by an embodiment of this application;

Fig. 5 is a schematic structural diagram of the data processing device provided by an embodiment of this application;

Fig. 6 is a schematic structural diagram of the computer storage medium provided by an embodiment of this application.

Detailed description of the embodiments

The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in those embodiments. The specific embodiments described here are intended only to explain this application, not to limit it. Note also that, for ease of description, the drawings show only the parts relevant to this application rather than the entire structure. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in this application without creative effort fall within the scope of protection of this application.

The terms "first", "second", and so on in this application are used to distinguish different objects, not to describe a particular order. In addition, the terms "include" and "have", and any variants of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to such a process, method, product, or device.

A reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of this application. Occurrences of the phrase at various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.

Referring to Fig. 1, Fig. 1 is a schematic flowchart of the data processing method provided by an embodiment of this application. The method includes:

Step 11: Obtain pending data.

The pending data may be data used to identify a person or an object, for example face data or vehicle data.

Optionally, taking face data as an example, step 11 may specifically be: obtaining a face image; and extracting feature information from the face image as the pending data.

Specifically, a camera can be used to obtain the face image. For example, in one embodiment, a camera obtains the face image, and image processing is performed on the face image to obtain the feature information in it. In another embodiment, a color camera and an infrared camera obtain a color depth image of the face, from which the feature information is then obtained. The feature information may be pupil size, interpupillary distance, the positions of the eyes and other facial organs, color features, and so on.

Step 12: Compare the pending data for similarity, in parallel, with the data in each of multiple storage regions to obtain multiple comparison results, where the multiple storage regions store the data of the frequently-used database in shards.

The multiple storage regions can be implemented with multiple memories, by dividing one memory into multiple storage areas, or with multiple servers.

Specifically, all the data of the frequently-used database can be stored in shards across different storage regions. Sharding means splitting the frequently-used database and distributing it across different servers. For example, 100 GB of data can be split into 10 parts stored on different servers, so that each server stores only 10 GB. Taking identity information as an example, all identity information is divided into N groups stored in N storage regions; during comparison, the pending data is compared in parallel in each of the N storage regions.
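
The splitting described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name shard_records and the round-robin assignment rule are assumptions, since the text does not say how records are assigned to storage regions.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def shard_records(records: Sequence[T], num_shards: int) -> List[List[T]]:
    """Split records into num_shards groups.

    Round-robin assignment is an assumption made for this sketch;
    any rule that spreads records across regions would serve.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    shards: List[List[T]] = [[] for _ in range(num_shards)]
    for index, record in enumerate(records):
        shards[index % num_shards].append(record)
    return shards
```

Each returned list stands in for one storage region; in a deployment along the lines of the text, each would live in a separate memory area or on a separate server.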

Optionally, in one embodiment, step 12 may specifically be: comparing the pending data for similarity, in parallel, with the data in each of the multiple storage regions, and taking the data in each storage region with the highest similarity to the pending data as the comparison result.

The similarity can be computed with a clustering algorithm or a distance algorithm. For example, with a Euclidean distance algorithm, the similarity of two data items is obtained by computing the Euclidean distance between them.
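
As a hedged illustration of a distance-based similarity, the sketch below computes the Euclidean distance between two feature vectors and maps it to a score in (0, 1]. The mapping 1 / (1 + d) is an assumption chosen for this example; the text only says that similarity is derived from the Euclidean distance, not how.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Map distance to a similarity score (assumed mapping, not from the source).

    Identical vectors score 1.0; the score falls toward 0 as distance grows.
    """
    return 1.0 / (1.0 + euclidean_distance(a, b))
```

With this mapping a larger score means a closer match, which fits the threshold tests used later in the method.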

The comparison result here may contain no data, one data item, or multiple data items. During comparison, the pending data is compared in parallel with the data of the multiple storage regions; within one storage region, it is compared with each data item in that region in a particular order.

Specifically, the pending data is compared for similarity with each of the M data items in one storage region in turn, yielding M similarity values. The values are then sorted by magnitude, and the data item corresponding to the largest similarity value is selected as the comparison result.

In other embodiments, the several data items with the largest similarity, for example the three most similar items, can instead be selected from the sorted M data items as the comparison result.
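
The per-region selection can be sketched as below. The names top_matches and similarity, the dictionary layout of a storage region (record id mapped to feature vector), and the 1 / (1 + distance) score are all assumptions for illustration. The sketch uses heapq.nlargest instead of a full sort, which gives the same top-k result.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # Assumed mapping from Euclidean distance to a score in (0, 1].
    return 1.0 / (1.0 + math.dist(a, b))


def top_matches(
    query: Sequence[float],
    region: Dict[str, Sequence[float]],
    k: int = 1,
) -> List[Tuple[float, str]]:
    """Return up to k (similarity, record_id) pairs from one region, best first."""
    scored = ((similarity(query, vector), record_id)
              for record_id, vector in region.items())
    return heapq.nlargest(k, scored)
```

Calling top_matches with k=1 corresponds to the single-best embodiment and k=3 to the "three most similar items" variant; an empty region yields an empty list, matching the case where the comparison result contains no data.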

Step 13: Based on the multiple comparison results, obtain from the frequently-used database the target data with the highest similarity to the pending data.

After the parallel comparison in each storage region, the multiple comparison results are merged. For example, all the data in the comparison results is sorted by similarity, and the item with the highest similarity is selected as the target data.

For example, if comparing the pending data with the data of each of the N storage regions yields one most-similar item per region, the N items obtained from the N storage regions are sorted by similarity and the most similar one is taken as the target data. If the comparison yields several most-similar items per region, for example 3, the 3N items obtained from the N storage regions are sorted by similarity and the most similar one is taken as the target data.
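
A minimal sketch of the merge step, assuming each region's comparison result is a list of (similarity, record_id) pairs; that representation and the name merge_results are assumptions for this example. Taking the maximum directly gives the same target as sorting all N or 3N candidates and picking the first.

```python
from typing import List, Optional, Tuple

Candidate = Tuple[float, str]  # (similarity, record_id), an assumed layout


def merge_results(per_region: List[List[Candidate]]) -> Optional[Candidate]:
    """Pick the most similar candidate across all regions' comparison results.

    Returns None when no region produced any candidate.
    """
    candidates = [c for result in per_region for c in result]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c[0])
```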

Step 14: Process the pending data according to the similarity between the pending data and the target data.

The target data here is the data in the frequently-used database with the highest similarity to the pending data. In one case, the similarity between the target data and the pending data is high and meets a preset requirement; the pending data is then considered to be already stored in the frequently-used database, and no further processing of it is needed. In the other case, the similarity is low and does not meet the preset requirement; the pending data is then considered not to be stored in the frequently-used database, and the subsequent storage operation can be carried out.

The following uses face data as an example. As shown in Fig. 2, Fig. 2 is a schematic framework diagram of the data processing method provided by an embodiment of this application. In a specific embodiment, an executor framework, multiple one-person-one-file operators, and multiple shard REPOs (repositories) can be set up. The procedure can be as follows:

a. executor1 obtains pending data 1 and sends it to one-person-one-file operator 1, one-person-one-file operator 2, ..., and one-person-one-file operator N.

b. Each one-person-one-file operator compares pending data 1. For example, one-person-one-file operator 1 compares pending data 1 one by one with the data stored in shard REPO-1 to obtain comparison result 1. Similarly, one-person-one-file operator N compares pending data 1 one by one with the data stored in shard REPO-N to obtain comparison result N. Each one-person-one-file operator returns its comparison result to executor1.

c. executor1 merges the multiple comparison results to obtain the target data with the highest similarity to pending data 1.

d. executor1 determines the subsequent operation on pending data 1 according to the similarity between pending data 1 and the target data.
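
The executor and operator flow above can be sketched in a single process as follows. This is an illustration under stated assumptions, not the patent's implementation: a thread pool stands in for the N operators, each shard is an in-memory dictionary, and the names find_target, compare_with_shard, and similarity are hypothetical. In CPython, threads do not speed up CPU-bound comparison; a deployment along the lines of the text would more plausibly run each operator in its own process or on its own server.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Optional, Sequence, Tuple

Shard = Dict[str, Sequence[float]]  # record_id -> feature vector (assumed layout)


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # Assumed mapping from Euclidean distance to a score in (0, 1].
    return 1.0 / (1.0 + math.dist(a, b))


def compare_with_shard(query: Sequence[float], shard: Shard) -> Optional[Tuple[float, str]]:
    """Operator role: compare the query one by one with the data in one shard."""
    best: Optional[Tuple[float, str]] = None
    for record_id, vector in shard.items():
        score = similarity(query, vector)
        if best is None or score > best[0]:
            best = (score, record_id)
    return best


def find_target(query: Sequence[float], shards: List[Shard]) -> Optional[Tuple[float, str]]:
    """Executor role: fan the query out to every shard, then merge the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        results = list(pool.map(lambda shard: compare_with_shard(query, shard), shards))
    found = [r for r in results if r is not None]
    return max(found, key=lambda r: r[0]) if found else None
```

For example, find_target([1.0, 1.0], [{"a": [0.0, 0.0]}, {"b": [1.0, 1.0]}]) returns the pair for record "b", which the executor would then compare against the similarity threshold.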

Unlike the prior art, the data processing method of this embodiment includes: obtaining pending data; comparing the pending data for similarity with the data in each of multiple storage regions to obtain multiple comparison results, where the multiple storage regions store the data of the frequently-used database in shards; based on the multiple comparison results, obtaining from the frequently-used database the target data with the highest similarity to the pending data; and processing the pending data according to the similarity between the pending data and the target data. In this way, the data is stored in partitions, and when pending data is compared, it is compared with the data of the multiple storage regions in parallel and the comparison results are then aggregated. This improves the efficiency of data comparison and improves the stability of the system when a large amount of data is compared at the same time.

Referring to Fig. 3, Fig. 3 is another schematic flowchart of the data processing method provided by an embodiment of this application. The method includes:

Step 31: Obtain pending data.

Step 32: Compare the pending data for similarity with the data in each of multiple storage regions to obtain multiple comparison results, where the multiple storage regions store the data of the frequently-used database in shards.

Step 33: Based on the multiple comparison results, obtain from the frequently-used database the target data with the highest similarity to the pending data.

Step 34: Determine whether the similarity between the pending data and the target data is less than the preset similarity threshold.

When the similarity between the pending data and the target data is greater than or equal to the preset similarity threshold, the pending data is considered to be already stored in the frequently-used database and does not need to be stored again; it can be deleted, and the next pending data is obtained for processing. When the similarity is less than the preset similarity threshold, the pending data is considered to be new data that needs to be stored, and step 35 is executed.

In the face recognition embodiment, when the similarity between the collected face feature data and the target data in the frequently-used database is greater than or equal to the preset similarity threshold, the identity information corresponding to the face feature is considered to be already stored in the frequently-used database and does not need to be stored again. When the similarity is less than the preset similarity threshold, the user identity information corresponding to the face feature is considered to be new identity information that needs to be stored, and step 35 is executed.

Step 35: Cache the pending data.
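
Steps 34 and 35 amount to a single threshold test, sketched below. The function name route_pending, the string return values, and the treatment of an empty database (best_similarity is None, so the data is cached as new) are assumptions for illustration.

```python
from typing import List, Optional, Sequence


def route_pending(
    pending: Sequence[float],
    best_similarity: Optional[float],
    threshold: float,
    cache: List[Sequence[float]],
) -> str:
    """Discard data already in the database; cache data that looks new.

    best_similarity is the similarity to the target data, or None when
    the frequently-used database returned no candidate (assumed case).
    """
    if best_similarity is not None and best_similarity >= threshold:
        return "discarded"  # considered already stored
    cache.append(pending)   # step 35: cache for later deduplication
    return "cached"
```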

Step 36: Perform a first deduplication on the multiple pending data items in the cache.

It should be understood that, in a face recognition scenario, if the same user passes a data collection point several times within a short period, the face data is collected several times, which easily causes the face data to be stored repeatedly. In this embodiment, new data is therefore not stored in the frequently-used database immediately but is first cached and deduplicated.

Optionally, deduplication can be triggered according to how long the pending data has been cached or according to the number of cached data items.

In one embodiment, it is determined whether the storage time of the multiple pending data items in the cache reaches a preset time threshold; if so, a first deduplication is performed on the pending data items in the cache.

For example, timing starts when the first pending data item is stored in the cache. Once a certain time has elapsed, a first deduplication is performed on all pending data items in the cache area, regardless of how many are stored.

In another embodiment, it is determined whether the number of pending data items in the cache reaches a preset quantity threshold; if so, a first deduplication is performed on the pending data items in the cache.

For example, counting starts when the first pending data item is stored in the cache. Once the number of cached pending data items reaches a certain quantity, a first deduplication is performed on all pending data items in the cache area.
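
The two triggers can be sketched with a small cache class. The text presents the time trigger and the count trigger as separate embodiments; combining both in one class, the class name PendingCache, and the injectable clock are choices made for this sketch only.

```python
import time
from typing import Any, Callable, List, Optional


class PendingCache:
    """Cache that reports when the first deduplication should run."""

    def __init__(self, max_items: int, max_age_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_items = max_items
        self.max_age_seconds = max_age_seconds
        self._clock = clock
        self._items: List[Any] = []
        self._first_added: Optional[float] = None

    def add(self, item: Any) -> None:
        if not self._items:
            # Timing starts when the first item enters the cache.
            self._first_added = self._clock()
        self._items.append(item)

    def should_deduplicate(self) -> bool:
        if not self._items or self._first_added is None:
            return False
        if len(self._items) >= self.max_items:  # quantity threshold
            return True
        return self._clock() - self._first_added >= self.max_age_seconds  # time threshold

    def drain(self) -> List[Any]:
        """Hand all cached items to the deduplication step and reset the cache."""
        items, self._items = self._items, []
        self._first_added = None
        return items
```

A caller would check should_deduplicate() after each add (or on a timer) and pass the result of drain() to the deduplication step.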

During the first deduplication, each pending data item can be compared for similarity with every other data item in the cache area in turn. When the similarity is greater than a set threshold, the pending data item is considered a duplicate of that item; the same data may have been collected before but not yet synchronized to the frequently-used database. One of the two can then be deleted.

For example, the data in the cache area includes A1, B1, C1, D1, A2, E1, F1, and so on. If the comparison shows that the similarity between A1 and A2 is greater than the set threshold, A1 and A2 are considered duplicate data and one of them is deleted.
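
The first deduplication can be sketched as a pairwise pass that keeps the first item of each group of near-duplicates. Keeping the earlier item (A1 rather than A2) is an assumption; the text only says one of the two is deleted. The similarity function is the same assumed 1 / (1 + distance) mapping used for illustration elsewhere.

```python
import math
from typing import List, Sequence


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # Assumed mapping from Euclidean distance to a score in (0, 1].
    return 1.0 / (1.0 + math.dist(a, b))


def deduplicate(items: List[Sequence[float]], threshold: float) -> List[Sequence[float]]:
    """Drop any item whose similarity to an already kept item exceeds threshold."""
    kept: List[Sequence[float]] = []
    for item in items:
        if all(similarity(item, other) <= threshold for other in kept):
            kept.append(item)
    return kept
```

The pass is quadratic in the cache size, which is one reason to bound the cache by count or age before running it.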

Step 37: Compare the pending data items that remain after the first deduplication with the data in the global newly created database to perform a second deduplication.

The global newly created database stores the new pending data obtained within a certain period of time.

Optionally, each pending data item that remains after the first deduplication is compared for similarity with the data in the global newly created database, so that a second deduplication is performed on the items that survived the first.

Step 38: Store the pending data items that remain after the second deduplication in the global newly created database.

For example, pending data 1 is compared for similarity with each data item in the global newly created database in turn. If the similarity between any one item and the pending data is greater than the set threshold, the pending data is considered to have already been added to the global newly created database. If the similarity between every item and the pending data is less than the set threshold, the pending data can be added to the global newly created archive.

Further, when the similarity between the pending data and the data in the global newly created database is less than the preset similarity threshold, the pending data is saved to one of the multiple storage regions. That is, when the similarity between the pending data and the data in the global newly created database is less than the preset similarity threshold, the pending data is added to the global database.
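
The second deduplication and the follow-up storage can be sketched together. The name store_new_records, the list-based stand-ins for the global newly created database and the storage regions, and the choice of the least-loaded region are assumptions; the text says only that the data is saved to one of the storage regions. The shards list must be non-empty.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def similarity(a: Vector, b: Vector) -> float:
    # Assumed mapping from Euclidean distance to a score in (0, 1].
    return 1.0 / (1.0 + math.dist(a, b))


def store_new_records(
    candidates: List[Vector],
    global_new: List[Vector],
    shards: List[List[Vector]],
    threshold: float,
) -> List[Vector]:
    """Add each candidate that matches nothing in global_new; return those added."""
    added: List[Vector] = []
    for candidate in candidates:
        is_new = all(similarity(candidate, existing) < threshold
                     for existing in global_new)
        if is_new:
            global_new.append(candidate)
            # Assumption: place the record in the least-loaded storage region.
            min(shards, key=len).append(candidate)
            added.append(candidate)
    return added
```

Because accepted candidates are appended to global_new inside the loop, later candidates in the same batch are also checked against them.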

The following uses face data as an example. As shown in Fig. 4, Fig. 4 is another schematic framework diagram of the data processing method provided by an embodiment of this application. In a specific embodiment, an executor framework, multiple one-person-one-file operators, and multiple shard REPOs (repositories) can be set up. The procedure can be as follows:

a. The executor obtains pending data 1 and sends it to one-person-one-file operator 1, one-person-one-file operator 2, ..., and one-person-one-file operator N.

b. Each one-person-one-file operator compares pending data 1 one by one with the data stored in its shard REPO and returns its comparison result to the executor.

c. The executor merges the multiple comparison results to obtain the target data with the highest similarity to pending data 1.

d. The executor determines whether the similarity between pending data 1 and the target data is less than the preset similarity threshold; if it is, the pending data is cached.

e. A first deduplication is performed on the multiple cached pending data items.

f. The pending data that remains after the first deduplication is compared with the data in the global newly created archive to perform a second deduplication.

The second deduplication compares the pending data one by one with the data in the global newly created archive. A similarity greater than the set threshold indicates a match; a similarity less than the set threshold indicates no match.

g. If the pending data does not match anything in the global newly created archive, it is added to the global newly created archive.

h. If the pending data does not match anything in the global newly created archive, it is sent to the executor, and the executor adds it to a one-person-one-file operator (that is, to the frequently-used database).

Unlike the prior art, in this embodiment the pending data undergoes two rounds of deduplication after comparison, which prevents the same data from being archived repeatedly and improves data comparison performance.

In addition, when the similarity between the pending data and the data in the global newly created database is less than the preset similarity threshold, the pending data is added to a persistent database. When the global newly created database becomes abnormal, the data within a set time period is read from the persistent database to build a new global newly created database.

Specifically, operator A, which loads the global newly created archive, may become abnormal for some reason (for example, the device loses power or goes offline). The system then selects another operator B that is capable of loading the global newly created archive to take over operator A's work. For the handover to be seamless, operator B needs to know the global newly created archive information that was in operator A's memory before the failure. That information therefore needs to be persisted so that an operator can reload it after a failure. In this embodiment, a persistent database is set up to persist the global newly created archive information: after the comparison result of the second deduplication is received and the pending data is judged to be new data, the pending data is sent to the persistent database for persistent storage. After operator A fails, operator B loads from the persistent database the global newly created archive data that is still within its life cycle and seamlessly takes over operator A's work.
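
The recovery path can be sketched as follows. This is an illustration only: the PersistentStore class is an in-memory stand-in for a real persistent database, and the names read_since and rebuild_global_new, the explicit timestamps, and the window_seconds parameter (standing in for the "set time period" or life cycle) are all hypothetical.

```python
from typing import Any, List, Tuple


class PersistentStore:
    """In-memory stand-in for the persistent database (assumption for this sketch)."""

    def __init__(self) -> None:
        self._rows: List[Tuple[float, Any]] = []

    def append(self, record: Any, timestamp: float) -> None:
        """Persist a record judged to be new after the second deduplication."""
        self._rows.append((timestamp, record))

    def read_since(self, cutoff: float) -> List[Any]:
        """Return records stored at or after cutoff, in insertion order."""
        return [record for timestamp, record in self._rows if timestamp >= cutoff]


def rebuild_global_new(store: PersistentStore, now: float, window_seconds: float) -> List[Any]:
    """Operator B's recovery step: reload records still within the time window."""
    return store.read_since(now - window_seconds)
```

Operator B would call rebuild_global_new once on takeover and then continue the second deduplication against the returned list.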

Referring to Fig. 5, Fig. 5 is a schematic structural diagram of the data processing device provided by an embodiment of this application. The data processing device 50 includes a processor 51 and a memory 52.

The memory 52 may specifically include multiple sub-memories or multiple storage regions for storing data in shards. The memory 52 also stores program data, and the processor 51 executes the program data to implement the following steps:

Obtaining pending data; comparing the pending data for similarity with the data in each of multiple storage regions to obtain multiple comparison results, where the multiple storage regions store the data of the frequently-used database in shards; based on the multiple comparison results, obtaining from the frequently-used database the target data with the highest similarity to the pending data; and processing the pending data according to the similarity between the pending data and the target data.

Optionally, the processor 51 is further configured to: determine whether the similarity between the pending data and the target data is less than the preset similarity threshold; and if so, cache the pending data.

Optionally, the processor 51 is further configured to: perform a first deduplication on the multiple pending data items in the cache; compare the pending data items that remain after the first deduplication with the data in the global newly created database to perform a second deduplication; and store the pending data items that remain after the second deduplication in the global newly created database.

Optionally, the processor 51 is further configured to: determine whether the number of pending data items in the cache reaches a preset quantity threshold; and if so, perform the first deduplication on the pending data items in the cache.

Optionally, the processor 51 is further configured to: determine whether the storage time of the pending data items in the cache reaches a preset time threshold; and if so, perform the first deduplication on the pending data items in the cache.

Optionally, the processor 51 is further configured to: obtain the similarity comparison result between each pending data item that remains after the first deduplication and the data in the global newly created database; and when the similarity between a pending data item and the data in the global newly created database is less than the preset similarity threshold, add that pending data item to the global newly created database.

Optionally, the processor 51 is further configured to: when the similarity between the pending data and the data in the global newly created database is less than the preset similarity threshold, save the pending data to one of the multiple storage regions.

Optionally, the processor 51 is further configured to: when the similarity between the pending data and the data in the global newly created database is less than the preset similarity threshold, add the pending data to a persistent database; and when the global newly created database becomes abnormal, read the data within a set time period from the persistent database to build a new global newly created database.

Optionally, the processor 51 is further configured to: compare the pending data for similarity with the data in each of the multiple storage regions, and take the data in each storage region with the highest similarity to the pending data as the comparison result.

Optionally, the processor 51 is further configured to: obtain a face image; and extract feature information from the face image as the pending data.

It can be understood that the data processing device 50 in this embodiment can serve as a server in a video surveillance system. The server is connected to multiple front-end camera devices and performs subsequent processing on the face images that they collect.

It is the structural schematic diagram of computer storage medium provided by the embodiments of the present application refering to Fig. 6, Fig. 6, which deposits Storage media 60 is stored with program data 61, and program data 61 is when being executed by processor, to perform the steps of

In addition, the program data 61 is further used to implement the following steps: judging whether the similarity between the pending data and the target data is less than a set similarity threshold; if so, caching the pending data; performing primary deduplication on the plurality of pending data in the cache; comparing the plurality of pending data after the primary deduplication with the data in the global new database to perform secondary deduplication; and storing the plurality of pending data after the secondary deduplication in the global new database.
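The two deduplication passes described above can be sketched as follows. The sketch assumes that the data are numeric feature vectors compared by cosine similarity, that two items count as duplicates when their similarity is not less than the set similarity threshold, and that the cache and the global new database are plain Python lists; the threshold value 0.9 and all function names are hypothetical.

```python
import math

SIMILARITY_THRESHOLD = 0.9  # assumed value; the application only says "set threshold"


def cosine(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_new(item, existing, threshold):
    """True when item is less similar than threshold to every entry in existing."""
    return all(cosine(item, other) < threshold for other in existing)


def primary_dedup(cache, threshold):
    """Primary deduplication: remove items that duplicate each other in the cache."""
    unique = []
    for item in cache:
        if is_new(item, unique, threshold):
            unique.append(item)
    return unique


def secondary_dedup(items, global_new_db, threshold):
    """Secondary deduplication: drop items already present in the global new database."""
    return [item for item in items if is_new(item, global_new_db, threshold)]


def flush_cache(cache, global_new_db, threshold=SIMILARITY_THRESHOLD):
    """Run both passes, store the survivors, and empty the cache."""
    survivors = secondary_dedup(primary_dedup(cache, threshold), global_new_db, threshold)
    global_new_db.extend(survivors)
    cache.clear()
    return survivors
```

Because the primary pass leaves only mutually distinct items, the secondary pass needs to compare each survivor only against the existing contents of the global new database, not against the other survivors.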

In the several embodiments provided herein, it should be understood that the disclosed method and device may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

If the integrated unit in the other embodiments described above is implemented in the form of a software functional unit and is sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

The foregoing is merely embodiments of the present application and does not thereby limit the patent scope of the application. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of the present application.

Claims

1. A data processing method, characterized by comprising:

obtaining pending data;

performing parallel similarity comparison between the pending data and the data in each of a plurality of storage regions to obtain a plurality of comparison results, wherein the plurality of storage regions store the data of a frequently used database in fragments;

obtaining, based on the plurality of comparison results, target data with the highest similarity to the pending data from the frequently used database; and

processing the pending data correspondingly according to the similarity between the pending data and the target data.

2. The method according to claim 1, characterized in that

the step of processing the pending data correspondingly according to the similarity between the pending data and the target data comprises:

judging whether the similarity between the pending data and the target data is less than a set similarity threshold; and

if so, caching the pending data.

3. The method according to claim 2, characterized in that

after the step of caching the pending data, the method further comprises:

performing primary deduplication on a plurality of pending data in the cache;

comparing the plurality of pending data after the primary deduplication with the data in a global new database to perform secondary deduplication; and

storing the plurality of pending data after the secondary deduplication in the global new database.

4. The method according to claim 3, characterized in that

the step of performing primary deduplication on the plurality of pending data in the cache comprises:

judging whether the number of the plurality of pending data in the cache reaches a set number threshold; and

if so, performing primary deduplication on the plurality of pending data in the cache.

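5. The method according to claim 3, characterized in that

the step of performing primary deduplication on the plurality of pending data in the cache comprises:

judging whether the storage time of the plurality of pending data in the cache reaches a set time threshold; and

if so, performing primary deduplication on the plurality of pending data in the cache.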

6. The method according to claim 3, characterized in that

the step of storing the plurality of pending data after the secondary deduplication in the global new database comprises:

obtaining a similarity comparison result between each pending data item after the primary deduplication and the data in the global new database; and

when the similarity between a pending data item and the data in the global new database is less than the set similarity threshold, adding the pending data item to the global new database.

7. The method according to claim 6, characterized in that

the method further comprises:

when the similarity between the pending data and the data in the global new database is less than the set similarity threshold, saving the pending data into one of the plurality of storage regions.

8. The method according to claim 6, characterized in that

the method further comprises:

when the similarity between the pending data and the data in the global new database is less than the set similarity threshold, adding the pending data to a persistent database; and

when the global new database becomes abnormal, reading the data within a set time period from the persistent database and establishing a new global new database.

9. The method according to claim 1, characterized in that

the step of performing parallel similarity comparison between the pending data and the data in each of the plurality of storage regions to obtain a plurality of comparison results comprises:

performing parallel similarity comparison between the pending data and the data in each of the plurality of storage regions, and obtaining, in each storage region, the data with the highest similarity to the pending data as the comparison result.

10. The method according to claim 1, characterized in that

the step of obtaining pending data comprises:

obtaining a face image; and

extracting feature information from the face image as the pending data.

11. A data processing device, characterized in that the data processing device comprises a processor and a memory, the memory being configured to store program data and the processor being configured to execute the program data to implement the method according to any one of claims 1-10.

12. A computer storage medium, characterized in that the computer storage medium stores program data which, when executed by a processor, implements the method according to any one of claims 1-10.