CN110442606A - Data processing method, device, and computer storage medium

Data processing method, device, and computer storage medium

Info

Publication number
CN110442606A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910642824.1A
Other languages
Chinese (zh)
Inventor
杨莉
阮学武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Application filed by Zhejiang Dahua Technology Co Ltd
Priority to CN201910642824.1A
Publication of CN110442606A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2455: Query execution
    • G06F16/24552: Database cache management
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2455: Query execution
    • G06F16/24553: Query execution of query operations
    • G06F16/24554: Unary operations; Data partitioning operations
    • G06F16/24556: Aggregation; Duplicate elimination
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/278: Data partitioning, e.g. horizontal or vertical partitioning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/22: Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Image Analysis (AREA)

Abstract

This application discloses a data processing method, a device, and a computer storage medium. The data processing method includes: obtaining data to be processed; comparing the data to be processed for similarity with the data in each of multiple storage regions to obtain multiple comparison results, where the multiple storage regions store the data of a common database in shards; obtaining, based on the multiple comparison results, the target data in the common database that has the highest similarity to the data to be processed; and processing the data to be processed according to its similarity to the target data. In this way, the efficiency of data processing can be improved.

Description

Data processing method, device, and computer storage medium
Technical Field
This application relates to the technical field of data processing, and in particular to a data processing method, a device, and a computer storage medium.
Background
In big data processing, as data volumes surge and databases grow, comparing one piece of data serially against all the data in a database requires many comparisons and is relatively inefficient.
For example, in the collection and comparison of identity information, collection coverage expands rapidly and the number of people keeps growing, so the database created in real time grows accordingly. Every day, all collected subjects must be compared against everyone in the archive database to confirm whether each collected subject is a person already in the database or a person for whom a new archive must be created. The number of comparisons is enormous. If every passing person is compared serially, the time consumed grows exponentially as the database grows and as the number of people passing per second increases.
Summary of the Invention
To solve the above problem, this application provides a data processing method, a device, and a computer storage medium that can improve the efficiency of data processing.
One technical solution adopted by this application is to provide a data processing method, including: obtaining data to be processed; performing parallel similarity comparison between the data to be processed and the data in each of multiple storage regions to obtain multiple comparison results, where the multiple storage regions store the data of a common database in shards; obtaining, based on the multiple comparison results, the target data in the common database that has the highest similarity to the data to be processed; and processing the data to be processed according to its similarity to the target data.
The step of processing the data to be processed according to its similarity to the target data includes: determining whether the similarity between the data to be processed and the target data is less than a set similarity threshold; and if so, caching the data to be processed.
After the step of caching the data to be processed, the method further includes: performing primary deduplication on the multiple pieces of data to be processed in the cache; comparing the pieces that remain after primary deduplication with the data in a global newly created database to perform secondary deduplication; and storing the pieces that remain after secondary deduplication in the global newly created database.
The step of performing primary deduplication on the multiple pieces of data to be processed in the cache includes: determining whether the number of pieces of data to be processed in the cache has reached a set quantity threshold; and if so, performing primary deduplication on the pieces of data to be processed in the cache.
Alternatively, the step of performing primary deduplication on the multiple pieces of data to be processed in the cache includes: determining whether the storage time of the pieces of data to be processed in the cache has reached a set time threshold; and if so, performing primary deduplication on the pieces of data to be processed in the cache.
The step of storing the pieces that remain after secondary deduplication in the global newly created database includes: obtaining the similarity comparison result between each piece of data to be processed after primary deduplication and the data in the global newly created database; and when the similarity between a piece of data to be processed and the data in the global newly created database is less than the set similarity threshold, adding that piece to the global newly created database.
The method further includes: when the similarity between the data to be processed and the data in the global newly created database is less than the set similarity threshold, saving the data to be processed to one of the multiple storage regions.
The method further includes: when the similarity between the data to be processed and the data in the global newly created database is less than the set similarity threshold, adding the data to be processed to a persistent database; and when the global newly created database becomes abnormal, reading the data within a set time period from the persistent database and establishing a new global newly created database.
The step of performing parallel similarity comparison between the data to be processed and the data in each of multiple storage regions to obtain multiple comparison results includes: performing parallel similarity comparison between the data to be processed and the data in each of the multiple storage regions, and obtaining, in each storage region, the data with the highest similarity to the data to be processed as the comparison result.
The step of obtaining the data to be processed includes: obtaining a face image; and extracting feature information from the face image as the data to be processed.
Another technical solution adopted by this application is to provide a data processing device that includes a processor and a memory, where the memory is configured to store program data and the processor is configured to execute the program data to implement the method described above.
Another technical solution adopted by this application is to provide a computer storage medium that stores program data, where the program data, when executed by a processor, implements the method described above.
The data processing method provided by this application includes: obtaining data to be processed; performing parallel similarity comparison between the data to be processed and the data in each of multiple storage regions to obtain multiple comparison results, where the multiple storage regions store the data of a common database in shards; obtaining, based on the multiple comparison results, the target data in the common database that has the highest similarity to the data to be processed; and processing the data to be processed according to its similarity to the target data. In this way, the data is stored in partitions. When data to be processed is compared, it is compared with the data in the multiple storage regions in parallel, and the comparison results are then aggregated. This improves comparison efficiency and improves system stability when large amounts of data are compared at the same time.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of this application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are only some embodiments of this application; those of ordinary skill in the art can obtain other drawings from them without creative effort. In the drawings:
Fig. 1 is a flow diagram of the data processing method provided by an embodiment of this application;
Fig. 2 is a framework diagram of the data processing method provided by an embodiment of this application;
Fig. 3 is another flow diagram of the data processing method provided by an embodiment of this application;
Fig. 4 is another framework diagram of the data processing method provided by an embodiment of this application;
Fig. 5 is a structural schematic diagram of the data processing device provided by an embodiment of this application;
Fig. 6 is a structural schematic diagram of the computer storage medium provided by an embodiment of this application.
Detailed Description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments. It should be understood that the specific embodiments described here only explain this application and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to this application rather than the entire structure. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this application without creative effort fall within the protection scope of this application.
The terms "first", "second", and so on in this application are used to distinguish different objects, not to describe a particular order. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to the process, method, product, or device.
A reference to an "embodiment" here means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of this application. Occurrences of the phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to independent or alternative embodiments that are mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described here can be combined with other embodiments.
Referring to Fig. 1, which is a flow diagram of the data processing method provided by an embodiment of this application, the method includes:
Step 11: Obtain data to be processed.
The data to be processed may be data used to identify a person or an object, for example face data or vehicle data.
Optionally, taking face data as an example, step 11 may specifically be: obtaining a face image, and extracting feature information from the face image as the data to be processed.
Specifically, a camera can be used to obtain the face image. For example, in one embodiment, a camera obtains a face image, image processing is performed on the face image, and the feature information in it is then obtained. In another embodiment, a color camera and an infrared camera can be used to obtain a color depth image of the face, and the feature information in the color depth image is then obtained. The feature information may be pupil size, interpupillary distance, the positions of the eyes and other facial organs, color features, and so on.
Step 12: Perform parallel similarity comparison between the data to be processed and the data in each of multiple storage regions to obtain multiple comparison results, where the multiple storage regions store the data of a common database in shards.
The multiple storage regions can be implemented with multiple memories, by dividing one memory into multiple storage areas, or with multiple servers.
Specifically, all the data of the common database can be stored in shards across different storage regions. Sharding means splitting the common database and distributing it over different servers. For example, 100 GB of data can be split into 10 parts and stored on different servers, so that each server stores only 10 GB. Taking identity information as an example, all identity information is divided into N groups and stored in N storage regions; during comparison, the data to be processed is compared in parallel against each of the N storage regions.
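As an illustration of the sharding described above, the following Python sketch splits a list of records into N groups. The function name `shard_round_robin` and the round-robin assignment rule are assumptions made for this example; the application does not specify how records are assigned to storage regions.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def shard_round_robin(records: Sequence[T], num_shards: int) -> List[List[T]]:
    """Split records into num_shards groups (hypothetical helper).

    Record i is assigned to shard i % num_shards, so shard sizes differ
    by at most one. Any other balanced assignment would serve equally well.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    shards: List[List[T]] = [[] for _ in range(num_shards)]
    for index, record in enumerate(records):
        shards[index % num_shards].append(record)
    return shards
```

Each resulting group would then be placed in its own storage region (a memory, a memory area, or a server).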
Optionally, in one embodiment, step 12 may specifically be: performing parallel similarity comparison between the data to be processed and the data in each of the multiple storage regions, and obtaining, in each storage region, the data with the highest similarity to the data to be processed as the comparison result.
The similarity can be computed with a clustering algorithm or a distance algorithm. For example, with a Euclidean distance algorithm, the similarity of two pieces of data is obtained by calculating the Euclidean distance between them.
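A minimal Python sketch of a Euclidean-distance-based similarity follows. The text states only that similarity is obtained from the Euclidean distance; the mapping `1 / (1 + distance)` used here is an assumption chosen so that identical vectors score 1.0 and larger distances score lower.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Map distance to a similarity in (0, 1] (assumed mapping, not from the source)."""
    return 1.0 / (1.0 + euclidean_distance(a, b))
```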
The comparison result here may contain no data, one piece of data, or multiple pieces of data. During comparison, the data to be processed is compared with the data of the multiple storage regions in parallel. Within a single storage region, it is compared with each piece of data in that region one by one in a particular order.
Specifically, the data to be processed is compared for similarity with each of the M pieces of data in a storage region in turn, giving M similarity values. The values are then sorted by size, and the data corresponding to the largest similarity value is selected as the comparison result.
In other embodiments, several pieces of data with the largest similarity can instead be selected from the sorted M pieces, for example the three pieces with the largest similarity, as the comparison result.
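The per-region selection can be sketched in Python as below. The shard is modeled as a dictionary from a record ID to a feature vector, and the similarity mapping `1 / (1 + distance)` is an assumption; both are illustrative choices rather than details from the application. `heapq.nlargest` stands in for "sort and take the top entries".

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed similarity: 1 / (1 + Euclidean distance)."""
    return 1.0 / (1.0 + math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))))


def top_k_in_shard(
    query: Sequence[float],
    shard: Dict[str, Sequence[float]],
    k: int = 1,
) -> List[Tuple[float, str]]:
    """Return up to k (similarity, record_id) pairs, most similar first.

    An empty shard yields an empty list, matching the case where a
    comparison result contains no data.
    """
    scored = ((similarity(query, vec), rid) for rid, vec in shard.items())
    return heapq.nlargest(k, scored)
```

With `k=1` this corresponds to selecting the single most similar record; with `k=3` it corresponds to the "three pieces with the largest similarity" variant.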
Step 13: Based on the multiple comparison results, obtain from the common database the target data with the highest similarity to the data to be processed.
After the parallel comparison against each storage region, the multiple comparison results are merged. For example, all the data in the multiple comparison results is sorted by similarity, and the data with the highest similarity is selected as the target data.
For example, if comparing the data to be processed with the data of each of N storage regions yields one highest-similarity piece of data per region, the N pieces obtained from the N storage regions are sorted by similarity and the one with the largest similarity is taken as the target data. If the comparison yields several highest-similarity pieces per region, for example 3, the 3N pieces obtained from the N storage regions are sorted by similarity and the one with the largest similarity is taken as the target data.
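The merge step can be sketched as follows, assuming each region's comparison result is a list of `(similarity, record_id)` pairs (a representation chosen for this example). Taking the maximum is equivalent to sorting all N or 3N candidates and picking the first.

```python
from typing import Iterable, List, Optional, Tuple

Result = Tuple[float, str]  # (similarity, record_id), an assumed representation


def merge_results(per_shard_results: Iterable[List[Result]]) -> Optional[Result]:
    """Pick the globally most similar candidate from all per-region results.

    Returns None when every region returned an empty result.
    """
    candidates = [item for result in per_shard_results for item in result]
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])
```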
Step 14: Process the data to be processed according to its similarity to the target data.
The target data here is the data in the common database with the highest similarity to the data to be processed. In one case, the similarity between the target data and the data to be processed is high and meets a preset requirement; the data to be processed can then be considered already stored in the common database, and no subsequent processing of it is needed. In the other case, the similarity is low and does not meet the preset requirement; the data to be processed can then be considered not yet stored in the common database, and the subsequent storage operation can be performed.
Face data is used as an example below. As shown in Fig. 2, which is a framework diagram of the data processing method provided by an embodiment of this application, in a specific embodiment an executor framework, multiple one-person-one-file operators, and multiple shard REPOs (repositories) can be established. Specifically, the following steps can be performed:
A. executor1 obtains data to be processed 1 and sends it to one-person-one-file operator 1, one-person-one-file operator 2, ..., one-person-one-file operator N.
B. Each one-person-one-file operator compares data to be processed 1 separately. For example, one-person-one-file operator 1 compares data to be processed 1 with the data stored in shard REPO-1 one by one to obtain comparison result 1. Similarly, one-person-one-file operator N compares data to be processed 1 with the data stored in shard REPO-N one by one to obtain comparison result N. Each one-person-one-file operator feeds its comparison result back to executor1.
C. executor1 merges the multiple comparison results to obtain the target data with the highest similarity to data to be processed 1.
D. executor1 determines the subsequent operation on data to be processed 1 according to the similarity between data to be processed 1 and the target data.
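Steps A to C above can be sketched in Python with a thread pool that plays the role of the executor fanning a query out to one worker per shard. This is only an in-process illustration under stated assumptions: the names `best_in_shard` and `find_target`, the dictionary model of a shard REPO, and the `1 / (1 + distance)` similarity are all hypothetical, and in the described system the operators would typically run on separate servers.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Optional, Sequence, Tuple

Shard = Dict[str, Sequence[float]]  # record_id -> feature vector (assumed model)


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed similarity: 1 / (1 + Euclidean distance)."""
    return 1.0 / (1.0 + math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))))


def best_in_shard(query: Sequence[float], shard: Shard) -> Optional[Tuple[float, str]]:
    """Step B: compare the query with every record in one shard, one by one."""
    best: Optional[Tuple[float, str]] = None
    for rid, vec in shard.items():
        score = similarity(query, vec)
        if best is None or score > best[0]:
            best = (score, rid)
    return best


def find_target(query: Sequence[float], shards: List[Shard]) -> Optional[Tuple[float, str]]:
    """Steps A and C: fan the query out to all shards, then merge the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        results = list(pool.map(lambda shard: best_in_shard(query, shard), shards))
    hits = [r for r in results if r is not None]
    return max(hits, key=lambda r: r[0]) if hits else None
```

Step D would then inspect the returned similarity to decide what to do with the query.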
Unlike the prior art, the data processing method of this embodiment includes: obtaining data to be processed; comparing the data to be processed for similarity with the data in each of multiple storage regions to obtain multiple comparison results, where the multiple storage regions store the data of a common database in shards; obtaining, based on the multiple comparison results, the target data in the common database that has the highest similarity to the data to be processed; and processing the data to be processed according to its similarity to the target data. In this way, the data is stored in partitions. When data to be processed is compared, it is compared with the data in the multiple storage regions in parallel, and the comparison results are then aggregated. This improves comparison efficiency and improves system stability when large amounts of data are compared at the same time.
Referring to Fig. 3, which is another flow diagram of the data processing method provided by an embodiment of this application, the method includes:
Step 31: Obtain data to be processed.
Step 32: Compare the data to be processed for similarity with the data in each of multiple storage regions to obtain multiple comparison results, where the multiple storage regions store the data of a common database in shards.
Step 33: Based on the multiple comparison results, obtain from the common database the target data with the highest similarity to the data to be processed.
Step 34: Determine whether the similarity between the data to be processed and the target data is less than a set similarity threshold.
When the similarity between the data to be processed and the target data is greater than or equal to the set similarity threshold, the data to be processed can be considered already stored in the common database and need not be stored again; it can be deleted, and the next piece of data to be processed is obtained and handled. When the similarity is less than the set similarity threshold, the data to be processed can be considered new data that needs to be stored, and the subsequent step 35 is performed.
In the face recognition embodiment, when the similarity between the collected face feature data and the target data in the common database is greater than or equal to the set similarity threshold, the identity information corresponding to the face feature can be considered already stored in the common database and need not be stored again. When the similarity is less than the set similarity threshold, the user identity information corresponding to the face feature can be considered new identity information that needs to be stored, and the subsequent step 35 is performed.
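The decision in step 34 can be sketched as a small Python function. The function name `handle_by_similarity`, the string return values, and the treatment of an empty database (no target found, represented by `None`) as "new data" are assumptions made for this example.

```python
from typing import Any, List, Optional


def handle_by_similarity(
    record: Any,
    similarity_to_target: Optional[float],
    threshold: float,
    cache: List[Any],
) -> str:
    """Discard a record that is already known, or cache it as new data.

    similarity_to_target is None when no target was found at all
    (assumed here to mean the record is new).
    """
    if similarity_to_target is not None and similarity_to_target >= threshold:
        return "discard"  # considered already stored in the common database
    cache.append(record)  # step 35: cache the new record
    return "cached"
```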
Step 35: Cache the data to be processed.
Step 36: Perform primary deduplication on the multiple pieces of data to be processed in the cache.
It should be understood that, in a face recognition scenario, if the same user passes a data collection point several times within a short period, the face data is collected several times, which can easily cause the face data to be stored repeatedly. In this embodiment, new data is therefore not stored in the common database immediately but is first cached and deduplicated.
Optionally, deduplication can be triggered by the cache time of the data to be processed or by the number of cached pieces of data.
In one embodiment, it is determined whether the storage time of the pieces of data to be processed in the cache has reached a set time threshold; if so, primary deduplication is performed on the pieces of data to be processed in the cache.
For example, timing starts when the first piece of data to be processed is stored in the cache. After a certain time has elapsed, primary deduplication is performed on all the data to be processed in the cache area, regardless of how many pieces are stored.
In another embodiment, it is determined whether the number of pieces of data to be processed in the cache has reached a set quantity threshold; if so, primary deduplication is performed on the pieces of data to be processed in the cache.
For example, counting starts when the first piece of data to be processed is stored in the cache. When the number of cached pieces reaches a certain value, primary deduplication is performed on all the data to be processed in the cache area.
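The two triggers can be sketched together in one small Python class. The text presents the time trigger and the count trigger as separate embodiments; combining them in a single class, the class name `PendingCache`, and the injectable `clock` parameter (used to make the behavior testable) are assumptions of this example.

```python
import time
from typing import Any, Callable, List, Optional


class PendingCache:
    """Cache that signals when primary deduplication should run (hypothetical)."""

    def __init__(
        self,
        max_items: int,
        max_age_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_items = max_items
        self.max_age_seconds = max_age_seconds
        self._clock = clock
        self._items: List[Any] = []
        self._first_stored_at: Optional[float] = None

    def add(self, record: Any) -> None:
        # Timing starts when the first record enters an empty cache.
        if not self._items:
            self._first_stored_at = self._clock()
        self._items.append(record)

    def should_flush(self) -> bool:
        if not self._items or self._first_stored_at is None:
            return False
        if len(self._items) >= self.max_items:  # count trigger
            return True
        age = self._clock() - self._first_stored_at
        return age >= self.max_age_seconds  # time trigger

    def drain(self) -> List[Any]:
        """Hand all cached records to deduplication and reset the cache."""
        items, self._items = self._items, []
        self._first_stored_at = None
        return items
```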
During primary deduplication, each piece of data to be processed can be compared for similarity with every other piece of data in the cache area in turn. When the similarity is greater than the previously set threshold, the piece of data to be processed is considered a duplicate of that data. This may happen because the same subject was collected again before the earlier data had been synchronized to the common database. One of the two can then be deleted.
For example, the data in the cache area includes A1, B1, C1, D1, A2, E1, F1, and so on. If comparison shows that the similarity between data A1 and data A2 is greater than the set threshold, A1 and A2 are considered duplicate data, and one of them is deleted.
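A minimal Python sketch of primary deduplication follows. It keeps the first record of each duplicate group and drops later ones; the text says only that one of the duplicates is deleted, so which one is kept is an assumption here, as are the function name `dedupe_once` and the `1 / (1 + distance)` similarity.

```python
import math
from typing import List, Sequence, Tuple

Record = Tuple[str, Sequence[float]]  # (record_id, feature vector), assumed model


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed similarity: 1 / (1 + Euclidean distance)."""
    return 1.0 / (1.0 + math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))))


def dedupe_once(records: List[Record], threshold: float) -> List[Record]:
    """Keep a record only if it is not too similar to any record already kept."""
    kept: List[Record] = []
    for rid, vec in records:
        if all(similarity(vec, kept_vec) <= threshold for _, kept_vec in kept):
            kept.append((rid, vec))
    return kept
```

For a cache holding A1, B1, A2, and C1, where A2 is nearly identical to A1, the function keeps A1, B1, and C1.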
Step 37: Compare the pieces of data to be processed that remain after primary deduplication with the data in the global newly created database to perform secondary deduplication.
The global newly created database stores the new data to be processed that has been obtained within a certain period.
Optionally, each piece of data to be processed after primary deduplication is compared for similarity with the data in the global newly created database, so that secondary deduplication is performed on the data that remains after primary deduplication.
Step 38: Store the pieces of data to be processed that remain after secondary deduplication in the global newly created database.
For example, data to be processed 1 is compared for similarity with each piece of data in the global newly created database in turn. If the similarity between any piece of data and the data to be processed is greater than the set threshold, the data to be processed can be considered already added to the global newly created database. If the similarity between every piece of data and the data to be processed is less than the set threshold, the data to be processed can be added to the global newly created database.
Further, when the similarity between the data to be processed and the data in the global newly created database is less than the set similarity threshold, the data to be processed is saved to one of the multiple storage regions. That is, when the similarity between the data to be processed and the data in the global newly created database is less than the set similarity threshold, the data to be processed is also added to the global database.
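Secondary deduplication against the global newly created database can be sketched in Python as below. The database is modeled as an in-memory dictionary and the function name `second_pass` is hypothetical; forwarding the accepted records to a storage region is left to the caller, which receives the list of accepted IDs.

```python
import math
from typing import Dict, List, Sequence, Tuple

Record = Tuple[str, Sequence[float]]  # (record_id, feature vector), assumed model


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed similarity: 1 / (1 + Euclidean distance)."""
    return 1.0 / (1.0 + math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))))


def second_pass(
    records: List[Record],
    global_new: Dict[str, Sequence[float]],
    threshold: float,
) -> List[str]:
    """Add records that match nothing in global_new; return the IDs that were added."""
    added: List[str] = []
    for rid, vec in records:
        matched = any(
            similarity(vec, existing) > threshold for existing in global_new.values()
        )
        if not matched:
            global_new[rid] = vec  # record is new: store it
            added.append(rid)      # caller may also save it to a storage region
    return added
```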
Face data is used as an example below. As shown in Fig. 4, which is another framework diagram of the data processing method provided by an embodiment of this application, in a specific embodiment an executor framework, multiple one-person-one-file operators, and multiple shard REPOs (repositories) can be established. Specifically, the following steps can be performed:
A. The executor obtains data to be processed 1 and sends it to one-person-one-file operator 1, one-person-one-file operator 2, ..., one-person-one-file operator N.
B. Each one-person-one-file operator compares data to be processed 1 separately. For example, one-person-one-file operator 1 compares data to be processed 1 with the data stored in shard REPO-1 one by one to obtain comparison result 1. Similarly, one-person-one-file operator N compares data to be processed 1 with the data stored in shard REPO-N one by one to obtain comparison result N. Each one-person-one-file operator feeds its comparison result back to the executor.
C. The executor merges the multiple comparison results to obtain the target data with the highest similarity to data to be processed 1.
D. The executor determines whether the similarity between data to be processed 1 and the target data is less than the set similarity threshold; if it is, the data to be processed is cached.
E. Primary deduplication is performed on the multiple cached pieces of data to be processed.
F. The data to be processed after primary deduplication is compared with the data in the global newly created archive database to perform secondary deduplication.
Secondary deduplication compares the data to be processed with the data in the global newly created archive database one by one. If the similarity is greater than the set threshold, the comparison is a match; if the similarity is less than the set threshold, it is not a match.
G. If the data to be processed does not match anything in the global newly created archive database, it is added to the global newly created archive database.
H. If the data to be processed does not match anything in the global newly created archive database, it is sent to the executor, and the executor adds it to a one-person-one-file operator (that is, to the common database).
Unlike the prior art, in this embodiment the data to be processed undergoes two rounds of deduplication after comparison, which prevents duplicate archives from being created for the same data and improves comparison performance.
In addition, when the similarity between the data to be processed and the data in the global newly created database is less than the set similarity threshold, the data to be processed is added to a persistent database. When the global newly created database becomes abnormal, the data within a set time period is read from the persistent database and a new global newly created database is established.
Specifically, operator A, which loads the global newly created database, may become abnormal for some reason (for example, the device loses power and goes offline). The system then selects another operator B that is capable of loading the global newly created database to take over the work of operator A. For the handover to be seamless, operator B needs to know the global newly created database information that was in operator A's memory before the failure, so this information must be persisted for an operator to reload after a failure. In this embodiment, the global newly created database information is persisted by establishing a persistent database. After the secondary deduplication comparison result is received and the data to be processed is determined to be new data, the data is sent to the persistent database for persistent storage. After operator A fails, operator B loads from the persistent database the global newly created database data that is still within its life cycle and seamlessly takes over the work of operator A.
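The recovery path can be sketched in Python under simple assumptions: the persistent database is modeled as a list of `(timestamp, record_id, feature_vector)` entries, and "within its life cycle" is interpreted as "stored within the last `window_seconds`". The function name `rebuild_global_new` and this data model are illustrative only; a real system would read from durable storage.

```python
from typing import Dict, List, Sequence, Tuple

LogEntry = Tuple[float, str, Sequence[float]]  # (timestamp, record_id, vector)


def rebuild_global_new(
    persistent_log: List[LogEntry],
    now: float,
    window_seconds: float,
) -> Dict[str, Sequence[float]]:
    """Rebuild the in-memory global newly created database after a failure.

    Only entries whose timestamp falls inside the retention window are
    loaded; older entries are treated as past their life cycle.
    """
    cutoff = now - window_seconds
    return {rid: vec for ts, rid, vec in persistent_log if ts >= cutoff}
```

The operator that takes over would call this once at startup and then continue secondary deduplication against the rebuilt dictionary.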
Referring to Fig. 5, Fig. 5 is a structural schematic diagram of the data processing equipment provided by an embodiment of the present application. The data processing equipment 50 includes a processor 51 and a memory 52.
The memory 52 may specifically include multiple sub-memories, or multiple storage regions, for storing data in shards. In addition, program data is stored in the memory 52, and the processor 51 is used to execute the program data to implement the following steps:
Obtain pending data; compare the pending data for similarity with the data in each of multiple storage regions to obtain multiple comparison results, wherein the multiple storage regions store the data of the frequently-used data library in shards; based on the multiple comparison results, obtain from the frequently-used data library the target data with the highest similarity to the pending data; and process the pending data accordingly, based on the similarity between the pending data and the target data.
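The steps above can be sketched as a search over shards, where each shard reports its best match and the overall best becomes the target data. The sketch assumes cosine similarity over feature vectors and uses a thread pool to compare against the shards in parallel; `best_in_shard`, `find_target`, `process` and the `"matched"` return value are hypothetical names chosen for illustration.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]
Match = Tuple[float, Vector]  # (similarity, matched entry)


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_in_shard(query: Vector, shard: List[Vector]) -> Optional[Match]:
    """Comparison result for one storage region: its most similar entry, or None if empty."""
    best: Optional[Match] = None
    for entry in shard:
        score = cosine_similarity(query, entry)
        if best is None or score > best[0]:
            best = (score, entry)
    return best


def find_target(query: Vector, shards: List[List[Vector]]) -> Optional[Match]:
    """Compare against all shards in parallel and keep the overall best match."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        results = list(pool.map(lambda shard: best_in_shard(query, shard), shards))
    candidates = [result for result in results if result is not None]
    return max(candidates, key=lambda result: result[0]) if candidates else None


def process(query: Vector, shards: List[List[Vector]], threshold: float,
            cache: List[Vector]) -> str:
    """Cache the query when no shard holds a sufficiently similar entry."""
    target = find_target(query, shards)
    if target is None or target[0] < threshold:
        cache.append(query)
        return "cached"
    return "matched"


# Hypothetical shards of the frequently-used data library.
shards = [[(1.0, 0.0), (0.0, 1.0)], [(0.6, 0.8)], []]
cache: List[Vector] = []
target = find_target((0.7, 0.7), shards)
first = process((0.7, 0.7), shards, threshold=0.9, cache=cache)
second = process((1.0, -1.0), shards, threshold=0.9, cache=cache)
```

Because each shard returns a single best candidate, the final reduction only has to compare one result per storage region.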
Optionally, the processor 51 is also used to execute: judging whether the similarity between the pending data and the target data is less than a set similarity threshold; if so, caching the pending data.
Optionally, the processor 51 is also used to execute: performing primary deduplication on the multiple pieces of pending data in the cache; comparing the pieces of pending data that remain after primary deduplication with the data in the global newdata library, so as to perform secondary deduplication; and storing the pieces of pending data that remain after secondary deduplication in the global newdata library.
Optionally, the processor 51 is also used to execute: judging whether the number of pieces of pending data in the cache reaches a set quantity threshold; if so, performing primary deduplication on the multiple pieces of pending data in the cache.
Optionally, the processor 51 is also used to execute: judging whether the storage time of the multiple pieces of pending data in the cache reaches a set time threshold; if so, performing primary deduplication on the multiple pieces of pending data in the cache.
Optionally, the processor 51 is also used to execute: obtaining the similarity comparison result between each piece of pending data after primary deduplication and the data in the global newdata library; when the similarity between the pending data and the data in the global newdata library is less than the set similarity threshold, adding the pending data to the global newdata library.
Optionally, the processor 51 is also used to execute: when the similarity between the pending data and the data in the global newdata library is less than the set similarity threshold, saving the pending data into one of the multiple storage regions.
Optionally, the processor 51 is also used to execute: when the similarity between the pending data and the data in the global newdata library is less than the set similarity threshold, adding the pending data to the persistent data library; when the global newdata library becomes abnormal, reading the data within a set period of time from the persistent data library to establish a new global newdata library.
Optionally, the processor 51 is also used to execute: comparing the pending data for similarity with the data in each of the multiple storage regions, and obtaining, in each storage region, the data with the highest similarity to the pending data as the comparison result.
Optionally, the processor 51 is also used to execute: obtaining a facial image; extracting characteristic information from the facial image as the pending data.
It can be understood that the data processing equipment 50 in the present embodiment can serve as a server in a video monitoring system. The server is connected to multiple front-end camera devices and performs subsequent processing on the facial images collected by those camera devices.
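The quantity threshold and time threshold that trigger primary deduplication of the cache, described above, can be sketched as a small buffer. The embodiment presents the two triggers as separate options; combining them with a logical OR, the injectable clock, and the name `PendingCache` are assumptions made here for illustration only.

```python
import time
from typing import Callable, List, Optional


class PendingCache:
    """Buffers pending data until a count threshold or an age threshold is reached."""

    def __init__(self, max_items: int, max_age_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_items = max_items
        self.max_age_seconds = max_age_seconds
        self.clock = clock  # injectable so the behaviour can be checked without sleeping
        self.items: List[object] = []
        self.first_added_at: Optional[float] = None

    def add(self, item: object) -> None:
        """Store an item and remember when the current batch started."""
        if not self.items:
            self.first_added_at = self.clock()
        self.items.append(item)

    def should_flush(self) -> bool:
        """True when the batch is large enough or its oldest item is old enough."""
        if not self.items or self.first_added_at is None:
            return False
        if len(self.items) >= self.max_items:
            return True
        return self.clock() - self.first_added_at >= self.max_age_seconds

    def flush(self) -> List[object]:
        """Hand the batch over for primary deduplication and reset the cache."""
        batch, self.items = self.items, []
        self.first_added_at = None
        return batch


# Hypothetical usage with a fake clock instead of real time.
now = [0.0]
pending = PendingCache(max_items=3, max_age_seconds=10.0, clock=lambda: now[0])
pending.add("a")
flush_before = pending.should_flush()   # too few items, too young
now[0] = 11.0
flush_by_age = pending.should_flush()   # oldest item exceeded the age threshold
aged_batch = pending.flush()
for name in ("b", "c", "d"):
    pending.add(name)
flush_by_count = pending.should_flush() # count threshold reached
```

A batch returned by `flush` would then go through primary and secondary deduplication before anything is written to the global newdata library.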
Referring to Fig. 6, Fig. 6 is a structural schematic diagram of the computer storage medium provided by an embodiment of the present application. The computer storage medium 60 stores program data 61, and the program data 61, when executed by a processor, implements the following steps:
Obtain pending data; compare the pending data for similarity with the data in each of multiple storage regions to obtain multiple comparison results, wherein the multiple storage regions store the data of the frequently-used data library in shards; based on the multiple comparison results, obtain from the frequently-used data library the target data with the highest similarity to the pending data; and process the pending data accordingly, based on the similarity between the pending data and the target data.
In addition, the program data is also used to execute the following steps: judging whether the similarity between the pending data and the target data is less than the set similarity threshold; if so, caching the pending data; performing primary deduplication on the multiple pieces of pending data in the cache; comparing the pieces of pending data that remain after primary deduplication with the data in the global newdata library, so as to perform secondary deduplication; and storing the pieces of pending data that remain after secondary deduplication in the global newdata library.
In the several embodiments provided in the present application, it should be understood that the disclosed method and equipment may be implemented in other ways. For example, the equipment embodiments described above are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation. For instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the scheme of the present embodiment.
In addition, the functional units in the embodiments of the present application may be integrated in one processing unit, each unit may exist physically on its own, or two or more units may be integrated in one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit in the above other embodiments is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above are merely embodiments of the present application and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of the present application.

Claims (12)

1. A data processing method, characterized by comprising:
obtaining pending data;
comparing, in parallel, the pending data for similarity with the data in each of multiple storage regions to obtain multiple comparison results, wherein the multiple storage regions store the data of a frequently-used data library in shards;
based on the multiple comparison results, obtaining from the frequently-used data library the target data with the highest similarity to the pending data;
processing the pending data accordingly, based on the similarity between the pending data and the target data.
2. The method according to claim 1, characterized in that
the step of processing the pending data accordingly, based on the similarity between the pending data and the target data, comprises:
judging whether the similarity between the pending data and the target data is less than a set similarity threshold;
if so, caching the pending data.
3. The method according to claim 2, characterized in that
after the step of caching the pending data, the method further comprises:
performing primary deduplication on multiple pieces of pending data in the cache;
comparing the pieces of pending data that remain after primary deduplication with the data in a global newdata library, so as to perform secondary deduplication;
storing the pieces of pending data that remain after secondary deduplication in the global newdata library.
4. The method according to claim 3, characterized in that
the step of performing primary deduplication on multiple pieces of pending data in the cache comprises:
judging whether the number of the pieces of pending data in the cache reaches a set quantity threshold;
if so, performing primary deduplication on the multiple pieces of pending data in the cache.
5. The method according to claim 3, characterized in that
the step of performing primary deduplication on multiple pieces of pending data in the cache comprises:
judging whether the storage time of the multiple pieces of pending data in the cache reaches a set time threshold;
if so, performing primary deduplication on the multiple pieces of pending data in the cache.
6. The method according to claim 3, characterized in that
the step of storing the pieces of pending data that remain after secondary deduplication in the global newdata library comprises:
obtaining the similarity comparison result between each piece of pending data after primary deduplication and the data in the global newdata library;
when the similarity between the pending data and the data in the global newdata library is less than the set similarity threshold, adding the pending data to the global newdata library.
7. The method according to claim 6, characterized in that
the method further comprises:
when the similarity between the pending data and the data in the global newdata library is less than the set similarity threshold, saving the pending data into one of the multiple storage regions.
8. The method according to claim 6, characterized in that
the method further comprises:
when the similarity between the pending data and the data in the global newdata library is less than the set similarity threshold, adding the pending data to a persistent data library;
when the global newdata library becomes abnormal, reading the data within a set period of time from the persistent data library to establish a new global newdata library.
9. The method according to claim 1, characterized in that
the step of comparing, in parallel, the pending data for similarity with the data in each of multiple storage regions to obtain multiple comparison results comprises:
comparing, in parallel, the pending data for similarity with the data in each of the multiple storage regions, and obtaining, in each storage region, the data with the highest similarity to the pending data as the comparison result.
10. The method according to claim 1, characterized in that
the step of obtaining pending data comprises:
obtaining a facial image;
extracting characteristic information from the facial image as the pending data.
11. A data processing equipment, characterized in that the data processing equipment comprises a processor and a memory, the memory being used to store program data, and the processor being used to execute the program data to implement the method according to any one of claims 1-10.
12. A computer storage medium, characterized in that the computer storage medium stores program data, and the program data, when executed by a processor, implements the method according to any one of claims 1-10.
CN201910642824.1A 2019-07-16 2019-07-16 A kind of processing method of data, equipment and computer storage medium Pending CN110442606A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910642824.1A CN110442606A (en) 2019-07-16 2019-07-16 A kind of processing method of data, equipment and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910642824.1A CN110442606A (en) 2019-07-16 2019-07-16 A kind of processing method of data, equipment and computer storage medium

Publications (1)

Publication Number Publication Date
CN110442606A true CN110442606A (en) 2019-11-12

Family

ID=68430582

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910642824.1A Pending CN110442606A (en) 2019-07-16 2019-07-16 A kind of processing method of data, equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN110442606A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191112