CN110442606A - Data processing method, device, and computer storage medium - Google Patents
Data processing method, device, and computer storage medium
- Publication number
- CN110442606A (application number CN201910642824.1A)
- Authority
- CN
- China
- Prior art keywords
- data
- pending
- similarity
- pending data
- library
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24552—Database cache management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24553—Query execution of query operations
- G06F16/24554—Unary operations; Data partitioning operations
- G06F16/24556—Aggregation; Duplicate elimination
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
- G06F16/278—Data partitioning, e.g. horizontal or vertical partitioning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Computing Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Image Analysis (AREA)
Abstract
This application discloses a data processing method, a device, and a computer storage medium. The data processing method includes: obtaining pending data; comparing the pending data for similarity with the data in each of multiple storage regions to obtain multiple comparison results, where the multiple storage regions store the data of a frequently used database in shards; obtaining, based on the multiple comparison results, the target data in the frequently used database that has the highest similarity to the pending data; and processing the pending data according to the similarity between the pending data and the target data. In this way, the efficiency of data processing can be improved.
Description
Technical field
This application relates to the technical field of data processing, and in particular to a data processing method, a device, and a computer storage medium.
Background
In big-data processing, as data volume surges and databases grow, comparing one piece of data serially against all data in a database requires many comparisons and is relatively inefficient.
For example, in the collection and comparison of identity information, collection coverage is expanding rapidly and more and more people are captured, so the database created in real time grows accordingly. Every day, all captured subjects must be compared against everyone in the archive to confirm whether each subject is already in the database or needs a newly created archive entry. The number of comparisons is enormous. If every passer-by is compared serially, the time consumed grows exponentially as the database grows and as the number of passers-by per second increases.
Summary of the invention
To solve the above problem, this application provides a data processing method, a device, and a computer storage medium that can improve the efficiency of data processing.
One technical solution adopted by this application is to provide a data processing method. The method includes: obtaining pending data; comparing the pending data in parallel for similarity with the data in each of multiple storage regions to obtain multiple comparison results, where the multiple storage regions store the data of a frequently used database in shards; obtaining, based on the multiple comparison results, the target data in the frequently used database that has the highest similarity to the pending data; and processing the pending data according to the similarity between the pending data and the target data.
The step of processing the pending data according to the similarity between the pending data and the target data includes: judging whether the similarity between the pending data and the target data is less than a set similarity threshold; if so, caching the pending data.
After the step of caching the pending data, the method further includes: performing primary deduplication on the multiple pending data items in the cache; comparing the pending data items that remain after primary deduplication with the data in a global newly created database to perform secondary deduplication; and storing the pending data items that remain after secondary deduplication in the global newly created database.
The step of performing primary deduplication on the multiple pending data items in the cache includes: judging whether the number of pending data items in the cache has reached a set quantity threshold; if so, performing primary deduplication on the pending data items in the cache.
Alternatively, the step of performing primary deduplication on the multiple pending data items in the cache includes: judging whether the storage time of the pending data items in the cache has reached a set time threshold; if so, performing primary deduplication on the pending data items in the cache.
The step of storing the pending data items that remain after secondary deduplication in the global newly created database includes: obtaining the similarity comparison result between each pending data item that remains after primary deduplication and the data in the global newly created database; and, when the similarity between a pending data item and the data in the global newly created database is less than the set similarity threshold, adding that pending data item to the global newly created database.
The method further includes: when the similarity between the pending data and the data in the global newly created database is less than the set similarity threshold, saving the pending data to one of the multiple storage regions.
The method further includes: when the similarity between the pending data and the data in the global newly created database is less than the set similarity threshold, adding the pending data to a persistent database; and, when the global newly created database becomes abnormal, reading the data within a set time period from the persistent database and establishing a new global newly created database.
The step of comparing the pending data in parallel for similarity with the data in each of multiple storage regions to obtain multiple comparison results includes: comparing the pending data in parallel for similarity with the data in each of the multiple storage regions, and obtaining, from each storage region, the data with the highest similarity to the pending data as that region's comparison result.
The step of obtaining pending data includes: obtaining a face image; and extracting feature information from the face image to serve as the pending data.
Another technical solution adopted by this application is to provide a data processing device. The device includes a processor and a memory; the memory stores program data, and the processor executes the program data to implement the method described above.
Another technical solution adopted by this application is to provide a computer storage medium. The computer storage medium stores program data which, when executed by a processor, implements the method described above.
The data processing method provided by this application includes: obtaining pending data; comparing the pending data in parallel for similarity with the data in each of multiple storage regions to obtain multiple comparison results, where the multiple storage regions store the data of a frequently used database in shards; obtaining, based on the multiple comparison results, the target data in the frequently used database that has the highest similarity to the pending data; and processing the pending data according to the similarity between the pending data and the target data. In this way, the data is stored in partitions; when pending data is compared, it is compared simultaneously and in parallel with the data in the multiple storage regions, and the comparison results are then aggregated. This improves the efficiency of data comparison and improves system stability when a large amount of data is compared at the same time.
Brief description of the drawings
To explain the technical solutions in the embodiments of this application more clearly, the drawings needed to describe the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of this application; a person of ordinary skill in the art can derive other drawings from them without creative effort. In the drawings:
Fig. 1 is a flow diagram of the data processing method provided by an embodiment of this application;
Fig. 2 is a framework diagram of the data processing method provided by an embodiment of this application;
Fig. 3 is another flow diagram of the data processing method provided by an embodiment of this application;
Fig. 4 is another framework diagram of the data processing method provided by an embodiment of this application;
Fig. 5 is a structural diagram of the data processing device provided by an embodiment of this application;
Fig. 6 is a structural diagram of the computer storage medium provided by an embodiment of this application.
Detailed description of the embodiments
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments. It should be understood that the specific embodiments described here only explain this application and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to this application rather than the entire structure. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in this application without creative effort fall within the protection scope of this application.
The terms "first", "second", and so on in this application distinguish different objects and do not describe a particular order. In addition, the terms "include" and "have", and any variants of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not limited to the listed steps or units; it optionally also includes steps or units that are not listed, or other steps or units inherent to that process, method, product, or device.
A reference to an "embodiment" in this document means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of this application. Occurrences of the phrase in various places in the description do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive with other embodiments. A person skilled in the art understands, explicitly and implicitly, that the embodiments described in this document can be combined with other embodiments.
Referring to Fig. 1, Fig. 1 is a flow diagram of the data processing method provided by an embodiment of this application. The method includes:
Step 11: Obtain pending data.
The pending data may be data used to identify a person or an object, for example face data or vehicle data.
Optionally, taking face data as an example, step 11 may specifically be: obtain a face image; extract feature information from the face image to serve as the pending data.
Specifically, a camera can be used to obtain the face image. For example, in one embodiment, a camera captures the face image, and image processing is applied to the face image to obtain the feature information in it. In another embodiment, a color camera and an infrared camera are used to obtain a color depth image of the face, and the feature information is then obtained from the color depth image. The feature information may be pupil size, interpupillary distance, the positions of the eyes and other facial organs, color features, and so on.
Step 12: Compare the pending data in parallel for similarity with the data in each of multiple storage regions to obtain multiple comparison results, where the multiple storage regions store the data of the frequently used database in shards.
The multiple storage regions can be implemented with multiple memories, by dividing one memory into multiple storage areas, or with multiple servers.
Specifically, all data of the frequently used database can be sharded across different storage regions. Sharding means splitting the frequently used database and distributing it over different servers. For example, 100 GB of data can be split into 10 parts stored on different servers, so that each server stores only 10 GB. Taking identity information as an example, all identity information is divided into N groups that are stored in N storage regions; during comparison, the pending data is compared in parallel in each of the N storage regions.
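The sharding described above can be sketched as follows. This is a minimal illustration only: the record layout (an identifier plus a feature vector), the function name `shard_records`, and the round-robin assignment rule are assumptions, since the text does not specify how records are assigned to storage regions.

```python
from typing import List, Sequence, Tuple

# Hypothetical record layout: (record id, feature vector).
Record = Tuple[str, Sequence[float]]


def shard_records(records: List[Record], num_shards: int) -> List[List[Record]]:
    """Split the records of a database into num_shards groups.

    Round-robin assignment is an assumption made for illustration; any rule
    that spreads the records evenly over the storage regions would do.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    shards: List[List[Record]] = [[] for _ in range(num_shards)]
    for index, record in enumerate(records):
        shards[index % num_shards].append(record)
    return shards
```

Each returned group corresponds to one storage region, so a comparison against the whole database becomes N smaller comparisons that can run independently.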
Optionally, in one embodiment, step 12 may specifically be: compare the pending data in parallel for similarity with the data in each of the multiple storage regions, and obtain from each storage region the data with the highest similarity to the pending data as that region's comparison result.
The similarity can be computed with a clustering algorithm or a distance algorithm. For example, a Euclidean distance algorithm can be used: the similarity of two pieces of data is obtained by calculating the Euclidean distance between them.
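A distance-based similarity of this kind might look like the sketch below. The text only says that similarity is obtained from the Euclidean distance; the mapping `1 / (1 + distance)` is an assumption chosen so that identical vectors score 1.0 and the score falls as the distance grows.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Map distance to a similarity in (0, 1].

    The 1 / (1 + d) mapping is an illustrative assumption, not a formula
    taken from the source.
    """
    return 1.0 / (1.0 + euclidean_distance(a, b))
```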
The comparison result here may contain no data, one piece of data, or several pieces of data. During comparison, the pending data is compared in parallel with the data of the multiple storage regions; within a single storage region, it is compared with each piece of data in that region in a particular order.
Specifically, the pending data is compared for similarity with the M pieces of data in a storage region one after another, which yields M similarity values. These are then sorted by similarity, and the data corresponding to the largest similarity value is selected as the comparison result.
In other embodiments, several pieces of data with the highest similarity, for example the three most similar pieces, can be selected from the sorted similarities of the M pieces of data as the comparison result.
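Selecting the best one or the best few matches inside a single storage region can be sketched as below. The function name `top_k_in_shard`, the record layout, and the `1 / (1 + distance)` similarity are assumptions for illustration; `heapq.nlargest` is used in place of a full sort, which gives the same top-k result.

```python
import heapq
import math
from typing import List, Sequence, Tuple

Record = Tuple[str, Sequence[float]]  # hypothetical (record id, feature vector)


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # Assumed mapping from Euclidean distance to similarity.
    return 1.0 / (1.0 + math.dist(a, b))


def top_k_in_shard(
    query: Sequence[float], shard: List[Record], k: int = 1
) -> List[Tuple[float, str]]:
    """Return up to k (similarity, record id) pairs, most similar first.

    An empty storage region yields an empty list, which matches the case
    where a comparison result contains no data.
    """
    scored = ((similarity(query, vector), record_id) for record_id, vector in shard)
    return heapq.nlargest(k, scored)
```

With `k=1` this corresponds to the embodiment that keeps only the most similar data per region; with `k=3` it corresponds to the variant that keeps the three most similar.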
Step 13: Based on the multiple comparison results, obtain from the frequently used database the target data with the highest similarity to the pending data.
After the parallel comparison in each storage region, the multiple comparison results are merged. For example, the similarities of all data in the multiple comparison results are sorted, and the data with the highest similarity is selected as the target data.
For example, if comparing the pending data with the data of each of the N storage regions yields one most similar piece of data per region, the N pieces of data obtained from the N storage regions are sorted by similarity, and the one with the highest similarity is taken as the target data. If the comparison yields several most similar pieces of data per region, for example 3, the 3N pieces of data obtained from the N storage regions are sorted by similarity, and the one with the highest similarity is taken as the target data.
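The merge step can be sketched as follows, assuming each storage region reports a list of `(similarity, record id)` pairs (a hypothetical format). Taking the maximum directly is equivalent to sorting all N or 3N candidates and picking the first.

```python
from typing import List, Optional, Tuple

Match = Tuple[float, str]  # hypothetical (similarity, record id)


def merge_results(shard_results: List[List[Match]]) -> Optional[Match]:
    """Merge per-region comparison results and return the best match.

    Returns None when no region produced any candidate.
    """
    candidates = [match for result in shard_results for match in result]
    if not candidates:
        return None
    return max(candidates, key=lambda match: match[0])
```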
Step 14: Process the pending data according to the similarity between the pending data and the target data.
The target data here is the data in the frequently used database with the highest similarity to the pending data. In one case, the similarity between the target data and the pending data is high and meets a preset requirement; the pending data is then considered to be already stored in the frequently used database, and no further processing of the pending data is needed. In the other case, the similarity between the target data and the pending data is low and does not meet the preset requirement; the pending data is then considered not to be stored in the frequently used database, and the subsequent storage operation can be carried out.
An example with face data follows. As shown in Fig. 2, Fig. 2 is a framework diagram of the data processing method provided by an embodiment of this application. In a specific embodiment, a framework can be built from an executor, multiple one-person-one-file operators, and multiple shard REPOs (repositories). The specific steps are as follows:
A. executor1 obtains pending data 1 and sends it to one-person-one-file operator 1, one-person-one-file operator 2, ..., one-person-one-file operator N.
B. Each one-person-one-file operator compares pending data 1 separately. For example, one-person-one-file operator 1 compares pending data 1 one by one with the data stored in shard REPO-1 and obtains comparison result 1. Similarly, one-person-one-file operator N compares pending data 1 one by one with the data stored in shard REPO-N and obtains comparison result N. Each one-person-one-file operator returns its comparison result to executor1.
C. executor1 merges the multiple comparison results and obtains the target data with the highest similarity to pending data 1.
D. executor1 determines the subsequent operation on pending data 1 according to the similarity between pending data 1 and the target data.
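Steps A to C above can be sketched in a single process as follows. This is only an illustration under stated assumptions: the function names, the record layout, and the `1 / (1 + distance)` similarity are hypothetical, and a thread pool stands in for the operators, which in the described framework may well run on separate servers.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional, Sequence, Tuple

Record = Tuple[str, Sequence[float]]  # hypothetical (record id, feature vector)
Match = Tuple[float, str]             # hypothetical (similarity, record id)


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # Assumed mapping from Euclidean distance to similarity.
    return 1.0 / (1.0 + math.dist(a, b))


def compare_with_shard(query: Sequence[float], shard: List[Record]) -> Optional[Match]:
    """Operator role: compare the query one by one with a shard's records."""
    best: Optional[Match] = None
    for record_id, vector in shard:
        score = similarity(query, vector)
        if best is None or score > best[0]:
            best = (score, record_id)
    return best


def find_target(query: Sequence[float], shards: List[List[Record]]) -> Optional[Match]:
    """Executor role: fan the query out to all shards, then merge the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        results = list(pool.map(lambda shard: compare_with_shard(query, shard), shards))
    matches = [match for match in results if match is not None]
    return max(matches, key=lambda match: match[0]) if matches else None
```

In CPython, threads do not speed up pure-Python arithmetic, so this sketch shows the fan-out and merge structure rather than a real performance gain; separate processes or servers would be needed for that.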
Unlike the prior art, the data processing method of this embodiment includes: obtaining pending data; comparing the pending data for similarity with the data in each of multiple storage regions to obtain multiple comparison results, where the multiple storage regions store the data of the frequently used database in shards; obtaining, based on the multiple comparison results, the target data in the frequently used database that has the highest similarity to the pending data; and processing the pending data according to the similarity between the pending data and the target data. In this way, the data is stored in partitions; when pending data is compared, it is compared simultaneously and in parallel with the data in the multiple storage regions, and the comparison results are then aggregated. This improves the efficiency of data comparison and improves system stability when a large amount of data is compared at the same time.
Referring to Fig. 3, Fig. 3 is another flow diagram of the data processing method provided by an embodiment of this application. The method includes:
Step 31: Obtain pending data.
Step 32: Compare the pending data for similarity with the data in each of multiple storage regions to obtain multiple comparison results, where the multiple storage regions store the data of the frequently used database in shards.
Step 33: Based on the multiple comparison results, obtain from the frequently used database the target data with the highest similarity to the pending data.
Step 34: Judge whether the similarity between the pending data and the target data is less than the set similarity threshold.
When the similarity between the pending data and the target data is greater than or equal to the set similarity threshold, the pending data is considered to be already stored in the frequently used database and does not need to be stored again; the pending data can be deleted, and the next pending data is obtained and processed. When the similarity between the pending data and the target data is less than the set similarity threshold, the pending data is considered to be new data that needs to be stored, and the subsequent step 35 is executed.
In the face recognition embodiment, when the similarity between the acquired face feature data and the target data in the frequently used database is greater than or equal to the set similarity threshold, the identity information corresponding to the face features is considered to be already stored in the frequently used database and does not need to be stored again. When the similarity between the acquired face feature data and the target data in the frequently used database is less than the set similarity threshold, the user identity information corresponding to the face features is considered to be new identity information that needs to be stored, and the subsequent step 35 is executed.
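The threshold decision of step 34 and the hand-off to step 35 can be sketched as below. The function name `handle_pending` and the use of `None` for "no target data found" are assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def handle_pending(
    pending: Sequence[float],
    target_similarity: Optional[float],
    threshold: float,
    cache: List[Sequence[float]],
) -> bool:
    """Return True if the pending data was cached as new, False if it was dropped.

    target_similarity is None when no target data was found, for example
    when the frequently used database is still empty (an assumption).
    """
    if target_similarity is None or target_similarity < threshold:
        # Below the threshold: treat as new data and cache it (step 35).
        cache.append(pending)
        return True
    # Greater than or equal to the threshold: already stored, nothing to do.
    return False
```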
Step 35: Cache the pending data.
Step 36: Perform primary deduplication on the multiple pending data items in the cache.
It should be understood that, in face recognition scenarios, if the same user passes a data collection point several times within a short period, the face data is collected several times, which can easily lead to the same face data being stored repeatedly. In this embodiment, new data is therefore not stored in the frequently used database immediately; it is first cached and deduplicated.
Optionally, deduplication can be triggered by the cache time of the pending data or by the number of cached data items.
In one embodiment, it is judged whether the storage time of the pending data items in the cache has reached a set time threshold; if so, primary deduplication is performed on the pending data items in the cache.
For example, timing starts when the first pending data item is stored in the cache. Once a certain time has elapsed, primary deduplication is performed on all pending data items in the cache, regardless of how many items the cache holds.
In another embodiment, it is judged whether the number of pending data items in the cache has reached a set quantity threshold; if so, primary deduplication is performed on the pending data items in the cache.
For example, counting starts when the first pending data item is stored in the cache. Once the number of cached pending data items reaches a certain quantity, primary deduplication is performed on all pending data items in the cache.
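The two triggers can be sketched with a small cache class. The text presents the time trigger and the count trigger as separate embodiments; combining them in one class, as well as the class name `PendingCache` and its parameters, are assumptions made for illustration. The clock is injectable so that the behaviour can be checked without waiting.

```python
import time
from typing import Any, Callable, List, Optional


class PendingCache:
    """Cache for new pending data with a count trigger and a time trigger."""

    def __init__(
        self,
        max_items: int,
        max_age_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_items = max_items
        self._max_age = max_age_seconds
        self._clock = clock
        self._items: List[Any] = []
        self._first_added_at: Optional[float] = None

    def add(self, item: Any) -> None:
        # Timing starts when the first item enters an empty cache.
        if not self._items:
            self._first_added_at = self._clock()
        self._items.append(item)

    def should_deduplicate(self) -> bool:
        if not self._items or self._first_added_at is None:
            return False
        if len(self._items) >= self._max_items:
            return True
        return self._clock() - self._first_added_at >= self._max_age

    def drain(self) -> List[Any]:
        """Hand all cached items over for deduplication and reset the cache."""
        items, self._items = self._items, []
        self._first_added_at = None
        return items
```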
During primary deduplication, each pending data item can be compared for similarity with every other data item in the cache in turn. When the similarity is greater than the set threshold, the pending data item is considered a duplicate of that data item: the same data may have been collected more than once but not yet synchronized to the frequently used database. One of the duplicates can then be deleted.
For example, suppose the data in the cache includes A1, B1, C1, D1, A2, E1, F1, and so on. If the comparison shows that the similarity between data A1 and data A2 is greater than the set threshold, A1 and A2 are considered duplicates and one of them is deleted.
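Primary deduplication inside the cache can be sketched as below. The function name, the record layout, and the `1 / (1 + distance)` similarity are assumptions; the sketch keeps the first item of each duplicate group, whereas the text only says that one of the duplicates is deleted.

```python
import math
from typing import List, Sequence, Tuple

Record = Tuple[str, Sequence[float]]  # hypothetical (record id, feature vector)


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # Assumed mapping from Euclidean distance to similarity.
    return 1.0 / (1.0 + math.dist(a, b))


def deduplicate(items: List[Record], threshold: float) -> List[Record]:
    """Keep an item only if it is not too similar to any item already kept."""
    kept: List[Record] = []
    for record_id, vector in items:
        is_duplicate = any(
            similarity(vector, kept_vector) > threshold for _, kept_vector in kept
        )
        if not is_duplicate:
            kept.append((record_id, vector))
    return kept
```

For a cache such as A1, B1, C1, A2 where A1 and A2 are nearly identical, the result keeps A1, B1, and C1 and drops A2.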
Step 37: Compare the pending data items that remain after primary deduplication with the data in the global newly created database to perform secondary deduplication.
The global newly created database stores the new pending data obtained within a certain period of time.
Optionally, each pending data item that remains after primary deduplication is compared for similarity with the data in the global newly created database, so that the items are deduplicated a second time.
Step 38: Store the pending data items that remain after secondary deduplication in the global newly created database.
For example, pending data 1 is compared for similarity with each piece of data in the global newly created database in turn. If the similarity between any piece of data and the pending data is greater than the set threshold, the pending data is considered to have already been added to the global newly created database. If the similarity between every piece of data and the pending data is less than the set threshold, the pending data can be added to the global newly created database.
Further, when the similarity between the pending data and the data in the global newly created database is less than the set similarity threshold, the pending data is saved to one of the multiple storage regions. That is, when the similarity between the pending data and the data in the global newly created database is less than the set similarity threshold, the pending data is added to the global database.
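Secondary deduplication against the global newly created database can be sketched as follows. The dictionary used as the database, the function name, and the `1 / (1 + distance)` similarity are assumptions. The text does not say how a similarity exactly equal to the threshold is handled; the sketch treats it as a match, in line with step 34.

```python
import math
from typing import Dict, List, Sequence, Tuple

Record = Tuple[str, Sequence[float]]  # hypothetical (record id, feature vector)


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # Assumed mapping from Euclidean distance to similarity.
    return 1.0 / (1.0 + math.dist(a, b))


def secondary_deduplicate(
    candidates: List[Record],
    global_new: Dict[str, Sequence[float]],
    threshold: float,
) -> List[str]:
    """Add each candidate that matches nothing in global_new; return added ids.

    global_new is modified in place, so later candidates are also compared
    with candidates added earlier in the same call.
    """
    added: List[str] = []
    for record_id, vector in candidates:
        is_new = all(
            similarity(vector, existing) < threshold
            for existing in global_new.values()
        )
        if is_new:
            global_new[record_id] = vector
            added.append(record_id)
    return added
```

The returned identifiers are the items that would then be forwarded for storage in one of the storage regions.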
An example with face data follows. As shown in Fig. 4, Fig. 4 is another framework diagram of the data processing method provided by an embodiment of this application. In a specific embodiment, a framework can be built from an executor, multiple one-person-one-file operators, and multiple shard REPOs (repositories). The specific steps are as follows:
A. The executor obtains pending data 1 and sends it to one-person-one-file operator 1, one-person-one-file operator 2, ..., one-person-one-file operator N.
B. Each one-person-one-file operator compares pending data 1 separately. For example, one-person-one-file operator 1 compares pending data 1 one by one with the data stored in shard REPO-1 and obtains comparison result 1. Similarly, one-person-one-file operator N compares pending data 1 one by one with the data stored in shard REPO-N and obtains comparison result N. Each one-person-one-file operator returns its comparison result to the executor.
C. The executor merges the multiple comparison results and obtains the target data with the highest similarity to pending data 1.
D. The executor judges whether the similarity between pending data 1 and the target data is less than the set similarity threshold; if it is, the pending data is cached.
E. Primary deduplication is performed on the multiple cached pending data items.
F. The pending data that remains after primary deduplication is compared with the data in the global newly created archive (the global newly created database) to perform secondary deduplication.
In secondary deduplication, the pending data is compared one by one with the data in the global newly created archive. A similarity greater than the set threshold indicates a match; a similarity less than the set threshold indicates no match.
G. If the pending data matches nothing in the global newly created archive, the pending data is added to the global newly created archive.
H. If the pending data matches nothing in the global newly created archive, the pending data is sent to the executor, and the executor adds it to a one-person-one-file operator (that is, to the frequently used database).
Unlike the prior art, in this embodiment the pending data goes through two rounds of deduplication after comparison, which prevents the same data from being archived repeatedly and improves data comparison performance.
In addition, when the similarity between the pending data and the data in the global newly created database is less than the set similarity threshold, the pending data is added to a persistent database. When the global newly created database becomes abnormal, the data within a set time period is read from the persistent database and a new global newly created database is established.
Specifically, operator A, which loads the global newly created archive, may become abnormal for some reason (for example, the device loses power or goes offline). The system then selects another operator B that is able to load the global newly created archive to take over the work of operator A. For the hand-over to be seamless, operator B needs to know the global newly created archive information that operator A held in memory before the abnormality, so this information must be persisted and reloaded by an operator when an abnormality occurs. In this embodiment, the global newly created archive information is persisted by establishing a persistent database: after the secondary deduplication comparison result is received and the pending data is judged to be new data, the pending data is sent to the persistent database for persistent storage. After operator A becomes abnormal, operator B loads from the persistent database the global newly created archive data that is within its life cycle and seamlessly takes over the work of operator A.
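The persistence and recovery idea can be sketched with SQLite from the Python standard library. The class name `PersistentStore`, the table layout, the JSON encoding of feature vectors, and the use of a timestamp to select the data still within its life cycle are all assumptions; the text does not name a storage engine or a schema.

```python
import json
import sqlite3
from typing import Dict, List


class PersistentStore:
    """Durable copy of the global newly created database (illustrative only)."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS new_records ("
            "record_id TEXT PRIMARY KEY, "
            "vector TEXT NOT NULL, "
            "created_at REAL NOT NULL)"
        )

    def save(self, record_id: str, vector: List[float], created_at: float) -> None:
        """Persist one record that secondary deduplication confirmed as new."""
        self._conn.execute(
            "INSERT OR REPLACE INTO new_records VALUES (?, ?, ?)",
            (record_id, json.dumps(vector), created_at),
        )
        self._conn.commit()

    def load_since(self, earliest: float) -> Dict[str, List[float]]:
        """Rebuild the in-memory database from records created at or after earliest."""
        rows = self._conn.execute(
            "SELECT record_id, vector FROM new_records "
            "WHERE created_at >= ? ORDER BY created_at",
            (earliest,),
        )
        return {record_id: json.loads(vector) for record_id, vector in rows}
```

A replacement operator would call `load_since` with the start of the set time period to rebuild its in-memory copy before resuming comparisons. A file path instead of `":memory:"` is needed for the data to survive a restart.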
It is the structural schematic diagram of data processing equipment provided by the embodiments of the present application refering to Fig. 5, Fig. 5, which sets
Standby 50 include processor 51 and memory 52.
Wherein, which can specifically include multiple sub memories, or including multiple storage regions, be used for fragment
Storing data, in addition, being also stored with program data in the memory 52, the processor 51 is for executing the program data to realize
Following steps:
Obtaining pending data; comparing the pending data for similarity with the data in each of multiple storage regions, to obtain multiple comparison results, wherein the multiple storage regions store the data of the frequently-used database in shards; based on the multiple comparison results, obtaining from the frequently-used database the target data with the highest similarity to the pending data; and processing the pending data accordingly, based on the similarity between the pending data and the target data.
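The sharded comparison in these steps can be sketched as follows. This is an illustration under stated assumptions, not the embodiment's code: the text here does not name a similarity metric, so cosine similarity over feature vectors is assumed, each storage region is modelled as a dictionary from a record id to a vector, and the function names `best_in_shard` and `find_target` are hypothetical. Each shard reports its own best match, and the target data is the best of those per-shard results.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_in_shard(query, shard):
    """Return (similarity, record_id) of the most similar record in one shard."""
    best = (-1.0, None)
    for record_id, vector in shard.items():
        score = cosine_similarity(query, vector)
        if score > best[0]:
            best = (score, record_id)
    return best


def find_target(query, shards):
    """Compare the query against every shard concurrently and merge the results."""
    if not shards:
        return (-1.0, None)
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        per_shard = list(pool.map(lambda shard: best_in_shard(query, shard), shards))
    # The overall target is the best of the per-shard best matches.
    return max(per_shard, key=lambda result: result[0])


shards = [{"a": [1.0, 0.0]}, {"b": [0.0, 1.0], "c": [1.0, 1.0]}]
similarity, target_id = find_target([1.0, 0.1], shards)
```

Threads are used only to keep the sketch short; in the scenario the text describes, each storage region could equally be served by a separate process or machine, with the same merge step at the end.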
Optionally, the processor 51 is further configured to execute: judging whether the similarity between the pending data and the target data is less than a set similarity threshold; and if so, caching the pending data.
Optionally, the processor 51 is further configured to execute: performing primary deduplication on the multiple items of pending data in the cache; comparing the pending data remaining after primary deduplication with the data in the global new-data library, to perform secondary deduplication; and storing the pending data remaining after secondary deduplication in the global new-data library.
Optionally, the processor 51 is further configured to execute: judging whether the number of items of pending data in the cache reaches a set quantity threshold; and if so, performing primary deduplication on the multiple items of pending data in the cache.
Optionally, the processor 51 is further configured to execute: judging whether the storage time of the multiple items of pending data in the cache reaches a set time threshold; and if so, performing primary deduplication on the multiple items of pending data in the cache.
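The caching and the two trigger conditions above (a quantity threshold and a time threshold) can be sketched as below. The class `PendingCache`, its parameters, and the injectable `clock` are hypothetical names introduced for illustration. The greedy pairwise strategy in `first_pass_dedup` is also an assumption: this passage says that primary deduplication is performed on the cached items, but not how.

```python
import time


class PendingCache:
    """Hypothetical cache of pending items, flushed by count or by age."""

    def __init__(self, max_items, max_age_seconds, clock=time.monotonic):
        self.max_items = max_items
        self.max_age_seconds = max_age_seconds
        self._clock = clock
        self._items = []
        self._oldest = None  # time at which the oldest cached item was added

    def add(self, item):
        if not self._items:
            self._oldest = self._clock()
        self._items.append(item)

    def should_flush(self):
        """True when either the quantity threshold or the time threshold is reached."""
        if not self._items:
            return False
        too_many = len(self._items) >= self.max_items
        too_old = self._clock() - self._oldest >= self.max_age_seconds
        return too_many or too_old

    def drain(self):
        """Return all cached items and empty the cache."""
        items, self._items = self._items, []
        return items


def first_pass_dedup(items, similarity, threshold):
    """Keep an item only if it is below the threshold against every item kept so far."""
    kept = []
    for item in items:
        if all(similarity(item, other) < threshold for other in kept):
            kept.append(item)
    return kept


# Toy example: items are numbers and similarity is 1 minus their distance.
now = [0.0]
cache = PendingCache(max_items=3, max_age_seconds=10.0, clock=lambda: now[0])
cache.add(1.0)
cache.add(1.05)
now[0] = 10.0  # the oldest item has now waited for the full time threshold
batch = cache.drain() if cache.should_flush() else []
unique = first_pass_dedup(batch + [3.0], lambda a, b: 1.0 - abs(a - b), 0.9)
```

Batching in this way means the comparison against the global new-data library runs once per batch rather than once per item; the time threshold bounds how long an item can wait when traffic is low.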
Optionally, the processor 51 is further configured to execute: obtaining the similarity comparison result between each item of pending data remaining after primary deduplication and the data in the global new-data library; and when the similarity between an item of pending data and the data in the global new-data library is less than the set similarity threshold, adding that pending data to the global new-data library.
Optionally, the processor 51 is further configured to execute: when the similarity between the pending data and the data in the global new-data library is less than the set similarity threshold, saving the pending data into one of the multiple storage regions.
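A minimal sketch of this secondary deduplication step follows, under stated assumptions. The function name `second_pass_dedup` and the list-based data structures are hypothetical, and choosing the least-loaded storage region is an illustrative policy only: the text says the data is saved into one of the storage regions without saying how that region is chosen.

```python
def second_pass_dedup(candidates, global_new, shards, similarity, threshold):
    """Add to the global new-data library every candidate that matches nothing in it.

    Returns the candidates that were judged to be new.
    """
    added = []
    for item in candidates:
        # New data: similarity to every record already in the library is below the threshold.
        if all(similarity(item, known) < threshold for known in global_new):
            global_new.append(item)
            # Assumed placement policy: save into the storage region holding the fewest records.
            min(shards, key=len).append(item)
            added.append(item)
    return added


# Toy example: items are numbers and similarity is 1 minus their distance.
toy_similarity = lambda a, b: 1.0 - abs(a - b)
global_new = [1.0]
shards = [[0.0, 0.5], [0.2]]
added = second_pass_dedup([1.02, 5.0, 5.01], global_new, shards, toy_similarity, 0.9)
```

Because each accepted item is appended to the library before the next candidate is checked, a later candidate in the same batch that resembles it (5.01 in the example) is rejected as a duplicate.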
Optionally, the processor 51 is further configured to execute: when the similarity between the pending data and the data in the global new-data library is less than the set similarity threshold, adding the pending data to the persistent database; and when the global new-data library becomes abnormal, reading the data within a set time period from the persistent database to establish a new global new-data library.
Optionally, the processor 51 is further configured to execute: comparing the pending data for similarity with the data in each of the multiple storage regions, and obtaining, in each storage region, the data with the highest similarity to the pending data as the comparison result.
Optionally, the processor 51 is further configured to execute: obtaining a facial image; and extracting feature information from the facial image as the pending data.
It can be understood that the data processing equipment 50 in this embodiment may serve as a server in a video surveillance system. The server is connected to multiple front-end camera devices and performs subsequent processing on the facial images collected by those camera devices.
Referring to Fig. 6, Fig. 6 is a schematic structural diagram of the computer storage medium provided by an embodiment of the present application.
The computer storage medium 60 stores program data 61, and the program data 61, when executed by a processor, implements the following steps:
Obtaining pending data; comparing the pending data for similarity with the data in each of multiple storage regions, to obtain multiple comparison results, wherein the multiple storage regions store the data of the frequently-used database in shards; based on the multiple comparison results, obtaining from the frequently-used database the target data with the highest similarity to the pending data; and processing the pending data accordingly, based on the similarity between the pending data and the target data.
In addition, the program data 61, when executed, also implements the following steps: judging whether the similarity between the pending data and the target data is less than a set similarity threshold; if so, caching the pending data; performing primary deduplication on the multiple items of pending data in the cache; comparing the pending data remaining after primary deduplication with the data in the global new-data library, to perform secondary deduplication; and storing the pending data remaining after secondary deduplication in the global new-data library.
In the several embodiments provided in the present application, it should be understood that the disclosed method and equipment may be implemented in other ways. For example, the equipment embodiments described above are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit in the other embodiments above is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above are merely embodiments of the present application and do not limit the patent scope of the present application. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of the present application.
Claims (12)
1. A data processing method, characterized by comprising:
Obtaining pending data;
Comparing the pending data for similarity, in parallel, with the data in each of multiple storage regions, to obtain multiple comparison results, wherein the multiple storage regions store the data of a frequently-used database in shards;
Based on the multiple comparison results, obtaining from the frequently-used database the target data with the highest similarity to the pending data;
Processing the pending data accordingly, based on the similarity between the pending data and the target data.
2. The method according to claim 1, characterized in that
The step of processing the pending data accordingly, based on the similarity between the pending data and the target data, comprises:
Judging whether the similarity between the pending data and the target data is less than a set similarity threshold;
If so, caching the pending data.
3. The method according to claim 2, characterized in that
After the step of caching the pending data, the method further comprises:
Performing primary deduplication on the multiple items of pending data in the cache;
Comparing the pending data remaining after primary deduplication with the data in a global new-data library, to perform secondary deduplication;
Storing the pending data remaining after secondary deduplication in the global new-data library.
4. The method according to claim 3, characterized in that
The step of performing primary deduplication on the multiple items of pending data in the cache comprises:
Judging whether the number of items of pending data in the cache reaches a set quantity threshold;
If so, performing primary deduplication on the multiple items of pending data in the cache.
5. The method according to claim 3, characterized in that
The step of performing primary deduplication on the multiple items of pending data in the cache comprises:
Judging whether the storage time of the multiple items of pending data in the cache reaches a set time threshold;
If so, performing primary deduplication on the multiple items of pending data in the cache.
6. The method according to claim 3, characterized in that
The step of storing the pending data remaining after secondary deduplication in the global new-data library comprises:
Obtaining the similarity comparison result between each item of pending data remaining after primary deduplication and the data in the global new-data library;
When the similarity between the pending data and the data in the global new-data library is less than the set similarity threshold, adding the pending data to the global new-data library.
7. The method according to claim 6, characterized in that
The method further comprises:
When the similarity between the pending data and the data in the global new-data library is less than the set similarity threshold, saving the pending data into one of the multiple storage regions.
8. The method according to claim 6, characterized in that
The method further comprises:
When the similarity between the pending data and the data in the global new-data library is less than the set similarity threshold, adding the pending data to a persistent database;
When the global new-data library becomes abnormal, reading the data within a set time period from the persistent database to establish a new global new-data library.
9. The method according to claim 1, characterized in that
The step of comparing the pending data for similarity, in parallel, with the data in each of multiple storage regions, to obtain multiple comparison results, comprises:
Comparing the pending data for similarity, in parallel, with the data in each of the multiple storage regions, and obtaining, in each storage region, the data with the highest similarity to the pending data as the comparison result.
10. The method according to claim 1, characterized in that
The step of obtaining pending data comprises:
Obtaining a facial image;
Extracting feature information from the facial image as the pending data.
11. A data processing equipment, characterized in that the data processing equipment comprises a processor and a memory, the memory is configured to store program data, and the processor is configured to execute the program data to implement the method according to any one of claims 1-10.
12. A computer storage medium, characterized in that the computer storage medium stores program data, and the program data, when executed by a processor, implements the method according to any one of claims 1-10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910642824.1A CN110442606A (en) | 2019-07-16 | 2019-07-16 | A kind of processing method of data, equipment and computer storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110442606A true CN110442606A (en) | 2019-11-12 |
Family
ID=68430582
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910642824.1A Pending CN110442606A (en) | 2019-07-16 | 2019-07-16 | A kind of processing method of data, equipment and computer storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110442606A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111797085A (en) * | 2020-06-22 | 2020-10-20 | 中国平安财产保险股份有限公司 | Request data processing method and device, computer equipment and storage medium |
CN112905499A (en) * | 2021-02-26 | 2021-06-04 | 四川泽字节网络科技有限责任公司 | Fragmented content similar storage method |
WO2021164171A1 (en) * | 2020-02-17 | 2021-08-26 | 平安科技(深圳)有限公司 | Method and apparatus for processing data in knowledge base, and computer device and storage medium |
CN113806071A (en) * | 2021-08-10 | 2021-12-17 | 中标慧安信息技术股份有限公司 | Data synchronization method and system for edge computing application |
WO2024045721A1 (en) * | 2022-08-30 | 2024-03-07 | 重庆紫光华山智安科技有限公司 | Data deduplication method, apparatus and device, and medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN205451095U (en) * | 2015-12-02 | 2016-08-10 | 深圳市商汤科技有限公司 | A face -identifying device |
US20170039224A1 (en) * | 2000-11-06 | 2017-02-09 | Nant Holdings Ip, Llc | Object Information Derived From Object Images |
CN107729419A (en) * | 2017-09-27 | 2018-02-23 | 惠州Tcl移动通信有限公司 | A kind of intelligence preserves method, mobile terminal and the storage medium of picture and video |
CN108446692A (en) * | 2018-06-08 | 2018-08-24 | 南京擎华信息科技有限公司 | Face comparison method, device and system |
CN109710705A (en) * | 2018-12-04 | 2019-05-03 | 百度在线网络技术(北京)有限公司 | Map point of interest treating method and apparatus |
- 2019-07-16 CN CN201910642824.1A patent/CN110442606A/en active Pending
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021164171A1 (en) * | 2020-02-17 | 2021-08-26 | 平安科技(深圳)有限公司 | Method and apparatus for processing data in knowledge base, and computer device and storage medium |
CN111797085A (en) * | 2020-06-22 | 2020-10-20 | 中国平安财产保险股份有限公司 | Request data processing method and device, computer equipment and storage medium |
CN112905499A (en) * | 2021-02-26 | 2021-06-04 | 四川泽字节网络科技有限责任公司 | Fragmented content similar storage method |
CN112905499B (en) * | 2021-02-26 | 2022-10-04 | 四川泽字节网络科技有限责任公司 | Fragmented content similar storage method |
CN113806071A (en) * | 2021-08-10 | 2021-12-17 | 中标慧安信息技术股份有限公司 | Data synchronization method and system for edge computing application |
WO2024045721A1 (en) * | 2022-08-30 | 2024-03-07 | 重庆紫光华山智安科技有限公司 | Data deduplication method, apparatus and device, and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110442606A (en) | A kind of processing method of data, equipment and computer storage medium | |
JP4990383B2 (en) | Image group expression method, image group search method, apparatus, computer-readable storage medium, and computer system | |
CN110147710A (en) | Processing method, device and the storage medium of face characteristic | |
CN109800329B (en) | Monitoring method and device | |
CN110968719B (en) | Face clustering method and device | |
CN109271545A (en) | A kind of characteristic key method and device, storage medium and computer equipment | |
CN107223242A (en) | Efficient local feature description's symbol filtering | |
CN110135428A (en) | Image segmentation processing method and device | |
CN110245247A (en) | Method, electronic equipment and the computer storage medium of picture searching | |
CN113886632A (en) | Video retrieval matching method based on dynamic programming | |
AU2021240278A1 (en) | Face identification methods and apparatuses | |
CN113642685A (en) | Efficient similarity-based cross-camera target re-identification method | |
CN111708906B (en) | Visiting retrieval method, device and equipment based on face recognition and storage medium | |
CN109598240B (en) | Video object quickly recognition methods and system again | |
CN110209656B (en) | Data processing method and device | |
CN112001280A (en) | Real-time online optimization face recognition system and method | |
CN105589896B (en) | Data digging method and device | |
CN110765221A (en) | Management method and device of space-time trajectory data | |
CN114821447A (en) | Theft monitoring and tracking method and system for physical retail scene | |
CN108427759A (en) | Real time data computational methods for mass data processing | |
CN111563479B (en) | Concurrent person weight removing method, partner analyzing method and device and electronic equipment | |
CN114627403A (en) | Video index determining method, video playing method and computer equipment | |
CN111161397B (en) | Human face three-dimensional reconstruction method and device, electronic equipment and readable storage medium | |
CN108230328A (en) | Obtain the method, apparatus and robot of target object | |
CN108305273B (en) | A kind of method for checking object, device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20191112 |