CN111507424B - Data processing method and device - Google Patents


Info

Publication number
CN111507424B
Authority
CN
China
Prior art keywords
data
archive
class
instruction
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010342165.2A
Other languages
Chinese (zh)
Other versions
CN111507424A (en
Inventor
杨俊�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Yitu Technology Co ltd
Original Assignee
Shanghai Yitu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Yitu Technology Co ltd filed Critical Shanghai Yitu Technology Co ltd
Priority to CN202010342165.2A priority Critical patent/CN111507424B/en
Publication of CN111507424A publication Critical patent/CN111507424A/en
Application granted granted Critical
Publication of CN111507424B publication Critical patent/CN111507424B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to the technical field of artificial intelligence, and in particular to a data processing method and device. A data feature of data to be processed is acquired; the data feature is compared with the central data feature of each archive data class, and the feature similarity between the data feature and each central data feature is determined, wherein the central data feature is the data feature of the central point of the data contained in the corresponding archive data class; if the feature similarity is not less than a preset feature similarity threshold, the data to be processed is archived into the archive data class corresponding to the highest feature similarity; if the feature similarity is less than the preset feature similarity threshold, a new archive data class is created and the data to be processed is archived into the newly created archive data class. In this way, the archiving efficiency of the data to be processed can be improved and the amount of calculation reduced.

Description

Data processing method and device
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a data processing method and apparatus.
Background
At present, security equipment mainly provides monitoring, recording, and search functions. In the prior art, data is searched in one of two ways: the first searches by time and channel, and the second searches by structured attributes or uses one piece of data to search for another. However, when the amount of data is large, the search is relatively inefficient because of its high computational complexity, and because the data is not classified, the search results contain a high proportion of duplicates.
Disclosure of Invention
The embodiment of the application provides a data processing method and device, which can improve the data archiving efficiency and reduce the calculation complexity and the data repeatability.
The specific technical scheme provided by the embodiment of the application is as follows:
a data processing method, comprising:
acquiring data characteristics of data to be processed;
comparing the data feature with the central data feature of each archive data class, and determining the feature similarity between the data feature and each central data feature, wherein the central data feature is the data feature of the central point of the data contained in the corresponding archive data class, and the central point is the data whose feature similarities, obtained by comparing its data feature with the data feature of each of the other data in that archive data class, have the highest average value;
if the feature similarity is not less than a preset feature similarity threshold, archiving the data to be processed into the archive data class corresponding to the highest feature similarity; and if the feature similarity is less than the preset feature similarity threshold, creating a new archive data class and archiving the data to be processed into the newly created archive data class.
Optionally, each archive data class and each newly-built archive data class are respectively associated with one archive number, and each data contained in each archive data class and each data contained in each newly-built archive data class respectively correspond to one data number.
Optionally, the method further comprises:
for any archive data class, if no new data has been archived for longer than a preset time period, comparing the data features of all data contained in that archive data class with one another, determining the data feature with the highest similarity to the other data features, and updating the central data feature of that archive data class according to the determined data feature;
if it is determined that new data is to be archived while the central data feature of that archive data class is being updated, archiving the new data into that archive data class after the central data feature has been updated.
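The idle-time update described above can be sketched as follows. This is only an illustration under stated assumptions: cosine similarity stands in for the unspecified similarity measure, timestamps are plain numbers passed in by the caller, and the names `ArchiveClass`, `archive`, `refresh_center_if_idle`, `updating`, and `pending` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (an assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 0.0 if na == 0.0 or nb == 0.0 else dot / (na * nb)

class ArchiveClass:
    """Hypothetical container for one archive data class."""

    def __init__(self, features, last_archived_at):
        self.features = list(features)
        self.center = self.features[0]
        self.last_archived_at = last_archived_at
        self.updating = False
        self.pending = []  # data that arrived while the center was being updated

    def archive(self, feature, now):
        # New data arriving during an update is held back until the update ends.
        if self.updating:
            self.pending.append((feature, now))
            return
        self.features.append(feature)
        self.last_archived_at = now

    def _most_central(self):
        # The feature with the highest mean similarity to all other features.
        if len(self.features) == 1:
            return self.features[0]
        def mean_sim(i):
            others = [f for j, f in enumerate(self.features) if j != i]
            return sum(cosine_similarity(self.features[i], f) for f in others) / len(others)
        return self.features[max(range(len(self.features)), key=mean_sim)]

    def refresh_center_if_idle(self, now, idle_period):
        # Update only when nothing has been archived for longer than idle_period.
        if now - self.last_archived_at <= idle_period:
            return False
        self.updating = True
        try:
            self.center = self._most_central()
        finally:
            self.updating = False
        held, self.pending = self.pending, []
        for feature, arrived_at in held:
            self.archive(feature, arrived_at)
        return True
```

In a real server the update and the arrival of new data would run concurrently, so the `updating` flag would need proper synchronization; the sketch only shows the ordering the text requires.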
Optionally, the method further comprises:
and if the operation instruction is determined to be received, executing corresponding operation according to the operation instruction.
Optionally, if the operation instruction is a delete instruction, executing a corresponding operation according to the operation instruction, including:
searching for the archive data class associated with the archive number in the delete instruction, and deleting that archive data class, wherein the delete instruction comprises at least the archive number; or,
searching for the archive data class associated with the archive number in the delete instruction, searching for the data associated with the data number in the delete instruction, and deleting that data, wherein the delete instruction comprises at least the archive number and the data number.
Optionally, if the operation instruction is a modification instruction, executing a corresponding operation according to the operation instruction, which specifically includes:
searching for the archive data class associated with the archive number in the modification instruction, and executing the corresponding modification operation according to the modification instruction, wherein the modification instruction comprises at least the archive number; or,
searching for the archive data class associated with the archive number in the modification instruction, searching for the data associated with the data number in the modification instruction, and executing the corresponding modification operation according to the modification instruction, wherein the modification instruction comprises at least the archive number and the data number.
Optionally, if the operation instruction is a search instruction, executing a corresponding operation according to the operation instruction, which specifically includes:
searching for the archive data class associated with the archive number in the search instruction, wherein the search instruction comprises at least the archive number; or,
searching for the archive data class associated with the archive number in the search instruction, and searching for the data associated with the data number in the search instruction, wherein the search instruction comprises at least the archive number and the data number.
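The delete, modification, and search instructions above can be sketched as a small dispatcher. This is a minimal sketch, not the claimed implementation: the nested-dictionary layout, the instruction keys `op`, `archive_number`, `data_number`, and `value`, and the function name `execute` are all hypothetical, and the modification is assumed to be a simple replacement because the text does not say what a modification does.

```python
def execute(archives, instruction):
    """Run one operation instruction against {archive_number: {data_number: data}}."""
    op = instruction["op"]
    archive_no = instruction["archive_number"]   # always required
    data_no = instruction.get("data_number")     # optional: targets a single data item
    records = archives[archive_no]               # raises KeyError if the class is unknown

    if op == "search":
        # Return the whole archive data class, or one data item inside it.
        return records if data_no is None else records[data_no]
    if op == "delete":
        # Delete the whole archive data class, or one data item inside it.
        if data_no is None:
            return archives.pop(archive_no)
        return records.pop(data_no)
    if op == "modify":
        # Assumed semantics: replace the class contents or the single data item.
        if data_no is None:
            archives[archive_no] = instruction["value"]
        else:
            records[data_no] = instruction["value"]
        return instruction["value"]
    raise ValueError(f"unknown operation: {op}")
```

Keying every lookup first by archive number and then by data number mirrors the two forms of each instruction: one that carries only the archive number and one that carries both numbers.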
Optionally, if the archive data class corresponds to a plurality of central data features, comparing the data features with the central data features of each archive data class, and determining feature similarity between the data features and each central data feature, specifically includes:
comparing the data feature with each central data feature of each archive data class, and determining the feature similarity between the data feature and each central data feature of each archive data class, wherein the plurality of central data features are determined by extracting features from different area blocks of each data contained in the corresponding archive data class, determining a plurality of central points based on the data features of the different area blocks, and determining the central data features from those central points; or the plurality of central data features are obtained by extracting features from each data contained in the corresponding archive data class and clustering the resulting data features.
Optionally, the data to be processed is an image.
A data processing apparatus comprising:
the acquisition module is used for acquiring the data characteristics of the data to be processed;
the comparison module is used for comparing the data feature with the central data feature of each archive data class and determining the feature similarity between the data feature and each central data feature, wherein the central data feature is the data feature of the central point of the data contained in the corresponding archive data class, and the central point is the data whose feature similarities, obtained by comparing its data feature with the data feature of each of the other data in that archive data class, have the highest average value;
and the first processing module is used for archiving the data to be processed into the archive data class corresponding to the highest feature similarity if the feature similarity is not less than the preset feature similarity threshold, and for creating a new archive data class and archiving the data to be processed into the newly created archive data class if the feature similarity is less than the preset feature similarity threshold.
Optionally, each archive data class and each newly-built archive data class are respectively associated with one archive number, and each data contained in each archive data class and each data contained in each newly-built archive data class respectively correspond to one data number.
Optionally, the apparatus further comprises:
the updating module is used for, for any archive data class, if no new data has been archived for longer than a preset time period, comparing the data features of all data contained in that archive data class with one another, determining the data feature with the highest similarity to the other data features, and updating the central data feature of that archive data class according to the determined data feature;
and the second processing module is used for, if it is determined that new data is to be archived while the central data feature of that archive data class is being updated, archiving the new data into that archive data class after the central data feature has been updated.
Optionally, the apparatus further comprises:
and the third processing module is used for executing corresponding operation according to the operation instruction if the operation instruction is determined to be received.
Optionally, if the operation instruction is a delete instruction, the third processing module is specifically configured to:
searching for the archive data class associated with the archive number in the delete instruction, and deleting that archive data class, wherein the delete instruction comprises at least the archive number; or,
searching for the archive data class associated with the archive number in the delete instruction, searching for the data associated with the data number in the delete instruction, and deleting that data, wherein the delete instruction comprises at least the archive number and the data number.
Optionally, if the operation instruction is a modification instruction, the third processing module is specifically configured to:
searching for the archive data class associated with the archive number in the modification instruction, and executing the corresponding modification operation according to the modification instruction, wherein the modification instruction comprises at least the archive number; or,
searching for the archive data class associated with the archive number in the modification instruction, searching for the data associated with the data number in the modification instruction, and executing the corresponding modification operation according to the modification instruction, wherein the modification instruction comprises at least the archive number and the data number.
Optionally, if the operation instruction is a search instruction, the third processing module is specifically configured to:
searching for the archive data class associated with the archive number in the search instruction, wherein the search instruction comprises at least the archive number; or,
searching for the archive data class associated with the archive number in the search instruction, and searching for the data associated with the data number in the search instruction, wherein the search instruction comprises at least the archive number and the data number.
Optionally, if the archive data class corresponds to a plurality of central data features, the comparison module is specifically configured to:
comparing the data feature with each central data feature of each archive data class, and determining the feature similarity between the data feature and each central data feature of each archive data class, wherein the plurality of central data features are determined by extracting features from different area blocks of each data contained in the corresponding archive data class, determining a plurality of central points based on the data features of the different area blocks, and determining the central data features from those central points; or the plurality of central data features are obtained by extracting features from each data contained in the corresponding archive data class and clustering the resulting data features.
Optionally, the data to be processed is an image.
An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the data processing method described above when the program is executed.
A computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the data processing method described above.
In the embodiment of the application, the data feature of the data to be processed is acquired; the data feature is compared with the central data feature of each archive data class, and the feature similarity between the data feature and each central data feature is determined, wherein the central data feature is the data feature of the central point of the data contained in the corresponding archive data class. If the feature similarity is not less than the preset feature similarity threshold, the data to be processed is archived into the archive data class corresponding to the highest feature similarity; if the feature similarity is less than the preset feature similarity threshold, a new archive data class is created and the data to be processed is archived into the newly created archive data class. Because the feature similarity is determined by comparing the data feature of the data to be processed only with the central data feature of each archive data class, and the data to be processed is clustered and archived according to that similarity, the archiving efficiency of the data to be processed is improved and the computational complexity and data repetition are reduced.
Drawings
FIG. 1 is a flow chart of a data processing method according to an embodiment of the application;
FIG. 2 is a flowchart of another data processing method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
At present, security equipment mainly provides monitoring, recording, and search functions. In the prior art, data is generally searched in one of two ways: the first searches by time and channel, and the second searches by structured attributes or uses one piece of data to search for another. For example, when the data is an image, images can be found by searching with another image. However, in the prior art, when the amount of data is large, image search is inefficient, and because the data is not classified, the search results contain a high proportion of duplicates.
In the embodiment of the application, the data feature of the data to be processed is acquired; the data feature is compared with the central data feature of each archive data class, and the feature similarity between the data feature and each central data feature is determined. If the feature similarity is not less than the preset feature similarity threshold, the data to be processed is archived into the archive data class corresponding to the highest feature similarity; if the feature similarity is less than the preset feature similarity threshold, a new archive data class is created and the data to be processed is archived into the newly created archive data class. By comparing the data feature of the data to be processed with the central data features of the archive data classes and clustering the data to be processed into the corresponding archive data class, data is archived by clustering, so that when data is later retrieved, retrieval efficiency is improved and the amount of calculation and data repetition are reduced.
Based on the above embodiments, FIG. 1 is a flowchart of a data processing method in an embodiment of the present application. The method is mainly applied to a server and specifically includes:
Step 100: Acquire the data feature of the data to be processed.
In the embodiment of the present application, the data to be processed may be an image or speech, which is not limited in the embodiment of the present application.
If the data to be processed is an image, step 100 specifically includes:
S1: Obtain an image by a camera.
In the embodiment of the application, when the data to be processed is an image, the image may be captured directly by the camera or obtained through another transmission mode, which is not limited in the embodiment of the application.
S2: Perform feature extraction on the image to obtain the data feature of the image.
In the embodiment of the application, feature extraction is performed on the image to obtain its data feature. For example, if the data to be processed is an image, feature extraction yields the face data feature in the image to be processed. Features may be extracted from different parts, such as the eyes and nose, to obtain the overall data feature of the image; alternatively, face recognition may be performed directly on the image to be processed to extract the face data feature. The type and number of data features are not limited in the embodiment of the application.
Step 110: Compare the data feature with the central data feature of each archive data class, and determine the feature similarity between the data feature and each central data feature.
The central data feature is the data feature of the central point of the data contained in the corresponding archive data class, and the central point is the data with the highest average feature similarity when its data feature is compared with the data feature of each of the other data in that archive data class.
In the embodiment of the application, the data feature of each data in the archive data class is compared with the data features of the other data in the class to obtain a plurality of feature similarities; these are averaged to give the average feature similarity of each data, and the data with the highest average feature similarity is selected as the central point of the archive data class. For example, when the data to be processed is an image, the central data feature of the archive data class of user A is the face central data feature of user A.
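The central point selection described above can be sketched as follows. The sketch assumes cosine similarity as the similarity measure, which the text does not specify, and the function names `cosine_similarity` and `find_center_index` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (an assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 0.0 if na == 0.0 or nb == 0.0 else dot / (na * nb)

def find_center_index(features):
    """Index of the data whose mean similarity to all other data is highest."""
    if not features:
        raise ValueError("archive data class is empty")
    if len(features) == 1:
        return 0  # a single data item is trivially the central point
    best_index, best_mean = 0, float("-inf")
    for i, fi in enumerate(features):
        # Compare this data feature with every other data feature in the class.
        sims = [cosine_similarity(fi, fj) for j, fj in enumerate(features) if j != i]
        mean = sum(sims) / len(sims)
        if mean > best_mean:
            best_index, best_mean = i, mean
    return best_index
```

The selection compares every pair of data in the class, so its cost grows quadratically with the class size; the incoming data is later compared only with the selected central data feature rather than with every member.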
In the embodiment of the present application, the feature similarity between the data feature and each central data feature is determined by comparing the data feature with the central data feature of each archive data class. The number of central data features is not limited and may be one or more, so step 110 can be implemented in the following two cases:
First case: there is one central data feature.
In the embodiment of the application, when there is one central data feature, feature extraction is performed on the data to be processed and, once its data feature is obtained, the extracted data feature is compared with the central data feature of each archive data class. Taking an image as the data to be processed: after the image is obtained, feature extraction is performed on it to obtain the face data feature; the face data feature is then compared with the face central data feature of the archive data class with archive number A to determine their feature similarity, with the face central data feature of the archive data class with archive number B to determine their feature similarity, and with the face central data feature of the archive data class with archive number C to determine their feature similarity.
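The single-center comparison can be illustrated with a short sketch. Cosine similarity is an assumption, the two-dimensional vectors are placeholder values rather than real face features, and `similarity_to_centers` is a hypothetical name.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (an assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 0.0 if na == 0.0 or nb == 0.0 else dot / (na * nb)

def similarity_to_centers(feature, centers):
    """Map each archive number to the similarity with that class's central data feature."""
    return {archive_no: cosine_similarity(feature, center)
            for archive_no, center in centers.items()}

# Placeholder central data features for archive numbers A, B, and C.
centers = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [-1.0, 0.0]}
scores = similarity_to_centers([1.0, 0.0], centers)
```

Only one comparison per archive data class is needed, regardless of how many data the class contains.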
Second case: there are a plurality of central data features.
The method specifically comprises the following steps: comparing the data feature with each central data feature of each archive data class, and determining the feature similarity between the data feature and each central data feature of each archive data class.
If an archive data class corresponds to a plurality of central data features, the embodiment of the present application provides two possible ways of determining them: 1) The plurality of central data features are determined by extracting features from different area blocks of each data contained in the corresponding archive data class, determining a plurality of central points based on the data features of the different area blocks, and determining the central data features from those central points. For example, when the data to be processed is an image, the area blocks may be, without limitation, "eyes", "mouth", "nose", and the like. Feature extraction is performed on the different area blocks of each image in the archive data class, and a central point is determined for each area block, that is, one central point from comparing the "eyes" data features, one from comparing the "mouth" data features, and so on; the corresponding central data features are then determined from the "eyes" and "mouth" central points respectively.
In this embodiment, after the data features of the data to be processed are obtained, each data feature is compared with the corresponding central data feature of each archive data class, and the feature similarity between them is determined. For example, assume that the central data features of archive data class A are "eye" and "nose" and the central data features of archive data class B are "eye" and "mouth". Feature extraction is performed on the data to be processed to obtain the data features "eye", "nose", and "mouth". The data feature "eye" of the data to be processed is then compared with the central data feature "eye" of archive data class A, the data feature "nose" with the central data feature "nose" of archive data class A, the data feature "eye" with the central data feature "eye" of archive data class B, and the data feature "mouth" with the central data feature "mouth" of archive data class B.
2) The plurality of central data features are obtained by extracting features from each data contained in the corresponding archive data class and clustering the resulting data features. For example, the archive data class with archive number A includes image 1, image 2, image 3, and image 4. Feature extraction is performed on each image to obtain data feature A1 of image 1, data feature A2 of image 2, data feature A3 of image 3, and data feature A4 of image 4, and these data features are then clustered. If, for example, data feature A2 has the highest average feature similarity with data features A1 and A3 but a lower feature similarity with data feature A4, the data features can be clustered into two classes, and the two central data features of the archive data class with archive number A are data feature A2 and data feature A4.
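The clustering route to several central data features can be sketched as below. The text does not name a clustering algorithm, so this sketch swaps in a simple greedy grouping by a similarity threshold followed by choosing the most central member of each group; cosine similarity, the `group_threshold` parameter, and the function names are all assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (an assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 0.0 if na == 0.0 or nb == 0.0 else dot / (na * nb)

def most_central(features, members):
    """Member index with the highest mean similarity to the other members."""
    if len(members) == 1:
        return members[0]
    def mean_sim(i):
        others = [j for j in members if j != i]
        return sum(cosine_similarity(features[i], features[j]) for j in others) / len(others)
    return max(members, key=mean_sim)

def multi_center_indices(features, group_threshold):
    """Greedily group similar features, then return one central index per group."""
    groups = []
    for i, f in enumerate(features):
        for group in groups:
            # Join the first group whose first member is similar enough.
            if cosine_similarity(f, features[group[0]]) >= group_threshold:
                group.append(i)
                break
        else:
            groups.append([i])  # no similar group found: start a new one
    return [most_central(features, group) for group in groups]

# Placeholder features for images 1 to 4 (A1, A2, A3 alike; A4 different).
features_a = [[1.0, 0.2], [1.0, 0.0], [1.0, -0.2], [0.0, 1.0]]
center_indices = multi_center_indices(features_a, group_threshold=0.9)
```

With these placeholder values the result is indices 1 and 3, that is, data feature A2 and data feature A4, matching the example in the text.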
In the embodiment of the application, although every data contained in an archive data class belongs to the corresponding user, the data in the class may differ considerably from one another, so a plurality of central data features may be set. For example, the number of central data features of an archive data class may be set to 2 to 6, or to another value according to actual needs. Setting a plurality of central data features for each archive data class improves the accuracy of data cluster archiving, and the result becomes more accurate as the number of central data features increases. The number of central data features is not limited in the embodiment of the application.
For example, assume that the archive data class with archive number A has 2 central data features, namely central data feature A1 and central data feature A2, and that the archive data class with archive number B has 3 central data features, namely central data feature B1, central data feature B2, and central data feature B3. After the data feature of the data to be processed is acquired, it is compared with each central data feature of the archive data class with archive number A, that is, with central data feature A1 and with central data feature A2, to determine the feature similarity in each case. It is likewise compared with each central data feature of the archive data class with archive number B, that is, with central data feature B1, central data feature B2, and central data feature B3, to determine the feature similarity in each case.
The number of the central data features of each archive data class can be the same or different, and can be set according to actual requirements.
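The comparison described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the source does not name a similarity metric, so cosine similarity is assumed here, and the function names and the 2-D feature vectors are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (an assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def compare_with_centers(feature, archives):
    """Return {archive_number: [similarity to each central data feature]}."""
    return {
        number: [cosine_similarity(feature, center) for center in centers]
        for number, centers in archives.items()
    }

# Hypothetical features: archive A has 2 central features, archive B has 3.
archives = {
    "A": [[1.0, 0.0], [0.8, 0.6]],
    "B": [[0.0, 1.0], [0.6, 0.8], [-1.0, 0.0]],
}
similarities = compare_with_centers([1.0, 0.0], archives)
print(similarities)
```

The result keeps one similarity per central data feature, so archive data classes with different numbers of central data features can be handled uniformly.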
Step 120: If the feature similarity is not smaller than the preset feature similarity threshold, archive the data to be processed into the archive data class corresponding to the highest feature similarity; if the feature similarity is smaller than the preset feature similarity threshold, create a new archive data class and archive the data to be processed into the newly created archive data class.
Step 120 is executed differently depending on the number of central data features of the archive data class, and may specifically be divided into the following two embodiments:
First embodiment: the archive data class corresponds to one central data feature. In the embodiment of the present application, when there is one central data feature, after the feature similarity between the data feature of the data to be processed and the central data feature of each archive data class is determined, it is further determined whether the feature similarity is not smaller than the preset feature similarity threshold. Step 120 is then executed in one of the following two cases.
First case: the feature similarity is not smaller than a preset feature similarity threshold.
Specifically: if the feature similarity is not smaller than the preset feature similarity threshold, the data to be processed is archived into the archive data class corresponding to the highest feature similarity.
In the embodiment of the application, if it is determined that the feature similarity is not smaller than the preset feature similarity threshold, the data to be processed is archived into the archive data class whose central data feature has the highest feature similarity with the data feature. For example, assume the feature similarity threshold is 80%. An image of user X is captured by a camera, and feature extraction is performed on the image to obtain the face data feature of user X. Assume that the first feature similarity, between the face data feature and the face central data feature of the archive data class with archive number A, is 90%; that the second feature similarity, between the face data feature and the face central data feature of the archive data class with archive number B, is 85%; and that the third feature similarity, between the face data feature and the face central data feature of the archive data class with archive number C, is 60%. The first feature similarity is greater than the second, the second is greater than the third, and both the first and the second are greater than the preset feature similarity threshold. Because the first feature similarity is the highest, the image is cluster-archived into the archive data class with archive number A.
Further, after the data to be processed is clustered and archived to the corresponding archive data class, a data number is associated with the data to be processed, so that in the process of searching the data, the position of the data can be accurately positioned through the archive number corresponding to the archive data class and the data number corresponding to the data.
In the embodiment of the application, each archive data class, including each newly created archive data class, is associated with one archive number, and each data item contained in an archive data class corresponds to one data number. Each archive data class contains the data of one user. For example, when the data is an image, the archive data class with archive number A contains the images of user A, and the archive data class with archive number B contains the images of user B.
For example, the archive data class is associated with an archive number a, the first data in the archive data class with the archive number a has a data number 1, the second data has a data number 2, and the third data has a data number 3, so that when an operation instruction is received, the corresponding archive data class or data in the archive data class can be searched in time according to the archive number or the data number, and the searching accuracy can be improved.
Second case: the feature similarity is smaller than a preset feature similarity threshold.
The method specifically comprises the following steps:
S1: If the feature similarity is smaller than the feature similarity threshold, create a new archive data class.
In the embodiment of the application, the data feature of the data to be processed is compared with the central data feature of each archive data class to determine the feature similarity between them. If the feature similarity between the data feature and the central data feature of every archive data class is smaller than the preset feature similarity threshold, a new archive data class is created. For example, assume that the first feature similarity, between the data feature and the central data feature of archive data class A, is 30%; that the second feature similarity, between the data feature and the central data feature of archive data class B, is 20%; and that the third feature similarity, between the data feature and the central data feature of archive data class C, is 25%. Because the first, second, and third feature similarities are all smaller than the preset feature similarity threshold, a new archive data class is created.
S2: and archiving the data to be processed into the newly built archive data class.
After the archive data class is created, the data to be processed is archived into and stored in the newly created archive data class, an archive number is associated with the newly created archive data class, and a data number is associated with the data to be processed stored in it. For example, archive number C is associated with the newly created archive data class, and data number 1 is associated with the data to be processed stored in the archive data class with archive number C, which is not limited in the embodiment of the present application.
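The two cases of the first embodiment can be sketched as a single function. This is a minimal illustration, not the claimed implementation: the function name, the nested-dictionary storage layout, the sequential data-numbering scheme, the new archive number "D", and the reuse of the 80% threshold for the second case are all assumptions made for the example.

```python
def archive_single_center(similarities, threshold, archives, data, new_number):
    """Archive `data` into the best-matching class, or into a new class.

    similarities: {archive_number: similarity to that class's one central feature}
    archives:     {archive_number: {data_number: data}}
    """
    best = max(similarities, key=similarities.get) if similarities else None
    if best is not None and similarities[best] >= threshold:
        target = best                      # first case: archive into best match
    else:
        target = new_number                # second case: create a new class
        archives[target] = {}
    # Assumed numbering scheme: next integer after the largest existing number.
    data_number = max(archives[target], default=0) + 1
    archives[target][data_number] = data
    return target, data_number

archives = {"A": {1: "a1"}, "B": {1: "b1"}, "C": {1: "c1"}}
hit = archive_single_center({"A": 0.90, "B": 0.85, "C": 0.60}, 0.80, archives, "x", "D")
miss = archive_single_center({"A": 0.30, "B": 0.20, "C": 0.25}, 0.80, archives, "y", "D")
print(hit, miss)  # ('A', 2) ('D', 1)
```

The returned pair of archive number and data number is what later operations use to locate the data.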
Second embodiment: the archive data class corresponds to a plurality of central data features. In the embodiment of the present application, when there are a plurality of central data features, after the feature similarity between the data feature of the data to be processed and each central data feature of each archive data class is determined, it is further determined whether each of these feature similarities is not smaller than the preset feature similarity threshold. Step 120 is then executed in one of the following two cases.
First case: the feature similarity is not smaller than a preset feature similarity threshold.
In the embodiment of the application, after determining the feature similarity between the data features of the data to be processed and the central data features of each archive data class, if determining that the feature similarity is not smaller than the preset feature similarity threshold, archiving the data to be processed, wherein the method specifically comprises the following two different modes.
The first way: archive the data to be processed into the archive data class corresponding to the highest feature similarity.
In the embodiment of the application, if the feature similarity is not smaller than the preset feature similarity threshold, the data to be processed is archived into the archive data class corresponding to the highest feature similarity. For example, feature extraction is performed on an image to obtain face data feature X. Assume that the face central data features of the archive data class with archive number A are A1 and A2, and that the face central data features of the archive data class with archive number B are B1 and B2. Face data feature X is compared with face central data feature A1, and the feature similarity XA1 between them is determined to be 90%. Face data feature X is compared with face central data feature A2, and the feature similarity XA2 between them is determined to be 70%. Face data feature X is compared with face central data feature B1, and the feature similarity XB1 between them is determined to be 80%. Face data feature X is compared with face central data feature B2, and the feature similarity XB2 between them is determined to be 60%. If the feature similarity threshold is 75%, feature similarities XA1 and XB1 are greater than the threshold, and the image is archived into the archive data class corresponding to the highest feature similarity, XA1, that is, into the archive data class with archive number A.
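The first way can be sketched as follows, using the similarity values from the example above. The function name and the dictionary layout are illustrative assumptions.

```python
def archive_by_highest_similarity(similarities, threshold):
    """Return the archive number holding the single highest similarity.

    similarities: {archive_number: [similarity to each central data feature]}
    Returns None when even the highest similarity is below the threshold.
    """
    number, score = max(
        ((n, max(scores)) for n, scores in similarities.items()),
        key=lambda pair: pair[1],
    )
    return number if score >= threshold else None

similarities = {"A": [0.90, 0.70], "B": [0.80, 0.60]}
chosen = archive_by_highest_similarity(similarities, 0.75)
print(chosen)  # A
```

A return value of `None` corresponds to the second case, in which a new archive data class would be created.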
The second way: archive the data to be processed into the archive data class corresponding to the highest average feature similarity.
In the embodiment of the application, if the feature similarity is not smaller than the preset feature similarity threshold, the weighted average of the feature similarities that are not smaller than the preset feature similarity threshold is calculated separately for each archive data class, and the data to be processed is archived into the archive data class corresponding to the highest average feature similarity.
Taking the data to be processed being an image as an example, feature extraction is performed on the image to obtain face data feature X. Assume that the face central data features of the archive data class with archive number A are A1 and A2, and that the face central data features of the archive data class with archive number B are B1 and B2. The feature similarity XA1 between face data feature X and face central data feature A1 is determined to be 90%, and the feature similarity XA2 between face data feature X and face central data feature A2 is determined to be 70%, so the average feature similarity between face data feature X and the face central data features of the archive data class with archive number A is 80%. The feature similarity XB1 between face data feature X and face central data feature B1 is determined to be 80%, the feature similarity XB2 between face data feature X and face central data feature B2 is determined, and the average feature similarity for the archive data class with archive number B is calculated in the same way. The image is then archived into the archive data class with the highest average feature similarity.
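A minimal sketch of the second way follows. It implements the description in which only feature similarities not smaller than the threshold enter the average. The source speaks of a weighted average without giving weights, so equal weights are assumed here; the function name, the 75% threshold, and the sample values are likewise illustrative.

```python
def archive_by_average_above_threshold(similarities, threshold):
    """Average, per class, the similarities that reach the threshold.

    similarities: {archive_number: [similarity to each central data feature]}
    Returns (best archive number or None, {archive_number: average}).
    Equal weights are an assumption; the source does not specify weights.
    """
    averages = {}
    for number, scores in similarities.items():
        kept = [s for s in scores if s >= threshold]
        if kept:
            averages[number] = sum(kept) / len(kept)
    if not averages:
        return None, averages
    return max(averages, key=averages.get), averages

similarities = {"A": [0.90, 0.70], "B": [0.80, 0.60]}
chosen, averages = archive_by_average_above_threshold(similarities, 0.75)
print(chosen, averages)
```

With these sample values only 0.90 (class A) and 0.80 (class B) reach the threshold, so class A is chosen.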
The third way: the embodiment of the present application further provides another possible implementation, which specifically includes:
after the feature similarity between the data feature of the data to be processed and each central data feature of each archive data class is determined, calculating the weighted average of the feature similarities corresponding to each archive data class to obtain the average feature similarity of each archive data class, and, if the average feature similarity is not smaller than the preset feature similarity threshold, archiving the data to be processed into the archive data class corresponding to the highest average feature similarity.
This implementation is described in detail below, taking the data to be processed being an image and a feature similarity threshold of 80% as an example. Feature extraction is performed on the image to obtain face data feature X. Assume that the face central data features of the archive data class with archive number A are A1 and A2, and that the face central data features of the archive data class with archive number B are B1 and B2. The feature similarity XA1 between face data feature X and face central data feature A1 is determined to be 80%, and the feature similarity XA2 between face data feature X and face central data feature A2 is determined to be 90%, so the average feature similarity between face data feature X and the face central data features of the archive data class with archive number A is 85%. The feature similarity XB1 between face data feature X and face central data feature B1 is determined to be 70%, and the feature similarity XB2 between face data feature X and face central data feature B2 is determined to be 80%, so the average feature similarity for the archive data class with archive number B is 75%. Because the average feature similarity for archive number A is the highest and is not smaller than the feature similarity threshold of 80%, the image is archived into the archive data class with archive number A.
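The third way differs from the second in that all feature similarities of a class are averaged first and the threshold is applied to the average. A minimal sketch, again assuming equal weights and using a hypothetical function name:

```python
def archive_by_average(similarities, threshold):
    """Average all similarities per class, then apply the threshold.

    similarities: {archive_number: [similarity to each central data feature]}
    Returns (best archive number or None, {archive_number: average}).
    Equal weights are an assumption; the source does not specify weights.
    """
    averages = {n: sum(s) / len(s) for n, s in similarities.items()}
    best = max(averages, key=averages.get)
    return (best if averages[best] >= threshold else None), averages

similarities = {"A": [0.80, 0.90], "B": [0.70, 0.80]}
chosen, averages = archive_by_average(similarities, 0.80)
print(chosen)  # A
```

Here the averages are about 0.85 for class A and 0.75 for class B, so only class A reaches the 80% threshold.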
Second case: the feature similarity is smaller than a preset feature similarity threshold.
Specifically: if the feature similarity is smaller than the feature similarity threshold, a new archive data class is created, and the data to be processed is archived into the newly created archive data class.
Further, in the embodiment of the present application, for any archive data class, if it is determined that no new data has been archived for longer than a preset duration, the archive data class can automatically update its central data feature, which specifically includes:
S1: If no new data has been archived into the archive data class for longer than the preset duration, compare the data features of the data contained in the archive data class with each other in pairs, determine the data feature with the highest similarity to the other data features, and update the central data feature of the archive data class according to the determined data feature.
In the embodiment of the application, for any archive data class, if no new data has been archived for longer than the preset duration, the data features of all data contained in the archive data class are compared in pairs, the data feature with the highest similarity to the other data features is determined, and the central data feature of the archive data class is updated according to the determined data feature. This realizes iterative adjustment of the central data feature of each archive data class; that is, the central data feature of each archive data class is selected from the existing data features rather than calculated. Take the data feature being a face data feature and the central data feature being a face central data feature as an example, and assume there is one face central data feature. The face data feature of the image with data number 1 is compared with that of the image with data number 2, and the first feature similarity between them is determined to be 70%. The face data feature of the image with data number 1 is compared with that of the image with data number 3, and the second feature similarity between them is determined to be 80%. The face data feature of the image with data number 2 is compared with that of the image with data number 3, and the third feature similarity between them is determined to be 75%. The average feature similarity of the face data feature of the image with data number 1 is then (70% + 80%) / 2 = 75%, that of the image with data number 2 is (70% + 75%) / 2 = 72.5%, and that of the image with data number 3 is (80% + 75%) / 2 = 77.5%. Because the average feature similarity of the image with data number 3 is the highest, the face data feature of the image with data number 3 is used as the face central data feature of the archive data class, which is not limited in the embodiment of the present application.
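The selection of a single central data feature can be sketched as follows, using the three pairwise similarities from the example above. The function name and the pair-keyed dictionary are illustrative assumptions.

```python
def select_center(pair_similarity):
    """Pick the data number whose average similarity to the others is highest.

    pair_similarity: {(i, j): similarity} for every unordered pair of data numbers.
    Returns (data number of the new center, {data_number: average similarity}).
    """
    totals, counts = {}, {}
    for (i, j), s in pair_similarity.items():
        for k in (i, j):
            totals[k] = totals.get(k, 0.0) + s
            counts[k] = counts.get(k, 0) + 1
    averages = {k: totals[k] / counts[k] for k in totals}
    return max(averages, key=averages.get), averages

pairs = {(1, 2): 0.70, (1, 3): 0.80, (2, 3): 0.75}
center, averages = select_center(pairs)
print(center)  # 3
```

The center is an existing data feature of the class (a selection), which matches the statement that the central data feature is selected rather than calculated.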
Further, after the central data characteristics of the archive data class are obtained, the characteristics of each data in the archive data class are automatically extracted again, the data characteristics of each data in the archive data class are obtained, the central data characteristics of the archive data class are continuously updated according to the determined data characteristics of each data, and the determined central data characteristics are ensured to be the best data characteristics capable of representing the current archive data class.
When data is acquired, differences between acquisition scenes may cause large deviations among the acquired data. In this case, the feature similarity between the data and the central data feature of every archive data class may be smaller than the preset feature similarity threshold, so that a new archive data class is created for data that should have been archived into an existing archive data class, which reduces the accuracy of data cluster archiving. Setting a plurality of central data features can therefore improve the accuracy of data cluster archiving.
If there are a plurality of central data features, the data features of the data contained in the archive data class can be compared in pairs to determine the feature similarity between the data, the average feature similarity of each data item is obtained, and the several data features with the highest average feature similarity are selected as the central data features of the archive data class.
For example, the data feature of the image with data number 1 is compared with that of the image with data number 2 to determine the first feature similarity between them, the data feature of the image with data number 1 is compared with that of the image with data number 3 to determine the second feature similarity between them, and the data feature of the image with data number 2 is compared with that of the image with data number 3 to determine the third feature similarity between them. The data with the highest average feature similarity and the data with the second highest average feature similarity are then selected, and their data features are used as the central data features of the archive data class.
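Selecting several central data features can be sketched by ranking the data by average pairwise similarity and keeping the top entries. The function name and the similarity values are illustrative assumptions; the example above does not give numeric values.

```python
def select_centers(pair_similarity, k):
    """Return the k data numbers with the highest average pairwise similarity.

    pair_similarity: {(i, j): similarity} for every unordered pair of data numbers.
    """
    totals, counts = {}, {}
    for (i, j), s in pair_similarity.items():
        for n in (i, j):
            totals[n] = totals.get(n, 0.0) + s
            counts[n] = counts.get(n, 0) + 1
    averages = {n: totals[n] / counts[n] for n in totals}
    ranked = sorted(averages, key=averages.get, reverse=True)
    return ranked[:k]

# Illustrative values only.
pairs = {(1, 2): 0.70, (1, 3): 0.80, (2, 3): 0.75}
centers = select_centers(pairs, k=2)
print(centers)  # [3, 1]
```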
Further, the data in the archive data class can be classified according to different scenes, the feature similarity between the data in each class is calculated, the data with the highest average value of the feature similarity is taken as the central data feature of the class, and the central data feature set of each class is taken as a plurality of central data features of the archive data class.
The central data feature of an archive data class may be updated during periods of low load, and the preset duration may be set according to actual needs, for example to 5 seconds, which is not limited in the embodiment of the present application.
S2: If it is determined that new data is to be archived while the central data feature of the archive data class is being updated, archive the new data into the archive data class after the update of the central data feature is completed.
In the embodiment of the application, if new data is received while the central data feature of an archive data class is being updated, and that archive data class is determined to be the one to which the new data belongs, the current update of the central data feature is completed first, further updating is suspended, and the new data is then archived into the archive data class. Timing restarts when the new data is archived, and if it is determined that no new data has been archived for longer than the preset duration, the step of updating the central data feature of the archive data class is performed again.
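The interaction between the idle timer and the update step can be sketched with a small state holder. This is only one possible arrangement under stated assumptions: the class name, its methods, the explicit `now` timestamps, and the pending queue are hypothetical and are not described in the source.

```python
class ArchiveClass:
    """Tracks when a class was last archived into and defers data during updates."""

    def __init__(self, idle_seconds):
        self.idle_seconds = idle_seconds   # the preset duration
        self.last_archived = 0.0
        self.updating = False
        self.data, self.pending = [], []

    def should_update(self, now):
        # Update only when idle for longer than the preset duration.
        return not self.updating and now - self.last_archived > self.idle_seconds

    def begin_update(self):
        self.updating = True

    def archive(self, item, now):
        if self.updating:
            self.pending.append(item)      # wait until the update completes
            return
        self.data.append(item)
        self.last_archived = now           # restart the idle timer

    def finish_update(self, now):
        self.updating = False
        pending, self.pending = self.pending, []
        for item in pending:
            self.archive(item, now)

cls = ArchiveClass(idle_seconds=5)
cls.archive("image 1", now=0.0)
if cls.should_update(now=6.0):
    cls.begin_update()
cls.archive("image 2", now=7.0)            # arrives during the update
cls.finish_update(now=8.0)
print(cls.data, cls.should_update(now=9.0))
```

After the deferred item is archived the idle timer restarts, so another update is not triggered until the preset duration elapses again.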
After the data has been cluster-archived into an existing or newly created archive data class, the user can send different types of operation instructions to the server according to actual requirements, which specifically includes:
if it is determined that an operation instruction is received, executing the corresponding operation according to the operation instruction.
The execution of the corresponding operation is described in detail below, taking the operation instruction being a deletion instruction, a modification instruction, and a search instruction as examples. These operation types are only examples and the embodiment of the present application is not limited thereto.
First type: the operation instruction is a delete instruction.
If it is determined that an archive data class is to be deleted, executing the corresponding operation according to the operation instruction specifically includes:
searching for the archive data class associated with the archive number in the deletion instruction, and deleting the archive data class.
The deletion instruction at least comprises an archive number.
In the embodiment of the application, the archive data class associated with the archive number in the deletion instruction is found and then deleted, so that all data in the archive data class is deleted. For example, if it is determined that the archive data class with archive number B is to be deleted, the deletion instruction includes archive number B; the archive data class corresponding to archive number B is determined and deleted.
If it is determined that data in an archive data class is to be deleted, executing the corresponding operation according to the operation instruction specifically includes:
searching for the archive data class associated with the archive number in the deletion instruction, searching for the data associated with the data number in the deletion instruction, and deleting the data.
The deletion instruction at least comprises an archive number and a data number.
In the embodiment of the application, after a deletion instruction is received and it is determined that data in an archive data class is to be deleted, the archive data class associated with the archive number in the deletion instruction is found, the data associated with the data number in the deletion instruction is found, and the data is deleted. For example, assume the archive number in the deletion instruction is B and the data number is 4. After the deletion instruction is received, the archive data class associated with archive number B is found, the data associated with data number 4 is found in it, and that data is deleted.
Thus, by the method in the embodiment of the application, the archive number is associated with each archive data class respectively, and the data number is associated with each data in the archive data class respectively, so that the archive data class and the data can be accurately positioned, and the accuracy can be improved when the archive data class or the data in the archive data class are deleted.
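Both forms of the deletion instruction can be sketched over a nested mapping from archive number to data number. The function name and the storage layout are illustrative assumptions.

```python
def delete(archives, archive_number, data_number=None):
    """Delete a whole archive data class, or one data item inside it.

    archives: {archive_number: {data_number: data}}
    Returns True if something was deleted, False if nothing matched.
    """
    if data_number is None:
        # Deleting the class removes all data it contains.
        return archives.pop(archive_number, None) is not None
    entries = archives.get(archive_number, {})
    if data_number in entries:
        del entries[data_number]
        return True
    return False

archives = {"A": {1: "a1"}, "B": {4: "b4", 5: "b5"}}
deleted_item = delete(archives, "B", 4)     # delete data number 4 in class B
deleted_class = delete(archives, "A")       # delete class A entirely
print(archives)  # {'B': {5: 'b5'}}
```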
Second type: the operation instruction is a modification instruction.
If it is determined that an archive data class is to be modified, executing the corresponding operation according to the operation instruction specifically includes:
searching for the archive data class associated with the archive number in the modification instruction, and executing the corresponding modification operation according to the modification instruction.
The modification instruction at least comprises an archive number.
In the embodiment of the present application, when it is determined that an archive data class is to be modified, the archive number, name, and the like of the archive data class may be modified according to the modification instruction, which is not limited in the embodiment of the present application.
If it is determined that data in an archive data class is to be modified, executing the corresponding operation according to the operation instruction specifically includes:
searching for the archive data class associated with the archive number in the modification instruction, searching for the data associated with the data number in the modification instruction, and executing the corresponding modification operation according to the modification instruction.
The modification instruction at least comprises an archive number and a data number.
In the embodiment of the application, when it is determined that data in an archive data class is to be modified, the corresponding archive data class is found according to the archive number in the modification instruction and the association between archive numbers and archive data classes, the corresponding data is then found according to the data number in the modification instruction and the association between data numbers and data, and the data is modified. For example, the data number of the data can be modified, or the data can be replaced with new data, which is not limited in the embodiment of the application.
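Two of the modifications mentioned above, changing an archive number and replacing a data item, can be sketched as follows. The function names, the error handling, and the storage layout are illustrative assumptions.

```python
def modify_archive_number(archives, old_number, new_number):
    """Re-associate an archive data class with a different archive number."""
    if old_number not in archives:
        raise KeyError(old_number)
    if new_number in archives:
        raise ValueError(f"archive number {new_number!r} is already in use")
    archives[new_number] = archives.pop(old_number)

def replace_data(archives, archive_number, data_number, new_data):
    """Replace the data stored under (archive_number, data_number)."""
    entries = archives[archive_number]      # KeyError if the class is unknown
    if data_number not in entries:
        raise KeyError(data_number)
    entries[data_number] = new_data

archives = {"A": {1: "old.jpg"}}
replace_data(archives, "A", 1, "new.jpg")
modify_archive_number(archives, "A", "A2")
print(archives)  # {'A2': {1: 'new.jpg'}}
```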
Third type: the operation instruction is a search instruction.
In the embodiment of the present application, when the operation instruction is a search instruction, the search instruction may be specifically classified into the following three different types.
First kind: the search instruction at least includes an archive number.
Executing the corresponding operation according to the operation instruction specifically includes:
searching for the archive data class associated with the archive number in the search instruction.
The search instruction at least comprises an archive number.
In the embodiment of the application, after the data to be processed is cluster-archived, the archive data class to be searched can be accurately located according to its archive number. Data is thus archived first and searched afterwards, and the search results are presented by clustered archive data class. For example, captured snapshot images are cluster-archived; when images are searched, the archive number of an archive data class is input and each data item in that archive data class is returned. This prevents the data of the same person from appearing repeatedly and achieves de-duplicated display and fast return of snapshot search results.
Second kind: the search instruction at least comprises an archive number and a data number.
Executing the corresponding operation according to the operation instruction specifically includes:
searching for the archive data class associated with the archive number in the search instruction, and searching for the data associated with the data number in the search instruction.
The search instruction at least comprises an archive number and a data number.
In the embodiment of the application, when data in an archive data class is searched, the corresponding archive data class is found according to the archive number in the search instruction and the association between archive numbers and archive data classes, and the corresponding data is then found according to the data number in the search instruction and the association between data numbers and data. Because the search is performed after the data is cluster-archived, search results can be de-duplicated and returned quickly, and the data can also be retrieved by structured search or event search. Taking the data being images as an example, historical pictures, videos, or archive information can be searched and de-duplicated quickly by searching archive data classes or the data within them.
Third kind: the search instruction includes at least a data number.
Executing the corresponding operation according to the operation instruction specifically includes:
searching for the archive data class associated with the data number, and searching that archive data class for the data associated with the data number.
The search instruction at least comprises a data number.
In the embodiment of the application, when data in an archive data class is searched, the corresponding archive data class is found according to the association between the data number in the search instruction and the archive data class, and the data associated with the data number is then found according to the association between data numbers and data.
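The three kinds of search instruction can be sketched with one function. The function name and storage layout are illustrative assumptions; in particular, the third kind assumes that data numbers are unique across archive data classes, which the source does not state.

```python
def search(archives, archive_number=None, data_number=None):
    """Look up a class, a data item in a known class, or a data item by number.

    archives: {archive_number: {data_number: data}}
    """
    if archive_number is not None and data_number is None:
        return archives.get(archive_number)                 # first kind
    if archive_number is not None:
        return archives.get(archive_number, {}).get(data_number)  # second kind
    # Third kind: find the class holding the data number (assumed unique).
    for number, entries in archives.items():
        if data_number in entries:
            return number, entries[data_number]
    return None

archives = {"A": {1: "a1.jpg", 2: "a2.jpg"}, "B": {3: "b3.jpg"}}
by_archive = search(archives, archive_number="A")
by_both = search(archives, archive_number="A", data_number=2)
by_data = search(archives, data_number=3)
print(by_archive, by_both, by_data)
```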
For example, a historical data set can be searched given a data feature and a feature similarity threshold: the given data feature is compared with the archive data classes to find the archive numbers whose feature similarity reaches the threshold, and the set of data corresponding to those archive numbers, after filtering by a given time period, is the required data.
The filtering mode can be to filter data according to time, so that data with longer storage time in the archive data class can be filtered, recent data are reserved, when the data are images, the images can be filtered according to the definition of the images, so that images with low definition in the archive data class can be filtered, only images with higher definition are reserved, and the accuracy of archiving the data to be processed can be further improved.
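The two filtering criteria can be sketched as follows. The record fields `time` and `clarity`, the function name, and the sample values are hypothetical; the source does not define how image definition is measured.

```python
def filter_records(records, earliest_time=None, min_clarity=None):
    """Keep records that are recent enough and, for images, sharp enough.

    records: list of dicts with hypothetical keys "time" and "clarity".
    """
    kept = []
    for record in records:
        if earliest_time is not None and record["time"] < earliest_time:
            continue        # stored too long ago
        if min_clarity is not None and record["clarity"] < min_clarity:
            continue        # definition too low
        kept.append(record)
    return kept

records = [
    {"id": 1, "time": 100, "clarity": 0.9},
    {"id": 2, "time": 500, "clarity": 0.3},
    {"id": 3, "time": 600, "clarity": 0.8},
]
recent_and_sharp = filter_records(records, earliest_time=400, min_clarity=0.5)
print([r["id"] for r in recent_and_sharp])  # [3]
```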
Further, taking image data as an example, because the archive data classes are obtained by clustering the data, image information about people is counted per person rather than per capture during a search, so the archive data classes can be used to analyse trends in the flow of people entering and leaving.
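The difference between counting per capture and counting per person can be shown with a small sketch. It assumes each capture has already been archived and is represented as a `(period, archive_no)` pair, where one archive number stands for one person; the function name `flow_by_period` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def flow_by_period(events: Iterable[Tuple[str, str]]) -> Dict[str, Tuple[int, int]]:
    """Return period -> (number of captures, number of distinct people)."""
    captures: Dict[str, int] = defaultdict(int)
    people: Dict[str, Set[str]] = defaultdict(set)
    for period, archive_no in events:
        captures[period] += 1            # counted by times
        people[period].add(archive_no)   # counted by person
    return {p: (captures[p], len(people[p])) for p in captures}
```

For a flow trend, the second number in each pair is the one the text refers to: repeated captures of the same person within a period do not inflate it.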
In the embodiment of the application, the data features of the data to be processed are obtained and compared with the central data feature of each archive data class, and the feature similarity between the data features and each central data feature is determined. If the feature similarity is not smaller than a preset feature similarity threshold, the data to be processed is archived into the archive data class corresponding to the highest feature similarity. If the feature similarity is smaller than the feature similarity threshold, a new archive data class is created and the data to be processed is archived into it. For any archive data class, if it is determined that no new data has been archived for longer than a preset time period, the data features of the data contained in that archive data class are compared in pairs, the data feature with the highest similarity to the other data features is determined, and the central data feature of that archive data class is updated according to the determined data feature.
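The pairwise comparison used to update the central data feature can be sketched as picking the member with the highest average similarity to the others. The source does not name a similarity measure; cosine similarity is used here purely as an assumption, and `center_index` is a hypothetical helper name.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def center_index(features: Sequence[Sequence[float]]) -> int:
    """Index of the feature with the highest average similarity to the rest."""
    if not features:
        raise ValueError("archive data class is empty")
    if len(features) == 1:
        return 0
    best_i, best_avg = 0, float("-inf")
    for i, fi in enumerate(features):
        total = sum(cosine(fi, fj) for j, fj in enumerate(features) if j != i)
        avg = total / (len(features) - 1)
        if avg > best_avg:
            best_i, best_avg = i, avg
    return best_i
```

The comparison is quadratic in the number of members, which is one reason to run it only when an archive data class has been idle, as the text describes.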
Based on the foregoing embodiments, the image archiving method of the embodiments of the present application is described in detail below, taking image data as an example. Referring to fig. 2, a flowchart of another data processing method in an embodiment of the present application specifically includes:
Step 200: Start.
Step 210: Obtain image features of the image to be processed.
Step 220: Compare the image features with the central image feature of each archive data class, and determine the feature similarity between the image features and each central image feature.
The central image feature represents the image feature of the central point of the images contained in the corresponding archive data class. The central point is the image whose feature, when compared with the features of each of the other images in the same archive data class, has the highest average feature similarity.
Step 230: Determine whether the feature similarity is not smaller than a preset feature similarity threshold; if so, execute step 240; if not, execute step 250.
Step 240: Archive the image to be processed into the archive data class corresponding to the highest feature similarity.
Step 250: Create a new archive data class, and archive the image to be processed into the new archive data class.
Each archive data class, including the newly created archive data class, is associated with one archive number, and each image contained in those archive data classes corresponds to one image number.
Step 260: End.
In the embodiment of the application, the image features of the image to be processed are obtained and compared with the central image feature of each archive data class, and the feature similarity between the image features and each central image feature is determined, where the central image feature represents the image feature of the central point of the images contained in the corresponding archive data class. If the feature similarity is not smaller than the preset feature similarity threshold, the image to be processed is archived into the archive data class corresponding to the highest feature similarity. If the feature similarity is smaller than the feature similarity threshold, a new archive data class is created and the image to be processed is archived into it. Archiving the image to be processed by comparing feature similarity in this way de-duplicates images and reduces the amount of computation.
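Steps 210 to 250 can be sketched as a single `archive` call that compares one feature against the current class centers. This is an illustrative sketch under assumptions: cosine similarity stands in for the unspecified similarity measure, archive numbers are simple integers, and the `Archiver` class and its attribute names are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class Archiver:
    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.centers: Dict[int, List[float]] = {}        # archive number -> central feature
        self.members: Dict[int, List[List[float]]] = {}  # archive number -> archived features
        self._next_no = 1

    def archive(self, feature: Sequence[float]) -> int:
        """Archive one feature and return the archive number it was filed under."""
        # Step 220: compare against every central feature.
        best_no: Optional[int] = None
        best_sim = float("-inf")
        for no, center in self.centers.items():
            sim = cosine(feature, center)
            if sim > best_sim:
                best_no, best_sim = no, sim
        # Steps 230/240: file under the best match if it meets the threshold.
        if best_no is not None and best_sim >= self.threshold:
            self.members[best_no].append(list(feature))
            return best_no
        # Step 250: otherwise create a new archive data class.
        no = self._next_no
        self._next_no += 1
        self.centers[no] = list(feature)
        self.members[no] = [list(feature)]
        return no
```

Each incoming feature is compared only with one center per class rather than with every archived item, which is where the reduction in computation mentioned above comes from.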
Based on the same inventive concept, the embodiments of the present application provide a data processing apparatus, where the data processing apparatus may be a hardware structure, a software module, or a hardware structure plus a software module. Based on the above embodiments, referring to fig. 3, a schematic structural diagram of a data processing apparatus according to an embodiment of the present application is shown, which specifically includes:
An acquisition module 300, configured to acquire data characteristics of data to be processed;
a comparison module 310, configured to compare the data features with the central data feature of each archive data class, and determine the feature similarity between the data features and each central data feature, where the central data feature represents the data feature of the central point of the data contained in the corresponding archive data class, and the central point is the data whose data feature, when compared with the data features of each of the other data in the same archive data class, has the highest average feature similarity;
the first processing module 320 is configured to archive the data to be processed into the archive data class corresponding to the highest feature similarity if it is determined that the feature similarity is not smaller than the preset feature similarity threshold, and to create a new archive data class and archive the data to be processed into the newly created archive data class if it is determined that the feature similarity is smaller than the preset feature similarity threshold.
Optionally, each archive data class and each newly-built archive data class are respectively associated with one archive number, and each data contained in each archive data class and each data contained in each newly-built archive data class respectively correspond to one data number.
Optionally, the apparatus further comprises:
the updating module 330 is configured to, for any archive data class, if it is determined that no new data has been archived for longer than a preset time period, compare the data features of the data contained in that archive data class in pairs, determine the data feature with the highest similarity to the other data features, and update the central data feature of that archive data class according to the determined data feature;
the second processing module 340 is configured to, if it is determined that new data is to be archived while the central data feature of that archive data class is being updated, archive the new data into that archive data class after the central data feature of that archive data class has been updated.
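The interplay of the updating module and the second processing module can be sketched as an idle check plus a holding queue. This is only a single-threaded illustration: the `updating` flag, the `pending` list, and the caller-supplied `pick_center` function are all assumptions, and a real system would need proper synchronisation.

```python
from typing import Callable, List, Sequence

Feature = List[float]


class ArchiveClass:
    def __init__(self, first_feature: Feature, now: float) -> None:
        self.features: List[Feature] = [first_feature]
        self.center: Feature = first_feature
        self.last_archived = now
        self.updating = False
        self.pending: List[Feature] = []  # data that arrived during an update

    def add(self, feature: Feature, now: float) -> None:
        if self.updating:
            # Hold new data until the center update has finished.
            self.pending.append(feature)
            return
        self.features.append(feature)
        self.last_archived = now

    def maybe_update_center(self, now: float, idle_s: float,
                            pick_center: Callable[[Sequence[Feature]], Feature]) -> bool:
        """Recompute the center only if the class has been idle longer than idle_s."""
        if now - self.last_archived <= idle_s:
            return False
        self.updating = True
        try:
            self.center = pick_center(self.features)
        finally:
            self.updating = False
        # Archive anything that arrived while the update was running.
        if self.pending:
            self.features.extend(self.pending)
            self.pending.clear()
            self.last_archived = now
        return True
```

Deferring the held data until after the update keeps the set of features stable while the pairwise comparison runs.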
Optionally, the apparatus further comprises:
the third processing module 350 is configured to execute a corresponding operation according to an operation instruction if it is determined that the operation instruction is received.
Optionally, if the operation instruction is a delete instruction, the third processing module 350 is specifically configured to:
searching for the archive data class associated with the archive number according to the archive number in the delete instruction, and deleting the archive data class, wherein the delete instruction includes at least the archive number; or
searching for the archive data class associated with the archive number according to the archive number in the delete instruction, searching for the data associated with the data number according to the data number in the delete instruction, and deleting the data, wherein the delete instruction includes at least the archive number and the data number.
Optionally, if the operation instruction is a modification instruction, the third processing module 350 is specifically configured to:
searching for the archive data class associated with the archive number according to the archive number in the modification instruction, and executing the corresponding modification operation according to the modification instruction, wherein the modification instruction includes at least the archive number; or
searching for the archive data class associated with the archive number according to the archive number in the modification instruction, searching for the data associated with the data number according to the data number in the modification instruction, and executing the corresponding modification operation according to the modification instruction, wherein the modification instruction includes at least the archive number and the data number.
Optionally, if the operation instruction is a search instruction, when a corresponding operation is executed according to the operation instruction, the third processing module 350 is specifically configured to:
searching for the archive data class associated with the archive number according to the archive number in the search instruction, wherein the search instruction includes at least the archive number; or
searching for the archive data class associated with the archive number according to the archive number in the search instruction, and searching for the data associated with the data number according to the data number in the search instruction, wherein the search instruction includes at least the archive number and the data number.
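The delete, modify, and search instructions handled by the third processing module can be sketched as one dispatch function over a nested dictionary. The instruction layout (keys `op`, `archive_no`, `data_no`, `value`) and the meaning given to a class-level modification are assumptions made for this example; the source only says that the corresponding operation is executed.

```python
from typing import Any, Dict

Store = Dict[str, Dict[str, Any]]  # archive number -> {data number -> data}


def execute(store: Store, instruction: Dict[str, Any]) -> Any:
    """Run one operation instruction against the store and return its result."""
    op = instruction["op"]
    archive_no = instruction["archive_no"]
    data_no = instruction.get("data_no")  # optional: absent means class-level

    cls = store.get(archive_no)
    if cls is None:
        return None  # unknown archive number

    if op == "search":
        return cls if data_no is None else cls.get(data_no)
    if op == "delete":
        if data_no is None:
            return store.pop(archive_no)   # delete the whole archive data class
        return cls.pop(data_no, None)      # delete one data item
    if op == "modify":
        if data_no is None:
            cls.update(instruction["value"])  # assumed: merge a dict of changes
        elif data_no in cls:
            cls[data_no] = instruction["value"]
        return cls
    raise ValueError(f"unsupported operation: {op}")
```

In every branch the archive number is resolved first and the data number second, following the order given in the text.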
Optionally, if the archive data class corresponds to a plurality of central data features, the comparison module 310 is specifically configured to:
comparing the data features with each of the central data features of each archive data class, and determining the feature similarity between the data features and each of the central data features of each archive data class, wherein the plurality of central data features are determined from a plurality of central points, the central points being determined respectively from the data features of different regions of each data after features are extracted from different region blocks of each data contained in the corresponding archive data class; or the plurality of central data features are obtained by clustering the data features extracted from each data contained in the corresponding archive data class.
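When an archive data class holds several central data features, the comparison yields several similarities per class. A minimal sketch follows; taking the maximum over a class's centers as that class's similarity is an assumption of this example (the source does not say how the per-center similarities are combined), as are cosine similarity and the name `best_class`.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def best_class(feature: Sequence[float],
               classes: Dict[str, List[List[float]]]) -> Optional[Tuple[str, float]]:
    """Return (archive number, similarity) of the best-matching class, or None."""
    best: Optional[Tuple[str, float]] = None
    for archive_no, centers in classes.items():
        if not centers:
            continue
        # Assumed combination rule: a class is as similar as its closest center.
        sim = max(cosine(feature, c) for c in centers)
        if best is None or sim > best[1]:
            best = (archive_no, sim)
    return best
```

The returned similarity can then be checked against the feature similarity threshold in the same way as in the single-center case.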
Optionally, the data to be processed is an image.
Based on the above embodiments, referring to fig. 4, a schematic structural diagram of an electronic device according to an embodiment of the present application is shown.
Embodiments of the present application provide an electronic device that may include a processor 410 (Central Processing Unit, CPU), a memory 420, an input device 430, an output device 440, and the like, where the input device 430 may include a keyboard, a mouse, a touch screen, and the like, and the output device 440 may include a display device such as a liquid crystal display (Liquid Crystal Display, LCD) or a cathode ray tube (Cathode Ray Tube, CRT).
Memory 420 may include Read Only Memory (ROM) and Random Access Memory (RAM) and provides processor 410 with program instructions and data stored in memory 420. In an embodiment of the present application, the memory 420 may be used to store a program of any of the data processing methods in the embodiment of the present application.
Processor 410 is operative to perform any one of the data processing methods of the embodiments of the present application in accordance with the obtained program instructions by invoking the program instructions stored in memory 420.
Based on the above embodiments, in the embodiments of the present application, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the data processing method in any of the above method embodiments.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (18)

1. A method of data processing, comprising:
acquiring data characteristics of data to be processed;
respectively comparing the data characteristics with central data characteristics of each archive data class, and determining the characteristic similarity between the data characteristics and each central data characteristic, wherein the central data characteristic represents the data characteristics of central points of all data contained in the corresponding archive data class, and the central points represent data with highest average value of the characteristic similarity, wherein the data characteristics are respectively compared with the data characteristics of other data in the archive data class in the corresponding archive data class;
if the feature similarity is not smaller than the preset feature similarity threshold, archiving the data to be processed into a file data class corresponding to the highest feature similarity, if the feature similarity is smaller than the preset feature similarity threshold, creating a new file data class, and archiving the data to be processed into the created file data class;
for any one of the archive data classes, if no new data is archived beyond the preset time period, respectively comparing the data features of all the data contained in the any one of the archive data classes, determining the data feature with the highest similarity with other data features, and updating the central data feature of any one of the archive data classes according to the determined data feature;
If it is determined that new data is archived when the center data feature of any one of the archive data classes is updated, the new data is archived in any one of the archive data classes after the center data feature of any one of the archive data classes is updated.
2. The method of claim 1, wherein each of the archive data classes and the newly created archive data class is associated with a respective archive number, and wherein each of the archive data classes and the data contained in the newly created archive data class corresponds to a respective data number.
3. The method as recited in claim 1, further comprising:
and if the operation instruction is determined to be received, executing corresponding operation according to the operation instruction.
4. The method of claim 3, wherein if the operation instruction is a delete instruction, performing a corresponding operation according to the operation instruction, specifically comprising:
searching a file data class associated with the file number according to the file number in the deleting instruction, and deleting the file data class, wherein the deleting instruction at least comprises the file number; or
according to the file number in the deleting instruction, searching the file data class associated with the file number, and according to the data number in the deleting instruction, searching the data associated with the data number, and deleting the data, wherein the deleting instruction at least comprises the file number and the data number.
5. The method of claim 3, wherein if the operation instruction is a modification instruction, performing a corresponding operation according to the operation instruction, specifically comprising:
according to the file number in the modification instruction, searching a file data class associated with the file number, and executing corresponding modification operation according to the modification instruction, wherein the modification instruction at least comprises the file number; or
according to the file number in the modification instruction, searching a file data class associated with the file number, according to the data number in the modification instruction, searching data associated with the data number, and executing corresponding modification operation according to the modification instruction, wherein the modification instruction at least comprises the file number and the data number.
6. The method of claim 4, wherein if the operation instruction is a search instruction, performing a corresponding operation according to the operation instruction, specifically comprising:
according to the file number in the searching instruction, searching the file data class associated with the file number, wherein the searching instruction at least comprises the file number; or
According to the file number in the searching instruction, searching the file data class associated with the file number, and according to the data number in the searching instruction, searching the data associated with the data number, wherein the searching instruction at least comprises the file number and the data number.
7. A method according to any one of claims 1-6, wherein if the archive data class corresponds to a plurality of central data features, comparing the data features with the central data features of each archive data class, respectively, and determining feature similarities between the data features and each central data feature, comprises:
comparing the data characteristics with the central data characteristics of each archive data class respectively, and determining the characteristic similarity between the data characteristics and the central data characteristics of each archive data class, wherein the plurality of central data characteristics are determined based on the determined plurality of central points after respectively extracting the characteristics of different area blocks of each data contained in the corresponding archive data class and respectively determining the plurality of central points based on the data characteristics of different areas of each data; or the plurality of central data features are obtained by respectively clustering the data features of the data after feature extraction of the data contained in the corresponding archive data class.
8. The method of any of claims 1-6, wherein the data to be processed is an image.
9. A data processing apparatus, comprising:
the acquisition module is used for acquiring the data characteristics of the data to be processed;
the comparison module is used for respectively comparing the data characteristics with the central data characteristics of each archive data class and determining the characteristic similarity between the data characteristics and each central data characteristic, wherein the central data characteristics represent the data characteristics of central points of each data contained in the corresponding archive data class, and the central points represent data with highest average value of the characteristic similarity, wherein the data characteristics are respectively compared with the data characteristics of other data in the archive data class in the corresponding archive data class;
the first processing module is used for archiving the data to be processed into the file data class corresponding to the highest feature similarity if the feature similarity is not smaller than the preset feature similarity threshold, creating a new file data class if the feature similarity is smaller than the preset feature similarity threshold, and archiving the data to be processed into the created file data class;
the updating module is used for respectively comparing the data characteristics of all the data contained in any one of the archive data classes with each other to determine the data characteristic with the highest similarity with other data characteristics if no new data is archived for any one of the archive data classes beyond the preset time period, and updating the central data characteristic of any one of the archive data classes according to the determined data characteristic;
And the second processing module is used for archiving the new data into any one of the archive data classes after updating the central data characteristics of any one of the archive data classes if the new data is determined to be archived when the central data characteristics of any one of the archive data classes are updated.
10. The apparatus of claim 9, wherein each of the archive data classes and the newly created archive data class is associated with an archive number, and wherein each of the archive data classes and the data contained in the newly created archive data class corresponds to a data number.
11. The apparatus as recited in claim 9, further comprising:
and the third processing module is used for executing corresponding operation according to the operation instruction if the operation instruction is determined to be received.
12. The apparatus of claim 11, wherein if the operation instruction is a delete instruction, the third processing module is specifically configured to:
searching a file data class associated with the file number according to the file number in the deleting instruction, and deleting the file data class, wherein the deleting instruction at least comprises the file number; or
According to the file number in the deleting instruction, searching the file data class associated with the file number, and according to the data number in the deleting instruction, searching the data associated with the data number, and deleting the data, wherein the deleting instruction at least comprises the file number and the data number.
13. The apparatus of claim 11, wherein if the operation instruction is a modification instruction, the third processing module is specifically configured to:
according to the file number in the modification instruction, searching a file data class associated with the file number, and executing corresponding modification operation according to the modification instruction, wherein the modification instruction at least comprises the file number; or
according to the file number in the modification instruction, searching a file data class associated with the file number, according to the data number in the modification instruction, searching data associated with the data number, and executing corresponding modification operation according to the modification instruction, wherein the modification instruction at least comprises the file number and the data number.
14. The apparatus of claim 11, wherein if the operation instruction is a search instruction, the third processing module is specifically configured to:
according to the file number in the searching instruction, searching the file data class associated with the file number, wherein the searching instruction at least comprises the file number; or
according to the file number in the searching instruction, searching the file data class associated with the file number, and according to the data number in the searching instruction, searching the data associated with the data number, wherein the searching instruction at least comprises the file number and the data number.
15. An apparatus as claimed in any one of claims 9 to 14, wherein if the archive data class corresponds to a plurality of central data features, the comparison module is specifically configured to:
comparing the data characteristics with the central data characteristics of each archive data class respectively, and determining the characteristic similarity between the data characteristics and the central data characteristics of each archive data class, wherein the plurality of central data characteristics are determined based on the determined plurality of central points after respectively extracting the characteristics of different area blocks of each data contained in the corresponding archive data class and respectively determining the plurality of central points based on the data characteristics of different areas of each data; or the plurality of central data features are obtained by respectively clustering the data features of the data after feature extraction of the data contained in the corresponding archive data class.
16. The apparatus according to any of claims 9-14, wherein the data to be processed is an image.
17. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of any of claims 1-8 when the program is executed by the processor.
18. A computer-readable storage medium having stored thereon a computer program, characterized by: the computer program implementing the steps of the method of any of claims 1-8 when executed by a processor.
CN202010342165.2A 2020-04-27 2020-04-27 Data processing method and device Active CN111507424B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010342165.2A CN111507424B (en) 2020-04-27 2020-04-27 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010342165.2A CN111507424B (en) 2020-04-27 2020-04-27 Data processing method and device

Publications (2)

Publication Number Publication Date
CN111507424A CN111507424A (en) 2020-08-07
CN111507424B true CN111507424B (en) 2023-10-27

Family

ID=71874693

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010342165.2A Active CN111507424B (en) 2020-04-27 2020-04-27 Data processing method and device

Country Status (1)

Country Link
CN (1) CN111507424B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112445925B (en) * 2020-11-24 2022-08-26 浙江大华技术股份有限公司 Clustering archiving method, device, equipment and computer storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109783672A (en) * 2018-12-28 2019-05-21 上海依图网络科技有限公司 A kind of archiving method and device
CN109800672A (en) * 2018-12-28 2019-05-24 上海依图网络科技有限公司 A kind of archiving method and device
CN109800668A (en) * 2018-12-28 2019-05-24 上海依图网络科技有限公司 A kind of archiving method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8825604B2 (en) * 2012-09-28 2014-09-02 International Business Machines Corporation Archiving data in database management systems

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
梁静;任㛃.基于Web技术的电子档案管理系统开发与设计.电子设计工程.2017,(24),全文. *

Also Published As

Publication number Publication date
CN111507424A (en) 2020-08-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant