CN113127668A - Data annotation method and related product - Google Patents

Publication number
CN113127668A
Authority
CN
China
Prior art keywords
data
processed
photo
photos
annotation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911418643.7A
Other languages
Chinese (zh)
Inventor
刘宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Intellifusion Technologies Co Ltd
Original Assignee
Shenzhen Intellifusion Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Intellifusion Technologies Co Ltd
Priority to CN201911418643.7A
Publication of CN113127668A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/5866 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application disclose a data annotation method and a related product, applied to an electronic device that is in communication connection with a manual annotation client. In the method, the electronic device acquires photos to be processed and forms them into an initial photo set; extracts feature information of each photo to be processed in the initial photo set; determines a data feature set and a similar set through the manual annotation client according to the feature information; determines a target annotation set through the manual annotation client according to the data feature set and the similar set; and finally performs data annotation on each photo to be processed in the target annotation set according to the feature information. Combining the pre-labeling performed by the electronic device with manual classification reduces the labeling errors produced during annotation and improves both the efficiency and the accuracy of data annotation.

Description

Data annotation method and related product
Technical Field
The present application relates to the field of electronic device technologies, and in particular, to a data annotation method and a related product.
Background
Face recognition requires a large amount of labeled data for algorithm training. To support a face recognition task, face image data can be collected from the network with a search engine or a web crawler, but many of the collected face images carry no identity information. The collected data is therefore usually pre-divided into groups of images with the same label, each group corresponding to one unique person; images of the same person then have to be manually merged into one set, and images of different persons separated into different sets. However, some photos are of poor quality or differ greatly in shooting angle, so during manual cleaning it can be hard to judge whether a pre-labeled photo belongs to the current set, and misclassified photos cannot be cleaned out accurately. During manual merging, several similar sets must be retrieved for each set according to similarity and checked one by one to decide whether they show the same person and should be merged; this process involves a heavy workload and is prone to omissions.
Disclosure of Invention
The embodiment of the application provides a data annotation method and a related product, so that the high efficiency and convenience of data annotation are improved.
In a first aspect, an embodiment of the present application provides a data annotation method, which is applied to an electronic device, where the electronic device is in communication connection with a manual annotation client; the method comprises the following steps:
acquiring photos to be processed, and determining the photos to be processed to form an initial photo set;
extracting feature information of each photo to be processed in the initial photo set;
determining a data feature set and a similar set through the manual annotation client according to the feature information, wherein the data feature set represents a set formed by a plurality of photos to be processed whose feature information differs by less than a first threshold, and the similar set represents a set formed by photos to be processed whose feature information is similar to that of the photos in a first feature set but which are not themselves in the first feature set, the first feature set being a preliminary set obtained by classifying the photos to be processed according to the feature information;
determining a target annotation set through the manual annotation client according to the data feature set and the similar set, wherein the target annotation set represents the set in which each photo to be processed is located when it is annotated, and the data annotated for each photo to be processed is associated with the set in which that photo is located;
and performing data annotation on each photo to be processed in the target annotation set according to the characteristic information.
In a second aspect, an embodiment of the present application provides a data annotation device, which is applied to an electronic device, where the electronic device is in communication connection with a manual annotation client; the data annotation device comprises a processing unit, a communication unit and a storage unit, wherein,
the processing unit is used for: acquiring photos to be processed and determining that the photos to be processed form an initial photo set; extracting the feature information of each photo to be processed in the initial photo set; determining a data feature set and a similar set through the manual annotation client according to the feature information, wherein the data feature set represents a set formed by a plurality of photos to be processed whose feature information differs by less than a first threshold, and the similar set represents a set formed by photos to be processed whose feature information is similar to that of the photos in the first feature set but which are not in the first feature set; determining a target annotation set through the manual annotation client according to the data feature set and the similar set, wherein the target annotation set represents the set in which each photo to be processed is located when it is annotated, and the data annotated for each photo to be processed is associated with the set in which that photo is located; and performing data annotation on each photo to be processed in the target annotation set according to the feature information.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, and the program includes instructions for executing steps in any method of the first aspect of the embodiment of the present application.
In a fourth aspect, the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program for electronic data exchange, and the computer program causes a computer to perform some or all of the steps described in any one of the methods of the first aspect of the present application.
In a fifth aspect, the present application provides a computer program product, where the computer program product includes a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to perform some or all of the steps described in any one of the methods of the first aspect of the present application. The computer program product may be a software installation package.
It can be seen that the embodiments of the present application provide a data annotation method and related products, applied to an electronic device that is in communication connection with a manual annotation client. The electronic device acquires photos to be processed and forms them into an initial photo set, extracts feature information of each photo to be processed in the initial photo set, and determines a data feature set and a similar set through the manual annotation client according to the feature information. The data feature set represents a set formed by a plurality of photos to be processed whose feature information differs by less than a first threshold, and the similar set represents a set formed by photos to be processed whose feature information is similar to that of the photos in the first feature set but which are not in the first feature set. The electronic device then determines a target annotation set through the manual annotation client according to the data feature set and the similar set; the target annotation set represents the set in which each photo to be processed is located when it is annotated, and the data annotated for each photo is associated with that set. Finally, the electronic device performs data annotation on each photo to be processed in the target annotation set according to the feature information. Combining the pre-labeling performed by the electronic device with manual classification completes the labeling of a large number of face recognition pictures at low labor cost and reduces the labeling errors produced during annotation, which improves the quality, efficiency and accuracy of data annotation.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic flowchart of a data annotation method provided in an embodiment of the present application;
Fig. 2 is a schematic flowchart of another data annotation method provided in an embodiment of the present application;
Fig. 3 is a schematic flowchart of another data annotation method provided in an embodiment of the present application;
Fig. 4 is a schematic structural diagram of an electronic device provided in an embodiment of the present application;
Fig. 5 is a block diagram of the functional units of a data annotation device provided in an embodiment of the present application.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," and the like in the description and claims of the present application and in the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The electronic device according to the embodiment of the present application may include various handheld devices, vehicle-mounted devices, wearable devices, computing devices or other processing devices connected to a wireless modem, which have wireless communication functions, and various forms of User Equipment (UE), Mobile Stations (MS), terminal devices (terminal device), and the like.
The following describes embodiments of the present application in detail.
Referring to fig. 1, fig. 1 is a schematic flowchart of a data annotation method provided in an embodiment of the present application, and is applied to an electronic device, where the electronic device is in communication connection with a manual annotation client; the method comprises the following steps:
S101, the electronic device acquires photos to be processed and determines that the photos to be processed form an initial photo set;
S102, the electronic device extracts the feature information of each photo to be processed in the initial photo set;
The feature information comprises feature data of the photo to be processed.
Each photo to be processed has one piece of feature information.
S103, the electronic device determines a data feature set and a similar set through the manual annotation client according to the feature information;
The manual annotation client comprises a manual operation platform or a smart device that can receive data information sent by the electronic device, process it to determine the data feature set and the similar set, and send the data feature set and the similar set back to the electronic device.
The data feature set represents a set formed by a plurality of the photos to be processed whose feature information differs by less than a first threshold.
The similar set represents a set formed by the photos to be processed whose feature information is similar to that of the photos in the first feature set but which are not in the first feature set.
S104, the electronic device determines a target annotation set through the manual annotation client according to the data feature set and the similar set;
The target annotation set represents the set in which each photo to be processed is located when it is annotated, and the data annotated for each photo to be processed is associated with the set in which that photo is located.
S105, the electronic device performs data annotation on each photo to be processed in the target annotation set according to the feature information.
It can be seen that, in this embodiment, combining the pre-labeling performed by the electronic device with manual classification through the manual annotation client completes the labeling of a large number of face recognition pictures at low labor cost and reduces the labeling errors produced during annotation, which improves the quality, efficiency and accuracy of data annotation.
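As a minimal sketch of step S105, assume the target annotation sets are held as a mapping from a set identifier to a list of photo identifiers. The function name `annotate_photos` and the identifiers in the usage lines are illustrative and not taken from the application. Under that assumption, the label given to each photo can simply be the identifier of the set it sits in, which keeps the annotated data associated with that set:

```python
def annotate_photos(target_annotation_sets):
    """Label every photo with the identifier of the target annotation set holding it."""
    labels = {}
    for set_id, photos in target_annotation_sets.items():
        for photo in photos:
            if photo in labels:
                # A photo in two sets means duplicate elimination was skipped.
                raise ValueError(f"{photo} appears in both {labels[photo]} and {set_id}")
            labels[photo] = set_id
    return labels


# Hypothetical input: two target annotation sets, one per person.
labels = annotate_photos({"P1": ["S1", "S2"], "P2": ["S4"]})
print(labels)
```

Raising an error on a photo that appears in two sets is a design choice of this sketch; it reflects the requirement that repeated photos be resolved before annotation.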
In one possible example, the electronic device determines, by the manual annotation client, a data feature set and a similarity set according to the feature information, including: the electronic equipment classifies each photo to be processed by taking the characteristic information as an identifier, and determines a plurality of first characteristic sets; the electronic equipment sends the first feature set and the feature information to the manual labeling client side, and determines a similar photo set; and the electronic equipment determines the data feature set and the similar set through the manual annotation client according to the similar photo set.
Each first feature set is composed of a plurality of photos to be processed, wherein the difference values of the feature information of the photos to be processed are smaller than a first threshold value.
The difference value is a difference value between a plurality of pieces of feature information, and the difference value between different pieces of feature information can be calculated through a deep learning algorithm.
Wherein the similar photo set is composed of the to-be-processed photos which have a difference value with the feature information of each to-be-processed photo of the first feature set smaller than a first threshold and are not present in the first feature set.
Specifically, suppose there are first feature sets F1 and F2, where the feature information of F1 is Q1 and the feature information of F2 is Q2. If a photo E to be processed has feature information Q1 but is not in F1, then E is classified into the similar photo set H1 corresponding to F1.
And the similar photo sets correspond to the first feature sets one by one.
In specific implementation, the electronic device classifies each photo to be processed using the feature information G as an identifier and determines first feature sets L1, L2 and L3; the electronic device sends the first feature sets L1, L2 and L3 and the feature information G to the manual annotation client, and similar photo sets K1, K2 and K3 are determined; the electronic device then determines the data feature set and the similar set through the manual annotation client according to the similar photo sets.
As can be seen, in this example, the electronic device classifies each photo to be processed by using the feature information as an identifier, determines a plurality of first feature sets, then sends the first feature sets and the feature information to the manual annotation client, determines a similar photo set, and then determines the data feature set and the similar set through the manual annotation client according to the similar photo set; the electronic equipment preliminarily marks the photo to be processed, so that the workload of subsequent manual marking is favorably reduced, and the high efficiency and convenience of data marking are favorably improved.
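The pre-labeling described above can be sketched as follows. The application only states that the difference value can be calculated through a deep learning algorithm, so this sketch makes two assumptions of its own: feature information is a numeric vector, and the difference value is the Euclidean distance between vectors. Grouping is done greedily against the first photo of each set, and "similar" is read as "within the first threshold of at least one member", which is one possible reading of the text. All function names and the sample vectors are hypothetical.

```python
from math import dist


def group_by_threshold(features, first_threshold):
    """Greedily build first feature sets: a photo joins the first set whose
    seed photo (its first member) is closer than the threshold."""
    sets = []
    for photo_id, vec in features.items():
        for members in sets:
            if dist(features[members[0]], vec) < first_threshold:
                members.append(photo_id)
                break
        else:
            sets.append([photo_id])  # no set matched, so start a new one
    return sets


def similar_photo_set(members, features, first_threshold):
    """Photos outside `members` that are closer than the threshold to any member."""
    inside = set(members)
    return [
        pid for pid, vec in features.items()
        if pid not in inside
        and any(dist(features[m], vec) < first_threshold for m in members)
    ]


# Hypothetical 2-D feature vectors; real face features would be much longer.
features = {
    "a": (0.0, 0.0), "b": (0.4, 0.0),
    "c": (3.0, 0.0), "d": (3.3, 0.0),
    "e": (0.8, 0.0),
}
first_sets = group_by_threshold(features, 0.5)
candidates = similar_photo_set(first_sets[0], features, 0.5)
print(first_sets, candidates)
```

Here photo `e` is too far from the seed `a` to join the first set but is close to member `b`, so it ends up as a candidate for manual review rather than being silently dropped or silently merged.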
In one possible example, the electronic device determines, by the manual annotation client, the data feature set and the similar set according to the similar photo set, including: the electronic equipment acquires a data feature set obtained after data cleaning is carried out on each first feature set; and the electronic equipment acquires each similar picture set to carry out data cleaning to obtain a similar set.
And the data feature set is obtained by the manual labeling client side performing data cleaning on the non-homogeneous objects in each first feature set.
And the similar sets are obtained by the manual labeling client side by performing data cleaning on the non-homogeneous objects in each similar picture set.
Wherein, the non-homogeneous objects comprise that the objects in the two photos to be processed are not the same person.
And the similar sets correspond to the data feature sets one by one.
In specific implementation, data cleaning is performed on a known first feature set R1 that includes photos T1, T2 and T3 to be processed: the manual annotation client finds that T3 does not show the same person as T1 and T2 and deletes T3 from R1, and the electronic device obtains the cleaned data feature set R2. The similar photo set W1 is known to correspond to the first feature set R1 and includes photos T4, T5 and T6 to be processed. Data cleaning is performed on W1: the manual annotation client finds that T4 and T6 show the same person as the cleaned R1 and moves them into the cleaned R1, finds that T5 does not show the same person and deletes T5 from W1, so W1 is empty after cleaning is complete.
As can be seen, in this example, the electronic device obtains the data feature set after data cleaning is performed on each first feature set; the electronic equipment acquires each similar picture set and carries out data cleaning on the similar picture set to obtain a similar set; through the data cleaning is carried out by the manual labeling client, the workload of data labeling is favorably reduced, the data labeling error is reduced, and the accuracy of data labeling is favorably improved.
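The cleaning example with R1 and W1 can be replayed in a few lines. This is only a sketch: the human judgment made at the manual annotation client is modeled as a plain dictionary of verdicts, and the function name `clean_with_verdicts` and the `same_person` mapping are assumptions introduced for illustration.

```python
def clean_with_verdicts(first_feature_set, similar_photo_set, same_person):
    """Apply reviewer verdicts: keep confirmed photos, pull in confirmed
    candidates, and report everything that was cleaned out."""
    kept = [p for p in first_feature_set if same_person[p]]
    moved_in = [p for p in similar_photo_set if same_person[p]]
    removed = [p for p in first_feature_set + similar_photo_set if not same_person[p]]
    return kept + moved_in, removed


# Verdicts a reviewer might return for the example in the text:
# T3 and T5 do not show the same person as the rest.
verdicts = {"T1": True, "T2": True, "T3": False, "T4": True, "T5": False, "T6": True}
cleaned, removed = clean_with_verdicts(["T1", "T2", "T3"], ["T4", "T5", "T6"], verdicts)
print(cleaned, removed)
```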
In one possible example, the electronic device determines, by the manual annotation client, a target annotation set according to the data feature set and the similar set, including: after the electronic equipment acquires the data feature set and the similar set, determining that the photo to be processed in the similar set and the photo to be processed in the data feature set are similar objects, and moving the photo to be processed in the similar set into the data feature set; the electronic equipment queries whether any two data feature sets have repeated photos; if the electronic equipment determines that the repeated pictures exist, determining repeated picture information; and the electronic equipment processes the data feature set through the manual labeling client according to the repeated picture information to determine a target labeling set.
The repeated picture information comprises the picture number, the feature information and the data feature set number of the repeated picture.
Wherein, the same kind of object comprises that the objects in the two photos to be processed are the same person.
In specific implementation, suppose there are data feature sets U1 and U2 and similar sets V1 and V2, where U1 includes photos S1, S2 and S3 to be processed, U2 includes photos S4, S5 and S6, V1 includes photo S7, and V2 includes photos S3 and S8. After the electronic device acquires the data feature sets U1 and U2 and the similar sets V1 and V2, it is determined that photo S7 in the similar set V1 and photos S1, S2 and S3 in the data feature set U1 are similar objects, so S7 is moved into U1; likewise, photos S3 and S8 in the similar set V2 and photos S4, S5 and S6 in the data feature set U2 are determined to be similar objects, so S3 and S8 are moved into U2 (this determination is a manual judgment). The electronic device then finds that the repeated photo S3 exists in both U1 and U2 and determines the repeated picture information. According to the repeated picture information, the data feature sets U1 and U2 are processed through the manual annotation client, which judges that S3 was incorporated into U2 by mistake; S3 is deleted from U2, and the target annotation sets P1 and P2 are determined. P1 includes photos S1, S2, S3 and S7, and P2 includes photos S4, S5, S6 and S8.
It can be seen that, in this example, after the electronic device obtains the data feature set and the similar set, it is determined that the to-be-processed photos in the similar set and the to-be-processed photos in the data feature set are similar objects, the to-be-processed photos in the similar set are moved into the data feature set, then the electronic device queries whether there are duplicate photos in any two of the data feature sets, if it is determined that there are duplicate photos, repeat picture information is determined, and finally, the electronic device processes the data feature set through the manual annotation client according to the repeat picture information, and determines a target annotation set; the efficiency and the accuracy of data marking are improved.
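The automated part of this example, moving similar sets into their data feature sets and then querying for repeated photos, can be sketched as below. The sketch assumes each similar set is keyed by the name of the data feature set it corresponds to, and it records only the photo and the two set names; the feature information that the repeated picture information also carries is omitted. The function names are hypothetical.

```python
from itertools import combinations


def absorb_similar(data_feature_sets, similar_sets):
    """Move every photo of each similar set into its matching data feature set."""
    merged = {}
    for name, photos in data_feature_sets.items():
        extra = [p for p in similar_sets.get(name, []) if p not in photos]
        merged[name] = list(photos) + extra
    return merged


def find_repeated_photos(data_feature_sets):
    """Check every pair of sets and record photos that appear in both."""
    records = []
    for (name_a, a), (name_b, b) in combinations(data_feature_sets.items(), 2):
        for photo in sorted(set(a) & set(b)):
            records.append({"photo": photo, "sets": (name_a, name_b)})
    return records


sets = absorb_similar(
    {"U1": ["S1", "S2", "S3"], "U2": ["S4", "S5", "S6"]},
    {"U1": ["S7"], "U2": ["S3", "S8"]},
)
repeated = find_repeated_photos(sets)
print(sets, repeated)
```

The resulting record for S3 is what would be handed to the manual annotation client, which decides in which set the photo really belongs.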
In one possible example, the electronic device, according to the repeated picture information, processing the data feature set by the manual annotation client to determine a target annotation set, includes: the electronic equipment sends the repeated picture information to the manual labeling client; the electronic equipment judges whether the first data feature set and the second data feature set are photo sets of similar objects or not through the manual labeling client; and the electronic equipment executes a preset repeated elimination strategy on the data feature set according to the judgment result, and determines that the data feature set after executing the repeated elimination strategy is a target labeling set.
The repeated picture information is used for the manual labeling client to determine a first data feature set and a second data feature set.
The repeated photos are present in both the first data feature set and the second data feature set.
In specific implementation, the electronic device sends the repeated picture information to the manual annotation client; through the manual annotation client, the electronic device judges whether the photos to be processed in the first data feature set U1 and the second data feature set U2 are photos of similar objects; when the judgment result is that U1 and U2 are photo sets of similar objects, the electronic device executes the preset duplicate elimination strategy on the data feature sets, eliminates the set U2, and determines that the data feature set U1 after the strategy is executed is the target annotation set.
As can be seen, in this example, the electronic device sends the repeated picture information to the manual annotation client; the electronic equipment judges whether the first data feature set and the second data feature set are photo sets of similar objects or not through the manual labeling client, executes a preset repeated elimination strategy on the data feature set according to a judgment result, and determines that the data feature set after the repeated elimination strategy is executed is a target labeling set; the method is beneficial to reducing the occurrence of repeated data labeling, reducing data redundancy and improving the high efficiency of data labeling.
In one possible example, the electronic device executing a preset duplicate elimination strategy on the data feature set according to the judgment result, and determining that the data feature set after the strategy is executed is the target annotation set, includes: if the manual annotation client judges that the repeated photo in the first data feature set and the multiple photos to be processed in the second data feature set are photos of the same kind of object, the repeated photo is reclassified into the second data feature set; and the manual annotation client deletes the repeated photo from the first data feature set to determine the target annotation set.
The target annotation set comprises the second data feature set after the reclassification.
In a specific implementation, the manual annotation client judges that the repeated photo S10 in the first data feature set U3 and the multiple photos S14, S15 and S16 to be processed in the second data feature set U4 are photos of the same kind of object, and reclassifies the repeated photo S10 into the second data feature set U4; the repeated photo S10 is deleted from the first data feature set U3, and the target annotation set is determined.
As can be seen, in this example, if the manual annotation client judges that the repeated photo in the first data set and the multiple photos to be processed in the second data set are photos of the same kind of object, the repeated photo is reclassified into the second data feature set, and the manual annotation client deletes the repeated photo from the first data set to determine the target annotation set. This helps compensate for errors produced by the pre-annotation processing of the electronic device and improves the accuracy of data annotation.
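The reclassification strategy can be sketched in a few lines of Python. This is only an illustration, assuming each data set is modelled as a set of photo identifiers; the function name `reclassify_duplicate` and the identifiers S11 and S12 are hypothetical and do not come from the application, while S10 and S14 to S16 follow the example above.

```python
def reclassify_duplicate(first_set, second_set, duplicate):
    """Keep `duplicate` only in `second_set` (hypothetical helper)."""
    # Delete the repeated photo from the first data set.
    new_first = set(first_set) - {duplicate}
    # Reclassify the repeated photo into the second data feature set.
    new_second = set(second_set) | {duplicate}
    return new_first, new_second


# U3 and U4 both contain the repeated photo S10 (S11 and S12 are made up).
u3 = {"S10", "S11", "S12"}
u4 = {"S10", "S14", "S15", "S16"}
u3_after, u4_after = reclassify_duplicate(u3, u4, "S10")
```

In this sketch the judgment that S10 belongs with S14 to S16 is taken as given, since in the application it is made by a person at the manual annotation client rather than computed.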
In one possible example, the step in which the electronic device executes a preset duplicate elimination strategy on the data feature set according to the judgment result, and determines that the data feature set after the duplicate elimination strategy is executed is a target annotation set, includes: if the manual annotation client judges that the photos to be processed in the first data set and the photos to be processed in the second data set are photos of the same kind of object, the manual annotation client determines that the first data set and the second data set are repeated; and the manual annotation client merges the first data set and the second data set to determine the target annotation set.
Wherein the target annotation set comprises the second data feature set after the merging.
In a specific implementation, the manual annotation client determines that the multiple to-be-processed photos S17, S18, and S19 in the first data set U5 and the multiple to-be-processed photos S24, S25, and S26 in the second data set U6 are photos of similar objects, determines that the first data set and the second data set are repeated, merges the first data set U5 and the second data set U6, and determines a target annotation set.
As can be seen, in this example, when the manual annotation client judges that the photos to be processed in the first data set and the photos to be processed in the second data set are photos of the same kind of object, it determines that the first data set and the second data set are repeated and merges them to determine the target annotation set. This helps compensate for errors produced by the pre-annotation processing of the electronic device and improves the accuracy of data annotation.
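The merging strategy can be sketched in the same way. This is a minimal Python illustration, assuming again that a data set is a set of photo identifiers; `merge_duplicate_sets` and the shared identifier S20 are hypothetical, while S17 to S19 and S24 to S26 follow the example above.

```python
def merge_duplicate_sets(first_set, second_set):
    """Merge two data sets judged to show the same kind of object."""
    # A set union keeps a single copy of any photo present in both sets.
    return set(first_set) | set(second_set)


# S20 stands in for the repeated photo found in both U5 and U6 (made up).
u5 = {"S17", "S18", "S19", "S20"}
u6 = {"S20", "S24", "S25", "S26"}
target = merge_duplicate_sets(u5, u6)
```

Using a union means the repeated photo appears once in the merged set, so no separate deletion step is needed in this sketch.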
Referring to fig. 2, fig. 2 is a schematic flowchart of another data annotation method provided in the embodiment of the present application, and the method is applied to an electronic device, where the electronic device is in communication connection with a manual annotation client; as shown in the figure, the data annotation method comprises the following steps:
S201, the electronic device acquires photos to be processed and determines that the photos to be processed form an initial photo set;
S202, the electronic device extracts the feature information of each photo to be processed in the initial photo set;
S203, the electronic device classifies each photo to be processed by taking the feature information as an identifier, and determines a plurality of first feature sets;
S204, the electronic device sends the first feature sets and the feature information to the manual annotation client to determine a similar photo set;
S205, the electronic device determines the data feature set and the similar set through the manual annotation client according to the similar photo set;
S206, the electronic device determines a target annotation set through the manual annotation client according to the data feature set and the similar set;
S207, the electronic device performs data annotation on each photo to be processed in the target annotation set according to the feature information.
It can be seen that the embodiments of the present application provide a data annotation method and related products, applied to an electronic device that is in communication connection with a manual annotation client. The electronic device acquires photos to be processed and determines that they form an initial photo set; extracts feature information of each photo to be processed in the initial photo set; and determines, through the manual annotation client and according to the feature information, a data feature set and a similar set. The data feature set represents a set formed by a plurality of photos to be processed whose feature information differs by less than a first threshold; the similar set represents a set formed by photos to be processed whose feature information is similar to that of the photos in the first feature set but which are not in the first feature set. The electronic device then determines, through the manual annotation client and according to the data feature set and the similar set, a target annotation set, which represents the set in which each photo to be processed is located when it is annotated; the data annotated for each photo is associated with the set in which the photo is located. Finally, the electronic device performs data annotation on each photo to be processed in the target annotation set according to the feature information. By combining pre-annotation processing by the electronic device with manual classification, the annotation of a large number of face recognition pictures is completed at low labor cost while annotation errors produced in the process are reduced, which improves the quality, efficiency, and accuracy of data annotation.
In addition, the electronic equipment classifies each photo to be processed by taking the feature information as an identifier, a plurality of first feature sets are determined, then the electronic equipment sends the first feature sets and the feature information to the manual labeling client to determine similar photo sets, and then the electronic equipment determines the data feature sets and the similar sets through the manual labeling client according to the similar photo sets; the electronic equipment preliminarily marks the photo to be processed, so that the workload of subsequent manual marking is favorably reduced, and the high efficiency and convenience of data marking are favorably improved.
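The preliminary classification performed by the electronic device can be illustrated with a short Python sketch. The application does not say how the difference between feature information is computed, so everything specific here is an assumption: feature information is modelled as a tuple of floats, the difference is the Euclidean distance, and each photo is compared greedily against the first member of each existing group. The name `group_by_feature` and the sample values are hypothetical.

```python
import math


def group_by_feature(features, threshold):
    """Greedily group photo ids whose feature distance is below `threshold`.

    `features` maps a photo id to its feature vector. Each photo joins the
    first group whose first member is closer than `threshold`; otherwise it
    starts a new group. This is one simple reading of the "first threshold".
    """
    groups = []
    for photo_id, vector in features.items():
        for group in groups:
            representative = features[group[0]]
            if math.dist(vector, representative) < threshold:
                group.append(photo_id)
                break
        else:
            # No existing group is close enough: start a new first feature set.
            groups.append([photo_id])
    return groups


features = {"a": (0.0, 0.0), "b": (0.1, 0.0), "c": (5.0, 5.0)}
first_feature_sets = group_by_feature(features, threshold=1.0)
```

A greedy pass like this is order dependent and can misgroup borderline photos, which is consistent with the role the application gives to the manual annotation client in cleaning the resulting sets.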
Referring to fig. 3, fig. 3 is a schematic flowchart of another data annotation method provided in the embodiment of the present application, and the method is applied to an electronic device, where the electronic device is in communication connection with a manual annotation client; as shown in the figure, the data annotation method comprises the following steps:
S301, the electronic device acquires photos to be processed and determines that the photos to be processed form an initial photo set;
S302, the electronic device extracts the feature information of each photo to be processed in the initial photo set;
S303, the electronic device classifies each photo to be processed by taking the feature information as an identifier, and determines a plurality of first feature sets;
S304, the electronic device sends the first feature sets and the feature information to the manual annotation client to determine a similar photo set;
S305, the electronic device determines the data feature set and the similar set through the manual annotation client according to the similar photo set;
S306, after the electronic device acquires the data feature set and the similar set, it determines that the photos to be processed in the similar set and the photos to be processed in the data feature set are photos of the same kind of object, and moves the photos to be processed in the similar set into the data feature set;
S307, the electronic device queries whether any two data feature sets contain a repeated photo;
S308, if the electronic device determines that a repeated photo exists, it determines the repeated picture information;
S309, the electronic device processes the data feature sets through the manual annotation client according to the repeated picture information, and determines a target annotation set;
S310, the electronic device performs data annotation on each photo to be processed in the target annotation set according to the feature information.
It can be seen that the embodiments of the present application provide a data annotation method and related products, applied to an electronic device that is in communication connection with a manual annotation client. The electronic device acquires photos to be processed and determines that they form an initial photo set; extracts feature information of each photo to be processed in the initial photo set; and determines, through the manual annotation client and according to the feature information, a data feature set and a similar set. The data feature set represents a set formed by a plurality of photos to be processed whose feature information differs by less than a first threshold; the similar set represents a set formed by photos to be processed whose feature information is similar to that of the photos in the first feature set but which are not in the first feature set. The electronic device then determines, through the manual annotation client and according to the data feature set and the similar set, a target annotation set, which represents the set in which each photo to be processed is located when it is annotated; the data annotated for each photo is associated with the set in which the photo is located. Finally, the electronic device performs data annotation on each photo to be processed in the target annotation set according to the feature information. By combining pre-annotation processing by the electronic device with manual classification, the annotation of a large number of face recognition pictures is completed at low labor cost while annotation errors produced in the process are reduced, which improves the quality, efficiency, and accuracy of data annotation.
In addition, after the electronic equipment acquires the data feature set and the similar set, determining that the photos to be processed in the similar set and the photos to be processed in the data feature set are similar objects, moving the photos to be processed in the similar set into the data feature set, then inquiring whether repeated photos exist in any two data feature sets by the electronic equipment, if the repeated photos exist, determining repeated picture information, and finally processing the data feature set through the manual annotation client by the electronic equipment according to the repeated picture information to determine a target annotation set; the efficiency and the accuracy of data marking are improved.
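The query for repeated photos across any two data feature sets can be sketched as a pairwise intersection. This is a minimal Python illustration, assuming each data feature set is a named set of photo identifiers; the function `find_duplicates`, the tuple layout of the "repeated picture information", and the sample identifiers are all hypothetical.

```python
from itertools import combinations


def find_duplicates(feature_sets):
    """Return (photo, set_a, set_b) for every photo shared by two sets."""
    duplicates = []
    # Examine every unordered pair of data feature sets exactly once.
    for (name_a, set_a), (name_b, set_b) in combinations(feature_sets.items(), 2):
        # Photos present in both sets are the repeated photos.
        for photo in sorted(set_a & set_b):
            duplicates.append((photo, name_a, name_b))
    return duplicates


feature_sets = {
    "U1": {"S1", "S2"},
    "U2": {"S2", "S3"},
    "U3": {"S4"},
}
repeated_info = find_duplicates(feature_sets)
```

Each returned tuple names the repeated photo together with the two sets that contain it, which is the information a manual annotation client would need in order to decide between reclassifying the photo and merging the sets.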
Consistent with the embodiments shown in fig. 1, fig. 2, and fig. 3, please refer to fig. 4, which is a schematic structural diagram of an electronic device 400 provided in an embodiment of the present application. As shown in the figure, the electronic device 400 includes an application processor 410, a memory 420, a communication interface 430, and one or more programs 421, where the one or more programs 421 are stored in the memory 420 and configured to be executed by the application processor 410, and the one or more programs 421 include instructions for performing the following steps:
acquiring photos to be processed, and determining the photos to be processed to form an initial photo set;
extracting feature information of each photo to be processed in the initial photo set;
determining a data feature set and a similar set through the manual labeling client according to the feature information, wherein the data feature set is used for representing a set formed by a plurality of photos to be processed, the difference values of which are smaller than a first threshold value, and the similar set is used for representing a set formed by the photos to be processed, which are similar to the feature information of the photos to be processed in the first feature set and do not exist in the first feature set;
determining a target labeling set through the manual labeling client according to the data feature set and the similar set, wherein the target labeling set is used for representing a set where each photo to be processed is subjected to data labeling, and labeled data when each photo to be processed is subjected to data labeling is associated with the set where the photo to be processed is located;
and performing data annotation on each photo to be processed in the target annotation set according to the characteristic information.
It can be seen that the embodiments of the present application provide a data annotation method and related products, applied to an electronic device that is in communication connection with a manual annotation client. The electronic device acquires photos to be processed and determines that they form an initial photo set; extracts feature information of each photo to be processed in the initial photo set; and determines, through the manual annotation client and according to the feature information, a data feature set and a similar set. The data feature set represents a set formed by a plurality of photos to be processed whose feature information differs by less than a first threshold; the similar set represents a set formed by photos to be processed whose feature information is similar to that of the photos in the first feature set but which are not in the first feature set. The electronic device then determines, through the manual annotation client and according to the data feature set and the similar set, a target annotation set, which represents the set in which each photo to be processed is located when it is annotated; the data annotated for each photo is associated with the set in which the photo is located. Finally, the electronic device performs data annotation on each photo to be processed in the target annotation set according to the feature information. By combining pre-annotation processing by the electronic device with manual classification, the annotation of a large number of face recognition pictures is completed at low labor cost while annotation errors produced in the process are reduced, which improves the quality, efficiency, and accuracy of data annotation.
In a possible example, the determining, by the manual tagging client, the data feature set and the similar set according to the feature information includes: classifying each photo to be processed by taking the feature information as an identifier, and determining a plurality of first feature sets, wherein each first feature set consists of a plurality of photos to be processed, and the difference value of the feature information is smaller than a first threshold value; sending the first feature set and the feature information to the manual labeling client, and determining a similar photo set, wherein the similar photo set consists of the to-be-processed pictures which have a difference value smaller than a first threshold value and do not exist in the first feature set, and the similar photo set is in one-to-one correspondence with the first feature set; and determining the data feature set and the similar set through the manual labeling client according to the similar photo set.
In one possible example, the determining, by the manual annotation client, the data feature set and the similarity set according to the set of similar photos includes: acquiring a data feature set after data cleaning is carried out on each first feature set, wherein the data feature set is obtained by carrying out data cleaning on non-homogeneous objects in each first feature set by the manual labeling client; and acquiring a similar set obtained after data cleaning is carried out on each similar picture set, wherein the similar set is obtained by carrying out data cleaning on non-homogeneous objects in each similar picture set by the manual labeling client, and the similar sets correspond to the data feature sets one to one.
In a possible example, the determining, by the manual annotation client, a target annotation set according to the data feature set and the similarity set includes: after the data feature set and the similar set are obtained, determining that the photo to be processed in the similar set and the photo to be processed in the data feature set are similar objects, and moving the photo to be processed in the similar set into the data feature set; querying whether any two data feature sets have repeated photos; if the repeated pictures exist, determining the information of the repeated pictures; and processing the data feature set through the manual labeling client according to the repeated picture information to determine a target labeling set.
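The first of these steps, moving the photos of a similar set into its corresponding data feature set, can be sketched as follows. This is a Python illustration under stated assumptions: the one-to-one correspondence between similar sets and data feature sets is modelled by a shared dictionary key, and the `confirmed` mapping is a stand-in for the judgment that the photos show the same kind of object. The name `absorb_similar` and all identifiers are hypothetical.

```python
def absorb_similar(data_feature_sets, similar_sets, confirmed):
    """Move each confirmed similar set into its matching data feature set."""
    merged = {}
    for key, photos in data_feature_sets.items():
        merged[key] = set(photos)
        # Only absorb the similar set when it was judged to show the same
        # kind of object as the data feature set it corresponds to.
        if confirmed.get(key, False):
            merged[key] |= set(similar_sets.get(key, ()))
    return merged


data_sets = {"U1": {"S1", "S2"}, "U2": {"S5"}}
similar = {"U1": {"S3"}, "U2": {"S6"}}
result = absorb_similar(data_sets, similar, {"U1": True, "U2": False})
```

The function returns new sets rather than modifying its inputs, so the state before the move remains available for the later duplicate query.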
In a possible example, in terms of processing the data feature set through the manual annotation client according to the repeated picture information to determine a target annotation set, the instructions in the program are specifically configured to perform the following operations: sending the repeated picture information to the manual annotation client, wherein the repeated picture information is used by the manual annotation client to determine a first data feature set and a second data feature set, and the repeated picture exists in both the first data feature set and the second data feature set; judging, through the manual annotation client, whether the first data feature set and the second data feature set are photo sets of the same kind of object; and executing a preset duplicate elimination strategy on the data feature set according to the judgment result, and determining that the data feature set after the duplicate elimination strategy is executed is the target annotation set.
In a possible example, in terms of executing a preset duplicate elimination strategy on the data feature set according to the judgment result, and determining that the data feature set after the duplicate elimination strategy is executed is a target annotation set, the instructions in the program are specifically configured to perform the following operations: if the manual annotation client judges that the repeated photo in the first data set and the multiple photos to be processed in the second data set are photos of the same kind of object, the repeated photo is reclassified into the second data feature set; and the manual annotation client deletes the repeated photo from the first data set and determines the target annotation set, wherein the target annotation set comprises the reclassified second data feature set.
In a possible example, in terms of executing a preset duplicate elimination strategy on the data feature set according to the judgment result, and determining that the data feature set after the duplicate elimination strategy is executed is a target annotation set, the instructions in the program are specifically configured to perform the following operations: if the manual annotation client judges that the photos to be processed in the first data set and the photos to be processed in the second data set are photos of the same kind of object, it is determined that the first data set and the second data set are repeated; and the manual annotation client merges the first data set and the second data set to determine the target annotation set, wherein the target annotation set comprises the second data feature set after merging.
The above description has introduced the solution of the embodiment of the present application mainly from the perspective of the method-side implementation process. It is understood that, in order to realize the above-mentioned functions, the electronic device comprises corresponding hardware structures and/or software modules for performing the respective functions. Those of skill in the art will readily appreciate that the illustrative units and algorithm steps described in connection with the embodiments provided herein can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends upon the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiment of the present application, the electronic device may be divided into the functional units according to the method example, for example, each functional unit may be divided corresponding to each function, or two or more functions may be integrated into one processing unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit. It should be noted that the division of the unit in the embodiment of the present application is schematic, and is only a logic function division, and there may be another division manner in actual implementation.
Fig. 5 is a block diagram of functional units of a data annotation device 500 according to an embodiment of the present application. The data annotation device 500 is applied to electronic equipment which is in communication connection with a manual annotation client; the data annotation device 500 comprises a processing unit 501, a communication unit 502 and a storage unit 503, wherein,
the processing unit 501 is configured to: acquire photos to be processed and determine that the photos to be processed form an initial photo set; extract the feature information of each photo to be processed in the initial photo set; determine, through the manual annotation client and according to the feature information, a data feature set and a similar set, wherein the data feature set is used for representing a set formed by a plurality of photos to be processed whose feature information differs by less than a first threshold, and the similar set is used for representing a set formed by the photos to be processed which are similar to the feature information of the photos to be processed in the first feature set and do not exist in the first feature set; determine, through the manual annotation client and according to the data feature set and the similar set, a target annotation set, wherein the target annotation set is used for representing the set in which each photo to be processed is located when it is subjected to data annotation, and the data annotated for each photo to be processed is associated with the set in which the photo is located; and perform data annotation on each photo to be processed in the target annotation set according to the feature information.
It can be seen that the embodiments of the present application provide a data annotation method and related products, applied to an electronic device that is in communication connection with a manual annotation client. The electronic device acquires photos to be processed and determines that they form an initial photo set; extracts feature information of each photo to be processed in the initial photo set; and determines, through the manual annotation client and according to the feature information, a data feature set and a similar set. The data feature set represents a set formed by a plurality of photos to be processed whose feature information differs by less than a first threshold; the similar set represents a set formed by photos to be processed whose feature information is similar to that of the photos in the first feature set but which are not in the first feature set. The electronic device then determines, through the manual annotation client and according to the data feature set and the similar set, a target annotation set, which represents the set in which each photo to be processed is located when it is annotated; the data annotated for each photo is associated with the set in which the photo is located. Finally, the electronic device performs data annotation on each photo to be processed in the target annotation set according to the feature information. By combining pre-annotation processing by the electronic device with manual classification, the annotation of a large number of face recognition pictures is completed at low labor cost while annotation errors produced in the process are reduced, which improves the quality, efficiency, and accuracy of data annotation.
It can be understood that, since the method embodiment and the apparatus embodiment are different presentation forms of the same technical concept, the content of the method embodiment portion in the present application should be synchronously adapted to the apparatus embodiment portion, and is not described herein again.
In a possible example, in terms of determining a data feature set and a similar set through the manual annotation client according to the feature information, the processing unit 501 is specifically configured to: classify each photo to be processed by taking the feature information as an identifier, and determine a plurality of first feature sets, wherein each first feature set consists of a plurality of photos to be processed whose feature information differs by less than a first threshold; send the first feature set and the feature information to the manual annotation client, and determine a similar photo set, wherein the similar photo set consists of the photos to be processed which have a difference value smaller than the first threshold and do not exist in the first feature set, and the similar photo sets are in one-to-one correspondence with the first feature sets; and determine the data feature set and the similar set through the manual annotation client according to the similar photo set.
In a possible example, in terms of determining the data feature set and the similar set through the manual annotation client according to the similar photo set, the processing unit 501 is specifically configured to: acquire the data feature set obtained after data cleaning is carried out on each first feature set, wherein the data feature set is obtained by the manual annotation client cleaning out the non-homogeneous objects in each first feature set; and acquire the similar set obtained after data cleaning is carried out on each similar photo set, wherein the similar set is obtained by the manual annotation client cleaning out the non-homogeneous objects in each similar photo set, and the similar sets correspond to the data feature sets one to one.
In a possible example, in terms of determining a target annotation set through the manual annotation client according to the data feature set and the similar set, the processing unit 501 is specifically configured to: after the data feature set and the similar set are obtained, determine that the photos to be processed in the similar set and the photos to be processed in the data feature set are photos of the same kind of object, and move the photos to be processed in the similar set into the data feature set; query whether any two data feature sets contain a repeated photo; if a repeated photo exists, determine the repeated picture information; and process the data feature set through the manual annotation client according to the repeated picture information to determine the target annotation set.
In a possible example, in terms of processing the data feature set through the manual annotation client according to the repeated picture information to determine a target annotation set, the processing unit 501 is specifically configured to: send the repeated picture information to the manual annotation client, wherein the repeated picture information is used by the manual annotation client to determine a first data feature set and a second data feature set, and the repeated picture exists in both the first data feature set and the second data feature set; judge, through the manual annotation client, whether the first data feature set and the second data feature set are photo sets of the same kind of object; and execute a preset duplicate elimination strategy on the data feature set according to the judgment result, and determine that the data feature set after the duplicate elimination strategy is executed is the target annotation set.
In a possible example, in terms of executing a preset duplicate elimination strategy on the data feature set according to the judgment result, and determining that the data feature set after the duplicate elimination strategy is executed is a target annotation set, the processing unit 501 is specifically configured such that: if the manual annotation client judges that the repeated photo in the first data set and the multiple photos to be processed in the second data set are photos of the same kind of object, the repeated photo is reclassified into the second data feature set; and the manual annotation client deletes the repeated photo from the first data set and determines the target annotation set, wherein the target annotation set comprises the reclassified second data feature set.
In a possible example, in terms of executing a preset duplicate elimination strategy on the data feature set according to the judgment result, and determining that the data feature set after the duplicate elimination strategy is executed is a target annotation set, the processing unit 501 is specifically configured such that: if the manual annotation client judges that the photos to be processed in the first data set and the photos to be processed in the second data set are photos of the same kind of object, it is determined that the first data set and the second data set are repeated; and the manual annotation client merges the first data set and the second data set to determine the target annotation set, wherein the target annotation set comprises the second data feature set after merging.
Embodiments of the present application also provide a computer storage medium, where the computer storage medium stores a computer program for electronic data exchange, the computer program enabling a computer to execute part or all of the steps of any one of the methods described in the above method embodiments, and the computer includes an electronic device.
Embodiments of the present application also provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps of any of the methods as described in the above method embodiments. The computer program product may be a software installation package, the computer comprising an electronic device.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. The apparatus embodiments described above are merely illustrative. For example, the division of the units is only a division by logical function, and other divisions are possible in practice: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical or in another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable memory. Based on this understanding, the technical solution of the present application in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product that is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, an electronic device, or a network device) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned memory includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and various other media capable of storing program code.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing the associated hardware. The program may be stored in a computer-readable memory, which may include: flash memory disks, read-only memories (ROMs), random access memories (RAMs), magnetic disks, optical disks, and the like.
The foregoing detailed description of the embodiments of the present application has been presented to illustrate the principles and implementations of the present application, and the above description of the embodiments is only provided to help understand the method and the core concept of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (10)

1. A method for annotating data, the method comprising:
acquiring photos to be processed, and determining the photos to be processed to form an initial photo set;
extracting feature information of each photo to be processed in the initial photo set;
determining a data feature set and a similar set according to the feature information, wherein the data feature set is used for representing a set formed by a plurality of photos to be processed whose feature information differs by less than a first threshold value, and the similar set is used for representing a set formed by the photos to be processed whose feature information is similar to that of the photos to be processed in the data feature set and which do not exist in the data feature set;
determining a target labeling set according to the data feature set and the similar set, wherein the target labeling set is used for representing a set where each photo to be processed is subjected to data labeling, and labeled data of each photo to be processed when the data labeling is carried out is associated with the set where the photo to be processed is located;
and performing data annotation on each photo to be processed in the target annotation set according to the characteristic information.
2. The method of claim 1, wherein determining a data feature set and a similarity set according to the feature information comprises:
classifying each photo to be processed by taking the feature information as an identifier, and determining a plurality of first feature sets, wherein each first feature set consists of a plurality of photos to be processed, and the difference value of the feature information is smaller than a first threshold value;
sending the first feature set and the feature information to a manual annotation client, wherein the first feature set and the feature information are used for the manual annotation client to determine a similar photo set, the similar photo set is composed of the to-be-processed pictures which have a difference value smaller than a first threshold value and do not exist in the first feature set, and the similar photo set is in one-to-one correspondence with the first feature set;
and determining the data feature set and the similar set through the manual labeling client according to the similar photo set.
3. The method of claim 2, wherein the determining, by the human annotation client, the set of data features and the similar set according to the set of similar photos comprises:
acquiring a data feature set after data cleaning is carried out on each first feature set, wherein the data feature set is obtained by carrying out data cleaning on non-homogeneous objects in each first feature set by the manual labeling client;
and acquiring a similar set obtained after data cleaning is carried out on each similar picture set, wherein the similar set is obtained by carrying out data cleaning on non-homogeneous objects in each similar picture set by the manual labeling client, and the similar sets correspond to the data feature sets one to one.
4. The method of claim 3, wherein the determining, by the manual annotation client, a target annotation set according to the data feature set and the similarity set comprises:
after the data feature set and the similar set are obtained, determining that the photo to be processed in the similar set and the photo to be processed in the data feature set are similar objects, and moving the photo to be processed in the similar set into the data feature set;
querying whether any two data feature sets have repeated photos;
if the repeated pictures exist, determining the information of the repeated pictures;
and processing the data feature set through the manual labeling client according to the repeated picture information to determine a target labeling set.
5. The method of claim 4, wherein the processing, by the manual annotation client, the data feature set according to the repeated picture information to determine a target annotation set comprises:
sending the repeated picture information to the manual annotation client, wherein the repeated picture information is used for the manual annotation client to determine a first data feature set and a second data feature set, and the repeated pictures exist in the first data feature set and the second data feature set at the same time;
judging whether the first data feature set and the second data feature set are photo sets of similar objects or not through the manual labeling client;
and executing a preset repeated elimination strategy on the data feature set according to the judgment result, and determining that the data feature set after executing the repeated elimination strategy is a target labeling set.
6. The method according to claim 5, wherein the executing a preset repeated elimination strategy on the data feature set according to the judgment result, and determining that the data feature set after executing the repeated elimination strategy is a target labeling set, comprises:
the manual labeling client judges that the repeated photos in the first data set and the multiple photos to be processed in the second data set are photos of the same kind of objects, and then reclassifies the repeated photos into the second data feature set;
and deleting the repeated photos in the first data set by the manual labeling client, and determining a target labeling set, wherein the target labeling set comprises the reclassified second data feature set.
7. The method according to claim 5, wherein the executing a preset repeated elimination strategy on the data feature set according to the judgment result, and determining that the data feature set after executing the repeated elimination strategy is a target labeling set, comprises:
the manual labeling client judges that the photos to be processed in the first data set and the photos to be processed in the second data set are photos of the same kind of objects, and determines that the first data set and the second data set are repeated;
and the manual annotation client merges the first data set and the second data set to determine a target annotation set, wherein the target annotation set comprises the second data feature set after merging.
8. A data annotation device, applied to an electronic device, the electronic device being in communication connection with a manual annotation client; the data annotation device comprises a processing unit, a communication unit and a storage unit, wherein,
the processing unit is configured to: acquire photos to be processed and determine that the photos to be processed form an initial photo set; extract feature information of each photo to be processed in the initial photo set; determine a data feature set and a similar set according to the feature information, wherein the data feature set is used for representing a set formed by a plurality of photos to be processed, the difference value of which is smaller than a first threshold value, and the similar set is used for representing a set formed by the photos to be processed which are similar to the feature information of the photos to be processed in the first feature set and do not exist in the first feature set; determine, through the manual annotation client, a target annotation set according to the data feature set and the similar set, wherein the target annotation set is used for representing the set in which each photo to be processed is located when it is subjected to data annotation, and the data annotated for each photo to be processed is associated with the set in which the photo to be processed is located; and perform data annotation on each photo to be processed in the target annotation set according to the feature information.
9. An electronic device comprising a processor, a memory, a communication interface, and one or more programs stored in the memory and configured to be executed by the processor, the programs comprising instructions for performing the steps in the method of any of claims 1-7.
10. A computer-readable storage medium, characterized in that a computer program for electronic data exchange is stored, wherein the computer program causes a computer to perform the method according to any one of claims 1-7.
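
For illustration only, the grouping step of claims 1 and 2 and the repeated-photo query of claim 4 might be sketched as follows. The claims do not specify how the difference value is computed or how photos are grouped; the Euclidean distance, the greedy single-pass grouping, and the names `group_photos` and `find_repeated_photos` are all assumptions of this sketch, not the applicant's implementation.

```python
from __future__ import annotations

import math
from itertools import combinations


def group_photos(
    features: dict[str, tuple[float, ...]], threshold: float
) -> list[set[str]]:
    """Greedily group photos whose feature vectors differ by less than
    `threshold` from a group's first member (assumed metric: Euclidean)."""
    groups: list[tuple[tuple[float, ...], set[str]]] = []
    for photo_id, vector in features.items():
        for representative, members in groups:
            if math.dist(representative, vector) < threshold:
                members.add(photo_id)
                break
        else:
            # No existing group is close enough: start a new one.
            groups.append((vector, {photo_id}))
    return [members for _, members in groups]


def find_repeated_photos(
    feature_sets: list[set[str]],
) -> dict[tuple[int, int], set[str]]:
    """Return, for every pair of sets sharing photos, the shared photos."""
    repeated: dict[tuple[int, int], set[str]] = {}
    for (i, a), (j, b) in combinations(enumerate(feature_sets), 2):
        shared = a & b
        if shared:
            repeated[(i, j)] = shared
    return repeated


features = {"p1": (0.0, 0.0), "p2": (0.1, 0.0), "p3": (5.0, 5.0)}
first_feature_sets = group_photos(features, threshold=0.5)
repeats = find_repeated_photos([{"a", "b"}, {"b", "c"}, {"d"}])
```

In this sketch the repeated-photo result is keyed by the indices of the two sets involved, which is one possible form of the repeated picture information that is sent to the manual annotation client.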
CN201911418643.7A 2019-12-31 2019-12-31 Data annotation method and related product Pending CN113127668A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911418643.7A CN113127668A (en) 2019-12-31 2019-12-31 Data annotation method and related product

Publications (1)

Publication Number Publication Date
CN113127668A true CN113127668A (en) 2021-07-16

Family

ID=76769358

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911418643.7A Pending CN113127668A (en) 2019-12-31 2019-12-31 Data annotation method and related product

Country Status (1)

Country Link
CN (1) CN113127668A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103257954A (en) * 2013-06-05 2013-08-21 北京百度网讯科技有限公司 Proofreading method, system and proofreading server of characters in ancient book
CN105654039A (en) * 2015-12-24 2016-06-08 小米科技有限责任公司 Image processing method and device
CN109657087A (en) * 2018-11-30 2019-04-19 平安科技(深圳)有限公司 A kind of batch data mask method, device and computer readable storage medium
CN110046586A (en) * 2019-04-19 2019-07-23 腾讯科技(深圳)有限公司 A kind of data processing method, equipment and storage medium
CN110175549A (en) * 2019-05-20 2019-08-27 腾讯科技(深圳)有限公司 Face image processing process, device, equipment and storage medium
CN110472460A (en) * 2018-05-11 2019-11-19 北京京东尚科信息技术有限公司 Face image processing process and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210716