CN111259251A

CN111259251A - Method and device for recommending annotation task

Info

Publication number: CN111259251A
Application number: CN202010073209.6A
Authority: CN
Inventors: 张晴晴; 杨金富; 罗磊; 段由; 马光谦; 汪洋
Original assignee: Beijing Aishu Wisdom Technology Co ltd
Current assignee: Beijing Aishu Wisdom Technology Co ltd
Priority date: 2020-01-21
Filing date: 2020-01-21
Publication date: 2020-06-09

Abstract

The invention discloses a method and a device for recommending an annotation task. The method comprises: acquiring feature information of a voice annotation task and user portraits of a plurality of annotators; and recommending the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task. By establishing a user portrait system and a recommendation system, the invention tags user characteristics and applies user portraits and recommendation to the field of data annotation, which greatly improves annotation efficiency and personnel integration efficiency. In addition, the recommendation system combines two algorithms by waterfall fusion, which can improve recommendation accuracy.

Description

Method and device for recommending annotation task

Technical Field

The invention relates to the field of data annotation, in particular to a method and a device for recommending an annotation task.

Background

With the development of information technology, user portraits and recommendation systems have become common in the e-commerce field, where user characteristics can be tagged in detail to construct a user portrait. Once a user portrait that meets a given goal has been generated, part of the tag data is used for recommendation, association analysis, and collaborative filtering.

In the field of data annotation, voice annotation tasks are mainly completed by outsourcing teams, and task assignment relies mainly on team profiles and simple screening; for example, some English annotation projects require annotators to have a certain level of English. As demand in the voice annotation market grows and data annotation becomes more refined and specialized, the existing task distribution method can no longer put every resource to its best use. If a task is simply handed to an annotation team under the existing distribution method, annotation efficiency drops sharply and the rework rate rises. It is therefore very important to find the annotators best suited to each annotation task.

Disclosure of Invention

The invention provides a method and a device for recommending an annotation task, which aim to overcome the defects of the prior art, namely greatly reduced annotation efficiency and an increased rework rate.

The invention provides a method for recommending an annotation task, comprising the following steps:

acquiring feature information of a voice annotation task and user portraits of a plurality of annotators;

and recommending the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task.

Optionally, the user portrait comprises: age, native place, educational background, foreign language level, professional background, long-speech annotation capability, noise discrimination capability, and recent idle time.

Optionally, the method further includes:

recording deep information of each annotator during the voice annotation process, and updating the user portrait of the annotator according to the deep information.

Optionally, the deep information includes historical performance, language expertise, and scene expertise.

Optionally, recommending the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task includes:

based on the feature information and the user portraits, recommending the annotation task to an annotator among the plurality of annotators who matches the voice annotation task by using a recommendation algorithm obtained by waterfall fusion of two algorithms; wherein the two algorithms are an item-based collaborative filtering algorithm and a user-based collaborative filtering algorithm.

The invention also provides a device for recommending an annotation task, comprising:

an acquisition module, configured to acquire feature information of a voice annotation task and user portraits of a plurality of annotators;

and a recommending module, configured to recommend the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task.

Optionally, the apparatus further includes:

an updating module, configured to record deep information of each annotator during the voice annotation process and to update the user portrait of the annotator according to the deep information.

Optionally, the recommending module is specifically configured to recommend the annotation task, based on the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task by using a recommendation algorithm obtained by waterfall fusion of two algorithms; wherein the two algorithms are an item-based collaborative filtering algorithm and a user-based collaborative filtering algorithm.

By establishing a user portrait system and a recommendation system, the invention tags user characteristics and applies user portraits and recommendation to the field of data annotation, which greatly improves annotation efficiency and personnel integration efficiency. In addition, the recommendation system combines two algorithms by waterfall fusion, which can improve recommendation accuracy.

Drawings

FIG. 1 is a flowchart of a method for recommending annotation tasks in an embodiment of the present invention;

FIG. 2 is a structural diagram of a device for recommending annotation tasks in an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

An embodiment of the invention provides a method for recommending an annotation task, comprising the following steps:

Step 101, acquiring feature information of a voice annotation task and user portraits of a plurality of annotators;

wherein the user portrait may include: age, native place, educational background, foreign language level, professional background, long-speech annotation capability, noise discrimination capability, and recent idle time. After a user portrait is created, it may be periodically maintained and updated.

In this embodiment, an initial user portrait including age, native place, educational background, English level, professional background, and so on may be created from the information each annotator fills in when joining the system. During the voice annotation process, deep information of each annotator is recorded, and the user portrait of the annotator is updated according to the deep information.

The deep information may include historical performance, language expertise, scene expertise, and the like.
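As an illustration only, the user portrait and its update from deep information might be sketched as follows. The field names follow the list above, but the types, the 0-100 score scale, the `update_portrait` helper, and the simple averaging rule are all assumptions made for this sketch and are not specified in the source.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UserPortrait:
    """Hypothetical portrait record; types and scales are assumptions."""
    annotator_id: str
    age: int
    native_place: str
    education: str
    foreign_language_level: str
    professional_background: str
    long_speech_score: float = 0.0           # assumed 0-100 scale
    noise_discrimination_score: float = 0.0  # assumed 0-100 scale
    idle_days: int = 0                       # recent idle time, in days
    # Deep information accumulated while the annotator works on tasks.
    historical_performance: List[float] = field(default_factory=list)
    language_expertise: Dict[str, float] = field(default_factory=dict)
    scene_expertise: Dict[str, float] = field(default_factory=dict)


def update_portrait(portrait: UserPortrait, task_score: float,
                    language: str, scene: str) -> None:
    """Fold the result of one finished task into the portrait.

    The averaging rule is an assumption: a new score is averaged with the
    previous value for that language or scene, or stored as-is if none exists.
    """
    portrait.historical_performance.append(task_score)
    for table, key in ((portrait.language_expertise, language),
                       (portrait.scene_expertise, scene)):
        previous = table.get(key)
        table[key] = task_score if previous is None else (previous + task_score) / 2
```

Keeping the deep information in separate per-language and per-scene tables lets a later matching step query expertise for exactly the language and scene a new task requires.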

Step 102, recommending the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task.

Specifically, based on the feature information and the user portraits, a recommendation algorithm obtained by waterfall fusion of two algorithms may be used to recommend the annotation task to an annotator among the plurality of annotators who matches the voice annotation task. Waterfall fusion connects multiple models in series, and the two algorithms are an item-based collaborative filtering algorithm and a user-based collaborative filtering algorithm.
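One possible reading of this serial arrangement is sketched below. The source does not specify the order of the two stages, the similarity measure, or the form of the historical data, so the following are assumptions: past quality scores are stored per annotator and task, cosine similarity is used, item-based collaborative filtering produces a shortlist, and user-based collaborative filtering re-ranks it. All names and the sample data are hypothetical.

```python
from math import sqrt
from typing import Dict, List

Ratings = Dict[str, Dict[str, float]]  # annotator -> {task -> quality score}

# Hypothetical history: "t_new" is the task to be recommended.
EXAMPLE_RATINGS: Ratings = {
    "ann_a": {"t1": 90.0, "t2": 85.0, "t_new": 95.0},
    "ann_b": {"t1": 88.0, "t2": 80.0},
    "ann_c": {"t1": 40.0, "t2": 90.0},
    "ann_d": {"t2": 50.0, "t_new": 40.0},
}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def item_based_score(r: Ratings, user: str, task: str) -> float:
    """Predict a score from tasks similar to `task` that `user` has done."""
    def column(t: str) -> Dict[str, float]:
        return {u: scores[t] for u, scores in r.items() if t in scores}

    target = column(task)
    num = den = 0.0
    for other_task, score in r[user].items():
        sim = cosine(target, column(other_task))
        num += sim * score
        den += sim
    return num / den if den else 0.0


def user_based_score(r: Ratings, user: str, task: str) -> float:
    """Predict a score from similar annotators who have done `task`."""
    num = den = 0.0
    for other, scores in r.items():
        if other == user or task not in scores:
            continue
        sim = cosine(r[user], scores)
        num += sim * scores[task]
        den += sim
    return num / den if den else 0.0


def waterfall_recommend(r: Ratings, task: str,
                        shortlist_size: int, top_n: int) -> List[str]:
    """Run the two models in series: stage 1 filters, stage 2 re-ranks."""
    candidates = [u for u in r if task not in r[u]]
    # Stage 1 (assumed first): item-based CF keeps a coarse shortlist.
    shortlist = sorted(candidates,
                       key=lambda u: item_based_score(r, u, task),
                       reverse=True)[:shortlist_size]
    # Stage 2: user-based CF re-ranks only the shortlisted annotators.
    return sorted(shortlist,
                  key=lambda u: user_based_score(r, u, task),
                  reverse=True)[:top_n]
```

A call such as `waterfall_recommend(EXAMPLE_RATINGS, "t_new", shortlist_size=2, top_n=1)` considers only annotators who have not yet worked on `t_new`. Because the second model only sees the output of the first, each stage narrows the candidate set, which is the defining property of a serial (waterfall) fusion.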

For example, once a large user portrait system has been established, when a special project arrives (for example, a military-related project that mixes Chinese and English, contains long recordings and heavy noise, and is very urgent), the system selects a large number of annotators who are interested in military affairs, have a certain level of English, have a historical long-speech annotation score above 80 points and a noise discrimination score above 90 points, and have one week of idle time. The annotation project is recommended to these users, and if the annotators agree to take part, the project manager assigns the tasks centrally.
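The selection in this example can be read as a conjunction of threshold checks over portrait fields. The sketch below uses the thresholds from the example (80 points, 90 points, one week); the dictionary keys, the interest tags, and the boolean English flag are hypothetical stand-ins, since the source does not define how these attributes are stored.

```python
from typing import Dict, List


def shortlist_annotators(portraits: List[Dict], interest: str,
                         min_long_speech: float = 80.0,
                         min_noise: float = 90.0,
                         min_idle_days: int = 7) -> List[str]:
    """Return IDs of annotators whose portrait meets every requirement.

    Default thresholds come from the example in the description; all
    dictionary keys are assumptions made for this sketch.
    """
    return [
        p["annotator_id"]
        for p in portraits
        if interest in p["interests"]
        and p["has_english"]
        and p["long_speech_score"] > min_long_speech
        and p["noise_score"] > min_noise
        and p["idle_days"] >= min_idle_days
    ]
```

Annotators returned by such a filter would then receive the recommendation, and those who accept are scheduled by the project manager.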

By establishing a user portrait system and a recommendation system, this embodiment tags user characteristics and applies user portraits and recommendation to the field of data annotation, which greatly improves annotation efficiency and personnel integration efficiency. In addition, the recommendation system combines two algorithms by waterfall fusion, which can improve recommendation accuracy.

Based on the above method for recommending annotation tasks, an embodiment of the present invention further provides a device for recommending annotation tasks, as shown in FIG. 2, including:

an obtaining module 210, configured to obtain feature information of a voice annotation task and user portraits of a plurality of annotators;

wherein the user portrait comprises: age, native place, educational background, foreign language level, professional background, long-speech annotation capability, noise discrimination capability, and recent idle time.

a recommending module 220, configured to recommend the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task.

Specifically, the recommending module 220 is configured to recommend the annotation task, based on the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task by using a recommendation algorithm obtained by waterfall fusion of two algorithms; wherein the two algorithms are an item-based collaborative filtering algorithm and a user-based collaborative filtering algorithm.

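Further, the above apparatus also comprises:

an updating module, configured to record deep information of each annotator during the voice annotation process and to update the user portrait of the annotator according to the deep information.

The deep information comprises historical performance, language expertise, and scene expertise.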

The steps of a method described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A method for recommending annotation tasks, comprising the steps of:

2. The method of claim 1, wherein the user portrait comprises: age, native place, educational background, foreign language level, professional background, long-speech annotation capability, noise discrimination capability, and recent idle time.

3. The method of claim 1, further comprising:

4. The method of claim 3, wherein the deep information includes historical performance, language expertise, and scene expertise.

5. The method of claim 1, wherein recommending the annotation task, according to the feature information and the user portraits, to the annotator among the plurality of annotators who matches the voice annotation task comprises:

6. An apparatus for recommending annotation tasks, comprising:

7. The apparatus of claim 6, wherein the user portrait comprises: age, native place, educational background, foreign language level, professional background, long-speech annotation capability, noise discrimination capability, and recent idle time.

8. The apparatus of claim 6, further comprising:

9. The apparatus of claim 8, wherein the deep information comprises historical performance, language expertise, and scene expertise.

10. The apparatus of claim 6,

the recommending module is specifically configured to recommend the annotation task, based on the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task by using a recommendation algorithm obtained by waterfall fusion of two algorithms; wherein the two algorithms are an item-based collaborative filtering algorithm and a user-based collaborative filtering algorithm.