CN111259251A - Method and device for recommending annotation task - Google Patents

Method and device for recommending annotation task

Info

Publication number
CN111259251A
Authority
CN
China
Prior art keywords
recommending
user
task
annotation
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010073209.6A
Other languages
Chinese (zh)
Inventor
张晴晴
杨金富
罗磊
段由
马光谦
汪洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Aishu Wisdom Technology Co ltd
Original Assignee
Beijing Aishu Wisdom Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-01-21
Filing date
2020-01-21
Publication date
2020-06-09
Application filed by Beijing Aishu Wisdom Technology Co ltd
Priority to CN202010073209.6A
Publication of CN111259251A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations

Abstract

The invention discloses a method and a device for recommending an annotation task. The method comprises: acquiring feature information of a voice annotation task and user portraits of a plurality of annotators; and recommending the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task. By establishing a user portrait system and a recommendation system, the invention tags user characteristics and applies user portraits and recommendation to the field of data annotation, thereby greatly improving annotation efficiency and personnel integration efficiency. In addition, the recommendation system combines two algorithms by waterfall fusion, which can improve recommendation accuracy.

Description

Method and device for recommending annotation task
Technical Field
The invention relates to the field of data annotation, in particular to a method and a device for recommending an annotation task.
Background
With the development of information technology, user portraits and recommendation systems have become common in the e-commerce field, where user features are tagged in detail to construct a user portrait. Once a user portrait serving a given goal has been generated, part of the tag data is used for recommendation, association analysis and collaborative filtering.
In the field of data annotation, voice annotation tasks are mainly completed by outsourced teams, and task assignment relies mainly on team profiles and simple screening; for example, some English annotation projects require annotators to have a certain level of English. As demand in the voice annotation market grows and data annotation becomes more fine-grained and specialized, the existing task distribution method cannot fully achieve the goal of "making the best use of everything available". If tasks are handed directly to an annotation team under the existing distribution method, annotation efficiency drops sharply and the rework rate rises. It is therefore very important to find the annotators best suited to each annotation task.
Disclosure of Invention
The invention provides a method and a device for recommending an annotation task, which aim to overcome the defects of the prior art, namely that annotation efficiency is greatly reduced and the rework rate is increased.
The invention provides a method for recommending an annotation task, which comprises the following steps:
acquiring feature information of a voice annotation task and user portraits of a plurality of annotators;
and recommending the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task.
Optionally, the user portrait comprises: age, place of origin, educational background, foreign language level, professional background, long-speech annotation capability, noise discrimination capability, and recent idle time.
Optionally, the method further includes:
and in the voice annotation process, recording deep information of each annotator, and updating the user portrait of the annotator according to the deep information.
Optionally, the deep information includes historical performance, language expertise, and scene expertise.
Optionally, recommending the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task includes:
based on the feature information and the user portraits, recommending the annotation task to an annotator among the plurality of annotators who matches the voice annotation task by using a recommendation algorithm obtained by waterfall fusion of two algorithms; wherein the two algorithms are an item-based collaborative filtering algorithm and a user-based collaborative filtering algorithm.
The invention also provides a device for recommending an annotation task, which comprises:
an acquisition module, configured to acquire feature information of a voice annotation task and user portraits of a plurality of annotators;
and a recommending module, configured to recommend the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task.
Optionally, the user portrait comprises: age, place of origin, educational background, foreign language level, professional background, long-speech annotation capability, noise discrimination capability, and recent idle time.
Optionally, the apparatus further includes:
an updating module, configured to record deep information of each annotator in the voice annotation process and to update the user portrait of the annotator according to the deep information.
Optionally, the deep information includes historical performance, language expertise, and scene expertise.
Optionally, the recommending module is specifically configured to recommend the annotation task, based on the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task by using a recommendation algorithm obtained by waterfall fusion of two algorithms; wherein the two algorithms are an item-based collaborative filtering algorithm and a user-based collaborative filtering algorithm.
By establishing a user portrait system and a recommendation system, the invention tags user characteristics and applies user portraits and recommendation to the field of data annotation, thereby greatly improving annotation efficiency and personnel integration efficiency. In addition, the recommendation system combines two algorithms by waterfall fusion, which can improve recommendation accuracy.
Drawings
FIG. 1 is a flowchart of a method for recommending annotation tasks in an embodiment of the present invention;
FIG. 2 is a structural diagram of a device for recommending annotation tasks in an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a method for recommending an annotation task which, as shown in FIG. 1, comprises the following steps:
Step 101, acquiring feature information of a voice annotation task and user portraits of a plurality of annotators;
wherein the user portrait may include: age, place of origin, educational background, foreign language level, professional background, long-speech annotation capability, noise discrimination capability, and recent idle time. After the user portrait is created, it may be periodically maintained and updated.
In this embodiment, an initial user portrait including age, place of origin, educational background, English level, professional background, etc. may be created from the information each annotator fills in when joining the system. During the voice annotation process, deep information of each annotator is recorded, and the annotator's user portrait is updated according to the deep information.
The deep information may include historical performance, language expertise, scene expertise, and the like.
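The portrait described in step 101 can be pictured as a simple record that is extended as annotation work proceeds. The following Python sketch is illustrative only: the class name, field names, the numeric score scale and the helper functions are assumptions introduced here and are not specified in the patent text.

```python
from dataclasses import dataclass, field


@dataclass
class UserPortrait:
    """Hypothetical portrait record; all field names are illustrative."""
    # Initial fields, filled in by the annotator when joining the system.
    age: int
    place_of_origin: str
    education: str
    foreign_language_level: str
    professional_background: str
    # Capability scores and availability (numeric scale is an assumption).
    long_speech_score: float = 0.0
    noise_discrimination_score: float = 0.0
    recent_idle_days: int = 0
    # Deep information accumulated during the annotation process.
    historical_scores: list = field(default_factory=list)
    language_expertise: set = field(default_factory=set)
    scene_expertise: set = field(default_factory=set)


def update_portrait(portrait, task_score, language=None, scene=None):
    """Record one finished task and refresh the portrait in place."""
    portrait.historical_scores.append(task_score)
    if language:
        portrait.language_expertise.add(language)
    if scene:
        portrait.scene_expertise.add(scene)
    return portrait


def average_performance(portrait):
    """Mean of the recorded task scores, or 0.0 when nothing is recorded."""
    scores = portrait.historical_scores
    return sum(scores) / len(scores) if scores else 0.0
```

Keeping the registration fields separate from the accumulated deep information mirrors the two stages in the description: the initial portrait is created once, and the deep information grows with each completed task.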
Step 102, recommending the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task.
Specifically, based on the feature information and the user portraits, a recommendation algorithm obtained by waterfall fusion of two algorithms may be used to recommend the annotation task to an annotator among the plurality of annotators who matches the voice annotation task. The waterfall fusion method (Waterfall Model) connects a plurality of models in series, and the two algorithms are an item-based collaborative filtering (Item-based Collaborative Filtering) algorithm and a user-based collaborative filtering (User-based Collaborative Filtering) algorithm.
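One way to read the series connection is that the first model shortlists candidates and the second model re-ranks only that shortlist. The Python sketch below follows that reading under stated assumptions: the score matrix layout (rows are annotators, columns are tasks, 0 means no record), the order of the two stages, cosine similarity, and all function names are choices made here for illustration; the patent does not fix any of them.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length score vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def item_based_scores(ratings, target):
    """Predict each unscored annotator's fit for `target` from similar tasks."""
    def column(t):
        return [row[t] for row in ratings]

    sims = {t: cosine(column(t), column(target))
            for t in range(len(ratings[0])) if t != target}
    scores = {}
    for annotator, row in enumerate(ratings):
        if row[target]:  # already has a score on the target task
            continue
        weight = sum(s for t, s in sims.items() if row[t])
        if weight:
            scores[annotator] = sum(s * row[t] for t, s in sims.items() if row[t]) / weight
    return scores


def user_based_scores(ratings, target, candidates):
    """Re-score candidates from the target scores of similar annotators."""
    raters = [a for a, row in enumerate(ratings) if row[target]]
    scores = {}
    for c in candidates:
        sims = {r: cosine(ratings[c], ratings[r]) for r in raters}
        weight = sum(sims.values())
        scores[c] = (sum(s * ratings[r][target] for r, s in sims.items()) / weight
                     if weight else 0.0)
    return scores


def waterfall_recommend(ratings, target, keep=2, top_n=1):
    """Stage 1 (item-based) shortlists; stage 2 (user-based) ranks the shortlist."""
    stage1 = item_based_scores(ratings, target)
    shortlist = sorted(stage1, key=stage1.get, reverse=True)[:keep]
    stage2 = user_based_scores(ratings, target, shortlist)
    return sorted(stage2, key=stage2.get, reverse=True)[:top_n]


# Illustrative data: 5 annotators x 4 tasks; task 3 is the one to recommend.
ratings = [
    [90, 85, 0, 80],
    [88, 0, 70, 0],
    [40, 50, 0, 0],
    [0, 90, 95, 0],
    [60, 0, 0, 75],
]
recommended = waterfall_recommend(ratings, target=3, keep=2, top_n=1)
```

Because the second stage only sees what the first stage passes on, the `keep` parameter controls how aggressively the first model filters; a practical system would also fold the task feature information and portrait tags into the similarity measures, which this sketch omits.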
For example, once a large user portrait system has been established, when a special project arrives (for example, one that is military-related, mixes Chinese and English, contains long speech and heavy noise, and is very urgent), a large number of annotators who are interested in military affairs, have a certain level of English, have a historical long-speech annotation score above 80 and a noise discrimination score above 90, and have been idle for the most recent week are selected from the system. The annotation project is recommended to these users, and if the annotators agree to take part, the project manager arranges the tasks centrally.
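The screening in this example amounts to a conjunction of conditions over portrait tags. A minimal Python sketch follows; the thresholds 80 and 90 come from the example and are treated here as scores, while the dictionary keys, the numeric English level, the 7-day idle threshold and the sample annotators are hypothetical values chosen for illustration.

```python
def screen_annotators(annotators, topic, min_english, min_long_speech,
                      min_noise, min_idle_days):
    """Return the names of annotators whose portrait meets every requirement."""
    return [
        a["name"]
        for a in annotators
        if topic in a["interests"]
        and a["english_level"] >= min_english          # "a certain level" (assumed scale)
        and a["long_speech_score"] > min_long_speech   # above 80 in the example
        and a["noise_score"] > min_noise               # above 90 in the example
        and a["idle_days"] >= min_idle_days            # idle for the recent week (assumed 7)
    ]


# Hypothetical portraits, reduced to the tags used by this project.
annotators = [
    {"name": "A", "interests": {"military"}, "english_level": 6,
     "long_speech_score": 85, "noise_score": 93, "idle_days": 7},
    {"name": "B", "interests": {"military"}, "english_level": 6,
     "long_speech_score": 78, "noise_score": 95, "idle_days": 7},
    {"name": "C", "interests": {"sports"}, "english_level": 6,
     "long_speech_score": 90, "noise_score": 95, "idle_days": 7},
    {"name": "D", "interests": {"military"}, "english_level": 4,
     "long_speech_score": 90, "noise_score": 91, "idle_days": 3},
]

selected = screen_annotators(annotators, topic="military", min_english=4,
                             min_long_speech=80, min_noise=90, min_idle_days=7)
```

Only annotator A passes every condition here; the selected annotators would then receive the project recommendation and, if they accept, be scheduled by the project manager.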
By establishing a user portrait system and a recommendation system, the embodiment of the invention tags user characteristics and applies user portraits and recommendation to the field of data annotation, thereby greatly improving annotation efficiency and personnel integration efficiency. In addition, the recommendation system combines two algorithms by waterfall fusion, which can improve recommendation accuracy.
Based on the method for recommending annotation tasks, an embodiment of the present invention further provides a device for recommending annotation tasks, as shown in FIG. 2, including:
an obtaining module 210, configured to obtain feature information of a voice annotation task and user portraits of a plurality of annotators;
wherein the user portrait comprises: age, place of origin, educational background, foreign language level, professional background, long-speech annotation capability, noise discrimination capability, and recent idle time.
a recommending module 220, configured to recommend the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task.
Specifically, the recommending module 220 is configured to recommend the annotation task, based on the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task by using a recommendation algorithm obtained by waterfall fusion of two algorithms; wherein the two algorithms are an item-based collaborative filtering algorithm and a user-based collaborative filtering algorithm.
Further, the above apparatus further comprises:
an updating module, configured to record deep information of each annotator in the voice annotation process and to update the user portrait of the annotator according to the deep information.
The deep information comprises historical performance, language expertise and scene expertise.
By establishing a user portrait system and a recommendation system, the embodiment of the invention tags user characteristics and applies user portraits and recommendation to the field of data annotation, thereby greatly improving annotation efficiency and personnel integration efficiency. In addition, the recommendation system combines two algorithms by waterfall fusion, which can improve recommendation accuracy.
The steps of a method described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A method for recommending annotation tasks, comprising the steps of:
acquiring feature information of a voice annotation task and user portraits of a plurality of annotators;
and recommending the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task.
2. The method of claim 1, wherein the user portrait comprises: age, place of origin, educational background, foreign language level, professional background, long-speech annotation capability, noise discrimination capability, and recent idle time.
3. The method of claim 1, further comprising:
in the voice annotation process, recording deep information of each annotator, and updating the user portrait of the annotator according to the deep information.
4. The method of claim 3, wherein the deep information includes historical performance, language expertise, and scene expertise.
5. The method of claim 1, wherein recommending the annotation task, according to the feature information and the user portraits, to the annotator among the plurality of annotators who matches the voice annotation task comprises:
based on the feature information and the user portraits, recommending the annotation task to an annotator among the plurality of annotators who matches the voice annotation task by using a recommendation algorithm obtained by waterfall fusion of two algorithms; wherein the two algorithms are an item-based collaborative filtering algorithm and a user-based collaborative filtering algorithm.
6. An apparatus for recommending annotation tasks, comprising:
an acquisition module, configured to acquire feature information of a voice annotation task and user portraits of a plurality of annotators;
and a recommending module, configured to recommend the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task.
7. The apparatus of claim 6, wherein the user portrait comprises: age, place of origin, educational background, foreign language level, professional background, long-speech annotation capability, noise discrimination capability, and recent idle time.
8. The apparatus of claim 6, further comprising:
an updating module, configured to record deep information of each annotator in the voice annotation process and to update the user portrait of the annotator according to the deep information.
9. The apparatus of claim 8, wherein the deep information comprises historical performance, language expertise, and scene expertise.
10. The apparatus of claim 6, wherein
the recommending module is specifically configured to recommend the annotation task, based on the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task by using a recommendation algorithm obtained by waterfall fusion of two algorithms; wherein the two algorithms are an item-based collaborative filtering algorithm and a user-based collaborative filtering algorithm.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010073209.6A CN111259251A (en) 2020-01-21 2020-01-21 Method and device for recommending annotation task

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010073209.6A CN111259251A (en) 2020-01-21 2020-01-21 Method and device for recommending annotation task

Publications (1)

Publication Number Publication Date
CN111259251A (en) 2020-06-09

Family

ID=70950994

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010073209.6A Pending CN111259251A (en) 2020-01-21 2020-01-21 Method and device for recommending annotation task

Country Status (1)

Country Link
CN (1) CN111259251A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190050428A1 (en) * 2017-08-08 2019-02-14 TuSimple System and method for image annotation
CN108846544A (en) * 2018-04-27 2018-11-20 淘然视界(杭州)科技有限公司 A kind of distribution method and system of mark task
CN109543111A (en) * 2018-11-28 2019-03-29 广州虎牙信息科技有限公司 Recommendation information screening technique, device, storage medium and server
CN109978356A (en) * 2019-03-15 2019-07-05 平安普惠企业管理有限公司 Mark method for allocating tasks, device, medium and computer equipment
CN110070854A (en) * 2019-04-17 2019-07-30 北京爱数智慧科技有限公司 Voice annotation quality determination method, device, equipment and computer-readable medium
CN110490444A (en) * 2019-08-13 2019-11-22 新华智云科技有限公司 Mark method for allocating tasks, device, system and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
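赵威 et al.: "User opinion extraction with adaptive crowdsourced annotation under cost constraints", Journal of Computer Applications (《计算机应用》) *
韩通: "Research on collaborative filtering recommendation algorithms incorporating user rating behavior", China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库(信息科技辑)》) *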

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113554130A (en) * 2021-09-22 2021-10-26 平安科技(深圳)有限公司 Data labeling method and device based on artificial intelligence, electronic equipment and medium
CN113963234A (en) * 2021-10-25 2022-01-21 北京百度网讯科技有限公司 Data annotation processing method and device, electronic equipment and medium
CN113963234B (en) * 2021-10-25 2024-02-23 北京百度网讯科技有限公司 Data annotation processing method, device, electronic equipment and medium

Similar Documents

Publication Publication Date Title
US10387776B2 (en) Recurrent neural network architectures which provide text describing images
US8775429B2 (en) Methods and systems for analyzing data of an online social network
Stafford et al. Eu-social science: the role of internet social networks in the collection of bee biodiversity data
US11381651B2 (en) Interpretable user modeling from unstructured user data
CN107122786B (en) Crowdsourcing learning method and device
US11615485B2 (en) System and method for predicting engagement on social media
US20200143000A1 (en) Customized display of emotionally filtered social media content
US20140030681A1 (en) Activity-oriented Studying Method in an Online-to-offline Manner
CN111259251A (en) Method and device for recommending annotation task
CN115481969A (en) Resume screening method and device, electronic equipment and readable storage medium
CN111008340B (en) Course recommendation method, device and storage medium
Durrant et al. Human values in curating a human rights media archive
WO2009067159A2 (en) Media asset evaluation based on social relationships
Simpson et al. Assessing needs and decision contexts: RISA approaches to engagement research
CN112328905A (en) Online marketing content pushing method and device, computer equipment and storage medium
Desjardins Online and digital contexts
CA3160703A1 (en) Digital recruitment systems and methods thereof
Garaba The record and memorabilia in school archives management in Pietermaritzburg schools, KwaZulu-Natal, South Africa
CN114529244A (en) HRD-based interview data processing method and interview evaluation method and device
Ras et al. Building a future for our digital memory: A collaborative infrastructure for permanent access to digital heritage in The Netherlands
CN113496005B (en) Information management method and device, electronic equipment and storage medium
CN112231594B (en) Information processing method and device
Li et al. Analytics of big geosocial media and crowdsourced data
KR102607570B1 (en) Interview platform system for providing edited interview data according to the permission of the data receiver
Hendrick The Agile Museum: organisational change through collecting ‘new media art’

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 411, 4th floor, building 4, No.44, Middle North Third Ring Road, Haidian District, Beijing 100088

Applicant after: Beijing Qingshu Intelligent Technology Co.,Ltd.

Address before: 100044 1415, 14th floor, building 1, yard 59, gaoliangqiaoxie street, Haidian District, Beijing

Applicant before: BEIJING AISHU WISDOM TECHNOLOGY CO.,LTD.