CN111259251A - Method and device for recommending annotation task - Google Patents
- Publication number
- CN111259251A (application number CN202010073209.6A)
- Authority
- CN
- China
- Prior art keywords
- recommending
- user
- task
- annotation
- voice
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9536—Search customisation based on social or collaborative filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0631—Item recommendations
Abstract
The invention discloses a method and a device for recommending an annotation task. The method comprises: acquiring feature information of a voice annotation task and user portraits of a plurality of annotators; and recommending the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task. By establishing a user portrait system and a recommendation system, the invention tags annotator characteristics and applies user portraits and recommendation to the field of data annotation, thereby greatly improving annotation efficiency and personnel integration efficiency. In addition, the recommendation system combines two algorithms by waterfall fusion, which can improve recommendation accuracy.
Description
Technical Field
The invention relates to the field of data annotation, in particular to a method and a device for recommending an annotation task.
Background
With the development of information technology, user portraits and recommendation systems have become common in the e-commerce field, where user characteristics are tagged in detail to construct a user portrait. Once a user portrait meeting a given goal has been generated, part of the tag data is used for recommendation, association analysis, and collaborative filtering.
In the field of data annotation, voice annotation tasks are mainly completed by outsourced teams, and task assignment relies mainly on team profiles and simple screening; for example, some English annotation projects require annotators to have a certain level of English. As demand in the voice annotation market grows and data annotation becomes more fine-grained and specialized, the existing task assignment method can no longer put each annotator's abilities to full use. If a project is simply handed to an annotation team under the existing assignment method, annotation efficiency is greatly reduced and the rework rate increases. It is therefore very important to find the annotators best suited to each annotation task.
Disclosure of Invention
The invention provides a method and a device for recommending an annotation task, which aim to overcome the defects of the prior art, in which annotation efficiency is greatly reduced and the rework rate is increased.
The invention provides a method for recommending an annotation task, comprising the following steps:
acquiring feature information of a voice annotation task and user portraits of a plurality of annotators;
and recommending the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task.
Optionally, the user portrait comprises: age, native place, educational background, foreign language level, professional background, long-speech annotation capability, noise discrimination capability, and recent idle time.
Optionally, the method further includes:
during the voice annotation process, recording deep information about each annotator, and updating the annotator's user portrait according to the deep information.
Optionally, the deep information includes historical performance, language expertise, and scene expertise.
Optionally, recommending the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task includes:
based on the feature information and the user portraits, recommending the annotation task to an annotator among the plurality of annotators who matches the voice annotation task, using a recommendation algorithm obtained by waterfall fusion of two algorithms; wherein the two algorithms include an item-based collaborative filtering algorithm and a user-based collaborative filtering algorithm.
The invention also provides a device for recommending an annotation task, comprising:
an acquisition module, configured to acquire feature information of a voice annotation task and user portraits of a plurality of annotators;
and a recommending module, configured to recommend the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task.
Optionally, the user portrait comprises: age, native place, educational background, foreign language level, professional background, long-speech annotation capability, noise discrimination capability, and recent idle time.
Optionally, the device further includes:
an updating module, configured to record deep information about each annotator during the voice annotation process and to update the annotator's user portrait according to the deep information.
Optionally, the deep information includes historical performance, language expertise, and scene expertise.
Optionally, the recommending module is specifically configured to recommend the annotation task, based on the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task, using a recommendation algorithm obtained by waterfall fusion of two algorithms; wherein the two algorithms include an item-based collaborative filtering algorithm and a user-based collaborative filtering algorithm.
By establishing a user portrait system and a recommendation system, the invention tags annotator characteristics and applies user portraits and recommendation to the field of data annotation, thereby greatly improving annotation efficiency and personnel integration efficiency. In addition, the recommendation system combines two algorithms by waterfall fusion, which can improve recommendation accuracy.
Drawings
FIG. 1 is a flowchart of a method for recommending annotation tasks in an embodiment of the present invention;
FIG. 2 is a structural diagram of a device for recommending annotation tasks in an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
An embodiment of the invention provides a method for recommending an annotation task, comprising the following steps:
Step 101: acquiring feature information of a voice annotation task and user portraits of a plurality of annotators;
wherein the user portrait may include: age, native place, educational background, foreign language level, professional background, long-speech annotation capability, noise discrimination capability, and recent idle time. After the user portrait is created, it may be periodically maintained and updated.
In this embodiment, an initial user portrait including age, native place, educational background, English level, professional background, and so on may be created from the information each annotator fills in when joining the system. During the voice annotation process, deep information about each annotator is recorded, and the annotator's user portrait is updated according to the deep information.
The deep information may include historical performance, language expertise, scene expertise, and the like.
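The creation and update of a user portrait described above can be sketched as follows. This is only an illustration under assumptions: the class name `UserPortrait`, the function `update_portrait`, all field names, and the 0 to 100 score scale are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UserPortrait:
    """Hypothetical portrait record; every field name is illustrative."""
    annotator_id: str
    # Initial information filled in when the annotator joins the system.
    age: int
    native_place: str
    education: str
    foreign_language_level: str
    professional_background: str
    # Deep information accumulated during annotation (assumed representation).
    history_scores: List[float] = field(default_factory=list)
    language_specialties: List[str] = field(default_factory=list)
    scene_specialties: List[str] = field(default_factory=list)

    def average_score(self) -> float:
        """Mean of recorded task scores, or 0.0 when no history exists yet."""
        if not self.history_scores:
            return 0.0
        return sum(self.history_scores) / len(self.history_scores)


def update_portrait(portrait: UserPortrait, task_score: float,
                    language: Optional[str] = None,
                    scene: Optional[str] = None) -> UserPortrait:
    """Fold the deep information from one finished task into the portrait."""
    portrait.history_scores.append(task_score)
    if language and language not in portrait.language_specialties:
        portrait.language_specialties.append(language)
    if scene and scene not in portrait.scene_specialties:
        portrait.scene_specialties.append(scene)
    return portrait


# Example with made-up values.
p = UserPortrait("a1", 30, "Beijing", "bachelor", "intermediate", "linguistics")
update_portrait(p, 92.0, language="English", scene="military")
update_portrait(p, 88.0, language="English")
print(p.average_score(), p.language_specialties, p.scene_specialties)
```

Keeping the registration fields separate from the accumulated fields mirrors the distinction the embodiment draws between the initial portrait and the deep information recorded later.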
Step 102: recommending the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task.
Specifically, based on the feature information and the user portraits, a recommendation algorithm obtained by waterfall fusion of two algorithms can be used to recommend the annotation task to an annotator among the plurality of annotators who matches the voice annotation task. Waterfall fusion connects a plurality of models in series. The two algorithms include an item-based collaborative filtering algorithm and a user-based collaborative filtering algorithm.
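A series connection of the two collaborative filtering algorithms might look like the sketch below. The patent text states only that the models are connected in series; the order of the stages (item-based first, user-based second), the cosine similarity measure, the `keep` cut-off, and the layout of `ratings` as historical quality scores per annotator and task are all assumptions made for illustration, and every function name is hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    common = set(a) & set(b)
    num = sum(a[k] * b[k] for k in common)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0


def item_based_scores(ratings, target):
    """Stage 1: predict a score on `target` from similar tasks each annotator did."""
    columns = {}  # task -> {annotator: score}
    for user, row in ratings.items():
        for task, score in row.items():
            columns.setdefault(task, {})[user] = score
    target_col = columns.get(target, {})
    scores = {}
    for user, row in ratings.items():
        if target in row:
            continue  # already worked on the target task
        num = den = 0.0
        for task, score in row.items():
            sim = cosine(target_col, columns[task])
            num += sim * score
            den += abs(sim)
        if den:
            scores[user] = num / den
    return scores


def user_based_scores(ratings, target, candidates):
    """Stage 2: re-score stage-1 survivors from similar annotators' target scores."""
    scores = {}
    for user in candidates:
        num = den = 0.0
        for other, row in ratings.items():
            if other == user or target not in row:
                continue
            sim = cosine(ratings[user], row)
            num += sim * row[target]
            den += abs(sim)
        if den:
            scores[user] = num / den
    return scores


def waterfall_recommend(ratings, target, keep=2, top_n=1):
    """Run the two stages in series; stage 2 sees only the stage-1 top `keep`."""
    stage1 = item_based_scores(ratings, target)
    survivors = sorted(stage1, key=stage1.get, reverse=True)[:keep]
    stage2 = user_based_scores(ratings, target, survivors)
    return sorted(stage2, key=stage2.get, reverse=True)[:top_n]


# Made-up historical quality scores: {annotator: {task: score}}.
ratings = {
    "a": {"t1": 90, "t2": 85, "new": 95},
    "b": {"t1": 88, "t2": 80},
    "c": {"t1": 40, "t2": 45},
    "d": {"t3": 70, "new": 50},
    "e": {"t3": 75},
}
print(waterfall_recommend(ratings, "new", keep=2, top_n=1))  # ['b']
```

In this toy data, stage 1 narrows the unassigned annotators b, c, and e down to b and e, and stage 2 then prefers b because b's history resembles that of annotator a, who scored well on the target task. The sketch assumes the target task already has a few scores (for instance from a pilot batch); a brand-new task with no scores would need the task feature information instead.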
For example, once a large user portrait system has been established, suppose a special project arrives (for example, a military-related project with mixed Chinese and English, long audio, heavy noise, and an urgent deadline). The system then screens out annotators who are interested in military affairs, have a certain level of English, have a historical long-speech annotation score above 80 and a noise discrimination score above 90, and have a week of idle time. The annotation project is recommended to these annotators, and if they agree to take part, the project manager assigns the tasks centrally.
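The screening in this example can be expressed as a simple conjunction of conditions over the portrait tags. The sketch below is illustrative only: the function name `screen_annotators`, the dictionary keys, and the reading of "a week" as at least 7 idle days are assumptions, while the thresholds of 80 and 90 follow the example above.

```python
def screen_annotators(portraits, required_interest, min_long_speech=80,
                      min_noise=90, min_idle_days=7):
    """Return the ids of annotators whose portraits pass every condition."""
    selected = []
    for p in portraits:
        if (required_interest in p["interests"]
                and p["english_ok"]
                and p["long_speech_score"] > min_long_speech
                and p["noise_score"] > min_noise
                and p["idle_days"] >= min_idle_days):
            selected.append(p["id"])
    return selected


# Made-up portraits; keys are hypothetical tag names.
portraits = [
    {"id": "a1", "interests": {"military"}, "english_ok": True,
     "long_speech_score": 85, "noise_score": 93, "idle_days": 7},
    {"id": "a2", "interests": {"military"}, "english_ok": True,
     "long_speech_score": 78, "noise_score": 95, "idle_days": 10},
    {"id": "a3", "interests": {"finance"}, "english_ok": True,
     "long_speech_score": 90, "noise_score": 96, "idle_days": 14},
]
print(screen_annotators(portraits, "military"))  # ['a1']
```

Here a2 fails the long-speech threshold and a3 lacks the required interest, so only a1 would receive the project recommendation.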
By establishing a user portrait system and a recommendation system, this embodiment of the invention tags annotator characteristics and applies user portraits and recommendation to the field of data annotation, thereby greatly improving annotation efficiency and personnel integration efficiency. In addition, the recommendation system combines two algorithms by waterfall fusion, which can improve recommendation accuracy.
Based on the above method for recommending an annotation task, an embodiment of the present invention further provides a device for recommending an annotation task which, as shown in FIG. 2, includes:
an obtaining module 210, configured to obtain feature information of a voice annotation task and user portraits of a plurality of annotators;
wherein the user portrait comprises: age, native place, educational background, foreign language level, professional background, long-speech annotation capability, noise discrimination capability, and recent idle time.
a recommending module 220, configured to recommend the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task.
Specifically, the recommending module 220 is configured to recommend the annotation task, based on the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task, using a recommendation algorithm obtained by waterfall fusion of two algorithms; wherein the two algorithms include an item-based collaborative filtering algorithm and a user-based collaborative filtering algorithm.
Further, the above device also comprises:
an updating module, configured to record deep information about each annotator during the voice annotation process and to update the annotator's user portrait according to the deep information.
The deep information includes historical performance, language expertise, and scene expertise.
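One possible software arrangement of the three modules is sketched below. It is a structural illustration only: the class and method names, the dictionary-based data layout, and the pluggable `scorer` callable (standing in for the fused recommendation algorithm) are assumptions rather than details given in the patent.

```python
from typing import Any, Callable, Dict, List

Portrait = Dict[str, Any]
TaskFeatures = Dict[str, Any]


class ObtainingModule:
    """Plays the role of module 210: supplies task features and portraits."""

    def __init__(self, tasks: Dict[str, TaskFeatures],
                 portraits: Dict[str, Portrait]):
        self._tasks = tasks
        self._portraits = portraits

    def obtain(self, task_id: str):
        return self._tasks[task_id], self._portraits


class RecommendingModule:
    """Plays the role of module 220: ranks annotators for one task."""

    def __init__(self, scorer: Callable[[TaskFeatures, Portrait], float]):
        self._scorer = scorer  # placeholder for the fused algorithm

    def recommend(self, features: TaskFeatures,
                  portraits: Dict[str, Portrait], top_n: int = 1) -> List[str]:
        ranked = sorted(portraits,
                        key=lambda a: self._scorer(features, portraits[a]),
                        reverse=True)
        return ranked[:top_n]


class UpdatingModule:
    """Records deep information and folds it into the portrait."""

    def update(self, portrait: Portrait, deep_info: Dict[str, Any]) -> None:
        portrait.setdefault("history", []).append(deep_info)


def tag_overlap(features: TaskFeatures, portrait: Portrait) -> float:
    """Toy scorer: number of task tags the annotator's skills cover."""
    return float(len(set(features["tags"]) & set(portrait["skills"])))


# Wiring the modules together with made-up data.
tasks = {"t1": {"tags": ["english", "noisy"]}}
portraits = {"a1": {"skills": ["english"]},
             "a2": {"skills": ["english", "noisy"]}}
obtaining = ObtainingModule(tasks, portraits)
recommending = RecommendingModule(tag_overlap)
features, all_portraits = obtaining.obtain("t1")
chosen = recommending.recommend(features, all_portraits, top_n=1)
UpdatingModule().update(portraits[chosen[0]], {"task": "t1", "score": 91})
print(chosen)  # ['a2']
```

Passing the scoring function into the recommending module keeps the module boundaries of the device independent of whichever recommendation algorithm is plugged in.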
By establishing a user portrait system and a recommendation system, this embodiment of the invention tags annotator characteristics and applies user portraits and recommendation to the field of data annotation, thereby greatly improving annotation efficiency and personnel integration efficiency. In addition, the recommendation system combines two algorithms by waterfall fusion, which can improve recommendation accuracy.
The steps of a method described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (10)
1. A method for recommending an annotation task, comprising the steps of:
acquiring feature information of a voice annotation task and user portraits of a plurality of annotators;
and recommending the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task.
2. The method of claim 1, wherein the user portrait comprises: age, native place, educational background, foreign language level, professional background, long-speech annotation capability, noise discrimination capability, and recent idle time.
3. The method of claim 1, further comprising:
during the voice annotation process, recording deep information about each annotator, and updating the annotator's user portrait according to the deep information.
4. The method of claim 3, wherein the deep information includes historical performance, language expertise, and scene expertise.
5. The method of claim 1, wherein recommending the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task comprises:
based on the feature information and the user portraits, recommending the annotation task to an annotator among the plurality of annotators who matches the voice annotation task, using a recommendation algorithm obtained by waterfall fusion of two algorithms; wherein the two algorithms include an item-based collaborative filtering algorithm and a user-based collaborative filtering algorithm.
6. An apparatus for recommending an annotation task, comprising:
an acquisition module, configured to acquire feature information of a voice annotation task and user portraits of a plurality of annotators;
and a recommending module, configured to recommend the annotation task, according to the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task.
7. The apparatus of claim 6, wherein the user portrait comprises: age, native place, educational background, foreign language level, professional background, long-speech annotation capability, noise discrimination capability, and recent idle time.
8. The apparatus of claim 6, further comprising:
an updating module, configured to record deep information about each annotator during the voice annotation process and to update the annotator's user portrait according to the deep information.
9. The apparatus of claim 8, wherein the deep information comprises historical performance, language expertise, and scene expertise.
10. The apparatus of claim 6, wherein
the recommending module is specifically configured to recommend the annotation task, based on the feature information and the user portraits, to an annotator among the plurality of annotators who matches the voice annotation task, using a recommendation algorithm obtained by waterfall fusion of two algorithms; wherein the two algorithms include an item-based collaborative filtering algorithm and a user-based collaborative filtering algorithm.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010073209.6A CN111259251A (en) | 2020-01-21 | 2020-01-21 | Method and device for recommending annotation task |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111259251A true CN111259251A (en) | 2020-06-09 |
Family
ID=70950994
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010073209.6A Pending CN111259251A (en) | 2020-01-21 | 2020-01-21 | Method and device for recommending annotation task |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111259251A (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190050428A1 (en) * | 2017-08-08 | 2019-02-14 | TuSimple | System and method for image annotation |
CN108846544A (en) * | 2018-04-27 | 2018-11-20 | 淘然视界(杭州)科技有限公司 | A kind of distribution method and system of mark task |
CN109543111A (en) * | 2018-11-28 | 2019-03-29 | 广州虎牙信息科技有限公司 | Recommendation information screening technique, device, storage medium and server |
CN109978356A (en) * | 2019-03-15 | 2019-07-05 | 平安普惠企业管理有限公司 | Mark method for allocating tasks, device, medium and computer equipment |
CN110070854A (en) * | 2019-04-17 | 2019-07-30 | 北京爱数智慧科技有限公司 | Voice annotation quality determination method, device, equipment and computer-readable medium |
CN110490444A (en) * | 2019-08-13 | 2019-11-22 | 新华智云科技有限公司 | Mark method for allocating tasks, device, system and storage medium |
Non-Patent Citations (2)
Title |
---|
赵威等: "成本约束下自适应众包标注的用户观点抽取", 《计算机应用》 * |
韩通: "融合用户评价行为的协同过滤推荐算法研究", 《中国优秀硕士学位论文全文数据库(信息科技辑)》 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113554130A (en) * | 2021-09-22 | 2021-10-26 | 平安科技(深圳)有限公司 | Data labeling method and device based on artificial intelligence, electronic equipment and medium |
CN113963234A (en) * | 2021-10-25 | 2022-01-21 | 北京百度网讯科技有限公司 | Data annotation processing method and device, electronic equipment and medium |
CN113963234B (en) * | 2021-10-25 | 2024-02-23 | 北京百度网讯科技有限公司 | Data annotation processing method, device, electronic equipment and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | Address after: 411, 4th floor, building 4, No.44, Middle North Third Ring Road, Haidian District, Beijing 100088; Applicant after: Beijing Qingshu Intelligent Technology Co.,Ltd.; Address before: 100044 1415, 14th floor, building 1, yard 59, gaoliangqiaoxie street, Haidian District, Beijing; Applicant before: BEIJING AISHU WISDOM TECHNOLOGY CO.,LTD. |