CN107767055A - Crowdsourcing result aggregation method and device based on collusion detection

Info

Publication number
CN107767055A
Authority
CN
China
Prior art keywords
worker
answer set
repeated
answers
answer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711003779.2A
Other languages
Chinese (zh)
Other versions
CN107767055B (en
Inventor
孙海龙
王旭
陈鹏鹏
方毅立
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN201711003779.2A priority Critical patent/CN107767055B/en
Publication of CN107767055A publication Critical patent/CN107767055A/en
Application granted granted Critical
Publication of CN107767055B publication Critical patent/CN107767055B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a crowdsourcing result aggregation method and device based on collusion detection. The method includes: collecting, from a crowdsourcing platform, the answer set of each worker for a task set; calculating an aggregation result of the answer set, and calculating a consistency parameter between the aggregation result and the answers of each worker; determining repeated answer sets from the answer set and, based on the consistency parameters of the workers' answers, calculating the worker capability change rate corresponding to each repeated answer set; for a repeated answer set whose worker capability change rate is less than or equal to a preset threshold, determining that the repeated answer set was generated normally and retaining it in the answer set; for a repeated answer set whose worker capability change rate is greater than the preset threshold, determining that the repeated answer set was generated by collusion and deleting it from the answer set; and obtaining an updated answer set and calculating the aggregation result of the updated answer set.

Description

Crowdsourcing result aggregation method and device based on collusion detection
Technical Field
The invention relates to the technical field of crowdsourcing, and in particular to a crowdsourcing result aggregation method and device based on collusion detection.
Background
Crowdsourcing is a rapidly developing field that uses human cognitive strengths to solve problems that are difficult for computers. Popular general-purpose platforms such as CrowdFlower and Amazon Mechanical Turk (AMT) are widely used for general data processing tasks such as sentiment analysis, handwriting recognition, and image tagging. A core problem of crowdsourcing is ensuring the quality of results, since workers may return poor-quality results. A widely adopted quality control method is result aggregation, which first assigns each task to multiple workers and then uses an inference algorithm to aggregate the results they return. Taking image annotation as an example, one image is distributed to several workers, each of whom provides tags describing its content. Finally, a high-quality result is aggregated from all the collected tags by voting or another inference method.
In crowdsourcing, in order to earn more reward for less work, colluders form collusion teams by communicating outside the platform via text messages, WeChat, telephone, forums, or even face to face. In a collusion team, only one worker processes the task and the other workers copy that worker's answers, so all workers in the team end up submitting the same answers. These malicious repeated answers dominate the answers provided by normal workers during result aggregation and reduce the quality of the results. For example, if a task is assigned to five workers and three of them collude, then under majority voting the final aggregated result equals the answer provided by the colluders.
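The five-worker example can be made concrete with a minimal sketch. The labels and worker ids below are invented for illustration, and `majority_vote` is a hypothetical helper, not part of the patent text.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]

# One task, five workers. Workers w3, w4 and w5 collude and submit the same
# label, while the two independent workers w1 and w2 agree on another label.
answers = {"w1": "cat", "w2": "cat", "w3": "dog", "w4": "dog", "w5": "dog"}

with_colluders = majority_vote(list(answers.values()))
# Drop the two copied answers (w4, w5) to see how the vote changes.
without_copies = majority_vote([answers[w] for w in ("w1", "w2", "w3")])

print(with_colluders)   # dog
print(without_copies)   # cat
```

With the copies included, the colluders' label wins by three votes to two; once the copies are removed, the independent workers' label wins.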
From the above, it can be seen that the repeated answers generated by collusion are detrimental to the quality of results for general tasks on a generic platform. However, existing collusion detection algorithms do not effectively detect and eliminate the negative effects of such collusion.
Disclosure of Invention
In order to solve the above technical problem, embodiments of the present invention provide a crowdsourcing result aggregation method and apparatus based on collusion detection.
The crowdsourcing result aggregation method based on collusion detection provided by the embodiment of the invention comprises the following steps:
collecting, from the crowdsourcing platform, the answer set of each worker for the task set;
calculating an aggregation result of the answer set, and calculating a consistency parameter between the aggregation result and the answers of each worker;
determining repeated answer sets from the answer set, and calculating the worker capability change rate corresponding to each repeated answer set based on the consistency parameters of the workers' answers;
for a repeated answer set whose worker capability change rate is less than or equal to a preset threshold, determining that the repeated answer set was generated normally and retaining it in the answer set;
for a repeated answer set whose worker capability change rate is greater than the preset threshold, determining that the repeated answer set was generated by collusion and deleting it from the answer set;
and after each repeated answer set has been retained or deleted, obtaining an updated answer set and calculating the aggregation result of the updated answer set.
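The step of determining repeated answer sets can be sketched as follows. The text does not spell out when two answer sets count as repeated, so this sketch assumes that workers form a repeated set when their answers are identical on every task; the function name and the data layout (`{worker: {task: label}}`) are hypothetical.

```python
from collections import defaultdict

def find_repeated_answer_sets(answers):
    """Group workers whose answers are identical on every task.

    `answers` maps worker id -> {task id: label}. Returns a list of worker
    groups (each a sorted list) that contain at least two workers.
    """
    groups = defaultdict(list)
    for worker, labels in answers.items():
        key = tuple(sorted(labels.items()))  # hashable fingerprint of the answers
        groups[key].append(worker)
    return [sorted(ws) for ws in groups.values() if len(ws) > 1]

answers = {
    "w1": {"t1": "a", "t2": "b"},
    "w2": {"t1": "a", "t2": "c"},
    "w3": {"t1": "b", "t2": "b"},
    "w4": {"t1": "b", "t2": "b"},
}
repeated = find_repeated_answer_sets(answers)
print(repeated)  # [['w3', 'w4']]
```

A repeated set found this way is only a candidate: whether it was generated normally or by collusion is decided afterwards from the worker capability change rate.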
In an embodiment of the present invention, calculating the consistency parameter between the aggregation result and the answers of each worker includes:
calculating the consistency parameter between the aggregation result and the answers of each worker based on the following formula:
where P_i is the consistency parameter between the aggregation result and the answers of worker i, L_i is the set of answers returned by worker i for the task set, and the remaining symbol denotes the aggregation result of the answer set.
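Since the formula itself is not reproduced in this text, the following is only a sketch under a stated assumption: P_i is taken to be the share of worker i's tasks on which the worker's answer matches the majority-vote aggregation result. The helper names `aggregate` and `consistency` are hypothetical.

```python
from collections import Counter

def aggregate(answers):
    """Majority vote per task over a {worker: {task: label}} mapping."""
    votes = {}
    for labels in answers.values():
        for task, label in labels.items():
            votes.setdefault(task, []).append(label)
    return {task: Counter(v).most_common(1)[0][0] for task, v in votes.items()}

def consistency(worker_labels, aggregated):
    """Assumed P_i: share of the worker's tasks that match the aggregate."""
    if not worker_labels:
        return 0.0
    hits = sum(aggregated[t] == label for t, label in worker_labels.items())
    return hits / len(worker_labels)

answers = {
    "w1": {"t1": "a", "t2": "b"},
    "w2": {"t1": "a", "t2": "c"},
    "w3": {"t1": "a", "t2": "b"},
}
agg = aggregate(answers)
P = {w: consistency(labels, agg) for w, labels in answers.items()}
print(P)  # {'w1': 1.0, 'w2': 0.5, 'w3': 1.0}
```

Any other aggregation algorithm could be substituted for `aggregate`; majority voting is used here because it is the method the description names.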
In this embodiment of the present invention, calculating the worker capability change rate corresponding to each repeated answer set based on the consistency parameters of the workers' answers includes:
calculating a first variance of the workers' consistency parameters when the repeated answer set is retained in the answer set;
calculating a second variance of the workers' consistency parameters when the repeated answer set is deleted from the answer set;
and calculating the worker capability change rate corresponding to the repeated answer set based on the first variance and the second variance.
In an embodiment of the present invention, calculating the first variance of the workers' consistency parameters when the repeated answer set is retained in the answer set includes:
calculating, by the following formula, the first variance of the workers' consistency parameters when the repeated answer set is retained in the answer set:
where Var(P) is the first variance, E(P) is the mean of the workers' consistency parameters, P_i is the consistency parameter between the aggregation result and the answers of worker i, and the remaining symbol denotes the answer set.
In this embodiment of the present invention, calculating the second variance of the workers' consistency parameters when the repeated answer set is deleted from the answer set includes:
calculating, by the following formula, the second variance of the workers' consistency parameters when the repeated answer set is deleted from the answer set:
where Var(P^k) is the second variance, E(P^k) is the mean of the workers' consistency parameters, P_i^k is the consistency parameter between the aggregation result and the answers of worker i, and the remaining symbol denotes the answer set after the repeated answer set is deleted.
In an embodiment of the present invention, calculating the worker capability change rate corresponding to the repeated answer set based on the first variance and the second variance includes:
calculating the worker capability change rate corresponding to the repeated answer set based on the following formula:
where the resulting quantity is the worker capability change rate.
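For orientation, the two variances can be written in the standard population-variance form using the symbols defined above. The formulas themselves do not appear in this text, so everything beyond Var, E, P_i and P_i^k is an assumption: n and n_k stand for the number of workers before and after the repeated answer set k is deleted, and the relative-change form of the rate, written here as \Delta_k, is one plausible choice rather than the patented formula.

```latex
\begin{align}
E(P) &= \frac{1}{n}\sum_{i=1}^{n} P_i, &
\mathrm{Var}(P) &= \frac{1}{n}\sum_{i=1}^{n}\bigl(P_i - E(P)\bigr)^2 \\
E(P^k) &= \frac{1}{n_k}\sum_{i=1}^{n_k} P_i^k, &
\mathrm{Var}(P^k) &= \frac{1}{n_k}\sum_{i=1}^{n_k}\bigl(P_i^k - E(P^k)\bigr)^2 \\
\Delta_k &= \frac{\bigl|\mathrm{Var}(P) - \mathrm{Var}(P^k)\bigr|}{\mathrm{Var}(P)}
&& \text{(assumed form)}
\end{align}
```

Under this reading, a repeated answer set that strongly shapes the aggregation result produces a large \Delta_k, because removing it changes how consistent the remaining workers appear.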
The embodiment of the invention provides a crowdsourcing result aggregation device based on collusion detection, which comprises:
a collection module, used for collecting, from the crowdsourcing platform, the answer set of each worker for the task set;
a consistency calculation module, used for calculating the aggregation result of the answer set and calculating the consistency parameter between the aggregation result and the answers of each worker;
a worker capability change rate module, used for determining repeated answer sets from the answer set and calculating the worker capability change rate corresponding to each repeated answer set based on the consistency parameters of the workers' answers;
a collusion detection module, used for determining, for a repeated answer set whose worker capability change rate is less than or equal to a preset threshold, that the repeated answer set was generated normally and retaining it in the answer set; and, for a repeated answer set whose worker capability change rate is greater than the preset threshold, determining that the repeated answer set was generated by collusion and deleting it from the answer set;
and an aggregation module, used for obtaining an updated answer set after each repeated answer set has been retained or deleted, and calculating the aggregation result of the updated answer set.
In an embodiment of the present invention, the consistency calculation module is specifically configured to calculate the consistency parameter between the aggregation result and the answers of each worker based on the following formula:
where P_i is the consistency parameter between the aggregation result and the answers of worker i, L_i is the set of answers returned by worker i for the task set, and the remaining symbol denotes the aggregation result of the answer set.
In an embodiment of the present invention, the worker capability change rate module includes:
a first variance calculating unit for calculating a first variance of the workers' consistency parameters when the repeated answer set is retained in the answer set;
a second variance calculating unit for calculating a second variance of the workers' consistency parameters when the repeated answer set is deleted from the answer set;
and a worker capability change rate calculation unit for calculating the worker capability change rate corresponding to the repeated answer set based on the first variance and the second variance.
In an embodiment of the present invention, the first variance calculating unit is specifically configured to calculate, by the following formula, the first variance of the workers' consistency parameters when the repeated answer set is retained in the answer set:
where Var(P) is the first variance, E(P) is the mean of the workers' consistency parameters, P_i is the consistency parameter between the aggregation result and the answers of worker i, and the remaining symbol denotes the answer set.
In an embodiment of the present invention, the second variance calculating unit is specifically configured to calculate, by the following formula, the second variance of the workers' consistency parameters when the repeated answer set is deleted from the answer set:
where Var(P^k) is the second variance, E(P^k) is the mean of the workers' consistency parameters, P_i^k is the consistency parameter between the aggregation result and the answers of worker i, and the remaining symbol denotes the answer set after the repeated answer set is deleted.
In an embodiment of the present invention, the worker capability change rate calculation unit is specifically configured to calculate the worker capability change rate corresponding to the repeated answer set based on the following formula:
where the resulting quantity is the worker capability change rate.
The technical scheme of the embodiment of the invention has the following advantages. (1) Unlike the settings of spatio-temporal crowdsourcing and social networks, the features of answers to general tasks on a general platform are unknown. Therefore, the embodiment of the invention introduces the concept of consistency between worker answers and the aggregation result to describe the influence of collusion-generated repeated answers on result aggregation.
(2) Unlike similarity-based collusion detection algorithms for e-commerce platforms, the embodiment of the invention provides a collusion detection method based on the worker capability change rate, which can identify repeated answers generated by collusion within an answer set that also contains normally repeated answers.
(3) The embodiment of the invention provides a crowdsourcing result aggregation method with collusion detection, which can effectively eliminate the negative influence of collusion on result aggregation.
Drawings
Fig. 1 is a schematic diagram of a crowdsourcing framework based on collusion detection according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of a crowdsourcing result aggregation method based on collusion detection according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a crowdsourcing result aggregation device based on collusion detection according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the worker capability change rate module according to an embodiment of the present invention.
Detailed Description
So that the manner in which the features and aspects of the embodiments of the present invention can be understood in detail, a more particular description of the embodiments of the invention, briefly summarized above, may be had by reference to the embodiments, some of which are illustrated in the appended drawings.
Existing collusion detection algorithms cannot effectively detect and eliminate the negative effects of collusion, mainly for the following reasons:
(1) Detection algorithms for collusion in spatio-temporal crowdsourcing and social networks need to extract certain features of the data; for example, in spatio-temporal crowdsourcing, collusion is detected using the spatial and temporal features of the collected data. However, such features are difficult to obtain on a general crowdsourcing platform.
(2) Detection algorithms for e-commerce platforms mainly detect collusion based on the similarity between the answers provided by each pair of workers. However, repeated answers to tasks on a general platform fall into two kinds: normal repetition and collusive repetition. On simple tasks, workers show high ability, and many repeated answers are generated normally. Detecting collusion based on answer similarity would therefore misclassify normally repeated answers as answers generated by collusion.
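The false-positive problem described in reason (2) can be illustrated with a minimal sketch. The similarity measure, the threshold value, and the data are all hypothetical; they stand in for a generic pairwise-similarity detector, not for any specific published algorithm.

```python
def similarity(a, b):
    """Share of commonly answered tasks on which two workers agree."""
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    return sum(a[t] == b[t] for t in shared) / len(shared)

# Three easy tasks that competent workers answer correctly without any contact.
honest_1 = {"t1": "yes", "t2": "no", "t3": "yes"}
honest_2 = {"t1": "yes", "t2": "no", "t3": "yes"}

SIMILARITY_THRESHOLD = 0.9  # hypothetical cut-off for a similarity-based detector
flagged = similarity(honest_1, honest_2) >= SIMILARITY_THRESHOLD
print(flagged)  # True: two independent workers are reported as colluders
```

Because the two workers agree on every task simply by being right, a detector that looks only at similarity cannot distinguish them from a collusion team.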
(3) On auction platforms, participants often collude to obtain high returns at low cost. The corresponding algorithms mainly detect collusion based on game theory and are difficult to apply to general tasks on a general platform.
In summary, for general tasks on a general platform, existing algorithms cannot effectively detect and eliminate the harm that collusion-generated repeated answers do to result quality. To address this problem, the technical scheme of the embodiment of the invention provides a crowdsourcing quality control method based on collusion detection.
Fig. 1 is a schematic diagram of a crowdsourcing framework based on collusion detection according to an embodiment of the present invention. As shown in Fig. 1, the framework includes the following steps:
(1) The requester publishes tasks to a crowdsourcing platform, such as Mechanical Turk, where the requester pays a corresponding reward based on the quality of each worker's answers.
(2) Tasks are assigned to workers according to scheduling policies and user-specified platform constraints.
(3) In practice, some workers are not independent and may even cooperate outside the platform to handle crowdsourcing tasks. Workers may collude with one another behind the scenes; for example, workers copy from others who are working on the same crowdsourcing task via an online forum. After task processing, the answers are collected and obviously noisy answers are eliminated, e.g., answers that are clearly unrelated to the picture in an image tagging task.
(4) This step involves collusion detection and result aggregation. After the answers returned by all workers have been collected, the embodiment of the invention applies a collusion detection mechanism to detect collusion and then filters out the repeated answers generated by colluders. After result filtering, the embodiment of the invention uses an aggregation method to infer the final result of each task and submits it to the requester.
The core of the framework is step (4), which applies the collusion detection method provided by the embodiment of the invention and then a result inference method, so that a high-quality result can be inferred even in the presence of collusion.
The crowdsourcing framework with collusion detection provided by the embodiment of the invention addresses the problem that existing result aggregation algorithms cannot effectively eliminate the harm that collusion does to result aggregation. Unlike a general crowdsourcing framework, workers in the proposed framework are no longer assumed to be independent; they may communicate or even collude with each other. In addition, the result inference part of the framework includes a collusion detection process.
The technical scheme of the embodiment of the invention as a whole comprises three steps: collusion detection, result filtering, and result aggregation, which are described below.
Step one: collusion detection
(1) Calculating the consistency between the aggregation result and the worker answers: when the workers complete task processing, the answers they return are first collected. Assume that, for a task set, the workers return an answer set, and that one of the sets within this answer set is a repeated answer set. The purpose of the embodiment of the invention is to judge whether the repeated answer set was generated by collusion and, on that basis, to aggregate the answer set to obtain a high-quality result.
The answer set is aggregated by majority voting to obtain the aggregation result. The embodiment of the invention provides a formula for the consistency between the aggregation result and the answers of worker i:
where L_i is the answer set returned by worker i for the task set.
(2) Calculating the worker capability change rate for each repeated answer set: for a repeated answer set, the worker capability change rate measures the overall effect of that set on the consistency between the worker answers and the aggregation result. The embodiment of the invention uses the change in the variance of the overall consistency before and after the repeated answer set is deleted to formalize the worker capability change rate. First, the variance of the worker answer consistency is calculated with the repeated answer set retained:
Deleting the repeated answer set yields a reduced answer set. Similarly, the variance of the worker answer consistency is calculated with the repeated answer set deleted:
Then the worker capability change rate is obtained from the two variances:
(3) Determining whether the repeated answers were generated by collusion: when the worker capability change rate is less than or equal to Threshold, the repeated answer set is considered to consist of normally repeated answers. When the worker capability change rate is greater than Threshold, the repeated answer set is considered to consist of answers repeated through collusion.
In the above scheme, when calculating the variance of the worker answer consistency for a repeated answer set, the aggregation result can also be obtained by other aggregation algorithms, such as a probabilistic aggregation method.
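Parts (2) and (3) of step one can be sketched together. The formulas are not reproduced in this text, so the sketch rests on labeled assumptions: consistency is the share of a worker's answers that match the majority vote, the variance is the population variance, the change rate is the relative change of that variance, and `THRESHOLD = 0.5` is an arbitrary illustrative value. All names and data are hypothetical.

```python
from collections import Counter
from statistics import pvariance

def aggregate(answers):
    """Majority vote per task over a {worker: {task: label}} mapping."""
    votes = {}
    for labels in answers.values():
        for task, label in labels.items():
            votes.setdefault(task, []).append(label)
    return {task: Counter(v).most_common(1)[0][0] for task, v in votes.items()}

def consistencies(answers):
    """Assumed consistency per worker (each worker answered at least one task)."""
    agg = aggregate(answers)
    return [
        sum(agg[t] == label for t, label in labels.items()) / len(labels)
        for labels in answers.values()
    ]

def change_rate(answers, repeated_workers):
    """Assumed rate: relative change of the consistency variance."""
    var_keep = pvariance(consistencies(answers))
    remaining = {w: a for w, a in answers.items() if w not in repeated_workers}
    if not remaining:
        raise ValueError("deleting the repeated set would leave no answers")
    var_drop = pvariance(consistencies(remaining))
    if var_keep == 0:
        return 0.0 if var_drop == 0 else float("inf")
    return abs(var_keep - var_drop) / var_keep

copied = {"t1": "b", "t2": "a", "t3": "b", "t4": "a"}
answers = {
    "w1": {"t1": "a", "t2": "b", "t3": "a", "t4": "b"},
    "w2": {"t1": "a", "t2": "b", "t3": "a", "t4": "a"},
    "w3": dict(copied), "w4": dict(copied), "w5": dict(copied),
}
THRESHOLD = 0.5  # illustrative value only
rate = change_rate(answers, {"w3", "w4", "w5"})
is_collusion = rate > THRESHOLD
print(round(rate, 2), is_collusion)  # 0.92 True
```

In this toy data the three identical answer sets dominate the vote, so deleting them sharply changes how consistent the remaining workers look, and the set is classified as collusive.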
Step two: result filtering
The above steps are repeated until every repeated answer set in the answer set has been checked. Repeated answers determined to be collusive are deleted, and repeated answers determined to be normal are retained.
Step three: result aggregation
The filtered answer set is aggregated using an existing result aggregation algorithm to obtain the final result.
Fig. 2 is a schematic flow chart of a crowdsourcing result aggregation method based on collusion detection according to an embodiment of the present invention. As shown in Fig. 2, the method includes the following steps:
Step 201: The answer set of each worker for the task set is collected from the crowdsourcing platform.
Step 202: The aggregation result of the answer set is calculated, and the consistency parameter between the aggregation result and the answers of each worker is calculated.
In the embodiment of the invention, the consistency parameter between the aggregation result and the answers of each worker is calculated based on the following formula:
where P_i is the consistency parameter between the aggregation result and the answers of worker i, L_i is the set of answers returned by worker i for the task set, and the remaining symbol denotes the aggregation result of the answer set.
Step 203: Repeated answer sets are determined from the answer set, and the worker capability change rate corresponding to each repeated answer set is calculated based on the consistency parameters of the workers' answers.
In this embodiment of the present invention, calculating the worker capability change rate corresponding to each repeated answer set based on the consistency parameters of the workers' answers includes:
calculating a first variance of the workers' consistency parameters when the repeated answer set is retained in the answer set;
calculating a second variance of the workers' consistency parameters when the repeated answer set is deleted from the answer set;
and calculating the worker capability change rate corresponding to the repeated answer set based on the first variance and the second variance.
The first variance of the workers' consistency parameters, with the repeated answer set retained in the answer set, is calculated by the following formula:
where Var(P) is the first variance, E(P) is the mean of the workers' consistency parameters, P_i is the consistency parameter between the aggregation result and the answers of worker i, and the remaining symbol denotes the answer set.
The second variance of the workers' consistency parameters, with the repeated answer set deleted from the answer set, is calculated by the following formula:
where Var(P^k) is the second variance, E(P^k) is the mean of the workers' consistency parameters, P_i^k is the consistency parameter between the aggregation result and the answers of worker i, and the remaining symbol denotes the answer set after the repeated answer set is deleted.
The worker capability change rate corresponding to the repeated answer set is calculated based on the following formula:
where the resulting quantity is the worker capability change rate.
Step 204: For a repeated answer set whose worker capability change rate is less than or equal to a preset threshold, the repeated answer set is determined to have been generated normally and is retained in the answer set.
Step 205: For a repeated answer set whose worker capability change rate is greater than the preset threshold, the repeated answer set is determined to have been generated by collusion and is deleted from the answer set.
Step 206: After each repeated answer set has been retained or deleted, an updated answer set is obtained, and the aggregation result of the updated answer set is calculated.
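Steps 201 to 206 can be strung together in one short sketch. It is not the patented implementation: the formulas are absent from this text, so the sketch assumes majority-vote aggregation, consistency as the share of matching answers, population variance, a relative-change rate, and repeated sets defined as workers with identical answers. Each rate is computed against the original answer set, which is also an assumption. All names, data, and the threshold are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import pvariance

def aggregate(answers):
    """Majority vote per task over a {worker: {task: label}} mapping."""
    votes = defaultdict(list)
    for labels in answers.values():
        for task, label in labels.items():
            votes[task].append(label)
    return {task: Counter(v).most_common(1)[0][0] for task, v in votes.items()}

def consistencies(answers):
    """Step 202 (assumed form): share of each worker's answers matching the vote."""
    agg = aggregate(answers)
    return [
        sum(agg[t] == label for t, label in labels.items()) / len(labels)
        for labels in answers.values()
    ]

def repeated_sets(answers):
    """Groups of two or more workers whose answers are identical."""
    groups = defaultdict(list)
    for worker, labels in answers.items():
        groups[tuple(sorted(labels.items()))].append(worker)
    return [ws for ws in groups.values() if len(ws) > 1]

def change_rate(answers, group):
    """Assumed rate: relative change of the consistency variance."""
    remaining = {w: a for w, a in answers.items() if w not in group}
    if not remaining:
        return 0.0  # nothing left to compare against; treat the set as normal
    var_keep = pvariance(consistencies(answers))
    var_drop = pvariance(consistencies(remaining))
    if var_keep == 0:
        return 0.0 if var_drop == 0 else float("inf")
    return abs(var_keep - var_drop) / var_keep

def aggregate_with_collusion_detection(answers, threshold):
    kept = dict(answers)                              # step 201: collected answers
    for group in repeated_sets(answers):              # step 203
        if change_rate(answers, group) > threshold:   # step 205: collusion, delete
            for worker in group:
                del kept[worker]
        # step 204: otherwise the repeated set stays in `kept`
    return aggregate(kept)                            # step 206

copied = {"t1": "b", "t2": "a", "t3": "b", "t4": "a"}
answers = {
    "w1": {"t1": "a", "t2": "b", "t3": "a", "t4": "b"},
    "w2": {"t1": "a", "t2": "b", "t3": "a", "t4": "a"},
    "w3": dict(copied), "w4": dict(copied), "w5": dict(copied),
}
naive = aggregate(answers)
final = aggregate_with_collusion_detection(answers, threshold=0.5)
print(naive["t1"], final["t1"])  # b a
```

On this toy data, plain majority voting returns the colluders' answers, while the filtered aggregation returns the answers on which the two independent workers agree.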
The collusion detection method provided by the embodiment of the invention can detect collusion groups with high precision from the results given by the workers. The change in the variance of the consistency between worker answers and the aggregation result, before and after a given repeated answer set is deleted, is used to formalize the worker capability change rate, and collusion is detected from the magnitude of this rate. The result processing method provided by the embodiment of the invention, which deletes collusion results and then aggregates, can greatly improve the accuracy of the aggregation result. Unlike existing aggregation algorithms, the result aggregation method provided by the embodiment of the invention includes detection of collusion, can effectively eliminate its negative influence on result aggregation, and improves result quality.
Fig. 3 is a schematic structural composition diagram of a crowdsourcing result aggregation device based on collusion detection according to an embodiment of the present invention, and as shown in fig. 3, the device includes:
a collecting module 301, configured to collect answer sets of each worker for a task set from a crowdsourcing platform;
a consistency calculation module 302, configured to calculate a convergence result of the answer set, and calculate consistency parameters of the convergence result and the answers of each worker;
a worker capability change rate module 303, configured to determine repeated answer sets from the answer set, and calculate the worker capability change rate corresponding to each repeated answer set based on the consistency parameters of the workers' answers;
a collusion detection module 304, configured to determine, for a repeated answer set whose worker capability change rate is less than or equal to a preset threshold, that the repeated answer set was generated normally and retain it in the answer set; and, for a repeated answer set whose worker capability change rate is greater than the preset threshold, determine that the repeated answer set was generated by collusion and delete it from the answer set;
the aggregation module 305 is configured to obtain an updated answer set after performing retention or deletion processing on each repeated answer set, and calculate an aggregation result of the updated answer set.
In an embodiment of the present invention, the consistency calculation module 302 is specifically configured to calculate the consistency parameter between the aggregation result and the answers of each worker based on the following formula:
where P_i is the consistency parameter between the aggregation result and the answers of worker i, L_i is the set of answers returned by worker i for the task set, and the remaining symbol denotes the aggregation result of the answer set.
In an embodiment of the present invention, as shown in Fig. 4, the worker capability change rate module 303 includes:
a first variance calculating unit 3031, configured to calculate a first variance of the workers' consistency parameters when the repeated answer set is retained in the answer set;
a second variance calculating unit 3032, configured to calculate a second variance of the workers' consistency parameters when the repeated answer set is deleted from the answer set;
a worker capability change rate calculation unit 3033, configured to calculate the worker capability change rate corresponding to the repeated answer set based on the first variance and the second variance.
In an embodiment of the present invention, the first variance calculating unit 3031 is specifically configured to calculate, by the following formula, the first variance of the workers' consistency parameters when the repeated answer set is retained in the answer set:
where Var(P) is the first variance, E(P) is the mean of the workers' consistency parameters, P_i is the consistency parameter between the aggregation result and the answers of worker i, and the remaining symbol denotes the answer set.
In an embodiment of the present invention, the second variance calculating unit 3032 is specifically configured to calculate, by the following formula, the second variance of the workers' consistency parameters when the repeated answer set is deleted from the answer set:
where Var(P^k) is the second variance, E(P^k) is the mean of the workers' consistency parameters, P_i^k is the consistency parameter between the aggregation result and the answers of worker i, and the remaining symbol denotes the answer set after the repeated answer set is deleted.
In an embodiment of the present invention, the worker capability change rate calculation unit 3033 is specifically configured to calculate the worker capability change rate corresponding to the repeated answer set based on the following formula:
where the resulting quantity is the worker capability change rate.
It should be understood by those skilled in the art that the implementation functions of the modules in the collusion detection-based crowdsourcing result aggregation device shown in fig. 3 can be understood by referring to the related description of the aforementioned collusion detection-based crowdsourcing result aggregation method, and the implementation functions of the modules in the collusion detection-based crowdsourcing result aggregation device shown in fig. 3 can be implemented by a program running on a processor, and can also be implemented by a specific logic circuit.
The technical solutions described in the embodiments of the present invention may be combined arbitrarily, provided that they do not conflict.
In the embodiments provided by the present invention, it should be understood that the disclosed method and intelligent device may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be made through interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or of another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, and when executed, the program performs the steps including the method embodiments; and the aforementioned storage medium includes: a mobile storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Alternatively, the apparatus according to the embodiment of the present invention may be stored in a computer-readable storage medium if it is implemented in the form of a software functional module and sold or used as a stand-alone product. Based on such understanding, the technical solutions of the embodiments of the present invention may be essentially implemented or a part contributing to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a mobile storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above describes only specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention.

Claims (12)

1. A method for crowdsourcing result aggregation based on collusion detection, the method comprising:
collecting, from a crowdsourcing platform, the answer set of each worker for a task set;
calculating an aggregated result of the answer set, and calculating a consistency parameter between the aggregated result and the answers of each worker;
determining repeated answer sets from the answer set, and calculating the worker capability change rate corresponding to each repeated answer set based on the consistency parameters of the answers of the workers;
for a repeated answer set whose worker capability change rate is less than or equal to a preset threshold, determining that the repeated answer set was generated normally and retaining it in the answer set;
for a repeated answer set whose worker capability change rate is greater than the preset threshold, determining that the repeated answer set was generated by collusion and deleting it from the answer set;
and after the repeated answer sets have been retained or deleted, obtaining an updated answer set and calculating an aggregated result of the updated answer set.
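The keep-or-delete step of claim 1 can be sketched as follows. This is an illustration only: treating a repeated answer set as a group of workers whose answers are identical on every task, the helper names, and the stubbed change rate values are all assumptions rather than details taken from the claims.

```python
from collections import defaultdict


def find_repeated_answer_sets(answers):
    """Group workers whose answers are identical on every task.

    ASSUMPTION: "repeated" is read here as exact duplication.
    """
    groups = defaultdict(list)
    for worker, worker_answers in answers.items():
        key = tuple(sorted(worker_answers.items()))
        groups[key].append(worker)
    return [workers for workers in groups.values() if len(workers) > 1]


def filter_collusion(answers, change_rate, threshold):
    """Delete repeated answer sets whose change rate exceeds the threshold."""
    updated = dict(answers)
    for workers in find_repeated_answer_sets(answers):
        # Rate <= threshold: treated as normally generated, so retained.
        if change_rate(answers, workers) > threshold:
            for worker in workers:
                del updated[worker]
    return updated


answers = {
    "w1": {"t1": "A", "t2": "B"},
    "w2": {"t1": "A", "t2": "B"},
    "w3": {"t1": "B", "t2": "B"},
    "w4": {"t1": "B", "t2": "B"},
    "w5": {"t1": "A", "t2": "A"},
}
# Hypothetical change rates standing in for the variance-based calculation.
stub_rates = {("w1", "w2"): 0.9, ("w3", "w4"): 0.1}
updated = filter_collusion(
    answers, lambda _answers, workers: stub_rates[tuple(workers)], threshold=0.5
)
print(sorted(updated))
```

The updated answer set returned here would then be passed to whatever aggregation procedure produced the initial aggregated result.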
2. The method for aggregating crowdsourcing results based on collusion detection according to claim 1, wherein the calculating consistency parameters of the aggregated results and answers of workers comprises:
calculating a consistency parameter of the aggregated results and the answers of each worker based on the following formula:
wherein P_i is the consistency parameter between the aggregated result and the answers of worker i, L_i is the set of answers returned by worker i for the task set, and the remaining symbol denotes the aggregated result of the answer set.
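As a rough illustration of a consistency parameter of this kind, the sketch below scores each worker by the fraction of their answers that match the aggregated result. The formula of claim 2 is not reproduced in this text, so the agreement-fraction definition, the function name `consistency`, and the sample data are assumptions.

```python
def consistency(worker_answers, aggregated):
    """Fraction of a worker's answers L_i that match the aggregated result.

    ASSUMPTION: plain agreement ratio; the claimed formula may differ.
    """
    if not worker_answers:
        return 0.0
    matches = sum(
        1 for task, label in worker_answers.items() if aggregated.get(task) == label
    )
    return matches / len(worker_answers)


# Hypothetical aggregated result and per-worker answers L_i.
aggregated = {"t1": "A", "t2": "B", "t3": "A"}
L = {
    "w1": {"t1": "A", "t2": "B", "t3": "A"},
    "w2": {"t1": "A", "t2": "B", "t3": "B"},
    "w3": {"t1": "B", "t2": "C"},
}
P = {worker: consistency(worker_answers, aggregated) for worker, worker_answers in L.items()}
print(P)
```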
3. The crowd-sourced result aggregation method based on collusion detection according to claim 2, wherein the calculating a worker capability change rate corresponding to each repeated answer set based on the consistency parameter of the answers of the workers comprises:
calculating a first variance of the consistency parameters of the answers of the workers when a repeated answer set is retained in the answer set;
calculating a second variance of the consistency parameters of the answers of the workers when the repeated answer set is deleted from the answer set;
and calculating the worker capability change rate corresponding to the repeated answer set based on the first variance and the second variance.
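The two variances of claim 3 rest on two sets of consistency parameters, one computed with the repeated answer set retained and one recomputed after deleting it. The sketch below shows that recomputation under stated assumptions: majority voting as the aggregator and an agreement ratio as the consistency parameter are both stand-ins, and the function names are hypothetical.

```python
from collections import Counter


def consistency_params(answers):
    """Return {worker: agreement with the majority-vote result}.

    ASSUMPTIONS: majority voting and agreement ratio are illustrative;
    every worker is assumed to have answered at least one task.
    """
    tasks = sorted({task for a in answers.values() for task in a})
    aggregated = {
        task: Counter(a[task] for a in answers.values() if task in a).most_common(1)[0][0]
        for task in tasks
    }
    return {
        worker: sum(a[task] == aggregated[task] for task in a) / len(a)
        for worker, a in answers.items()
    }


def params_with_and_without(answers, repeated_workers):
    """Consistency parameters before and after deleting one repeated answer set."""
    retained = consistency_params(answers)
    remaining = {w: a for w, a in answers.items() if w not in repeated_workers}
    deleted = consistency_params(remaining)  # the aggregate is recomputed too
    return retained, deleted


answers = {
    "w1": {"t1": "B", "t2": "B", "t3": "B"},
    "w2": {"t1": "B", "t2": "B", "t3": "B"},
    "w3": {"t1": "A", "t2": "B", "t3": "A"},
    "w4": {"t1": "A", "t2": "A", "t3": "B"},
    "w5": {"t1": "A", "t2": "A", "t3": "A"},
}
retained, deleted = params_with_and_without(answers, {"w1", "w2"})
print(retained["w5"], deleted["w5"])
```

In this toy data, removing the duplicated answers of w1 and w2 flips the majority on two tasks, so the consistency of w5 rises from one third to one. Shifts of this kind are what the first and second variances are meant to capture.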
4. The method of claim 3, wherein the calculating a first variance of the consistency parameter of the answers of the workers when the repeated answer set is retained in the answer set comprises:
calculating, by the following formula, the first variance of the consistency parameters of the answers of the workers when a repeated answer set is retained in the answer set:
wherein Var(P) is the first variance, E(P) is the mean of the consistency parameters of the workers, P_i is the consistency parameter between the aggregated result and the answers of worker i, and the remaining symbol denotes the answer set.
5. The crowd-sourced result aggregation method based on collusion detection according to claim 3 or 4, wherein the calculating a second variance of the consistency parameter of the answers of each worker when deleting the repeated answer set in the answer set comprises:
calculating, by the following formula, the second variance of the consistency parameters of the answers of the workers when the repeated answer set is deleted from the answer set:
wherein Var(P^k) is the second variance, E(P^k) is the mean of the consistency parameters of the workers, P_i^k is the consistency parameter between the aggregated result and the answers of worker i, and the remaining symbol denotes the answer set after the repeated answer set has been deleted.
6. The crowd-sourced result aggregation method based on collusion detection according to claim 5, wherein the calculating a worker capability change rate corresponding to the repeated answer set based on the first variance and the second variance comprises:
calculating, based on the following formula, the worker capability change rate corresponding to the repeated answer set:
wherein the resulting quantity is the worker capability change rate.
7. A crowd-sourced result aggregation device based on collusion detection, the device comprising:
a collection module, configured to collect, from a crowdsourcing platform, the answer set of each worker for a task set;
a consistency calculation module, configured to calculate an aggregated result of the answer set and to calculate a consistency parameter between the aggregated result and the answers of each worker;
a worker capability change rate module, configured to determine repeated answer sets from the answer set and to calculate the worker capability change rate corresponding to each repeated answer set based on the consistency parameters of the answers of the workers;
a collusion detection module, configured to: for a repeated answer set whose worker capability change rate is less than or equal to a preset threshold, determine that the repeated answer set was generated normally and retain it in the answer set; and for a repeated answer set whose worker capability change rate is greater than the preset threshold, determine that the repeated answer set was generated by collusion and delete it from the answer set;
and an aggregation module, configured to obtain an updated answer set after the repeated answer sets have been retained or deleted, and to calculate an aggregated result of the updated answer set.
8. The crowd-sourced result aggregation device based on collusion detection according to claim 7, wherein the consistency calculation module is specifically configured to calculate the consistency parameters of the aggregated results and the answers of the workers based on the following formulas:
wherein P_i is the consistency parameter between the aggregated result and the answers of worker i, L_i is the set of answers returned by worker i for the task set, and the remaining symbol denotes the aggregated result of the answer set.
9. The crowd-sourced result aggregation device based on collusion detection according to claim 7, wherein the worker capability change rate module comprises:
a first variance calculating unit, configured to calculate a first variance of the consistency parameters of the answers of the workers when a repeated answer set is retained in the answer set;
a second variance calculating unit, configured to calculate a second variance of the consistency parameters of the answers of the workers when the repeated answer set is deleted from the answer set;
and a worker capability change rate calculation unit, configured to calculate the worker capability change rate corresponding to the repeated answer set based on the first variance and the second variance.
10. The crowd-sourced result aggregation device based on collusion detection according to claim 9, wherein the first variance calculating unit is specifically configured to calculate, by the following formula, the first variance of the consistency parameters of the answers of the workers when a repeated answer set is retained in the answer set:
wherein Var(P) is the first variance, E(P) is the mean of the consistency parameters of the workers, P_i is the consistency parameter between the aggregated result and the answers of worker i, and the remaining symbol denotes the answer set.
11. The crowd-sourced result aggregation device based on collusion detection according to claim 9 or 10, wherein the second variance calculation unit is specifically configured to calculate, according to the following formula, the second variance of the consistency parameters of the answers of the workers when the repeated answer set is deleted from the answer set:
wherein Var(P^k) is the second variance, E(P^k) is the mean of the consistency parameters of the workers, P_i^k is the consistency parameter between the aggregated result and the answers of worker i, and the remaining symbol denotes the answer set after the repeated answer set has been deleted.
12. The crowd-sourced result aggregation device based on collusion detection according to claim 11, wherein the worker capability change rate calculation unit is specifically configured to calculate, based on the following formula, the worker capability change rate corresponding to the repeated answer set:
wherein the resulting quantity is the worker capability change rate.
CN201711003779.2A 2017-10-24 2017-10-24 Crowdsourcing result aggregation method and device based on collusion detection Active CN107767055B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711003779.2A CN107767055B (en) 2017-10-24 2017-10-24 Crowdsourcing result aggregation method and device based on collusion detection

Publications (2)

Publication Number Publication Date
CN107767055A (en) 2018-03-06
CN107767055B (en) 2021-07-23

Family

ID=61270213

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711003779.2A Active CN107767055B (en) 2017-10-24 2017-10-24 Crowdsourcing result aggregation method and device based on collusion detection

Country Status (1)

Country Link
CN (1) CN107767055B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109978333A (en) * 2019-02-26 2019-07-05 湖南大学 Based on community discovery and the independent worker's selection method for linking prediction in crowdsourcing system
CN110930114A (en) * 2019-11-20 2020-03-27 北京航空航天大学 Crowdsourcing method for resisting collusion
CN111292062A (en) * 2020-02-10 2020-06-16 中南大学 Crowdsourcing garbage worker detection method and system based on network embedding and storage medium
US11386299B2 (en) 2018-11-16 2022-07-12 Yandex Europe Ag Method of completing a task
US11416773B2 (en) 2019-05-27 2022-08-16 Yandex Europe Ag Method and system for determining result for task executed in crowd-sourced environment
US11475387B2 (en) 2019-09-09 2022-10-18 Yandex Europe Ag Method and system for determining productivity rate of user in computer-implemented crowd-sourced environment
US11481650B2 (en) 2019-11-05 2022-10-25 Yandex Europe Ag Method and system for selecting label from plurality of labels for task in crowd-sourced environment
US11727336B2 (en) 2019-04-15 2023-08-15 Yandex Europe Ag Method and system for determining result for task executed in crowd-sourced environment
US11727329B2 (en) 2020-02-14 2023-08-15 Yandex Europe Ag Method and system for receiving label for digital task executed within crowd-sourced environment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104133769A (en) * 2014-08-02 2014-11-05 哈尔滨理工大学 Crowdsourcing fraud detection method based on psychological behavior analysis
CN104599084A (en) * 2015-02-12 2015-05-06 北京航空航天大学 Crowd calculation quality control method and device
US20170187751A1 (en) * 2015-12-29 2017-06-29 International Business Machines Corporation Propagating fraud awareness to hosted applications
CN107273492A (en) * 2017-06-15 2017-10-20 复旦大学 An interaction method for processing image labeling tasks based on a crowdsourcing platform

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ANTONIO FERNÁNDEZ ANTA et al.: "Algorithmic Mechanisms for Reliable Crowdsourcing Computation under Collusion", PLOS ONE *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11386299B2 (en) 2018-11-16 2022-07-12 Yandex Europe Ag Method of completing a task
CN109978333A (en) * 2019-02-26 2019-07-05 湖南大学 Based on community discovery and the independent worker's selection method for linking prediction in crowdsourcing system
US11727336B2 (en) 2019-04-15 2023-08-15 Yandex Europe Ag Method and system for determining result for task executed in crowd-sourced environment
US11416773B2 (en) 2019-05-27 2022-08-16 Yandex Europe Ag Method and system for determining result for task executed in crowd-sourced environment
US11475387B2 (en) 2019-09-09 2022-10-18 Yandex Europe Ag Method and system for determining productivity rate of user in computer-implemented crowd-sourced environment
US11481650B2 (en) 2019-11-05 2022-10-25 Yandex Europe Ag Method and system for selecting label from plurality of labels for task in crowd-sourced environment
CN110930114A (en) * 2019-11-20 2020-03-27 北京航空航天大学 Crowdsourcing method for resisting collusion
CN110930114B (en) * 2019-11-20 2022-08-23 北京航空航天大学 Crowdsourcing method for resisting collusion
CN111292062A (en) * 2020-02-10 2020-06-16 中南大学 Crowdsourcing garbage worker detection method and system based on network embedding and storage medium
US11727329B2 (en) 2020-02-14 2023-08-15 Yandex Europe Ag Method and system for receiving label for digital task executed within crowd-sourced environment

Also Published As

Publication number Publication date
CN107767055B (en) 2021-07-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant