CN107767055A - Crowdsourcing result aggregation method and device based on collusion detection
- Publication number
- CN107767055A (CN201711003779.2A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
Abstract
The invention discloses a crowdsourcing result aggregation method and device based on collusion detection. The method includes: collecting, from a crowdsourcing platform, the answer set of each worker for a task set; calculating an aggregation result of the answer set, and calculating a consistency parameter between the aggregation result and the answers of each worker; determining repeated answer sets from the answer set, and calculating, based on the consistency parameters of the workers' answers, the worker ability change rate corresponding to each repeated answer set; for a repeated answer set whose worker ability change rate is less than or equal to a preset threshold, determining that the repeated answer set was generated normally and retaining it in the answer set; for a repeated answer set whose worker ability change rate is greater than the preset threshold, determining that the repeated answer set was generated by collusion and deleting it from the answer set; and obtaining an updated answer set and calculating the aggregation result of the updated answer set.
Description
Technical Field
The invention relates to the technical field of crowdsourcing, and in particular to a crowdsourcing result aggregation method and device based on collusion detection.
Background
Crowdsourcing is a rapidly developing field that uses human cognitive advantages to solve problems that are difficult for computers. Popular general-purpose platforms such as CrowdFlower and AMT (Amazon Mechanical Turk) are widely used for general data processing tasks such as sentiment analysis, handwriting recognition and image tagging. A core problem of crowdsourcing is ensuring the quality of results, since workers may return results of poor quality. A widely adopted quality control method is result aggregation, which first assigns each task to multiple workers and then uses an inference algorithm to aggregate the results the workers return. Taking image annotation as an example, one image is distributed to several workers, and each worker provides tags describing the content of the image. Finally, a high-quality result is aggregated from all the collected tags by voting or another inference method.
In crowdsourcing, in order to earn more reward with less effort, colluders form collusion teams outside the platform through text messages, WeChat, telephone, forums or even face-to-face communication. In a collusion team, only one worker processes the task and the other workers copy that worker's answer, so all workers in the team end up providing the same answer. In result aggregation, these malicious repeated answers can dominate the answers provided by normal workers and reduce the quality of the result. For example, if one task is given to five workers and three of them collude, then aggregating by majority voting yields a final result equal to the answer provided by the colluders.
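The five-worker example can be reproduced with a minimal sketch. The worker names, the labels and the `majority_vote` helper are illustrative assumptions and do not come from the source.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]


# One task, five workers. w3, w4 and w5 collude and submit the same copied label.
answers = {"w1": "cat", "w2": "cat", "w3": "dog", "w4": "dog", "w5": "dog"}

result = majority_vote(answers.values())
print(result)  # dog: the colluders' answer wins the vote
```

With the three copied answers removed, the same vote over the remaining answers would return the label given by the independent workers.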
From the above, it can be seen that the repeated answers generated by collusion are detrimental to the quality of results for general tasks on a generic platform. However, existing collusion detection algorithms do not effectively detect and eliminate the negative effects of such collusion.
Disclosure of Invention
In order to solve the above technical problem, embodiments of the present invention provide a crowdsourcing result aggregation method and apparatus based on collusion detection.
The crowdsourcing result aggregation method based on collusion detection provided by the embodiment of the invention comprises the following steps:
collecting, from a crowdsourcing platform, the answer set of each worker for a task set;
calculating an aggregation result of the answer set, and calculating a consistency parameter between the aggregation result and the answers of each worker;
determining repeated answer sets from the answer set, and calculating the worker ability change rate corresponding to each repeated answer set based on the consistency parameters of the workers' answers;
for a repeated answer set whose worker ability change rate is less than or equal to a preset threshold, determining that the repeated answer set was generated normally and retaining it in the answer set;
for a repeated answer set whose worker ability change rate is greater than the preset threshold, determining that the repeated answer set was generated by collusion and deleting it from the answer set;
and after each repeated answer set has been retained or deleted, obtaining an updated answer set and calculating the aggregation result of the updated answer set.
In an embodiment of the present invention, calculating the consistency parameter between the aggregation result and the answers of each worker includes:
calculating the consistency parameter between the aggregation result and the answers of each worker based on the following formula:
[formula not reproduced]
wherein P_i is the consistency parameter between the aggregation result and the answers of worker i, L_i is the set of answers returned by worker i for the task set, and the remaining symbol denotes the aggregation result of the answer set.
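Since the formula itself does not survive in this text, the following is only a sketch of one plausible reading. It assumes that the consistency parameter is the fraction of a worker's answers that agree with the aggregation result, and that the aggregation result comes from per-task majority voting. The function names and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def majority_vote(answers):
    """Aggregate per task: the most frequent label wins (ties: first seen)."""
    by_task = defaultdict(list)
    for labels in answers.values():
        for task, label in labels.items():
            by_task[task].append(label)
    return {task: Counter(ls).most_common(1)[0][0] for task, ls in by_task.items()}


def consistency(worker_answers, aggregated):
    """ASSUMED definition: share of the worker's answers equal to the aggregation result."""
    if not worker_answers:
        return 0.0
    agree = sum(1 for task, label in worker_answers.items() if aggregated.get(task) == label)
    return agree / len(worker_answers)


# Hypothetical answer set: worker -> {task: label}.
answers = {
    "w1": {"t1": "a", "t2": "b"},
    "w2": {"t1": "a", "t2": "c"},
    "w3": {"t1": "a", "t2": "b"},
}
aggregated = majority_vote(answers)
P = {worker: consistency(labels, aggregated) for worker, labels in answers.items()}
print(P)  # {'w1': 1.0, 'w2': 0.5, 'w3': 1.0}
```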
In this embodiment of the present invention, calculating the worker ability change rate corresponding to each repeated answer set based on the consistency parameters of the workers' answers includes:
calculating a first variance of the workers' consistency parameters when the repeated answer set is retained in the answer set;
calculating a second variance of the workers' consistency parameters when the repeated answer set is deleted from the answer set;
and calculating the worker ability change rate corresponding to the repeated answer set based on the first variance and the second variance.
In an embodiment of the present invention, calculating the first variance of the workers' consistency parameters when the repeated answer set is retained in the answer set includes:
calculating, by the following formula, the first variance of the workers' consistency parameters when the repeated answer set is retained in the answer set:
[formula not reproduced]
wherein Var(P) is the first variance, E(P) is the mean of the workers' consistency parameters, P_i is the consistency parameter between the aggregation result and the answers of worker i, and the remaining symbol denotes the answer set.
In this embodiment of the present invention, calculating the second variance of the workers' consistency parameters when the repeated answer set is deleted from the answer set includes:
calculating, by the following formula, the second variance of the workers' consistency parameters when the repeated answer set is deleted from the answer set:
[formula not reproduced]
wherein Var(P^k) is the second variance, E(P^k) is the mean of the workers' consistency parameters, P_i^k is the consistency parameter between the aggregation result and the answers of worker i, and the remaining symbol denotes the answer set after the repeated answer set has been deleted.
In an embodiment of the present invention, calculating the worker ability change rate corresponding to the repeated answer set based on the first variance and the second variance includes:
calculating the worker ability change rate corresponding to the repeated answer set based on the following formula:
[formula not reproduced]
wherein the quantity defined by the formula is the worker ability change rate.
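The exact way the two variances are combined is not recoverable from this text. The sketch below therefore rests on two labeled assumptions: both variances are population variances, and the change rate is the absolute change in variance relative to the first variance. The function name and the input values are hypothetical.

```python
from statistics import pvariance


def ability_change_rate(p_retained, p_deleted):
    """ASSUMED formula: |Var(P^k) - Var(P)| / Var(P).

    p_retained: consistency parameters of all workers (repeated set retained).
    p_deleted:  consistency parameters of the remaining workers, recomputed
                after the repeated set is deleted.
    """
    first = pvariance(p_retained)   # first variance, Var(P)
    second = pvariance(p_deleted)   # second variance, Var(P^k)
    if first == 0.0:
        # Guard against division by zero; this convention is an assumption.
        return 0.0 if second == 0.0 else float("inf")
    return abs(second - first) / first


# Hypothetical values: three of five workers form the repeated answer set.
p_retained = [0.5, 0.25, 1.0, 1.0, 1.0]
p_deleted = [1.0, 0.75]
rate = ability_change_rate(p_retained, p_deleted)
print(round(rate, 3))
```

A large rate means that removing the repeated answer set changes how evenly the workers agree with the aggregation result, which is the signal the method compares against the preset threshold.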
The embodiment of the invention provides a crowdsourcing result aggregation device based on collusion detection, which comprises:
a collection module, used for collecting, from a crowdsourcing platform, the answer set of each worker for a task set;
a consistency calculation module, used for calculating an aggregation result of the answer set and calculating a consistency parameter between the aggregation result and the answers of each worker;
a worker ability change rate module, used for determining repeated answer sets from the answer set and calculating the worker ability change rate corresponding to each repeated answer set based on the consistency parameters of the workers' answers;
a collusion detection module, used for determining, for a repeated answer set whose worker ability change rate is less than or equal to a preset threshold, that the repeated answer set was generated normally and retaining it in the answer set, and for determining, for a repeated answer set whose worker ability change rate is greater than the preset threshold, that the repeated answer set was generated by collusion and deleting it from the answer set;
and an aggregation module, used for obtaining an updated answer set after each repeated answer set has been retained or deleted, and calculating the aggregation result of the updated answer set.
In an embodiment of the present invention, the consistency calculation module is specifically configured to calculate the consistency parameter between the aggregation result and the answers of each worker based on the following formula:
[formula not reproduced]
wherein P_i is the consistency parameter between the aggregation result and the answers of worker i, L_i is the set of answers returned by worker i for the task set, and the remaining symbol denotes the aggregation result of the answer set.
In an embodiment of the present invention, the worker ability change rate module includes:
a first variance calculating unit, used for calculating a first variance of the workers' consistency parameters when the repeated answer set is retained in the answer set;
a second variance calculating unit, used for calculating a second variance of the workers' consistency parameters when the repeated answer set is deleted from the answer set;
and a worker ability change rate calculation unit, used for calculating the worker ability change rate corresponding to the repeated answer set based on the first variance and the second variance.
In an embodiment of the present invention, the first variance calculating unit is specifically configured to calculate, by the following formula, the first variance of the workers' consistency parameters when the repeated answer set is retained in the answer set:
[formula not reproduced]
wherein Var(P) is the first variance, E(P) is the mean of the workers' consistency parameters, P_i is the consistency parameter between the aggregation result and the answers of worker i, and the remaining symbol denotes the answer set.
In the embodiment of the invention, the second variance calculating unit is specifically configured to calculate, by the following formula, the second variance of the workers' consistency parameters when the repeated answer set is deleted from the answer set:
[formula not reproduced]
wherein Var(P^k) is the second variance, E(P^k) is the mean of the workers' consistency parameters, P_i^k is the consistency parameter between the aggregation result and the answers of worker i, and the remaining symbol denotes the answer set after the repeated answer set has been deleted.
In an embodiment of the present invention, the worker ability change rate calculation unit is specifically configured to calculate the worker ability change rate corresponding to the repeated answer set based on the following formula:
[formula not reproduced]
wherein the quantity defined by the formula is the worker ability change rate.
The technical scheme of the embodiment of the invention has the following advantages:
(1) Unlike spatio-temporal crowdsourcing and social network scenarios, the characteristics of answers to general tasks on a general platform are unknown. Therefore, the embodiment of the invention introduces the concept of consistency between worker answers and the aggregation result to describe the influence of collusion-generated repeated answers on result aggregation.
(2) Unlike similarity-based collusion detection algorithms on e-commerce platforms, the embodiment of the invention provides a collusion detection method based on the worker ability change rate, which can identify repeated answers generated by collusion within an answer set that also contains normal repeated answers.
(3) The embodiment of the invention provides a crowdsourcing result aggregation method with collusion detection, which can effectively eliminate the negative influence of collusion on result aggregation.
Drawings
Fig. 1 is a schematic diagram of a crowdsourcing framework based on collusion detection according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of a crowdsourcing result aggregation method based on collusion detection according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a crowdsourcing result aggregation device based on collusion detection according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the worker ability change rate module according to an embodiment of the present invention.
Detailed Description
So that the manner in which the features and aspects of the embodiments of the present invention can be understood in detail, a more particular description of the embodiments of the invention, briefly summarized above, may be had by reference to the embodiments, some of which are illustrated in the appended drawings.
Existing collusion detection algorithms cannot effectively detect and eliminate the negative effects of collusion, mainly for the following reasons:
(1) Collusion detection algorithms for spatio-temporal crowdsourcing and social networks need to extract certain features of the data to detect collusion; for example, in spatio-temporal crowdsourcing, collusion is detected by using the spatial and temporal features of the collected data. However, such features are difficult to obtain on a general crowdsourcing platform.
(2) Detection algorithms on e-commerce platforms mainly detect collusion based on the similarity between the answers provided by each pair of workers. However, repeated answers to tasks on a general platform fall into two kinds: normal repetition and collusion repetition. On some simple tasks workers show high ability, and many repeated answers are then generated normally. Detecting collusion based on answer similarity would misjudge normally repeated answers as answers generated by collusion.
(3) On auction platforms, participants often collude to obtain a high return at low cost. The corresponding algorithms mainly detect collusion based on game theory and are difficult to apply to general tasks on a general platform.
In summary, for general tasks on a general platform, existing algorithms cannot effectively detect and eliminate the harm that collusion-generated repeated answers do to result quality. To address this problem, the technical scheme of the embodiment of the invention provides a crowdsourcing quality control method based on collusion detection.
Fig. 1 is a schematic diagram of a crowdsourcing framework based on collusion detection according to an embodiment of the present invention. As shown in fig. 1, the framework includes the following steps:
(1) The requester publishes tasks to a crowdsourcing platform, such as Mechanical Turk, and gives a corresponding reward based on the quality of each worker's answers.
(2) Tasks are assigned to workers according to scheduling policies and user-specified platform constraints.
(3) In practice, some workers are not independent and may even collaborate outside the platform to handle some crowdsourced tasks. Workers may collude with each other behind the scenes; for example, workers may copy answers from others working on the same crowdsourced task via an online forum. After task processing, the answers are collected and some noisy answers are eliminated, e.g., answers that are obviously unrelated to the picture in an image tagging task.
(4) This step involves collusion detection and result aggregation. After the answers returned by all workers have been collected, the embodiment of the invention uses a collusion detection mechanism to detect collusion and then filters out the repeated answers generated by colluders. After result filtering, the embodiment of the invention uses an aggregation method to infer the final result of each task and submits it to the requester.
The core of the framework is step (4), which applies the collusion detection method provided by the embodiment of the invention and then a result inference method, so that a high-quality result can be inferred even when collusion is present.
The crowdsourcing framework with collusion detection provided by the embodiment of the invention addresses the problem that existing result aggregation algorithms struggle to eliminate the harm of collusion to result aggregation. Unlike a general crowdsourcing framework, workers in the proposed framework are no longer assumed to be independent, but may communicate or even collude with each other. In addition, the result inference part of the framework includes a collusion detection process.
The technical scheme of the embodiment of the invention comprises three steps overall: collusion detection, result filtering and result aggregation, which are described below.
Step one: collusion detection
(1) Calculating the consistency between the aggregation result and the worker answers: when the workers finish processing the tasks, the answers they return are first collected. Assume that, for a given task set, the workers return an answer set, and consider one of the repeated answer sets within that answer set. The purpose of the embodiment of the invention is to judge whether the repeated set was generated by collusion and, on that basis, to aggregate the answer set to obtain a high-quality result.
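The text does not spell out how a repeated answer set is identified. A minimal sketch, assuming that a repeated answer set is a group of two or more workers whose answers are identical on every task, could look as follows. The `repeated_answer_sets` helper and the sample data are hypothetical.

```python
from collections import defaultdict


def repeated_answer_sets(answers):
    """Group workers with identical answers on every task; keep groups of 2+.

    ASSUMPTION: "repeated" means an exact match over the whole task set.
    """
    groups = defaultdict(list)
    for worker, labels in answers.items():
        key = tuple(sorted(labels.items()))  # order-independent fingerprint
        groups[key].append(worker)
    return [workers for workers in groups.values() if len(workers) > 1]


# Hypothetical answer set: worker -> {task: label}.
answers = {
    "w1": {"t1": "a", "t2": "a"},
    "w2": {"t1": "a", "t2": "b"},
    "w3": {"t1": "b", "t2": "b"},
    "w4": {"t1": "b", "t2": "b"},
    "w5": {"t1": "b", "t2": "b"},
}
print(repeated_answer_sets(answers))  # [['w3', 'w4', 'w5']]
```

A repeated set found this way is only a candidate: on easy tasks, honest workers can legitimately give identical answers, which is why the method goes on to test each set with the worker ability change rate.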
The answer set is aggregated by majority voting to obtain the aggregation result. The embodiment of the invention provides a calculation formula for the consistency between the aggregation result and the answers of worker i:
[formula not reproduced]
wherein L_i is the answer set returned by worker i for the task set.
(2) Calculating the worker ability change rate of each repeated answer set: for a repeated answer set, the worker ability change rate measures the overall effect of that set on the consistency between worker answers and the aggregation result. The embodiment of the invention uses the change in the variance of the overall consistency before and after the repeated answer set is deleted to define the worker ability change rate. First, the variance of worker answer consistency is calculated when the repeated answer set is retained:
[formula not reproduced]
Deleting the repeated answer set yields a reduced answer set. Similarly, the variance of worker answer consistency is calculated for the answer set after deletion:
[formula not reproduced]
Then, the worker ability change rate is obtained from the two formulas above:
[formula not reproduced]
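Because the three formulas in this step are not reproduced in the text, the following block writes out one self-consistent reading as a worked reference. Everything in it is an assumption: the symbol \hat{L} for the aggregation result, the worker counts n and n_k, the agreement-fraction form of P_i, the population form of the variance, and the relative-change form of the rate \Delta_k.

```latex
% ASSUMED notation: \hat{L} is the aggregation result, n the number of workers,
% n_k the number of workers left after deleting repeated answer set k.
\begin{align}
P_i &= \frac{\lvert \{\, t : L_i(t) = \hat{L}(t) \,\} \rvert}{\lvert L_i \rvert} \\
\mathrm{Var}(P) &= \frac{1}{n} \sum_{i=1}^{n} \bigl(P_i - E(P)\bigr)^2,
  \qquad E(P) = \frac{1}{n} \sum_{i=1}^{n} P_i \\
\mathrm{Var}(P^k) &= \frac{1}{n_k} \sum_{i=1}^{n_k} \bigl(P_i^k - E(P^k)\bigr)^2 \\
\Delta_k &= \frac{\lvert \mathrm{Var}(P^k) - \mathrm{Var}(P) \rvert}{\mathrm{Var}(P)}
\end{align}
```

Under this reading, P_i^k is recomputed against the aggregation result of the reduced answer set, so deleting a dominant repeated set can change both the aggregation result and every remaining worker's consistency.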
(3) Determining whether the repeated answers were generated by collusion: when the worker ability change rate is less than or equal to Threshold, the repeated set is considered to consist of normal repeated answers. When the worker ability change rate exceeds Threshold, the repeated set is considered to consist of repeated answers generated by collusion.
In the above scheme, when the variance of worker answer consistency is calculated for a repeated answer set, the aggregation result can also be obtained by other aggregation algorithms, such as probabilistic aggregation methods.
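To illustrate that the aggregation step is replaceable, the sketch below swaps majority voting for a simple weighted vote. This is a stand-in chosen for brevity, not the probabilistic method the text refers to. The `weighted_vote` helper, the weights and the sample data are hypothetical.

```python
from collections import defaultdict


def weighted_vote(answers, weights):
    """Per task, pick the label with the largest total worker weight.

    answers: worker -> {task: label}; weights: worker -> non-negative weight.
    Workers without a weight contribute nothing (an assumption of this sketch).
    """
    scores = defaultdict(lambda: defaultdict(float))
    for worker, labels in answers.items():
        for task, label in labels.items():
            scores[task][label] += weights.get(worker, 0.0)
    return {task: max(s, key=s.get) for task, s in scores.items()}


# Hypothetical data: one trusted worker against two low-weight workers.
answers = {"w1": {"t1": "a"}, "w2": {"t1": "b"}, "w3": {"t1": "b"}}
weights = {"w1": 0.9, "w2": 0.3, "w3": 0.3}
print(weighted_vote(answers, weights))  # {'t1': 'a'}
```

Any aggregator with the same input and output shape could be used when computing the consistency parameters, since the detection step only needs an aggregation result to compare each worker against.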
Step two: result filtering
The above steps are repeated for the answer set until all repeated sets in it have been examined. Repeated answers determined to be generated by collusion are deleted, and answers determined to be normal repetition are retained.
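The filtering step can be sketched as follows, assuming the repeated answer sets and their worker ability change rates have already been computed. The `filter_collusion` helper, the rates and the threshold value are hypothetical and chosen only for illustration.

```python
def filter_collusion(answers, groups, rates, threshold):
    """Delete every repeated answer set whose change rate exceeds the threshold.

    answers: worker -> {task: label}
    groups:  list of repeated answer sets, each a list of worker ids
    rates:   worker ability change rate of each group, in the same order
    """
    colluders = set()
    for group, rate in zip(groups, rates):
        if rate > threshold:          # judged to be generated by collusion
            colluders.update(group)
        # otherwise the set is judged normal repetition and is retained
    return {w: a for w, a in answers.items() if w not in colluders}


# Hypothetical data: two repeated sets, only the first exceeds the threshold.
answers = {w: {"t1": "a"} for w in ["w1", "w2", "w3", "w4", "w5", "w6"]}
groups = [["w3", "w4", "w5"], ["w1", "w6"]]
rates = [0.84, 0.05]
updated = filter_collusion(answers, groups, rates, threshold=0.5)
print(sorted(updated))  # ['w1', 'w2', 'w6']
```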
Step three: result aggregation
The answer set is then aggregated with an existing result aggregation algorithm to obtain the final result.
Fig. 2 is a schematic flow chart of a crowdsourcing result aggregation method based on collusion detection according to an embodiment of the present invention. As shown in fig. 2, the method includes the following steps:
Step 201: the answer set of each worker for a task set is collected from the crowdsourcing platform.
Step 202: an aggregation result of the answer set is calculated, and a consistency parameter between the aggregation result and the answers of each worker is calculated.
In the embodiment of the invention, the consistency parameter between the aggregation result and the answers of each worker is calculated based on the following formula:
[formula not reproduced]
wherein P_i is the consistency parameter between the aggregation result and the answers of worker i, L_i is the set of answers returned by worker i for the task set, and the remaining symbol denotes the aggregation result of the answer set.
Step 203: repeated answer sets are determined from the answer set, and the worker ability change rate corresponding to each repeated answer set is calculated based on the consistency parameters of the workers' answers.
In this embodiment of the present invention, calculating the worker ability change rate corresponding to each repeated answer set based on the consistency parameters of the workers' answers includes:
calculating a first variance of the workers' consistency parameters when the repeated answer set is retained in the answer set;
calculating a second variance of the workers' consistency parameters when the repeated answer set is deleted from the answer set;
and calculating the worker ability change rate corresponding to the repeated answer set based on the first variance and the second variance.
The first variance of the workers' consistency parameters when the repeated answer set is retained in the answer set is calculated by the following formula:
[formula not reproduced]
wherein Var(P) is the first variance, E(P) is the mean of the workers' consistency parameters, P_i is the consistency parameter between the aggregation result and the answers of worker i, and the remaining symbol denotes the answer set.
The second variance of the workers' consistency parameters when the repeated answer set is deleted from the answer set is calculated by the following formula:
[formula not reproduced]
wherein Var(P^k) is the second variance, E(P^k) is the mean of the workers' consistency parameters, P_i^k is the consistency parameter between the aggregation result and the answers of worker i, and the remaining symbol denotes the answer set after the repeated answer set has been deleted.
The worker ability change rate corresponding to the repeated answer set is calculated based on the following formula:
[formula not reproduced]
wherein the quantity defined by the formula is the worker ability change rate.
Step 204: for a repeated answer set whose worker ability change rate is less than or equal to a preset threshold, the repeated answer set is determined to be generated normally and is retained in the answer set.
Step 205: for a repeated answer set whose worker ability change rate is greater than the preset threshold, the repeated answer set is determined to be generated by collusion and is deleted from the answer set.
Step 206: after each repeated answer set has been retained or deleted, an updated answer set is obtained, and the aggregation result of the updated answer set is calculated.
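Steps 201 to 206 can be sketched end to end as follows. This is a minimal sketch under stated assumptions rather than the patented formulas: consistency is taken as the fraction of a worker's answers that match a majority-vote result, a repeated answer set is a group of workers with identical answers, and the change rate is the relative absolute change in population variance. All function names, the sample data and the threshold of 0.5 are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import pvariance


def majority_vote(answers):
    """Step 202 (part): per-task majority vote (ties: first seen)."""
    by_task = defaultdict(list)
    for labels in answers.values():
        for task, label in labels.items():
            by_task[task].append(label)
    return {t: Counter(ls).most_common(1)[0][0] for t, ls in by_task.items()}


def consistencies(answers):
    """Step 202 (part): ASSUMED P_i = share of answers matching the vote."""
    agg = majority_vote(answers)
    return [
        sum(agg[t] == l for t, l in labels.items()) / len(labels)
        for labels in answers.values() if labels
    ]


def repeated_sets(answers):
    """Step 203 (part): ASSUMED repeated set = workers with identical answers."""
    groups = defaultdict(list)
    for worker, labels in answers.items():
        groups[tuple(sorted(labels.items()))].append(worker)
    return [g for g in groups.values() if len(g) > 1]


def change_rate(answers, group):
    """Step 203 (part): ASSUMED rate = |Var(P^k) - Var(P)| / Var(P)."""
    remaining = {w: a for w, a in answers.items() if w not in group}
    if not remaining:
        return 0.0  # nothing left to compare against (sketch convention)
    first = pvariance(consistencies(answers))
    second = pvariance(consistencies(remaining))
    if first == 0.0:
        return 0.0 if second == 0.0 else float("inf")
    return abs(second - first) / first


def aggregate_with_collusion_detection(answers, threshold):
    colluders = set()
    for group in repeated_sets(answers):
        if change_rate(answers, group) > threshold:   # step 205: delete
            colluders.update(group)
        # step 204: otherwise the repeated set is retained
    updated = {w: a for w, a in answers.items() if w not in colluders}
    return majority_vote(updated), sorted(colluders)  # step 206


# Step 201: hypothetical collected answers; w3, w4 and w5 submit copies.
copied = {"t1": "b", "t2": "b", "t3": "a", "t4": "a"}
answers = {
    "w1": {"t1": "a", "t2": "a", "t3": "a", "t4": "a"},
    "w2": {"t1": "a", "t2": "a", "t3": "a", "t4": "b"},
    "w3": dict(copied), "w4": dict(copied), "w5": dict(copied),
}
result, colluders = aggregate_with_collusion_detection(answers, threshold=0.5)
print(colluders)  # ['w3', 'w4', 'w5']
print(result)     # {'t1': 'a', 't2': 'a', 't3': 'a', 't4': 'a'}
```

In this sketch every repeated set is scored against the original answer set. Whether the scoring should instead be applied to the progressively filtered set is not specified in the text and is left as a design choice.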
The collusion detection method provided by the embodiment of the invention can detect collusion groups with high precision from the results given by workers. The change in the variance of the consistency between worker answers and the aggregation result, before and after a given repeated answer set is deleted, is used to formalize the worker ability change rate, and collusion is detected from the magnitude of that rate. The result processing method of deleting collusion results before aggregation, provided by the embodiment of the invention, can greatly improve the accuracy of the aggregation result. Unlike existing aggregation algorithms, the result aggregation method provided by the embodiment of the invention includes detection of collusion, can effectively eliminate its negative influence on result aggregation, and improves result quality.
Fig. 3 is a schematic structural diagram of a crowdsourcing result aggregation device based on collusion detection according to an embodiment of the present invention. As shown in fig. 3, the device includes:
a collection module 301, configured to collect, from a crowdsourcing platform, the answer set of each worker for a task set;
a consistency calculation module 302, configured to calculate an aggregation result of the answer set, and to calculate a consistency parameter between the aggregation result and the answers of each worker;
a worker ability change rate module 303, configured to determine repeated answer sets from the answer set, and to calculate the worker ability change rate corresponding to each repeated answer set based on the consistency parameters of the workers' answers;
a collusion detection module 304, configured to determine, for a repeated answer set whose worker ability change rate is less than or equal to a preset threshold, that the repeated answer set was generated normally and retain it in the answer set, and to determine, for a repeated answer set whose worker ability change rate is greater than the preset threshold, that the repeated answer set was generated by collusion and delete it from the answer set;
an aggregation module 305, configured to obtain an updated answer set after each repeated answer set has been retained or deleted, and to calculate the aggregation result of the updated answer set.
In an embodiment of the present invention, the consistency calculation module 302 is specifically configured to calculate the consistency parameter between the aggregation result and the answers of each worker based on the following formula:
[formula not reproduced]
wherein P_i is the consistency parameter between the aggregation result and the answers of worker i, L_i is the set of answers returned by worker i for the task set, and the remaining symbol denotes the aggregation result of the answer set.
In an embodiment of the present invention, as shown in fig. 4, the worker ability change rate module 303 includes:
a first variance calculating unit 3031, configured to calculate a first variance of the workers' consistency parameters when the repeated answer set is retained in the answer set;
a second variance calculating unit 3032, configured to calculate a second variance of the workers' consistency parameters when the repeated answer set is deleted from the answer set;
a worker ability change rate calculation unit 3033, configured to calculate the worker ability change rate corresponding to the repeated answer set based on the first variance and the second variance.
In an embodiment of the present invention, the first variance calculating unit 3031 is specifically configured to calculate the remaining repeated answer sets in the answer set according to the following formulaAnd then, the first variance of the consistency parameters of the answers of the workers:
wherein Var (P) is the first variance, E (P) is the average of the consistency parameters of each worker, PiTo converge the results and consistency parameters for worker i and,is a set of answers.
In an embodiment of the present invention, the second variance calculating unit 3032 is specifically configured to calculate, according to the following formula, the second variance of the consistency parameters of the workers' answers when the repeated answer set is deleted from the answer set:
where Var(P^k) is the second variance, E(P^k) is the mean of the workers' consistency parameters, P_i^k is the consistency parameter between the aggregation result and worker i, and the final symbol denotes the answer set after the repeated answer set has been deleted.
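The two variances can be sketched together as below. This is a hedged illustration, not the embodiment's formula. It assumes the population variance of the consistency parameters, an agreement-fraction consistency parameter, majority-vote aggregation, and a repeated answer set represented by the ids of the workers who submitted it. All function names are hypothetical.

```python
from collections import Counter
from statistics import pvariance


def majority_vote(answers):
    """Assumed aggregation rule: per-task plurality, ties to smallest label."""
    votes = {}
    for labels in answers.values():
        for task, label in labels.items():
            votes.setdefault(task, Counter())[label] += 1
    return {task: max(sorted(c), key=c.get) for task, c in votes.items()}


def consistency(labels, aggregate):
    """Assumed P_i: fraction of the worker's tasks matching the aggregate."""
    tasks = [t for t in labels if t in aggregate]
    if not tasks:
        return 0.0
    return sum(labels[t] == aggregate[t] for t in tasks) / len(tasks)


def consistency_variance(answers):
    """Population variance of P_i over all workers in `answers`."""
    aggregate = majority_vote(answers)
    params = [consistency(labels, aggregate) for labels in answers.values()]
    return pvariance(params) if params else 0.0


def first_and_second_variance(answers, repeated_workers):
    """Var(P) with the repeated set retained, Var(P^k) with it deleted."""
    first = consistency_variance(answers)
    remaining = {w: a for w, a in answers.items() if w not in repeated_workers}
    # The aggregation result is recomputed on the reduced answer set,
    # so every remaining worker's consistency parameter may change.
    second = consistency_variance(remaining)
    return first, second
```

Recomputing the aggregation result after deletion is the design choice that makes the second variance informative: a colluding group can shift the aggregate, and removing it changes how consistent the remaining workers appear.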
In an embodiment of the present invention, the worker capability change rate calculation unit 3033 is specifically configured to calculate the worker capability change rate corresponding to the repeated answer set based on the following formula:
where the quantity defined by the formula is the worker capability change rate.
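One plausible way to turn the two variances into a change rate is the relative change of the variance when the repeated answer set is deleted. The sketch below assumes exactly that; it is an illustrative stand-in, the embodiment's own formula may differ, and the names `capability_change_rate` and `eps` are hypothetical.

```python
def capability_change_rate(first_variance, second_variance, eps=1e-12):
    """Assumed change rate: |Var(P^k) - Var(P)| / Var(P).

    first_variance:  Var(P), repeated answer set retained
    second_variance: Var(P^k), repeated answer set deleted
    eps guards against division by zero when Var(P) is 0.
    """
    return abs(second_variance - first_variance) / max(first_variance, eps)
```

The resulting value would then be compared with the preset threshold: above the threshold the repeated answer set is treated as collusion, otherwise as normally generated.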
Those skilled in the art should understand that the functions implemented by the modules of the collusion-detection-based crowdsourcing result aggregation device shown in fig. 3 can be understood with reference to the foregoing description of the collusion-detection-based crowdsourcing result aggregation method, and that these functions can be implemented by a program running on a processor or by a specific logic circuit.
The technical schemes described in the embodiments of the present invention can be combined arbitrarily without conflict.
In the embodiments provided in the present invention, it should be understood that the disclosed method and intelligent device may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, and some features may be omitted or not implemented. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be made through interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or of another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that all or part of the steps of the method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments. The storage medium includes a mobile storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
Alternatively, the apparatus according to the embodiment of the present invention may be stored in a computer-readable storage medium if it is implemented in the form of a software functional module and sold or used as a stand-alone product. Based on such understanding, the technical solutions of the embodiments of the present invention may be essentially implemented or a part contributing to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a mobile storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention.
Claims (12)
1. A method for crowdsourcing result aggregation based on collusion detection, the method comprising:
collecting, from a crowdsourcing platform, an answer set of each worker for a task set;
calculating an aggregation result of the answer set, and calculating a consistency parameter between the aggregation result and the answers of each worker;
determining repeated answer sets from the answer set, and calculating a worker capability change rate corresponding to each repeated answer set based on the consistency parameters of the workers' answers;
for a repeated answer set whose worker capability change rate is less than or equal to a preset threshold, determining that the repeated answer set was generated normally and retaining it in the answer set;
for a repeated answer set whose worker capability change rate is greater than the preset threshold, determining that the repeated answer set was generated by collusion and deleting it from the answer set;
and obtaining an updated answer set after each repeated answer set has been retained or deleted, and calculating an aggregation result of the updated answer set.
2. The crowdsourcing result aggregation method based on collusion detection according to claim 1, wherein the calculating a consistency parameter between the aggregation result and the answers of each worker comprises:
calculating the consistency parameter between the aggregation result and the answers of each worker based on the following formula:
where P_i is the consistency parameter between the aggregation result and worker i, L_i is the set of answers returned by worker i for the task set, and the remaining symbol in the formula denotes the aggregation result of the answer set.
3. The crowdsourcing result aggregation method based on collusion detection according to claim 2, wherein the calculating a worker capability change rate corresponding to each repeated answer set based on the consistency parameters of the workers' answers comprises:
calculating a first variance of the consistency parameters of the workers' answers when a repeated answer set is retained in the answer set;
calculating a second variance of the consistency parameters of the workers' answers when the repeated answer set is deleted from the answer set;
and calculating the worker capability change rate corresponding to the repeated answer set based on the first variance and the second variance.
4. The crowdsourcing result aggregation method based on collusion detection according to claim 3, wherein the calculating a first variance of the consistency parameters of the workers' answers when a repeated answer set is retained in the answer set comprises:
calculating, by the following formula, the first variance of the consistency parameters of the workers' answers when the repeated answer set is retained in the answer set:
where Var(P) is the first variance, E(P) is the mean of the workers' consistency parameters, P_i is the consistency parameter between the aggregation result and worker i, and the remaining symbol denotes the answer set.
5. The crowdsourcing result aggregation method based on collusion detection according to claim 3 or 4, wherein the calculating a second variance of the consistency parameters of the workers' answers when the repeated answer set is deleted from the answer set comprises:
calculating, by the following formula, the second variance of the consistency parameters of the workers' answers when the repeated answer set is deleted from the answer set:
where Var(P^k) is the second variance, E(P^k) is the mean of the workers' consistency parameters, P_i^k is the consistency parameter between the aggregation result and worker i, and the final symbol denotes the answer set after the repeated answer set has been deleted.
6. The crowdsourcing result aggregation method based on collusion detection according to claim 5, wherein the calculating the worker capability change rate corresponding to the repeated answer set based on the first variance and the second variance comprises:
calculating the worker capability change rate corresponding to the repeated answer set based on the following formula:
where the quantity defined by the formula is the worker capability change rate.
7. A crowdsourcing result aggregation device based on collusion detection, the device comprising:
a collection module, configured to collect, from a crowdsourcing platform, an answer set of each worker for a task set;
a consistency calculation module, configured to calculate an aggregation result of the answer set and to calculate a consistency parameter between the aggregation result and the answers of each worker;
a worker capability change rate module, configured to determine repeated answer sets from the answer set and to calculate a worker capability change rate corresponding to each repeated answer set based on the consistency parameters of the workers' answers;
a collusion detection module, configured to: for a repeated answer set whose worker capability change rate is less than or equal to a preset threshold, determine that the repeated answer set was generated normally and retain it in the answer set; and for a repeated answer set whose worker capability change rate is greater than the preset threshold, determine that the repeated answer set was generated by collusion and delete it from the answer set;
and an aggregation module, configured to obtain an updated answer set after each repeated answer set has been retained or deleted, and to calculate an aggregation result of the updated answer set.
8. The crowdsourcing result aggregation device based on collusion detection according to claim 7, wherein the consistency calculation module is specifically configured to calculate the consistency parameter between the aggregation result and the answers of each worker based on the following formula:
where P_i is the consistency parameter between the aggregation result and worker i, L_i is the set of answers returned by worker i for the task set, and the remaining symbol in the formula denotes the aggregation result of the answer set.
9. The crowdsourcing result aggregation device based on collusion detection according to claim 7, wherein the worker capability change rate module comprises:
a first variance calculating unit, configured to calculate a first variance of the consistency parameters of the workers' answers when a repeated answer set is retained in the answer set;
a second variance calculating unit, configured to calculate a second variance of the consistency parameters of the workers' answers when the repeated answer set is deleted from the answer set;
and a worker capability change rate calculation unit, configured to calculate the worker capability change rate corresponding to the repeated answer set based on the first variance and the second variance.
10. The crowdsourcing result aggregation device based on collusion detection according to claim 9, wherein the first variance calculating unit is specifically configured to calculate, by the following formula, the first variance of the consistency parameters of the workers' answers when the repeated answer set is retained in the answer set:
where Var(P) is the first variance, E(P) is the mean of the workers' consistency parameters, P_i is the consistency parameter between the aggregation result and worker i, and the remaining symbol denotes the answer set.
11. The crowdsourcing result aggregation device based on collusion detection according to claim 9 or 10, wherein the second variance calculation unit is specifically configured to calculate, by the following formula, the second variance of the consistency parameters of the workers' answers when the repeated answer set is deleted from the answer set:
where Var(P^k) is the second variance, E(P^k) is the mean of the workers' consistency parameters, P_i^k is the consistency parameter between the aggregation result and worker i, and the final symbol denotes the answer set after the repeated answer set has been deleted.
12. The crowdsourcing result aggregation device based on collusion detection according to claim 11, wherein the worker capability change rate calculation unit is specifically configured to calculate the worker capability change rate corresponding to the repeated answer set based on the following formula:
where the quantity defined by the formula is the worker capability change rate.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711003779.2A CN107767055B (en) | 2017-10-24 | 2017-10-24 | Crowdsourcing result aggregation method and device based on collusion detection |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711003779.2A CN107767055B (en) | 2017-10-24 | 2017-10-24 | Crowdsourcing result aggregation method and device based on collusion detection |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107767055A (en) | 2018-03-06 |
CN107767055B (en) | 2021-07-23 |
Family
ID=61270213
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711003779.2A Active CN107767055B (en) | 2017-10-24 | 2017-10-24 | Crowdsourcing result aggregation method and device based on collusion detection |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107767055B (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109978333A (en) * | 2019-02-26 | 2019-07-05 | 湖南大学 | Independent worker selection method based on community discovery and link prediction in a crowdsourcing system |
CN110930114A (en) * | 2019-11-20 | 2020-03-27 | 北京航空航天大学 | Crowdsourcing method for resisting collusion |
CN111292062A (en) * | 2020-02-10 | 2020-06-16 | 中南大学 | Crowdsourcing garbage worker detection method and system based on network embedding and storage medium |
US11386299B2 (en) | 2018-11-16 | 2022-07-12 | Yandex Europe Ag | Method of completing a task |
US11416773B2 (en) | 2019-05-27 | 2022-08-16 | Yandex Europe Ag | Method and system for determining result for task executed in crowd-sourced environment |
US11475387B2 (en) | 2019-09-09 | 2022-10-18 | Yandex Europe Ag | Method and system for determining productivity rate of user in computer-implemented crowd-sourced environment |
US11481650B2 (en) | 2019-11-05 | 2022-10-25 | Yandex Europe Ag | Method and system for selecting label from plurality of labels for task in crowd-sourced environment |
US11727336B2 (en) | 2019-04-15 | 2023-08-15 | Yandex Europe Ag | Method and system for determining result for task executed in crowd-sourced environment |
US11727329B2 (en) | 2020-02-14 | 2023-08-15 | Yandex Europe Ag | Method and system for receiving label for digital task executed within crowd-sourced environment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104133769A (en) * | 2014-08-02 | 2014-11-05 | 哈尔滨理工大学 | Crowdsourcing fraud detection method based on psychological behavior analysis |
CN104599084A (en) * | 2015-02-12 | 2015-05-06 | 北京航空航天大学 | Crowd calculation quality control method and device |
US20170187751A1 (en) * | 2015-12-29 | 2017-06-29 | International Business Machines Corporation | Propagating fraud awareness to hosted applications |
CN107273492A (en) * | 2017-06-15 | 2017-10-20 | 复旦大学 | Interaction method for processing image labeling tasks based on a crowdsourcing platform |
Non-Patent Citations (1)
Title |
---|
ANTONIO FERNÁNDEZ ANTA 等: ""Algorithmic Mechanisms for Reliable Crowdsourcing Computation under Collusion"", 《PLOS ONE》 * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11386299B2 (en) | 2018-11-16 | 2022-07-12 | Yandex Europe Ag | Method of completing a task |
CN109978333A (en) * | 2019-02-26 | 2019-07-05 | 湖南大学 | Independent worker selection method based on community discovery and link prediction in a crowdsourcing system |
US11727336B2 (en) | 2019-04-15 | 2023-08-15 | Yandex Europe Ag | Method and system for determining result for task executed in crowd-sourced environment |
US11416773B2 (en) | 2019-05-27 | 2022-08-16 | Yandex Europe Ag | Method and system for determining result for task executed in crowd-sourced environment |
US11475387B2 (en) | 2019-09-09 | 2022-10-18 | Yandex Europe Ag | Method and system for determining productivity rate of user in computer-implemented crowd-sourced environment |
US11481650B2 (en) | 2019-11-05 | 2022-10-25 | Yandex Europe Ag | Method and system for selecting label from plurality of labels for task in crowd-sourced environment |
CN110930114A (en) * | 2019-11-20 | 2020-03-27 | 北京航空航天大学 | Crowdsourcing method for resisting collusion |
CN110930114B (en) * | 2019-11-20 | 2022-08-23 | 北京航空航天大学 | Crowdsourcing method for resisting collusion |
CN111292062A (en) * | 2020-02-10 | 2020-06-16 | 中南大学 | Crowdsourcing garbage worker detection method and system based on network embedding and storage medium |
US11727329B2 (en) | 2020-02-14 | 2023-08-15 | Yandex Europe Ag | Method and system for receiving label for digital task executed within crowd-sourced environment |
Also Published As
Publication number | Publication date |
---|---|
CN107767055B (en) | 2021-07-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107767055B (en) | Crowdsourcing result aggregation method and device based on collusion detection | |
CN109086720B (en) | Face clustering method, face clustering device and storage medium | |
JP6594329B2 (en) | System and method for facial expression | |
CN108830145B (en) | People counting method based on deep neural network and storage medium | |
CN104809132B (en) | A kind of method and device obtaining network principal social networks type | |
CN109840467A (en) | A kind of in-vivo detection method and system | |
JP2015529904A (en) | User recommendation method and user recommendation system using the method | |
CN111898592B (en) | Track data processing method and device and computer readable storage medium | |
CN111325204A (en) | Target detection method, target detection device, electronic equipment and storage medium | |
WO2023169274A1 (en) | Data processing method and device, and storage medium and processor | |
WO2021212760A1 (en) | Method and apparatus for determining identity type of person, and electronic system | |
CN118211268A (en) | Heterogeneous federal learning privacy protection method and system based on diffusion model | |
US10791321B2 (en) | Constructing a user's face model using particle filters | |
JP2019020882A (en) | Life log utilization system, method and program | |
WO2024159888A1 (en) | Image restoration method and apparatus, and computer device, program product and storage medium | |
CN117688255A (en) | Recommendation method and system based on double social view contrast learning | |
CN110136019B (en) | Social media abnormal group user detection method based on relational evolution | |
CN102955947B (en) | A kind of device and method thereof for being used to determine image definition | |
CN106815264B (en) | Information processing method and system | |
CN111652673A (en) | Intelligent recommendation method, device, server and storage medium | |
CN111723338A (en) | Detection method and detection equipment | |
CN113326829B (en) | Method and device for recognizing gesture in video, readable storage medium and electronic equipment | |
CN115311723A (en) | Living body detection method, living body detection device and computer-readable storage medium | |
CN111461971B (en) | Image processing method, device, equipment and computer readable storage medium | |
TWM610750U (en) | Deep learning device for augmented reality somatosensory game machine |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||