CN107767055A - Crowdsourcing result aggregation method and device based on collusion detection - Google Patents

Crowdsourcing result aggregation method and device based on collusion detection

Info

Publication number
CN107767055A
Authority
CN
China
Prior art keywords
worker
answer
answer set
parameter
consistency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711003779.2A
Other languages
Chinese (zh)
Other versions
CN107767055B (en)
Inventor
孙海龙
王旭
陈鹏鹏
方毅立
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN201711003779.2A
Publication of CN107767055A
Application granted
Publication of CN107767055B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a crowdsourcing result aggregation method and device based on collusion detection. The method includes: collecting, from a crowdsourcing platform, the answer set that each worker returns for a set of tasks; calculating the aggregated result of the answer set, and calculating a consistency parameter between the aggregated result and each worker's answers; determining the duplicate answer sets in the answer set and, based on each worker's consistency parameter, calculating the worker ability change rate corresponding to each duplicate answer set; for a duplicate answer set whose worker ability change rate is less than or equal to a preset threshold, determining that the duplicate answer set arose normally and retaining it in the answer set; for a duplicate answer set whose worker ability change rate is greater than the preset threshold, determining that the duplicate answer set was produced by collusion and deleting it from the answer set; and obtaining the updated answer set and calculating its aggregated result.

Description

Crowdsourcing result aggregation method and device based on collusion detection
Technical field
The present invention relates to the technical field of crowdsourcing, and in particular to a crowdsourcing result aggregation method and device based on collusion detection.
Background art
Crowdsourcing is a fast-developing field that uses human cognitive strengths to solve problems that are hard for computers. General-purpose crowdsourcing platforms such as CrowdFlower and AMT are widely used for general data processing tasks such as sentiment analysis, handwriting recognition, and image annotation. Because workers may return low-quality results, a key problem in crowdsourcing is ensuring result quality. A widely used quality control method is result aggregation: each task is first assigned to multiple workers, and the results the workers return are then aggregated with an inference algorithm. Taking image annotation as an example, an image is assigned to multiple workers, each of whom provides labels describing the image content. Finally, a high-quality result is aggregated from all collected labels by voting or another inference method.
In crowdsourcing, to earn more reward for less work, colluders communicate outside the platform by text message, WeChat, telephone, forums, or even face to face, and form collusion teams. In a collusion team, only one worker processes the tasks and the other workers copy that worker's answers, so all workers in the team ultimately submit identical answers. During result aggregation these malicious duplicate answers can overwhelm the answers given by normal workers and lower the quality of the result. For example, if a task is given to five workers and three of them collude, then under majority voting the final aggregated result is simply the answer provided by the colluders.
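The five-worker example above can be made concrete with a minimal sketch. The label values and the `majority_vote` helper are illustrative assumptions and are not taken from the patent.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most frequent label (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]

# Hypothetical labels for one task whose true label is "cat".
honest = ["cat", "cat"]              # two independent workers
colluders = ["dog", "dog", "dog"]    # one worker answers, two copy the answer

with_collusion = majority_vote(honest + colluders)   # copied answer dominates
without_collusion = majority_vote(honest)            # honest answer survives
print(with_collusion, without_collusion)
```

Three identical copies outvote two independent correct answers, which is the failure mode the method aims to remove.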
As shown above, duplicate answers produced by collusion harm the result quality of general tasks on general-purpose platforms. However, existing collusion detection algorithms cannot effectively detect and eliminate the negative effect of this kind of collusion.
Summary of the invention
To solve the above technical problem, embodiments of the present invention provide a crowdsourcing result aggregation method and device based on collusion detection.
The crowdsourcing result aggregation method based on collusion detection provided by an embodiment of the present invention includes:
collecting, from a crowdsourcing platform, the answer set that each worker returns for a set of tasks;
calculating the aggregated result of the answer set, and calculating a consistency parameter between the aggregated result and each worker's answers;
determining the duplicate answer sets in the answer set and, based on each worker's consistency parameter, calculating the worker ability change rate corresponding to each duplicate answer set;
for a duplicate answer set whose worker ability change rate is less than or equal to a preset threshold, determining that the duplicate answer set arose normally and retaining it in the answer set;
for a duplicate answer set whose worker ability change rate is greater than the preset threshold, determining that the duplicate answer set was produced by collusion and deleting it from the answer set;
after each duplicate answer set has been retained or deleted, obtaining the updated answer set and calculating the aggregated result of the updated answer set.
In an embodiment of the present invention, calculating the consistency parameter between the aggregated result and each worker's answers includes:
calculating the consistency parameter between the aggregated result and each worker's answers by the following formula:
[formula image not reproduced]
where P_i is the consistency parameter between the aggregated result and worker i, L_i is the answer set that worker i returns for the set of tasks, and the formula's remaining term is the aggregated result of the answer set.
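A minimal sketch of one plausible reading of the consistency parameter follows. Because the formula itself does not appear in this text, the sketch assumes that P_i is the fraction of worker i's answers that match the majority-vote result; the function names `aggregate` and `consistency` and the toy data are hypothetical.

```python
from collections import Counter

def aggregate(answers):
    """Majority vote per task. answers: {worker: {task: label}}."""
    votes = {}
    for labels in answers.values():
        for task, label in labels.items():
            votes.setdefault(task, []).append(label)
    return {t: Counter(v).most_common(1)[0][0] for t, v in votes.items()}

def consistency(answers, result):
    """Assumed form of P_i: share of worker i's labels equal to the aggregated label.
    Each worker is assumed to have answered at least one task."""
    return {
        w: sum(result[t] == l for t, l in labels.items()) / len(labels)
        for w, labels in answers.items()
    }

answers = {
    "w1": {"t1": "a", "t2": "b", "t3": "a"},
    "w2": {"t1": "a", "t2": "b", "t3": "b"},
    "w3": {"t1": "a", "t2": "a", "t3": "a"},
}
result = aggregate(answers)        # per-task majority label
P = consistency(answers, result)   # one value in [0, 1] per worker
print(result, P)
```

Under this assumption a worker who always agrees with the aggregated result has P_i = 1, and lower values indicate more disagreement.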
In an embodiment of the present invention, calculating the worker ability change rate corresponding to each duplicate answer set based on each worker's consistency parameter includes:
calculating a first variance of the workers' consistency parameters with the duplicate answer set retained in the answer set;
calculating a second variance of the workers' consistency parameters with the duplicate answer set deleted from the answer set;
calculating the worker ability change rate corresponding to the duplicate answer set based on the first variance and the second variance.
In an embodiment of the present invention, calculating the first variance of the workers' consistency parameters with the duplicate answer set retained in the answer set includes:
calculating, by the following formula, the first variance of the workers' consistency parameters with the duplicate answer set retained in the answer set:
[formula image not reproduced]
where Var(P) is the first variance, E(P) is the mean of the workers' consistency parameters, P_i is the consistency parameter between the aggregated result and worker i, and the formula's remaining term is the answer set.
In an embodiment of the present invention, calculating the second variance of the workers' consistency parameters with the duplicate answer set deleted from the answer set includes:
calculating, by the following formula, the second variance of the workers' consistency parameters with the duplicate answer set deleted from the answer set:
[formula image not reproduced]
where Var(P^k) is the second variance, E(P^k) is the mean of the workers' consistency parameters, P_i^k is the consistency parameter between the aggregated result and worker i, and the formula's remaining term is the answer set after the duplicate answer set has been deleted.
In an embodiment of the present invention, calculating the worker ability change rate corresponding to the duplicate answer set based on the first variance and the second variance includes:
calculating, by the following formula, the worker ability change rate corresponding to the duplicate answer set:
[formula image not reproduced]
where the quantity defined by the formula is the worker ability change rate.
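The two variances and the change rate can be sketched as follows. The formulas do not appear in this text, so two assumptions are made and should be read as such: the variance is the population variance of the consistency parameters, and the change rate is the relative change |Var(P^k) - Var(P)| / Var(P). The function name `change_rate` and the zero-variance handling are also hypothetical.

```python
from statistics import pvariance

def change_rate(p_keep, p_drop):
    """Assumed form of the worker ability change rate.

    p_keep: consistency parameters with the duplicate answer set retained.
    p_drop: consistency parameters of the remaining workers after deletion.
    """
    var_keep = pvariance(p_keep)   # first variance, Var(P)
    var_drop = pvariance(p_drop)   # second variance, Var(P^k)
    if var_keep == 0:              # guard against division by zero (assumption)
        return 0.0 if var_drop == 0 else float("inf")
    return abs(var_drop - var_keep) / var_keep

# Hypothetical case: three colluders agree with the result they dominate (P = 1),
# two honest workers disagree (P = 0); after deletion the honest workers agree.
rate = change_rate([1.0, 1.0, 1.0, 0.0, 0.0], [1.0, 1.0])
print(rate)
```

A large value means that removing the duplicate set changes how workers line up against the aggregated result, which the method treats as a sign of collusion.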
The crowdsourcing result aggregation device based on collusion detection provided by an embodiment of the present invention includes:
a collection module, configured to collect, from a crowdsourcing platform, the answer set that each worker returns for a set of tasks;
a consistency calculation module, configured to calculate the aggregated result of the answer set and to calculate a consistency parameter between the aggregated result and each worker's answers;
a worker ability change rate module, configured to determine the duplicate answer sets in the answer set and, based on each worker's consistency parameter, to calculate the worker ability change rate corresponding to each duplicate answer set;
a collusion detection module, configured to: for a duplicate answer set whose worker ability change rate is less than or equal to a preset threshold, determine that the duplicate answer set arose normally and retain it in the answer set; and for a duplicate answer set whose worker ability change rate is greater than the preset threshold, determine that the duplicate answer set was produced by collusion and delete it from the answer set;
an aggregation module, configured to obtain the updated answer set after each duplicate answer set has been retained or deleted, and to calculate the aggregated result of the updated answer set.
In an embodiment of the present invention, the consistency calculation module is specifically configured to calculate the consistency parameter between the aggregated result and each worker's answers by the following formula:
[formula image not reproduced]
where P_i is the consistency parameter between the aggregated result and worker i, L_i is the answer set that worker i returns for the set of tasks, and the formula's remaining term is the aggregated result of the answer set.
In an embodiment of the present invention, the worker ability change rate module includes:
a first variance calculation unit, configured to calculate a first variance of the workers' consistency parameters with the duplicate answer set retained in the answer set;
a second variance calculation unit, configured to calculate a second variance of the workers' consistency parameters with the duplicate answer set deleted from the answer set;
a worker ability change rate calculation unit, configured to calculate the worker ability change rate corresponding to the duplicate answer set based on the first variance and the second variance.
In an embodiment of the present invention, the first variance calculation unit is specifically configured to calculate, by the following formula, the first variance of the workers' consistency parameters with the duplicate answer set retained in the answer set:
[formula image not reproduced]
where Var(P) is the first variance, E(P) is the mean of the workers' consistency parameters, P_i is the consistency parameter between the aggregated result and worker i, and the formula's remaining term is the answer set.
In an embodiment of the present invention, the second variance calculation unit is specifically configured to calculate, by the following formula, the second variance of the workers' consistency parameters with the duplicate answer set deleted from the answer set:
[formula image not reproduced]
where Var(P^k) is the second variance, E(P^k) is the mean of the workers' consistency parameters, P_i^k is the consistency parameter between the aggregated result and worker i, and the formula's remaining term is the answer set after the duplicate answer set has been deleted.
In an embodiment of the present invention, the worker ability change rate calculation unit is specifically configured to calculate, by the following formula, the worker ability change rate corresponding to the duplicate answer set:
[formula image not reproduced]
where the quantity defined by the formula is the worker ability change rate.
The technical solution of the embodiments of the present invention has the following features:
(1) Unlike the scenarios of spatiotemporal crowdsourcing and social networks, on general-purpose platforms the features of answers to general tasks are unknown. The embodiments therefore introduce the concept of consistency between a worker's answers and the aggregated result to describe the effect that duplicate answers produced by collusion have on result aggregation.
(2) Unlike similarity-based collusion detection algorithms on e-commerce platforms, the embodiments propose a collusion detection method based on the change rate of worker performance, which can identify duplicate answers produced by collusion within an answer set that also contains normal duplicate answers.
(3) The embodiments propose a crowdsourcing result aggregation method with collusion detection that can effectively eliminate the negative effect of collusion on result aggregation.
Brief description of the drawings
Fig. 1 is a schematic diagram of the crowdsourcing framework based on collusion detection according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of the crowdsourcing result aggregation method based on collusion detection according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the crowdsourcing result aggregation device based on collusion detection according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the worker ability change rate module according to an embodiment of the present invention.
Detailed description of the embodiments
To provide a fuller understanding of the features and technical content of the embodiments of the present invention, their implementation is described in detail below with reference to the accompanying drawings. The drawings are for reference and illustration only and are not intended to limit the embodiments.
Existing collusion detection algorithms cannot effectively detect and eliminate the negative effect of collusion, mainly for the following reasons:
(1) Collusion detection algorithms for spatiotemporal crowdsourcing and social networks need to extract certain features of the data to detect collusion; in spatiotemporal crowdsourcing, for example, collusion is detected using the spatial and temporal features of the collected data. Such features are difficult to obtain on general-purpose crowdsourcing platforms.
(2) Detection algorithms on e-commerce platforms detect collusion mainly from the similarity between the answers given by each pair of workers. On general-purpose platforms, however, duplicate answers to a task fall into normal duplicates and colluded duplicates. On some simple tasks workers show high ability, and many of the duplicate answers arise normally. Detecting collusion from answer similarity then misjudges normally duplicated answers as answers produced by collusion.
(3) On auction platforms, bidders often collude to obtain high returns at low cost. Algorithms of this kind detect collusion mainly on the basis of game theory and are difficult to apply to general tasks on general-purpose platforms.
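Reason (2) in the list above can be illustrated with a small sketch. The `similarity` function, the 0.9 threshold, and the data are hypothetical and stand in for a generic pairwise-similarity detector rather than any specific published algorithm.

```python
from itertools import combinations

def similarity(a, b):
    """Fraction of shared tasks on which two workers gave the same label."""
    shared = a.keys() & b.keys()
    return sum(a[t] == b[t] for t in shared) / len(shared) if shared else 0.0

# Hypothetical easy tasks: three independent, accurate workers all answer correctly.
answers = {
    "w1": {"t1": "yes", "t2": "no", "t3": "yes"},
    "w2": {"t1": "yes", "t2": "no", "t3": "yes"},
    "w3": {"t1": "yes", "t2": "no", "t3": "yes"},
}

# A pure similarity rule flags every pair, although nobody colluded.
flagged = [
    (u, v)
    for u, v in combinations(answers, 2)
    if similarity(answers[u], answers[v]) >= 0.9   # threshold is an assumption
]
print(flagged)
```

Every pair is flagged because correct answers to easy tasks are necessarily identical, which is why similarity alone cannot separate normal duplicates from colluded ones.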
In summary, for general tasks on general-purpose platforms, existing algorithms cannot effectively detect and eliminate the harm that colluded duplicate answers do to result quality. To address this problem, the technical solution of the embodiments of the present invention proposes a crowdsourcing quality control method based on collusion detection.
Fig. 1 is a schematic diagram of the crowdsourcing framework based on collusion detection according to an embodiment of the present invention. As shown in Fig. 1, the framework includes the following steps:
(1) A requester publishes tasks to a crowdsourcing platform such as Mechanical Turk, and rewards workers according to the quality of their answers.
(2) The platform assigns the tasks to workers according to its scheduling strategy and the constraints specified by the user.
(3) In practice, some workers are not independent and may even cooperate outside the platform to handle crowdsourcing tasks. Workers may collude behind the scenes; for example, a worker may copy another worker's answers to the same crowdsourcing job through an online forum. After the tasks are processed, the answers are collected and some noisy answers are eliminated, for example answers that are obviously unrelated to the picture in an image-tagging task.
(4) This step covers collusion detection and result aggregation. After the answers returned by all workers have been collected, the embodiment uses a collusion detection mechanism to detect collusion and then filters out the duplicate answers produced by colluders. After filtering, the embodiment uses an aggregation method to infer the final result of each task and submits it to the requester.
The core of the framework is step (4), which applies the collusion detection method proposed by the embodiment and then a result inference method, so that high-quality results can be inferred even when collusion is present.
The collusion-detecting crowdsourcing framework proposed by the embodiment addresses the problem that existing result aggregation algorithms cannot effectively eliminate the harm that collusion does to result aggregation. Unlike in a general crowdsourcing framework, workers in the proposed framework are no longer assumed to be independent; they may communicate with one another or even collude. In addition, the result inference part of the framework includes a collusion detection process.
The technical solution of the embodiment as a whole comprises three major steps: collusion detection, result filtering, and result aggregation. These are described below.
Step 1: Collusion detection
(1) Calculate the consistency between the aggregated result and each worker's answers: when workers finish processing tasks, the answers they return are first collected. Assume that the workers return an answer set for the set of tasks, and consider one duplicate answer set within that answer set. The purpose of the embodiment is to judge whether the duplicate set was produced by collusion and, on this basis, to aggregate the answer set to obtain a high-quality result.
The answer set is aggregated by majority voting to obtain the aggregated result. The embodiment gives the following formula for the consistency between the aggregated result and worker i's answers:
[formula image not reproduced]
where L_i is the answer set that worker i returns for the set of tasks.
(2) Calculate the worker ability change rate for each duplicate answer set: for a duplicate answer set, the worker ability change rate mainly measures the effect of the duplicate answer set on the overall consistency between workers' answers and the aggregated result. The embodiment formalizes the worker ability change rate as the change in the variance of the overall consistency before and after the duplicate answer set is deleted. First, calculate the variance of the consistency of workers' answers with the duplicate answer set retained:
[formula image not reproduced]
Deleting the duplicate answer set yields a reduced answer set. Similarly, calculate the variance of the consistency of workers' answers with the duplicate answer set deleted:
[formula image not reproduced]
Then, from the two formulas above, the worker ability change rate is obtained:
[formula image not reproduced]
(3) Judge whether the duplicate answers were produced by collusion: when the worker ability change rate is less than or equal to the threshold Threshold, the duplicate set is considered to consist of normal duplicate answers. When it is greater than Threshold, the duplicate set is considered to consist of colluded duplicate answers.
In the above scheme, when calculating the variance of the consistency of workers' answers for a duplicate answer set, the aggregated result can also be obtained with other aggregation algorithms, such as probabilistic aggregation methods.
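Step 1 operates on duplicate answer sets but the text does not say how they are found. The sketch below assumes that a duplicate answer set is a group of two or more workers whose answers are identical on every task; the function name `duplicate_sets` and the data are hypothetical.

```python
from collections import defaultdict

def duplicate_sets(answers):
    """Group workers whose complete answer sets are identical.

    answers: {worker: {task: label}}. Returns a list of worker groups,
    keeping only groups with at least two members.
    """
    groups = defaultdict(list)
    for worker, labels in answers.items():
        key = tuple(sorted(labels.items()))   # hashable fingerprint of the answers
        groups[key].append(worker)
    return [workers for workers in groups.values() if len(workers) > 1]

answers = {
    "w1": {"t1": "a", "t2": "b"},
    "w2": {"t1": "a", "t2": "b"},   # same answers as w1
    "w3": {"t1": "a", "t2": "a"},
    "w4": {"t1": "b", "t2": "a"},
}
dups = duplicate_sets(answers)
print(dups)
```

Each group returned here would then be tested with the worker ability change rate to decide whether it is a normal or a colluded duplicate.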
Step 2: Result filtering
The above steps are repeated until every duplicate set in the answer set has been examined. Duplicate answers judged to be produced by collusion are deleted, and answers judged to be normal duplicates are retained.
Step 3: Result aggregation
An existing result aggregation algorithm is applied to the filtered answer set to produce the final result.
Fig. 2 is a schematic flowchart of the crowdsourcing result aggregation method based on collusion detection according to an embodiment of the present invention. As shown in Fig. 2, the method includes the following steps:
Step 201: Collect, from a crowdsourcing platform, the answer set that each worker returns for a set of tasks.
Step 202: Calculate the aggregated result of the answer set, and calculate a consistency parameter between the aggregated result and each worker's answers.
In an embodiment of the present invention, the consistency parameter between the aggregated result and each worker's answers is calculated by the following formula:
[formula image not reproduced]
where P_i is the consistency parameter between the aggregated result and worker i, L_i is the answer set that worker i returns for the set of tasks, and the formula's remaining term is the aggregated result of the answer set.
Step 203: Determine the duplicate answer sets in the answer set and, based on each worker's consistency parameter, calculate the worker ability change rate corresponding to each duplicate answer set.
In an embodiment of the present invention, calculating the worker ability change rate corresponding to each duplicate answer set based on each worker's consistency parameter includes:
calculating a first variance of the workers' consistency parameters with the duplicate answer set retained in the answer set;
calculating a second variance of the workers' consistency parameters with the duplicate answer set deleted from the answer set;
calculating the worker ability change rate corresponding to the duplicate answer set based on the first variance and the second variance.
The first variance of the workers' consistency parameters, with the duplicate answer set retained in the answer set, is calculated by the following formula:
[formula image not reproduced]
where Var(P) is the first variance, E(P) is the mean of the workers' consistency parameters, P_i is the consistency parameter between the aggregated result and worker i, and the formula's remaining term is the answer set.
The second variance of the workers' consistency parameters, with the duplicate answer set deleted from the answer set, is calculated by the following formula:
[formula image not reproduced]
where Var(P^k) is the second variance, E(P^k) is the mean of the workers' consistency parameters, P_i^k is the consistency parameter between the aggregated result and worker i, and the formula's remaining term is the answer set after the duplicate answer set has been deleted.
The worker ability change rate corresponding to the duplicate answer set is calculated by the following formula:
[formula image not reproduced]
where the quantity defined by the formula is the worker ability change rate.
Step 204: For a duplicate answer set whose worker ability change rate is less than or equal to a preset threshold, determine that the duplicate answer set arose normally and retain it in the answer set.
Step 205: For a duplicate answer set whose worker ability change rate is greater than the preset threshold, determine that the duplicate answer set was produced by collusion and delete it from the answer set.
Step 206: After each duplicate answer set has been retained or deleted, obtain the updated answer set and calculate the aggregated result of the updated answer set.
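Steps 201 to 206 can be sketched end to end as follows. This is a sketch under stated assumptions, not the patented implementation: aggregation is majority voting, the consistency parameter is the fraction of a worker's answers matching the aggregated result, the variance is the population variance, the change rate is |Var(P^k) - Var(P)| / Var(P), and duplicate sets are groups of workers with identical answers. All function names, the toy data, and the threshold of 0.75 (chosen by hand for this data) are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import pvariance

def aggregate(answers):
    """Majority vote per task over {worker: {task: label}}."""
    votes = defaultdict(list)
    for labels in answers.values():
        for task, label in labels.items():
            votes[task].append(label)
    return {t: Counter(v).most_common(1)[0][0] for t, v in votes.items()}

def consistency_variance(answers):
    """Variance of per-worker agreement with the aggregated result (assumed form)."""
    result = aggregate(answers)
    p = [sum(result[t] == l for t, l in labels.items()) / len(labels)
         for labels in answers.values()]
    return pvariance(p)

def duplicate_sets(answers):
    """Groups of two or more workers whose answer sets are identical."""
    groups = defaultdict(list)
    for worker, labels in answers.items():
        groups[tuple(sorted(labels.items()))].append(worker)
    return [ws for ws in groups.values() if len(ws) > 1]

def collusion_proof_aggregate(answers, threshold):
    """Delete duplicate sets whose change rate exceeds the threshold, then aggregate."""
    var_keep = consistency_variance(answers)             # first variance (step 203)
    kept = dict(answers)
    for group in duplicate_sets(answers):
        without = {w: a for w, a in answers.items() if w not in group}
        if not without:                                  # nothing left to compare
            continue
        var_drop = consistency_variance(without)         # second variance (step 203)
        if var_keep == 0:
            rate = 0.0 if var_drop == 0 else float("inf")
        else:
            rate = abs(var_drop - var_keep) / var_keep   # assumed form of the rate
        if rate > threshold:                             # step 205: treat as collusion
            for w in group:
                kept.pop(w, None)
    return aggregate(kept)                               # step 206

# Toy data: true labels are a, b, a, b. h1 and h2 are honest and happen to agree;
# c1..c4 share one copied answer set that is wrong on t2, t3, and t4.
answers = {
    "h1": {"t1": "a", "t2": "b", "t3": "a", "t4": "b"},
    "h2": {"t1": "a", "t2": "b", "t3": "a", "t4": "b"},
    "h3": {"t1": "a", "t2": "b", "t3": "a", "t4": "a"},
}
copied = {"t1": "a", "t2": "a", "t3": "b", "t4": "a"}
answers.update({f"c{i}": dict(copied) for i in range(1, 5)})

plain = aggregate(answers)                                   # dominated by the copies
filtered = collusion_proof_aggregate(answers, threshold=0.75)
print(plain, filtered)
```

On this data the honest pair yields a change rate of about 0.65 and the colluding group about 0.88, so only the colluders are removed. The gap is narrow, which suggests that the threshold would need tuning on real data.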
The collusion detection method proposed by the embodiments of the present invention can accurately detect collusion groups from the results that workers provide. The worker ability change rate is formalized as the change in the variance of the consistency between workers' answers and the result before and after a duplicate answer set is deleted, and collusion is detected from the magnitude of this change rate. The embodiments' approach of deleting colluded results before aggregating can greatly improve the accuracy of the aggregated result. Unlike existing aggregation algorithms, the proposed result aggregation method includes detection of collusion and can effectively eliminate its negative effect on result aggregation, improving result quality.
Fig. 3 is a schematic structural diagram of the crowdsourcing result aggregation device based on collusion detection according to an embodiment of the present invention. As shown in Fig. 3, the device includes:
a collection module 301, configured to collect, from a crowdsourcing platform, the answer set that each worker returns for a set of tasks;
a consistency calculation module 302, configured to calculate the aggregated result of the answer set and to calculate a consistency parameter between the aggregated result and each worker's answers;
a worker ability change rate module 303, configured to determine the duplicate answer sets in the answer set and, based on each worker's consistency parameter, to calculate the worker ability change rate corresponding to each duplicate answer set;
a collusion detection module 304, configured to: for a duplicate answer set whose worker ability change rate is less than or equal to a preset threshold, determine that the duplicate answer set arose normally and retain it in the answer set; and for a duplicate answer set whose worker ability change rate is greater than the preset threshold, determine that the duplicate answer set was produced by collusion and delete it from the answer set;
an aggregation module 305, configured to obtain the updated answer set after each duplicate answer set has been retained or deleted, and to calculate the aggregated result of the updated answer set.
In an embodiment of the present invention, the consistency calculation module 302 is specifically configured to calculate the consistency parameter between the aggregated result and each worker's answers by the following formula:
[formula image not reproduced]
where P_i is the consistency parameter between the aggregated result and worker i, L_i is the answer set that worker i returns for the set of tasks, and the formula's remaining term is the aggregated result of the answer set.
In an embodiment of the present invention, as shown in Fig. 4, the worker ability change rate module 303 includes:
a first variance calculation unit 3031, configured to calculate a first variance of the workers' consistency parameters with the duplicate answer set retained in the answer set;
a second variance calculation unit 3032, configured to calculate a second variance of the workers' consistency parameters with the duplicate answer set deleted from the answer set;
a worker ability change rate calculation unit 3033, configured to calculate the worker ability change rate corresponding to the duplicate answer set based on the first variance and the second variance.
In an embodiment of the present invention, the first variance calculation unit 3031 is specifically configured to calculate, by the following formula, the first variance of the workers' consistency parameters with the duplicate answer set retained in the answer set:
[formula image not reproduced]
where Var(P) is the first variance, E(P) is the mean of the workers' consistency parameters, P_i is the consistency parameter between the aggregated result and worker i, and the formula's remaining term is the answer set.
In an embodiment of the present invention, the second variance calculation unit 3032 is specifically configured to calculate, by the following formula, the second variance of the workers' consistency parameters with the duplicate answer set deleted from the answer set:
[formula image not reproduced]
where Var(P^k) is the second variance, E(P^k) is the mean of the workers' consistency parameters, P_i^k is the consistency parameter between the aggregated result and worker i, and the formula's remaining term is the answer set after the duplicate answer set has been deleted.
In an embodiment of the present invention, the worker ability change rate calculation unit 3033 is specifically configured to calculate, by the following formula, the worker ability change rate corresponding to the duplicate answer set:
[formula image not reproduced]
where the quantity defined by the formula is the worker ability change rate.
It will be appreciated by those skilled in the art that shown in Fig. 3 based on collusion detection mass-rent result converging device in it is each Module realizes that function can refer to the associated description of the foregoing mass-rent result assemblage method based on collusion detection and understand, Fig. 3 institutes Each module in the mass-rent result converging device based on collusion detection shown realizes that function can be by running on processor Program and realize, can also be realized by specific logic circuit.
The technical solutions described in the embodiments of the present invention may be combined in any manner, provided that they do not conflict.
In the several embodiments provided by the present invention, it should be understood that the disclosed method and smart device may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may all be integrated into one processing unit, each unit may serve separately as one unit, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as a removable storage device, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.
Alternatively, if the above device of the embodiments of the present invention is implemented in the form of software functional modules and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a removable storage device, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.
The above is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention.

Claims (12)

1. A crowdsourcing result aggregation method based on collusion detection, characterized in that the method comprises:
collecting, from a crowdsourcing platform, the answer set of each worker for a task set;
calculating the aggregation result of the answer set, and calculating a consistency parameter between the aggregation result and each worker's answers;
determining repeated answer sets from the answer set, and calculating, based on the consistency parameters of the workers' answers, the worker ability change rate corresponding to each repeated answer set;
for a repeated answer set whose worker ability change rate is less than or equal to a preset threshold, determining that the repeated answer set was produced normally and retaining it in the answer set;
for a repeated answer set whose worker ability change rate is greater than the preset threshold, determining that the repeated answer set was produced by collusion and deleting it from the answer set;
after each repeated answer set has been retained or deleted, obtaining an updated answer set, and calculating the aggregation result of the updated answer set.
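The retain-or-delete steps of claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: a repeated answer set is assumed to be a group of workers who returned identical answers, deleting it is assumed to remove every worker in that group, and the function names and the `rate_fn` callback (which stands in for the worker ability change rate calculation) are hypothetical.

```python
from collections import defaultdict


def find_repeated_answer_sets(answers):
    """Group workers whose answer lists are identical.

    answers: {worker_id: [label for each task]}
    Returns a list of worker-id groups, each with at least two members.
    """
    groups = defaultdict(list)
    for worker, labels in answers.items():
        groups[tuple(labels)].append(worker)
    return [workers for workers in groups.values() if len(workers) > 1]


def filter_collusion(answers, rate_fn, threshold):
    """Delete repeated answer sets whose change rate exceeds the threshold.

    rate_fn(answers, group) -> float is a hypothetical callback that returns
    the worker ability change rate for one repeated answer set.
    Returns (updated_answers, removed_groups).
    """
    updated = dict(answers)
    removed = []
    for group in find_repeated_answer_sets(answers):
        if rate_fn(answers, group) > threshold:
            # Treated as collusion: drop every worker in the group.
            for worker in group:
                updated.pop(worker, None)
            removed.append(group)
        # Otherwise the set is treated as normal and left in place.
    return updated, removed
```

Each rate is evaluated against the original answer set rather than a progressively filtered one; the source does not say which is intended. The aggregation result would then be recomputed on the returned answer set.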
2. The crowdsourcing result aggregation method based on collusion detection according to claim 1, characterized in that calculating the consistency parameter between the aggregation result and each worker's answers comprises:
calculating the consistency parameter between the aggregation result and each worker's answers based on the following formula:
where P_i is the consistency parameter between the aggregation result and worker i, computed from L_i, the answers returned by worker i for the task set, and from the aggregation result of the answer set.
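Because the formula is not reproduced here, the following sketch shows one plausible form of the consistency parameter: the fraction of worker i's answers that match the aggregation result, with majority voting assumed as the aggregation step. Both choices and the function names are assumptions made for illustration.

```python
from collections import Counter


def majority_vote(answers):
    """Aggregate by majority vote (an assumed aggregation rule).

    answers: {worker_id: [label for each task]}, all lists the same length.
    Returns the most common label for each task.
    """
    columns = zip(*answers.values())
    return [Counter(column).most_common(1)[0][0] for column in columns]


def consistency(answers, aggregated):
    """P_i: share of worker i's labels that agree with the aggregation result."""
    return {
        worker: sum(a == g for a, g in zip(labels, aggregated)) / len(aggregated)
        for worker, labels in answers.items()
    }
```

With three workers answering three tasks, a worker who agrees with the majority on every task gets a consistency parameter of 1.0, and a worker who disagrees on one task gets 2/3.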
3. The crowdsourcing result aggregation method based on collusion detection according to claim 2, characterized in that calculating, based on the consistency parameters of the workers' answers, the worker ability change rate corresponding to each repeated answer set comprises:
calculating the first variance of the consistency parameters of the workers' answers when the repeated answer set is retained in the answer set;
calculating the second variance of the consistency parameters of the workers' answers when the repeated answer set is deleted from the answer set;
calculating, based on the first variance and the second variance, the worker ability change rate corresponding to the repeated answer set.
4. The crowdsourcing result aggregation method based on collusion detection according to claim 3, characterized in that calculating the first variance of the consistency parameters of the workers' answers when the repeated answer set is retained in the answer set comprises:
calculating, by the following formula, the first variance of the consistency parameters of the workers' answers when the repeated answer set is retained in the answer set:
where Var(P) is the first variance, E(P) is the mean of the workers' consistency parameters, and P_i is the consistency parameter between the aggregation result and worker i, computed over the answer set.
5. The crowdsourcing result aggregation method based on collusion detection according to claim 3 or 4, characterized in that calculating the second variance of the consistency parameters of the workers' answers when the repeated answer set is deleted from the answer set comprises:
calculating, by the following formula, the second variance of the consistency parameters of the workers' answers when the repeated answer set is deleted from the answer set:
where Var(P^k) is the second variance, E(P^k) is the mean of the workers' consistency parameters, and P_i^k is the consistency parameter between the aggregation result and worker i, computed over the answer set after the repeated answer set is deleted.
6. The crowdsourcing result aggregation method based on collusion detection according to claim 5, characterized in that calculating, based on the first variance and the second variance, the worker ability change rate corresponding to the repeated answer set comprises:
calculating, based on the following formula, the worker ability change rate corresponding to the repeated answer set:
where the value given by the formula is the worker ability change rate.
7. A crowdsourcing result aggregation device based on collusion detection, characterized in that the device comprises:
a collection module, configured to collect, from a crowdsourcing platform, the answer set of each worker for a task set;
a consistency computing module, configured to calculate the aggregation result of the answer set, and to calculate a consistency parameter between the aggregation result and each worker's answers;
a worker ability change rate module, configured to determine repeated answer sets from the answer set, and to calculate, based on the consistency parameters of the workers' answers, the worker ability change rate corresponding to each repeated answer set;
a collusion detection module, configured to: for a repeated answer set whose worker ability change rate is less than or equal to a preset threshold, determine that the repeated answer set was produced normally and retain it in the answer set; and for a repeated answer set whose worker ability change rate is greater than the preset threshold, determine that the repeated answer set was produced by collusion and delete it from the answer set;
an aggregation module, configured to obtain an updated answer set after each repeated answer set has been retained or deleted, and to calculate the aggregation result of the updated answer set.
8. The crowdsourcing result aggregation device based on collusion detection according to claim 7, characterized in that the consistency computing module is specifically configured to calculate the consistency parameter between the aggregation result and each worker's answers based on the following formula:
where P_i is the consistency parameter between the aggregation result and worker i, computed from L_i, the answers returned by worker i for the task set, and from the aggregation result of the answer set.
9. The crowdsourcing result aggregation device based on collusion detection according to claim 7, characterized in that the worker ability change rate module comprises:
a first variance computing unit, configured to calculate the first variance of the consistency parameters of the workers' answers when the repeated answer set is retained in the answer set;
a second variance computing unit, configured to calculate the second variance of the consistency parameters of the workers' answers when the repeated answer set is deleted from the answer set;
a worker ability change rate computing unit, configured to calculate, based on the first variance and the second variance, the worker ability change rate corresponding to the repeated answer set.
10. The crowdsourcing result aggregation device based on collusion detection according to claim 9, characterized in that the first variance computing unit is specifically configured to calculate, by the following formula, the first variance of the consistency parameters of the workers' answers when the repeated answer set is retained in the answer set:
where Var(P) is the first variance, E(P) is the mean of the workers' consistency parameters, and P_i is the consistency parameter between the aggregation result and worker i, computed over the answer set.
11. The crowdsourcing result aggregation device based on collusion detection according to claim 9 or 10, characterized in that the second variance computing unit is specifically configured to calculate, by the following formula, the second variance of the consistency parameters of the workers' answers when the repeated answer set is deleted from the answer set:
where Var(P^k) is the second variance, E(P^k) is the mean of the workers' consistency parameters, and P_i^k is the consistency parameter between the aggregation result and worker i, computed over the answer set after the repeated answer set is deleted.
12. The crowdsourcing result aggregation device based on collusion detection according to claim 11, characterized in that the worker ability change rate computing unit is specifically configured to calculate, based on the following formula, the worker ability change rate corresponding to the repeated answer set:
where the value given by the formula is the worker ability change rate.
CN201711003779.2A 2017-10-24 2017-10-24 Crowdsourcing result aggregation method and device based on collusion detection Active CN107767055B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711003779.2A CN107767055B (en) 2017-10-24 2017-10-24 Crowdsourcing result aggregation method and device based on collusion detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711003779.2A CN107767055B (en) 2017-10-24 2017-10-24 Crowdsourcing result aggregation method and device based on collusion detection

Publications (2)

Publication Number Publication Date
CN107767055A true CN107767055A (en) 2018-03-06
CN107767055B CN107767055B (en) 2021-07-23

Family

ID=61270213

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711003779.2A Active CN107767055B (en) 2017-10-24 2017-10-24 Crowdsourcing result aggregation method and device based on collusion detection

Country Status (1)

Country Link
CN (1) CN107767055B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104133769A (en) * 2014-08-02 2014-11-05 哈尔滨理工大学 Crowdsourcing fraud detection method based on psychological behavior analysis
CN104599084A (en) * 2015-02-12 2015-05-06 北京航空航天大学 Crowd calculation quality control method and device
US20170187751A1 (en) * 2015-12-29 2017-06-29 International Business Machines Corporation Propagating fraud awareness to hosted applications
CN107273492A (en) * 2017-06-15 2017-10-20 复旦大学 A kind of exchange method based on mass-rent platform processes image labeling task

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ANTONIO FERNÁNDEZ ANTA 等: ""Algorithmic Mechanisms for Reliable Crowdsourcing Computation under Collusion"", 《PLOS ONE》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11386299B2 (en) 2018-11-16 2022-07-12 Yandex Europe Ag Method of completing a task
CN109978333A (en) * 2019-02-26 2019-07-05 湖南大学 Based on community discovery and the independent worker's selection method for linking prediction in crowdsourcing system
US11727336B2 (en) 2019-04-15 2023-08-15 Yandex Europe Ag Method and system for determining result for task executed in crowd-sourced environment
US11416773B2 (en) 2019-05-27 2022-08-16 Yandex Europe Ag Method and system for determining result for task executed in crowd-sourced environment
US11475387B2 (en) 2019-09-09 2022-10-18 Yandex Europe Ag Method and system for determining productivity rate of user in computer-implemented crowd-sourced environment
US11481650B2 (en) 2019-11-05 2022-10-25 Yandex Europe Ag Method and system for selecting label from plurality of labels for task in crowd-sourced environment
CN110930114A (en) * 2019-11-20 2020-03-27 北京航空航天大学 Crowdsourcing method for resisting collusion
CN110930114B (en) * 2019-11-20 2022-08-23 北京航空航天大学 Crowdsourcing method for resisting collusion
CN111292062A (en) * 2020-02-10 2020-06-16 中南大学 Crowdsourcing garbage worker detection method and system based on network embedding and storage medium
US11727329B2 (en) 2020-02-14 2023-08-15 Yandex Europe Ag Method and system for receiving label for digital task executed within crowd-sourced environment

Also Published As

Publication number Publication date
CN107767055B (en) 2021-07-23

Similar Documents

Publication Publication Date Title
CN107767055A (en) A kind of mass-rent result assemblage method and device based on collusion detection
CN108295476B (en) Method and device for determining abnormal interaction account
CN103198161B (en) Microblog water army recognition methods and equipment
CN104778173B (en) Target user determination method, device and equipment
CN110413707A (en) The excavation of clique's relationship is cheated in internet and checks method and its system
CN108197532A (en) The method, apparatus and computer installation of recognition of face
US20170154267A1 (en) Discovering signature of electronic social networks
CN110309840A (en) Risk trade recognition methods, device, server and storage medium
CN108229555A (en) Sample weights distribution method, model training method, electronic equipment and storage medium
CN107767021A (en) A kind of risk control method and equipment
CN107644279A (en) The modeling method and device of evaluation model
CN108427708A (en) Data processing method, device, storage medium and electronic device
CN109586952A (en) Method of server expansion, device
CN109872232A (en) It is related to illicit gain to legalize account-classification method, device, computer equipment and the storage medium of behavior
CN109600336A (en) Store equipment, identifying code application method and device
CN108304853A (en) Acquisition methods, device, storage medium and the electronic device for the degree of correlation of playing
CN107332931A (en) The recognition methods of waterborne troops of machine type forum and device
CN110532399A (en) Knowledge mapping update method, system and the device of object game question answering system
CN108961267A (en) Image processing method, picture processing unit and terminal device
CN109670933A (en) Identify method, user equipment, storage medium and the device of user role
CN109272402A (en) Modeling method, device, computer equipment and the storage medium of scorecard
CN110472050A (en) A kind of clique's clustering method and device
CN107392614A (en) The implementation method and device of off-line transaction
CN108647714A (en) Acquisition methods, terminal device and the medium of negative label weight
CN108734186A (en) Automatically exit from the methods, devices and systems of instant communication session group

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant