CN108197202A - Data verification method, apparatus, server and storage medium for crowdsourcing tasks - Google Patents

Publication number
CN108197202A
Authority
CN
China
Prior art keywords
answer
topic
crowdsourcing task
user
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711457649.6A
Other languages
Chinese (zh)
Other versions
CN108197202B (en
Inventor
黄翠萍
柯海帆
李亚丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201711457649.6A
Publication of CN108197202A
Application granted
Publication of CN108197202B
Status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures

Abstract

An embodiment of the invention discloses a data verification method, apparatus, server and storage medium for crowdsourcing tasks. The method includes: distributing the same crowdsourcing task to multiple users for data collection; obtaining the answers of the multiple users; and verifying the answers according to the proportion of identical answers among them to determine the final result of the crowdsourcing task. Because the same task is distributed to multiple users, several sets of answers are obtained and can be checked against one another to decide whether an answer is correct and to produce the final result of the task. This improves the accuracy and efficiency of data verification, reduces the cost of manual review, and addresses the low efficiency and limited accuracy of existing data verification processes.

Description

Data verification method, apparatus, server and storage medium for crowdsourcing tasks
Technical field
Embodiments of the present invention relate to data verification technology, and in particular to a data verification method, apparatus, server and storage medium for crowdsourcing tasks.
Background
With the continuing development of the Internet, collecting data through a field-work crowdsourcing model is receiving increasing attention. Data collection involves content extraction and data review (also called data verification, whose purpose is to confirm whether the data are correct). Because the data volume is very large, reviewing data manually takes a long time, requires considerable manpower and is costly; staff turnover is also high, which easily leads to a large backlog of data.
When machines are used for content extraction and data review, the scope of what can be extracted and reviewed is limited. For example, for a point of interest (POI) attribute such as whether a place shown in a picture is open to traffic, a machine cannot extract the information from the picture and therefore cannot review whether the data are correct.
Summary of the invention
Embodiments of the present invention provide a data verification method, apparatus, server and storage medium for crowdsourcing tasks, so as to improve the efficiency and accuracy of data verification and reduce the cost of manual review.
In a first aspect, an embodiment of the present invention provides a data verification method for a crowdsourcing task, including:
distributing the same crowdsourcing task to multiple users for data collection;
obtaining the answers of the multiple users; and
verifying the answers according to the proportion of identical answers among the answers of the multiple users, and determining the final result of the crowdsourcing task.
In a second aspect, an embodiment of the present invention further provides a data verification apparatus for a crowdsourcing task, including:
a task allocation module, configured to distribute the same crowdsourcing task to multiple users for data collection;
an answer acquisition module, configured to obtain the answers of the multiple users; and
an answer verification module, configured to verify the answers according to the proportion of identical answers among the answers of the multiple users, and to determine the final result of the crowdsourcing task.
In a third aspect, an embodiment of the present invention further provides a server, including:
one or more processors; and
a memory for storing one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the data verification method for a crowdsourcing task according to any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the data verification method for a crowdsourcing task according to any embodiment of the present invention.
In the technical solution of the embodiments of the present invention, the same crowdsourcing task is distributed to multiple users and several sets of answers are obtained; the answers are checked against one another (self-verification) to decide whether an answer is correct and to obtain the final result of the crowdsourcing task. This improves the accuracy and efficiency of data verification, reduces the cost of manual review, and addresses the low efficiency and limited accuracy of existing data verification processes. In addition, preset hidden checkpoint questions (literally "dark stake" questions) are used to check the validity of each user's answers, and the answers of untrusted users are rejected, which improves data quality and further improves the accuracy of the final result of the crowdsourcing task.
Brief description of the drawings
Fig. 1 is a flowchart of the data verification method for a crowdsourcing task provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of the data verification method for a crowdsourcing task provided by Embodiment 2 of the present invention;
Fig. 3 is a detailed flowchart of the data verification method for a crowdsourcing task provided by Embodiment 2 of the present invention;
Fig. 4 is a structural diagram of the data verification apparatus for a crowdsourcing task provided by Embodiment 3 of the present invention;
Fig. 5 is a structural diagram of the server provided by Embodiment 4 of the present invention.
Detailed description
The present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the invention rather than the entire structure.
Embodiment 1
Fig. 1 is a flowchart of the data verification method for a crowdsourcing task provided by Embodiment 1 of the present invention. This embodiment is applicable to verifying the data collected by a crowdsourcing task. The method can be executed by a data verification apparatus for crowdsourcing tasks, which can be implemented in software and/or hardware and is generally integrated in a server. As shown in Fig. 1, the method specifically includes:
S110: Distribute the same crowdsourcing task to multiple users for data collection.
The number of users to whom the same crowdsourcing task is distributed can be set as required. A crowdsourcing task generally contains multiple questions. Performing data collection means that the user works on each question in the task and provides an answer; the answers are the data collected by the crowdsourcing task. When distributing a crowdsourcing task, the task may be actively assigned according to information such as user ID, IP address and user history, so that it is pushed to eligible users; alternatively, the task may be published on a crowdsourcing platform for users to claim. For example, a question in a crowdsourcing task may ask the user to extract attribute information such as a telephone number, address or time from a picture.
S120: Obtain the answers of the multiple users.
Obtaining the answers of multiple users means receiving the answers that users upload; each user uploads one set of answers.
S130: Verify the answers according to the proportion of identical answers among the answers of the multiple users, and determine the final result of the crowdsourcing task.
For the same crowdsourcing task, the several sets of answers provided by multiple users are obtained and checked against one another (self-verification, also called cross-checking) to determine the final result of the task. The self-verification process requires no manual participation, which reduces labour cost. For example, if five sets of answers are needed for cross-checking, the crowdsourcing task can be distributed to five users.
In the technical solution of this embodiment, the same crowdsourcing task is distributed to multiple users and several sets of answers are obtained; the answers are checked against one another to decide whether an answer is correct and to obtain the final result of the crowdsourcing task. This improves the accuracy and efficiency of data verification, reduces the cost of manual review, and addresses the low efficiency and limited accuracy of existing data verification processes.
On the basis of the above technical solution, S130 may include: for each question in the crowdsourcing task, determining the identical-answer ratio of the question from the answers of the multiple users; if the identical-answer ratio exceeds a preset threshold, taking the identical answer as the final result of the question; and if the identical-answer ratio does not exceed the preset threshold, submitting the multiple users' answers to the question for manual verification.
A crowdsourcing task generally contains multiple questions, and the several sets of answers are cross-checked for each question to determine the final result of every question in the task. Cross-checking means comparing the sets of answers and taking the identical answer that meets the proportion requirement as the final result. The preset threshold can be configured according to actual needs. For example, with a threshold of 50%, an identical answer given by more than 50% of the users is taken as the final result of the question.
If the identical-answer ratio of a question does not exceed the preset threshold, the answers to that question are verified manually. Manual verification means that, when cross-checking cannot produce an answer, a person compares the several answers to the same question to determine its final result; it can be understood as an auxiliary checking measure. Note that in the embodiments of the present invention most data can be reviewed automatically through cross-checking, so the probability that manual assistance is needed is relatively small. Adding manual verification on top of cross-checking further ensures the completeness and accuracy of data verification.
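The per-question cross-check described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name cross_check, its return shape, and the use of None to signal "send to manual verification" are assumptions made for the example.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence, Tuple


def cross_check(answers: Sequence[Hashable],
                threshold: float = 0.5) -> Tuple[Optional[Hashable], float]:
    """Cross-check the answers several users gave to ONE question.

    Returns (final_answer, ratio). final_answer is None when the most
    common answer does not exceed the threshold, which in the text means
    the question is handed over to manual verification.
    (Hypothetical helper; names are illustrative only.)
    """
    if not answers:
        return None, 0.0
    # Most frequent answer and how many users gave it.
    answer, count = Counter(answers).most_common(1)[0]
    ratio = count / len(answers)
    if ratio > threshold:  # "exceeds the preset threshold"
        return answer, ratio
    return None, ratio


# Three of five users agree, so the ratio 0.6 exceeds the 50% threshold.
final, ratio = cross_check(["A", "A", "B", "A", "C"])
```

With exactly half of the users agreeing, the ratio equals the threshold but does not exceed it, so the sketch falls back to manual verification, matching the "does not exceed" wording of the text.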
Optionally, the question types in the crowdsourcing task include at least one of: multiple-choice questions, true/false questions, short-answer questions and fill-in-the-blank questions.
To reduce both the difficulty of the crowdsourcing task and the difficulty of data verification, multiple-choice and true/false questions can be used as the main question types, so that users' answers follow a very simple processing specification and standard. For question types such as short-answer and fill-in-the-blank questions, comparing the answers of different users requires extracting the keywords in each answer and judging the similarity of the keywords to decide whether the answers are identical. The required similarity can be configured per scenario. For example, when users fill in a description of a picture, their wording will differ, so answers may be considered identical when the similarity exceeds 90%. When an address or telephone number is extracted from a picture, a specific house number or telephone number is involved, so answers are considered identical only when the similarity is 100%.
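The text does not say how keywords are extracted or how their similarity is measured. The sketch below is one possible reading under stated assumptions: keywords are lower-cased word tokens, and similarity is the Jaccard overlap of the two keyword sets. The function names and the use of >= for the comparison (so that a 100% requirement can be met) are choices made for the example, not part of the source.

```python
import re
from typing import Set


def keywords(text: str) -> Set[str]:
    # Assumption: a "keyword" is any lower-cased run of word characters.
    return set(re.findall(r"\w+", text.lower()))


def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the keyword sets (an assumed metric)."""
    ka, kb = keywords(a), keywords(b)
    if not ka and not kb:
        return 1.0  # two empty answers are treated as identical
    return len(ka & kb) / len(ka | kb)


def same_answer(a: str, b: str, min_similarity: float) -> bool:
    # min_similarity is scenario-specific: e.g. 0.9 for free-text picture
    # descriptions, 1.0 for addresses or telephone numbers.
    return similarity(a, b) >= min_similarity


# Telephone numbers must match completely to count as the same answer.
phones_match = same_answer("010-1234 5678", "010-1234 5679", 1.0)
```

A real system would probably need language-specific tokenisation (for example Chinese word segmentation) before a keyword comparison of this kind is meaningful.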
Embodiment 2
Fig. 2 is a flowchart of the data verification method for a crowdsourcing task provided by Embodiment 2 of the present invention. On the basis of the above embodiments, this embodiment adds a checkpoint verification operation to S130, so as to remove the answers of untrusted users, improve data quality, and perform verification using only the answers of trusted users. Checkpoint verification means using the preset hidden checkpoint questions (literally "dark stake" questions) in the crowdsourcing task to judge whether a user is trusted, i.e. whether the user's answers are valid. As shown in Fig. 2, the method specifically includes:
S210: Distribute the same crowdsourcing task to multiple users for data collection.
S220: Obtain the answers of the multiple users.
S230: Determine the trusted users among the multiple users according to the preset hidden checkpoint questions in the crowdsourcing task.
The preset hidden checkpoint questions are used to verify whether a user is trusted, that is, whether the answers the user provides are valid or fabricated. Preset hidden checkpoint questions can be reused and applied in multiple crowdsourcing tasks. Before the same crowdsourcing task is distributed to multiple users for data collection, preset hidden checkpoint questions can be set in the task; they comprise a preset number of questions that have standard answers. Hidden checkpoint questions are generally questions on which all users share the same understanding and which have a definite answer, and they generally use simple, easily compared question types such as multiple-choice questions. The number of hidden checkpoint questions can be configured as appropriate, for example five.
In one embodiment, whether a user is trusted can be determined by judging whether the user's answers to the hidden checkpoint questions are identical to the standard answers. Specifically, S230 includes: for each of the multiple users, extracting the answers to the preset hidden checkpoint questions from the user's answers; and if all the extracted answers are identical to the standard answers of the preset hidden checkpoint questions, determining that the user is a trusted user. If at least one extracted answer differs from the standard answer of the corresponding hidden checkpoint question, the user is determined not to be a trusted user. In other words, the user may be fabricating data, the user's answers to all questions in the crowdsourcing task are untrusted, and the subsequent cross-check cannot use that user's answers.
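The answer-based trust check of S230 can be sketched as below. It assumes, purely for illustration, that each user's submission is a dictionary mapping question IDs to answers and that the standard answers of the hidden checkpoint questions are stored the same way; the name is_trusted is hypothetical.

```python
from typing import Dict


def is_trusted(user_answers: Dict[str, str],
               standard_answers: Dict[str, str]) -> bool:
    """A user is trusted only if EVERY hidden checkpoint answer matches.

    user_answers     : question id -> answer, for the whole task
    standard_answers : checkpoint question id -> standard answer
    (Both layouts are assumptions made for this sketch.)
    """
    return all(
        # A missing checkpoint answer yields None and counts as a mismatch.
        user_answers.get(question_id) == standard
        for question_id, standard in standard_answers.items()
    )


standards = {"c1": "A", "c2": "B"}
honest = {"q1": "12 Main St", "c1": "A", "c2": "B"}
careless = {"q1": "12 Main St", "c1": "A", "c2": "C"}
trusted_flags = [is_trusted(honest, standards), is_trusted(careless, standards)]
```

A single wrong checkpoint answer is enough to discard all of that user's answers, which is the strict rule the text describes.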
Note that, besides using the correctness of the answers to the preset hidden checkpoint questions to judge whether the answers a user provides are trustworthy, the time the user spends on the preset hidden checkpoint questions can also be considered. Specifically, the user's actual working time on a hidden checkpoint question is compared with the preset working time of that question; if the actual working time on at least one hidden checkpoint question is far below the preset working time (for example, less than half of it), the user is considered untrusted. The preset working time can be derived from how long most users need to work on the question. For example, if the preset working time of a hidden checkpoint question is 20 seconds and a user submits the answer after only 5 seconds, the user is considered untrusted. The standard answers and the preset working times of the hidden checkpoint questions can of course also be combined: a user is considered trusted if the user's answers to the hidden checkpoint questions match the standard answers and the actual working times are consistent with the preset working times.
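The working-time check can be sketched in a few lines. The "less than half" rule comes from the example in the text; the function name, the dictionary layout, the min_fraction parameter and the decision to treat a missing timing as untrusted are assumptions.

```python
from typing import Dict


def passes_time_check(actual_seconds: Dict[str, float],
                      preset_seconds: Dict[str, float],
                      min_fraction: float = 0.5) -> bool:
    """Return False if any hidden checkpoint question was answered too fast.

    actual_seconds : checkpoint question id -> time the user actually spent
    preset_seconds : checkpoint question id -> preset working time
    min_fraction   : below this fraction of the preset time, the user is
                     considered untrusted (0.5 mirrors "less than half").
    """
    for question_id, preset in preset_seconds.items():
        # Assumption: a missing timing is treated as 0 s, i.e. untrusted.
        actual = actual_seconds.get(question_id, 0.0)
        if actual < min_fraction * preset:
            return False
    return True


# The example from the text: 20 s preset, answered in 5 s -> untrusted.
too_fast = passes_time_check({"c1": 5.0}, {"c1": 20.0})
```

To combine both criteria as the text suggests, a caller would require the answer check and this time check to pass together.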
S240: If the number of trusted users has not reached the preset number, distribute the crowdsourcing task again.
The preset number is the number of users, set in advance, who take part in the cross-check. In S210 the crowdsourcing task may be distributed to exactly that number of users, or to more users. For example, if five sets of answers are needed for cross-checking, the crowdsourcing task can be distributed to five users, or to seven users in order to reduce the impact of untrusted users on the cross-checking flow.
If the number of trusted users has not reached the preset number, cross-checking cannot start, and the crowdsourcing task needs to be distributed again so as to obtain other users' answers to the task, which then also undergo checkpoint verification.
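The redistribution loop of S240 can be sketched as follows. The callables assign and is_trusted stand in for the platform's task distribution and the checkpoint verification; both, along with the max_rounds safety limit, are hypothetical and not described in the source.

```python
from typing import Callable, Dict, List

Submission = Dict[str, str]  # question id -> answer (assumed layout)


def collect_trusted(assign: Callable[[int], List[Submission]],
                    is_trusted: Callable[[Submission], bool],
                    required: int,
                    max_rounds: int = 10) -> List[Submission]:
    """Keep distributing the task until enough trusted submissions exist.

    assign(n)  : hypothetical callable that distributes the task to n more
                 users and returns their submissions
    is_trusted : hypothetical checkpoint verification for one submission
    required   : preset number of trusted users needed for cross-checking
    """
    trusted: List[Submission] = []
    for _ in range(max_rounds):
        missing = required - len(trusted)
        if missing <= 0:
            break  # enough trusted users: cross-checking can start
        for submission in assign(missing):
            if is_trusted(submission):
                trusted.append(submission)
    return trusted[:required]
```

Requesting only the missing number of submissions per round is one design choice; distributing to more users up front, as in the seven-user example above, trades extra work for fewer rounds.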
S250: If the number of trusted users reaches the preset number, verify the answers according to the proportion of identical answers among the answers of the trusted users, and determine the final result of the crowdsourcing task. The answer verification process is as described in Embodiment 1 and is not repeated here.
In the technical solution of this embodiment, the same crowdsourcing task is distributed to multiple users and several sets of answers are obtained; preset hidden checkpoint questions are used to check the validity of each user's answers, and the answers of untrusted users are rejected, which improves data quality and further improves the accuracy of the final result of the crowdsourcing task.
In one embodiment, the preset hidden checkpoint questions can be updated regularly. Regular updates ensure that the hidden checkpoint questions follow no discernible pattern, which prevents users from fabricating data once they have learned which questions are checkpoints.
The data verification process of this embodiment is described below with reference to Fig. 3. Before the crowdsourcing task is distributed, preset hidden checkpoint questions (literally "dark stake" questions) are embedded in the task. As shown in Fig. 3, the method includes:
S310: Receive the crowdsourcing task answers submitted by a user.
S320: Obtain the answers to the hidden checkpoint questions from the user's answers.
S330: Compare the user's answers to the hidden checkpoint questions with the standard answers, and judge whether the checkpoint verification passes. If yes, the user is trusted and the flow proceeds to S340; if no, the user is untrusted and the flow proceeds to S350.
S340: The answers to the crowdsourcing task provided by the user are valid and wait to enter the cross-checking flow. When the number of valid answer sets reaches the preset number (i.e. the number of trusted users reaches the preset number), the task is closed to further users and cross-checking begins.
S350: The answers to the crowdsourcing task provided by the user are invalid. If the number of valid answer sets has not reached the preset number, the crowdsourcing task needs to be distributed again.
S360: Judge whether the proportion of identical answers among the users' answers reaches the preset threshold. This embodiment takes 50% as an example, i.e. it judges whether more than half of the users gave the same answer. If yes, proceed to S370; if no, proceed to S380.
S370: Cross-checking is complete; the answer shared by the majority of users is the final result of the question.
S380: Collect the users' answers and send them to the manual auxiliary verification stage.
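The flow S310 to S380 can be condensed into one function for illustration. This is a sketch under several assumptions: submissions are dictionaries from question ID to answer, every valid submission answers every question, answers are compared by exact equality, and the returned status dictionary is an invented convenience. The function name verify_task is hypothetical.

```python
from collections import Counter
from typing import Dict, List

Submission = Dict[str, str]  # question id -> answer (assumed layout)


def verify_task(submissions: List[Submission],
                checkpoint_answers: Dict[str, str],
                required: int,
                threshold: float = 0.5) -> dict:
    # S320/S330: keep only submissions whose checkpoint answers all match.
    valid = [s for s in submissions
             if all(s.get(q) == a for q, a in checkpoint_answers.items())]
    # S350: too few valid submissions, the task must be distributed again.
    if len(valid) < required:
        return {"status": "reassign", "valid": len(valid)}
    # S340: the preset number is reached, cross-checking starts.
    valid = valid[:required]
    questions = [q for q in valid[0] if q not in checkpoint_answers]
    final: Dict[str, str] = {}
    manual: List[str] = []
    for q in questions:
        # S360: proportion of the most common answer for this question.
        answer, count = Counter(s.get(q) for s in valid).most_common(1)[0]
        if count / len(valid) > threshold:
            final[q] = answer   # S370: majority answer is the final result
        else:
            manual.append(q)    # S380: hand over to manual verification
    return {"status": "checked", "final": final, "manual": manual}
```

The result separates questions settled by cross-checking from those that still need a human reviewer, which mirrors the two exits of the flowchart.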
Embodiment 3
Fig. 4 is a structural diagram of the data verification apparatus for a crowdsourcing task provided by Embodiment 3 of the present invention. As shown in Fig. 4, the apparatus includes a task allocation module 410, an answer acquisition module 420 and an answer verification module 430.
The task allocation module 410 is configured to distribute the same crowdsourcing task to multiple users for data collection;
the answer acquisition module 420 is configured to obtain the answers of the multiple users;
the answer verification module 430 is configured to verify the answers according to the proportion of identical answers among the answers of the multiple users, and to determine the final result of the crowdsourcing task.
Optionally, the question types in the crowdsourcing task include at least one of: multiple-choice questions, true/false questions, short-answer questions and fill-in-the-blank questions.
Further, the answer verification module 430 includes:
a ratio determination unit, configured to determine, for each question in the crowdsourcing task, the identical-answer ratio of the question from the answers of the multiple users;
an answer determination unit, configured to take the identical answer as the final result of the question when the identical-answer ratio of the question exceeds a preset threshold;
an answer submission unit, configured to submit the multiple users' answers to the question for manual verification when the identical-answer ratio of the question does not exceed the preset threshold.
Optionally, the answer verification module 430 includes a trusted-user determination unit and an answer verification unit;
the trusted-user determination unit is configured to determine the trusted users among the multiple users according to the preset hidden checkpoint questions in the crowdsourcing task;
the task allocation module 410 is further configured to distribute the crowdsourcing task again when the number of trusted users has not reached the preset number;
the answer verification unit is configured to, when the number of trusted users reaches the preset number, verify the answers according to the proportion of identical answers among the answers of the trusted users and determine the final result of the crowdsourcing task.
The trusted-user determination unit includes:
an answer extraction subunit, configured to extract, for each of the multiple users, the answers to the preset hidden checkpoint questions from the user's answers;
a trusted-user determination subunit, configured to determine that the user is a trusted user when all the extracted answers are identical to the standard answers of the preset hidden checkpoint questions.
Optionally, the apparatus further includes a hidden checkpoint question setting module, configured to set preset hidden checkpoint questions in the crowdsourcing task before the same crowdsourcing task is distributed to multiple users for data collection, where the preset hidden checkpoint questions comprise a preset number of questions that have standard answers.
Further, the apparatus may also include a hidden checkpoint question update module, configured to update the preset hidden checkpoint questions regularly.
The data verification apparatus for crowdsourcing tasks provided by this embodiment can execute the data verification method for crowdsourcing tasks provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the method. For technical details not described in this embodiment, refer to the data verification method for crowdsourcing tasks provided by any embodiment of the present invention.
Embodiment 4
This embodiment provides a server, which includes:
one or more processors; and
a memory for storing one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the data verification method for a crowdsourcing task according to any embodiment of the present invention.
Fig. 5 is a structural diagram of the server provided by Embodiment 4 of the present invention. Fig. 5 shows a block diagram of an exemplary server 12 suitable for implementing embodiments of the present invention. The server 12 shown in Fig. 5 is only an example and should not impose any limitation on the function or scope of use of the embodiments of the present invention.
As shown in Fig. 5, the server 12 takes the form of a general-purpose computing device. The components of the server 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 connecting the different system components (including the system memory 28 and the processing unit 16).
The bus 18 represents one or more of several types of bus structure, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus and the Peripheral Component Interconnect (PCI) bus.
The server 12 typically includes a variety of computer-system-readable media. These media can be any available media accessible by the server 12, including volatile and non-volatile media and removable and non-removable media.
The system memory 28 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32. The server 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage system 34 can be used to read from and write to a non-removable, non-volatile magnetic medium (not shown in Fig. 5, commonly called a "hard disk drive"). Although not shown in Fig. 5, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk") and an optical disc drive for reading from and writing to a removable non-volatile optical disc (such as a CD-ROM, DVD-ROM or other optical medium) may be provided. In these cases, each drive can be connected to the bus 18 through one or more data media interfaces. The system memory 28 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present invention.
A program/utility 40 having a set of (at least one) program modules 42 may be stored, for example, in the system memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 42 generally perform the functions and/or methods of the embodiments described in the present invention.
The server 12 may also communicate with one or more external devices 14 (such as a keyboard, a pointing device or a display 24), with one or more devices that enable a user to interact with the server 12, and/or with any device (such as a network card or modem) that enables the server 12 to communicate with one or more other computing devices. Such communication can take place through an input/output (I/O) interface 22. The server 12 can also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN) and/or a public network such as the Internet) through a network adapter 20. As shown in Fig. 5, the network adapter 20 communicates with the other modules of the server 12 through the bus 18. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the server 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives and data backup storage systems.
The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example implementing the data verification method for crowdsourcing tasks provided by the embodiments of the present invention.
Embodiment five
Embodiment five of the present invention further provides a computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, it implements the data verification method for a crowdsourcing task described in any embodiment of the present invention.
The computer storage medium of the embodiments of the present invention may be any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program may be used by or in connection with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and it can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code contained on a computer-readable medium may be transmitted over any appropriate medium, including but not limited to wireless, wire, optical cable, RF, etc., or any suitable combination of the above.
Computer program code for performing the operations of the present invention may be written in one or more programming languages or a combination thereof. These programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made by those skilled in the art without departing from the protection scope of the present invention. Therefore, although the present invention has been described in further detail through the above embodiments, it is not limited to those embodiments and may include other equivalent embodiments without departing from the inventive concept; the scope of the present invention is determined by the scope of the appended claims.

Claims (15)

1. A data verification method for a crowdsourcing task, characterized by comprising:
allocating the same crowdsourcing task to multiple users for data collection;
obtaining the answers of the multiple users;
performing answer verification according to the ratio of identical answers among the answers of the multiple users, and determining the final answer of the crowdsourcing task.
2. The method according to claim 1, characterized in that performing answer verification according to the ratio of identical answers among the answers of the multiple users, and determining the final answer of the crowdsourcing task, comprises:
for each topic in the crowdsourcing task, determining the identical-answer ratio of the topic according to the answers of the multiple users;
if the identical-answer ratio of the topic is greater than a predetermined threshold, determining the identical answer to be the final answer of the topic;
if the identical-answer ratio of the topic is less than the predetermined threshold, submitting the answers of the multiple users to the topic for manual verification.
3. The method according to claim 1, characterized in that performing answer verification according to the ratio of identical answers among the answers of the multiple users, and determining the final answer of the crowdsourcing task, comprises:
determining trusted users among the multiple users according to preset dark stake topics in the crowdsourcing task;
if the number of trusted users has not reached a predetermined number, allocating the crowdsourcing task;
if the number of trusted users reaches the predetermined number, performing answer verification according to the ratio of identical answers among the answers of the trusted users, and determining the final answer of the crowdsourcing task.
4. The method according to claim 3, characterized in that determining trusted users among the multiple users according to preset dark stake topics in the crowdsourcing task comprises:
for each user among the multiple users, extracting the answers to the preset dark stake topics from the answers of the user;
if the extracted answers are all identical to the standard answers of the preset dark stake topics, determining the user to be a trusted user.
5. The method according to claim 1, characterized in that, before the same crowdsourcing task is allocated to multiple users for data collection, the method further comprises:
setting preset dark stake topics in the crowdsourcing task, wherein the preset dark stake topics comprise a preset number of topics having standard answers.
6. The method according to claim 5, characterized in that, after the preset dark stake topics are set in the crowdsourcing task, the method further comprises:
periodically updating the preset dark stake topics.
7. The method according to any one of claims 1 to 6, characterized in that the topic types in the crowdsourcing task comprise at least one of: multiple-choice questions, true-false questions, question-and-answer questions, and fill-in-the-blank questions.
8. A data verification device for a crowdsourcing task, characterized by comprising:
a task allocation module, configured to allocate the same crowdsourcing task to multiple users for data collection;
an answer acquisition module, configured to obtain the answers of the multiple users;
an answer verification module, configured to perform answer verification according to the ratio of identical answers among the answers of the multiple users, and to determine the final answer of the crowdsourcing task.
9. The device according to claim 8, characterized in that the answer verification module comprises:
a ratio determination unit, configured to determine, for each topic in the crowdsourcing task, the identical-answer ratio of the topic according to the answers of the multiple users;
an answer determination unit, configured to determine the identical answer to be the final answer of the topic when the identical-answer ratio of the topic is greater than a predetermined threshold;
an answer submission unit, configured to submit the answers of the multiple users to the topic for manual verification when the identical-answer ratio of the topic is less than the predetermined threshold.
10. The device according to claim 8, characterized in that the answer verification module comprises a trusted user determination unit and an answer verification unit;
the trusted user determination unit is configured to determine trusted users among the multiple users according to preset dark stake topics in the crowdsourcing task;
the task allocation module is further configured to allocate the crowdsourcing task when the number of trusted users has not reached a predetermined number;
the answer verification unit is configured to perform answer verification according to the ratio of identical answers among the answers of the trusted users, and to determine the final answer of the crowdsourcing task, when the number of trusted users reaches the predetermined number.
11. The device according to claim 10, characterized in that the trusted user determination unit comprises:
an answer extraction subunit, configured to extract, for each user among the multiple users, the answers to the preset dark stake topics from the answers of the user;
a trusted user determination subunit, configured to determine the user to be a trusted user when the extracted answers are all identical to the standard answers of the preset dark stake topics.
12. The device according to claim 8, characterized in that the device further comprises:
a dark stake topic setting module, configured to set preset dark stake topics in the crowdsourcing task before the same crowdsourcing task is allocated to multiple users for data collection, wherein the preset dark stake topics comprise a preset number of topics having standard answers.
13. The device according to claim 12, characterized in that the device further comprises:
a dark stake topic update module, configured to periodically update the preset dark stake topics.
14. A server, characterized in that the server comprises:
one or more processors;
a memory, configured to store one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the data verification method for a crowdsourcing task according to any one of claims 1 to 7.
15. A computer-readable storage medium on which a computer program is stored, characterized in that, when the program is executed by a processor, it implements the data verification method for a crowdsourcing task according to any one of claims 1 to 7.
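
For illustration only, the verification flow recited in claims 1 to 4 can be sketched in Python. This is not the applicant's implementation: the function names, the data layout (a mapping from user to topic to answer), and the default values for the threshold and the required number of trusted users are all assumptions made for this sketch. The claims do not say what happens when the ratio exactly equals the threshold; the sketch routes that case to manual verification.

```python
from collections import Counter
from typing import Dict, List, Optional

# Assumed data layout (not from the patent): answers[user_id][topic_id] -> answer
Answers = Dict[str, Dict[str, str]]


def trusted_users(answers: Answers, standard_answers: Dict[str, str]) -> List[str]:
    """Return users whose answers to every dark stake topic match the standard answers."""
    return [
        user
        for user, submitted in answers.items()
        if all(submitted.get(topic) == expected
               for topic, expected in standard_answers.items())
    ]


def verify_topic(topic_answers: List[str], threshold: float) -> Optional[str]:
    """Return the most common answer if its ratio exceeds the threshold.

    None means the topic should go to manual verification.
    """
    if not topic_answers:
        return None
    answer, count = Counter(topic_answers).most_common(1)[0]
    return answer if count / len(topic_answers) > threshold else None


def verify_task(
    answers: Answers,
    standard_answers: Dict[str, str],
    threshold: float = 0.6,   # hypothetical default
    min_trusted: int = 3,     # hypothetical default
) -> Optional[Dict[str, Optional[str]]]:
    """Return per-topic final answers, or None if more users must be allocated."""
    users = trusted_users(answers, standard_answers)
    if len(users) < min_trusted:
        return None  # not enough trusted users: the task should be allocated again
    # Dark stake topics only screen users; they are excluded from the result.
    topics = {t for u in users for t in answers[u]} - set(standard_answers)
    return {
        t: verify_topic([answers[u][t] for u in users if t in answers[u]], threshold)
        for t in sorted(topics)
    }
```

A caller would treat a `None` result from `verify_task` as a signal to allocate the task to further users, and a `None` value for an individual topic as a signal to submit that topic's answers for manual verification.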
CN201711457649.6A 2017-12-28 2017-12-28 Data verification method and device for crowdsourcing task, server and storage medium Active CN108197202B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711457649.6A CN108197202B (en) 2017-12-28 2017-12-28 Data verification method and device for crowdsourcing task, server and storage medium

Publications (2)

Publication Number Publication Date
CN108197202A true CN108197202A (en) 2018-06-22
CN108197202B CN108197202B (en) 2021-12-24

Family

ID=62585135

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711457649.6A Active CN108197202B (en) 2017-12-28 2017-12-28 Data verification method and device for crowdsourcing task, server and storage medium

Country Status (1)

Country Link
CN (1) CN108197202B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105426826A (en) * 2015-11-09 2016-03-23 张静 Tag noise correction based crowd-sourced tagging data quality improvement method
CN106228294A (en) * 2016-07-18 2016-12-14 合肥赑歌数据科技有限公司 A kind of search engine evaluation system and management
CN106529521A (en) * 2016-10-31 2017-03-22 江苏文心古籍数字产业有限公司 Ancient book character digital recording method
CN106446287A (en) * 2016-11-08 2017-02-22 北京邮电大学 Answer aggregation method and system facing crowdsourcing scene question-answering system
CN106844723A (en) * 2017-02-10 2017-06-13 厦门大学 medical knowledge base construction method based on question answering system
CN107194800A (en) * 2017-05-08 2017-09-22 深圳市华傲数据技术有限公司 A kind of data verification system and method based on mass-rent
CN107273492A (en) * 2017-06-15 2017-10-20 复旦大学 A kind of exchange method based on mass-rent platform processes image labeling task

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
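Yue Dejun et al.: "Crowdsourcing quality evaluation strategy based on voting consistency", Journal of Northeastern University (Natural Science) *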

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109583933A (en) * 2018-10-09 2019-04-05 顺丰科技有限公司 Address information judgment method, device, equipment and its storage medium
CN109471943A (en) * 2018-11-12 2019-03-15 平安科技(深圳)有限公司 A kind of crowdsourcing task answer based on data processing determines method and relevant device
CN109582581A (en) * 2018-11-30 2019-04-05 平安科技(深圳)有限公司 A kind of result based on crowdsourcing task determines method and relevant device
CN109582581B (en) * 2018-11-30 2023-08-25 平安科技(深圳)有限公司 Result determining method based on crowdsourcing task and related equipment
CN111382144A (en) * 2018-12-27 2020-07-07 阿里巴巴集团控股有限公司 Information processing method and device, storage medium and processor
CN111382144B (en) * 2018-12-27 2023-05-02 阿里巴巴集团控股有限公司 Information processing method and device, storage medium and processor
CN110096525A (en) * 2019-04-29 2019-08-06 北京三快在线科技有限公司 Calibrate method, apparatus, equipment and the storage medium of interest point information
CN110287385A (en) * 2019-06-18 2019-09-27 素朴网联(珠海)科技有限公司 A kind of corpus data acquisition method, system and storage medium
CN113268621A (en) * 2020-02-17 2021-08-17 百度在线网络技术(北京)有限公司 Picture sorting method and device, electronic equipment and storage medium
CN113268621B (en) * 2020-02-17 2024-04-30 百度在线网络技术(北京)有限公司 Picture sorting method and device, electronic equipment and storage medium
KR102195606B1 (en) * 2020-03-23 2020-12-29 주식회사 크라우드웍스 Method for improving reliability by selective self check of worker of crowdsourcing based project for artificial intelligence training data generation
KR102195964B1 (en) * 2020-03-27 2020-12-29 주식회사 크라우드웍스 Method for operating self check process of worker of crowdsourcing based project for artificial intelligence training data generation
CN111832956A (en) * 2020-07-20 2020-10-27 北京百度网讯科技有限公司 Data verification method, device, electronic equipment and medium
CN112508400B (en) * 2020-12-04 2021-10-08 云南大学 Self-generation method of crowdsourcing collaborative iteration task
CN112508400A (en) * 2020-12-04 2021-03-16 云南大学 Self-generation method of crowdsourcing collaborative iteration task
CN113868538A (en) * 2021-10-19 2021-12-31 北京字跳网络技术有限公司 Information processing method, device, equipment and medium
WO2023065825A1 (en) * 2021-10-19 2023-04-27 北京字跳网络技术有限公司 Information processing method and apparatus, device, and medium
CN113868538B (en) * 2021-10-19 2024-04-09 北京字跳网络技术有限公司 Information processing method, device, equipment and medium

Also Published As

Publication number Publication date
CN108197202B (en) 2021-12-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant