CN106156025B - Method and device for managing data annotation - Google Patents

Publication number
CN106156025B
Authority
CN
China
Prior art keywords
data
mark
subset
various types
data subset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510130022.4A
Other languages
Chinese (zh)
Other versions
CN106156025A (en
Inventor
吴海潜
董石鸣
黄峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201510130022.4A (CN106156025B)
Priority to PCT/CN2016/076570 (WO2016150328A1)
Publication of CN106156025A
Application granted
Publication of CN106156025B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The present invention provides a method and device for managing data annotation. The method includes: obtaining a data set corresponding to a data annotation task, and the annotation rules corresponding to each type of data in the data set; dividing the data set into data subsets; generating and publishing data annotation subtask description information for a data subset according to the obtained annotation rules corresponding to each type of data; sending the data subset to the sender of a first claim request for the data subset; and receiving annotated data from the sender of the first claim request. The management method further includes sending a call instruction for the annotation tool corresponding to each type of data in the data subset; and/or the published data annotation subtask description information of the data subset contains the target post-annotation data format of each type of data in the data subset. The invention avoids having to convert the data format of the annotated data.

Description

Method and device for managing data annotation
Technical field
The present invention relates to the field of computer data processing, and in particular to a method and device for managing data annotation.
Background
Data annotation refers to the process of describing or marking data such as text, pictures, and speech, for example, marking positions such as the outer corner of the left eye and the outer corner of the right eye in face sample pictures.
The existing data annotation process is as follows: a data annotation task is manually split into multiple subtasks, which are distributed to multiple annotators; each annotator selects a suitable standalone annotation tool according to the data type of the subtask's data; after the annotation work for all subtasks is complete, the data annotated by each annotator is integrated and saved.
Annotation tools currently come in many varieties. Even a single data type may correspond to several annotation tools, and different annotation tools may export different data formats. Therefore, with the existing process, the annotated data of the subtasks of one data annotation task may be in differing formats that do not match the format actually required, and must be converted to the required format before it can be integrated. Data format conversion, especially for annotation tasks with large data volumes, reduces the efficiency of the annotation and integration process.
Summary of the invention
An object of the present invention is to provide a method and device for managing data annotation that can improve the efficiency of the data annotation and integration process.
According to one aspect of the present invention, a method for managing data annotation is provided, wherein the management method includes the following steps:
obtaining a data set corresponding to a data annotation task;
obtaining the annotation rules corresponding to each type of data in the data set;
dividing the data set into data subsets;
generating data annotation subtask description information for a data subset according to the obtained annotation rules corresponding to each type of data;
publishing the data annotation subtask description information of the data subset;
in response to receiving a first claim request for the data subset, sending the data subset to the sender of the first claim request;
receiving annotated data from the sender of the first claim request,
wherein:
the management method further includes sending, to the sender of the first claim request, a call instruction for the annotation tool corresponding to each type of data in the data subset, wherein the data exported by the annotation tool corresponding to a type of data is in the target post-annotation data format of that type of data; and/or
the published data annotation subtask description information of the data subset contains the target post-annotation data format of each type of data in the data subset.
According to another aspect of the present invention, a device for managing data annotation is also provided, wherein the management device includes:
a data set obtaining unit, configured to obtain a data set corresponding to a data annotation task;
an annotation rule obtaining unit, configured to obtain the annotation rules corresponding to each type of data in the data set;
a data subset division unit, configured to divide the data set into data subsets;
an annotation task description information generation unit, configured to generate data annotation subtask description information for a data subset according to the obtained annotation rules corresponding to each type of data;
an annotation task description information publishing unit, configured to publish the data annotation subtask description information of the data subset;
a data subset sending unit, configured to send the data subset to the sender of a first claim request in response to receiving the first claim request for the data subset;
an annotated data receiving unit, configured to receive annotated data from the sender of the first claim request,
wherein:
the management device further includes a first call instruction sending unit, configured to send, to the sender of the first claim request, a call instruction for the annotation tool corresponding to each type of data in the data subset, wherein the data exported by the annotation tool corresponding to a type of data is in the target post-annotation data format of that type of data; and/or
the published data annotation subtask description information of the data subset contains the target post-annotation data format of each type of data in the data subset.
Compared with the prior art, the embodiments of the present invention have the following advantages: the data exported by the provided annotation tool is in the target post-annotation data format of the type of data to which that tool corresponds, and/or the published data annotation subtask description information of a data subset contains the target post-annotation data format of each type of data in the subset. This ensures that the annotated data is in the target post-annotation data format, avoids converting the format of the annotated data, and improves the efficiency of the annotation and integration process. In addition, the embodiments of the present invention divide the data set corresponding to a data annotation task into several data subsets, and generate and publish data annotation subtask description information for each subset. Data annotation is thus carried out by crowdsourcing: the data annotation task is divided into several subtasks that are crowdsourced to network users, which improves the processing efficiency of annotation tasks with large data volumes.
Brief description of the drawings
Other features, objects, and advantages of the present invention will become more apparent upon reading the following detailed description of non-limiting embodiments with reference to the accompanying drawings:
Fig. 1 is a flowchart of a method provided by one embodiment of the present invention;
Fig. 2 is a schematic diagram of a display interface for an annotation rule template and custom annotation rules provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of a display interface for data annotation subtask description information provided by an embodiment of the present invention;
Fig. 4 is a flowchart of a method provided by another embodiment of the present invention;
Fig. 5 is a flowchart of a method provided by yet another embodiment of the present invention;
Fig. 6 is a flowchart of a method provided by a further embodiment of the present invention;
Fig. 7 is a system architecture diagram provided by an embodiment of the present invention;
Fig. 8 is a schematic diagram of an overall publishing interface, provided by an embodiment of the present invention, for publishing the data annotation subtask description information of multiple data subsets;
Fig. 9 is a schematic diagram of a device provided by one embodiment of the present invention;
Fig. 10 is a schematic diagram of a device provided by another embodiment of the present invention;
Fig. 11 is a schematic diagram of a device provided by yet another embodiment of the present invention;
Fig. 12 is a schematic diagram of a device provided by a further embodiment of the present invention.
The same or similar reference numerals in the drawings denote the same or similar components.
Detailed description
Before the exemplary embodiments are discussed in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart describes the operations as a sequential process, many of the operations can be performed in parallel, concurrently, or simultaneously. In addition, the order of the operations can be rearranged. A process may be terminated when its operations are completed, but may also have additional steps not included in the drawings. A process may correspond to a method, function, procedure, subroutine, subprogram, and so on.
The term "computer equipment" in this context, also called a "computer", refers to an intelligent electronic device that can carry out predetermined processing such as numerical and/or logical computation by running predetermined programs or instructions. It may include a processor and a memory, with the processor executing instructions pre-stored in the memory to carry out the predetermined processing; or the predetermined processing may be carried out by hardware such as an ASIC, FPGA, or DSP; or by a combination of the two. Computer equipment includes, but is not limited to, servers, personal computers, laptops, tablet computers, and smartphones.
The computer equipment includes user equipment and network equipment. The user equipment includes, but is not limited to, computers, smartphones, and PDAs; the network equipment includes, but is not limited to, a single network server, a server group composed of multiple network servers, or a cloud based on cloud computing and composed of a large number of computers or network servers, where cloud computing is a kind of distributed computing: a super virtual computer composed of a set of loosely coupled computers. The computer equipment can implement the present invention by operating alone, or by accessing a network and interacting with other computer equipment in the network. The network in which the computer equipment is located includes, but is not limited to, the Internet, wide area networks, metropolitan area networks, local area networks, and VPNs.
It should be noted that the user equipment, network equipment, and networks above are only examples; other existing or future computer equipment or networks, if applicable to the present invention, should also fall within the scope of protection of the present invention and are incorporated herein by reference.
The methods discussed below (some of which are illustrated by flowcharts) can be implemented by hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware, or microcode, the program code or code segments that perform the necessary tasks can be stored in a machine-readable or computer-readable medium (such as a storage medium). One or more processors can perform the necessary tasks.
The specific structural and functional details disclosed herein are merely representative and serve the purpose of describing exemplary embodiments of the present invention. The present invention can, however, be embodied in many alternative forms and should not be construed as limited only to the embodiments set forth herein.
It should be understood that although the terms "first", "second", and so on may be used herein to describe various units, these units should not be limited by these terms. The terms are used only to distinguish one unit from another. For example, without departing from the scope of the exemplary embodiments, a first unit could be termed a second unit, and similarly a second unit could be termed a first unit. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the exemplary embodiments. Unless the context clearly indicates otherwise, the singular forms "a" and "an" as used herein are intended to include the plural as well. It should also be understood that the terms "includes" and/or "comprises" as used herein specify the presence of the stated features, integers, steps, operations, units, and/or components, and do not preclude the presence or addition of one or more other features, integers, steps, operations, units, components, and/or combinations thereof.
It should further be noted that in some alternative implementations, the functions or acts mentioned may occur in an order different from that indicated in the drawings. For example, depending on the functions or acts involved, two figures shown in succession may in fact be executed substantially concurrently, or may sometimes be executed in the reverse order.
The present invention is described in further detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a method for managing data annotation according to one embodiment of the present invention. Method 1 according to the present invention includes at least step 110, step 120, step 130, step 140, step 150, step 160, and step 170.
Data annotation management refers to the whole process of subcontracting a data annotation task to users and integrating the data annotated by the users in order to complete the data annotation task.
The method for managing data annotation can be executed by a platform belonging to the publisher of the data annotation task, or by a third-party platform independent of both the publisher of the data annotation task and the users who perform the annotation.
Referring to Fig. 1, in step 110, a data set corresponding to the data annotation task is obtained.
A data annotation task is a piece of data annotation work that needs to be completed. For example, the facial positions (such as the corner of the left eye and the corner of the right eye) in a large number of pictures of human faces, dog faces, cat faces, and so on need to be annotated so that the pictures can be used as training samples for machine learning. Annotating the facial positions in all of these pictures is one data annotation task.
The data set corresponding to a data annotation task is the set of data to be annotated in that task. In the task of annotating facial positions in these pictures, the pictures constitute the data set corresponding to the data annotation task.
Obtaining the data set corresponding to the data annotation task can be done, for example, as follows: the computer equipment used by the publisher of the data annotation task displays a data import interface to the publisher, through which the publisher imports the data set corresponding to the data annotation task. In this way, the platform executing method 1 obtains the data set corresponding to the data annotation task.
The data import interface may be a web interface, a local client interface, or an interface of another form; the present invention places no limitation on this.
Referring to Fig. 1, in step 120, the annotation rules corresponding to each type of data in the data set are obtained.
In the embodiments of the present invention, data is divided into different types according to the object to be annotated. For example, in the above task of annotating facial positions in pictures of human faces, dog faces, cat faces, and so on, the data types include face picture data, dog face picture data, cat face picture data, and so on.
In the embodiments of the present invention, an annotation rule specifies what content of the data is to be annotated and how it is to be annotated. For example, the annotation rule corresponding to face picture data specifies which positions in a face picture need to be annotated (for example, the corner of the left eye and the corner of the right eye) and how each position is to be annotated (for example, heavy mark, light mark, large dot, small dot); the annotation rule corresponding to dog face picture data specifies which positions in a dog face picture need to be annotated (for example, the tip of the dog's left ear and the tip of its right ear) and how each position is to be annotated (for example, heavy mark, light mark, large dot, small dot).
Table 1 is an example of the content to be annotated in face picture data and how it is to be annotated.
No.  Content to annotate          How to annotate
0    Outer corner of left eye     Heavy mark
1    Center of left eye           Light mark
2    Inner corner of left eye     Heavy mark
3    Inner corner of right eye    Heavy mark
4    Center of right eye          Light mark
5    Outer corner of right eye    Heavy mark
6    Nose                         Heavy mark
7    Left corner of mouth         Heavy mark
8    Center of mouth              Light mark
9    Right corner of mouth        Heavy mark
10   Top of left ear              Heavy mark
11   Bottom of left ear           Light mark
12   Top of right ear             Heavy mark
13   Bottom of right ear          Light mark
Table 1
Although Table 1 presents the content to be annotated and how to annotate it in tabular form, in practice an annotation rule is usually written in a machine language, such as:
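The patent's own machine-language example does not appear in this text. As a minimal sketch only, assuming a simple record-based encoding, the first rows of Table 1 might be expressed as follows. The class name `RuleItem`, its fields, and the style values "heavy" and "light" are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleItem:
    """One row of an annotation rule (hypothetical structure)."""
    index: int    # serial number of the row
    content: str  # what to annotate
    style: str    # how to annotate: "heavy" or "light" (assumed values)


# Hypothetical encoding of the first three rows of Table 1 (face picture data).
FACE_RULE = [
    RuleItem(0, "outer corner of left eye", "heavy"),
    RuleItem(1, "center of left eye", "light"),
    RuleItem(2, "inner corner of left eye", "heavy"),
]


def contents_with_style(rule, style):
    """Return the contents that the rule says to annotate in the given style."""
    return [item.content for item in rule if item.style == style]
```

A structure of this kind can be stored, looked up by data type, and later turned into a natural-language task description.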
In the embodiments of the present invention, the annotation rules corresponding to each type of data in the data set can be received, or they can be retrieved by referring to a correspondence between data types and annotation rules.
Specifically, the annotation rules corresponding to the types of data in the data set can all be obtained by receiving them, or can all be obtained by retrieving them; alternatively, the rules for some data types can be obtained by receiving and the rules for other data types by retrieving.
If annotation rules are obtained by retrieval, preferably the annotation rule corresponding to each type of data is configured in advance and the correspondence between data types and annotation rules is stored. The annotation rule corresponding to a data type can then be retrieved by referring to this correspondence.
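The two ways of obtaining rules described above, receiving and retrieving, can be sketched as a single lookup. This is an illustration under assumptions: the dictionary `PRECONFIGURED_RULES`, the function name, and the choice to prefer a received rule over a stored one when both exist are hypothetical, not details given in the patent.

```python
# Hypothetical stored correspondence between data types and annotation rules.
PRECONFIGURED_RULES = {
    "face_picture": ["corner of left eye", "corner of right eye"],
    "dog_face_picture": ["tip of left ear", "tip of right ear"],
}


def obtain_rules(data_types, received_rules=None):
    """Return one annotation rule per data type.

    A rule received from the publisher is used when present (an assumption of
    this sketch); otherwise the rule is retrieved from the stored correspondence.
    """
    received_rules = received_rules or {}
    rules = {}
    for data_type in data_types:
        if data_type in received_rules:
            rules[data_type] = received_rules[data_type]
        elif data_type in PRECONFIGURED_RULES:
            rules[data_type] = PRECONFIGURED_RULES[data_type]
        else:
            raise KeyError(f"no annotation rule available for {data_type!r}")
    return rules
```

Calling `obtain_rules` with rules received for only some data types covers the mixed case, in which the remaining types fall back to the stored correspondence.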
If annotation rules are obtained by receiving, the platform executing management method 1 receives from the publisher the annotation rules customized by the publisher. To make it easier for the publisher to customize annotation rules, preferably modifiable annotation rule templates corresponding to the various data types are preset and presented to the publisher through an interface. For example, a face annotation rule template corresponding to face picture data, a dog face annotation rule template corresponding to dog face picture data, and so on are preset. As shown in Fig. 2, an annotation rule template may include parts such as a template name, a template type, a rule sample, and a custom rule. The template name is the name of the annotation rule template; it has a default value in the template, but the publisher can modify it as needed. The template type is the data type to which the annotation rule template corresponds, such as the face picture data or dog face picture data mentioned above; the publisher can modify the template type as needed. The rule sample is a sample annotation rule for the data type; the publisher can copy the sample into the custom rule field and edit it there to create an annotation rule that meets the publisher's actual needs. For example, in the annotation rule template for face pictures, the rule sample provides a rule for annotating the corner of the left eye and the center of the left eye, but the publisher does not need the center of the left eye annotated. The publisher copies the sample into the custom rule field, deletes the part that annotates the center of the left eye, and submits the completed annotation rule template to the platform executing management method 1. The platform can then read, from the custom rule part of the completed template, the publisher-defined annotation rule corresponding to face picture data.
Referring to Fig. 1, in step 130, the data set is divided into data subsets.
The data set can be divided equally into data subsets. The number of equal data subsets can be a default value, or it can be received from the publisher. For example, the interface through which the publisher imports the data set corresponding to the data annotation task may include an option, to be filled in by the publisher, for the number of data annotation subtasks into which the publisher wishes the task to be divided. The number of data annotation subtasks the publisher wishes for corresponds to the number of data subsets. This implementation is particularly suitable when the data set contains only one type of data.
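Equal division can be sketched as below. The function name and the choice to give the earlier subsets one extra item when the data set does not divide exactly are assumptions of this sketch; the patent does not say how a remainder is handled.

```python
def split_evenly(items, n_subsets):
    """Split items into n_subsets contiguous parts whose sizes differ by at most one."""
    if n_subsets <= 0:
        raise ValueError("n_subsets must be positive")
    base, extra = divmod(len(items), n_subsets)
    subsets, start = [], 0
    for i in range(n_subsets):
        # The first `extra` subsets each take one additional item.
        size = base + (1 if i < extra else 0)
        subsets.append(items[start:start + size])
        start += size
    return subsets
```

For instance, ten pictures split into three subsets yield subsets of four, three, and three pictures.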
Alternatively, the data set can be divided into data subsets according to the data types in the data set. This can be implemented in several ways. For example, if the data set includes face picture data, dog face picture data, and text data, all the face picture data can be placed in one data subset, all the dog face picture data in another, and all the text data in a third. Alternatively, on that basis, if there is a large amount of face picture data, it can be further divided into multiple data subsets, and if there is little dog face picture data and text data, the dog face picture subset and the text subset can be merged into a single data subset.
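Division by data type, with large groups split further and small groups merged, might look like the following sketch. The thresholds `max_size` and `min_size` and the record layout `(data_type, item)` are hypothetical; the patent gives no concrete criteria for "many" or "few".

```python
from collections import defaultdict


def split_by_type(records, max_size, min_size):
    """Divide (data_type, item) records into subsets by data type.

    Groups with fewer than min_size records are merged into one mixed subset;
    other groups are cut into chunks of at most max_size records.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[0]].append(record)

    subsets, merged_small = [], []
    for data_type in sorted(groups):
        group = groups[data_type]
        if len(group) < min_size:
            merged_small.extend(group)
        else:
            for start in range(0, len(group), max_size):
                subsets.append(group[start:start + max_size])
    if merged_small:
        subsets.append(merged_small)
    return subsets
```

With five face pictures, one dog face picture, and one text item, `max_size=3` and `min_size=2` give two face subsets and one merged subset holding the dog face picture and the text item.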
Referring to Fig. 1, in step 140, data annotation subtask description information for a data subset is generated according to the obtained annotation rules corresponding to each type of data.
In one embodiment, this step includes:
determining the data types included in the data subset;
for each data type included in the data subset, converting the annotation rule corresponding to that type of data into natural language;
combining the annotation rules, converted into natural language, for each type of data, to obtain the data annotation subtask description information.
For example, for a data subset that includes both dog face picture data and text data, it is first determined that the data types included in the subset are dog face picture data and text data. Then the annotation rule corresponding to dog face picture data is converted into natural language (that is, the content to be annotated and how to annotate it are described in words rather than in machine language), and the annotation rule corresponding to text data is converted into natural language. Note that although the obtained annotation rules also include a rule for face picture data, the current data subset contains no face picture data, so that rule is not used for the current data subset. The natural-language annotation rules for dog face picture data and text data are then combined to obtain the data annotation subtask description information. An example of data annotation subtask description information is as follows:
"For dog face pictures, mark the top of the left ear, the bottom of the left ear, the top of the right ear, and the bottom of the right ear with circles, and mark the nose, the left corner of the mouth, and the right corner of the mouth with dots;
For text data, mark verbs with a wavy underline and nouns with a straight underline."
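The three sub-steps above (determine the types present, convert each rule to natural language, combine the results) can be sketched as follows. The rule layout, mapping a marking style to a list of targets, and all names are assumptions made for illustration; the patent does not specify how the conversion to natural language is performed.

```python
# Hypothetical rules: data type -> {how to mark: [what to mark]}.
RULES = {
    "face pictures": {"dots": ["the outer corner of the left eye"]},
    "dog face pictures": {
        "circles": ["the top of the left ear", "the top of the right ear"],
        "dots": ["the nose"],
    },
    "text data": {"a wavy underline": ["verbs"], "a straight underline": ["nouns"]},
}


def types_in(subset):
    """Return the data types present in a subset, in order of first appearance."""
    seen = []
    for data_type, _item in subset:
        if data_type not in seen:
            seen.append(data_type)
    return seen


def describe(subset, rules):
    """Convert the rules for the types present into one natural-language description."""
    sentences = []
    for data_type in types_in(subset):
        clauses = [
            f"mark {' and '.join(targets)} with {style}"
            for style, targets in rules[data_type].items()
        ]
        sentences.append(f"For {data_type}, " + "; ".join(clauses) + ".")
    return "\n".join(sentences)
```

Because only the types found in the subset are looked up, the rule for face pictures is ignored for a subset that contains only dog face pictures and text, as in the example above.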
It should be noted that step 140 can be executed automatically, or can be executed according to an operation instruction input by the publisher.
Referring to Fig. 1, in step 150, the data annotation subtask description information of the data subset is published.
The published data annotation subtask description information can be displayed on a web page, or displayed, once published, on the interfaces of numerous users' app clients. Taking a web page as an example, the display interface for data annotation subtask description information can be as shown in Fig. 3. Note that the annotation rules from which the description information in Fig. 3 was generated contain only the content to be annotated, not how to annotate it (for example, with circles or dots). The description information shown in Fig. 3 therefore describes only the content to be annotated, but actual data annotation subtask description information may also describe how to annotate.
Referring to Fig. 1, in step 160, in response to receiving a first claim request for the data subset, the data subset is sent to the sender of the first claim request.
As shown in Fig. 3, the page publishing the data annotation subtask description information of a data subset can contain a claim option. For example, a user who sees the page and wishes to claim the data subset, that is, to accept the data annotation subtask, selects the claim option. The user thereby accepts the data annotation subtask, that is, issues a first claim request for the data subset. The platform executing management method 1 receives the first claim request and sends the data subset to the user who issued it.
Referring to Fig. 1, in step 170, annotated data is received from the sender of the first claim request.
For example, after the user selects the claim option in Fig. 3, the platform executing management method 1 sends the data subset to the user, and each type of data in the data subset is displayed to the user. After annotating each type of data, the user selects a submit option in the interface, and the annotated data is submitted to the platform executing management method 1.
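Steps 150 to 170 (publish, claim, submit) can be sketched with a small in-memory class. This is an illustration only: the class and method names are hypothetical, there is no networking or persistence, and the rule that only a user who has claimed a subset may submit results for it is an assumption of the sketch.

```python
class AnnotationPlatform:
    """Minimal in-memory stand-in for the platform executing the method."""

    def __init__(self):
        self.descriptions = {}  # subset_id -> published description
        self.subsets = {}       # subset_id -> data of the subset
        self.claimants = {}     # subset_id -> set of users who claimed it
        self.results = {}       # subset_id -> annotated data received

    def publish(self, subset_id, data, description):
        """Step 150: publish the subtask description for a data subset."""
        self.subsets[subset_id] = data
        self.descriptions[subset_id] = description
        self.claimants[subset_id] = set()

    def claim(self, subset_id, user):
        """Step 160: on a claim request, send the subset to the requester."""
        if subset_id not in self.descriptions:
            raise KeyError(f"subset {subset_id!r} has not been published")
        self.claimants[subset_id].add(user)
        return self.subsets[subset_id]

    def submit(self, subset_id, user, annotated):
        """Step 170: receive annotated data from a user who claimed the subset."""
        if user not in self.claimants.get(subset_id, set()):
            raise PermissionError("user has not claimed this subset")
        self.results[subset_id] = annotated
```

A typical sequence is `publish`, then `claim` by a user, then `submit` by the same user.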
In one embodiment, to ensure that the annotated data is in the target post-annotation data format, management method 1 can further include: sending, to the sender of the first claim request, a call instruction for the annotation tool corresponding to each type of data in the data subset (not shown), wherein the data exported by the annotation tool corresponding to a type of data is in the target post-annotation data format of that type of data.
An annotation tool is an application or the like used to annotate data. In general, one type of data can be annotated with one or more annotation tools, and different annotation tools export different data formats. In this embodiment, each type of data corresponds to a single annotation tool, which guarantees that all exported data of that type is in one data format, namely the target post-annotation data format.
One specific implementation of the above step is as follows:
referring to a correspondence between data types and annotation tools, determine the annotation tool corresponding to each type of data in the data subset; according to the obtained annotation rules corresponding to each type of data, configure the parameters of the determined annotation tools; send the call instructions for the configured annotation tools to the sender of the first claim request.
In this embodiment, the target post-annotation data format desired for each data type is determined in advance (for example, the platform determines from experience which export format works best for a particular data type), and the platform selects an annotation tool that can export that format as the annotation tool corresponding to that type of data in the data subset. The platform then configures the parameters of the selected annotation tools, such as line thickness, according to the obtained annotation rules corresponding to each type of data. For example, if the annotation rule specifies that the top of the left ear in face pictures is to be marked with a dot 2 cm in diameter, the annotation tool corresponding to face picture data must be configured so that it can draw a dot 2 cm in diameter. The platform then sends the call instructions for the configured annotation tools to the sender of the first claim request. This can be done at the same time as sending the data subset to the sender of the first claim request in step 160, or separately.
For example, suppose the data subset contains dog face picture data and text data. For dog face picture data, the desired target post-annotation data format is G1; for text data, it is G2. To produce format G1, annotation tool T1 is selected; to produce format G2, annotation tool T2 is needed. The platform can send the call instruction for the configured annotation tool T1 and the call instruction for the configured annotation tool T2 to the user who sent the first claim request, so that the data the user produces after annotation is in format G1 for dog face picture data and in format G2 for text data.
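The tool selection and parameter configuration described above can be sketched as follows, reusing the T1/G1 and T2/G2 example. The dictionary names, the shape of a call instruction, and the parameter key `dot_diameter_cm` are hypothetical; the patent does not define what a call instruction contains.

```python
# Hypothetical correspondences, following the T1/G1 and T2/G2 example.
TOOL_FOR_TYPE = {"dog_face_picture": "T1", "text": "T2"}
EXPORT_FORMAT_OF_TOOL = {"T1": "G1", "T2": "G2"}


def build_call_instructions(subset_types, rule_params):
    """Build one call instruction per data type present in the subset.

    rule_params maps a data type to tool parameters derived from its
    annotation rule, for example a dot diameter or a line thickness.
    """
    instructions = []
    for data_type in subset_types:
        tool = TOOL_FOR_TYPE[data_type]
        instructions.append({
            "tool": tool,
            "data_type": data_type,
            "params": dict(rule_params.get(data_type, {})),
            "export_format": EXPORT_FORMAT_OF_TOOL[tool],
        })
    return instructions
```

Because each data type maps to exactly one tool, the export format of every annotated item of that type is fixed in advance, which is the point of this embodiment.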
In another embodiment, the published data annotation subtask description information of the data subset contains the post-annotation target data format of each type of data in the data subset.
Where the published data annotation subtask description information of the data subset contains the post-annotation target data format of each type of data in the data subset, step 150 above may be implemented as follows:
With reference to the correspondence between data types and post-annotation target data formats, determine the post-annotation target data format of each type of data in the data subset; and place the determined post-annotation target data formats into the data annotation subtask description information of the data subset for publication.
For example, the desired post-annotation target data format is determined in advance for each data type (for example, the platform determines empirically, for a specific data type, which exported post-annotation data format works better), and the correspondence between data types and desired post-annotation target data formats is stored in advance. Then, based on the data types contained in the data subset and with reference to this correspondence, the post-annotation target data format of each type of data in the data subset is determined and placed into the data annotation subtask description information of the data subset for publication.
For example, suppose the data subset contains dog face picture data and text data. For the dog face picture data, the desired post-annotation target data format is G1; for the text data, it is G2. Therefore, when the data annotation subtask description information of the data subset is published, it contains the desired format G1 for the dog face picture data and the desired format G2 for the text data. The user who sends the first claim request may use any annotation tool, provided that the annotated data exported by the tool is in format G1 for the dog face picture data and in format G2 for the text data.
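As a rough sketch of this second embodiment, the snippet below appends the target formats to a subtask description and checks a returned export against them. The table `TARGET_FORMAT_BY_TYPE` and both function names are hypothetical; only the formats G1 and G2 come from the example above.

```python
# Hypothetical pre-stored correspondence between data types and target formats.
TARGET_FORMAT_BY_TYPE = {"dog_face_image": "G1", "text": "G2"}


def add_target_formats(description, subset_types):
    """Append one line per data type stating the required export format."""
    lines = [description]
    for data_type in subset_types:
        target = TARGET_FORMAT_BY_TYPE[data_type]
        lines.append(f"Export annotated {data_type} data in format {target}.")
    return "\n".join(lines)


def export_matches_target(data_type, exported_format):
    """Return True if the annotator's export uses the required format."""
    return TARGET_FORMAT_BY_TYPE[data_type] == exported_format


published = add_target_formats(
    "Annotate data subset 1.", ["dog_face_image", "text"]
)
```

Because the requirement travels with the description rather than with a tool, the annotator is free to choose any tool whose export satisfies the check.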
The technical solution provided by the embodiments of the present invention ensures that the annotated data is in the post-annotation target data format, avoids format conversion of the annotated data, and thereby improves the efficiency of the data annotation process. In addition, the data set corresponding to the data annotation task is divided into several data subsets, and data annotation subtask description information is generated and published for each data subset. Data annotation is thus carried out by crowdsourcing: the data annotation task is split into several data annotation subtasks that are crowdsourced to network users, which improves the processing efficiency of data annotation tasks involving large amounts of data.
Based on any of the above method embodiments, the management method optionally further includes step 180. Referring to Fig. 4, in step 180, the annotated data of each data subset received from the senders of the first claim requests is integrated and stored. That is, the annotated data of each of the data subsets into which the data set was divided, as received from the respective senders of the first claim requests, is recombined according to the order of the data subsets in the data set and stored. For example, where data set S includes data subsets S1, S2 and S3, the annotated data of S1, S2 and S3 received from the senders of the first claim requests is combined, in the order of S1, S2 and S3 in data set S, into one whole, namely the annotated data set, which is then stored.
To make the data resource more widely usable, the annotated data of each data subset received from the senders of the first claim requests is preferably integrated and stored in cloud storage. If the integrated annotated data were only sent to the publisher, only the publisher could use it. In some cases, more people need to be able to share the integrated annotated data; for example, everyone in the publisher's company, or even the public, may need to use it. Integrating and storing the data in cloud storage therefore makes the annotation results more widely usable.
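The ordered integration of step 180 can be illustrated with a short sketch. The function name and the in-memory dictionary are assumptions made for illustration; annotated subsets may arrive in any order, and the merge restores the order of the subsets in the data set.

```python
def integrate_annotated_subsets(subset_order, annotated_by_subset):
    """Concatenate annotated subsets following their order in the data set."""
    missing = [name for name in subset_order if name not in annotated_by_subset]
    if missing:
        # Integration is only meaningful once every subset has been returned.
        raise ValueError(f"annotated data not yet received for: {missing}")
    merged = []
    for name in subset_order:
        merged.extend(annotated_by_subset[name])
    return merged


# Results returned out of order by three different claimants (hypothetical data).
received = {"S3": ["e", "f"], "S1": ["a", "b"], "S2": ["c", "d"]}
annotated_set = integrate_annotated_subsets(["S1", "S2", "S3"], received)
```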
Further, the management method also includes steps 181 to 185.
Referring to Fig. 5, in step 181, data verification subtask description information of the data subset is generated according to the acquired annotation rules corresponding to each type of data. Its specific implementation may follow that of the data annotation subtask description information described above and is not repeated here. Verification is a check of the annotation; it therefore uses the same annotation tool as the annotation, and what is annotated and how it is annotated are the same as during annotation. The data verification subtask description information is accordingly essentially similar to the data annotation subtask description information.
Referring to Fig. 5, in step 182, the verifier information corresponding to the data subset is obtained.
The verifier information corresponding to the data subset may be filled in by the publisher in the interface through which the publisher imports the data set corresponding to the data annotation task, may be obtained by sending a separate inquiry to the publisher, or may be obtained in other ways. The verifier may be a professional in the publisher's organization.
Referring to Fig. 5, in step 183, the verification subtask description information and the annotated data of the data subset are sent according to the verifier information. For example, they are sent to the verifier indicated by the verifier information.
Referring to Fig. 5, in step 184, the verified annotated data of the data subset is received. That is, after verification the verifier sends the verified annotated data of the data subset to the platform, and the platform receives it.
Referring to Fig. 5, in step 185, the verified annotated data of each data subset is integrated and stored. That is, the verified annotated data of each data subset is recombined according to the order of the data subsets in the data set and stored. For example, where data set S includes data subsets S1, S2 and S3, the verified annotated data of S1, S2 and S3 is combined, in the order of S1, S2 and S3 in data set S, into one whole, namely the verified annotated data set, which is then stored.
The verified annotated data may replace the data stored in step 180, or the two may be stored separately without replacement.
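The two storage options just mentioned (replace the step 180 data, or keep both copies) can be sketched as below. The dictionary-based store and the key suffix `/verified` are purely illustrative assumptions; the text does not specify how the data is keyed.

```python
def store_verified_data(store, dataset_id, verified_data, replace=True):
    """Store verified data, replacing the step 180 copy or alongside it."""
    # With replace=False the verified copy gets its own (hypothetical) key.
    key = dataset_id if replace else f"{dataset_id}/verified"
    store[key] = list(verified_data)
    return key


store = {"S": ["a1", "b1"]}  # annotated data stored in step 180
separate_key = store_verified_data(store, "S", ["a2", "b2"], replace=False)
```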
In addition, the management method may also include step 181, step 186, step 187, step 188 and step 185, as shown in Fig. 6.
Step 181 is the same as step 181 in Fig. 5.
Referring to Fig. 6, in step 186, the data verification subtask description information of the data subset is published.
The published data verification subtask description information may be displayed on a web page, or displayed in a distributed manner on the interfaces of the app clients of multiple users.
Referring to Fig. 6, in step 187, in response to receiving a second claim request for the data subset, the annotated data of the data subset is sent to the sender of the second claim request.
The second claim request is a request to take on the subtask of verifying the data annotation of the data subset. That is, unlike the verification by professionals in Fig. 5, in the embodiment of Fig. 6 the verification of the annotated data of each data subset is also published and outsourced, for example to the public.
Referring to Fig. 6, in step 188, the verified annotated data of the data subset is received from the sender of the second claim request.
The management method further includes sending the call instruction of the annotation tool corresponding to each type of data in the data subset to the sender of the second claim request, wherein the data exported by the annotation tool corresponding to a type of data is in the post-annotation target data format of that type of data; and/or the published data annotation subtask description information of the data subset contains the post-annotation target data format of each type of data in the data subset.
In step 185, the verified annotated data of each data subset is integrated and stored. For example, the verified annotated data of each data subset is integrated and stored in cloud storage.
The method provided by the embodiments of the present invention is described in detail below, taking a concrete application scenario as an example.
In the application scenario shown in Fig. 7, the computer equipment 701 used by the publisher of the data annotation task, the data annotation task management server 702, the data annotation task publishing platform server 703, the data center storage server 704 and the computer equipment 705 used by the claimant of the data annotation task communicate over the Internet.
The data annotation task management server 702 may be implemented by one server or by an architecture composed of multiple servers; the data annotation task publishing platform server 703 may be implemented by one server or by an architecture composed of multiple servers; and the data center storage server 704 may be implemented by one server or by an architecture composed of multiple servers.
The functions of the data annotation task management server 702, the data annotation task publishing platform server 703 and the data center storage server 704 may also be integrated and implemented in one or more devices.
With reference to the system architecture shown in Fig. 7, the specific working principle is as follows:
Step 1: the computer equipment 701 used by the publisher of the data annotation task displays a data import interface to the publisher, so that the publisher can import the data set corresponding to the data annotation task through this interface.
Step 2: computer equipment 701 sends the data set imported by the publisher for the data annotation task to the data annotation task management server 702; that is, the data annotation task management server 702 obtains the data set corresponding to the data annotation task.
In this example, the data set contains only face annotation data.
Step 3: computer equipment 701 retrieves and displays a face annotation rule template according to the publisher's operation instruction, so that the publisher can customize the annotation rule.
The corresponding interface is shown in Fig. 2.
The face annotation rule template retrieved by computer equipment 701 may be stored locally in advance, or may be requested from server 702.
Step 4: computer equipment 701 sends the annotation rule for the face annotation data, as customized by the publisher, to server 702; that is, server 702 obtains the annotation rule corresponding to each type of data in the data set.
It should be noted that if the publisher does not customize the annotation rule (that is, steps 3 and 4 are not executed), the preconfigured annotation rule corresponding to the face annotation data may be retrieved by computer equipment 701 and sent to server 702, or may be retrieved by server 702 itself.
Step 5: server 702 divides the data set into data subsets.
The number of equal parts may be a default value or a value set by the publisher.
It should be noted that computer equipment 701 may also divide the data set into data subsets itself and then send them to server 702.
Step 6: server 702 generates the data annotation subtask description information of the data subsets according to the acquired annotation rule.
Step 7: server 702 sends the generated data annotation subtask description information of the data subsets, together with a publication request, to the data annotation task publishing platform server 703.
Server 702 may send the data annotation subtask description information of the data subsets together with the publication request to server 703 automatically, or may send them after receiving an operation instruction from the publisher.
The publication request instructs that the subtask description information be published to a specified display area of the target network platform.
Step 8: server 703 publishes the data annotation subtask description information of the data subsets to the specified display area of the target network platform according to the publication request, as shown in Fig. 8.
Step 9: the computer equipment 705 used by the claimant of the data annotation task displays the interface shown in Fig. 8 to the claimant according to the claimant's operation instruction, and further displays the operation interface shown in Fig. 3 according to the claimant's operation instruction.
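Step 5 can be illustrated with a minimal sketch that splits a data set into a given number of nearly equal parts. The function name and the policy of giving the leftover items to the first subsets are assumptions; the text only says that the number of parts is a default or publisher-set value.

```python
def split_into_subsets(dataset, parts):
    """Split a list into `parts` contiguous subsets of nearly equal size."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    base, extra = divmod(len(dataset), parts)
    subsets, start = [], 0
    for index in range(parts):
        # The first `extra` subsets receive one additional item each.
        size = base + (1 if index < extra else 0)
        subsets.append(dataset[start:start + size])
        start += size
    return subsets


subsets = split_into_subsets(list(range(10)), 3)
```

Contiguous slices keep the original order of the data, which makes the later integration step a simple concatenation of the subsets in sequence.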
Step 10: according to the claimant's operation instruction, computer equipment 705 sends to server 702 a first claim request for data subset 1, which corresponds to face annotation subtask 1.
Step 11: in response to the first claim request for data subset 1, server 702 sends data subset 1 and the call instruction of the face annotation tool to computer equipment 705.
Before the face annotation tool is called, server 702 configures the parameters of the face annotation tool according to the received annotation rule.
Step 12: computer equipment 705 calls the web version of the face annotation tool according to the call instruction and displays it to the claimant, and completes the face annotation of data subset 1 according to the claimant's operation instructions.
Step 13: computer equipment 705 sends the annotated data together with verifier information to server 702.
In the present invention, the verifier information may be, but is not limited to, an account identifier, a device address, a device identifier, and the like.
In this embodiment, the verifier information is the account identifier of the publisher.
Step 14: after server 702 has received from computer equipment 705 the annotated data of data subset 1 corresponding to face annotation subtask 1, and from other computer equipment the annotated data of data subsets 2, 3, ... corresponding to face annotation subtasks 2, 3, ..., it integrates the annotated data of data subsets 1, 2, 3, ... and stores it in the data center storage server 704.
Step 15: server 702 generates the data verification subtask description information of data subset 1 according to the annotation rule corresponding to the face annotation data.
Step 16: according to the verifier information, server 702 sends the verification subtask description information and the annotated data of the data subset to computer equipment 701.
Step 17: computer equipment 701 completes the data verification work according to the publisher's operation instructions.
Step 18: server 702 receives the verified annotated data of the data subset sent by computer equipment 701.
Step 19: server 702 integrates the received verified annotated data of data subset 1 with the verified annotated data of data subsets 2, 3, ..., and stores the result in server 704.
Based on the same inventive concept as the method, the present invention also provides a management device for data annotation. Fig. 9 shows a schematic diagram of the management device 9 for data annotation. The management device includes:
a data set acquiring unit 910, for obtaining the data set corresponding to a data annotation task;
an annotation rule unit 920, for obtaining the annotation rule corresponding to each type of data in the data set;
a data subset dividing unit 930, for dividing the data set into data subsets;
an annotation task description information generating unit 940, for generating the data annotation subtask description information of a data subset according to the acquired annotation rules corresponding to each type of data;
an annotation task description information publishing unit 950, for publishing the data annotation subtask description information of the data subset;
a data subset sending unit 960, for sending, in response to receiving a first claim request for the data subset, the data subset to the sender of the first claim request;
an annotated data receiving unit 970, for receiving the annotated data from the sender of the first claim request,
wherein the management device further includes a first call instruction sending unit (not shown) for sending the call instruction of the annotation tool corresponding to each type of data in the data subset to the sender of the first claim request, wherein the data exported by the annotation tool corresponding to a type of data is in the post-annotation target data format of that type of data; and/or the published data annotation subtask description information of the data subset contains the post-annotation target data format of each type of data in the data subset.
The annotation rule unit 920 is used for:
receiving the annotation rule corresponding to each type of data in the data set; and/or
retrieving, with reference to the correspondence between data types and annotation rules, the annotation rule corresponding to each type of data in the data set.
The data subset dividing unit 930 is used for:
dividing the data set into data subsets; or
dividing the data set into data subsets according to the data types in the data set.
The annotation task description information generating unit 940 is used for:
determining the data types contained in the data subset;
for each data type contained in the data subset, converting the annotation rule corresponding to that type of data into natural language;
integrating the annotation rules for each type of data, as converted into natural language, to obtain the data annotation subtask description information.
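The second option of the dividing unit, splitting by data type, might look like the following sketch. Representing each item as a dictionary with a `type` field is an assumption made for illustration.

```python
from collections import defaultdict


def split_by_type(dataset):
    """Group items into one data subset per data type, keeping item order."""
    groups = defaultdict(list)
    for item in dataset:
        groups[item["type"]].append(item)
    return dict(groups)


# Hypothetical mixed data set with two data types.
dataset = [
    {"type": "dog_face_image", "id": 1},
    {"type": "text", "id": 2},
    {"type": "dog_face_image", "id": 3},
]
subsets_by_type = split_by_type(dataset)
```

A per-type split means each subset needs only one annotation tool and one target format, at the cost of subsets of unequal size.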
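The three operations of the generating unit (determine the types present, convert each rule to natural language, integrate the results) can be sketched as below. The rule fields `target` and `shape`, the sentence template, and the sample rules are hypothetical; the text does not define how rules are represented.

```python
def rule_to_sentence(data_type, rule):
    """Render one structured annotation rule as a natural-language sentence."""
    return f"For {data_type} data, mark the {rule['target']} with a {rule['shape']}."


def build_subtask_description(subset, rules_by_type):
    """Combine the sentences for every data type present in the subset."""
    # Step 1: determine which data types the subset contains.
    types_in_subset = sorted({item["type"] for item in subset})
    # Steps 2 and 3: convert each rule to a sentence, then integrate them.
    return " ".join(rule_to_sentence(t, rules_by_type[t]) for t in types_in_subset)


rules = {
    "face_image": {"target": "top of the left ear", "shape": "dot of diameter 2 cm"},
    "text": {"target": "person names", "shape": "highlight"},
}
subset = [{"type": "text"}, {"type": "face_image"}, {"type": "text"}]
description = build_subtask_description(subset, rules)
```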
Referring to Fig. 10, the management device further includes an annotated data storage unit 980, used for:
integrating and storing the annotated data of each data subset received from the senders of the first claim requests.
The annotated data storage unit 980 is used for:
integrating and storing in cloud storage the annotated data of each data subset received from the senders of the first claim requests.
The first call instruction sending unit is configured to:
determine, with reference to the correspondence between data types and annotation tools, the annotation tool corresponding to each type of data in the data subset;
configure, according to the acquired annotation rule corresponding to each type of data, the parameters of the determined annotation tool corresponding to each type of data in the data subset;
send the call instruction of the parameter-configured annotation tool, together with the data subset, to the sender of the first claim request.
Where the published data annotation subtask description information of the data subset contains the post-annotation target data format of each type of data in the data subset, the annotation task description information publishing unit 950 is used for:
determining, with reference to the correspondence between data types and post-annotation target data formats, the post-annotation target data format of each type of data in the data subset;
placing the determined post-annotation target data formats of each type of data in the data subset into the data annotation subtask description information of the data subset for publication.
Referring to Fig. 11, the management device further includes:
a verification task description information generating unit 990, for generating the data verification subtask description information of the data subset according to the acquired annotation rules corresponding to each type of data;
a verifier information acquiring unit 9100, for obtaining the verifier information corresponding to the data subset;
a first verification task sending unit 9110, for sending the verification subtask description information and the annotated data of the data subset according to the verifier information;
a first verified data receiving unit 9120, for receiving the verified annotated data of the data subset;
a verified data storage unit 9130, for integrating and storing the verified annotated data of each data subset.
Referring to Fig. 12, the management device further includes:
a verification task description information generating unit 990, for generating the data verification subtask description information of the data subset according to the acquired annotation rules corresponding to each type of data;
a verification task description information publishing unit 9140, for publishing the data verification subtask description information of the data subset;
a second verification task sending unit 9150, for sending, in response to receiving a second claim request for the data subset, the annotated data of the data subset to the sender of the second claim request;
a second verified data receiving unit 9160, for receiving the verified annotated data of the data subset from the sender of the second claim request;
a verified data storage unit 9130, for integrating and storing the verified annotated data of each data subset,
wherein the management device further includes a second call instruction sending unit for sending the call instruction of the annotation tool corresponding to each type of data in the data subset to the sender of the second claim request, wherein the data exported by the annotation tool corresponding to a type of data is in the post-annotation target data format of that type of data; and/or the published data annotation subtask description information of the data subset contains the post-annotation target data format of each type of data in the data subset.
The verified data storage unit 9130 is used for:
integrating and storing in cloud storage the verified annotated data of each data subset.
It should be noted that the present invention may be implemented in software and/or in a combination of software and hardware; for example, each device of the present invention may be implemented using an application-specific integrated circuit (ASIC) or any other similar hardware device. In one embodiment, the software program of the present invention may be executed by a processor to implement the steps or functions described above. Likewise, the software program of the present invention (including related data structures) may be stored in a computer-readable recording medium, for example RAM, a magnetic or optical drive, a floppy disk, or similar devices. In addition, some steps or functions of the present invention may be implemented in hardware, for example as circuitry that cooperates with a processor to execute each step or function.
It is obvious to a person skilled in the art that the invention is not limited to the details of the above exemplary embodiments, and that the present invention can be realized in other specific forms without departing from its spirit or essential attributes. The embodiments are therefore to be considered in all respects as illustrative and not restrictive. The scope of the present invention is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalency of the claims are intended to be embraced in the present invention. No reference sign in the claims should be construed as limiting the claim concerned. Furthermore, the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices recited in a system claim may also be implemented by a single unit or device through software or hardware. Words such as "first" and "second" are used to denote names and do not indicate any particular order.
Although exemplary embodiments have been specifically shown and described above, those skilled in the art will understand that changes in form and detail may be made without departing from the spirit and scope of the claims.

Claims (22)

1. A management method (1) for data annotation, wherein the management method comprises the following steps:
obtaining a data set corresponding to a data annotation task (110);
obtaining the annotation rule corresponding to each type of data in the data set (120);
dividing the data set into data subsets (130);
generating the data annotation subtask description information of a data subset according to the acquired annotation rules corresponding to each type of data (140);
publishing the data annotation subtask description information of the data subset (150);
in response to receiving a first claim request for the data subset, sending the data subset to the sender of the first claim request (160);
receiving the annotated data from the sender of the first claim request (170),
wherein:
the management method further comprises sending the call instruction of the annotation tool corresponding to each type of data in the data subset to the sender of the first claim request, wherein the data exported by the annotation tool corresponding to a type of data is in the post-annotation target data format of that type of data; and/or
the published data annotation subtask description information of the data subset contains the post-annotation target data format of each type of data in the data subset.
2. The management method according to claim 1, wherein the step (120) of obtaining the annotation rule corresponding to each type of data in the data set comprises:
receiving the annotation rule corresponding to each type of data in the data set; and/or
retrieving, with reference to the correspondence between data types and annotation rules, the annotation rule corresponding to each type of data in the data set.
3. The management method according to claim 1, wherein the step (130) of dividing the data set into data subsets comprises one of the following:
dividing the data set into data subsets;
dividing the data set into data subsets according to the data types in the data set.
4. The management method according to claim 1, wherein the step (140) of generating the data annotation subtask description information of the data subset according to the acquired annotation rules corresponding to each type of data comprises:
determining the data types contained in the data subset;
for each data type contained in the data subset, converting the annotation rule corresponding to that type of data into natural language;
integrating the annotation rules for each type of data, as converted into natural language, to obtain the data annotation subtask description information.
5. The management method according to claim 1, wherein the management method further comprises:
integrating and storing the annotated data of each data subset received from the senders of the first claim requests (180).
6. The management method according to claim 5, wherein the step (180) of integrating and storing the annotated data of each data subset received from the senders of the first claim requests further comprises:
integrating and storing in cloud storage the annotated data of each data subset received from the senders of the first claim requests.
7. The management method according to claim 1, wherein the step of sending the call instruction of the annotation tool corresponding to each type of data in the data subset to the sender of the first claim request comprises:
determining, with reference to the correspondence between data types and annotation tools, the annotation tool corresponding to each type of data in the data subset;
configuring, according to the acquired annotation rule corresponding to each type of data, the parameters of the determined annotation tool corresponding to each type of data in the data subset;
sending the call instruction of the parameter-configured annotation tool to the sender of the first claim request.
8. The management method according to claim 1, wherein, where the published data annotation subtask description information of the data subset contains the post-annotation target data format of each type of data in the data subset, the step (150) of publishing the data annotation subtask description information of the data subset comprises:
determining, with reference to the correspondence between data types and post-annotation target data formats, the post-annotation target data format of each type of data in the data subset;
placing the determined post-annotation target data formats of each type of data in the data subset into the data annotation subtask description information of the data subset for publication.
9. The management method according to claim 5, wherein the management method further comprises:
generating the data verification subtask description information of the data subset according to the acquired annotation rules corresponding to each type of data (181);
obtaining the verifier information corresponding to the data subset (182);
sending the verification subtask description information and the annotated data of the data subset according to the verifier information (183);
receiving the verified annotated data of the data subset (184);
integrating and storing the verified annotated data of each data subset (185).
10. The management method according to claim 5, wherein the management method further comprises:
generating the data verification subtask description information of the data subset according to the acquired annotation rules corresponding to each type of data (181);
publishing the data verification subtask description information of the data subset (186);
in response to receiving a second claim request for the data subset, sending the annotated data of the data subset to the sender of the second claim request (187);
receiving the verified annotated data of the data subset from the sender of the second claim request (188);
integrating and storing the verified annotated data of each data subset (185),
wherein:
the management method further comprises sending the call instruction of the annotation tool corresponding to each type of data in the data subset to the sender of the second claim request, wherein the data exported by the annotation tool corresponding to a type of data is in the post-annotation target data format of that type of data; and/or
the published data annotation subtask description information of the data subset contains the post-annotation target data format of each type of data in the data subset.
11. The management method according to claim 9 or 10, wherein the step (185) of integrating and storing the verified annotated data of each data subset comprises:
integrating and storing in cloud storage the verified annotated data of each data subset.
12. A managing device (9) for data annotation, wherein the managing device comprises:
a data set acquiring unit (910), for obtaining a data set corresponding to a data annotation task;
an annotation rule acquisition unit (920), for obtaining annotation rules corresponding to the various types of data in the data set;
a data subset division unit (930), for dividing the data set into data subsets;
an annotation task description information generating unit (940), for generating data annotation subtask description information for a data subset according to the obtained annotation rules corresponding to the various types of data;
an annotation task description information publishing unit (950), for publishing the data annotation subtask description information of the data subset;
a data subset sending unit (960), for sending, in response to receiving a first claim request for a data subset, the data subset to the sender of the first claim request;
an annotated data receiving unit (970), for receiving the annotated data from the sender of the first claim request,
wherein:
the managing device further comprises a first call instruction sending unit for sending, to the sender of the first claim request, a call instruction for the annotation tool corresponding to each type of data in the data subset, wherein the data exported by the annotation tool corresponding to a type of data is in the target post-annotation data format of that type of data; and/or
the published data annotation subtask description information of the data subset contains the target post-annotation data format of each type of data in the data subset.
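As a rough illustration of how the publishing, sending, and receiving units might cooperate, the following sketch models them as methods of one in-memory class. The class name, method names, and the rule that a subset can be claimed only once are assumptions made for the example; they are not stated in the claim.

```python
class AnnotationManager:
    """Toy in-memory manager; every name here is hypothetical."""

    def __init__(self):
        self.subsets = {}        # subset_id -> list of records
        self.descriptions = {}   # subset_id -> published subtask description
        self.claimed_by = {}     # subset_id -> worker who claimed the subset
        self.annotated = {}      # subset_id -> annotated data received back

    def publish(self, subset_id, subset, description):
        """Publish the annotation subtask description for a subset."""
        self.subsets[subset_id] = subset
        self.descriptions[subset_id] = description

    def claim(self, subset_id, worker):
        """Handle a claim request by sending the subset to its sender."""
        if subset_id in self.claimed_by:
            raise ValueError(f"subset {subset_id} is already claimed")
        self.claimed_by[subset_id] = worker
        return self.subsets[subset_id]

    def submit(self, subset_id, worker, annotated):
        """Receive annotated data from the worker who claimed the subset."""
        if self.claimed_by.get(subset_id) != worker:
            raise PermissionError("only the claimant may submit annotated data")
        self.annotated[subset_id] = annotated


manager = AnnotationManager()
manager.publish("s1", [{"id": 1, "type": "image"}], "Draw a box around every face.")
data = manager.claim("s1", "worker-a")
manager.submit("s1", "worker-a", [{"id": 1, "boxes": [[10, 20, 30, 40]]}])
```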
13. The managing device according to claim 12, wherein the annotation rule acquisition unit (920) is configured to:
receive annotation rules corresponding to the various types of data in the data set; and/or
retrieve the annotation rules corresponding to the various types of data in the data set by referring to a correspondence between data types and annotation rules.
14. The managing device according to claim 12, wherein the data subset division unit (930) is configured to:
divide the data set into data subsets; or
divide the data set into data subsets according to the data types in the data set.
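The two division options can be sketched as follows. The claim does not say how the first option divides the data, so fixed-size chunking is used here purely as an assumption; the `type` field on each record is likewise hypothetical.

```python
from collections import defaultdict


def split_in_chunks(records, subset_size):
    """Divide the data set into consecutive subsets of at most subset_size items."""
    if subset_size <= 0:
        raise ValueError("subset_size must be positive")
    return [records[i:i + subset_size] for i in range(0, len(records), subset_size)]


def split_by_type(records):
    """Divide the data set into one subset per data type."""
    groups = defaultdict(list)
    for record in records:
        groups[record["type"]].append(record)
    return dict(groups)


records = [
    {"id": 1, "type": "image"},
    {"id": 2, "type": "text"},
    {"id": 3, "type": "image"},
]
chunks = split_in_chunks(records, 2)
by_type = split_by_type(records)
```

Dividing by type keeps each subset homogeneous, so a single annotation tool and a single rule can cover the whole subset.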
15. The managing device according to claim 12, wherein the annotation task description information generating unit (940) is configured to:
determine the data types contained in the data subset;
for each data type contained in the data subset, convert the annotation rule corresponding to that type of data into natural language;
integrate the annotation rules, converted into natural language, corresponding to each type of data, to obtain the data annotation subtask description information.
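A minimal sketch of these three steps is given below. The structured rule layout (`action`, `target`), the sentence template, and the rule table are assumptions for illustration only; the claim does not define how rules are represented or converted.

```python
# Hypothetical rule table: data type -> structured annotation rule.
RULES = {
    "image": {"action": "draw a bounding box around", "target": "every face"},
    "text": {"action": "tag", "target": "every place name"},
}


def rule_to_text(data_type, rule):
    """Convert one structured rule into a natural-language sentence."""
    return f"For {data_type} data, {rule['action']} {rule['target']}."


def build_subtask_description(subset, rules):
    """Determine the types in the subset, convert each rule, and integrate them."""
    types = sorted({record["type"] for record in subset})
    return " ".join(rule_to_text(t, rules[t]) for t in types)


subset = [
    {"id": 1, "type": "text"},
    {"id": 2, "type": "image"},
    {"id": 3, "type": "text"},
]
description = build_subtask_description(subset, RULES)
```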
16. The managing device according to claim 12, wherein the managing device further comprises an annotated data storage unit (980), for integrating and storing, for each data subset, the annotated data from the sender of the first claim request.
17. The managing device according to claim 16, wherein the annotated data storage unit (980) is configured to:
integrate and store, in cloud storage, the annotated data of each data subset from the sender of the first claim request.
18. The managing device according to claim 12, wherein the first call instruction sending unit is configured to:
determine the annotation tool corresponding to each type of data in the data subset by referring to a correspondence between data types and annotation tools;
configure the parameters of the determined annotation tool corresponding to each type of data in the data subset according to the obtained annotation rule corresponding to that type of data;
send the call instruction of the annotation tool with the configured parameters to the sender of the first claim request.
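The lookup, configuration, and sending steps might look like the sketch below. The tool names, the type-to-tool table, and the shape of the call instruction are hypothetical, and copying the rule directly into the tool parameters is a simplification made for the example.

```python
# Hypothetical correspondence between data types and annotation tools.
TOOL_BY_TYPE = {"image": "box_annotator", "text": "span_tagger"}

# Hypothetical annotation rules, keyed by data type.
RULES = {
    "image": {"labels": ["face"], "shape": "rectangle"},
    "text": {"labels": ["place"], "overlap": False},
}


def build_call_instructions(subset, rules, tool_by_type):
    """Return one configured call instruction per data type in the subset."""
    instructions = []
    for data_type in sorted({record["type"] for record in subset}):
        instructions.append({
            "tool": tool_by_type[data_type],    # determined from the type
            "data_type": data_type,
            "params": dict(rules[data_type]),   # configured from the rule
        })
    return instructions


subset = [{"id": 1, "type": "text"}, {"id": 2, "type": "image"}]
instructions = build_call_instructions(subset, RULES, TOOL_BY_TYPE)
```

The resulting list is what would be sent to the sender of the first claim request, for example as a serialized message.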
19. The managing device according to claim 12, wherein, in the case where the published data annotation subtask description information of the data subset contains the target post-annotation data format of each type of data in the data subset, the annotation task description information publishing unit (950) is configured to:
determine the target post-annotation data format of each type of data in the data subset by referring to a correspondence between data types and target post-annotation data formats;
place the determined target post-annotation data format of each type of data in the data subset into the data annotation subtask description information of the data subset and publish it.
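As an illustration, the format lookup and its inclusion in the published description could be sketched as follows. The format strings, the table, and the dictionary layout of the published description are assumptions; the claim only requires that the formats be determined from a type-to-format correspondence and placed in the description.

```python
# Hypothetical correspondence between data types and target output formats.
FORMAT_BY_TYPE = {
    "image": "JSON list of [x, y, width, height] boxes",
    "text": "JSON list of [start, end, label] spans",
}


def describe_with_formats(description, subset, format_by_type):
    """Attach the target post-annotation format of each type to the description."""
    types = sorted({record["type"] for record in subset})
    return {
        "description": description,
        "target_formats": {t: format_by_type[t] for t in types},
    }


subset = [{"id": 1, "type": "image"}, {"id": 2, "type": "image"}]
published = describe_with_formats("Draw a box around every face.", subset, FORMAT_BY_TYPE)
```

Only the formats of types actually present in the subset are included, so annotators see no irrelevant instructions.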
20. The managing device according to claim 16, wherein the managing device further comprises:
a verification task description information generating unit (990), for generating data verification subtask description information for a data subset according to the obtained annotation rules corresponding to the various types of data;
a verifier information acquisition unit (9100), for obtaining verifier information corresponding to the data subset;
a first verification task sending unit (9110), for sending the verification subtask description information and the annotated data of the data subset according to the verifier information;
a first verified data receiving unit (9120), for receiving the verified annotated data of the data subset;
a verified data storage unit (9130), for integrating and storing the verified annotated data for each data subset.
21. The managing device according to claim 16, wherein the managing device further comprises:
a verification task description information generating unit (990), for generating data verification subtask description information for a data subset according to the obtained annotation rules corresponding to the various types of data;
a verification task description information publishing unit (9140), for publishing the data verification subtask description information of the data subset;
a second verification task sending unit (9150), for sending, in response to receiving a second claim request for the data subset, the annotated data of the data subset to the sender of the second claim request;
a second verified data receiving unit (9160), for receiving the verified annotated data of the data subset from the sender of the second claim request;
a verified data storage unit (9130), for integrating and storing the verified annotated data for each data subset,
wherein:
the managing device further comprises a second call instruction sending unit for sending, to the sender of the second claim request, a call instruction for the annotation tool corresponding to each type of data in the data subset, wherein the data exported by the annotation tool corresponding to a type of data is in the target post-annotation data format of that type of data; and/or
the published data annotation subtask description information of the data subset contains the target post-annotation data format of each type of data in the data subset.
22. The managing device according to claim 20 or 21, wherein the verified data storage unit (9130) is configured to:
integrate and store the verified annotated data of each data subset in cloud storage.
CN201510130022.4A 2015-03-25 2015-03-25 A kind of management method and device of data mark Active CN106156025B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201510130022.4A CN106156025B (en) 2015-03-25 2015-03-25 A kind of management method and device of data mark
PCT/CN2016/076570 WO2016150328A1 (en) 2015-03-25 2016-03-17 Data annotation management method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510130022.4A CN106156025B (en) 2015-03-25 2015-03-25 A kind of management method and device of data mark

Publications (2)

Publication Number Publication Date
CN106156025A CN106156025A (en) 2016-11-23
CN106156025B true CN106156025B (en) 2019-07-23

Family

ID=56976919

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510130022.4A Active CN106156025B (en) 2015-03-25 2015-03-25 A kind of management method and device of data mark

Country Status (2)

Country Link
CN (1) CN106156025B (en)
WO (1) WO2016150328A1 (en)

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107368565A (en) * 2017-07-10 2017-11-21 美的集团股份有限公司 Data processing method, data processing equipment and computer-readable recording medium
CN107729378A (en) * 2017-07-13 2018-02-23 华中科技大学 A kind of data mask method
CN107705034B (en) * 2017-10-26 2021-06-29 医渡云(北京)技术有限公司 Crowdsourcing platform implementation method and device, storage medium and electronic equipment
CN108108390B (en) * 2017-11-15 2019-02-19 北京达佳互联信息技术有限公司 Data distributing method and device
CN108182448B (en) * 2017-12-22 2020-08-21 北京中关村科金技术有限公司 Selection method of marking strategy and related device
US20210019656A1 (en) * 2018-03-29 2021-01-21 Sony Corporation Information processing device, information processing method, and computer program
CN110400029A (en) * 2018-04-24 2019-11-01 北京京东尚科信息技术有限公司 A kind of method and system of mark management
CN108809980A (en) * 2018-06-11 2018-11-13 厦门华厦学院 A kind of educational data processing server system
CN108829435A (en) * 2018-06-19 2018-11-16 数据堂(北京)科技股份有限公司 A kind of image labeling method and general image annotation tool
CN109408788A (en) * 2018-09-26 2019-03-01 南京大学 A kind of text marking method towards judgement document
CN109492698B (en) * 2018-11-20 2022-11-18 腾讯科技(深圳)有限公司 Model training method, object detection method and related device
CN109710933A (en) * 2018-12-25 2019-05-03 广州天鹏计算机科技有限公司 Acquisition methods, device, computer equipment and the storage medium of training corpus
CN110443294A (en) * 2019-07-25 2019-11-12 丰图科技(深圳)有限公司 Video labeling method, device, server, user terminal and storage medium
CN110674355B (en) * 2019-09-25 2022-07-01 上海依图信息技术有限公司 DSL application system for describing data labeling task and method thereof
CN110851630A (en) * 2019-10-14 2020-02-28 武汉市慧润天成信息科技有限公司 Management system and method for deep learning labeled samples
CN112699906B (en) * 2019-10-22 2023-09-22 杭州海康威视数字技术股份有限公司 Method, device and storage medium for acquiring training data
CN112749308A (en) * 2019-10-31 2021-05-04 北京国双科技有限公司 Data labeling method and device and electronic equipment
CN111309995A (en) * 2020-01-19 2020-06-19 北京市商汤科技开发有限公司 Labeling method and device, electronic equipment and storage medium
CN111353059A (en) * 2020-03-02 2020-06-30 腾讯科技(深圳)有限公司 Picture processing method and device, computer-readable storage medium and electronic device
CN111400581B (en) * 2020-03-13 2024-02-06 京东科技控股股份有限公司 System, method and apparatus for labeling samples
CN111881106B (en) * 2020-07-30 2024-03-29 北京智能工场科技有限公司 Data labeling and processing method based on AI (advanced technology attachment) test
CN112968941B (en) * 2021-02-01 2022-07-08 中科视拓(南京)科技有限公司 Data acquisition and man-machine collaborative annotation method based on edge calculation
CN113312131B (en) * 2021-06-11 2023-04-18 北京百度网讯科技有限公司 Method and device for generating and operating marking tool
CN113407083A (en) * 2021-06-24 2021-09-17 上海商汤科技开发有限公司 Data labeling method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101201842A (en) * 2007-10-30 2008-06-18 北京航空航天大学 Digital museum gridding and construction method thereof
CN102843364A (en) * 2012-08-10 2012-12-26 北京鹏泰互动广告有限公司 Method and device for sending, processing and providing site verifying data
CN103824045A (en) * 2012-11-16 2014-05-28 中兴通讯股份有限公司 Face recognition and tracking method and face recognition and tracking system
CN103914334A (en) * 2012-12-31 2014-07-09 北京百度网讯科技有限公司 Map labeling method and system
CN104050238A (en) * 2014-05-23 2014-09-17 北京中交兴路信息科技有限公司 Map labeling method and map labeling device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100437582C (en) * 2006-10-17 2008-11-26 浙江大学 Image content semanteme marking method
CN101477798B (en) * 2009-02-17 2011-01-05 北京邮电大学 Method for analyzing and extracting audio data of set scene
CN101620615B (en) * 2009-08-04 2011-12-28 西南交通大学 Automatic image annotation and translation method based on decision tree learning
US20120084323A1 (en) * 2010-10-02 2012-04-05 Microsoft Corporation Geographic text search using image-mined data
CN103136360B (en) * 2013-03-07 2016-09-07 北京宽连十方数字技术有限公司 A kind of internet behavior markup engine and to should the behavior mask method of engine

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
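Research on key technologies of positioning and map annotation based on GSM and Google_Map; Yang Fan; Journal of Shaanxi University of Science & Technology; 20110430; see pages 123-125 of the main text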

Also Published As

Publication number Publication date
WO2016150328A1 (en) 2016-09-29
CN106156025A (en) 2016-11-23

Similar Documents

Publication Publication Date Title
CN106156025B (en) A kind of management method and device of data mark
US11652628B2 (en) Deterministic verification of digital identity documents
US8881244B2 (en) Authorizing computing resource access based on calendar events in a networked computing environment
US20160044132A1 (en) Systems and Methods for RWD App Store Based Collaborative Enterprise Information Management
EP3032442B1 (en) Modeling and simulation of infrastructure architecture for big data
JP2021036433A (en) Terminal, control method of terminal, and program
US10996914B2 (en) Persistent geo-located augmented reality social network system and method
US20190312834A1 (en) Dynamic hashtag ordering based on projected interest
Kshetri et al. Big data and cloud computing for development: Lessons from key industries and economies in the global south
US8788248B2 (en) Transparent flow model simulation implementing bi-directional links
US20140325077A1 (en) Command management in a networked computing environment
CN106233236A (en) Process and the Consumer's Experience of clipping image
CN104572601B (en) Via the documentation revisions of social media
US11442980B2 (en) System and method for photo scene searching
US9342527B2 (en) Sharing electronic file metadata in a networked computing environment
US20130304747A1 (en) Characteristic-based selection in a networked computing environment
Shor Cloud computing for learning and performance professionals
US20160266900A1 (en) Information processing apparatus, work flow creation method, and storage medium
Kwanya Big data in land records management in Kenya: A fit and viability analysis
CN110163564A (en) Method, system and the storage medium of item service are generated based on item model
US10679395B2 (en) Spatial and hierarchical placement of images at runtime
CN111105111B (en) Reading management method, device, system and storage medium
Stanley et al. Formulating" the obvious" as a task request to the crowd: an interactive design experience across cultural and geographical boundaries
IGP et al. Integrated ECM solutions: where records managers, knowledge workers converge
EP3971805A1 (en) Generating workflow, report, interface, conversion, enhancement, and forms (wricef) objects for enterprise software

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 1230305; Country of ref document: HK)

GR01 Patent grant