US11551134B2 - Information processing apparatus, information processing method, and storage medium - Google Patents
- Publication number
- US11551134B2 (application US15/845,922)
- Authority
- US
- United States
- Prior art keywords
- label
- learning data
- labeling
- data
- information processing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9017—Indexing; Data structures therefor; Storage structures using directory or table look-up
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/242—Dictionaries
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
- the present disclosure relates to a labeling technique for applying a label to data.
- Techniques for constructing prediction models from data groups and calculating predictions for query data pieces by supervised machine learning are used in various fields such as object recognition.
- Supervised learning requires an operation of labeling learning data pieces in advance, i.e., applying to each piece the label expected as the output result for the corresponding query data.
- When the labeling is performed correctly, the accuracy of the prediction model is improved.
- Japanese Patent Application Laid-Open No. 2016-62544 describes a method for automatically performing labeling by calculating a ranking of distances in a plurality of characteristics among learning data pieces.
- The present disclosure is directed to appropriately determining a label for learning data even when a false label has been applied to it.
- An information processing apparatus includes an obtaining unit configured to obtain information on a plurality of labels applied to the learning data by a plurality of users, information regarding the reliability of each applied label itself, and information regarding the reliability of the user who applied the relevant label, wherein the information on a label is information regarding the result to be recognized in a case where a predetermined recognition is performed on the learning data; and a determination unit configured to determine a label for the learning data from among the plurality of labels based on the reliability of each label itself and the reliability of the user who applied the relevant label.
- FIG. 1 is a block diagram illustrating an example of a configuration of a labeling system according to one or more aspects of the present disclosure.
- FIG. 2 illustrates an example of correspondence relationships among pieces of data, labels, and attribute information to be managed according to one or more aspects of the present disclosure.
- FIG. 3 illustrates examples of monotonically increasing functions f(x) according to one or more aspects of the present disclosure.
- FIG. 4 illustrates examples of label determination of an area designation type according to one or more aspects of the present disclosure.
- FIG. 5 illustrates an example of processing on bounding boxes according to one or more aspects of the present disclosure.
- FIG. 6 is a flowchart illustrating processing for labeling.
- FIG. 7 is a flowchart illustrating processing for determining a label according to one or more aspects of the present disclosure.
- FIG. 8 is a block diagram illustrating an example of a configuration of a labeling confirmation system according to one or more aspects of the present disclosure.
- FIG. 9 illustrates an example of display for editing a symbol designation type label on a touch panel type display according to one or more aspects of the present disclosure.
- FIG. 10 illustrates an example of display for editing an area designation type label on a touch panel type display according to one or more aspects of the present disclosure.
- FIG. 11 is a flowchart illustrating a typical flow for applying and confirming a label using the labeling confirmation system according to one or more aspects of the present disclosure.
- FIG. 12 is a block diagram illustrating an example of a configuration of a labeling system based on crowdsourcing according to one or more aspects of the present disclosure.
- FIG. 13 is a flowchart illustrating a flow of labeling based on crowdsourcing according to one or more aspects of the present disclosure.
- FIG. 14 is a block diagram illustrating an example of a configuration of a learning and recognition system based on crowdsourcing according to one or more aspects of the present disclosure.
- FIG. 15 is a flowchart illustrating a flow of learning and recognition based on crowdsourcing according to one or more aspects of the present disclosure.
- FIG. 16 is a block diagram illustrating an example of a configuration of a labeler evaluation system according to one or more aspects of the present disclosure.
- FIG. 17 is a flowchart illustrating a flow for evaluating a labeler using the labeler evaluation system according to one or more aspects of the present disclosure.
- FIG. 18 illustrates an example of a hardware configuration of an information processing apparatus according to one or more aspects of the present disclosure.
- A central processing unit (CPU) 1810 comprehensively controls each device connected via a bus 1800.
- The CPU 1810 reads out and executes processing steps and programs stored in a read-only memory (ROM) 1820.
- Each processing program and device driver, including the operating system (OS) according to the present exemplary embodiment, stored in the ROM 1820 is temporarily loaded into a random access memory (RAM) 1830 and executed by the CPU 1810 as appropriate.
- An input interface (I/F) 1840 inputs an input signal from an external apparatus (an image capturing apparatus or the like) in a format which can be processed by the information processing apparatus.
- An output I/F 1850 outputs a processing result of the information processing apparatus according to the present disclosure as an output signal to the external apparatus, in a format which can be processed by the external apparatus.
- An information processing apparatus 10000 stores the result of a person or an algorithm (i.e., a labeler) labeling each piece of learning data, in association with the reliability of the labeler and the confidence degree of each labeling, as attribute information. Further, the information processing apparatus 10000 determines a likely label based on the labels applied to a target learning image and the reliability of each label. Hereinafter, a person or algorithm that performs labeling is referred to as a labeler.
- FIG. 1 is a block diagram illustrating an example of a configuration of an information processing apparatus (a labeling apparatus) 10000 according to the first exemplary embodiment.
- A learning data storage unit 100 stores a collection of learning data pieces.
- Learning data is, for example, an image, and each image includes an object to be a labeling target.
- An object is, for example, a person, a dog, a cat, a car, and a building.
- the learning data is used to generate a classifier.
- A classifier for identifying an object in image data including an unknown object can be generated by performing machine learning using the learning data.
- an image is described as an example of learning data, and thus an image is sometimes referred to as a learning image.
- A label and attribute information storage unit 101 stores an assembly of labeled results by labelers for each of the learning data pieces, together with the attribute information accompanying the results.
- The labeled result is used as a teaching signal for the machine learning and is, for example, the label "cat" applied to an image of a cat by the labeler.
- the attribute information includes label attribute information corresponding to the applied label, labeler attribute information corresponding to the labeler who applies the label, and data attribute information as attribute of the learning data.
- The label attribute information includes an identification (ID) of the labeler, the date and time when the labeler performed the labeling, and a confidence degree indicating how confidently the labeler could perform the labeling correctly.
- When the labeler is a person who is sure of the label, a high value is applied as the confidence degree; when the person wonders whether the object is, for example, a cat or a dog, a low value is applied as the confidence degree.
- The value of the confidence degree is, for example, a real number from zero to one; however, the value is not limited to this example.
- When the labeler is an algorithm (a classifier), a likelihood representing, for example, a cat likeness may be used as the confidence degree.
- The labeler attribute information is the reliability of the labeler.
- The reliability is an expected value of how likely a label applied by the labeler is to be correct. For example, the reliability of a labeler who always labels any image correctly is high; in contrast, the reliability of a labeler who has a high rate of false labeling is low.
- The data attribute information is information inherent in the learning data, for example, the date and time when the image data was captured, an image capturing parameter, the labeling method described below, the determined label, and a certainty degree.
- a label and attribute information management unit 110 manages and stores a correspondence relationship between learning data and label and attribute information.
- Data pieces are stored as a table in a memory not illustrated.
- FIG. 2 illustrates an example of correspondence relationships among pieces of data, labels, and attribute information to be managed.
- Each learning data piece in the learning data group is associated with its corresponding data attribute information and a plurality of labels.
- the data attribute information includes a date and time when the data is obtained, parameter information of a camera used for capturing an image, and a type of the labeling method.
- For example, an image as learning data 1 is associated with a plurality of labels, namely, "cat" as a label 1a, "dog" as a label 1b, and "cat" as a label 1c.
- Each label is imparted with the corresponding label attribute information described above.
- The learning data is associated with a plurality of labels, and a label is determined by a majority-voting-like method described below. Accordingly, the effects of an accidental labeling error and of a labeling error by a malicious labeler can be reduced. Every time a label applied to the learning data and its attribute information are received, the label and attribute information management unit 110 associates the learning data with the label and attribute information in the above-described form.
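The correspondence managed in this way can be pictured as a simple table mapping each learning data ID to its data attribute information and its list of applied labels. The following sketch shows one plausible in-memory form; all field names and values are hypothetical illustrations, not taken from the patent:

```python
# Hypothetical in-memory form of the FIG. 2 correspondence table:
# each learning data ID maps to its data attribute information and to the
# list of applied labels, each carrying its own label attribute information.
table = {
    "data_1": {
        "data_attributes": {"captured": "2016-04-01", "camera": "f/2.8"},
        "labels": [
            {"label": "cat", "labeler_id": "A", "confidence": 0.9},
            {"label": "dog", "labeler_id": "B", "confidence": 0.4},
            {"label": "cat", "labeler_id": "C", "confidence": 0.8},
        ],
    },
}

# All labels applied to learning data 1 (cf. labels 1a, 1b, 1c in FIG. 2)
labels_for_data_1 = [entry["label"] for entry in table["data_1"]["labels"]]
print(labels_for_data_1)  # ['cat', 'dog', 'cat']
```

New label/attribute pairs received by the management unit would simply be appended to the `labels` list of the corresponding data ID.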
- a request reception unit 111 makes an inquiry to the label and attribute information management unit 110 and transmits the label and the attribute information corresponding to the designated learning data (target learning data) to a label determination unit 112 .
- The designated learning data may be a single piece or a group of a plurality of data pieces.
- the label determination unit 112 determines a likely label based on the label and the attribute information corresponding to the learning data received from the label and attribute information management unit 110 .
- the method for determining the label is described in detail below.
- A function f(x) monotonically increases with respect to x and can be any of various functions, as illustrated in FIG. 3.
- The conditions can also be varied; for example, no score may be given to a label whose labeler's reliability is at or below a certain value, or a label whose labeling confidence degree is extremely high may be weighted heavily.
- The score S(j) is calculated for each label candidate l_j, and the label l_j for which the score S(j) takes the maximum value may be regarded as the determined label.
- Since the maximum value of the score S(j) is selected, a label applied by labelers with higher reliability, with a higher confidence degree, and agreeing with more of the other labels is selected as the determined label.
- The effect of a label which has a low confidence degree and is highly likely to be false is thereby reduced, and if a malicious labeler applies false labels, the reliability of that labeler becomes lower, so that the negative effect on label determination can also be reduced.
- The confidence degree C(i) and the reliability R(A(i)) in the formula (1) can be obtained from the attribute information; alternatively, the confidence degree may be applied using the above-described method according to the present exemplary embodiment.
- As for deriving the reliability, for example, the rate at which a labeler has applied the same label as the finally determined label in the past may be calculated for each labeler and regarded as the reliability; alternatively, the labeler may be asked to label a plurality of data pieces whose correct labels are determined in advance, and the rate of correct answers may be regarded as the reliability.
- The calculation may also be performed by regarding either or both of the confidence degree and the reliability as constant values.
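The scoring described above can be sketched in a few lines. The exact form of formula (1) is not reproduced in this excerpt, so this sketch assumes a plausible form, S(j) = Σ f(C(i)·R(A(i))) summed over the labelers i who applied candidate l_j; the reliability cutoff is an example of the variable conditions mentioned above, and the default f is the identity:

```python
from collections import defaultdict

def determine_label(applied_labels, f=lambda x: x, min_reliability=0.1):
    """Score each candidate label and return the highest-scoring one.

    applied_labels: list of (label, confidence C(i), labeler reliability R(A(i))).
    f: a monotonically increasing function, as illustrated in FIG. 3.
    min_reliability: assumed cutoff; labels from labelers at or below it get no score.
    """
    scores = defaultdict(float)
    for label, confidence, reliability in applied_labels:
        if reliability <= min_reliability:
            continue  # ignore labels from very unreliable labelers
        scores[label] += f(confidence * reliability)
    # the candidate l_j with the maximum score S(j) becomes the determined label
    return max(scores, key=scores.get)

applied = [("cat", 0.9, 0.8), ("dog", 0.4, 0.3), ("cat", 0.7, 0.9)]
print(determine_label(applied))  # prints cat
```

Here "cat" wins because two labelers with high reliability and confidence applied it, even though a third labeler applied "dog".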
- So far, a case of "symbol designation type" labeling has been described, in which some sort of name or a good/bad judgment is designated as a label.
- However, the scope of application of the present disclosure is not limited to this case.
- The present disclosure can also be applied to "area designation type" labeling, such as labeling that encloses a face area of an object captured in an image with a bounding box, or labeling that fills a road area captured in an image with color.
- In this case, a label can be determined using the following method.
- True or False is designated independently for each pixel of a certain learning image, with each pixel treated as independent learning data (pixel learning data).
- For example, labeling may be performed in such a manner that a pixel in the area in which a person is captured is designated as True, and a pixel in any other area is designated as False.
- As methods for designating an area, there are, for example, a bounding box that designates a rectangular area enclosing the target person on the image, and a method of filling the pixels of the target person area with color.
- FIG. 4 illustrates examples of label determination of the area designation type.
- FIG. 4 illustrates labeled results 10 to 17 in which specific areas are filled with color in images.
- The pixel label is determined for each pixel with respect to the plurality of labels. For example, focusing on the third pixel from the left on the second line from the top, there are seven results (10, 11, 12, 14, 15, 16, and 17) in which the relevant pixel is filled with color (black) and one result (13) in which the relevant pixel is not filled with color (white) in FIG. 4.
- By determining each pixel label by majority in this way, a result 18 illustrated in FIG. 4 is obtained, and the result 18 can be used as the determined label of the image.
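The per-pixel determination can be sketched as a (optionally weighted) majority vote over the labelers' masks. The optional weights are an assumption, meant to stand in for per-label scores such as those of formula (1); the unweighted default matches the plain majority described above:

```python
def determine_pixel_labels(masks, weights=None):
    """Per-pixel majority vote over several area-designation labels.

    masks: list of equal-sized 2-D lists of 0/1 (1 = pixel filled / inside area).
    weights: optional per-labeler weights (hypothetical, e.g. scores from a
             formula like (1)); defaults to an unweighted vote.
    """
    if weights is None:
        weights = [1.0] * len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    total = sum(weights)
    result = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # weight of labelers who filled this pixel
            filled = sum(w for m, w in zip(masks, weights) if m[r][c])
            result[r][c] = 1 if filled > total / 2 else 0
    return result

# eight labelers, 2x2 masks: top-left pixel filled by seven of the eight
masks = [[[1, 0], [0, 0]]] * 7 + [[[0, 0], [0, 0]]]
print(determine_pixel_labels(masks))  # [[1, 0], [0, 0]]
```

Applied to the eight labeled results 10 to 17 of FIG. 4, this kind of vote yields the determined mask corresponding to result 18.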
- In the case of the bounding box, the determined pixel label representing the inside of the bounding box may not have a rectangular shape, as in a result 20 in FIG. 5. In such a case, the bounding box may be set by approximating the area with a rectangular frame, as shown in a result 21 in FIG. 5.
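One simple way to realize this approximation, offered here as an assumed sketch rather than the patent's own method, is to take the smallest axis-aligned rectangle enclosing all pixels determined to be inside the area:

```python
def approximate_bounding_box(mask):
    """Approximate a non-rectangular determined pixel label (cf. result 20)
    with an enclosing rectangular frame (cf. result 21).

    mask: 2-D list of 0/1; returns (top, left, bottom, right), inclusive.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return (min(rows), min(cols), max(rows), max(cols))

# a cross-shaped determined label, approximated by its enclosing rectangle
mask = [[0, 1, 0],
        [1, 1, 1],
        [0, 1, 0]]
print(approximate_bounding_box(mask))  # (0, 0, 2, 2)
```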
- As another labeling type, "numerical value designation type" labeling can be considered, in which a numeric array, such as a matrix or a vector representing an image, or the coordinate values and orientation of a target object in a three-dimensional space, is designated as a label.
- In this case, the numerical value of the label can be determined as a weighted average using the score S(j) in the formula (1) as the weight of each label.
- Without being limited to the weighted average, the value may be determined by, for example, performing robust estimation such as M-estimation, which is a known technique, in consideration of outliers, or by using random sample consensus (RANSAC), which is a known technique, assuming some model.
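The weighted-average case is straightforward; the example values below are hypothetical, with one outlier down-weighted by a low score:

```python
def determine_numeric_label(values, scores):
    """Weighted average of numerical labels, using the scores S(j) as weights."""
    total = sum(scores)
    return sum(v * s for v, s in zip(values, scores)) / total

# three labelers estimated an object's x-coordinate; the outlier 30.0
# comes from a low-score label and so barely moves the result
print(determine_numeric_label([10.0, 10.4, 30.0], [0.9, 0.8, 0.1]))  # ~11.29
```

Robust alternatives such as M-estimation or RANSAC would replace this averaging step when outliers are expected to be frequent.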
- FIG. 6 is a flowchart illustrating labeling processing.
- In step S1001, the label and attribute information management unit 110 obtains the result of a labeler applying a label, together with attribute information, for the learning data and stores the result in the label and attribute information storage unit 101.
- The attribute information may be directly input by the labeler, or the date and time of labeling and the ID of the labeler in the label attribute information may be automatically extracted by the information processing apparatus performing the labeling.
- the label and attribute information management unit 110 manages the learning data in association with the label and attribute information in a format as described above in FIG. 2 .
- The label and attribute information management unit 110 may associate the learning data with each label one at a time, or may associate a plurality of learning data pieces with a plurality of labels at once, regardless of the number of learning data pieces and the number of labels.
- In step S1003, the label and attribute information management unit 110 checks whether labeling is completed. When the labeling is completed (YES in step S1003), processing for determining the label is ready to be performed. When the labeling is not completed (NO in step S1003), the processing returns to step S1001.
- The criterion for determining whether the labeling is completed is whether sufficient labels for determining the label have been applied to the learning data. For example, when x or more labels are applied to each learning data piece, it may be determined that labeling is completed. However, the criterion is not limited to the above-described one, and it may be determined that the labeling is completed when a certainty degree exceeds a threshold value, as described below in a third exemplary embodiment.
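The "x or more labels per data piece" criterion can be sketched as follows; the threshold value and the data layout are assumed for illustration:

```python
def labeling_completed(labels_per_data, x=3):
    """Return True when every learning data piece has at least x labels.

    labels_per_data: dict mapping a data ID to the list of labels applied so far.
    x: assumed example threshold for "sufficient labels".
    """
    return all(len(labels) >= x for labels in labels_per_data.values())

status = {"img_001": ["cat", "cat", "dog"], "img_002": ["dog", "dog"]}
print(labeling_completed(status))       # False: img_002 has only two labels
print(labeling_completed(status, x=2))  # True
```

A certainty-degree threshold, as mentioned for the third exemplary embodiment, would replace the simple count in the same predicate.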
- Each function unit described herein is implemented by the CPU 1810 loading a program stored in the ROM 1820 into the RAM 1830 and executing processing according to each flowchart described below. Further, when hardware is constituted as an alternative to software processing using the CPU 1810, arithmetic units and circuits may be constituted so as to correspond to the processing of each function unit described herein.
- FIG. 7 is a flowchart illustrating a processing flow by the information processing apparatus 10000 according to the present exemplary embodiment.
- The request reception unit 111 receives a request from a user or a system.
- a content of the request is, for example, “return a label corresponding to certain learning data”.
- the request reception unit 111 transmits an ID of the designated learning data to the label and attribute information management unit 110 and requests the label thereof.
- The learning data to be requested is not limited to one piece and may be a plurality of pieces; in this case, a plurality of learning data IDs is transmitted to the label and attribute information management unit 110 correspondingly.
- In step S1102, the label and attribute information management unit 110 determines the label and attribute information corresponding to the learning data ID received from the request reception unit 111 and transmits them to the label determination unit 112.
- In step S1103, the label determination unit 112 performs scoring on the label candidates based on the formula (1) and determines a label.
- In step S1104, the label determination unit 112 returns the determined label to the user or the system which made the request.
- Alternatively, the label and attribute information management unit 110 can transmit the label and attribute information corresponding to the learning data to the label determination unit 112 in advance, and the label determination unit 112 can determine the label and store the determined label in the data attribute information before receiving a request.
- In this case, the label and attribute information management unit 110 may transmit the determined label stored in the data attribute information to the label determination unit 112, and the label determination unit 112 may transmit the determined label as it is to the user or the system which made the request, without performing calculation processing.
- As described above, the label and attribute information management unit manages a plurality of labels and various attribute information pieces for each learning data piece, and the label determination unit appropriately determines a label from these information pieces, so that even if some of the labels include errors, a correct label can be determined while suppressing the effect of the errors.
- In a case where the labeler is a user, an information processing apparatus 20000 presents, when the user performs labeling, labels applied by other users and the similarity degree between those labels and the label the user currently intends to apply, enabling the user to perform labeling efficiently.
- When a label applied by another user already exists for certain learning data, the existing label is compared with the label and attribute information being applied, to facilitate labeling.
- FIG. 8 is a functional block diagram illustrating a configuration of a labeling apparatus 20000 according to the second exemplary embodiment.
- The configuration of the labeling apparatus 20000 according to the second exemplary embodiment includes portions in common with the configuration according to the first exemplary embodiment illustrated in FIG. 1, so only a request reception unit 211, a label comparison unit 113, and a display control unit 114, which differ from the first exemplary embodiment, are described.
- The information processing apparatus according to the second exemplary embodiment is connected to a display apparatus 30 in a wired or wireless manner.
- The display apparatus may adopt any display format; the present exemplary embodiment is described using a touch panel type display.
- An organic electroluminescence (EL) display, a cathode ray tube (CRT) display, or a projector which displays an image by projecting it on a wall or the like may also be used as the display apparatus.
- a keyboard and a mouse may be used as an input device to the display apparatus instead of the touch panel.
- In addition to its function according to the first exemplary embodiment, the request reception unit 211 receives a request regarding a label to be referred to from a user and transmits the request to the label and attribute information management unit 110.
- the label and attribute information management unit 110 transmits information pieces such as the label and the attribute information to the label comparison unit 113 and the display control unit 114 upon receiving an order from the request reception unit 211 .
- The label comparison unit 113 receives two related labels from the label and attribute information management unit 110, calculates the similarity degree between the labels, and transmits the label similarity degree to the display control unit 114.
- The display control unit 114 receives the label and attribute information from the label and attribute information management unit 110, receives the label similarity degree from the label comparison unit 113, and causes the display apparatus 30 to display content based on these information pieces.
- The label and attribute information management unit 110 extracts the labels of the designated learning data and transmits them to the display control unit 114.
- a content to be displayed on the display apparatus 30 by the display control unit 114 is described in detail below.
- The label comparison unit 113 calculates the similarity degree between two labels (a label A and a label B). First, the calculation of the similarity degree in symbol designation type labeling is described. When the label A is "cat" and the label B is also "cat", the two labels are the same, and thus the similarity degree is 1.0. When the label A is "cat" and the label B is "dog", the labels are different, and thus the similarity degree is 0. Further, for example, when a plurality of labels enumerating the objects captured in an image is applied, the similarity degree is calculated as (the number of labels common to the label A and the label B)/(the number of labels included in either the label A or the label B).
- The calculation method of the similarity degree is not limited to the above-described formula; any method can be adopted as long as the similarity degree between two labels takes a higher value the more similar the labels are and a lower value the more different they are.
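For symbol designation type labels, the rules above (exact match for single labels, and the common-labels ratio for enumerated label sets) can be sketched with a single set-based function, since a one-element set reduces to the 1.0/0.0 case:

```python
def symbol_similarity(labels_a, labels_b):
    """Similarity degree for symbol designation type labels.

    labels_a, labels_b: sets of label strings. Single labels compare as
    1.0 (same) or 0.0 (different); enumerated label sets use
    (labels common to A and B) / (labels in either A or B).
    """
    set_a, set_b = set(labels_a), set(labels_b)
    return len(set_a & set_b) / len(set_a | set_b)

print(symbol_similarity({"cat"}, {"cat"}))         # 1.0
print(symbol_similarity({"cat"}, {"dog"}))         # 0.0
print(symbol_similarity({"cat", "dog"}, {"cat"}))  # 0.5
```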
- The label comparison unit 113 calculates a similarity degree and a coincidence degree between labels, and the user can perform labeling while considering whether the label being edited is a likely label, by referring to the similarity degree and the coincidence degree on a user interface (UI) described below.
- In this way, the label comparison unit 113 can calculate not only a similarity degree but also a coincidence degree of the label.
- The coincidence degree is an index indicating how well the label applied by the user coincides with the plurality of labels applied by other labelers. More specifically, for certain learning data, the coincidence degree can be obtained as (the total sum of the similarity degrees between the label applied by the user and each label applied by the other labelers)/(the number of labels applied by the other labelers).
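That definition is an average of pairwise similarity degrees, which can be sketched directly; the similarity function is passed in so the same code covers the symbol, area, and numerical types:

```python
def coincidence_degree(user_label, other_labels, similarity):
    """Average similarity between the user's label and each other labeler's
    label on the same learning data; 0.0 if no other labels exist."""
    if not other_labels:
        return 0.0
    total = sum(similarity(user_label, other) for other in other_labels)
    return total / len(other_labels)

# exact-match similarity for simple symbol labels
sim = lambda a, b: 1.0 if a == b else 0.0
print(coincidence_degree("cat", ["cat", "cat", "dog"], sim))  # 2/3
```

A coincidence degree well below 1.0 signals that the label being edited disagrees with most other labelers, which is exactly the mistake/malfunction cue referred to below.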
- As the labels to be compared against, all labels associated with the learning data may be selected.
- The selecting method is not limited to the above-described one; a label group excluding labels from labelers whose reliability is low or labels whose confidence degree is low may be used, and the user may arbitrarily select a label group.
- The coincidence degree can be referred to by the display control unit 114 in the same way as the similarity degree. Because the coincidence degree is calculated and can be referred to, errors in labeling caused by mistakes and malfunctions can be reduced.
- In the area designation type, the similarity degree is obtained as a ratio, with respect to the number of pixels, based on whether the pixel label is the same or not for each pixel.
- A classification "w" represents that a certain pixel is within the bounding box or filled with color, and a classification "b" represents that a certain pixel is outside the bounding box or not filled with color.
- Nwb represents the number of pixels classified as "w" in the label A and as "b" in the label B; the variables Nww, Nbb, and Nbw are defined in a similar manner.
- The similarity degree Re between the label A and the label B is then expressed by a formula based on these pixel counts.
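The formula for Re is not reproduced in this excerpt. A natural reading of the pixel counts defined above is the fraction of pixels on which the two labels agree, i.e. (Nww + Nbb) / (Nww + Nbb + Nwb + Nbw); the sketch below assumes that form:

```python
def area_similarity(mask_a, mask_b):
    """Pixel-agreement similarity between two area designation type labels.

    mask_a, mask_b: equal-sized 2-D lists of 0/1 (1 = "w", 0 = "b").
    Assumed form: (Nww + Nbb) / (Nww + Nbb + Nwb + Nbw).
    """
    counts = {"ww": 0, "wb": 0, "bw": 0, "bb": 0}
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            key = ("w" if a else "b") + ("w" if b else "b")
            counts[key] += 1
    agree = counts["ww"] + counts["bb"]
    return agree / sum(counts.values())

a = [[1, 1], [0, 0]]
b = [[1, 0], [0, 0]]
print(area_similarity(a, b))  # 0.75: three of the four pixels agree
```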
- For the numerical value designation type, the difference d between the values of the label A and the label B is calculated, and a monotonically decreasing function g, opposite in character to the function f in FIG. 3, is considered, which approaches 1.0 when the difference d is zero and approaches 0.0 as the difference d becomes larger; the similarity degree may then be defined as g(d).
- The value d is defined here as the difference between the values of the label A and the label B; however, the present exemplary embodiment is not limited to this, and, for example, an L1 norm or an L2 norm may be used as the value d in the case where the label A and the label B are vectors or matrices.
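A minimal sketch of g(d), assuming exp(-d) as one possible monotonically decreasing choice (the patent does not fix a particular g) and the L2 norm for vector labels:

```python
import math

def numeric_similarity(value_a, value_b, g=lambda d: math.exp(-d)):
    """Similarity for numerical value designation type labels.

    d is the difference between the labels (L2 norm for vectors), and g is
    a monotonically decreasing function with g(0) = 1.0 that approaches 0.0
    as d grows; exp(-d) is an assumed example choice.
    """
    if isinstance(value_a, (int, float)):
        d = abs(value_a - value_b)
    else:  # vectors: L2 norm of the element-wise difference
        d = math.sqrt(sum((x - y) ** 2 for x, y in zip(value_a, value_b)))
    return g(d)

print(numeric_similarity(3.0, 3.0))                # 1.0 (identical values)
print(numeric_similarity([0.0, 0.0], [3.0, 4.0]))  # exp(-5), close to 0
```

Any other decreasing g, e.g. 1/(1+d), would serve equally well as long as it maps d = 0 to 1.0.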
- As for the coincidence degrees of the area designation type and the numerical value designation type, descriptions are omitted since they can be calculated by the method described for the symbol designation type after the respective similarity degrees are calculated.
- The display control unit 114 receives the necessary information from the label and attribute information management unit 110 and the label comparison unit 113 and performs display on the display apparatus 30, which is a touch panel type display. The external appearance and functions of the display apparatus 30 are described with reference to FIG. 9.
- FIG. 9 is an example of display for editing the symbol designation type label.
- The display apparatus 30 is a touch panel type display and includes a touch panel liquid crystal display 31, labeling target image data 32, a label input portion 33 for the target image data, a label list 34 of the target image data, a thumbnail image data display area 35, a display condition setting portion 36, a similarity degree display portion 37, a labeling type switching button 38, a temporary storing button 39, an image switching button 40, a backward and forward button 41, a label transfer button 42, a confirmation button 43, a label determination button 44, and a label comparison button 45.
- the display apparatus 30 has a function of displaying a content instructed by the display control unit 114 and a function of receiving a request received by the request reception unit 111 .
- The display apparatus 30 may itself include the functions of the label and attribute information management unit 110, the label determination unit 112, and the label comparison unit 113, or an external information processing apparatus may include these functions and the display apparatus 30 may exchange information with it via a communication unit.
- the touch panel liquid crystal display 31 displays an image, a button, and the like.
- The touch panel liquid crystal display is described here; however, the display need not be a touch panel type, in which case, for example, a mouse may be used to perform button operations and the like.
- In the labeling target image data 32, the image that the user is currently labeling is displayed.
- An arbitrary area in the displayed image can be enlarged or reduced by a user operation.
- the label input portion 33 for the target image data is an area for editing a label regarding the image displayed on the labeling target image data 32 .
- The label may be directly input as characters using a keyboard or the like, or selected by a button or from a list when the candidates selectable as a label are predetermined.
- Labels that the user applied in the past can also be displayed in the label input portion 33.
- The label list 34 of the target image data is an area for displaying a list of labels already applied to the target image by the user himself/herself or by another labeler. The user can consider a label for the target image with reference to the label list 34. In addition, when the determined label has been obtained, the determined label is included in the label list 34.
- In the thumbnail image data display area 35, thumbnails of the learning data images are displayed.
- FIG. 9 illustrates an example in which four thumbnails are displayed; however, the display is not limited to this example and can be switched to larger images, smaller images, file names, and the like.
- the thumbnail image data display area 35 can be scrolled, and the user can browse another image by, for example, touching the display area with a finger or the like and sliding it to right and left.
- the number of applied labels and whether the determined label is registered in the data attribute information can be displayed with the thumbnails, and the user can select an image to be labeled next based on these pieces of information.
- the display condition setting portion 36 is used to set a label to be displayed in the label list 34 or a condition of an image to be displayed on the thumbnail image data display area 35 . Which condition is set can be switched in the display condition setting portion 36 .
- the conditions may include the above-described case of wanting to refer to the labels applied to certain learning data by each labeler.
- for example, a condition for displaying only labels of which the confidence degree is greater than or equal to 0.8 and a condition for sorting the label list by labeler reliability in descending order can be set based on the label attribute information and the labeler attribute information and reflected in the label list 34 .
- similarly, a condition for displaying images captured in 2016 or later and a condition for displaying only images having fewer than five labeled results can be set based on the data attribute information and the label attribute information, and thus the matching thumbnail images can be displayed in the thumbnail image data display area 35 .
- the user refers to the image data pieces and labeled results corresponding to the set condition while filtering and sorting them, and accordingly can improve labeling accuracy by visualizing the labeling rule more concretely.
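The filtering and sorting behavior of the display condition setting portion 36 described above can be sketched as follows. This is an illustrative assumption, not the patent's implementation; the field names "confidence" and "reliability" stand in for the label attribute information and labeler attribute information.

```python
def filter_and_sort_labels(labels, min_confidence=0.8):
    """Keep labels whose confidence degree meets the threshold, then sort
    the remainder by labeler reliability in descending order."""
    kept = [lb for lb in labels if lb["confidence"] >= min_confidence]
    return sorted(kept, key=lambda lb: lb["reliability"], reverse=True)

# Hypothetical label list entries for one target image.
label_list = [
    {"label": "cat", "confidence": 0.9, "reliability": 0.7},
    {"label": "dog", "confidence": 0.6, "reliability": 0.9},   # below threshold
    {"label": "bird", "confidence": 0.85, "reliability": 0.95},
]
displayed = filter_and_sort_labels(label_list)
```

Under this sketch, only the two labels with confidence at or above 0.8 remain, and the one applied by the more reliable labeler appears first in the label list 34.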
- the labeled results are displayed in descending order of labeler reliability with respect to the same image data, and thus the user can confirm the tendency of a likely label and improve the confidence degree of labeling.
- a group of images applied with the same label is displayed as a thumbnail, and accordingly the user can understand a tendency of the images to be applied with the relevant label and can perform more accurate labeling.
- the user refers to labels that the user himself or herself applied before and can confirm whether his or her own labeling rule has wavered.
- a group of images labeled before the date when the labeling rule was changed is selected by filtering, and accordingly a label can be efficiently applied again only to the image group to which the new rule has not yet been applied.
- the similarity degree display portion 37 displays a similarity degree between a label set in the label input portion 33 and a label selected from the label list 34 .
- the similarity degree is calculated by the above-described label comparison unit 113 .
- the similarity degree display portion 37 can switch between display of the similarity degree and display of the coincidence degree using a coincidence degree switching button, which is not illustrated.
- the coincidence degree is calculated by the above-described label comparison unit 113 .
- the labeling type switching button 38 is a button for switching the type of the labeling defined by a user, such as the symbol designation type labeling, the area designation type labeling, and the numerical value designation type labeling.
- when the labeling type is switched, the target image data group is also changed, so that the contents in the labeling target image data 32 , the label list 34 , and the thumbnail image data display area 35 are updated.
- the case of switching the labeling type is described as an example; however, the present exemplary embodiment is not limited to this example.
- the image data groups can be switched, and when the image data group is designated, the labeling type may be updated according to the type of the labeling method in the data attribute information.
- the temporary storing button 39 is a button for temporarily storing the content in the label input portion 33 .
- since the content is stored, when the display is switched to other image data and then returned to the previous image, the label can be edited again from the stored state.
- the temporary storing button 39 is not necessarily in a button form, and the content may be always automatically stored when the label input portion 33 is edited.
- the image switching button 40 is used to switch the content of the labeling target image data 32 .
- the content may be switched by designating a thumbnail image displayed on the thumbnail image data display area 35 , or images may be randomly switched without designation.
- the backward and forward button 41 is used to return to a previous state in the history of operations performed by the user via the backward button, or to proceed to the next history state, if one exists, via the forward button.
- the label transfer button 42 is used to select a label from the label list 34 and transfer the label same as the selected label to the label input portion 33 .
- the confirmation button 43 is used to determine the label written in the label input portion 33 and register the label in the label and attribute information. When the confirmation button 43 is pressed, a screen for setting the user's confidence degree in the labeling is displayed, and the confidence degree is stored as the attribute information at the same time.
- the confirmation button 43 is also used to display a message prompting the user to confirm that there is no error in the label, together with an option of whether to edit the label, in the case where the similarity degree between a label newly applied by the user and the determined label is low, or where the newly applied label greatly differs from the tendency of the labels applied by the user himself or herself in the past. Accordingly, the user can reduce the possibility of registering a false label due to a malfunction or a mistake.
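The confirmation check just described reduces to a simple predicate. The sketch below is an assumed reading of that behavior; the threshold values and the way deviation from past tendency is quantified are illustrative, not specified by the text.

```python
def should_prompt_confirmation(similarity_to_determined, deviation_from_history,
                               sim_threshold=0.5, dev_threshold=0.5):
    """Return True when the user should be asked to re-check the label:
    either the new label disagrees with the determined label, or it deviates
    strongly from the user's own past labeling tendency.
    Both thresholds are hypothetical values for illustration."""
    return (similarity_to_determined < sim_threshold
            or deviation_from_history > dev_threshold)
```

For example, a label whose similarity to the determined label is only 0.3 would trigger the confirmation message, while one at 0.9 with little deviation from past tendency would be registered directly.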
- the label determination button 44 is used to determine the label using the label determination unit 112 based on a label group corresponding to the designated image data. When the label is determined, the determined label is added to or updated in the label list 34 .
- the label comparison button 45 is used to compare the label regarding the labeling target image data 32 being edited in the label input portion 33 with the label designated in the label list 34 using the label comparison unit 113 and calculate the similarity degree therebetween.
- the calculated similarity degree is displayed on the similarity degree display portion 37 .
- the label comparison button 45 is described here; however, the present exemplary embodiment is not limited to this button. Label comparison and calculation of the similarity degree may be automatically performed every time a label is designated in the label list 34 , and the similarity degree display portion 37 may be updated accordingly.
- FIG. 10 illustrates an example of display for editing the area designation type label on the touch panel type display. Many functions in FIG. 10 are the same as those in FIG. 9 , and thus only the different functions are described.
- FIG. 10 includes comparison target labeling image data 46 .
- a user can select a label already applied to the same image as the labeling target image data 32 and display the label overlapped on the image data.
- In the labeling target image data 32 , an image which is the current target of labeling by the user is displayed as with the symbol designation type; however, a function of labeling by designating an area on the image is added. Accordingly, the processing performed on the label input portion 33 in FIG. 9 is integrated into the function of the labeling target image data 32 .
- As the thumbnail images displayed in the thumbnail image data display area 35 , not only thumbnails of the learning data images as with the symbol designation type but also thumbnails of images in which a labeled result is overlapped on the labeling target image data can be displayed. Accordingly, in the display condition setting portion 36 , the user can further set a condition by designating a category of image on which the labeled result is overlapped.
- the similarity degree display portion 37 displays the similarity degree between the label set in the labeling target image data 32 and the label in the comparison target labeling image data 46 .
- the label transfer button 42 transfers the label displayed in the comparison target labeling image data 46 as the label being edited in the labeling target image data 32 . Accordingly, the user can efficiently select the label from labels applied in the past, edit and apply the label.
- in the above description, buttons are displayed on the touch panel liquid crystal display 31 ; however, the configuration is not limited thereto, and a processing content may be selected from a list or assigned to a shortcut key instead of buttons.
- An example of display for editing the numerical value designation type label is similar to that of the symbol designation type except that a numerical value is input as a label instead of selecting symbols via character input as in the example of the symbol designation type illustrated in FIG. 9 , so that description thereof is omitted.
- FIG. 11 illustrates a typical flow for applying and confirming the label using the information processing apparatus according to the present exemplary embodiment. The processing flows are approximately similar for the symbol designation type, the area designation type, and the numerical value designation type labeling and thus are described together with reference to FIG. 11 .
- In step S 2001 , when a user starts up the display apparatus 30 , initialization processing is performed, and the display control unit 114 displays a UI for labeling as illustrated in FIG. 9 or FIG. 10 .
- the initialization processing includes processing for reading necessary data pieces from the learning data and the label and attribute information.
- In step S 2002 , the label and attribute information management unit 110 selects image data to be labeled by the user.
- the user designates the image using the display condition setting portion 36 and the thumbnail image data display area 35 and presses the image switching button 40 to display the image on the labeling target image data 32 .
- the user may randomly display images by pressing the image switching button 40 without designating the image.
- In step S 2003 , the label and attribute information management unit 110 receives the labeled result by the user.
- the user performs the labeling operation using the temporary storing button 39 , the backward and forward button 41 , the label transfer button 42 , and the like.
- In step S 2004 , the label comparison unit 113 compares and confirms the labels.
- the label list 34 or the comparison target labeling image data 46 is used as previous labeled results in the comparison and confirmation.
- a calculation result of the similarity degree is displayed on the similarity degree display portion 37 , and the user can confirm a difference from the label of the comparison target while watching the display.
- the display condition setting portion 36 and thumbnail images may be used.
- In step S 2005 , the label and attribute information management unit 110 confirms whether labeling is completed.
- the user presses the confirmation button 43 to confirm that there is no possibility of an error in the label.
- when labeling is not completed (NO in step S 2005 ), the processing returns to step S 2003 .
- when labeling is completed (YES in step S 2005 ), the processing proceeds to step S 2006 .
- In step S 2006 , the label and attribute information management unit 110 registers and manages the image data by adding the attribute information such as the user's confidence degree in the labeling.
- the label and attribute information management unit 110 associates the learning data with the label and the attribute information.
- In step S 2007 , the user determines whether to continue labeling another image.
- when labeling is continued (YES in step S 2007 ), the processing returns to step S 2002 .
- when labeling is not continued (NO in step S 2007 ), the processing is terminated.
- a user can efficiently perform labeling while visually confirming other labels and image data using the label similarity degree calculation function of the label comparison unit 113 and the functions of comparing, referring to, and editing a label via the UI of the display apparatus.
- An information processing apparatus 30000 causes many labelers to perform labeling by crowdsourcing and determines a label with a high degree of accuracy from a large amount of labels. Further, the information processing apparatus 30000 appropriately extracts learning data to be processed by the labelers by identifying data of which a label is not yet determined among the learning data pieces and thus efficiently labels a large amount of learning data.
- FIG. 12 is a functional block diagram illustrating a functional configuration of a labeling apparatus 30000 based on crowdsourcing according to the third exemplary embodiment.
- the functional configuration according to the third exemplary embodiment includes portions in common with the configuration according to the second exemplary embodiment illustrated in FIG. 8 , so that crowdsourcing 102 , a certainty degree calculation unit 115 , and a data extraction unit 116 which are different from the second exemplary embodiment are described in detail.
- the crowdsourcing 102 is constituted of many labelers who perform labeling, each labeler performs labeling on a learning data group extracted by the data extraction unit 116 , and the label and attribute information management unit 110 stores and manages the labeled results in the label and attribute information.
- the third exemplary embodiment is not necessarily limited to the crowdsourcing 102 and may adopt outsourcing with particular people or cause a plurality of personal computers (PCs) to perform labeling in parallel.
- the certainty degree calculation unit (certainty degree derivation unit) 115 receives the label and attribute information of each learning data from the label and attribute information management unit 110 and calculates a certainty degree as a degree of likelihood of a label determined by the label determination unit 112 .
- the calculated certainty degree is stored as the data attribute information for each learning data.
- when the certainty degree is high, the label determined by the label determination unit 112 has high likelihood, which indicates that no further labeling is necessary.
- when the certainty degree is low, it indicates that even the label determined by the label determination unit 112 is highly likely to be wrong, and thus labeling by more labelers is necessary. Alternatively, there is a possibility that the learning data itself is data for which a label is difficult to determine uniquely.
- the data extraction unit 116 extracts the learning data to be labeled by the labeler in the crowdsourcing 102 based on the certainty degree in the data attribute information of each learning data. Since further labeling is less necessary for learning data of which the certainty degree is high, such data is not easily extracted. In contrast, learning data of which the certainty degree is low needs further labeling to improve the certainty degree and thus is easily extracted.
- the extracted learning data group is transferred to the crowdsourcing 102 to be labeled.
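The certainty-weighted extraction described above can be sketched as weighted sampling. This is only one plausible scheme, since the text does not specify how the certainty degree maps to extraction probability; the weight `1.0 - certainty` is an assumption.

```python
import random

def extract_batch(learning_data, batch_size, seed=0):
    """Extract a batch for the next labelers, weighted toward low certainty.

    Data with certainty 1.0 gets weight 0 and is never extracted; the lower
    the certainty, the more easily the data is extracted."""
    rng = random.Random(seed)
    weights = [1.0 - item["certainty"] for item in learning_data]
    return rng.choices(learning_data, weights=weights, k=batch_size)

# Hypothetical pool: item 1 already has a settled label.
pool = [
    {"id": 1, "certainty": 1.0},   # weight 0: never extracted
    {"id": 2, "certainty": 0.2},   # uncertain: extracted often
    {"id": 3, "certainty": 0.5},
]
batch = extract_batch(pool, batch_size=5)
```

A batch drawn this way contains only the uncertain items, which matches the policy that high-certainty data is "not easily extracted".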
- a certainty degree F is calculated by the following formula using the score S(j) of each label candidate calculated by the formula (1).
- a function W(n) determines the maximum value of the certainty degree F in response to the number n of labels.
- when the value n is small, the function W(n) is also a small value, and when the value n is large, the function W(n) is also a large value.
- the upper limit of the function W(n) is 1.0.
- the function W(n) is provided as described above so as not to easily output a high certainty degree F when the total number of labels is small (when the value n is small).
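Since the formula (3) itself is not reproduced in this text, the following is only a plausible instance consistent with the description: the certainty is the share of the top-scoring label candidate among all scores S(j), damped by W(n). The saturation point `n_saturate` and the exact form of W(n) are assumptions.

```python
def w(n, n_saturate=10):
    """W(n): increases with the number of labels n and is capped at 1.0,
    so a high certainty cannot be produced from only a few labels."""
    return min(1.0, n / n_saturate)

def certainty_degree(scores, n):
    """One assumed form of the certainty degree F: the share of the top
    label score among all scores S(j), multiplied by W(n)."""
    total = sum(scores)
    if total == 0:
        return 0.0
    return w(n) * max(scores) / total
```

With ten labels split 9-to-1 between two candidates the certainty is high; with only one label the same score ratio yields a low certainty, reflecting the role of W(n).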
- the calculation of the certainty degree is not limited to the above-described formula as long as the formula exploits the fact that the ratio of the determined label is high. For example, the following formula may be used instead of the formula (3).
- here, the value S[1] is the max S(j), and the value S[m] is the min S(j).
- the policy (i) exists because, as the certainty degree is lower, the data is preferentially subjected to labeling by the next labeler so that a more correct label becomes easier to select.
- the policy (ii) indicates that the learning data itself has a defect.
- the fact that the certainty degree does not increase even when the number of labels is increased for certain learning data indicates that the distribution of the score S(j) in the formula (3) or the formula (4) has a plurality of peaks or no peak. In such a case, it is highly likely that the learning data itself is an image difficult to label; for example, a case is considered in which a target object is only partly captured in an image and is indistinguishable for labeling.
- an attribute for indicating that the data is difficult to be labeled is defined and registered to the data attribute information.
- data extraction is performed so as to make a request to a labeler having high reliability who can highly accurately perform labeling for labeling of the learning data which is difficult to be labeled, and accordingly an error in labeling can be reduced.
- the learning data which is difficult to be labeled may be excluded from target learning data to maintain a quality of the learning data group.
- the policy (iii) concerns a reliability evaluation data set for testing the reliability of the labeler.
- a data set for evaluating the reliability, constituted of data of which the true label is known and a data group with a high certainty degree, is prepared in advance, and the reliability evaluation data set is partly mixed into the data pieces to be labeled by the labeler without the labeler noticing. Subsequently, the reliability of the labeler can be obtained by evaluating how correct the labels applied to the reliability evaluation data set by the labeler are compared to the true labels.
- One of the points requiring attention in crowdsourcing is the incorporation of false labels by a malicious user. Using the reliability evaluation data set lowers the reliability of a labeler who applies false labels, so that even if such a malicious user is included, the effect thereof can be suppressed.
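The gold-set check described above can be sketched as follows. The data structures and field contents are hypothetical; only the principle, scoring a labeler by agreement with hidden true labels, comes from the text.

```python
def evaluate_on_gold_set(applied_labels, gold_labels):
    """Score a labeler by agreement with true labels on the hidden
    reliability evaluation set mixed into the work queue."""
    checked = [k for k in applied_labels if k in gold_labels]
    if not checked:
        return None  # labeler has seen no evaluation data yet
    correct = sum(1 for k in checked if applied_labels[k] == gold_labels[k])
    return correct / len(checked)

# Hypothetical example: two gold items hidden among the labeler's work.
gold = {"img_007": "cat", "img_031": "dog"}          # true labels, kept hidden
applied = {"img_007": "cat", "img_031": "cat", "img_100": "dog"}
reliability = evaluate_on_gold_set(applied, gold)     # 1 of 2 gold items correct
```

A consistently low score on the gold items flags a careless or malicious labeler whose labels can then be down-weighted or discarded.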
- at initialization, the label and attribute information for the learning data is not defined at all, and the certainty degree is unset (zero) with respect to all learning data pieces, so that the data extraction unit 116 may transfer data extracted by appropriately dividing all of the learning data pieces to the crowdsourcing 102 .
- the behavior in the initialization is not limited to the above-described one.
- some data pieces to be a model of labeling may be registered in advance, and data may be extracted using the data pieces as first label and attribute information and reliability evaluation data set.
- FIG. 13 is a flowchart illustrating a flow of labeling based on the crowdsourcing.
- In step S 3001 , the request reception unit 211 receives a request to perform a labeling operation from the labeler in the crowdsourcing 102 .
- In step S 3002 , the data extraction unit 116 extracts data to be labeled.
- Data may be extracted one by one or in a plurality of data groups together.
- In step S 3003 , the extracted data is transferred to the labeler in the crowdsourcing 102 to perform labeling.
- the labeling procedure may be performed by following the flowchart already described in FIG. 11 .
- In step S 3004 , the label and attribute information management unit 110 obtains the labels and the attribute information from the crowdsourcing 102 and manages them in association with the learning data.
- A detailed processing flow in step S 3004 may follow the flowchart described in FIG. 6 .
- In step S 3005 , the certainty degree is calculated for each learning data using the certainty degree calculation unit 115 .
- the calculated certainty degree is updated as the data attribute information with respect to the learning data.
- In step S 3006 , the label is determined using the label determination unit 112 .
- the label determination flow may be performed by following the flowchart already described in FIG. 7 .
- the processing is described as being performed in the order illustrated in the flowchart in FIG. 13 ; however, the order is not limited to the above-described one. For example, the determination of the label in step S 3006 is calculated using the score S(j) in the formula (1), as with the certainty degree in step S 3005 , so that the label may be determined at the same time the certainty degree is calculated.
- some use cases require the determined label even when the label is not yet finally determined. In such a case, the processing in the flowchart in FIG. 7 may be executed at an arbitrary timing, independent of the processing in the flowchart in FIG. 13 , using the certainty degree.
- the certainty degree indicating whether the label is determined is calculated, and data is extracted from data of which the label is not determined, so that the learning data which needs to be labeled can be efficiently selected and transferred to the cloud.
- the reliability evaluation data set is included in the data to be extracted, and accordingly a harmful effect such as incorporation of a labeling error by a malicious user in the crowdsourcing can be minimized.
- An information processing apparatus 40000 determines a label with fewer errors from a large amount of labels by crowdsourcing, performs learning using the label with fewer errors, and thus can highly accurately perform recognition.
- FIG. 14 is a functional block diagram illustrating a functional configuration of the information processing apparatus 40000 according to the fourth exemplary embodiment.
- the functional configuration according to the fourth exemplary embodiment includes portions in common with the example of the configuration according to the third exemplary embodiment illustrated in FIG. 12 , so that a learning unit 117 and a recognition unit 118 which are different from the third exemplary embodiment are described in detail.
- the learning unit 117 receives the label determined by the label determination unit 112 and the learning data by making a request to the request reception unit 211 and performs supervised learning using the label corresponding to the learning data as a teacher.
- the learning method is not particularly limited, and, for example, deep learning or random forest may be used.
- As the data to be used in the learning, only data of which the certainty degree is greater than or equal to a threshold value is used. Accordingly, the learning can be performed with labels having fewer errors, and a highly accurate prediction model can be generated.
- the data is not limited to the above-described one, and the learning may be performed using all of the learning data pieces. The learning can be performed at an arbitrary timing as long as the determined label corresponding to the learning data can be obtained.
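The certainty-based training-set selection described above amounts to a simple filter. The sketch below assumes a per-sample record holding the data, its determined label, and its certainty degree; the threshold 0.8 is an illustrative value, not one given by the text.

```python
def select_training_data(dataset, threshold=0.8):
    """Keep only (data, determined_label) pairs whose certainty degree is
    greater than or equal to the threshold, so learning uses labels with
    fewer errors. The threshold value is an assumption."""
    return [(d["data"], d["determined_label"])
            for d in dataset if d["certainty"] >= threshold]

# Hypothetical learning data records.
dataset = [
    {"data": "img1.png", "determined_label": "cat", "certainty": 0.95},
    {"data": "img2.png", "determined_label": "dog", "certainty": 0.40},
]
train = select_training_data(dataset)
```

Only the high-certainty sample survives, which is the property the learning unit 117 relies on to build a more accurate prediction model.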
- When receiving the prediction model from the learning unit 117 and the query data from the request reception unit 211 , the recognition unit 118 returns a predicted result (an output label) based on the prediction model.
- the output label is used for comparison with another label in the label comparison unit 113 , displayed by the display control unit 114 with the other label, and referred to when the certainty degree calculation unit 115 determines the certainty degree.
- the display control unit 114 causes the display apparatus 30 to display comparison with the output label during editing of the label and thus can prompt a user to correct the label.
- the certainty degree F is already calculated by the formula (3) or the formula (4).
- a formula B(I_d, I_o) is further added to the formula.
- here, "I_d" and "I_o" respectively represent the determined label and the output label.
- the formula B indicates a similarity degree, which takes a larger value when "I_d" and "I_o" are the same and a smaller value when "I_d" and "I_o" are different.
- the formula B (I_d, I_o) is added to the formula (3) or the formula (4) for the calculation of the certainty degree.
- the calculation method is not limited to this, and the certainty degree may be calculated by multiplying the formula (3) or the formula (4) by the formula B (I_d, I_o).
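The two ways of combining the base certainty with B(I_d, I_o), addition or multiplication, can be sketched as follows. The exact values returned by B are not given in the text, so the 1.0/0.0 agreement indicator is an assumption; the text only states that B is larger on agreement and smaller on disagreement.

```python
def agreement_b(label_d, label_o):
    """B(I_d, I_o): larger when the determined label and the output label
    match, smaller otherwise (values are illustrative assumptions)."""
    return 1.0 if label_d == label_o else 0.0

def certainty_with_recognition(base_certainty, label_d, label_o, additive=True):
    """Combine the base certainty (from formula (3) or (4)) with
    B(I_d, I_o), either by addition or by multiplication as described."""
    b = agreement_b(label_d, label_o)
    return base_certainty + b if additive else base_certainty * b
```

In the multiplicative variant, disagreement between the determined label and the recognition result drives the certainty toward zero, forcing further labeling of that data.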
- in the flow described with reference to FIG. 13 , the determined label is obtained after the certainty degree calculation unit 115 calculates the certainty degree.
- the label may be determined as a tentatively determined label before the certainty degree is determined, and when the tentatively determined label is equal to the output label, and the certainty degree is high, the tentatively determined label may be finally regarded as the determined label.
- FIG. 15 illustrates a flow of learning and recognition by the crowdsourcing.
- In step S 4001 , the label determination unit 112 determines the label using the flow illustrated in FIG. 13 , as with the third exemplary embodiment.
- labeling is completed up to a stage at which the labels of the data to be used for learning can be determined among the learning data. In addition, it is confirmed not only that the determined label is applied to the learning data, but also that the certainty degree of each learning data is greater than or equal to a constant value, depending on the condition.
- In step S 4002 , the learning unit 117 performs the supervised learning using the determined label corresponding to the learning data and estimates the prediction model.
- In step S 4003 , the request reception unit 211 receives the query data and transmits the query data to the recognition unit 118 .
- the query data has a format similar to that of the learning data and is, for example, an image file.
- the image file is obtained by an image obtainment unit, which is not illustrated.
- Step S 4003 is an event which can occur at an arbitrary timing once the prediction model is generated in step S 4002 and is not necessarily executed immediately after step S 4002 .
- the recognition unit 118 predicts (recognizes) a result with respect to the query data based on the prediction model obtained in the previous processing.
- the predicted result has a form similar to that of the label.
- in the case of the symbol designation type, the recognition unit 118 predicts (recognizes), for example, that an animal captured in the query data is a "cat", and in the case of the area designation type, the recognition unit 118 predicts, for example, a result in which only the face area of an object is filled with color.
- in the case of the numerical value designation type, the recognition unit 118 predicts, for example, a matrix or quaternion indicating the posture of a target object.
- the predicted (recognized) result may be returned to the requester so that the recognition unit 118 serves as a recognition system, or may be used to improve the calculation in the certainty degree calculation unit 115 . Further, the predicted (recognized) result may be used as a comparison target in the display control unit 114 .
- the present exemplary embodiment is described with the configuration in which the learning unit 117 and the recognition unit 118 are added to the information processing apparatus according to the third exemplary embodiment; however, the configuration is not limited to this, and the learning unit 117 and the recognition unit 118 may be included in the first exemplary embodiment and the second exemplary embodiment in a similar manner.
- a prediction model is generated by performing learning using labels with fewer errors determined by the label determination unit, and the prediction model is used to perform recognition, so that the recognition can be highly accurately performed. Further, a certainty degree is calculated with a high degree of accuracy using the recognition result, and accordingly, a more reliable determined label can be obtained, and accuracy in data extraction can be improved.
- the display control unit 114 displays the recognition result to be compared with a label being edited, and thus a labeler can be helped to reduce an error in labeling.
- An information processing apparatus 50000 has a function of comparing a label applied by a labeler with an existing label and evaluating the reliability of the labeler. Evaluating the labeler with a high degree of accuracy makes it possible to appropriately pay a reward for the labeling operation and to improve the motivation and work efficiency of the labeler. In addition, a malicious labeler who intends to apply false labels can be identified.
- FIG. 16 is a block diagram illustrating an example of a configuration of the information processing apparatus 50000 according to the fifth exemplary embodiment.
- the example of the configuration according to the fifth exemplary embodiment includes portions in common with the example of the configuration according to the second exemplary embodiment illustrated in FIG. 8 , so that a labeler evaluation unit (label evaluation unit) 119 which is different from the second exemplary embodiment is described.
- the information processing apparatus 50000 according to the fifth exemplary embodiment does not include the display control unit 114 compared to the information processing apparatus 20000 according to the second exemplary embodiment illustrated in FIG. 8 , however an evaluation result by the labeler evaluation unit 119 may be displayed on the display apparatus 30 via the display control unit 114 .
- the labeler evaluation unit 119 compares a label newly applied by a labeler with the determined label or the output label and evaluates the newly applied label. Further, the labeler evaluation unit 119 evaluates the labeler based on an evaluation with respect to an entire label group applied by the labeler. When the labeler is evaluated, the reliability of the labeler can be calculated more accurately, and accordingly, an error in the determined label can be reduced. In addition, the evaluation can be used as a factor for determining a reward to the labeler in the crowdsourcing 102 .
- a label evaluation method is described.
- the label newly applied by the labeler is evaluated by calculating a similarity degree using the label comparison unit 113 with respect to the determined label calculated by the label determination unit 112 or the output label calculated using the prediction model by the recognition unit 118 .
- a calculation method of the similarity degree is the same as that described according to the second exemplary embodiment.
- An evaluation value is determined in a range from zero to one so that the evaluation becomes higher as the similarity degree becomes higher, for example, by setting the evaluation value equal to the similarity degree. Since the determined label and the output label are ideally the same label, the similarity degree may be calculated with respect to either of them, or may be calculated for both of them, and the average value of the similarity degrees may be regarded as the evaluation value.
- in some cases, the attribute information includes an inadequate setting in the calculation of the formula (1) by the label determination unit 112 , or the learning of the prediction model by the learning unit 117 has failed. In such a case, the newly applied label cannot be correctly evaluated, so that the evaluation of the label for that learning data is invalidated.
- a reward parameter Rw and a reliability R of a labeler are calculated.
- a reward of the labeler in the crowdsourcing is determined based on the reward parameter Rw of the labeler.
- the determined label, the prediction model, and the like change at any time, so that if the reward parameter Rw is calculated at that stage, the calculated parameter is not always correct. Therefore, the reward parameter Rw can be defined from the labels applied to data for which labeling is regarded as completed with respect to the learning data. However, the reliability R of the labeler at that time is used.
- the reward parameter Rw may be added for each data on which labeling is completed one after another as shown in the formula (5).
- The function W(p) is similar to the function W(n) in the formula (3) and takes a value from zero to one that determines the maximum value of the reliability R according to the number p of evaluation values.
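A minimal sketch of formula (6) follows, assuming a saturating form W(p) = p/(p+c); the patent requires only a value from zero to one that grows with the number p of evaluation values, so the constant c and this particular form are assumptions:

```python
def W(p, c=5.0):
    """A saturating weight in [0, 1): grows with the number p of evaluation
    values and caps the attainable reliability. The form p/(p + c) is only
    one possible choice satisfying the stated requirement."""
    return p / (p + c)

def reliability(evaluations):
    """Reliability R per formula (6) as printed:
    R = W(p) * product of the evaluation values v(k)."""
    p = len(evaluations)
    prod = 1.0
    for v in evaluations:
        prod *= v
    return W(p) * prod
```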
- the calculated reliability R of the labeler is updated as the labeler attribute information.
- When the reliability R of the labeler is less than a threshold value, the labeler is recognized as a malicious labeler who consistently applies false labels, and the labels applied by that labeler are disregarded or deleted.
- Alternatively, a countermeasure may be taken in which the weight coefficient applied to the reward parameter Rw is greatly reduced.
- When the reliability R of the labeler is greater than or equal to the threshold value, the value of the score S(j) determined in the formula (1) becomes larger, and the labeler has a greater influence on the determined label and on the certainty degree calculated by the label determination unit 112 and the certainty degree calculation unit 115 .
- The reliability R of the labeler is calculated from the labeled results as described above, so a malicious labeler who applies false labels can be easily identified. Further, when the label is determined, labels applied by labelers with higher reliability are prioritized, improving the accuracy of the determined label. In addition, a labeler whose labeling accuracy occasionally changes without malice may be alerted when the reliability R falls; such an alert has an educational benefit, encouraging correct labeling and preventing careless mistakes.
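Formula (1) itself is only partially reproduced in this text, but the behavior described here, in which labels from higher-reliability labelers contribute a larger score S(j), can be illustrated by a simple reliability-weighted vote (this is a sketch, not the patent's exact formula):

```python
def determine_label(votes):
    """Reliability-weighted voting.

    `votes` is a list of (label, labeler_reliability) pairs. Each candidate
    label j accumulates a score S(j) equal to the sum of the reliabilities of
    the labelers who applied it; the highest-scoring label is returned
    together with a certainty degree (its share of the total score).
    """
    scores = {}
    for label, r in votes:
        scores[label] = scores.get(label, 0.0) + r
    best = max(scores, key=scores.get)
    total = sum(scores.values())
    certainty = scores[best] / total if total > 0 else 0.0
    return best, certainty
```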
- In step S5001, a target labeler performs labeling.
- In step S5002, it is determined whether labeling is completed on the target learning data.
- The criterion of labeling completion is that the certainty degree is greater than or equal to the threshold value, that the similarity degree between the determined label and the output label is high, or both.
- In step S5003, the labeler evaluation unit 119 calculates a label evaluation value based on the similarity degree calculated by the label comparison unit 113 .
- In step S5004, the labeler evaluation unit 119 calculates the reliability of the labeler based on the formula (6), and the label and attribute information management unit 110 updates the labeler reliability.
- In step S5005, it is determined whether the labeler reliability is greater than or equal to the threshold value. When the labeler reliability is less than the threshold value (NO in step S5005), the processing proceeds to step S5012; when the labeler reliability is greater than or equal to the threshold value (YES in step S5005), the processing proceeds to step S5006.
- In step S5006, the reward parameter Rw is calculated based on the formula (5).
- In step S5011, the data extraction unit 116 repeatedly extracts data and requests the labeler to perform labeling until the labeling is completed.
- The data extraction unit 116 is not limited to extracting one piece of data at a time; it may first extract all the data pieces and request the labeler to perform labeling until no data remains.
- In step S5012, because the labeler reliability is less than the threshold value, the labeler is recognized as the malicious labeler; the information of the labels applied by the labeler is deleted or its weight is reduced, and the function f(R) in the formula (5) is set to zero or a very small value.
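The per-data portion of this flow (steps S5003 through S5012) can be sketched as follows; the class, the function name, and the choices of W and f are hypothetical stand-ins for the patent's modules and formulas:

```python
class Labeler:
    """Minimal record of a labeler's state (hypothetical stand-in)."""
    def __init__(self):
        self.reliability = 1.0
        self.reward = 0.0
        self.evaluations = []

def process_labeled_data(labeler, evaluation_value, threshold=0.3,
                         W=lambda p: p / (p + 5.0), f=lambda R: R):
    """One pass of steps S5003-S5012 for a piece of completed learning data.

    `evaluation_value` is the label evaluation from step S5003 (in [0, 1]);
    W and f are assumed forms for the functions in formulas (5) and (6).
    Returns True when the label is kept (S5006 branch), False when the
    labeler is treated as malicious (S5012 branch).
    """
    labeler.evaluations.append(evaluation_value)
    p = len(labeler.evaluations)
    prod = 1.0
    for v in labeler.evaluations:
        prod *= v
    labeler.reliability = W(p) * prod          # S5004: formula (6)
    if labeler.reliability < threshold:        # S5005 -> S5012
        return False                           # label disregarded; f(R) ~ 0
    labeler.reward += f(labeler.reliability) * evaluation_value  # S5006: one formula (5) term
    return True
```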
- The labeler is evaluated based on the labeled results and the labeler reliability is updated, so a more reliable label with a lower possibility of error can be obtained.
- The reward of the labeler is determined based on this evaluation, so improved motivation and work efficiency of the labeler can be expected. Further, the malicious labeler is identified, so harmful effects on the determination of labels and the generation of the prediction model can be reduced.
- the label can be appropriately determined.
- Embodiment(s) of the present disclosure can also be realized by a computer of a system or apparatus that reads out and executes computer executable instructions (e.g., one or more programs) recorded on a storage medium (which may also be referred to more fully as a ‘non-transitory computer-readable storage medium’) to perform the functions of one or more of the above-described embodiment(s) and/or that includes one or more circuits (e.g., application specific integrated circuit (ASIC)) for performing the functions of one or more of the above-described embodiment(s), and by a method performed by the computer of the system or apparatus by, for example, reading out and executing the computer executable instructions from the storage medium to perform the functions of one or more of the above-described embodiment(s) and/or controlling the one or more circuits to perform the functions of one or more of the above-described embodiment(s).
- the computer may comprise one or more processors (e.g., central processing unit (CPU), micro processing unit (MPU)) and may include a network of separate computers or separate processors to read out and execute the computer executable instructions.
- the computer executable instructions may be provided to the computer, for example, from a network or the storage medium.
- the storage medium may include, for example, one or more of a hard disk, a random-access memory (RAM), a read only memory (ROM), a storage of distributed computing systems, an optical disk (such as a compact disc (CD), digital versatile disc (DVD), or Blu-ray Disc (BD)™), a flash memory device, a memory card, and the like.
Abstract
Description
[Formula 1]
S(j)=Σi|L(i)∈l
Rw = Σ_k f(R)*v(k) (5)
R = W(p)*Π_k v(k) (6)
Claims (19)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JPJP2016-249170 | 2016-12-22 | ||
JP2016249170 | 2016-12-22 | ||
JP2016-249170 | 2016-12-22 |
Publications (2)
Publication Number | Publication Date |
---|---|
US20180181885A1 US20180181885A1 (en) | 2018-06-28 |
US11551134B2 true US11551134B2 (en) | 2023-01-10 |
Family
ID=62629869
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/845,922 Active 2041-10-16 US11551134B2 (en) | 2016-12-22 | 2017-12-18 | Information processing apparatus, information processing method, and storage medium |
Country Status (2)
Country | Link |
---|---|
US (1) | US11551134B2 (en) |
JP (1) | JP6946081B2 (en) |
Families Citing this family (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210279637A1 (en) * | 2018-02-27 | 2021-09-09 | Kyushu Institute Of Technology | Label collection apparatus, label collection method, and label collection program |
JP7308421B2 (en) * | 2018-07-02 | 2023-07-14 | パナソニックIpマネジメント株式会社 | LEARNING DEVICE, LEARNING SYSTEM AND LEARNING METHOD |
JP7211735B2 (en) * | 2018-08-29 | 2023-01-24 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ | CONTRIBUTION DETERMINATION METHOD, CONTRIBUTION DETERMINATION DEVICE AND PROGRAM |
WO2020047466A1 (en) * | 2018-08-30 | 2020-03-05 | The Government Of The United States Of America, As Represented By Thesecretary Of The Navy | Human-assisted machine learning through geometric manipulation and refinement |
WO2020090076A1 (en) * | 2018-11-01 | 2020-05-07 | 日本電気株式会社 | Answer integrating device, answer integrating method, and answer integrating program |
KR102129843B1 (en) * | 2018-12-17 | 2020-07-03 | 주식회사 크라우드웍스 | Method for verifying real annotation works using test annotation works and apparatus thereof |
JP7229795B2 (en) * | 2019-02-01 | 2023-02-28 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ | Annotation device, annotation method, and program |
JP7298174B2 (en) * | 2019-02-12 | 2023-06-27 | 日本電信電話株式会社 | Model learning device, label estimation device, methods thereof, and program |
WO2020188701A1 (en) * | 2019-03-18 | 2020-09-24 | 日本電気株式会社 | Machine learning system, information terminal, information processing device, information processing method, program, learned model, and method for generating learned model |
JP7213138B2 (en) * | 2019-05-10 | 2023-01-26 | 株式会社日立システムズ | LEARNING DATA CREATE SUPPORT SYSTEM AND LEARNING DATA CREATE SUPPORT METHOD |
JP2022116369A (en) * | 2019-06-10 | 2022-08-10 | アミフィアブル株式会社 | Sports movie management system |
JP2021010970A (en) * | 2019-07-05 | 2021-02-04 | 京セラドキュメントソリューションズ株式会社 | Robot system and robot control method |
JP7404679B2 (en) | 2019-07-11 | 2023-12-26 | 富士通株式会社 | Judgment processing program, judgment processing method, and information processing device |
CN110717785A (en) * | 2019-09-29 | 2020-01-21 | 支付宝(杭州)信息技术有限公司 | Decision method, system and device based on label distribution learning |
JP2021081793A (en) * | 2019-11-14 | 2021-05-27 | キヤノン株式会社 | Information processing device, control method and program for information processing device |
JPWO2021140604A1 (en) * | 2020-01-09 | 2021-07-15 | ||
WO2021181520A1 (en) | 2020-03-10 | 2021-09-16 | オリンパス株式会社 | Image processing system, image processing device, endoscope system, interface, and image processing method |
WO2021193025A1 (en) * | 2020-03-25 | 2021-09-30 | パナソニックIpマネジメント株式会社 | Data generation method, determination method, program, and data generation system |
US20230206085A1 (en) * | 2020-06-05 | 2023-06-29 | Nippon Telegraph And Telephone Corporation | Processing device, processing method and processing program |
CN116635876A (en) * | 2020-12-07 | 2023-08-22 | 松下知识产权经营株式会社 | Processing system, learning processing system, processing method, and program |
WO2022185360A1 (en) * | 2021-03-01 | 2022-09-09 | 日本電信電話株式会社 | Assistance device, assistance method, and program |
CN112988727B (en) | 2021-03-25 | 2022-09-16 | 北京百度网讯科技有限公司 | Data annotation method, device, equipment, storage medium and computer program product |
WO2022224378A1 (en) * | 2021-04-21 | 2022-10-27 | 株式会社Apto | Data collection system and program therefor |
JPWO2022234692A1 (en) * | 2021-05-06 | 2022-11-10 | ||
US20220374930A1 (en) * | 2021-05-18 | 2022-11-24 | At&T Intellectual Property I, L.P. | Machine learning models with accurate data labeling |
CN113420149A (en) * | 2021-06-30 | 2021-09-21 | 北京百度网讯科技有限公司 | Data labeling method and device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6421421B2 (en) * | 2014-03-04 | 2018-11-14 | 富士ゼロックス株式会社 | Annotation information adding program and information processing apparatus |
-
2017
- 2017-07-10 JP JP2017134662A patent/JP6946081B2/en active Active
- 2017-12-18 US US15/845,922 patent/US11551134B2/en active Active
Patent Citations (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080086432A1 (en) * | 2006-07-12 | 2008-04-10 | Schmidtler Mauritius A R | Data classification methods using machine learning techniques |
US20100312725A1 (en) * | 2009-06-08 | 2010-12-09 | Xerox Corporation | System and method for assisted document review |
US20110238605A1 (en) * | 2010-03-25 | 2011-09-29 | Sony Corporation | Information processing apparatus, information processing method, and program |
JP2011203991A (en) | 2010-03-25 | 2011-10-13 | Sony Corp | Information processing apparatus, information processing method, and program |
US20120216257A1 (en) * | 2011-02-18 | 2012-08-23 | Google Inc. | Label privileges |
JP2012194691A (en) | 2011-03-15 | 2012-10-11 | Olympus Corp | Re-learning method and program of discriminator, image recognition device |
US20120278266A1 (en) * | 2011-04-28 | 2012-11-01 | Kroll Ontrack, Inc. | Electronic Review of Documents |
US20130132308A1 (en) | 2011-11-22 | 2013-05-23 | Gregory Jensen Boss | Enhanced DeepQA in a Medical Environment |
JP2013120534A (en) | 2011-12-08 | 2013-06-17 | Mitsubishi Electric Corp | Related word classification device, computer program, and method for classifying related word |
US9087303B2 (en) * | 2012-02-19 | 2015-07-21 | International Business Machines Corporation | Classification reliability prediction |
US9760622B2 (en) * | 2012-08-08 | 2017-09-12 | Microsoft Israel Research And Development (2002) Ltd. | System and method for computerized batching of huge populations of electronic documents |
US10002182B2 (en) * | 2013-01-22 | 2018-06-19 | Microsoft Israel Research And Development (2002) Ltd | System and method for computerized identification and effective presentation of semantic themes occurring in a set of electronic documents |
US8713023B1 (en) * | 2013-03-15 | 2014-04-29 | Gordon Villy Cormack | Systems and methods for classifying electronic information using advanced active learning techniques |
JP2015129988A (en) | 2014-01-06 | 2015-07-16 | 日本電気株式会社 | Data processor |
JP2016062544A (en) | 2014-09-22 | 2016-04-25 | インターナショナル・ビジネス・マシーンズ・コーポレーションInternational Business Machines Corporation | Information processing device, program, information processing method |
JP2016115245A (en) | 2014-12-17 | 2016-06-23 | ソニー株式会社 | Information processing device, information processing method, and program |
US20160373657A1 (en) * | 2015-06-18 | 2016-12-22 | Wasaka Llc | Algorithm and devices for calibration and accuracy of overlaid image data |
US10229117B2 (en) * | 2015-06-19 | 2019-03-12 | Gordon V. Cormack | Systems and methods for conducting a highly autonomous technology-assisted review classification |
US10504037B1 (en) * | 2016-03-31 | 2019-12-10 | Veritas Technologies Llc | Systems and methods for automated document review and quality control |
US10445379B2 (en) * | 2016-06-20 | 2019-10-15 | Yandex Europe Ag | Method of generating a training object for training a machine learning algorithm |
US10902066B2 (en) * | 2018-07-23 | 2021-01-26 | Open Text Holdings, Inc. | Electronic discovery using predictive filtering |
US20210089963A1 (en) * | 2019-09-23 | 2021-03-25 | Dropbox, Inc. | Cross-model score normalization |
US20210158209A1 (en) * | 2019-11-27 | 2021-05-27 | Amazon Technologies, Inc. | Systems, apparatuses, and methods of active learning for document querying machine learning models |
US20220108082A1 (en) * | 2020-10-07 | 2022-04-07 | DropCite Inc. | Enhancing machine learning models to evaluate electronic documents based on user interaction |
Non-Patent Citations (1)
Title |
---|
Satoshi Oyama, et al., Accurate Integration of Crowdsourced Labels Using Workers' Self-reported Confidence Scores, IJCAI International Joint Conference on Artificial Intelligence, pp. 1-4, 2013. |
Also Published As
Publication number | Publication date |
---|---|
JP2018106662A (en) | 2018-07-05 |
JP6946081B2 (en) | 2021-10-06 |
US20180181885A1 (en) | 2018-06-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11551134B2 (en) | Information processing apparatus, information processing method, and storage medium | |
CN111052146B (en) | System and method for active learning | |
WO2017216980A1 (en) | Machine learning device | |
US10755447B2 (en) | Makeup identification using deep learning | |
US10964057B2 (en) | Information processing apparatus, method for controlling information processing apparatus, and storage medium | |
US11921777B2 (en) | Machine learning for digital image selection across object variations | |
JP6489005B2 (en) | Information processing system, information processing method, and program | |
US9600893B2 (en) | Image processing device, method, and medium for discriminating a type of input image using non-common regions | |
US20140112545A1 (en) | Information processing apparatus and information processing method | |
US20230024586A1 (en) | Learning device, learning method, and recording medium | |
CN111124863B (en) | Intelligent device performance testing method and device and intelligent device | |
JP2019086475A (en) | Learning program, detection program, learning method, detection method, learning device, and detection device | |
US20200320409A1 (en) | Model creation supporting method and model creation supporting system | |
US20230131717A1 (en) | Search processing device, search processing method, and computer program product | |
US20220351533A1 (en) | Methods and systems for the automated quality assurance of annotated images | |
KR102413588B1 (en) | Object recognition model recommendation method, system and computer program according to training data | |
CN111124862B (en) | Intelligent device performance testing method and device and intelligent device | |
CN115730208A (en) | Training method, training device, training apparatus, and computer-readable storage medium | |
US20220405894A1 (en) | Machine learning device, machine learning method, andrecording medium storing machine learning program | |
US20220101038A1 (en) | Information processing device, information processing method, and storage medium | |
US11809984B2 (en) | Automatic tag identification for color themes | |
US10915286B1 (en) | Displaying shared content on respective display devices in accordance with sets of user preferences | |
US20230054688A1 (en) | Systems and methods for determining gui interaction information for an end user device | |
JP6813704B1 (en) | Information processing equipment, information processing methods, and programs | |
WO2023188160A1 (en) | Input assistance device, input assistance method, and non-transitory computer-readable medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FEPP | Fee payment procedure |
Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
AS | Assignment |
Owner name: CANON KABUSHIKI KAISHA, JAPAN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:HIGO, TOMOAKI;INABA, MASAKI;YAMADA, TAKAYUKI;AND OTHERS;SIGNING DATES FROM 20171222 TO 20180119;REEL/FRAME:045477/0795 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: AWAITING TC RESP., ISSUE FEE NOT PAID |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |