WO2020049864A1 - Information processing method and system - Google Patents

Information processing method and system

Info

Publication number
WO2020049864A1
Authority
WO
WIPO (PCT)
Prior art keywords
recognition result
recognition
likelihood
candidate
candidates
Prior art date
Application number
PCT/JP2019/027259
Other languages
English (en)
Japanese (ja)
Inventor
Denis Gudovskiy
Takuya Yamaguchi
Yasunori Ishii
Sotaro Tsukizawa
Original Assignee
Panasonic Intellectual Property Corporation of America
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from JP2019076509A external-priority patent/JP7287823B2/ja
Application filed by Panasonic Intellectual Property Corporation of America
Priority to CN201980005088.7A priority Critical patent/CN111213180A/zh
Priority to EP19857798.3A priority patent/EP3848889A4/fr
Publication of WO2020049864A1 publication Critical patent/WO2020049864A1/fr
Priority to US16/849,334 priority patent/US11449706B2/en

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis

Definitions

  • the present disclosure relates to an information processing method and the like.
  • A method for visualizing the reaction of a deep neural network has been proposed (for example, Non-Patent Document 1).
  • In recognition processing such as object detection, an object may not be detected, and in that case it may be difficult to analyze the output of the recognition processing.
  • Therefore, an object of the present disclosure is to provide an information processing method or the like capable of outputting information for analyzing the output of the recognition process even when no object is detected in the recognition process.
  • In the information processing method, a computer obtains a plurality of recognition result candidates on sensing data, which are obtained by inputting the sensing data to a model that is trained by machine learning and executes a recognition process, and the likelihood of each of the plurality of recognition result candidates; obtains an instruction to specify a portion to be analyzed in the sensing data; selects at least one recognition result candidate from the plurality of recognition result candidates based on the relationship between each of the plurality of recognition result candidates and the portion and the likelihood of each of the plurality of recognition result candidates; and outputs the selected at least one recognition result candidate.
  • These general or specific aspects may be realized by a system, an apparatus, a method, an integrated circuit, a computer program, or a non-transitory recording medium such as a computer-readable CD-ROM, or by any combination of systems, apparatuses, methods, integrated circuits, computer programs, and recording media.
  • FIG. 1 is an image diagram showing an input image used for the identification processing in the reference example.
  • FIG. 2 is an image diagram showing an identification result obtained by the identification processing in the reference example.
  • FIG. 3 is an image diagram showing an input image used for the detection processing in the reference example.
  • FIG. 4 is an image diagram showing a detection result obtained by the detection processing in the reference example.
  • FIG. 5 is an image diagram showing a positive contribution to the identification result in the reference example.
  • FIG. 6 is an image diagram showing a negative contribution to the identification result in the reference example.
  • FIG. 7 is a block diagram illustrating a configuration of the information processing system according to the embodiment.
  • FIG. 8 is an image diagram showing an aspect of the information processing system according to the embodiment.
  • FIG. 9 is a flowchart illustrating processing of the information processing system according to the embodiment.
  • FIG. 10 is a conceptual diagram illustrating processing of the recognition unit according to the embodiment.
  • FIG. 11 is a conceptual diagram illustrating a process of the designation unit according to the embodiment.
  • FIG. 12 is a conceptual diagram illustrating another aspect of the processing of the specifying unit according to the embodiment.
  • FIG. 13 is a conceptual diagram showing still another mode of the process of the specifying unit in the embodiment.
  • FIG. 14 is a conceptual diagram showing still another mode of the process of the specifying unit in the embodiment.
  • FIG. 15 is a conceptual diagram illustrating processing of the selection unit according to the embodiment.
  • FIG. 16 is a flowchart illustrating processing of the selection unit according to the embodiment.
  • FIG. 17 is a flowchart illustrating another mode of the process of the selection unit according to the embodiment.
  • FIG. 18 is a conceptual diagram illustrating a process of deriving an analysis target area from designated coordinates according to the embodiment.
  • FIG. 19 is a conceptual diagram showing another aspect of the processing for deriving the analysis target area from the designated coordinates in the embodiment.
  • FIG. 20 is a conceptual diagram illustrating a process of the analysis unit according to the embodiment.
  • a neural network is a mathematical model simulating the nerves of an organism, and this mathematical model may have multiple layers.
  • the neural network type recognition model is a model configured by a neural network, and is a model for performing recognition processing. For example, sensing data such as an image is input to a neural network type recognition model, and the content of the sensing data is recognized by the neural network type recognition model.
  • various recognition processes are performed according to various recognition models, and these recognition processes are classified into an identification process, a detection process, a segmentation process, and the like.
  • In the identification process, what the sensing data represents is identified.
  • the identification process is also referred to as a classification process.
  • the sensing data is classified into one of a plurality of types.
  • In the detection process, a portion representing the content of the detection target in the sensing data is detected.
  • That is, the detection process detects which part of the sensing data represents what.
  • FIG. 1 is an image diagram showing an input image used for the identification processing in the reference example.
  • the input image shown in FIG. 1 is input to a recognition model that performs an identification process.
  • FIG. 2 is an image diagram showing an identification result obtained by the identification processing in the reference example. For example, by inputting the input image shown in FIG. 1 to a recognition model for performing a classification process, the classification result shown in FIG. 2 is obtained. Specifically, the identification result indicates that the input image represents a dog.
  • FIG. 3 is an image diagram showing an input image used for the detection processing in the reference example.
  • the input image shown in FIG. 3 is input to a recognition model that performs a detection process.
  • FIG. 4 is an image diagram showing a detection result obtained by the detection processing in the reference example. For example, by inputting the input image shown in FIG. 3 to a recognition model for performing a detection process, the detection result shown in FIG. 4 is obtained. Specifically, a portion representing a dog is obtained as a detection result. In other words, the detection result indicates that the dotted frame in FIG. 4 represents a dog.
  • A method of deriving, for each pixel of the input image, the degree of contribution to the identification result has been proposed; Non-Patent Document 1 is an example.
  • FIG. 5 is an image diagram showing a positive contribution to the identification result shown in FIG. 2.
  • the degree of positively contributing to the identification result that the input image represents a dog is represented by the density of points.
  • the degree of positively contributing to the identification result may be expressed by the color density or the like for each pixel.
  • the part where the dog's head appears in the input image and the part where the dog's tail appears in the input image positively contribute to the identification result that the input image represents the dog. That is, the portion where the dog's head is reflected and the portion where the dog's tail is reflected guide the identification result in the direction that the input image represents the dog.
  • FIG. 6 is an image diagram showing a negative contribution to the identification result shown in FIG. 2.
  • the degree of negatively contributing to the identification result that the input image represents a dog is represented by the density of points.
  • the degree of negatively contributing to the identification result may be expressed by the color density or the like for each pixel.
  • the portion of the input image where the dog's paw is shown contributes negatively to the identification result that the input image represents a dog.
  • That is, the portion where the dog's front paws are reflected leads the identification result in the direction that the input image does not represent a dog.
  • Here, the positive contribution and the negative contribution are represented separately, but they may be represented in combination.
  • the degree of positively contributing to the identification result may be expressed by red density
  • the degree of negatively contributing to the identification result may be expressed by blue density.
  • the factor of the identification result that the input image represents a dog is shown.
  • Such information is useful for improving the accuracy of the recognition model. For example, if the identification result is incorrect, it is possible to adjust the parameters of the recognition model or perform relearning based on the factor. Therefore, it is possible to efficiently improve the accuracy of the recognition model.
  • the identification result of the identification process is simple, but the detection result of the detection process is complicated. Therefore, it is difficult to apply the process of deriving the contribution in the identification process to the detection process.
  • In the detection processing, when an object is correctly detected or when an object is erroneously detected, it may be possible to derive the degree of contribution to the object detection for each pixel of the image.
  • However, an object may not be detected in the detection processing. In this case, there is nothing that has contributed to the detection of the object, and it is difficult to derive the degree of contribution.
  • In the information processing method according to the present disclosure, a computer obtains a plurality of recognition result candidates on sensing data, which are obtained by inputting the sensing data to a model that is trained by machine learning and executes a recognition process, and the likelihood of each of the plurality of recognition result candidates; obtains an instruction to specify a part to be analyzed in the sensing data; selects at least one recognition result candidate from the plurality of recognition result candidates based on the relationship between each of the plurality of recognition result candidates and the part and the likelihood of each of the plurality of recognition result candidates; and outputs the selected at least one recognition result candidate.
  • the computer further presents a processing result based on the outputted at least one recognition result candidate.
  • the processing result indicates a degree to which each of a plurality of values included in the sensing data contributes to the likelihood of each of the output at least one recognition result candidate.
  • For example, the relationship is at least one of the presence or absence of overlap between each of the plurality of recognition result candidates and the portion and the degree of the overlap.
  • each of the at least one recognition result candidate selected from the plurality of recognition result candidates is a recognition result candidate whose likelihood is higher than a threshold value among the plurality of recognition result candidates.
  • For example, when fewer than a predetermined number of recognition result candidates are selected, the computer lowers the threshold, and the at least one recognition result candidate is selected based on the relationship, the likelihood, and the lowered threshold value.
  • the computer further outputs the likelihood of each of the selected at least one recognition result candidate.
  • the sensing data is an image
  • the recognition process is an object recognition process on the image
  • each of the plurality of recognition result candidates is a candidate for an object appearing in the image.
  • An information processing system according to one aspect of the present disclosure includes a computer that: obtains a plurality of recognition result candidates on sensing data, which are obtained by inputting the sensing data to a model that is trained by machine learning and that executes a recognition process, and the likelihood of each of the plurality of recognition result candidates; obtains an instruction to specify a part to be analyzed in the sensing data; selects at least one recognition result candidate from the plurality of recognition result candidates based on the relationship between each of the plurality of recognition result candidates and the part and the likelihood of each of the plurality of recognition result candidates; and outputs the selected at least one recognition result candidate.
  • the information processing system can output a recognition result candidate useful for the analysis processing or the like based on the designated portion. That is, the information processing system can output information for analyzing the output of the recognition process even when no object is detected in the recognition process.
  • These general or specific aspects may be realized by a system, an apparatus, a method, an integrated circuit, a computer program, or a non-transitory recording medium such as a computer-readable CD-ROM, or by any combination of systems, apparatuses, methods, integrated circuits, computer programs, and recording media.
  • FIG. 7 is a block diagram illustrating a configuration of the information processing system according to the present embodiment.
  • the information processing system 100 illustrated in FIG. 7 includes a recognition unit 101, a designation unit 102, a selection unit 103, and an analysis unit 104.
  • each component is configured by an electric circuit, and can transmit information to another component via wired or wireless communication, an input / output circuit, or the like.
  • The recognition unit 101 is an information processing unit that acquires sensing data and performs a recognition process on the sensing data. For example, the recognition unit 101 performs a recognition process on the sensing data by inputting the sensing data to the recognition model, and acquires a recognition result. At that time, the recognition unit 101 acquires a plurality of recognition result candidates on the sensing data and the likelihood of each of the plurality of recognition result candidates by inputting the sensing data to the recognition model. Then, the recognition unit 101 obtains a final recognition result indicating a recognition result candidate whose likelihood is equal to or larger than the threshold.
  • For example, the sensing data is an image, the recognition process is a process of detecting an object in the image, the recognition result is a detection result as shown in FIG. 4, and each of the plurality of recognition result candidates is a candidate for an object in the image.
  • the recognition model is a model trained by machine learning, and is a model that executes a recognition process.
  • the recognition unit 101 may train a recognition model using the correct answer data.
  • For example, the recognition model may be a mathematical model called a neural network. More specifically, a model that detects an object, such as an SSD (Single Shot MultiBox Detector), a Faster R-CNN (Faster Region-based Convolutional Neural Network), or a YOLO (You Only Look Once) model, may be used as the recognition model.
  • The recognition unit 101 also outputs, to the selection unit 103, the plurality of recognition result candidates on the sensing data and the likelihood of each of the plurality of recognition result candidates. That is, the recognition unit 101 outputs to the selection unit 103 a plurality of recognition result candidates on the sensing data and information indicating the likelihood of each of the plurality of recognition result candidates. The recognition unit 101 does not necessarily need to obtain the final recognition result indicating the recognition result candidate whose likelihood is equal to or larger than the threshold, and need not output the final recognition result.
  • the designation unit 102 is an information processing unit that acquires an instruction to designate a part to be analyzed in the sensing data. For example, when the sensing data is an image, the specifying unit 102 acquires an instruction to specify a part included in the image as a part to be analyzed.
  • the designating unit 102 may include an input interface or a communication interface for acquiring an instruction.
  • the input interface is a touch panel, a keyboard, a mouse, a microphone, or the like.
  • the communication interface is a communication terminal or an antenna for obtaining an instruction via a communication network.
  • the selection unit 103 is an information processing unit that selects at least one recognition result candidate from a plurality of recognition result candidates on the sensing data.
  • the recognition unit 101 acquires a plurality of recognition result candidates on the sensing data and the likelihood of each of the plurality of recognition result candidates by inputting the sensing data to the recognition model.
  • the selecting unit 103 acquires a plurality of recognition result candidates on the sensing data and the likelihood of each of the plurality of recognition result candidates from the recognition unit 101.
  • The selection unit 103 selects at least one recognition result candidate from the plurality of recognition result candidates based on the relationship between each of the plurality of recognition result candidates and the designated portion and the likelihood of each of the plurality of recognition result candidates. Then, the selection unit 103 outputs the selected at least one recognition result candidate. That is, the selection unit 103 outputs information indicating the selected at least one recognition result candidate.
  • The relationship between each of the plurality of recognition result candidates and the designated portion may be the presence or absence of overlap between each of the plurality of recognition result candidates and the designated portion, or may be the degree of the overlap.
  • The at least one recognition result candidate selected by the selection unit 103 may be a recognition result candidate that has an overlap with the designated part among the plurality of recognition result candidates, or may be a recognition result candidate that has a large overlap with the designated part among the plurality of recognition result candidates.
  • The relationship between each of the plurality of recognition result candidates and the specified portion may be the Euclidean distance between each of the plurality of recognition result candidates and the specified portion.
  • the at least one recognition result candidate selected by the selection unit 103 may be a recognition result candidate within a predetermined Euclidean distance from a specified portion among a plurality of recognition result candidates.
  • The at least one recognition result candidate selected by the selection unit 103 may be a recognition result candidate whose likelihood is higher than a threshold among the plurality of recognition result candidates. That is, the selection unit 103 may select, from among the plurality of recognition result candidates obtained from the recognition unit 101, a recognition result candidate that has a likelihood higher than the threshold value and that has an overlap, or a large overlap, with the designated portion.
  • the threshold for the selection unit 103 to select a recognition result candidate may be lower than the threshold for the recognition unit 101 to obtain the recognition result.
  • the selection unit 103 may lower the threshold value and select at least one recognition result candidate from the plurality of recognition result candidates until a predetermined number of recognition result candidates are selected from the plurality of recognition result candidates.
  • the analysis unit 104 is an information processing unit that performs an analysis process. For example, the analysis unit 104 performs an analysis process based on at least one recognition result candidate output from the selection unit 103, and presents a processing result of the analysis process.
  • the analysis unit 104 may include an output interface or a communication interface for presenting a processing result.
  • the output interface is a touch panel, a display, a speaker, or the like.
  • the communication interface is a communication terminal or an antenna for presenting a processing result via a communication network.
  • the analysis unit 104 presents a processing result indicating the degree to which each of a plurality of values included in the sensing data contributes to the likelihood of at least one recognition result candidate output from the selection unit 103.
  • the degree to which each of the plurality of values contributes to the likelihood of the recognition result candidate can also be expressed as the degree of contribution of each of the plurality of values to the likelihood of the recognition result candidate.
  • the processing result indicates the contribution of each of the plurality of pixel values to the likelihood of the recognition result candidate.
  • The analysis unit 104 may perform the analysis process using any one of a plurality of methods called PDA, LIME, Grad-CAM, and GBackProp. Thus, the analysis unit 104 may present a processing result indicating the contribution of each of the plurality of pixel values to the likelihood of the recognition result candidate.
  • the recognition unit 101 and the analysis unit 104 are optional components, and the information processing system 100 does not need to include them.
  • another recognition system may include the recognition unit 101
  • another analysis system may include the analysis unit 104.
  • the sensing data to be processed used in the information processing system 100 may not be an image.
  • the sensing data is not limited to two-dimensional data such as an image, but may be one-dimensional data such as waveform data obtained by a microphone or an inertial sensor.
  • the sensing data may be point cloud data obtained by, for example, LiDAR or radar, or three-dimensional data such as moving image data that is a plurality of time-series images. Further, the sensing data may be data of another dimension.
  • the dimension of the sensing data to be processed may be changed.
  • For example, when the sensing data is waveform data, the sensing data may be used as two-dimensional data by using the waveform data for a predetermined period. Further, when the sensing data is waveform data, the dimension of the sensing data may be changed by converting the waveform data into two-dimensional data including time and frequency, like a cepstrum.
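  • The following is a minimal sketch, not part of the original disclosure, of such a waveform-to-time-frequency conversion; the frame length, hop size, and Hann window are illustrative assumptions.

```python
import numpy as np

def waveform_to_time_frequency(waveform: np.ndarray,
                               frame_len: int = 256,
                               hop: int = 128) -> np.ndarray:
    """Return a (num_frames, frame_len // 2 + 1) magnitude spectrogram."""
    window = np.hanning(frame_len)
    num_frames = 1 + (len(waveform) - frame_len) // hop
    frames = np.stack([waveform[i * hop:i * hop + frame_len] * window
                       for i in range(num_frames)])
    # One row per time frame, one column per frequency bin.
    return np.abs(np.fft.rfft(frames, axis=1))

# One second of a 440 Hz tone sampled at 16 kHz becomes two-dimensional data.
t = np.arange(16000) / 16000.0
print(waveform_to_time_frequency(np.sin(2 * np.pi * 440 * t)).shape)  # (124, 129)
```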
  • the sensing data is point group data composed of points specified by positions in the horizontal direction, the vertical direction, and the depth direction
  • the point group data obtained in the horizontal direction and the vertical direction at a specific position in the depth direction may be used. That is, in this case, sensing data on a plane at a specific position in the depth direction may be used.
  • the sensing data may be a voice signal representing waveform data of voice.
  • the element to be recognized may be a word, a sentence, a speech section, or the like.
  • the sensing data may be point cloud data that is three-dimensional data. Then, in the three-dimensional object detection, the element to be recognized, that is, the element to be detected may be a three-dimensional object.
  • the designation unit 102 acquires an instruction to designate a part to be analyzed in the sensing data in the above dimensions. Then, when the selecting unit 103 selects a recognition result candidate, the relationship between each of the plurality of recognition result candidates and the designated part is used. At this time, the overlap ratio between the region specified as the analysis target portion and the recognition result candidate may be used. Alternatively, the Euclidean distance between the coordinates specified as the analysis target portion and the recognition result candidate may be used.
  • a condition that an overlap rate between a region specified as a part to be analyzed and a recognition result candidate is equal to or greater than a predetermined overlap rate may be used as a selection condition.
  • a condition that the Euclidean distance between the coordinates specified as the analysis target portion and the recognition result candidate is equal to or less than a predetermined Euclidean distance may be used as the selection condition.
  • the fact that the likelihood of the recognition result candidate is higher than the threshold value may be used as a selection condition in the selection unit 103.
  • The selection unit 103 may select, from one or more recognition result candidates extracted from the plurality of recognition result candidates output from the recognition unit 101 based on the relationship between each of the plurality of recognition result candidates and the designated part, at least one recognition result candidate having a relatively high likelihood. For example, the selection unit 103 may select a recognition result candidate having the highest likelihood from among the one or more recognition result candidates, or may select a predetermined number of recognition result candidates in descending order of likelihood, as in the sketch below.
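  • As a minimal sketch (not the claimed implementation; the candidate representation, the distance threshold, and the number k are illustrative assumptions), selecting by Euclidean distance from the designated coordinates and then by likelihood could look as follows.

```python
import math
from dataclasses import dataclass

@dataclass
class Candidate:
    cx: float          # horizontal coordinate of the candidate's box center
    cy: float          # vertical coordinate of the candidate's box center
    likelihood: float  # likelihood of the recognition result candidate

def select_by_distance_and_likelihood(candidates, designated_point, max_dist, k=1):
    """Keep candidates within max_dist of the designated coordinates,
    then return up to k of them in descending order of likelihood."""
    px, py = designated_point
    near = [c for c in candidates
            if math.hypot(c.cx - px, c.cy - py) <= max_dist]
    return sorted(near, key=lambda c: c.likelihood, reverse=True)[:k]

candidates = [Candidate(40, 40, 0.30), Candidate(45, 50, 0.55), Candidate(200, 10, 0.90)]
# Only the two candidates near (50, 50) are considered; the higher-likelihood one is returned.
print(select_by_distance_and_likelihood(candidates, (50, 50), max_dist=30, k=1))
```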
  • FIG. 8 is an image diagram showing an aspect of the information processing system 100 shown in FIG.
  • the information processing system 100 includes a computer 110 corresponding to a plurality of components included in the information processing system 100 illustrated in FIG.
  • the computer 110 may serve as a plurality of components included in the information processing system 100 illustrated in FIG.
  • the computer 110 may include a plurality of devices that are distributed and arranged.
  • the computer 110 may play the role of only the designation unit 102 and the selection unit 103 among the recognition unit 101, the designation unit 102, the selection unit 103, and the analysis unit 104 shown in FIG.
  • One or more other devices different from the computer 110 may serve as the recognition unit 101 and the analysis unit 104.
  • FIG. 9 is a flowchart showing the processing of the information processing system 100 shown in FIG.
  • the recognition unit 101 performs a recognition process on sensing data (S101). Specifically, the recognition unit 101 acquires a plurality of recognition result candidates and the likelihood of each of the plurality of recognition result candidates by inputting the sensing data to the recognition model. Then, the recognition unit 101 outputs the plurality of recognition result candidates and the likelihood of each of the plurality of recognition result candidates.
  • the selection unit 103 acquires a plurality of recognition result candidates and the likelihood of each of the plurality of recognition result candidates from the recognition unit 101 (S102).
  • the specifying unit 102 acquires an instruction to specify a part to be analyzed in the sensing data (S103).
  • the selection unit 103 acquires, via the specifying unit 102, an instruction to specify a part to be analyzed in the sensing data.
  • the selection unit 103 determines at least one recognition result from the plurality of recognition result candidates based on the relationship between each of the plurality of recognition result candidates and the designated portion and the likelihood of each of the plurality of recognition result candidates.
  • a candidate is selected (S104).
  • the selection unit 103 outputs at least one selected recognition result candidate (S105).
  • the analysis unit 104 performs an analysis process based on at least one recognition result candidate output from the selection unit 103, and presents a processing result of the analysis process (S106).
  • Thus, the information processing system 100 can output a recognition result candidate that is useful for the analysis processing based on the specified portion even when a valid recognition result cannot be obtained in the recognition processing. Therefore, the information processing system 100 can present the processing result of an appropriate analysis processing based on the recognition result candidate.
  • the processing performed by the information processing system 100 is not limited to the following example.
  • FIG. 10 is a conceptual diagram showing the processing of the recognition unit 101 shown in FIG.
  • the recognizing unit 101 acquires sensing data and performs a recognition process on the sensing data.
  • the sensing data is an image
  • the recognition process is a process of detecting an object in the image. That is, the recognition unit 101 acquires an input image and performs an object detection process on the input image.
  • the recognition unit 101 obtains a plurality of recognition result candidates and the likelihood of each of the plurality of recognition result candidates from an input image using a multilayer neural network.
  • the recognition result candidate is a candidate for a recognition target object in the input image, that is, a candidate for a detection target object in the input image.
  • the recognition unit 101 inputs an input image to an input layer of a multilayered neural network, and generates a feature map indicating a feature of the input image in each of a plurality of processing layers included in the multilayered neural network. Derive. Then, the recognition unit 101 derives a recognition result candidate that matches the feature map and its likelihood. Thereby, the recognition unit 101 derives a plurality of recognition result candidates and the likelihood of each of the plurality of recognition result candidates from one or more feature maps.
  • The recognition unit 101 then outputs the plurality of recognition result candidates and the likelihood of each of the plurality of recognition result candidates to the selection unit 103.
  • The recognition unit 101 may determine whether the recognition result candidate is a recognition target object, that is, whether the recognition result candidate is a detection target object, based on the likelihood of each of the plurality of recognition result candidates. Then, the recognizing unit 101 may output a recognition result indicating the recognition result candidate determined to be the recognition target object as the recognition target object, that is, the detection target object. However, the determination as to whether the object is a recognition target object and the output of the recognition result need not be performed.
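  • A minimal sketch of this interface follows; the model call and its return values are assumptions rather than the API of a specific detector such as SSD, Faster R-CNN, or YOLO. The recognition unit returns every recognition result candidate together with its likelihood, and thresholding into a final recognition result is a separate, optional step.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class RecognitionCandidate:
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image coordinates
    label: str                              # e.g. "dog"
    likelihood: float                       # likelihood of this candidate

def recognize(model, image) -> List[RecognitionCandidate]:
    """`model` is assumed to return raw, pre-threshold candidate boxes, labels, and scores."""
    boxes, labels, scores = model(image)
    return [RecognitionCandidate(tuple(box), label, float(score))
            for box, label, score in zip(boxes, labels, scores)]

def final_detections(candidates: List[RecognitionCandidate],
                     threshold: float = 0.5) -> List[RecognitionCandidate]:
    """Optional step: keep only candidates whose likelihood is equal to or larger than the threshold."""
    return [c for c in candidates if c.likelihood >= threshold]
```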
  • FIG. 11 is a conceptual diagram showing the processing of the specifying unit 102 shown in FIG.
  • the specifying unit 102 acquires an instruction to specify a part to be analyzed in the sensing data.
  • the sensing data is an image
  • the specifying unit 102 acquires an instruction to specify a part included in the image as a part to be analyzed.
  • the part to be analyzed may be a point or a position of the object to be analyzed, that is, the coordinates of the object to be analyzed.
  • the portion to be analyzed may be a region having a width, a size, or the like.
  • the coordinates are specified as a part to be analyzed.
  • the user of the information processing system 100 specifies a part to be analyzed in the sensing data via the input interface provided in the specifying unit 102.
  • the specification unit 102 acquires information indicating the specified part as an instruction.
  • the specifying unit 102 may include a screen for displaying an image and a mouse for acquiring an input. Then, the specifying unit 102 acquires the image input to the recognizing unit 101, and displays the acquired image on a screen. Then, the specifying unit 102 may obtain the coordinates clicked with the mouse on the screen on which the image is displayed, as an analysis target portion, more specifically, as an instruction to specify the analysis target portion.
  • The user can specify a part to be analyzed by clicking the object displayed on the screen with a mouse, and the specifying unit 102 can obtain an instruction to specify the part to be analyzed.
  • a touch panel may be used.
  • the specifying unit 102 may acquire the coordinates touched on the touch panel as a part to be analyzed.
  • FIG. 12 is a conceptual diagram showing another aspect of the process of the specifying unit 102 shown in FIG.
  • the part to be analyzed may be specified by an operation of painting a region corresponding to the part to be analyzed with a mouse. Then, the specifying unit 102 may acquire the filled area as a part to be analyzed.
  • FIG. 13 is a conceptual diagram showing still another mode of the process of the specifying unit 102 shown in FIG.
  • the part to be analyzed may be specified by an operation of surrounding the area corresponding to the part to be analyzed with the mouse.
  • a rectangular frame may be designated as a part to be analyzed by a mouse.
  • Such a designation method is also called rectangle selection.
  • the portion to be analyzed may be specified by not only a rectangle but also another polygon or another shape.
  • FIG. 14 is a conceptual diagram showing still another mode of the process of the specifying unit 102 shown in FIG.
  • the specifying unit 102 may read the correct answer data and acquire a region of the object indicated by the correct answer data as a part to be analyzed.
  • the correct answer data may be input via a recording medium or the like. Further, the correct answer data may be input to the recognition unit 101 together with the image. Then, the specifying unit 102 may acquire the correct answer data from the recognizing unit 101.
  • As shown in FIGS. 11 to 14, there are a plurality of methods for specifying a part to be analyzed. These methods include a method for specifying the coordinates of the analysis target (FIG. 11) and a method for specifying the region of the analysis target (FIGS. 12 to 14). One of these methods may be available, or two or more methods may be available.
  • the coordinates specified in the instruction to specify the analysis target coordinates as the analysis target part are also simply expressed as specified coordinates.
  • the area specified in the instruction to specify the area to be analyzed as the part to be analyzed is simply expressed as a specified area.
  • the part specified in the instruction for specifying the part to be analyzed is also simply expressed as a specified part.
  • the specifying unit 102 outputs to the selecting unit 103 an instruction to specify a coordinate or a region to be analyzed as a part to be analyzed.
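  • As a minimal sketch (the data structure is an illustrative assumption, not the disclosed implementation), the instruction passed from the designation unit 102 to the selection unit 103 can be thought of as carrying either designated coordinates or a designated region.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class AnalysisTargetInstruction:
    """Exactly one of the two fields is expected to be set."""
    coordinates: Optional[Tuple[float, float]] = None           # clicked or touched point
    region: Optional[Tuple[float, float, float, float]] = None  # (x1, y1, x2, y2) box

# A click at pixel (120, 80) ...
by_point = AnalysisTargetInstruction(coordinates=(120.0, 80.0))
# ... or a rectangle selection / painted area approximated by its bounding box.
by_region = AnalysisTargetInstruction(region=(100.0, 60.0, 180.0, 140.0))
```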
  • FIG. 15 is a conceptual diagram showing the processing of the selection unit 103 shown in FIG.
  • the selection unit 103 selects at least one recognition result candidate from a plurality of recognition result candidates on the sensing data.
  • the selecting unit 103 acquires a plurality of recognition result candidates on the sensing data and the likelihood of each of the plurality of recognition result candidates from the recognition unit 101.
  • the selection unit 103 acquires an instruction to specify a part to be analyzed from the specification unit 102.
  • The selection unit 103 selects at least one recognition result candidate from the plurality of recognition result candidates based on the likelihood of each of the plurality of recognition result candidates and the designated portion, and outputs the selected at least one recognition result candidate.
  • the selecting unit 103 selects a recognition result candidate corresponding to the designated portion and having a high likelihood from a plurality of recognition result candidates.
  • the selection unit 103 may select a plurality of recognition result candidates. Specifically, the selection unit 103 may select the top three recognition result candidates based on the likelihood.
  • The selection unit 103 selects and outputs at least one recognition result candidate corresponding to the specified portion, that is, at least one detection result candidate corresponding to the specified portion.
  • FIG. 16 is a flowchart showing the processing of the selection unit 103 shown in FIG. This example corresponds to the processing when the area to be analyzed is specified as the part to be analyzed.
  • the selecting unit 103 obtains an instruction to specify a region to be analyzed from the specifying unit 102, thereby obtaining the specified region as a region to be analyzed (S201). Then, the selection unit 103 repeats the likelihood threshold loop until the number of selection candidates becomes larger than the predetermined number (S202 to S212).
  • The selection unit 103 acquires a recognition result candidate list including recognition result candidates whose likelihood is higher than the threshold, based on the likelihood of each of the plurality of recognition result candidates and the threshold (S203).
  • the selection unit 103 may obtain a recognition result candidate list from the recognition unit 101 based on the likelihood and the threshold.
  • Alternatively, the selection unit 103 may obtain the plurality of recognition result candidates and the likelihood of each of the plurality of recognition result candidates from the recognition unit 101 in advance, before the likelihood threshold loop, and may obtain the recognition result candidate list in the likelihood threshold loop from the plurality of recognition result candidates obtained in advance.
  • Next, the selection unit 103 initializes the selection candidate list (S204). For example, recognition result candidates may have been added to the selection candidate list in a previous iteration of the likelihood threshold loop; in that case, the selection unit 103 deletes the recognition result candidates added to the selection candidate list from the selection candidate list. Then, the selection unit 103 repeats the recognition result candidate list loop (S205 to S210) until the recognition result candidate list becomes empty.
  • The selection unit 103 extracts one recognition result candidate from the recognition result candidate list, and deletes the extracted recognition result candidate from the recognition result candidate list (S206). Then, the selection unit 103 derives an overlap rate between the analysis target area and the extracted recognition result candidate area (S207). For example, the overlap rate corresponds to the rate at which the extracted recognition result candidate area is included in the analysis target area. IoU (Intersection over Union) may be used as the overlap rate.
  • the selection unit 103 determines whether or not the overlap rate is equal to or more than the threshold (S208). If the overlap rate is equal to or greater than the threshold (Yes in S208), the selection unit 103 updates the selection candidate list (S209). That is, in this case, the selection unit 103 adds the extracted recognition result candidates to the selection candidate list. When the overlap rate is smaller than the threshold (No in S208), the selection unit 103 does not update the selection candidate list. That is, in this case, the selection unit 103 does not add the extracted recognition result candidates to the selection candidate list.
  • the selection unit 103 repeats the recognition result candidate list loop (S205 to S210) until the recognition result candidate list becomes empty, and thereafter, the selection unit 103 reduces the likelihood threshold (S211). Then, the selection unit 103 repeats the likelihood threshold loop until the number of selection candidates becomes larger than the predetermined number (S202 to S212).
  • the number of selection candidates corresponds to the number of recognition result candidates included in the selection candidate list. That is, the recognition result candidates are selected while lowering the threshold value until more than a predetermined number of recognition result candidates are selected.
  • After the number of selection candidates becomes larger than the predetermined number, the selection unit 103 outputs the selection candidate list (S213).
  • By the above-described processing, the selection unit 103 can select at least one recognition result candidate from the plurality of recognition result candidates based on the likelihood of each of the plurality of recognition result candidates and the designated part, and can output the selected at least one recognition result candidate. A simplified code sketch of this selection flow follows.
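  • The following is a simplified code sketch of the selection flow of FIG. 16; the step numbers in the comments refer to FIG. 16, while the threshold step size, the use of IoU as the overlap rate, and the termination guard are illustrative assumptions.

```python
def iou(a, b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def select_candidates(candidates, target_area, min_count=1,
                      likelihood_threshold=0.5, overlap_threshold=0.3,
                      threshold_step=0.1):
    """candidates: objects with .box = (x1, y1, x2, y2) and .likelihood;
    target_area: the analysis target area as an (x1, y1, x2, y2) box."""
    selected = []
    # S202-S212: repeat until more than min_count candidates are selected
    # (the threshold check is a guard so that the sketch always terminates).
    while len(selected) <= min_count and likelihood_threshold >= 0.0:
        # S203: recognition result candidates whose likelihood is higher than the threshold.
        candidate_list = [c for c in candidates if c.likelihood > likelihood_threshold]
        selected = []                              # S204: initialize the selection candidate list.
        for c in candidate_list:                   # S205-S210: check each candidate.
            if iou(c.box, target_area) >= overlap_threshold:
                selected.append(c)                 # S209: overlap rate is large enough.
        likelihood_threshold -= threshold_step     # S211: lower the likelihood threshold.
    return selected                                # S213: output the selection candidate list.
```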
  • FIG. 17 is a flowchart showing another mode of the process of the selection unit 103 shown in FIG. This example corresponds to the process when the coordinates of the analysis target are specified as the analysis target portion.
  • the selection unit 103 acquires the designated coordinates by acquiring an instruction to designate the coordinates of the analysis target from the designation unit 102 (S301). Then, the selection unit 103 obtains the analysis target area by deriving the analysis target area based on the designated coordinates (S302). Subsequent processing is the same as the processing after the analysis target area is acquired in FIG. 16 (S202 to S213).
  • FIG. 18 is a conceptual diagram showing a process of deriving an analysis target area from designated coordinates in the selection unit 103 shown in FIG.
  • the selection unit 103 may derive a region in a range relatively determined from the designated coordinates as the region to be analyzed.
  • the analysis target area is an area defined by a circle having the specified coordinates at the center and having a certain radius.
  • the region to be analyzed may be a rectangular region having the designated coordinates at the center of gravity, or another polygonal region having the designated coordinates at the center of gravity.
  • the designating unit 102 may acquire an instruction for designating at least one of the shape and the size of the analysis target area, and output the instruction to the selecting unit 103.
  • the selection unit 103 may derive the analysis target area based on the designated coordinates and the designated shape or size.
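  • A minimal sketch of deriving an analysis target area from designated coordinates follows; the radius and box size are illustrative assumptions, and both the circular area of FIG. 18 and a rectangular area centred on the coordinates are shown. The rectangular form can be passed directly as the target area to the selection sketch above.

```python
def circular_target_area(designated_point, radius=20.0):
    """Analysis target area as a circle centred on the designated coordinates."""
    return (designated_point, radius)

def rectangular_target_area(designated_point, width=40.0, height=40.0):
    """Analysis target area as an (x1, y1, x2, y2) box centred on the designated coordinates."""
    px, py = designated_point
    return (px - width / 2, py - height / 2, px + width / 2, py + height / 2)

# A click at (120, 80) becomes, for example, the box (100.0, 60.0, 140.0, 100.0).
print(rectangular_target_area((120.0, 80.0)))
```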
  • FIG. 19 is a conceptual diagram showing another mode of the process of deriving the analysis target area from the designated coordinates in the selection unit 103 shown in FIG.
  • the selection unit 103 may derive an analysis target area based on the segmentation result of the image input to the recognition unit 101 and the designated coordinates. That is, the selection unit 103 may derive, as the analysis target region, a region of the segment including the designated coordinates among the plurality of segments indicated by the segmentation result of the image.
  • the segmentation of the image may be performed by a neural network, or may be performed by edge detection or the like of the image.
  • the image segmentation may be performed in the recognition unit 101, may be performed in the designation unit 102, may be performed in the selection unit 103, and may be performed in other components.
  • the selection unit 103 may acquire the segmentation result of the image from outside the information processing system 100.
  • FIGS. 18 and 19 are an example, and another processing that can obtain an equivalent result may be performed.
  • the selection unit 103 derives the analysis target area from the designated coordinates.
  • the specifying unit 102 may derive an area to be analyzed from the specified coordinates. Then, the specifying unit 102 may obtain an instruction to specify coordinates in the sensing data as an instruction to specify an area to be analyzed. In this case, the coordinates are not used as the part to be analyzed, but the area is used.
  • FIG. 20 is a conceptual diagram showing the processing of the analysis unit 104 shown in FIG.
  • the analysis unit 104 performs an analysis process based on at least one recognition result candidate output from the selection unit 103, and presents a processing result of the analysis process.
  • the analysis unit 104 presents a processing result indicating the degree to which each of a plurality of values included in the sensing data contributes to the likelihood of each of the at least one recognition result candidate output from the selection unit 103.
  • the sensing data is an image
  • the processing result indicates the contribution of each of the plurality of pixel values to the likelihood of the recognition result candidate.
  • Specifically, the analysis unit 104 derives the degree of contribution to the likelihood of the recognition result candidate, in the same manner as the derivation of the degree of contribution to the identification result, using any one of a plurality of methods called PDA, LIME, Grad-CAM, and GBackProp. Then, the analysis unit 104 presents a processing result indicating the degree of contribution on a screen or the like provided in the analysis unit 104.
  • the degree of positive contribution to the likelihood of the recognition result candidate is expressed by the density of points as the positive degree of contribution.
  • the degree of positively contributing to the likelihood of the recognition result candidate may be expressed by color density or the like for each pixel. Specifically, a portion where the dog's head is reflected and a portion where the dog's tail is reflected in the input image positively contribute to the likelihood of the recognition result candidate. In other words, the part where the dog's head is reflected and the part where the dog's tail is reflected guide the recognition result candidate in a direction in which the likelihood increases.
  • the degree of negative contribution is represented by the density of points.
  • the degree of negatively contributing to the likelihood of the recognition result candidate may be represented by a color density or the like for each pixel.
  • For example, the portion where the dog's front paws are reflected contributes negatively to the likelihood of the recognition result candidate.
  • That is, the portion where the dog's front paws are reflected leads the recognition result candidate in a direction in which the likelihood decreases.
  • Although the positive contribution and the negative contribution are represented separately here, they may be represented in combination.
  • the degree of positively contributing to the likelihood of the recognition result candidate may be expressed by red density
  • the degree of negatively contributing to the likelihood of the recognition result candidate may be expressed by blue density.
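  • The following is a deliberately simplified, occlusion-style sketch of the idea behind this analysis; it is not PDA, LIME, Grad-CAM, or GBackProp themselves, and the patch size and masking value are illustrative assumptions. The change in the candidate's likelihood when a patch of the image is masked out is used as the contribution of that patch; positive values could then be rendered in red and negative values in blue, as described above.

```python
import numpy as np

def contribution_map(image: np.ndarray, likelihood_fn, patch: int = 8) -> np.ndarray:
    """likelihood_fn(image) returns the likelihood of the selected recognition result candidate."""
    base = likelihood_fn(image)
    contrib = np.zeros(image.shape[:2], dtype=np.float32)
    for y in range(0, image.shape[0], patch):
        for x in range(0, image.shape[1], patch):
            masked = image.copy()
            masked[y:y + patch, x:x + patch] = 0   # mask out one patch of the image
            # Positive value: the patch pushed the likelihood up; negative: it pushed it down.
            contrib[y:y + patch, x:x + patch] = base - likelihood_fn(masked)
    return contrib
```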
  • The analysis unit 104 may present a plurality of processing results for a plurality of recognition result candidates by performing an analysis process on each of the plurality of recognition result candidates and presenting each processing result.
  • the analysis unit 104 may present a single processing result by integrating a plurality of processing results obtained for a plurality of recognition result candidates.
  • the information processing system 100 described above can be applied to a process of detecting a plurality of types of objects from an image. Further, in FIGS. 10 to 20, the object detection processing is used as the recognition processing, but other recognition processing such as a segmentation processing may be used.
  • Although the information processing system 100 has been described above based on the embodiment and the like, aspects of the information processing system 100 are not limited to the embodiment and the like. Modifications conceived by those skilled in the art may be made to the embodiments and the like, and a plurality of components in the embodiments and the like may be arbitrarily combined. For example, a process performed by a specific component in the embodiment and the like may be performed by another component instead of the specific component. Further, the order of the plurality of processes may be changed, or the plurality of processes may be executed in parallel.
  • the information processing method including the steps performed by each component of the information processing system 100 may be executed by an arbitrary device or system.
  • the information processing method may be executed by a computer (computer 110 or another computer) including a processor, a memory, an input / output circuit, and the like.
  • the information processing method may be executed by causing the computer to execute a program for causing the computer to execute the information processing method.
  • the program may be recorded on a non-transitory computer-readable recording medium.
  • The above program causes a computer to execute an information processing method of: obtaining a plurality of recognition result candidates on sensing data, which are obtained by inputting the sensing data to a model that is trained by machine learning and executes a recognition process, and the likelihood of each of the plurality of recognition result candidates; obtaining an instruction to specify a part to be analyzed in the sensing data; selecting at least one recognition result candidate from the plurality of recognition result candidates based on the relationship between each of the plurality of recognition result candidates and the part and the likelihood of each of the plurality of recognition result candidates; and outputting the selected at least one recognition result candidate.
  • The plurality of components of the information processing system 100 may be configured by dedicated hardware, may be configured by general-purpose hardware that executes the above-described program or the like, or may be configured by a combination of these.
  • The general-purpose hardware may be configured by a memory in which the program is recorded and a general-purpose processor that reads out the program from the memory and executes it.
  • The memory may be a semiconductor memory or a hard disk, and the general-purpose processor may be a CPU or the like.
  • dedicated hardware may be configured with a memory, a dedicated processor, and the like.
  • a dedicated processor may execute the above information processing method with reference to a memory in which information of a recognition model is recorded.
  • Each component of the information processing system 100 may be an electric circuit. These electric circuits may constitute one electric circuit as a whole, or may be separate electric circuits. Further, these electric circuits may correspond to dedicated hardware, or may correspond to general-purpose hardware that executes the above-described programs and the like.
  • The computer 110 obtains a plurality of recognition result candidates on sensing data, which are obtained by inputting the sensing data to a model, and the likelihood of each of the plurality of recognition result candidates (S102).
  • the model is a model that is trained by machine learning and that executes a recognition process.
  • The computer 110 acquires an instruction to specify a part to be analyzed in the sensing data (S103). Then, the computer 110 selects at least one recognition result candidate from the plurality of recognition result candidates based on the relationship between each of the plurality of recognition result candidates and the part to be analyzed and the likelihood of each of the plurality of recognition result candidates (S104). Then, the computer 110 outputs the selected at least one recognition result candidate (S105).
  • the computer 110 presents a processing result based on the output at least one recognition result candidate.
  • This makes it possible to present a processing result of, for example, analysis processing based on the selected recognition result candidate.
  • the processing result indicates the degree to which each of the plurality of values included in the sensing data contributes to the likelihood of each of the at least one output recognition result candidate. This makes it possible to present the contribution of each of the plurality of values in the sensing data to the selected recognition result candidate.
  • For example, the relationship between each of the plurality of recognition result candidates and the portion to be analyzed is at least one of the presence or absence of overlap between each of the plurality of recognition result candidates and the portion to be analyzed and the degree of overlap between each of the plurality of recognition result candidates and the portion to be analyzed. As a result, it is possible to output the recognition result candidates selected based on the presence or absence or the degree of overlap with the designated part.
  • each of the at least one recognition result candidate selected from the plurality of recognition result candidates is a recognition result candidate whose likelihood is higher than the threshold value among the plurality of recognition result candidates. This makes it possible to select a recognition result candidate having a high likelihood, and to output a more useful recognition result candidate.
  • For example, when the number of selected recognition result candidates is less than a predetermined number, the computer 110 lowers the threshold value. Then, the computer 110 selects the at least one recognition result candidate based on the relationship, the likelihood, and the lowered threshold.
  • the relationship is the relationship between each of the plurality of recognition result candidates and the part to be analyzed
  • the likelihood is the likelihood of each of the plurality of recognition result candidates
  • The threshold is the threshold value for selecting the at least one recognition result candidate.
  • the predetermined number may be 0, 1, or 2 or more.
  • the computer 110 outputs the likelihood of each of the at least one selected recognition result candidate. This makes it possible to output the likelihood of the selected recognition result candidate as information useful for analysis processing and the like.
  • the sensing data is an image
  • the recognition process is an object recognition process on the image
  • each of the plurality of recognition result candidates is a candidate for an object appearing in the image. This makes it possible to output a candidate for an object appearing in the image with respect to the object recognition processing on the image.
  • the information processing system 100 includes the computer 110.
  • The computer 110 obtains a plurality of recognition result candidates on sensing data, which are obtained by inputting the sensing data to a model that is trained by machine learning and executes a recognition process, and the likelihood of each of the plurality of recognition result candidates (S102).
  • The computer 110 acquires an instruction to specify a part to be analyzed in the sensing data (S103). Then, the computer 110 selects at least one recognition result candidate from the plurality of recognition result candidates based on the relationship between each of the plurality of recognition result candidates and the analysis target portion and the likelihood of each of the plurality of recognition result candidates (S104). Then, the computer 110 outputs the selected at least one recognition result candidate (S105).
  • the information processing system 100 can output a recognition result candidate useful for the analysis processing or the like based on the designated portion. That is, the information processing system 100 can output information for analyzing the output of the recognition process even when no object is detected in the recognition process.
  • the present disclosure can be used to output information for analyzing a recognition process, and is applicable to an information processing system or the like for improving a recognition system or an analysis system.
  • REFERENCE SIGNS LIST
    100 information processing system
    101 recognition unit
    102 designation unit
    103 selection unit
    104 analysis unit
    110 computer

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

In this information processing method, a computer: acquires a plurality of recognition result candidates on sensing data, obtained by inputting the sensing data to a model that has been trained by machine learning and that executes recognition processing, and acquires the likelihood of each of the plurality of recognition result candidates (S102); acquires an instruction specifying a portion to be analyzed in the sensing data (S103); selects at least one recognition result candidate from the plurality of recognition result candidates on the basis of the relationship between said portion and each of the plurality of recognition result candidates and the respective likelihoods of the plurality of recognition result candidates (S104); and outputs the selected recognition result candidate or candidates (S105).
PCT/JP2019/027259 2018-09-07 2019-07-10 Information processing method and system WO2020049864A1 (fr)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201980005088.7A 2018-09-07 2019-07-10 Information processing method and information processing system
EP19857798.3A 2018-09-07 2019-07-10 Information processing method and system
US16/849,334 US11449706B2 (en) 2018-09-07 2020-04-15 Information processing method and information processing system

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201862728418P 2018-09-07 2018-09-07
US62/728,418 2018-09-07
JP2019076509A JP7287823B2 (ja) 2018-09-07 2019-04-12 情報処理方法及び情報処理システム
JP2019-076509 2019-04-12

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/849,334 Continuation US11449706B2 (en) 2018-09-07 2020-04-15 Information processing method and information processing system

Publications (1)

Publication Number Publication Date
WO2020049864A1 true WO2020049864A1 (fr) 2020-03-12

Family

ID=69721789

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2019/027259 WO2020049864A1 (fr) 2018-09-07 2019-07-10 Procédé et système de traitement d'informations

Country Status (1)

Country Link
WO (1) WO2020049864A1 (fr)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002074368A * 2000-08-25 2002-03-15 Matsushita Electric Ind Co Ltd Moving object recognition and tracking device
JP2013125524A * 2011-12-16 2013-06-24 Hitachi High-Technologies Corp Learning device and gene learning method
JP2018010626A * 2016-06-30 2018-01-18 Canon Inc. Information processing apparatus and information processing method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LUISA M ZINTGRAF ET AL., VISUALIZING DEEP NEURAL NETWORK DECISIONS: PREDICTION DIFFERENCE ANALYSIS, 15 February 2017 (2017-02-15), Retrieved from the Internet <URL:https://arxiv.org/pdf/1702.04595.pdf>
See also references of EP3848889A4 *

Similar Documents

Publication Publication Date Title
JP7287823B2 (ja) Information processing method and information processing system
US11768974B2 (en) Building information model (BIM) element extraction from floor plan drawings using machine learning
US11586785B2 (en) Information processing apparatus, information processing method, and program
WO2019091417A1 (fr) Neural network-based identification method and device
WO2018099194A1 (fr) Character identification method and device
JP7252188B2 (ja) Image processing system, image processing method, and program
US20150154442A1 (en) Handwriting drawing apparatus and method
KR20160101683A (ko) Formula input method and apparatus
CN109343920B (zh) Image processing method and apparatus, device, and storage medium
KR102161052B1 (ko) Method and apparatus for separating an object from an image
JP6337973B2 (ja) Additional learning device, additional learning method, and additional learning program
JP2017511917A (ja) Method and apparatus for recognizing musical symbols
US11164036B2 (en) Human-assisted machine learning through geometric manipulation and refinement
JP2019220014A (ja) Image analysis device, image analysis method, and program
CN115861400A (zh) Target object detection method, training method, apparatus, and electronic device
CN111738252B (zh) Method, apparatus, and computer system for detecting text lines in an image
CN111492407B (zh) System and method for drawing beautification
Meng et al. Globally measuring the similarity of superpixels by binary edge maps for superpixel clustering
JP6914724B2 (ja) Information processing apparatus, information processing method, and program
WO2020049864A1 (fr) Information processing method and system
JP6694638B2 (ja) Program, information storage medium, and recognition device
EP3951616A1 (fr) Identification information adding device, identification information adding method, and program
CN114387600A (zh) Text feature recognition method and apparatus, computer device, and storage medium
KR20200005853A (ko) Deep structured learning-based person counting method and system
WO2021131127A1 (fr) Identification information adding device, identification information adding method, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19857798

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2019857798

Country of ref document: EP

Effective date: 20210407