WO2023089888A1 - Machine learning method, machine learning program, machine learning device, and information processing device - Google Patents
Machine learning method, machine learning program, machine learning device, and information processing device
- Publication number
- WO2023089888A1 (PCT/JP2022/031342)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- series data
- machine learning
- size adjustment
- unit
- series
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
Definitions
- the present invention relates to a machine learning method, a machine learning program, a machine learning device, and an information processing device.
- Machine learning such as deep learning generally requires training with a large amount of high-quality teacher (training) data in order to achieve a certain level of object recognition accuracy.
- In the method of Non-Patent Document 1, entropy is analyzed for speech data at a cycle of 15 msec, a sampling rate is set according to the analysis result, and training data for learning is generated.
- Non-Patent Document 1 relates to audio data and requires advanced interpolation processing as preprocessing.
- the present invention was made to solve such problems.
- The present invention aims to provide a machine learning device and a machine learning method that easily generate learning data without the need for advanced preprocessing, and that use this data for learning to generate a learning model with improved robustness against condition changes in the series direction.
- A machine learning method for generating a learning model for extracting features of interest, the method executing a process including: step (a) of acquiring series data; step (b) of performing preprocessing for size adjustment in the series direction on the series data based on a predetermined condition, thereby generating, from one piece of series data, a plurality of post-adjustment series data with different intervals in the series direction; and step (c) of performing supervised learning using the plurality of adjusted series data generated in step (b) to generate a learning model.
- In step (a), a label of the series data is acquired together with the series data;
- in step (c), the label of the one piece of series data is applied to the plurality of adjusted series data to perform supervised learning.
- The machine learning method according to (1) or (2) above, wherein in step (b), conditions for size adjustment are automatically set based on the predetermined condition.
- The series data acquired in step (a) is time-series image data obtained by photographing a target object within the photographing area;
- the learning model is a learning model for extracting features of a target object.
- The machine learning method according to (4) above, wherein in step (b), the size adjustment condition is set according to the sampling rate or the number of frames of the series data as the predetermined condition.
- The machine learning method according to (4) or (5) above, further comprising step (d) of acquiring external information about the shooting environment, wherein in step (b), the size adjustment condition is set based on the external information as the predetermined condition.
- The machine learning method according to (8) above, wherein in step (b), a size adjustment condition is set according to the number of key frames detected in step (e).
- The machine learning method according to (8) or (9) above, wherein in step (b), only the key frames are targeted for the size adjustment.
- The machine learning method according to any one of (8) to (10) above, wherein in step (b), the method of size adjustment is made different before and after the reference frame in the direction in which the series data are arranged.
- A machine learning device that generates a learning model for extracting features of a target, the device comprising: an acquisition unit that acquires series data; a preprocessing unit that generates, from one piece of series data, a plurality of post-adjustment series data with different intervals in the series direction by performing preprocessing for size adjustment in the series direction on the series data based on a predetermined condition; and a learning unit that performs supervised learning using the plurality of adjusted series data generated by the preprocessing unit to generate a learning model.
- the acquisition unit acquires a label of the series data together with the series data,
- the machine learning device according to (12) above, wherein the learning unit applies the label of one of the series data to the plurality of adjusted series data to perform supervised learning.
- the series data acquired by the acquisition unit is time-series image data obtained by photographing a target object within an imaging region;
- the machine learning device according to any one of (12) to (14) above, wherein the learning model is a learning model for extracting features of a target object.
- The machine learning device according to (15) or (16) above, wherein the acquisition unit further acquires external information regarding the shooting environment, and the preprocessing unit sets the size adjustment condition based on the external information as the predetermined condition.
- series data is acquired, and preprocessing for size adjustment in the series direction is performed on the series data based on a predetermined condition, thereby obtaining one series data generates a plurality of adjusted series data with different intervals in the series direction, and supervised learning is performed using the generated plurality of adjusted series data to generate a learning model.
- As a result, multiple sets of training data with different intervals between series data can be easily generated without the need for advanced preprocessing, and training using these data can generate a learning model with improved robustness against changes in conditions in the series direction.
- FIG. 1 is a diagram showing a schematic configuration of an information processing device according to an embodiment of the present invention.
- FIG. 2 is a side view showing an example of an object to be inspected by the information processing device shown in FIG. 1.
- FIG. 3 is a block diagram showing the configuration of the information processing device.
- FIG. 4 is a functional block diagram showing the flow of data in the machine learning device realized by the functioning of the control unit.
- FIG. 5 is an example of series data.
- FIG. 6 is a flowchart showing machine learning processing of the machine learning device.
- FIG. 7A is a subroutine flowchart showing the process of setting size adjustment conditions in step S53.
- FIG. 7B is a subroutine flowchart showing the size adjustment condition setting process in step S53 in another example.
- FIG. 8 is an example of multiple post-adjustment series data generated by preprocessing.
- FIG. 9 is an example of post-adjustment series data generated under another size adjustment condition.
- FIG. 10 is a schematic diagram for explaining a machine learning method using post-adjustment series data.
- FIG. 11 is a functional block diagram showing the flow of data in inspection processing of an information processing device using a learning model generated by machine learning.
- FIG. 12 is a flowchart showing the inspection processing.
- FIG. 1 is a diagram showing a schematic configuration of an inspection system 1 including an information processing device according to this embodiment.
- the inspection system 1 is composed of a sequence data input device 30 and an information processing device 10, which are communicably connected to each other via a network 90 such as a LAN.
- The series data input device 30 generates series data and inputs it to the information processing device 10.
- The series data input device 30 includes a camera 310.
- In addition to the camera 310, the series data input device 30 may include a detection device that outputs detection data, such as a three-dimensional ranging sensor such as LiDAR (Light Detection and Ranging), a temperature sensor installed in a factory, or a pressure sensor, and may include an HDD (hard disk drive) or the like that records the series data obtained from these devices.
- The information processing device 10 performs machine learning using the series data from the series data input device 30 to generate a machine learning model.
- Series data is a data group in which a plurality of data are arranged according to predetermined order information. Examples include photographed data (time-series image data) obtained by the camera 310, three-dimensional data in which two-dimensional image data are arranged according to position information in the direction perpendicular to the two-dimensional data, audio data in which voices uttered by people are arranged in time series, and ranging point cloud data obtained from a three-dimensional ranging sensor. In the following description, photographed data (moving images) obtained by the camera 310 is taken as the example of series data.
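As a concrete (hypothetical) model of such series data, each frame can be carried together with its order information; the names `Frame` and `make_series` below are invented for illustration and do not appear in the patent:

```python
# Illustrative sketch only: "series data" as frames arranged by a key
# such as the capture time step. Frame/make_series are invented names.
from dataclasses import dataclass, field

@dataclass
class Frame:
    index: int                                   # predetermined order information
    pixels: list = field(default_factory=list)   # placeholder for 2-D image data

def make_series(num_frames: int) -> list:
    """Build one piece of series data: frames arranged by order info."""
    return [Frame(index=i) for i in range(num_frames)]

series = make_series(60)   # e.g., one second of video at 60 FPS
print(len(series), series[0].index, series[-1].index)
```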
- FIG. 2 shows an example of a predetermined object inspected by the inspection system 1.
- the object is a long sheet metal member, which is conveyed from the right hand side to the left hand side along the conveying direction in FIG. 2 by a belt conveyor (not shown).
- The information processing device 10 of the inspection system 1 extracts defects in the surface coating of the sheet-metal member (indicated as points of interest in FIG. 2) as features of the object, and outputs the extraction result.
- The object is not limited to this; it may be a product such as a vehicle, or component parts for such a product, conveyed continuously by a belt conveyor, and specific features (defective products, missing items, etc.) may be extracted and the extraction results output.
- FIG. 3 is a block diagram showing the configuration of the information processing device 10. The information processing device 10 includes a control unit 11, a storage unit 12, an operation display unit 13, and a communication unit 14. These are interconnected via signal lines such as buses for exchanging signals.
- the control unit 11 functions as a machine learning device and includes multiple CPUs, multiple GPUs (Graphics Processing Units), RAM, ROM, etc., and performs machine learning and controls each device according to a program.
- the information processing device 10 may be an on-premise server or a cloud server using a commercial cloud service. Also, a part of the functions of the information processing device 10 (for example, only the functions of the machine learning device) may be realized by a cloud server.
- the storage unit 12 is composed of a semiconductor memory that stores various programs and various data in advance, and a magnetic memory such as a hard disk.
- a machine learning model 200 (also referred to as a learned model) that has been learned, generated, and updated by machine learning is stored in the storage unit 12 .
- The storage unit 12 stores the following three types of data d1 to d3: (d1) a large number of series data generated by the series data input device 30, (d2) external information, and (d3) extraction conditions for a point of interest.
- a label (correct label) is associated with each series data (d1) and stored.
- The external information (d2) is information about the shooting environment, such as the sampling rate or frame rate (FPS) of the camera 310, or the moving speed of the object (that is, the conveying speed of the belt conveyor). If the series data is audio data, it is the sampling rate.
- The extraction condition (d3) is a preset rule; as a rule-based algorithm using it, an image processing algorithm for detecting a point of interest, such as pattern matching or edge detection processing, can be applied. This extraction condition (d3), or an algorithm using it, is used for the detection processing of the detection unit 112, which will be described later.
- the operation display unit 13 is, for example, a touch panel display, which displays various information and accepts various inputs from the user.
- the user can set the above-described imaging environment (external information) via the operation display unit 13 .
- Assignment of labels to each series data may be performed via this operation display unit 13, or may be performed by a pre-labeling process using a rule-based algorithm or a machine learning model. These settings or given information are stored in the storage unit 12 .
- the communication unit 14 is an interface that transmits and receives data via a network.
- communication is performed according to standards such as Ethernet, Bluetooth (registered trademark), and IEEE802.11 (Wi-Fi).
- FIG. 4 is a functional block diagram showing the flow of data in the machine learning device realized by the functioning of the control unit 11.
- the control unit 11 functions as an acquisition unit 111 by cooperating with the communication unit 14 .
- Control unit 11 also functions as detection unit 112 , preprocessing unit 113 , and learning unit 114 .
- Acquisition unit 111 acquires external information and a plurality of training data from series data input device 30 or storage unit 12 .
- Training data consists of sets of series data and labels.
- the detection unit 112 receives series data from the acquisition unit 111 .
- FIG. 5 is an example of series data.
- The series data here is photographed data captured over a predetermined interval (time t−α to t+α). One piece of series data consists of, for example, 30, 60, or 120 frames (still images); the predetermined interval and FPS can be set as appropriate. Here, one piece of series data consists of 60 frames.
- the series data used as training data is generated in advance by photographing an object having a spot of interest to be inspected (for example, a defect such as uneven coating in part) while being moved by a belt conveyor.
- In FIG. 5, the point of interest (coating unevenness) is shown in white for ease of understanding.
- the detection unit 112 detects a frame (hereinafter also referred to as a key frame) containing a target point from among a plurality of frames forming series data based on a preset extraction condition (d3).
- a detection result is sent to the preprocessing unit 113 .
- the frame numbers of key frames are sent.
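The key-frame detection step above can be sketched as a rule-based filter over the frames; the predicate below (a simple threshold standing in for pattern matching or edge detection) and all names are hypothetical:

```python
# Hedged sketch of the detection unit's key-frame search: a rule-based
# check marks the frames that contain the point of interest and their
# frame numbers are passed on, as the detection unit 112 does.
def detect_keyframes(series, contains_point_of_interest):
    """Return frame numbers of frames satisfying the extraction condition."""
    return [i for i, frame in enumerate(series)
            if contains_point_of_interest(frame)]

# Toy series: scalar values stand in for images; >0.5 means the
# point of interest (e.g., coating unevenness) is visible.
series = [0.1, 0.2, 0.7, 0.9, 0.8, 0.3]
keyframes = detect_keyframes(series, lambda f: f > 0.5)
print(keyframes)
```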
- The preprocessing unit 113 adjusts the size of the series data in the series direction based on a predetermined condition, and generates a plurality of post-adjustment series data having different intervals in the series direction.
- The predetermined conditions include the following conditions A1 to A3 (hereinafter collectively referred to as predetermined condition A):
- (A1) sampling rate or number of frames,
- (A2) external information (e.g., moving speed or camera specifications), and
- (A3) key frame information.
- (A1) is information indicating the characteristics of series data stored in the storage unit 12 in advance, and is set by the user, for example.
- (A2) The external information is obtained from the sequence data input device 30.
- (A3) The keyframe information is information on the number of keyframes and/or the position of the reference frame (see below), and is determined based on the keyframe information acquired from the detection unit 112 .
- the preprocessing unit 113 sets a reference frame from the series data.
- This reference frame is set from among the key frames detected by the detection unit 112 .
- the key frame at time t is set as the reference frame.
- This reference frame is set according to a predetermined condition (hereinafter also referred to as a predetermined condition B).
- As the predetermined condition B, when a plurality of key frames are detected, for example the key frame at the central position of the arrangement may be set as the reference frame, or the frame whose image captures the edge of the point of interest (the boundary between black and white in the figure) may be set as the reference frame.
- The preprocessing unit 113 sets size adjustment conditions from the predetermined conditions A1 and A2. For example, in an inspection apparatus, when the speed range of movement of the object is predetermined, the variation of images that can occur within that speed range is increased (the number of types of post-adjustment series data is increased). Similarly, depending on the specifications of the camera, the number of image variations that can be generated is increased so as to cover the frame rate. Based on the size of the point of interest relative to the shooting area (in the conveying direction) and the moving speed, the number of frames in which the point of interest exists in the shooting area (hereinafter, existing frames and the number of existing frames) is determined, and size adjustment is performed accordingly to generate a plurality of post-adjustment series data.
- For example, the preprocessing unit 113 performs size adjustment by extracting several frames before and after the reference frame, skipping one frame, two frames, or the like within that range.
- Only existing frames may be targeted for size adjustment, or the method of size adjustment may differ before and after the reference frame in the alignment direction of the series data.
- As the size adjustment, interpolation or extrapolation processing may be performed in addition to thinning. For example, if the number of existing frames is less than a predetermined number, intermediate frames are generated by interpolating between preceding and following frames. Specific examples of size adjustment will be described later.
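A minimal sketch of the two size-adjustment primitives just described, thinning and interpolation. The function names and the midpoint-interpolation scheme are illustrative assumptions, not the patent's exact procedure; scalars stand in for image frames:

```python
def thin(series, skip):
    """Thinning: keep every (skip+1)-th frame; skip=1 drops every other frame."""
    return series[::skip + 1]

def interpolate(series, target_len):
    """If too few frames exist, insert midpoints between neighbours
    until the series reaches target_len (a crude interpolation stand-in)."""
    out = list(series)
    while len(out) < target_len:
        mids = [(a + b) / 2 for a, b in zip(out, out[1:])]
        merged = []
        for a, m in zip(out, mids):
            merged += [a, m]
        out = merged + [out[-1]]
    return out[:target_len]

frames = list(range(8))           # stand-in for 8 image frames
print(thin(frames, 1))            # one-frame skipping
print(interpolate([0, 4, 8], 5))  # grow 3 frames to 5 by midpoints
```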
- The learning unit 114 performs supervised machine learning using, as training data, the plurality of adjusted series data with different intervals in the series direction after size adjustment together with the labels attached to them, and generates or updates the machine learning model 200.
- one label assigned to one series data is commonly applied to a plurality of post-adjustment series data generated based on this series data.
- In the machine learning processing of FIG. 6, as the series data, photographed data composed of 60 time-series images is taken as an example, and the amount of data per piece is reduced by decimation (thinning) processing in the time direction as the size adjustment in the series direction.
- FIG. 6 is a flowchart showing machine learning processing executed by the control unit 11 functioning as a machine learning device.
- a plurality of post-adjustment series data with different intervals are generated from each of the plurality of series data by the processes of steps S51 to S55. This increases the number of samples (the number of training data) and reduces the amount of each data.
- step S56 a learning model is generated and updated by machine learning using the adjusted series data.
- Step S51 the acquisition unit 111 of the control unit 11 acquires external information.
- The external information is obtained directly from the series data input device 30 as described above, or is set by the user via the operation display unit 13 and stored in the storage unit 12.
- Step S52: The acquisition unit 111 acquires the training data directly from the series data input device 30 or from the storage unit 12.
- the training data consists of a plurality of series data, and each series data is labeled.
- Step S53 the preprocessing unit 113 automatically sets the conditions for size adjustment by itself or in cooperation with the detection unit 112 .
- FIG. 7A is a subroutine flowchart showing the size adjustment condition setting process of step S53 in one example
- FIG. 7B is a subroutine flowchart showing the size adjustment condition setting process of step S53 in another example.
- the preprocessing unit 113 sets a plurality of size adjustment conditions based on the predetermined condition A.
- Here, the predetermined condition is the number of frames constituting the series data (predetermined condition A1); the larger the number of frames, the higher the thinning rate that is set.
- For 30 frames, for example, one- and two-frame skipping is set; for 60 frames, one- to three-frame skipping is set. For example, if one frame is thinned out of 60 frames (0 to 59), the odd-numbered frames are deleted and the even-numbered frames (0, 2, 4, 6, …, 58) remain.
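The one-frame thinning over 60 frames described above reduces to plain list slicing; this is only an illustration of the arithmetic, not the patent's implementation:

```python
frames = list(range(60))   # frame numbers 0 to 59
one_skip = frames[::2]     # thin out 1 frame: even-numbered frames remain
two_skip = frames[::3]     # thin out 2 frames: every third frame remains
print(one_skip[:5], len(one_skip))
print(two_skip[:5], len(two_skip))
```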
- Step S621: In another example shown in FIG. 7B, the detection unit 112 extracts key frames from the series data based on the extraction condition (d3).
- Step S622 the preprocessing unit 113 sets a reference frame.
- This reference frame is set from among the key frames detected by the detection unit 112 in step S621.
- the frame at time t is set as the reference frame based on the predetermined condition B described above.
- Step S623: The preprocessing unit 113 sets a plurality of size adjustment conditions based on a combination of the number of existing frames determined by predetermined condition A1 or A2 and predetermined condition A3 (key frame information), or based on predetermined condition A3 alone (see FIG. 8). With this, the processing in FIG. 7B is completed, and the process returns to the processing in FIG. 6 (return).
- Step S54 Please refer to FIG. 6 again.
- the preprocessing unit 113 performs size adjustment based on the size adjustment condition set in step S53, and generates a plurality of post-adjustment series data with a plurality of intervals different from each other from one series data.
- FIG. 8 is an example of multiple post-adjustment series data generated by preprocessing.
- The frames shown in FIG. 8 correspond to those in FIG. 5; in FIG. 8, the frames remaining after adjustment are surrounded by solid-line square frames, and the rest (that is, the frames to be deleted) are shown in light gray.
- In the adjusted series data x1 shown in FIG. 8A, as the adjustment condition set in step S623, consecutive frames in a predetermined section (three frames in the drawing) are extracted (the frames at times t−1, t, and t+1).
- In the adjusted series data x2 in FIG. 8B, as another adjustment condition set in step S623, three frames are extracted by thinning out every other frame centered on the reference frame (times t−2, t, and t+2).
- Here, an example in which the series data after adjustment is composed of three frames is shown, but the data is not limited to this and may be composed of more than three frames.
- the post-adjustment series data may be composed only of existing frames (or key frames) in which the point of interest exists in the shooting area, but may include frames other than existing frames.
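The two adjustment conditions above, consecutive extraction (x1) and one-frame thinning centered on the reference frame (x2), can be sketched with a single window helper. `window` and its parameters are invented for illustration:

```python
def window(series, ref, skip, half_width=1):
    """Extract frames around reference index `ref` with a given skip:
    skip=0 takes consecutive frames, skip=1 skips every other frame."""
    step = skip + 1
    idx = range(ref - half_width * step, ref + half_width * step + 1, step)
    return [series[i] for i in idx if 0 <= i < len(series)]

frames = list(range(60))
ref = 30                          # reference (key) frame at "time t"
x1 = window(frames, ref, skip=0)  # consecutive: t-1, t, t+1
x2 = window(frames, ref, skip=1)  # one-frame thinning: t-2, t, t+2
print(x1, x2)
```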
- FIG. 9 is an example of post-adjustment series data generated under another size adjustment condition.
- FIG. 9(a) shows post-adjustment series data generated by thinning out one frame centered on the reference frame (t), FIG. 9(b) by thinning out two frames, and FIG. 9(c) by a method of adjustment that differs before and after the reference frame (t) (random thinning). Specifically, in the example of FIG. 9(c), the thinning rate differs before and after the reference frame.
- The adjustment conditions shown in FIG. 9 may be applied in combination with, or instead of, the adjustment conditions shown in FIG. 8.
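The asymmetric adjustment of FIG. 9(c), with different thinning rates before and after the reference frame, might be sketched as follows; the function name and parameters are illustrative assumptions:

```python
def asymmetric_thin(series, ref, skip_before, skip_after):
    """Apply a different thinning rate before and after the reference
    frame at index `ref`, keeping the reference frame itself."""
    before = series[:ref][::skip_before + 1]
    after = series[ref + 1:][::skip_after + 1]
    return before + [series[ref]] + after

frames = list(range(10))
print(asymmetric_thin(frames, ref=5, skip_before=1, skip_after=2))
```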
- Step S55 If the size adjustment has not been completed for all the training data, the control unit 11 returns the process to step S52 and repeats the subsequent processes. When the size adjustment for all data sets of training data is completed, the process proceeds to step S56.
- Step S56: The control unit 11, functioning as the machine learning device, reads the adjusted series data and their labels as training data and performs machine learning.
- FIG. 10 is a schematic diagram for explaining a machine learning method using post-adjustment series data.
- a plurality of post-adjustment series data x1 and x2 are generated from one series data x with which the label X is linked.
- the label X associated with the original series data x is commonly applied to these adjusted series data x1 and x2.
- Although FIG. 10 shows an example in which two adjusted series data x1 and x2 are generated, three or more adjusted series data with different intervals may be generated and used for machine learning. For example, as shown in FIGS. 8 and 9, four pieces of adjusted series data x1 to x4 with mutually different intervals in the series direction may be generated.
- the size of series data is adjusted and the number of samples is increased by performing similar size adjustments on many other series data.
- these adjusted series data are input to a neural network as training data for a machine learning device.
- The machine learning device (control unit 11) compares the neural network's estimation result for the adjusted series data with the label, and adjusts the parameters based on the comparison result. For example, by performing back-propagation (back-propagation of errors), the parameters are adjusted and updated so that the error in the comparison result is reduced. This is repeated for the target training data (adjusted series data) to advance the machine learning.
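The supervised loop above can be illustrated with a deliberately tiny stand-in for the neural network: a single logistic unit trained by gradient descent, where one label is shared by all adjusted series derived from the same original series. The features, learning rate, and data here are all invented for illustration:

```python
import math

def train(adjusted_series, labels, epochs=200, lr=0.5):
    """Minimal supervised learning: compare prediction with label and
    update parameters to reduce the error (a back-propagation step
    for a single logistic unit over the mean frame value)."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(adjusted_series, labels):
            feat = sum(x) / len(x)                 # crude per-series feature
            p = 1 / (1 + math.exp(-(w * feat + b)))
            err = p - y                            # gradient of the log-loss
            w -= lr * err * feat
            b -= lr * err
    return w, b

# One "defective" series yields two adjusted series sharing label 1;
# likewise for a "non-defective" series with label 0.
data = [[0.9, 0.8, 0.9], [0.9, 0.9], [0.1, 0.2, 0.1], [0.1, 0.1]]
labels = [1, 1, 0, 0]
w, b = train(data, labels)
score = 1 / (1 + math.exp(-(w * 0.85 + b)))
print(score > 0.5)   # high-feature sample scored as defective
```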
- Although a machine learning method using a neural network configured by combining perceptrons has been explained, the method is not limited to this, and various methods can be used as long as the learning is supervised. For example, random forest, support vector machine (SVM), boosting, Bayesian network, linear discriminant methods, nonlinear discriminant methods, and the like can be applied.
- As described above, in the present embodiment, series data and labels are acquired, and preprocessing for size adjustment in the series direction is performed on the series data based on a predetermined condition, thereby generating, from one piece of series data, a plurality of adjusted series data with different intervals in the series direction; supervised learning is then performed using the labels and the plurality of adjusted series data generated by the preprocessing unit to generate a learning model.
- As a result, multiple sets of training data with different intervals between series data can be easily generated without the need for advanced preprocessing, and training using these data can generate a learning model with improved robustness against changes in conditions in the series direction.
- For example, when applying a learning model learned under the condition that products move on a belt conveyor in the production line of one factory to the production line of another factory, the accuracy would be expected to degrade unless machine learning is performed separately for each belt conveyor with a different speed. Even in such a situation, by performing machine learning as in the present embodiment, a plurality of adjusted series data with different intervals can be generated from series data obtained from an object moving on a belt conveyor at one speed; by learning using these data, a single learning model can handle various situations with different speeds.
- the machine learning device or machine learning method according to the present embodiment can be preferably applied to generate a learning model for extracting features of an object whose moving speed or movement itself is not a main parameter.
- FIG. 11 is a functional block diagram showing the flow of data in inspection processing of the information processing device 10
- FIG. 12 is a flowchart showing the inspection processing of the information processing device 10.
- the control unit 11 of the information processing device 10 functions as an acquisition unit 116, an extraction unit 117, and an output unit 118.
- Acquisition unit 116 has the same function as acquisition unit 111 and acquires series data obtained by photographing an object as shown in FIG.
- the extraction unit 117 uses the learning model 600 to extract the features of the target (target object) from the series data. Also, the output unit 118 outputs the extraction result.
- Step S71 Acquisition unit 116 acquires series data.
- a photographed image is sent from the camera 310 in real time, and is divided into series data for each predetermined period.
- Step S72: The extraction unit 117 loads the machine learning model 200 stored in the storage unit 12 and uses it to perform visual inspection. Inspection results are output as scores.
- Step S73: The output unit 118 outputs a determination result according to the score. For example, according to the score of the object, a determination result of defective or non-defective is output to the operation display unit 13 or the like.
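Step S73's mapping from score to determination result might look like the following; the 0.5 threshold and the label strings are assumptions for illustration, not values from the patent:

```python
def judge(score, threshold=0.5):
    """Map the model's inspection score to a defective/non-defective result."""
    return "defective" if score >= threshold else "non-defective"

print(judge(0.92), judge(0.08))
```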
- the information processing apparatus 10 uses a learning model for series data containing objects to extract features and output extraction results. This makes it possible to determine the characteristics of the object, that is, whether it is a non-defective product or not, with high accuracy.
- The configurations of the machine learning device and the information processing device described above are the main configurations for describing the features of the above embodiment; they are not limited to these configurations and can be modified. Moreover, configurations provided in general machine learning devices or information processing devices are not excluded.
- steps may be omitted from the flowchart described above, and other steps may be added. Further, part of each step may be changed in order or executed simultaneously, and one step may be divided into a plurality of steps and executed.
- The means and methods for performing the various processes in the information processing apparatus 10 described above can be realized by either a dedicated hardware circuit or a programmed computer.
- The program may be provided on a computer-readable recording medium such as a USB memory or a DVD (Digital Versatile Disc)-ROM, or provided online via a network such as the Internet.
- A program recorded on a computer-readable recording medium is usually transferred to and stored in a storage unit such as a hard disk.
- The program may be provided as independent application software, or may be incorporated into the software of the device as one of its functions.
Abstract
Description
In the technique of Non-Patent Literature 1, in order to suppress the drop in recognition accuracy caused by differences in speaking style (conversation, reading aloud, speech), the entropy of the audio data is analyzed at a 15 msec period, a sampling rate is set according to the analysis result, and training data for learning are generated.
a step (a) of acquiring series data;
a step (b) of performing, on the series data, preprocessing for size adjustment in the series direction on the basis of a predetermined condition, thereby generating, from one series datum, a plurality of adjusted series data whose intervals in the series direction differ from one another; and
a step (c) of generating a learning model by supervised learning using the plurality of adjusted series data generated in step (b): a machine learning method executing processing comprising these steps.
The machine learning method according to (1) above, wherein in step (c), the one label of the series data is applied to the plurality of adjusted series data to perform supervised learning.
The machine learning method according to any one of (1) to (3) above, wherein the learning model is a learning model for extracting features of a target object.
The machine learning method according to (4) or (5) above, wherein in step (b), the size adjustment condition is set, as the predetermined condition, on the basis of the external information.
The machine learning method according to any one of (4) to (7) above, wherein in step (b), one reference frame is set from among the keyframes detected in step (e), and the size adjustment is executed with the reference frame as a reference.
an acquisition unit that acquires series data;
a preprocessing unit that performs, on the series data, preprocessing for size adjustment in the series direction on the basis of a predetermined condition, thereby generating, from one series datum, a plurality of adjusted series data whose intervals in the series direction differ from one another; and
a learning unit that generates a learning model by supervised learning using the plurality of adjusted series data generated by the preprocessing unit:
a machine learning device comprising these units.
The machine learning device according to (12) above, wherein the learning unit applies the one label of the series data to the plurality of adjusted series data to perform supervised learning.
The machine learning device according to any one of (12) to (14) above, wherein the learning model is a learning model for extracting features of a target object.
The machine learning device according to (15) or (16) above, wherein the preprocessing unit sets the size adjustment condition, as the predetermined condition, on the basis of the external information.
The machine learning device according to any one of (15) to (18) above, wherein the preprocessing unit sets one reference frame from among the keyframes detected by the detection unit, and executes the size adjustment with the reference frame as a reference.
an extraction unit that extracts features of a target using a learning model trained by the machine learning method according to any one of (1) to (11) above; and
an output unit that outputs the extraction result: an information processing apparatus comprising these.
Series data are a group of data in which a plurality of data items are arranged according to predetermined order information. Examples include captured data (time-series image data) obtained by imaging with the camera 310, three-dimensional data in which two-dimensional image data are arranged by positional information in the direction perpendicular to those two dimensions, audio data in which speech uttered by a person is arranged in time series, and ranging point cloud data obtained from a three-dimensional ranging sensor. In the following, captured data (a moving image) obtained by imaging with the camera 310 are used as the example of series data.
The acquisition unit 111 acquires external information and a plurality of training data from the series data input device 30 or from the storage unit 12. The training data are composed of sets of a plurality of series data and labels.
The detection unit 112 receives series data from the acquisition unit 111. FIG. 5 shows an example of series data. The series data here are captured data taken over a predetermined interval (times t−α to t+β). For one second of moving-image data captured by a camera 310 at 30, 60, or 120 FPS, for example, one series datum consists of 30, 60, or 120 frames (still images). The interval and FPS can be set as appropriate. In the following description, one series datum is assumed to consist of 60 frames. The series data used as training data are generated in advance by photographing an object that has a region of interest to be inspected (for example, a coating-unevenness defect in part of the object) while moving it on a belt conveyor. In the example shown in FIG. 5, the region of interest (coating unevenness) is shown in white for clarity.
The preprocessing unit 113 performs size adjustment in the series direction on the series data on the basis of a predetermined condition, and generates a plurality of adjusted series data whose intervals in the series direction differ from one another. The predetermined condition includes the following predetermined conditions A1 to A3 (hereinafter collectively also referred to as predetermined condition A).
The learning unit 114 performs supervised machine learning using, as training data, the plurality of size-adjusted series data whose intervals in the series direction differ from one another, together with the labels given to them, and generates or updates the machine learning model 200. Here, the one label given to one series datum is applied in common to the plurality of adjusted series data generated from that series datum.
Hereinafter, the machine learning method according to this embodiment is described with reference to FIGS. 6 to 11. In this embodiment, an example is described in which the series data are captured data consisting of 60 time-series image frames, and the amount of data per series datum is reduced by thinning processing in the time direction as the size adjustment in the series direction.
Here, the acquisition unit 111 of the control unit 11 acquires external information. As described above, the external information is either acquired directly from the series data input device, or set by the user via the operation display unit 13 and stored in the storage unit 12.
Here, the acquisition unit 111 acquires training data directly from the series data input device 30 or from the storage unit 12. The training data consist of a plurality of series data, and a label is given to each series datum.
Here, the preprocessing unit 113, alone or in cooperation with the detection unit 112, automatically sets the size adjustment conditions. FIG. 7A is a subroutine flowchart showing the size adjustment condition setting processing of step S53 in one example, and FIG. 7B is a subroutine flowchart showing the same processing in another example.
(Step S611)
As shown in FIG. 7A, the preprocessing unit 113 sets a plurality of size adjustment conditions on the basis of the predetermined condition A. For example, the predetermined condition A is the number of frames constituting the series data (predetermined condition A3), and the larger the number of frames, the higher the thinning rate: for 30 frames, 1-frame and 2-frame thinning are set, for example, and for 60 frames, 1-frame to 3-frame thinning. With 60 frames (0 to 59) and 1-frame thinning, for example, the odd-numbered frames are deleted, and adjusted series data with half the data amount are generated from the even-numbered frames (0, 2, 4, 6, ...). With 2-frame thinning, adjusted series data with one third of the data amount are generated from every third frame (0, 3, 6, 9, ...). This ends the processing of FIG. 7A, and the flow returns to the processing of FIG. 6 (return).
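The thinning rule described above can be sketched in a few lines. This is an illustrative sketch only: the function name `thin` and the representation of a series datum as a plain list of frames are assumptions of this sketch, not part of the disclosure.

```python
def thin(frames, skip: int):
    """Keep one frame, then drop `skip` frames, repeatedly.

    skip=1 keeps every other frame (indices 0, 2, 4, ...), halving the
    data amount; skip=2 keeps every third frame (indices 0, 3, 6, ...),
    reducing it to about one third.
    """
    return frames[::skip + 1]

frames = list(range(60))       # one series datum of 60 frames (0 to 59)
adjusted_1 = thin(frames, 1)   # 30 frames: 0, 2, 4, ...
adjusted_2 = thin(frames, 2)   # 20 frames: 0, 3, 6, ...
```

Each call produces one adjusted series datum, so applying several `skip` values to the same input yields the plurality of adjusted series data with mutually different intervals.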
(Step S621)
In the other example, shown in FIG. 7B, the detection unit 112 extracts keyframes from the series data on the basis of the extraction condition (d3).
Here, the preprocessing unit 113 sets a reference frame. The reference frame is set from among the keyframes detected by the detection unit 112 in step S621. In FIG. 5, for example, the frame at time t is set as the reference frame on the basis of the predetermined condition B described above.
The preprocessing unit 113 sets a plurality of size adjustment conditions on the basis of either a combination of the number of frames in which the object is present, determined under predetermined condition A1 or A2, and predetermined condition A3 (keyframe information), or predetermined condition A3 alone (see FIG. 8 described later). This ends the processing of FIG. 7B, and the flow returns to the processing of FIG. 6 (return).
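One way to read the keyframe-based variant is that subsampling is anchored on the reference frame so that the reference frame itself always survives thinning. The following is a hedged sketch under that assumption; `thin_around_reference` and the index arithmetic are illustrative, not the disclosed implementation.

```python
def thin_around_reference(frames, ref_index: int, skip: int):
    """Subsample so that the reference frame is always retained.

    Frames are kept at positions whose offset from ref_index is a
    multiple of (skip + 1), in both directions along the series.
    """
    step = skip + 1
    return [f for i, f in enumerate(frames) if (i - ref_index) % step == 0]

frames = list(range(60))
keyframes = [28, 30, 33]   # indices where the region of interest appears
ref = keyframes[1]         # one reference frame chosen from the keyframes
adjusted = thin_around_reference(frames, ref, skip=1)
```

Whatever `skip` is chosen, the frame at `ref` is kept, which matches the idea of executing the size adjustment "with the reference frame as a reference".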
Referring again to FIG. 6, the preprocessing unit 113 here executes the size adjustment on the basis of the size adjustment conditions set in step S53 and generates, from one series datum, a plurality of adjusted series data whose intervals differ from one another.
If size adjustment has not been completed for all the training data, the control unit 11 returns the processing to step S52 and repeats the subsequent processing. When size adjustment has been completed for the entire training data set, the processing proceeds to step S56.
The control unit 11, which functions as the machine learning device, reads the adjusted series data after sample adjustment, together with their labels, as training data and performs machine learning. FIG. 10 is a schematic diagram for explaining the machine learning method using the adjusted series data. Through the processing up to step S55, a plurality of adjusted series data x1 and x2 are generated from one series datum x to which label X is linked, and the label X linked to the original series datum x is applied to x1 and x2 in common. Although FIG. 10 shows an example in which two adjusted series data x1 and x2 are generated, three or more adjusted series data with mutually different intervals may be generated and used for machine learning; for example, four adjusted series data x1 to x4 with mutually different intervals k in the series direction, as shown in FIGS. 8 and 9, may be generated.
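The label-sharing step can be illustrated as follows. This is a minimal sketch assuming a series datum is a list of frames and training data are (series, label) pairs; the function name and the default `skips` values are hypothetical.

```python
def augment_with_shared_label(series, label, skips=(1, 2)):
    """From one labeled series datum, build several adjusted series
    with mutually different frame intervals, all carrying the
    original label (label X applied to x1, x2, ... in common)."""
    return [(series[::s + 1], label) for s in skips]

series_x = list(range(60))                    # series datum x (60 frames)
training = augment_with_shared_label(series_x, "X")
# two adjusted series x1 (30 frames) and x2 (20 frames), one shared label "X"
```

More `skips` entries yield three or more adjusted series, matching the x1 to x4 case mentioned above.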
Hereinafter, inspection processing using the machine learning model 200 generated by the machine learning processing of FIG. 6 is described with reference to FIGS. 11 and 12. FIG. 11 is a functional block diagram showing the flow of data in the inspection processing of the information processing device 10, and FIG. 12 is a flowchart showing the inspection processing of the information processing device 10.
The acquisition unit 116 acquires series data. In the example of FIG. 2, captured images are sent from the camera 310 in real time and are divided into series data for each predetermined period.
The extraction unit 117 loads the machine learning model 200 stored in the storage unit 12 and uses it to perform visual inspection. The inspection result is output as a score.
The output unit 118 outputs a determination result according to the score. For example, according to the score of the object, a determination result of defective or non-defective is output to the operation display unit 13 or the like.
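The division of the real-time stream into per-period series data might be sketched as follows. This is illustrative only: `split_into_series` and the policy of dropping a trailing partial chunk are assumptions of this sketch, not stated in the disclosure.

```python
def split_into_series(frames, frames_per_period: int):
    """Divide a continuous stream of captured frames into series data,
    one series per predetermined period (a trailing partial chunk,
    shorter than one period, is dropped)."""
    n = frames_per_period
    return [frames[i:i + n] for i in range(0, len(frames) - n + 1, n)]

stream = list(range(150))                    # frames arriving from the camera
series_list = split_into_series(stream, 60)  # full 1-second series at 60 FPS
```

Each resulting chunk then plays the role of one series datum fed to the extraction unit.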
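The score-to-determination step can be sketched as below. The threshold value 0.5 and the score polarity (higher score meaning defective) are assumptions for illustration; the patent fixes neither.

```python
def judge(score: float, threshold: float = 0.5) -> str:
    """Map the inspection score from the learned model to a
    defective / non-defective determination.

    threshold=0.5 and "higher score = defective" are illustrative
    assumptions, not values given in the disclosure.
    """
    return "defective" if score >= threshold else "non-defective"

results = [judge(0.9), judge(0.1)]
```

The string result stands in for whatever the output unit 118 actually displays on the operation display unit 13.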
10 information processing device
11 control unit (machine learning device)
111 acquisition unit
112 detection unit
113 preprocessing unit
114 learning unit
116 acquisition unit
117 extraction unit
118 output unit
12 storage unit
13 operation display unit
14 communication unit
200 learning model
30 series data input device
310 camera
Claims (24)
- A machine learning method for generating a learning model for extracting features of a target, the method executing processing comprising:
a step (a) of acquiring series data;
a step (b) of performing, on the series data, preprocessing for size adjustment in the series direction on the basis of a predetermined condition, thereby generating, from one series datum, a plurality of adjusted series data whose intervals in the series direction differ from one another; and
a step (c) of generating a learning model by supervised learning using the plurality of adjusted series data generated in step (b). - The machine learning method according to claim 1, wherein in step (a), a label of the series data is acquired together with the series data, and
in step (c), the one label of the series data is applied to the plurality of adjusted series data to perform supervised learning. - The machine learning method according to claim 1 or claim 2, wherein
in step (b), a size adjustment condition is automatically set on the basis of the predetermined condition. - The machine learning method according to any one of claims 1 to 3, wherein the series data acquired in step (a) are time-series image data obtained by photographing a target object within a photographing region, and
the learning model is a learning model for extracting features of the target object. - The machine learning method according to claim 4, wherein in step (b), the size adjustment condition is set, as the predetermined condition, according to a sampling rate or a number of frames of the series data.
- The machine learning method according to claim 4 or claim 5, further comprising a step (d) of acquiring external information about the photographing environment, wherein
in step (b), the size adjustment condition is set, as the predetermined condition, on the basis of the external information. - The machine learning method according to claim 6, wherein the external information is information about a moving speed of the object and specifications of a camera that photographs the photographing region.
- The machine learning method according to any one of claims 4 to 7, further comprising a step (e) of analyzing the series data on the basis of a predetermined condition and detecting, from among a plurality of frames constituting the series data, one or more keyframes in which a region of interest of the target object is present, wherein
in step (b), one reference frame is set from among the keyframes detected in step (e), and the size adjustment is executed with the reference frame as a reference. - The machine learning method according to claim 8, wherein in step (b), the size adjustment condition is set according to the number of keyframes detected in step (e).
- The machine learning method according to claim 8 or claim 9, wherein in step (b), only the keyframes are targeted for the size adjustment.
- The machine learning method according to any one of claims 8 to 10, wherein in step (b), the size adjustment method is made different before and after the reference frame in the arrangement direction of the series data.
- A machine learning device for generating a learning model for extracting features of a target, comprising:
an acquisition unit that acquires series data;
a preprocessing unit that performs, on the series data, preprocessing for size adjustment in the series direction on the basis of a predetermined condition, thereby generating, from one series datum, a plurality of adjusted series data whose intervals in the series direction differ from one another; and
a learning unit that generates a learning model by supervised learning using the plurality of adjusted series data generated by the preprocessing unit. - The machine learning device according to claim 12, wherein the acquisition unit acquires a label of the series data together with the series data, and
the learning unit applies the one label of the series data to the plurality of adjusted series data to perform supervised learning. - The machine learning device according to claim 12 or claim 13, wherein the preprocessing unit automatically sets a size adjustment condition on the basis of the predetermined condition.
- The machine learning device according to any one of claims 12 to 14, wherein the series data acquired by the acquisition unit are time-series image data obtained by photographing a target object within a photographing region, and
the learning model is a learning model for extracting features of the target object. - The machine learning device according to claim 15, wherein the preprocessing unit sets the size adjustment condition, as the predetermined condition, according to a sampling rate or a number of frames of the series data.
- The machine learning device according to claim 15 or claim 16, wherein the acquisition unit further acquires external information about the photographing environment, and
the preprocessing unit sets the size adjustment condition, as the predetermined condition, on the basis of the external information. - The machine learning device according to claim 17, wherein the external information is information about a moving speed of the object and specifications of a camera that photographs the photographing region.
- The machine learning device according to any one of claims 15 to 18, further comprising a detection unit that analyzes the series data on the basis of a predetermined condition and detects, from among a plurality of frames constituting the series data, one or more keyframes in which a region of interest of the target object is present, wherein
the preprocessing unit sets one reference frame from among the keyframes detected by the detection unit, and executes the size adjustment with the reference frame as a reference. - The machine learning device according to claim 19, wherein the preprocessing unit sets the size adjustment condition according to the number of keyframes detected by the detection unit.
- The machine learning device according to claim 19 or claim 20, wherein the preprocessing unit targets only the keyframes for the size adjustment.
- The machine learning device according to any one of claims 19 to 21, wherein the preprocessing unit makes the size adjustment method different before and after the reference frame in the arrangement direction of the series data.
- A machine learning program for causing a computer to execute the machine learning method according to any one of claims 1 to 11.
- An information processing apparatus comprising:
an acquisition unit that acquires series data;
an extraction unit that extracts features of a target using a learning model trained by the machine learning method according to any one of claims 1 to 11; and
an output unit that outputs the extraction result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202280075636.5A CN118251684A (zh) | 2021-11-18 | 2022-08-19 | 机器学习方法、机器学习程序、机器学习装置以及信息处理装置 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2021-187584 | 2021-11-18 | ||
JP2021187584 | 2021-11-18 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023089888A1 true WO2023089888A1 (ja) | 2023-05-25 |
Family
ID=86396628
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2022/031342 WO2023089888A1 (ja) | 2021-11-18 | 2022-08-19 | 機械学習方法、機械学習プログラム、機械学習装置、および情報処理装置 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN118251684A (ja) |
WO (1) | WO2023089888A1 (ja) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020070876A1 (ja) * | 2018-10-05 | 2020-04-09 | 日本電気株式会社 | 教師データ拡張装置、教師データ拡張方法およびプログラム |
WO2021100267A1 (ja) * | 2019-11-20 | 2021-05-27 | 株式会社日立製作所 | 情報処理装置、および、情報処理方法 |
JP2021187584A (ja) | 2020-05-27 | 2021-12-13 | 株式会社アイチコーポレーション | 作業車 |
Non-Patent Citations (1)
Title |
---|
Amber Afshan, Jinxi Guo, Soo Jin Park, Vijay Ravi, Alan McCree, Abeer Alwan: "Variable frame rate-based data augmentation to handle speaking-style variability for automatic speaker verification", 8 August 2020, Cornell University |
Also Published As
Publication number | Publication date |
---|---|
CN118251684A (zh) | 2024-06-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22895176; Country of ref document: EP; Kind code of ref document: A1 |
| ENP | Entry into the national phase | Ref document number: 2023562139; Country of ref document: JP; Kind code of ref document: A |
| WWE | Wipo information: entry into national phase | Ref document number: 2022895176; Country of ref document: EP |
| ENP | Entry into the national phase | Ref document number: 2022895176; Country of ref document: EP; Effective date: 20240618 |