WO2022269885A1

WO2022269885A1 - Learning data collection device, learning data collection method, and learning data collection program

Info

Publication number: WO2022269885A1
Application number: PCT/JP2021/024066
Authority: WO
Inventors: 堅也杉原
Original assignee: 三菱電機株式会社
Priority date: 2021-06-25
Filing date: 2021-06-25
Publication date: 2022-12-29
Also published as: JPWO2022269885A1; JP7415082B2

Abstract

This learning data collection device is provided with a data retention determination unit (3) that acquires new data newly acquired via a sensor and two existing data items already retained in a data retention storage device and that determines whether or not to retain, as learning data, the new data or data concerning the new data, by using the two existing data items.

Description

LEARNING DATA COLLECTOR, LEARNING DATA COLLECTION METHOD, AND LEARNING DATA COLLECTION PROGRAM

This disclosure relates to learning data collection technology.

Conventionally, research and development has been conducted on collecting learning data effectively for use in AI or machine learning. As an example, Patent Literature 1 discloses a learning data collection system for collecting learning data used for image recognition. The system of Patent Literature 1 determines, based on various determination criteria, whether newly captured image data is appropriate for use as learning data and, if the captured image is not appropriate as learning data, reports that determination result to the photographer.

Japanese Patent Application Laid-Open No. 2020-8904

The conventional technology, including that of Patent Literature 1, has the problem that a large amount of new data similar to existing data may be collected as learning data.

The present disclosure was made to solve such problems, and aims to provide a learning data collection technique that does not collect new data similar to existing data as learning data.

A learning data collection device according to an embodiment of the present disclosure includes a data storage determination unit that acquires new data newly acquired through a sensor and two existing data items already stored in a data storage device, and that determines, using the two existing data items, whether or not to store the new data or data related to the new data as learning data.

According to the learning data collection device according to the embodiment of the present disclosure, it is possible not to collect new data similar to existing data as learning data.

FIG. 1 is a diagram showing a configuration example of a learning data collection device according to Embodiments 1 and 2.
FIG. 2A is a diagram illustrating a hardware configuration example of a data storage determination unit of the learning data collection device.
FIG. 2B is a diagram illustrating a hardware configuration example of a data storage determination unit of the learning data collection device.
FIG. 3 is a flow chart showing the operation of the learning data collection device according to Embodiment 1.
FIG. 4 is a diagram for explaining the operation of the learning data collection device according to Embodiment 1.
FIG. 5 is a flow chart showing the operation of the learning data collection device according to Embodiment 2.
FIG. 6 is a diagram for explaining the operation of the learning data collection device according to Embodiment 2.
FIG. 7 is a diagram for explaining the operation of the learning data collection device according to Embodiment 2.
FIG. 8 is a diagram showing a configuration example of a learning data collection device according to Embodiment 3.

Various embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings. Components denoted by the same or similar reference numerals in the drawings have the same or similar configurations or functions, and duplicate descriptions of such components will be omitted. In addition, the term "similar" in the above-mentioned phrase "new data similar to existing data" is used to mean any of the following: the new data can be represented by linear interpolation of two existing data; the new data is within a predetermined distance from the linear interpolation of two existing data; the hidden layer output obtained by inputting the new data to a neural network can be represented by linear interpolation of the hidden layer outputs obtained by inputting two existing data to the neural network; or that hidden layer output is within a predetermined distance from the linear interpolation of the hidden layer outputs of the two existing data. That is, the term "similar" covers not only cases where the new data has a fixed relationship with two existing data, but also cases where the hidden layer output of the new data has a fixed relationship with the hidden layer outputs of two existing data.

Embodiment 1.
<Configuration>
A configuration of a learning data collection device according to Embodiment 1 of the present disclosure will be described with reference to FIGS. 1 to 4. As shown in FIG. 1, the learning data collection device 10 is electrically connected to the sensor 1 and the data storage device 4, and the learning data collection device 10, the sensor 1, and the data storage device 4 together constitute a learning data collection system. As an example, the learning data collection device 10 includes a data temporary storage unit 2 and a data storage determination unit 3. As another example, the data temporary storage unit 2 may be provided outside the learning data collection device 10. The learning data collection device 10 receives data acquired by the sensor 1 from the sensor 1 and exchanges data with the data storage device 4.

(Sensor)
The sensor 1 is a device that can acquire some data, such as a sensor that acquires data related to physical quantities such as vibration, voltage, current, temperature, or humidity, a camera that acquires images, and a microphone that acquires sound. The sensor 1 may be any device as long as the acquired data can be used for AI or machine learning. Data acquired by the sensor 1 is converted into digital data by the sensor 1 or an AD converter (not shown) and supplied to the data temporary storage unit 2 .

(Data temporary storage unit)
The data temporary storage unit 2 is a device for temporarily holding data, and temporarily holds data supplied from the sensor 1 . The data temporary storage unit 2 is implemented by a storage device such as a memory, HDD (Hard Disk Drive), or SSD (Solid State Drive).

(Data storage determination unit)
The data storage determination unit 3 is a functional unit that determines whether or not to store the data acquired from the data temporary storage unit 2 in the data storage device 4. Here, the hardware configuration of the data storage determination unit 3 will be described with reference to FIGS. 2A and 2B. As an example, as shown in FIG. 2A, the data storage determination unit 3 is implemented by a processing circuit 100a. The processing circuit 100a is, for example, a single circuit, a composite circuit, a programmed processor, a parallel programmed processor, an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array), or a combination thereof.

As another example, as shown in FIG. 2B, the data storage determination unit 3 is realized by a CPU (Central Processing Unit) 100b and a memory 100c. The function of the data storage determination unit 3 (data storage determination function) is realized by the CPU 100b reading out and executing a program stored in the memory 100c. The program may be implemented as software, firmware, or a combination of software and firmware. Examples of the memory 100c include non-volatile or volatile semiconductor memories such as RAM (Random Access Memory), ROM (Read Only Memory), flash memory, EPROM (Erasable Programmable Read Only Memory), and EEPROM (Electrically-EPROM), as well as magnetic disks, flexible disks, optical disks, compact disks, mini disks, and DVDs.

(Data storage device)
The data storage device 4 is a storage device that stores data. The data storage device 4 stores the data that the data storage determination unit 3 has determined to store. In addition, the data storage device 4 supplies the data storage determination unit 3 with the data used for deciding whether to store data. Examples of the data storage device 4 include storage media such as HDDs (Hard Disk Drives), SSDs (Solid State Drives), memories, or memory cards.

<Operation>
Next, the operation of the data storage determination unit 3 of the learning data collection device 10 will be described. Briefly, the data storage determination unit 3 uses the new data d held in the data temporary storage unit 2 and two existing data e1 and e2 stored in the data storage device 4 to determine whether the new data d should be saved. Generally, data handled by AI or machine learning can be represented by a vector, and in the present disclosure, data is assumed to be an n-dimensional vector with two or more dimensions. In this embodiment, if the new data d can be represented by linear interpolation of the two existing data e1 and e2, it is determined that the new data d is not important and should not be saved. Details of the determination method will be described with reference to FIGS. 3 and 4. It is assumed that some existing data are stored in the data storage device 4 at the start of the operation of this system.

First, in step ST1, the data storage determination unit 3 acquires from the data temporary storage unit 2 new data d acquired by the sensor 1 and held in the data temporary storage unit 2 as digital data.

Subsequently, the data storage determination unit 3 repeats the selection loop of step ST2. In the selection loop of step ST2, a pair of data e1 and e2 is selected from the set of existing data stored in the data storage device 4. The loop ends when it has been repeated a preset number of times, or when all pairs of existing data e1 and e2 have been selected. Alternatively, the loop may be terminated as soon as the calculation in step ST4 described below determines that linear interpolation is possible. In this case, the calculation time is reduced by the portion of the loop that is skipped.

In step ST3, the data storage determination unit 3 selects a pair of existing data e1 and e2. A different pair of e1 and e2 is selected each time, and a pair that has already been selected is not selected again.

In step ST4, the data storage determination unit 3 calculates whether the new data d can be linearly interpolated with the existing data e1 and e2. A linear interpolation x of data e1 and e2 can be expressed as x=λe1+(1−λ)e2 using a scalar parameter λ. As shown in FIG. 4, the linear interpolation x=λe1+(1−λ)e2 can be regarded as an equation representing a straight line l along which a point x can lie in an n-dimensional space. When the new data d is linearly interpolable with the existing data e1 and e2, the new data d exists as a point on the line l, whereas when it is not linearly interpolable, the new data d does not exist on the line l.

To calculate whether new data d exists on straight line l, it is sufficient to determine whether the equation d=λe1+(1−λ)e2 in λ has a solution. This vector equation consists of n linear equations in the single unknown λ, one per component, and it can be easily calculated whether a solution λ exists. For example, it is sufficient to solve each of the n linear equations and determine whether or not all of the obtained n solutions λ agree with each other.
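The component-wise check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function name `interpolation_parameter`, the floating-point tolerance `tol`, and the handling of components in which e1 and e2 coincide (which place no constraint on λ) are all introduced here for illustration. As in the text, λ is treated as any real number, so the whole straight line l is tested.

```python
from typing import Optional, Sequence


def interpolation_parameter(
    d: Sequence[float],
    e1: Sequence[float],
    e2: Sequence[float],
    tol: float = 1e-9,  # assumed tolerance for floating-point comparison
) -> Optional[float]:
    """Return λ such that d = λ*e1 + (1-λ)*e2, or None if no such λ exists."""
    lam: Optional[float] = None
    for di, a, b in zip(d, e1, e2):
        denom = a - b
        if abs(denom) <= tol:
            # This component does not constrain λ; d must simply match it.
            if abs(di - b) > tol:
                return None
            continue
        candidate = (di - b) / denom  # solve di = λ*a + (1-λ)*b for λ
        if lam is None:
            lam = candidate
        elif abs(candidate - lam) > tol:
            return None  # the per-component solutions disagree
    # If e1, e2 and d all coincide, every λ works; 0.0 is an arbitrary choice.
    return 0.0 if lam is None else lam
```

For example, with e1 = (1, 2, 3) and e2 = (5, 2, -1), the point d = (4, 2, 0) gives λ = 0.25, while d = (4, 2.5, 0) gives no solution because its second component cannot be matched.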

In addition, when performing supervised learning in machine learning, not only the data but also the labels corresponding to the data are used for learning. In this embodiment, whether or not linear interpolation is possible may be calculated with the labels included. Let the m-dimensional vectors yd, y1, and y2 be the respective labels of the new data d and the existing data e1 and e2 described above. Then the (n+m)-dimensional vectors d'=(d, yd), e'1=(e1, y1), and e'2=(e2, y2) can be used in place of d, e1, and e2 above to calculate whether the vector d' is linearly interpolable by the vectors e'1 and e'2. Here, for example, the vector d'=(d, yd) denotes the (n+m)-dimensional vector whose n+m elements consist of the n elements of the vector d followed by the m elements of the vector yd.

In step ST5, the data storage determination unit 3 determines, based on the calculation results in step ST4, whether linear interpolation was judged possible for the new data d at least once. If it was never determined in step ST4 that the new data d can be linearly interpolated (No in step ST5), the data storage determination unit 3 determines to store the new data d and stores the new data d in the data storage device 4 (step ST6).

On the other hand, when it is determined at least once in step ST4 that the new data d can be linearly interpolated (Yes in step ST5), the new data d is considered to be similar to the existing data e1 or e2. The data storage determination unit 3 determines not to store the new data d (step ST7).
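The flow of steps ST2 to ST7 can be sketched as a loop over pairs of existing data with an early exit. The sketch below is illustrative only: the names `on_line` and `should_store`, the optional `max_pairs` limit, and the tolerance `tol` are assumptions, and the on-line test is written as a projection onto the line, which is an equivalent reformulation rather than the per-component method of the text.

```python
from itertools import combinations, islice
from typing import Callable, Optional, Sequence

Vector = Sequence[float]


def on_line(d: Vector, e1: Vector, e2: Vector, tol: float = 1e-9) -> bool:
    """True if d lies on the straight line through e1 and e2 (step ST4)."""
    u = [a - b for a, b in zip(e1, e2)]  # direction e1 - e2
    w = [x - b for x, b in zip(d, e2)]   # offset d - e2
    uu = sum(x * x for x in u)
    if uu == 0.0:
        # e1 == e2: the line degenerates to a single point.
        return all(abs(x) <= tol for x in w)
    lam = sum(x * y for x, y in zip(w, u)) / uu  # best-fitting λ
    return all(abs(x - lam * y) <= tol for x, y in zip(w, u))


def should_store(
    d: Vector,
    existing: Sequence[Vector],
    similar: Callable[[Vector, Vector, Vector], bool] = on_line,
    max_pairs: Optional[int] = None,  # assumed cap on loop iterations
) -> bool:
    """Return True if new data d should be stored (ST6), False if not (ST7)."""
    # combinations() yields each unordered pair exactly once (step ST3).
    pairs = islice(combinations(existing, 2), max_pairs)
    for e1, e2 in pairs:           # selection loop (step ST2)
        if similar(d, e1, e2):     # step ST4
            return False           # Yes in step ST5: stop the loop early
    return True                    # No in step ST5
```

Passing a different `similar` predicate lets the same loop serve a relaxed determination condition; with fewer than two existing data items the loop body never runs and the new data is stored.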

In step ST7, instead of simply determining not to store the new data d, the data storage determination unit 3 may determine to store, as learning data, information specifying the existing data e1 and e2 and parameters for restoring the new data d. Both the information specifying the existing data e1 and e2 and the parameters for restoring the new data d are data related to the new data d. Examples of information specifying the existing data e1 and e2 include numbers or codes identifying the existing data e1 and e2. Examples of parameters for restoring the new data d include the value of λ obtained when it was determined that the new data d can be linearly interpolated with the existing data e1 and e2. The data storage determination unit 3 stores the information and parameters in the data storage device 4 (step ST7).

By storing the information specifying the existing data e1 and e2 and the parameters for restoring the new data d in this way, the existing data e1 and e2 can be retrieved in subsequent learning from the information specifying them. Then, using the retrieved existing data e1 and e2 and the saved parameter, the discarded new data d can be completely restored as d=λe1+(1−λ)e2. Therefore, when the information specifying the existing data e1 and e2 and the parameter λ for restoring the new data d are stored, loss of information in the learning data as a whole can be avoided, unlike the case of simply not storing the new data d, and the accuracy of AI or machine learning can be improved.
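As an illustration of the restoration described above, the following sketch stores only two identifiers and the value of λ, then rebuilds d from the retained existing data. The record type `InterpolationRecord`, the use of integer identifiers, and the dictionary standing in for the data storage device are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class InterpolationRecord:
    """Data related to a discarded new data item d (hypothetical layout)."""
    id1: int    # identifier of existing data e1
    id2: int    # identifier of existing data e2
    lam: float  # parameter λ found when d was judged interpolable


def restore(record: InterpolationRecord, store: Dict[int, List[float]]) -> List[float]:
    """Rebuild d = λ*e1 + (1-λ)*e2 from the stored identifiers and λ."""
    e1 = store[record.id1]
    e2 = store[record.id2]
    return [record.lam * a + (1.0 - record.lam) * b for a, b in zip(e1, e2)]
```

Each record holds two identifiers and one scalar regardless of n, so the storage cost per discarded item does not grow with the dimension of the data.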

In general, not only the quantity but also the quality of the data to be learned is important, and learning from a large amount of mutually similar data does not necessarily make AI or machine learning highly accurate. For example, in the case of image recognition, it is effective to obtain images of the object to be recognized photographed from various angles. However, according to the prior art, there is a problem that a large amount of similar data is collected.

On the other hand, according to the learning data collection device 10 of the present disclosure described above, new data d similar to existing data e1 or e2 is not acquired as learning data. Optionally, the learning data collection device 10 collects and saves the information specifying the existing data e1 and e2 used when it was determined that the new data d can be linearly interpolated, together with the parameter λ for restoring the new data d. By selecting and storing data effective for learning AI or machine learning from the data acquired by the sensor 1, the capacity of the storage device for holding learning data can be reduced. AI or machine learning trained on the saved data can perform recognition or regression with high accuracy, because the data it learns from has been selected for its effectiveness. Furthermore, the smaller the amount of learning data, the shorter the calculation time and the smaller the amount of calculation required for learning AI or machine learning. By reducing the amount of calculation, it is possible to reduce the cost of the equipment that performs learning.

Also, when performing learning later, mixup, which is a data augmentation method proposed in the following reference, may be used. Mixup is a data augmentation method that uses linear interpolations of arbitrary pairs in the learning data as learning data. With mixup, learning can cover, by interpolation, the new data d that was rejected without being saved in this embodiment, which can improve the accuracy of AI or machine learning.
(Reference 1) H. Zhang, M. Cisse, Y. N. Dauphin, and D. Lopez-Paz, “mixup: beyond empirical risk minimization,” arXiv:1710.09412, 2017.
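A minimal sketch of mixup as described in Reference 1 is given below: a mixing weight λ is drawn from a Beta(α, α) distribution and applied to both the inputs and their label vectors. The function name `mixup_pair`, the default α = 0.2, and the plain-list representation are assumptions chosen for illustration.

```python
import random
from typing import List, Optional, Sequence, Tuple


def mixup_pair(
    x1: Sequence[float],
    y1: Sequence[float],
    x2: Sequence[float],
    y2: Sequence[float],
    alpha: float = 0.2,  # assumed hyperparameter value
    rng: Optional[random.Random] = None,
) -> Tuple[List[float], List[float], float]:
    """Return a virtual example (x, y) mixed from two training examples."""
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam
```

Because each mixed example lies on the segment between two stored examples, points that were discarded for lying on such a segment can still be reached during training. Note that mixup restricts λ to [0, 1], whereas the determination in this embodiment considers the whole straight line.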

Embodiment 2.
In Embodiment 1, the data storage determination unit 3 uses linear interpolation of existing data to determine whether or not to store new data d. However, new data d rarely lies exactly on a linear interpolation of existing data, so the effect of reducing the amount of learning data to be saved may be limited. In the present embodiment, the determination condition used by the data storage determination unit 3 is relaxed compared with exact linear interpolation, so that less data is stored and the amount of learning data to be stored is further reduced.

The configuration of the learning data collection device 10A according to the second embodiment is the same as that of the learning data collection device 10 of the first embodiment. In the learning data collection device 10A according to the second embodiment, the processing content (determination method) of the data storage determination unit 3 is different from that of the first embodiment. The processing of the data storage determination unit 3 in the second embodiment will be described with reference to FIG. 5. In FIG. 5, steps ST1, ST2, ST3, ST6 and ST7 are the same as in the first embodiment. Embodiment 2 differs from Embodiment 1 in that the processing in steps ST4A and ST5A differs from that in steps ST4 and ST5 of Embodiment 1, respectively. These differences are described below.

In step ST4A, the data storage determination unit 3 calculates whether or not the new data d exists within a distance T from the straight line l indicating linear interpolation of the existing data e1 and e2. T is a threshold that is set appropriately for the data to be judged by the learning data collection device 10A. Increasing T reduces the amount of learning data further, but if T is made larger than necessary, important data may be discarded without being saved.

As described in Embodiment 1, linear interpolation of existing data e1 and e2 can be represented by a straight line l: x=λe1+(1−λ)e2 in n-dimensional space. In the second embodiment, it is determined whether or not an n-dimensional sphere C with radius T centered at the point d representing the new data intersects the straight line l. If the straight line l and the sphere C intersect as shown in FIG. 6, the new data d exists within the distance T from the linear interpolation of the existing data e1 and e2. If they do not intersect, the new data d lies farther than the distance T from the linear interpolation of the existing data e1 and e2.

Whether the straight line l and the sphere C intersect can be determined as follows. An n-dimensional sphere C with radius T (distance T) centered at point d can be expressed as |x-d|²=T². The following simultaneous equations, with x and λ as variables,
x=λe1+(1-λ)e2 (1)
|x-d|²=T² (2)
have a real solution if the sphere C and the line l intersect, and have no real solution if they do not intersect.

Substituting equation (1) into equation (2) yields the following equation (3).
|λe1+(1-λ)e2-d|²=T² (3)

Equation (3) becomes an equation for λ that does not include x. By rearranging this formula (3), the following formula (4) is obtained.
|(e1-e2)λ+(e2-d)|²=T² (4)

Further arranging the equation (4) yields the following equation (5).
|(e1-e2)|²λ²+2(e1-e2)·(e2-d)λ+(|(e2-d)|²-T²)=0 (5)

The symbol "·" in the second term of Equation (5) represents the inner product of vectors. Equation (5) is a quadratic equation in λ, and all of its coefficients are scalar values.

Whether or not a quadratic equation has a real number solution can be determined by calculating the sign of the discriminant of the quadratic equation. If the discriminant is positive or zero, there is a real solution, and if it is negative, there is no real solution. The discriminant D can be expressed as D=b ² −4ac using the coefficients a, b, and c of the quadratic equation, and the coefficients a, b, and c in the case of equation (5) are specifically (6) to (8).
a=|(e1-e2)|² (6)
b=2(e1-e2)·(e2-d) (7)
c=|(e2-d)|²-T² (8)
If D≧0, the sphere C and the line l intersect, and if D<0, the sphere C and the line l do not intersect.
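The discriminant test of equations (5) to (8) can be sketched as follows. The function name `within_distance` and the treatment of the degenerate case e1 = e2 (where a = 0 and equation (5) is no longer quadratic) are assumptions added for this example.

```python
from typing import Sequence


def within_distance(
    d: Sequence[float],
    e1: Sequence[float],
    e2: Sequence[float],
    T: float,
) -> bool:
    """True if d is within distance T of the line through e1 and e2."""
    u = [p - q for p, q in zip(e1, e2)]  # e1 - e2
    v = [q - x for q, x in zip(e2, d)]   # e2 - d
    a = sum(x * x for x in u)                    # equation (6)
    b = 2.0 * sum(x * y for x, y in zip(u, v))   # equation (7)
    c = sum(x * x for x in v) - T * T            # equation (8)
    if a == 0.0:
        # Assumed handling: e1 == e2, so the line is a single point and the
        # test reduces to |e2 - d|² <= T².
        return c <= 0.0
    return b * b - 4.0 * a * c >= 0.0  # D >= 0: sphere C and line l intersect
```

For instance, with e1 = (0, 0), e2 = (4, 0), and d = (2, 1), the point d is at distance 1 from the line, so the test succeeds for T = 1.5, succeeds at the tangent case T = 1, and fails for T = 0.5.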

As described in Embodiment 1, the (n+m)-dimensional vectors d'=(d, yd), e'1=(e1, y1), and e'2=(e2, y2), which include the labels, may be used in place of d, e1, and e2 above, respectively.

By doing so, the determination condition is relaxed compared to the first embodiment, so that more new data is judged to be similar to existing data and the amount of data to be saved can be reduced.

As described above, by selecting and saving data effective for learning AI or machine learning from the data acquired by the sensor 1, it is possible to reduce the capacity of the storage device for holding the learning data. AI or machine learning trained on the saved data can perform recognition or regression with high accuracy, because the data it learns from has been selected for its effectiveness. Furthermore, the smaller the amount of learning data, the shorter the calculation time and the smaller the amount of calculation required for learning AI or machine learning. By reducing the amount of calculation, it is possible to reduce the cost of the equipment that performs learning.

In addition, mixup, which is a data augmentation method proposed in Reference 1, may be used when performing learning later. Mixup is a data augmentation method that uses linear interpolations of arbitrary pairs in the learning data as learning data. With mixup, learning can cover, by interpolation, the new data d that was rejected without being saved in the present embodiment, which can improve the accuracy of AI or machine learning.

Embodiment 3.
<Configuration>
In Embodiments 1 and 2, the data storage determination unit 3 performed the determination using the data d, e1, and e2 themselves. However, it is thought that little new data can be represented by linear interpolation of existing data. Moreover, when the condition is relaxed as in the second embodiment, the distance T serving as the threshold must be increased in order to reduce the amount of stored data, which increases the possibility of discarding important data.

Therefore, in Embodiment 3, instead of using the data d, e1, and e2 directly, the data storage determination unit 3 inputs the data d, e1, and e2 to a trained neural network and performs a similar determination using the resulting intermediate layer output values.

The configuration of a system including the learning data collection device 20 according to Embodiment 3 is shown in FIG. 8. The system shown in FIG. 8 is the same as the system shown in FIG. 1 except that a trained machine learning model 5 is provided. The data storage determination unit 3 of the learning data collection device 20 is configured to exchange data with the trained machine learning model 5. The trained machine learning model 5 is a storage device that holds a neural network trained and prepared in advance. This neural network is trained in advance with learning data collected by the learning data collection device 20 or by other means. The neural network is held in a state in which inference operations are possible, and the trained machine learning model 5 supplies the neural network to the data storage determination unit 3.

The data storage determination unit 3 inputs the new data d and the existing data e1 and e2 to the neural network provided from the trained machine learning model 5, and acquires the respective intermediate layer outputs d', e'1 and e'2 from the trained machine learning model 5.

After that, the data storage determination unit 3 uses the intermediate layer outputs d′, e′1 and e′2 instead of the data d, e1 and e2 to determine whether to save the new data d or data related to the new data d, according to the determination method shown in FIG. 3 of the first embodiment or FIG. 5 of the second embodiment.

When the determination method of FIG. 3 of Embodiment 1 is followed, in the step corresponding to step ST4, the data storage determination unit 3 calculates whether the intermediate layer output d′ can be represented by linear interpolation of the intermediate layer outputs e′1 and e′2. If the intermediate layer output d′ cannot be represented by linear interpolation of the intermediate layer outputs e′1 and e′2, the data storage determination unit 3 may, in the step corresponding to step ST6 in FIG. 3, determine to store the new data d as in step ST6, or, instead of determining to store the new data d, may determine to store the intermediate layer output d′, which is data related to the new data d. If the intermediate layer output d′ can be represented by linear interpolation of the intermediate layer outputs e′1 and e′2, the data storage determination unit 3 may determine not to store the new data d in the step corresponding to step ST7 in FIG. 3. Alternatively, in the step corresponding to step ST7 in FIG. 3, the data storage determination unit 3 may determine to store information specifying the existing data e1 and e2 and parameters for restoring the new data d, both of which are data related to the new data d.

When the determination method of FIG. 5 of the second embodiment is followed, in the step corresponding to step ST4A, the data storage determination unit 3 calculates whether the intermediate layer output d' exists within the distance T from the linear interpolation of the intermediate layer outputs e'1 and e'2. If the intermediate layer output d' does not exist within the distance T from the linear interpolation of the intermediate layer outputs e'1 and e'2, the data storage determination unit 3 may, in the step corresponding to step ST6 in FIG. 5, determine to store the new data d as in step ST6, or, instead of storing the new data d, may determine to store the intermediate layer output d', which is data related to the new data d. If the intermediate layer output d' exists within the distance T from the linear interpolation of the intermediate layer outputs e'1 and e'2, the data storage determination unit 3 may determine not to store the new data d in the step corresponding to step ST7 in FIG. 5. Alternatively, in the step corresponding to step ST7 in FIG. 5, the data storage determination unit 3 may determine to store information specifying the existing data e1 and e2 and parameters for restoring the new data d, both of which are data related to the new data d.

By appropriately designing the neural network, the number of dimensions of the intermediate layer outputs d', e'1 and e'2 can be made smaller than the number of dimensions of the data d, e1 and e2. This increases the possibility that the new data can be represented by linear interpolation of existing data, or that the new data exists within the distance T, so the amount of learning data to be stored can be reduced compared to the first and second embodiments.
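The effect of judging similarity on intermediate layer outputs can be illustrated with a toy example. Everything specific here is hypothetical: the single ReLU layer in `hidden_output`, the hand-picked weights in the usage note, and the function names are stand-ins for the trained neural network held by the trained machine learning model 5. The distance to the line is computed by projection, which gives the same decision as the discriminant test of Embodiment 2.

```python
import math
from typing import List, Sequence


def hidden_output(
    x: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> List[float]:
    """Output of one hypothetical intermediate layer: ReLU(W x + b)."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def distance_to_line(p: Sequence[float], a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance from point p to the line through a and b."""
    u = [s - t for s, t in zip(a, b)]
    w = [s - t for s, t in zip(p, b)]
    uu = sum(x * x for x in u)
    lam = 0.0 if uu == 0.0 else sum(x * y for x, y in zip(w, u)) / uu
    return math.sqrt(sum((x - lam * y) ** 2 for x, y in zip(w, u)))


def similar_in_hidden_space(d, e1, e2, weights, biases, T: float) -> bool:
    """Apply the within-distance-T test to intermediate layer outputs."""
    hd, h1, h2 = (hidden_output(v, weights, biases) for v in (d, e1, e2))
    return distance_to_line(hd, h1, h2) <= T
```

With weights [[1, 0, 0], [0, 1, 0]] and zero biases, the layer keeps only the first two of three input components. The input d = (1, 1, 9) is then judged similar to e1 = (0, 0, 0) and e2 = (2, 2, 5) for T = 0.1, even though d is far from their line in the original three-dimensional space.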

Furthermore, if the intermediate layer output d', which is data related to the new data, is saved as learning data instead of the new data d, the number of dimensions of the data to be saved can be kept small, and the amount of data to be saved can be reduced.

In addition, there is a high possibility that the intermediate layer output of the neural network is a value in which the feature amount of the input data is appropriately extracted, and it is possible to determine whether the data should be saved more appropriately. As a result, the accuracy of AI or machine learning learned with the learning data obtained in the present embodiment can be improved.

It should be noted that the embodiments can be combined, and each embodiment can be modified or omitted as appropriate.

The learning data collection technology of the present disclosure can be used as a device for collecting learning data for use in AI or machine learning.

1 sensor, 2 data temporary storage unit, 3 data storage determination unit, 4 data storage storage device, 5 learned machine learning model, 10 learning data collection device, 10A learning data collection device, 20 learning data collection device, 100a processing circuit, 100b CPU, 100c memory.

Claims

A learning data collection device comprising a data storage determination unit that acquires new data newly acquired through a sensor and two existing data items already stored in a data storage device, and that determines, using the two existing data items, whether to store the new data or data related to the new data as learning data.
The learning data collection device according to claim 1, wherein the data storage determination unit calculates whether the new data can be represented by linear interpolation of the two existing data items, and determines not to store the new data if the new data can be represented by linear interpolation of the two existing data items.
The learning data collection device according to claim 1 or 2, wherein the data storage determination unit calculates whether the new data can be represented by linear interpolation of the two existing data items, and, if the new data can be represented by linear interpolation of the two existing data items, determines to store parameters for restoring the new data and information identifying the two existing data items as data related to the new data.
The learning data collection device according to claim 1, wherein the data storage determination unit calculates whether the new data exists within a certain distance from the linear interpolation of the two existing data items, and determines not to store the new data if the new data exists within the distance.
The learning data collection device according to claim 1 or 4, wherein the data storage determination unit calculates whether the new data exists within a certain distance from the linear interpolation of the two existing data items, and, if the new data exists within the distance, determines to store parameters for restoring the new data and information identifying the two existing data items as data related to the new data.
The learning data collection device according to claim 1, wherein the data storage determination unit refers to a trained machine learning model configured by a neural network, inputs the new data and the two existing data items to the neural network to acquire intermediate layer outputs, calculates whether the intermediate layer output of the new data can be represented by linear interpolation of the intermediate layer outputs of the two existing data items, and determines not to store the new data if the intermediate layer output of the new data can be represented by linear interpolation of the intermediate layer outputs of the two existing data items.
The learning data collection device according to claim 1 or 6, wherein the data storage determination unit refers to a trained machine learning model configured by a neural network, inputs the new data and the two existing data items to the neural network to acquire intermediate layer outputs, calculates whether the intermediate layer output of the new data can be represented by linear interpolation of the intermediate layer outputs of the two existing data items, and, if the intermediate layer output of the new data can be represented by linear interpolation of the intermediate layer outputs of the two existing data items, determines to store parameters for restoring the new data and information identifying the two existing data items as data related to the new data.
The learning data collection device according to claim 1, wherein the data storage determination unit refers to a trained machine learning model configured by a neural network, inputs the new data and the two existing data items to the neural network to acquire intermediate layer outputs, calculates whether the intermediate layer output of the new data exists within a certain distance from the linear interpolation of the intermediate layer outputs of the two existing data items, and determines not to store the new data if the intermediate layer output of the new data exists within the distance.
The learning data collection device according to claim 1 or 8, wherein the data storage determination unit refers to a trained machine learning model configured by a neural network, inputs the new data and the two existing data items to the neural network to acquire intermediate layer outputs, calculates whether the intermediate layer output of the new data exists within a certain distance from the linear interpolation of the intermediate layer outputs of the two existing data items, and, if the intermediate layer output of the new data exists within the distance, determines to store parameters for restoring the new data and information identifying the two existing data items as data related to the new data.
The learning data collection device according to claim 1, wherein the data storage determination unit refers to a trained machine learning model configured by a neural network, inputs the new data and the two existing data items to the neural network to acquire intermediate layer outputs, calculates whether the intermediate layer output of the new data can be represented by linear interpolation of the intermediate layer outputs of the two existing data items, and, if the intermediate layer output of the new data cannot be represented by linear interpolation of the intermediate layer outputs of the two existing data items, determines to store the intermediate layer output of the new data as data related to the new data.
The learning data collection device according to claim 1, wherein the data storage determination unit refers to a trained machine learning model configured by a neural network, inputs the new data and the two existing data items to the neural network to acquire intermediate layer outputs, calculates whether the intermediate layer output of the new data exists within a certain distance from the linear interpolation of the intermediate layer outputs of the two existing data items, and, if the intermediate layer output of the new data does not exist within the distance, determines to store the intermediate layer output of the new data as data related to the new data.
A learning data collection method using a learning data collection device comprising a data storage determination unit, the method comprising: acquiring, by the data storage determination unit, new data newly acquired through a sensor and two existing data items already stored in a data storage device; and determining, by the data storage determination unit, using the two existing data items, whether to store the new data or data related to the new data as learning data.
A learning data collection program that causes a computer to execute a data storage determination function of acquiring new data newly acquired through a sensor and two existing data items already stored in a data storage device, and determining, using the two existing data items, whether to store the new data or data related to the new data as learning data.