WO2022029927A1 - Inference processing device - Google Patents
Inference processing device
- Publication number
- WO2022029927A1, PCT/JP2020/030021, JP2020030021W
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- input data
- inference
- data
- processing device
- processing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
Definitions
- the present invention relates to an inference processing device, and more particularly to a technique for performing inference using a neural network.
- DNN: deep neural network
- DNN processing has two phases, learning and inference.
- learning requires a large amount of data and may be processed in the cloud.
- inference uses a trained DNN model to estimate the output for unknown input data.
- input data such as time series data or image data is given to the trained neural network model, and the characteristics of the input data are inferred.
- for example, a sensor terminal equipped with an acceleration sensor and a gyro sensor is used to detect events such as rotation or stopping of a garbage truck in order to estimate the amount of garbage.
- a neural network model learned by using the time series data in which the event at each time is known in advance is used.
- in the technique of Non-Patent Document 1, time-series data acquired from the sensor terminal is used as input data, and events must be extracted in real time. Therefore, it is necessary to speed up the inference process.
- an FPGA that realizes processing is mounted on a sensor terminal, and inference calculation is performed by such an FPGA to speed up the processing (see, for example, Non-Patent Document 2).
- Kishino, et al., "Detecting garbage collection garbage handling motion sensors selected on a garbage truck truck waste smart management"
- SPWID17 Kishino, et al., "Datafying city: detecting and accumulating spatio-temporal events by vehicle-mounted sensors"
- the present invention has been made to solve the above-mentioned problems, and an object of the present invention is to provide an inference processing technique capable of speeding up inference calculation processing while reducing the power consumption associated with inference calculation processing.
- the inference processing apparatus according to the present invention is an inference processing apparatus that infers the characteristics of input data using a trained neural network, and includes: a first storage unit that stores the input data; a second storage unit that stores the weights of the trained neural network; a data filter unit that extracts only specific input data from the received input data; and an inference calculation unit that takes the specific input data extracted by the data filter unit and the weights as inputs, executes an inference operation of the trained neural network, and infers the characteristics of the input data.
- FIG. 1 is a block diagram showing a configuration of an inference processing device according to the first embodiment.
- FIG. 2A is a block diagram showing a configuration of a data filter unit in the inference processing device according to the first embodiment.
- FIG. 2B is a diagram illustrating the processing of the data filter unit in the inference processing apparatus according to the first embodiment.
- FIG. 3 is a block diagram showing another configuration of the inference processing device according to the first embodiment.
- FIG. 4 is a block diagram showing a configuration of the inference processing device according to the first embodiment.
- FIG. 5 is a block diagram showing a configuration of the inference processing device according to the first embodiment.
- FIG. 6 is a flowchart showing the operation of the data filter unit in the inference processing device according to the first embodiment.
- FIG. 7 is a block diagram showing a configuration of the inference processing device according to the second embodiment.
- FIG. 8 is a block diagram showing a configuration of the inference processing device according to the second embodiment.
- FIG. 9 is a block diagram showing a configuration of the inference processing device according to the second embodiment.
- FIG. 10 is a block diagram showing a configuration of the inference processing device according to the third embodiment.
- FIG. 11 is a block diagram showing a configuration of a data filter unit in the inference processing device according to the third embodiment.
- FIG. 12 is a block diagram showing a configuration of the inference processing device according to the third embodiment.
- FIG. 13 is a block diagram showing a configuration of the inference processing device according to the third embodiment.
- FIG. 14 is a flowchart showing the operation of the data filter unit in the inference processing device according to the third embodiment.
- FIG. 15 is a block diagram showing a configuration of the inference processing device according to the fourth embodiment.
- FIG. 16 is a block diagram showing a configuration of the inference processing device according to the fourth embodiment.
- FIG. 17 is a block diagram showing a configuration of the inference processing device according to the fourth embodiment.
- FIG. 18 is a block diagram showing a configuration of the inference processing device according to the fifth embodiment.
- FIG. 19 is a flowchart showing the operation of the data filter unit in the inference processing device according to the fifth embodiment.
- FIG. 20 is a block diagram showing a hardware configuration of the inference processing device according to the embodiment of the present invention.
- FIG. 21 is a block diagram showing a configuration of a conventional inference processing device.
- the inference processing device 1 of the present invention performs inference processing on unknown input data using a neural network model in which weight values are learned using predetermined learning data.
- Time-series data such as voice data and language data, or image data, acquired from outside the inference processing device 1 of the present invention is used as the input data to be inferred.
- the inference processing device 1 batch-processes the operation of the neural network using the trained neural network model, and infers the characteristics of the input data.
- the inference processing device 1 uses a neural network model learned in advance using input data such as time-series data in which events at each time are known.
- the inference processing device 1 estimates an event at each time by using input data such as unknown time series data and weight data of a trained neural network as inputs.
- the input data and the weight data are matrix data.
- the inference processing device 1 estimates the amount of garbage by detecting events such as rotation or stopping of a garbage truck using input data acquired from a sensor terminal equipped with an acceleration sensor and a gyro sensor (see Non-Patent Document 1).
- the inference processing device 1 includes a first storage unit 10 that stores input data, a second storage unit 12 that stores the weights of the trained neural network, a data filter unit 11 that extracts only specific data from the input data and supplies it as input data to the inference calculation unit 13, and the inference calculation unit 13, which takes the input data extracted by the data filter unit 11 and the weights of the trained neural network as inputs, executes the inference calculation of the trained neural network, and infers the characteristics of the input data.
- the first storage unit 10 has a function of storing input data.
- the second storage unit 12 has a function of storing a trained neural network model, that is, weight data.
- the inference calculation unit 13 has a function of performing a neural network calculation with the input data, weight data, and output data as inputs, and outputting the result. During the period when the input data is not input, the inference calculation unit 13 does not perform the inference calculation process.
- the clock supply to the inference calculation unit may be stopped (clock gating) or the power supply may be stopped (power gating) to reduce the power consumption.
- the inference calculation unit 13 may output the immediately preceding inference result to the outside such as a higher-level device or a user device without performing the inference calculation process.
- the data filter unit 11 has a function of extracting only specific data from the input data and inputting it to the inference calculation unit 13. Specifically, it determines the similarity between the input data and previously inferred data, and extracts only dissimilar input data for input to the inference calculation unit 13. Because similar input data yields the same inference result, inference calculation processing can be skipped for it. Inference calculation processing therefore no longer has to be performed on all the input data, which speeds up the processing while reducing the associated power consumption.
- the holding unit 120 holds the input data that was used when the inference calculation unit 13 most recently performed inference processing and output an inference result.
- the comparison unit 110 compares the current input data with the input data used when the inference calculation unit 13 most recently performed inference processing and output an inference result. Based on the comparison result, the output control unit 130 determines whether to output the input data to the inference calculation unit 13.
- when the difference between the input data and the input data used when inference processing was most recently performed is equal to or larger than the threshold value, the output control unit 130 inputs the input data to the inference calculation unit 13. When the difference is smaller than the threshold value, the input data is not input to the inference calculation unit 13, and the inference result obtained by the inference calculation unit 13 immediately before is used as the inference result at that time.
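The interplay of the holding unit 120, the comparison unit 110, and the output control unit 130 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the class name `DataFilter`, the `max_abs_diff` metric (largest element-wise absolute difference), and the toy `infer` function are hypothetical, since the text does not specify how the difference is computed.

```python
from typing import Callable, List, Optional, Sequence


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed difference metric: largest element-wise absolute difference."""
    return max(abs(x - y) for x, y in zip(a, b))


class DataFilter:
    """Forwards an input to inference only when it differs enough from the held one."""

    def __init__(self, infer: Callable[[Sequence[float]], str], threshold: float):
        self.infer = infer
        self.threshold = threshold
        self.held_input: Optional[List[float]] = None  # role of holding unit 120
        self.last_result: Optional[str] = None
        self.inference_count = 0

    def process(self, x: Sequence[float]) -> Optional[str]:
        # Role of comparison unit 110: compare with the most recently inferred input.
        if (self.held_input is not None
                and max_abs_diff(x, self.held_input) < self.threshold):
            # Difference below threshold: skip inference, reuse the previous result.
            return self.last_result
        # Role of output control unit 130: forward the input to the inference step.
        self.held_input = list(x)
        self.last_result = self.infer(x)
        self.inference_count += 1
        return self.last_result


# Toy stand-in for the inference calculation unit 13 (hypothetical).
def infer(x: Sequence[float]) -> str:
    return "moving" if sum(x) > 1.0 else "stopped"


filt = DataFilter(infer, threshold=0.1)
stream = [[0.0, 0.0], [0.05, 0.0], [2.0, 0.1], [2.02, 0.1]]
results = [filt.process(x) for x in stream]
```

In this toy stream only two of the four inputs reach `infer`; the other two reuse the preceding result.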
- in the above, the input data is compared with the input data that was inferred immediately before, but a plurality of previously inferred input data may instead be retained and the input data compared with the plurality of retained data.
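The variant that retains several previously inferred inputs could look like the following sketch. The text does not say how the retained data is used, so the behavior here is an assumption: the filter keeps a bounded history of (input, result) pairs and reuses the result of the first retained input that lies within the threshold. `HistoryFilter`, `capacity`, and `max_abs_diff` are hypothetical names.

```python
from collections import deque
from typing import Callable, Sequence


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed difference metric: largest element-wise absolute difference."""
    return max(abs(x - y) for x, y in zip(a, b))


class HistoryFilter:
    """Reuses a past result if the input is close to any retained past input."""

    def __init__(self, infer: Callable[[Sequence[float]], str],
                 threshold: float, capacity: int = 4):
        self.infer = infer
        self.threshold = threshold
        # Bounded history of (input, result) pairs; oldest entries are dropped.
        self.history = deque(maxlen=capacity)
        self.inference_count = 0

    def process(self, x: Sequence[float]) -> str:
        for past_x, past_result in self.history:
            if max_abs_diff(x, past_x) < self.threshold:
                return past_result  # similar to a retained input: skip inference
        result = self.infer(x)
        self.history.append((list(x), result))
        self.inference_count += 1
        return result


# Toy inference function (hypothetical).
def infer(x: Sequence[float]) -> str:
    return "high" if x[0] > 1.0 else "low"


hist = HistoryFilter(infer, threshold=0.1)
stream = [[0.0], [5.0], [0.01], [5.01], [0.02]]
results = [hist.process(x) for x in stream]
```

For an input that alternates between two states, a single held input would trigger inference on every sample, whereas the retained history needs only two inferences here.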
- the data filter unit 11 exploits the fact that the subsequent inference calculation processing produces an unchanged result for similar input data. By checking the similarity of the input data, it can be configured so that the inference calculation unit 13 does not need to process similar data. This has the effect of speeding up the inference calculation process while reducing the power consumption associated with it.
- when the input data includes a plurality of elements, for example a plurality of data acquired from a sensor equipped with an acceleration sensor and a gyro sensor, each element of the input data is compared.
- each element is compared using a first threshold value. When the number of elements whose difference is equal to or greater than the first threshold value is equal to or greater than a second threshold value, it is determined that the input data is to be input to the inference calculation unit 13. When that number is less than the second threshold value, it may be determined that the input data is not to be input to the inference calculation unit 13.
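The two-threshold rule for multi-element input can be written compactly. The sketch below assumes each input is a flat list of sensor values; the function name `should_infer` and the sample values are hypothetical.

```python
from typing import Sequence


def should_infer(current: Sequence[float], previous: Sequence[float],
                 first_threshold: float, second_threshold: int) -> bool:
    """Return True when enough elements changed by at least first_threshold."""
    changed = sum(
        1 for c, p in zip(current, previous) if abs(c - p) >= first_threshold
    )
    return changed >= second_threshold


# Six hypothetical elements, e.g. 3-axis acceleration followed by 3-axis gyro.
previous = [0.0, 0.0, 9.8, 0.0, 0.0, 0.0]
current = [0.5, 0.0, 9.8, 0.3, 0.0, 0.0]

forward_when_two_needed = should_infer(current, previous, 0.2, 2)
forward_when_three_needed = should_infer(current, previous, 0.2, 3)
```

Two elements change by at least 0.2, so the input is forwarded when the second threshold is 2 and suppressed when it is 3.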
- the data to be compared is not limited to input data.
- the feedback data, that is, the output data, may also be supplied to the data filter unit 11 and compared.
- in that case, the presence or absence of a difference is determined by taking the logical OR or the logical AND of the individual comparison results.
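Combining several comparison results with a logical OR or AND might be sketched as follows. The pairing of one input comparison with one feedback comparison, the shared threshold, and the names `differs` and `combined_difference` are assumptions made for illustration.

```python
from typing import Sequence, Tuple

Pair = Tuple[Sequence[float], Sequence[float]]


def differs(current: Sequence[float], previous: Sequence[float],
            threshold: float) -> bool:
    """Assumed per-comparison test: any element changed by at least threshold."""
    return max(abs(c - p) for c, p in zip(current, previous)) >= threshold


def combined_difference(input_pair: Pair, feedback_pair: Pair,
                        threshold: float, mode: str = "or") -> bool:
    """Combine the input and feedback comparisons by logical OR or AND."""
    flags = [
        differs(input_pair[0], input_pair[1], threshold),
        differs(feedback_pair[0], feedback_pair[1], threshold),
    ]
    if mode == "or":
        return any(flags)
    if mode == "and":
        return all(flags)
    raise ValueError("mode must be 'or' or 'and'")


input_pair = ([0.5, 0.0], [0.0, 0.0])   # input data changed
feedback_pair = ([1.0], [1.0])          # feedback (output) data unchanged

or_result = combined_difference(input_pair, feedback_pair, 0.2, mode="or")
and_result = combined_difference(input_pair, feedback_pair, 0.2, mode="and")
```

With OR, a change in either the input or the feedback data triggers inference; with AND, both must change.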
- the method of inference operation processing is not limited to this.
- the inference operation result may be used as an input for the inference operation processing in the next cycle, that is, output feedback may be performed.
- the inference processing device 1 further includes a third storage unit 14 that holds the output data fed back from the inference calculation unit 13.
- the data amount may be reduced by lengthening the sampling period for the input data in the data filter unit 11.
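Lengthening the sampling period amounts to keeping only every k-th sample. A minimal sketch, assuming the samples arrive at a fixed rate and that simple decimation without any smoothing is acceptable; `downsample` and `factor` are hypothetical names.

```python
from typing import List, Sequence


def downsample(samples: Sequence[float], factor: int) -> List[float]:
    """Keep every `factor`-th sample, lengthening the sampling period by `factor`."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return list(samples[::factor])


samples = list(range(10))          # ten samples at the original period
reduced = downsample(samples, 3)   # sampling period three times longer
```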
- the difference from the conventional inference processing device 1 shown in FIG. 21 is that the data filter unit 11 is provided.
- the conventional inference processing device 1 performs inference calculation processing on all input data, whereas the inference processing device 1 of the present invention extracts only specific input data by means of the data filter unit 11.
- FIG. 6 is a flowchart showing the operation of the data filter unit in the inference processing device according to the first embodiment.
- the data filter unit 11 sets a threshold value used to detect the difference between the input data and the input data that has been inferred in the past (step S1-1).
- the threshold value may be dynamically changed during operation.
- for example, if there is no difference in the inference processing results obtained with the threshold value used at a certain time, the threshold value may be increased. Further, when the inference accuracy of the inference processing result is lower than the desired accuracy, reducing the threshold value causes inference processing to be performed on more input data, so the accuracy can be expected to improve. In this way, the threshold value used for similarity comparison of input data may be set dynamically according to the inference calculation result.
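One way to picture the dynamic threshold setting is the update rule below. It is only a sketch: the multiplicative `step`, the use of a window of recent results to detect "no difference", and the externally supplied `accuracy` measurement are all assumptions, since the text does not define how accuracy is measured or by how much the threshold changes.

```python
from typing import Sequence


def adjust_threshold(threshold: float, recent_results: Sequence[str],
                     accuracy: float, target_accuracy: float,
                     step: float = 0.5, minimum: float = 0.0) -> float:
    """Lower the threshold when accuracy is too low; raise it when results are static."""
    if accuracy < target_accuracy:
        # More inputs pass the filter, so more inferences are performed.
        return max(minimum, threshold * (1.0 - step))
    if len(set(recent_results)) <= 1:
        # Recent inference results show no difference: filter more aggressively.
        return threshold * (1.0 + step)
    return threshold


lowered = adjust_threshold(1.0, ["stop", "move"], accuracy=0.8, target_accuracy=0.9)
raised = adjust_threshold(1.0, ["stop", "stop"], accuracy=0.95, target_accuracy=0.9)
kept = adjust_threshold(1.0, ["stop", "move"], accuracy=0.95, target_accuracy=0.9)
```

The accuracy check takes priority in this sketch, so a low-accuracy period never raises the threshold.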
- the data filter unit 11 acquires the input data and the input data that was inferred immediately before (step S1-2), and calculates the difference between the input data and the past input data that was previously inferred (step S1-3).
- as the past input data, for example, the input data that was input and inferred immediately before can be used.
- the calculated difference is compared with the threshold value, and when the difference is equal to or greater than the threshold value (step S1-4: Yes), the input data is output to the inference calculation unit 13 (step S1-5). On the other hand, when the difference is smaller than the threshold value (step S1-4: No), the input data is not output to the inference calculation unit 13 (step S1-6). As the inference result in this case, the inference result obtained by the inference calculation unit 13 using the past input data is used.
- the data filter unit 11 determines the similarity between the input data and the data on which the inference calculation was performed in the past, and can be configured not to output input data similar to that past data to the inference calculation unit 13. As a result, the inference calculation unit 13 does not have to perform inference calculation processing on input data similar to input data that has been inferred in the past, so the inference calculation processing can be sped up and the power consumption associated with it reduced.
- in order to perform inference processing on unknown input data by using a neural network model in which weight values are learned using predetermined learning data, the first storage unit 10 stores the input data, the second storage unit 12 stores the weights of the trained neural network, the data filter unit 11 extracts only specific data from the input data and supplies it as input data to the inference calculation unit 13, and the inference calculation unit 13 receives the input data extracted by the data filter unit 11 and the weights of the trained neural network as inputs, executes the inference calculation of the trained neural network, and infers the characteristics of the input data.
- the data filter unit 11 exploits the fact that the subsequent inference calculation processing produces an unchanged result for similar input data: it determines the similarity of the input data and can be configured so that inference calculation processing does not have to be performed on similar data.
- the inference processing device 1 of the present invention can therefore speed up inference calculation processing and reduce the power consumption associated with it, compared with a conventional inference processing device that performs inference processing on all input data. Further, since it is not necessary to perform inference processing on all the input data, the output of inference results from the inference processing device 1 can be reduced, so that the load on the communication network can be reduced.
- FIG. 7 is a block diagram showing a configuration of the inference processing device according to the second embodiment.
- FIG. 8 is a block diagram showing a configuration of the inference processing device according to the second embodiment.
- FIG. 9 is a block diagram showing a configuration of the inference processing device according to the second embodiment.
- a data filter unit 11 is provided in front of the storage unit, and only specific data among the input data is extracted and then stored in the storage unit.
- the memory control unit determines whether or not there is input data stored in the storage unit, that is, input data waiting for inference calculation processing, and inputs the input data to the inference calculation unit 13 in the subsequent stage.
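Placing the filter in front of the storage unit, with a memory control step that checks for pending input, might be sketched as below. The queue-based storage, the comparison against the last stored input, and the names `FilteredBuffer`, `receive`, `has_pending`, and `next_input` are assumptions for illustration only.

```python
from collections import deque
from typing import List, Optional, Sequence


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed difference metric: largest element-wise absolute difference."""
    return max(abs(x - y) for x, y in zip(a, b))


class FilteredBuffer:
    """Filters inputs before storing them; stored inputs await inference."""

    def __init__(self, threshold: float):
        self.threshold = threshold
        self.queue = deque()                          # stands in for the storage unit
        self.last_kept: Optional[List[float]] = None

    def receive(self, x: Sequence[float]) -> bool:
        """Data filter step: store the input only if it is not similar to the last kept one."""
        if (self.last_kept is None
                or max_abs_diff(x, self.last_kept) >= self.threshold):
            self.last_kept = list(x)
            self.queue.append(list(x))
            return True
        return False

    def has_pending(self) -> bool:
        """Memory control step: is any input waiting for inference?"""
        return bool(self.queue)

    def next_input(self) -> List[float]:
        return self.queue.popleft()


buf = FilteredBuffer(threshold=0.1)
for sample in [[0.0], [0.01], [1.0], [1.02], [3.0]]:
    buf.receive(sample)

inferred_inputs = []
while buf.has_pending():
    inferred_inputs.append(buf.next_input())  # would be passed to the inference step
```

Only three of the five samples are stored, which is the memory saving this arrangement aims at.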
- the inference operation result may be used as an input for the inference operation process in the next cycle, that is, output feedback may be performed.
- performing output feedback has the effect of enabling inference operations suited to time-series data, such as character strings and voice / language processing.
- the output may be fed back directly within the inference calculation unit 13 instead of being input to the storage unit, which has the effect of reducing the amount of memory consumed in the storage unit.
- the first storage unit 10 stores the input data, the second storage unit 12 stores the weights of the trained neural network, the data filter unit 11 extracts only specific data from the input data and supplies it as input data to the inference calculation unit 13, and the inference calculation unit 13 receives the input data extracted by the data filter unit 11 and the weights of the trained neural network as inputs, executes the inference calculation of the trained neural network, and infers the characteristics of the input data.
- the data filter unit 11 determines the similarity of the input data and can be configured so that inference calculation processing does not have to be performed on input data similar to input data that has been inferred in the past.
- the inference processing device 1 of the present invention can therefore speed up inference calculation processing and reduce the power consumption associated with it, compared with a conventional inference processing device that performs inference processing on all input data.
- the output of the inference result from the inference processing device 1 can be reduced, so that the load on the communication network can be reduced.
- FIG. 10 is a block diagram showing a configuration of the inference processing device according to the third embodiment.
- FIG. 11 is a block diagram showing a configuration of a data filter unit in the inference processing device according to the third embodiment.
- FIG. 12 is a block diagram showing a configuration of the inference processing device according to the third embodiment.
- FIG. 13 is a block diagram showing a configuration of the inference processing device according to the third embodiment.
- this embodiment differs from the first and second embodiments in that the inference processing device 1 receives input data from a plurality of data sources, performs inference calculation processing on those input data, and outputs inference results, and in that it includes a data filter unit 11 that detects similarities between a plurality of input data at the same time.
- the data filter unit 11 has a function of extracting only specific data from a plurality of input data and inputting it to the inference calculation unit 13. Specifically, as shown in FIG. 11, a plurality of input data are compared with each other, and when the difference is equal to or less than the threshold value, inference calculation processing is performed on only one of the compared input data. In this case, the inference result for the input data that was not inferred is taken to be the same as the inference result obtained by the inference calculation unit 13 for the input data it was compared with. On the other hand, when the difference is larger than the threshold value, the two inputs yield different inference results, so inference calculation processing is performed on both.
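The comparison between inputs from different sources can be sketched as follows. The dictionary of per-source inputs, the fixed list of pairs, the `max_abs_diff` metric, and the toy `infer` function are assumptions; only the rule (infer one input of a similar pair and share the result, infer both otherwise) comes from the text.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed difference metric: largest element-wise absolute difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def infer_with_pair_filter(
    inputs: Dict[str, Sequence[float]],
    pairs: List[Tuple[str, str]],
    infer: Callable[[Sequence[float]], str],
    threshold: float,
) -> Tuple[Dict[str, str], int]:
    """Infer once per similar pair, twice per dissimilar pair; return results and call count."""
    results: Dict[str, str] = {}
    calls = 0
    for a, b in pairs:
        results[a] = infer(inputs[a])
        calls += 1
        if max_abs_diff(inputs[a], inputs[b]) <= threshold:
            results[b] = results[a]        # similar: share the inference result
        else:
            results[b] = infer(inputs[b])  # dissimilar: infer both
            calls += 1
    return results, calls


# Toy inference function and same-time inputs from four hypothetical sources.
def infer(x: Sequence[float]) -> str:
    return "active" if sum(x) > 1.0 else "idle"


inputs = {"s1": [1.0, 2.0], "s2": [1.05, 2.0], "s3": [0.0, 0.0], "s4": [4.0, 4.0]}
pairs = [("s1", "s2"), ("s3", "s4")]
results, calls = infer_with_pair_filter(inputs, pairs, infer, threshold=0.1)
```

Here four inputs are served by three inference calls, because `s2` reuses the result of `s1`.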
- the data filter unit 11 detects the similarity between input data from a plurality of different sources. Because the subsequent inference calculation processing yields the same result for similar input data, it does not have to be performed on every such input. This has the effect of speeding up the inference calculation process and reducing the power consumption associated with it.
- the input data to be compared is a predetermined combination.
- the predetermined combination is, for example, a comparison between input data whose sources are physically closest to each other, or a comparison between input data paired according to the order of the identifiers given to the sources of the input data.
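The two example pairing rules could be realized as shown below. The greedy nearest-neighbor strategy, the 2-D coordinates, and the function names `nearest_pairs` and `id_order_pairs` are assumptions; the text only states that sources may be paired by physical proximity or by identifier order.

```python
import math
from typing import Dict, List, Tuple


def nearest_pairs(positions: Dict[str, Tuple[float, float]]) -> List[Tuple[str, str]]:
    """Greedily pair each unpaired source with its physically closest unpaired source."""
    unpaired = sorted(positions)
    pairs: List[Tuple[str, str]] = []
    while len(unpaired) >= 2:
        a = unpaired.pop(0)
        b = min(unpaired, key=lambda other: math.dist(positions[a], positions[other]))
        unpaired.remove(b)
        pairs.append((a, b))
    return pairs


def id_order_pairs(ids: List[str]) -> List[Tuple[str, str]]:
    """Pair sources that are adjacent in identifier order."""
    ordered = sorted(ids)
    return [(ordered[i], ordered[i + 1]) for i in range(0, len(ordered) - 1, 2)]


# Hypothetical source positions.
positions = {"t1": (0.0, 0.0), "t2": (10.0, 0.0), "t3": (0.0, 1.0), "t4": (10.0, 2.0)}
by_distance = nearest_pairs(positions)
by_identifier = id_order_pairs(list(positions))
```

For mobile sources, `nearest_pairs` would simply be recomputed from the positions at each time step.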
- the number of times the input data is compared is not limited to one step, and the comparison may be performed multiple times by combining different input data.
- the inference calculation unit 13 may perform inference processing for a plurality of input data in parallel. This has the effect of speeding up the inference calculation process.
- the above shows an example of comparing input data from predetermined input sources, but it is not always necessary to compare specific input data. That is, arbitrary input data may be compared with each other. For example, when the terminals that generate the input data are mobile terminals that physically move over time, the input data may be compared by pairing mobile terminals that are physically close to each other at that time.
- the input data is reduced in order to speed up the inference processing and reduce the power consumption. Therefore, instead of searching exhaustively for similarities across all combinations of input data, an example of comparing similarity only for certain combinations of input data is shown, but the similarity detection method is not limited to this.
- if comparing the similarity of all combinations of the sources of the input data input to the inference processing device 1 can be executed faster than the inference processing in the subsequent stage, and the power required for detecting the similarity is lower than that of the inference processing in the subsequent stage, the inference processing may be realized at higher speed and lower power by searching for similarities exhaustively.
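The condition for preferring an exhaustive similarity search can be expressed as a simple cost check. This is one possible reading, stated as an assumption: the total cost of all pairwise comparisons is weighed against the cost of a single inference, and the per-comparison and per-inference costs (in arbitrary integer units) are hypothetical measurements.

```python
def exhaustive_search_worthwhile(n_sources: int,
                                 compare_time: int, compare_energy: int,
                                 infer_time: int, infer_energy: int) -> bool:
    """True when comparing all source pairs is both faster and cheaper than one inference."""
    n_pairs = n_sources * (n_sources - 1) // 2   # number of distinct source pairs
    faster = n_pairs * compare_time < infer_time
    lower_power = n_pairs * compare_energy < infer_energy
    return faster and lower_power


# Hypothetical costs: time in microseconds, energy in microjoules.
few_sources = exhaustive_search_worthwhile(4, 1, 2, 1000, 500)
many_sources = exhaustive_search_worthwhile(40, 1, 2, 1000, 500)
```

Because the number of pairs grows quadratically, the exhaustive search pays off for 4 sources in this example but not for 40.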
- the threshold value used for detecting similarity is given as the initial setting, but the threshold setting method is not limited to this. For example, if there is no difference in the obtained inference processing result with respect to the threshold value used at a certain time, the threshold value may be increased. Further, when the inference accuracy for the inference processing result is lower than the desired accuracy, the inference processing is performed for more input data by reducing the threshold value, so that the accuracy can be expected to be improved. In this way, the threshold value used for similarity comparison of input data may be dynamically set according to the inference calculation result.
- the inference operation result may be used as an input for the inference operation processing in the next cycle, that is, output feedback may be performed.
- performing output feedback has the effect of enabling inference operations suited to time-series data, such as character strings and voice / language processing.
- the output may be fed back directly within the inference calculation unit 13 instead of being input to the storage unit, which has the effect of reducing the amount of memory consumed in the storage unit.
- FIG. 14 is a flowchart showing the operation of the data filter unit in the inference processing device according to the third embodiment.
- the difference from the first and second embodiments is that input data is received from a plurality of data sources, the similarity between the plurality of input data is determined, and specific input data is extracted from the input data based on that similarity.
- the data filter unit 11 sets a threshold value used to detect similarities between a plurality of input data (step S2-1).
- the threshold value may be dynamically changed during operation.
- for example, if there is no difference in the inference processing results obtained with the threshold value used at a certain time, the threshold value may be increased. Further, when the inference accuracy of the inference processing result is lower than the desired accuracy, reducing the threshold value causes inference processing to be performed on more input data, so the accuracy can be expected to improve. In this way, the threshold value used for similarity comparison of input data may be set dynamically according to the inference calculation result.
- the data filter unit 11 acquires input data from a plurality of data sources (step S2-2) and calculates the difference between them (step S2-3).
- when the calculated difference is equal to or greater than the threshold value (step S2-4: Yes), the output results of the inference calculation processing for the plurality of input data are different, so the inference calculation processing is performed on each of the plurality of input data (step S2-5).
- on the other hand, when the calculated difference is smaller than the threshold value (step S2-4: No), the inference calculation process is performed on only one of the compared input data (step S2-6). In this case, the inference result for the other input data, which was not inferred, is taken to be the same as the inference result obtained by the inference calculation unit 13 for the input data it was compared with.
- the data filter unit 11 determines the similarity between input data from a plurality of different sources and can be configured not to perform inference calculation processing on all of the similar input data. This makes it possible to speed up the inference calculation process and reduce the power consumption associated with it.
- a plurality of first storage units 10 store the input data, the second storage unit 12 stores the weights of the trained neural network, the data filter unit 11 detects the similarity between the plurality of input data at the same time, extracts only specific data from the input data, and supplies it as input data to the inference calculation unit 13, and the inference calculation unit 13 receives the input data extracted by the data filter unit 11 and the weights of the trained neural network as inputs, executes the inference operation of the trained neural network, and infers the characteristics of the input data.
- the data filter unit 11 detects the similarity between input data from a plurality of different sources, so that the subsequent inference calculation processing is not required for similar input data that would yield the same inference result.
- the inference processing device 1 of the present invention can therefore speed up inference calculation processing and reduce the power consumption associated with it, compared with a conventional inference processing device that performs inference processing on all input data.
- the output of the inference result from the inference processing device 1 can be reduced, so that the load on the communication network can be reduced.
- FIG. 15 is a block diagram showing a configuration of the inference processing device according to the fourth embodiment.
- FIG. 16 is a block diagram showing a configuration of the inference processing device according to the fourth embodiment.
- FIG. 17 is a block diagram showing a configuration of the inference processing device according to the fourth embodiment.
- a data filter unit 11 is provided in front of the storage unit, and only specific data is extracted from the input data and then stored in the first storage unit 10. The inference processing device 1 receives input data from a plurality of data sources, performs inference calculation processing on those input data, and further detects similarities between a plurality of input data at the same time.
- the inference operation result may be used as an input for the inference operation processing in the next cycle, that is, output feedback may be performed.
- output feedback By performing output feedback, there is an effect that inference operations suitable for time-series data such as character strings and voice / language processing can be performed.
- the output feedback may be directly fed back in the inference calculation unit 13 instead of being input to the storage unit, which has the effect of reducing the amount of memory consumed in the storage unit. ..
- a plurality of first storage units 10 are used.
- the second storage unit 12 stores the weights of the trained neural network.
- the data filter unit 11 detects the similarity among the plurality of input data at the same time, and the input data is stored.
- only specific data is extracted and used as input data to the inference calculation unit 13. The inference calculation unit 13 takes as inputs the input data extracted by the data filter unit 11 and the weights of the trained neural network.
- the inference operation of the trained neural network is executed to infer the characteristics of the input data.
- the data filter unit 11 detects the similarity between input data from a plurality of different input data sources, and the subsequent inference calculation processing is not performed for similar input data that would yield the same inference result.
- the inference processing device 1 of the present invention can speed up the inference calculation processing and reduce the associated power consumption compared with a conventional inference processing device that performs inference processing on all input data.
- with the data filter unit 11 placed in front of the first storage unit 10, the amount of memory used by the first storage unit 10 can be reduced. Further, since inference processing need not be performed on all the input data, the number of inference results output from the inference processing device 1 can be reduced, which reduces the load on the communication network.
- FIG. 18 is a block diagram showing a configuration of a data filter unit in the inference processing device according to the fifth embodiment.
- the difference from the first to fourth embodiments is that the data filter unit 11 detects, for input data from a plurality of data sources, both the similarity among the plurality of input data at the same time and the similarity with the input data that was used when the inference calculation unit 13 most recently performed inference processing and output an inference result.
- the data filter unit 11 has a function of extracting only specific data from a plurality of input data and inputting it to the inference calculation unit 13. Specifically, as shown in FIG. 18, a plurality of input data are compared with each other, and when the difference is equal to or less than a threshold value, only one of the compared input data is extracted. Further, the input data is compared with the input data that was used when the inference calculation unit 13 most recently performed inference processing and output an inference result; if the difference is equal to or greater than the threshold value, the input data is input to the inference calculation unit 13, and inference calculation processing is performed only on that input data.
- the inference result for the input data that was not inferred is regarded as the same as the inference result obtained by the inference calculation unit 13 for the input data it was compared with.
- when the difference is smaller than the threshold value, the input data is not input to the inference calculation unit 13, and the result inferred by the inference calculation unit 13 immediately before is used as the inference result at that time.
- the inference calculation processing is performed on both input data.
- the input data is compared with the input data that was used when the inference calculation unit 13 most recently performed inference processing and output an inference result, and if the difference is equal to or greater than the threshold value, the input data is input to the inference calculation unit 13.
- the inference result for the input data that was not inferred is regarded as the same as the inference result obtained by the inference calculation unit 13 for the input data it was compared with.
- the data filter unit 11 detects the similarity between input data from a plurality of different input data sources; since the subsequent inference calculation processing would yield the same result for similar input data, that processing need not be performed. As a result, the inference calculation processing can be speeded up and the power consumption associated with the inference calculation processing can be reduced.
- the data filter unit 11 detects the similarity of the input data over time; since the result of the subsequent inference calculation processing does not change for similar input data, that processing need not be performed.
- the inference calculation processing can be speeded up and the power consumption associated with the inference calculation processing can be reduced.
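The filtering and result-reuse behavior described for the fifth embodiment can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the difference measure (`max_abs_diff`), the function name `run_cycle`, the per-source bookkeeping dictionaries, and the `infer` callable are all hypothetical names introduced here.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def max_abs_diff(a: Vector, b: Vector) -> float:
    """Assumed difference measure: largest element-wise absolute gap."""
    return max(abs(x - y) for x, y in zip(a, b))


def run_cycle(
    current: Dict[str, Vector],
    pairs: List[Tuple[str, str]],
    last_input: Dict[str, Vector],
    last_result: Dict[str, float],
    infer: Callable[[Vector], float],
    pair_threshold: float,
    time_threshold: float,
) -> Dict[str, float]:
    """Run one cycle and return an inference result for every source."""
    alias: Dict[str, str] = {}  # skipped source -> source whose result it shares
    representatives = set()
    for a, b in pairs:
        if a in alias or b in alias or b in representatives:
            continue
        # Similar inputs at the same time: keep only one of the pair.
        if max_abs_diff(current[a], current[b]) <= pair_threshold:
            alias[b] = a
            representatives.add(a)

    results: Dict[str, float] = {}
    for src, data in current.items():
        if src in alias:
            continue
        prev = last_input.get(src)
        # Infer only if the input changed enough since the last inferred input.
        if prev is None or max_abs_diff(data, prev) >= time_threshold:
            results[src] = infer(data)
            last_input[src] = data
            last_result[src] = results[src]
        else:
            results[src] = last_result[src]  # reuse the previous result

    for src, rep in alias.items():
        results[src] = results[rep]  # share the representative's result
    return results
```

In this sketch an input that was never inferred before is always passed to `infer`; the source text does not state how the first cycle is handled, so that choice is an assumption.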
- the input data to be compared is a predetermined combination.
- the predetermined combination is, for example, a comparison of input data whose sources are physically closest to each other, or a comparison of input data paired according to the order of the identifiers assigned to the input data sources.
- the comparison of input data is not limited to a single stage; it may be performed multiple times with different combinations of input data.
- the inference calculation unit 13 may perform inference processing for a plurality of input data in parallel. This has the effect of speeding up the inference calculation process.
- an example has been shown in which input data from predetermined input sources are compared, but it is not always necessary to compare specific input data. That is, arbitrary input data may be compared with each other. For example, when the terminals that generate the input data are mobile terminals whose physical positions change over time, the input data of mobile terminals that are physically close to each other at that time may be compared.
- the input data is reduced in order to speed up the inference processing and reduce the power consumption. Therefore, instead of searching for similarities exhaustively over all combinations of input data, an example of comparing similarity only for certain combinations of input data has been shown, but the similarity detection method is not limited to this.
- if comparing the similarity of all combinations of the input data sources input to the inference processing device 1 can be executed faster than the subsequent inference processing, and the power required for detecting the similarity is lower than that of the subsequent inference processing, faster and lower-power inference processing may be achieved by searching for similarities exhaustively.
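The two ways of choosing comparison pairs mentioned above (by physical proximity of the sources, or by the order of their identifiers) can be sketched as below. The greedy nearest-neighbor pairing and the function names `nearest_pairs` and `id_order_pairs` are assumptions made for illustration; the source does not specify a pairing algorithm.

```python
import math
from typing import Dict, List, Sequence, Tuple


def nearest_pairs(positions: Dict[str, Sequence[float]]) -> List[Tuple[str, str]]:
    """Greedily pair each source with its nearest still-unpaired source."""
    unpaired = sorted(positions)
    pairs: List[Tuple[str, str]] = []
    while len(unpaired) >= 2:
        src = unpaired.pop(0)
        # Pick the closest remaining source by Euclidean distance.
        partner = min(unpaired, key=lambda o: math.dist(positions[src], positions[o]))
        unpaired.remove(partner)
        pairs.append((src, partner))
    return pairs


def id_order_pairs(ids: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair sources that are adjacent in sorted identifier order."""
    ordered = sorted(ids)
    return [(ordered[i], ordered[i + 1]) for i in range(0, len(ordered) - 1, 2)]
```

For mobile terminals, `nearest_pairs` would be called again whenever the positions are updated, so that the comparison pairs follow the terminals as they move. With an odd number of sources, both functions leave one source unpaired.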
- the threshold value used for detecting similarity is given as the initial setting, but the threshold setting method is not limited to this. For example, if there is no difference in the obtained inference processing result with respect to the threshold value used at a certain time, the threshold value may be increased. Further, when the inference accuracy for the inference processing result is lower than the desired accuracy, the inference processing is performed for more input data by reducing the threshold value, so that the accuracy can be expected to be improved. In this way, the threshold value used for similarity comparison of input data may be dynamically set according to the inference calculation result.
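The dynamic threshold setting described above can be sketched as a small update rule. The step size, the clamping bounds, and the function name `adjust_threshold` are hypothetical; the source only states the direction of each adjustment.

```python
def adjust_threshold(
    threshold: float,
    results_unchanged: bool,
    accuracy: float,
    target_accuracy: float,
    step: float = 0.1,
    min_threshold: float = 0.0,
    max_threshold: float = 1.0,
) -> float:
    """Return the similarity threshold to use in the next cycle."""
    if accuracy < target_accuracy:
        # Accuracy too low: lower the threshold so more inputs are inferred.
        return max(min_threshold, threshold - step)
    if results_unchanged:
        # Results did not change with the current threshold: filter more.
        return min(max_threshold, threshold + step)
    return threshold
```

Giving the accuracy check priority over the "results unchanged" check is a design choice of this sketch, made so that the threshold is never raised while accuracy is below the target.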
- the inference operation result may be used as an input for the inference operation processing in the next cycle, that is, output feedback may be performed.
- performing output feedback enables inference operations suited to time-series data such as character strings and speech / language processing.
- the output feedback may be fed back directly within the inference calculation unit 13 instead of being input to the storage unit, which reduces the amount of memory consumed in the storage unit.
- a first threshold value is applied to each element of the input data: when the number of elements whose difference is equal to or greater than the first threshold value is equal to or greater than a second threshold value, it is determined that there is a difference; when that number is less than the second threshold value, it is determined that there is no difference.
- the data to be compared is not limited to input data.
- the feedback data, that is, the output data, may also be compared.
- the presence / absence of a difference is determined by taking the logical sum (OR) or the logical product (AND) of the individual comparison results.
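The element-wise two-threshold test and the logical combination of several comparison results can be sketched as follows. The function names `has_difference` and `combine`, and the use of an absolute difference per element, are assumptions for illustration.

```python
from typing import Iterable, Sequence


def has_difference(
    a: Sequence[float],
    b: Sequence[float],
    first_threshold: float,
    second_threshold: int,
) -> bool:
    """True when enough elements differ by at least first_threshold."""
    changed = sum(1 for x, y in zip(a, b) if abs(x - y) >= first_threshold)
    return changed >= second_threshold


def combine(results: Iterable[bool], mode: str = "or") -> bool:
    """Combine comparison results (e.g. input data and fed-back output data)."""
    results = list(results)
    if mode == "or":   # logical sum: any comparison reports a difference
        return any(results)
    if mode == "and":  # logical product: every comparison reports a difference
        return all(results)
    raise ValueError(f"unknown mode: {mode}")
```

With `mode="or"` inference is triggered more often (favoring accuracy), while `mode="and"` skips more inputs (favoring speed and power); which one suits a given application is not stated in the source.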
- the inference operation processing is not limited to this.
- the inference operation result may be used as an input for the inference operation processing in the next cycle, that is, output feedback may be performed.
- the inference processing device 1 further includes a third storage unit 14 that holds the output data fed back from the inference calculation unit 13. Performing output feedback enables inference operations suited to time-series data such as character strings and speech / language processing.
- the output feedback may be fed back directly within the inference calculation unit 13 instead of being input to the third storage unit 14, which reduces the memory capacity that must be mounted on the inference processing device 1.
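Output feedback, in which the inference result of one cycle becomes an input of the next, can be sketched as a simple loop. The function name `run_with_feedback` and the `step` callable are hypothetical; here the fed-back value is kept in a local variable, loosely corresponding to direct feedback inside the inference calculation unit rather than a separate storage unit.

```python
from typing import Callable, Iterable, List


def run_with_feedback(
    inputs: Iterable[float],
    step: Callable[[float, float], float],
    initial_output: float = 0.0,
) -> List[float]:
    """Feed each cycle's output back as an extra input to the next cycle."""
    y = initial_output
    outputs: List[float] = []
    for x in inputs:
        y = step(x, y)  # the previous output y is reused as feedback
        outputs.append(y)
    return outputs
```

For example, `run_with_feedback([1.0, 1.0, 1.0], lambda x, y: 0.5 * y + x)` carries half of each output into the next cycle.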
- FIG. 19 is a flowchart showing the operation of the data filter unit in the inference processing device according to the fifth embodiment.
- the difference from the first to fourth embodiments is that input data is received from a plurality of data sources and specific input data is extracted from it based on the results of two similarity judgments: the similarity with past input data on which inference calculation was previously performed, and the similarity among the plurality of input data at the same time. Judging both kinds of similarity makes it possible to further reduce the input data on which the inference operation is performed.
- the data filter unit 11 sets a threshold value used for detecting the similarity among a plurality of input data and a threshold value used for detecting the difference between the input data and input data that previously underwent inference calculation processing (step S3-1).
- the threshold value may be dynamically changed during operation.
- for example, if there is no difference in the obtained inference processing result with respect to the threshold value used at a certain time, the threshold value may be increased. Further, when the inference accuracy of the inference processing result is lower than the desired accuracy, reducing the threshold value causes inference processing to be performed on more input data, so the accuracy can be expected to improve. In this way, the threshold value used for similarity comparison of input data may be set dynamically according to the inference calculation result.
- the data filter unit 11 acquires input data from a plurality of data sources (step S3-2) and calculates the difference (step S3-3).
- when the difference calculated in step S3-3 is equal to or greater than the threshold value (step S3-4: Yes), the difference from the past input data that previously underwent inference calculation processing is calculated (step S3-5), and it is determined whether to output the input data to the inference calculation unit 13 (step S3-7).
- when the calculated difference is smaller than the threshold value (step S3-4: No), the difference from the past input data that previously underwent inference calculation processing is calculated for only one of the compared input data (step S3-6), and it is then determined whether to output that input data to the inference calculation unit 13 (step S3-7).
- when it is determined that the input data is to be output (step S3-7: Yes), the input data is output to the inference calculation unit 13 and the inference calculation is performed (step S3-8).
- otherwise (step S3-7: No), the input data is not input to the inference calculation unit 13 and inference is not performed (step S3-9). In this case, the inference result obtained by inferring the past input data is used as the inference result at the relevant time.
- the data filter unit 11 judges the similarity among the plurality of input data and is configured not to perform inference calculation processing on all of the similar input data; it further judges the similarity with past input data on which inference processing was previously performed and is configured not to perform inference calculation processing on data similar to that past input data. As a result, the subsequent inference calculation processing need not be performed for similar input data, so the inference calculation processing can be speeded up and the associated power consumption can be reduced.
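The flow of steps S3-2 to S3-9 for a single pair of inputs can be traced with the sketch below. It is a hedged reading of the flowchart: the difference measure, the function name `filter_flow`, the labels `"a"` and `"b"`, the assumption that each input is compared with its own past input in the S3-4: Yes branch, and the use of "difference at or above the threshold" as the S3-7 criterion are all assumptions. The thresholds passed as arguments stand in for step S3-1.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def max_abs_diff(a: Vector, b: Vector) -> float:
    """Assumed difference measure: largest element-wise absolute gap."""
    return max(abs(x - y) for x, y in zip(a, b))


def filter_flow(
    x_a: Vector,
    x_b: Vector,
    past_a: Vector,
    past_b: Vector,
    pair_threshold: float,
    past_threshold: float,
) -> Tuple[List[str], List[str]]:
    """Return (steps taken, labels of inputs sent to inference)."""
    steps = ["S3-2", "S3-3"]  # acquire the inputs, compute their difference
    if max_abs_diff(x_a, x_b) >= pair_threshold:
        steps += ["S3-4:Yes", "S3-5"]  # inputs differ: check both against the past
        candidates = [("a", x_a, past_a), ("b", x_b, past_b)]
    else:
        steps += ["S3-4:No", "S3-6"]   # inputs similar: check only one of them
        candidates = [("a", x_a, past_a)]

    to_infer: List[str] = []
    for name, x, past in candidates:
        if max_abs_diff(x, past) >= past_threshold:
            steps += ["S3-7:Yes", "S3-8"]  # changed since last inference: infer
            to_infer.append(name)
        else:
            steps += ["S3-7:No", "S3-9"]   # unchanged: reuse the past result
    return steps, to_infer
```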
- a plurality of first storage units 10 are used.
- the second storage unit 12 stores the weights of the trained neural network. The data filter unit 11 judges the similarity among a plurality of input data and the similarity with past input data on which inference processing was previously performed; based on the results of these judgments, it extracts only specific data from the input data and uses it as input data to the inference calculation unit 13. The inference calculation unit 13 takes as inputs the input data extracted by the data filter unit 11 and the weights of the trained neural network, executes the inference operation of the trained neural network, and infers the characteristics of the input data.
- the data filter unit 11 judges the similarity between input data from a plurality of different input data sources and the similarity with past input data on which inference processing was previously performed, so that the subsequent inference calculation processing need not be performed for similar input data that would yield the same inference result. Therefore, the inference processing device 1 of the present invention can speed up the inference calculation processing and reduce the associated power consumption compared with a conventional inference processing device that performs inference processing on all input data.
- with the data filter unit 11 placed in front of the first storage unit 10, the amount of memory used by the first storage unit 10 can be reduced. Further, since inference processing need not be performed on all the input data, the number of inference results output from the inference processing device 1 can be reduced, which reduces the load on the communication network.
- the inference processing device 1 can be realized, for example, by a computer including a processor 102, a main storage device 103, a communication interface 104, an auxiliary storage device 105, and an input / output I / O 106 connected via a bus 101, together with a program that controls these hardware resources.
- the display device 107 may be connected via the bus 101, and the inference result or the like may be displayed on the display screen.
- the sensor 108 may be connected via the input / output I / O 106 and the bus 101, and the inference processing device 1 may measure the input data X composed of time-series data such as voice data to be inferred.
- the main storage device 103 is realized by, for example, a semiconductor memory such as an SRAM, a DRAM, or a ROM.
- the main storage device 103 realizes the storage unit described with reference to FIG. 1 and the like.
- the main storage device 103 stores in advance a program for the processor 102 to perform various controls and operations.
- the processor 102 and the main storage device 103 realize each function of the inference processing device 1 including the first storage unit 10, the second storage unit 12, the data filter unit 11, and the inference calculation unit 13 shown in FIG. ..
- the communication interface 104 is an interface circuit for communicating with various external electronic devices via the communication network NW.
- the inference processing device 1 may receive the weight data W of the learned neural network from the outside via the communication interface 104, or may send the inference result Y to the outside.
- as the communication interface 104, for example, an interface and an antenna compatible with wireless data communication standards such as LTE, 3G, wireless LAN, and Bluetooth (registered trademark) are used.
- the communication network NW includes, for example, WAN (Wide Area Network), LAN (Local Area Network), the Internet, a dedicated line, a wireless base station, a provider, and the like.
- the auxiliary storage device 105 is composed of a readable / writable storage medium and a drive device for reading / writing various information such as programs and data to the storage medium.
- a hard disk or a semiconductor memory such as a flash memory can be used as the storage medium in the auxiliary storage device 105.
- the auxiliary storage device 105 has a program storage area for storing a program for the inference processing device 1 to perform inference. Further, the auxiliary storage device 105 may have, for example, a backup area for backing up the above-mentioned data, programs, and the like. The auxiliary storage device 105 can store, for example, an inference processing program.
- the input / output I / O 106 is composed of an I / O terminal that inputs a signal from an external device such as a display device 107 and outputs a signal to the external device.
- the inference processing device 1 may be realized not only by a single computer but also in distributed form by a plurality of computers connected to each other by the communication network NW. Further, the processor 102 may be realized by hardware such as an FPGA (Field-Programmable Gate Array), an LSI (Large Scale Integration), or an ASIC (Application Specific Integrated Circuit).
- the circuit configuration can be flexibly rewritten according to the configuration of the input data X and the neural network model used. In this case, it is possible to realize the inference processing device 1 that can correspond to various applications.
- 1 ... Inference processing device, 10 ... First storage unit, 11 ... Data filter unit, 12 ... Second storage unit, 13 ... Inference calculation unit, 14 ... Third storage unit, 101 ... Bus, 102 ... Processor, 103 ... Main storage device, 104 ... Communication interface, 105 ... Auxiliary storage device, 106 ... Input / output I / O, 107 ... Display device, 108 ... Sensor.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- Evolutionary Computation (AREA)
- Data Mining & Analysis (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Feedback Control In General (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Telephone Function (AREA)
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2022541409A JP7435793B2 (ja) | 2020-08-05 | 2020-08-05 | Inference processing device |
| US18/006,533 US20230297856A1 (en) | 2020-08-05 | 2020-08-05 | Inference Processing Apparatus |
| PCT/JP2020/030021 WO2022029927A1 (ja) | 2020-08-05 | 2020-08-05 | Inference processing device |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2020/030021 WO2022029927A1 (ja) | 2020-08-05 | 2020-08-05 | Inference processing device |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022029927A1 (ja) | 2022-02-10 |
Family
ID=80117769
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2020/030021 Ceased WO2022029927A1 (ja) | 2020-08-05 | 2020-08-05 | Inference processing device |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US20230297856A1 |
| JP (1) | JP7435793B2 |
| WO (1) | WO2022029927A1 |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP7666289B2 (ja) * | 2021-10-25 | 2025-04-22 | 富士通株式会社 | Machine learning program, machine learning method, and information processing device |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2000089958A (ja) * | 1998-09-16 | 2000-03-31 | Hitachi Ltd | Data-oriented inference device |
| WO2018179361A1 (ja) * | 2017-03-31 | 2018-10-04 | 日本電気株式会社 | Image processing device, image processing method, and recording medium |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10108850B1 (en) * | 2017-04-24 | 2018-10-23 | Intel Corporation | Recognition, reidentification and security enhancements using autonomous machines |
| US11657087B2 (en) * | 2018-03-19 | 2023-05-23 | Verily Life Sciences Llc | Surgical video retrieval based on preoperative images |
| US20230035526A1 (en) * | 2020-02-26 | 2023-02-02 | Mitsubishi Electric Corporation | Inference device, driving assistance device, inference method, and server |
| US12405660B2 (en) * | 2020-05-11 | 2025-09-02 | Nvidia Corporation | Gaze estimation using one or more neural networks |
2020
- 2020-08-05 JP JP2022541409A patent/JP7435793B2/ja active Active
- 2020-08-05 US US18/006,533 patent/US20230297856A1/en active Pending
- 2020-08-05 WO PCT/JP2020/030021 patent/WO2022029927A1/ja not_active Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| JPWO2022029927A1 | 2022-02-10 |
| JP7435793B2 (ja) | 2024-02-21 |
| US20230297856A1 (en) | 2023-09-21 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20947874; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2022541409; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 20947874; Country of ref document: EP; Kind code of ref document: A1 |