WO2022113152A1 - Processing system, processing method and processing program (処理システム、処理方法及び処理プログラム) - Google Patents
Processing system, processing method and processing program
- Publication number
- WO2022113152A1 (PCT application PCT/JP2020/043564)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- inference
- model
- server device
- divided
- processing
- Prior art date
Classifications
- G06N3/0455: Computing arrangements based on neural networks; auto-encoder networks, encoder-decoder networks
- G06N3/0464: Computing arrangements based on neural networks; convolutional networks [CNN, ConvNet]
- G06F18/285: Pattern recognition; selection of pattern recognition techniques, e.g. of classifiers in a multi-classifier system
- G06V10/26: Image preprocessing; segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
- G06V10/764: Image or video recognition using pattern recognition or machine learning, using classification, e.g. of video objects
- G06V10/774: Image or video recognition using pattern recognition or machine learning; generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
- G06V10/82: Image or video recognition using pattern recognition or machine learning, using neural networks
- G06N3/09: Neural network learning methods; supervised learning
Definitions
- The present invention relates to a processing system, a processing method, and a processing program.
- An edge device is poor in resources, such as computational capacity and memory, compared with devices other than the edge device (hereinafter referred to, for convenience, as the cloud) that are physically and logically located farther from the user than the edge device. For this reason, if processing with a large computational load is performed on an edge device, the processing may take a very long time to complete, or may delay the completion of other processing whose computational load is not large.
- Non-Patent Document 1 proposes applying so-called adaptive learning to the edge cloud. In the method described in Non-Patent Document 1, a model trained in the cloud using general-purpose training data is deployed on an edge device, and the trained model is then re-trained in the cloud using data acquired by the edge device, realizing an operation that exploits the respective advantages of the cloud and the edge device.
- The present invention has been made in view of the above, and its purpose is to provide a processing system, a processing method, and a processing program capable of reducing the amount of data transferred from the edge device to the server device and reducing the calculation load on the server device.
- The processing system is a processing system implemented using an edge device and a server device. The edge device has a first inference unit that inputs pieces of divided data, obtained by dividing processing data into a plurality of pieces, into corresponding first models among a plurality of first models and causes inference to be executed in each first model, and a determination unit that outputs to the server device only those pieces of divided data for which the inference result in each first model is determined to match a predetermined result. The server device has a second inference unit that executes inference processing on the divided data output from the edge device, using a second model having a higher computational cost than the first models.
- According to the present invention, it is possible to reduce the amount of data transferred from the edge device to the server device and to reduce the calculation load on the server device.
- FIG. 1 is a diagram illustrating an outline of a processing method of the processing system according to the embodiment.
- FIG. 2 is a diagram illustrating an example of DNN1 and DNN2.
- FIG. 3 is a diagram schematically showing an example of the configuration of the processing system according to the embodiment.
- FIG. 4 is a flowchart showing the flow of processing executed by the edge device shown in FIG. 3.
- FIG. 5 is a flowchart showing the flow of processing executed by the server device shown in FIG. 3.
- FIG. 6 is a diagram illustrating an outline of a processing method of the processing system according to the first modification of the embodiment.
- FIG. 7 is a diagram illustrating an outline of a processing method in an edge device of the processing system according to the second modification of the embodiment.
- FIG. 8 is a diagram schematically showing an example of the configuration of the processing system according to the second modification of the embodiment.
- FIG. 9 is a diagram showing an example of a computer in which an edge device and a server device are realized by executing a program.
- FIG. 1 is a diagram illustrating an outline of a processing method of the processing system according to the embodiment.
- The processing system of the embodiment constitutes a model cascade using a high-precision model and a lightweight model.
- The edge device uses a fast, low-precision lightweight model, for example DNN1 (first model).
- The server device uses a slow, high-precision model, for example DNN2 (second model).
- A server device is a device that is physically and logically located far from the user.
- The edge device is an IoT device or one of various terminal devices that are physically and logically close to the user, and has fewer resources than a server device.
- DNN1 and DNN2 are models that output inference results based on the input data to be processed.
- In the processing system of the embodiment, an input image is divided in the edge device, the processes are executed in parallel for each divided image, and only the divided images that satisfy a predetermined condition are sent to the cloud side.
- The edge device and the server device include a plurality of DNN1s and DNN2s, respectively, and execute each process, including the inference process, in parallel. This is also effective for high-frame-rate video.
- The image may be divided, and only the divided images containing a desired subject may be transmitted to the cloud side.
- The edge device divides the image G1 into, for example, nine equal parts. Then, the divided images G1-1 to G1-9 are distributed to DNN1-1 to DNN1-9, respectively ((1) in FIG. 1). Each of DNN1-1 to DNN1-9 performs, on the input divided images G1-1 to G1-9, subject recognition that infers the probability of each class of the object appearing in the image, and motion detection ((2) in FIG. 1).
- The number of DNN1-1 to DNN1-9 in the edge device is an example, and the number may be set according to the number of divisions of the image. Alternatively, the divided images may be processed sequentially using DNN1-1 to DNN1-M (where M is smaller than the number of divisions).
- Based on the inference results of DNN1-1 to DNN1-9, the edge device selects the divided images G1-1 and G1-5, which include a predetermined subject (for example, a cat or part of a cat) and in which a moving object was detected, and acquires the certainty for the divided images G1-1 and G1-5.
- The certainty is the degree of likelihood that the result of subject recognition by DNN1-1 to DNN1-9 is correct.
- The edge device determines that the divided images G1-1 and G1-5, whose certainty is equal to or higher than a predetermined threshold, are to be transmitted ((3) in FIG. 1), encodes each of the divided images G1-1 and G1-5, and transmits them to the cloud (server device) ((4) in FIG. 1).
- The area surrounding a divided image may also be designed to be sent. This is effective in improving inference accuracy when a desired subject extends beyond the divided image, in particular when the desired subject is captured so as to occupy comparable areas across a plurality of divided images, for example when it spans two divided images, or a divided image and several of its surrounding divided images.
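The neighborhood of a selected divided image can be computed from its grid index. Below is a minimal sketch assuming a row-major 3x3 grid; the function name and grid layout are illustrative choices, not taken from the patent.

```python
def neighbor_tiles(idx: int, rows: int = 3, cols: int = 3) -> set[int]:
    """Return the tile index plus its 8-neighborhood in a row-major grid.

    Tiles on the border simply have fewer neighbors.
    """
    r, c = divmod(idx, cols)
    out = set()
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                out.add(nr * cols + nc)
    return out

# Example: tile 4 (the center of a 3x3 grid) pulls in all nine tiles.
assert neighbor_tiles(4) == {0, 1, 2, 3, 4, 5, 6, 7, 8}
```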
- When the cloud side receives the divided images G1-1 and G1-5 output from the edge device, it decodes each of the divided images G1-1 and G1-5 ((5) in FIG. 1) and inputs them to DNN2-1 to DNN2-9, respectively ((6) in FIG. 1).
- Each of DNN2-1 to DNN2-9 performs, on the input divided images G1-1 and G1-5, inference processing that infers the probability of each class of the object appearing in the image ((6) in FIG. 1).
- The inference results of DNN2-1 to DNN2-9 are integrated ((7) in FIG. 1) and output as the processing result of the image G1, which is the data to be processed.
- The number of DNN2-1 to DNN2-9 on the cloud side is an example, and the number may be set according to the number of input divided images.
- In this way, the image to be processed is divided, each process including the inference process is executed in parallel for each divided image, and only the divided images satisfying a predetermined condition are sent to the cloud side. It is therefore possible to reduce the amount of data transferred from the edge device to the server device compared with transmitting the entire image to be processed.
- Furthermore, the server device performs inference processing only on the transmitted divided images. Therefore, in the processing system according to the embodiment, the calculation load on the server device can be reduced compared with performing inference processing on the entire image to be processed.
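Steps (1) to (4) in FIG. 1 can be summarized in code. The sketch below is a minimal single-threaded rendering (the patent runs DNN1-1 to DNN1-9 in parallel), assuming a generic `dnn1` callable that returns per-class probabilities and a motion flag; the 3x3 split and the 0.8 threshold are illustrative assumptions.

```python
import numpy as np

def split_image(img: np.ndarray, rows: int = 3, cols: int = 3) -> list:
    """Divide an H x W x C image into rows*cols tiles (here: nine equal parts)."""
    h, w = img.shape[0] // rows, img.shape[1] // cols
    return [img[r * h:(r + 1) * h, c * w:(c + 1) * w]
            for r in range(rows) for c in range(cols)]

def select_tiles(img, dnn1, target_class, threshold=0.8):
    """Run the lightweight model on every tile and keep only the tiles in
    which the target subject is recognized with sufficient certainty and a
    moving object is detected -- the tiles worth sending to the cloud."""
    selected = []
    for i, tile in enumerate(split_image(img)):
        probs, motion = dnn1(tile)       # per-class probabilities + motion flag
        certainty = probs[target_class]  # certainty = class probability here
        if motion and certainty >= threshold:
            selected.append((i, tile))   # keep tile index for later integration
    return selected
```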
- FIG. 2 is a diagram illustrating an example of DNN1 and DNN2.
- A DNN has an input layer into which data enters, a plurality of intermediate layers that variously transform the data input from the input layer, and an output layer that outputs the inferred results, such as probabilities and likelihoods. It may also be configured to output the certainty described above.
- The output value of the intermediate layer, which is the output value sent to the cloud, may be made irreversible when the input data needs to remain anonymous.
- The processing system may use independent DNN1a and DNN2a as DNN1-1 to DNN1-9 and DNN2-1 to DNN2-9, respectively.
- DNN1a may be trained using the training data used in the training of DNN2a.
- The number of DNN1-... and DNN2-... instances is not limited as long as it is 1 or more.
- Instead of a lightweight model and a high-precision model, DNN1a and DNN2a may be assigned different tasks, consisting of a low-computation model and a high-computation model.
- For example, the low-computation model may detect moving objects, and the high-computation model may recognize subjects.
- DNN1-1 to DNN1-9 may be trained separately for each divided region, or may be a common DNN. Further, DNN1-1 to DNN1-9 may perform motion detection in addition to subject recognition.
- FIG. 3 is a diagram schematically showing an example of the configuration of the processing system according to the embodiment.
- the processing system 100 includes a server device 20 and an edge device 30. Further, the server device 20 and the edge device 30 are connected via the network N.
- the network N is, for example, the Internet.
- the server device 20 is a server provided in a cloud environment.
- the edge device 30 is, for example, an IoT device and various terminal devices.
- The server device 20 and the edge device 30 are each realized by a computer that includes a ROM (Read Only Memory), a RAM (Random Access Memory), a CPU (Central Processing Unit), and the like, reading a predetermined program into memory and the CPU executing it.
- So-called accelerators, represented by GPUs, VPUs (Vision Processing Units), FPGAs (Field Programmable Gate Arrays), ASICs (Application Specific Integrated Circuits), and dedicated AI (Artificial Intelligence) chips, may also be used.
- The server device 20 and the edge device 30 each have a NIC (Network Interface Card) or the like, and can communicate with other devices via a telecommunication line such as a LAN (Local Area Network) or the Internet.
- The server device 20 has a decoding unit 21 having a plurality of decoders, an inference unit 22 (second inference unit) that makes inferences using a plurality of trained high-precision models DNN2-1 to DNN2-9, and an integration unit 23.
- DNN2-1 to DNN2-9 include information such as model parameters.
- the number of DNN2-1 to DNN2-9 is an example, and the number may be set according to the number of input divided images.
- the decoding unit 21 has a first decoder 21-1 and a second decoder 21-2.
- the first decoder 21-1 and the second decoder 21-2 receive the divided image transmitted from the edge device 30 and perform decoding processing.
- The number of decoders (the first decoder 21-1 and the second decoder 21-2) in the decoding unit 21 is an example; in the minimum configuration, one decoder suffices.
- The minimum configuration of the entire system is as follows: one each of DNN1, encoder, decoder, and DNN2. The number of any of these components may be variable; for example, there may be 2 DNN1s, 4 encoders, and 1 of each of the others.
- The inference unit 22 uses DNN2 to execute inference processing on the divided images output from the edge device 30.
- Specifically, the inference unit 22 uses each divided image output from the edge device 30 as an input of DNN2-1 to DNN2-9.
- The inference unit 22 acquires, as the output of DNN2-1 to DNN2-9, the inference result (for example, the probability of each class of the object shown in the image, and the presence or absence of a moving object compared with the preceding and following images).
- In other words, the inference unit 22 receives inference data as input and outputs an inference result. Each divided image is data whose label is unknown.
- The inference result obtained by the inference unit 22 may be transmitted to the edge device 30 and returned to the user from the edge device 30.
- In the minimum configuration, the number of DNN2s in the inference unit 22 is 1.
- The integration unit 23 integrates the inference results of the inference unit 22 for the divided images and outputs the integrated inference result as the processing result of the image, which is the data to be processed.
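Steps (5) to (7) in FIG. 1 on the server side amount to decode, infer, integrate. A minimal sketch, assuming a `decode` function, a `dnn2` callable, and a dictionary keyed by tile index as the integration rule (the patent does not fix a particular integration strategy):

```python
def server_process(received, decode, dnn2):
    """Decode each (tile_index, encoded_bytes) pair, infer with the
    high-precision model, and integrate per-tile results keyed by tile index."""
    results = {}
    for idx, payload in received:
        tile = decode(payload)        # one decoder per tile in the figure
        results[idx] = dnn2(tile)     # high-precision inference (DNN2)
    return results                    # integration unit: result for image G1
```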
- the server device 20 and the edge device 30 form a model cascade.
- A model cascade is used by connecting two or more independent models in layers (two or more layers). Therefore, the inference unit 22 does not always perform inference.
- The inference unit 22 receives as input the divided images that the edge device 30 has determined the server device 20 should process, and performs inference using DNN2.
- The edge device 30 has a division unit 31, an inference unit 32 (first inference unit) having trained lightweight models DNN1-1 to DNN1-N (N is a natural number), a determination unit 33, and an encoding unit 34.
- the division unit 31 divides the processing data.
- Specifically, the division unit 31 divides the image to be processed.
- The division size and the number of divisions are set according to the resources of the edge device 30 and the server device 20 and the transmission capacity of the transmission line between the edge device 30 and the server device 20.
- The inference unit 32 makes inferences using the plurality of trained lightweight models DNN1-1 to DNN1-N.
- The inference unit 32 inputs the pieces of divided data produced by the division unit 31 into the corresponding DNNs among DNN1-1 to DNN1-N and causes inference to be executed in each of DNN1-1 to DNN1-N.
- The number of DNN1s in the edge device 30 is an example, and the divided images may be processed sequentially using fewer DNN1s than the number of divisions.
- DNN1-1 to DNN1-N perform subject recognition that infers the probability of each class of the object appearing in the image. DNN1-1 to DNN1-N may perform motion detection in addition to subject recognition, or may perform only motion detection.
- The inference unit 32 may use an even lighter model for detecting moving objects.
- As such lightweight models, there are models that detect a moving object using the coded data of the image. Specifically, there are models that determine the presence or absence of motion according to the ratio of intra-coded blocks to inter-coded blocks within the divided region and the ratio of its code amount to that of other regions, and models that determine whether there is a change, that is, motion, between two images taken at substantially the same position in real space, based on the code amounts of the corresponding regions.
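As one concrete reading of the intra/inter block-ratio criterion above: in many video codecs, a block that changed between frames tends to be intra-coded rather than predicted, so a high intra ratio within a divided region suggests motion. The sketch below assumes a `block_types` array has already been extracted from the bitstream (no real codec API is used), and the 0.3 threshold is an illustrative assumption.

```python
import numpy as np

def motion_by_block_ratio(block_types: np.ndarray, threshold: float = 0.3) -> bool:
    """Decide motion from coded data alone: block_types is a 2-D array over a
    divided region with 1 for intra-coded blocks and 0 for inter-coded blocks.
    A high intra ratio suggests the region changed and was re-encoded."""
    intra_ratio = float(np.mean(block_types == 1))
    return intra_ratio >= threshold
```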
- The inference unit 32 inputs each divided image produced by the division unit 31 into the corresponding DNN among DNN1-1 to DNN1-N, and causes subject recognition to be executed in each of DNN1-1 to DNN1-N. The inference unit 32 may also cause DNN1-1 to DNN1-N to perform motion detection. The inference unit 32 outputs the inference results (for example, the subject recognition result, or the subject recognition result and the moving-object detection result) for the divided images.
- The determination unit 33 outputs to the server device 20 only those pieces of divided data, among the plurality of pieces, for which the inference result in each of DNN1-1 to DNN1-N is determined to match a predetermined result.
- The determination unit 33 has a first determination unit 33-1 to an Nth determination unit 33-N, which output to the server device 20 those divided images determined in each of DNN1-1 to DNN1-N to include at least a predetermined subject, thereby determining that the server device 20 is to perform the processing (inference processing) on the inference data.
- The number of the first determination unit 33-1 to the Nth determination unit 33-N is an example; it need only equal the number of DNN1-1 to DNN1-N in the inference unit 32 so that the divided images can be processed in parallel.
- The first determination unit 33-1 to the Nth determination unit 33-N may each select a divided image that includes the predetermined subject and whose certainty of the subject recognition result is equal to or higher than a predetermined threshold, and output it to the server device 20.
- The certainty is the degree of likelihood that the result of subject recognition by each of DNN1-1 to DNN1-N is correct.
- The certainty may be the probability for each class of the object appearing in each divided image, output by each of DNN1-1 to DNN1-N.
- The first determination unit 33-1 to the Nth determination unit 33-N may each select a divided image that includes the predetermined subject and in which a moving object was detected, and output it to the server device 20.
- Alternatively, the first determination unit 33-1 to the Nth determination unit 33-N may each select a divided image that includes the predetermined subject, in which a moving object was detected, and whose certainty of the subject recognition result is equal to or higher than a predetermined threshold, and output it to the server device 20.
- When there is no divided image including the predetermined subject, the determination unit 33 outputs the inference result inferred by the inference unit 32.
- The encoding unit 34 has a first encoder 34-1 to an Nth encoder 34-N, quantizes each divided image that the first determination unit 33-1 to the Nth determination unit 33-N determined to output to the server device 20, performs encoding processing for communication on it, and outputs the data to the server device 20.
- The number of the first encoder 34-1 to the Nth encoder 34-N is an example; it may equal the number of DNN1-1 to DNN1-N in the inference unit 32 so that the divided images can be processed in parallel, or the divided images may be processed sequentially using fewer encoders than the number of divisions.
- The encoding unit 34 may encode each of the divided images determined to be transmitted to the server device 20 individually, or may combine the divided images and encode them as one image.
- The encoding unit 34 may convert the divided images determined not to be transmitted to the server device 20 into a solid black color or the like.
- The encoding unit 34 may rearrange the divided images determined to be transmitted to the server device 20 into positions different from their arrangement in the original image and encode them as one image so as to increase the coding efficiency. Specifically, the encoding unit 34 changes the arrangement so that the divided images determined to be transmitted to the server device 20 are adjacent to each other.
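One way to realize this rearrangement is to pack the selected tiles side by side into a single compact image and keep the permutation so positions can be restored after decoding. A minimal sketch assuming equally sized tiles; horizontal packing is an illustrative choice:

```python
import numpy as np

def pack_tiles(selected):
    """Concatenate selected (tile_index, tile) pairs so the tiles are adjacent
    (better coding efficiency than encoding a mostly-black full frame), and
    return the tile-index order needed to undo the rearrangement after decoding."""
    order = [idx for idx, _ in selected]
    packed = np.concatenate([tile for _, tile in selected], axis=1)
    return packed, order
```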
- FIG. 4 is a flowchart showing the flow of processing executed by the edge device 30 shown in FIG. 3.
- As shown in FIG. 4, when the edge device 30 receives the input of the image to be processed (for example, image G) (step S1), the division unit 31 divides the image into the divided images G-1 to G-N and distributes them to DNN1-1 to DNN1-N (step S2).
- Each of the distributed divided images G-1 to G-N is input to DNN1-1 to DNN1-N (steps S3-1 to S3-N), and subject recognition, or subject recognition and motion detection, is executed (steps S4-1 to S4-N).
- The first determination unit 33-1 to the Nth determination unit 33-N determine whether each of the divided images G-1 to G-N includes the predetermined subject, or includes the predetermined subject and a moving object was detected (steps S5-1 to S5-N).
- When the determination is affirmative (steps S5-1 to S5-N: Yes), the first determination unit 33-1 to the Nth determination unit 33-N acquire the certainty of the subject recognition result for each of the divided images G-1 to G-N (steps S7-1 to S7-N). Then, the first determination unit 33-1 to the Nth determination unit 33-N determine whether the certainty is equal to or higher than a predetermined threshold (steps S8-1 to S8-N).
- When it is determined that the certainty is equal to or higher than the predetermined threshold (steps S8-1 to S8-N: Yes), the encoding unit 34 quantizes each of the divided images G-1 to G-N so determined, executes encoding processing for communication (steps S9-1 to S9-N), and transmits them to the server device 20 (steps S10-1 to S10-N).
- When the first determination unit 33-1 to the Nth determination unit 33-N determine that the predetermined subject is not included, or that no moving object was detected even though the predetermined subject is included, that is, the predetermined subject is captured but does not move (steps S5-1 to S5-N: No), or when it is determined that the certainty is not equal to or higher than the predetermined threshold (steps S8-1 to S8-N: No), it is determined that the divided image does not need to be transmitted (steps S6-1 to S6-N).
- When the determination unit 33 determines that transmission is unnecessary for all the divided images (step S11: Yes), it outputs the inference result inferred by the inference unit 32 (step S12). Otherwise (step S11: No), the determination unit 33 ends the processing for the image G to be processed.
- FIG. 5 is a flowchart showing the flow of processing executed by the server device 20 shown in FIG. 3. As shown in FIG. 5, when the server device 20 receives the input of the divided images G-i and G-j transmitted from the edge device 30 (steps S21-1 and S21-2), the first decoder 21-1 and the second decoder 21-2 perform decoding processing on the divided images G-i and G-j, respectively (steps S22-1 and S22-2).
- The inference unit 22 inputs the divided images G-i and G-j output from the edge device 30 to DNN2-1 to DNN2-9 and executes inference processing on the divided images G-i and G-j, respectively (steps S23-1 and S23-2).
- the integration unit 23 integrates each inference result for the divided images G-i and G-j (step S24), and outputs the integrated inference result as the processing result of the image which is the processing data (step S25).
- As described above, in the edge device 30, each process including the inference process is executed in parallel for each divided image obtained by dividing the image to be processed, and only the divided images satisfying a predetermined condition are transmitted to the server device 20. Therefore, in the present embodiment, the amount of data transferred from the edge device 30 to the server device 20 can be reduced compared with transmitting the entire image to the server device. Further, the server device 20 performs inference processing only on the transmitted divided images. For this reason, in the processing system according to the embodiment, the calculation load on the server device 20 can be reduced compared with performing inference processing on the entire image to be processed.
- In widely used models such as YOLO, the maximum resolution of the input image may be fixed.
- When such a model is selected as DNN1-1 to DNN1-N deployed on the edge device, the target image is input to DNN1-1 to DNN1-N for each divided image, so designing the division size to be at most this maximum resolution allows the information in the image to be used for inference without degradation.
- Needless to say, the division size should also be a size that satisfies the objective, for example, recognition of a subject or detection of an event.
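Choosing the number of divisions so that each tile stays within the model's maximum input resolution is simple arithmetic. A sketch, with the 1920x1080 frame and 640x640 limit as illustrative values:

```python
import math

def grid_for_max_resolution(frame_w: int, frame_h: int,
                            max_w: int, max_h: int) -> tuple:
    """Smallest (cols, rows) grid whose tiles do not exceed the model's
    maximum input resolution, so tiles need no information-losing resize."""
    return math.ceil(frame_w / max_w), math.ceil(frame_h / max_h)

# Example: a 1920x1080 frame with a 640x640 input limit -> 3x2 grid.
assert grid_for_max_resolution(1920, 1080, 640, 640) == (3, 2)
```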
- The edge device 30 selects, from the plurality of divided images, the divided images that include the predetermined subject and whose certainty of subject recognition is equal to or higher than a predetermined threshold, and outputs them to the server device 20.
- Alternatively, the edge device 30 selects the divided images that include the predetermined subject, in which a moving object was detected, and whose certainty of the subject recognition result is equal to or higher than the predetermined threshold, and outputs them to the server device 20. Therefore, in the embodiment, only the divided images considered to require high-precision inference processing can be transmitted from the edge device 30 to the server device 20.
- Since the server device 20 integrates the inference results of the inference unit 22 for the divided images and outputs the integrated inference result as the inference result for the single image to be processed, the user can accurately grasp the inference result for the image to be processed.
- The edge device 30 may attach a classification result to the bounding box of the subject and transmit it to the server device 20 in the subsequent stage.
- Here, the bounding box of the subject means the divided image in which the subject is captured.
- Alternatively, the edge device 30 may cut out the portion in which the subject is captured, attach the classification result to the cut-out partial image, and transmit it to the server device 20 in the subsequent stage.
- When there are a plurality of subjects, the edge device 30 may attach a classification result to the bounding box of each subject and transmit all of them to the server device 20 in the subsequent stage.
- Alternatively, the portions in which the respective subjects are captured may be cut out, divided, and transmitted separately to the server device 20 in the subsequent stage.
- The edge device 30 may attach the classification result only to the bounding boxes of subjects belonging to a specific classification and transmit them to the server device 20 in the subsequent stage.
- The edge device 30 may also cut out the portions in which the respective subjects are captured, divide them, and transmit them separately to the server device 20 in the subsequent stage.
- The edge device 30 may select the divided images in which a subject of a specific classification is captured and transmit them to the server device 20 in the subsequent stage. For example, for an image in which a person and a dog appear, the edge device 30 may transmit only the divided images in which the person appears to the server device 20 in the subsequent stage.
- FIG. 6 is a diagram illustrating an outline of a processing method of the processing system according to the first modification of the embodiment.
- The edge device 30 distributes the images Gt11 to Gt13 to DNN1t-1 to DNN1t-3, respectively.
- DNN1t-1 to DNN1t-3 execute subject recognition and motion detection ((1) in FIG. 6).
- The number of DNN1t-1 to DNN1t-3 is an example and is set according to the resources of the edge device 30 and the like.
- DNN1t-1 to DNN1t-3 may be one common DNN. Further, DNN1t-1 to DNN1t-3 may be the same DNNs as DNN1-1 to DNN1-N, or may be different DNNs. The motion detection may also be omitted for each of DNN1t-1 to DNN1t-3.
- Based on the results of DNN1t-1 to DNN1t-3, the edge device 30 selects the images Gt11 and Gt12, which include the predetermined subject and in which a moving object was detected. Subsequently, the edge device 30 determines that the images Gt11 and Gt12, whose certainty of the subject recognition result is equal to or higher than a predetermined threshold, are to be transmitted ((2) in FIG. 6).
- The edge device 30 performs encoding processing on each of the selected images Gt11 and Gt12 and transmits them to the cloud (server device 20) ((3) in FIG. 6).
- In this modification, the edge device 30 can omit the division unit 31 shown in FIG. 3.
- The edge device 30 may also select, as transmission targets, images that include the predetermined subject and whose certainty is equal to or higher than the predetermined threshold.
- When the server device 20 on the cloud side receives the images Gt11 and Gt12 output from the edge device 30, it decodes each of the images Gt11 and Gt12 ((4) in FIG. 6) and inputs them to DNN2-1 to DNN2-9, respectively.
- Each of DNN2-1 to DNN2-9 performs, on the input images Gt11 and Gt12, inference processing that infers the probability of each class of the object appearing in the image ((5) in FIG. 6).
- The server device 20 outputs the inference results of DNN2-1 to DNN2-9 after performing predetermined post-processing.
- In this modification, the server device 20 can omit the integration unit 23 shown in FIG. 3.
- In Modification 1, the edge device 30 selects, from the plurality of images, only the images that require high-precision inference processing and transmits them to the server device 20, so the same effects as the embodiment are obtained.
- The processing system may also select some images from a plurality of images captured in time series, divide the selected images, select the divided images that require high-precision inference processing, and transmit only the selected divided images to the server device.
- FIG. 7 is a diagram illustrating an outline of a processing method in an edge device of the processing system according to the second modification of the embodiment.
- FIG. 8 is a diagram schematically showing an example of the configuration of the processing system according to the second modification of the embodiment.
- An input image group (for example, images Gt11 to Gt13), which is a time-series sequence of images, is input to the edge device 30B as the data to be processed.
- The inference unit 32B distributes the images to DNN1t-1 to DNN1t-M (M is a natural number) and causes DNN1t-1 to DNN1t-M to perform subject recognition and motion detection ((1) in FIG. 7).
- Based on the inference results of DNN1t-1 to DNN1t-M, the first determination unit 33-1 to the Mth determination unit 33-M select the images that include the predetermined subject and in which a moving object was detected (for example, images Gt11 and Gt12), determine whether the certainty of each of the images Gt11 and Gt12 is equal to or higher than a predetermined threshold, and select the image to be transmitted (for example, image Gt11) ((2) in FIG. 7).
- The first determination unit 33-1 to the Mth determination unit 33-M may also select, as the transmission target, an image that includes the predetermined subject and whose certainty is equal to or higher than the predetermined threshold.
- Subsequently, the division unit 31 divides the image Gt11 into, for example, nine equal parts and distributes the divided images Gt11-1 to Gt11-9 to DNN1-1 to DNN1-N of the inference unit 32 (for example, DNN1-1 to DNN1-9), respectively ((3) in FIG. 7).
- Each of DNN1-1 to DNN1-N performs, on the input divided images (for example, the divided images Gt11-1 to Gt11-9), subject recognition that infers the probability of each class of the object appearing in the image, and motion detection ((4) in FIG. 7).
- Based on the inference results of DNN1-1 to DNN1-N, the first determination unit 33-1 to the Nth determination unit 33-N select the divided images that include the predetermined subject and in which a moving object was detected (for example, the divided images Gt11-1 and Gt11-5), and acquire the certainty of the divided images Gt11-1 and Gt11-5.
- The first determination unit 33-1 to the Nth determination unit 33-N determine that the divided images Gt11-1 and Gt11-5, whose certainty is equal to or higher than a predetermined threshold, are to be transmitted ((5) in FIG. 7).
- The encoding unit 34 quantizes each of the divided images Gt11-1 and Gt11-5, performs encoding processing, and transmits them to the cloud (server device 20) ((6) in FIG. 7). The motion detection may be omitted for each of DNN1-1 to DNN1-N.
- In that case, the first determination unit 33-1 to the Nth determination unit 33-N select, as the transmission target, a divided image that includes the predetermined subject and whose certainty is equal to or higher than the predetermined threshold.
- In this way, in the edge device 30B, the inference unit 32B inputs the plurality of images captured in time series into DNN1t-1 to DNN1t-M, respectively, and causes subject recognition to be executed, and the determination unit 33B selects, from the plurality of images, the images recognized in each of DNN1t-1 to DNN1t-M to include at least the predetermined subject.
- Then, the division unit 31 divides the image selected by the determination unit 33B, and the inference unit 32 inputs the plurality of divided images into the corresponding DNNs among DNN1-1 to DNN1-N and causes subject recognition to be executed in each of DNN1-1 to DNN1-N. Subsequently, in the edge device 30B, the determination unit 33 outputs to the server device 20 the divided images determined in each of DNN1-1 to DNN1-N to include at least the predetermined subject and to have a certainty equal to or higher than the predetermined threshold.
- The edge device 30B thus first selects, from the plurality of images, only the images that require high-precision inference processing, then further divides the selected images, selects the divided images that require high-precision inference processing, and transmits only the selected divided images to the server device 20, which may further reduce the data transfer amount and the calculation load on the server device 20.
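The two-stage selection of Modification 2 (first whole frames, then tiles within the kept frames) composes the earlier sketches. The snippet below reuses `select_tiles` from the sketch above, with the same illustrative assumptions about what the model callables return:

```python
def two_stage_select(frames, dnn1t, dnn1, target_class,
                     frame_thr=0.8, tile_thr=0.8):
    """Stage 1: keep only time-series frames in which DNN1t recognizes the
    subject with motion. Stage 2: split each kept frame and keep only the
    tiles that pass the same test, further shrinking what is transmitted."""
    sent = []
    for frame in frames:
        probs, motion = dnn1t(frame)
        if motion and probs[target_class] >= frame_thr:
            sent.extend(select_tiles(frame, dnn1, target_class, tile_thr))
    return sent
```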
- [Modification 3] A method for further reducing the amount of data transfer and the overall computational load is described below.
- In the following, the inference is described as performed by DNN1, but the inference may be performed by any of DNN1-1 to DNN1-N. Suppose that a moving object is detected as a result of inference by DNN1-1 at a certain time T. If the subject was detected at times T-n, ..., T-1 within a region that is wider than the bounding box corresponding to this moving object but is not the entire image, it may be inferred that the moving object detected at time T is the subject that was detected at times T-n, ..., T-1.
- The same inference may be applied to moving objects in the divided images transmitted to DNN2. Further, when the regions indicate substantially the same space in the real space, the same inference may be performed across the divided images transmitted from a plurality of DNN1-k (1 ≤ k ≤ N).
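Modification 3 reduces to an overlap test: a moving object detected at time T inherits the identity of a subject detected at T-n, ..., T-1 inside a region wider than the object's bounding box but smaller than the whole image. The inflation factor and the history format below are illustrative assumptions:

```python
def contains(outer, inner) -> bool:
    """True if box `inner` lies inside box `outer`; boxes are (x1, y1, x2, y2)."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])

def inflate(box, factor, img_w, img_h):
    """Grow a box around its center, clamped to the image (wider than the
    moving object's bounding box, but never the whole image)."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    hw, hh = (box[2] - box[0]) * factor / 2, (box[3] - box[1]) * factor / 2
    return (max(0, cx - hw), max(0, cy - hh),
            min(img_w, cx + hw), min(img_h, cy + hh))

def infer_identity(moving_box, history, img_w, img_h, factor=2.0):
    """If a subject was detected at T-n..T-1 inside the inflated region,
    conclude the moving object at time T is that subject -- skipping DNN2."""
    region = inflate(moving_box, factor, img_w, img_h)
    for past_box, label in history:           # history: [(box, label), ...]
        if contains(region, past_box):
            return label
    return None                               # fall back to normal processing
```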
- There may be a plurality of edge devices 30 and 30B, a plurality of server devices 20, or a plurality of both.
- The processing data may also be sensor detection results or the like, and the first inference unit 32 and the second inference unit 22 may perform, for example, object detection that detects the presence or absence of a predetermined object.
- Each component of each illustrated device is a functional concept and does not necessarily have to be physically configured as illustrated. That is, the specific form of distribution and integration of the devices is not limited to the illustrated one, and all or part of them can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like. Furthermore, each processing function performed by each device may be realized, in whole or in arbitrary part, by a CPU and a program analyzed and executed by the CPU, or as hardware by wired logic.
- FIG. 9 is a diagram showing an example of a computer in which the edge devices 30 and 30B and the server device 20 are realized by executing the program.
- the computer 1000 has, for example, a memory 1010 and a CPU 1020. Further, the accelerator described above may be provided to assist the calculation.
- the computer 1000 also has a hard disk drive interface 1030, a disk drive interface 1040, a serial port interface 1050, a video adapter 1060, and a network interface 1070. Each of these parts is connected by a bus 1080.
- the memory 1010 includes a ROM (Read Only Memory) 1011 and a RAM 1012.
- the ROM 1011 stores, for example, a boot program such as a BIOS (Basic Input Output System).
- the hard disk drive interface 1030 is connected to the hard disk drive 1090.
- the disk drive interface 1040 is connected to the disk drive 1100.
- a removable storage medium such as a magnetic disk or an optical disk is inserted into the disk drive 1100.
- the serial port interface 1050 is connected to, for example, a mouse 1110 and a keyboard 1120.
- the video adapter 1060 is connected to, for example, the display 1130.
- the hard disk drive 1090 stores, for example, an OS (Operating System) 1091, an application program 1092, a program module 1093, and program data 1094. That is, the program that defines each process of the edge devices 30 and 30B and the server device 20 is implemented as a program module 1093 in which a code that can be executed by a computer is described.
- the program module 1093 is stored in, for example, the hard disk drive 1090.
- the program module 1093 for executing the same processing as the functional configuration in the edge devices 30 and 30B and the server device 20 is stored in the hard disk drive 1090.
- the hard disk drive 1090 may be replaced by an SSD (Solid State Drive).
- the setting data used in the processing of the above-described embodiment is stored as program data 1094 in, for example, a memory 1010 or a hard disk drive 1090. Then, the CPU 1020 reads the program module 1093 and the program data 1094 stored in the memory 1010 and the hard disk drive 1090 into the RAM 1012 and executes them as needed.
- the program module 1093 and the program data 1094 are not limited to those stored in the hard disk drive 1090, but may be stored in, for example, a removable storage medium and read by the CPU 1020 via the disk drive 1100 or the like. Alternatively, the program module 1093 and the program data 1094 may be stored in another computer connected via a network (LAN (Local Area Network), WAN (Wide Area Network), etc.). Then, the program module 1093 and the program data 1094 may be read from another computer by the CPU 1020 via the network interface 1070.
- Reference Signs List: 20 Server device; 21 Decoding unit; 22, 32, 32B Inference unit; 23 Integration unit; 30 Edge device; 31 Division unit; 33, 33B Determination unit; 34 Encoding unit; 100, 100B Processing system
Abstract
Description
[Outline of the Embodiment]
An embodiment of the present invention will be described. The embodiment describes a processing system that performs inference processing using a trained high-precision model and a trained lightweight model. In the processing system of the embodiment, a DNN (Deep Neural Network) is used as an example of the model used for the inference processing; a neural network other than a DNN may be used, and low-computation signal processing and high-computation signal processing may be used in place of the trained models.
Claims (11)
1. A processing system implemented using an edge device and a server device, wherein the edge device comprises: a first inference unit that inputs pieces of divided data, obtained by dividing processing data into a plurality of pieces, into respective corresponding first models among a plurality of first models and causes inference to be executed in each first model; and a determination unit that outputs to the server device only those pieces of the divided data for which the inference result in each first model is determined to match a predetermined result; and the server device comprises: a second inference unit that executes inference processing on the divided data output from the edge device, using a second model having a higher computational cost than the first models.
2. The processing system according to claim 1, wherein the first inference unit inputs the pieces of divided data into the respective corresponding first models and causes object detection to be executed in each first model, and the determination unit outputs to the server device those pieces of the divided data determined in each first model to include at least a predetermined object.
3. The processing system according to claim 2, wherein the determination unit outputs to the server device those pieces of the divided data that include the predetermined object and whose certainty, which is the degree of likelihood that the result of the object detection by the first model is correct, is equal to or higher than a predetermined threshold.
4. The processing system according to claim 2 or 3, wherein the first inference unit performs object detection and moving-object detection on the divided data, and the determination unit outputs to the server device those pieces of the divided data that include the predetermined object and in which a moving object has been detected.
5. The processing system according to any one of claims 2 to 4, wherein the server device further comprises an integration unit that integrates the inference results of the second inference unit for the pieces of divided data and outputs the integrated inference result as the inference result for the processing data.
6. The processing system according to any one of claims 2 to 5, wherein the processing data is a single image, the first inference unit inputs a plurality of divided images obtained by dividing the single image into the respective corresponding first models and causes subject recognition to be executed in each first model, and the determination unit outputs to the server device those divided images determined in each first model to include at least a predetermined subject.
7. The processing system according to any one of claims 2 to 5, wherein the processing data is a plurality of images captured in time series, the first inference unit inputs each of the plurality of images into the respective corresponding first models and causes subject recognition to be executed in each first model, and the determination unit outputs to the server device those images determined in each first model to include at least a predetermined subject.
8. The processing system according to any one of claims 2 to 5, wherein the processing data is a plurality of images captured in time series, the first inference unit inputs each of the plurality of images into the respective corresponding first models and causes subject recognition to be executed in each first model, the determination unit selects, from the plurality of images, the images recognized in each first model to include at least a predetermined subject, the first inference unit inputs a plurality of divided images obtained by dividing an image selected by the determination unit into the respective corresponding first models and causes subject recognition to be executed in each first model, and the determination unit outputs to the server device those divided images determined in each first model to include at least the predetermined subject.
9. The processing system according to any one of claims 2 to 8, wherein the edge device comprises a plurality of encoding units that each encode a piece of divided data determined by the determination unit to be output to the server device and output it to the server device, and the server device comprises a plurality of decoding units that each decode one of the encoded pieces of divided data.
10. A processing method executed by a processing system implemented using an edge device and a server device, the method comprising: a first inference step in which the edge device inputs pieces of divided data, obtained by dividing processing data into a plurality of pieces, into respective corresponding first models among a plurality of first models and causes inference to be executed in each first model; a determination step in which the edge device outputs to the server device only those pieces of the divided data for which the inference result in each first model is determined to match a predetermined result; and a second inference step in which the server device executes inference processing on the divided data output from the edge device, using a second model having a higher computational cost than the first models.
11. A processing program for causing computers to execute a method, the program causing a computer serving as an edge device to execute: a first inference step of inputting pieces of divided data, obtained by dividing processing data into a plurality of pieces, into respective corresponding first models among a plurality of first models and causing inference to be executed in each first model; and a determination step of outputting only those pieces of the divided data for which the inference result in each first model is determined to match a predetermined result; and causing a computer serving as a server device to execute: a second inference step of executing inference processing on the divided data output from the edge device, using a second model having a higher computational cost than the first models.
Priority Applications (3)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US18/037,972 (US20230409884A1) | 2020-11-24 | 2020-11-24 | Processing system, processing method, and processing program |
| JP2022564707A (JPWO2022113152A1) | 2020-11-24 | 2020-11-24 | |
| PCT/JP2020/043564 (WO2022113152A1) | 2020-11-24 | 2020-11-24 | Processing system, processing method and processing program |

Applications Claiming Priority (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2020/043564 (WO2022113152A1) | 2020-11-24 | 2020-11-24 | Processing system, processing method and processing program |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| WO2022113152A1 | 2022-06-02 |
Family

ID=81754071

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2020/043564 (WO2022113152A1) | Processing system, processing method and processing program | 2020-11-24 | 2020-11-24 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20230409884A1 (ja) |
JP (1) | JPWO2022113152A1 (ja) |
WO (1) | WO2022113152A1 (ja) |
Citations (2)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109889592A * | 2019-02-25 | 2019-06-14 | Beijing University of Posts and Telecommunications | Intelligent manufacturing method and apparatus based on edge computing |
| JP2020177344A * | 2019-04-16 | 2020-10-29 | Fujitsu Limited | Learning method, learning program and learning device |
Also Published As

| Publication number | Publication date |
|---|---|
| JPWO2022113152A1 | 2022-06-02 |
| US20230409884A1 | 2023-12-21 |
Legal Events

| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20963417; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2022564707; Country of ref document: JP; Kind code of ref document: A |
| | WWE | Wipo information: entry into national phase | Ref document number: 18037972; Country of ref document: US |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 20963417; Country of ref document: EP; Kind code of ref document: A1 |