WO2022249316A1 - Object detection device, object detection method, and object detection program - Google Patents
Object detection device, object detection method, and object detection program
- Publication number
- WO2022249316A1 (PCT/JP2021/019953)
- Authority
- WO
- WIPO (PCT)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/94—Hardware or software architectures specially adapted for image or video understanding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- the disclosed technology relates to an object detection device, an object detection method, and an object detection program.
- Object detection is a technology that extracts metadata from an input image, consisting of the position of each object (for example, a rectangular frame surrounding it), its attribute (the type of object, such as a person or a car), and the detection accuracy for each object.
- YOLO: You Only Look Once
- SSD: Single Shot MultiBox Detector
- To achieve miniaturization and low power consumption, a configuration has been proposed in which deep-learning-based object detection processing is implemented in hardware and the bit width of each piece of data handled by the computing unit is reduced (Non-Patent Document 3).
- This data includes the inputs, outputs (feature maps), weights (kernels), and biases.
- Conventionally, each piece of data used in the sum-of-products operations is treated as 32-bit floating-point data. This is because the range of possible values is wide and differs for each image and for each layer, such as the convolution layers, that constitutes the convolutional neural network.
- Non-Patent Document 3 reports that determining the data bit width of each layer of the convolutional neural network in advance using statistical information, and reducing it to 8 to 16 bits, reduces circuit size and power consumption.
- In another approach, each data width is uniformly set to an n-bit fixed-point number (n < 32), and the decimal point position is dynamically controlled for each input image and each layer.
- The upper/lower limit counters are per-layer counters that record, during the object detection calculation, the number of times a result exceeds the upper limit of the value range (all bits 1; upper limit saturation) and the number of times it falls to the lower limit (only the least significant bit 1; lower limit saturation). Upper/lower limit counter thresholds (UPth/UNth), common to all input images, are set for each layer. A layer whose counter exceeds its threshold is treated as a layer whose decimal point position is to be changed, and the decimal point position is adjusted so that the counter values fall within the thresholds.
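The conventional counter-and-threshold scheme described above can be sketched as follows. The names `UP_TH`/`UN_TH` follow the notation in the text, but the function name and the example threshold values are illustrative assumptions, not taken from the patent:

```python
UP_TH = 4  # upper-limit counter threshold, common to all input images
UN_TH = 4  # lower-limit counter threshold, common to all input images

def layers_to_adjust(upper_counts, lower_counts):
    """Return the indices of layers whose per-layer saturation counters
    exceed the image-independent thresholds; in the conventional method,
    those layers have their decimal point position changed."""
    over_upper = [i for i, c in enumerate(upper_counts) if c > UP_TH]
    over_lower = [i for i, c in enumerate(lower_counts) if c > UN_TH]
    return over_upper, over_lower
```

Because the thresholds are shared by every input image, an image with an unusual value distribution can push many layers past them at once, which is the weakness the disclosed technology addresses.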
- Non-Patent Document 1: Joseph Redmon et al., "YOLOv3: An Incremental Improvement", https://arxiv.org/abs/1804.02767
- Non-Patent Document 2: Wei Liu et al., "SSD: Single Shot MultiBox Detector", https://arxiv.org/pdf/1512.02325
- Non-Patent Document 4: Ayaki Hatta, Hiroyuki Uzawa, Shuhei Yoshida, Takatsune Nitta, "Proposal of dynamic decimal point position control method for object detection AI inference hardware", The Institute of Electronics, Information and Communication Engineers, September 2020.
- The quality of an object detection result is determined by the balance between the number of saturated layers, that is, the number of layers in which upper limit saturation occurs, and the average saturation count. Even if the average saturation count is reduced, the detection result degrades when changing the decimal point position increases the number of saturated layers. In the conventional method, since the thresholds are the same for all input images, the number of saturated layers may increase depending on the image, and the resulting degradation of the detection result cannot be suppressed.
- The disclosed technology has been made in view of the above points, and aims to provide an object detection device, an object detection method, and an object detection program that can suppress degradation in the accuracy of object detection results for an input image.
- To achieve the above object, an object detection device according to the disclosed technology comprises: an object detection calculation unit that performs, on fixed-length data for which a decimal point position is set, calculation processing corresponding to each of a plurality of layers forming a multilayer neural network, according to the processing algorithm of the multilayer neural network to which an input image is input;
- a saturation count counter that counts, in the calculation processing, an upper limit saturation count, which is the number of times the upper limit value of the value range determined by the decimal point position is exceeded, and a lower limit saturation count, which is the number of times the result falls below the lower limit value of the value range;
- a saturated layer number counter that counts an upper limit saturated layer number, which is the number of layers whose upper limit saturation count is one or more, and a lower limit saturated layer number, which is the number of layers whose lower limit saturation count is one or more;
- a threshold determination unit that determines, based on the amounts of change in the upper limit saturated layer number and the lower limit saturated layer number counted by the saturated layer number counter, whether at least one of an upper saturation threshold, which is a threshold for the upper limit saturation count, and a lower saturation threshold, which is a threshold for the lower limit saturation count, is optimal, and that changes at least one of the upper saturation threshold and the lower saturation threshold when it determines that it is not optimal;
- and a decimal point position control unit that sets the decimal point position for each of the plurality of layers based on the determination result of the threshold determination unit.
- An object detection method according to the disclosed technology causes a computer to execute a process of: performing, according to the processing algorithm of a multilayer neural network to which an input image is input, calculation processing corresponding to each of the plurality of layers constituting the multilayer neural network on fixed-length data for which a decimal point position is set; counting, in the calculation processing, an upper limit saturation count, which is the number of times the upper limit value of the value range determined by the decimal point position is exceeded, and a lower limit saturation count, which is the number of times the result falls below the lower limit value of the value range;
- counting an upper limit saturated layer number, which is the number of layers whose upper limit saturation count is one or more, and a lower limit saturated layer number, which is the number of layers whose lower limit saturation count is one or more; determining, based on the counted amounts of change in the upper limit saturated layer number and the lower limit saturated layer number, whether at least one of an upper saturation threshold, which is a threshold for the upper limit saturation count, and a lower saturation threshold, which is a threshold for the lower limit saturation count, is optimal; changing at least one of the upper saturation threshold and the lower saturation threshold when it is determined not to be optimal; and setting the decimal point position for each of the plurality of layers based on the determination result.
- An object detection program according to the disclosed technology causes a computer to execute a process of: performing, according to the processing algorithm of a multilayer neural network to which an input image is input, calculation processing corresponding to each of the plurality of layers constituting the multilayer neural network on fixed-length data for which a decimal point position is set; counting, in the calculation processing, an upper limit saturation count, which is the number of times the upper limit value of the value range determined by the decimal point position is exceeded, and a lower limit saturation count, which is the number of times the result falls below the lower limit value of the value range;
- counting an upper limit saturated layer number, which is the number of layers whose upper limit saturation count is one or more, and a lower limit saturated layer number, which is the number of layers whose lower limit saturation count is one or more; determining, based on the counted amounts of change in the upper limit saturated layer number and the lower limit saturated layer number, whether at least one of an upper saturation threshold, which is a threshold for the upper limit saturation count, and a lower saturation threshold, which is a threshold for the lower limit saturation count, is optimal; changing at least one of the upper saturation threshold and the lower saturation threshold when it is determined not to be optimal; and setting the decimal point position for each of the plurality of layers based on the determination result.
- FIG. 1 is a functional block diagram of an object detection device according to a first embodiment.
- FIG. 2 is a diagram showing the hardware configuration of the object detection device.
- FIG. 4 is a flowchart of object detection processing according to the first embodiment.
- A further figure explains the upper limit saturated layer number before and after a change of the decimal point position in the conventional method.
- FIG. 10 is a diagram for explaining the upper limit saturated layer number before and after a change of the decimal point position in the technology disclosed herein.
- A further figure is a functional block diagram of an object detection device according to a second embodiment.
- FIG. 9 is a flowchart of object detection processing according to the second embodiment.
- the object detection device is a device that detects metadata including the position, attributes, and detection accuracy of each object included in an input image.
- the position of the object is represented by, for example, at least one of the coordinates of the center of the object in the input image and a rectangular frame (bounding box) surrounding the object.
- The attribute of an object is its type, such as a person or a car, and is sometimes called a category.
- the object detection accuracy is, for example, the probability that the detected object has a specific attribute.
- the object detection device 10 includes an object detection calculation unit 12, a decimal point position control unit 14, a saturation number counter 16, a threshold determination unit 18, and a saturation layer number counter 20.
- The object detection calculation unit 12 performs calculation processing based on deep learning inference on the input image.
- the object detection calculation unit 12 is an arithmetic processing circuit configured to perform arithmetic processing corresponding to each of the layers that make up the multilayer neural network according to the processing algorithm of the multilayer neural network.
- a convolutional neural network (CNN: Convolutional Neural Network) is typically used for processing by a multilayer neural network in the object detection calculation unit 12 .
- The CNN includes a feature extraction part, which creates a feature map by alternately arranging convolution layers that perform convolution processing (convolving the input image with a predetermined filter) and pooling layers that perform pooling processing (downsizing the result of the convolution processing), and an identification part, which consists of a plurality of fully connected layers and identifies an object included in the input image from the feature map.
- an operation is performed to convolve the filter on the image.
- Specifically, a sum-of-products operation multiplies the value of each pixel of the feature map by a weight and sums the results; a bias is then added to the result of the sum-of-products operation, which is input to the activation function to obtain the output.
- ReLU: Rectified Linear Unit
- the values of the weights and activation function parameters can be determined by learning.
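The per-output computation described above (pixel values multiplied by weights and summed, a bias added, an activation function applied) can be sketched minimally as follows. Floating-point values and ReLU are assumed here for clarity; the function name is illustrative:

```python
def conv_output(pixels, weights, bias):
    """One output value of a convolution layer: the sum-of-products of
    the receptive-field pixels and the filter weights, plus a bias,
    passed through the ReLU activation max(0, x)."""
    acc = sum(p * w for p, w in zip(pixels, weights))
    return max(0.0, acc + bias)
```

In the hardware configuration discussed in this document, the same multiply-accumulate structure operates on fixed-point data, which is what makes the value range and decimal point position matter.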
- The object detection calculation unit 12 performs the calculation processing corresponding to each of the layers constituting the multilayer neural network, according to the processing algorithm of the multilayer neural network to which the input image is input, on fixed-length data for which a decimal point position is set. Specifically, the object detection calculation unit 12 uses a deep-learning-based object detection algorithm, such as YOLO (Non-Patent Document 1) or SSD (Non-Patent Document 2), to perform the convolution and fully connected processing of inference, and outputs metadata such as the position, attributes, and detection accuracy of the objects included in the input image as the detection result. Since the object detection calculation unit 12 executes many sum-of-products calculations, it is often realized using a multi-core CPU or a GPU (graphics processing unit). Note that the object detection calculation unit 12 may also be realized by, for example, an FPGA (Field Programmable Gate Array).
- FPGA: Field Programmable Gate Array
- The data handled in each layer of the multilayer neural network by the object detection calculation unit 12, such as the inputs, outputs (feature maps), biases, and weights, are fixed-length data having a bit width smaller than 32 bits.
- the data structure is such that each layer can have a different decimal point position.
- As an example, the object detection calculation unit 12 performs the calculation processing corresponding to each of the plurality of layers constituting the multilayer neural network on fixed-length data having a bit width of 8 bits.
- the decimal point position of the data handled by the object detection calculation unit 12 is set for each layer by the decimal point position control unit 14, which will be described later.
- The object detection calculation unit 12 adds a bias to the 16-bit calculation result and applies the activation function to obtain a 16-bit intermediate input map. Since the feature map becomes the input of the next layer, the 16-bit intermediate input map is reduced to an 8-bit width and used as the feature map input to the next layer. Note that the number of layers, the activation function, and the bias addition method are selected as appropriate for the object detection algorithm used and do not limit the disclosed technology.
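The reduction of the 16-bit intermediate input map to an 8-bit feature map can be sketched as a right shift by an amount derived from the layer's decimal point position, followed by saturation. The shift parameter and the signed 8-bit range used here are illustrative assumptions, not details from the patent:

```python
def to_8bit(value16, shift):
    """Reduce a 16-bit intermediate value to 8-bit width: drop low-order
    fraction bits according to the next layer's decimal point position,
    then saturate to the representable signed 8-bit range."""
    v = value16 >> shift  # Python's >> floors toward negative infinity
    return max(-128, min(127, v))
```

The choice of `shift` is exactly what the decimal point position control determines per layer: too small a shift and values clip at the upper limit, too large and small values collapse toward the lower limit.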
- the decimal point position control unit 14 sets the position of the decimal point (hereinafter simply referred to as the "decimal point position") of the fixed-length data to be calculated by the object detection calculation unit 12.
- The decimal point position control unit 14 sets the decimal point position of the fixed-length data to be processed by the object detection calculation unit 12, based on the output of the object detection calculation unit 12, that is, the object detection result of the multilayer neural network. The decimal point position set by the decimal point position control unit 14 is notified to the object detection calculation unit 12. Based on this notification, the object detection calculation unit 12 changes the decimal point position of the fixed-length data corresponding to each of the layers constituting the multilayer neural network.
- The decimal point position control unit 14 uses the detection results output from the object detection calculation unit 12 to determine the decimal point position for each layer. For example, when detecting objects in input images taken from a video, the objects in the continuously input images change little by little, and it is rare for them to change completely in a short period of time. Therefore, the decimal point position control unit 14 does not set the decimal point position of each layer from the detection result of a single input image alone; instead, it repeatedly calculates the decimal point position of each layer using the object detection results for a plurality of input images, optimizing the decimal point position of each layer little by little.
- The saturation count counter 16 counts, for each layer, the upper limit saturation count, which is the number of times the calculation processing of the object detection calculation unit 12 exceeds the upper limit value (all bits 1) of the value range determined by the decimal point position set by the decimal point position control unit 14.
- Similarly, the saturation count counter 16 counts, for each layer, the lower limit saturation count, which is the number of times the calculation processing of the object detection calculation unit 12 falls to the lower limit value (only the least significant bit 1) of the value range determined by the decimal point position set by the decimal point position control unit 14.
- Hereinafter, when the upper limit saturation count and the lower limit saturation count are not distinguished, they may simply be referred to as the saturation count.
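The per-layer counting performed by the saturation count counter can be sketched as follows; the function name and the example limit values are illustrative assumptions:

```python
def count_saturation(results, upper_limit, lower_limit):
    """For one layer, count how many calculation results reach the upper
    limit of the value range (upper limit saturation) and how many reach
    the lower limit (lower limit saturation)."""
    upper = sum(1 for r in results if r >= upper_limit)
    lower = sum(1 for r in results if r <= lower_limit)
    return upper, lower
```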
- The threshold determination unit 18 optimizes the upper saturation threshold and the lower saturation threshold applied to the saturation count counter 16. Although the details will be described later, the decimal point position control unit 14 sets the decimal point position for each of the layers of the multilayer neural network based on the determination result of the threshold determination unit 18. Note that hereinafter, when the upper saturation threshold and the lower saturation threshold are not distinguished, they may simply be referred to as thresholds.
- the saturated layer number counter 20 counts the upper limit saturated layer number, which is the number of layers for which the upper limit saturation number counted by the saturation number counter 16 is one or more. Further, the saturated layer number counter 20 counts the lower saturated layer number, which is the number of layers for which the lower limit saturation number counted by the saturation number counter 16 is one or more.
- the upper limit number of saturated layers and the lower limit number of saturated layers may simply be referred to as the number of saturated layers when they are not distinguished from each other.
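Deriving the saturated layer numbers from the per-layer saturation counts is a simple reduction, sketched here (function name illustrative):

```python
def saturated_layer_numbers(upper_counts, lower_counts):
    """From per-layer saturation counts, derive the upper limit saturated
    layer number and the lower limit saturated layer number: the numbers
    of layers whose respective saturation count is one or more."""
    n_upper = sum(1 for c in upper_counts if c >= 1)
    n_lower = sum(1 for c in lower_counts if c >= 1)
    return n_upper, n_lower
```

It is the change in these two numbers across consecutive detections, not the raw counts, that drives the threshold determination described below.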
- FIG. 2 is a block diagram showing the hardware configuration of the object detection device 10.
- the object detection device 10 has a computer 30 .
- the computer 30 includes a CPU (Central Processing Unit) 30A, a ROM (Read Only Memory) 30B, a RAM (Random Access Memory) 30C, and an input/output interface (I/O) 30D.
- the CPU 30A, ROM 30B, RAM 30C, and I/O 30D are connected via a system bus 30E.
- System bus 30E includes a control bus, an address bus, and a data bus.
- a communication unit 32 and a storage unit 34 are also connected to the I/O 30D.
- the communication unit 32 is an interface for data communication with an external device.
- the storage unit 34 is composed of a non-volatile storage device such as a hard disk, and stores an object detection program 34A, etc., which will be described later.
- the CPU 30A loads the object detection program 34A stored in the storage unit 34 into the RAM 30C and executes it.
- the storage unit 34 may be, for example, a portable storage device that can be attached to and detached from the computer 30 .
- The threshold determination unit 18 initializes the decimal point position of each layer, initializes the upper saturation threshold, which is the threshold for the upper limit saturation count, to a predetermined initial upper saturation threshold, and initializes the lower saturation threshold, which is the threshold for the lower limit saturation count, to a predetermined initial lower saturation threshold; object detection is then performed twice in order to obtain the amounts of change in the upper limit saturated layer number and the lower limit saturated layer number before and after a change in the decimal point position. Specifically, in the first object detection, the object detection calculation unit 12 calculates the above-described metadata of the input image and outputs it to the saturation count counter 16 and the decimal point position control unit 14. The saturation count counter 16 counts the upper limit saturation count and the lower limit saturation count for each layer and outputs them to the saturated layer number counter 20. Note that the upper saturation threshold and the lower saturation threshold are integers of 0 or more, and they may be the same value or different values.
- the saturated layer number counter 20 calculates the upper limit saturated layer number, which is the number of layers whose upper limit saturation count is 1 or more, and the lower limit saturated layer number, which is the number of layers whose lower limit saturation count is 1 or more.
- Next, based on the detection result of the first object detection, the decimal point position control unit 14 sets the decimal point position for each layer so that the number of layers whose upper limit saturation count exceeds the upper saturation threshold decreases, and so that the number of layers whose lower limit saturation count exceeds the lower saturation threshold decreases.
- That is, for a layer whose upper limit saturation count exceeds the upper saturation threshold, the decimal point position is set so that the upper limit of that layer's value range becomes larger, and for a layer whose lower limit saturation count exceeds the lower saturation threshold, the decimal point position is set so that the lower limit of that layer's value range becomes smaller. In other words, the decimal point position is set so as to widen the value range.
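The per-layer widening can be sketched as moving the decimal point by one position. The convention assumed here, that the decimal point position equals the number of fraction bits, is illustrative and not stated in the patent:

```python
def adjust_decimal_point(point, upper_count, lower_count, up_th, un_th):
    """Widen the value range of a layer whose saturation count exceeds
    its threshold. 'point' is taken as the number of fraction bits
    (an illustrative convention)."""
    if upper_count > up_th:
        return point - 1  # fewer fraction bits -> larger upper limit
    if lower_count > un_th:
        return point + 1  # more fraction bits -> smaller lower limit
    return point
```

Note the trade-off this encodes: widening the range at one end sacrifices precision at the other, which is why the document evaluates the result via the change in the saturated layer numbers rather than applying the shift unconditionally.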
- the saturated layer number counter 20 calculates the amount of change in the upper limit number of saturated layers and the amount of change in the lower limit number of saturated layers based on the first object detection result and the second object detection result.
- the amount of change in the upper limit number of saturated layers is the amount of increase in the upper limit number of saturated layers in the second object detection with respect to the upper limit number of saturated layers in the first object detection.
- the amount of change in the lower limit saturated layer number is the amount of increase in the lower saturated layer number in the second object detection with respect to the lower saturated layer number in the first object detection.
- In step S104, the threshold determination unit 18 determines whether the amount of change in the upper limit saturated layer number obtained in step S102 is within the allowable range, that is, whether the upper saturation threshold is the optimum upper saturation threshold. Specifically, when the amount of change in the upper limit saturated layer number is equal to or less than a predetermined upper limit change amount threshold, the upper saturation threshold is determined to be optimal.
- the upper limit change amount threshold is an integer of 1 or more.
- When the amount of change in the upper limit saturated layer number is larger than the upper limit change amount threshold, the upper saturation threshold is determined not to be optimal.
- Similarly, the threshold determination unit 18 determines whether the amount of change in the lower limit saturated layer number obtained in step S102 is within the allowable range, that is, whether the lower saturation threshold is the optimum lower saturation threshold. Specifically, when the amount of change in the lower limit saturated layer number is equal to or less than a predetermined lower limit change amount threshold, the lower saturation threshold is determined to be optimal.
- the lower limit change amount threshold is an integer of 1 or more.
- the upper limit change amount threshold and the lower limit change amount threshold are appropriately set depending on the network model or application to which the object detection device 10 is applied.
- In this way, the threshold determination unit 18 uses the amounts of change in the upper limit saturated layer number and the lower limit saturated layer number across two object detections by the object detection calculation unit 12 to determine whether the upper saturation threshold and the lower saturation threshold are optimal.
- If at least one of the upper saturation threshold and the lower saturation threshold is determined not to be optimal, the process proceeds to step S105; if both the upper saturation threshold and the lower saturation threshold are determined to be optimal, the process proceeds to step S114.
- In step S105, the threshold determination unit 18 changes at least one of the upper saturation threshold and the lower saturation threshold determined not to be optimal by increasing it by a predetermined increase value, and outputs the result to the saturation count counter 16.
- The increase value is an integer of 1 or more. In this way, when at least one of the upper saturation threshold and the lower saturation threshold is determined not to be optimal, the threshold determination unit 18 increases whichever of the upper saturation threshold and the lower saturation threshold was determined not to be optimal.
- Note that the increase value applied to a threshold determined not to be optimal may be set individually for each of the plurality of layers, instead of using the same value for all of the layers.
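Steps S104 and S105 together can be sketched as one update function; the parameter names and default values are illustrative assumptions:

```python
def update_thresholds(delta_upper_layers, delta_lower_layers,
                      up_sat_th, low_sat_th,
                      up_change_th=1, low_change_th=1, step=1):
    """A saturation threshold is treated as optimal when the change in
    the corresponding saturated layer number stays within the change
    amount threshold; otherwise it is increased by 'step' (steps
    S104/S105 as described in the text)."""
    up_optimal = delta_upper_layers <= up_change_th
    low_optimal = delta_lower_layers <= low_change_th
    if not up_optimal:
        up_sat_th += step
    if not low_optimal:
        low_sat_th += step
    return up_sat_th, low_sat_th, up_optimal and low_optimal
```

The returned flag corresponds to the branch after step S104: `False` loops back through S105/S108/S110, `True` moves on to step S114.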
- Steps S108 and S110 are the same processes as steps S100 and S102, but differ in that at least one of the upper saturation threshold and the lower saturation threshold changed in step S105 is used.
- In step S112, the saturated layer number counter 20 determines the degree of change in the input image. Depending on the input image, at least one of the upper limit saturated layer number and the lower limit saturated layer number may differ even when the decimal point position is the same; this property is used to determine the degree of change in the input image.
- Specifically, it is determined whether the amount of change in the upper limit saturated layer number obtained in step S110 is greater than a predetermined upper limit change amount threshold, and whether the amount of change in the lower limit saturated layer number obtained in step S110 is greater than a predetermined lower limit change amount threshold.
- the upper limit change amount threshold and the lower limit change amount threshold are integers of 1 or more.
- When the amount of change in the upper limit saturated layer number obtained in step S102 is equal to or less than the upper limit change amount threshold and the amount of change in the lower limit saturated layer number obtained in step S102 is equal to or less than the lower limit change amount threshold, it is determined that the degree of change in the input image is not large, that is, that the degree of change is zero or small, and the process proceeds to step S104.
- On the other hand, the saturated layer number counter 20 determines that the degree of change in the input image is large when the amount of change in the upper limit saturated layer number between the previous and current object detections is greater than the upper limit change amount threshold, or when the amount of change in the lower limit saturated layer number between the previous and current object detections is greater than the lower limit change amount threshold.
- Until the upper saturation threshold and the lower saturation threshold become optimal, the process of changing at least one of them and performing object detection is repeated.
- n is an integer of 2 or more.
- Object detection is performed while changing the decimal point position of each layer so that the average upper-limit saturation count and the average lower-limit saturation count of each layer decrease.
- For a layer whose calculation result exceeds the upper limit, the decimal point position is moved in the direction that brings the calculation result to or below the upper limit; for a layer whose calculation result falls below the lower limit, the decimal point position is moved in the direction that brings the calculation result to or above the lower limit.
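A per-layer sketch of this rule, under the assumption that a larger decimal point value means more fractional bits; the function and argument names are illustrative, not taken from the patent.

```python
def adjust_points(points, upper_sat, lower_sat):
    """Move each layer's decimal point position: fewer fractional bits when
    the layer overflowed the upper limit (widens the range), more fractional
    bits when it underflowed the lower limit (adds precision)."""
    new_points = list(points)
    for i, (up, low) in enumerate(zip(upper_sat, lower_sat)):
        if up > 0:
            new_points[i] -= 1   # calculation result exceeded the upper limit
        elif low > 0:
            new_points[i] += 1   # calculation result fell below the lower limit
    return new_points
```

A layer that saturated neither way keeps its current position, matching the idea that only saturating layers have their decimal point moved.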
- the decimal point positions obtained by performing object detection n times are stored.
- In step S116, the decimal point position control unit 14 determines, as the optimum decimal point position, the decimal point position that gave the best detection result, that is, the position at which object detection was performed with the smallest upper-limit saturation count and lower-limit saturation count. In this way, when the upper saturation threshold and the lower saturation threshold are determined to be optimal, the decimal point position control unit 14 sets the decimal point position based on the upper-limit saturation counts and the lower-limit saturation counts observed over multiple object detections by the object detection calculation unit 12.
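Selecting the optimum decimal point position from the n stored detections can be sketched as below; the tuple layout of the history records is an assumption for the example.

```python
def best_points(history):
    """history: (upper_sat_total, lower_sat_total, points) tuples recorded
    over n object detections. Return the decimal point positions of the run
    with the smallest combined saturation count."""
    return min(history, key=lambda h: h[0] + h[1])[2]
```

Summing the two counts is one simple way to rank runs; the patent text only requires that the run with the smallest saturation counts be chosen.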
- In step S118, the object detection calculation unit 12 executes object detection using the decimal point position determined in step S116.
- In step S120, it is determined whether the degree of change between the previously input image and the currently input image is large. This determination is performed in the same manner as the determination in step S112.
- If the degree of change in the input image is large, the process returns to step S100. Otherwise, the process proceeds to step S118 and object detection is repeated. That is, object detection is repeated at the decimal point position determined in step S116 until the degree of change in the input image becomes large.
- the optimum decimal point position is determined in the processes of steps S114 and S116.
- When the decimal point position is changed for the three layers (layer numbers 1, 3, and 5) whose upper-limit saturation counts exceed the upper saturation threshold, those counts fall below the threshold, as shown in the lower graph of FIG. , but the upper-limit saturated layer count increases to 6 layers. For this reason, deterioration in the accuracy of the object detection result for the input image cannot be suppressed.
- When the upper saturation threshold is determined not to be optimal, the upper saturation threshold is increased. As a result, as shown in the upper graph of FIG. 5, the upper-limit saturation count exceeds the upper saturation threshold only for the single layer with layer number 3, and the decimal point position is changed only for this layer. Consequently, as shown in the lower graph of FIG. 5, the upper-limit saturated layer count becomes 4, and deterioration in detection accuracy can be suppressed compared with the case of FIG. .
- When the increased threshold never becomes the optimum threshold, that is, for an input image for which the optimum threshold cannot be determined, the optimum decimal point position cannot be determined unless the type of input image changes. If the threshold keeps being increased in this way, the optimum decimal point position remains undetermined indefinitely.
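The unbounded-search problem, and one way to cap it, can be illustrated with the following sketch. All names and the use of a zero change amount to mean "optimal" are assumptions for the example; the bounded fallback anticipates the mechanism described for this embodiment below.

```python
def search_threshold(threshold, layer_delta, n_max=5, step=1):
    """Increase the threshold while it is judged non-optimal (layer_delta
    returns the saturated-layer-count change seen with a given threshold;
    zero means optimal), but give up after n_max changes and fall back to
    the threshold that produced the smallest change."""
    history = []
    for _ in range(n_max):
        delta = layer_delta(threshold)
        history.append((delta, threshold))
        if delta == 0:            # judged optimal: stop increasing
            return threshold
        threshold += step         # judged non-optimal: increase and retry
    return min(history)[1]        # n_max reached: smallest-change fallback
```

Without the `n_max` cap, an input image for which `layer_delta` never reaches zero would keep the loop running, which is exactly the situation the passage above describes.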
- FIG. 6 shows the configuration of the object detection device 11 according to this embodiment.
- the object detection device 11 has a configuration in which a change parameter storage unit 22 is added to the object detection device 10 described in the first embodiment.
- The change parameter storage unit 22 stores change parameters.
- A change parameter is at least one of: the number of times at least one of the upper saturation threshold and the lower saturation threshold has been changed, and the amounts of change in the upper-limit saturated layer count and the lower-limit saturated layer count at the time a threshold was determined not to be optimal.
- FIG. 7 shows a flowchart of object detection processing according to this embodiment.
- the object detection process in FIG. 7 is a process in which steps S106, S107, and S113 are added to the object detection process in FIG.
- In step S106, the number of changes made in step S105 to at least one of the upper saturation threshold and the lower saturation threshold, and at least one of the amounts of change in the upper-limit and lower-limit saturated layer counts determined in step S104, are stored in the change parameter storage unit 22.
- In step S107, it is determined whether the number of changes to at least one of the upper saturation threshold and the lower saturation threshold stored in step S106 has reached the maximum number of changes Nmax.
- Nmax is an integer of 1 or more.
- In step S113, of the amounts of change in the upper-limit saturated layer count and the lower-limit saturated layer count stored in the change parameter storage unit 22 each time the threshold was changed, at least one of the upper saturation threshold and the lower saturation threshold corresponding to the smallest amount of change is set.
- When the threshold is still not optimal after the number of changes reaches the maximum number Nmax, at least one of the upper saturation threshold and the lower saturation threshold corresponding to the smallest of the amounts of change in the upper-limit and lower-limit saturated layer counts stored at each threshold increase is set. This makes it possible to determine the optimum decimal point position even for an input image for which the optimum threshold cannot be determined simply by increasing the threshold.
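The fallback selection from the stored change parameters can be sketched as follows; the record layout is an assumption, and summing the two change amounts is just one way to define "smallest".

```python
def fallback_thresholds(stored):
    """stored: (upper_layer_delta, lower_layer_delta, upper_thresh,
    lower_thresh) tuples saved each time a threshold was changed. Return
    the threshold pair whose saturated-layer-count change was smallest."""
    best = min(stored, key=lambda s: s[0] + s[1])
    return best[2], best[3]
```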
- An object detection device configured to set the decimal point position for each of the plurality of layers based on a determination result as to whether at least one of the upper saturation threshold and the lower saturation threshold is optimal.
- the object detection process includes the computer performing arithmetic processing corresponding to each of a plurality of layers constituting the multi-layer neural network on fixed-length data in which the decimal point position is set according to a processing algorithm of the multi-layer neural network to which the input image is input;
- counting an upper-limit saturation count, which is the number of times an upper limit of a value range determined by the decimal point position is exceeded, and a lower-limit saturation count, which is the number of times a lower limit of the value range is fallen below;
- counting the number of layers whose upper-limit saturation count is 1 or more and the number of layers whose lower-limit saturation count is 1 or more, and, based on the amounts of change in the upper-limit saturated layer count and the lower-limit saturated layer count, changing at least one of an upper saturation threshold that is a threshold for the upper-limit saturation count and a lower saturation threshold that is a threshold for the lower-limit saturation count when at least one of them is not optimal; and
- a non-transitory storage medium that sets the decimal point position for each of the plurality of layers based on a determination result as to whether at least one of the upper saturation threshold and the lower saturation threshold is optimal.
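The saturation counting that underlies all of the above can be sketched for a single layer as follows; the 16-bit word length is an assumption for illustration (the patent does not fix one), and `point` is the number of fractional bits implied by the decimal point position.

```python
def count_saturations(results, point, word_bits=16):
    """Count how many of a layer's calculation results leave the value range
    of a fixed-point format with `point` fractional bits."""
    scale = 1 << point
    upper = ((1 << (word_bits - 1)) - 1) / scale  # largest representable value
    lower = -(1 << (word_bits - 1)) / scale       # smallest representable value
    upper_sat = sum(1 for v in results if v > upper)
    lower_sat = sum(1 for v in results if v < lower)
    return upper_sat, lower_sat
```

With 12 fractional bits in a 16-bit word, the representable range is roughly [-8.0, 8.0), so a result of 9.0 counts toward the upper-limit saturation count and -8.5 toward the lower-limit saturation count.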
Description
Joseph Redmon et al., "YOLOv3: An Incremental Improvement", https://arxiv.org/abs/1804.02767
<Non-Patent Literature 2>
Wei Liu et al., "SSD: Single Shot MultiBox Detector", https://arxiv.org/pdf/1512.02325.pdf
<Non-Patent Literature 3>
Zhisheng Li et al., "Laius: An 8-Bit Fixed-Point CNN Hardware Inference Engine", 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), Guangzhou, 2017, pp. 143-150, doi: 10.1109/ISPA/IUCC.2017.00030.
<Non-Patent Literature 4>
Saki Hatta, Hiroyuki Uzawa, Shuhei Yoshida, Koyo Nitta, "Dynamic Fixed-point Control Method for Object-detection AI Inference Hardware", The Institute of Electronics, Information and Communication Engineers (IEICE), September 2020.
A memory; and
at least one processor connected to the memory,
wherein the processor is configured to:
perform, according to a processing algorithm of a multi-layer neural network to which an input image is input, arithmetic processing corresponding to each of a plurality of layers constituting the multi-layer neural network on fixed-length data in which a decimal point position is set;
count, in the arithmetic processing, an upper-limit saturation count that is the number of times an upper limit of a value range determined by the decimal point position is exceeded and a lower-limit saturation count that is the number of times a lower limit of the value range is fallen below;
count an upper-limit saturated layer count that is the number of layers whose upper-limit saturation count is 1 or more and a lower-limit saturated layer count that is the number of layers whose lower-limit saturation count is 1 or more;
change, based on the amounts of change in the upper-limit saturated layer count and the lower-limit saturated layer count, at least one of an upper saturation threshold that is a threshold for the upper-limit saturation count and a lower saturation threshold that is a threshold for the lower-limit saturation count when at least one of them is not optimal; and
set the decimal point position for each of the plurality of layers based on a determination result as to whether at least one of the upper saturation threshold and the lower saturation threshold is optimal.
An object detection device configured as above.
A non-transitory storage medium storing a program executable by a computer to perform object detection processing, the object detection processing comprising:
performing, by the computer, according to a processing algorithm of a multi-layer neural network to which an input image is input, arithmetic processing corresponding to each of a plurality of layers constituting the multi-layer neural network on fixed-length data in which a decimal point position is set;
counting, in the arithmetic processing, an upper-limit saturation count that is the number of times an upper limit of a value range determined by the decimal point position is exceeded and a lower-limit saturation count that is the number of times a lower limit of the value range is fallen below;
counting an upper-limit saturated layer count that is the number of layers whose upper-limit saturation count is 1 or more and a lower-limit saturated layer count that is the number of layers whose lower-limit saturation count is 1 or more;
changing, based on the amounts of change in the upper-limit saturated layer count and the lower-limit saturated layer count, at least one of an upper saturation threshold that is a threshold for the upper-limit saturation count and a lower saturation threshold that is a threshold for the lower-limit saturation count when at least one of them is not optimal; and
setting the decimal point position for each of the plurality of layers based on a determination result as to whether at least one of the upper saturation threshold and the lower saturation threshold is optimal.
A non-transitory storage medium.
12 Object detection calculation unit
14 Decimal point position control unit
16 Saturation count counter
18 Threshold determination unit
20 Saturated layer count counter
22 Change parameter storage unit
30 Computer
34A Object detection program
Claims (8)
- 1. An object detection device comprising:
an object detection calculation unit that performs, according to a processing algorithm of a multi-layer neural network to which an input image is input, arithmetic processing corresponding to each of a plurality of layers constituting the multi-layer neural network on fixed-length data in which a decimal point position is set;
a saturation count counter that counts, in the arithmetic processing, an upper-limit saturation count that is the number of times an upper limit of a value range determined by the decimal point position is exceeded and a lower-limit saturation count that is the number of times a lower limit of the value range is fallen below;
a saturated layer count counter that counts an upper-limit saturated layer count that is the number of layers whose upper-limit saturation count is 1 or more and a lower-limit saturated layer count that is the number of layers whose lower-limit saturation count is 1 or more;
a threshold determination unit that determines, based on the amounts of change in the upper-limit saturated layer count and the lower-limit saturated layer count counted by the saturated layer count counter, whether at least one of an upper saturation threshold that is a threshold for the upper-limit saturation count and a lower saturation threshold that is a threshold for the lower-limit saturation count is optimal, and changes at least one of the upper saturation threshold and the lower saturation threshold when it determines that at least one of them is not optimal; and
a decimal point position control unit that sets the decimal point position for each of the plurality of layers based on the determination result of the threshold determination unit.
- 2. The object detection device according to claim 1, wherein the threshold determination unit determines whether the upper saturation threshold and the lower saturation threshold are optimal using the amounts of change in the upper-limit saturated layer count and the lower-limit saturated layer count between two object detections by the object detection calculation unit.
- 3. The object detection device according to claim 1 or 2, wherein the threshold determination unit increases at least one of the upper saturation threshold and the lower saturation threshold determined not to be optimal, and the decimal point position control unit sets the decimal point position based on the upper-limit saturation count and the lower-limit saturation count in a plurality of object detections by the object detection calculation unit when the upper saturation threshold and the lower saturation threshold are determined to be optimal.
- 4. The object detection device according to claim 3, wherein the threshold determination unit sets, for each of the plurality of layers, an increment value for increasing at least one of the upper saturation threshold and the lower saturation threshold determined not to be optimal.
- 5. The object detection device according to any one of claims 1 to 4, wherein the saturated layer count counter determines that the degree of change in the input image is large when the amount of change in the upper-limit saturated layer count between the previous and current object detections exceeds an upper-limit change amount threshold, when the amount of change in the lower-limit saturated layer count between the previous and current object detections exceeds a lower-limit change amount threshold, or in both cases, and the threshold determination unit initializes the decimal point position, the upper saturation threshold, and the lower saturation threshold when the degree of change in the input image is determined to be large.
- 6. The object detection device according to any one of claims 1 to 5, further comprising a change parameter storage unit that stores the number of times the upper saturation threshold and the lower saturation threshold have been changed, wherein the threshold determination unit changes, when the number of changes reaches a predetermined maximum number, the upper saturation threshold and the lower saturation threshold to the upper saturation threshold and the lower saturation threshold for which the amounts of change in the upper-limit saturated layer count and the lower-limit saturated layer count are smallest.
- 7. An object detection method in which a computer executes processing comprising:
performing, according to a processing algorithm of a multi-layer neural network to which an input image is input, arithmetic processing corresponding to each of a plurality of layers constituting the multi-layer neural network on fixed-length data in which a decimal point position is set;
counting, in the arithmetic processing, an upper-limit saturation count that is the number of times an upper limit of a value range determined by the decimal point position is exceeded and a lower-limit saturation count that is the number of times a lower limit of the value range is fallen below;
counting an upper-limit saturated layer count that is the number of layers whose upper-limit saturation count is 1 or more and a lower-limit saturated layer count that is the number of layers whose lower-limit saturation count is 1 or more;
changing, based on the counted amounts of change in the upper-limit saturated layer count and the lower-limit saturated layer count, at least one of an upper saturation threshold that is a threshold for the upper-limit saturation count and a lower saturation threshold that is a threshold for the lower-limit saturation count when at least one of them is not optimal; and
setting the decimal point position for each of the plurality of layers based on a determination result as to whether at least one of the upper saturation threshold and the lower saturation threshold is optimal.
- 8. An object detection program causing a computer to execute processing comprising:
performing, according to a processing algorithm of a multi-layer neural network to which an input image is input, arithmetic processing corresponding to each of a plurality of layers constituting the multi-layer neural network on fixed-length data in which a decimal point position is set;
counting, in the arithmetic processing, an upper-limit saturation count that is the number of times an upper limit of a value range determined by the decimal point position is exceeded and a lower-limit saturation count that is the number of times a lower limit of the value range is fallen below;
counting an upper-limit saturated layer count that is the number of layers whose upper-limit saturation count is 1 or more and a lower-limit saturated layer count that is the number of layers whose lower-limit saturation count is 1 or more;
changing, based on the counted amounts of change in the upper-limit saturated layer count and the lower-limit saturated layer count, at least one of an upper saturation threshold that is a threshold for the upper-limit saturation count and a lower saturation threshold that is a threshold for the lower-limit saturation count when at least one of them is not optimal; and
setting the decimal point position for each of the plurality of layers based on a determination result as to whether at least one of the upper saturation threshold and the lower saturation threshold is optimal.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2021/019953 WO2022249316A1 (ja) | 2021-05-26 | 2021-05-26 | Object detection device, object detection method, and object detection program |
JP2023523790A JPWO2022249316A1 (ja) | 2021-05-26 | 2021-05-26 | |
EP21942970.1A EP4318388A1 (en) | 2021-05-26 | 2021-05-26 | Object detection device, object detection method, and object detection program |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2021/019953 WO2022249316A1 (ja) | 2021-05-26 | 2021-05-26 | Object detection device, object detection method, and object detection program |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022249316A1 true WO2022249316A1 (ja) | 2022-12-01 |
Family
ID=84228636
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2021/019953 WO2022249316A1 (ja) | 2021-05-26 | 2021-05-26 | Object detection device, object detection method, and object detection program |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP4318388A1 (ja) |
JP (1) | JPWO2022249316A1 (ja) |
WO (1) | WO2022249316A1 (ja) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2018124681A (ja) * | 2017-01-30 | 2018-08-09 | Fujitsu Limited | Arithmetic processing device, information processing device, method, and program |
-
2021
- 2021-05-26 JP JP2023523790A patent/JPWO2022249316A1/ja active Pending
- 2021-05-26 EP EP21942970.1A patent/EP4318388A1/en active Pending
- 2021-05-26 WO PCT/JP2021/019953 patent/WO2022249316A1/ja active Application Filing
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2018124681A (ja) * | 2017-01-30 | 2018-08-09 | Fujitsu Limited | Arithmetic processing device, information processing device, method, and program |
Non-Patent Citations (6)
Title |
---|
HATTA, SAKI ET AL.: "Dynamic Fixed-point Control Method for Object-detection AI Inference Hardware", PROCEEDINGS OF THE 2020 IEICE ENGINEERING SCIENCES SOCIETY / NOLTA SOCIETY CONFERENCE, 1 September 2020 (2020-09-01), pages 33 * |
IEEE INTERNATIONAL CONFERENCE ON UBIQUITOUS COMPUTING AND COMMUNICATIONS (ISPA/IUCC, 2017, pages 143 - 150 |
JOSEPH REDMON, YOLOV3: AN INCREMENTAL IMPROVEMENT, Retrieved from the Internet <URL:https://arxiv.org/abs/1804.02767> |
SAKI HATTA; HIROYUKI UZAWA; SHUHEI YOSHIDA; KOYO NITTA: "Dynamic Fixed-point Control Method for Object-detection AI Inference Hardware", THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS, September 2020 (2020-09-01) |
WEI LIU, SSD: SINGLE SHOT MULTIBOX DETECTOR, Retrieved from the Internet <URL:https://arxiv.org/pdf/1512.02325.pdf> |
ZHISHENG LI: "Laius: An 8-Bit Fixed-Point CNN Hardware Inference Engine", IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS, 2017 |
Also Published As
Publication number | Publication date |
---|---|
JPWO2022249316A1 (ja) | 2022-12-01 |
EP4318388A1 (en) | 2024-02-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20210150248A1 (en) | Dynamic quantization for deep neural network inference system and method | |
US10269119B2 (en) | System and method for background and foreground segmentation | |
US9230148B2 (en) | Method and system for binarization of two dimensional code image | |
US9002136B2 (en) | Denoising apparatus, system and method | |
US20140064612A1 (en) | Apparatus and a method for coding an image | |
- CN113221768A (zh) | Recognition model training method, recognition method, apparatus, device, and storage medium | |
- US12003871B2 (en) | Generating sparse sample histograms in image processing | |
- CN112488060B (zh) | Target detection method, apparatus, device, and medium | |
- CN109040579B (zh) | Shooting control method, terminal, and computer-readable medium | |
- WO2022249316A1 (ja) | Object detection device, object detection method, and object detection program | |
- CN113642711B (zh) | Network model processing method, apparatus, device, and storage medium | |
- WO2022145016A1 (ja) | Data processing device, data processing method, and data processing program | |
- WO2022003855A1 (ja) | Data processing device and data processing method | |
- Bora | An efficient innovative approach towards color image enhancement | |
- US20240232593A9 (en) | Data processing device, data processing method, and data processing program | |
- CN111475135B (zh) | Multiplier | |
- KR20210156538A (ko) | Data processing method and data processing apparatus using a neural network | |
- JP2017041732A (ja) | Image processing apparatus, image processing method, and program | |
- JP6125331B2 (ja) | Texture detection device, texture detection method, texture detection program, and image processing system | |
- KR102467240B1 (ko) | Apparatus and method for removing noise from an image | |
- CN113383347A (zh) | Information processing device, information processing method, and information processing program | |
- WO2024004221A1 (ja) | Arithmetic processing device, arithmetic processing method, and arithmetic processing program | |
- Pirkl et al. | Self-adaptive FPGA-based image processing filters using approximate arithmetics | |
- CN108965700B (zh) | Shooting control method, terminal, and computer-readable medium | |
- CN112734681A (zh) | Ultra-high-definition video display method and system | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21942970 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2023523790 Country of ref document: JP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2021942970 Country of ref document: EP |
|
ENP | Entry into the national phase |
Ref document number: 2021942970 Country of ref document: EP Effective date: 20231025 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |