WO2023095934A1 - Method and system for weight reduction of a head neural network of an object detector
- Publication number
- WO2023095934A1 (application PCT/KR2021/017317)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- neural network
- anchor
- object detector
- detector model
- pruning
Classifications
- G06N3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
- G06N3/04: Architecture, e.g. interconnection topology
- G06N3/0464: Convolutional networks [CNN, ConvNet]
- G06N3/08: Learning methods
- G06T7/20: Analysis of motion
- G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Definitions
- Embodiments of the present invention relate to a method and system for weight reduction of a head neural network of an object detector, and more particularly, to a weight reduction method and system specialized for the head neural network rather than centered on the backbone neural network.
- This work was supported by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) in 2021 (No. 2021-0-00907, Development of Adaptive and Lightweight Edge-Collaborative Analysis Technology for Enabling Proactively Immediate Response and Rapid Learning).
- A deep neural network-based object detector model consists of a backbone neural network that extracts features from the input and a head neural network that predicts the coordinates and type of each object.
- Conventionally, the main approach has been to use an efficient convolutional neural network such as MobileNet instead of a relatively large, high-performing convolutional neural network model.
- Alternatively, the backbone neural network model is compressed using weight reduction techniques such as pruning and low-rank approximation.
- A weight reduction method performed by a computer device including at least one processor is provided, the method comprising: receiving, by the at least one processor, an object detector model; replacing, by the at least one processor, a head neural network of the input object detector model; determining, by the at least one processor, whether to perform anchor pruning; performing, by the at least one processor, anchor pruning on the object detector model whose head neural network has been replaced, when it is determined to perform the anchor pruning; and outputting, by the at least one processor, a lightweight object detector model.
- The step of replacing the head neural network may reduce the number of output channels of a convolutional layer constituting the head neural network of the input object detector model.
- The step of replacing the head neural network may replace a convolutional layer constituting the head neural network of the input object detector model with another, more efficient convolutional layer or block (e.g., a shuffle block).
- The step of replacing the head neural network may replace the head neural network of the input object detector model with a head neural network found using a neural architecture search (NAS) method.
- The anchor pruning may include: measuring the importance of anchors; removing anchors that fall within a predetermined lowest ratio based on the importance of the anchors; and retraining the object detector model from which those anchors have been removed.
- The step of measuring the importance of the anchors may determine the importance of each independent anchor based on the extent of performance degradation before and after removing the output of that anchor.
- Alternatively, the step of measuring the importance of the anchors may determine the importance of the anchors based on the degree of redundancy of the bounding boxes predicted by each anchor.
- The redundancy of the bounding boxes may be calculated based on a value obtained by dividing the number of anchors whose Intersection over Union (IoU) score with the bounding box predicted by a first anchor is equal to or greater than a preset value by the number of bounding boxes predicted by one anchor.
- The IoU score may be calculated based on a value obtained by dividing the area of the overlapping region of two bounding boxes predicted by two anchors by the total area covered by the two bounding boxes.
- The outputting of the lightweight object detector model may include: when the anchor pruning is not performed, outputting, as the lightweight object detector model, the object detector model in which the head neural network has been replaced; and when the anchor pruning is performed, outputting, as the lightweight object detector model, the object detector model in which the head neural network has been replaced and on which the anchor pruning has been performed.
- A computer program stored in a computer-readable recording medium is provided to execute the method on a computer device in combination with the computer device.
- A computer-readable recording medium is provided on which a program for executing the method on a computer device is recorded.
- A computer device is provided that includes at least one processor implemented to execute computer-readable instructions, wherein the at least one processor receives an object detector model, replaces the head neural network of the received object detector model, determines whether to perform anchor pruning, performs anchor pruning on the object detector model whose head neural network has been replaced when it is determined to perform the anchor pruning, and outputs a lightweight object detector model.
- FIG. 1 is a diagram illustrating an example of a network environment according to an embodiment of the present invention.
- FIG. 2 is a block diagram illustrating an example of a computer device according to one embodiment of the present invention.
- FIG. 3 is a diagram showing an example of a structure of an object detector according to an embodiment of the present invention.
- FIG. 4 is a diagram illustrating an example of an original image and a feature map according to an embodiment of the present invention.
- FIG. 5 is a flowchart illustrating an example of a weight reduction method according to an embodiment of the present invention.
- FIG. 6 is a diagram illustrating an example of calculating an IoU score according to an embodiment of the present invention.
- A weight reduction system according to embodiments of the present invention may be implemented by at least one computer device.
- a computer program according to an embodiment of the present invention may be installed and driven in the computer device, and the computer device may perform the weight reduction method according to the embodiments of the present invention under the control of the driven computer program.
- the above-described computer program may be combined with a computer device and stored in a computer readable recording medium to execute the weight reduction method on a computer.
- FIG. 1 is a diagram illustrating an example of a network environment according to an embodiment of the present invention.
- the network environment of FIG. 1 shows an example including a plurality of electronic devices 110 , 120 , 130 , and 140 , a plurality of servers 150 and 160 , and a network 170 .
- FIG. 1 is an example for explaining the invention, and the number of electronic devices or servers is not limited to that shown in FIG. 1.
- the network environment of FIG. 1 only describes one example of environments applicable to the present embodiments, and the environment applicable to the present embodiments is not limited to the network environment of FIG. 1 .
- the plurality of electronic devices 110, 120, 130, and 140 may be fixed terminals implemented as computer devices or mobile terminals.
- Examples of the plurality of electronic devices 110, 120, 130, and 140 include a smartphone, a mobile phone, a navigation device, a computer, a laptop computer, a digital broadcast terminal, a personal digital assistant (PDA), a portable multimedia player (PMP), and a tablet PC.
- FIG. 1 shows the shape of a smartphone as an example of the electronic device 110, but in the embodiments of the present invention, the electronic device 110 may refer to any of various physical computer devices capable of communicating with the other electronic devices 120, 130, and 140 and/or the servers 150 and 160 via the network 170 using a wireless or wired communication method.
- The communication method is not limited, and may include not only communication methods that use a communication network the network 170 may include (e.g., a mobile communication network, the wired Internet, the wireless Internet, or a broadcasting network) but also short-range wireless communication between devices.
- For example, the network 170 may include any one or more of networks such as a personal area network (PAN), a local area network (LAN), a campus area network (CAN), a metropolitan area network (MAN), a wide area network (WAN), a broadband network (BBN), and the Internet.
- In addition, the network 170 may include any one or more network topologies, including a bus network, a star network, a ring network, a mesh network, a star-bus network, and a tree or hierarchical network, but is not limited thereto.
- Each of the servers 150 and 160 may be implemented as a computer device or a plurality of computer devices that communicate with the plurality of electronic devices 110, 120, 130, and 140 through the network 170 to provide commands, code, files, content, services, and the like.
- For example, the server 150 may provide a service (e.g., an instant messaging service, a social network service, a payment service, a virtual exchange service, a risk monitoring service, a game service, a group call service (or voice conference service), a messaging service, a mail service, a map service, a translation service, a financial service, a search service, a content provision service, etc.) to the plurality of electronic devices 110, 120, 130, and 140 connected through the network 170.
- FIG. 2 is a block diagram illustrating an example of a computer device according to one embodiment of the present invention.
- Each of the plurality of electronic devices 110 , 120 , 130 , and 140 or each of the servers 150 and 160 described above may be implemented by the computer device 200 shown in FIG. 2 .
- the computer device 200 may include a memory 210, a processor 220, a communication interface 230, and an input/output interface 240.
- The memory 210 is a computer-readable recording medium and may include a random access memory (RAM), a read only memory (ROM), and a permanent mass storage device such as a disk drive.
- Here, a permanent mass storage device such as a ROM or a disk drive may also be included in the computer device 200 as a separate permanent storage device distinct from the memory 210.
- an operating system and at least one program code may be stored in the memory 210 . These software components may be loaded into the memory 210 from a computer-readable recording medium separate from the memory 210 .
- the separate computer-readable recording medium may include a computer-readable recording medium such as a floppy drive, a disk, a tape, a DVD/CD-ROM drive, and a memory card.
- software components may be loaded into the memory 210 through the communication interface 230 rather than a computer-readable recording medium.
- software components may be loaded into memory 210 of computer device 200 based on a computer program installed by files received over network 170 .
- the processor 220 may be configured to process commands of a computer program by performing basic arithmetic, logic, and input/output operations. Instructions may be provided to processor 220 by memory 210 or communication interface 230 . For example, processor 220 may be configured to execute received instructions according to program codes stored in a recording device such as memory 210 .
- The communication interface 230 may provide a function for the computer device 200 to communicate with other devices (e.g., the storage devices described above) through the network 170.
- For example, a request, command, data, file, etc. generated by the processor 220 of the computer device 200 according to program code stored in a recording device such as the memory 210 may be transferred to other devices through the network 170 under the control of the communication interface 230.
- Conversely, signals, commands, data, files, etc. from other devices may be received by the computer device 200 through the communication interface 230 of the computer device 200 via the network 170.
- Signals, commands, data, etc. received through the communication interface 230 may be transferred to the processor 220 or the memory 210, and files, etc. may be stored in a storage medium (the permanent storage device described above) that the computer device 200 may further include.
- The input/output interface 240 may be a means for interfacing with the input/output device 250.
- For example, the input device may include a device such as a microphone, keyboard, or mouse, and the output device may include a device such as a display or speaker.
- As another example, the input/output interface 240 may be a means for interfacing with a device in which input and output functions are integrated, such as a touch screen.
- At least one of the input/output devices 250 may be configured as one device with the computer device 200 . For example, like a smart phone, a touch screen, a microphone, a speaker, and the like may be implemented in a form included in the computer device 200 .
- computer device 200 may include fewer or more elements than those of FIG. 2 . However, there is no need to clearly show most of the prior art components.
- the computer device 200 may be implemented to include at least some of the aforementioned input/output devices 250 or may further include other components such as a transceiver and a database.
- FIG. 3 is a diagram showing an example of a structure of an object detector according to an embodiment of the present invention.
- In general, an object detector may be divided into a backbone neural network, a neck (intermediate) neural network, and a head neural network; in the object detector according to the present embodiment, however, the term head neural network is defined to cover both the intermediate neural network and the head neural network.
- the pixels of the input image can be compressed into abstract values while passing through the convolutional layer of the backbone neural network, and can be expressed as a feature map with a smaller resolution than before passing through the convolutional layer.
- The (1/n) in "Convolutional layer (1/n)" shown in FIG. 3 may mean a reduction in the spatial size of an image after passing through the corresponding convolutional layer. For example, when an image with a resolution of 800 × 800 passes through "Convolutional layer (1/2)", it can be reduced to a resolution of 400 × 400.
- A new feature map can be generated by passing feature maps of different sizes through another convolutional layer. At this time, except for the feature map of the smallest size ("Feature map (1/32)" in the embodiment of FIG. 3), the feature maps are upsampled to a matching size and combined to create a new feature map, which is then passed through a convolutional layer.
- Each grid cell of a feature map reduced to 1/n size represents n cells of the original image. For each grid cell, the head neural network predicts how each of the m anchors pre-specified for that cell must be adjusted to form a bounding box that encloses an object, and the type of the object contained in the bounding box.
- m identical anchors are applied to each feature map, and different anchors can be defined in different feature maps. If 9 predefined anchors exist for each feature map and 4 feature maps are generated, a total of 36 independent anchors can be defined.
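- The anchor bookkeeping above can be illustrated with a short sketch. Only the figures of 9 anchors per feature map and 4 feature maps come from the text; the 800 × 800 input, the reduction factors in `SCALES`, and the function name `count_anchors` are assumptions made for the example.

```python
# Hypothetical reduction factors; the text only fixes 4 feature maps with 9 anchors each.
SCALES = (4, 8, 16, 32)
ANCHORS_PER_MAP = 9


def count_anchors(image_size, scales=SCALES, anchors_per_map=ANCHORS_PER_MAP):
    """Return (independent anchors, total predicted boxes) for a square image."""
    independent = anchors_per_map * len(scales)
    total_boxes = 0
    for n in scales:
        grid = image_size // n  # a 1/n feature map has (size/n) x (size/n) grid cells
        total_boxes += grid * grid * anchors_per_map
    return independent, total_boxes


independent, total_boxes = count_anchors(800)
print(independent)   # 36 independent anchors, as in the text
print(total_boxes)   # boxes predicted before any filtering
```

- Removing one independent anchor removes its prediction at every grid cell of its feature map, which is why pruning even a few anchors can discard a large number of candidate boxes.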
- the weight reduction method according to the present embodiment may be performed by at least one computer device 200 implementing the weight reduction system.
- the processor 220 of the computer device 200 may be implemented to execute a control instruction according to an operating system code or at least one computer program code included in the memory 210 .
- For example, the processor 220 may control the computer device 200 so that the computer device 200 performs steps 510 to 550 included in the method of FIG. 5 according to control instructions provided by code stored in the computer device 200.
- In step 510, the computer device 200 may receive an object detector model.
- For example, the computer device 200 may receive the original model of an object detector whose head neural network, out of the overall structure, is to be made lightweight.
- In step 520, the computer device 200 may replace the head neural network of the input object detector model.
- For example, the computer device 200 may replace the head neural network included in the original model of the object detector with a relatively small (lightweight) head neural network. Because a backbone neural network trained as part of a large model has better feature extraction ability, better performance can be obtained in step 520 by modifying the head neural network while maintaining the parameters of the backbone neural network of the pre-trained input model.
- In this case, the computer device 200 may replace the head neural network with a lightweight head neural network through at least one of methods (1) to (3) below.
- (1) Searching for a high-performance, high-efficiency head neural network using neural architecture search (NAS): for example, the computer device 200 may replace the head neural network of the input object detector model with a head neural network found using a technique that automatically searches for neural network structures.
- (2) Reducing the number of output channels of the convolutional layers constituting the head neural network: for example, the computer device 200 may replace the head neural network by reducing the number of output channels of the convolutional layers constituting the head neural network of the input object detector model.
- (3) Replacing the convolutional layers constituting the head neural network with more efficient convolutional layers or blocks (e.g., a shuffle block): for example, the computer device 200 may replace a convolutional layer constituting the head neural network of the input object detector model with another convolutional layer or block.
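- To give a feel for how much reducing output channels can save, the following sketch counts the learnable parameters of a plain stack of convolutional layers before and after halving their widths. The layer shapes (three 3 × 3 convolutions on a 256-channel backbone output) and the helper names are hypothetical; the text does not specify the head architecture.

```python
def conv_params(c_in, c_out, kernel=3, bias=True):
    """Number of learnable parameters in a standard 2D convolution."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)


def head_params(c_backbone, widths, kernel=3):
    """Parameters of a plain stack of conv layers fed by a backbone feature map.

    c_backbone: channels delivered by the (unchanged) backbone.
    widths: output channels of each successive head layer.
    """
    total, c_in = 0, c_backbone
    for c_out in widths:
        total += conv_params(c_in, c_out, kernel)
        c_in = c_out
    return total


# Hypothetical head: three 3x3 conv layers on a 256-channel backbone output.
original = head_params(256, [256, 256, 256])
slimmed = head_params(256, [128, 128, 128])  # halve the output channels
print(original, slimmed, round(slimmed / original, 3))
```

- Because a layer's parameter count depends on both its input and output channels, halving every width shrinks the later layers by roughly a factor of four, while the first layer (still fed by the unchanged backbone) shrinks only by half.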
- After the head neural network is replaced, its parameters are set to initial values. To bring the performance of the model with the replaced head neural network up to that of the original object detector model, the model with the replaced head neural network can be retrained using the training data of the original model. As in transfer learning, training starts from the parameters of the backbone neural network, which were learned on a large model and are already good; it is possible either to train only the replaced (lightweight) head neural network or to update the backbone neural network together with it.
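- The two retraining options can be expressed as a choice of which parameters are updated. The sketch below is framework-independent and assumes, purely for illustration, that parameters are named with a `backbone.` or `head.` prefix; the function name `trainable_parameters` is likewise hypothetical.

```python
def trainable_parameters(param_names, update_backbone):
    """Pick which parameters to update during retraining.

    Head parameters are always retrained because the replaced head starts
    from initial values; backbone parameters start from the pre-trained
    values and are updated only when update_backbone is True.
    """
    selected = []
    for name in param_names:
        if name.startswith("head.") or update_backbone:
            selected.append(name)
    return selected


# Hypothetical parameter names in "<part>.<layer>.<tensor>" form.
names = ["backbone.conv1.weight", "backbone.conv2.weight",
         "head.conv1.weight", "head.out.weight"]

print(trainable_parameters(names, update_backbone=False))  # head only
print(trainable_parameters(names, update_backbone=True))   # backbone and head
```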
- Method (2) of reducing the number of output channels of the convolutional layers is the easiest to use. Experiments confirmed that adjusting the number of channels of the head neural network while maintaining the parameters of the backbone neural network, and then retraining the entire backbone and head neural network, improves latency (inference speed) while achieving better performance than the original model.
- The object detector model used in the experiment was "Yolo v5": while the parameters of the backbone neural network of the input object detector model were maintained, the head neural network was replaced with a head neural network composed of convolutional layers with a smaller number of output channels.
- The data used was OCR (Optical Character Reader) data for recognizing string objects in images; the model was trained on 36,939 images and evaluated on 3,000 evaluation samples. The F1 score was used as the performance indicator of the object detector.
- the performance evaluation results are shown in Table 1 below.
- In step 530, the computer device 200 may determine whether to perform anchor pruning.
- Anchor pruning can be performed to further improve inference speed after the head neural network is replaced.
- By removing anchors predefined in the object detector, anchor pruning can improve inference speed in the following two respects, (1) and (2).
- (1) It reduces the number of correction values (modifiers) that must be predicted for the anchors.
- (2) It reduces the number of bounding boxes entering the NMS (Non-Maximum Suppression) process, which pairs and compares predicted bounding boxes to select the best bounding box among overlapping ones.
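- The effect on NMS can be estimated with simple arithmetic: since NMS compares boxes in pairs, an upper bound on the number of comparisons grows quadratically with the number of candidate boxes. The figures below (36 anchors pruned to 27, each contributing 100 candidates) and the helper names are hypothetical and serve only to show the scaling.

```python
def worst_case_nms_pairs(num_boxes):
    """Upper bound on pairwise IoU comparisons in greedy NMS over num_boxes boxes."""
    return num_boxes * (num_boxes - 1) // 2


def candidate_boxes(num_anchors, boxes_per_anchor):
    """Boxes entering NMS if every anchor contributes boxes_per_anchor candidates."""
    return num_anchors * boxes_per_anchor


# Hypothetical numbers: 36 anchors before pruning, 27 after removing 25% of them,
# each contributing 100 candidate boxes after score thresholding.
before = worst_case_nms_pairs(candidate_boxes(36, 100))
after = worst_case_nms_pairs(candidate_boxes(27, 100))
print(before, after, round(after / before, 3))
```

- Under these assumptions, removing a quarter of the anchors cuts the worst-case comparison count by a little under half.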
- Anchor pruning is difficult to use if the total number of predefined anchors is small or if the object detector is anchor-free, but most high-performance object detectors have a large number of anchors, so anchor pruning is applicable to them.
- If it is determined to perform anchor pruning, step 540 may be performed, and if it is determined not to perform anchor pruning, step 550 may be performed.
- In step 540, the computer device 200 may perform anchor pruning on the model in which the head neural network has been replaced.
- Anchor pruning can be performed in the three steps (1) to (3) below.
- (1) Anchor importance measurement: for example, the computer device 200 may measure the importance of the anchors.
- (2) Removal of r% of the anchors: for example, the computer device 200 may remove r% of the anchors based on their importance. Here, r may be a natural number, and the computer device 200 may remove the anchors that fall in the lowest r% of all anchors by importance.
- (3) Model retraining: for example, the computer device 200 may retrain the model from which r% of the anchors have been removed.
- The anchor importance measurement of (1) can be performed through one of the following two methods, (a) and (b).
- (a) Measurement based on performance degradation: the computer device 200 may evaluate performance on the verification data set after removing, from the stored prediction values for each anchor, the outputs of each independent anchor in turn. At this time, the computer device 200 may measure the extent of performance degradation relative to the unmodified model, regard anchors whose removal causes a relatively large degradation as important and anchors whose removal causes a relatively small degradation as unimportant, and sort the anchors accordingly. In other words, the computer device 200 may sort the anchors by determining the importance of each independent anchor based on the extent of performance degradation before and after removing the output of that anchor.
- (b) Measurement based on bounding-box redundancy: the computer device 200 may determine the importance of anchors based on the degree of redundancy of the bounding boxes predicted by each anchor. The redundancy for an anchor may be calculated as the number of anchors whose predicted bounding boxes have an Intersection over Union (IoU) score of x or more with a bounding box predicted by that anchor, divided by the number of bounding boxes predicted by one anchor.
- The IoU score measures the degree to which two bounding boxes overlap, and can be calculated by dividing the area of the region where the two bounding boxes overlap by the area of the union of the two bounding boxes. The closer the IoU score is to 1, the more nearly identical the two bounding boxes are; the closer it is to 0, the more they differ.
- In FIG. 6, "Score" corresponds to the IoU score, "Area of overlap" means the area of the region where the two bounding boxes overlap, and "Area of union" means the total area covered by the two bounding boxes. That is, Score = Area of overlap / Area of union.
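- The two importance measures and the IoU score can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: boxes are assumed to be axis-aligned `(x1, y1, x2, y2)` tuples, the `evaluate` callback stands in for a real evaluation on the verification data set, a larger score drop is taken to mean a more important anchor, and the redundancy formula follows one possible reading of the text (for each box of an anchor, count the other anchors that predict a box with IoU of at least the threshold, then average over the anchor's boxes). All function and variable names are hypothetical.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    overlap = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - overlap
    return overlap / union if union > 0 else 0.0


def degradation_importance(anchor_ids, evaluate):
    """(a) Score lost when the outputs of a single anchor are discarded.

    evaluate(removed) returns a validation score (e.g. F1) computed with the
    outputs of the anchors in `removed` thrown away.
    """
    baseline = evaluate(frozenset())
    return {a: baseline - evaluate(frozenset([a])) for a in anchor_ids}


def redundancy(anchor, predictions, threshold):
    """(b) Average number of other anchors that duplicate a box of `anchor`."""
    boxes = predictions[anchor]
    if not boxes:
        return 0.0
    duplicates = 0
    for box in boxes:
        for other, other_boxes in predictions.items():
            if other != anchor and any(iou(box, o) >= threshold for o in other_boxes):
                duplicates += 1
    return duplicates / len(boxes)


# Toy stand-in for evaluation: each anchor contributes a fixed share of the score.
CONTRIBUTION = {"a0": 0.30, "a1": 0.02, "a2": 0.15, "a3": 0.00}


def toy_evaluate(removed):
    return sum(v for a, v in CONTRIBUTION.items() if a not in removed)


drop = degradation_importance(list(CONTRIBUTION), toy_evaluate)
order = sorted(CONTRIBUTION, key=lambda a: drop[a])  # least important first
print(order)

predictions = {
    "a0": [(0, 0, 10, 10), (20, 20, 30, 30)],
    "a1": [(0, 0, 10, 10)],    # duplicates a0's first box exactly
    "a2": [(50, 50, 60, 60)],  # overlaps nothing
}
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # overlap 50, union 150
print({a: redundancy(a, predictions, threshold=0.5) for a in predictions})
```

- In the toy data, anchor `a1` only ever repeats a box that `a0` already predicts, so it receives the highest redundancy; treating high redundancy as low importance is an assumption of this sketch.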
- the computer device 200 may remove anchors of the lower r% of the ordered anchors among the predefined anchors.
- the value of r may be preset in consideration of the degree of improvement in inference speed versus performance degradation using the verification data set.
- After anchor pruning, the computer device 200 may retrain the model so that it adapts to the newly defined set of anchors, using the same training data that was used to train the original object detector.
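- Removing the lowest r% of anchors can be sketched as a slice over an importance-sorted list. Rounding the number of removed anchors down is an assumption of this example, as is the function name `prune_lowest`; the text only states that r is preset using the verification data set.

```python
import math


def prune_lowest(ranked_anchors, r):
    """Drop the lowest r% of anchors.

    ranked_anchors must be ordered from least to most important.
    Returns (kept, removed).
    """
    if not 0 <= r <= 100:
        raise ValueError("r must be a percentage between 0 and 100")
    n_remove = math.floor(len(ranked_anchors) * r / 100)
    return ranked_anchors[n_remove:], ranked_anchors[:n_remove]


# 36 hypothetical anchors, already sorted by importance (least important first).
ranked = [f"anchor_{i}" for i in range(36)]
kept, removed = prune_lowest(ranked, r=25)
print(len(kept), len(removed))
```

- The model would then be retrained with only the `kept` anchors defined.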
- In step 550, the computer device 200 may output a lightweight object detector model.
- If step 540 is not performed, the computer device 200 may output, as the lightweight object detector model, the model in which the head neural network has been replaced; if step 540 is performed, the computer device 200 may output the model in which the head neural network has been replaced and on which anchor pruning has been performed.
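- The overall flow of steps 510 to 550 can be summarized as a small driver function. The callables passed in are placeholders for the head replacement, the pruning decision, and the anchor pruning described above; representing a model as a dictionary is purely a toy assumption, and all names are hypothetical.

```python
def lighten_detector(model, replace_head, should_prune, prune_anchors):
    """Receive a model (510), replace its head (520), decide (530),
    optionally prune anchors (540), and return the result (550)."""
    model = replace_head(model)        # includes retraining the new head
    if should_prune(model):
        model = prune_anchors(model)   # measure importance, remove r%, retrain
    return model


# Toy stand-ins: a "model" is just a dict describing its configuration.
def toy_replace_head(model):
    return {**model, "head": "light"}


def toy_prune(model):
    return {**model, "anchors": model["anchors"] * 3 // 4}


original = {"backbone": "pretrained", "head": "original", "anchors": 36}

head_only = lighten_detector(original, toy_replace_head, lambda m: False, toy_prune)
head_and_pruned = lighten_detector(original, toy_replace_head,
                                   lambda m: m["anchors"] > 9, toy_prune)
print(head_only)
print(head_and_pruned)
```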
- the system or device described above may be implemented as a hardware component or a combination of hardware components and software components.
- Devices and components described in the embodiments may include, for example, a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions.
- the processing device may run an operating system (OS) and one or more software applications running on the operating system.
- a processing device may also access, store, manipulate, process, and generate data in response to execution of software.
- The processing device may include a plurality of processing elements and/or a plurality of types of processing elements.
- For example, a processing device may include a plurality of processors, or a processor and a controller. Other processing configurations, such as parallel processors, are also possible.
- Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure a processing device to operate as desired or may command the processing device independently or collectively.
- Software and/or data may be embodied in any type of machine, component, physical device, virtual equipment, or computer storage medium or device in order to be interpreted by a processing device or to provide instructions or data to a processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner.
- Software and data may be stored on one or more computer readable media.
- The method according to the embodiment may be implemented in the form of program instructions that can be executed by various computer means and recorded on a computer-readable medium.
- The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination.
- The medium may store computer-executable programs persistently, or store them temporarily for execution or download.
- The medium may be any of various recording or storage means in the form of a single piece of hardware or a combination of several pieces of hardware; it is not limited to a medium directly connected to a particular computer system and may be distributed over a network. Examples of the medium include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and ROM, RAM, flash memory, and the like.
- Examples of other media include recording or storage media managed by an app store that distributes applications, or by a site or server that supplies or distributes various other software.
- Examples of program instructions include high-level language codes that can be executed by a computer using an interpreter, as well as machine language codes such as those produced by a compiler.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Biophysics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Image Analysis (AREA)
Abstract
Disclosed are a method and a system for lightening the head neural network of an object detector. According to one embodiment, the lightening method may comprise the steps of: receiving an object detector model as input; replacing the head neural network of the object detector model received as input; determining whether or not to perform anchor pruning; if it is determined that anchor pruning is to be performed, performing anchor pruning on the object detector model in which the head neural network has been replaced; and outputting the lightened object detector model.
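The sequence of steps in the abstract can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the claimed implementation: the `DetectorModel` structure, the `lighten_detector` function, and the `should_prune` and `prune_anchors` callbacks are hypothetical names introduced here, and the abstract does not specify how the pruning decision or the pruning itself is carried out.

```python
from dataclasses import dataclass, replace
from typing import Callable, Tuple


@dataclass(frozen=True)
class DetectorModel:
    """Toy stand-in for an object detector model (hypothetical structure)."""
    backbone: str
    head: str
    anchors: Tuple[str, ...]


def lighten_detector(
    model: DetectorModel,
    light_head: str,
    should_prune: Callable[[DetectorModel], bool],
    prune_anchors: Callable[[Tuple[str, ...]], Tuple[str, ...]],
) -> DetectorModel:
    """Follow the abstract's steps; both callbacks are assumptions."""
    # Step 1: the object detector model is received as input (`model`).
    # Step 2: replace the head neural network of the received model.
    candidate = replace(model, head=light_head)
    # Step 3: determine whether anchor pruning should be performed.
    if should_prune(candidate):
        # Step 4: prune anchors on the model whose head was replaced.
        candidate = replace(candidate, anchors=prune_anchors(candidate.anchors))
    # Step 5: output the lightened object detector model.
    return candidate


# Usage with placeholder values; the decision rule and the pruning rule
# below are arbitrary examples, not criteria taken from the document.
original = DetectorModel("backbone", "heavy_head", ("a0", "a1", "a2"))
lightened = lighten_detector(
    original,
    light_head="light_head",
    should_prune=lambda m: len(m.anchors) > 2,
    prune_anchors=lambda anchors: anchors[:2],
)
```

Using an immutable dataclass keeps the input model unchanged, so the original and the lightened model can be compared afterwards.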
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/KR2021/017317 WO2023095934A1 (fr) | 2021-11-23 | 2021-11-23 | Method and system for lightening head neural network of object detector |
KR1020237036978A KR20230162676A (ko) | 2021-11-23 | 2021-11-23 | Method and system for lightening head neural network of object detector |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/KR2021/017317 WO2023095934A1 (fr) | 2021-11-23 | 2021-11-23 | Method and system for lightening head neural network of object detector |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023095934A1 (fr) | 2023-06-01 |
Family
ID=86539786
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2021/017317 WO2023095934A1 (fr) | Method and system for lightening head neural network of object detector | 2021-11-23 | 2021-11-23 |
Country Status (2)
Country | Link |
---|---|
KR (1) | KR20230162676A (fr) |
WO (1) | WO2023095934A1 (fr) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200250459A1 (en) * | 2019-01-11 | 2020-08-06 | Capital One Services, Llc | Systems and methods for text localization and recognition in an image of a document |
CN112614133A (zh) * | 2021-03-05 | 2021-04-06 | 北京小白世纪网络科技有限公司 | Method and device for training an anchor-free three-dimensional lung nodule detection model |
Filing Date | Country | Application Number | Publication | Status |
---|---|---|---|---|
2021-11-23 | KR | KR1020237036978A | KR20230162676A (ko) | unknown |
2021-11-23 | WO | PCT/KR2021/017317 | WO2023095934A1 (fr) | unknown |
Non-Patent Citations (3)
Title |
---|
MAXIM BONNAERENS; MATTHIAS FREIBERGER; JONI DAMBRE: "Anchor Pruning for Object Detection", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 1 June 2022 (2022-06-01), XP091236196, DOI: 10.1016/j.cviu.2022.103445 * |
WANG NING; GAO YANG; CHEN HAO; WANG PENG; TIAN ZHI; SHEN CHUNHUA; ZHANG YANNING: "NAS-FCOS: Fast Neural Architecture Search for Object Detection", 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), IEEE, 13 June 2020 (2020-06-13), pages 11940 - 11948, XP033804595, DOI: 10.1109/CVPR42600.2020.01196 * |
ZHU XINGKUI; LYU SHUCHANG; WANG XU; ZHAO QI: "TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-captured Scenarios", 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), IEEE, 11 October 2021 (2021-10-11), pages 2778 - 2788, XP034028164, DOI: 10.1109/ICCVW54120.2021.00312 * |
Also Published As
Publication number | Publication date |
---|---|
KR20230162676A (ko) | 2023-11-28 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21965722; Country of ref document: EP; Kind code of ref document: A1 |
 | ENP | Entry into the national phase | Ref document number: 20237036978; Country of ref document: KR; Kind code of ref document: A |
 | NENP | Non-entry into the national phase | Ref country code: DE |