CN112434713A - Image feature extraction method and device, electronic equipment and storage medium - Google Patents
Image feature extraction method and device, electronic equipment and storage medium
- Publication number
- CN112434713A (application CN202011390511.0A)
- Authority
- CN
- China
- Prior art keywords
- node
- nodes
- group
- input
- feature extraction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention provides an image feature extraction method and device, electronic equipment and a storage medium, wherein the method comprises the following steps: obtaining N feature images of different sizes of a target image, wherein N is an integer greater than 2; using the N feature images of different sizes as the input of at least one double-layer bidirectional feature pyramid neural network module; and obtaining the output of the at least one double-layer bidirectional feature pyramid neural network module as the image features of the target image, so as to perform target detection on the target image. The method and device provided by the invention improve the smoothness of information flow in the BiFPN, so that features of different levels are better fused and their expressive power is improved, thereby improving the performance of image processing tasks such as target detection and image classification.
Description
Technical Field
The invention relates to the technical field of computer application, in particular to an image feature extraction method and device, electronic equipment and a storage medium.
Background
The feature pyramid is a basic component of recognition systems that detect objects at different scales. The inherent multi-scale pyramid hierarchy of deep convolutional networks can be exploited to construct a feature pyramid at marginal extra cost. Currently, a top-down architecture with lateral connections is used to build high-level semantic feature maps at all scales; this architecture, called the Feature Pyramid Network (FPN), shows significant improvement as a generic feature extractor in several applications.
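For orientation, the top-down merge that characterizes an FPN can be sketched as follows. This is a minimal illustrative sketch in PyTorch, not code from the patent: the class name, channel counts, nearest-neighbour upsampling and the assumption that adjacent levels differ in size by a factor of 2 are all choices made here for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleFPN(nn.Module):
    """Minimal top-down feature pyramid: 1x1 lateral convs + upsample-and-add."""

    def __init__(self, in_channels_list, out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList(
            [nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels_list])
        self.smooth = nn.ModuleList(
            [nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
             for _ in in_channels_list])

    def forward(self, feats):
        # feats: backbone maps ordered from largest (lowest level) to smallest
        # (highest level); adjacent levels are assumed to differ by a factor of 2
        laterals = [conv(f) for conv, f in zip(self.lateral, feats)]
        for i in range(len(laterals) - 2, -1, -1):  # top-down pass
            laterals[i] = laterals[i] + F.interpolate(
                laterals[i + 1], scale_factor=2, mode="nearest")
        return [conv(x) for conv, x in zip(self.smooth, laterals)]

# e.g. three backbone maps with 128/256/512 channels at strides 8/16/32
fpn = SimpleFPN([128, 256, 512])
feats = [torch.randn(1, 128, 64, 64),
         torch.randn(1, 256, 32, 32),
         torch.randn(1, 512, 16, 16)]
outs = fpn(feats)  # three maps, all with 256 channels, spatial sizes unchanged
```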
The feature pyramid network currently has various derived structures, such as the Path Aggregation Network (PANet), simplified PANet, NAS-FPN, BiFPN, and so on. BiFPN is the neural network module that fuses features of different levels in EfficientDet (a family of target detection algorithms). BiFPN draws on the structure of PANet by adding a bottom-up feature channel alongside the top-down one, which shortens the path over which information is transmitted and improves the effect of feature fusion. At the same time, BiFPN stacks multiple such feature pyramid layers, further improving feature fusion.
However, in the current planar structure formed by stacking BiFPN layers, information flows from top to bottom, then from bottom to top, and this pattern then keeps repeating. Because there is a strict precedence between the top-down and bottom-up passes, this planar structure can make information transmission less smooth.
Therefore, how to improve the smoothness of information flow in the BiFPN, so as to better fuse the features of different levels, improve their expressive power, and thereby improve target detection performance, is a technical problem to be solved by those skilled in the art.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention provides an image feature extraction method, an image feature extraction device, electronic equipment and a storage medium, so as to improve the smoothness of information flow in BiFPN, better fuse features of different layers, and improve the expression capability of the features, thereby improving the target detection performance.
According to an aspect of the present invention, there is provided an image feature extraction method including:
obtaining N feature images of different sizes of a target image, wherein N is an integer greater than 2;
using the N feature images of different sizes as the input of at least one double-layer bidirectional feature pyramid neural network module;
and obtaining the output of the at least one double-layer bidirectional feature pyramid neural network module as the image features of the target image, so as to perform target detection on the target image.
In some embodiments of the invention, the two-layer bidirectional feature pyramid neural network module comprises:
N input nodes;
the first intermediate layer comprises N-2 groups of nodes, wherein one node in each group of nodes of the first intermediate layer forms a channel from top to bottom, and the other node in each group of nodes of the first intermediate layer forms a channel from bottom to top;
the second intermediate layer comprises N groups of nodes, wherein one node in each group of nodes of the second intermediate layer forms a channel from bottom to top, and the other node in each group of nodes of the second intermediate layer forms a channel from top to bottom;
and N output nodes, each of which is obtained by fusing the corresponding group of nodes of the second intermediate layer.
In some embodiments of the invention, the first intermediate layer comprises:
A node X_{i1} of the i-th group of nodes is obtained by fusing a node X_{i-1,1} of the (i-1)-th group of nodes with the (i+1)-th input node IN_{i+1}, and another node X_{i2} of the i-th group of nodes is obtained by fusing another node X_{i+1,2} of the (i+1)-th group of nodes with the (i+1)-th input node IN_{i+1}, wherein i is an integer greater than 1 and less than N-2, and N is an integer greater than 4.
In some embodiments of the invention, the first intermediate layer comprises:
One node X_{11} of the first group of nodes is obtained by fusing the first input node IN_1 and the second input node IN_2, and another node X_{12} of the first group of nodes is obtained by fusing the second input node IN_2 and another node X_{22} of the second group of nodes;
one node X_{N-2,1} of the (N-2)-th group of nodes is obtained by fusing a node X_{N-3,1} of the (N-3)-th group of nodes and the (N-1)-th input node IN_{N-1}, and another node X_{N-2,2} of the (N-2)-th group of nodes is obtained by fusing the (N-1)-th input node IN_{N-1} and the N-th input node IN_N.
In some embodiments of the invention, the second intermediate layer comprises:
A node Y_{i1} of the i-th group of nodes is obtained by fusing the i-th input node IN_i, a node X_{i-1,1} of the (i-1)-th group of nodes of the first intermediate layer and a node Y_{i+1,1} of the (i+1)-th group of nodes, and another node Y_{i2} of the i-th group of nodes is obtained by fusing the i-th input node IN_i, another node X_{i-1,2} of the (i-1)-th group of nodes of the first intermediate layer and another node Y_{i-1,2} of the (i-1)-th group of nodes.
In some embodiments of the invention, the second intermediate layer comprises:
One node Y_{11} of the first group of nodes is obtained by fusing the first input node IN_1 and a node Y_{21} of the second group of nodes, and another node Y_{12} of the first group of nodes is obtained by fusing the first input node IN_1 and another node X_{12} of the first group of nodes of the first intermediate layer;
a node Y_{N1} of the N-th group of nodes is obtained by fusing the N-th input node IN_N and a node X_{N-2,1} of the (N-2)-th group of nodes of the first intermediate layer, and another node Y_{N2} of the N-th group of nodes is obtained by fusing the N-th input node IN_N and another node Y_{N-1,2} of the (N-1)-th group of nodes.
In some embodiments of the present invention, when the N images of different sizes are input to a plurality of double-layer bidirectional feature pyramid neural network modules, the output of a preceding double-layer bidirectional feature pyramid neural network module is input to the subsequent double-layer bidirectional feature pyramid neural network module.
According to still another aspect of the present invention, there is also provided an image feature extraction device including:
the first acquisition module is used for acquiring N feature images of different sizes of a target image, wherein N is an integer greater than 2;
the input module is used for taking the N feature images of different sizes as the input of the at least one double-layer bidirectional feature pyramid neural network module;
and the output module is used for acquiring the output of the at least one double-layer bidirectional feature pyramid neural network module as the image features of the target image, so as to perform target detection on the target image.
According to still another aspect of the present invention, there is also provided an electronic apparatus, including: a processor; a storage medium having stored thereon a computer program which, when executed by the processor, performs the steps of the image feature extraction method as described above.
According to yet another aspect of the present invention, there is also provided a storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the image feature extraction method as described above.
Compared with the prior art, the invention has the advantages that:
the method comprises the steps of expanding a characteristic pyramid of a plane in the BiFPN into a three-dimensional characteristic fusion mode, changing a planar characteristic pyramid network into two planes, wherein one plane is from top to bottom and then from bottom to top, the other plane is from bottom to top and then from bottom to top, then fusing the outputs of the two planes, and finally stacking the two planes to form a new pyramid network structure. Therefore, the image feature extraction method and the device can be applied to image processing functions such as target detection and image classification.
Drawings
The above and other features and advantages of the present invention will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings.
Fig. 1 shows a flowchart of an image feature extraction method according to an embodiment of the present invention.
Fig. 2 shows a schematic diagram of a BiFPN module with five input nodes.
FIG. 3 shows a schematic diagram of a two-layer bi-directional feature pyramid neural network module, according to an embodiment of the present invention.
Fig. 4 is a block diagram showing an image feature extraction apparatus according to an embodiment of the present invention.
Fig. 5 schematically illustrates a computer-readable storage medium in an exemplary embodiment of the disclosure.
Fig. 6 schematically illustrates an electronic device in an exemplary embodiment of the disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
In order to overcome the defects of the prior art, the invention provides an image feature extraction method and device, electronic equipment and a storage medium, which improve the smoothness of information flow in the BiFPN so that features of different levels are better fused and their expressive power is improved, thereby improving target detection performance.
Referring first to fig. 1, fig. 1 shows a flowchart of an image feature extraction method according to an embodiment of the present invention. The image feature extraction method comprises the following steps:
step S110: n characteristic images with different sizes of the target image are obtained, wherein N is an integer larger than 2.
Specifically, the target image may be passed through a convolutional neural network so that feature images of different sizes are obtained in sequence. Further, the number of feature images obtained in this way may be equal to or greater than N, in which case the last N feature images may be selected in step S110. Among the N feature images, the later a feature image is obtained, the smaller its size.
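As an illustration of step S110, a backbone in which every stage halves the spatial size, so that later feature images are smaller, might look like the following sketch (PyTorch). The layer widths, the use of plain strided convolutions and the class name are assumptions made for this example, not the backbone prescribed by the patent.

```python
import torch
import torch.nn as nn

class MultiScaleBackbone(nn.Module):
    """Toy backbone: every stage halves the spatial size, so feature images
    obtained later are smaller, matching the ordering described in step S110."""

    def __init__(self, stage_channels=(32, 64, 128, 256, 512), num_keep=5):
        super().__init__()
        stages, c_in = [], 3
        for c_out in stage_channels:
            stages.append(nn.Sequential(
                nn.Conv2d(c_in, c_out, kernel_size=3, stride=2, padding=1),
                nn.BatchNorm2d(c_out),
                nn.ReLU(inplace=True)))
            c_in = c_out
        self.stages = nn.ModuleList(stages)
        self.num_keep = num_keep  # N in the text above, an integer greater than 2

    def forward(self, image):
        feats, x = [], image
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return feats[-self.num_keep:]  # keep the last N (smallest) feature images

# e.g. five feature images at strides 2, 4, 8, 16 and 32 of a 256x256 input
feats = MultiScaleBackbone(num_keep=5)(torch.randn(1, 3, 256, 256))
```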
Step S120: the N feature images of different sizes are used as the input of at least one double-layer bidirectional feature pyramid neural network module.
Specifically, the double-layer bidirectional feature pyramid neural network module includes: N input nodes, a first intermediate layer, a second intermediate layer, and N output nodes. The first intermediate layer comprises N-2 groups of nodes, where one node in each group forms a channel from top to bottom and the other node in each group forms a channel from bottom to top. The second intermediate layer comprises N groups of nodes, where one node in each group forms a channel from bottom to top and the other node in each group forms a channel from top to bottom; each output node is obtained by fusing the corresponding group of nodes of the second intermediate layer. The specific structure of the double-layer bidirectional feature pyramid neural network module provided by the present invention is described below with reference to fig. 3.
Specifically, when the N feature images of different sizes are used as the input of a plurality of double-layer bidirectional feature pyramid neural network modules, the output of the previous double-layer bidirectional feature pyramid neural network module is used as the input of the next one. In this way, a plurality of double-layer bidirectional feature pyramid neural network modules can be cascaded laterally.
Step S130: the output of the at least one double-layer bidirectional feature pyramid neural network module is obtained as the image features of the target image, so as to perform target detection on the target image.
In the image feature extraction method provided by the invention, the planar feature pyramid in the BiFPN is expanded into a three-dimensional feature fusion structure. The planar feature pyramid network is changed into two planes, one going from top to bottom and then from bottom to top, the other going from bottom to top and then from top to bottom; the outputs of the two planes are then fused, and finally the two planes are stacked to form a new pyramid network structure. This avoids the loss of smoothness in information flow and fusion efficiency caused by the planar structure, in which the top-down and bottom-up channels of the feature pyramid in the BiFPN have to be traversed one after the other.
Referring now to fig. 2, fig. 2 shows a schematic diagram of a BiFPN module with five input nodes. Fig. 2 shows five input nodes, three first intermediate nodes, five second intermediate nodes and five output nodes. The first intermediate node X_1 is obtained by fusing the input node In_1 and the input node In_2. The first intermediate node X_2 is obtained by fusing the input node In_3 and the first intermediate node X_1. The first intermediate node X_3 is obtained by fusing the input node In_4 and the first intermediate node X_2. Thus, a channel from top to bottom is formed. The second intermediate node Y_5 is obtained by fusing the input node In_5 and the first intermediate node X_3. The second intermediate node Y_4 is obtained by fusing the input node In_4, the first intermediate node X_3 and the second intermediate node Y_5. The second intermediate node Y_3 is obtained by fusing the input node In_3, the first intermediate node X_2 and the second intermediate node Y_4. The second intermediate node Y_2 is obtained by fusing the input node In_2, the first intermediate node X_1 and the second intermediate node Y_3. The second intermediate node Y_1 is obtained by fusing the input node In_1 and the second intermediate node Y_2. Thus, a channel from bottom to top is formed. The five second intermediate nodes are finally output as the five output nodes. In the planar structure formed by stacking such BiFPN layers, information therefore flows from top to bottom, then from bottom to top, and this pattern keeps repeating; because of the strict precedence between the top-down and bottom-up passes, the planar structure can make information transmission less smooth, which affects target detection performance.
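The wiring of fig. 2 described above can be written out as the following sketch (PyTorch). It is an illustration only: fusion is shown as plain addition followed by a 3x3 convolution with ReLU, whereas EfficientDet's BiFPN additionally uses learned fusion weights and depthwise separable convolutions; the class name, channel count, resize operators and the assumption that In1 is the smallest map with adjacent levels differing in size by a factor of 2 are all choices made for the example. Note how the bottom-up nodes Y5..Y1 can only be computed after the entire top-down chain X1..X3 has finished, which is the precedence problem discussed above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def up(x):    # resize a smaller, higher-level map to the next (2x larger) level
    return F.interpolate(x, scale_factor=2, mode="nearest")

def down(x):  # resize a larger, lower-level map to the next (2x smaller) level
    return F.max_pool2d(x, kernel_size=2)

class BiFPNLayer5(nn.Module):
    """One BiFPN layer with five inputs In1..In5, all with `channels` channels;
    In1 is the top (smallest) level, In5 the bottom (largest)."""

    def __init__(self, channels=64):
        super().__init__()
        def fuse_conv():
            return nn.Sequential(
                nn.Conv2d(channels, channels, kernel_size=3, padding=1),
                nn.ReLU(inplace=True))
        self.fx = nn.ModuleList([fuse_conv() for _ in range(3)])  # X1..X3
        self.fy = nn.ModuleList([fuse_conv() for _ in range(5)])  # Y1..Y5

    def forward(self, ins):
        In1, In2, In3, In4, In5 = ins
        # top-down channel
        X1 = self.fx[0](In2 + up(In1))
        X2 = self.fx[1](In3 + up(X1))
        X3 = self.fx[2](In4 + up(X2))
        # bottom-up channel -- it can only start after the top-down channel is done
        Y5 = self.fy[4](In5 + up(X3))
        Y4 = self.fy[3](In4 + X3 + down(Y5))
        Y3 = self.fy[2](In3 + X2 + down(Y4))
        Y2 = self.fy[1](In2 + X1 + down(Y3))
        Y1 = self.fy[0](In1 + down(Y2))
        return [Y1, Y2, Y3, Y4, Y5]  # the five output nodes
```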
Referring now to fig. 3, fig. 3 illustrates a schematic diagram of a two-layer bi-directional feature pyramid neural network module, according to an embodiment of the present invention.
In the present invention, in the first intermediate layer, a node X_{i1} of the i-th group of nodes is obtained by fusing a node X_{i-1,1} of the (i-1)-th group of nodes with the (i+1)-th input node IN_{i+1}, and another node X_{i2} of the i-th group of nodes is obtained by fusing another node X_{i+1,2} of the (i+1)-th group of nodes with the (i+1)-th input node IN_{i+1}, where i is an integer greater than 1 and less than N-2, and N is an integer greater than 4. Meanwhile, in the first intermediate layer: one node X_{11} of the first group of nodes is obtained by fusing the first input node IN_1 and the second input node IN_2, and another node X_{12} of the first group of nodes is obtained by fusing the second input node IN_2 and another node X_{22} of the second group of nodes; one node X_{N-2,1} of the (N-2)-th group of nodes is obtained by fusing a node X_{N-3,1} of the (N-3)-th group of nodes and the (N-1)-th input node IN_{N-1}, and another node X_{N-2,2} of the (N-2)-th group of nodes is obtained by fusing the (N-1)-th input node IN_{N-1} and the N-th input node IN_N. In the present invention, in the second intermediate layer, a node Y_{i1} of the i-th group of nodes is obtained by fusing the i-th input node IN_i, a node X_{i-1,1} of the (i-1)-th group of nodes of the first intermediate layer and a node Y_{i+1,1} of the (i+1)-th group of nodes, and another node Y_{i2} of the i-th group of nodes is obtained by fusing the i-th input node IN_i, another node X_{i-1,2} of the (i-1)-th group of nodes of the first intermediate layer and another node Y_{i-1,2} of the (i-1)-th group of nodes. Meanwhile, in the second intermediate layer: one node Y_{11} of the first group of nodes is obtained by fusing the first input node IN_1 and a node Y_{21} of the second group of nodes, and another node Y_{12} of the first group of nodes is obtained by fusing the first input node IN_1 and another node X_{12} of the first group of nodes of the first intermediate layer; a node Y_{N1} of the N-th group of nodes is obtained by fusing the N-th input node IN_N and a node X_{N-2,1} of the (N-2)-th group of nodes of the first intermediate layer, and another node Y_{N2} of the N-th group of nodes is obtained by fusing the N-th input node IN_N and another node Y_{N-1,2} of the (N-1)-th group of nodes. The structure of the double-layer bidirectional feature pyramid neural network module described above is shown in fig. 3.
In various embodiments of the present invention, the double-layer bidirectional feature pyramid neural network module may have 3 inputs, 4 inputs, 5 inputs or another number of inputs; the present invention is not limited in this respect. On a downward channel the incoming node is upsampled, on an upward channel the incoming node is downsampled, and an input at the same level needs no sampling operation. The inputs of a node are first added together, and information fusion is then performed by a convolutional layer with activation that does not change the height and depth of the feature map.
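Putting the node wiring of fig. 3 and the fusion rule just described together, one possible sketch of the double-layer bidirectional feature pyramid module is given below (PyTorch). It is a sketch under stated assumptions rather than the patent's reference implementation: fusion is modeled as addition followed by a 3x3 convolution with ReLU, resizing uses nearest-neighbour interpolation and max pooling, IN_1 is taken to be the smallest map with adjacent levels differing in size by a factor of 2, and the class and helper names are invented here. The general rule for X_{i2} is read as fusing the node of the (i+1)-th group, which matches the boundary cases X_{12} and X_{N-2,2} described above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def up(x):    # resize a smaller (higher-level) map up by 2x, for downward channels
    return F.interpolate(x, scale_factor=2, mode="nearest")

def down(x):  # resize a larger (lower-level) map down by 2x, for upward channels
    return F.max_pool2d(x, kernel_size=2)

class Fuse(nn.Module):
    """Add the already-resized inputs, then apply a 3x3 convolution with an
    activation; the convolution keeps the feature-map size unchanged."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True))

    def forward(self, *xs):
        return self.conv(sum(xs))

class DualBiFPN(nn.Module):
    """N inputs IN_1..IN_N (IN_1 = top, smallest map; adjacent sizes differ by 2x).
    First intermediate layer: N-2 groups, X_{i1} on a top-down channel and
    X_{i2} on a bottom-up channel; group i sits at the level of IN_{i+1}.
    Second intermediate layer: N groups, Y_{i1} on a bottom-up channel and
    Y_{i2} on a top-down channel; group i sits at the level of IN_i.
    Output node i fuses Y_{i1} and Y_{i2}."""

    def __init__(self, num_levels, channels=64):
        super().__init__()
        assert num_levels > 2
        self.N = num_levels
        self.fx1 = nn.ModuleList([Fuse(channels) for _ in range(num_levels - 2)])
        self.fx2 = nn.ModuleList([Fuse(channels) for _ in range(num_levels - 2)])
        self.fy1 = nn.ModuleList([Fuse(channels) for _ in range(num_levels)])
        self.fy2 = nn.ModuleList([Fuse(channels) for _ in range(num_levels)])
        self.fout = nn.ModuleList([Fuse(channels) for _ in range(num_levels)])

    def forward(self, ins):                      # ins[i-1] corresponds to IN_i
        N = self.N
        x1, x2 = [None] * (N - 2), [None] * (N - 2)
        # first intermediate layer, top-down channel: X_11, X_21, ..., X_{N-2,1}
        x1[0] = self.fx1[0](up(ins[0]), ins[1])
        for j in range(1, N - 2):
            x1[j] = self.fx1[j](up(x1[j - 1]), ins[j + 1])
        # first intermediate layer, bottom-up channel: X_{N-2,2}, ..., X_12
        x2[N - 3] = self.fx2[N - 3](ins[N - 2], down(ins[N - 1]))
        for j in range(N - 4, -1, -1):
            x2[j] = self.fx2[j](ins[j + 1], down(x2[j + 1]))
        # second intermediate layer, bottom-up channel: Y_N1, ..., Y_11
        y1, y2 = [None] * N, [None] * N
        y1[N - 1] = self.fy1[N - 1](ins[N - 1], up(x1[N - 3]))
        for j in range(N - 2, 0, -1):
            y1[j] = self.fy1[j](ins[j], x1[j - 1], down(y1[j + 1]))
        y1[0] = self.fy1[0](ins[0], down(y1[1]))
        # second intermediate layer, top-down channel: Y_12, ..., Y_N2
        y2[0] = self.fy2[0](ins[0], down(x2[0]))
        for j in range(1, N - 1):
            y2[j] = self.fy2[j](ins[j], x2[j - 1], up(y2[j - 1]))
        y2[N - 1] = self.fy2[N - 1](ins[N - 1], up(y2[N - 2]))
        # output nodes: fuse the two nodes of each second-layer group
        return [self.fout[j](y1[j], y2[j]) for j in range(N)]

# cascading modules (step S120, "at least one" module): the output list of one
# module is simply fed to the next
m1, m2 = DualBiFPN(5), DualBiFPN(5)
feats = [torch.randn(1, 64, 2 ** (3 + i), 2 ** (3 + i)) for i in range(5)]  # IN_1..IN_5
outs = m2(m1(feats))
```

Cascading a plurality of such modules, as mentioned above, thus reduces to chaining their calls, as the last lines of the sketch show.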
The foregoing is merely an exemplary description of various implementations of the invention and is not intended to be limiting thereof.
The invention also provides an image feature extraction device, and fig. 4 is a schematic diagram of the image feature extraction device according to the embodiment of the invention. The image feature extraction apparatus 200 includes a first obtaining module 210, an input module 220, and an output module 230.
The first obtaining module 210 is configured to obtain N feature images of different sizes of a target image, where N is an integer greater than 2.
The input module 220 is configured to take the N feature images of different sizes as the input of the at least one double-layer bidirectional feature pyramid neural network module.
The output module 230 is configured to obtain the output of the at least one double-layer bidirectional feature pyramid neural network module as the image features of the target image, so as to perform target detection on the target image.
In the image feature extraction device provided by the invention, the planar feature pyramid in the BiFPN is expanded into a three-dimensional feature fusion structure: the planar feature pyramid network is changed into two planes, one going from top to bottom and then from bottom to top, the other going from bottom to top and then from top to bottom; the outputs of the two planes are then fused, and finally the two planes are stacked to form a new pyramid network structure.
Fig. 4 is only a schematic diagram illustrating the image feature extraction apparatus provided by the present invention, and the splitting, combining, and adding of modules are within the scope of the present invention without departing from the concept of the present invention. The image feature extraction device provided by the present invention can be implemented by software, hardware, firmware, plug-in, and any combination thereof, and the present invention is not limited thereto.
In an exemplary embodiment of the present disclosure, a computer-readable storage medium is also provided, on which a computer program is stored, which when executed by, for example, a processor, may implement the steps of the image feature extraction method described in any one of the above embodiments. In some possible embodiments, aspects of the present invention may also be implemented in the form of a program product comprising program code means for causing a terminal device to carry out the steps according to various exemplary embodiments of the present invention as described in the image feature extraction method section above of this specification, when said program product is run on the terminal device.
Referring to fig. 5, a program product 400 for implementing the above method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited in this regard and, in the present document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic signals, optical signals, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the tenant computing device, partly on the tenant device, as a stand-alone software package, partly on the tenant computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing devices may be connected to the tenant computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
In an exemplary embodiment of the present disclosure, there is also provided an electronic device, which may include a processor, and a memory for storing executable instructions of the processor. Wherein the processor is configured to perform the steps of the image feature extraction method in any of the above embodiments via execution of the executable instructions.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or program product. Thus, various aspects of the invention may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects that may all generally be referred to herein as a "circuit", "module" or "system".
An electronic device 600 according to this embodiment of the invention is described below with reference to fig. 6. The electronic device 600 shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 6, the electronic device 600 is embodied in the form of a general purpose computing device. The components of the electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one storage unit 620, a bus 630 that connects the various system components (including the storage unit 620 and the processing unit 610), a display unit 640, and the like.
Wherein the storage unit stores program code executable by the processing unit 610 to cause the processing unit 610 to perform steps according to various exemplary embodiments of the present invention described in the image feature extraction method section above in this specification. For example, the processing unit 610 may perform the steps as shown in fig. 1 to 2.
The storage unit 620 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM)6201 and/or a cache memory unit 6202, and may further include a read-only memory unit (ROM) 6203.
The memory unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, such program modules 6205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
The electronic device 600 may also communicate with one or more external devices 700 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a tenant to interact with the electronic device 600, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 600 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 650. Also, the electronic device 600 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 660. The network adapter 660 may communicate with other modules of the electronic device 600 via the bus 630. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, or a network device, etc.) to execute the above-mentioned image feature extraction method according to the embodiments of the present disclosure.
Compared with the prior art, the invention has the advantages that:
the method comprises the steps of expanding a characteristic pyramid of a plane in the BiFPN into a three-dimensional characteristic fusion mode, changing a planar characteristic pyramid network into two planes, wherein one plane is from top to bottom and then from bottom to top, the other plane is from bottom to top and then from bottom to top, then fusing the outputs of the two planes, and finally stacking the two planes to form a new pyramid network structure.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
Claims (10)
1. An image feature extraction method, characterized by comprising:
obtaining N characteristic images with different sizes of a target image, wherein N is an integer larger than 2;
taking N images with different sizes as the input of at least one double-layer bidirectional characteristic pyramid neural network module;
and acquiring the output of the at least one double-layer bidirectional characteristic pyramid neural network module as the image characteristic of the target image so as to perform target detection on the target image.
2. The image feature extraction method of claim 1, wherein the two-layer bidirectional feature pyramid neural network module comprises:
N input nodes;
the first intermediate layer comprises N-2 groups of nodes, wherein one node in each group of nodes of the first intermediate layer forms a channel from top to bottom, and the other node in each group of nodes of the first intermediate layer forms a channel from bottom to top;
the second intermediate layer comprises N groups of nodes, wherein one node in each group of nodes of the second intermediate layer forms a channel from bottom to top, and the other node in each group of nodes of the second intermediate layer forms a channel from top to bottom;
and N output nodes, each of which is obtained by fusing the corresponding group of nodes of the second intermediate layer.
3. The image feature extraction method according to claim 2, wherein, in the first intermediate layer:
A node X_{i1} of the i-th group of nodes is obtained by fusing a node X_{i-1,1} of the (i-1)-th group of nodes with the (i+1)-th input node IN_{i+1}, and another node X_{i2} of the i-th group of nodes is obtained by fusing another node X_{i+1,2} of the (i+1)-th group of nodes with the (i+1)-th input node IN_{i+1}, wherein i is an integer greater than 1 and less than N-2, and N is an integer greater than 4.
4. The image feature extraction method according to claim 3, wherein, in the first intermediate layer:
One node X_{11} of the first group of nodes is obtained by fusing the first input node IN_1 and the second input node IN_2, and another node X_{12} of the first group of nodes is obtained by fusing the second input node IN_2 and another node X_{22} of the second group of nodes;
one node X_{N-2,1} of the (N-2)-th group of nodes is obtained by fusing a node X_{N-3,1} of the (N-3)-th group of nodes and the (N-1)-th input node IN_{N-1}, and another node X_{N-2,2} of the (N-2)-th group of nodes is obtained by fusing the (N-1)-th input node IN_{N-1} and the N-th input node IN_N.
5. The image feature extraction method according to claim 4, wherein, in the second intermediate layer:
A node Y_{i1} of the i-th group of nodes is obtained by fusing the i-th input node IN_i, a node X_{i-1,1} of the (i-1)-th group of nodes of the first intermediate layer and a node Y_{i+1,1} of the (i+1)-th group of nodes, and another node Y_{i2} of the i-th group of nodes is obtained by fusing the i-th input node IN_i, another node X_{i-1,2} of the (i-1)-th group of nodes of the first intermediate layer and another node Y_{i-1,2} of the (i-1)-th group of nodes.
6. The image feature extraction method according to claim 5, wherein, in the second intermediate layer:
One node Y_{11} of the first group of nodes is obtained by fusing the first input node IN_1 and a node Y_{21} of the second group of nodes, and another node Y_{12} of the first group of nodes is obtained by fusing the first input node IN_1 and another node X_{12} of the first group of nodes of the first intermediate layer;
a node Y_{N1} of the N-th group of nodes is obtained by fusing the N-th input node IN_N and a node X_{N-2,1} of the (N-2)-th group of nodes of the first intermediate layer, and another node Y_{N2} of the N-th group of nodes is obtained by fusing the N-th input node IN_N and another node Y_{N-1,2} of the (N-1)-th group of nodes.
7. The image feature extraction method of any one of claims 1 to 6, wherein, when the N images of different sizes are used as the input of a plurality of two-level bidirectional feature pyramid neural network modules, the output of a preceding two-level bidirectional feature pyramid neural network module is used as the input of the next two-level bidirectional feature pyramid neural network module.
8. An image feature extraction device characterized by comprising:
the first acquisition module is used for acquiring N characteristic images with different sizes of a target image, wherein N is an integer greater than 2;
the input module is used for taking the N images with different sizes as the input of the at least one double-layer bidirectional characteristic pyramid neural network module;
and the output module is used for acquiring the output of the at least one double-layer bidirectional characteristic pyramid neural network module as the image characteristic of the target image so as to perform target detection on the target image.
9. An electronic device, characterized in that the electronic device comprises:
a processor;
a storage medium having stored thereon a computer program which, when executed by the processor, performs the image feature extraction method according to any one of claims 1 to 7.
10. A storage medium having stored thereon a computer program which, when executed by a processor, performs the image feature extraction method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011390511.0A CN112434713A (en) | 2020-12-02 | 2020-12-02 | Image feature extraction method and device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011390511.0A CN112434713A (en) | 2020-12-02 | 2020-12-02 | Image feature extraction method and device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112434713A true CN112434713A (en) | 2021-03-02 |
Family
ID=74698908
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011390511.0A Pending CN112434713A (en) | 2020-12-02 | 2020-12-02 | Image feature extraction method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112434713A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113313668A (en) * | 2021-04-19 | 2021-08-27 | 石家庄铁道大学 | Subway tunnel surface disease feature extraction method |
CN113361375A (en) * | 2021-06-02 | 2021-09-07 | 武汉理工大学 | Vehicle target identification method based on improved BiFPN |
CN114972713A (en) * | 2022-04-29 | 2022-08-30 | 北京开拓鸿业高科技有限公司 | Area positioning method, device, storage medium and electronic equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108986137A (en) * | 2017-11-30 | 2018-12-11 | 成都通甲优博科技有限责任公司 | Human body tracing method, device and equipment |
CN109614876A (en) * | 2018-11-16 | 2019-04-12 | 北京市商汤科技开发有限公司 | Critical point detection method and device, electronic equipment and storage medium |
CN111291739A (en) * | 2020-05-09 | 2020-06-16 | 腾讯科技(深圳)有限公司 | Face detection and image detection neural network training method, device and equipment |
CN111914937A (en) * | 2020-08-05 | 2020-11-10 | 湖北工业大学 | Lightweight improved target detection method and detection system |
CN111967538A (en) * | 2020-09-25 | 2020-11-20 | 北京百度网讯科技有限公司 | Feature fusion method, device and equipment applied to small target detection and storage medium |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108986137A (en) * | 2017-11-30 | 2018-12-11 | 成都通甲优博科技有限责任公司 | Human body tracing method, device and equipment |
CN109614876A (en) * | 2018-11-16 | 2019-04-12 | 北京市商汤科技开发有限公司 | Critical point detection method and device, electronic equipment and storage medium |
US20200250462A1 (en) * | 2018-11-16 | 2020-08-06 | Beijing Sensetime Technology Development Co., Ltd. | Key point detection method and apparatus, and storage medium |
CN111291739A (en) * | 2020-05-09 | 2020-06-16 | 腾讯科技(深圳)有限公司 | Face detection and image detection neural network training method, device and equipment |
CN111914937A (en) * | 2020-08-05 | 2020-11-10 | 湖北工业大学 | Lightweight improved target detection method and detection system |
CN111967538A (en) * | 2020-09-25 | 2020-11-20 | 北京百度网讯科技有限公司 | Feature fusion method, device and equipment applied to small target detection and storage medium |
Non-Patent Citations (1)
Title |
---|
徐成琪, 洪学海: "Feature Pyramid Object Detection Network Based on Function Preservation" (基于功能保持的特征金字塔目标检测网络), 模式识别与人工智能 (Pattern Recognition and Artificial Intelligence), 30 June 2020 (2020-06-30), pages 507-516 *
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113313668A (en) * | 2021-04-19 | 2021-08-27 | 石家庄铁道大学 | Subway tunnel surface disease feature extraction method |
CN113361375A (en) * | 2021-06-02 | 2021-09-07 | 武汉理工大学 | Vehicle target identification method based on improved BiFPN |
CN113361375B (en) * | 2021-06-02 | 2022-06-07 | 武汉理工大学 | Vehicle target identification method based on improved BiFPN |
CN114972713A (en) * | 2022-04-29 | 2022-08-30 | 北京开拓鸿业高科技有限公司 | Area positioning method, device, storage medium and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112434713A (en) | Image feature extraction method and device, electronic equipment and storage medium | |
KR20210042864A (en) | Table recognition method, device, equipment, medium and computer program | |
US7024419B1 (en) | Network visualization tool utilizing iterative rearrangement of nodes on a grid lattice using gradient method | |
CN108629414A (en) | depth hash learning method and device | |
KR102553763B1 (en) | Video event recognition method and device, electronic equipment and storage medium | |
CN116822452B (en) | Chip layout optimization method and related equipment | |
CN110569972A (en) | search space construction method and device of hyper network and electronic equipment | |
CN111312223B (en) | Training method and device of voice segmentation model and electronic equipment | |
JP2021128779A (en) | Method, device, apparatus, and storage medium for expanding data | |
CN112906865A (en) | Neural network architecture searching method and device, electronic equipment and storage medium | |
WO2024001653A1 (en) | Feature extraction method and apparatus, storage medium, and electronic device | |
US11645323B2 (en) | Coarse-to-fine multimodal gallery search system with attention-based neural network models | |
CN111312224B (en) | Training method and device of voice segmentation model and electronic equipment | |
CN116186330B (en) | Video deduplication method and device based on multi-mode learning | |
US9886652B2 (en) | Computerized correspondence estimation using distinctively matched patches | |
CN115965074A (en) | Training method of deep learning model, data processing method, device and equipment | |
CN115186738A (en) | Model training method, device and storage medium | |
CN111582456B (en) | Method, apparatus, device and medium for generating network model information | |
US9465905B1 (en) | Structure for static random access memory | |
CN114417856A (en) | Text sparse coding method and device and electronic equipment | |
CN110378378A (en) | Fact retrieval method, apparatus, computer equipment and storage medium | |
JP4391464B2 (en) | Device for storing binary tree structure information and device for storing heap structure information | |
CN116227391B (en) | Fault-tolerant Josephson junction array, dynamic ternary design method and equipment | |
CN114372238B (en) | Distributed state estimation method | |
CN116541421B (en) | Address query information generation method and device, electronic equipment and computer medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |