CN112434713A - Image feature extraction method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN112434713A
CN112434713A
Authority
CN
China
Prior art keywords
node
nodes
group
input
feature extraction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011390511.0A
Other languages
Chinese (zh)
Inventor
沈涛
罗超
胡泓
李巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ctrip Computer Technology Shanghai Co Ltd
Original Assignee
Ctrip Computer Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ctrip Computer Technology Shanghai Co Ltd filed Critical Ctrip Computer Technology Shanghai Co Ltd
Priority to CN202011390511.0A priority Critical patent/CN112434713A/en
Publication of CN112434713A publication Critical patent/CN112434713A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/40: Extraction of image or video features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/25: Fusion techniques
    • G06F 18/253: Fusion techniques of extracted features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00: Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07: Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an image feature extraction method and device, an electronic device, and a storage medium. The method comprises the following steps: obtaining N feature images of different sizes from a target image, where N is an integer greater than 2; taking the N feature images of different sizes as the input of at least one double-layer bidirectional feature pyramid neural network module; and obtaining the output of the at least one double-layer bidirectional feature pyramid neural network module as the image features of the target image, so as to perform target detection on the target image. The method and device improve the smoothness of information flow in BiFPN, so that features of different levels are fused better and their expressive power is improved, thereby improving the performance of image processing tasks such as target detection and image classification.

Description

Image feature extraction method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of computer applications, and in particular to an image feature extraction method and device, an electronic device, and a storage medium.
Background
The feature pyramid is a basic component of recognition systems that detect objects at different scales. The inherent multi-scale pyramid hierarchy of deep convolutional networks can be exploited to construct feature pyramids at marginal extra cost. A top-down architecture with lateral connections has been designed to build high-level semantic feature maps at all scales; this architecture, called the Feature Pyramid Network (FPN), shows significant improvement as a generic feature extractor in several applications.
The feature pyramid network currently has several variant structures, such as the Path Aggregation Network (PANet), simplified PANet, NAS-FPN, and BiFPN. BiFPN is the neural network module in EfficientDet (a family of target detection algorithms) that fuses features of different levels. Drawing on the structure of PANet, BiFPN adds a bottom-up feature channel, which shortens the transmission path of information and improves the effect of feature fusion. BiFPN also stacks multiple feature pyramids, further improving the fusion effect.
However, the planar structure formed by stacking BiFPNs runs top-down, then bottom-up, and then repeats. Because the top-down and bottom-up passes must run in sequence, this planar structure can impede the transmission of information.
Therefore, how to improve the smoothness of information flow in BiFPN, so as to better fuse features of different levels, improve their expressive power, and thus improve target detection performance, is a technical problem to be solved by those skilled in the art.
Disclosure of Invention
To overcome the above deficiencies in the prior art, the present invention provides an image feature extraction method and device, an electronic device, and a storage medium that improve the smoothness of information flow in BiFPN, fuse features of different levels better, and improve their expressive power, thereby improving target detection performance.
According to one aspect of the present invention, there is provided an image feature extraction method comprising:
obtaining N feature images of different sizes from a target image, where N is an integer greater than 2;
taking the N feature images of different sizes as the input of at least one double-layer bidirectional feature pyramid neural network module;
and obtaining the output of the at least one double-layer bidirectional feature pyramid neural network module as the image features of the target image, so as to perform target detection on the target image.
In some embodiments of the invention, the double-layer bidirectional feature pyramid neural network module comprises:
N input nodes;
a first intermediate layer comprising N-2 groups of nodes, where one node in each group of the first intermediate layer forms a top-down channel and the other node in each group forms a bottom-up channel;
a second intermediate layer comprising N groups of nodes, where one node in each group of the second intermediate layer forms a bottom-up channel and the other node in each group forms a top-down channel;
and N output nodes, each obtained by fusing the corresponding group of nodes of the second intermediate layer.
In some embodiments of the invention, the first intermediate layer comprises:
node X_{i,1} of the i-th group, obtained by fusing node X_{i-1,1} of the (i-1)-th group with the (i+1)-th input node IN_{i+1}; and node X_{i,2} of the i-th group, obtained by fusing node X_{i+1,2} of the (i+1)-th group with the (i+1)-th input node IN_{i+1}, where i is an integer greater than 1 and less than N-2, and N is an integer greater than 4.
In some embodiments of the invention, the first intermediate layer comprises:
node X_{1,1} of the first group, obtained by fusing the first input node IN_1 with the second input node IN_2; and node X_{1,2} of the first group, obtained by fusing the second input node IN_2 with node X_{2,2} of the second group;
node X_{N-2,1} of the (N-2)-th group, obtained by fusing node X_{N-3,1} of the (N-3)-th group with the (N-1)-th input node IN_{N-1}; and node X_{N-2,2} of the (N-2)-th group, obtained by fusing the (N-1)-th input node IN_{N-1} with the N-th input node IN_N.
In some embodiments of the invention, the second intermediate layer comprises:
node Y_{i,1} of the i-th group, obtained by fusing the i-th input node IN_i, node X_{i-1,1} of the (i-1)-th group of the first intermediate layer, and node Y_{i+1,1} of the (i+1)-th group; and node Y_{i,2} of the i-th group, obtained by fusing the i-th input node IN_i, node X_{i-1,2} of the (i-1)-th group of the first intermediate layer, and node Y_{i-1,2} of the (i-1)-th group, where i is an integer greater than 1 and less than N.
In some embodiments of the invention, the second intermediate layer comprises:
node Y_{1,1} of the first group, obtained by fusing the first input node IN_1 with node Y_{2,1} of the second group; and node Y_{1,2} of the first group, obtained by fusing the first input node IN_1 with node X_{1,2} of the first group of the first intermediate layer;
node Y_{N,1} of the N-th group, obtained by fusing the N-th input node IN_N with node X_{N-2,1} of the (N-2)-th group of the first intermediate layer; and node Y_{N,2} of the N-th group, obtained by fusing the N-th input node IN_N with node Y_{N-1,2} of the (N-1)-th group.
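The fusion relationships above amount to index bookkeeping. The following plain-Python sketch (hypothetical helper names; the fusion operation itself is elided) returns, for each intermediate node, the list of nodes that feed it, including the boundary cases:

```python
def first_layer_deps(N):
    """Return {('X', i, k): [feeding nodes]} for the first intermediate layer (groups 1..N-2)."""
    deps = {}
    for i in range(1, N - 1):
        if i == 1:
            deps[('X', 1, 1)] = [('IN', 1), ('IN', 2)]
            deps[('X', 1, 2)] = [('IN', 2), ('X', 2, 2)]
        elif i == N - 2:
            deps[('X', i, 1)] = [('X', i - 1, 1), ('IN', i + 1)]
            deps[('X', i, 2)] = [('IN', N - 1), ('IN', N)]
        else:
            deps[('X', i, 1)] = [('X', i - 1, 1), ('IN', i + 1)]  # top-down channel
            deps[('X', i, 2)] = [('X', i + 1, 2), ('IN', i + 1)]  # bottom-up channel
    return deps

def second_layer_deps(N):
    """Return {('Y', i, k): [feeding nodes]} for the second intermediate layer (groups 1..N)."""
    deps = {}
    for i in range(1, N + 1):
        if i == 1:
            deps[('Y', 1, 1)] = [('IN', 1), ('Y', 2, 1)]
            deps[('Y', 1, 2)] = [('IN', 1), ('X', 1, 2)]
        elif i == N:
            deps[('Y', N, 1)] = [('IN', N), ('X', N - 2, 1)]
            deps[('Y', N, 2)] = [('IN', N), ('Y', N - 1, 2)]
        else:
            deps[('Y', i, 1)] = [('IN', i), ('X', i - 1, 1), ('Y', i + 1, 1)]  # bottom-up channel
            deps[('Y', i, 2)] = [('IN', i), ('X', i - 1, 2), ('Y', i - 1, 2)]  # top-down channel
    return deps
```

For example, with N = 5 the first intermediate layer has three groups, and group 2 is the only one governed by the general rule.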
In some embodiments of the present invention, when the N feature images of different sizes are input to a plurality of double-layer bidirectional feature pyramid neural network modules, the output of each module serves as the input of the next module.
According to still another aspect of the present invention, there is also provided an image feature extraction device including:
a first acquisition module, configured to obtain N feature images of different sizes from a target image, where N is an integer greater than 2;
an input module, configured to take the N feature images of different sizes as the input of at least one double-layer bidirectional feature pyramid neural network module;
and an output module, configured to obtain the output of the at least one double-layer bidirectional feature pyramid neural network module as the image features of the target image, so as to perform target detection on the target image.
According to still another aspect of the present invention, there is also provided an electronic apparatus, including: a processor; a storage medium having stored thereon a computer program which, when executed by the processor, performs the steps of the image feature extraction method as described above.
According to yet another aspect of the present invention, there is also provided a storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the image feature extraction method as described above.
Compared with the prior art, the invention has the following advantages:
The planar feature pyramid in BiFPN is expanded into a three-dimensional feature fusion scheme: the planar feature pyramid network is turned into two planes, one running top-down and then bottom-up, the other running bottom-up and then top-down; the outputs of the two planes are then fused, and finally the resulting modules are stacked to form a new pyramid network structure. The image feature extraction method and device can therefore be applied to image processing tasks such as target detection and image classification.
Drawings
The above and other features and advantages of the present invention will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings.
Fig. 1 shows a flowchart of an image feature extraction method according to an embodiment of the present invention.
Fig. 2 shows a schematic diagram of a BiFPN module with five input nodes.
FIG. 3 shows a schematic diagram of a two-layer bi-directional feature pyramid neural network module, according to an embodiment of the present invention.
Fig. 4 is a block diagram showing an image feature extraction apparatus according to an embodiment of the present invention.
Fig. 5 schematically illustrates a computer-readable storage medium in an exemplary embodiment of the disclosure.
Fig. 6 schematically illustrates an electronic device in an exemplary embodiment of the disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
To overcome the deficiencies of the prior art, the present invention provides an image feature extraction method and device, an electronic device, and a storage medium that improve the smoothness of information flow in BiFPN, fuse features of different levels better, and improve their expressive power, thereby improving target detection performance.
Referring first to fig. 1, fig. 1 shows a flowchart of an image feature extraction method according to an embodiment of the present invention. The image feature extraction method comprises the following steps:
step S110: n characteristic images with different sizes of the target image are obtained, wherein N is an integer larger than 2.
Specifically, the target image may be passed through a convolutional neural network to obtain feature images of different sizes in succession. The number of feature images obtained may be greater than or equal to N, in which case the last N feature images may be selected in step S110. Among the N feature images, those obtained later are smaller.
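As an illustration only (the patent's feature images come from intermediate activations of a convolutional backbone), the decreasing-size series can be mimicked with repeated 2x2 average pooling; the `multiscale_features` helper name is hypothetical:

```python
import numpy as np

def multiscale_features(image, n):
    """Sketch: derive n feature maps of decreasing size by repeated 2x2
    average pooling, standing in for a convolutional backbone."""
    feats = [image]
    for _ in range(n - 1):
        h, w, c = feats[-1].shape
        f = feats[-1][:h - h % 2, :w - w % 2, :]     # crop to even size
        f = f.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))
        feats.append(f)
    return feats
```

Each successive feature map is half the spatial size of the previous one, matching the statement that feature images obtained later are smaller.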
Step S120: take the N feature images of different sizes as the input of at least one double-layer bidirectional feature pyramid neural network module.
Specifically, the double-layer bidirectional feature pyramid neural network module includes N input nodes, a first intermediate layer, a second intermediate layer, and N output nodes. The first intermediate layer comprises N-2 groups of nodes, where one node in each group forms a top-down channel and the other node forms a bottom-up channel. The second intermediate layer comprises N groups of nodes, where one node in each group forms a bottom-up channel and the other node forms a top-down channel; each output node is obtained by fusing the corresponding group of nodes of the second intermediate layer. The specific structure of the double-layer bidirectional feature pyramid neural network module is described below with reference to fig. 3.
Specifically, when the N feature images of different sizes are used as the input of a plurality of double-layer bidirectional feature pyramid neural network modules, the output of each module serves as the input of the next module. In this way, a plurality of double-layer bidirectional feature pyramid neural network modules can be cascaded transversely.
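The transverse cascading described above is simple function composition. A minimal sketch (hypothetical `cascade` helper; each module is any callable mapping N feature maps to N feature maps):

```python
def cascade(modules, features):
    """Feed N feature maps through a chain of pyramid modules:
    the output of one module becomes the input of the next."""
    for module in modules:
        features = module(features)
    return features
```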
Step S130: obtain the output of the at least one double-layer bidirectional feature pyramid neural network module as the image features of the target image, so as to perform target detection on the target image.
In the image feature extraction method provided by the invention, the planar feature pyramid in BiFPN is expanded into a three-dimensional feature fusion scheme: the planar feature pyramid network is turned into two planes, one running top-down and then bottom-up, the other running bottom-up and then top-down; the outputs of the two planes are then fused, and finally the resulting modules are stacked to form a new pyramid network structure. This avoids the problem that, in BiFPN, the sequential stacking of top-down and bottom-up channels forms a planar structure that impedes information flow and fusion efficiency.
Referring now to fig. 2, fig. 2 shows a schematic diagram of a BiFPN module with five input nodes: five input nodes, three first intermediate nodes, five second intermediate nodes, and five output nodes. First intermediate node X_1 is obtained by fusing input nodes In_1 and In_2; first intermediate node X_2 by fusing input node In_3 with X_1; and first intermediate node X_3 by fusing input node In_4 with X_2. These form a top-down channel. Second intermediate node Y_5 is obtained by fusing input node In_5 with X_3; Y_4 by fusing In_4, X_3, and Y_5; Y_3 by fusing In_3, X_2, and Y_4; Y_2 by fusing In_2, X_1, and Y_3; and Y_1 by fusing In_1 with Y_2. These form a bottom-up channel. The five second intermediate nodes are finally emitted as the five output nodes. The planar structure formed by stacking such BiFPNs therefore runs top-down, then bottom-up, and then repeats; because the top-down and bottom-up passes must run in sequence, this planar structure can impede information transmission and degrade target detection performance.
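The sequential dependency in fig. 2 can be made explicit with a scalar sketch, where a hypothetical averaging `fuse` stands in for the learned weighted fusion and resampling of a real BiFPN; note that no node of the bottom-up channel can be computed before the top-down channel is finished:

```python
def bifpn_five(inputs, fuse=lambda *xs: sum(xs) / len(xs)):
    """Scalar sketch of the five-input BiFPN of fig. 2; `fuse` is a
    hypothetical averaging stand-in for resampling + weighted fusion."""
    in1, in2, in3, in4, in5 = inputs
    # top-down channel
    x1 = fuse(in1, in2)
    x2 = fuse(in3, x1)
    x3 = fuse(in4, x2)
    # bottom-up channel (can only start once the top-down pass is done)
    y5 = fuse(in5, x3)
    y4 = fuse(in4, x3, y5)
    y3 = fuse(in3, x2, y4)
    y2 = fuse(in2, x1, y3)
    y1 = fuse(in1, y2)
    return [y1, y2, y3, y4, y5]
```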
Referring now to fig. 3, fig. 3 illustrates a schematic diagram of a two-layer bi-directional feature pyramid neural network module, according to an embodiment of the present invention.
In the present invention, in the first intermediate layer, node X_{i,1} of the i-th group is obtained by fusing node X_{i-1,1} of the (i-1)-th group with the (i+1)-th input node IN_{i+1}, and node X_{i,2} of the i-th group is obtained by fusing node X_{i+1,2} of the (i+1)-th group with the (i+1)-th input node IN_{i+1}, where i is an integer greater than 1 and less than N-2, and N is an integer greater than 4. At the boundaries of the first intermediate layer: node X_{1,1} of the first group is obtained by fusing the first input node IN_1 with the second input node IN_2, and node X_{1,2} of the first group is obtained by fusing the second input node IN_2 with node X_{2,2} of the second group; node X_{N-2,1} of the (N-2)-th group is obtained by fusing node X_{N-3,1} of the (N-3)-th group with the (N-1)-th input node IN_{N-1}, and node X_{N-2,2} of the (N-2)-th group is obtained by fusing the (N-1)-th input node IN_{N-1} with the N-th input node IN_N. In the second intermediate layer, node Y_{i,1} of the i-th group is obtained by fusing the i-th input node IN_i, node X_{i-1,1} of the (i-1)-th group of the first intermediate layer, and node Y_{i+1,1} of the (i+1)-th group, and node Y_{i,2} of the i-th group is obtained by fusing the i-th input node IN_i, node X_{i-1,2} of the (i-1)-th group of the first intermediate layer, and node Y_{i-1,2} of the (i-1)-th group.
Also in the second intermediate layer, node Y_{1,1} of the first group is obtained by fusing the first input node IN_1 with node Y_{2,1} of the second group, and node Y_{1,2} of the first group is obtained by fusing the first input node IN_1 with node X_{1,2} of the first group of the first intermediate layer; node Y_{N,1} of the N-th group is obtained by fusing the N-th input node IN_N with node X_{N-2,1} of the (N-2)-th group of the first intermediate layer, and node Y_{N,2} of the N-th group is obtained by fusing the N-th input node IN_N with node Y_{N-1,2} of the (N-1)-th group. The structure of the double-layer bidirectional feature pyramid neural network module described above is shown in fig. 3 and is not repeated here.
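Under the same scalar simplification, the wiring of the whole double-layer module for N inputs (N greater than 2) can be sketched as follows; `fuse` is again a hypothetical averaging stand-in for resampling, addition, and convolution, and each output fuses the corresponding second-layer group:

```python
def double_bifpn(inputs, fuse=lambda *xs: sum(xs) / len(xs)):
    """Scalar sketch of the double-layer bidirectional feature pyramid
    module: two intermediate planes, then per-group output fusion."""
    n = len(inputs)
    IN = dict(enumerate(inputs, start=1))
    X, Y = {}, {}
    # first intermediate layer (groups 1..n-2): top-down channel
    X[(1, 1)] = fuse(IN[1], IN[2])
    for i in range(2, n - 1):
        X[(i, 1)] = fuse(X[(i - 1, 1)], IN[i + 1])
    # first intermediate layer: bottom-up channel
    X[(n - 2, 2)] = fuse(IN[n - 1], IN[n])
    for i in range(n - 3, 0, -1):
        X[(i, 2)] = fuse(X[(i + 1, 2)], IN[i + 1])
    # second intermediate layer (groups 1..n): bottom-up channel
    Y[(n, 1)] = fuse(IN[n], X[(n - 2, 1)])
    for i in range(n - 1, 1, -1):
        Y[(i, 1)] = fuse(IN[i], X[(i - 1, 1)], Y[(i + 1, 1)])
    Y[(1, 1)] = fuse(IN[1], Y[(2, 1)])
    # second intermediate layer: top-down channel
    Y[(1, 2)] = fuse(IN[1], X[(1, 2)])
    for i in range(2, n):
        Y[(i, 2)] = fuse(IN[i], X[(i - 1, 2)], Y[(i - 1, 2)])
    Y[(n, 2)] = fuse(IN[n], Y[(n - 1, 2)])
    # each output node fuses the two nodes of the corresponding group
    return [fuse(Y[(i, 1)], Y[(i, 2)]) for i in range(1, n + 1)]
```

Unlike the stacked BiFPN, the two channels within each plane here run on disjoint node sets, so the two planes can in principle be evaluated independently before their outputs are fused.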
In various embodiments of the present invention, the double-layer bidirectional feature pyramid neural network module may have 3, 4, or 5 inputs; the invention is not limited in this respect. Inputs arriving along a downward channel are upsampled, inputs arriving along an upward channel are downsampled, and parallel inputs may need no resampling. The inputs of a node are first added, and information fusion is then performed through a convolutional layer with activation that does not change the height and depth of the feature map.
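A minimal numpy sketch of such a fusion step, assuming nearest-neighbour resampling to the first input's size and a ReLU standing in for the activated convolution (a real module would use a learned convolutional layer here):

```python
import numpy as np

def fuse(*feature_maps):
    """Hypothetical fusion step: resample every input to the spatial size
    of the first one (nearest-neighbour up- or downsampling), sum, then
    apply a ReLU in place of the size-preserving activated convolution."""
    h, w, c = feature_maps[0].shape
    total = np.zeros((h, w, c))
    for f in feature_maps:
        rows = np.arange(h) * f.shape[0] // h    # nearest-neighbour row map
        cols = np.arange(w) * f.shape[1] // w    # nearest-neighbour col map
        total += f[np.ix_(rows, cols)]           # resample, then add
    return np.maximum(total, 0.0)                # activation
```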
The foregoing is merely an exemplary description of various implementations of the invention and is not intended to be limiting thereof.
The invention also provides an image feature extraction device, and fig. 4 is a schematic diagram of the image feature extraction device according to the embodiment of the invention. The image feature extraction apparatus 200 includes a first obtaining module 210, an input module 220, and an output module 230.
The first obtaining module 210 is configured to obtain N feature images of different sizes of a target image, where N is an integer greater than 2.
The input module 220 is configured to take the N feature images of different sizes as the input of the at least one double-layer bidirectional feature pyramid neural network module.
The output module 230 is configured to obtain an output of the at least one dual-layer bidirectional feature pyramid neural network module as an image feature of the target image, so as to perform target detection on the target image.
In the image feature extraction device provided by the invention, the planar feature pyramid in BiFPN is likewise expanded into a three-dimensional feature fusion scheme: the planar feature pyramid network is turned into two planes, one running top-down and then bottom-up, the other running bottom-up and then top-down; the outputs of the two planes are then fused, and finally the resulting modules are stacked to form a new pyramid network structure.
Fig. 4 is only a schematic diagram illustrating the image feature extraction apparatus provided by the present invention, and the splitting, combining, and adding of modules are within the scope of the present invention without departing from the concept of the present invention. The image feature extraction device provided by the present invention can be implemented by software, hardware, firmware, plug-in, and any combination thereof, and the present invention is not limited thereto.
In an exemplary embodiment of the present disclosure, a computer-readable storage medium is also provided, on which a computer program is stored, which when executed by, for example, a processor, may implement the steps of the image feature extraction method described in any one of the above embodiments. In some possible embodiments, aspects of the present invention may also be implemented in the form of a program product comprising program code means for causing a terminal device to carry out the steps according to various exemplary embodiments of the present invention as described in the image feature extraction method section above of this specification, when said program product is run on the terminal device.
Referring to fig. 5, a program product 400 for implementing the above method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited in this regard and, in the present document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java or C++ and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
In an exemplary embodiment of the present disclosure, there is also provided an electronic device, which may include a processor, and a memory for storing executable instructions of the processor. Wherein the processor is configured to perform the steps of the image feature extraction method in any of the above embodiments via execution of the executable instructions.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or program product. Thus, various aspects of the invention may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.) or an embodiment combining hardware and software aspects that may all generally be referred to herein as a "circuit," module "or" system.
An electronic device 600 according to this embodiment of the invention is described below with reference to fig. 6. The electronic device 600 shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 6, the electronic device 600 is embodied in the form of a general purpose computing device. The components of the electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one storage unit 620, a bus 630 that connects the various system components (including the storage unit 620 and the processing unit 610), a display unit 640, and the like.
Wherein the storage unit stores program code executable by the processing unit 610 to cause the processing unit 610 to perform steps according to various exemplary embodiments of the present invention described in the image feature extraction method section above in this specification. For example, the processing unit 610 may perform the steps as shown in fig. 1 to 2.
The storage unit 620 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM)6201 and/or a cache memory unit 6202, and may further include a read-only memory unit (ROM) 6203.
The memory unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, such program modules 6205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 630 may be one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 600 may also communicate with one or more external devices 700 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 600, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 600 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 650. Also, the electronic device 600 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 660. The network adapter 660 may communicate with other modules of the electronic device 600 via the bus 630. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, or a network device, etc.) to execute the above-mentioned image feature extraction method according to the embodiments of the present disclosure.
Compared with the prior art, the invention has the advantages that:
The method expands the planar feature pyramid in the BiFPN into a three-dimensional feature fusion scheme: the planar feature pyramid network is changed into two planes, one running top-down and then bottom-up, the other running bottom-up and then top-down; the outputs of the two planes are then fused, and the two planes are finally stacked to form a new pyramid network structure.
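The two-plane topology described above can be sketched as follows. This is an illustrative sketch only, not the patent's claimed fusion rule: the `resize` and `fuse` helpers, the nearest-neighbour resampling, and the plain averaging are assumptions standing in for the learned weighted fusion a real BiFPN-style network would use.

```python
# Hypothetical sketch of the double-layer bidirectional pyramid module:
# plane 1 runs top-down then bottom-up, plane 2 runs bottom-up then
# top-down, and the two planes' outputs are fused level by level.
import numpy as np

def resize(x, size):
    # Nearest-neighbour resampling so two pyramid levels can be combined.
    idx = np.clip((np.arange(size) * len(x) / size).astype(int), 0, len(x) - 1)
    return x[idx]

def fuse(target, *others):
    # Assumed fusion rule: average all inputs at the target's resolution.
    return np.mean([resize(t, len(target)) for t in (target,) + others], axis=0)

def double_bifpn_module(feats):
    """One double-layer bidirectional module over N input levels (N >= 3)."""
    n = len(feats)
    # Plane 1: top-down channel (deepest/smallest level toward the largest) ...
    td = list(feats)
    for i in range(n - 2, -1, -1):
        td[i] = fuse(feats[i], td[i + 1])
    # ... followed by a bottom-up channel over the top-down result.
    plane1 = list(td)
    for i in range(1, n):
        plane1[i] = fuse(td[i], plane1[i - 1])
    # Plane 2: bottom-up channel first ...
    bu = list(feats)
    for i in range(1, n):
        bu[i] = fuse(feats[i], bu[i - 1])
    # ... followed by a top-down channel over the bottom-up result.
    plane2 = list(bu)
    for i in range(n - 2, -1, -1):
        plane2[i] = fuse(bu[i], plane2[i + 1])
    # Fuse the two planes level by level to form the module output.
    return [fuse(a, b) for a, b in zip(plane1, plane2)]

# Toy pyramid: N = 5 feature levels of decreasing spatial size.
pyramid = [np.random.rand(s) for s in (64, 32, 16, 8, 4)]
out = double_bifpn_module(pyramid)
```

Because the module's outputs have the same sizes as its inputs, several such modules can be stacked, with the output of one serving as the input of the next, as claim 7 describes.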
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (10)

1. An image feature extraction method, characterized by comprising:
obtaining N feature images of different sizes from a target image, wherein N is an integer greater than 2;
taking the N feature images of different sizes as the input of at least one double-layer bidirectional feature pyramid neural network module;
and acquiring the output of the at least one double-layer bidirectional feature pyramid neural network module as the image features of the target image, so as to perform target detection on the target image.
2. The image feature extraction method according to claim 1, wherein the double-layer bidirectional feature pyramid neural network module comprises:
N input nodes;
the first intermediate layer comprises N-2 groups of nodes, wherein one node in each group of nodes of the first intermediate layer forms a channel from top to bottom, and the other node in each group of nodes of the first intermediate layer forms a channel from bottom to top;
the second intermediate layer comprises N groups of nodes, wherein one node in each group of nodes of the second intermediate layer forms a channel from bottom to top, and the other node in each group of nodes of the second intermediate layer forms a channel from top to bottom;
and N output nodes, wherein each output node is obtained by fusing the corresponding group of nodes of the second intermediate layer.
3. The image feature extraction method according to claim 2, wherein, in the first intermediate layer:
a node X_{i1} of the i-th group of nodes is obtained by fusing a node X_{(i-1)1} of the (i-1)-th group of nodes and the (i+1)-th input node IN_{i+1}; another node X_{i2} of the i-th group of nodes is obtained by fusing another node X_{(i+1)2} of the (i+1)-th group of nodes and the (i+1)-th input node IN_{i+1}; wherein i is an integer greater than 1 and less than N-2, and N is an integer greater than 4.
4. The image feature extraction method according to claim 3, wherein, in the first intermediate layer:
one node X_{11} of the first group of nodes is obtained by fusing a first input node IN_1 and a second input node IN_2; another node X_{12} of the first group of nodes is obtained by fusing the second input node IN_2 and another node X_{22} of the second group of nodes;
one node X_{(N-2)1} of the (N-2)-th group of nodes is obtained by fusing a node X_{(N-3)1} of the (N-3)-th group of nodes and the (N-1)-th input node IN_{N-1}; another node X_{(N-2)2} of the (N-2)-th group of nodes is obtained by fusing the (N-1)-th input node IN_{N-1} and the N-th input node IN_N.
5. The image feature extraction method according to claim 4, wherein, in the second intermediate layer:
a node Y_{i1} of the i-th group of nodes is obtained by fusing the i-th input node IN_i, a node X_{(i-1)1} of the (i-1)-th group of nodes of the first intermediate layer, and one node Y_{(i+1)1} of the (i+1)-th group of nodes; another node Y_{i2} of the i-th group of nodes is obtained by fusing the i-th input node IN_i, another node X_{(i-1)2} of the (i-1)-th group of nodes of the first intermediate layer, and another node Y_{(i-1)2} of the (i-1)-th group of nodes.
6. The image feature extraction method according to claim 5, wherein, in the second intermediate layer:
one node Y_{11} of the first group of nodes is obtained by fusing a first input node IN_1 and a node Y_{21} of the second group of nodes; another node Y_{12} of the first group of nodes is obtained by fusing the first input node IN_1 and another node X_{12} of the first group of nodes of the first intermediate layer;
a node Y_{N1} of the N-th group of nodes is obtained by fusing the N-th input node IN_N and a node X_{(N-2)1} of the (N-2)-th group of nodes of the first intermediate layer; another node Y_{N2} of the N-th group of nodes is obtained by fusing the N-th input node IN_N and another node Y_{(N-1)2} of the (N-1)-th group of nodes.
7. The image feature extraction method according to any one of claims 1 to 6, wherein, when the N feature images of different sizes are used as the input of a plurality of double-layer bidirectional feature pyramid neural network modules, the output of a previous double-layer bidirectional feature pyramid neural network module is used as the input of a next double-layer bidirectional feature pyramid neural network module.
8. An image feature extraction device characterized by comprising:
a first acquisition module, configured to obtain N feature images of different sizes from a target image, wherein N is an integer greater than 2;
an input module, configured to take the N feature images of different sizes as the input of at least one double-layer bidirectional feature pyramid neural network module;
and an output module, configured to acquire the output of the at least one double-layer bidirectional feature pyramid neural network module as the image features of the target image, so as to perform target detection on the target image.
9. An electronic device, characterized in that the electronic device comprises:
a processor;
a storage medium having stored thereon a computer program which, when executed by the processor, performs the image feature extraction method according to any one of claims 1 to 7.
10. A storage medium having stored thereon a computer program which, when executed by a processor, performs the image feature extraction method according to any one of claims 1 to 7.
CN202011390511.0A 2020-12-02 2020-12-02 Image feature extraction method and device, electronic equipment and storage medium Pending CN112434713A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011390511.0A CN112434713A (en) 2020-12-02 2020-12-02 Image feature extraction method and device, electronic equipment and storage medium


Publications (1)

Publication Number Publication Date
CN112434713A true CN112434713A (en) 2021-03-02

Family

ID=74698908

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011390511.0A Pending CN112434713A (en) 2020-12-02 2020-12-02 Image feature extraction method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112434713A (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108986137A (en) * 2017-11-30 2018-12-11 成都通甲优博科技有限责任公司 Human body tracing method, device and equipment
CN109614876A (en) * 2018-11-16 2019-04-12 北京市商汤科技开发有限公司 Critical point detection method and device, electronic equipment and storage medium
US20200250462A1 (en) * 2018-11-16 2020-08-06 Beijing Sensetime Technology Development Co., Ltd. Key point detection method and apparatus, and storage medium
CN111291739A (en) * 2020-05-09 2020-06-16 腾讯科技(深圳)有限公司 Face detection and image detection neural network training method, device and equipment
CN111914937A (en) * 2020-08-05 2020-11-10 湖北工业大学 Lightweight improved target detection method and detection system
CN111967538A (en) * 2020-09-25 2020-11-20 北京百度网讯科技有限公司 Feature fusion method, device and equipment applied to small target detection and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XU CHENGQI, HONG XUEHAI: "Feature Pyramid Object Detection Network Based on Function Preservation" (基于功能保持的特征金字塔目标检测网络), Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 30 June 2020 (2020-06-30), pages 507 - 516 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113313668A (en) * 2021-04-19 2021-08-27 石家庄铁道大学 Subway tunnel surface disease feature extraction method
CN113361375A (en) * 2021-06-02 2021-09-07 武汉理工大学 Vehicle target identification method based on improved BiFPN
CN113361375B (en) * 2021-06-02 2022-06-07 武汉理工大学 Vehicle target identification method based on improved BiFPN
CN114972713A (en) * 2022-04-29 2022-08-30 北京开拓鸿业高科技有限公司 Area positioning method, device, storage medium and electronic equipment

Similar Documents

Publication Publication Date Title
CN112434713A (en) Image feature extraction method and device, electronic equipment and storage medium
KR20210042864A (en) Table recognition method, device, equipment, medium and computer program
US7024419B1 (en) Network visualization tool utilizing iterative rearrangement of nodes on a grid lattice using gradient method
CN108629414A (en) depth hash learning method and device
KR102553763B1 (en) Video event recognition method and device, electronic equipment and storage medium
CN116822452B (en) Chip layout optimization method and related equipment
CN110569972A (en) search space construction method and device of hyper network and electronic equipment
CN111312223B (en) Training method and device of voice segmentation model and electronic equipment
JP2021128779A (en) Method, device, apparatus, and storage medium for expanding data
CN112906865A (en) Neural network architecture searching method and device, electronic equipment and storage medium
WO2024001653A1 (en) Feature extraction method and apparatus, storage medium, and electronic device
US11645323B2 (en) Coarse-to-fine multimodal gallery search system with attention-based neural network models
CN111312224B (en) Training method and device of voice segmentation model and electronic equipment
CN116186330B (en) Video deduplication method and device based on multi-mode learning
US9886652B2 (en) Computerized correspondence estimation using distinctively matched patches
CN115965074A (en) Training method of deep learning model, data processing method, device and equipment
CN115186738A (en) Model training method, device and storage medium
CN111582456B (en) Method, apparatus, device and medium for generating network model information
US9465905B1 (en) Structure for static random access memory
CN114417856A (en) Text sparse coding method and device and electronic equipment
CN110378378A (en) Fact retrieval method, apparatus, computer equipment and storage medium
JP4391464B2 (en) Device for storing binary tree structure information and device for storing heap structure information
CN116227391B (en) Fault-tolerant Josephson junction array, dynamic ternary design method and equipment
CN114372238B (en) Distributed state estimation method
CN116541421B (en) Address query information generation method and device, electronic equipment and computer medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination