CN110555354B

CN110555354B - Feature screening method and apparatus, target detection method and apparatus, electronic apparatus, and storage medium

Info

Publication number: CN110555354B
Application number: CN201810571751.7A
Authority: CN
Inventors: 高梓桁
Original assignee: Xilinx Technology Beijing Ltd
Current assignee: Xilinx Technology Beijing Ltd
Priority date: 2018-05-31
Filing date: 2018-05-31
Publication date: 2022-06-17
Anticipated expiration: 2038-05-31
Also published as: CN110555354A

Abstract

The invention discloses a feature screening method and apparatus, a target detection method and apparatus, an electronic device, and a storage medium. The feature screening apparatus includes: a plurality of information compressors, each of which down-sample compresses its input information, the plurality of information compressors being cascade-connected to form a multi-layer compression structure that compresses information layer by layer; a plurality of information recoverers, each of which is connected to one information compressor and recovers information from the output of the information compressor on the same layer; and a plurality of information amplifiers cascade-connected to form a multi-layer amplification structure that amplifies information layer by layer, at least one of the information amplifiers being connected to one of the information recoverers so as to fuse and amplify the information output by the connected information recoverer together with the information output by the information amplifier of the previous layer. The feature screening apparatus of the invention can remove background information as far as possible, thereby improving the accuracy of target detection.

Description

Feature screening method and apparatus, target detection method and apparatus, electronic apparatus, and storage medium

Technical Field

The present disclosure relates to the field of image processing, and in particular, to a feature screening method, a feature screening apparatus, a target detection method, a target detection device, an electronic device, and a storage medium.

Background

Object detection refers to a series of technical processes that, given an image or a video frame, determine whether an object (e.g., a human face) is present in it and, if so, return its position and size. The technology is widely applied in fields such as security and biometric recognition, and it is a prerequisite for downstream tasks such as face recognition and face key point detection, so improving its performance and accuracy is of great importance.

In the past, traditional computer vision or machine learning methods were used to locate objects in an image, the most common being Haar-feature-based object detection; today, with the development of deep learning, many new and efficient target detection methods have emerged, such as DenseBox, MTCNN, and TinyFace.

However, an important problem that these methods have not properly solved is how to obtain better features so as to reduce false detections as much as possible during detection.

Therefore, how to obtain better features during target detection so as to reduce false detections as much as possible, and thereby improve the accuracy, efficiency, and performance of target detection, is a technical problem in urgent need of a solution.

Disclosure of Invention

In view of the above technical problems, the present invention provides a feature screening technique for target detection, so that the extracted features focus more on valid information (e.g., face information when the target is a face) and ignore invalid information (e.g., background information) that is likely to cause false detections.

The invention provides a target detection method based on progressive-scaling feature screening, which distinguishes important information from unimportant information by repeatedly discarding information and then recovering the important information, so that interference from unimportant information during target detection is reduced as much as possible, and the accuracy and efficiency of target detection are effectively improved.

According to an embodiment of the present invention, there is provided a feature screening apparatus for performing feature screening on input information to remove invalid background information and obtain useful information, the feature screening apparatus including: a plurality of information compressors, each configured to down-sample compress input information, the information compressors being connected in cascade to form a multi-layer compression structure that compresses information layer by layer; a plurality of information recoverers, each connected to one of the information compressors such that the connected information recoverer and information compressor are regarded as being on the same layer, wherein each information recoverer is configured to recover information from the output of the information compressor on its layer; and a plurality of information amplifiers cascade-connected to form a multi-layer amplification structure corresponding to the multi-layer compression structure so as to amplify information layer by layer, wherein at least one of the plurality of information amplifiers is connected to a respective information recoverer so as to fuse and amplify the recovered information output by the connected information recoverer together with the information output by the information amplifier of the previous layer, and to output the fused, amplified information.

Optionally, the number of information recoverers is equal to or smaller than the number of information compressors and information amplifiers.

Optionally, the information recoverer contains a classifier that is used during training to learn better features.

Optionally, a plurality of feature screening apparatuses are connected in cascade to form a multi-stage feature screening structure, in which additional information recoverers connect the information amplifiers of each feature screening apparatus to the information compressors on the same layer of the next feature screening apparatus.

According to an embodiment of the present invention, there is provided a feature filtering method for performing feature filtering on input information to remove invalid background information and obtain useful information, including: a compression step, performing downsampling compression operation on input information, wherein the compression operation is performed step by step to form a hierarchical structure of the compression operation; a restoration step corresponding to the compression operation of at least one layer to restore important information to the compressed information after the compression of the layer; and an enlargement step of performing an upsampling and enlarging operation on the information, wherein the upsampling and enlarging operation forms a hierarchical structure as in the compression operation to perform the upsampling and enlarging operation layer by layer from a first layer of the upsampling and enlarging operation corresponding to a last layer of the compression operation, wherein in the enlarging operation of each of at least one layer of the plurality of layers of the enlarging operation in the enlargement step, information restored through the restoration operation in the restoration step and information upsampled and enlarged through the upsampling and enlarging operation of a previous layer are fused and enlarged, thereby outputting the fused enlarged information.

Optionally, the recovering step further comprises a classification operation for classifying the information for training to learn better features.

According to an embodiment of the present invention, there is provided an object detection apparatus including: the feature screening apparatus described above; and a post-processing device configured to post-process the information screened by the feature screening apparatus to confirm the presence or absence of the target and to obtain information on the position and size of the target.

Optionally, the object detection device further comprises a preprocessing unit configured to perform a preprocessing operation on the input image to obtain the roughly processed input information.

According to an embodiment of the present invention, there is provided an object detection method including: a feature screening step of screening the input information using the feature screening method described above, so as to remove invalid background information and obtain useful information; and a post-processing step of post-processing the screened information to confirm whether a target is present and to obtain information on the position and size of the target.

Optionally, the object detection method further includes a preprocessing step of preprocessing the input image to obtain the roughly processed input information.

An electronic device according to an embodiment of the present invention includes: a processor; and a memory having executable code stored thereon, which when executed by the processor, causes the processor to perform any of the methods claimed herein.

A non-transitory machine-readable storage medium according to an embodiment of the present invention has stored thereon executable code which, when executed by a processor of an electronic device, causes the processor to perform any one of the methods claimed herein.

By means of the feature screening method and apparatus and the target detection method and apparatus of the invention, important information is distinguished from unimportant information by repeatedly discarding information and then recovering the important information, so that interference from unimportant information during target detection can be reduced as much as possible, and the accuracy of target detection is effectively improved.

Drawings

The above and other objects, features and advantages of the present disclosure will become more apparent by describing in greater detail exemplary embodiments thereof with reference to the attached drawings, in which like reference numerals generally represent like parts throughout.

FIG. 1 illustrates an example of a feature screening infrastructure with progressive scaling of the hierarchy according to one embodiment of the present invention.

FIG. 2 schematically illustrates a multi-level feature screening architecture.

FIG. 3 presents a schematic flow-chart diagram of a method of object detection in accordance with an embodiment of the present invention.

FIG. 4 presents a schematic block diagram of an object detection apparatus in accordance with an embodiment of the present invention.

FIG. 5 shows an electronic device according to an embodiment of the invention.

Detailed Description

Preferred embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art. It should be noted that the reference numerals and numbers and serial numbers in the present application are only given for convenience of description, and no limitation is made to the steps, the sequence and the like of the present invention unless the sequence of the steps is explicitly indicated in the specification.

According to one embodiment of the present invention, a feature screening apparatus for target detection is provided.

As shown in fig. 1, the feature screening apparatus 100 according to the present invention includes a plurality of information compressors 101, a plurality of information recoverers 102, and a plurality of information amplifiers 103.

Here, the information compressor 101 is configured to down-sample compress the input information so that invalid information such as background information is discarded, although some valid information is inevitably lost at the same time. The plurality of information compressors 101 are connected in cascade to form a hierarchical structure (which may be referred to as a "multi-layer compression structure") so that information is compressed layer by layer; that is, the information compressed by the information compressor 101 on one layer is input to the information compressor 101 on the next layer and compressed again. For example, as illustrated in fig. 1, n information compressors are cascaded to form an n-layer compression structure (n being 2 or more).
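The layer-by-layer compression can be illustrated with a minimal Python sketch. It assumes 2x2 average pooling as the down-sampling operator, which the text does not specify, and the names `downsample2x` and `compress_cascade` are hypothetical.

```python
def downsample2x(grid):
    """Average-pool a 2-D list by a factor of 2 in each dimension.

    Average pooling is an assumed operator; the source only says
    "down-sample compress".
    """
    h, w = len(grid) // 2, len(grid[0]) // 2
    return [
        [
            (grid[2 * i][2 * j] + grid[2 * i][2 * j + 1]
             + grid[2 * i + 1][2 * j] + grid[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(w)
        ]
        for i in range(h)
    ]


def compress_cascade(grid, n_layers):
    """Return the output of each of the n cascaded compressors, layer 1 first."""
    outputs = []
    current = grid
    for _ in range(n_layers):
        current = downsample2x(current)  # each layer compresses the previous output
        outputs.append(current)
    return outputs


# An 8x8 input passed through a 3-layer compression structure.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
layers = compress_cascade(image, 3)
shapes = [(len(g), len(g[0])) for g in layers]
print(shapes)  # [(4, 4), (2, 2), (1, 1)]
```

Each layer halves the spatial resolution, so fine detail (including part of the useful signal) is lost progressively, which is why the restoration path described next is needed.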

The information recoverer 102 is connected to the information compressor 101 and the information amplifier 103 on the same layer, and is configured to pass the information compressed by the information compressor on that layer through a filter in the information recoverer (implemented, for example, by convolution or a neural network) so as to recover important information. The output of the information recoverer 102 is supplied to the information amplifier 103 connected to it on the same layer.
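As a rough illustration of the convolution option, the sketch below applies a 3x3 filter with zero padding so that the output keeps the size of the compressed input. The kernel values are hypothetical placeholders; in practice the filter weights would presumably be learned. As is common in neural-network code, the function computes a cross-correlation (the kernel is not flipped).

```python
def conv2d_same(grid, kernel):
    """Apply a 2-D filter with zero padding; the output has the input's size."""
    h, w = len(grid), len(grid[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    y, x = i + a - ph, j + b - pw
                    if 0 <= y < h and 0 <= x < w:  # outside the grid counts as zero
                        acc += grid[y][x] * kernel[a][b]
            out[i][j] = acc
    return out


# Hypothetical fixed kernel standing in for a learned filter.
kernel = [[0.0, -1.0, 0.0],
          [-1.0, 5.0, -1.0],
          [0.0, -1.0, 0.0]]

compressed = [[0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]]
restored = conv2d_same(compressed, kernel)
print(restored[1])  # [-1.0, 5.0, -1.0]
```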

The information amplifier 103 is connected to the information restorer 102 of the same layer to receive the restored information output from the information restorer 102. The information amplifier 103 is also connected to the information amplifier 103 of the previous layer to receive the information output from that amplifier. The information amplifier 103 is configured to fuse and amplify the restored information output from the information restorer 102 of the same layer together with the information output from the information amplifier 103 of the previous layer, and to output the fused, amplified information. The plurality of information amplifiers 103 are connected in cascade to form a hierarchical structure (which may be referred to as a "multi-layer amplification structure", corresponding to the above-described "multi-layer compression structure") so as to amplify information layer by layer. For example, as illustrated in fig. 1, n information amplifiers 103 are cascaded to form an n-layer amplification structure (n being 2 or more).

The above fusion may be summation, multiplication or other operations, and the present invention is not limited in this respect.
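A single amplifier can be sketched as follows. The sketch assumes that fusion is element-wise (summation or multiplication, as mentioned above), that fusion happens before amplification, and that amplification is nearest-neighbour up-sampling by a factor of 2; none of these details are fixed by the text, and the function names are hypothetical.

```python
def upsample2x(grid):
    """Nearest-neighbour up-sampling by 2 in each dimension (assumed operator)."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat each row vertically
    return out


def fuse(a, b, mode="sum"):
    """Element-wise fusion of two equally sized 2-D lists."""
    if mode == "sum":
        return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    if mode == "product":
        return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    raise ValueError("unsupported fusion mode: " + mode)


def amplify(restored, previous, mode="sum"):
    """Fuse the restorer output with the previous amplifier's output, then up-sample."""
    return upsample2x(fuse(restored, previous, mode))


restored = [[1.0, 2.0], [3.0, 4.0]]        # from the restorer on this layer
previous = [[10.0, 20.0], [30.0, 40.0]]    # from the previous amplifier
print(amplify(restored, previous, "sum")[0])      # [11.0, 11.0, 22.0, 22.0]
print(amplify(restored, previous, "product")[0])  # [10.0, 10.0, 40.0, 40.0]
```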

As can be seen from the figure, the numbers of information compressors, information recoverers, and information amplifiers may be the same, and the number of layers constituting the hierarchical structure may be equal to the number of each of these devices, for example n, as shown in fig. 1.

However, the number of information restorers may differ from the number of information compressors and information amplifiers. For example, the number of information restorers may be smaller than the number of information compressors and information amplifiers; that is, the information amplifiers on one or more layers may perform no fusion operation and instead directly amplify the information output from the information amplifier of the previous layer. In other words, the present invention can be implemented such that the fusion of the information output by the information restorer with the information output by the information amplifier of the previous layer is performed in the information amplifier on at least one layer.

Note that the information compressors and the information amplifiers have opposite signal flow directions, so the "upper layer" (previous layer) and "lower layer" (next layer) mentioned for each of them are defined along their own signal flow direction. That is, even for an information compressor and an information amplifier on the same layer, the respective "upper layer" and "lower layer" are different.
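The following small sketch makes the direction convention concrete. It assumes layers are numbered 1 to n along the compression direction, which is a numbering chosen here for illustration only; `previous_layer` is a hypothetical helper.

```python
def previous_layer(kind, layer, n_layers):
    """Return the layer that feeds `layer` along the unit's own signal flow.

    Layers are numbered 1..n_layers along the compression direction
    (an assumed numbering). Returns None for the first unit of a path.
    """
    if kind == "compressor":
        # Compression flows from layer 1 down to layer n.
        return layer - 1 if layer > 1 else None
    if kind == "amplifier":
        # Amplification flows from layer n back up to layer 1.
        return layer + 1 if layer < n_layers else None
    raise ValueError("unknown unit kind: " + kind)


print(previous_layer("compressor", 2, 3))  # 1
print(previous_layer("amplifier", 2, 3))   # 3
```

On the same layer 2, the compressor is fed by layer 1 while the amplifier is fed by layer 3.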

Through the hierarchical structure described above, in which the plurality of information compressors 101 and the plurality of information amplifiers 103 are each connected in cascade and the information compressor and information amplifier on the same layer are connected by an information restorer 102, the compression, restoration, and amplification functions are combined. Important information can thus be screened layer by layer, so that valid information (information related to the detection target) gradually increases and invalid information (useless background information or noise) gradually decreases. More unnecessary features are screened out and better features are obtained, whereby the accuracy and efficiency of target detection can be greatly improved.

In the present invention, the number of information compressors, information recoverers, and information amplifiers is not particularly limited, because it may depend on the number of times the information can be compressed. By way of example, the number of information compressors, information recoverers, and information amplifiers may be 3, 4, 6, and so on.
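Putting the three kinds of units together, one feature screening structure might be sketched as below. This is a simplified illustration under several assumptions not stated in the source: 2x2 average pooling for compression, nearest-neighbour up-sampling for amplification, summation for fusion, fusion before amplification, and a 1x1 convolution with a single hypothetical weight standing in for the learned restoration filter. The `has_restorer` flags model the option of having fewer restorers than compressors and amplifiers.

```python
def downsample2x(g):
    return [[(g[2 * i][2 * j] + g[2 * i][2 * j + 1]
              + g[2 * i + 1][2 * j] + g[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(len(g[0]) // 2)] for i in range(len(g) // 2)]


def upsample2x(g):
    out = []
    for row in g:
        wide = [v for v in row for _ in range(2)]
        out += [wide, list(wide)]
    return out


def restore(g, weight=1.0):
    # Stand-in for a learned filter: a 1x1 convolution with one weight.
    return [[weight * v for v in row] for row in g]


def fuse(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def feature_screen(x, n_layers, has_restorer=None):
    """Compress layer by layer, then restore, fuse and amplify back up."""
    if has_restorer is None:
        has_restorer = [True] * n_layers
    compressed, cur = [], x
    for _ in range(n_layers):                # compression path, layer 1..n
        cur = downsample2x(cur)
        compressed.append(cur)
    up = None
    for k in reversed(range(n_layers)):      # amplification path, layer n..1
        restored = restore(compressed[k]) if has_restorer[k] else None
        if up is None:
            # Deepest layer: no previous amplifier; without a restorer the
            # compressed information is passed on as it is.
            merged = restored if restored is not None else compressed[k]
        elif restored is None:
            merged = up                      # no fusion on this layer
        else:
            merged = fuse(restored, up)
        up = upsample2x(merged)
    return up


x = [[1.0] * 8 for _ in range(8)]
out_full = feature_screen(x, 3)
out_skip = feature_screen(x, 3, has_restorer=[True, False, True])
print(len(out_full), len(out_full[0]), out_full[0][0], out_skip[0][0])  # 8 8 3.0 2.0
```

With a constant input, each fusing layer adds one copy of the signal, so the output value simply counts the layers that fused (3 with all restorers, 2 when the middle layer has none), and the output returns to the input resolution.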

Further optionally, the information restorer 102 connected to the information compressor 101 of the last layer may or may not include a filter for filtering the compressed information. In the latter case, the information output from the compressor of the lowermost layer is not restored but is output as it is to the information amplifier of the lowermost layer for amplification.

Optionally, the information restorer 102 may contain a classifier that is used during training to learn better features, thereby enhancing the output of the better features and screening out useless background information.

Further, the above-described feature screening infrastructure may be cascaded in multiple stages to form a multi-stage feature screening structure, as shown in fig. 2, so as to obtain better features with less and less invalid information.

In fig. 2, a plurality of feature screening infrastructures may be connected by a number of information restorers 102, thereby forming a multi-level feature screening structure (m levels in fig. 2, where m may be greater than or equal to 2).

The information restorer 102 for connecting a plurality of feature screening infrastructures connects the outputs of the information amplifiers 103 of each layer of each feature screening infrastructure and delivers the outputs to the information compressor 101 on the same layer of the feature screening infrastructure of the subsequent stage, where layer-by-layer compression, restoration and amplification are performed again to obtain better features and remove more useless noise, thereby making the target detection result more accurate.

According to one embodiment of the present invention, a feature screening method for target detection is provided.

As shown in fig. 1, in the compression step, the down-sampling compression operation is performed on the input information, so that invalid information such as background information is lost, but at the same time, some valid information is inevitably lost.

Here, the compression operation may be performed in stages, forming a hierarchy of compression operations. That is, the compressed information may continue to be compressed, thereby forming a hierarchy of progressive multi-level compression.

In the restoration step, which corresponds to the compression of at least one layer, important information is restored from the information compressed at that layer. The restoration operation can be realized, for example, by filtering with a filter (for example, using convolution or a neural network).

In the amplification step, the information is up-sampled and amplified. The up-sampling amplification operation can form a hierarchical structure like that of the compression operation; that is, the information output from the amplification operation of one layer can be input to the amplification operation of the next layer. The up-sampling amplification operation is performed layer by layer, starting from its first layer, which corresponds to the last layer of the compression operation. The information restored via the filter may be fused with the information up-sampled and amplified at the previous layer and amplified to output fused amplified information.

By this progressive scaling process described above, the resulting information contains as little background information as possible, resulting in better effective features, thereby improving the accuracy of target detection.

Optionally, the above process may be performed in a multistage cascade, so as to implement multistage feature screening, so as to be able to screen out more unnecessary information, and at the same time, to obtain better features, thereby greatly improving the accuracy and efficiency of target detection.

Optionally, the recovery operation in the recovery step may comprise a classification operation for training to learn better features, in such a way as to enhance the output of the better features and to screen out useless background information.

According to one embodiment of the present invention, a method of object detection is provided.

As shown in fig. 3, in step S10, the input picture is pre-processed to obtain roughly processed input information.

In step S20, feature screening according to the method described above is performed on the preprocessed information to obtain better features and obtain as much effective information as possible. Here, the feature screening may perform the feature screening according to one feature screening basic structure, or may perform the multi-level feature screening.

In step S30, the feature-screened information is post-processed to confirm whether a target is present and to obtain information about the position and size of the target.

Note that not all of the above steps are required; for example, the preprocessing operation may be omitted.
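The flow of steps S10 to S30, with S10 optional, can be sketched as a simple pipeline. The three stage functions below (`normalize`, `keep_strong`, `has_target`) are hypothetical placeholders chosen only to make the sketch runnable; they are not the preprocessing, feature screening, or post-processing operations of the invention.

```python
def detect_objects(image, screen, postprocess, preprocess=None):
    """Run S10 (optional), S20 and S30 in order and return the result of S30."""
    info = preprocess(image) if preprocess is not None else image  # S10, optional
    features = screen(info)                                        # S20
    return postprocess(features)                                   # S30


# Hypothetical placeholder stages, for illustration only.
def normalize(image):
    return [[v / 255.0 for v in row] for row in image]


def keep_strong(info):
    return [[v if v >= 0.5 else 0.0 for v in row] for row in info]


def has_target(features):
    return any(v > 0.0 for row in features for v in row)


raw_image = [[0, 30], [200, 10]]          # 8-bit pixel values
ready_image = [[0.0, 0.1], [0.2, 0.3]]    # already in the range expected by S20

print(detect_objects(raw_image, keep_strong, has_target, preprocess=normalize))  # True
print(detect_objects(ready_image, keep_strong, has_target))                      # False
```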

Here, the post-processing step S30 may be a detection step for detection and/or a classification step for classification, or may be another post-processing method including a detection step and/or a classification step. In summary, the post-processing step S30 is to process the feature-filtered information to confirm whether there is an object and to obtain information about the position and size of the object, which can be used as an output result of object detection.
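One very simple form such a post-processing step could take is sketched below. It assumes that the screened features have already been reduced to a 2-D score map and that at most one target is present; both are assumptions made for illustration, and the threshold value and the function name `detect` are hypothetical.

```python
def detect(score_map, threshold=0.5):
    """Return the bounding box of all cells scoring at least `threshold`.

    Returns None when no cell passes the threshold (no target present).
    Assumes a single target; position and size are in score-map cells.
    """
    hits = [(r, c)
            for r, row in enumerate(score_map)
            for c, v in enumerate(row)
            if v >= threshold]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    top, left = min(rows), min(cols)
    return {
        "x": left,
        "y": top,
        "width": max(cols) - left + 1,
        "height": max(rows) - top + 1,
    }


score_map = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.9, 0.8, 0.0],
    [0.1, 0.0, 0.7, 0.9, 0.0],
    [0.0, 0.0, 0.0, 0.1, 0.0],
]
print(detect(score_map))  # {'x': 2, 'y': 1, 'width': 2, 'height': 2}
```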

According to the target detection method of the embodiment of the invention, better and more effective features can be obtained through hierarchical progressive feature screening, so that the accuracy of target detection is greatly improved.

According to an embodiment of the present invention, there is provided an object detection apparatus.

As shown in fig. 4, the object detection apparatus 1000 according to an embodiment of the present invention may include a preprocessing device 1001, a feature screening device 1002, and a post-processing device 1003 or the like.

The preprocessing device 1001 is configured to preprocess an input picture to obtain roughly processed input information.

The feature filtering means 1002 is configured to perform feature filtering according to the above-described method on the preprocessed information to obtain better features and obtain as much effective information as possible. Here, the feature screening may perform the feature screening according to one feature screening basic structure, or may perform the multi-level feature screening. Here, the feature screening apparatus may be the feature screening apparatus 100 according to the above-described embodiment.

The post-processing means 1003 is configured to post-process the information subjected to the feature filtering by the feature filtering means 1002 to confirm whether or not there is a target and obtain information on the position and size of the target.

In addition, the post-processing device 1003 may include a detection device for detection and/or a classification device for classification. In summary, the post-processing device 1003 processes the feature-screened information to confirm whether a target is present and to obtain information on the position and size of the target, which can be used as the output result of target detection.

Note that the above-mentioned devices are not necessarily all necessary for the target detection apparatus of the present invention, and for example, the device may not include a preprocessing device, and in this case, the feature screening device may directly perform the feature screening process on the input.

The target detection device according to the embodiment of the invention can obtain better and more effective features through hierarchical progressive feature screening, thereby greatly improving the accuracy of target detection.

Fig. 5 shows a schematic structural diagram of an electronic device that can be used to implement the processing of the above-described method according to an embodiment of the present invention.

Referring to fig. 5, the electronic device 1 comprises a memory 10 and a processor 20.

The processor 20 may be a multi-core processor or may include a plurality of processors. In some embodiments, processor 20 may comprise a general-purpose host processor and one or more special purpose coprocessors such as a Graphics Processor (GPU), Digital Signal Processor (DSP), or the like. In some embodiments, processor 20 may be implemented using custom circuits, such as an Application Specific Integrated Circuit (ASIC) or a Field Programmable Gate Array (FPGA).

The memory 10 may include various types of storage units, such as system memory, read-only memory (ROM), and a persistent storage device. The ROM may store static data or instructions required by the processor 20 or other modules of the computer. The persistent storage device may be a read-write storage device, and may be a non-volatile storage device that does not lose stored instructions and data even after the computer is powered off. In some embodiments, a mass storage device (e.g., a magnetic or optical disk, or flash memory) is used as the persistent storage device. In other embodiments, the persistent storage device may be a removable storage device (e.g., a floppy disk or an optical drive). The system memory may be a read-write memory device or a volatile read-write memory device, such as dynamic random access memory. The system memory may store some or all of the instructions and data that the processor requires at runtime. Further, the memory 10 may comprise any combination of computer-readable storage media, including various types of semiconductor memory chips (DRAM, SRAM, SDRAM, flash memory, programmable read-only memory); magnetic and/or optical disks may also be employed. In some embodiments, the memory 10 may include a removable storage device that is readable and/or writable, such as a compact disc (CD), a read-only digital versatile disc (e.g., DVD-ROM, dual-layer DVD-ROM), a read-only Blu-ray disc, an ultra-density optical disc, a flash memory card (e.g., SD card, mini SD card, Micro-SD card), a magnetic floppy disk, or the like. Computer-readable storage media do not include carrier waves or transitory electronic signals transmitted wirelessly or over wires.

The memory 10 has stored thereon executable code which, when executed by the processor 20, causes the processor 20 to perform any one of the methods mentioned above.

The feature screening method and the object detection method, and the feature screening apparatus and the object detection device according to the present invention have been described in detail above with reference to the accompanying drawings.

Furthermore, the invention may also be embodied as a computer program or computer program product comprising computer program code instructions for carrying out the steps defined above in any one of the above-described methods of the invention.

Alternatively, the invention may also be embodied as a non-transitory machine-readable storage medium (or computer-readable storage medium, or machine-readable storage medium) having stored thereon executable code (or a computer program, or computer instruction code) which, when executed by a processor of an electronic device (or computing device, server, etc.), causes the processor to perform the steps of any of the above-described methods according to the invention.

Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems and methods according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims

1. An object detection device, characterized in that the object detection device comprises:

feature screening means configured to perform feature screening on input information from the input image to remove invalid background information and obtain useful information;

post-processing means configured to post-process the information subjected to the feature filtering by the feature filtering means to confirm the presence or absence of the target and obtain information on the position and size of the target,

wherein the feature screening means includes:

a plurality of information compressors, each configured to down-sample compress the input information, and the plurality of information compressors being connected in cascade to form a multi-layered compression structure to compress information layer by layer;

a plurality of information recoverers each connected to one of the information compressors such that the connected information recoverers and information compressors are regarded as being on the same layer, wherein each information recoverer is configured to recover information compressed by the information compressor of the same layer via the information recoverer of the layer; and

a plurality of information amplifiers cascade-connected to form a multilayer amplification structure equivalent to the multilayer compression structure to amplify information layer by layer, wherein at least one of the plurality of information amplifiers is connected to one information restorer each to fuse and amplify information subjected to a restoration operation output from the connected information restorer with information output from an information amplifier of a previous layer, and output fused amplified information.

2. The object detection device according to claim 1, wherein the number of information recoverers is equal to or smaller than the number of information compressors and information amplifiers.

3. The object detection device of claim 1, wherein the information restorer comprises a classifier used during training to learn better features.

4. The object detection apparatus according to claim 1, wherein a plurality of feature screening devices are cascade-connected to form a multistage feature screening structure by further employing an additional information restorer connecting an information amplifier of a preceding feature screening device and an information compressor on the same layer as the succeeding feature screening device among the plurality of feature screening devices.

5. The object detection device of claim 1, further comprising preprocessing means configured to perform a preprocessing operation on the input image to obtain the processed input information.