CN109784131B - Object detection method, device, storage medium and processor - Google Patents

Object detection method, device, storage medium and processor

Info

Publication number
CN109784131B
CN109784131B
Authority
CN
China
Prior art keywords
network
suggestion
target
detection
regional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711133580.1A
Other languages
Chinese (zh)
Other versions
CN109784131A (en)
Inventor
Name withheld at the inventor's request
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kuang Chi Innovative Technology Ltd
Shenzhen Kuang Chi Hezhong Technology Ltd
Original Assignee
Kuang Chi Innovative Technology Ltd
Shenzhen Kuang Chi Hezhong Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kuang Chi Innovative Technology Ltd, Shenzhen Kuang Chi Hezhong Technology Ltd filed Critical Kuang Chi Innovative Technology Ltd
Priority to CN201711133580.1A priority Critical patent/CN109784131B/en
Priority to PCT/CN2018/079852 priority patent/WO2019095596A1/en
Publication of CN109784131A publication Critical patent/CN109784131A/en
Application granted granted Critical
Publication of CN109784131B publication Critical patent/CN109784131B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an object detection method, an object detection device, a storage medium and a processor. The method comprises the following steps: generating a suggestion box through a region suggestion network, wherein the region suggestion network is used to identify the region where an object in a picture is located, and the suggestion box is used to display that region; acquiring a detection network according to the suggestion box, wherein the detection network is used to detect the object in the picture, and no convolution layer is yet shared between the region suggestion network and the detection network; fine-tuning the detection network and the region suggestion network according to the suggestion box to obtain a target detection network and a target region suggestion network with the same shared convolution layers; and detecting a target object in a target picture according to the target detection network and the target region suggestion network. The application solves the technical problem of low object detection accuracy.

Description

Object detection method, device, storage medium and processor
Technical Field
The present application relates to the field of computers, and in particular, to an object detection method, an object detection device, a storage medium, and a processor.
Background
With the rapid development of computer technology, computers can perform more and more tasks, such as object detection. In existing object detection technology, detection methods can recognize only a limited range of objects and have low detection accuracy.
In view of the above problems, no effective solution has been proposed at present.
Disclosure of Invention
The embodiment of the application provides an object detection method, an object detection device, a storage medium and a processor, which are used for at least solving the technical problem of low object detection accuracy.
According to an aspect of an embodiment of the present application, there is provided an object detection method including: generating a suggestion box through a region suggestion network, wherein the region suggestion network is used to identify the region where an object in a picture is located, and the suggestion box is used to display that region; acquiring a detection network according to the suggestion box, wherein the detection network is used to detect the object in the picture, and no convolution layer is yet shared between the region suggestion network and the detection network; fine-tuning the detection network and the region suggestion network according to the suggestion box to obtain a target detection network and a target region suggestion network with the same shared convolution layers; and detecting a target object in a target picture according to the target detection network and the target region suggestion network.
Optionally, fine-tuning the detection network and the region suggestion network according to the suggestion box to obtain the target detection network and the target region suggestion network with the same shared convolution layers includes: keeping the suggestion box fixed, and alternately fine-tuning the region suggestion network and the detection network to obtain the target detection network and the target region suggestion network with the same shared convolution layers.
Optionally, keeping the suggestion box fixed and alternately fine-tuning the region suggestion network and the detection network includes: keeping the suggestion box fixed, and fine-tuning the convolution layers unique to the region suggestion network until the detection network and the region suggestion network share convolution layers, so as to obtain the target region suggestion network; and keeping the shared convolution layers fixed, and fine-tuning the FC layers of the detection network so that the detection network and the region suggestion network have the same shared convolution layers, so as to obtain the target detection network.
Optionally, the regional suggestion network is an RPN network, and the detection network is a Fast R-CNN network.
Optionally, before generating the suggestion box through the regional suggestion network, the method further comprises: and carrying out initialization training through an ImageNet pre-training model to obtain the RPN.
Optionally, acquiring the detection network according to the suggestion box includes: training an ImageNet pre-trained model with Fast R-CNN according to the suggestion box to obtain the Fast R-CNN network, where no convolution layer is yet shared between the Fast R-CNN network and the RPN network.
Optionally, detecting the target object in the target picture according to the target detection network and the target area suggestion network includes: acquiring the target picture; and inputting the target picture into the target area suggestion network and the target detection network to obtain the target object.
Optionally, inputting the target picture into the target area suggestion network and the target detection network, and obtaining the target object includes: inputting the target picture into the target area suggestion network to obtain a picture carrying a target suggestion frame; and inputting the picture carrying the target suggestion frame into the target detection network to obtain the target object detected by the target detection network from the target suggestion frame.
According to another aspect of the embodiment of the present application, there is also provided an object detection apparatus including: a generation module, configured to generate a suggestion box through a region suggestion network, wherein the region suggestion network is used to identify the region where an object in a picture is located, and the suggestion box is used to display that region; an acquisition module, configured to acquire a detection network according to the suggestion box, wherein the detection network is used to detect the object in the picture, and no convolution layer is yet shared between the region suggestion network and the detection network; a fine-tuning module, configured to fine-tune the detection network and the region suggestion network according to the suggestion box to obtain a target detection network and a target region suggestion network with the same shared convolution layers; and a detection module, configured to detect a target object in a target picture according to the target detection network and the target region suggestion network.
Optionally, the fine-tuning module includes: a fine-tuning unit, configured to keep the suggestion box fixed and alternately fine-tune the region suggestion network and the detection network to obtain the target detection network and the target region suggestion network with the same shared convolution layers.
Optionally, the fine-tuning unit includes: a first fine-tuning subunit, configured to keep the suggestion box fixed and fine-tune the convolution layers unique to the region suggestion network until the detection network and the region suggestion network share convolution layers, so as to obtain the target region suggestion network; and a second fine-tuning subunit, configured to keep the shared convolution layers fixed and fine-tune the FC layers of the detection network so that the detection network and the region suggestion network have the same shared convolution layers, so as to obtain the target detection network.
Optionally, the regional suggestion network is an RPN network, and the detection network is a Fast R-CNN network.
Optionally, the apparatus further comprises: and the initialization training module is used for performing initialization training through an ImageNet pre-training model to obtain the RPN.
Optionally, the acquisition module is configured to: train an ImageNet pre-trained model with Fast R-CNN according to the suggestion box to obtain the Fast R-CNN network, where no convolution layer is yet shared between the Fast R-CNN network and the RPN network.
Optionally, the detection module includes: an acquisition unit, configured to acquire the target picture; and the input unit is used for inputting the target picture into the target area suggestion network and the target detection network to obtain the target object.
Optionally, the input unit includes: the first input subunit is used for inputting the target picture into the target area suggestion network to obtain a picture carrying a target suggestion frame; and the second input subunit is used for inputting the picture carrying the target suggestion frame into the target detection network to obtain the target object detected by the target detection network from the target suggestion frame.
According to another aspect of the embodiment of the present application, there is further provided a storage medium, where the storage medium includes a stored program, where the program, when executed, controls a device in which the storage medium is located to perform any one of the methods described above.
According to another aspect of the embodiment of the present application, there is further provided a processor, where the processor is configured to execute a program, where the program executes the method according to any one of the preceding claims.
In the embodiment of the application, a suggestion box is generated through a region suggestion network, wherein the region suggestion network is used to identify the region where an object in a picture is located, and the suggestion box is used to display that region; a detection network is acquired according to the suggestion box, wherein the detection network is used to detect the object in the picture, and no convolution layer is yet shared between the region suggestion network and the detection network; the detection network and the region suggestion network are fine-tuned according to the suggestion box to obtain a target detection network and a target region suggestion network with the same shared convolution layers; and the target object in the target picture is detected according to the target detection network and the target region suggestion network. Because the suggestion box is generated by the region suggestion network, the detection network is acquired, and both networks are fine-tuned according to the suggestion box, the target detection network and the target region suggestion network come to share the same convolution layers and can be unified. This improves the accuracy of object detection and thereby solves the technical problem of low object detection accuracy.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a schematic diagram of an object detection method according to an embodiment of the present application;
FIG. 2 is a block diagram of an object detection apparatus according to an embodiment of the present application;
FIG. 3 is a block diagram II of an object detection device according to an embodiment of the present application;
fig. 4 is a block diagram III of an object detection apparatus according to an embodiment of the present application;
fig. 5 is a block diagram of an object detection apparatus according to an embodiment of the present application.
Detailed Description
In order that those skilled in the art will better understand the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without inventive effort shall fall within the scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In accordance with an embodiment of the present application, a method embodiment of object detection is provided. It should be noted that the steps shown in the flowchart of the figures may be performed in a computer system as a set of computer-executable instructions, and, although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in an order other than that shown or described herein.
Fig. 1 is a schematic diagram of an object detection method according to an embodiment of the present application, as shown in fig. 1, the method includes the steps of:
step S102, generating a suggestion frame through a regional suggestion network, wherein the regional suggestion network is used for identifying the region where the object in the picture is located, and the suggestion frame is used for displaying the region where the object is located;
step S104, acquiring a detection network according to a suggestion frame, wherein the detection network is used for detecting an object from a picture, and a convolution layer which is not shared between the regional suggestion network and the detection network is formed;
step S106, fine-tuning the detection network and the regional suggestion network according to the suggestion frame to obtain a target detection network and a target regional suggestion network with the same shared convolution layer;
step S108, detecting the target object in the target picture according to the target detection network and the target area suggestion network.
Optionally, in this embodiment, the above object detection method may be applied to, but is not limited to, object detection scenarios. For example, such objects may include, but are not limited to: tableware, fruit, stationery, kitchen ware, clothing, ornaments, and the like.
Alternatively, in the present embodiment, the above-described object detection method may be applied to, but is not limited to, a terminal device. The terminal device may include, but is not limited to: a cell phone, tablet computer, PC computer, smart wearable device, smart home device, etc.
Optionally, in this embodiment, the above-mentioned region suggestion network is used to identify a region where an object in the picture is located. For example: the picture is input into a regional suggestion network, the regional suggestion network identifies an object in the picture, the region where the object is located is given, and the region can be displayed by using a suggestion frame.
Alternatively, in the present embodiment, the suggestion box may be, but is not limited to, a rectangular suggestion box. It should be noted that the shape of the suggestion box may be, but not limited to, any shape, such as: rectangular, circular, oval, diamond, etc., the present embodiment is not limited.
Optionally, in this embodiment, the detection network is used to detect an object from a picture and may, but is not limited to, use the Fast R-CNN detection model for detection.
Through the above steps, a suggestion box is generated through the region suggestion network, the detection network is acquired, and the region suggestion network and the detection network are fine-tuned according to the suggestion box, so that the target detection network and the target region suggestion network share the same convolution layers. The two networks can thus be unified, which improves the accuracy of object detection and solves the technical problem of low object detection accuracy.
Optionally, to unify the region suggestion network and the detection network, the two networks may be trained with the suggestion box kept fixed, alternating between fine-tuning the region suggestion network and fine-tuning the detection network. For example: in the above step S106, the suggestion box may be kept fixed, and the region suggestion network and the detection network may be alternately fine-tuned to obtain the target detection network and the target region suggestion network with the same shared convolution layers.
Optionally, the region suggestion network and the detection network may be fine-tuned as follows: keeping the suggestion box fixed, the convolution layers unique to the region suggestion network are fine-tuned until the detection network and the region suggestion network share convolution layers, so as to obtain the target region suggestion network; and keeping the shared convolution layers fixed, the FC layers of the detection network are fine-tuned so that the detection network and the region suggestion network have the same shared convolution layers, so as to obtain the target detection network.
Alternatively, the obtained target picture may be used as input of the target area suggestion network and the target detection network to detect the target object in the target picture. For example: in the step S108, a target picture may be acquired, and then the target picture is input into the target area suggestion network and the target detection network to obtain the target object.
Optionally, the obtained target picture may first be used as input to the target region suggestion network to generate a target suggestion box on the target picture, and the picture carrying the target suggestion box may then be used as input to the target detection network to obtain the detected target object. For example: the target picture is input into the target region suggestion network to obtain the target suggestion box it outputs, which yields a picture carrying the target suggestion box; this picture is then input into the target detection network to obtain the target object detected from the target suggestion box.
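As a concrete illustration of this two-stage inference flow, the following is a minimal, self-contained sketch. The two functions are toy stand-ins for the target region suggestion network and the target detection network (a real implementation would use the trained RPN and Fast R-CNN); their names, the blob-based suggestion logic, and the intensity-based classification are illustrative assumptions only.

```python
import numpy as np

def target_region_suggestion_network(picture):
    """Toy stand-in for the target region suggestion network: returns one
    suggestion box (x1, y1, x2, y2) bounding the bright pixels in the picture."""
    ys, xs = np.nonzero(picture > 0)
    return [(xs.min(), ys.min(), xs.max(), ys.max())] if len(xs) else []

def target_detection_network(picture, boxes):
    """Toy stand-in for the target detection network: classifies the contents
    of each suggestion box (here simply by mean intensity)."""
    labels = []
    for x1, y1, x2, y2 in boxes:
        patch = picture[y1:y2 + 1, x1:x2 + 1]
        labels.append("object" if patch.mean() > 0.5 else "background")
    return labels

# Two-stage flow: picture -> suggestion boxes -> detected objects.
picture = np.zeros((8, 8))
picture[2:5, 3:6] = 1.0  # one bright "object"
boxes = target_region_suggestion_network(picture)
objects = target_detection_network(picture, boxes)
```

The structure, not the toy logic, is the point: the first network only proposes where to look, and the second network classifies what is inside each proposed box.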
Alternatively, the regional suggestion network may be, but is not limited to being, a regional generation network (RPN network), and the detection network may be, but is not limited to being, a Fast regional convolutional neural network (Fast R-CNN network).
Alternatively, the regional suggestion network may be trained first, but not limited to, prior to generating the suggestion box. For example: before the step S102, an initialization training may be performed through an ImageNet pre-training model to obtain a regional suggestion network.
Optionally, in the above step S104, Fast R-CNN may train the ImageNet pre-trained model according to the suggestion box to obtain a detection network that does not yet share convolution layers with the region suggestion network.
In an alternative embodiment, the region suggestion network (RPN) may take an image of arbitrary size as input and output a set of rectangular suggestion boxes, each with an objectness score. The detection network may use a Fast R-CNN model, which detects targets from the high-quality region suggestion boxes generated by the RPN.
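To make the RPN's output concrete: at every position of the convolutional feature map, a fixed set of rectangular boxes of several scales and aspect ratios is proposed, and each is then given an objectness score by a small network head. The sketch below only generates the candidate rectangles; the stride, scales, and ratios are typical values assumed for illustration, not values fixed by this application.

```python
import numpy as np

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Generate one set of candidate boxes (x1, y1, x2, y2) per feature-map cell.

    At every position of the shared convolutional feature map, rectangles of
    several scales and aspect ratios are proposed, centred on that position
    mapped back into input-image coordinates.
    """
    anchors = []
    for cy in range(feat_h):
        for cx in range(feat_w):
            # Centre of this cell in input-image coordinates.
            x, y = cx * stride + stride / 2, cy * stride + stride / 2
            for s in scales:
                for r in ratios:
                    # Width/height chosen so the box area stays ~s*s at ratio r.
                    w, h = s * np.sqrt(r), s / np.sqrt(r)
                    anchors.append([x - w / 2, y - h / 2, x + w / 2, y + h / 2])
    return np.array(anchors)

# A 2x3 feature map with 3 scales * 3 ratios gives 2*3*9 = 54 candidates.
anchors = generate_anchors(2, 3)
```

In the full RPN each of these candidates would receive an objectness score, and the highest-scoring boxes become the suggestion boxes passed to the detection network.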
In order to unify the object detection of the RPN and Fast R-CNN, this alternative embodiment proposes a training scheme for the two networks: keeping the suggestion boxes fixed while alternating between fine-tuning the region suggestion network and fine-tuning the detection network. A 4-step training algorithm may be employed to learn the shared features through alternating optimization. The 4-step training algorithm comprises the following steps:
in the first step, the RPN is trained, the network is initialized with an ImageNet pre-trained model, and fine-tuned end-to-end for regional advice tasks.
In the second step, a separate detection network is trained with Fast R-CNN using the suggestion boxes generated by the RPN of the first step; this detection network may also be initialized with the ImageNet pre-trained model. At this point the two networks do not yet share convolution layers.
In the third step, RPN training is initialized with the detection network, the shared convolution layers are fixed, and only the layers unique to the RPN are fine-tuned. The two networks now share convolution layers.
In the fourth step, the shared convolution layers are kept fixed and the FC layers of Fast R-CNN are fine-tuned. The two networks thus share the same convolution layers and form a unified network.
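The four steps above can be sketched schematically. In this toy sketch, convolution layers and network heads are just lists of arrays and `tune` stands in for gradient-based fine-tuning; all names and shapes are illustrative assumptions, not the patent's implementation. The point is the sharing structure: after the third step the RPN runs on the detector's convolution layers, and the third and fourth steps never modify them.

```python
import numpy as np

rng = np.random.default_rng(0)

def pretrained_convs():
    """Stand-in for ImageNet-pretrained convolutional weights."""
    return [rng.standard_normal((3, 3)) for _ in range(2)]

def tune(weights):
    """Stand-in for gradient updates during fine-tuning."""
    return [w + 0.01 for w in weights]

# Step 1: initialize the RPN from the pre-trained model and fine-tune it
# end-to-end for the region suggestion task.
rpn_convs = tune(pretrained_convs())
rpn_head = tune([rng.standard_normal(3)])

# Step 2: train a separate Fast R-CNN detection network on the suggestion
# boxes from step 1, also initialized from the pre-trained model. At this
# point the two networks do NOT share convolution layers.
det_convs = tune(pretrained_convs())
det_fc = tune([rng.standard_normal(3)])

# Step 3: re-initialize RPN training from the detection network (the RPN's
# own convolutions from step 1 are discarded), fix the shared convolution
# layers, and fine-tune only the RPN-specific layers.
shared_convs = det_convs          # now shared by both networks
rpn_head = tune(rpn_head)         # shared_convs stay untouched

# Step 4: keep the shared convolution layers fixed and fine-tune only the
# fully connected layers of Fast R-CNN. The two networks now form a
# unified network over the same convolution layers.
det_fc = tune(det_fc)             # shared_convs again stay untouched
```

After step 4, a single forward pass through `shared_convs` can serve both the suggestion head and the detection head, which is exactly the unification the scheme is after.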
According to another embodiment of the present application, there is provided an embodiment of an apparatus for object detection, fig. 2 is a block diagram of an apparatus for object detection according to an embodiment of the present application, as shown in fig. 2, including:
the generating module 22 is configured to generate a suggestion box through a region suggestion network, where the region suggestion network is configured to identify a region where an object in the picture is located, and the suggestion box is configured to display the region where the object is located;
an acquisition module 24, coupled to the generation module 22, for acquiring a detection network according to the suggestion frame, wherein the detection network is used for detecting the object from the picture, and there is no shared convolution layer between the regional suggestion network and the detection network;
a trimming module 26, coupled to the obtaining module 24, for trimming the detection network and the area suggestion network according to the suggestion box, to obtain a target detection network and a target area suggestion network having the same shared convolution layer;
a detection module 28, coupled to the fine tuning module 26, is configured to detect the target object in the target picture according to the target detection network and the target area suggestion network.
Optionally, in this embodiment, the above object detection apparatus may be applied to, but is not limited to, object detection scenarios. For example, such objects may include, but are not limited to: tableware, fruit, stationery, kitchen ware, clothing, ornaments, and the like.
Alternatively, in the present embodiment, the above-described object detection apparatus may be applied to, but is not limited to, a terminal device. The terminal device may include, but is not limited to: a cell phone, tablet computer, PC computer, smart wearable device, smart home device, etc.
Optionally, in this embodiment, the above-mentioned region suggestion network is used to identify a region where an object in the picture is located. For example: the picture is input into a regional suggestion network, the regional suggestion network identifies an object in the picture, the region where the object is located is given, and the region can be displayed by using a suggestion frame.
Alternatively, in the present embodiment, the suggestion box may be, but is not limited to, a rectangular suggestion box. It should be noted that the shape of the suggestion box may be, but not limited to, any shape, such as: rectangular, circular, oval, diamond, etc., the present embodiment is not limited.
Optionally, in this embodiment, the detection network is used to detect an object from a picture, and may, but is not limited to, use the Fast RCNN detection model to detect an object.
Through the above apparatus, a suggestion box is generated through the region suggestion network, the detection network is acquired, and the region suggestion network and the detection network are fine-tuned according to the suggestion box, so that the target detection network and the target region suggestion network share the same convolution layers. The two networks can thus be unified, which improves the accuracy of object detection and solves the technical problem of low object detection accuracy.
Fig. 3 is a block diagram two of an object detection apparatus according to an embodiment of the present application, as shown in fig. 3, optionally, the fine adjustment module 26 includes:
and a fine-tuning unit 32, configured to keep the suggestion box fixed and alternately fine-tune the region suggestion network and the detection network, so as to obtain a target detection network and a target region suggestion network with the same shared convolution layers.
Optionally, to unify the region suggestion network and the detection network, the two networks may be trained with the suggestion box kept fixed, alternating between fine-tuning the region suggestion network and fine-tuning the detection network.
Fig. 4 is a block diagram III of an object detection apparatus according to an embodiment of the present application, and as shown in fig. 4, optionally, the fine adjustment unit 32 includes:
a first fine-tuning subunit 42, configured to keep the suggestion box fixed and fine-tune the convolution layers unique to the region suggestion network until the detection network and the region suggestion network share convolution layers, so as to obtain the target region suggestion network;
a second fine-tuning subunit 44, configured to keep the shared convolution layers fixed and fine-tune the FC layers of the detection network so that the detection network and the region suggestion network have the same shared convolution layers, so as to obtain the target detection network.
Fig. 5 is a block diagram of an object detection apparatus according to an embodiment of the present application, and as shown in fig. 5, optionally, the detection module 28 includes:
an acquisition unit 52 for acquiring a target picture;
an input unit 54, coupled to the obtaining unit 52, is configured to input the target picture into the target area suggestion network and the target detection network to obtain the target object.
Optionally, the input unit 54 includes: the first input subunit is used for inputting the target picture into the target area suggestion network to obtain a picture carrying a target suggestion frame; and the second input subunit is used for inputting the picture carrying the target suggestion frame into the target detection network to obtain the target object detected by the target detection network from the target suggestion frame.
Alternatively, the regional advice network may be, but is not limited to, an RPN network and the detection network may be, but is not limited to, a Fast R-CNN network.
Optionally, the apparatus further includes: and the training module is used for carrying out initialization training through the ImageNet pre-training model to obtain the regional suggestion network.
Optionally, the acquisition module 24 is further configured to: train the ImageNet pre-trained model with Fast R-CNN according to the suggestion box to obtain a detection network that does not yet share convolution layers with the region suggestion network.
The foregoing embodiment numbers of the present application are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
In the foregoing embodiments of the present application, each embodiment has its own emphasis; for any part not described in detail in one embodiment, reference may be made to the related descriptions of the other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technology may be implemented in other manners. The above-described apparatus embodiments are merely exemplary. For example, the division of the units is merely a logical function division, and there may be other divisions in actual implementation; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings or direct couplings or communication connections shown or discussed may be implemented through some interfaces, units, or modules, and may be in electrical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present application, in essence, or the part thereof contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
The foregoing is merely a preferred embodiment of the present application. It should be noted that several modifications and refinements may be made by those skilled in the art without departing from the principles of the present application, and such modifications and refinements shall also fall within the protection scope of the present application.

Claims (14)

1. An object detection method, comprising:
generating a suggestion frame through a regional suggestion network, wherein the regional suggestion network is used for identifying a region where an object in a picture is located, and the suggestion frame is used for displaying the region where the object is located;
acquiring a detection network according to the suggestion frame, wherein the detection network is used for detecting the object from the picture, and the regional suggestion network and the detection network do not share a convolution layer;
fine-tuning the detection network and the regional suggestion network according to the suggestion frame to obtain a target detection network and a target regional suggestion network with the same shared convolution layer;
detecting a target object in a target picture according to the target detection network and the target area suggestion network;
wherein fine-tuning the detection network and the regional suggestion network according to the suggestion frame to obtain the target detection network and the target regional suggestion network with the same shared convolution layer comprises: keeping the suggestion frame fixed, and interactively fine-tuning the regional suggestion network and the detection network to obtain the target detection network and the target regional suggestion network with the same shared convolution layer;
wherein keeping the suggestion frame fixed and interactively fine-tuning the regional suggestion network and the detection network comprises: keeping the suggestion frame fixed, and fine-tuning the convolution layer unique to the regional suggestion network until the detection network and the regional suggestion network have a shared convolution layer, so as to obtain the target regional suggestion network; and keeping the shared convolution layer fixed, and fine-tuning the FC layer of the detection network until the detection network and the regional suggestion network have the same shared convolution layer, so as to obtain the target detection network.
2. The method of claim 1, wherein the regional suggestion network is an RPN network and the detection network is a Fast R-CNN network.
3. The method of claim 2, wherein prior to generating the suggestion box over the regional suggestion network, the method further comprises:
and carrying out initialization training through an ImageNet pre-training model to obtain the RPN.
4. The method of claim 2, wherein obtaining the detection network according to the suggestion box comprises:
training an ImageNet pre-training model through Fast R-CNN according to the suggestion frame to obtain the Fast R-CNN network, wherein the Fast R-CNN network and the RPN network do not share a convolution layer.
5. The method according to any one of claims 1 to 4, wherein detecting the target object in the target picture according to the target detection network and the target area suggestion network comprises:
acquiring the target picture;
and inputting the target picture into the target area suggestion network and the target detection network to obtain the target object.
6. The method of claim 5, wherein inputting the target picture into the target region suggestion network and the target detection network to obtain the target object comprises:
inputting the target picture into the target area suggestion network to obtain a picture carrying a target suggestion frame;
and inputting the picture carrying the target suggestion frame into the target detection network to obtain the target object detected by the target detection network from the target suggestion frame.
7. An object detection apparatus, comprising:
the generation module is used for generating a suggestion frame through a regional suggestion network, wherein the regional suggestion network is used for identifying the region where the object in the picture is located, and the suggestion frame is used for displaying the region where the object is located;
the acquisition module is used for acquiring a detection network according to the suggestion frame, wherein the detection network is used for detecting the object from the picture, and the regional suggestion network and the detection network do not share a convolution layer;
the fine tuning module is used for fine tuning the detection network and the regional suggestion network according to the suggestion frame to obtain a target detection network and a target regional suggestion network with the same shared convolution layer;
the detection module is used for detecting a target object in a target picture according to the target detection network and the target area suggestion network;
wherein the fine tuning module comprises: a fine tuning unit for keeping the suggestion frame fixed, interactively fine tuning the area suggestion network and the detection network to obtain the target detection network and the target area suggestion network with the same shared convolution layer,
wherein the fine tuning unit includes: a first fine-tuning subunit, configured to keep the suggestion frame fixed and fine-tune the convolution layer unique to the regional suggestion network until the detection network and the regional suggestion network have a shared convolution layer, so as to obtain the target regional suggestion network; and a second fine-tuning subunit, configured to keep the shared convolution layer fixed and fine-tune the FC layer of the detection network until the detection network and the regional suggestion network have the same shared convolution layer, so as to obtain the target detection network.
8. The apparatus of claim 7, wherein the regional suggestion network is an RPN network and the detection network is a Fast R-CNN network.
9. The apparatus of claim 8, wherein the apparatus further comprises:
and the initialization training module is used for performing initialization training through an ImageNet pre-training model to obtain the RPN.
10. The apparatus of claim 8, wherein the acquisition module is configured to: train an ImageNet pre-training model through Fast R-CNN according to the suggestion frame to obtain the Fast R-CNN network, wherein the Fast R-CNN network and the RPN network do not share a convolution layer.
11. The apparatus according to any one of claims 7 to 10, wherein the detection module comprises:
an acquisition unit, configured to acquire the target picture;
and the input unit is used for inputting the target picture into the target area suggestion network and the target detection network to obtain the target object.
12. The apparatus of claim 11, wherein the input unit comprises:
the first input subunit is used for inputting the target picture into the target area suggestion network to obtain a picture carrying a target suggestion frame;
and the second input subunit is used for inputting the picture carrying the target suggestion frame into the target detection network to obtain the target object detected by the target detection network from the target suggestion frame.
13. A storage medium comprising a stored program, wherein the program, when run, controls a device in which the storage medium is located to perform the method of any one of claims 1 to 6.
14. A processor for running a program, wherein the program when run performs the method of any one of claims 1 to 6.
CN201711133580.1A 2017-11-15 2017-11-15 Object detection method, device, storage medium and processor Active CN109784131B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201711133580.1A CN109784131B (en) 2017-11-15 2017-11-15 Object detection method, device, storage medium and processor
PCT/CN2018/079852 WO2019095596A1 (en) 2017-11-15 2018-03-21 Object detection method, device, storage medium and processor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711133580.1A CN109784131B (en) 2017-11-15 2017-11-15 Object detection method, device, storage medium and processor

Publications (2)

Publication Number Publication Date
CN109784131A CN109784131A (en) 2019-05-21
CN109784131B true CN109784131B (en) 2023-08-22

Family

ID=66495370

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711133580.1A Active CN109784131B (en) 2017-11-15 2017-11-15 Object detection method, device, storage medium and processor

Country Status (2)

Country Link
CN (1) CN109784131B (en)
WO (1) WO2019095596A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111340092B (en) * 2020-02-21 2023-09-22 浙江大华技术股份有限公司 Target association processing method and device
US20230237788A1 (en) * 2020-04-15 2023-07-27 Aselsan Elektronik Sanayi Ve Ticaret Anonim Sirketi Method for training shallow convolutional neural networks for infrared target detection using a two-phase learning strategy

Citations (5)

Publication number Priority date Publication date Assignee Title
CN106250812A (en) * 2016-07-15 2016-12-21 汤平 A kind of model recognizing method based on quick R CNN deep neural network
CN106372577A (en) * 2016-08-23 2017-02-01 北京航空航天大学 Deep learning-based traffic sign automatic identifying and marking method
CN106599939A (en) * 2016-12-30 2017-04-26 深圳市唯特视科技有限公司 Real-time target detection method based on region convolutional neural network
CN106910188A (en) * 2017-02-16 2017-06-30 苏州中科天启遥感科技有限公司 The detection method of airfield runway in remote sensing image based on deep learning
CN107194318A (en) * 2017-04-24 2017-09-22 北京航空航天大学 The scene recognition method of target detection auxiliary

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US9858496B2 (en) * 2016-01-20 2018-01-02 Microsoft Technology Licensing, Llc Object detection and classification in images


Non-Patent Citations (1)

Title
Pedestrian Detection Based on Region Proposal Network; Wang Qinfang; Ying Na; Communications Technology (03); full text *

Also Published As

Publication number Publication date
CN109784131A (en) 2019-05-21
WO2019095596A1 (en) 2019-05-23

Similar Documents

Publication Publication Date Title
EP2950551B1 (en) Method for recommending multimedia resource and apparatus thereof
CN106055710A (en) Video-based commodity recommendation method and device
CN105574910A (en) Electronic Device and Method for Providing Filter in Electronic Device
CN108037823B (en) Information recommendation method, Intelligent mirror and computer readable storage medium
CN105938557A (en) Image recognition method and image recognition device
CN106202316A (en) Merchandise news acquisition methods based on video and device
CN106575450A (en) Augmented reality content rendering via albedo models, systems and methods
CN108986197B (en) 3D skeleton line construction method and device
US20140279341A1 (en) Method and system to utilize an intra-body area network
CN109993824B (en) Image processing method, intelligent terminal and device with storage function
CN109492607B (en) Information pushing method, information pushing device and terminal equipment
CN106202304A (en) Method of Commodity Recommendation based on video and device
CN109784131B (en) Object detection method, device, storage medium and processor
CN110858279A (en) Food material identification method and device
JP6941800B2 (en) Emotion estimation device, emotion estimation method and program
CN108876484A (en) Method of Commodity Recommendation and device
WO2015153240A1 (en) Directed recommendations
CN112906806A (en) Data optimization method and device based on neural network
CN109658501B (en) Image processing method, image processing device and terminal equipment
CN109726632A (en) Background recommended method and Related product
CN110264544B (en) Picture processing method and device, storage medium and electronic device
CN106358006A (en) Video correction method and video correction device
CN111104952A (en) Method, system and device for identifying food types and refrigerator
CN109697746A (en) Self-timer video cartoon head portrait stacking method and Related product
CN113869330A (en) Underwater fish target detection method and device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant