CN109670517A - Object detection method, device, electronic equipment and target detection model - Google Patents
- Publication number
- Publication number: CN109670517A (application number CN201811587447.8A)
- Authority
- CN
- China
- Prior art keywords
- network
- target detection
- feature extraction
- image
- sub
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The present invention provides an object detection method, an apparatus, an electronic device and a target detection model, belonging to the technical field of image detection. The object detection method includes: extracting a feature map of an image to be detected through a feature extraction network, and performing target detection based on the feature map. The feature extraction network includes multiple convolutional layers, at least one of which includes one or more structural units; each structural unit includes at least two parallel channel branches, and a concatenation unit and a channel shuffle unit connected to the tail ends of the channel branches. Using channel branches improves the execution speed of the network, while channel shuffling enables information exchange between the branches, preserving the detection precision and accuracy of the network. For an equal amount of computation, the above feature extraction network has the best feature extraction speed. Therefore, the object detection method provided by the embodiments of the present invention can improve detection speed and save time while guaranteeing detection accuracy.
Description
Technical field
The invention belongs to the technical field of image detection, and in particular relates to an object detection method, an apparatus, an electronic device and a target detection model.
Background technique
As electronic devices become increasingly intelligent, target detection is widely applied in many fields; it can detect whether a target object is present in an image and, if so, the position of the target object. To improve the precision of target detection, the neural network models currently used for it are mostly large-scale models such as ResNet and GoogLeNet. Because the computational cost of these networks is very high, their execution speed is slow and they take a substantial amount of time.
Summary of the invention
In view of this, the purpose of the present invention is to provide an object detection method, an apparatus, an electronic device and a target detection model that can improve the speed of target detection and save time.
To achieve the above goals, the embodiments of the present invention adopt the following technical solutions:
In a first aspect, an embodiment of the present invention provides an object detection method, comprising:
performing feature extraction on an image to be detected through a feature extraction network to obtain a feature map of the image to be detected, wherein the feature extraction network includes multiple convolutional layers, at least one of the multiple convolutional layers includes one or more structural units, and each structural unit includes at least two parallel channel branches and a concatenation unit and a channel shuffle unit connected to the tail ends of the channel branches; and
inputting the feature map into a target detection network for target detection.
With reference to the first aspect, an embodiment of the present invention provides a first possible implementation of the first aspect, wherein at least one channel branch in each structural unit includes multiple convolution units, and the multiple convolution units include at least one depthwise convolution unit with a convolution kernel of a preset size.
With reference to the first possible implementation of the first aspect, an embodiment of the present invention provides a second possible implementation of the first aspect, wherein the convolution kernel of the preset size is a 3*3 convolution kernel.
With reference to the first aspect, an embodiment of the present invention provides a third possible implementation of the first aspect, wherein the feature extraction network includes a convolutional layer with a stride of 1; in the convolutional layer with a stride of 1, each structural unit includes a channel split unit connected to the head ends of the at least two channel branches.
With reference to the first aspect, an embodiment of the present invention provides a fourth possible implementation of the first aspect, wherein the target detection network includes a classification sub-network and/or a regression sub-network; the classification sub-network is used to determine, based on the feature map of the image to be detected, whether the image to be detected contains a target object; the regression sub-network is used to determine, based on the feature map of the image to be detected, the position of the target object in the image to be detected.
With reference to the first aspect, an embodiment of the present invention provides a fifth possible implementation of the first aspect, wherein the target detection network includes a large-kernel depthwise separable convolutional layer, a pooling layer and a fully connected layer connected in sequence, and a classification sub-network and a regression sub-network connected in parallel to the fully connected layer; the step of inputting the feature map into the target detection network for target detection comprises:
passing the feature map sequentially through the large-kernel depthwise separable convolutional layer, the pooling layer and the fully connected layer to obtain the feature data output by the fully connected layer;
inputting the feature data into the classification sub-network and the regression sub-network respectively, to obtain the classification result output by the classification sub-network and the regression result output by the regression sub-network; and
combining the classification result and the regression result to output a target detection result.
In a second aspect, an embodiment of the present invention further provides a target detection model, including a feature extraction network and a target detection network connected to the feature extraction network; the feature extraction network includes multiple convolutional layers, at least one of the multiple convolutional layers includes one or more structural units, and each structural unit includes at least two parallel channel branches and a concatenation unit and a channel shuffle unit connected to the tail ends of the channel branches.
With reference to the second aspect, an embodiment of the present invention provides a first possible implementation of the second aspect, wherein the target detection network includes a classification sub-network and/or a regression sub-network.
With reference to the second aspect, an embodiment of the present invention provides a second possible implementation of the second aspect, wherein the target detection network includes a large-kernel depthwise separable convolutional layer, a pooling layer and a fully connected layer connected in sequence, and a classification sub-network and a regression sub-network connected in parallel to the fully connected layer.
In a third aspect, an embodiment of the present invention provides an object detection apparatus, comprising:
a feature extraction module, configured to perform feature extraction on an image to be detected through a feature extraction network to obtain a feature map of the image to be detected, wherein the feature extraction network includes multiple convolutional layers, at least one of the multiple convolutional layers includes one or more structural units, and each structural unit includes at least two parallel channel branches and a concatenation unit and a channel shuffle unit connected to the tail ends of the channel branches; and
a target detection module, configured to input the feature map into a target detection network for target detection.
In a fourth aspect, an embodiment of the present invention provides an electronic device, including an image acquisition apparatus, a memory and a processor;
the image acquisition apparatus is configured to acquire image data;
the memory stores a computer program executable on the processor, and the processor, when executing the computer program, implements the steps of the method according to any one of the above first aspect.
In a fifth aspect, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the method according to any one of the above first aspect are executed.
In the object detection method, apparatus, electronic device and target detection model provided by the embodiments of the present invention, a feature map of an image to be detected is extracted by a feature extraction network, and target detection is performed based on the feature map. The feature extraction network includes multiple convolutional layers, at least one of which includes one or more structural units; each structural unit includes at least two parallel channel branches and a concatenation unit and a channel shuffle unit connected to the tail ends of the channel branches. Using channel branches improves the execution speed of the network, while channel shuffling enables information exchange between the branches, preserving the detection precision and accuracy of the network. For an equal amount of computation, the feature extraction network provided by the embodiments of the present invention has the best feature extraction speed. Therefore, the object detection method provided by the embodiments of the present invention can improve detection speed and save time while guaranteeing detection accuracy.
Other features and advantages of the disclosure will be described in the following description; alternatively, some features and advantages can be inferred or unambiguously determined from the description, or learnt by implementing the above techniques of the disclosure.
To make the above objects, features and advantages of the present invention clearer and more comprehensible, preferred embodiments are described in detail below with reference to the appended drawings.
Detailed description of the invention
To illustrate the technical solutions of the specific embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the specific embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 shows a schematic structural diagram of an electronic device provided by an embodiment of the present invention;
Fig. 2 shows a flow chart of an object detection method provided by an embodiment of the present invention;
Fig. 3 shows a schematic diagram of the internal structure of a feature extraction network provided by an embodiment of the present invention;
Fig. 4 shows a schematic diagram of the internal structure of another feature extraction network provided by an embodiment of the present invention;
Fig. 5 shows a schematic structural diagram of a target detection network provided by an embodiment of the present invention;
Fig. 6 shows a structural block diagram of an object detection apparatus provided by an embodiment of the present invention.
Specific embodiment
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention are described below clearly and completely with reference to the drawings. Obviously, the described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
To solve the prior-art problem that the large computational cost of the neural network models used for target detection makes their calculation speed very slow, embodiments of the present invention provide an object detection method, an apparatus, an electronic device and a target detection model, which are described in detail below with reference to the drawings and specific embodiments.
Embodiment one:
First, an exemplary electronic device 100 for implementing the object detection method of the embodiments of the present invention is described with reference to Fig. 1. The exemplary electronic device 100 may be a mobile terminal such as a smart phone, a tablet computer or a camera; it may also be other equipment such as the server of an identity verification device (e.g. an attendance recorder or an ID-verification all-in-one machine), a monitor, or a monitoring center.
As shown in Fig. 1, the electronic device 100 includes one or more processors 102, one or more memories 104, an input device 106 and an output device 108, and may further include an image acquisition device 110; these components are interconnected by a bus system 112 and/or other forms of connection mechanism (not shown). It should be noted that the components and structure of the electronic device 100 shown in Fig. 1 are only exemplary, not restrictive; the electronic device may have other components and structures as needed.
The processor 102 may be a central processing unit (CPU), a graphics processing unit (GPU), or a processing unit of another form with data processing capability, image processing capability and/or instruction execution capability, and can control other components in the electronic device 100 to perform desired functions.
The memory 104 may include one or more computer program products, and the computer program products may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, or flash memory. One or more computer program instructions can be stored on the computer-readable storage medium, and the processor 102 can run the program instructions to realize the functions of the embodiments of the present invention described below (as realized by the processor) and/or other desired functions. Various application programs and various data, such as the various images used and/or generated by the application programs, can also be stored on the computer-readable storage medium.
The input device 106 may be a device used by the user to input instructions, and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.
The output device 108 can output various information (for example, images or sounds) to the outside (for example, a user), and may include one or more of a display, a speaker, and the like.
The image acquisition device 110 can shoot images desired by the user (such as photos or videos) and store the captured images in the memory 104 for use by other components.
One or more fill lights may also be provided on the electronic device 100, each arranged to correspond to the image acquisition device and used to supplement light for the image acquisition device when ambient light is insufficient and would affect the image acquisition effect. The fill light can be an infrared fill light, such as a near-infrared LED lamp or a laser infrared lamp; it emits invisible infrared light to supplement light for the image acquisition device in a dark environment.
Embodiment two:
This embodiment provides an object detection method that can improve the speed of target detection and save time. Fig. 2 shows the flow chart of the object detection method. It should be noted that the steps shown in the flow chart of Fig. 2 can be executed in a computer system, for example as a set of computer-executable instructions, and although a logical order is shown in the flow chart, in some cases the steps shown or described may be performed in an order different from the one herein. This embodiment is discussed in detail below.
As shown in Fig. 2, the object detection method provided in this embodiment includes the following steps:
Step S202: perform feature extraction on an image to be detected through a feature extraction network to obtain a feature map of the image to be detected.
The image to be detected can be an image acquired in real time by the image acquisition device, or a pre-stored image. In addition, the image to be detected can be an image in a picture format, or an image frame in a video; the embodiment of the present invention is not restricted in this respect. The object detection method provided in this embodiment can detect whether the image to be detected contains a target object, and can also detect the position of the target object. The target object includes, but is not limited to, a face, a pedestrian, a vehicle, an animal or a plant; the target object can also be a part of an animal or a part of a plant.
The network structure of the feature extraction network can be as follows: the feature extraction network may include multiple convolutional layers, which are used to extract the feature map from the image to be detected. In order to improve the speed of the convolution calculation, at least one of the multiple convolutional layers may include a structural unit as shown in Fig. 3 or Fig. 4. The structural unit includes at least two parallel channel branches and a concatenation unit (concat) and a channel shuffle unit connected to the tail ends of the channel branches. For example, in some embodiments a part of the convolutional layers of the feature extraction network include at least one structural unit; in other embodiments, all convolutional layers of the feature extraction network include at least one structural unit. It should be noted that Figs. 3 and 4 only illustrate structural units with two channel branches; in some embodiments a structural unit may include three, four or even more parallel channel branches, with the tail end of each channel branch connected to the concatenation unit.
Step S204: input the feature map into a target detection network for target detection.
An achievable way of performing this step is: input the feature map into the target detection network to obtain the target detection result output by the target detection network. The detection result may include whether the image to be detected contains a target object, and may also include the position of the target object in the image to be detected. The target detection network may include a classification sub-network and/or a regression sub-network; the classification sub-network is used to determine whether the image to be detected contains a target object, and the regression sub-network is used to determine the position of the target object in the image to be detected. When the target detection network includes both a classification sub-network and a regression sub-network, the two sub-networks are arranged in parallel.
The object detection method provided by the embodiments of the present invention extracts a feature map of the image to be detected through the feature extraction network and performs target detection based on the feature map. The feature extraction network includes multiple convolutional layers, at least one of which includes one or more structural units; each structural unit includes at least two parallel channel branches and a concatenation unit and a channel shuffle unit connected to the tail ends of the channel branches. Using channel branches improves the execution speed of the network, while channel shuffling enables information exchange between the branches, preserving the detection precision and accuracy of the network. For an equal amount of computation, the feature extraction network provided by the embodiments of the present invention has the best feature extraction speed. Therefore, the object detection method provided by the embodiments of the present invention can improve detection speed and save time while guaranteeing detection accuracy.
In an alternative embodiment, the above feature extraction network can use ShuffleNetV2 (the second-generation channel shuffle network). ShuffleNetV2 is a lightweight convolutional neural network model; compared to existing large-scale neural network models (such as ResNet and GoogLeNet), ShuffleNetV2 has, for an equal amount of computation, the best feature extraction speed currently available.
To further improve the accuracy of target detection, the ShuffleNetV2 network can be improved by adding convolution kernels of a preset size to the network, enlarging the receptive field of the feature extraction network. For example, in some embodiments the ShuffleNetV2 network may include convolutional layers with a stride of 1 (stride=1); in other embodiments it may include convolutional layers with a stride of 2 (stride=2); and in still other embodiments it may include convolutional layers of different strides, i.e. both stride-1 and stride-2 convolutional layers.
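The receptive-field motivation can be sketched with simple arithmetic (an illustration added here, not part of the patent text): stacking depthwise convolution units with 3*3 kernels enlarges the input region each output pixel sees, so two stacked 3*3 convolutions cover the same 5*5 window as a single 5*5 kernel at lower cost.

```python
def receptive_field(kernel_sizes, strides=None):
    """Receptive field of a stack of square conv layers, using the standard
    formula: rf = 1 + sum of (k_i - 1) * product of the strides before layer i."""
    if strides is None:
        strides = [1] * len(kernel_sizes)
    rf, jump = 1, 1
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump   # each layer widens the field by (k - 1) * jump
        jump *= s              # stride compounds for later layers
    return rf

# Two stacked 3*3 convolutions see the same 5*5 window as one 5*5 kernel,
# with 2 * 9 = 18 weights per channel instead of 25.
assert receptive_field([3, 3]) == receptive_field([5]) == 5
assert receptive_field([3, 3, 3]) == 7
```

This is why adding 3*3 depthwise units enlarges the receptive field while keeping the computation small.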
A convolutional layer with a stride of 1 may include one or more structural units as shown in Fig. 3. The structural unit in this convolutional layer includes two parallel channel branches, whose head ends are connected to a channel split unit; the channel split unit divides the input channels between the two branches. For example, if the input image has c channels, then after the channel split unit the first branch information input to one channel branch has c1 channels, and the second branch information input to the other channel branch has c-c1 channels. In general, if the input image to be detected is an RGB image, its channel number is 3, i.e. the R, G and B channels.
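The channel split step can be sketched as follows (a minimal illustration, assuming a feature map is represented as a list of per-channel planes; the even split c1 = c/2 is one common choice, not mandated by the patent):

```python
def channel_split(feature, c1=None):
    """Split a feature map (a list of channel planes) into two branches.

    The first c1 channels feed branch one; the remaining c - c1 channels
    feed branch two. ShuffleNetV2 typically uses an even split (c1 = c // 2).
    """
    if c1 is None:
        c1 = len(feature) // 2
    return feature[:c1], feature[c1:]

channels = ["ch%d" % i for i in range(4)]
branch1, branch2 = channel_split(channels)
assert branch1 == ["ch0", "ch1"] and branch2 == ["ch2", "ch3"]
```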
Of the two channel branches, one channel branch may include multiple convolution units, including at least one depthwise convolution unit with a convolution kernel of a preset size. The convolution kernel of the preset size can be a 5*5 kernel, or a 3*3 kernel to reduce the amount of computation. The number of depthwise convolution units with a kernel of the preset size can be set according to the actual receptive field requirements. As shown in Fig. 3, the channel branch on the right includes four sequentially connected convolution units whose kernel sizes are 1*1, 3*3, 3*3 and 1*1 respectively, including two depthwise convolution units with kernels of the preset size (3*3); the two 1*1 convolution units use ReLU as the activation function. The first branch information input to the left channel branch remains unchanged, while the second branch information input to the right channel branch passes through the multiple convolutions to produce the second branch output, whose channel number is the same as that of the input second branch information. The concatenation unit stitches the first branch information and the second branch output together, so the number of output channels of the structural unit equals its number of input channels; the channel count remains unchanged. The output of the concatenation unit then passes through the channel shuffle unit, which rearranges the channels so that information can be exchanged between the two channel branches, avoiding the loss of network expressive power that would result from poor information flow between the branches.
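The channel shuffle operation itself is simple to illustrate (a sketch, not the patent's implementation): after concatenation, the first half of the channels all come from one branch; shuffling interleaves them so that the next structural unit sees channels from both branches.

```python
def channel_shuffle(channels, groups=2):
    """Interleave channels across `groups` branches.

    Equivalent to reshaping the channel axis to (groups, n // groups),
    transposing, and flattening: after concat, each branch's channels end
    up spread across the whole output, enabling cross-branch mixing.
    """
    n = len(channels)
    assert n % groups == 0
    per = n // groups
    return [channels[g * per + i] for i in range(per) for g in range(groups)]

# Channels 0-3 came from branch one, 4-7 from branch two (after concat);
# the shuffle alternates them.
out = channel_shuffle(list(range(8)), groups=2)
assert out == [0, 4, 1, 5, 2, 6, 3, 7]
```

The operation is a pure permutation, so it adds essentially no computation while restoring information flow between the branches.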
A convolutional layer with a stride of 2 may include one or more structural units as shown in Fig. 4. The structural unit in this convolutional layer includes two parallel channel branches, each of which includes multiple convolution units, including at least one depthwise convolution unit with a convolution kernel of a preset size. Likewise, the kernel of the preset size can be 5*5 or 3*3, and the number of depthwise convolution units with the preset kernel size can be set according to the actual receptive field requirements. As shown in Fig. 4, the channel branch on the right includes four sequentially connected convolution units with kernel sizes 1*1, 3*3, 3*3 and 1*1, including two depthwise convolution units with kernels of the preset size (3*3); the channel branch on the left includes two sequentially connected convolution units with kernel sizes 3*3 and 1*1, including one depthwise convolution unit with a kernel of the preset size (3*3).
The first branch information input to the left channel branch produces the first branch output after convolution, and the second branch information input to the right channel branch produces the second branch output after multiple convolutions. The concatenation unit stitches the first branch output and the second branch output together; since the stride of this convolutional layer is 2, the number of channels output after concatenation is twice the number of channels input to the structural unit. The output of the concatenation unit then passes through the channel shuffle unit, which rearranges the channels so that information can be exchanged between the two channel branches, avoiding the loss of network expressive power that would result from poor information flow between the branches.
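The channel bookkeeping of the stride-2 unit can be sketched as follows (illustrative shapes only; the assumption that each branch preserves the input channel count matches the doubling described above):

```python
def stride2_unit_shape(c_in, h, w):
    """Output shape of a stride-2 structural unit (channel bookkeeping only).

    There is no channel split here: both branches see the full input,
    downsample the spatial size by 2, and each emits c_in channels, so
    concatenation doubles the channel count.
    """
    branch_out = (c_in, h // 2, w // 2)   # each branch after its stride-2 conv
    c, bh, bw = branch_out
    return (2 * c, bh, bw)                # after concat + channel shuffle

assert stride2_unit_shape(24, 56, 56) == (48, 28, 28)
```

The halved spatial resolution compensates for the doubled channel count, keeping the computational cost of successive stages roughly balanced.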
To further improve the speed of target detection, in the embodiments of the present invention the target detection network can use the Light-Head R-CNN network. The specific network structure of the target detection network can be as shown in Fig. 5, including a large-kernel depthwise separable convolutional layer (large separable convolution), a pooling layer and a fully connected layer (FC) connected in sequence, and a classification sub-network (classification subnet) and a regression sub-network (location subnet) connected in parallel to the fully connected layer. The feature map output by the feature extraction network passes sequentially through the large-kernel depthwise separable convolutional layer, the pooling layer and the fully connected layer to obtain the feature data output by the fully connected layer; the feature data is input into the classification sub-network and the regression sub-network respectively, to obtain the classification result output by the classification sub-network and the regression result output by the regression sub-network; and the classification result and the regression result are combined to output the target detection result.
Specifically, the feature map passes through the large-kernel depthwise separable convolutional layer to obtain a thin feature map (thinner feature map), which contains multiple candidate regions of different sizes, or position-sensitive candidate regions. The pooling layer can use a position-sensitive candidate region pooling layer (PSROI pooling) or a candidate region pooling layer (ROI pooling); its role is to adjust the multiple candidate regions of different sizes to a fixed size. The thin feature map is input into the pooling layer, and the output of the pooling layer then passes through the fully connected layer to obtain the feature data. The feature data is input into the classification sub-network and the regression sub-network respectively to output the target detection result. In this target detection network, because the large-kernel depthwise separable convolutional layer is used, there is no longer any need to compute a score matrix over all categories for every region when determining candidate regions; the heavy head structure and heavy tail structure are thus avoided, which greatly reduces computational complexity and improves execution speed.
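Why the large-kernel separable convolution is cheap can be shown with a parameter count (an illustration with assumed sizes, not figures from the patent): a k*k kernel is factorized into a k*1 convolution followed by a 1*k convolution through a thin middle layer.

```python
def conv_params(k_h, k_w, c_in, c_out):
    """Weight count of a single conv layer (bias terms ignored)."""
    return k_h * k_w * c_in * c_out

def large_separable_params(k, c_in, c_mid, c_out):
    """Light-Head-style factorization: a k*1 conv into a thin c_mid layer,
    then a 1*k conv up to c_out (one of the two symmetric paths)."""
    return conv_params(k, 1, c_in, c_mid) + conv_params(1, k, c_mid, c_out)

# A dense 15*15 conv from 256 to 490 channels vs. its separable
# factorization through a 64-channel middle layer (illustrative sizes).
dense = conv_params(15, 15, 256, 490)
separable = large_separable_params(15, 256, 64, 490)
assert separable < dense   # far fewer weights for the same kernel extent
```

The same large spatial extent is covered with a fraction of the weights, which is what makes the "thin" head of the detection network fast.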
The classification sub-network includes multiple convolutional layers and is mainly used for target classification. The fused feature map is input into the classification sub-network, which judges whether a target object appears in the input fused feature map and outputs the probability that a target object appears, i.e. the probability that a target object appears in the image to be detected. For example, in a face detection task, the classification sub-network can output the detection result "whether a face is present".
The regression sub-network also includes multiple convolutional layers and is mainly used for target localization; the target localization task may also be regarded as a regression task. The fused feature map is input into the regression sub-network, which determines the position of the target object in the input fused feature map, that is, the position of the target object in the image to be detected. The regression sub-network may output a rectangular bounding box marking the position of the target object. For example, in a face detection task, the regression sub-network may output "the regression box coordinates of the face"; the regression box is the rectangular bounding box of the face predicted by the regression sub-network and characterizes the specific location of the face.
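For illustration, the head described above (large-kernel depthwise separable convolution producing a thin feature map, then pooling to a fixed size, a fully connected layer, and parallel classification and regression sub-networks) can be sketched in PyTorch. This is a hypothetical reconstruction, not the patented implementation: the channel counts, the kernel size `k`, the use of adaptive average pooling as a stand-in for (PS)ROI pooling, and the factoring of the large kernel as k×1 followed by 1×k (as in the cited Light-Head R-CNN paper) are all assumptions.

```python
import torch
import torch.nn as nn


class LightHeadSketch(nn.Module):
    """Illustrative sketch of the detection head: a large-kernel separable
    convolution yields a thin feature map, which is pooled to a fixed size,
    passed through a fully connected layer, and fed to parallel
    classification and regression sub-networks. All sizes are assumptions."""

    def __init__(self, in_ch=256, mid_ch=64, thin_ch=10, k=15,
                 pool_size=7, num_classes=2):
        super().__init__()
        # Large-kernel separable convolution, factored as k x 1 then 1 x k.
        self.sep = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, (k, 1), padding=(k // 2, 0)),
            nn.Conv2d(mid_ch, thin_ch, (1, k), padding=(0, k // 2)),
        )
        # Stand-in for (PS)ROI pooling: adjust regions to a fixed size.
        self.pool = nn.AdaptiveAvgPool2d(pool_size)
        self.fc = nn.Linear(thin_ch * pool_size * pool_size, 256)
        self.cls = nn.Linear(256, num_classes)  # classification sub-network
        self.reg = nn.Linear(256, 4)            # regression sub-network (box)

    def forward(self, feat):
        thin = self.sep(feat)             # thin feature map
        x = self.fc(self.pool(thin).flatten(1))
        return self.cls(x), self.reg(x)   # classification and regression results
```

In an actual two-stage detector the pooling step would operate per candidate region; the adaptive pooling here only mimics its fixed-size output for a whole feature map.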
In summary, because the target detection network adopts a Light-Head R-CNN structure, the speed of target detection can be further improved without any loss of detection accuracy.
To allow the feature extraction network and the target detection network to be applied directly to target detection on the image to be detected and to output more accurate and reliable results, the feature extraction network and the target detection network need to be trained in advance. The training process of the feature extraction network and the target detection network is described in detail below.
A training image sample set is obtained; the training image sample set includes multiple training images. The feature extraction network and the target detection network are trained using the training sample set.
Optionally, a training image is randomly selected from the training image sample set; the training image is input into the feature extraction network to obtain the feature map of the training image; the feature map of the training image is input into the target detection network to obtain the detection result of the training image. The detection result of the training image is compared with the manually annotated label, and a loss value is calculated using a preset loss function. The loss value measures how close the actual output is to the desired output: the smaller the loss value, the closer the actual output is to the desired output. A back-propagation algorithm may be used to adjust the parameters of the feature extraction network and the target detection network according to the loss value; when the loss value converges to a preset desired value, the training of the feature extraction network and the target detection network is complete, and the current parameters are taken as the parameters of the feature extraction network and the target detection network.
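The training procedure above (forward pass through both networks, loss against the manually annotated label, back-propagation, parameter update until the loss converges) can be sketched as a single training step in PyTorch. The loss function, optimizer, and target format below are illustrative assumptions, not the patent's prescription.

```python
import torch
import torch.nn as nn


def train_step(feature_net, detect_net, image, target, optimizer,
               loss_fn=nn.MSELoss()):
    """One hypothetical training step: forward the training image through
    the feature extraction network and the target detection network,
    compare the detection result with the annotated label via a preset
    loss function, and back-propagate to adjust both networks' parameters."""
    optimizer.zero_grad()
    feature_map = feature_net(image)      # feature map of the training image
    prediction = detect_net(feature_map)  # detection result
    loss = loss_fn(prediction, target)    # closeness of actual vs. desired output
    loss.backward()                       # back-propagation of the loss value
    optimizer.step()                      # adjust parameters of both networks
    return loss.item()
```

In practice the step would be repeated over randomly selected training images until the loss value converges to the preset desired value.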
Embodiment three:
Corresponding to the above object detection method, this embodiment provides a target detection model. The target detection model includes a feature extraction network and a target detection network connected to the feature extraction network.
The feature extraction network is used to extract the feature map of the image to be detected. The feature extraction network may include multiple convolutional layers, at least one of which includes one or more structural units; each structural unit includes at least two parallel channel branches, and a concatenation unit and a channel rearrangement unit connected to the tail ends of the channel branches. In an alternative embodiment, as shown in Fig. 3, the feature extraction network includes a convolutional layer with a stride of 1; each structural unit in the stride-1 convolutional layer includes a channel split unit connected to the head ends of the at least two channel branches. One of the at least two channel branches includes multiple convolution units, which include at least one depthwise convolution unit with a convolution kernel of a preset size; the preset-size convolution kernel may be a 3*3 convolution kernel. In an alternative embodiment, as shown in Fig. 4, the feature extraction network includes a convolutional layer with a stride of 2; each structural unit in the stride-2 convolutional layer includes at least two connected channel branches, each channel branch includes multiple convolution units, and the convolution units include at least one depthwise convolution unit with a convolution kernel of a preset size; the preset-size convolution kernel may be a 3*3 convolution kernel.
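The stride-1 structural unit described above (a channel split unit feeding two parallel channel branches, one branch containing a 3*3 depthwise convolution unit, followed by concatenation and channel rearrangement) resembles a ShuffleNet V2 unit, which the non-patent citations suggest. A minimal PyTorch sketch follows; the channel counts and the exact branch composition are assumptions for illustration only.

```python
import torch
import torch.nn as nn


def channel_shuffle(x, groups=2):
    """Channel rearrangement unit: interleave the channels of the
    concatenated branches so information is exchanged between them."""
    n, c, h, w = x.shape
    return (x.view(n, groups, c // groups, h, w)
             .transpose(1, 2)
             .reshape(n, c, h, w))


class StrideOneUnit(nn.Module):
    """Hypothetical stride-1 structural unit: channel split at the head of
    two parallel branches, one branch with a 3*3 depthwise convolution,
    then concatenation and channel rearrangement at the tail."""

    def __init__(self, channels):
        super().__init__()
        half = channels // 2
        self.branch = nn.Sequential(
            nn.Conv2d(half, half, 1), nn.ReLU(inplace=True),
            # depthwise convolution: one 3*3 filter per channel (groups=half)
            nn.Conv2d(half, half, 3, padding=1, groups=half),
            nn.Conv2d(half, half, 1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        a, b = x.chunk(2, dim=1)                     # channel split unit
        out = torch.cat((a, self.branch(b)), dim=1)  # concatenation unit
        return channel_shuffle(out)                  # channel rearrangement unit
```

The identity path on one branch keeps the unit cheap, while the shuffle prevents the two branches from becoming isolated sub-networks, matching the stated goal of fast execution without losing accuracy.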
As shown in Fig. 5, the target detection network includes a sequentially connected large-kernel depthwise separable convolutional layer, pooling layer, and fully connected layer, and a parallel classification sub-network and regression sub-network connected to the fully connected layer. The feature map of the image to be detected is passed sequentially through the large-kernel depthwise separable convolutional layer, the pooling layer, and the fully connected layer to obtain the feature data output by the fully connected layer. The classification sub-network is used to classify the feature data, determine whether the feature map contains a target object, and output a classification result. The regression sub-network is used to perform regression on the feature data, determine the position of the target object, and output a regression result; the classification result and the regression result are combined to obtain the object detection result.
Embodiment four:
Corresponding to the above method embodiment, this embodiment provides an object detection device. Fig. 6 shows a structural schematic diagram of the object detection device, which includes:
a feature extraction module 61, configured to perform feature extraction on the image to be detected through the feature extraction network to obtain the feature map of the image to be detected; the feature extraction network includes multiple convolutional layers, at least one of which includes one or more structural units; each structural unit includes at least two parallel channel branches, and a concatenation unit and a channel rearrangement unit connected to the tail ends of the channel branches; and
a target detection module 62, configured to input the feature map into the target detection network for target detection.
At least one channel branch in each structural unit includes multiple convolution units, which include at least one depthwise convolution unit with a convolution kernel of a preset size. The preset-size convolution kernel is a 3*3 convolution kernel.
In an alternative embodiment, the feature extraction network includes a convolutional layer with a stride of 1; each structural unit in the stride-1 convolutional layer includes a channel split unit connected to the head ends of the at least two channel branches. One of the at least two channel branches includes multiple convolution units, which include at least one depthwise convolution unit with a convolution kernel of a preset size.
In an alternative embodiment, the feature extraction network includes a convolutional layer with a stride of 2; each structural unit in the stride-2 convolutional layer includes at least two connected channel branches, each channel branch includes multiple convolution units, and the convolution units include at least one depthwise convolution unit with a convolution kernel of a preset size.
In some embodiments, the target detection network includes a classification sub-network and/or a regression sub-network; the classification sub-network is used to determine, based on the feature map of the image to be detected, whether the image to be detected contains a target object; the regression sub-network is used to determine, based on the feature map of the image to be detected, the position of the target object in the image to be detected.
In other embodiments, the target detection network includes a sequentially connected large-kernel depthwise separable convolutional layer, pooling layer, and fully connected layer, and a parallel classification sub-network and regression sub-network connected to the fully connected layer. The target detection module 62 may further be configured to: pass the feature map sequentially through the large-kernel depthwise separable convolutional layer, the pooling layer, and the fully connected layer to obtain the feature data output by the fully connected layer; input the feature data into the classification sub-network and the regression sub-network respectively to obtain the classification result output by the classification sub-network and the regression result output by the regression sub-network; and combine the classification result and the regression result to output the object detection result.
In an alternative embodiment, the above object detection device may further include a training module connected to the feature extraction module 61, configured to obtain a training image sample set, the training image sample set including multiple training images, and to train the feature extraction network and the target detection network using the training sample set.
The training module may further be configured to: randomly select a training image from the training image sample set; input the training image into the feature extraction network to obtain the feature map of the training image; input the feature map of the training image into the target detection network to obtain the detection result of the training image; compare the detection result of the training image with the manually annotated label and calculate a loss value using a preset loss function, where the loss value measures how close the actual output is to the desired output (the smaller the loss value, the closer the actual output is to the desired output); and use a back-propagation algorithm to adjust the parameters of the feature extraction network and the target detection network according to the loss value. When the loss value converges to a preset desired value, the training of the feature extraction network and the target detection network is complete, and the current parameters are taken as the parameters of the feature extraction network and the target detection network.
An embodiment of the present invention provides an object detection device, which extracts the feature map of the image to be detected through the feature extraction network and performs target detection based on the feature map. The feature extraction network includes multiple convolutional layers, at least one of which includes one or more structural units; each structural unit includes at least two parallel channel branches, and a concatenation unit and a channel rearrangement unit connected to the tail ends of the channel branches. The use of channel branches improves the execution speed of the network, and the channel rearrangement enables information exchange between the channel branches, guaranteeing the detection precision and accuracy of the network. Therefore, the object detection device provided by the embodiment of the present invention can improve detection speed while guaranteeing detection accuracy, saving time.
The implementation principle and technical effects of the device provided by this embodiment are the same as those of the preceding embodiments; for brevity of description, where the device embodiment is silent, reference may be made to the corresponding content in the preceding method embodiments.
An embodiment of the present invention further provides an electronic device, including an image acquisition device, a memory, and a processor. The image acquisition device is configured to acquire image data; the memory stores a computer program that can run on the processor; and when executing the computer program, the processor implements the method described in the preceding method embodiments.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working process of the electronic device described above may refer to the corresponding process in the preceding method embodiments, and details are not repeated here.
Further, this embodiment also provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the method provided by the preceding method embodiments are executed; for the specific implementation, reference may be made to the method embodiments, and details are not repeated here.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disk.
Finally, it should be noted that the above embodiments are only specific embodiments of the present invention, used to illustrate the technical solution of the present invention rather than to limit it, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the technical field may, within the technical scope disclosed by the present invention, still modify the technical solutions described in the foregoing embodiments, readily conceive of variations, or make equivalent replacements of some of the technical features; these modifications, variations, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all be covered within the protection scope of the present invention.
Claims (12)
1. An object detection method, characterized by comprising:
performing feature extraction on an image to be detected through a feature extraction network to obtain a feature map of the image to be detected; wherein the feature extraction network comprises multiple convolutional layers, at least one convolutional layer of the multiple convolutional layers comprises one or more structural units, and each structural unit comprises at least two parallel channel branches, and a concatenation unit and a channel rearrangement unit connected to tail ends of the channel branches; and
inputting the feature map into a target detection network for target detection.
2. The method according to claim 1, wherein at least one channel branch in each structural unit comprises multiple convolution units, and the multiple convolution units comprise at least one depthwise convolution unit with a convolution kernel of a preset size.
3. The method according to claim 2, wherein the convolution kernel of the preset size is a 3*3 convolution kernel.
4. The method according to claim 1, wherein the feature extraction network comprises a convolutional layer with a stride of 1; each structural unit in the convolutional layer with a stride of 1 comprises a channel split unit connected to head ends of the at least two channel branches.
5. The method according to claim 1, wherein the target detection network comprises a classification sub-network and/or a regression sub-network; the classification sub-network is configured to determine, based on the feature map of the image to be detected, whether the image to be detected contains a target object; the regression sub-network is configured to determine, based on the feature map of the image to be detected, a position of the target object in the image to be detected.
6. The method according to claim 1, wherein the target detection network comprises a sequentially connected large-kernel depthwise separable convolutional layer, pooling layer, and fully connected layer, and a parallel classification sub-network and regression sub-network connected to the fully connected layer; and the step of inputting the feature map into the target detection network for target detection comprises:
passing the feature map sequentially through the large-kernel depthwise separable convolutional layer, the pooling layer, and the fully connected layer to obtain feature data output by the fully connected layer;
inputting the feature data into the classification sub-network and the regression sub-network respectively to obtain a classification result output by the classification sub-network and a regression result output by the regression sub-network; and
combining the classification result and the regression result to output an object detection result.
7. A target detection model, characterized by comprising a feature extraction network and a target detection network connected to the feature extraction network; the feature extraction network comprises multiple convolutional layers, and at least one convolutional layer of the multiple convolutional layers comprises one or more structural units; each structural unit comprises at least two parallel channel branches, and a concatenation unit and a channel rearrangement unit connected to tail ends of the channel branches.
8. The target detection model according to claim 7, wherein the target detection network comprises a classification sub-network and/or a regression sub-network.
9. The target detection model according to claim 7, wherein the target detection network comprises a sequentially connected large-kernel depthwise separable convolutional layer, pooling layer, and fully connected layer, and a parallel classification sub-network and regression sub-network connected to the fully connected layer.
10. An object detection device, characterized by comprising:
a feature extraction module, configured to perform feature extraction on an image to be detected through a feature extraction network to obtain a feature map of the image to be detected; wherein the feature extraction network comprises multiple convolutional layers, at least one convolutional layer of the multiple convolutional layers comprises one or more structural units, and each structural unit comprises at least two parallel channel branches, and a concatenation unit and a channel rearrangement unit connected to tail ends of the channel branches; and
a target detection module, configured to input the feature map into a target detection network for target detection.
11. An electronic device, characterized by comprising an image acquisition device, a memory, and a processor;
the image acquisition device is configured to acquire image data;
the memory stores a computer program that can run on the processor, and when executing the computer program, the processor implements the steps of the method according to any one of claims 1 to 6.
12. A computer-readable storage medium on which a computer program is stored, characterized in that when the computer program is run by a processor, the steps of the method according to any one of claims 1 to 6 are executed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811587447.8A CN109670517A (en) | 2018-12-24 | 2018-12-24 | Object detection method, device, electronic equipment and target detection model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811587447.8A CN109670517A (en) | 2018-12-24 | 2018-12-24 | Object detection method, device, electronic equipment and target detection model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109670517A true CN109670517A (en) | 2019-04-23 |
Family
ID=66146013
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811587447.8A Pending CN109670517A (en) | 2018-12-24 | 2018-12-24 | Object detection method, device, electronic equipment and target detection model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109670517A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111784555A (en) * | 2020-06-16 | 2020-10-16 | 杭州海康威视数字技术股份有限公司 | Image processing method, device and equipment |
CN112115789A (en) * | 2020-08-18 | 2020-12-22 | 北京嘀嘀无限科技发展有限公司 | Face detection model determining method and device and electronic equipment |
CN112116032A (en) * | 2019-06-21 | 2020-12-22 | 富士通株式会社 | Object detection device and method and terminal equipment |
CN113469146A (en) * | 2021-09-02 | 2021-10-01 | 深圳市海清视讯科技有限公司 | Target detection method and device |
CN113591840A (en) * | 2021-06-30 | 2021-11-02 | 北京旷视科技有限公司 | Target detection method, device, equipment and storage medium |
CN117576488A (en) * | 2024-01-17 | 2024-02-20 | 海豚乐智科技(成都)有限责任公司 | Infrared dim target detection method based on target image reconstruction |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107134144A (en) * | 2017-04-27 | 2017-09-05 | 武汉理工大学 | A kind of vehicle checking method for traffic monitoring |
CN108694401A (en) * | 2018-05-09 | 2018-10-23 | 北京旷视科技有限公司 | Object detection method, apparatus and system |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107134144A (en) * | 2017-04-27 | 2017-09-05 | 武汉理工大学 | A kind of vehicle checking method for traffic monitoring |
CN108694401A (en) * | 2018-05-09 | 2018-10-23 | 北京旷视科技有限公司 | Object detection method, apparatus and system |
Non-Patent Citations (2)
Title |
---|
Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng, Jian Sun: "ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design", Lecture Notes in Computer Science *
Zeming Li, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng, Jian Sun: "Light-Head R-CNN: In Defense of Two-Stage Object Detector", https://arxiv.org/abs/1711.07264 *
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112116032A (en) * | 2019-06-21 | 2020-12-22 | 富士通株式会社 | Object detection device and method and terminal equipment |
JP2021002333A (en) * | 2019-06-21 | 2021-01-07 | 富士通株式会社 | Object detection device, object detection method, and terminal equipment |
JP7428075B2 (en) | 2019-06-21 | 2024-02-06 | 富士通株式会社 | Object detection device, object detection method and terminal equipment |
CN111784555A (en) * | 2020-06-16 | 2020-10-16 | 杭州海康威视数字技术股份有限公司 | Image processing method, device and equipment |
CN111784555B (en) * | 2020-06-16 | 2023-08-25 | 杭州海康威视数字技术股份有限公司 | Image processing method, device and equipment |
CN112115789A (en) * | 2020-08-18 | 2020-12-22 | 北京嘀嘀无限科技发展有限公司 | Face detection model determining method and device and electronic equipment |
CN113591840A (en) * | 2021-06-30 | 2021-11-02 | 北京旷视科技有限公司 | Target detection method, device, equipment and storage medium |
CN113469146A (en) * | 2021-09-02 | 2021-10-01 | 深圳市海清视讯科技有限公司 | Target detection method and device |
CN117576488A (en) * | 2024-01-17 | 2024-02-20 | 海豚乐智科技(成都)有限责任公司 | Infrared dim target detection method based on target image reconstruction |
CN117576488B (en) * | 2024-01-17 | 2024-04-05 | 海豚乐智科技(成都)有限责任公司 | Infrared dim target detection method based on target image reconstruction |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wu et al. | Real-time vehicle and distance detection based on improved yolo v5 network | |
CN109670517A (en) | Object detection method, device, electronic equipment and target detection model | |
Wang et al. | Deep networks for saliency detection via local estimation and global search | |
CN109492638A (en) | Method for text detection, device and electronic equipment | |
CN109815868A (en) | A kind of image object detection method, device and storage medium | |
CN111178183B (en) | Face detection method and related device | |
US8792722B2 (en) | Hand gesture detection | |
US8750573B2 (en) | Hand gesture detection | |
CN109670452A (en) | Method for detecting human face, device, electronic equipment and Face datection model | |
CN103295016B (en) | Behavior recognition method based on depth and RGB information and multi-scale and multidirectional rank and level characteristics | |
CN110738101A (en) | Behavior recognition method and device and computer readable storage medium | |
CN107180226A (en) | A kind of dynamic gesture identification method based on combination neural net | |
CN109376667A (en) | Object detection method, device and electronic equipment | |
CN109657533A (en) | Pedestrian recognition methods and Related product again | |
CN107273836A (en) | A kind of pedestrian detection recognition methods, device, model and medium | |
CN106874826A (en) | Face key point-tracking method and device | |
CN109214366A (en) | Localized target recognition methods, apparatus and system again | |
CN110555481A (en) | Portrait style identification method and device and computer readable storage medium | |
CN109886951A (en) | Method for processing video frequency, device and electronic equipment | |
CN109543632A (en) | A kind of deep layer network pedestrian detection method based on the guidance of shallow-layer Fusion Features | |
CN109376631A (en) | A kind of winding detection method and device neural network based | |
CN109117760A (en) | Image processing method, device, electronic equipment and computer-readable medium | |
CN107808126A (en) | Vehicle retrieval method and device | |
CN110135476A (en) | A kind of detection method of personal safety equipment, device, equipment and system | |
CN105303163B (en) | A kind of method and detection device of target detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20190423 |