CN110147753A - Method and device for detecting small objects in an image - Google Patents
Method and device for detecting small objects in an image
- Publication number
- CN110147753A CN110147753A CN201910410363.5A CN201910410363A CN110147753A CN 110147753 A CN110147753 A CN 110147753A CN 201910410363 A CN201910410363 A CN 201910410363A CN 110147753 A CN110147753 A CN 110147753A
- Authority
- CN
- China
- Prior art keywords
- candidate region
- image
- small object
- network
- sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/29—Graphical models, e.g. Bayesian networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The application provides a method and device for detecting small objects in an image, for improving the accuracy of small-object detection. The method comprises: obtaining an image to be processed; according to a pre-trained candidate-region generation network, performing multiple different convolution operations on the image and fusing the convolution results to generate candidate regions in the image, where a candidate region contains a small object in the image; performing size normalization and super-resolution processing on the candidate region to obtain a super-resolved candidate region; fusing the features of the super-resolved candidate region with the features of the candidate region's neighborhood to obtain a fused region feature; and recognizing the fused region feature to obtain the small-object information corresponding to the candidate region, where the small-object information includes the category of the small object and its position in the image to be processed.
Description
Technical field
This application relates to the field of image processing and neural network techniques, and in particular to a method and device for detecting small objects in an image.
Background technique
With the continued development of deep learning, deep learning plays an increasingly significant role in computer vision. Deep neural networks have made breakthrough progress in tasks such as image classification, object detection, and semantic segmentation, with especially notable achievements in object detection.

However, existing research usually detects medium- or large-sized objects in an image. For small objects in an image, the resolution is low and the extracted features are indistinct, so the accuracy of small-object detection is low.
Summary of the invention
The application provides a method and device for detecting small objects in an image, for improving the accuracy of small-object detection.
In a first aspect, a method for detecting small objects in an image is provided, comprising:

obtaining an image to be processed;

according to a pre-trained candidate-region generation network, performing multiple different convolution operations on the image to be processed, and fusing the results of the multiple convolution operations to generate candidate regions in the image to be processed, where a candidate region contains a small object in the image;

performing size normalization and super-resolution processing on the candidate region to obtain a super-resolved candidate region;

fusing the features of the super-resolved candidate region with the features of the candidate region's neighborhood to obtain a fused region feature;

recognizing the fused region feature to obtain the small-object information corresponding to the candidate region, where the small-object information includes the category of the small object and its position in the image to be processed.
In the embodiments of the application, different convolution operations are applied to the image to be processed. Because different layers of a convolutional neural network have different receptive fields, each layer attends to different regions; fusing the features from two different receptive fields into one fused region feature makes that feature clearer, which helps improve the accuracy of the detection result. Judging with the contextual information of the candidate region makes the fused region feature more comprehensive, further improving accuracy. Super-resolution processing makes the candidate region clearer, again improving the result. Using super-resolution and feature fusion during classification and bounding-box regression greatly raises small-object detection accuracy at low resolution. Combining these techniques ensures accurate, reliable detection at low resolution. In short, the fused region feature in the embodiments ensures that more accurate small-object features are extracted, so that the detection result is more accurate.
Optionally, before generating the candidate regions in the image to be processed according to the pre-trained candidate-region generation network, the method comprises:

obtaining sample images;

according to a preset candidate-region generation network, extracting features at N semantic scales from the sample images to obtain sample features at N semantic scales, N being a positive integer;

according to the preset candidate-region generation network, fusing the sample region features of M of the N semantic scales to obtain fused sample features, and predicting the object information in the sample images from the fused sample features to obtain the probability corresponding to the current prediction, M being a positive integer less than or equal to N;

computing a loss function value from the probability, and adjusting the parameters of the preset candidate-region generation network according to the change of the loss value until the loss function converges, thereby obtaining the pre-trained candidate-region generation network.
In the embodiments of the application, candidate regions are generated by a pre-trained candidate-region generation network, so the extracted candidate regions have higher recall and precision, and the efficiency of candidate-region generation is also improved.
Optionally, adjusting the parameters of the network according to the change of the loss value comprises:

adjusting the parameters of the network by a preset gradient-descent step according to the change of the loss function value.
Optionally, before fusing the sample region features of the M semantic scales according to the candidate-region generation network, the method comprises:

fusing the sample region features of every two of the N semantic scales to obtain multiple predicted fused sample region features;

evaluating the confidence of each predicted fused sample region feature to obtain multiple confidences;

taking the two semantic scales corresponding to the maximum confidence as the M semantic scales.

In the embodiments of the application, the semantic scales are screened and two are selected, which on the one hand ensures the reliability of the fused result, and on the other hand greatly reduces computation.
Optionally, performing size normalization and super-resolution processing on the candidate region to obtain a super-resolved candidate region comprises:

normalizing the candidate region to obtain a normalized candidate region;

convolving the normalized candidate region to obtain a convolution result, and deconvolving the normalized candidate region to obtain a deconvolution result;

fusing the convolution result and the deconvolution result to obtain a super-resolved candidate region of the same size as the candidate region.
Optionally, recognizing the fused region feature to obtain the small-object information corresponding to the candidate region comprises:

inputting the fused region feature into a pre-trained classification and bounding-box prediction network, performing box prediction and class prediction, and outputting the small-object information corresponding to the candidate region.
Optionally, the neighborhood of the candidate region is the region obtained by expanding the candidate region threefold in length and width about its center, excluding the candidate region itself.
Second aspect provides a kind of device of wisp in detection image, comprising:
Module is obtained, for obtaining image to be processed;
Processing module, the generation network trained in advance for basis, the processing image carry out repeatedly different process of convolution,
Fusion treatment is carried out to multiple convolution results treated result, generates the candidate region in the image to be processed;Wherein, institute
Stating candidate region includes the wisp in the image to be processed;
The processing module is also used to carry out size normalization and super-resolution processing to the candidate region, be surpassed
Candidate region after resolution processes;
The processing module is also used to feature and the candidate region to the candidate region after the super-resolution processing
Neighborhood feature carry out fusion treatment, obtain fused provincial characteristics;
The processing module is also used to identify the fused provincial characteristics, obtains the candidate region pair
The wisp information answered;Wherein, the generic and the wisp that the wisp information includes the wisp are described
Position in image to be processed.
Optionally, the obtaining module is further used for obtaining sample images before the candidate regions in the image to be processed are generated;

the processing module is further used for extracting features at N semantic scales from the sample images according to a preset candidate-region generation network, obtaining sample features at N semantic scales, N being a positive integer;

the processing module is further used for fusing the sample region features of M of the N semantic scales according to the preset candidate-region generation network, obtaining fused sample features, and predicting the object information in the sample images from the fused sample features to obtain the probability corresponding to the current prediction, M being a positive integer less than or equal to N;

the processing module is further used for computing a loss function value from the probability and adjusting the parameters of the preset candidate-region generation network according to the change of the loss value until the loss function converges, obtaining the pre-trained candidate-region generation network.
Optionally, the processing module is specifically used for adjusting the parameters of the network by a preset gradient-descent step according to the change of the loss function value.
Brief description of the drawings

To illustrate the technical solutions in the embodiments of the application more clearly, the drawings required in the embodiments are briefly described below. Obviously, the drawings described below are only some embodiments of the application; those of ordinary skill in the art can obtain other drawings from these without creative effort.

Fig. 1 is a flowchart of a method for detecting small objects in an image, provided by an embodiment of the application;

Fig. 2 is a schematic diagram of a candidate-region generation network processing an image to be processed, provided by an embodiment of the application;

Fig. 3 is a simplified diagram of a method for detecting small objects in an image, provided by an embodiment of the application;

Fig. 4 is a schematic diagram of super-resolution processing, provided by an embodiment of the application;

Fig. 5 is a schematic diagram of the effect after processing an image to be processed, provided by an embodiment of the application;

Fig. 6 is a structural diagram of a device for detecting small objects in an image, provided by an embodiment of the application.
Detailed description of the embodiments

To make the purposes, technical solutions, and advantages of the embodiments of the application clearer, the technical solutions in the embodiments are described clearly and completely below in conjunction with the drawings of the embodiments of the application.
In scenarios such as detecting whether a driver is smoking in public transport, detecting whether students are attending class normally in a classroom, and detecting micro-lesions in medical imaging, the detected object occupies less than 10% of the image area. Existing image-processing methods struggle to extract the features of such small objects, so the accuracy of small-object detection is relatively low.

In view of this, an embodiment of the application provides a method for detecting small objects in an image. The method is executed by a device for detecting small objects in an image, which can be implemented by equipment with a graphics processing unit (GPU), such as a personal computer, a server, or a camera; the application does not limit the concrete type of the device.
Referring to Fig. 1, the detailed process by which the device executes the method is introduced below. The method comprises:

Step 11: obtain an image to be processed;

Step 12: according to a pre-trained candidate-region generation network, perform multiple different convolution operations on the image to be processed, fuse the results of the multiple convolution operations, and generate candidate regions in the image, where a candidate region contains a small object in the image;

Step 13: perform size normalization and super-resolution processing on the candidate region to obtain a super-resolved candidate region;

Step 14: fuse the features of the super-resolved candidate region with the features of the candidate region's neighborhood to obtain a fused region feature;

Step 15: recognize the fused region feature to obtain the small-object information corresponding to the candidate region, where the small-object information includes the category of the small object and its position in the image to be processed.
Each step is described in detail below.

The device first executes step 11, obtaining the image to be processed. The device may acquire and process images in real time, or another image-acquisition device may send collected images to it, which is equivalent to the device obtaining the image to be processed.
After obtaining the image to be processed, step 12 is executed: according to the pre-trained candidate-region generation network, multiple different convolution operations are performed on the image, the results are fused, and the candidate regions in the image are generated.

Specifically, the embodiments of the application use a pre-trained candidate-region generation network to predict and mark the candidate regions in the image to be processed, so that small objects possibly present in the candidate regions can then be detected. The candidate-region generation network performs multiple convolution operations on the image, and the receptive fields of the different convolutions differ, so each convolution result emphasizes different image information; fusing the convolution results therefore yields a clearer image, which helps improve the accuracy of subsequent small-object detection. Each convolution operation differs from the others in ways that may include, but are not limited to, the number of convolution layers and the image features convolved.
As for how the candidate-region generation network is trained, there are many ways; the training process is illustrated below.

One way of training the candidate-region generation network is as follows:
Obtain sample images;

according to a preset candidate-region generation network, extract features at N semantic scales from the sample images to obtain sample features at N semantic scales, N being a positive integer;

according to the preset candidate-region generation network, fuse the sample region features of M of the N semantic scales to obtain fused sample features, and predict the object information in the sample images from the fused sample features to obtain the probability corresponding to the current prediction, M being a positive integer less than or equal to N;

compute a loss function value from the probability and adjust the parameters of the preset candidate-region generation network according to the change of the loss value until the loss function converges, obtaining the pre-trained candidate-region generation network.
Specifically, the device first obtains a large number of sample images; each sample image includes category labels and position labels for the objects it contains. The sample images can be obtained by manual annotation, or from existing data.
Optionally, since the sizes of the sample images may differ, the sample images are normalized to a uniform size to improve training accuracy. Normalization may use bilinear interpolation, for example sampling each image to 224 × 224 by bilinear interpolation.
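The bilinear-interpolation normalization described above can be sketched as follows. This is an illustrative implementation, not the patent's code; only the interpolation method and the 224 × 224 target size come from the text, and the sample image here is random placeholder data:

```python
import numpy as np

def bilinear_resize(img, out_h, out_w):
    """Resize an (H, W) or (H, W, C) image with bilinear interpolation."""
    h, w = img.shape[:2]
    # Sample positions in the source image (align-corners style for simplicity)
    ys = np.linspace(0, h - 1, out_h)
    xs = np.linspace(0, w - 1, out_w)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]          # vertical blend weights
    wx = (xs - x0)[None, :]          # horizontal blend weights
    if img.ndim == 3:                # broadcast over channels if present
        wy = wy[..., None]; wx = wx[..., None]
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bot = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

sample = np.random.rand(37, 51, 3)          # arbitrary-size sample image
normalized = bilinear_resize(sample, 224, 224)
print(normalized.shape)                     # (224, 224, 3)
```

Bilinear sampling keeps the resized region smooth, which matters because the later super-resolution step has to recover detail from it.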
After obtaining the sample images, the device builds the candidate-region generation network and extracts features at N semantic scales from the sample images, obtaining sample features at N semantic scales. The sample features at the N semantic scales can be fused to obtain fused sample features; the probability corresponding to the fused sample features is computed, the training loss of the candidate-region generation network is computed from that probability, and the network parameters are adjusted continually according to the loss value until the loss converges, at which point the network is the pre-trained candidate-region generation network used in step 12. Any two of the N semantic scales differ. The semantic scales include, but are not limited to, one or more of image contour distribution, image grayscale distribution, and image texture distribution.
As one embodiment, training with all N semantic-scale features each time is computationally expensive, and it cannot be determined in advance which fused semantic-scale features give the best result. Therefore, to reduce computation while preserving training accuracy as far as possible, M semantic-scale features can be selected from the N semantic-scale features for the fusion training process.
There are many ways to screen the M semantic-scale features, illustrated below.

Mode one:

Select M semantic-scale features arbitrarily.

Mode two:

Fuse the sample region features of every two of the N semantic scales to obtain multiple predicted fused sample region features; evaluate the confidence of each predicted fused sample region feature to obtain multiple confidences; take the two semantic scales corresponding to the maximum confidence as the M semantic scales.
Specifically, every two of the sample region features at the N semantic scales are fused and the confidence of each pairwise fusion result is evaluated; the two semantic scales with the highest confidence are selected as the M semantic scales. In this way, the accuracy of the fusion result is preserved while the computation in the fusion process is greatly reduced. Confidence can be characterized in many ways, for example as a probability, and the probability can be computed in many ways, for example with the common logistic function.
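Mode two above can be sketched as a pairwise search. Only the "fuse every pair, score with a logistic-based confidence, keep the best pair" structure follows the text; the averaging fusion, the scoring function, and the random feature vectors are illustrative placeholders:

```python
import itertools
import numpy as np

def logistic(x):
    """Common logistic function, mapping a score to a (0, 1) probability."""
    return 1.0 / (1.0 + np.exp(-x))

def select_scale_pair(scale_features, score_fn):
    """Fuse every pair of per-scale features, score each fusion with a
    confidence estimate, and return the best-scoring pair of scale indices."""
    best_pair, best_conf = None, -np.inf
    for i, j in itertools.combinations(range(len(scale_features)), 2):
        fused = (scale_features[i] + scale_features[j]) / 2   # placeholder fusion
        conf = logistic(score_fn(fused)).mean()               # confidence as a probability
        if conf > best_conf:
            best_pair, best_conf = (i, j), conf
    return best_pair, best_conf

# N = 4 semantic scales, each summarised as a feature vector (hypothetical data)
rng = np.random.default_rng(0)
feats = [rng.normal(size=16) for _ in range(4)]
pair, conf = select_scale_pair(feats, score_fn=lambda f: f.sum())
print(pair, float(conf))
```

With N scales the search evaluates N(N−1)/2 fusions once, after which training only ever fuses the chosen pair, which is the computation saving the text describes.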
After screening out the sample region features of the M semantic scales, the sample region features of the M semantic scales can be fused to obtain fused sample features, and the object information in the sample images is predicted from the fused sample features, obtaining the probability corresponding to the current prediction.

The loss function (also called the cost function) of the candidate-region generation network is computed from this probability, and the parameters of the candidate-region generation network are adjusted according to the loss value. The adjustment can be arbitrary; alternatively, to improve training speed, the parameters can be adjusted by a preset gradient-descent step, for example using gradient descent with the Adam optimizer. The network at the point where the loss function converges is the pre-trained candidate-region generation network model.
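The train-until-convergence loop can be sketched as below, assuming a PyTorch model. Only the Adam optimizer and the stop-at-loss-convergence criterion come from the text; the toy linear model, MSE loss, tolerance, and step cap are illustrative stand-ins for the candidate-region generation network and its loss:

```python
import torch

def train_until_converged(model, loss_fn, data, targets,
                          lr=1e-3, tol=1e-5, max_steps=500):
    """Adjust the network parameters by Adam gradient descent, stopping when
    the change in loss between steps falls below `tol` (loss convergence)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    prev = float("inf")
    loss = None
    for _ in range(max_steps):
        opt.zero_grad()
        loss = loss_fn(model(data), targets)
        loss.backward()
        opt.step()
        if abs(prev - loss.item()) < tol:   # loss has stopped changing
            break
        prev = loss.item()
    return loss.item()

# Toy stand-in for the region-proposal network: a single linear layer.
torch.manual_seed(0)
model = torch.nn.Linear(8, 2)
x, y = torch.randn(32, 8), torch.randn(32, 2)
final_loss = train_until_converged(model, torch.nn.functional.mse_loss, x, y)
print(final_loss)
```

In practice the convergence test would typically be applied to a validation loss rather than the raw training loss, but the control flow is the same.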
Since the pre-trained candidate-region generation network includes M semantic-scale feature extraction, when the image to be processed is input into the candidate-region generation network, the network naturally performs multiple convolution operations on the image, fuses the convolution results, and generates the candidate regions corresponding to the image. There may be multiple candidate regions; some of them may contain small objects and some may not.
For example, referring to Fig. 2, the image to be processed is passed through Conv4_3 of the VGG16 backbone to obtain a first semantic result (512*38*38), and through Conv5_3 to obtain a second semantic result (512*19*19). The first semantic result (512*38*38) is convolved (kernel e.g. 3*3*512) to obtain feature map A (512*38*38) shown in Fig. 2, and the second semantic result (512*19*19) is deconvolved (kernel e.g. 3*3*512) to obtain feature map B shown in Fig. 2. Feature maps A and B are concatenated into a 1024*38*38 feature matrix, which is then fused into a 512*38*38 fusion feature matrix. A sliding window of preset size (also called a matching window, e.g. 3*3) traverses the fusion feature matrix, each feature point generating K (e.g. 9) candidate boxes, so that 38*38*K candidate regions can finally be generated from the fusion feature.
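The Fig. 2 fusion can be sketched in PyTorch. The tensor sizes (512×38×38 and 512×19×19), the 3×3 convolution, the deconvolution, the 1024-channel concatenation, and K = 9 anchors per feature point follow the text; the specific layer hyperparameters (stride, padding, the 1×1 fusion convolution) are assumptions:

```python
import torch
import torch.nn as nn

class ProposalFusion(nn.Module):
    """Conv4_3-level features (512x38x38) pass through a 3x3 conv to give
    feature map A; Conv5_3-level features (512x19x19) pass through a
    stride-2 deconv to give feature map B (512x38x38); A and B are
    concatenated (1024x38x38) and fused back to 512x38x38."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(512, 512, 3, padding=1)
        self.deconv = nn.ConvTranspose2d(512, 512, 2, stride=2)
        self.fuse = nn.Conv2d(1024, 512, 1)

    def forward(self, c4, c5):
        a = self.conv(c4)                # feature map A: 512 x 38 x 38
        b = self.deconv(c5)              # feature map B: 512 x 38 x 38 (from 19 x 19)
        return self.fuse(torch.cat([a, b], dim=1))

net = ProposalFusion()
c4 = torch.randn(1, 512, 38, 38)         # first semantic result (Conv4_3 level)
c5 = torch.randn(1, 512, 19, 19)         # second semantic result (Conv5_3 level)
fused = net(c4, c5)
K = 9                                    # candidate boxes per feature point
num_proposals = fused.shape[2] * fused.shape[3] * K   # 38*38*K candidate regions
print(fused.shape, num_proposals)
```

A 3×3 sliding window over the fused map with K anchors per position is the same anchor scheme used by standard region-proposal networks, which is why 38·38·K proposals fall out of the spatial size of the fused feature.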
After generating the candidate regions, the device executes step 13: size normalization and super-resolution processing are performed on the candidate region, obtaining the super-resolved candidate region.

Specifically, the candidate region is first size-normalized, again for example by bilinear interpolation; the normalization may enlarge the candidate region. Enlargement, however, can introduce noise or blur, so super-resolution processing is then applied to the normalized candidate region. There are many ways to perform super-resolution processing.

One way:
Convolve the normalized candidate region to obtain a convolution result, and deconvolve the normalized candidate region to obtain a deconvolution result;

fuse the convolution result and the deconvolution result to obtain a super-resolved candidate region of the same size as the candidate region.
Specifically, a convolution–deconvolution network performs super-resolution processing on the normalized candidate region, producing a clearer region of the same size as the candidate region. The convolution–deconvolution network is also pre-trained; its training process can refer to the training of the candidate-region generation network discussed in step 12 and is not repeated here.
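A minimal sketch of the convolution–deconvolution super-resolution step: parallel convolution and deconvolution branches whose outputs are fused into a result the same size as the input candidate region, as the text describes. All channel counts and layer hyperparameters here are assumptions:

```python
import torch
import torch.nn as nn

class ConvDeconvSR(nn.Module):
    """Convolve and deconvolve the normalized candidate region in parallel
    branches, then fuse the two results into an output with the same
    spatial size as the input region."""
    def __init__(self, ch=3):
        super().__init__()
        self.conv_branch = nn.Sequential(
            nn.Conv2d(ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, ch, 3, padding=1))
        self.deconv_branch = nn.Sequential(
            nn.Conv2d(ch, 32, 3, stride=2, padding=1),           # downsample
            nn.ReLU(),
            nn.ConvTranspose2d(32, ch, 4, stride=2, padding=1))  # restore size
        self.fuse = nn.Conv2d(2 * ch, ch, 1)

    def forward(self, region):
        a = self.conv_branch(region)
        b = self.deconv_branch(region)
        return self.fuse(torch.cat([a, b], dim=1))   # same H x W as the input

sr = ConvDeconvSR()
region = torch.randn(1, 3, 64, 64)    # normalized candidate region
out = sr(region)
print(out.shape)                      # torch.Size([1, 3, 64, 64])
```

The deconvolution branch here assumes an even input size so the transposed convolution exactly restores the original resolution; a production version would pad or crop to handle odd sizes.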
When there are multiple candidate regions, each naturally has a corresponding super-resolved candidate region. For example, taking image a in Fig. 3 as the image to be processed: after it passes through the candidate-region generation network, a candidate region is obtained (shown in b of Fig. 3), and super-resolution processing of b in Fig. 3 yields the clearer candidate region shown in c of Fig. 3. Fig. 3 is only a simple illustration of the process and does not limit the actual image-processing effect.
After the device executes step 13, it can execute step 14: the features of the super-resolved candidate region and the features of the candidate region's neighborhood are fused, obtaining a fused region feature.

Specifically, the features of a super-resolved candidate region may include all of a small object's features, only part of them, or none of them. The features of the super-resolved candidate region can be fused with the neighborhood features corresponding to that candidate region, so as to combine more image feature information and obtain a more complete small-object feature image. The neighborhood features of a candidate region can also be understood as the contextual information of that candidate region.

It should be noted that when there are multiple candidate regions, the features of each candidate region are fused with that candidate region's neighborhood features; with K candidate regions, K fused region features are correspondingly obtained.
As one embodiment, choosing too large a neighborhood greatly increases computation, while choosing too small a neighborhood adds too few features to achieve a good fusion effect. Therefore, in the embodiments of the application the neighborhood of a candidate region is defined as the region obtained by expanding the candidate region threefold in length and width about its center, excluding the candidate region itself.
For example, referring to Fig. 4, the candidate region of the image to be processed undergoes super-resolution processing; the super-resolved candidate region is input into a convolution layer to obtain the candidate-region features, the neighborhood of the candidate region is input into a convolution layer to obtain the neighborhood features, and a fully connected operation then yields the fused feature.
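The threefold neighborhood defined above reduces to a simple box computation before any features are extracted. This sketch uses (x1, y1, x2, y2) corner coordinates and clips the expanded box to the image, both of which are implementation assumptions not specified in the text:

```python
def neighborhood_box(box, img_w, img_h):
    """Given a candidate region (x1, y1, x2, y2), return the box expanded to
    3x its width and height about the same center, clipped to the image.
    The neighborhood proper is this expanded box minus the candidate region,
    which a caller can realise by masking out the original box."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2     # center stays fixed
    w, h = (x2 - x1) * 3, (y2 - y1) * 3       # threefold in length and width
    ex1 = max(0, cx - w / 2)
    ey1 = max(0, cy - h / 2)
    ex2 = min(img_w, cx + w / 2)
    ey2 = min(img_h, cy + h / 2)
    return (ex1, ey1, ex2, ey2)

expanded = neighborhood_box((40, 40, 60, 60), img_w=300, img_h=300)
print(expanded)   # (20.0, 20.0, 80.0, 80.0) -- 3x width and height, same center
```

Clipping matters for candidate regions near the image border, where the full 3× expansion would otherwise fall outside the image.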
The device of wisp executes step 15, i.e., to fusion after obtaining fused provincial characteristics in detection image
Provincial characteristics afterwards is identified, the corresponding wisp information in candidate region is obtained.
Specifically, after the fused region features are obtained, they can be input into a pre-trained classification and bounding-box prediction network to predict the small-object information corresponding to the candidate region; the small-object information includes the category of the small object and the position of the small object in the image.
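A minimal sketch of such a prediction head, assuming a softmax classification branch and a four-value box branch; the category count, feature dimension, and single linear layer per branch are assumptions, not the patented architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_head(fused_feat, w_cls, b_cls, w_box, b_box):
    """Classification-and-box prediction head: one branch outputs
    class probabilities via softmax, the other outputs four box
    values locating the small object in the image."""
    logits = w_cls @ fused_feat + b_cls
    probs = np.exp(logits - logits.max())       # numerically stable softmax
    probs /= probs.sum()
    box = w_box @ fused_feat + b_box            # (x, y, w, h) of the small object
    return probs, box

# toy head: 128-d fused features, 5 object categories
fused = rng.standard_normal(128)
probs, box = predict_head(
    fused,
    rng.standard_normal((5, 128)) * 0.01, np.zeros(5),
    rng.standard_normal((4, 128)) * 0.01, np.zeros(4),
)
print(probs.shape, box.shape)   # (5,) (4,)
```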
The training process of the classification and bounding-box prediction network is described below. The constructed classification and bounding-box prediction network produces a prediction result for a sample image, and the probability of the prediction result is calculated. The loss function value corresponding to the network is obtained from this probability, and the parameters of the network are adjusted according to the loss function value using the Adam optimizer, until the loss function corresponding to the network converges, yielding the trained classification and bounding-box prediction network.
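The Adam parameter-update rule used in this training procedure can be written out explicitly; the toy quadratic stand-in loss, the learning rate, and the iteration count below are illustrative assumptions, not the actual detection loss:

```python
import numpy as np

def adam_step(p, g, m, v, t, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: bias-corrected first/second moment estimates
    of the gradient drive the parameter adjustment."""
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return p - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# toy convex stand-in loss L(p) = ||p||^2 with gradient 2p;
# iterate until the loss is approximately converged
p = np.array([1.0, -2.0])
m, v = np.zeros_like(p), np.zeros_like(p)
for t in range(1, 2001):
    p, m, v = adam_step(p, 2 * p, m, v, t)
print(float(np.abs(p).max()))   # driven near 0 as the loss converges
```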
After the fused region features are input into the classification and bounding-box prediction network, the category of the small object corresponding to the candidate region and the position of the small object are generated accordingly, thereby realizing the detection of small objects. When there are multiple candidate regions, small-object information is generated for each small object. The output small-object information may take many forms, such as an annotated picture or a set of records, which is not specifically limited here.
For example, continuing with Fig. 4, after the fused region features pass through the classification and bounding-box prediction network (a fully connected network in Fig. 4), the categories and positions of the small objects in the image to be processed are output. Referring to Fig. 5, the image to be processed is shown in Fig. 5(a); after the above processing, the small objects output for the image to be processed include the moon and the bird shown in Fig. 5(b), and the positions of the small objects are indicated by the rectangular boxes in Fig. 5(b).
On the basis of the method for detecting small objects in an image discussed above, an embodiment of the present application further provides an apparatus for detecting small objects in an image, which performs the method discussed above. Referring to Fig. 6, the apparatus includes:
an acquisition module 601, configured to obtain an image to be processed;
a processing module 602, configured to perform a plurality of different convolution processes on the image according to a pre-trained generation network, fuse the results of the plurality of convolution processes, and generate a candidate region in the image to be processed, wherein the candidate region contains a small object in the image to be processed;
the processing module 602 is further configured to perform size normalization and super-resolution processing on the candidate region, obtaining a super-resolution-processed candidate region;
the processing module 602 is further configured to fuse the features of the super-resolution-processed candidate region with the features of the neighborhood of the candidate region, obtaining fused region features;
the processing module 602 is further configured to identify the fused region features, obtaining the small-object information corresponding to the candidate region, wherein the small-object information includes the category of the small object and the position of the small object in the image to be processed.
Optionally, the acquisition module 601 is further configured to obtain a sample image before the candidate region in the image to be processed is generated, that is, before the plurality of different convolution processes are performed on the image to be processed according to the pre-trained candidate region generation network and the convolution results are fused;
the processing module 602 is further configured to perform feature extraction at N semantic scales on the sample image according to a preset candidate region generation network, obtaining sample features at the N semantic scales, N being a positive integer;
the processing module 602 is further configured to fuse, according to the preset candidate region generation network, the sample region features of M semantic scales among the sample features of the N semantic scales, obtaining fused sample features; predict the object information in the sample image according to the fused sample features; and obtain the probability corresponding to the current prediction result, M being a positive integer less than or equal to N;
the processing module 602 is further configured to calculate a loss function value from the probability and adjust the parameters of the preset candidate region generation network according to the variation of the loss function value, until the loss function converges, obtaining the pre-trained candidate region generation network.
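The N-semantic-scale feature extraction above could be sketched as a simple pooling pyramid; using 2x2 average pooling as a stand-in for the network's convolution stages is an assumption made purely for illustration:

```python
import numpy as np

def multi_scale_features(img, n=3):
    """Extract sample features at N semantic scales: each scale is a
    2x2 average-pooled version of the previous one, standing in for
    the convolution stages of the candidate region generation network."""
    feats = [img.astype(float)]
    for _ in range(n - 1):
        f = feats[-1]
        h, w = f.shape[0] // 2 * 2, f.shape[1] // 2 * 2   # even crop
        f = f[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        feats.append(f)
    return feats

img = np.arange(256, dtype=float).reshape(16, 16)
scales = multi_scale_features(img, n=3)
print([f.shape for f in scales])   # [(16, 16), (8, 8), (4, 4)]
```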
Optionally, the processing module 602 is specifically configured to: adjust the parameters of the generation network by a preset gradient descent according to the variation of the loss function value.
Optionally, the processing module 602 is further configured to:
before the sample region features of the M semantic scales are fused according to the candidate region generation network to obtain the fused sample region features, fuse the sample region features of every two of the N semantic scales, obtaining a plurality of predicted fused sample region features;
perform confidence evaluation on the plurality of predicted fused sample region features, obtaining a plurality of confidence values;
take the two semantic scales corresponding to the maximum confidence value as the M semantic scales.
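The pairwise fusion and confidence-based selection of the M = 2 scales might look like the following sketch; the averaging fusion and the norm-based confidence function are placeholder assumptions, since the patent does not fix either operation here:

```python
from itertools import combinations

import numpy as np

rng = np.random.default_rng(1)

def select_scales(scale_feats, confidence):
    """Fuse the sample region features of every pair of the N semantic
    scales, score each fused result with a confidence function, and
    keep the pair (the M = 2 scales) with the highest confidence."""
    best_pair, best_conf, best_fused = None, -np.inf, None
    for i, j in combinations(range(len(scale_feats)), 2):
        fused = (scale_feats[i] + scale_feats[j]) / 2.0  # averaging stands in for the fusion op
        conf = confidence(fused)
        if conf > best_conf:
            best_pair, best_conf, best_fused = (i, j), conf, fused
    return best_pair, best_fused

# N = 4 semantic scales, 16-d features each; feature norm stands in for confidence
feats = [rng.standard_normal(16) for _ in range(4)]
pair, fused = select_scales(feats, confidence=lambda f: float(np.linalg.norm(f)))
print(pair)
```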
Optionally, the processing module 602 is specifically configured to:
normalize the candidate region, obtaining a normalized candidate region;
perform convolution processing on the normalized candidate region, obtaining a convolution processing result, and perform deconvolution processing on the normalized candidate region, obtaining a deconvolution processing result;
fuse the convolution processing result and the deconvolution processing result, obtaining a super-resolution-processed candidate region of the same size as the candidate region.
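The conv-branch / deconv-branch super-resolution step could be sketched as follows; the nearest-neighbour size normalization, the specific kernels, and the equal-weight fusion are all illustrative assumptions, not the trained filters of the disclosure:

```python
import numpy as np

def resize_nearest(img, size):
    """Size normalization: nearest-neighbour resize to size x size."""
    h, w = img.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[np.ix_(rows, cols)]

def conv3x3(img, k):
    """'Same' 3x3 convolution with zero padding."""
    p = np.pad(img, 1)
    out = np.zeros_like(img, dtype=float)
    for i in range(3):
        for j in range(3):
            out += k[i, j] * p[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def deconv_x2(img, k):
    """Stride-2 transposed convolution with a 2x2 kernel (upsamples x2)."""
    h, w = img.shape
    out = np.zeros((2 * h, 2 * w))
    for i in range(2):
        for j in range(2):
            out[i::2, j::2] = k[i, j] * img
    return out

def super_resolve(region, size=32):
    """Sketch of the described pipeline: normalize the candidate region
    to a fixed size, run a convolution branch and a deconvolution
    branch, then fuse the two results into a super-resolution-processed
    region of the same (normalized) size."""
    norm = resize_nearest(region, size).astype(float)
    conv_out = conv3x3(norm, np.full((3, 3), 1.0 / 9.0))
    up = deconv_x2(norm, np.full((2, 2), 0.25))
    deconv_out = resize_nearest(up, size)   # bring the x2 map back to the fused size
    return 0.5 * conv_out + 0.5 * deconv_out

region = np.arange(64, dtype=float).reshape(8, 8)
sr = super_resolve(region, size=32)
print(sr.shape)   # (32, 32)
```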
Optionally, the processing module 602 is specifically configured to:
input the fused region features into a pre-trained classification and bounding-box prediction network, perform bounding-box prediction and classification prediction, and output the small-object information corresponding to the candidate region.
Optionally, the neighborhood of the candidate region is the region obtained by expanding the candidate region three times in length and width about its center, excluding the candidate region itself.
It should be noted that Fig. 6 is illustrated in terms of the software modules of the apparatus for detecting small objects in an image. In practice, the acquisition module and the processing module in Fig. 6 may be implemented by a processor in the apparatus, and the processor may be implemented by a CPU, an integrated circuit, or the like.
The above embodiments merely describe the technical solutions of the present application in detail; their description is only intended to help understand the embodiments of the present application and should not be construed as limiting them. Any changes or substitutions that can readily be conceived by those skilled in the art shall fall within the protection scope of the embodiments of the present application.
Claims (10)
1. A method for detecting small objects in an image, characterized by comprising:
obtaining an image to be processed;
performing a plurality of different convolution processes on the image to be processed according to a pre-trained candidate region generation network, and fusing the results of the plurality of convolution processes to generate a candidate region in the image to be processed, wherein the candidate region contains a small object in the image to be processed;
performing size normalization and super-resolution processing on the candidate region, obtaining a super-resolution-processed candidate region;
fusing the features of the super-resolution-processed candidate region with the features of the neighborhood of the candidate region, obtaining fused region features;
identifying the fused region features, obtaining small-object information corresponding to the candidate region, wherein the small-object information includes the category of the small object and the position of the small object in the image to be processed.
2. The method according to claim 1, characterized in that, before the plurality of different convolution processes are performed on the image to be processed according to the pre-trained candidate region generation network and the convolution results are fused to generate the candidate region in the image to be processed, the method comprises:
obtaining a sample image;
performing feature extraction at N semantic scales on the sample image according to a preset candidate region generation network, obtaining sample features at the N semantic scales, N being a positive integer;
fusing, according to the preset candidate region generation network, the sample region features of M semantic scales among the sample features of the N semantic scales, obtaining fused sample features; predicting the object information in the sample image according to the fused sample features, obtaining the probability corresponding to the current prediction result, M being a positive integer less than or equal to N;
calculating a loss function value from the probability, and adjusting the parameters of the preset candidate region generation network according to the variation of the loss function value, until the loss function converges, obtaining the pre-trained candidate region generation network.
3. The method according to claim 2, characterized in that adjusting the parameters of the generation network according to the variation of the loss function value comprises:
adjusting the parameters of the generation network by a preset gradient descent according to the variation of the loss function value.
4. The method according to claim 3, characterized in that, before the sample region features of the M semantic scales are fused according to the candidate region generation network to obtain the fused sample region features, the method comprises:
fusing the sample region features of every two of the N semantic scales among the sample region features at the N semantic scales, obtaining a plurality of predicted fused sample region features;
performing confidence evaluation on the plurality of predicted fused sample region features, obtaining a plurality of confidence values;
taking the two semantic scales corresponding to the maximum confidence value as the M semantic scales.
5. The method according to any one of claims 1 to 4, characterized in that performing size normalization and super-resolution processing on the candidate region to obtain the super-resolution-processed candidate region comprises:
normalizing the candidate region, obtaining a normalized candidate region;
performing convolution processing on the normalized candidate region, obtaining a convolution processing result, and performing deconvolution processing on the normalized candidate region, obtaining a deconvolution processing result;
fusing the convolution processing result and the deconvolution processing result, obtaining a super-resolution-processed candidate region of the same size as the candidate region.
6. The method according to claim 5, characterized in that identifying the fused region features to obtain the small-object information corresponding to the candidate region comprises:
inputting the fused region features into a pre-trained classification and bounding-box prediction network, performing bounding-box prediction and classification prediction, and outputting the small-object information corresponding to the candidate region.
7. The method according to claim 5, characterized in that the neighborhood of the candidate region is the region obtained by expanding the candidate region three times in length and width about its center, excluding the candidate region itself.
8. An apparatus for detecting small objects in an image, characterized by comprising:
an acquisition module, configured to obtain an image to be processed;
a processing module, configured to perform a plurality of different convolution processes on the image according to a pre-trained generation network, fuse the results of the plurality of convolution processes, and generate a candidate region in the image to be processed, wherein the candidate region contains a small object in the image to be processed;
the processing module is further configured to perform size normalization and super-resolution processing on the candidate region, obtaining a super-resolution-processed candidate region;
the processing module is further configured to fuse the features of the super-resolution-processed candidate region with the features of the neighborhood of the candidate region, obtaining fused region features;
the processing module is further configured to identify the fused region features, obtaining the small-object information corresponding to the candidate region, wherein the small-object information includes the category of the small object and the position of the small object in the image to be processed.
9. The apparatus according to claim 8, characterized in that:
the acquisition module is further configured to obtain a sample image before the candidate region in the image to be processed is generated according to the pre-trained candidate region generation network;
the processing module is further configured to perform feature extraction at N semantic scales on the sample image according to a preset candidate region generation network, obtaining sample features at the N semantic scales, N being a positive integer;
the processing module is further configured to fuse, according to the preset candidate region generation network, the sample region features of M semantic scales among the sample features of the N semantic scales, obtaining fused sample features; predict the object information in the sample image according to the fused sample features, obtaining the probability corresponding to the current prediction result, M being a positive integer less than or equal to N;
the processing module is further configured to calculate a loss function value from the probability and adjust the parameters of the preset candidate region generation network according to the variation of the loss function value, until the loss function converges, obtaining the pre-trained candidate region generation network.
10. The apparatus according to claim 9, characterized in that the processing module is specifically configured to:
adjust the parameters of the generation network by a preset gradient descent according to the variation of the loss function value.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910410363.5A CN110147753A (en) | 2019-05-17 | 2019-05-17 | The method and device of wisp in a kind of detection image |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110147753A true CN110147753A (en) | 2019-08-20 |
Family
ID=67594156
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910410363.5A Pending CN110147753A (en) | 2019-05-17 | 2019-05-17 | The method and device of wisp in a kind of detection image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110147753A (en) |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106127684A (en) * | 2016-06-22 | 2016-11-16 | 中国科学院自动化研究所 | Image super-resolution Enhancement Method based on forward-backward recutrnce convolutional neural networks |
US20170147905A1 (en) * | 2015-11-25 | 2017-05-25 | Baidu Usa Llc | Systems and methods for end-to-end object detection |
US20180039853A1 (en) * | 2016-08-02 | 2018-02-08 | Mitsubishi Electric Research Laboratories, Inc. | Object Detection System and Object Detection Method |
US20180165551A1 (en) * | 2016-12-08 | 2018-06-14 | Intel Corporation | Technologies for improved object detection accuracy with multi-scale representation and training |
CN108509978A (en) * | 2018-02-28 | 2018-09-07 | 中南大学 | The multi-class targets detection method and model of multi-stage characteristics fusion based on CNN |
CN108647682A (en) * | 2018-05-17 | 2018-10-12 | 电子科技大学 | A kind of brand Logo detections and recognition methods based on region convolutional neural networks model |
CN108898078A (en) * | 2018-06-15 | 2018-11-27 | 上海理工大学 | A kind of traffic sign real-time detection recognition methods of multiple dimensioned deconvolution neural network |
CN108960074A (en) * | 2018-06-07 | 2018-12-07 | 西安电子科技大学 | Small size pedestrian target detection method based on deep learning |
CN109117876A (en) * | 2018-07-26 | 2019-01-01 | 成都快眼科技有限公司 | A kind of dense small target deteection model building method, model and detection method |
CN109284669A (en) * | 2018-08-01 | 2019-01-29 | 辽宁工业大学 | Pedestrian detection method based on Mask RCNN |
CN109583321A (en) * | 2018-11-09 | 2019-04-05 | 同济大学 | The detection method of wisp in a kind of structured road based on deep learning |
CN109753946A (en) * | 2019-01-23 | 2019-05-14 | 哈尔滨工业大学 | A kind of real scene pedestrian's small target deteection network and detection method based on the supervision of body key point |
Non-Patent Citations (4)
Title |
---|
SHAOQING REN等: "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", 《 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》 * |
YINGXIN LOU等: "Improve object detection via a multi-feature and multi-task CNN model", 《2017 IEEE VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP)》 * |
李华清: "基于SSD的航拍图像小目标快速检测算法研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
王昊然: "基于多层卷积特征高阶融合的多任务目标检测系统研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110443229A (en) * | 2019-08-22 | 2019-11-12 | 国网四川省电力公司信息通信公司 | A kind of equipment display content identification method based on artificial intelligence |
CN110533105A (en) * | 2019-08-30 | 2019-12-03 | 北京市商汤科技开发有限公司 | A kind of object detection method and device, electronic equipment and storage medium |
CN110533105B (en) * | 2019-08-30 | 2022-04-05 | 北京市商汤科技开发有限公司 | Target detection method and device, electronic equipment and storage medium |
CN113223059A (en) * | 2021-05-17 | 2021-08-06 | 浙江大学 | Weak and small airspace target detection method based on super-resolution feature enhancement |
CN113505256A (en) * | 2021-07-02 | 2021-10-15 | 北京达佳互联信息技术有限公司 | Feature extraction network training method, image processing method and device |
KR102599190B1 (en) * | 2022-06-24 | 2023-11-07 | 주식회사 포딕스시스템 | Apparatus and method for object detection based on image super-resolution of an integrated region of interest |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190820 |