CN109166106B - Target detection position correction method and device based on sliding window - Google Patents


Info

Publication number
CN109166106B
CN109166106B (application CN201810871600.3A)
Authority
CN
China
Prior art keywords
confidence
sliding window
target
area
region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810871600.3A
Other languages
Chinese (zh)
Other versions
CN109166106A (en)
Inventor
赵梦莹
张俊男
李睿豪
潘煜
贾智平
蔡晓军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University filed Critical Shandong University
Priority to CN201810871600.3A priority Critical patent/CN109166106B/en
Publication of CN109166106A publication Critical patent/CN109166106A/en
Application granted granted Critical
Publication of CN109166106B publication Critical patent/CN109166106B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a target detection position correction method and device based on a sliding window. The width and the moving stride of the sliding window are set, and the image of the target to be detected is segmented with the sliding window to obtain a plurality of candidate target areas; all candidate target areas are sent into a CNN neural network for training, yielding the confidence of each candidate target area; the maximum confidence value and its corresponding index area are selected as reference values; and the candidate target areas are cut and combined with a position correction method and the reference values to form a new target area. On the basis of a convolutional neural network and a sliding window, the invention provides a combinable and cuttable localization method for a single target in an image, improving both the accuracy and the speed of target recognition.

Description

Target detection position correction method and device based on sliding window
Technical Field
The invention relates to the field of image processing, in particular to a target detection position correction method and device based on a sliding window.
Background
In the information age, the acquisition, processing, and application of information are all expanding dramatically. Image information is an important source of knowledge: on many occasions, the information conveyed by an image is richer, more faithful, and more specific than other forms of information. The cooperation of the human eye and brain lets people acquire, process, and understand visual information, and vision is a highly efficient channel for perceiving the external environment. According to statistics cited by some scholars, about 80% of the external information humans obtain comes through the eyes as images. Vision is thus the main carrier by which humans obtain external information, and a computer must be able to process image information to achieve intelligence. In recent years in particular, large-volume image data processing involving graphics, images, and video has been widely applied in medicine, transportation, industrial automation, and other fields.
In recent years, machine learning has received a great deal of academic and engineering attention. Within machine learning, the Convolutional Neural Network (CNN) is a deep feedforward artificial neural network, generally comprising convolutional layers, normalization layers, pooling layers, and fully-connected layers, and it has been applied successfully to image recognition. CNNs have become a research hotspot in many scientific fields, especially pattern classification, because the network avoids complex image preprocessing: it can take the original image directly as input and classify it, which has led to its wide application.
Target detection is an important topic in image processing and target recognition; its main task is to locate and classify targets in a given image, and methods based on sliding-window search are widely used for it. However, the conventional sliding-window search technique has the following disadvantages: (1) the window size is fixed, so the size of the segmented image cannot adapt to the size of the target; (2) if several groups of sliding windows of different sizes run simultaneously, the amount of computation inevitably increases and efficiency suffers; (3) when the sliding stride is small, the data volume grows and speed suffers, while when the sliding stride is too large, detection accuracy suffers.
In summary, an effective solution to the problem of low accuracy and efficiency of target detection in the prior art is still lacking.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a target detection position correction method and a target detection position correction device based on a sliding window.
The technical scheme adopted by the invention is as follows:
a first object of the present invention is to provide a target detection position correction method based on a sliding window, the method comprising the steps of:
setting the width and the moving stride of a sliding window, and segmenting the image of the target to be detected by using the sliding window to obtain a plurality of candidate target areas;
sending all candidate target areas into a CNN neural network for training, and obtaining the confidence of each candidate target area;
selecting the maximum confidence value and its corresponding index area as reference values;
and cutting and combining the candidate target areas by using a position correction method and the reference values to form a new target area.
Further, the width of the sliding window is determined according to the average size of all the objects to be detected, and the moving stride of the sliding window is less than or equal to half the width of the sliding window.
Further, the step of sending all candidate target regions into the CNN neural network for training includes:
taking the candidate target areas whose overlap rate with the target area is smaller than a threshold I as noise, taking those whose overlap rate is larger than the threshold I as targets, and inputting both into a CNN neural network for training;
and obtaining the confidence degrees of all candidate target regions by using the trained CNN neural network.
Further, when there are too many noise areas, some noise areas are deleted at random by using a random sampling method, or pictures of the corresponding training set are deleted.
Further, the maximum confidence value is selected from all the confidence values output by the CNN neural network, and the maximum confidence value together with its corresponding index area is used as the reference values.
Further, according to the width of the sliding window and the size of the target to be detected, the breadth-first traversal depth is used as a traversal constraint condition, and position correction is performed when the maximum breadth-first traversal depth is less than or equal to 2.
Further, the method for cutting and combining the candidate target areas by using the position correction method and the reference values comprises:
taking the index area corresponding to the maximum confidence as the center-point area;
setting a region-strength threshold T1, a confidence activation threshold T2, and a confidence suppression threshold T3;
taking the center-point area as the origin, and taking the four adjacent areas, upper, lower, left, and right, as the current candidate diffusion areas;
based on the breadth-first traversal algorithm, computing the difference between the confidence of the current diffusion area and the maximum confidence of the index area's center point, and comparing the diffusion area's confidence with the confidence activation threshold and the confidence suppression threshold respectively;
if the difference between the confidence of a diffusion area in some direction and the maximum confidence is less than T1, and that diffusion area's confidence is greater than T2, expanding the boundary of the central area in the direction of that diffusion area;
if the confidence of the diffusion area in some direction is less than T3, the target area does not extend in that direction; the target to be detected lies within the index area corresponding to the maximum value, and the boundary of the central area in that direction is shrunk inward.
A second object of the present invention is to provide a sliding-window-based object detection position correction apparatus, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the following steps, including:
setting the width and the moving stride of a sliding window, and segmenting the image of the target to be detected by using the sliding window to obtain a plurality of candidate target areas;
sending all candidate target areas into a CNN neural network for training, and obtaining the confidence of each candidate target area;
selecting the maximum confidence value and its corresponding index area as reference values;
and cutting and combining the candidate target areas by using a position correction method and the reference values to form a new target area.
Compared with the prior art, the invention has the beneficial effects that:
(1) the size of the sliding window is set according to the average size of the object to be detected, so that the target area is more elastic under combining and cutting, and the area where the target lies can be detected with fewer combining and cutting operations; the moving stride is set to half the size of the sliding window, which speeds up target detection while the windows still keep a large overlap area;
(2) the method is based on breadth-first traversal and adds the traversal depth, derived from the size of the sliding window and the true size of the detected target, as a traversal constraint condition, namely that the maximum breadth-first traversal depth is less than or equal to 2; the combining and cutting thus describe the height and width of the target to be detected instead of the original square, effectively improving the accuracy of target detection.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application.
FIG. 1 is a flowchart of a sliding window-based target detection position correction method according to a first embodiment of the present invention;
FIG. 2 is a flowchart of a sliding window-based target detection position correction method according to a second embodiment of the present invention;
FIG. 3 is a diagram illustrating attribute values of candidate target regions of an image;
FIG. 4 is a schematic view of maximum depth;
FIG. 5 is an exemplary diagram of a cropping candidate region;
fig. 6 is an exemplary diagram of a combination candidate region.
Detailed Description
The invention is further described with reference to the following figures and examples.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
As noted in the background, the conventional sliding-window search technique has several defects: the window size is fixed, so the size of the segmented image cannot change with the size of the target; running several groups of sliding windows of different sizes simultaneously increases the amount of computation and hurts efficiency; a dense sliding stride increases the data volume and hurts speed; and an overly large sliding stride hurts detection accuracy.
In view of the foregoing disadvantages, an embodiment of the present invention provides a target detection position correction method based on a sliding window. As shown in fig. 1, the method comprises the steps of:
and S101, setting the size and the moving stride of a sliding window, and segmenting the image by using the sliding window.
Firstly, collecting an image of an object to be detected; then the sliding window size and the moving step are set.
When setting the size of the sliding window, the window is square by default; its size is determined according to the average size of the detected object, and it should not be chosen too large.
When choosing the moving stride s, it is kept at no more than half the sliding-window width, i.e. 0 < s ≤ slide.width/2, where slide.width is the width of the sliding window; this ensures that adjacent windows have a large overlap area and improves recognition accuracy.
After the size of the sliding window and the moving stride are set, the sliding window is used to segment the original image for target detection, obtaining a plurality of candidate target areas.
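This segmentation step can be sketched as follows; this is a minimal illustration assuming a square window and pixel coordinates, and the function name and box representation are not from the patent:

```python
def slide_windows(img_w, img_h, win, stride):
    """Enumerate square sliding-window regions (x, y, win, win) over an image."""
    regions = []
    y = 0
    while y + win <= img_h:
        x = 0
        while x + win <= img_w:
            regions.append((x, y, win, win))
            x += stride
        y += stride
    return regions

# 8x8 image, window 4, stride 2 (= win/2): 3 positions per axis, 9 regions
print(len(slide_windows(8, 8, 4, 2)))  # 9
```

With the stride at half the window width, each region overlaps its neighbor by half, which is the overlap condition the text requires.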
And S102, sending all the obtained candidate target regions into a CNN neural network for training, and obtaining confidence degrees of all the candidate target regions.
When training the weights of the CNN neural network, the candidate target areas whose overlap rate IoU (Intersection over Union) with the target area does not exceed the threshold I are used as noise, while those whose IoU with the target area exceeds the threshold I are used as targets, and both are put into the CNN neural network for training.
The expression for the target area overlap rate IoU is:
IoU = Area of Overlap / Area of Union
in the formula, Area of Overlap indicates the intersection of the correct result area and the detection result area; Area of Union represents the union of the correct result area and the detection result area; DetectionResult represents the detected target position; and GroundTruth represents the true position of the target.
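For two axis-aligned boxes this definition can be computed directly; a minimal sketch, where the (x, y, w, h) box representation is an illustrative assumption:

```python
def iou(a, b):
    """Intersection over Union of two axis-aligned boxes given as (x, y, w, h)."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(a[0], b[0]))  # width of the overlap
    ih = max(0, min(ay2, by2) - max(a[1], b[1]))  # height of the overlap
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

# Two 4x4 boxes offset by 2 px in x: overlap 2*4 = 8, union 16 + 16 - 8 = 24
print(iou((0, 0, 4, 4), (2, 0, 4, 4)))  # 8/24 ≈ 0.333
```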
When there are too many noise areas, some noise areas may be deleted at random using a random sampling method, or some pictures of the corresponding training set may be removed.
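A minimal sketch of this noise subsampling; the 3:1 noise-to-target cap, the fixed seed, and the function name are illustrative assumptions, since the patent does not fix a ratio:

```python
import random

def subsample_noise(noise_regions, targets, max_ratio=3.0, seed=0):
    """Randomly drop noise regions until len(noise) <= max_ratio * len(targets)."""
    limit = int(max_ratio * len(targets))
    if len(noise_regions) <= limit:
        return list(noise_regions)
    rng = random.Random(seed)  # fixed seed keeps the training set reproducible
    return rng.sample(noise_regions, limit)

noise = list(range(10))                 # 10 noise region ids
kept = subsample_noise(noise, targets=[0, 1])  # limit = 3.0 * 2 = 6
print(len(kept))  # 6
```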
The threshold I (0 < I < 1) needs to be adjusted according to the actual training situation, i.e. in the direction of higher accuracy.
Finally, the confidences score[1], score[2], ..., score[n] of all candidate target areas obtained by the sliding window are produced as the output of the CNN neural network.
S103, from all the confidences output by the CNN neural network, the maximum value max_score = max{score[1], score[2], ..., score[n]} and its corresponding index area v ∈ [1, n] are selected as the reference values.
In this embodiment, the maximum value is selected from all the confidence values output by the CNN neural network, that is, the most likely position of the target to be detected is selected, and then the position is taken as the center point to perform adjustment.
When the original image is segmented sequentially by the sliding window, a bounds array likewise records the information of each segmented image block in order, in one-to-one correspondence with score[i]. If score[v] is the maximum value, then bounds[v] is the corresponding image block, called the index area.
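The selection of the reference values max_score, v, and bounds[v] described above can be sketched as follows (names follow the description; the tuple return shape is an illustrative choice):

```python
def pick_reference(score, bounds):
    """Return (max_score, v, bounds[v]): the maximum confidence, its index,
    and the corresponding index area, with score[i] and bounds[i] paired."""
    v = max(range(len(score)), key=lambda i: score[i])
    return score[v], v, bounds[v]

scores = [0.10, 0.85, 0.40]
boxes = [(0, 0, 4, 4), (2, 0, 4, 4), (4, 0, 4, 4)]
print(pick_reference(scores, boxes))  # (0.85, 1, (2, 0, 4, 4))
```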
S104, the candidate target areas are cut and combined by using the position correction method BFS_Revise_Bounds() and the reference values max_score and v to form a new target area.
The method is based on BFS (Breadth-First Search) traversal and adds the traversal depth extend, derived from the size of the sliding window and the true size of the detection target, as a traversal constraint condition, namely that the maximum traversal depth extend is less than or equal to 2 during position correction.
The cutting and combining method specifically comprises the following steps:
taking the index area corresponding to the maximum confidence max_score as the center, diffusing outward with the breadth-first traversal (BFS) method;
comparing the diffusion-area confidence score[w] with the maximum index-area confidence max_score. The invention sets three thresholds, T1, T2, and T3. T1 bounds the difference between the diffusion-area confidence score[w] and the maximum confidence max_score, and represents the strength of the link between the two areas; T2 is the confidence activation threshold: if the current diffusion area's confidence score[w] is greater than T2, the current area has high reliability; T3 is the confidence suppression threshold: if the current diffusion area's confidence score[w] is less than T3, the current area's reliability is low. The values of T1, T2, and T3 can be set flexibly according to the relation between the actual confidences obtained by the CNN neural network and the detection target.
If the difference between the current diffusion area's confidence score[w] and the maximum confidence max_score is less than T1, and score[w] is greater than T2, the diffusion area is closely linked to the index area and w is a high-reliability area. In this case the boundary should be enlarged toward the diffusion area w, and the peripheral areas adjacent to w should be considered for inclusion in the target area.
If the current diffusion area's confidence score[w] is less than T3, the area w is weakly linked to the target area and the target area lies within the candidate area corresponding to max_score; the boundary in the direction of w should be narrowed, and the extension toward the diffusion area w should be abandoned.
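The expand/shrink decision for one diffusion direction can be summarized as a small rule; the threshold values below are illustrative, since the patent leaves T1, T2, and T3 tunable:

```python
T1, T2, T3 = 0.15, 0.60, 0.30  # illustrative values, not fixed by the patent

def decide(max_score, w_score):
    """Expand / shrink / keep decision for one diffusion direction."""
    if max_score - w_score < T1 and w_score > T2:
        return "expand"   # strongly linked and reliable: grow boundary toward w
    if w_score < T3:
        return "shrink"   # weakly linked: pull the boundary back from w
    return "keep"         # neither condition met: leave the boundary unchanged

print(decide(0.9, 0.8))  # expand
print(decide(0.9, 0.2))  # shrink
```

Note the middle band (T3 ≤ score[w] and the strong-link test fails) leaves the boundary as-is, which matches the text only specifying the two explicit cases.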
And S105, outputting the corrected target area coordinates.
The embodiment of the invention provides a sliding-window-based target detection position correction method in which the size of the sliding window is set according to the size of the detected object, so the size of the segmented image can change with the size of the target; the moving stride is set to half the sliding-window size, ensuring a large window overlap area and improving recognition accuracy; and the breadth-first traversal depth, derived from the size of the sliding window and the true size of the detected target, is added as a traversal constraint condition, namely position correction is performed when the maximum breadth-first traversal depth is less than or equal to 2, effectively improving the accuracy of target detection.
In order to make those skilled in the art better understand the present invention, the second embodiment of the present invention provides a sliding window based target detection position correction method, as shown in fig. 2, the method includes the following steps:
S201, selecting the width slide.width and the moving stride s of the sliding window, and processing the image of the object to be detected with the sliding window to obtain candidate areas 1 to n.
S202, all the obtained candidate regions are sent to a CNN neural network for training, and the trained CNN neural network is used for obtaining the confidence score [ i ] of the candidate regions, as shown in FIG. 3.
S203, selecting the maximum confidence max_score and its corresponding index area v from all candidate-area confidences score[i] by using the max() method; the maximum confidence max_score and the index area v serve as the reference values.
And S204, calling a position correction method BFS _ Revise _ Bounds () to correct the boundary region of the candidate region.
The BFS_Revise_Bounds() position correction method specifically comprises the following steps:
S2041, initializing a queue Q; initializing the candidate-area depth array extend[1..n] = +∞.
The queue Q is a one-dimensional first-in-first-out queue that stores the center points of the current breadth-first traversal. It initially holds the area v corresponding to the maximum value, whose extend is 0. The first round of the loop then enqueues the left, upper, right, and lower neighbors of v, combining and cutting as it goes; when that round finishes, extend is 1, and the next round begins. The next round takes the left point as a center point and processes its own left, upper, right, and lower neighbors, combining and cutting; then the upper point is taken as a center point in the same way, and so on, until the lower point's left, upper, right, and lower neighbors have been traversed, at which point extend is 2. When no new elements remain in queue Q, the loop exits.
The extend array stores each index point's distance from the maximum-value area v and is initialized to positive infinity.
S2042, accessing the index region point v corresponding to the maximum confidence coefficient; visited [ v ] ═ 1, extend [ v ] ═ 0; the center point v is queued in queue Q.
The visited array indicates whether the area has been visited, and if the visited [ v ] is 1, the area has been visited; visited [ v ] ═ 0; indicating that it has not been accessed.
And S2043, if the queue Q is not empty, continuing to execute, otherwise, jumping to S20413.
S2044, dequeuing the head element of the queue Q, and assigning to temp.
S2045, w = the left/upper/right/lower index point of temp.
Left/upper/right/lower means one neighbor is accessed per iteration, in order: left first, then upper, then right, then lower. This happens inside a loop statement; if a neighbor does not exist, the loop moves on to the next direction.
S2046, if w exists, executing downwards, otherwise, jumping to S2044;
S2047, if w has not been visited, executing downwards, otherwise jumping to S20412;
S2048, accessing w: setting visited[w] = 1 and extend[w] = extend[temp] + 1, as shown in fig. 4;
S2049, if the confidence of the diffusion area w satisfies max_score - score[w] ≤ T1 and score[w] ≥ T2, then the left/upper/right/lower boundary of the area corresponding to the center point v is expanded to the corresponding left/upper/right/lower boundary of the diffusion area w, as shown in fig. 5;
S20410, if extend[w] < 2, w is enqueued in queue Q;
S20411, if the confidence of the diffusion area w satisfies score[w] ≤ T3, the left/right/upper/lower boundary toward the diffusion area w is not enlarged, and the corresponding left/right/upper/lower boundary of the area with maximum confidence max_score is reduced by d1, as shown in fig. 6;
S20412, taking the next (upper/right/lower) adjacent confidence point of temp as w and jumping to S2046; when no further w exists, jumping to S2044;
and S20413, returning to the corrected target area boundary coordinates.
Here, T1, T2, and T3 can be set flexibly according to the relation between the actual confidences obtained by the CNN neural network and the detection target, and d1 is set flexibly according to the size of the sliding window and the size of the target object.
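The loop of steps S2041 to S20413 can be sketched in Python under stated assumptions: the grid layout of window confidences, the mapping of areas to pixel coordinates, and restricting shrinking to the center area's immediate neighbors are illustrative choices, not the patent's exact specification.

```python
from collections import deque

def bfs_revise_bounds(score, v, win, stride, T1, T2, T3, d1):
    """Sketch of BFS_Revise_Bounds() on a grid of sliding-window confidences.
    score: dict {(row, col): confidence}; v: (row, col) of the maximum-confidence
    index area; win/stride in pixels; d1: shrink step.
    Returns the corrected box (x1, y1, x2, y2) in pixels."""
    max_score = score[v]
    r0, c0 = v
    x1, y1 = c0 * stride, r0 * stride        # initial box = the index area itself
    x2, y2 = x1 + win, y1 + win
    visited = {v}
    extend = {v: 0}
    Q = deque([v])
    while Q:
        temp = Q.popleft()
        for dr, dc in ((0, -1), (-1, 0), (0, 1), (1, 0)):  # left, up, right, down
            w = (temp[0] + dr, temp[1] + dc)
            if w not in score or w in visited:
                continue
            visited.add(w)
            extend[w] = extend[temp] + 1
            if max_score - score[w] <= T1 and score[w] >= T2:
                # strongly linked and reliable: enlarge the box toward w
                if dc < 0: x1 = min(x1, w[1] * stride)
                if dc > 0: x2 = max(x2, w[1] * stride + win)
                if dr < 0: y1 = min(y1, w[0] * stride)
                if dr > 0: y2 = max(y2, w[0] * stride + win)
            elif score[w] <= T3 and extend[w] == 1:
                # weakly linked neighbor of the center: shrink that side by d1
                if dc < 0: x1 += d1
                if dc > 0: x2 -= d1
                if dr < 0: y1 += d1
                if dr > 0: y2 -= d1
            if extend[w] < 2:                # traversal depth capped at 2
                Q.append(w)
    return x1, y1, x2, y2

# 3x3 grid of window confidences, win=4, stride=2, center v=(1,1):
score = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.85, (1, 1): 0.90, (1, 2): 0.50,
    (2, 0): 0.30, (2, 1): 0.85, (2, 2): 0.40,
}
print(bfs_revise_bounds(score, (1, 1), win=4, stride=2,
                        T1=0.15, T2=0.60, T3=0.25, d1=1))  # (0, 3, 6, 8)
```

In the example, the box expands left and down toward the two strong neighbors (0.85), shrinks its top edge by d1 away from the weak upper neighbor (0.20), and leaves the right edge unchanged for the middling neighbor (0.50).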
The embodiment of the invention provides a sliding-window-based target detection position correction method: the size of the sliding window is determined according to the average size of the detected object, and the moving stride is less than half the window, ensuring a large window overlap area and improving recognition accuracy; taking the area corresponding to the maximum confidence as the center, the method diffuses outward by the BFS method, with the thresholds T1, T2, and T3 set flexibly according to the relation between the actual confidences obtained by the CNN neural network and the detection target; the boundary area is thereby corrected, improving both the accuracy and the speed of target recognition.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, this does not limit the scope of the invention; those skilled in the art should understand that various modifications and variations can be made, without inventive effort, on the basis of the technical solution of the invention.

Claims (5)

1. A target detection position correction method based on a sliding window is characterized by comprising the following steps:
setting the width and the moving stride of a sliding window, and segmenting the image of the target to be detected by using the sliding window to obtain a plurality of candidate target areas;
determining the width of the sliding window according to the average size of all the objects to be detected; the moving stride of the sliding window being less than or equal to half the width of the sliding window;
sending all candidate target areas into a CNN neural network for training, and obtaining the confidence of each candidate target area;
selecting the maximum confidence value and its corresponding index area as reference values;
cutting and combining the candidate target areas by using a position correction method and the reference values to form a new target area; according to the width of the sliding window and the size of the target to be detected, the breadth-first traversal depth is used as a traversal constraint condition, and position correction is performed when the maximum breadth-first traversal depth is less than or equal to 2;
the method for cutting and combining the candidate target areas by using the position correction method and the reference values comprising:
taking the index area corresponding to the maximum confidence as the center point;
setting a region-strength threshold T1, a confidence activation threshold T2, and a confidence suppression threshold T3;
taking the center-point area as the origin, and taking the four adjacent areas, upper, lower, left, and right, as the current candidate diffusion areas;
based on the breadth-first traversal algorithm, computing the difference between the confidence of the current diffusion area and the maximum confidence of the index area's center point, and comparing the diffusion area's confidence with the confidence activation threshold and the confidence suppression threshold respectively;
if the difference between the confidence of a diffusion area in some direction and the maximum confidence is less than T1, and that diffusion area's confidence is greater than T2, expanding the boundary of the central area in the direction of that diffusion area;
if the confidence of the diffusion area in some direction is less than T3, the target area does not extend in that direction; the target to be detected lies within the index area corresponding to the maximum value, and the boundary of the central area in that direction is shrunk inward.
2. The sliding-window-based object detection position correction method according to claim 1, wherein the step of sending all candidate object regions into a CNN neural network for training comprises:
taking the candidate target areas whose overlap rate with the target area is smaller than a threshold I as noise, taking those whose overlap rate is larger than the threshold I as targets, and inputting both into a CNN neural network for training;
and obtaining the confidence degrees of all candidate target regions by using the trained CNN neural network.
3. The sliding-window based object detection position correction method as claimed in claim 2, wherein when the noise area is excessive, a plurality of noise areas are randomly deleted or pictures of a corresponding training set are deleted by using a random sampling method.
4. The sliding-window-based object detection position correction method according to claim 1, wherein a maximum confidence value is selected from all confidence values output by the CNN neural network, and an index region corresponding to the maximum confidence value and the maximum confidence value is used as a reference value.
5. A sliding-window-based target detection position correction apparatus, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the following steps when executing the program:
setting the width and the moving stride of a sliding window, and segmenting the image of the target to be detected with the sliding window to obtain a plurality of candidate target regions;
determining the width of the sliding window according to the average size of all targets to be detected, the moving stride of the sliding window being less than or equal to half of the sliding window width;
sending all candidate target regions into a CNN neural network for training to obtain the confidences of all candidate target regions;
selecting the maximum confidence value and taking the maximum value together with its corresponding index region as a reference value;
cutting and combining the candidate target regions by using a position correction method and the reference value to form a new target region;
according to the sliding-window width and the size of the target to be detected, using the depth of the breadth-first traversal as a traversal constraint, and performing position correction when the maximum depth of the breadth-first traversal is less than or equal to 2;
wherein the cutting and combining of the candidate target regions by using the position correction method and the reference value comprises:
taking the index region corresponding to the maximum confidence as the center point;
setting a region intensity threshold T1, a confidence activation threshold T2, and a confidence suppression threshold T3;
taking the center-point region as the origin, and taking the four adjacent regions above, below, to the left, and to the right of it as the current candidate diffusion regions;
based on a breadth-first traversal algorithm, computing the difference between the confidence of each current diffusion region and the maximum confidence of the center index region, and comparing the confidence of each diffusion region with the confidence activation threshold and the confidence suppression threshold, respectively;
if the difference between the confidence of a diffusion region in a given direction and the maximum confidence is less than T1, and the confidence of that diffusion region is greater than T2, expanding the boundary of the central region in the direction of that diffusion region;
if the confidence of a diffusion region in a given direction is less than T3, the target region is not extended in that direction; the target to be detected lies within the index region corresponding to the maximum value, and the boundary of the central region is shrunk in the opposite direction.
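The segmentation step in the claims — window width derived from the average target size, moving stride at most half the window width — can be sketched as a simple enumeration of candidate regions. The image and target sizes below are invented for the example, not from the patent:

```python
def sliding_windows(img_w, img_h, avg_target_size):
    """Enumerate square candidate regions (x, y, w, h) over an image."""
    win = avg_target_size          # window width set from the average target size
    stride = max(1, win // 2)      # moving stride <= half the window width
    regions = []
    for y in range(0, img_h - win + 1, stride):
        for x in range(0, img_w - win + 1, stride):
            regions.append((x, y, win, win))
    return regions
```

With this stride choice, adjacent windows overlap by at least half their width, so a target straddling one window boundary is still fully contained in a neighbouring candidate region — the overlap the position-correction step then resolves.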
CN201810871600.3A 2018-08-02 2018-08-02 Target detection position correction method and device based on sliding window Active CN109166106B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810871600.3A CN109166106B (en) 2018-08-02 2018-08-02 Target detection position correction method and device based on sliding window

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810871600.3A CN109166106B (en) 2018-08-02 2018-08-02 Target detection position correction method and device based on sliding window

Publications (2)

Publication Number Publication Date
CN109166106A CN109166106A (en) 2019-01-08
CN109166106B true CN109166106B (en) 2021-07-30

Family

ID=64898741

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810871600.3A Active CN109166106B (en) 2018-08-02 2018-08-02 Target detection position correction method and device based on sliding window

Country Status (1)

Country Link
CN (1) CN109166106B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112751633B (en) * 2020-10-26 2022-08-26 中国人民解放军63891部队 Broadband spectrum detection method based on multi-scale window sliding

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107665336A (en) * 2017-09-20 2018-02-06 Xiamen University of Technology Multi-target detection method based on Faster RCNN for intelligent refrigerators
CN107679469A (en) * 2017-09-22 2018-02-09 Southeast University–Wuxi Institute of Integrated Circuit Technology Non-maximum suppression method based on deep learning
CN108062531A (en) * 2017-12-25 2018-05-22 Nanjing University of Information Science and Technology Video object detection method based on cascaded regression convolutional neural networks
CN108304808A (en) * 2018-02-06 2018-07-20 Research Institute of Xi'an Jiaotong University in Shunde, Guangdong Surveillance video object detection method based on spatio-temporal information and deep networks
CN108319949A (en) * 2018-01-26 2018-07-24 The 15th Research Institute of China Electronics Technology Group Corporation Multi-orientation ship target detection and recognition method in high-resolution remote sensing images

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9881234B2 (en) * 2015-11-25 2018-01-30 Baidu Usa Llc. Systems and methods for end-to-end object detection
KR20170131081A (en) * 2016-05-20 2017-11-29 S-1 Corporation Method and terminal for detecting an object
CN106327507B (en) * 2016-08-10 2019-02-22 Nanjing University of Aeronautics and Astronautics Color image saliency detection method based on background and foreground information
CN107154024A (en) * 2017-05-19 2017-09-12 Nanjing University of Science and Technology Scale-adaptive target tracking method based on deep-feature kernelized correlation filters
CN108154520B (en) * 2017-12-25 2019-01-08 Beihang University Moving target detection method based on optical flow and frame matching


Also Published As

Publication number Publication date
CN109166106A (en) 2019-01-08

Similar Documents

Publication Publication Date Title
US11842487B2 (en) Detection model training method and apparatus, computer device and storage medium
US10832039B2 (en) Facial expression detection method, device and system, facial expression driving method, device and system, and storage medium
WO2020037932A1 (en) Image quality assessment method, apparatus, electronic device and computer readable storage medium
WO2019174276A1 (en) Method, device, equipment and medium for locating center of target object region
CN109685037A Real-time action recognition method, device and electronic device
US20210158593A1 (en) Pose selection and animation of characters using video data and training techniques
WO2021238586A1 (en) Training method and apparatus, device, and computer readable storage medium
CN112507918A (en) Gesture recognition method
CN109166106B (en) Target detection position correction method and device based on sliding window
CN114299363A (en) Training method of image processing model, image classification method and device
CN111274919A Method, system, server and medium for facial feature detection based on convolutional neural network
CN110363103A Pest identification method, apparatus, computer device and storage medium
CN109948489A Face recognition system and method based on multi-frame video face feature fusion
CN113496215A (en) Method and device for detecting human face of living body and electronic equipment
CN110223291B (en) Network method for training fundus lesion point segmentation based on loss function
CN111488836A (en) Face contour correction method, device, equipment and storage medium
CN109799905B (en) Hand tracking method and advertising machine
CN109299743A (en) Gesture identification method and device, terminal
CN115457646A (en) Device, method and related product for identifying lesions in the periphery of the ocular fundus
CN111626143B (en) Reverse face detection method, system and equipment based on eye positioning
CN113627410A (en) Method for recognizing and retrieving action semantics in video
CN114445691A (en) Model training method and device, electronic equipment and storage medium
CN113591858A (en) Text recognition method and device, electronic equipment and storage medium
CN108549871B Hand segmentation method based on region growing and machine learning
Ko et al. Automatic object-of-interest segmentation from natural images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant