CN110490073A - Object detection method, device, equipment and storage medium - Google Patents



Publication number
CN110490073A
CN110490073A (application CN201910637703.8A)
Authority
CN
China
Prior art keywords
image sequence
target
video data
background
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910637703.8A
Other languages
Chinese (zh)
Inventor
樊龙
黄晓峰
殷海兵
贾惠柱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced Institute of Information Technology AIIT of Peking University
Hangzhou Weiming Information Technology Co Ltd
Original Assignee
Advanced Institute of Information Technology AIIT of Peking University
Hangzhou Weiming Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced Institute of Information Technology AIIT of Peking University, Hangzhou Weiming Information Technology Co Ltd filed Critical Advanced Institute of Information Technology AIIT of Peking University
Priority to CN201910637703.8A
Publication of CN110490073A
Legal status: Pending


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/084: Backpropagation, e.g. using gradient descent
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/50: Context or environment of the image
    • G06V 20/52: Surveillance or monitoring of activities, e.g. for recognising suspicious objects

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

This application discloses an object detection method, apparatus, device, and storage medium. Video data is acquired, a first image sequence of the video data is preprocessed to obtain a second image sequence with the background images removed, and the second image sequence is input into a trained detection model for target detection to obtain a detection result. On the one hand, the background-removed images retain only foreground targets, free of interference from background content, so the detection model focuses more on foreground targets during learning and inference, which improves detection accuracy. On the other hand, because the background pixels of the input image are removed, the detection model sees only foreground pixels, which are unaffected by the scene of the video or picture sequence, so the scene migration performance of target detection is improved.

Description

Target detection method, device, equipment and storage medium
Technical Field
The present application relates to the field of computer vision technologies, and in particular, to a method, an apparatus, a device, and a storage medium for target detection.
Background
Vision is known to be the most direct and effective means of obtaining information, yet most monitoring systems operate in a "record only, no judgment" mode: the video signals obtained by the cameras are transmitted to a control center, where they are analyzed and judged by a human operator. This is a great waste of human resources. With the advent of computer-vision-based intelligent video processing systems, video analysis tasks such as target detection and tracking are now performed using image processing techniques and machine learning methods.
The task of object detection is to find all objects of interest in an image and determine their positions and sizes. Because objects vary in appearance, shape, and posture, and imaging is subject to interference from factors such as illumination and occlusion, target detection has long been one of the most challenging problems in the field of machine vision.
Existing target detection tends to produce false detections in static pictures of scenes with complex backgrounds, so detection accuracy needs to be improved. In addition, existing target detection generalizes poorly to complex monitoring scenes; improving the scene migration performance of a detection algorithm requires training on large datasets, which creates a strong dependency on data.
Disclosure of Invention
The application aims to provide a target detection method, a target detection device, target detection equipment and a storage medium, so as to improve the accuracy of target detection and scene migration performance.
In a first aspect, an embodiment of the present application provides a target detection method, including:
acquiring video data;
preprocessing a first image sequence of the video data to obtain a second image sequence with background images removed;
and inputting the second image sequence into a trained detection model for target detection to obtain a target detection result.
In a possible implementation manner of the method provided in the embodiment of the present application, preprocessing the first image sequence of the video data to obtain the second image sequence with the background images removed includes:
detecting a moving object of a first image sequence of the video data by using a background subtraction method;
and reserving the pixels of the area where the moving target is located, and segmenting them into independent moving target units by using a morphological method, to obtain a second image sequence with the background image removed.
In a possible implementation manner, in the foregoing method provided in this embodiment of the present application, the detection model adopts an SSD framework, where the SSD framework includes: a feature extraction network and a target detection network.
In one possible implementation manner, in the foregoing method provided in an embodiment of the present application, the method further includes training an SSD framework, which includes:
preprocessing an image sequence of sample video data to obtain a sample image sequence with a background image removed;
carrying out artificial target labeling on the sample image sequence to obtain a training data set;
training the SSD framework based on the training dataset: first initializing the parameters to be trained and the hyper-parameters of the network; feeding the training data into the initialized network and propagating it forward to obtain actual outputs; adjusting the network parameters using the loss function together with the back-propagation (BP) algorithm; and iterating until the loss value of the loss function falls below a set threshold or the maximum number of iterations is reached, yielding a trained SSD framework.
In a possible implementation manner, in the foregoing method provided in this embodiment of the present application, the loss function is a weighted sum of the position error and the confidence error.
In a possible implementation manner, in the foregoing method provided in this embodiment of the present application, the confidence error is calculated as follows:
L_{conf}(x, c) = -\sum_{i \in Pos}^{N} x_{ij}^{p} \log\left(\hat{c}_{i}^{p}\right), \qquad \hat{c}_{i}^{p} = \frac{\exp(c_{i}^{p})}{\sum_{p} \exp(c_{i}^{p})}

wherein x_{ij}^{p} \in \{0, 1\} indicates whether the prediction box i matches the real box j with respect to the category p.
In a second aspect, an embodiment of the present application provides an object detection apparatus, including:
the acquisition module is used for acquiring video data;
the preprocessing module is used for preprocessing the first image sequence of the video data to obtain a second image sequence with background images removed;
and the target detection module is used for inputting the second image sequence into a trained detection model for target detection to obtain a target detection result.
In a possible implementation manner, in the apparatus provided in this embodiment of the present application, the preprocessing module is specifically configured to:
detecting a moving object of a first image sequence of the video data by using a background subtraction method;
and reserving the pixels of the area where the moving target is located, and segmenting them into independent moving target units by using a morphological method, to obtain a second image sequence with the background images removed.
In a possible implementation manner, in the foregoing apparatus provided in this embodiment of the present application, the detection model employs an SSD framework, where the SSD framework includes: a feature extraction network and a target detection network.
In a possible implementation manner, in the apparatus provided in this embodiment of the present application, a training module is further included, configured to:
preprocessing an image sequence of sample video data to obtain a sample image sequence with a background image removed;
carrying out artificial target labeling on the sample image sequence to obtain a training data set;
training the SSD framework based on the training dataset: first initializing the parameters to be trained and the hyper-parameters of the network; feeding the training data into the initialized network and propagating it forward to obtain actual outputs; adjusting the network parameters using the loss function together with the back-propagation (BP) algorithm; and iterating until the loss value of the loss function falls below a set threshold or the maximum number of iterations is reached, yielding a trained SSD framework.
In a possible implementation manner, in the foregoing apparatus provided in this embodiment of the present application, the loss function is a weighted sum of the position error and the confidence error.
In a possible implementation manner, in the foregoing apparatus provided in this embodiment of the present application, the confidence error is calculated as follows:
L_{conf}(x, c) = -\sum_{i \in Pos}^{N} x_{ij}^{p} \log\left(\hat{c}_{i}^{p}\right), \qquad \hat{c}_{i}^{p} = \frac{\exp(c_{i}^{p})}{\sum_{p} \exp(c_{i}^{p})}

wherein x_{ij}^{p} \in \{0, 1\} indicates whether the prediction box i matches the real box j with respect to the category p.
In a third aspect, an embodiment of the present application provides an electronic device, including: a memory and a processor;
the memory for storing a computer program;
wherein the processor executes the computer program in the memory to implement the method described in the first aspect and the various embodiments of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium, in which a computer program is stored, and the computer program is used for implementing the method described in the first aspect and the implementation manners of the first aspect when executed by a processor.
Compared with the prior art, the target detection method, apparatus, device, and storage medium provided by this application acquire video data, preprocess a first image sequence of the video data to obtain a second image sequence with the background images removed, and input the second image sequence into a trained detection model for target detection to obtain a detection result. On the one hand, the background-removed images retain only foreground targets, free of interference from background content, so the detection model focuses more on foreground targets during learning and inference, which improves detection accuracy. On the other hand, because the background pixels of the input image are removed, the detection model sees only foreground pixels, which are unaffected by the scene of the video or picture sequence, so the scene migration performance of target detection is improved.
Drawings
Fig. 1 is a schematic flowchart of a target detection method according to an embodiment of the present application;
fig. 2 is a flowchart of a background removal method provided in an embodiment of the present application;
FIG. 3 is a diagram of the overall structure of a target detection system based on an SSD framework according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a target detection apparatus according to a second embodiment of the present application;
fig. 5 is a schematic structural diagram of an electronic device according to a third embodiment of the present application.
Detailed Description
The following detailed description of embodiments of the present application is provided in conjunction with the accompanying drawings, but it should be understood that the scope of the present application is not limited to the specific embodiments.
Throughout the specification and claims, unless explicitly stated otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element or component but not the exclusion of any other element or component.
The problem to be solved by target detection is as follows: finding objects of certain classes in an image or video by means of target boxes, and giving the probability that each object belongs to a certain class; that is, a task combining position-coordinate regression and category prediction.
SSD: chinese characters are fully called: single multi-frame detector, english full name: the SSD framework comprises a feature extraction network and a target detection network, wherein the feature extraction network is used for extracting features of an image, and the target detection network is used for performing position regression and target category prediction according to the extracted features so as to identify the object category in the image.
Fig. 1 is a schematic flowchart of a target detection method according to an embodiment of the present application. In practical applications, the execution subject of this embodiment may be a target detection apparatus, which may be implemented as a virtual device, such as software code, or as a physical device carrying the relevant execution code, such as a USB disk, or as a physical device integrating the relevant execution code, such as a chip, a computer, or a robot.
As shown in fig. 1, the method includes the following steps S101 to S103:
and S101, acquiring video data.
S102, preprocessing the first image sequence of the video data to obtain a second image sequence with background images removed.
S103, inputting the second image sequence into a trained detection model for target detection to obtain a target detection result.
In this embodiment, the video data may be captured by a camera in real time or stored in advance. It can be understood that the video data consists of multiple frames of images and contains objects to be identified, such as people and vehicles. After the video to be detected is obtained, the background is removed from its image sequence and only the foreground targets are retained; that is, only the pixels of the regions where the targets are located are kept, and the background pixel regions are set to zero, yielding an image sequence with the background images removed.
Specifically, step S102 may be implemented as follows: detecting moving objects in the first image sequence of the video data using a background subtraction method; reserving the pixels of the areas where the moving targets are located, and segmenting them into independent moving target units using a morphological method, to obtain a second image sequence with the background images removed. Fig. 2 is a flowchart of the background removal method. Its core is a pixel-stability computation: while the algorithm runs, it records, for each pixel, the gray value that has remained stable longest from the start of operation up to the current moment. When a new frame arrives, the stability of each pixel is judged through a series of threshold comparison operations, using the stability of adjacent frames and the pixel history as the basis for deciding whether the pixel is a background point; background pixels are thereby removed and foreground pixels retained.
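The preprocessing step above can be sketched with a simple running-average background subtractor. This is a minimal numpy illustration, not the patent's exact pixel-stability algorithm: the function name, the update rate `alpha`, and the difference threshold `thresh` are illustrative assumptions.

```python
import numpy as np

def remove_background(frames, alpha=0.05, thresh=25):
    """Crude background-subtraction sketch: keep pixels that differ
    from a running background estimate, set the rest to zero."""
    background = frames[0].astype(np.float64)
    out = []
    for frame in frames:
        diff = np.abs(frame.astype(np.float64) - background)
        fg_mask = diff > thresh                    # foreground where change is large
        result = np.where(fg_mask, frame, 0)       # background pixels set to zero
        background = (1 - alpha) * background + alpha * frame  # slow background update
        out.append(result.astype(frame.dtype))
    return out
```

A real implementation would follow this with morphological opening/closing to merge the retained pixels into independent moving-target units, as the text describes.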
And then inputting the image sequence without the background image into a trained detection model for target detection to obtain a target detection result. The detection model is also trained by using a sample with a background removed during training.
The present application is described below in a specific embodiment.
In this embodiment, the detection model adopts an SSD framework, and the SSD framework includes: a feature extraction network and a target detection network. The SSD framework is trained as follows:
s201, preprocessing an image sequence of the sample video data to obtain a sample image sequence with a background image removed.
S202, carrying out artificial target labeling on the sample image sequence to obtain a training data set.
S203, training the SSD framework based on the training dataset: first initializing the parameters to be trained and the hyper-parameters of the network; feeding the training data into the initialized network and propagating it forward to obtain actual outputs; adjusting the network parameters using the loss function together with the back-propagation (BP) algorithm; and iterating until the loss value of the loss function falls below a set threshold or the maximum number of iterations is reached, yielding a trained SSD framework.
Specifically, a training dataset is first prepared: a conventional image processing algorithm detects moving targets by background subtraction and retains a pixel mask of the regions where the moving targets are located; a morphological method is used to preserve as many of those pixels as possible, yielding a background-removed image sequence; the dataset is then manually labeled with a labeling tool to obtain the training dataset.
Designing the detection model: the design is based on the existing SSD target detection network structure; the main modification is to the loss function, removing the background loss term from the class loss. In this embodiment, the loss function is a weighted sum of the position error and the confidence error: Softmax Loss is used for the confidence error, and Smooth L1 Loss for the position error.
The loss function is as follows:

L(x, c, l, g) = \frac{1}{N}\left(L_{conf}(x, c) + \alpha L_{loc}(x, l, g)\right)

wherein the first term L_{conf} is the confidence error and the second term L_{loc} is the position error; N is the number of matched default boxes; \alpha is a balance factor (weight coefficient), set to 1 by cross-validation; c is the category confidence prediction value; l is the predicted position of the bounding box corresponding to the prior box; and g is the position parameter of the real target.
Wherein,

L_{conf}(x, c) = -\sum_{i \in Pos}^{N} x_{ij}^{p} \log\left(\hat{c}_{i}^{p}\right), \qquad \hat{c}_{i}^{p} = \frac{\exp(c_{i}^{p})}{\sum_{p} \exp(c_{i}^{p})}

wherein, owing to the uniformity of the background samples, feature learning of the background need not be considered, so the network can pay more attention to learning the foreground samples. x_{ij}^{p} \in \{0, 1\} indicates whether the prediction box i matches the real box j with respect to the category p; the higher the predicted probability \hat{c}_{i}^{p}, obtained by Softmax, the lower the loss. The standard SSD confidence loss carries an additional term, -\sum_{i \in Neg} \log(\hat{c}_{i}^{0}), which rewards a prediction box containing no target for a high background probability \hat{c}_{i}^{0}; it is this background term that is removed here.
Wherein,

L_{loc}(x, l, g) = \sum_{i \in Pos}^{N} \sum_{m \in \{cx, cy, w, h\}} x_{ij}^{k} \, \mathrm{smooth}_{L1}\left(l_{i}^{m} - \hat{g}_{j}^{m}\right)

\hat{g}_{j}^{cx} = \frac{g_{j}^{cx} - d_{i}^{cx}}{d_{i}^{w}}, \quad \hat{g}_{j}^{cy} = \frac{g_{j}^{cy} - d_{i}^{cy}}{d_{i}^{h}}, \quad \hat{g}_{j}^{w} = \log\frac{g_{j}^{w}}{d_{i}^{w}}, \quad \hat{g}_{j}^{h} = \log\frac{g_{j}^{h}}{d_{i}^{h}}

wherein \mathrm{smooth}_{L1} is the position regression function; x_{ij}^{k} indicates whether the ith prediction box and the jth real box match with respect to the class k; l_{i} and g_{j} respectively represent the prediction box and the real box; (g_{j}^{cx}, g_{j}^{cy}) represents the midpoint of the jth real box, (d_{i}^{cx}, d_{i}^{cy}) the midpoint of the ith default box, and d_{i}^{w}, d_{i}^{h} the width and height of the ith default box.
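The modified loss described above (Softmax confidence loss over matched foreground boxes only, plus a Smooth L1 position error weighted by alpha) can be sketched as follows. This is an illustrative numpy version under simplifying assumptions: matching, hard-negative mining, and batching are omitted, and all function names and shapes are hypothetical.

```python
import numpy as np

def smooth_l1(x):
    """Smooth L1: 0.5*x^2 if |x| < 1, else |x| - 0.5."""
    ax = np.abs(x)
    return np.where(ax < 1, 0.5 * x ** 2, ax - 0.5)

def ssd_loss(confs, labels, loc_pred, loc_true, alpha=1.0):
    """confs: (N, num_classes) raw scores for N matched foreground boxes;
    labels: (N,) true class indices; loc_pred/loc_true: encoded offsets.
    The background (negative-box) term is deliberately absent."""
    # numerically stable softmax over class scores per matched box
    e = np.exp(confs - confs.max(axis=1, keepdims=True))
    probs = e / e.sum(axis=1, keepdims=True)
    n = len(labels)                                   # N matched default boxes
    conf_loss = -np.log(probs[np.arange(n), labels]).sum()
    loc_loss = smooth_l1(loc_pred - loc_true).sum()
    return (conf_loss + alpha * loc_loss) / n
```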
Training the detection model: the labeled, background-removed image dataset is used as the training set, and the detection model is trained with the SSD network framework. Fig. 3 shows the overall structure of the SSD-based target detection system. Specifically, the input image is first resized to the input size required by the network (for example, 300 × 300), and the background-removed pictures obtained by preprocessing are used as the input data of the training model, as shown in part B of Fig. 3. Multi-layer image features are extracted by forward propagation through the backbone network and features from different layers are fused; comparing IoU (Intersection over Union) with the ground-truth data yields an error value, the model objective function is modified, and the loss value is computed so that the network learns more foreground information and ignores the background. The network parameters are then adjusted by error back-propagation using stochastic gradient descent, with the network learning rate lr set to 0.001 and the gradient momentum to 0.9, completing one iteration. Training ends when the loss value of the loss function falls below a set threshold or the maximum number of iterations is reached, yielding the trained SSD framework.
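The parameter update used in the back-propagation step above (stochastic gradient descent with momentum, lr = 0.001, momentum = 0.9) can be illustrated in isolation; the dictionaries of parameters and gradients below are placeholders for the real network tensors.

```python
def sgd_momentum_step(params, grads, velocity, lr=0.001, momentum=0.9):
    """One SGD-with-momentum update, using the hyper-parameters
    stated in the text; network details are omitted."""
    for k in params:
        velocity[k] = momentum * velocity[k] - lr * grads[k]  # accumulate velocity
        params[k] = params[k] + velocity[k]                   # move along velocity
    return params, velocity
```

Each training iteration would compute gradients of the modified loss by back-propagation and then apply this step until the stopping criterion is met.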
As shown in Fig. 3, a data preprocessing layer is added to the SSD network structure: this layer obtains the video data, removes the background with a background modeling algorithm, and passes the retained foreground images to the SSD framework for detection. Feature extraction proceeds as shown in part C of Fig. 3: feature maps of different scales are extracted through a series of convolution and pooling operations, and candidate target boxes are proposed on each of them; for example, on an 8 × 8 feature map, nine types of candidate boxes (3 aspect ratios × 3 areas) are generated at each feature point position. When the SSD framework runs inference on an image, it generates a series of fixed-size candidate boxes together with the likelihood that each box contains an object instance. A single forward pass produces a large number of target boxes, most of which must be filtered out by Non-Maximum Suppression (NMS): boxes whose confidence is below a threshold ct (e.g. 0.01) are discarded, boxes whose IoU with a higher-scoring box exceeds lt (e.g. 0.45) are suppressed, and only the first N predictions are retained. The fused features are matched against the ground-truth features to constrain the loss function, making it focus more on foreground target features and thereby realizing target detection.
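The NMS filtering described above can be sketched as a greedy numpy implementation using the confidence threshold ct and IoU threshold lt from the text; the function names and the `top_n` cap are illustrative.

```python
import numpy as np

def iou(a, b):
    """IoU of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

def nms(boxes, scores, ct=0.01, lt=0.45, top_n=200):
    """Drop boxes below confidence ct, then greedily keep the
    highest-scoring box and suppress overlaps with IoU > lt."""
    order = [i for i in np.argsort(scores)[::-1] if scores[i] >= ct]
    keep = []
    while order and len(keep) < top_n:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= lt]
    return keep
```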
The target detection method provided in this embodiment preprocesses the first image sequence of the video data to obtain a second image sequence with the background images removed, and inputs the second image sequence into a trained detection model for target detection to obtain a detection result. On the one hand, the background-removed images retain only foreground targets, free of interference from background content, so the detection model focuses more on foreground targets during learning and inference, which improves detection accuracy. On the other hand, because the background pixels are removed, the detection model sees only foreground pixels, which are unaffected by the scene of the video or picture sequence, so the scene migration performance of target detection is improved.
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
Fig. 4 is a schematic structural diagram of an object detection apparatus according to a second embodiment of the present application, and as shown in fig. 4, the apparatus may include:
an obtaining module 410, configured to obtain video data;
a preprocessing module 420, configured to preprocess the first image sequence of the video data to obtain a second image sequence with background images removed;
and the target detection module 430 is configured to input the second image sequence into a trained detection model for target detection, so as to obtain a target detection result.
The target detection apparatus provided in this embodiment preprocesses the first image sequence of the video data to obtain a second image sequence with the background images removed, and inputs the second image sequence into a trained detection model for target detection to obtain a detection result. On the one hand, the background-removed images retain only foreground targets, free of interference from background content, so the detection model focuses more on foreground targets during learning and inference, which improves detection accuracy. On the other hand, because the background pixels are removed, the detection model sees only foreground pixels, which are unaffected by the scene of the video or picture sequence, so the scene migration performance of target detection is improved.
In a possible implementation manner, in the apparatus provided in this embodiment of the present application, the preprocessing module 420 is specifically configured to:
detecting a moving object of a first image sequence of the video data by using a background subtraction method;
and reserving the pixels of the area where the moving target is located, and segmenting them into independent moving target units by using a morphological method, to obtain a second image sequence with the background image removed.
In a possible implementation manner, in the foregoing apparatus provided in this embodiment of the present application, the detection model employs an SSD framework, where the SSD framework includes: a feature extraction network and a target detection network.
In a possible implementation manner, in the apparatus provided in this embodiment of the present application, a training module is further included, configured to:
preprocessing an image sequence of sample video data to obtain a sample image sequence with a background image removed;
carrying out artificial target labeling on the sample image sequence to obtain a training data set;
training the SSD framework based on the training dataset: first initializing the parameters to be trained and the hyper-parameters of the network; feeding the training data into the initialized network and propagating it forward to obtain actual outputs; adjusting the network parameters using the loss function together with the back-propagation (BP) algorithm; and iterating until the loss value of the loss function falls below a set threshold or the maximum number of iterations is reached, yielding a trained SSD framework.
In a possible implementation manner, in the foregoing apparatus provided in this embodiment of the present application, the loss function is a weighted sum of the position error and the confidence error.
In a possible implementation manner, in the foregoing apparatus provided in this embodiment of the present application, the confidence error is calculated as follows:
L_{conf}(x, c) = -\sum_{i \in Pos}^{N} x_{ij}^{p} \log\left(\hat{c}_{i}^{p}\right), \qquad \hat{c}_{i}^{p} = \frac{\exp(c_{i}^{p})}{\sum_{p} \exp(c_{i}^{p})}

wherein x_{ij}^{p} \in \{0, 1\} indicates whether the prediction box i matches the real box j with respect to the category p.
Fig. 5 is a schematic structural diagram of an electronic device according to a third embodiment of the present application, and as shown in fig. 5, the electronic device includes: a memory 501 and a processor 502;
a memory 501 for storing a computer program;
wherein the processor 502 executes the computer program in the memory 501 to implement the methods provided by the method embodiments as described above.
In the embodiment, the object detection device provided by the application is exemplified by an electronic device. The processor may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device to perform desired functions.
The memory may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. Volatile memory can include, for example, Random Access Memory (RAM), cache memory (or the like). The non-volatile memory may include, for example, Read Only Memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on a computer-readable storage medium and executed by a processor to implement the methods of the various embodiments of the present application above and/or other desired functions. Various contents such as an input signal, a signal component, a noise component, etc. may also be stored in the computer-readable storage medium.
An embodiment of the present application provides a computer-readable storage medium, in which a computer program is stored, and the computer program is used for implementing the methods provided by the method embodiments described above when being executed by a processor.
In practice, the computer program in this embodiment may be written in any combination of one or more programming languages, including object-oriented languages such as Java and C++, and conventional procedural languages such as the "C" language or similar languages, to carry out the operations of the embodiments of the present application. The program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
In practice, the computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash memory), an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing descriptions of specific exemplary embodiments of the present application have been presented for purposes of illustration and description. It is not intended to limit the application to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiments were chosen and described in order to explain certain principles of the present application and its practical application to enable one skilled in the art to make and use various exemplary embodiments of the present application and various alternatives and modifications thereof. It is intended that the scope of the application be defined by the claims and their equivalents.

Claims (10)

1. A method of object detection, comprising:
acquiring video data;
preprocessing a first image sequence of the video data to obtain a second image sequence with background images removed;
and inputting the second image sequence into a trained detection model for target detection to obtain a target detection result.
2. The method of claim 1, wherein pre-processing the first image sequence of the video data to obtain a second image sequence with background images removed comprises:
detecting a moving object of a first image sequence of the video data by using a background subtraction method;
and reserving pixels of the area where the moving target is located, and segmenting the pixels of the area where the moving target is located into independent moving target units by using a morphological method to obtain a second image sequence with the background image removed.
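As a non-authoritative illustration of the preprocessing in claim 2 (not the claimed implementation itself), the background-subtraction and morphology steps might be sketched as follows; the per-pixel median background, the difference threshold, and the 3x3 structuring element are illustrative assumptions:

```python
import statistics

def median_background(frames):
    """Per-pixel median over the frame stack: a simple static-background estimate."""
    h, w = len(frames[0]), len(frames[0][0])
    return [[statistics.median(f[y][x] for f in frames) for x in range(w)]
            for y in range(h)]

def opening(mask):
    """3x3 binary erosion followed by 3x3 dilation (removes speckle noise)."""
    h, w = len(mask), len(mask[0])
    def neighborhood(m, y, x):
        return [m[yy][xx] for yy in range(y - 1, y + 2)
                for xx in range(x - 1, x + 2) if 0 <= yy < h and 0 <= xx < w]
    # erosion treats out-of-bounds pixels as background
    eroded = [[0 < y < h - 1 and 0 < x < w - 1 and all(neighborhood(mask, y, x))
               for x in range(w)] for y in range(h)]
    return [[any(neighborhood(eroded, y, x)) for x in range(w)] for y in range(h)]

def remove_background(frames, threshold=25):
    """Keep only pixels of moving regions; zero out the static background."""
    bg = median_background(frames)
    out = []
    for f in frames:
        h, w = len(f), len(f[0])
        moving = [[abs(f[y][x] - bg[y][x]) > threshold for x in range(w)]
                  for y in range(h)]
        moving = opening(moving)  # segment into clean moving-target regions
        out.append([[f[y][x] if moving[y][x] else 0 for x in range(w)]
                    for y in range(h)])
    return out
```

The morphological opening here stands in for the claimed "morphological method": erosion deletes isolated noise pixels and dilation restores the surviving moving-target regions to their original extent.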
3. The method of claim 1, wherein the detection model employs an SSD framework that includes a feature extraction network and a target detection network.
4. The method of claim 3, further comprising training an SSD framework comprising:
preprocessing an image sequence of sample video data to obtain a sample image sequence with a background image removed;
carrying out artificial target labeling on the sample image sequence to obtain a training data set;
training an SSD framework based on the training data set: firstly, initializing parameters to be trained and hyper-parameters in the network; inputting training data into the initialized network for network forward propagation to obtain an actual output result; adjusting the network parameters by combining a loss function with a Back Propagation (BP) algorithm; and performing iterative training, wherein the training ends when the loss value of the loss function is smaller than a set threshold or the maximum number of iterations is reached, to obtain a trained SSD framework.
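The iterative training procedure of claim 4 (initialize, forward propagation, loss evaluation, back-propagation update, stop on loss threshold or iteration cap) can be sketched generically. The quadratic toy loss, learning rate, and threshold below are illustrative assumptions standing in for the SSD network and its real loss function:

```python
def train(forward_loss, grad, params, lr=0.1, loss_threshold=1e-3, max_iters=1000):
    """Generic iterative training loop: stop when the loss drops below a
    set threshold or the maximum number of iterations is reached."""
    params = list(params)                      # initialized parameters to be trained
    loss = forward_loss(params)
    for it in range(1, max_iters + 1):
        loss = forward_loss(params)            # network forward propagation
        if loss < loss_threshold:              # stopping criterion from the claim
            break
        g = grad(params)                       # back-propagation stand-in
        params = [p - lr * gi for p, gi in zip(params, g)]
    return params, loss, it

# Toy stand-in for the network/loss: L(w) = sum((w_i - target_i)^2).
target = [1.0, -2.0]
loss_fn = lambda w: sum((wi - ti) ** 2 for wi, ti in zip(w, target))
grad_fn = lambda w: [2.0 * (wi - ti) for wi, ti in zip(w, target)]
```

With this contraction, the loop converges well within the iteration cap, so the loss-threshold branch of the stopping criterion is the one that fires.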
5. The method of claim 4, wherein the loss function is a weighted sum of the position error and the confidence error.
6. The method of claim 5, wherein the confidence error is calculated as follows:
L_conf(x, c) = -Σ_{i∈Pos} x_ij^p · log(ĉ_i^p) - Σ_{i∈Neg} log(ĉ_i^0), where ĉ_i^p = exp(c_i^p) / Σ_p exp(c_i^p),
wherein x_ij^p ∈ {0, 1} indicates that the prediction box i matches the real box j with respect to the category p.
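In the standard SSD formulation this confidence error is a softmax cross-entropy over category confidences: matched (positive) boxes are penalized on their assigned category, unmatched (negative) boxes on the background category. A minimal sketch, with the box counts and scores below chosen purely for illustration:

```python
import math

def softmax(scores):
    m = max(scores)                                  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def confidence_error(pos_scores, pos_labels, neg_scores):
    """SSD-style confidence loss: cross-entropy on positive boxes for their
    matched category, plus cross-entropy on negative boxes for the
    background category (index 0)."""
    loss = 0.0
    for scores, label in zip(pos_scores, pos_labels):
        loss += -math.log(softmax(scores)[label])    # positive (matched) term
    for scores in neg_scores:
        loss += -math.log(softmax(scores)[0])        # negative (background) term
    return loss
```

A confident correct prediction drives its term toward zero, while a uniform score vector over K categories contributes log(K) per box.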
7. An object detection device, comprising:
the acquisition module is used for acquiring video data;
the preprocessing module is used for preprocessing the first image sequence of the video data to obtain a second image sequence with background images removed;
and the target detection module is used for inputting the second image sequence into a trained detection model for target detection to obtain a target detection result.
8. The apparatus according to claim 7, wherein the preprocessing module is specifically configured to:
detecting a moving object of a first image sequence of the video data by using a background subtraction method;
reserving a pixel mask of an area where the moving target is located, and extracting pixels of the area where the moving target is located by using a morphological method;
a second sequence of images with background images removed is acquired.
9. An electronic device, comprising: a memory and a processor;
the memory for storing a computer program;
wherein the processor executes the computer program in the memory to implement the method of any one of claims 1-6.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, is adapted to carry out the method according to any one of claims 1-6.
CN201910637703.8A 2019-07-15 2019-07-15 Object detection method, device, equipment and storage medium Pending CN110490073A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910637703.8A CN110490073A (en) 2019-07-15 2019-07-15 Object detection method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910637703.8A CN110490073A (en) 2019-07-15 2019-07-15 Object detection method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN110490073A 2019-11-22

Family

ID=68547088

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910637703.8A Pending CN110490073A (en) 2019-07-15 2019-07-15 Object detection method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110490073A (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111126331A (en) * 2019-12-30 2020-05-08 浙江中创天成科技有限公司 Real-time guideboard detection method combining object detection and object tracking
CN111260608A (en) * 2020-01-08 2020-06-09 来康科技有限责任公司 Tongue region detection method and system based on deep learning
CN111401424A (en) * 2020-03-10 2020-07-10 北京迈格威科技有限公司 Target detection method, device and electronic system
CN111401253A (en) * 2020-03-17 2020-07-10 吉林建筑大学 Target detection method based on deep learning
CN111488909A (en) * 2020-03-10 2020-08-04 浙江省北大信息技术高等研究院 Calibration label generation method and device, electronic equipment and medium
CN111882559A (en) * 2020-01-20 2020-11-03 深圳数字生命研究院 ECG signal acquisition method and device, storage medium and electronic device
CN112184708A (en) * 2020-11-04 2021-01-05 成都朴华科技有限公司 Sperm survival rate detection method and device
CN112434631A (en) * 2020-12-01 2021-03-02 天冕信息技术(深圳)有限公司 Target object identification method and device, electronic equipment and readable storage medium
CN112581445A (en) * 2020-12-15 2021-03-30 中国电力科学研究院有限公司 Detection method and device for bolt of power transmission line, storage medium and electronic equipment
CN112767431A (en) * 2021-01-12 2021-05-07 云南电网有限责任公司电力科学研究院 Power grid target detection method and device for power system
CN113706614A (en) * 2021-08-27 2021-11-26 重庆赛迪奇智人工智能科技有限公司 Small target detection method and device, storage medium and electronic equipment
WO2021254205A1 (en) * 2020-06-17 2021-12-23 苏宁易购集团股份有限公司 Target detection method and apparatus
CN113838110A (en) * 2021-09-08 2021-12-24 重庆紫光华山智安科技有限公司 Target detection result verification method and device, storage medium and electronic equipment
CN118015286A (en) * 2024-04-09 2024-05-10 杭州像素元科技有限公司 Method and device for detecting traffic state of toll station lane through background segmentation
CN118134818A (en) * 2024-05-07 2024-06-04 深圳市生强科技有限公司 Scanning and AI fluorescent image processing method based on fluorescent slide and application thereof

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107563372A (en) * 2017-07-20 2018-01-09 济南中维世纪科技有限公司 A kind of license plate locating method based on deep learning SSD frameworks
CN107944392A (en) * 2017-11-25 2018-04-20 周晓风 A kind of effective ways suitable for cell bayonet Dense crowd monitor video target mark
CN107967695A (en) * 2017-12-25 2018-04-27 北京航空航天大学 A kind of moving target detecting method based on depth light stream and morphological method
CN108171196A (en) * 2018-01-09 2018-06-15 北京智芯原动科技有限公司 A kind of method for detecting human face and device
CN108304787A (en) * 2018-01-17 2018-07-20 河南工业大学 Road target detection method based on convolutional neural networks
CN108564065A (en) * 2018-04-28 2018-09-21 广东电网有限责任公司 A kind of cable tunnel open fire recognition methods based on SSD

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107563372A (en) * 2017-07-20 2018-01-09 济南中维世纪科技有限公司 A kind of license plate locating method based on deep learning SSD frameworks
CN107944392A (en) * 2017-11-25 2018-04-20 周晓风 A kind of effective ways suitable for cell bayonet Dense crowd monitor video target mark
CN107967695A (en) * 2017-12-25 2018-04-27 北京航空航天大学 A kind of moving target detecting method based on depth light stream and morphological method
CN108171196A (en) * 2018-01-09 2018-06-15 北京智芯原动科技有限公司 A kind of method for detecting human face and device
CN108304787A (en) * 2018-01-17 2018-07-20 河南工业大学 Road target detection method based on convolutional neural networks
CN108564065A (en) * 2018-04-28 2018-09-21 广东电网有限责任公司 A kind of cable tunnel open fire recognition methods based on SSD

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG GUIHUAI: "Image recognition method based on deep learning for ships ahead of an unmanned surface vessel", Ship Engineering *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111126331A (en) * 2019-12-30 2020-05-08 浙江中创天成科技有限公司 Real-time guideboard detection method combining object detection and object tracking
CN111260608A (en) * 2020-01-08 2020-06-09 来康科技有限责任公司 Tongue region detection method and system based on deep learning
CN111882559A (en) * 2020-01-20 2020-11-03 深圳数字生命研究院 ECG signal acquisition method and device, storage medium and electronic device
CN111882559B (en) * 2020-01-20 2023-10-17 深圳数字生命研究院 ECG signal acquisition method and device, storage medium and electronic device
CN111401424B (en) * 2020-03-10 2024-01-26 北京迈格威科技有限公司 Target detection method, device and electronic system
CN111488909A (en) * 2020-03-10 2020-08-04 浙江省北大信息技术高等研究院 Calibration label generation method and device, electronic equipment and medium
CN111401424A (en) * 2020-03-10 2020-07-10 北京迈格威科技有限公司 Target detection method, device and electronic system
CN111401253A (en) * 2020-03-17 2020-07-10 吉林建筑大学 Target detection method based on deep learning
WO2021254205A1 (en) * 2020-06-17 2021-12-23 苏宁易购集团股份有限公司 Target detection method and apparatus
CN112184708A (en) * 2020-11-04 2021-01-05 成都朴华科技有限公司 Sperm survival rate detection method and device
CN112184708B (en) * 2020-11-04 2024-05-31 成都朴华科技有限公司 Sperm survival rate detection method and device
CN112434631A (en) * 2020-12-01 2021-03-02 天冕信息技术(深圳)有限公司 Target object identification method and device, electronic equipment and readable storage medium
CN112581445A (en) * 2020-12-15 2021-03-30 中国电力科学研究院有限公司 Detection method and device for bolt of power transmission line, storage medium and electronic equipment
CN112767431B (en) * 2021-01-12 2024-04-23 云南电网有限责任公司电力科学研究院 Power grid target detection method and device for power system
CN112767431A (en) * 2021-01-12 2021-05-07 云南电网有限责任公司电力科学研究院 Power grid target detection method and device for power system
CN113706614A (en) * 2021-08-27 2021-11-26 重庆赛迪奇智人工智能科技有限公司 Small target detection method and device, storage medium and electronic equipment
CN113838110B (en) * 2021-09-08 2023-09-05 重庆紫光华山智安科技有限公司 Verification method and device for target detection result, storage medium and electronic equipment
CN113838110A (en) * 2021-09-08 2021-12-24 重庆紫光华山智安科技有限公司 Target detection result verification method and device, storage medium and electronic equipment
CN118015286A (en) * 2024-04-09 2024-05-10 杭州像素元科技有限公司 Method and device for detecting traffic state of toll station lane through background segmentation
CN118015286B (en) * 2024-04-09 2024-06-11 杭州像素元科技有限公司 Method and device for detecting traffic state of toll station lane through background segmentation
CN118134818A (en) * 2024-05-07 2024-06-04 深圳市生强科技有限公司 Scanning and AI fluorescent image processing method based on fluorescent slide and application thereof

Similar Documents

Publication Publication Date Title
CN110490073A (en) Object detection method, device, equipment and storage medium
CN111178183B (en) Face detection method and related device
CN109086811B (en) Multi-label image classification method and device and electronic equipment
Kumar et al. Recent trends in multicue based visual tracking: A review
CN112488073A (en) Target detection method, system, device and storage medium
US8355576B2 (en) Method and system for crowd segmentation
CN109492576B (en) Image recognition method and device and electronic equipment
CN109376631A (en) A kind of winding detection method and device neural network based
CN111368634B (en) Human head detection method, system and storage medium based on neural network
CN113281780B (en) Method and device for marking image data and electronic equipment
AU2020272936B2 (en) Methods and systems for crack detection using a fully convolutional network
CN112541403B (en) Indoor personnel falling detection method by utilizing infrared camera
CN113516113A (en) Image content identification method, device, equipment and storage medium
CN114937086A (en) Training method and detection method for multi-image target detection and related products
CN114549557A (en) Portrait segmentation network training method, device, equipment and medium
CN114332799A (en) Target detection method and device, electronic equipment and storage medium
CN108229281B (en) Neural network generation method, face detection device and electronic equipment
CN116402852A (en) Dynamic high-speed target tracking method and device based on event camera
CN115690554A (en) Target identification method, system, electronic device and storage medium
CN114387496A (en) Target detection method and electronic equipment
CN111178200A (en) Identification method of instrument panel indicator lamp and computing equipment
CN113112479A (en) Progressive target detection method and device based on key block extraction
CN113569600A (en) Method and device for identifying weight of object, electronic equipment and storage medium
CN113869163B (en) Target tracking method and device, electronic equipment and storage medium
CN113222989B (en) Image grading method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 101, building 1, block C, Qianjiang Century Park, ningwei street, Xiaoshan District, Hangzhou City, Zhejiang Province

Applicant after: Hangzhou Weiming Information Technology Co.,Ltd.

Applicant after: Institute of Information Technology, Zhejiang Peking University

Address before: Room 288-1, 857 Xinbei Road, Ningwei Town, Xiaoshan District, Hangzhou City, Zhejiang Province

Applicant before: Institute of Information Technology, Zhejiang Peking University

Applicant before: Hangzhou Weiming Information Technology Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20191122
