CN111008576A - Pedestrian detection and model training and updating method, device and readable storage medium thereof - Google Patents

Publication number
CN111008576A
Authority
CN
China
Prior art keywords
training
pedestrian
pedestrian detection
model
detection model
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911163826.9A
Other languages
Chinese (zh)
Other versions
CN111008576B (en)
Inventor
肖刚
周捷
王逸飞
王正来
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Gaochuang Anbang Beijing Technology Co Ltd
Original Assignee
Gaochuang Anbang Beijing Technology Co Ltd
Application filed by Gaochuang Anbang Beijing Technology Co Ltd
Priority to CN201911163826.9A
Publication of CN111008576A
Application granted
Publication of CN111008576B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103: Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Abstract

The invention discloses a pedestrian detection method, methods for training and updating its model, a device, and a readable storage medium. The method for training a pedestrian detection model comprises: acquiring training video data containing pedestrians; converting the training video data into an image sequence training sample set; marking the pedestrian regions in the image sequence training sample set according to whether each pedestrian is occluded (blocked), to obtain positive and negative sample sets; deriving a first training set from the positive and negative sample sets; and iteratively training a deep convolutional neural network model based on a cascade network and feature fusion on the first training set to obtain the pedestrian detection model. Because occluded pedestrians in the image sequence are given a separate label that distinguishes them from unoccluded pedestrians, the detection accuracy for occluded pedestrians is improved.

Description

Pedestrian detection and model training and updating method, device and readable storage medium thereof
Technical Field
The invention relates to the technical field of pedestrian detection, in particular to a pedestrian detection method, a model training method, a model updating method, equipment and a readable storage medium.
Background
The pedestrian detection technology is a technology for automatically searching the position and size of a pedestrian in any input image, is a key problem in the field of target detection, and has wide application in the fields of automatic driving, video monitoring, biological feature recognition, behavior analysis and the like.
In complex real-world environments, pedestrians vary in clothing and are easily confused with the background, and the torso is often partially occluded. Together with interference from the viewing angle of the surveillance camera, illumination, and similar factors, this makes occlusion one of the biggest challenges in conventional pedestrian detection. Efficient and accurate pedestrian detection, especially in crowded scenes, is therefore both a research hotspot and a difficulty.
Traditional pedestrian detection methods usually rely on manually designed and extracted features. They generally achieve good detection results only in specific scenes, and the robustness of the algorithm is difficult to guarantee.
Disclosure of Invention
In view of this, embodiments of the present invention provide a pedestrian detection method, a method and an apparatus for training and updating a model thereof, and a readable storage medium, so as to solve the problem of poor accuracy of the existing pedestrian detection algorithm.
The technical scheme provided by the invention is as follows:
A first aspect of an embodiment of the present invention provides a method for training a pedestrian detection model, the method comprising: acquiring training video data containing pedestrians; converting the training video data into an image sequence training sample set; marking the pedestrian regions in the image sequence training sample set according to whether the pedestrians in the image sequence training sample set are occluded, to obtain positive and negative sample sets; deriving a first training set from the positive and negative sample sets; and iteratively training a deep convolutional neural network model based on a cascade network and feature fusion according to the first training set to obtain the pedestrian detection model.
According to the first aspect, in a first embodiment of the first aspect, the cascade network comprises an anchor point refining module and a target detection module. Iteratively training the deep convolutional neural network model based on the cascade network and feature fusion according to the first training set to obtain the pedestrian detection model comprises: inputting the first training set into the anchor point refining module to compute a first feature vector; inputting the first feature vector into the target detection module to compute a second feature vector; inputting the first feature vector and the second feature vector separately into a feature fusion module to compute a first loss function and a second loss function; summing the first loss function and the second loss function to obtain a loss function; and selecting the model with the lowest loss function value as the pedestrian detection model.
According to the first aspect, in a second embodiment of the first aspect, the method for training a pedestrian detection model further comprises: deriving a first verification set from the positive and negative sample sets according to a preset proportion; and verifying the pedestrian detection model according to the first verification set.
According to the second embodiment of the first aspect, in a third embodiment of the first aspect, the method for training a pedestrian detection model further comprises: deriving a first test set from the positive and negative sample sets according to a preset proportion; and testing the verified pedestrian detection model according to the first test set to obtain a test result.
A second aspect of an embodiment of the present invention provides a pedestrian detection method, comprising: acquiring to-be-detected video data containing pedestrians; converting the video data to be detected into an image sequence detection sample set; and inputting the image sequence detection sample set into the pedestrian detection model trained by the method for training a pedestrian detection model according to the first aspect or any embodiment of the first aspect, to obtain a detection result.
A third aspect of the embodiments of the present invention provides a method for updating a pedestrian detection model, the method comprising: acquiring detection results at preset time intervals, the detection results being obtained by the pedestrian detection method according to the second aspect; deriving a second training set from the detection results; obtaining an updated model from the second training set by using the method for training a pedestrian detection model according to the first aspect or any embodiment of the first aspect; comparing the accuracy of the updated model and the pedestrian detection model; when the accuracy of the pedestrian detection model is lower than that of the updated model, using the updated model as the pedestrian detection model to detect video data containing pedestrians; and when the accuracy of the pedestrian detection model is higher than that of the updated model, continuing to detect the video data containing pedestrians with the pedestrian detection model.
According to the third aspect, in a first embodiment of the third aspect, deriving the second training set from the detection results comprises: taking the results whose score is higher than a preset value as a positive sample set, and the results whose score is lower than a preset value as a negative sample set; and deriving the second training set from the positive sample set and the negative sample set.
According to the third aspect, in a second embodiment of the third aspect, comparing the accuracy of the updated model and the pedestrian detection model comprises: deriving a second test set from the positive sample set and the negative sample set; deriving a second test set from the to-be-detected video data containing pedestrians; and comparing the accuracy of the updated model and the pedestrian detection model on a first test set and the second test set, the first test set being obtained by the method for training a pedestrian detection model in the third embodiment of the first aspect.
A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium storing computer instructions for causing a computer to execute the method for training a pedestrian detection model according to the first aspect or any embodiment of the first aspect, or the pedestrian detection method according to the second aspect, or the method for updating a pedestrian detection model according to the third aspect or any embodiment of the third aspect.
A fifth aspect of an embodiment of the present invention provides a pedestrian detection apparatus, comprising a memory and a processor that are communicatively connected to each other. The memory stores computer instructions, and the processor executes the computer instructions to perform the method for training a pedestrian detection model according to the first aspect or any embodiment of the first aspect, or the pedestrian detection method according to the second aspect, or the method for updating a pedestrian detection model according to the third aspect or any embodiment of the third aspect.
The technical scheme provided by the invention has the following effects:
According to the pedestrian detection method, the model training and updating methods, the device, and the readable storage medium provided by the embodiments of the present invention, video data containing pedestrians is obtained and converted into an image sequence, and occluded pedestrians in the image sequence are given a separate label that distinguishes them from unoccluded pedestrians, which improves the detection accuracy for occluded pedestrians. In addition, pedestrian features are extracted through the cascade network, model performance is improved through iterative training, and feature fusion over the cascade network further improves the detection of small pedestrian targets.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flow diagram of a method of training a pedestrian detection model according to an embodiment of the invention;
FIG. 2 is a flow diagram of a method of training a pedestrian detection model, according to another embodiment of the invention;
FIG. 3 is a flow chart of a pedestrian detection method according to an embodiment of the invention;
FIG. 4 is a flow chart of a method of updating a pedestrian detection model according to an embodiment of the invention;
FIG. 5 is a flow diagram of a method of updating a pedestrian detection model according to another embodiment of the invention;
FIG. 6 is a flow diagram of a method of updating a pedestrian detection model according to another embodiment of the invention;
FIG. 7 is a schematic hardware structure diagram of a pedestrian detection apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The pedestrian detection technology is a technology for automatically searching the position and size of a pedestrian in an arbitrary input image, and is widely applied to the fields of computer vision, pattern recognition and the like, such as automatic driving, video monitoring, biometric recognition and the like.
In complex real-world environments, occlusion of pedestrians is one of the biggest challenges currently facing pedestrian detection, and efficient, accurate pedestrian detection, especially in crowded scenes, is a research hotspot and difficulty. The embodiments of the present invention use deep learning: training video data containing pedestrians is acquired and converted into an image sequence, and the image sequence is labeled according to whether each pedestrian is occluded, so that a pedestrian detection result is obtained.
Deep learning is a learning method that builds models with a deep structure. Typical deep learning algorithms include deep belief networks, convolutional neural networks, restricted Boltzmann machines, and recurrent neural networks. Deep learning is also referred to as deep neural networks (neural networks with more than 3 layers). It derives from multilayer neural networks and, in essence, provides a way of combining feature representation with learning. Deep learning is characterized by giving up interpretability in favor of learning effectiveness.
Referring to fig. 1, a method for training a pedestrian detection model according to an embodiment of the present invention is mainly described as follows:
Step S101: acquiring training video data containing pedestrians. Specifically, the video containing pedestrians may be surveillance video from cameras installed at intersections, or video data from other outdoor locations such as subway exits, supermarket exits, shopping mall exits, train station exits, and schools; the present invention is not limited in this respect.
Step S102: converting training video data into an image sequence training sample set; the sequence of images is a series of images that are acquired sequentially at different times and different orientations of the object, and the video is composed of a series of images called frames, which are acquired at fixed time intervals (called frame rate, usually expressed in frames/second), so that the scene in motion can be displayed. The present invention can convert the training video data into an image sequence training sample set by adopting the existing video conversion software.
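As a minimal sketch of the frame selection behind Step S102, the following Python computes which frame indices to keep when one frame is sampled at a fixed time interval. The patent only says existing video conversion software may be used; the function names, the sampling interval, and the file naming scheme below are illustrative assumptions, and decoding the video itself is left to whatever tool is chosen.

```python
from typing import List


def sample_frame_indices(total_frames: int, fps: float, every_seconds: float) -> List[int]:
    """Indices of the frames kept when sampling one frame every `every_seconds`."""
    if total_frames < 0 or fps <= 0 or every_seconds <= 0:
        raise ValueError("total_frames must be >= 0; fps and every_seconds must be > 0")
    # Number of source frames between two kept frames (never less than 1).
    step = max(1, round(fps * every_seconds))
    return list(range(0, total_frames, step))


def frame_filename(video_stem: str, index: int) -> str:
    """Hypothetical naming scheme: <video>_<zero-padded frame index>.jpg."""
    return f"{video_stem}_{index:06d}.jpg"


if __name__ == "__main__":
    # A 4-second clip at 25 frames/second, keeping one frame per second.
    for i in sample_frame_indices(total_frames=100, fps=25, every_seconds=1.0):
        print(frame_filename("cam01", i))
```

Keeping every frame (an interval of one frame period) is also possible; a longer interval simply reduces near-duplicate images in the sample set.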
Step S103: marking the pedestrian regions in the image sequence training sample set according to whether the pedestrians in the image sequence training sample set are occluded, to obtain positive and negative sample sets. The positive sample set is the set of samples containing pedestrians, and the negative sample set is the set of samples containing no pedestrians.
Specifically, every pedestrian region in the image sequence training sample set can be annotated with a rectangular box and given a Person or Blocked Person label according to whether the pedestrian is occluded: the Person label indicates that the pedestrian is not occluded, and the Blocked Person label indicates that the pedestrian is occluded. For each image in the image sequence training sample set, information such as the image name, the image size, the label of each pedestrian region, its completeness, its ease of identification, and its coordinates is stored in a corresponding annotation file in the standard VOC format. Completeness indicates whether the pedestrian in the annotated region appears in full in the image; a pedestrian that appears in full is easy to identify. The annotation file can have the same name as the corresponding image in the image sequence training sample set and can be in XML format, and the images and the annotation folder together form the positive and negative sample set Y. In the embodiment of the present invention, the pedestrian regions may be annotated with a custom Python tool or with another annotation tool; the present invention is not limited in this respect.
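The annotation file described above could look roughly like the output of the following sketch, which writes a VOC-style XML record with the standard library. Mapping "completeness" to the VOC `truncated` field and "ease of identification" to the VOC `difficult` field is an assumption, as are the function name `build_annotation`, the dictionary keys, and the fixed image depth of 3.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

LABELS = ("Person", "Blocked Person")  # the two labels named in the description


def build_annotation(filename: str, width: int, height: int, boxes: List[Dict]) -> str:
    """Return a VOC-style XML string for one image and its pedestrian boxes."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"  # assumed 3-channel colour image
    for box in boxes:
        if box["label"] not in LABELS:
            raise ValueError(f"unknown label: {box['label']!r}")
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = box["label"]
        # Assumption: completeness -> truncated, ease of identification -> difficult.
        ET.SubElement(obj, "truncated").text = str(int(box.get("truncated", False)))
        ET.SubElement(obj, "difficult").text = str(int(box.get("difficult", False)))
        bndbox = ET.SubElement(obj, "bndbox")
        for key in ("xmin", "ymin", "xmax", "ymax"):
            ET.SubElement(bndbox, key).text = str(box[key])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_annotation(
        "cam01_000025.jpg", 1920, 1080,
        [{"label": "Blocked Person", "xmin": 10, "ymin": 20, "xmax": 110, "ymax": 320,
          "truncated": True}],
    ))
```

Saving the returned string under the image's base name with an `.xml` extension keeps image and annotation names aligned, as the description suggests.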
Step S104: calculating according to the positive and negative sample sets to obtain a first training set; specifically, the positive and negative sample sets may be divided into a training set and a first test set according to a first preset ratio, and then the training set may be divided into the first training set and a first verification set according to a second preset ratio.
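The two-stage split in Step S104 can be sketched as follows. The patent does not give the two preset ratios, so the values used in the example (20% held out for testing, then 10% of the remainder for verification) are placeholders, and `split_samples` is a hypothetical helper name.

```python
import random
from typing import List, Sequence, Tuple


def split_samples(samples: Sequence, test_ratio: float, val_ratio: float,
                  seed: int = 0) -> Tuple[List, List, List]:
    """Split samples into (first training set, first verification set, first test set).

    Stage 1 holds out `test_ratio` of all samples as the test set.
    Stage 2 holds out `val_ratio` of the remainder as the verification set.
    """
    if not (0 <= test_ratio < 1 and 0 <= val_ratio < 1):
        raise ValueError("ratios must be in [0, 1)")
    items = list(samples)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    n_test = int(len(items) * test_ratio)
    test, remainder = items[:n_test], items[n_test:]
    n_val = int(len(remainder) * val_ratio)
    val, train = remainder[:n_val], remainder[n_val:]
    return train, val, test


if __name__ == "__main__":
    train, val, test = split_samples(range(100), test_ratio=0.2, val_ratio=0.1)
    print(len(train), len(val), len(test))
```

When frames come from the same video, splitting by video rather than by frame may avoid near-identical images landing in both the training and test sets; that refinement is not part of the description.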
Step S105: iteratively training the deep convolutional neural network model based on the cascade network and feature fusion according to the first training set to obtain a pedestrian detection model. Specifically, the deep convolutional neural network model can be a convolutional neural network that reaches a preset depth and comprises a cascade network and a feature fusion model. The cascade network serves as the feature extraction module of the convolutional network; it can comprise several convolutional layers and pooling layers for extracting pedestrian features, and iterative training improves the performance of the model. The feature fusion model can be a concat layer that fuses the pedestrian features extracted by the cascade network. The best model found during iterative training is selected as the pedestrian detection model.
Through steps S101 to S105, the method for training a pedestrian detection model according to the embodiment of the present invention obtains video data containing pedestrians, converts it into an image sequence, and labels occluded pedestrians in the image sequence separately from unoccluded pedestrians, which improves the detection accuracy for occluded pedestrians. Pedestrian features are extracted through the cascade network, model performance is improved through iterative training, and feature fusion over the cascade network further improves the detection of small pedestrian targets.
As an optional implementation manner of the embodiment of the present invention, the cascade network provided in the foregoing embodiment includes: an anchor point refining module and a target detection module; as shown in fig. 2, the step S105 performs iterative training on the deep convolutional neural network model based on the cascade network and the feature fusion according to the first training set to obtain the pedestrian detection model, which includes the following steps:
Step S201: inputting the first training set into the anchor point refining module to compute a first feature vector. Specifically, the images in the first training set are obtained and uniformly resized to 320 × 320, and the resized images are input into the anchor point refining module for computation, yielding the image sizes required by the target detection module.
Step S202: inputting the first feature vector into the target detection module to compute a second feature vector. Specifically, the image sizes in the first feature vector are obtained, and the feature vectors with image sizes of 40 × 40, 20 × 20, 10 × 10 and 5 × 5, produced by the convolution computations of different layers in the anchor point refining module, are selected and input into the target detection module to compute the second feature vector. These four sizes are selected in order to use the deep features obtained in the anchor point refining module. The target detection module can be connected to the anchor point refining module through a transmission link module, through which the images in the first feature vector are input into the target detection module. The target detection module further extracts pedestrian features on the basis of the anchor point refining module: it takes the anchors refined by the anchor point refining module as input and computes the second feature vector, which includes information such as position information, box coordinates and sizes, and Person labels, and can be used to further improve the classification and regression results.
Step S203: inputting the first feature vector and the second feature vector separately into the feature fusion module to compute a first loss function and a second loss function. Specifically, the image sizes of the first feature vector and the second feature vector are obtained; from both, the feature vectors with image sizes of 40 × 40, 20 × 20, 10 × 10 and 5 × 5, produced by the convolution computations of different layers in the target detection module, are selected and input separately into the feature fusion module for convolution to obtain the first loss function and the second loss function.
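The four feature-map sizes quoted above are consistent with a 320 × 320 input being downsampled by total strides of 8, 16, 32 and 64. The strides are an assumption (the patent states only the resulting sizes), and the short check below just makes the arithmetic explicit.

```python
from typing import List, Sequence


def feature_map_sizes(input_size: int = 320,
                      strides: Sequence[int] = (8, 16, 32, 64)) -> List[int]:
    """Side length of each square feature map for the given total strides."""
    sizes = []
    for stride in strides:
        if input_size % stride != 0:
            raise ValueError(f"input size {input_size} is not divisible by stride {stride}")
        sizes.append(input_size // stride)
    return sizes


def total_locations(sizes: Sequence[int]) -> int:
    """Number of spatial positions summed over all feature maps."""
    return sum(side * side for side in sizes)


if __name__ == "__main__":
    sizes = feature_map_sizes()
    print(sizes)                   # side lengths of the four maps
    print(total_locations(sizes))  # positions at which anchors could be placed
```

The larger maps (40 × 40) carry finer spatial detail and are the ones most relevant to small pedestrian targets, while the 5 × 5 map covers large ones.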
Step S204: summing the first loss function and the second loss function to obtain the loss function. Specifically, the first loss function can be expressed as
[Equation image: Figure BDA0002285244400000081]
where N_arm is the number of anchors; L_b is the classification loss, which measures whether the classification into categories such as image background, the Person label and the Blocked Person label is correct; L_r is the regression loss, which measures the offset between the detected box and the real box, the detected box being the pedestrian box coordinates obtained from the first feature vector and the real box being the pedestrian box coordinates actually shown in the image; p_i is the probability, predicted by the network, of being an object; x_i denotes the detected coordinates; g_i denotes the real coordinates; and i indexes the images in the first training set. The second loss function can be expressed as
[Equation image: Figure BDA0002285244400000082]
[Equation image: Figure BDA0002285244400000083]
where N_odm is the number of anchors obtained by the target detection module; L_m is the classification loss; L_r is the regression loss; c_i is the probability that the detection box belongs to each category; and t_i and g_i denote the detected coordinates and the real coordinates, respectively. After the first loss function and the second loss function are computed, they are summed to obtain the loss function, which can be expressed by formula (1):
[Equation image: Figure BDA0002285244400000091]
where
[Equation image: Figure BDA0002285244400000092]
indicates that, in the box regression, the loss is computed only for the positive samples in the positive and negative sample sets. Because the positive sample set contains pedestrians and the negative sample set contains everything else, and the final output needs only pedestrian boxes and not boxes of other categories, only the loss of the positive samples is computed.
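The equation images themselves are not reproduced in this text. The following LaTeX is only a plausible form consistent with the symbol definitions above, assuming the two-branch loss commonly used with anchor-refinement detectors; the ground-truth class label l_i^* (0 for background) and the names L_arm and L_odm are introduced here for illustration and do not appear in the source.

```latex
% Hedged reconstruction; l_i^* and the names L_arm, L_odm are assumptions.
\begin{aligned}
L_{\mathrm{arm}} &= \frac{1}{N_{\mathrm{arm}}}
  \Big( \sum_i L_b\big(p_i, [l_i^* \ge 1]\big)
      + \sum_i [l_i^* \ge 1]\, L_r(x_i, g_i) \Big), \\
L_{\mathrm{odm}} &= \frac{1}{N_{\mathrm{odm}}}
  \Big( \sum_i L_m(c_i, l_i^*)
      + \sum_i [l_i^* \ge 1]\, L_r(t_i, g_i) \Big), \\
L &= L_{\mathrm{arm}} + L_{\mathrm{odm}}. \qquad (1)
\end{aligned}
```

Under this reading, the indicator [l_i^* ≥ 1] equals 1 for positive samples and 0 otherwise, which matches the statement that the box regression loss is computed only for positive samples.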
Step S205: selecting the model with the lowest loss function value as the pedestrian detection model. Specifically, feature fusion combines the feature vectors of sizes 40 × 40, 20 × 20, 10 × 10 and 5 × 5 from the anchor point refining module and the target detection module in order to compute the loss function value, and the model with the lowest loss function value computed according to formula (1) is taken as the pedestrian detection model.
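Selecting the model with the lowest loss can be sketched as below, assuming each saved model is recorded together with its two branch losses and that the overall loss is their sum, as Step S204 describes. The checkpoint names and loss values are made up for illustration.

```python
from typing import Iterable, Tuple


def select_best_checkpoint(history: Iterable[Tuple[str, float, float]]) -> Tuple[str, float]:
    """Return (name, total loss) of the checkpoint with the lowest summed loss.

    Each history entry is (checkpoint name, first loss, second loss).
    """
    best = None
    for name, first_loss, second_loss in history:
        total = first_loss + second_loss  # the two losses are summed, as in Step S204
        if best is None or total < best[1]:
            best = (name, total)
    if best is None:
        raise ValueError("history is empty")
    return best


if __name__ == "__main__":
    # Hypothetical checkpoints and loss values.
    history = [
        ("iter_1000", 1.20, 1.10),
        ("iter_2000", 0.90, 0.95),
        ("iter_3000", 1.00, 0.90),
    ]
    print(select_best_checkpoint(history))
```

Ties are resolved in favour of the earliest checkpoint because the comparison is strict; the description does not say how ties should be handled.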
In the embodiment of the invention, the anchor point refining module and the target detection module are built, the first training set is input into the cascade network for calculation, judgment and classification are carried out step by step, regression of a coarse detection frame to a fine detection frame is realized, and the detection precision of the pedestrian detection model is further improved; meanwhile, the loss function is calculated by carrying out feature fusion on the cascade network, so that the detection effect of the pedestrian detection model on the small pedestrian target is further effectively improved.
As an optional implementation manner of the embodiment of the present invention, the method for training a pedestrian detection model provided in the embodiment of the present invention further includes: inputting a first verification set obtained by calculating in the positive and negative sample sets in the embodiment into the obtained pedestrian detection model for verification; in addition, the first test set obtained by calculating the positive and negative sample sets in the embodiment is input into the verified pedestrian detection model for testing, so that a test result is obtained. Whether the detection result of the pedestrian detection model obtained by the embodiment of the invention meets the preset standard can be judged through the verification and test process, for example, the detection result can be divided into different scores, when the detection result of the preset proportion sample set is higher than the preset score, the detection result meets the preset standard, and the pedestrian detection model can be used for pedestrian detection.
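The acceptance rule described above (a preset proportion of samples must score above a preset score) can be sketched as follows. The thresholds in the example are placeholders, since the patent gives no concrete values, and `meets_standard` is a hypothetical name.

```python
from typing import Iterable


def meets_standard(scores: Iterable[float], min_score: float, min_fraction: float) -> bool:
    """True if at least `min_fraction` of the samples score above `min_score`."""
    values = list(scores)
    if not values:
        raise ValueError("no scores to evaluate")
    passed = sum(1 for score in values if score > min_score)
    return passed / len(values) >= min_fraction


if __name__ == "__main__":
    # Hypothetical per-sample scores from the verification or test set.
    scores = [0.90, 0.80, 0.95, 0.40]
    print(meets_standard(scores, min_score=0.7, min_fraction=0.75))
```

The same check can be run first on the first verification set and then on the first test set, mirroring the verify-then-test order in the text.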
An embodiment of the present invention further provides a pedestrian detection method, as shown in fig. 3, the pedestrian detection method includes the following steps:
Step S301: acquiring to-be-detected video data containing pedestrians. Specifically, the video to be detected may be surveillance video from cameras installed at intersections, or video data from other outdoor locations such as subway exits, supermarket exits, shopping mall exits, train station exits, and schools; the present invention is not limited in this respect.
Step S302: converting video data to be detected into an image sequence detection sample set; the sequence of images is a series of images that are acquired sequentially at different times and different orientations of the object, and the video is composed of a series of images called frames, which are acquired at fixed time intervals (called frame rate, usually expressed in frames/second), so that the scene in motion can be displayed. The invention can adopt the existing video conversion software to convert the video data to be detected into the image sequence detection sample set.
Step S303: and inputting the image sequence detection sample set into the pedestrian detection model obtained by the method for training the pedestrian detection model in the embodiment to obtain a detection result. Specifically, the image sequence detection sample set may be input to a pedestrian detection model obtained by a method of training a pedestrian detection model as shown in fig. 1 to 2 for detection, so as to obtain a detection result.
Through steps S301 to S303, the pedestrian detection method provided in the embodiment of the present invention obtains video data containing pedestrians, converts the video data into an image sequence, and inputs the image sequence into the pedestrian detection model obtained by the method for training a pedestrian detection model in the above embodiment. Because that model is trained with a separate label for occluded (blocked) pedestrians, which distinguishes them from unoccluded pedestrians in the image, and deep learning is performed separately on the partial torso of the human body, the detection accuracy for occluded pedestrians is improved.
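Steps S301 to S303 can be sketched as a small pipeline. This is a minimal illustration, not the patented implementation: the names `video_to_image_sequence` and `detect_pedestrians` are hypothetical, decoded frames are represented by arbitrary Python objects, the `step` subsampling parameter is an assumption (the text only says existing video conversion software may be used), and `model` stands in for the trained pedestrian detection model as any callable.

```python
from typing import Any, Callable, Iterable, List


def video_to_image_sequence(frames: Iterable[Any], step: int = 1) -> List[Any]:
    """S302: turn decoded video frames into an image sequence sample set.

    Keeps every `step`-th frame. Subsampling is an assumption made for
    illustration; with step=1 every frame is kept.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    return [frame for index, frame in enumerate(frames) if index % step == 0]


def detect_pedestrians(
    frames: Iterable[Any],
    model: Callable[[Any], Any],
    step: int = 1,
) -> List[Any]:
    """S301 to S303: convert the video to images and run the model on each."""
    samples = video_to_image_sequence(frames, step)
    # `model` is a hypothetical stand-in for the trained detection model.
    return [model(image) for image in samples]
```

In practice the frames would come from a video decoder and the model would return boxes and scores; both are abstracted away here so that the control flow of the three steps stays visible.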
An embodiment of the present invention further provides a method for updating a pedestrian detection model. As shown in fig. 4, the method includes the following steps:
Step S401: acquiring detection results at preset time intervals, wherein the detection results are obtained by the pedestrian detection method in the above embodiment. Specifically, video data from different times is detected according to that pedestrian detection method, and the resulting detection results are collected once every preset interval.
Step S402: computing a second training set from the detection results. Specifically, data in the detection results may be taken as the second training set.
Step S403: obtaining an updated model from the second training set by using the method for training a pedestrian detection model. Specifically, training may be performed on the second training set according to the method for training a pedestrian detection model shown in figs. 1 to 2, so as to obtain an updated pedestrian detection model.
Step S404: judging the accuracy of the updated model and of the pedestrian detection model.
Step S405: when the accuracy of the pedestrian detection model is lower than that of the updated model, using the updated model as the pedestrian detection model to detect video data containing pedestrians.
Step S406: when the accuracy of the pedestrian detection model is higher than that of the updated model, continuing to detect video data containing pedestrians with the pedestrian detection model.
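The decision in steps S404 to S406 can be sketched as below. The function name `select_model` is hypothetical. The text does not say what happens when the two accuracies are equal; the sketch assumes the current pedestrian detection model is kept in that case.

```python
from typing import Any


def select_model(
    current_model: Any,
    updated_model: Any,
    current_accuracy: float,
    updated_accuracy: float,
) -> Any:
    """Pick the model to use for subsequent detection (S405/S406).

    The updated model replaces the current one only when it is strictly
    more accurate. Keeping the current model on a tie is an assumption;
    the source covers only the "lower" and "higher" cases.
    """
    if updated_accuracy > current_accuracy:
        return updated_model
    return current_model
```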
As an optional implementation of the embodiment of the present invention, as shown in fig. 5, step S402 of computing a second training set from the detection results includes the following steps:
Step S501: taking results in the detection results whose score is higher than a preset value as a positive sample set, and results whose score is lower than a preset value as a negative sample set. Specifically, the detection results are obtained according to the pedestrian detection method shown in fig. 3 and include the detection scores for the to-be-detected video data; the detection results are collected once every preset interval. Detection results with a score of 0 to 0.2 are used as the negative sample set, and detection results with a score above 0.9 are used as the positive sample set. Similarity detection is then performed on the images in the negative and positive sample sets and repeated samples are removed, which can improve the precision of subsequent detection based on this training set. These score ranges are only examples; detection results with other scores may also be used as the positive and negative sample sets, which is not limited by the present invention.
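Step S501 can be sketched as a score-based split followed by duplicate removal. This is an illustration under stated assumptions: the function name `split_by_score` is hypothetical, each detection result is represented as an `(image_bytes, score)` pair, and the similarity detection that the text leaves unspecified is replaced here by exact-duplicate removal using a SHA-256 digest of the image bytes. A real system would likely use a perceptual similarity measure instead.

```python
import hashlib
from typing import Iterable, List, Set, Tuple


def split_by_score(
    results: Iterable[Tuple[bytes, float]],
    negative_max: float = 0.2,
    positive_min: float = 0.9,
) -> Tuple[List[bytes], List[bytes]]:
    """Split detection results into positive and negative sample sets.

    Scores above `positive_min` become positives; scores from 0 up to
    `negative_max` become negatives; everything in between is discarded.
    Exact-duplicate removal via SHA-256 is a stand-in (an assumption) for
    the unspecified similarity detection.
    """
    positives: List[bytes] = []
    negatives: List[bytes] = []
    seen: Set[str] = set()
    for image_bytes, score in results:
        if score > positive_min:
            target = positives
        elif 0.0 <= score <= negative_max:
            target = negatives
        else:
            continue  # mid-range scores are used for neither set
        digest = hashlib.sha256(image_bytes).hexdigest()
        if digest in seen:
            continue  # drop repeated samples
        seen.add(digest)
        target.append(image_bytes)
    return positives, negatives
```

The default thresholds mirror the example values in the text (0 to 0.2 and above 0.9) and are parameters precisely because the text notes that other score ranges may be used.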
Step S502: computing the second training set from the positive sample set and the negative sample set. Specifically, the negative sample set and the positive sample set are randomly sampled at a ratio of 3:1, and the sampling is repeated 5 times to obtain five sample sets G1, G2, G3, G4, and G5. The five sample sets are each merged with the first training set obtained in step S104 of the above embodiment to obtain merged sample sets H1, H2, H3, H4, and H5. The second training set may thus include five sample sets, or a different number of sample sets depending on the number of sampling rounds, which is not limited by the present invention. When the pedestrian detection model is iteratively trained on the second training set, the five sample sets can each be input into the model to obtain five updated models, whose accuracy can later be compared with that of the pedestrian detection model.
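The repeated sampling and merging of step S502 can be sketched as follows. The function name `build_second_training_sets` and its parameters are hypothetical. Two points are assumptions, since the text does not settle them: the 3:1 ratio is read as negatives to positives (the order in which the two sets are named), and the number of samples drawn per round is taken as the largest size that both sets can supply at that ratio.

```python
import random
from typing import Any, List, Optional, Sequence


def build_second_training_sets(
    negatives: Sequence[Any],
    positives: Sequence[Any],
    first_training_set: Sequence[Any],
    rounds: int = 5,
    neg_per_pos: int = 3,
    seed: Optional[int] = None,
) -> List[List[Any]]:
    """Return merged sample sets H1..Hn, one per sampling round.

    Each round draws negatives and positives at `neg_per_pos`:1 without
    replacement (giving G_i) and appends the first training set (giving H_i).
    """
    rng = random.Random(seed)
    # Largest positive count that the negative set can match at the ratio.
    n_pos = min(len(positives), len(negatives) // neg_per_pos)
    n_neg = n_pos * neg_per_pos
    merged_sets: List[List[Any]] = []
    for _ in range(rounds):
        sampled = rng.sample(list(negatives), n_neg) + rng.sample(list(positives), n_pos)
        merged_sets.append(sampled + list(first_training_set))
    return merged_sets
```

Passing a fixed `seed` makes the five draws reproducible, which helps when the accuracies of the resulting updated models are compared later.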
As an optional implementation of the embodiment of the present invention, as shown in fig. 6, step S404 of judging the accuracy of the updated model and the pedestrian detection model includes:
Step S601: computing a second test set from the to-be-detected video data containing pedestrians. Specifically, part of the most recent to-be-detected video data is acquired and converted into an image sequence set, and this image sequence set is the second test set.
Step S602: judging the accuracy of the updated model and the pedestrian detection model according to the first test set and the second test set. Specifically, the first test set is obtained in step S104 of the above embodiment and the second test set is obtained in step S601. Both test sets are input into the five updated models obtained in step S403 and into the pedestrian detection model obtained by the method for training a pedestrian detection model in the above embodiment, and the accuracy of the updated models and of the pedestrian detection model is judged from the detection results.
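The comparison in step S602 can be sketched as below. This is a simplified illustration with several assumptions that the text does not confirm: each test sample is an `(image, label)` pair, meaning ground-truth labels are assumed to be available for the second test set as well; accuracy is the fraction of samples whose prediction equals the label; and the two test sets are pooled into one before scoring. The names `accuracy` and `best_model` are hypothetical.

```python
from typing import Any, Callable, Dict, Sequence, Tuple

Sample = Tuple[Any, int]  # (image, ground-truth label), an assumed layout


def accuracy(model: Callable[[Any], int], test_set: Sequence[Sample]) -> float:
    """Fraction of samples for which the model output equals the label."""
    if not test_set:
        return 0.0
    correct = sum(1 for image, label in test_set if model(image) == label)
    return correct / len(test_set)


def best_model(
    candidates: Dict[str, Callable[[Any], int]],
    first_test_set: Sequence[Sample],
    second_test_set: Sequence[Sample],
) -> Tuple[str, Dict[str, float]]:
    """Score every candidate on both test sets and return the best name.

    Pooling the two test sets is an assumption. On ties, `max` returns the
    first candidate in insertion order, so listing the current model first
    keeps it unless an updated model is strictly better.
    """
    combined = list(first_test_set) + list(second_test_set)
    scores = {name: accuracy(model, combined) for name, model in candidates.items()}
    winner = max(scores, key=scores.get)
    return winner, scores
```

The `candidates` dictionary would hold the current pedestrian detection model together with the five updated models from step S403.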
In the method for updating a pedestrian detection model provided by the embodiment of the present invention, the pedestrian detection model obtained by the training method of the above embodiment is updated using data extracted from the detection results, which improves the detection performance of the model. In addition, by judging the accuracy of the updated model, the more accurate model can be selected to detect the to-be-detected video data containing pedestrians, which reduces false alarms and missed detections and improves the detection accuracy of the pedestrian detection method.
An embodiment of the present invention further provides a pedestrian detection apparatus. As shown in fig. 7, the pedestrian detection apparatus may include a processor 51 and a memory 52, where the processor 51 and the memory 52 may be connected by a bus or in another manner; fig. 7 takes connection by a bus as an example.
The processor 51 may be a Central Processing Unit (CPU). The processor 51 may also be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or a combination thereof.
The memory 52 is a non-transitory computer-readable storage medium and can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the modules corresponding to the pedestrian detection method in the embodiments of the present invention. By running the non-transitory software programs, instructions, and modules stored in the memory 52, the processor 51 performs various functional applications and data processing, that is, implements the pedestrian detection method in the above method embodiment.
The memory 52 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created by the processor 51, and the like. Further, the memory 52 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 52 may optionally include memory located remotely from the processor 51, and these remote memories may be connected to the processor 51 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The one or more modules are stored in the memory 52 and, when executed by the processor 51, perform the pedestrian detection method in the embodiment shown in fig. 3.
The details of the pedestrian detection device can be understood by referring to the corresponding related description and effects in the embodiment shown in fig. 3, and are not described herein again.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and which, when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a flash memory, a Hard Disk Drive (HDD), a Solid State Drive (SSD), or the like; the storage medium may also comprise a combination of memories of the kinds described above.
Although the embodiments of the present invention have been described in conjunction with the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope defined by the appended claims.

Claims (10)

1. A method of training a pedestrian detection model, comprising:
acquiring training video data containing pedestrians;
converting the training video data into an image sequence training sample set;
marking pedestrian areas in the image sequence training sample set according to whether pedestrians in the image sequence training sample set are shielded to obtain positive and negative sample sets;
calculating to obtain a first training set according to the positive and negative sample sets;
and performing iterative training on the deep convolutional neural network model based on the cascade network and the feature fusion according to the first training set to obtain the pedestrian detection model.
2. The method of training a pedestrian detection model in accordance with claim 1, wherein the cascade network comprises: an anchor point refining module and a target detection module;
iteratively training a deep convolutional neural network model based on a cascade network and feature fusion according to the first training set to obtain the pedestrian detection model, comprising,
inputting the first training set into the anchor point refinement module to calculate to obtain a first feature vector;
inputting the first feature vector into the target detection module to calculate to obtain a second feature vector;
inputting the first feature vector and the second feature vector into the feature fusion module respectively for calculation to obtain a first loss function and a second loss function;
superposing the first loss function and the second loss function, and calculating to obtain a loss function;
and selecting the model with the lowest loss function value as the pedestrian detection model.
3. The method of training a pedestrian detection model in accordance with claim 1, further comprising:
calculating the positive and negative sample sets according to a preset proportion to obtain a first verification set;
and verifying the pedestrian detection model according to the first verification set.
4. The method of training a pedestrian detection model in accordance with claim 3, further comprising:
calculating the positive and negative sample sets according to a preset proportion to obtain a first test set;
and testing the verified pedestrian detection model according to the first test set to obtain a test result.
5. A pedestrian detection method, characterized by comprising:
acquiring to-be-detected video data containing pedestrians;
converting the video data to be detected into an image sequence detection sample set;
inputting the image sequence detection sample set into a pedestrian detection model generated by training according to the method for training a pedestrian detection model of any one of claims 1 to 4, and obtaining a detection result.
6. A method of updating a pedestrian detection model, comprising:
acquiring detection results at intervals of preset time, wherein the detection results are obtained according to the pedestrian detection method of claim 5;
calculating to obtain a second training set according to the detection result;
obtaining an updated model from the second training set using the method of any of claims 1-4;
judging the accuracy of the updated model and the pedestrian detection model;
when the accuracy of the pedestrian detection model is lower than that of the updated model, the updated model is used as a pedestrian detection model to detect video data containing pedestrians;
and when the accuracy of the pedestrian detection model is higher than that of the updated model, detecting the video data containing the pedestrian according to the pedestrian detection model.
7. The method of updating a pedestrian detection model according to claim 6, wherein calculating a second training set from the detection results comprises:
taking a result with a score higher than a preset value in the detection result as a positive sample set, and taking a result with a score lower than a preset value in the detection result as a negative sample set;
and calculating to obtain a second training set according to the positive sample set and the negative sample set.
8. The method of updating a pedestrian detection model according to claim 6, wherein determining the accuracy of the updated model and the pedestrian detection model comprises:
calculating to obtain a second test set according to the to-be-detected video data containing the pedestrians;
and judging the accuracy of the updated model and the pedestrian detection model according to a first test set and the second test set, wherein the first test set is obtained according to the method for training the pedestrian detection model in claim 4.
9. A computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-4, 5, or 6-8.
10. A pedestrian detection apparatus, characterized by comprising: a memory and a processor communicatively coupled to each other, the memory storing computer instructions, the processor executing the computer instructions to perform the method of any of claims 1-4, 5, or 6-8.
CN201911163826.9A 2019-11-22 2019-11-22 Pedestrian detection and model training method, device and readable storage medium Active CN111008576B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911163826.9A CN111008576B (en) 2019-11-22 2019-11-22 Pedestrian detection and model training method, device and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911163826.9A CN111008576B (en) 2019-11-22 2019-11-22 Pedestrian detection and model training method, device and readable storage medium

Publications (2)

Publication Number Publication Date
CN111008576A (en) 2020-04-14
CN111008576B (en) 2023-09-01

Family

ID=70111902

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911163826.9A Active CN111008576B (en) 2019-11-22 2019-11-22 Pedestrian detection and model training method, device and readable storage medium

Country Status (1)

Country Link
CN (1) CN111008576B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109886167A (en) * 2019-02-01 2019-06-14 中国科学院信息工程研究所 One kind blocking face identification method and device
CN109948573A (en) * 2019-03-27 2019-06-28 厦门大学 A kind of noise robustness face identification method based on cascade deep convolutional neural networks
WO2019128646A1 (en) * 2017-12-28 2019-07-04 深圳励飞科技有限公司 Face detection method, method and device for training parameters of convolutional neural network, and medium

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111507289A (en) * 2020-04-22 2020-08-07 上海眼控科技股份有限公司 Video matching method, computer device and storage medium
CN111523452A (en) * 2020-04-22 2020-08-11 北京百度网讯科技有限公司 Method and device for detecting human body position in image
CN111523452B (en) * 2020-04-22 2023-08-25 北京百度网讯科技有限公司 Method and device for detecting human body position in image
CN112132218A (en) * 2020-09-23 2020-12-25 平安国际智慧城市科技股份有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN112132218B (en) * 2020-09-23 2024-04-16 平安国际智慧城市科技股份有限公司 Image processing method, device, electronic equipment and storage medium
CN113158971A (en) * 2021-05-11 2021-07-23 北京易华录信息技术股份有限公司 Event detection model training method and event classification method and system
CN113158971B (en) * 2021-05-11 2024-03-08 北京易华录信息技术股份有限公司 Event detection model training method and event classification method and system
CN113688761A (en) * 2021-08-31 2021-11-23 安徽大学 Pedestrian behavior category detection method based on image sequence
CN113688761B (en) * 2021-08-31 2024-02-20 安徽大学 Pedestrian behavior category detection method based on image sequence

Also Published As

Publication number Publication date
CN111008576B (en) 2023-09-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant