CN110688978A - Pedestrian detection method, device, system and equipment - Google Patents


Publication number
CN110688978A
Authority
CN
China
Prior art keywords
pedestrian detection
pyramid
residual error
depth residual
network model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910959000.7A
Other languages
Chinese (zh)
Inventor
罗径庭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN201910959000.7A
Publication of CN110688978A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The application discloses a pedestrian detection method, device, system and equipment. The method comprises: establishing a pyramid-depth residual error network model; and inputting a pedestrian image to be detected into the pyramid-depth residual error network model and outputting a pedestrian detection result. The pyramid-depth residual error network model is a multi-scale pedestrian detection network model obtained by adding a convolution layer on the basis of a depth residual error network, constructing a pyramid network by up-sampling the convolution layer, and fusing the output of a residual error block of the depth residual error network with the output of the pyramid network. This solves the technical problem that existing pedestrian detection methods have low detection accuracy when the scale of the human body changes greatly.

Description

Pedestrian detection method, device, system and equipment
Technical Field
The present application relates to the field of computer vision technologies, and in particular, to a pedestrian detection method, apparatus, system, and device.
Background
Pedestrian detection is an important part of computer vision and is widely used in fields such as automatic driving and video surveillance. With the release of many large pedestrian detection data sets, pedestrian detection algorithms have improved significantly. In practical applications, however, pedestrians often have to be detected in highly crowded and cluttered scenes, where the scale and form of the human body change in complex ways, which makes detection harder than in a simple scene. Existing pedestrian detection methods have low detection accuracy when the scale of the human body changes greatly.
Disclosure of Invention
The application provides a pedestrian detection method, device, system and equipment, which are used for solving the technical problem that existing pedestrian detection methods have low detection accuracy when the scale of the human body changes greatly.
In view of the above, a first aspect of the present application provides a pedestrian detection method, including:
establishing a pyramid-depth residual error network model;
inputting a pedestrian image to be detected into the pyramid-depth residual error network model, and outputting a pedestrian detection result;
the pyramid-depth residual error network model is a multi-scale pedestrian detection network model obtained by adding a convolution layer on the basis of a depth residual error network, constructing a pyramid network by up-sampling the convolution layer, and fusing the output of a residual error block of the depth residual error network with the output of the pyramid network.
Optionally, the establishing a pyramid-depth residual error network model includes:
acquiring a pedestrian detection image to be trained;
inputting the pedestrian detection image to be trained into a pyramid-depth residual error network model, and training the pyramid-depth residual error network model;
de-duplicating the detection frame extracted by the pyramid-depth residual error network model;
and when the iteration number of the training reaches a threshold value, finishing the training to obtain a trained pyramid-depth residual error network model.
Optionally, the de-duplicating the detection frame extracted by the pyramid-depth residual error network model includes:
and de-duplicating the detection frame extracted by the pyramid-depth residual error network model based on non-maximum suppression.
Optionally, after obtaining the to-be-trained pedestrian detection image, inputting the to-be-trained pedestrian detection image into the pyramid-depth residual error network model, and before training the pyramid-depth residual error network model, the method further includes:
and preprocessing the pedestrian detection image to be trained.
Optionally, the pyramid network includes 6 convolutional layers for extracting feature maps with different resolutions.
A second aspect of the present application provides a pedestrian detection apparatus, comprising: a model building module and a pedestrian detection module;
the model establishing module is used for establishing a pyramid-depth residual error network model;
the pyramid-depth residual error network model is a multi-scale pedestrian detection network model obtained by adding a convolution layer on the basis of a depth residual error network, constructing a pyramid network by up-sampling the convolution layer, and fusing the output of a residual error block of the depth residual error network with the output of the pyramid network;
the pedestrian detection module is used for inputting the image of the pedestrian to be detected into the pyramid-depth residual error network model and outputting a pedestrian detection result.
Optionally, the model building module is specifically configured to:
acquiring a pedestrian detection image to be trained;
inputting the pedestrian detection image to be trained into a pyramid-depth residual error network model, and training the pyramid-depth residual error network model;
de-duplicating the detection frame extracted by the pyramid-depth residual error network model;
and when the iteration number of the training reaches a threshold value, finishing the training to obtain a trained pyramid-depth residual error network model.
A third aspect of the present application provides a pedestrian detection system, comprising: a case, an image collector and the pedestrian detection device of any one of the second aspect.
The image collector and the pedestrian detection device are arranged on the case;
the image collector is used for shooting a pedestrian image and sending the pedestrian image to the pedestrian detection device, so that the pedestrian detection device executes the pedestrian detection method in any one of the first aspect.
Optionally, the system further includes: a liquid crystal display screen, a memory and an image processor;
the liquid crystal display screen is used for displaying a pedestrian detection result;
the memory is used for storing the pedestrian image shot by the image collector or the pedestrian detection result;
the image processor is used for controlling the liquid crystal display screen to display the pedestrian detection result.
A fourth aspect of the present application provides a pedestrian detection apparatus, the apparatus comprising a processor and a memory;
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to execute the pedestrian detection method of any one of the first aspect according to instructions in the program code.
According to the technical scheme, the method has the following advantages:
The application provides a pedestrian detection method, which comprises the following steps: establishing a pyramid-depth residual error network model; inputting a pedestrian image to be detected into the pyramid-depth residual error network model, and outputting a pedestrian detection result; the pyramid-depth residual error network model is a multi-scale pedestrian detection network model obtained by adding a convolution layer on the basis of a depth residual error network, constructing a pyramid network by up-sampling the convolution layer, and fusing the output of a residual error block of the depth residual error network with the output of the pyramid network. In this method, the pyramid-depth residual error network model is built by adding a convolution layer to the depth residual error network and constructing the pyramid network by up-sampling that convolution layer. The pyramid network extracts features of the pedestrian image to be detected at several scales, which solves the technical problem that existing pedestrian detection methods have low detection accuracy when the scale of the human body changes greatly. In addition, because the pyramid network extracts deep features, fusing its output with the output of the residual error block of the depth residual error network combines deep features with low-level features, which enriches the features and improves the accuracy of pedestrian detection.
Drawings
FIG. 1 is a schematic flow chart diagram illustrating one embodiment of a pedestrian detection method provided herein;
FIG. 2 is a schematic flow chart diagram illustrating another embodiment of a pedestrian detection method provided herein;
FIG. 3 is a schematic structural diagram of an embodiment of a pedestrian detection device provided by the present application;
FIG. 4 is a schematic block diagram illustrating one embodiment of a pedestrian detection system according to the present application;
wherein the reference numerals are:
1. a chassis; 2. a memory; 3. an image collector; 4. a central processing unit; 5. a liquid crystal display screen; 6. a graphics processor.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
For ease of understanding, referring to fig. 1, an embodiment of a pedestrian detection method provided by the present application includes:
Step 101, establishing a pyramid-depth residual error network model.
The pyramid-depth residual error network model is a multi-scale pedestrian detection network model obtained by adding a convolution layer on the basis of a depth residual error network, constructing a pyramid network by up-sampling the convolution layer, and fusing the output of a residual error block of the depth residual error network with the output of the pyramid network.
In practical applications, pedestrians often have to be detected in highly crowded and cluttered scenes, where the scale and form of the human body change in complex ways, which makes detection harder than in a simple scene. Existing pedestrian detection methods have low detection accuracy when the scale of the human body changes greatly. Therefore, in the embodiment of the application, a pyramid-depth residual error network model is constructed: a convolutional layer is added on the basis of a depth residual error network, the pyramid network is constructed by up-sampling the convolutional layer, and the output of a residual error block of the depth residual error network is fused with the output of the pyramid network to obtain the multi-scale pedestrian detection network model. The constructed pyramid-depth residual error network model extracts features of the pedestrian image to be detected at several scales, which solves the technical problem that existing pedestrian detection methods have low detection accuracy when the scale of the human body changes greatly.
The depth residual error network (that is, a deep residual network, commonly known as ResNet) is composed of a series of residual error blocks. It uses skip connections, so that the output of one convolution layer can bypass several convolution layers and serve directly as the input of a later convolution layer. This allows many layers to be stacked, reduces a certain amount of computation while the network is deepened, addresses the degradation problem of deep convolutional neural networks, and improves detection accuracy.
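The skip connection described above can be illustrated with a minimal sketch. It is not the network of the application: small fully connected layers stand in for the convolution layers, and the names `relu`, `linear` and `residual_block` are hypothetical helpers introduced only for this illustration.

```python
def relu(values):
    """Element-wise rectified linear unit on a list of floats."""
    return [max(0.0, v) for v in values]


def linear(values, weights, bias):
    """Dense layer: weights is a list of rows, one row per output unit."""
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, bias)]


def residual_block(x, w1, b1, w2, b2):
    """Compute relu(F(x) + x), where F is two stacked layers.

    The `+ x` term is the skip connection: the input bypasses the two
    layers and is added back to their output.
    """
    hidden = relu(linear(x, w1, b1))
    transformed = linear(hidden, w2, b2)
    return relu([t + xi for t, xi in zip(transformed, x)])


# With all-zero weights F(x) = 0, so the block reduces to relu(x):
zeros = [[0.0, 0.0], [0.0, 0.0]]
print(residual_block([1.0, 2.0], zeros, [0.0, 0.0], zeros, [0.0, 0.0]))
# prints [1.0, 2.0]
```

Because the block can fall back to passing its input through unchanged, stacking more such blocks should not make the output worse than a shallower stack, which is the usual intuition behind the degradation argument.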
Step 102, inputting the image of the pedestrian to be detected into the pyramid-depth residual error network model, and outputting a pedestrian detection result.
It should be noted that a video frame captured from a surveillance video, or an image taken by a camera, may be used as the image of the pedestrian to be detected. The images may also be screened first, and any image that does not contain a pedestrian is removed.
When the image of the pedestrian to be detected is input into the pyramid-depth residual error network model, the model outputs the positions of the detection frames, each of which is the position of a detected pedestrian. The detection result can be displayed on the image of the pedestrian to be detected, with a detection frame assigned to each detected pedestrian; the detection frames of pedestrians at different scales may differ in size.
The embodiment of the application provides a pedestrian detection method, which comprises the following steps: establishing a pyramid-depth residual error network model; inputting a pedestrian image to be detected into the pyramid-depth residual error network model, and outputting a pedestrian detection result; the pyramid-depth residual error network model is a multi-scale pedestrian detection network model obtained by adding a convolution layer on the basis of a depth residual error network, constructing a pyramid network by up-sampling the convolution layer, and fusing the output of a residual error block of the depth residual error network with the output of the pyramid network. In this method, the pyramid-depth residual error network model is built by adding a convolution layer to the depth residual error network and constructing the pyramid network by up-sampling that convolution layer. The pyramid network extracts features of the pedestrian image to be detected at several scales, which solves the technical problem that existing pedestrian detection methods have low detection accuracy when the scale of the human body changes greatly. In addition, because the pyramid network extracts deep features, fusing its output with the output of the residual error block of the depth residual error network combines deep features with low-level features, which enriches the features and improves the accuracy of pedestrian detection.
For easy understanding, referring to fig. 2, another embodiment of a pedestrian detection method provided by the present application includes:
step 201, acquiring a pedestrian detection image to be trained.
It should be noted that pedestrian detection images to be trained, with the pedestrian positions already marked, can be obtained from a public pedestrian detection database. Alternatively, a large number of video frames can be captured from the video surveillance of an intersection and used as the pedestrian detection images to be trained, with the pedestrian positions in the video frames marked, so that the pyramid-depth residual error network model can be trained conveniently.
In order to fully train the pyramid-depth residual error network model and improve the accuracy of pedestrian detection, a data enhancement method can be used to expand the number of obtained pedestrian detection images to be trained. For example, an appropriate amount of noise can be added to the obtained images to expand their number, which can also improve the robustness of the pyramid-depth residual error network model to a certain extent. The noise can be salt-and-pepper noise or Gaussian noise.
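The noise-based data enhancement can be sketched as follows. This is an illustrative sketch only: the application does not state noise levels, so `sigma=10.0` and `amount=0.05` are assumed values, images are taken to be grayscale lists of rows with pixel values 0..255, and the function names are hypothetical.

```python
import random


def add_gaussian_noise(image, sigma=10.0, rng=None):
    """Return a copy of a grayscale image with zero-mean Gaussian noise.

    `image` is a list of rows of integer pixel values in 0..255.
    """
    rng = rng or random.Random()
    return [[min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
            for row in image]


def add_salt_pepper_noise(image, amount=0.05, rng=None):
    """Return a copy where about `amount` of the pixels become 0 or 255."""
    rng = rng or random.Random()
    noisy = []
    for row in image:
        new_row = []
        for p in row:
            if rng.random() < amount:
                new_row.append(255 if rng.random() < 0.5 else 0)
            else:
                new_row.append(p)
        noisy.append(new_row)
    return noisy


def augment(images, rng=None):
    """Expand a dataset: keep each original and add two noisy variants."""
    rng = rng or random.Random()
    expanded = []
    for img in images:
        expanded.extend([img,
                         add_gaussian_noise(img, rng=rng),
                         add_salt_pepper_noise(img, rng=rng)])
    return expanded
```

Since noise changes pixel values but not where the pedestrians are, the marked pedestrian positions of the original image can be reused for its noisy copies.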
Step 202, preprocessing the pedestrian detection image to be trained.
It should be noted that normalization processing may be performed on the to-be-trained pedestrian detection images, so that the to-be-trained pedestrian detection images are uniform in size, and thus, the pyramid-depth residual error network model can be trained conveniently.
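One possible reading of this preprocessing step is sketched below, assuming nearest-neighbour resizing to a fixed square size followed by scaling pixel values to the range 0..1. The application specifies neither the target size nor the interpolation method, so `size=64` and the helper names are assumptions.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a grayscale image (list of rows) by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [[image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]


def preprocess(image, size=64):
    """Resize to size x size and scale pixel values from 0..255 to 0..1."""
    resized = resize_nearest(image, size, size)
    return [[p / 255.0 for p in row] for row in resized]
```

If images are resized in this way, the marked pedestrian positions would need to be rescaled by the same horizontal and vertical factors so that they still match the image.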
Step 203, inputting the pedestrian detection image to be trained into the pyramid-depth residual error network model, and training the pyramid-depth residual error network model.
The pyramid-depth residual error network model is a multi-scale pedestrian detection network model obtained by adding a convolution layer on the basis of a depth residual error network, constructing a pyramid network by up-sampling the convolution layer, and fusing the output of a residual error block of the depth residual error network with the output of the pyramid network.
The pyramid network comprises 6 convolutional layers for extracting feature maps with different resolutions. The resolutions of the feature maps extracted by these 6 convolutional layers can be, in sequence, 2 × 2, 4 × 4, 8 × 8, 16 × 16, 32 × 32 and 64 × 64. The 4 × 4 feature map is obtained by up-sampling the 2 × 2 feature map with a step size of 2, the 8 × 8 feature map is obtained by up-sampling the 4 × 4 feature map in the same way, and the 16 × 16, 32 × 32 and 64 × 64 feature maps are obtained likewise. The residual block in the pyramid-depth residual error network model is also composed of 6 different convolutional layers, and the resolutions of the feature maps extracted by these 6 convolutional layers can be, in sequence, 64 × 64, 32 × 32, 16 × 16, 8 × 8, 4 × 4 and 2 × 2. The 64 × 64 feature map is down-sampled with a step size of 2 to obtain the 32 × 32 feature map, and the 16 × 16, 8 × 8, 4 × 4 and 2 × 2 feature maps are obtained in the same way. The feature maps output by the 6 convolutional layers in the residual block are then fused with the feature maps output by the 6 convolutional layers in the pyramid network by cascading: each feature map output by the residual block is cascaded with the feature map of the same size in the pyramid network.
For example, the 64 × 64 feature map output by the residual block is cascaded with the 64 × 64 feature map output by the pyramid network, the 32 × 32 feature map output by the residual block is cascaded with the 32 × 32 feature map output by the pyramid network, the 4 × 4 feature map output by the residual block is cascaded with the 4 × 4 feature map output by the pyramid network, and so on. This finally forms 6 branches and yields the multi-scale pedestrian detection network model. The pyramid-depth residual error network model can stack a plurality of residual blocks repeatedly and finally fuse the output of the residual blocks with the output of the pyramid network. By constructing the pyramid network, features of the pedestrian image to be detected are extracted at several scales, which solves the technical problem that existing pedestrian detection methods have low detection accuracy when the scale of the human body changes greatly. In addition, fusing the output of the pyramid network with the output of the residual block enriches the features and improves the accuracy of pedestrian detection.
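The size bookkeeping of the two paths and their fusion can be sketched as below. The sketch makes several assumptions that are not stated in the application: "cascaded" is read as concatenation along the channel axis, up-sampling is nearest-neighbour, down-sampling keeps every second row and column, and each feature map has a single channel. The learned convolutions are omitted entirely, and all function names are hypothetical.

```python
def upsample2x(channel):
    """Nearest-neighbour 2x up-sampling of one channel (list of rows)."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def downsample2x(channel):
    """Stride-2 down-sampling: keep every second row and column."""
    return [row[::2] for row in channel[::2]]


def build_branches(base_64, top_2):
    """Fuse a down-sampling path with an up-sampling path, scale by scale.

    base_64: one 64x64 channel standing in for the first residual-block output.
    top_2:   one 2x2 channel standing in for the added convolution layer.
    Returns 6 fused feature maps, each a list of 2 channels of equal size.
    """
    residual_path = [base_64]                 # 64, 32, 16, 8, 4, 2
    for _ in range(5):
        residual_path.append(downsample2x(residual_path[-1]))

    pyramid_path = [top_2]                    # 2, 4, 8, 16, 32, 64
    for _ in range(5):
        pyramid_path.append(upsample2x(pyramid_path[-1]))

    by_size = {len(ch): ch for ch in pyramid_path}
    # Concatenate along the channel axis where the spatial sizes match.
    return [[res, by_size[len(res)]] for res in residual_path]


base = [[float(r + c) for c in range(64)] for r in range(64)]
top = [[1.0, 2.0], [3.0, 4.0]]
branches = build_branches(base, top)
print([len(b[0]) for b in branches])  # prints [64, 32, 16, 8, 4, 2]
```

Each of the 6 returned branches holds one map from each path at the same resolution, which mirrors the 6 detection branches described above.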
Step 204, de-duplicating the detection frames extracted by the pyramid-depth residual error network model.
It should be noted that, when the pedestrian detection image to be trained is used to train the pyramid-depth residual error network model, the model detects the positions of pedestrians according to the extracted semantic information and assigns detection frames to all possible pedestrian targets in the image. Several different detection frames may be assigned to the same pedestrian target, so there will be repeated detection frames. If the detection accuracy were calculated for every detection frame, the amount of computation of the model would increase greatly. Therefore, in the embodiment of the application, the detection frames extracted by the 6 branches are de-duplicated by non-maximum suppression, and the specific steps include:
Arranging the confidence scores of the detection frames extracted by the pyramid-depth residual error network model from high to low, and selecting the detection frame with the highest confidence score as a suggestion frame.
Calculating the area overlap ratio between the suggestion frame and each of the other detection frames, that is, the ratio of the area of the intersection of the detection frame and the suggestion frame to the area of their union.
Removing the detection frames whose area overlap ratio exceeds a preset area overlap ratio threshold, and repeating the above steps on the remaining detection frames until all detection frames have been compared, thereby removing the repeated detection frames. The preset area overlap ratio threshold can be set according to the actual situation, for example 0.5 or 0.65.
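The three steps above can be sketched as a small non-maximum suppression routine. This is a minimal sketch under stated assumptions: detection frames are given as `(x1, y1, x2, y2)` corner coordinates, which the application does not specify, and `iou` and `non_max_suppression` are hypothetical helper names. The area overlap ratio is computed as the intersection area divided by the union area.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes kept, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)          # current "suggestion frame"
        keep.append(best)
        # Drop every remaining box that overlaps the suggestion frame too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


boxes = [(10, 10, 50, 90), (12, 12, 52, 92), (100, 20, 140, 100)]
scores = [0.9, 0.8, 0.7]
print(non_max_suppression(boxes, scores))  # prints [0, 2]
```

In this example the second frame overlaps the first with a ratio of about 0.86 and is removed, while the third frame does not overlap and is kept. Raising the threshold, for instance to 0.65 as mentioned above, keeps more of the closely overlapping frames, which may matter in crowded scenes.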
Step 205, when the number of training iterations reaches a threshold value, finishing the training to obtain a trained pyramid-depth residual error network model.
It should be noted that the pyramid-depth residual error network model is trained with the pedestrian detection images to be trained. When the number of training iterations reaches the threshold value, the training can be stopped and the trained pyramid-depth residual error network model is obtained. The threshold value can be preset according to the depth of the model and the number of pedestrian detection images to be trained.
Step 206, inputting the pedestrian image to be detected into the pyramid-depth residual error network model, and outputting a detection result.
It should be noted that a video frame captured from a surveillance video, or an image captured by a camera, may be used as the image of the pedestrian to be detected. The images may also be screened first, and any image that does not contain a pedestrian is removed.
When the image of the pedestrian to be detected is input into the trained pyramid-depth residual error network model, the model outputs the positions of the detection frames, each of which is the position of a detected pedestrian. The detection result can be displayed on the image of the pedestrian to be detected, with a detection frame assigned to each detected pedestrian; the detection frames of pedestrians at different scales may differ in size.
For easy understanding, referring to fig. 3, the present application provides an embodiment of a pedestrian detection apparatus, including:
a model building module 301 and a pedestrian detection module 302.
A model establishing module 301, configured to establish a pyramid-depth residual error network model;
the pyramid-depth residual error network model is a multi-scale pedestrian detection network model obtained by adding a convolution layer on the basis of a depth residual error network, constructing a pyramid network by up-sampling the convolution layer, and fusing the output of a residual error block of the depth residual error network with the output of the pyramid network.
The pedestrian detection module 302 is configured to input a pedestrian image to be detected into the pyramid-depth residual error network model, and output a pedestrian detection result.
Further, the model building module 301 is specifically configured to:
acquiring a pedestrian detection image to be trained;
inputting a pedestrian detection image to be trained into the pyramid-depth residual error network model, and training the pyramid-depth residual error network model;
de-duplicating the detection frame extracted by the pyramid-depth residual error network model;
and when the iteration number of the training reaches a threshold value, finishing the training to obtain a trained pyramid-depth residual error network model.
For ease of understanding, referring to fig. 4, the present application provides an embodiment of a pedestrian detection system, comprising:
a case 1, an image collector 3 and the pedestrian detection device of the foregoing pedestrian detection device embodiment.
The image collector 3 and the pedestrian detection device are arranged on the case 1;
the image collector 3 is configured to capture a pedestrian image and send the pedestrian image to the pedestrian detection device, so that the pedestrian detection device executes the pedestrian detection method of the foregoing pedestrian detection method embodiments.
It should be noted that the outer shell of the case 1 may be made of an aluminum alloy, which facilitates heat dissipation. Aluminum alloy has a low density and is therefore relatively light for a given volume, which makes the system convenient to move and use. Aluminum alloy is also relatively hard, which improves the resistance of the outside of the case 1 to collisions and drops.
The image collector 3 can be an industrial camera, which offers high image stability, high transmission capability and strong anti-interference capability. One industrial camera can be installed on each of the front, left and right sides of the case 1. The industrial cameras send the captured pedestrian images to the pedestrian detection device, and the pedestrian detection device processes the received pedestrian images to obtain the pedestrian detection result.
Further, the system also includes: a liquid crystal display screen 5, a memory 2 and an image processor 6;
the liquid crystal display screen 5 is used for displaying a pedestrian detection result;
the memory 2 is used for storing the pedestrian images shot by the image collector 3 or the pedestrian detection results;
the image processor 6 is used for controlling the liquid crystal display screen 5 to display the pedestrian detection result.
It should be noted that there may be 2 or more memories 2, in which case one memory 2 may store the pedestrian images captured by the image collector 3 and another memory 2 may store the pedestrian detection results. Alternatively, a single memory 2 may store the pedestrian images captured by the image collector 3 or the pedestrian detection results. The pedestrian detection system further includes a central processing unit 4, which is the operation and control core of the pedestrian detection system.
The application also provides pedestrian detection equipment, which comprises a processor and a memory;
the memory is used for storing the program codes and transmitting the program codes to the processor;
the processor is configured to execute the pedestrian detection method in the embodiment of the pedestrian detection method described above according to instructions in the program code.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative: the division of the units is only a logical division, and other divisions are possible in practice. A plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.
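As a non-limiting illustration of how the output of a residual block may be fused with the up-sampled output of the pyramid network, the following Python sketch operates on single-channel feature maps represented as nested lists. The function names `upsample_nearest_2x` and `fuse`, the nearest-neighbour 2x up-sampling, and the element-wise addition are assumptions made only for illustration; the present application specifies up-sampling and fusion but not these particular choices.

```python
from typing import List

# A single-channel feature map as rows of floats (illustrative representation).
FeatureMap = List[List[float]]


def upsample_nearest_2x(fm: FeatureMap) -> FeatureMap:
    """Double height and width by repeating each value (nearest neighbour)."""
    out: FeatureMap = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))  # repeat each row twice
    return out


def fuse(residual_out: FeatureMap, pyramid_out: FeatureMap) -> FeatureMap:
    """Fuse a residual-block output with a coarser pyramid output.

    Assumption: fusion is an element-wise sum after 2x up-sampling of the
    coarser map; other fusion operators (e.g. concatenation) are possible.
    """
    up = upsample_nearest_2x(pyramid_out)
    if len(up) != len(residual_out) or len(up[0]) != len(residual_out[0]):
        raise ValueError("feature maps must match after up-sampling")
    return [
        [r + p for r, p in zip(res_row, up_row)]
        for res_row, up_row in zip(residual_out, up)
    ]


if __name__ == "__main__":
    fine = [[1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]]  # 2 x 4 residual output
    coarse = [[10.0, 20.0]]                               # 1 x 2 pyramid output
    print(fuse(fine, coarse))
```

In this sketch the coarser map contributes context to every position of the finer map, which is one way a multi-scale detector can combine feature maps of different resolutions.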

Claims (10)

1. A pedestrian detection method, characterized by comprising:
establishing a pyramid-depth residual error network model;
inputting a pedestrian image to be detected into the pyramid-depth residual error network model, and outputting a pedestrian detection result;
the pyramid-depth residual error network model is a multi-scale pedestrian detection network model obtained by adding a convolution layer on the basis of a depth residual error network, constructing a pyramid network by up-sampling the convolution layer, and fusing the output of a residual error block of the depth residual error network with the output of the pyramid network.
2. The pedestrian detection method of claim 1, wherein the establishing a pyramid-depth residual network model comprises:
acquiring a pedestrian detection image to be trained;
inputting the pedestrian detection image to be trained into a pyramid-depth residual error network model, and training the pyramid-depth residual error network model;
de-duplicating the detection frame extracted by the pyramid-depth residual error network model;
and when the iteration number of the training reaches a threshold value, finishing the training to obtain a trained pyramid-depth residual error network model.
3. The pedestrian detection method of claim 2, wherein the de-duplicating of the detection frame extracted by the pyramid-depth residual error network model comprises:
de-duplicating the detection frame extracted by the pyramid-depth residual error network model based on non-maximum suppression.
4. The pedestrian detection method according to claim 2, wherein after the acquiring of the pedestrian detection image to be trained, and before the inputting of the pedestrian detection image to be trained into the pyramid-depth residual error network model and the training of the pyramid-depth residual error network model, the method further comprises:
and preprocessing the pedestrian detection image to be trained.
5. The pedestrian detection method of claim 1, wherein the pyramid network includes 6 convolutional layers that extract different resolution feature maps.
6. A pedestrian detection device, characterized by comprising: a model building module and a pedestrian detection module;
the model building module is used for establishing a pyramid-depth residual error network model;
the pyramid-depth residual error network model is a multi-scale pedestrian detection network model obtained by adding a convolution layer on the basis of a depth residual error network, constructing a pyramid network by up-sampling the convolution layer, and fusing the output of a residual error block of the depth residual error network with the output of the pyramid network;
the pedestrian detection module is used for inputting the image of the pedestrian to be detected into the pyramid-depth residual error network model and outputting a pedestrian detection result.
7. The pedestrian detection device of claim 6, wherein the model building module is specifically configured to:
acquire a pedestrian detection image to be trained;
input the pedestrian detection image to be trained into a pyramid-depth residual error network model, and train the pyramid-depth residual error network model;
de-duplicate the detection frame extracted by the pyramid-depth residual error network model;
and when the iteration number of the training reaches a threshold value, finish the training to obtain a trained pyramid-depth residual error network model.
8. A pedestrian detection system, comprising: a case, an image collector, and the pedestrian detection device according to any one of claims 6 to 7;
the image collector and the pedestrian detection device are arranged on the case;
the image collector is used for shooting a pedestrian image and sending the pedestrian image to the pedestrian detection device, so that the pedestrian detection device executes the pedestrian detection method according to any one of claims 1 to 5.
9. The pedestrian detection system of claim 8, further comprising: a liquid crystal display screen, a memory, and an image processor;
the liquid crystal display screen is used for displaying a pedestrian detection result;
the memory is used for storing the pedestrian image shot by the image collector or the pedestrian detection result;
the image processor is used for controlling the liquid crystal display screen to display the pedestrian detection result.
10. A pedestrian detection apparatus, characterized in that the apparatus comprises a processor and a memory;
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to execute the pedestrian detection method of any one of claims 1-5 according to instructions in the program code.
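As a non-limiting illustration of de-duplicating detection frames based on non-maximum suppression, the following Python sketch keeps the highest-scoring frame in each group of heavily overlapping frames. The function names `iou` and `non_max_suppression`, the `(x1, y1, x2, y2)` frame format, and the default `iou_threshold` of 0.5 are assumptions made for illustration and are not taken from the present application.

```python
from typing import List, Tuple

# Detection frame as (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed format).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned detection frames."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(
    boxes: List[Box], scores: List[float], iou_threshold: float = 0.5
) -> List[int]:
    """Return indices of the frames that are kept, highest score first.

    A frame is discarded when it overlaps an already kept, higher-scoring
    frame by more than iou_threshold (an illustrative default).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    frames = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    confidences = [0.9, 0.8, 0.7]
    # The second frame overlaps the first heavily and is suppressed.
    print(non_max_suppression(frames, confidences))
```

A lower threshold removes more overlapping frames, which may merge detections of pedestrians standing close together; a higher threshold keeps more frames at the cost of duplicates.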
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910959000.7A CN110688978A (en) 2019-10-10 2019-10-10 Pedestrian detection method, device, system and equipment

Publications (1)

Publication Number Publication Date
CN110688978A (en) 2020-01-14

Family

ID=69112036

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910959000.7A Pending CN110688978A (en) 2019-10-10 2019-10-10 Pedestrian detection method, device, system and equipment

Country Status (1)

Country Link
CN (1) CN110688978A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108090906A (en) * 2018-01-30 2018-05-29 浙江大学 A kind of uterine neck image processing method and device based on region nomination
CN110211139A (en) * 2019-06-12 2019-09-06 安徽大学 Automatic segmentation Radiotherapy of Esophageal Cancer target area and the method and system for jeopardizing organ
CN110232350A (en) * 2019-06-10 2019-09-13 哈尔滨工程大学 A kind of real-time water surface multiple mobile object detecting and tracking method based on on-line study

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DONGYOON HAN ET AL: "《Deep Pyramidal Residual Networks》", 《IEEE》 *
谢金衡 et al.: 《基于深度残差金字塔网络的实时多人脸关键点定位算法》 [Real-time multi-face keypoint localization algorithm based on a deep residual pyramid network], 《HTTP://KNS.CNKI.NET/KCMS/DETAIL/51.1307.TP.20190822.0958.012.HTML》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113940635A (en) * 2021-11-25 2022-01-18 南京邮电大学 Skin lesion segmentation and feature extraction method based on depth residual pyramid
CN113940635B (en) * 2021-11-25 2023-09-26 南京邮电大学 Skin lesion segmentation and feature extraction method based on depth residual pyramid

Similar Documents

Publication Publication Date Title
Jaritz et al. Sparse and dense data with cnns: Depth completion and semantic segmentation
US10943145B2 (en) Image processing methods and apparatus, and electronic devices
US11048948B2 (en) System and method for counting objects
JP7147078B2 (en) Video frame information labeling method, apparatus, apparatus and computer program
WO2019213459A1 (en) System and method for generating image landmarks
CN110443761B (en) Single image rain removing method based on multi-scale aggregation characteristics
CN108875931B (en) Neural network training and image processing method, device and system
CN112329702B (en) Method and device for rapid face density prediction and face detection, electronic equipment and storage medium
CN111104925B (en) Image processing method, image processing apparatus, storage medium, and electronic device
WO2022151661A1 (en) Three-dimensional reconstruction method and apparatus, device and storage medium
US10582179B2 (en) Method and apparatus for processing binocular disparity image
CN111008631B (en) Image association method and device, storage medium and electronic device
CN112036381B (en) Visual tracking method, video monitoring method and terminal equipment
CN111027555B (en) License plate recognition method and device and electronic equipment
CN111127516A (en) Target detection and tracking method and system without search box
WO2021249114A1 (en) Target tracking method and target tracking device
CN113065645A (en) Twin attention network, image processing method and device
CN113112542A (en) Visual positioning method and device, electronic equipment and storage medium
CN110688978A (en) Pedestrian detection method, device, system and equipment
CN114169425A (en) Training target tracking model and target tracking method and device
CN108229281B (en) Neural network generation method, face detection device and electronic equipment
CN115630660B (en) Barcode positioning method and device based on convolutional neural network
CN112235598A (en) Video structured processing method and device and terminal equipment
CN114913470B (en) Event detection method and device
CN116012609A (en) Multi-target tracking method, device, electronic equipment and medium for looking around fish eyes

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200114)