CN110688978A - Pedestrian detection method, device, system and equipment - Google Patents
- Publication number
- CN110688978A; CN201910959000.7A
- Authority
- CN
- China
- Prior art keywords
- pedestrian detection
- pyramid
- residual error
- depth residual
- network model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The application discloses a pedestrian detection method, device, system and equipment. The method comprises: establishing a pyramid-depth residual error network model; and inputting a pedestrian image to be detected into the pyramid-depth residual error network model and outputting a pedestrian detection result. The pyramid-depth residual error network model is a multi-scale pedestrian detection network model obtained by adding a convolution layer on the basis of a depth residual error network, constructing a pyramid network by up-sampling the convolution layer, and fusing the output of a residual error block of the depth residual error network with the output of the pyramid network. This addresses the technical problem that existing pedestrian detection methods have low detection accuracy when the scale of the human body varies widely.
Description
Technical Field
The present application relates to the field of computer vision technologies, and in particular, to a pedestrian detection method, apparatus, system, and device.
Background
Pedestrian detection is an important part of computer vision and is widely applied in fields such as automatic driving and video monitoring. With the publication of many large pedestrian detection data sets, pedestrian detection algorithms have improved significantly. In practical applications, however, pedestrians often have to be detected in highly crowded and cluttered scenes, where the scale and posture of the human body vary in complex ways, which is harder than detection in a single scene. Existing pedestrian detection methods have low detection accuracy when the scale of the human body varies widely.
Disclosure of Invention
The application provides a pedestrian detection method, device, system and equipment, which are used to solve the technical problem that existing pedestrian detection methods have low detection accuracy when the scale of the human body varies widely.
In view of the above, a first aspect of the present application provides a pedestrian detection method, including:
establishing a pyramid-depth residual error network model;
inputting a pedestrian image to be detected into the pyramid-depth residual error network model, and outputting a pedestrian detection result;
the pyramid-depth residual error network model is a multi-scale pedestrian detection network model obtained by adding a convolution layer on the basis of a depth residual error network, constructing a pyramid network by up-sampling the convolution layer, and fusing the output of a residual error block of the depth residual error network with the output of the pyramid network.
Optionally, the establishing a pyramid-depth residual error network model includes:
acquiring a pedestrian detection image to be trained;
inputting the pedestrian detection image to be trained into a pyramid-depth residual error network model, and training the pyramid-depth residual error network model;
de-duplicating the detection frame extracted by the pyramid-depth residual error network model;
and when the iteration number of the training reaches a threshold value, finishing the training to obtain a trained pyramid-depth residual error network model.
Optionally, the de-duplicating the detection frame extracted by the pyramid-depth residual error network model includes:
de-duplicating the detection frame extracted by the pyramid-depth residual error network model based on non-maximum suppression.
Optionally, after acquiring the pedestrian detection image to be trained and before inputting it into the pyramid-depth residual error network model to train the model, the method further includes:
and preprocessing the pedestrian detection image to be trained.
Optionally, the pyramid network includes 6 convolutional layers for extracting feature maps with different resolutions.
A second aspect of the present application provides a pedestrian detection apparatus comprising: a model building module and a pedestrian detection module;
the model establishing module is used for establishing a pyramid-depth residual error network model;
the pyramid-depth residual error network model is a multi-scale pedestrian detection network model obtained by adding a convolution layer on the basis of a depth residual error network, constructing a pyramid network by up-sampling the convolution layer, and fusing the output of a residual error block of the depth residual error network with the output of the pyramid network;
the pedestrian detection module is used for inputting the image of the pedestrian to be detected into the pyramid-depth residual error network model and outputting a pedestrian detection result.
Optionally, the model building module is specifically configured to:
acquiring a pedestrian detection image to be trained;
inputting the pedestrian detection image to be trained into a pyramid-depth residual error network model, and training the pyramid-depth residual error network model;
de-duplicating the detection frame extracted by the pyramid-depth residual error network model;
and when the iteration number of the training reaches a threshold value, finishing the training to obtain a trained pyramid-depth residual error network model.
A third aspect of the present application provides a pedestrian detection system comprising: a case, an image collector and the pedestrian detection device of any one of the second aspect;
the image collector and the pedestrian detection device are arranged on the case;
the image collector is used for shooting a pedestrian image and sending the pedestrian image to the pedestrian detection device, so that the pedestrian detection device executes the pedestrian detection method in any one of the first aspect.
Optionally, the system further includes: a liquid crystal display screen, a memory and an image processor;
the liquid crystal display screen is used for displaying a pedestrian detection result;
the memory is used for storing the pedestrian image shot by the image collector or the pedestrian detection result;
the image processor is used for controlling the liquid crystal display screen to display the pedestrian detection result.
A fourth aspect of the present application provides a pedestrian detection apparatus, the apparatus comprising a processor and a memory;
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to execute the pedestrian detection method of any one of the first aspect according to instructions in the program code.
According to the technical scheme, the method has the following advantages:
The application provides a pedestrian detection method, comprising: establishing a pyramid-depth residual error network model; and inputting a pedestrian image to be detected into the pyramid-depth residual error network model and outputting a pedestrian detection result. The pyramid-depth residual error network model is a multi-scale pedestrian detection network model obtained by adding a convolution layer on the basis of a depth residual error network, constructing a pyramid network by up-sampling the convolution layer, and fusing the output of a residual error block of the depth residual error network with the output of the pyramid network. Because the pyramid network extracts features of the pedestrian image to be detected at multiple scales, the method addresses the technical problem that existing pedestrian detection methods have low detection accuracy when the scale of the human body varies widely. In addition, because the pyramid network extracts deep features, fusing its output with the output of the residual error block of the depth residual error network combines deep features with low-level features, which enriches the features and improves the accuracy of pedestrian detection.
Drawings
FIG. 1 is a schematic flow chart diagram illustrating one embodiment of a pedestrian detection method provided herein;
FIG. 2 is a schematic flow chart diagram illustrating another embodiment of a pedestrian detection method provided herein;
FIG. 3 is a schematic structural diagram of an embodiment of a pedestrian detection device provided by the present application;
FIG. 4 is a schematic block diagram illustrating one embodiment of a pedestrian detection system according to the present application;
wherein the reference numerals are:
1. a case; 2. a memory; 3. an image collector; 4. a central processing unit; 5. a liquid crystal display screen; 6. an image processor.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
For ease of understanding, referring to fig. 1, an embodiment of a pedestrian detection method provided by the present application includes:
Step 101, establishing a pyramid-depth residual error network model.
The pyramid-depth residual error network model is a multi-scale pedestrian detection network model obtained by adding a convolution layer on the basis of a depth residual error network, constructing a pyramid network by up-sampling the convolution layer, and fusing the output of a residual error block of the depth residual error network with the output of the pyramid network.
In practical applications, pedestrians often have to be detected in highly crowded and cluttered scenes, where the scale and posture of the human body vary in complex ways. Detection in such scenes is harder than in a single scene, and existing pedestrian detection methods have low detection accuracy when the scale of the human body varies widely. Therefore, in this embodiment a pyramid-depth residual error network model is constructed: a convolutional layer is added on the basis of a depth residual error network, a pyramid network is constructed by up-sampling that convolutional layer, and the output of the residual error blocks of the depth residual error network is fused with the output of the pyramid network to obtain a multi-scale pedestrian detection network model. The constructed model extracts features of the pedestrian image to be detected at multiple scales, which addresses the low detection accuracy of existing methods under large scale variation.
The depth residual error network (that is, a deep residual network) is composed of a series of residual error blocks and uses skip (cross-layer) connections, so that the output of one convolutional layer can bypass several convolutional layers and serve directly as the input of a later convolutional layer. This allows many layers to be stacked, reduces a certain amount of calculation while the network is deepened, addresses the degradation problem of deep convolutional neural networks, and improves detection accuracy.
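The skip connection described above can be illustrated with a minimal sketch. It is not the implementation of this application: the names `residual_block` and `zero_layers` are hypothetical, plain Python lists stand in for feature maps, and applying a ReLU after the addition is an assumption borrowed from common residual-network designs.

```python
from typing import Callable, List

Vector = List[float]


def relu(v: Vector) -> Vector:
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def residual_block(x: Vector, layers: Callable[[Vector], Vector]) -> Vector:
    """Return relu(F(x) + x): the input skips over `layers` and is added back."""
    fx = layers(x)
    if len(fx) != len(x):
        raise ValueError("skip connection needs matching shapes")
    return relu([a + b for a, b in zip(fx, x)])


def zero_layers(v: Vector) -> Vector:
    """Stand-in for stacked convolutions that have learned nothing (F(x) = 0)."""
    return [0.0 for _ in v]


# With F(x) = 0 the block reduces to the identity on non-negative inputs,
# so stacking more such blocks need not make a deeper network worse.
out = residual_block([1.0, 2.0, 3.0], zero_layers)
```

Because the block only has to learn the residual F(x) rather than the full mapping, an extra block can fall back to passing its input through unchanged, which is the usual explanation for why skip connections ease the degradation problem.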
Step 102, inputting the image of the pedestrian to be detected into the pyramid-depth residual error network model, and outputting a pedestrian detection result.
It should be noted that a video frame captured from a surveillance video, or an image shot by a camera, may be used as the pedestrian image to be detected. The pedestrian images to be detected may also be screened, and any image that contains no pedestrian is removed.
When the pedestrian image to be detected is input into the pyramid-depth residual error network model, the model outputs the positions of the detection frames, each of which is the position of a detected pedestrian. The detection result can be displayed on the pedestrian image to be detected, with a detection frame allocated to each detected pedestrian; pedestrians of different scales may have detection frames of different sizes.
The embodiment of the application provides a pedestrian detection method, comprising: establishing a pyramid-depth residual error network model; and inputting a pedestrian image to be detected into the pyramid-depth residual error network model and outputting a pedestrian detection result. The pyramid-depth residual error network model is a multi-scale pedestrian detection network model obtained by adding a convolution layer on the basis of a depth residual error network, constructing a pyramid network by up-sampling the convolution layer, and fusing the output of a residual error block of the depth residual error network with the output of the pyramid network. Because the pyramid network extracts features of the pedestrian image to be detected at multiple scales, the method addresses the technical problem that existing pedestrian detection methods have low detection accuracy when the scale of the human body varies widely. In addition, because the pyramid network extracts deep features, fusing its output with the output of the residual error block of the depth residual error network combines deep features with low-level features, which enriches the features and improves the accuracy of pedestrian detection.
For easy understanding, referring to fig. 2, another embodiment of a pedestrian detection method provided by the present application includes:
Step 201, acquiring a pedestrian detection image to be trained.
It should be noted that the to-be-trained pedestrian detection image with the marked pedestrian position can be obtained from the public pedestrian detection database, a large number of video frames can be intercepted from the video monitoring of the intersection as the to-be-trained pedestrian detection image, and the pedestrian position in the video frame is marked, so that the pyramid-depth residual error network model can be conveniently trained.
In order to fully train the pyramid-depth residual error network model and improve the accuracy rate of pedestrian detection, a data enhancement method can be adopted to perform quantity expansion on the obtained pedestrian detection images to be trained. For example, appropriate noise can be added to the obtained to-be-trained pedestrian detection images to expand the number of to-be-trained pedestrian detection images, and the robustness of the pyramid-depth residual error network model can be improved to a certain extent, wherein the noise can be salt and pepper noise or gaussian noise.
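The noise-based expansion of the training set can be sketched as follows. This is an illustration only: the function names, the 5% salt-and-pepper ratio, the Gaussian standard deviation of 10 and the use of nested lists for grayscale images are all assumptions, not values from this application.

```python
import random
from typing import List

Image = List[List[int]]  # grayscale image, pixel values 0..255


def add_salt_pepper(img: Image, amount: float, rng: random.Random) -> Image:
    """Set roughly `amount` of the pixels to pure black (0) or white (255)."""
    out = [row[:] for row in img]
    for row in out:
        for x in range(len(row)):
            if rng.random() < amount:
                row[x] = rng.choice((0, 255))
    return out


def add_gaussian(img: Image, sigma: float, rng: random.Random) -> Image:
    """Add zero-mean Gaussian noise and clip back to the 0..255 range."""
    return [
        [min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
        for row in img
    ]


def augment(images: List[Image], seed: int = 0) -> List[Image]:
    """Return the originals plus one noisy copy of each kind per image."""
    rng = random.Random(seed)
    result = list(images)
    for img in images:
        result.append(add_salt_pepper(img, amount=0.05, rng=rng))
        result.append(add_gaussian(img, sigma=10.0, rng=rng))
    return result
```

Since noise changes pixel values but not pixel positions, the marked pedestrian positions of the original image can be reused unchanged for each noisy copy.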
Step 202, preprocessing the pedestrian detection image to be trained.
It should be noted that normalization processing may be performed on the to-be-trained pedestrian detection images, so that the to-be-trained pedestrian detection images are uniform in size, and thus, the pyramid-depth residual error network model can be trained conveniently.
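One way to bring training images to a uniform size is sketched below. The application does not state the target size or the resampling method, so nearest-neighbour sampling and the helper names `resize_nearest` and `resize_box` are assumptions made for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]
Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def resize_nearest(img: Image, out_h: int, out_w: int) -> Image:
    """Resize by nearest-neighbour sampling (no interpolation)."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def resize_box(box: Box, in_hw: Tuple[int, int], out_hw: Tuple[int, int]) -> Box:
    """Scale a marked pedestrian box by the same factors as the image."""
    sy = out_hw[0] / in_hw[0]
    sx = out_hw[1] / in_hw[1]
    x1, y1, x2, y2 = box
    return (x1 * sx, y1 * sy, x2 * sx, y2 * sy)
```

Unlike the noise augmentation, resizing moves pixels, so the marked pedestrian positions have to be rescaled together with the image.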
Step 203, inputting the pedestrian detection image to be trained into the pyramid-depth residual error network model, and training the pyramid-depth residual error network model.
The pyramid-depth residual error network model is a multi-scale pedestrian detection network model obtained by adding a convolution layer on the basis of a depth residual error network, constructing a pyramid network by up-sampling the convolution layer, and fusing the output of a residual error block of the depth residual error network with the output of the pyramid network.
The pyramid network comprises 6 convolutional layers for extracting feature maps with different resolutions. The resolutions of the feature maps extracted by these 6 convolutional layers may be, in sequence, 2 × 2, 4 × 4, 8 × 8, 16 × 16, 32 × 32 and 64 × 64: the 2 × 2 feature map is up-sampled by a factor of 2 to obtain the 4 × 4 feature map, the 4 × 4 feature map is up-sampled in the same way to obtain the 8 × 8 feature map, and the 16 × 16, 32 × 32 and 64 × 64 feature maps are obtained likewise.
The residual block in the pyramid-depth residual error network model is also composed of 6 different convolutional layers. The resolutions of the feature maps extracted by these 6 convolutional layers may be, in sequence, 64 × 64, 32 × 32, 16 × 16, 8 × 8, 4 × 4 and 2 × 2: the 64 × 64 feature map is down-sampled by a factor of 2 to obtain the 32 × 32 feature map, and the 16 × 16, 8 × 8, 4 × 4 and 2 × 2 feature maps are obtained by the same method.
The feature maps output by the 6 convolutional layers in the residual block are fused with the feature maps output by the 6 convolutional layers in the pyramid network by cascading (concatenating) them: each feature map output by the residual block is cascaded with the pyramid network feature map of the same size.
For example, a feature map of 64 × 64 size output from the residual block is cascaded with a feature map of 64 × 64 size output from the pyramid network, a feature map of 32 × 32 size output from the residual block is cascaded with a feature map of 32 × 32 size output from the pyramid network, a feature map of 4 × 4 size output from the residual block is cascaded with a feature map of 4 × 4 size output from the pyramid network, and so on, and finally 6 branches are formed, and a multi-scale pedestrian detection network model is obtained. The pyramid-depth residual error network model can repeatedly stack a plurality of residual error blocks, finally, the output of the residual error blocks and the output of the pyramid network are fused, and a plurality of scale features of the image of the pedestrian to be detected are extracted by constructing the pyramid network, so that the technical problem that the pedestrian detection accuracy is low when the human body is subjected to large-scale change in the existing pedestrian detection method is solved. Meanwhile, the output of the pyramid network and the output of the residual block are fused, so that the feature richness is improved, and the accuracy of pedestrian detection is improved.
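The size bookkeeping of the two paths and the same-size cascading can be sketched as follows. This is a simplified illustration under stated assumptions: the convolutions are omitted entirely, down-sampling keeps every second row and column, up-sampling repeats pixels, each feature map has a single channel, the pyramid path is assumed to start from the coarsest map, and `build_branches` is a hypothetical name.

```python
from typing import List

Map = List[List[float]]  # one single-channel feature map


def downsample2(m: Map) -> Map:
    """Halve the resolution by keeping every second row and column."""
    return [row[::2] for row in m[::2]]


def upsample2(m: Map) -> Map:
    """Double the resolution by nearest-neighbour repetition."""
    out = []
    for row in m:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(wide[:])
    return out


def build_branches(feature: Map, levels: int = 6) -> List[List[Map]]:
    """Return one fused branch per scale, ordered from coarse to fine."""
    # Residual path: 64x64 -> 32x32 -> ... -> 2x2.
    residual = [feature]
    for _ in range(levels - 1):
        residual.append(downsample2(residual[-1]))
    # Pyramid path: start at the coarsest map and up-sample back to 64x64.
    pyramid = [residual[-1]]
    for _ in range(levels - 1):
        pyramid.append(upsample2(pyramid[-1]))
    # Cascade (concatenate along the channel axis) maps of equal size.
    by_size = {len(m): m for m in residual}
    return [[by_size[len(p)], p] for p in pyramid]


branches = build_branches([[float(x + y) for x in range(64)] for y in range(64)])
sizes = [len(branch[0]) for branch in branches]
```

Each of the 6 resulting branches holds two channels of identical resolution, one from each path, which is the precondition for concatenating them before a detection head is applied.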
Step 204, de-duplicating the detection frames extracted by the pyramid-depth residual error network model.
It should be noted that, when the pyramid-depth residual error network model is trained with the pedestrian detection images to be trained, the model detects pedestrian positions according to the extracted semantic information and allocates detection frames to all possible pedestrian targets in each image. Several different detection frames may be allocated to the same pedestrian target, so duplicate detection frames will exist. Calculating the detection accuracy for every detection frame would greatly increase the amount of calculation of the model. Therefore, in this embodiment the detection frames extracted by the 6 branches are de-duplicated by non-maximum suppression, with the following specific steps:
arranging the detection frames extracted by the pyramid-depth residual error network model in descending order of confidence score, and selecting the detection frame with the highest confidence score as the suggestion frame;
calculating the area overlap ratio between the suggestion frame and each other detection frame, that is, the ratio of the area of their intersection to the area of their union;
removing every detection frame whose area overlap ratio exceeds a preset area overlap ratio threshold, and repeating the above steps until the area overlap ratio has been calculated between all detection frames and compared with the threshold, thereby removing the duplicate detection frames. The preset area overlap ratio threshold can be set according to the actual situation, for example to 0.5 or 0.65.
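The three steps above correspond to greedy non-maximum suppression, which can be sketched as follows. The box format `(x1, y1, x2, y2)`, the function names and the example boxes are assumptions for illustration; the default threshold of 0.5 is one of the values mentioned in the text.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection area divided by union area of two boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], threshold: float = 0.5) -> List[int]:
    """Return indices of the boxes kept, highest confidence first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)  # current suggestion frame
        keep.append(best)
        # Drop every remaining frame that overlaps the suggestion frame too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= threshold]
    return keep


boxes = [(10, 10, 50, 90), (12, 12, 52, 92), (100, 20, 140, 100)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # the second box duplicates the first and is removed
```

A higher threshold such as 0.65 removes fewer frames, which may help in crowded scenes where neighbouring pedestrians legitimately overlap.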
Step 205, when the iteration number of the training reaches a threshold value, finishing the training to obtain a trained pyramid-depth residual error network model.
It should be noted that the pyramid-depth residual error network model is trained through the pedestrian detection image to be trained, when the number of iterations of the training reaches a threshold value, the training can be stopped, the trained pyramid-depth residual error network model is obtained, and the threshold value can be preset according to the depth of the model and the number of the pedestrian detection images to be trained.
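The stopping rule can be sketched as a loop that runs a fixed number of training iterations. The `train` function and its `step` callback are hypothetical placeholders for one parameter update of the model; the application does not specify the optimizer or the loss.

```python
from itertools import cycle
from typing import Callable, Iterable, List


def train(step: Callable[[object], float], batches: Iterable[object],
          max_iterations: int) -> List[float]:
    """Call `step` on successive batches until the iteration threshold is hit.

    `batches` must be non-empty; it is cycled so that the threshold, not the
    size of the data set, decides when training stops.
    """
    losses: List[float] = []
    stream = cycle(batches)
    for _ in range(max_iterations):
        losses.append(step(next(stream)))
    return losses
```

Passing a larger `max_iterations` for a deeper model or a larger training set matches the remark that the threshold is preset according to model depth and the number of training images.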
Step 206, inputting the pedestrian image to be detected into the trained pyramid-depth residual error network model, and outputting a detection result.
It should be noted that a video frame captured from a surveillance video, or an image shot by a camera, may be used as the pedestrian image to be detected. The pedestrian images to be detected may also be screened, and any image that contains no pedestrian is removed.
When the pedestrian image to be detected is input into the trained pyramid-depth residual error network model, the model outputs the positions of the detection frames, each of which is the position of a detected pedestrian. The detection result can be displayed on the pedestrian image to be detected, with a detection frame allocated to each detected pedestrian; pedestrians of different scales may have detection frames of different sizes.
For easy understanding, referring to fig. 3, the present application provides an embodiment of a pedestrian detection apparatus, including:
a model building module 301 and a pedestrian detection module 302.
A model establishing module 301, configured to establish a pyramid-depth residual error network model;
the pyramid-depth residual error network model is a multi-scale pedestrian detection network model obtained by adding a convolution layer on the basis of a depth residual error network, constructing a pyramid network by up-sampling the convolution layer, and fusing the output of a residual error block of the depth residual error network with the output of the pyramid network.
The pedestrian detection module 302 is configured to input a pedestrian image to be detected into the pyramid-depth residual error network model, and output a pedestrian detection result.
Further, the model building module 301 is specifically configured to:
acquiring a pedestrian detection image to be trained;
inputting a pedestrian detection image to be trained into the pyramid-depth residual error network model, and training the pyramid-depth residual error network model;
de-duplicating the detection frame extracted by the pyramid-depth residual error network model;
and when the iteration number of the training reaches a threshold value, finishing the training to obtain a trained pyramid-depth residual error network model.
For ease of understanding, referring to fig. 4, the present application provides an embodiment of a pedestrian detection system, comprising:
a case 1, an image collector 3 and the pedestrian detection device of the foregoing pedestrian detection device embodiment.
The image collector 3 and the pedestrian detection device are arranged on the case 1;
the image collector 3 is configured to capture a pedestrian image and send the pedestrian image to the pedestrian detection device, so that the pedestrian detection device executes the pedestrian detection method in the embodiment of the pedestrian detection method.
It should be noted that the outside of the case 1 may be made of an aluminum alloy material, which facilitates heat dissipation. Aluminum alloy has a low density and is therefore relatively light for a given volume, which makes the case convenient to move and use. Aluminum alloy is also relatively hard, which improves the collision and drop resistance of the outside of the case 1.
Further, the system also includes: a liquid crystal display screen 5, a memory 2 and an image processor 6;
the liquid crystal display screen 5 is used for displaying a pedestrian detection result;
the memory 2 is used for storing the pedestrian images shot by the image collector 3 or the pedestrian detection results;
the image processor 6 is used for controlling the liquid crystal display screen 5 to display the pedestrian detection result.
It should be noted that there may be two or more memories 2: one memory 2 may store the pedestrian images shot by the image collector 3 and another memory 2 may store the pedestrian detection results. Alternatively, a single memory 2 may store the pedestrian images shot by the image collector 3 or the pedestrian detection results. The pedestrian detection system further includes a central processing unit 4, which is the operation and control core of the pedestrian detection system.
The application also provides pedestrian detection equipment, which comprises a processor and a memory;
the memory is used for storing the program codes and transmitting the program codes to the processor;
the processor is configured to execute the pedestrian detection method in the embodiment of the pedestrian detection method described above according to instructions in the program code.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program codes.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.
Claims (10)
1. A pedestrian detection method, characterized by comprising:
establishing a pyramid-depth residual error network model;
inputting a pedestrian image to be detected into the pyramid-depth residual error network model, and outputting a pedestrian detection result;
the pyramid-depth residual error network model is a multi-scale pedestrian detection network model obtained by adding a convolution layer on the basis of a depth residual error network, constructing a pyramid network by up-sampling the convolution layer, and fusing the output of a residual error block of the depth residual error network with the output of the pyramid network.
2. The pedestrian detection method of claim 1, wherein establishing the pyramid-depth residual error network model comprises:
acquiring a pedestrian detection image to be trained;
inputting the pedestrian detection image to be trained into a pyramid-depth residual error network model, and training the pyramid-depth residual error network model;
de-duplicating the detection frame extracted by the pyramid-depth residual error network model;
and when the iteration number of the training reaches a threshold value, finishing the training to obtain a trained pyramid-depth residual error network model.
3. The pedestrian detection method of claim 2, wherein the de-duplicating of the detection frame extracted by the pyramid-depth residual error network model comprises:
de-duplicating the detection frame extracted by the pyramid-depth residual error network model based on non-maximum suppression.
4. The pedestrian detection method according to claim 2, wherein, after acquiring the pedestrian detection image to be trained and before inputting the pedestrian detection image to be trained into the pyramid-depth residual error network model and training the pyramid-depth residual error network model, the method further comprises:
preprocessing the pedestrian detection image to be trained.
5. The pedestrian detection method of claim 1, wherein the pyramid network includes 6 convolutional layers that extract feature maps of different resolutions.
6. A pedestrian detection device, characterized by comprising: a model establishing module and a pedestrian detection module;
the model establishing module is used for establishing a pyramid-depth residual error network model;
the pyramid-depth residual error network model is a multi-scale pedestrian detection network model obtained by adding a convolution layer on the basis of a depth residual error network, constructing a pyramid network by up-sampling the convolution layer, and fusing the output of a residual error block of the depth residual error network with the output of the pyramid network;
the pedestrian detection module is used for inputting the image of the pedestrian to be detected into the pyramid-depth residual error network model and outputting a pedestrian detection result.
7. The pedestrian detection device of claim 6, wherein the model establishing module is specifically configured to:
acquire a pedestrian detection image to be trained;
input the pedestrian detection image to be trained into a pyramid-depth residual error network model, and train the pyramid-depth residual error network model;
de-duplicate the detection frame extracted by the pyramid-depth residual error network model;
and when the iteration number of the training reaches a threshold value, finish the training to obtain a trained pyramid-depth residual error network model.
8. A pedestrian detection system, comprising: a case, an image collector and a pedestrian detection device according to any one of claims 6 to 7;
the image collector and the pedestrian detection device are arranged on the case;
the image collector is used for shooting a pedestrian image and sending the pedestrian image to the pedestrian detection device, so that the pedestrian detection device executes the pedestrian detection method according to any one of claims 1 to 5.
9. The pedestrian detection system of claim 8, further comprising: a liquid crystal display screen, a memory and an image processor;
the liquid crystal display screen is used for displaying a pedestrian detection result;
the memory is used for storing the pedestrian image shot by the image collector or the pedestrian detection result;
the image processor is used for controlling the liquid crystal display screen to display the pedestrian detection result.
10. Pedestrian detection equipment, characterized in that the equipment comprises a processor and a memory;
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to execute the pedestrian detection method of any one of claims 1-5 according to instructions in the program code.
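Claim 3 de-duplicates the detection frames by non-maximum suppression. The following is a minimal sketch of that step in plain Python, not the applicant's implementation. It assumes each detection frame is given as corner coordinates `(x1, y1, x2, y2)` with one confidence score per frame, and it uses an overlap threshold of 0.5; the application states none of these specifics, and the names `iou` and `non_max_suppression` are illustrative only.

```python
from typing import List, Sequence, Tuple

# Assumed frame format (not specified in the application): (x1, y1, x2, y2)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned frames."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,  # assumed value, chosen for illustration
) -> List[int]:
    """Return indices of the frames kept, highest score first."""
    # Visit frames from the most to the least confident.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a frame only if it does not overlap a kept frame too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    frames = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    confidences = [0.9, 0.8, 0.7]
    print(non_max_suppression(frames, confidences))
```

In the usage example, the second frame overlaps the first with an IoU of 81/119 (about 0.68) and is therefore suppressed, while the third frame does not overlap any kept frame and is retained.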
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910959000.7A CN110688978A (en) | 2019-10-10 | 2019-10-10 | Pedestrian detection method, device, system and equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110688978A (en) | 2020-01-14 |
Family
ID=69112036
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910959000.7A Pending CN110688978A (en) | 2019-10-10 | 2019-10-10 | Pedestrian detection method, device, system and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110688978A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108090906A (en) * | 2018-01-30 | 2018-05-29 | 浙江大学 | A kind of uterine neck image processing method and device based on region nomination |
CN110211139A (en) * | 2019-06-12 | 2019-09-06 | 安徽大学 | Automatic segmentation Radiotherapy of Esophageal Cancer target area and the method and system for jeopardizing organ |
CN110232350A (en) * | 2019-06-10 | 2019-09-13 | 哈尔滨工程大学 | A kind of real-time water surface multiple mobile object detecting and tracking method based on on-line study |
Non-Patent Citations (2)
Title |
---|
DONGYOON HAN ET AL: "《Deep Pyramidal Residual Networks》", 《IEEE》 * |
谢金衡 等;: "《基于深度残差金字塔网络的实时多人脸关键点定位算法》", 《HTTP://KNS.CNKI.NET/KCMS/DETAIL/51.1307.TP.20190822.0958.012.HTML》 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113940635A (en) * | 2021-11-25 | 2022-01-18 | 南京邮电大学 | Skin lesion segmentation and feature extraction method based on depth residual pyramid |
CN113940635B (en) * | 2021-11-25 | 2023-09-26 | 南京邮电大学 | Skin lesion segmentation and feature extraction method based on depth residual pyramid |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Jaritz et al. | Sparse and dense data with cnns: Depth completion and semantic segmentation | |
US10943145B2 (en) | Image processing methods and apparatus, and electronic devices | |
US11048948B2 (en) | System and method for counting objects | |
JP7147078B2 (en) | Video frame information labeling method, apparatus, apparatus and computer program | |
WO2019213459A1 (en) | System and method for generating image landmarks | |
CN110443761B (en) | Single image rain removing method based on multi-scale aggregation characteristics | |
CN108875931B (en) | Neural network training and image processing method, device and system | |
CN112329702B (en) | Method and device for rapid face density prediction and face detection, electronic equipment and storage medium | |
CN111104925B (en) | Image processing method, image processing apparatus, storage medium, and electronic device | |
WO2022151661A1 (en) | Three-dimensional reconstruction method and apparatus, device and storage medium | |
US10582179B2 (en) | Method and apparatus for processing binocular disparity image | |
CN111008631B (en) | Image association method and device, storage medium and electronic device | |
CN112036381B (en) | Visual tracking method, video monitoring method and terminal equipment | |
CN111027555B (en) | License plate recognition method and device and electronic equipment | |
CN111127516A (en) | Target detection and tracking method and system without search box | |
WO2021249114A1 (en) | Target tracking method and target tracking device | |
CN113065645A (en) | Twin attention network, image processing method and device | |
CN113112542A (en) | Visual positioning method and device, electronic equipment and storage medium | |
CN110688978A (en) | Pedestrian detection method, device, system and equipment | |
CN114169425A (en) | Training target tracking model and target tracking method and device | |
CN108229281B (en) | Neural network generation method, face detection device and electronic equipment | |
CN115630660B (en) | Barcode positioning method and device based on convolutional neural network | |
CN112235598A (en) | Video structured processing method and device and terminal equipment | |
CN114913470B (en) | Event detection method and device | |
CN116012609A (en) | Multi-target tracking method, device, electronic equipment and medium for looking around fish eyes |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200114 |