CN115565134B - Diagnostic method, system, equipment and storage medium for monitoring blind area of dome camera
- Publication number
- CN115565134B CN115565134B CN202211251757.9A CN202211251757A CN115565134B CN 115565134 B CN115565134 B CN 115565134B CN 202211251757 A CN202211251757 A CN 202211251757A CN 115565134 B CN115565134 B CN 115565134B
- Authority
- CN
- China
- Prior art keywords
- coordinates
- monitoring
- point cloud
- pavement marker
- video frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
Abstract
The invention discloses a diagnostic method, system, equipment and storage medium for a monitoring blind area of a dome camera. In the method, single images captured at different monitoring view angles in the dome camera's video feed are matched by a deep learning technique to realize 3D scene reconstruction, yielding reconstructed three-dimensional point cloud coordinates. A video frame picture containing the pavement markers of the currently monitored video area is then obtained, and the pavement markers are converted from 2D coordinates to corresponding 3D point cloud coordinates, specifically: acquiring the current video frame picture; detecting the pavement markers with a target detection model and outputting their two-dimensional coordinates; and obtaining the corresponding 3D point cloud coordinates from the 2D coordinates of the pavement markers. Finally, the 3D point cloud coordinates of the pavement markers are matched against the reconstructed three-dimensional point cloud coordinates, and whether a monitoring blind area exists is judged from the matching degree. By obtaining the drivable road area through pavement segmentation and judging the existence of a blind area through 3D reconstruction, the invention reduces the influence of illumination and of traffic flow.
Description
Technical Field
The invention belongs to the technical field of video analysis, and particularly relates to a method, a system, equipment and a storage medium for diagnosing a monitoring blind area of a dome camera.
Background
Highway dome cameras can be operated by a ministry-level center, a provincial center, a regional center, a subordinate road-section monitoring center and so on. Each center sets the camera's preset positions differently, and a camera operated by one level of department is not necessarily reset by another. A dome camera is installed along each stretch of the expressway; if a camera has been rotated so that its monitored area no longer matches the preset monitoring area, its field of view overlaps that of an adjacent camera instead, leaving part of the expressway unmonitored — a monitoring blind area in which the road conditions cannot be observed. The prior art includes schemes for detecting changes in shooting pose, which judge whether the camera has moved from the offset of a target reference object's contour between successive images. On an expressway, however, weather (e.g. rain or fog) and traffic flow can blur the contour image, so the presence of a monitoring blind area cannot be judged accurately.
Disclosure of Invention
The invention mainly aims to overcome the defects and shortcomings of the prior art and provides a method, a system, equipment and a storage medium for diagnosing a monitoring blind area of a dome camera.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
In a first aspect, the present invention provides a diagnostic method for a monitoring blind area of a dome camera, comprising the following steps:
matching single images with different monitoring visual angles in the video monitoring of the dome camera through a deep learning technology, so as to realize 3D scene reconstruction and reconstruct three-dimensional point cloud coordinates;
the method comprises the steps of obtaining a pavement marker video frame picture of a current monitoring video area, and converting a pavement marker from 2D coordinates to corresponding 3D point cloud coordinates, wherein the method specifically comprises the following steps: acquiring a current video frame picture; detecting the pavement marker by using a target detection model, and giving out the two-dimensional coordinates of the pavement marker; acquiring corresponding 3D point cloud coordinates according to the 2D coordinates of the pavement marker;
and matching the 3D point cloud coordinates of the pavement marker with the reconstructed three-dimensional point cloud coordinates, and judging whether a monitoring blind area exists or not according to the matching degree.
As a preferable technical scheme, the matching of single images with different monitoring view angles in the dome camera's video monitoring through a deep learning technology, so as to realize 3D scene reconstruction and reconstruct three-dimensional point cloud coordinates, is specifically:
acquiring a main monitoring video frame picture and video frame pictures under a plurality of different angle deflection angles under a normal state of the dome camera;
extracting multi-channel feature maps by downsampling;
transforming the features of the video frames captured at the other angles into the main monitoring view by a stereo transformation;
outputting the probability of each depth with a 3D convolution operation, taking the probability-weighted average over depth to obtain the predicted depth information, and filtering and smoothing the depth using the information of the multiple view angles;
selecting correct depth information using the reconstruction constraints of the multiple pictures and reconstructing it into a three-dimensional point cloud.
As a preferable technical solution, transforming the features of the video frames of the other angles into the video frame of the main monitoring area by the stereo transformation specifically comprises:
transforming the reference view into a camera coordinate system corresponding to the original view through a differentiable homography to estimate a depth map of the reference view: according to the prior depth range information, taking a main optical axis of a reference view as a scanning direction, and according to a fixed minimum depth interval, obtaining a view cone of the original view through differential homography transformation, namely, mapping the features of the original view to the features of different depths in the target view through the camera internal and external parameters and the target depth d.
As a preferable technical solution, using the 3D convolution operation to output the probability of each depth, computing the weighted average over depth to obtain the predicted depth information, and filtering and smoothing the depth using the information of multiple view angles, is specifically:
linearly interpolating the view cone into a feature volume of size [D, C, h/4, w/4], the N views forming N feature volumes;
computing a variance-based matching cost across the N feature volumes;
because the initial feature volume obtained from the variance is noisy, regularizing it: following the encoder-decoder principle, a 3D U-Net structure reduces the channel dimension C to 1, yielding a probability volume of size [D, h/4, w/4], and applying softmax along the dimension D gives the probability of each pixel along the depth direction;
estimating the depth value of each pixel as the expectation along the depth direction.
As a preferred technical scheme, the target detection model adopts a PP-YOLO model. The PP-YOLO network uses a modified ResNet50-vd backbone with DCN in the last stage, and the detection head follows the YOLOv3 design, comprising one 3×3 convolution followed by one 1×1 convolution. Targets of three anchor sizes are predicted at each position in the feature map; for each anchor, the first K channels are the probabilities of the K categories, the next 4 channels are the predicted box coordinates, and the last channel is the foreground/background objectness score. Cross entropy and L1 loss serve as the loss functions for classification and regression respectively, and a foreground/background loss supervises whether a prediction is foreground. After feature extraction is performed on a video frame with the weight file obtained through training, the position coordinates, confidence and vehicle-type information of each vehicle object in the picture are located and the detection result is output.
As a preferred technical solution, in the step of obtaining the corresponding 3D point cloud coordinates from the 2D coordinates of the pavement marker, the correspondence between a spatial point [x, y, z] and its pixel coordinates [u, v, d] in the image is:
u = fx · x / z + cx
v = fy · y / z + cy
d = z · s
The four parameters fx, fy, cx, cy constitute the intrinsic matrix C of the camera; given the intrinsics, the spatial position and pixel coordinates of each point are related by this matrix model.
As a preferable technical scheme, the matching of the 3D point cloud coordinates of the pavement marker with the reconstructed three-dimensional point cloud coordinates, judging from the matching degree whether a monitoring blind area exists, is specifically:
a sphere of radius R centered at a key point p is used to describe a histogram of the 3D model's shape and contour in a log-polar coordinate system; within this sphere, the space is divided j times radially, k times in longitude and l times in latitude, each resulting small region corresponding to one element of a j×k×l feature vector, with the weight calculated as follows:
where V(j, k, l) is the volume of the small region and ρ_i is the local point density around it;
the final 3D descriptor is computed by accumulating the weighted sum of the points in each small region, with normalization by the region volume compensating for the large variation of region size with radius and latitude.
In a second aspect, the invention further provides a diagnostic system for the monitoring blind area of a dome camera, applied to the above diagnostic method, comprising a 3D scene reconstruction module, a pavement marker acquisition module and a coordinate matching module;
the 3D scene reconstruction module is used for matching single images with different monitoring visual angles in the video monitoring of the dome camera through a deep learning technology to realize 3D scene reconstruction and reconstruct three-dimensional point cloud coordinates;
the pavement marker acquisition module is used for acquiring a pavement marker video frame picture of a current monitoring video area, converting a pavement marker from 2D coordinates to corresponding 3D point cloud coordinates, and specifically comprises the following steps: acquiring a current video frame picture; detecting the pavement marker by using a target detection model, and giving out the two-dimensional coordinates of the pavement marker; acquiring corresponding 3D point cloud coordinates according to the 2D coordinates of the pavement marker;
the coordinate matching module is used for matching the 3D point cloud coordinates of the pavement marker with the reconstructed three-dimensional point cloud coordinates, and judging whether a monitoring blind area exists or not according to the matching degree.
In a third aspect, the present invention also provides an electronic device, including:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein,
the memory stores computer program instructions executable by the at least one processor to enable the at least one processor to perform the method of diagnosing a ball machine blind spot.
In a fourth aspect, the present invention further provides a computer readable storage medium storing a program, where the program, when executed by a processor, implements the method for diagnosing a monitoring blind area of a ball machine.
Compared with the prior art, the invention has the following advantages and beneficial effects:
In the diagnostic method for the monitoring blind area of a dome camera, single images of different monitoring view angles in the dome camera's video monitoring are matched through a deep learning technology to realize 3D scene reconstruction and obtain three-dimensional point cloud coordinates; a video frame picture of the pavement markers in the currently monitored video area is obtained, the pavement markers are converted from 2D coordinates to corresponding 3D point cloud coordinates, the 3D point cloud coordinates of the pavement markers are matched with the reconstructed three-dimensional point cloud coordinates, and whether a monitoring blind area exists is judged from the matching degree. Because camera-rotation judgments based on appearance can be disturbed by illumination conditions (e.g. rain or fog) and by passing vehicles, the invention instead obtains the drivable road area through pavement segmentation and judges the existence of a blind area through the 3D reconstruction technique, reducing the influence of both illumination and traffic flow; it also reduces the cases where a blind area forms because the dome camera has been rotated and the road is no longer properly monitored.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a method for diagnosing a monitoring blind area of a dome camera according to an embodiment of the invention;
fig. 2 is a block diagram of a monitoring blind area diagnosis system of a ball machine according to an embodiment of the invention.
Fig. 3 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to enable those skilled in the art to better understand the present application, the following description will make clear and complete descriptions of the technical solutions in the embodiments of the present application with reference to the accompanying drawings in the embodiments of the present application. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly understand that the embodiments described herein may be combined with other embodiments.
Referring to fig. 1, the present embodiment provides a diagnostic method for a monitoring blind area of a dome camera, which realizes 3D scene reconstruction by matching single images of different monitoring angles in the video monitoring of dome cameras under multi-level control, and meanwhile applies feature matching to the markers in the monitored video area to determine whether the dome camera has a monitoring blind area caused by movement; the method comprises the following steps:
and S100, matching single images of different monitoring visual angles in the video monitoring of the dome camera through a deep learning technology, and reconstructing a 3D scene into three-dimensional point cloud coordinates.
The following further describes the technical scheme of step S100:
s110, acquiring a main monitoring video frame picture and video frame pictures under a plurality of different angle deflection angles under a normal state of the dome camera.
It can be understood that dome cameras are usually installed on both sides of an expressway, one at each interval, and the whole road section is covered by the videos captured by adjacent cameras. A dome camera can rotate: to capture scenes outside the main monitoring area, the control center can command it to rotate by a certain angle, realizing wide-range scene monitoring. However, after sending such a command, the control center sometimes forgets to restore the camera to the main monitoring area, so the videos captured by adjacent cameras leave a blind area and full-section monitoring of the expressway is no longer achieved.
The normal state refers to the state in which the dome camera is not deflected and the video stream of the main monitoring area is being captured.
The method comprises the steps of obtaining video streams of a dome camera in a normal state, rotating the dome camera by 10 degrees and 20 degrees leftwards and by 10 degrees and 20 degrees rightwards respectively, obtaining video streams at corresponding angles respectively, extracting images in corresponding videos from the respective video streams, and forming a panoramic image of a shot scene by the video frame images.
S120, extracting a 32-channel feature map by using downsampling;
for example, N pictures are selected each time, one of the N pictures is a reference view (reference image), the other is a source image, an 8-layer 2D feature extraction network is constructed, the convolution steps of the 3 rd layer and the 6 th layer of the network are 2, so that a feature map of 3 scales is obtained, and meanwhile, the parameter weights of the network are shared. And extracting the characteristics of the selected reference view and the original view by using a 2D characteristic extraction network, and obtaining a characteristic diagram with a channel of 32 and length and width bits of h/4 and w/4 respectively through twice scaling with the stride of 2.
S130, converting the characteristics of video frame pictures at other angles into video frame pictures of a main monitoring area by adopting stereo conversion;
it can be appreciated that, since the reference view and the original view are different in view angle, the reference view is transformed into the camera coordinate system corresponding to the original view through a differentiable homography to estimate the depth map of the reference view: according to priori depth range information, taking a main optical axis of a reference view as a scanning direction, and according to a fixed minimum depth interval, obtaining a view cone of the original view through differential homography transformation (the features of the original view are mapped to the feature bodies with different depths in the target view through camera internal and external parameters and target depth d); the calculation formula is as follows:
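The per-depth warping described above can be sketched as follows; the per-pixel back-project/transform/project formulation (which the per-depth-plane homography is equivalent to) and all names are illustrative assumptions, not the patent's exact formula:

```python
import numpy as np

def warp_ref_pixel_to_src(u, v, d, K_ref, K_src, R, t):
    """Map a reference-view pixel (u, v), hypothesised at depth d,
    into the source view: back-project, rigid transform, re-project."""
    # back-project to a 3D point in the reference camera frame
    p_ref = d * np.linalg.inv(K_ref) @ np.array([u, v, 1.0])
    # rigid transform into the source camera frame
    p_src = R @ p_ref + t
    # perspective projection into the source image
    uvw = K_src @ p_src
    return uvw[:2] / uvw[2]
```

Repeating this mapping for each hypothesised depth d sweeps the source-view features onto the planes of the reference-view frustum.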
s140, outputting the probability of each depth by using 3D convolution operation, obtaining the weighted average of the depth to obtain predicted depth information, and filtering and smoothing the depth by using the information of a plurality of view angles;
further, the specific content of step S140 is as follows:
s141, linearly interpolating the view cone to a feature body with the size of [ D, C, h/4,w/4], wherein N views form N feature bodies;
s142, calculating matching cost of the plurality of feature bodies based on variance on the basis of the N feature bodies:
s143, the initial feature body obtained based on variance has more noise, the initial feature body is regularized, the principle of coding and decoding is adopted, the channel C is reduced to 1 by utilizing the feature body obtained by a 3D Unet network structure, a probability body with the dimension of [ D, h/4,w/4] is obtained, and the probability of each pixel along the depth direction is obtained by making softmax on the dimension D.
S144, estimating the depth value of each pixel point by using a desired mode, wherein the calculation formula is as follows:
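Steps S141–S144 can be sketched as below; this is a simplified illustration in which the 3D U-Net regularization of S143 is replaced by a plain channel mean, and all names are assumptions:

```python
import numpy as np

def regress_depth(feature_volumes, depth_hypotheses):
    """Variance-based cost over N warped feature volumes [D, C, H, W],
    softmax over depth, then the probability-weighted mean of the
    depth hypotheses (the 'expectation' of S144). Returns [H, W]."""
    vols = np.stack(feature_volumes)             # [N, D, C, H, W]
    variance = vols.var(axis=0)                  # [D, C, H, W]  (S142)
    cost = variance.mean(axis=1)                 # collapse C -> [D, H, W]
    # low variance = good photo-consistency, so negate before softmax (S143)
    e = np.exp(-cost - (-cost).max(axis=0, keepdims=True))
    prob = e / e.sum(axis=0, keepdims=True)      # probability along depth
    depths = np.asarray(depth_hypotheses)[:, None, None]
    return (prob * depths).sum(axis=0)           # expected depth (S144)
```

When one depth hypothesis is markedly more photo-consistent (lower variance) than the rest, the expected depth is pulled toward it.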
s150, selecting correct depth information by using reconstruction constraints of a plurality of pictures, and reconstructing the depth information into a three-dimensional point cloud.
And S200, acquiring a pavement marker video frame picture of the current monitoring video area, and converting the pavement marker from the 2D coordinates to corresponding 3D point cloud coordinates.
The following further describes the technical scheme of step S200:
s210, acquiring a current video frame picture.
Illustratively, a current video frame picture is obtained from a current video stream of the dome camera, wherein the current video frame picture comprises pavement markers including but not limited to lane lines, isolation belts, green belts, road signs and the like.
And S220, detecting the pavement marker by using the target detection model, and giving out the two-dimensional coordinates of the pavement marker.
Illustratively, in one embodiment, the pavement marker is detected using a PP-yolo model, resulting in two-dimensional 2D coordinates (x, y) of the pavement marker.
Further, in this embodiment, the PP-YOLO network uses a modified ResNet50-vd backbone with DCN in the last stage, and the detection head follows the YOLOv3 design, comprising one 3×3 convolution followed by one 1×1 convolution. Targets of three anchor sizes are predicted at each position in the feature map; for each anchor, the first K channels are the probabilities of the K categories, the next 4 channels are the predicted box coordinates, and the last channel is the foreground/background objectness score. Cross entropy and L1 loss serve as the loss functions for classification and regression respectively, and a foreground/background loss supervises whether a prediction is foreground. After feature extraction is performed on the video frame with the weight file obtained through training, the position coordinates, confidence, vehicle type and other attributes of each vehicle object in the picture are located, and detection results objects (x1, y1, x2, y2, score, cls) are output.
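A minimal sketch of splitting such a head output into per-anchor predictions, assuming the channel layout described above (K class channels, 4 box channels, 1 objectness channel per anchor); this is not the actual PP-YOLO post-processing code, and the names are illustrative:

```python
import numpy as np

def decode_head(raw, num_classes):
    """Split a YOLOv3-style head output of shape [3 * (K + 5), H, W]
    into class logits, box coordinates and objectness per anchor."""
    K = num_classes
    per_anchor = K + 5
    anchors = raw.reshape(3, per_anchor, *raw.shape[1:])
    cls_logits = anchors[:, :K]        # [3, K, H, W]  class channels
    boxes      = anchors[:, K:K + 4]   # [3, 4, H, W]  box channels
    objectness = anchors[:, K + 4]     # [3, H, W]     fg/bg score
    return cls_logits, boxes, objectness
```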
S230, acquiring corresponding 3D point cloud coordinates according to the 2D coordinates of the pavement marker.
Further, the correspondence between a spatial point [x, y, z] and its pixel coordinates [u, v, d] (d denotes the depth value) in the image is:
u = fx · x / z + cx
v = fy · y / z + cy
d = z · s
The four parameters fx, fy, cx, cy constitute the intrinsic matrix C of the camera; given the intrinsics, the spatial position and pixel coordinates of each point are related by this matrix model.
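The inversion of these pinhole relations, used to lift a detected 2D marker pixel to its 3D point, can be sketched as follows (function names are illustrative):

```python
import numpy as np

def pixel_to_point(u, v, d, fx, fy, cx, cy, s=1.0):
    """Invert u = fx*x/z + cx, v = fy*y/z + cy, d = z*s to recover
    the 3D point [x, y, z] for a detected pavement-marker pixel."""
    z = d / s
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

def point_to_pixel(p, fx, fy, cx, cy, s=1.0):
    """Forward pinhole model, useful as a round-trip check."""
    x, y, z = p
    return fx * x / z + cx, fy * y / z + cy, z * s
```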
and S300, matching the 3D point cloud coordinates of the pavement marker in the step S200 with the three-dimensional point cloud coordinates reconstructed in the step S100, and judging whether a monitoring blind area exists according to the matching degree.
The step S300 specifically includes:
a sphere of radius R centered at a key point p is used to describe a histogram of the 3D model's shape and contour in a log-polar coordinate system; within this sphere, the space is divided j times radially, k times in longitude and l times in latitude, each resulting small region corresponding to one element of a j×k×l feature vector, with the weight calculated as follows:
where V(j, k, l) is the volume of the small region and ρ_i is the local point density around it;
normalization by the small-region volume compensates for the large variation of region size with radius and latitude; the final 3D descriptor is computed by accumulating the weighted sum of the points in each small region.
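A simplified sketch of such a volume-normalized spherical histogram descriptor; the local-density term ρ_i is omitted for brevity, and the binning details are assumptions rather than the patent's exact weighting:

```python
import numpy as np

def shape_context_3d(points, center, R, j_bins=2, k_bins=4, l_bins=2):
    """Bin the points inside a sphere of radius R around `center` by
    radius, longitude and latitude; each bin count is normalised by
    the cube root of an approximate bin volume, mimicking the
    volume-compensation described in the text."""
    rel = np.asarray(points, dtype=float) - np.asarray(center, dtype=float)
    r = np.linalg.norm(rel, axis=1)
    inside = (r > 1e-9) & (r <= R)
    rel, r = rel[inside], r[inside]
    lon = np.arctan2(rel[:, 1], rel[:, 0])             # [-pi, pi)
    lat = np.arccos(np.clip(rel[:, 2] / r, -1.0, 1.0)) # [0, pi]
    ji = np.minimum((r / R * j_bins).astype(int), j_bins - 1)
    ki = np.minimum(((lon + np.pi) / (2 * np.pi) * k_bins).astype(int), k_bins - 1)
    li = np.minimum((lat / np.pi * l_bins).astype(int), l_bins - 1)
    hist = np.zeros((j_bins, k_bins, l_bins))
    np.add.at(hist, (ji, ki, li), 1.0)
    # approximate bin volume: each radial shell split evenly over angular bins
    edges = np.arange(j_bins + 1) * R / j_bins
    shell = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    vol = shell[:, None, None] / (k_bins * l_bins) * np.ones_like(hist)
    return (hist / np.cbrt(vol)).ravel()               # j*k*l feature vector
```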
It should be noted that, for the sake of simplicity of description, the foregoing method embodiments are all expressed as a series of combinations of actions, but it should be understood by those skilled in the art that the present invention is not limited by the order of actions described, as some steps may be performed in other order or simultaneously in accordance with the present invention.
Based on the same ideas of the ball machine monitoring blind area diagnosis method in the above embodiment, the present invention also provides a ball machine monitoring blind area diagnosis system, which can be used to execute the above ball machine monitoring blind area diagnosis method. For ease of illustration, only those portions of the structural schematic diagram of an embodiment of the ball machine blind spot monitoring diagnostic system that are relevant to an embodiment of the present invention are shown, and those skilled in the art will appreciate that the illustrated structure is not limiting of the apparatus and may include more or fewer components than illustrated, or may combine certain components, or a different arrangement of components.
Referring to fig. 2, in another embodiment of the present application, a diagnostic system 100 for a spherical camera monitoring blind area is provided, which includes a 3D scene reconstruction module 101, a road marker acquisition module 102, and a coordinate matching module 103;
the 3D scene reconstruction module 101 is configured to match single images with different monitoring angles in video monitoring of the dome camera through a deep learning technology, so as to implement 3D scene reconstruction, and reconstruct into three-dimensional point cloud coordinates;
the pavement marker acquisition module 102 is configured to acquire a pavement marker video frame picture of a current surveillance video area, and convert a pavement marker from 2D coordinates to corresponding 3D point cloud coordinates, specifically: acquiring a current video frame picture; detecting the pavement marker by using a target detection model, and giving out the two-dimensional coordinates of the pavement marker; acquiring corresponding 3D point cloud coordinates according to the 2D coordinates of the pavement marker;
the coordinate matching module 103 is configured to match the 3D point cloud coordinates of the pavement marker with the reconstructed three-dimensional point cloud coordinates, and determine whether a monitoring blind area exists according to the matching degree.
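A minimal sketch of turning the point-cloud match into a blind-area decision; the nearest-neighbour criterion, tolerance and acceptance ratio are hypothetical illustration choices, not values from the patent:

```python
import numpy as np

def matching_degree(marker_pts, scene_pts, tol=0.5):
    """Fraction of pavement-marker 3D points that have a reconstructed
    scene point within distance `tol` — a simple proxy for the
    'matching degree' between the two point clouds."""
    marker = np.asarray(marker_pts, dtype=float)[:, None, :]  # [M, 1, 3]
    scene = np.asarray(scene_pts, dtype=float)[None, :, :]    # [1, S, 3]
    dmin = np.linalg.norm(marker - scene, axis=2).min(axis=1)
    return float((dmin <= tol).mean())

def has_blind_area(marker_pts, scene_pts, tol=0.5, min_match=0.8):
    """Declare a monitoring blind area when the matching degree falls
    below a (hypothetical) acceptance ratio."""
    return matching_degree(marker_pts, scene_pts, tol) < min_match
```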
It should be noted that the diagnostic system for the monitoring blind area of a dome camera provided by the invention corresponds one-to-one with the diagnostic method described above; the technical features and beneficial effects described in the method embodiments apply equally to the system embodiments, and the specific content can be found in the description of the method embodiments, which is not repeated here.
In addition, in the implementation of the dome camera monitoring blind area diagnostic system of the foregoing embodiment, the logical division into program modules is merely illustrative. In practical applications, the above functions may be allocated to different program modules as needed, for example to meet the configuration requirements of corresponding hardware or to ease the implementation of software; that is, the internal structure of the diagnostic system is divided into different program modules to perform all or part of the functions described above.
Referring to fig. 3, in one embodiment, an electronic device for implementing the diagnostic method for a dome camera monitoring blind area is provided. The electronic device 200 may include a first processor 201, a first memory 202, and a bus, and may further include a computer program stored in the first memory 202 and executable on the first processor 201, such as the dome camera monitoring blind area diagnostic program 203.
The first memory 202 includes at least one type of readable storage medium, including flash memory, a removable hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the first memory 202 may be an internal storage unit of the electronic device 200, such as a hard disk of the electronic device 200. In other embodiments, the first memory 202 may also be an external storage device of the electronic device 200, such as a plug-in removable hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the electronic device 200. Further, the first memory 202 may include both an internal storage unit and an external storage device of the electronic device 200. The first memory 202 may be used not only to store application software installed in the electronic device 200 and various data, such as the code of the dome camera monitoring blind area diagnostic program 203, but also to temporarily store data that has been output or is to be output.
In some embodiments, the first processor 201 may be formed by an integrated circuit, for example a single packaged integrated circuit, or by a plurality of packaged integrated circuits with the same or different functions, including one or more central processing units (CPUs), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and so on. The first processor 201 is the control unit of the electronic device; it connects the various components of the entire electronic device using various interfaces and lines, and executes the various functions of the electronic device 200 and processes data by running or executing the programs or modules stored in the first memory 202 and calling the data stored therein.
Fig. 3 shows only an electronic device with certain components; those skilled in the art will understand that the structure shown in fig. 3 does not limit the electronic device 200, which may include fewer or more components than shown, combine certain components, or arrange the components differently.
The dome camera monitoring blind area diagnostic program 203 stored in the first memory 202 of the electronic device 200 is a combination of a plurality of instructions which, when run on the first processor 201, may implement:
S100, acquiring a main monitoring video frame picture and video frame pictures at a plurality of different deflection angles while the dome camera is in a normal state, and reconstructing them into three-dimensional point cloud coordinates.
S200, acquiring a pavement marker video frame picture of the current monitoring video area, and converting the pavement marker from 2D coordinates to corresponding 3D point cloud coordinates, specifically: acquiring the current video frame picture; detecting the pavement marker by using a target detection model and outputting its two-dimensional coordinates; and acquiring the corresponding 3D point cloud coordinates according to the 2D coordinates of the pavement marker;
S300, matching the 3D point cloud coordinates of the pavement marker obtained in step S200 with the three-dimensional point cloud coordinates reconstructed in step S100, and judging whether a monitoring blind area exists according to the matching degree.
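The three steps above culminate in a matching-degree decision. The sketch below is purely illustrative, not the patented implementation: the function names, the distance tolerance `tol`, and the decision `threshold` are all assumptions. It takes the matching degree to be the fraction of marker 3D points that find a reconstructed scene point within a distance tolerance.

```python
import numpy as np

def matching_degree(marker_pts, scene_pts, tol=0.2):
    """Fraction of marker 3D points that have a reconstructed scene point
    within distance `tol` (brute-force nearest neighbour; a KD-tree would
    be used in practice)."""
    marker_pts = np.asarray(marker_pts, dtype=float)
    scene_pts = np.asarray(scene_pts, dtype=float)
    d = np.linalg.norm(marker_pts[:, None, :] - scene_pts[None, :, :], axis=2)
    return float(np.mean(d.min(axis=1) <= tol))

def has_blind_area(marker_pts, scene_pts, threshold=0.8, tol=0.2):
    """Declare a monitoring blind area when too few marker points match."""
    return matching_degree(marker_pts, scene_pts, tol) < threshold
```

A marker that falls entirely outside the reconstructed point cloud yields a matching degree near zero and is flagged as a blind area.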
Further, the modules/units integrated in the electronic device 200 may be stored in a non-volatile computer-readable storage medium if implemented in the form of software functional units and sold or used as a stand-alone product. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a read-only memory (ROM).
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing relevant hardware; the program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other media used in the various embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of these technical features are described; however, as long as a combination of technical features contains no contradiction, it should be considered within the scope of this description.
The above examples are preferred embodiments of the present invention, but the embodiments of the present invention are not limited thereto. Any other change, modification, substitution, combination, or simplification that does not depart from the spirit and principle of the present invention is an equivalent replacement and is included within the protection scope of the present invention.
Claims (7)
1. A diagnostic method for a dome camera monitoring blind area, characterized by comprising the following steps:
through the deep learning technology, matching single images of different monitoring visual angles in the video monitoring of the dome camera, realizing 3D scene reconstruction, reconstructing into three-dimensional point cloud coordinates, and specifically:
acquiring a main monitoring video frame picture and video frame pictures under a plurality of different angle deflection angles under a normal state of the dome camera;
extracting a multi-channel feature map by using downsampling;
converting the features of the video frame pictures at other angles into the video frame picture of the main monitoring area by adopting homography transformation;
outputting the probability of each depth by using a 3D convolution operation, obtaining predicted depth information as the weighted average of the depths, and filtering and smoothing the depth by using the information of the plurality of viewing angles;
selecting correct depth information by using reconstruction constraint of a plurality of pictures, and reconstructing the correct depth information into a three-dimensional point cloud;
acquiring a pavement marker video frame picture of the current monitoring video area, and converting the pavement marker from 2D coordinates to corresponding 3D point cloud coordinates, specifically: acquiring the current video frame picture; detecting the pavement marker by using a target detection model and outputting its two-dimensional coordinates; and acquiring the corresponding 3D point cloud coordinates according to the 2D coordinates of the pavement marker; in the step of acquiring the corresponding 3D point cloud coordinates according to the 2D coordinates of the pavement marker, the correspondence between a spatial point [x, y, z] and its pixel coordinates [u, v, d] in the image is as follows:
u = x·fx / z + cx;
v = y·fy / z + cy;
d = z·s;
where fx, fy, cx and cy are the four parameters that define the camera intrinsic matrix C. Once the intrinsics are given, the spatial position and pixel coordinates of each point are converted through the matrix model:
z·[u, v, 1]ᵀ = C·[x, y, z]ᵀ, with C = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]];
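The projection equations above, and their inverse (which is how a 2D marker detection plus depth yields a 3D point), can be sketched as follows. This is an illustrative rendering of the standard pinhole model, not the patent's code; `s` is the depth scale factor from the equations.

```python
def project(point_xyz, fx, fy, cx, cy, s=1.0):
    """Map a camera-frame point [x, y, z] to pixel coordinates [u, v, d]
    using the pinhole intrinsics fx, fy, cx, cy and depth scale s."""
    x, y, z = point_xyz
    u = x * fx / z + cx
    v = y * fy / z + cy
    d = z * s
    return u, v, d

def back_project(u, v, d, fx, fy, cx, cy, s=1.0):
    """Invert the mapping: recover the 3D point [x, y, z] from [u, v, d]."""
    z = d / s
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return x, y, z
```

A round trip through `project` and `back_project` recovers the original point, which is exactly the 2D-to-3D conversion the step describes.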
matching the 3D point cloud coordinates of the pavement marker with the reconstructed three-dimensional point cloud coordinates, and judging whether a monitoring blind area exists or not according to the matching degree; the method comprises the following steps:
taking a sphere of radius R centered at key point p, and describing the shape and contour of the 3D model with a histogram in a log-polar coordinate system; the sphere is divided j times in the radial direction, k times in the longitude direction and l times in the latitude direction, and each resulting small region corresponds to one element of the j·k·l-dimensional feature vector, whose weight is calculated as follows:
w = 1 / (p_i · ∛V(j, k, l));
where V(j, k, l) is the volume of the small region and p_i is the local point density around the small region;
the final 3D descriptor is calculated by accumulating the weighted sum of the points of each small region; this normalization by the small-region volume compensates for the large variation of small-region size with radius and latitude.
2. The diagnostic method for the monitoring blind area of the dome camera according to claim 1, wherein the feature of the video frame picture of other angles is converted into the video frame picture of the main monitoring area by adopting the homography transformation, specifically:
transforming the reference view into the camera coordinate system corresponding to the original view through a differentiable homography to estimate the depth map of the reference view: according to prior depth range information, taking the main optical axis of the reference view as the scanning direction and, at a fixed minimum depth interval, obtaining the view cone of the original view through the differentiable homography transformation, i.e., mapping the features of the original view to features at different depths in the target view through the camera intrinsic and extrinsic parameters and the target depth d.
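The differentiable homography for one fronto-parallel depth plane can be sketched in the MVSNet-style plane-sweep form. The matrix expression and variable names below are assumptions, not quoted from the patent: `R`, `t` denote the source camera's pose relative to the reference view, and the plane normal is the reference optical axis.

```python
import numpy as np

def plane_homography(K_src, R, t, K_ref, depth):
    """Homography mapping reference-view pixels on a fronto-parallel plane
    at the given depth into the source view (plane-sweep form)."""
    n = np.array([0.0, 0.0, 1.0])  # plane normal = reference optical axis
    H = K_src @ (R - np.outer(t, n) / depth) @ np.linalg.inv(K_ref)
    return H / H[2, 2]

def warp_pixel(H, u, v):
    """Apply the homography to one pixel, with homogeneous normalisation."""
    p = H @ np.array([u, v, 1.0])
    return p[0] / p[2], p[1] / p[2]
```

Sweeping `depth` over the prior depth range produces one warped feature map per hypothesis, which is how the view cone of the original view is built.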
3. The diagnostic method for a dome camera monitoring blind area according to claim 1, wherein the 3D convolution operation is used to output the probability of each depth, predicted depth information is obtained as the weighted average of the depths, and the depth is filtered and smoothed using the information of the plurality of viewing angles, specifically:
linearly interpolating the view cone into a feature volume of size [D, C, h/4, w/4], the N views forming N feature volumes;
calculating the variance-based matching cost of the plurality of feature volumes on the basis of the N feature volumes;
since the initial feature volume obtained from the variance is noisy, regularizing it on the encoder-decoder principle with a 3D U-Net network structure to reduce the channel dimension C to 1, obtaining a probability volume of dimension [D, h/4, w/4], and applying softmax along the dimension D to obtain the probability of each pixel along the depth direction;
estimating the depth value of each pixel using the expectation.
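The variance cost, depth-wise softmax, and expectation of claim 3 can be sketched numerically as below. This is illustrative only: the 3D U-Net regularisation is replaced by a plain channel mean, and all names are assumptions.

```python
import numpy as np

def variance_cost(feature_volumes):
    """Variance over N warped feature volumes [N, D, C, H, W] -> cost [D, C, H, W]."""
    return np.var(np.stack(feature_volumes, axis=0), axis=0)

def expected_depth(cost, depths):
    """Collapse channels (stand-in for the 3D U-Net), softmax along the
    depth axis D, then take the expectation (soft-argmin) per pixel."""
    score = -cost.mean(axis=1)                 # [D, H, W]; low cost -> high score
    e = np.exp(score - score.max(axis=0))      # numerically stable softmax
    prob = e / e.sum(axis=0)                   # probability of each depth per pixel
    return np.tensordot(depths, prob, axes=1)  # weighted average -> [H, W]
```

Because identical features give zero variance and the softmax concentrates on the lowest-cost hypothesis, the expectation lands on the best-matching depth.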
4. The method of claim 1, wherein the target detection model is a PP-YOLO model; the PP-YOLO network uses a modified ResNet50-vd backbone with DCN in the last stage, and the head takes the form of YOLOv3, including a 3×3 convolution and a 1×1 convolution; targets of three anchor sizes are predicted at each position in the feature map, where the first K channels of each anchor are the probabilities of the K categories, the next 4 channels are the predicted box coordinates, and the last channel is the foreground-background (objectness) score; cross entropy and L1 loss are used as the loss functions for classification and regression, respectively, and an objectness loss supervises whether an anchor is foreground; after feature extraction is performed on the video frame using the weight file obtained by training, the position coordinates, confidence and vehicle type information of the vehicles in the picture are located, and the detection result is output.
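The per-anchor channel layout described in claim 4 (K class probabilities, then 4 box coordinates, then 1 objectness score) can be sliced as in the sketch below; this is an illustration of the layout, not PP-YOLO's actual post-processing code.

```python
import numpy as np

def split_anchor_channels(pred, num_classes):
    """Split one anchor's prediction vector into the claimed channel layout:
    first K class probabilities, then 4 predicted box coordinates,
    then 1 foreground-background (objectness) score."""
    k = num_classes
    cls_probs = pred[:k]
    box = pred[k:k + 4]
    objectness = pred[k + 4]
    return cls_probs, box, objectness
```

Applying this to each of the three anchors at each feature-map position yields the class scores, box, and objectness used to output the detection result.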
5. A diagnostic system for a dome camera monitoring blind area, characterized in that it applies the diagnostic method for a dome camera monitoring blind area according to any one of claims 1 to 4, and comprises a 3D scene reconstruction module, a pavement marker acquisition module, and a coordinate matching module;
the 3D scene reconstruction module is used for matching single images with different monitoring visual angles in the video monitoring of the dome camera through a deep learning technology to realize 3D scene reconstruction and reconstruct three-dimensional point cloud coordinates;
the pavement marker acquisition module is used for acquiring a pavement marker video frame picture of a current monitoring video area, converting a pavement marker from 2D coordinates to corresponding 3D point cloud coordinates, and specifically comprises the following steps: acquiring a current video frame picture; detecting the pavement marker by using a target detection model, and giving out the two-dimensional coordinates of the pavement marker; acquiring corresponding 3D point cloud coordinates according to the 2D coordinates of the pavement marker;
the coordinate matching module is used for matching the 3D point cloud coordinates of the pavement marker with the reconstructed three-dimensional point cloud coordinates, judging whether a monitoring blind area exists or not according to the matching degree, and specifically comprises the following steps:
taking a sphere of radius R centered at key point p, and describing the shape and contour of the 3D model with a histogram in a log-polar coordinate system; the sphere is divided j times in the radial direction, k times in the longitude direction and l times in the latitude direction, and each resulting small region corresponds to one element of the j·k·l-dimensional feature vector, whose weight is calculated as follows:
w = 1 / (p_i · ∛V(j, k, l));
where V(j, k, l) is the volume of the small region and p_i is the local point density around the small region;
the final 3D descriptor is calculated by accumulating the weighted sum of the points of each small region; this normalization by the small-region volume compensates for the large variation of small-region size with radius and latitude.
6. An electronic device, the electronic device comprising:
at least one processor; the method comprises the steps of,
a memory communicatively coupled to the at least one processor; wherein,
the memory stores computer program instructions executable by the at least one processor to enable the at least one processor to perform the diagnostic method for a dome camera monitoring blind area according to any one of claims 1 to 4.
7. A computer-readable storage medium storing a program, wherein the program, when executed by a processor, implements the diagnostic method for a dome camera monitoring blind area according to any one of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211251757.9A CN115565134B (en) | 2022-10-13 | 2022-10-13 | Diagnostic method, system, equipment and storage medium for monitoring blind area of ball machine |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115565134A CN115565134A (en) | 2023-01-03 |
CN115565134B true CN115565134B (en) | 2024-03-15 |
Family
ID=84744792
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20090043416A (en) * | 2007-10-29 | 2009-05-06 | 삼성전자주식회사 | Surveillance camera apparatus for detecting and suppressing camera shift and control method thereof |
CN105486235A (en) * | 2015-12-07 | 2016-04-13 | 高新兴科技集团股份有限公司 | A target measuring method in ball machine video images |
CN108921900A (en) * | 2018-07-18 | 2018-11-30 | 江苏实景信息科技有限公司 | A kind of method and device in the orientation of monitoring video camera |
CN110174093A (en) * | 2019-05-05 | 2019-08-27 | 腾讯科技(深圳)有限公司 | Localization method, device, equipment and computer readable storage medium |
KR20200064947A (en) * | 2018-11-29 | 2020-06-08 | (주)코어센스 | Apparatus for tracking position based optical position tracking system and method thereof |
WO2021008032A1 (en) * | 2019-07-18 | 2021-01-21 | 平安科技(深圳)有限公司 | Surveillance video processing method and apparatus, computer device and storage medium |
CN112598750A (en) * | 2020-12-22 | 2021-04-02 | 北京百度网讯科技有限公司 | Calibration method and device for road side camera, electronic equipment and storage medium |
CN113297946A (en) * | 2021-05-18 | 2021-08-24 | 珠海大横琴科技发展有限公司 | Monitoring blind area identification method and identification system |
CN114463385A (en) * | 2022-01-12 | 2022-05-10 | 平安科技(深圳)有限公司 | Target tracking method, device, equipment and medium based on gun-ball linkage system |
CN115115768A (en) * | 2021-03-18 | 2022-09-27 | 上海宝信软件股份有限公司 | Object coordinate recognition system, method, device and medium based on stereoscopic vision |
CN115131705A (en) * | 2022-06-29 | 2022-09-30 | 北京市商汤科技开发有限公司 | Target detection method and device, electronic equipment and storage medium |
Non-Patent Citations (2)
Title |
---|
Online calibration of a camera with variable external parameters based on DSP; Jiang Jianguo; Li Xiangtao; Qi Meibin; Zhang Rui; Chinese Journal of Scientific Instrument (12); 155-159 *
Effective detection range of expressway traffic incidents based on image recognition and tracking technology; Fang Zhengpeng; Sun Yue; Chen Jian; Journal of Highway and Transportation Research and Development (S1); 9-12 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||