CN112183355A - Effluent height detection system and method based on binocular vision and deep learning - Google Patents

Effluent height detection system and method based on binocular vision and deep learning

Info

Publication number
CN112183355A
Authority
CN
China
Prior art keywords
track
video
module
target
dimensional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011046022.3A
Other languages
Chinese (zh)
Other versions
CN112183355B (en)
Inventor
宋春雷
任旭倩
丁子豪
李曼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT filed Critical Beijing Institute of Technology BIT
Priority to CN202011046022.3A priority Critical patent/CN112183355B/en
Publication of CN112183355A publication Critical patent/CN112183355A/en
Application granted granted Critical
Publication of CN112183355B publication Critical patent/CN112183355B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content

Abstract

The invention discloses a water outlet height detection system and method based on binocular vision and deep learning. A video acquisition module acquires a competition video of a swimmer; a video processing module removes ghosting from the competition video; a target recognition and tracking module performs target recognition and target tracking and selects the motion track of the target's center point as the target motion track; a three-dimensional coordinate recovery module calculates the depth value corresponding to each track point and forms the swimmer's three-dimensional track from those depth values; a track judgment module fits all acquired tracks and identifies a track as a water-exit track when it is a parabola; and a height calculation module calculates the distance between the highest and lowest points of the parabola as the swimmer's water outlet height. The water outlet height detection system provided by the invention can detect the precise height to which the swimmer rises above the water surface, providing an objective basis for judges to score the actions.

Description

Effluent height detection system and method based on binocular vision and deep learning
Technical Field
The invention relates to the technical field of height detection, and in particular to a water outlet height detection system and method based on binocular vision and deep learning.
Background
Artistic (synchronized) swimming is an aquatic sport that combines swimming with dance and music. Athletes must complete multiple sets of lifts, rotations, bends and other actions, and the height to which an athlete jumps out of the water surface is an important scoring criterion. At present this height can only be judged visually by referees, which is highly subjective and has the following drawbacks: (1) the height estimate is inaccurate, disputes arise easily, and the normal running of the competition is affected; (2) the athlete's water exit is an instantaneous action, which increases the recognition burden on the referees.
Disclosure of Invention
In order to solve the limitations and defects of the prior art, the invention provides a binocular vision and deep learning based effluent height detection system, which comprises a video acquisition module, a video processing module, a target recognition and tracking module, a three-dimensional coordinate recovery module, a track judgment module and a height calculation module;
the video acquisition module is used for acquiring a competition video of the swimmer and transmitting the acquired competition video to the video processing module;
the video processing module is used for processing video containing high-speed moving objects with a video deblurring algorithm, so as to eliminate motion-blur ghosting in the competition video, and for transmitting the processed competition video to the target identification and tracking module;
the target identification and tracking module is used for identifying the swimmer emerging from the water surface, tracking the target, framing the target with a rectangular box, acquiring the center point of the target, extracting a list of the target's center points and sending the center point list to the three-dimensional coordinate recovery module;
the three-dimensional coordinate recovery module is used for calculating a disparity map according to the left and right eye photos, calculating a depth map according to the focal length and the baseline of the camera, recovering three-dimensional coordinates of track points relative to a camera reference coordinate system according to the disparity map and the depth map, forming a three-dimensional track list and sending the three-dimensional track list to the track judgment module;
the track judging module is used for fitting the obtained track, determining that the corresponding target performs a water outlet motion when the track is judged to be a parabola, and sending the track of the target performing the water outlet motion to the height calculating module;
the height calculation module is used for calculating the vertical distance between the highest point and the lowest point of the parabola according to the obtained parabola, and the vertical distance is the water outlet height of the swimmer.
Optionally, the video acquisition module comprises a ZED binocular camera, which is mounted on the ground on a bracket to prevent shaking;
the ZED binocular camera is placed at the edge of the swimming pool to the side of the swimmer, so that it can capture the whole of the swimmer's performance.
The invention also provides a method for detecting the height of the effluent based on binocular vision and deep learning, which uses the detection system and comprises the following steps:
step S1: the video acquisition module acquires a competition video of the swimmer and transmits the acquired competition video to the video processing module;
step S2: the video processing module processes the video containing high-speed moving objects with a video deblurring algorithm to eliminate motion-blur ghosting in the competition video and transmits the processed competition video to the target identification and tracking module;
step S3: the target identification and tracking module identifies the swimmer emerging from the water surface, tracks the target, frames the target with a rectangular box, obtains the center point of the target, extracts the list of the target's center points and sends the center point list to the three-dimensional coordinate recovery module;
step S4: the three-dimensional coordinate recovery module calculates a disparity map according to the left and right eye photos, calculates a depth map according to the focal length and the baseline of the camera, recovers three-dimensional coordinates of the track point relative to a camera reference coordinate system according to the disparity map and the depth map, forms a three-dimensional track list and sends the three-dimensional track list to the track judgment module;
step S5: the track judging module fits the obtained track, determines that the corresponding target performs water outlet motion when the track is judged to be a parabola, and sends the target track performing water outlet motion to the height calculating module;
step S6: the height calculation module calculates the vertical distance between the highest point and the lowest point of the parabola according to the obtained parabola, and the vertical distance is the water outlet height of the swimmer.
Optionally, the step S1 includes:
step S11: the detection system is started and initialized, and whether the ZED binocular camera is correctly started or not is judged;
step S12: if the ZED binocular camera is opened correctly, returning the video image data, and if it is not opened correctly, returning to step S11;
step S13: reading each frame image of the left-eye camera in the video image data, and storing the read images as the competition video.
Optionally, the step S2 includes:
step S21: decomposing each frame of image of the acquired left eye camera into a picture sequence, and storing the picture sequence;
step S22: carrying out deblurring processing on the picture sequence by using a deep convolution neural network to eliminate the ghost in the picture;
step S23: storing the deblurred pictures as a video at the original frame rate and size.
Optionally, the step S3 includes:
step S31: identifying the deblurred video using the YOLOv5 target detection algorithm, wherein the YOLOv5 target detection algorithm detects only people, and the detected people are framed with rectangular boxes;
step S32: sending the detection result to a Deep Sort algorithm for target tracking, and tracking the track of the moving person;
step S33: if tracking of a target is lost, re-detecting the person who is no longer tracked, and returning to step S31;
step S34: and extracting the central points of the rectangular frames in the motion process of all the tracked targets, and taking the central points of the rectangular frames as motion tracks to form a motion track list.
Optionally, the step S4 includes:
step S41: converting each frame of image of a binocular camera shooting video into a gray image;
step S42: extracting corresponding disparity maps from left and right images of each frame corresponding to a video by using a Libelas library, and storing the disparity maps as a picture list;
step S43: calculating a depth map corresponding to each frame of parallax map according to the focal length and the base line of the binocular camera;
step S44: the corresponding value of each pixel point in the depth map is a depth value, and the corresponding depth value of each pixel point is calculated according to the two-dimensional coordinate point of the track;
step S45: and recovering the three-dimensional coordinate according to the two-dimensional coordinate point, wherein a calculation formula is as follows:
z = z_deep / deepscale
x = (u - c_x) × z / f_x
y = (v - c_y) × z / f_y
wherein z_deep is the depth value of the two-dimensional pixel point in the depth map, deepscale is the ratio between the depth value and the actual depth, (u, v) are the coordinates of the two-dimensional pixel point, and (c_x, c_y, f_x, f_y) are the intrinsic parameters of the camera: c_x, c_y are the longitudinal and transverse offsets of the image origin relative to the imaging point of the optical center, and f_x, f_y are the camera focal lengths.
Optionally, the step S5 includes:
step S51: constructing a motion trail data point set according to the three-dimensional coordinates;
step S52: fitting the unknown parameters (a_x, b_x, c_x, a_y, b_y, c_y) of the three-dimensional parabolic curved surface from the collected data points using the least-squares method, wherein the calculation formula of the three-dimensional parabolic curved surface is as follows:
[Equation image in the original publication; the formula is not reproduced in this text.]
step S53: describing two-dimensional coordinates of depth and height by using the data points, and fitting the plane position of the object motion in the three-dimensional space by using a least square method, wherein the calculation formula is as follows:
[Equation image in the original publication; the formula is not reproduced in this text.]
step S54: determining the motion trajectory line of the swimmer according to the intersection line of the motion plane and the curved surface, wherein the calculation formula is as follows:
[Equation image in the original publication; the formula is not reproduced in this text.]
step S55: judging whether the data point track belongs to a parabolic track, and calculating the deviation between a parabola obtained by fitting and the original data point, wherein the calculation formula is as follows:
[Equation image in the original publication; the formula is not reproduced in this text.]
step S56: when the mean deviation of the fitted curve is within a preset error threshold τ, the group of data points is determined to be a parabolic track;
step S57: and calculating the coordinate of the highest point of the track according to the fitted parabola, and taking the coordinate of the highest point as the coordinate of the highest point of the motion track.
The invention has the following beneficial effects:
the invention provides a binocular vision and deep learning-based effluent height detection system and a method thereof, wherein the detection system comprises a video acquisition module, a video processing module, a target recognition and tracking module, a three-dimensional coordinate recovery module, a track judgment module and a height calculation module, the video acquisition module is used for acquiring a game video of swimmers, the video processing module is used for deblurring processing and removing ghosts generated by high-speed motion of objects in the game video, the target recognition and tracking module is used for carrying out target recognition by using a YOLOv5 algorithm and then carrying out target tracking by using a Deep Sort algorithm, a central point motion track of a target is selected as a target motion track, the three-dimensional coordinate recovery module is used for carrying out parallax map calculation on a binocular video, a depth map is converted into a depth map according to a camera focal length and a binocular camera baseline, and the corresponding depth value of each track point, and recovering the three-dimensional coordinates of the swimmer relative to the camera coordinate system according to the depth values to form a three-dimensional track of the swimmer, wherein the track judgment module is used for fitting all collected tracks, the track is determined as a track of the swimmer when the track is a parabola, and the height calculation module is used for calculating the distance between the highest point and the lowest point of the parabola to be used as the water outlet height of the swimmer. The outlet height detection system based on binocular vision and deep learning provided by the invention can detect the accurate height of the water surface thrown by a swimmer, and provides objective basis for judging and evaluating action standards. The data processing process of the invention is faster, the detection result has timeliness, and the fairness of grading is ensured. The detection system is extremely simple in installation process, labor cost is reduced, detection speed is high, the height value is presented in a data form, and the detection system is clear and visual.
Drawings
Fig. 1 is a schematic structural diagram of a binocular vision and deep learning based effluent height detection system according to an embodiment of the present invention.
Fig. 2 is a top view of a system for detecting a water outlet height based on binocular vision and deep learning according to an embodiment of the present invention.
Fig. 3 is a front view of the effluent height detection system based on binocular vision and deep learning according to an embodiment of the present invention.
Fig. 4 is a flowchart of a binocular vision and deep learning based effluent height detection method according to a second embodiment of the present invention.
Fig. 5 is a schematic diagram illustrating comparison of video effects before and after deblurring according to a second embodiment of the present invention.
Fig. 6 is a schematic diagram of a target tracking effect according to a second embodiment of the present invention.
Fig. 7 is a schematic diagram of another target tracking effect provided in the second embodiment of the present invention.
Fig. 8 is a schematic view illustrating an effect of a disparity map according to a second embodiment of the present invention.
Fig. 9 is a schematic diagram illustrating an effect of a depth map according to a second embodiment of the present invention.
Fig. 10 is a three-dimensional trace point fitting graph provided in the second embodiment of the present invention.
Fig. 11 is a schematic diagram of the measured height values in the 0.45 m human-figure simulation experiment according to the second embodiment of the present invention.
Fig. 12 is a schematic diagram of fitting of the calculated height data of the simulation experiment provided in the second embodiment of the present invention.
Fig. 13 is a schematic diagram of a distance calculation measurement value of a distance detection simulation experiment according to a second embodiment of the present invention.
Fig. 14 is a schematic diagram of the data fitting of measured distance values against actual values in the distance-detection simulation experiment provided by the second embodiment of the present invention.
Detailed Description
In order to make those skilled in the art better understand the technical solution of the present invention, the following describes in detail the effluent height detection system and method based on binocular vision and deep learning, which are provided by the present invention, with reference to the accompanying drawings.
Example one
The water outlet height detection system based on binocular vision and deep learning provided by this embodiment can quickly and accurately detect the height to which an athlete rises above the water surface, providing an objective basis for judges to score the actions. Fig. 1 is a schematic structural diagram of a binocular vision and deep learning based effluent height detection system according to an embodiment of the present invention. As shown in Fig. 1, the detection system provided by this embodiment comprises a video acquisition module, a video processing module, a target recognition and tracking module, a three-dimensional coordinate recovery module, a track judgment module and a height calculation module. The video acquisition module uses a binocular camera to capture the water-exit portion of the artistic swimmer's performance and transmits the acquired competition video to the video processing module for deblurring, which removes the ghosting produced by high-speed object motion in the video. The processed video is then sent to the target recognition and tracking module, which recognizes the person target with the YOLOv5 algorithm, tracks the target with the Deep Sort algorithm, and selects the motion track of the tracked target's center point as the target person's motion track. The two-dimensional projection coordinates extracted from the video must be restored to three-dimensional coordinates, so disparity maps are computed from the binocular video and converted into depth maps using the camera focal length and the binocular camera baseline; the depth value corresponding to each track point of the target is calculated, the three-dimensional coordinates of the object relative to the camera coordinate system are recovered from those depth values, and the three-dimensional track is extracted. All acquired tracks are fitted, and a track is considered a water-exit athlete track when it is parabolic. The height calculation module calculates the Euclidean distance from the highest point to the lowest point of the parabola as the swimmer's water outlet height. This embodiment uses a deep learning method to detect the height; it can capture the instantaneous water-exit action without contacting the athletes, reduces manpower and material resources, is simple in principle, and achieves fast and accurate height calculation. In addition, the ZED binocular camera can capture high-definition left-eye and right-eye images, and the saved video is convenient for offline computation.
Fig. 2 is a top view of a system for detecting a water outlet height based on binocular vision and deep learning according to an embodiment of the present invention. Fig. 3 is a front view of the effluent height detection system based on binocular vision and deep learning according to an embodiment of the present invention. As shown in Figs. 2 and 3, the video acquisition module contains a ZED binocular camera that shoots high-definition video; it is placed on the ground on a bracket to prevent shaking. The ZED binocular camera is placed at the edge of the swimming pool to the side of the athlete, ensuring that the whole of the athlete's performance can be captured.
The detection system provided by this embodiment comprises a video acquisition module, a video processing module, a target identification and tracking module, a three-dimensional coordinate recovery module, a track judgment module and a height calculation module. The video acquisition module uses the ZED stereo camera released by Stereolabs (USA); it is mounted on a tripod so that the binocular camera is perpendicular to the ground and does not shake. The binocular camera is located at the pool side of the athlete and can capture the whole of the athlete's performance. The video acquisition module can be implemented on Windows or Ubuntu; this embodiment uses the Ubuntu 18.04 operating system, and the software compilation platform is CLion. The binocular camera is connected to the computer via USB; the athlete's performance video is collected and transmitted to the computer, the computer runs a program in PyCharm to perform the corresponding processing and obtain the artistic swimmer's water outlet height value, and the detected target image and height value are displayed on the computer screen.
This embodiment provides an effluent height detection system based on binocular vision and deep learning and a method thereof. The detection system comprises a video acquisition module, a video processing module, a target recognition and tracking module, a three-dimensional coordinate recovery module, a track judgment module and a height calculation module. The video acquisition module acquires a competition video of the swimmer; the video processing module performs deblurring to remove the ghosting produced by high-speed object motion in the competition video; the target recognition and tracking module performs target recognition with the YOLOv5 algorithm and then target tracking with the Deep Sort algorithm, selecting the motion track of the target's center point as the target motion track; the three-dimensional coordinate recovery module computes disparity maps from the binocular video, converts them into depth maps using the camera focal length and the binocular camera baseline, calculates the depth value corresponding to each track point, and recovers the swimmer's three-dimensional coordinates relative to the camera coordinate system from those depth values to form the swimmer's three-dimensional track; the track judgment module fits all collected tracks, and a track is identified as a water-exit track when it is a parabola; the height calculation module calculates the distance between the highest and lowest points of the parabola as the swimmer's water outlet height. The system for detecting the water outlet height based on binocular vision and deep learning provided by this embodiment can detect the precise height to which the swimmer rises above the water surface, providing an objective basis for judges to score the actions. The data processing of this embodiment is fast, the detection result is timely, and the fairness of scoring is ensured. The installation of the detection system of this embodiment is extremely simple, labor cost is reduced, detection is fast, and the height value is presented as data, which is clear and intuitive.
Example two
Fig. 4 is a flowchart of a binocular vision and deep learning based effluent height detection method according to a second embodiment of the present invention. As shown in fig. 4, the present embodiment provides a method for detecting an effluent height based on binocular vision and deep learning, where the detection method employs the detection system provided in the first embodiment, and the detection method includes:
step S1: the video acquisition module shoots a competition video of the swimmer and extracts pictures shot by the left and right cameras of each frame of the competition video.
Step S2: the video processing module carries out deblurring processing on the shot video, eliminates the afterimage of the object in the video due to high-speed movement, and sends the processed video to the target recognition and tracking module.
Step S3: the target recognition and tracking module detects the first frame image through deep learning and recognizes all people, then tracks them; if tracking fails, the targets are detected again. The targets are framed with rectangular boxes, the center points of the targets are extracted as the motion track points of the detected people, and the calculation results are sent to the three-dimensional coordinate recovery module.
Step S4: the three-dimensional coordinate recovery module calculates the disparity maps and depth maps of the left and right videos, recovers the three-dimensional coordinates of the received two-dimensional projection coordinate points from the pixel depths, converts the two-dimensional coordinates into three-dimensional coordinates relative to the camera reference coordinate system, and sends the three-dimensional coordinate track to the track judgment module.
Step S5: the track judging module receives the three-dimensional coordinate track, fits the track, and judges whether it is a parabola; on the basis that a moving target performs a water outlet motion when its track is a parabola, the track of the water outlet motion is sent to the height calculating module.
Step S6: the height calculation module analyzes and calculates the motion track points in the water outlet process, judges the highest point and the lowest point of the swimmer track, and calculates the vertical displacement between the highest point and the lowest point as the water outlet height of the swimmer.
In this embodiment, the step S1 includes the following steps:
step S11: and the detection system is started and initialized, and whether the ZED binocular camera is correctly started or not is judged.
Step S12: if the ZED binocular camera is opened correctly, the video image data are returned; if not, the method returns to step S11.
Step S13: each frame image of the left-eye camera in the video is read and stored as the competition video.
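The capture flow of steps S11 to S13 can be sketched as follows. This is a minimal illustration assuming the Stereolabs pyzed Python wrapper for the ZED camera and OpenCV for writing the video; the resolution, frame rate, frame count and file name are illustrative choices and are not specified in the patent.

```python
import time

import cv2
import pyzed.sl as sl


def capture_left_video(output_path="competition_left.avi", num_frames=600):
    """Sketch of steps S11-S13: open the ZED camera, check that it started
    correctly, then save each left-eye frame as the competition video."""
    # Step S11: start and initialise the camera.
    zed = sl.Camera()
    init_params = sl.InitParameters()
    init_params.camera_resolution = sl.RESOLUTION.HD720   # illustrative choice
    init_params.camera_fps = 30                           # illustrative choice

    # Step S12: if the camera did not open correctly, try again.
    while zed.open(init_params) != sl.ERROR_CODE.SUCCESS:
        time.sleep(1.0)

    # Step S13: read each left-eye frame and write it to the video file.
    left = sl.Mat()
    writer = None
    for _ in range(num_frames):
        if zed.grab() != sl.ERROR_CODE.SUCCESS:
            continue
        zed.retrieve_image(left, sl.VIEW.LEFT)
        frame = cv2.cvtColor(left.get_data(), cv2.COLOR_BGRA2BGR)
        if writer is None:
            h, w = frame.shape[:2]
            writer = cv2.VideoWriter(output_path,
                                     cv2.VideoWriter_fourcc(*"XVID"),
                                     init_params.camera_fps, (w, h))
        writer.write(frame)

    if writer is not None:
        writer.release()
    zed.close()
```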
In this embodiment, the step S2 includes the following steps:
step S21: and decomposing each frame of acquired image of the left eye camera into a picture sequence for storage.
Step S22: the picture sequence is deblurred using the deep convolutional neural network described in the paper "Cascaded Deep Video Deblurring Using Temporal Sharpness Prior", eliminating the ghosting in the pictures.
Step S23: the deblurred pictures are saved as a video at the original frame rate and size.
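Steps S21 and S23, splitting the left-eye video into a picture sequence and re-encoding it at the original frame rate and size, can be sketched with OpenCV as below. The deblurring network of step S22 is the model from the cited paper and is treated here as an opaque callable named deblur_model, which is a placeholder and not an API provided by that work.

```python
import cv2


def decompose_video(video_path):
    """Step S21: split the left-eye video into a picture sequence."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
    cap.release()
    return frames, fps


def reassemble_video(frames, fps, output_path):
    """Step S23: save the (deblurred) pictures as a video at the original
    frame rate and size."""
    h, w = frames[0].shape[:2]
    writer = cv2.VideoWriter(output_path, cv2.VideoWriter_fourcc(*"XVID"),
                             fps, (w, h))
    for frame in frames:
        writer.write(frame)
    writer.release()


# Step S22 would pass the picture sequence through the deblurring network of
# the cited paper; `deblur_model` below is a placeholder callable:
#   deblurred_frames = [deblur_model(f) for f in frames]
```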
The experimental video clip is taken from performance footage of the Russia stop of the 2019 Artistic Swimming World Series. Fig. 5 is a schematic diagram illustrating the comparison of video effects before and after deblurring according to the second embodiment of the present invention. As can be seen from Fig. 5, ghosting of the person's head is noticeably reduced after deblurring, which facilitates subsequent target detection and tracking.
In this embodiment, the step S3 includes the following steps:
step S31: the processed video is first run through the YOLOv5 target detection algorithm; the pre-trained model used was trained on a person dataset, so the algorithm detects only people, and detected people are framed with rectangular boxes.
Step S32: and sending the detection result to a Deep Sort algorithm for target tracking, and tracking the track of the moving human.
Step S33: if tracking of a target is lost, the algorithm re-detects the person who is no longer tracked and returns to step S31.
Step S34: the center points of the rectangular boxes over the motion of all tracked targets are extracted as the motion tracks, forming a motion track list.
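A sketch of steps S31 to S34 is given below. It assumes the publicly available ultralytics/yolov5 hub model (pre-trained on COCO and restricted to the person class) and the deep_sort_realtime package as a Deep SORT implementation; the patent names the YOLOv5 and Deep Sort algorithms but not these particular packages, so they are assumptions for illustration.

```python
import cv2
import torch
from deep_sort_realtime.deepsort_tracker import DeepSort


def extract_center_tracks(video_path):
    """Sketch of steps S31-S34: detect people with YOLOv5, track them with
    Deep SORT, and collect the centre of each bounding box as a track point."""
    # Step S31: person detector (COCO-pretrained hub model, class 0 = person).
    detector = torch.hub.load("ultralytics/yolov5", "yolov5s")
    detector.classes = [0]
    # Step S32: Deep SORT tracker.
    tracker = DeepSort(max_age=30)

    tracks_by_id = {}                  # track id -> list of (frame_idx, cx, cy)
    cap = cv2.VideoCapture(video_path)
    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        dets = detector(frame[:, :, ::-1]).xyxy[0].cpu().numpy()  # BGR -> RGB
        ds_input = [([x1, y1, x2 - x1, y2 - y1], conf, "person")
                    for x1, y1, x2, y2, conf, _ in dets]
        # Steps S32/S33: update the tracks; a lost target is simply picked up
        # again by the detector on a later frame.
        for trk in tracker.update_tracks(ds_input, frame=frame):
            if not trk.is_confirmed():
                continue
            # Step S34: the centre of the rectangular box is the track point.
            x1, y1, x2, y2 = trk.to_ltrb()
            tracks_by_id.setdefault(trk.track_id, []).append(
                (frame_idx, (x1 + x2) / 2.0, (y1 + y2) / 2.0))
        frame_idx += 1
    cap.release()
    return tracks_by_id
```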
Fig. 6 is a schematic diagram of a target tracking effect according to the second embodiment of the present invention. Fig. 7 is a schematic diagram of another target tracking effect provided in the second embodiment of the present invention. It can be seen that the algorithm recognizes the person well and draws the bounding box correctly.
In this embodiment, the step S4 includes the following steps:
step S41: converting each frame of image of a binocular camera shooting video into a gray image;
step S42: extracting corresponding disparity maps from left and right images of each frame corresponding to a video by using a Libelas library, and storing the disparity maps as a picture list;
step S43: calculating a depth map corresponding to each frame of parallax map according to the focal length and the base line of the binocular camera;
step S44: the corresponding value of each pixel point in the depth map is a depth value, and the corresponding depth value of each pixel point is calculated according to the two-dimensional coordinate point of the track;
step S45: and recovering the three-dimensional coordinate according to the two-dimensional coordinate point, wherein a calculation formula is as follows:
z = z_deep / deepscale
x = (u - c_x) × z / f_x
y = (v - c_y) × z / f_y
wherein z_deep is the depth value of the two-dimensional pixel point in the depth map, deepscale is the ratio between the depth value and the actual depth, (u, v) are the coordinates of the two-dimensional pixel point, and (c_x, c_y, f_x, f_y) are the intrinsic parameters of the camera: c_x, c_y are the longitudinal and transverse offsets of the image origin relative to the imaging point of the optical center (unit: pixels), and f_x, f_y are the camera focal lengths.
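Steps S43 to S45 amount to a per-point back-projection, sketched below in NumPy. It assumes the disparity maps have already been computed (for example with the Libelas library mentioned in step S42) and that the camera intrinsics and stereo baseline are known; the deep_scale argument plays the role of the deepscale ratio in the formula above.

```python
import numpy as np


def backproject_track(track_uv, disparity_maps, fx, fy, cx, cy, baseline_m,
                      deep_scale=1.0):
    """Sketch of steps S43-S45: turn 2-D track points into 3-D camera-frame
    coordinates.  track_uv is a list of (frame_idx, u, v); disparity_maps is
    a list of per-frame disparity images in pixels; fx, fy, cx, cy are the
    camera intrinsics and baseline_m is the stereo baseline in metres."""
    points_3d = []
    for frame_idx, u, v in track_uv:
        d = disparity_maps[frame_idx][int(round(v)), int(round(u))]
        if d <= 0:                    # skip invalid disparities
            continue
        z_deep = fx * baseline_m / d  # step S43: depth from focal length and baseline
        z = z_deep / deep_scale       # step S45
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        points_3d.append((x, y, z))
    return np.array(points_3d)
```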
In the experiment, because a binocular recording of a swimmer's actual performance could not be acquired, a simulation experiment was carried out in which a person on land moved along a parabolic trajectory to simulate the parabolic pose of a swimmer leaving the water. Fig. 8 is a schematic view illustrating the effect of a disparity map according to the second embodiment of the present invention. Fig. 9 is a schematic diagram illustrating the effect of a depth map according to the second embodiment of the present invention. As shown in Figs. 8 and 9, the rectangular box marks the target person.
In this embodiment, the step S5 includes the following steps:
step S51: constructing a motion trail data point set according to the three-dimensional coordinates;
step S52: fitting the unknown parameters (a_x, b_x, c_x, a_y, b_y, c_y) of the three-dimensional parabolic curved surface from the collected data points using the least-squares method, wherein the calculation formula of the three-dimensional parabolic curved surface is as follows:
[Equation image in the original publication; the formula is not reproduced in this text.]
step S53: describing two-dimensional coordinates of depth and height by using the data points, and fitting the plane position of the object motion in the three-dimensional space by using a least square method, wherein the calculation formula is as follows:
[Equation image in the original publication; the formula is not reproduced in this text.]
step S54: determining the motion trajectory line of the swimmer according to the intersection line of the motion plane and the curved surface, wherein the calculation formula is as follows:
[Equation image in the original publication; the formula is not reproduced in this text.]
step S55: judging whether the data point track belongs to a parabolic track, and calculating the deviation between a parabola obtained by fitting and the original data point, wherein the calculation formula is as follows:
[Equation image in the original publication; the formula is not reproduced in this text.]
step S56: when the mean deviation of the fitted curve is within a preset error threshold τ, the group of data points is determined to be a parabolic track;
step S57: and calculating the coordinate of the highest point of the track according to the fitted parabola, and taking the coordinate of the highest point as the coordinate of the highest point of the motion track.
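The parabola test and height computation of steps S51 to S57 and step S6 can be illustrated with the simplified sketch below. The patent fits a three-dimensional parabolic surface and intersects it with the motion plane (the corresponding equation images are not reproduced here); this sketch collapses that to a least-squares quadratic fit of the vertical coordinate over the sample index, with an illustrative threshold tau, and returns the vertical distance between the highest and lowest fitted points.

```python
import numpy as np


def water_exit_height(points_3d, tau=0.05):
    """Simplified sketch of steps S51-S57 and S6: fit the vertical coordinate
    of the 3-D track as a quadratic, accept the track as a parabola when the
    mean deviation from the fit is within tau, and return the vertical
    distance between the highest and lowest fitted points.
    points_3d: (N, 3) array of (x, y, z); y is taken as the vertical axis,
    which is an assumption about the coordinate convention."""
    t = np.arange(len(points_3d), dtype=float)
    y = points_3d[:, 1]

    coeffs = np.polyfit(t, y, 2)              # least-squares quadratic fit
    y_fit = np.polyval(coeffs, t)
    mean_dev = np.mean(np.abs(y - y_fit))     # deviation from the original points
    if mean_dev > tau:
        return None                           # not a parabolic (water-exit) track

    return float(y_fit.max() - y_fit.min())   # water outlet height
```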
Fig. 10 is a three-dimensional track point fitting graph provided in the second embodiment of the present invention. For safety reasons, the simulation experiment only tested a height of 0.45 m. Fig. 11 is a schematic diagram of the measured height values in the 0.45 m human-figure simulation experiment according to the second embodiment of the present invention. Fig. 12 is a schematic diagram of the fitting of the calculated height data of the simulation experiment provided in the second embodiment of the present invention.
Considering that height measurement is a form of distance measurement, this embodiment also performed a binocular-vision distance-measurement experiment on fixed objects to verify the feasibility of this step of the scheme. The fixed objects were regular objects of known length, such as square cabinets and books. Fig. 13 is a schematic diagram of the measured distance values in the distance-detection simulation experiment according to the second embodiment of the present invention. Fig. 14 is a schematic diagram of the data fitting of measured distance values against actual values in the distance-detection simulation experiment.
This embodiment provides an effluent height detection method based on binocular vision and deep learning. The detection system comprises a video acquisition module, a video processing module, a target recognition and tracking module, a three-dimensional coordinate recovery module, a track judgment module and a height calculation module. The video acquisition module acquires a competition video of the swimmer; the video processing module performs deblurring to remove the ghosting produced by high-speed object motion in the competition video; the target recognition and tracking module performs target recognition with the YOLOv5 algorithm and then target tracking with the Deep Sort algorithm, selecting the motion track of the target's center point as the target motion track; the three-dimensional coordinate recovery module computes disparity maps from the binocular video, converts them into depth maps using the camera focal length and the binocular camera baseline, calculates the depth value corresponding to each track point, and recovers the swimmer's three-dimensional coordinates relative to the camera coordinate system from those depth values; the track judgment module fits all acquired tracks, and a track is identified as a water-exit track when it is a parabola; the height calculation module calculates the distance between the highest and lowest points of the parabola as the swimmer's water outlet height. The system for detecting the water outlet height based on binocular vision and deep learning provided by this embodiment can detect the precise height to which the swimmer rises above the water surface, providing an objective basis for judges to score the actions. The data processing of this embodiment is fast, the detection result is timely, and the fairness of scoring is ensured. The installation of the detection system of this embodiment is extremely simple, labor cost is reduced, detection is fast, and the height value is presented as data, which is clear and intuitive.
It will be understood that the above embodiments are merely exemplary embodiments taken to illustrate the principles of the present invention, which is not limited thereto. It will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the spirit and substance of the invention, and these modifications and improvements are also considered to be within the scope of the invention.

Claims (8)

1. A binocular vision and deep learning based effluent height detection system is characterized by comprising a video acquisition module, a video processing module, a target recognition and tracking module, a three-dimensional coordinate recovery module, a track judgment module and a height calculation module;
the video acquisition module is used for acquiring a competition video of the swimmer and transmitting the acquired competition video to the video processing module;
the video processing module is used for processing video containing high-speed moving objects with a video deblurring algorithm, so as to eliminate motion-blur ghosting in the competition video, and for transmitting the processed competition video to the target identification and tracking module;
the target identification and tracking module is used for identifying the swimmer emerging from the water surface, tracking the target, framing the target with a rectangular box, acquiring the center point of the target, extracting a list of the target's center points and sending the center point list to the three-dimensional coordinate recovery module;
the three-dimensional coordinate recovery module is used for calculating a disparity map according to the left and right eye photos, calculating a depth map according to the focal length and the baseline of the camera, recovering three-dimensional coordinates of track points relative to a camera reference coordinate system according to the disparity map and the depth map, forming a three-dimensional track list and sending the three-dimensional track list to the track judgment module;
the track judging module is used for fitting the obtained track, determining that the corresponding target performs a water outlet motion when the track is judged to be a parabola, and sending the track of the target performing the water outlet motion to the height calculating module;
the height calculation module is used for calculating the vertical distance between the highest point and the lowest point of the parabola according to the obtained parabola, and the vertical distance is the water outlet height of the swimmer.
2. The binocular vision and deep learning based effluent height detection system of claim 1, wherein the video acquisition module comprises a ZED binocular camera mounted on the ground on a bracket to prevent jitter;
the ZED binocular camera is placed at the edge of the swimming pool to the side of the swimmer, so that it can capture the whole of the swimmer's performance.
3. A binocular vision and deep learning based effluent height detection method, wherein the detection method uses the detection system of claim 1 or claim 2, the detection method comprising:
step S1: the video acquisition module acquires a competition video of the swimmer and transmits the acquired competition video to the video processing module;
step S2: the video processing module processes the video containing high-speed moving objects with a video deblurring algorithm to eliminate motion-blur ghosting in the competition video and transmits the processed competition video to the target identification and tracking module;
step S3: the target identification and tracking module identifies the swimmer emerging from the water surface, tracks the target, frames the target with a rectangular box, obtains the center point of the target, extracts the list of the target's center points and sends the center point list to the three-dimensional coordinate recovery module;
step S4: the three-dimensional coordinate recovery module calculates a disparity map according to the left and right eye photos, calculates a depth map according to the focal length and the baseline of the camera, recovers three-dimensional coordinates of the track point relative to a camera reference coordinate system according to the disparity map and the depth map, forms a three-dimensional track list and sends the three-dimensional track list to the track judgment module;
step S5: the track judging module fits the obtained track, determines that the corresponding target performs water outlet motion when the track is judged to be a parabola, and sends the target track performing water outlet motion to the height calculating module;
step S6: the height calculation module calculates the vertical distance between the highest point and the lowest point of the parabola according to the obtained parabola, and the vertical distance is the water outlet height of the swimmer.
4. The binocular vision and deep learning-based effluent height detection method according to claim 3, wherein the step S1 includes:
step S11: the detection system is started and initialized, and whether the ZED binocular camera is correctly started or not is judged;
step S12: if the ZED binocular camera is opened correctly, returning the video image data, and if it is not opened correctly, returning to step S11;
step S13: reading each frame image of the left-eye camera in the video image data, and storing the read images as the competition video.
5. The binocular vision and deep learning-based effluent height detection method according to claim 3, wherein the step S2 includes:
step S21: decomposing each frame of image of the acquired left eye camera into a picture sequence, and storing the picture sequence;
step S22: carrying out deblurring processing on the picture sequence by using a deep convolution neural network to eliminate the ghost in the picture;
step S23: storing the deblurred pictures as a video at the original frame rate and size.
6. The binocular vision and deep learning-based effluent height detection method according to claim 3, wherein the step S3 includes:
step S31: identifying the deblurred video using the YOLOv5 target detection algorithm, wherein the YOLOv5 target detection algorithm detects only people, and the detected people are framed with rectangular boxes;
step S32: sending the detection result to a Deep Sort algorithm for target tracking, and tracking the track of the moving person;
step S33: if tracking of a target is lost, re-detecting the person who is no longer tracked, and returning to step S31;
step S34: and extracting the central points of the rectangular frames in the motion process of all the tracked targets, and taking the central points of the rectangular frames as motion tracks to form a motion track list.
7. The binocular vision and deep learning-based effluent height detection method according to claim 3, wherein the step S4 includes:
step S41: converting each frame of image of a binocular camera shooting video into a gray image;
step S42: extracting corresponding disparity maps from left and right images of each frame corresponding to a video by using a Libelas library, and storing the disparity maps as a picture list;
step S43: calculating a depth map corresponding to each frame of parallax map according to the focal length and the base line of the binocular camera;
step S44: the corresponding value of each pixel point in the depth map is a depth value, and the corresponding depth value of each pixel point is calculated according to the two-dimensional coordinate point of the track;
step S45: and recovering the three-dimensional coordinate according to the two-dimensional coordinate point, wherein a calculation formula is as follows:
z = z_deep / deepscale
x = (u - c_x) × z / f_x
y = (v - c_y) × z / f_y
wherein z_deep is the depth value of the two-dimensional pixel point in the depth map, deepscale is the ratio between the depth value and the actual depth, (u, v) are the coordinates of the two-dimensional pixel point, and (c_x, c_y, f_x, f_y) are the intrinsic parameters of the camera: c_x, c_y are the longitudinal and transverse offsets of the image origin relative to the imaging point of the optical center, and f_x, f_y are the camera focal lengths.
8. The binocular vision and deep learning-based effluent height detection method according to claim 3, wherein the step S5 includes:
step S51: constructing a motion trail data point set according to the three-dimensional coordinates;
step S52: fitting the unknown parameters (a_x, b_x, c_x, a_y, b_y, c_y) of the three-dimensional parabolic curved surface from the collected data points using the least-squares method, wherein the calculation formula of the three-dimensional parabolic curved surface is as follows:
[Equation image in the original publication; the formula is not reproduced in this text.]
step S53: describing two-dimensional coordinates of depth and height by using the data points, and fitting the plane position of the object motion in the three-dimensional space by using a least square method, wherein the calculation formula is as follows:
[Equation image in the original publication; the formula is not reproduced in this text.]
step S54: determining the motion trajectory line of the swimmer according to the intersection line of the motion plane and the curved surface, wherein the calculation formula is as follows:
[Equation image in the original publication; the formula is not reproduced in this text.]
step S55: judging whether the data point track belongs to a parabolic track, and calculating the deviation between a parabola obtained by fitting and the original data point, wherein the calculation formula is as follows:
[Equation image in the original publication; the formula is not reproduced in this text.]
step S56: when the mean deviation of the fitted curve is within a preset error threshold τ, the group of data points is determined to be a parabolic track;
step S57: and calculating the coordinate of the highest point of the track according to the fitted parabola, and taking the coordinate of the highest point as the coordinate of the highest point of the motion track.
CN202011046022.3A 2020-09-28 2020-09-28 Effluent height detection system and method based on binocular vision and deep learning Active CN112183355B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011046022.3A CN112183355B (en) 2020-09-28 2020-09-28 Effluent height detection system and method based on binocular vision and deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011046022.3A CN112183355B (en) 2020-09-28 2020-09-28 Effluent height detection system and method based on binocular vision and deep learning

Publications (2)

Publication Number Publication Date
CN112183355A true CN112183355A (en) 2021-01-05
CN112183355B CN112183355B (en) 2022-12-27

Family

ID=73945752

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011046022.3A Active CN112183355B (en) 2020-09-28 2020-09-28 Effluent height detection system and method based on binocular vision and deep learning

Country Status (1)

Country Link
CN (1) CN112183355B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112883866A (en) * 2021-02-08 2021-06-01 上海新纪元机器人有限公司 Method, system and storage medium for detecting regional invasion in real time
CN113240717A (en) * 2021-06-01 2021-08-10 之江实验室 Error modeling position correction method based on three-dimensional target tracking
CN113326835A (en) * 2021-08-04 2021-08-31 中国科学院深圳先进技术研究院 Action detection method and device, terminal equipment and storage medium
CN113744524A (en) * 2021-08-16 2021-12-03 武汉理工大学 Pedestrian intention prediction method and system based on cooperative computing communication between vehicles
CN113822913A (en) * 2021-11-25 2021-12-21 江西科技学院 High-altitude parabolic detection method and system based on computer vision
CN114359411A (en) * 2022-01-10 2022-04-15 杭州巨岩欣成科技有限公司 Method and device for detecting drowning prevention target of swimming pool, computer equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101370125A (en) * 2007-08-17 2009-02-18 林嘉 Diving auto-tracking shooting and video feedback method and system thereof
US20190087661A1 (en) * 2017-09-21 2019-03-21 NEX Team, Inc. Methods and systems for ball game analytics with a mobile device
CN110553628A (en) * 2019-08-28 2019-12-10 华南理工大学 Depth camera-based flying object capturing method
CN110711373A (en) * 2019-09-16 2020-01-21 北京理工大学 System and method for detecting height of hitting point of badminton serving
CN111444890A (en) * 2020-04-30 2020-07-24 汕头市同行网络科技有限公司 Sports data analysis system and method based on machine learning
CN111553274A (en) * 2020-04-28 2020-08-18 青岛聚好联科技有限公司 High-altitude parabolic detection method and device based on trajectory analysis
CN111612826A (en) * 2019-12-13 2020-09-01 北京理工大学 High-precision three-dimensional motion track acquisition positioning and motion process reproduction method based on binocular video sensor

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101370125A (en) * 2007-08-17 2009-02-18 林嘉 Diving auto-tracking shooting and video feedback method and system thereof
US20190087661A1 (en) * 2017-09-21 2019-03-21 NEX Team, Inc. Methods and systems for ball game analytics with a mobile device
CN110553628A (en) * 2019-08-28 2019-12-10 华南理工大学 Depth camera-based flying object capturing method
CN110711373A (en) * 2019-09-16 2020-01-21 北京理工大学 System and method for detecting height of hitting point of badminton serving
CN111612826A (en) * 2019-12-13 2020-09-01 北京理工大学 High-precision three-dimensional motion track acquisition positioning and motion process reproduction method based on binocular video sensor
CN111553274A (en) * 2020-04-28 2020-08-18 青岛聚好联科技有限公司 High-altitude parabolic detection method and device based on trajectory analysis
CN111444890A (en) * 2020-04-30 2020-07-24 汕头市同行网络科技有限公司 Sports data analysis system and method based on machine learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
G. GARIBOTTO et al.: "3D scene analysis by real-time stereovision", IEEE International Conference on Image Processing 2005 *
郑亮 (Zheng Liang): "基于计算机视觉的艺术体操轨迹跟踪研究" [Research on trajectory tracking in rhythmic gymnastics based on computer vision], 《现代电子技术》 [Modern Electronics Technique] *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112883866A (en) * 2021-02-08 2021-06-01 上海新纪元机器人有限公司 Method, system and storage medium for detecting regional invasion in real time
CN113240717A (en) * 2021-06-01 2021-08-10 之江实验室 Error modeling position correction method based on three-dimensional target tracking
CN113326835A (en) * 2021-08-04 2021-08-31 中国科学院深圳先进技术研究院 Action detection method and device, terminal equipment and storage medium
CN113744524A (en) * 2021-08-16 2021-12-03 武汉理工大学 Pedestrian intention prediction method and system based on cooperative computing communication between vehicles
CN113822913A (en) * 2021-11-25 2021-12-21 江西科技学院 High-altitude parabolic detection method and system based on computer vision
CN113822913B (en) * 2021-11-25 2022-02-11 江西科技学院 High-altitude parabolic detection method and system based on computer vision
CN114359411A (en) * 2022-01-10 2022-04-15 杭州巨岩欣成科技有限公司 Method and device for detecting drowning prevention target of swimming pool, computer equipment and storage medium
CN114359411B (en) * 2022-01-10 2022-08-09 杭州巨岩欣成科技有限公司 Method and device for detecting drowning prevention target of swimming pool, computer equipment and storage medium

Also Published As

Publication number Publication date
CN112183355B (en) 2022-12-27

Similar Documents

Publication Publication Date Title
CN112183355B (en) Effluent height detection system and method based on binocular vision and deep learning
JP6448223B2 (en) Image recognition system, image recognition apparatus, image recognition method, and computer program
KR101078975B1 (en) Sensing device and method used to apparatus for virtual golf simulation
US9744421B2 (en) Method of analysing a video of sports motion
CN110929596A (en) Shooting training system and method based on smart phone and artificial intelligence
CN111444890A (en) Sports data analysis system and method based on machine learning
CN111027432B (en) Gait feature-based visual following robot method
CN104408725A (en) Target recapture system and method based on TLD optimization algorithm
CN108875730A (en) A kind of deep learning sample collection method, apparatus, equipment and storage medium
CN109684919B (en) Badminton service violation distinguishing method based on machine vision
CN115240130A (en) Pedestrian multi-target tracking method and device and computer readable storage medium
CN113850865A (en) Human body posture positioning method and system based on binocular vision and storage medium
CN110287907A (en) A kind of method for checking object and device
WO2016031573A1 (en) Image-processing device, image-processing method, program, and recording medium
Sha et al. Understanding and analyzing a large collection of archived swimming videos
JP4465150B2 (en) System and method for measuring relative position of an object with respect to a reference point
CN116309686A (en) Video positioning and speed measuring method, device and equipment for swimmers and storage medium
CN115100744A (en) Badminton game human body posture estimation and ball path tracking method
CN103617631A (en) Tracking method based on center detection
Sokolova et al. Human identification by gait from event-based camera
WO2018076170A1 (en) A camera system for filming golf game and the method for the same
CN109784215A (en) A kind of in-vivo detection method and system based on improved optical flow method
JP7198661B2 (en) Object tracking device and its program
CN116703968A (en) Visual tracking method, device, system, equipment and medium for target object
CN112802112B (en) Visual positioning method, device, server and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant