CN112183355B - Effluent height detection system and method based on binocular vision and deep learning - Google Patents

Effluent height detection system and method based on binocular vision and deep learning

Info

Publication number
CN112183355B
CN112183355B (application CN202011046022.3A)
Authority
CN
China
Prior art keywords
video
track
module
target
dimensional
Prior art date
Legal status
Active
Application number
CN202011046022.3A
Other languages
Chinese (zh)
Other versions
CN112183355A (en)
Inventor
宋春雷
任旭倩
丁子豪
李曼
Current Assignee
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT filed Critical Beijing Institute of Technology BIT
Priority to CN202011046022.3A
Publication of CN112183355A
Application granted
Publication of CN112183355B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a system and a method for detecting water outlet height based on binocular vision and deep learning. The water outlet height detection system provided by the invention can detect the exact height a swimmer rises above the water surface, providing referees with an objective basis for evaluating whether actions meet the standard.

Description

Effluent height detection system and method based on binocular vision and deep learning
Technical Field
The invention relates to the technical field of height detection, and in particular to a system and a method for detecting water outlet height based on binocular vision and deep learning.
Background
Artistic swimming is a water sport that combines dance and music; athletes must complete multiple groups of lifts, rotations, bends and other actions, and the height an athlete jumps out of the water surface is an important scoring criterion. At present this height can only be judged visually by referees, a method with considerable subjectivity and the following defects: (1) the height measurement is inaccurate, easily causing disputes and disrupting the normal running of the competition; (2) the athlete's exit from the water is an instantaneous action, which increases the referees' recognition burden.
Disclosure of Invention
To overcome the limitations and defects of the prior art, the invention provides a water outlet height detection system based on binocular vision and deep learning, comprising a video acquisition module, a video processing module, a target recognition and tracking module, a three-dimensional coordinate recovery module, a track judgment module and a height calculation module;
the video acquisition module is used for acquiring a competition video of the swimmer and transmitting the acquired competition video to the video processing module;
the video processing module is used for processing video containing high-speed moving objects with a video deblurring algorithm, eliminating motion-blur ghosting in the competition video, and transmitting the processed competition video to the target identification and tracking module;
the target identification and tracking module is used for identifying swimmers throwing out of the water surface, tracking the targets, framing the targets by using a rectangular frame, acquiring the central points of the targets, extracting a central point list of the targets and sending the central point list to the three-dimensional coordinate recovery module;
the three-dimensional coordinate recovery module is used for calculating a disparity map according to the left and right eye photos, calculating a depth map according to the focal length and the baseline of the camera, recovering three-dimensional coordinates of track points relative to a camera reference coordinate system according to the disparity map and the depth map, forming a three-dimensional track list and sending the three-dimensional track list to the track judgment module;
the track judging module is used for fitting the obtained track; when the track is judged to be a parabola, the corresponding target is determined to have performed the water outlet motion, and the target track of the water outlet motion is sent to the height calculation module;
the height calculation module is used for calculating the vertical distance between the highest point and the lowest point of the parabola according to the obtained parabola, and the vertical distance is the water outlet height of the swimmer.
Optionally, the video acquisition module comprises a ZED binocular camera mounted on a ground support to prevent shake;
the ZED binocular camera is arranged at the pool edge to the swimmer's side, so that it can capture the whole of the swimmer's performance.
The invention also provides a method for detecting the height of the effluent based on binocular vision and deep learning, which uses the detection system and comprises the following steps:
step S1: the video acquisition module acquires a competition video of the swimmer and transmits the acquired competition video to the video processing module;
step S2: the video processing module processes the video containing high-speed moving objects with a video deblurring algorithm to eliminate motion-blur ghosting in the competition video, and transmits the processed competition video to the target identification and tracking module;
and step S3: the target identification and tracking module identifies the swimmer thrown out of the water surface, tracks the target, frames the target by using a rectangular frame to obtain the central point of the target, extracts the central point list of the target and sends the central point list to the three-dimensional coordinate recovery module;
and step S4: the three-dimensional coordinate recovery module calculates a disparity map according to the left and right eye photos, calculates a depth map according to the focal length and the baseline of the camera, recovers three-dimensional coordinates of the track point relative to a camera reference coordinate system according to the disparity map and the depth map, forms a three-dimensional track list and sends the three-dimensional track list to the track judgment module;
step S5: the track judging module fits the obtained track, determines that the corresponding target performs water outlet motion when the track is judged to be a parabola, and sends the target track performing water outlet motion to the height calculating module;
step S6: the height calculation module calculates the vertical distance between the highest point and the lowest point of the parabola according to the obtained parabola, and the vertical distance is the water outlet height of the swimmer.
Optionally, the step S1 includes:
step S11: the detection system is started and initialized, and whether the ZED binocular camera is correctly started or not is judged;
step S12: if the ZED binocular camera is correctly started, returning video image data, and if the ZED binocular camera is not correctly started, returning to the step S11;
step S13: and reading each frame image of the left eye camera in the video image data, and storing the read image as the match video.
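Steps S11 to S13 reduce to opening the camera and keeping only the left-eye image of each frame. A minimal sketch, assuming the binocular stream is delivered as side-by-side frames with the left eye in the left half (a common ZED output layout; the real system would use the ZED SDK, whose API is not shown here):

```python
import numpy as np

def split_stereo(frame):
    """Split a side-by-side stereo frame into (left, right) eye images.

    Assumption: the binocular camera delivers both eyes in one frame,
    left eye occupying the left half of the width.
    """
    half = frame.shape[1] // 2
    return frame[:, :half], frame[:, half:]

def collect_left_frames(frames):
    """Step S13: keep only the left-eye image of every frame."""
    return [split_stereo(f)[0] for f in frames]
```

In the full system the frames would come from the camera driver, with step S12 retrying the open until it succeeds; here any iterable of H x W arrays works, which also makes offline processing of a saved clip straightforward.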
Optionally, the step S2 includes:
step S21: decomposing each frame of image of the acquired left eye camera into a picture sequence, and storing the picture sequence;
step S22: deblurring the picture sequence using a deep convolutional neural network to eliminate ghosting in the images;
step S23: and storing the deblurred pictures as a video at the original frame rate and size.
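The decompose-process-reassemble pipeline of steps S21 to S23 can be sketched with the deblurring network replaced by a naive temporal-median filter. This stand-in is NOT the cascaded CNN the method actually uses; it only illustrates the per-sequence processing shape (ghosting from fast motion is transient, so a short temporal median suppresses it while preserving the static scene):

```python
import numpy as np

def temporal_median_deblur(frames, radius=1):
    """Steps S21-S22 sketch: run a per-pixel temporal median over a small
    window around each frame. `frames` is the decomposed picture sequence
    (equally sized numpy arrays); the output keeps the original order and
    length so it can be re-encoded at the original frame rate (step S23).
    """
    out = []
    n = len(frames)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(np.median(np.stack(frames[lo:hi]), axis=0))
    return out
```

A real implementation would swap `np.median` for the trained network's per-frame inference while keeping the same sequence-in, sequence-out contract.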
Optionally, the step S3 includes:
step S31: identifying the deblurred video using the YOLOv5 target detection algorithm, which is configured to detect only persons; each detected person is framed with a rectangular box;
step S32: sending the detection result to a Deep Sort algorithm for target tracking, and tracking the track of the moving person;
step S33: if tracking is lost, the person that is no longer tracked is detected again, and the method returns to step S31;
step S34: and extracting the central points of the rectangular frames in the motion process of all the tracked targets, and taking the central points of the rectangular frames as motion tracks to form a motion track list.
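Steps S31 to S33 rely on external detectors (YOLOv5, Deep Sort), but the centre-point extraction of step S34 is plain geometry. A sketch, assuming the tracker reports each box as (x1, y1, x2, y2) pixel corners keyed by track ID (this data layout is an assumption, not the patent's):

```python
def box_center(box):
    """Centre of an axis-aligned bounding box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def motion_tracks(tracked_boxes):
    """Step S34: turn each target's box sequence into a centre-point
    track list. `tracked_boxes` maps track_id -> list of boxes."""
    return {tid: [box_center(b) for b in boxes]
            for tid, boxes in tracked_boxes.items()}
```

The resulting per-target centre-point lists are exactly what the three-dimensional coordinate recovery module consumes in step S4.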
Optionally, the step S4 includes:
step S41: converting each frame of image of a binocular camera shooting video into a gray image;
step S42: extracting corresponding disparity maps from left and right images of each frame corresponding to a video by using a Libelas library, and storing the disparity maps as a picture list;
step S43: calculating a depth map corresponding to each frame of disparity map according to the focal length and the base line of the binocular camera;
step S44: the corresponding value of each pixel point in the depth map is a depth value, and the corresponding depth value of each pixel point is calculated according to the two-dimensional coordinate point of the track;
step S45: and recovering the three-dimensional coordinate according to the two-dimensional coordinate point, wherein a calculation formula is as follows:
z = z_deep / deepscale
x = (u - c_x) × z / f_x
y = (v - c_y) × z / f_y
wherein z_deep is the depth value of the two-dimensional pixel in the depth map, deepscale is the ratio between the stored depth value and the actual depth, (u, v) are the coordinates of the two-dimensional pixel, and (c_x, c_y, f_x, f_y) are the camera intrinsics: c_x and c_y are the horizontal and vertical offsets of the optical-centre imaging point relative to the image origin, and f_x and f_y are the camera focal lengths.
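The back-projection of step S45 translates directly into code; a minimal sketch with variable names following the formulas above (`deepscale` converts stored depth-map values to metric depth):

```python
def backproject(u, v, z_deep, deepscale, fx, fy, cx, cy):
    """Recover camera-frame 3-D coordinates (x, y, z) from a pixel (u, v)
    and its depth-map value z_deep, using intrinsics (cx, cy, fx, fy)."""
    z = z_deep / deepscale          # stored depth value -> metric depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return x, y, z
```

At the principal point (u, v) = (cx, cy) the ray coincides with the optical axis, so x = y = 0 and only the depth survives, which is a quick sanity check on the intrinsics.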
Optionally, the step S5 includes:
step S51: constructing a motion trail data point set according to the three-dimensional coordinates;
step S52: fitting the unknown parameters (a_x, b_x, c_x, a_y, b_y, c_y) of the three-dimensional parabolic curved surface to the collected data points using the least-squares method, the calculation formula of the three-dimensional parabolic curved surface being as follows:
(equation image BDA0002708007470000041 not reproduced in the text)
step S53: describing two-dimensional coordinates of depth and height by using the data points, and fitting the plane position of the object motion in the three-dimensional space by using a least square method, wherein the calculation formula is as follows:
(equation image BDA0002708007470000042 not reproduced in the text)
step S54: determining the motion trajectory of the swimmer according to the intersection line of the motion plane and the curved surface, wherein the calculation formula is as follows:
(equation image BDA0002708007470000043 not reproduced in the text)
step S55: judging whether the data point track belongs to a parabolic track, and calculating the deviation between a parabola obtained by fitting and the original data point, wherein the calculation formula is as follows:
(equation image BDA0002708007470000044 not reproduced in the text)
step S56: when the mean deviation of the fitted curve is below a preset error threshold τ, the group of data points is determined to be a parabolic track;
step S57: and calculating the coordinates of the highest point of the fitted parabola, taking them as the apex of the motion track.
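Steps S51 to S57 can be illustrated in a simplified two-dimensional form on the depth-height plane: fit h(d) = a·d² + b·d + c by least squares, test the mean absolute deviation against the threshold τ, and read the height off the fitted apex. The reduction to 2-D and the use of `numpy.polyfit` are simplifications for illustration, not the patented three-dimensional formulation:

```python
import numpy as np

def water_exit_height_2d(points, tau=0.05):
    """`points`: (depth, height) samples of one candidate track.
    Returns the vertical distance from the fitted apex to the lowest
    sample if the track is parabolic (mean |residual| <= tau and the
    parabola opens downward); otherwise None."""
    d = np.array([p[0] for p in points], dtype=float)
    h = np.array([p[1] for p in points], dtype=float)
    a, b, c = np.polyfit(d, h, 2)                     # least-squares fit (S52)
    residual = np.mean(np.abs(np.polyval([a, b, c], d) - h))
    if residual > tau or a >= 0:                      # deviation test (S55-S56)
        return None
    apex = c - b * b / (4.0 * a)                      # vertex of a*d^2+b*d+c (S57)
    return apex - h.min()
```

The rejection branch is what lets the system distinguish a genuine water-exit jump from other motion: a non-parabolic track simply yields no height.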
The invention has the following beneficial effects:
the invention provides a binocular vision and deep learning-based water outlet height detection system and a method thereof, wherein the detection system comprises a video acquisition module, a video processing module, a target recognition and tracking module, a three-dimensional coordinate recovery module, a track judgment module and a height calculation module, the video acquisition module is used for acquiring a game video of swimmers, the video processing module is used for deblurring processing and removing ghosts generated by high-speed movement of objects in the game video, the target recognition and tracking module is used for carrying out target recognition by using a YOLOv5 algorithm and then carrying out target tracking by using a Deep Sort algorithm, a central point movement track of a target is selected as a target movement track, the three-dimensional coordinate recovery module is used for carrying out parallax map calculation on binocular video, a depth map is converted into a depth map according to a camera focal distance and a binocular camera base line, a depth value corresponding to each track point is calculated, a three-dimensional coordinate of the swimmers relative to a camera coordinate system is recovered according to the depth value to form a three-dimensional track of the swimmers, the judgment module is used for fitting all acquired tracks, when the tracks are determined as water outlet parabolas, and the height calculation module is used for calculating the distance between the highest point and the lowest point of the swimmers as the water outlet height of the swimmers. The outlet height detection system based on binocular vision and deep learning provided by the invention can detect the accurate height of the water surface thrown by a swimmer, and provides objective basis for judging and evaluating action standards. 
The data processing of the invention is fast and the detection result is timely, which guarantees the fairness of scoring. The detection system is simple to install, reduces labour cost, detects quickly, and presents the height value as data, making it clear and intuitive.
Drawings
Fig. 1 is a schematic structural diagram of a binocular vision and deep learning based effluent height detection system according to an embodiment of the present invention.
Fig. 2 is a top view of a binocular vision and deep learning based effluent height detection system according to an embodiment of the present invention.
Fig. 3 is a front view of the effluent height detection system based on binocular vision and deep learning according to an embodiment of the present invention.
Fig. 4 is a flowchart of a binocular vision and deep learning based effluent height detection method according to a second embodiment of the present invention.
Fig. 5 is a schematic diagram illustrating comparison of video effects before and after deblurring according to a second embodiment of the present invention.
Fig. 6 is a schematic diagram of a target tracking effect according to a second embodiment of the present invention.
Fig. 7 is a schematic diagram of another target tracking effect provided in the second embodiment of the present invention.
Fig. 8 is a schematic view illustrating an effect of a disparity map according to a second embodiment of the present invention.
Fig. 9 is a schematic diagram illustrating an effect of the depth map provided by the second embodiment of the present invention.
Fig. 10 is a three-dimensional trace point fitting graph provided in the second embodiment of the present invention.
Fig. 11 is a schematic diagram of the height measurements calculated in the simulated human-motion experiment with a true height of 0.45 m, according to the second embodiment of the present invention.
Fig. 12 is a schematic diagram of fitting the calculated height data of the simulation experiment provided in the second embodiment of the present invention.
Fig. 13 is a schematic diagram of the distance measurements calculated in the distance-detection simulation experiment according to the second embodiment of the present invention.
Fig. 14 is a schematic diagram of the fit between measured and actual distance values recorded in the distance-calculation simulation experiment provided by the second embodiment of the present invention.
Detailed Description
In order to make those skilled in the art better understand the technical solution of the present invention, the binocular vision and deep learning based effluent height detection system and the method thereof provided by the present invention are described in detail below with reference to the accompanying drawings.
Example one
The binocular vision and deep learning based water outlet height detection system provided by this embodiment can quickly and accurately detect the height an athlete rises above the water surface, providing an objective basis for referees to score the actions. Fig. 1 is a schematic structural diagram of a binocular vision and deep learning based effluent height detection system according to an embodiment of the present invention. As shown in fig. 1, the system provided by this embodiment includes a video acquisition module, a video processing module, a target recognition and tracking module, a three-dimensional coordinate recovery module, a trajectory judgment module, and a height calculation module. The video acquisition module uses a binocular camera to capture the water-exit portion of the artistic swimmer's performance and transmits the acquired competition video to the video processing module for deblurring, removing the ghosting produced by high-speed object motion in the video. The processed video is then sent to the target recognition and tracking module, which performs person detection with the YOLOv5 algorithm, tracks the targets with the Deep Sort algorithm, and takes the motion track of each tracked target's centre point as that person's motion track.
The two-dimensional projection coordinates extracted from the video must be restored to three dimensions. Disparity maps are therefore computed from the binocular video and converted into depth maps using the camera focal length and the binocular camera baseline; the depth value of each target track point is calculated, the three-dimensional coordinates of the target relative to the camera coordinate system are recovered from the depth values, and the three-dimensional track is extracted. All acquired tracks are fitted, and a parabolic track is taken to be that of an athlete leaving the water. The height calculation module calculates the Euclidean distance from the highest point to the lowest point of the parabola as the swimmer's water outlet height. This embodiment detects height with a deep learning method, capturing the instantaneous water-exit action without contacting the athletes; it saves manpower and material resources, relies on a simple principle, and achieves fast, accurate height calculation. In addition, the ZED binocular camera captures high-definition left-eye and right-eye images, and the saved video is convenient for offline computation.
Fig. 2 is a top view of a system for detecting a water outlet height based on binocular vision and deep learning according to an embodiment of the present invention. Fig. 3 is a front view of the effluent height detection system based on binocular vision and deep learning according to an embodiment of the present invention. As shown in figs. 2 and 3, the video acquisition module contains a ZED binocular camera that shoots high-definition video and is mounted on a ground support to prevent shake. The camera is placed at the pool edge to the athlete's side, ensuring that the whole performance can be captured.
The detection system provided by this embodiment comprises a video acquisition module, a video processing module, a target recognition and tracking module, a three-dimensional coordinate recovery module, a track judgment module and a height calculation module. The video acquisition module adopts the ZED stereo camera produced by Stereolabs (USA); during deployment it is supported on a tripod, keeping the binocular camera perpendicular to the ground and preventing shake. The binocular camera is located at the pool side of the athlete and can capture the athlete's whole performance. The video acquisition module can run on either Windows or Ubuntu; this embodiment uses the Ubuntu 18.04 operating system. The binocular camera is connected to the computer via USB; the athlete's performance video is acquired and transferred to the computer, where a program run in PyCharm performs the corresponding processing to obtain the artistic swimmer's water outlet height, and the detected target image and height value are printed on the computer screen.
This embodiment provides a water outlet height detection system based on binocular vision and deep learning, and a method thereof. The detection system comprises a video acquisition module, a video processing module, a target recognition and tracking module, a three-dimensional coordinate recovery module, a track judgment module and a height calculation module. The video acquisition module acquires a competition video of the swimmer. The video processing module performs deblurring to remove the ghosting produced by high-speed object motion in the competition video. The target recognition and tracking module performs target recognition with the YOLOv5 algorithm and target tracking with the Deep Sort algorithm, taking the motion track of the target's centre point as the target motion track. The three-dimensional coordinate recovery module computes disparity maps from the binocular video, converts them into depth maps using the camera focal length and the binocular camera baseline, calculates the depth value of each track point, and recovers the swimmer's three-dimensional coordinates relative to the camera coordinate system, forming the swimmer's three-dimensional track. The track judgment module fits all acquired tracks and, when a track is judged to be a parabola, determines that the target has performed the water outlet motion. The height calculation module calculates the vertical distance between the highest and lowest points of the parabola as the swimmer's water outlet height. The binocular vision and deep learning based water outlet height detection system provided by this embodiment can detect the exact height the swimmer rises above the water surface, providing referees with an objective basis for evaluating the actions.
The data processing of this embodiment is fast and the detection result is timely, which guarantees the fairness of scoring. The detection system of this embodiment is simple to install, reduces labour cost, detects quickly, and presents the height value as data, making it clear and intuitive.
Example two
Fig. 4 is a flowchart of a binocular vision and deep learning based effluent height detection method according to a second embodiment of the present invention. As shown in fig. 4, the present embodiment provides a method for detecting an effluent height based on binocular vision and deep learning, where the detection method employs the detection system provided in the first embodiment, and the detection method includes:
step S1: the video acquisition module shoots a match video of the swimmer and extracts pictures shot by the left eye camera and the right eye camera of each frame of the match video.
Step S2: the video processing module carries out deblurring processing on the shot video, eliminates the afterimage of the object in the video due to high-speed movement, and sends the processed video to the target recognition and tracking module.
And step S3: the target recognition and tracking module detects the first frame image through deep learning and recognizes all persons, then tracks them, re-detecting the targets if tracking fails; the targets are framed with rectangular boxes, the centre points of the boxes are extracted as the motion track points of the detected persons, and the results are sent to the three-dimensional coordinate recovery module.
And step S4: the three-dimensional coordinate recovery module is used for calculating a disparity map and a depth map of a left target video and a right target video, recovering three-dimensional coordinates of received two-dimensional projection coordinate points through pixel point depths, converting the two-dimensional coordinates into three-dimensional coordinates relative to a camera reference coordinate system, and then sending a three-dimensional coordinate track to the track judgment module.
Step S5: the track judging module receives the three-dimensional coordinate track and fits it; if the track is a parabola, the moving target is judged to have performed the water-exit motion, and the track of the water outlet motion is sent to the height calculation module.
Step S6: the height calculation module analyzes and calculates the motion track points in the water outlet process, judges the highest point and the lowest point of the swimmer track, and calculates the vertical displacement between the highest point and the lowest point as the water outlet height of the swimmer.
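Once the exit track is confirmed, step S6 reduces to taking the vertical extent of the trajectory. A sketch, assuming the track is a list of camera-frame (x, y, z) points with y the vertical axis (this axis convention is an assumption; in raw image coordinates y typically increases downward, so the recovered coordinates would be flipped accordingly):

```python
def vertical_displacement(track):
    """Step S6: water outlet height as the vertical distance between the
    highest and lowest points of the trajectory.

    track: iterable of (x, y, z) points, y vertical (assumed convention).
    """
    ys = [p[1] for p in track]
    return max(ys) - min(ys)
```

Note the claims speak of vertical distance while the first embodiment mentions Euclidean distance between the two points; for a near-vertical jump the two nearly coincide, and this sketch implements the vertical variant.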
In this embodiment, the step S1 includes the following steps:
step S11: and the detection system is started and initialized, and whether the ZED binocular camera is correctly started or not is judged.
Step S12: if the ZED binocular camera is correctly started, returning video image data, and if the ZED binocular camera is not correctly started, returning to the step S11.
Step S13: and reading each frame image of the left eye camera in the video and storing the frame image as the video.
In this embodiment, the step S2 includes the following steps:
step S21: and decomposing each frame of acquired image of the left eye camera into a picture sequence for storage.
Step S22: the image sequence is deblurred using the deep convolutional neural network described in the paper "Cascaded Deep Video Deblurring Using Temporal Sharpness Prior", eliminating the ghosting in the images.
Step S23: the deblurred pictures are saved as a video at the original frame rate and size.
The experimental video clip is selected from performance videos of the Russia leg of the 2019 Artistic Swimming World Series. Fig. 5 is a schematic diagram illustrating comparison of video effects before and after deblurring according to a second embodiment of the present invention. As can be seen from FIG. 5, ghosting of the swimmer's head is clearly reduced after deblurring, which facilitates subsequent target detection and tracking.
In this embodiment, the step S3 includes the following steps:
step S31: the processed video is first passed to the YOLOv5 target detection algorithm; a model pre-trained on a person dataset is used, so the algorithm detects only persons, and each detected person is framed with a rectangular box.
Step S32: and sending the detection result to a Deep Sort algorithm for target tracking, and tracking the track of the moving human.
Step S33: if tracking is lost, the algorithm re-detects the untracked persons and returns to step S31.
Step S34: the centre points of the rectangular boxes over the motion of all tracked targets are extracted as motion tracks, forming a motion track list.
Fig. 6 is a schematic diagram of a target tracking effect according to a second embodiment of the present invention. Fig. 7 is a schematic diagram of another target tracking effect provided in the second embodiment of the present invention. It can be seen that the algorithm recognizes the persons well and frames them correctly.
In this embodiment, the step S4 includes the following steps:
step S41: converting each frame of image of a binocular camera shooting video into a gray image;
step S42: extracting corresponding disparity maps from left and right eye pictures of each frame corresponding to a video by using a Libelas library, and storing the disparity maps into a picture list;
step S43: calculating a depth map corresponding to each frame of parallax map according to the focal length and the base line of the binocular camera;
step S44: the corresponding value of each pixel point in the depth map is a depth value, and the corresponding depth value of each pixel point is calculated according to the two-dimensional coordinate point of the track;
step S45: and recovering a three-dimensional coordinate according to the two-dimensional coordinate point, wherein a calculation formula is as follows:
z = z_deep / deepscale
x = (u - c_x) × z / f_x
y = (v - c_y) × z / f_y
wherein z_deep is the depth value of the two-dimensional pixel in the depth map, deepscale is the ratio between the stored depth value and the actual depth, (u, v) are the coordinates of the two-dimensional pixel, and (c_x, c_y, f_x, f_y) are the camera intrinsics: c_x and c_y are the horizontal and vertical offsets (unit: pixels) of the optical-centre imaging point relative to the image origin, and f_x and f_y are the camera focal lengths.
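Step S43's conversion from disparity map to depth map rests on the standard stereo relation z = f · B / d (focal length in pixels, baseline in metres, disparity in pixels); this relation is textbook stereo geometry rather than a formula stated in the patent, so the sketch below is an assumption-labelled illustration:

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Step S43 sketch: metric depth from pixel disparity, z = f * B / d.

    Standard pinhole-stereo relation (an assumption here, not quoted
    from the patent text). Raises on non-positive disparity, which has
    no valid depth.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px
```

Nearer objects produce larger disparities and hence smaller depths, which is a quick consistency check when inspecting the computed depth maps against figs. 8 and 9.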
In the experiment, because a binocular video of an actual swimmer's performance could not be acquired, a simulation experiment was carried out in which a person on land walked along a parabolic trajectory to imitate the swimmer's airborne parabolic pose. Fig. 8 is a schematic view illustrating the effect of a disparity map according to the second embodiment of the present invention. Fig. 9 is a schematic diagram illustrating the effect of a depth map according to the second embodiment of the present invention. As shown in Fig. 8 and Fig. 9, the rectangular frame marks the target person.
In this embodiment, the step S5 includes the following steps:
step S51: constructing a motion trail data point set according to the three-dimensional coordinates;
step S52: fitting the unknown parameters (a_x, b_x, c_x, a_y, b_y, c_y) of the three-dimensional parabolic curved surface to the collected data points using the least-squares method, where the calculation formula of the three-dimensional parabolic curved surface is as follows:
[three-dimensional parabolic surface formula, rendered as image BDA0002708007470000111 in the original publication]
step S53: using the data points' two-dimensional coordinates of depth and height, fitting the plane of the object's motion in three-dimensional space by the least-squares method, where the calculation formula is as follows:
[motion-plane fitting formula, rendered as image BDA0002708007470000112 in the original publication]
step S54: determining the motion trajectory line of the swimmer according to the intersection line of the motion plane and the curved surface, wherein the calculation formula is as follows:
[plane-surface intersection formula for the trajectory line, rendered as image BDA0002708007470000113 in the original publication]
step S55: judging whether the data point track belongs to a parabolic track, and calculating the deviation between a parabola obtained by fitting and the original data point, wherein the calculation formula is as follows:
[fit-deviation formula, rendered as image BDA0002708007470000114 in the original publication]
step S56: when the mean deviation of the fitted curve is within the preset error threshold τ, the group of data points is determined to be a parabolic track;
step S57: calculating the coordinates of the highest point of the fitted parabola, and taking them as the coordinates of the highest point of the motion track.
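Steps S52 to S57 reduce to an ordinary least-squares quadratic fit, a deviation check, and a vertex computation. A minimal NumPy sketch on a synthetic track follows; fitting height against a single horizontal coordinate is a simplification of the surface/plane intersection described above, and the threshold value is a hypothetical choice:

```python
import numpy as np

def fit_parabola(s, h):
    """Least-squares fit h = a*s^2 + b*s + c (cf. steps S52/S53)."""
    return np.polyfit(s, h, 2)

def mean_deviation(s, h, coeffs):
    """Step S55: mean absolute deviation between the fitted parabola and the data."""
    return float(np.mean(np.abs(np.polyval(coeffs, s) - h)))

def apex_height(coeffs):
    """Step S57: height at the vertex s = -b / (2a) of the fitted parabola."""
    a, b, _ = coeffs
    return float(np.polyval(coeffs, -b / (2 * a)))

# Synthetic parabolic track peaking at 0.45 m, as in the simulation experiment
s = np.linspace(0.0, 1.0, 50)
h = -1.8 * (s - 0.5) ** 2 + 0.45
coeffs = fit_parabola(s, h)
tau = 0.01                                   # hypothetical error threshold
is_parabola = mean_deviation(s, h, coeffs) < tau  # step S56
top = apex_height(coeffs)                    # ~ 0.45 m
```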
Fig. 10 is a three-dimensional trace-point fitting graph provided in the second embodiment of the present invention. For safety reasons, the simulation experiment was limited to a test height of 0.45 m. Fig. 11 is a schematic diagram of the calculated height measurements of the 0.45 m human-walking simulation experiment according to the second embodiment of the present invention. Fig. 12 is a schematic diagram of the fit to the calculated height data of the simulation experiment provided in the second embodiment of the present invention.
Since height measurement is essentially distance measurement, this embodiment also carried out a binocular-vision ranging experiment on fixed objects, which verified the feasibility of this step of the scheme. The fixed objects were regular objects of known size, such as square cabinets and books. Fig. 13 is a schematic diagram of the calculated distance measurements of the ranging simulation experiment according to the second embodiment of the present invention. Fig. 14 is a schematic diagram of the fit to the calculated distance data, with the measured and true values recorded as curves, provided in the second embodiment of the present invention.
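The fixed-object ranging check rests on the same triangulation relation Z = f·B/d. A toy round-trip sketch follows; all numbers are illustrative, not measured values from the experiment:

```python
def stereo_range(disparity_px, focal_px, baseline_m):
    """Distance to a point from its stereo disparity: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# A cabinet standing 2.1 m away should produce disparity d = f * B / Z pixels
f, B, true_z = 700.0, 0.12, 2.1
d = f * B / true_z                 # forward: expected disparity, 40 px
measured = stereo_range(d, f, B)   # back: recovered range, 2.1 m
```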
This embodiment provides a method for detecting a swimmer's water outlet height based on binocular vision and deep learning. The detection system comprises a video acquisition module, a video processing module, a target recognition and tracking module, a three-dimensional coordinate recovery module, a track judgment module and a height calculation module. The video acquisition module acquires a competition video of the swimmer. The video processing module deblurs the video to remove the ghosting produced by high-speed motion of objects in the competition video. The target recognition and tracking module performs target recognition with the YOLOv5 algorithm, then performs target tracking with the Deep Sort algorithm, and takes the motion track of the target's center point as the target motion track. The three-dimensional coordinate recovery module computes disparity maps from the binocular video, converts them into depth maps according to the camera focal length and the binocular baseline, calculates the depth value corresponding to each track point, and recovers the swimmer's three-dimensional coordinates relative to the camera coordinate system from those depth values to form the swimmer's three-dimensional track. The track judgment module fits all acquired tracks and determines a track to be the swimmer's when it is a parabola. The height calculation module calculates the distance between the highest and lowest points of the parabola as the swimmer's water outlet height. The water outlet height detection system based on binocular vision and deep learning provided by this embodiment can detect the precise height to which a swimmer is thrown above the water surface, providing an objective basis for judging and scoring action standards.
The data processing of this embodiment is fast, so the detection result is timely and the fairness of scoring is guaranteed. The detection system is simple to install, reducing labor cost; detection is fast, and the height value is presented numerically, which is clear and intuitive.
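Structurally, the six-module chain of this embodiment can be outlined as a simple pipeline. Every function below is a hypothetical stand-in for the corresponding module, not the embodiment's actual code:

```python
def detect_water_outlet_height(video_frames, deblur, detect_and_track,
                               recover_3d, is_parabola, apex_minus_base):
    """Chain the modules: capture -> deblur -> track -> 3-D recovery -> fit -> height."""
    frames = [deblur(f) for f in video_frames]        # video processing module
    tracks = detect_and_track(frames)                 # target recognition and tracking
    tracks_3d = [recover_3d(t) for t in tracks]       # 3-D coordinate recovery
    return [apex_minus_base(t) for t in tracks_3d     # height calculation module,
            if is_parabola(t)]                        # gated by track judgment
```

With identity stubs for each stage, the function simply filters parabolic tracks and reports one height per swimmer.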
It will be understood that the above embodiments are merely exemplary embodiments taken to illustrate the principles of the present invention, which is not limited thereto. It will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the spirit and substance of the invention, and these modifications and improvements are also considered to be within the scope of the invention.

Claims (6)

1. A method for detecting the effluent height based on binocular vision and deep learning is characterized in that a binocular vision and deep learning based effluent height detection system is used in the detection method, and the detection system comprises a video acquisition module, a video processing module, a target recognition and tracking module, a three-dimensional coordinate recovery module, a track judgment module and a height calculation module;
the video acquisition module is used for acquiring a competition video of a swimmer and transmitting the acquired competition video to the video processing module;
the video processing module is used for processing a video with a high-speed moving object according to a video deblurring algorithm so as to eliminate a fuzzy ghost in the match video and transmitting the processed match video to the target recognition and tracking module;
the target identification and tracking module is used for identifying swimmers thrown out of the water surface, tracking the target, framing the target by using a rectangular frame, obtaining the central point of the target, extracting a central point list of the target and sending the central point list to the three-dimensional coordinate recovery module;
the three-dimensional coordinate recovery module is used for calculating a disparity map according to the left and right eye photos, calculating a depth map according to the focal length and the baseline of the camera, recovering three-dimensional coordinates of track points relative to a camera reference coordinate system according to the disparity map and the depth map, forming a three-dimensional track list and sending the three-dimensional track list to the track judgment module;
the track judging module is used for fitting the obtained track, determining that the corresponding target performs water outlet motion when the track is judged to be parabolic, and sending the target track performing water outlet motion to the height calculating module;
the height calculation module is used for calculating the vertical distance between the highest point and the lowest point of the parabola according to the obtained parabola, and the vertical distance is the water outlet height of the swimmer;
the detection method comprises the following steps:
step S1: the video acquisition module acquires a competition video of the swimmer and transmits the acquired competition video to the video processing module;
step S2: the video processing module is used for processing a video with a high-speed moving object according to a video deblurring algorithm so as to eliminate a fuzzy ghost in the match video and transmitting the processed match video to the target identification and tracking module;
and step S3: the target identification and tracking module identifies the swimmer thrown out of the water surface, tracks the target, frames the target by using a rectangular frame to obtain the central point of the target, extracts the central point list of the target and sends the central point list to the three-dimensional coordinate recovery module;
and step S4: the three-dimensional coordinate recovery module calculates a disparity map according to the left and right eye photos, calculates a depth map according to the focal length and the baseline of the camera, recovers three-dimensional coordinates of the track point relative to a camera reference coordinate system according to the disparity map and the depth map, forms a three-dimensional track list and sends the three-dimensional track list to the track judgment module;
step S5: the track judging module fits the obtained track, determines that the corresponding target performs water outlet motion when the track is judged to be a parabola, and sends the target track performing water outlet motion to the height calculating module;
step S6: the height calculation module calculates the vertical distance between the highest point and the lowest point of the parabola according to the obtained parabola, wherein the vertical distance is the water outlet height of the swimmer;
the step S5 includes:
step S51: constructing a motion trail data point set according to the three-dimensional coordinates;
step S52: fitting the unknown parameters (a_x, b_x, c_x, a_y, b_y, c_y) of the three-dimensional parabolic curved surface to the collected data points using a least-squares method, wherein the calculation formula of the three-dimensional parabolic curved surface is as follows:
[three-dimensional parabolic surface formula, rendered as image FDA0003865232900000021 in the original publication]
step S53: describing two-dimensional coordinates of depth and height by using the data points, and fitting the plane position of the object motion in the three-dimensional space by using a least square method, wherein the calculation formula is as follows:
[motion-plane fitting formula, rendered as image FDA0003865232900000022 in the original publication]
step S54: determining the motion trajectory line of the swimmer according to the intersection line of the motion plane and the curved surface, wherein the calculation formula is as follows:
[plane-surface intersection formula for the trajectory line, rendered as image FDA0003865232900000031 in the original publication]
step S55: judging whether the data point track belongs to a parabolic track, and calculating the deviation between a parabola obtained by fitting and the original data point, wherein the calculation formula is as follows:
[fit-deviation formula, rendered as image FDA0003865232900000032 in the original publication]
step S56: when the mean deviation of the fitted curve is within a preset error threshold value τ, determining that the group of data points is a parabolic track;
step S57: and calculating the coordinate of the highest point of the track according to the fitted parabola, and taking the coordinate of the highest point as the coordinate of the highest point of the motion track.
2. The binocular vision and deep learning based effluent height detection method according to claim 1, wherein the video acquisition module comprises a ZED binocular camera placed on the ground through a bracket to prevent shaking;
the ZED binocular camera is arranged on the edge of the swimming pool on the side face of the swimmer, so that the ZED binocular camera can shoot the whole process of the performance of the swimmer.
3. The binocular vision and deep learning based effluent height detection method according to claim 2, wherein the step S1 comprises:
step S11: the detection system is started and initialized, and whether the ZED binocular camera is correctly started or not is judged;
step S12: if the ZED binocular camera is correctly started, returning video image data, and if the ZED binocular camera is not correctly started, returning to the step S11;
step S13: and reading each frame image of the left eye camera in the video image data, and storing the read image as the match video.
4. The binocular vision and deep learning based effluent height detection method according to claim 1, wherein the step S2 comprises:
step S21: decomposing each frame of image of the acquired left eye camera into a picture sequence, and storing the picture sequence;
step S22: carrying out deblurring processing on the picture sequence by using a deep convolution neural network to eliminate the ghost in the picture;
step S23: and saving the deblurred picture as a video according to the original frame rate and size.
5. The binocular vision and deep learning based effluent height detection method according to claim 1, wherein the step S3 comprises:
step S31: identifying the deblurred video by using a YOLOv5 target detection algorithm, wherein the YOLOv5 target detection algorithm is set to detect only humans, and a rectangular frame is used for framing each detected person;
step S32: sending the detection result to a Deep Sort algorithm for target tracking, and tracking the track of the moving person;
step S33: if tracking of a target ends, detecting the person which is no longer tracked again, and returning to the step S31;
step S34: and extracting the central points of the rectangular frames in the motion process of all the tracked targets, and taking the central points of the rectangular frames as motion tracks to form a motion track list.
6. The binocular vision and deep learning based effluent height detection method according to claim 1, wherein the step S4 comprises:
step S41: converting each frame of image of a video shot by a binocular camera into a gray image;
step S42: extracting corresponding disparity maps from left and right eye pictures of each frame corresponding to a video by using a Libelas library, and storing the disparity maps into a picture list;
step S43: calculating a depth map corresponding to each frame of parallax map according to the focal length and the base line of the binocular camera;
step S44: the corresponding value of each pixel point in the depth map is a depth value, and the corresponding depth value of each pixel point is calculated according to the two-dimensional coordinate point of the track;
step S45: and recovering a three-dimensional coordinate according to the two-dimensional coordinate point, wherein a calculation formula is as follows:
z = z_deep / deepscale
x = (u - c_x) × z / f_x
y = (v - c_y) × z / f_y
wherein z_deep is the depth value corresponding to the two-dimensional pixel in the depth map, deepscale is the ratio between the stored depth value and the actual depth, (u, v) are the coordinates of the two-dimensional pixel, and (c_x, c_y, f_x, f_y) are the camera intrinsics: c_x and c_y are the horizontal and vertical offsets between the image origin and the optical-center imaging point, and f_x and f_y are the camera focal lengths.
CN202011046022.3A 2020-09-28 2020-09-28 Effluent height detection system and method based on binocular vision and deep learning Active CN112183355B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011046022.3A CN112183355B (en) 2020-09-28 2020-09-28 Effluent height detection system and method based on binocular vision and deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011046022.3A CN112183355B (en) 2020-09-28 2020-09-28 Effluent height detection system and method based on binocular vision and deep learning

Publications (2)

Publication Number Publication Date
CN112183355A CN112183355A (en) 2021-01-05
CN112183355B true CN112183355B (en) 2022-12-27

Family

ID=73945752

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011046022.3A Active CN112183355B (en) 2020-09-28 2020-09-28 Effluent height detection system and method based on binocular vision and deep learning

Country Status (1)

Country Link
CN (1) CN112183355B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112883866A (en) * 2021-02-08 2021-06-01 上海新纪元机器人有限公司 Method, system and storage medium for detecting regional invasion in real time
CN113240717B (en) * 2021-06-01 2022-12-23 之江实验室 Error modeling position correction method based on three-dimensional target tracking
CN113326835B (en) * 2021-08-04 2021-10-29 中国科学院深圳先进技术研究院 Action detection method and device, terminal equipment and storage medium
CN113744524B (en) * 2021-08-16 2023-04-18 武汉理工大学 Pedestrian intention prediction method and system based on cooperative computing communication between vehicles
CN113822913B (en) * 2021-11-25 2022-02-11 江西科技学院 High-altitude parabolic detection method and system based on computer vision
CN114241602B (en) * 2021-12-16 2024-05-28 北京理工大学 Deep learning-based multi-objective moment of inertia measurement and calculation method
CN114359411B (en) * 2022-01-10 2022-08-09 杭州巨岩欣成科技有限公司 Method and device for detecting drowning prevention target of swimming pool, computer equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101370125A (en) * 2007-08-17 2009-02-18 林嘉 Diving auto-tracking shooting and video feedback method and system thereof
CN110553628A (en) * 2019-08-28 2019-12-10 华南理工大学 Depth camera-based flying object capturing method
CN110711373A (en) * 2019-09-16 2020-01-21 北京理工大学 System and method for detecting height of hitting point of badminton serving
CN111444890A (en) * 2020-04-30 2020-07-24 汕头市同行网络科技有限公司 Sports data analysis system and method based on machine learning
CN111553274A (en) * 2020-04-28 2020-08-18 青岛聚好联科技有限公司 High-altitude parabolic detection method and device based on trajectory analysis
CN111612826A (en) * 2019-12-13 2020-09-01 北京理工大学 High-precision three-dimensional motion track acquisition positioning and motion process reproduction method based on binocular video sensor

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10489656B2 (en) * 2017-09-21 2019-11-26 NEX Team Inc. Methods and systems for ball game analytics with a mobile device


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
3D scene analysis by real-time stereovision; G. Garibotto et al.; IEEE International Conference on Image Processing 2005; 2005-11-14; pp. 1-4 *
Research on trajectory tracking in rhythmic gymnastics based on computer vision (in Chinese); Zheng Liang; Modern Electronics Technique; 2017-10-01 (No. 19); pp. 86-90 *

Also Published As

Publication number Publication date
CN112183355A (en) 2021-01-05

Similar Documents

Publication Publication Date Title
CN112183355B (en) Effluent height detection system and method based on binocular vision and deep learning
KR101078975B1 (en) Sensing device and method used to apparatus for virtual golf simulation
US9743014B2 (en) Image recognition system, image recognition apparatus, image recognition method, and computer program
CN110929596A (en) Shooting training system and method based on smart phone and artificial intelligence
US20130178304A1 (en) Method of analysing a video of sports motion
CN111444890A (en) Sports data analysis system and method based on machine learning
CN103093198B (en) A kind of crowd density monitoring method and device
CN108875730A (en) A kind of deep learning sample collection method, apparatus, equipment and storage medium
CN109684919B (en) Badminton service violation distinguishing method based on machine vision
CN111027432B (en) Gait feature-based visual following robot method
CN113850865A (en) Human body posture positioning method and system based on binocular vision and storage medium
CN104408725A (en) Target recapture system and method based on TLD optimization algorithm
CN110287907A (en) A kind of method for checking object and device
CN116309686A (en) Video positioning and speed measuring method, device and equipment for swimmers and storage medium
JP4465150B2 (en) System and method for measuring relative position of an object with respect to a reference point
CN115100744A (en) Badminton game human body posture estimation and ball path tracking method
JP7198661B2 (en) Object tracking device and its program
Sokolova et al. Human identification by gait from event-based camera
CN117333550A (en) Shuttlecock service height violation judging method based on computer vision detection
WO2018076170A1 (en) A camera system for filming golf game and the method for the same
US11229824B2 (en) Determining golf club head location in an image using line detection and contour separation
CN116703968A (en) Visual tracking method, device, system, equipment and medium for target object
CN110910489A (en) Monocular vision based intelligent court sports information acquisition system and method
CN110717931B (en) System and method for detecting height of hitting point of badminton serving
CN114187663A (en) Method for controlling unmanned aerial vehicle by posture based on radar detection gray level graph and neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant