CN113033308A - Team sports video game lens extraction method based on color features - Google Patents
- Publication number
- CN113033308A (application CN202110204176.9A)
- Authority
- CN
- China
- Prior art keywords
- video
- shot
- clip
- video clip
- clips
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/49—Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
Abstract
A team sports video match shot extraction method based on color features belongs to the field of video data processing. First, the video is preprocessed and automatically segmented into shots to structure it. The arithmetic means of the r, g and b channels are then computed for the video frames and for the shot video clips, the clips are clustered with K-means, a reference shot video clip is selected according to the range of the per-frame channel means within each clip, and the team sports video is divided into match shot (far and medium shot) video clips and other shot (close-up and off-site shot) video clips by comparing the dominant color of each clip with that of the reference clip. The invention extracts the far- and medium-shot video clips that make up most of the content of team sports videos by exploiting the dominant color differences between video frames, which helps reduce the adverse effect of redundant content on video analysis and processing and lays a good structural foundation for further research.
Description
Technical Field
The invention belongs to the field of video data processing.
Background
Team sports such as football and basketball are popular and have a broad audience, so research on team sports videos has wide application prospects. Video data is unstructured; converting it into a sequence of labeled segments, with shot type as the key attribute, is the basis for event detection and summary generation. The common approach first preprocesses the video and automatically segments it into shots to structure it; on this basis, the shot type is recognized automatically with methods such as pattern recognition. In team sports videos, shots can be divided into four types according to shooting angle and distance: far shots, medium shots, close-up shots and off-site shots. The far shot shows the position of every player on the court and the progress of the game from a bird's-eye view, while the medium shot shows the play of most players on the court more clearly; close-ups are often replays of far and medium shots; off-site shots such as spectators, advertising and LOGO transitions are redundant content relative to the game itself. In summary, the far and medium shots, which typically occupy the vast majority of the whole game, are the most important match shots. Extracting the match shots of a sports video can reduce the adverse impact of redundant content on video analysis and processing.
Eldib et al. determined an RGB range for field pixels through experiments, treated every pixel whose components fall within that range as a field-color pixel, and then set thresholds on the proportion of field pixels to distinguish the shot types. Junqing et al. used the field-color proportion of a sub-window region of the video frame to infer the shot type, with some success.
Shot classification first requires video segmentation; accurately detecting shot boundaries to complete the segmentation is a prerequisite for effectively classifying shot types. In team sports video, shot boundaries can be divided into abrupt and gradual transitions. In existing work, shot segmentation still produces false detections, especially for gradual transitions, because team sports videos contain many slow-motion replays, and the LOGO wipes at the beginning and end of a replay easily mislead the algorithm, which is very unfavorable for subsequent research. Existing shot classification algorithms cannot compensate for these boundary false detections, and their classification accuracy suffers as a result.
Second, the algorithm proposed by Eldib et al. is based on the field color, but the playing field changes with venue, weather, lighting and other factors. The algorithm proposed by Junqing et al. depends on the chosen size and position of the sub-window, so its robustness is poor.
In summary, the recognition accuracy of each shot type in current shot classification methods is clearly affected by insufficient shot-boundary detection accuracy, so shot recognition accuracy is not ideal, and the universality of existing shot classification algorithms is weak.
Disclosure of Invention
The invention aims to provide a team sports video match shot extraction method based on color features, which extracts the far and medium shots that occupy most of the content of team sports videos for subsequent video analysis and research. Based on the fact that the dominant color of the whole video frame differs greatly between shot types, the team sports video is divided into match shot (far shot, medium shot) video clips and other shot (close-up, off-site) video clips. Experiments show that the method is adaptable, accurate, practical and efficient.
Input: original team sports video
Output: a set of match shot video clips
Definitions:
S = {S_1, S_2, …, S_s}: the set of shot video clips after video segmentation;
f: the set of video frames of a shot video clip;
AM_R: the arithmetic mean of the R channel values of all pixels in a video frame;
AM_G: the arithmetic mean of the G channel values of all pixels in a video frame;
AM_B: the arithmetic mean of the B channel values of all pixels in a video frame;
R_R: the range (maximum minus minimum) of AM_R within a shot video clip;
R_G: the range of AM_G within a shot video clip;
R_B: the range of AM_B within a shot video clip;
C_max: the cluster containing the most shot video clips after K-means clustering;
TH_BS: the threshold distinguishing candidate reference shots from all other shots;
CBS: the set of candidate reference shot video clips;
SFM: the shot video clip in CBS with the largest number of video frames;
TH_GS: the threshold distinguishing match shots from non-match shots;
S': the set of match shot video clips.
Step one: segment the video into shot video clips and store it frame by frame as pictures, obtaining S.
Step two: compute AM_R, AM_G, AM_B for all video frames.
Step three: for each shot video clip, compute the ranges R_R, R_G, R_B of AM_R, AM_G, AM_B over its frames, and its dominant color A_RGB, the arithmetic mean of (AM_R, AM_G, AM_B) over all frames of the clip.
Step four: perform K-means clustering on the A_RGB of all shot video clips and find C_max.
Step five: for every shot video clip in C_max, if at least one of R_R, R_G, R_B is less than or equal to TH_BS, store the clip in CBS.
Step six: select SFM, the shot video clip in CBS with the largest number of video frames.
Step seven: compute the relative difference between the A_RGB of every shot video clip and the A_RGB of SFM; if all three channel differences are less than or equal to TH_GS, store the clip in S'.
Values: TH_BS = 100, TH_GS = 15%.
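The clustering-and-filtering pipeline of steps four through seven can be sketched as follows. This is a minimal illustration reconstructed from the definitions above, not the patented implementation: the `Shot` record, the relative-difference form of the step-seven test, and the idea of passing in precomputed K-means labels (in practice produced by, e.g., scikit-learn's KMeans on the per-shot dominant colors) are assumptions.

```python
# Sketch of steps four-seven, assuming per-shot features (step two/three)
# and K-means cluster labels (step four) have already been computed.
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple

TH_BS = 100.0   # range threshold for candidate reference shots (step five)
TH_GS = 0.15    # 15% dominant-color difference threshold (step seven)

@dataclass
class Shot:
    index: int
    n_frames: int
    mean_rgb: Tuple[float, float, float]   # A_RGB: per-shot mean of AM_R, AM_G, AM_B
    range_rgb: Tuple[float, float, float]  # R_R, R_G, R_B: per-channel ranges

def extract_game_shots(shots: List[Shot], cluster_labels: List[int]) -> List[int]:
    # Step four: keep the largest K-means cluster, C_max.
    biggest = Counter(cluster_labels).most_common(1)[0][0]
    c_max = [s for s, lbl in zip(shots, cluster_labels) if lbl == biggest]
    # Step five: CBS = shots in C_max with at least one channel range <= TH_BS.
    cbs = [s for s in c_max if min(s.range_rgb) <= TH_BS]
    # Step six: reference shot SFM = the CBS shot with the most frames.
    sfm = max(cbs, key=lambda s: s.n_frames)
    # Step seven: a shot is a match shot if every channel of its dominant
    # color differs from SFM's by at most TH_GS (relative difference).
    game = []
    for s in shots:
        diffs = [abs(a - b) / b for a, b in zip(s.mean_rgb, sfm.mean_rgb)]
        if all(d <= TH_GS for d in diffs):
            game.append(s.index)
    return game

# Illustrative values loosely modelled on the worked example below.
demo = [Shot(17, 500, (110.0, 90.0, 79.0), (37.0, 35.0, 40.0)),
        Shot(20, 300, (200.0, 200.0, 190.0), (212.0, 216.0, 196.0)),
        Shot(31, 1895, (112.0, 92.0, 80.0), (50.0, 40.0, 30.0))]
print(extract_game_shots(demo, [0, 0, 0]))  # [17, 31]
```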
Using the mean attribute of the ImageStat module in the Python image processing library PIL, the arithmetic means AM_R, AM_G, AM_B of the three channel values of each frame in a shot video clip can be computed.
The formulas are as follows:
AM_R = (1/n) Σ r_i,  AM_G = (1/n) Σ g_i,  AM_B = (1/n) Σ b_i,
where r_i, g_i, b_i are the RGB pixel values of pixel i in the video frame and n is the number of pixels.
A_RGB = ( (1/N) Σ AM_R, (1/N) Σ AM_G, (1/N) Σ AM_B ),
where N is the number of video frames included in a single shot video clip.
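As a concrete illustration, the per-frame channel-mean formula can be written directly in Python; in practice the text computes the same values with PIL's ImageStat (`Stat(img).mean`). Here a frame is modelled as a plain list of (r, g, b) tuples.

```python
# Direct implementation of the AM_R/AM_G/AM_B formula: the arithmetic mean
# of each channel over all n pixels of a frame.
def channel_means(pixels):
    n = len(pixels)
    am_r = sum(p[0] for p in pixels) / n
    am_g = sum(p[1] for p in pixels) / n
    am_b = sum(p[2] for p in pixels) / n
    return am_r, am_g, am_b

frame = [(100, 80, 70), (102, 82, 72), (98, 78, 68)]
print(channel_means(frame))  # (100.0, 80.0, 70.0)
```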
The invention extracts the match shots of team sports videos using color features: the dominant color differences between whole video frames are used to eliminate close-up and off-site shots, keeping the main match shots. Existing work on shot classification usually concentrates on the playing field; this method is not bound by that convention and offers a new idea. It is not limited by field factors and applies to multiple team sports, and it can also compensate for some false detections of the shot boundary detection step: if a non-match shot is wrongly segmented, the whole shot is still identified and removed without affecting subsequent research and analysis. The method therefore extracts the main match shot portion of a whole team sports video and lays a good structural foundation for subsequent semantic event detection and summary generation.
To test the method's effectiveness, 25 videos totalling 112 minutes were extracted from the public YouTube-8M dataset to form the SportKF dataset shown in FIG. 2, covering four sports (basketball, football, American football and hockey) and containing 197,878 frames and 572 shots.
The method was applied to this dataset, and on 5 football videos its recall and precision were compared with the method proposed by Junqing et al. Recall and precision are computed as follows:
recall = (number of detections − number of false detections) / number of shots to be detected
precision = (number of detections − number of false detections) / number of detections
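The two formulas above in code form. The shot counts used in the example call are illustrative only, not the paper's actual counts.

```python
# recall    = (detections - false detections) / shots to be detected
# precision = (detections - false detections) / detections
def recall(n_detected, n_false, n_to_detect):
    return (n_detected - n_false) / n_to_detect

def precision(n_detected, n_false):
    return (n_detected - n_false) / n_detected

# Hypothetical counts: 30 detections, 4 false, 30 shots to detect.
print(recall(30, 0, 30))           # 1.0
print(round(precision(30, 4), 4))  # 0.8667
```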
For the extracted match shots, recall across the four team sports is 100% and precision is 86.67%. For football videos, the method of Junqing et al. reaches a recall of 93.2% while this method reaches 100%; the precision of the method of Junqing et al. is 88.7% while this method reaches 93.4%.
The experimental results show that the method detects and identifies match shots well, improving both recall and precision over the representative method proposed by Junqing et al.
Drawings
FIG. 1 is a flow chart
FIG. 2 SportKF dataset video covers
FIG. 3 far shot
FIG. 4 close-up shot
FIG. 5 LOGO shot
Detailed Description
The invention provides a team sports video match shot extraction method based on color features. The specific implementation steps are as follows:
Take an NBA game video with a duration of 7 minutes, a frame width of 640, a frame height of 360 and a frame rate of 30 frames/second as an example. The far shot is shown in FIG. 3, the close-up shot in FIG. 4 and the LOGO shot in FIG. 5 (the numerical values below are rounded to 2 decimal places).
Step one: segment the video into shot video clips, obtaining S = {S_1, S_2, …, S_33} (33 shots in total).
Step two: computing AM of all video framesR,AMG,AMB. AM corresponding to FIG. 3R=100.50,AMG=80.34,AMB74.88; AM corresponding to FIG. 4R=93.04,AMG=76.01,AMB71.23; AM corresponding to FIG. 5R=130.07,AMG=112.26,AMB=79.29.
Step three: compute each clip's channel ranges and dominant color. The shot in FIG. 3 (serial number 17) has R_R = 37.22, R_G = 35.56, R_B = 40.65 and A_RGB = (110.05, 90.39, 79.10); the shot in FIG. 4 (serial number 3) has R_R = 40.13, R_G = 28.46, R_B = 19.81 and A_RGB = (96.27, 77.20, 65.88); the shot in FIG. 5 (serial number 12) has R_R = 13.84, R_G = 12.63, R_B = 1.80 and A_RGB = (105.70, 90.43, 74.45).
Step four: perform K-means clustering on the A_RGB of all shot video clips and find C_max. In this example C_max contains the 9 shots numbered 4, 6, 9, 15, 17, 20, 21, 27 and 31.
Step five: for every shot video clip in C_max, if at least one of R_R, R_G, R_B is less than or equal to TH_BS, store the clip in CBS. TH_BS is set to 100 in this example. The shot numbered 17 in C_max, with R_R = 40.13, R_G = 28.46, R_B = 19.81, satisfies the condition "at least one of R_R, R_G, R_B is less than or equal to TH_BS" and is stored in CBS; by contrast, the shot numbered 20, with R_R = 212.30, R_G = 216.18, R_B = 196.36, does not satisfy the condition and is not stored. The resulting CBS of step five in this example contains the 8 shots numbered 4, 6, 9, 15, 17, 21, 27 and 31.
Step six: select SFM, the clip in CBS with the most video frames. In this example, among the 8 shots in CBS, the shot numbered 31 has the largest frame count (1895 frames), so it is taken as SFM.
Step seven: compute the relative difference between the A_RGB of every shot video clip and the A_RGB of SFM; if all three channel differences are less than or equal to TH_GS, store the clip in S'. TH_GS is set to 15% in this example. The shot in FIG. 3 (number 17) satisfies the condition "all differences are less than or equal to TH_GS" and is stored in S'; by contrast, the shot in FIG. 4 (number 3) does not satisfy the condition and is not stored. The final match shot set S' in this example contains the 11 shots numbered 2, 4, 6, 9, 13, 15, 17, 21, 25, 27 and 31, all of which are match shots, with no missed detections.
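The step-five membership test can be checked directly against the numbers quoted in the worked example (function name is illustrative, not from the patent):

```python
# Step-five check on two shots from the worked example, against TH_BS = 100.
TH_BS = 100.0

def is_candidate(range_rgb):
    # A shot enters CBS if at least one of R_R, R_G, R_B is <= TH_BS.
    return any(r <= TH_BS for r in range_rgb)

print(is_candidate((40.13, 28.46, 19.81)))     # True  -> shot 17 stored in CBS
print(is_candidate((212.30, 216.18, 196.36)))  # False -> shot 20 excluded
```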
Claims (1)
1. A team sports video match shot extraction method based on color features, characterized by the following definitions and steps:
S = {S_1, S_2, …, S_s}: the set of shot video clips after video segmentation;
f: the set of video frames of a shot video clip;
AM_R: the arithmetic mean of the R channel values of all pixels in a video frame;
AM_G: the arithmetic mean of the G channel values of all pixels in a video frame;
AM_B: the arithmetic mean of the B channel values of all pixels in a video frame;
R_R: the range of AM_R within a shot video clip;
R_G: the range of AM_G within a shot video clip;
R_B: the range of AM_B within a shot video clip;
C_max: the cluster containing the most shot video clips after K-means clustering;
TH_BS: the threshold distinguishing candidate reference shots from all other shots;
CBS: the set of candidate reference shot video clips;
SFM: the shot video clip in CBS with the largest number of video frames;
TH_GS: the threshold distinguishing match shots from non-match shots;
S': the set of match shot video clips;
the method comprises the following steps:
Step one: segment the video into shot video clips and store it frame by frame as pictures, obtaining S.
Step two: compute AM_R, AM_G, AM_B for all video frames.
Step three: for each shot video clip, compute the ranges R_R, R_G, R_B of AM_R, AM_G, AM_B over its frames, and its dominant color A_RGB, the arithmetic mean of (AM_R, AM_G, AM_B) over all frames of the clip.
Step four: perform K-means clustering on the A_RGB of all shot video clips and find C_max.
Step five: for every shot video clip in C_max, if at least one of R_R, R_G, R_B is less than or equal to TH_BS, store the clip in CBS.
Step six: select SFM, the shot video clip in CBS with the largest number of video frames.
Step seven: compute the relative difference between the A_RGB of every shot video clip and the A_RGB of SFM; if all three channel differences are less than or equal to TH_GS, store the clip in S'.
Values: TH_BS = 100, TH_GS = 15%.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110204176.9A CN113033308A (en) | 2021-02-24 | 2021-02-24 | Team sports video game lens extraction method based on color features |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113033308A true CN113033308A (en) | 2021-06-25 |
Family
ID=76461212
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110204176.9A Pending CN113033308A (en) | 2021-02-24 | 2021-02-24 | Team sports video game lens extraction method based on color features |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113033308A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040130567A1 (en) * | 2002-08-02 | 2004-07-08 | Ahmet Ekin | Automatic soccer video analysis and summarization |
CN101072305A (en) * | 2007-06-08 | 2007-11-14 | 华为技术有限公司 | Lens classifying method, situation extracting method, abstract generating method and device |
CN101604325A (en) * | 2009-07-17 | 2009-12-16 | 北京邮电大学 | Method for classifying sports video based on key frame of main scene lens |
CN110826491A (en) * | 2019-11-07 | 2020-02-21 | 北京工业大学 | Video key frame detection method based on cascading manual features and depth features |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||