CN113033308A - Team sports video game lens extraction method based on color features - Google Patents

Team sports video game lens extraction method based on color features

Info

Publication number
CN113033308A
CN113033308A (application CN202110204176.9A)
Authority
CN
China
Prior art keywords
video
shot
clip
video clip
clips
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110204176.9A
Other languages
Chinese (zh)
Inventor
毋立芳
袁元
卢哲
简萌
孙泽文
张晔鹏
韩嘉润
万青青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202110204176.9A priority Critical patent/CN113033308A/en
Publication of CN113033308A publication Critical patent/CN113033308A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/49 Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computing Systems (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

A team sports video match shot extraction method based on color features belongs to the field of video data processing. First, the video is preprocessed and automatically segmented shot by shot to structure it. Then the arithmetic means of the r, g and b channels are computed for video frames and shot video clips, the clips are clustered by K-means, a reference shot video clip is selected according to the range of the per-frame r, g and b channel means within each clip, and the team sports video is divided into match shot (far shot and medium shot) video clips and other shot (close-up shot and off-site shot) video clips by comparing the dominant color difference between the reference clip and every clip. The invention extracts the far-shot and medium-shot video segments that occupy most of the content of team sports videos by using the dominant color difference of video frames, which helps reduce the adverse effect of redundant content on video analysis and processing and lays a good structural foundation for further research.

Description

Team sports video game lens extraction method based on color features
Technical Field
The invention belongs to the field of video data processing.
Background
Team sports such as football and basketball are popular and have a broad audience, so research on team sports videos has wide application prospects. Video data is unstructured; converting it into a series of labeled sequences with shot type as a key attribute is the basis for event detection and summary generation. The common approach is to first preprocess the video and automatically segment and structure it shot by shot; on this basis, the shot type is automatically identified through methods such as pattern recognition. In team sports videos, shots can be divided into four types according to shooting angle and distance: far shots, medium shots, close-up shots and off-site shots. The far shot shows the positions of the players on the court and the progress of the game from a bird's-eye view, while the medium shot shows the play of most players on the court more clearly; close-ups are often replays of far and medium shots; off-site footage such as spectators, advertising and LOGOs is redundant with respect to the game itself. In summary, the far and medium shots, which typically occupy the vast majority of the whole game, are the most important match shots. Extracting the match shots of a sports video can reduce the adverse impact of redundant content on video analysis and processing.
Eldib et al. determined an RGB interval for field pixels through experiments, regarded all pixels whose components fall within that interval as field-color pixels, and then set thresholds on the proportion of field pixels to distinguish the various shot types. Junqing et al. used the field-color proportion of a sub-window region within a video frame to reflect the shot type, achieving a certain effect.
Shot classification first requires video segmentation: accurately detecting shot boundaries to complete the segmentation is a prerequisite for effectively classifying shot types. In team sports video, shot boundaries can be divided into abrupt and gradual types. In existing related work, shot segmentation still easily produces false detections, especially for gradual shots. The reason is that team sports videos contain many slow-motion replays, and LOGO wipes frequently appear at the beginning and end of a replay, easily causing wrong judgments by the algorithm, which is very unfavorable for subsequent research. Existing shot classification algorithms cannot compensate for false detections in shot boundary segmentation, and their classification accuracy is reduced by these false detections.
Secondly, the algorithm proposed by Eldib et al. is based on the field color, but the field color changes with venue, weather, lighting and other factors. The algorithm proposed by Junqing et al. depends on the choice of the size and position of the sub-window, so its robustness is poor.
In summary, the recognition accuracy of each shot type in current shot classification methods is clearly affected by the insufficient accuracy of shot boundary detection, so the recognition accuracy of shots is not ideal, and the universality of existing shot classification algorithms is weak.
Disclosure of Invention
The invention aims to provide a team sports video match shot extraction method based on color features, which extracts the far and medium shots that occupy most of the content of team sports videos for subsequent video analysis and research. Based on the fact that the dominant colors of whole video frames differ greatly between shot types, the team sports video is divided into match shot (far shot, medium shot) video clips and other shot (close-up shot, off-site shot) video clips. Experiments prove that the method has good adaptability, high accuracy, practicality and efficiency.
Input: original team sports video
Output: set of match shot video clips
Definitions:
S = {S_1, S_2, …, S_s}: the set of shot video clips obtained after video segmentation;
f: the set of video frames of a shot video clip;
AM_R: the arithmetic mean of the R-channel values of all pixels in a video frame;
AM_G: the arithmetic mean of the G-channel values of all pixels in a video frame;
AM_B: the arithmetic mean of the B-channel values of all pixels in a video frame;
R_R: the range (maximum minus minimum) of AM_R within a shot video clip;
R_G: the range of AM_G within a shot video clip;
R_B: the range of AM_B within a shot video clip;
A_R: the arithmetic mean of AM_R over all video frames in a shot video clip;
A_G: the arithmetic mean of AM_G over all video frames in a shot video clip;
A_B: the arithmetic mean of AM_B over all video frames in a shot video clip;
A_RGB: the array formed by stacking A_R, A_G and A_B in row order;
C_max: the cluster containing the most shot video clips after K-means clustering;
TH_BS: the threshold distinguishing candidate reference shots from all other shots;
CBS: the set of candidate reference shot video clips;
S_FM: the shot video clip in CBS with the largest number of video frames;
A_R^FM: the A_R of shot video clip S_FM;
A_G^FM: the A_G of shot video clip S_FM;
A_B^FM: the A_B of shot video clip S_FM;
E_R: the percent error between A_R and A_R^FM;
E_G: the percent error between A_G and A_G^FM;
E_B: the percent error between A_B and A_B^FM;
TH_GS: the threshold distinguishing match shots from non-match shots;
S': the set of match shot video clips.
Step one: segment the video into shot video clips and store each clip frame by frame as pictures to obtain S.
Step two: compute AM_R, AM_G and AM_B for all video frames.
Step three: compute R_R, R_G, R_B, A_R, A_G, A_B and A_RGB for all shot video clips.
Step four: perform K-means clustering on the A_RGB of all shot video clips and find C_max.
Step five: for every shot video clip in C_max, if at least one of R_R, R_G and R_B is less than or equal to TH_BS, store the clip into CBS.
Step six: find S_FM and its A_R^FM, A_G^FM and A_B^FM.
Step seven: compute E_R, E_G and E_B for all shot video clips; if E_R, E_G and E_B are all less than or equal to TH_GS, store the clip into S'.
Values: TH_BS = 100, TH_GS = 15%.
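Steps five through seven can be sketched as follows. This is a minimal illustration, not the patented implementation: clip records are hypothetical dicts holding each clip's channel ranges, clip means and frame count, the K-means clustering of step four is assumed to have been performed separately, and the thresholds follow the values given above.

```python
# Hedged sketch of steps five-seven: filter the largest K-means cluster by
# channel range, pick a reference shot, keep clips with similar dominant color.

TH_BS = 100    # range threshold for candidate reference shots (step five)
TH_GS = 0.15   # percent-error threshold for match shots (step seven)

def extract_game_shots(cluster_clips, all_clips):
    """cluster_clips: clips in the largest K-means cluster C_max.
    Each clip is a dict: {"id", "ranges": (R_R, R_G, R_B),
    "means": (A_R, A_G, A_B), "n_frames"}. Returns ids of match shots."""
    # Step five: CBS = clips where at least one channel range <= TH_BS
    cbs = [c for c in cluster_clips if min(c["ranges"]) <= TH_BS]
    # Step six: reference shot S_FM = CBS clip with the most video frames
    s_fm = max(cbs, key=lambda c: c["n_frames"])
    # Step seven: keep clips whose per-channel percent error vs. S_FM <= TH_GS
    game = []
    for c in all_clips:
        errs = [abs(m - r) / r for m, r in zip(c["means"], s_fm["means"])]
        if all(e <= TH_GS for e in errs):
            game.append(c["id"])
    return game
```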
Using the mean attribute of the ImageStat module in the Python image-processing library PIL, the averages AM_R, AM_G and AM_B of the three channel pixel values of each frame in a single shot video clip can be calculated.
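As a sketch of this computation, assuming the video frames have been saved as image files (the helper name is hypothetical):

```python
# Per-frame channel means AM_R, AM_G, AM_B via PIL's ImageStat module.
from PIL import Image, ImageStat

def frame_channel_means(path):
    """Return (AM_R, AM_G, AM_B) for one video frame stored as an image."""
    img = Image.open(path).convert("RGB")
    # ImageStat.Stat(...).mean gives the arithmetic mean of each band
    am_r, am_g, am_b = ImageStat.Stat(img).mean
    return am_r, am_g, am_b
```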
The formulas are as follows:
AM_R = (1/n) Σ r_i,  AM_G = (1/n) Σ g_i,  AM_B = (1/n) Σ b_i
where r_i, g_i, b_i are the RGB pixel values of the i-th pixel in the video frame and n is the number of pixels.
A_R = (1/N) Σ AM_R,  A_G = (1/N) Σ AM_G,  A_B = (1/N) Σ AM_B
where N is the number of video frames contained in a single shot video clip.
R_R = max(AM_R) − min(AM_R),  R_G = max(AM_G) − min(AM_G),  R_B = max(AM_B) − min(AM_B)
E_R = |A_R − A_R^FM| / A_R^FM × 100%, and similarly for E_G and E_B.
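The clip-level statistics above can be sketched as follows. This is an illustrative implementation under the formulas just given, not the patented code; the function names are hypothetical.

```python
# Clip-level statistics from per-frame channel means.

def clip_stats(frame_means):
    """frame_means: list of (AM_R, AM_G, AM_B) tuples for one shot clip.
    Returns ranges (R_R, R_G, R_B) and clip means (A_R, A_G, A_B)."""
    channels = list(zip(*frame_means))                    # group values per channel
    ranges = tuple(max(c) - min(c) for c in channels)     # R_C = max - min
    means = tuple(sum(c) / len(c) for c in channels)      # A_C = (1/N) * sum
    return ranges, means

def percent_errors(clip_means, ref_means):
    """E_R, E_G, E_B: relative difference of a clip's means vs. the
    reference shot S_FM (as fractions; multiply by 100 for percent)."""
    return tuple(abs(m - r) / r for m, r in zip(clip_means, ref_means))
```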
The invention uses color features to extract the match shots of team sports video: the dominant color difference of whole video frames is used to eliminate close-up and off-site shots, keeping the main match shots. Existing related work usually concentrates on classifying shots by the playing field; the present method is not limited by this conventional approach and provides a new idea. First, the method is not restricted by field factors and is applicable to multiple team sports. At the same time, it compensates well for partial false detections of shot boundary detection: if a non-match shot is wrongly segmented, the whole shot can still be identified and removed without affecting subsequent video research and analysis. The method therefore extracts the main match shot portion of a whole team sports video and lays a good structural foundation for subsequent semantic event detection and summary generation.
To test the effectiveness of the method, 25 videos totaling 112 minutes were extracted from the public YouTube-8M dataset to form the SportKF dataset shown in FIG. 2, covering four sports (basketball, football, American football and ice hockey) and containing 197,878 frames and 572 shots.
The method was applied to this dataset and compared on 5 football videos with the method proposed by Junqing et al. in terms of recall and precision, computed as follows:
Recall = (number of detections − number of false detections) / number of shots to be detected
Precision = (number of detections − number of false detections) / number of detections
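These two formulas can be checked with a short arithmetic sketch (function names are hypothetical):

```python
# Recall and precision as defined above, from hypothetical detection counts.

def recall(detections, false_detections, to_detect):
    """Recall = (detections - false detections) / shots to be detected."""
    return (detections - false_detections) / to_detect

def precision(detections, false_detections):
    """Precision = (detections - false detections) / detections."""
    return (detections - false_detections) / detections
```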
For the extracted match shots, the recall over the four team sports is 100% and the precision is 86.67%. On football videos, the recall of the method proposed by Junqing et al. is 93.2%, while the recall of the present method reaches 100%; the precision of the method proposed by Junqing et al. is 88.7%, while the precision of the present method reaches 93.4%.
The experimental results show that the method performs well on match shot detection and recognition, improving both recall and precision over the representative method proposed by Junqing et al.
Drawings
FIG. 1 is a flow chart
FIG. 2 shows video covers of the SportKF dataset
FIG. 3 shows an example far shot
FIG. 4 shows an example close-up shot
FIG. 5 shows an example LOGO shot
Detailed Description
The invention provides a team sports video match shot extraction method based on color features. The specific implementation steps are as follows:
Take an NBA game video with a duration of 7 minutes, a frame width of 640, a frame height of 360 and a frame rate of 30 frames/second as an example. The far shot is shown in FIG. 3, the close-up shot in FIG. 4 and the LOGO shot in FIG. 5 (the specific values below are rounded to 2 decimal places).
Step one: segment the video into shot video clips to obtain S = {S_1, S_2, …, S_33} (33 shots in total).
Step two: compute AM_R, AM_G and AM_B for all video frames. For FIG. 3: AM_R = 100.50, AM_G = 80.34, AM_B = 74.88; for FIG. 4: AM_R = 93.04, AM_G = 76.01, AM_B = 71.23; for FIG. 5: AM_R = 130.07, AM_G = 112.26, AM_B = 79.29.
Step three: compute R_R, R_G, R_B, A_R, A_G, A_B and A_RGB for all shot video clips.
For the shot in FIG. 3 (No. 17): R_R = 37.22, R_G = 35.56, R_B = 40.65; A_RGB = (110.05, 90.39, 79.10).
For the shot in FIG. 4 (No. 3): R_R = 40.13, R_G = 28.46, R_B = 19.81; A_RGB = (96.27, 77.20, 65.88).
For the shot in FIG. 5 (No. 12): R_R = 13.84, R_G = 12.63, R_B = 1.80; A_RGB = (105.70, 90.43, 74.45).
Step four: perform K-means clustering on the A_RGB of all shot video clips and find C_max. In this example C_max comprises the 9 shots numbered 4, 6, 9, 15, 17, 20, 21, 27 and 31.
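For illustration, step four can be sketched with a minimal pure-Python K-means. This is not the patented implementation: the initialization is naive (the first k points) and is shown only to demonstrate clustering A_RGB vectors and finding the largest cluster C_max; a production version would use k-means++ and multiple restarts.

```python
# Minimal K-means over A_RGB vectors, returning the largest cluster's indices.
import math

def kmeans_largest_cluster(points, k, iters=50):
    """points: list of (A_R, A_G, A_B) tuples. Returns the indices of the
    clips belonging to the largest cluster after K-means converges."""
    centers = points[:k]  # naive initialization for illustration only
    for _ in range(iters):
        # assignment step: each point goes to its nearest center
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # update step: each center becomes the mean of its members
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centers.append(tuple(sum(x) / len(members)
                                         for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    biggest = max(range(k), key=labels.count)
    return [i for i, l in enumerate(labels) if l == biggest]
```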
Step five: for every shot video clip in C_max, if at least one of R_R, R_G and R_B is less than or equal to TH_BS, store the clip into CBS. In this example TH_BS is set to 100. The shot numbered 17 in C_max, with R_R = 40.13, R_G = 28.46 and R_B = 19.81, satisfies the condition and is stored into CBS; in contrast, the shot numbered 20, with R_R = 212.30, R_G = 216.18 and R_B = 196.36, does not satisfy the condition and is not stored. The resulting CBS in this example contains the 8 shots numbered 4, 6, 9, 15, 17, 21, 27 and 31.
Step six: find S_FM and its A_R^FM, A_G^FM and A_B^FM. In this example, the shot with the largest number of frames among the 8 shots in CBS is the shot numbered 31, containing 1895 frames, so its clip means are used as A_R^FM, A_G^FM and A_B^FM.
Step seven: compute E_R, E_G and E_B for all shot video clips; if E_R, E_G and E_B are all less than or equal to TH_GS, store the clip into S'. In this example TH_GS is set to 15%. For the shot in FIG. 3 (No. 17), E_R, E_G and E_B are all less than or equal to TH_GS, so it is stored into S'; in contrast, for the shot in FIG. 4 (No. 3), the condition is not satisfied, so it is not stored. The final set S' of match shots in this example contains the 11 shots numbered 2, 4, 6, 9, 13, 15, 17, 21, 25, 27 and 31, all of which are match shots, with no missed detections.

Claims (1)

1. A team sports video match shot extraction method based on color features, characterized by comprising the following steps, with the definitions:
S = {S_1, S_2, …, S_s}: the set of shot video clips obtained after video segmentation;
f: the set of video frames of a shot video clip;
AM_R: the arithmetic mean of the R-channel values of all pixels in a video frame;
AM_G: the arithmetic mean of the G-channel values of all pixels in a video frame;
AM_B: the arithmetic mean of the B-channel values of all pixels in a video frame;
R_R: the range (maximum minus minimum) of AM_R within a shot video clip;
R_G: the range of AM_G within a shot video clip;
R_B: the range of AM_B within a shot video clip;
A_R: the arithmetic mean of AM_R over all video frames in a shot video clip;
A_G: the arithmetic mean of AM_G over all video frames in a shot video clip;
A_B: the arithmetic mean of AM_B over all video frames in a shot video clip;
A_RGB: the array formed by stacking A_R, A_G and A_B in row order;
C_max: the cluster containing the most shot video clips after K-means clustering;
TH_BS: the threshold distinguishing candidate reference shots from all other shots;
CBS: the set of candidate reference shot video clips;
S_FM: the shot video clip in CBS with the largest number of video frames;
A_R^FM: the A_R of shot video clip S_FM;
A_G^FM: the A_G of shot video clip S_FM;
A_B^FM: the A_B of shot video clip S_FM;
E_R: the percent error between A_R and A_R^FM;
E_G: the percent error between A_G and A_G^FM;
E_B: the percent error between A_B and A_B^FM;
TH_GS: the threshold distinguishing match shots from non-match shots;
S': the set of match shot video clips;
Step one: segment the video into shot video clips and store each clip frame by frame as pictures to obtain S;
Step two: compute AM_R, AM_G and AM_B for all video frames;
Step three: compute R_R, R_G, R_B, A_R, A_G, A_B and A_RGB for all shot video clips;
Step four: perform K-means clustering on the A_RGB of all shot video clips and find C_max;
Step five: for every shot video clip in C_max, if at least one of R_R, R_G and R_B is less than or equal to TH_BS, store the clip into CBS;
Step six: find S_FM and its A_R^FM, A_G^FM and A_B^FM;
Step seven: compute E_R, E_G and E_B for all shot video clips; if E_R, E_G and E_B are all less than or equal to TH_GS, store the clip into S';
Values: TH_BS = 100, TH_GS = 15%.
CN202110204176.9A 2021-02-24 2021-02-24 Team sports video game lens extraction method based on color features Pending CN113033308A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110204176.9A CN113033308A (en) 2021-02-24 2021-02-24 Team sports video game lens extraction method based on color features


Publications (1)

Publication Number Publication Date
CN113033308A true CN113033308A (en) 2021-06-25

Family

ID=76461212

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110204176.9A Pending CN113033308A (en) 2021-02-24 2021-02-24 Team sports video game lens extraction method based on color features

Country Status (1)

Country Link
CN (1) CN113033308A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040130567A1 (en) * 2002-08-02 2004-07-08 Ahmet Ekin Automatic soccer video analysis and summarization
CN101072305A (en) * 2007-06-08 2007-11-14 华为技术有限公司 Lens classifying method, situation extracting method, abstract generating method and device
CN101604325A (en) * 2009-07-17 2009-12-16 北京邮电大学 Method for classifying sports video based on key frame of main scene lens
CN110826491A (en) * 2019-11-07 2020-02-21 北京工业大学 Video key frame detection method based on cascading manual features and depth features



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination