CN111709324A - News video strip splitting method based on space-time consistency - Google Patents

News video strip splitting method based on space-time consistency

Info

Publication number
CN111709324A
CN111709324A (application CN202010473634.4A)
Authority
CN
China
Prior art keywords
news video
news
video
cutting
cut
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010473634.4A
Other languages
Chinese (zh)
Inventor
周凡 (Zhou Fan)
张富为 (Zhang Fuwei)
王若梅 (Wang Ruomei)
林格 (Lin Ge)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN202010473634.4A priority Critical patent/CN111709324A/en
Publication of CN111709324A publication Critical patent/CN111709324A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/49 Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/48 Matching video sequences
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Human Computer Interaction (AREA)
  • Computing Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Television Signal Processing For Recording (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a news video strip splitting method based on space-time consistency, i.e. a method for segmenting a news programme into individual news items. First, one news video is labeled as the reference news video. The news video to be split is then placed in spatiotemporal-consistency correspondence with the reference news video, and the resulting pre-cut points and pre-cut frames are saved. Next, face detection is used to delete the pre-cut points and frames that belong to the opening and ending parts of the news video, yielding accurate cut points and their corresponding cut frames. Finally, the news video is cut at the accurate cut points and thereby split into individual items. Because the splitting is carried out with a spatiotemporal-consistency algorithm, the current news-video splitting workflow is simplified and the shortage of labeled news-video data is alleviated. Since only a single video needs to be labeled manually, repeated labor is reduced, and both the accuracy and the efficiency of news video strip splitting are improved.

Description

News video strip splitting method based on space-time consistency
Technical Field
The invention relates to the technical fields of video processing, video content structuring, and video retrieval, and in particular to a news video strip splitting method based on spatiotemporal consistency.
Background
With the vigorous development of multimedia technology, video has become the main form of news media. In today's fast-paced life, quickly acquiring news has become a habit, so for traditional news video, how to let viewers quickly obtain the key information in the news is an urgent problem for the news industry.
News strip-splitting technology has therefore emerged, and a variety of approaches have been developed. Summarizing the prior art, current strip-splitting methods fall into two main categories: rule-based methods and semantics-based methods. The rule-based news strip splitting method is a bottom-up approach: features of the news video are selected, a classifier is trained on them, and the corresponding splitting result is generated. The semantics-based method is a top-down approach: high-level semantic features of the news video are analyzed and then mapped down to low-level features for splitting. The common drawback of both is that they require labeled data, which is also a key problem that current video-processing technology urgently needs to solve.
One existing technique is a news strip-splitting algorithm based on audio and video features. It extracts and analyzes the basic visual and audio features of a news video, namely the anchor (presenter) feature and the silent audio-segment feature: the anchor feature is extracted through face recognition, the silence feature is extracted using short-time energy and the zero-crossing rate and filtered by conditions, and the two features are combined to complete the strip splitting. Its drawback is that only visual and audio features are used while text features are ignored, which lowers the splitting accuracy, and the processing pipeline is complex.
The second existing technique is an automatic news strip-splitting method for monitoring massive broadcast-television content. Broadcast data are initialized to automatically obtain the audio waveforms and video images of news programmes; audio and video features are extracted, including anchor detection, subtitle detection and tracking, and voice detection; visual and speech candidate points of news-item boundaries are obtained through heuristic rules; the boundaries are located by audio-video fusion; and after the result is checked manually, it is entered into a knowledge base as a knowledge resource supporting supervision requirements. Its drawback is that the strip-splitting procedure is too complicated, which greatly reduces the efficiency of video splitting.
A third existing technique discloses a news video programme segmentation method, a news video cataloguing method, and a system. It detects feature information such as the programme opening, news titles, presenter features, shot changes, silent points in the audio, switching points, and abrupt changes in pitch period, and arranges the detection results in time order to obtain an event sequence. A preset symbol set and production rules are then applied to reduce the event sequence and to estimate the rough positions of the start and end points of each news segment. Finally, the joint posterior probability of segment start positions near each rough position is computed from the event sequence, the moment with the maximum posterior probability is selected as the accurate start position, and the news video is segmented into news segments accordingly. Although this method jointly detects many kinds of feature information, the complexity of extracting all these features greatly reduces the efficiency of video splitting.
Disclosure of Invention
The invention aims to overcome the shortcomings of the existing methods and provides a news video strip splitting method based on space-time consistency. The method addresses two main problems: (1) in existing strip-splitting techniques, feature extraction is difficult and feature combination is cumbersome, which makes the splitting result inaccurate and the splitting process time-consuming; (2) the lack of large amounts of labeled data, which is one of the problems the video-processing field urgently needs to solve.
In order to solve the above problems, the present invention provides a news video strip splitting method based on space-time consistency, wherein the method comprises:
labeling a randomly selected news video to obtain a reference news video;
performing spatiotemporal-consistency correspondence between the news video to be split and the reference news video, namely frame-by-frame similarity matching with double-threshold detection, wherein two frames are judged similar when their similarity is greater than a set threshold A, and the obtained pre-cut points and pre-cut frames are saved when the similarity is greater than a set threshold B (B > A);
using face detection to delete the pre-cut points and frames that belong to the opening and ending parts of the news video, so as to obtain accurate cut points and corresponding cut frames;
and cutting the news video at the accurate cut points, processing keywords with speech recognition and a word-segmentation tool, saving the cut points together with the corresponding speech transcripts, and thereby splitting the news video into strips.
Preferably, using face detection to delete the pre-cut points and frames that belong to the opening and ending parts of the news video, so as to obtain accurate cut points and corresponding cut frames, specifically comprises:
exploiting the fact that the opening and ending pictures of a news video are highly similar, face detection is used for the deletion: when a frame containing exactly two faces appears for the first time, the previously saved data are cleared; when a frame containing exactly two faces appears for the second time, saving stops and the data saved afterwards are discarded, so that the accurate cut points and corresponding cut frames are obtained.
With the news video strip splitting method based on space-time consistency provided by the invention, the splitting is performed by a spatiotemporal-consistency algorithm, which simplifies the current news-video splitting workflow and alleviates the shortage of labeled news-video data. Since only a single video needs to be labeled manually, repeated labor is reduced, and both the accuracy and the efficiency of news video strip splitting are improved.
Drawings
FIG. 1 is a general flow chart of a news video strip splitting method based on spatiotemporal consistency according to an embodiment of the present invention;
FIG. 2 is a flow chart of a spatiotemporal consistency algorithm of an embodiment of the present invention;
FIG. 3 is a flowchart of obtaining accurate cut points and cut frames using face detection according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
FIG. 1 is a general flowchart of the news video strip splitting method based on spatiotemporal consistency according to an embodiment of the present invention. As shown in FIG. 1, the method includes:
s1, labeling a randomly selected news video to obtain a reference system news video;
s2, carrying out space-time consistency correspondence on the news video to be split and the reference system news video, namely carrying out similarity matching and double-threshold detection frame by frame, wherein the similarity is similar when the similarity is greater than a set threshold A, and the obtained pre-cutting point and pre-cutting graph are stored when the similarity is greater than a set threshold B (B > A);
s3, deleting the pre-cut points and pictures belonging to the beginning and ending parts of the news video in the pre-cut graph by using face detection to obtain accurate cut points and corresponding cut graphs;
S4, cutting the news video at the accurate cut points, processing keywords with speech recognition and a word-segmentation tool, saving the cut points together with the corresponding speech transcripts, and thereby completing the strip splitting of the news video.
Step S2, as shown in fig. 2, is as follows:
the reference system news video output by the S1 is recorded as v0, then the news video to be split is selected and recorded as v1, then space-time consistency correspondence is carried out, namely v0 and v1 are synchronously read frame by frame, similarity matching and double-threshold detection are carried out by adopting a difference value hash algorithm, similarity is similar when the similarity is greater than a set threshold value of 0.6, and a pre-cut graph and a cut point are stored when the similarity is greater than a set threshold value of 0.8.
Step S3, as shown in fig. 3, is as follows:
Exploiting the fact that the opening and ending pictures of the news video are highly similar, the dlib face-detection algorithm is used for the deletion: when a frame containing exactly two faces appears for the first time, the previously saved data are cleared; when a frame containing exactly two faces appears for the second time, saving stops and the data saved afterwards are discarded, yielding the accurate cut points and corresponding cut frames.
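As an illustration of this filtering step, the sketch below walks through the saved pre-cut frames with dlib's frontal face detector. Clearing the list at the first two-face frame and stopping at the second follows the description above; the helper name and the assumed data layout (a list of (index, frame) pairs) are choices made only for this sketch.

```python
import cv2
import dlib

detector = dlib.get_frontal_face_detector()


def refine_cut_points(pre_cut_points):
    """Drop pre-cut points that belong to the opening and ending of the programme.

    pre_cut_points is assumed to be the list of (frame_index, frame) pairs
    produced by the spatiotemporal-consistency step."""
    refined = []
    two_face_count = 0
    for idx, frame in pre_cut_points:
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)   # dlib works on RGB images
        faces = detector(rgb, 1)                       # upsample once for small faces
        if len(faces) == 2:
            two_face_count += 1
            if two_face_count == 1:
                refined.clear()        # discard everything saved before the opening
            elif two_face_count == 2:
                break                  # remaining points belong to the ending part
            continue
        refined.append((idx, frame))
    return refined
```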
Step S4 is specifically as follows:
The news video is cut at the cut points with FFmpeg, speech recognition is performed with the iFLYTEK API, keywords are processed with the jieba word-segmentation tool, and the cut points are saved together with the corresponding speech transcripts, completing the strip splitting of the news video.
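A condensed sketch of step S4 follows: cutting one news item with FFmpeg and extracting keywords from its transcript with jieba. The FFmpeg flags and the jieba call are standard; the speech-recognition step itself (the embodiment uses the iFLYTEK API) is left as a placeholder rather than an invented API call, and all function names here are illustrative.

```python
import subprocess

import jieba.analyse


def cut_segment(src_video, start_sec, end_sec, out_path):
    """Cut one news item [start_sec, end_sec] out of the full programme without re-encoding."""
    subprocess.run(
        ["ffmpeg", "-y", "-i", src_video,
         "-ss", str(start_sec), "-to", str(end_sec),
         "-c", "copy", out_path],
        check=True,
    )


def extract_keywords(transcript, top_k=5):
    """Pick the top TF-IDF keywords from a Chinese transcript with jieba."""
    return jieba.analyse.extract_tags(transcript, topK=top_k)


# Usage: for each pair of adjacent accurate cut points (t0, t1), cut the segment,
# send its audio to a speech-recognition service to obtain the transcript, then
# store (t0, t1) together with extract_keywords(transcript).
```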
With the news video strip splitting method based on space-time consistency provided by the invention, the splitting is performed by a spatiotemporal-consistency algorithm, which simplifies the current news-video splitting workflow and alleviates the shortage of labeled news-video data. Since only a single video needs to be labeled manually, repeated labor is reduced, and both the accuracy and the efficiency of news video strip splitting are improved.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium, and the storage medium may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic or optical disk, or the like.
In addition, the news video strip splitting method based on spatiotemporal consistency has been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the embodiments is only intended to help in understanding the method and its core idea. Meanwhile, a person skilled in the art may, following the idea of the invention, make changes to the specific embodiments and the application scope. In summary, the content of this specification should not be construed as limiting the invention.

Claims (2)

1. A news video strip splitting method based on space-time consistency, characterized by comprising the following steps:
labeling a randomly selected news video to obtain a reference news video;
performing spatiotemporal-consistency correspondence between the news video to be split and the reference news video, namely frame-by-frame similarity matching with double-threshold detection, wherein two frames are judged similar when their similarity is greater than a set threshold A, and the obtained pre-cut points and pre-cut frames are saved when the similarity is greater than a set threshold B (B > A);
using face detection to delete the pre-cut points and frames that belong to the opening and ending parts of the news video, so as to obtain accurate cut points and corresponding cut frames;
and cutting the news video at the accurate cut points, processing keywords with speech recognition and a word-segmentation tool, saving the cut points together with the corresponding speech transcripts, and thereby splitting the news video into strips.
2. The news video strip splitting method based on space-time consistency according to claim 1, wherein using face detection to delete the pre-cut points and frames that belong to the opening and ending parts of the news video, so as to obtain accurate cut points and corresponding cut frames, specifically comprises:
exploiting the fact that the opening and ending pictures of a news video are highly similar, face detection is used for the deletion: when a frame containing exactly two faces appears for the first time, the previously saved data are cleared; when a frame containing exactly two faces appears for the second time, saving stops and the data saved afterwards are discarded, so that the accurate cut points and corresponding cut frames are obtained.
CN202010473634.4A 2020-05-29 2020-05-29 News video strip splitting method based on space-time consistency Pending CN111709324A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010473634.4A CN111709324A (en) 2020-05-29 2020-05-29 News video strip splitting method based on space-time consistency

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010473634.4A CN111709324A (en) 2020-05-29 2020-05-29 News video strip splitting method based on space-time consistency

Publications (1)

Publication Number Publication Date
CN111709324A true CN111709324A (en) 2020-09-25

Family

ID=72538247

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010473634.4A Pending CN111709324A (en) 2020-05-29 2020-05-29 News video strip splitting method based on space-time consistency

Country Status (1)

Country Link
CN (1) CN111709324A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101616264A (en) * 2008-06-27 2009-12-30 中国科学院自动化研究所 News video categorization and system
CN106162223A (en) * 2016-05-27 2016-11-23 北京奇虎科技有限公司 A kind of news video cutting method and device
CN110267061A (en) * 2019-04-30 2019-09-20 新华智云科技有限公司 A kind of news demolition method and system
CN110881115A (en) * 2019-12-24 2020-03-13 新华智云科技有限公司 Strip splitting method and system for conference video

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112565820A (en) * 2020-12-24 2021-03-26 新奥特(北京)视频技术有限公司 Video news splitting method and device
CN112565820B (en) * 2020-12-24 2023-03-28 新奥特(北京)视频技术有限公司 Video news splitting method and device
CN113807085A (en) * 2021-11-19 2021-12-17 成都索贝数码科技股份有限公司 Method for extracting title and subtitle aiming at news scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200925)