CN113660541A - News video abstract generation method and device - Google Patents

Publication number
CN113660541A
Authority
CN
China
Prior art keywords
news
abstract
original
candidate
text
Legal status
Granted
Application number
CN202110808406.2A
Other languages
Chinese (zh)
Other versions
CN113660541B (en)
Inventor
张记袁
郑烨翰
蔡远俊
彭卫华
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202110808406.2A
Publication of CN113660541A
Application granted
Publication of CN113660541B
Current legal status: Active

Classifications

    • H04N21/8549 Creating video summaries, e.g. movie trailer
    • G10L15/26 Speech to text systems
    • H04N21/232 Content retrieval operation locally within server, e.g. reading video streams from disk arrays
    • H04N21/2335 Processing of audio elementary streams involving reformatting operations of audio signals, e.g. by converting from one coding standard to another
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23418 Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N21/234336 Processing of video elementary streams involving reformatting operations of video signals by media transcoding, e.g. video is transformed into a slideshow of still pictures or audio is converted into text
    • H04N21/8126 Monomedia components thereof involving additional data, e.g. news, sports, stocks, weather forecasts
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The disclosure provides a method and device for generating an abstract of a news video, relating to the field of computer technology and in particular to knowledge graphs, deep learning, computer vision and speech technology. The specific implementation scheme is as follows: acquiring a news text library and a news video for which an abstract is to be generated; identifying a title of the news video to obtain an original title and/or extracting an abstract of the news video to obtain an original abstract; retrieving at least one candidate news text from the news text library according to the original title and/or the original abstract; determining target news from the at least one candidate news text according to the similarity between the original abstract and/or the original title and the at least one candidate news text; and generating an abstract of the news video according to the target news. The method and device improve the accuracy of news video abstract generation.

Description

News video abstract generation method and device
Technical Field
The present disclosure relates to the field of computer technology, and more particularly to the fields of knowledge graphs, deep learning, computer vision, and speech technology.
Background
The media industry now catalogs a large number of news video programs for easy retrieval and management, and generating news program summaries is an important step in that cataloging.
In the related art, the summary of a news program video is generally obtained by directly recognizing the video with Automatic Speech Recognition (ASR) technology, or by recognizing it with an improved ASR method.
Disclosure of Invention
The disclosure provides a method and an apparatus for generating an abstract of a news video, an electronic device, and a storage medium.
According to a first aspect of the present disclosure, there is provided a method for generating a summary of a news video, including:
acquiring a news text library and a news video for which an abstract is to be generated;
identifying a title of the news video to obtain an original title and/or extracting an abstract of the news video to obtain an original abstract;
retrieving at least one candidate news text from the news text library according to the original title and/or the original abstract;
determining target news from the at least one candidate news text according to the similarity between the original abstract and/or the original title and the at least one candidate news text;
and generating an abstract of the news video according to the target news.
According to a second aspect of the present disclosure, there is provided an apparatus for generating an abstract of a news video, including:
an acquisition module, configured to acquire a news text library and a news video for which an abstract is to be generated;
a feature extraction module, configured to identify a title of the news video to obtain an original title and/or extract an abstract of the news video to obtain an original abstract;
a retrieval module, configured to retrieve at least one candidate news text from the news text library according to the original title and/or the original abstract;
a screening module, configured to determine target news from the at least one candidate news text according to the similarity between the original abstract and/or the original title and the at least one candidate news text;
and an abstract generating module, configured to generate an abstract of the news video according to the target news.
According to a third aspect of the present disclosure, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the first aspect of the disclosure.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of the first aspect of the present disclosure.
According to a fifth aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method of the first aspect of the present disclosure.
The method, device and storage medium provided by the embodiments of the present disclosure identify the title of the news video to obtain an original title and/or extract an abstract of the news video to obtain an original abstract, retrieve candidate news texts from a news text library according to the original title and/or the original abstract, determine the target news by screening according to the similarity between the original abstract and/or the original title and the candidate news texts, and generate the abstract of the news video according to the target news. Retrieving from the news text library and screening by similarity ensures that the target news and the news video report the same news event. Because the abstract of the news video is then generated from the target news, the accuracy of abstract generation is improved, news videos can be catalogued more efficiently, and the cost is reduced.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram according to a first embodiment of the present disclosure;
FIG. 2 is a schematic diagram according to a second embodiment of the present disclosure;
FIG. 3 is another schematic diagram according to a second embodiment of the present disclosure;
FIG. 4 is another schematic diagram according to a second embodiment of the present disclosure;
FIG. 5 is a schematic diagram according to a third embodiment of the present disclosure;
FIG. 6 is a schematic diagram according to a fourth embodiment of the present disclosure;
FIG. 7 is a block diagram of an electronic device for implementing a method of summary generation of a news video according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the modern media industry, new types of programs appear continuously, so a large number of programs need to be catalogued to facilitate management, development and application. For news video programs, generating a summary of the news video is an important step in cataloging.
In the related art, the summary of a news program video is generally obtained by directly recognizing the video with automatic speech recognition (ASR), or by recognizing it with an improved ASR method. However, ASR is prone to errors in keyword recognition. News is also highly time-sensitive and new terms appear continuously, so it is difficult to optimize ASR quickly enough to recognize them accurately. In addition, improving ASR to raise recognition accuracy is costly.
Based on this, an embodiment of the present disclosure provides a method for generating a summary of a news video, which can be executed by various electronic devices with data processing capabilities. Referring to the flowchart shown in FIG. 1, the method mainly includes, without limitation, the following steps S101 to S105:
Step S101, a news text library and a news video for which an abstract is to be generated are obtained.
The news text library is a database storing a large amount of news text data, and the news video for which an abstract is to be generated is a news program video including images and audio.
In one embodiment, the news text library is built on Elasticsearch (ES). ES is a distributed, highly scalable, near-real-time search and data analysis engine that makes large amounts of data searchable, analyzable and explorable.
It should be noted that, in order to cover known news as comprehensively as possible, a large amount of news needs to be acquired, and the news text data may come from both mainstream media and self-media. The news text data can be obtained in many ways; optionally, news is crawled from mainstream news websites on the internet. Further, data preprocessing may be performed on the news text data.
It should be noted that, because news is time-sensitive, news text data can be acquired periodically and the news text library updated accordingly.
Step S102, a title of the news video is identified to obtain an original title and/or an abstract of the news video is extracted to obtain an original abstract.
Here, identification means automatically locating and recognizing target data in the input data.
Optionally, Optical Character Recognition (OCR) is performed on the frames of the news video, and the original title is identified from the OCR output.
Optionally, Automatic Speech Recognition (ASR) is performed on the audio of the news video, and the original title is identified from the ASR output.
It should be noted that different recognition rules can be set according to the characteristics of different news programs, so as to improve the accuracy of original title recognition.
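As an illustration of what such a recognition rule might look like, the sketch below picks the OCR line that persists across the largest share of sampled frames, on the assumption that an on-screen headline stays visible longer than tickers or timestamps. This rule, the function name `pick_title` and the parameter `min_fraction` are hypothetical and are not taken from the disclosure.

```python
from collections import Counter


def pick_title(ocr_frames, min_fraction=0.5):
    """Return the OCR line seen in the largest share of frames, or None.

    ocr_frames: one list of recognized text lines per sampled frame.
    min_fraction: hypothetical rule parameter; a line must appear in at
    least this share of frames to be accepted as the title.
    """
    if not ocr_frames:
        return None
    counts = Counter()
    for lines in ocr_frames:
        # Use a set so a line repeated inside one frame is counted once.
        counts.update({line.strip() for line in lines if line.strip()})
    if not counts:
        return None
    text, seen_in = counts.most_common(1)[0]
    return text if seen_in / len(ocr_frames) >= min_fraction else None
```

A different program layout would call for a different rule, for example restricting the search to a fixed region of the frame.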
Extraction means selecting target data from the input data, where the target data is a part of the input data.
Optionally, OCR is performed on the frames of the news video, and the original abstract is extracted from the OCR output.
Optionally, ASR is performed on the audio of the news video, and the original abstract is extracted from the ASR output.
It will also be appreciated that different extraction rules may be set according to the characteristics of different news programs to improve the accuracy of the original summary extraction.
Step S103, at least one candidate news text is retrieved from the news text library according to the original title and/or the original abstract.
In a first possible implementation, at least one candidate news text is retrieved from the news text library according to the original title.
Optionally, word segmentation and keyword extraction may be performed on the original title, and at least one candidate news text is retrieved from the news text library using the keywords.
Optionally, the original title may be submitted directly to the news text library as a query to obtain at least one candidate news text.
In a second possible implementation, at least one candidate news text is retrieved from the news text library according to the original abstract.
Optionally, word segmentation and keyword extraction may be performed on the original abstract, and at least one candidate news text is retrieved from the news text library using the keywords.
Optionally, the original abstract may be submitted directly to the news text library as a query to obtain at least one candidate news text.
In a third possible implementation, at least one candidate news text is retrieved from the news text library according to the original title and the original abstract.
Optionally, word segmentation and keyword extraction may be performed on the original title and the original abstract, and at least one candidate news text is retrieved from the news text library using the keywords.
Optionally, the original title and the original abstract may be submitted directly to the news text library as a query to obtain at least one candidate news text.
Further, the retrieved news texts can be ranked by weight or similarity, and those within a certain rank or above a certain score are selected as the candidate news texts.
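The keyword-based retrieval and ranking described in step S103 can be sketched as follows. This is a minimal in-memory stand-in for the search engine, assuming each news text is a dict with `title` and `body` fields; the whitespace tokenizer, the overlap-count score and the names `retrieve_candidates`, `top_k` and `min_score` are illustrative assumptions rather than the disclosed implementation.

```python
PUNCTUATION = ".,;:!?\"'()"


def tokenize(text):
    """Lowercase whitespace tokenizer; a real system would use a word segmenter."""
    tokens = (t.strip(PUNCTUATION).lower() for t in text.split())
    return [t for t in tokens if t]


def retrieve_candidates(query, library, top_k=3, min_score=1):
    """Rank news texts by the number of query keywords they share.

    Keeps at most top_k texts whose score is at least min_score.
    """
    query_terms = set(tokenize(query))
    scored = []
    for doc in library:
        doc_terms = set(tokenize(doc["title"] + " " + doc["body"]))
        score = len(query_terms & doc_terms)
        if score >= min_score:
            scored.append((score, doc))
    # Stable sort: ties keep their original library order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]
```

Selecting by `top_k` corresponds to keeping a certain number of texts, and `min_score` to keeping texts above a certain score.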
Step S104, target news is determined from the at least one candidate news text according to the similarity between the original abstract and/or the original title and the at least one candidate news text.
The similarity characterizes how similar the original abstract and/or the original title are to a candidate news text, and may be semantic. That is, the similarity may be calculated with a semantics-based algorithm.
The target news is then determined from the candidate news texts based on semantic similarity.
Optionally, all the calculated similarities are ranked, and the semantically most similar candidate news text is selected as the target news.
Optionally, a candidate news text whose calculated similarity exceeds a preset threshold is used as the target news.
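The two selection options in step S104 can be sketched as a single helper, assuming the similarities have already been computed and are given as a mapping from a candidate identifier to a score. The function name `select_target` and its return shape (a list of identifiers) are assumptions made for illustration.

```python
def select_target(similarities, threshold=None):
    """Choose target news from precomputed similarity scores.

    similarities: dict mapping candidate id -> similarity score.
    threshold: if None, return the single most similar candidate;
               otherwise return every candidate scoring above it.
    """
    if not similarities:
        return []
    if threshold is None:
        return [max(similarities, key=similarities.get)]
    return [cid for cid, score in similarities.items() if score > threshold]
```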
Step S105, an abstract of the news video is generated according to the target news.
Optionally, an abstract is extracted from the target news, and the extracted abstract is used as the abstract of the news video.
Optionally, if the acquired news text data already contains a summary attribute, the summary of the target news is used as the abstract of the news video.
Optionally, the original abstract of the news video is compared against the target news and corrected accordingly to generate the abstract of the news video.
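The first two options can be combined in a short sketch: use the stored summary attribute when the target news has one, otherwise extract an abstract from its body. The disclosure does not specify the extraction algorithm, so the lead-sentence extractor below is only a placeholder assumption, and the field names `summary` and `body` are hypothetical.

```python
import re


def lead_sentences(text, n=2):
    """Naive extractive abstract: the first n sentences of the text."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    sentences = [s.strip() for s in parts if s.strip()]
    return " ".join(sentences[:n])


def summary_from_target(target):
    """Prefer the stored summary attribute; fall back to extraction."""
    if target.get("summary"):
        return target["summary"]
    return lead_sentences(target["body"])
```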
According to the method provided by this embodiment of the present disclosure, a news text library and a news video for which an abstract is to be generated are obtained; a title of the news video is identified to obtain an original title and/or an abstract of the news video is extracted to obtain an original abstract; at least one candidate news text is retrieved from the news text library according to the original title and/or the original abstract; target news is determined from the at least one candidate news text according to the similarity between the original abstract and/or the original title and the candidate news text; and the abstract of the news video is generated according to the target news. Retrieving from the news text library and screening by similarity ensures that the target news and the news video report the same news event. Because the abstract of the news video is then generated from the target news, the accuracy of abstract generation is improved, news videos can be catalogued more efficiently, and the cost is reduced.
Referring to FIG. 2, a flowchart of a method for generating a news video abstract mainly includes the following steps S201 to S213:
Step S201, a news text library and a news video for which an abstract is to be generated are obtained.
Optionally, news text data is crawled from mainstream news websites on the internet and stored in ES to construct the news text library.
Further, after the news text data is crawled, deduplication and attribute calculation are performed on it before it is stored in ES to construct the news text library.
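A minimal sketch of the deduplication and attribute calculation step is shown below. It assumes exact-duplicate detection by hashing the whitespace-normalized body; the disclosure does not say which attributes are computed, so `fingerprint` and `length` are hypothetical examples, and a plain list stands in for the ES index.

```python
import hashlib


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants hash alike."""
    return " ".join(text.lower().split())


def build_library(raw_items):
    """Drop items with duplicate bodies and attach computed attributes."""
    seen = set()
    library = []
    for item in raw_items:
        digest = hashlib.sha256(normalize(item["body"]).encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # Duplicate body: keep only the first occurrence.
        seen.add(digest)
        record = dict(item)
        record["fingerprint"] = digest      # hypothetical attribute
        record["length"] = len(item["body"])  # hypothetical attribute
        library.append(record)
    return library
```

Near-duplicate detection (for example, reprints with small edits) would need a fuzzier fingerprint than the exact hash assumed here.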
Step S202, feature extraction is performed on the news video through at least one of OCR and ASR.
Optionally, the frames of the video are recognized by OCR and/or the audio of the video is recognized by ASR, and feature data of the news video is extracted.
In some embodiments, the extracted feature data can be written into the news text library as an update. This makes the data available to later processing of related news videos, which avoids repeated feature extraction, saves cost and improves efficiency.
Step S203, the original abstract of the news video is extracted according to the extracted features.
Optionally, the original abstract of the news video is extracted from the extracted feature data.
For example, OCR and/or ASR is performed on a news program video to extract feature data. The feature data contains the elements of a news abstract, such as time, place and event, and the data corresponding to these elements is extracted from the feature data to obtain the original abstract of the news video. Alternatively, some news videos briefly introduce their news at the beginning; after feature extraction, the data of this introductory part is extracted directly from the feature data to obtain the original abstract of the news video.
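The second approach, taking the introductory part of the video as the original abstract, can be sketched as below. It assumes the ASR feature data is available as `(start_time_seconds, text)` segments; that data shape, the function name `extract_original_summary` and the `intro_seconds` window are illustrative assumptions.

```python
def extract_original_summary(asr_segments, intro_seconds=20.0):
    """Join the ASR text spoken inside the opening window of the video.

    asr_segments: iterable of (start_time_seconds, text) pairs.
    intro_seconds: hypothetical length of the introductory window.
    """
    intro = [
        text.strip()
        for start, text in sorted(asr_segments)
        if start < intro_seconds and text.strip()
    ]
    return " ".join(intro)
```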
It can be understood that different extraction rules can be set for different news programs to improve the efficiency and accuracy of the original summary extraction.
Step S204, the original title of the news video is extracted based on the features and the metadata of the news video.
The metadata of a news video is data describing the attributes of the news video data, and is also described as data about changes in the video stream. Metadata provides context for events and allows live video and recorded footage to be organized, searched, retrieved and used quickly.
In an embodiment of the present disclosure, the metadata of the news video may include resolution, frame rate, timeline, tags or descriptions of the video, and so on. The original title of the video is extracted based on the metadata of the news video and the extracted feature data.
Optionally, the extracted title is also written into the news text library as an update, which facilitates later processing of related news videos, avoids repeated feature extraction, saves cost and improves efficiency.
It will be appreciated that different extraction strategies may be designed for different news programs to improve the efficiency and accuracy of the original title extraction.
Step S205, at least one candidate news text is retrieved from the news text library according to the original title and the original abstract.
Optionally, the at least one candidate news text is ranked or scored according to similarity or weight.
In some embodiments, as shown in FIG. 3, a candidate news text is obtained for the original title and for the original abstract respectively, each being the news text ranked highest by weight or similarity in the corresponding search result.
Step S206, an abstract of the body text of the at least one candidate news is obtained.
In one embodiment, abstract extraction is performed on the at least one candidate news text to obtain an abstract of the body text of the candidate news.
In another embodiment, the data of the candidate news text includes text data with a summary attribute, and that summary is acquired.
Step S207, a first similarity between the abstract of the body text of each candidate news and the original abstract is calculated.
In some embodiments, the first similarity to the original abstract is calculated separately for the body abstract of the most similar candidate news retrieved with the original abstract and for the body abstract of the most similar candidate news retrieved with the original title.
Optionally, the similarity between the abstracts is calculated based on word embeddings (embedded word vectors).
A word embedding represents each word along multiple dimensions, with a magnitude in each dimension. Calculating the similarity between abstracts based on word embeddings therefore yields the degree of association between the two abstracts in dimensions such as semantics.
Optionally, the word embeddings of the respective abstracts are taken as input, and the cosine similarity between the vector of each candidate news abstract and the vector of the original abstract is calculated to obtain a similarity score for each candidate news abstract, namely the first similarity.
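The cosine-similarity calculation can be sketched as follows. The disclosure does not state how the word vectors of an abstract are combined into one vector, so averaging them is an assumption made here for illustration; the embedding table passed in as `embeddings` is likewise a hypothetical stand-in for a trained model.

```python
import math


def average_vector(tokens, embeddings):
    """Mean of the embeddings of the known tokens, or None if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b; 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def first_similarity(original_tokens, candidate_tokens, embeddings):
    """Similarity score between the original abstract and a candidate abstract."""
    a = average_vector(original_tokens, embeddings)
    b = average_vector(candidate_tokens, embeddings)
    if a is None or b is None:
        return 0.0
    return cosine_similarity(a, b)
```

The same function could be reused for titles in step S211, with a separately trained embedding table if desired.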
Step S208, determining whether candidate news with the first similarity exceeding a first preset threshold exists, if yes, performing step S209, and if not, performing step S210.
The first preset threshold may be determined according to a previous empirical value, and may be continuously updated according to feedback of the system. The embedded word vectors trained by different training models may also correspond to different first preset thresholds.
Optionally, the first preset threshold may be continuously updated according to the calculated similarity result and the association degree of the text, and the association degree of the text may be manually confirmed.
Step S209, determining the candidate news with the first similarity exceeding a first preset threshold as the target news.
Step S210, obtaining the title of the at least one candidate news.
As an embodiment, the candidate news text data includes text data having a title as an attribute, and the title is acquired.
In another embodiment, the title of the text of the candidate news is obtained by performing title extraction on at least one candidate news text.
Step S211, a second similarity between the title of each candidate news and the original title is calculated.
In some embodiments, the second similarity to the original title is calculated separately for the title of the most similar candidate news retrieved with the original abstract and for the title of the most similar candidate news retrieved with the original title.
Optionally, the similarity between the titles is calculated based on word embeddings.
Optionally, the word embeddings of the respective titles are taken as input, and the cosine similarity between the vector of each candidate news title and the vector of the original title is calculated to obtain a similarity score for each candidate news title, namely the second similarity.
Optionally, the word embeddings used for titles and those used for abstracts are trained separately.
Step S212, in response to there being candidate news whose second similarity exceeds a second preset threshold, that candidate news is determined as the target news.
Optionally, the second preset threshold may be the same as the first preset threshold, or may be different from the first preset threshold.
It will be appreciated that the second predetermined threshold may also be determined based on previous empirical values and may be continuously updated based on feedback from the system. The embedded word vectors trained by different training models may also correspond to different second preset thresholds. Optionally, the second preset threshold may be continuously updated according to the result of the calculated similarity and the association degree of the text, and the association degree of the text may be manually confirmed.
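The decision flow of steps S207 to S212 can be sketched as a two-stage cascade: abstract similarity is checked first, and title similarity is consulted only when no candidate passes the first threshold. The sketch assumes both similarities are precomputed per candidate (in the described flow the second similarity is only calculated when needed), and returning the single best-scoring candidate when several pass a threshold is an added assumption. The field names and default thresholds are hypothetical.

```python
def choose_target(candidates, first_threshold=0.8, second_threshold=0.8):
    """Return the id of the target news, or None if no candidate qualifies.

    candidates: list of dicts with 'id', 'summary_sim' (first similarity)
    and 'title_sim' (second similarity).
    """
    # Steps S208/S209: accept on abstract similarity when possible.
    by_summary = [c for c in candidates if c["summary_sim"] > first_threshold]
    if by_summary:
        return max(by_summary, key=lambda c: c["summary_sim"])["id"]
    # Steps S210 to S212: otherwise fall back to title similarity.
    by_title = [c for c in candidates if c["title_sim"] > second_threshold]
    if by_title:
        return max(by_title, key=lambda c: c["title_sim"])["id"]
    return None
```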
In some embodiments, different weights may be set according to different degrees of dependence on the titles and the summaries, the first similarity and the second similarity are weighted and fused, and candidate news with a fusion computation result exceeding a third preset threshold may be determined as target news.
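A minimal sketch of the weighted fusion described above follows. The patent only says that weights are set according to the degree of dependence on titles and abstracts; the convex combination with a single `abstract_weight`, and the function names, are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def fused_score(first_similarity: float, second_similarity: float,
                abstract_weight: float) -> float:
    """Weighted fusion of abstract similarity and title similarity.

    abstract_weight is the weight of the first (abstract) similarity;
    the title similarity receives 1 - abstract_weight (an assumption).
    """
    if not 0.0 <= abstract_weight <= 1.0:
        raise ValueError("abstract_weight must lie in [0, 1]")
    return abstract_weight * first_similarity + (1.0 - abstract_weight) * second_similarity


def select_by_fusion(similarities: Dict[str, Tuple[float, float]],
                     abstract_weight: float,
                     third_threshold: float) -> List[str]:
    """Return ids of candidates whose fused score exceeds the third threshold."""
    return [
        news_id
        for news_id, (first, second) in similarities.items()
        if fused_score(first, second, abstract_weight) > third_threshold
    ]
```

With a larger `abstract_weight`, a candidate with a closely matching abstract can still be selected when its title differs from the original title.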
Step S213, extracting the abstract of the text of the target news, and using the abstract of the text of the target news as the text abstract of the news video.
It is understood that the target news is determined from the candidate news, and thus the method of extracting the abstract of the text of the target news is the same as the method of obtaining the abstract of the candidate news in step S206.
In order to more intuitively explain steps S207 to S209 of the above-described embodiment, reference may be made to the process shown in fig. 4.
The method provided by the embodiments of the present disclosure acquires a news text base and a news video for which an abstract is to be generated; performs OCR and/or ASR recognition on the news video to extract feature data; extracts an original abstract from the feature data; extracts an original title from the feature data and the video metadata; and retrieves at least one candidate news text from the news text base according to the original title and the original abstract. In response to the first similarity between the original abstract and the abstract of a candidate news exceeding a first preset threshold, that candidate news is taken as the target news. In response to the first similarity not exceeding the first preset threshold, the second similarity between the original title and the title of the candidate news is calculated, and the candidate news whose second similarity exceeds the second preset threshold is determined as the target news. The abstract of the text of the target news is then obtained and used as the abstract of the news video. Because candidates are retrieved from the news text base and screened by similarity, with the similarity of the abstracts checked first, the dependence on news titles is reduced and it is ensured that the target news and the news video report the same news event. Generating the abstract of the news video from the abstract of the target news effectively improves the accuracy of abstract generation, allows news videos to be catalogued more efficiently, and reduces cost.
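The two-stage screening summarized above (abstract similarity first, title similarity only as a fallback) can be sketched as follows. The `Candidate` record and the function name are hypothetical, and both similarities are assumed to be precomputed for brevity, whereas the described method computes the second similarity only when no candidate passes the first threshold.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    news_id: str
    first_similarity: float   # candidate abstract vs. original abstract
    second_similarity: float  # candidate title vs. original title


def screen_target_news(candidates: List[Candidate],
                       first_threshold: float,
                       second_threshold: float) -> List[Candidate]:
    """Return target news: abstract matches if any, otherwise title matches."""
    by_abstract = [c for c in candidates if c.first_similarity > first_threshold]
    if by_abstract:
        return by_abstract
    # Fallback: no candidate exceeded the first threshold, so compare titles.
    return [c for c in candidates if c.second_similarity > second_threshold]
```

An empty result means that no candidate in the news text base was judged to report the same event as the video.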
Corresponding to the above abstract generation method for news video, the embodiment of the present disclosure further provides an abstract generation apparatus for news video. Referring to the structural block diagram of the abstract generation apparatus shown in fig. 5, the apparatus mainly includes the following modules:
An obtaining module 510, configured to obtain a news text base and a news video for which an abstract is to be generated.
A feature extraction module 520, configured to identify a title of the news video to obtain an original title and/or extract an abstract of the news video to obtain an original abstract.
A retrieving module 530, configured to retrieve at least one candidate news text from the news text base according to the original title and/or the original abstract.
A screening module 540, configured to determine target news from the at least one candidate news text according to the similarity between the original abstract and/or the original title and the at least one candidate news text.
An abstract generating module 550, configured to generate an abstract of the news video according to the target news.
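As a rough illustration of how modules 510 to 550 could be chained, the sketch below models modules 520 to 550 as injected callables. The class name, the callable signatures, and the use of plain strings for videos and news texts are all assumptions for illustration, not the structure defined by the patent.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class AbstractGenerationApparatus:
    # Module 520: news video -> (original title, original abstract)
    extract_features: Callable[[str], Tuple[str, str]]
    # Module 530: (title, abstract, news text base) -> candidate news texts
    retrieve: Callable[[str, str, List[str]], List[str]]
    # Module 540: (title, abstract, candidates) -> target news, or None
    screen: Callable[[str, str, List[str]], Optional[str]]
    # Module 550: target news -> abstract of the news video
    generate_abstract: Callable[[str], str]

    def run(self, news_video: str, news_text_base: List[str]) -> Optional[str]:
        # Module 510 (obtaining) is represented by the two arguments of run().
        title, abstract = self.extract_features(news_video)
        candidates = self.retrieve(title, abstract, news_text_base)
        target = self.screen(title, abstract, candidates)
        return None if target is None else self.generate_abstract(target)
```

Passing the modules in as callables keeps the sketch independent of any particular OCR, ASR, retrieval, or summarization component.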
In some implementations, the feature extraction module 520 includes:
the recognition unit is used for performing feature extraction on the news video through at least one of OCR character recognition and ASR automatic voice recognition;
and the abstract extracting unit is used for extracting the original abstract of the news video according to the extracted features.
In some implementations, the feature extraction module 520 further includes:
and the title extraction unit is used for extracting the original title of the news video based on the extracted features and the metadata of the news video.
In some embodiments, the determining the target news from the at least one candidate news based on the similarity between the original summary and/or the original title and the at least one candidate news includes:
acquiring an abstract of a text of the at least one candidate news;
respectively calculating first similarity of the abstract of the text of each candidate news and the original abstract;
in response to the candidate news with the first similarity exceeding a first preset threshold value, determining the candidate news with the first similarity exceeding the first preset threshold value as the target news.
In some embodiments, the determining the target news from the at least one candidate news according to the similarity between the original abstract and/or the original title and the at least one candidate news further includes:
in response to the fact that candidate news with the first similarity exceeding the first preset threshold do not exist, acquiring a title of the candidate news;
respectively calculating a second similarity of the title of each candidate news and the original title;
in response to the candidate news with the second similarity exceeding a second preset threshold value, determining the candidate news with the second similarity exceeding the second preset threshold value as the target news.
In some embodiments, the generating a text summary of the news video from the target news includes:
extracting an abstract of the text of the target news;
and taking the abstract of the text of the target news as the text abstract of the news video.
The apparatus provided by the embodiment of the present disclosure obtains a news text base and a news video for which an abstract is to be generated, identifies a title of the news video to obtain an original title and/or extracts an abstract of the news video to obtain an original abstract, retrieves at least one candidate news text from the news text base according to the original title and/or the original abstract, determines target news from the at least one candidate news text according to the similarity between the original abstract and/or the original title and the candidate news text, and generates the abstract of the news video according to the target news. Because candidates are retrieved from the news text base and the target news is screened out by similarity, it is ensured that the target news and the news video report the same news event. Generating the abstract of the news video from the abstract of the target news therefore effectively improves the accuracy of abstract generation, allows news videos to be catalogued more efficiently, and reduces cost.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
First, an embodiment of the present disclosure provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform any of the above abstract generation methods for news video.
The embodiment of the present disclosure also provides a computer program product, which includes a computer program that, when executed by a processor, implements any of the above abstract generation methods for news video.
Optionally, as shown in fig. 6, the computer program includes a scheduling module, an abstract operator, an abstract strategy, and a news text base.
The scheduling module triggers a task and triggers the subsequent modules to execute any of the above abstract generation methods for news video.
The abstract operator calls the abstract strategy and inputs the metadata of the news video stored in the news text base into the abstract strategy.
The abstract strategy extracts the original title and the original abstract of the news video according to the metadata of the news video input by the abstract operator and the feature data written into the news text base after feature extraction, and writes the original title and the original abstract back into the news text base.
The news text base stores the extracted feature data of the news video and the metadata of the news video. The stored metadata is called by the abstract operator, the stored feature data is called by the abstract strategy, and the original title and the original abstract written back by the abstract strategy are also stored there.
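The data flow among the four components can be sketched as below. Everything here is a hypothetical illustration: the in-memory dictionaries stand in for the news text base, and the placeholder rule in `abstract_strategy` (title from metadata, abstract from the first two recognized lines) is an assumption, not the extraction method of the patent.

```python
from typing import Dict, List, Tuple


class NewsTextBase:
    """In-memory stand-in for the news text base."""

    def __init__(self) -> None:
        self.metadata: Dict[str, Dict[str, str]] = {}   # video id -> metadata
        self.features: Dict[str, List[str]] = {}        # video id -> OCR/ASR lines
        self.results: Dict[str, Tuple[str, str]] = {}   # video id -> (title, abstract)


def abstract_strategy(video_id: str, metadata: Dict[str, str],
                      base: NewsTextBase) -> None:
    """Read feature data, derive title and abstract, write both back."""
    features = base.features[video_id]
    # Placeholder extraction rules, assumed for this sketch only.
    original_title = metadata.get("title") or features[0]
    original_abstract = " ".join(features[:2])
    base.results[video_id] = (original_title, original_abstract)


def abstract_operator(video_id: str, base: NewsTextBase) -> None:
    """Fetch stored metadata and pass it to the abstract strategy."""
    abstract_strategy(video_id, base.metadata[video_id], base)


def scheduling_module(video_ids: List[str], base: NewsTextBase) -> None:
    """Trigger one task per video and hand it to the abstract operator."""
    for video_id in video_ids:
        abstract_operator(video_id, base)
```

The written-back title and abstract would then serve as the query for the retrieval step.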
FIG. 7 illustrates a schematic block diagram of an example electronic device 700 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the device 700 comprises a computing unit 701, which may perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM)702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 can also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
Various components in the device 700 are connected to the I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, or the like; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The computing unit 701 may be any of a variety of general purpose and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 701 performs the respective methods and processes described above, such as the abstract generation method for news video. For example, in some embodiments, the abstract generation method for news video may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 708. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 700 via the ROM 702 and/or the communication unit 709. When the computer program is loaded into the RAM 703 and executed by the computing unit 701, one or more steps of the above-described abstract generation method for news video may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured by any other suitable means (e.g., by means of firmware) to perform the abstract generation method for news video.
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SoCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (15)

1. A news video abstract generation method comprises the following steps:
acquiring a news text library and a news video of an abstract to be generated;
identifying a title of the news video to obtain an original title and/or extracting an abstract of the news video to obtain an original abstract;
according to the original title and/or the original abstract, at least one candidate news text is obtained by searching in the news text library;
determining target news from the at least one candidate news text according to the similarity between the original abstract and/or the original title and the at least one candidate news text;
and generating an abstract of the news video according to the target news.
2. The method of claim 1, wherein extracting the abstract of the news video to obtain the original abstract comprises:
performing feature extraction on the news video through at least one of OCR character recognition and ASR automatic voice recognition;
and extracting the original abstract of the news video according to the extracted features.
3. The method of claim 1, wherein identifying the title of the news video to obtain the original title comprises:
extracting the original title of the news video based on the extracted features and the metadata of the news video.
4. The method of any of claims 1-3, wherein determining the target news from the at least one candidate news based on the similarity of the original summary and/or the original headline to the at least one candidate news comprises:
acquiring an abstract of a text of the at least one candidate news;
respectively calculating first similarity of the abstract of the text of each candidate news and the original abstract;
in response to the candidate news with the first similarity exceeding a first preset threshold value, determining the candidate news with the first similarity exceeding the first preset threshold value as the target news.
5. The method of claim 4, wherein determining the target news from the at least one candidate news item based on the similarity of the original summary and/or the original headline to the at least one candidate news item, further comprises:
in response to the fact that candidate news with the first similarity exceeding the first preset threshold do not exist, acquiring a title of the candidate news;
respectively calculating a second similarity of the title of each candidate news and the original title;
in response to the candidate news with the second similarity exceeding a second preset threshold value, determining the candidate news with the second similarity exceeding the second preset threshold value as the target news.
6. The method of claim 1, wherein generating the abstract of the news video according to the target news comprises:
extracting an abstract of the text of the target news;
and taking the abstract of the text of the target news as the text abstract of the news video.
7. An apparatus for generating a summary of a news video, comprising:
the acquisition module is used for acquiring a news text base and a news video of an abstract to be generated;
the feature extraction module is used for identifying the news video to obtain an original title and/or extracting the abstract of the news video to obtain an original abstract;
the retrieval module is used for retrieving at least one candidate news text from the news text library according to the original title and/or the original abstract;
the screening module is used for determining target news from the at least one candidate news text according to the similarity between the original abstract and/or the original title and the at least one candidate news text;
and the abstract generating module is used for generating an abstract of the news video according to the target news.
8. The apparatus of claim 7, wherein the feature extraction module comprises:
the recognition unit is used for performing feature extraction on the news video through at least one of OCR character recognition and ASR automatic voice recognition;
and the abstract extracting unit is used for extracting the original abstract of the news video according to the extracted features.
9. The apparatus of claim 7, wherein the feature extraction module further comprises:
and the title extraction unit is used for extracting the original title of the news video based on the characteristics and the metadata of the news video.
10. The apparatus according to any one of claims 7-9, wherein said determining the target news from the at least one candidate news based on the similarity of the original summary and/or the original title to the at least one candidate news comprises:
acquiring an abstract of a text of the at least one candidate news;
respectively calculating first similarity of the abstract of the text of each candidate news and the original abstract;
in response to the candidate news with the first similarity exceeding a first preset threshold value, determining the candidate news with the first similarity exceeding the first preset threshold value as the target news.
11. The apparatus of claim 10, wherein the determining the target news from the at least one candidate news item based on the similarity of the original summary and/or the original headline to the at least one candidate news item further comprises:
in response to the fact that candidate news with the first similarity exceeding the first preset threshold do not exist, acquiring a title of the candidate news;
respectively calculating a second similarity of the title of each candidate news and the original title;
in response to the candidate news with the second similarity exceeding a second preset threshold value, determining the candidate news with the second similarity exceeding the second preset threshold value as the target news.
12. The apparatus of claim 7, wherein generating the abstract of the news video according to the target news comprises:
extracting an abstract of the text of the target news;
and taking the abstract of the text of the target news as the text abstract of the news video.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
14. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-6.
15. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-6.
CN202110808406.2A 2021-07-16 2021-07-16 Method and device for generating abstract of news video Active CN113660541B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110808406.2A CN113660541B (en) 2021-07-16 2021-07-16 Method and device for generating abstract of news video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110808406.2A CN113660541B (en) 2021-07-16 2021-07-16 Method and device for generating abstract of news video

Publications (2)

Publication Number Publication Date
CN113660541A true CN113660541A (en) 2021-11-16
CN113660541B CN113660541B (en) 2023-10-13

Family

ID=78477441

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110808406.2A Active CN113660541B (en) 2021-07-16 2021-07-16 Method and device for generating abstract of news video

Country Status (1)

Country Link
CN (1) CN113660541B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114281981A (en) * 2021-12-22 2022-04-05 北京百度网讯科技有限公司 News briefing generation method and device and electronic equipment
CN114363714A (en) * 2021-12-31 2022-04-15 阿里巴巴(中国)有限公司 Title generation method, title generation device and storage medium
CN115334367A (en) * 2022-07-11 2022-11-11 北京达佳互联信息技术有限公司 Video summary information generation method, device, server and storage medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060165379A1 (en) * 2003-06-30 2006-07-27 Agnihotri Lalitha A System and method for generating a multimedia summary of multimedia streams
US20090031352A1 (en) * 2007-07-25 2009-01-29 Tp Lab Inc. Method and system to process television program summary
CN103200463A (en) * 2013-03-27 2013-07-10 天脉聚源(北京)传媒科技有限公司 Method and device for generating video summary
US20150347920A1 (en) * 2012-12-27 2015-12-03 Touchtype Limited Search system and corresponding method
CN106202057A (en) * 2016-08-30 2016-12-07 东软集团股份有限公司 The recognition methods of similar news information and device
US9881077B1 (en) * 2013-08-08 2018-01-30 Google Llc Relevance determination and summary generation for news objects
CN107844586A (en) * 2017-11-16 2018-03-27 百度在线网络技术(北京)有限公司 News recommends method and apparatus
CN108829893A (en) * 2018-06-29 2018-11-16 北京百度网讯科技有限公司 Determine method, apparatus, storage medium and the terminal device of video tab
CN109408672A (en) * 2018-12-14 2019-03-01 北京百度网讯科技有限公司 A kind of article generation method, device, server and storage medium
CN111666402A (en) * 2020-04-30 2020-09-15 平安科技(深圳)有限公司 Text abstract generation method and device, computer equipment and readable storage medium

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114281981A (en) * 2021-12-22 2022-04-05 北京百度网讯科技有限公司 News briefing generation method and device and electronic equipment
CN114363714A (en) * 2021-12-31 2022-04-15 阿里巴巴(中国)有限公司 Title generation method, title generation device and storage medium
CN114363714B (en) * 2021-12-31 2024-01-05 阿里巴巴(中国)有限公司 Title generation method, title generation device and storage medium
CN115334367A (en) * 2022-07-11 2022-11-11 北京达佳互联信息技术有限公司 Video summary information generation method, device, server and storage medium
CN115334367B (en) * 2022-07-11 2023-10-17 北京达佳互联信息技术有限公司 Method, device, server and storage medium for generating abstract information of video

Also Published As

Publication number Publication date
CN113660541B (en) 2023-10-13


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant