CN113709521B - System for automatically matching background according to video content - Google Patents

System for automatically matching background according to video content

Info

Publication number
CN113709521B
Authority
CN
China
Prior art keywords
module
background
video
app
recorded video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111101320.2A
Other languages
Chinese (zh)
Other versions
CN113709521A (en)
Inventor
付金龙
付译虹
邢硕
蒋昌杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxin Intelligent Technology Co ltd
Original Assignee
Wuxin Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxin Intelligent Technology Co ltd filed Critical Wuxin Intelligent Technology Co ltd
Priority to CN202111101320.2A priority Critical patent/CN113709521B/en
Publication of CN113709521A publication Critical patent/CN113709521A/en
Application granted granted Critical
Publication of CN113709521B publication Critical patent/CN113709521B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/23418Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/23424Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/235Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • H04N21/2355Processing of additional data, e.g. scrambling of additional data or processing content descriptors involving reformatting operations of additional data, e.g. HTML pages
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47205End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/84Generation or processing of descriptive data, e.g. content descriptors
    • H04N21/8405Generation or processing of descriptive data, e.g. content descriptors represented by keywords
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to the technical field of video synthesis, and in particular to a system for automatically matching a background according to video content, comprising: a front-end APP provided with a camera module, a storage module, a green-screen module, an editing module and an uploading module; and an APP back end provided with a voice recognition module, a semantic analysis module, a material library and a synthesis module. The camera module is connected to the storage module through the green-screen module and the editing module in sequence; the storage module is connected to the voice recognition module and the semantic analysis module through the uploading module; and the uploading module is connected to the material library and the synthesis module respectively. By analysing the information in a recorded video and synthesising related materials from the material library with it, the invention gives the video producer more room for imagination.

Description

System for automatically matching background according to video content
Technical Field
The invention relates to the technical field of video synthesis, in particular to a system for automatically matching background according to video content.
Background
In the prior art, a real image can be simply combined with a virtual background template on a computer or mobile phone to produce a profile picture and the like, commonly called a "big-head sticker". Even the websites that offer an animated-album function merely play a series of such "big-head stickers" at relatively long time intervals. This simple static synthesis approach can no longer meet users' needs: users would rather see dynamic real life combined with a virtual background template. How to flexibly combine dynamic real life with virtual background templates to produce dynamic video is a technical problem to be solved in this technical field.
Disclosure of Invention
The invention provides a system for automatically matching a background according to video content, which analyses the information in a recorded video and synthesises related materials from a material library with that information, thereby giving the video producer more room for imagination.
To achieve the above purpose, the invention provides the following technical solution: a system for automatically matching a background according to video content, comprising: a front-end APP provided with a camera module, a storage module, a green-screen module, an editing module and an uploading module; and an APP back end provided with a voice recognition module, a semantic analysis module, a material library and a synthesis module. The camera module is connected to the storage module through the green-screen module and the editing module in sequence; the storage module is connected to the voice recognition module and the semantic analysis module through the uploading module; and the voice recognition module and the semantic analysis module are connected to the synthesis module through the material library.
Preferably, the system further comprises a screening module, and the material library is connected with the synthesis module through the screening module.
Preferably, a method for automatically matching a background according to video content comprises the following steps:
step one: extracting information from a recorded video to generate text content, and performing semantic analysis on the text content to obtain scene keywords;
step two: locating the corresponding background image in a material library using the scene keywords;
step three: synthesising the background image with the recorded video.
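The three steps above can be sketched as a minimal pipeline. Everything below is an illustrative assumption rather than the patent's actual implementation: the keyword vocabulary, the material-library paths and the function names are invented, speech recognition is replaced by a plain transcript string, and compositing is stubbed out.

```python
# Hypothetical sketch of the three-step background-matching method.

# Step 1: turn the recognised transcript plus user tags into scene keywords.
SCENE_VOCABULARY = {"beach", "forest", "city", "office"}  # assumed keyword set

def extract_scene_keywords(transcript, tags):
    words = {w.strip(".,!?").lower() for w in transcript.split()}
    # User tags act as reserved background keywords and are always kept.
    return sorted(set(tags) | (words & SCENE_VOCABULARY))

# Step 2: locate the corresponding background assets in a toy material library.
MATERIAL_LIBRARY = {
    "beach": "backgrounds/beach.mp4",
    "forest": "backgrounds/forest.mp4",
}

def locate_backgrounds(keywords):
    return [MATERIAL_LIBRARY[k] for k in keywords if k in MATERIAL_LIBRARY]

# Step 3: compositing would chroma-key the green-screen footage over the
# located background; stubbed here as a description string.
def compose(recorded_video, background):
    return f"{recorded_video} keyed over {background}"

keywords = extract_scene_keywords("We walked along the beach at sunset", ["forest"])
backgrounds = locate_backgrounds(keywords)
result = compose("clip.mp4", backgrounds[0])
```

The user tag "forest" survives even though it never appears in the transcript, matching the patent's idea of tags as reserved background keywords.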
Preferably, the recorded video is a green-screen video recorded by the front-end APP, and the front-end APP adds tags to, and defines the background of, each recorded video.
Preferably, step one comprises: extracting text from the audio information in the recorded video through the speech-to-text technology of the APP back end, and parsing the extracted text, together with the tags of the recorded video, into scene keywords.
Preferably, step three further comprises adding subtitles to the recorded video through the APP back end.
Preferably, the background definition includes a picture background, a video background, and an audio background.
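The three background types could be modelled as tagged entries in the material library. A minimal sketch; the `Material` record, the `find` helper and the asset paths are hypothetical illustrations, not part of the patent:

```python
from dataclasses import dataclass
from enum import Enum

class BackgroundType(Enum):
    PICTURE = "picture"
    VIDEO = "video"
    AUDIO = "audio"

@dataclass(frozen=True)
class Material:
    keyword: str
    kind: BackgroundType
    path: str

# Toy material library: one keyword may map to several background types.
LIBRARY = [
    Material("beach", BackgroundType.PICTURE, "assets/beach.jpg"),
    Material("beach", BackgroundType.AUDIO, "assets/waves.mp3"),
    Material("forest", BackgroundType.VIDEO, "assets/forest.mp4"),
]

def find(library, keyword, kind):
    # Filter by scene keyword and by the background type the user selected.
    return [m for m in library if m.keyword == keyword and m.kind == kind]

hits = find(LIBRARY, "beach", BackgroundType.AUDIO)
```

Separating the keyword from the background type mirrors the patent's split between scene keywords (from semantic analysis) and background limiting (from the editing module).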
The invention has the beneficial effects that: the front-end APP records video through the camera module; under the processing of the green-screen module, the background of the recorded video is rendered green and the video is stored in the storage module; meanwhile, the user configures the recorded video through the editing module, the recorded video carrying this setting information is sent to the APP back end through the uploading module, and scene keywords are obtained by the voice recognition module and the semantic analysis module. The material library locates background images according to the scene keywords and the editing information of the recorded video, and the synthesis module finally outputs the synthesised video. The screening module presents the background keywords extracted by the voice recognition module and the semantic analysis module one by one, giving the user a space for preference selection. The user inputs tags for the recorded video through the front-end APP; a tag can be a background keyword reserved by the user, so that during semantic analysis the system outputs the background image corresponding to that keyword. The synthesis module can not only screen the background materials corresponding to the background keywords from the material library, but also compile background subtitles directly through the subtitle-adding module, further facilitating the user's customisation of personalised content.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; other drawings can be obtained from them by a person skilled in the art without inventive effort.
FIG. 1 is a schematic diagram of a system architecture of the present invention;
fig. 2 is a schematic flow chart of a background matching method of the present invention.
Detailed Description
The embodiments of the invention are described below clearly and completely with reference to the accompanying drawings, in which some, but not all, embodiments of the invention are shown. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of the invention.
As shown in fig. 1, a system for automatically matching a background according to video content comprises: a front-end APP 1 provided with a camera module 2, a storage module 3, a green-screen module 4, an editing module 5 and an uploading module 6; and an APP back end 7 provided with a voice recognition module 8, a semantic analysis module 9, a material library 10 and a synthesis module 11. The camera module 2 is connected to the storage module 3 through the green-screen module 4 and the editing module 5 in sequence; the storage module 3 is connected to the voice recognition module 8 and the semantic analysis module 9 through the uploading module 6; and the uploading module 6 is connected to the material library 10 and the synthesis module 11 respectively.
With this arrangement, the front-end APP 1 records video through the camera module 2; under the processing of the green-screen module 4, the background of the recorded video is rendered green and the video is stored in the storage module 3; meanwhile, the user configures the recorded video through the editing module 5, the recorded video carrying this setting information is sent to the APP back end through the uploading module 6, and scene keywords are obtained by the voice recognition module 8 and the semantic analysis module 9. The material library 10 locates the background image according to the scene keywords and the editing information of the recorded video, and the synthesis module 11 finally outputs the synthesised video.
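The green-screen replacement described above amounts to chroma keying: pixels close to the key colour are swapped for background pixels. A minimal NumPy sketch on a synthetic one-row frame; the pure-green key colour and the tolerance threshold are assumptions, and a real implementation would also soften mask edges:

```python
import numpy as np

def chroma_key(frame, background, key=(0, 255, 0), tol=60):
    """Replace pixels within `tol` of the key colour with background pixels."""
    # Per-pixel Euclidean distance from the key colour in RGB space.
    dist = np.linalg.norm(frame.astype(int) - np.array(key), axis=-1)
    mask = dist < tol  # True where the green screen shows through
    out = frame.copy()
    out[mask] = background[mask]
    return out

# Synthetic 1x2 frame: one pure-green pixel, one red foreground pixel.
frame = np.array([[[0, 255, 0], [200, 10, 10]]], dtype=np.uint8)
background = np.array([[[50, 50, 200], [50, 50, 200]]], dtype=np.uint8)
result = chroma_key(frame, background)
```

The green pixel is replaced by the background colour while the red foreground pixel is kept unchanged.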
The system also comprises a screening module, wherein the material library 10 is connected with the synthesis module 11 through the screening module.
In this arrangement, the screening module presents the background keywords extracted by the voice recognition module 8 and the semantic analysis module 9 one by one, giving the user a space for preference selection.
The editing module 5 comprises a tag editing module and a background type selection module.
In this arrangement, the user inputs tags for the recorded video through the front-end APP; a tag can be a background keyword reserved by the user, so that during semantic analysis the system outputs the background image corresponding to that keyword.
The material library 10 comprises a static library, a dynamic library and an audio library. The synthesis module 11 is provided with a first input 14, a second input 15, a third input 16 and a synthesis output 17; the first input 14 is connected to the uploading module 6, the second input 15 is connected to the screening module, and the third input 16 is provided with a subtitle-adding module 18.
With this arrangement, the synthesis module 11 can not only screen the background materials corresponding to the background keywords from the material library 10, but also compile background subtitles directly through the subtitle-adding module 18, further facilitating the user's customisation of personalised content.
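Subtitle addition through the subtitle-adding module 18 could be represented as timed text entries. A small sketch that builds an SRT-style caption block; the SRT layout is a common subtitle convention, not something the patent specifies:

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

def srt_entry(index, start, end, text):
    """One numbered SRT cue: index, time range, then the caption text."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"

entry = srt_entry(1, 0.0, 2.5, "Hello from the beach")
```

Cues like this could be burned into the composited video or delivered as a sidecar track, whichever the synthesis output requires.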
The foregoing is merely illustrative of specific embodiments of the invention, and the invention is not limited thereto; any variation or substitution that a person skilled in the art can readily conceive within the disclosed technical scope shall fall within the scope of the invention. Therefore, the protection scope of the invention shall be subject to the protection scope of the claims.

Claims (5)

1. A system for automatically matching a background based on video content, comprising:
the front-end APP is provided with a camera module, a storage module, a green-screen module, an editing module and an uploading module;
the APP back end is provided with a voice recognition module, a semantic analysis module, a material library and a synthesis module;
the camera module is connected to the storage module through the green-screen module and the editing module in sequence;
the storage module is connected to the voice recognition module and the semantic analysis module through the uploading module;
the voice recognition module and the semantic analysis module are connected to the synthesis module through the material library;
further comprises:
the front-end APP records the video through the camera module, processes the background of the recorded video into green under the processing of the green screen module, and stores the green background into the storage module; the editing module is used for setting the recorded video, the recorded video with setting information is sent to the rear end of the APP through the uploading module, and scene keywords are obtained under the action of the voice analysis module and the semantic analysis module;
the material library locates background images according to the scene keywords and editing information of the recorded video, and finally outputs a synthesized video under the processing of the synthesis module;
the recorded videos are green curtain videos recorded through a front-end APP, and the front-end APP performs tag addition and background limitation on each recorded video.
2. A system for automatically matching a background based on video content as recited in claim 1, wherein: the system further comprises a screening module, and the material library is connected to the synthesis module through the screening module; the screening module presents the background keywords extracted by the voice recognition module and the semantic analysis module one by one.
3. A system for automatically matching a background based on video content as recited in claim 2, wherein: text is extracted from the audio information in the recorded video through the speech-to-text technology of the APP back end, and the extracted text, together with the tags of the recorded video, is parsed into scene keywords.
4. A system for automatically matching a background based on video content as recited in claim 1, wherein: subtitles are further added to the recorded video through the APP back end.
5. A system for automatically matching a background based on video content as recited in claim 2, wherein: the background definition comprises a picture background, a video background and an audio background.
CN202111101320.2A 2021-09-18 2021-09-18 System for automatically matching background according to video content Active CN113709521B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111101320.2A CN113709521B (en) 2021-09-18 2021-09-18 System for automatically matching background according to video content

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111101320.2A CN113709521B (en) 2021-09-18 2021-09-18 System for automatically matching background according to video content

Publications (2)

Publication Number Publication Date
CN113709521A (en) 2021-11-26
CN113709521B (en) 2023-08-29

Family

ID=78661281

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111101320.2A Active CN113709521B (en) 2021-09-18 2021-09-18 System for automatically matching background according to video content

Country Status (1)

Country Link
CN (1) CN113709521B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114339446B (en) * 2021-12-28 2024-04-05 北京百度网讯科技有限公司 Audio/video editing method, device, equipment, storage medium and program product

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20130032653A (en) * 2011-09-23 2013-04-02 브로드밴드미디어주식회사 System and method for serching images using caption of moving picture in keyword
KR20150022088A (en) * 2013-08-22 2015-03-04 주식회사 엘지유플러스 Context-based VOD Search System And Method of VOD Search Using the Same
KR101894956B1 (en) * 2017-06-21 2018-10-24 주식회사 미디어프론트 Server and method for image generation using real-time enhancement synthesis technology
CN111327839A (en) * 2020-02-27 2020-06-23 江苏尚匠文化传播有限公司 Video post-production method and system based on virtual video technology


Also Published As

Publication number Publication date
CN113709521A (en) 2021-11-26

Similar Documents

Publication Publication Date Title
CN107770626B (en) Video material processing method, video synthesizing device and storage medium
CN108833973B (en) Video feature extraction method and device and computer equipment
CN109688463B (en) Clip video generation method and device, terminal equipment and storage medium
EP3996381A1 (en) Cover image determination method and apparatus, and device
US7324943B2 (en) Voice tagging, voice annotation, and speech recognition for portable devices with optional post processing
WO2021109678A1 (en) Video generation method and apparatus, electronic device, and storage medium
US9071815B2 (en) Method, apparatus and computer program product for subtitle synchronization in multimedia content
CN112367551B (en) Video editing method and device, electronic equipment and readable storage medium
CN110781328A (en) Video generation method, system, device and storage medium based on voice recognition
CN104735468A (en) Method and system for synthesizing images into new video based on semantic analysis
CN106303290A (en) A kind of terminal and the method obtaining video
JP2018078402A (en) Content production device, and content production system with sound
CN113709521B (en) System for automatically matching background according to video content
US9666211B2 (en) Information processing apparatus, information processing method, display control apparatus, and display control method
CN112929746A (en) Video generation method and device, storage medium and electronic equipment
US20150111189A1 (en) System and method for browsing multimedia file
CN113132780A (en) Video synthesis method and device, electronic equipment and readable storage medium
CN106372106A (en) Method and apparatus for providing video content assistance information
WO2021057957A1 (en) Video call method and apparatus, computer device and storage medium
JP2005346259A (en) Information processing device and information processing method
JP2004023661A (en) Recorded information processing method, recording medium, and recorded information processor
CN114390341B (en) Video recording method, electronic equipment, storage medium and chip
CN115801977A (en) Multi-mode system for segmenting video, multi-mode system for segmenting multimedia and multi-mode method for segmenting multimedia
CN113411532B (en) Method, device, terminal and storage medium for recording content
CN112188116B (en) Video synthesis method, client and system based on object

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant