CN113709521A - System for automatically matching background according to video content - Google Patents
System for automatically matching background according to video content
- Publication number
- CN113709521A CN113709521A CN202111101320.2A CN202111101320A CN113709521A CN 113709521 A CN113709521 A CN 113709521A CN 202111101320 A CN202111101320 A CN 202111101320A CN 113709521 A CN113709521 A CN 113709521A
- Authority
- CN
- China
- Prior art keywords
- module
- background
- video
- app
- automatically matching
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/23418—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/23424—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/235—Processing of additional data, e.g. scrambling of additional data or processing content descriptors
- H04N21/2355—Processing of additional data, e.g. scrambling of additional data or processing content descriptors involving reformatting operations of additional data, e.g. HTML pages
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/472—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
- H04N21/47205—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/84—Generation or processing of descriptive data, e.g. content descriptors
- H04N21/8405—Generation or processing of descriptive data, e.g. content descriptors represented by keywords
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Theoretical Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Marketing (AREA)
- Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Television Signal Processing For Recording (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
- Studio Circuits (AREA)
Abstract
The invention relates to the technical field of video synthesis, and in particular to a system for automatically matching a background according to video content. The system comprises a front-end APP provided with a camera module, a storage module, a green-screen module, an editing module and an uploading module, and an APP back end provided with a speech recognition module, a semantic analysis module, a material library and a synthesis module. The camera module is connected to the storage module sequentially through the green-screen module and the editing module; the storage module is connected to the speech recognition module and the semantic analysis module through the uploading module; and the uploading module and the material library are each connected to the synthesis module. By analyzing the information in a recorded video, the invention synthesizes the video with related materials from the material library, giving the video producer more room for imagination.
Description
Technical Field
The invention relates to the technical field of video synthesis, in particular to a system for automatically matching a background according to video content.
Background
In the prior art, a real image can be simply combined with a virtual background template on a computer or mobile phone to produce an avatar or similar image, commonly called a "photo sticker". Although some websites offer a dynamic-album feature, it amounts to playing a series of such "stickers" at large time intervals. This simple static synthesis cannot meet users' needs: what users really want is dynamic real-life footage combined with a virtual background template. How to flexibly combine dynamic real-life footage with a virtual background template into a dynamic video is a technical problem that urgently needs to be solved in this field.
Disclosure of Invention
The invention provides a system for automatically matching a background according to video content, which synthesizes a recorded video with related materials from a material library by analyzing the information in the video, thereby giving the video producer more room for imagination.
To achieve this purpose, the invention provides the following technical scheme. A system for automatically matching a background according to video content comprises: a front-end APP provided with a camera module, a storage module, a green-screen module, an editing module and an uploading module; and an APP back end provided with a speech recognition module, a semantic analysis module, a material library and a synthesis module. The camera module is connected to the storage module sequentially through the green-screen module and the editing module; the storage module is connected to the speech recognition module and the semantic analysis module through the uploading module; and the speech recognition module and the semantic analysis module are connected to the synthesis module through the material library.
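The module wiring above can be sketched as a minimal data-flow pipeline. This is purely illustrative — the class and function names below are invented for the sketch and do not come from the patent:

```python
from dataclasses import dataclass, field

@dataclass
class RecordedVideo:
    frames: list                                  # green-screen frames from the camera module
    audio: bytes                                  # audio track later used by speech recognition
    tags: list = field(default_factory=list)      # labels set in the editing module

def front_end_pipeline(raw_frames, audio, tags):
    """Front-end APP: record, key out the background, attach edits, store."""
    keyed = list(raw_frames)                      # green-screen module (placeholder pass-through)
    video = RecordedVideo(frames=keyed, audio=audio, tags=list(tags))
    storage = {"pending_upload": video}           # storage module
    return storage

def upload(storage):
    """Uploading module hands the stored video to the APP back end."""
    return storage["pending_upload"]
```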
Preferably, the system further comprises a screening module, and the material library is connected with the synthesis module through the screening module.
Preferably, a method for automatically matching a background according to video content includes the following steps:
Step 1: extract information from the recorded video to generate text content, and perform semantic analysis on the text content to obtain scene keywords;
Step 2: locate a corresponding background image in the material library using the scene keywords;
Step 3: synthesize the background image with the recorded video.
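Under the assumption of very simple keyword matching, the three steps can be sketched as follows; every helper name and the sample vocabulary are hypothetical stand-ins for the recognition, analysis and synthesis modules:

```python
def speech_to_text(audio):
    # Step 1a: stand-in for a real speech recognition engine
    return "sunset at the beach"

def extract_scene_keywords(text):
    # Step 1b: semantic analysis reduced to vocabulary filtering for illustration
    scene_vocab = {"beach", "sunset", "forest", "city"}
    return [w for w in text.lower().split() if w in scene_vocab]

def locate_background(keywords, material_library):
    # Step 2: return the first material whose key matches a scene keyword
    for kw in keywords:
        if kw in material_library:
            return material_library[kw]
    return material_library.get("default")

def synthesize(video, background):
    # Step 3: stand-in for chroma-key compositing
    return {"video": video, "background": background}

library = {"beach": "beach.mp4", "default": "plain.png"}
result = synthesize("clip01", locate_background(
    extract_scene_keywords(speech_to_text(b"...")), library))
```

A production system would replace each helper with the corresponding module, but the data flow — audio to text, text to keywords, keywords to material, material to composite — stays the same.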
Preferably, the recorded videos are green-screen videos recorded through the front-end APP, and the front-end APP adds a tag and a background definition to each recorded video.
Preferably, step 1 includes: extracting text from the audio information in the recorded video through the speech-to-text technology of the APP back end, and performing scene-keyword division on the extracted text together with the tags of the recorded video.
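A minimal sketch of this scene-keyword division, assuming the transcript comes from a speech-to-text engine and that user tags take priority over transcript-derived keywords (the priority rule matches the tag behaviour described later in the description; the function name is invented):

```python
def divide_scene_keywords(transcript, tags, scene_vocab):
    """Merge keywords found in the transcript with the user's tags.
    Tags are kept first so a user-chosen background keyword wins."""
    from_text = [w for w in transcript.lower().split() if w in scene_vocab]
    merged = []
    for kw in list(tags) + from_text:
        if kw in scene_vocab and kw not in merged:
            merged.append(kw)
    return merged
```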
Preferably, in step 3, subtitles are added to the recorded video through the APP back end.
Preferably, the background definition comprises a picture background, a video background and an audio background.
The invention has the following beneficial effects. The front-end APP records video through the camera module; under the processing of the green-screen module the background of the recorded video is keyed to green, and the video is stored in the storage module. Meanwhile, the user configures the recorded video through the editing module, and the recorded video with its configuration information is sent to the APP back end through the uploading module, where scene keywords are obtained by the speech recognition module and the semantic analysis module. The material library locates a background image according to the scene keywords and the editing information of the recorded video, and the synthesized video is finally output by the synthesis module. The screening module presents, one by one, the several background keywords extracted by the speech recognition module and the semantic analysis module, giving the user room to express a preference. The user enters a text tag for the recorded video through the front-end APP; this tag can be a background keyword chosen by the user, so that during semantic analysis the system outputs the background image corresponding to that keyword. The synthesis module can not only screen the background material corresponding to the background keywords from the material library, but also compile background subtitles directly through the subtitle-appending module, further helping the user customize personalized content.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of the system of the present invention;
fig. 2 is a flow chart of the background matching method of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings. It is to be understood that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art from these embodiments without creative effort shall fall within the protection scope of the present invention.
Referring to fig. 1, a system for automatically matching a background according to video content comprises a front-end APP 1 provided with a camera module 2, a storage module 3, a green-screen module 4, an editing module 5 and an uploading module 6, and an APP back end 7 provided with a speech recognition module 8, a semantic analysis module 9, a material library 10 and a synthesis module 11. The camera module 2 is connected to the storage module 3 sequentially through the green-screen module 4 and the editing module 5; the storage module 3 is connected to the speech recognition module 8 and the semantic analysis module 9 through the uploading module 6; and the uploading module 6 and the material library 10 are each connected to the synthesis module 11.
With this arrangement, the front-end APP 1 records video through the camera module 2; under the processing of the green-screen module 4 the background of the recorded video is keyed to green, and the video is stored in the storage module 3. Meanwhile, the user configures the recorded video through the editing module 5, and the recorded video with its configuration information is sent to the APP back end through the uploading module 6, where scene keywords are obtained by the speech recognition module 8 and the semantic analysis module 9. The material library 10 locates a background image according to the scene keywords and the editing information of the recorded video, and the synthesized video is finally output by the synthesis module 11.
The material library 10 is connected with the synthesis module 11 through the screening module.
In this arrangement, the screening module presents, one by one, the several background keywords extracted by the speech recognition module 8 and the semantic analysis module 9, giving the user room to express a preference.
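One way to picture the screening module's one-by-one presentation is as a candidate list the user indexes into; the structure below is an assumption for illustration, not the patent's data model:

```python
def screen_candidates(keywords, material_library):
    """Screening module: list one candidate background per extracted keyword
    so the user can pick a preference."""
    return [(kw, material_library[kw]) for kw in keywords if kw in material_library]

def pick(candidates, choice_index=0):
    """Return the user's preferred background; defaults to the first candidate."""
    return candidates[choice_index][1] if candidates else None
```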
The editing module 5 comprises a tag editing module and a background-type selection module.
In this arrangement, the user enters a text tag for the recorded video through the front-end APP; this tag can be a background keyword chosen by the user, so that during semantic analysis the system outputs the background image corresponding to that keyword.
The material library 10 comprises a static library, a dynamic library and an audio library. The synthesis module 11 is provided with a first input end 14, a second input end 15, a third input end 16 and a synthesis output end 17. The first input end 14 is connected to the uploading module 6, the second input end 15 is connected to the screening module, and the third input end 16 is provided with a subtitle-appending module 18.
With this arrangement, the synthesis module 11 can not only screen the background material corresponding to the background keywords from the material library 10, but also compile background subtitles directly through the subtitle-appending module 18, further helping the user customize personalized content.
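The core of the synthesis step — replacing green pixels with background pixels and reserving a strip for subtitles — can be sketched in pure Python on tiny frames. The green thresholds are illustrative only; a real keyer (and real subtitle rendering) is far more involved:

```python
def chroma_key_pixel(fg, bg):
    """Return the background pixel where the foreground pixel is 'green',
    otherwise keep the foreground pixel. Thresholds are illustrative."""
    r, g, b = fg
    is_green = g > 150 and g > r + 40 and g > b + 40
    return bg if is_green else fg

def composite_frame(frame, background, subtitle_rows=0):
    """Key a whole frame (list of rows of (r, g, b) tuples) over a background;
    optionally darken the bottom rows as a stand-in subtitle band for the
    subtitle-appending module."""
    out = [
        [chroma_key_pixel(f, b) for f, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]
    for row in (out[len(out) - subtitle_rows:] if subtitle_rows else []):
        for i, (r, g, b) in enumerate(row):
            row[i] = (r // 2, g // 2, b // 2)   # darkened subtitle band
    return out
```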
The above description covers only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention, which shall therefore be subject to the protection scope of the claims.
Claims (7)
1. A system for automatically matching a background according to video content, comprising:
a front-end APP provided with a camera module, a storage module, a green-screen module, an editing module and an uploading module; and
an APP back end provided with a speech recognition module, a semantic analysis module, a material library and a synthesis module; wherein:
the camera module is connected with the storage module sequentially through the green screen module and the editing module;
the storage module is connected with the voice recognition module and the semantic analysis module through the uploading module;
the speech recognition module and the semantic analysis module are connected with the synthesis module through the material library.
2. The system for automatically matching a background according to video content as claimed in claim 1, further comprising a screening module, wherein the material library is connected to the synthesis module through the screening module.
3. A method for automatically matching a background based on video content for use in the system of claim 1, comprising the steps of:
Step 1: extracting information from the recorded video to generate text content, and performing semantic analysis on the text content to obtain scene keywords;
Step 2: locating a corresponding background image in a material library using the scene keywords; and
Step 3: synthesizing the background image with the recorded video.
4. The method for automatically matching a background based on video content as claimed in claim 3, wherein the recorded video is a green-screen video recorded through a front-end APP, and the front-end APP adds a tag and a background definition to each recorded video.
5. The method for automatically matching a background based on video content as claimed in claim 4, wherein step 1 comprises: extracting text from the audio information in the recorded video through speech-to-text technology of an APP back end, and performing scene-keyword division on the extracted text together with the tags of the recorded video.
6. The method for automatically matching a background based on video content as claimed in claim 3, wherein in step 3, subtitles are added to the recorded video through an APP back end.
7. The method for automatically matching a background based on video content as claimed in claim 4, wherein the background definition comprises a picture background, a video background and an audio background.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111101320.2A CN113709521B (en) | 2021-09-18 | 2021-09-18 | System for automatically matching background according to video content |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111101320.2A CN113709521B (en) | 2021-09-18 | 2021-09-18 | System for automatically matching background according to video content |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113709521A true CN113709521A (en) | 2021-11-26 |
CN113709521B CN113709521B (en) | 2023-08-29 |
Family
ID=78661281
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111101320.2A Active CN113709521B (en) | 2021-09-18 | 2021-09-18 | System for automatically matching background according to video content |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113709521B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114339446A (en) * | 2021-12-28 | 2022-04-12 | 北京百度网讯科技有限公司 | Audio and video editing method, device, equipment, storage medium and program product |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20130032653A (en) * | 2011-09-23 | 2013-04-02 | 브로드밴드미디어주식회사 | System and method for serching images using caption of moving picture in keyword |
KR20150022088A (en) * | 2013-08-22 | 2015-03-04 | 주식회사 엘지유플러스 | Context-based VOD Search System And Method of VOD Search Using the Same |
KR101894956B1 (en) * | 2017-06-21 | 2018-10-24 | 주식회사 미디어프론트 | Server and method for image generation using real-time enhancement synthesis technology |
CN111327839A (en) * | 2020-02-27 | 2020-06-23 | 江苏尚匠文化传播有限公司 | Video post-production method and system based on virtual video technology |
-
2021
- 2021-09-18 CN CN202111101320.2A patent/CN113709521B/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20130032653A (en) * | 2011-09-23 | 2013-04-02 | 브로드밴드미디어주식회사 | System and method for serching images using caption of moving picture in keyword |
KR20150022088A (en) * | 2013-08-22 | 2015-03-04 | 주식회사 엘지유플러스 | Context-based VOD Search System And Method of VOD Search Using the Same |
KR101894956B1 (en) * | 2017-06-21 | 2018-10-24 | 주식회사 미디어프론트 | Server and method for image generation using real-time enhancement synthesis technology |
CN111327839A (en) * | 2020-02-27 | 2020-06-23 | 江苏尚匠文化传播有限公司 | Video post-production method and system based on virtual video technology |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114339446A (en) * | 2021-12-28 | 2022-04-12 | 北京百度网讯科技有限公司 | Audio and video editing method, device, equipment, storage medium and program product |
CN114339446B (en) * | 2021-12-28 | 2024-04-05 | 北京百度网讯科技有限公司 | Audio/video editing method, device, equipment, storage medium and program product |
Also Published As
Publication number | Publication date |
---|---|
CN113709521B (en) | 2023-08-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107770626B (en) | Video material processing method, video synthesizing device and storage medium | |
JP4250301B2 (en) | Method and system for editing video sequences | |
US8930817B2 (en) | Theme-based slideshows | |
US9317531B2 (en) | Autocaptioning of images | |
CN101202864B (en) | Player for movie contents | |
US8923654B2 (en) | Information processing apparatus and method, and storage medium storing program for displaying images that are divided into groups | |
CN110781328A (en) | Video generation method, system, device and storage medium based on voice recognition | |
CN112579826A (en) | Video display and processing method, device, system, equipment and medium | |
KR100493674B1 (en) | Multimedia data searching and browsing system | |
US7844115B2 (en) | Information processing apparatus, method, and program product | |
CN104735468A (en) | Method and system for synthesizing images into new video based on semantic analysis | |
KR20090094826A (en) | Automated production of multiple output products | |
CN104657074A (en) | Method, device and mobile terminal for realizing sound recording | |
US9666211B2 (en) | Information processing apparatus, information processing method, display control apparatus, and display control method | |
CN113709521A (en) | System for automatically matching background according to video content | |
US20150221114A1 (en) | Information processing apparatus, information processing method, and program | |
CN117478975A (en) | Video generation method, device, computer equipment and storage medium | |
JP6603929B1 (en) | Movie editing server and program | |
CN108255917B (en) | Image management method and device and electronic device | |
US20140297678A1 (en) | Method for searching and sorting digital data | |
CN113269855A (en) | Method, equipment and storage medium for converting text semantics into scene animation | |
CN115250372A (en) | Video processing method, device, equipment and computer readable storage medium | |
JP2021119662A (en) | Server and data allocation method | |
JP7133367B2 (en) | MOVIE EDITING DEVICE, MOVIE EDITING METHOD, AND MOVIE EDITING PROGRAM | |
KR20080084303A (en) | Technology which is storing easily, quickly and accurately only wanted part from the movie and audio files |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |