CN109862393A - Soundtrack generation method, system, device, and storage medium for a video file - Google Patents
Soundtrack generation method, system, device, and storage medium for a video file
- Publication number: CN109862393A (Application CN201910216297.8A)
- Authority: CN (China)
- Prior art keywords: video file, video, soundtrack, background music
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The invention discloses a soundtrack generation method, system, device, and storage medium for a video file. The method comprises: extracting video features from an initial video file to be scored, and generating a soundtrack audio file for the initial video file by combining those video features; generating a test video file based on the initial video file and the soundtrack audio file; and revising the soundtrack audio file in the test video file according to the user profile models and evaluation parameters of viewers of the test video file, to generate a ready-to-use video file. The invention reduces the overall cost of video scoring, and combines video content features with user feedback when generating the soundtrack, so that users obtain a better experience when watching the video.
Description
Technical field
The present invention relates to the technical field of video scoring, and in particular to a soundtrack generation method, system, device, and storage medium for a video file.
Background art
When producing a video file for an audience, the video content is usually produced first, and the soundtrack is then added in post-production according to that content, ultimately forming the video played to users; at present this is especially apparent in advertising video production. In the existing advertising video production process, the advertiser's designers first design the video content according to the client's requirements, and then select existing audio files to score the video in post-production. As a result, the advertising video is not only costly as a whole, but the process also fails to account for the audience's preferences regarding the video's soundtrack. Automatic music generation algorithms do exist, but the existing ones cannot combine the music with the video's content features, so the resulting video soundtracks are mediocre.
Summary of the invention
The main purpose of the present invention is to provide a soundtrack generation method, system, device, and storage medium for a video file, aiming to improve the quality of newly generated advertising video soundtracks, reduce scoring costs, and combine advertising video content features with user feedback to optimize and adjust the advertising video's soundtrack, so that users obtain a better viewing experience when watching the advertising video.
To achieve the above object, the present invention provides a soundtrack generation method for a video file, comprising the following steps:
extracting video features from an initial video file to be scored, and generating a soundtrack audio file for the initial video file by combining the video features;
generating a test video file based on the initial video file and the soundtrack audio file;
revising the soundtrack audio file in the test video file according to the user profile models and evaluation parameters of viewers of the test video file, to generate a ready-to-use video file.
Optionally, the video features include an optical flow strength feature, a chroma histogram feature, and a shot boundary feature, and the step of extracting the video features from the initial video file to be scored comprises:
extracting, in the initial video file, the optical flow map corresponding to each video image and the chroma histogram of the video image;
taking the average optical flow strength of the optical flow maps as the optical flow strength feature of the initial video file;
normalizing the chroma histograms, and taking the normalized result as the chroma histogram feature of the initial video file;
detecting the shot boundaries of the video images, and taking the detected shot boundaries as the shot boundary feature of the initial video file.
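As a minimal sketch of the two numeric features above (assuming per-frame flow vectors and raw histogram bins are already available from an upstream decoder; the helper names are illustrative, not from the patent):

```python
import math

def optical_flow_strength(flow_maps):
    """Mean optical flow magnitude over all frames and pixels.

    flow_maps: list of per-frame flow fields, each a list of (dx, dy)
    vectors (in practice produced by a dense optical flow algorithm).
    """
    mags = [math.hypot(dx, dy) for frame in flow_maps for dx, dy in frame]
    return sum(mags) / len(mags)

def normalized_chroma_histogram(hist):
    """L1-normalize a chroma histogram so its bins sum to 1."""
    total = sum(hist)
    return [b / total for b in hist]

# Toy input: two flow fields of two vectors each.
flows = [[(3.0, 4.0), (0.0, 0.0)], [(6.0, 8.0), (0.0, 0.0)]]
print(optical_flow_strength(flows))            # mean of 5, 0, 10, 0 -> 3.75
print(normalized_chroma_histogram([2, 2, 4]))  # -> [0.25, 0.25, 0.5]
```

The per-frame averages collapse into a single scalar feature, which is what the soundtrack model consumes alongside the histogram vectors.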
Optionally, the video features further include a video emotion score feature, and the step of extracting the video features from the initial video file to be scored further comprises:
reading the video content of the initial video file, and detecting and counting the emotion data that marks the video emotion in the video content;
inputting the emotion data into a preset sentiment analysis model, so that the preset sentiment analysis model predicts from the emotion data an emotion score of the video content;
taking the emotion score as the video emotion score feature of the initial video file.
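A toy sketch of the counting step, with the preset sentiment analysis model replaced by a plain average over per-segment emotion labels (the 1-to-10 scale follows the embodiment; the aggregation rule is an assumption):

```python
def emotion_score_feature(emotion_labels):
    """Aggregate per-segment emotion labels (1-10 scale) into a single
    video emotion score feature. The patent feeds the counted data into
    a preset sentiment analysis model; a plain average stands in here.
    """
    if not emotion_labels:
        raise ValueError("no labeled emotion data in the video content")
    return sum(emotion_labels) / len(emotion_labels)

print(emotion_score_feature([8, 9, 7, 8]))  # -> 8.0
```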
Optionally, the step of generating the soundtrack audio file of the initial video file by combining the video features comprises:
inputting the video features into a preset soundtrack model, the preset soundtrack model having been trained on added preset training samples, the preset training samples including audio-video data and pure audio data;
generating, in the preset soundtrack model, the soundtrack audio file of the initial video file by combining the video features.
Optionally, before the step of inputting the video features into the preset soundtrack model, the method further comprises:
detecting a look-back feature of the initial video file, and inputting the look-back feature into the preset soundtrack model.
Optionally, the preset soundtrack model is a soundtrack model that generates audio files based on a sequence neural network, and the step of generating, in the preset soundtrack model, the soundtrack audio file of the initial video file by combining the video features comprises:
generating a note sequence according to the video features of the initial video file and the look-back feature;
inputting the note sequence into a note duration sequence neural network, so that the note duration neural network outputs a note duration sequence according to the note sequence and the look-back feature;
inputting the note sequence into a drumbeat sequence neural network, so that the drumbeat sequence neural network outputs a drumbeat combination according to the note sequence;
generating the soundtrack audio file of the initial video file according to the note sequence, the note duration sequence, and the drumbeat combination.
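The three-network composition above can be sketched structurally. The "networks" here are stand-in functions, since the patent does not disclose weights or architectures; only the data flow (features to notes, notes to durations and drumbeats, then a merge) mirrors the steps:

```python
import random

def note_network(video_features, lookback):
    """Stand-in for the note sequence network: features -> MIDI pitches."""
    rng = random.Random(sum(video_features) + len(lookback))
    return [60 + rng.randrange(12) for _ in range(8)]

def duration_network(notes, lookback):
    """Stand-in for the note duration network: one duration per note."""
    return [0.5 if n % 2 == 0 else 0.25 for n in notes]

def drum_network(notes):
    """Stand-in for the drumbeat network: a hit under every other note."""
    return [i % 2 == 0 for i, _ in enumerate(notes)]

def generate_soundtrack(video_features, lookback):
    notes = note_network(video_features, lookback)
    durations = duration_network(notes, lookback)
    drums = drum_network(notes)
    # The "soundtrack audio file" here is just the combined event list.
    return list(zip(notes, durations, drums))

events = generate_soundtrack([3.75, 0.25, 2.0], lookback=[])
print(len(events))  # -> 8
```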
Optionally, the step of generating the test video file based on the initial video file and the soundtrack audio file comprises:
reading the playback timelines of the initial video file and the soundtrack audio file;
synthesizing the initial video file and the soundtrack audio file into the test video file based on the playback timelines.
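Synthesis against the playback timeline can be sketched as trimming the generated soundtrack events to the video's duration (the event representation of (start, length) pairs is an assumption, not the patent's file format):

```python
def align_audio_to_video(video_duration, audio_events):
    """Trim a soundtrack event list to the video's playback timeline.

    Each event is a (start_time, length) pair; events starting past
    the video's end are dropped, the last kept event is clipped, and
    any remainder is implicitly silence.
    """
    aligned = []
    for start, length in audio_events:
        if start >= video_duration:
            break
        aligned.append((start, min(length, video_duration - start)))
    return aligned

events = [(0.0, 2.0), (2.0, 2.0), (4.0, 2.0)]
print(align_audio_to_video(5.0, events))  # -> [(0.0, 2.0), (2.0, 2.0), (4.0, 1.0)]
```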
Optionally, the step of revising the soundtrack audio file in the test video file according to the user profile models and evaluation parameters of viewers of the test video file, to generate the ready-to-use video file, comprises:
detecting the publishing platform of the test video file, and obtaining from the publishing platform the user profile models and evaluation parameters of viewers of the test video file;
reading the evaluation parameters with which the users of the same user profile model watched the test video file within a predetermined period, and constructing user behavior feature sequences from the evaluation parameters;
calculating, from the user behavior feature sequences, the users' preference probability distribution data for the soundtrack audio file when watching the test video file;
guiding, with the preference probability distribution data, the preset soundtrack model that generated the soundtrack audio file, so as to revise the soundtrack audio file in the test video file and generate the ready-to-use video file.
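One hedged reading of the preference calculation: aggregate viewers' engagement scores per soundtrack variant and normalize them into a probability distribution. The scoring scheme and names below are illustrative only; the patent does not specify how the behavior sequences are reduced:

```python
from collections import defaultdict

def preference_distribution(behavior_sequences):
    """behavior_sequences: (soundtrack_id, engagement_score) pairs built
    from viewers' evaluation parameters within the predetermined period.
    Returns a probability distribution over soundtrack variants.
    """
    totals = defaultdict(float)
    for soundtrack_id, score in behavior_sequences:
        totals[soundtrack_id] += score
    grand = sum(totals.values())
    return {k: v / grand for k, v in totals.items()}

seq = [("uptempo", 3.0), ("calm", 1.0), ("uptempo", 4.0)]
print(preference_distribution(seq))  # -> {'uptempo': 0.875, 'calm': 0.125}
```

The resulting distribution is what would "guide" the soundtrack model, e.g. by weighting which variant it revises toward.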
In addition, the present invention also provides a soundtrack generation system for a video file. The system generates the soundtrack audio of a video file based on a sequence neural network, and comprises:
a soundtrack audio generation module, configured to extract the video features from an initial video file to be scored, and to generate a soundtrack audio file for the initial video file by combining the video features;
a test video generation module, configured to generate a test video file based on the initial video file and the soundtrack audio file;
a soundtrack audio revision module, configured to revise the soundtrack audio file in the test video file according to the user profile models and evaluation parameters of viewers of the test video file, and to generate a ready-to-use video file.
Optionally, the soundtrack generation system for a video file further comprises:
a learning and training module, configured to add preset training samples to the preset soundtrack model that generates the soundtrack audio file and to train it, the preset training samples including audio-video data and pure audio data.
In addition, the present invention also provides a soundtrack generation device for a video file, comprising: a memory, a processor, and a video file soundtrack program stored in the memory and executable on the processor, the video file soundtrack program, when executed by the processor, implementing the steps of the soundtrack generation method for a video file described above.
In addition, the present invention also provides a storage medium applied to a computer, the storage medium storing a video file soundtrack program which, when executed by a processor, implements the steps of the soundtrack generation method for a video file described above.
The present invention extracts video features from an initial video file to be scored and generates a soundtrack audio file for the initial video file by combining those video features; generates a test video file based on the initial video file and the soundtrack audio file; and revises the soundtrack audio file in the test video file according to the user profile models and evaluation parameters of viewers of the test video file, generating a ready-to-use video file. Thus, by combining the video features extracted from the content of the initial video file, performing transfer learning with the added audio-video data and pure audio data, and using the collected feature data of users who watch the advertising video file to guide and optimize the soundtrack model, a ready-to-use, scored video file is generated from the current initial video file. The automatic scoring algorithm not only scores automatically and eliminates the high cost of scoring a video file; combining the video content features further improves the overall quality of the soundtrack; and the soundtrack audio file is additionally optimized and adjusted based on the feedback evaluations of the video file's viewers, meeting users' preferences for the soundtrack content and improving their experience of watching the video file.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of the hardware operating environment involved in the embodiments of the present invention;
Fig. 2 is a schematic flow diagram of a first embodiment of the soundtrack generation method for a video file of the present invention;
Fig. 3 is a schematic diagram of the refined sub-steps of step S100 in Fig. 2;
Fig. 4 is a schematic flow diagram of a second embodiment of the soundtrack generation method for a video file of the present invention;
Fig. 5 is a schematic flow diagram of a third embodiment of the soundtrack generation method for a video file of the present invention.
The realization of the objects, functional features, and advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Specific embodiments
It should be understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit it.
As shown in Fig. 1, Fig. 1 is a schematic structural diagram of the hardware operating environment involved in the embodiments of the present invention.
It should be noted that Fig. 1 may be the schematic structural diagram of the hardware operating environment of the soundtrack generation device for a video file. The soundtrack generation device of the embodiments of the present invention may be a terminal device such as a PC or a portable computer.
As shown in Fig. 1, the soundtrack generation device may include: a processor 1001, such as a CPU; a network interface 1004; a user interface 1003; a memory 1005; and a communication bus 1002. The communication bus 1002 realizes connection and communication between these components. The user interface 1003 may include a display (Display) and an input unit such as a keyboard (Keyboard), and may optionally also include standard wired and wireless interfaces. The network interface 1004 may optionally include standard wired and wireless interfaces (such as a WI-FI interface). The memory 1005 may be a high-speed RAM memory, or a stable non-volatile memory such as a disk memory. The memory 1005 may optionally also be a storage device independent of the aforementioned processor 1001.
Those skilled in the art will understand that the structure of the soundtrack generation device shown in Fig. 1 does not limit the device, which may include more or fewer components than illustrated, combine certain components, or arrange the components differently.
As shown in Fig. 1, the memory 1005, as a kind of computer storage medium, may include an operating system, a network communication module, a user interface module, and a video file soundtrack program. The operating system is a program that manages and controls the hardware and software resources of the soundtrack generation device and supports the running of the video file soundtrack program and other software or programs.
In the soundtrack generation device shown in Fig. 1, the user interface 1003 is mainly used for data communication with each terminal; the network interface 1004 is mainly used for connecting to a background server and communicating data with it; and the processor 1001 may be used to call the video file soundtrack program stored in the memory 1005 and perform the following operations:
extracting video features from an initial video file to be scored, and generating a soundtrack audio file for the initial video file by combining the video features;
generating a test video file based on the initial video file and the soundtrack audio file;
revising the soundtrack audio file in the test video file according to the user profile models and evaluation parameters of viewers of the test video file, to generate a ready-to-use video file.
Further, the processor 1001 may also be used to call the video file soundtrack program stored in the memory 1005 and perform the following steps:
extracting, in the initial video file, the optical flow map corresponding to each video image and the chroma histogram of the video image;
taking the average optical flow strength of the optical flow maps as the optical flow strength feature of the initial video file;
normalizing the chroma histograms, and taking the normalized result as the chroma histogram feature of the initial video file;
detecting the shot boundaries of the video images, and taking the detected shot boundaries as the shot boundary feature of the initial video file.
Further, the processor 1001 may also be used to call the video file soundtrack program stored in the memory 1005 and perform the following steps:
the video features further include a video emotion score feature, and the step of extracting the video features from the initial video file to be scored further comprises:
reading the video content of the initial video file, and detecting and counting the emotion data that marks the video emotion in the video content;
inputting the emotion data into a preset sentiment analysis model, so that the preset sentiment analysis model predicts from the emotion data an emotion score of the video content;
taking the emotion score as the video emotion score feature of the initial video file.
Further, the processor 1001 may also be used to call the video file soundtrack program stored in the memory 1005 and perform the following steps:
inputting the video features into a preset soundtrack model, the preset soundtrack model having been trained on added preset training samples, the preset training samples including audio-video data and pure audio data;
generating, in the preset soundtrack model, the soundtrack audio file of the initial video file by combining the video features.
Further, the processor 1001 may also be used to call the video file soundtrack program stored in the memory 1005 and, before the step of inputting the video features into the preset soundtrack model, perform the following steps:
detecting a look-back feature of the initial video file, and inputting the look-back feature into the preset soundtrack model.
Further, the processor 1001 may also be used to call the video file soundtrack program stored in the memory 1005 and perform the following steps:
generating a note sequence according to the video features of the initial video file and the look-back feature;
inputting the note sequence into a note duration sequence neural network, so that the note duration neural network outputs a note duration sequence according to the note sequence and the look-back feature;
inputting the note sequence into a drumbeat sequence neural network, so that the drumbeat sequence neural network outputs a drumbeat combination according to the note sequence;
generating the soundtrack audio file of the initial video file according to the note sequence, the note duration sequence, and the drumbeat combination.
Further, the processor 1001 may also be used to call the video file soundtrack program stored in the memory 1005 and perform the following steps:
reading the playback timelines of the initial video file and the soundtrack audio file;
synthesizing the initial video file and the soundtrack audio file into the test video file based on the playback timelines.
Further, the processor 1001 may also be used to call the video file soundtrack program stored in the memory 1005 and perform the following steps:
detecting the publishing platform of the test video file, and obtaining from the publishing platform the user profile models and evaluation parameters of viewers of the test video file;
reading the evaluation parameters with which the users of the same user profile model watched the test video file within a predetermined period, and constructing user behavior feature sequences from the evaluation parameters;
calculating, from the user behavior feature sequences, the users' preference probability distribution data for the soundtrack audio file when watching the test video file;
guiding, with the preference probability distribution data, the preset soundtrack model that generated the soundtrack audio file, so as to revise the soundtrack audio file in the test video file and generate the ready-to-use video file.
Based on the above structure, embodiments of the soundtrack generation method for a video file of the present invention are proposed.
Referring to Fig. 2, Fig. 2 is a schematic flow diagram of a first embodiment of the soundtrack generation method for a video file of the present invention.
The embodiments of the present invention provide embodiments of the soundtrack generation method for a video file. It should be noted that, although a logical order is shown in the flowchart, in some cases the steps may be performed in an order different from that shown or described herein.
The soundtrack generation method of the embodiments of the present invention is applied to a soundtrack generation device for a video file, which may be a terminal device such as a PC or a portable computer, and is not specifically limited here.
The soundtrack generation method for a video file of this embodiment comprises:
Step S100: extracting video features from an initial video file to be scored, and generating a soundtrack audio file for the initial video file by combining the video features.
When the start of playback of the initial video file is detected, preset algorithms and preset sequence neural network models are called to extract the video features from the initial video file being played; the extracted video features are sent to a preset soundtrack model that scores automatically based on a sequence neural network; and the preset soundtrack model, combining the current video features, sequentially generates the soundtrack audio file of the initial video file according to the playback timing of the current initial video file.
In this embodiment, the video file may specifically be an advertising video whose content has been produced by the advertiser's designers according to the client's requirements; the preset algorithms and preset sequence neural network models may specifically be the Gunnar Farneback optical flow algorithm, a chroma histogram algorithm, a shot boundary detection model, and a video classification training prediction model; and the preset soundtrack model may specifically be an automatic soundtrack model based on a sequence neural network.
Specifically, for example, in this embodiment, after the above preset algorithms and preset sequence neural network models are called to extract the video features from the advertising video content being played, the video features are passed into the preset soundtrack model based on a sequence neural network, so that the preset soundtrack model combines the video features of the advertising video to score it automatically.
Further, referring to Fig. 3, Fig. 3 is a schematic diagram of the refined sub-steps of step S100 in Fig. 2. The video features of the initial video file to be scored include an optical flow strength feature, a chroma histogram feature, and a shot boundary feature, and in step S100 the step of extracting the video features from the initial video file to be scored comprises:
Step S101: extracting, in the initial video file, the optical flow map corresponding to each video image and the chroma histogram of the video image.
The preset algorithms are called to analyze and extract the optical flow map and the chroma histogram corresponding to each frame of the initial video file being played.
Specifically, for example, in this embodiment, throughout the playback of the current advertising video, from start to finish, the Gunnar Farneback optical flow algorithm and the chroma histogram algorithm are called, separately or simultaneously, to analyze and extract, frame by frame, the optical flow maps and chroma histograms corresponding to the video images of the current advertising video.
In this embodiment, the Gunnar Farneback optical flow algorithm is called to extract the dense optical flow of each frame of the current advertising video and to form the optical flow map corresponding to each video image.
Step S102: taking the average optical flow strength of the optical flow maps as the optical flow strength feature of the initial video file.
Specifically, for example, in this embodiment, the Gunnar Farneback optical flow algorithm is called to calculate the average optical flow strength of the optical flow maps corresponding to the frames of the current advertising video, and this average optical flow strength is taken as the optical flow strength feature of the current advertising video file.
Step S103: normalizing the chroma histograms, and taking the result as the chroma histogram feature of the initial video file.
Specifically, for example, in this embodiment, the chroma histogram algorithm is called to further normalize the extracted chroma histogram of each frame of the current advertising video, and the normalized chroma histogram vectors are taken as the chroma histogram feature of the current advertising video file.
Step S104: detecting the shot boundaries of the video images, and taking them as the shot boundary feature of the initial video file.
A preset shot boundary detection model is called to detect how each segment of video content changes in the current initial video file, and the shot boundary detection result is taken as the shot boundary feature of the current initial video file.
Specifically, for example, in this embodiment, throughout the playback of the current advertising video, from start to finish, the preset shot boundary detection model is called to detect the segment changes of the advertising video being played, and the detection result of the shot boundary detection model is taken as the shot boundary feature of the advertising video file being played.
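A common stand-in for such a shot boundary detector (not the patent's model, whose internals are undisclosed) thresholds the distance between consecutive frame histograms; a boundary is declared wherever the frame content changes abruptly:

```python
def detect_shot_boundaries(frame_histograms, threshold=0.5):
    """Mark a shot boundary wherever the L1 distance between
    consecutive normalized frame histograms exceeds a threshold.
    Returns the frame indices at which new shots begin.
    """
    boundaries = []
    for i in range(1, len(frame_histograms)):
        prev, cur = frame_histograms[i - 1], frame_histograms[i]
        dist = sum(abs(a - b) for a, b in zip(prev, cur))
        if dist > threshold:
            boundaries.append(i)
    return boundaries

hists = [[1.0, 0.0], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]
print(detect_shot_boundaries(hists))  # -> [2]
```

The threshold value is a tunable assumption; learned detectors replace this heuristic with a trained classifier over the same per-frame features.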
Further, the video features of the initial video file to be scored also include a video emotion score feature, and in step S100 the step of extracting the video features from the initial video file to be scored further comprises:
Step S105: reading the video content of the initial video file, and detecting and counting the emotion data that marks the video emotion in the video content.
The video content of the initial video file being played is read and detected, and the emotion data in the video content that identifies the video emotion is counted.
Specifically, for example, in this embodiment, the video content of the advertising video being played is read, and the video data in that content is labeled, so that, according to the labels, the emotion score of the current advertising video's content is analyzed and counted on a scale of 1 to 10 (the higher the score, the more passionate or joyful the content of the current advertising video; the lower the score, the calmer the content).
Step S106: inputting the emotion data into a preset sentiment analysis model, so that the preset sentiment analysis model predicts from the emotion data the emotion score of the video content.
Specifically, for example, in this embodiment, after the labeled data is obtained, it is input into a preset video classification training prediction model, and the video classification training prediction model based on a sequence neural network is called to further predict the emotion score of the current advertising video at the next moment.
In this embodiment, the preset video classification training prediction model used may specifically be a TSN (Temporal Segment Network) video classification model based on action recognition, or a trend prediction (Stream) video classification model.
Step S107: taking the emotion score as the video emotion score feature of the initial video file.
Specifically, for example, in this embodiment, the emotion score of the current advertising video content predicted by the video classification training prediction model is taken as the emotion score feature of the current advertising video file.
Further, in step S100, the step of generating the soundtrack audio file of the initial video file by combining the video features comprises:
Step S108: inputting the video features into the preset soundtrack model.
Specifically, for example, in this embodiment, the video features of the current advertising video, namely the optical flow strength feature, chroma histogram feature, shot boundary feature, and emotion score feature extracted by the Gunnar Farneback optical flow algorithm, the chroma histogram algorithm, the shot boundary detection model, and the video classification training prediction model, are passed into the preset soundtrack model based on a sequence neural network.
In this embodiment, the preset soundtrack model used may specifically be an automatic soundtrack model based on a temporally recursive sequence neural network (an LSTM sequence neural network). Before the preset soundtrack model combines the video features to generate the soundtrack audio file of the video file, the preset soundtrack model undergoes learning and training through added preset training samples, the preset training samples including audio-video data and pure audio data; training the automatic soundtrack model based on a sequence neural network with the added training samples enables it to obtain better results when automatically scoring a video file.
Specifically, for example, using a transfer learning approach, the automatic scoring model based on the sequence neural network is trained with two different classes of samples: audio-video data (e.g., MTV clips) and various pure audio data. Training follows the generalization problem definition of transfer learning. The source task is trained with the second class of samples (pure audio data) using an encoder-decoder model structure: the encoder maps the input music samples into a feature space, and the decoder decodes the embedding features in the feature space back into music, realizing the mapping from the feature space to music; through the training of the encoder and decoder, the source-task model obtains the model weights from the feature space to music. The target task is trained with the first class of samples (audio-video data): a feature extraction module first maps the audio-video data into the feature space of the source task, and the decoder model from the source task then maps the embedding features to music, realizing end-to-end learning and synchronous model updating.
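The source-task/target-task weight reuse described above can be sketched as follows. This is an illustrative Python sketch only: the layer sizes, class names, and plain linear maps are assumptions made for clarity, not details from the patent, which specifies only an encoder-decoder trained on pure audio whose decoder is reused for the audio-video target task.

```python
import numpy as np

rng = np.random.default_rng(0)

class Encoder:
    """Maps a music sample (here a flat feature vector) into the shared feature space."""
    def __init__(self, in_dim, latent_dim):
        self.W = rng.normal(0, 0.1, (in_dim, latent_dim))
    def __call__(self, x):
        return np.tanh(x @ self.W)

class Decoder:
    """Maps an embedding in the feature space back to a music representation."""
    def __init__(self, latent_dim, out_dim):
        self.W = rng.normal(0, 0.1, (latent_dim, out_dim))
    def __call__(self, z):
        return z @ self.W

# --- Source task: pure audio data, encoder-decoder autoencoding ---
enc = Encoder(in_dim=64, latent_dim=16)
dec = Decoder(latent_dim=16, out_dim=64)   # learns weights "feature space -> music"
audio_batch = rng.normal(size=(8, 64))
recon = dec(enc(audio_batch))              # music -> feature space -> music

# --- Target task: audio-video data, new feature extractor + reused decoder ---
class VideoFeatureExtractor:
    """Maps video features into the *same* feature space as the source task."""
    def __init__(self, in_dim, latent_dim):
        self.W = rng.normal(0, 0.1, (in_dim, latent_dim))
    def __call__(self, v):
        return np.tanh(v @ self.W)

video_extractor = VideoFeatureExtractor(in_dim=32, latent_dim=16)
video_batch = rng.normal(size=(8, 32))
music_out = dec(video_extractor(video_batch))  # reuses source-task decoder weights
```

The key design point is that `dec` appears in both tasks: the feature-space-to-music mapping is learned once on abundant pure audio and transferred, so the target task only has to learn the video-to-feature-space mapping.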
Further, before the step of inputting the video features into the preset scoring model in step S108, the background-music scoring method for a video file of the present invention further includes:
detecting the lookback feature of the initial video file, and inputting the lookback feature into the preset scoring model.
In this embodiment, in order to enable the above preset scoring model based on the sequence neural network to better learn the soundtrack audio of the advertisement video file, the lookback features produced during generation (i.e., the outputs one to two bars earlier, whether the latest output is the same as the output one to two bars earlier, and the position of the current output within the current bar) are detected, and the lookback features, together with the other video features of the current video file, are input into the preset scoring model based on the sequence neural network, so that the scoring model can better recognize and learn the repeated and similar melodies in the score of the current video file.
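The lookback signals just described (the outputs one and two bars earlier, whether the latest output repeats the output one bar earlier, and the position within the current bar) can be sketched as a small helper. The note encoding and the assumption of sixteen steps per bar are illustrative, not from the patent.

```python
def lookback_features(history, steps_per_bar=16):
    """Build the lookback signals from the note history generated so far.

    history: list of note ids, one per generation time step (hypothetical encoding).
    Returns (note one bar ago, note two bars ago,
             latest output equals output one bar ago, position within current bar).
    """
    t = len(history)
    one_bar = history[t - steps_per_bar] if t >= steps_per_bar else None
    two_bars = history[t - 2 * steps_per_bar] if t >= 2 * steps_per_bar else None
    same = bool(history) and one_bar is not None and history[-1] == one_bar
    pos_in_bar = t % steps_per_bar
    return one_bar, two_bars, same, pos_in_bar
```

At each step these values would be concatenated with the frame's video features before being fed to the scoring model.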
Step S109: in the preset scoring model, generating the soundtrack audio file of the initial video file in conjunction with the video features.
After the preset scoring model based on the sequence neural network receives the video features of the currently playing initial video file and the lookback feature of the current initial video file, it combines the current video features and, following the playback timing of the current initial video file, sequentially generates the soundtrack audio file of the initial video file.
Specifically, for example, in this embodiment, after the preset scoring model based on the sequence neural network receives the optical flow intensity feature, chroma histogram feature, shot boundary feature and emotion score feature of the current advertisement video extracted by the Gunnar Farnebäck optical flow algorithm, the chroma histogram algorithm, the shot boundary detection model and the video-classification training prediction model, together with the lookback feature of the current video file, it combines the video features and lookback feature of the advertisement video at the current playback moment with the soundtrack audio already generated up to the moment before the current playback moment, and the computed prediction of the sequence neural network automatically generates the soundtrack audio of the next playback moment of the current advertisement video. Following the playback timing of the current advertisement video, the above scoring operation is repeated to sequentially generate the soundtrack audio file until the current advertisement video finishes.
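The sequential generation described above can be sketched as an autoregressive loop: at each playback step a recurrent state is updated from the frame's video features and the previous note, and the most probable of the 37 outputs is taken as the current note. The cell structure, feature dimension, and random weights here are illustrative assumptions, not the patent's trained LSTM.

```python
import numpy as np

rng = np.random.default_rng(42)
N_NOTES = 37           # 36 pitches (C3-C6 range) + 1 blank position, as in the embodiment
FEAT_DIM = 8           # per-frame video feature vector (hypothetical size)
HID = 32

# hypothetical recurrent scoring cell: h_t = tanh(Wx x_t + Wh h_{t-1})
Wx = rng.normal(0, 0.1, (FEAT_DIM + N_NOTES, HID))
Wh = rng.normal(0, 0.1, (HID, HID))
Wo = rng.normal(0, 0.1, (HID, N_NOTES))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def generate_notes(video_features):
    """Autoregressively pick one note per playback step, conditioned on the
    frame's video features and the previously generated note (one-hot)."""
    h = np.zeros(HID)
    prev = np.zeros(N_NOTES)
    notes = []
    for feat in video_features:            # one feature vector per time step
        x = np.concatenate([feat, prev])
        h = np.tanh(x @ Wx + h @ Wh)
        probs = softmax(h @ Wo)            # distribution over the 37 outputs
        note = int(np.argmax(probs))       # take the most probable note
        notes.append(note)
        prev = np.eye(N_NOTES)[note]
    return notes

melody = generate_notes(rng.normal(size=(20, FEAT_DIM)))
```

The loop terminates when the video's frames run out, mirroring "until the current advertisement video finishes".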
Step S200: generating a test video file based on the initial video file and the soundtrack audio file.
According to the current initial video file, and according to the playback time sequence of the soundtrack audio file generated from the video features and lookback feature of the initial video file, the initial video file and the soundtrack audio file are combined to generate a test video file in which the current initial video file contains the audio content.
Further, step S200 includes:
Step S201: reading the playback time sequences of the initial video file and the soundtrack audio file.
Specifically, for example, in this embodiment, the playback time sequence of the currently playing advertisement video file is read, together with the playback time sequence of the soundtrack audio file generated by the preset scoring model based on the sequence neural network from the optical flow intensity feature, chroma histogram feature, shot boundary feature and emotion score feature of the current advertisement video file and the lookback feature of the current video file.
Step S202: synthesizing the initial video file and the soundtrack audio file into a test video file based on the playback time sequences.
Specifically, for example, in this embodiment, according to the playback time sequence of the currently playing advertisement video file that has been read, and the playback time sequence of the corresponding soundtrack audio file, the current soundtrack audio file is combined into the current advertisement video file to generate a test video file in which the current advertisement video file contains the audio content.
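Before the two streams are synthesized, the soundtrack's playback time sequence must match the video's. A minimal sketch of that alignment, assuming the score is a raw list of samples and assuming a sample rate and frame rate (both illustrative; the patent does not specify the muxing mechanics):

```python
def align_audio_to_video(audio, video_frames, sr=44100, fps=25):
    """Trim or zero-pad the generated soundtrack so its playback time sequence
    matches the video's, before muxing the two streams together.
    (The parameter names and defaults are illustrative, not from the patent.)"""
    video_seconds = video_frames / fps
    target_len = int(round(video_seconds * sr))
    if len(audio) >= target_len:
        return audio[:target_len]                     # trim an overlong score
    return audio + [0.0] * (target_len - len(audio))  # pad the tail with silence
```

The aligned audio and the video frames would then be handed to a container muxer to produce the test video file.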
Step S300: correcting the soundtrack audio file in the test video file according to the user portrait models and evaluation parameters of the viewers of the test video file, and generating a stand-by video file.
On the release platform of the current initial video file, the audience users watching the current initial video file are detected, their user portrait models are obtained from the platform, together with their evaluation parameters for the test video file while watching the current test video file. A preset recommendation model is called, and the obtained user portrait models and evaluation parameters are input into the recommendation model to predict the audience users' preferences when watching the current video file. The preset scoring model is optimized according to the prediction result, so as to guide the preset scoring model in correcting the generated soundtrack audio file, finally generating the stand-by video file of the current test video file.
In the present invention, the video features of the initial video file are extracted from the initial video file to be scored, and the soundtrack audio file of the initial video file is generated in conjunction with the video features; a test video file is generated based on the initial video file and the soundtrack audio file; and the soundtrack audio file in the test video file is corrected according to the user portrait models and evaluation parameters of the viewers of the test video file, generating a stand-by video file. Thus, in conjunction with the video features extracted from the video content of the initial video file, transfer-learning training is performed by adding audio-video data and pure audio data, and the scoring model is guided and optimized with the user feature data collected from the audience of the advertisement video file, generating the stand-by video file of the current initial video file after scoring. Not only does the automatic scoring algorithm realize automatic scoring and reduce the high cost of scoring a video file, but scoring in conjunction with the video content features further improves the overall quality of the score; moreover, the soundtrack audio file is also optimized and adjusted based on the feedback evaluations of the video file's audience, meeting the users' preferences for the score content and improving the users' viewing experience of the video file.
Further, a second embodiment of the background-music scoring method for a video file of the present invention is proposed.
Referring to Fig. 4, Fig. 4 is a flow diagram of the second embodiment of the background-music scoring method for a video file of the present invention. Based on the first embodiment of the background-music scoring method for a video file above, in this embodiment, the step S109 above of generating, in the preset scoring model, the soundtrack audio file of the initial video file in conjunction with the video features includes:
Step S1091: generating a note sequence according to the video features of the initial video file and the lookback feature.
After the video features extracted from the currently playing initial video file by calling the preset algorithms and the preset sequence neural network models, together with the detected lookback feature of the current initial video file, are input into the preset scoring model based on the sequence neural network, the preset scoring model first generates the note sequence of the score in conjunction with the video features and the lookback feature.
Specifically, for example, in this embodiment, after the preset scoring model based on the sequence neural network receives the optical flow intensity feature, chroma histogram feature, shot boundary feature and emotion score feature of the current advertisement video and the lookback feature of the current advertisement video, the LSTM sequence neural network, at each playback moment t, takes as input the video features and lookback feature of time point t (i.e., the optical flow intensity feature, chroma histogram feature, shot boundary feature and emotion score feature) together with the note output at time point t-1 (i.e., the moment before the current playback moment), and the LSTM sequence neural network outputs at each time point a probability distribution over the candidate notes; the note with the highest probability is taken as the current note.
In this embodiment, to simplify the automatic scoring model and optimize its effect, the range of output notes is limited to the three octaves from C3 to C6, i.e., 36 notes. The output of the model is therefore a 37-dimensional probability distribution, representing the 36 notes plus one blank position (i.e., no note at this moment).
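The 37-way output described above can be decoded into pitches as follows, assuming index 36 is the blank position and indices 0-35 run chromatically from C3 upward (the patent fixes only the C3-C6 range and the count of 36 notes; this exact index layout is an assumption):

```python
# Illustrative decoding of the model's 37-way output index into a pitch name.
NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def index_to_pitch(i):
    """Map output index 0..36 to a pitch string; 36 is the rest/blank position."""
    if i == 36:
        return "rest"
    octave = 3 + i // 12          # 36 chromatic notes starting at C3
    return f"{NAMES[i % 12]}{octave}"
```

Taking the argmax of the 37-dimensional distribution and passing it through this mapping yields either a concrete note or a rest for the current time point.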
Step S1092: inputting the note sequence into the note duration sequence neural network, so that the note duration neural network outputs a note duration sequence according to the note sequence and the lookback feature.
In this embodiment, the note sequence generated by the preset scoring model in conjunction with the video features and lookback feature of the current video file is taken as input and fed into the note duration sequence neural network in the current preset scoring model, and the note duration neural network, combining the lookback features corresponding to the note sequence and to each playback moment of the video file, outputs the note duration sequence of the current note sequence.
Step S1093: inputting the note sequence into the drum sequence neural network, so that the drum sequence neural network outputs drumbeat combinations according to the note sequence.
In this embodiment, the preset scoring model takes the video features of the current video file and the note sequence as input and feeds them into the drum sequence neural network in the current preset scoring model. For each bar of the note sequence, according to the note sequence of the current bar and the drumbeat combination of the bar before the current bar, the drum sequence neural network selects from the existing drumbeat combination patterns of the current drum sequence neural network and outputs the drumbeat combination of the current bar.
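The per-bar selection described above, choosing from existing drumbeat combination patterns given the current bar's notes and the previous bar's pattern, can be sketched as a simple scoring rule. The pattern set, the matching score, and the continuity bonus are illustrative assumptions standing in for the drum sequence neural network:

```python
# Illustrative 8-step hit patterns (1 = drum hit on that step).
PATTERNS = {
    "four_on_floor": [1, 0, 1, 0, 1, 0, 1, 0],
    "backbeat":      [1, 0, 0, 0, 1, 0, 0, 0],
    "sparse":        [1, 0, 0, 0, 0, 0, 0, 0],
}

def pick_drum_pattern(bar_notes, prev_pattern):
    """Choose the pattern whose hits best coincide with sounding notes in the
    current bar, with a small bonus for keeping the previous bar's pattern."""
    def score(name, hits):
        onsets = [1 if n != 36 else 0 for n in bar_notes]  # 36 = rest index
        match = sum(h * o for h, o in zip(hits, onsets))
        return match + (0.5 if name == prev_pattern else 0.0)
    return max(PATTERNS, key=lambda name: score(name, PATTERNS[name]))
```

A real drum network would learn this choice; the sketch only shows the interface (current bar's notes in, one pattern out, conditioned on the previous bar).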
Step S1094: generating the soundtrack audio file of the initial video file according to the note sequence, the note duration sequence and the drumbeat combinations.
According to the playback time sequence of the current initial video file, the note sequences generated by the preset scoring model based on the sequence neural network from the video features and lookback feature of the current video file, together with the note duration sequence of each note sequence and the drumbeat combination of each note sequence, are synthesized into the soundtrack audio file of the current initial video file.
In the present invention, the note sequence is input into the note duration sequence neural network so that the note duration neural network outputs the note duration sequence according to the note sequence and the lookback feature; the note sequence is input into the drum sequence neural network so that the drum sequence neural network outputs drumbeat combinations according to the note sequence; and the soundtrack audio file of the initial video file is generated according to the note sequence, the note duration sequence and the drumbeat combinations. Thus, taking the video content of the video file as the basis, mature sequence neural networks are called in conjunction with the video features of the video file to automatically generate, layer by layer and in order, the soundtrack audio file of the current video file. This reduces the overall cost of producing advertisement video scores as previously incurred, improves the overall quality of advertisement video scores, and gives the soundtrack audio the good effect of combining organically with the video features, thereby providing the audience of the advertisement video with a better viewing experience.
Further, a third embodiment of the background-music scoring method for a video file of the present invention is proposed.
Referring to Fig. 5, Fig. 5 is a flow diagram of the third embodiment of the background-music scoring method for a video file of the present invention. Based on the first and second embodiments of the background-music scoring method for a video file above, in this embodiment, the step S300 of correcting the soundtrack audio file in the test video file according to the user portrait models and evaluation parameters of the viewers of the test video file, and generating a stand-by video file, includes:
Step S301: detecting the release platform of the test video file, and obtaining from the release platform the user portrait models and evaluation parameters of the viewers of the test video file.
Specifically, for example, in this embodiment, the release platform of the current advertisement video, a DSP (demand-side platform), is detected; the audience users watching the current advertisement video are detected on the DSP, and the user portrait models of some of the audience users are extracted, together with those audience users' evaluation parameters for the test video file while watching the current advertisement's test video file.
In this embodiment, the user portrait model includes: age, gender, region, client type, etc.; the audience users' evaluation parameters for the test video file include: clicks, playback duration, playback times, drumbeat type, score style, etc.
Step S302: reading the evaluation parameters with which each user of the same user portrait model watched the test video file within a predetermined period, and constructing a user behavior feature sequence according to the evaluation parameters.
In this embodiment, a preset recommendation model is called, and the user portrait models are input into the recommendation model to predict the audience users' preferences when watching the current test video file; the preset scoring model is optimized according to the prediction result, so as to guide the preset scoring model in correcting the generated soundtrack audio file, finally generating the stand-by video file of the current test video file.
Specifically, for example, in this embodiment, the preset recommendation model may specifically be a session-based recommendation model. In the session-based recommendation model, the portrait models of a class of audience users having the same age, gender, region, client type, etc., are read, and their evaluation parameters while watching the current advertisement video within a certain predetermined time period, for example one to two weeks, such as clicks, playback duration, playback times, drumbeat type, score style, etc., are ordered chronologically within the one to two weeks to construct the user behavior feature sequence of the current class of audience users with identical portrait models.
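The construction of the user behavior feature sequence, keeping one portrait cohort's evaluation parameters inside the predetermined period and ordering them chronologically, can be sketched as follows (the dates and evaluation fields are illustrative, not from the patent):

```python
from datetime import date

# Hypothetical event records for one portrait cohort: (date, evaluation parameters).
events = [
    (date(2019, 3, 4), {"clicked": 1, "play_seconds": 12, "drum": "backbeat"}),
    (date(2019, 3, 1), {"clicked": 0, "play_seconds": 3,  "drum": "sparse"}),
    (date(2019, 3, 9), {"clicked": 1, "play_seconds": 30, "drum": "backbeat"}),
]

def build_behavior_sequence(events, start, end):
    """Keep one cohort's evaluations inside the predetermined period and
    order them chronologically, as the embodiment describes."""
    kept = [(d, e) for d, e in events if start <= d <= end]
    kept.sort(key=lambda de: de[0])
    return [e for _, e in kept]

seq = build_behavior_sequence(events, date(2019, 3, 1), date(2019, 3, 14))
```

The resulting ordered list is what the session-based recommendation model consumes as its input sequence.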
Step S303: calculating, according to the user behavior feature sequence, the users' preference probability distribution data for the soundtrack audio file when watching the test video file.
Specifically, for example, in this embodiment, the constructed user behavior feature sequence is taken as input and fed into the sequence neural network of the current session-based recommendation model, and the output of the state layer of the sequence neural network is passed to a fully connected layer. For the current class of audience users with the same attribute data in the sequence neural network, the fully connected layer predicts the preference probability distribution data over soundtrack audio styles at the next moment of the current advertisement's test video, and finally outputs the preference probability distribution data.
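The state layer plus fully connected layer described above can be sketched as a minimal recurrent pass over the behavior feature sequence followed by a softmax over score styles. The style set, dimensions, and random weights are illustrative assumptions, not the trained session-based model:

```python
import numpy as np

rng = np.random.default_rng(7)
STYLES = ["pop", "rock", "ambient", "electronic"]   # illustrative style set
EVT_DIM, HID = 4, 16

Wx = rng.normal(0, 0.1, (EVT_DIM, HID))
Wh = rng.normal(0, 0.1, (HID, HID))
Wfc = rng.normal(0, 0.1, (HID, len(STYLES)))        # fully connected head

def preference_distribution(behavior_seq):
    """Run the cohort's behavior feature sequence through a recurrent state
    layer, then a fully connected layer, to get next-moment style preferences."""
    h = np.zeros(HID)
    for evt in behavior_seq:                         # one feature vector per event
        h = np.tanh(evt @ Wx + h @ Wh)
    logits = h @ Wfc
    e = np.exp(logits - logits.max())
    return e / e.sum()                               # preference probability distribution

prefs = preference_distribution(rng.normal(size=(5, EVT_DIM)))
```

The final state summarizes the cohort's recent behavior, and the softmax output is the preference probability distribution data used in step S304.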
Step S304: guiding, with the preference probability distribution data, the preset scoring model that generates the soundtrack audio file, so as to correct the soundtrack audio file in the test video file and generate a stand-by video file.
According to the preference probability distribution prediction result output by the fully connected layer of the sequence neural network of the current session-based recommendation model, guided optimization is performed on the current preset scoring model based on the sequence neural network, so that the preset scoring model corrects the soundtrack audio file of the currently playing test video file, finally generating the stand-by video file of the test video file.
Specifically, for example, in this embodiment, during playback of the current advertisement video, when the drum sequence neural network in the automatic scoring model based on the LSTM sequence neural network selects, according to the note sequence, the drumbeat combination pattern for the score of the current advertisement video, the drumbeat combination prediction result of the current bar predicted by the drum sequence neural network is weighted with the preference probability distribution prediction result output by the fully connected layer of the sequence neural network of the session-based recommendation model, so as to select the drumbeat combination that better matches the preferences of the audience users watching the current advertisement video. Finally, according to this drumbeat combination that better matches the preferences of the audience users watching the current advertisement video, a soundtrack audio file of the current advertisement video that better matches the audience users is generated, which is then combined with the initial advertisement video file to form the final stand-by video file of the advertisement.
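The weighting of the two prediction results described above can be sketched as a convex combination over a shared set of drumbeat patterns; the mixing weight `alpha` is an assumed hyperparameter not specified by the patent:

```python
import numpy as np

def blend_drum_choice(drum_probs, pref_probs, alpha=0.7):
    """Weight the drum network's per-pattern probabilities with the audience's
    preference distribution over the same patterns, then pick the argmax.
    alpha is an assumed mixing weight, not a value from the patent."""
    drum_probs = np.asarray(drum_probs, dtype=float)
    pref_probs = np.asarray(pref_probs, dtype=float)
    blended = alpha * drum_probs + (1 - alpha) * pref_probs
    return int(np.argmax(blended)), blended / blended.sum()

choice, blended = blend_drum_choice([0.5, 0.3, 0.2], [0.1, 0.2, 0.7])
```

With a large `alpha` the drum network dominates; lowering `alpha` lets audience preference override the network's first choice.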
In the present invention, the release platform of the test video file is detected, and the user portrait models and evaluation parameters of the viewers of the test video file are obtained from the release platform; the evaluation parameters with which each user of the same user portrait model watched the test video file within a predetermined period are read, and a user behavior feature sequence is constructed according to the evaluation parameters; the users' preference probability distribution data for the soundtrack audio file when watching the test video file is calculated according to the user behavior feature sequence; and the preset scoring model is guided with the preference probability distribution data so as to correct the soundtrack audio file in the test video file and generate a stand-by video file. Thus, the score is optimized and adjusted based on the feedback of the advertisement video's audience, meeting the users' preferences for the score content and further improving the users' viewing experience of the advertisement video.
In addition, an embodiment of the present invention also proposes a scoring system for a video file, the scoring system for a video file including:
a soundtrack audio generation module, for extracting the video features of the initial video file from an initial video file to be scored, and generating the soundtrack audio file of the initial video file in conjunction with the video features;
a test video generation module, for generating a test video file based on the initial video file and the soundtrack audio file;
a soundtrack audio correction module, for correcting the soundtrack audio file in the test video file according to the user portrait models and evaluation parameters of the viewers of the test video file, and generating a stand-by video file.
Preferably, the scoring system for a video file further includes:
a learning training module, for adding preset training samples to perform learning training on the preset scoring model that generates the soundtrack audio file, the preset training samples including audio-video data and pure audio data.
When the modules of the scoring system for a video file proposed in this embodiment run, the steps of the background-music scoring method for a video file as described above are realized, and details are not repeated here.
In addition, an embodiment of the present invention also proposes a storage medium applied to a computer, i.e., the storage medium is a computer-readable storage medium on which a scoring program for a video file is stored; when the scoring program for a video file is executed by a processor, the steps of the background-music scoring method for a video file as described above are realized.
For the methods realized when the scoring program for a video file running on the processor is executed, reference may be made to the embodiments of the background-music scoring method for a video file of the present invention, and details are not repeated here.
It should be noted that, in this document, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device including a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. In the absence of further restrictions, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device including that element.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the advantages or disadvantages of the embodiments.
Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be realized by means of software plus a necessary general hardware platform, and of course also by hardware, though in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, can be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disc), including several instructions to cause a terminal device (which may be a mobile phone, computer, server, air conditioner, network device, etc.) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and are not intended to limit the scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.
Claims (12)
1. A background-music scoring method for a video file, characterized in that the background-music scoring method for a video file comprises the following steps:
extracting video features of the initial video file from an initial video file to be scored, and generating a soundtrack audio file of the initial video file in conjunction with the video features;
generating a test video file based on the initial video file and the soundtrack audio file;
correcting the soundtrack audio file in the test video file according to user portrait models and evaluation parameters of viewers of the test video file, and generating a stand-by video file.
2. the method for dubbing in background music of video file as described in claim 1, which is characterized in that the video features include: that light stream is strong
Feature, chroma histogram feature, shot boundary characteristic are spent,
Described the step of extracting every video features of the video file from initial video file to be dubbed in background music includes:
Extract the coloration histogram of each video image corresponding each light stream figure and the video image in the initial video file
Figure;
Using the average light intensity of flow of each light stream figure as the light stream strength characteristic of the initial video file;
Chroma histogram feature after the chroma histogram is normalized, as the initial video file;
The boundary shot for detecting the video image, by the shot boundary characteristic of initial video file described in the boundary shot.
3. the method for dubbing in background music of video file as described in claim 1, which is characterized in that the video features further include: video
Emotion score feature,
Described the step of extracting every video features of the video file from initial video file to be dubbed in background music further include:
The video content for reading the initial video file detects and counts the emotion for identifying video feeling in the video content
Data;
The affection data is input to default sentiment analysis model, so that the default sentiment analysis model is to the emotion number
According to being predicted to obtain the emotion score of the video content;
Using the emotion score as the video feeling score feature of the initial video file.
4. the method for dubbing in background music of video file as described in any one of claims 1 to 3, which is characterized in that in conjunction with every view
Frequency feature generates the step of soundtrack audio file of the initial video file and includes:
Every video features are input to default model of dubbing in background music, the default trained sample that the preset configuration model passes through addition
This progress learning training, the default training sample includes: audio, video data and pure audio data;
In the default model of dubbing in background music, the soundtrack audio text of the initial video file is generated in conjunction with every video features
Part.
5. the method for dubbing in background music of video file as claimed in claim 4, which is characterized in that described by every video features
It is input to before presetting the step of dubbing in background music model, the method also includes:
The lookback feature of the initial video file is detected, and the lookback feature is input to described preset and is dubbed in background music
Model.
6. the method for dubbing in background music of video file as claimed in claim 4, which is characterized in that the default model of dubbing in background music is based on sequence
Column neural network generates the model of dubbing in background music of audio file,
In the default model of dubbing in background music, the soundtrack audio text of the initial video file is generated in conjunction with every video features
The step of part includes:
According to every video features of the initial video file and the lookback feature, sequence of notes is generated;
The sequence of notes is inputted into note duration sequence neural network, so that the note duration neural network is according to the sound
It accords with sequence and the lookback feature exports note duration sequence;
The sequence of notes is inputted into drum sequence neural network, so that the drum sequence neural network is according to the note sequence
Column output drumbeat combination;
It is combined according to the sequence of notes, note duration sequence and the drumbeat, generate the initial video file matches musical sound
Frequency file.
7. the method for dubbing in background music of video file as described in claim 1, which is characterized in that based on the initial video file and match
Musical sound frequency file, generate test video file the step of include:
Read the play time sequence of the initial video file and the soundtrack audio file;
It is test video text by the initial video file and the soundtrack audio file synthesis based on the play time sequence
Part.
8. the method for dubbing in background music of video file as described in claim 1, which is characterized in that described according to the test video file
The user's portrait model and evaluation parameter for watching object, are modified soundtrack audio file in the test video file, raw
Include: at the step of stand-by video file
The release platform for detecting the test video file obtains the test video file from the release platform and watches pair
The user's portrait model and evaluation parameter of elephant;
Each user for reading same subscriber portrait model watches the evaluation parameter of the test video file in predetermined period, and
User behavior characteristics sequence is constructed according to the evaluation parameter;
When watching the test video file according to user behavior characteristics sequence calculating user, to the soundtrack audio file
Preference probability distribution data;
The default model of dubbing in background music that the soundtrack audio file is generated with preference probability distribution data guidance, to the test
Soundtrack audio file is modified in video file, generates stand-by video file.
9. A system for dubbing music to a video file, wherein the system generates the soundtrack audio of a video file based on a sequential neural network, the system comprising:
a soundtrack audio generation module, configured to extract video features of an initial video file to be dubbed, and to generate a soundtrack audio file for the initial video file by combining the video features;
a test video generation module, configured to generate a test video file based on the initial video file and the soundtrack audio file; and
a soundtrack audio correction module, configured to modify the soundtrack audio file in the test video file according to the user profile models and evaluation parameters of the viewers of the test video file, and to generate a ready-to-use video file.
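A minimal, hypothetical sketch of the three modules of claim 9 wired as a pipeline. The class names, data shapes, and placeholder method bodies are inventions standing in for the neural models the claim describes.

```python
# Hypothetical sketch of the claimed module structure as a three-stage pipeline.

class SoundtrackGenerator:
    """Stands in for the feature-extraction + soundtrack-generation module."""
    def generate(self, video):
        features = {"scene": video["scene"], "tempo": len(video["frames"])}
        return {"audio_for": video["name"], "features": features}

class TestVideoBuilder:
    """Stands in for the test-video generation module."""
    def build(self, video, audio):
        return {"video": video["name"], "audio": audio["audio_for"]}

class SoundtrackCorrector:
    """Stands in for the viewer-feedback correction module."""
    def correct(self, test_video, feedback):
        # Revise the soundtrack only when viewer feedback is poor.
        test_video["revised"] = feedback["score"] < 0.5
        return test_video

video = {"name": "clip.mp4", "scene": "city", "frames": [0, 1, 2]}
audio = SoundtrackGenerator().generate(video)
final = SoundtrackCorrector().correct(TestVideoBuilder().build(video, audio),
                                      {"score": 0.8})
```

The point of the sketch is the data flow between the three modules, not any particular model.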
10. The system for dubbing music to a video file according to claim 9, further comprising:
a learning and training module, configured to add preset training samples to train the preset dubbing model that generates the soundtrack audio file, the preset training samples comprising audio-video data and pure audio data.
11. A device for dubbing music to a video file, comprising a memory, a processor, and a music dubbing program for a video file that is stored on the memory and executable on the processor, wherein the program, when executed by the processor, implements the steps of the method for dubbing music to a video file according to any one of claims 1 to 8.
12. A storage medium applied to a computer, wherein a music dubbing program for a video file is stored on the storage medium, and the program, when executed by a processor, implements the steps of the method for dubbing music to a video file according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910216297.8A CN109862393B (en) | 2019-03-20 | 2019-03-20 | Method, system, equipment and storage medium for dubbing music of video file |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109862393A true CN109862393A (en) | 2019-06-07 |
CN109862393B CN109862393B (en) | 2022-06-14 |
Family
ID=66901380
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910216297.8A Active CN109862393B (en) | 2019-03-20 | 2019-03-20 | Method, system, equipment and storage medium for dubbing music of video file |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109862393B (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040100487A1 (en) * | 2002-11-25 | 2004-05-27 | Yasuhiro Mori | Short film generation/reproduction apparatus and method thereof |
US20060122842A1 (en) * | 2004-12-03 | 2006-06-08 | Magix Ag | System and method of automatically creating an emotional controlled soundtrack |
CN102403011A (en) * | 2010-09-14 | 2012-04-04 | 北京中星微电子有限公司 | Music output method and device |
US8737817B1 (en) * | 2011-02-08 | 2014-05-27 | Google Inc. | Music soundtrack recommendation engine for videos |
CN104182413A (en) * | 2013-05-24 | 2014-12-03 | 福建星网视易信息系统有限公司 | Method and system for recommending multimedia content |
US20140376888A1 (en) * | 2008-10-10 | 2014-12-25 | Sony Corporation | Information processing apparatus, program and information processing method |
CN105261374A (en) * | 2015-09-23 | 2016-01-20 | 海信集团有限公司 | Cross-media emotion correlation method and system |
CN107170432A (en) * | 2017-03-31 | 2017-09-15 | 珠海市魅族科技有限公司 | A kind of music generating method and device |
CN108712574A (en) * | 2018-05-31 | 2018-10-26 | 维沃移动通信有限公司 | A kind of method and device playing music based on image |
CN109063163A (en) * | 2018-08-14 | 2018-12-21 | 腾讯科技(深圳)有限公司 | A kind of method, apparatus, terminal device and medium that music is recommended |
CN109492128A (en) * | 2018-10-30 | 2019-03-19 | 北京字节跳动网络技术有限公司 | Method and apparatus for generating model |
Non-Patent Citations (2)
Title |
---|
FANG-FEI KUO et al.: "Background music recommendation for video based on multimodal latent semantic analysis", 2013 IEEE International Conference on Multimedia and Expo (ICME) * |
QIE ZIHAN et al.: "An artificial neural network model for matching background music to video", Computer Knowledge and Technology * |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112231499A (en) * | 2019-07-15 | 2021-01-15 | 李姿慧 | Intelligent video music distribution system |
CN110446057A (en) * | 2019-08-30 | 2019-11-12 | 北京字节跳动网络技术有限公司 | Providing method, device, equipment and the readable medium of auxiliary data is broadcast live |
CN110781835A (en) * | 2019-10-28 | 2020-02-11 | 中国传媒大学 | Data processing method and device, electronic equipment and storage medium |
CN110781835B (en) * | 2019-10-28 | 2022-08-23 | 中国传媒大学 | Data processing method and device, electronic equipment and storage medium |
CN110753238A (en) * | 2019-10-29 | 2020-02-04 | 北京字节跳动网络技术有限公司 | Video processing method, device, terminal and storage medium |
CN110933406B (en) * | 2019-12-10 | 2021-05-14 | 央视国际网络无锡有限公司 | Objective evaluation method for short video music matching quality |
CN110933406A (en) * | 2019-12-10 | 2020-03-27 | 央视国际网络无锡有限公司 | Objective evaluation method for short video music matching quality |
CN111737516A (en) * | 2019-12-23 | 2020-10-02 | 北京沃东天骏信息技术有限公司 | Interactive music generation method and device, intelligent sound box and storage medium |
CN111259192A (en) * | 2020-01-15 | 2020-06-09 | 腾讯科技(深圳)有限公司 | Audio recommendation method and device |
CN111259192B (en) * | 2020-01-15 | 2023-12-01 | 腾讯科技(深圳)有限公司 | Audio recommendation method and device |
CN111800650A (en) * | 2020-06-05 | 2020-10-20 | 腾讯科技(深圳)有限公司 | Video dubbing method and device, electronic equipment and computer readable medium |
WO2022005442A1 (en) * | 2020-07-03 | 2022-01-06 | Назар Юрьевич ПОНОЧЕВНЫЙ | System (embodiments) for harmoniously combining video files and audio files and corresponding method |
US20220366881A1 (en) * | 2021-05-13 | 2022-11-17 | Microsoft Technology Licensing, Llc | Artificial intelligence models for composing audio scores |
WO2022240525A1 (en) * | 2021-05-13 | 2022-11-17 | Microsoft Technology Licensing, Llc | Artificial intelligence models for composing audio scores |
CN113923517A (en) * | 2021-09-30 | 2022-01-11 | 北京搜狗科技发展有限公司 | Background music generation method and device and electronic equipment |
WO2023197749A1 (en) * | 2022-04-15 | 2023-10-19 | 腾讯科技(深圳)有限公司 | Background music insertion time point determining method and apparatus, device, and storage medium |
CN115174959A (en) * | 2022-06-21 | 2022-10-11 | 咪咕文化科技有限公司 | Video 3D sound effect setting method and device |
CN115174959B (en) * | 2022-06-21 | 2024-01-30 | 咪咕文化科技有限公司 | Video 3D sound effect setting method and device |
Also Published As
Publication number | Publication date |
---|---|
CN109862393B (en) | 2022-06-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109862393A (en) | Method of dubbing in background music, system, equipment and the storage medium of video file | |
CN107918653A (en) | A kind of intelligent playing method and device based on hobby feedback | |
US8963926B2 (en) | User customized animated video and method for making the same | |
US8442389B2 (en) | Electronic apparatus, reproduction control system, reproduction control method, and program therefor | |
CN110019961A (en) | Method for processing video frequency and device, for the device of video processing | |
CN109447234A (en) | A kind of model training method, synthesis are spoken the method and relevant apparatus of expression | |
CN107172485A (en) | A method and apparatus for generating short videos | |
US10789972B2 (en) | Apparatus for generating relations between feature amounts of audio and scene types and method therefor | |
CN108924599A (en) | Video caption display methods and device | |
CN109147800A (en) | Answer method and device | |
CN108292314A (en) | Information processing unit, information processing method and program | |
CN108241997A (en) | Advertisement broadcast method, device and computer readable storage medium | |
CN107895016A (en) | One kind plays multimedia method and apparatus | |
CN107872685A (en) | A kind of player method of multi-medium data, device and computer installation | |
US11756571B2 (en) | Apparatus that identifies a scene type and method for identifying a scene type | |
WO2019047850A1 (en) | Identifier displaying method and device, request responding method and device | |
CN114073854A (en) | Game method and system based on multimedia file | |
JP2019071009A (en) | Content display program, content display method, and content display device | |
CN113538628A (en) | Expression package generation method and device, electronic equipment and computer readable storage medium | |
CN108920585A (en) | The method and device of music recommendation, computer readable storage medium | |
CN109429077A (en) | Method for processing video frequency and device, for the device of video processing | |
CN114339076A (en) | Video shooting method and device, electronic equipment and storage medium | |
CN106331525A (en) | Realization method for interactive film | |
CN115866339A (en) | Television program recommendation method and device, intelligent device and readable storage medium | |
JP7466087B2 (en) | Estimation device, estimation method, and estimation system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||