CN110781835B - Data processing method and device, electronic equipment and storage medium - Google Patents
- Publication number: CN110781835B
- Application number: CN201911029239.0A
- Authority: CN (China)
- Prior art keywords: vector, key frame, feature vector, preset
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/485—End-user interface for client configuration
- H04N21/4852—End-user interface for client configuration for modifying audio parameters, e.g. switching between mono and stereo
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/81—Monomedia components thereof
- H04N21/8106—Monomedia components thereof involving special audio data, e.g. different tracks for different languages
- H04N21/8113—Monomedia components thereof involving special audio data, e.g. different tracks for different languages comprising music, e.g. song in MP3 format
Abstract
The application provides a data processing method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring attribute feature vectors of key frames of a target video; obtaining the feature vector of each key frame from its attribute feature vectors; inputting the feature vector of the key frame, together with the vector to be sorted representing the notes of the previous key frame, into a decoding model as input parameters, so as to obtain the vector to be sorted representing the notes of the current key frame; and obtaining the background music of the target video from all the obtained vectors to be sorted. With this method, background music for the target video can be obtained without manual participation, which helps reduce the manual workload.
Description
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method and apparatus, an electronic device, and a storage medium.
Background
With the development of technology, computer multimedia has become increasingly popular, and video production is now something ordinary people can do. People can shoot video production materials with tools such as digital video cameras, mobile phones and cameras, and then edit them into videos that record their study, work and daily life.
When producing a high-quality video, background music must be configured for the video after it is produced, so that the finished video reproduces its scenes well during playback. To configure the background music, the user has to search a large amount of music material for pieces that fit the video, so configuring background music for a video involves a heavy manual workload.
Disclosure of Invention
In view of the above, embodiments of the present application provide a data processing method, an apparatus, an electronic device, and a storage medium, so as to reduce the workload of configuring background music for a video.
In a first aspect, an embodiment of the present application provides a data processing method, including:
acquiring attribute feature vectors of key frames of a target video;
obtaining the feature vector of the key frame according to the attribute feature vector of the key frame;
inputting the feature vector of the key frame and the vector to be sorted representing the notes of the previous key frame into a decoding model as input parameters, so as to obtain the vector to be sorted representing the notes of the key frame;
and obtaining the background music of the target video according to all the obtained vectors to be sorted.
Optionally, the attribute feature vector of the key frame includes:
the dynamic feature vector of the key frame, the static feature vector of the key frame, and/or the optical flow feature vector of the key frame.
Optionally, the obtaining the feature vector of the key frame according to the attribute feature vector of the key frame includes:
and according to the attribute feature vector of the key frame, carrying out vector splicing processing through a full connection layer to obtain the feature vector of the key frame.
Optionally, the obtaining the background music of the target video according to all the obtained vectors to be sorted includes:
judging each vector to be sorted by using a preset target note vector set, so as to determine whether the vector to be sorted meets a preset requirement;
and sorting all vectors to be sorted that meet the preset requirement according to a preset note arrangement rule, so as to take the sorting result as the background music of the target video.
Optionally, the judging each vector to be sorted by using a preset target note vector set to determine whether the vector to be sorted meets a preset requirement includes:
performing a mean square error operation on the vector to be sorted and the preset target note vector set to obtain a loss function value of the vector to be sorted relative to the target note vectors;
when the loss function value is within a preset range, determining that the vector to be sorted meets the preset requirement;
and when the loss function value is not within the preset range, determining that the vector to be sorted does not meet the preset requirement.
Optionally, the method further comprises:
and training the decoding model by taking all vectors to be sorted that do not meet the preset requirement as training samples.
In a second aspect, an embodiment of the present application provides a data processing apparatus, including:
the acquiring unit is used for acquiring the attribute feature vector of the key frame of the target video;
the first processing unit is used for obtaining the feature vector of the key frame according to the attribute feature vector of the key frame;
the second processing unit is used for inputting the feature vector of the key frame and the vector to be sorted representing the notes of the previous key frame into the decoding model as input parameters, so as to obtain the vector to be sorted representing the notes of the key frame;
and the third processing unit is used for obtaining the background music of the target video according to all the obtained vectors to be sorted.
Optionally, the attribute feature vector of the key frame includes:
the dynamic feature vector of the key frame, the static feature vector of the key frame, and/or the optical flow feature vector of the key frame.
Optionally, when the first processing unit is configured to obtain the feature vector of the key frame according to the attribute feature vector of the key frame, it is specifically configured to:
and according to the attribute feature vector of the key frame, carrying out vector splicing processing through a full connection layer to obtain the feature vector of the key frame.
Optionally, when the third processing unit is configured to obtain the background music of the target video according to all the obtained vectors to be sorted, it is specifically configured to:
judging each vector to be sorted by using a preset target note vector set, so as to determine whether the vector to be sorted meets a preset requirement;
and sorting all vectors to be sorted that meet the preset requirement according to a preset note arrangement rule, so as to take the sorting result as the background music of the target video.
Optionally, when the third processing unit is configured to judge each vector to be sorted by using a preset target note vector set to determine whether the vector to be sorted meets a preset requirement, it is specifically configured to:
performing a mean square error operation on the vector to be sorted and the preset target note vector set to obtain a loss function value of the vector to be sorted relative to the target note vectors;
when the loss function value is within a preset range, determining that the vector to be sorted meets the preset requirement;
and when the loss function value is not within the preset range, determining that the vector to be sorted does not meet the preset requirement.
Optionally, the data processing apparatus further includes:
and the training unit is used for training the decoding model by taking all vectors to be sorted that do not meet the preset requirement as training samples.
In a third aspect, an embodiment of the present application provides an electronic device, including: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating via the bus when the electronic device is operating, the processor executing the machine-readable instructions to perform the steps of the data processing method according to any one of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the data processing method according to any one of the first aspect.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
In this application, when configuring background music for a target video, attribute feature vectors of the video's key frames are obtained first. Because each key frame carries different attribute feature vectors, the feature vector of each key frame can be derived from them. The feature vector of the key frame and the vector to be sorted representing the notes of the previous key frame are then input into a decoding model as input parameters. Since the feature vector of a key frame represents the content shown in that frame (for example, the people, actions and scenes it contains), the output vector can serve as a vector representing notes whose expressed content matches the key frame. The vector to be sorted for the previous key frame is also used as an input parameter so that the note vector obtained for the current key frame matches the note vector of the previous key frame, keeping the notes of adjacent key frames consistent. Because music is formed from many notes, the background music obtained from all the vectors to be sorted also matches the target video. In this way, background music for the target video is obtained without manual participation, which helps reduce the manual workload.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
To more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be considered as limiting its scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.
Fig. 1 is a schematic flowchart of a data processing method according to an embodiment of the present application;
fig. 2 is a schematic flowchart of another data processing method according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a data processing apparatus according to a second embodiment of the present application;
fig. 4 is a schematic structural diagram of a data processing apparatus according to a second embodiment of the present application;
fig. 5 is a schematic structural diagram of an electronic device according to a third embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described below completely with reference to the drawings. Obviously, the described embodiments are only a part of the embodiments of the present application, not all of them. The components of the embodiments, as generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. The following detailed description is therefore not intended to limit the scope of the claimed application, but is merely representative of selected embodiments. All other embodiments obtained by a person skilled in the art from the embodiments of the present application without creative effort shall fall within the protection scope of the present application.
Example one
Fig. 1 is a schematic flow chart of a data processing method according to an embodiment of the present application, as shown in fig. 1, the data processing method includes the following steps:
Step 101, acquiring attribute feature vectors of key frames of a target video.
Specifically, a video comprises a plurality of key frames, and each key frame has a plurality of attributes, such as objects, people, actions, scenes and person-object relationships. Together, these attributes form all the elements of a key frame. Once the attribute feature vectors of a key frame are obtained, quantized data describing the key frame are available, which provides data support for subsequent processing.
It should be noted that, the specific attribute feature vector may be set according to actual needs, and is not specifically limited herein.
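The quantization described above can be sketched as follows. This is a minimal illustration only: the multi-hot encoding and the attribute vocabulary are assumptions for the sake of example, since the patent deliberately leaves the specific attribute feature vectors open.

```python
# Hypothetical sketch: a key frame's attributes (objects, people, actions,
# scenes) quantized into an attribute feature vector. The vocabulary and the
# multi-hot encoding are illustrative assumptions, not the patent's method.

ATTRIBUTE_VOCAB = ["person", "dog", "running", "beach", "holding"]

def attribute_feature_vector(attributes):
    """Encode a key frame's attribute set as a multi-hot vector."""
    return [1.0 if a in attributes else 0.0 for a in ATTRIBUTE_VOCAB]

key_frames = [
    {"attributes": {"person", "running", "beach"}},
    {"attributes": {"person", "dog", "holding"}},
]
vectors = [attribute_feature_vector(f["attributes"]) for f in key_frames]
```

The resulting per-frame vectors are the "quantized data" that later steps build on.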
Step 102, obtaining the feature vector of the key frame according to the attribute feature vectors of the key frame.
Specifically, the attribute vectors of a key frame jointly represent the complete key frame, and configuring background music for the target video requires considering the video as a whole. The attribute feature vectors are therefore used to obtain the feature vector of each key frame; because these feature vectors can represent the content of the target video as a whole, they provide a reference basis, from a global perspective, for configuring the background music.
It should be noted that, the specific implementation manner of obtaining the feature vector of the key frame according to the attribute feature vector may be set according to actual needs, and is not specifically limited herein.
Step 103, inputting the feature vector of the key frame and the vector to be sorted representing the notes of the previous key frame into a decoding model as input parameters, so as to obtain the vector to be sorted representing the notes of the key frame.
Specifically, once the feature vector of a key frame is obtained, the target video can be analyzed as a whole. To extract content such as people, actions, scenes and person-object relationships from the whole video, the feature vector of the key frame is input into the decoding model, which outputs content vectors representing these elements of the target video. Because the notes in a piece of music are all related, and the notes generated for the current key frame must match the notes corresponding to the previous key frame, the vector to be sorted representing the notes of the previous key frame is also input into the decoding model as an input parameter. For the first key frame of the target video, the vector to be sorted is obtained by inputting a preset note vector together with the feature vector of the first key frame into the decoding model. The output vectors are closely related to the content expressed by the target video and to one another, so they are used as the vectors to be sorted that represent notes: the notes corresponding to these vectors are closely related to the content of the target video, and all the obtained notes are related to each other. Background music with a high degree of matching to the target video can therefore be configured from the notes corresponding to the vectors to be sorted.
It should be noted that, which decoding model is specifically used may be set according to actual needs, and is not limited in particular here.
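The autoregressive loop described above can be sketched as follows. The linear blend standing in for the decoder is purely an assumption for illustration; the patent does not fix a particular decoding model, only that each frame's note vector is produced from the frame's feature vector and the previous frame's note vector, seeded by a preset note vector.

```python
# Sketch of the decoding loop: each key frame's feature vector is decoded
# together with the previous frame's note vector; the first frame uses a
# preset note vector. The element-wise blend below is a stand-in for a real
# trained decoding model.

def decode(feature_vec, prev_note_vec):
    """Toy decoder: blend frame features with the previous note vector."""
    return [0.5 * f + 0.5 * n for f, n in zip(feature_vec, prev_note_vec)]

def generate_note_vectors(frame_features, preset_note_vec):
    """Run the decoder over all key frames in order."""
    note_vectors = []
    prev = preset_note_vec
    for feat in frame_features:
        prev = decode(feat, prev)  # current notes depend on the previous ones
        note_vectors.append(prev)
    return note_vectors
```

Feeding the previous output back in is what keeps the notes of adjacent key frames related.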
Step 104, obtaining the background music of the target video according to all the obtained vectors to be sorted.
Specifically, since music is formed by combining many notes according to certain rules, once all the vectors to be sorted are obtained, the notes that make up the background music are available, and the background music of the target video can therefore be obtained from these vectors.
It should be noted that the specific method for obtaining the background music from all the vectors to be sorted can be set according to actual needs. For example, the corresponding notes may first be obtained from the vectors to be sorted and then combined according to a certain rule to form the background music; alternatively, the vectors to be sorted may be ordered first and the notes then arranged in the corresponding order, with the ordered notes used as the background music. The specific implementation is not limited here.
In the method, the obtained notes are matched with the key frames, so the background music obtained from the vectors to be sorted is also matched with the target video.
In one possible embodiment, the attribute feature vector of the key frame includes:
the dynamic feature vector of the key frame, the static feature vector of the key frame, and/or the optical flow feature vector of the key frame.
Specifically, the dynamic feature vector, the static feature vector and the optical flow feature vector of a key frame quantitatively describe the objects, people, actions, scenes and person-object relationships in the key frame, so once these vectors are obtained, the feature vector of the key frame can be derived from them.
It should be noted that, which attribute feature vector or attribute feature vectors are specifically used may be set according to actual needs, and is not specifically limited herein.
In a possible embodiment, in step 102, the feature vector of the key frame may be obtained by performing vector splicing processing on the attribute feature vectors of the key frame through a fully connected layer.
It should be noted that the specific fully connected layer used to splice the attribute feature vectors may be set according to actual needs and is not specifically limited herein.
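A hedged sketch of this "vector splicing through a fully connected layer": the dynamic, static and optical-flow vectors are concatenated and then projected through one linear layer. The plain-Python layer and the example weights are illustrative assumptions; a real model would learn the weights.

```python
# Concatenate the per-attribute vectors, then apply y = W x + b.
# Weights/bias here are toy values for illustration only.

def fully_connected(x, weights, bias):
    """One linear layer over a plain-Python weight matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def key_frame_feature(dynamic, static, flow, weights, bias):
    spliced = dynamic + static + flow  # vector splicing (concatenation)
    return fully_connected(spliced, weights, bias)
```

With one-dimensional inputs and a single output row that simply sums the spliced vector, the layer reduces to a weighted sum of the three attribute vectors.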
In a possible implementation, fig. 2 is a schematic flow chart of another data processing method provided in the first embodiment of the present application, and as shown in fig. 2, when step 104 is executed, the following steps may be implemented:
Step 201, judging each vector to be sorted by using a preset target note vector set, so as to determine whether the vector to be sorted meets a preset requirement.
Step 202, sorting all vectors to be sorted that meet the preset requirement according to a preset note arrangement rule, and taking the sorting result as the background music of the target video.
Specifically, the preset target note vectors correspond to notes that meet the requirements of the target video, so a vector to be sorted that meets the preset requirement satisfies the user's expectations. After all vectors to be sorted that meet the preset requirement are determined, they can be sorted according to the preset note arrangement rule, so that the notes corresponding to them are arranged in a certain order and form the background music of the target video.
It should be noted that, the specific preset requirement and the specific arrangement rule may be set according to actual needs, and are not specifically limited herein.
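The arrangement step above can be sketched as follows. The concrete rule used here — order by key-frame index, then map each qualifying vector to a note name — is an assumption chosen only to make the sketch concrete; the patent leaves the arrangement rule open.

```python
# Sketch: arrange qualifying note vectors into background music.
# Input: (frame_index, note_id) pairs for vectors that passed screening.
# The frame-order rule and the note-name table are illustrative assumptions.

NOTE_NAMES = {0: "C", 1: "D", 2: "E", 3: "F"}

def arrange(qualifying):
    """Sort by key-frame index, then map each vector to its note."""
    ordered = sorted(qualifying, key=lambda p: p[0])  # preset rule: frame order
    return [NOTE_NAMES[n] for _, n in ordered]
```

The returned note sequence plays the role of the "sorting result" used as the background music.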
In a possible embodiment, in step 201, a mean square error operation may be performed between each vector to be sorted and the preset target note vector set to obtain a loss function value of the vector to be sorted relative to the target note vectors. When the loss function value is within a preset range, the vector to be sorted is determined to meet the preset requirement; when it is not within the preset range, the vector to be sorted is determined not to meet the preset requirement.
It should be noted that the specific preset range may be set according to actual needs, and it may be a numerical interval or a single value. For example, the vector to be sorted may be determined to meet the preset requirement only when the loss function value is 0, and determined not to meet it otherwise. The specific preset range is not limited herein.
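The screening step can be sketched as follows: compute the mean squared error of a candidate note vector against each preset target note vector, take the lowest value as the loss, and accept the candidate when the loss falls inside the preset range. Taking the minimum over the set, and the threshold value, are assumptions for illustration.

```python
# Sketch of step 201: MSE of a candidate against a target note vector set.
# The min-over-set reduction and the 0.01 threshold are assumed examples.

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def meets_requirement(candidate, target_set, threshold=0.01):
    """True when the candidate's loss falls within the preset range."""
    loss = min(mse(candidate, t) for t in target_set)
    return loss <= threshold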
In a possible embodiment, all vectors to be ordered which do not meet the preset requirement are used as training samples to train the decoding model.
Specifically, when a vector to be sorted does not meet the preset requirement, it falls outside the preset target note vector set, which indicates that the accuracy of the decoding model's output needs to be improved. Using such vectors as training samples to train the decoding model therefore helps improve the accuracy of the results it outputs.
It should be noted that the specific model training mode may be set according to actual needs, and is not specifically limited herein.
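Collecting the rejected vectors for retraining can be sketched as follows. Only the collection step is shown: the patent states that failing vectors become training samples but leaves the training procedure itself open, so no update rule is assumed here.

```python
# Sketch: gather vectors to be sorted that fail the preset requirement,
# to be replayed as training samples for the decoding model. The MSE
# screening and the threshold mirror the assumed example above.

def collect_training_samples(candidates, target_set, threshold=0.01):
    """Return the rejected candidates for use as training samples."""
    def mse(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return [c for c in candidates
            if min(mse(c, t) for t in target_set) > threshold]
```

The returned list would then be fed to whatever training mode is chosen for the decoding model.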
Example two
Fig. 3 is a schematic structural diagram of a data processing apparatus according to a second embodiment of the present application, and as shown in fig. 3, the data processing apparatus includes:
an obtaining unit 31, configured to obtain an attribute feature vector of a key frame of a target video;
a first processing unit 32, configured to obtain a feature vector of the key frame according to the attribute feature vector of the key frame;
a second processing unit 33, configured to input the feature vector of the key frame and the to-be-sorted vector for representing the note of the previous key frame into the decoding model as input parameters, so as to obtain the to-be-sorted vector for representing the note of the key frame;
and the third processing unit 34 is configured to obtain background music of the target video according to all the obtained vectors to be sorted.
In one possible embodiment, the attribute feature vector of the key frame includes:
the dynamic feature vector of the key frame, the static feature vector of the key frame, and/or the optical flow feature vector of the key frame.
In a possible embodiment, when the first processing unit 32 is configured to obtain the feature vector of the key frame according to the attribute feature vector of the key frame, it is specifically configured to:
and according to the attribute feature vector of the key frame, carrying out vector splicing processing through a full connection layer to obtain the feature vector of the key frame.
In a possible embodiment, when the third processing unit 34 is configured to obtain the background music of the target video according to all the obtained vectors to be sorted, it is specifically configured to:
judging each vector to be sorted by using a preset target note vector set, so as to determine whether the vector to be sorted meets a preset requirement;
and sorting all vectors to be sorted that meet the preset requirement according to a preset note arrangement rule, so as to take the sorting result as the background music of the target video.
In a possible embodiment, when the third processing unit 34 is configured to judge each vector to be sorted by using a preset target note vector set to determine whether the vector to be sorted meets a preset requirement, it is specifically configured to:
performing a mean square error operation on the vector to be sorted and the preset target note vector set to obtain a loss function value of the vector to be sorted relative to the target note vectors;
when the loss function value is within a preset range, determining that the vector to be sorted meets the preset requirement;
and when the loss function value is not within the preset range, determining that the vector to be sorted does not meet the preset requirement.
In a possible implementation, fig. 4 is a schematic structural diagram of a data processing apparatus provided in example two of the present application, and as shown in fig. 4, the data processing apparatus further includes:
and the training unit 35 is configured to train the decoding model by taking all vectors to be sorted that do not meet the preset requirement as training samples.
For the principles of the second embodiment, reference may be made to the related descriptions of the first embodiment, which are not repeated herein.
Example three
Fig. 5 is a schematic structural diagram of an electronic device according to a third embodiment of the present application. The electronic device includes a processor 501, a storage medium 502, and a bus 503, wherein the storage medium 502 stores machine-readable instructions executable by the processor 501. When the electronic device runs the data processing method, the processor 501 and the storage medium 502 communicate with each other through the bus 503, and the processor 501 executes the machine-readable instructions to perform the following steps:
acquiring attribute feature vectors of key frames of a target video;
obtaining the feature vector of the key frame according to the attribute feature vector of the key frame;
inputting the feature vector of the key frame and the vector to be sequenced representing the notes of the previous key frame into a decoding model as input parameters, to obtain the vector to be sequenced representing the notes of the key frame;
and obtaining the background music of the target video according to all the obtained vectors to be sequenced.
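The decoding step above, in which each key frame's feature vector is combined with the previous note vector, can be sketched as a simple feedback loop. This is a minimal illustration with a toy linear "decoder" standing in for the trained decoding model; all dimensions and names are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
D_FEAT, D_NOTE = 256, 32  # assumed feature and note-vector sizes

# Toy "decoding model": one linear map from [feature ; previous note]
# to the next note vector. The real model would be a trained network.
W_dec = rng.standard_normal((D_NOTE, D_FEAT + D_NOTE)) * 0.01

def decode_note(feature, prev_note):
    """Produce the note vector for one key frame, conditioned on the
    previous key frame's note vector so adjacent notes match."""
    return np.tanh(W_dec @ np.concatenate([feature, prev_note]))

# Iterate over key-frame features, feeding each output note back in.
features = [rng.standard_normal(D_FEAT) for _ in range(5)]
prev = np.zeros(D_NOTE)  # no previous note before the first key frame
notes = []
for f in features:
    prev = decode_note(f, prev)
    notes.append(prev)
print(len(notes), notes[0].shape)  # 5 (32,)
```

The zero vector for the first key frame is an assumption; the application does not specify how the first "previous note" is initialized.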
In this embodiment of the application, the processor 501 may further execute other machine-readable instructions stored in the storage medium 502 to perform the other methods described in the first embodiment; for the specific method steps and principles, refer to the description of the first embodiment, which is not repeated here.
Example four
An embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, performs the following steps:
acquiring attribute feature vectors of key frames of a target video;
obtaining the feature vector of the key frame according to the attribute feature vector of the key frame;
inputting the feature vector of the key frame and the vector to be sequenced representing the notes of the previous key frame into a decoding model as input parameters, to obtain the vector to be sequenced representing the notes of the key frame;
and obtaining the background music of the target video according to all the obtained vectors to be sequenced.
In this embodiment of the present application, the computer program, when executed by a processor, may further perform the other methods described in the first embodiment; for the specific method steps and principles, refer to the description of the first embodiment, which is not repeated here.
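Among the first-embodiment methods referenced above is the acceptance check recited in claims 1-2: each candidate note vector is compared against a preset target note vector set by mean square error, kept only if the loss falls within a preset range, and the kept vectors are then sequenced by a preset note arrangement rule. A hedged numpy sketch, with a made-up threshold and a caller-supplied ordering rule standing in for the preset arrangement rule:

```python
import numpy as np

def mse_to_nearest(candidate, target_set):
    """Loss of a candidate note vector against its closest target vector."""
    return min(float(np.mean((candidate - t) ** 2)) for t in target_set)

def filter_and_order(candidates, target_set, threshold, order_key):
    """Keep candidates whose loss is within the threshold, then order
    them by an arrangement rule (here a simple sort key)."""
    kept = [c for c in candidates if mse_to_nearest(c, target_set) <= threshold]
    return sorted(kept, key=order_key)

targets = [np.zeros(4), np.ones(4)]
candidates = [np.full(4, 0.1), np.full(4, 0.9), np.full(4, 5.0)]
kept = filter_and_order(candidates, targets, threshold=0.05,
                        order_key=lambda v: float(v.sum()))
print(len(kept))  # 2 -- the outlier at 5.0 is rejected
```

Rejected candidates would, per claim 3, be collected as training samples to further train the decoding model.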
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments provided in the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Finally, it should be noted that the above embodiments are merely specific implementations of the present application, used to illustrate rather than limit its technical solutions, and the protection scope of the present application is not limited thereto. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that, within the technical scope disclosed herein, anyone familiar with the art may still modify the technical solutions described in the foregoing embodiments, readily conceive of changes, or make equivalent substitutions of some technical features. Such modifications, changes, or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments and are intended to be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (6)
1. A data processing method, comprising:
acquiring attribute feature vectors of key frames of a target video;
obtaining a feature vector of the key frame according to the attribute feature vector of the key frame, wherein the attribute feature vector of the key frame comprises a dynamic feature vector of the key frame, a static feature vector of the key frame and/or an optical flow feature vector of the key frame;
the obtaining the feature vector of the key frame according to the attribute feature vector of the key frame includes:
according to the attribute feature vector of the key frame, carrying out vector splicing processing through a full connection layer to obtain the feature vector of the key frame;
inputting the feature vector of the key frame and the vector to be sequenced representing the notes of the previous key frame into a decoding model as input parameters, to obtain the vector to be sequenced representing the notes of the key frame;
obtaining background music of the target video according to all the obtained vectors to be sequenced;
the obtaining the background music of the target video according to all the obtained vectors to be sequenced comprises:
evaluating the vector to be sequenced by using a preset target note vector set, so as to determine whether the vector to be sequenced meets a preset requirement;
and sequencing all vectors to be sequenced which meet the preset requirement according to a preset note arrangement rule so as to take a sequencing result as the background music of the target video.
2. The method as claimed in claim 1, wherein the evaluating the vector to be sequenced by using a preset target note vector set to determine whether the vector to be sequenced meets a preset requirement comprises:
performing a mean square error operation on the vector to be sequenced and the preset target note vector set to obtain a loss function value of the vector to be sequenced relative to the target note vector;
when the loss function value is within a preset range, determining that the vector to be sequenced meets the preset requirement;
and when the loss function value is not within the preset range, determining that the vector to be sequenced does not meet the preset requirement.
3. The method of claim 1, wherein the method further comprises:
and training the decoding model by using, as training samples, all vectors to be sequenced that do not meet the preset requirement.
4. A data processing apparatus, comprising:
the acquiring unit is used for acquiring attribute feature vectors of key frames of the target video;
the first processing unit is used for obtaining the feature vector of the key frame according to the attribute feature vector of the key frame, wherein the attribute feature vector of the key frame comprises a dynamic feature vector of the key frame, a static feature vector of the key frame and/or an optical flow feature vector of the key frame;
when the first processing unit is configured to obtain the feature vector of the key frame according to the attribute feature vector of the key frame, the configuration of the first processing unit includes:
according to the attribute feature vector of the key frame, carrying out vector splicing processing through a full connection layer to obtain the feature vector of the key frame;
the second processing unit is used for inputting the feature vector of the key frame and the vector to be sequenced representing the notes of the previous key frame into the decoding model as input parameters, to obtain the vector to be sequenced representing the notes of the key frame;
the third processing unit is used for obtaining background music of the target video according to all the obtained vectors to be sequenced;
when the third processing unit is configured to obtain the background music of the target video according to all the obtained vectors to be sequenced, the configuration of the third processing unit includes:
evaluating the vector to be sequenced by using a preset target note vector set to determine whether the vector to be sequenced meets the preset requirement;
and sequencing all vectors to be sequenced which meet the preset requirement according to a preset note arrangement rule so as to take a sequencing result as the background music of the target video.
5. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating via the bus when the electronic device is operating, the processor executing the machine-readable instructions to perform the steps of the data processing method according to any one of claims 1 to 3.
6. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, performs the steps of the data processing method according to any one of claims 1 to 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911029239.0A CN110781835B (en) | 2019-10-28 | 2019-10-28 | Data processing method and device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911029239.0A CN110781835B (en) | 2019-10-28 | 2019-10-28 | Data processing method and device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110781835A CN110781835A (en) | 2020-02-11 |
CN110781835B true CN110781835B (en) | 2022-08-23 |
Family
ID=69386876
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911029239.0A Active CN110781835B (en) | 2019-10-28 | 2019-10-28 | Data processing method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110781835B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112235517B (en) * | 2020-09-29 | 2023-09-12 | 北京小米松果电子有限公司 | Method for adding white-matter, device for adding white-matter, and storage medium |
CN113923517B (en) * | 2021-09-30 | 2024-05-07 | 北京搜狗科技发展有限公司 | Background music generation method and device and electronic equipment |
CN115052147B (en) * | 2022-04-26 | 2023-04-18 | 中国传媒大学 | Human body video compression method and system based on generative model |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109086416A (en) * | 2018-08-06 | 2018-12-25 | 中国传媒大学 | A kind of generation method of dubbing in background music, device and storage medium based on GAN |
CN109599079A (en) * | 2017-09-30 | 2019-04-09 | 腾讯科技(深圳)有限公司 | A kind of generation method and device of music |
CN109862393A (en) * | 2019-03-20 | 2019-06-07 | 深圳前海微众银行股份有限公司 | Method of dubbing in background music, system, equipment and the storage medium of video file |
KR20190116199A (en) * | 2018-10-29 | 2019-10-14 | 바이두 온라인 네트웍 테크놀러지 (베이징) 캄파니 리미티드 | Video data processing method, device and readable storage medium |
2019
- 2019-10-28 CN CN201911029239.0A patent/CN110781835B/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109599079A (en) * | 2017-09-30 | 2019-04-09 | 腾讯科技(深圳)有限公司 | A kind of generation method and device of music |
CN109086416A (en) * | 2018-08-06 | 2018-12-25 | 中国传媒大学 | A kind of generation method of dubbing in background music, device and storage medium based on GAN |
KR20190116199A (en) * | 2018-10-29 | 2019-10-14 | 바이두 온라인 네트웍 테크놀러지 (베이징) 캄파니 리미티드 | Video data processing method, device and readable storage medium |
CN109862393A (en) * | 2019-03-20 | 2019-06-07 | 深圳前海微众银行股份有限公司 | Method of dubbing in background music, system, equipment and the storage medium of video file |
Non-Patent Citations (1)
Title |
---|
Visual to Sound: Generating Natural Sound for Videos in the Wild;Yipin Zhou et al;《arXiv:1712.01393v2》;20180601;Abstract, Sections 1-6 * |
Also Published As
Publication number | Publication date |
---|---|
CN110781835A (en) | 2020-02-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR102416558B1 (en) | Video data processing method, device and readable storage medium | |
CN109688463B (en) | Clip video generation method and device, terminal equipment and storage medium | |
CN110781835B (en) | Data processing method and device, electronic equipment and storage medium | |
US20240107127A1 (en) | Video display method and apparatus, video processing method, apparatus, and system, device, and medium | |
CN111259192A (en) | Audio recommendation method and device | |
CN111460179A (en) | Multimedia information display method and device, computer readable medium and terminal equipment | |
JP2022538702A (en) | Voice packet recommendation method, device, electronic device and program | |
CN112584062B (en) | Background audio construction method and device | |
CN111723289B (en) | Information recommendation method and device | |
CN109815448B (en) | Slide generation method and device | |
CN111435369B (en) | Music recommendation method, device, terminal and storage medium | |
CN116132711A (en) | Method and device for generating video template and electronic equipment | |
CN117014693A (en) | Video processing method, device, equipment and storage medium | |
CN114065720A (en) | Conference summary generation method and device, storage medium and electronic equipment | |
CN112843681A (en) | Virtual scene control method and device, electronic equipment and storage medium | |
CN117939190A (en) | Method for generating video content and music content with soundtrack and electronic equipment | |
KR101804679B1 (en) | Apparatus and method of developing multimedia contents based on story | |
CN115115901A (en) | Method and device for acquiring cross-domain learning model | |
CN114840743A (en) | Model recommendation method and device, electronic equipment and readable storage medium | |
CN112449249A (en) | Video stream processing method and device, electronic equipment and storage medium | |
CN110489581A (en) | A kind of image processing method and equipment | |
CN115237248B (en) | Virtual object display method, device, equipment, storage medium and program product | |
CN115440198B (en) | Method, apparatus, computer device and storage medium for converting mixed audio signal | |
CN113992866B (en) | Video production method and device | |
CN118780974A (en) | Method and device for converting original picture into cartoon style picture |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |