CN107241646B - Multimedia video editing method and device - Google Patents


Info

Publication number
CN107241646B
Authority
CN
China
Prior art keywords
video
data
image
audio
target image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710566432.2A
Other languages
Chinese (zh)
Other versions
CN107241646A (en)
Inventor
邵可
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd filed Critical Beijing Qihoo Technology Co Ltd
Priority to CN201710566432.2A priority Critical patent/CN107241646B/en
Publication of CN107241646A publication Critical patent/CN107241646A/en
Application granted granted Critical
Publication of CN107241646B publication Critical patent/CN107241646B/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44012 Processing of video elementary streams involving rendering scenes according to scene graphs, e.g. MPEG-4 scene graphs

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The invention discloses a method and a device for editing multimedia video, relates to the field of multimedia technology, and mainly aims to solve the problem that a short video clipped from an existing live broadcast or short video cannot be edited. The main technical scheme comprises: acquiring a multimedia file; decoding the video data and audio data in the multimedia file; rendering the video data and performing audio-track processing on the audio data; and encoding the processed video data and audio data to obtain the multimedia video. The method is mainly used for editing multimedia video.

Description

Multimedia video editing method and device
Technical Field
The present invention relates to the field of multimedia technologies, and in particular, to a method and an apparatus for editing a multimedia video.
Background
With the rapid development of internet technology, people are no longer satisfied with communicating through mobile phone calls alone; social platforms built on multimedia technologies such as online live broadcast and short videos have become the main means of communication among users.
Currently, when a user performs a live broadcast or records a short video on a terminal device, a small segment of the video can be clipped out and stored. For example, if a girl is dancing on a live broadcast platform and the user wants to keep a recording of her spin, the short clip of the spin must be cut out of the live video. Once such a clip has been captured, editing it to enhance the playing effect of its content becomes an urgent problem to be solved.
Disclosure of Invention
In view of the above, the present invention provides a method and an apparatus for editing a multimedia video, and mainly aims to solve the problem that a short video captured from an existing live video or a small video cannot be edited.
According to an aspect of the present invention, there is provided a multimedia video editing method, including:
acquiring a multimedia file;
decoding video data and audio data in the multimedia file;
rendering the video data and performing audio track processing on the audio data;
and coding the processed video data and the processed audio data to obtain the multimedia video.
Further, the rendering the video data and the audio track processing the audio data comprise:
receiving a processing instruction input by a user, wherein the processing instruction carries an effect identifier;
rendering the video data according to the video effect identifier in the effect identifier, and processing the audio data according to the audio effect identifier in the effect identifier.
Further, the rendering the video data according to the video effect identifier in the effect identifiers comprises:
extracting image data of each frame in the video data, and carrying out filter processing on the image data;
and identifying a target image in the image data after filter processing according to the video effect identifier, and performing synthesis rendering on the target image.
Further, the identifying a target image in the image data after filter processing according to the video effect identifier, and performing composite rendering on the target image includes:
if the video effect identifier is recognized to be a composite three-dimensional image, segmenting the target image, and coloring and synthesizing the target image, the segmented target image and the rendering image according to a preset coloring rule, wherein the preset coloring rule is used for reflecting the position display relation among the target image, the segmented target image and the rendering image.
Further, the processing the audio data according to the audio effect identifier in the effect identifiers comprises:
collecting discrete audio track data in the audio data according to a preset time interval;
and effectively superposing the discrete audio track data and a preset audio track according to the audio effect identifier.
Further, the decoding video data and audio data in the multimedia file comprises:
and respectively decoding the video data and the audio data in the multimedia file according to the video track and the audio track.
Further, after the rendering processing of the video data and the audio track processing of the audio data, the method further includes:
and when a real-time preview request is received, displaying the video data and the audio data.
Further, the method further comprises:
and receiving a speed adjusting instruction, and adjusting the playing speed of video data and audio data in the multimedia video according to speed information carried in the speed adjusting instruction.
According to an aspect of the present invention, there is provided an editing apparatus for multimedia video, comprising:
the acquiring unit is used for acquiring the multimedia file;
a decoding unit for decoding video data and audio data in the multimedia file;
the processing unit is used for rendering the video data and carrying out audio track processing on the audio data;
and the coding unit is used for coding the processed video data and the processed audio data to obtain the multimedia video.
Further, the processing unit includes:
the receiving module is used for receiving a processing instruction input by a user, wherein the processing instruction carries an effect identifier;
and the processing module is used for rendering the video data according to the video effect identifier in the effect identifier and processing the audio data according to the audio effect identifier in the effect identifier.
Further, the processing module comprises:
the extraction submodule is used for extracting the image data of each frame in the video data and carrying out filter processing on the image data;
and the synthesis submodule is used for identifying a target image in the image data after the filter processing according to the video effect identifier and performing synthesis rendering on the target image.
The synthesis sub-module is specifically configured to, if it is identified that the video effect identifier is a synthesized stereo image, segment the target image, and perform color synthesis on the target image, the segmented target image, and the rendered image according to a preset coloring rule, where the preset coloring rule is used to reflect a position display relationship among the target image, the segmented target image, and the rendered image.
Further, the processing module further comprises:
the acquisition submodule is used for acquiring discrete audio track data in the audio data according to a preset time interval;
and the superposition submodule is used for effectively superposing the discrete audio track data and a preset audio track according to the audio effect identifier.
The decoding unit is specifically configured to decode video data and audio data in the multimedia file according to a video track and an audio track, respectively.
Further, the apparatus further comprises:
and the display unit is used for displaying the video data and the audio data when a real-time preview request is received.
Further, the apparatus further comprises:
and the adjusting unit is used for receiving the speed adjusting instruction and adjusting the playing speed of the video data and the audio data in the multimedia video according to the speed information carried in the speed adjusting instruction.
According to one aspect of the invention, there is provided a memory device having stored therein a plurality of instructions adapted to be loaded and executed by a processor to:
acquiring a multimedia file;
decoding video data and audio data in the multimedia file;
rendering the video data and performing audio track processing on the audio data;
and coding the processed video data and the processed audio data to obtain the multimedia video.
According to one aspect of the present invention, there is provided a mobile terminal comprising a processor adapted to implement various instructions; and a storage device adapted to store a plurality of instructions, the instructions adapted to be loaded and executed by the processor to:
acquiring a multimedia file;
decoding video data and audio data in the multimedia file;
rendering the video data and performing audio track processing on the audio data;
and coding the processed video data and the processed audio data to obtain the multimedia video.
By the technical scheme, the technical scheme provided by the embodiment of the invention at least has the following advantages:
the invention provides a method and a device for editing a multimedia video, which are characterized by firstly obtaining a multimedia file, then decoding video data and audio data in the multimedia file, then rendering the video data, carrying out audio track processing on the audio data, and finally coding the processed video data and the processed audio data to obtain the multimedia video. Compared with the existing method that the short video intercepted from the live video or the small video cannot be edited, the method and the device provided by the embodiment of the invention respectively process the video data and the audio data by decoding the video data and the audio data in the multimedia file, and realize the editing of the live video or the intercepted video after the video is coded into the multimedia video, thereby increasing the playing effect of the short video, enabling the video to be more vivid, enabling characters in the edited video to be more attached to the rendered image, and improving the use efficiency of the video.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
fig. 1 is a flowchart illustrating an editing method for a multimedia video according to an embodiment of the present invention;
fig. 2 is a flowchart illustrating another multimedia video editing method according to a second embodiment of the present invention;
fig. 3 shows a block diagram of an editing apparatus for multimedia video according to a third embodiment of the present invention;
fig. 4 shows a block diagram of another apparatus for editing multimedia video according to the fourth embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
An embodiment of the present invention provides a method for editing a multimedia video, as shown in fig. 1, the method includes:
101. and acquiring the multimedia file.
The multimedia file may be a video file in any of various formats, such as MP4, MKV, or 3GP; the embodiment of the present invention is not specifically limited in this respect. The multimedia file may be obtained by shooting with a camera of the terminal device, captured from an online live video, or extracted directly from the storage space of the terminal device; this, too, is not specifically limited.
It should be noted that, to make the multimedia video convenient to edit and to use on the terminal device, a maximum playing time should be set when the video is captured or recorded. The finally generated multimedia video is then a short video, which allows the present editing method to run on terminal devices with limited memory.
102. And decoding video data and audio data in the multimedia file.
Decoding may be realized by reading the video data and the audio data in the multimedia file separately, that is, by decoding the video stream and the audio stream back into raw video data and raw audio data.
It should be noted that, in the embodiment of the present invention, the decoding process may be completed by a single media decoder: the multimedia file is handed to the media decoder, and the video data and the audio data are obtained automatically.
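The single-decoder flow above can be sketched as follows. This is an illustrative stand-in, not the patent's implementation: a real terminal device would hand the file to a platform decoder such as Android's MediaCodec or FFmpeg, and the `Packet` type and `demux_and_decode` function here are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Packet:
    track: str       # "video" or "audio"
    payload: bytes   # stand-in for an encoded frame or audio chunk

def demux_and_decode(packets: List[Packet]) -> Tuple[List[bytes], List[bytes]]:
    """Split a multimedia stream into its video track and audio track
    (step 102). A real decoder would also decompress each payload."""
    video_frames = [p.payload for p in packets if p.track == "video"]
    audio_samples = [p.payload for p in packets if p.track == "audio"]
    return video_frames, audio_samples
```

The per-track split mirrors step 202 below, where the video track and audio track are decoded separately so each can be processed on its own.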
103. Rendering the video data and performing audio track processing on the audio data.
The rendering processing includes adding a bitmap in a video, adding a dynamic image, adjusting an image effect of the video image, and the like, and the audio track processing includes adding different audio tracks for combination, adding a sound effect, and the like.
Video data and audio data may be processed separately or in association with each other. For example, when adding a background to a video, only the video data needs background rendering and the audio track data is left untouched. By contrast, when spoken words are to be displayed in the video, the audio track data must be processed first: the speech in the audio track is recognized and converted into text, the corresponding glyph images from the character library are added to the video data, and the video data and audio track data are thereby processed together.
104. And coding the processed video data and the processed audio data to obtain the multimedia video.
And the coding is a coding for matching the rendered video data with the rendered audio data so as to obtain a multimedia video corresponding to the smooth audio and video.
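Steps 101 to 104 can be summarized as a small pipeline. This is a sketch under stated assumptions: `render` and `track_process` are placeholder per-frame and per-sample callbacks, and "encoding" is reduced to pairing the two processed streams by index so that audio and video stay matched, as the paragraph above requires.

```python
def edit_multimedia(decoded_video, decoded_audio, render, track_process):
    """Process both streams (step 103), then 're-encode' them (step 104)
    by pairing video frames with audio samples so playback stays in sync."""
    processed_video = [render(frame) for frame in decoded_video]
    processed_audio = [track_process(sample) for sample in decoded_audio]
    return list(zip(processed_video, processed_audio))
```

A real encoder would of course produce a compressed container with shared timestamps rather than a list of pairs.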
The invention provides a method for editing a multimedia video. Compared with the existing situation, in which a short video clipped from a live broadcast or a short video cannot be edited, the method decodes the video data and audio data in the multimedia file, processes each stream separately, and encodes the result back into a multimedia video, thereby making live or clipped videos editable.
An embodiment of the present invention provides another method for editing a multimedia video, as shown in fig. 2, the method includes:
201. and acquiring the multimedia file.
This step is the same as step 101 shown in fig. 1, and is not described herein again.
It should be noted that the method for editing a multimedia video according to the embodiment of the present invention may be embedded in other live-broadcast or video-recording applications, with the editing performed through a called interface; it may also be written as a standalone application that obtains the multimedia file by invoking the camera directly. The embodiment of the present invention is not limited in this respect.
202. And respectively decoding the video data and the audio data in the multimedia file according to the video track and the audio track.
The video track carries the content of video playback and the audio track carries the content of audio playback. To decode the video data and audio track data in the multimedia file so that each can be processed separately, decoding must be performed per track.
203. And receiving a processing instruction input by a user.
Wherein, the processing instruction carries the effect identifier. The processing instruction is used to instruct the system to perform specific video editing on a video, and the effect identifier is information identifying that different video and audio effects can be achieved, such as rendering an image, converting a voice and a text, adding a background, and the like.
It should be noted that, if the rendered image is an image input by a user, the image may be transmitted through a processing instruction. In addition, the bitmap of the image to be rendered and the added background may be a preset image or an image input by a user, which is not specifically limited in the embodiment of the present invention.
204. Rendering the video data according to the video effect identifier in the effect identifier, and processing the audio data according to the audio effect identifier in the effect identifier.
The video effect identifier is an identifier for processing a video effect in video data, the audio effect identifier is an identifier for processing an audio effect in audio data, and in order to further add different images corresponding to different effects, the video data or the audio data needs to be processed according to the video or audio effect identifier.
The video and the audio are respectively subjected to rendering processing and sound effect processing according to the video effect identifier and the audio effect identifier in the effect identifier, so that the image and the sound are respectively edited, and the performance of video editing is optimized.
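The split between video effect identifiers and audio effect identifiers described above amounts to a dispatch table. A minimal sketch follows; the handler registries, identifier names, and `apply_effects` function are all hypothetical illustrations, not the patent's interface.

```python
def apply_effects(effect_ids, video_frames, audio_samples,
                  video_effects, audio_effects):
    """Route each identifier carried by the processing instruction to its
    video or audio handler (step 204); unknown identifiers are skipped,
    mirroring the 'not specifically limited' wording of the text."""
    for eid in effect_ids:
        if eid in video_effects:
            video_frames = [video_effects[eid](f) for f in video_frames]
        elif eid in audio_effects:
            audio_samples = [audio_effects[eid](s) for s in audio_samples]
    return video_frames, audio_samples
```

Registering new effects then only means adding an entry to the relevant dictionary, which is one way to keep image editing and sound editing independent.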
For the embodiment of the present invention, the step of rendering the video data according to the video effect identifier in the effect identifier may specifically include: extracting image data of each frame in the video data, and carrying out filter processing on the image data; and identifying a target image in the image data after filter processing according to the video effect identifier, and performing synthesis rendering on the target image.
Since the decoded video data consists of frame-by-frame image information, adding an image to the video means adding it to the image information of every frame, and filter processing is applied before that image information is processed, so that the desired filtered video effect is obtained. The target image is the object to which a bitmap is to be added, or the object to be rendered; the embodiment of the present invention is not specifically limited. For example, when the video effect identifier is "add background image", the target image is a person or animal, and when it is the word-bubble (speech-to-text) special effect, the target image is a face or a mouth.
It should be noted that, in the composite rendering, a bitmap to be added is added to each frame of image, and positions of the added bitmaps in each frame are different, so that the rendered image in the video playing is dynamic.
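The filter-then-overlay flow, with the bitmap placed at a different position in each frame, can be sketched as below. This is a toy model under stated assumptions: frames and the bitmap are represented as sparse `{(x, y): pixel}` dictionaries, and `composite_dynamic` is a hypothetical name, not the patent's routine.

```python
def composite_dynamic(frames, bitmap, positions, filter_fn):
    """Filter every frame, then paste `bitmap` at a per-frame offset so
    the rendered overlay moves as the video plays (the 'dynamic' effect).
    The overlay simply overwrites the filtered background pixels."""
    out = []
    for frame, (ox, oy) in zip(frames, positions):
        filtered = {pos: filter_fn(px) for pos, px in frame.items()}
        for (bx, by), px in bitmap.items():
            filtered[(ox + bx, oy + by)] = px
        out.append(filtered)
    return out
```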
For the embodiment of the present invention, the step of identifying the target image in the image data after filter processing according to the video effect identifier, and performing composite rendering on the target image may specifically include: if the video effect identifier is recognized to be a composite three-dimensional image, segmenting the target image, and coloring and synthesizing the target image, the segmented target image and the rendering image according to a preset coloring rule, wherein the preset coloring rule is used for reflecting the position display relation among the target image, the segmented target image and the rendering image.
A composite stereoscopic image displays the bitmap as a layered, virtual-reality-style stereoscopic dynamic image by exploiting parallax; the stereoscopic effect depends on whether, where, and how much of the added bitmap is displayed in each frame.
It should be noted that if the video effect identifier is "composite stereoscopic image", the specific steps are to segment the target image and then color and composite the target image, the segmented target image, and the rendered image according to a preset coloring rule. In general the target image of a composite stereoscopic image is a person, so the image information of each frame must be segmented to separate the person from the background. The rendered image is the bitmap to be added, and the preset coloring rule is the policy deciding whether the rendered image is shown or hidden where it covers the target image. The specific policy depends on the bitmap and on the position of the person, and the embodiment of the present invention is not specifically limited.
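The coloring rule is, per pixel, a stacking-order decision among the three layers. The sketch below is one possible reading of the rule, not the patent's definition: `shade_pixel` and its flag are hypothetical, with the segmented person and the overlay encoded as `None` where they are absent.

```python
def shade_pixel(background, person, overlay, overlay_behind_person):
    """Per-pixel preset coloring rule: choose which layer is displayed.
    `person` is the segmented target image (None outside the person);
    `overlay` is the rendered bitmap (None where it is absent); the flag
    says whether the overlay sits behind the person, which is what makes
    the composite look layered/stereoscopic."""
    if overlay is not None and not overlay_behind_person:
        return overlay        # overlay in front: always shown
    if person is not None:
        return person         # person occludes a behind-layer overlay
    if overlay is not None:
        return overlay        # behind-layer overlay over plain background
    return background
```

Applying this rule at every pixel of every frame yields the colored, composited output described above.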
For the embodiment of the present invention, the step of processing the audio data according to the audio effect identifier in the effect identifier may specifically include: collecting discrete audio track data in the audio data according to a preset time interval; and effectively superposing the discrete audio track data and a preset audio track according to the audio effect identifier.
To superpose different audio tracks properly, rather than simply adding their volumes, the audio data must be discretized: discrete audio track data is collected at a preset time interval, which may be 1 second, 0.05 second, and so on; the embodiment of the present invention is not specifically limited. Effective superposition may combine the discrete audio track data of multiple preset audio tracks; besides the discrete data taken from the audio in the multimedia file, the other preset tracks may be stored in the cache or hard disk of the terminal device. For example, if the collected discrete audio track data is a child reading poetry aloud, and the preset track to be superposed is the background music "Moonlight over the Lotus Pond", the two sounds are superposed.
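One simple reading of "effective superposition" is a clamped sample-wise sum rather than a plain volume addition. The sketch below assumes 16-bit PCM samples, which the patent does not specify; `superpose_tracks` is a hypothetical helper.

```python
def superpose_tracks(track_a, track_b, limit=32767):
    """Sum two discrete audio tracks sample by sample, clamping to the
    16-bit PCM range (an assumed format) so the mix does not wrap —
    'effective' superposition rather than plain volume addition."""
    n = max(len(track_a), len(track_b))
    a = track_a + [0] * (n - len(track_a))   # pad the shorter track
    b = track_b + [0] * (n - len(track_b))
    return [max(-limit - 1, min(limit, x + y)) for x, y in zip(a, b)]
```

Padding the shorter track with silence lets a short voice clip be mixed over longer background music.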
It should be noted that an open-source audio processing tool, such as FFmpeg, may be used for the audio processing.
205. And when a real-time preview request is received, displaying the video data and the audio data.
A real-time preview request is a request input by the user to preview the state of the currently processed video or audio. It instructs the system to simulate playback of the processed video image, which can be browsed frame by frame or played as a video, and likewise to simulate playback of the processed audio; in general the request also covers displaying the unprocessed original image and original audio. When exactly the real-time preview request is received is not specifically limited by the embodiment of the present invention.
206. And coding the processed video data and the processed audio data to obtain the multimedia video.
This step is the same as step 104 shown in fig. 1, and will not be described herein again.
207. And receiving a speed adjusting instruction, and adjusting the playing speed of video data and audio data in the multimedia video according to speed information carried in the speed adjusting instruction.
The speed adjustment instruction carries speed information, which may be speed-up or speed-down, and specific data may be carried in the speed information, which is not specifically limited in the embodiment of the present invention.
It should be noted that, the specific method for adjusting the speed may be to adjust the number of frames of the image within 1 second for the video data to achieve the adjustment of the speed of playing the video, and to adjust the speed of playing the audio track within the preset time for the audio data to achieve the adjustment of the speed of playing the audio.
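The frame-count adjustment described above can be sketched as resampling which source frame is shown at each output tick. This is an illustrative nearest-frame scheme, not necessarily the patent's method, and `adjust_speed` is a hypothetical name.

```python
def adjust_speed(frames, factor):
    """Change playback speed by resampling the frame sequence:
    factor > 1 plays faster (frames are dropped), factor < 1 plays
    slower (frames are repeated), while the display rate stays fixed."""
    out_count = max(1, int(len(frames) / factor))
    return [frames[min(len(frames) - 1, int(i * factor))]
            for i in range(out_count)]
```

The audio track would need the analogous resampling (or a pitch-preserving time stretch) over the same interval so that sound and picture remain aligned.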
For the embodiment of the present invention, a specific application scenario may be as follows, although the scenarios are not limited to it: a multimedia file of a boy reading aloud online is clipped; the video data and audio data of the reading are decoded according to the video track and the audio track; and if the effect identifier input by the user is the speech-to-text (word-bubble) conversion identifier, the video data and the audio data are processed separately.
The embodiment of the invention decodes the video data and audio data in a multimedia file, renders the video data according to the video effect identifier, effectively superposes the audio data according to the audio effect identifier, and encodes the result into a multimedia video. This makes live or clipped videos editable, enhances the playing effect of short videos, makes the videos more vivid, improves the display of video content, lets characters in the edited video fit the rendered images more closely, allows short videos to be designed and recorded for different needs, widens the application of short videos, and improves the usability of the video.
Further, as an implementation of the method shown in fig. 1, an embodiment of the present invention provides an apparatus for editing a multimedia video, as shown in fig. 3, the apparatus includes: an acquisition unit 31, a decoding unit 32, a processing unit 33, and an encoding unit 34.
An acquisition unit 31 for acquiring a multimedia file; the acquiring unit 31 executes a function module for acquiring a multimedia file for an editing apparatus of a multimedia video.
A decoding unit 32 for decoding video data and audio data in the multimedia file; the decoding unit 32 is a functional module for an editing device of multimedia video to execute decoding of video data and audio data in the multimedia file.
A processing unit 33, configured to perform rendering processing on the video data and perform audio track processing on the audio data; the processing unit 33 is a functional module for performing rendering processing on the video data and performing audio track processing on the audio data for an editing apparatus of a multimedia video.
And an encoding unit 34, configured to encode the processed video data and the processed audio data to obtain a multimedia video. The encoding unit 34 is a functional module for performing encoding on the processed video data and the processed audio data by the editing apparatus of the multimedia video to obtain the multimedia video.
The invention provides a device for editing multimedia video. Compared with the existing situation, in which a short video clipped from a live broadcast or a short video cannot be edited, the embodiment of the invention decodes the video data and audio data in a multimedia file, processes each stream separately, and encodes the result into a multimedia video, thereby making live or clipped videos editable, enhancing the playing effect of short videos, making the videos more vivid, letting characters in the edited video fit the rendered images more closely, and improving the usability of the video.
Further, as an implementation of the method shown in fig. 2, an embodiment of the present invention provides another apparatus for editing a multimedia video, as shown in fig. 4, the apparatus includes: an acquisition unit 41, a decoding unit 42, a processing unit 43, an encoding unit 44, a presentation unit 45, and an adjustment unit 46.
An acquisition unit 41 for acquiring a multimedia file;
a decoding unit 42 for decoding video data and audio data in the multimedia file;
a processing unit 43, configured to perform rendering processing on the video data and perform audio track processing on the audio data;
and an encoding unit 44, configured to encode the processed video data and the processed audio data to obtain a multimedia video.
Specifically, in order to process video and audio according to the requirement of the user, the processing unit 43 includes:
a receiving module 4301, configured to receive a processing instruction input by a user, where the processing instruction carries an effect identifier;
and the processing module 4302 is configured to render the video data according to the video effect identifier in the effect identifier, and process the audio data according to the audio effect identifier in the effect identifier.
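The receiving and processing modules amount to a dispatch on the two components of the effect identifier. The sketch below is a hypothetical illustration of that dispatch: the effect names, the instruction layout, and the toy effects themselves are all assumptions, not part of the disclosed apparatus.

```python
# Hypothetical dispatch: a processing instruction carries one effect identifier,
# which is split into a video part and an audio part and handled separately.

VIDEO_EFFECTS = {"stereo": lambda frames: [f + "|stereo" for f in frames]}
AUDIO_EFFECTS = {"echo": lambda samples: [s * 2 for s in samples]}

def handle_instruction(instruction, frames, samples):
    video_id = instruction["effect"]["video"]   # video effect identifier
    audio_id = instruction["effect"]["audio"]   # audio effect identifier
    return (VIDEO_EFFECTS[video_id](frames), AUDIO_EFFECTS[audio_id](samples))

frames, samples = handle_instruction(
    {"effect": {"video": "stereo", "audio": "echo"}},
    ["f0", "f1"], [1, 2, 3],
)
```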
Specifically, in order to implement the processing steps of the video data, the processing module 4302 includes:
an extraction submodule 430201, configured to extract image data of each frame in the video data, and perform filter processing on the image data;
and the synthesis submodule 430202 is configured to identify a target image in the image data after the filter processing according to the video effect identifier, and perform synthesis rendering on the target image.
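The extraction sub-module's per-frame filter step can be sketched as below, under the assumption (ours, not the patent's) that a frame is a list of grayscale pixel values and the "filter" is a simple brightness lift clamped to the valid range.

```python
# Sketch of the extraction sub-module: walk the image data of each frame and
# apply a filter. The brightness-lift filter is an assumed example.

def extract_and_filter(video_frames, lift=10, max_value=255):
    filtered = []
    for frame in video_frames:                      # image data of each frame
        filtered.append([min(max_value, p + lift)   # filter processing, clamped
                         for p in frame])
    return filtered

frames = extract_and_filter([[0, 100, 250], [5, 255, 30]])
# frames -> [[10, 110, 255], [15, 255, 40]]
```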
The synthesis sub-module 430202 is specifically configured to, if it is identified that the video effect identifier is a synthesized stereo image, segment the target image, and perform coloring synthesis on the target image, the segmented target image, and the rendered image according to a preset coloring rule, where the preset coloring rule is used to reflect a position display relationship among the target image, the segmented target image, and the rendered image.
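A minimal sketch of such a preset coloring rule, assuming one-dimensional "images" of pixels and a binary segmentation mask: where the mask marks the segmented target, the target stays on top and the rendered image is hidden behind it; elsewhere the rendered image is shown. The patent only states that the rule encodes this position display relationship, so the mask-based formulation here is an illustrative assumption.

```python
# Minimal sketch of a preset coloring rule over 1-D "images".
# mask[i] == 1 marks pixels belonging to the segmented target image.

def composite(target, mask, overlay):
    out = []
    for t, m, o in zip(target, mask, overlay):
        # Position display relation: the rendered overlay is hidden wherever
        # it would cover the segmented target, and displayed everywhere else.
        out.append(t if m else o)
    return out

result = composite([10, 20, 30, 40], [0, 1, 1, 0], [99, 99, 99, 99])
# result -> [99, 20, 30, 99]: the overlay shows only outside the target region
```

In a real renderer this per-pixel choice would typically be a blend with an alpha channel rather than a hard select, but the hard select makes the role of the segmentation explicit.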
Specifically, in order to implement the processing steps of the audio data, the processing module 4302 further includes:
an acquisition sub-module 430203, configured to acquire discrete audio track data in the audio data at preset time intervals;
and the superposition submodule 430204 is configured to effectively superimpose the discrete audio track data with a preset audio track according to the audio effect identifier.
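The acquisition and superposition sub-modules can be sketched together as follows. The additive, clamped mix is our assumption of what "effectively superposing" means; the interval and limit values are likewise illustrative.

```python
# Hedged sketch: sample the decoded track at a preset interval and mix
# ("effectively superpose") it with a preset audio track, clamping the sum.

def superpose(track, preset, interval=2, limit=127):
    sampled = track[::interval]                 # discrete track data at preset intervals
    mixed = [min(limit, max(-limit, a + b))     # additive superposition, clamped
             for a, b in zip(sampled, preset)]
    return mixed

out = superpose([10, 0, 20, 0, 120, 0], [5, 5, 20], interval=2)
# sampled -> [10, 20, 120]; mixed -> [15, 25, 127] (last value clamped at 127)
```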
The decoding unit 42 is specifically configured to decode video data and audio data in the multimedia file according to a video track and an audio track, respectively.
Further, to allow the user to preview the rendered video and the processed audio at any time, the apparatus further comprises:
a display unit 45, configured to display the video data and the audio data when a live preview request is received.
Further, to allow the playback speed of the video to be adjusted at will, the apparatus further comprises:
the adjusting unit 46 is configured to receive a speed adjusting instruction, and adjust the playing speed of the video data and the audio data in the multimedia video according to the speed information carried in the speed adjusting instruction.
The embodiment of the invention provides another apparatus for editing multimedia video. The apparatus decodes the video data and the audio data in a multimedia file, renders the video data according to a video effect identifier, effectively superposes the audio data according to an audio effect identifier, and then encodes the result into a multimedia video. This enables the editing of live or captured videos, enriches the playback effect of short videos, makes the videos more vivid, improves the display effect of the video content, fits the people in the edited videos more closely to the rendered images, allows recorded short videos to be tailored to different requirements, broadens the uses of short videos, and improves the usability of the videos.
An embodiment of the present invention provides a storage device, in which a plurality of instructions are stored, the instructions being adapted to be loaded and executed by a processor: acquiring a multimedia file; decoding video data and audio data in the multimedia file; rendering the video data and performing audio track processing on the audio data; and coding the processed video data and the processed audio data to obtain the multimedia video.
The embodiment of the invention provides a mobile terminal, which comprises a processor adapted to implement various instructions, and a storage device adapted to store a plurality of instructions, the instructions being adapted to be loaded and executed by the processor to: acquire a multimedia file; decode video data and audio data in the multimedia file; render the video data and perform audio track processing on the audio data; and encode the processed video data and the processed audio data to obtain the multimedia video.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
It will be appreciated that the related features of the method and the apparatus described above may refer to one another. In addition, "first", "second", and the like in the above embodiments serve only to distinguish the embodiments and do not indicate that one embodiment is better or worse than another.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. It will be understood by those skilled in the art that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components in the multimedia video editing method and apparatus according to the embodiments of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any ordering; these words may be interpreted as names.
The embodiment of the invention discloses:
a1, a method for editing multimedia video, comprising:
acquiring a multimedia file;
decoding video data and audio data in the multimedia file;
rendering the video data and performing audio track processing on the audio data;
and coding the processed video data and the processed audio data to obtain the multimedia video.
A2, according to the method of A1, the rendering processing of the video data and the audio track processing of the audio data comprise:
receiving a processing instruction input by a user, wherein the processing instruction carries an effect identifier;
rendering the video data according to the video effect identifier in the effect identifier, and processing the audio data according to the audio effect identifier in the effect identifier.
A3, according to the method of A2, the rendering of the video data according to the video effect identifier in the effect identifier includes:
extracting image data of each frame in the video data, and carrying out filter processing on the image data;
and identifying a target image in the image data after filter processing according to the video effect identifier, and performing synthesis rendering on the target image.
A4, according to the method of A3, the identifying of a target image in the filter-processed image data according to the video effect identifier, and the composite rendering of the target image, include:
if the video effect identifier is recognized to be a composite three-dimensional image, segmenting the target image, and coloring and synthesizing the target image, the segmented target image and the rendering image according to a preset coloring rule, wherein the preset coloring rule is used for reflecting the position display relation among the target image, the segmented target image and the rendering image.
A5, according to the method of A2, the processing of the audio data according to the audio effect identifier in the effect identifier includes:
collecting discrete audio track data in the audio data according to a preset time interval;
and effectively superposing the discrete audio track data and a preset audio track according to the audio effect identifier.
A6, the method of A1, the decoding video data and audio data in the multimedia file comprising:
and respectively decoding the video data and the audio data in the multimedia file according to the video track and the audio track.
A7, according to the method of A1, after the rendering processing of the video data and the audio track processing of the audio data, the method further comprises:
and when a real-time preview request is received, displaying the video data and the audio data.
A8, the method of A1, the method further comprising:
and receiving a speed adjusting instruction, and adjusting the playing speed of video data and audio data in the multimedia video according to speed information carried in the speed adjusting instruction.
B9, an apparatus for editing a multimedia video, comprising:
the acquiring unit is used for acquiring the multimedia file;
a decoding unit for decoding video data and audio data in the multimedia file;
the processing unit is used for rendering the video data and carrying out audio track processing on the audio data;
and the coding unit is used for coding the processed video data and the processed audio data to obtain the multimedia video.
B10, the apparatus according to B9, the processing unit comprising:
the receiving module is used for receiving a processing instruction input by a user, wherein the processing instruction carries an effect identifier;
and the processing module is used for rendering the video data according to the video effect identifier in the effect identifier and processing the audio data according to the audio effect identifier in the effect identifier.
B11, the apparatus of B10, the processing module comprising:
the extraction submodule is used for extracting the image data of each frame in the video data and carrying out filter processing on the image data;
and the synthesis submodule is used for identifying a target image in the image data after the filter processing according to the video effect identifier and performing synthesis rendering on the target image.
B12, the device according to B11,
the synthesis sub-module is specifically configured to, if it is identified that the video effect identifier is a synthesized stereo image, segment the target image, and perform color synthesis on the target image, the segmented target image, and the rendered image according to a preset coloring rule, where the preset coloring rule is used to reflect a position display relationship among the target image, the segmented target image, and the rendered image.
B13, the apparatus of B10, the processing module further comprising:
the acquisition submodule is used for acquiring discrete audio track data in the audio data according to a preset time interval;
and the superposition submodule is used for effectively superposing the discrete audio track data and a preset audio track according to the audio effect identifier.
B14, the device according to B9,
the decoding unit is specifically configured to decode video data and audio data in the multimedia file according to a video track and an audio track, respectively.
B15, the apparatus of B9, the apparatus further comprising:
and the display unit is used for displaying the video data and the audio data when a real-time preview request is received.
B16, the apparatus of B9, the apparatus further comprising:
and the adjusting unit is used for receiving the speed adjusting instruction and adjusting the playing speed of the video data and the audio data in the multimedia video according to the speed information carried in the speed adjusting instruction.
C17, a storage device having stored therein a plurality of instructions adapted to be loaded and executed by a processor:
acquiring a multimedia file;
decoding video data and audio data in the multimedia file;
rendering the video data and performing audio track processing on the audio data;
and coding the processed video data and the processed audio data to obtain the multimedia video.
D18, a mobile terminal comprising a processor adapted to implement various instructions; and a storage device adapted to store a plurality of instructions, the instructions adapted to be loaded and executed by the processor to:
acquiring a multimedia file;
decoding video data and audio data in the multimedia file;
rendering the video data and performing audio track processing on the audio data;
and coding the processed video data and the processed audio data to obtain the multimedia video.

Claims (12)

1. A method for editing a multimedia video, comprising:
acquiring a multimedia file; decoding video data and audio data in the multimedia file;
rendering the video data and performing audio track processing on the audio data, including:
receiving a processing instruction input by a user, wherein the processing instruction carries an effect identifier; rendering the video data according to the video effect identifier in the effect identifiers, and processing the audio data according to the audio effect identifier in the effect identifiers;
coding the processed video data and the processed audio data to obtain a multimedia video;
the rendering the video data according to the video effect identifier in the effect identifiers comprises:
extracting image data of each frame in the video data, and carrying out filter processing on the image data;
identifying a target image in the image data processed by the filter according to the video effect identifier, and performing synthesis rendering on the target image;
the identifying a target image in the image data after the filter processing according to the video effect identifier and the synthesizing and rendering the target image comprise:
if the video effect identifier is recognized to be a composite stereo image, segmenting the target image, and coloring and synthesizing the target image, the segmented target image and the rendered image according to a preset coloring rule, wherein the preset coloring rule is used for reflecting a position display relation among the target image, the segmented target image and the rendered image, the rendered image is a bitmap to be added, and the position display relation specifies, when the rendered image covers the target image, whether the rendered image is displayed, how much of it is displayed, and whether it needs to be hidden.
2. The method of claim 1, wherein the processing the audio data according to the audio effect identifier of the effect identifiers comprises:
collecting discrete audio track data in the audio data according to a preset time interval;
and effectively superposing the discrete audio track data and a preset audio track according to the audio effect identifier.
3. The method of claim 1, wherein the decoding video data and audio data in the multimedia file comprises:
and respectively decoding the video data and the audio data in the multimedia file according to the video track and the audio track.
4. The method of claim 1, wherein after the rendering the video data and the soundtrack processing the audio data, the method further comprises:
and when a real-time preview request is received, displaying the video data and the audio data.
5. The method of claim 1, further comprising:
and receiving a speed adjusting instruction, and adjusting the playing speed of video data and audio data in the multimedia video according to speed information carried in the speed adjusting instruction.
6. An apparatus for editing a multimedia video, comprising:
the acquiring unit is used for acquiring the multimedia file;
a decoding unit for decoding video data and audio data in the multimedia file;
a processing unit, configured to perform rendering processing on the video data and perform audio track processing on the audio data, including: the receiving module is used for receiving a processing instruction input by a user, wherein the processing instruction carries an effect identifier; the processing module is used for rendering the video data according to the video effect identifier in the effect identifiers and processing the audio data according to the audio effect identifier in the effect identifiers;
the encoding unit is used for encoding the processed video data and the processed audio data to obtain a multimedia video;
the processing module comprises:
the extraction submodule is used for extracting the image data of each frame in the video data and carrying out filter processing on the image data;
the synthesis submodule is used for identifying a target image in the image data after the filter processing according to the video effect identifier and performing synthesis rendering on the target image;
the composition sub-module is specifically configured to, if it is identified that the video effect identifier is a composite stereo image, segment the target image, and perform coloring composition on the target image, the segmented target image, and the rendered image according to a preset coloring rule, where the preset coloring rule is used to reflect a position display relationship among the target image, the segmented target image, and the rendered image, the rendered image is a bitmap that needs to be added, and the position display relationship refers to whether the rendered image is displayed, how much it is displayed, and whether the rendered image needs to be hidden when the rendered image covers the target image.
7. The apparatus of claim 6, wherein the processing module further comprises:
the acquisition submodule is used for acquiring discrete audio track data in the audio data according to a preset time interval;
and the superposition submodule is used for effectively superposing the discrete audio track data and a preset audio track according to the audio effect identifier.
8. The apparatus of claim 6,
the decoding unit is specifically configured to decode video data and audio data in the multimedia file according to a video track and an audio track, respectively.
9. The apparatus of claim 6, further comprising:
and the display unit is used for displaying the video data and the audio data when a real-time preview request is received.
10. The apparatus of claim 6, further comprising:
and the adjusting unit is used for receiving the speed adjusting instruction and adjusting the playing speed of the video data and the audio data in the multimedia video according to the speed information carried in the speed adjusting instruction.
11. A memory device having stored therein a plurality of instructions adapted to be loaded and executed by a processor:
acquiring a multimedia file;
decoding video data and audio data in the multimedia file;
rendering the video data and performing audio track processing on the audio data, including:
receiving a processing instruction input by a user, wherein the processing instruction carries an effect identifier; rendering the video data according to the video effect identifier in the effect identifiers, and processing the audio data according to the audio effect identifier in the effect identifiers;
coding the processed video data and the processed audio data to obtain a multimedia video;
the rendering the video data according to the video effect identifier in the effect identifiers comprises:
extracting image data of each frame in the video data, and carrying out filter processing on the image data;
identifying a target image in the image data processed by the filter according to the video effect identifier, and performing synthesis rendering on the target image;
the identifying a target image in the image data after the filter processing according to the video effect identifier and the synthesizing and rendering the target image comprise:
if the video effect identifier is recognized to be a composite stereo image, segmenting the target image, and coloring and synthesizing the target image, the segmented target image and the rendered image according to a preset coloring rule, wherein the preset coloring rule is used for reflecting a position display relation among the target image, the segmented target image and the rendered image, the rendered image is a bitmap to be added, and the position display relation specifies, when the rendered image covers the target image, whether the rendered image is displayed, how much of it is displayed, and whether it needs to be hidden.
12. A mobile terminal comprising a processor adapted to implement various instructions; and a storage device adapted to store a plurality of instructions, the instructions adapted to be loaded and executed by the processor to:
acquiring a multimedia file;
decoding video data and audio data in the multimedia file;
rendering the video data and performing audio track processing on the audio data, including:
receiving a processing instruction input by a user, wherein the processing instruction carries an effect identifier; rendering the video data according to the video effect identifier in the effect identifiers, and processing the audio data according to the audio effect identifier in the effect identifiers;
coding the processed video data and the processed audio data to obtain a multimedia video;
the rendering the video data according to the video effect identifier in the effect identifiers comprises:
extracting image data of each frame in the video data, and carrying out filter processing on the image data;
identifying a target image in the image data processed by the filter according to the video effect identifier, and performing synthesis rendering on the target image;
the identifying a target image in the image data after the filter processing according to the video effect identifier and the synthesizing and rendering the target image comprise:
if the video effect identifier is recognized to be a composite stereo image, segmenting the target image, and coloring and synthesizing the target image, the segmented target image and the rendered image according to a preset coloring rule, wherein the preset coloring rule is used for reflecting a position display relation among the target image, the segmented target image and the rendered image, the rendered image is a bitmap to be added, and the position display relation specifies, when the rendered image covers the target image, whether the rendered image is displayed, how much of it is displayed, and whether it needs to be hidden.
Application CN201710566432.2A (priority date 2017-07-12, filing date 2017-07-12): Multimedia video editing method and device; granted as CN107241646B, legal status active.
Publications (2): CN107241646A, published 2017-10-10; CN107241646B, published 2020-08-14.

Family

ID=59990913

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710566432.2A Active CN107241646B (en) 2017-07-12 2017-07-12 Multimedia video editing method and device

Country Status (1)

Country Link
CN (1) CN107241646B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108234479A (en) * 2017-12-29 2018-06-29 北京百度网讯科技有限公司 For handling the method and apparatus of information
CN109168027B (en) * 2018-10-25 2020-12-11 北京字节跳动网络技术有限公司 Instant video display method and device, terminal equipment and storage medium
CN109543560A (en) * 2018-10-31 2019-03-29 百度在线网络技术(北京)有限公司 Dividing method, device, equipment and the computer storage medium of personage in a kind of video
CN109587552B (en) * 2018-11-26 2021-06-15 Oppo广东移动通信有限公司 Video character sound effect processing method and device, mobile terminal and storage medium
CN111343499A (en) * 2018-12-18 2020-06-26 北京奇虎科技有限公司 Video synthesis method and device
CN111355960B (en) * 2018-12-21 2021-05-04 北京字节跳动网络技术有限公司 Method and device for synthesizing video file, mobile terminal and storage medium
CN111866404B (en) * 2019-04-25 2022-04-29 华为技术有限公司 Video editing method and electronic equipment
CN112533058A (en) * 2019-09-17 2021-03-19 西安中兴新软件有限责任公司 Video processing method, device, equipment and computer readable storage medium
CN111460183B (en) * 2020-03-30 2024-02-13 北京金堤科技有限公司 Method and device for generating multimedia file, storage medium and electronic equipment
CN111818385B (en) * 2020-07-22 2022-08-09 Oppo广东移动通信有限公司 Video processing method, video processing device and terminal equipment
CN113315928B (en) * 2021-05-25 2022-03-22 南京慕映影视科技有限公司 Multimedia file making system and method
CN114007077B (en) * 2021-11-17 2023-09-01 北京百度网讯科技有限公司 Method and device for processing multimedia resources, electronic equipment and storage medium
CN114979766B (en) * 2022-05-11 2023-11-21 深圳市闪剪智能科技有限公司 Audio and video synthesis method, device, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102638658A (en) * 2012-03-01 2012-08-15 盛乐信息技术(上海)有限公司 Method and system for editing audio-video
CN103049908A (en) * 2012-12-10 2013-04-17 北京百度网讯科技有限公司 Method and device for generating stereoscopic video file
CN103327361A (en) * 2012-11-22 2013-09-25 中兴通讯股份有限公司 Method, device and system for obtaining real-time video communication playback data flow
CN104732593A (en) * 2015-03-27 2015-06-24 厦门幻世网络科技有限公司 Three-dimensional animation editing method based on mobile terminal
CN106373170A (en) * 2016-08-31 2017-02-01 北京云图微动科技有限公司 Video making method and video making device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080002942A1 (en) * 2006-05-24 2008-01-03 Peter White Method and apparatus for creating a custom track

Also Published As

Publication number Publication date
CN107241646A (en) 2017-10-10

Similar Documents

Publication Publication Date Title
CN107241646B (en) Multimedia video editing method and device
US11482192B2 (en) Automated object selection and placement for augmented reality
CN108200446B (en) On-line multimedia interaction system and method of virtual image
US11218739B2 (en) Live video broadcast method, live broadcast device and storage medium
CN108307229B (en) Video and audio data processing method and device
CN105340014B (en) Touch optimization design for video editing
CN111741326B (en) Video synthesis method, device, equipment and storage medium
CN112291627A (en) Video editing method and device, mobile terminal and storage medium
TWI556639B (en) Techniques for adding interactive features to videos
CN112637670B (en) Video generation method and device
US20170242833A1 (en) Systems and Methods to Generate Comic Books or Graphic Novels from Videos
WO2023202095A1 (en) Point cloud media encoding method and apparatus, point cloud media decoding method and apparatus, and electronic device and storage medium
CN108965746A (en) Image synthesizing method and system
KR20160119218A (en) Sound image playing method and device
CN104580837A (en) Video director engine based on GPU+CPU+IO architecture and using method thereof
CN111246196B (en) Video processing method and device, electronic equipment and computer readable storage medium
CN105872827A (en) Live broadcast method and device of application interface in mobile terminal
CN112422844A (en) Method, device and equipment for adding special effect in video and readable storage medium
CN113923504B (en) Video preview moving picture generation method and device
US10043302B2 (en) Method and apparatus for realizing boot animation of virtual reality system
CN113395569B (en) Video generation method and device
CN106792219B (en) Live broadcast playback method and device
CN113497963A (en) Video processing method, device and equipment
CN114500879A (en) Video data processing method, device, equipment and storage medium
US9807350B2 (en) Automated personalized imaging system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant