WO2022135105A1

WO2022135105A1 - Video dubbing method and apparatus for functional machine, terminal device and storage medium

Info

Publication number: WO2022135105A1
Application number: PCT/CN2021/135090
Authority: WO
Inventors: 黄瑞; 李京
Original assignee: 展讯半导体(成都)有限公司
Priority date: 2020-12-21
Filing date: 2021-12-02
Publication date: 2022-06-30
Also published as: CN112689194B; CN112689194A

Abstract

Embodiments of the present application provide a video dubbing method and apparatus for a functional machine, which are applied to a terminal device, and the terminal device belongs to a functional machine. Said method comprises: acquiring a first audio file required for dubbing; transcoding the first audio file to obtain a transcoded second audio file; acquiring a first video file to be dubbed; decoding the second audio file to obtain first audio data streams, and encoding the first video file to obtain video data streams; and synthesizing the audio data streams and the video data streams to obtain a second video file, so that the video dubbing function of the functional machine can be realized, and the software and hardware resources of the functional machine can be saved.

Description

Video soundtrack method and device for a feature phone, terminal device and storage medium

Technical field

The present application relates to the technical field of audio and video, and in particular, to a method, device, terminal device and storage medium for video soundtracking of a functional machine.

Background

The soundtrack function, in which music selected by the user is added to a video recorded by the user, has already been implemented on smartphones (Android operating system). The operating system of a feature phone, for example a real-time operating system (RTOS), is not compatible with smartphone application code, and a feature phone does not have a Java layer.

A feature phone is constrained by cost and hardware conditions, and its performance is far lower than that of a smartphone. Implementing the video soundtrack function on a feature phone needs to rely on its existing function modules as much as possible to reduce software and hardware overhead. Therefore, how to implement the video soundtrack function on a feature phone while saving its software and hardware resources is a problem that needs to be solved.

SUMMARY OF THE INVENTION

The embodiments of the present application provide a method and device for video soundtracking of a functional machine, which can realize the video soundtracking function of the functional machine and save the software and hardware resources of the functional machine.

In a first aspect, an embodiment of the present application provides a method for soundtracking a video of a functional machine, the method comprising:

obtaining a first audio file required for the soundtrack; transcoding the first audio file to obtain a transcoded second audio file;

obtaining a first video file to which the soundtrack is to be added;

decoding the second audio file to obtain a first audio data stream, and encoding the first video file to obtain a video data stream;

synthesizing the audio data stream and the video data stream to obtain a second video file.

In a second aspect, an embodiment of the present application provides a video soundtrack device for a functional machine, the device comprising:

an acquisition unit for acquiring the first audio file required for the soundtrack;

a processing unit for transcoding the first audio file to obtain a transcoded second audio file;

the acquisition unit is further configured to acquire the first video file to which the soundtrack is to be added;

The processing unit is further configured to decode the second audio file to obtain a first audio data stream, and encode the first video file to obtain a video data stream;

The processing unit is further configured to synthesize the audio data stream and the video data stream to obtain a second video file.

In a third aspect, embodiments of the present application provide a terminal device, including a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and are configured to be executed by the processor, and the programs include instructions for executing the steps in the method described in the first aspect of the embodiments of the present application.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and the computer program causes a computer to execute some or all of the steps described in the method of the first aspect of the embodiments of the present application.

In a fifth aspect, an embodiment of the present application provides a computer program product, wherein the computer program product includes a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to execute some or all of the steps described in the method of the first aspect of the embodiments of the present application. The computer program product may be a software installation package.

It can be seen that, in the embodiment of the present application, the first audio file required for the soundtrack is obtained; the first audio file is transcoded to obtain the transcoded second audio file; the first video file to which the soundtrack is to be added is obtained; the second audio file is decoded to obtain a first audio data stream, and the first video file is encoded to obtain a video data stream; and the audio data stream and the video data stream are synthesized to obtain a second video file. In this way, the video soundtrack function of the feature phone can be realized, and the software and hardware resources of the feature phone can be saved.

Description of drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the following briefly introduces the accompanying drawings required for the description of the embodiments or the prior art. Obviously, the drawings in the following description show only some embodiments of the present application. For those of ordinary skill in the art, other drawings can also be obtained based on these drawings without any creative effort.

FIG. 1A is a schematic flowchart of a video soundtrack method for a feature phone provided by an embodiment of the present application;

FIG. 1B is a schematic diagram illustrating the writing and reading of second audio data in a DMA buffer provided by an embodiment of the present application;

FIG. 2 is a schematic flowchart of another video soundtrack method for a feature phone provided by an embodiment of the present application;

FIG. 3 is a schematic structural diagram of a terminal device provided by an embodiment of the present application;

FIG. 4 is a schematic structural diagram of a video soundtrack device for a functional machine provided by an embodiment of the present application.

Detailed description

The terms used in the embodiments of the present application are only used to explain specific embodiments of the present application, and are not intended to limit the present application. The terms "first", "second", "third" and "fourth" in the description and claims of the present application and the drawings are used to distinguish different objects, rather than to describe a specific order. Furthermore, the terms "comprising" and "having" and any variations thereof are intended to cover non-exclusive inclusion.

The terminal device in the embodiment of the present application is a feature phone, and its operating system may be, for example, an RTOS. The operating system of a smartphone is generally the Android operating system, the iOS operating system, or the like, which is compatible with many application programs, whereas the operating system of a feature phone does not have a Java layer and is not compatible with many functional applications. In addition, a feature phone is constrained by cost and hardware conditions, and its performance is far lower than that of a smartphone. Implementing the video soundtrack function on a feature phone needs to rely on its existing function modules as much as possible to reduce software and hardware overhead.

Specifically, the terminal device in this embodiment of the present application has a wireless communication function and can be deployed on land, including indoor or outdoor, handheld, wearable, or vehicle-mounted deployment; it can also be deployed on water (such as on ships) or in the air (for example, on aircraft, balloons, or satellites). The terminal device can be a mobile phone, a tablet computer (pad), a computer with a wireless transceiver function, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in remote medical, a wireless terminal in a smart grid, a wireless terminal in a smart home, etc. The terminal device may also be a handheld device with wireless communication function, a vehicle-mounted device, a wearable device, a computer device, or other processing device connected to a wireless modem.

Please refer to FIG. 1A. FIG. 1A is a schematic flowchart of a video soundtrack method for a feature phone provided by an embodiment of the present application. The method is applied to a terminal device, and the terminal device is a feature phone. The method includes the following steps:

101. Obtain a first audio file required for the soundtrack; transcode the first audio file to obtain a transcoded second audio file.

The audio format of the first audio file may be any of the following: mp3, wav, midi, amr, etc. The first audio file may be an audio file recorded by a user through a terminal device, or an audio file downloaded from the Internet or transmitted from another device, which is not limited in this embodiment of the present application.

Wherein, the second audio file may be a pulse code modulation (pulse code modulation, pcm) type audio file in wav format.

In a specific implementation, the first audio file can be transcoded; specifically, an audio file whose audio format is mp3, wav, midi or amr can be converted into a pcm-type wav-format audio file. This is because a pcm-type wav-format audio file can be handled in the multimedia interface (MMI) layer of the feature phone's operating system, and the MMI layer can be implemented in C code, so it is compatible with the feature phone. In this way, the limitation that the feature phone does not have a Java layer can be overcome.

Optionally, in the above step 101, the transcoding of the first audio file to obtain a transcoded second audio file may include:

11. The first audio file is separated to obtain the separated audio file;

12. Decoding the separated audio file to obtain the first audio data of the first preset type;

13. Add a file header of the first audio format to the first audio data to obtain the second audio file.

The first preset type may be a pcm type, and the file header of the first audio format may be a wav file header.

In a specific implementation, the first audio file can be separated to obtain the separated audio file, and the separated audio file can be decoded to obtain first audio data of the first preset type, for example, pcm-type first audio data. The pcm-type first audio data is then output as a file to a storage device such as a flash memory. Finally, a file header of the first audio format, for example a wav file header, can be added to the pcm-type first audio data to obtain the second audio file, for example a pcm-type wav-format audio file, so that the second audio file can be used to implement the video soundtrack function on the feature phone. The second audio file can be temporarily stored as an intermediate file in the storage space of the feature phone.
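Step 13 above (adding a wav file header to pcm data) can be sketched as follows. This is only an illustration, not the implementation of the present application: the function name `add_wav_header` and the default sample rate, channel count and bit depth are assumptions chosen for the example, and the header layout is the common 44-byte RIFF/WAVE form for uncompressed pcm.

```python
import struct


def add_wav_header(pcm: bytes, sample_rate: int = 8000,
                   channels: int = 1, bits_per_sample: int = 16) -> bytes:
    """Prepend a 44-byte RIFF/WAVE header to raw pcm samples.

    All parameter defaults are illustrative assumptions.
    """
    block_align = channels * bits_per_sample // 8   # bytes per sample frame
    byte_rate = sample_rate * block_align           # bytes per second
    header = struct.pack(
        "<4sI4s4sIHHIIHH4sI",
        b"RIFF", 36 + len(pcm),   # RIFF chunk size = file size minus 8 bytes
        b"WAVE",
        b"fmt ", 16,              # "fmt " chunk with a 16-byte body
        1,                        # audio format 1 = uncompressed pcm
        channels, sample_rate, byte_rate, block_align, bits_per_sample,
        b"data", len(pcm),        # "data" chunk holding the samples
    )
    return header + pcm
```

The result corresponds to the "pcm-type wav-format audio file" described above; on a real device the pcm bytes would come from the decoder output stored in flash memory rather than from an in-memory value.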

102. Acquire a first video file to which the soundtrack is to be added.

The first video file may be obtained by recording it through a video application, by receiving it from an external device, by downloading it from the network, or by retrieving a locally stored first video file.

Optionally, in the above step 102, acquiring the first video file to which the soundtrack is to be added may include the following steps:

record first video data;

The first video data is encoded to obtain an encoded first video file.

In this embodiment of the present application, the recording of the first video file may be performed simultaneously with the transcoding of the first audio file.

103. Decode the second audio file to obtain a first audio data stream.

In a specific implementation, there is no need to perform video recording encoding; instead, based on the existing functions of the feature phone, the second audio file is decoded into the first audio data stream and the first video file is encoded to obtain the video data stream, so that the video soundtrack function can be implemented on the feature phone.

Optionally, in the above step 103, the decoding of the second audio file to obtain the first audio data stream may include:

The file header in the second audio file is removed to obtain the decoded first audio data;

The decoded first audio data is synthesized into the first audio data stream.

In a specific implementation, the wav file header in the pcm-type wav format audio file can be removed to obtain the pcm-type first audio data, and then the decoded pcm-type first audio data can be synthesized into the first audio data stream.
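Removing the wav file header can be sketched as below. This is a minimal illustration under stated assumptions: the function name `strip_wav_header` is hypothetical, and instead of assuming a fixed 44-byte header it walks the RIFF chunks until it finds the `data` chunk, which also tolerates files that carry extra chunks before the samples.

```python
import struct


def strip_wav_header(wav: bytes) -> bytes:
    """Return the raw pcm payload of a RIFF/WAVE file (hypothetical helper)."""
    if len(wav) < 12 or wav[0:4] != b"RIFF" or wav[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12                                  # first chunk follows the RIFF header
    while pos + 8 <= len(wav):
        chunk_id, size = struct.unpack_from("<4sI", wav, pos)
        body = pos + 8
        if chunk_id == b"data":
            return wav[body:body + size]      # the pcm samples
        pos = body + size + (size & 1)        # chunks are padded to even length
    raise ValueError("no data chunk found")
```

The returned bytes play the role of the pcm-type first audio data, which would then be fed into the first audio data stream.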

Optionally, if the video duration of the first video file is greater than the audio duration of the second audio file, the second audio file can be used cyclically. Specifically, after decoding of the second audio file is finished, the second audio file may be decoded again; alternatively, video playback may be paused.
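The cyclic use of the audio can be sketched as follows. The names `pcm_bytes_for_duration` and `loop_pcm` and the default audio parameters are assumptions made for this example only; the sketch simply repeats the decoded pcm data until it covers the video duration and then trims it to a whole number of sample frames.

```python
def pcm_bytes_for_duration(duration_ms: int, sample_rate: int = 8000,
                           channels: int = 1, bits_per_sample: int = 16) -> int:
    """Number of pcm bytes needed to cover duration_ms of video."""
    frame_size = channels * bits_per_sample // 8
    return sample_rate * duration_ms // 1000 * frame_size


def loop_pcm(pcm: bytes, needed_bytes: int, frame_size: int = 2) -> bytes:
    """Repeat pcm data until it is needed_bytes long (rounded down to whole frames)."""
    if not pcm or len(pcm) % frame_size:
        raise ValueError("pcm must be a non-empty whole number of frames")
    needed = needed_bytes - needed_bytes % frame_size
    repeats = -(-needed // len(pcm))          # ceiling division
    return (pcm * repeats)[:needed]
```

For example, `loop_pcm(audio, pcm_bytes_for_duration(video_ms))` would yield audio data exactly as long as a video lasting `video_ms` milliseconds, where `audio` and `video_ms` are placeholders for values supplied by the caller.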

104. Synthesize the first audio data stream and the first video file to obtain a second video file.

In this embodiment of the present application, the first audio data stream and the first video file may be synthesized to obtain the second video file, so that the audio data and the video data are stored and transmitted as one file.

Optionally, after the above step 104, the following steps may also be included:

105, decompose the second video file to obtain the decomposed second audio data and the first video data;

106, write the second audio data into a direct memory access (direct memory access, DMA) buffer;

107, read the second audio data of the DMA buffer, carry out encoding and decoding processing, and obtain the second audio data stream after encoding and decoding processing;

108. Synchronously play the first video data and the encoded and decoded second audio data stream.

In this embodiment of the present application, the second video file may also be played. Specifically, the second video file may first be decomposed to obtain the decomposed second audio data and first video data. The second audio data may be written into the DMA buffer, then read from the DMA buffer and subjected to encoding and decoding processing to obtain the second audio data stream, which is transmitted to a playback hardware device such as a speaker, an earpiece or an earphone; at the same time, the first video data is transmitted to the display screen and played on the same timeline.

It can be seen that, by writing the second audio data directly into the DMA buffer, the present application does not need to write the second audio data into a track buffer first. As a result, the amount of data transferred through the DMA buffer increases and the number of interrupts during DMA transfer is greatly reduced, so that the terminal device can play music for a longer time under the same conditions.

Optionally, the second audio data includes multiple sets of audio data frames, there are multiple DMA buffers, and each set of audio data frames includes at least one frame of audio data. In the above step 106, writing the second audio data into the DMA buffer includes:

61. Write the multiple frames of audio data into multiple DMA buffers, wherein the multiple DMA buffers include a first DMA buffer and a second DMA buffer, writing a second set of audio data frames into the second DMA buffer is performed at the same time as reading a first set of audio data frames from the first DMA buffer, and the first set of audio data frames and the second set of audio data frames are any two distinct sets among the multiple sets of audio data frames.

In this embodiment of the present application, the second audio data may be written into multiple different DMA buffers. For example, FIG. 1B is a schematic diagram illustrating the writing and reading of the second audio data in the DMA buffers. At least one frame of audio data of the first set of audio data frames can be written into the first DMA buffer. After that write is completed, at least one frame of audio data of the second set of audio data frames can be written into the second DMA buffer. While the second DMA buffer is being written, the audio data of the first set of audio data frames can be read from the first DMA buffer at the same time and subjected to encoding and decoding processing to obtain the processed audio data stream. In this way, the efficiency with which the terminal device processes audio data can be improved.
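The alternating use of two DMA buffers can be illustrated with a small simulation. It is a sketch of the scheduling idea only: real DMA transfers run in hardware, whereas here each loop iteration stands for one time slot in which a write and a read happen together. The function name `ping_pong_schedule` and the event strings are assumptions made for the example.

```python
from typing import List, Optional, Sequence, Tuple


def ping_pong_schedule(frame_sets: Sequence[bytes]) -> Tuple[List[List[str]], List[bytes]]:
    """Simulate two alternating buffers; return per-slot events and data read out."""
    buffers: List[Optional[bytes]] = [None, None]   # stand-ins for two DMA buffers
    events: List[List[str]] = []
    output: List[bytes] = []
    for i, data in enumerate(frame_sets):
        w = i % 2                                   # buffer written in this slot
        slot = [f"write set {i} -> buf {w}"]
        buffers[w] = data
        if i > 0:
            r = 1 - w                               # the other buffer is read
            output.append(buffers[r])
            slot.append(f"read set {i - 1} <- buf {r}")
        events.append(slot)
    if frame_sets:
        r = (len(frame_sets) - 1) % 2               # drain the last written buffer
        output.append(buffers[r])
        events.append([f"read set {len(frame_sets) - 1} <- buf {r}"])
    return events, output
```

In every slot the write and the read touch different buffers, so neither has to wait for the other, and the sets of audio data frames still come out in their original order.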

Optionally, in this embodiment of the present application, the following steps may also be included:

receiving a pause play instruction for the second video file;

The operation of writing the sets of audio data frames to the DMA buffer and the operation of reading audio data from the DMA buffer are stopped according to the pause play instruction.

In this embodiment of the present application, the user may pause the playback of the second video file. At this time, the operation of writing the sets of audio data frames to the DMA buffer and the operation of reading audio data from the DMA buffer may be stopped. Because this solution does not need to first write the second audio data into a track buffer and then read it from the track buffer and write it into the DMA buffer, the operations of sequentially copying the second audio data from the track buffer to the DMA buffer are reduced when a DMA interrupt is triggered by pausing video playback, thereby reducing the power consumption of the terminal device and improving audio data processing efficiency.
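The pause behaviour can be sketched with a small state holder. The class name `DmaAudioPump`, its methods and the single-list "buffer" are hypothetical simplifications for illustration; the point shown is only that one pause flag stops both the write into the buffer and the read out of it, with no intermediate track buffer involved.

```python
from typing import List, Sequence


class DmaAudioPump:
    """Toy model: moves frame sets through a buffer unless playback is paused."""

    def __init__(self, frame_sets: Sequence[bytes]) -> None:
        self._pending: List[bytes] = list(frame_sets)  # data not yet written
        self._buffer: List[bytes] = []                 # stand-in for the DMA buffer
        self.played: List[bytes] = []                  # data already read out
        self.paused = False

    def pause(self) -> None:
        self.paused = True

    def resume(self) -> None:
        self.paused = False

    def tick(self) -> bool:
        """Do one read and one write; return True while work remains."""
        if self.paused:
            return False                               # both operations are stopped
        if self._buffer:
            self.played.append(self._buffer.pop(0))    # read from the buffer
        if self._pending:
            self._buffer.append(self._pending.pop(0))  # write directly to the buffer
        return bool(self._buffer or self._pending)
```

Calling `pause()` between ticks leaves both `played` and the buffer contents untouched until `resume()` is called.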

Please refer to FIG. 2. FIG. 2 is a schematic flowchart of another feature machine video soundtrack method provided by an embodiment of the present application. The feature machine video soundtrack method is applied to a terminal device, and the terminal device belongs to a feature machine. The method includes the following steps:

201. Obtain a first audio file required for the soundtrack.

The audio format of the first audio file may be any one of the following: mp3, wav, midi, or amr.

In a specific implementation, the user can select the first audio file required for the soundtrack.

202. Separate the first audio file to obtain a separated audio file.

203. Decode the separated audio file to obtain first audio data of a first preset type.

Wherein, the first preset type may be a pcm type.

204. Add a file header of a first audio format to the first audio data to obtain the second audio file.

The file header of the first audio format may be a wav file header, and the second audio file may be a pcm-type wav format audio file.

205. Acquire a first video file to which the soundtrack is to be added.

In a specific implementation, the user may perform video recording through the terminal device to obtain the first video file.

206. Remove the file header in the second audio file to obtain decoded first audio data.

In a specific implementation, the wav file header in the pcm-type wav format audio file may be removed to obtain the pcm-type first audio data.

207. Synthesize the decoded first audio data into the first audio data stream.

208. Synthesize the first audio data stream and the first video file to obtain a second video file.

209. Decompose the second video file to obtain decomposed second audio data and first video data.

210. Write the second audio data into the direct memory access DMA buffer.

211. Read the second audio data in the DMA buffer, perform encoding and decoding processing, and obtain a second audio data stream after encoding and decoding processing.

212. Synchronously play the first video data and the encoded and decoded second audio data stream.

For the specific implementation process of the above steps 201 to 212, reference may be made to the corresponding descriptions of steps 101 to 108, which will not be repeated here.

It can be seen that, in the embodiment of the present application, the first audio file required for the soundtrack is obtained; the first audio file is separated to obtain the separated audio file; the separated audio file is decoded to obtain the first audio data of the first preset type; a file header of the first audio format is added to the first audio data to obtain the second audio file; the first video file to which the soundtrack is to be added is obtained; the file header in the second audio file is removed to obtain the decoded first audio data; the decoded first audio data is synthesized into the first audio data stream; the first audio data stream and the first video file are synthesized to obtain the second video file; the second video file is decomposed to obtain the decomposed second audio data and first video data; the second audio data is written into the direct memory access (DMA) buffer; the second audio data is read from the DMA buffer and subjected to encoding and decoding processing to obtain the second audio data stream; and the first video data and the second audio data stream are played synchronously. In this way, the video soundtrack function of the feature phone can be realized, and the software and hardware resources of the feature phone can be saved.

Please refer to FIG. 3. FIG. 3 is a schematic structural diagram of a terminal device provided by an embodiment of the present application, including: one or more processors, one or more memories, one or more communication interfaces, and one or more programs;

the one or more programs are stored in the memory and configured to be executed by the one or more processors;

The program includes instructions for performing the following steps:

Obtain the first video file to which the soundtrack is to be added;

The second audio file is decoded to obtain the first audio data stream;

The first audio data stream and the first video file are synthesized to obtain a second video file.

In an implementation manner of the present application, in the aspect of transcoding the first audio file to obtain the transcoded second audio file, the program includes instructions for performing the following steps:

The first audio file is separated to obtain the separated audio file;

Decoding the separated audio file to obtain the first audio data of the first preset type;

A file header of a first audio format is added to the first audio data to obtain the second audio file.

In an implementation manner of the present application, in the aspect of decoding the second audio file to obtain the first audio data stream, the program includes instructions for performing the following steps:

The decoded first audio data is synthesized into the first audio data stream.

In an implementation manner of the present application, in the aspect of obtaining the first video file to which the soundtrack is to be added, the program includes instructions for performing the following steps:

record first video data;

The first video data is encoded to obtain an encoded first video file.

In an implementation manner of the present application, after synthesizing the first audio data stream and the first video file to obtain the second video file, the program includes an instruction for performing the following steps:

The second video file is decomposed to obtain the decomposed second audio data and the first video data;

writing the second audio data into a direct memory access DMA buffer;

Read the second audio data of the DMA buffer, carry out encoding and decoding processing, and obtain the second audio data stream after encoding and decoding processing;

The first video data and the encoded and decoded second audio data stream are played synchronously.

In an implementation manner of the present application, the second audio data includes multiple sets of audio data frames, the number of DMA buffers is multiple, and each set of audio data frames includes at least one frame of audio data. In terms of writing the second audio data to a DMA buffer, the program includes instructions for performing the following steps:

Write the multiple frames of audio data into multiple DMA buffers, wherein the multiple DMA buffers include a first DMA buffer and a second DMA buffer, writing a second set of audio data frames into the second DMA buffer is performed at the same time as reading a first set of audio data frames from the first DMA buffer, and the first set of audio data frames and the second set of audio data frames are any two distinct sets among the multiple sets of audio data frames.

In an implementation of the present application, the program further includes instructions for performing the following steps:

receiving a pause play instruction for the second video file;

It should be noted that, for the specific implementation process of this embodiment, reference may be made to the specific implementation process described in the foregoing method embodiment, which is not described herein again.

Please refer to FIG. 4. FIG. 4 is a schematic structural diagram of a video soundtrack device for a functional machine provided by an embodiment of the present application, applied to a terminal device, and the device includes:

Obtaining unit 401, for obtaining the first audio file required for the soundtrack;

a processing unit 402 for transcoding the first audio file to obtain a transcoded second audio file;

The obtaining unit 401 is further configured to obtain the first video file to which the soundtrack is to be added;

The processing unit 402 is further configured to decode the second audio file to obtain a first audio data stream, and encode the first video file to obtain a video data stream;

The processing unit 402 is further configured to synthesize the audio data stream and the video data stream to obtain a second video file.

In an implementation manner of the present application, in the aspect of transcoding the first audio file to obtain the transcoded second audio file, the processing unit 402 is specifically configured to:

The first audio file is separated to obtain the separated audio file;

Add the file header of the first audio format to the first audio data to obtain the second audio file.

In an implementation manner of the present application, in the aspect of decoding the second audio file to obtain the first audio data stream, the processing unit 402 is specifically configured to:

The decoded first audio data is synthesized into the first audio data stream.

In an implementation manner of the present application, in the aspect of obtaining the first video file to which the soundtrack is to be added, the obtaining unit 401 is specifically configured to:

record first video data;

The first video data is encoded to obtain an encoded first video file.

In an implementation manner of the present application, after synthesizing the first audio data stream and the first video file to obtain the second video file, the processing unit 402 is further configured to:

writing the second audio data into a direct memory access DMA buffer;

In an implementation manner of the present application, the second audio data includes multiple sets of audio data frames, the number of DMA buffers is multiple, and each set of audio data frames includes at least one frame of audio data. In terms of writing the second audio data into the DMA buffer, the above-mentioned processing unit 402 is specifically used for:

In an implementation manner of the present application, the above-mentioned processing unit 402 is further configured to:

receiving a pause play instruction for the second video file;

It should be noted that, the obtaining unit 401 and the processing unit 402 in the apparatus may be implemented by a processor. For the specific implementation steps and other implementation steps in the embodiments of the present application, reference may be made to the specific implementation processes described in the foregoing method embodiments, which are not described herein again.

Embodiments of the present application further provide a computer storage medium, wherein the computer storage medium stores a computer program, the computer program causes a computer to execute part or all of the steps of any method described in the above method embodiments, and the computer includes user equipment.

Embodiments of the present application further provide a computer program product, where the computer program product includes a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to execute some or all of the steps of any method described in the above method embodiments. The computer program product may be a software installation package, and the computer includes user equipment.

It should be noted that, for the sake of simple description, the foregoing method embodiments are all expressed as a series of action combinations, but those skilled in the art should know that the present application is not limited by the described action sequence, because, in accordance with the present application, certain steps may be performed in other orders or concurrently. Secondly, those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.

In the above-mentioned embodiments, the description of each embodiment has its own emphasis. For parts that are not described in detail in a certain embodiment, reference may be made to the relevant descriptions of other embodiments.

In the several embodiments provided in this application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the apparatus embodiments described above are only illustrative: the division of the above-mentioned units is only a logical function division, and other division methods may be used in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection of devices or units through some interfaces, and may be in electrical or other forms.

The above-mentioned units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.

In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.

The above-mentioned integrated units, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable memory. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a memory and includes several instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the above-mentioned methods in the various embodiments of the present application. The aforementioned memory includes various media that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.

Those of ordinary skill in the art can understand that all or part of the steps in the various methods of the above embodiments can be completed by a program instructing the relevant hardware, and the program can be stored in a computer-readable memory, which may include: a flash disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, etc.

The embodiments of the present application are described in detail above, and specific examples are used herein to illustrate the principles and implementations of the present application. The descriptions of the above embodiments are only intended to help in understanding the methods and core ideas of the present application. At the same time, persons of ordinary skill in the art may, based on the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the contents of this specification should not be construed as limiting the present application.

Claims

A function machine video soundtrack method, characterized in that the method comprises:

acquiring a first audio file required for the soundtrack; transcoding the first audio file to obtain a transcoded second audio file;

acquiring a first video file to which the soundtrack is to be added;

decoding the second audio file to obtain a first audio data stream;

synthesizing the first audio data stream and the first video file to obtain a second video file.
The method according to claim 1, wherein transcoding the first audio file to obtain the transcoded second audio file comprises:

separating the first audio file to obtain a separated audio file;

decoding the separated audio file to obtain first audio data of a first preset type;

adding a file header of a first audio format to the first audio data to obtain the second audio file.
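As a non-limiting illustration of the header-adding step above, the following Python sketch wraps already-decoded PCM samples in a WAV container using the standard-library `wave` module. The function name `pcm_to_wav_bytes` and the default sample rate, channel count, and sample width are assumptions chosen for the example and are not taken from the application. The separation and decoding of the source file (for example, mp3 or amr) are assumed to have been done beforehand.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 8000,
                     channels: int = 1, sample_width: int = 2) -> bytes:
    """Prepend a WAV (RIFF) file header to raw PCM samples.

    All parameter defaults are illustrative assumptions.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(channels)      # number of channels in the header
        writer.setsampwidth(sample_width)  # bytes per sample
        writer.setframerate(sample_rate)   # samples per second
        writer.writeframes(pcm)            # payload; header sizes are patched on close
    return buf.getvalue()
```

For uncompressed PCM the `wave` module writes a 44-byte header, so the result is the header followed by the unchanged sample bytes.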
The method according to claim 1 or 2, wherein decoding the second audio file to obtain the first audio data stream comprises:

removing the file header from the second audio file to obtain decoded first audio data;

synthesizing the decoded first audio data into the first audio data stream.
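The header removal and stream formation described above can be sketched, purely for illustration, with the Python standard-library `wave` module: the reader skips the WAV header and yields the PCM payload in fixed-size chunks. The name `wav_to_pcm_stream` and the chunk size of 160 frames are assumptions made for this example, not values from the application.

```python
import io
import wave
from typing import Iterator


def wav_to_pcm_stream(wav_bytes: bytes,
                      frames_per_chunk: int = 160) -> Iterator[bytes]:
    """Strip the WAV header and yield the PCM payload chunk by chunk.

    frames_per_chunk is an illustrative assumption.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as reader:
        while True:
            chunk = reader.readframes(frames_per_chunk)
            if not chunk:  # empty bytes means the payload is exhausted
                break
            yield chunk
```

Concatenating the yielded chunks reproduces the PCM data that was stored behind the header.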
The method according to claim 3, wherein acquiring the first video file to which the soundtrack is to be added comprises:

recording first video data;

encoding the first video data to obtain an encoded first video file.
The method according to claim 4, wherein after synthesizing the first audio data stream and the first video file to obtain the second video file, the method further comprises:

decomposing the second video file to obtain second audio data and the first video data;

writing the second audio data into a direct memory access (DMA) buffer;

reading the second audio data from the DMA buffer and performing encoding and decoding processing on it to obtain a second audio data stream;

playing the first video data and the second audio data stream synchronously.
The method according to claim 5, wherein the second audio data includes multiple sets of audio data frames, there are multiple DMA buffers, each set of audio data frames includes at least one frame of audio data, and writing the second audio data into the DMA buffer comprises:

writing the multiple sets of audio data frames into the multiple DMA buffers, wherein the multiple DMA buffers include a first DMA buffer and a second DMA buffer, the writing of a second set of audio data frames into the second DMA buffer is performed simultaneously with the reading of a first set of audio data frames from the first DMA buffer, and the first set of audio data frames and the second set of audio data frames are any two different sets among the multiple sets of audio data frames.
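The two-buffer arrangement described above resembles what is commonly called double (ping-pong) buffering. The following Python sketch is a host-side simulation only, under stated assumptions: two `bytearray` objects stand in for the DMA buffers, a producer thread stands in for the writer, and the caller-supplied `consume` callback stands in for the reader. None of these names come from the application, and a real function machine would use its RTOS and DMA controller instead of threads and queues.

```python
import queue
import threading
from typing import Callable, Iterable, List


def play_with_ping_pong(frame_sets: Iterable[List[bytes]],
                        consume: Callable[[bytes], None]) -> None:
    """Fill one buffer while the other is being read (simulation only)."""
    buffers = [bytearray(), bytearray()]  # stand-ins for two DMA buffers
    free: "queue.Queue[int]" = queue.Queue()   # buffers available for writing
    filled: "queue.Queue" = queue.Queue()      # buffers ready for reading
    free.put(0)
    free.put(1)

    def producer() -> None:
        for frames in frame_sets:
            idx = free.get()                  # wait for a buffer not being read
            buffers[idx][:] = b"".join(frames)
            filled.put(idx)
        filled.put(None)                      # sentinel: no more frame sets

    writer = threading.Thread(target=producer)
    writer.start()
    while (idx := filled.get()) is not None:
        consume(bytes(buffers[idx]))          # read one set of frames
        free.put(idx)                         # hand the buffer back to the writer
    writer.join()
```

Because a buffer is returned to the `free` queue only after it has been read, the writer can never overwrite a set of frames that the reader is still using, while the second buffer lets writing and reading overlap.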
The method according to claim 6, wherein the method further comprises:

receiving a pause-playback instruction for the second video file;

stopping, according to the pause-playback instruction, the operation of writing the sets of audio data frames into the DMA buffers and the operation of reading audio data from the DMA buffers.
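One simple way to picture the pause behaviour described above is a shared flag that both the write operation and the read operation check before doing any work. The Python sketch below is a toy model under stated assumptions: the class name `PausableAudioPipe`, the buffer capacity of 2, and the `resume` method are all hypothetical additions for illustration and are not described in the application.

```python
from collections import deque
from typing import Deque, Iterable, Optional


class PausableAudioPipe:
    """Toy model of the write/read path around DMA buffers with a pause flag."""

    def __init__(self, frame_sets: Iterable[bytes], capacity: int = 2) -> None:
        self._pending: Deque[bytes] = deque(frame_sets)  # sets not yet written
        self._dma: Deque[bytes] = deque()                # stand-in for DMA buffers
        self._capacity = capacity                        # assumed buffer count
        self._paused = False

    def pause(self) -> None:
        """Model receipt of the pause-playback instruction."""
        self._paused = True

    def resume(self) -> None:
        """Hypothetical counterpart to pause; not part of the source text."""
        self._paused = False

    def write_step(self) -> bool:
        """Write one set of frames unless paused, empty, or full."""
        if self._paused or not self._pending or len(self._dma) >= self._capacity:
            return False
        self._dma.append(self._pending.popleft())
        return True

    def read_step(self) -> Optional[bytes]:
        """Read one set of frames unless paused or nothing is buffered."""
        if self._paused or not self._dma:
            return None
        return self._dma.popleft()
```

While the flag is set, both `write_step` and `read_step` return without touching the buffers, so the buffered audio is left exactly as it was when the pause arrived.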
The method of claim 1, wherein the first audio file is an audio file recorded by a user through a terminal device, or an audio file downloaded from the Internet.
The method of claim 1, wherein the audio format of the first audio file is the mp3, wav, midi or amr format, and the audio format of the second audio file is the wav format of the pcm type.
A function machine video soundtrack device, characterized in that the device comprises:

an acquisition unit, configured to acquire a first audio file required for the soundtrack;

a processing unit, configured to transcode the first audio file to obtain a transcoded second audio file;

wherein the acquisition unit is further configured to acquire a first video file to which the soundtrack is to be added;

the processing unit is further configured to decode the second audio file to obtain a first audio data stream, and to encode the first video file to obtain a video data stream;

the processing unit is further configured to synthesize the first audio data stream and the video data stream to obtain a second video file.
The device of claim 10, wherein the processing unit is specifically configured to:

separate the first audio file to obtain a separated audio file;

decode the separated audio file to obtain first audio data of a first preset type;

add a file header of a first audio format to the first audio data to obtain the second audio file.
The device according to claim 10 or 11, wherein the processing unit 402 is specifically configured to:

remove the file header from the second audio file to obtain decoded first audio data;

synthesize the decoded first audio data into the first audio data stream.
The device according to claim 12, wherein the acquisition unit 401 is specifically configured to:

record first video data;

encode the first video data to obtain an encoded first video file.
The device of claim 13, wherein the processing unit 402 is further configured to:

decompose the second video file to obtain second audio data and the first video data;

write the second audio data into a direct memory access (DMA) buffer;

read the second audio data from the DMA buffer and perform encoding and decoding processing on it to obtain a second audio data stream;

play the first video data and the second audio data stream synchronously.
The device of claim 14, wherein the second audio data includes multiple sets of audio data frames, there are multiple DMA buffers, each set of audio data frames includes at least one frame of audio data, and in terms of writing the second audio data into the DMA buffer, the processing unit 402 is specifically configured to:

write the multiple sets of audio data frames into the multiple DMA buffers, wherein the multiple DMA buffers include a first DMA buffer and a second DMA buffer, the writing of a second set of audio data frames into the second DMA buffer is performed simultaneously with the reading of a first set of audio data frames from the first DMA buffer, and the first set of audio data frames and the second set of audio data frames are any two different sets among the multiple sets of audio data frames.
The device of claim 15, wherein the processing unit 402 is further configured to:

receive a pause-playback instruction for the second video file;

stop, according to the pause-playback instruction, the operation of writing the sets of audio data frames into the DMA buffers and the operation of reading audio data from the DMA buffers.
The device of claim 10, wherein the first audio file is an audio file recorded by a user through a terminal device, or an audio file downloaded from the Internet.
The device according to claim 10, wherein the audio format of the first audio file is the mp3, wav, midi or amr format, and the audio format of the second audio file is the wav format of the pcm type.
A terminal device, characterized in that the terminal device includes a processor, a memory, a communication interface, and one or more programs, the one or more programs being stored in the memory and configured to be executed by the processor, the programs comprising instructions for performing the steps in the method of any one of claims 1-7.
A computer-readable storage medium, characterized in that, the computer-readable storage medium stores a computer program, and the computer program is executed by a processor to implement the method according to any one of claims 1-7.
A chip, characterized by comprising a processor, wherein the processor executes the steps of the method according to any one of claims 1-9.