CN112689194B

CN112689194B - Functional machine video music matching method and device, terminal equipment and storage medium

Info

Publication number: CN112689194B
Application number: CN202011523943.4A
Authority: CN
Inventors: 黄瑞; 李京
Original assignee: Spreadtrum Semiconductor Chengdu Co Ltd
Current assignee: Spreadtrum Semiconductor Chengdu Co Ltd
Priority date: 2020-12-21
Filing date: 2020-12-21
Publication date: 2023-02-10
Anticipated expiration: 2040-12-21
Also published as: CN112689194A; WO2022135105A1

Abstract

Embodiments of the application provide a method and a device for adding a soundtrack to video on a feature phone, applied to a terminal device that is a feature phone. The method comprises: acquiring a first audio file required for the soundtrack; transcoding the first audio file to obtain a transcoded second audio file; acquiring a first video file to which the soundtrack is to be added; decoding the second audio file to obtain a first audio data stream, and encoding the first video file to obtain a video data stream; and synthesizing the audio data stream and the video data stream to obtain a second video file. The video soundtrack function can thereby be realized on the feature phone while saving its software and hardware resources.

Description

Functional machine video music matching method and device, terminal equipment and storage medium

Technical Field

The application relates to the technical field of audio and video, and in particular to a video soundtrack method and device for a feature phone, a terminal device, and a storage medium.

Background

The video soundtrack function is already implemented on smartphones (for example, on the Android operating system), where application-layer software attaches music selected by the user to a video recorded by the user. This approach does not currently carry over to feature phones: the operating system of a feature phone, such as a real-time operating system (RTOS), is not compatible with smartphone application code, and a feature phone has no Java layer.

A feature phone is constrained by cost and hardware, and its performance is far lower than that of a smartphone. A video soundtrack function on a feature phone therefore has to rely on existing functional modules as much as possible to reduce software and hardware overhead. The problem to be solved is how to realize the video soundtrack function on a feature phone while saving its software and hardware resources.

Disclosure of Invention

Embodiments of the application provide a video soundtrack method and device for a feature phone, which can realize the video soundtrack function on the feature phone while saving its software and hardware resources.

In a first aspect, an embodiment of the present application provides a video soundtrack method for a feature phone, where the method includes:

acquiring a first audio file required for the soundtrack; transcoding the first audio file to obtain a transcoded second audio file;

acquiring a first video file to which the soundtrack is to be added;

decoding the second audio file to obtain a first audio data stream, and encoding the first video file to obtain a video data stream;

and synthesizing the audio data stream and the video data stream to obtain a second video file.

In a second aspect, an embodiment of the present application provides a video soundtrack apparatus for a feature phone, where the apparatus includes:

the device comprises an acquisition unit, a storage unit and a processing unit, wherein the acquisition unit is used for acquiring a first audio file required by the score;

the processing unit is used for transcoding the first audio file to obtain a transcoded second audio file;

the acquisition unit is also used for acquiring a first video file to be dubbed;

the processing unit is further configured to decode the second audio file to obtain a first audio data stream, and encode the first video file to obtain a video data stream;

the processing unit is further configured to synthesize the audio data stream and the video data stream to obtain a second video file.

In a third aspect, an embodiment of the present application provides a terminal device, including a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, and the program includes instructions for executing steps in the method according to the first aspect of the embodiment of the present application.

In a fourth aspect, the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, where the computer program causes a computer to perform some or all of the steps described in the method according to the first aspect of the present application.

In a fifth aspect, the present application provides a computer program product, where the computer program product includes a non-transitory computer-readable storage medium storing a computer program, where the computer program is operable to cause a computer to perform some or all of the steps described in the method according to the first aspect of the present application. The computer program product may be a software installation package.

It can be seen that, in the embodiment of the present application, a first audio file required for the soundtrack is acquired; the first audio file is transcoded to obtain a transcoded second audio file; a first video file to which the soundtrack is to be added is acquired; the second audio file is decoded to obtain a first audio data stream, and the first video file is encoded to obtain a video data stream; and the audio data stream and the video data stream are synthesized to obtain a second video file. The video soundtrack function can thereby be realized on the feature phone while saving its software and hardware resources.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the embodiments or the prior art descriptions will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings can be obtained by those skilled in the art without creative efforts.

Fig. 1A is a schematic flowchart of a video soundtrack method for a feature phone according to an embodiment of the present application;

Fig. 1B is a schematic diagram illustrating the writing and reading of second audio data in DMA buffers according to an embodiment of the present application;

Fig. 2 is a schematic flowchart of another video soundtrack method for a feature phone according to an embodiment of the present application;

Fig. 3 is a schematic structural diagram of a terminal device according to an embodiment of the present application;

Fig. 4 is a schematic structural diagram of a video soundtrack apparatus for a feature phone according to an embodiment of the present application.

Detailed Description

The terminology used in the description of the embodiments section of the present application is for the purpose of describing particular embodiments of the present application only and is not intended to be limiting of the present application. The terms "first," "second," "third," and "fourth," etc. in the description and claims of this application and in the accompanying drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions.

The terminal device in the embodiments of the application is a feature phone, whose operating system may be, for example, an RTOS. A smartphone, by contrast, generally runs an operating system such as Android or iOS and is compatible with a wide range of application programs, whereas the operating system of a feature phone has no Java layer and cannot run such application programs. In addition, a feature phone is constrained by cost and hardware, and its performance is far lower than that of a smartphone, so a video soundtrack function on a feature phone has to rely on existing functional modules as much as possible to reduce software and hardware overhead.

Specifically, the terminal device in the embodiments of the present application has a wireless communication function and may be deployed on land (indoors or outdoors; handheld, wearable, or vehicle-mounted), on the water surface (for example, on a ship), or in the air (for example, on an airplane, balloon, or satellite). The terminal device may be a mobile phone, a tablet computer, a computer with a wireless transceiving function, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in remote medical care, a wireless terminal in a smart grid, a wireless terminal in a smart home, and the like. The terminal device may also be a handheld device with wireless communication capabilities, a vehicle-mounted device, a wearable device, a computer device, or another processing device connected to a wireless modem.

Fig. 1A is a schematic flowchart of a video soundtrack method for a feature phone according to an embodiment of the present application. The method is applied to a terminal device that is a feature phone, and includes the following steps:

101. Acquiring a first audio file required for the soundtrack, and transcoding the first audio file to obtain a transcoded second audio file.

The audio format of the first audio file may be any one of the following: mp3, wav, midi, or amr. The first audio file may be an audio file recorded by a user through the terminal device, an audio file downloaded from a network, or an audio file transmitted from another device, which is not limited in the embodiments of the present application.

The second audio file may be a WAV-format audio file containing pulse code modulation (PCM) data.

In a specific implementation, the first audio file may be transcoded; specifically, an audio file in a format such as mp3, wav, midi, or amr may be converted into a WAV-format audio file containing PCM data. This conversion may be implemented in the multimedia interface (MMI) layer of the feature phone's operating system. The MMI layer may be implemented in C code and is therefore compatible with the feature phone, which overcomes the feature phone's lack of a Java layer.

Optionally, in step 101, the transcoding the first audio file to obtain a transcoded second audio file may include:

11. separating the first audio file to obtain a separated audio file;

12. decoding the separated audio file to obtain first audio data of a first preset type;

13. and adding a file header in a first audio format to the first audio data to obtain the second audio file.

The first preset type may be the PCM type, and the file header of the first audio format may be a WAV file header.

In a specific implementation, the first audio file may be separated to obtain a separated audio file, and the separated audio file may be decoded to obtain first audio data of the first preset type, for example PCM data. Finally, a file header of the first audio format, for example a WAV file header, may be added to the PCM first audio data to obtain the second audio file, for example a WAV-format audio file containing PCM data. The video soundtrack function can then be realized on the feature phone using the second audio file, which may be stored temporarily in the feature phone's storage space as an intermediate file.
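As a minimal sketch of sub-step 13, the following Python snippet prepends a WAV header to raw PCM samples using the standard-library `wave` module. Python is used only for illustration (the text above describes C code in the MMI layer). The function name `pcm_to_wav_bytes`, the 8 kHz mono 16-bit parameters, and the silent sample data are assumptions, and sub-steps 11 and 12 are assumed to have already produced the PCM bytes.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, channels: int = 1,
                     sample_width: int = 2, sample_rate: int = 8000) -> bytes:
    """Prepend a WAV (RIFF) header to raw PCM samples and return the file bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)   # bytes per sample (2 = 16-bit)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)             # header sizes are filled in on close
    return buf.getvalue()


# Assumed decoder output: 100 frames of 16-bit mono silence.
pcm = b"\x00\x00" * 100
wav_bytes = pcm_to_wav_bytes(pcm)
```

For PCM data the header written by `wave` is 44 bytes, so the intermediate file is the PCM payload plus a fixed-size prefix that is cheap to add and to strip again later.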

102. Acquiring a first video file to which the soundtrack is to be added.

The first video file may be acquired by recording it through a video application program, by receiving it from an external device, by downloading it from a network, or by loading a locally stored first video file.

Optionally, in the step 102, the obtaining of the first video file to be dubbed may include the following steps:

recording first video data;

and coding the first video data to obtain a coded first video file.

In the embodiment of the application, the recording of the first video file and the transcoding of the first audio file can be performed simultaneously.

103. Decoding the second audio file to obtain a first audio data stream.

In a specific implementation, video-recording encoding is not needed; instead, the second audio file is decoded into the first audio data stream using the existing functions of the feature phone, and the first video file is encoded to obtain the video data stream, so that the video soundtrack function can be realized on the feature phone.

Optionally, in the step 103, the decoding the second audio file to obtain the first audio data stream may include:

removing the file header in the second audio file to obtain decoded first audio data;

and synthesizing the decoded first audio data into the first audio data stream.

In a specific implementation, the WAV file header may be removed from the WAV-format audio file to obtain the PCM first audio data, and the decoded PCM first audio data is then synthesized into the first audio data stream.
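The header removal described above can be sketched as follows, again in Python for illustration only. The standard-library `wave` module parses and skips the WAV header, and the PCM payload is emitted as a sequence of fixed-size chunks that stands in for the first audio data stream. The function name `wav_to_pcm_stream`, the chunk size of 160 frames, and the in-memory test file are assumptions.

```python
import io
import wave
from typing import Iterator


def wav_to_pcm_stream(wav_bytes: bytes, frames_per_chunk: int = 160) -> Iterator[bytes]:
    """Skip the WAV header and yield the PCM payload in fixed-size chunks."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        while True:
            chunk = wav.readframes(frames_per_chunk)
            if not chunk:            # empty bytes means end of audio data
                break
            yield chunk


# Build a small WAV file in memory to stand in for the second audio file.
pcm = bytes(range(200)) * 4          # 800 bytes = 400 frames of 16-bit mono
source = io.BytesIO()
with wave.open(source, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(8000)
    out.writeframes(pcm)

chunks = list(wav_to_pcm_stream(source.getvalue()))
```

Streaming the payload in chunks rather than loading it whole is one way to keep memory use low on a constrained device; the source text does not specify a chunk size.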

Optionally, if the video duration of the first video file is longer than the audio duration of the second audio file, the second audio file may be looped: once decoding of the second audio file finishes, the second audio file is decoded again from the beginning. Alternatively, the video playing may be paused.

104. Synthesizing the first audio data stream and the first video file to obtain a second video file.
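The looping option can be illustrated with a small Python generator that keeps emitting PCM chunks until the amount of audio needed for the video has been produced, wrapping back to the start of the clip whenever its end is reached. This is a sketch under assumptions: the name `looped_pcm_chunks` and its parameters are hypothetical, and the required byte count is assumed to have been derived from the video duration beforehand.

```python
from typing import Iterator


def looped_pcm_chunks(pcm: bytes, total_bytes: int, chunk_bytes: int) -> Iterator[bytes]:
    """Yield chunks of `pcm`, restarting from the beginning when it runs out."""
    if not pcm or chunk_bytes <= 0:
        raise ValueError("pcm must be non-empty and chunk_bytes positive")
    pos = 0      # read position inside the clip
    sent = 0     # bytes emitted so far
    while sent < total_bytes:
        n = min(chunk_bytes, total_bytes - sent, len(pcm) - pos)
        yield pcm[pos:pos + n]
        pos = (pos + n) % len(pcm)   # wrap to the start when the clip ends
        sent += n


pcm = bytes(range(10))               # stand-in for a short decoded clip
looped = b"".join(looped_pcm_chunks(pcm, total_bytes=25, chunk_bytes=4))
```

Here a 10-byte clip is stretched to 25 bytes, that is, two full passes followed by the first half of a third.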

In the embodiment of the application, the first audio data stream and the first video file can be synthesized to obtain the second video file, so that the audio data and the video data are stored and transmitted as one file.
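The source text does not specify a container format for the second video file. As a hedged illustration of what synthesizing the two streams into one file involves, the sketch below interleaves timestamped audio and video packets into a single ordered sequence. The `Packet` type, the `mux` function, and the 20 ms and 40 ms packet spacings are all assumptions.

```python
import heapq
from typing import Iterable, List, NamedTuple


class Packet(NamedTuple):
    pts_ms: int      # presentation timestamp in milliseconds
    kind: str        # "A" for audio, "V" for video
    payload: bytes


def mux(video: Iterable[Packet], audio: Iterable[Packet]) -> List[Packet]:
    """Interleave two timestamp-ordered packet sequences into one."""
    # heapq.merge keeps ties in argument order, so a video packet precedes
    # an audio packet that carries the same timestamp.
    return list(heapq.merge(video, audio, key=lambda p: p.pts_ms))


video = [Packet(t, "V", b"frame") for t in (0, 40, 80)]
audio = [Packet(t, "A", b"pcm") for t in (0, 20, 40, 60, 80)]
muxed = mux(video, audio)
```

Keeping packets in timestamp order means a player can later read the file front to back and find the audio and video for the same instant close together.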

Optionally, after the step 104, the following steps may be further included:

105. decomposing the second video file to obtain decomposed second audio data and first video data;

106. writing the second audio data into a Direct Memory Access (DMA) buffer;

107. reading second audio data of the DMA buffer area, and performing coding and decoding processing to obtain a second audio data stream after the coding and decoding processing;

108. and synchronously playing the first video data and the second audio data stream after the coding and decoding processing.

In this embodiment of the present application, the second video file may also be played. Specifically, the second video file may be decomposed to obtain second audio data and first video data. The second audio data may be written into the DMA buffer, then read from the DMA buffer and subjected to codec processing to obtain a codec-processed second audio data stream, which is transmitted to a playback hardware device such as a speaker, a headset, or an earphone. Meanwhile, the first video data is transmitted to the display screen and played on the same timeline.

Because the second audio data is written directly into the DMA buffer without first being written into a track buffer, the amount of data carried per DMA transfer increases and interrupts during DMA transfer are greatly reduced, so the terminal device can play music for a longer time under the same conditions.
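The playback path of steps 105 to 108 can be sketched as a decomposition followed by timestamp-driven presentation. In the Python sketch below, the audio packets act as the clock and each video frame is shown once the audio position has reached its timestamp. The `Packet` type, the `demux` and `playback_timeline` functions, and the idea of using audio as the master clock are assumptions; the DMA and codec stages are omitted.

```python
from typing import List, NamedTuple, Tuple


class Packet(NamedTuple):
    pts_ms: int      # presentation timestamp in milliseconds
    kind: str        # "A" for audio, "V" for video
    payload: bytes


def demux(packets: List[Packet]) -> Tuple[List[Packet], List[Packet]]:
    """Split a combined packet list into (audio, video) lists."""
    audio = [p for p in packets if p.kind == "A"]
    video = [p for p in packets if p.kind == "V"]
    return audio, video


def playback_timeline(packets: List[Packet]) -> List[Tuple[int, str]]:
    """Return (time, action) events with video following the audio clock."""
    audio, video = demux(packets)
    events: List[Tuple[int, str]] = []
    v = 0
    for a in audio:
        # Show every video frame that is due at this audio position.
        while v < len(video) and video[v].pts_ms <= a.pts_ms:
            events.append((a.pts_ms, "show " + video[v].payload.decode()))
            v += 1
        events.append((a.pts_ms, "play " + a.payload.decode()))
    return events


stream = [Packet(0, "V", b"v0"), Packet(0, "A", b"a0"), Packet(20, "A", b"a1"),
          Packet(40, "V", b"v1"), Packet(40, "A", b"a2")]
timeline = playback_timeline(stream)
```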

Optionally, the second audio data includes a plurality of audio data frame sets, each audio data frame set includes at least one frame of audio data, and there are a plurality of DMA buffers. In step 106, writing the second audio data into the DMA buffer includes:

61. writing the plurality of audio data frame sets into the plurality of DMA buffers, where the plurality of DMA buffers include a first DMA buffer and a second DMA buffer, writing a second audio data frame set into the second DMA buffer is performed simultaneously with reading a first audio data frame set from the first DMA buffer, and the first audio data frame set and the second audio data frame set are any two different audio data frame sets among the plurality of audio data frame sets.

In this embodiment of the present application, the second audio data may be written into a plurality of different DMA buffers. For example, as shown in Fig. 1B, which illustrates the writing and reading of second audio data in the DMA buffers, at least one frame of audio data of a first audio data frame set may be written into the first DMA buffer; after that write completes, at least one frame of audio data of a second audio data frame set is written into the second DMA buffer. While the second DMA buffer is being written, the audio data of the first audio data frame set may be read from the first DMA buffer at the same time and subjected to codec processing to obtain a codec-processed audio data stream. This improves the efficiency with which the terminal device processes audio data.
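The two-buffer scheme resembles what is commonly called ping-pong (double) buffering. The Python sketch below models it with two threads and a pair of semaphores per buffer, so that the writer can fill one buffer while the reader drains the other. It is an illustration under assumptions: real DMA hardware and interrupt handling are replaced by threads, and the name `play_with_two_buffers` is hypothetical.

```python
import threading
from typing import List, Optional


def play_with_two_buffers(frame_sets: List[bytes]) -> List[bytes]:
    """Pass frame sets from a writer to a reader through two alternating buffers."""
    buffers: List[Optional[bytes]] = [None, None]
    empty = [threading.Semaphore(1), threading.Semaphore(1)]  # buffer may be written
    full = [threading.Semaphore(0), threading.Semaphore(0)]   # buffer may be read
    consumed: List[bytes] = []

    def writer() -> None:
        for i, frame_set in enumerate(frame_sets):
            b = i % 2
            empty[b].acquire()        # wait until buffer b has been read out
            buffers[b] = frame_set
            full[b].release()         # hand buffer b over to the reader

    def reader() -> None:
        for i in range(len(frame_sets)):
            b = i % 2
            full[b].acquire()         # wait until buffer b has been filled
            consumed.append(buffers[b])
            empty[b].release()        # buffer b can be overwritten again

    threads = [threading.Thread(target=writer), threading.Thread(target=reader)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed


result = play_with_two_buffers([b"set0", b"set1", b"set2", b"set3", b"set4"])
```

Because each buffer has its own pair of semaphores, the writer never overwrites a frame set that has not been read, and the reader receives the frame sets in their original order.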

Optionally, in this embodiment of the present application, the following steps may also be included:

receiving a pause playing instruction of the second video file;

and stopping writing the audio data frame set into the DMA buffer area according to the pause playing instruction, and stopping reading the audio data from the DMA buffer area.

In the embodiment of the application, the user may pause playback of the second video file. At that point, both the writing of audio data frame sets into the DMA buffer and the reading of audio data from the DMA buffer can be stopped. Because this scheme does not first write the second audio data into a track buffer and then copy it from the track buffer into the DMA buffer, the copy operations from the track buffer to the DMA buffer are avoided when DMA is interrupted by events such as pausing the video, which reduces the power consumption of the terminal device and improves audio data processing efficiency.

Fig. 2 is a schematic flowchart of another video soundtrack method for a feature phone according to an embodiment of the present application. The method is applied to a terminal device that is a feature phone, and includes the following steps:

201. Acquiring a first audio file required for the soundtrack.
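The pause behaviour can be modelled with a small state machine in which a single flag gates both the write into the DMA buffer and the read out of it. The Python sketch below is illustrative only: the class `DmaPlayback`, its `tick` method, and the `resume` method are assumptions (the source text describes pausing but not resuming), and a queue stands in for the DMA buffer.

```python
from collections import deque
from typing import Deque, Iterable, List


class DmaPlayback:
    """Toy model: one tick reads a frame set from the buffer, then writes the next."""

    def __init__(self, frame_sets: Iterable[bytes]) -> None:
        self.pending: Deque[bytes] = deque(frame_sets)  # not yet written
        self.dma: Deque[bytes] = deque()                # stand-in for the DMA buffer
        self.played: List[bytes] = []
        self.paused = False

    def pause(self) -> None:
        self.paused = True

    def resume(self) -> None:
        self.paused = False

    def tick(self) -> None:
        if self.paused:
            return                    # neither read nor write while paused
        if self.dma:
            self.played.append(self.dma.popleft())
        if self.pending:
            self.dma.append(self.pending.popleft())


player = DmaPlayback([b"a", b"b", b"c", b"d"])
player.tick()
player.tick()
player.pause()
player.tick()                         # ignored while paused
player.tick()                         # ignored while paused
snapshot = (list(player.played), list(player.dma), list(player.pending))
player.resume()
for _ in range(3):
    player.tick()
```

While paused, the data already in the buffer stays where it is and nothing is copied, which mirrors the point made above that no track-buffer-to-DMA copy is needed around a pause.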

The audio format of the first audio file may be any one of the following: mp3, wav, midi, or amr, and the like.

In particular implementations, the user may select a first audio file required for the soundtrack.

202. And separating the first audio file to obtain a separated audio file.

203. And decoding the separated audio file to obtain first audio data of a first preset type.

Wherein the first preset type may be a pcm type.

204. And adding a file header in a first audio format to the first audio data to obtain the second audio file.

The file header of the first audio format may be a wav file header, and the second audio file may be a pcm-type wav format audio file.

205. Acquiring a first video file to which the soundtrack is to be added.

In specific implementation, a user can record a video through a terminal device to obtain a first video file.

206. And removing the file header in the second audio file to obtain the decoded first audio data.

In a specific implementation, the wav file header in the pcm-type wav format audio file may be removed, so as to obtain the pcm-type first audio data.

207. And synthesizing the decoded first audio data into the first audio data stream.

208. And synthesizing the first audio data stream and the first video file to obtain a second video file.

209. And decomposing the second video file to obtain decomposed second audio data and first video data.

210. Writing the second audio data to a Direct Memory Access (DMA) buffer.

211. And reading the second audio data of the DMA buffer area, and performing coding and decoding processing to obtain a second audio data stream after the coding and decoding processing.

212. And synchronously playing the first video data and the second audio data stream after the coding and decoding processing.

For the specific implementation process of steps 201 to 212, reference may be made to the corresponding description of steps 101 to 108, which is not repeated here.

It can be seen that, in the embodiment of the present application, a first audio file required for the soundtrack is acquired; the first audio file is separated to obtain a separated audio file; the separated audio file is decoded to obtain first audio data of a first preset type; a file header of a first audio format is added to the first audio data to obtain a second audio file; a first video file to which the soundtrack is to be added is acquired; the file header of the second audio file is removed to obtain decoded first audio data; the decoded first audio data is synthesized into a first audio data stream; the first audio data stream and the first video file are synthesized to obtain a second video file; the second video file is decomposed to obtain second audio data and first video data; the second audio data is written into a direct memory access (DMA) buffer; the second audio data is read from the DMA buffer and subjected to codec processing to obtain a second audio data stream; and the first video data and the codec-processed second audio data stream are played synchronously. In this way, the video soundtrack function can be realized on the feature phone while saving its software and hardware resources.

Referring to fig. 3, fig. 3 is a schematic structural diagram of a terminal device according to an embodiment of the present application, including: one or more processors, one or more memories, one or more communication interfaces, and one or more programs;

the one or more programs are stored in the memory and configured to be executed by the one or more processors;

the program includes instructions for performing the steps of:

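acquiring a first audio file required for the soundtrack; transcoding the first audio file to obtain a transcoded second audio file;

acquiring a first video file to be dubbed;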

decoding the second audio file to obtain a first audio data stream;

and synthesizing the first audio data stream and the first video file to obtain a second video file.

In an implementation manner of the present application, in the aspect of transcoding the first audio file to obtain a transcoded second audio file, the program includes instructions further configured to:

separating the first audio file to obtain a separated audio file;

decoding the separated audio file to obtain first audio data of a first preset type;

and adding a file header in a first audio format to the first audio data to obtain the second audio file.

In an implementation manner of the present application, in the aspect of decoding the second audio file to obtain the first audio data stream, the program includes instructions further configured to:

removing the file header in the second audio file to obtain decoded first audio data;

and synthesizing the decoded first audio data into the first audio data stream.

In an implementation manner of the present application, in the acquiring the first video file of the to-be-dubbed music, the program includes instructions for executing the following steps:

recording first video data;

and coding the first video data to obtain a coded first video file.

In an implementation manner of the present application, after the synthesizing the first audio data stream and the first video file to obtain a second video file, the program includes instructions further configured to:

decomposing the second video file to obtain decomposed second audio data and first video data;

writing the second audio data to a Direct Memory Access (DMA) buffer;

reading second audio data of the DMA buffer area, and performing coding and decoding processing to obtain a second audio data stream after the coding and decoding processing;

and synchronously playing the first video data and the second audio data stream after the coding and decoding processing.

In one implementation of the present application, the second audio data includes a plurality of audio data frame sets, the number of DMA buffers is plural, each of the audio data frame sets includes at least one frame of audio data, and the program includes instructions for performing the following steps in terms of writing the second audio data into a DMA buffer:

writing the multi-frame audio data into a plurality of DMA buffer areas, wherein the plurality of DMA buffer areas comprise a first DMA buffer area and a second DMA buffer area, writing a second audio data frame set into the second DMA buffer area and reading a first audio data frame set from the first DMA buffer area are carried out simultaneously, and the first audio data frame set and the second audio data frame set are any two different audio data frame sets in the plurality of audio data frame sets.

In an implementation of the present application, the program further comprises instructions for performing the steps of:

receiving a pause playing instruction of the second video file;

and stopping, according to the pause playing instruction, writing the audio data frame set into the DMA buffer and reading the audio data from the DMA buffer.

It can be seen that, in the embodiment of the present application, a first audio file required for the soundtrack is acquired; the first audio file is transcoded to obtain a transcoded second audio file; a first video file to which the soundtrack is to be added is acquired; the second audio file is decoded to obtain a first audio data stream, and the first video file is encoded to obtain a video data stream; and the audio data stream and the video data stream are synthesized to obtain a second video file. The video soundtrack function can thereby be realized on the feature phone while saving its software and hardware resources.

It should be noted that, for the specific implementation process of the present embodiment, reference may be made to the specific implementation process described in the above method embodiment, and a description thereof is omitted here.

Fig. 4 is a schematic structural diagram of a video soundtrack apparatus for a feature phone provided in an embodiment of the present application. The apparatus is applied to a terminal device and includes:

an acquisition unit 401 configured to acquire a first audio file required for a soundtrack;

the processing unit 402 is configured to transcode the first audio file to obtain a transcoded second audio file;

the obtaining unit 401 is further configured to obtain a first video file to be dubbed music;

the processing unit 402 is further configured to decode the second audio file to obtain a first audio data stream, and encode the first video file to obtain a video data stream;

the processing unit 402 is further configured to synthesize the audio data stream and the video data stream to obtain a second video file.

In an implementation manner of the present application, in the aspect of transcoding the first audio file to obtain a transcoded second audio file, the processing unit 402 is specifically configured to:

separating the first audio file to obtain a separated audio file;

decoding the separated audio file to obtain first audio data of a first preset type;

and adding a file header in a first audio format to the first audio data to obtain the second audio file.

In an implementation manner of the present application, in terms of decoding the second audio file to obtain the first audio data stream, the processing unit 402 is specifically configured to:

removing the file header in the second audio file to obtain decoded first audio data;

and synthesizing the decoded first audio data into the first audio data stream.

In an implementation manner of the present application, in acquiring the first video file of the to-be-dubbed music, the acquiring unit 401 is specifically configured to:

recording first video data;

and coding the first video data to obtain a coded first video file.

In an implementation manner of this application, after the synthesizing the first audio data stream and the first video file to obtain a second video file, the processing unit 402 is further configured to:

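decomposing the second video file to obtain decomposed second audio data and first video data;

writing the second audio data to a Direct Memory Access (DMA) buffer;

reading second audio data of the DMA buffer area, and performing coding and decoding processing to obtain a second audio data stream after the coding and decoding processing;

and synchronously playing the first video data and the second audio data stream after the coding and decoding processing.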

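In an implementation manner of the present application, the second audio data includes a plurality of audio data frame sets, the number of DMA buffers is multiple, each of the audio data frame sets includes at least one frame of audio data, and in terms of writing the second audio data into the DMA buffers, the processing unit 402 is specifically configured to:

write the plurality of audio data frame sets into the plurality of DMA buffers, where the plurality of DMA buffers include a first DMA buffer and a second DMA buffer, writing a second audio data frame set into the second DMA buffer is performed simultaneously with reading a first audio data frame set from the first DMA buffer, and the first audio data frame set and the second audio data frame set are any two different audio data frame sets among the plurality of audio data frame sets.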

In an implementation manner of the present application, the processing unit 402 is further configured to:

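receiving a pause playing instruction of the second video file;

and stopping, according to the pause playing instruction, writing the audio data frame set into the DMA buffer and reading the audio data from the DMA buffer.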

It should be noted that the acquiring unit 401 and the processing unit 402 in the apparatus may be implemented by a processor. The specific implementation steps and other implementation steps in the embodiments of the present application may refer to the specific implementation processes described in the above method embodiments, and are not described herein.

Embodiments of the present application further provide a computer storage medium, where the computer storage medium stores a computer program for causing a computer to execute part or all of the steps of any one of the methods described in the above method embodiments, and the computer includes a user equipment.

Embodiments of the present application also provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps of any of the methods as described in the above method embodiments. The computer program product may be a software installation package, the computer comprising user equipment.

It should be noted that for simplicity of description, the above-mentioned embodiments of the method are described as a series of acts, but those skilled in the art should understand that the present application is not limited by the described order of acts, as some steps may be performed in other orders or simultaneously according to the present application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.

In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in practice. For instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical or take other forms.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed across a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may also be implemented in the form of a software functional unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable memory. Based on such understanding, the technical solution of the present application in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned memory includes a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disc, and other media capable of storing program code.

Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing associated hardware. The program may be stored in a computer-readable memory, which may include flash memory disks, read-only memories (ROMs), random access memories (RAMs), magnetic disks, optical discs, and the like.

The embodiments of the present application have been described in detail above to illustrate the principles and implementations of the present application; the description of the embodiments is provided only to help understand the method and core concept of the present application. Meanwhile, a person skilled in the art may, according to the idea of the present application, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims

1. A video dubbing method for a function machine, characterized in that the method is applied to the function machine, and an operating system of the function machine has a multimedia interface MMI layer but does not have a Java layer; the method comprises the following steps:

acquiring a first audio file required by the soundtrack;

separating the first audio file to obtain a separated audio file;

decoding the separated audio file to obtain first audio data of a Pulse Code Modulation (PCM) type;

adding a WAV format file header to the PCM type first audio data to obtain a PCM type WAV format second audio file;

acquiring a first video file to be dubbed;

decoding the second audio file in the WAV format of the PCM type to obtain a first audio data stream;

2. The method according to claim 1, wherein said decoding said second audio file in WAV format of PCM type to obtain a first audio data stream comprises:

removing a file header in the second audio file in the WAV format of the PCM type to obtain decoded first audio data;

and synthesizing the decoded first audio data into the first audio data stream.
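By way of non-limiting illustration, the steps of adding a WAV format file header to PCM data and later removing that header can be sketched as follows. This is a minimal Python sketch for clarity only, not the on-device implementation; the function names `add_wav_header` and `strip_wav_header`, and the default parameters (8000 Hz, mono, 16-bit), are assumptions that do not appear in the application. The sketch assumes the canonical 44-byte RIFF/WAVE header for uncompressed PCM.

```python
import struct


def add_wav_header(pcm: bytes, sample_rate: int = 8000,
                   channels: int = 1, bits_per_sample: int = 16) -> bytes:
    """Prefix raw PCM bytes with a canonical 44-byte RIFF/WAVE header.

    The default sample rate, channel count and bit depth are assumed
    values chosen for illustration only.
    """
    block_align = channels * bits_per_sample // 8   # bytes per sample frame
    byte_rate = sample_rate * block_align           # bytes per second
    header = struct.pack(
        "<4sI4s4sIHHIIHH4sI",
        b"RIFF", 36 + len(pcm), b"WAVE",            # RIFF chunk descriptor
        b"fmt ", 16, 1, channels, sample_rate,      # fmt chunk, format 1 = PCM
        byte_rate, block_align, bits_per_sample,
        b"data", len(pcm),                          # data chunk header
    )
    return header + pcm


def strip_wav_header(wav: bytes) -> bytes:
    """Return only the PCM payload of a RIFF/WAVE byte string."""
    if wav[:4] != b"RIFF" or wav[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(wav):
        chunk_id, size = struct.unpack_from("<4sI", wav, pos)
        pos += 8
        if chunk_id == b"data":
            return wav[pos:pos + size]
        pos += size + (size & 1)                    # chunks are padded to even length
    raise ValueError("no data chunk found")
```

Because the payload is already uncompressed PCM, "decoding" in this sketch amounts to locating the data chunk and discarding everything before it, which keeps the processing cost low on a resource-constrained device.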

3. The method of claim 2, wherein said acquiring a first video file to be dubbed comprises:

recording first video data;

and coding the first video data to obtain a coded first video file.

4. The method of claim 3, wherein after said combining said first audio data stream and said first video file to obtain a second video file, said method further comprises:

writing the second audio data to a Direct Memory Access (DMA) buffer;

5. The method of claim 4, wherein the second audio data comprises a plurality of sets of audio data frames, wherein the number of DMA buffers is plural, wherein each set of audio data frames comprises at least one frame of audio data, and wherein writing the second audio data into a DMA buffer comprises:

writing multi-frame audio data into a plurality of DMA buffer areas, wherein the plurality of DMA buffer areas comprise a first DMA buffer area and a second DMA buffer area, writing a second audio data frame set into the second DMA buffer area and reading a first audio data frame set from the first DMA buffer area are carried out simultaneously, and the first audio data frame set and the second audio data frame set are any two different audio data frame sets in the plurality of audio data frame sets.
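By way of non-limiting illustration, the alternating use of two buffers, in which one buffer is being filled while the other is being read, can be simulated as follows. This is a minimal Python sketch in which a thread stands in for the writing side and a plain list stands in for each DMA buffer; real DMA hardware is not modelled. The function name `play_with_ping_pong` and the queue-based hand-off are assumptions made for illustration.

```python
import queue
import threading


def play_with_ping_pong(frame_sets, num_buffers=2):
    """Pass audio frame sets through alternating buffers and return
    the frames in the order the reading side consumed them."""
    buffers = [None] * num_buffers   # stand-ins for the DMA buffers
    free = queue.Queue()             # indices of buffers that may be written
    filled = queue.Queue()           # indices of buffers ready to be read
    for i in range(num_buffers):
        free.put(i)

    def writer():
        for frame_set in frame_sets:
            idx = free.get()                 # wait for a released buffer
            buffers[idx] = list(frame_set)   # write one audio frame set
            filled.put(idx)
        filled.put(None)                     # end-of-stream marker

    t = threading.Thread(target=writer)
    t.start()

    played = []
    while True:
        idx = filled.get()
        if idx is None:
            break
        played.extend(buffers[idx])          # stands in for reading the buffer
        free.put(idx)                        # release the buffer for rewriting
    t.join()
    return played
```

While the reading side holds one buffer, the writing side is free to fill the other, so writing one frame set and reading another can overlap; the FIFO hand-off keeps the playback order unchanged.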

6. The method of claim 5, further comprising:

receiving a pause playing instruction of the second video file;

and stopping the operation of writing the audio data frame set into the DMA buffer area according to the pause playing instruction and stopping the operation of reading the audio data from the DMA buffer area.
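By way of non-limiting illustration, the effect of a pause instruction on both the writing side and the reading side can be sketched with a small step-driven simulation. This is a Python sketch only; the class name `PausablePlayer`, the `tick` scheduling step, and the `resume` method are hypothetical and are not taken from the application.

```python
class PausablePlayer:
    """Step-driven simulation of two alternating buffers with pause."""

    def __init__(self, frame_sets):
        self.pending = [list(fs) for fs in frame_sets]  # not yet written
        self.buffers = [None, None]                     # stand-ins for DMA buffers
        self.write_idx = 0
        self.read_idx = 0
        self.paused = False
        self.played = []                                # frames already read out

    def pause(self):
        """Handle a pause playing instruction."""
        self.paused = True

    def resume(self):
        """Hypothetical counterpart to pause, added for the demonstration."""
        self.paused = False

    def tick(self):
        """Run one scheduling step; return True if any work was done."""
        if self.paused:
            return False                    # neither write nor read while paused
        progressed = False
        if self.buffers[self.read_idx] is not None:
            # Read side: drain the buffer that was filled earlier.
            self.played.extend(self.buffers[self.read_idx])
            self.buffers[self.read_idx] = None
            self.read_idx ^= 1
            progressed = True
        if self.pending and self.buffers[self.write_idx] is None:
            # Write side: fill the other, currently empty buffer.
            self.buffers[self.write_idx] = self.pending.pop(0)
            self.write_idx ^= 1
            progressed = True
        return progressed
```

In this sketch a single flag gates both operations, so after `pause()` the buffers keep their contents and nothing is read out or written until playback continues.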

7. A video dubbing device for a function machine, characterized in that the device is applied to the function machine, and an operating system of the function machine has a multimedia interface MMI layer but does not have a Java layer; the device comprises:

the processing unit is used for separating the first audio file to obtain a separated audio file;

the processing unit is further configured to decode the separated audio file to obtain first audio data of a Pulse Code Modulation (PCM) type;

the processing unit is further configured to add a file header in the WAV format to the PCM-type first audio data to obtain a PCM-type second audio file in the WAV format;

8. A terminal device, characterized in that the terminal device comprises a memory, a communication interface, and one or more programs stored in the memory and configured to be executed by a processor, the programs comprising instructions for performing the steps in the method of any of claims 1-6.

9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program, which is executed by a processor to implement the method according to any one of claims 1-6.