CN112019917A - Audio data extraction method, device, equipment and storage medium

Info

Publication number: CN112019917A
Application number: CN202010735021.3A
Authority: CN
Inventors: 林婷婷; 肖龙源; 李稀敏; 邓仁超; 刘晓葳
Original assignee: Xiamen Kuaishangtong Technology Co Ltd
Current assignee: Xiamen Kuaishangtong Technology Co Ltd
Priority date: 2020-07-28
Filing date: 2020-07-28
Publication date: 2020-12-01

Abstract

The invention discloses a method, an apparatus, and a device for extracting audio data based on television program video. The method comprises the following steps: acquiring video data, wherein the video data comprises video files acquired from television programs, and generating a video list; separating the video files in the video list into audio files by using an open-source audio and video multimedia processing framework technology; and storing the separated audio files in a server. In this way, audio data can be extracted from television program video automatically, without manual work, which reduces labor costs.

Description

Audio data extraction method, device, equipment and storage medium

Technical Field

The invention relates to the technical field of audio processing, in particular to an audio data extraction method, device, equipment and storage medium based on television program videos.

Background

Sound in television programs falls into three categories: speech, music, and sound effects. On its way to the audience along with the picture, the sound passes through a series of channels and processing stages, such as initial microphone pickup, production on the recording console, matrix switching and scheduling, delay processing, loudness control, modulation and broadcasting, and reception and demodulation, and finally yields high-definition, high-quality audio. Such audio is well suited to training and testing models for voiceprint recognition. However, existing schemes for extracting audio data from television program video generally rely on manual extraction, which is costly.

Disclosure of Invention

In view of the above, the present invention provides a method, an apparatus, and a device for extracting audio data based on television program video, which can extract audio data from television program video automatically, without manual work, and thereby reduce labor costs.

In order to achieve the above object, the present invention provides a method for extracting audio data based on a video of a television program, the method comprising: acquiring video data, wherein the video data comprises video files acquired from television programs and a video list is generated; separating the video files in the video list into audio files by utilizing an open source audio and video multimedia processing framework technology; and storing the separated audio file in a server.

Preferably, after the storing the separated audio file in the server, the method further includes:

and mounting an object storage service function in the server, and creating a storage space to perform real-time cloud storage on the audio file.

Preferably, the video files are obtained by downloading from the television programs in bulk.

In order to achieve the above object, the present invention further provides an apparatus for extracting audio data based on television program video, the apparatus comprising: an acquisition unit, configured to acquire video data, wherein the video data comprises video files acquired from television programs, and to generate a video list; a separation unit, configured to separate the video files in the video list into audio files by using an open-source audio and video multimedia processing framework technology;

and a storage unit, configured to store the separated audio files in the server.

Preferably, the apparatus for extracting audio data based on television program video further includes: and the mounting unit is used for mounting an object storage service function in the server, creating a storage space and performing real-time cloud storage on the audio file.

To achieve the above object, the present invention further provides an audio data extraction device based on tv program video, which includes a processor, a memory, and a computer program stored in the memory, wherein the computer program is executable by the processor to implement the audio data extraction method based on tv program video as described in any one of the above.

In order to achieve the above object, the present invention further provides a computer-readable storage medium, where the computer-readable storage medium includes a stored computer program, and when the computer program runs, the apparatus on which the computer-readable storage medium is located is controlled to execute the method for extracting audio data based on television program video according to any one of the above items.

As can be seen, according to the above scheme, video data comprising video files acquired from television programs can be acquired and a video list generated, the video files in the video list can be separated into audio files by using an open-source audio and video multimedia processing framework technology, and the separated audio files can be stored in a server. Audio data can thus be extracted from television program video automatically, without manual work, which reduces labor costs.

Furthermore, according to the above scheme, an object storage service function can be mounted in the server and a storage space created to perform real-time cloud storage of the audio files. This enables real-time cloud storage of the separated audio files and improves their security.

Further, according to the above scheme, the video files are obtained by downloading them from the television programs in batches, which enables batch downloading and shortens the time needed to download the video files.

Drawings

To illustrate the technical solution of the present invention more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of an audio data extraction method based on a television program video according to an embodiment of the present invention;

Fig. 2 is a schematic structural diagram of an audio data extraction apparatus based on television program video according to another embodiment of the present invention.

The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

The present invention will be described in detail with reference to the following examples.

Referring to fig. 1, a flowchart of an audio data extraction method based on a television program video according to an embodiment of the present invention is shown, where the method includes:

S1: acquiring video data, wherein the video data comprises video files acquired from television programs, and generating a video list.

In this embodiment, the video file may be a file in MPEG (Moving Picture Experts Group) format, a file in AVI (Audio Video Interleaved) format, a file in WMV (Windows Media Video, a Microsoft streaming media format) format, or the like, and the present invention is not limited thereto.

In this embodiment, step S1, in which a video, for example a video file in MP4 format, is acquired from a television program and a video list is generated, specifically includes:

A script is written to download MP4 videos in bulk. Given the input address, the script can generate a video list. The main script code is as follows:
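The script itself is not reproduced in this text. Purely as an illustration of the step, the following is a minimal Python sketch and not the script of the invention. It assumes the input addresses are direct URLs to the video files; the function names `build_video_list`, `local_name`, and `download_all` are hypothetical.

```python
import os
import urllib.request
from urllib.parse import urlparse


def build_video_list(urls, extension=".mp4"):
    """Return the de-duplicated URLs whose path ends with the given extension."""
    seen = set()
    video_list = []
    for url in urls:
        path = urlparse(url).path  # ignore any query string after the file name
        if path.lower().endswith(extension) and url not in seen:
            seen.add(url)
            video_list.append(url)
    return video_list


def local_name(url):
    """Derive a local file name from the last path segment of the URL."""
    return os.path.basename(urlparse(url).path)


def download_all(video_list, target_dir):
    """Download every listed video into target_dir and return the local paths."""
    os.makedirs(target_dir, exist_ok=True)
    saved = []
    for url in video_list:
        destination = os.path.join(target_dir, local_name(url))
        urllib.request.urlretrieve(url, destination)  # blocking, one file at a time
        saved.append(destination)
    return saved
```

Keeping list generation separate from downloading means the video list can be inspected or stored before any transfer starts.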

S2: separating the MP4 video files in the video list into audio files by using FFmpeg (Fast Forward MPEG, an open-source audio and video multimedia processing framework) technology.

In this embodiment, step S2 specifically includes:

The FFmpeg environment is installed and configured, and audio is extracted in batches from the video list obtained in step S1, for example the MP4 video list. The command executed is as follows:

XXX@ubuntu:~$ ffmpeg -i apple.mp4 -f mp3 -vn apple.mp3 (where apple.mp4 is the source video file and apple.mp3 is the separated audio file).

Explanation of the parameters in the above command:

-i stands for input, i.e. the input file;

-f stands for format, i.e. the output format;

-vn stands for "no video", i.e. the output contains no video.

View the size of the source video file and the size of the extracted audio file:

XXX@ubuntu:~$ ls -lrt

-rw-rw-r-- 1 XXX XXX 24118025 Jan 9 02:52 apple.mp4

-rw-rw-r-- 1 XXX XXX 3379969 Jan 9 02:54 apple.mp3
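The text shows the command for a single file only. One possible way to apply it to every entry of the video list is sketched below in Python. It assumes the `ffmpeg` executable is on the PATH; the function names `mp3_command` and `extract_all` are hypothetical, and the `run` parameter exists only so the loop can be exercised without invoking FFmpeg.

```python
import os
import subprocess


def mp3_command(video_path):
    """Build the ffmpeg argument list that mirrors the single-file command above."""
    audio_path = os.path.splitext(video_path)[0] + ".mp3"
    return ["ffmpeg", "-i", video_path, "-f", "mp3", "-vn", audio_path]


def extract_all(video_paths, run=subprocess.run):
    """Run one ffmpeg process per video and return the audio paths produced."""
    audio_paths = []
    for video_path in video_paths:
        command = mp3_command(video_path)
        run(command, check=True)  # with subprocess.run, a failing ffmpeg raises CalledProcessError
        audio_paths.append(command[-1])
    return audio_paths
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.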

S3: storing the separated audio file in the server.

In the foregoing embodiment, in a preferred embodiment of the present invention, after storing the separated audio file in the server, the method may further include:

An object storage service function is mounted in the server, and a storage space is created to perform real-time cloud storage of the audio file. This enables real-time cloud storage of the separated audio file and improves its security.

In this embodiment, step S3 specifically includes:

A server for storage is built and configured, an OSS (Object Storage Service) function is mounted in the server, and a bucket (storage space) is newly created, which may be named media, for example, and is associated with the directory /data/media in the server. The audio file obtained in step S2 is then stored in the cloud in real time. The main implementing code is as follows:
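The implementing code is not reproduced in this text. As a hedged illustration only, the Python sketch below assumes the bucket has already been mounted at the directory /data/media by some mounting tool, so that copying a file into that directory is what places it in object storage. The function name `store_audio` and the `require_mount` switch are hypothetical.

```python
import os
import shutil


def store_audio(audio_path, media_dir="/data/media", require_mount=True):
    """Copy one audio file into the directory associated with the bucket."""
    # Refuse to write if the directory is not a mount point; otherwise the
    # file would silently land on the local disk instead of in the bucket.
    if require_mount and not os.path.ismount(media_dir):
        raise RuntimeError(f"{media_dir} is not a mount point")
    destination = os.path.join(media_dir, os.path.basename(audio_path))
    shutil.copy2(audio_path, destination)
    return destination
```

The mount check is a design choice of this sketch: without it, an unmounted bucket would go unnoticed until the files were found missing from cloud storage.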

In the above embodiment, in a preferred embodiment of the present invention, the video file, for example the MP4 video file, can be obtained by downloading it from the television program in batches, which enables batch downloading and shortens the time needed to download the video files.

The audio in television programs has been professionally processed and is of high quality. Extracting audio files from television programs avoids the influence that a speaker's physical state and environmental noise have on audio data collected in a real environment, so audio features can be extracted more effectively.

Audio files are extracted from the video files of a large number of television programs. A video is composed of a video stream and an audio stream, sometimes with additional content such as subtitles and chapters, and is packaged into a container format such as MKV (Matroska multimedia container) or MP4. The audio stream of the video may be encoded as MP3, AAC (Advanced Audio Coding), or another encoding, and these encodings can be played directly by a mobile phone or an MP3 player. The audio does not need to be recompressed; it is simply extracted as is, so that no sound quality is lost.
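Extracting the audio "as is" corresponds to FFmpeg's stream copy mode (`-c:a copy`), whereas the command shown for step S2 requests MP3 output without a copy option and would therefore be expected to re-encode the audio. The Python sketch below illustrates the stream copy variant and is not the command used in the embodiment. It assumes `ffmpeg` and `ffprobe` are installed; the codec-to-extension table and the function names are hypothetical.

```python
import os
import subprocess

# Assumed mapping from the codec name reported by ffprobe to a file extension.
EXTENSION_BY_CODEC = {"aac": ".m4a", "mp3": ".mp3"}


def probe_audio_codec(video_path):
    """Ask ffprobe for the codec name of the first audio stream."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "a:0",
         "-show_entries", "stream=codec_name",
         "-of", "default=noprint_wrappers=1:nokey=1", video_path],
        capture_output=True, text=True, check=True)
    return result.stdout.strip()


def copy_command(video_path, codec):
    """Build an ffmpeg command that drops the video and copies the audio stream."""
    stem = os.path.splitext(video_path)[0]
    extension = EXTENSION_BY_CODEC.get(codec, ".mka")  # Matroska audio as a fallback
    return ["ffmpeg", "-i", video_path, "-vn", "-c:a", "copy", stem + extension]
```

A typical use would be `subprocess.run(copy_command(path, probe_audio_codec(path)), check=True)`, which leaves the encoded audio unchanged and only rewrites the container.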

FFmpeg is a very fast video and audio converter that can also capture from live audio/video sources. It can also convert between arbitrary sample rates and resize video on the fly using a high-quality polyphase filter. After the video of the television program is acquired, the audio data in the video is separated by using the FFmpeg technology.

Referring to fig. 2, a second embodiment of the present invention provides a schematic structural diagram of an apparatus for extracting audio data based on television program video, where the apparatus 1 includes:

an obtaining unit 11, configured to obtain video data, where the video data includes a video file obtained from a television program, and generate a video list;

a separating unit 12, configured to separate the video files in the video list into audio files by using an open-source audio-video multimedia processing framework technology;

and a storage unit 13 for storing the separated audio file in the server.

In the foregoing embodiment, in a preferred embodiment of the present invention, the apparatus for extracting audio data based on television program video may further include:

and the mounting unit (not marked in the figure) is used for mounting an object storage service function in the server and creating a storage space to perform real-time cloud storage on the audio file.

In the foregoing embodiment, in a preferred embodiment of the present invention, the video files are obtained by downloading from the television programs in batch.

The functions or operation steps implemented by each unit in the above-mentioned audio data extraction apparatus based on tv program video are substantially the same as those in the above-mentioned embodiment, and are not described herein again.
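One way to picture how the obtaining unit 11, the separating unit 12, and the storage unit 13 cooperate is as three callables chained in order. The Python sketch below is only an illustration of that structure; the class name `ExtractionApparatus` and its field names are hypothetical and do not appear in the embodiment.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ExtractionApparatus:
    acquire: Callable[[str], List[str]]  # obtaining unit 11: address -> video list
    separate: Callable[[str], str]       # separating unit 12: video file -> audio file
    store: Callable[[str], str]          # storage unit 13: audio file -> stored location

    def run(self, address: str) -> List[str]:
        """Acquire the video list, then separate and store each entry in turn."""
        video_list = self.acquire(address)
        return [self.store(self.separate(video)) for video in video_list]
```

Because each unit is passed in as a function, any one of them can be replaced, for example by a storage function that writes to a mounted bucket, without changing the others.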

An embodiment of the present invention further provides an apparatus for extracting audio data based on television program video, which includes a processor, a memory, and a computer program stored in the memory, where the computer program is executable by the processor to implement the method for extracting audio data based on television program video according to the above embodiment.

An embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium includes a stored computer program, where when the computer program runs, a device in which the computer-readable storage medium is located is controlled to execute the method for extracting audio data based on television program video according to the above embodiment.

Illustratively, the computer program may be divided into one or more units, which are stored in the memory and executed by the processor to accomplish the present invention. The one or more elements may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program in the audio data extraction device based on the television program video.

The audio data extraction device based on the television program video can comprise a processor and a memory. It will be understood by those skilled in the art that the schematic diagram is merely an example of a television program video based audio data extraction device, and does not constitute a limitation of a television program video based audio data extraction device, and may include more or fewer components than those shown, or combine some of the components, or different components, for example, the television program video based audio data extraction device may also include an input output device, a network access device, a bus, etc.

The processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc. The general-purpose processor may be a microprocessor or any conventional processor or the like. The processor is the control center of the television program video based audio data extraction apparatus and connects the various parts of the entire apparatus using various interfaces and lines.

The memory may be used to store the computer programs and/or modules, and the processor implements the various functions of the apparatus for extracting audio data based on television program video by running or executing the computer programs and/or modules stored in the memory and calling the data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data (such as audio data, a phonebook, etc.) created through use of the device, and the like. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage device.

If the integrated unit of the audio data extraction device based on television program video is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium; when the computer program is executed by a processor, the steps of the method embodiments are implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier signals, telecommunications signals, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, in accordance with legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.

It should be noted that the above-described device embodiments are merely illustrative, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiment of the apparatus provided by the present invention, the connection relationship between the modules indicates that there is a communication connection between them, and may be specifically implemented as one or more communication buses or signal lines. One of ordinary skill in the art can understand and implement it without inventive effort.

The above embodiments can be further combined with or substituted for one another. They describe only the preferred embodiments of the present invention and do not limit its concept and scope. Various changes and modifications made to the technical solution of the present invention by those skilled in the art without departing from the design idea of the present invention fall within the protection scope of the present invention.

Claims

1. A method for extracting audio data based on television program video, the method comprising:

acquiring video data, wherein the video data comprises video files acquired from television programs and a video list is generated;

separating the video files in the video list into audio files by utilizing an open source audio and video multimedia processing framework technology;

and storing the separated audio file in a server.

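2. The method for extracting audio data based on television program video as claimed in claim 1, further comprising, after storing the separated audio file in the server:

mounting an object storage service function in the server, and creating a storage space to perform real-time cloud storage on the audio file.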

3. The method of claim 1, wherein the video file is obtained by downloading from the television program in bulk.

4. An apparatus for extracting audio data based on video of a television program, the apparatus comprising:

an acquisition unit, configured to acquire video data, wherein acquiring the video data comprises acquiring video files from television programs and generating a video list;

a separation unit, configured to separate the video files in the video list into audio files by using an open-source audio and video multimedia processing framework technology;

and a storage unit, configured to store the separated audio file in the server.

5. The television program video-based audio data extraction device of claim 4, further comprising:

and the mounting unit is used for mounting an object storage service function in the server, creating a storage space and performing real-time cloud storage on the audio file.

6. The apparatus according to claim 4, wherein the video file is obtained by batch downloading from the TV program.

7. A television program video-based audio data extraction device, comprising a processor, a memory, and a computer program stored in the memory, the computer program being executable by the processor to implement the television program video-based audio data extraction method according to any one of claims 1 to 3.

8. A computer-readable storage medium, comprising a stored computer program, wherein when the computer program runs, a device on which the computer-readable storage medium is located is controlled to execute the method for extracting audio data based on television program video according to any one of claims 1 to 3.