CN116600168A - Multimedia data processing method and device, electronic equipment and storage medium - Google Patents

Multimedia data processing method and device, electronic equipment and storage medium

Info

Publication number
CN116600168A
Authority
CN
China
Prior art keywords
video
data
caption
multimedia
background music
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202310401667.1A
Other languages
Chinese (zh)
Inventor
陶继伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Sailing Weiye Technology Co ltd
Original Assignee
Shenzhen Sailing Weiye Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Sailing Weiye Technology Co ltd filed Critical Shenzhen Sailing Weiye Technology Co ltd
Priority to CN202310401667.1A priority Critical patent/CN116600168A/en
Publication of CN116600168A publication Critical patent/CN116600168A/en
Withdrawn legal-status Critical Current

Classifications

    • H04N 21/44008 Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N 21/432 Content retrieval operation from a local storage medium, e.g. hard-disk
    • H04N 21/4334 Recording operations
    • H04N 21/435 Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • H04N 21/4394 Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • H04N 21/44016 Processing of video elementary streams involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • H04N 21/47205 End-user interface for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
    • H04N 21/4884 Data services, e.g. news ticker, for displaying subtitles
    • G06V 30/153 Segmentation of character regions using recognition of characters or words
    • G06V 30/19173 Classification techniques (design or setup of recognition systems or techniques)

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The application is applicable to the technical field of data processing and provides a multimedia data processing method and device, an electronic device, and a storage medium. The method comprises the following steps: receiving a multimedia video and a multimedia tag input by a user; identifying the caption area in the multimedia video to obtain caption-free video data and caption text data; processing and identifying the audio analog signal in the multimedia video to obtain background music data; classifying and storing the caption-free video data, the caption text data, and the background music data according to the multimedia tag; receiving a video production instruction input by the user, wherein the video production instruction comprises a video to be produced and a video tag; and receiving a caption import instruction, invoking the corresponding caption text data according to the video tag for the user to select from, and importing the caption text data selected by the user into the video to be produced. The application automatically generates the caption-free video data, the caption text data, and the background music data, and is efficient and convenient.

Description

Multimedia data processing method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a method and apparatus for processing multimedia data, an electronic device, and a storage medium.
Background
Video enthusiasts frequently download notable video clips so that the video content, the lines, and even the background music can be reused and processed a second time. At present, users must process the video themselves to obtain a caption-free video and the caption text, and must search on their own for the background music, which is time-consuming and labor-intensive. Accordingly, there is a need for a multimedia data processing method, device, electronic device, and storage medium that solve the above problems.
Disclosure of Invention
In view of the defects in the prior art, the application aims to provide a multimedia data processing method and device, an electronic device, and a storage medium, so as to solve the problems described in the background section.
The application is realized as a multimedia data processing method comprising the following steps:
receiving a multimedia video and a multimedia tag input by a user;
identifying a caption area in the multimedia video to obtain caption-free video data and caption text data;
processing and identifying an audio analog signal in the multimedia video to obtain background music data;
classifying and storing the caption-free video data, caption text data and background music data according to the multimedia tag;
receiving a video making instruction input by a user, wherein the video making instruction comprises a video to be made and a video tag;
receiving a caption import instruction, invoking the corresponding caption text data according to the video tag for the user to select from, and importing the caption text data selected by the user into the video to be produced;
and receiving a music import instruction, invoking the corresponding background music data according to the video tag for the user to select from, and importing the background music data selected by the user into the video to be produced.
As a further scheme of the application: the step of identifying the caption area in the multimedia video to obtain the caption-free video data and the caption text data specifically comprises the following steps:
determining the caption area in the multimedia video, identifying the caption texts in the caption area, and sorting and integrating all the caption texts according to their play times to obtain the caption text data;
and cropping the lower edge of the multimedia video so that the caption area is cut off, obtaining the caption-free video data.
As a further scheme of the application: the step of processing and identifying the audio analog signal in the multimedia video to obtain the background music data specifically comprises the following steps:
converting an audio analog signal in the multimedia video into a digital signal;
extracting audio features, and constructing an audio fingerprint according to the audio features;
and inputting the audio fingerprints into a song database, and performing similarity retrieval to obtain background music data.
As a further scheme of the application: the step of calling the corresponding caption text data according to the video tag to enable the user to select and importing the caption text data selected by the user into the video to be produced specifically comprises the following steps:
matching the video tag with the multimedia tags of all the caption text data to determine successfully matched caption text data;
receiving a caption text selection instruction, and selecting one piece of caption text data;
and receiving a caption text editing instruction, editing the caption text data, adding a time period to each caption text, and importing the caption text data into the video to be produced according to the time periods.
As a further scheme of the application: the step of calling the corresponding background music data according to the video tag to enable the user to select and importing the background music data selected by the user into the video to be produced specifically comprises the following steps:
matching the video tag with the multimedia tags of all background music data to determine the successfully matched background music data;
receiving a background music selection instruction, and selecting one piece of background music data;
and receiving a background music editing instruction, clipping the background music data, adding an import time node, and importing the background music data into the video to be produced according to the import time node.
Another object of the present application is to provide a multimedia data processing apparatus, the apparatus comprising:
the multimedia data receiving module is used for receiving the multimedia video and the multimedia tag input by the user;
the subtitle video data module is used for identifying a subtitle region in the multimedia video to obtain subtitle-free video data and subtitle text data;
the background music data module is used for processing and identifying the audio analog signals in the multimedia video to obtain background music data;
the data classification storage module is used for classifying and storing the caption-free video data, the caption text data and the background music data according to the multimedia tag;
the video production module is used for receiving a video production instruction input by a user, wherein the video production instruction comprises a video to be produced and a video tag;
the subtitle text importing module is used for receiving a caption import instruction, invoking the corresponding caption text data according to the video tag for the user to select from, and importing the caption text data selected by the user into the video to be produced;
the background music importing module is used for receiving a music import instruction, invoking the corresponding background music data according to the video tag for the user to select from, and importing the background music data selected by the user into the video to be produced.
As a further scheme of the application: the subtitle video data module includes:
the caption text data unit is used for determining the caption area in the multimedia video, identifying the caption texts in the caption area, and sorting and integrating all the caption texts according to their play times to obtain the caption text data;
and the caption-free video data unit is used for cropping the lower edge of the multimedia video so that the caption area is cut off, obtaining the caption-free video data.
As a further scheme of the application: the background music data module includes:
the analog signal conversion unit is used for converting an audio analog signal in the multimedia video into a digital signal;
the audio fingerprint generation unit is used for extracting audio characteristics and constructing audio fingerprints according to the audio characteristics;
and the song data retrieval unit is used for inputting the audio fingerprints into the song database, and performing similarity retrieval to obtain background music data.
The application also provides an electronic device comprising a processor, a storage medium, and a computer program stored on the storage medium and executable on the processor; when the computer program is executed by the processor, the steps of the multimedia data processing method are implemented.
The application also provides a storage medium storing a program or instructions which, when executed by a processor, implement the steps of the multimedia data processing method.
Compared with the prior art, the application has the beneficial effects that:
After the user inputs a multimedia video and a multimedia tag, the application automatically generates caption-free video data, caption text data, and background music data and stores them by classification, so that subsequent use of the caption text data and the background music data is more convenient and faster.
Drawings
Fig. 1 is a flowchart of a multimedia data processing method.
Fig. 2 is a flowchart of a method for processing multimedia data to obtain subtitle-free video data and subtitle text data.
Fig. 3 is a flowchart of a method for processing multimedia data to obtain background music data.
Fig. 4 is a flowchart of a multimedia data processing method for retrieving corresponding subtitle text data according to a video tag.
Fig. 5 is a flowchart of a multimedia data processing method for retrieving corresponding background music data according to a video tag.
Fig. 6 is a schematic structural diagram of a multimedia data processing apparatus.
Fig. 7 is a schematic diagram of a structure of a subtitle video data module in a multimedia data processing apparatus.
Fig. 8 is a schematic diagram of a background music data module in a multimedia data processing apparatus.
Detailed Description
To make the objects, technical solutions, and advantages of the present application clearer, the application is described in further detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
Specific implementations of the application are described in detail below in connection with specific embodiments.
As shown in fig. 1, an embodiment of the present application provides a multimedia data processing method, which includes the following steps:
S100, receiving a multimedia video and a multimedia tag input by a user;
S200, identifying the caption area in the multimedia video to obtain caption-free video data and caption text data;
S300, processing and identifying the audio analog signal in the multimedia video to obtain background music data;
S400, classifying and storing the caption-free video data, the caption text data, and the background music data according to the multimedia tag;
S500, receiving a video production instruction input by the user, wherein the video production instruction comprises a video to be produced and a video tag;
S600, receiving a caption import instruction, invoking the corresponding caption text data according to the video tag for the user to select from, and importing the caption text data selected by the user into the video to be produced;
S700, receiving a music import instruction, invoking the corresponding background music data according to the video tag for the user to select from, and importing the background music data selected by the user into the video to be produced.
It should be noted that video enthusiasts often download notable video clips so that the video content, the lines, and even the background music can be reused and processed a second time. At present, users must process the video themselves to obtain a caption-free video and the caption text, and must search on their own for the background music, which is time-consuming and labor-intensive.
In the embodiment of the application, when a user wants to reuse and process a section of multimedia video, the user directly inputs the multimedia video and sets a multimedia tag. The multimedia tag reflects the type of the multimedia video (for example, "text" or "wisdom") and makes subsequent reuse convenient. When the user wants to produce a short video using the stored caption text data and background music data, the user directly inputs a video production instruction, which comprises the video to be produced and a video tag, and uploads the video to be produced. When caption text data is needed, the user inputs a caption import instruction; the embodiment of the application invokes the corresponding caption text data according to the video tag and, after the user makes a selection, imports the selected caption text data into the video to be produced. When background music data is needed, the user inputs a music import instruction; the embodiment of the application invokes the corresponding background music data according to the video tag and, after the user makes a selection, imports the selected background music data into the video to be produced. The caption text data and the background music data are thus used more conveniently and quickly.
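The classify-store-retrieve workflow described above (S400 through S700) can be sketched as an in-memory store keyed by tag. All names here are hypothetical and the data layout is only an illustrative assumption, since the application does not prescribe a storage structure:

```python
from collections import defaultdict

class MultimediaStore:
    """Hypothetical in-memory store: classify extracted assets under
    their multimedia tags (S400) and retrieve them by a video tag
    (S600/S700)."""

    def __init__(self):
        # tag -> {"video": [...], "subtitles": [...], "music": [...]}
        self._store = defaultdict(
            lambda: {"video": [], "subtitles": [], "music": []})

    def save(self, tag, video=None, subtitles=None, music=None):
        # Store whichever assets were extracted from one multimedia video.
        entry = self._store[tag]
        for kind, value in (("video", video),
                            ("subtitles", subtitles),
                            ("music", music)):
            if value is not None:
                entry[kind].append(value)

    def lookup(self, tag, kind):
        # Call up every stored asset of the given kind under the tag,
        # for the user to select from.
        return list(self._store[tag][kind])

store = MultimediaStore()
store.save("wisdom", subtitles="line1\nline2", music="song.mp3")
print(store.lookup("wisdom", "subtitles"))  # ['line1\nline2']
```

A production system would back this with a database, but the tag-keyed lookup shape is the same.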
As shown in fig. 2, as a preferred embodiment of the present application, the step of identifying the caption area in the multimedia video to obtain caption-free video data and caption text data specifically includes:
S201, determining the caption area in the multimedia video, identifying the caption texts in the caption area, and sorting and integrating all the caption texts according to their play times to obtain the caption text data;
S202, cropping the lower edge of the multimedia video so that the caption area is cut off, obtaining the caption-free video data.
In the embodiment of the application, the caption area in the multimedia video is determined automatically, the picture at the caption area is captured, and character recognition is performed on the picture to obtain the caption text. All caption texts are sorted and integrated according to their play times to obtain the caption text data, which the user can modify and delete as needed. The embodiment of the application also crops the lower edge of the multimedia video so that the caption area is cut off, thereby obtaining the caption-free video data.
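The sorting-and-integration of recognized caption texts and the lower-edge crop can be illustrated with a minimal sketch; the play-time representation, the fixed strip ratio, and the function names are assumptions, not part of the application:

```python
def integrate_subtitles(recognized):
    """Sort OCR-recognized caption lines by play time (seconds) and
    merge them into a single piece of caption text data (S201)."""
    ordered = sorted(recognized, key=lambda item: item[0])
    return "\n".join(text for _, text in ordered)

def crop_box_without_subtitles(width, height, strip_ratio=0.12):
    """Return the (x, y, w, h) frame region kept after cutting off the
    bottom caption strip (S202); strip_ratio is an assumed fraction."""
    kept_height = int(height * (1 - strip_ratio))
    return (0, 0, width, kept_height)

lines = [(3.2, "second line"), (1.0, "first line")]
print(integrate_subtitles(lines))              # first line\nsecond line
print(crop_box_without_subtitles(1920, 1080))  # (0, 0, 1920, 950)
```

In practice the caption strip would be located by text detection rather than a fixed ratio, and the crop applied per frame with a video library.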
As shown in fig. 3, as a preferred embodiment of the present application, the step of processing and identifying the audio analog signal in the multimedia video to obtain background music data specifically includes:
S301, converting the audio analog signal in the multimedia video into a digital signal;
S302, extracting audio features, and constructing an audio fingerprint from the audio features;
S303, inputting the audio fingerprint into a song database, and performing a similarity search to obtain the background music data.
In the embodiment of the application, the multimedia video is first divided into several sections, and the audio analog signal in each section is converted into a digital signal (ADC). Audio features are then extracted and an audio fingerprint is constructed from them. The audio fingerprint is input into a song database, which contains the digital signals of a large number of songs, and a similarity search is performed; the song with the highest similarity is output, and the digital file of that song is the background music data.
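A toy illustration of fingerprint construction and similarity retrieval follows. Real fingerprinting systems hash spectrogram peaks, so the band-energy feature and cosine similarity used here are stand-in assumptions, as are all names:

```python
import math

def audio_fingerprint(samples, bands=4):
    """Crude audio fingerprint: mean absolute energy per equal-width
    band of the digitised samples (S302). Real systems hash
    spectrogram peaks; this stands in for them."""
    n = len(samples) // bands
    return [sum(abs(s) for s in samples[i * n:(i + 1) * n]) / n
            for i in range(bands)]

def cosine(a, b):
    # Cosine similarity between two fingerprints.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def best_match(fingerprint, song_db):
    """Similarity retrieval over the song database (S303): output the
    song whose stored fingerprint is most similar to the query."""
    return max(song_db, key=lambda title: cosine(fingerprint, song_db[title]))

db = {"song_a": [0.9, 0.1, 0.1, 0.9], "song_b": [0.1, 0.9, 0.9, 0.1]}
query = audio_fingerprint([0.8, 1.0, 0.1, 0.2, 0.1, 0.2, 0.9, 0.7])
print(best_match(query, db))  # song_a
```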
As shown in fig. 4, as a preferred embodiment of the present application, the step of invoking the corresponding caption text data according to the video tag for the user to select from, and importing the caption text data selected by the user into the video to be produced, specifically includes:
S601, matching the video tag with the multimedia tags of all caption text data, and determining the successfully matched caption text data;
S602, receiving a caption text selection instruction, and selecting one piece of caption text data;
S603, receiving a caption text editing instruction, editing the caption text data, adding a time period to each caption text, and importing the caption text data into the video to be produced according to the time periods.
In the embodiment of the application, in order to quickly determine the caption text data matching the video, the video tag is matched against the multimedia tags of all caption text data and a matching degree is calculated. When the matching degree is greater than a set value, the match is considered successful. All successfully matched caption text data are called up, the user selects one of them, and the user can edit the caption text data as needed and add a time period to each caption text.
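The matching-degree computation can be sketched as follows. The application only requires a matching degree compared against a set value, so the token-set overlap (Jaccard) and the threshold used here are assumed realizations:

```python
def match_degree(video_tags, media_tags):
    """Matching degree between the video tag(s) and a stored multimedia
    tag set. The application only specifies a degree compared with a
    set value; token-set overlap (Jaccard) is one assumed choice."""
    a, b = set(video_tags), set(media_tags)
    return len(a & b) / len(a | b) if a | b else 0.0

def matched_subtitles(video_tags, subtitle_db, threshold=0.3):
    """Call up every caption text whose match is successful (S601),
    i.e. whose matching degree exceeds the set value."""
    return [text for tags, text in subtitle_db
            if match_degree(video_tags, tags) > threshold]

db = [(["wisdom", "text"], "subs_a"), (["sports"], "subs_b")]
print(matched_subtitles(["wisdom"], db))  # ['subs_a']
```

The same comparison applies unchanged to background music tags in S701.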
As shown in fig. 5, as a preferred embodiment of the present application, the step of invoking the corresponding background music data according to the video tag for the user to select from, and importing the background music data selected by the user into the video to be produced, specifically includes:
S701, matching the video tag with the multimedia tags of all background music data, and determining the successfully matched background music data;
S702, receiving a background music selection instruction, and selecting one piece of background music data;
S703, receiving a background music editing instruction, clipping the background music data, adding an import time node, and importing the background music data into the video to be produced according to the import time node.
In the embodiment of the application, in order to quickly determine the background music data matching the video, the video tag is matched against the multimedia tags of all background music data and a matching degree is calculated. When the matching degree is greater than a set value, the match is considered successful. All successfully matched background music data are called up and the user selects one of them; the user can clip the background music data as needed and add an import time node, and the background music data is imported into the corresponding position of the video to be produced according to the import time node.
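The clipping and import-time-node steps can be sketched over raw sample arrays; the sample-based representation, the overwrite behaviour at the node, and all names are simplifying assumptions:

```python
def clip_music(samples, sample_rate, start_s, end_s):
    """Cut the user-selected span out of the background music data
    (S703); samples is a raw PCM-like list, an assumed representation."""
    return samples[int(start_s * sample_rate):int(end_s * sample_rate)]

def import_at_node(track, clip, node_sample):
    """Place the clip into the video's audio track at the import time
    node, overwriting what was there (a simplifying assumption)."""
    out = list(track)
    out[node_sample:node_sample + len(clip)] = clip
    return out

rate = 4                  # toy sample rate: 4 samples per second
music = list(range(16))   # four seconds of "music"
clip = clip_music(music, rate, 1, 2)
print(clip)                               # [4, 5, 6, 7]
print(import_at_node([0] * 12, clip, 4))  # [0, 0, 0, 0, 4, 5, 6, 7, 0, 0, 0, 0]
```

A real editor would mix rather than overwrite and work through an audio/video library, but the clip-then-place structure is the same.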
As shown in fig. 6, an embodiment of the present application further provides a multimedia data processing apparatus, where the apparatus includes:
a multimedia data receiving module 100 for receiving a multimedia video and a multimedia tag inputted by a user;
the caption video data module 200 is configured to identify a caption area in the multimedia video, so as to obtain caption-free video data and caption text data;
the background music data module 300 is configured to process and identify an audio analog signal in a multimedia video to obtain background music data;
the data classification storage module 400 is configured to store the non-subtitle video data, the subtitle text data, and the background music data in a classified manner according to the multimedia tag;
the video production module 500 is configured to receive a video production instruction input by a user, where the video production instruction includes a video to be produced and a video tag;
the subtitle text importing module 600 is configured to receive a caption import instruction, invoke the corresponding caption text data according to the video tag for the user to select from, and import the caption text data selected by the user into the video to be produced;
the background music importing module 700 is configured to receive a music import instruction, invoke the corresponding background music data according to the video tag for the user to select from, and import the background music data selected by the user into the video to be produced.
In the embodiment of the application, when a user wants to reuse and process a section of multimedia video, the user directly inputs the multimedia video and sets a multimedia tag. The multimedia tag reflects the type of the multimedia video (for example, "text" or "wisdom") and makes subsequent reuse convenient. When the user wants to produce a short video using the stored caption text data and background music data, the user directly inputs a video production instruction, which comprises the video to be produced and a video tag, and uploads the video to be produced. When caption text data is needed, the user inputs a caption import instruction; the embodiment of the application invokes the corresponding caption text data according to the video tag and, after the user makes a selection, imports the selected caption text data into the video to be produced. When background music data is needed, the user inputs a music import instruction; the embodiment of the application invokes the corresponding background music data according to the video tag and, after the user makes a selection, imports the selected background music data into the video to be produced. The caption text data and the background music data are thus used more conveniently and quickly.
As shown in fig. 7, as a preferred embodiment of the present application, the subtitle video data module 200 includes:
a caption text data unit 201, configured to determine the caption area in the multimedia video, identify the caption texts in the caption area, and sort and integrate all the caption texts according to their play times to obtain the caption text data;
a no-subtitle video data unit 202, configured to crop the lower edge of the multimedia video so that the caption area is cut off, obtaining the caption-free video data.
As shown in fig. 8, as a preferred embodiment of the present application, the background music data module 300 includes:
an analog signal conversion unit 301, configured to convert an audio analog signal in a multimedia video into a digital signal;
an audio fingerprint generation unit 302, configured to extract audio features, and construct an audio fingerprint according to the audio features;
the song data retrieving unit 303 is configured to input the audio fingerprint into a song database and perform a similarity search to obtain the background music data.
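The fingerprint-and-search pipeline of units 301 to 303 can be illustrated with a toy spectral fingerprint. Production systems hash constellations of spectral peaks and query an indexed song database; the sketch below (all names are illustrative assumptions) only conveys the core idea of matching by frequency content rather than by raw samples:

```python
import numpy as np

def fingerprint(signal, win=1024):
    """Toy audio fingerprint: the dominant frequency bin of each window."""
    fp = []
    for i in range(len(signal) // win):
        spectrum = np.abs(np.fft.rfft(signal[i * win:(i + 1) * win]))
        fp.append(int(spectrum[1:].argmax()) + 1)  # skip the DC component
    return fp

def similarity(fp_a, fp_b):
    """Fraction of windows whose dominant frequency bins agree."""
    n = min(len(fp_a), len(fp_b))
    return sum(a == b for a, b in zip(fp_a, fp_b)) / n

rate = 8000
t = np.arange(rate) / rate
tone = np.sin(2 * np.pi * 440 * t)  # one second of a pure 440 Hz tone
noisy = tone + 0.05 * np.random.default_rng(0).normal(size=rate)

print(similarity(fingerprint(tone), fingerprint(noisy)))  # close to 1.0
```

Because mild noise rarely moves the dominant spectral peak, the two fingerprints stay highly similar, which is what makes a similarity search against a song database workable.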
The embodiment of the application also provides an electronic device, which comprises a processor, a storage medium, and a computer program stored on the storage medium and executable on the processor, wherein the steps of the multimedia data processing method are implemented when the processor executes the computer program.
The embodiment of the application also provides a storage medium storing a program or instructions which, when executed by a processor, implement the steps of the multimedia data processing method.
The foregoing description of the preferred embodiments of the present application is not intended to limit the application; rather, any modifications, equivalents, and alternatives falling within the spirit and principles of the application are intended to be covered by its scope of protection.
It should be understood that, although the steps in the flowcharts of the embodiments of the present application are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the order of execution of the steps is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in the various embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments; these sub-steps or stages are not necessarily performed in sequence, but may be performed in turn or alternately with at least a portion of the sub-steps or stages of other steps.
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing relevant hardware, where the program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. The non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. The volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), and direct Rambus dynamic RAM (DRDRAM), among others.
Other embodiments of the present disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (10)

1. A method of multimedia data processing, the method comprising the steps of:
receiving a multimedia video and a multimedia tag input by a user;
identifying a caption area in the multimedia video to obtain caption-free video data and caption text data;
processing and identifying an audio analog signal in the multimedia video to obtain background music data;
classifying and storing the caption-free video data, caption text data and background music data according to the multimedia tag;
receiving a video making instruction input by a user, wherein the video making instruction comprises a video to be made and a video tag;
receiving a caption import instruction, retrieving the corresponding caption text data according to the video tag for the user to select from, and importing the caption text data selected by the user into the video to be produced;
and receiving a music import instruction, retrieving the corresponding background music data according to the video tag for the user to select from, and importing the background music data selected by the user into the video to be produced.
2. The method for processing multimedia data according to claim 1, wherein the step of identifying a caption area in the multimedia video to obtain the caption-free video data and the caption text data comprises:
determining the caption area in the multimedia video, identifying the caption texts in the caption area, and sorting and integrating all the caption texts according to their playing time to obtain the caption text data;
and cutting the lower edge of the multimedia video so that the caption area is cut off, and obtaining the caption-free video data.
3. The method for processing multimedia data according to claim 1, wherein the step of processing and recognizing the audio analog signal in the multimedia video to obtain the background music data specifically comprises:
converting an audio analog signal in the multimedia video into a digital signal;
extracting audio features, and constructing an audio fingerprint according to the audio features;
and inputting the audio fingerprints into a song database, and performing similarity retrieval to obtain background music data.
4. The method for processing multimedia data according to claim 1, wherein the step of retrieving the corresponding caption text data according to the video tag, enabling the user to select, and importing the caption text data selected by the user into the video to be produced specifically comprises:
matching the video tag with the multimedia tags of all the caption text data to determine successfully matched caption text data;
receiving a caption text selection instruction, and selecting one caption text data;
and receiving a caption text editing instruction, editing the caption text data, adding a time period to each caption text, and importing the caption text data into the video to be produced according to the time periods.
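The time-period import in this claim can be illustrated by rendering caption entries, each carrying a start and end time, into the common SRT subtitle format. The format choice, function name, and timing values are assumptions made for illustration, not part of the claim:

```python
def to_srt(entries):
    """Render (start_seconds, end_seconds, text) caption entries as SRT,
    so each caption text carries the time period it is imported under."""
    def ts(sec):
        h, rem = divmod(int(sec), 3600)
        m, s = divmod(rem, 60)
        ms = int(round((sec - int(sec)) * 1000))
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

    blocks = []
    for i, (start, end, text) in enumerate(entries, 1):
        blocks.append(f"{i}\n{ts(start)} --> {ts(end)}\n{text}")
    return "\n\n".join(blocks)

srt = to_srt([(0.0, 2.5, "Hello"), (2.5, 5.0, "World")])
print(srt)
```

A video editor can then overlay each caption during its own time period when compositing the video to be produced.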
5. The method for processing multimedia data according to claim 1, wherein the step of retrieving the corresponding background music data according to the video tag, enabling the user to select, and importing the background music data selected by the user into the video to be produced specifically comprises:
matching the video tag with the multimedia tags of all background music data to determine the successfully matched background music data;
receiving a background music selection instruction, and selecting one of the background music data;
and receiving a background music editing instruction, intercepting the background music data, adding an import time node, and importing the background music data into the video to be produced according to the import time node.
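The interception step in this claim corresponds to slicing the decoded audio samples between two time nodes. A minimal sketch with illustrative names and toy data (real code would operate on decoded PCM at the track's actual sample rate):

```python
def clip_music(samples, rate, start_s, end_s):
    """Intercept the [start_s, end_s) span of a mono PCM sample buffer,
    mirroring the background-music editing step described above."""
    if not 0 <= start_s < end_s:
        raise ValueError("invalid clip range")
    return samples[int(start_s * rate):int(end_s * rate)]

rate = 4                    # toy sample rate so the slice is easy to inspect
samples = list(range(20))   # five seconds of dummy samples
print(clip_music(samples, rate, 1.0, 2.5))  # [4, 5, 6, 7, 8, 9]
```

The clipped span would then be mixed into the video starting at the chosen import time node.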
6. A multimedia data processing apparatus, the apparatus comprising:
the multimedia data receiving module is used for receiving the multimedia video and the multimedia tag input by the user;
the subtitle video data module is used for identifying a subtitle region in the multimedia video to obtain subtitle-free video data and subtitle text data;
the background music data module is used for processing and identifying the audio analog signals in the multimedia video to obtain background music data;
the data classification storage module is used for classifying and storing the caption-free video data, the caption text data and the background music data according to the multimedia tag;
the video production module is used for receiving a video production instruction input by a user, wherein the video production instruction comprises a video to be produced and a video tag;
the caption text importing module is used for receiving a caption import instruction, retrieving the corresponding caption text data according to the video tag for the user to select from, and importing the caption text data selected by the user into the video to be produced;
the background music importing module is used for receiving a music import instruction, retrieving the corresponding background music data according to the video tag for the user to select from, and importing the background music data selected by the user into the video to be produced.
7. The multimedia data processing apparatus of claim 6, wherein the subtitle video data module comprises:
the caption text data unit is used for determining the caption area in the multimedia video, identifying the caption texts in the caption area, and sorting and integrating all the caption texts according to their playing time to obtain the caption text data;
and the caption-free video data unit is used for cropping the lower edge of the multimedia video so that the caption area is cut off, obtaining the caption-free video data.
8. The multimedia data processing apparatus of claim 6, wherein the background music data module comprises:
the analog signal conversion unit is used for converting an audio analog signal in the multimedia video into a digital signal;
the audio fingerprint generation unit is used for extracting audio characteristics and constructing audio fingerprints according to the audio characteristics;
and the song data retrieval unit is used for inputting the audio fingerprints into the song database, and performing similarity retrieval to obtain background music data.
9. An electronic device comprising a processor, a storage medium, and a computer program stored on the storage medium and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the multimedia data processing method according to any one of claims 1 to 5.
10. A storage medium having stored thereon a program or instructions which, when executed by a processor, implement the steps of the multimedia data processing method according to any one of claims 1 to 5.
CN202310401667.1A 2023-04-10 2023-04-10 Multimedia data processing method and device, electronic equipment and storage medium Withdrawn CN116600168A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310401667.1A CN116600168A (en) 2023-04-10 2023-04-10 Multimedia data processing method and device, electronic equipment and storage medium


Publications (1)

Publication Number Publication Date
CN116600168A true CN116600168A (en) 2023-08-15

Family

ID=87598126

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310401667.1A Withdrawn CN116600168A (en) 2023-04-10 2023-04-10 Multimedia data processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116600168A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20080084303A (en) * 2007-03-16 2008-09-19 어뉴텍코리아 주식회사 Technology which is storing easily, quickly and accurately only wanted part from the movie and audio files
CN103179093A (en) * 2011-12-22 2013-06-26 腾讯科技(深圳)有限公司 Matching system and method for video subtitles
CN111491205A (en) * 2020-04-17 2020-08-04 维沃移动通信有限公司 Video processing method and device and electronic equipment
CN111918094A (en) * 2020-06-29 2020-11-10 北京百度网讯科技有限公司 Video processing method and device, electronic equipment and storage medium
CN113190709A (en) * 2021-03-31 2021-07-30 浙江大学 Background music recommendation method and device based on short video key frame
CN113792178A (en) * 2021-08-31 2021-12-14 北京达佳互联信息技术有限公司 Song generation method and device, electronic equipment and storage medium
CN114020960A (en) * 2021-11-15 2022-02-08 北京达佳互联信息技术有限公司 Music recommendation method, device, server and storage medium


Similar Documents

Publication Publication Date Title
CN109800407B (en) Intention recognition method and device, computer equipment and storage medium
CN110321470B (en) Document processing method, device, computer equipment and storage medium
CN101202864B (en) Player for movie contents
US20090234854A1 (en) Search system and search method for speech database
US11907659B2 (en) Item recall method and system, electronic device and readable storage medium
CN113934869A (en) Database construction method, multimedia file retrieval method and device
CN111353055B (en) Cataloging method and system based on intelligent tag extension metadata
CN113065018A (en) Audio and video index library creating and retrieving method and device and electronic equipment
CN115329048A (en) Statement retrieval method and device, electronic equipment and storage medium
JP2005151127A5 (en)
CN116600168A (en) Multimedia data processing method and device, electronic equipment and storage medium
CN111522992A (en) Method, device and equipment for putting questions into storage and storage medium
CN115687579B (en) Document tag generation and matching method, device and computer equipment
CN114218437A (en) Adaptive picture clipping and fusing method, system, computer device and medium
Raimond et al. Using the past to explain the present: interlinking current affairs with archives via the semantic web
CN115203474A (en) Automatic database classification and extraction technology
CN115618054A (en) Video recommendation method and device
JP4394083B2 (en) Signal detection apparatus, signal detection method, signal detection program, and recording medium
CN111339359B (en) Sudoku-based video thumbnail automatic generation method
CN114490510A (en) Text stream filing method and device, computer equipment and storage medium
JP2004341948A (en) Concept extraction system, concept extraction method, program therefor, and storing medium thereof
CN115858797A (en) Method and system for generating Chinese near-meaning words based on OCR technology
CN113468377A (en) Video and literature association and integration method
JPWO2011042946A1 (en) Similar content search apparatus and program
CN111444386A (en) Video information retrieval method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20230815