CN109286841B - Movie sound effect processing method and related product - Google Patents


Info

Publication number
CN109286841B
CN109286841B (application CN201811209949.7A; published as CN109286841A)
Authority
CN
China
Prior art keywords
video
indoor
outdoor
frame data
interval
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811209949.7A
Other languages
Chinese (zh)
Other versions
CN109286841A (en)
Inventor
朱克智
严锋贵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201811209949.7A priority Critical patent/CN109286841B/en
Publication of CN109286841A publication Critical patent/CN109286841A/en
Application granted granted Critical
Publication of CN109286841B publication Critical patent/CN109286841B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4394 Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • H04N21/4398 Processing of audio elementary streams involving reformatting operations of audio signals
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44008 Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream

Abstract

Embodiments of the present application disclose a movie sound effect processing method and a related product. The method includes the following steps: determining a movie video to be played, and extracting video frame data and audio frame data from the movie video; analyzing the video frame data to determine indoor scene intervals and outdoor scene intervals in the video frame data, and extracting the indoor time periods corresponding to the indoor scene intervals and the outdoor time periods corresponding to the outdoor scene intervals; when playback of the movie video reaches an indoor time period, playing the audio frame data of that period with an indoor 3D sound effect strategy, and when playback reaches an outdoor time period, playing the audio frame data of that period with an outdoor 3D sound effect strategy. The technical solution provided by the present application offers the advantage of an improved user experience.

Description

Movie sound effect processing method and related product
Technical Field
The application relates to the technical field of audio, in particular to a movie sound effect processing method and a related product.
Background
With the widespread use of electronic devices (such as mobile phones and tablet computers), electronic devices support more and more applications and ever more powerful functions; they are developing in diversified and personalized directions and have become indispensable products in users' daily lives. Movie applications are among the most frequently used applications on electronic devices. Existing movie videos are all based on surround sound, and their audio effect depends on the device playing the audio: for example, loudspeakers and earphones play the same video with different effects. The actual scene of a movie is conveyed only by the video, and the audio does not distinguish between scenes, so existing movie playback cannot differentiate scenes, which degrades the user experience.
Disclosure of Invention
The embodiment of the application provides a movie sound effect processing method and a related product, which can process audio according to the actual scene of a movie and improve user experience.
In a first aspect, an embodiment of the present application provides a method for processing sound effects of a movie, where the method includes the following steps:
determining a movie video to be played, and extracting video frame data and audio frame data in the movie video;
analyzing the video frame data to determine an indoor scene interval and an outdoor scene interval in the video frame data, and extracting an indoor time period corresponding to the indoor scene interval and an outdoor time period corresponding to the outdoor scene interval;
when the movie video is played to the indoor time period, the audio frame data of the indoor time period are played by adopting an indoor 3D sound effect strategy, and when the movie video is played to the outdoor time period, the audio frame data of the outdoor time period are played by adopting an outdoor 3D sound effect strategy.
In a second aspect, a motion picture sound effect processing apparatus is provided, the motion picture sound effect processing apparatus comprising:
the acquisition unit is used for determining a movie video to be played and extracting video frame data and audio frame data in the movie video;
the analysis unit is used for analyzing the video frame data to determine an indoor scene interval and an outdoor scene interval in the video frame data, and extracting an indoor time period corresponding to the indoor scene interval and an outdoor time period corresponding to the outdoor scene interval;
and the playing unit is used for playing the audio frame data in the indoor time period by adopting an indoor 3D sound effect strategy when the movie video is played in the indoor time period, and playing the audio frame data in the outdoor time period by adopting an outdoor 3D sound effect strategy when the movie video is played in the outdoor time period.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, and the program includes instructions for executing the steps in the first aspect of the embodiment of the present application.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program for electronic data exchange, where the computer program enables a computer to perform some or all of the steps described in the first aspect of the embodiment of the present application.
In a fifth aspect, embodiments of the present application provide a computer program product, where the computer program product includes a non-transitory computer-readable storage medium storing a computer program, where the computer program is operable to cause a computer to perform some or all of the steps as described in the first aspect of the embodiments of the present application. The computer program product may be a software installation package.
According to the technical solution, when the movie video to be played is determined, the video frame data and audio frame data of the movie video are obtained; the video frame data are then analyzed to determine indoor scene intervals and outdoor scene intervals, and the time periods corresponding to those intervals are extracted. Different 3D sound effect strategies are applied during the time periods of different scenes, so that the user can perceive the indoor/outdoor difference from both the audio and the video, which enhances the audio effect and improves the user experience.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application; for those skilled in the art, other drawings can be derived from these drawings without creative effort.
Fig. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart illustrating a method for processing sound effects of a movie according to an embodiment of the present application;
FIG. 3 is a schematic flow chart illustrating another method for processing sound effects of a movie according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a sound effect processing apparatus for movies according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of another electronic device disclosed in the embodiment of the present application.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person skilled in the art based on the embodiments of the present application without creative effort shall fall within the protection scope of the present application.
The terms "first," "second," and the like in the description and claims of the present application and in the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The electronic device according to the embodiment of the present application may include various handheld devices (e.g., smart phones), vehicle-mounted devices, Virtual Reality (VR)/Augmented Reality (AR) devices, wearable devices, computing devices or other processing devices connected to wireless modems, and various forms of User Equipment (UE), Mobile Stations (MSs), terminal devices (terminal devices), development/test platforms, servers, and so on, which have wireless communication functions. For convenience of description, the above-mentioned devices are collectively referred to as electronic devices.
In a specific implementation, in the embodiments of the present application, the electronic device may filter audio data (the sound emitted by a sound source) with an HRTF (Head Related Transfer Function) filter to obtain virtual surround sound, also called surround sound or panoramic sound, thereby achieving a three-dimensional stereo effect. The time-domain counterpart of the HRTF is the HRIR (Head Related Impulse Response). Alternatively, the audio data may be convolved with a Binaural Room Impulse Response (BRIR), which consists of three parts: direct sound, early reflections, and reverberation.
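As a hedged illustration of the BRIR convolution described above (the patent does not specify an implementation; the impulse responses and signal below are synthetic placeholders), a BRIR can be applied per ear with a plain discrete convolution:

```python
import numpy as np

def apply_brir(source: np.ndarray, brir_left: np.ndarray,
               brir_right: np.ndarray) -> np.ndarray:
    """Convolve a mono source with left/right room impulse responses
    to produce a binaural (2-channel) signal."""
    left = np.convolve(source, brir_left)
    right = np.convolve(source, brir_right)
    return np.stack([left, right], axis=0)

# Synthetic example: a 1-sample impulse source and toy impulse responses
# modelling direct sound plus one early reflection at lag 2.
source = np.array([1.0, 0.0, 0.0, 0.0])
brir_l = np.array([1.0, 0.0, 0.3])
brir_r = np.array([0.8, 0.0, 0.5])
binaural = apply_brir(source, brir_l, brir_r)
```

In practice the impulse responses would be measured BRIRs rather than these toy arrays, and the convolution would typically be done blockwise in the frequency domain for efficiency.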
Referring to fig. 1, fig. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure, where the electronic device includes a control circuit and an input-output circuit, and the input-output circuit is connected to the control circuit.
The control circuitry may include, among other things, storage and processing circuitry. The storage circuit in the storage and processing circuit may be a memory, such as a hard disk drive memory, a non-volatile memory (e.g., a flash memory or other electronically programmable read only memory used to form a solid state drive, etc.), a volatile memory (e.g., a static or dynamic random access memory, etc.), etc., and the embodiments of the present application are not limited thereto. Processing circuitry in the storage and processing circuitry may be used to control the operation of the electronic device. The processing circuitry may be implemented based on one or more microprocessors, microcontrollers, digital signal processors, baseband processors, power management units, audio codec chips, application specific integrated circuits, display driver integrated circuits, and the like.
The storage and processing circuitry may be used to run software in the electronic device, such as applications for playing incoming-call alert ringtones, short-message alert ringtones, and alarm ringtones, applications for playing media files, Voice over Internet Protocol (VoIP) telephone call applications, operating system functions, and so on. The software may be used to perform control operations such as playing an incoming-call alert ringtone, playing a short-message alert ringtone, playing an alarm ringtone, playing a media file, making a voice telephone call, and performing other functions in the electronic device; the embodiments of the present application are not limited in this respect.
The input-output circuit can be used for enabling the electronic device to input and output data, namely allowing the electronic device to receive data from the external device and allowing the electronic device to output data from the electronic device to the external device.
The input-output circuit may further include a sensor. The sensors may include ambient light sensors, optical and capacitive based infrared proximity sensors, ultrasonic sensors, touch sensors (e.g., optical based touch sensors and/or capacitive touch sensors, where the touch sensors may be part of a touch display screen or may be used independently as a touch sensor structure), acceleration sensors, gravity sensors, and other sensors, etc. The input-output circuit may further include audio components that may be used to provide audio input and output functionality for the electronic device. The audio components may also include a tone generator and other components for generating and detecting sound.
The input-output circuitry may also include one or more display screens. The display screen can comprise one or a combination of a liquid crystal display screen, an organic light emitting diode display screen, an electronic ink display screen, a plasma display screen and a display screen using other display technologies. The display screen may include an array of touch sensors (i.e., the display screen may be a touch display screen). The touch sensor may be a capacitive touch sensor formed by a transparent touch sensor electrode (e.g., an Indium Tin Oxide (ITO) electrode) array, or may be a touch sensor formed using other touch technologies, such as acoustic wave touch, pressure sensitive touch, resistive touch, optical touch, and the like, and the embodiments of the present application are not limited thereto.
The input-output circuitry may further include communications circuitry that may be used to provide the electronic device with the ability to communicate with external devices. The communication circuitry may include analog and digital input-output interface circuitry, and wireless communication circuitry based on radio frequency signals and/or optical signals. The wireless communication circuitry in the communication circuitry may include radio frequency transceiver circuitry, power amplifier circuitry, low noise amplifiers, switches, filters, and antennas. For example, the wireless communication circuitry in the communication circuitry may include circuitry to support Near Field Communication (NFC) by transmitting and receiving near field coupled electromagnetic signals. For example, the communication circuit may include a near field communication antenna and a near field communication transceiver. The communications circuitry may also include cellular telephone transceiver and antennas, wireless local area network transceiver circuitry and antennas, and so forth.
The input-output circuit may further include other input-output units. Input-output units may include buttons, joysticks, click wheels, scroll wheels, touch pads, keypads, keyboards, cameras, light emitting diodes and other status indicators, and the like.
The electronic device may further include a battery (not shown) for supplying power to the electronic device.
Film is a continuous sequence of images developed from the combination of motion photography and projection; it is a modern audiovisual art, and a composite of modern technology and art that can accommodate many art forms such as drama, photography, painting, music, dance, literature, sculpture, and architecture.
Movies are mostly shown in dedicated venues such as cinemas. With the development of electronic devices and communication technology, electronic devices such as smartphones have also become common movie-playing devices, but their audio processing is far inferior to the equipment in a cinema. A user's movie experience comprises both the video experience and the audio experience. The video experience is improved mainly through display screen technology; for the audio experience, however, existing electronic devices have no dedicated processing flow for movies, which degrades the user's experience of movie audio.
A higher goal of the movie experience is to restore the scene to the viewer, i.e. to give the viewer the sensation of being personally present; existing 5D movies and the like, for example, improve the viewer's experience through certain technical means. In real life, the audio heard in an indoor scene differs from the audio heard in an outdoor scene. In particular, explosion and battle scenes often appear in movies, and the same audio would in reality sound completely different depending on whether the listener is indoors or outdoors; existing movie playback on electronic devices does not reproduce these different effects, which affects the user experience.
The following describes embodiments of the present application in detail.
Referring to fig. 2, fig. 2 is a schematic flow chart of a method for processing sound effect of a movie, which is applied to the electronic device described in fig. 1 and disclosed in an embodiment of the present application, and the method for processing sound effect of a movie includes the following steps:
step S201, determining a movie video to be played, and extracting video frame data and audio frame data in the movie video;
the determining of the movie video to be played in step S201 may specifically include:
acquiring the movie video identifier played by a video app running on the electronic device, and determining the movie video to be played according to the movie video identifier, where the movie video identifier includes but is not limited to: a movie ID number, a movie name, and the like.
The extracting of the video frame data and the audio frame data in the movie video in step S201 may specifically include:
separating the video data and audio data in the movie video to obtain the video frame data and the audio frame data, where the video frame data includes video frames and the times corresponding to the video frames, and the audio frame data includes audio frames and the times corresponding to the audio frames.
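A minimal sketch of the separated frame data described in this step (the field names and container types are assumptions for illustration; the patent only requires frames paired with their playback times):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class VideoFrame:
    time_s: float   # presentation time of the frame, in seconds
    pixels: bytes   # decoded image data (placeholder)

@dataclass
class AudioFrame:
    time_s: float   # presentation time of the frame, in seconds
    samples: bytes  # decoded PCM data (placeholder)

@dataclass
class MovieFrames:
    video: List[VideoFrame]
    audio: List[AudioFrame]

# A two-frame toy movie: each frame carries the time it is played at.
movie = MovieFrames(
    video=[VideoFrame(0.00, b""), VideoFrame(0.04, b"")],
    audio=[AudioFrame(0.00, b""), AudioFrame(0.02, b"")],
)
```

The actual demultiplexing would be done by a media library or a tool such as ffmpeg; only the resulting time-stamped structure matters for the interval analysis that follows.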
Step S202, analyzing video frame data to determine an indoor scene interval and an outdoor scene interval in the video frame data, and extracting an indoor time period corresponding to the indoor scene interval and an outdoor time period of the outdoor scene interval;
step S203, when the movie video is played to the indoor time period, the audio frame data of the indoor time period is played by adopting an indoor 3D sound effect strategy, and when the movie video is played to the outdoor time period, the audio frame data of the outdoor time period is played by adopting an outdoor 3D sound effect strategy.
Optionally, the indoor 3D sound effect strategy includes but is not limited to: decreasing the volume, increasing the echo, and the like. The outdoor 3D sound effect strategy may specifically include: filtering out echo, lowering the peak volume, and the like.
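The two strategies above can be sketched on a block of PCM samples as follows (a hedged illustration only: the gain factor, echo delay, and echo level are invented for the example and are not taken from the patent):

```python
import numpy as np

def indoor_effect(samples: np.ndarray, gain: float = 0.7,
                  echo_delay: int = 4, echo_level: float = 0.3) -> np.ndarray:
    """Indoor strategy: lower the volume and add a simple delayed echo."""
    out = samples * gain
    echoed = np.zeros_like(out)
    echoed[echo_delay:] = out[:-echo_delay] * echo_level
    return out + echoed

def outdoor_effect(samples: np.ndarray, peak: float = 0.8) -> np.ndarray:
    """Outdoor strategy: clip peaks down (a crude stand-in for
    echo filtering plus peak-volume reduction)."""
    return np.clip(samples, -peak, peak)

block = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
indoor = indoor_effect(block)    # volume reduced, echo appears at lag 4
outdoor = outdoor_effect(block)  # peak limited to 0.8
```

A production implementation would use proper reverberation and filtering rather than a single delayed copy, but the structure of "one transform per scene type" is the same.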
In the technical solution provided by the present application, when the movie video to be played is determined, the video frame data and audio frame data of the movie video are obtained; the video frame data are then analyzed to determine the indoor scene intervals and outdoor scene intervals, and the corresponding time periods are extracted. Different 3D sound effect strategies are applied during the time periods of different scenes, so that the user can perceive the indoor/outdoor difference from both the audio and the video, which enhances the audio effect and improves the user experience.
Optionally, analyzing and determining the indoor scene interval and the outdoor scene interval in the video frame data in step S202 may specifically include:
transmitting each video frame in the video frame data to a trained classifier and executing a classification algorithm to obtain a plurality of video frames of indoor scenes and a plurality of video frames of outdoor scenes; determining a run of indoor video frames whose continuous duration is longer than a preset time as an indoor scene interval, and determining a run of outdoor video frames whose continuous duration is longer than the preset time as an outdoor scene interval.
The preset time may be specifically 2s, and of course, may also be another value, and the embodiment of the present application is not limited to the specific value of the preset time.
Such classifiers include but are not limited to algorithm models with a classification function, such as machine learning models, neural network models, and deep learning models.
The following uses a practical example to show how the indoor scene interval is determined from the video frames labeled by the classifier. For movie footage, data statistics show that switching between indoor and outdoor scenes is not fast: if the picture shows an indoor scene, it generally lasts for more than 2 seconds, and the same holds for outdoor scenes. Counting the continuous duration of the labeled video frames therefore makes it possible to cull the noise data produced by the classifier.
Specifically, suppose the classifier determines that video frames 1 to 1000 are all indoor scenes; the duration x of video frames 1 to 1000 is extracted, and if x is greater than 2 seconds, video frames 1 to 1000 are determined to be an indoor scene interval.
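The duration rule in this example can be sketched as follows (a hedged illustration: the per-frame labels and the 2-second threshold follow the text, but the function and field names are invented):

```python
from typing import List, Tuple

def scene_intervals(labels: List[str], times: List[float],
                    min_duration: float = 2.0) -> List[Tuple[str, float, float]]:
    """Group consecutive equal labels into runs and keep only runs
    whose time span exceeds min_duration seconds."""
    intervals = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            if times[i - 1] - times[start] > min_duration:
                intervals.append((labels[start], times[start], times[i - 1]))
            start = i
    return intervals

# 25 fps toy stream: 3 s of indoor frames, then 0.2 s of noisy
# outdoor labels that should be culled by the duration filter.
labels = ["indoor"] * 75 + ["outdoor"] * 5
times = [i / 25.0 for i in range(80)]
result = scene_intervals(labels, times)
```

Only the indoor run survives the 2-second filter; the brief outdoor run is discarded as classifier noise.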
Of course, the indoor scene interval and the outdoor scene interval may also be determined by the number, which may specifically include:
the method comprises the steps of transmitting each frame of video data in video frame data to a trained classifier, executing a classification algorithm to process the video data to obtain a plurality of video frames of an indoor scene and a plurality of video frames of an outdoor scene, determining the video frames of which the number of continuous frames is greater than a number threshold value in the plurality of video frames of the indoor scene as an indoor scene interval, and determining the video frames of which the number of continuous frames is greater than the number threshold value in the plurality of video frames of the outdoor scene as an outdoor scene interval.
Specifically, suppose the classifier determines that video frames 1 to 1000 are all indoor scenes; the number of consecutive frames 1 to 1000 is 1000, and assuming the number threshold is 100, video frames 1 to 1000 are determined to be an indoor scene interval.
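The frame-count variant can be sketched in the same way (hedged: the threshold of 100 comes from the example above; the function name is invented):

```python
from typing import List, Tuple

def scene_intervals_by_count(labels: List[str],
                             min_frames: int = 100) -> List[Tuple[str, int, int]]:
    """Group consecutive equal labels into runs and keep runs with more
    than min_frames frames; returns (label, first, last) frame indices."""
    intervals = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            if i - start > min_frames:
                intervals.append((labels[start], start, i - 1))
            start = i
    return intervals

# 1000 indoor frames followed by 50 noisy outdoor frames: the 50-frame
# outdoor run falls below the threshold of 100 and is discarded.
runs = scene_intervals_by_count(["indoor"] * 1000 + ["outdoor"] * 50)
```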
Of course, the above scheme may also determine the indoor scene interval and the outdoor scene interval in the following manner, which may specifically include:
the method comprises the steps of transmitting each frame of video data in video frame data to a trained classifier, executing a classification algorithm to process the video data to obtain a plurality of video frames of an indoor scene and a plurality of video frames of an outdoor scene, determining the video frames of the plurality of video frames of the indoor scene, the number of which is larger than a number threshold value, as an indoor scene interval, determining the video frames of the plurality of video frames of the outdoor scene, the number of which is larger than the number threshold value, as the outdoor scene interval, if the time interval of two adjacent indoor scene intervals is smaller than a preset time, combining the two adjacent indoor scene intervals into one indoor scene interval, and if the time interval of the two adjacent outdoor scene intervals is smaller than the preset time, combining the two adjacent outdoor scene intervals into one outdoor scene interval.
Specifically, suppose the classifier determines that video frames 1 to 1000 are all indoor scenes; with 1000 consecutive frames and a number threshold of 100, video frames 1 to 1000 are determined to be indoor scene interval 1. Suppose frames 1003 to 2000 are also all indoor scenes; with 998 consecutive frames, they are determined to be indoor scene interval 2. There is a gap of only 2 video frames between interval 1 and interval 2, which is smaller than the set threshold; since indoor/outdoor transitions are not that fast, those 2 frames can be treated as a noise signal, i.e. frames the classifier misidentified, and indoor scene interval 1 and indoor scene interval 2 are directly merged into one indoor scene interval.
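The gap-merging step of this example can be sketched as follows (hedged: the interval representation and the 2-frame gap threshold follow the example; the function name is invented):

```python
from typing import List, Tuple

def merge_close_intervals(intervals: List[Tuple[int, int]],
                          max_gap: int = 2) -> List[Tuple[int, int]]:
    """Merge same-scene intervals separated by at most max_gap frames,
    treating the tiny gap as classifier noise."""
    merged = []
    for first, last in sorted(intervals):
        if merged and first - merged[-1][1] - 1 <= max_gap:
            merged[-1] = (merged[-1][0], last)  # absorb the gap
        else:
            merged.append((first, last))
    return merged

# The example from the text: frames 1-1000 and 1003-2000 with a
# 2-frame gap collapse into one indoor interval.
merged = merge_close_intervals([(1, 1000), (1003, 2000)])
```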
Optionally, the method may further include:
determining the physical environment of an outdoor scene interval; if the physical environment is a ground environment, determining a first attenuation curve for transmission of the audio frame data over the target distance in air, and determining the volume of the audio data according to the relation between the first attenuation curve and the target distance;
if the physical environment is an underwater environment, determining a second attenuation curve for transmission of the audio frame data over the target distance in water, and determining the volume of the audio data according to the relation between the second attenuation curve and the target distance.
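One hedged way to realize the two attenuation curves (the patent does not give the curves themselves; the inverse-distance law and the absorption coefficient below are illustrative, physics-inspired placeholders):

```python
import math

def air_attenuation(distance_m: float) -> float:
    """First curve: inverse-distance spreading loss in air
    (gain relative to 1 m)."""
    return 1.0 / max(distance_m, 1.0)

def water_attenuation(distance_m: float, absorption: float = 0.05) -> float:
    """Second curve: spreading loss plus exponential absorption in water
    (the absorption coefficient is an illustrative placeholder)."""
    return math.exp(-absorption * distance_m) / max(distance_m, 1.0)

def scaled_volume(base_volume: float, distance_m: float,
                  underwater: bool) -> float:
    """Pick the curve for the physical environment and scale the volume
    by the attenuation at the target distance."""
    curve = water_attenuation if underwater else air_attenuation
    return base_volume * curve(distance_m)
```

For the same target distance, the underwater curve yields a lower volume than the air curve, which is the distinction the two branches above are meant to capture.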
In an optional example, the target distance may be determined as follows: determine the first position of the sound source in the first audio frame data at a first time of an indoor scene interval; perform face recognition on the video frame data corresponding to the first time, for example recognizing a human face and determining its second position; mark the first position and the second position in the map of the first video frame data; and compute the distance between the first position and the second position as the transmission target distance (a straight-line distance may be used, since audio transmission can essentially be regarded as straight-line propagation).
The map of the first video frame data may be preconfigured map data; for example, for a scene shot at a particular location in Beijing, a map of that location in the movie scene is configured.
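The distance computation in this optional example can be sketched as follows (hedged: the 2-D map coordinates and names are assumptions; the patent only requires a straight-line distance between the two marked positions):

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # (x, y) position on the scene map, in metres

def target_distance(sound_source: Point, face: Point) -> float:
    """Straight-line distance between the sound source position and the
    recognized face position, used as the transmission target distance."""
    dx = sound_source[0] - face[0]
    dy = sound_source[1] - face[1]
    return math.hypot(dx, dy)

# Sound source at (0, 0), recognized face at (3, 4): 5 m apart.
d = target_distance((0.0, 0.0), (3.0, 4.0))
```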
Referring to fig. 3, fig. 3 is a schematic flow chart of a method for processing sound effect of a movie, which is applied to the electronic device described in fig. 1 and disclosed in the embodiment of the present application, and the method for processing sound effect of a movie includes the following steps:
step S301, determining a movie video to be played, and extracting video frame data and audio frame data in the movie video;
Step S302: transmit each video frame in the video frame data to a trained classifier and execute a classification algorithm to obtain a plurality of video frames of indoor scenes and a plurality of video frames of outdoor scenes. Determine a run of indoor video frames whose number of consecutive frames is greater than a number threshold as an indoor scene interval, and a run of outdoor video frames whose number of consecutive frames is greater than the number threshold as an outdoor scene interval. If the time interval between two adjacent indoor scene intervals is shorter than the preset time, merge them into one indoor scene interval; if the time interval between two adjacent outdoor scene intervals is shorter than the preset time, merge them into one outdoor scene interval. Finally, extract the indoor time periods corresponding to the indoor scene intervals and the outdoor time periods corresponding to the outdoor scene intervals.
Step S303, when the movie video is played to the indoor time period, the volume of the audio frame data in the indoor time period is reduced, and when the movie video is played to the outdoor time period, the echo is filtered.
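The interval-detection logic of step S302 can be sketched as follows. This is a minimal illustration under stated assumptions: the trained classifier's per-frame output is assumed to already be available as a list of `'indoor'`/`'outdoor'` labels, and the frame rate, frame-count threshold, and merge gap are hypothetical parameters not fixed by the application.

```python
def scene_intervals(labels, fps, min_frames, merge_gap_s):
    """Group per-frame indoor/outdoor labels into scene intervals.

    labels: one 'indoor' or 'outdoor' label per video frame (assumed to
    come from the trained classifier of step S302).
    Returns {'indoor': [(start_s, end_s), ...], 'outdoor': [...]}.
    """
    runs = {'indoor': [], 'outdoor': []}
    start = 0
    for i in range(1, len(labels) + 1):
        # close a run when the label changes or the sequence ends
        if i == len(labels) or labels[i] != labels[start]:
            if i - start > min_frames:  # keep runs longer than the threshold
                runs[labels[start]].append((start / fps, i / fps))
            start = i
    # merge adjacent same-class intervals separated by a short gap
    for cls, intervals in runs.items():
        merged = []
        for s, e in intervals:
            if merged and s - merged[-1][1] < merge_gap_s:
                merged[-1] = (merged[-1][0], e)
            else:
                merged.append((s, e))
        runs[cls] = merged
    return runs
```

For example, a brief two-frame outdoor run sandwiched between two long indoor runs is discarded by the frame-count threshold, and the two indoor intervals are then merged into one.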
In the technical solution provided by this application, when the movie video to be played is determined, the video frame data and audio frame data of the movie video are acquired; the video frame data is then analyzed to determine the indoor scene interval and the outdoor scene interval, and the corresponding time periods are extracted. Different 3D sound effect strategies are adopted for different scenes, so that the user can perceive the difference between indoor and outdoor scenes from both the audio and the video, thereby enhancing the sound effect and improving the user experience.
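The two per-scene strategies (indoor: reduce volume and add echo; outdoor: filter echo and limit the maximum volume) can be illustrated with a minimal sketch. The gain values, echo delay, and clamping level below are invented for illustration, and real echo removal would require an adaptive filter rather than the simple peak clamp shown here.

```python
def apply_indoor_strategy(samples, rate, gain=0.8, echo_delay_s=0.05, echo_gain=0.3):
    """Sketch of an indoor 3D sound strategy: scale the volume down and
    add a simple echo (a delayed, attenuated copy of the signal).
    Parameter values are illustrative, not taken from the application."""
    delay = int(rate * echo_delay_s)
    out = [s * gain for s in samples]
    for i in range(delay, len(out)):
        out[i] += samples[i - delay] * gain * echo_gain
    return out

def apply_outdoor_strategy(samples, max_level=0.9):
    """Sketch of an outdoor strategy: clamp peaks to limit the maximum
    volume; genuine echo filtering is out of scope for this sketch."""
    return [max(-max_level, min(max_level, s)) for s in samples]
```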
Referring to fig. 4, fig. 4 provides a sound effect processing apparatus, which includes:
an obtaining unit 401, configured to determine a movie video to be played, and extract video frame data and audio frame data in the movie video;
an analyzing unit 402, configured to analyze the video frame data to determine an indoor scene interval and an outdoor scene interval in the video frame data, and extract an indoor time period corresponding to the indoor scene interval and an outdoor time period corresponding to the outdoor scene interval;
the playing unit 403 is configured to play the audio frame data in the indoor time period by using an indoor 3D sound effect policy when the movie video is played in the indoor time period, and play the audio frame data in the outdoor time period by using an outdoor 3D sound effect policy when the movie video is played in the outdoor time period.
Optionally, the analyzing and determining the indoor scene interval and the outdoor scene interval in the video frame data specifically includes:
the parsing unit 402 is specifically configured to transmit each frame of video data in the video frame data to a trained classifier, execute a classification algorithm to process the frame to obtain a plurality of video frames of an indoor scene and a plurality of video frames of an outdoor scene, determine a video frame of the plurality of video frames of the indoor scene whose continuous time is greater than a preset time as an indoor scene interval, and determine a video frame of the plurality of video frames of the outdoor scene whose continuous time is greater than the preset time as an outdoor scene interval.
Optionally, the analyzing and determining the indoor scene interval and the outdoor scene interval in the video frame data specifically includes:
the parsing unit 402 is specifically configured to transmit each frame of video data in the video frame data to a trained classifier, execute a classification algorithm to process the frame to obtain a plurality of video frames of an indoor scene and a plurality of video frames of an outdoor scene, determine, as an indoor scene interval, a video frame of the plurality of video frames of the indoor scene whose number of consecutive frames is greater than a number threshold, and determine, as an outdoor scene interval, a video frame of the plurality of video frames of the outdoor scene whose number of consecutive frames is greater than the number threshold.
Optionally, the analyzing and determining the indoor scene interval and the outdoor scene interval in the video frame data specifically includes:
the parsing unit 402 is specifically configured to transmit each frame of video data in the video frame data to a trained classifier, execute a classification algorithm to process the frame to obtain a plurality of video frames of an indoor scene and a plurality of video frames of an outdoor scene, determine video frames of the plurality of video frames of the indoor scene whose number of consecutive frames is greater than a number threshold as an indoor scene interval, determine video frames of the plurality of video frames of the outdoor scene whose number of consecutive frames is greater than the number threshold as an outdoor scene interval, if a time interval between two adjacent indoor scene intervals is less than a preset time, merge the two adjacent indoor scene intervals into one indoor scene interval, if a time interval between two adjacent outdoor scene intervals is less than the preset time, merge the two adjacent outdoor scene intervals into one outdoor scene interval.
Optionally, the apparatus further comprises:
a processing unit 404, configured to determine a physical environment of the outdoor scene interval; if the physical environment is a ground environment, determine a first attenuation curve for the audio frame data transmitted over the target distance in air, and determine the volume of the audio data according to the relationship between the first attenuation curve and the target distance.
Optionally, the apparatus further comprises:
The processing unit 404 is further configured to, if the physical environment is an underwater environment, determine a second attenuation curve for the audio frame data transmitted over the target distance in water, and determine the volume of the audio data according to the relationship between the second attenuation curve and the target distance.
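The attenuation-curve lookup can be sketched as linear interpolation over preset (distance, gain) points. The curve values below are invented for illustration; in the scheme described above, separate curves would be configured for transmission in air (the first attenuation curve) and in water (the second attenuation curve).

```python
def volume_from_curve(curve, distance):
    """Interpolate a playback gain from an attenuation curve given as
    sorted (distance_m, gain) points. The curve shape is hypothetical."""
    for (d0, g0), (d1, g1) in zip(curve, curve[1:]):
        if d0 <= distance <= d1:
            t = (distance - d0) / (d1 - d0)
            return g0 + t * (g1 - g0)
    # clamp outside the configured range
    return curve[-1][1] if distance > curve[-1][0] else curve[0][1]

# Hypothetical "first attenuation curve" for air; water would use a
# separate curve with its own (distance, gain) points.
air_curve = [(0, 1.0), (10, 0.5), (100, 0.1)]
print(volume_from_curve(air_curve, 5))  # 0.75
```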
Referring to fig. 5, fig. 5 is a schematic structural diagram of another electronic device disclosed in the embodiment of the present application, and as shown in the drawing, the electronic device includes a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, and the program includes instructions for performing the following steps:
determining a movie video to be played, and extracting video frame data and audio frame data in the movie video;
analyzing the video frame data to determine an indoor scene interval and an outdoor scene interval in the video frame data, and extracting an indoor time period corresponding to the indoor scene interval and an outdoor time period corresponding to the outdoor scene interval;
when the movie video is played to the indoor time period, the audio frame data of the indoor time period are played by adopting an indoor 3D sound effect strategy, and when the movie video is played to the outdoor time period, the audio frame data of the outdoor time period are played by adopting an outdoor 3D sound effect strategy.
In an optional example, the analyzing the video frame data to determine the indoor scene interval and the outdoor scene interval in the video frame data specifically includes:
transmitting each frame of video data in the video frame data to a trained classifier and executing a classification algorithm to obtain a plurality of video frames of an indoor scene and a plurality of video frames of an outdoor scene, determining video frames of the indoor scene whose continuous duration is greater than a preset time as an indoor scene interval, and determining video frames of the outdoor scene whose continuous duration is greater than the preset time as an outdoor scene interval.
In an optional example, the analyzing the video frame data to determine the indoor scene interval and the outdoor scene interval in the video frame data specifically includes:
the method comprises the steps of transmitting each frame of video data in video frame data to a trained classifier, executing a classification algorithm to process the video data to obtain a plurality of video frames of an indoor scene and a plurality of video frames of an outdoor scene, determining the video frames of which the number of continuous frames is greater than a number threshold value in the plurality of video frames of the indoor scene as an indoor scene interval, and determining the video frames of which the number of continuous frames is greater than the number threshold value in the plurality of video frames of the outdoor scene as an outdoor scene interval.
In an optional example, the analyzing the video frame data to determine the indoor scene interval and the outdoor scene interval in the video frame data specifically includes:
Each frame of video data in the video frame data is transmitted to a trained classifier and a classification algorithm is executed to obtain a plurality of video frames of an indoor scene and a plurality of video frames of an outdoor scene; video frames of the indoor scene whose number of consecutive frames is greater than a number threshold are determined as an indoor scene interval, and video frames of the outdoor scene whose number of consecutive frames is greater than the number threshold are determined as an outdoor scene interval; if the time interval between two adjacent indoor scene intervals is less than a preset time, the two adjacent indoor scene intervals are merged into one indoor scene interval, and if the time interval between two adjacent outdoor scene intervals is less than the preset time, the two adjacent outdoor scene intervals are merged into one outdoor scene interval.
In an optional example, the method further comprises:
determining a physical environment of an outdoor scene interval, if the physical environment is a ground environment, determining a first attenuation curve of a transmission target distance of the audio frame data in the air, and determining the volume of the audio data according to the relation between the first attenuation curve and the target distance.
In an optional example, the method further comprises:
if the physical environment is an underwater environment, determining a second attenuation curve of the target distance of audio frame data transmission in water, and determining the volume of the audio data according to the relation between the second attenuation curve and the target distance.
The above description has introduced the solution of the embodiments of the present application mainly from the perspective of the method-side implementation process. It is understood that, in order to realize the above functions, the electronic device comprises corresponding hardware structures and/or software modules for performing the respective functions. Those skilled in the art will readily appreciate that the various illustrative units and algorithm steps described in connection with the embodiments provided herein can be implemented in hardware, or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends upon the particular application and the design constraints of the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiment of the present application, the electronic device may be divided into the functional units according to the method example, for example, each functional unit may be divided corresponding to each function, or two or more functions may be integrated into one processing unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit. It should be noted that the division of the unit in the embodiment of the present application is schematic, and is only a logic function division, and there may be another division manner in actual implementation.
It should be noted that the electronic device described in the embodiments of the present application is presented in the form of functional units. The term "unit" as used herein is to be understood in its broadest possible sense; the objects used to implement the functions described by a given "unit" may be, for example, an application-specific integrated circuit (ASIC), a single circuit, a processor (shared, dedicated, or chipset) and memory that execute one or more software or firmware programs, a combinational logic circuit, and/or other suitable components that provide the described functionality.
The present embodiment also provides a computer storage medium, wherein the computer storage medium stores a computer program for electronic data exchange, and the computer program makes a computer execute part or all of the steps of any one of the movie sound effect processing methods described in the above method embodiments.
Embodiments of the present application also provide a computer program product, which includes a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to execute some or all of the steps of any one of the sound effect processing methods as described in the above method embodiments.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implementing, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of some interfaces, devices or units, and may be an electric or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may be implemented in the form of a software program module.
The integrated units, if implemented in the form of software program modules and sold or used as stand-alone products, may be stored in a computer-readable memory. Based on such an understanding, the technical solution of the present application in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a memory and including several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned memory includes various media capable of storing program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable memory, which may include: flash disk, ROM, RAM, magnetic or optical disk, and the like.
The foregoing embodiments have been described in detail to illustrate the principles and implementations of the present application, and the above description of the embodiments is only provided to help understand the method and core concept of the present application. Meanwhile, for a person skilled in the art, there may be variations in the specific embodiments and the application scope according to the idea of the present application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. A method for processing movie sound effect is characterized in that the method comprises the following steps:
determining a movie video to be played, and extracting video frame data and audio frame data in the movie video;
analyzing the video frame data to determine an indoor scene interval and an outdoor scene interval in the video frame data, and extracting an indoor time period corresponding to the indoor scene interval and an outdoor time period corresponding to the outdoor scene interval;
when the movie video is played to an indoor time period, the indoor 3D sound effect strategy is adopted for playing the audio frame data of the indoor time period, and when the movie video is played to an outdoor time period, the outdoor 3D sound effect strategy is adopted for playing the audio frame data of the outdoor time period, wherein the indoor 3D sound effect strategy comprises reducing the volume and adding an echo, and the outdoor 3D sound effect strategy comprises filtering the echo and limiting the maximum volume.
2. The method of claim 1,
the video frame data includes: video frames and time corresponding to the video frames;
the audio frame data includes: audio frames and times corresponding to the audio frames.
3. The method of claim 1, wherein analyzing the video frame data to determine the indoor scene interval and the outdoor scene interval in the video frame data specifically comprises:
transmitting each frame of video data in the video frame data to a trained classifier, executing a classification algorithm to process to obtain a plurality of video frames of an indoor scene and a plurality of video frames of an outdoor scene, determining the video frames with the continuous time longer than the preset time in the plurality of video frames of the indoor scene as an indoor scene interval, and determining the video frames with the continuous time longer than the preset time in the plurality of video frames of the outdoor scene as an outdoor scene interval.
4. The method of claim 1, wherein analyzing the video frame data to determine the indoor scene interval and the outdoor scene interval in the video frame data specifically comprises:
the method comprises the steps of transmitting each frame of video data in video frame data to a trained classifier, executing a classification algorithm to process the video data to obtain a plurality of video frames of an indoor scene and a plurality of video frames of an outdoor scene, determining the video frames of which the number of continuous frames is greater than a number threshold value in the plurality of video frames of the indoor scene as an indoor scene interval, and determining the video frames of which the number of continuous frames is greater than the number threshold value in the plurality of video frames of the outdoor scene as an outdoor scene interval.
5. The method of claim 1, wherein analyzing the video frame data to determine the indoor scene interval and the outdoor scene interval in the video frame data specifically comprises:
Each frame of video data in the video frame data is transmitted to a trained classifier and a classification algorithm is executed to obtain a plurality of video frames of an indoor scene and a plurality of video frames of an outdoor scene; video frames of the indoor scene whose number of consecutive frames is greater than a number threshold are determined as an indoor scene interval, and video frames of the outdoor scene whose number of consecutive frames is greater than the number threshold are determined as an outdoor scene interval; if the time interval between two adjacent indoor scene intervals is less than a preset time, the two adjacent indoor scene intervals are merged into one indoor scene interval, and if the time interval between two adjacent outdoor scene intervals is less than the preset time, the two adjacent outdoor scene intervals are merged into one outdoor scene interval.
6. The method of claim 1, further comprising:
determining a physical environment of an outdoor scene interval, if the physical environment is a ground environment, determining a first attenuation curve of a transmission target distance of the audio frame data in the air, and determining the volume of the audio data according to the relation between the first attenuation curve and the target distance.
7. The method of claim 1, further comprising:
determining a physical environment of an outdoor scene interval, if the physical environment is an underwater environment, determining a second attenuation curve of the transmission target distance of the audio frame data in the water, and determining the volume of the audio data according to the relation between the second attenuation curve and the target distance.
8. A movie sound effect processing apparatus, comprising:
the acquisition unit is used for determining a movie video to be played and extracting video frame data and audio frame data in the movie video;
the analysis unit is used for analyzing the video frame data to determine an indoor scene interval and an outdoor scene interval in the video frame data, and extracting an indoor time period corresponding to the indoor scene interval and an outdoor time period corresponding to the outdoor scene interval;
the playing unit is used for playing the audio frame data of the indoor time period by adopting an indoor 3D sound effect strategy when the movie video is played to the indoor time period, and playing the audio frame data of the outdoor time period by adopting an outdoor 3D sound effect strategy when the movie video is played to the outdoor time period, wherein the indoor 3D sound effect strategy comprises reducing the volume and adding an echo, and the outdoor 3D sound effect strategy comprises filtering the echo and limiting the maximum volume.
9. An electronic device comprising a processor, a memory, a communication interface, and one or more programs stored in the memory and configured to be executed by the processor, the programs comprising instructions for performing the steps in the method of any of claims 1-7.
10. A computer-readable storage medium, characterized in that a computer program for electronic data exchange is stored, wherein the computer program causes a computer to perform the method according to any one of claims 1-7.
CN201811209949.7A 2018-10-17 2018-10-17 Movie sound effect processing method and related product Active CN109286841B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811209949.7A CN109286841B (en) 2018-10-17 2018-10-17 Movie sound effect processing method and related product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811209949.7A CN109286841B (en) 2018-10-17 2018-10-17 Movie sound effect processing method and related product

Publications (2)

Publication Number Publication Date
CN109286841A CN109286841A (en) 2019-01-29
CN109286841B true CN109286841B (en) 2021-10-08

Family

ID=65177942

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811209949.7A Active CN109286841B (en) 2018-10-17 2018-10-17 Movie sound effect processing method and related product

Country Status (1)

Country Link
CN (1) CN109286841B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115022710B (en) * 2022-05-30 2023-09-19 咪咕文化科技有限公司 Video processing method, device and readable storage medium

Citations (2)

Publication number Priority date Publication date Assignee Title
CN107179908A (en) * 2017-05-16 2017-09-19 网易(杭州)网络有限公司 Audio method of adjustment, device, electronic equipment and computer-readable recording medium
CN107888843A (en) * 2017-10-13 2018-04-06 深圳市迅雷网络技术有限公司 Sound mixing method, device, storage medium and the terminal device of user's original content

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
EP3025461A4 (en) * 2013-07-21 2017-03-15 Wizedsp Ltd. Systems and methods using acoustic communication
CN104036789B (en) * 2014-01-03 2018-02-02 北京智谷睿拓技术服务有限公司 Multi-media processing method and multimedia device
CN108337556B (en) * 2018-01-30 2021-05-25 三星电子(中国)研发中心 Method and device for playing audio-video file

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
CN107179908A (en) * 2017-05-16 2017-09-19 网易(杭州)网络有限公司 Audio method of adjustment, device, electronic equipment and computer-readable recording medium
CN107888843A (en) * 2017-10-13 2018-04-06 深圳市迅雷网络技术有限公司 Sound mixing method, device, storage medium and the terminal device of user's original content

Also Published As

Publication number Publication date
CN109286841A (en) 2019-01-29

Similar Documents

Publication Publication Date Title
CN109413563B (en) Video sound effect processing method and related product
US9924205B2 (en) Video remote-commentary synchronization method and system, and terminal device
CN108924438B (en) Shooting control method and related product
CN109327795B (en) Sound effect processing method and related product
CN105487657A (en) Sound loudness determination method and apparatus
CN107656718A (en) A kind of audio signal direction propagation method, apparatus, terminal and storage medium
US10993063B2 (en) Method for processing 3D audio effect and related products
CN108966067B (en) Play control method and related product
CN109597481A (en) AR virtual portrait method for drafting, device, mobile terminal and storage medium
CN110290262B (en) Call method and terminal equipment
EP3660660A1 (en) Processing method for sound effect of recording and mobile terminal
CN106412681A (en) Bullet screen video live broadcasting method and device
CN109254752B (en) 3D sound effect processing method and related product
CN104375811A (en) Method and device for processing sound effects
CN111638779A (en) Audio playing control method and device, electronic equipment and readable storage medium
CN111370018B (en) Audio data processing method, electronic device and medium
CN108833779B (en) Shooting control method and related product
CN109121069B (en) 3D sound effect processing method and related product
CN110198421B (en) Video processing method and related product
CN108924705B (en) 3D sound effect processing method and related product
CN109104687B (en) Sound effect processing method and related product
CN108650466A (en) The method and electronic equipment of photo tolerance are promoted when a kind of strong light or reversible-light shooting portrait
CN103458114A (en) Method, equipment and terminal for switching multimedia steams
CN109286841B (en) Movie sound effect processing method and related product
CN114143696B (en) Sound box position adjusting method, audio rendering method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant