WO2017107309A1 - Control method, control device, terminal, and synchronous audio playback system - Google Patents

Publication number
WO2017107309A1
Authority
WO
WIPO (PCT)
Prior art keywords
terminal
multimedia
control
play
module
Prior art date
Application number
PCT/CN2016/075665
Other languages
French (fr)
Chinese (zh)
Inventor
罗炜
贾鑫
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2017107309A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/433 Content storage operation, e.g. storage operation in response to a pause request, caching operations
    • H04N21/4334 Recording operations
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302 Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307 Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4394 Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams

Definitions

  • The present invention relates to audio synchronous playback technology, and in particular to a control method, a control device, a terminal, and an audio synchronous playback system.
  • A user may wish to use multiple smart terminals to play the same music simultaneously for a surround sound effect.
  • Users who like music may also want to play the same music simultaneously on their respective smart terminals for interactive and social purposes.
  • Existing smart terminals can only achieve synchronous audio playback through manual operation, which requires users to operate precisely, for example by starting the music on every terminal at the same moment.
  • Moreover, because running speeds differ between smart terminals, or because an individual smart terminal may crash and restart, playback cannot stay strictly synchronized even if the music starts at the same time, which degrades the audiovisual effect.
  • Embodiments of the present invention aim to solve at least one of the technical problems existing in the prior art. To this end, embodiments of the present invention provide a control method, a control device, a terminal, and an audio synchronous playback system.
  • the first terminal or the second terminal comprises a mobile phone, a tablet or a wearable device.
  • the identifying step comprises:
  • a matching sub-step performs feature matching according to the feature value and the feature library to identify the multimedia.
  • the feature value comprises an acoustic feature or a linguistic feature
  • the acoustic features include melody features or scale features.
  • the multimedia is a song and the linguistic features include lyrics.
  • the feature library is stored at the second terminal or remote server.
  • the identifying step comprises:
  • the multimedia played by the first terminal is recorded as a recording segment
  • the extracting substep extracts the feature value by analyzing the recorded segment.
  • the identifying step comprises:
  • a judging sub-step determines whether the multimedia is successfully identified and, if not, returns to the recording sub-step.
  • the multimedia is a song
  • the audio information includes a song title, a lyrics, an author, a singer, or a genre.
  • the play status information includes a current play position, whether to pause, whether to fast forward, or whether to rewind.
  • the controlling step comprises:
  • a first control sub-step controls the second terminal to configure a play parameter according to the audio information, and to start playing the multimedia file according to the play parameter and the play status information;
  • the second control sub-step performs synchronous play control during playback.
  • the multimedia file is stored at the second terminal or stored at a remote server.
  • the second control sub-step includes:
  • the second control sub-step comprises the following grandchild steps:
  • the second control sub-step comprises the following grandchild steps:
  • the correlation includes the linearity of the speech spectrum or the number of identical syllables within a predetermined time.
  • the multimedia is a song
  • the play status information includes a play position of the lyrics
  • the second control sub-step includes the following grandchild steps:
  • the control method further includes:
  • An identification module configured to identify multimedia played by the first terminal to obtain corresponding audio information and play status information
  • A control module configured to control the second terminal to play the multimedia file corresponding to the audio information, according to the audio information and the play status information, so as to play the multimedia synchronously with the first terminal.
  • the first terminal or the second terminal comprises a mobile phone, a tablet or a wearable device.
  • the identification module comprises:
  • An extraction module configured to extract feature values of the multimedia
  • the matching module being configured to perform feature matching according to the feature value and the feature library to identify the multimedia.
  • the feature value comprises an acoustic feature or a linguistic feature
  • the acoustic features include melody features or scale features.
  • the multimedia is a song and the linguistic features include lyrics.
  • the feature library is stored at the second terminal or remote server.
  • the identification module comprises:
  • An admission module configured to record the multimedia played by the first terminal as a recording segment
  • the extraction module is configured to extract the feature value by analyzing the recorded segment.
  • the identification module comprises:
  • the judging module being configured to determine whether the multimedia is successfully identified and notifying the admission module when the multimedia is not successfully identified.
  • the multimedia is a song
  • the audio information includes a song title, a lyrics, an author, a singer, or a genre.
  • the play status information includes a current play position, whether to pause, whether to fast forward, or whether to rewind.
  • the control module comprises:
  • the acquisition module being configured to acquire the multimedia file according to the audio information
  • the first sub-control module is configured to control the second terminal to configure a play parameter according to the audio information, and start playing the multimedia file according to the play parameter and the play status information;
  • the second sub-control module being configured to perform synchronous play control during playback.
  • the multimedia file is stored at the second terminal or stored at a remote server.
  • the second sub-control module is configured to intermittently turn off the speaker of the second terminal, identify the multimedia played by the first terminal, determine whether the multimedia played by the first terminal and the second terminal is synchronized, and notify the identification module when it is not synchronized.
  • the second sub-control module is configured to capture the multimedia played by the first terminal and the second terminal, filter out the multimedia played by the second terminal, identify the multimedia played by the first terminal, determine whether the multimedia played by the first terminal and the second terminal is synchronized, and notify the identification module when it is not synchronized.
  • the second sub-control module is configured to record the multimedia played by the first terminal and the second terminal, compare the correlation of the multimedia played by the first terminal and the second terminal, determine according to the correlation whether the multimedia played by the first terminal and the second terminal is synchronized, and notify the identification module when it is not synchronized.
  • the correlation includes the linearity of the speech spectrum or the number of identical syllables within a predetermined time.
  • the multimedia is a song
  • the play status information includes a play position of the lyrics
  • the second sub-control module is configured to capture the multimedia played by the first terminal and the second terminal, identify the multimedia played by the first terminal and the second terminal to obtain the play position of the lyrics, determine according to the play position of the lyrics whether the multimedia played by the first terminal and the second terminal is synchronized, and notify the identification module when it is not synchronized.
  • control device comprises:
  • the request receiving module is configured to determine whether a request for synchronous play is received and notify the identification module when the request is received.
  • a terminal of an embodiment of the present invention includes the control device.
  • an audio synchronous playback system of an embodiment of the present invention includes the terminal of the embodiment of the present invention.
  • a computer storage medium is further provided, and the computer storage medium may store an execution instruction for executing the playback control method in the foregoing embodiment.
  • The control method, the control device, the terminal, and the audio synchronous playback system of the embodiments of the present invention use voice recognition technology to identify the multimedia played by the first terminal and automatically control the second terminal to play the corresponding multimedia file. Playback can therefore start in synchronization, and synchronization control can also be maintained during playback, so that synchronous multimedia playback is achieved.
  • In this way, the second terminal can be associated with the first terminal to form a network. This changes the earlier situation in which terminals were controlled independently or could only transfer files over a local area network, and realizes synchronous control of multimedia playback. First, better sound effects can be achieved, such as a surround sound effect at home; second, interactive and social purposes can be achieved.
  • FIG. 1 is a schematic flow chart of a control method according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of functional modules of an audio synchronous playback system according to an embodiment of the present invention.
  • FIG. 3 is a functional block diagram of an audio synchronous playback system in accordance with some embodiments of the present invention.
  • FIG. 4 is a flow chart of a control method of some embodiments of the present invention.
  • FIG. 5 is a schematic diagram of functional modules of an audio synchronous playback system according to some embodiments of the present invention.
  • FIG. 6 is a functional block diagram of an audio synchronous playback system in accordance with some embodiments of the present invention.
  • FIG. 7 is a flow chart of a control method of some embodiments of the present invention.
  • FIG. 8 is a schematic diagram of functional modules of an audio synchronous playback system according to some embodiments of the present invention.
  • FIG. 9 is a flow chart of a control method of some embodiments of the present invention.
  • FIG. 10 is a schematic diagram of functional modules of an audio synchronous playback system according to some embodiments of the present invention.
  • FIGS. 11-15 are schematic flow diagrams of a control method of some embodiments of the present invention.
  • FIG. 16 is a functional block diagram of an audio synchronous playback system in accordance with some embodiments of the present invention.
  • a control method of an embodiment of the present invention includes the following steps:
  • S10: Identify the multimedia played by the first terminal to obtain the corresponding audio information and play status information;
  • S20: Control the second terminal to play the multimedia file corresponding to the audio information, according to the audio information and the play status information, so as to play the multimedia synchronously with the first terminal.
  • the control device 100 of the embodiment of the present invention includes an identification module 10 and a control module 20 .
  • the control method of the embodiment of the present invention can be implemented by the control device 100 of the embodiment of the present invention, and can be applied to the audio synchronous playback system 1000.
  • the audio synchronous playback system 1000 can include a control device 100, a first terminal 200, and a second terminal 300.
  • the step S10 of the control method of the embodiment of the present invention may be implemented by the identification module 10, and the step S20 may be implemented by the control module 20. That is to say, the identification module 10 is configured to recognize the multimedia played by the first terminal 200 to obtain audio information and play status information.
  • the control module 20 is configured to control the second terminal 300 to play the multimedia file corresponding to the audio information according to the audio information and the playing state information to play the multimedia synchronously with the first terminal 200.
  • The control method, the control device 100, and the audio synchronous playback system 1000 of the embodiment of the present invention use voice recognition technology to identify the multimedia played by the first terminal 200 and automatically control the second terminal 300 to play the corresponding multimedia file. Playback can therefore start in synchronization, and synchronization control can also be maintained during playback, so that synchronous multimedia playback is achieved.
  • In this way, the second terminal 300 can be associated with the first terminal 200 to form a network. This changes the earlier situation in which terminals were controlled independently or could only transfer files over a local area network, and realizes synchronous control of multimedia playback. First, better sound effects, such as a three-dimensional sound effect, can be achieved; second, interactive and social purposes can be achieved.
  • the first terminal 200 or the second terminal 300 may be a communication terminal, an entertainment terminal, or a smart terminal having a multimedia playing function, such as a mobile phone, a tablet computer, a wearable device, a computer, or the like.
  • the wearable device can be a smart bracelet, a smart watch or smart glasses.
  • the first terminal 200 or the second terminal 300 is a product commonly carried by people in modern society and has a real demand for multimedia synchronous playback functions.
  • the number of the second terminals 300 is one.
  • the number of the second terminals 300 may be multiple, and is not limited to the above embodiments. In these embodiments, each of the second terminals 300 may apply the control device 100 of the embodiment of the present invention.
  • the types of the first terminal 200 and the second terminal 300 may be the same or different.
  • the first terminal 200 may be a mobile phone
  • the second terminal 300 is also a mobile phone, that is, the two mobile phones are associated to form an audio synchronous playback system 1000.
  • the first terminal 200 can be a computer
  • the second terminal 300 has two, which can be a mobile phone and a tablet computer, respectively.
  • control device 100 can be integrated in whole or in part in the second terminal 300.
  • control device 100 can also be set independently of the second terminal 300.
  • the second terminal 300 can be considered to include the control device 100.
  • the control device 100 may also share some or all of its components with the second terminal 300.
  • the control device 100 is an application installed in the second terminal 300 and implements a corresponding function when the second terminal 300 operates.
  • The control device 100 can also be applied to the first terminal 200; that is, the first terminal 200 can also act as a second terminal 300 and follow another first terminal 200 for synchronous multimedia playback. In other words, the roles of the first terminal 200 and the second terminal 300 can be switched as needed and are not limited to any of the embodiments of the present invention.
  • step S10 includes sub-steps:
  • S14 Perform feature matching according to the feature value and the feature library to identify the multimedia.
  • the identification module 10 includes an extraction module 12 and a matching module 14 .
  • Step S12 can be implemented by the extraction module 12, and step S14 can be implemented by the matching module 14.
  • the extraction module 12 is configured to extract feature values of the multimedia
  • the matching module 14 is configured to perform feature matching based on the feature values and the feature library to identify the multimedia.
  • the feature value comprises an acoustic feature or a linguistic feature.
  • The multimedia can be pure music, a song, or the like.
  • For pure music, the feature values are acoustic features, such as melody features or scale features.
  • A song also carries language information such as lyrics, so it can additionally be identified by linguistic features; that is, the linguistic features include lyrics.
  • Acoustic features and linguistic features can also be used together to identify the multimedia.
  • The feature library should be established in advance: various multimedia can be analyzed and their features recorded.
  • Building the feature library can also be a machine learning process, in which the library is built and continuously updated by training the machine.
  • the feature library is stored in the second terminal 300 or stored in the remote server 400.
  • When the feature library is stored in the second terminal 300, the speed of voice recognition can be improved.
  • When the feature library is stored in the remote server 400, the resources occupied on the second terminal 300 can be reduced.
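  • The matching of step S14 can be illustrated with a minimal sketch. It assumes, purely for illustration, that a feature value is a numeric vector and that the feature library maps a title to a reference vector; the cosine-similarity measure, the `threshold` value, and the function names are assumptions and are not taken from the embodiments.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_feature(feature, library, threshold=0.9):
    """Return the best-matching title, or None if nothing is similar enough.

    `library` maps a title to a reference feature vector (hypothetical layout).
    """
    best_title, best_score = None, threshold
    for title, reference in library.items():
        score = cosine_similarity(feature, reference)
        if score >= best_score:
            best_title, best_score = title, score
    return best_title
```

  • Returning `None` when no entry reaches the threshold corresponds to the case in which the multimedia is not successfully identified.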
  • the audio sync play system 1000b includes a server 400.
  • step S10 includes sub-steps:
  • S11: Record the multimedia played by the first terminal 200 as a recorded segment. In step S12, the feature value is then extracted by analyzing the recorded segment.
  • step S10 further includes the substeps:
  • the identification module 10 includes an admission module 11 and a judging module 15.
  • Step S11 can be implemented by the admission module 11, and step S15 can be implemented by the judging module 15.
  • the admission module 11 is configured to record the multimedia played by the first terminal 200 as a recorded segment
  • the judging module 15 is configured to determine whether the multimedia is successfully identified.
  • the admission module 11 can set a predetermined sampling rate to record the multimedia played by the first terminal 200.
  • the extraction module 12 uses a specific extraction algorithm to extract feature values of the recorded segments, such as melody features, scale features, or lyrics.
  • the matching module 14 searches for audio information matching the feature value by performing a search comparison in the feature library after obtaining the feature value. If the audio information matching the feature value is found, the judging module 15 judges that the multimedia recognition is successful; otherwise, the judging module 15 judges that the multimedia recognition has failed.
  • If the recorded segment contains too much noise, the extraction module 12 may not be able to extract the feature values accurately, causing the matching module 14 to fail to find a match.
  • In that case, the judging module 15 should notify the admission module 11 to record the multimedia again to obtain a different recorded segment, and the extraction module 12 re-extracts the feature values. This cycle repeats until the multimedia is successfully identified.
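  • The record, extract, match, and judge cycle can be sketched as a retry loop. The three callables stand in for the admission, extraction, and matching modules and are hypothetical; the `max_attempts` bound is an added assumption so that the sketch always terminates, whereas the text describes repeating until identification succeeds.

```python
def identify_with_retry(record, extract, match, max_attempts=5):
    """Repeat record -> extract -> match until a match is found.

    `record`, `extract` and `match` are caller-supplied callables
    (hypothetical stand-ins for modules 11, 12 and 14).
    Returns the matched audio information, or None after `max_attempts`.
    """
    for _ in range(max_attempts):
        segment = record()          # step S11: record a new segment
        feature = extract(segment)  # step S12: extract the feature value
        result = match(feature)     # step S14: look it up in the library
        if result is not None:      # step S15: identification succeeded
            return result
    return None
```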
  • a computer storage medium is further provided, and the computer storage medium may store an execution instruction for executing the method in the foregoing embodiment.
  • the multimedia is a song and the audio information includes a song title, lyrics, author, singer, or genre.
  • the play status information includes the current play position, whether playback is paused, whether it is fast-forwarding, or whether it is rewinding.
  • the audio information may include:
  • Lyrics: Get up! People who don't want to be slaves! Make our flesh and blood into our new Great Wall! When the Chinese nation reaches its most dangerous time, everyone is forced to let out a final roar. Stand up! Stand up! Stand up! We are all of one heart, braving the enemy's artillery, marching forward! Braving the enemy's gunfire, march on! March on! March on! On!
  • the play status information may include a play position, for example, the play position is 01 minutes 22 seconds (represented as 01:22).
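  • The audio information and play status information described above could be held in two small records, as in the following sketch. The class and field names are illustrative assumptions; only the listed items (title, lyrics, author, singer, genre, play position, pause, fast forward, rewind) and the `MM:SS` notation come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AudioInfo:
    title: str
    lyrics: Optional[str] = None
    author: Optional[str] = None
    singer: Optional[str] = None
    genre: Optional[str] = None


@dataclass
class PlayStatus:
    position_seconds: int
    paused: bool = False
    fast_forward: bool = False
    rewind: bool = False


def parse_position(text):
    """Convert an 'MM:SS' play position into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def format_position(total_seconds):
    """Convert seconds back into the 'MM:SS' notation."""
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes:02d}:{seconds:02d}"
```

  • For example, `PlayStatus(parse_position("01:22"))` represents playback at 82 seconds that is neither paused, fast-forwarding, nor rewinding.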
  • step S20 includes the following sub-steps:
  • S24 Control the second terminal to configure a play parameter according to the audio information, and start playing the multimedia file according to the play status information;
  • The control module 20 of the embodiment of the present invention may include an acquisition module 22, a first sub-control module 24, and a second sub-control module 26.
  • Step S21 can be implemented by the acquisition module 22, step S24 can be implemented by the first sub-control module 24, and step S26 can be implemented by the second sub-control module 26.
  • the acquisition module 22 is configured to retrieve the multimedia file based on the audio information.
  • the first sub-control module 24 is configured to control the second terminal to configure the play parameters according to the audio information and start playing the multimedia file according to the play status information.
  • the second sub-control module 26 is arranged to perform synchronized play control during playback.
  • After the audio information is obtained, the acquisition module 22 may control the second terminal 300 to search for and acquire the corresponding multimedia file. For example, given the audio information above, the second terminal 300 may be controlled to search for and acquire a file of the song "The March of the Volunteers".
  • The multimedia file can be stored in the second terminal 300 or at the remote server 400.
  • The acquisition module 22 can control the second terminal 300 to search locally first. If the multimedia file is found locally, it does not need to be downloaded, which speeds up playback. If the file is not found locally, the acquisition module 22 may control the second terminal 300 to search the remote server 400. Once the file is found on the remote server 400, it can be downloaded to the second terminal 300 or played online. Keeping the multimedia file on the remote server 400 reduces the resources occupied on the second terminal 300.
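  • The local-first search order can be sketched as follows. The two dictionaries stand in for the local storage of the second terminal and the catalogue of the remote server; their layout and the function name are assumptions made for illustration only.

```python
def locate_file(title, local_files, remote_files):
    """Return (source, location) for `title`, searching locally first.

    `local_files` and `remote_files` map a title to a path or URL
    (hypothetical layout). Returns None when the file is found nowhere.
    """
    if title in local_files:
        return ("local", local_files[title])
    if title in remote_files:
        return ("remote", remote_files[title])
    return None
```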
  • setting the play parameters may be setting an equalizer, for example, the equalizer includes different play modes: classical music, rock music, heavy metal music, jazz music, romantic music, or country music.
  • the playing mode of the equalizer can be set to classical music.
  • Playing the multimedia file according to the play status information may be based on the play position. For example, if the play position is 01:22, the second terminal 300 should be controlled to start playing the multimedia file from 01:22.
  • playing the multimedia file according to the playing state information should also play the lyrics at the same time.
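  • Configuring the play parameters and the start position can be sketched as below. The equalizer modes are the ones listed above; choosing the mode from the genre field, the fallback `default_mode`, and the returned dictionary layout are assumptions for illustration.

```python
EQUALIZER_MODES = {"classical", "rock", "heavy metal", "jazz", "romantic", "country"}


def configure_playback(genre, position_seconds, default_mode="classical"):
    """Pick an equalizer mode from the genre and carry over the start position.

    Falls back to `default_mode` (an assumed default) when the genre is
    missing or does not correspond to a known equalizer mode.
    """
    mode = default_mode
    if genre and genre.lower() in EQUALIZER_MODES:
        mode = genre.lower()
    return {"equalizer_mode": mode, "start_seconds": position_seconds}
```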
  • the control module 20 needs to perform synchronous control during the entire multimedia playback process to achieve true synchronous playback.
  • sub-step S26 includes a grandchild step:
  • S266 Determine whether the multimedia played by the first terminal 200 and the second terminal 300 is synchronized, and if no, return to step S10.
  • Grandchild step S262 can be implemented by the second sub-control module 26 controlling the second terminal 300, grandchild step S264 can be implemented by the second sub-control module 26 controlling the identification module 10, and grandchild step S266 can also be implemented by the second sub-control module 26.
  • Steps S262-S266 can be performed continuously to ensure synchronization control of the multimedia playback. However, this may require more resources and may affect the multimedia playback of the second terminal 300. Therefore, in this embodiment, steps S262-S266 are performed at predetermined time intervals, for example every 10 seconds.
  • The determination may be based on a comparison of playback positions. For example, if the playback position of the recognized multimedia is 01:56, the first sub-control module 24 may obtain the playback position of the multimedia played by the second terminal 300: if it is 01:56, playback is judged to be synchronized, and if it is 02:01, playback is judged to be out of sync.
  • When playback is judged to be synchronized, the second terminal 300 can continue to play the multimedia synchronously with the first terminal 200; when it is judged to be out of sync, the method should return to the control step and re-synchronize playback.
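  • The position comparison can be sketched as a single check on positions expressed in seconds (01:56 is 116 seconds and 02:01 is 121 seconds). The `tolerance_seconds` parameter is an assumption; the text does not state how large a difference still counts as synchronized.

```python
def is_synchronized(first_position, second_position, tolerance_seconds=1):
    """Judge synchronization by comparing two play positions in seconds.

    `tolerance_seconds` is an assumed allowance for measurement jitter.
    """
    return abs(first_position - second_position) <= tolerance_seconds
```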
  • sub-step S26 includes the following grandchild steps:
  • S262a: Record the multimedia played by the first terminal 200 and the second terminal 300;
  • S266a: Determine whether the multimedia played by the first terminal 200 and the second terminal 300 is synchronized, and if not, return to step S10.
  • Grandchild step S262a can be implemented by the second sub-control module 26 controlling the second terminal 300, grandchild step S264a can be implemented by the second sub-control module 26 controlling the identification module 10, and grandchild step S266a can also be implemented by the second sub-control module 26.
  • Steps S262a-S266a are similar to steps S262-S266, except that the recorded segment in step S262a includes both the multimedia played by the first terminal 200 and the multimedia played by the second terminal 300.
  • In step S264a, the multimedia played by the second terminal 300 can be obtained from the first sub-control module 24 and filtered out of the recorded segment, leaving the multimedia played by the first terminal 200.
  • Step S266a is substantially the same as step S266.
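  • One possible way to filter the second terminal's own playback out of the recorded segment is a least-squares subtraction, sketched below. This is an illustrative stand-in and not the filtering method of the embodiments: it assumes both signals are equal-length lists of samples that are already time-aligned, and it recovers the first terminal's signal exactly only when that signal is uncorrelated with the second terminal's.

```python
def remove_own_playback(recorded, own):
    """Subtract the best-fit scaled copy of `own` from `recorded`.

    The gain is the least-squares estimate sum(r*o) / sum(o*o), which
    accounts for the unknown loudness of `own` in the recording.
    """
    energy = sum(o * o for o in own)
    if energy == 0:
        return list(recorded)
    gain = sum(r * o for r, o in zip(recorded, own)) / energy
    return [r - gain * o for r, o in zip(recorded, own)]
```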
  • sub-step S26 includes the following grandchild steps:
  • S262b: Record the multimedia played by the first terminal 200 and the second terminal 300;
  • S266b: Determine, according to the correlation, whether the multimedia played by the first terminal 200 and the second terminal 300 is synchronized, and if not, return to step S10.
  • Grandchild step S262b can be implemented by the second sub-control module 26 controlling the second terminal 300, and grandchild steps S264b and S266b can be implemented by the second sub-control module 26.
  • The grandchild step makes the judgment by comparing the linearity of the speech spectrum or the number of identical syllables within a preset time.
  • A syllable is the most natural unit of speech that can be perceived by hearing; it is formed by combining one or several phonemes according to certain rules.
  • In Chinese, one character is one syllable, and each syllable consists of three parts: initial, final, and tone.
  • In English, a vowel phoneme can form a syllable by itself, and a vowel phoneme combined with one or several consonant phonemes can also form a syllable.
  • The rule for determining syllables can be defined according to actual needs before the judgment.
  • the multimedia played by the first terminal 200 is synchronized with the multimedia played by the second terminal 300.
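  • Counting identical syllables within a preset time can be sketched as a position-by-position comparison of two syllable sequences recognized over the same time window. The sequence representation, the `min_ratio` threshold, and the function names are assumptions; the text leaves the syllable determination rule to be defined according to actual needs.

```python
def count_identical_syllables(first, second):
    """Count positions at which both sequences contain the same syllable."""
    return sum(1 for a, b in zip(first, second) if a == b)


def is_synchronized_by_syllables(first, second, min_ratio=0.8):
    """Judge synchronization from the share of identical syllables.

    `min_ratio` is an assumed threshold on matches per compared position.
    """
    if not first or not second:
        return False
    compared = min(len(first), len(second))
    return count_identical_syllables(first, second) / compared >= min_ratio
```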
  • sub-step S26 includes:
  • S262c: Capture the multimedia played by the first terminal 200 and the second terminal 300;
  • S264c identify the multimedia played by the first terminal 200 and the second terminal 300;
  • S266c Determine, according to the playing state information, whether the multimedia played by the first terminal and the second terminal is synchronized, and if not, return to the identifying step.
  • Grandchild step S262c can be implemented by the second sub-control module 26 controlling the second terminal 300, grandchild step S264c can be implemented by the second sub-control module 26 controlling the identification module 10, and grandchild step S266c can also be implemented by the second sub-control module 26.
  • the playback position of the playback status information can be utilized for comparison. Additionally, in some embodiments, the playback position of the lyrics can also be utilized for comparison.
  • The control method further includes:
  • Step S00: Determine whether a request for synchronous playback is received, and if yes, proceed to step S10.
  • The audio synchronous playback system 1000d of some embodiments is substantially the same as the audio synchronous playback system 1000, except that the control device 100d of the audio synchronous playback system 1000d differs from the control device 100 in that it further includes a request receiving module 30.
  • Step S00 can be implemented by the request receiving module 30; that is, the request receiving module 30 is configured to determine whether a request for synchronous playback is received. If yes, the process proceeds to step S10; if not, the determination is repeated.
  • For portions of the control device 100 or 100d and of the audio synchronous playback systems 1000 to 1000d of the embodiments of the present invention that are not described here, refer to the corresponding portions of the control method of the above embodiments; they are not described in detail again.
  • The terms “first” and “second” are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated.
  • Features defined by “first” or “second” may explicitly or implicitly include one or more of the described features.
  • “A plurality” means two or more unless specifically defined otherwise.
  • The terms “installed” and “connected” should be understood broadly: a connection may, for example, be fixed, detachable, or integral; it may be mechanical or electrical, or the components may communicate with each other; it may be direct or indirect through an intermediate medium; and it may be an internal connection between two components or an interaction between two components.
  • Unless otherwise expressly specified and defined, the first feature being “on” or “below” the second feature may include the first and second features being in direct contact, and may also include the first and second features not being in direct contact but being in contact through an additional feature between them.
  • The first feature being “above” the second feature includes the first feature being directly above or diagonally above the second feature, or merely indicates that the first feature is at a higher level than the second feature.
  • The first feature being “below” the second feature includes the first feature being directly below or diagonally below the second feature, or merely indicates that the first feature is at a lower level than the second feature.
  • A “computer-readable medium” can be any apparatus that can contain, store, communicate, propagate, or transport a program for use by, or in conjunction with, an instruction execution system, apparatus, or device.
  • Computer-readable media include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), a fiber optic device, and a portable compact disc read-only memory (CD-ROM).
  • The computer-readable medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in computer memory.
  • Portions of the embodiments of the invention may be implemented in hardware, software, firmware, or a combination thereof.
  • Multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system.
  • For example, if implemented in hardware, as in another embodiment, they can be implemented by any one or a combination of the following techniques well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
  • The functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist physically separately, or two or more units may be integrated into one module.
  • The above integrated modules can be implemented in the form of hardware or in the form of software functional modules.
  • The integrated modules, if implemented in the form of software functional modules and sold or used as stand-alone products, may also be stored in a computer-readable storage medium.
  • The above-mentioned storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like.
  • Voice recognition technology is used to identify the multimedia played by the first terminal, and the second terminal is automatically controlled to play the corresponding multimedia file, so that the start of multimedia playback can be synchronized and synchronization control can also be performed during playback, thereby realizing synchronous multimedia playback.

Abstract

A control method, control device, terminal, and synchronous audio playback system. The method comprises: first, identifying a multimedia object being played by a first terminal to obtain corresponding audio information and playback state information (S10); and then, controlling a second terminal to play, according to the audio information and the playback state information, a multimedia file corresponding to the audio information, so as to synchronously play the multimedia object with the first terminal (S20). By employing an audio identification technique to identify the multimedia object being played by the first terminal, and automatically controlling the second terminal to play a corresponding multimedia file, the control method, control device, terminal, and synchronous audio playback system can synchronize a start of multimedia playback, and achieve synchronous control during the playback process, thus realizing synchronous multimedia playback.

Description

Control method, control device, terminal and audio synchronous playback system
Technical field
The present invention relates to audio synchronous playback technology, and in particular to a control method, a control device, a terminal, and an audio synchronous playback system.
Background art
With the development of smart terminals and the Internet, users' demand for multimedia services keeps growing. For example, a user may wish to use multiple smart terminals to play the same music simultaneously to achieve a surround sound effect. In addition, users who enjoy music may also wish to play the same music simultaneously on their respective smart terminals for interactive and social purposes.
However, existing smart terminals still rely on manual operation to achieve synchronous audio playback, which requires the users to operate precisely, for example to start the music at the same moment so that playback begins at the same time. Even so, the running speeds of the smart terminals may differ, or an individual smart terminal may crash and restart, so that even if the music starts at the same time, playback cannot remain strictly synchronized, which degrades the audiovisual effect.
Summary of the invention
Embodiments of the present invention aim to solve at least one of the technical problems existing in the prior art. To this end, embodiments of the present invention provide a control method, a control device, a terminal, and an audio synchronous playback system.
The control method of the embodiment of the present invention includes the following steps:
an identifying step of identifying the multimedia played by a first terminal to obtain corresponding audio information and playback state information; and
a controlling step of controlling a second terminal to play, according to the audio information and the playback state information, a multimedia file corresponding to the audio information, so as to play the multimedia synchronously with the first terminal.
In some embodiments, the first terminal or the second terminal comprises a mobile phone, a tablet computer, or a wearable device.
In some embodiments, the identifying step comprises:
an extracting sub-step of extracting feature values of the multimedia; and
a matching sub-step of performing feature matching according to the feature values and a feature library to identify the multimedia.
In some embodiments, the feature values comprise acoustic features or linguistic features,
the acoustic features comprise melody features or scale features, and
the multimedia is a song, and the linguistic features comprise lyrics.
In some embodiments, the feature library is stored on the second terminal or on a remote server.
In some embodiments, the identifying step comprises:
a recording sub-step of recording the multimedia played by the first terminal as a recorded segment;
wherein the extracting sub-step extracts the feature values by analyzing the recorded segment.
In some embodiments, the identifying step comprises:
a judging sub-step of judging whether the multimedia is successfully identified, and if not, returning to the recording sub-step.
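The record, extract, match, and judge loop described above can be sketched as follows. This is only an illustration: the function name `identify_with_retry`, the three callables, and the `max_attempts` limit are assumptions introduced here (the text itself simply returns to the recording sub-step on failure and sets no limit).

```python
from typing import Callable, Optional, Sequence


def identify_with_retry(
    record: Callable[[], bytes],
    extract: Callable[[bytes], Sequence[float]],
    match: Callable[[Sequence[float]], Optional[str]],
    max_attempts: int = 5,  # assumption: the source does not bound the retries
) -> Optional[str]:
    """Repeat record -> extract -> match until a match is found or attempts run out."""
    for _ in range(max_attempts):
        segment = record()            # recording sub-step
        features = extract(segment)   # extracting sub-step
        title = match(features)       # matching sub-step
        if title is not None:         # judging sub-step: identified successfully
            return title
        # judging sub-step: not identified, so return to the recording sub-step
    return None
```

Bounding the loop is a design choice made for the sketch so that a terminal that never hears a recognizable segment does not spin forever.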
In some embodiments, the multimedia is a song, and the audio information comprises a song title, lyrics, an author, a singer, or a genre.
In some embodiments, the playback state information comprises a current playback position, whether playback is paused, whether playback is fast-forwarding, or whether playback is rewinding.
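One possible way to hold the audio information and the playback state information is a pair of small record types. The class and field names below (`AudioInfo`, `PlaybackState`, `position_s`, and so on) are illustrative assumptions; the text lists only the kinds of information, not a data layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AudioInfo:
    """Audio information obtained by identifying the multimedia (a song here)."""
    title: str
    lyrics: Optional[str] = None
    author: Optional[str] = None
    singer: Optional[str] = None
    genre: Optional[str] = None


@dataclass(frozen=True)
class PlaybackState:
    """Playback state information of the first terminal."""
    position_s: float           # current playback position, in seconds (assumed unit)
    paused: bool = False
    fast_forward: bool = False
    rewind: bool = False
```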
In some embodiments, the controlling step comprises:
an acquiring sub-step of acquiring the multimedia file according to the audio information;
a first control sub-step of controlling the second terminal to configure playback parameters according to the audio information and to start playing the multimedia file according to the playback parameters and the playback state information; and
a second control sub-step of performing synchronous playback control during playback.
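As a rough sketch of how the first control sub-step might use the playback state information, the second terminal can start at the identified position advanced by the time that has passed since the audio was captured. This compensation rule, the function name `start_position`, and all of its parameters are assumptions made for illustration; the text does not say how the start position is derived.

```python
from typing import Optional


def start_position(
    identified_position_s: float,
    captured_at_s: float,
    now_s: float,
    paused: bool = False,
    duration_s: Optional[float] = None,
) -> float:
    """Estimate where the second terminal should start playing, in seconds."""
    if paused:
        # The first terminal is not advancing, so no compensation is needed.
        return identified_position_s
    elapsed = max(0.0, now_s - captured_at_s)   # time spent identifying and fetching
    position = identified_position_s + elapsed
    if duration_s is not None:
        position = min(position, duration_s)    # never seek past the end of the file
    return position
```

For example, a segment identified at 30.0 s that was captured 2.5 s ago would give a start position of 32.5 s.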
In some embodiments, the multimedia file is stored on the second terminal or on a remote server.
In some embodiments, the second control sub-step comprises:
intermittently turning off the speaker of the second terminal;
identifying the multimedia played by the first terminal; and
judging whether the multimedia played by the first terminal and the second terminal is synchronized, and if not, returning to the identifying step.
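The intermittent-mute check can be sketched as a single probe that silences the second terminal, listens to the first terminal, and then restores the speaker. The `SecondTerminal` interface, the `listen_position_s` callable, and the `tolerance_s` threshold are hypothetical names introduced for this sketch; the text does not define a synchronization tolerance.

```python
from typing import Callable, Protocol


class SecondTerminal(Protocol):
    """Hypothetical minimal interface of the second terminal."""
    def set_muted(self, muted: bool) -> None: ...
    def position_s(self) -> float: ...


def in_sync_while_muted(
    terminal: SecondTerminal,
    listen_position_s: Callable[[], float],
    tolerance_s: float = 0.1,  # assumption: acceptable drift in seconds
) -> bool:
    """Mute the second terminal, identify the first terminal's position, compare."""
    terminal.set_muted(True)            # turn the speaker off so only terminal 1 is heard
    try:
        first_position = listen_position_s()
    finally:
        terminal.set_muted(False)       # always turn the speaker back on
    return abs(first_position - terminal.position_s()) <= tolerance_s
```

A caller would run this probe at intervals and go back to the identifying step whenever it returns `False`.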
In some embodiments, the second control sub-step comprises the following grandchild steps:
recording the multimedia played by the first terminal and the second terminal;
filtering out the multimedia played by the second terminal and then identifying the multimedia played by the first terminal; and
judging whether the multimedia played by the first terminal and the second terminal is synchronized, and if not, returning to the identifying step.
In some embodiments, the second control sub-step comprises the following grandchild steps:
recording the multimedia played by the first terminal and the second terminal;
comparing the degree of correlation of the multimedia played by the first terminal and the second terminal; and
judging, according to the degree of correlation, whether the multimedia played by the first terminal and the second terminal is synchronized, and if not, returning to the identifying step.
In some embodiments, the degree of correlation comprises the linearity of the speech spectrum or the number of identical syllables within a preset time.
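The "number of identical syllables within a preset time" measure can be illustrated with a small sketch. It assumes that the syllables heard from each terminal during the preset time have already been recognized and are aligned by index, and it uses a `min_ratio` threshold; both the alignment and the threshold are assumptions, not details given in the text.

```python
from typing import Sequence


def identical_syllable_count(first: Sequence[str], second: Sequence[str]) -> int:
    """Count index-aligned positions where both terminals produced the same syllable."""
    return sum(1 for a, b in zip(first, second) if a == b)


def is_synchronized(
    first: Sequence[str],
    second: Sequence[str],
    min_ratio: float = 0.8,  # assumption: share of syllables that must coincide
) -> bool:
    """Judge synchronization from the syllables recognized within the preset time."""
    compared = min(len(first), len(second))
    if compared == 0:
        return False
    return identical_syllable_count(first, second) / compared >= min_ratio
```

When one terminal lags by a syllable, the aligned sequences stop coinciding and the count drops sharply, which is what makes the measure usable as a synchronization test.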
In some embodiments, the multimedia is a song, the playback state information comprises the playback position of the lyrics, and the second control sub-step comprises the following grandchild steps:
recording the multimedia played by the first terminal and the second terminal;
identifying the multimedia played by the first terminal and the second terminal to obtain the corresponding playback positions of the lyrics; and
judging, according to the playback positions of the lyrics, whether the multimedia played by the first terminal and the second terminal is synchronized, and if not, returning to the identifying step.
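A minimal sketch of the lyric-position comparison follows. It assumes the full lyrics are available as one string and that each terminal's recognized lyric fragment can be located in it by substring search; the function names and the `max_offset_chars` tolerance are hypothetical.

```python
def lyric_position(full_lyrics: str, heard: str) -> int:
    """Return the character offset of the heard fragment, or -1 if it is not found."""
    return full_lyrics.find(heard)


def lyrics_in_sync(
    full_lyrics: str,
    heard_first: str,
    heard_second: str,
    max_offset_chars: int = 2,  # assumption: allowed difference in lyric position
) -> bool:
    """Compare the lyric positions recognized from the two terminals."""
    first = lyric_position(full_lyrics, heard_first)
    second = lyric_position(full_lyrics, heard_second)
    if first < 0 or second < 0:
        return False  # one fragment could not be located, so treat as not synchronized
    return abs(first - second) <= max_offset_chars
```

Because `str.find` returns the first occurrence, repeated lines such as a chorus are ambiguous in this sketch; a practical version would need extra context to tell the repetitions apart.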
In some embodiments, the control method further comprises:
judging whether a request for synchronous playback is received, and if so, proceeding to the identifying step.
The control device of the embodiment of the present invention comprises:
an identification module configured to identify the multimedia played by a first terminal to obtain corresponding audio information and playback state information; and
a control module configured to control a second terminal to play, according to the audio information and the playback state information, a multimedia file corresponding to the audio information, so as to play the multimedia synchronously with the first terminal.
In some embodiments, the first terminal or the second terminal comprises a mobile phone, a tablet computer, or a wearable device.
In some embodiments, the identification module comprises:
an extraction module configured to extract feature values of the multimedia; and
a matching module configured to perform feature matching according to the feature values and a feature library to identify the multimedia.
In some embodiments, the feature values comprise acoustic features or linguistic features,
the acoustic features comprise melody features or scale features, and
the multimedia is a song, and the linguistic features comprise lyrics.
In some embodiments, the feature library is stored on the second terminal or on a remote server.
In some embodiments, the identification module comprises:
a recording module configured to record the multimedia played by the first terminal as a recorded segment;
wherein the extraction module is configured to extract the feature values by analyzing the recorded segment.
In some embodiments, the identification module comprises:
a judging module configured to judge whether the multimedia is successfully identified and to notify the recording module when the multimedia is not successfully identified.
In some embodiments, the multimedia is a song, and the audio information comprises a song title, lyrics, an author, a singer, or a genre.
In some embodiments, the playback state information comprises a current playback position, whether playback is paused, whether playback is fast-forwarding, or whether playback is rewinding.
In some embodiments, the control module comprises:
an acquisition module configured to acquire the multimedia file according to the audio information;
a first sub-control module configured to control the second terminal to configure playback parameters according to the audio information and to start playing the multimedia file according to the playback parameters and the playback state information; and
a second sub-control module configured to perform synchronous playback control during playback.
In some embodiments, the multimedia file is stored on the second terminal or on a remote server.
In some embodiments, the second sub-control module is configured to intermittently turn off the speaker of the second terminal, identify the multimedia played by the first terminal, judge whether the multimedia played by the first terminal and the second terminal is synchronized, and notify the identification module when it is not.
In some embodiments, the second sub-control module is configured to record the multimedia played by the first terminal and the second terminal, filter out the multimedia played by the second terminal and then identify the multimedia played by the first terminal, judge whether the multimedia played by the first terminal and the second terminal is synchronized, and notify the identification module when it is not.
In some embodiments, the second sub-control module is configured to record the multimedia played by the first terminal and the second terminal, compare the degree of correlation of the multimedia played by the first terminal and the second terminal, judge according to the degree of correlation whether the multimedia played by the first terminal and the second terminal is synchronized, and notify the identification module when it is not.
In some embodiments, the degree of correlation comprises the linearity of the speech spectrum or the number of identical syllables within a preset time.
In some embodiments, the multimedia is a song, the playback state information comprises the playback position of the lyrics, and the second sub-control module is configured to record the multimedia played by the first terminal and the second terminal, identify the multimedia played by the first terminal and the second terminal to obtain the playback positions of the lyrics, judge according to the playback positions of the lyrics whether the multimedia played by the first terminal and the second terminal is synchronized, and notify the identification module when it is not.
In some embodiments, the control device comprises:
a request receiving module configured to judge whether a request for synchronous playback is received and to notify the identification module when the request is received.
The terminal of the embodiment of the present invention comprises the control device.
The audio synchronous playback system of the embodiment of the present invention comprises the terminal.
An embodiment of the present invention further provides a computer storage medium, which may store execution instructions for executing the playback control method of the foregoing embodiments.
Because the control method, control device, terminal, and audio synchronous playback system of the embodiments of the present invention use voice recognition technology to identify the multimedia played by the first terminal and automatically control the second terminal to play the corresponding multimedia file, the start of multimedia playback can be synchronized and synchronization control can also be performed during playback, thereby realizing synchronous multimedia playback.
In this way, the second terminal can be networked with the first terminal, changing the past situation in which terminals were controlled independently or could only transfer files over a local area network, and realizing synchronous control of multimedia playback. As a result, first, better sound can be achieved, for example a surround sound effect at home; second, interactive and social purposes can be served.
Additional aspects and advantages of the embodiments of the present invention will be given in part in the following description, and in part will become apparent from the following description or be learned through practice of the embodiments of the present invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the embodiments of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic flowchart of a control method according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of the functional modules of an audio synchronous playback system according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of the functional modules of an audio synchronous playback system according to some embodiments of the present invention.
FIG. 4 is a schematic flowchart of a control method according to some embodiments of the present invention.
FIG. 5 is a schematic diagram of the functional modules of an audio synchronous playback system according to some embodiments of the present invention.
FIG. 6 is a schematic diagram of the functional modules of an audio synchronous playback system according to some embodiments of the present invention.
FIG. 7 is a schematic flowchart of a control method according to some embodiments of the present invention.
FIG. 8 is a schematic diagram of the functional modules of an audio synchronous playback system according to some embodiments of the present invention.
FIG. 9 is a schematic flowchart of a control method according to some embodiments of the present invention.
FIG. 10 is a schematic diagram of the functional modules of an audio synchronous playback system according to some embodiments of the present invention.
FIGS. 11-15 are schematic flowcharts of control methods according to some embodiments of the present invention.
FIG. 16 is a schematic diagram of the functional modules of an audio synchronous playback system according to some embodiments of the present invention.
Detailed description
Embodiments of the present invention are described in detail below, and examples of the embodiments are illustrated in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, are intended only to explain the embodiments of the present invention, and are not to be construed as limiting them.
Referring to FIG. 1, the control method of the embodiment of the present invention includes the following steps:
S10: identifying the multimedia played by the first terminal to obtain corresponding audio information and playback state information; and
S20: controlling the second terminal to play, according to the audio information and the playback state information, a multimedia file corresponding to the audio information, so as to play the multimedia synchronously with the first terminal.
Referring to FIG. 2, the control device 100 of the embodiment of the present invention includes an identification module 10 and a control module 20. As an example, the control method of the embodiment of the present invention can be implemented by the control device 100 of the embodiment of the present invention and applied to the audio synchronous playback system 1000. The audio synchronous playback system 1000 may include the control device 100, a first terminal 200, and a second terminal 300.
Step S10 of the control method can be implemented by the identification module 10, and step S20 can be implemented by the control module 20. That is, the identification module 10 is configured to identify, by voice recognition, the multimedia played by the first terminal 200 to obtain the audio information and the playback state information. The control module 20 is configured to control the second terminal 300 to play, according to the audio information and the playback state information, the multimedia file corresponding to the audio information, so as to play the multimedia synchronously with the first terminal 200.
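The division of labor between the identification module 10 (step S10) and the control module 20 (step S20) can be sketched as a small class that wires two callables together. The class name `ControlDevice`, the method `run_once`, and the callable signatures are assumptions made for illustration, not an interface defined by the text.

```python
from typing import Any, Callable, Tuple


class ControlDevice:
    """Illustrative stand-in for control device 100."""

    def __init__(
        self,
        identify: Callable[[bytes], Tuple[Any, Any]],
        control: Callable[[Any, Any], None],
    ) -> None:
        self.identify = identify  # plays the role of identification module 10
        self.control = control    # plays the role of control module 20

    def run_once(self, recorded_audio: bytes) -> None:
        # S10: identify the multimedia to get audio information and playback state.
        audio_info, playback_state = self.identify(recorded_audio)
        # S20: have the second terminal play the matching file in step with terminal 1.
        self.control(audio_info, playback_state)
```

Passing the modules in as callables keeps the sketch independent of whether the control device is integrated into the second terminal or provided separately.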
Because the control method, the control device 100, and the audio synchronous playback system 1000 of the embodiments of the present invention use voice recognition technology to identify the multimedia played by the first terminal 200 and automatically control the second terminal 300 to play the corresponding multimedia file, the start of multimedia playback can be synchronized and synchronization control can also be performed during playback, thereby realizing synchronous multimedia playback.
In this way, the second terminal 300 can be networked with the first terminal 200, changing the past situation in which terminals were controlled independently or could only transfer files over a local area network, and realizing synchronous control of multimedia playback. As a result, first, better sound can be achieved, for example a surround sound effect at home; second, interactive and social purposes can be served.
In some embodiments, the first terminal 200 or the second terminal 300 may be a communication terminal, entertainment terminal, or smart terminal with a multimedia playback function, such as a mobile phone, a tablet computer, a wearable device, or a computer. The wearable device may be a smart band, a smart watch, smart glasses, or the like.
Thus, the first terminal 200 and the second terminal 300 are products that people commonly carry in modern society, and there is a real demand for a synchronous multimedia playback function on them.
In this embodiment, the number of second terminals 300 is one.
Referring to FIG. 3, in the audio synchronous playback system 1000a of other embodiments of the present invention, there may also be a plurality of second terminals 300; the number is not limited to that of the above embodiment. In these embodiments, each second terminal 300 may use the control device 100 of the embodiment of the present invention.
It should be noted that the first terminal 200 and the second terminal 300 may be of the same type or of different types. For example, in the example of FIG. 2, the first terminal 200 may be a mobile phone and the second terminal 300 may also be a mobile phone; that is, two mobile phones are associated to form the audio synchronous playback system 1000. In the example of FIG. 3, the first terminal 200 may be a computer, and there are two second terminals 300, which may be a mobile phone and a tablet computer respectively.
It can be understood that the control device 100 may be wholly or partly integrated into the second terminal 300; of course, the control device 100 may also be provided separately from the second terminal 300.
In embodiments in which the control device 100 is wholly integrated into the second terminal 300, the second terminal 300 can be regarded as including the control device 100. In these embodiments, the control device 100 may also share some or all components with the second terminal 300; for example, the control device 100 is an application installed on the second terminal 300 that implements the corresponding functions when run on the second terminal 300.
In some embodiments, the control device 100 may also be applied to the first terminal 200; that is, the first terminal 200 can also act as a second terminal 300 and follow another first terminal 200 to achieve synchronous multimedia playback. In other words, the roles of the first terminal 200 and the second terminal 300 can be switched as needed and are not limited to any embodiment of the present invention.
Referring to FIG. 4, in some embodiments step S10 includes the sub-steps:
S12: extracting feature values of the multimedia; and
S14: performing feature matching according to the feature values and a feature library to identify the multimedia.
Referring to FIG. 5, in some embodiments the identification module 10 includes an extraction module 12 and a matching module 14. Step S12 may be implemented by the extraction module 12 and step S14 by the matching module 14. In other words, the extraction module 12 is configured to extract feature values of the multimedia, and the matching module 14 is configured to perform feature matching according to the feature values and the feature library to identify the multimedia.
In some embodiments, the feature values include acoustic features or linguistic features. For example, the multimedia may be instrumental music or a song. Instrumental music carries no linguistic information, so it is identified mainly by acoustic features; that is, the feature values are acoustic features such as melody features or scale features.
A song contains linguistic information (for example, lyrics), so it can be identified by linguistic features, for example by its lyrics; that is, the linguistic features include lyrics.
In other embodiments, if the multimedia is a song, acoustic features and linguistic features may be used together to identify the multimedia.
It can be understood that the feature library should be built in advance, by analyzing a variety of multimedia and establishing their features. Building the feature library may also be a machine learning process in which the library is built and continually updated through ongoing training.
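As an illustration of matching step S14, the following is a minimal sketch that assumes feature values are represented as sets of hashable tokens and that the feature library is a dictionary from a track identifier to such a set. The function name `match_features`, the Jaccard similarity measure and the `min_score` threshold are assumptions made for this sketch; the specification does not prescribe a particular matching algorithm.

```python
from typing import Dict, Optional, Set


def match_features(features: Set[str],
                   library: Dict[str, Set[str]],
                   min_score: float = 0.5) -> Optional[str]:
    """Return the library key whose feature set best overlaps `features`.

    Hypothetical helper: similarity is the Jaccard index, and None is
    returned when no entry reaches `min_score` (identification failed).
    """
    best_key: Optional[str] = None
    best_score = 0.0
    for key, known in library.items():
        union = features | known
        if not union:
            continue  # both sets empty, nothing to compare
        score = len(features & known) / len(union)
        if score > best_score:
            best_key, best_score = key, score
    return best_key if best_score >= min_score else None
```

A return value of `None` corresponds to the case in which no matching entry is found in the feature library.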
Referring to FIG. 6, in some embodiments the feature library is stored on the second terminal 300 or on a remote server 400.
It can be understood that storing the feature library on the second terminal 300 can speed up speech recognition, whereas storing it on the remote server 400 reduces the resources occupied on the second terminal 300.
In embodiments in which the feature library is stored on the server 400, the synchronous audio playback system 1000b includes the server 400.
Referring to FIG. 7, in some embodiments step S10 includes the sub-step:
S11: recording the multimedia played by the first terminal 200 as a recorded clip; step S12 then extracts the feature values by analyzing the recorded clip.
In still other embodiments, step S10 further includes the sub-step:
S15: determining whether the multimedia has been successfully identified and, if not, returning to step S11.
Referring to FIG. 8, in some embodiments the identification module 10 includes a recording module 11 and a judgment module 15. Step S11 may be implemented by the recording module 11 and step S15 by the judgment module 15. In other words, the recording module 11 is configured to record the multimedia played by the first terminal 200 as a recorded clip, and the judgment module 15 is configured to determine whether the multimedia has been successfully identified.
For example, the recording module 11 may record the multimedia played by the first terminal 200 at a predetermined sampling rate. The extraction module 12 uses a specific extraction algorithm to extract feature values of the recorded clip, such as melody features, scale features or lyrics. After obtaining the feature values, the matching module 14 searches the feature library for audio information that matches them. If matching audio information is found, the judgment module 15 determines that identification has succeeded; otherwise, it determines that identification has failed.
In some cases, for example when ambient noise is high, the recorded clip contains considerable noise and the extraction module 12 may be unable to extract the feature values accurately, so the matching module 14 fails to find a match. In these embodiments, the judgment module 15 should notify the recording module 11 to record the multimedia again to obtain a different recorded clip, from which the extraction module 12 extracts the feature values again. This cycle repeats until the multimedia is successfully identified.
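The record, extract, match and retry cycle of steps S11 to S15 can be sketched as follows. The three callables stand in for the recording module 11, the extraction module 12 and the matching module 14; their signatures are hypothetical. The `max_attempts` bound is also an assumption added so that the sketch terminates: the text above simply repeats until identification succeeds.

```python
from typing import Callable, Optional


def identify_with_retry(record: Callable[[], bytes],
                        extract: Callable[[bytes], object],
                        match: Callable[[object], Optional[dict]],
                        max_attempts: int = 5) -> Optional[dict]:
    """Repeat S11 -> S12 -> S14 until S15 reports success or attempts run out."""
    for _ in range(max_attempts):
        clip = record()            # S11: record a new clip
        features = extract(clip)   # S12: extract feature values
        info = match(features)     # S14: look up the feature library
        if info is not None:       # S15: identification succeeded
            return info
    return None                    # gave up after max_attempts (assumed bound)
```

Each pass through the loop records a fresh clip, so a noisy first clip does not prevent a later attempt from succeeding.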
An embodiment of the present invention further provides a computer storage medium that may store executable instructions for performing the method of the foregoing embodiments.
In some embodiments, the multimedia is a song and the audio information includes the song title, lyrics, author, performer or genre.
The playback state information includes the current playback position and whether playback is paused, fast-forwarding or rewinding.
For example, if the first terminal 200 is playing the song "March of the Volunteers", the audio information may include:
Song title: March of the Volunteers;
Lyrics: Arise, ye who refuse to be slaves! With our flesh and blood, let us build our new Great Wall! The Chinese nation faces its greatest peril, and each one is forced to let out a final roar. Arise! Arise! Arise! Millions with one heart, braving the enemy's fire, march on! Braving the enemy's fire, march on! March on! March on! On!
Author: lyrics by Tian Han, music by Nie Er;
Performer: chorus; and
Genre: revolutionary song.
The playback state information may include the playback position, for example 1 minute 22 seconds (written as 01:22).
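The two kinds of information described above can be modelled as simple records. The sketch below is only one possible representation; the class names, field names and the `parse_position` helper are hypothetical, and the "MM:SS" format is taken from the 01:22 example.

```python
from dataclasses import dataclass


@dataclass
class AudioInfo:
    """Audio information obtained by identification (field names assumed)."""
    title: str
    lyrics: str = ""
    author: str = ""
    performer: str = ""
    genre: str = ""


@dataclass
class PlaybackState:
    """Playback state information (field names assumed)."""
    position_s: int            # current playback position in seconds
    paused: bool = False
    fast_forward: bool = False
    rewind: bool = False


def parse_position(text: str) -> int:
    """Convert an 'MM:SS' position such as '01:22' to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


info = AudioInfo(title="March of the Volunteers", genre="revolutionary song")
state = PlaybackState(position_s=parse_position("01:22"))  # 82 seconds
```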
Referring to FIG. 9, in some embodiments step S20 includes the following sub-steps:
S22: obtaining a multimedia file according to the audio information;
S24: controlling the second terminal to configure playback parameters according to the audio information and to start playing the multimedia file according to the playback state information; and
S26: performing synchronous playback control during playback.
Referring to FIG. 10, in some embodiments the control module of embodiments of the present invention may include an acquisition module 22, a first sub-control module 24 and a second sub-control module 26.
Step S22 may be implemented by the acquisition module 22, step S24 by the first sub-control module 24, and step S26 by the second sub-control module 26. In other words, the acquisition module 22 is configured to obtain the multimedia file according to the audio information. The first sub-control module 24 is configured to control the second terminal to configure playback parameters according to the audio information and to start playing the multimedia file according to the playback state information. The second sub-control module 26 is configured to perform synchronous playback control during playback.
In some embodiments, in step S22, once the audio information has been obtained, the acquisition module 22 may control the second terminal 300 to search for and obtain the corresponding multimedia file. For example, given the audio information above, it may control the second terminal 300 to search for and obtain the file of the song "March of the Volunteers".
In some embodiments, the multimedia file may be stored on the second terminal 300 or on the remote server 400.
Accordingly, the acquisition module 22 may control the second terminal 300 to search locally first. If the multimedia file is available locally, no download is needed, which saves time. If the file is not found locally, the acquisition module 22 may control the second terminal 300 to search the remote server 400. A file found on the remote server 400 may be downloaded to the second terminal 300 or played online. Keeping the multimedia file on the remote server 400 reduces the resources occupied on the second terminal 300.
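The local-first search order described for step S22 can be sketched as below. The two dictionaries stand in for the storage of the second terminal 300 and of the remote server 400; the function name, the index structure and the returned tuple are assumptions for illustration only.

```python
from typing import Dict, Optional, Tuple


def locate_file(title: str,
                local_index: Dict[str, str],
                remote_index: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Search locally first, then fall back to the remote server.

    Returns ("local", path) or ("remote", url), or None if the file is
    not found in either place. Both indexes are hypothetical stand-ins.
    """
    if title in local_index:
        return ("local", local_index[title])    # no download needed
    if title in remote_index:
        return ("remote", remote_index[title])  # download or stream
    return None
```

The caller can then decide whether a remote hit is downloaded to the second terminal or played online.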
In some embodiments, in step S24, configuring the playback parameters may mean configuring an equalizer. For example, the equalizer may offer different playback modes such as classical, rock, heavy metal, jazz, romantic or country. For the song "March of the Volunteers" in this embodiment, the equalizer's playback mode may be set to classical.
In addition, playing the multimedia file according to the playback state information may mean playing according to the playback position. Taking the embodiment above as an example, where the playback position is 01:22, the second terminal 300 should be controlled to start playing the multimedia file from 01:22.
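One practical detail when starting from an identified position is that some time passes between identifying the position and actually starting playback. The sketch below adds that elapsed time to the identified position. This compensation is an assumption of the sketch and is not stated in the specification, which only says that playback starts from the identified position; the function and parameter names are hypothetical.

```python
def start_position(identified_s: float,
                   identified_at: float,
                   now: float) -> float:
    """Position (seconds) at which the second terminal should start.

    identified_s:  playback position reported by identification, e.g. 82 for 01:22
    identified_at: clock reading (seconds) when that position was valid
    now:           clock reading (seconds) when playback actually starts
    The elapsed-time correction is an assumption, not part of the source text.
    """
    elapsed = max(0.0, now - identified_at)
    return identified_s + elapsed
```

With zero elapsed time the function returns the identified position unchanged, which matches the behaviour described in the text.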
For a second terminal 300 that can play lyrics alongside the audio, playing the multimedia file according to the playback state information should also include playing the lyrics at the same time.
Playback may drift out of sync because of device differences between the second terminal 300 and the first terminal 200, or because the first terminal 200 stops, fast-forwards or rewinds as a result of user operation or a fault. The control module 20 therefore needs to perform synchronization control throughout multimedia playback in order to achieve truly synchronous playback.
Accordingly, referring to FIG. 11, in some embodiments sub-step S26 includes the sub-sub-steps:
S262: turning off the speaker of the second terminal 300;
S264: identifying, by speech recognition, the multimedia played by the first terminal 200; and
S266: determining whether the multimedia played by the first terminal 200 and by the second terminal 300 is synchronized and, if not, returning to step S10.
Sub-sub-step S262 may be implemented by the second sub-control module 26 controlling the second terminal 300, sub-sub-step S264 by the second sub-control module 26 controlling the identification module 10, and sub-sub-step S266 by the second sub-control module 26 itself.
It can be understood that performing steps S262-S266 continuously would keep playback synchronized at all times. However, doing so would consume considerable resources and might interfere with multimedia playback on the second terminal 300. In this embodiment, therefore, steps S262-S266 are performed repeatedly at a predetermined interval, for example every 10 s.
In sub-sub-step S266, the determination may be based on a comparison of playback positions. For example, suppose the playback position of the identified multimedia is 01:56, and the first sub-control module 24 obtains the playback position of the multimedia being played by the second terminal 300. If that position is also 01:56, playback is judged to be synchronized; if it is 02:01, playback is judged to be out of sync.
If playback is judged to be synchronized, the second terminal 300 continues playing the multimedia in sync with the first terminal 200. If playback is judged to be out of sync, the method should return to the identification step and carry out synchronous playback control again.
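The position comparison of step S266 can be sketched as a single check that triggers re-identification when the two positions differ. The `tolerance_s` parameter and the `resync` callback are assumptions of this sketch; the example in the text compares positions with one-second resolution (01:56 against 01:56 or 02:01).

```python
from typing import Callable


def check_sync(first_pos_s: float,
               second_pos_s: float,
               resync: Callable[[], None],
               tolerance_s: float = 1.0) -> bool:
    """Compare playback positions; call `resync` when they differ too much.

    `resync` stands in for returning to identification step S10.
    `tolerance_s` is an assumed threshold, not a value from the source.
    """
    in_sync = abs(first_pos_s - second_pos_s) <= tolerance_s
    if not in_sync:
        resync()
    return in_sync
```

In a running system this check would be invoked at the predetermined interval mentioned above, for example every 10 s.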
Of course, the implementation of synchronization control is not limited to the embodiment above.
Referring to FIG. 12, in some embodiments sub-step S26 includes the following sub-sub-steps:
S262a: recording the multimedia played by the first terminal 200 and by the second terminal 300;
S264a: filtering out the multimedia played by the second terminal 300 and then identifying the multimedia played by the first terminal 200; and
S266a: determining whether the multimedia played by the first terminal 200 and by the second terminal 300 is synchronized and, if not, returning to step S10.
Sub-sub-step S262a may be implemented by the second sub-control module 26 controlling the second terminal 300, sub-sub-step S264a by the second sub-control module 26 controlling the identification module 10, and sub-sub-step S266a by the second sub-control module 26 itself.
Sub-sub-steps S262a-S266a are similar to steps S262-S266, except that the recorded clip in step S262a contains both the multimedia played by the first terminal 200 and that played by the second terminal 300. In step S264a, the multimedia played by the second terminal 300 is known from the first sub-control module 24 and can therefore be filtered out of the recorded clip, leaving the multimedia played by the first terminal 200. Step S266a is substantially the same as step S266.
Referring to FIG. 13, in some embodiments sub-step S26 includes the following sub-sub-steps:
S262b: recording the multimedia played by the first terminal 200 and by the second terminal 300;
S264b: comparing the degree of correlation between the multimedia played by the first terminal 200 and by the second terminal 300; and
S266b: determining, according to the degree of correlation, whether the multimedia played by the first terminal 200 and by the second terminal 300 is synchronized and, if not, returning to step S10.
Sub-sub-step S262b may be implemented by the second sub-control module 26 controlling the second terminal 300, and sub-sub-steps S264b and S266b by the second sub-control module 26 itself.
In sub-sub-step S264b, the comparison may be made by comparing the linearity of the speech spectra or the number of identical syllables within a preset time.
A syllable is the most natural unit of speech perceptible to the ear and is formed from one or more phonemes combined according to certain rules. In Chinese, one character is one syllable, and each syllable consists of three parts: an initial, a final and a tone. In English, a single vowel phoneme can form a syllable, as can a vowel phoneme combined with one or more consonant phonemes. In this embodiment, the rules for determining syllables may be defined according to actual needs before the comparison is made.
In some embodiments, if the number of identical syllables within the preset time reaches a certain threshold, the multimedia played by the first terminal 200 is considered to be synchronized with the multimedia played by the second terminal 300.
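The syllable-count criterion can be sketched as follows, assuming that each terminal's audio within the preset time window has already been converted to a sequence of syllable labels. Counting identical syllables position by position, and the names `shared_syllables`, `in_sync_by_syllables` and `min_shared`, are assumptions of this sketch; the specification leaves the syllable rules and the threshold to be defined as needed.

```python
from typing import Sequence


def shared_syllables(first: Sequence[str], second: Sequence[str]) -> int:
    """Count positions at which both sequences carry the same syllable."""
    return sum(1 for a, b in zip(first, second) if a == b)


def in_sync_by_syllables(first: Sequence[str],
                         second: Sequence[str],
                         min_shared: int) -> bool:
    """Judge synchronization from the number of identical syllables.

    `first` and `second` are the syllables heard from each terminal in the
    same preset time window; `min_shared` is an assumed threshold.
    """
    return shared_syllables(first, second) >= min_shared
```

If one terminal lags by a syllable, most positions no longer line up, so the count falls below the threshold and playback is judged to be out of sync.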
Referring to FIG. 14, in some embodiments sub-step S26 includes:
S262c: recording the multimedia played by the first terminal 200 and by the second terminal 300;
S264c: identifying the multimedia played by the first terminal 200 and by the second terminal 300; and
S266c: determining, according to the playback state information, whether the multimedia played by the first terminal and by the second terminal is synchronized and, if not, returning to the identification step.
Sub-sub-step S262c may be implemented by the second sub-control module 26 controlling the second terminal 300, sub-sub-step S264c by the second sub-control module 26 controlling the identification module 10, and sub-sub-step S266c by the second sub-control module 26 itself.
In some embodiments, the comparison may use the playback position in the playback state information. In other embodiments, it may use the playback position of the lyrics.
Referring to FIG. 15, in some embodiments the control method further includes:
S00: determining whether a request for synchronous playback has been received and, if so, proceeding to step S10.
Referring to FIG. 16, the synchronous audio playback system 1000d of some embodiments is substantially the same as the synchronous audio playback system 1000, except that its control device 100d differs from the control device 100 in that the control device 100d further includes a request receiving module 30.
Step S00 may be implemented by the request receiving module 30; that is, the request receiving module 30 is configured to determine whether a request for synchronous playback has been received. If so, the method proceeds to step S10; if not, it returns and continues checking.
For the parts of the control device 100 or 100d and of the synchronous audio playback systems 1000-1000d that are not described here, refer to the corresponding parts of the control method of the above embodiments; they are not described again in detail.
In the description of the embodiments of the present invention, it should be understood that orientations or positional relationships indicated by terms such as "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise" and "counterclockwise" are based on the orientations or positional relationships shown in the drawings. They are used only to facilitate and simplify the description of the embodiments, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they therefore cannot be construed as limiting the embodiments of the present invention. Moreover, the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of technical features referred to. A feature qualified by "first" or "second" may thus explicitly or implicitly include one or more of that feature. In the description of the embodiments of the present invention, "a plurality" means two or more, unless specifically defined otherwise.
In the description of the embodiments of the present invention, it should be noted that, unless otherwise expressly specified and defined, the terms "mounted", "coupled" and "connected" should be understood broadly. For example, a connection may be fixed, detachable or integral; it may be mechanical or electrical, or the elements may communicate with each other; it may be direct or indirect through an intermediate medium; and it may be an internal communication between two elements or an interaction between two elements. Those of ordinary skill in the art can understand the specific meanings of these terms in the embodiments of the present invention according to the specific circumstances.
In the embodiments of the present invention, unless otherwise expressly specified and defined, a first feature being "on" or "under" a second feature may mean that the first and second features are in direct contact, or that they are not in direct contact but are in contact through another feature between them. Moreover, the first feature being "on", "above" or "over" the second feature includes the first feature being directly above or obliquely above the second feature, or merely indicates that the first feature is at a higher level than the second feature. The first feature being "under", "below" or "beneath" the second feature includes the first feature being directly below or obliquely below the second feature, or merely indicates that the first feature is at a lower level than the second feature.
The following disclosure provides many different embodiments or examples for implementing different structures of embodiments of the present invention. To simplify the disclosure, the components and arrangements of specific examples are described below. They are, of course, merely examples and are not intended to limit the invention. In addition, embodiments of the present invention may repeat reference numerals and/or reference letters in different examples. This repetition is for simplicity and clarity and does not in itself indicate a relationship between the various embodiments and/or arrangements discussed. Moreover, embodiments of the present invention provide examples of various specific processes and materials, but those of ordinary skill in the art will recognize that other processes and/or other materials may be used.
In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "illustrative embodiment", "example", "specific example" or "some examples" means that a particular feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such references do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment or portion of code that includes one or more executable instructions for implementing the steps of a specific logical function or process. The scope of the preferred embodiments of the present invention includes additional implementations in which functions may be performed out of the order shown or discussed, including substantially simultaneously or in reverse order, depending on the functions involved, as will be understood by those skilled in the art to which the embodiments of the present invention belong.
The logic and/or steps represented in a flowchart or otherwise described herein may, for example, be regarded as an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus or device (such as a computer-based system, a system including a processing module, or another system that can fetch instructions from an instruction execution system, apparatus or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate or transport a program for use by, or in connection with, an instruction execution system, apparatus or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). The computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting or otherwise suitably processing it if necessary, and then stored in computer memory.
It should be understood that the parts of the embodiments of the present invention may be implemented in hardware, software, firmware or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques well known in the art: a discrete logic circuit having logic gates for implementing logical functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or some of the steps of the methods of the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing module, may each exist physically on their own, or two or more units may be integrated into one module. The integrated module may be implemented in hardware or as a software functional module. If the integrated module is implemented as a software functional module and sold or used as a standalone product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc or the like.
Although embodiments of the present invention have been shown and described above, it is to be understood that these embodiments are illustrative and are not to be construed as limiting the invention. Those of ordinary skill in the art may make changes, modifications, substitutions and variations to the above embodiments within the scope of the present invention.
Industrial applicability
The technical solution provided by the embodiments of the present invention uses speech recognition to identify the multimedia played by the first terminal and automatically controls the second terminal to play the corresponding multimedia file. As a result, the start of multimedia playback can be synchronized, and synchronization control can also be applied during playback, so that synchronous multimedia playback is achieved.

Claims (37)

  1. A control method, comprising:
    an identifying step of identifying multimedia played by a first terminal to obtain corresponding audio information and play status information; and
    a controlling step of controlling a second terminal to play, according to the audio information and the play status information, a multimedia file corresponding to the audio information, so as to play the multimedia synchronously with the first terminal.
  2. The control method according to claim 1, wherein the first terminal or the second terminal comprises a mobile phone, a tablet computer, or a wearable device.
  3. The control method according to claim 1, wherein the identifying step comprises:
    an extracting sub-step of extracting a feature value of the multimedia; and
    a matching sub-step of performing feature matching according to the feature value and a feature library to identify the multimedia.
  4. The control method according to claim 3, wherein the feature value comprises an acoustic feature or a linguistic feature, the acoustic feature comprises a melody feature or a scale feature, the multimedia is a song, and the linguistic feature comprises lyrics.
  5. The control method according to claim 3, wherein the feature library is stored in the second terminal or a remote server.
  6. The control method according to claim 3, wherein the identifying step comprises:
    a recording sub-step of recording the multimedia played by the first terminal as a recorded segment;
    wherein the extracting sub-step extracts the feature value by analyzing the recorded segment.
  7. The control method according to claim 6, wherein the identifying step comprises:
    a judging sub-step of determining whether the multimedia has been successfully identified and, if not, returning to the recording sub-step.
  8. The control method according to claim 1, wherein the multimedia is a song, and the audio information comprises a song title, lyrics, an author, a singer, or a genre.
  9. The control method according to claim 1, wherein the play status information comprises a current play position, whether playback is paused, whether it is fast-forwarding, or whether it is rewinding.
  10. The control method according to claim 1, wherein the controlling step comprises:
    an acquiring sub-step of acquiring the multimedia file according to the audio information;
    a first control sub-step of controlling the second terminal to configure play parameters according to the audio information and to start playing the multimedia file according to the play parameters and the play status information; and
    a second control sub-step of performing synchronous play control during playback.
  11. The control method according to claim 10, wherein the multimedia file is stored in the second terminal or in a remote server.
  12. The control method according to claim 10, wherein the second control sub-step comprises:
    intermittently turning off a speaker of the second terminal;
    identifying the multimedia played by the first terminal; and
    determining whether the multimedia played by the first terminal and the second terminal is synchronized and, if not, returning to the identifying step.
  13. The control method according to claim 10, wherein the second control sub-step comprises the following sub-sub-steps:
    recording the multimedia played by the first terminal and the second terminal;
    identifying the multimedia played by the first terminal after filtering out the multimedia played by the second terminal; and
    determining whether the multimedia played by the first terminal and the second terminal is synchronized and, if not, returning to the identifying step.
  14. The control method according to claim 10, wherein the second control sub-step comprises the following sub-sub-steps:
    recording the multimedia played by the first terminal and the second terminal;
    comparing a degree of correlation between the multimedia played by the first terminal and the multimedia played by the second terminal; and
    determining, according to the degree of correlation, whether the multimedia played by the first terminal and the second terminal is synchronized and, if not, returning to the identifying step.
  15. The control method according to claim 14, wherein the degree of correlation comprises the linearity of the speech spectrum or the number of identical syllables within a preset time.
  16. The control method according to claim 10, wherein the multimedia is a song, the play status information comprises a play position of the lyrics, and the second control sub-step comprises the following sub-sub-steps:
    recording the multimedia played by the first terminal and the second terminal;
    identifying the multimedia played by the first terminal and the second terminal to obtain the corresponding play positions of the lyrics; and
    determining, according to the play positions of the lyrics, whether the multimedia played by the first terminal and the second terminal is synchronized and, if not, returning to the identifying step.
  17. The control method according to claim 1, wherein the control method further comprises:
    determining whether a request for synchronous playback has been received and, if so, entering the identifying step.
  18. A control device, comprising:
    an identification module configured to identify multimedia played by a first terminal to obtain corresponding audio information and play status information; and
    a control module configured to control a second terminal to play, according to the audio information and the play status information, a multimedia file corresponding to the audio information, so as to play the multimedia synchronously with the first terminal.
  19. The control device according to claim 18, wherein the first terminal or the second terminal comprises a mobile phone, a tablet computer, or a wearable device.
  20. The control device according to claim 18, wherein the identification module comprises:
    an extraction module configured to extract a feature value of the multimedia; and
    a matching module configured to perform feature matching according to the feature value and a feature library to identify the multimedia.
  21. The control device according to claim 20, wherein the feature value comprises an acoustic feature or a linguistic feature,
    the acoustic feature comprises a melody feature or a scale feature, and
    the multimedia is a song and the linguistic feature comprises lyrics.
  22. The control device according to claim 20, wherein the feature library is stored in the second terminal or a remote server.
  23. The control device according to claim 20, wherein the identification module comprises:
    a recording module configured to record the multimedia played by the first terminal as a recorded segment;
    wherein the extraction module is configured to extract the feature value by analyzing the recorded segment.
  24. The control device according to claim 23, wherein the identification module comprises:
    a judging module configured to determine whether the multimedia has been successfully identified and to notify the recording module when the multimedia has not been successfully identified.
  25. The control device according to claim 18, wherein the multimedia is a song, and the audio information comprises a song title, lyrics, an author, a singer, or a genre.
  26. The control device according to claim 18, wherein the play status information comprises a current play position, whether playback is paused, whether it is fast-forwarding, or whether it is rewinding.
  27. The control device according to claim 18, wherein the control module comprises:
    an acquisition module, the extraction module being configured to acquire the multimedia file according to the audio information;
    a first sub-control module configured to control the second terminal to configure play parameters according to the audio information and to start playing the multimedia file according to the play parameters and the play status information; and
    a second sub-control module configured to perform synchronous play control during playback.
  28. The control device according to claim 27, wherein the multimedia file is stored in the second terminal or in a remote server.
  29. The control device according to claim 27, wherein the second sub-control module is configured to intermittently turn off a speaker of the second terminal, identify the multimedia played by the first terminal, determine whether the multimedia played by the first terminal and the second terminal is synchronized, and notify the identification module when it is not synchronized.
  30. The control device according to claim 27, wherein the second sub-control module is configured to record the multimedia played by the first terminal and the second terminal, identify the multimedia played by the first terminal after filtering out the multimedia played by the second terminal, determine whether the multimedia played by the first terminal and the second terminal is synchronized, and notify the identification module when it is not synchronized.
  31. The control device according to claim 27, wherein the second sub-control module is configured to record the multimedia played by the first terminal and the second terminal, compare a degree of correlation between the multimedia played by the first terminal and the multimedia played by the second terminal, determine according to the degree of correlation whether the multimedia played by the first terminal and the second terminal is synchronized, and notify the identification module when it is not synchronized.
  32. The control device according to claim 31, wherein the degree of correlation comprises the linearity of the speech spectrum or the number of identical syllables within a preset time.
  33. The control device according to claim 27, wherein the multimedia is a song, the play status information comprises a play position of the lyrics, and the second sub-control module is configured to record the multimedia played by the first terminal and the second terminal, identify the multimedia played by the first terminal and the second terminal to obtain the play positions of the lyrics, determine according to the play positions of the lyrics whether the multimedia played by the first terminal and the second terminal is synchronized, and notify the identification module when it is not synchronized.
  34. The control device according to claim 18, wherein the control device comprises:
    a request receiving module configured to determine whether a request for synchronous playback has been received and to notify the identification module when the request is received.
  35. A terminal, comprising the control device according to any one of claims 18 to 34.
  36. A synchronous audio playback system, comprising the terminal according to claim 35.
  37. A computer storage medium storing executable instructions, the executable instructions being used to perform the method according to any one of claims 1 to 17.
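
As an illustration only, the synchronization check of claims 14 and 15 can be sketched by counting identical syllables recognized from the two terminals within a preset time window. The function names, the position-by-position comparison, and the `min_matches` threshold are assumptions of this sketch; the claims do not specify how syllables are aligned or what threshold applies.

```python
from typing import Sequence


def matching_syllables(first: Sequence[str], second: Sequence[str]) -> int:
    """Count positions at which both terminals produced the same syllable.

    Assumption: both sequences were recognized from the same preset time
    window and are compared position by position.
    """
    return sum(1 for a, b in zip(first, second) if a == b)


def is_synchronized(first: Sequence[str], second: Sequence[str],
                    min_matches: int) -> bool:
    """Treat playback as synchronized when enough syllables coincide.

    min_matches is a hypothetical threshold; when this returns False the
    method would return to the identifying step.
    """
    return matching_syllables(first, second) >= min_matches
```

A caller would run this check periodically during playback and re-enter the identifying step whenever it reports that the two terminals have drifted apart.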
PCT/CN2016/075665 2015-12-24 2016-03-04 Control method, control device, terminal, and synchronous audio playback system WO2017107309A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510989411.2 2015-12-24
CN201510989411.2A CN106921882A (en) 2015-12-24 2015-12-24 Control method, control device, terminal and audio sync Play System

Publications (1)

Publication Number Publication Date
WO2017107309A1 (en) 2017-06-29

Family

ID=59088842

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/075665 WO2017107309A1 (en) 2015-12-24 2016-03-04 Control method, control device, terminal, and synchronous audio playback system

Country Status (2)

Country Link
CN (1) CN106921882A (en)
WO (1) WO2017107309A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113765754A (en) * 2020-06-02 2021-12-07 云米互联科技(广东)有限公司 Audio synchronous playing method and device and computer readable storage medium
CN111985975A (en) * 2020-08-31 2020-11-24 湖南快乐阳光互动娱乐传媒有限公司 Information delivery method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5808662A (en) * 1995-11-08 1998-09-15 Silicon Graphics, Inc. Synchronized, interactive playback of digital movies across a network
CN102497400A (en) * 2011-11-30 2012-06-13 上海博泰悦臻电子设备制造有限公司 Music media information obtaining method of vehicle-mounted radio equipment and obtaining system thereof
CN102595228A (en) * 2011-01-07 2012-07-18 三星电子株式会社 Content synchronization apparatus and method
CN104349199A (en) * 2014-11-21 2015-02-11 王方淇 Information synchronization method and device
CN104539999A (en) * 2014-11-24 2015-04-22 深圳市金立通信设备有限公司 Multimedia data sharing method and terminal

Also Published As

Publication number Publication date
CN106921882A (en) 2017-07-04

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 16877123; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 16877123; Country of ref document: EP; Kind code of ref document: A1)