WO2014106375A1 - 信息处理的方法、装置和系统 - Google Patents

信息处理的方法、装置和系统 Download PDF

Info

Publication number
WO2014106375A1
Authority
WO
WIPO (PCT)
Prior art keywords
accompaniment
song
information
played
vocal
Prior art date
Application number
PCT/CN2013/079798
Other languages
English (en)
French (fr)
Inventor
张德明
张琦
龙志明
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Publication of WO2014106375A1 publication Critical patent/WO2014106375A1/zh

Links

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00Details of electrophonic musical instruments
    • G10H1/36Accompaniment arrangements
    • G10H1/361Recording/reproducing of accompaniment for use with an external source, e.g. karaoke systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/056Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal for extraction or identification of individual instrumental parts, e.g. melody, chords, bass; Identification or separation of instrumental parts by their characteristic voices or timbres

Definitions

  • Karaoke is a common form of entertainment. Because traditional karaoke requires a venue equipped with professional equipment and a professional karaoke accompaniment library, many places cannot meet the requirements of traditional karaoke. In order to solve this problem, in the prior art a karaoke system can be installed on a terminal to realize the karaoke function, and the user can use a terminal on which the karaoke system is installed to enjoy karaoke on any occasion.
  • Karaoke systems can include, but are not limited to, applications such as K Ge Da Ren (K歌达人) and K Dao Bao (K到爆).
  • In the karaoke system, the steps of implementing the karaoke function may include:
  • 1. While connected to the network, the terminal on which the karaoke system is installed downloads the professional accompaniment selected by the user from a server.
  • 2. After the terminal obtains the accompaniment from the downloaded professional accompaniment, it plays the accompaniment and, at the same time, collects the user's singing through a microphone connected to the terminal. A song can be a professional accompaniment or a non-professional accompaniment; both can contain an accompaniment and a vocal, but in a professional accompaniment the accompaniment and the vocal are not in the same channel: they are placed in the right channel and the left channel, respectively.
  • In a non-professional accompaniment, the accompaniment and the vocal are in the same channel; that is, the left channel can contain both accompaniment and vocal, and the right channel can also contain both accompaniment and vocal.
  • In this step, the terminal can obtain the accompaniment of a professional accompaniment directly from the right channel. 3. Collect the user's singing vocal and add it to the obtained accompaniment for playback, that is, mix the user's singing vocal with the accompaniment and play the result (see the sketch below).
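  • For illustration only, the following sketch shows this prior-art flow: the right channel of a professional accompaniment is taken as the backing track and mixed directly with a recorded user vocal. The file names are hypothetical and the snippet is not the patented method.

```python
import numpy as np
import soundfile as sf  # assumed available; any WAV I/O library would do

# Load a "professional accompaniment": vocals in the left channel,
# pure accompaniment in the right channel (file name is hypothetical).
stereo, rate = sf.read("professional_accompaniment.wav")   # shape: (samples, 2)
accompaniment = stereo[:, 1]                                # right channel only

# Load the user's singing voice captured from the microphone.
vocal, _ = sf.read("user_vocal.wav")
n = min(len(accompaniment), len(vocal))

# Prior-art behaviour: mix the user's vocal directly with the accompaniment.
mix = 0.5 * accompaniment[:n] + 0.5 * vocal[:n]
sf.write("karaoke_output.wav", mix, rate)
```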
  • In the process of implementing the above information processing, the inventors found that the prior art has at least the following problems: most songs are non-professional accompaniments and the number of professional accompaniments is small, yet a terminal on which a karaoke system is installed can only obtain an accompaniment from a downloaded professional accompaniment, so the number of accompaniments available to such a terminal is small. In addition, the collected user vocal is mixed with the accompaniment directly, the mixed song has poor sound quality, and the quality of the played sound is reduced. These problems all lead to a poor karaoke playback effect.
  • Embodiments of the present invention provide a method, apparatus, and system for information processing, which solve the problem of a poor karaoke playback effect.
  • In a first aspect, a method for information processing is provided, including:
  • the terminal acquires a first song, and the first song includes an accompaniment to be played;
  • it is determined whether the first song includes a first vocal, where the first vocal is the content that is identical between a first part and a second part of the first song, the first part being the part of the first song distributed to the left channel and the second part being the part of the first song distributed to the right channel; if the first vocal is included, the first vocal is eliminated to obtain the accompaniment to be played; a second vocal to be played is received, and sound optimization processing and mixing processing are performed on the second vocal and the accompaniment to be played to obtain a second song.
  • In a first possible implementation, the eliminating the first vocal includes: calculating the part with the greatest correlation between the first part and the second part; and eliminating that maximally correlated part from the left and right channels, respectively.
  • In a second possible implementation, the eliminating the first vocal includes: calculating the part with the greatest correlation between the first part and the second part; calculating the product of a cancellation control factor and the maximally correlated part of the left and right channels, where the control factor is between 0 and 1; and eliminating the part corresponding to the product from the left and right channels, respectively.
  • In a third possible implementation, the eliminating the first vocal to obtain the accompaniment to be played includes: subtracting the information of the second part from the information of the first part to obtain a left channel accompaniment; subtracting the information of the first part from the information of the second part to obtain a right channel accompaniment; and synthesizing the left channel accompaniment and the right channel accompaniment to obtain the accompaniment to be played, where the information refers to time domain information or frequency domain information.
  • In a fourth possible implementation, the sound optimization processing includes at least one of sound denoising processing, automatic sound gain adjustment processing, and sound pitch adjustment processing.
  • In a fifth possible implementation, the method further includes: if the song does not include the first vocal, receiving the second vocal, and performing sound optimization processing and mixing processing on the second vocal and the accompaniment to be played to obtain the second song.
  • In a sixth possible implementation, the method further includes: playing the second song.
  • In a second aspect, another method for information processing is provided, including: receiving a request message, sent by a terminal, for acquiring a first song, where the first song includes an accompaniment to be played; determining whether the first song includes a first vocal, where the first vocal is the content that is identical between a first part and a second part of the first song, the first part being the part of the first song distributed to the left channel and the second part being the part of the first song distributed to the right channel; if the first vocal is included, eliminating the first vocal to obtain the accompaniment to be played; and sending the accompaniment to be played to the terminal.
  • In a first possible implementation of the second aspect, the eliminating the first vocal includes: calculating the part with the greatest correlation between the first part and the second part; and eliminating that maximally correlated part from the left and right channels, respectively.
  • In a second possible implementation of the second aspect, the eliminating the first vocal includes: calculating the part with the greatest correlation between the first part and the second part; calculating the product of a cancellation control factor and the maximally correlated part of the left and right channels, where the control factor is between 0 and 1; and eliminating the part corresponding to the product from the left and right channels, respectively.
  • In a third possible implementation of the second aspect, the eliminating the first vocal includes: subtracting the information of the second part from the information of the first part to obtain a left channel accompaniment; subtracting the information of the first part from the information of the second part to obtain a right channel accompaniment; and synthesizing the left channel accompaniment and the right channel accompaniment to obtain the accompaniment to be played, where the information refers to time domain information or frequency domain information.
  • In a fourth possible implementation of the second aspect, the method further includes: if the song does not include the first vocal, sending the song to the terminal.
  • In a third aspect, a terminal is provided, including:
  • an acquiring unit, configured to acquire a first song, where the first song includes an accompaniment to be played; a judging unit, configured to determine whether the first song includes a first vocal, where the first vocal is the content that is identical between a first part and a second part of the first song, the first part being the part of the first song distributed to the left channel and the second part being the part of the first song distributed to the right channel;
  • an eliminating unit, configured to eliminate the first vocal if the first vocal is included, to obtain the accompaniment to be played;
  • a processing unit, configured to acquire a second vocal to be played, and perform sound optimization processing and mixing processing on the second vocal and the accompaniment to be played to obtain a second song.
  • In a fourth aspect, a server is provided, including:
  • a receiving unit, configured to receive a request message, sent by a terminal, for acquiring a first song, where the first song includes an accompaniment to be played;
  • a judging unit, configured to determine whether the first song includes a first vocal, where the first vocal is the content that is identical between a first part and a second part of the first song, the first part being the part of the first song distributed to the left channel and the second part being the part of the first song distributed to the right channel;
  • an eliminating unit, configured to eliminate the first vocal if the first vocal is included, to obtain the accompaniment to be played;
  • a sending unit, configured to send the accompaniment to be played to the terminal.
  • In a fifth aspect, a system for information processing is provided, including:
  • the terminal provided in the third aspect and the server provided in the fourth aspect.
  • According to the method, apparatus, and system for information processing provided by the embodiments of the present invention, with the above solution, after the terminal acquires the first song, the first vocal is eliminated to obtain the accompaniment to be played, and the accompaniment can be used for karaoke singing, so that the terminal can obtain an accompaniment from songs other than professional accompaniments, which increases the number of obtainable accompaniments.
  • FIG. 1 is a flowchart of a method for processing information by using a terminal as an execution subject according to an embodiment of the present invention
  • FIG. 2 is a flowchart of another method for information processing with a terminal as the execution subject according to this embodiment; FIG. 3 is a flow block diagram of the method for information processing shown in FIG. 2;
  • FIG. 4 is a flowchart of a method for processing information by using a server as an execution subject according to an embodiment of the present invention
  • FIG. 5 is a flow chart of another method for processing information by using a server as an execution subject according to the embodiment
  • FIG. 6 is a schematic structural diagram of a terminal virtual device according to an embodiment of the present disclosure.
  • FIG. 7 is a schematic structural diagram of another terminal virtual device provided by this embodiment;
  • FIG. 8 is a schematic structural diagram of a server entity device according to an embodiment of the present disclosure.
  • FIG. 9 is a schematic structural diagram of a terminal entity device according to an embodiment of the present disclosure.
  • FIG. 10 is a schematic structural diagram of another terminal entity device according to an embodiment of the present disclosure.
  • FIG. 11 is a schematic structural diagram of a server entity device according to an embodiment of the present disclosure.
  • FIG. 12 is a schematic structural diagram of a system for information processing according to an embodiment of the present disclosure.
  • A karaoke system can be installed in various terminals, where the terminal can be, but is not limited to, a mobile phone, a computer, or another electronic device that can collect and play sound.
  • This embodiment provides a method for information processing, which is used to extract a playback accompaniment from a non-professionally accompanied song to support vocal singing.
  • The execution subject of the method may be a terminal on which a karaoke system is installed. As shown in FIG. 1, the method may include the following steps:
  • 101. The terminal acquires a first song, where the first song includes an accompaniment to be played.
  • 102. Determine whether the first song includes a first vocal, where the first vocal is the content that is identical between a first part and a second part of the first song, the first part being the part of the first song distributed to the left channel and the second part being the part of the first song distributed to the right channel.
  • If the first song includes the first vocal, step 103 is performed; if the first song does not include the first vocal, step 104 is performed.
  • 103. Eliminate the first vocal to obtain the accompaniment to be played.
  • In one embodiment, the first vocal is eliminated as follows: the part with the greatest correlation between the first part and the second part is calculated, and that maximally correlated part is eliminated from the left and right channels, respectively. Eliminating the maximally correlated part of the left and right channels leaves the uncorrelated left and right channel signals, that is, the accompaniment signals.
  • In another embodiment, the part with the greatest correlation between the first part and the second part is calculated, and the product of a control factor and the maximally correlated part of the left and right channels is eliminated from each channel, where the control factor is between 0 and 1. In this way the degree of vocal cancellation can be controlled by the user: when the user needs the original vocal as a prompt, less of the vocal can be removed so that the original vocal provides guidance, and when the user does not need the original vocal, more of it can be removed.
  • Taking left and right channel signals in the frequency domain as an example, the complex frequency-domain information of the received left channel signal is L = [Lr, Li], where Lr is the real part and Li is the imaginary part; the frequency-domain information of the right channel signal is R = [Rr, Ri], where Rr is the real part and Ri is the imaginary part. The maximally correlated part of the two channels is calculated and vocal cancellation is performed according to the user control information; the eliminated left channel information is Lo = [Lor, Loi] and the eliminated right channel information is Ro = [Ror, Roi], where C = [Cr, Ci] denotes the part common to both channels. Ideally, the eliminated left and right channel signals should be two uncorrelated signals, that is, (L − λ·C) · (R − λ·C) = 0. The eliminated components are Lor = Lr − α·λ·Cr, Loi = Li − α·λ·Ci, Ror = Rr − α·λ·Cr, Roi = Ri − α·λ·Ci, where 0 ≤ α ≤ 1 is the user control information (0 means the vocal is not cancelled at all, 1 means it is cancelled completely). C is an initial estimate of the part common to both channels, C = al·L + ar·R, with al = (Lr·Rr + Li·Ri) / (Lr·Lr + Li·Li) and ar = (Lr·Rr + Li·Ri) / (Rr·Rr + Ri·Ri); λ expresses the degree of correlation between the left and right channels and is obtained from the uncorrelatedness condition as λ = (−B + √D) / (2·A), where A = Cr·Cr + Ci·Ci, B = −[Cr·(Lr + Rr) + Ci·(Li + Ri)], and D = B² − 4·A·(Lr·Rr + Li·Ri). A sketch of this computation follows below.
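  • A minimal per-frame sketch of this correlation-based cancellation, assuming the frequency-domain frames `L` and `R` are complex numpy arrays obtained from an FFT; the symbols follow the description above, and the helper name is ours, not the patent's.

```python
import numpy as np

def cancel_vocal_frame(L, R, alpha):
    """Attenuate the component common to both channels of one FFT frame.

    L, R  : complex spectra of the left/right channel frame
    alpha : user control factor in [0, 1]; 0 keeps the vocal, 1 removes it fully
    """
    # Initial estimate C of the part that is identical in both channels.
    cross = np.real(L * np.conj(R))                 # Lr*Rr + Li*Ri per bin
    a_l = cross / np.maximum(np.abs(L) ** 2, 1e-12)
    a_r = cross / np.maximum(np.abs(R) ** 2, 1e-12)
    C = a_l * L + a_r * R

    # lambda: degree of correlation, chosen so that (L - lam*C) and (R - lam*C)
    # are uncorrelated, i.e. the positive root of A*lam^2 + B*lam + cross = 0.
    A = np.abs(C) ** 2
    B = -np.real(C * np.conj(L + R))
    D = np.maximum(B ** 2 - 4.0 * A * cross, 0.0)
    lam = (-B + np.sqrt(D)) / np.maximum(2.0 * A, 1e-12)

    # Remove alpha * lambda * C from both channels; what remains is accompaniment.
    Lo = L - alpha * lam * C
    Ro = R - alpha * lam * C
    return Lo, Ro
```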
  • In another embodiment, the first vocal is eliminated by direct subtraction of the left and right channels in the frequency domain or the time domain.
  • The received left channel information is L and the right channel information is R, where the left and right channel information may be time domain information or frequency domain information. The channel information can be regarded, in a simplified way, as the sum of accompaniment information and first vocal information, that is, L = L' + C and R = R' + C, where C is the first vocal information, so the first vocal can be eliminated by directly subtracting the two channels in the time domain or the frequency domain.
  • Taking left and right channel signals in the frequency domain as an example, the complex frequency-domain information of the received left channel signal is L = [Lr, Li], where Lr is the real part and Li is the imaginary part; the frequency-domain information of the right channel signal is R = [Rr, Ri], where Rr is the real part and Ri is the imaginary part.
  • The left and right channel signals are subtracted in the frequency domain for vocal cancellation; the eliminated left channel information is Lo = [Lor, Loi] and the eliminated right channel information is Ro = [Ror, Roi], obtained from the difference of the two channels (for example, Lor = Lr − Rr and Loi = Li − Ri, and correspondingly for Ro).
  • The accompaniment information is obtained by synthesizing the eliminated left and right channel information.
  • Taking left and right channel signals in the time domain as an example, the time domain information of the received left channel signal is l and the time domain information of the right channel signal is r.
  • The left and right channel signals are subtracted in the time domain for vocal cancellation; the eliminated left channel time domain information is l0 and the eliminated right channel time domain information is r0, where l0 = l − r and r0 = l − r.
  • The accompaniment information is obtained by synthesizing the eliminated left and right channel information (a minimal sketch follows below).
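  • A minimal time-domain sketch of the direct subtraction variant, assuming a stereo signal as a numpy array; as in the text, both output channels carry the same left-minus-right difference, which removes any component recorded identically in both channels.

```python
import numpy as np

def cancel_vocal_by_subtraction(stereo):
    """stereo: float array of shape (samples, 2); returns the eliminated channels."""
    l, r = stereo[:, 0], stereo[:, 1]
    diff = l - r               # the first vocal, present equally in l and r, cancels out
    l0, r0 = diff, diff        # eliminated left/right channel time-domain information
    # Synthesizing the eliminated channels yields the accompaniment information.
    return np.stack([l0, r0], axis=1)
```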
  • 104. Receive the second vocal to be played, and perform sound optimization processing and mixing processing on the second vocal and the accompaniment to be played to obtain a second song.
  • In this step, sound optimization processing needs to be performed on the second vocal and the accompaniment to be played, and the second vocal and the accompaniment to be played need to be mixed; the order of the sound optimization processing and the mixing processing in this step is arbitrary.
  • For example, sound optimization processing may first be performed on the second vocal and the accompaniment to be played separately, and the optimized second vocal and accompaniment may then be mixed to obtain the second song; or the second vocal may first be mixed with the accompaniment to be played, and sound optimization processing may then be performed on the mixed result to obtain the second song (a sketch of the first ordering follows below).
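  • A sketch of step 104 under the first ordering (optimize each signal, then mix); `denoise` and `auto_gain` stand for whatever optimization routines are actually used and are only placeholders.

```python
import numpy as np

def make_second_song(second_vocal, accompaniment, denoise, auto_gain):
    """Optimize the second vocal and the accompaniment separately, then mix them."""
    n = min(len(second_vocal), len(accompaniment))
    vocal = auto_gain(denoise(second_vocal[:n]))
    backing = auto_gain(denoise(accompaniment[:n]))
    second_song = 0.5 * vocal + 0.5 * backing     # simple equal-weight mixing
    return np.clip(second_song, -1.0, 1.0)        # avoid clipping after the mix
```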
  • With the above solution, after the terminal acquires the first song, the first vocal is eliminated to obtain the accompaniment to be played, and the accompaniment can be used for karaoke singing, so that the terminal can obtain an accompaniment from songs other than professional accompaniments, which increases the number of obtainable accompaniments; performing sound optimization on the second vocal and the accompaniment can improve the quality of the mixed song and thus improve the karaoke playback effect.
  • This embodiment provides another method for information processing, which is used to extract a playback accompaniment from a non-professionally accompanied song to support vocal singing.
  • The method is a further extension of the method of FIG. 1, and its execution subject may be a terminal on which a karaoke system is installed. As shown in FIG. 2, the method may include:
  • 201. The terminal acquires a first song, where the first song includes an accompaniment to be played.
  • Before playing the accompaniment specified by the user, the terminal first needs to acquire the song containing the accompaniment to be played and then obtain the accompaniment to be played from the acquired first song.
  • Further, the terminal may, but is not limited to, acquire the first song from the songs stored by a server; the first song may be stored on at least one of the server and the terminal.
  • As one implementation of this embodiment, the terminal may obtain the first song from the server in multiple ways, for example by text information, by voice command, by humming retrieval, or by the user recording a piece of music on the terminal for audio fingerprint retrieval (an illustrative request sketch follows below).
  • This embodiment does not limit the method by which the terminal acquires the first song; it may be set according to actual needs, for example, the first song may be obtained from any device that stores it, and details are not described here again.
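  • For illustration only, a sketch of requesting the first song from a server by text query; the endpoint and parameters are hypothetical, since the embodiment does not fix a retrieval protocol.

```python
import requests  # hypothetical HTTP-based retrieval; the embodiment does not mandate it

def fetch_first_song(server_url, query_text):
    """Ask the server for a song matching a text query and save it locally."""
    resp = requests.get(f"{server_url}/songs", params={"q": query_text}, timeout=10)
    resp.raise_for_status()
    with open("first_song.wav", "wb") as f:
        f.write(resp.content)
    return "first_song.wav"
```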
  • 202. Determine whether the first song includes a first vocal. If the first vocal is included, step 203 is performed; if the first vocal is not included, step 204 is performed.
  • Further, the first vocal may be the content that is identical between a first part and a second part of the first song, the first part may be the part of the first song distributed to the left channel, and the second part may be the part of the first song distributed to the right channel.
  • Generally, the first song may include an accompaniment and/or a first vocal, and the first song may be assigned to the left channel and/or the right channel.
  • If the first song is a non-professional accompaniment, the first vocal in the first song can be distributed evenly to the left channel and the right channel, that is, distributed in equal amounts to the two channels, with 50% of the first vocal in each channel, while the accompaniment in the first song can be distributed to the left and right channels in an arbitrary ratio; such a first song can be called a non-professional accompaniment.
  • In other words, if the first song is a non-professional accompaniment, the first vocal may be the content that is identical between the first part distributed to the left channel and the second part distributed to the right channel and that is distributed in equal amounts to the two channels. In this case, the terminal determining whether the first song includes the first vocal may include: the terminal determining whether, between the first part distributed to the left channel and the second part distributed to the right channel, there is a portion distributed in equal amounts to the left and right channels.
  • If such an equally distributed portion exists between the first part distributed to the left channel and the second part distributed to the right channel, the first song may include the first vocal; if no such portion exists, the first song may not include the first vocal.
  • Alternatively, as described in the background, if the first song is a professional accompaniment, the accompaniment in the first song is distributed to the right channel and the first vocal in the first song is distributed to the left channel. In this case, the terminal determining whether the first song includes the first vocal may include: the terminal determining whether the left channel contains the corresponding content; if it does, the first song may contain the first vocal.
  • If the first song is a professional accompaniment, the method of determining whether the first song includes the first vocal belongs to the prior art and is not described here again. This embodiment is described by taking a first song that is a non-professional accompaniment as an example (a hedged check for this case is sketched below).
  • This embodiment does not limit the ratio between the first part of the song distributed to the left channel and the second part distributed to the right channel; it may be set according to actual needs and is not described here again.
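  • One hedged way to implement this check for a non-professional accompaniment is to measure how strongly the two channels share an identical component, for example via normalized cross-correlation; the threshold below is illustrative and not taken from the patent.

```python
import numpy as np

def contains_first_vocal(stereo, threshold=0.6):
    """Guess whether a component is distributed equally to both channels.

    stereo: float array of shape (samples, 2). Returns True if the left and
    right channels are strongly correlated, which suggests a first vocal that
    is spread 50/50 across the channels of a non-professional accompaniment.
    """
    l, r = stereo[:, 0], stereo[:, 1]
    corr = np.dot(l, r) / (np.linalg.norm(l) * np.linalg.norm(r) + 1e-12)
    return corr > threshold
```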
  • 203. Eliminate the first vocal to obtain the accompaniment to be played.
  • Further optionally, if the first song is a non-professional accompaniment, eliminating the first vocal may include: calculating the part with the greatest correlation between the first part and the second part, and eliminating that maximally correlated part from the left and right channels, respectively.
  • Further optionally, if the first song is a non-professional accompaniment, eliminating the first vocal may also include: calculating the part with the greatest correlation between the first part and the second part; calculating the product of a cancellation control factor and the maximally correlated part of the left and right channels, where the control factor is between 0 and 1; and eliminating the part corresponding to the product from the left and right channels, respectively. This embodiment does not limit the value of the cancellation control factor; it may be set according to actual needs and is not described here again.
  • Further optionally, if the first song is a non-professional accompaniment, eliminating the first vocal to obtain the accompaniment to be played may include: subtracting the information of the second part from the information of the first part to obtain a left channel accompaniment; subtracting the information of the first part from the information of the second part to obtain a right channel accompaniment; and synthesizing the left channel accompaniment and the right channel accompaniment to obtain the accompaniment to be played, where the information refers to time domain information or frequency domain information. Although this method may also remove part of the accompaniment from the first song, it ensures that the first vocal is removed completely and does not degrade the playback effect of the terminal.
  • This embodiment does not limit the method of eliminating the first vocal; it can be set according to actual needs. There are several options, and the user can select a suitable accompaniment.
  • the terminal can switch between "original singing” and "accompaniment” according to the user's instructions.
  • "Original singing” can include accompaniment and vocals;
  • "accompaniment singing” can include accompaniment without vocals.
  • Further optionally, the terminal can also adjust the volume of the first vocal in the song: the lowest volume corresponds to "accompaniment singing" and the highest volume corresponds to "original singing" (a small mapping sketch follows below).
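  • This volume control maps naturally onto the cancellation control factor described earlier; a small sketch, where the exact mapping is our assumption rather than something fixed by the patent.

```python
def vocal_volume_to_control_factor(volume, max_volume=100):
    """Map a first-vocal volume setting to the cancellation control factor alpha.

    volume == 0 (lowest)  -> alpha = 1.0, full cancellation ("accompaniment singing")
    volume == max_volume  -> alpha = 0.0, no cancellation   ("original singing")
    """
    volume = max(0, min(volume, max_volume))
    return 1.0 - volume / max_volume
```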
  • 204. Acquire the second vocal to be played, and perform sound optimization processing and mixing processing on the second vocal and the accompaniment to be played to obtain a second song.
  • The second vocal is the vocal sung by the user.
  • In order to make the sound played by the karaoke system of higher quality, after the second vocal is acquired, sound optimization processing is performed on the second vocal and the accompaniment.
  • the second vocal may be a sound that the user sings according to the accompaniment played by the terminal.
  • the method for obtaining the second human voice by the terminal is not limited in this embodiment, and may be any method well known to those skilled in the art.
  • For example, the terminal may obtain the second vocal through a microphone or another sound collection device connected to the terminal; details are not described here again.
  • the sound optimization processing may include, but is not limited to, at least one of a sound denoising process, a sound automatic gain adjustment process, and a sound pitch adjustment process.
  • The sound denoising processing can enhance and beautify the recorded singing vocal (the second vocal) and the accompaniment in several ways, and can eliminate the adverse effects of the various noises in the actual recording environment on the singing vocal. The automatic sound gain adjustment processing can also enhance and beautify the recorded singing vocal and accompaniment, and can correct the unstable energy levels of a singing vocal and accompaniment recorded under non-professional recording conditions. The sound pitch adjustment processing can better match the user's personal habits, for example the user's pitch, and can also beautify the rhythm and pitch of the recorded singing vocal and accompaniment (a sketch of two of these steps follows below).
  • the method for the sound optimization processing in the embodiment is not limited, and is a technique well known to those skilled in the art, and can be set according to actual needs, and details are not described herein again.
  • the order of at least one of the sound denoising processing, the sound automatic gain adjustment processing, and the sound pitch adjustment processing is not limited, and may be set according to actual needs, and details are not described herein again.
  • The processing may be triggered in different ways, for example performed automatically by the terminal according to the actual situation, or carried out by other methods; details are not described here again.
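  • As an illustration of two of the optimization steps (the concrete algorithms are not fixed by the embodiment), a simple spectral-gate denoiser and a peak-based automatic gain adjustment:

```python
import numpy as np

def simple_denoise(signal, frame=1024, gate_db=-50.0):
    """Very rough spectral gate: zero FFT bins below a fixed level (illustrative only)."""
    out = np.copy(signal)
    for start in range(0, len(signal) - frame, frame):
        spec = np.fft.rfft(signal[start:start + frame])
        mag_db = 20.0 * np.log10(np.abs(spec) + 1e-12)
        spec[mag_db < gate_db] = 0.0
        out[start:start + frame] = np.fft.irfft(spec, n=frame)
    return out

def auto_gain(signal, target_peak=0.7):
    """Scale the recording so that its peak reaches a stable target level."""
    peak = np.max(np.abs(signal)) + 1e-12
    return signal * (target_peak / peak)
```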
  • Mixing is a process of mixing a plurality of sound sources such as dialogue, music, sound effects, etc.
  • the mixing is a process of mixing the second human voice with the accompaniment.
  • the sound optimization of the second vocal and accompaniment can improve the quality of the song after mixing, and thus improve the Karaoke 0K playback effect.
  • This embodiment does not limit the mixing, which is well known to those skilled in the art, and details are not described here again. It should be noted that this embodiment does not limit the order in which the sound optimization processing and the mixing processing are performed on the second vocal and the accompaniment; it may be set according to actual needs and is not described here again.
  • 205. Play the second song.
  • 206. Send a specified song to the server, so that the server stores the specified song.
  • It should be noted that this embodiment does not limit the position of step 206; it may be performed before or after any step in this embodiment, which is not limited or described further here.
  • As one implementation of this embodiment, when the terminal acquires the first song from the songs stored by the server, in order to increase the number of songs stored on the server, the terminal may send more specified songs to the server so that the terminal can acquire them.
  • the specified song can contain accompaniment, or the specified song can contain the first vocal and accompaniment.
  • Further, the terminal may encode, compress, and store the mixed song, upload it over the network, share it in an online community, and take part in public scoring interaction (an illustrative upload sketch follows below).
  • This embodiment does not limit the method by which the terminal uploads the reverberated song to the network; it may be set according to actual needs and is not described here again.
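  • A sketch of this optional upload step; the choice of encoder and the community endpoint are assumptions, since the embodiment leaves both open.

```python
import subprocess
import requests

def share_second_song(wav_path, community_url):
    """Compress the mixed song and upload it for community sharing and scoring."""
    mp3_path = wav_path.replace(".wav", ".mp3")
    # Encode/compress with ffmpeg (any encoder would do; ffmpeg is an assumption).
    subprocess.run(["ffmpeg", "-y", "-i", wav_path, "-b:a", "128k", mp3_path], check=True)
    with open(mp3_path, "rb") as f:
        resp = requests.post(f"{community_url}/upload", files={"song": f}, timeout=30)
    resp.raise_for_status()
    return resp.json()  # e.g. a share link and the current public score
```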
  • FIG. 3 is a flow block diagram of one implementation of this embodiment.
  • In FIG. 3, the "original audio signal" is the song; "eliminate the vocal" means eliminating the first vocal; the "singing vocal" is the second vocal. The mixing processing in this embodiment can include reverberation processing, but in FIG. 3 mixing and reverberation are treated as separate operations, and the reverberated information is played.
  • FIG. 3 shows only one implementation of this embodiment; other implementations are possible and the embodiment is not limited to the one shown in FIG. 3, which is not described further here.
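  • Reverberation is kept as a separate stage after mixing in FIG. 3; below is a minimal feedback-echo (comb-filter-style) reverb sketch, purely illustrative and not the reverberation specified by the patent.

```python
import numpy as np

def simple_reverb(mix, rate, delay_ms=80, decay=0.4, repeats=4):
    """Add a few decaying echoes to the mixed song before playback (illustrative)."""
    delay = int(rate * delay_ms / 1000)
    out = np.copy(mix)
    for k in range(1, repeats + 1):
        shift = k * delay
        if shift >= len(mix):
            break
        out[shift:] += (decay ** k) * mix[:len(mix) - shift]
    return np.clip(out, -1.0, 1.0)
```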
  • With the above solution, after the terminal acquires the first song, the first vocal is eliminated to obtain the accompaniment to be played, and the accompaniment can be used for karaoke singing, so that the terminal can obtain an accompaniment from songs other than professional accompaniments, which increases the number of obtainable accompaniments; performing sound optimization on the second vocal and the accompaniment can improve the quality of the mixed song and thus improve the karaoke playback effect.
  • This embodiment provides another method for information processing. The execution subject of the method is a server, and the application scenario may include the terminal acquiring the first song from the server. As shown in FIG. 4, the method may include:
  • 401. Receive a request message, sent by the terminal, for acquiring a first song, where the first song includes an accompaniment to be played.
  • 402. Determine whether the first song includes a first vocal, where the first vocal is the content that is identical between a first part and a second part of the first song, the first part being the part of the first song distributed to the left channel and the second part being the part of the first song distributed to the right channel.
  • 403. If the first vocal is included, eliminate the first vocal to obtain the accompaniment to be played.
  • 404. Send the accompaniment to be played to the terminal.
  • With the above solution, after the terminal acquires the first song, the first vocal is eliminated to obtain the accompaniment to be played, and the accompaniment can be used for karaoke singing, so that the terminal can obtain an accompaniment from songs other than professional accompaniments, which increases the number of obtainable accompaniments.
  • This embodiment provides another method for information processing, which is a further extension of the method shown in FIG. 4. As shown in FIG. 5, the method may include:
  • 501. Receive a request message, sent by the terminal, for acquiring a first song, where the first song includes an accompaniment to be played.
  • 502. Determine whether the first song includes a first vocal. If it does, perform step 503; if it does not, perform step 504.
  • Further, the first vocal may be the content that is identical between a first part and a second part of the first song, the first part may be the part of the first song distributed to the left channel, and the second part may be the part of the first song distributed to the right channel.
  • 503. If the first vocal is included, eliminate the first vocal to obtain the accompaniment to be played.
  • Further optionally, eliminating the first vocal may include: calculating the part with the greatest correlation between the first part and the second part, and eliminating that maximally correlated part from the left and right channels, respectively.
  • Further optionally, eliminating the first vocal may also include: calculating the part with the greatest correlation between the first part and the second part; calculating the product of a cancellation control factor and the maximally correlated part of the left and right channels, where the control factor is between 0 and 1; and eliminating the part corresponding to the product from the left and right channels, respectively.
  • Further optionally, eliminating the first vocal may also include: subtracting the information of the second part from the information of the first part to obtain a left channel accompaniment; subtracting the information of the first part from the information of the second part to obtain a right channel accompaniment; and synthesizing the left channel accompaniment and the right channel accompaniment to obtain the accompaniment to be played, where the information refers to time domain information or frequency domain information. The specific calculation process for eliminating the first vocal described in the embodiment corresponding to FIG. 1 also applies to this embodiment.
  • 504. Send the accompaniment to be played to the terminal.
  • With the above solution, after the terminal acquires the first song, the first vocal is eliminated to obtain the accompaniment to be played, and the accompaniment can be used for karaoke singing, so that the terminal can obtain an accompaniment from songs other than professional accompaniments, which increases the number of obtainable accompaniments.
  • The following apparatus embodiments are provided, and they correspond to the method embodiments provided above.
  • This embodiment provides a terminal, as shown in FIG. 6, which may include:
  • the acquiring unit 61 is configured to acquire a first song, where the first song includes an accompaniment to be played;
  • the determining unit 62 is configured to determine whether the first song includes a first vocal, where the first vocal is the content that is identical between a first part and a second part of the first song, the first part being the part of the first song distributed to the left channel and the second part being the part of the first song distributed to the right channel;
  • the eliminating unit 63 is configured to eliminate the first vocal if the first vocal is included, to obtain the accompaniment to be played; the processing unit 64 is configured to acquire a second vocal to be played, and perform sound optimization processing and mixing processing on the second vocal and the accompaniment to be played to obtain a second song.
  • With the above solution, after the acquiring unit acquires the first song, the eliminating unit eliminates the first vocal to obtain the accompaniment to be played, and the accompaniment can be used for karaoke singing, so that the terminal can obtain an accompaniment from songs other than professional accompaniments, which increases the number of obtainable accompaniments; performing sound optimization on the second vocal and the accompaniment can improve the quality of the mixed song and thus improve the karaoke playback effect.
  • This embodiment provides another terminal, as shown in FIG. 7, which may include:
  • the acquiring unit 71 is configured to acquire a first song, where the first song includes an accompaniment to be played;
  • the determining unit 72 is configured to determine whether the first song includes a first vocal, where the first vocal is the content that is identical between a first part and a second part of the first song, the first part being the part of the first song distributed to the left channel and the second part being the part of the first song distributed to the right channel;
  • the eliminating unit 73 is configured to eliminate the first vocal if the first vocal is included, to obtain the accompaniment to be played; the processing unit 74 is configured to acquire a second vocal to be played, and perform sound optimization processing and mixing processing on the second vocal and the accompaniment to be played to obtain a second song.
  • Further, the eliminating unit 73 is specifically configured to calculate the part with the greatest correlation between the first part and the second part, and to eliminate that maximally correlated part from the left and right channels, respectively.
  • Further, the eliminating unit 73 is specifically configured to calculate the part with the greatest correlation between the first part and the second part; calculate the product of a cancellation control factor and the maximally correlated part of the left and right channels, where the control factor is between 0 and 1; and eliminate the part corresponding to the product from the left and right channels, respectively.
  • Further, the eliminating unit 73 is specifically configured to subtract the information of the second part from the information of the first part to obtain a left channel accompaniment; subtract the information of the first part from the information of the second part to obtain a right channel accompaniment; and synthesize the left channel accompaniment and the right channel accompaniment to obtain the accompaniment to be played, where the information refers to time domain information or frequency domain information.
  • the sound optimization processing includes at least one of a sound denoising process, a sound automatic gain adjustment process, and a sound pitch adjustment process.
  • Further, the terminal may also include, but is not limited to:
  • the receiving unit 75 is configured to receive a second human voice if the song does not include the first human voice, perform sound optimization processing and mixing processing on the second human voice and the to-be-played accompaniment, to obtain a second song.
  • Further, the terminal may also include, but is not limited to:
  • the playing unit 76 is configured to play the second song.
  • With the above solution, after the acquiring unit acquires the first song, the eliminating unit eliminates the first vocal to obtain the accompaniment to be played, and the accompaniment can be used for karaoke singing, so that the terminal can obtain an accompaniment from songs other than professional accompaniments, which increases the number of obtainable accompaniments; performing sound optimization on the second vocal and the accompaniment can improve the quality of the mixed song and thus improve the karaoke playback effect.
  • This embodiment provides a server, as shown in FIG. 8, which may include:
  • the receiving unit 81 is configured to receive a request message for acquiring the first song sent by the terminal, where the first song includes an accompaniment to be played;
  • the determining unit 82 is configured to determine whether the first song includes a first vocal, where the first vocal is the content that is identical between a first part and a second part of the first song, the first part being the part of the first song distributed to the left channel and the second part being the part of the first song distributed to the right channel;
  • the eliminating unit 83 is configured to eliminate the first vocal if the first vocal is included, to obtain the accompaniment to be played; and the sending unit 84 is configured to send the accompaniment to be played to the terminal.
  • Further, the eliminating unit 83 is specifically configured to calculate the part with the greatest correlation between the first part and the second part, and to eliminate that maximally correlated part from the left and right channels, respectively.
  • Further, the eliminating unit 83 is specifically configured to calculate the part with the greatest correlation between the first part and the second part; calculate the product of a cancellation control factor and the maximally correlated part of the left and right channels, where the control factor is between 0 and 1; and eliminate the part corresponding to the product from the left and right channels, respectively.
  • Further, the eliminating unit 83 is specifically configured to subtract the information of the second part from the information of the first part to obtain a left channel accompaniment; subtract the information of the first part from the information of the second part to obtain a right channel accompaniment; and synthesize the left channel accompaniment and the right channel accompaniment to obtain the accompaniment to be played, where the information refers to time domain information or frequency domain information.
  • the sending unit 84 is further configured to send the song to the terminal if the song does not include the first vocal.
  • With the above solution, the eliminating unit eliminates the first vocal in the first song for which the receiving unit received the request, obtains the accompaniment to be played, and the sending unit sends the accompaniment information to the terminal; the terminal can use the accompaniment for karaoke singing, so that the terminal can obtain an accompaniment from songs other than professional accompaniments, which increases the number of obtainable accompaniments; performing sound optimization on the second vocal and the accompaniment can improve the quality of the mixed song and thus improve the karaoke playback effect.
  • This embodiment provides another terminal, as shown in FIG. 9, which may include:
  • the processor 91 is configured to acquire a first song, where the first song includes an accompaniment to be played; determine whether the first song includes a first vocal, where the first vocal is the content that is identical between a first part and a second part of the first song, the first part being the part of the first song distributed to the left channel and the second part being the part of the first song distributed to the right channel; if the first vocal is included, eliminate the first vocal to obtain the accompaniment to be played; and acquire a second vocal to be played, and perform sound optimization processing and mixing processing on the second vocal and the accompaniment to be played to obtain a second song.
  • With the above solution, the processor eliminates the first vocal in the first song to obtain the accompaniment to be played, and the terminal can use the accompaniment for karaoke singing, so that the terminal can obtain an accompaniment from songs other than professional accompaniments, which increases the number of obtainable accompaniments; performing sound optimization on the second vocal and the accompaniment can improve the quality of the mixed song and thus improve the karaoke playback effect.
  • This embodiment provides another terminal, as shown in FIG. 10, which may include:
  • the processor 101 is configured to acquire a first song, where the first song includes an accompaniment to be played; determine whether the first song includes a first vocal, where the first vocal is the content that is identical between a first part and a second part of the first song, the first part being the part of the first song distributed to the left channel and the second part being the part of the first song distributed to the right channel; if the first vocal is included, eliminate the first vocal to obtain the accompaniment to be played; and acquire a second vocal to be played, and perform sound optimization processing and mixing processing on the second vocal and the accompaniment to be played to obtain a second song.
  • the processor 101 is specifically configured to calculate a portion with the greatest correlation between the first portion and the second portion; and remove the portion with the most correlation among the left and right channels from the left and right channels, respectively. Further, the processor 101 is specifically configured to calculate a portion with the greatest correlation between the first part and the second part; and calculate a product of the elimination control factor and the portion with the most correlation between the left and right channels; wherein the control factor is 0 to 1. The parts corresponding to the product are eliminated from the left and right channels, respectively.
  • Further, the processor 101 is specifically configured to subtract the information of the second part from the information of the first part to obtain a left channel accompaniment; subtract the information of the first part from the information of the second part to obtain a right channel accompaniment; and synthesize the left channel accompaniment and the right channel accompaniment to obtain the accompaniment to be played, where the information refers to time domain information or frequency domain information.
  • the sound optimization processing performed by the processor 101 includes at least one of a sound denoising process, a sound automatic gain adjustment process, and a sound pitch adjustment process.
  • the terminal may further include:
  • the receiver 102 is configured to receive a second human voice if the song does not include the first human voice, perform sound optimization processing and mixing processing on the second human voice and the to-be-played accompaniment, to obtain a second song.
  • the terminal may further include:
  • the display 103 is configured to play a second song.
  • With the above solution, the processor eliminates the first vocal in the first song to obtain the accompaniment to be played, and the terminal can use the accompaniment for karaoke singing, so that the terminal can obtain an accompaniment from songs other than professional accompaniments, which increases the number of obtainable accompaniments; performing sound optimization on the second vocal and the accompaniment can improve the quality of the mixed song and thus improve the karaoke playback effect.
  • This embodiment provides another server, as shown in FIG. 11, which may include:
  • the receiver 111 is configured to receive a request message for acquiring a first song sent by the terminal, where the first song includes an accompaniment to be played;
  • the processor 112 is configured to determine whether the first song includes a first vocal, where the first vocal is the content that is identical between a first part and a second part of the first song, the first part being the part of the first song distributed to the left channel and the second part being the part of the first song distributed to the right channel; and, if the first vocal is included, eliminate the first vocal to obtain the accompaniment to be played;
  • the transmitter 113 is configured to send the accompaniment to be played to the terminal.
  • the processor 112 is specifically configured to calculate a portion having the greatest correlation between the first portion and the second portion; and removing the most relevant portion of the left and right channels from the left and right channels, respectively.
  • the processor 112 is specifically configured to calculate a portion with the greatest correlation between the first part and the second part; and calculate a product of the elimination control factor and the portion with the most correlation between the left and right channels; wherein the control factor is 0 to 1.
  • the parts corresponding to the product are eliminated from the left and right channels, respectively.
  • Further, the processor 112 is specifically configured to subtract the information of the second part from the information of the first part to obtain a left channel accompaniment; subtract the information of the first part from the information of the second part to obtain a right channel accompaniment; and synthesize the left channel accompaniment and the right channel accompaniment to obtain the accompaniment to be played, where the information refers to time domain information or frequency domain information.
  • The transmitter 113 is further configured to send the song to the terminal if the song does not include the first vocal.
  • With the above solution, the processor eliminates the first vocal in the first song to obtain the accompaniment to be played, and the transmitter sends the accompaniment information to the terminal; the terminal can use the accompaniment for karaoke singing, so that the terminal can obtain an accompaniment from songs other than professional accompaniments, which increases the number of obtainable accompaniments; performing sound optimization on the second vocal and the accompaniment can improve the quality of the mixed song and thus improve the karaoke playback effect.
  • This embodiment provides a system for information processing, as shown in FIG. 12, which may include:
  • the terminal 121 shown in FIG. 9 or FIG. 10 and the server 122 shown in FIG. 11.
  • With the above solution, after the terminal acquires the first song, the first vocal is eliminated to obtain the accompaniment to be played, and the accompaniment can be used for karaoke singing, so that the terminal can obtain an accompaniment from songs other than professional accompaniments, which increases the number of obtainable accompaniments.
  • The present invention can be implemented by software plus necessary general-purpose hardware, and certainly can also be implemented by hardware, but in many cases the former is the better implementation.
  • The part of the technical solutions of the present invention that is essential or that contributes to the prior art may be embodied in the form of a software product stored in a readable storage medium, such as a floppy disk, a hard disk, or an optical disk of a computer, and the software product includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Reverberation, Karaoke And Other Acoustics (AREA)

Abstract

The present invention provides a method, apparatus, and system for information processing. The method may include: a terminal acquires a first song; it is determined whether the first song includes a first vocal, the first vocal being the content that is identical between a first part and a second part of the first song, where the first part is the part of the first song distributed to the left channel and the second part is the part of the first song distributed to the right channel; if the first vocal is included, the first vocal is eliminated to obtain an accompaniment to be played, the first song including the accompaniment to be played and the first vocal; a second vocal to be played is acquired, and sound optimization processing and mixing processing are performed on the second vocal and the accompaniment to be played to obtain a second song. The method can be applied to a karaoke system.

Description

信息处理的方法、 装置和系统
本申请要求于 2013年 1月 7日提交中国专利局、申请号为 201310004990.1 , 发明名称为 "信息处理的方法、 装置和系统" 的中国专利申请优先权, 上述专 利的全部内容通过引用结合在本申请中。 技术领域 本发明涉及通信技术领域, 尤其涉及信息处理的方法、 装置和系统。 背景技术
卡拉 0K是一种常见的娱乐方式, 由于, 传统卡拉 0K要求场所设置有专业 设备、 和专业的卡拉 0K伴奏库等, 因此, 很多场所满足不了传统卡拉 0K的要 求。 为了解决上述问题,现有技术中,人们可以通过在终端上安装卡拉 0K系统, 以实现卡拉 0K功能, 用户可以使用安装了卡拉 0K系统的终端在任意场合中进 行卡拉 0K娱乐。 卡拉 0K系统可以包括但不限于: K歌达人、 K到爆等。 卡拉 0K系统实现卡拉 0K功能的步骤可以包括:
1.安装有卡拉 0K系统的终端在与网络处于连接状态下, 从服务器中下载用 户选取的专业伴奏;
2.终端从下载的专业伴奏中获取伴奏后, 在播放伴奏的同时, 通过与终端 连接的麦克风采集用户演唱信息; 其中, 歌曲可以为: 专业伴奏或非专业伴奏, 专业伴奏与非专业伴奏均可 以包含伴奏和人声, 但专业伴奏中的伴奏和人声不在同一个声道, 而是分别位 于右声道和左声道, 非专业伴奏中伴奏和人声在同一个声道, 也就是说, 左声 道中可以同时包含伴奏和人声, 右声道中也可以同时包含伴奏和人声。 本步骤中, 终端可以直接从右声道获取专业伴奏中的伴奏; 3. 采集用户演唱人声, 并将用户演唱人声添加到获取到的伴奏中一起播 放, 即将用户演唱人声与伴奏进行混音后播放。
在实现上述信息处理的过程中, 发明人发现现有技术中至少包含如下问题: 大部分的歌曲为非专业伴奏, 专业伴奏的数量较少, 且安装有卡拉 0K系统的终 端只能通过下载的专业伴奏得到伴奏, 因此, 安装有卡拉 0K系统的终端可获取 的伴奏的数量较少。 另外, 采集用户演唱人声后直接与伴奏进行混音处理, 混 音后的歌曲的音质较差, 降低了播放的声音的质量, 上述问题均导致卡拉 0K播 放效果较低。
发明内容
本发明的实施例提供一种信息处理的方法、 装置和系统, 解决了卡拉 0K播 放效果较低的问题。
为达到上述目的, 本发明的实施例采用如下技术方案:
第一方面, 提供一种信息处理的方法, 包括:
终端获取第一歌曲, 所述第一歌曲包含待播放伴奏;
判断所述第一歌曲中是否包含第一人声, 所述第一人声为第一个歌曲中第 一部分与第二部分之间存在的内容相同的部分, 所述第一部分为所述第一歌曲 分布于左声道的部分, 所述第二部分为所述第一歌曲分布于右声道的部分; 若包含所述第一人声, 则消除所述第一人声, 得到所述待播放伴奏; 接收待播放的第二人声, 将所述第二人声和所述待播放伴奏进行声音优化 处理以及混音处理, 得到第二歌曲。
在第一种可能的实现方式中, 所述消除所述第一人声包括:
计算所述第一部分和第二部分之间相关性最大的部分;
分别从左右声道中消除左右声道中相关性最大的部分。 结合第一方面, 在第二种可能的实现方式中, 所述消除所述第一人声包括: 计算所述第一部分和第二部分之间相关性最大的部分;
计算消除控制因子与左右声道中相关性最大的部分的乘积; 其中所述控制 因子为 0到 1之间;
分别从左右声道中消除与所述乘积对应的部分。
结合第一方面, 在第三种可能的实现方式中, 所述消除所述第一人声, 得 到所述待播放伴奏包括:
用所述第一部分的信息减去第二部分的信息, 得到左声道伴奏;
用所述第二部分的信息减去第一部分的信息, 得到右声道伴奏;
合成所述左声道伴奏和所述右声道伴奏, 得到所述待播放伴奏, 所述信息 指时域信息或频域信息。
结合第一方面或第一方面的第一种可能的实现方式至第三种可能的实现方 式中任意一种实现方式, 在第四种可能的实现方式中, 所述声音优化处理包括: 声音去噪处理、 声音自动增益调整处理、 声音音调调整处理中至少一项。
结合第一方面或第一方面的第一种可能的实现方式至第四种可能的实现方 式中任意一种实现方式, 在第五种可能的实现方式中, 还包括:
若所述歌曲不包含所述第一人声, 则接收第二人声, 将第二人声和待播放 伴奏进行声音优化处理以及混音处理, 得到第二歌曲。
结合第一方面或第一方面的第一种可能的实现方式至第五种可能的实现方 式中任意一种实现方式, 在第六种可能的实现方式中, 还包括:
播放所述第二歌曲 。
第二方面, 提供另一种信息处理的方法, 包括:
接收终端发送的获取第一歌曲的请求消息, 所述第一歌曲包含待播放伴奏; 判断所述第一歌曲中是否包含第一人声, 所述第一人声为第一个歌曲中第 一部分与第二部分之间存在的内容相同的部分, 所述第一部分为所述第一歌曲 分布于左声道的部分, 所述第二部分为所述第一歌曲分布于右声道的部分; 若包含所述第一人声, 则消除所述第一人声, 得到所述待播放伴奏; 向所述终端发送所述待播放伴奏。
在第一种可能的实现方式中, 所述消除所述第一人声包括:
计算所述第一部分和第二部分之间相关性最大的部分;
分别从左右声道中消除左右声道中相关性最大的部分。
结合第二方面, 在第二种可能的实现方式中, 所述消除所述第一人声包括: 计算所述第一部分和第二部分之间相关性最大的部分;
计算消除控制因子与左右声道中相关性最大的部分的乘积; 其中所述控制 因子为 0到 1之间;
分别从左右声道中消除与所述乘积对应的部分。
结合第二方面, 在第三种可能的实现方式中, 所述消除所述第一人声包括: 用所述第一部分的信息减去第二部分的信息, 得到左声道伴奏;
用所述第二部分的信息减去第一部分的信息, 得到右声道伴奏;
合成所述左声道伴奏和所述右声道伴奏, 得到所述待播放伴奏, 所述信息 指时域信息或频域信息。
结合第二方面或第二方面的第一种可能的实现方式至第三种可能的实现方 式中任意一种实现方式, 在第四种可能的实现方式中, 还包括:
若所述歌曲不包含所述第一人声, 则向所述终端发送所述歌曲。
第三方面, 提供一种终端, 包括:
获取单元, 用于获取第一歌曲, 所述第一歌曲包含待播放伴奏; 判断单元, 用于判断所述第一歌曲中是否包含第一人声, 所述第一人声为 第一个歌曲中第一部分与第二部分之间存在的内容相同的部分, 所述第一部分 为所述第一歌曲分布于左声道的部分, 所述第二部分为所述第一歌曲分布于右 声道的部分;
消除单元, 用于若包含所述第一人声, 则消除所述第一人声, 得到所述待 播放伴奏;
处理单元, 用于获取待播放的第二人声, 将所述第二人声和所述待播放伴 奏进行声音优化处理以及混音处理, 得到第二歌曲。
第四方面, 提供一种服务器, 包括:
接收单元, 用于接收终端发送的获取第一歌曲的请求消息, 所述第一歌曲 包含待播放伴奏;
判断单元, 用于判断所述第一歌曲中是否包含第一人声, 所述第一人声为 第一个歌曲中第一部分与第二部分之间存在的内容相同的部分, 所述第一部分 为所述第一歌曲分布于左声道的部分, 所述第二部分为所述第一歌曲分布于右 声道的部分;
消除单元, 用于若包含所述第一人声, 则消除所述第一人声, 得到所述待 播放伴奏;
发送单元, 用于向所述终端发送所述待播放伴奏。
第五方面, 提供一种信息处理的系统, 包括:
第三方面提供的终端和第四方面提供的服务器。 本发明实施例提供的信息处理的方法、 装置和系统, 采用上述方案后, 终 端获取第一歌曲后, 消除第一人声, 得到待播放伴奏, 利用伴奏可进行卡拉 0K 演唱, 使得终端可以根据除专业伴奏外的其他歌曲得到伴奏, 增加了可获取到 的伴奏的数量。
附图说明
为了更清楚地说明本发明实施例中的技术方案, 下面将对实施例描述中所 需要使用的附图作筒单地介绍, 显而易见地, 下面描述中的附图仅仅是本发明 的一些实施例, 对于本领域普通技术人员来讲, 在不付出创造性劳动的前提下, 还可以根据这些附图获得其他的附图。
图 1为本实施例提供的一种以终端为执行主体的信息处理的方法流程图; 图 2为本实施例提供的另一种以终端为执行主体的信息处理的方法流程图; 图 3为图 2所示的信息处理的方法的流程框图;
图 4为本实施例提供的一种以服务器为执行主体的信息处理的方法流程图; 图 5 为本实施例提供的另一种以服务器为执行主体的信息处理的方法流程 图;
图 6为本实施例提供的一种终端虚拟装置结构示意图;
图 7为本实施例提供的另一种终端虚拟装置结构示意图;
图 8为本实施例提供的一种服务器实体装置结构示意图;
图 9为本实施例提供的一种终端实体装置结构示意图;
图 10为本实施例提供的另一种终端实体装置结构示意图;
图 11为本实施例提供的一种服务器实体装置结构示意图;
图 12为本实施例提供的一种信息处理的系统结构示意图。
具体实施方式
下面将结合本发明实施例中的附图, 对本发明实施例中的技术方案进行清 楚、 完整地描述, 显然, 所描述的实施例仅仅是本发明一部分实施例, 而不是 全部的实施例。 基于本发明中的实施例, 本领域普通技术人员在没有作出创造 性劳动前提下所获得的所有其他实施例, 都属于本发明保护的范围。
卡拉 0K系统可以安装在各种终端中, 其中, 终端可以为但不限于: 手机、 电脑以及其他可采集和播放声音的电子设备中。
现有技术中, 卡拉 0K可获取到的伴奏的数量较少, 播放的声音质量较低, 导致卡拉 0K播放效果较低。
为了解决上述问题, 本实施例提供一种信息处理的方法, 用于从非专业伴 奏的歌曲中提取播放伴奏以支持人声演唱, 该方法的执行主体可以为安装有卡 拉 0K系统的终端, 如图 1所示, 可以包括以下步骤:
101、 终端获取第一歌曲, 第一歌曲包含待播放伴奏;
102、 判断第一歌曲中是否包含第一人声, 第一人声为第一个歌曲中第一部 分与第二部分之间存在的内容相同的部分, 第一部分为第一歌曲分布于左声道 的部分, 第二部分为第一歌曲分布于右声道的部分;
若第一歌曲中包含第一人声, 则执行步骤 103; 若第一歌曲中不包含第一人 声, 则执行步骤 104。
103、 消除第一人声, 得到待播放伴奏;
一个实施例中, 通过如下方式消除第一人声: 计算第一部分和第二部分之 间相关性最大的部分; 分别从左右声道中消除左右声道中相关性最大的部分。 消除左右声道中相关性最大的部分, 保留下不相关的左右声道信号即伴奏信号。
另一个实施例中, 计算第一部分和第二部分之间相关性最大的部分; 分别从左右声道中消除控制因子与左右声道中相关性最大的部分的乘积, 其中控制因子为 0 到 1 之间。 消除左右声道中相关性最大的部分过程中, 人声消除的程度可以由用户自己控制: 在用户需要原声提示的时候可以选择少消除人声, 由原唱人声进行引导; 在用户不需要原声时可以选择多消除人声。
以左右声道信号为频域信息为例, 接收到的左声道信号的复数域频域信息为 L = [Lr, Li], Lr 为实部, Li 为虚部; 右声道信号的频域信息为 R = [Rr, Ri], Rr 为实部, Ri 为虚部。
计算左、右声道相关性最大部分, 按照用户控制信息进行人声消除, 消除后的左声道信息为 Lo = [Lor, Loi], 右声道信息为 Ro = [Ror, Roi], 其中 C 表示左右声道相同的部分。 经过消除人声处理后的左右声道信号在理想条件下应该是两个不相关信号, 即

(L − λ·C) · (R − λ·C) = 0

消除后的各分量为

Lor = Lr − α·λ·Cr, Loi = Li − α·λ·Ci
Ror = Rr − α·λ·Cr, Roi = Ri − α·λ·Ci

其中 0 ≤ α ≤ 1 是用户控制信息, 0 表示完全不消除人声, 1 表示彻底消除人声。 C = [Cr, Ci] 为初始估计的左右声道相同的部分:

C = al·L + ar·R
al = (Lr·Rr + Li·Ri) / (Lr·Lr + Li·Li)
ar = (Lr·Rr + Li·Ri) / (Rr·Rr + Ri·Ri)

其中 λ 表示左右声道相关程度, 由上述不相关条件展开得到的二次方程 A·λ² + B·λ + (Lr·Rr + Li·Ri) = 0 求解:

λ = (−B + √D) / (2·A)
A = Cr·Cr + Ci·Ci
B = −[Cr·(Lr + Rr) + Ci·(Li + Ri)]
D = B² − 4·A·(Lr·Rr + Li·Ri)

另一个实施例中, 通过左右声道直接频域/时域相减消除第一人声。
接收到的左声道信息为 L, 右声道信息为 R, 这里的左右声道信息可以是时域信息也可以是频域信息。 可以简单地认为左右声道的信息是由伴奏信息和第一人声信息相加得到, 即 L = L' + C, R = R' + C, 其中 C 为第一人声信息, 所以可以通过在时域或者频域直接将左右声道相减来消除第一人声。
以左右声道信号为频域信息为例, 接收到的左声道信号的复数域频域信息为 L = [Lr, Li], Lr 为实部, Li 为虚部; 右声道信号的频域信息为 R = [Rr, Ri], Rr 为实部, Ri 为虚部。
左右声道信号在频域相减进行人声消除, 消除后的左声道信息为 Lo = [Lor, Loi], 右声道信息为 Ro = [Ror, Roi] (原文此处为公式插图 Figure imgf000010_0001, 即对两声道逐分量相减, 例如 Lor = Lr − Rr, Loi = Li − Ri)。
对消除后的左右声道信息合成后得到的即为伴奏信息。
以左右声道信号为时域信息为例, 接收到的左声道信号的时域信息为 l, 右声道信号的时域信息为 r。
左右声道信号在时域相减进行人声消除, 消除后的左声道时域信息为 l₀, 右声道时域信息为 r₀, 其中

l₀ = l − r
r₀ = l − r
对消除后的左右声道信息合成后得到的即为伴奏信息。 104、 接收待播放的第二人声, 将第二人声和待播放伴奏进行声音优化处理 以及混音处理, 得到第二歌曲;
该步骤中, 需要第二人声和待播放伴奏进行声音优化处理以及对第二人声 和待播放伴奏进行混音处理; 该步骤中的声音优化处理和混音处理的顺序可以 是任意的, 例如: 可以先对第二人声和待播放伴奏分别进行声音优化处理, 再 对优化后的第二人声和待播放伴奏进行混音处理, 得到第二歌曲; 也可以先对 第二人声和待播放伴奏进行混音处理, 再对混音后的第二人声和待播放伴奏进 行声音优化处理, 得到第二歌曲。
采用上述方案后, 终端获取第一歌曲后, 消除第一人声, 得到待播放伴奏, 利用伴奏可进行卡拉 0K演唱, 使得终端可以根据除专业伴奏外的其他歌曲得到 伴奏, 增加了可获取到的伴奏的数量; 对第二人声、 和伴奏进行声音优化处理, 可以提高混音后的歌曲的质量, 进而提高了卡拉 0K播放效果。
本实施例提供另一种信息处理的方法, 用于从非专业伴奏的歌曲中提取播 放伴奏以支持人声演唱, 该方法是对图 1 的方法的进一步扩展, 且执行主体可 以为安装有卡拉 0K系统的终端, 如图 2所示, 可以包括:
201、 终端获取第一歌曲, 第一歌曲包含待播放伴奏。
终端在播放用户指定的伴奏之前, 首先需要获取包含有待播放伴奏的歌曲 然后再从获取到的第一歌曲中获取待播放伴奏。
进一步的, 终端可以但不限于从服务器存储的歌曲中获取第一歌曲, 第一 歌曲可以存储于服务器、 或终端中至少一个设备上。
作为本实施例的一种实施方式, 终端可以通过多种方式从服务器端中获取 第一歌曲, 如通过文字信息, 通过语音命令, 通过哼唱检索, 通过用户在终端 上录一段音乐进行音频指纹检索等。 本实施例对终端获取第一歌曲的方法不作限定, 可以根据实际需要进行设 定, 例如, 可以从任意存储有第一歌曲的设备中获取, 在此不再赘述。
202、判断第一歌曲中是否包含第一人声。若包含第一人声,则执行步骤 203 ; 若不包含第一人声信息, 则执行步骤 204。
进一步的, 第一人声可以为第一个歌曲中第一部分与第二部分之间存在的 内容相同的部分, 第一部分可以为第一歌曲分布于左声道的部分, 第二部分可 以为第一歌曲分布于右声道的部分。
通常的, 第一歌曲可以包含伴奏、 和 /或第一人声, 且第一歌曲可以被分配 于左声道和 /或右声道。
若第一歌曲为非专业伴奏, 则第一歌曲中的第一人声可以被平均分布于左 声道和右声道, 即等量分布于左声道和右声道, 第一人声各有 50%分布在左右声 道, 第一歌曲中的伴奏可以被以任意比例分布于左声道和右声道, 这样的第一 歌曲可以称为非专业伴奏。
也就是说, 若第一歌曲为非专业伴奏, 则第一人声可以为第一歌曲中被分 布于左声道的第一部分与被分布于右声道的第二部分之间存在的内容相同且等 量分布于左声道与右声道的部分。 则此时, 终端判断第一歌曲中是否包含第一 人声可以包括: 终端判断第一歌曲被分布于左声道的第一部分与被分布于右声 道的第二部分之间是否存在等量分布于左声道与右声道的部分。
若被分布于左声道的第一部分与被分布于右声道的第二部分之间存在等量 分布于左声道与右声道的部分, 则该第一歌曲中可能包含第一人声; 若被分布 于左声道的第一部分与被分布于右声道的第二部分之间不存在等量分布于左声 道与右声道的部分, 则该第一歌曲中可能不包含第一人声。
或者, 如背景技术中描述的, 若第一歌曲为专业伴奏, 则第一歌曲中的伴 奏被分布于右声道, 第一歌曲中的第一人声被分布于左声道, 则此时, 终端判 断第一歌曲中是否包含第一人声可以包括: 终端判断左声道中是否包含相应内 容, 若包含, 则第一歌曲中可能包含第一人声。
若第一歌曲为专业伴奏, 则判断第一歌曲中是否包含第一人声的方法为现 有技术, 在此不再赘述。 本实施例以第一歌曲为非专业伴奏为例进行说明。
本实施例对歌曲被分布于左声道的第一部分与被分布于右声道的第二部分 之间的比例不作限定, 可以根据实际需要进行设定, 在此不再赘述。
203. Eliminate the first vocal to obtain the accompaniment to be played.

Further optionally, if the first song is a non-professional accompaniment, eliminating the first vocal may include: computing the maximally correlated part between the first part and the second part, and removing that maximally correlated part from the left and right channels respectively.

Further optionally, if the first song is a non-professional accompaniment, eliminating the first vocal may also include: computing the maximally correlated part between the first part and the second part; computing the product of a control factor and that maximally correlated part, where the control factor lies between 0 and 1; and removing the part corresponding to that product from the left and right channels respectively.

This embodiment does not limit the value of the control factor; it can be set according to actual needs and is not repeated here.

Further optionally, if the first song is a non-professional accompaniment, eliminating the first vocal to obtain the accompaniment to be played may include:

subtracting the second part's information from the first part's information to obtain a left-channel accompaniment; subtracting the first part's information from the second part's information to obtain a right-channel accompaniment; and synthesizing the left-channel accompaniment and the right-channel accompaniment to obtain the accompaniment to be played, where the information refers to time-domain information or frequency-domain information. Although this method may also remove part of the accompaniment from the first song, it guarantees that the first vocal is removed completely and does not degrade the terminal's playback effect.
This embodiment does not limit the method of eliminating the first vocal; it can be set according to actual needs, several options are available, and the user may choose a suitable accompaniment.

For example, following the user's instruction the terminal may switch freely between "original" and "accompaniment-only": "original" may contain both the accompaniment and the vocal, while "accompaniment-only" may contain the accompaniment without the vocal.

Further optionally, the terminal may also adjust the volume of the first vocal in the song: the lowest volume corresponds to "accompaniment-only" and the highest volume to "original".
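One possible mapping from such a vocal-volume setting to the control factor alpha used in the correlation-based removal above; the linear 0-100 scale is an assumption for illustration:

```python
def vocal_volume_to_alpha(volume: int) -> float:
    """Map a 0-100 first-vocal volume slider to the removal factor alpha:
    volume 0 -> alpha = 1.0 (accompaniment-only), volume 100 -> alpha = 0.0 (original)."""
    volume = max(0, min(100, volume))
    return 1.0 - volume / 100.0
```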
204. Obtain a second vocal to be played, and perform sound optimization processing and mixing processing on the second vocal and the accompaniment to be played, to obtain a second song. The second vocal is the vocal sung by the user.

To ensure a high quality of the sound played back in Karaoke, after the second vocal is obtained, sound optimization processing is performed on the second vocal and the accompaniment.

As one implementation of this embodiment, the second vocal may be the voice of the user singing along with the accompaniment played by the terminal.

This embodiment does not limit how the terminal obtains the second vocal; any method familiar to those skilled in the art may be used, for example the terminal may capture the second vocal through a microphone or another sound collection device connected to the terminal, and details are not repeated here.

Further, the sound optimization processing may include, but is not limited to, at least one of: sound denoising processing, automatic gain adjustment processing, and pitch adjustment processing.

Among these, denoising can enhance and beautify the recorded singing voice, i.e. the second vocal, and the accompaniment, and can remove the adverse effect of the various noises of the actual recording environment on the singing voice; automatic gain adjustment can likewise enhance and beautify the recorded singing voice and accompaniment, and can remove the unstable energy levels of vocals and accompaniments recorded under non-professional conditions; pitch adjustment can better match the user's personal preferences, for example in pitch, and can also beautify the rhythm and pitch of the recorded singing voice and accompaniment.
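As an illustration of the automatic gain adjustment component alone, a frame-wise RMS normalization sketch; the target level and frame size are assumed values, and denoising and pitch adjustment would be separate stages:

```python
import numpy as np

def simple_agc(x: np.ndarray, target_rms: float = 0.1, frame: int = 1024) -> np.ndarray:
    """Scale each frame toward a target RMS so that vocal and accompaniment
    levels recorded under non-professional conditions stay stable."""
    y = x.astype(float).copy()
    for start in range(0, len(y), frame):
        seg = y[start:start + frame]
        rms = np.sqrt(np.mean(seg ** 2)) + 1e-12
        y[start:start + frame] = seg * (target_rms / rms)
    return y
```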
This embodiment does not limit the sound optimization methods; they are techniques familiar to those skilled in the art and can be set according to actual needs, so details are not repeated here.

This embodiment does not limit the order in which the denoising, automatic gain adjustment, and pitch adjustment are performed; it can be set according to actual needs. The processing may be selected by the user, performed automatically by the terminal according to the actual situation, or carried out in other ways, and details are not repeated here.

Mixing refers to the process of blending multiple sound sources such as dialogue, music, and sound effects; in this embodiment, mixing is the process of blending the second vocal with the accompaniment. Performing sound optimization on the second vocal and the accompaniment improves the quality of the mixed song and thus the Karaoke playback effect.

This embodiment does not limit the mixing technique, which is familiar to those skilled in the art and is not described further here. It should be noted that this embodiment does not limit the order of the optimization processing and the mixing processing of the second vocal and the accompaniment; it can be set according to actual needs and is not repeated here.
205. Play the second song.

206. Send a specified song to the server, so that the server stores the specified song.

It should be noted that this embodiment does not limit where step 206 is executed within the embodiment; it may be executed before or after any step of this embodiment, and no further limitation or description is given here.

As one implementation of this embodiment, when the terminal obtains the first song from the songs stored on the server, the terminal may send more specified songs to the server in order to increase the number of songs stored there, so that terminals can obtain them later.

The specified song may contain an accompaniment, or the specified song may contain a first vocal and an accompaniment. Further, the terminal may encode, compress, and store the mixed song, upload it over the network for sharing in an online community, and take part in public rating interactions.

This embodiment does not limit how the terminal uploads the mixed song to the network; it can be set according to actual needs and is not repeated here.

FIG. 3 is a flow block diagram of one implementation of this embodiment.

In FIG. 3, the "original audio signal" is the song; "vocal removal" is the elimination of the first vocal; the "singing voice" is the second vocal. The mixing in this embodiment may include reverberation processing, but in FIG. 3 mixing and reverberation are shown as separate operations, and the information after reverberation is played back.

The implementation of FIG. 3 is only one implementation of this embodiment; other implementations are possible and are not limited to the one shown in FIG. 3, so they are not described further here.

With the above solution, after obtaining the first song the terminal removes the first vocal to obtain the accompaniment to be played, and Karaoke singing can be performed with that accompaniment. The terminal can therefore derive an accompaniment from songs other than professional accompaniments, which increases the number of accompaniments that can be obtained; performing sound optimization on the second vocal and the accompaniment improves the quality of the mixed song and thus the Karaoke playback effect.
This embodiment provides another information processing method, executed by a server. An application scenario may be that a terminal obtains the first song from the server. As shown in FIG. 4, the method may include:

401. Receive a request message sent by a terminal for obtaining a first song, where the first song contains an accompaniment to be played.

402. Determine whether the first song contains a first vocal, where the first vocal is the part whose content is identical between a first part and a second part of the first song, the first part being the part of the first song distributed to the left channel and the second part being the part distributed to the right channel.

403. If the first vocal is contained, eliminate the first vocal to obtain the accompaniment to be played.

404. Send the accompaniment to be played to the terminal. With the above solution, after the first song is obtained the first vocal is removed to obtain the accompaniment to be played, Karaoke singing can be performed with that accompaniment, and the terminal can derive accompaniments from songs other than professional accompaniments, which increases the number of accompaniments that can be obtained.
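The server-side flow of steps 401 to 404 can be summarized as a plain function; `song_store`, `detect_vocal`, and `remove_vocal` are assumed helper objects standing in for the judging and eliminating logic described above:

```python
def handle_accompaniment_request(song_id, song_store, detect_vocal, remove_vocal):
    """Server-side sketch of steps 401-404."""
    song = song_store[song_id]               # 401: the request names the first song
    if detect_vocal(song):                   # 402: does it contain a first vocal?
        accompaniment = remove_vocal(song)   # 403: eliminate the first vocal
    else:
        accompaniment = song                 # the song is already an accompaniment
    return accompaniment                     # 404: send back to the terminal
```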
This embodiment provides another information processing method, which is a further extension of the method shown in FIG. 4. As shown in FIG. 5, it may include:

501. Receive a request message sent by a terminal for obtaining a first song, where the first song contains an accompaniment to be played.

502. Determine whether the first song contains a first vocal. If it does, perform step 503; if it does not, perform step 504.

Further, the first vocal may be the part whose content is identical between the first part and the second part of the first song, the first part being the part of the first song distributed to the left channel and the second part being the part distributed to the right channel.

503. If the first vocal is contained, eliminate the first vocal to obtain the accompaniment to be played.

Further optionally, eliminating the first vocal may include:

computing the maximally correlated part between the first part and the second part, and removing that maximally correlated part from the left and right channels respectively.

Further optionally, eliminating the first vocal may also include:

computing the maximally correlated part between the first part and the second part; computing the product of a control factor and that maximally correlated part, where the control factor lies between 0 and 1; and removing the part corresponding to that product from the left and right channels respectively.

Further optionally, eliminating the first vocal may also include:

subtracting the second part's information from the first part's information to obtain a left-channel accompaniment; subtracting the first part's information from the second part's information to obtain a right-channel accompaniment; and synthesizing the left-channel accompaniment and the right-channel accompaniment to obtain the accompaniment to be played, where the information refers to time-domain or frequency-domain information. The specific calculation process for eliminating the first vocal in the embodiment corresponding to FIG. 1 applies equally to this embodiment.

504. Send the accompaniment to be played to the terminal.

With the above solution, after the first song is obtained the first vocal is removed to obtain the accompaniment to be played, Karaoke singing can be performed with that accompaniment, and the terminal can derive accompaniments from songs other than professional accompaniments, which increases the number of accompaniments that can be obtained.
Apparatus embodiments are provided below, each corresponding to the method embodiments provided above.

This embodiment provides a terminal which, as shown in FIG. 6, may include:

an obtaining unit 61, configured to obtain a first song, where the first song contains an accompaniment to be played;

a judging unit 62, configured to determine whether the first song contains a first vocal, where the first vocal is the part whose content is identical between a first part and a second part of the first song, the first part being the part of the first song distributed to the left channel and the second part being the part distributed to the right channel;

an eliminating unit 63, configured to eliminate the first vocal if it is contained, to obtain the accompaniment to be played; and a processing unit 64, configured to obtain a second vocal to be played and perform sound optimization processing and mixing processing on the second vocal and the accompaniment to be played to obtain a second song.

With the above solution, after the obtaining unit obtains the first song, the eliminating unit removes the first vocal to obtain the accompaniment to be played, and Karaoke singing can be performed with that accompaniment. The terminal can therefore derive an accompaniment from songs other than professional accompaniments, which increases the number of accompaniments that can be obtained; performing sound optimization on the second vocal and the accompaniment improves the quality of the mixed song and thus the Karaoke playback effect.
This embodiment provides another terminal, which is a further extension of the terminal of FIG. 6. As shown in FIG. 7, it may include:

an obtaining unit 71, configured to obtain a first song, where the first song contains an accompaniment to be played; a judging unit 72, configured to determine whether the first song contains a first vocal, where the first vocal is the part whose content is identical between a first part and a second part of the first song, the first part being the part of the first song distributed to the left channel and the second part being the part distributed to the right channel;

an eliminating unit 73, configured to eliminate the first vocal if it is contained, to obtain the accompaniment to be played; and a processing unit 74, configured to obtain a second vocal to be played and perform sound optimization processing and mixing processing on the second vocal and the accompaniment to be played to obtain a second song.

Further, the eliminating unit 73 is specifically configured to compute the maximally correlated part between the first part and the second part, and remove that maximally correlated part from the left and right channels respectively.

Further, the eliminating unit 73 is specifically configured to compute the maximally correlated part between the first part and the second part; compute the product of a control factor and that maximally correlated part, where the control factor lies between 0 and 1; and remove the part corresponding to that product from the left and right channels respectively.

Further, the eliminating unit 73 is specifically configured to subtract the second part's information from the first part's information to obtain a left-channel accompaniment, subtract the first part's information from the second part's information to obtain a right-channel accompaniment, and synthesize the left-channel accompaniment and the right-channel accompaniment to obtain the accompaniment to be played, where the information refers to time-domain or frequency-domain information.

Further, the sound optimization processing includes at least one of: sound denoising processing, automatic gain adjustment processing, and pitch adjustment processing.

Further, the terminal may, but is not limited to, further include:

a receiving unit 75, configured to, if the song does not contain the first vocal, receive the second vocal and perform sound optimization processing and mixing processing on the second vocal and the accompaniment to be played to obtain the second song.

Further, the terminal may, but is not limited to, further include:

a playing unit 76, configured to play the second song.

With the above solution, after the obtaining unit obtains the first song, the eliminating unit removes the first vocal to obtain the accompaniment to be played, and Karaoke singing can be performed with that accompaniment. The terminal can therefore derive an accompaniment from songs other than professional accompaniments, which increases the number of accompaniments that can be obtained; performing sound optimization on the second vocal and the accompaniment improves the quality of the mixed song and thus the Karaoke playback effect.
This embodiment provides a server which, as shown in FIG. 8, may include:

a receiving unit 81, configured to receive a request message sent by a terminal for obtaining a first song, where the first song contains an accompaniment to be played;

a judging unit 82, configured to determine whether the first song contains a first vocal, where the first vocal is the part whose content is identical between a first part and a second part of the first song, the first part being the part of the first song distributed to the left channel and the second part being the part distributed to the right channel;

an eliminating unit 83, configured to eliminate the first vocal if it is contained, to obtain the accompaniment to be played; and a sending unit 84, configured to send the accompaniment to be played to the terminal.

Further, the eliminating unit 83 is specifically configured to compute the maximally correlated part between the first part and the second part, and remove that maximally correlated part from the left and right channels respectively.

Further, the eliminating unit 83 is specifically configured to compute the maximally correlated part between the first part and the second part; compute the product of a control factor and that maximally correlated part, where the control factor lies between 0 and 1; and remove the part corresponding to that product from the left and right channels respectively.

Further, the eliminating unit 83 is specifically configured to subtract the second part's information from the first part's information to obtain a left-channel accompaniment, subtract the first part's information from the second part's information to obtain a right-channel accompaniment, and synthesize the left-channel accompaniment and the right-channel accompaniment to obtain the accompaniment to be played, where the information refers to time-domain or frequency-domain information.

Further, the sending unit 84 is also configured to send the song to the terminal if the song does not contain the first vocal.

With the above solution, the eliminating unit removes the first vocal from the first song received by the receiving unit to obtain the accompaniment to be played, and the accompaniment information is sent to the terminal. The terminal can perform Karaoke singing with that accompaniment and can derive accompaniments from songs other than professional accompaniments, which increases the number of accompaniments that can be obtained; performing sound optimization on the second vocal and the accompaniment improves the quality of the mixed song and thus the Karaoke playback effect.
This embodiment provides another terminal which, as shown in FIG. 9, may include:

a processor 91, configured to obtain a first song, where the first song contains an accompaniment to be played; determine whether the first song contains a first vocal, where the first vocal is the part whose content is identical between a first part and a second part of the first song, the first part being the part of the first song distributed to the left channel and the second part being the part distributed to the right channel; eliminate the first vocal if it is contained, to obtain the accompaniment to be played; and obtain a second vocal to be played and perform sound optimization processing and mixing processing on the second vocal and the accompaniment to be played to obtain a second song.

With the above solution, the processor removes the first vocal from the first song to obtain the accompaniment to be played, and the terminal can perform Karaoke singing with that accompaniment and can derive accompaniments from songs other than professional accompaniments, which increases the number of accompaniments that can be obtained; performing sound optimization on the second vocal and the accompaniment improves the quality of the mixed song and thus the Karaoke playback effect.

This embodiment provides another terminal, which is a further extension of the terminal shown in FIG. 9. As shown in FIG. 10, it may include:

a processor 101, configured to obtain a first song, where the first song contains an accompaniment to be played; determine whether the first song contains a first vocal, where the first vocal is the part whose content is identical between a first part and a second part of the first song, the first part being the part of the first song distributed to the left channel and the second part being the part distributed to the right channel; eliminate the first vocal if it is contained, to obtain the accompaniment to be played; and obtain a second vocal to be played and perform sound optimization processing and mixing processing on the second vocal and the accompaniment to be played to obtain a second song.

Further, the processor 101 is specifically configured to compute the maximally correlated part between the first part and the second part, and remove that maximally correlated part from the left and right channels respectively. Further, the processor 101 is specifically configured to compute the maximally correlated part between the first part and the second part; compute the product of a control factor and that maximally correlated part, where the control factor lies between 0 and 1; and remove the part corresponding to that product from the left and right channels respectively.

Further, the processor 101 is specifically configured to subtract the second part's information from the first part's information to obtain a left-channel accompaniment, subtract the first part's information from the second part's information to obtain a right-channel accompaniment, and synthesize the left-channel accompaniment and the right-channel accompaniment to obtain the accompaniment to be played, where the information refers to time-domain or frequency-domain information.

Further, the sound optimization processing performed by the processor 101 includes at least one of: sound denoising processing, automatic gain adjustment processing, and pitch adjustment processing.

Further, the terminal may further include:

a receiver 102, configured to, if the song does not contain the first vocal, receive the second vocal and perform sound optimization processing and mixing processing on the second vocal and the accompaniment to be played to obtain the second song.

Further, the terminal may further include:

a display 103, configured to play the second song.

With the above solution, the processor removes the first vocal from the first song to obtain the accompaniment to be played, and the terminal can perform Karaoke singing with that accompaniment and can derive accompaniments from songs other than professional accompaniments, which increases the number of accompaniments that can be obtained; performing sound optimization on the second vocal and the accompaniment improves the quality of the mixed song and thus the Karaoke playback effect.
This embodiment provides another server which, as shown in FIG. 11, may include:

a receiver 111, configured to receive a request message sent by a terminal for obtaining a first song, where the first song contains an accompaniment to be played;

a processor 112, configured to determine whether the first song contains a first vocal, where the first vocal is the part whose content is identical between a first part and a second part of the first song, the first part being the part of the first song distributed to the left channel and the second part being the part distributed to the right channel, and to eliminate the first vocal if it is contained, to obtain the accompaniment to be played; and

a transmitter 113, configured to send the accompaniment to be played to the terminal.

Further, the processor 112 is specifically configured to compute the maximally correlated part between the first part and the second part, and remove that maximally correlated part from the left and right channels respectively.

Further, the processor 112 is specifically configured to compute the maximally correlated part between the first part and the second part; compute the product of a control factor and that maximally correlated part, where the control factor lies between 0 and 1; and remove the part corresponding to that product from the left and right channels respectively.

Further, the processor 112 is specifically configured to subtract the second part's information from the first part's information to obtain a left-channel accompaniment, subtract the first part's information from the second part's information to obtain a right-channel accompaniment, and synthesize the left-channel accompaniment and the right-channel accompaniment to obtain the accompaniment to be played, where the information refers to time-domain or frequency-domain information.

Further, the transmitter 113 is also configured to send the song to the terminal if the song does not contain the first vocal. With the above solution, the processor removes the first vocal from the first song to obtain the accompaniment to be played and sends the accompaniment information to the terminal. The terminal can perform Karaoke singing with that accompaniment and can derive accompaniments from songs other than professional accompaniments, which increases the number of accompaniments that can be obtained; performing sound optimization on the second vocal and the accompaniment improves the quality of the mixed song and thus the Karaoke playback effect.
This embodiment provides an information processing system which, as shown in FIG. 12, may include:

the terminal 121 shown in FIG. 6 or FIG. 7 and the server 122 shown in FIG. 8.

With the above solution, after the terminal obtains the first song, the first vocal is removed to obtain the accompaniment to be played, Karaoke singing can be performed with that accompaniment, and the terminal can derive accompaniments from songs other than professional accompaniments, which increases the number of accompaniments that can be obtained.

This embodiment provides an information processing system which, as shown in FIG. 12, may include:

the terminal 121 shown in FIG. 9 or FIG. 10 and the server 122 shown in FIG. 11. With the above solution, after the terminal obtains the first song, the first vocal is removed to obtain the accompaniment to be played, Karaoke singing can be performed with that accompaniment, and the terminal can derive accompaniments from songs other than professional accompaniments, which increases the number of accompaniments that can be obtained.
From the description of the above embodiments, those skilled in the art will clearly understand that the present invention may be implemented by software plus the necessary general-purpose hardware, and of course also by hardware alone, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product. The computer software product is stored on a readable storage medium, such as a computer floppy disk, hard disk, or optical disc, and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present invention.

The foregoing is merely specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto; any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. An information processing method, characterized by comprising:
obtaining, by a terminal, a first song, wherein the first song contains an accompaniment to be played;
determining whether the first song contains a first vocal, wherein the first vocal is a part whose content is identical between a first part and a second part of the first song, the first part is the part of the first song distributed to a left channel, and the second part is the part of the first song distributed to a right channel;
if the first vocal is contained, eliminating the first vocal to obtain the accompaniment to be played; and
receiving a second vocal to be played, and performing sound optimization processing and mixing processing on the second vocal and the accompaniment to be played to obtain a second song.

2. The information processing method according to claim 1, characterized in that eliminating the first vocal comprises:
computing the maximally correlated part between the first part and the second part; and
removing the maximally correlated part from the left and right channels respectively.

3. The information processing method according to claim 1, characterized in that eliminating the first vocal comprises:
computing the maximally correlated part between the first part and the second part;
computing the product of a control factor and the maximally correlated part of the left and right channels, wherein the control factor lies between 0 and 1; and
removing the part corresponding to the product from the left and right channels respectively.

4. The information processing method according to claim 1, characterized in that eliminating the first vocal to obtain the accompaniment to be played comprises:
subtracting the second part's information from the first part's information to obtain a left-channel accompaniment;
subtracting the first part's information from the second part's information to obtain a right-channel accompaniment; and
synthesizing the left-channel accompaniment and the right-channel accompaniment to obtain the accompaniment to be played, wherein the information refers to time-domain information or frequency-domain information.

5. The information processing method according to any one of claims 1 to 4, characterized in that the sound optimization processing comprises at least one of: sound denoising processing, automatic gain adjustment processing, and pitch adjustment processing.

6. The information processing method according to any one of claims 1 to 5, characterized by further comprising:
if the song does not contain the first vocal, receiving the second vocal, and performing sound optimization processing and mixing processing on the second vocal and the accompaniment to be played to obtain the second song.

7. The information processing method according to any one of claims 1 to 6, characterized by further comprising:
playing the second song.
8. An information processing method, characterized by comprising:
receiving a request message sent by a terminal for obtaining a first song, wherein the first song contains an accompaniment to be played;
determining whether the first song contains a first vocal, wherein the first vocal is a part whose content is identical between a first part and a second part of the first song, the first part is the part of the first song distributed to a left channel, and the second part is the part of the first song distributed to a right channel;
if the first vocal is contained, eliminating the first vocal to obtain the accompaniment to be played; and
sending the accompaniment to be played to the terminal.

9. The information processing method according to claim 8, characterized in that eliminating the first vocal comprises:
computing the maximally correlated part between the first part and the second part; and
removing the maximally correlated part from the left and right channels respectively.

10. The information processing method according to claim 8, characterized in that eliminating the first vocal comprises:
computing the maximally correlated part between the first part and the second part;
computing the product of a control factor and the maximally correlated part of the left and right channels, wherein the control factor lies between 0 and 1; and
removing the part corresponding to the product from the left and right channels respectively.

11. The information processing method according to claim 8, characterized in that eliminating the first vocal comprises:
subtracting the second part's information from the first part's information to obtain a left-channel accompaniment;
subtracting the first part's information from the second part's information to obtain a right-channel accompaniment; and
synthesizing the left-channel accompaniment and the right-channel accompaniment to obtain the accompaniment to be played, wherein the information refers to time-domain information or frequency-domain information.

12. The information processing method according to any one of claims 8 to 11, characterized by further comprising:
if the song does not contain the first vocal, sending the song to the terminal.
13. A terminal, characterized by comprising:
an obtaining unit, configured to obtain a first song, wherein the first song contains an accompaniment to be played;
a judging unit, configured to determine whether the first song contains a first vocal, wherein the first vocal is a part whose content is identical between a first part and a second part of the first song, the first part is the part of the first song distributed to a left channel, and the second part is the part of the first song distributed to a right channel;
an eliminating unit, configured to eliminate the first vocal if it is contained, to obtain the accompaniment to be played; and
a processing unit, configured to obtain a second vocal to be played and perform sound optimization processing and mixing processing on the second vocal and the accompaniment to be played to obtain a second song.

14. The terminal according to claim 13, characterized in that the eliminating unit is specifically configured to compute the maximally correlated part between the first part and the second part, and remove the maximally correlated part from the left and right channels respectively.

15. The terminal according to claim 13, characterized in that the eliminating unit is specifically configured to compute the maximally correlated part between the first part and the second part; compute the product of a control factor and the maximally correlated part of the left and right channels, wherein the control factor lies between 0 and 1; and remove the part corresponding to the product from the left and right channels respectively.

16. The terminal according to claim 13, characterized in that the eliminating unit is specifically configured to subtract the second part's information from the first part's information to obtain a left-channel accompaniment; subtract the first part's information from the second part's information to obtain a right-channel accompaniment; and synthesize the left-channel accompaniment and the right-channel accompaniment to obtain the accompaniment to be played, wherein the information refers to time-domain information or frequency-domain information.

17. The terminal according to any one of claims 13 to 16, characterized in that the sound optimization processing comprises at least one of: sound denoising processing, automatic gain adjustment processing, and pitch adjustment processing.

18. The terminal according to any one of claims 13 to 17, characterized in that the terminal further comprises:
a receiving unit, configured to, if the song does not contain the first vocal, receive the second vocal and perform sound optimization processing and mixing processing on the second vocal and the accompaniment to be played to obtain the second song.

19. The terminal according to any one of claims 13 to 17, characterized in that the terminal further comprises:
a playing unit, configured to play the second song.
20. A server, characterized by comprising:
a receiving unit, configured to receive a request message sent by a terminal for obtaining a first song, wherein the first song contains an accompaniment to be played;
a judging unit, configured to determine whether the first song contains a first vocal, wherein the first vocal is a part whose content is identical between a first part and a second part of the first song, the first part is the part of the first song distributed to a left channel, and the second part is the part of the first song distributed to a right channel;
an eliminating unit, configured to eliminate the first vocal if it is contained, to obtain the accompaniment to be played; and
a sending unit, configured to send the accompaniment to be played to the terminal.

21. The server according to claim 20, characterized in that the eliminating unit is specifically configured to compute the maximally correlated part between the first part and the second part, and remove the maximally correlated part from the left and right channels respectively.

22. The server according to claim 20, characterized in that the eliminating unit is specifically configured to compute the maximally correlated part between the first part and the second part; compute the product of a control factor and the maximally correlated part of the left and right channels, wherein the control factor lies between 0 and 1; and remove the part corresponding to the product from the left and right channels respectively.

23. The server according to claim 20, characterized in that the eliminating unit is specifically configured to subtract the second part's information from the first part's information to obtain a left-channel accompaniment; subtract the first part's information from the second part's information to obtain a right-channel accompaniment; and synthesize the left-channel accompaniment and the right-channel accompaniment to obtain the accompaniment to be played, wherein the information refers to time-domain information or frequency-domain information.

24. The server according to any one of claims 20 to 23, characterized in that the sending unit is further configured to send the song to the terminal if the song does not contain the first vocal.
25. An information processing system, characterized by comprising: the terminal according to any one of claims 13 to 19 and the server according to any one of claims 20 to 24.
PCT/CN2013/079798 2013-01-07 2013-07-22 信息处理的方法、装置和系统 WO2014106375A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310004990.1 2013-01-07
CN201310004990.1A CN103915086A (zh) 2013-01-07 2013-01-07 信息处理的方法、装置和系统

Publications (1)

Publication Number Publication Date
WO2014106375A1 true WO2014106375A1 (zh) 2014-07-10

Family

ID=51040717

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/079798 WO2014106375A1 (zh) 2013-01-07 2013-07-22 信息处理的方法、装置和系统

Country Status (2)

Country Link
CN (1) CN103915086A (zh)
WO (1) WO2014106375A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI683582B (zh) * 2018-09-06 2020-01-21 宏碁股份有限公司 增益動態調節之音效控制方法及音效輸出裝置

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104869233B (zh) * 2015-04-27 2019-04-23 深圳市金立通信设备有限公司 一种录音方法
CN104869232A (zh) * 2015-04-27 2015-08-26 深圳市金立通信设备有限公司 一种终端
CN106060143A (zh) * 2016-06-21 2016-10-26 徐文波 基于多层服务器架构的音效控制装置
CN109905789A (zh) * 2017-12-10 2019-06-18 张德明 一种k歌话筒
CN108231091B (zh) * 2018-01-24 2021-05-25 广州酷狗计算机科技有限公司 一种检测音频的左右声道是否一致的方法和装置
JP7243052B2 (ja) * 2018-06-25 2023-03-22 カシオ計算機株式会社 オーディオ抽出装置、オーディオ再生装置、オーディオ抽出方法、オーディオ再生方法、機械学習方法及びプログラム
CN110232931B (zh) * 2019-06-18 2022-03-22 广州酷狗计算机科技有限公司 音频信号的处理方法、装置、计算设备及存储介质
CN112885318A (zh) * 2019-11-29 2021-06-01 阿里巴巴集团控股有限公司 多媒体数据生成方法、装置、电子设备及计算机存储介质
CN111261175A (zh) * 2020-01-17 2020-06-09 北京塞宾科技有限公司 一种蓝牙音频信号传输方法和装置

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6148086A (en) * 1997-05-16 2000-11-14 Aureal Semiconductor, Inc. Method and apparatus for replacing a voice with an original lead singer's voice on a karaoke machine
CN101609667A (zh) * 2009-07-22 2009-12-23 福州瑞芯微电子有限公司 Pmp播放器中实现卡拉ok功能的方法
CN201667200U (zh) * 2010-03-25 2010-12-08 康佳集团股份有限公司 一种卡拉ok电路及电视机
CN102594982A (zh) * 2012-01-31 2012-07-18 惠州Tcl移动通信有限公司 便携式设备、系统及实现卡拉ok的方法

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3331297B2 (ja) * 1997-01-23 2002-10-07 株式会社東芝 背景音/音声分類方法及び装置並びに音声符号化方法及び装置
KR100636248B1 (ko) * 2005-09-26 2006-10-19 삼성전자주식회사 보컬 제거 장치 및 방법


Also Published As

Publication number Publication date
CN103915086A (zh) 2014-07-09

Similar Documents

Publication Publication Date Title
WO2014106375A1 (zh) 信息处理的方法、装置和系统
CN105378826B (zh) 音频场景装置
JP2023052537A (ja) ライブ音楽実演のマルチメディア・コンテンツのネットワーク・ベースの処理および配送
WO2016188323A1 (zh) K歌处理方法及系统
KR101572894B1 (ko) 오디오 신호의 디코딩 방법 및 장치
US20130162905A1 (en) Information processing device, information processing method, program, recording medium, and information processing system
KR20150131268A (ko) 다수의 오디오 스템들로부터의 자동 다-채널 뮤직 믹스
CN112216294B (zh) 音频处理方法、装置、电子设备及存储介质
TW201251479A (en) Apparatus and method for generating an output signal employing a decomposer
CN110211556B (zh) 音乐文件的处理方法、装置、终端及存储介质
WO2011035626A1 (zh) 音频播放方法及音频播放装置
WO2023221559A1 (zh) K歌音频处理方法、装置及计算机可读存储介质
US11997459B2 (en) Crowd-sourced device latency estimation for synchronization of recordings in vocal capture applications
JP3810004B2 (ja) ステレオ音響信号処理方法、ステレオ音響信号処理装置、ステレオ音響信号処理プログラム
TWI690895B (zh) 社交應用中擴展內容來源的方法及系統、用戶端和伺服器
CN112017622B (zh) 一种音频数据的对齐方法、装置、设备和存储介质
WO2021245234A1 (en) Electronic device, method and computer program
EP3627495B1 (en) Information processing device and information processing method
JP7256164B2 (ja) オーディオ処理装置及びオーディオ処理方法
JP5966531B2 (ja) 通信システム、端末装置、再生制御方法およびプログラム
JP2014066922A (ja) 楽曲演奏装置
KR101573868B1 (ko) 노래 가사 자동 디스플레이 방법, 노래 가사를 인식하는 서버 및 이 서버를 포함하는 노래 가사 자동 디스플레이 시스템
TW201040940A (en) Sound processing apparatus, chat system, sound processing method, information storage medium, and program
KR20090054583A (ko) 휴대용 단말기에서 스테레오 효과를 제공하기 위한 장치 및방법
US8767969B1 (en) Process for removing voice from stereo recordings

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13870220

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13870220

Country of ref document: EP

Kind code of ref document: A1