WO2020177190A1 - Processing method, apparatus and device - Google Patents

Processing method, apparatus and device

Info

Publication number
WO2020177190A1
WO2020177190A1 (application PCT/CN2019/083454)
Authority
WO
WIPO (PCT)
Prior art keywords: sound, dry, sound effect, song, target
Application number
PCT/CN2019/083454
Other languages: French (fr), Chinese (zh)
Inventors: 刘承诚 (Liu Chengcheng), 徐东 (Xu Dong), 张玫颖 (Zhang Meiying)
Original Assignee: 腾讯音乐娱乐科技(深圳)有限公司 (Tencent Music Entertainment Technology (Shenzhen) Co., Ltd.)
Application filed by 腾讯音乐娱乐科技(深圳)有限公司
Publication of WO2020177190A1

Classifications

    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H — ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 — Details of electrophonic musical instruments
    • G10H1/02 — Means for controlling the tone frequencies, e.g. attack or decay; Means for producing special musical effects, e.g. vibratos or glissandos

Definitions

  • This application relates to the field of intelligent voice technology, and in particular to a processing method, device and equipment.
  • Timbre refers to an attribute of sound, perceived through hearing, by which a listener can distinguish two sounds that are presented in the same way and have the same pitch and loudness. Therefore, the timbre of the human voice during singing refers to the vocal characteristics by which people identify a specific singer when different singers sing the same song.
  • the prior art of song post-processing mainly includes: online fixed-template timbre processing and offline manual timbre processing.
  • the online fixed template suffers from a one-size-fits-all problem and can only achieve a certain fixed processing effect, while offline processing by a human tuner suffers from low efficiency and high cost.
  • the present application provides a processing method, device, and equipment, which can make the generated audio after sound effect processing more pleasant.
  • this application provides a processing method, which includes:
  • At least one sound effect scheme is determined; the sound effect scheme is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, to generate sound-effect-processed audio;
  • target audio is generated according to the acquired target sound effect scheme; the target sound effect scheme is one of the at least one sound effect scheme.
  • the at least one target sound effect scheme includes: one sound effect scheme or multiple sound effect schemes;
  • the method further includes:
  • the target instruction is used to indicate the target sound effect scheme
  • the target sound effect scheme is acquired.
  • the method further includes:
  • before the distribution and intensity of the overtones in the feature vector are compared with the acquired dry-sound reference result to obtain the timbre data of the dry sound, the method further includes:
  • before determining at least one sound effect scheme according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data, the method further includes:
  • the singing speed of the song is determined, wherein the accompaniment identification number of the song is associated with the song.
  • the singing speed of the song associated with the dry voice is specifically:
  • sound effect processing is jointly performed on the dry sound and the accompaniment of the song associated with the dry sound through the equalization parameter value, the compression parameter value, and the reverberation parameter value in the target sound effect scheme, to generate the target audio.
  • jointly performing sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound through the acquired equalization parameter value, compression parameter value, and reverberation parameter value in the target sound effect scheme to generate the target audio includes:
  • the degree of sound quality improvement of the dry sound and of the accompaniment of the song associated with the dry sound is adjusted through the equalization parameter value in the target sound effect scheme; the degree of dynamic repair of the dry sound and of the accompaniment of the associated song is adjusted through the compression parameter value in the target sound effect scheme; and the degrees of sound quality improvement, spatial-sense creation, and detail concealment for the dry sound and the accompaniment of the associated song are separately adjusted through the reverberation parameter value in the target sound effect scheme, to generate the target audio.
  • the preprocessing of the obtained dry sound includes:
  • this application provides a processing device, which includes:
  • the first acquiring unit is configured to acquire dry sound; the dry sound includes fundamental frequency data of a song sung by a user;
  • the second acquiring unit is used to acquire the timbre data of the dry sound
  • the determining unit is configured to determine at least one sound effect scheme according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data; the sound effect scheme is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound to generate sound-effect-processed audio;
  • the output unit is configured to output the at least one sound effect scheme;
  • the generating unit is configured to generate target audio according to the acquired target sound effect scheme; the target sound effect scheme is one of the at least one sound effect scheme.
  • noise reduction and/or sound modification are performed on the acquired dry sound to obtain the first preprocessed data.
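  • The application does not fix a particular noise-reduction algorithm; as a minimal, hypothetical sketch of such a preprocessing pass (a moving-average smoother, not the claimed method):

```python
def moving_average_denoise(samples, window=5):
    """Illustrative noise reduction: smooth a dry-sound sample sequence
    with a centered moving average (a stand-in for the unspecified
    noise-reduction step)."""
    if window < 1:
        raise ValueError("window must be >= 1")
    half = window // 2
    out = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        out.append(sum(samples[lo:hi]) / (hi - lo))
    return out

# Alternating noise flattens toward its mean after smoothing.
noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
print(moving_average_denoise(noisy, window=3))
```

A production system would more likely use spectral subtraction or a learned denoiser; this only illustrates where the "first preprocessed data" comes from.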
  • It is used to: perform feature extraction on multiple labeled dry sound samples to extract a second feature vector, and input the second feature vector into the training model to be trained to obtain the preset training model; the second feature vector is used to train the training model to be trained.
  • the determining unit is also used to: before determining at least one sound effect scheme according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data,
  • the accompaniment identification number of the accompaniment is determined by the accompaniment of the song associated with the dry sound.
  • the song associated with the dry voice is determined from the first database including a plurality of songs through the accompaniment identification number.
  • the singing speed of the song is determined, wherein the accompaniment identification number of the song is associated with the song.
  • this application provides a processing device, including an input device, an output device, a processor, and a memory.
  • the processor, the input device, the output device, and the memory are connected to each other, wherein the memory is used to store application program code that supports the device in executing the above processing method, and the processor is configured to execute the processing method provided in the above first aspect.
  • this application provides a computer-readable storage medium for storing one or more computer programs.
  • the one or more computer programs include instructions.
  • the instructions are used to execute the processing method of the first aspect described above.
  • the present application provides a computer program, which includes a processing instruction; when the computer program is executed on a computer, the processing instruction is used to execute the processing method provided in the first aspect.
  • This application provides a processing method, device and equipment.
  • the dry sound which includes the fundamental frequency data of the song sung by the user.
  • the timbre data of the dry sound is acquired, and the timbre data is acquired through a preset training model.
  • the sound effect scheme is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound.
  • the target sound effect scheme is one of the at least one sound effect scheme.
  • Figure 1 is a schematic diagram of the architecture of a processing system provided by the present application.
  • Fig. 2 is a schematic flow chart of obtaining dry sound provided by this application
  • FIG. 3 is a schematic diagram of a sound effect solution provided by this application.
  • FIG. 4 is a schematic diagram of another sound effect solution provided by this application.
  • Fig. 5 is a schematic flowchart of a processing method provided by the present application.
  • Fig. 6 is a schematic block diagram of a device provided by the present application.
  • Fig. 7 is a schematic block diagram of a device provided by the present application.
  • the term "if" can be interpreted as "when", "once", "in response to determining", or "in response to detecting", depending on the context.
  • the phrase "if it is determined" or "if [a described condition or event] is detected" can be interpreted, depending on the context, as "once it is determined", "in response to determining", "once [the described condition or event] is detected", or "in response to detecting [the described condition or event]".
  • the terminal described in this application includes, but is not limited to, other portable devices such as a mobile phone, a laptop computer, or a tablet computer with a touch-sensitive surface (for example, a touch screen display and/or a touch pad). It should also be understood that, in some embodiments, the device is not a portable communication device, but a desktop computer with a touch-sensitive surface (e.g., touch screen display and/or touch pad).
  • the terminal including a display and a touch-sensitive surface is described.
  • the terminal may include one or more other physical user interface devices such as a physical keyboard, mouse, and/or joystick.
  • the terminal supports various applications, such as one or more of the following: a drawing application, a presentation application, a word processing application, a website creation application, a disc burning application, a spreadsheet application, a game application, a telephone application, a video conferencing application, an email application, an instant messaging application, an exercise support application, a photo management application, a digital camera application, a digital video camera application, a web browsing application, a digital music player application, and/or a digital video player application.
  • Various application programs that can be executed on the terminal can use at least one common physical user interface device such as a touch-sensitive surface.
  • One or more functions of the touch-sensitive surface, and the corresponding information displayed on the terminal, can be adjusted and/or changed between applications and/or within a corresponding application.
  • In this way, the common physical architecture of the terminal (for example, the touch-sensitive surface) can support the various applications.
  • FIG. 1 is a schematic diagram of the architecture of a processing system provided by the present application.
  • the system may include, but is not limited to: a recognition part and a sound effect processing part.
  • the identification part may include, but is not limited to, the following working steps:
  • Step 1 Obtain dry voice, and identify the fundamental frequency data of the dry voice from the obtained dry voice of the user.
  • the dry voice of the user singing a song can be recorded through recording software to achieve the acquisition of the dry voice.
  • the dry voice of the user may be a pure human voice without accompaniment sung by the user.
  • the dry voice may refer to the pure human voice that has not undergone post-processing (such as dynamics, compression, or reverberation) and processing after recording.
  • the fundamental frequency data is the frequency data of the fundamental tone, and the fundamental tone is the lowest sound produced by the overall vibration of the sounding body (in other words, the fundamental tone is the pure tone with the lowest frequency in each tone).
  • Fig. 2 exemplarily shows a schematic diagram of obtaining dry sound.
  • the recording software is recording the dry voice of a song sung by the user (for example, "Light Years Away").
  • the fundamental frequency data of the dry voice can be identified from the dry voice of the user through the Praat phonetics software. It should be noted that the fundamental frequency data of the dry voice can also be identified from the dry voice of the user through the autocorrelation algorithm, parallel processing method, cepstrum method and simplified inverse filter method.
  • the fundamental frequency data may include the upper limit of the fundamental frequency, the lower limit of the fundamental frequency, and the main tone of the fundamental frequency data.
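  • The autocorrelation algorithm mentioned above can be sketched as follows on a synthetic tone; this is a minimal illustration, not the Praat implementation, and the search range of 50–1000 Hz is an assumption:

```python
import math

def estimate_f0(samples, sample_rate, fmin=50.0, fmax=1000.0):
    """Estimate the fundamental frequency by picking the lag with the
    largest autocorrelation within the lag range implied by [fmin, fmax]."""
    lag_min = int(sample_rate / fmax)
    lag_max = int(sample_rate / fmin)
    best_lag, best_corr = lag_min, float("-inf")
    for lag in range(lag_min, min(lag_max, len(samples) - 1) + 1):
        corr = sum(samples[i] * samples[i - lag] for i in range(lag, len(samples)))
        if corr > best_corr:
            best_corr, best_lag = corr, lag
    return sample_rate / best_lag

# A pure 200 Hz sine sampled at 8 kHz; its autocorrelation peaks at lag 40.
sr = 8000
tone = [math.sin(2 * math.pi * 200.0 * n / sr) for n in range(800)]
print(estimate_f0(tone, sr))  # ≈ 200.0 Hz
```

Real dry voices would need framing, windowing, and voicing decisions on top of this; the sketch only shows the core autocorrelation idea.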
  • Step 2 Preprocess the acquired dry sound to obtain first preprocessed data.
  • Step 3: Perform feature extraction on the first preprocessed data to extract a first feature vector, input the first feature vector into a preset training model, and compare, through the preset training model, the distribution and intensity of the overtones in the first feature vector with the acquired dry-sound reference result to obtain the timbre data (timbre score) of the dry sound; the preset training model is a trained training model.
  • Work process 11 Perform feature extraction on the first preprocessed data to extract a first feature vector, and input the extracted first feature vector into a preset training model.
  • Working process 12 The distribution and intensity of the overtones in the extracted first feature vector are compared with the reference result of the dry sound through a preset training model to obtain the timbre data of the dry sound.
  • the reference result of the dry sound may be the distribution and intensity of the overtones in the feature vector corresponding to the dry voice of a star (a professional singer).
  • While the sounding body vibrates as a whole, each of its parts (one-half, one-third, and so on) also vibrates; these partial vibrations produce the overtones in the embodiments of this application. The combination of overtones determines a specific timbre and allows people to clearly perceive the loudness of the fundamental tone.
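  • The comparison of overtone distribution and intensity against a reference can be sketched, for illustration only, as a cosine similarity between overtone-intensity vectors; the vectors and the 0–100 scoring below are assumptions, not the preset training model of this application:

```python
import math

def timbre_score(user_overtones, reference_overtones):
    """Illustrative timbre score: cosine similarity between the user's
    overtone-intensity vector and a reference vector (e.g. derived from a
    professional singer's dry voice), scaled to 0-100."""
    dot = sum(u * r for u, r in zip(user_overtones, reference_overtones))
    nu = math.sqrt(sum(u * u for u in user_overtones))
    nr = math.sqrt(sum(r * r for r in reference_overtones))
    if nu == 0.0 or nr == 0.0:
        return 0.0
    return 100.0 * dot / (nu * nr)

# Hypothetical intensities of the first five overtones.
reference = [1.0, 0.6, 0.4, 0.25, 0.15]
user = [1.0, 0.5, 0.45, 0.2, 0.1]
print(round(timbre_score(user, reference), 1))
```

The closer the user's overtone pattern is to the reference, the higher the score, which matches the description of how the timbre data is obtained.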
  • Step 4 Get the singing speed of the song associated with the dry voice.
  • the singing speed of the song associated with the dry voice may specifically be:
  • the beats per minute (BPM) of the song associated with the dry voice, or
  • the number of syllables per minute (SPM) of the song associated with the dry voice.
  • obtaining the singing speed of songs associated with dry voice may specifically include the following working processes:
  • Work process 21 Determine the accompaniment identification number (ID) of the accompaniment through the accompaniment of the song associated with the dry sound.
  • Work process 22 Determine the dry-voice-related songs from the second database including a plurality of songs through the accompaniment identification number.
  • the second database may be a music library including multiple songs.
  • Work process 23 Determine the singing speed of the song according to the determined song, where the accompaniment identification number of the song is associated with the song.
  • one song can be associated with one or more accompaniments.
  • each accompaniment can have a unique accompaniment identification number.
  • the accompaniments associated with the song "I Love You For Ten Thousand Years" may include a male-voice accompaniment, a female-voice accompaniment, and a DJ accompaniment, where the accompaniment identification number of the male-voice accompaniment may be 11, that of the female-voice accompaniment may be 22, and that of the DJ accompaniment may be 33.
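  • Work processes 21 to 23 can be sketched with a hypothetical in-memory music library; the accompaniment identification numbers 11/22/33 follow the example above, and the BPM value is assumed:

```python
# Hypothetical music library: accompaniment ID -> song, song -> singing speed.
ACCOMPANIMENT_TO_SONG = {
    11: "I Love You For Ten Thousand Years",  # male-voice accompaniment
    22: "I Love You For Ten Thousand Years",  # female-voice accompaniment
    33: "I Love You For Ten Thousand Years",  # DJ accompaniment
}
SONG_BPM = {"I Love You For Ten Thousand Years": 72}  # assumed BPM

def singing_speed_for_accompaniment(accompaniment_id):
    """Resolve the accompaniment ID to its song (work process 22), then
    look up that song's singing speed in BPM (work process 23)."""
    song = ACCOMPANIMENT_TO_SONG[accompaniment_id]
    return song, SONG_BPM[song]

print(singing_speed_for_accompaniment(11))
```

In practice the two mappings would live in the database the application describes; the dictionaries here only stand in for that lookup.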
  • the sound effect scheme is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound to generate sound-effect-processed audio.
  • generating the target audio according to the acquired target sound effect scheme includes:
  • the sound effect processing is jointly performed on the accompaniment of the song associated with the dry sound and the dry sound to generate the target audio.
  • the degree of sound quality improvement of the dry sound and of the accompaniment of the song associated with the dry sound can be adjusted through the equalization parameter value in the target sound effect scheme; the degree of dynamic repair of the dry sound and of the accompaniment of the associated song can be adjusted through the compression parameter value in the target sound effect scheme; and the degrees of sound quality improvement, spatial-sense creation, and detail concealment can be separately adjusted through the reverberation parameter value in the target sound effect scheme.
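  • As one minimal, hypothetical illustration of how a compression parameter value constrains dynamics (not the claimed processing chain), a simple threshold/ratio downward compressor:

```python
def compress(samples, threshold=0.5, ratio=4.0):
    """Illustrative downward compressor: sample magnitude above the
    threshold is scaled down by the ratio, shrinking the dynamic range
    while preserving the sign of each sample."""
    out = []
    for s in samples:
        mag = abs(s)
        if mag > threshold:
            mag = threshold + (mag - threshold) / ratio
        out.append(mag if s >= 0 else -mag)
    return out

loud = [0.1, -0.9, 0.6, 1.0]
print(compress(loud))  # peaks above the 0.5 threshold are pulled toward it
```

The equalization and reverberation parameter values would be realized by analogous per-band gain and delay stages; this sketch covers only the dynamics part.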
  • Output at least one sound effect scheme.
  • output at least one sound effect scheme which may specifically include but is not limited to the following forms:
  • At least one sound effect scheme will be displayed, or the at least one sound effect scheme will be played.
  • Case 1 At least one sound effect scheme, including: a sound effect scheme; the sound effect scheme is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound to generate audio after sound effect processing.
  • Case 2: At least one sound effect scheme, including multiple sound effect schemes; each of the multiple sound effect schemes is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, to respectively generate multiple pieces of sound-effect-processed audio.
  • Fig. 3 exemplarily shows a sound effect scheme.
  • the sound effect scheme may include four sound effect schemes that can be used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, respectively.
  • the name of the sound effect scheme may be AI reverberation; the scheme can be used to produce audio exclusive to the user based on the user's dry voice, and the user can audition the singing rendered through the sound effect scheme.
  • the four sound effect schemes are as follows:
  • Sound effect scheme 1: a scheme for which the matching degree between the timbre data of the user's dry voice, the singing speed of the song associated with the dry voice, and the fundamental frequency data, and those of the ideal dry voice, is 90%.
  • Sound effect scheme 2: a scheme for which the matching degree between the same features of the user's dry voice and those of the ideal dry voice is 80%.
  • Sound effect scheme 3: a scheme for which the matching degree between the same features of the user's dry voice and those of the ideal dry voice is 60%.
  • Sound effect scheme 4: a scheme defined by the matching degree between the same features of the user's dry voice and those of the ideal dry voice.
  • the sound effect scheme with a matching degree of 90% is the recommended scheme (offered to the user as the first choice) for performing sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound; the other three schemes (offered to the user as second choices) are sound effect schemes the user may select for the same processing.
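  • A matching degree such as the 90%/80%/60% figures above could, purely for illustration, be computed as a weighted closeness between the user's dry-voice features and those of the ideal dry voice; the feature values and weights below are assumptions, not this application's method:

```python
def matching_degree(user, ideal, weights=(0.5, 0.25, 0.25)):
    """Hypothetical matching degree: weighted relative closeness of the
    user's timbre score, singing speed (BPM), and main fundamental
    frequency (Hz) to the ideal dry voice's values, as a percentage."""
    def closeness(a, b):
        return max(0.0, 1.0 - abs(a - b) / max(abs(a), abs(b), 1e-9))
    keys = ("timbre", "bpm", "f0")
    score = sum(w * closeness(user[k], ideal[k]) for w, k in zip(weights, keys))
    return round(100.0 * score, 1)

# Assumed feature values for an ideal dry voice and a user's dry voice.
ideal = {"timbre": 95.0, "bpm": 72.0, "f0": 220.0}
user = {"timbre": 88.0, "bpm": 75.0, "f0": 205.0}
print(matching_degree(user, ideal))
```

Ranking schemes by this percentage reproduces the recommend-first behavior described above: the highest-degree scheme is offered first.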
  • the name of the sound effect scheme may also be smart sound effect.
  • the sound effect scheme can be used to produce audio exclusive to the user based on the user's dry voice. If the user does not wear headphones while recording the dry voice, the audio produced through the sound effect scheme may be affected; the system therefore prompts or recommends that the user wear headphones while recording the singing.
  • the user can audition the audio produced through the sound effect scheme. If the user is a VIP (very important person) member, the user can publish the audio produced by the sound effect scheme based on the user's dry voice.
  • Figure 4 exemplarily shows another sound effect scheme.
  • the sound effect scheme may include multiple sound effect schemes that can be used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound.
  • the sound effect schemes may include, but are not limited to: the scheme for which the matching degree between the timbre data of the user's dry voice, the singing speed of the song associated with the dry voice, and the fundamental frequency data, and those of the ideal dry voice, is 90%; a KTV sound effect scheme; a magnetic sound effect scheme; a song sound effect scheme; a distant artistic-conception sound effect scheme; and so on.
  • the sound effect scheme with a matching degree of 90% may be a preferred sound effect scheme recommended to the user that can be used to perform sound effect processing on the accompaniment of the dry sound and the song associated with the dry sound.
  • the other schemes may be second-priority sound effect schemes recommended to the user that can be used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound.
  • the embodiment of the present application provides a processing system.
  • the processing system includes a recognition part and a sound effect processing part.
  • the processing system obtains the dry voice through the recognition part, and the dry voice includes the fundamental frequency data of the song sung by the user.
  • the processing system obtains the timbre data of the dry sound through the recognition part, and the timbre data is obtained through the preset training model.
  • the processing system determines at least one sound effect scheme through the sound effect processing part according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data, and the sound effect scheme is used to associate the dry sound with the dry sound.
  • the accompaniment of the song is processed with sound effects to generate audio after sound effect processing.
  • the processing system outputs at least one sound effect scheme through the sound effect processing part.
  • the processing system generates the target audio through the sound effect processing part according to the acquired target sound effect scheme; the target sound effect scheme is one of the at least one sound effect scheme. According to the embodiment of the present application, by determining the target sound effect scheme from the first database including multiple sound effect schemes and performing sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, the generated sound-effect-processed audio can be made more pleasant to hear.
  • FIGS. 2 to 4 are only used to explain the embodiments of the present application, and should not limit the present application.
  • FIG. 5 is a schematic flowchart of a processing method provided by the present application. As shown in Figure 5, the method can at least include the following steps:
  • the dry voice includes the fundamental frequency data of the song sung by the user.
  • the fundamental frequency data in the dry voice can be identified from the user's dry voice through the Praat phonetics software, and can also be identified from the user's dry voice through the autocorrelation algorithm, the parallel processing method, the cepstrum method, or the simplified inverse filtering method.
  • a song can be an art form that combines lyrics and scores.
  • the dry voice can be a pure human voice without accompaniment sung by the user.
  • the dry voice can refer to the pure human voice without post-processing (such as dynamics, compression or reverberation, etc.) or processing after recording.
  • the fundamental frequency data is the frequency data of the fundamental tone, and the fundamental tone is the lowest sound produced by the overall vibration of the sounding body (in other words, the fundamental tone is the pure tone with the lowest frequency in each tone).
  • Work step 1: Preprocess the acquired dry sound to obtain first preprocessed data.
  • preprocessing the acquired dry sound may specifically include performing noise reduction and/or sound modification on the acquired dry sound to obtain the first preprocessed data.
  • Work step 2: Perform feature extraction on the first preprocessed data to extract a first feature vector, input the first feature vector into a preset training model, and compare, through the preset training model, the distribution and intensity of the overtones in the first feature vector with the acquired dry-sound reference result to obtain the timbre data of the dry sound; the preset training model is a trained training model.
  • the reference result of the dry sound may be the distribution and intensity of the overtones in the feature vector corresponding to the dry voice of a star (a professional singer). The closer the distribution and intensity of the overtones in the first feature vector are to the dry-sound reference result, the higher the score of the user's dry voice.
  • While the sounding body vibrates as a whole, each of its parts (one-half, one-third, and so on) also vibrates; these partial vibrations produce the overtones in the embodiments of this application. The combination of overtones determines a specific timbre and allows people to clearly perceive the loudness of the fundamental tone.
  • S503 Determine at least one sound effect scheme according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data.
  • the sound effect solution is used to perform sound effect processing on the accompaniment of the dry sound and the song associated with the dry sound to generate audio after sound effect processing.
  • Step 1 Receive a target instruction; the target instruction is used to indicate a target sound effect scheme (that is, indicate to obtain a target sound effect scheme associated with the target instruction).
  • Step 2 In response to the received target instruction, obtain the target sound effect scheme.
  • Work process 1 Determine the accompaniment identification number of the accompaniment through the accompaniment of the song associated with the dry sound.
  • Work process 2 Determine the dry-voice-related songs from the first database including multiple songs through the accompaniment identification number.
  • Work process 3 Determine the singing speed of the song according to the determined song, where the accompaniment identification number of the song is associated with the song.
  • Case 1 At least one sound effect scheme, which may include: one sound effect scheme; this scheme is used to perform sound effect processing on the accompaniment of the dry sound and the song associated with the dry sound to generate audio after sound effect processing.
  • Case 2: At least one sound effect scheme, which may include multiple sound effect schemes; each of the multiple sound effect schemes is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, to respectively generate multiple pieces of sound-effect-processed audio.
  • outputting the at least one sound effect scheme may specifically include, but is not limited to, the following forms:
  • the at least one sound effect scheme is displayed, or the at least one sound effect scheme is played.
  • S505 Generate target audio according to the acquired target sound effect scheme.
  • generating the target audio according to the acquired target sound effect scheme may specifically include the following process:
  • the sound effect processing is jointly performed on the accompaniment of the song associated with the dry sound and the dry sound to generate the target audio.
  • the degree of sound quality improvement of the dry sound and of the accompaniment of the song associated with the dry sound is adjusted through the equalization parameter value in the target sound effect scheme; the degree of dynamic repair of the dry sound and of the accompaniment of the associated song is adjusted through the compression parameter value in the target sound effect scheme; and the degrees of sound quality improvement, spatial-sense creation, and detail concealment are separately adjusted through the reverberation parameter value in the target sound effect scheme.
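  • The joint processing of the dry sound and the accompaniment can be sketched as a gain mix plus a single decayed echo standing in for the reverberation parameter value; every parameter below is hypothetical:

```python
def render_target_audio(dry, accompaniment, dry_gain=0.8, acc_gain=0.6,
                        reverb_delay=2, reverb_decay=0.3):
    """Illustrative rendering: mix the dry voice with the accompaniment
    at fixed gains, then add one delayed, decayed copy of the mix as a
    crude reverberation stand-in."""
    n = max(len(dry), len(accompaniment))
    mix = [dry_gain * (dry[i] if i < len(dry) else 0.0)
           + acc_gain * (accompaniment[i] if i < len(accompaniment) else 0.0)
           for i in range(n)]
    out = list(mix)
    for i in range(reverb_delay, n):
        out[i] += reverb_decay * mix[i - reverb_delay]  # single echo tap
    return out

dry = [0.5, 0.0, 0.0, 0.0]
acc = [0.0, 0.25, 0.0, 0.0]
print(render_target_audio(dry, acc))
```

A real target sound effect scheme would set these gains and the reverberation character per scheme; the sketch shows only how dry sound and accompaniment combine into target audio.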
  • In summary, the embodiment of the present application provides a processing method.
  • First, the dry sound is acquired; the dry sound includes the fundamental frequency data of the song sung by the user.
  • Then the timbre data of the dry sound is acquired; the timbre data is acquired through a preset training model.
  • Next, at least one sound effect scheme is determined according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data; each sound effect scheme is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, to generate sound-effect-processed audio.
  • The at least one sound effect scheme is then output.
  • Finally, target audio is generated according to the acquired target sound effect scheme; the target sound effect scheme is one of the at least one sound effect scheme.
  • In this way, sound effect processing can be performed on the dry sound and the accompaniment of the associated song through the acquired target sound effect scheme, which makes the generated sound-effect-processed audio more pleasant to hear.
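One plausible reading of the scheme-determination step is a nearest-match lookup over a stored database of schemes keyed by timbre, tempo, and pitch profiles. The database contents, field names, and normalization weights below are assumptions for illustration, not taken from the patent:

```python
import numpy as np

# Hypothetical "first database" of sound effect schemes: each entry pairs a
# (timbre, tempo-in-BPM, mean-f0-in-Hz) profile with a scheme name.
SCHEME_DB = [
    {"profile": (0.2, 80.0, 180.0), "name": "warm_ballad"},
    {"profile": (0.7, 128.0, 240.0), "name": "bright_pop"},
    {"profile": (0.5, 100.0, 200.0), "name": "neutral"},
]

def rank_schemes(timbre, tempo_bpm, mean_f0, top_k=2):
    """Sketch of determining "at least one sound effect scheme": return the
    top-k schemes whose stored profile is closest to the song's timbre data,
    singing speed, and fundamental frequency data."""
    query = np.array([timbre, tempo_bpm / 200.0, mean_f0 / 400.0])
    scored = []
    for entry in SCHEME_DB:
        t, bpm, f0 = entry["profile"]
        ref = np.array([t, bpm / 200.0, f0 / 400.0])
        scored.append((float(np.linalg.norm(query - ref)), entry["name"]))
    scored.sort()
    return [name for _, name in scored[:top_k]]
```

Returning the top-k matches rather than a single winner mirrors Case 2 above, where multiple candidate schemes are presented to the user before one target scheme is chosen.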
  • the processing device 60 includes: a first acquiring unit 601, a second acquiring unit 602, a determining unit 603, an output unit 604, and a generating unit 605, where:
  • the first acquiring unit 601 is used to acquire the dry sound; the dry sound includes the fundamental frequency data of the song sung by the user;
  • the second acquiring unit 602 is used to acquire the timbre data of the dry sound;
  • the determining unit 603 is configured to determine the target sound effect scheme from a first database including multiple sound effect schemes, according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data identified from the dry sound; the target sound effect scheme is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, to generate sound-effect-processed audio;
  • the output unit 604 is configured to output at least one sound effect scheme.
  • Case 1: the at least one sound effect scheme includes a single sound effect scheme. This scheme is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, to generate sound-effect-processed audio.
  • Case 2: the at least one sound effect scheme includes multiple sound effect schemes. Each of the multiple schemes is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, to respectively generate multiple sound-effect-processed audio outputs.
  • the generating unit 605 is configured to generate the target audio according to the acquired target sound effect scheme; the target sound effect scheme is one of the at least one sound effect scheme.
  • Specifically, the generating unit 605 can be used to generate the target audio by adjusting, through the equalization parameter value in the target sound effect scheme, the degree of sound-quality improvement applied to the dry sound and the accompaniment of the associated song; adjusting, through the compression parameter value in the target sound effect scheme, the degree of dynamic repair applied to the dry sound and the accompaniment of the associated song; and separately adjusting, through the reverberation parameter value in the target sound effect scheme, the sound-quality improvement, the creation of spatial layering, and the degree of detail masking.
  • In some embodiments, in addition to the first acquiring unit 601, the second acquiring unit 602, the determining unit 603, the output unit 604, and the generating unit 605, the processing device 60 also includes a preprocessing unit.
  • The preprocessing unit is used to perform noise reduction and/or tone repair on the acquired dry sound, to obtain the first preprocessed data.
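As one hypothetical stand-in for the preprocessing unit's noise reduction, a single-pass spectral subtraction can be sketched as follows. The patent does not specify a denoising algorithm, so this is purely illustrative:

```python
import numpy as np

def spectral_subtract(noisy, noise_estimate):
    """Minimal noise-reduction sketch: subtract an estimated noise
    magnitude spectrum from the recording's spectrum, keeping the
    recording's phase, and resynthesize the cleaned signal."""
    spec = np.fft.rfft(noisy)
    noise_mag = np.abs(np.fft.rfft(noise_estimate))
    mag = np.maximum(np.abs(spec) - noise_mag, 0.0)  # floor at zero
    phase = np.angle(spec)
    return np.fft.irfft(mag * np.exp(1j * phase), n=len(noisy))
```

Given a vocal tone contaminated by a low-frequency hum, subtracting the hum's spectrum suppresses the hum bin while leaving the tone's bin essentially untouched.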
  • In some embodiments, in addition to the units listed above, the processing device 60 also includes a training unit.
  • The training unit is used to perform feature extraction on multiple labeled dry sound samples to extract second feature vectors, and to input the second feature vectors into the training model to be trained, to obtain the preset training model; the second feature vectors are used to train the training model to be trained.
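The training unit's behavior can be sketched with assumed concrete choices: a few spectral statistics as the "second feature vector" and a nearest-centroid model standing in for the training model. None of these specifics come from the patent:

```python
import numpy as np

def extract_feature_vector(samples):
    """Summarize a dry-sound clip by simple spectral statistics
    (the concrete features here are assumptions for illustration)."""
    spectrum = np.abs(np.fft.rfft(samples))
    total = spectrum.sum() + 1e-12
    centroid = (np.arange(len(spectrum)) * spectrum).sum() / total
    rolloff = np.searchsorted(np.cumsum(spectrum), 0.85 * total)
    return np.array([centroid, float(rolloff), samples.std()])

def train_preset_model(labeled_samples):
    """Sketch of obtaining the "preset training model": one feature
    centroid per timbre label, from labeled dry-sound samples."""
    centroids = {}
    for label, clips in labeled_samples.items():
        feats = np.stack([extract_feature_vector(c) for c in clips])
        centroids[label] = feats.mean(axis=0)
    return centroids

def classify_timbre(model, samples):
    """Assign the label whose centroid is nearest to the clip's features."""
    feat = extract_feature_vector(samples)
    return min(model, key=lambda lbl: np.linalg.norm(model[lbl] - feat))
```

With two labels trained on low- and high-frequency synthetic clips, a new low-frequency clip classifies as the "dark" timbre and a high-frequency clip as "bright".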
  • the determining unit is also used to, before the at least one sound effect scheme is determined according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data:
  • determine the accompaniment identification number of the accompaniment from the accompaniment of the song associated with the dry sound;
  • determine the song associated with the dry sound from a first database including multiple songs through the accompaniment identification number; and
  • determine the singing speed of the song from the determined song, where the accompaniment identification number of the song is associated with the song.
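A minimal sketch of this three-step lookup, with an assumed in-memory "first database" (its schema and contents are hypothetical):

```python
# Hypothetical "first database" mapping accompaniment identification numbers
# to songs and their stored singing speeds.
SONG_DB = {
    "acc-001": {"title": "Song A", "bpm": 96},
    "acc-002": {"title": "Song B", "bpm": 128},
}

def singing_speed_for_accompaniment(accompaniment_id):
    """The accompaniment yields its identification number; the number
    selects the associated song from the first database; and the song
    record carries the singing speed used in scheme determination."""
    song = SONG_DB.get(accompaniment_id)
    if song is None:
        raise KeyError(f"no song associated with {accompaniment_id}")
    return song["title"], song["bpm"]
```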
  • In specific implementations, the device 60 can acquire the dry sound through the first acquiring unit 601; the dry sound includes the fundamental frequency data of the song sung by the user. Further, the device 60 acquires the timbre data of the dry sound through the second acquiring unit 602. Then the device 60 determines at least one sound effect scheme through the determining unit 603, according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data; each sound effect scheme is used to perform sound effect processing on the dry sound and the accompaniment of the associated song, to generate sound-effect-processed audio. Next, the device 60 outputs the target sound effect scheme through the output unit 604. Finally, the device 60 generates the target audio through the generating unit 605 according to the acquired target sound effect scheme; the target sound effect scheme is one of the at least one sound effect scheme.
  • In this way, sound effect processing can be performed on the dry sound and the accompaniment of the song associated with the dry sound through the acquired target sound effect scheme, which makes the generated sound-effect-processed audio more pleasant to hear.
  • It should be understood that the device 60 is only an example provided by the embodiment of the present application; the device 60 may have more or fewer components than shown, may combine two or more components, or may be implemented with a different configuration of components.
  • Fig. 7 is a schematic structural diagram of a processing device provided by the present application.
  • the devices may include mobile phones, tablet computers, personal digital assistants (Personal Digital Assistant, PDA), mobile Internet devices (Mobile Internet Device, MID), and smart wearable devices (such as smart watches and smart bracelets).
  • the device 70 may include: a baseband chip 701, a memory 702 (one or more computer-readable storage media), and a peripheral system 703; these components can communicate over one or more communication buses 704.
  • the baseband chip 701 may include one or more processors (CPUs) 705.
  • the processor 705 can be specifically used to: acquire the dry sound, where the dry sound includes the fundamental frequency data of the song sung by the user; acquire the timbre data of the dry sound; determine at least one sound effect scheme, where each sound effect scheme is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, to generate sound-effect-processed audio; and generate the target audio according to the acquired target sound effect scheme, where the target sound effect scheme is one of the at least one sound effect scheme.
  • the memory 702 is coupled with the processor 705 and may be used to store various software programs and/or multiple sets of instructions.
  • the memory 702 may include a high-speed random access memory, and may also include a non-volatile memory, such as one or more disk storage devices, flash memory devices, or other non-volatile solid-state storage devices.
  • the memory 702 can store an operating system (hereinafter referred to as the system), for example an embedded operating system such as ANDROID, IOS, WINDOWS, or LINUX.
  • the memory 702 may also store a network communication program, which may be used to communicate with one or more additional devices, one or more terminal devices, and one or more network devices.
  • the memory 702 can also store a user interface program, which can vividly display the content of an application program through a graphical operation interface, and receive user control operations on the application program through input controls such as menus, dialog boxes, and buttons.
  • the memory 702 may be used to store implementation code for implementing the processing method.
  • the memory 702 may also store one or more application programs. These applications may include: karaoke (K song) programs, social applications (such as Facebook), image management applications (such as photo albums), map applications (such as Google Maps), browsers (such as Safari and Google Chrome), and so on.
  • the peripheral system 703 is mainly used to implement the interaction function between the user of the device 70 and the external environment, and mainly includes the input and output devices of the device 70.
  • the peripheral system 703 may include: a display controller 707, a camera controller 708, and an audio controller 709. Among them, each controller can be coupled with its corresponding peripheral device (such as the display screen 710, the camera 711, and the audio circuit 712).
  • the display screen 710 may be a display screen configured with a self-capacitive floating touch panel, or a display screen configured with an infrared floating touch panel.
  • the camera 711 may be a 3D camera. It should be noted that the peripheral system 703 may also include other I/O peripherals.
  • In specific implementations, the device 70 can acquire the dry sound through the processor 705; the dry sound includes the fundamental frequency data of the song sung by the user. Further, the device 70 can acquire the timbre data of the dry sound through the processor 705. Then the device 70 can determine at least one sound effect scheme through the processor 705, according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data; each sound effect scheme is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, to generate sound-effect-processed audio. Next, the device 70 can output the target sound effect scheme through the peripheral system 703. Finally, the device 70 can generate the target audio through the processor 705 according to the acquired target sound effect scheme;
  • the target sound effect scheme is one of the at least one sound effect scheme.
  • In this way, sound effect processing can be performed on the dry sound and the accompaniment of the song associated with the dry sound through the acquired target sound effect scheme, so that the generated sound-effect-processed audio is more pleasant to hear.
  • It should be understood that the device 70 is only an example provided in the embodiment of the present application; the device 70 may have more or fewer components than shown, may combine two or more components, or may be implemented with a different configuration of components.
  • the present application provides a computer-readable storage medium; the computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, implements the processing method described above.
  • the computer-readable storage medium may be an internal storage unit of the device described in any of the foregoing embodiments, such as a hard disk or memory of the device.
  • the computer-readable storage medium may also be an external storage device of the device, such as a plug-in hard disk equipped on the device, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash card (Flash Card), and so on.
  • the computer-readable storage medium may also include both an internal storage unit of the device and an external storage device.
  • the computer-readable storage medium is used to store computer programs and other programs and data required by the device.
  • the computer-readable storage medium can also be used to temporarily store data that has been output or will be output.
  • the present application also provides a computer program product.
  • the computer program product includes a non-transitory computer-readable storage medium storing a computer program.
  • the computer program is operable to cause a computer to execute some or all of the steps of any method described in the above method embodiments.
  • the computer program product may be a software installation package, and the computer includes an electronic device.
  • the division of the units is only a logical function division, and there may be other divisions in actual implementation.
  • multiple units or components can be combined or integrated into another system, or some features can be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may also be electrical, mechanical or other forms of connection.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present application.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of this application, in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the method described in each embodiment of the present application.
  • the aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.


Abstract

Provided are a processing method, apparatus and device. The processing method comprises: acquiring a dry sound (S501), the dry sound comprising fundamental frequency data of a song sung by a user; acquiring timbre data of the dry sound (S502), the timbre data being acquired by means of a pre-set training model; determining at least one sound effect scheme according to the acquired timbre data of the dry sound, a singing speed of the song associated with the dry sound, and the fundamental frequency data (S503), the sound effect scheme being used for carrying out sound effect processing on the dry sound and an accompaniment of the song associated with the dry sound so as to generate audio subjected to sound effect processing; outputting the at least one sound effect scheme (S504); and generating a target audio according to an acquired target sound effect scheme (S505), wherein the target sound effect scheme is one of the at least one sound effect scheme. According to the processing method, the generated sound-effect-processed audio can be more pleasant.

Description

Processing Method, Apparatus and Device
Technical Field
This application relates to the field of intelligent voice technology, and in particular to a processing method, apparatus and device.
Background
The American National Standards Institute defines timbre as follows: "Timbre is that attribute of sound, produced by the sense of hearing, by which a listener can judge that two sounds presented in the same way and having the same pitch and loudness are different." Accordingly, the timbre of the human voice during singing refers to the vocal characteristics by which listeners identify which singer is performing when different singers sing the same song.
In the process of realizing the present invention, the inventors found that the prior art for song post-processing mainly consists of two approaches: online fixed-template timbre processing and offline manual timbre processing. The online fixed template suffers from a "one size fits all" problem and can only achieve a certain fixed processing effect, while offline processing by a mixing engineer suffers from low efficiency and high cost.
Summary
The present application provides a processing method, apparatus and device, which can make the generated sound-effect-processed audio more pleasant to hear.
In a first aspect, this application provides a processing method, which includes:
acquiring a dry sound, where the dry sound includes fundamental frequency data of a song sung by a user;
acquiring timbre data of the dry sound, where the timbre data is acquired through a preset training model;
determining at least one sound effect scheme according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data, where the sound effect scheme is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, to generate sound-effect-processed audio;
outputting the at least one sound effect scheme; and
generating target audio according to an acquired target sound effect scheme, where the target sound effect scheme is one of the at least one sound effect scheme.
With reference to the first aspect, in some possible embodiments:
the at least one sound effect scheme includes one sound effect scheme or multiple sound effect schemes; and
after the at least one sound effect scheme is output and before the target audio is generated according to the acquired target sound effect scheme, the method further includes:
receiving a target instruction, where the target instruction is used to indicate the target sound effect scheme; and
in response to the received target instruction, acquiring the target sound effect scheme.
With reference to the first aspect, in some possible embodiments, before the timbre data of the dry sound is obtained, the method further includes:
preprocessing the acquired dry sound to obtain first preprocessed data; and
performing feature extraction on the first preprocessed data to extract a first feature vector, inputting the first feature vector into the preset training model, and comparing, through the preset training model, the distribution and intensity of overtones in the first feature vector with a reference result for the acquired dry sound, to obtain the timbre data of the dry sound, where the preset training model is a trained training model.
With reference to the first aspect, in some possible embodiments, before the timbre data of the dry sound is obtained in this way, the method further includes:
performing feature extraction on multiple labeled dry sound samples to extract second feature vectors, and inputting the second feature vectors into a training model to be trained, to obtain the preset training model, where the second feature vectors are used to train the training model to be trained.
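The overtone-based timbre comparison described in this section can be illustrated with an assumed implementation that samples the magnitude spectrum at integer multiples of the fundamental frequency. The comparison rule actually used inside the preset training model is not specified by the application, so this is a sketch only:

```python
import numpy as np

def overtone_profile(signal, f0, sr, n_harmonics=5):
    """Measure the "distribution and intensity of overtones": spectral
    magnitude at integer multiples of the fundamental frequency f0,
    normalized to the loudest partial."""
    spectrum = np.abs(np.fft.rfft(signal))
    bin_hz = sr / len(signal)
    mags = []
    for k in range(1, n_harmonics + 1):
        b = int(round(k * f0 / bin_hz))
        mags.append(spectrum[b] if b < len(spectrum) else 0.0)
    mags = np.array(mags)
    return mags / (mags.max() + 1e-12)

def timbre_distance(profile_a, profile_b):
    """Hypothetical comparison against a reference overtone profile."""
    return float(np.abs(profile_a - profile_b).mean())
```

For a signal with a fundamental at 100 Hz and a half-amplitude second harmonic at 200 Hz, the profile comes out as roughly [1.0, 0.5, 0, 0, 0].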
With reference to the first aspect, in some possible embodiments, before the at least one sound effect scheme is determined according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data, the method further includes:
determining an accompaniment identification number of the accompaniment from the accompaniment of the song associated with the dry sound;
determining the song associated with the dry sound from a first database including multiple songs through the accompaniment identification number; and
determining the singing speed of the song from the determined song, where the accompaniment identification number of the song is associated with the song.
With reference to the first aspect, in some possible embodiments, the singing speed of the song associated with the dry sound is specifically:
the acquired number of beats per minute of the song associated with the dry sound;
or,
the acquired number of syllables per minute of the song associated with the dry sound.
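Both readings of singing speed are simple to compute once beats or syllables have been detected; beat and syllable detection themselves are outside this sketch:

```python
def beats_per_minute(beat_times_s):
    """Singing speed as beats per minute, from at least two detected
    beat timestamps (in seconds)."""
    span = beat_times_s[-1] - beat_times_s[0]
    return 60.0 * (len(beat_times_s) - 1) / span

def syllables_per_minute(n_syllables, duration_s):
    """Singing speed as syllables per minute over the sung duration."""
    return 60.0 * n_syllables / duration_s
```

For example, four beats at 0.5-second intervals give 120 BPM, and 150 syllables over a one-minute performance give 150 syllables per minute.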
With reference to the first aspect, in some possible embodiments, generating the target audio according to the acquired target sound effect scheme includes:
jointly performing sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound through the equalization parameter value, the compression parameter value, and the reverberation parameter value in the acquired target sound effect scheme, to generate the target audio.
With reference to the first aspect, in some possible embodiments, this joint sound effect processing includes:
adjusting, through the equalization parameter value in the target sound effect scheme, the degree of sound-quality improvement applied to the dry sound and the accompaniment of the associated song; adjusting, through the compression parameter value in the target sound effect scheme, the degree of dynamic repair applied to the dry sound and the accompaniment of the associated song; and separately adjusting, through the reverberation parameter value in the target sound effect scheme, the sound-quality improvement, the creation of spatial layering, and the degree of detail masking, to generate the target audio.
With reference to the first aspect, in some possible embodiments, preprocessing the acquired dry sound includes:
performing noise reduction and/or tone repair on the acquired dry sound.
In a second aspect, this application provides a processing apparatus, which includes:
a first acquiring unit, used to acquire a dry sound, where the dry sound includes fundamental frequency data of a song sung by a user;
a second acquiring unit, used to acquire timbre data of the dry sound;
a determining unit, used to determine at least one sound effect scheme according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data, where the sound effect scheme is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, to generate sound-effect-processed audio;
an output unit, used to output the at least one sound effect scheme; and
a generating unit, used to generate target audio according to an acquired target sound effect scheme, where the target sound effect scheme is one of the at least one sound effect scheme.
With reference to the second aspect, in some possible embodiments, the apparatus further includes a preprocessing unit, used to preprocess the acquired dry sound to obtain first preprocessed data; specifically, to perform noise reduction and/or tone repair on the acquired dry sound to obtain the first preprocessed data.
With reference to the second aspect, in some possible embodiments, the apparatus further includes a training unit, used to perform feature extraction on multiple labeled dry sound samples to extract second feature vectors, and to input the second feature vectors into a training model to be trained, to obtain a preset training model, where the second feature vectors are used to train the training model to be trained.
With reference to the second aspect, in some possible embodiments, the determining unit is also used to, before the at least one sound effect scheme is determined according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data: determine the accompaniment identification number of the accompaniment from the accompaniment of the song associated with the dry sound; determine the song associated with the dry sound from a first database including multiple songs through the accompaniment identification number; and determine the singing speed of the song from the determined song, where the accompaniment identification number of the song is associated with the song.
In a third aspect, this application provides a processing device, including an input device, an output device, a processor, and a memory that are connected to each other, where the memory is used to store application program code that supports the device in executing the above processing method, and the processor is configured to execute the processing method provided in the first aspect.
In a fourth aspect, this application provides a computer-readable storage medium for storing one or more computer programs; the one or more computer programs include instructions which, when the computer programs run on a computer, are used to execute the processing method of the first aspect.
In a fifth aspect, this application provides a computer program that includes processing instructions; when the computer program is executed on a computer, the processing instructions are used to execute the processing method provided in the first aspect.
This application provides a processing method, apparatus and device. First, a dry sound is acquired; the dry sound includes the fundamental frequency data of the song sung by the user. Next, the timbre data of the dry sound is acquired; the timbre data is acquired through a preset training model. Then, at least one sound effect scheme is determined according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data; the sound effect scheme is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, to generate sound-effect-processed audio. The at least one sound effect scheme is then output. Finally, target audio is generated according to the acquired target sound effect scheme; the target sound effect scheme is one of the at least one sound effect scheme. With this application, sound effect processing is performed on the dry sound and the accompaniment of the associated song through the acquired target sound effect scheme, which makes the generated sound-effect-processed audio more pleasant to hear.
附图说明Description of the drawings
为了更清楚地说明本申请实施例技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。In order to explain the technical solutions of the embodiments of the present application more clearly, the following will briefly introduce the drawings needed in the description of the embodiments. Obviously, the drawings in the following description are some embodiments of the present application. Ordinary technicians can obtain other drawings based on these drawings without creative work.
图1是本申请提供的一种处理系统的架构示意图;Figure 1 is a schematic diagram of the architecture of a processing system provided by the present application;
图2是本申请提供的一种干声的获取的示意流程图;Fig. 2 is a schematic flow chart of obtaining dry sound provided by this application;
图3是本申请提供的一种音效方案的示意图;FIG. 3 is a schematic diagram of a sound effect solution provided by this application;
图4是本申请提供的另一种音效方案的示意图;FIG. 4 is a schematic diagram of another sound effect solution provided by this application;
图5是本申请提供的一种处理方法的示意流程图;Fig. 5 is a schematic flowchart of a processing method provided by the present application;
图6是本申请提供的一种装置的示意性框图;Fig. 6 is a schematic block diagram of a device provided by the present application;
图7是本申请提供的一种设备的示意性框图。Fig. 7 is a schematic block diagram of a device provided by the present application.
具体实施方式Detailed Description
下面将结合本申请中的附图，对本申请中的技术方案进行清楚、完整地描述，显然，所描述的实施例是本申请一部分实施例，而不是全部的实施例。基于本申请中的实施例，本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例，都属于本申请保护的范围。The technical solutions in this application will be clearly and completely described below in conjunction with the drawings in this application. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of this application.
应当理解，当在本说明书和所附权利要求书中使用时，术语“包括”和“包含”指示所描述特征、整体、步骤、操作、元素和/或组件的存在，但并不排除一个或多个其它特征、整体、步骤、操作、元素、组件和/或其集合的存在或添加。It should be understood that, when used in this specification and the appended claims, the terms "comprising" and "including" indicate the presence of the described features, integers, steps, operations, elements, and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or collections thereof.
还应当理解,在此本申请说明书中所使用的术语仅仅是出于描述特定实施例的目的而并不意在限制本申请。如在本申请说明书和所附权利要求书中所使用的那样,除非上下文清楚地指明其它情况,否则单数形式的“一”、“一个”及“该”意在包括复数形式。It should also be understood that the terms used in the specification of this application are only for the purpose of describing specific embodiments and are not intended to limit the application. As used in the specification of this application and the appended claims, unless the context clearly indicates other circumstances, the singular forms "a", "an" and "the" are intended to include plural forms.
还应当进一步理解,在本申请说明书和所附权利要求书中使用的术语“和/或”是指相关联列出的项中的一个或多个的任何组合以及所有可能组合,并且包括这些组合。It should be further understood that the term "and/or" used in the specification and appended claims of this application refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations .
如在本说明书和所附权利要求书中所使用的那样，术语“如果”可以依据上下文被解释为“当...时”或“一旦”或“响应于确定”或“响应于检测到”。类似地，短语“如果确定”或“如果检测到[所描述条件或事件]”可以依据上下文被解释为意指“一旦确定”或“响应于确定”或“一旦检测到[所描述条件或事件]”或“响应于检测到[所描述条件或事件]”。As used in this specification and the appended claims, the term "if" may be interpreted, depending on the context, as "when", "once", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if [the described condition or event] is detected" may be interpreted, depending on the context, as "once it is determined", "in response to determining", "once [the described condition or event] is detected", or "in response to detecting [the described condition or event]".
具体实现中,本申请中描述的终端包括但不限于诸如具有触摸敏感表面(例如,触摸屏显示器和/或触摸板)的移动电话、膝上型计算机或平板计算机之类的其它便携式设备。还应当理解的是,在某些实施例中,所述设备并非便携式通信设备,而是具有触摸敏感表面(例如,触摸屏显示器和/或触摸板)的台式计算机。In specific implementation, the terminal described in this application includes, but is not limited to, other portable devices such as a mobile phone, a laptop computer, or a tablet computer with a touch-sensitive surface (for example, a touch screen display and/or a touch pad). It should also be understood that, in some embodiments, the device is not a portable communication device, but a desktop computer with a touch-sensitive surface (e.g., touch screen display and/or touch pad).
在接下来的讨论中,描述了包括显示器和触摸敏感表面的终端。然而,应当理解的是,终端可以包括诸如物理键盘、鼠标和/或控制杆的一个或多个其它物理用户接口设备。In the following discussion, a terminal including a display and a touch-sensitive surface is described. However, it should be understood that the terminal may include one or more other physical user interface devices such as a physical keyboard, mouse, and/or joystick.
终端支持各种应用程序，例如以下中的一个或多个：绘图应用程序、演示应用程序、文字处理应用程序、网站创建应用程序、盘刻录应用程序、电子表格应用程序、游戏应用程序、电话应用程序、视频会议应用程序、电子邮件应用程序、即时消息收发应用程序、锻炼支持应用程序、照片管理应用程序、数码相机应用程序、数字摄影机应用程序、web浏览应用程序、数字音乐播放器应用程序和/或数字视频播放器应用程序。The terminal supports various applications, such as one or more of the following: a drawing application, a presentation application, a word processing application, a website creation application, a disk burning application, a spreadsheet application, a game application, a telephone application, a video conferencing application, an email application, an instant messaging application, an exercise support application, a photo management application, a digital camera application, a digital video camera application, a web browsing application, a digital music player application, and/or a digital video player application.
可以在终端上执行的各种应用程序可以使用诸如触摸敏感表面的至少一个公共物理用户接口设备。可以在应用程序之间和/或相应应用程序内调整和/或改变触摸敏感表面的一个或多个功能以及终端上显示的相应信息。这样,终端的公共物理架构(例如,触摸敏感表面)可以支持具有对用户而言直观且透明的用户界面的各种应用程序。Various application programs that can be executed on the terminal can use at least one common physical user interface device such as a touch-sensitive surface. One or more functions of the touch-sensitive surface and corresponding information displayed on the terminal can be adjusted and/or changed between applications and/or within corresponding applications. In this way, the common physical architecture of the terminal (for example, a touch-sensitive surface) can support various applications with a user interface that is intuitive and transparent to the user.
为了更好的理解本申请,下面对本申请适用的处理系统的架构进行描述。请参阅图1,图1是本申请提供的一种处理系统的架构示意图。如图1所示,系统可包括但不限于:识别部分以及音效处理部分。In order to better understand this application, the following describes the architecture of the processing system applicable to this application. Please refer to FIG. 1, which is a schematic diagram of the architecture of a processing system provided by the present application. As shown in Figure 1, the system may include, but is not limited to: a recognition part and a sound effect processing part.
其中,识别部分可包括但不限于以下工作步骤:Among them, the identification part may include but not limited to the following working steps:
步骤一:获取干声,且从获取到的用户的干声中,识别出干声的基频数据。Step 1: Obtain dry voice, and identify the fundamental frequency data of the dry voice from the obtained dry voice of the user.
具体的,可通过录音软件对用户演唱歌曲的干声进行录制,以实现对干声的获取。Specifically, the dry voice of the user singing a song can be recorded through recording software to achieve the acquisition of the dry voice.
用户的干声可为用户演唱的无伴奏的纯人声,换句话说,干声可指录音以后的未经过后期处理(如动态、压缩或混响等)和加工的纯人声。The dry voice of the user may be a pure human voice without accompaniment sung by the user. In other words, the dry voice may refer to the pure human voice that has not undergone post-processing (such as dynamics, compression, or reverberation) and processing after recording.
应当说明的,基频数据为基音的频率数据,基音为发音体整体振动产生的最低的音(换句话说,基音为每个乐音中频率最低的纯音)。It should be noted that the fundamental frequency data is the frequency data of the fundamental tone, and the fundamental tone is the lowest sound produced by the overall vibration of the sounding body (in other words, the fundamental tone is the pure tone with the lowest frequency in each tone).
图2示例性示出了一种干声的获取的示意图。Fig. 2 exemplarily shows a schematic diagram of obtaining dry sound.
如图2所示，录音软件正在对用户演唱的歌曲(如光年之外)的干声进行录制。As shown in Figure 2, the recording software is recording the dry voice of a song sung by the user (for example, "Light Years Away").
具体的,可通过Praat语音学软件从用户的干声中识别出干声的基频数据。应当说明的,还可通过自相关算法、平行处理法、倒谱法和简化逆滤波器法从用户的干声中识别出干声的基频数据。Specifically, the fundamental frequency data of the dry voice can be identified from the dry voice of the user through the Praat phonetics software. It should be noted that the fundamental frequency data of the dry voice can also be identified from the dry voice of the user through the autocorrelation algorithm, parallel processing method, cepstrum method and simplified inverse filter method.
应当说明的,基频数据可包括基频上限、基频下限以及基频数据主调等部分。It should be noted that the fundamental frequency data may include the upper limit of the fundamental frequency, the lower limit of the fundamental frequency, and the main tone of the fundamental frequency data.
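The autocorrelation-based identification of the fundamental frequency mentioned above can be sketched as follows. This is a minimal illustration only: the frame length, search range, and sample rate are assumptions, not values disclosed in this application.

```python
import numpy as np

def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=500.0):
    """Estimate the fundamental frequency of one dry-voice frame by autocorrelation."""
    frame = frame - np.mean(frame)                 # remove any DC offset
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sample_rate / f0_max)            # shortest period to consider
    lag_max = int(sample_rate / f0_min)            # longest period to consider
    peak_lag = lag_min + int(np.argmax(ac[lag_min:lag_max]))
    return sample_rate / peak_lag                  # lag of the peak -> frequency

# A 220 Hz sine should be recognised as having F0 near 220 Hz.
sr = 16000
t = np.arange(1024) / sr
f0 = estimate_f0(np.sin(2 * np.pi * 220.0 * t), sr)
```

A real implementation would run this per frame over the recorded dry voice and also derive the upper limit, lower limit, and main value of the fundamental frequency data mentioned above.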
步骤二:对获取到的干声进行预处理,获得第一预处理数据。Step 2: Preprocess the acquired dry sound to obtain first preprocessed data.
具体的，对获取到的干声进行降噪及修音处理，获得降噪及修音后的第一预处理数据。Specifically, noise reduction and sound-repair processing are performed on the acquired dry voice to obtain the first preprocessed data after noise reduction and sound repair.
步骤三：将第一预处理数据进行特征提取，以提取出第一特征向量，将所述第一特征向量输入到预设训练模型中，通过预设训练模型将所述第一特征向量中泛音的分布和强度与获取到的干声的参考结果进行比对，以获得干声的音色数据(音色得分)；该预设训练模型为训练好的训练模型。Step 3: Perform feature extraction on the first preprocessed data to extract a first feature vector, input the first feature vector into a preset training model, and, through the preset training model, compare the distribution and intensity of the overtones in the first feature vector with the reference result of the acquired dry voice to obtain the timbre data (timbre score) of the dry voice; the preset training model is a trained model.
应当说明的,将第一预处理数据进行特征提取,将提取出的第一特征向量输入到预设训练模型中,以获得干声的音色数据之前,还包括以下步骤:It should be noted that before performing feature extraction on the first preprocessed data, and inputting the extracted first feature vector into the preset training model to obtain dry sound timbre data, the following steps are further included:
将多个被标注的干声的样本分别进行特征提取，以提取出第二特征向量，将第二特征向量分别输入到待训练的训练模型中，以获得预设训练模型；第二特征向量用于对待训练的训练模型进行训练。Feature extraction is performed on multiple labeled dry-voice samples to extract second feature vectors, and the second feature vectors are input into the training model to be trained to obtain the preset training model; the second feature vectors are used to train the training model to be trained.
将第一预处理数据进行特征提取,将提取出的第一特征向量输入到预设训练模型中,以获得干声的音色数据,具体可包括下述工作过程:Perform feature extraction on the first preprocessed data, and input the extracted first feature vector into a preset training model to obtain dry sound timbre data, which may specifically include the following working processes:
工作过程11:将第一预处理数据进行特征提取,以提取出第一特征向量,将提取出的第一特征向量输入到预设训练模型中。Work process 11: Perform feature extraction on the first preprocessed data to extract a first feature vector, and input the extracted first feature vector into a preset training model.
工作过程12:通过预设训练模型将提取出的第一特征向量中泛音的分布和强度与干声的参考结果进行比较,以获得干声的音色数据。应当说明的,干声的参考结果可为明星的干声对应的特征向量中泛音的分布和强度。Working process 12: The distribution and intensity of the overtones in the extracted first feature vector are compared with the reference result of the dry sound through a preset training model to obtain the timbre data of the dry sound. It should be noted that the reference result of the dry sound may be the distribution and intensity of the overtones in the feature vector corresponding to the dry sound of the star.
应当说明的，以基音为标准，发音体的各部分(二分之一或三分之一)也在振动，可为本申请实施例中的泛音，其中，泛音的组合可决定特定的音色，并能使人明确地感到基音的响度。It should be noted that, taking the fundamental tone as the reference, the parts of the sounding body (its halves, thirds, and so on) also vibrate; these vibrations produce the overtones referred to in the embodiments of this application. The combination of overtones determines a specific timbre and makes the loudness of the fundamental tone clearly perceptible.
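One hedged way to turn the comparison of overtone distribution and intensity into a timbre score is a cosine similarity between harmonic-strength vectors. The vectors and the 0-100 scale below are illustrative assumptions; the application does not specify the exact comparison.

```python
import numpy as np

def timbre_score(user_overtones, reference_overtones):
    """Score in [0, 100]: cosine similarity between the user's overtone-strength
    vector and a reference singer's vector; closer distributions score higher."""
    u = np.asarray(user_overtones, dtype=float)
    r = np.asarray(reference_overtones, dtype=float)
    cos = float(np.dot(u, r) / (np.linalg.norm(u) * np.linalg.norm(r)))
    return 100.0 * max(cos, 0.0)   # clamp so anti-correlated vectors score 0

reference = [1.0, 0.6, 0.35, 0.2, 0.1]            # hypothetical harmonic strengths
identical = timbre_score(reference, reference)     # a perfect match
different = timbre_score([0.1, 0.2, 0.35, 0.6, 1.0], reference)
```

In this sketch a dry voice whose overtone distribution matches the reference exactly scores 100, and a dissimilar distribution scores lower, mirroring the comparison performed inside the preset training model.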
步骤四:获取干声关联的歌曲的歌唱速度。Step 4: Get the singing speed of the song associated with the dry voice.
具体的,获取到的干声关联的歌曲的演唱速度,具体可为:Specifically, the obtained singing speed of the song associated with the dry voice may specifically be:
获取到的干声关联的歌曲的每分钟节拍数(BPM);The number of beats per minute (BPM) of the song associated with the dry sound obtained;
或者,or,
获取到的干声关联的歌曲的每分钟音节数(SPM)。The number of syllables per minute (SPM) of the song associated with the obtained dry sound.
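The two singing-speed measures can be sketched as simple rates; the beat timestamps and syllable counts below are hypothetical inputs.

```python
def beats_per_minute(beat_times):
    """BPM from a list of beat timestamps in seconds: beats per elapsed minute."""
    if len(beat_times) < 2:
        raise ValueError("need at least two beats")
    elapsed_min = (beat_times[-1] - beat_times[0]) / 60.0
    return (len(beat_times) - 1) / elapsed_min

def syllables_per_minute(syllable_count, duration_seconds):
    """SPM: total sung syllables divided by the song duration in minutes."""
    return syllable_count / (duration_seconds / 60.0)

bpm = beats_per_minute([0.0, 0.5, 1.0, 1.5, 2.0])  # one beat every 0.5 s
spm = syllables_per_minute(300, 120.0)             # 300 syllables in 2 minutes
```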
应当说明的,获取干声关联的歌曲的歌唱速度,具体可包括以下工作过程:It should be noted that obtaining the singing speed of songs associated with dry voice may specifically include the following working processes:
工作过程21:通过干声关联的歌曲的伴奏,确定出伴奏的伴奏标识号码(ID)。Work process 21: Determine the accompaniment identification number (ID) of the accompaniment through the accompaniment of the song associated with the dry sound.
工作过程22：通过伴奏标识号码从包括多首歌曲的第二数据库中确定出干声关联的歌曲。Work process 22: Determine the song associated with the dry voice from the second database, which includes multiple songs, through the accompaniment identification number.
其中,第二数据库可为包括有多首歌曲的曲库。Wherein, the second database may be a music library including multiple songs.
工作过程23:根据确定出的歌曲,确定出歌曲的歌唱速度,其中,歌曲的伴奏标识号码与歌曲关联。Work process 23: Determine the singing speed of the song according to the determined song, where the accompaniment identification number of the song is associated with the song.
应当说明的,一首歌曲可关联一个或多个伴奏。It should be noted that one song can be associated with one or more accompaniments.
如果一首歌曲关联多个伴奏,其中,每一个伴奏可拥有唯一的伴奏标识号码。If a song is associated with multiple accompaniments, each accompaniment can have a unique accompaniment identification number.
举例来说，针对于歌曲《爱你一万年》，与歌曲《爱你一万年》相关联的伴奏可包括：男声伴奏、女声伴奏以及DJ伴奏等，其中，男声伴奏的伴奏标识号码可为11、女声伴奏的伴奏标识号码可为22以及DJ伴奏的伴奏标识号码可为33。For example, for the song "Love You Ten Thousand Years", the accompaniments associated with the song may include a male-voice accompaniment, a female-voice accompaniment, a DJ accompaniment, and so on, where the accompaniment identification number of the male-voice accompaniment may be 11, that of the female-voice accompaniment may be 22, and that of the DJ accompaniment may be 33.
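Work processes 21-23 amount to a keyed lookup from accompaniment ID to song entry. The dictionary below is a hypothetical stand-in for the second database; the IDs 11/22/33 follow the example above, and the BPM values are invented for illustration.

```python
# Hypothetical second database: accompaniment ID -> song metadata.
SONG_DB = {
    11: {"song": "Love You Ten Thousand Years", "accompaniment": "male",   "bpm": 92},
    22: {"song": "Love You Ten Thousand Years", "accompaniment": "female", "bpm": 92},
    33: {"song": "Love You Ten Thousand Years", "accompaniment": "DJ",     "bpm": 128},
}

def singing_speed_for(accompaniment_id):
    """Resolve the accompaniment ID to its song entry and return the singing
    speed recorded for that accompaniment (work processes 21-23)."""
    entry = SONG_DB.get(accompaniment_id)
    if entry is None:
        raise KeyError(f"unknown accompaniment ID {accompaniment_id}")
    return entry["bpm"]

speed = singing_speed_for(33)
```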
音效处理部分可包括但不限于以下工作过程:The sound effect processing part can include but is not limited to the following working processes:
根据获取到的干声的音色数据、干声关联的歌曲的演唱速度以及基频数据确定出至少一个音效方案；音效方案用于对干声和干声关联的歌曲的伴奏进行音效处理，以生成音效处理后的音频。At least one sound effect scheme is determined according to the acquired timbre data of the dry voice, the singing speed of the song associated with the dry voice, and the fundamental frequency data; a sound effect scheme is used to perform sound effect processing on the dry voice and the accompaniment of the associated song to generate processed audio.
具体的,根据所获取的目标音效方案,生成目标音频,包括:Specifically, generating the target audio according to the acquired target sound effect scheme includes:
通过所获取到的目标音效方案中的均衡参数值、压缩参数值以及混响参数值,联合对干声和干声关联的歌曲的伴奏进行音效处理,生成目标音频。Through the acquired equalization parameter value, compression parameter value and reverberation parameter value in the target sound effect scheme, the sound effect processing is jointly performed on the accompaniment of the song associated with the dry sound and the dry sound to generate the target audio.
更具体的，可通过目标音效方案中的均衡参数值对干声和干声关联的歌曲的伴奏的音质的改善程度进行调整、目标音效方案中的压缩参数值对干声和干声关联的歌曲的伴奏的动态修补程度进行调整以及目标音效方案中的混响参数值对干声和干声关联的歌曲的伴奏的音质的改善、空间制造层次的营造、细节掩盖程度分别进行调整。More specifically, the equalization parameter values in the target sound effect scheme adjust the degree of sound-quality improvement of the dry voice and the accompaniment of the associated song; the compression parameter values adjust the degree of dynamic repair; and the reverberation parameter values respectively adjust the sound-quality improvement, the creation of spatial layering, and the degree to which details are masked.
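A heavily simplified sketch of applying the three parameter groups jointly is shown below. Real equalization, compression, and reverberation are far more elaborate; here EQ is reduced to a broadband gain, compression to a hard knee, and reverberation to a single delayed echo, purely to illustrate how one scheme's parameter values shape the output audio.

```python
import numpy as np

def apply_scheme(signal, eq_gain, threshold, ratio, reverb_mix, reverb_delay):
    """Minimal stand-ins for a scheme's equalization, compression, and
    reverberation parameter values, applied jointly to one signal."""
    out = signal * eq_gain                                   # "equalization": broadband gain
    over = np.abs(out) > threshold                           # compress samples above threshold
    out[over] = np.sign(out[over]) * (threshold + (np.abs(out[over]) - threshold) / ratio)
    wet = np.zeros_like(out)                                 # "reverberation": one delayed echo
    wet[reverb_delay:] = out[:-reverb_delay]
    return (1.0 - reverb_mix) * out + reverb_mix * wet       # dry/wet blend

dry = np.zeros(8)
dry[0] = 1.0                                                 # a single impulse as input
processed = apply_scheme(dry, eq_gain=2.0, threshold=1.5, ratio=4.0,
                         reverb_mix=0.3, reverb_delay=3)
```

With these assumed values, the impulse is boosted to 2.0, compressed to 1.625 (1.5 plus 0.5/4), and blended 70/30 with its echo three samples later.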
输出至少一个音效方案。Output at least one sound effect scheme.
具体的,输出至少一个音效方案,具体可包括但不限于以下形式:Specifically, output at least one sound effect scheme, which may specifically include but is not limited to the following forms:
将显示至少一个音效方案,或者语音播放该至少一个音效方案。At least one sound effect scheme will be displayed, or the at least one sound effect scheme will be played.
应当说明的,至少一个音效方案可包括但不限于以下两种情形:It should be noted that at least one sound effect scheme can include but is not limited to the following two situations:
情形1:至少一个音效方案,包括:一个音效方案;该音效方案用于对干声和干声关联的歌曲的伴奏进行音效处理,以生成音效处理后的音频。Case 1: At least one sound effect scheme, including: a sound effect scheme; the sound effect scheme is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound to generate audio after sound effect processing.
情形2：至少一个音效方案，包括：多个音效方案；多个音效方案中每一个音效方案分别用于对干声和干声关联的歌曲的伴奏进行音效处理，以分别生成多个音效处理后的音频。Case 2: The at least one sound effect scheme includes multiple sound effect schemes; each of the multiple sound effect schemes is used to perform sound effect processing on the dry voice and the accompaniment of the associated song, so as to respectively generate multiple pieces of processed audio.
图3示例性示出了一种音效方案。Fig. 3 exemplarily shows a sound effect scheme.
如图3所示,音效方案可包括四个可分别用于对干声和干声关联的歌曲的伴奏进行音效处理的音效方案。As shown in FIG. 3, the sound effect scheme may include four sound effect schemes that can be used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, respectively.
应当说明的,音效方案的名称可为AI混响,该音效方案可用于根据用户的干声制定出该用户的专属的音频,且该用户可试听通过该音效方案制定出的歌声。具体的,四个音效方案如下:It should be noted that the name of the sound effect scheme may be AI reverberation, and the sound effect scheme can be used to formulate the user's exclusive audio based on the user's dry voice, and the user can audition the singing voice developed through the sound effect scheme. Specifically, the four sound effect schemes are as follows:
音效方案1:用户的干声的音色数据、干声关联的歌曲的演唱速度、基频数据与理想的干声的音色数据、理想的干声关联的歌曲的演唱速度及基频数据之间的匹配度为90%的音效方案。Sound effect scheme 1: The timbre data of the user's dry voice, the singing speed of the song associated with the dry voice, the timbre data of the fundamental frequency data and the ideal dry voice, the singing speed of the song associated with the ideal dry voice and the fundamental frequency data A sound effect scheme with a matching degree of 90%.
音效方案2:用户的干声的音色数据、干声关联的歌曲的演唱速度、基频数据与理想的干声的音色数据、理想的干声关联的歌曲的演唱速度及基频数据之间的匹配度为80%的音效方案。Sound effect plan 2: User’s dry voice timbre data, dry voice-related song singing speed, fundamental frequency data and ideal dry voice timbre data, ideal dry voice-related song singing speed and fundamental frequency data A sound effect scheme with a matching degree of 80%.
音效方案3:用户的干声的音色数据、干声关联的歌曲的演唱速度、基频数据与理想的干声的音色数据、理想的干声关联的歌曲的演唱速度及基频数据之间的匹配度60%的音效方案。Sound effect scheme 3: The timbre data of the user’s dry voice, the singing speed of the song associated with the dry voice, the timbre data of the fundamental frequency data and the ideal dry voice, the singing speed of the song associated with the ideal dry voice and the fundamental frequency data A sound effect scheme with a matching degree of 60%.
音效方案4：用户的干声的音色数据、干声关联的歌曲的演唱速度、基频数据与理想的干声的音色数据、理想的干声关联的歌曲的演唱速度及基频数据之间的匹配度为90%的音效方案为推荐的(建议用户优先选择的)可用于对所述干声和所述干声关联的歌曲的伴奏进行音效处理的音效方案，其他的三个方案为(建议用户次优先选择的)可供用户自行选择的以对干声和干声关联的歌曲的伴奏进行音效处理的音效方案。Sound effect scheme 4: the sound effect scheme whose matching degree between the timbre data of the user's dry voice, the singing speed of the associated song, and the fundamental frequency data and those of the ideal dry voice is 90% is the recommended scheme (suggested to the user as the first choice) for performing sound effect processing on the dry voice and the accompaniment of the associated song; the other three schemes are second-priority options that the user may select to perform the sound effect processing.
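Selecting the recommended scheme from the candidates can be sketched as sorting by matching degree, with the top entry suggested first and the rest offered as second-priority choices. The scheme names and scores below are hypothetical.

```python
def rank_schemes(schemes):
    """Order candidate schemes so the highest matching degree is recommended
    first; the remainder stay selectable as second-priority options."""
    return sorted(schemes, key=lambda s: s["match"], reverse=True)

candidates = [
    {"name": "scheme 2", "match": 0.80},
    {"name": "scheme 1", "match": 0.90},
    {"name": "scheme 3", "match": 0.60},
]
ranked = rank_schemes(candidates)
recommended = ranked[0]["name"]   # the first-choice suggestion for the user
```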
应当说明的，该音效方案的名称还可为智能音效。该音效方案可用于根据用户的干声制定出该用户的专属的音频，其中，如果用户未佩戴耳机对用户的干声进行录制，则可能影响通过该音效方案制定出的音频的效果；因而处理系统提示或建议用户在歌唱的过程中佩戴耳机进行录制。It should be noted that the name of the sound effect scheme may also be "smart sound effect". The sound effect scheme can be used to produce the user's exclusive audio based on the user's dry voice; if the user does not wear headphones while the dry voice is recorded, the effect of the audio produced through the sound effect scheme may be affected. Therefore, the processing system prompts or recommends that the user wear headphones for recording while singing.
应当说明的，用户可试听通过该音效方案制定出的音频。如果该用户为贵宾(Very Important Person, VIP)，该用户可对利用该音效方案根据该用户的干声所制定出的音频进行发布。It should be noted that the user can audition the audio produced through the sound effect scheme. If the user is a VIP (Very Important Person), the user can publish the audio produced by the sound effect scheme based on the user's dry voice.
图4示例性示出了另一种音效方案。Figure 4 exemplarily shows another sound effect scheme.
如图4所示,音效方案可包括多个可分别用于对干声和干声关联的歌曲的伴奏进行音效处理的音效方案。As shown in FIG. 4, the sound effect scheme may include multiple sound effect schemes that can be used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound.
具体的，该音效方案可包括但不限于：用户的干声的音色数据、干声关联的歌曲的演唱速度、基频数据与理想的干声的音色数据、干声关联的歌曲的演唱速度、基频数据之间的匹配度为90%的音效方案、KTV音效的音效方案、磁性音效的音效方案、歌声音效的音效方案以及悠远意境音效的音效方案等等。Specifically, the sound effect schemes may include, but are not limited to: the scheme whose matching degree between the timbre data of the user's dry voice, the singing speed of the associated song, and the fundamental frequency data and those of the ideal dry voice is 90%; a KTV sound effect scheme; a magnetic sound effect scheme; a vocal sound effect scheme; a distant-ambience sound effect scheme; and so on.
应当说明的,用户的干声的音色数据、干声关联的歌曲的演唱速度、基频数据与理想的干声的音色数据、理想的干声关联的歌曲的演唱速度及基频数据之间的匹配度为90%的音效方案可为推荐给用户的可用于对干声和干声关联的歌曲的伴奏进行音效处理的优选的音效方案。其他的方案可为推荐给用户的次优选的可用于对干声和干声关联的歌曲的伴奏进行音效处理的音效方案。It should be noted that the timbre data of the user’s dry voice, the singing speed of the song associated with the dry voice, the fundamental frequency data and the timbre data of the ideal dry voice, the singing speed of the song associated with the ideal dry voice, and the fundamental frequency data The sound effect scheme with a matching degree of 90% may be a preferred sound effect scheme recommended to the user that can be used to perform sound effect processing on the accompaniment of the dry sound and the song associated with the dry sound. The other solution may be a second-preferred sound effect solution recommended to the user that can be used to perform sound effect processing on the accompaniment of the dry sound and the song associated with the dry sound.
综上所述,本申请实施例提供了一种处理系统。该处理系统包括:识别部分以及音效处理部分。该处理系统通过识别部分获取干声,干声包括用户演唱歌曲的基频数据。进而,该处理系统通过识别部分获取干声的音色数据,音色数据是通过预设训练模型获取的。然后,该处理系统通过音效处理部分根据获取到的干声的音色数据、干声关联的歌曲的演唱速度以及基频数据,确定出至少一个音效方案,音效方案用于对干声和干声关联的歌曲的伴奏进行音效处理,以生成音效处理后的音频。接着,该处理系统通过音效处理部分输出至少一个音效方案。最后,该处理系统通过音效处理部分根据所获取的目标音效方案,生成目标音频;目标音效方案为所述至少一个音效方案中的一个音效方案。采用本申请实施例,通过从包括多个音效方案的第一数据库中确定出目标音效方案对干声和干声关联的歌曲的伴奏进行音效处理,可使得生成的音效处理后的音频更加的动听。In summary, the embodiment of the present application provides a processing system. The processing system includes a recognition part and a sound effect processing part. The processing system obtains the dry voice through the recognition part, and the dry voice includes the fundamental frequency data of the song sung by the user. Furthermore, the processing system obtains the timbre data of the dry sound through the recognition part, and the timbre data is obtained through the preset training model. Then, the processing system determines at least one sound effect scheme through the sound effect processing part according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data, and the sound effect scheme is used to associate the dry sound with the dry sound. The accompaniment of the song is processed with sound effects to generate audio after sound effect processing. Then, the processing system outputs at least one sound effect scheme through the sound effect processing part. Finally, the processing system generates the target audio according to the acquired target sound effect scheme through the sound effect processing part; the target sound effect scheme is one of the at least one sound effect scheme. 
According to the embodiments of the present application, a target sound effect scheme is determined from a first database including multiple sound effect schemes to perform sound effect processing on the dry voice and the accompaniment of the associated song, so that the generated processed audio is more pleasant to hear.
可理解的,图2~图4仅仅用于解释本申请实施例,不应对本申请做出限制。It is understandable that FIGS. 2 to 4 are only used to explain the embodiments of the present application, and should not limit the present application.
参见图5,是本申请提供的一种处理方法的示意流程图。如图5所示,该方法可以至少包括以下几个步骤:Refer to FIG. 5, which is a schematic flowchart of a processing method provided by the present application. As shown in Figure 5, the method can at least include the following steps:
S501、获取干声。S501: Obtain dry sound.
本申请实施例中,干声包括用户演唱歌曲的基频数据。In the embodiment of this application, the dry voice includes the fundamental frequency data of the song sung by the user.
应当说明的，干声中的基频数据可通过Praat语音学软件从用户的干声中识别出，还可通过自相关算法、平行处理法、倒谱法和简化逆滤波器法从用户的干声中识别出。It should be noted that the fundamental frequency data in the dry voice can be identified from the user's dry voice by the Praat phonetics software, or by an autocorrelation algorithm, a parallel processing method, a cepstrum method, or a simplified inverse filtering method.
应当说明的,歌曲可为由歌词和曲谱结合的一种艺术形式。It should be noted that a song can be an art form that combines lyrics and scores.
干声可为用户演唱的无伴奏的纯人声,换句话说,干声可指录音以后的未经过后期处理(如动态、压缩或混响等)或加工的纯人声。The dry voice can be a pure human voice without accompaniment sung by the user. In other words, the dry voice can refer to the pure human voice without post-processing (such as dynamics, compression or reverberation, etc.) or processing after recording.
应当说明的,基频数据为基音的频率数据,基音为发音体整体振动产生的最低的音(换句话说,基音为每个乐音中频率最低的纯音)。It should be noted that the fundamental frequency data is the frequency data of the fundamental tone, and the fundamental tone is the lowest sound produced by the overall vibration of the sounding body (in other words, the fundamental tone is the pure tone with the lowest frequency in each tone).
S502、获取干声的音色数据。S502. Acquire timbre data of dry sound.
本申请实施例中，获得所述干声的音色数据之前，还包括以下工作步骤：In the embodiments of this application, before the timbre data of the dry voice is obtained, the following working steps are further included:
工作步骤1:对获取到的干声进行预处理,获得第一预处理数据。Work step 1: preprocess the acquired dry sound to obtain first preprocessed data.
具体的,对获取到的干声进行预处理,具体可包括以下工作过程:Specifically, preprocessing the acquired dry sound can specifically include the following working processes:
对获取到的干声进行降噪和/或修音,以获得降噪及修音后的第一预处理数据。Perform noise reduction and/or sound modification on the acquired dry sound to obtain first preprocessed data after noise reduction and sound modification.
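As an illustrative stand-in for the noise-reduction preprocessing (the application does not disclose a specific algorithm), a simple amplitude gate is sketched below; the threshold value is an assumption.

```python
import numpy as np

def noise_gate(signal, threshold=0.02):
    """Crude preprocessing stand-in: mute samples whose amplitude falls below a
    threshold, treating low-level content between sung phrases as noise."""
    out = np.array(signal, dtype=float)
    out[np.abs(out) < threshold] = 0.0
    return out

gated = noise_gate([0.5, 0.01, -0.3, -0.005, 0.2])
```

In this sketch the two tiny samples are zeroed while the louder (sung) samples pass through unchanged, producing the "first preprocessed data" handed to the next work step.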
工作步骤2：将第一预处理数据进行特征提取，以提取出第一特征向量，将第一特征向量输入到预设训练模型中，通过预设训练模型将第一特征向量中泛音的分布和强度与获取到的干声的参考结果进行比对，以获得干声的音色数据；其中，预设训练模型为训练好的训练模型。Work step 2: Perform feature extraction on the first preprocessed data to extract a first feature vector, input the first feature vector into a preset training model, and, through the preset training model, compare the distribution and intensity of the overtones in the first feature vector with the reference result of the acquired dry voice to obtain the timbre data of the dry voice; the preset training model is a trained model.
其中，干声的参考结果可为明星的干声对应的特征向量中泛音的分布和强度。如果第一特征向量中泛音的分布和强度与干声的参考结果越接近，则用户的干声的得分越高。The reference result of the dry voice may be the distribution and intensity of the overtones in the feature vector corresponding to a star's dry voice. The closer the distribution and intensity of the overtones in the first feature vector are to the reference result, the higher the score of the user's dry voice.
应当说明的，以基音为标准，发音体的各部分(二分之一或三分之一)也在振动，可为本申请实施例中的泛音，其中，泛音的组合可决定特定的音色，并能使人明确地感到基音的响度。It should be noted that, taking the fundamental tone as the reference, the parts of the sounding body (its halves, thirds, and so on) also vibrate; these vibrations produce the overtones referred to in the embodiments of this application. The combination of overtones determines a specific timbre and makes the loudness of the fundamental tone clearly perceptible.
应当说明的，将第一预处理数据进行特征提取，将提取出的第一特征向量输入到预设训练模型中，以获得干声的音色数据之前，还包括以下步骤：It should be noted that before feature extraction is performed on the first preprocessed data and the extracted first feature vector is input into the preset training model to obtain the timbre data of the dry voice, the following steps are further included:
将多个被标注的干声的样本分别进行特征提取，以提取出第二特征向量，将第二特征向量分别输入到待训练的训练模型中，以获得预设训练模型；第二特征向量用于对待训练的训练模型进行训练。Feature extraction is performed on multiple labeled dry-voice samples to extract second feature vectors, and the second feature vectors are input into the training model to be trained to obtain the preset training model; the second feature vectors are used to train the training model to be trained.
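The training step can be sketched with an ordinary least-squares model standing in for the unspecified "training model to be trained": labeled dry-voice samples supply second feature vectors and annotated scores. The features and labels below are invented for illustration.

```python
import numpy as np

def fit_timbre_model(features, scores):
    """Least-squares linear fit as a stand-in for training the model: features
    are second feature vectors of labeled dry-voice samples, scores the labels."""
    X = np.hstack([np.asarray(features, dtype=float), np.ones((len(features), 1))])
    w, *_ = np.linalg.lstsq(X, np.asarray(scores, dtype=float), rcond=None)
    return w                                   # learned weights + bias term

def predict_timbre(w, feature):
    """Score a new first feature vector with the 'preset' (trained) model."""
    x = np.append(np.asarray(feature, dtype=float), 1.0)
    return float(x @ w)

# Hypothetical labeled samples: two overtone-energy features with annotated scores.
feats = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
labels = [90.0, 80.0, 20.0, 10.0]
w = fit_timbre_model(feats, labels)
pred = predict_timbre(w, [0.85, 0.15])
```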
S503、根据获取到的干声的音色数据、干声关联的歌曲的演唱速度以及基频数据确定出至少一个音效方案。S503: Determine at least one sound effect scheme according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data.
本申请实施例中,音效方案用于对干声和干声关联的歌曲的伴奏进行音效处理,以生成音效处理后的音频。In the embodiment of the present application, the sound effect solution is used to perform sound effect processing on the accompaniment of the dry sound and the song associated with the dry sound to generate audio after sound effect processing.
应当说明的,输出至少一个音效方案之后,根据所获取的目标音效方案,生成目标音频之前,还包括下述步骤:It should be noted that after at least one sound effect scheme is output, the following steps are further included before generating the target audio according to the obtained target sound effect scheme:
步骤1:接收目标指令;该目标指令用于指示目标音效方案(也即是说,指示出获取与目标指令相关联的目标音效方案)。Step 1: Receive a target instruction; the target instruction is used to indicate a target sound effect scheme (that is, indicate to obtain a target sound effect scheme associated with the target instruction).
步骤2:响应于接收到的目标指令,获取目标音效方案。Step 2: In response to the received target instruction, obtain the target sound effect scheme.
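Steps 1-2 can be sketched as a small handler that resolves the received target instruction to one of the output schemes; the scheme names and the instruction format are assumptions, not part of the disclosed method.

```python
def handle_target_instruction(schemes, instruction):
    """Step 1-2 sketch: the target instruction names one of the output sound
    effect schemes; respond by fetching it as the target sound effect scheme."""
    for scheme in schemes:
        if scheme["name"] == instruction["target"]:
            return scheme
    raise LookupError(f"no scheme named {instruction['target']!r}")

schemes = [{"name": "AI reverb"}, {"name": "KTV"}, {"name": "magnetic"}]
target = handle_target_instruction(schemes, {"target": "KTV"})
```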
应当说明的,根据获取到的干声的音色数据、干声关联的歌曲的演唱速度以及基频数据确定出至少一个音效方案之前,还包括以下工作过程:It should be noted that before determining at least one sound effect scheme based on the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data, the following working process is also included:
工作过程1:通过干声关联的歌曲的伴奏,确定出伴奏的伴奏标识号码。Work process 1: Determine the accompaniment identification number of the accompaniment through the accompaniment of the song associated with the dry sound.
工作过程2:通过伴奏标识号码从包括多首歌曲的第一数据库中确定出干声关联的歌曲。Work process 2: Determine the dry-voice-related songs from the first database including multiple songs through the accompaniment identification number.
工作过程3:根据确定出的歌曲,确定出歌曲的歌唱速度,其中,歌曲的伴奏标识号码与歌曲关联。Work process 3: Determine the singing speed of the song according to the determined song, where the accompaniment identification number of the song is associated with the song.
应当说明的,干声关联的歌曲的演唱速度,具体可为:It should be noted that the singing speed of songs related to dry voice can be specifically:
获取到的干声关联的歌曲的每分钟节拍数。The number of beats per minute of the song associated with the obtained dry sound.
或者,or,
获取到的干声关联的歌曲的每分钟音节数。The number of syllables per minute of the song associated with the obtained dry sound.
应当说明的,至少一个音效方案可包括但不限于以下两种情形;It should be noted that at least one sound effect scheme can include but is not limited to the following two situations:
情形1:至少一个音效方案,可包括:一个音效方案;该方案用于对干声和干声关联的歌曲的伴奏进行音效处理,以生成音效处理后的音频。Case 1: At least one sound effect scheme, which may include: one sound effect scheme; this scheme is used to perform sound effect processing on the accompaniment of the dry sound and the song associated with the dry sound to generate audio after sound effect processing.
情形2:至少一个音效方案,可包括:多个音效方案;多个音效方案中每一个音效方案分别用于对干声和干声关联的歌曲的伴奏进行音效处理,以分别 生成多个音效处理后的音频。Case 2: At least one sound effect scheme, which may include: multiple sound effect schemes; each of the multiple sound effect schemes is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound to generate multiple sound effect processing respectively After the audio.
S504: Output the at least one sound effect scheme.
Specifically, outputting the at least one sound effect scheme may include, but is not limited to, the following forms:
displaying the at least one sound effect scheme, or playing the at least one sound effect scheme by voice.
S505: Generate target audio according to the acquired target sound effect scheme.
In this embodiment of this application, generating the target audio according to the acquired target sound effect scheme may specifically include the following process:
jointly performing, by using the equalization parameter value, the compression parameter value, and the reverberation parameter value in the acquired target sound effect scheme, sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, to generate the target audio.
More specifically, the equalization parameter value in the target sound effect scheme is used to adjust the degree of improvement of the sound quality of the dry sound and the accompaniment of the song associated with the dry sound; the compression parameter value in the target sound effect scheme is used to adjust the degree of dynamic repair of the dry sound and the accompaniment of the song associated with the dry sound; and the reverberation parameter value in the target sound effect scheme is used to separately adjust the improvement of the sound quality, the creation of a sense of spatial layering, and the degree of detail masking of the dry sound and the accompaniment of the song associated with the dry sound.
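As a rough illustration only — the disclosure does not specify the signal-processing internals — a toy chain applying the three parameter values to a mixed vocal-plus-accompaniment signal might look like the following. The parameter names, the one-band gain standing in for equalization, the static compressor, and the single-tap echo standing in for reverberation are all simplifying assumptions:

```python
# Toy sound-effect chain: equalization gain -> static compression -> echo mix.
# Scheme keys and the whole design are illustrative assumptions for this sketch.
def apply_scheme(samples, scheme):
    eq_gain = scheme["eq_gain"]            # "equalization": overall tonal gain
    threshold = scheme["comp_threshold"]   # compression: level above which to attenuate
    ratio = scheme["comp_ratio"]           # compression ratio applied to the excess
    wet = scheme["reverb_wet"]             # "reverberation": wet/dry mix amount
    delay = scheme["reverb_delay"]         # "reverberation": delay in samples

    out = []
    for i, s in enumerate(samples):
        s *= eq_gain
        # static compression: reduce the excess above the threshold by the ratio
        if abs(s) > threshold:
            excess = abs(s) - threshold
            s = (threshold + excess / ratio) * (1.0 if s >= 0 else -1.0)
        # single-tap delay as a minimal stand-in for reverberation
        echo = out[i - delay] if i >= delay else 0.0
        out.append((1.0 - wet) * s + wet * echo)
    return out
```

With compression only (`eq_gain=1.0`, `comp_threshold=0.5`, `comp_ratio=2.0`, `reverb_wet=0.0`), a 0.9 sample is reduced to 0.7; with the wet mix raised, earlier output samples bleed into later ones as a crude spatial effect.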
In summary, this embodiment of this application provides a processing method. First, a dry sound is acquired, where the dry sound includes fundamental frequency data of a song sung by a user. Then, timbre data of the dry sound is acquired, where the timbre data is acquired through a preset training model. Next, at least one sound effect scheme is determined according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data, where the sound effect scheme is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, so as to generate sound-effect-processed audio. The at least one sound effect scheme is then output. Finally, target audio is generated according to the acquired target sound effect scheme, where the target sound effect scheme is one of the at least one sound effect scheme. With this embodiment of this application, sound effect processing can be performed on the dry sound and the accompaniment of the song associated with the dry sound by using the acquired target sound effect scheme, so that the generated sound-effect-processed audio is more pleasant to hear.
It can be understood that, for related definitions and descriptions not provided in the method embodiment of FIG. 5, reference may be made to the embodiment of FIG. 1, and details are not repeated here.
Refer to FIG. 6, which shows a processing apparatus provided by this application. As shown in FIG. 6, the processing apparatus 60 includes a first acquiring unit 601, a second acquiring unit 602, a determining unit 603, an output unit 604, and a generating unit 605, where:
the first acquiring unit 601 is configured to acquire a dry sound, where the dry sound includes fundamental frequency data of a song sung by a user;
the second acquiring unit 602 is configured to acquire timbre data of the dry sound;
the determining unit 603 is configured to determine a target sound effect scheme from a first database comprising multiple sound effect schemes according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data identified from the dry sound, where the target sound effect scheme is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, so as to generate sound-effect-processed audio; and
the output unit 604 is configured to output at least one sound effect scheme.
It should be noted that the at least one sound effect scheme may include, but is not limited to, the following two cases:
Case 1: The at least one sound effect scheme may include one sound effect scheme, which is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, so as to generate sound-effect-processed audio.
Case 2: The at least one sound effect scheme may include multiple sound effect schemes, where each of the multiple sound effect schemes is separately used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, so as to separately generate multiple pieces of sound-effect-processed audio.
The generating unit 605 is configured to generate target audio according to the acquired target sound effect scheme, where the target sound effect scheme is one of the at least one sound effect scheme.
The generating unit 605 may be specifically configured to: adjust, by using the equalization parameter value in the target sound effect scheme, the degree of improvement of the sound quality of the dry sound and the accompaniment of the song associated with the dry sound; adjust, by using the compression parameter value in the target sound effect scheme, the degree of dynamic repair of the dry sound and the accompaniment of the song associated with the dry sound; and separately adjust, by using the reverberation parameter value in the target sound effect scheme, the improvement of the sound quality, the creation of a sense of spatial layering, and the degree of detail masking of the dry sound and the accompaniment of the song associated with the dry sound, to generate the target audio.
In addition to the first acquiring unit 601, the second acquiring unit 602, the determining unit 603, the output unit 604, and the generating unit 605, the processing apparatus 60 further includes a preprocessing unit.
The preprocessing unit is configured to preprocess the acquired dry sound to obtain first preprocessed data.
Specifically, noise reduction and/or tone repair are performed on the acquired dry sound to obtain the first preprocessed data.
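As a toy sketch of this preprocessing step, a simple noise gate can stand in for noise reduction; the threshold and the gate design are assumptions (real denoising would typically use spectral methods, and tone repair/pitch correction is omitted entirely):

```python
# Noise gate as an illustrative stand-in for the noise-reduction preprocessing.
def noise_gate(samples, threshold=0.05):
    """Zero out samples whose magnitude falls below the noise threshold."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]
```

For example, `noise_gate([0.5, 0.01, -0.2])` keeps the two louder samples and silences the quiet one.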
In addition to the first acquiring unit 601, the second acquiring unit 602, the determining unit 603, the output unit 604, and the generating unit 605, the processing apparatus 60 further includes a training unit.
The training unit is configured to separately perform feature extraction on multiple labeled dry sound samples to extract second feature vectors, and separately input the second feature vectors into a training model to be trained, so as to obtain the preset training model, where the second feature vectors are used to train the training model to be trained.
It should be noted that the determining unit is further configured to, before the at least one sound effect scheme is determined according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data:
determine, from the accompaniment of the song associated with the dry sound, the accompaniment identification number of the accompaniment;
determine, by using the accompaniment identification number, the song associated with the dry sound from the first database comprising multiple songs; and
determine the singing speed of the song according to the determined song, where the accompaniment identification number of the song is associated with the song.
In summary, in this embodiment of this application, the apparatus 60 may acquire a dry sound through the first acquiring unit 601, where the dry sound includes fundamental frequency data of a song sung by a user; the apparatus 60 then acquires timbre data of the dry sound through the second acquiring unit 602; next, the apparatus 60 determines, through the determining unit 603, at least one sound effect scheme according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data, where the sound effect scheme is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, so as to generate sound-effect-processed audio; the apparatus 60 then outputs the target sound effect scheme through the output unit 604; and finally, the apparatus 60 generates, through the generating unit 605, target audio according to the acquired target sound effect scheme, where the target sound effect scheme is one of the at least one sound effect scheme. With this embodiment of this application, sound effect processing can be performed on the dry sound and the accompaniment of the song associated with the dry sound by using the acquired target sound effect scheme, so that the generated sound-effect-processed audio is more pleasant to hear.
It should be understood that the apparatus 60 is merely an example provided by this embodiment of this application, and the apparatus 60 may have more or fewer components than those shown, may combine two or more components, or may be implemented with a different configuration of components.
It can be understood that, for specific implementations of the functional blocks included in the apparatus 60 of FIG. 6, reference may be made to the embodiments described in FIG. 1 and FIG. 5, and details are not repeated here.
FIG. 7 is a schematic structural diagram of a processing device provided by this application. In this embodiment of this application, the device may include a mobile phone, a tablet computer, a personal digital assistant (Personal Digital Assistant, PDA), a mobile Internet device (Mobile Internet Device, MID), a smart wearable device (such as a smart watch or a smart band), or other devices, which is not limited in this embodiment of this application. As shown in FIG. 7, the device 70 may include a baseband chip 701, a memory 702 (one or more computer-readable storage media), and a peripheral system 703. These components may communicate over one or more communication buses 704.
The baseband chip 701 may include one or more processors (CPU) 705.
The processor 705 may be specifically configured to:
acquire a dry sound, where the dry sound includes fundamental frequency data of a song sung by a user;
acquire timbre data of the dry sound;
determine at least one sound effect scheme according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data, where the sound effect scheme is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, so as to generate sound-effect-processed audio; and
generate target audio according to the acquired target sound effect scheme, where the target sound effect scheme is one of the at least one sound effect scheme.
The memory 702 is coupled to the processor 705 and may be configured to store various software programs and/or multiple sets of instructions. In a specific implementation, the memory 702 may include a high-speed random access memory, and may also include a non-volatile memory, for example, one or more disk storage devices, flash memory devices, or other non-volatile solid-state storage devices. The memory 702 may store an operating system (hereinafter referred to as the system), for example, an embedded operating system such as ANDROID, IOS, WINDOWS, or LINUX. The memory 702 may also store a network communication program, which may be used to communicate with one or more additional devices and one or more network devices. The memory 702 may also store a user interface program, which may vividly display the content of an application program through a graphical operation interface and receive user control operations on the application program through input controls such as menus, dialog boxes, and keys.
It can be understood that the memory 702 may be used to store implementation code for implementing the processing method.
The memory 702 may also store one or more application programs. These application programs may include a karaoke program, social applications (for example, Facebook), image management applications (for example, a photo album), map applications (for example, Google Maps), browsers (for example, Safari or Google Chrome), and so on.
The peripheral system 703 is mainly configured to implement interaction between the device 70 and the user/external environment, and mainly includes the input and output apparatuses of the device 70. In a specific implementation, the peripheral system 703 may include a display controller 707, a camera controller 708, and an audio controller 709, where each controller may be coupled to its corresponding peripheral device (such as the display screen 710, the camera 711, and the audio circuit 712). In some embodiments, the display screen may be a display screen configured with a self-capacitive floating touch panel, or a display screen configured with an infrared floating touch panel. In some embodiments, the camera 711 may be a 3D camera. It should be noted that the peripheral system 703 may also include other I/O peripherals.
In summary, in this embodiment of this application, the device 70 may acquire a dry sound through the processor 705, where the dry sound includes fundamental frequency data of a song sung by a user; the device 70 may then acquire timbre data of the dry sound through the processor 705; next, the device 70 may determine, through the processor 705, at least one sound effect scheme according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data, where the sound effect scheme is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, so as to generate sound-effect-processed audio; the device 70 may then output the target sound effect scheme through the peripheral system 703; and finally, the device 70 may generate, through the processor 705, target audio according to the acquired target sound effect scheme, where the target sound effect scheme is one of the at least one sound effect scheme. With this embodiment of this application, sound effect processing can be performed on the dry sound and the accompaniment of the song associated with the dry sound by using the acquired target sound effect scheme, so that the generated sound-effect-processed audio is more pleasant to hear.
It should be understood that the device 70 is merely an example provided by this embodiment of this application, and the device 70 may have more or fewer components than those shown, may combine two or more components, or may be implemented with a different configuration of components.
It can be understood that, for specific implementations of the functional modules included in the device 70 of FIG. 7, reference may be made to the embodiments of FIG. 1 and FIG. 5, and details are not repeated here.
This application provides a computer-readable storage medium that stores a computer program, and the computer program, when executed by a processor, implements the foregoing processing method.
The computer-readable storage medium may be an internal storage unit of the device described in any of the foregoing embodiments, for example, a hard disk or a memory of the device. The computer-readable storage medium may also be an external storage device of the device, for example, a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, or a flash card (Flash Card) equipped on the device. Further, the computer-readable storage medium may include both an internal storage unit of the device and an external storage device. The computer-readable storage medium is configured to store the computer program and other programs and data required by the device, and may also be configured to temporarily store data that has been output or is to be output.
This application also provides a computer program product. The computer program product includes a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to perform some or all of the steps of any method described in the foregoing method embodiments. The computer program product may be a software installation package, and the computer includes an electronic apparatus.
A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed by hardware or software depends on the specific application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the devices and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may be implemented in other ways.
The device embodiments described above are merely illustrative. For example, the division of the units is merely a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual couplings, direct couplings, or communication connections may be indirect couplings or communication connections through some interfaces, devices, or units, and may also be electrical, mechanical, or other forms of connection.
The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of this application.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of this application essentially, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed in this application, and such modifications or replacements shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (12)

  1. A processing method, characterized by comprising:
    acquiring a dry sound, where the dry sound includes fundamental frequency data of a song sung by a user;
    acquiring timbre data of the dry sound, where the timbre data is acquired through a preset training model;
    determining at least one sound effect scheme according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data, where the sound effect scheme is used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound;
    outputting the at least one sound effect scheme; and
    generating target audio according to an acquired target sound effect scheme, where the target sound effect scheme is one of the at least one sound effect scheme.
  2. The method according to claim 1, characterized in that:
    the at least one sound effect scheme comprises one sound effect scheme or multiple sound effect schemes; and
    after the outputting the at least one sound effect scheme and before the generating target audio according to the acquired target sound effect scheme, the method further comprises:
    receiving a target instruction, where the target instruction is used to indicate the target sound effect scheme; and
    acquiring the target sound effect scheme in response to the received target instruction.
  3. The method according to claim 1, characterized in that, before the acquiring timbre data of the dry sound, the method further comprises:
    preprocessing the acquired dry sound to obtain first preprocessed data; and
    performing feature extraction on the first preprocessed data to extract a first feature vector, inputting the first feature vector into the preset training model, and comparing, through the preset training model, the distribution and intensity of overtones in the first feature vector with a reference result of the acquired dry sound to obtain the timbre data of the dry sound, where the preset training model is a trained training model.
  4. The method according to claim 3, characterized in that, before the performing feature extraction on the first preprocessed data to extract the first feature vector, inputting the first feature vector into the preset training model, and comparing, through the preset training model, the distribution and intensity of overtones in the first feature vector with the reference result of the acquired dry sound to obtain the timbre data of the dry sound, the method further comprises:
    separately performing feature extraction on multiple labeled dry sound samples to extract second feature vectors, and separately inputting the second feature vectors into a training model to be trained to obtain the preset training model, where the second feature vectors are used to train the training model to be trained.
  5. The method according to claim 1, characterized in that, before the determining at least one sound effect scheme according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data, the method further comprises:
    determining, from the accompaniment of the song associated with the dry sound, the accompaniment identification number of the accompaniment;
    determining, by using the accompaniment identification number, the song associated with the dry sound from a first database comprising multiple songs; and
    determining the singing speed of the song according to the determined song, where the accompaniment identification number of the song is associated with the song.
  6. The method according to claim 1, characterized in that the singing speed of the song associated with the dry sound is specifically:
    the number of beats per minute of the acquired song associated with the dry sound;
    or
    the number of syllables per minute of the acquired song associated with the dry sound.
  7. The method according to claim 1, characterized in that the generating target audio according to the acquired target sound effect scheme comprises:
    jointly performing, by using the equalization parameter value, the compression parameter value, and the reverberation parameter value in the acquired target sound effect scheme, sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, to generate the target audio.
  8. The method according to claim 7, wherein the jointly performing sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound by using the equalization parameter values, the compression parameter values, and the reverberation parameter values in the acquired target sound effect scheme, to generate the target audio, comprises:
    adjusting the degree of sound-quality improvement of the dry sound and of the accompaniment of the song associated with the dry sound by using the equalization parameter values in the target sound effect scheme, adjusting the degree of dynamic repair of the dry sound and of the accompaniment by using the compression parameter values in the target sound effect scheme, and respectively adjusting, by using the reverberation parameter values in the target sound effect scheme, the sound-quality improvement, the creation of spatial layering, and the degree of detail masking of the dry sound and of the accompaniment, to generate the target audio.
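The EQ → compression → reverb chain of claims 7 and 8 can be sketched as toy sample-level DSP. This is not the patented implementation: the scheme's key names, the single-band gain standing in for equalization, the hard-knee compressor, and the single-echo reverb are all simplifying assumptions for illustration.

```python
def apply_compression(signal, threshold, ratio):
    """Dynamic repair: attenuate sample magnitudes above a threshold."""
    out = []
    for s in signal:
        mag = abs(s)
        if mag > threshold:
            mag = threshold + (mag - threshold) / ratio
        out.append(mag if s >= 0 else -mag)
    return out


def apply_reverb(signal, wet, delay):
    """Spatial layering / detail masking: mix in one delayed echo."""
    out = list(signal)
    for i in range(delay, len(signal)):
        out[i] += wet * signal[i - delay]
    return out


def render_target_audio(dry, accompaniment, scheme):
    """Jointly process dry vocal and accompaniment with one scheme."""
    mixed = [d + a for d, a in zip(dry, accompaniment)]
    mixed = [s * scheme["eq_gain"] for s in mixed]  # single-band "EQ"
    mixed = apply_compression(mixed, scheme["threshold"], scheme["ratio"])
    return apply_reverb(mixed, scheme["reverb_wet"], scheme["reverb_delay"])
```

Note that the scheme is applied to the summed mix, matching the claim's "joint" processing of dry sound and accompaniment rather than per-track processing.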
  9. The method according to claim 3, wherein the preprocessing the acquired dry sound comprises:
    performing noise reduction and/or tone repair on the acquired dry sound.
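Minimal sketches of the two preprocessing options in claim 9, under loudly stated assumptions: noise reduction is reduced to a threshold gate, and tone repair is reduced to snapping an estimated pitch to the nearest equal-tempered semitone; real systems use far more sophisticated spectral methods.

```python
import math


def noise_reduce(signal, noise_floor):
    """Noise reduction as a simple gate: zero samples below the floor."""
    return [s if abs(s) >= noise_floor else 0.0 for s in signal]


def snap_to_semitone(f0_hz, a4=440.0):
    """Tone repair sketch: snap a pitch to the nearest 12-TET semitone."""
    if f0_hz <= 0:
        return f0_hz
    n = round(12 * math.log2(f0_hz / a4))
    return a4 * 2 ** (n / 12)
```

A slightly sharp 445 Hz note, for example, snaps back to A4 at 440 Hz.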
  10. A processing apparatus, comprising:
    a first acquiring unit, configured to acquire a dry sound, the dry sound comprising fundamental frequency data of a song sung by a user;
    a second acquiring unit, configured to acquire timbre data of the dry sound;
    a determining unit, configured to determine at least one sound effect scheme according to the acquired timbre data of the dry sound, the singing speed of the song associated with the dry sound, and the fundamental frequency data, the sound effect scheme being used to perform sound effect processing on the dry sound and the accompaniment of the song associated with the dry sound, to generate audio after sound effect processing;
    an output unit, configured to output the at least one sound effect scheme; and
    a generating unit, configured to generate target audio according to an acquired target sound effect scheme, the target sound effect scheme being one of the at least one sound effect scheme.
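The five units of claim 10 can be modeled structurally as methods of one class. This is a shape sketch only: every method body below is a stub with hypothetical logic, and none of the names or heuristics (such as the 110-speed cutoff) come from the application.

```python
class ProcessingApparatus:
    """Units of claim 10 modeled as methods; concrete logic is stubbed."""

    def first_acquiring_unit(self, recording):
        # Acquire the dry sound; it carries the user's fundamental
        # frequency data (stubbed here as a constant 110 Hz track).
        return {"samples": list(recording), "f0": [110.0] * len(recording)}

    def second_acquiring_unit(self, dry_sound):
        # Acquire timbre data of the dry sound (stub feature).
        return {"brightness": max(dry_sound["samples"], default=0.0)}

    def determining_unit(self, timbre, singing_speed, f0):
        # Determine at least one sound effect scheme from timbre,
        # singing speed, and fundamental frequency data.
        name = "fast" if singing_speed > 110 else "slow"
        return [{"name": name, "reverb_wet": 0.2}]

    def output_unit(self, schemes):
        # Output the candidate schemes (e.g. for user selection).
        return schemes

    def generating_unit(self, dry_sound, target_scheme):
        # Generate target audio from the selected target scheme.
        return {"audio": dry_sound["samples"], "scheme": target_scheme["name"]}
```

Chaining the units end to end mirrors the method of claim 1: acquire, analyze, propose schemes, let one be selected, then render.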
  11. A processing device, comprising: an input device, an output device, a memory, and a processor coupled to the memory, the input device, the output device, the processor, and the memory being connected to one another, wherein the memory is configured to store application program code, and the processor is configured to call the program code to execute the processing method according to any one of claims 1-9.
  12. A computer-readable storage medium, wherein the computer storage medium stores a computer program, the computer program comprising program instructions that, when executed by a processor, cause the processor to execute the processing method according to any one of claims 1-9.
PCT/CN2019/083454 2019-03-01 2019-04-19 Processing method, apparatus and device WO2020177190A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910158854.5 2019-03-01
CN201910158854.5A CN109785820B (en) 2019-03-01 2019-03-01 Processing method, device and equipment

Publications (1)

Publication Number Publication Date
WO2020177190A1 true WO2020177190A1 (en) 2020-09-10

Family

ID=66486097

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/083454 WO2020177190A1 (en) 2019-03-01 2019-04-19 Processing method, apparatus and device

Country Status (2)

Country Link
CN (1) CN109785820B (en)
WO (1) WO2020177190A1 (en)


Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110706679B (en) * 2019-09-30 2022-03-29 维沃移动通信有限公司 Audio processing method and electronic equipment
CN111061909B (en) * 2019-11-22 2023-11-28 腾讯音乐娱乐科技(深圳)有限公司 Accompaniment classification method and accompaniment classification device
CN111326132B (en) * 2020-01-22 2021-10-22 北京达佳互联信息技术有限公司 Audio processing method and device, storage medium and electronic equipment
CN112164387A (en) * 2020-09-22 2021-01-01 腾讯音乐娱乐科技(深圳)有限公司 Audio synthesis method and device, electronic equipment and computer-readable storage medium
CN112289300B (en) * 2020-10-28 2024-01-09 腾讯音乐娱乐科技(深圳)有限公司 Audio processing method and device, electronic equipment and computer readable storage medium
CN113112998B (en) * 2021-05-11 2024-03-15 腾讯音乐娱乐科技(深圳)有限公司 Model training method, reverberation effect reproduction method, device, and readable storage medium
CN113707113B (en) * 2021-08-24 2024-02-23 北京达佳互联信息技术有限公司 User singing voice repairing method and device and electronic equipment
CN114666706B (en) * 2021-11-30 2024-05-14 北京达佳互联信息技术有限公司 Sound effect enhancement method, device and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9224375B1 (en) * 2012-10-19 2015-12-29 The Tc Group A/S Musical modification effects
CN105208189A (en) * 2014-12-10 2015-12-30 维沃移动通信有限公司 Audio processing method and mobile terminal
CN107978321A (en) * 2017-11-29 2018-05-01 广州酷狗计算机科技有限公司 Audio-frequency processing method and device
CN108305603A (en) * 2017-10-20 2018-07-20 腾讯科技(深圳)有限公司 Sound effect treatment method and its equipment, storage medium, server, sound terminal
CN108922506A (en) * 2018-06-29 2018-11-30 广州酷狗计算机科技有限公司 Song audio generation method, device and computer readable storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6371283B2 (en) * 2012-08-07 2018-08-08 スミュール,インク.Smule,Inc. Social music system and method using continuous real-time pitch correction and dry vocal capture of vocal performances for subsequent replay based on selectively applicable vocal effect schedule (s)
CN107203571B (en) * 2016-03-18 2019-08-06 腾讯科技(深圳)有限公司 Song lyric information processing method and device
CN106024005B (en) * 2016-07-01 2018-09-25 腾讯科技(深圳)有限公司 A kind of processing method and processing device of audio data
US10062367B1 (en) * 2017-07-14 2018-08-28 Music Tribe Global Brands Ltd. Vocal effects control system


Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112331222A (en) * 2020-09-23 2021-02-05 北京捷通华声科技股份有限公司 Method, system, equipment and storage medium for converting song tone
CN112365868A (en) * 2020-11-17 2021-02-12 北京达佳互联信息技术有限公司 Sound processing method, sound processing device, electronic equipment and storage medium
CN112365868B (en) * 2020-11-17 2024-05-28 北京达佳互联信息技术有限公司 Sound processing method, device, electronic equipment and storage medium
CN112420015A (en) * 2020-11-18 2021-02-26 腾讯音乐娱乐科技(深圳)有限公司 Audio synthesis method, device, equipment and computer readable storage medium
CN113192486B (en) * 2021-04-27 2024-01-09 腾讯音乐娱乐科技(深圳)有限公司 Chorus audio processing method, chorus audio processing equipment and storage medium
CN113192486A (en) * 2021-04-27 2021-07-30 腾讯音乐娱乐科技(深圳)有限公司 Method, equipment and storage medium for processing chorus audio
CN113744721A (en) * 2021-09-07 2021-12-03 腾讯音乐娱乐科技(深圳)有限公司 Model training method, audio processing method, device and readable storage medium
CN113744708A (en) * 2021-09-07 2021-12-03 腾讯音乐娱乐科技(深圳)有限公司 Model training method, audio evaluation method, device and readable storage medium
CN113744708B (en) * 2021-09-07 2024-05-14 腾讯音乐娱乐科技(深圳)有限公司 Model training method, audio evaluation method, device and readable storage medium
CN113744721B (en) * 2021-09-07 2024-05-14 腾讯音乐娱乐科技(深圳)有限公司 Model training method, audio processing method, device and readable storage medium
CN114566191A (en) * 2022-02-25 2022-05-31 腾讯音乐娱乐科技(深圳)有限公司 Sound correcting method for recording and related device
CN115240709A (en) * 2022-07-25 2022-10-25 镁佳(北京)科技有限公司 Sound field analysis method and device for audio file
CN115240709B (en) * 2022-07-25 2023-09-19 镁佳(北京)科技有限公司 Sound field analysis method and device for audio file
WO2024066790A1 (en) * 2022-09-26 2024-04-04 抖音视界有限公司 Audio processing method and apparatus, and electronic device

Also Published As

Publication number Publication date
CN109785820B (en) 2022-12-27
CN109785820A (en) 2019-05-21

Similar Documents

Publication Publication Date Title
WO2020177190A1 (en) Processing method, apparatus and device
CN110555126B (en) Automatic generation of melodies
US8239201B2 (en) System and method for audibly presenting selected text
WO2021004481A1 (en) Media files recommending method and device
CN112309365B (en) Training method and device of speech synthesis model, storage medium and electronic equipment
US9749582B2 (en) Display apparatus and method for performing videotelephony using the same
CN111402842A (en) Method, apparatus, device and medium for generating audio
CN111798821B (en) Sound conversion method, device, readable storage medium and electronic equipment
US11511200B2 (en) Game playing method and system based on a multimedia file
CN110675886A (en) Audio signal processing method, audio signal processing device, electronic equipment and storage medium
US11538476B2 (en) Terminal device, server and controlling method thereof
WO2021169365A1 (en) Voiceprint recognition method and device
WO2022089097A1 (en) Audio processing method and apparatus, electronic device, and computer-readable storage medium
CN111653265A (en) Speech synthesis method, speech synthesis device, storage medium and electronic equipment
WO2020228226A1 (en) Instrumental music detection method and apparatus, and storage medium
CN109410972B (en) Method, device and storage medium for generating sound effect parameters
CN114363531B (en) H5-based text description video generation method, device, equipment and medium
WO2020154916A1 (en) Video subtitle synthesis method and apparatus, storage medium, and electronic device
TWI486949B (en) Music emotion classification method
JP7230085B2 (en) Method and device, electronic device, storage medium and computer program for processing sound
CN112786025A (en) Method for determining lyric timestamp information and training method of acoustic model
CN106649643B (en) A kind of audio data processing method and its device
WO2024075422A1 (en) Musical composition creation method and program
WO2023236054A1 (en) Audio generation method and apparatus, and storage medium
JP2023144076A (en) Program, information processing method and information processing device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19918434

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 24.01.2022)

122 Ep: pct application non-entry in european phase

Ref document number: 19918434

Country of ref document: EP

Kind code of ref document: A1