WO2014142200A1 - Speech processing device (Dispositif de traitement vocal) - Google Patents
Speech processing device
- Publication number: WO2014142200A1 (PCT application PCT/JP2014/056570)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- singing
- expression
- data
- song
- voice
- Prior art date
Classifications
- G10L 13/00: Speech synthesis; text-to-speech systems
- G10L 13/02: Methods for producing synthetic speech; speech synthesisers
- G10H 1/00: Details of electrophonic musical instruments
- G10H 1/0091: Means for obtaining special acoustic effects
- G10H 2210/155: Musical effects
- G10H 2210/195: Modulation effects, i.e. smooth non-discontinuous variations over a time interval (e.g. within a note, melody or musical transition) of any sound parameter, e.g. amplitude, pitch, spectral response or playback speed
- G10H 2210/201: Vibrato, i.e. rapid, repetitive and smooth variation of amplitude, pitch or timbre within a note or chord
- G10H 2250/315: Sound category-dependent sound synthesis processes (Gensound) for musical use; sound category-specific synthesis-controlling parameters or control means therefor
- G10H 2250/455: Gensound singing voices, i.e. generation of human voices for musical applications, vocal singing sounds or intelligible words at a desired pitch or with desired vocal effects, e.g. by phoneme synthesis
Definitions
- The present invention relates to a technique for controlling the singing expression of a singing voice.
- Patent Document 1 discloses a technique for collecting segment data used in concatenative (segment-connected) singing synthesis. A singing voice for arbitrary lyrics can be synthesized by appropriately selecting and connecting segment data collected with the technique of Patent Document 1.
- An object of the present invention is to generate singing voices with a variety of singing expressions.
- To achieve this object, the speech processing apparatus of the present invention includes an expression selection unit that selects, from a plurality of singing expression data indicating different singing expressions, the singing expression data to be applied, and an expression providing unit that imparts the singing expression indicated by the singing expression data selected by the expression selection unit to a specific section of a singing voice.
- In a preferred aspect, the expression selection unit selects first singing expression data and second singing expression data indicating different singing expressions, and the expression providing unit imparts the singing expression indicated by the first singing expression data to a first section of the singing voice and the singing expression indicated by the second singing expression data to a second section of the singing voice that differs from the first section.
- In another aspect, the expression selection unit selects two or more singing expression data indicating different singing expressions, and the expression providing unit may impart the singing expressions indicated by the selected two or more singing expression data to a specific section of the singing voice in an overlapping manner.
- Since a plurality of singing expressions (typically singing expressions of different types) can be imparted to a single section, the effect of generating singing voices with various singing expressions is particularly pronounced.
- In a preferred aspect, the storage unit stores, in association with each singing expression data, attribute data related to that singing expression, and the expression selection unit may select singing expression data from the storage unit by referring to the attribute data of each singing expression data.
- Because attribute data is associated with each singing expression data, the singing expression data to be imparted to the singing voice can be selected (searched for) by referring to the attribute data.
- The expression selection unit may select singing expression data in accordance with an instruction from the user. Since singing expression data is selected according to the user's instruction, singing voices that reflect the user's intention can be generated.
- The expression providing unit may impart the singing expression indicated by the singing expression data selected by the expression selection unit to a specific section designated according to an instruction from the user.
- Because the singing expression is imparted to a section of the singing voice designated by the user, there is the advantage that a variety of singing voices reflecting the user's intention and preferences can be generated.
- In conventional singing evaluation, a singing voice is evaluated by comparing the transitions of its pitch and volume with those of a standard (exemplary) singing voice prepared in advance.
- However, the evaluation of an actual singing depends not only on the accuracy of pitch and volume but also on the skill of the singing expression.
- In view of this, the speech processing apparatus of the present invention may comprise a singing evaluation unit that evaluates the singing voice according to an evaluation value that indicates the evaluation of a singing expression and corresponds to, among the plurality of singing expression data, the singing expression data whose singing expression is similar to the singing voice.
- Because the singing voice is evaluated according to the evaluation value corresponding to the singing expression data of a singing expression similar to the singing voice, there is the advantage that the singing voice can be evaluated appropriately from the viewpoint of the skill of its singing expression.
- In a preferred aspect, the singing evaluation unit selects, for each of a plurality of target sections of the singing voice, singing expression data whose singing expression is similar to that of the target section, and may evaluate the singing voice according to the evaluation values corresponding to the selected singing expression data.
- With this configuration, specific target sections of the singing voice can be evaluated preferentially.
- Note that a target section can be the entire span of the audio signal (the entire piece of music).
- In another aspect, the speech processing apparatus includes a storage unit that stores, for a plurality of different singing expressions, singing expression data indicating the singing expression together with an evaluation value indicating its evaluation, and the singing evaluation unit may evaluate the singing voice according to the evaluation value stored in the storage unit that corresponds to the singing expression data whose singing expression is similar to the singing voice.
- Because the singing voice is evaluated according to the evaluation value of a similar stored singing expression, the singing can be evaluated from the viewpoint of whether or not it resembles the singing expressions registered in the storage unit.
- The present invention is also realized as a speech processing method that selects the singing expression data to be applied from a plurality of singing expression data indicating different singing expressions, and imparts the singing expression indicated by the selected singing expression data to a specific section of a singing voice.
- The speech processing apparatus is realized by hardware (an electronic circuit) such as a DSP (Digital Signal Processor) dedicated to singing voice processing, or by the cooperation of a general-purpose arithmetic processing device such as a CPU (Central Processing Unit) with a program.
- The program according to the first aspect of the present invention causes a computer to execute an expression selection process of selecting, from a plurality of singing expression data indicating different singing expressions, the singing expression data to be applied, and an expression providing process of imparting the singing expression indicated by the singing expression data selected in the expression selection process to a specific section of a singing voice.
- The program according to the second aspect of the present invention causes a computer comprising a storage unit that stores singing expression data and corresponding evaluation values to execute a singing evaluation process of evaluating the singing voice according to the evaluation value corresponding to, among the singing expression data, the singing expression data whose singing expression is similar to the singing voice.
- The program according to each of the above aspects can be provided in a form stored on a computer-readable recording medium and installed on a computer.
- The recording medium is, for example, a non-transitory recording medium; an optical recording medium (optical disc) such as a CD-ROM is a typical example, but any known type of recording medium, such as a semiconductor recording medium or a magnetic recording medium, can be included.
- The program of the present invention can also be provided in the form of distribution via a communication network and installed on a computer.
- FIG. 1 is a block diagram of a speech processing apparatus according to a first embodiment of the present invention.
- FIG. 2 is a functional block diagram of the elements related to the expression registration process.
- FIG. 3 is a block diagram of the singing division unit.
- FIG. 4 is a flowchart of the expression registration process.
- FIG. 5 is a functional block diagram of the elements related to the expression providing process.
- FIG. 6 is a flowchart of the expression providing process.
- FIG. 7 is an explanatory diagram of a specific example of the expression providing process (the imparting of vibrato).
- FIGS. 8 and 9 are explanatory diagrams of the expression providing process.
- FIG. 10 is a functional block diagram of the elements related to the singing evaluation process of a second embodiment.
- FIG. 11 is a flowchart of the singing evaluation process.
- FIG. 12 is a block diagram of a speech processing apparatus according to a modification.
- FIG. 1 is a block diagram of a speech processing apparatus 100 according to the first embodiment of the present invention.
- The speech processing apparatus 100 is realized by a computer system that includes an arithmetic processing device 10, a storage device 12, a sound collection device 14, an input device 16, and a sound emitting device 18.
- The arithmetic processing device 10 controls each element of the speech processing apparatus 100 by executing a program stored in the storage device 12.
- The storage device 12 stores the program executed by the arithmetic processing device 10 and the various data used by the arithmetic processing device 10.
- A known recording medium such as a semiconductor recording medium or a magnetic recording medium, or a combination of a plurality of types of recording media, is arbitrarily employed as the storage device 12.
- A configuration can also be adopted in which the storage device 12 is installed in an external device (for example, an external server device) separate from the speech processing apparatus 100, and the speech processing apparatus 100 writes and reads information to and from the storage device 12 via a communication network such as the Internet. That is, the storage device 12 is not an essential element of the speech processing apparatus 100.
- The storage device 12 of the first embodiment stores a plurality of audio signals X representing the time waveforms of different singing voices (for example, singing voices of different singers). Each audio signal X is prepared in advance by recording a singing voice singing a piece of music. The storage device 12 also stores a plurality of singing expression data DS indicating different singing expressions and a plurality of attribute data DA related to the singing expressions indicated by the respective singing expression data DS.
- A singing expression is a characteristic of the singing (a way of singing peculiar to the singer). Singing expression data DS is stored in the storage device 12 for a plurality of types of singing expressions extracted from singing voices produced by different singers, and attribute data DA is associated with each of the plurality of singing expression data DS.
- The singing expression data DS specifies various feature quantities related to the musical expression of the singing voice, for example the pitch or volume (or their distribution range), features of the frequency spectrum (for example, the spectrum within a specific band), the frequency or intensity of a formant of a specific order, intensity ratios such as the ratio of the harmonic components to the fundamental component or of the harmonic components to the non-harmonic components, or MFCC (Mel-Frequency Cepstrum Coefficients).
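- As a rough illustration of the kind of feature quantities listed above, the following minimal sketch extracts pitch, volume, and MFCC trajectories from a recording. It assumes the librosa library; the function name `extract_expression_features` is hypothetical and not part of the patent.

```python
import librosa

def extract_expression_features(path, sr=22050):
    """Extract pitch, volume, and MFCC trajectories that could serve
    as singing-expression feature quantities (illustrative only)."""
    y, sr = librosa.load(path, sr=sr)
    # Fundamental frequency trajectory (pitch) via probabilistic YIN
    f0, voiced_flag, voiced_prob = librosa.pyin(
        y, fmin=librosa.note_to_hz('C2'),
        fmax=librosa.note_to_hz('C6'), sr=sr)
    # Frame-wise volume as RMS energy
    rms = librosa.feature.rms(y=y)[0]
    # Timbre descriptors (MFCC), one of the features named above
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    return {'f0': f0, 'rms': rms, 'mfcc': mfcc}
```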
- The singing expressions exemplified above are tendencies of the singing voice over relatively short time spans, but a configuration is also suitable in which the singing expression data DS specifies long-term tendencies of the singing voice, such as how the pitch or volume changes over time, or various singing techniques (for example, vibrato, fall, and long tone).
- The attribute data DA of each singing expression is information (metadata) related to the singer and the piece of music, and is used for searching the singing expression data DS. Specifically, the attribute data DA specifies information on the singer of each singing expression (for example, name, age, birthplace, gender, race, native language, and vocal range) and information on the sung piece (for example, song title, composer, lyricist, genre, tempo, key, chords, range, and language). The attribute data DA can also specify words that express the impression or atmosphere of the singing voice (for example, words such as "rhythmical" and "sweet").
- The attribute data DA of the first embodiment includes an evaluation value Q (an index of the skill of the singing expression of the singing expression data DS) according to the result of evaluating the singing voice sung with each singing expression.
- For example, the attribute data DA can include an evaluation value Q calculated by a known singing evaluation process, or an evaluation value Q reflecting evaluations by users other than the singer.
- The items specified by the attribute data DA are not limited to the above examples.
- For example, the attribute data DA can specify in which section of the musical structure into which the piece is divided (for example, each phrase such as the A melody, chorus, or B melody) the singing expression was sung.
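- For concreteness, the attribute data DA could be modeled as a record like the following minimal sketch; every field name here is an assumption chosen to mirror the items listed above, not a schema defined by the patent.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class AttributeData:
    """Illustrative container for the attribute data DA."""
    singer_name: Optional[str] = None
    singer_age: Optional[int] = None
    gender: Optional[str] = None
    native_language: Optional[str] = None
    song_title: Optional[str] = None
    genre: Optional[str] = None
    tempo_bpm: Optional[float] = None
    section_label: Optional[str] = None           # e.g. "chorus", "A melody"
    impression_words: List[str] = field(default_factory=list)  # e.g. "sweet"
    evaluation_q: Optional[float] = None          # skill evaluation value Q
```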
- The sound collection device 14 is a device (microphone) that collects ambient sound.
- The sound collection device 14 of the first embodiment generates an audio signal R by collecting the singing voice of a singer singing a piece of music.
- An A/D converter that converts the audio signal R from analog to digital is omitted from the figure for convenience.
- A configuration in which the audio signal R is stored in the storage device 12 (in which case the sound collection device 14 can be omitted) is also suitable.
- The input device 16 is an operation device that receives instructions from the user to the speech processing apparatus 100 and includes, for example, a plurality of controls that can be operated by the user.
- For example, an operation panel installed on the casing of the speech processing apparatus 100, or a remote control device separate from the speech processing apparatus 100, is employed as the input device 16.
- The arithmetic processing device 10 executes various control processes and arithmetic processes by executing programs stored in the storage device 12. Specifically, the arithmetic processing device 10 executes a process of extracting singing expression data DS by analyzing the audio signal R supplied from the sound collection device 14 and storing it in the storage device 12 (hereinafter the "expression registration process"), and a process of generating an audio signal Y by imparting the singing expression indicated by each singing expression data DS stored in the storage device 12 during the expression registration process to an audio signal X in the storage device 12 (hereinafter the "expression providing process").
- The audio signal Y is an acoustic signal in which the singing expression of the audio signal X is made to match or resemble the singing expression of the singing expression data DS while the pronunciation content (lyrics) of the audio signal X is maintained.
- One of the expression registration process and the expression providing process is selectively executed in accordance with an instruction from the user via the input device 16.
- The sound emitting device 18 in FIG. 1 (for example, a speaker or headphones) reproduces the sound corresponding to the audio signal Y generated by the arithmetic processing device 10 in the expression providing process.
- A D/A converter that converts the audio signal Y from digital to analog and an amplifier that amplifies the audio signal Y are omitted from the figure for convenience.
- FIG. 2 is a functional configuration diagram of the elements related to the expression registration process in the speech processing apparatus 100.
- By executing a program (the expression registration program) stored in the storage device 12, the arithmetic processing device 10 functions as a plurality of elements for realizing the expression registration process as shown in FIG. 2 (an analysis processing unit 20, a singing division unit 22, a singing evaluation unit 24, a singing analysis unit 26, and an attribute acquisition unit 28).
- A configuration in which the functions of FIG. 2 are distributed over a plurality of integrated circuits, or in which a dedicated electronic circuit (for example, a DSP) realizes some of the functions illustrated in FIG. 2, may also be employed.
- The analysis processing unit 20 in FIG. 2 analyzes the audio signal R supplied from the sound collection device 14.
- The analysis processing unit 20 includes a music structure analysis unit 20A, a singing technique analysis unit 20B, and a voice quality analysis unit 20C.
- The music structure analysis unit 20A analyzes the sections of the musical structure (for example, each phrase such as the A melody, chorus, and B melody) of the piece corresponding to the audio signal R.
- The singing technique analysis unit 20B detects singing techniques such as vibrato (finely oscillating the pitch), scoop (approaching the target pitch from a lower pitch), and fall (reaching the target pitch from a pitch that exceeds it).
- The voice quality analysis unit 20C analyzes the voice quality of the singing voice (for example, the intensity ratio of the harmonic components to the fundamental component and of the harmonic components to the non-harmonic components).
- The singing division unit 22 of the first embodiment defines each unit section of the audio signal R according to the musical structure, singing techniques, and voice quality. Specifically, the singing division unit 22 divides the audio signal R into unit sections delimited by the end points of the musical-structure sections analyzed by the music structure analysis unit 20A, the end points of the sections in which the singing technique analysis unit 20B detected the various singing techniques, and the time points at which the voice quality analyzed by the voice quality analysis unit 20C fluctuates. Note that the method of dividing the audio signal R into unit sections is not limited to these examples.
- For example, the audio signal R can be divided using sections specified by the user through operation of the input device 16 as the unit sections. A configuration in which the audio signal R is divided at points set randomly on the time axis, or according to the evaluation value Q calculated by the singing evaluation unit 24 (for example, a configuration in which each unit section is delimited by the time points at which the evaluation value Q fluctuates), may also be adopted.
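- One way to read the division step above: gather the candidate boundary times produced by the three analyses and cut the signal at their union. A minimal sketch under that assumption (the function name and the representation of boundaries as times in seconds are hypothetical):

```python
def split_into_unit_sections(duration, structure_bounds,
                             technique_bounds, voice_quality_bounds):
    """Merge boundary candidates from the three analyzers into
    sorted (start, end) unit sections (illustrative sketch)."""
    bounds = sorted({0.0, duration,
                     *structure_bounds, *technique_bounds,
                     *voice_quality_bounds})
    return list(zip(bounds[:-1], bounds[1:]))

# Example: a structure boundary at 4.0 s, a detected technique ending
# at 2.5 s, and a voice-quality change at 7.0 s in a 10-second signal:
# split_into_unit_sections(10.0, [4.0], [2.5], [7.0])
# -> [(0.0, 2.5), (2.5, 4.0), (4.0, 7.0), (7.0, 10.0)]
```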
- The singing evaluation unit 24 evaluates the skill of the singing represented by the audio signal R supplied from the sound collection device 14. Specifically, the singing evaluation unit 24 sequentially calculates an evaluation value Q rating the skill of the singing of the audio signal R for each unit section defined by the singing division unit 22. A known singing evaluation process is arbitrarily employed for calculating the evaluation value Q. Note that the singing techniques analyzed by the singing technique analysis unit 20B and the voice quality analyzed by the voice quality analysis unit 20C can also be used in the evaluation by the singing evaluation unit 24.
- The singing analysis unit 26 in FIG. 2 generates singing expression data DS for each unit section by analyzing the audio signal R. Specifically, the singing analysis unit 26 extracts acoustic feature quantities that affect the singing expression, such as pitch and volume, from the audio signal R, and generates singing expression data DS indicating the short-term or long-term tendencies of each feature quantity (that is, the singing expression).
- A known acoustic analysis technique (for example, the techniques disclosed in Japanese Patent Application Laid-Open No. 2011-013454 and Japanese Patent Application Laid-Open No. 2011-028230) is arbitrarily employed for extracting the singing expression.
- In the above explanation, one singing expression data DS is generated for each unit section, but one singing expression data DS can also be generated from the feature quantities of a plurality of different unit sections.
- For example, a configuration in which singing expression data DS is generated by averaging the feature quantities of a plurality of unit sections whose attribute data DA match or are similar, or a configuration in which singing expression data DS is generated by the weighted addition of feature quantities over a plurality of unit sections using weight values corresponding to the evaluation value Q that the singing evaluation unit 24 calculated for each unit section, may be adopted (see the sketch below).
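- The weighted-addition variant could look like the following minimal sketch, where each unit section contributes a feature vector and its evaluation value Q acts as the weight (the function name and the fixed-length feature vectors are assumptions):

```python
import numpy as np

def merge_expression_features(section_features, q_values):
    """Combine per-unit-section feature vectors into a single singing
    expression by Q-weighted averaging (illustrative sketch)."""
    features = np.asarray(section_features, dtype=float)  # (sections, dims)
    weights = np.asarray(q_values, dtype=float)
    weights = weights / weights.sum()     # normalize Q values to weights
    return weights @ features             # weighted average over sections
```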
- The attribute acquisition unit 28 generates attribute data DA for each unit section defined by the singing division unit 22. Specifically, the attribute acquisition unit 28 registers in the attribute data DA the various items of information that the user specifies by operating the input device 16. In addition, the attribute acquisition unit 28 includes in the attribute data DA of each unit section the evaluation value Q calculated for that section by the singing evaluation unit 24 (for example, the average of the evaluation values within the unit section).
- The singing expression data DS generated by the singing analysis unit 26 for each unit section and the attribute data DA generated by the attribute acquisition unit 28 for the same unit section are associated with each other and stored in the storage device 12.
- By repeating the expression registration process exemplified above for the audio signals R of a plurality of different singing voices, singing expression data DS and attribute data DA are accumulated in the storage device 12 for each of the multiple types of singing expressions extracted from the singing voices uttered by the plurality of singers. That is, a database of diverse singing expressions (singing expressions of different singers and of different types) is constructed in the storage device 12.
- FIG. 4 is a flowchart of the expression registration process.
- The analysis processing unit 20 analyzes the audio signal R supplied from the sound collection device 14 (SA2).
- The singing division unit 22 divides the audio signal R into unit sections according to the analysis results of the analysis processing unit 20 (SA3), and the singing analysis unit 26 generates singing expression data DS for each unit section by analyzing the audio signal R (SA4).
- The singing evaluation unit 24 calculates, for each unit section, an evaluation value Q corresponding to the skill of the singing represented by the audio signal R (SA5), and the attribute acquisition unit 28 generates, for each unit section, attribute data DA that includes the evaluation value Q calculated by the singing evaluation unit 24 (SA6).
- The singing expression data DS generated by the singing analysis unit 26 and the attribute data DA generated by the attribute acquisition unit 28 are stored in the storage device 12 for each unit section (SA7).
- The singing expressions specified by the singing expression data DS accumulated in the storage device 12 by the expression registration process described above are imparted to the audio signal X by the expression providing process described below.
- FIG. 5 is a functional configuration diagram of the elements related to the expression providing process in the speech processing apparatus 100.
- By executing a program (the expression providing program) stored in the storage device 12, the arithmetic processing device 10 functions as a plurality of elements for realizing the expression providing process as shown in FIG. 5 (a singing selection unit 32, a section designation unit 34, an expression selection unit 36, and an expression providing unit 38).
- A configuration in which the functions of FIG. 5 are distributed over a plurality of integrated circuits, or in which a dedicated electronic circuit (for example, a DSP) executes some of the functions illustrated in FIG. 5, may also be employed.
- The singing selection unit 32 selects one of the plurality of audio signals X stored in the storage device 12 (hereinafter the "selected audio signal X"). For example, the singing selection unit 32 selects the selected audio signal X from the plurality of audio signals X in the storage device 12 in accordance with an instruction from the user via the input device 16 (an instruction selecting an audio signal X).
- The section designation unit 34 designates, within the selected audio signal X chosen by the singing selection unit 32, one or more sections to which the singing expression of the singing expression data DS is to be imparted (hereinafter "target sections").
- Specifically, the section designation unit 34 designates each target section in accordance with an instruction from the user via the input device 16.
- For example, the section designation unit 34 defines as a target section the interval between two points that the user specifies on the time axis (for example, on the waveform of the selected audio signal X) by operating the input device 16.
- A plurality of target sections designated by the section designation unit 34 may overlap each other on the time axis. It is also possible to designate the entire span of the selected audio signal X (the entire piece) as a target section.
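- Representing target sections as (start, end) pairs on the time axis makes the overlap rule easy to state; a minimal sketch with hypothetical names and a hypothetical song length:

```python
def sections_overlap(a, b):
    """True when two (start, end) target sections share time on the
    time axis (illustrative helper)."""
    return a[0] < b[1] and b[0] < a[1]

whole_piece = (0.0, 215.0)   # hypothetical: the entire piece of music
chorus = (60.0, 90.0)        # hypothetical user-designated section
assert sections_overlap(whole_piece, chorus)
```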
- The expression selection unit 36 shown in FIG. 5 selects the singing expression data DS actually applied in the expression providing process (hereinafter the "target expression data DS") from among the plurality of singing expression data DS stored in the storage device 12, sequentially for each of the target sections designated by the section designation unit 34.
- The expression selection unit 36 of the first embodiment selects the target expression data DS from the plurality of singing expression data DS through a search process that uses the attribute data DA stored in the storage device 12 in association with each singing expression data DS.
- Specifically, the user can designate a search condition (for example, a search term) for the target expression data DS of each target section by operating the input device 16 as appropriate.
- The expression selection unit 36 selects, as the target expression data DS of each target section, the singing expression data DS whose attribute data DA matches the search condition designated by the user from among the plurality of singing expression data DS in the storage device 12 (a minimal sketch of such a search follows this list).
- For example, when a search condition concerning the singer (for example, age or gender) is designated, target expression data DS corresponding to singer attribute data DA matching the condition (that is, the singing expression of a singer matching the condition) is selected; when a search condition concerning the piece is designated, target expression data DS corresponding to music attribute data DA matching the condition (that is, the singing expression of a piece matching the condition) is selected; and when a search condition concerning the evaluation value Q (for example, a numerical range) is designated, target expression data DS corresponding to attribute data DA whose evaluation value Q matches the condition (that is, the singing expression of a singer of the level intended by the user) is selected.
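- The search process might be sketched as a filter over (DS, DA) pairs, as below. The tie-breaking by highest evaluation value Q is an assumption added for illustration; the patent only requires that the attribute data match the search condition.

```python
def select_expression_data(database, condition):
    """Pick target expression data whose attribute data DA satisfies a
    search condition (illustrative; `database` is assumed to be a list
    of (ds, da) pairs, `da` an AttributeData-like record)."""
    matches = [(ds, da) for ds, da in database if condition(da)]
    # Assumed tie-break: prefer the highest evaluation value Q
    return max(matches, key=lambda p: p[1].evaluation_q or 0.0,
               default=None)

# e.g. a highly rated female singer's expression:
# select_expression_data(db, lambda da: da.gender == 'female'
#                                       and (da.evaluation_q or 0) >= 80)
```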
- The expression selection unit 36 of the first embodiment is thus expressed as an element that selects singing expression data DS (target expression data DS) in accordance with instructions from the user.
- The expression providing unit 38 imparts, to each of the plurality of target sections designated by the section designation unit 34 in the selected audio signal X, the singing expression of the target expression data DS that the expression selection unit 36 selected for that section. In other words, the singing expression corresponding to the user's instruction (the designated search condition) is imparted to each target section.
- A well-known technique is arbitrarily employed for imparting the singing expression of the target expression data DS to the selected audio signal X.
- In addition to a configuration that replaces the singing expression of the selected audio signal X with the singing expression of the target expression data DS (a configuration in which the singing expression of the selected audio signal X does not remain in the audio signal Y), a configuration in which the singing expression of the target expression data DS is imparted cumulatively to the singing expression of the selected audio signal X (for example, a configuration in which both singing expressions are reflected in the audio signal Y) may be employed, as in the sketch below.
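- The difference between the replacing and cumulative configurations can be sketched on a single feature contour (here pitch); the blending formula and the `mix` parameter are assumptions for illustration:

```python
import numpy as np

def apply_expression(pitch_x, pitch_ds, mode='replace', mix=0.5):
    """Impart an expression's pitch contour to a target section.
    'replace' keeps none of the original expression; 'cumulative'
    reflects both (illustrative sketch over equal-length contours)."""
    pitch_x = np.asarray(pitch_x, dtype=float)
    pitch_ds = np.asarray(pitch_ds, dtype=float)
    if mode == 'replace':
        return pitch_ds.copy()
    # cumulative: blend the original and the imported expression
    return (1.0 - mix) * pitch_x + mix * pitch_ds
```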
- FIG. 6 is a flowchart of the expression providing process.
- The singing selection unit 32 selects the selected audio signal X from the plurality of audio signals X stored in the storage device 12 (SB2), and the section designation unit 34 designates one or more target sections in the selected audio signal X (SB3).
- The expression selection unit 36 selects the target expression data DS from the plurality of singing expression data DS stored in the storage device 12 (SB4), and the expression providing unit 38 generates the audio signal Y by imparting the singing expression of the target expression data DS to each target section of the selected audio signal X chosen by the singing selection unit 32 (SB5).
- The audio signal Y generated by the expression providing unit 38 is reproduced by the sound emitting device 18 (SB6).
- FIG. 7 is an explanatory diagram of a specific example of the expression providing process in which singing expression data DS indicating vibrato is applied.
- FIG. 7 illustrates the time variation of the pitch of the selected audio signal X and of a plurality of singing expression data DS (DS[1] to DS[4]).
- Each singing expression data DS is generated by the expression registration process from an audio signal R containing the singing voice of a different singer. The vibrato represented by each singing expression data DS (DS[1] to DS[4]) therefore has different characteristics, such as the period (speed) and width (depth) of the pitch fluctuation.
- As shown in FIG. 7, a target section of the selected audio signal X is designated in accordance with an instruction from the user (SB3), and the target expression data DS is selected from the plurality of singing expression data DS, for example in accordance with an instruction from the user (SB4).
- The expression providing process then generates the audio signal Y in which the vibrato indicated by the target expression data DS[3] has been imparted to the target section of the selected audio signal X (SB5).
- That is, the vibrato of the desired singing expression data DS is imparted to the desired target section of an audio signal X of a singing voice sung without vibrato (for example, the singing voice of a singer who is not good at vibrato).
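- A minimal sketch of what imparting vibrato to a target section might amount to: superimposing a periodic pitch deviation whose rate and depth stand in for the period and width characteristics mentioned above (all names and default values are assumptions):

```python
import numpy as np

def add_vibrato(f0, frame_rate=100.0, rate_hz=5.5, depth_cents=60.0,
                start=0.0, end=None):
    """Superimpose sinusoidal vibrato on an F0 trajectory inside the
    target section [start, end) in seconds (illustrative sketch)."""
    f0 = np.asarray(f0, dtype=float).copy()
    t = np.arange(len(f0)) / frame_rate          # frame times in seconds
    end = t[-1] + 1.0 / frame_rate if end is None else end
    mask = (t >= start) & (t < end)
    cents = depth_cents * np.sin(2 * np.pi * rate_hz * t[mask])
    f0[mask] *= 2.0 ** (cents / 1200.0)          # cents -> frequency ratio
    return f0
```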
- The manner in which the user selects the target expression data DS from the plurality of singing expression data DS is arbitrary.
- For example, a configuration is preferable in which a predetermined singing voice to which the singing expression of each singing expression data DS has been imparted is reproduced by the sound emitting device 18 so that the user can audition it, and the user then selects the target expression data DS by operating the input device 16 (for example, a button or a touch panel).
- The expression selection unit 36 can select target expression data DS1 for a target section S1 of the selected audio signal X and target expression data DS2 for a target section S2 different from the target section S1.
- In this case, the expression providing unit 38 imparts the singing expression E1 indicated by the target expression data DS1 to the target section S1, and the singing expression E2 indicated by the target expression data DS2 to the target section S2.
- When the target section S1 and the target section S2 overlap (for example, when the target section S2 is included in the target section S1), both the singing expression E1 of the target expression data DS1 and the singing expression E2 of the target expression data DS2 are imparted to the overlapping portion of the selected audio signal X (that is, the target section S2). In other words, a plurality of (typically several types of) singing expressions are imparted to that specific section of the selected audio signal X in an overlapping manner. For example, both the singing expression E1 concerning pitch fluctuation and the singing expression E2 concerning volume fluctuation are imparted to the selected audio signal X (target section S2).
- The audio signal Y generated by the above processing is supplied to the sound emitting device 18 and reproduced as sound.
- As described above, in the first embodiment the singing expressions of a plurality of singing expression data DS indicating different singing expressions are selectively imparted to target sections of the selected audio signal X. It is therefore possible to generate singing voices (the audio signal Y) with more diverse singing expressions than with the technique of Patent Document 1.
- In particular, since the target section to which a singing expression is imparted is designated as a variable section of the selected audio signal X, the effect of generating singing voices with various singing expressions is especially pronounced.
- Moreover, since a plurality of (several types of) singing expressions can be imparted to a target section of the selected audio signal X in an overlapping manner (FIG. 9), the ability to generate singing voices with various singing expressions is particularly pronounced in comparison with a configuration in which the singing expression imparted to a target section is limited to one type.
- However, a configuration in which the target section to which a singing expression is imparted is limited to one section of the selected audio signal X, and a configuration in which the singing expression imparted to a target section is limited to one type, are also included within the scope of the present invention.
- In the first embodiment, the target section of the selected audio signal X is designated in accordance with an instruction from the user, and the search condition for the attribute data DA is likewise set in accordance with an instruction from the user.
- Second Embodiment: a second embodiment of the present invention is described below.
- In the first embodiment, the plurality of singing expression data DS stored in the storage device 12 is used for adjusting the singing expression of the audio signal X.
- In the second embodiment, the plurality of singing expression data DS stored in the storage device 12 is used for evaluating the audio signal X.
- For elements whose operation and function are the same as in the first embodiment, the reference numerals used in the description of the first embodiment are reused, and detailed description of each is omitted as appropriate.
- FIG. 10 is a functional configuration diagram of the elements related to the process of evaluating the audio signal X (hereinafter the "singing evaluation process") in the speech processing apparatus 100 of the second embodiment.
- The storage device 12 of the second embodiment stores a plurality of sets of singing expression data DS and attribute data DA generated by the same expression registration process as in the first embodiment.
- As described for the first embodiment, the attribute data DA corresponding to each singing expression data DS includes the evaluation value Q (the index of the skill of the singing expression of that singing expression data DS) calculated by the singing evaluation unit 24 of FIG. 2.
- By executing a program (the singing evaluation program) stored in the storage device 12, the arithmetic processing device 10 functions as a plurality of elements for realizing the singing evaluation process as shown in FIG. 10 (a singing selection unit 42, a section designation unit 44, and a singing evaluation unit 46).
- The expression providing process of the first embodiment and the singing evaluation process detailed below are executed selectively; the expression providing process can also be omitted.
- It is also possible to adopt a configuration in which the functions of FIG. 10 are distributed over a plurality of integrated circuits, or in which a dedicated electronic circuit (for example, a DSP) realizes some of the functions illustrated in FIG. 10.
- The singing selection unit 42 of FIG. 10 selects the selected audio signal X to be evaluated from among the plurality of audio signals X stored in the storage device 12.
- Specifically, the singing selection unit 42 selects the selected audio signal X from the storage device 12 in accordance with an instruction from the user via the input device 16, in the same manner as the singing selection unit 32 of the first embodiment.
- The section designation unit 44 designates one or more target sections to be evaluated within the selected audio signal X chosen by the singing selection unit 42.
- Specifically, the section designation unit 44 designates each target section in accordance with an instruction from the user via the input device 16, in the same manner as the section designation unit 34 of the first embodiment. It is also possible to designate the entire span of the selected audio signal X as a target section.
- The singing evaluation unit 46 of FIG. 10 evaluates the skill of the singing of the selected audio signal X chosen by the singing selection unit 42, using each singing expression data DS and each attribute data DA (evaluation value Q) stored in the storage device 12. That is, the singing evaluation unit 46 calculates an evaluation value Z for the selected audio signal X according to the evaluation values Q in the attribute data DA corresponding to the singing expression data DS, among the plurality of singing expression data DS in the storage device 12, whose singing expressions are similar to the respective target sections of the selected audio signal X.
- The specific operation of the singing evaluation unit 46 is described below.
- The singing evaluation unit 46 first calculates, for each target section, the similarity (a correlation or a distance) between the singing expression indicated by each of the plurality of singing expression data DS in the storage device 12 and the singing expression of the target section of the selected audio signal X, and sequentially selects, for each of the plurality of target sections of the selected audio signal X, the singing expression data DS with the greatest similarity to the singing expression of that target section.
- A known feature-comparison technique is arbitrarily employed for calculating the similarity of singing expressions.
- Second, the singing evaluation unit 46 calculates the evaluation value Z of the selected audio signal X by the weighted addition (or averaging), over the plurality of target sections, of the evaluation values Q in the attribute data DA corresponding to the singing expression data DS selected for each target section of the selected audio signal X.
- The evaluation value Z of the selected audio signal X is therefore larger the more the selected audio signal X contains target sections sung with singing expressions similar to singing expressions having high evaluation values Q.
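- Putting the two steps together, a minimal sketch of this evaluation could look as follows; the negative-distance similarity and the plain averaging are stand-ins for whichever correlation/distance measure and weighted addition are actually used:

```python
import numpy as np

def evaluate_singing(target_sections, database):
    """For each target section, pick the stored singing expression with
    the greatest similarity, then average the corresponding evaluation
    values Q into the overall score Z (illustrative sketch)."""
    def similarity(a, b):
        a, b = np.asarray(a, float), np.asarray(b, float)
        return -np.linalg.norm(a - b)            # negative distance

    q_per_section = []
    for features in target_sections:             # per-section features
        best_ds, best_da = max(
            database, key=lambda pair: similarity(features, pair[0]))
        q_per_section.append(best_da.evaluation_q or 0.0)
    return float(np.mean(q_per_section))         # evaluation value Z
```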
- The evaluation value Z calculated by the singing evaluation unit 46 is presented to the user, for example, as an image on a display device (not shown) or as sound reproduced by the sound emitting device 18.
- FIG. 11 is a flowchart of the singing evaluation process.
- When the singing evaluation process is started, the singing selection unit 42 selects the selected audio signal X from the plurality of audio signals X stored in the storage device 12 (SC2), and the section designation unit 44 designates one or more target sections in the selected audio signal X (SC3).
- The singing evaluation unit 46 calculates the evaluation value Z of the selected audio signal X using each singing expression data DS and each attribute data DA stored in the storage device 12 (SC4).
- The evaluation value Z calculated by the singing evaluation unit 46 is presented to the user (SC5).
- As described above, in the second embodiment the evaluation value Z of the selected audio signal X is calculated according to the evaluation values Q of the singing expression data DS whose singing expressions are similar to the selected audio signal X. It is therefore possible to evaluate the selected audio signal X appropriately from the viewpoint of the skill of the singing expression (its similarity to the singing expressions registered in the expression registration process).
- In the second embodiment, information other than the evaluation value Q in the attribute data DA can be omitted. That is, the storage device 12 need only store, for each of a plurality of different singing expressions, the singing expression data DS together with its evaluation value Q.
- The targets of the expression providing process of the first embodiment and of the singing evaluation process of the second embodiment are not limited to an audio signal X recorded in advance and stored in the storage device 12.
- For example, an audio signal X generated by the sound collection device 14, an audio signal X reproduced from a portable or built-in recording medium (for example, a CD), or an audio signal X received from another communication terminal via a communication network (for example, a streaming audio signal) can be used as the target of the expression providing process or the singing evaluation process.
- A configuration in which the expression providing process or the singing evaluation process is performed on a singing voice generated by a known synthesis process (for example, a concatenative, segment-connecting singing synthesis process) can also be employed.
- In the examples above, the expression providing process and the singing evaluation process are performed on a recorded audio signal X, but it is also possible to execute the expression providing process and the singing evaluation process in real time, in parallel with the supply of the audio signal X, for example if each target section on the time axis is designated in advance.
- In the examples above, one of the plurality of audio signals X is selected as the selected audio signal X, but the selection of the audio signal X (the singing selection unit 32 or the singing selection unit 42) may be omitted.
- Likewise, the section designation unit 34 can be omitted. The speech processing apparatus that executes the expression providing process is therefore comprehensively expressed as an apparatus comprising an expression selection unit 36 that selects the singing expression data DS to be applied from a plurality of singing expression data DS, and an expression providing unit 38 that imparts the singing expression indicated by the singing expression data DS selected by the expression selection unit 36 to a specific section of the singing voice (audio signal X).
- The target of the expression registration process is not limited to the audio signal R generated by the sound collection device 14.
- For example, an audio signal R reproduced from a portable or built-in recording medium, or an audio signal R received from another communication terminal via a communication network, can also be the target of the expression registration process. It is also possible to execute the expression registration process in real time, in parallel with the supply of the audio signal R.
- In each of the embodiments above, the expression providing process of the first embodiment and the singing evaluation process of the second embodiment were performed on an audio signal X of a singing voice, but the format in which the singing voice to be processed is expressed is arbitrary.
- For example, the singing voice can be expressed by synthesis information (for example, a file in the VSQ format) in which the pitch and the pronounced characters (lyrics) are specified in time series for each note.
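- Such note-wise synthesis information might look like the following hypothetical structure (the field names are illustrative, not the actual VSQ schema):

```python
# Hypothetical note-wise synthesis information in the spirit of a
# VSQ-style file: pitch, lyric syllable, and timing per note.
synthesis_info = [
    {"note": "C4", "lyric": "sa", "start": 0.0, "duration": 0.5},
    {"note": "D4", "lyric": "ku", "start": 0.5, "duration": 0.5},
    {"note": "E4", "lyric": "ra", "start": 1.0, "duration": 1.0},
]
```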
- In this case, the expression providing unit 38 of the first embodiment imparts the singing expression by the same expression providing process as in the first embodiment while sequentially synthesizing the singing voice specified by the synthesis information, for example by a concatenative (unit-connecting) speech synthesis process.
- Similarly, the singing evaluation unit 46 of the second embodiment performs the same singing evaluation process as in the second embodiment while synthesizing the singing voice specified by the synthesis information.
- In each embodiment above, one target expression data DS is selected for each target section, but it is also possible for the expression selection unit 36 to select a plurality of (typically several types of) target expression data DS for one target section.
- In that case, the singing expressions of the plurality of target expression data DS selected by the expression selection unit 36 are imparted to the one target section of the selected audio signal X in an overlapping manner.
- It is also possible to impart to the target section the singing expression of one singing expression data DS obtained by integrating the plurality of target expression data DS selected for that target section (for example, singing expression data DS obtained by the weighted addition of the plurality of target expression data DS).
- In each embodiment above, the singing expression data DS corresponding to the user's instruction is selected through the designation of a search condition, but the way the expression selection unit 36 selects the singing expression data DS is arbitrary.
- For example, it is also possible for the expression selection unit 36 to select the singing expression data DS that the user designates after auditioning, via the sound emitting device 18, the singing voices of the singing expressions indicated by the respective singing expression data DS.
- A configuration that selects singing expression data DS from the storage device 12 at random, or that selects each singing expression data DS according to a predetermined rule selected in advance, can also be employed.
- In each embodiment above, the audio signal Y generated by the expression providing unit 38 is supplied to the sound emitting device 18 and reproduced, but the method of outputting the audio signal Y is arbitrary.
- For example, a configuration in which the audio signal Y generated by the expression providing unit 38 is stored on a specific recording medium (for example, the storage device 12 or a portable recording medium), or a configuration in which the audio signal Y is transmitted from a communication device to another communication terminal, can also be adopted.
- In each embodiment above, a speech processing apparatus 100 that executes both the expression registration process and the expression providing process was illustrated, but the speech processing apparatus that executes the expression registration process and the speech processing apparatus that executes the expression providing process can also be configured separately.
- In that case, the plurality of singing expression data DS generated in the expression registration process of the registration apparatus are transferred to the expression providing apparatus and applied in the expression providing process.
- Similarly, the speech processing apparatus that executes the expression registration process and the speech processing apparatus that executes the singing evaluation process can be configured separately.
- The speech processing apparatus 100 can also be realized as a server device that communicates with a terminal device such as a mobile phone.
- For example, the speech processing apparatus 100 executes the expression registration process of extracting singing expression data DS by analyzing an audio signal R received from the terminal device and storing it in the storage device 12, or the expression providing process of transmitting to the terminal device an audio signal Y in which the singing expression indicated by the singing expression data DS has been imparted to the audio signal X.
- That is, the present invention can also be realized as a speech processing system comprising a speech processing apparatus (server device) and a terminal device that communicate with each other.
- The speech processing apparatus 100 of each embodiment described above can likewise be realized as a system (speech processing system) in which the functions are distributed over a plurality of apparatuses.
- In the second embodiment, the singing evaluation unit 46 evaluates the singing of the audio signal X using the singing expression data DS and the attribute data DA (evaluation value Q) stored in the storage device 12, but the singing evaluation unit 46 may also obtain the evaluation value Q from a device different from the storage device 12 and evaluate the skill of the singing of the audio signal X.
- DESCRIPTION OF SYMBOLS: 100: speech processing apparatus; 10: arithmetic processing device; 12: storage device; 14: sound collection device; 16: input device; 18: sound emitting device.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201480014605.4A CN105051811A (zh) | 2013-03-15 | 2014-03-12 | 声音处理装置 |
KR1020157024316A KR20150118974A (ko) | 2013-03-15 | 2014-03-12 | 음성 처리 장치 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2013-053983 | 2013-03-15 | ||
JP2013053983A JP2014178620A (ja) | 2013-03-15 | 2013-03-15 | 音声処理装置 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2014142200A1 (fr) | 2014-09-18 |
Family
ID=51536851
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2014/056570 WO2014142200A1 (fr) | 2013-03-15 | 2014-03-12 | Dispositif de traitement vocal |
Country Status (5)
Country | Link |
---|---|
JP (1) | JP2014178620A (fr) |
KR (1) | KR20150118974A (fr) |
CN (1) | CN105051811A (fr) |
TW (1) | TW201443874A (fr) |
WO (1) | WO2014142200A1 (fr) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6620462B2 (ja) * | 2015-08-21 | 2019-12-18 | ヤマハ株式会社 | 合成音声編集装置、合成音声編集方法およびプログラム |
KR102168529B1 (ko) * | 2020-05-29 | 2020-10-22 | 주식회사 수퍼톤 | 인공신경망을 이용한 가창음성 합성 방법 및 장치 |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2003255974A (ja) * | 2002-02-28 | 2003-09-10 | Yamaha Corp | 歌唱合成装置、歌唱合成方法及び歌唱合成用プログラム |
JP2004264676A (ja) * | 2003-03-03 | 2004-09-24 | Yamaha Corp | 歌唱合成装置、歌唱合成プログラム |
JP2006330615A (ja) * | 2005-05-30 | 2006-12-07 | Yamaha Corp | 歌唱合成装置および歌唱合成プログラム |
JP2008165130A (ja) * | 2007-01-05 | 2008-07-17 | Yamaha Corp | 歌唱音合成装置およびプログラム |
JP2009244607A (ja) * | 2008-03-31 | 2009-10-22 | Daiichikosho Co Ltd | デュエットパート歌唱生成システム |
JP2009258291A (ja) * | 2008-04-15 | 2009-11-05 | Yamaha Corp | 音声データ処理装置およびプログラム |
JP2011013454A (ja) * | 2009-07-02 | 2011-01-20 | Yamaha Corp | 歌唱合成用データベース生成装置、およびピッチカーブ生成装置 |
JP2011095397A (ja) * | 2009-10-28 | 2011-05-12 | Yamaha Corp | 音声合成装置 |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2003108179A (ja) * | 2001-10-01 | 2003-04-11 | Nippon Telegr & Teleph Corp <Ntt> | 歌唱音声合成における韻律データ収集方法、韻律データ収集プログラム、そのプログラムを記録した記録媒体 |
CN101981614B (zh) * | 2008-04-08 | 2012-06-27 | 株式会社Ntt都科摩 | 媒体处理服务器设备及其媒体处理方法 |
Application timeline:
- 2013-03-15: JP application JP2013053983A filed (published as JP2014178620A (ja); not active, withdrawn)
- 2014-03-12: CN application CN201480014605.4A (published as CN105051811A (zh); active, pending)
- 2014-03-12: WO application PCT/JP2014/056570 (published as WO2014142200A1 (fr); active, application filing)
- 2014-03-12: KR application KR1020157024316A (published as KR20150118974A (ko); not active, application discontinuation)
- 2014-03-13: TW application TW103109149A (published as TW201443874A (zh); status unknown)
Non-Patent Citations (2)
- HIDEKI KAWAHARA ET AL.: "Perceptual study on design reuse of voice identity and singing style based on singing voice morphing", Interaction 2007 Yokoshu, March 2007. Retrieved from the Internet: <http://www.interaction-ipsj.org/archives/paper2007/aural/0043/paper0043.pdf> [retrieved on 2014-06-04]
- TAKESHI SAITO ET AL.: "Utagoe no Kojinsei Chikaku ni Kiyo suru Onkyo Tokucho no Kento", Report of the 2007 Autumn Meeting, The Acoustical Society of Japan (CD-ROM), September 2007, pages 601-602
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2016194622A (ja) * | 2015-04-01 | 2016-11-17 | 株式会社エクシング | カラオケ装置及びカラオケ用プログラム |
EP3537432A4 (fr) * | 2016-11-07 | 2020-06-03 | Yamaha Corporation | Procédé de synthèse vocale |
US11410637B2 (en) | 2016-11-07 | 2022-08-09 | Yamaha Corporation | Voice synthesis method, voice synthesis device, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
TW201443874A (zh) | 2014-11-16 |
JP2014178620A (ja) | 2014-09-25 |
KR20150118974A (ko) | 2015-10-23 |
CN105051811A (zh) | 2015-11-11 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | WWE | WIPO information: entry into national phase | Ref document number: 201480014605.4; Country of ref document: CN |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 14762388; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 20157024316; Country of ref document: KR; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 14762388; Country of ref document: EP; Kind code of ref document: A1 |