WO2022254973A1 - Oral function evaluation method, program, oral function evaluation device, and oral function evaluation system - Google Patents
- Publication number
- WO2022254973A1 (PCT/JP2022/017643)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- oral function
- oral
- formant frequency
- evaluated
- person
- Prior art date
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H20/00—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/103—Detecting, measuring or recording devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
- A61B5/11—Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/48—Other medical applications
- A61B5/4803—Speech analysis specially adapted for diagnostic purposes
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/15—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being formant information
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/21—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B2503/00—Evaluating a particular growth phase or type of persons or animals
- A61B2503/08—Elderly
Definitions
- the present invention relates to an oral function evaluation method, a program, an oral function evaluation device, and an oral function evaluation system capable of evaluating the oral function of a person being evaluated.
- in one known method, a device for evaluating swallowing function is attached to the neck of the person being evaluated, and a pharyngeal movement feature value is obtained as a swallowing function evaluation index (marker) to evaluate the swallowing function (see, for example, Patent Document 1).
- the purpose of the present invention is to provide an oral function evaluation method and the like that enable the oral function of a person being evaluated to be evaluated easily.
- in the oral function evaluation method according to one aspect of the present invention, voice data is acquired by collecting the voice of a person being evaluated uttering a syllable or fixed phrase that consists of two or more moras including a change in the first formant frequency or a change in the second formant frequency, or that includes at least one of a flap sound, a plosive, an unvoiced sound, a geminate, and a fricative; prosodic features are extracted from the acquired voice data; an estimated value of the oral function of the person is calculated based on the extracted prosodic features and an oral function estimation formula calculated from a plurality of learning data; and the deterioration state of the oral function is evaluated by judging the estimated value using an oral function evaluation index.
- a program according to one aspect of the present invention is a program for causing a computer to execute the oral function evaluation method described above.
- the oral function evaluation device according to one aspect of the present invention includes: an acquisition unit that acquires voice data obtained by collecting the voice of a person being evaluated uttering a syllable or fixed phrase that consists of two or more moras including a change in the first formant frequency or a change in the second formant frequency, or that includes at least one of a flap sound, a plosive, an unvoiced sound, a geminate, and a fricative; an extraction unit that extracts prosodic features from the acquired voice data; a calculation unit that calculates an estimated value of the oral function of the person being evaluated based on the extracted prosodic features and an oral function estimation formula calculated from a plurality of learning data; and an evaluation unit that evaluates the deterioration state of the oral function of the person being evaluated by judging the estimated value using an oral function evaluation index.
- an oral function evaluation system according to one aspect of the present invention includes the oral function evaluation device described above and a sound collector that collects, in a non-contact manner, the syllables or fixed phrases uttered by the person being evaluated.
- FIG. 1 is a diagram showing the configuration of an oral function evaluation system according to an embodiment.
- FIG. 2 is a block diagram showing a characteristic functional configuration of the oral cavity function evaluation system according to the embodiment.
- FIG. 3 is a flow chart showing a processing procedure for evaluating the oral function of the person to be evaluated by the oral function evaluation method according to the embodiment.
- FIG. 4 is a diagram showing an outline of a method of acquiring the voice of the person to be evaluated by the oral function evaluation method according to the embodiment.
- FIG. 5A is a diagram showing an example of voice data representing the phrase "I decided to draw a picture" uttered by the person being evaluated.
- FIG. 5B is a diagram showing an example of changes in the formant frequencies of the voice "I decided to draw a picture" uttered by the person being evaluated.
- FIG. 6 is a diagram showing an example of voice data representing the phrase "kara kara kara ..." repeatedly uttered by the person being evaluated.
- FIG. 7 is a diagram showing an example of voice data representing the syllable "itchi" uttered by the person being evaluated.
- FIG. 8 is a diagram showing an example of Chinese syllables or fixed phrases similar to Japanese syllables or fixed phrases in the degree of tongue movement or mouth opening and closing during pronunciation.
- FIG. 9A is a diagram showing the International Phonetic Alphabet for vowels.
- FIG. 9B is a diagram showing the International Phonetic Alphabet for consonants.
- FIG. 10A is a diagram showing an example of voice data representing voice uttered by the evaluator as "gao dao wu da ka ji ke da yi wu zhe".
- FIG. 10B is a diagram showing an example of changes in the formant frequencies of the voice "gao dao wu da ka ji ke da yi wu zhe" uttered by the person being evaluated.
- FIG. 11 is a diagram showing an example of an oral function evaluation index.
- FIG. 12 is a diagram showing an example of evaluation results for each element of oral cavity function.
- FIG. 13 is a diagram showing an example of evaluation results for each element of oral cavity function.
- FIG. 14 is an example of predetermined data used when making suggestions regarding oral cavity functions.
- each figure is a schematic diagram and is not necessarily illustrated strictly to scale. Moreover, in each figure, the same reference signs are given to substantially the same configurations, and redundant description may be omitted or simplified.
- the present invention relates to a method for evaluating deterioration of oral function, etc., and oral function includes various factors.
- oral function factors include tongue coating, dry mouth, bite force, tongue pressure, cheek pressure, number of remaining teeth, swallowing function, and chewing function. Tongue coating, dry mouth, bite force, tongue pressure, and masticatory function are briefly described here.
- Tongue coating indicates the degree of bacterial or food deposits on the tongue (i.e., oral hygiene). If the tongue coating is absent or thin, it indicates that mechanical abrasion (e.g., from food intake) occurs, that the cleansing action of saliva is working, and that the swallowing movement (movement of the tongue) is normal. On the other hand, when the tongue coating is thick, the movement of the tongue is poor and eating becomes difficult, which may lead to nutritional deficiency or muscle weakness. Xerostomia (dry mouth) refers to the degree of dryness of the mouth; a dry mouth inhibits the movements needed for speech. In addition, food is pulverized after it is taken into the oral cavity, but since it is difficult to swallow as it is, saliva acts to gather the pulverized food and make it easier to swallow.
- Tongue pressure is an index expressing the force with which the tongue presses against the palate. When tongue pressure decreases, swallowing movements may become difficult, the speed of tongue movement may decrease, and speaking speed may drop.
- Masticatory function is a comprehensive function of the oral cavity.
- according to the present invention, it is possible to evaluate the state of deterioration of the oral function of the person being evaluated (for example, the state of deterioration of individual oral function elements) from the voice uttered by that person.
- This is because the speech uttered by a person whose oral function has deteriorated has specific features, and by extracting these as prosodic features, the oral function of the person being evaluated can be evaluated.
- the present invention is realized by an oral function evaluation method, a program that causes a computer to execute the method, an oral function evaluation device that is an example of the computer, and an oral function evaluation system that includes the oral function evaluation device. Below, the oral function evaluation method and related aspects are described while illustrating the oral function evaluation system.
- FIG. 1 is a diagram showing the configuration of an oral function evaluation system 200 according to an embodiment.
- the oral function evaluation system 200 is a system for evaluating the oral function of the person U to be evaluated by analyzing the voice of the person U, and includes the oral function evaluation device 100 and a mobile terminal 300.
- the oral function evaluation device 100 is a device that acquires voice data representing the voice uttered by the evaluator U using the portable terminal 300 and evaluates the oral function of the evaluator U from the acquired voice data.
- the mobile terminal 300 collects, in a non-contact manner, the voice of the person U uttering a syllable or fixed phrase that consists of two or more moras including changes in the first formant frequency or the second formant frequency, or that includes at least one of a flap sound, a plosive, an unvoiced sound, a geminate, and a fricative, and outputs the collected voice data to the oral function evaluation device 100.
- the mobile terminal 300 is a smartphone, tablet, or the like having a microphone.
- the mobile terminal 300 is not limited to a smartphone or tablet, and may be a notebook PC or the like, as long as it has a sound collecting function.
- the oral function evaluation system 200 may include a sound collector (microphone) instead of the mobile terminal 300 .
- the oral function evaluation system 200 may also include an input interface for acquiring personal information of the person U to be evaluated.
- the input interface is not particularly limited as long as it has an input function such as a keyboard or a touch panel.
- the volume of the microphone may be set.
- the mobile terminal 300 may be a display device that has a display and displays images based on image data output from the oral function evaluation device 100 .
- the display device may not be the mobile terminal 300, and may be a monitor device configured by a liquid crystal panel, an organic EL panel, or the like. That is, in the present embodiment, mobile terminal 300 serves as both a sound collector and a display device, but the sound collector (microphone), input interface, and display device may be provided separately.
- the oral function evaluation apparatus 100 and the mobile terminal 300 may be connected by wire or wirelessly, as long as they can transmit and receive the audio data and the image data for displaying an image showing the evaluation result described later.
- the oral function evaluation device 100 analyzes the voice of the person U to be evaluated based on the sound data collected by the mobile terminal 300, evaluates the oral function of the person U from the analysis result, and outputs the evaluation result.
- the oral function evaluation apparatus 100 transmits image data for displaying an image showing the evaluation result, or data for making a proposal regarding the oral cavity to the evaluated person U generated based on the evaluation result, to the portable terminal 300. Output.
- thereby, the oral function evaluation apparatus 100 can notify the person U of the degree of his or her oral function and make a proposal for preventing deterioration of the oral function, so that deterioration of oral function can be prevented and improved.
- the oral function evaluation device 100 is, for example, a personal computer, but may be a server device. Moreover, the oral function evaluation device 100 may be the mobile terminal 300 . In other words, the mobile terminal 300 may have the functions of the oral function evaluation device 100 described below.
- FIG. 2 is a block diagram showing the characteristic functional configuration of the oral cavity function evaluation system 200 according to the embodiment.
- the oral function evaluation device 100 includes an acquisition unit 110 , an extraction unit 120 , a calculation unit 130 , an evaluation unit 140 , an output unit 150 , a proposal unit 160 and a storage unit 170 .
- the acquisition unit 110 acquires voice data obtained by the mobile terminal 300 collecting, in a non-contact manner, the voice uttered by the person U to be evaluated.
- the speech is the speech of the person U to be evaluated uttering syllables or fixed phrases consisting of two or more moras including a change in the first formant frequency or a change in the second formant frequency.
- alternatively, the speech is speech in which syllables or fixed phrases including at least one of a flap sound, a plosive, an unvoiced sound, a geminate, and a fricative are uttered.
- the acquisition unit 110 may further acquire the personal information of the person U to be evaluated.
- the personal information is information input to the mobile terminal 300, and includes age, weight, height, gender, BMI (Body Mass Index), dental information (e.g., number of teeth, presence or absence of dentures, location of occlusal support, number of functional teeth, number of remaining teeth, etc.), serum albumin level, eating rate, and the like.
- the personal information may be obtained by a swallowing screening tool called EAT-10, a Seirei style swallowing questionnaire, medical interview, Barthel Index, or the like.
- Acquisition unit 110 is, for example, a communication interface that performs wired communication or wireless communication.
- the extraction unit 120 is a processing unit that analyzes the voice data of the evaluated person U acquired by the acquisition unit 110 .
- the extraction unit 120 is specifically implemented by a processor, microcomputer, or dedicated circuit.
- the extraction unit 120 calculates prosodic features from the speech data acquired by the acquisition unit 110 .
- the prosodic feature amount is a numerical value indicating a feature of the voice of the person U to be evaluated, extracted from the voice data and used by the evaluation unit 140 to evaluate the oral function of the person U.
- the prosodic features may include at least one of: speech rate, sound pressure difference, temporal change of the sound pressure difference, first formant frequency, second formant frequency, amount of change in the first formant frequency, amount of change in the second formant frequency, temporal change of the first formant frequency, temporal change of the second formant frequency, and plosive duration.
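As a minimal sketch of how features of this kind might be computed, the per-frame sound pressure level in dB and its temporal change can be obtained from a waveform as follows. The 16 kHz sample rate, frame length, and hop size are assumptions for illustration; the patent does not prescribe them.

```python
import numpy as np

def sound_pressure_db(signal, frame_len=400, hop=160):
    """Return the per-frame RMS level in dB for a mono waveform."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    rms = np.array([np.sqrt(np.mean(f ** 2)) for f in frames])
    # Floor the RMS to avoid log(0) on silent frames.
    return 20 * np.log10(np.maximum(rms, 1e-10))

# Synthetic check: a loud tone followed by a quiet one; the level drops.
sr = 16000
loud = 0.5 * np.sin(2 * np.pi * 220 * np.arange(sr // 2) / sr)
quiet = 0.05 * np.sin(2 * np.pi * 220 * np.arange(sr // 2) / sr)
levels = sound_pressure_db(np.concatenate([loud, quiet]))
delta = np.diff(levels)  # temporal change of sound pressure (dB per frame)
```

The sound pressure difference between two segments (e.g. a consonant and the following vowel) would then be a difference of two such per-frame levels.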
- the calculation unit 130 calculates an estimated value of the oral function of the person to be evaluated U based on the prosodic feature amount extracted by the extraction unit 120 and the oral function estimation formula calculated based on a plurality of learning data.
- Estimation formula data 171 representing an estimation formula is stored in the storage unit 170 .
- Calculation unit 130 is specifically realized by a processor, a microcomputer, or a dedicated circuit.
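The patent does not fix the form of the estimation formula; as one hedged illustration, it could be a linear model fitted by least squares on learning data that pairs prosodic features with a measured oral function score. All data and weights below are made up for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical learning data: rows are speakers, columns are prosodic
# features (e.g. speech rate, sound pressure difference, F2 change amount).
X = rng.normal(size=(50, 3))
true_w = np.array([2.0, -1.0, 0.5])        # made-up ground-truth weights
y = X @ true_w + 30.0 + 0.01 * rng.normal(size=50)  # measured scores

A = np.hstack([X, np.ones((50, 1))])       # append an intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def estimate(features):
    """Apply the fitted estimation formula to a new feature vector."""
    return float(np.append(features, 1.0) @ coef)
```

The patent also mentions machine learning approaches such as support vector machines, random forests, and logistic regression; any of these could play the same role as the least-squares fit above.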
- the evaluation unit 140 evaluates the deterioration state of the oral function of the person to be evaluated U by determining the estimated value calculated by the calculation unit 130 using the oral function evaluation index.
- Index data 172 indicating an oral function evaluation index is stored in the storage unit 170 .
- the evaluation unit 140 is specifically implemented by a processor, microcomputer, or dedicated circuit.
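The judgment against the oral function evaluation index can be sketched as a simple threshold comparison. The grade names and threshold values below are assumptions for illustration only, not values from the patent.

```python
# Hypothetical index: (lower bound of estimated value, grade), best first.
THRESHOLDS = [(30.0, "normal"), (20.0, "slightly deteriorated")]

def evaluate(estimated_value):
    """Map an estimated oral function value to a deterioration grade."""
    for threshold, grade in THRESHOLDS:
        if estimated_value >= threshold:
            return grade
    return "deteriorated"
```

In the actual system the index data 172 stored in the storage unit 170 would supply these boundaries.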
- the output unit 150 outputs the estimated value calculated by the calculation unit 130 to the proposal unit 160 . Further, the output unit 150 may output the evaluation result of the oral function of the person to be evaluated U evaluated by the evaluation unit 140 to the portable terminal 300 or the like.
- the output unit 150 is specifically realized by a processor, a microcomputer, or a dedicated circuit, and a communication interface that performs wired or wireless communication.
- the proposal unit 160 makes a proposal regarding the oral function of the person to be evaluated U by collating the estimated value calculated by the calculation unit 130 with predetermined data.
- Proposal data 173 which is predetermined data, is stored in storage unit 170 . Further, the proposal unit 160 may compare the personal information acquired by the acquisition unit 110 with the proposal data 173 and make a proposal regarding the oral cavity to the person U to be evaluated.
- the proposal unit 160 outputs the proposal to the mobile terminal 300 .
- the proposal unit 160 is realized by, for example, a processor, a microcomputer or a dedicated circuit, and a communication interface that performs wired or wireless communication.
- the storage unit 170 stores estimation formula data 171 indicating the oral function estimation formula calculated based on a plurality of learning data, index data 172 indicating the oral function evaluation index used for judging the estimated value of the oral function of the person U, proposal data 173, and personal information data 174.
- the estimation formula data 171 is referred to by the calculation unit 130 when the estimated value of the oral function of the person U to be evaluated is calculated.
- the index data 172 is referred to by the evaluation unit 140 when evaluation of the deterioration state of the oral function of the person U to be evaluated is performed.
- the proposal data 173 is referred to by the proposal unit 160 when a proposal regarding oral functions to the person U to be evaluated is made.
- the personal information data 174 is, for example, data obtained via the obtaining unit 110 .
- Personal information data 174 may be stored in storage unit 170 in advance.
- the storage unit 170 is implemented by, for example, ROM (Read Only Memory), RAM (Random Access Memory), semiconductor memory, HDD (Hard Disk Drive), and the like.
- the storage unit 170 also stores a program executed by the computer to realize the extraction unit 120, the calculation unit 130, the evaluation unit 140, the output unit 150, and the proposal unit 160, as well as the evaluation result of the oral function of the person U. Image data indicating the evaluation result used for output, and data such as images, moving images, voices, and texts indicating the contents of the proposal, may also be stored.
- the storage unit 170 may store an instruction image, which will be described later.
- the oral function evaluation device 100 may include an instruction unit that instructs the person U to pronounce a syllable or fixed phrase that consists of two or more moras including changes in the first formant frequency or the second formant frequency, or that includes at least one of a flap sound, a plosive, an unvoiced sound, a geminate, and a fricative. Specifically, the instruction unit acquires the image data of an instruction image or the audio data of an instruction voice, stored in the storage unit 170, for instructing pronunciation of the syllable or fixed phrase, and outputs the image data or audio data to the mobile terminal 300.
- FIG. 3 is a flowchart showing a processing procedure for evaluating the oral function of the person to be evaluated U by the oral function evaluation method according to the embodiment.
- FIG. 4 is a diagram showing an outline of a method of acquiring the voice of the person to be evaluated U by the oral function evaluation method.
- the instruction unit instructs the person U to pronounce a syllable or fixed phrase that consists of two or more moras including a change in the first formant frequency or a change in the second formant frequency, or that includes at least one of a flap sound, a plosive, an unvoiced sound, a geminate, and a fricative (step S101).
- for example, the instruction unit acquires the image data of an image for instructing the person U, stored in the storage unit 170, and outputs the image data to the mobile terminal 300; the mobile terminal 300 then displays the instruction image as shown in (a) of FIG. 4.
- the instruction unit acquires the voice data of the voice for instructing the person to be evaluated U, which is stored in the storage unit 170, and outputs the voice data to the mobile terminal 300, thereby instructing pronunciation.
- alternatively, the instruction may be given using only an instruction voice, without using an instruction image.
- an evaluator (a family member, a doctor, or the like) who wants to evaluate the oral function of the person U may also give the above instruction in his or her own voice, without using an instruction image or instruction voice.
- a spoken syllable or phrase may include two or more vowels or a combination of vowels and consonants that involve opening and closing the mouth or moving the tongue back and forth to speak.
- such syllables or fixed phrases include "I decided to draw a picture".
- uttering the "e wo" in "I decided to draw a picture" involves moving the tongue back and forth, and uttering the "kimeta" ("decided") portion is accompanied by opening and closing of the mouth.
- the "e wo" portion of "I decided to draw a picture" contains the second formant frequencies of the vowel "e" and the vowel "o"; since the vowel "e" and the vowel "o" are adjacent, the amount of change in the second formant frequency is included. This portion also includes the temporal change of the second formant frequency.
- the "kimeta" ("decided") portion of "I decided to draw a picture" contains the first formant frequencies of the vowel "i", the vowel "e", and the vowel "a"; since the vowel "e" and the vowel "a" are adjacent, the amount of change in the first formant frequency is included.
- This part also includes the time variation of the first formant frequency.
- from this fixed phrase, it is possible to extract prosodic features such as the sound pressure difference, the first formant frequency, the second formant frequency, the amount of change in the first formant frequency, the amount of change in the second formant frequency, the temporal change of the first formant frequency, the temporal change of the second formant frequency, the speech rate, and the like.
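For instance, given a per-frame track of the second formant frequency around the "e wo" portion, the amount of change and the temporal change of the formant can be computed directly. The hop size and the F2 values below are illustrative assumptions, not measured data.

```python
import numpy as np

hop_s = 0.010                                  # assumed 10 ms frame hop
f2_track = np.array([1800.0, 1700.0, 1500.0, 1200.0, 1000.0])  # F2 in Hz
change_amount = f2_track[-1] - f2_track[0]     # total change in F2 (Hz)
time_change = np.diff(f2_track) / hop_s        # rate of change (Hz/s)
```

The same two quantities can be computed for a first formant track over the "kimeta" portion.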
- an uttered fixed phrase may include repetition of a syllable composed of a flap sound and a consonant different from the flap.
- as such a fixed phrase, in Japanese, for example, there is "kara kara kara ...".
- By repeatedly uttering "kara kara kara ...", it is possible to extract prosodic features such as the sound pressure difference, the temporal change of the sound pressure difference, the temporal change of the sound pressure, the number of repetitions, and the like.
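A rough way to count such repetitions, assumed here for illustration, is to treat each burst of energy in the amplitude envelope as one utterance of the syllable. The frame size and threshold are made-up choices, not values from the patent.

```python
import numpy as np

def count_bursts(signal, sr=16000, frame=0.02, threshold=0.1):
    """Count energy bursts (repetitions) via a simple amplitude envelope."""
    hop = int(sr * frame)
    env = np.array([np.abs(signal[i:i + hop]).max()
                    for i in range(0, len(signal), hop)])
    active = env > threshold
    # Each rising edge of the active mask marks the start of one burst.
    return int(np.sum(active[1:] & ~active[:-1]) + int(active[0]))

# Synthetic check: three tone bursts separated by silence.
sr = 16000
burst = 0.5 * np.sin(2 * np.pi * 200 * np.arange(sr // 10) / sr)
silence = np.zeros(sr // 10)
signal = np.concatenate([burst, silence] * 3)
```

Dividing the count by the elapsed time would give a repetition rate, one way to quantify the speaking speed of the repeated syllable.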
- a spoken syllable or fixed phrase may include at least one combination of vowels and plosives.
- as such a syllable, in Japanese, there is "itchi" and the like.
- the voice data may be obtained by collecting syllables or fixed phrases uttered by the person U at least twice at different speaking speeds. For example, the person U is instructed to say "I decided to draw a picture" at a normal speed and at a faster speed. Saying the phrase at both speeds makes it possible to estimate how well the state of oral function is maintained.
- the acquisition unit 110 acquires the voice data of the evaluated person U instructed in step S101 via the mobile terminal 300 (step S102).
- in step S102, for example, the person U utters a syllable or fixed phrase such as "I decided to draw a picture" toward the mobile terminal 300.
- the acquisition unit 110 acquires syllables or fixed phrases uttered by the person to be evaluated U as voice data.
- the extraction unit 120 extracts prosodic features from the speech data acquired by the acquisition unit 110 (step S103).
- for example, the extraction unit 120 extracts, as prosodic features, the sound pressure difference, the first formant frequency, the second formant frequency, the amount of change in the first formant frequency, the amount of change in the second formant frequency, the temporal change of the first formant frequency, the temporal change of the second formant frequency, and the speech rate. This will be described with reference to FIGS. 5A and 5B.
- FIG. 5A is a diagram showing an example of voice data indicating voice uttered by the person to be evaluated U, "I decided to draw a picture."
- the horizontal axis of the graph shown in FIG. 5A is time, and the vertical axis is power (sound pressure).
- the unit of power shown on the vertical axis of the graph in FIG. 5A is decibel (dB).
- In the graph shown in FIG. 5A, changes in sound pressure corresponding to each syllable of "I decided to draw a picture" can be confirmed.
- The acquisition unit 110 acquires the speech data shown in FIG. 5A from the person to be evaluated U in step S102 described above.
- The extraction unit 120 extracts, by a known method, the sound pressures of "k" and "a" in "ka", the sound pressures of "k" and "o" in "ko", the sound pressures of "t" and "o" in "to", and the sound pressures of "t" and "a" in "ta" included in the audio data shown in FIG. 5A.
- Then, the extraction unit 120 extracts, as a prosodic feature, the sound pressure difference Diff_P(ka) between "k" and "a" from the extracted sound pressures of "k" and "a".
- Similarly, the extraction unit 120 extracts the sound pressure difference Diff_P(ko) between "k" and "o", the sound pressure difference Diff_P(to) between "t" and "o", and the sound pressure difference Diff_P(ta) between "t" and "a" as prosodic features. For example, the sound pressure difference can be used to evaluate oral function in terms of swallowing force (the pressure with which the tongue contacts the palate) or the force of gathering food. In addition, a sound pressure difference that includes "k" can be used to evaluate oral function in terms of the ability to prevent food from entering the throat.
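The sound pressure difference Diff_P described above can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: the segment boundaries for the consonant and the vowel are assumed to be given (e.g., by forced alignment), and the synthetic segments below are fabricated solely for demonstration.

```python
import numpy as np

def sound_pressure_db(segment: np.ndarray) -> float:
    """RMS sound pressure of a waveform segment, in decibels.

    The 1e-12 floor only guards against log(0); absolute calibration
    is unnecessary because only differences of dB values are used.
    """
    rms = np.sqrt(np.mean(segment ** 2))
    return 20.0 * np.log10(max(rms, 1e-12))

def sound_pressure_difference(consonant: np.ndarray, vowel: np.ndarray) -> float:
    """Diff_P: vowel sound pressure minus consonant sound pressure (dB)."""
    return sound_pressure_db(vowel) - sound_pressure_db(consonant)

# Illustration with synthetic segments: a quiet "k" burst and a louder "a".
rng = np.random.default_rng(0)
k_seg = 0.01 * rng.standard_normal(800)   # low-amplitude consonant
a_seg = 0.20 * rng.standard_normal(1600)  # higher-amplitude vowel
diff_p_ka = sound_pressure_difference(k_seg, a_seg)
print(round(diff_p_ka, 1))  # roughly 26 dB for a 20x amplitude ratio
```

The same function applied to the "k"/"o", "t"/"o", and "t"/"a" segment pairs yields Diff_P(ko), Diff_P(to), and Diff_P(ta).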
- FIG. 5B is a diagram showing an example of changes in the formant frequencies of the voice "I decided to draw a picture" uttered by the person to be evaluated U. Specifically, FIG. 5B is a graph for explaining an example of changes in the first formant frequency and the second formant frequency.
- the first formant frequency is the peak frequency of the amplitude seen first from the low frequency side of human speech, and it is known that the characteristics of opening and closing of the mouth are likely to be reflected.
- the second formant frequency is the peak frequency of amplitude that appears second from the low-frequency side of human speech, and is known to easily reflect the influence of back-and-forth movement of the tongue.
- The extraction unit 120 extracts, as prosodic features, the first formant frequency and the second formant frequency of each of a plurality of vowels from the speech data representing the speech uttered by the person to be evaluated U. For example, the extraction unit 120 extracts the second formant frequency F2e corresponding to the vowel "e" and the second formant frequency F2o corresponding to the vowel "o" in "ewo" as prosodic features. Further, for example, the extraction unit 120 extracts, as prosodic features, the first formant frequency F1i corresponding to the vowel "i", the first formant frequency F1e corresponding to the vowel "e", and the first formant frequency F1a corresponding to the vowel "a" in "kimeta".
- The extraction unit 120 extracts, as prosodic features, the amount of change in the first formant frequency and the amount of change in the second formant frequency of a character string with consecutive vowels. For example, the extraction unit 120 extracts the amount of change between the second formant frequency F2e and the second formant frequency F2o (F2e - F2o), and the amounts of change among the first formant frequency F1i, the first formant frequency F1e, and the first formant frequency F1a (F1e - F1i, F1a - F1e, F1a - F1i), as prosodic features.
- Further, the extraction unit 120 extracts, as prosodic features, the time change of the first formant frequency and the time change of the second formant frequency of a character string with consecutive vowels. For example, the extraction unit 120 extracts the temporal change between the second formant frequency F2e and the second formant frequency F2o, and the temporal changes among the first formant frequency F1i, the first formant frequency F1e, and the first formant frequency F1a, as prosodic features.
- FIG. 5B shows an example of the temporal change among the first formant frequency F1i, the first formant frequency F1e, and the first formant frequency F1a; the temporal change is ΔF1/ΔTime, where ΔF1 is F1a - F1i.
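Formant frequencies such as F1 and F2 are commonly estimated by linear predictive coding (LPC). The patent only says they are extracted "by a known method", so the following is one conventional sketch, not the patent's algorithm: it fits LPC coefficients to one analysis frame and reads the angles of the prediction-polynomial roots as resonance frequencies. The change amounts (e.g., F1a - F1i) and time changes (ΔF1/ΔTime) are then plain differences of per-frame values.

```python
import numpy as np

def lpc_formants(frame, sr, order=4):
    """Estimate resonance (formant) frequencies of one analysis frame
    by the autocorrelation method of LPC: solve the normal equations,
    then convert the prediction-polynomial root angles to Hz."""
    x = frame * np.hamming(len(frame))                      # taper the frame
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])                  # LPC coefficients
    roots = np.roots(np.concatenate(([1.0], -a)))           # A(z) = 1 - sum a_k z^-k
    roots = roots[np.imag(roots) > 0]                       # one of each conjugate pair
    freqs = np.sort(np.angle(roots) * sr / (2 * np.pi))
    return freqs[freqs > 90]                                # drop near-DC roots

# Synthetic vowel-like frame with resonances near 700 Hz and 1200 Hz.
sr = 8000
t = np.arange(2048) / sr
frame = np.exp(-40 * t) * (np.sin(2 * np.pi * 700 * t) + np.sin(2 * np.pi * 1200 * t))
f1, f2 = lpc_formants(frame, sr)[:2]
delta_f = f2 - f1   # a "change amount" is a difference of formant values;
                    # dividing by the time between frames gives ΔF/ΔTime
```

In practice a higher LPC order (roughly sampling rate in kHz plus 2) and per-frame tracking over the whole utterance would be used; order 4 suffices for this two-resonance toy frame.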
- For example, the second formant frequency, the amount of change in the second formant frequency, or the time change in the second formant frequency can be used to evaluate the oral function related to the movement of gathering food and the back-and-forth and left-right movements of the tongue, and the oral function related to the ability to crush food.
- Further, the temporal change of the first formant frequency can be used to evaluate the oral function related to the ability to move the mouth quickly.
- the extraction unit 120 may extract the speech rate as the prosody feature amount.
- For example, the extraction unit 120 may extract, as a prosodic feature, the time from when the person to be evaluated U starts uttering "I decided to draw a picture" until the utterance ends.
- The extraction unit 120 is not limited to the time until the whole of "I decided to draw a picture" is finished being uttered; it may extract, as a prosodic feature, the time from the start to the end of the utterance of a specific part of the phrase.
- Further, the extraction unit 120 may extract, as a prosodic feature, the average time taken to utter one or more words of all or a specific part of "I decided to draw a picture".
- speech rate can assess oral function related to swallowing movements, food gathering movements or tongue dexterity.
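As a rough sketch, the speech rate can be computed as moras per second once the utterance interval is known. The endpointing below uses a simple amplitude-envelope threshold, and the mora count is supplied by the caller since the phrase is fixed; both are assumptions of this illustration, not the method specified in the text.

```python
import numpy as np

def speech_rate(signal, sr, n_moras, threshold_ratio=0.05):
    """Moras per second over the detected utterance interval.

    The interval is found by thresholding a 20 ms smoothed amplitude
    envelope: a deliberately simple stand-in for real endpointing.
    """
    win = max(1, int(0.02 * sr))
    env = np.convolve(np.abs(signal), np.ones(win) / win, mode="same")
    active = np.nonzero(env > threshold_ratio * env.max())[0]
    duration = (active[-1] - active[0] + 1) / sr
    return n_moras / duration

# A 2 s synthetic "utterance" padded with silence on both sides.
sr = 16000
sig = np.zeros(4 * sr)
sig[sr:3 * sr] = np.sin(2 * np.pi * 200 * np.arange(2 * sr) / sr)
# The fixed Japanese phrase is assumed here to have 10 moras.
rate = speech_rate(sig, sr, n_moras=10)   # about 5 moras per second
```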
- In addition, when the voice of the person to be evaluated U repeatedly uttering "kara kara kara ..." is collected, the extraction unit 120 extracts the time change of the sound pressure difference as a prosodic feature. This will be described with reference to FIG. 6.
- FIG. 6 is a diagram showing an example of voice data representing the voice "kara kara kara ..." repeatedly uttered by the person to be evaluated U.
- the horizontal axis of the graph shown in FIG. 6 is time, and the vertical axis is power (sound pressure).
- the unit of power shown on the vertical axis of the graph in FIG. 6 is decibel (dB).
- The acquisition unit 110 acquires the speech data shown in FIG. 6 from the person to be evaluated U in step S102 described above. For example, in step S103, the extraction unit 120 extracts, by a known method, the sound pressures of "k" and "a" in "ka" and the sound pressures of "r" and "a" in "ra" included in the audio data.
- Then, the extraction unit 120 extracts, as a prosodic feature, the sound pressure difference Diff_P(ka) between "k" and "a" from the extracted sound pressures of "k" and "a". Similarly, the extraction unit 120 extracts the sound pressure difference Diff_P(ra) between "r" and "a" as a prosodic feature.
- The extraction unit 120 extracts the sound pressure difference Diff_P(ka) and the sound pressure difference Diff_P(ra) for each repetition of "kara". Then, from the extracted sound pressure differences Diff_P(ka), the extraction unit 120 extracts the time change of the sound pressure difference Diff_P(ka) as a prosodic feature, and from the extracted sound pressure differences Diff_P(ra), extracts the time change of the sound pressure difference Diff_P(ra) as a prosodic feature. For example, the time change of the sound pressure difference can be used to evaluate oral function in terms of swallowing movements, food-gathering movements, or the ability to crush food.
- The extraction unit 120 may also extract the time change of the sound pressure itself as a prosodic feature. For example, it may extract the time change of the minimum sound pressure (the sound pressure of "k") in each "kara" when "kara kara kara ..." is repeatedly uttered, the time change of the maximum sound pressure (the sound pressure of "a"), or the time change of the sound pressure between "ka" and "ra" (the sound pressure of "r") in each "kara". For example, the time change of the sound pressure can be used to evaluate oral function in terms of swallowing movements, food-gathering movements, or the ability to crush food.
- the extraction unit 120 may extract the number of repetitions, which is the number of times "kara" can be uttered per predetermined time, as a feature amount.
- Although the predetermined time is not particularly limited, it is, for example, 5 seconds.
- the number of repetitions per predetermined period of time can assess oral function in terms of swallowing or food gathering movements.
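Counting how many times "kara" is produced within the predetermined window (5 seconds in the example above) can be sketched with a naive envelope onset detector. The detector below, and the synthetic bursts standing in for "kara", are illustrative assumptions; a real system would use a proper syllable-nucleus detector.

```python
import numpy as np

def count_repetitions(signal, sr, window_s=5.0, min_gap_s=0.15):
    """Count syllable onsets inside the first `window_s` seconds by
    detecting rises of the smoothed amplitude envelope above half its
    maximum, separated by at least `min_gap_s`."""
    seg = signal[: int(window_s * sr)]
    win = max(1, int(0.02 * sr))
    env = np.convolve(np.abs(seg), np.ones(win) / win, mode="same")
    above = env > 0.5 * env.max()
    onsets = np.nonzero(above[1:] & ~above[:-1])[0]   # rising edges
    kept, last = 0, -10 * sr
    for i in onsets:                                   # merge close onsets
        if i - last >= min_gap_s * sr:
            kept += 1
            last = i
    return kept

# Eight 0.2 s "kara"-like bursts, one every 0.5 s, then silence.
sr = 8000
sig = np.zeros(5 * sr)
for k in range(8):
    start = int((0.1 + 0.5 * k) * sr)
    sig[start:start + int(0.2 * sr)] = np.sin(2 * np.pi * 150 * np.arange(int(0.2 * sr)) / sr)
n = count_repetitions(sig, sr)
print(n)  # 8
```

This is essentially the oral-diadochokinesis count familiar from clinical practice, computed automatically.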
- In addition, when the voice of the person to be evaluated U uttering "ittai" is collected, the extraction unit 120 extracts the sound pressure difference and the time of the plosive as prosodic features. This will be described with reference to FIG. 7.
- FIG. 7 is a diagram showing an example of voice data indicating the voice "ittai" ("what on earth") uttered by the person to be evaluated U. In FIG. 7, voice data representing a voice in which "ittai" is repeatedly uttered is shown.
- the horizontal axis of the graph shown in FIG. 7 is time, and the vertical axis is power (sound pressure).
- the unit of power shown on the vertical axis of the graph in FIG. 7 is decibel (dB).
- The acquisition unit 110 acquires the voice data shown in FIG. 7 from the person to be evaluated U in step S102 described above.
- the extraction unit 120 extracts the sound pressures of "t” and "a” in "ta” contained in the audio data shown in FIG. 7 by a known method.
- the extraction unit 120 extracts the sound pressure difference Diff_P(ta) of “t” and “a” from the extracted sound pressures of “t” and “a” as prosodic features.
- sound pressure gradients can assess oral function in terms of swallowing force or food gathering force.
- the extracting unit 120 also extracts the plosive time Time(ita) (the plosive time between “i” and “ta”) as a prosody feature amount.
- the duration of the plosives can assess oral function in terms of swallowing movements, food gathering movements or steady movements of the tongue.
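The plosive time Time(ita) corresponds to the closed (silent) phase between "i" and the "ta" burst. A minimal way to measure it, assuming a single clean utterance, is to find the longest below-threshold interval inside the utterance; the synthetic "i"-closure-"ta" signal below is fabricated for illustration.

```python
import numpy as np

def closure_time(signal, sr, threshold_ratio=0.05):
    """Longest below-threshold interval inside the utterance, taken as
    the closed phase before the plosive (Time(ita) in the text)."""
    win = max(1, int(0.01 * sr))
    env = np.convolve(np.abs(signal), np.ones(win) / win, mode="same")
    silent = env < threshold_ratio * env.max()
    voiced = np.nonzero(~silent)[0]
    silent = silent[voiced[0]:voiced[-1] + 1]   # trim edge silence
    best = cur = 0
    for s in silent:                            # longest run of silence
        cur = cur + 1 if s else 0
        best = max(best, cur)
    return best / sr

# "i" (0.3 s), a 0.15 s closure, then "ta" (0.3 s).
sr = 8000
tone = lambda f, d: np.sin(2 * np.pi * f * np.arange(int(d * sr)) / sr)
sig = np.concatenate([tone(250, 0.3), np.zeros(int(0.15 * sr)), tone(220, 0.3)])
t_ita = closure_time(sig, sr)   # close to 0.15 s
```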
- FIG. 8 is a diagram showing an example of Chinese syllables or fixed phrases similar to Japanese syllables or fixed phrases in the degree of tongue movement or mouth opening and closing during pronunciation.
- FIG. 8 shows some examples of Japanese and Chinese syllables or fixed phrases with similar degrees of tongue movement or mouth opening and closing during pronunciation for reference.
- As shown in FIGS. 9A and 9B, there are various languages in the world that have similar degrees of tongue movement or of opening and closing of the mouth during pronunciation.
- FIG. 9A is a diagram showing the International Phonetic Alphabet for vowels.
- FIG. 9B is a diagram showing the International Phonetic Alphabet for consonants.
- In the chart of FIG. 9A, the horizontal direction indicates the back-and-forth position of the tongue: the closer two vowels are horizontally, the more similar the tongue movement. The vertical direction indicates the degree of opening and closing of the mouth: the closer two vowels are vertically, the more similar the degree of opening and closing of the mouth.
- Similarly, in the table of the International Phonetic Alphabet for consonants shown in FIG. 9B, consonants located close to each other are pronounced using similar parts of the mouth. Therefore, the present invention can be applied to various languages existing in the world.
- When the voice of the person to be evaluated U uttering a Chinese phrase is collected, the extraction unit 120 extracts, as prosodic features, the sound pressure difference, the first formant frequency, the second formant frequency, the amount of change in the first formant frequency, the amount of change in the second formant frequency, the time change in the first formant frequency, the time change in the second formant frequency, and the speech rate. This will be described with reference to FIGS. 10A and 10B.
- FIG. 10A is a diagram showing an example of voice data indicating voice uttered by the person to be evaluated U, "gao dao wu da ka ji ke da yi wu zhe".
- the horizontal axis of the graph shown in FIG. 10A is time, and the vertical axis is power (sound pressure).
- the unit of power shown on the vertical axis of the graph in FIG. 10A is decibel (dB).
- In the graph shown in FIG. 10A, the change in sound pressure corresponding to each of "gao", "dao", "wu", "da", "ka", "ji", "ke", "da", "yi", "wu", and "zhe" can be confirmed.
- The acquisition unit 110 acquires the speech data shown in FIG. 10A from the person to be evaluated U in step S102 described above.
- In step S103, for example, the extraction unit 120 extracts, by a known method, the sound pressures of "d" and "a" in "dao", the sound pressures of "k" and "a" in "ka", the sound pressures of "k" and "e" in "ke", and the sound pressures of "zh" and "e" in "zhe" included in the audio data shown in FIG. 10A.
- Then, the extraction unit 120 extracts, as a prosodic feature, the sound pressure difference Diff_P(da) between "d" and "a" from the extracted sound pressures of "d" and "a". Similarly, the extraction unit 120 extracts the sound pressure difference Diff_P(ka) between "k" and "a", the sound pressure difference Diff_P(ke) between "k" and "e", and the sound pressure difference Diff_P(zhe) between "zh" and "e" as prosodic features. For example, the sound pressure difference can be used to evaluate oral function in terms of swallowing force or the force of gathering food. In addition, a sound pressure difference that includes "k" can be used to evaluate oral function in terms of the ability to prevent food from entering the throat.
- FIG. 10B is a diagram showing an example of changes in the formant frequencies of the voice "gao dao wu da ka ji ke da yi wu zhe" uttered by the person to be evaluated U. Specifically, FIG. 10B is a graph for explaining an example of changes in the first formant frequency and the second formant frequency.
- The extraction unit 120 extracts, as prosodic features, the first formant frequency and the second formant frequency of each of a plurality of vowels from the speech data representing the speech uttered by the person to be evaluated U. For example, the extraction unit 120 extracts, as prosodic features, the first formant frequency F1i corresponding to the vowel "i" in "ji", the first formant frequency F1e corresponding to the vowel "e" in "ke", and the first formant frequency F1a corresponding to the vowel "a" in "da".
- the extraction unit 120 extracts the second formant frequency F2i corresponding to the vowel “i” in “yi” and the second formant frequency F2u corresponding to the vowel “u” in “wu” as prosodic features.
- The extraction unit 120 extracts, as prosodic features, the amount of change in the first formant frequency and the amount of change in the second formant frequency of a character string with consecutive vowels. For example, the extraction unit 120 extracts the amounts of change among the first formant frequency F1i, the first formant frequency F1e, and the first formant frequency F1a (F1e - F1i, F1a - F1e, F1a - F1i), and the amount of change between the second formant frequency F2i and the second formant frequency F2u (F2i - F2u), as prosodic features.
- Further, the extraction unit 120 extracts, as prosodic features, the time change of the first formant frequency and the time change of the second formant frequency of a character string with consecutive vowels. For example, the extraction unit 120 extracts the temporal changes among the first formant frequency F1i, the first formant frequency F1e, and the first formant frequency F1a, and the temporal change between the second formant frequency F2i and the second formant frequency F2u, as prosodic features.
- For example, the second formant frequency, the amount of change in the second formant frequency, or the time change in the second formant frequency can be used to evaluate the oral function related to the movement of gathering food and the oral function related to the ability to crush food.
- Further, the temporal change of the first formant frequency can be used to evaluate the oral function related to the ability to move the mouth quickly.
- the extraction unit 120 may extract the speech rate as the prosody feature amount.
- For example, the extraction unit 120 may extract, as a prosodic feature, the time from when the person to be evaluated U starts uttering "gao dao wu da ka ji ke da yi wu zhe" until the utterance ends.
- The extraction unit 120 is not limited to the time until the whole of "gao dao wu da ka ji ke da yi wu zhe" is finished being uttered; it may extract, as a prosodic feature, the time from the start to the end of the utterance of a specific part of the phrase.
- Further, the extraction unit 120 may extract, as a prosodic feature, the average time taken to utter one or more words of all or a specific part of "gao dao wu da ka ji ke da yi wu zhe".
- speech rate can assess oral function related to swallowing movements, food gathering movements or tongue dexterity.
- Next, the calculation unit 130 calculates an estimated value of the oral function of the person to be evaluated U based on the extracted prosodic features and an oral function estimation formula calculated based on a plurality of pieces of learning data (step S104).
- The oral function estimation formula is set in advance based on the evaluation results of a plurality of subjects. It can be generated by acquiring the features of voices uttered by the subjects, actually diagnosing the oral function of those subjects, and establishing the correlation between the voice features and the diagnosis results by statistical analysis using a multiple regression equation. Different types of estimation formulas can be generated depending on how the speech features used as representative values are selected. In this manner, an estimation formula can be generated in advance.
- Machine learning methods include logistic regression, SVM (Support Vector Machine), and random forest.
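The statistical step described above, fitting coefficients (A, B, ...) and a constant (P) so that a weighted sum of prosodic features reproduces clinically measured values, can be sketched with ordinary least squares. All data below are fabricated solely to show the fitting step; the patent's actual coefficients would come from real subject recordings and diagnoses.

```python
import numpy as np

# Toy training set: rows are subjects, columns are prosodic features
# (stand-ins for F2e, F2o, Diff_P(ka), ...); y is a clinically measured
# value, e.g. tongue pressure in kPa. Entirely synthetic.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 5))
true_w = np.array([2.0, -1.5, 0.5, 3.0, -0.7])
y = X @ true_w + 30.0 + 0.1 * rng.normal(size=40)

# Least-squares fit of the coefficients (A, B, C, ...) and constant (P):
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
weights, constant = coef[:-1], coef[-1]
```

With enough subjects the fitted `weights` approach the true relationship; the alternatives named in the text (logistic regression, SVM, random forest) would replace only this fitting step, not the feature extraction.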
- the estimation formula can be configured to include coefficients corresponding to oral function elements and variables into which the extracted prosodic feature values are substituted and multiplied by the above coefficients. Equations 1 to 5 below are examples of estimation equations.
- Oral hygiene estimated value = (A1 x F2e) + (B1 x F2o) + (C1 x F1i) + (D1 x F1e) + (E1 x F1a) + (F1 x Diff_P(ka)) + (G1 x Diff_P(ko)) + (H1 x Diff_P(to)) + (J1 x Diff_P(ta)) + (K1 x Diff_P(ka)) + (L1 x Diff_P(ra)) + (M1 x Diff_P(ta)) + (N1 x Time(ita)) + P1 (Formula 1)
- Tongue pressure estimated value = (A4 x F2e) + (B4 x F2o) + (C4 x F1i) + (D4 x F1e) + (E4 x F1a) + (F4 x Diff_P(ka)) + (G4 x Diff_P(ko)) + (H4 x Diff_P(to)) + (J4 x Diff_P(ta)) + (K4 x Diff_P(ka)) + (L4 x Diff_P(ra)) + (M4 x Diff_P(ta)) + (N4 x Time(ita)) + P4 (Formula 4)
- Masticatory function estimated value = (A5 x F2e) + (B5 x F2o) + (C5 x F1i) + (D5 x F1e) + (E5 x F1a) + (F5 x Diff_P(ka)) + (G5 x Diff_P(ko)) + (H5 x Diff_P(to)) + (J5 x Diff_P(ta)) + (K5 x Diff_P(ka)) + (L5 x Diff_P(ra)) + (M5 x Diff_P(ta)) + (N5 x Time(ita)) + P5 (Formula 5)
- A1, B1, C1, ..., N1, A2, B2, C2, ..., N2, A3, B3, C3, ..., N3, A4, B4, C4, ..., N4, and A5, B5, C5, ..., N5 are coefficients, specifically, coefficients corresponding to the elements of oral function.
- That is, A1, B1, C1, ..., N1 are coefficients corresponding to oral hygiene, A2, B2, C2, ..., N2 are coefficients corresponding to dry mouth, A3, B3, C3, ..., N3 are coefficients corresponding to bite force, A4, B4, C4, ..., N4 are coefficients corresponding to tongue pressure, and A5, B5, C5, ..., N5 are coefficients corresponding to masticatory function.
- P1 is a constant corresponding to oral hygiene, P2 is a constant corresponding to dry mouth, P3 is a constant corresponding to bite force, P4 is a constant corresponding to tongue pressure, and P5 is a constant corresponding to masticatory function.
- F2e, which is multiplied by A1, A2, A3, A4, and A5, and F2o, which is multiplied by B1, B2, B3, B4, and B5, are variables into which the second formant frequencies, which are prosodic features extracted from the speech data of the person to be evaluated U uttering "I decided to draw a picture", are substituted.
- Likewise, F1i multiplied by C1, C2, C3, C4, and C5, F1e multiplied by D1, D2, D3, D4, and D5, and F1a multiplied by E1, E2, E3, E4, and E5 are variables into which the first formant frequencies, which are prosodic features extracted from the speech data of the utterance "I decided to draw a picture", are substituted.
- Diff_P(ka) multiplied by F1, F2, F3, F4, and F5, Diff_P(ko) multiplied by G1, G2, G3, G4, and G5, Diff_P(to) multiplied by H1, H2, H3, H4, and H5, and Diff_P(ta) multiplied by J1, J2, J3, J4, and J5 are variables into which the sound pressure differences, which are prosodic features extracted from the speech data of the person to be evaluated U uttering "I decided to draw a picture", are substituted.
- Diff_P(ka) multiplied by K1, K2, K3, K4, and K5 and Diff_P(ra) multiplied by L1, L2, L3, L4, and L5 are variables into which the sound pressure differences, which are prosodic features extracted from the speech data of the person to be evaluated U uttering "kara", are substituted.
- Diff_P(ta) multiplied by M1, M2, M3, M4, and M5 is a variable into which the sound pressure difference, which is a prosodic feature extracted from the speech data of the person to be evaluated U uttering "ittai", is substituted.
- Time(ita) multiplied by N1, N2, N3, N4, and N5 is a variable into which the time of the plosive, which is a prosodic feature extracted from the speech data of the person to be evaluated U uttering "ittai", is substituted.
- In this manner, the calculation unit 130 calculates an estimated value for each element of the oral function of the person to be evaluated U (for example, tongue coating, dry mouth, bite force, tongue pressure, and masticatory function).
- Note that these oral function elements are examples; the elements of oral function should include at least one of tongue coating, dry mouth, bite force, tongue pressure, cheek pressure, number of remaining teeth, swallowing function, and masticatory function.
- The extraction unit 120 extracts a plurality of prosodic features from the audio data obtained by collecting the voice of the person to be evaluated U uttering a plurality of types of syllables or fixed phrases (for example, in the above Formulas 1 to 5, "I decided to draw a picture", "kara", and "ittai"), and the calculation unit 130 calculates the estimated value of the oral function based on the extracted plurality of prosodic features and the estimation formula.
- the calculation unit 130 can accurately calculate the estimated value of the oral function by substituting a plurality of prosodic features extracted from speech data of a plurality of types of syllables or fixed phrases into one estimation formula.
- Note that the estimation formula may be a higher-order expression such as a quadratic expression.
- Next, the evaluation unit 140 evaluates the deterioration state of the oral function of the person to be evaluated U by determining the estimated value calculated by the calculation unit 130 using an oral function evaluation index (step S105). For example, the evaluation unit 140 determines the estimated value calculated for each element of the oral function using an oral function evaluation index determined for each element, thereby evaluating the deterioration state of the oral function of the person to be evaluated U for each element of the oral function.
- the oral function evaluation index is an index for evaluating oral function, and is, for example, a condition for determining that oral function is degraded. The oral function evaluation index will be described with reference to FIG. 11 .
- FIG. 11 is a diagram showing an example of an oral function evaluation index.
- The oral function evaluation index is determined for each element of oral function. For example, in Japan, an index of 50% or more is set for oral hygiene, an index of 27 or less is set for dry mouth, an index of less than 500 N is set for bite force (when using Dental Prescale II from GC Co., Ltd.), an index of less than 30 kPa is set for tongue pressure, and an index of less than 100 mg/dL is set for masticatory function (for the indices, see the Japanese Dental Association's "Basic concept of oral hypofunction" (https://www.jads.jp/basic/pdf/document_02.pdf)).
- The evaluation unit 140 compares the calculated estimated value of each oral function element with the oral function evaluation index determined for that element, thereby evaluating the deterioration state of the oral function of the person to be evaluated U for each element. For example, when the calculated estimated value of oral hygiene is 50% or more, it is evaluated that oral hygiene, as an element of oral function, is in a deteriorated state. Similarly, when the calculated estimated value of dry mouth is 27 or less, it is evaluated that dry mouth, as an element of oral function, is in a deteriorated state; when the calculated estimated value of bite force is less than 500 N, it is evaluated that bite force, as an element of oral function, is in a deteriorated state; when the calculated estimated value of tongue pressure is less than 30 kPa, it is evaluated that tongue pressure, as an element of oral function, is in a deteriorated state; and when the calculated estimated value of masticatory function is less than 100 mg/dL, it is evaluated that masticatory function, as an element of oral function, is in a deteriorated state.
- Note that the oral function evaluation indices determined for oral hygiene, dry mouth, bite force, tongue pressure, and masticatory function are merely examples, and the indices are not limited to these. For example, an index of the number of remaining teeth may be established for masticatory function.
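The determination in step S105 amounts to comparing each element's estimated value against its index with the correct inequality direction. The thresholds below are taken from the text (FIG. 11); the table-driven structure is merely one way to sketch it.

```python
# Thresholds from FIG. 11; the comparison direction differs per
# element, so each entry stores a predicate returning True = degraded.
indices = {
    "oral hygiene":         lambda v: v >= 50,    # % (50% or more)
    "dry mouth":            lambda v: v <= 27,    # (27 or less)
    "bite force":           lambda v: v < 500,    # N (Dental Prescale II)
    "tongue pressure":      lambda v: v < 30,     # kPa
    "masticatory function": lambda v: v < 100,    # mg/dL
}

def evaluate(estimates: dict) -> dict:
    """Map each element's estimated value to True (degraded) / False."""
    return {k: indices[k](v) for k, v in estimates.items()}

result = evaluate({"oral hygiene": 42.0, "tongue pressure": 25.0})
print(result)  # {'oral hygiene': False, 'tongue pressure': True}
```

Swapping the dictionary for country-specific thresholds realizes the point made below that the index may differ depending on the country of use.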
- oral hygiene, dry mouth, bite force, tongue pressure and masticatory function are shown as elements of oral function, but these are only examples.
- Since the oral function evaluation index may differ depending on the country, an oral function evaluation index according to the country where the oral function evaluation device 100 is used may be used.
- the output unit 150 outputs the evaluation result of the oral function of the person to be evaluated U evaluated by the evaluation unit 140 (step S106).
- the output unit 150 outputs evaluation results to the mobile terminal 300 .
- The output unit 150 may include, for example, a communication interface that performs wired or wireless communication; in this case, the output unit 150 acquires image data of an image corresponding to the evaluation result from the storage unit 170 and transmits the acquired image data to the mobile terminal 300.
- An example of the image data (evaluation result) is shown in FIGS. 12 and 13.
- FIGS. 12 and 13 are diagrams showing an example of evaluation results for each element of oral cavity function.
- the evaluation result may be a two-level evaluation result of OK or NG.
- OK means normal, and NG means abnormal.
- index data 172 stored in storage unit 170 may include a plurality of indexes for one element.
- the evaluation result may be represented by a radar chart.
- Figures 12 and 13 show mouth cleanliness, ability to hold food together, ability to chew hard objects, tongue strength and jaw movement as factors of oral function.
- The cleanliness of the mouth is evaluated based on the estimated value of oral hygiene, the ability to hold food together is based on the estimated value of dry mouth, the ability to chew hard objects is based on the estimated value of bite force, the force of the tongue is based on the estimated value of tongue pressure, and the movement of the jaw is based on the estimated value of masticatory function.
- FIGS. 12 and 13 are only examples, and the wording of the evaluation items, the oral function elements, and their correspondence are not limited to those shown in FIGS. 12 and 13.
- Then, the proposal unit 160 makes a proposal regarding the oral function to the person to be evaluated U by comparing the estimated value calculated by the calculation unit 130 with predetermined data (the suggestion data 173) (step S107).
- predetermined data will now be described with reference to FIG.
- FIG. 14 is an example of predetermined data (suggestion data 173) used when proposing oral cavity functions.
- The proposal data 173 is data in which evaluation results and proposal contents are associated with each element of oral function. For example, when the calculated estimated value of mouth cleanliness is 50% or more, the proposal unit 160 determines that the index is satisfied, and makes a proposal according to the proposal content associated with that evaluation result. Note that although the description of specific proposal contents is omitted here, the storage unit 170 stores, for example, data indicating the contents of proposals (e.g., images, videos, audio, text, etc.), and the proposal unit 160 uses such data to make a proposal regarding oral function to the person to be evaluated U.
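The association in the suggestion data 173 between evaluation results and proposal contents can be modeled as a simple lookup keyed by element and outcome. The wording below is entirely hypothetical, since the text omits concrete proposal contents.

```python
# Hypothetical stand-in for the suggestion data 173: proposal content
# keyed by (oral function element, evaluation outcome).
suggestions = {
    ("tongue pressure", "NG"): "Try daily tongue-strengthening exercises.",
    ("tongue pressure", "OK"): "Tongue strength is fine; keep current habits.",
}

def propose(element: str, degraded: bool) -> str:
    """Return the proposal content for one element's evaluation result."""
    return suggestions[(element, "NG" if degraded else "OK")]

print(propose("tongue pressure", True))
```

In the actual device the returned entry would reference stored images, videos, audio, or text rather than a bare string, and the table would be updatable, as described for the proposal data below.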
- As described above, the oral function evaluation method according to the present embodiment can evaluate the oral function of the person to be evaluated U by acquiring voice data suitable for oral function evaluation. That is, the oral function of the person to be evaluated U can be evaluated simply by having the person utter the syllables or fixed phrases toward a sound collecting device such as the mobile terminal 300.
- Since the estimated value of the oral function is calculated using the estimation formula calculated based on a plurality of pieces of learning data, the deterioration state of the oral function can be quantitatively evaluated.
- Specifically, an estimated value is calculated from the prosodic features and the estimation formula, and the estimated value is compared with a threshold value (the oral function evaluation index); therefore, the deterioration state of the oral function can be evaluated with high accuracy.
- the estimation formula may include coefficients corresponding to oral function elements and variables into which the extracted prosodic feature values are substituted and multiplied by the above coefficients.
- the estimated value of the oral cavity function can be easily calculated simply by substituting the extracted prosodic features into the estimation formula.
- For example, an estimated value may be calculated for each element of the oral function of the person to be evaluated U, and the estimated value calculated for each element may be determined using the oral function evaluation index determined for that element; thereby, the deterioration state of the oral function can be evaluated for each element. For example, by preparing an estimation formula with different coefficients for each oral function element, the deterioration state of oral function can be easily evaluated for each element.
- the elements of the oral cavity function may include at least one of tongue coating, dry mouth, bite force, tongue pressure, cheek pressure, number of remaining teeth, swallowing function, and masticatory function of the subject U.
- For example, the prosodic features may include at least one of the speech rate, the sound pressure difference, the time change of the sound pressure difference, the first formant frequency, the second formant frequency, the amount of change in the first formant frequency, the amount of change in the second formant frequency, the time change of the first formant frequency, the time change of the second formant frequency, and the time of a plosive.
- For example, in the extraction step, a plurality of prosodic features may be extracted from speech data obtained by collecting the voice of the person to be evaluated U uttering a plurality of types of syllables or fixed phrases, and in the calculation step, the estimated value may be calculated based on the extracted plurality of prosodic features and the estimation formula.
- a syllable or phrase may include two or more vowels or a combination of vowels and consonants that involve opening and closing the mouth or moving the tongue back and forth to speak.
- Thereby, from the voice of the person to be evaluated U uttering such a syllable or fixed phrase, it is possible to extract prosodic features including the amount of change in the first formant frequency, the time change of the first formant frequency, the amount of change in the second formant frequency, or the time change of the second formant frequency.
- For example, the voice data may be obtained by collecting the voice of a syllable or fixed phrase uttered by the person to be evaluated U at least twice at different speaking speeds.
- For example, a fixed phrase may include repetition of a syllable consisting of a flap sound and a consonant different from the flap sound.
- Thereby, it is possible to extract prosodic features including the time change of the sound pressure difference, the time change of the sound pressure, and the number of repetitions from the voice of the person to be evaluated U uttering such a syllable or fixed phrase.
- a syllable or fixed phrase may include at least one combination of vowels and plosives.
- the oral function evaluation method may further include a proposal step of making a proposal regarding the oral function of the person to be evaluated U by comparing the calculated estimated value with predetermined data.
- the person to be evaluated U can receive suggestions on what measures to take when the oral function deteriorates.
- Oral function evaluation device 100 includes: acquisition unit 110 that acquires speech data obtained by collecting the voice of the person to be evaluated U uttering a syllable or fixed phrase that consists of two or more moras including a change in the first formant frequency or a change in the second formant frequency, or that includes at least one of a flap, a plosive, an unvoiced sound, a geminate, and a fricative; extraction unit 120 that extracts prosodic features from the acquired speech data; calculation unit 130 that calculates an estimated value of the oral function of the person to be evaluated U based on an oral function estimation formula calculated from a plurality of learning data and the extracted prosodic features; and evaluation unit 140 that evaluates the deterioration state of the oral function of the person to be evaluated U by judging the estimated value using an oral function evaluation index.
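A minimal sketch of the calculation and evaluation steps follows, assuming a linear estimation formula of the kind described in the claims (coefficients multiplied by feature variables) and a single-threshold evaluation index. Every coefficient, feature value, and threshold below is a hypothetical placeholder, not a value from the publication.

```python
# Hedged sketch of the calculation step (weighted sum of prosodic
# features) and the evaluation step (judging the estimate against an
# oral function evaluation index). All numbers are hypothetical.

def estimate(features, coefficients, intercept):
    """Linear estimation formula: intercept + sum of coef * feature."""
    return intercept + sum(c * x for c, x in zip(coefficients, features))

def evaluate(estimated, index_threshold):
    """Return True when the estimate indicates deterioration."""
    return estimated < index_threshold

feats = [450.0, 11250.0, 3.0]   # e.g. F1 change, F1 rate, repetitions
coefs = [0.002, 0.0001, 0.5]    # hypothetical learned coefficients
score = estimate(feats, coefs, intercept=1.0)
deteriorated = evaluate(score, index_threshold=3.0)
```

In practice a separate coefficient set and threshold would be held per oral function element (tongue coating, oral dryness, swallowing function, and so on), matching the per-element evaluation described above.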
- This provides an oral function evaluation device 100 capable of easily evaluating the oral function of the person to be evaluated U.
- The oral function evaluation system 200 includes the oral function evaluation device 100 and a sound collecting device (mobile terminal 300) that collects, in a non-contact manner, the voice of the person to be evaluated U uttering the syllables or fixed phrases.
- This provides an oral function evaluation system 200 that can easily evaluate the oral function of the person to be evaluated U.
- The estimation formula data 171 may be updated based on evaluation results obtained when an expert actually diagnoses the oral function of the person to be evaluated U. This can improve the accuracy of oral function evaluation. Machine learning may be used for this purpose.
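One conceivable realization of such an update is to refit the coefficients of the estimation formula from pairs of a prosodic feature value and an expert diagnosis score. The sketch below uses a one-variable least-squares fit on hypothetical data; the publication does not specify a fitting method, so both the approach and the numbers are assumptions.

```python
# Hedged sketch: refitting a one-variable linear estimation formula
# (score = a * feature + b) from newly collected pairs of
# (prosodic feature, expert diagnosis score). Data are hypothetical.

def refit(pairs):
    """Ordinary least-squares fit of score = a * feature + b."""
    n = len(pairs)
    sx = sum(x for x, _ in pairs)
    sy = sum(y for _, y in pairs)
    sxx = sum(x * x for x, _ in pairs)
    sxy = sum(x * y for x, y in pairs)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

# Hypothetical expert scores paired with one prosodic feature value.
pairs = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
a, b = refit(pairs)
```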
- The proposal data 173 may be updated based on the person to be evaluated U's assessment of the content of a proposal. For example, when a proposal is made regarding an oral function that is not problematic for the person to be evaluated U, the person to be evaluated U assesses the proposal as wrong. Updating the proposal data 173 based on this assessment prevents such wrong proposals and makes the proposals regarding oral function to the person to be evaluated U more effective. Machine learning may be used to make the proposals more effective.
- evaluation results of oral functions may be accumulated as big data together with personal information and used for machine learning.
- contents of proposals regarding oral cavity functions may be accumulated as big data together with personal information and used for machine learning.
- The oral function evaluation method includes the proposal step (step S107) of making a proposal regarding oral function, but this step may be omitted.
- Likewise, the oral function evaluation device 100 does not have to include the proposal unit 160.
- The personal information of the person to be evaluated U is acquired in the acquisition step (step S102), but it does not have to be acquired.
- Likewise, the acquisition unit 110 does not have to acquire the personal information of the person to be evaluated U.
- the steps in the oral function evaluation method may be executed by a computer (computer system).
- the present invention can be realized as a program for causing a computer to execute the steps included in those methods.
- The present invention can be implemented as a non-transitory computer-readable recording medium, such as a CD-ROM, on which the program is recorded.
- Each step is executed by running the program using hardware resources such as the CPU, memory, and input/output circuits of the computer. That is, each step is executed when the CPU acquires data from the memory, the input/output circuit, or the like, performs an operation, and outputs the operation result to the memory, the input/output circuit, or the like.
- each component included in the oral function evaluation device 100 and the oral function evaluation system 200 of the above embodiment may be realized as a dedicated or general-purpose circuit.
- each component included in the oral function evaluation device 100 and the oral function evaluation system 200 of the above embodiment may be realized as an LSI (Large Scale Integration), which is an integrated circuit (IC: Integrated Circuit).
- the integrated circuit is not limited to an LSI, and may be realized by a dedicated circuit or a general-purpose processor.
- a programmable FPGA (Field Programmable Gate Array) or a reconfigurable processor capable of reconfiguring connections and settings of circuit cells inside the LSI may be used.
- each component included in the oral function evaluation device 100 and the oral function evaluation system 200 may be integrated.
Abstract
Description
[Elements of Oral Function]
The present invention relates to a method and the like for evaluating deterioration of oral function, which comprises various elements.
The configuration of oral function evaluation system 200 according to the embodiment will be described.
Next, a specific processing procedure in the oral function evaluation method executed by oral function evaluation device 100 will be described.
The tongue movement and the degree of mouth opening and closing when pronouncing the Chinese sentence (hereinafter written as "gao dao wu da ka ji ke da yi wu zhe") are similar to those when pronouncing the Japanese sentence "えをかくことにきめたよ" (e o kaku koto ni kimeta yo), so prosodic features similar to those of the Japanese sentence can be extracted. Note that tone marks are omitted in this specification. FIG. 8 shows, for reference, several examples of Japanese and Chinese syllables or fixed phrases that involve similar tongue movement or degrees of mouth opening and closing during pronunciation.
As described above, the oral function evaluation method according to the present embodiment includes, as shown in FIG. 3: an acquisition step (step S102) of acquiring speech data obtained by collecting the voice of the person to be evaluated U uttering a syllable or fixed phrase that consists of two or more moras including a change in the first formant frequency or a change in the second formant frequency, or that includes at least one of a flap, a plosive, an unvoiced sound, a geminate, and a fricative; an extraction step (step S103) of extracting prosodic features from the acquired speech data; a calculation step (step S104) of calculating an estimated value of the oral function of the person to be evaluated U based on an oral function estimation formula calculated from a plurality of learning data and the extracted prosodic features; and an evaluation step (step S105) of evaluating the deterioration state of the oral function of the person to be evaluated U by judging the calculated estimated value using an oral function evaluation index.
Although the oral function evaluation method and the like according to the embodiment have been described above, the present invention is not limited to the above embodiment.
110 acquisition unit
120 extraction unit
130 calculation unit
140 evaluation unit
200 oral function evaluation system
300 mobile terminal (sound collecting device)
U person to be evaluated
Claims (14)
- An oral function evaluation method comprising: an acquisition step of acquiring speech data obtained by collecting the voice of a person to be evaluated uttering a syllable or fixed phrase that consists of two or more moras including a change in a first formant frequency or a change in a second formant frequency, or that includes at least one of a flap, a plosive, an unvoiced sound, a geminate, and a fricative; an extraction step of extracting prosodic features from the acquired speech data; a calculation step of calculating an estimated value of the oral function of the person to be evaluated based on an oral function estimation formula calculated from a plurality of learning data and the extracted prosodic features; and an evaluation step of evaluating a deterioration state of the oral function of the person to be evaluated by judging the calculated estimated value using an oral function evaluation index.
- The oral function evaluation method according to claim 1, wherein the estimation formula includes coefficients corresponding to elements of oral function, and variables into which the extracted prosodic features are substituted and which are multiplied by the coefficients.
- The oral function evaluation method according to claim 1 or 2, wherein in the calculation step, the estimated value is calculated for each element of the oral function of the person to be evaluated, and in the evaluation step, the deterioration state of the oral function of the person to be evaluated is evaluated for each element of oral function by judging the calculated estimated value for each element using an oral function evaluation index defined for each element of oral function.
- The oral function evaluation method according to any one of claims 1 to 3, wherein the elements of oral function include at least one of tongue coating, oral dryness, occlusal force, tongue pressure, cheek pressure, number of remaining teeth, swallowing function, and masticatory function of the person to be evaluated.
- The oral function evaluation method according to any one of claims 1 to 4, wherein the prosodic features include at least one of speaking speed, sound pressure difference, temporal change in the sound pressure difference, the first formant frequency, the second formant frequency, the amount of change in the first formant frequency, the amount of change in the second formant frequency, the temporal change in the first formant frequency, the temporal change in the second formant frequency, and plosive duration.
- The oral function evaluation method according to any one of claims 1 to 5, wherein in the extraction step, a plurality of prosodic features are extracted from the speech data obtained by collecting the voice of the person to be evaluated uttering a plurality of types of the syllables or the fixed phrases, and in the calculation step, the estimated value is calculated based on the extracted plurality of prosodic features and the estimation formula.
- The oral function evaluation method according to any one of claims 1 to 6, wherein the syllable or the fixed phrase includes two or more vowels, or a combination of vowels and consonants, whose utterance involves opening and closing the mouth or moving the tongue back and forth.
- The oral function evaluation method according to any one of claims 1 to 7, wherein the speech data is obtained by collecting the voice of the person to be evaluated uttering the syllable or the fixed phrase at least twice at different speaking speeds.
- The oral function evaluation method according to any one of claims 1 to 8, wherein the fixed phrase includes a repetition of syllables consisting of a flap and a consonant different from the flap.
- The oral function evaluation method according to any one of claims 1 to 9, wherein the syllable or the fixed phrase includes at least one combination of a vowel and a plosive.
- The oral function evaluation method according to any one of claims 1 to 10, further comprising a proposal step of making a proposal regarding the oral function of the person to be evaluated by comparing the calculated estimated value with predetermined data.
- A program for causing a computer to execute the oral function evaluation method according to any one of claims 1 to 11.
- An oral function evaluation device comprising: an acquisition unit that acquires speech data obtained by collecting the voice of a person to be evaluated uttering a syllable or fixed phrase that consists of two or more moras including a change in a first formant frequency or a change in a second formant frequency, or that includes at least one of a flap, a plosive, an unvoiced sound, a geminate, and a fricative; an extraction unit that extracts prosodic features from the acquired speech data; a calculation unit that calculates an estimated value of the oral function of the person to be evaluated based on an oral function estimation formula calculated from a plurality of learning data and the extracted prosodic features; and an evaluation unit that evaluates a deterioration state of the oral function of the person to be evaluated by judging the calculated estimated value using an oral function evaluation index.
- An oral function evaluation system comprising: the oral function evaluation device according to claim 13; and a sound collecting device that collects, in a non-contact manner, the voice of the person to be evaluated uttering the syllable or the fixed phrase.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202280033469.8A CN117279568A (zh) | 2021-05-31 | 2022-04-12 | 口腔功能评价方法、程序、口腔功能评价装置及口腔功能评价系统 |
JP2023525648A JPWO2022254973A1 (ja) | 2021-05-31 | 2022-04-12 | |
US18/563,251 US20240268705A1 (en) | 2021-05-31 | 2022-04-12 | Oral function evaluation method, recording medium, oral function evaluation device, and oral function evaluation system |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2021-091766 | 2021-05-31 | ||
JP2021091766 | 2021-05-31 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022254973A1 true WO2022254973A1 (ja) | 2022-12-08 |
Family
ID=84323191
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2022/017643 WO2022254973A1 (ja) | 2021-05-31 | 2022-04-12 | 口腔機能評価方法、プログラム、口腔機能評価装置および口腔機能評価システム |
Country Status (4)
Country | Link |
---|---|
US (1) | US20240268705A1 (ja) |
JP (1) | JPWO2022254973A1 (ja) |
CN (1) | CN117279568A (ja) |
WO (1) | WO2022254973A1 (ja) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200380960A1 (en) * | 2017-11-27 | 2020-12-03 | Yeda Research And Development Co. Ltd. | Extracting content from speech prosody |
2022
- 2022-04-12 JP JP2023525648A patent/JPWO2022254973A1/ja active Pending
- 2022-04-12 WO PCT/JP2022/017643 patent/WO2022254973A1/ja active Application Filing
- 2022-04-12 CN CN202280033469.8A patent/CN117279568A/zh active Pending
- 2022-04-12 US US18/563,251 patent/US20240268705A1/en active Pending
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200380960A1 (en) * | 2017-11-27 | 2020-12-03 | Yeda Research And Development Co. Ltd. | Extracting content from speech prosody |
Non-Patent Citations (1)
Title |
---|
"Manual for Oral Frailty in Dental Clinics", 1 January 2019, JAPAN DENTAL ASSOCIATION, JP, article "Part III. Evaluation of Oral Frailty", pages: 50 - 93, XP009542827 * |
Also Published As
Publication number | Publication date |
---|---|
US20240268705A1 (en) | 2024-08-15 |
JPWO2022254973A1 (ja) | 2022-12-08 |
CN117279568A (zh) | 2023-12-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Icht et al. | Oral-diadochokinesis rates across languages: English and Hebrew norms | |
Kent et al. | Speech impairment in Down syndrome: A review | |
Kent | Hearing and believing: Some limits to the auditory-perceptual assessment of speech and voice disorders | |
CN112135564B (zh) | Eating and swallowing function evaluation method, recording medium, evaluation device, and evaluation system | |
Munson | Variability in/s/production in children and adults | |
Gillespie et al. | Immediate effect of stimulability assessment on acoustic, aerodynamic, and patient-perceptual measures of voice | |
Kreiman et al. | Perception of voice quality | |
Kummer | Speech and resonance disorders related to cleft palate and velopharyngeal dysfunction: a guide to evaluation and treatment | |
Sussman et al. | The integrity of anticipatory coarticulation in fluent and non-fluent tokens of adults who stutter | |
Núñez-Batalla et al. | Validation of the spanish adaptation of the consensus auditory-perceptual evaluation of voice (CAPE-V) | |
Haley et al. | Speech metrics and samples that differentiate between nonfluent/agrammatic and logopenic variants of primary progressive aphasia | |
Syrika et al. | Acquisition of initial/s/-stop and stop-/s/sequences in Greek | |
Hybbinette et al. | Intra-and interjudge reliability of the apraxia of speech rating scale in early stroke patients | |
Ribeiro et al. | Exploiting ultrasound tongue imaging for the automatic detection of speech articulation errors | |
Namasivayam et al. | Development and validation of a probe word list to assess speech motor skills in children | |
Celata et al. | Nasal place assimilation between phonetics and phonology: An EPG study of Italian nasal-to-velar clusters | |
Lorenc et al. | Articulatory and acoustic variation in Polish palatalised retroflexes compared with plain ones | |
Jesus et al. | Discriminative segmental cues to vowel height and consonantal place and voicing in whispered speech | |
Chung | Acquisition and acoustic patterns of Southern American English/l/in young children | |
WO2022254973A1 (ja) | 口腔機能評価方法、プログラム、口腔機能評価装置および口腔機能評価システム | |
Ünal-Logacev et al. | A multimodal approach to the voicing contrast in Turkish: Evidence from simultaneous measures of acoustics, intraoral pressure and tongue palatal contacts | |
Carmichael | Introducing objective acoustic metrics for the Frenchay Dysarthria Assessment procedure | |
Lynce et al. | Phonological development in Portuguese deaf children with cochlear implants: Preliminary study | |
WO2023228615A1 (ja) | 音声特徴量算出方法、音声特徴量算出装置、及び、口腔機能評価装置 | |
WO2023203962A1 (ja) | 口腔機能評価装置、口腔機能評価システム、及び、口腔機能評価方法 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22815729; Country of ref document: EP; Kind code of ref document: A1 |
| WWE | Wipo information: entry into national phase | Ref document number: 2023525648; Country of ref document: JP |
| WWE | Wipo information: entry into national phase | Ref document number: 202280033469.8; Country of ref document: CN |
| WWE | Wipo information: entry into national phase | Ref document number: 18563251; Country of ref document: US |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 22815729; Country of ref document: EP; Kind code of ref document: A1 |