TWI811097B - Method and apparatus for determining a degree of dementia of a user - Google Patents

Method and apparatus for determining a degree of dementia of a user

Info

Publication number: TWI811097B
Authority: TW (Taiwan)
Prior art keywords: test, dementia, user, degree, neural network
Application number: TW111134144A
Other languages: Chinese (zh)
Other versions: TW202312186A (en)
Inventors: 金亨俊, 林俊植, 洪秀勳, 白贊銀
Original Assignee: 南韓商智聰醫治股份有限公司
Application filed by 南韓商智聰醫治股份有限公司
Publication of TW202312186A
Application granted
Publication of TWI811097B

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Pathology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Surgery (AREA)
  • Animal Behavior & Ethology (AREA)
  • Veterinary Medicine (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Psychiatry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physiology (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • Databases & Information Systems (AREA)
  • Psychology (AREA)
  • Child & Adolescent Psychology (AREA)
  • Developmental Disabilities (AREA)
  • Hospice & Palliative Care (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Neurosurgery (AREA)
  • Fuzzy Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Signal Processing (AREA)

Abstract

To determine a user's degree of dementia, content is output through a user terminal, the user's responses to the content are received in sequence, biomarker information is generated by visualizing the responses, and the user's degree of dementia is determined from the biomarker information using a convolutional neural network (CNN) and a deep neural network (DNN).

Description

Method and apparatus for determining the degree of dementia of a user

The present disclosure relates to techniques for determining a user's degree of dementia and, more particularly, to an apparatus and method that provide content to a user and determine the user's degree of dementia based on the user's responses to the provided content.

As society ages, dementia has become one of the most serious diseases among the elderly; its incidence has risen rapidly over the past decade, and its social and economic costs continue to grow. Because patients cannot live independently and may go missing or harm themselves, dementia causes great suffering not only for the patients themselves but also for the family members who care for them. Early diagnosis and appropriate treatment can prevent or delay further cognitive decline, but existing approaches to early diagnosis of this disease have several problems. In the past, patients had to visit a professional medical institution such as a hospital, so many of those who sought care after noticing worsening forgetfulness had already progressed to mild cognitive impairment (MCI) or Alzheimer's disease (AD). The neurocognitive function tests used for diagnosis (SNSB-II, CERAD-K, etc.) are highly reliable only when administered by medical staff with sufficient experience and expertise, and diagnostic procedures such as magnetic resonance imaging (MRI), single-photon emission computed tomography (SPECT), positron emission tomography (PET), and cerebrospinal fluid testing are very expensive and cause considerable inconvenience to the patients being examined.

Technical Problem

An embodiment may provide an apparatus and method for determining a user's degree of dementia.

An embodiment may provide an apparatus and method for determining a user's degree of dementia based on the user's voice.

Technical Solution

According to an embodiment, a method for determining a user's degree of dementia, performed by an electronic device, includes the following steps: outputting, through a user terminal, first content pre-produced for determining the user's degree of dementia;

receiving a first voice of the user for the first content, acquired through a microphone of the user terminal; outputting pre-produced second content through the user terminal; receiving a second voice of the user for the second content, acquired through the microphone; generating a first spectrogram image by visualizing at least one feature of the first voice; generating a second spectrogram image by visualizing at least one feature of the second voice; generating a preset number of first features for the first voice by inputting the first spectrogram image into a pre-updated first convolutional neural network (CNN); generating a preset number of second features for the second voice by inputting the second spectrogram image into a pre-updated second CNN; determining a preset number of target features from among the first features and the second features; and determining the user's degree of dementia by inputting the target features into a pre-updated deep neural network (DNN), wherein the determined degree of dementia may be output through the user terminal.
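Taken together, the claimed steps form a pipeline: two voices, two spectrogram images, two CNN feature sets, target-feature selection, and a DNN classification. A minimal sketch of that flow is shown below; the 64-feature width, the magnitude-based selection rule, and the random stand-ins for the trained networks are illustrative assumptions, not the patent's models.

```python
import numpy as np

def cnn_features(spectrogram_img, seed):
    # stand-in for a pre-updated CNN: maps a spectrogram image to a
    # preset number of features (64 here is an assumption)
    rng = np.random.default_rng(seed)
    return rng.standard_normal(64)

def determine_dementia_degree(first_spec, second_spec, n_target=64):
    first_feats = cnn_features(first_spec, seed=0)    # first CNN output
    second_feats = cnn_features(second_spec, seed=1)  # second CNN output
    pooled = np.concatenate([first_feats, second_feats])
    # selection rule is an assumption: keep the n_target features
    # with the largest magnitude
    target = pooled[np.argsort(-np.abs(pooled))[:n_target]]
    # stand-in for the pre-updated DNN: returns one of three classes
    return ["normal", "MCI", "AD"][int(abs(target.sum())) % 3]

degree = determine_dementia_degree(np.zeros((224, 224)), np.zeros((224, 224)))
print(degree)
```

The sketch only fixes the data flow between the stages; in the embodiment each stage is a trained network rather than a random projection.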

The first content may include an instruction for receiving the first voice.

The first content may be one of: content that asks the user to repeat a sentence, content that asks the user to name an output image, content that asks the user to describe an output image, content for verbal fluency, content for numerical calculation, and content that elicits storytelling.

The step of generating the first spectrogram image by visualizing at least one feature of the first voice may include generating the first spectrogram image of the first voice using the librosa tool.
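For illustration, a log-magnitude spectrogram can be computed by framing the waveform, windowing each frame, and taking the FFT magnitude of each frame. The NumPy sketch below shows the idea; the embodiment uses the librosa tool (e.g., its mel spectrogram) rather than this plain spectrogram, and the synthetic sine wave stands in for recorded speech.

```python
import numpy as np

def spectrogram_image(signal, frame_len=512, hop=256):
    """Minimal log-magnitude spectrogram: frame, window, FFT, dB."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))   # (frames, freq bins)
    return 10.0 * np.log10(mag.T + 1e-10)       # (freq bins, frames), in dB

sr = 16000
t = np.linspace(0, 1.0, sr, endpoint=False)
voice_like = np.sin(2 * np.pi * 220 * t)        # stand-in for a user's voice
img = spectrogram_image(voice_like)
print(img.shape)
```

The resulting 2-D array can then be rendered and saved as the spectrogram image that is fed to the CNN.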

The size of the first spectrogram image and the size of the second spectrogram image may be the same.

The first CNN may be pre-updated based on the VGG16 model.

The first CNN may generate the first features of the first spectrogram image through a structure that includes an input layer, five pre-convolutional layer blocks, a fully connected layer, and two post-convolutional layer blocks, and that does not include a softmax.

The method for determining the user's degree of dementia may further include the step of updating the first CNN.

The step of updating the first CNN may include the following steps: receiving a first test voice of a test user for the first content; generating a first test spectrogram image by visualizing at least one feature of the first test voice, wherein the first test spectrogram image is labeled with the test user's ground-truth (GT) degree of dementia; determining a first test degree of dementia of the test user by inputting the first test spectrogram image into a complete first CNN, wherein the complete first CNN includes an input layer, one or more pre-convolutional layer blocks, a fully connected layer, one or more post-convolutional layer blocks, and a softmax; and updating the complete first CNN based on the first test degree of dementia and the GT degree of dementia, wherein the first CNN includes, from among the layers of the updated complete first CNN, only the input layer, the one or more pre-convolutional layer blocks, the fully connected layer, and the one or more post-convolutional layer blocks.
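The update scheme, which trains a complete network ending in a softmax against GT labels and then keeps only the layers before the softmax as the feature extractor, can be illustrated with a toy fully connected stand-in. The data, layer sizes, and learning rate below are invented for the sketch, and only the softmax head is updated here to keep it short; a real implementation would also update the convolutional blocks.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy stand-ins: 8 labeled "test spectrogram images" (flattened) with
# GT dementia degrees in {0: normal, 1: MCI, 2: AD} (illustrative)
X = rng.standard_normal((8, 32))
y = rng.integers(0, 3, size=8)

# "complete CNN" stand-in: hidden layer (the part kept later) + softmax head
W_hidden = rng.standard_normal((32, 16)) * 0.1
W_head = rng.standard_normal((16, 3)) * 0.1

def hidden(x):
    # layers retained in the deployed first CNN (ReLU stand-in)
    return np.maximum(x @ W_hidden, 0.0)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# update the complete model from (test degree, GT degree) via cross-entropy
for _ in range(200):
    h = hidden(X)
    p = softmax(h @ W_head)                 # first test dementia degree (probabilities)
    grad = p.copy()
    grad[np.arange(len(y)), y] -= 1.0       # dL/dlogits for cross-entropy
    W_head -= 0.1 * (h.T @ grad) / len(y)

# deployment: keep only the layers before the softmax as the feature extractor
features = hidden(X)
print(features.shape)
```

After the update, the softmax head is discarded and `hidden` plays the role of the first CNN that emits the preset number of features.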

The method for determining the user's degree of dementia may further include updating the DNN after the updating of a plurality of CNNs, including the first CNN and the second CNN, is completed.

The step of updating the DNN may include the following steps: determining a preset number of test target features from among a preset number of first test features generated based on a first test spectrogram image and a preset number of second test features generated based on a second test spectrogram image, wherein the test target features are labeled with the test user's GT degree of dementia; determining a second test degree of dementia of the test user by inputting the test target features into the DNN; and updating the DNN based on the second test degree of dementia and the GT degree of dementia.
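A toy sketch of the DNN update on labeled test target features follows. The magnitude-based selection rule, the single softmax layer standing in for the DNN, and the label values are illustrative assumptions, not details fixed by the claim.

```python
import numpy as np

rng = np.random.default_rng(1)

first_test_feats = rng.standard_normal(64)    # from the first CNN (stand-in)
second_test_feats = rng.standard_normal(64)   # from the second CNN (stand-in)
gt_degree = 1                                  # GT label: MCI (illustrative)

# choose the preset number of test target features; the selection
# rule (largest magnitude) is an assumption
pooled = np.concatenate([first_test_feats, second_test_feats])
target = pooled[np.argsort(-np.abs(pooled))[:64]]

# one-layer softmax "DNN" stand-in, updated from (test degree, GT degree)
W = rng.standard_normal((64, 3)) * 0.01
for _ in range(100):
    logits = target @ W
    p = np.exp(logits - logits.max())
    p /= p.sum()                     # second test dementia degree (probabilities)
    grad = p.copy()
    grad[gt_degree] -= 1.0           # cross-entropy gradient w.r.t. logits
    W -= 0.1 * np.outer(target, grad)

print(int(np.argmax(target @ W)))
```

After the update, the DNN's prediction on the labeled target features matches the GT degree, which is the training signal the claim describes.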

According to an embodiment, an apparatus for determining a user's degree of dementia includes: a memory that stores a program for determining the user's degree of dementia; and a processor that executes the program, wherein the program performs the following steps: outputting, through a user terminal, first content pre-produced for determining the user's degree of dementia; receiving a first voice of the user for the first content, acquired through a microphone of the user terminal; outputting pre-produced second content through the user terminal; receiving a second voice of the user for the second content, acquired through the microphone; generating a first spectrogram image by visualizing at least one feature of the first voice; generating a second spectrogram image by visualizing at least one feature of the second voice; and generating a preset number of first features for the first voice by inputting the first spectrogram image into a pre-updated first convolutional neural network (CNN);

generating a preset number of second features for the second voice by inputting the second spectrogram image into a pre-updated second CNN; determining a preset number of target features from among the first features and the second features; and determining the user's degree of dementia by inputting the target features into a pre-updated deep neural network (DNN), wherein the determined degree of dementia may be output through the user terminal.

According to an embodiment, a method, performed by an electronic device, for updating a convolutional neural network (CNN) used to determine a user's degree of dementia includes the following steps: outputting, through a user terminal, first content pre-produced for determining the user's degree of dementia; receiving a first test voice of a test user for the first content; generating a first test spectrogram image by visualizing at least one feature of the first test voice, wherein the first test spectrogram image is labeled with the test user's GT degree of dementia; determining a test degree of dementia of the test user by inputting the first test spectrogram image into a complete CNN, wherein the complete CNN includes an input layer, one or more pre-convolutional layer blocks, a fully connected layer, one or more post-convolutional layer blocks, and a softmax; and updating the complete CNN based on the test degree of dementia and the GT degree of dementia, wherein the CNN may include, from among the layers of the updated complete CNN, only the input layer, the one or more pre-convolutional layer blocks, the fully connected layer, and the one or more post-convolutional layer blocks.

According to an embodiment, an electronic device for updating a convolutional neural network (CNN) used to determine a user's degree of dementia includes: a memory that stores a program for updating the CNN; and a processor that executes the program, wherein the processor performs the following steps: outputting, through a user terminal, first content pre-produced for determining the user's degree of dementia; receiving a first test voice of a test user for the first content; generating a first test spectrogram image by visualizing at least one feature of the first test voice, wherein the first test spectrogram image is labeled with the test user's GT degree of dementia; determining a test degree of dementia of the test user by inputting the first test spectrogram image into a complete CNN, wherein the complete CNN includes an input layer, one or more pre-convolutional layer blocks, a fully connected layer, one or more post-convolutional layer blocks, and a softmax; and updating the complete CNN based on the test degree of dementia and the GT degree of dementia, wherein the CNN may include, from among the layers of the updated complete CNN, only the input layer, the one or more pre-convolutional layer blocks, the fully connected layer, and the one or more post-convolutional layer blocks.

Technical Effects

An apparatus and method for determining a user's degree of dementia may be provided.

An apparatus and method for determining a user's degree of dementia based on the user's voice may be provided.

Hereinafter, embodiments will be described in detail with reference to the drawings. However, the scope of the claims is not limited or restricted by these embodiments. Like reference numerals in the drawings denote like elements.

Various modifications may be made to the following embodiments. The embodiments described below are not intended to be limiting and should be understood to include all modifications, equivalents, and substitutes thereof.

The terms used in the embodiments are for describing the specific embodiments only and are not intended to limit the embodiments. Unless the context clearly indicates otherwise, singular expressions include plural meanings. In this specification, terms such as "include" or "have" indicate the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and do not exclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless otherwise defined, all terms used herein, including technical and scientific terms, have the meanings commonly understood by those of ordinary skill in the art. Commonly used terms, as defined in dictionaries, should be interpreted as having meanings consistent with their meaning in the context of the related art, and should not be interpreted in an idealized or overly formal sense unless expressly so defined in this application.

In the description made with reference to the drawings, the same components are given the same reference numerals regardless of the figure, and redundant descriptions thereof are omitted. In describing the embodiments, detailed descriptions of related well-known technologies are omitted when it is judged that they would unnecessarily obscure the embodiments.

FIG. 1 is a block diagram illustrating a system for determining a user's degree of dementia, according to an example.

According to an aspect, the system for determining a user's degree of dementia may include an electronic device 110 for determining the user's degree of dementia, a user terminal 120 for outputting content, and a monitoring terminal 130 for a medical institution. For example, the electronic device 110 may be a server.

The electronic device 110 may provide pre-produced content to the user terminal 120 to determine the user's degree of dementia. For example, the content may be content for acquiring a voice from the user. Content for acquiring the user's voice is described in detail below with reference to FIG. 5.

The user terminal 120 may be connected to the electronic device 110, offline or online, so that they can communicate with each other. The electronic device 110 provides content to the user terminal 120, and the user terminal 120 outputs the content to the user through a display. For example, the user terminal 120 may acquire the user's voice through a microphone as a response to the content and transmit the acquired voice to the electronic device 110.

The electronic device 110 may determine the user's degree of dementia based on the user's voice and transmit the determined degree of dementia to the user terminal 120.

The user terminal 120 may be a mobile terminal such as a tablet computer or a smartphone. When the user terminal 120 is a mobile terminal, the user can measure the degree of dementia at low cost, without constraints of time or place.

Hereinafter, a method for determining a user's degree of dementia is described in detail with reference to FIGS. 2 to 17.

FIG. 2 illustrates images output to a user terminal to determine a user's degree of dementia, according to an example.

The images below (210 to 240) may be images of an application for determining the degree of dementia. For example, a user of the electronic device 110 may create and distribute the application, and the user may execute the application through the user terminal 120.

The first image 210 is the start screen of the application.

The second image 220 shows the functions supported by the application.

The third image 230 is an example of content provided to the user. A plurality of content items may be provided to the user.

The fourth image 240 shows the determined degree of dementia of the user. For example, normal, mild cognitive impairment (MCI), or Alzheimer's disease (AD) may be output as the determined degree of dementia. In addition to the degree of concern for each individual condition, a comprehensive judgment may be output together.

FIG. 3 is a block diagram illustrating an electronic device for determining a user's degree of dementia, according to an embodiment.

The electronic device 300 includes a communication unit 310, a processor 320, and a memory 330. For example, the electronic device 300 may be the electronic device 110 described above with reference to FIG. 1.

The communication unit 310 is connected to the processor 320 and the memory 330 to transmit and receive data. The communication unit 310 may also be connected to another external device to transmit and receive data. Hereinafter, the expression "transmitting and receiving A" may mean transmitting and receiving "information or data representing A".

The communication unit 310 may be implemented as circuitry in the electronic device 300. For example, the communication unit 310 may include an internal bus and an external bus. As another example, the communication unit 310 may be an element that connects the electronic device 300 and an external device. The communication unit 310 may be an interface. The communication unit 310 may receive data from an external device and transmit the data to the processor 320 and the memory 330.

The processor 320 processes data received by the communication unit 310 and data stored in the memory 330. A "processor" may be a data processing device implemented in hardware, having circuits with a physical structure for performing desired operations. For example, the desired operations may include code or instructions included in a program. Examples of a data processing device implemented in hardware include a microprocessor, a central processing unit, a processor core, a multi-core processor, a multiprocessor, an application-specific integrated circuit (ASIC), and a field-programmable gate array (FPGA).

The processor 320 executes computer-readable code (e.g., software) stored in a memory (e.g., the memory 330) and instructions triggered by the processor 320.

The memory 330 stores data received by the communication unit 310 and data processed by the processor 320. For example, the memory 330 may store a program (or an application, or software). The stored program may be a set of syntax that is coded to determine the user's degree of dementia and is executable by the processor 320.

According to an aspect, the memory 330 may include one or more of a volatile memory, a non-volatile memory, a random-access memory (RAM), a flash memory, a hard disk drive, and an optical disk drive.

The memory 330 stores an instruction set (e.g., software) for operating the electronic device 300. The instruction set for operating the electronic device 300 is executed by the processor 320.

The communication unit 310, the processor 320, and the memory 330 are described in detail below with reference to FIGS. 4 to 17.

FIG. 4 is a flowchart illustrating a method for determining a user's degree of dementia, according to an embodiment.

Steps 410 to 440 below are performed by the electronic device 300 described above with reference to FIG. 3.

In step 410, the electronic device 300 outputs, through a display of a user terminal (e.g., the user terminal 120), content pre-produced for determining the user's degree of dementia. The content is output to the user terminal, and the user reacts to the content.

The user terminal may receive the user's voice as the above reaction. For example, the user terminal may generate the voice as the reaction by using a microphone, and the generated voice may be in the form of a data file.

A plurality of content items may be provided to the user, and a user voice may be generated for each of the plurality of content items.

According to an embodiment, content for generating user voices is described using [Table 1] below.

[Table 1] Voice tasks and instructions

Step 1. Repeat a sentence: "Now, please listen carefully to the sentence I say and repeat it. After each sentence, begin after the beep. 'In the yard, roses are in full bloom.'"
Step 2. Repeat a sentence: "Again, please listen carefully to the sentence I say and repeat it. After each sentence, begin after the beep. 'Yesterday, it rained, and I stayed at home.'"
Step 3. Repeat a sentence: "Again, please listen carefully to the sentence I say and repeat it. After each sentence, begin after the beep. 'Walls have cracks, and walls have ears.'"
Step 4. Naming: "Next, please say the names of the animals you see. After the beep, say the names of the animals you see, one by one."
Step 5. Picture description: "Next, please look at the picture and describe it in as much detail as possible within one minute. Describe where this is, what is there, what the animals or people are doing, and so on, in as much detail as possible. Begin after the beep."
Step 6. Verbal fluency (phonemic form): "Next, please say words that begin with the letter shown. For example, if you see the letter 'a', say as many words beginning with 'a' as you can, such as apple, ant, or astronaut. Are there any other words beginning with 'a'? Now say words beginning with a different letter, 'b'. You have one minute; say as many words beginning with 'b' as you can. Ready? Begin after the beep."
Step 7. Verbal fluency (semantic form): "When we give you a category, name things belonging to that category as quickly as possible. For example, if we say 'animals', you can name dogs, cats, lions, and so on. Is there anything else that is an animal? Now name everything belonging to a different category: fruits. You have one minute; name all the fruits you can think of within one minute. Ready? Begin after the beep."
Step 8. Subtraction: "Now, a simple calculation problem. What is 100 minus 3? 100 minus 3 is 97. Then subtract 3 from that. 97 minus 3 is 94. Please keep subtracting 3. Starting from 100, keep subtracting 3. Ready? Begin after the beep."
Step 9. Storytelling (positive): "What is the happiest thing you have experienced so far? Please tell us about it in as much detail as possible within one minute. Begin after the beep."
Step 10. Storytelling (negative): "What is the saddest thing you have experienced so far? Please tell us about it in as much detail as possible within one minute. Begin after the beep."
Step 11. Storytelling (recall): "What happened yesterday? Please tell us what happened yesterday in as much detail as possible within one minute. Begin after the beep."

In step 420, the electronic device 300 continuously receives, from the user terminal, the user's reactions to viewing the content. For example, the electronic device 300 may receive the user's voice for the content, acquired through the microphone of the user terminal.

When a plurality of content items are produced, steps 410 and 420 may be performed repeatedly. Steps 410 and 420 are repeated to receive user voices for the plurality of content items. For example, the plurality of content items may include first to eleventh content for receiving user voices, and first to eleventh voices for the first to eleventh content may be received.

在步驟430中,電子裝置300基於接收到的反應來生成生物標記訊息。In step 430, the electronic device 300 generates a biomarker message based on the received response.

根據一實施例,電子裝置300透過可視化所接收的語音的至少一個特徵來生成語音的頻譜圖(spectrogram)圖像作為生物標記訊息。例如,電子裝置300可以透過librosa工具來生成語音的頻譜圖圖像。頻譜圖圖像可以是梅爾(mel)頻譜圖圖像。According to an embodiment, the electronic device 300 generates a spectrogram image of the speech as the biomarker information by visualizing at least one feature of the received speech. For example, the electronic device 300 can generate a speech spectrogram image through the librosa tool. The spectrogram image may be a mel spectrogram image.
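The patent generates mel spectrogram images with the librosa tool; as a rough, library-free sketch of the underlying idea (time on one axis, frequency on the other, amplitude rendered as intensity), a plain STFT magnitude spectrogram can be computed with NumPy alone. The frame length, hop size, sample rate, and test tone below are illustrative assumptions, not values from the patent.

```python
import numpy as np

def stft_spectrogram(signal, frame_len=512, hop=256):
    """Split the signal into overlapping frames, window each frame,
    and take the FFT magnitude -> (n_freq_bins, n_frames) array."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # rfft keeps only non-negative frequencies; rows = frequency bins
    return np.abs(np.fft.rfft(frames, axis=1)).T

# 1 second of a 440 Hz tone sampled at 16 kHz as a stand-in for speech
sr = 16000
t = np.arange(sr) / sr
spec = stft_spectrogram(np.sin(2 * np.pi * 440 * t))
print(spec.shape)  # (257, 61): frame_len // 2 + 1 frequency bins x frames
```

In librosa itself this corresponds to `librosa.feature.melspectrogram`, which additionally maps the frequency bins onto the mel scale.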

For example, a spectrogram image may be generated for each of the first to eleventh voices. The spectrogram image is described in detail below with reference to FIG. 6.

In step 440, the electronic device 300 determines the user's degree of dementia based on the biomarker information.

According to an embodiment, the electronic device 300 may determine the user's degree of dementia by inputting the spectrogram image, as the biomarker information, into a preset dementia-degree classification model. For example, the dementia-degree classification model may be pre-trained based on a neural network. A method of determining the user's degree of dementia based on the dementia-degree classification model is described in detail below with reference to FIGS. 7 to 12.

After step 440 is performed, the electronic device 300 may output the determined degree of dementia through the user terminal.

FIG. 5 illustrates content pre-produced to receive user speech, according to an example.

For example, the content 500 provided to the user may be content in which the user guesses the names of the displayed images 520, 530, and 540. In addition to the images 520, 530, and 540, the content 500 may include an instruction 510 prompting the user's speech for the content 500. The instruction 510 may be displayed as text or output as speech. The user generates speech by saying the names of the images 520, 530, and 540.

Although an example of content for receiving the user's speech has been described with reference to FIG. 5, content may be produced in various ways depending on the user speech to be measured. For example, the content may ask the user to say the value obtained by subtracting 3 from 100, to repeat a sound that was played, or to say as many words beginning with "b" as possible within a given time.

According to an embodiment, a preset number of contents (for example, 11) may be provided to the user in sequence through the user terminal. The user terminal may generate a plurality of voice data files by recording the user's speech for each content, and send the generated voice data files to the electronic device 300.

FIG. 6 shows a raw spectrogram image generated for speech, according to an example.

According to an aspect, the electronic device 300 may generate the raw spectrogram image 600 of the speech through the librosa tool. The horizontal axis of the raw spectrogram image 600 may be the time axis and the vertical axis the frequency axis. The raw spectrogram image 600 represents the amplitude at each point along the time and frequency axes as a difference in print density or display color; the display color at a position may be determined by the magnitude of the amplitude there. For example, a legend 610 mapping amplitude magnitudes to display colors may be output together with the raw spectrogram image 600. To display the determined color, the R, G, and B channel values of the pixel at the corresponding coordinates may be set.

According to an embodiment, the spectrogram images to be input to the model may be generated from the raw spectrogram images of the respective voices. For example, a first spectrogram image may be generated for the first voice and a second spectrogram image for the second voice. The scales of the time and frequency axes of the raw spectrogram images may differ according to the total duration of each voice, but the finally generated spectrogram images may all have the same size.

According to an embodiment, the electronic device 300 may convert the first raw spectrogram image generated for the first content into a first adjusted spectrogram image having a first time range preset for the first content. For example, the first time range of the first content may be preset based on the average response time of multiple users to the first content; for instance, it may be the sum of the average response time and the median response time (or the standard deviation of the response times).

According to an embodiment, statistics on users' response times to the first to fourth contents are shown in [Table 2] below.

[Table 2]

|                       | First content | Second content | Third content | Fourth content |
|-----------------------|---------------|----------------|---------------|----------------|
| Average response time | 6.423967571   | 6.517637474    | 7.738273502   | 10.29516905    |
| Standard deviation    | 9.641737921   | 9.008077433    | 9.55999683    | 10.85853336    |
| Median                | 4.120746667   | 4.47616        | 5.258986667   | 6.893226667    |
| Maximum               | 60.78869333   | 60.74616889    | 60.76743111   | 59.97546667    |
| Minimum               | 1.3056        | 1.314133333    | 1.258666667   | 1.32096        |

For example, according to [Table 2] above, when the first time range of the first content is determined as the sum of the average response time and the median, that sum is 10.544714238 seconds, and 9 seconds, which does not differ significantly from 10.544714238, may be set as the first time range.

For example, when the first raw spectrogram image for the user's response to the first content is 10 seconds long, the first adjusted spectrogram image may be generated by cutting off the portion beyond 9 seconds. As another example, when the first raw spectrogram image is 8 seconds long, the first adjusted spectrogram image may be generated by appending silence from 8 to 9 seconds.
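The crop-or-pad adjustment described above can be sketched directly on the audio samples; the 16 kHz sample rate below is an illustrative assumption, not a value from the patent.

```python
def fit_to_time_range(samples, time_range_s, sr=16000):
    """Crop audio beyond the preset time range, or pad it with silence
    (zeros) up to that range, so every recording spans the same duration."""
    target = int(time_range_s * sr)
    if len(samples) >= target:
        return samples[:target]                       # cut everything past the range
    return samples + [0.0] * (target - len(samples))  # append silence

nine_s = fit_to_time_range([0.1] * (10 * 16000), 9)   # 10 s clip -> cropped to 9 s
eight_s = fit_to_time_range([0.1] * (8 * 16000), 9)   # 8 s clip -> padded to 9 s
print(len(nine_s), len(eight_s))  # both 9 * 16000 = 144000 samples
```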

According to an embodiment, first to eleventh adjusted spectrogram images may be generated for the first to eleventh contents, and each of them may represent a different time interval. Portions of the upper part of the images that vary little across the first to eleventh adjusted spectrogram images may be removed. After this removal, the first to eleventh adjusted spectrogram images may be image-processed so that they all have the same size. An image-processed adjusted spectrogram image may be referred to simply as a spectrogram image. For example, the first and second spectrogram images may both have a size of 300x300. The size of a spectrogram image may be expressed in pixels, and a pixel value may be represented by, for example, 16 bits.

Although examples of content for receiving the user's speech have been described with reference to FIGS. 5 and 6, content may be produced in various ways according to the user speech to be measured. For example, the content may include content asking the user to describe a photograph and content asking the user to read a displayed sentence aloud.

FIG. 7 is a flowchart illustrating a method of determining a user's degree of dementia using CNNs and a DNN, according to an example.

According to an aspect, step 440 described above with reference to FIG. 4 may include the following steps 710 to 730.

In step 710, the electronic device 300 generates a preset number of features for the content by inputting the spectrogram image into a pre-updated convolutional neural network (CNN) corresponding to that spectrogram image. The CNN used to generate the features may differ per content. For example, when there are 11 contents, there is one CNN corresponding to each of the 11 contents, and these 11 CNNs may be referred to as a CNN set. Hereinafter, the word "update" includes the meaning of the word "train," and the two are used interchangeably.

According to an aspect, the CNN may be pre-updated based on the VGG16 model. The CNN may be a part of a full CNN that comprises an input layer, one or more pre-convolutional layer blocks, a fully connected layer, one or more post-convolutional layer blocks, and a softmax. For example, the CNN may include the input layer, the pre-convolutional layer blocks, the fully connected layer, and the post-convolutional layer blocks, but not the softmax. Because the CNN does not include the softmax, it outputs a preset number of features used to compute the degree of dementia, rather than outputting a degree of dementia directly as the result of the input spectrogram image. The full CNN and the partial CNN are described in detail with reference to FIG. 8.

For example, the electronic device 300 generates a preset number of first features for the first content by inputting the first spectrogram image into the pre-updated first CNN, and generates a preset number of second features for the second content by inputting the second spectrogram image into the pre-updated second CNN. As a specific example, when 11 spectrogram images are received and 256 features are generated per spectrogram, a total of 2,816 features are generated.
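The concatenation of per-content CNN outputs into one feature vector can be sketched as follows, using the numbers from the example above (256 features per content, 11 contents); the zero vectors stand in for real CNN outputs.

```python
def concat_content_features(per_content_features, features_per_content=256):
    """Each per-content CNN emits a fixed-size feature vector; the vectors
    are concatenated in content order into one flat feature vector."""
    flat = []
    for features in per_content_features:
        assert len(features) == features_per_content  # preset per-content size
        flat.extend(features)
    return flat

# 11 contents x 256 features each -> 2816 features in total
features = concat_content_features([[0.0] * 256 for _ in range(11)])
print(len(features))  # 2816
```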

In step 720, the electronic device 300 determines target features among the features of the multiple contents (or of the corresponding spectrogram images). The determined target features may be markers for determining the degree of dementia; the features to be used as target features may be predetermined as markers through the steps of updating the CNNs and updating the deep neural network (DNN), described later with reference to FIGS. 12 to 16.

FIG. 8 shows a full CNN and a partial CNN capable of determining a user's degree of dementia, according to an example.

According to an aspect, the full CNN 800 includes an input layer 810, a first convolutional layer block 820, a second convolutional layer block 830, a third convolutional layer block 840, a fourth convolutional layer block 850, a fifth convolutional layer block 860, a fully connected layer 870, a sixth convolutional layer block 880, a seventh convolutional layer block 890, and a softmax 895. For ease of reference, the first to fifth convolutional layer blocks 820, 830, 840, 850, and 860, located before the fully connected layer 870, may be called pre-convolutional layer blocks, and the sixth and seventh convolutional layer blocks 880 and 890, located after the fully connected layer 870, may be called post-convolutional layer blocks.

According to an embodiment, a convolutional layer block may include one or more convolutional layers and pooling layers. In addition, each of the sixth convolutional layer block 880 and the seventh convolutional layer block 890 may further include a drop-out layer block.

The full CNN 800 may be a full CNN updated by the full-CNN update method described later with reference to FIG. 13. A different full CNN may be pre-updated for each content.

The partial CNN 805 may include only the input layer 810, the first to fifth convolutional layer blocks 820, 830, 840, 850, and 860, the fully connected layer 870, and the sixth and seventh convolutional layer blocks 880 and 890, without the softmax 895. That is, the partial CNN 805 may be the CNN obtained by removing the softmax 895 from the full CNN 800 after the update of the full CNN 800 is completed. For example, the CNN used in step 710 described above with reference to FIG. 7 may be the partial CNN 805.

Since the partial CNN 805 does not include the softmax 895, it can output various features of the spectrogram image.

FIG. 9 illustrates the features generated for each of a plurality of user image sets and the target features determined based on them, according to an example.

According to an aspect, a preset number of features of the target speech are generated through the target CNN corresponding to the target content. For example, the preset number of features may be 256. When there are n spectrogram images for the multiple contents, the total number of generated features 900 may be 256×n.

A preset number of target features 910 are determined among all the features 900. The determined target features 910 may be preset markers for determining the degree of dementia. The method of predetermining the target features 910 as markers is described in detail below with reference to step 1310 of FIG. 13.
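Selecting the target features 910 out of all the features 900 amounts to indexing into the concatenated feature vector at the predetermined marker positions. A minimal sketch; the marker indices below are hypothetical, not markers from the patent.

```python
def select_target_features(all_features, marker_indices):
    """Keep only the features whose positions were predetermined as
    markers (target features) during CNN/DNN updating."""
    return [all_features[i] for i in marker_indices]

all_features = list(range(2816))   # stand-in for 256 features x 11 contents
markers = [3, 300, 1024, 2815]     # hypothetical marker positions
print(select_target_features(all_features, markers))  # [3, 300, 1024, 2815]
```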

FIG. 10 illustrates a DNN for determining a user's degree of dementia, according to an example.

According to an aspect, the DNN for determining the user's degree of dementia may include an input layer 1010, one or more hidden layers 1020, 1030, and 1040, and an output layer 1050. For example, the DNN may be a DNN updated by the DNN update method described later with reference to FIG. 14.

The DNN may output the user's degree of dementia as its output for the input target features 910. The DNN may output any one of a plurality of preset degrees of dementia. For example, the preset degrees of dementia may include normal, mild cognitive impairment (MCI), and Alzheimer's disease (AD).
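A minimal sketch of such a DNN forward pass: input, ReLU hidden layers, and a softmax over the three preset degrees. The layer widths, activation choice, and random weights below are illustrative stand-ins, not the patent's trained parameters.

```python
import numpy as np

def dnn_forward(x, weights, biases):
    """Input layer -> hidden layers (ReLU) -> output layer, with a softmax
    over the three preset dementia degrees (normal, MCI, AD)."""
    for W, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(0.0, x @ W + b)          # hidden layer + ReLU
    logits = x @ weights[-1] + biases[-1]       # output layer: 3 logits
    e = np.exp(logits - logits.max())
    return e / e.sum()                          # probabilities over 3 classes

rng = np.random.default_rng(0)
sizes = [8, 16, 16, 3]   # illustrative layer widths, not from the patent
Ws = [rng.normal(size=(a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
bs = [np.zeros(b) for b in sizes[1:]]
probs = dnn_forward(rng.normal(size=8), Ws, bs)
print(probs.sum())  # the three class probabilities sum to 1
```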

FIG. 11 shows a two-step classification performed to improve the accuracy of determining the degree of dementia, according to an example.

Unlike a method that determines one of multiple degrees of dementia with a single model, a method that determines the degree of dementia step by step through multiple models can improve the accuracy of the determination.

For example, rather than using a single model to distinguish normal, mild cognitive impairment (MCI), and Alzheimer's disease (AD), the first classification stage determines normal versus abnormal (MCI and AD together), and the second classification stage distinguishes MCI from AD.

To use this method, the first CNN set and first DNN used in the first classification stage and the second CNN set and second DNN used in the second classification stage are prepared in advance, respectively.

For example, when steps 410 to 440 described with reference to FIG. 4 are performed for the first classification stage and the first stage determines the user's degree of dementia to be abnormal, step 440 may then be performed for the second classification stage. When the first stage determines the user's degree of dementia to be normal, the second classification stage need not be performed. The first CNN set and first DNN used for the first classification stage differ from the second CNN set and second DNN used for the second classification stage.
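The two-stage flow above can be sketched as follows. The stand-in stage models are hypothetical threshold rules used only to make the example runnable; in the patent each stage is a CNN set plus a DNN.

```python
def classify_two_stage(features, stage1_model, stage2_model):
    """Stage 1 separates normal from abnormal (MCI + AD); only abnormal
    cases continue to stage 2, which separates MCI from AD."""
    if stage1_model(features) == "normal":
        return "normal"            # the second stage is skipped entirely
    return stage2_model(features)  # "MCI" or "AD"

# hypothetical stand-in models for illustration
stage1 = lambda f: "normal" if sum(f) < 1.0 else "abnormal"
stage2 = lambda f: "MCI" if sum(f) < 2.0 else "AD"
print(classify_two_stage([0.2, 0.3], stage1, stage2))  # normal
print(classify_two_stage([0.8, 0.9], stage1, stage2))  # MCI
print(classify_two_stage([1.5, 1.5], stage1, stage2))  # AD
```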

FIG. 12 illustrates a two-step operation performed to improve the accuracy of determining the degree of dementia, according to another example.

Unlike a method that determines one of multiple degrees of dementia with a single model, a method that determines the degree of dementia through multiple models can improve the accuracy of the determination.

According to an embodiment, instead of determining normal, mild cognitive impairment (MCI), or Alzheimer's disease (AD) with one model, multiple models trained for different classification purposes are used to separately compute probabilities for normal, MCI, and AD, and the degree of dementia is determined based on the computed probabilities.

In step 1210, multiple models may be used to compute partial probabilities for each of normal, mild cognitive impairment (MCI), and Alzheimer's disease (AD) for the user. For example, when 256 features are generated for each of 11 spectrogram images, a total of 2,816 features are generated, and the 2,816 features may be input to each of the models. For example, the models may include a first model that classifies normal versus MCI and AD together, a second model that classifies normal versus AD, a third model that classifies normal versus MCI, and a fourth model that classifies MCI versus AD.

The first model may compute a first normal probability P_SCI1 and a first MCI probability P_MCI1 as partial probabilities; the first AD probability P_AD1 may be equal to the first MCI probability P_MCI1. The second model may compute a second normal probability P_SCI2 and a second AD probability P_AD2 as partial probabilities. The third model may compute a third normal probability P_SCI3 and a second MCI probability P_MCI2 as partial probabilities. The fourth model may compute a third MCI probability P_MCI3 and a third AD probability P_AD3 as partial probabilities.

To use this method, the first CNN set and first DNN used in the first model, the second CNN set and second DNN used in the second model, the third CNN set and third DNN used in the third model, and the fourth CNN set and fourth DNN used in the fourth model are prepared in advance.

In step 1220, a first probability of normal, a second probability of MCI, and a third probability of AD may be determined based on the partial probabilities computed by the models.

For example, the sum of the first, second, and third normal probabilities P_SCI1, P_SCI2, and P_SCI3 may be taken as the first probability (normal). The sum of the first, second, and third MCI probabilities P_MCI1, P_MCI2, and P_MCI3 may be taken as the second probability (MCI). The sum of the first, second, and third AD probabilities P_AD1, P_AD2, and P_AD3 may be taken as the third probability (AD).

In step 1230, the classification corresponding to the largest of the first, second, and third probabilities may be determined as the user's degree of dementia. For example, when the second probability is the largest of the three, the user's degree of dementia may be determined to be mild cognitive impairment (MCI).
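Steps 1220 and 1230 reduce to summing the per-class partial probabilities across the models and picking the class with the largest total. A sketch with hypothetical partial probabilities (note the first model reports one "abnormal" probability for both MCI and AD, so P_MCI1 == P_AD1):

```python
def combine_partial_probabilities(partials):
    """Sum, per class, the partial probabilities produced by the models,
    then pick the class with the largest total as the degree of dementia."""
    classes = ("normal", "MCI", "AD")
    totals = {c: sum(p.get(c, 0.0) for p in partials) for c in classes}
    return max(classes, key=lambda c: totals[c]), totals

# hypothetical partial probabilities from the four models
partials = [
    {"normal": 0.3, "MCI": 0.7, "AD": 0.7},   # model 1: P_MCI1 == P_AD1
    {"normal": 0.4, "AD": 0.6},               # model 2: normal vs AD
    {"normal": 0.2, "MCI": 0.8},              # model 3: normal vs MCI
    {"MCI": 0.9, "AD": 0.1},                  # model 4: MCI vs AD
]
degree, totals = combine_partial_probabilities(partials)
print(degree)  # MCI (largest summed probability: 0.7 + 0.8 + 0.9 = 2.4)
```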

According to an embodiment, four models were described with reference to FIG. 12 for determining the user's degree of dementia, but the number of models used to determine the degree of dementia is not limited to the disclosed embodiment. For example, two or more models may be used to determine the degree of dementia.

FIG. 13 is a flowchart illustrating a method for updating a full CNN, according to an example.

According to an aspect, the following step 1300 is performed in advance, before step 410 described above with reference to FIG. 4 is performed. Step 1300 relates to the method for updating the full CNN and may include the following steps 1310 to 1350.

In step 1310, the electronic device 300 outputs, to a test user, content pre-produced to determine a user's degree of dementia. For example, the electronic device 300 may output the content through the test user's user terminal.

A test user may be a person whose degree of dementia has been established by a doctor's professional diagnosis. For example, a test user may be normal, or may have mild cognitive impairment (MCI) or Alzheimer's disease (AD).

In step 1320, the electronic device 300 receives the test user's test speech for the content through the microphone of the user terminal. When multiple contents are provided, multiple test voices may be received.

In step 1330, the electronic device 300 generates a test spectrogram image of the test speech by visualizing at least one feature of the received test speech. The test spectrogram image may be labeled with the test user's ground-truth (GT) degree of dementia.

In step 1340, the electronic device 300 determines the test user's test degree of dementia by inputting the test spectrogram image into the full CNN. Since the full CNN includes the softmax, it can determine a test degree of dementia; for example, the determined test degree of dementia may be normal, mild cognitive impairment (MCI), or Alzheimer's disease (AD).

According to an embodiment, the first full CNN, corresponding to the first content, determines the test user's test degree of dementia based only on the first test spectrogram image, and the n-th full CNN, corresponding to the n-th content, determines the test user's degree of dementia based only on the n-th test spectrogram image.

In step 1350, the electronic device 300 updates the full CNN based on the test degree of dementia and the GT degree of dementia. For example, if there is a difference between the test degree of dementia and the GT degree of dementia, backpropagation may be performed using that difference as the error value to update the full CNN. The method of updating the full CNN may be supervised learning.
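Step 1350 backpropagates the difference between the predicted and GT degrees of dementia. As a heavily simplified, schematic stand-in (a single softmax layer rather than the patent's CNN), one supervised gradient step can look like this:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def supervised_step(w, x, gt_onehot, lr=0.1):
    """One schematic supervised update: forward pass, error between the
    prediction and the ground-truth (GT) label, gradient step on the
    weights. A stand-in for backpropagation, not the patent's model."""
    probs = softmax(w @ x)
    error = probs - gt_onehot            # the difference acts as the error value
    return w - lr * np.outer(error, x)   # propagate it back into the weights

rng = np.random.default_rng(1)
w = rng.normal(size=(3, 4))
x = rng.normal(size=4)                   # stand-in for one training example
gt = np.array([0.0, 1.0, 0.0])           # GT degree of dementia: class 1 (MCI)
for _ in range(200):
    w = supervised_step(w, x, gt)
print(int(np.argmax(softmax(w @ x))))    # repeated updates converge to the GT class
```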

In the embodiment of FIG. 8, when the full CNN 800 includes the input layer 810, the first to fifth convolutional layer blocks 820, 830, 840, 850, and 860, the fully connected layer 870, the sixth and seventh convolutional layer blocks 880 and 890, and the softmax 895, only the fifth convolutional layer block 860 may be updated while the remaining layers are left unchanged.

According to an embodiment, the full CNN may be updated repeatedly with a large number of test users, and the updating may be terminated when the output accuracy of the updated full CNN exceeds a preset threshold.

According to an aspect, when the degree of dementia is determined step by step through multiple models as described with reference to FIGS. 11 and 12, the first full CNN and the second full CNN used in the respective classification stages may each be updated separately to suit its stage. For example, the first full CNN is updated to determine normal versus abnormal (mild cognitive impairment (MCI) and Alzheimer's disease (AD) together), and the second full CNN is updated to distinguish MCI from AD.

The CNN used in step 710 may be the neural network obtained by removing the softmax (for example, the softmax 895) from the full CNN after the update of the full CNN is completed. That is, the CNN used in step 710 serves as a feature extractor for the corresponding spectrogram image.

圖14為示出根據一示例的更新DNN的方法的流程圖。FIG. 14 is a flowchart illustrating a method of updating a DNN according to an example.

根據一實施例,以下步驟1400涉及用於更新DNN的方法,在執行參照圖13描述的上述步驟1300之後以及在執行參照圖4描述的上步驟410之前,可以執行用於更新DNN的方法。例如,在完整的CNN(或CNN)的更新完成之後,可以執行步驟1400。According to an embodiment, the following step 1400 relates to a method for updating a DNN, which may be performed after performing the above step 1300 described with reference to FIG. 13 and before performing the above step 410 described with reference to FIG. 4 . For example, step 1400 may be performed after the update of the full CNN (or CNN) is completed.

步驟1400可以包括以下步驟(1410至1440)。Step 1400 may include the following steps (1410 to 1440).

在步驟1410中,電子裝置300基於根據第一測試頻譜圖圖像由第一CNN生成的預設數量的第一測試特徵及第二頻譜圖圖像,確定由第二CNN生成的預設數量的第二測試特徵中預設數量的測試目標特徵。儘管僅描述了第一測試特徵和第二測試特徵,但例如,當生成用於n個內容的n個測試頻譜圖圖像時,可以從第一測試特徵到第n個測試特徵中確定測試目標特徵。測試目標特徵可以是用於確定癡呆程度的標記。下面,將參照圖15及圖16來詳細描述確定測試目標特徵的方法。In step 1410, the electronic device 300 determines a preset number of features generated by the second CNN based on the preset number of first test features and the second spectrogram image generated by the first CNN according to the first test spectrogram image. A preset number of test target features in the second test feature. Although only the first test feature and the second test feature are described, for example, when generating n test spectrogram images for n contents, the test target can be determined from the first test feature to the nth test feature feature. The test target characteristic may be a marker for determining the degree of dementia. Hereinafter, a method of determining a characteristic of a test target will be described in detail with reference to FIGS. 15 and 16 .

The test target features may be labeled with the test user's ground-truth (GT) degree of dementia.

In step 1420, the electronic device 300 may verify the determined test target features. For example, the test target features may be verified through k-fold cross-validation. A method of verifying the test target features is described in detail below with reference to FIGS. 17 and 18. According to an embodiment, when the determined test target features do not require verification, step 1420 may be skipped.

Step 1430 may be performed when the test target features have been verified (or do not require verification). According to an example, when verification is required but the features are not verified, the CNN is considered to need re-updating, and step 1300 may be performed again.

In step 1430, the electronic device 300 determines the test user's test degree of dementia by inputting the test target features into the DNN. To distinguish it from the test degree of dementia determined in step 1340, the degree from step 1340 is referred to as the first test degree of dementia and the degree from step 1430 as the second test degree of dementia. When step 1430 is performed for the first time, the DNN used may be an initial or base DNN.

In step 1440, the electronic device 300 updates the DNN based on the second test degree of dementia and the GT degree of dementia. For example, when the two differ, backpropagation may be performed using the difference as an error value to update the DNN. The DNN may be updated through supervised learning.

According to an embodiment, the DNN may be updated repeatedly using a large number of test users, and the update may be terminated when the output accuracy of the updated DNN becomes greater than or equal to a preset threshold.
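The update loop of steps 1430–1440 and the threshold-based termination can be sketched as below. This is a deliberately minimal stand-in, not the patent's DNN: a single scalar weight plays the role of the network's parameters, the prediction error (predicted minus GT level) drives the update, and training stops once accuracy reaches a preset threshold.

```python
def train_dnn(samples, lr=0.1, threshold=0.9, max_rounds=100):
    weight = 0.0                                   # toy stand-in for DNN parameters
    for _ in range(max_rounds):
        for feature, gt_level in samples:          # one update pass over test users
            error = weight * feature - gt_level    # difference used as error value
            weight -= lr * error * feature         # backpropagation-style step
        # terminate once output accuracy reaches the preset threshold
        correct = sum(1 for f, gt in samples if abs(weight * f - gt) < 0.1)
        if correct / len(samples) >= threshold:
            break
    return weight

# (feature, ground-truth dementia level) pairs from hypothetical test users
samples = [(1.0, 2.0), (2.0, 4.0), (0.5, 1.0)]
trained_weight = train_dnn(samples)
```

The early-exit check mirrors the passage above: updates repeat over the test users until the updated model's accuracy meets the preset threshold.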

According to an embodiment, when the degree of dementia is determined in stages through multiple models, as in the methods described with reference to FIGS. 11 and 12, the first DNN and the second DNN used in the classification stages may each be updated to suit its own stage. For example, the first DNN is updated to classify normal versus abnormal (mild cognitive impairment (MCI) and Alzheimer's disease (AD)), and the second DNN is updated to distinguish MCI from AD.

FIG. 15 is a flowchart illustrating a method for determining test target features according to an example.

According to an aspect, step 1410 described with reference to FIG. 14 may include the following steps (1510 to 1550).

In step 1510, the overall test features, including the first test features and the second test features, may be divided into a plurality of sub-feature sets. For example, when there are 2,816 overall test features, each sub-feature set may be generated to include 200 test features, with the fifteenth sub-feature set including the remaining 16. Each of the overall test features may have an index number; the first sub-feature set then includes test features numbered 1 to 200.

In step 1520, a portion of the (e.g., 15) sub-feature sets is selected. For example, five of the first through fifteenth sub-feature sets may be selected; the five selected sub-feature sets then include a total of 1,000 test features. A method of selecting a portion of the sub-feature sets is described in detail below with reference to FIG. 16.
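The chunking of step 1510 can be sketched directly: 2,816 1-based feature indices split into consecutive sets of 200, leaving 16 indices in the fifteenth set.

```python
def split_into_sub_feature_sets(num_features, chunk_size=200):
    # 1-based index numbers standing in for the actual test features
    indices = list(range(1, num_features + 1))
    return [indices[i:i + chunk_size]
            for i in range(0, num_features, chunk_size)]

sub_feature_sets = split_into_sub_feature_sets(2816)
```

With 2,816 features and a chunk size of 200 this yields 15 sub-feature sets: fourteen of 200 indices and a final one of 16, matching the example above.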

In step 1530, the selected test features (e.g., 1,000) are divided again into a plurality of sub-feature sets. For example, with 1,000 selected features, each of the (e.g., 50) sub-feature sets may be generated to include 20 test features.

In step 1540, a portion of the (e.g., 50) sub-feature sets is selected. For example, ten of the first through fiftieth sub-feature sets may be selected; the ten selected sub-feature sets then include a total of 200 test features. The detailed description given below for step 1520 may be applied similarly to step 1540.

In step 1550, the test features included in the selected sub-feature sets are determined as the test target features. The index of each determined test target feature may be identified.

The determined test target features may be used as markers for determining a user's degree of dementia. For example, when the 4th, 46th, and 89th of the first features and the 78th and 157th of the second features are determined as test target features, the target features determined in step 720 described with reference to FIG. 7 likewise include the 4th, 46th, and 89th of the first features and the 78th and 157th of the second features.
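The text later generates the 2,816 features from eleven spectrogram images, so a global feature index can be mapped back to a per-content pair like "the 4th feature of the first features" in the example above. The even 256-per-content split assumed here is hypothetical; the patent does not state the per-image feature count.

```python
def locate_feature(global_index, per_content=256):
    # Hypothetical helper: map a global 1-based index to a
    # (content number, per-content feature number) pair, assuming
    # the 2,816 features split evenly across 11 contents (256 each).
    content = (global_index - 1) // per_content + 1
    feature = (global_index - 1) % per_content + 1
    return content, feature
```

Under this assumption, global index 4 is the 4th feature of the first content and global index 260 is the 4th feature of the second content.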

The specific numbers shown in the embodiment described with reference to FIG. 15 are examples, and they may vary according to the actual implementation.

FIG. 16 is a flowchart illustrating a method of selecting sub-feature sets according to an example.

According to an aspect, step 1520 described with reference to FIG. 15 may include the following steps (1610 to 1640).

Data from a large number of users is required to determine the test target features. Below, the process of determining the test target features is described using data from 1,000 users as an example; correct (ground-truth) values are provided together with the data of the 1,000 users.

For example, the 1,000 users may be divided into 600 training-data users, 200 validation-data users, and 200 test-data users. For each of the 600, 2,816 features may be generated from the first through eleventh spectrogram images, and 600 first sub-feature sets with specific indices (e.g., 1 to 200) may be generated. For example, 600 sets each of the first through fifteenth sub-feature sets are generated for the training data. Similarly, 200 sets each of the first through fifteenth sub-feature sets are generated for the validation data, and 200 sets each for the test data.

As another example, if the test target features do not need to be verified, the 1,000 users may be divided into 800 training-data users and 200 test-data users. For each of the 800, 2,816 features may be generated from the first through eleventh spectrogram images, and 800 first sub-feature sets with specific indices (e.g., 1 to 200) may be generated. For example, 800 sets each of the first through fifteenth sub-feature sets are generated for the training data; similarly, 200 sets each are generated for the test data.
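The two splits above can be sketched in a few lines: 600/200/200 when a validation split is kept, 800/200 when the test target features do not need verification. The integers stand in for per-user feature data.

```python
def split_users(users, with_validation=True):
    # 600 training / 200 validation / 200 test users, or
    # 800 training / 200 test users when no verification is needed
    if with_validation:
        return users[:600], users[600:800], users[800:]
    return users[:800], (), users[800:]

users = tuple(range(1000))     # stand-ins for per-user feature sets
train_u, val_u, test_u = split_users(users)
```

Real code would shuffle the users before splitting; a fixed ordering is used here only to keep the sketch deterministic.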

In step 1610, one training epoch of the initial DNN is performed based on the 600 first sub-feature sets of the training data (first training data) and the 200 first sub-feature sets of the validation data (first validation data). If the test target features do not need to be verified, the epoch may instead be performed based on the 800 first sub-feature sets of the training data. The weights of the edges or node parameters in the DNN are adjusted based on the 600 (or 800) first sub-feature sets. The DNN with adjusted weights outputs results for the input first validation data; the number of output results may be 200. By referring to the 200 output results, an administrator may adjust the preset number of epochs to be performed for training.

In step 1620, a preset number of training epochs is performed on the DNN. For example, 30 epochs may be performed. When the preset number of epochs has been performed, one round of learning (or training) may be considered complete.

In step 1630, a first learning accuracy may be calculated based on the 200 first sub-feature sets of the test data (first test data). For example, the first test data may be input to the trained DNN, and the accuracy of the 200 results may be calculated as the first learning accuracy.

Additional learning accuracies may be calculated by repeating steps 1610 to 1630 a preset number of times. Because the initial DNN provided in step 1610 differs each time, the training results may differ as well, so the learning accuracies of the repeated trainings will vary. When steps 1610 to 1630 are repeated 10 times, first through tenth learning accuracies may be calculated.

In step 1640, a first average learning accuracy for the first training data is calculated. For example, the mean of the first through tenth learning accuracies may be calculated as the first average learning accuracy.

For example, when steps 1610 to 1640 are performed on the first sub-feature set, which includes the features with indices 1 to 200, the first average learning accuracy of the first sub-feature set may be calculated.

As another example, when steps 1610 to 1640 are performed on the second sub-feature set, which includes the features with indices 201 to 400, the second average learning accuracy of the second sub-feature set may be calculated.

For example, first through fifteenth average learning accuracies may be calculated, one for each of the 15 sub-feature sets. The 5 sub-feature sets with the highest of the 15 average learning accuracies may then be selected.
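The repeat-train-average-select procedure of steps 1610–1640 plus the top-5 choice can be sketched as below. `train_once` is a hypothetical stand-in that fakes one training's accuracy; real code would run the epochs of steps 1610–1630 on the actual DNN.

```python
import random

def train_once(subset_id, run):
    # fake accuracy for one training of one sub-feature set: a per-set
    # baseline plus small run-to-run noise from a different random
    # initialisation (placeholder for a real DNN training)
    rng = random.Random(subset_id * 1000 + run)
    return 0.5 + 0.03 * subset_id + rng.uniform(-0.01, 0.01)

def select_top_subsets(num_subsets=15, runs=10, top_k=5):
    # average the first..tenth learning accuracies per sub-feature set
    averages = {
        sid: sum(train_once(sid, r) for r in range(runs)) / runs
        for sid in range(1, num_subsets + 1)
    }
    # keep the sub-feature sets with the highest average accuracy
    return sorted(averages, key=averages.get, reverse=True)[:top_k]

top_five = select_top_subsets()
```

With the fake accuracies above, the later subsets score higher, so the eleventh through fifteenth sets are selected; with real trainings the averages, and therefore the selection, would of course differ.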

As another example, the 15 sub-feature sets may be grouped into a preset number of groups, and a group-average learning accuracy may be calculated for each group. Sub-feature sets may then be selected by choosing some of the groups based on their group-average learning accuracies.

If 5 sub-feature sets are selected, 1,000 indices are selected. Because whole sub-feature sets are selected, the positional relationships among the features generated by the CNN from a spectrogram image can be taken into account automatically.

The description of steps 1610 to 1640 may be applied similarly to the details of step 1540.

FIG. 17 is a flowchart illustrating a method of verifying the test target features according to an example.

According to an aspect, step 1420 described with reference to FIG. 14 may include the following steps (1710 to 1730).

In step 1710, the electronic device 300 divides the test-target feature sets into K groups, where the test target features determined for each test user are defined as one set. For example, with 1,000 test users there are 1,000 test-target feature sets, which may be divided into K groups. K is a natural number of 2 or more; when K is 5, 5 groups of 200 sets each may be generated.

In step 1720, the electronic device 300 generates K test DNNs by updating the initial DNN separately based on the K groups. When the first through fifth groups are generated, the first test DNN may be updated using the second through fifth groups; the second test DNN using the first and third through fifth groups; the third test DNN using the first, second, fourth, and fifth groups; the fourth test DNN using the first through third and fifth groups; and the fifth test DNN using the first through fourth groups.

In step 1730, the electronic device 300 verifies the test target features based on the accuracies of the K test DNNs. In the embodiment above, by inputting the first group into the first test DNN, results for the first group may be output, and a first accuracy of those results may be calculated. Similarly, second through fifth accuracies may be calculated for each of the second through fifth test DNNs.

When the calculated mean of the first through fifth accuracies is greater than or equal to a preset threshold, the test target features may be determined to have been verified. When the mean is less than the preset threshold, the test target features may be determined not to have been verified; in that case, the CNNs that extract the test features may be re-updated.
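Steps 1710–1730 with K = 5 can be sketched as below. `update_dnn` and `measure_accuracy` are hypothetical placeholders for the real training and evaluation routines; the fold handling is the part the sketch actually demonstrates.

```python
def update_dnn(training_sets):
    # placeholder: "train" a trivial model on the K-1 training groups
    return {"trained_on": len(training_sets)}

def measure_accuracy(model, held_out_group):
    # placeholder: fake per-fold accuracy of the held-out group
    return 0.9

def k_fold_validate(sets, k=5, threshold=0.8):
    fold_size = len(sets) // k
    folds = [sets[i * fold_size:(i + 1) * fold_size] for i in range(k)]
    accuracies = []
    for i in range(k):
        # update the i-th test DNN on every group except the i-th ...
        training = [s for j, fold in enumerate(folds) if j != i for s in fold]
        model = update_dnn(training)
        # ... and score it on the held-out i-th group
        accuracies.append(measure_accuracy(model, folds[i]))
    mean_accuracy = sum(accuracies) / k
    return mean_accuracy >= threshold, accuracies

validated, fold_accuracies = k_fold_validate(list(range(1000)))
```

With 1,000 sets and K = 5, each fold holds 200 sets and each test DNN trains on the other 800, matching the description above; verification passes when the mean fold accuracy meets the preset threshold.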

FIG. 18 illustrates a K-fold cross-validation method for verifying the target features according to an example.

According to an example, the test-target feature sets 1810 may be divided into a first group 1801, a second group 1802, a third group 1803, a fourth group 1804, and a fifth group 1805. When the test-target feature sets 1810 comprise 1,000 sets, each of the groups 1801 to 1805 includes 200 sets. Each set includes the test target features of a particular test user.

The first test DNN 1820 may be updated using the second through fifth groups (1802 to 1805). For example, the first test DNN 1820 may be updated 800 times based on the 800 sets.

The updated first test DNN 1820 may receive the first group 1801 as input to determine the degrees of dementia of the test users in the first group 1801. For example, the first test DNN 1820 may determine 200 second test degrees of dementia for the 200 sets.

The accuracy of the first test DNN 1820 may be calculated based on the GT degree of dementia of each of the 200 sets in the first group 1801 and the 200 second test degrees of dementia. Similarly, the accuracies of the second through fifth test DNNs may be calculated. Finally, the test target features may be verified based on the average accuracy of the first through fifth test DNNs.

The embodiments described above may be implemented by hardware components, software components, and/or a combination of hardware components and software components. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field-programmable array (FPA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and one or more software applications executed on the OS. The processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, a single processing device is described, but one of ordinary skill in the art will appreciate that a processing device may include multiple processing elements and/or multiple types of processing elements. For example, a processing device may include multiple processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure a processing device to operate as desired or may command the processing device independently or collectively. To be interpreted by a processing device, or to provide commands or data to a processing device, software and/or data may be embodied permanently or temporarily in any type of machine, component, physical equipment, virtual equipment, computer storage medium or device, or transmitted signal wave. The software may be distributed over network-connected computer systems and stored or executed in a distributed manner. Software and data may be stored in one or more computer-readable storage media.

The methods according to the embodiments may be embodied in the form of program instructions executable by various computer means and recorded on computer-readable media. The computer-readable media may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the media may be specially designed and constructed for the embodiments, or may be known to and usable by those skilled in the computer-software art. Computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as read-only memory (ROM), random-access memory (RAM), and flash memory. Examples of program instructions include not only machine code produced by a compiler but also high-level language code executable by a computer using an interpreter. To perform the operations of the embodiments, the hardware devices may be configured to operate as one or more software modules, and vice versa.

In summary, although the embodiments have been described through a limited number of embodiments and drawings, those of ordinary skill in the art can make various modifications and variations to the above description. For example, the same results may be achieved even if the described techniques are performed in an order different from the described method, and/or the described components are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Accordingly, other implementations, other embodiments, and equivalents of the claims all fall within the scope of the claims.

110: Electronic device

120: User terminal

130: Monitoring terminal

210: First image

220: Second image

230: Third image

240: Fourth image

300: Electronic device

310: Communication unit

320: Processor

330: Memory

410~440, 710~730, 800, 805, 810~890, 895, 1300, 1310~1350, 1400, 1410~1440, 1450, 1510~1540, 1610~1640, 1710~1730: Steps

500: Content

510: Instruction

520, 530, 540: Images

600: Original spectrogram image

610: Legend

900: All features

910: Target features

1010: Input layer

1020, 1030, 1040: Hidden layers

1050: Output layer

1210~1230: Steps

1801: First group

1802: Second group

1803: Third group

1804: Fourth group

1805: Fifth group

1810: Test-target feature sets

1820: First test DNN

FIG. 1 is a block diagram illustrating a system for determining a user's degree of dementia according to an example.
FIG. 2 illustrates images output to a user terminal to determine a user's degree of dementia according to an example.
FIG. 3 is a block diagram illustrating an electronic device for determining a user's degree of dementia according to an embodiment.
FIG. 4 is a flowchart illustrating a method for determining a user's degree of dementia according to an embodiment.
FIG. 5 illustrates content pre-produced to receive a user's voice according to an example.
FIG. 6 illustrates an original spectrogram image generated for a voice according to an example.
FIG. 7 is a flowchart illustrating a method of determining a user's degree of dementia using CNNs and a DNN according to an example.
FIG. 8 illustrates a full CNN and a partial CNN capable of determining a user's degree of dementia according to an example.
FIG. 9 illustrates the features generated for each of a plurality of user image sets and the target features determined based on them according to an example.
FIG. 10 illustrates a DNN for determining a user's degree of dementia according to an example.
FIG. 11 illustrates a two-step classification performed to improve the accuracy of determining the degree of dementia according to an example.
FIG. 12 illustrates a two-step operation performed to improve the accuracy of determining the degree of dementia according to another example.
FIG. 13 is a flowchart illustrating a method for updating a full CNN according to an example.
FIG. 14 is a flowchart illustrating a method of updating a DNN according to an example.
FIG. 15 is a flowchart illustrating a method for determining test target features according to an example.
FIG. 16 is a flowchart illustrating a method of selecting sub-feature sets according to an example.
FIG. 17 is a flowchart illustrating a method of verifying the test target features according to an example.
FIG. 18 illustrates a K-fold cross-validation method for verifying the target features according to an example.

410~440: Steps

Claims (15)

一種由電子裝置執行的用於確定用戶癡呆程度的方法,其中,包括以下步驟:透過用戶終端輸出用於確定用戶的癡呆程度而預先製作的第一內容;接收所述用戶針對透過所述用戶終端的麥克風獲取的所述第一內容的第一語音;透過所述用戶終端輸出預先製作的第二內容;接收所述用戶針對透過所述麥克風獲取的所述第二內容的第二語音;以電子裝置透過可視化所述第一語音的至少一個特徵來生成第一頻譜圖圖像;以所述電子裝置透過可視化所述第二語音的至少一個特徵來生成第二頻譜圖圖像;以所述電子裝置透過將所述第一頻譜圖圖像輸入到預先更新的第一卷積神經網絡(CNN),為所述第一語音生成預設數量的第一特徵;以所述電子裝置透過將所述第二頻譜圖圖像輸入到預先更新的第二卷積神經網絡,為所述第二語音生成預設數量的第二特徵;以所述電子裝置在所述第一特徵和所述第二特徵中確定預設數量的目標特徵;以及 透過將所述目標特徵輸入到預先更新的深度神經網絡(DNN),確定所述用戶的癡呆程度,其中,透過所述用戶終端輸出所述確定的癡呆程度。 A method for determining the degree of dementia of a user performed by an electronic device, including the following steps: outputting, through a user terminal, first content pre-produced for determining the degree of dementia of the user; the first voice of the first content acquired by the microphone; output the pre-made second content through the user terminal; receive the second voice of the user for the second content acquired through the microphone; electronically The device generates a first spectrogram image by visualizing at least one feature of the first speech; the electronic device generates a second spectrogram image by visualizing at least one feature of the second speech; The device generates a preset number of first features for the first speech by inputting the first spectrogram image into a pre-updated first convolutional neural network (CNN); The second spectrogram image is input to the pre-updated second convolutional neural network to generate a preset number of second features for the second speech; use the electronic device to compare the first features and the second features Determine a preset number of target features in ; and By inputting the target features into a pre-updated deep neural network (DNN), the degree of dementia of the user is determined, wherein the determined degree of dementia is output through the user terminal. 
如請求項1之用於確定用戶癡呆程度的方法,其中,所述第一內容包括用於接收所述第一語音的指令。 The method for determining the degree of dementia of a user according to claim 1, wherein the first content includes an instruction for receiving the first voice. 如請求項2之用於確定用戶癡呆程度的方法,其中,所述第一內容包括使用戶跟讀句子的內容、猜測輸出圖像的名稱的內容、描述輸出圖像的內容、用於語言流暢性的內容、用於數字運算的內容以及誘導講故事的內容中的一個。 The method for determining the degree of dementia of the user as claimed in claim 2, wherein the first content includes the content of making the user read the sentence, the content of guessing the name of the output image, the content of describing the output image, and the content used for language fluency Sexual content, content for number crunching, and content that induces storytelling. 如請求項1之用於確定用戶癡呆程度的方法,其中,透過可視化所述第一語音的至少一個特徵來生成第一頻譜圖圖像的步驟,包括以下步驟:透過librosa工具生成所述第一語音的所述第一頻譜圖圖像。 The method for determining the degree of dementia of a user according to claim 1, wherein the step of generating a first spectrogram image by visualizing at least one feature of the first speech includes the following steps: generating the first spectrogram image through a librosa tool The first spectrogram image of speech. 如請求項1之用於確定用戶癡呆程度的方法,其中,所述第一頻譜圖圖像的大小和所述第二頻譜圖圖像的大小彼此相同。 The method for determining the degree of dementia of a user according to claim 1, wherein the size of the first spectrogram image and the size of the second spectrogram image are the same as each other. 如請求項1之用於確定用戶癡呆程度的方法,其中,基於VGG16模型來預先更新所述第一卷積神經網絡。 The method for determining the degree of dementia of a user according to claim 1, wherein the first convolutional neural network is pre-updated based on the VGG16 model. 
7. The method for determining a degree of dementia of a user according to claim 1, wherein the first convolutional neural network generates the first features of the first spectrogram image through an input layer, five pre-convolutional layer blocks, a fully connected layer, and two post-convolutional layer blocks, and does not include a softmax.

8. The method for determining a degree of dementia of a user according to claim 1, further comprising a step of updating, by the electronic device, the first convolutional neural network.

9. The method for determining a degree of dementia of a user according to claim 8, wherein the step of updating the first convolutional neural network by the electronic device includes the following steps: receiving a first test voice of a test user for the first content; generating a first test spectrogram image by visualizing at least one feature of the first test voice, wherein the first test spectrogram image is labeled with a ground-truth (GT) degree of dementia of the test user; determining a first test degree of dementia of the test user by inputting the first test spectrogram image into a complete first convolutional neural network, wherein the complete first convolutional neural network includes an input layer, one or more pre-convolutional layer blocks, a fully connected layer, one or more post-convolutional layer blocks, and a softmax; and updating the complete first convolutional neural network based on the first test degree of dementia and the GT degree of dementia, wherein the first convolutional neural network includes, among the layers of the updated complete first convolutional neural network, only the input layer, the one or more pre-convolutional layer blocks, the fully connected layer, and the one or more post-convolutional layer blocks.

10. The method for determining a degree of dementia of a user according to claim 9, further comprising performing, by the electronic device, the following step: updating the deep neural network after the updating of a plurality of convolutional neural networks, including the first convolutional neural network and the second convolutional neural network, is completed.

11. The method for determining a degree of dementia of a user according to claim 10, wherein the step of updating the deep neural network by the electronic device includes the following steps: determining a preset number of test target features among a preset number of first test features generated based on the first test spectrogram image and a preset number of second test features generated based on a second test spectrogram image, wherein the test target features are labeled with the GT degree of dementia of the test user; determining a second test degree of dementia of the test user by inputting the test target features into the deep neural network; and updating the deep neural network based on the second test degree of dementia and the GT degree of dementia.
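For illustration only (not part of the patent text): claims 7 and 9 distinguish a "complete" network trained with a softmax head from the deployed feature extractor that omits it. The toy sketch below shows that split; the layer stack is collapsed into a single matrix and all sizes are assumptions, not values from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Stand-ins for the claimed layer stack. A real model would use VGG16-style
# convolution blocks (claim 6); the 257/64/3 sizes here are toy assumptions.
W_body = rng.normal(size=(257, 64))  # input + pre-conv blocks + FC + post-conv blocks
W_head = rng.normal(size=(64, 3))    # classification head over dementia degrees

def features(x):
    """Deployed network (claim 7): every layer except the softmax head."""
    return np.tanh(x @ W_body)

def full_network(x):
    """Complete network (claim 9): the same body plus a softmax, used for training."""
    return softmax(features(x) @ W_head)

x = rng.normal(size=257)             # one spectrogram column as a toy input
probs = full_network(x)
feats = features(x)
print(probs.shape, feats.shape)      # → (3,) (64,)
```

After training the complete network against GT labels, only `features` is kept, matching claim 9's final limitation that the deployed network retains the input, pre-convolutional, fully connected, and post-convolutional layers but not the softmax.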
12. A computer-readable recording medium storing a program for executing the method of any one of claims 1 to 11.

13. An apparatus for determining a degree of dementia of a user, comprising: a memory in which a program for determining the degree of dementia of the user is recorded; and a processor that executes the program, wherein the program performs the following steps: outputting, through a user terminal, first content produced in advance for determining the degree of dementia of the user; receiving a first voice of the user for the first content, acquired through a microphone of the user terminal; outputting, through the user terminal, second content produced in advance; receiving a second voice of the user for the second content, acquired through the microphone; generating a first spectrogram image by visualizing at least one feature of the first voice; generating a second spectrogram image by visualizing at least one feature of the second voice; generating a preset number of first features for the first voice by inputting the first spectrogram image into a pre-updated first convolutional neural network (CNN); generating a preset number of second features for the second voice by inputting the second spectrogram image into a pre-updated second convolutional neural network; determining a preset number of target features among the first features and the second features; and determining the degree of dementia of the user by inputting the target features into a pre-updated deep neural network (DNN), wherein the determined degree of dementia is output through the user terminal.

14. A method, performed by an electronic device, of updating a convolutional neural network for determining a degree of dementia of a user, comprising the following steps: outputting, through a user terminal, first content produced in advance for determining the degree of dementia of the user; receiving a first test voice of a test user for the first content; generating a first test spectrogram image by visualizing at least one feature of the first test voice, wherein the first test spectrogram image is labeled with a GT degree of dementia of the test user; determining a test degree of dementia of the test user by inputting the first test spectrogram image into a complete convolutional neural network, wherein the complete convolutional neural network includes an input layer, one or more pre-convolutional layer blocks, a fully connected layer, one or more post-convolutional layer blocks, and a softmax; and updating the complete convolutional neural network based on the test degree of dementia and the GT degree of dementia, wherein the convolutional neural network includes, among the layers of the updated complete convolutional neural network, only the input layer, the one or more pre-convolutional layer blocks, the fully connected layer, and the one or more post-convolutional layer blocks.
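For illustration only (not part of the patent text): claim 13 selects a preset number of target features "among" the first and second CNN features and feeds them to a DNN. The selection rule is not specified in the claims, so the sketch below picks the largest-magnitude candidates purely as a placeholder; all sizes and the selection criterion are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

n = 64                              # the "preset number" of features (assumption)
first_feats = rng.normal(size=n)    # stand-in output of the first CNN
second_feats = rng.normal(size=n)   # stand-in output of the second CNN

# Hypothetical selection rule: keep the n largest-magnitude of the 2n
# candidates. The patent only requires that n target features be chosen.
candidates = np.concatenate([first_feats, second_feats])
target = candidates[np.argsort(-np.abs(candidates))[:n]]

# A minimal stand-in DNN mapping target features to dementia-degree classes.
W1, W2 = rng.normal(size=(n, 32)), rng.normal(size=(32, 3))
degree_probs = softmax(np.tanh(target @ W1) @ W2)
print(degree_probs)                 # probability per candidate dementia degree
print(int(degree_probs.argmax()))   # index of the determined degree
```

The argmax index would then be the degree of dementia output through the user terminal.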
15. An electronic device for updating a convolutional neural network for determining a degree of dementia of a user, comprising: a memory in which a program for updating the convolutional neural network is recorded; and a processor that executes the program, wherein the processor performs the following steps: outputting, through a user terminal, first content produced in advance for determining the degree of dementia of the user; receiving a first test voice of a test user for the first content; generating a first test spectrogram image by visualizing at least one feature of the first test voice, wherein the first test spectrogram image is labeled with a GT degree of dementia of the test user; determining a test degree of dementia of the test user by inputting the first test spectrogram image into a complete convolutional neural network, wherein the complete convolutional neural network includes an input layer, one or more pre-convolutional layer blocks, a fully connected layer, one or more post-convolutional layer blocks, and a softmax; and updating the complete convolutional neural network based on the test degree of dementia and the GT degree of dementia, wherein the convolutional neural network includes, among the layers of the updated complete convolutional neural network, only the input layer, the one or more pre-convolutional layer blocks, the fully connected layer, and the one or more post-convolutional layer blocks.
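For illustration only (not part of the patent text): claims 14 and 15 update the complete network based on the gap between the test degree and the GT degree. The toy sketch below shows one softmax cross-entropy gradient step on the head weights against a GT label; the body is frozen and all sizes are assumptions, not values from the patent.

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy "complete network": fixed body, trainable head with a softmax.
W_body = rng.normal(size=(257, 64))
W_head = rng.normal(size=(64, 3)) * 0.1

x = rng.normal(size=257)   # stand-in for a test spectrogram input
gt = 2                     # GT dementia-degree label of the test user

def loss(W_head):
    """Cross-entropy of the predicted degree against the GT degree."""
    h = np.tanh(x @ W_body)
    return -np.log(softmax(h @ W_head)[gt])

def update_head(W_head, lr=0.5):
    """One gradient step: d(cross-entropy)/dlogits = probs - one_hot(gt)."""
    h = np.tanh(x @ W_body)
    p = softmax(h @ W_head)
    grad = np.outer(h, p - np.eye(3)[gt])
    return W_head - lr * grad

before = loss(W_head)
W_head = update_head(W_head)
after = loss(W_head)
print(after < before)      # → True: the update moves toward the GT degree
```

After such updates the softmax head is discarded, leaving the feature-extractor network described in the final limitation of claims 14 and 15.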
TW111134144A 2021-09-09 2022-09-08 Method and apparatus for determining a degree of dementia of a user TWI811097B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR10-2021-0120112 2021-09-09
KR1020210120112 2021-09-09
KR10-2022-0076878 2022-06-23
KR1020220076878A KR102526429B1 (en) 2021-09-09 2022-06-23 Method and apparatus for determining a degree of dementia of a user

Publications (2)

Publication Number Publication Date
TW202312186A TW202312186A (en) 2023-03-16
TWI811097B true TWI811097B (en) 2023-08-01

Family

ID=85985397

Family Applications (1)

Application Number Title Priority Date Filing Date
TW111134144A TWI811097B (en) 2021-09-09 2022-09-08 Method and apparatus for determining a degree of dementia of a user

Country Status (2)

Country Link
KR (2) KR20230037432A (en)
TW (1) TWI811097B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201833861A (en) * 2017-03-01 2018-09-16 譚旦旭 Treatment method and system for cognitive impairment
WO2020151155A1 (en) * 2019-01-22 2020-07-30 平安科技(深圳)有限公司 Method and device for building alzheimer's disease detection model
TW202034345A (en) * 2018-10-12 2020-09-16 日商大日本住友製藥股份有限公司 Method, device, and program for assessing relevance of respective preventive interventional actions to health in health domain of interest
CN111738302A (en) * 2020-05-28 2020-10-02 华南理工大学 System for classifying and diagnosing Alzheimer disease based on multi-modal data
TW202133150A (en) * 2019-12-24 2021-09-01 日商生命科學研究所股份有限公司 Health management system, health management equipment, health management program and health management method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102274072B1 (en) * 2020-09-04 2021-07-08 김형준 Method and apparatus for determining a degree of dementia of a user

Also Published As

Publication number Publication date
TW202312186A (en) 2023-03-16
KR20230037432A (en) 2023-03-16
KR20230037433A (en) 2023-03-16
KR102526429B1 (en) 2023-04-27

Similar Documents

Publication Publication Date Title
JP7276712B2 (en) Semantic analysis method, device, and storage medium
US10896763B2 (en) System and method for providing model-based treatment recommendation via individual-specific machine learning models
CN104115221B (en) Changed based on Text To Speech and semantic audio human interaction proof
US11017693B2 (en) System for enhancing speech performance via pattern detection and learning
CN113519001A (en) Generating common sense interpretations using language models
CN106548774A (en) The apparatus and method of the apparatus and method and training transformation parameter of speech recognition
WO2017218465A1 (en) Neural network-based voiceprint information extraction method and apparatus
KR102161638B1 (en) Method, System and Computer-Readable Mediums thereof for determining the degree of dementia Based on Voice Recognition Using Machine Learning Model
KR102274072B1 (en) Method and apparatus for determining a degree of dementia of a user
KR102250954B1 (en) Apparatus and method for predicting dementia by dividing brain mri by brain region
CN111881926A (en) Image generation method, image generation model training method, image generation device, image generation equipment and image generation medium
US11557380B2 (en) Recurrent neural network to decode trial criteria
CN109817201A (en) Language learning method and device, electronic equipment and readable storage medium
US11955026B2 (en) Multimodal neural network for public speaking guidance
JP2020086436A (en) Decoding method in artificial neural network, speech recognition device, and speech recognition system
CN112837669B (en) Speech synthesis method, device and server
US11763690B2 (en) Electronic apparatus and controlling method thereof
EP3726435A1 (en) Deep neural network training method and apparatus, and computer device
KR20210044559A (en) Method and device for determining output token
Storkel et al. Online learning from input versus offline memory evolution in adult word learning: Effects of neighborhood density and phonologically related practice
KR102021700B1 (en) System and method for rehabilitate language disorder custermized patient based on internet of things
TWI811097B (en) Method and apparatus for determining a degree of dementia of a user
WO2020216286A1 (en) Method for training teaching style prediction model, and computer storage medium
Laux et al. Two-stage visual speech recognition for intensive care patients
CN111581929A (en) Text generation method based on table and related device