WO2021144930A1 - Data-input assisting device - Google Patents

Data-input assisting device

Info

Publication number
WO2021144930A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
user
text
input
command
Prior art date
Application number
PCT/JP2020/001342
Other languages
French (fr)
Japanese (ja)
Inventor
祥章 池田
Original Assignee
エヌ・デーソフトウェア株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by エヌ・デーソフトウェア株式会社
Priority to PCT/JP2020/001342
Publication of WO2021144930A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 50/00: Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q 50/10: Services
    • G06Q 50/22: Social work

Definitions

  • The present invention relates to a data-input support device suitable for elderly care facilities such as fee-based nursing homes with care services, special nursing homes for the elderly, care facilities for the elderly, and home-care service providers, and in particular to data-input support that uses voice.
  • In elderly care facilities, various data necessary for daily care (for example, vital data, sleep data, and daily-life data) are collected and registered for individual facility users (hereinafter simply "users") for the purposes of health management and improvement of the quality of the services provided.
  • The present invention has been made in view of the above technical background, and its main object is to provide a data-input support device, method, system, and computer program for elderly care facilities that can ensure the reliability of the input data by preventing the user who is the target of data input from being mistaken for another user, while maintaining the ease of voice-based input operation.
  • The data-input support device for an elderly care facility comprises: a voice input/output terminal that can be carried by the care worker and has a built-in microphone, speaker, and communication function;
  • a user-information holding unit that holds user information about the users of the elderly care facility;
  • a necessary-care-information holding unit that holds the information necessary for the care of the users of the elderly care facility;
  • a voice-to-text conversion unit that converts spoken-voice data into the corresponding text data according to a conversion model generated by learning known input/output relationships;
  • a text-to-voice conversion unit that converts text data into the corresponding spoken-voice data according to a conversion model generated by learning known input/output relationships;
  • a text decoding unit that decodes the content of the text by analyzing, according to a dialogue model obtained by learning known dialogues, the text data obtained by converting, via the voice-to-text conversion unit, the spoken-voice data acquired through communication with the voice input/output terminal and generated by the care worker speaking into the microphone;
  • a target-person determination processing unit that, when the decoding result of the text decoding unit is an input request command or a confirmation request command relating to the necessary care information of one user, determines the one user to be the target of the command processing based on the user-specific word used to identify that user, the user information, and the content of dialogue processing with the care worker via the voice input/output terminal; and
  • a command execution unit that executes, on the necessary-care-information holding unit, the processing specified by the input request command or the confirmation request command.
  • The target-person determination processing unit operates when the user-specific word consists of only one of the plural "person-specific words" ordinarily used to identify an individual, such as the "last name", "first name", "middle name", or "nickname".
  • It executes a search of the user-information holding unit using the "user-specific word" as the search key and, when exactly one user is hit, generates text data for a confirmation-request utterance that identifies that user by his or her person-specific words (for example, last name and first name), and transmits it to the voice input/output terminal via the text-to-voice conversion unit.
  • Only when text data corresponding to an affirmative response is then received from the voice input/output terminal via the voice-to-text conversion unit does it confirm that user as the single target of the command processing.
  • The target-person determination processing unit may also, when the search of the user-information holding unit using the "user-specific word" as the search key hits no user, generate text data for an utterance stating at least that no user subject to the command processing was found, and transmit it to the voice input/output terminal via the text-to-voice conversion unit.
  • When the decoding result of the text decoding unit is an input request command relating to the necessary care information of one user, the command execution unit may write the necessary care information received from the voice input/output terminal via the voice-to-text conversion unit and the text analysis unit into the designated user area of the necessary-care-information holding unit.
  • When the decoding result of the text decoding unit is a confirmation request command relating to the necessary care information of one user, the command execution unit may read the necessary care information from the designated user area of the necessary-care-information holding unit and transmit it to the voice input/output terminal via the text-to-voice conversion unit.
  • The single "user-specific word" may be the user's "last name", and the remaining "person-specific word" other than the "last name" may be the user's "first name".
  • Viewed from another aspect, the present invention can also be understood as a data-input support method for an elderly care facility.
  • This data-input support method uses: a voice input/output terminal that can be carried by the care worker and has a built-in microphone, speaker, and communication function;
  • a user-information holding unit that holds user information about the users of the elderly care facility;
  • a necessary-care-information holding unit that holds the information necessary for the care of the users of the elderly care facility; and
  • a voice-to-text conversion step that converts spoken-voice data into the corresponding text data according to a conversion model generated by learning known input/output relationships.
  • When the decoding result of the text decoding step is an input request command or a confirmation request command relating to the necessary care information of one user, the method includes, based on the user-specific word used to identify the one user who is the target of the command processing, the user information, and the content of dialogue processing with the care worker,
  • a target-person determination processing step that determines the one user to be the target of the command processing, and
  • a command execution step that executes, on the necessary-care-information holding unit, the processing specified by the input request command or the confirmation request command.
  • In the target-person determination processing step, when the user-specific word consists of only one of the plural "person-specific words" ordinarily used to identify an individual, such as the "last name", "first name", "middle name", or "nickname",
  • a search of the user-information holding unit using the "user-specific word" as the search key is executed; when exactly one user is hit, text data for a confirmation-request utterance that identifies that user by his or her person-specific words is generated and transmitted to the voice input/output terminal,
  • and only when text data corresponding to an affirmative response is subsequently received from the voice input/output terminal via the voice-to-text conversion step is that user confirmed as the single target of the command processing.
  • Viewed from yet another aspect, the present invention can also be understood as a data-input support system for an elderly care facility.
  • This data-input support system includes: a voice input/output terminal that can be carried by the care staff, has a microphone and a speaker, and has a wireless network connection function;
  • a user-information holding server on the network that holds user information about the users of the elderly care facility;
  • a necessary-care-information holding server that holds the information necessary for the care of the users of the elderly care facility;
  • a voice-to-text conversion server on the network that converts spoken-voice data into the corresponding text data according to a conversion model generated by learning known input/output relationships;
  • a text-to-voice conversion server on the network that converts text data into the corresponding spoken-voice data according to a conversion model generated by learning known input/output relationships;
  • a text decoding server on the network that decodes the content of the text by analyzing, according to a dialogue model obtained by learning known dialogues, the text data obtained by converting, via the voice-to-text conversion server, the spoken-voice data acquired through communication with the voice input/output terminal and generated by the care worker speaking into the microphone;
  • a target-person determination processing server on the network that, when the decoding result on the text decoding server is an input request command or a confirmation request command relating to the necessary care information of one user, determines the one user to be the target of the command processing based on the user-specific word used to identify that user, the user information, and the content of dialogue processing with the care staff via the voice input/output terminal; and
  • a command execution server that executes, on the necessary-care-information holding server, the processing specified by the input request command or the confirmation request command.
  • When the user-specific word consists of only one of the plural "person-specific words" ordinarily used to identify an individual, such as the "last name", "first name", "middle name", or "nickname", the target-person determination processing server executes a search of the user-information holding server using the "user-specific word" as the search key; when exactly one user is hit, it generates text data for a confirmation-request utterance that identifies that user by his or her person-specific words, transmits it to the voice input/output terminal via the text-to-voice conversion server, and confirms that user as the single target of the command processing only when text data corresponding to an affirmative response is subsequently received from the voice input/output terminal via the voice-to-text conversion server.
  • The individual servers need not be physically separate servers.
  • For example, the voice-to-text conversion server and the text-to-voice conversion server may be configured as physically the same server, and the text decoding server, the target-person determination processing server, and the command execution server may likewise be configured as physically the same server.
  • The present invention can also be understood as a computer program for a data-input support device in an elderly care facility.
  • This computer program is for a data-input support device comprising: a voice input/output terminal that can be carried by the care worker and has a built-in microphone, speaker, and communication function;
  • a user-information holding unit that holds user information about the users of the elderly care facility;
  • a necessary-care-information holding unit that holds the information necessary for the care of the users of the elderly care facility;
  • a voice-to-text conversion unit that converts spoken-voice data into the corresponding text data according to a conversion model generated by learning known input/output relationships; and
  • a text-to-voice conversion unit that converts text data into the corresponding spoken-voice data according to a conversion model generated by learning known input/output relationships.
  • The program causes the computer to function as: a text decoding unit that decodes the content of the text by analyzing, according to a dialogue model obtained by learning known dialogues, the text data obtained by converting, via the voice-to-text conversion unit, the spoken-voice data acquired through communication with the voice input/output terminal and generated by the care worker speaking into the microphone;
  • a target-person determination processing unit that, when the decoding result of the text decoding unit is an input request command or a confirmation request command relating to the necessary care information of one user, determines the one user to be the target of the command processing based on the user-specific word used to identify that user, the user information, and the content of dialogue processing with the care worker; and
  • a command execution unit that executes, on the necessary-care-information holding unit, the processing specified by the input request command or the confirmation request command.
  • When the user-specific word consists of only one of the plural "person-specific words" ordinarily used to identify an individual, such as the "last name", "first name", "middle name", or "nickname", the target-person determination processing unit executes a search of the user-information holding unit using the "user-specific word" as the search key; when exactly one user is hit, it generates text data for a confirmation-request utterance that identifies that user by his or her person-specific words, transmits it to the voice input/output terminal via the text-to-voice conversion unit, and confirms that user as the single target of the command processing only when text data corresponding to an affirmative response is subsequently received from the voice input/output terminal via the voice-to-text conversion unit. The program thereby causes the computer to function as the device.
  • The present invention can also be understood as a computer program for a data-input support system in an elderly care facility.
  • This computer program is for a data-input support system including: a voice input/output terminal that can be carried by the care staff, has a microphone and a speaker, and has a wireless network connection function;
  • a user-information holding server on the network that holds user information about the users of the elderly care facility;
  • a necessary-care-information holding server that holds the information necessary for the care of the users of the elderly care facility;
  • a voice-to-text conversion server on the network that converts spoken-voice data into the corresponding text data according to a conversion model generated by learning known input/output relationships; and
  • a text-to-voice conversion server on the network that converts text data into the corresponding spoken-voice data according to a conversion model generated by learning known input/output relationships.
  • The program causes a computer to function as: a text decoding unit that decodes the content of the text by analyzing, according to a dialogue model obtained by learning known dialogues, the text data obtained by converting, via the voice-to-text conversion unit, the spoken-voice data acquired through communication with the voice input/output terminal and generated by the care worker speaking into the microphone;
  • a target-person determination processing unit that, when the decoding result of the text decoding unit is an input request command or a confirmation request command relating to the necessary care information of one user, determines the one user to be the target of the command processing based on the user-specific word used to identify that user, the user information, and the content of dialogue processing with the care staff; and
  • a command execution unit that executes, on the necessary-care-information holding unit, the processing specified by the input request command or the confirmation request command.
  • When the user-specific word consists of only one of the plural "person-specific words" ordinarily used to identify an individual, such as the "last name", "first name", "middle name", or "nickname", the target-person determination processing unit executes a search of the user-information holding unit using the "user-specific word" as the search key; when exactly one user is hit, it generates text data for a confirmation-request utterance that identifies that user by his or her person-specific words, transmits it to the voice input/output terminal via the text-to-voice conversion unit, and confirms that user as the single target of the command processing only when text data corresponding to an affirmative response is subsequently received from the voice input/output terminal via the voice-to-text conversion unit. The program thereby causes the computer to function as the server.
  • FIG. 1 is an illustration depicting a care worker operating a voice input/output terminal.
  • FIG. 2 is a system configuration diagram showing an example in which the present invention is realized by a distributed server system.
  • FIG. 3 is a flowchart showing the processing flow in the voice processing server.
  • FIG. 4 is a chart illustrating the two basic processes in the voice processing server.
  • FIG. 5 is a flowchart showing the processing flow in the dialogue processing server.
  • FIG. 6 is a chart illustrating the five basic processes in the dialogue processing server.
  • FIG. 7 is a flowchart showing the flow of the command execution processing.
  • FIG. 8 is a flowchart showing a specific example of command execution pre-processing, command execution processing, and response generation processing when the command classification result is “body temperature input command”.
  • FIG. 9 is a flowchart showing a specific example of command execution pre-processing, command execution processing, and response generation processing when the command classification result is “body temperature confirmation command”.
  • FIG. 10 is a flowchart showing a specific example of command execution pre-processing, command execution processing, and response generation processing when the command classification result is “meal amount input command”.
  • A preferred embodiment of the data-input support system for an elderly care facility according to the present invention will now be described in detail with reference to the attached drawings (FIGS. 1 to 10).
  • As for the data-input support device, method, and computer program according to other embodiments of the present invention, those skilled in the art should be able to implement them easily by extracting or slightly modifying parts of the configuration of the illustrated data-input support system, so their specific contents are not shown individually.
  • In use, the care staff speaks the desired request (for example, a data-input request or a data-confirmation request) in natural language into the microphone of the voice input/output terminal, which has a wireless network connection function, in much the same way as care staff usually talk with one another.
  • The spoken request (for example, an input request to the database that stores the necessary care information) is then executed on the system side.
  • If anything in the request is unclear, a question utterance in natural language asking about the unclear point flows from the speaker of the voice input/output terminal, and the care worker responds to it.
  • In this way the unclear points are resolved and the desired request is executed on the system side.
  • When execution is complete, a completion utterance notifying the completion flows from the speaker of the terminal, and the care staff can confirm the completion of the data input by listening to it.
  • A repeat question utterance for confirming the content of the request also flows from the speaker of the voice input/output terminal, so the care staff can ensure the accuracy of the request content by speaking an affirmative or negative answer into the microphone.
  • The facility user who is the target of data entry or data confirmation may be specified by the user's full "first and last name", but can also be specified by an utterance containing only the user's "last name" (for example, "Mr. Takahashi's body temperature" or "Mr. Yamada's blood pressure").
  • In that case, a confirmation-request utterance containing the full name of the identified user (for example, "Mr. Takahashi Yoshiaki?") is returned from the system.
  • The care worker answers it with an affirmative or negative utterance (for example, "yes").
  • When two or more users share the specified "last name", the confirmation-request utterance is replaced by a selection-request utterance containing each user's "first name" (for example, "There are two Takahashis. Mr. Yoshiaki or Mr. Noriyuki?").
  • The care worker then answers with a selection utterance (for example, "Yoshiaki-san").
  • When two or more users share both the same last name and the same first name, a selection-request utterance that distinguishes them by room number is used instead (for example, "There are two users named Takahashi Yoshiaki. Mr. Takahashi Yoshiaki in room 103 or Mr. Takahashi Yoshiaki in room 115?").
  • Figure 1 shows an example of such voice operation by a care worker.
  • The care worker 10 speaks the natural-language sentence corresponding to the request (confirmation request CR01), "Tell me Mr. XX's body temperature yesterday", into the microphone of the smartphone 20a, which is one form of the voice input/output terminal. The system then searches the already-registered storage area of Mr. XX, retrieves the body-temperature data for the corresponding date and time, generates the natural-language answer sentence (CA01) "Mr. XX's body temperature yesterday at 20:11 was 38.2 degrees", and the corresponding voice flows from the speaker of the smartphone 20a. By listening to this answer voice, the care worker 10 can confirm the target data (Mr. XX's body temperature yesterday).
  • FIG. 2 shows the system configuration when the above-described system of the present invention is implemented by a plurality of servers distributed over a network. As shown in the figure, this system consists of an "in-facility system" located inside the elderly care facility and an "out-of-facility system" located outside it.
  • Here, each device called a "server" comprises a transmitter/receiver that enables transmission and reception via a network (for example, the Internet or a LAN), a central processing unit composed of a microprocessor unit (MPU), a CPU, or a function-specific dedicated IC (ASIC), and a storage unit composed of a hard disk, semiconductor memory, and the like for storing control programs and data; it executes the operation specified by a processing request received via the network and sends the execution result to the specified party via the network.
  • The in-facility system of the elderly care facility comprises one or more voice input/output terminals 20, a local server 22, and one or more personal computers (PCs) 23.
  • These devices 20, 22, and 23 are configured to cooperate with one another via the LAN 21.
  • The voice input/output terminal 20 can be carried by the care worker 10, has a microphone and a speaker, and has a wireless network connection function.
  • In this example it is implemented as a smartphone 20a and a smartwatch (registered trademark) 20b.
  • A dedicated application program (hereinafter abbreviated as "app") for carrying out the present invention is installed on these devices 20a and 20b.
  • The app has a built-in first function and second function.
  • The first function generates spoken-voice data by A/D conversion, data compression, and the like from the voice spoken into the microphone, and transmits it, as a speech-to-text conversion request in a predetermined command format,
  • via the LAN 21 to the voice processing server 32 (described in detail later) arranged on the Internet 31.
  • The second function generates an analog speech signal by decompression, D/A conversion, and the like from the spoken-voice data received via the LAN 21 from the voice processing server 32 arranged on the Internet 31, and drives the speaker with this signal so that the speaker utters the speech.
  • The local server 22 stores various software related to the accounting processing and user management of the elderly care facility, as well as the various data necessary for the care of each of the users accommodated in the facility. These data include each user's sleep data, vital data such as blood pressure, body temperature, and heart rate, and records of daily life such as dietary intake and excretion.
  • The personal computers (PCs) 23 are used to run the various software stored on the local server 22 and to aggregate and analyze the above-mentioned data of each user.
  • The out-of-facility system includes a voice processing server 32, a data storage server 33, and a dialogue processing server 34, which is a main part of the present invention.
  • These servers 32, 33, and 34 are configured to cooperate with one another via the Internet 31.
  • The voice processing server 32 has a voice-to-text conversion unit that converts spoken-voice data into the corresponding text data according to a conversion model generated by learning known input/output relationships, and a text-to-voice conversion unit that converts text data into the corresponding spoken-voice data according to a conversion model likewise generated by learning known input/output relationships; details are shown in FIGS. 3 and 4.
  • Each time a conversion request arrives via the Internet 31, the voice processing server 32 determines whether the request type is speech-to-text conversion (hereinafter "STT conversion") or text-to-speech conversion (hereinafter "TTS conversion") (step 101).
  • When the request is determined to be "STT conversion" (step 101, "STT"), an AI conversion process (step 102) that STT-converts the spoken-voice data contained in the received conversion request into the corresponding text data, and a process (step 103) that transmits the text data obtained by the conversion to the dialogue processing server 34 via the Internet 31, are executed.
  • An image of STT conversion is shown in FIG. 4(a).
  • Here, spoken-voice data is converted into the corresponding text data according to a conversion model generated by learning known input/output relationships.
  • In the illustrated example, the spoken-voice data "What is Mr. Yamada's body temperature?" 701 is processed according to the conversion model and converted into the text data {text: "Mr. Yamada's body temperature"} 702.
  • When the request is determined to be "TTS conversion" (step 101, "TTS"), an AI conversion process (step 104) that TTS-converts the text data contained in the received conversion request into the corresponding spoken-voice data, and a process (step 105) that transmits the spoken-voice data obtained by the conversion to the voice input/output terminal 20 via the Internet 31 and the LAN 21, are executed.
  • An image of TTS conversion is shown in FIG. 4(b).
  • Here, text data is converted into the corresponding spoken-voice data according to a conversion model generated by learning known input/output relationships.
  • In the illustrated example, the text data {text: "Mr. Yamada's body temperature on 7/18 was 36.5 degrees at 10:30 and 36.2 degrees at 14:00"} 714 is processed according to the conversion model and converted into the spoken-voice data "Mr. Yamada's body temperature on 7/18 was 36.5 degrees at 10:30 and 36.2 degrees at 14:00" 715.
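  • The request routing of steps 101 to 105 can be pictured with the minimal Python sketch below. This is not the patent's implementation: the request format, the stt_model / tts_model objects, and the two send helpers are assumptions introduced only to make the two branches concrete.

```python
# Sketch of the voice processing server's request routing (FIG. 3).
# stt_model, tts_model and the send_* callbacks are hypothetical stand-ins.
from __future__ import annotations
from dataclasses import dataclass

@dataclass
class ConversionRequest:
    kind: str             # "STT" (speech-to-text) or "TTS" (text-to-speech)
    payload: bytes | str  # speech audio for STT, text for TTS

def handle_conversion_request(req: ConversionRequest, stt_model, tts_model,
                              send_to_dialogue_server, send_to_terminal):
    if req.kind == "STT":                          # step 101 -> "STT"
        text = stt_model.transcribe(req.payload)   # step 102: AI conversion
        send_to_dialogue_server(text)              # step 103
    elif req.kind == "TTS":                        # step 101 -> "TTS"
        audio = tts_model.synthesize(req.payload)  # step 104: AI conversion
        send_to_terminal(audio)                    # step 105
    else:
        raise ValueError(f"unknown request type: {req.kind}")
```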
  • The performance of the data-input support system according to the present invention depends to no small degree on the performance of the above-mentioned STT and TTS conversions; the conversion processing itself is performed using services provided by major IT companies (for example, Google).
  • The data storage server 33 is newly provided in connection with the present invention, and stores "user information" for identifying the individual users of the elderly care facility and "necessary care information" maintained for the management of each user.
  • Examples of user information include each user's "ID number", "last name", "first name", "phonetic notation of the last name", "phonetic notation of the first name", "room number", and so on.
  • The phonetic notation may use katakana, hiragana, or romaji characters.
  • A "middle name", "nickname", "common name", and the like may also be included according to religious and customary conventions.
  • Examples of necessary care information include each user's "vital information", "food intake information", "excretion information", "sleep information", "life record information", and so on.
  • The "vital information" can include, for example, each user's body temperature, blood pressure, heart rate, and the like.
  • The "sleep information" can include each user's bedtime, wake-up time, sleep duration, and the like.
  • The "food intake information" can include, for example, the percentage of staple food eaten, the percentage of side dishes eaten, the percentage of soup consumed, the percentage of beverages such as tea or water consumed, and the like.
  • The "excretion information" can include, for example, the number of defecations and their color, shape, and amount, as well as the number and amount of urinations, and the like.
  • The "life record information" can include records of each user's daily life such as "slept for half a day", "was watching TV at night", and "was reading".
  • In FIG. 2, an example of the user information and the necessary care information is drawn at the upper right of the data storage server 33.
  • A "user table" is drawn as part of the user information, and a "vital information table" is drawn as part of the necessary care information.
  • The "user table" stores the user information for identifying each individual user; it is configured as a table that defines, for each "user ID", the user's personal attributes (last name, first name, phonetic readings, room number, and so on).
  • The "vital information table" stores each user's vital information; in this example it is configured as a table that defines, for each "user ID", the vital attributes of the user data ("recording date and time", "body temperature", "blood pressure", "heart rate", and so on).
  • The data storage server 33 may also store a "food and drink information table", an "excretion information table", and the like.
  • The "food and drink information table" stores each user's food and drink information; for example, it is configured as a table that defines, for each "user ID", the food and drink attributes ("recording date" of the user data, "staple food intake %", "side dish intake %", "juice intake %", and so on).
  • The "excretion information table" stores each user's defecation and urination information; for example, it is configured as a table that defines, for each "user ID", the excretion attributes ("recording date" of the user data, "color, shape, and amount of defecation", "number and amount of urinations", and so on).
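  • As a concrete illustration only, the two tables described above might be laid out as follows. The column names are assumptions made for this sketch, not the patent's actual schema.

```python
# Illustrative schema for the "user table" and "vital information table"
# held by the data storage server 33 (column names are assumptions).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (
    user_id         INTEGER PRIMARY KEY,   -- "ID number"
    last_name       TEXT NOT NULL,         -- e.g. "Yamada"
    first_name      TEXT NOT NULL,         -- e.g. "Yoshiaki"
    last_name_kana  TEXT,                  -- phonetic reading of the last name
    first_name_kana TEXT,                  -- phonetic reading of the first name
    room_number     TEXT                   -- e.g. "103"
);
CREATE TABLE vital_info (
    user_id        INTEGER REFERENCES user(user_id),
    recorded_at    TEXT,                   -- recording date and time
    body_temp      REAL,                   -- degrees Celsius
    blood_pressure TEXT,
    heart_rate     INTEGER
);
""")
```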
  • The dialogue processing server 34 realizes the dialogue between the system and the care staff via the voice input/output terminal 20, and mainly comprises a text analysis processing unit and a dialogue control processing unit.
  • Each time it receives text data, the dialogue processing server 34 sequentially executes the text analysis processing (step 201) and the dialogue control processing (step 202), thereby realizing the various functions required of a dialogue processing server.
  • The text analysis processing (step 201) sequentially executes a variable extraction process (step 2011) and a command classification process (step 2012), as shown in FIG. 5(b).
  • The variable extraction process (step 2011) extracts, from the given text data, the "words" corresponding to predefined variables, by analyzing the text according to a dialogue model obtained by learning known dialogues (for example, dialogues between care workers).
  • A concrete image of this variable extraction process (step 2011) is shown in FIG. 6(a).
  • Suppose the text data "{text: Mr. Yamada's body temperature is}" 703 is given.
  • This text data is analyzed according to the dialogue model 704 obtained by learning known dialogues, and the words (variable values) corresponding to the predefined variables "last name", "target", and "date and time" are extracted.
  • "Yamada" is extracted as the word corresponding to the variable "last name",
  • and "body temperature" is extracted as the word corresponding to the variable "target".
  • No word corresponding to the variable "date and time" is extracted.
  • The command classification process (step 2012) classifies (determines) the type of command from the given text data by analyzing the text according to the dialogue model obtained by learning known dialogues (for example, dialogues between care workers).
  • A concrete image of this command classification process (step 2012) is shown in FIG. 6(b). Suppose the text data "{text: Mr. Yamada's body temperature is}" 706 is given. The text data 706 is analyzed according to the dialogue model 707 obtained by learning known dialogues, and the type of command specified by voice is classified. In the illustrated example, as is clear from the classification result 708, the command is recognized as a "body temperature confirmation request command".
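  • The inputs and outputs of these two steps can be mimicked with the toy Python sketch below. The patent relies on a dialogue model trained on known dialogues; the keyword rules here are only an assumed stand-in used to make the data flow concrete.

```python
# Toy stand-in for variable extraction (step 2011) and command classification
# (step 2012). A real system would use the learned dialogue model instead.
import re

TARGETS = {"体温": "body temperature", "血圧": "blood pressure"}

def extract_variables(text: str) -> dict:
    variables = {"last_name": None, "target": None, "datetime": None}
    m = re.match(r"(?P<name>\S+?)さんの", text)     # "<name>-san's ..."
    if m:
        variables["last_name"] = m.group("name")
    for jp, en in TARGETS.items():
        if jp in text:
            variables["target"] = en
    if "昨日" in text:                               # "yesterday"
        variables["datetime"] = "yesterday"
    return variables

def classify_command(text: str) -> str:
    # crude rule: a trailing topic marker or "tell me" reads as a confirmation request
    if "教えて" in text or text.endswith("は"):
        return "confirmation_request"
    return "input_request"

print(extract_variables("山田さんの体温は"))
# {'last_name': '山田', 'target': 'body temperature', 'datetime': None}
print(classify_command("山田さんの体温は"))   # confirmation_request
```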
  • The dialogue control processing (step 202) comprises command execution pre-processing (step 2021), command execution processing (step 2022), response generation processing (step 2023), and response transmission processing to the portable terminal (step 2024).
  • The command execution pre-processing (step 2021) is performed before executing the command classified by the command classification process (step 2012). It includes a "target person determination process" that determines the one user to be the target of the command processing based on the "user-specific word" (for example, the "last name") used to identify that user, the "user information" (for example, the "user table") stored in the data storage server 33, and the result of dialogue processing with the care worker 10 via the voice input/output terminal 20, and a "variable supplement process" that, for any of the variables required by the command (for example, "last name", "target", and "date and time") whose corresponding word (variable value) is unfilled, supplements the missing word.
  • The target person determination process is one of the main parts of the data-input support system according to the present invention and will be described in detail later with reference to FIGS. 8 and 9.
  • As an example of the variable supplement process, assume that in the variable extraction result 709 the value of the variable "date and time" is unfilled among the three variables "last name", "target", and "date and time".
  • The system then automatically generates text data corresponding to the standard question utterance used to ask the care staff when the date and time must be confirmed ("When do you want to check the body temperature?").
  • The text data generated in this way is transmitted to the voice processing server 32 via the Internet 31 as a TTS conversion request.
  • In the voice processing server 32, the steps described above (step 101 "TTS", steps 104 and 105) are executed in sequence, and the spoken-voice data corresponding to the text data is generated.
  • The spoken-voice data thus obtained is sent to the voice input/output terminal 20 via the Internet 31 and the LAN 21.
  • As a result, the question utterance ("When do you want to check the body temperature?") flows from the speaker of the voice input/output terminal 20.
  • When the care worker 10 speaks the answer utterance ("Yesterday's") 711 into the microphone of the voice input/output terminal 20,
  • the voice input/output terminal 20 generates the corresponding spoken-voice data
  • and transmits it, as an STT conversion request, to the voice processing server 32 via the LAN 21 and the Internet 31.
  • In the voice processing server 32, the steps described above (step 101 "STT", steps 102 and 103) are executed in sequence, and the answer text data corresponding to the answer spoken-voice data is generated.
  • The text data thus obtained is sent to the dialogue processing server 34 via the Internet 31.
  • The dialogue processing server 34, which has been waiting for the answer text data, extracts the word "yesterday" contained in it, back-calculates from the current date and time to obtain the specified date "7/18", completes the command by filling in the value of the missing variable "date and time", and moves on to the command execution processing.
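  • A minimal Python sketch of this variable supplement step follows. The ask_via_terminal() helper, which hides the TTS question and STT answer round trip with the voice input/output terminal, is an assumed abstraction rather than anything specified in the patent.

```python
# Sketch of the "variable supplement process" (part of step 2021): if the
# "date and time" variable is unfilled, ask the care worker and back-calculate
# the concrete date from the answer.
from datetime import date, timedelta

def supplement_datetime(variables: dict, ask_via_terminal) -> dict:
    if variables.get("datetime") is None:
        answer = ask_via_terminal("When do you want to check the body temperature?")
        if "yesterday" in answer:
            variables["datetime"] = (date.today() - timedelta(days=1)).strftime("%m/%d")
        elif "today" in answer or "now" in answer:
            variables["datetime"] = date.today().strftime("%m/%d")
    return variables
```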
  • The basics of the command execution processing (step 2022) are, in short, to execute the process corresponding to the "type" of the command (for example, input request or confirmation request), namely the body-temperature recording process or the body-temperature confirmation process, for the "user" (for example, "Tanaka Yoshinari") determined by the target person determination process (described in detail later) and the "target" (for example, "body temperature").
  • The designated recording process (step 302) performs, on the data storage server 33, the recording process for the command-designated user (target person) with respect to the command-designated "target" (for example, body temperature, blood pressure, and the like).
  • The designated search process executes, on the data storage server 33, a search for data relating to the command-designated "target" (for example, body temperature, blood pressure, and the like) for the command-designated, confirmed user (target person).
  • A specific example of the command execution process (step 2022) is shown in FIG. 6(d).
  • Here, the values of the three variables included in the command are "last name: Yamada", "target: body temperature", and "date and time: 7/18".
  • The data on the data storage server 33 (in this example, the vital information table) is accessed via the Internet 31, the search process is executed, and the corresponding search result 712 is acquired:
  • Mr. Yamada's body temperature on July 18 was 36.5 degrees at 10:30 and 36.2 degrees at 14:00.
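  • Against the illustrative tables sketched earlier, the designated search process could be expressed roughly as follows; this is an assumption-level sketch, not the patent's implementation.

```python
# Look up the confirmed user's body temperature records for the specified date
# in the illustrative tables defined above (same assumed column names).
def search_body_temperature(conn, last_name, first_name, recorded_date):
    return conn.execute(
        """SELECT v.recorded_at, v.body_temp
             FROM vital_info AS v
             JOIN user AS u ON u.user_id = v.user_id
            WHERE u.last_name = ? AND u.first_name = ?
              AND v.recorded_at LIKE ? || '%'""",
        (last_name, first_name, recorded_date),
    ).fetchall()
```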
  • When the text data constituting the command contains a command word corresponding to the predefined "all-user batch designation command with exception designation", the command is executed collectively for all users of the preset population except the designated exceptions, as described later with reference to FIG. 10.
  • The basis of the response generation process (step 2023) is to generate the text data corresponding to the various utterances to be transmitted from the system side to the care staff side.
  • These utterances include a question utterance for repeating back and confirming the content of an input request command, a completion utterance for notifying the care staff that execution of an input request command has been completed,
  • a search-result utterance for conveying to the care staff the data retrieved as the execution result of a confirmation request command,
  • and a question utterance for conveying a question to the care staff when there is an unclear point in the given command.
  • A specific example of the response generation process is shown in FIG. 6(e).
  • Here, the answer text data corresponding to the standard answer utterance for the search result 712 ("Mr. Yamada's body temperature on 7/18 was 36.5 degrees at 10:30 and 36.2 degrees at 14:00") 713 is automatically generated.
  • The basis of the response transmission processing (step 2024) is to transmit the various response text data generated in the response generation processing (step 2023) to the voice processing server 32 via the Internet 31 in the form of text-to-speech conversion requests.
  • Next, the target person determination process, which is one of the main parts of the present invention, will be described. As shown in FIG. 6(c), this target person determination process is executed within the command execution pre-processing (FIG. 5, step 2021) described above.
  • When the analysis result of the text analysis unit (step 201) is an input request command or a confirmation request command relating to the necessary care information (for example, "body temperature") of one user, this "target person determination process" determines the one user who is to be the target of the command processing based on the "user-specific word" (for example, "last name: Yamada") used to identify that user, the "user information" (for example, the "user table" information of FIG. 2), and the result of the dialogue processing with the care worker 10 via the voice input/output terminal 20.
  • As a result, the care worker 10 may specify the user to be command-processed by voice using only the "last name" (for example, "Yamada-san's body temperature"); afterwards, through dialogue (question and answer) with the system side, the user who is the target of the command processing can be accurately identified (for example, "Yamada Yoshiaki's body temperature").
  • Normally, the target of the command processing is specified by voice with the full "first and last name"; however, even when the care worker is busy or has inadvertently forgotten the first name, the target person can still be accurately identified through dialogue with the system as long as the "last name" is known, and the processing corresponding to the given command (for example, an input request command or a confirmation request command) can then be executed.
  • Specifically, the user-specific word may consist of only one of the plural person-specific words ordinarily used to identify an individual, such as the "last name", "first name", "middle name", or "nickname" (for example, "last name: Yamada").
  • In that case, a user search using the "user-specific word" (for example, "last name: Yamada") as the search key is executed in the user information holding unit (for example, the user table of FIG. 2). When the number of hit users is two or more, text data corresponding to a selection-request utterance including the "first name" (or "last name") of each user (for example, "Yamada Yoshiaki or Yamada Hiroshi, which one?") is automatically generated and transmitted to the voice input/output terminal 20 via the text-to-voice conversion unit, and the target person to be command-processed is determined based on the answer returned from the voice input/output terminal 20 via the voice-to-text conversion unit.
  • The "last name" in the above explanation can be replaced with the "first name", and the "first name" with the "last name". That is, when the "user-specific word" consists of only the "first name" (or "last name"), a user search using that word (for example, "first name: Bill") as the search key is executed in the user information holding unit (for example, FIG. 2, user table); when the number of hit users is two or more, text data corresponding to a selection-request utterance including the "last name" (or "first name") of each user (for example, "Bill Clinton or Bill Gates, which one?") is automatically generated and transmitted to the voice input/output terminal 20 via the text-to-voice conversion unit (FIG. 4(b)), and the target person to be command-processed is then determined based on the "last name" (or "first name") (for example, "Gates") contained in the text data returned from the voice input/output terminal 20 via the voice-to-text conversion unit (FIG. 4(a)).
  • Furthermore, when a user search using the "user-specific word" (for example, "last name: Yamada") as the search key is executed in the user information holding unit (for example, FIG. 2, user table) and the number of hit users is two or more and there are users with the same last and first name among them, text data corresponding to a selection-request utterance including the "room number" of each candidate (for example, "Mr. Yamada Yoshiaki in room 105 or in room 115, which one?") is automatically generated and transmitted to the voice input/output terminal 20 via the text-to-voice conversion unit (FIG. 4(b)); the target person to be command-processed is then determined based on the text data returned from the voice input/output terminal 20 via the voice-to-text conversion unit (FIG. 4(a)).
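  • Putting the branches described in this specification together (no hit, exactly one hit, several hits, and same-named users distinguished by room number), the decision logic can be sketched in Python roughly as follows. The ask() helper standing in for the TTS/STT round trip, the table layout, and the exact wording are assumptions; only the branching on the number of hits follows the description.

```python
# Sketch of the target person determination process against the illustrative
# user table defined earlier. ask() sends an utterance to the terminal and
# returns the care worker's transcribed answer (assumed helper).
def determine_target_person(conn, last_name, ask):
    rows = conn.execute(
        "SELECT user_id, last_name, first_name, room_number "
        "FROM user WHERE last_name = ?",
        (last_name,),
    ).fetchall()

    if not rows:                                   # 0 hits: report and stop
        ask(f"{last_name}-san was not found. Would you like to search for someone else?")
        return None

    if len(rows) == 1:                             # 1 hit: read back the full name, require "yes"
        uid, ln, fn, _ = rows[0]
        reply = ask(f"{ln} {fn}-san, correct?")
        return uid if reply.strip().lower() in ("yes", "hai") else None

    first_names = [r[2] for r in rows]
    if len(set(first_names)) == len(rows):         # several hits, distinct first names
        choice = ask(f"There are several {last_name}-sans. "
                     f"Which one: {', '.join(sorted(first_names))}?")
        for uid, ln, fn, _ in rows:
            if fn == choice:
                return uid
    else:                                          # same last and first name: use the room number
        rooms = [r[3] for r in rows]
        choice = ask(f"There are several users named {last_name}. "
                     f"Which room: {', '.join(rooms)}?")
        for uid, ln, fn, room in rows:
            if room == choice:
                return uid
    return None
```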
  • FIG. 8 shows the processing corresponding to the "body temperature input request command",
  • and FIG. 9 shows the processing corresponding to the "body temperature confirmation command".
  • First, it is determined whether there is at least one user (hereinafter, "target person") whose body temperature is to be input (step 403). If it is determined that at least one target person exists (step 403 YES), it is further determined whether the number of target persons corresponding to "last name: Ikeda" is one, or two or more (step 404).
  • When it is determined that only one person corresponds to "last name: Ikeda" (step 404 YES), it is then determined whether a "word (variable value)" corresponding to the variable "time information" exists in the text data constituting the "body temperature input request command" (step 406).
  • In the illustrated example, the three input request utterances IR01, IR02, and IR03 that are the sources of the text data contain the words "11:50", "now", and "5 minutes ago", respectively, corresponding to the variable "time information", so it is determined that time information exists (step 406 YES).
  • In contrast, the input request utterance IR04 contains no word corresponding to the time information, so it is determined that time information does not exist (step 406 NO).
  • If it is determined that time information exists (step 406 YES), the process immediately proceeds to the input-information utterance processing (step 407). If it is determined that time information does not exist (step 406 NO), the current time is supplemented as the time information and the process then proceeds to the input-information utterance processing (step 407).
  • In the input-information utterance processing (step 407), a repeat question utterance for confirming the content of the input request spoken by the care worker 10 into the microphone of the voice input/output terminal 20 (for example, "Mr. Ikeda Yoshiaki's body temperature at 11:50 is 37.2 °C, correct?") 416 is caused to be uttered from the speaker of the voice input/output terminal 20.
  • The text data corresponding to the repeat question utterance generated in this way is transmitted to the voice processing server 32 via the Internet 31 in the form of a text-to-speech conversion request. The repeat question speech data generated by the voice processing server 32 is then transmitted to the voice input/output terminal 20, and as a result the repeat question utterance (in this example, "Ikeda Yoshiaki's body temperature at 11:50 is 37.2 °C, isn't it?") is uttered.
  • The system then waits for text data corresponding to the care staff's response utterance to arrive from the voice input/output terminal 20 via the voice processing server 32 (step 408).
  • If an affirmative response utterance (for example, "yes") is spoken into the microphone (step 408, affirmative response),
  • "Ikeda Yoshiaki" is confirmed as the target of the command processing, and the information that the body temperature at 11:50 was 37.2 °C is recorded for the user "Ikeda Yoshiaki" in the vital information table (see FIG. 2).
  • Text data corresponding to the recording-completion utterance "Recorded" 420 is then automatically generated and transmitted to the voice processing server 32 via the Internet 31 in the form of a text-to-speech conversion request. The recording-completion speech data generated by the voice processing server 32 is transmitted to the voice input/output terminal 20, whereby the recording-completion utterance (in this example, "Recorded") is uttered and the care worker 10 is notified of the completion.
  • If, on the other hand, the care staff speaks a negative response utterance (for example, "no") into the microphone (step 408, negative response 418), or if no response arrives within a certain waiting time (for example, 5 seconds) (step 408, ignore 419), "Ikeda Yoshiaki" is not confirmed as the target of the command processing
  • and the above-described recording process is not performed. Mistaking the user is thereby avoided.
  • In that case, text data corresponding to a guidance utterance prompting the care worker to speak into the microphone of the voice input/output terminal 20 again (for example, "Please say it again" 421) is automatically generated and converted to speech data by the voice processing server 32,
  • and the guidance utterance is uttered from the speaker of the voice input/output terminal 20.
  • Returning to the determination of whether the number of target persons corresponding to "last name: Ikeda" is one, or two or more (step 404): if it is determined that there are two or more (step 404 NO), it is further determined, although not illustrated, whether there are persons with the same last and first name among those target persons; this determination is made based on the contents of the user table (see FIG. 2).
  • When there are no persons with the same last and first name, text data for a selection-prompting utterance including each candidate's first name is generated. This text data is transmitted to the voice processing server 32 in the form of a text-to-speech conversion request, and the spoken-voice data converted by the voice processing server 32 is sent to the voice input/output terminal 20.
  • As a result, a selection-prompting utterance (for example, "There are several Ikeda-sans. Mr. Yoshiaki or Mr. Norimasa, which one?") flows from the speaker of the voice input/output terminal 20.
  • The system then waits for an utterance selecting one of the two "first names" ("Yoshiaki" and "Norimasa") to be spoken into the microphone of the voice input/output terminal 20 (step 405).
  • Once the selection is made, the input-information utterance processing (step 407) is executed after passing through the same series of processes (steps 403 YES, 404 YES, 406 YES) as when there is only one target person with that "last name",
  • and after the input-information confirmation processing (step 408), the body-temperature data recording process for "Ikeda Yoshiaki-san" is completed.
  • When there are persons with the same last and first name, text data for a selection-prompting utterance including each candidate's room number is generated instead, transmitted to the voice processing server 32 in the form of a text-to-speech conversion request, and the spoken-voice data converted by the voice processing server 32 is sent to the voice input/output terminal 20.
  • As a result, a selection-prompting utterance (for example, "There are several users named Ikeda Yoshiaki. Room 201 or room 302, which one?") flows from the speaker of the voice input/output terminal 20.
  • The system then waits for an utterance selecting one of the two "room numbers" ("room 201" and "room 302") to be spoken into the microphone of the voice input/output terminal 20 (step 405).
  • Once the selection is made, the input-information utterance processing (step 407) is executed after passing through the same series of processes (steps 403 YES, 404 YES, 406 YES) as when there is only one target person with that "last name",
  • and after the input-information confirmation processing (step 408), the body-temperature data recording process for "Ikeda Yoshiaki-san" is completed.
  • If it is determined in the determination process (step 403) that no target person exists (step 403 NO), text data corresponding to an utterance 410 notifying the absence of the target person and asking whether to search further ("Ikeda-san was not found. Would you like to search for someone else?") is automatically generated, converted to spoken-voice data via the voice processing server 32, transmitted to the voice input/output terminal 20, and uttered there. Mistaking the user is thereby avoided.
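  • The confirm-then-record portion of this flow (steps 406 to 408) can be summarized in the Python sketch below; the ask_with_timeout() helper, the set of affirmative words, and the table columns are illustrative assumptions.

```python
# Read the command content back, and record it only when the care worker
# answers affirmatively within the waiting time (assumed to be 5 seconds).
AFFIRMATIVE = {"yes", "hai", "ok"}

def confirm_and_record(conn, user_id, full_name, time_str, temp_c, ask_with_timeout):
    reply = ask_with_timeout(
        f"{full_name}'s body temperature at {time_str} is {temp_c} degrees, correct?",
        timeout_s=5)
    if reply is not None and reply.strip().lower() in AFFIRMATIVE:
        conn.execute(
            "INSERT INTO vital_info (user_id, recorded_at, body_temp) VALUES (?, ?, ?)",
            (user_id, time_str, temp_c))
        conn.commit()
        return "Recorded."                # completion utterance
    return "Please say it again."         # negative answer or timeout: nothing is recorded
```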
  • In the processing of FIG. 9, the handling (step 506, 513) when there are two or more target persons and no persons with the same last and first name among them (step 504 NO), and the handling
  • (514) when there are two or more target persons and there are persons with the same last and first name among them, are the same as for the body temperature input request command described with reference to FIG. 8.
  • When no body temperature data exists (step 507 NO), an utterance to the effect that the data does not exist ("There is no body temperature data for Mr. Ikeda Yoshiaki yesterday") 517 is uttered to the care staff via the voice input/output terminal 20, and the process ends (step 508).
  • When a single piece of data exists for the relevant day (step 509 NO), an utterance reporting the body temperature ("Mr. Ikeda Yoshiaki's body temperature at 10:35 yesterday was 36.8 °C") 518 is uttered to the care staff via the voice input/output terminal 20, and the process ends (step 510).
  • When multiple pieces of data exist for the relevant day (step 509 YES), an utterance reporting the body temperatures ("Mr. Ikeda Yoshiaki's body temperature yesterday was 38.2 °C at 18:40, 37.2 °C at 12:10, and 36.8 °C at 10:35") 519 is uttered to the care staff via the voice input/output terminal 20, and the process ends (step 511).
  • In the processing of FIG. 10, the handling (607) when there is exactly one target person (step 604 YES), the handling when there are two or more target persons without any persons with the same last and first name (step 604 NO),
  • and the handling (612) when there are two or more target persons including persons with the same last and first name are processed in the same manner as for the body temperature input request command described with reference to FIG. 8.
  • The batch designation of all target persons with exception designation uses the command word "other than", which corresponds to the "all-user batch designation command with exception designation".
  • This command word presupposes a preset population (for example, all users on the first floor, or all users gathered in the cafeteria); with the exception of the "users" that stand in a certain word-order relationship to "other than", the process of writing the same data for all members of the population is performed at once, as sketched below.
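  • The following minimal Python sketch illustrates such a batch write with exceptions. The in-memory records dict, the population list, and the field name are assumptions standing in for the facility's actual tables.

```python
# Write the same value at once for every user in a preset population,
# except the users named after "other than".
def batch_record_except(records: dict, population: list, excluded: set,
                        field: str, value):
    targets = [name for name in population if name not in excluded]
    for name in targets:
        records.setdefault(name, {})[field] = value
    return targets

# Example: record 100% staple-food intake for everyone in the cafeteria
# except Mr. Yamada (names and field are illustrative).
records = {}
done = batch_record_except(records, ["Yamada", "Ikeda", "Takahashi"],
                           {"Yamada"}, "staple_food_intake_pct", 100)
print(done)   # ['Ikeda', 'Takahashi']
```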
  • the data input operation in this kind of elderly care facility can be easily and efficiently performed via voice. Therefore, it is possible to improve the workability of data input in a long-term care facility that employs long-term care staff and foreign workers who are not good at visually recognizing fine characters on the screen of an electronic device.
  • the present invention can be effectively used by a software provider that provides various management software to a nursing care facility for the elderly.

Abstract

[Problem] To ensure the reliability of input data by preventing the user for whom data is to be input from being mistaken for another user, while maintaining the ease of input operations by means of voice. [Solution] According to the present invention, when a caregiver gives a data input command to a voice input/output terminal by voice and specifies the target person by his or her "family name" alone, the caregiver is asked a confirmation question by voice that specifies the "family name and given name" of the candidate target person, even if there is only one candidate target person in the database. The candidate target person is confirmed as the true target person, and the specified data input processing is executed, only when the voice question is answered affirmatively by voice.

Description

データ入力支援装置Data input support device
 本発明は、例えば、介護付き有料老人ホーム、特別養護老人ホーム、介護老人保護施設、在宅介護サービス提供事業者、等々の老人介護施設において好適なデータ入力支援装置に係り、特に、音声を利用したデータ入力支援  装置に関する。 The present invention relates to a data input support device suitable for elderly care facilities such as, for example, a pay nursing home with nursing care, a special elderly nursing home, a nursing care facility for the elderly, a home care service provider, etc., and particularly uses voice. Data entry support related to equipment.
 老人介護施設においては、施設利用者(以下、単に、「利用者」と称する)の健康管理 や   提供サービスの品質向上等を目的として、個々の利用者から日々の介護に必要な様々なデータ(例えば、バイタルデータ、睡眠データ、日常生活データ、等々)が収集・登録される。 In elderly care facilities, various data necessary for daily care from individual users for the purpose of health management of facility users (hereinafter simply referred to as "users") and improvement of the quality of services provided (hereinafter referred to as "users"). For example, vital data, sleep data, daily life data, etc.) are collected and registered.
 従来、それらのデータの収集・登録のためには、介護職員が個々の利用者が収容される部屋に出向いて、データ収集の対象となる事象(例えば、体温、血圧、食事摂取量、等々)を目視又は計測器にて確認後、こうして得られるデータを、ノートパソコンやタブレット等の情報処理装置を用いて手入力するのが通例である。 Conventionally, in order to collect and register such data, a care worker goes to the room in which each user is accommodated, checks the events to be recorded (for example, body temperature, blood pressure, food intake, and so on) visually or with a measuring instrument, and then manually inputs the data thus obtained using an information processing device such as a notebook computer or a tablet.
 昨今、人手不足から介護職員の高年齢化も進み、中には、電子機器の画面に映し出される細かな文字や図表の視認、さらには、入力のためのキー操作やタッチ操作が苦手な者も少なからず存在する。また、外国人労働者の場合には、日常会話には不自由しないものの、漢字を用いた日本語の読み書きが苦手な者も少なからず存在する。 In recent years, labor shortages have also led to an aging of care staff, and not a few of them are not good at reading the fine characters and charts displayed on the screens of electronic devices, or at the key and touch operations required for input. In the case of foreign workers, although they have no difficulty with daily conversation, quite a few are not good at reading and writing Japanese that uses kanji.
 それらの問題の解決策として、老人介護施設におけるこの種のデータ入力を、音声入出力が可能な携帯式端末(例えば、スマートフォン)を使用して行おうとする試みもなされている(例えば、特許文献1参照)。 As a solution to these problems, attempts have been made to perform this kind of data input in elderly care facilities using a portable terminal capable of voice input and output, such as a smartphone (see, for example, Patent Document 1).
特開2012-073739号公報JP 2012-073739
 本発明は、上述の技術的背景に鑑みてなされたものであり、その主たる目的とするところは、音声による入力操作の容易性を維持しつつも、データ入力の対象となる利用者の取り違えをなくして、入力データの信頼性を担保することができる老人介護施設におけるデータ入力支援装置、方法、システム、並びに、コンピュータプログラムを提供することにある。 The present invention has been made in view of the above-mentioned technical background, and the main object thereof is to make a mistake of a user who is a target of data input while maintaining the ease of input operation by voice. The purpose is to provide data input support devices, methods, systems, and computer programs in elderly care facilities that can ensure the reliability of input data without them.
 上述の技術的課題は、以下の構成を有する、老人介護施設におけるデータ入力支援装置、方法、システム、並びに、コンピュータプログラムにより解決することできる。
  すなわち、本発明に係る老人介護施設におけるデータ入力支援装置は、
 介護職員が携帯可能であって、マイクとスピーカと通信機能とが組み込まれた音声入出力端末と、
  当該老人介護施設の利用者に関する利用者情報を保持する利用者情報保持部と、
  当該老人介護施設の利用者に関する介護必要情報を保持する介護必要情報保持部と、
  既知の入出力関係の学習により生成された変換モデルにしたがって、スピーチ音声データを対応するテキストデータに変換する音声/テキスト・変換部と、
  既知の入出力関係の学習により生成された変換モデルにしたがって、テキストデータを対応するスピーチ音声データに変換するテキスト/音声・変換部と、
  前記音声入出力端末との通信を介して取得され、かつ前記介護職員が前記マイクに対して話しかけることにより生成されたスピーチ音声データを、前記音声/テキスト・変換部を介して変換することにより得られたテキストデータを、既知の対話の学習により得られた対話モデルにしたがって文章解析することにより、前記テキストの内容を解読するテキスト解読部と、
  前記テキスト解読部における解読結果が1の利用者の介護必要情報に関する入力要求コマンド又は確認要求コマンドであるときには、当該コマンド処理の対象となる1の利用者の特定に使用されている利用者特定語と前記利用者情報と前記音声出力端末を経由する前記介護職員との対話処理とに基づいて、コマンド処理の対象となる1の利用者を確定する対象者確定処理部と、
  前記対象者確定処理部にて確定された1の利用者に関して、前記入力要求コマンド又は前記確認要求コマンドにて指定された処理を前記介護必要情報保持部に関して実行するコマンド実行部とを包含し、
  前記対象者確定処理部は、
  前記利用者特定語が、個人の特定に通常使用される「姓」、「名」、「中間名」、「愛称」等々の複数の「個人特定語」のうちの1つのみで構成されるときには、前記利用者情報保持部において、前記「利用者特定語」を検索キーとした検索処理を実行し、その結果、ヒットした利用者の件数が1件のときには、前記「個人特定語」のうちの前記「利用者特定語」を除く残りの1又は2以上の「語」を含む所定の再確認用スピーチに相当するテキストデータを生成して、前記テキスト/音声・変換部経由で前記音声入出力端末へと送信し、しかるのち、前記音声・テキスト変換部経由で、前記音声入出力端末から肯定的返答に相当するテキストデータを受信したときに限り、コマンド処理の対象となる1の利用者を確定するものである。
The above-mentioned technical problems can be solved by a data input support device, a method, a system, and a computer program in a nursing care facility for the elderly, which have the following configurations.
That is, the data input support device in the elderly care facility according to the present invention is
A voice input / output terminal that is portable to caregivers and has a built-in microphone, speaker, and communication function.
The user information holding department that holds user information about the users of the elderly care facility,
The long-term care necessary information holding department that holds the long-term care necessary information about the users of the elderly care facility,
A voice / text / conversion unit that converts speech voice data into the corresponding text data according to a conversion model generated by learning of known input / output relationships.
A text / voice / conversion unit that converts text data into corresponding speech voice data according to a conversion model generated by learning of known input / output relationships.
A text decoding unit that decodes the content of the text by performing sentence analysis, in accordance with a dialogue model obtained by learning known dialogues, on the text data obtained by converting, via the voice / text conversion unit, the speech voice data that is acquired through communication with the voice input / output terminal and that is generated by the care worker speaking into the microphone,
A target person determination processing unit that, when the decoding result in the text decoding unit is an input request command or a confirmation request command relating to the long-term care necessary information of one user, determines the one user to be the target of the command processing on the basis of the user-specific word used to designate that user, the user information, and dialogue processing with the care worker via the voice output terminal, and
A command execution unit that executes, with respect to the long-term care necessary information holding unit, the processing designated by the input request command or the confirmation request command for the one user determined by the target person determination processing unit,
wherein the target person determination processing unit,
when the user-specific word consists of only one of a plurality of "person-specific words" normally used to identify an individual, such as "surname", "given name", "middle name" and "nickname", executes a search of the user information holding unit using the "user-specific word" as a search key; when the number of hit users is one, it generates text data corresponding to a predetermined reconfirmation speech containing one or more of the remaining "person-specific words" other than the "user-specific word", transmits the text data to the voice input / output terminal via the text / voice conversion unit, and determines the one user as the target of the command processing only when text data corresponding to an affirmative reply is thereafter received from the voice input / output terminal via the voice / text conversion unit.
 このような構成によれば、コマンド処理の対象となる利用者の指定を「姓」や「名」のみで行うことで、音声による入力操作の容易性を維持しつつも、データ入力の対象となる利用者の取り違えをなくして、入力データの信頼性を担保することができる。 With such a configuration, the user targeted by the command processing can be designated by "surname" or "given name" alone, so that the ease of input operation by voice is maintained while mistaking the user targeted by the data input for another user is eliminated and the reliability of the input data is ensured.
 好ましい実施の態様にあっては、
  前記対象者確定処理部は、
  前記利用者情報保持部において、前記「利用者特定語」を検索キーとした検索処理を実行し、その結果、ヒットした利用者の件数が0件のときには、該当コマンド処理の対象となる1の利用者を発見しない旨を少なくとも含むスピーチに相当するテキストデータを生成して、前記テキスト/音声・変換部経由で、前記音声入出力端末へと送信する、
  ものであってもよい。
In a preferred embodiment,
The target person determination processing unit
may execute, in the user information holding unit, a search process using the "user-specific word" as a search key and, when the number of hit users is zero, generate text data corresponding to a speech stating at least that the one user targeted by the command processing was not found, and transmit the text data to the voice input / output terminal via the text / voice conversion unit.
 好ましい実施の態様にあっては、
  前記コマンド実行部は、
  前記テキスト解読部における解読結果が1の利用者の介護必要情報に関する入力要求コマンドであるときには、前記音声/テキスト・変換部及び前記テキスト解析部を経由して、前記音声入出力端末から受信された介護必要情報を前記介護必要情報保持部内の指定された利用者領域に書き込む、
  ものであってもよい。
In a preferred embodiment,
The command execution unit
may, when the decoding result in the text decoding unit is an input request command relating to the long-term care necessary information of one user, write the long-term care necessary information received from the voice input / output terminal via the voice / text conversion unit and the text analysis unit into the designated user area in the long-term care necessary information holding unit.
 好ましい実施の態様にあっては、
  前記コマンド実行部は、
  前記テキスト解読部における解読結果が1の利用者の介護必要情報に関する確認要求コマンドであるときには、前記介護必要情報保持部内の指定された利用者領域から読み出した介護必要情報を、前記テキスト/音声・変換部を経由して、前記音声入出力端末へと送信する、
  ものであってもよい。
In a preferred embodiment,
The command execution unit
may, when the decoding result in the text decoding unit is a confirmation request command relating to the long-term care necessary information of one user, transmit the long-term care necessary information read from the designated user area in the long-term care necessary information holding unit to the voice input / output terminal via the text / voice conversion unit.
 好ましい実施の態様にあっては、
  前記「個人特定語」のうちの前記「利用者特定語」が利用者の「姓」であり、「姓」を除く残りの1の「利用者特定語」が利用者の「名」であってもよい。
In a preferred embodiment,
Of the "personal specific words", the "user specific word" is the user's "last name", and the remaining one "user specific word" excluding the "last name" is the user's "first name". You may.
 別の一面から見た本発明は、老人介護施設におけるデータ入力支援方法として把握することもできる。
  すなわち、この老人介護施設におけるデータ入力支援方法は、
  介護職員が携帯可能であって、マイクとスピーカと通信機能とが組み込まれた音声入出力端末と、
  当該老人介護施設の利用者に関する利用者情報を保持する利用者情報保持部と、
  当該老人介護施設の利用者に関する介護必要情報を保持する介護必要情報保持部と、
  既知の入出力関係の学習により生成された変換モデルにしたがって、スピーチ音声データを対応するテキストデータに変換する音声/テキスト・変換部と、
  既知の入出力関係の学習により生成された変換モデルにしたがって、テキストデータを対応するスピーチ音声データに変換するテキスト/音声・変換部と、を含み、
  前記音声入出力端末との通信を介して取得され、かつ前記介護職員が前記マイクに対して話しかけることにより生成されたスピーチ音声データを、前記音声/テキスト・変換部を介して変換することにより得られたテキストデータを、既知の対話の学習により得られた対話モデルにしたがって文章解析することにより、前記テキストの内容を解読するテキスト解読ステップと、
  前記テキスト解読ステップにおける解読結果が1の利用者の介護必要情報に関する入力要求コマンド又は確認要求コマンドであるときには、当該コマンド処理の対象となる1の利用者の特定に使用されている利用者特定語と前記利用者情報と前記音声出力端末を経由する前記介護職員との対話内容とに基づいて、コマンド処理の対象となる1の利用者を確定する対象者確定処理ステップと、
  前記対象者確定処理ステップにて確定された1の利用者に関して、前記入力要求コマンド又は前記確認要求コマンドにて指定された処理を前記介護必要情報保持部に関して実行するコマンド実行部とを包含し、
  前記対象者確定処理ステップは、
  前記利用者特定語が、個人の特定に通常使用される「姓」、「名」、「中間名」、「愛称」等々の複数の「個人特定語」のうちの1つのみで構成されるときには、前記利用者情報保持部において、前記「利用者特定語」を検索キーとした検索処理を実行し、その結果、ヒットした利用者の件数が1件のときには、前記「個人特定語」のうちの前記「利用者特定語」を除く残りの1又は2以上の「語」を含む所定の再確認用スピーチに相当するテキストデータを生成して、前記テキスト/音声・変換部経由で前記音声入出力端末へと送信し、しかるのち、前記音声・テキスト変換部経由で、前記音声入出力端末から肯定的返答に相当するテキストデータを受信したときに限り、コマンド処理の対象となる1の利用者を確定するものである。
The present invention viewed from another aspect can also be grasped as a data input support method in a nursing care facility for the elderly.
That is, the data input support method in this elderly care facility is
A voice input / output terminal that is portable to caregivers and has a built-in microphone, speaker, and communication function.
The user information holding department that holds user information about the users of the elderly care facility,
The long-term care necessary information holding department that holds the long-term care necessary information about the users of the elderly care facility,
A voice / text / conversion unit that converts speech voice data into the corresponding text data according to a conversion model generated by learning of known input / output relationships.
Including a text / speech / conversion unit that converts text data into corresponding speech speech data according to a conversion model generated by learning of known input / output relationships.
Obtained by converting the speech voice data acquired through communication with the voice input / output terminal and generated by the care worker speaking to the microphone via the voice / text / conversion unit. A text decoding step for decoding the content of the text by analyzing the obtained text data according to a dialogue model obtained by learning a known dialogue.
When the decoding result in the text decoding step is an input request command or a confirmation request command related to the long-term care necessary information of one user, the user-specific word used to identify one user who is the target of the command processing. Based on the user information and the content of the dialogue with the care staff via the voice output terminal, a target person determination processing step for determining one user to be command processed, and a target person determination processing step.
For one user confirmed in the target person confirmation processing step, the command execution unit that executes the processing specified by the input request command or the confirmation request command with respect to the long-term care necessary information holding unit is included.
The target person determination processing step is
The user-specific word is composed of only one of a plurality of "person-specific words" such as "last name", "first name", "intermediate name", "nickname", etc., which are usually used for identifying an individual. Occasionally, the user information holding unit executes a search process using the "user specific word" as a search key, and as a result, when the number of hit users is one, the "personal specific word" is used. Generates text data corresponding to a predetermined reconfirmation speech including the remaining one or two or more "words" excluding the "user-specific word", and the voice via the text / voice / conversion unit. Use of 1 which is the target of command processing only when it is transmitted to the input / output terminal and then text data corresponding to a positive response is received from the voice input / output terminal via the voice / text conversion unit. It confirms the person.
 別の一面から見た本発明は、老人介護施設におけるデータ入力支援システムとして把握することもできる。
  すなわち、この老人介護施設におけるデータ入力支援システムは、
  介護職員が携帯可能であって、マイクとスピーカとを有し、かつ無線によるネットワーク接続機能を備えた音声入出力端末と、
  当該老人介護施設の利用者に関する利用者情報を保持する、ネットワーク上の利用者情報保持サーバと、
  当該老人介護施設の利用者に関する介護必要情報を保持する介護必要情報保持サーバと、
  既知の入出力関係の学習により生成された変換モデルにしたがって、スピーチ音声データを対応するテキストデータに変換する、ネットワーク上の音声/テキスト・変換サーバと、
  既知の入出力関係の学習により生成された変換モデルにしたがって、テキストデータを対応するスピーチ音声データに変換する、ネットワーク上のテキスト/音声・変換サーバと、
  前記音声入出力端末との通信を介して取得され、かつ前記介護職員が前記マイクに対して話しかけることにより生成されたスピーチ音声データを、前記音声/テキスト・変換ステップを介して変換することにより得られたテキストデータを、既知の対話の学習により得られた対話モデルにしたがって文章解析することにより、前記テキストの内容を解読する、ネットワーク上のテキスト解読サーバと、
  前記テキスト解読サーバにおける解読結果が1の利用者の介護必要情報に関する入力要求コマンド又は確認要求コマンドであるときには、当該コマンド処理の対象となる1の利用者の特定に使用されている利用者特定語と前記利用者情報と前記音声出力端末を経由する前記介護職員との対話処理の内容とに基づいて、コマンド処理の対象となる1の利用者を確定する、ネットワーク上の対象者確定処理サーバと、
  前記対象者確定処理部にて確定された1の利用者に関して、前記入力要求コマンド又は前記確認要求コマンドにて指定された処理を前記介護必要情報保持部に関して実行するコマンド実行サーバとを包含し、
  前記対象者確定処理サーバは、
  前記利用者特定語が、個人の特定に通常使用される「姓」、「名」、「中間名」、「愛称」等々の複数の「個人特定語」のうちの1つのみで構成されるときには、前記利用者情報保持サーバにおいて、前記「利用者特定語」を検索キーとした検索処理を実行し、その結果、ヒットした利用者の件数が1件のときには、前記「個人特定語」のうちの前記「利用者特定語」を除く残りの1又は2以上の「語」を含む所定の再確認用スピーチに相当するテキストデータを生成して、前記テキスト/音声・変換部経由で前記音声入出力端末へと送信し、しかるのち、前記音声・テキスト変換部経由で、前記音声入出力端末から肯定的返答に相当するテキストデータを受信したときに限り、コマンド処理の対象となる1の利用者を確定するものである。
The present invention seen from another aspect can also be grasped as a data input support system in a nursing care facility for the elderly.
That is, the data input support system in this elderly care facility is
A voice input / output terminal that is portable to the care staff, has a microphone and a speaker, and has a wireless network connection function.
A user information holding server on the network that holds user information about the users of the elderly care facility,
A long-term care necessary information holding server that holds long-term care necessary information about users of the elderly care facility, and
A voice / text conversion server on the network that converts speech voice data into the corresponding text data according to a conversion model generated by learning of known input / output relationships.
A text / voice / conversion server on the network that converts text data into the corresponding speech voice data according to a conversion model generated by learning of known input / output relationships.
Obtained by converting speech voice data acquired via communication with the voice input / output terminal and generated by the care worker speaking to the microphone through the voice / text conversion step. A text decoding server on the network that decodes the content of the text by analyzing the obtained text data according to a dialogue model obtained by learning a known dialogue.
When the decoding result on the text decoding server is an input request command or a confirmation request command related to the long-term care necessary information of one user, the user-specific word used to identify one user who is the target of the command processing. And the target person confirmation processing server on the network that determines one user to be command processed based on the user information and the content of the dialogue processing with the care staff via the voice output terminal. ,
For one user confirmed by the target person confirmation processing unit, the command execution server that executes the processing specified by the input request command or the confirmation request command for the long-term care necessary information holding unit is included.
The target person confirmation processing server is
The user-specific word is composed of only one of a plurality of "person-specific words" such as "last name", "first name", "intermediate name", "nickname", etc., which are usually used for identifying an individual. Occasionally, the user information holding server executes a search process using the "user specific word" as a search key, and as a result, when the number of hit users is one, the "personal specific word" is used. Generates text data corresponding to a predetermined reconfirmation speech including the remaining one or two or more "words" excluding the "user-specific word", and the voice via the text / voice / conversion unit. Use of 1 which is the target of command processing only when it is transmitted to the input / output terminal and then text data corresponding to a positive response is received from the voice input / output terminal via the voice / text conversion unit. It confirms the person.
 なお、上述のシステムにおいて、個々のサーバは物理的に別々のサーバである必要はない。例えば、音声/テキスト・変換サーバとテキスト/音声・変換サーバとを物理的に同一のサーバとして構成してもよいし、テキスト解読サーバと対象者確定処理サーバとコマンド実行サーバとを物理的に同一のサーバとして構成しても良い。 In the above system, the individual servers do not have to be physically separate servers. For example, the voice / text conversion server and the text / voice conversion server may be configured as physically the same server, and the text decoding server, the target person determination processing server, and the command execution server may likewise be configured as physically the same server.
 別の一面から見た本発明は、老人介護施設におけるデータ入力支援装置用のコンピュータプログラムとして把握することもできる。
  すなわち、この老人介護施設におけるデータ入力支援装置用のコンピュータプログラムは、
  介護職員が携帯可能であって、マイクとスピーカと通信機能とが組み込まれた音声入出力端末と、
  当該老人介護施設の利用者に関する利用者情報を保持する利用者情報保持部と、
  当該老人介護施設の利用者に関する介護必要情報を保持する介護必要情報保持部と、
  既知の入出力関係の学習により生成された変換モデルにしたがって、スピーチ音声データを対応するテキストデータに変換する音声/テキスト・変換部と、
  既知の入出力関係の学習により生成された変換モデルにしたがって、テキストデータを対応するスピーチ音声データに変換するテキスト/音声・変換部と、を有する老人介護施設におけるデータ入出力装置において、
  コンピュータを、
  前記音声入出力端末との通信を介して取得され、かつ前記介護職員が前記マイクに対して話しかけることにより生成されたスピーチ音声データを、前記音声/テキスト・変換部を介して変換することにより得られたテキストデータを、既知の対話の学習により得られた対話モデルにしたがって文章解析することにより、前記テキストの内容を解読するテキスト解読部と、
  前記テキスト解読部における解読結果が1の利用者の介護必要情報に関する入力要求コマンド又は確認要求コマンドであるときには、当該コマンド処理の対象となる1の利用者の特定に使用されている利用者特定語と前記利用者情報と前記音声出力端末を経由する前記介護職員との対話処理とに基づいて、コマンド処理の対象となる1の利用者を確定する対象者確定処理部と、
  前記対象者確定処理部にて確定された1の利用者に関して、前記入力要求コマンド又は前記確認要求コマンドにて指定された処理を前記介護必要情報保持部に関して実行するコマンド実行部とを包含し、
  前記対象者確定処理部は、
  前記利用者特定語が、個人の特定に通常使用される「姓」、「名」、「中間名」、「愛称」等々の複数の「個人特定語」のうちの1つのみで構成されるときには、前記利用者情報保持部において、前記「利用者特定語」を検索キーとした検索処理を実行し、その結果、ヒットした利用者の件数が1件のときには、前記「個人特定語」のうちの前記「利用者特定語」を除く残りの1又は2以上の「語」を含む所定の再確認用スピーチに相当するテキストデータを生成して、前記テキスト/音声・変換部経由で前記音声入出力端末へと送信し、しかるのち、前記音声・テキスト変換部経由で、前記音声入出力端末から肯定的返答に相当するテキストデータを受信したときに限り、コマンド処理の対象となる1の利用者を確定するものである。
  装置として機能させるためのものである。
From another aspect, the present invention can be grasped as a computer program for a data input support device in a nursing care facility for the elderly.
That is, the computer program for the data input support device in this elderly care facility is
A voice input / output terminal that is portable to caregivers and has a built-in microphone, speaker, and communication function.
The user information holding department that holds user information about the users of the elderly care facility,
The long-term care necessary information holding department that holds the long-term care necessary information about the users of the elderly care facility,
A voice / text / conversion unit that converts speech voice data into the corresponding text data according to a conversion model generated by learning of known input / output relationships.
In a data input / output device in an elderly care facility having a text / voice / conversion unit that converts text data into corresponding speech voice data according to a conversion model generated by learning of known input / output relationships.
Computer,
Obtained by converting the speech voice data acquired through communication with the voice input / output terminal and generated by the care worker speaking to the microphone via the voice / text / conversion unit. A text decoding unit that decodes the content of the text by analyzing the obtained text data according to a dialogue model obtained by learning a known dialogue.
When the decoding result in the text decoding unit is an input request command or a confirmation request command related to the long-term care necessary information of one user, the user-specific word used to identify one user who is the target of the command processing. Based on the user information and the dialogue processing with the care staff via the voice output terminal, a target person determination processing unit that determines one user to be command processed, and a target person determination processing unit.
For one user confirmed by the target person confirmation processing unit, the command execution unit that executes the processing specified by the input request command or the confirmation request command with respect to the long-term care necessary information holding unit is included.
The target person determination processing unit
The user-specific word is composed of only one of a plurality of "person-specific words" such as "last name", "first name", "intermediate name", "nickname", etc., which are usually used for identifying an individual. Occasionally, the user information holding unit executes a search process using the "user specific word" as a search key, and as a result, when the number of hit users is one, the "personal specific word" is used. Generates text data corresponding to a predetermined reconfirmation speech including the remaining one or two or more "words" excluding the "user-specific word", and the voice via the text / voice / conversion unit. Use of 1 which is the target of command processing only when it is transmitted to the input / output terminal and then text data corresponding to a positive response is received from the voice input / output terminal via the voice / text conversion unit. It confirms the person.
It is intended to function as a device.
 別の一面から見た本発明は、さらに、老人介護施設におけるデータ入力支援システム用のコンピュータプログラムとして把握することもできる。
  すなわち、この老人介護施設におけるデータ入力支援システム用のコンピュータプログラムは、
  介護職員が携帯可能であって、マイクとスピーカとを有し、かつ無線によるネットワーク接続機能を備えた音声入出力端末と、
  当該老人介護施設の利用者に関する利用者情報を保持する、ネットワーク上の利用者情報保持サーバと、
  当該老人介護施設の利用者に関する介護必要情報を保持する介護必要情報保持サーバと、
  既知の入出力関係の学習により生成された変換モデルにしたがって、スピーチ音声データを対応するテキストデータに変換する、ネットワーク上の音声/テキスト・変換サーバと、
  既知の入出力関係の学習により生成された変換モデルにしたがって、テキストデータを対応するスピーチ音声データに変換する、ネットワーク上テキスト/音声・変換サーバと、を含む老人介護施設におけるデータ入力支援ステムにおいて、
  コンピュータを、
  前記音声入出力端末との通信を介して取得され、かつ前記介護職員が前記マイクに対して話しかけることにより生成されたスピーチ音声データを、前記音声/テキスト・変換部を介して変換することにより得られたテキストデータを、既知の対話の学習により得られた対話モデルにしたがって文章解析することにより、前記テキストの内容を解読するテキスト解読部と、
  前記テキスト解読部における解読結果が1の利用者の介護必要情報に関する入力要求コマンド又は確認要求コマンドであるときには、当該コマンド処理の対象となる1の利用者の特定に使用されている利用者特定語と前記利用者情報と前記音声出力端末を経由する前記介護職員との対話処理とに基づいて、コマンド処理の対象となる1の利用者を確定する対象者確定処理部と、
  前記対象者確定処理部にて確定された1の利用者に関して、前記入力要求コマンド又は前記確認要求コマンドにて指定された処理を前記介護必要情報保持部に関して実行するコマンド実行部とを包含し、
  前記対象者確定処理部は、
  前記利用者特定語が、個人の特定に通常使用される「姓」、「名」、「中間名」、「愛称」等々の複数の「個人特定語」のうちの1つのみで構成されるときには、前記利用者情報保持部において、前記「利用者特定語」を検索キーとした検索処理を実行し、その結果、ヒットした利用者の件数が1件のときには、前記「個人特定語」のうちの前記「利用者特定語」を除く残りの1又は2以上の「語」を含む所定の再確認用スピーチに相当するテキストデータを生成して、前記テキスト/音声・変換部経由で前記音声入出力端末へと送信し、しかるのち、前記音声・テキスト変換部経由で、前記音声入出力端末から肯定的返答に相当するテキストデータを受信したときに限り、コマンド処理の対象となる1の利用者を確定するものである、
  サーバとして機能させるためのものである。
From another aspect, the present invention can also be grasped as a computer program for a data input support system in a nursing care facility for the elderly.
That is, the computer program for the data input support system in this elderly care facility is
A voice input / output terminal that is portable to the care staff, has a microphone and a speaker, and has a wireless network connection function.
A user information holding server on the network that holds user information about the users of the elderly care facility,
A long-term care necessary information holding server that holds long-term care necessary information about users of the elderly care facility, and
A voice / text conversion server on the network that converts speech voice data into the corresponding text data according to a conversion model generated by learning of known input / output relationships.
In a data input support system in a nursing home for the elderly, including a text / voice / conversion server on the network that converts text data into the corresponding speech voice data according to a conversion model generated by learning of known input / output relationships.
Computer,
Obtained by converting the speech voice data acquired through communication with the voice input / output terminal and generated by the care worker speaking to the microphone via the voice / text / conversion unit. A text decoding unit that decodes the content of the text by analyzing the obtained text data according to a dialogue model obtained by learning a known dialogue.
When the decoding result in the text decoding unit is an input request command or a confirmation request command related to the long-term care necessary information of one user, the user-specific word used to identify one user who is the target of the command processing. Based on the user information and the dialogue processing with the care staff via the voice output terminal, a target person determination processing unit that determines one user to be command processed, and a target person determination processing unit.
For one user confirmed by the target person confirmation processing unit, the command execution unit that executes the processing specified by the input request command or the confirmation request command with respect to the long-term care necessary information holding unit is included.
The target person determination processing unit
The user-specific word is composed of only one of a plurality of "person-specific words" such as "last name", "first name", "intermediate name", "nickname", etc., which are usually used for identifying an individual. Occasionally, the user information holding unit executes a search process using the "user specific word" as a search key, and as a result, when the number of hit users is one, the "personal specific word" is used. Generates text data corresponding to a predetermined reconfirmation speech including the remaining one or two or more "words" excluding the "user-specific word", and the voice via the text / voice / conversion unit. Use of 1 which is the target of command processing only when it is transmitted to the input / output terminal and then text data corresponding to a positive response is received from the voice input / output terminal via the voice / text conversion unit. To determine the person,
It is intended to function as a server.
 本発明によれば、コマンド処理の対象となる利用者の指定を「姓」や「名」のみで行うことで、音声による入力操作の容易性を維持しつつも、データ入力の対象となる利用者の取り違えをなくして、入力データの信頼性を担保することができる老人介護施設におけるデータ入力支援装置、方法、システム、並びに、コンピュータプログラムを提供することができる。 According to the present invention, it is possible to provide a data input support device, method, system and computer program for an elderly care facility in which, by designating the user targeted by the command processing by "surname" or "given name" alone, the ease of input operation by voice is maintained while mistaking the user targeted by the data input for another user is eliminated and the reliability of the input data is ensured.
図1は、介護職員による音声入出力端末の操作の様子を描いたイラスト図である。FIG. 1 is an illustration diagram depicting a state of operation of a voice input / output terminal by a care worker. 図2は、本発明を分散サーバシステムにより実現する場合の一例を示すシステム構成図である。FIG. 2 is a system configuration diagram showing an example of a case where the present invention is realized by a distributed server system. 図3は、音声処理サーバにおける処理の流れを示す示すフローチャートである。FIG. 3 is a flowchart showing a processing flow in the voice processing server. 図4は、音声処理サーバにおける2つの基本的な処理をイメージ化して示す図表である。FIG. 4 is a chart showing two basic processes in the voice processing server as an image. 図5は、対話処理サーバにおける処理の流れを示す示すフローチャートである。FIG. 5 is a flowchart showing a processing flow in the dialogue processing server. 図6は、対話処理サーバにおける5つの基本的な処理をイメージ化して示す図表である。FIG. 6 is a chart showing an image of five basic processes in the dialogue processing server. 図7は、コマンド実行処理の流れを示すフローチャートである。FIG. 7 is a flowchart showing the flow of command execution processing. 図8は、コマンド分類結果が「体温入力コマンド」のときにおけるコマンド実行前処理、コマンド実行処理、及び応答生成処理の具体的な一例を示すフローチャートである。FIG. 8 is a flowchart showing a specific example of command execution pre-processing, command execution processing, and response generation processing when the command classification result is “body temperature input command”. 図9は、コマンド分類結果が「体温確認コマンド」のときにおけるコマンド実行前処理、コマンド実行処理、及び応答生成処理の具体的な一例を示すフローチャートである。FIG. 9 is a flowchart showing a specific example of command execution pre-processing, command execution processing, and response generation processing when the command classification result is “body temperature confirmation command”. 図10は、コマンド分類結果が「食事量入力コマンド」のときにおけるコマンド実行前処理、コマンド実行処理、及び応答生成処理の具体的な一例を示すフローチャートである。FIG. 10 is a flowchart showing a specific example of command execution pre-processing, command execution processing, and response generation processing when the command classification result is “meal amount input command”.
 以下に、本発明に係る老人介護施設のデータ入力支援システムの好適な実施の一形態を添付図面(図1~図10)にしたがって詳細に説明する。なお、本発明の他の実施形態であるデータ入力支援装置、方法、コンピュータプログラムについては、当業者であれば、図示のデータ入力支援システムの構成に基づいて、その一部を取り出し又は僅かの改変を施すことで、容易に実施できる筈であるから、特段、個別にその具体的内容を図示することは省略する。 Hereinafter, a preferred embodiment of the data input support system for the elderly care facility according to the present invention will be described in detail with reference to the attached drawings (FIGS. 1 to 10). As for the data input support device, method, and computer program according to another embodiment of the present invention, those skilled in the art can take out or slightly modify a part thereof based on the configuration of the illustrated data input support system. Since it should be easy to carry out by applying the above, it is omitted to show the specific contents individually.
 <<はじめに>>
  後に詳述するように、本発明に係るデータ入力支援システムにおいては、介護職員は、無線によるネットワーク接続機能を有する音声入出力端末のマイクに対して、例えば介護職員同士の対話の際に通常使用される自然言語にて、依頼内容(例えば、データ入力要求、データ確認要求)に相当する話し掛けを行うだけで、システムに対して、それらの依頼内容(介護必要情報を格納するデータベースに対する入力要求、確認要求、等々)を自動的に実行させることができる。
<< Introduction >>
As will be described in detail later, in the data input support system according to the present invention, a care worker simply speaks into the microphone of a voice input / output terminal having a wireless network connection function, in the natural language ordinarily used in conversations between care workers, an utterance corresponding to the requested operation (for example, a data input request or a data confirmation request), and the system automatically carries out that request (an input request or a confirmation request to the database storing the long-term care necessary information, and so on).
 その際に、システムの側で不明点がある場合には、音声入出力端末のスピーカからは、その不明点を問い質すための自然言語による質問文スピーチ音声が流れだし、それに応えるようにして、回答内容を当該端末に対して自然言語で話し掛けることにより、不明点は解消され、所望の依頼内容がシステム側で実行される。その実行完了とともに実行完了を知らせる完了文スピーチ音声が当該端末のスピーカから流れだし、介護職員はこれを聞いてデータ入力完了を確認することができる。 At that time, if there is any unclear point on the system side, a question sentence speech voice in natural language for asking the unclear point will flow from the speaker of the voice input / output terminal, and respond to it. By speaking the answer contents to the terminal in natural language, the unclear points are solved and the desired request contents are executed on the system side. When the execution is completed, a completion sentence speech voice notifying the completion of the execution starts to flow from the speaker of the terminal, and the care staff can confirm the completion of the data input by listening to this.
 音声入出力端末のマイクへと依頼内容(例えば、データ入力要求、データ確認要求)に相当する話し掛けを行うと、音声入出力端末のスピーカからは、依頼内容を確認するための復唱質問文に相当するスピーチ音声が流れ出すから、それ対して、肯定的又は否定的な応答相当の話し掛けをマイクを通して行うことで、介護職員は依頼内容の正確さを担保することができる。 When the care worker speaks into the microphone of the voice input / output terminal an utterance corresponding to the requested operation (for example, a data input request or a data confirmation request), the speaker of the voice input / output terminal plays a repeat-back question speech for confirming the request; by then giving a positive or negative response through the microphone, the care worker can ensure the accuracy of the requested operation.
 データ入力やデータ確認の対象となる施設利用者の指定は、その利用者の「姓名」を指定して行ってもよいが、その利用者の「姓」のみを含んだスピーチ音声(例えば、「タカハシさんの体温」、「ヤマダさんの血圧」、等々)によっても行うことができる。その際、利用者の取り違えを回避するために、指定された利用者の「名」をもって、確認要求スピーチ音声(例えば、「タカハシヨシアキさんですね」)が返される。それに対して、否定的又は肯定的な回答スピーチ音声(例えば、「そうです」)を返すことで、利用者の指定誤りを回避することができる。 The facility user targeted by data input or data confirmation may be designated by the user's full name, but the designation can also be made by a speech voice containing only the user's "surname" (for example, "Takahashi-san's body temperature" or "Yamada-san's blood pressure"). In that case, in order to avoid mistaking one user for another, a confirmation request speech voice that includes the designated user's "given name" (for example, "Takahashi Yoshiaki-san, correct?") is returned. By replying with a positive or negative answer speech voice (for example, "That's right"), a designation error can be avoided.
 この確認要求スピーチ音声は、同一の「姓」を有する利用者が複数人いる場合には、それぞれの「名」を含む選択要求スピーチ音声(例えば、「タカハシさんは2名います。ヨシアキさんですか、ノリユキさんですか」)により行われる。これに対して、選択回答スピーチ音声(例えば、「ヨシアキさん」)を返すことで、利用者の指定誤りを回避することができる。このとき、さらに、同姓同名の利用者が存在する場合には、それらの個々を部屋番号で指定する選択要求スピーチ音声(例えば、「タカハシヨシアキさんは2名います。103号室のタカハシヨシアキさんですか、それとも115号室のタカハシヨシアキさんですか」)により、利用者の指定誤りを回避することができる。この種の老人介護施設では、1部屋に1名の利用者を収容する個室管理方式か、希に、1部屋に2乃至4名を収容する少人数管理方式が通例であり、しかも、同室に同姓同名の利用者を収容しないのが前提でうるから、「姓名+部屋番号」による利用者指定は取り違え回避に有効である。 When there are several users with the same "surname", this confirmation is made with a selection request speech voice that includes each "given name" (for example, "There are two Takahashi-san. Yoshiaki-san or Noriyuki-san?"). By returning a selection answer speech voice (for example, "Yoshiaki-san"), a designation error can be avoided. If, in addition, there are users with both the same surname and the same given name, a selection request speech voice designating each of them by room number (for example, "There are two Takahashi Yoshiaki-san. Takahashi Yoshiaki-san in room 103, or Takahashi Yoshiaki-san in room 115?") avoids the designation error. In this kind of elderly care facility, a private-room arrangement accommodating one user per room, or more rarely a small-group arrangement accommodating two to four users per room, is usual, and it can be assumed that users with the same surname and given name are not placed in the same room, so user designation by "full name + room number" is effective for avoiding mix-ups.
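For illustration only, the following minimal Python sketch summarizes the target-person determination dialogue described above: designation by surname alone, re-confirmation by full name even when there is a single candidate, selection by given name when several users share the surname, and selection by room number when surname and given name coincide. The function ask() stands in for one TTS question followed by one STT answer, and the field names are hypothetical; neither is part of the present disclosure.

    def determine_target(surname_kana, users, ask):
        # users: records from the user table; each holds surname/given name (kana) and a room number
        hits = [u for u in users if u["sei_kana"] == surname_kana]
        if not hits:
            ask(f"{surname_kana}-san was not found. Would you like to search for another person?")
            return None
        if len(hits) == 1:
            u = hits[0]
            # Re-confirm with the full name even though only one candidate exists.
            if "yes" in ask(f"{u['sei']} {u['mei']}-san, correct?").lower():
                return u
            return None
        given_names = sorted({u["mei_kana"] for u in hits})
        if len(given_names) > 1:
            reply = ask(f"There are {len(hits)} {surname_kana}-san. " + " or ".join(given_names) + "?")
            chosen = [u for u in hits if u["mei_kana"] in reply]
            return chosen[0] if len(chosen) == 1 else None
        # Same surname and same given name: disambiguate by room number.
        reply = ask("Which room number? " + " or ".join(u["room"] for u in hits))
        chosen = [u for u in hits if u["room"] in reply]
        return chosen[0] if len(chosen) == 1 else None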
 欧米仕様のシステムにあっては、上記の「姓」を「名」、「名」を「姓」とそれぞれ読み替えることができる。「ビルさん」と称呼指定したのち、「ビル」さんが施設内に1名しかいないときでも、「ビルゲイツさん」ですかと問い質すものである。また、「ビルさん」が二人いる場合には、「ビルゲイツさんですか、それともビルクリントンさんですか」と確認してもよい。また、「ビルさん」と「名」のみにて称呼指定したのち、念のための確認のためには、「愛称」や「中間名(ミドルネーム)」を付加してもよい。 In the Western specification system, the above "last name" can be read as "first name" and "first name" can be read as "last name". After designating the title as "Mr. Bill", even when there is only one "Mr. Bill" in the facility, he asks if he is "Mr. Bill Gates". Also, if there are two "Mr. Bill", you may check "Mr. Bill Gates or Mr. Bill Clinton". In addition, after designating the title only with "Bill-san" and "name", "nickname" or "middle name" may be added for confirmation just in case.
 施設利用者の全員を食堂に集めて朝食、昼食、夕食を提供するような場合を想定すると、集合した全員の人数(例えば、15名)及び個々の氏名は既知であって、そのうち、ある利用者(例えば、タカハシさん)以外は、主食も副食も完食したことを記録する場合も想定される。そのような場合に、15名の利用者のそれぞれについて、個別にスピーチ音声にて入力するのは手間が大きい。そのような場合、「例外指定付き全利用者一括指定命令」を含むスピーチ音声(例えば、「タカハシさん以外は、主食も副食も完食です」を使用することで、音声入力の手間を大幅に削減することができる。ここで、下線部分が「例外指定付き全利用者一括指定命令」に相当する。 Consider the case where all facility users are gathered in the cafeteria and served breakfast, lunch or dinner: the number of people gathered (for example, 15) and their individual names are known, and it may be necessary to record that everyone except a certain user (for example, Takahashi-san) finished both the staple food and the side dishes. In such a case, entering the data by speech individually for each of the 15 users is laborious. Instead, using a speech voice containing a "collective designation command for all users with exception designation" (for example, "Everyone except Takahashi-san finished both the staple food and the side dishes") greatly reduces the effort of voice input; here, the underlined part corresponds to the "collective designation command for all users with exception designation".
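As a further illustration, a minimal Python sketch of how a "collective designation command for all users with exception designation" could be applied is shown below; the population, the record layout and the helper names are assumptions made for the example only.

    def record_for_all_except(population, excluded_surnames, record, write_meal_record):
        # population: the preset group (for example, all users gathered in the cafeteria)
        written = []
        for user in population:
            if user["sei_kana"] in excluded_surnames:
                continue                                 # users named after "other than" are skipped
            write_meal_record(user["id"], record)        # one write per remaining user
            written.append(user["id"])
        return written

    # Example: everyone in the cafeteria except Takahashi-san ate 100% of both dishes.
    cafeteria_users = [{"id": "U001", "sei_kana": "タカハシ"},
                       {"id": "U002", "sei_kana": "ヤマダ"},
                       {"id": "U003", "sei_kana": "イケダ"}]
    record_for_all_except(cafeteria_users, {"タカハシ"},
                          {"staple_food": 100, "side_dish": 100},
                          lambda uid, rec: print(uid, rec))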
 そのような介護職員の音声操作の様子の一例を図1に示す。この例にあっては、介護職員10は、音声入出力端末の1つであるスマートフォン20aのマイクに対して、依頼内容(確認要求CR01)に相当する自然言語文「●●さんの昨日の体温教えて」により話し掛けを行う。すると、システムの側では、既に登録された●●さんの記憶領域から該当日時の体温データを検索して取り出し、確認要求に対する回答(CA01)に相当する自然言語文「●●さんの昨日の体温は、20時11分が38.2度です」を生成し、これに対応する音声が、スマートフォン20aのスピーカから流れだす。この回答音声を聞いて、介護職員10は目的とするデータ(●●さんの昨日の体温)を確認することができる。 FIG. 1 shows an example of such voice operation by a care worker. In this example, the care worker 10 speaks to the microphone of the smartphone 20a, which is one of the voice input / output terminals, a natural language sentence corresponding to the requested operation (confirmation request CR01): "Tell me ●●-san's body temperature yesterday." The system then searches the already-registered storage area of ●●-san, retrieves the body temperature data for the relevant date and time, generates a natural language sentence corresponding to the answer to the confirmation request (CA01), "●●-san's body temperature yesterday was 38.2 degrees at 20:11", and the corresponding voice is played from the speaker of the smartphone 20a. By listening to this answer, the care worker 10 can confirm the desired data (●●-san's body temperature yesterday).
 <<システムの全体構成>>
  上述の本発明システムをネットワーク上に分散配置された複数のサーバにより実施した場合のシステム構成図が、図2に示されている。同図に示されるように、このシステムは、老人介護施設内に配置される「施設内システム」と老人介護施設外に配置される「施設外システム」とを含んで構成される。
<< Overall system configuration >>
FIG. 2 shows a system configuration diagram when the above-mentioned system of the present invention is implemented by a plurality of servers distributed and arranged on a network. As shown in the figure, this system includes an "in-facility system" located inside the elderly care facility and an "out-of-facility system" located outside the elderly care facility.
 なお、「サーバ」なる装置は、当業者にはよく知られているように、ネットワーク(例えば、インターネットやLAN)を介する送受信を可能とする送受信部、マイクロプロセッサユニット(MPU)や特定機能専用IC(ASIC)により構成される中央処理部(CPU)、ハードディスクや半導体メモリ等々にて構成され、制御用プログラムやデータを格納するための記憶部を備え、例えば、ネットワークを介して受信された処理要求に応じて指定された処理を実行し、その実行結果をネットワークを介して指定された相手に送信すると言った動作を実行するものである。 As is well known to those skilled in the art, a device called a "server" comprises a transmission / reception unit enabling transmission and reception over a network (for example, the Internet or a LAN), a central processing unit (CPU) constituted by a microprocessor unit (MPU) or a function-specific IC (ASIC), and a storage unit constituted by a hard disk, semiconductor memory or the like for storing control programs and data; it performs operations such as executing a designated process in response to a processing request received over the network and transmitting the execution result over the network to a designated counterpart.
 <施設内システムの構成>
  先ず、老人介護施設内のシステムについて説明する。老人介護施設内のシステムは、この例にあっては、1又は2以上の音声入出力端末20と、ローカルサーバ22と、1又は2以上のパソコン(PC)23とを含んで構成され、それらの機器20,22,23は、LAN21を介して互いに連携可能に構成されている。
<Configuration of in-facility system>
First, the system in the elderly care facility will be described. In this example, the system in the elderly care facility is configured to include one or more voice input / output terminals 20, a local server 22, and one or more personal computers (PCs) 23. The devices 20, 22, and 23 of the above are configured to be able to cooperate with each other via the LAN 21.
 ・音声入出力端末20
  音声入出力端末20は、介護職員10が携帯可能であって、マイクとスピーカとを有し、かつ無線によるネットワーク接続機能を備えたものであり、この例では、具体的には、スマートフォン20aとスマートウォッチ(登録商標)20bとから構成されている。それらの機器20a.20b内には、本発明を実施するための専用のアプリケーションプログラム(以下、「アプリ」と略称する)がインストールされる。このアプリには、第1の機能と第2の機能とが組み込まれている。
-Voice input / output terminal 20
The voice input / output terminal 20 can be carried by the care worker 10, has a microphone and a speaker, and has a wireless network connection function; in this example, it specifically consists of a smartphone 20a and a smartwatch (registered trademark) 20b. A dedicated application program for carrying out the present invention (hereinafter abbreviated as the "app") is installed in these devices 20a and 20b. This app has a first function and a second function built in.
 第1の機能とは、マイクに話しかけられた音声を、A/D・変換、データ圧縮、等々することでスピーチ音声データを生成し、これを所定コマンド形式のスピーチ音声/テキスト・変換要求として、インターネット31上に配置された音声処理サーバ32(詳細は後述)へとLAN21経由で送信するものである。 The first function is to generate speech voice data by A / D conversion, data compression, etc. of the voice spoken to the microphone, and use this as a speech voice / text conversion request in a predetermined command format. The data is transmitted to the voice processing server 32 (details will be described later) arranged on the Internet 31 via the LAN 21.
 第2の機能とは、インターネット31上に配置された音声処理サーバ32からLAN21経由で受信されたスピーチ音声データを、データ伸張、D/A変換、等々することでアナログスピーチ信号を生成し、この信号にてスピーカを駆動することにより、スピーカからスピーチ音声による発話を行うものである。 The second function is to generate an analog speech signal by decompressing, D/A-converting, and otherwise processing the speech voice data received via the LAN 21 from the voice processing server 32 arranged on the Internet 31, and to drive the speaker with this signal so that the speech voice is uttered from the speaker.
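A minimal sketch of these two functions is given below, assuming a simple JSON-over-HTTP transport; the endpoint URL, the payload layout and the audio handling are assumptions for illustration and are not part of the disclosure.

    import base64
    import requests  # any HTTP client could be used; requests is shown for brevity

    VOICE_SERVER_URL = "https://voice-server.example/convert"   # hypothetical endpoint

    def send_stt_request(recorded_pcm: bytes) -> None:
        """First function: wrap the recorded speech and send a speech-voice/text
        conversion request to the voice processing server 32."""
        payload = {"type": "STT",
                   "audio": base64.b64encode(recorded_pcm).decode("ascii")}
        requests.post(VOICE_SERVER_URL, json=payload, timeout=10)

    def play_tts_response(response_json: dict, play) -> None:
        """Second function: decode the speech voice data received from the voice
        processing server 32 and hand it to the platform audio API (play)."""
        pcm = base64.b64decode(response_json["audio"])
        play(pcm)   # D/A conversion and speaker output are left to the platform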
 ・ローカルサーバ22
  ローカルサーバ22は、当該老人介護施設の会計処理や利用者管理に関する各種のソフトウェアのほか、当該老人介護施設に収容された全ての利用者の個々に関する各種の介護必要データが格納されている。それらのデータとしては、各利用者の睡眠データ、血圧や体温や心拍数などのバイタルデータのほか、食事摂取量や排泄量などの日常生活の記録データが含まれている。
-Local server 22
The local server 22 stores various software related to accounting processing and user management of the elderly care facility, as well as various nursing care necessary data regarding individual users of all the users accommodated in the elderly care facility. These data include sleep data of each user, vital data such as blood pressure, body temperature, and heart rate, as well as recorded data of daily life such as dietary intake and excretion.
 ・パソコン(PC)23
  パソコン(PC)23は、ローカルサーバ22に格納された各種のソフトウェアを実行したり、上述の各利用者のデータに関する集計や分析などのために利用される。
・ Personal computer (PC) 23
The personal computer (PC) 23 is used for executing various software stored in the local server 22 and for aggregating and analyzing the above-mentioned data of each user.
 <施設外システムの構成>
  次に、老人介護施設外のシステムについて説明する。老人介護施設外のシステムは、この例にあっては、音声処理サーバ32と、データ保管サーバ33と、本発明の要部である対話処理サーバ34とを含んで構成され、それらのサーバ32,33,34は、インターネット31を介して互いに連携可能に構成されている。
<Configuration of out-of-facility system>
Next, the system outside the elderly care facility will be described. In this example, the system outside the elderly care facility includes a voice processing server 32, a data storage server 33, and a dialogue processing server 34, which is a main part of the present invention; these servers 32, 33 and 34 are configured to be able to cooperate with each other via the Internet 31.
 ・音声処理サーバ32
  音声処理サーバ32は、既知の入出力関係の学習により生成された変換モデルにしたがって、スピーチ音声データを該当するテキストデータに変換する音声/テキスト・変換部と、既知の入出力関係の学習により生成された変換モデルにしたがって、テキストデータを該当するスピーチ音声データに変換するテキスト/音声・変換部とを有するもので、その詳細については、図3及び図4に示されている。
-Voice processing server 32
The voice processing server 32 has a voice / text conversion unit that converts speech voice data into corresponding text data in accordance with a conversion model generated by learning known input / output relationships, and a text / voice conversion unit that converts text data into corresponding speech voice data in accordance with a conversion model generated by learning known input / output relationships; details are shown in FIGS. 3 and 4.
 音声処理サーバ32は、図3に示されるように、なんらかの変換要求がインターネット31を経由して到来するたびに、その要求種別がスピーチ音声/テキスト・変換(以下、「STT変換」と称する)であるか、又はテキスト/スピーチ音声・変換(以下、「TTS変換」と称する)であるかの判定を行う(ステップ101)。 As shown in FIG. 3, each time some conversion request arrives via the Internet 31, the voice processing server 32 determines whether the request type is speech voice / text conversion (hereinafter referred to as "STT conversion") or text / speech voice conversion (hereinafter referred to as "TTS conversion") (step 101).
 そして、「STT変換」であると判定されたときには(ステップ101、「STT」)、受信された変換要求に含まれるスピーチ音声データを対応するテキストデータにSTT変換するAI変換処理(ステップ102)及び変換により得られたテキストデータを対話処理サーバ34へとインターネット31を経由して送信する処理(ステップ103)を実行する。 Then, when it is determined to be "STT conversion" (step 101, "STT"), the AI conversion process (step 102) for STT-converting the speech voice data included in the received conversion request into the corresponding text data, and A process (step 103) of transmitting the text data obtained by the conversion to the dialogue processing server 34 via the Internet 31 is executed.
 STT変換のイメージが、図4(a)に示されている。同図に示されるように、STT変換のためのAI変換処理においては、既知の入出力関係の学習により生成された変換モデルにしたがって、スピーチ音声データを対応するテキストデータに変換する。この例にあっては、スピーチ音声データである「山田さんの体温は?」701は、変換モデルにしたがって処理されて、テキストデータである{text 山田さんの体温は}702へと変換される。 An image of the STT conversion is shown in FIG. 4(a). As shown in the figure, the AI conversion process for STT conversion converts speech voice data into corresponding text data in accordance with a conversion model generated by learning known input / output relationships. In this example, the speech voice data "What is Yamada-san's body temperature?" 701 is processed according to the conversion model and converted into the text data {text 山田さんの体温は} 702.
 これに対して、「TTS」変換であると判定されたときには(ステップ101、「TTS」)、受信された変換要求に含まれるテキストデータを対応するスピーチ音声データにTTS変換するAI変換処理(ステップ104)及び変換により得られたスピーチ音声データを、インターネット31及びLAN21経由で、音声入出力端末20へと送信する処理(ステップ105)を実行する。 On the other hand, when it is determined that the conversion is "TTS" (step 101, "TTS"), the AI conversion process (step 101) of TTS-converting the text data included in the received conversion request into the corresponding speech voice data. 104) and the process (step 105) of transmitting the speech voice data obtained by the conversion to the voice input / output terminal 20 via the Internet 31 and the LAN 21 are executed.
 TTS変換のイメージが、図4(b)に示されている。同図に示されるように、TTS変換のためのAI変換処理においては、既知の入出力関係の学習により生成された変換モデルにしたがって、テキストデータを対応するスピーチ音声データに変換する。この例にあっては、テキストデータである「{ text:"山田さんの7/18の体温は10時30分に36.5度、14時に36.2でした" }」714は、変換モデルにしたがって処理されて、スピーチ音声データである「山田さんの7/18の体温は10時30分に36.5度、14時に36.2でした」715へと変換される。 An image of TTS conversion is shown in FIG. 4 (b). As shown in the figure, in the AI conversion process for TTS conversion, text data is converted into corresponding speech voice data according to a conversion model generated by learning of known input / output relationships. In this example, the text data "{text:" Mr. Yamada's body temperature on 7/18 was 36.5 degrees at 10:30 and 36.2 at 14:00 "}" 714 is processed according to the conversion model. Then, it is converted into the speech voice data "Mr. Yamada's body temperature on 7/18 was 36.5 degrees at 10:30 and 36.2 at 14:00" 715.
 なお、本発明に係るデータ入出力支援システムの性能は、上述の「STT変換」及び「TTS」変換の性能に少なからず依存するが、それらの変換処理については、大手IT企業(例えば、GoogleやAmazon等々)が提供する双方向のAI音声変換サービスを利用することにより、高性能な変換処理を、比較的に、低コストに実現することができる。 The performance of the data input / output support system according to the present invention depends to no small extent on the performance of the above-mentioned STT and TTS conversions; however, by using the bidirectional AI voice conversion services provided by major IT companies (for example, Google, Amazon, and so on), high-performance conversion processing can be realized at a relatively low cost.
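As a sketch only, the dispatch of steps 101 to 105 can be pictured as follows; the conversion engines and the senders are placeholders that would wrap an external AI speech service and the network, and none of the names are taken from the disclosure.

    def handle_conversion_request(request, stt_engine, tts_engine,
                                  send_to_dialogue_server, send_to_terminal):
        if request["type"] == "STT":                  # step 101: decide STT vs. TTS
            text = stt_engine(request["audio"])       # step 102: speech -> text
            send_to_dialogue_server({"text": text})   # step 103: forward to the dialogue processing server 34
        else:                                         # "TTS"
            audio = tts_engine(request["text"])       # step 104: text -> speech
            send_to_terminal({"audio": audio})        # step 105: forward to the voice input/output terminal 20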
 ・データ保管サーバ33
  データ保管サーバ33は、この発明に関連して新たに設けられたものであって、当該老人介護施設を利用する利用者の個々を特定するための「利用者情報」や個々の利用者に関する管理に供するための「介護必要情報」を格納する。
-Data storage server 33
The data storage server 33 is newly provided in connection with the present invention, and stores "user information" for identifying the individual users of the elderly care facility and "long-term care necessary information" used for the management of the individual users.
 利用者情報としては、例えば、各利用者の「ID番号」、「姓」、「名」、「姓の表音文字表記」、「名の表音文字表記」、「部屋番号」、等々を挙げることができる。ここで、表音文字とは、カタカナ文字やひらがな文字やローマ字を挙げることができる。先に述べたように、日本以外の外国仕様のシステムの場合には、宗教上や慣習上のしきたりに合わせて、「中間名」、「愛称」、「俗称」、等々を含めてもよい。 As user information, for example, each user's "ID number", "last name", "first name", "phonetic character notation of surname", "phonetic character notation of first name", "room number", etc. Can be mentioned. Here, the phonetic characters include katakana characters, hiragana characters, and romaji characters. As mentioned above, in the case of a system of foreign specifications other than Japan, "intermediate name", "nickname", "common name", etc. may be included according to religious and customary conventions.
 介護必要情報としては、各利用者の「バイタル情報」、「食物摂取情報」、「排泄情報」、「睡眠情報」、「生活記録情報」、等々を挙げることができる。ここで、「バイタル情報」としては、例えば、各利用者の体温、血圧、心拍数、等々を含むことができる。また、「睡眠情報」としては、各利用者の就寝時刻、起床時刻、睡眠継続時間、等々を含むことができる。また、「食物摂取情報」としては、例えば、主食の摂取%、副食の摂取%、汁物の摂取%、お茶や水等の飲料の摂取%、等々を含むことができる。また、排泄情報としては、例えば、排便の回数や色や形や量、排尿の回数や量、等々を含むことができる。また、「生活記録情報」としては、「半日寝ていた。」「夜は、テレビを視聴していた。」「読書をしていた。」等々の各利用者の生活記録を含むことができる。 Examples of long-term care necessary information include "vital information", "food intake information", "excretion information", "sleep information", "life record information", etc. of each user. Here, the "vital information" can include, for example, the body temperature, blood pressure, heart rate, etc. of each user. In addition, the "sleep information" can include each user's bedtime, wake-up time, sleep duration, and the like. Further, the "food intake information" can include, for example, a staple food intake%, a side dish intake%, a soup intake%, a beverage intake% such as tea or water, and the like. Further, the excretion information can include, for example, the number of defecations, the color, the shape, the amount, the number and amount of urination, and the like. In addition, the "life record information" can include the life records of each user such as "I slept for half a day", "I was watching TV at night", and "I was reading". ..
 図2において、データ保管サーバ33の右上には、利用者情報及び介護必要情報の一例が描かれている。ここでは、利用者情報の一部として、「利用者テーブル」が描かれており、また介護必要情報の一部として「バイタル情報テーブル」が描かれている。「利用者テーブル」は、利用者の個々を特定するための利用者情報を格納するものであって、この例にあっては、各「利用者ID」毎に、その個人属性(利用者の「姓」、利用者の「名」、利用者の「姓のカタカナ表記」、利用者の「名のカタカナ表記」、利用者の収容される「部屋番号」、等々)を定義してなるテーブルとして構成されている。「バイタル情報テーブル」は、利用者の個々のバイタル情報を格納するものであって、この例にあっては、各「利用者ID」毎に、そのバイタル属性(利用者データの「記録年月日」、「体温」、「血圧」、「心拍数」、等々)を定義してなるテーブルとして構成されている。 In FIG. 2, an example of user information and long-term care necessary information is drawn on the upper right of the data storage server 33. Here, a "user table" is drawn as a part of the user information, and a "vital information table" is drawn as a part of the long-term care necessary information. The "user table" stores user information for identifying individual users. In this example, each "user ID" has its own personal attribute (user's). A table that defines the "last name", the "first name" of the user, the "katakana notation of the surname" of the user, the "katakana notation of the first name" of the user, the "room number" in which the user is accommodated, etc.) It is configured as. The "vital information table" stores individual vital information of the user, and in this example, for each "user ID", the vital attribute (the "recording date" of the user data). It is configured as a table that defines "day", "body temperature", "blood pressure", "heart rate", etc.).
 なお、図示は省略するが、データ保管サーバ33内には、その他、「飲食情報テーブル」や「排泄情報テーブル」等々を格納してもよい。「飲食情報テーブル」は、利用者の個々の飲食情報を格納するものであって、例えば、各「利用者ID」毎に、その飲食属性(利用者データの「記録年月日」、「主食の摂取%」、「副食の摂取%」、「汁の摂取%」、等々)を定義してなるテーブルとして構成される。「排泄情報テーブル」は、引用者の個々の排便や排尿情報を格納するものであって、例えば、各「利用者ID」毎に、その排泄属性(利用者データの「記録年月日」、「排便の色や形や量」、「排尿の回数や量」、等々)を定義してなるテーブルとして構成される。 Although not shown, the data storage server 33 may also store a "food and drink information table", an "excretion information table", and the like. The "food and drink information table" stores individual food and drink information of users. For example, for each "user ID", the food and drink attributes ("record date" of user data, "staple food") ”,“ Side dish intake% ”,“ Juice intake% ”, etc.) are defined as a table. The "excretion information table" stores individual defecation and urination information of the quoter, and for example, for each "user ID", its excretion attribute ("recording date" of user data, etc. It is configured as a table that defines "color, shape and amount of defecation", "number and amount of urination", etc.).
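For illustration, the two tables described above might be laid out as follows; the column names are assumptions derived from the attributes listed in the text, not the actual schema of the data storage server 33.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE user_table (              -- corresponds to the user table
        user_id      TEXT PRIMARY KEY,     -- user ID
        sei          TEXT,                 -- surname (kanji)
        mei          TEXT,                 -- given name (kanji)
        sei_katakana TEXT,                 -- surname (katakana)
        mei_katakana TEXT,                 -- given name (katakana)
        room_no      TEXT                  -- room number
    );
    CREATE TABLE vital_table (             -- corresponds to the vital information table
        user_id      TEXT REFERENCES user_table(user_id),
        recorded_at  TEXT,                 -- recording date and time
        body_temp    REAL,                 -- body temperature [degrees C]
        blood_press  TEXT,                 -- blood pressure
        heart_rate   INTEGER               -- heart rate
    );
    """)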
 ・対話処理サーバ34
  対話処理サーバ34は、システムと介護職員との間における音声入出力端末20を経由しての対話を実現するためのものであって、主として、テキスト解析処理部と対話制御処理部とを含んで構成されている。
-Dialogue processing server 34
The dialogue processing server 34 is for realizing a dialogue between the system and the care staff via the voice input / output terminal 20, and is configured to mainly include a text analysis processing unit and a dialogue control processing unit.
 対話処理サーバ34は、図5(a)に示されるように、なんらかのテキストデータを受信するたびに、テキスト解析処理(ステップ201)及び対話制御処理(ステップ202)を順次に実行することにより、対話処理サーバとして必要な様々な機能を実現する。 As shown in FIG. 5(a), each time it receives some text data, the dialogue processing server 34 sequentially executes the text analysis processing (step 201) and the dialogue control processing (step 202), thereby realizing the various functions required of a dialogue processing server.
 1. Text analysis process
  As shown in FIG. 5(b), the text analysis process (step 201) is configured to sequentially execute a variable extraction process (step 2011) and a command classification process (step 2012).
 1.1 Variable extraction process
  The variable extraction process (step 2011) analyzes the given text data according to a dialogue model obtained by learning known dialogues (for example, dialogues between care staff), and extracts from the text data the "words" corresponding to predefined variables.
 An example of a concrete image of this variable extraction process (step 2011) is shown in FIG. 6(a). As shown in the figure, assume that the text data "{text: Mr. Yamada's body temperature is}" 703 is given. This text data is then analyzed according to the dialogue model 704 obtained by learning known dialogues, and the "words (variable values)" corresponding to the predefined variables "last name", "target", and "date and time" are extracted. In the illustrated example, as is clear from the extraction result 705, "Yamada" is extracted as the word corresponding to the variable "last name" and "body temperature" as the word corresponding to the variable "target", whereas no word is extracted for the variable "date and time".
 1.2 Command classification process
  The command classification process (step 2012) analyzes the given text data according to a dialogue model obtained by learning known dialogues (for example, dialogues between care staff), and classifies (determines) the type of command from the given text data.
 An example of a concrete image of this command classification process (step 2012) is shown in FIG. 6(b). As shown in the figure, assume that the text data "{text: Mr. Yamada's body temperature is}" 706 is given. This text data 706 is then analyzed according to the dialogue model 707 obtained by learning known dialogues, and the type of the command designated by the voice is classified. In the illustrated example, as is clear from the classification result 708, the command is classified as a "body temperature confirmation request command".
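 A minimal sketch, for illustration only, of how the two text analysis steps could be combined: slot-style variable extraction followed by keyword-based command classification. The rules and labels below are hypothetical simplifications standing in for the learned dialogue models 704 and 707 described above.

```python
import re
from typing import Dict, Optional

VARIABLES = ("last_name", "target", "datetime")

def extract_variables(text: str) -> Dict[str, Optional[str]]:
    """Toy stand-in for the learned variable extraction model (step 2011)."""
    result: Dict[str, Optional[str]] = {v: None for v in VARIABLES}
    m = re.search(r"(\S+)-san", text)            # e.g. "Yamada-san" -> last name
    if m:
        result["last_name"] = m.group(1)
    for target in ("body temperature", "blood pressure", "heart rate"):
        if target in text:
            result["target"] = target
    m = re.search(r"\b(today|yesterday|\d{1,2}/\d{1,2})\b", text)
    if m:
        result["datetime"] = m.group(1)
    return result

def classify_command(text: str) -> str:
    """Toy stand-in for the learned command classification model (step 2012)."""
    if re.search(r"\bis\??$", text.strip()) or "what" in text.lower():
        return "confirmation_request"            # e.g. "Yamada-san's body temperature is?"
    return "input_request"                       # e.g. "Yamada-san's body temperature is 36.5"

print(extract_variables("Yamada-san's body temperature is"))
print(classify_command("Yamada-san's body temperature is"))
```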
 2. Dialogue control process
  As shown in FIG. 5(c), the dialogue control process (step 202) is configured to include a command execution pre-process (step 2021), a command execution process (step 2022), a response generation process (step 2023), and a response transmission process to the portable terminal (step 2024).
 2.1 Command execution pre-process
  Prior to executing the command classified by the command classification process (step 2012), the command execution pre-process (step 2021) includes a "target person determination process" that determines the one user to be the target of the command processing, on the basis of the "user-specifying word" (for example, the "last name") used to specify that user, the "user information" (for example, the "user table") stored in the data storage server 33, and the "result of the dialogue processing" with the care staff member 10 via the voice input / output terminal 20; and a "variable supplementing process" that supplements a "word" for any of the variables required by the command (for example, "last name", "target", "date and time") whose corresponding "word (variable value)" has not yet been filled.
 2.1.1 Target person determination process
  The target person determination process constitutes one of the essential parts of the data input / output support system according to the present invention, and will be described in detail later with reference to FIGS. 8 and 9.
 2.1.2 Variable supplementing process
  Assume now, as shown in FIG. 6(c), that in the variable extraction result 709 the value of the variable "date and time" is unfilled among the three variables "last name", "target", and "date and time". In such a case, the system automatically generates text data corresponding to the standard speech used to ask the care staff member when confirmation of the date and time is required ("For when would you like to check the body temperature?"). The text data thus automatically generated is transmitted as a "TTS" conversion request to the voice processing server 32 via the Internet 31.
 In the voice processing server 32, the processes (FIG. 3, steps 101 "TTS", 104, 105) are then sequentially executed, and speech voice data corresponding to the text data is generated. The speech voice data thus obtained is sent to the voice input / output terminal 20 via the Internet 31 and the LAN 21, and the speech ("For when would you like to check the body temperature?") is emitted from the speaker of the voice input / output terminal 20. In this state, when the care staff member 10 speaks the answer speech ("Yesterday's") 711 into the microphone of the voice input / output terminal 20, the voice input / output terminal 20 generates the corresponding speech voice data and transmits it as an "STT" conversion request to the voice processing server 32 via the LAN 21 and the Internet 31.
 In the voice processing server 32, the processes (FIG. 3, steps 101 "STT", 102, 103) are then sequentially executed, and answer text data corresponding to the answer speech voice data is generated. The text data thus obtained is sent to the dialogue processing server 34 via the Internet 31. The dialogue processing server 34 waits for the answer text data to be returned, extracts the word "yesterday" contained in it, obtains the designated date "7/18" by counting back from the current date, completes the command by filling in the value of the missing variable "date and time", and then proceeds to the command execution process.
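 For illustration only, the conversion of a relative date word in the answer ("yesterday") into a concrete date and the filling of the missing variable could be sketched as below; the helper names and the date format are assumptions, not the patented implementation.

```python
from datetime import date, timedelta
from typing import Dict, Optional

def resolve_relative_date(word: str, today: Optional[date] = None) -> date:
    """Turn a relative date word from the answer text ("today", "yesterday", ...)
    into a concrete date by counting back from the current date."""
    today = today or date.today()
    offsets = {"today": 0, "yesterday": 1, "day before yesterday": 2}
    return today - timedelta(days=offsets.get(word, 0))

def supplement_missing_datetime(variables: Dict[str, Optional[str]],
                                answer_word: str) -> Dict[str, Optional[str]]:
    """Fill the unfilled "datetime" variable from the staff member's answer,
    so the command can proceed to the command execution process."""
    if variables.get("datetime") is None:
        variables["datetime"] = resolve_relative_date(answer_word).strftime("%m/%d")
    return variables

# e.g. an extraction result with the date missing, answered with "yesterday"
print(supplement_missing_datetime(
    {"last_name": "Yamada", "target": "body temperature", "datetime": None}, "yesterday"))
```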
 3. Command execution process
  In essence, the command execution process (step 2022) executes, for the "user" (for example, "Tanaka Yoshinari") and the "target" (for example, "body temperature") determined by the target person determination process (described in detail later), the processing corresponding to the "type" of the command (for example, input request or confirmation request), that is, body temperature recording processing or body temperature confirmation processing.
 That is, as shown in FIG. 7, each time a processing request is given, the content of the command is determined (step 301). When the command is determined to be an "input request command" (step 301, "input request"), the designated recording process (step 302), the response generation process (step 303), and the response return process (step 304) are sequentially executed. Here, the designated recording process (step 302) records, in the data storage server 33, the data concerning the command-designated "target" (for example, body temperature, blood pressure, and so on) for the command-designated user (target person). When, on the other hand, the command is determined to be a "confirmation request command" (step 301, "confirmation request"), the designated search process (step 305), the response generation process (step 306), and the response return process (step 307) are sequentially executed. Here, the designated search process (step 305) searches the data storage server 33 for the data concerning the command-designated "target" (for example, body temperature, blood pressure, and so on) for the user (target person) determined for the command.
 A specific example of the command execution process (step 2022) is shown in FIG. 6(d). In this example, since the command type is "confirmation request", the values of the three variables included in the command, namely "last name: Yamada", "target: body temperature", and "date and time: 7/18", are used as search keys to access the data in the data storage server 33 (in this example, the vital information table) via the Internet 31 and execute the search process, thereby acquiring the corresponding search result 712. According to the illustrated search result 712, Mr. Yamada's body temperature on July 18 was 36.5 degrees at 10:30 and 36.2 degrees at 14:00.
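 For illustration only, the branch on the command type described above (record on an input request, search on a confirmation request) could be sketched as follows; the in-memory list stands in for the vital information table held by the data storage server 33, and all names and keys are hypothetical.

```python
from typing import Dict, List

# Stand-in for the vital information table in the data storage server 33.
vital_table: List[Dict[str, str]] = [
    {"user_id": "U001", "date": "7/18", "time": "10:30", "body_temperature": "36.5"},
    {"user_id": "U001", "date": "7/18", "time": "14:00", "body_temperature": "36.2"},
]

def execute_command(command_type: str, user_id: str,
                    variables: Dict[str, str]) -> List[Dict[str, str]]:
    """Step 301: branch on the command type, then record (step 302) or search (step 305)."""
    if command_type == "input_request":
        record = {"user_id": user_id, "date": variables["datetime"],
                  "time": variables["time"], "body_temperature": variables["value"]}
        vital_table.append(record)          # designated recording process
        return [record]
    if command_type == "confirmation_request":
        return [row for row in vital_table  # designated search process
                if row["user_id"] == user_id and row["date"] == variables["datetime"]]
    raise ValueError(f"unknown command type: {command_type}")

print(execute_command("confirmation_request", "U001", {"datetime": "7/18"}))
```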
 As will be described in detail later with reference to FIG. 10, it should be noted that, when the text data constituting a command contains a command word corresponding to a predefined "batch designation instruction for all users with exception designation", the command is arranged so that batch recording for a predetermined range of users becomes possible (see the lower part of FIG. 6(d)).
 4. Response generation process
  The basis of the response generation process (step 2023) is to generate text data corresponding to the various speech utterances to be conveyed from the system side to the care staff side. Such speech utterances include a question speech for repeating back and confirming the content of an input request command, a completion speech for informing the care staff member that execution of an input request command has been completed, a search result speech for conveying to the care staff member the retrieved data resulting from execution of a confirmation request command, a question speech for conveying a question to the care staff member when the given command contains an unclear point, and so on.
 A specific example of such a response generation process is shown in FIG. 6(e). In this example, answer text data corresponding to the standard answer speech for the search result 712 ("Mr. Yamada's body temperature on 7/18 was 36.5 degrees at 10:30 and 36.2 degrees at 14:00") 713 is automatically generated.
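 As a purely illustrative sketch, the kinds of response text enumerated above could be produced by a small dispatcher like the following; the wording of each fixed-form sentence is an assumption modeled on the example utterances in this description.

```python
def generate_response(kind: str, **kw) -> str:
    """Build the text handed to TTS conversion, by response category (step 2023)."""
    if kind == "repeat_back":      # confirm the content of an input request command
        return f"{kw['name']}'s {kw['target']} at {kw['time']} is {kw['value']}, correct?"
    if kind == "completed":        # report that an input request command was executed
        return "Recorded."
    if kind == "search_result":    # convey the data found for a confirmation request command
        return f"{kw['name']}'s {kw['target']} on {kw['date']} was {kw['readings']}."
    if kind == "question":         # ask about an unclear point in the given command
        return kw["question"]
    raise ValueError(f"unknown response kind: {kind}")

print(generate_response("repeat_back", name="Ikeda Yoshiaki-san",
                        target="body temperature", time="11:50", value="37.2 degrees C"))
```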
 5. Response transmission process
  The basis of the response transmission process (step 2024) is to transmit the various response text data generated in the response generation process (step 2023) to the voice processing server 32 via the Internet 31 in the form of a text-to-speech conversion request.
 <General description of the target person determination process>
  Next, the "target person determination process", which is one of the essential parts of the present invention, will be described. As shown in FIG. 6(c), this target person determination process is executed in the command execution pre-process described above (FIG. 5, step 2021).
 When the analysis result in the text analysis unit (step 201) is an input request command or a confirmation request command concerning the long-term care necessary information (for example, "body temperature") of one user, this "target person determination process" determines the one user to be the target of the command processing, on the basis of the "user-specifying word" (for example, "last name: Yamada") used to specify that user, the "user information" (for example, the information in the "user table" of FIG. 2), and the "result of the dialogue processing" with the care staff member 10 via the voice input / output terminal 20.
 As a result of executing such a "target person determination process", the care staff member 10 can designate the user to be the target of the command processing by voice using only the "last name" (for example, "Yamada-san's body temperature ..."), and can then accurately identify that user (for example, "Yamada Yoshiaki-san's body temperature ...") through dialogue (questions and answers) with the system. As a basic rule, of course, the target person of the command processing is designated by voice using the full name, that is, the combination of "last name" and "first name"; even if the staff member is busy or has inadvertently forgotten the first name, however, the target person can still be accurately identified through dialogue with the system as long as the "last name" is known.
 There are two variants of this "target person determination process", a first method and a second method. In the command execution process (step 2022), the processing corresponding to the given command (for example, recording processing for an input request command, or search processing for a confirmation request command) is executed only when the user to be the target of the command processing has been determined by one of these "target person determination processes".
 In the "target person determination process" of the first method, when the user-specifying word consists of only one (for example, "last name: Yamada") of the plurality of "person-specifying words" usually used to identify an individual, such as "last name", "first name", "middle name", and "nickname", a search process is executed in the user information holding unit (for example, the user table of FIG. 2) using the "user-specifying word" (for example, "last name: Yamada") as the search key. When exactly one user is hit as a result, text data corresponding to a predetermined reconfirmation speech (for example, "Yamada Yoshiaki-san, correct?") containing the remaining one or more of the "person-specifying words" other than the "user-specifying word" (for example, "first name: Yoshiaki") is generated and transmitted to the voice input / output terminal 20 via the text-to-speech conversion unit (FIG. 4(b)). Only when text data corresponding to an affirmative reply (for example, "That's right") is thereafter received from the voice input / output terminal 20 via the speech-to-text conversion unit (FIG. 4(a)) is the one user to be the target of the command processing determined.
 In the "target person determination process" of the second method, when the "user-specifying word" consists only of a "last name" (or a "first name"), a user search process is executed in the user information holding unit (for example, the user table of FIG. 2) using the "user-specifying word" (for example, "last name: Yamada") as the search key. When two or more users are hit as a result and no two of them share the same full name, text data corresponding to a selection request speech containing the "first name" (or "last name") of each of those users (for example, "Do you mean Yamada Yoshiaki-san or Yamada Hiroshi-san?") is automatically generated and transmitted to the voice input / output terminal 20 via the text-to-speech conversion unit (FIG. 4(b)). The target person of the command processing is then determined on the basis of the "first name" (or "last name") (for example, "Yoshiaki-san") contained in the text data returned from the voice input / output terminal 20 via the speech-to-text conversion unit (FIG. 4(a)).
 Here, in a system configured for Western naming conventions, the "last name" and the "first name" in the above description can be interchanged. That is, when the "user-specifying word" consists only of a "first name" (or a "last name"), a user search process is executed in the user information holding unit (for example, the user table of FIG. 2) using the "user-specifying word" (for example, "first name: Bill") as the search key. When two or more users are hit as a result and no two of them share the same full name, text data corresponding to a selection request speech containing the "last name" (or "first name") of each of those users (for example, "Do you mean Bill Clinton or Bill Gates?") is automatically generated and transmitted to the voice input / output terminal 20 via the text-to-speech conversion unit (FIG. 4(b)). The target person of the command processing is then determined on the basis of the "last name" (or "first name") (for example, "Gates") contained in the text data returned from the voice input / output terminal 20 via the speech-to-text conversion unit (FIG. 4(a)).
 In the "target person determination process" of this second method, furthermore, when the user search process using the "user-specifying word" (for example, "last name: Yamada") as the search key is executed in the user information holding unit (for example, the user table of FIG. 2) and, as a result, two or more users are hit and some of them share the same full name, text data corresponding to a selection request speech containing the "room number" of each of those users (for example, "Which Yamada Yoshiaki-san do you mean, room 105 or room 115?") is automatically generated and transmitted to the voice input / output terminal 20 via the text-to-speech conversion unit (FIG. 4(b)). The target person of the command processing is then determined on the basis of the "room number" (for example, "Room 105") contained in the text data returned from the voice input / output terminal 20 via the speech-to-text conversion unit (FIG. 4(a)).
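 A minimal sketch, for illustration only, of the disambiguation logic just described (one hit: reconfirm with the first name; several hits with distinct full names: ask by first name; several hits sharing a full name: ask by room number). The data and the question wording are assumptions, not the patented implementation.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class User:
    user_id: str
    last_name: str
    first_name: str
    room_number: str

def build_disambiguation_question(hits: List[User]) -> str:
    """Decide which follow-up question to speak, given the users whose last name matched."""
    if not hits:
        return "No matching user was found. Would you like to look up someone else?"
    if len(hits) == 1:
        u = hits[0]
        return f"{u.last_name} {u.first_name}-san, correct?"            # reconfirmation speech
    full_names = {(u.last_name, u.first_name) for u in hits}
    if len(full_names) == len(hits):                                     # no duplicated full name
        names = " or ".join(f"{u.last_name} {u.first_name}-san" for u in hits)
        return f"There are several matches. Do you mean {names}?"        # select by first name
    rooms = " or ".join(f"room {u.room_number}" for u in hits)
    return f"There are several users with that name. Which one, {rooms}?"  # select by room number

hits = [User("U001", "Yamada", "Yoshiaki", "105"), User("U002", "Yamada", "Yoshiaki", "115")]
print(build_disambiguation_question(hits))
```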
 <A specific example of the target person determination process>
  The whole of the command execution pre-process and the command execution process, including a specific example of the target person determination process according to the first and second methods, is shown in the flowcharts of FIGS. 8 and 9. FIG. 8 shows the processing corresponding to the "body temperature input request command", and FIG. 9 shows the processing corresponding to the "body temperature confirmation request command".
 <Processing corresponding to the body temperature input request command>
  First, with reference to FIG. 8, the "target person determination process" corresponding to the body temperature input request command will be described. When, at the voice input / output terminal 20 (not shown), the care staff member 10 speaks any one of the input request speech utterances (FIG. 8, IR01 to IR04) corresponding to the "body temperature input request command" into the microphone, the command classification process (see FIG. 6(c)) determines that it is a "body temperature input request command". The branch process (step 401) is thereby affirmed, and the "target person determination process" is started as part of the command execution pre-process (step 2021).
 1. When there is one target person with a matching "last name"
  In this target person determination process, the "user table" in the data storage server 33 is first accessed via the Internet 31, and a user search is performed using the user-specifying word "Ikeda" as the search key (step 402).
 Subsequently, based on the search result, it is determined whether or not at least one user to be the target of the body temperature input (hereinafter referred to as the "target person") exists (step 403). If it is determined that at least one target person exists (step 403 YES), it is further determined whether the number of target persons corresponding to "last name: Ikeda" is one, or two or more (step 404).
 If the number of persons corresponding to "last name: Ikeda" is determined to be one (step 404 YES), it is then determined whether or not any "word (variable value)" corresponding to "time information", which is one of the "variables", exists in the text data constituting the "body temperature input request command" (step 406). In this example, the three input request speech utterances IR01, IR02, and IR03 from which the text data originates contain "11:50", "now", and "5 minutes ago", respectively, as the words corresponding to the variable "time information", so it is determined that the time information exists (step 406 YES). The input request speech utterance IR04, on the other hand, contains no word corresponding to the time information, so it is determined that no time information exists (step 406 NO).
 If it is determined that the time information exists (step 406 YES), the process immediately proceeds to the utterance process for the input information (step 407). If it is determined that no time information exists (step 406 NO), the current time is supplemented as the time information, and the process then proceeds to the utterance process for the input information (step 407).
 The utterance process for the input information (step 407) causes the speaker of the voice input / output terminal 20 to utter a repeat-back question speech (for example, "Ikeda Yoshiaki-san's body temperature at 11:50 is 37.2 degrees C, correct?") 416 for confirming the content of the input request speech that the care staff member 10 spoke into the microphone of the voice input / output terminal 20.
 In this example, when the input request speech contains all of the "words (values)" corresponding to the variables "last name", "body temperature", and "time information" (IR01 to IR03 in FIG. 8), fixed-form text data corresponding to the repeat-back question speech is generated using the values of those three variables together with the "first name: Yoshiaki" stored in the user table as the counterpart of "last name: Ikeda".
 When, on the other hand, the input request speech contains the variables "last name" and "body temperature" but lacks the variable "time information" (FIG. 8, IR04), the current time is supplemented as the "time information", and fixed-form text data corresponding to the repeat-back question speech is then generated using the three variables after supplementation together with the "first name: Yoshiaki" stored in the user table as the counterpart of "last name: Ikeda".
 The text data corresponding to the repeat-back question speech generated in this way is transmitted to the voice processing server 32 via the Internet 31 in the form of a text-to-speech conversion request. The repeat-back speech data generated by the voice processing server 32 is then transmitted to the voice input / output terminal 20, whereby the repeat-back question speech (in this example, "Ikeda Yoshiaki-san's body temperature at 11:50 is 37.2 degrees C, correct?") is uttered from the speaker of the voice input / output terminal 20.
 Thereafter, the system waits for some text data corresponding to a reply speech from the care staff member to arrive from the voice input / output terminal 20 via the voice processing server 32 (step 408). In this state, when an affirmative reply speech (for example, "Yes", "Uh-huh", or "Right") is spoken into the microphone by the care staff member (step 408, affirmative reply 419), "Ikeda Yoshiaki" is determined as the target person of the command processing, and information to the effect that the body temperature at 11:50 was 37.2 degrees C is recorded in the vital information table (see FIG. 2) in association with the user "Ikeda Yoshiaki".
 Thereafter, text data corresponding to the recording completion speech "Recorded" 420 is automatically generated and transmitted to the voice processing server 32 via the Internet 31 in the form of a text-to-speech conversion request. The recording completion speech data generated by the voice processing server 32 is then transmitted to the voice input / output terminal 20, whereby the recording completion speech (in this example, "Recorded") is uttered from the speaker of the voice input / output terminal 20 and the care staff member 10 is notified accordingly.
 In the above-mentioned waiting state (step 408), if a negative reply speech (for example, "No" or "That's wrong") is spoken into the microphone by the care staff member (step 408, negative reply 418), or if no reply arrives within a fixed waiting time (for example, 5 seconds) (step 408, no reply 419), "Ikeda Yoshiaki" is not determined as the target person of the command processing and the above-described recording process is not performed. Mix-ups of users are thereby avoided.
 Thereafter, text data corresponding to a prompting speech (for example, "Please speak again" 421) urging the care staff member to speak into the microphone of the voice input / output terminal 20 again is automatically generated, converted into speech voice data by the voice processing server 32, and sent to the voice input / output terminal 20, whereby the prompting speech (in this example, "Please speak again") is uttered from the speaker of the voice input / output terminal 20.
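 For illustration only, the repeat-back confirmation of steps 407 and 408 could be sketched as below; the reply keywords and the five-second timeout follow the example above, while the helper names are hypothetical.

```python
from typing import Optional

AFFIRMATIVE = {"yes", "uh-huh", "right"}
NEGATIVE = {"no", "that's wrong"}

def confirm_input(repeat_back: str, reply: Optional[str]) -> str:
    """Step 408: decide whether to record, based on the staff member's reply to the
    repeat-back question (or the absence of a reply within the waiting time)."""
    print(f"SPEAKER: {repeat_back}")
    if reply is None:                       # no reply within e.g. 5 seconds
        return "ignored: ask the staff member to speak again"
    reply = reply.strip().lower()
    if reply in AFFIRMATIVE:
        return "confirmed: write the record and answer 'Recorded'"
    if reply in NEGATIVE:
        return "rejected: discard the command and answer 'Please speak again'"
    return "unrecognized reply: ask again"

print(confirm_input("Ikeda Yoshiaki-san's body temperature at 11:50 is 37.2 degrees C, correct?", "Yes"))
```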
 2. When there are two or more target persons with a matching "last name"
  If, in the determination of whether the number of target persons corresponding to "last name: Ikeda" is one or two or more (step 404), it is determined that there are two or more such persons (step 404 NO), it is further determined, based on the content of the user table (see FIG. 2), whether or not any of those two or more target persons share the same full name (this determination is not shown in the figure).
 2.1 When no two of them share the same full name
  Here, if it is determined that no persons sharing the same full name are included, then, for example in the case of two matching persons, the "first name" of each of those two persons (for example, "Yoshiaki" and "Norimasa") is read out from the user table (see FIG. 2), and text data corresponding to an inquiry speech for having one of them selected (for example, "There are several Ikeda-san. Do you mean Yoshiaki-san or Norimasa-san?") 411 is automatically generated.
 This text data is transmitted to the voice processing server 32 in the form of a text-to-speech conversion request, and the speech voice data converted by the voice processing server 32 is sent to the voice input / output terminal 20. The speech prompting the selection (for example, "There are several Ikeda-san. Do you mean Yoshiaki-san or Norimasa-san?") is thereby emitted from the speaker of the voice input / output terminal 20.
 The system then waits for a speech for selecting one of the two "first names" ("Yoshiaki" or "Norimasa") to be spoken into the microphone of the voice input / output terminal 20 (step 405).
 In this state, when the care staff member 10 inputs, into the microphone of the voice input / output terminal 20, a speech corresponding to the first name of the selected person (for example, "Yoshiaki-san"), the target person is determined on the basis of the corresponding text data ("Yoshiaki-san" 413) converted via the voice processing server 32.
 Thereafter, in the same manner as when there is only one target person with a matching "last name", the series of processes (steps 403 YES, 404 YES, 406 YES) is followed by the utterance process for the input information (step 407) and the confirmation process for the input information (step 408), and the recording of the body temperature data concerning "Ikeda Yoshiaki-san" is completed.
 2.2 When two or more of them share the same full name
  If it is determined that persons sharing the same full name exist, then, for example in the case of two matching persons, the "room number" of each of those two persons (for example, "201" and "302") is read out from the user table (see FIG. 2), and text data corresponding to an inquiry speech for having one of them selected (for example, "There are several Ikeda Yoshiaki-san. Room 201 or room 302?") 412 is automatically generated.
 This text data is transmitted to the voice processing server 32 in the form of a text-to-speech conversion request, and the speech voice data converted by the voice processing server 32 is sent to the voice input / output terminal 20. The speech prompting the selection (for example, "There are several Ikeda Yoshiaki-san. Room 201 or room 302?") is thereby emitted from the speaker of the voice input / output terminal 20.
 The system then waits for a speech for selecting one of the two "room numbers" ("room 201" or "room 302") to be spoken into the microphone of the voice input / output terminal 20 (step 405).
 In this state, when the care staff member 10 inputs, into the microphone of the voice input / output terminal 20, a speech corresponding to the room number of the selected person (for example, "Room 201"), the target person is determined on the basis of the corresponding text data ("Room 201" 414) converted via the voice processing server 32.
 Thereafter, in the same manner as when there is only one target person with a matching "last name", the series of processes (steps 403 YES, 404 YES, 406 YES) is followed by the utterance process for the input information (step 407) and the confirmation process for the input information (step 408), and the recording of the body temperature data concerning "Ikeda Yoshiaki-san" is completed.
 If, in the determination of whether at least one target person exists (step 403), it is determined that no target person exists (step 403 NO), text data corresponding to a speech announcing the absence of the target person and inquiring about a further search request ("Ikeda-san was not found. Would you like to look up someone else?") 410 is automatically generated, converted into speech voice data via the voice processing server 32, and transmitted to the voice input / output terminal 20, where the corresponding speech is uttered. Mix-ups of users are thereby avoided.
 <Processing corresponding to the body temperature confirmation request command>
  Next, with reference to FIG. 9, the "target person determination process" corresponding to the body temperature confirmation request command will be described. When, at the voice input / output terminal 20 (not shown), the care staff member 10 speaks any one of the confirmation request speech utterances (FIG. 9, CR101 to CR111) corresponding to the "body temperature confirmation request command" into the microphone, the command classification process (see FIG. 6(c)) determines that it is a "body temperature confirmation request command". The branch process (step 501) is thereby affirmed, and the "target person determination process" is started as part of the command execution pre-process (step 2021).
 Also in the processing corresponding to this body temperature confirmation request command, with regard to the "command execution pre-process", the handling when no target person exists (step 503 NO, 512), the handling when there is one target person (step 504 YES, step 506), the handling when there are two or more target persons none of whom share the same full name (step 504 NO, 513), and the handling when there are two or more target persons some of whom share the same full name (step 504 NO, 514) are all the same as in the case of the body temperature input request command described with reference to FIG. 8.
 With regard to the "command execution process", on the other hand, the handling differs depending on whether or not body temperature data exists for the request represented by the text data generated from the voice input (step 507), and on whether or not a plurality of data items exist for the relevant day (step 509).
 That is, when no body temperature data exists (step 507 NO), a speech to the effect that no data exists ("There is no body temperature data for Ikeda Yoshiaki-san for yesterday") 517 is uttered to the care staff member via the voice input / output terminal 20, and the process ends (step 508).
 When a single data item exists for the relevant day (step 509 NO), a speech concerning the body temperature ("Ikeda Yoshiaki-san's body temperature yesterday at 10:35 was 36.8 degrees C") 518 is uttered to the care staff member via the voice input / output terminal 20, and the process ends (step 510).
 When a plurality of data items exist for the relevant day (step 509 YES), a speech concerning the body temperatures ("Ikeda Yoshiaki-san's body temperature yesterday was 38.2 degrees C at 18:40, 37.2 degrees C at 12:10, and 36.8 degrees C at 10:35") 519 is uttered to the care staff member via the voice input / output terminal 20, and the process ends (step 511).
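 As a purely illustrative sketch of the three answer branches above (no data, a single reading, several readings), where the sentence wording follows the example utterances 517 to 519 and everything else is an assumption:

```python
from typing import Dict, List

def answer_body_temperature(name: str, day: str, readings: List[Dict[str, str]]) -> str:
    """Branching of steps 507 to 511: build the spoken answer for a confirmation request."""
    if not readings:                                        # step 507 NO
        return f"There is no body temperature data for {name} for {day}."
    if len(readings) == 1:                                  # step 509 NO
        r = readings[0]
        return f"{name}'s body temperature {day} at {r['time']} was {r['temp']} degrees C."
    parts = ", ".join(f"{r['temp']} degrees C at {r['time']}" for r in readings)  # step 509 YES
    return f"{name}'s body temperature {day} was {parts}."

readings = [{"time": "18:40", "temp": "38.2"},
            {"time": "12:10", "temp": "37.2"},
            {"time": "10:35", "temp": "36.8"}]
print(answer_body_temperature("Ikeda Yoshiaki-san", "yesterday", readings))
```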
 <Command execution process for the "batch processing instruction for all target persons with exception designation">
  Next, with reference to FIG. 10, the command execution process for a meal amount input request command containing a "batch processing instruction for all target persons with exception designation" will be described. When, at the voice input / output terminal 20 (not shown), the care staff member 10 speaks either one of the input request speech utterances (FIG. 10, IR21, IR22) corresponding to the "meal amount input request command" into the microphone, the command classification process (see FIG. 6(c)) determines that it is a "meal amount input request command". The branch process (step 601) is thereby affirmed, and the "target person determination process" is started as part of the command execution pre-process (step 2021).
 The text data obtained by STT conversion of the two speech utterances (IR21, IR22) corresponding to the meal amount input request command contains an individually designated portion for a target person ("Ikeda-san had 80% of the staple food, 70% of the side dish, and 50% of the soup", "Ikeda-san had 50% of the staple food, 80% of the side dish, and 50% of the soup") and a batch designation portion for all target persons with an exception designated ("Everyone except Ikeda-san finished the staple food, the side dish, and the soup at lunch", "The staple food, the side dish, and the soup at lunch were finished by everyone except Ikeda-san").
 For the individually designated portion ("Ikeda-san had 80% of the staple food, 70% of the side dish, and 50% of the soup", "Ikeda-san had 50% of the staple food, 80% of the side dish, and 50% of the soup"), the "target person determination process" handles the case where no target person exists (step 603 NO, 610), the case where there is one target person (step 604 YES, 607), the case where there are two or more target persons none of whom share the same full name (step 604 NO, 611), and the case where there are two or more target persons some of whom share the same full name (step 604 NO, 612) in the same manner as for the body temperature input request command described with reference to FIG. 8.
 The batch designation portion for all target persons with an exception, on the other hand, contains the command word "...以外" (meaning "except ...") corresponding to the "batch designation instruction for all users with exception designation". When this command word is present in the text data, the same data is written in a batch for every member of a preset population (for example, all users on the first floor, all users gathered in the dining hall, and so on), with the exception of the "users" standing in a fixed word-order relationship with the command word.
 For example, assume a case in which all users of the facility are gathered in the dining hall and served breakfast, lunch, and dinner. The number of users gathered (for example, 15) and their individual names are known, and it is to be recorded that everyone except a certain user (for example, Takahashi-san) finished both the staple food and the side dish. In such a case, entering a speech individually for each of the 15 users is a considerable burden. By using a speech containing the "batch designation instruction for all users with exception designation" (for example, "Everyone except Takahashi-san finished both the staple food and the side dish"), the effort of voice input can be greatly reduced.
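 A minimal sketch, for illustration only, of the batch write with an exception just described; the population, the record content, and the helper names are assumptions, not the patented implementation.

```python
from typing import Dict, List

def batch_record_with_exceptions(population: List[str], exceptions: List[str],
                                 record: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Write the same record for every user in the preset population,
    except the users named together with the "except" command word."""
    return {user: dict(record) for user in population if user not in exceptions}

# e.g. 15 users in the dining hall, everyone except Takahashi-san finished the meal
population = [f"user{i:02d}" for i in range(1, 14)] + ["Takahashi", "Ikeda"]
written = batch_record_with_exceptions(population, ["Takahashi"],
                                       {"staple_food": "100%", "side_dish": "100%"})
print(len(written), "records written;", "Takahashi" in written)
```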
 According to the embodiment of the present invention described above, data input operations in this kind of elderly care facility can be performed easily and efficiently by voice. The workability of data input can therefore be improved in elderly care facilities that employ care staff who have difficulty reading fine characters on the screens of electronic devices, or that employ foreign workers.
 The present invention can be used effectively by software vendors that provide various kinds of management software to elderly care facilities.
10    Care staff member
20    Voice input / output terminal
20a   Smartphone
20b   Smartwatch
21    LAN
22    Local server
23    Personal computer (PC)
31    Internet
32    Voice processing server
33    Data storage server
34    Dialogue processing server
CR01  Confirmation request speech
CA01  Confirmation answer speech
IR01 to IR04  Input request speech
CR101 to CR111  Confirmation request speech
IR21 to IR22  Input request speech

Claims (9)

  1.  A data input support device in an elderly care facility, comprising:
     a voice input / output terminal which is portable by a care staff member and in which a microphone, a speaker, and a communication function are incorporated;
     a user information holding unit which holds user information concerning the users of the elderly care facility;
     a long-term care necessary information holding unit which holds long-term care necessary information concerning the users of the elderly care facility;
     a speech-to-text conversion unit which converts speech voice data into corresponding text data according to a conversion model generated by learning known input / output relationships;
     a text-to-speech conversion unit which converts text data into corresponding speech voice data according to a conversion model generated by learning known input / output relationships;
     a text interpretation unit which interprets the content of text data, obtained by converting, via the speech-to-text conversion unit, speech voice data acquired through communication with the voice input / output terminal and generated by the care staff member speaking into the microphone, by analyzing the text according to a dialogue model obtained by learning known dialogues;
     a target person determination processing unit which, when the interpretation result in the text interpretation unit is an input request command or a confirmation request command concerning the long-term care necessary information of one user, determines the one user to be the target of the command processing on the basis of the user-specifying word used to specify that user, the user information, and the dialogue processing with the care staff member via the voice input / output terminal; and
     a command execution unit which executes, for the one user determined by the target person determination processing unit, the processing designated by the input request command or the confirmation request command with respect to the long-term care necessary information holding unit,
     wherein, when the user-specifying word consists of only one of a plurality of person-specifying words usually used to identify an individual, such as a "last name", a "first name", a "middle name", and a "nickname", the target person determination processing unit executes a search process in the user information holding unit using the user-specifying word as a search key and, when exactly one user is hit as a result, generates text data corresponding to a predetermined reconfirmation speech containing the remaining one or more of the person-specifying words other than the user-specifying word and transmits it to the voice input / output terminal via the text-to-speech conversion unit, and determines the one user to be the target of the command processing only when text data corresponding to an affirmative reply is thereafter received from the voice input / output terminal via the speech-to-text conversion unit.
  2.  The data input support device in an elderly care facility according to claim 1, wherein the target person determination processing unit executes a search process in the user information holding unit using the user-specifying word as a search key and, when no user is hit as a result, generates text data corresponding to a speech including at least an indication that no user to be the target of the command processing has been found, and transmits it to the voice input / output terminal via the text-to-speech conversion unit.
  3.  The data input support device in an elderly care facility according to claim 1, wherein, when the interpretation result in the text interpretation unit is an input request command concerning the long-term care necessary information of one user, the command execution unit writes the long-term care necessary information received from the voice input / output terminal via the speech-to-text conversion unit and the text analysis unit into the designated user area in the long-term care necessary information holding unit.
  4.  The data input support device in an elderly care facility according to claim 1, wherein, when the interpretation result in the text interpretation unit is a confirmation request command concerning the long-term care necessary information of one user, the command execution unit transmits the long-term care necessary information read out from the designated user area in the long-term care necessary information holding unit to the voice input / output terminal via the text-to-speech conversion unit.
  5.  The data input support device in an elderly care facility according to claim 1, wherein, among the person-specifying words, the user-specifying word is the user's "last name", and the remaining one person-specifying word other than the "last name" is the user's "first name".
  6.  A data input support method in an elderly care facility that includes:
     a voice input/output terminal that can be carried by a care staff member and incorporates a microphone, a speaker, and a communication function;
     a user information holding unit that holds user information on the users of the elderly care facility;
     a care-need information holding unit that holds care-need information on the users of the elderly care facility;
     a voice/text conversion unit that converts speech voice data into corresponding text data in accordance with a conversion model generated by learning known input/output relationships; and
     a text/voice conversion unit that converts text data into corresponding speech voice data in accordance with a conversion model generated by learning known input/output relationships,
     the method comprising:
     a text decoding step of decoding the content of text data obtained by converting, via the voice/text conversion unit, speech voice data that is acquired through communication with the voice input/output terminal and generated by the care staff member speaking into the microphone, by analyzing the text data in accordance with a dialogue model obtained by learning known dialogues;
     a target person determination processing step of, when the decoding result in the text decoding step is an input request command or a confirmation request command concerning the care-need information of a single user, determining the single user targeted by the command processing on the basis of the user-specific word used to identify that user, the user information, and the content of the dialogue with the care staff member via the voice input/output terminal; and
     a command execution step of executing, for the single user determined in the target person determination processing step, the processing designated by the input request command or the confirmation request command on the care-need information holding unit,
     wherein, in the target person determination processing step, when the user-specific word consists of only one of a plurality of "personal identifying words" ordinarily used to identify an individual, such as "last name", "first name", "middle name" and "nickname", a search of the user information holding unit is executed using the "user-specific word" as the search key and, when exactly one user is hit, text data corresponding to a predetermined reconfirmation speech that includes the remaining one or more of the "personal identifying words" other than the "user-specific word" is generated and transmitted to the voice input/output terminal via the text/voice conversion unit, and the single user targeted by the command processing is determined only when text data corresponding to an affirmative reply is subsequently received from the voice input/output terminal via the voice/text conversion unit.
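Lining the method steps up in one orchestrating function gives a sketch like the following; every argument is a stand-in callable (a simple stub), not the trained conversion or dialogue models the claim refers to, and the intent labels are invented for illustration.

def handle_utterance(audio_bytes, stt, dialogue_model, determine_target, execute, tts):
    text = stt(audio_bytes)                      # voice/text conversion step
    command = dialogue_model(text)               # text decoding step: intent and slots
    if command.get("intent") in ("input_request", "confirmation_request"):
        user = determine_target(command["user_specific_word"])   # target person determination step
        if user is None:
            return tts("対象の利用者を確定できませんでした。")
        return execute(command, user)            # command execution step
    return tts("コマンドを認識できませんでした。")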
  7.  A data input support system in an elderly care facility, comprising:
     a voice input/output terminal that can be carried by a care staff member, has a microphone and a speaker, and is provided with a wireless network connection function;
     a user information holding server on a network that holds user information on the users of the elderly care facility;
     a care-need information holding server that holds care-need information on the users of the elderly care facility;
     a voice/text conversion server on the network that converts speech voice data into corresponding text data in accordance with a conversion model generated by learning known input/output relationships;
     a text/voice conversion server on the network that converts text data into corresponding speech voice data in accordance with a conversion model generated by learning known input/output relationships;
     a text decoding server on the network that decodes the content of text data obtained by converting, via the voice/text conversion server, speech voice data that is acquired through communication with the voice input/output terminal and generated by the care staff member speaking into the microphone, by analyzing the text data in accordance with a dialogue model obtained by learning known dialogues;
     a target person determination processing server on the network that, when the decoding result in the text decoding server is an input request command or a confirmation request command concerning the care-need information of a single user, determines the single user targeted by the command processing on the basis of the user-specific word used to identify that user, the user information, and the content of the dialogue processing with the care staff member via the voice input/output terminal; and
     a command execution server that executes, for the single user determined by the target person determination processing server, the processing designated by the input request command or the confirmation request command on the care-need information holding server,
     wherein the target person determination processing server, when the user-specific word consists of only one of a plurality of "personal identifying words" ordinarily used to identify an individual, such as "last name", "first name", "middle name" and "nickname", executes a search of the user information holding server using the "user-specific word" as the search key and, when exactly one user is hit, generates text data corresponding to a predetermined reconfirmation speech that includes the remaining one or more of the "personal identifying words" other than the "user-specific word", transmits the text data to the voice input/output terminal via the text/voice conversion server, and determines the single user targeted by the command processing only when text data corresponding to an affirmative reply is subsequently received from the voice input/output terminal via the voice/text conversion server.
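For the networked arrangement of this claim, the same steps become calls to separate on-network servers. The sketch below is hypothetical throughout: the host name, endpoint paths, and JSON shapes are invented for illustration and do not appear in the application.

import requests

BASE = "http://example.internal"   # placeholder host for the on-network servers

def transcribe(audio_bytes: bytes) -> str:
    # voice/text conversion server
    r = requests.post(f"{BASE}/stt", data=audio_bytes, timeout=10)
    return r.json()["text"]

def decode(text: str) -> dict:
    # text decoding server: returns e.g. {"intent": ..., "user_specific_word": ...}
    r = requests.post(f"{BASE}/decode", json={"text": text}, timeout=10)
    return r.json()

def determine_target(word: str) -> dict:
    # target person determination server (handles the reconfirmation dialogue)
    r = requests.post(f"{BASE}/target", json={"user_specific_word": word}, timeout=10)
    return r.json()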
  8.  A computer program for use in a data input/output device in an elderly care facility that has:
     a voice input/output terminal that can be carried by a care staff member and incorporates a microphone, a speaker, and a communication function;
     a user information holding unit that holds user information on the users of the elderly care facility;
     a care-need information holding unit that holds care-need information on the users of the elderly care facility;
     a voice/text conversion unit that converts speech voice data into corresponding text data in accordance with a conversion model generated by learning known input/output relationships; and
     a text/voice conversion unit that converts text data into corresponding speech voice data in accordance with a conversion model generated by learning known input/output relationships,
     the computer program causing a computer to function as a device that includes:
     a text decoding unit that decodes the content of text data obtained by converting, via the voice/text conversion unit, speech voice data that is acquired through communication with the voice input/output terminal and generated by the care staff member speaking into the microphone, by analyzing the text data in accordance with a dialogue model obtained by learning known dialogues;
     a target person determination processing unit that, when the decoding result in the text decoding unit is an input request command or a confirmation request command concerning the care-need information of a single user, determines the single user targeted by the command processing on the basis of the user-specific word used to identify that user, the user information, and the dialogue processing with the care staff member via the voice input/output terminal; and
     a command execution unit that executes, for the single user determined by the target person determination processing unit, the processing designated by the input request command or the confirmation request command on the care-need information holding unit,
     wherein the target person determination processing unit, when the user-specific word consists of only one of a plurality of "personal identifying words" ordinarily used to identify an individual, such as "last name", "first name", "middle name" and "nickname", executes a search of the user information holding unit using the "user-specific word" as the search key and, when exactly one user is hit, generates text data corresponding to a predetermined reconfirmation speech that includes the remaining one or more of the "personal identifying words" other than the "user-specific word", transmits the text data to the voice input/output terminal via the text/voice conversion unit, and determines the single user targeted by the command processing only when text data corresponding to an affirmative reply is subsequently received from the voice input/output terminal via the voice/text conversion unit.
  9.  A computer program for use in a data input support system in an elderly care facility that includes:
     a voice input/output terminal that can be carried by a care staff member, has a microphone and a speaker, and is provided with a wireless network connection function;
     a user information holding server on a network that holds user information on the users of the elderly care facility;
     a care-need information holding server that holds care-need information on the users of the elderly care facility;
     a voice/text conversion server on the network that converts speech voice data into corresponding text data in accordance with a conversion model generated by learning known input/output relationships; and
     a text/voice conversion server on the network that converts text data into corresponding speech voice data in accordance with a conversion model generated by learning known input/output relationships,
     the computer program causing a computer to function as a server that includes:
     a text decoding unit that decodes the content of text data obtained by converting, via the voice/text conversion server, speech voice data that is acquired through communication with the voice input/output terminal and generated by the care staff member speaking into the microphone, by analyzing the text data in accordance with a dialogue model obtained by learning known dialogues;
     a target person determination processing unit that, when the decoding result in the text decoding unit is an input request command or a confirmation request command concerning the care-need information of a single user, determines the single user targeted by the command processing on the basis of the user-specific word used to identify that user, the user information, and the dialogue processing with the care staff member via the voice input/output terminal; and
     a command execution unit that executes, for the single user determined by the target person determination processing unit, the processing designated by the input request command or the confirmation request command on the care-need information holding server,
     wherein the target person determination processing unit, when the user-specific word consists of only one of a plurality of "personal identifying words" ordinarily used to identify an individual, such as "last name", "first name", "middle name" and "nickname", executes a search of the user information holding server using the "user-specific word" as the search key and, when exactly one user is hit, generates text data corresponding to a predetermined reconfirmation speech that includes the remaining one or more of the "personal identifying words" other than the "user-specific word", transmits the text data to the voice input/output terminal via the text/voice conversion server, and determines the single user targeted by the command processing only when text data corresponding to an affirmative reply is subsequently received from the voice input/output terminal via the voice/text conversion server.
PCT/JP2020/001342 2020-01-16 2020-01-16 Data-input assisting device WO2021144930A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/001342 WO2021144930A1 (en) 2020-01-16 2020-01-16 Data-input assisting device

Publications (1)

Publication Number Publication Date
WO2021144930A1 (en)

Family

ID=76864577

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/001342 WO2021144930A1 (en) 2020-01-16 2020-01-16 Data-input assisting device

Country Status (1)

Country Link
WO (1) WO2021144930A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001204774A (en) * 2000-01-26 2001-07-31 Matsushita Electric Works Ltd System for supporting communication in care home
JP2008234443A (en) * 2007-03-22 2008-10-02 Matsushita Electric Ind Co Ltd Information processor
JP2010015255A (en) * 2008-07-01 2010-01-21 Fujitsu Ltd Management system, computer program and management method

Similar Documents

Publication Publication Date Title
US8880403B2 (en) Methods and systems for obtaining language models for transcribing communications
KR101689290B1 (en) Device for extracting information from a dialog
US9053096B2 (en) Language translation based on speaker-related information
JP2020532757A (en) Intercom-type communication using multiple computing devices
US10950220B1 (en) User feedback for speech interactions
JP7276129B2 (en) Information processing device, information processing system, information processing method, and program
US20160189103A1 (en) Apparatus and method for automatically creating and recording minutes of meeting
Fager et al. Evaluation of a speech recognition prototype for speakers with moderate and severe dysarthria: A preliminary report
CN106713111B (en) Processing method for adding friends, terminal and server
CN109285548A (en) Information processing method, system, electronic equipment and computer storage medium
US20210350784A1 (en) Correct pronunciation of names in text-to-speech synthesis
KR102548365B1 (en) Method for generating conference record automatically and apparatus thereof
CN115280274A (en) Environment collaboration intelligence system and method
WO2019026617A1 (en) Information processing device and information processing method
KR20130086971A (en) Question answering system using speech recognition and its application method thereof
CN116762125A (en) Environment collaboration intelligent system and method
CN114613461A (en) Intelligent entry method and system for outpatient service medical record
JP2017219845A (en) Speech promotion apparatus and speech promotion program
Meliones et al. SeeSpeech: an android application for the hearing impaired
WO2021144930A1 (en) Data-input assisting device
WO2021144931A1 (en) Data input assisting device
WO2021144933A1 (en) Data input support device
CN111556096B (en) Information pushing method, device, medium and electronic equipment
JP7258686B2 (en) Information processing system, information processing method, and program
CN113961680A (en) Human-computer interaction based session processing method and device, medium and electronic equipment

Legal Events

121 (EP): The EPO has been informed by WIPO that EP was designated in this application. Ref document number: 20914209; country of ref document: EP; kind code of ref document: A1.
NENP: Non-entry into the national phase. Ref country code: DE.
122 (EP): PCT application non-entry in European phase. Ref document number: 20914209; country of ref document: EP; kind code of ref document: A1.
NENP: Non-entry into the national phase. Ref country code: JP.