WO2019103200A1 - Method and device for providing integrated voice secretary service - Google Patents

Method and device for providing integrated voice secretary service Download PDF

Info

Publication number
WO2019103200A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
terminal
server
secretary
recognition result
Prior art date
Application number
PCT/KR2017/013512
Other languages
French (fr)
Korean (ko)
Inventor
정종일
김용진
Original Assignee
주식회사 모다
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 주식회사 모다 filed Critical 주식회사 모다
Publication of WO2019103200A1 publication Critical patent/WO2019103200A1/en

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/487 Arrangements for providing information services, e.g. recorded voice services or time announcements
    • H04M3/493 Interactive information services, e.g. directory enquiries; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals

Definitions

  • the present invention relates to a method and apparatus for providing an integrated voice secretary service.
  • the touch interface is intuitive and has the advantage of providing immediate feedback on commands. However, in situations that require complex interaction, such as when both hands are not free, when complex commands must be performed, when a command requires several steps of interaction, or when long text must be entered, entering commands one by one through the touch interface is inconvenient.
  • the voice interface is a natural and intuitive interface that is being used for services that require fast interaction.
  • RDNN (Recurrent Deep Neural Network)
  • Siri is a voice secretary service that works on Apple iOS and macOS devices. A brief description of how Siri works is given below.
  • Siri's basic wake command is "Hey Siri" (or "시리야" in Korean).
  • the user terminal records the voice of the user and transmits it to the voice secretary server.
  • the voice secretary server recognizes the user's voice and converts it into text.
  • the voice secretary server analyzes the converted text using artificial intelligence.
  • the voice secretary server gives a specific answer to the user terminal according to the analyzed content or causes the user terminal to run a specific app. It may also perform an operation of controlling a user terminal other than the one that received the voice input.
  • the user must subscribe to and register with each voice secretary service separately before using it. Also, if a different voice secretary service is provided for each terminal carried by the user, the user cannot be provided with a continuous voice secretary service across various situations and environments. It is therefore necessary to interwork a plurality of voice secretary services.
  • Korean Patent Laid-Open Publication No. 2016-0071111 (Jun. 21, 2016), 'Providing Personal Secretary Service in Electronic Device', discloses a voice secretary service that provides a response to a third party other than the owner of the terminal. For example, when the owner of the terminal is in a meeting and a third party asks "When does it end?", the terminal's voice secretary service automatically generates the voice response "It ends in one hour" and transmits it to the third party.
  • this document discloses a configuration for providing services for a plurality of terminals in a single voice secretarial service, but does not disclose a configuration for interworking a plurality of voice secretarial services.
  • a plurality of voice secretarial services can be interlinked to ensure continuity of voice secretary services in various situations and environments.
  • a method of providing an integrated voice secretary service comprising: receiving a voice command recognition result from a first voice secretary server; Analyzing the speech recognition result to identify a target terminal to which the speech recognition result is to be transmitted; Searching for a voice secretary server interlocked with the target terminal; Selecting one of the retrieved voice secretary servers; And transmitting the voice recognition result to the selected voice secretary server.
  • a method of providing an integrated voice secretary service comprising: receiving a recorded voice command from a first terminal; A first selection step of selecting a voice secretary server to provide a service to the first terminal; Transmitting the recorded voice command to the first voice secretary server selected in the first selecting step; Receiving a recognition result of the recorded voice command from the first voice secretary server; Analyzing the speech recognition result to identify a second terminal to transmit the speech recognition result; A second selection step of selecting a voice secretary server to provide a service to the second terminal; Transmitting the voice recognition result to a second voice secretary server selected in the second selection process; Receiving a service packet according to the speech recognition result from the second voice secretary server; And transmitting the service packet to the second terminal.
  • an integrated voice secretary server comprising: a first receiving unit for receiving a voice command recorded from a first terminal; A first selector for selecting a voice secretary server to provide a service to the first terminal; A first transmission unit for transmitting the recorded voice command to the first voice secretary server selected by the first selection unit; A second receiving unit for receiving the recognition result of the recorded voice command from the first voice secretary server; A determination unit for analyzing the speech recognition result and determining a second terminal to transmit the speech recognition result; A second selector for selecting a voice secretary server to provide a service to the second terminal; A second transmitting unit for transmitting the voice recognition result to the second voice secretary server selected by the second selecting unit; A third receiving unit for receiving a service packet according to the speech recognition result from the second voice secretary server; And a third transmission unit for transmitting the service packet to the second terminal.
  • a plurality of voice secretarial services can be interlinked to ensure continuity of voice secretary services in various situations and environments.
  • all kinds of voice secretary services can be provided through one integrated voice secretary service subscription.
  • FIG. 1 illustrates an integrated voice secretary service providing system according to a first embodiment of the present invention.
  • FIG. 2 is a diagram illustrating a specific operation of the integrated voice secretary service providing system according to the first embodiment of the present invention.
  • FIG. 3 is a diagram showing a translation process added to the integrated voice secretary service providing system according to the first embodiment of the present invention.
  • FIG. 4 is a flowchart illustrating an integrated voice secretary service providing method according to the first embodiment of the present invention.
  • FIG. 5 is a diagram illustrating an integrated voice secretary service providing system according to a second embodiment of the present invention.
  • FIG. 6 is a flowchart illustrating an integrated voice secretary service providing method according to a second embodiment of the present invention.
  • FIG. 7 is a block diagram illustrating a server providing an integrated voice secretary service according to a second embodiment of the present invention.
  • in describing the components of an embodiment of the present invention, reference signs such as first, second, i), ii), a), b), and the like may be used.
  • such signs are intended only to distinguish one component from another; the nature, sequence, or order of the components is not limited by them. When a part is described as 'comprising' or 'including' a component, this does not exclude other components but means that other components may be further included, unless explicitly stated otherwise.
  • terms such as 'unit' and 'module' refer to a unit that processes at least one function or operation and may be implemented as hardware, software, or a combination of hardware and software.
  • because no interworking function is provided between the voice secretary services, users experience considerable inconvenience.
  • the user must subscribe to and register with each voice secretary service separately in order to use it, which makes account management and use cumbersome.
  • when the user's terminals support different voice secretary services, a continuous and consistent voice secretary service cannot be provided in various situations and environments.
  • the smartphone is a Samsung product (voice secretary service: Samsung Bixby), the speaker is an Amazon product (voice secretary service: Amazon Alexa), the personal tablet is an Apple product (voice secretary service: Apple Siri),
  • the office tablet is a Google product (voice secretary service: Google Assistant), the TV set-top box is a KT product (voice secretary service: KT GiGA Genie), and the office Internet phone is an SKT product (voice secretary service: SKT NUGU). Because the voice secretary service linked to each terminal differs, it is difficult to share information between the voice secretary services across the terminals. For example, when the user is using voice secretary service A, it is difficult to provide an associated service to a terminal interlocked with voice secretary service B.
  • FIG. 1 illustrates an integrated voice secretary service providing system according to a first embodiment of the present invention.
  • the integrated voice secretary service providing system includes a plurality of terminals, a plurality of voice secretary servers, and an integrated voice secretary service providing server (hereinafter, 'integration server').
  • the integration server is located behind each voice secretary server, so the service provided by the integration server is not exposed to the users of the terminals.
  • the first terminal is provided with voice secretarial service through the first voice secretarial server.
  • the second terminal is provided with voice secretarial service through the second voice secretarial server.
  • the integration server relays the first voice secretary server and the second voice secretary server.
  • by connecting the voice secretary services linked to each terminal, the user can receive the voice secretary service regardless of the user's spatial or temporal location or the device currently in the user's possession.
  • for example, a service provided by the voice secretary service (KT GiGA Genie) of the KT TV set-top box at home can be provided through the voice secretary service (Samsung Bixby) of the Samsung smartphone that the user is currently carrying outside the home.
  • FIG. 2 is a diagram illustrating a specific operation of the integrated voice secretary service providing system according to the first embodiment of the present invention.
  • when the user of the first terminal speaks to the first terminal, the first terminal records the user's voice (hereinafter, 'voice command') and transmits it to the first voice secretary server.
  • the first voice secretary server recognizes the voice command and converts it into text.
  • the first voice secretary server may further include a function of analyzing the semantic content of the voice command converted into text. Analysis techniques such as machine learning, big data, and artificial intelligence may be used in recognizing the voice command and converting it into text, or in analyzing the semantic content of the converted text.
  • the first voice secretary server transmits the voice recognition result to the integration server.
  • the voice recognition result transmitted from the first voice secretary server to the integration server may be the result of converting the voice command into text (hereinafter, 'voice conversion result'), the result of analyzing the semantic content of the converted text (hereinafter, 'semantic analysis result'), or both the voice conversion result and the semantic analysis result.
  • the integration server identifies the second terminal to which the voice secretary service is to be provided, using the voice recognition result.
  • the integration server can identify the second terminal by analyzing the semantic contents of the voice conversion result.
  • the integration server can identify the second terminal using the semantic analysis result.
  • the integration server can identify the second terminal by using both the result of analyzing the meaning of the voice conversion result and the semantic analysis result received from the first voice secretarial server.
  • the integration server searches for the second terminal in its own database to determine which voice secretary service is linked to the second terminal.
  • when a plurality of voice secretary services are linked to the second terminal, the integration server selects one voice secretary server according to the status of the server providing each service and the priority set by the user of the second terminal. For example, if the A and C voice secretary servers are currently in a good (uncongested) state and the priority set by the user of the second terminal is B, C, and A, the C voice secretary server can be selected.
  • the integration server sends the speech recognition result to the selected second voice secretary server.
  • the voice recognition result transmitted by the integrated server to the second voice secretary server may be a voice conversion result, a semantic analysis result, or both a voice conversion result and a semantic analysis result.
  • the second voice secretarial server provides the voice secretary service to the second terminal using the voice recognition result received.
  • the second voice secretarial server may analyze the semantic content of the voice conversion result and provide the voice secretary service to the second terminal.
  • the second voice secretary server may provide the voice secretary service to the second terminal using the semantic analysis result.
  • the second voice secretary server can provide the voice secretary service to the second terminal by using both the result of analyzing the meaning of the voice conversion result and the semantic analysis result received from the integration server.
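  • purely as an illustration of the relay behaviour described above for FIG. 2 (receive the recognition result, identify the second terminal, look up its linked voice secretary servers, choose one by server status and user priority, and forward the result), the following is a minimal sketch in Python; the helper functions and the registry layout are assumptions for the sketch, not definitions from the present disclosure.

```python
# Minimal sketch of the FIG. 2 relay flow in the first embodiment. The two helpers
# below are placeholders for the semantic-analysis and network steps; the data
# structures are assumptions, not definitions from the patent.

def identify_target_terminal(recognition_result):
    # Placeholder for analyzing the voice conversion result / semantic analysis
    # result to find the terminal the command is aimed at.
    return recognition_result["target_terminal"]

def send_to_server(server_name, payload):
    # Placeholder for the actual network call to the selected voice secretary server.
    print(f"forwarding to {server_name}: {payload}")

def relay_recognition_result(recognition_result, registry, server_status):
    """Receive a recognition result, pick a server for the target terminal, forward it."""
    target = identify_target_terminal(recognition_result)    # identify the second terminal
    linked = registry[target]["servers"]                      # servers linked to that terminal
    priority = registry[target]["priority"]                   # user-set priority order
    # Honour the user's priority, skipping servers that are currently congested or down.
    candidates = [s for s in priority if s in linked and server_status.get(s, False)]
    if not candidates:
        raise RuntimeError(f"no available voice secretary server for {target}")
    chosen = candidates[0]
    send_to_server(chosen, recognition_result)                # forward the recognition result
    return chosen
```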
  • FIG. 3 is a diagram showing a translation process added to the integrated voice secretary service providing system according to the first embodiment of the present invention.
  • the integrated voice secretary service providing system may include a translation process.
  • for example, the user of the first terminal may use a voice secretary service with high Korean recognition capability, while the user of the second terminal uses a voice secretary service with high recognition capability for a different language.
  • the integration server may further include a function of translating the voice recognition result into the counterpart terminal's language so that the two voice secretary services can be optimally interworked.
  • when the user of the first terminal speaks to the first terminal, the first terminal records the voice command and transmits it to the first voice secretary server.
  • the first voice secretary server recognizes the voice command and converts it into text. Further, the first voice secretary server may further include a function of analyzing the semantic content of the voice command converted into text.
  • the first voice secretary server transmits the voice recognition result to the integration server.
  • the voice recognition result transmitted by the first voice secretarial server to the integration server may be a voice conversion result, a semantic analysis result, or both a voice conversion result and a semantic analysis result.
  • the integration server identifies the second terminal to which the voice secretary service is to be provided, using the voice recognition result.
  • the integration server can identify the second terminal by analyzing the semantic contents of the voice conversion result.
  • the integration server can identify the second terminal using the semantic analysis result.
  • the integration server can identify the second terminal by using both the result of analyzing the meaning of the voice conversion result and the semantic analysis result received from the first voice secretarial server.
  • the integration server searches for the second terminal in its own database to determine which voice secretary service is linked to the second terminal.
  • when a plurality of voice secretary services are linked to the second terminal, the integration server selects one voice secretary server according to the status of the server providing each service and the priority set by the user of the second terminal. For example, if the A and C voice secretary servers are currently in a good state and the priority set by the user of the second terminal is B, C, and A, the C voice secretary server can be selected.
  • before transmitting the voice recognition result to the selected second voice secretary server, the integration server determines whether the language used by the first terminal and the language used by the second terminal are the same. If the languages are different, the integration server translates the voice recognition result into the language used by the second terminal and transmits the translated voice recognition result to the second voice secretary server.
  • the voice recognition result transmitted by the integration server to the second voice secretary server may be a translated voice conversion result, a translated semantic analysis result, or both a translated voice conversion result and a translated semantic analysis result.
  • the second voice secretarial server provides the voice secretary service to the second terminal using the voice recognition result received.
  • the second voice secretarial server may analyze the semantic content of the translated voice conversion result and provide the voice secretary service to the second terminal.
  • the second voice secretary server may provide the voice secretary service to the second terminal using the translated semantic analysis result.
  • the second voice secretarial server can provide the voice secretarial service to the second terminal by using both the result of analyzing the meaning of the translated voice conversion result itself and the translated semantic analysis result received from the integration server.
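  • the language check and translation step added in FIG. 3 can be sketched as follows; translate_text is only a stand-in for whatever translation engine the integration server would actually use, and the field names are assumptions for the sketch.

```python
# Sketch of the translation step added in FIG. 3: translate the recognition result
# only when the two terminals use different languages. translate_text is a stub
# standing in for an arbitrary translation backend.

def translate_text(text, source_lang, target_lang):
    raise NotImplementedError("hook up a translation backend here")

def prepare_result_for_target(recognition_result, source_lang, target_lang):
    """Return the recognition result in the language used by the second terminal."""
    if source_lang == target_lang:
        return recognition_result                       # no translation needed
    translated = dict(recognition_result)
    if "text" in translated:                            # translated voice conversion result
        translated["text"] = translate_text(translated["text"], source_lang, target_lang)
    if "semantics" in translated:                       # translated semantic analysis result
        translated["semantics"] = translate_text(translated["semantics"], source_lang, target_lang)
    return translated
```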
  • FIG. 4 is a flowchart illustrating an integrated voice secretary service providing method according to the first embodiment of the present invention.
  • the method of providing the integrated voice secretary service of FIG. 4 may be performed by an integrated server, a gateway device, or other network device.
  • the integrated voice secretary service providing method includes receiving a voice recognition result from a first voice secretary server interlocked with a first terminal (S410), analyzing the voice recognition result to identify a second terminal to which the result is to be transmitted (S420), searching for the voice secretary servers linked with the second terminal (S430), selecting one of the voice secretary servers linked with the second terminal (S440), and transmitting the voice recognition result to the selected voice secretary server (S450).
  • the received voice recognition result may be a voice conversion result, a semantic analysis result, or both the voice conversion result and the semantic analysis result.
  • the process of identifying the second terminal to which the voice secretary service is to be provided using the voice recognition result may include analyzing the semantic content of the voice conversion result.
  • the process of identifying the second terminal may include analyzing the semantic content of the voice conversion result to identify the second terminal, identifying the second terminal using the semantic analysis result, or identifying the second terminal using both the server's own analysis of the voice conversion result and the semantic analysis result received from the first voice secretary server.
  • the step of searching for the voice secretary servers linked with the second terminal may include searching for the second terminal in its own database and determining which voice secretary services are linked to the second terminal.
  • when a plurality of voice secretary services are interlocked with the second terminal, the step S440 of selecting one of the voice secretary servers may include selecting one voice secretary service using the status of the server providing each service, the priority set by the user, and the like. For example, if the A and C voice secretary servers are currently in a good state and the priority set by the user of the second terminal is B, C, and A, the C voice secretary server can be selected.
  • the voice recognition result to be transmitted may be a voice conversion result, a semantic analysis result, or both a voice conversion result and a semantic analysis result.
  • the semantic analysis result may be received in step S410 or analyzed in step S420.
  • the method may further include, prior to step S410, registering user terminal information, information on the voice secretary services interlocked with each user terminal, and information on the priorities of the interlocked voice secretary services (see the registration sketch after the description of FIG. 4 below).
  • a step of translating the speech recognition result into the language used in the second terminal may be further included between steps S420 and S450.
  • in that case, the voice recognition result transmitted in step S450 may be the translated voice recognition result.
  • although FIG. 4 shows S410 to S450 being executed sequentially, this merely illustrates the technical idea of the present invention, and the execution of S410 to S450 is not limited to this time-series order.
  • those skilled in the art may change the order of S410 to S450, omit one or more of the steps, or execute one or more of the steps in parallel without departing from the essential characteristics of the present invention, so the method of FIG. 4 can be variously modified and altered.
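  • the registration information mentioned above for step S410 (user terminal information, the voice secretary services interlocked with each terminal, and their priorities) could, for illustration, be kept in a structure like the following; the device and service names are example values borrowed from the multi-vendor scenario described earlier, not a prescribed schema.

```python
# Illustrative layout of the per-user registration data the integration server keeps
# prior to S410: which voice secretary services are interlocked with each terminal
# and in what priority order the user prefers them. All names are example values.
USER_REGISTRY = {
    "user-001": {
        "smartphone":     {"servers": ["Samsung Bixby"],  "priority": ["Samsung Bixby"]},
        "speaker":        {"servers": ["Amazon Alexa"],   "priority": ["Amazon Alexa"]},
        "tv_settop_box":  {"servers": ["KT GiGA Genie"],  "priority": ["KT GiGA Genie"]},
        # A terminal may be interlocked with several services; the user-set priority
        # (here B, C, A) is consulted together with each server's current status.
        "office_tablet":  {"servers": ["A", "B", "C"],    "priority": ["B", "C", "A"]},
    }
}
```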
  • FIG. 5 is a diagram illustrating an integrated voice secretary service providing system according to a second embodiment of the present invention.
  • the integrated voice secretary service provision system includes a plurality of terminals, a plurality of voice secretarial servers, and an integration server.
  • the integration server is located in front of each voice secretary server, so the service provided by the integration server is exposed to the users of the terminals.
  • the first terminal is provided with the voice secretary service provided by the first voice secretarial server via the integration server.
  • the second terminal is provided with the voice secretary service provided by the second voice secretarial server via the integration server.
  • by registering only with the integration server, the user can receive the voice secretary services provided by each voice secretary server without registering with each server separately.
  • when the user of the first terminal speaks to the first terminal, the first terminal records the user's voice command and transmits the voice command to the integration server.
  • the integration server transmits the voice command to the first voice secretary server providing the voice secretary service set by the user of the first terminal.
  • when a plurality of voice secretary services are set for the first terminal, the integration server selects one voice secretary service using the status of the server providing each service, the priority set by the user of the first terminal, and the like. For example, if the A and C voice secretary servers are currently in a good state and the priority set by the user of the first terminal is B, C, and A, the C voice secretary server can be selected.
  • the selected first voice secretary server recognizes the voice command and converts it into text.
  • the first voice secretary server may further include a function of analyzing the semantic content of the voice command converted into text. Analysis techniques such as machine learning, big data, and artificial intelligence may be used in recognizing the voice command and converting it into text, or in analyzing the semantic content of the converted text.
  • the first voice secretary server transmits the voice recognition result to the integration server.
  • the voice recognition result transmitted by the first voice secretarial server to the integration server may be a voice conversion result, a semantic analysis result, or both a voice conversion result and a semantic analysis result.
  • the integration server identifies the second terminal to which the voice secretary service is to be provided, using the voice recognition result.
  • the integration server can identify the second terminal by analyzing the semantic contents of the voice conversion result.
  • the integration server can identify the second terminal using the semantic analysis result.
  • the integration server can identify the second terminal by using both the result of analyzing the meaning of the voice conversion result and the semantic analysis result received from the first voice secretarial server.
  • the integration server searches for the second terminal in its own database and identifies the voice secretary services set by the user of the second terminal.
  • when a plurality of voice secretary services are set for the second terminal, the integration server selects one voice secretary service using the status of the server providing each service, the priority set by the user of the second terminal, and the like. For example, if the A and C voice secretary servers are currently in a good state and the priority set by the user of the second terminal is B, C, and A, the C voice secretary server can be selected.
  • the integration server sends a speech recognition result to a second voice secretary server that provides the selected voice secretary service.
  • the voice recognition result transmitted by the integrated server to the second voice secretary server may be a voice conversion result, a semantic analysis result, or both a voice conversion result and a semantic analysis result.
  • the second voice secretary server generates a service packet for the second terminal using the voice recognition result received.
  • the second voice secretarial server may analyze the semantic content of the voice conversion result to generate a service packet for the second terminal.
  • the second voice secretary server may generate a service packet for the second terminal using the semantic analysis result.
  • the second voice secretarial server itself can generate a service packet for the second terminal by using both the result of analyzing the meaning of the voice conversion result and the semantic analysis result received from the integration server.
  • the second voice secretary server transmits a service packet for the second terminal to the integration server.
  • the integration server transmits the service packet received from the second voice secretary server to the second terminal.
  • the integrated voice secretary agent can be configured to be interworked with various voice secretary services.
  • the user can use all the voice secretary services by registering only with the integrated voice secretary agent, which greatly improves user convenience.
  • FIG. 6 is a flowchart illustrating an integrated voice secretary service providing method according to a second embodiment of the present invention.
  • the integrated voice secretary service providing method of FIG. 6 may be performed by an integrated server, a gateway device, or other network device.
  • the integrated voice secretary service providing method includes receiving a recorded voice command from a first terminal (S610), a first selection process of selecting a voice secretary server to provide a service to the first terminal (S620), transmitting the recorded voice command to the first voice secretary server selected in the first selection process (S630), receiving a recognition result of the recorded voice command from the first voice secretary server (S640), analyzing the voice recognition result to identify a second terminal to which the voice recognition result is to be transmitted (S650), a second selection process of selecting a voice secretary server to provide a service to the second terminal (S660), transmitting the voice recognition result to the second voice secretary server selected in the second selection process (S670), receiving a service packet according to the voice recognition result from the second voice secretary server (S680), and transmitting the service packet to the second terminal (S690).
  • the recorded voice command may be in various file formats.
  • for example, a recorded voice command may be in a file format such as mp3, wav, wma, 3gp, aiff, aac, alac, amr, au, awb, dvf, flac, mmf, mpc, msv, ogg, ra, rm, tta, or the like.
  • in the first selection process, the selection of the voice secretary server may be determined according to the status of each voice secretary server and the priority set by the user. For example, if the A and C voice secretary servers are currently in a good state and the priority set by the user of the first terminal is B, C, and A, the C voice secretary server can be selected.
  • the step S630 of transmitting the recorded voice command to the first voice secretary server selected in the first selection process may include converting the voice command into the file format used by the first voice secretary server and then transmitting it.
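  • as an illustrative sketch of that conversion step, the following uses the pydub library purely as an example tool; the set of formats accepted by each server is assumed to come from the integration server's own records, and none of this is prescribed by the present disclosure.

```python
# Example of converting a recorded voice command to a file format accepted by the
# selected voice secretary server before transmitting it (step S630). pydub is used
# only as an illustration; any audio conversion tool would do.
from pydub import AudioSegment

def convert_command(path, accepted_formats):
    """Return a file path whose format is accepted by the target server."""
    ext = path.rsplit(".", 1)[-1].lower()
    if ext in accepted_formats:
        return path                       # already in an accepted format
    target = accepted_formats[0]          # pick the server's preferred format
    out_path = path.rsplit(".", 1)[0] + "." + target
    AudioSegment.from_file(path).export(out_path, format=target)
    return out_path
```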
  • the received voice recognition result may be a voice conversion result, a semantic analysis result, or both a voice conversion result and a semantic analysis result.
  • the second terminal can be identified by analyzing the semantic content of the voice conversion result, by using the semantic analysis result, or by using both the server's own analysis of the voice conversion result and the semantic analysis result received from the first voice secretary server.
  • in the second selection process, the selection of the voice secretary server may likewise be determined according to the status of each voice secretary server and the priority set by the user. For example, if the A and C voice secretary servers are currently in a good state and the priority set by the user of the second terminal is B, C, and A, the C voice secretary server can be selected.
  • the transmitted voice recognition result may be a voice conversion result, a semantic analysis result, or both the voice conversion result and the semantic analysis result.
  • the received service packet may be a packet for the service that the second voice secretary server intends to provide to the second terminal, generated using the voice recognition result.
  • in step S690 of transmitting the service packet to the second terminal, a command for launching the app that provides the second voice secretary service may be transmitted to the second terminal together with the service packet.
  • the method may further include, prior to step S610, registering user terminal information, information on the voice secretary services the user terminal wishes to be provided with, and information on the priorities of those voice secretary services.
  • between steps S650 and S670, a step of translating the voice recognition result into the language used by the second terminal may be further included.
  • in that case, the voice recognition result transmitted in step S670 may be the translated voice recognition result.
  • FIG. 6 shows S610 to S690 being executed sequentially.
  • however, this merely illustrates the technical idea of the present invention, and the execution of S610 to S690 is not limited to this time-series order.
  • those skilled in the art may change the order of S610 to S690, omit one or more of the steps, or execute one or more of the steps in parallel without departing from the essential characteristics of the present invention.
  • accordingly, the method of FIG. 6 can be variously modified and altered.
  • FIG. 7 is a block diagram illustrating a server providing an integrated voice secretary service according to a second embodiment of the present invention.
  • the integration server 720 includes a first receiving unit 721, a second receiving unit 722, a third receiving unit 723, a first selecting unit 724, a second selecting unit 725, a determination unit 726, a first transmitting unit 727, a second transmitting unit 728, and a third transmitting unit 729.
  • the first receiving unit 721 receives the recorded voice command from the first terminal 711.
  • the recorded voice command may be stored in a file format such as mp3, wav, wma, 3gp, aiff, aac, alac, amr, au, awb, dvf, flac, mmf, mpc, msv, ogg, ra, rm, tta, or the like.
  • the first selection unit 724 selects a voice secretary server to provide the service to the first terminal 711.
  • the selection of the voice secretary server can be determined according to the status of each voice secretary server and the priority set by the user. For example, if the A and C voice secretary servers are currently in a good state and the priority set by the user of the first terminal 711 is B, C, and A, the C voice secretary server can be selected.
  • the first transmitting unit 727 transmits the recorded voice command to the first voice secretary server 731 selected by the first selecting unit 724.
  • the first transmission unit 727 may convert the recorded voice command into a file format used by the first voice secretary server 731, and then transmit the converted voice command.
  • the second receiving unit 722 receives the recognition result of the recorded voice command from the first voice secretarial server 731.
  • the received speech recognition result may be a speech conversion result, a semantic analysis result, or both a speech conversion result and a semantic analysis result.
  • the determination unit 726 analyzes the speech recognition result and determines the second terminal 712 to transmit the speech recognition result.
  • the determination unit 726 can determine the second terminal 712 by analyzing the semantic content of the voice conversion result, by using the semantic analysis result, or by using both its own analysis of the voice conversion result and the semantic analysis result received from the first voice secretary server 731.
  • the second selection unit 725 selects a voice secretary server to provide service to the second terminal 712.
  • the selection of the voice secretary server can be determined according to the status of each voice secretary server and the priority set by the user. For example, if the A and C voice secretary servers are currently in a good state and the priority set by the user of the second terminal 712 is B, C, and A, the C voice secretary server can be selected.
  • the second transmitting unit 728 transmits the voice recognition result to the second voice secretary server 732 selected by the second selecting unit 725.
  • the transmitted speech recognition result may be a speech conversion result, a semantic analysis result, or both a speech conversion result and a semantic analysis result.
  • the third receiving unit 723 receives the service packet according to the voice recognition result from the second voice secretary server 732.
  • the service packet may be a service packet for a service that the second voice secretary server 732 wants to provide to the second terminal 712 generated using the voice recognition result.
  • the third transmitting unit 729 transmits the service packet to the second terminal 712.
  • the third transmitting unit 729 may transmit the service packet to the second terminal 712 through the app (application) that provides the second voice secretary service.
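  • to make the division into receiving, selecting, determining, and transmitting units more concrete, the following is a compact, assumption-laden sketch of an integration server object arranged along the lines of FIG. 7; the transport and analysis callables are injected placeholders rather than interfaces defined by the present disclosure, and the reference numerals and step numbers appear only as comments.

```python
# Compact sketch of the FIG. 7 integration server (second embodiment). The registry
# layout is an assumption; send_to / receive_from / identify_target are supplied by
# the caller because the patent does not define the transport or analysis details.

class IntegrationServer:
    def __init__(self, registry, server_status, send_to, receive_from, identify_target):
        self.registry = registry                 # terminal -> interlocked services + priority
        self.server_status = server_status       # service name -> currently healthy or not
        self.send_to = send_to                   # network send (placeholder callable)
        self.receive_from = receive_from         # network receive (placeholder callable)
        self.identify_target = identify_target   # semantic analysis of the recognition result

    def select_server(self, terminal):
        """Selecting units 724/725: user priority filtered by current server status."""
        entry = self.registry[terminal]
        for service in entry["priority"]:
            if service in entry["servers"] and self.server_status.get(service, False):
                return service
        raise RuntimeError(f"no available voice secretary server for {terminal}")

    def handle_voice_command(self, first_terminal, recorded_command):
        """FIG. 6 flow, after the recorded command has been received (unit 721, S610)."""
        first_server = self.select_server(first_terminal)        # unit 724, S620
        self.send_to(first_server, recorded_command)              # unit 727, S630
        result = self.receive_from(first_server)                  # unit 722, S640
        second_terminal = self.identify_target(result)            # unit 726, S650
        second_server = self.select_server(second_terminal)       # unit 725, S660
        self.send_to(second_server, result)                       # unit 728, S670
        packet = self.receive_from(second_server)                 # unit 723, S680
        self.send_to(second_terminal, packet)                     # unit 729, S690
        return packet
```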
  • a recording medium readable by a computer or a smartphone includes all kinds of recording devices in which data readable by a computer system is stored. That is, a recording medium readable by a computer or a smartphone includes magnetic storage media (e.g., ROM, floppy disk, hard disk, etc.), optically readable media (e.g., CD-ROM and the like), flash memory (e.g., USB, SSD), and the like.
  • the recording medium may also be distributed over network-connected computer systems so that code readable by a computer or a smartphone can be stored and executed in a distributed manner.
  • 720: integration server, 721: first receiving unit
  • 724: first selecting unit, 725: second selecting unit
  • 731: first voice secretary server, 732: second voice secretary server

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Signal Processing (AREA)
  • Machine Translation (AREA)
  • Telephonic Communication Services (AREA)

Abstract

Disclosed are a method and a device for providing an integrated voice secretary service. According to an embodiment of the present invention, provided is a method for providing an integrated voice secretary service comprising the steps of: receiving a recognition result of a voice command from a first voice secretary server; locating a target terminal, to which the voice recognition result is to be transmitted, by means of analyzing the voice recognition result; searching for voice secretary servers linked to the target terminal; selecting one among voice secretary servers which have been found; and transmitting the voice recognition result to the selected voice secretary server.

Description

Method and apparatus for providing an integrated voice secretary service
The present invention relates to a method and apparatus for providing an integrated voice secretary service.
The contents described in this section merely provide background information on the present invention and do not constitute prior art.
Since the development of various types of personal smart devices such as laptop computers, tablets, smartphones, and smart watches, interfaces for operating smart devices have evolved in various directions. As smart devices have gradually become smaller, touch screens have been adopted to provide a rich user interface on a small screen, and the touch interface is now widely used as an interface for operating various personal smart devices.
The touch interface is intuitive and has the advantage of providing immediate feedback on commands. However, in situations that require complex interaction, such as when both hands are not free, when complex commands must be performed, when a command requires several steps of interaction, or when long text must be entered, entering commands one by one through the touch interface is inconvenient.
Compared with the touch interface, the voice interface is a natural and intuitive interface whose use is expanding, centered on services that require fast interaction.
The most important factor in a voice interface is the accuracy of the speech recognition technology, and various techniques are being developed to improve it. For example, speech recognition technology based on a recurrent deep neural network (RDNN) builds a speech recognition engine through learning. Because the amount of training data and the amount of repeated training greatly affect speech recognition performance, the quality of the voice secretary service provided by each company differs. For example, a voice interface widely used in English-speaking countries will have good English recognition quality, and a voice interface widely used in Korea will have good Korean recognition quality.
An example of a service based on a voice interface, that is, a voice secretary service, is Apple's 'Siri'. Siri is a voice secretary service that works on Apple devices running iOS and macOS. A brief description of how Siri works is given below.
① Siri's basic wake command is "Hey Siri" or "시리야". When the user calls the voice secretary by saying "Hey Siri" or "시리야" to the user terminal and then speaks a command, the user terminal records the user's voice and transmits it to the voice secretary server. ② The voice secretary server recognizes the user's voice and converts it into text. ③ The voice secretary server analyzes the converted text using artificial intelligence and the like. ④ According to the analyzed content, the voice secretary server gives a specific answer to the user terminal or causes the user terminal to run a specific app. It may also perform an operation of controlling a user terminal other than the one that received the voice input.
The user must subscribe to and register with each voice secretary service separately before using it. Moreover, if a different voice secretary service is provided for each terminal carried by the user, the user cannot be provided with a continuous voice secretary service across various situations and environments. It is therefore necessary to interwork a plurality of voice secretary services.
In this regard, Korean Patent Laid-Open Publication No. 2016-0071111 (Jun. 21, 2016), 'Providing Personal Secretary Service in Electronic Device', discloses a voice secretary service that provides a response to a third party other than the owner of the terminal. For example, when the owner of the terminal is in a meeting and a third party asks "When does it end?", the terminal's voice secretary service automatically generates the voice response "It ends in one hour" and transmits it to the third party.
However, this document only discloses a configuration in which a single voice secretary service provides services to a plurality of terminals; it does not disclose a configuration for interworking a plurality of voice secretary services.
According to an embodiment of the present invention, a plurality of voice secretary services can be interworked to ensure the continuity of the voice secretary service in various situations and environments.
According to an embodiment of the present invention, there is provided a method of providing an integrated voice secretary service, comprising: receiving a recognition result of a voice command from a first voice secretary server; analyzing the voice recognition result to identify a target terminal to which the voice recognition result is to be transmitted; searching for voice secretary servers interlocked with the target terminal; selecting one of the retrieved voice secretary servers; and transmitting the voice recognition result to the selected voice secretary server.
According to another embodiment of the present invention, there is provided a method of providing an integrated voice secretary service, comprising: receiving a recorded voice command from a first terminal; a first selection step of selecting a voice secretary server to provide a service to the first terminal; transmitting the recorded voice command to the first voice secretary server selected in the first selection step; receiving a recognition result of the recorded voice command from the first voice secretary server; analyzing the voice recognition result to identify a second terminal to which the voice recognition result is to be transmitted; a second selection step of selecting a voice secretary server to provide a service to the second terminal; transmitting the voice recognition result to the second voice secretary server selected in the second selection step; receiving a service packet according to the voice recognition result from the second voice secretary server; and transmitting the service packet to the second terminal.
According to yet another embodiment of the present invention, there is provided an integrated voice secretary server comprising: a first receiving unit for receiving a recorded voice command from a first terminal; a first selecting unit for selecting a voice secretary server to provide a service to the first terminal; a first transmitting unit for transmitting the recorded voice command to the first voice secretary server selected by the first selecting unit; a second receiving unit for receiving a recognition result of the recorded voice command from the first voice secretary server; a determination unit for analyzing the voice recognition result and identifying a second terminal to which the voice recognition result is to be transmitted; a second selecting unit for selecting a voice secretary server to provide a service to the second terminal; a second transmitting unit for transmitting the voice recognition result to the second voice secretary server selected by the second selecting unit; a third receiving unit for receiving a service packet according to the voice recognition result from the second voice secretary server; and a third transmitting unit for transmitting the service packet to the second terminal.
According to an embodiment of the present invention, a plurality of voice secretary services can be interworked to ensure the continuity of the voice secretary service in various situations and environments.
According to another embodiment of the present invention, all kinds of voice secretary services can be provided through a single integrated voice secretary service subscription.
FIG. 1 illustrates an integrated voice secretary service providing system according to a first embodiment of the present invention.
FIG. 2 illustrates a specific operation of the integrated voice secretary service providing system according to the first embodiment of the present invention.
FIG. 3 illustrates the integrated voice secretary service providing system according to the first embodiment of the present invention with a translation process added.
FIG. 4 is a flowchart illustrating an integrated voice secretary service providing method according to the first embodiment of the present invention.
FIG. 5 illustrates an integrated voice secretary service providing system according to a second embodiment of the present invention.
FIG. 6 is a flowchart illustrating an integrated voice secretary service providing method according to the second embodiment of the present invention.
FIG. 7 is a block diagram illustrating a server providing an integrated voice secretary service according to the second embodiment of the present invention.
Hereinafter, an embodiment of the present invention will be described in detail with reference to the accompanying drawings. In assigning reference numerals to the components in each drawing, the same components are given the same numerals as far as possible even when they appear in different drawings. In describing an embodiment of the present invention, detailed descriptions of well-known functions or configurations are omitted when they would unnecessarily obscure the subject matter of the present invention.
In describing the components of an embodiment of the present invention, reference signs such as first, second, i), ii), a), and b) may be used. Such signs are intended only to distinguish one component from another; the nature, sequence, or order of the components is not limited by the signs. When a part is described as 'comprising' or 'including' a component, this does not exclude other components but means that other components may be further included, unless explicitly stated otherwise. Terms such as 'unit' and 'module' refer to a unit that processes at least one function or operation and may be implemented as hardware, software, or a combination of hardware and software.
A variety of voice secretary services are currently available, including Apple's Siri, Google's Google Assistant, Amazon's Alexa, Microsoft's Cortana, Samsung's Bixby, SKT's NUGU, KT's GiGA Genie, Naver's Clova, and Kakao's Kakao I.
However, because no interworking function is provided between these voice secretary services, users experience considerable inconvenience. A user must subscribe to and register with each voice secretary service separately in order to use it, which makes account management and use cumbersome. In addition, when the user's terminals support different voice secretary services, a continuous and consistent voice secretary service cannot be provided in various situations and environments.
For example, consider a case where the smartphone is a Samsung product (voice secretary service: Samsung Bixby), the speaker is an Amazon product (voice secretary service: Amazon Alexa), the personal tablet is an Apple product (voice secretary service: Apple Siri), the office tablet is a Google product (voice secretary service: Google Assistant), the TV set-top box is a KT product (voice secretary service: KT GiGA Genie), and the office Internet phone is an SKT product (voice secretary service: SKT NUGU). Because the voice secretary service linked to each terminal differs, it is difficult to share information between the voice secretary services across the terminals. For example, when the user is using voice secretary service A, it is difficult to provide an associated service to a terminal interlocked with voice secretary service B.
1. First embodiment
FIG. 1 illustrates an integrated voice secretary service providing system according to a first embodiment of the present invention.
As shown in FIG. 1, the integrated voice secretary service providing system according to the first embodiment of the present invention includes a plurality of terminals, a plurality of voice secretary servers, and an integrated voice secretary service providing server (hereinafter, 'integration server') according to an embodiment of the present invention.
In the integrated voice secretary service providing system according to the first embodiment of the present invention, the integration server is located behind each voice secretary server, so the service provided by the integration server is not exposed to the users of the terminals.
The first terminal is provided with a voice secretary service through the first voice secretary server. The second terminal is provided with a voice secretary service through the second voice secretary server. The integration server relays between the first voice secretary server and the second voice secretary server.
According to the integrated voice secretary service of the first embodiment of the present invention, the voice secretary services linked to each terminal are connected so that the user can receive the voice secretary service regardless of the user's spatial or temporal location or the device currently in the user's possession. For example, a service provided by the voice secretary service (KT GiGA Genie) of the KT TV set-top box at home can be provided through the voice secretary service (Samsung Bixby) of the Samsung smartphone that the user is currently carrying outside the home.
도 2는 본 발명의 제1 실시예에 따른 통합 음성비서 서비스 제공 시스템의 구체적인 동작 과정을 예시한 도면이다.2 is a diagram illustrating a specific operation of the integrated voice secretary service providing system according to the first embodiment of the present invention.
When the user of the first terminal speaks to the first terminal, the first terminal records the user's voice (hereinafter, 'voice command') and transmits it to the first voice secretary server.
The first voice secretary server recognizes the voice command and converts it into text. The first voice secretary server may additionally provide a function of analyzing the semantic content of the voice command converted into text. Analysis techniques such as machine learning, big data, and artificial intelligence may be used in the process of recognizing the voice command and converting it into text, or in the process of analyzing the semantic content of the converted text.
The first voice secretary server transmits the voice recognition result to the integration server. The voice recognition result transmitted from the first voice secretary server to the integration server may be the result of converting the voice command into text (hereinafter, 'voice conversion result'), the result of analyzing the semantic content of the converted text (hereinafter, 'semantic analysis result'), or both the voice conversion result and the semantic analysis result.
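As an illustration only, the recognition result exchanged between the servers can be pictured as a small record that may carry either or both components. The field names below are assumptions made for this sketch, not part of the disclosure.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RecognitionResult:
    """A voice recognition result relayed via the integration server.

    A voice secretary server may populate only the transcript (the 'voice
    conversion result'), only the parsed meaning (the 'semantic analysis
    result'), or both.
    """
    source_terminal: str              # terminal that recorded the voice command
    language: str                     # e.g. "ko" or "en"
    transcript: Optional[str] = None  # voice conversion result (speech-to-text)
    semantics: Optional[dict] = None  # semantic analysis result (intent, slots, target device)
```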
The integration server uses the voice recognition result to identify the second terminal to which the voice secretary service is to be provided. The integration server may identify the second terminal by analyzing the semantic content of the voice conversion result, by using the semantic analysis result, or by using both its own analysis of the voice conversion result and the semantic analysis result received from the first voice secretary server.
The integration server searches its own database for the second terminal and determines which voice secretary services are linked to the second terminal. If a plurality of voice secretary services are linked to the second terminal, the integration server selects one voice secretary server according to the status of the voice secretary server providing each service and the priority set by the user of the second terminal. For example, if voice secretary servers A and C are currently in good condition and the priority set by the user of the second terminal is B, C, A, the C voice secretary server can be selected.
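That selection rule can be sketched as follows. The function name and the status values are assumptions made for illustration; the logic simply follows the example above, where servers A and C are healthy, the user's priority is B, C, A, and C is therefore chosen.

```python
def select_assistant_server(linked_services, server_status, user_priority):
    """Pick one voice secretary server for a terminal.

    linked_services: services linked to (or set for) the terminal, e.g. {"A", "B", "C"}
    server_status:   current condition per service, e.g. {"A": "ok", "B": "overloaded", "C": "ok"}
    user_priority:   the terminal user's preference order, e.g. ["B", "C", "A"]
    """
    for service in user_priority:
        if service in linked_services and server_status.get(service) == "ok":
            return service
    raise LookupError("no linked voice secretary server is currently available")

# Servers A and C are healthy, the user's priority is B, C, A -> C is selected.
assert select_assistant_server(
    {"A", "B", "C"},
    {"A": "ok", "B": "overloaded", "C": "ok"},
    ["B", "C", "A"],
) == "C"
```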
The integration server transmits the voice recognition result to the selected second voice secretary server. The voice recognition result transmitted by the integration server to the second voice secretary server may be the voice conversion result, the semantic analysis result, or both.
The second voice secretary server provides the voice secretary service to the second terminal using the received voice recognition result. The second voice secretary server may provide the service by analyzing the semantic content of the voice conversion result, by using the semantic analysis result, or by using both its own analysis of the voice conversion result and the semantic analysis result received from the integration server.
FIG. 3 shows the integrated voice secretary service providing system according to the first embodiment of the present invention with a translation process added.
As shown in FIG. 3, the integrated voice secretary service providing system according to the first embodiment of the present invention may include a translation process.
Suppose the language used by the first terminal is Korean and the language used by the second terminal is English. The user of the first terminal can then use a voice secretary service with strong Korean recognition capability, while the user of the second terminal uses a voice secretary service with strong English recognition capability.
If the language used by the first terminal differs from the language used by the second terminal, the integration server may additionally provide a function of translating the voice recognition result into the other language in between, so that the two voice secretary services interwork optimally.
When the user of the first terminal speaks to the first terminal, the first terminal records the voice command and transmits it to the first voice secretary server.
The first voice secretary server recognizes the voice command and converts it into text. Furthermore, the first voice secretary server may additionally provide a function of analyzing the semantic content of the voice command converted into text.
The first voice secretary server transmits the voice recognition result to the integration server. The voice recognition result transmitted by the first voice secretary server to the integration server may be the voice conversion result, the semantic analysis result, or both.
The integration server uses the voice recognition result to identify the second terminal to which the voice secretary service is to be provided. The integration server may identify the second terminal by analyzing the semantic content of the voice conversion result, by using the semantic analysis result, or by using both its own analysis of the voice conversion result and the semantic analysis result received from the first voice secretary server.
The integration server searches its own database for the second terminal and determines which voice secretary services are linked to the second terminal. If a plurality of voice secretary services are linked to the second terminal, the integration server selects one voice secretary server according to the status of the voice secretary server providing each service and the priority set by the user of the second terminal. For example, if voice secretary servers A and C are currently in good condition and the priority set by the user of the second terminal is B, C, A, the C voice secretary server can be selected.
Before transmitting the voice recognition result to the selected second voice secretary server, the integration server determines whether the language used by the first terminal and the language used by the second terminal are the same. If the two languages differ, the integration server translates the voice recognition result into the language used by the second terminal and then transmits the translated voice recognition result to the second voice secretary server. The voice recognition result transmitted by the integration server to the second voice secretary server may be the translated voice conversion result, the translated semantic analysis result, or both.
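A minimal sketch of that decision is shown below. It assumes the recognition result travels as a plain dictionary and that a `translate(text, src, dst)` helper is available; both are hypothetical stand-ins, not part of the disclosure.

```python
def forward_recognition_result(result, src_lang, dst_lang, translate):
    """Return the payload to send to the second voice secretary server,
    translating it first if the two terminals use different languages.

    `translate(text, src, dst)` is a placeholder for whatever machine
    translation backend the integration server uses.
    """
    if src_lang == dst_lang:
        return result
    translated = dict(result)
    if result.get("transcript") is not None:
        translated["transcript"] = translate(result["transcript"], src_lang, dst_lang)
    if result.get("semantics") is not None:
        # Translate only human-language slot values; intent labels stay as-is.
        translated["semantics"] = {
            key: translate(value, src_lang, dst_lang) if isinstance(value, str) else value
            for key, value in result["semantics"].items()
        }
    translated["language"] = dst_lang
    return translated
```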
The second voice secretary server provides the voice secretary service to the second terminal using the received voice recognition result. It may do so by analyzing the semantic content of the translated voice conversion result, by using the translated semantic analysis result, or by using both its own analysis of the translated voice conversion result and the translated semantic analysis result received from the integration server.
2. Flowchart of the First Embodiment
FIG. 4 is a flowchart illustrating an integrated voice secretary service providing method according to the first embodiment of the present invention.
The integrated voice secretary service providing method of FIG. 4 may be performed by the integration server, a gateway device, or another network device.
As shown in FIG. 4, the integrated voice secretary service providing method according to the first embodiment of the present invention includes: receiving a voice recognition result from a first voice secretary server linked to a first terminal (S410); identifying, using the voice recognition result, a second terminal to which the voice secretary service is to be provided (S420); searching for voice secretary servers linked to the second terminal (S430); selecting one of the voice secretary servers linked to the second terminal (S440); and transmitting the voice recognition result to the selected voice secretary server (S450).
In the step of receiving the voice recognition result from the first voice secretary server linked to the first terminal (S410), the received voice recognition result may be the voice conversion result, the semantic analysis result, or both.
The step of identifying, using the voice recognition result, the second terminal to which the voice secretary service is to be provided (S420) may include analyzing the semantic content of the voice conversion result.
The step of identifying the second terminal using the voice recognition result (S420) may include identifying the second terminal by analyzing the semantic content of the voice conversion result, by using the semantic analysis result, or by using both the integration server's own analysis of the voice conversion result and the semantic analysis result received from the first voice secretary server.
The step of searching for voice secretary servers linked to the second terminal (S430) may include searching for the second terminal in the integration server's own database and determining which voice secretary services are linked to the second terminal.
The step of selecting one of the voice secretary servers linked to the second terminal (S440) may include, when a plurality of voice secretary services are linked to the second terminal, selecting one voice secretary service using the status of the voice secretary server providing each service, the priority set by the user of the second terminal, and the like. For example, if voice secretary servers A and C are currently in good condition and the priority set by the user of the second terminal is B, C, A, the C voice secretary server can be selected.
In the step of transmitting the voice recognition result to the second voice secretary server (S450), the transmitted voice recognition result may be the voice conversion result, the semantic analysis result, or both. The semantic analysis result may be the one received in step S410 or the one produced in step S420.
Before step S410, the method may further include registering user terminal information, information on the voice secretary services linked to each user terminal, and information on the priorities of the linked voice secretary services.
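Such a registration step might, for example, store the linked services, their priorities, and the language per terminal. The shape below is only an assumed illustration of what the integration server's database could hold.

```python
# Hypothetical registration record in the integration server's database;
# every field name and value is an assumption made for illustration only.
registration = {
    "user_id": "user-001",
    "terminals": {
        "living-room-settop": {
            "linked_services": ["KT GiGA Genie"],
            "priority": ["KT GiGA Genie"],
            "language": "ko",
        },
        "office-tablet": {
            "linked_services": ["Google Assistant", "Apple Siri"],
            "priority": ["Google Assistant", "Apple Siri"],
            "language": "en",
        },
    },
}
```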
Between steps S420 and S450, if the language used by the first terminal differs from the language used by the second terminal, the method may further include translating the voice recognition result into the language used by the second terminal. In this case, transmitting the voice recognition result in step S450 may mean transmitting the translated voice recognition result.
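Putting steps S410 to S450 together, a minimal handler could look like the sketch below. It reuses the `select_assistant_server` and `forward_recognition_result` sketches above; `db`, `server_status`, `translate`, and `send` are injected, hypothetical collaborators rather than elements of the disclosure.

```python
def handle_recognition_result(result, db, server_status, translate, send):
    """Sketch of steps S410 to S450 of the first embodiment.

    `result` is the voice recognition result received in S410 (a dict with
    "language", "transcript", and "semantics" keys), `db` is the assumed
    registration database, and `send(service, payload)` delivers the result
    to the chosen voice secretary server.
    """
    # S420: identify the target terminal from the semantic analysis result.
    target = result["semantics"]["target_terminal"]

    # S430: look up the voice secretary services linked to that terminal.
    terminal = db["terminals"][target]

    # S440: pick one service by server status and the user's priority.
    chosen = select_assistant_server(
        set(terminal["linked_services"]), server_status, terminal["priority"])

    # Optional step: translate if the two terminals use different languages.
    if terminal["language"] != result["language"]:
        result = forward_recognition_result(
            result, result["language"], terminal["language"], translate)

    # S450: forward the (possibly translated) result to the chosen server.
    send(chosen, result)
```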
Although FIG. 4 describes steps S410 to S450 as being executed sequentially, this is merely an illustrative description of the technical idea of the present invention, and the execution of S410 to S450 is not limited to a time-series order. A person of ordinary skill in the art to which the present invention pertains may variously modify and alter the method of FIG. 4 without departing from the essential characteristics of the present invention, for example by changing the order of S410 to S450, omitting one or more of the steps, or executing one or more of the steps in parallel.
3. Second Embodiment
FIG. 5 is a diagram illustrating an integrated voice secretary service providing system according to a second embodiment of the present invention.
As shown in FIG. 5, the integrated voice secretary service providing system according to the second embodiment of the present invention includes a plurality of terminals, a plurality of voice secretary servers, and an integration server.
In the integrated voice secretary service providing system according to the second embodiment, the integration server sits in front of each voice secretary server, so the service provided by the integration server is exposed to the users of the terminals.
The first terminal receives the voice secretary service provided by the first voice secretary server via the integration server. The second terminal receives the voice secretary service provided by the second voice secretary server via the integration server. By registering only with the integration server, the user can receive the voice secretary services provided by each voice secretary server without registering with each server separately.
Specifically, when the user of the first terminal speaks to the first terminal, the first terminal records the user's voice command and transmits it to the integration server.
The integration server transmits the voice command to the first voice secretary server, which provides the voice secretary service set by the user of the first terminal. If the user of the first terminal has set a plurality of voice secretary services, the integration server selects one voice secretary service using the status of the voice secretary server providing each service, the priority set by the user of the first terminal, and the like. For example, if voice secretary servers A and C are currently in good condition and the priority set by the user of the first terminal is B, C, A, the C voice secretary server can be selected.
The selected first voice secretary server recognizes the voice command and converts it into text. The first voice secretary server may additionally provide a function of analyzing the semantic content of the converted text. Analysis techniques such as machine learning, big data, and artificial intelligence may be used in the process of recognizing the voice command and converting it into text, or in the process of analyzing the semantic content of the converted text.
The first voice secretary server transmits the voice recognition result to the integration server. The voice recognition result transmitted by the first voice secretary server to the integration server may be the voice conversion result, the semantic analysis result, or both.
The integration server uses the voice recognition result to identify the second terminal to which the voice secretary service is to be provided. The integration server may identify the second terminal by analyzing the semantic content of the voice conversion result, by using the semantic analysis result, or by using both its own analysis of the voice conversion result and the semantic analysis result received from the first voice secretary server.
The integration server searches its own database for the second terminal and determines the voice secretary services set by the user of the second terminal. If the user of the second terminal has set a plurality of voice secretary services, the integration server selects one voice secretary service using the status of the voice secretary server providing each service, the priority set by the user of the second terminal, and the like. For example, if voice secretary servers A and C are currently in good condition and the priority set by the user of the second terminal is B, C, A, the C voice secretary server can be selected.
The integration server transmits the voice recognition result to the second voice secretary server, which provides the selected voice secretary service. The voice recognition result transmitted by the integration server to the second voice secretary server may be the voice conversion result, the semantic analysis result, or both.
The second voice secretary server generates a service packet for the second terminal using the received voice recognition result. It may do so by analyzing the semantic content of the voice conversion result, by using the semantic analysis result, or by using both its own analysis of the voice conversion result and the semantic analysis result received from the integration server.
The second voice secretary server transmits the service packet for the second terminal to the integration server.
The integration server transmits the service packet received from the second voice secretary server to the second terminal.
According to the integrated voice secretary service of the second embodiment of the present invention, an integrated voice secretary agent interworking with various voice secretary services can be configured. Since the user can use all of the various voice secretary services by registering only with the integrated voice secretary agent, user convenience is greatly increased.
4. Flowchart of the Second Embodiment
FIG. 6 is a flowchart illustrating an integrated voice secretary service providing method according to the second embodiment of the present invention.
The integrated voice secretary service providing method of FIG. 6 may be performed by the integration server, a gateway device, or another network device.
As shown in FIG. 6, the integrated voice secretary service providing method according to the second embodiment of the present invention includes: receiving a recorded voice command from a first terminal (S610); a first selection step of selecting a voice secretary server to provide a service to the first terminal (S620); transmitting the recorded voice command to the first voice secretary server selected in the first selection step (S630); receiving a voice recognition result from the first voice secretary server (S640); analyzing the voice recognition result to identify a second terminal to which the voice recognition result is to be transmitted (S650); a second selection step of selecting a voice secretary server to provide a service to the second terminal (S660); transmitting the voice recognition result to the second voice secretary server selected in the second selection step (S670); receiving a service packet according to the voice recognition result from the second voice secretary server (S680); and transmitting the service packet to the second terminal (S690).
In the step of receiving the recorded voice command from the first terminal (S610), the recorded voice command may be in any of various file formats, for example mp3, wav, wma, 3gp, aiff, aac, alac, amr, au, awb, dvf, flac, mmf, mpc, msv, ogg, opus, ra, rm, tta, or vox.
In the first selection step of selecting the voice secretary server to provide the service to the first terminal (S620), the selection may be determined according to the status of each voice secretary server and the user's priorities. For example, if voice secretary servers A and C are currently in good condition and the priority set by the user of the first terminal is B, C, A, the C voice secretary server can be selected.
The step of transmitting the recorded voice command to the first voice secretary server selected in the first selection step (S630) may include converting the voice command into the file format used by the first voice secretary server before transmitting it.
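One way to perform such a conversion is sketched below. It assumes the pydub library (backed by ffmpeg) is available and that the chosen format is actually supported by the selected server; neither assumption comes from the disclosure itself.

```python
from pydub import AudioSegment  # third-party wrapper around ffmpeg; an assumption of this sketch

def convert_voice_command(in_path, out_path, target_format="wav"):
    """Re-encode a recorded voice command into the file format expected by
    the selected voice secretary server (step S630)."""
    audio = AudioSegment.from_file(in_path)      # input format inferred by ffmpeg
    audio.export(out_path, format=target_format)
    return out_path

# Example: convert an amr recording from the first terminal to wav before
# forwarding it to the first voice secretary server.
# convert_voice_command("command.amr", "command.wav", "wav")
```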
In the step of receiving the voice recognition result from the first voice secretary server (S640), the received voice recognition result may be the voice conversion result, the semantic analysis result, or both.
In the step of analyzing the voice recognition result to identify the second terminal to which the voice recognition result is to be transmitted (S650), the second terminal may be identified by analyzing the semantic content of the voice conversion result, by using the semantic analysis result, or by using both the integration server's own analysis of the voice conversion result and the semantic analysis result received from the first voice secretary server.
In the second selection step of selecting the voice secretary server to provide the service to the second terminal (S660), the selection may be determined according to the status of each voice secretary server and the user's priorities. For example, if voice secretary servers A and C are currently in good condition and the priority set by the user of the second terminal is B, C, A, the C voice secretary server can be selected.
In the step of transmitting the voice recognition result to the second voice secretary server selected in the second selection step (S670), the transmitted voice recognition result may be the voice conversion result, the semantic analysis result, or both.
In the step of receiving the service packet according to the voice recognition result from the second voice secretary server (S680), the received service packet may be a service packet, generated by the second voice secretary server using the voice recognition result, for the service to be provided to the second terminal.
In the step of transmitting the service packet to the second terminal (S690), a command to launch the app that provides the second voice secretary service on the second terminal may be transmitted together with the service packet.
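As an illustration of step S690 only, the delivery to the second terminal might bundle the service packet with an app-launch command, for example as follows; the packet contents and the package name are invented for this sketch.

```python
# Hypothetical example of the message pushed to the second terminal in S690.
service_packet = {  # as received from the second voice secretary server in S680
    "action": "speak",
    "text": "The meeting at 3 pm has been added to your calendar.",
}

delivery = {
    "target_terminal": "office-tablet",
    "launch_app": "com.example.assistant.b",  # app fronting the second voice secretary service
    "service_packet": service_packet,
}
```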
Before step S610, the method may further include registering user terminal information, information on the voice secretary services that each user terminal wishes to receive, and information on the priorities of those voice secretary services.
Between steps S650 and S670, if the language used by the first terminal differs from the language used by the second terminal, the method may further include translating the voice recognition result into the language used by the second terminal. In this case, transmitting the voice recognition result in step S670 may mean transmitting the translated voice recognition result.
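Putting steps S610 to S690 together, the end-to-end flow of the second embodiment can be sketched as below, again reusing the earlier helper sketches. The injected callables (`recognize`, `translate`, `fetch_packet`, `deliver`) are hypothetical stand-ins for the interfaces toward the voice secretary servers and the terminals.

```python
def handle_voice_command(audio_path, first_terminal, db, server_status,
                         recognize, translate, fetch_packet, deliver):
    """Sketch of steps S610 to S690 of the second embodiment.

    `recognize(service, audio_path)` asks a voice secretary server for a
    recognition result, `fetch_packet(service, result)` returns that
    server's service packet, and `deliver(terminal, packet)` pushes the
    packet (plus an app-launch command) to a terminal.
    """
    src = db["terminals"][first_terminal]

    # S620 / S630: choose a server for the first terminal and send the audio.
    first_service = select_assistant_server(
        set(src["linked_services"]), server_status, src["priority"])
    result = recognize(first_service, audio_path)                  # S640

    # S650: identify the second terminal from the recognition result.
    target = result["semantics"]["target_terminal"]
    dst = db["terminals"][target]

    # S660: choose a server for the second terminal.
    second_service = select_assistant_server(
        set(dst["linked_services"]), server_status, dst["priority"])

    # Optional translation between S650 and S670.
    if dst["language"] != src["language"]:
        result = forward_recognition_result(
            result, src["language"], dst["language"], translate)

    # S670 / S680 / S690: forward the result, collect the packet, deliver it.
    packet = fetch_packet(second_service, result)
    deliver(target, packet)
```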
Although FIG. 6 describes steps S610 to S690 as being executed sequentially, this is merely an illustrative description of the technical idea of the present invention, and the execution of S610 to S690 is not limited to a time-series order. A person of ordinary skill in the art to which the present invention pertains may variously modify and alter the method of FIG. 6 without departing from the essential characteristics of the present invention, for example by changing the order of S610 to S690, omitting one or more of the steps, or executing one or more of the steps in parallel.
5. Device Diagram of the Second Embodiment
FIG. 7 is a block diagram illustrating a server that provides the integrated voice secretary service according to the second embodiment of the present invention.
As shown in FIG. 7, the integration server 720 according to the second embodiment of the present invention includes a first receiving unit 721, a second receiving unit 722, a third receiving unit 723, a first selection unit 724, a second selection unit 725, a determination unit 726, a first transmission unit 727, a second transmission unit 728, and a third transmission unit 729.
The first receiving unit 721 receives a recorded voice command from the first terminal 711. The recorded voice command may be in a file format such as mp3, wav, wma, 3gp, aiff, aac, alac, amr, au, awb, dvf, flac, mmf, mpc, msv, ogg, opus, ra, rm, tta, or vox.
The first selection unit 724 selects a voice secretary server to provide a service to the first terminal 711. The selection may be determined according to the status of each voice secretary server and the user's priorities. For example, if voice secretary servers A and C are currently in good condition and the priority set by the user of the first terminal 711 is B, C, A, the C voice secretary server can be selected.
The first transmission unit 727 transmits the recorded voice command to the first voice secretary server 731 selected by the first selection unit 724. The first transmission unit 727 may convert the recorded voice command into the file format used by the first voice secretary server 731 before transmitting it.
The second receiving unit 722 receives the recognition result of the recorded voice command from the first voice secretary server 731. The received voice recognition result may be the voice conversion result, the semantic analysis result, or both.
The determination unit 726 analyzes the voice recognition result and identifies the second terminal 712 to which the voice recognition result is to be transmitted. The determination unit 726 may identify the second terminal 712 by analyzing the semantic content of the voice conversion result, by using the semantic analysis result, or by using both its own analysis of the voice conversion result and the semantic analysis result received from the first voice secretary server 731.
The second selection unit 725 selects a voice secretary server to provide a service to the second terminal 712. The selection may be determined according to the status of each voice secretary server and the user's priorities. For example, if voice secretary servers A and C are currently in good condition and the priority set by the user of the second terminal 712 is B, C, A, the C voice secretary server can be selected.
The second transmission unit 728 transmits the voice recognition result to the second voice secretary server 732 selected by the second selection unit 725. The transmitted voice recognition result may be the voice conversion result, the semantic analysis result, or both.
The third receiving unit 723 receives a service packet according to the voice recognition result from the second voice secretary server 732. The service packet may be a service packet, generated by the second voice secretary server 732 using the voice recognition result, for the service to be provided to the second terminal 712.
The third transmission unit 729 transmits the service packet to the second terminal 712. The third transmission unit 729 may transmit, together with the service packet, a command to launch the app that provides the second voice secretary service on the second terminal 712.
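As a rough, non-authoritative sketch, the units of FIG. 7 can be mapped onto a single class whose methods correspond to the receiving, determination, selection, and transmission units; the `network` object that actually moves audio, results, and packets is an assumption of this sketch.

```python
class IntegrationServer:
    """Sketch of the integration server 720 of FIG. 7. Each method groups the
    functional units that fire on one kind of incoming message."""

    def __init__(self, db, server_status, network, translate):
        self.db = db                      # registration database (terminals, services, priorities)
        self.server_status = server_status
        self.network = network            # assumed transport toward servers and terminals
        self.translate = translate        # assumed translation backend

    def on_voice_command(self, first_terminal, audio_path):
        # First receiving unit 721, first selection unit 724, first transmission unit 727.
        src = self.db["terminals"][first_terminal]
        first_service = select_assistant_server(
            set(src["linked_services"]), self.server_status, src["priority"])
        self.network.send_audio(first_service, audio_path)

    def on_recognition_result(self, result):
        # Second receiving unit 722, determination unit 726,
        # second selection unit 725, second transmission unit 728.
        target = result["semantics"]["target_terminal"]
        dst = self.db["terminals"][target]
        second_service = select_assistant_server(
            set(dst["linked_services"]), self.server_status, dst["priority"])
        if dst["language"] != result["language"]:
            result = forward_recognition_result(
                result, result["language"], dst["language"], self.translate)
        self.network.send_result(second_service, result, reply_to=target)

    def on_service_packet(self, target_terminal, packet):
        # Third receiving unit 723, third transmission unit 729.
        self.network.deliver(target_terminal, packet, launch_app=True)
```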
Although the embodiments described above explain the integrated voice secretary service for two terminals and two voice secretary servers, this is merely an example; the integrated voice secretary service according to an embodiment of the present invention can also be provided for two or more terminals and two or more voice secretary servers.
Meanwhile, the methods described in the above embodiments can be implemented as code readable by a computer or a smartphone on a recording medium readable by a computer or a smartphone. Such a recording medium includes all kinds of recording devices in which data readable by a computer system is stored, for example magnetic storage media (e.g., ROM, floppy disks, hard disks), optical reading media (e.g., CD-ROM, DVD), and flash memory (e.g., USB, SSD). The code may also be distributed over network-connected computer systems and stored and executed in a distributed manner.
The present embodiments are merely illustrative of the technical idea of the present invention, and a person of ordinary skill in the art to which the present invention pertains may make various modifications and variations without departing from the essential characteristics of the present invention.
The present embodiments are intended to explain, not to limit, the technical idea of the present invention, and the scope of the present invention is therefore not limited by these embodiments. The scope of protection of the present invention should be construed according to the claims, and all technical ideas within a scope equivalent thereto should be construed as falling within the scope of the present invention.
(Explanation of Symbols)
711: first terminal 712: second terminal
720: integration server 721: first receiving unit
722: second receiving unit 723: third receiving unit
724: first selection unit 725: second selection unit
726: determination unit 727: first transmission unit
728: second transmission unit 729: third transmission unit
731: first voice secretary server 732: second voice secretary server
(CROSS-REFERENCE TO RELATED APPLICATION)
This patent application claims priority under 35 U.S.C. 119(a) to Korean Patent Application No. 10-2017-0157064, filed in Korea on November 23, 2017, the entire contents of which are incorporated herein by reference. This patent application also claims priority in countries other than the United States for the same reason, and the entire contents thereof are likewise incorporated herein by reference.

Claims (16)

  1. A method for providing an integrated voice secretary service, the method comprising:
    receiving a recognition result of a voice command (hereinafter, 'voice recognition result') from a first voice secretary server;
    analyzing the voice recognition result to identify a terminal to which the voice recognition result is to be transmitted (hereinafter, 'target terminal');
    searching for voice secretary servers linked to the target terminal;
    selecting one of the retrieved voice secretary servers; and
    transmitting the voice recognition result to the selected voice secretary server.
  2. The method of claim 1, further comprising,
    before the receiving,
    registering user terminal information, information on the voice secretary services linked to the user terminal, and information on the priorities of the linked voice secretary services.
  3. The method of claim 2,
    wherein the selecting selects a voice secretary server according to the priorities.
  4. The method of claim 1, further comprising,
    between the identifying of the target terminal and the transmitting,
    translating the voice recognition result into a language used by the target terminal.
  5. The method of claim 4,
    wherein the transmitting transmits the translated voice recognition result.
  6. A method for providing an integrated voice secretary service, the method comprising:
    receiving a recorded voice command from a first terminal;
    a first selection step of selecting a voice secretary server to provide a service to the first terminal;
    transmitting the recorded voice command to the voice secretary server selected in the first selection step (hereinafter, 'first voice secretary server');
    receiving a recognition result of the recorded voice command (hereinafter, 'voice recognition result') from the first voice secretary server;
    analyzing the voice recognition result to identify a terminal to which the voice recognition result is to be transmitted (hereinafter, 'second terminal');
    a second selection step of selecting a voice secretary server to provide a service to the second terminal;
    transmitting the voice recognition result to the voice secretary server selected in the second selection step (hereinafter, 'second voice secretary server');
    receiving a service packet according to the voice recognition result from the second voice secretary server; and
    transmitting the service packet to the second terminal.
  7. The method of claim 6, further comprising,
    before the receiving of the recorded voice command,
    registering user terminal information, information on the voice secretary services that the user terminal wishes to receive, and information on the priorities of the voice secretary services to be received.
  8. The method of claim 7,
    wherein the second selection step selects a voice secretary server according to the priorities.
  9. The method of claim 6, further comprising,
    between the identifying of the second terminal and the transmitting of the voice recognition result to the second voice secretary server,
    translating the voice recognition result into a language used by the second terminal.
  10. The method of claim 9,
    wherein the transmitting of the voice recognition result to the second voice secretary server transmits the translated voice recognition result.
  11. An integrated voice secretary server comprising:
    a first receiving unit configured to receive a recorded voice command from a first terminal;
    a first selection unit configured to select a voice secretary server to provide a service to the first terminal;
    a first transmission unit configured to transmit the recorded voice command to the voice secretary server selected by the first selection unit (hereinafter, 'first voice secretary server');
    a second receiving unit configured to receive a recognition result of the recorded voice command (hereinafter, 'voice recognition result') from the first voice secretary server;
    a determination unit configured to analyze the voice recognition result and identify a terminal to which the voice recognition result is to be transmitted (hereinafter, 'second terminal');
    a second selection unit configured to select a voice secretary server to provide a service to the second terminal;
    a second transmission unit configured to transmit the voice recognition result to the voice secretary server selected by the second selection unit (hereinafter, 'second voice secretary server');
    a third receiving unit configured to receive a service packet according to the voice recognition result from the second voice secretary server; and
    a third transmission unit configured to transmit the service packet to the second terminal.
  12. The integrated voice secretary server of claim 11,
    further comprising registering user terminal information, information on the voice secretary services that the user terminal wishes to receive, and information on the priorities of the voice secretary services to be received.
  13. The integrated voice secretary server of claim 12,
    wherein the second selection unit selects a voice secretary server according to the priorities.
  14. The integrated voice secretary server of claim 11, further comprising
    a translation unit configured to translate the voice recognition result into another language.
  15. The integrated voice secretary server of claim 14,
    wherein the second transmission unit transmits the voice recognition result translated by the translation unit.
  16. A computer-readable recording medium on which a program for executing the method of any one of claims 1 to 10 is recorded.
PCT/KR2017/013512 2017-11-23 2017-11-24 Method and device for providing integrated voice secretary service WO2019103200A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR20170157064 2017-11-23
KR10-2017-0157064 2017-11-23

Publications (1)

Publication Number Publication Date
WO2019103200A1 true WO2019103200A1 (en) 2019-05-31

Family

ID=66631529

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2017/013512 WO2019103200A1 (en) 2017-11-23 2017-11-24 Method and device for providing integrated voice secretary service

Country Status (1)

Country Link
WO (1) WO2019103200A1 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20010100386A (en) * 2000-05-01 2001-11-14 김헌재 Method to support business using multi-purpose call diverters, internet, and wireless data communication
KR100792208B1 (en) * 2005-12-05 2008-01-08 한국전자통신연구원 Method and Apparatus for generating a response sentence in dialogue system
KR20090000279A (en) * 2007-02-13 2009-01-07 홍성훈 Method for acquiring and providing knowledge using wired and wireless networks and the system therefor
KR20090002297A (en) * 2007-06-26 2009-01-09 신흥순 Cyber and mobile secretary system
KR20160142802A (en) * 2011-09-30 2016-12-13 애플 인크. Using context information to facilitate processing of commands in a virtual assistant

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021025542A1 (en) * 2019-08-08 2021-02-11 Samsung Electronics Co., Ltd. Method, system and device for sharing intelligence engine by multiple devices
US11490240B2 (en) 2019-08-08 2022-11-01 Samsung Electronics Co., Ltd. Method, system and device for sharing intelligence engine by multiple devices
CN112466300A (en) * 2019-09-09 2021-03-09 百度在线网络技术(北京)有限公司 Interaction method, electronic device, intelligent device and readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17933164

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17933164

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 20/01/2021)