WO2019103200A1 - Method and device for providing integrated voice secretary service - Google Patents

Method and device for providing integrated voice secretary service Download PDF

Info

Publication number
WO2019103200A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
terminal
server
secretary
recognition result
Prior art date
Application number
PCT/KR2017/013512
Other languages
French (fr)
Korean (ko)
Inventor
정종일
김용진
Original Assignee
주식회사 모다
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 주식회사 모다 filed Critical 주식회사 모다
Publication of WO2019103200A1 publication Critical patent/WO2019103200A1/en

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/487 Arrangements for providing information services, e.g. recorded voice services or time announcements
    • H04M3/493 Interactive information services, e.g. directory enquiries; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals

Definitions

  • the present invention relates to a method and apparatus for providing an integrated voice secretary service.
  • the touch interface is intuitive and has the advantage of providing immediate feedback on commands. However, in situations that require complex interaction, such as when both hands are not free, when complex commands must be performed, when a command requires several steps of interaction, or when long text must be entered, entering commands one by one through the touch interface is inconvenient.
  • the voice interface is a natural and intuitive interface that is being used for services that require fast interaction.
  • RDNN (Recurrent Deep Neural Network)
  • Siri is a voice secretary service that works on Apple iOS and macOS devices. A brief description of how Siri works is given below.
  • Siri's basic wake command is "Hey Siri" (or "시리야" in Korean).
  • the user terminal records the voice of the user and transmits it to the voice secretary server.
  • the voice secretary server recognizes the user's voice and converts it into text.
  • the voice secretary server analyzes the converted text using artificial intelligence.
  • the voice secretary server gives a specific answer to the user terminal according to the analyzed content or causes the user terminal to run a specific app. It may also perform an operation of controlling a user terminal other than the one that received the voice input.
  • the user must subscribe to and register with each voice secretary service separately before using it. Also, if a different voice secretary service is provided for each terminal carried by the user, the user cannot be provided with a continuous voice secretary service across various situations and environments. It is therefore necessary to interwork a plurality of voice secretary services.
  • Korean Patent Laid-Open Publication No. 2016-0071111 (Jun. 21, 2016), 'Providing Personal Secretary Service in Electronic Device', discloses a voice secretary service that provides a response to a third party other than the owner of the terminal. For example, when the owner of the terminal is in a meeting and a third party asks "When does it end?", the terminal's voice secretary service automatically generates the voice response "It ends in one hour" and transmits it to the third party.
  • this document discloses a configuration for providing services for a plurality of terminals in a single voice secretarial service, but does not disclose a configuration for interworking a plurality of voice secretarial services.
  • a plurality of voice secretarial services can be interlinked to ensure continuity of voice secretary services in various situations and environments.
  • a method of providing an integrated voice secretary service comprising: receiving a voice command recognition result from a first voice secretary server; Analyzing the speech recognition result to identify a target terminal to which the speech recognition result is to be transmitted; Searching for a voice secretary server interlocked with the target terminal; Selecting one of the retrieved voice secretary servers; And transmitting the voice recognition result to the selected voice secretary server.
  • a method of providing an integrated voice secretary service comprising: receiving a recorded voice command from a first terminal; A first selection step of selecting a voice secretary server to provide a service to the first terminal; Transmitting the recorded voice command to the first voice secretary server selected in the first selecting step; Receiving a recognition result of the recorded voice command from the first voice secretary server; Analyzing the speech recognition result to identify a second terminal to transmit the speech recognition result; A second selection step of selecting a voice secretary server to provide a service to the second terminal; Transmitting the voice recognition result to a second voice secretary server selected in the second selection process; Receiving a service packet according to the speech recognition result from the second voice secretary server; And transmitting the service packet to the second terminal.
  • an integrated voice secretary server comprising: a first receiving unit for receiving a voice command recorded from a first terminal; A first selector for selecting a voice secretary server to provide a service to the first terminal; A first transmission unit for transmitting the recorded voice command to the first voice secretary server selected by the first selection unit; A second receiving unit for receiving the recognition result of the recorded voice command from the first voice secretary server; A determination unit for analyzing the speech recognition result and determining a second terminal to transmit the speech recognition result; A second selector for selecting a voice secretary server to provide a service to the second terminal; A second transmitting unit for transmitting the voice recognition result to the second voice secretary server selected by the second selecting unit; A third receiving unit for receiving a service packet according to the speech recognition result from the second voice secretary server; And a third transmission unit for transmitting the service packet to the second terminal.
  • a plurality of voice secretarial services can be interlinked to ensure continuity of voice secretary services in various situations and environments.
  • all kinds of voice secretary services can be provided through one integrated voice secretary service subscription.
  • FIG. 1 illustrates an integrated voice secretary service providing system according to a first embodiment of the present invention.
  • FIG. 2 is a diagram illustrating a specific operation of the integrated voice secretary service providing system according to the first embodiment of the present invention.
  • FIG. 3 is a diagram showing a translation process added to the integrated voice secretary service providing system according to the first embodiment of the present invention.
  • FIG. 4 is a flowchart illustrating an integrated voice secretary service providing method according to the first embodiment of the present invention.
  • FIG. 5 is a diagram illustrating an integrated voice secretary service providing system according to a second embodiment of the present invention.
  • FIG. 6 is a flowchart illustrating an integrated voice secretary service providing method according to a second embodiment of the present invention.
  • FIG. 7 is a block diagram illustrating a server providing an integrated voice secretary service according to a second embodiment of the present invention.
  • in describing the components of an embodiment of the present invention, reference signs such as first, second, i), ii), a), b), and the like may be used.
  • such signs are intended only to distinguish one component from another; the nature, sequence, or order of the components is not limited by them. When a part is described as 'comprising' or 'including' a component, this does not exclude other components but means that other components may be further included, unless explicitly stated otherwise.
  • terms such as 'unit' and 'module' refer to a unit that processes at least one function or operation and may be implemented as hardware, software, or a combination of hardware and software.
  • because no interworking function is provided between the voice secretary services, users experience considerable inconvenience.
  • the user must subscribe to and register with each voice secretary service separately in order to use it, which makes account management and use cumbersome.
  • when the user's terminals support different voice secretary services, a continuous and consistent voice secretary service cannot be provided in various situations and environments.
  • the smartphone is a Samsung product (voice secretary service: Samsung Bixby), the speaker is an Amazon product (voice secretary service: Amazon Alexa), the personal tablet is an Apple product (voice secretary service: Apple Siri),
  • the office tablet is a Google product (voice secretary service: Google Assistant), the TV set-top box is a KT product (voice secretary service: KT GiGA Genie), and the office Internet phone is an SKT product (voice secretary service: SKT NUGU). Because the voice secretary service linked to each terminal differs, it is difficult to share information between the voice secretary services across the terminals. For example, when the user is using voice secretary service A, it is difficult to provide an associated service to a terminal interlocked with voice secretary service B.
  • FIG. 1 illustrates an integrated voice secretary service providing system according to a first embodiment of the present invention.
  • the integrated voice secretary service providing system includes a plurality of terminals, a plurality of voice secretary servers, and an integrated voice secretary service providing server (hereinafter, 'integration server').
  • the integration server is located behind each voice secretary server, so the service provided by the integration server is not exposed to the users of the terminals.
  • the first terminal is provided with voice secretarial service through the first voice secretarial server.
  • the second terminal is provided with voice secretarial service through the second voice secretarial server.
  • the integration server relays the first voice secretary server and the second voice secretary server.
  • by connecting the voice secretary services linked to each terminal, the user can receive the voice secretary service regardless of the user's spatial or temporal location or the device currently in the user's possession.
  • for example, a service provided by the voice secretary service (KT GiGA Genie) of the KT TV set-top box at home can be provided through the voice secretary service (Samsung Bixby) of the Samsung smartphone that the user is currently carrying outside the home.
  • FIG. 2 is a diagram illustrating a specific operation of the integrated voice secretary service providing system according to the first embodiment of the present invention.
  • when the user of the first terminal speaks to the first terminal, the first terminal records the user's voice (hereinafter, 'voice command') and transmits it to the first voice secretary server.
  • the first voice secretary server recognizes the voice command and converts it into text.
  • the first voice secretary server may further include a function of analyzing the semantic content of the voice command converted into text. Analysis techniques such as machine learning, big data, and artificial intelligence may be used in recognizing the voice command and converting it into text, or in analyzing the semantic content of the converted text.
  • the first voice secretary server transmits the voice recognition result to the integration server.
  • the voice recognition result transmitted from the first voice secretary server to the integration server may be the result of converting the voice command into text (hereinafter, 'voice conversion result'), the result of analyzing the semantic content of the converted text (hereinafter, 'semantic analysis result'), or both the voice conversion result and the semantic analysis result.
  • the integration server identifies the second terminal to which the voice secretary service is to be provided, using the voice recognition result.
  • the integration server can identify the second terminal by analyzing the semantic contents of the voice conversion result.
  • the integration server can identify the second terminal using the semantic analysis result.
  • the integration server can identify the second terminal by using both the result of analyzing the meaning of the voice conversion result and the semantic analysis result received from the first voice secretarial server.
  • the integration server searches for the second terminal in its own database to determine which voice secretary service is linked to the second terminal.
  • when a plurality of voice secretary services are linked to the second terminal, the integration server selects one voice secretary server according to the status of the server providing each service and the priority set by the user of the second terminal. For example, if the A and C voice secretary servers are currently in a good (uncongested) state and the priority set by the user of the second terminal is B, C, and A, the C voice secretary server can be selected.
  • the integration server sends the speech recognition result to the selected second voice secretary server.
  • the voice recognition result transmitted by the integrated server to the second voice secretary server may be a voice conversion result, a semantic analysis result, or both a voice conversion result and a semantic analysis result.
  • the second voice secretarial server provides the voice secretary service to the second terminal using the voice recognition result received.
  • the second voice secretarial server may analyze the semantic content of the voice conversion result and provide the voice secretary service to the second terminal.
  • the second voice secretary server may provide the voice secretary service to the second terminal using the semantic analysis result.
  • the second voice secretary server can provide the voice secretary service to the second terminal by using both the result of analyzing the meaning of the voice conversion result and the semantic analysis result received from the integration server.
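  • purely as an illustration of the relay behaviour described above for FIG. 2 (receive the recognition result, identify the second terminal, look up its linked voice secretary servers, choose one by server status and user priority, and forward the result), the following is a minimal sketch in Python; the helper functions and the registry layout are assumptions for the sketch, not definitions from the present disclosure.

```python
# Minimal sketch of the FIG. 2 relay flow in the first embodiment. The two helpers
# below are placeholders for the semantic-analysis and network steps; the data
# structures are assumptions, not definitions from the patent.

def identify_target_terminal(recognition_result):
    # Placeholder for analyzing the voice conversion result / semantic analysis
    # result to find the terminal the command is aimed at.
    return recognition_result["target_terminal"]

def send_to_server(server_name, payload):
    # Placeholder for the actual network call to the selected voice secretary server.
    print(f"forwarding to {server_name}: {payload}")

def relay_recognition_result(recognition_result, registry, server_status):
    """Receive a recognition result, pick a server for the target terminal, forward it."""
    target = identify_target_terminal(recognition_result)    # identify the second terminal
    linked = registry[target]["servers"]                      # servers linked to that terminal
    priority = registry[target]["priority"]                   # user-set priority order
    # Honour the user's priority, skipping servers that are currently congested or down.
    candidates = [s for s in priority if s in linked and server_status.get(s, False)]
    if not candidates:
        raise RuntimeError(f"no available voice secretary server for {target}")
    chosen = candidates[0]
    send_to_server(chosen, recognition_result)                # forward the recognition result
    return chosen
```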
  • FIG. 3 is a diagram showing a translation process added to the integrated voice secretary service providing system according to the first embodiment of the present invention.
  • the integrated voice secretary service providing system may include a translation process.
  • for example, the user of the first terminal may use a voice secretary service with high Korean recognition capability, while the user of the second terminal uses a voice secretary service with high recognition capability for a different language.
  • the integration server may further include a function of translating the voice recognition result into the counterpart terminal's language so that the two voice secretary services can be optimally interworked.
  • when the user of the first terminal speaks to the first terminal, the first terminal records the voice command and transmits it to the first voice secretary server.
  • the first voice secretary server recognizes the voice command and converts it into text. Further, the first voice secretary server may further include a function of analyzing the semantic content of the voice command converted into text.
  • the first voice secretary server transmits the voice recognition result to the integration server.
  • the voice recognition result transmitted by the first voice secretarial server to the integration server may be a voice conversion result, a semantic analysis result, or both a voice conversion result and a semantic analysis result.
  • the integration server identifies the second terminal to which the voice secretary service is to be provided, using the voice recognition result.
  • the integration server can identify the second terminal by analyzing the semantic contents of the voice conversion result.
  • the integration server can identify the second terminal using the semantic analysis result.
  • the integration server can identify the second terminal by using both the result of analyzing the meaning of the voice conversion result and the semantic analysis result received from the first voice secretarial server.
  • the integration server searches for the second terminal in its own database to determine which voice secretary service is linked to the second terminal.
  • when a plurality of voice secretary services are linked to the second terminal, the integration server selects one voice secretary server according to the status of the server providing each service and the priority set by the user of the second terminal. For example, if the A and C voice secretary servers are currently in a good state and the priority set by the user of the second terminal is B, C, and A, the C voice secretary server can be selected.
  • before transmitting the voice recognition result to the selected second voice secretary server, the integration server determines whether the language used by the first terminal and the language used by the second terminal are the same. If the languages are different, the integration server translates the voice recognition result into the language used by the second terminal and transmits the translated voice recognition result to the second voice secretary server.
  • the voice recognition result transmitted by the integration server to the second voice secretary server may be a translated voice conversion result, a translated semantic analysis result, or both a translated voice conversion result and a translated semantic analysis result.
  • the second voice secretarial server provides the voice secretary service to the second terminal using the voice recognition result received.
  • the second voice secretarial server may analyze the semantic content of the translated voice conversion result and provide the voice secretary service to the second terminal.
  • the second voice secretary server may provide the voice secretary service to the second terminal using the translated semantic analysis result.
  • the second voice secretarial server can provide the voice secretarial service to the second terminal by using both the result of analyzing the meaning of the translated voice conversion result itself and the translated semantic analysis result received from the integration server.
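  • the language check and translation step added in FIG. 3 can be sketched as follows; translate_text is only a stand-in for whatever translation engine the integration server would actually use, and the field names are assumptions for the sketch.

```python
# Sketch of the translation step added in FIG. 3: translate the recognition result
# only when the two terminals use different languages. translate_text is a stub
# standing in for an arbitrary translation backend.

def translate_text(text, source_lang, target_lang):
    raise NotImplementedError("hook up a translation backend here")

def prepare_result_for_target(recognition_result, source_lang, target_lang):
    """Return the recognition result in the language used by the second terminal."""
    if source_lang == target_lang:
        return recognition_result                       # no translation needed
    translated = dict(recognition_result)
    if "text" in translated:                            # translated voice conversion result
        translated["text"] = translate_text(translated["text"], source_lang, target_lang)
    if "semantics" in translated:                       # translated semantic analysis result
        translated["semantics"] = translate_text(translated["semantics"], source_lang, target_lang)
    return translated
```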
  • FIG. 4 is a flowchart illustrating an integrated voice secretary service providing method according to the first embodiment of the present invention.
  • the method of providing the integrated voice secretary service of FIG. 4 may be performed by an integrated server, a gateway device, or other network device.
  • the integrated voice secretary service providing method includes receiving a voice recognition result from a first voice secretary server interlocked with a first terminal (S410), analyzing the voice recognition result to identify a second terminal to which the result is to be transmitted (S420), searching for the voice secretary servers linked with the second terminal (S430), selecting one of the voice secretary servers linked with the second terminal (S440), and transmitting the voice recognition result to the selected voice secretary server (S450).
  • the received voice recognition result may be a voice conversion result, a semantic analysis result, or both the voice conversion result and the semantic analysis result.
  • the process of identifying the second terminal to which the voice secretary service is to be provided using the voice recognition result may include analyzing the semantic content of the voice conversion result.
  • the process of identifying the second terminal may include analyzing the semantic content of the voice conversion result to identify the second terminal, identifying the second terminal using the semantic analysis result, or identifying the second terminal using both the server's own analysis of the voice conversion result and the semantic analysis result received from the first voice secretary server.
  • the step of searching for the voice secretary servers linked with the second terminal may include searching for the second terminal in its own database and determining which voice secretary services are linked to the second terminal.
  • when a plurality of voice secretary services are interlocked with the second terminal, the step S440 of selecting one of the voice secretary servers may include selecting one voice secretary service using the status of the server providing each service, the priority set by the user, and the like. For example, if the A and C voice secretary servers are currently in a good state and the priority set by the user of the second terminal is B, C, and A, the C voice secretary server can be selected.
  • the voice recognition result to be transmitted may be a voice conversion result, a semantic analysis result, or both a voice conversion result and a semantic analysis result.
  • the semantic analysis result may be received in step S410 or analyzed in step S420.
  • the method may further include, prior to step S410, registering user terminal information, information on the voice secretary services interlocked with each user terminal, and information on the priorities of the interlocked voice secretary services (see the registration sketch after the description of FIG. 4 below).
  • a step of translating the speech recognition result into the language used in the second terminal may be further included between steps S420 and S450.
  • in that case, the voice recognition result transmitted in step S450 may be the translated voice recognition result.
  • although FIG. 4 shows S410 to S450 being executed sequentially, this merely illustrates the technical idea of the present invention, and the execution of S410 to S450 is not limited to this time-series order.
  • those skilled in the art may change the order of S410 to S450, omit one or more of the steps, or execute one or more of the steps in parallel without departing from the essential characteristics of the present invention, so the method of FIG. 4 can be variously modified and altered.
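  • the registration information mentioned above for step S410 (user terminal information, the voice secretary services interlocked with each terminal, and their priorities) could, for illustration, be kept in a structure like the following; the device and service names are example values borrowed from the multi-vendor scenario described earlier, not a prescribed schema.

```python
# Illustrative layout of the per-user registration data the integration server keeps
# prior to S410: which voice secretary services are interlocked with each terminal
# and in what priority order the user prefers them. All names are example values.
USER_REGISTRY = {
    "user-001": {
        "smartphone":     {"servers": ["Samsung Bixby"],  "priority": ["Samsung Bixby"]},
        "speaker":        {"servers": ["Amazon Alexa"],   "priority": ["Amazon Alexa"]},
        "tv_settop_box":  {"servers": ["KT GiGA Genie"],  "priority": ["KT GiGA Genie"]},
        # A terminal may be interlocked with several services; the user-set priority
        # (here B, C, A) is consulted together with each server's current status.
        "office_tablet":  {"servers": ["A", "B", "C"],    "priority": ["B", "C", "A"]},
    }
}
```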
  • FIG. 5 is a diagram illustrating an integrated voice secretary service providing system according to a second embodiment of the present invention.
  • the integrated voice secretary service provision system includes a plurality of terminals, a plurality of voice secretarial servers, and an integration server.
  • the integration server is located in front of each voice secretary server, so the service provided by the integration server is exposed to the users of the terminals.
  • the first terminal is provided with the voice secretary service provided by the first voice secretarial server via the integration server.
  • the second terminal is provided with the voice secretary service provided by the second voice secretarial server via the integration server.
  • by registering only with the integration server, the user can receive the voice secretary services provided by each voice secretary server without registering with each server separately.
  • when the user of the first terminal speaks to the first terminal, the first terminal records the user's voice command and transmits the voice command to the integration server.
  • the integration server transmits the voice command to the first voice secretary server providing the voice secretary service set by the user of the first terminal.
  • when a plurality of voice secretary services are set for the first terminal, the integration server selects one voice secretary service using the status of the server providing each service, the priority set by the user of the first terminal, and the like. For example, if the A and C voice secretary servers are currently in a good state and the priority set by the user of the first terminal is B, C, and A, the C voice secretary server can be selected.
  • the selected first voice secretary server recognizes the voice command and converts it into text.
  • the first voice secretary server may further include a function of analyzing the semantic content of the voice command converted into text. Analysis techniques such as machine learning, big data, and artificial intelligence may be used in recognizing the voice command and converting it into text, or in analyzing the semantic content of the converted text.
  • the first voice secretary server transmits the voice recognition result to the integration server.
  • the voice recognition result transmitted by the first voice secretarial server to the integration server may be a voice conversion result, a semantic analysis result, or both a voice conversion result and a semantic analysis result.
  • the integration server identifies the second terminal to which the voice secretary service is to be provided, using the voice recognition result.
  • the integration server can identify the second terminal by analyzing the semantic contents of the voice conversion result.
  • the integration server can identify the second terminal using the semantic analysis result.
  • the integration server can identify the second terminal by using both the result of analyzing the meaning of the voice conversion result and the semantic analysis result received from the first voice secretarial server.
  • the integration server searches for the second terminal in its own database and identifies the voice secretary services set by the user of the second terminal.
  • when a plurality of voice secretary services are set for the second terminal, the integration server selects one voice secretary service using the status of the server providing each service, the priority set by the user of the second terminal, and the like. For example, if the A and C voice secretary servers are currently in a good state and the priority set by the user of the second terminal is B, C, and A, the C voice secretary server can be selected.
  • the integration server sends a speech recognition result to a second voice secretary server that provides the selected voice secretary service.
  • the voice recognition result transmitted by the integrated server to the second voice secretary server may be a voice conversion result, a semantic analysis result, or both a voice conversion result and a semantic analysis result.
  • the second voice secretary server generates a service packet for the second terminal using the voice recognition result received.
  • the second voice secretarial server may analyze the semantic content of the voice conversion result to generate a service packet for the second terminal.
  • the second voice secretary server may generate a service packet for the second terminal using the semantic analysis result.
  • the second voice secretarial server itself can generate a service packet for the second terminal by using both the result of analyzing the meaning of the voice conversion result and the semantic analysis result received from the integration server.
  • the second voice secretary server transmits a service packet for the second terminal to the integration server.
  • the integration server transmits the service packet received from the second voice secretary server to the second terminal.
  • the integrated voice secretary agent can be configured to be interworked with various voice secretary services.
  • the user can use all the voice secretary services by registering only with the integrated voice secretary agent, which greatly improves user convenience.
  • FIG. 6 is a flowchart illustrating an integrated voice secretary service providing method according to a second embodiment of the present invention.
  • the integrated voice secretary service providing method of FIG. 6 may be performed by an integrated server, a gateway device, or other network device.
  • the integrated voice secretary service providing method includes receiving a recorded voice command from a first terminal (S610), a first selection process of selecting a voice secretary server to provide a service to the first terminal (S620), transmitting the recorded voice command to the first voice secretary server selected in the first selection process (S630), receiving a recognition result of the recorded voice command from the first voice secretary server (S640), analyzing the voice recognition result to identify a second terminal to which the voice recognition result is to be transmitted (S650), a second selection process of selecting a voice secretary server to provide a service to the second terminal (S660), transmitting the voice recognition result to the second voice secretary server selected in the second selection process (S670), receiving a service packet according to the voice recognition result from the second voice secretary server (S680), and transmitting the service packet to the second terminal (S690).
  • the recorded voice command may be in various file formats.
  • for example, a recorded voice command may be in a file format such as mp3, wav, wma, 3gp, aiff, aac, alac, amr, au, awb, dvf, flac, mmf, mpc, msv, ogg, ra, rm, tta, or the like.
  • in the first selection process, the selection of the voice secretary server may be determined according to the status of each voice secretary server and the priority set by the user. For example, if the A and C voice secretary servers are currently in a good state and the priority set by the user of the first terminal is B, C, and A, the C voice secretary server can be selected.
  • the step S630 of transmitting the recorded voice command to the first voice secretary server selected in the first selection process may include converting the voice command into the file format used by the first voice secretary server and then transmitting it.
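  • as an illustrative sketch of that conversion step, the following uses the pydub library purely as an example tool; the set of formats accepted by each server is assumed to come from the integration server's own records, and none of this is prescribed by the present disclosure.

```python
# Example of converting a recorded voice command to a file format accepted by the
# selected voice secretary server before transmitting it (step S630). pydub is used
# only as an illustration; any audio conversion tool would do.
from pydub import AudioSegment

def convert_command(path, accepted_formats):
    """Return a file path whose format is accepted by the target server."""
    ext = path.rsplit(".", 1)[-1].lower()
    if ext in accepted_formats:
        return path                       # already in an accepted format
    target = accepted_formats[0]          # pick the server's preferred format
    out_path = path.rsplit(".", 1)[0] + "." + target
    AudioSegment.from_file(path).export(out_path, format=target)
    return out_path
```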
  • the received voice recognition result may be a voice conversion result, a semantic analysis result, or both a voice conversion result and a semantic analysis result.
  • the second terminal can be identified by analyzing the semantic content of the voice conversion result, by using the semantic analysis result, or by using both the server's own analysis of the voice conversion result and the semantic analysis result received from the first voice secretary server.
  • in the second selection process, the selection of the voice secretary server may likewise be determined according to the status of each voice secretary server and the priority set by the user. For example, if the A and C voice secretary servers are currently in a good state and the priority set by the user of the second terminal is B, C, and A, the C voice secretary server can be selected.
  • the transmitted voice recognition result may be a voice conversion result, a semantic analysis result, or both the voice conversion result and the semantic analysis result.
  • the received service packet may be a packet for the service that the second voice secretary server intends to provide to the second terminal, generated using the voice recognition result.
  • in step S690 of transmitting the service packet to the second terminal, a command for launching the app that provides the second voice secretary service may be transmitted to the second terminal together with the service packet.
  • the method may further include, prior to step S610, registering user terminal information, information on the voice secretary services the user terminal wishes to be provided with, and information on the priorities of those voice secretary services.
  • between steps S650 and S670, a step of translating the voice recognition result into the language used by the second terminal may be further included.
  • in that case, the voice recognition result transmitted in step S670 may be the translated voice recognition result.
  • FIG. 6 shows S610 to S690 being executed sequentially.
  • however, this merely illustrates the technical idea of the present invention, and the execution of S610 to S690 is not limited to this time-series order.
  • those skilled in the art may change the order of S610 to S690, omit one or more of the steps, or execute one or more of the steps in parallel without departing from the essential characteristics of the present invention.
  • accordingly, the method of FIG. 6 can be variously modified and altered.
  • FIG. 7 is a block diagram illustrating a server providing an integrated voice secretary service according to a second embodiment of the present invention.
  • the integration server 720 includes a first receiving unit 721, a second receiving unit 722, a third receiving unit 723, a first selecting unit 724, a second selecting unit 725, a determination unit 726, a first transmitting unit 727, a second transmitting unit 728, and a third transmitting unit 729.
  • the first receiving unit 721 receives the recorded voice command from the first terminal 711.
  • the recorded voice command may be stored in a file format such as mp3, wav, wma, 3gp, aiff, aac, alac, amr, au, awb, dvf, flac, mmf, mpc, msv, ogg, ra, rm, tta, or the like.
  • the first selection unit 724 selects a voice secretary server to provide the service to the first terminal 711.
  • the selection of the voice secretary server can be determined according to the status of each voice secretary server and the priority set by the user. For example, if the A and C voice secretary servers are currently in a good state and the priority set by the user of the first terminal 711 is B, C, and A, the C voice secretary server can be selected.
  • the first transmitting unit 727 transmits the recorded voice command to the first voice secretary server 731 selected by the first selecting unit 724.
  • the first transmission unit 727 may convert the recorded voice command into a file format used by the first voice secretary server 731, and then transmit the converted voice command.
  • the second receiving unit 722 receives the recognition result of the recorded voice command from the first voice secretarial server 731.
  • the received speech recognition result may be a speech conversion result, a semantic analysis result, or both a speech conversion result and a semantic analysis result.
  • the determination unit 726 analyzes the speech recognition result and determines the second terminal 712 to transmit the speech recognition result.
  • the determination unit 726 can determine the second terminal 712 by analyzing the semantic content of the voice conversion result, by using the semantic analysis result, or by using both its own analysis of the voice conversion result and the semantic analysis result received from the first voice secretary server 731.
  • the second selection unit 725 selects a voice secretary server to provide service to the second terminal 712.
  • the selection of the voice secretary server can be determined according to the status of each voice secretary server and the priority set by the user. For example, if the A and C voice secretary servers are currently in a good state and the priority set by the user of the second terminal 712 is B, C, and A, the C voice secretary server can be selected.
  • the second transmitting unit 728 transmits the voice recognition result to the second voice secretary server 732 selected by the second selecting unit 725.
  • the transmitted speech recognition result may be a speech conversion result, a semantic analysis result, or both a speech conversion result and a semantic analysis result.
  • the third receiving unit 723 receives the service packet according to the voice recognition result from the second voice secretary server 732.
  • the service packet may be a service packet for a service that the second voice secretary server 732 wants to provide to the second terminal 712 generated using the voice recognition result.
  • the third transmitting unit 729 transmits the service packet to the second terminal 712.
  • the third transmitting unit 729 may transmit the service packet to the second terminal 712 through the app (application) that provides the second voice secretary service.
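  • to make the division into receiving, selecting, determining, and transmitting units more concrete, the following is a compact, assumption-laden sketch of an integration server object arranged along the lines of FIG. 7; the transport and analysis callables are injected placeholders rather than interfaces defined by the present disclosure, and the reference numerals and step numbers appear only as comments.

```python
# Compact sketch of the FIG. 7 integration server (second embodiment). The registry
# layout is an assumption; send_to / receive_from / identify_target are supplied by
# the caller because the patent does not define the transport or analysis details.

class IntegrationServer:
    def __init__(self, registry, server_status, send_to, receive_from, identify_target):
        self.registry = registry                 # terminal -> interlocked services + priority
        self.server_status = server_status       # service name -> currently healthy or not
        self.send_to = send_to                   # network send (placeholder callable)
        self.receive_from = receive_from         # network receive (placeholder callable)
        self.identify_target = identify_target   # semantic analysis of the recognition result

    def select_server(self, terminal):
        """Selecting units 724/725: user priority filtered by current server status."""
        entry = self.registry[terminal]
        for service in entry["priority"]:
            if service in entry["servers"] and self.server_status.get(service, False):
                return service
        raise RuntimeError(f"no available voice secretary server for {terminal}")

    def handle_voice_command(self, first_terminal, recorded_command):
        """FIG. 6 flow, after the recorded command has been received (unit 721, S610)."""
        first_server = self.select_server(first_terminal)        # unit 724, S620
        self.send_to(first_server, recorded_command)              # unit 727, S630
        result = self.receive_from(first_server)                  # unit 722, S640
        second_terminal = self.identify_target(result)            # unit 726, S650
        second_server = self.select_server(second_terminal)       # unit 725, S660
        self.send_to(second_server, result)                       # unit 728, S670
        packet = self.receive_from(second_server)                 # unit 723, S680
        self.send_to(second_terminal, packet)                     # unit 729, S690
        return packet
```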
  • a recording medium readable by a computer or a smartphone includes all kinds of recording devices in which data readable by a computer system is stored. That is, a recording medium readable by a computer or a smartphone includes magnetic storage media (e.g., ROM, floppy disk, hard disk, etc.), optically readable media (e.g., CD-ROM and the like), flash memory (e.g., USB, SSD), and the like.
  • the recording medium may also be distributed over network-connected computer systems so that code readable by a computer or a smartphone can be stored and executed in a distributed manner.
  • 720: integration server, 721: first receiving unit
  • 724: first selecting unit, 725: second selecting unit
  • 731: first voice secretary server, 732: second voice secretary server

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Signal Processing (AREA)
  • Machine Translation (AREA)
  • Telephonic Communication Services (AREA)

Abstract

Disclosed are a method and a device for providing an integrated voice secretary service. According to an embodiment of the present invention, provided is a method for providing an integrated voice secretary service comprising the steps of: receiving a recognition result of a voice command from a first voice secretary server; locating a target terminal, to which the voice recognition result is to be transmitted, by means of analyzing the voice recognition result; searching for voice secretary servers linked to the target terminal; selecting one among voice secretary servers which have been found; and transmitting the voice recognition result to the selected voice secretary server.

Description

Method and apparatus for providing an integrated voice secretary service
The present invention relates to a method and apparatus for providing an integrated voice secretary service.
The contents described in this section merely provide background information on the present invention and do not constitute prior art.
Since the development of various types of personal smart devices such as laptop computers, tablets, smartphones, and smart watches, interfaces for operating smart devices have evolved in various directions. As smart devices have gradually become smaller, touch screens have been adopted to provide a rich user interface on a small screen, and the touch interface is now widely used as an interface for operating various personal smart devices.
The touch interface is intuitive and has the advantage of providing immediate feedback on commands. However, in situations that require complex interaction, such as when both hands are not free, when complex commands must be performed, when a command requires several steps of interaction, or when long text must be entered, entering commands one by one through the touch interface is inconvenient.
Compared with the touch interface, the voice interface is a natural and intuitive interface whose use is expanding, centered on services that require fast interaction.
The most important factor in a voice interface is the accuracy of the speech recognition technology, and various techniques are being developed to improve it. For example, speech recognition technology based on a recurrent deep neural network (RDNN) builds a speech recognition engine through learning. Because the amount of training data and the amount of repeated training greatly affect speech recognition performance, the quality of the voice secretary service provided by each company differs. For example, a voice interface widely used in English-speaking countries will have good English recognition quality, and a voice interface widely used in Korea will have good Korean recognition quality.
An example of a service based on a voice interface, that is, a voice secretary service, is Apple's 'Siri'. Siri is a voice secretary service that works on Apple devices running iOS and macOS. A brief description of how Siri works is given below.
① Siri's basic wake command is "Hey Siri" or "시리야". When the user calls the voice secretary by saying "Hey Siri" or "시리야" to the user terminal and then speaks a command, the user terminal records the user's voice and transmits it to the voice secretary server. ② The voice secretary server recognizes the user's voice and converts it into text. ③ The voice secretary server analyzes the converted text using artificial intelligence and the like. ④ According to the analyzed content, the voice secretary server gives a specific answer to the user terminal or causes the user terminal to run a specific app. It may also perform an operation of controlling a user terminal other than the one that received the voice input.
The user must subscribe to and register with each voice secretary service separately before using it. Moreover, if a different voice secretary service is provided for each terminal carried by the user, the user cannot be provided with a continuous voice secretary service across various situations and environments. It is therefore necessary to interwork a plurality of voice secretary services.
In this regard, Korean Patent Laid-Open Publication No. 2016-0071111 (Jun. 21, 2016), 'Providing Personal Secretary Service in Electronic Device', discloses a voice secretary service that provides a response to a third party other than the owner of the terminal. For example, when the owner of the terminal is in a meeting and a third party asks "When does it end?", the terminal's voice secretary service automatically generates the voice response "It ends in one hour" and transmits it to the third party.
However, this document only discloses a configuration in which a single voice secretary service provides services to a plurality of terminals; it does not disclose a configuration for interworking a plurality of voice secretary services.
According to an embodiment of the present invention, a plurality of voice secretary services can be interworked to ensure the continuity of the voice secretary service in various situations and environments.
According to an embodiment of the present invention, there is provided a method of providing an integrated voice secretary service, comprising: receiving a recognition result of a voice command from a first voice secretary server; analyzing the voice recognition result to identify a target terminal to which the voice recognition result is to be transmitted; searching for voice secretary servers interlocked with the target terminal; selecting one of the retrieved voice secretary servers; and transmitting the voice recognition result to the selected voice secretary server.
According to another embodiment of the present invention, there is provided a method of providing an integrated voice secretary service, comprising: receiving a recorded voice command from a first terminal; a first selection step of selecting a voice secretary server to provide a service to the first terminal; transmitting the recorded voice command to the first voice secretary server selected in the first selection step; receiving a recognition result of the recorded voice command from the first voice secretary server; analyzing the voice recognition result to identify a second terminal to which the voice recognition result is to be transmitted; a second selection step of selecting a voice secretary server to provide a service to the second terminal; transmitting the voice recognition result to the second voice secretary server selected in the second selection step; receiving a service packet according to the voice recognition result from the second voice secretary server; and transmitting the service packet to the second terminal.
According to yet another embodiment of the present invention, there is provided an integrated voice secretary server comprising: a first receiving unit for receiving a recorded voice command from a first terminal; a first selecting unit for selecting a voice secretary server to provide a service to the first terminal; a first transmitting unit for transmitting the recorded voice command to the first voice secretary server selected by the first selecting unit; a second receiving unit for receiving a recognition result of the recorded voice command from the first voice secretary server; a determination unit for analyzing the voice recognition result and identifying a second terminal to which the voice recognition result is to be transmitted; a second selecting unit for selecting a voice secretary server to provide a service to the second terminal; a second transmitting unit for transmitting the voice recognition result to the second voice secretary server selected by the second selecting unit; a third receiving unit for receiving a service packet according to the voice recognition result from the second voice secretary server; and a third transmitting unit for transmitting the service packet to the second terminal.
According to an embodiment of the present invention, a plurality of voice secretary services can be interworked to ensure the continuity of the voice secretary service in various situations and environments.
According to another embodiment of the present invention, all kinds of voice secretary services can be provided through a single integrated voice secretary service subscription.
FIG. 1 illustrates an integrated voice secretary service providing system according to a first embodiment of the present invention.
FIG. 2 illustrates a specific operation of the integrated voice secretary service providing system according to the first embodiment of the present invention.
FIG. 3 illustrates the integrated voice secretary service providing system according to the first embodiment of the present invention with a translation process added.
FIG. 4 is a flowchart illustrating an integrated voice secretary service providing method according to the first embodiment of the present invention.
FIG. 5 illustrates an integrated voice secretary service providing system according to a second embodiment of the present invention.
FIG. 6 is a flowchart illustrating an integrated voice secretary service providing method according to the second embodiment of the present invention.
FIG. 7 is a block diagram illustrating a server providing an integrated voice secretary service according to the second embodiment of the present invention.
Hereinafter, an embodiment of the present invention will be described in detail with reference to the accompanying drawings. In assigning reference numerals to the components in each drawing, the same components are given the same numerals as far as possible even when they appear in different drawings. In describing an embodiment of the present invention, detailed descriptions of well-known functions or configurations are omitted when they would unnecessarily obscure the subject matter of the present invention.
In describing the components of an embodiment of the present invention, reference signs such as first, second, i), ii), a), and b) may be used. Such signs are intended only to distinguish one component from another; the nature, sequence, or order of the components is not limited by the signs. When a part is described as 'comprising' or 'including' a component, this does not exclude other components but means that other components may be further included, unless explicitly stated otherwise. Terms such as 'unit' and 'module' refer to a unit that processes at least one function or operation and may be implemented as hardware, software, or a combination of hardware and software.
A variety of voice secretary services are currently available, including Apple's Siri, Google's Google Assistant, Amazon's Alexa, Microsoft's Cortana, Samsung's Bixby, SKT's NUGU, KT's GiGA Genie, Naver's Clova, and Kakao's Kakao I.
However, because no interworking function is provided between these voice secretary services, users experience considerable inconvenience. A user must subscribe to and register with each voice secretary service separately in order to use it, which makes account management and use cumbersome. In addition, when the user's terminals support different voice secretary services, a continuous and consistent voice secretary service cannot be provided in various situations and environments.
For example, consider a case where the smartphone is a Samsung product (voice secretary service: Samsung Bixby), the speaker is an Amazon product (voice secretary service: Amazon Alexa), the personal tablet is an Apple product (voice secretary service: Apple Siri), the office tablet is a Google product (voice secretary service: Google Assistant), the TV set-top box is a KT product (voice secretary service: KT GiGA Genie), and the office Internet phone is an SKT product (voice secretary service: SKT NUGU). Because the voice secretary service linked to each terminal differs, it is difficult to share information between the voice secretary services across the terminals. For example, when the user is using voice secretary service A, it is difficult to provide an associated service to a terminal interlocked with voice secretary service B.
1. First embodiment
FIG. 1 illustrates an integrated voice secretary service providing system according to a first embodiment of the present invention.
As shown in FIG. 1, the integrated voice secretary service providing system according to the first embodiment of the present invention includes a plurality of terminals, a plurality of voice secretary servers, and an integrated voice secretary service providing server (hereinafter, 'integration server') according to an embodiment of the present invention.
In the integrated voice secretary service providing system according to the first embodiment of the present invention, the integration server is located behind each voice secretary server, so the service provided by the integration server is not exposed to the users of the terminals.
The first terminal is provided with a voice secretary service through the first voice secretary server. The second terminal is provided with a voice secretary service through the second voice secretary server. The integration server relays between the first voice secretary server and the second voice secretary server.
According to the integrated voice secretary service of the first embodiment of the present invention, the voice secretary services linked to each terminal are connected so that the user can receive the voice secretary service regardless of the user's spatial or temporal location or the device currently in the user's possession. For example, a service provided by the voice secretary service (KT GiGA Genie) of the KT TV set-top box at home can be provided through the voice secretary service (Samsung Bixby) of the Samsung smartphone that the user is currently carrying outside the home.
도 2는 본 발명의 제1 실시예에 따른 통합 음성비서 서비스 제공 시스템의 구체적인 동작 과정을 예시한 도면이다.2 is a diagram illustrating a specific operation of the integrated voice secretary service providing system according to the first embodiment of the present invention.
When the user of the first terminal speaks to the first terminal, the first terminal records the user's voice (hereinafter, 'voice command') and transmits it to the first voice secretary server.
The first voice secretary server recognizes the voice command and converts it into text. The first voice secretary server may additionally provide a function of analyzing the semantic content of the voice command converted into text. Analysis techniques such as machine learning, big data, and artificial intelligence may be used in the process of recognizing the voice command and converting it into text, or in the process of analyzing the semantic content of the converted text.
The first voice secretary server transmits the voice recognition result to the integration server. The voice recognition result transmitted from the first voice secretary server to the integration server may be the result of converting the voice command into text (hereinafter, 'voice conversion result'), the result of analyzing the semantic content of the converted text (hereinafter, 'semantic analysis result'), or both the voice conversion result and the semantic analysis result.
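As an illustration only, the recognition result exchanged between the servers can be pictured as a small record that may carry either or both components. The field names below are assumptions made for this sketch, not part of the disclosure.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RecognitionResult:
    """A voice recognition result relayed via the integration server.

    A voice secretary server may populate only the transcript (the 'voice
    conversion result'), only the parsed meaning (the 'semantic analysis
    result'), or both.
    """
    source_terminal: str              # terminal that recorded the voice command
    language: str                     # e.g. "ko" or "en"
    transcript: Optional[str] = None  # voice conversion result (speech-to-text)
    semantics: Optional[dict] = None  # semantic analysis result (intent, slots, target device)
```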
The integration server uses the voice recognition result to identify the second terminal to which the voice secretary service is to be provided. The integration server may identify the second terminal by analyzing the semantic content of the voice conversion result, by using the semantic analysis result, or by using both its own analysis of the voice conversion result and the semantic analysis result received from the first voice secretary server.
The integration server searches its own database for the second terminal and determines which voice secretary services are linked to the second terminal. If a plurality of voice secretary services are linked to the second terminal, the integration server selects one voice secretary server according to the status of the voice secretary server providing each service and the priority set by the user of the second terminal. For example, if voice secretary servers A and C are currently in good condition and the priority set by the user of the second terminal is B, C, A, the C voice secretary server can be selected.
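That selection rule can be sketched as follows. The function name and the status values are assumptions made for illustration; the logic simply follows the example above, where servers A and C are healthy, the user's priority is B, C, A, and C is therefore chosen.

```python
def select_assistant_server(linked_services, server_status, user_priority):
    """Pick one voice secretary server for a terminal.

    linked_services: services linked to (or set for) the terminal, e.g. {"A", "B", "C"}
    server_status:   current condition per service, e.g. {"A": "ok", "B": "overloaded", "C": "ok"}
    user_priority:   the terminal user's preference order, e.g. ["B", "C", "A"]
    """
    for service in user_priority:
        if service in linked_services and server_status.get(service) == "ok":
            return service
    raise LookupError("no linked voice secretary server is currently available")

# Servers A and C are healthy, the user's priority is B, C, A -> C is selected.
assert select_assistant_server(
    {"A", "B", "C"},
    {"A": "ok", "B": "overloaded", "C": "ok"},
    ["B", "C", "A"],
) == "C"
```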
The integration server transmits the voice recognition result to the selected second voice secretary server. The voice recognition result transmitted by the integration server to the second voice secretary server may be the voice conversion result, the semantic analysis result, or both.
The second voice secretary server provides the voice secretary service to the second terminal using the received voice recognition result. The second voice secretary server may provide the service by analyzing the semantic content of the voice conversion result, by using the semantic analysis result, or by using both its own analysis of the voice conversion result and the semantic analysis result received from the integration server.
FIG. 3 shows the integrated voice secretary service providing system according to the first embodiment of the present invention with a translation process added.
As shown in FIG. 3, the integrated voice secretary service providing system according to the first embodiment of the present invention may include a translation process.
Suppose the language used by the first terminal is Korean and the language used by the second terminal is English. The user of the first terminal can then use a voice secretary service with strong Korean recognition capability, while the user of the second terminal uses a voice secretary service with strong English recognition capability.
If the language used by the first terminal differs from the language used by the second terminal, the integration server may additionally provide a function of translating the voice recognition result into the other language in between, so that the two voice secretary services interwork optimally.
When the user of the first terminal speaks to the first terminal, the first terminal records the voice command and transmits it to the first voice secretary server.
The first voice secretary server recognizes the voice command and converts it into text. Furthermore, the first voice secretary server may additionally provide a function of analyzing the semantic content of the voice command converted into text.
The first voice secretary server transmits the voice recognition result to the integration server. The voice recognition result transmitted by the first voice secretary server to the integration server may be the voice conversion result, the semantic analysis result, or both.
The integration server uses the voice recognition result to identify the second terminal to which the voice secretary service is to be provided. The integration server may identify the second terminal by analyzing the semantic content of the voice conversion result, by using the semantic analysis result, or by using both its own analysis of the voice conversion result and the semantic analysis result received from the first voice secretary server.
The integration server searches its own database for the second terminal and determines which voice secretary services are linked to the second terminal. If a plurality of voice secretary services are linked to the second terminal, the integration server selects one voice secretary server according to the status of the voice secretary server providing each service and the priority set by the user of the second terminal. For example, if voice secretary servers A and C are currently in good condition and the priority set by the user of the second terminal is B, C, A, the C voice secretary server can be selected.
Before transmitting the voice recognition result to the selected second voice secretary server, the integration server determines whether the language used by the first terminal and the language used by the second terminal are the same. If the two languages differ, the integration server translates the voice recognition result into the language used by the second terminal and then transmits the translated voice recognition result to the second voice secretary server. The voice recognition result transmitted by the integration server to the second voice secretary server may be the translated voice conversion result, the translated semantic analysis result, or both.
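A minimal sketch of that decision is shown below. It assumes the recognition result travels as a plain dictionary and that a `translate(text, src, dst)` helper is available; both are hypothetical stand-ins, not part of the disclosure.

```python
def forward_recognition_result(result, src_lang, dst_lang, translate):
    """Return the payload to send to the second voice secretary server,
    translating it first if the two terminals use different languages.

    `translate(text, src, dst)` is a placeholder for whatever machine
    translation backend the integration server uses.
    """
    if src_lang == dst_lang:
        return result
    translated = dict(result)
    if result.get("transcript") is not None:
        translated["transcript"] = translate(result["transcript"], src_lang, dst_lang)
    if result.get("semantics") is not None:
        # Translate only human-language slot values; intent labels stay as-is.
        translated["semantics"] = {
            key: translate(value, src_lang, dst_lang) if isinstance(value, str) else value
            for key, value in result["semantics"].items()
        }
    translated["language"] = dst_lang
    return translated
```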
The second voice secretary server provides the voice secretary service to the second terminal using the received voice recognition result. It may do so by analyzing the semantic content of the translated voice conversion result, by using the translated semantic analysis result, or by using both its own analysis of the translated voice conversion result and the translated semantic analysis result received from the integration server.
2. Flowchart of the First Embodiment
FIG. 4 is a flowchart illustrating an integrated voice secretary service providing method according to the first embodiment of the present invention.
The integrated voice secretary service providing method of FIG. 4 may be performed by the integration server, a gateway device, or another network device.
As shown in FIG. 4, the integrated voice secretary service providing method according to the first embodiment of the present invention includes: receiving a voice recognition result from a first voice secretary server linked to a first terminal (S410); identifying, using the voice recognition result, a second terminal to which the voice secretary service is to be provided (S420); searching for voice secretary servers linked to the second terminal (S430); selecting one of the voice secretary servers linked to the second terminal (S440); and transmitting the voice recognition result to the selected voice secretary server (S450).
In the step of receiving the voice recognition result from the first voice secretary server linked to the first terminal (S410), the received voice recognition result may be the voice conversion result, the semantic analysis result, or both.
The step of identifying, using the voice recognition result, the second terminal to which the voice secretary service is to be provided (S420) may include analyzing the semantic content of the voice conversion result.
The step of identifying the second terminal using the voice recognition result (S420) may include identifying the second terminal by analyzing the semantic content of the voice conversion result, by using the semantic analysis result, or by using both the integration server's own analysis of the voice conversion result and the semantic analysis result received from the first voice secretary server.
The step of searching for voice secretary servers linked to the second terminal (S430) may include searching for the second terminal in the integration server's own database and determining which voice secretary services are linked to the second terminal.
The step of selecting one of the voice secretary servers linked to the second terminal (S440) may include, when a plurality of voice secretary services are linked to the second terminal, selecting one voice secretary service using the status of the voice secretary server providing each service, the priority set by the user of the second terminal, and the like. For example, if voice secretary servers A and C are currently in good condition and the priority set by the user of the second terminal is B, C, A, the C voice secretary server can be selected.
In the step of transmitting the voice recognition result to the second voice secretary server (S450), the transmitted voice recognition result may be the voice conversion result, the semantic analysis result, or both. The semantic analysis result may be the one received in step S410 or the one produced in step S420.
Before step S410, the method may further include registering user terminal information, information on the voice secretary services linked to each user terminal, and information on the priorities of the linked voice secretary services.
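Such a registration step might, for example, store the linked services, their priorities, and the language per terminal. The shape below is only an assumed illustration of what the integration server's database could hold.

```python
# Hypothetical registration record in the integration server's database;
# every field name and value is an assumption made for illustration only.
registration = {
    "user_id": "user-001",
    "terminals": {
        "living-room-settop": {
            "linked_services": ["KT GiGA Genie"],
            "priority": ["KT GiGA Genie"],
            "language": "ko",
        },
        "office-tablet": {
            "linked_services": ["Google Assistant", "Apple Siri"],
            "priority": ["Google Assistant", "Apple Siri"],
            "language": "en",
        },
    },
}
```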
Between steps S420 and S450, if the language used by the first terminal differs from the language used by the second terminal, the method may further include translating the voice recognition result into the language used by the second terminal. In this case, transmitting the voice recognition result in step S450 may mean transmitting the translated voice recognition result.
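Putting steps S410 to S450 together, a minimal handler could look like the sketch below. It reuses the `select_assistant_server` and `forward_recognition_result` sketches above; `db`, `server_status`, `translate`, and `send` are injected, hypothetical collaborators rather than elements of the disclosure.

```python
def handle_recognition_result(result, db, server_status, translate, send):
    """Sketch of steps S410 to S450 of the first embodiment.

    `result` is the voice recognition result received in S410 (a dict with
    "language", "transcript", and "semantics" keys), `db` is the assumed
    registration database, and `send(service, payload)` delivers the result
    to the chosen voice secretary server.
    """
    # S420: identify the target terminal from the semantic analysis result.
    target = result["semantics"]["target_terminal"]

    # S430: look up the voice secretary services linked to that terminal.
    terminal = db["terminals"][target]

    # S440: pick one service by server status and the user's priority.
    chosen = select_assistant_server(
        set(terminal["linked_services"]), server_status, terminal["priority"])

    # Optional step: translate if the two terminals use different languages.
    if terminal["language"] != result["language"]:
        result = forward_recognition_result(
            result, result["language"], terminal["language"], translate)

    # S450: forward the (possibly translated) result to the chosen server.
    send(chosen, result)
```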
Although FIG. 4 describes steps S410 to S450 as being executed sequentially, this is merely an illustrative description of the technical idea of the present invention, and the execution of S410 to S450 is not limited to a time-series order. A person of ordinary skill in the art to which the present invention pertains may variously modify and alter the method of FIG. 4 without departing from the essential characteristics of the present invention, for example by changing the order of S410 to S450, omitting one or more of the steps, or executing one or more of the steps in parallel.
3. Second Embodiment
FIG. 5 is a diagram illustrating an integrated voice secretary service providing system according to a second embodiment of the present invention.
As shown in FIG. 5, the integrated voice secretary service providing system according to the second embodiment of the present invention includes a plurality of terminals, a plurality of voice secretary servers, and an integration server.
In the integrated voice secretary service providing system according to the second embodiment, the integration server sits in front of each voice secretary server, so the service provided by the integration server is exposed to the users of the terminals.
The first terminal receives the voice secretary service provided by the first voice secretary server via the integration server. The second terminal receives the voice secretary service provided by the second voice secretary server via the integration server. By registering only with the integration server, the user can receive the voice secretary services provided by each voice secretary server without registering with each server separately.
Specifically, when the user of the first terminal speaks to the first terminal, the first terminal records the user's voice command and transmits it to the integration server.
The integration server transmits the voice command to the first voice secretary server, which provides the voice secretary service set by the user of the first terminal. If the user of the first terminal has set a plurality of voice secretary services, the integration server selects one voice secretary service using the status of the voice secretary server providing each service, the priority set by the user of the first terminal, and the like. For example, if voice secretary servers A and C are currently in good condition and the priority set by the user of the first terminal is B, C, A, the C voice secretary server can be selected.
The selected first voice secretary server recognizes the voice command and converts it into text. The first voice secretary server may additionally provide a function of analyzing the semantic content of the converted text. Analysis techniques such as machine learning, big data, and artificial intelligence may be used in the process of recognizing the voice command and converting it into text, or in the process of analyzing the semantic content of the converted text.
The first voice secretary server transmits the voice recognition result to the integration server. The voice recognition result transmitted by the first voice secretary server to the integration server may be the voice conversion result, the semantic analysis result, or both.
The integration server uses the voice recognition result to identify the second terminal to which the voice secretary service is to be provided. The integration server may identify the second terminal by analyzing the semantic content of the voice conversion result, by using the semantic analysis result, or by using both its own analysis of the voice conversion result and the semantic analysis result received from the first voice secretary server.
The integration server searches its own database for the second terminal and determines the voice secretary services set by the user of the second terminal. If the user of the second terminal has set a plurality of voice secretary services, the integration server selects one voice secretary service using the status of the voice secretary server providing each service, the priority set by the user of the second terminal, and the like. For example, if voice secretary servers A and C are currently in good condition and the priority set by the user of the second terminal is B, C, A, the C voice secretary server can be selected.
The integration server transmits the voice recognition result to the second voice secretary server, which provides the selected voice secretary service. The voice recognition result transmitted by the integration server to the second voice secretary server may be the voice conversion result, the semantic analysis result, or both.
The second voice secretary server generates a service packet for the second terminal using the received voice recognition result. It may do so by analyzing the semantic content of the voice conversion result, by using the semantic analysis result, or by using both its own analysis of the voice conversion result and the semantic analysis result received from the integration server.
The second voice secretary server transmits the service packet for the second terminal to the integration server.
The integration server transmits the service packet received from the second voice secretary server to the second terminal.
According to the integrated voice secretary service of the second embodiment of the present invention, an integrated voice secretary agent interworking with various voice secretary services can be configured. Since the user can use all of the various voice secretary services by registering only with the integrated voice secretary agent, user convenience is greatly increased.
4. Flowchart of the Second Embodiment
FIG. 6 is a flowchart illustrating an integrated voice secretary service providing method according to the second embodiment of the present invention.
The integrated voice secretary service providing method of FIG. 6 may be performed by the integration server, a gateway device, or another network device.
As shown in FIG. 6, the integrated voice secretary service providing method according to the second embodiment of the present invention includes: receiving a recorded voice command from a first terminal (S610); a first selection step of selecting a voice secretary server to provide a service to the first terminal (S620); transmitting the recorded voice command to the first voice secretary server selected in the first selection step (S630); receiving a voice recognition result from the first voice secretary server (S640); analyzing the voice recognition result to identify a second terminal to which the voice recognition result is to be transmitted (S650); a second selection step of selecting a voice secretary server to provide a service to the second terminal (S660); transmitting the voice recognition result to the second voice secretary server selected in the second selection step (S670); receiving a service packet according to the voice recognition result from the second voice secretary server (S680); and transmitting the service packet to the second terminal (S690).
In the step of receiving the recorded voice command from the first terminal (S610), the recorded voice command may be in any of various file formats, for example mp3, wav, wma, 3gp, aiff, aac, alac, amr, au, awb, dvf, flac, mmf, mpc, msv, ogg, opus, ra, rm, tta, or vox.
In the first selection step of selecting the voice secretary server to provide the service to the first terminal (S620), the selection may be determined according to the status of each voice secretary server and the user's priorities. For example, if voice secretary servers A and C are currently in good condition and the priority set by the user of the first terminal is B, C, A, the C voice secretary server can be selected.
The step of transmitting the recorded voice command to the first voice secretary server selected in the first selection step (S630) may include converting the voice command into the file format used by the first voice secretary server before transmitting it.
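One way to perform such a conversion is sketched below. It assumes the pydub library (backed by ffmpeg) is available and that the chosen format is actually supported by the selected server; neither assumption comes from the disclosure itself.

```python
from pydub import AudioSegment  # third-party wrapper around ffmpeg; an assumption of this sketch

def convert_voice_command(in_path, out_path, target_format="wav"):
    """Re-encode a recorded voice command into the file format expected by
    the selected voice secretary server (step S630)."""
    audio = AudioSegment.from_file(in_path)      # input format inferred by ffmpeg
    audio.export(out_path, format=target_format)
    return out_path

# Example: convert an amr recording from the first terminal to wav before
# forwarding it to the first voice secretary server.
# convert_voice_command("command.amr", "command.wav", "wav")
```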
In the step of receiving the voice recognition result from the first voice secretary server (S640), the received voice recognition result may be the voice conversion result, the semantic analysis result, or both.
In the step of analyzing the voice recognition result to identify the second terminal to which the voice recognition result is to be transmitted (S650), the second terminal may be identified by analyzing the semantic content of the voice conversion result, by using the semantic analysis result, or by using both the integration server's own analysis of the voice conversion result and the semantic analysis result received from the first voice secretary server.
In the second selection step of selecting the voice secretary server to provide the service to the second terminal (S660), the selection may be determined according to the status of each voice secretary server and the user's priorities. For example, if voice secretary servers A and C are currently in good condition and the priority set by the user of the second terminal is B, C, A, the C voice secretary server can be selected.
In the step of transmitting the voice recognition result to the second voice secretary server selected in the second selection step (S670), the transmitted voice recognition result may be the voice conversion result, the semantic analysis result, or both.
In the step of receiving the service packet according to the voice recognition result from the second voice secretary server (S680), the received service packet may be a service packet, generated by the second voice secretary server using the voice recognition result, for the service to be provided to the second terminal.
In the step of transmitting the service packet to the second terminal (S690), a command to launch the app that provides the second voice secretary service on the second terminal may be transmitted together with the service packet.
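As an illustration of step S690 only, the delivery to the second terminal might bundle the service packet with an app-launch command, for example as follows; the packet contents and the package name are invented for this sketch.

```python
# Hypothetical example of the message pushed to the second terminal in S690.
service_packet = {  # as received from the second voice secretary server in S680
    "action": "speak",
    "text": "The meeting at 3 pm has been added to your calendar.",
}

delivery = {
    "target_terminal": "office-tablet",
    "launch_app": "com.example.assistant.b",  # app fronting the second voice secretary service
    "service_packet": service_packet,
}
```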
Before step S610, the method may further include registering user terminal information, information on the voice secretary services that each user terminal wishes to receive, and information on the priorities of those voice secretary services.
Between steps S650 and S670, if the language used by the first terminal differs from the language used by the second terminal, the method may further include translating the voice recognition result into the language used by the second terminal. In this case, transmitting the voice recognition result in step S670 may mean transmitting the translated voice recognition result.
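Putting steps S610 to S690 together, the end-to-end flow of the second embodiment can be sketched as below, again reusing the earlier helper sketches. The injected callables (`recognize`, `translate`, `fetch_packet`, `deliver`) are hypothetical stand-ins for the interfaces toward the voice secretary servers and the terminals.

```python
def handle_voice_command(audio_path, first_terminal, db, server_status,
                         recognize, translate, fetch_packet, deliver):
    """Sketch of steps S610 to S690 of the second embodiment.

    `recognize(service, audio_path)` asks a voice secretary server for a
    recognition result, `fetch_packet(service, result)` returns that
    server's service packet, and `deliver(terminal, packet)` pushes the
    packet (plus an app-launch command) to a terminal.
    """
    src = db["terminals"][first_terminal]

    # S620 / S630: choose a server for the first terminal and send the audio.
    first_service = select_assistant_server(
        set(src["linked_services"]), server_status, src["priority"])
    result = recognize(first_service, audio_path)                  # S640

    # S650: identify the second terminal from the recognition result.
    target = result["semantics"]["target_terminal"]
    dst = db["terminals"][target]

    # S660: choose a server for the second terminal.
    second_service = select_assistant_server(
        set(dst["linked_services"]), server_status, dst["priority"])

    # Optional translation between S650 and S670.
    if dst["language"] != src["language"]:
        result = forward_recognition_result(
            result, src["language"], dst["language"], translate)

    # S670 / S680 / S690: forward the result, collect the packet, deliver it.
    packet = fetch_packet(second_service, result)
    deliver(target, packet)
```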
Although FIG. 6 describes steps S610 to S690 as being executed sequentially, this is merely an illustrative description of the technical idea of the present invention, and the execution of S610 to S690 is not limited to a time-series order. A person of ordinary skill in the art to which the present invention pertains may variously modify and alter the method of FIG. 6 without departing from the essential characteristics of the present invention, for example by changing the order of S610 to S690, omitting one or more of the steps, or executing one or more of the steps in parallel.
5. Device Diagram of the Second Embodiment
FIG. 7 is a block diagram illustrating a server that provides the integrated voice secretary service according to the second embodiment of the present invention.
As shown in FIG. 7, the integration server 720 according to the second embodiment of the present invention includes a first receiving unit 721, a second receiving unit 722, a third receiving unit 723, a first selection unit 724, a second selection unit 725, a determination unit 726, a first transmission unit 727, a second transmission unit 728, and a third transmission unit 729.
The first receiving unit 721 receives a recorded voice command from the first terminal 711. The recorded voice command may be in a file format such as mp3, wav, wma, 3gp, aiff, aac, alac, amr, au, awb, dvf, flac, mmf, mpc, msv, ogg, opus, ra, rm, tta, or vox.
The first selection unit 724 selects a voice secretary server to provide a service to the first terminal 711. The selection may be determined according to the status of each voice secretary server and the user's priorities. For example, if voice secretary servers A and C are currently in good condition and the priority set by the user of the first terminal 711 is B, C, A, the C voice secretary server can be selected.
The first transmission unit 727 transmits the recorded voice command to the first voice secretary server 731 selected by the first selection unit 724. The first transmission unit 727 may convert the recorded voice command into the file format used by the first voice secretary server 731 before transmitting it.
The second receiving unit 722 receives the recognition result of the recorded voice command from the first voice secretary server 731. The received voice recognition result may be the voice conversion result, the semantic analysis result, or both.
The determination unit 726 analyzes the voice recognition result and identifies the second terminal 712 to which the voice recognition result is to be transmitted. The determination unit 726 may identify the second terminal 712 by analyzing the semantic content of the voice conversion result, by using the semantic analysis result, or by using both its own analysis of the voice conversion result and the semantic analysis result received from the first voice secretary server 731.
The second selection unit 725 selects a voice secretary server to provide a service to the second terminal 712. The selection may be determined according to the status of each voice secretary server and the user's priorities. For example, if voice secretary servers A and C are currently in good condition and the priority set by the user of the second terminal 712 is B, C, A, the C voice secretary server can be selected.
The second transmission unit 728 transmits the voice recognition result to the second voice secretary server 732 selected by the second selection unit 725. The transmitted voice recognition result may be the voice conversion result, the semantic analysis result, or both.
The third receiving unit 723 receives a service packet according to the voice recognition result from the second voice secretary server 732. The service packet may be a service packet, generated by the second voice secretary server 732 using the voice recognition result, for the service to be provided to the second terminal 712.
The third transmission unit 729 transmits the service packet to the second terminal 712. The third transmission unit 729 may transmit, together with the service packet, a command to launch the app that provides the second voice secretary service on the second terminal 712.
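As a rough, non-authoritative sketch, the units of FIG. 7 can be mapped onto a single class whose methods correspond to the receiving, determination, selection, and transmission units; the `network` object that actually moves audio, results, and packets is an assumption of this sketch.

```python
class IntegrationServer:
    """Sketch of the integration server 720 of FIG. 7. Each method groups the
    functional units that fire on one kind of incoming message."""

    def __init__(self, db, server_status, network, translate):
        self.db = db                      # registration database (terminals, services, priorities)
        self.server_status = server_status
        self.network = network            # assumed transport toward servers and terminals
        self.translate = translate        # assumed translation backend

    def on_voice_command(self, first_terminal, audio_path):
        # First receiving unit 721, first selection unit 724, first transmission unit 727.
        src = self.db["terminals"][first_terminal]
        first_service = select_assistant_server(
            set(src["linked_services"]), self.server_status, src["priority"])
        self.network.send_audio(first_service, audio_path)

    def on_recognition_result(self, result):
        # Second receiving unit 722, determination unit 726,
        # second selection unit 725, second transmission unit 728.
        target = result["semantics"]["target_terminal"]
        dst = self.db["terminals"][target]
        second_service = select_assistant_server(
            set(dst["linked_services"]), self.server_status, dst["priority"])
        if dst["language"] != result["language"]:
            result = forward_recognition_result(
                result, result["language"], dst["language"], self.translate)
        self.network.send_result(second_service, result, reply_to=target)

    def on_service_packet(self, target_terminal, packet):
        # Third receiving unit 723, third transmission unit 729.
        self.network.deliver(target_terminal, packet, launch_app=True)
```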
Although the embodiments described above explain the integrated voice secretary service for two terminals and two voice secretary servers, this is merely an example; the integrated voice secretary service according to an embodiment of the present invention can also be provided for two or more terminals and two or more voice secretary servers.
Meanwhile, the methods described in the above embodiments can be implemented as code readable by a computer or a smartphone on a recording medium readable by a computer or a smartphone. Such a recording medium includes all kinds of recording devices in which data readable by a computer system is stored, for example magnetic storage media (e.g., ROM, floppy disks, hard disks), optical reading media (e.g., CD-ROM, DVD), and flash memory (e.g., USB, SSD). The code may also be distributed over network-connected computer systems and stored and executed in a distributed manner.
The present embodiments are merely illustrative of the technical idea of the present invention, and a person of ordinary skill in the art to which the present invention pertains may make various modifications and variations without departing from the essential characteristics of the present invention.
The present embodiments are intended to explain, not to limit, the technical idea of the present invention, and the scope of the present invention is therefore not limited by these embodiments. The scope of protection of the present invention should be construed according to the claims, and all technical ideas within a scope equivalent thereto should be construed as falling within the scope of the present invention.
(Explanation of Symbols)
711: first terminal 712: second terminal
720: integration server 721: first receiving unit
722: second receiving unit 723: third receiving unit
724: first selection unit 725: second selection unit
726: determination unit 727: first transmission unit
728: second transmission unit 729: third transmission unit
731: first voice secretary server 732: second voice secretary server
(CROSS-REFERENCE TO RELATED APPLICATION)
This patent application claims priority under 35 U.S.C. 119(a) to Korean Patent Application No. 10-2017-0157064, filed in Korea on November 23, 2017, the entire contents of which are incorporated herein by reference. This patent application also claims priority in countries other than the United States for the same reason, and the entire contents thereof are likewise incorporated herein by reference.

Claims (16)

  1. A method for providing an integrated voice secretary service, the method comprising:
    receiving a recognition result of a voice command (hereinafter, 'voice recognition result') from a first voice secretary server;
    analyzing the voice recognition result to identify a terminal to which the voice recognition result is to be transmitted (hereinafter, 'target terminal');
    searching for voice secretary servers linked to the target terminal;
    selecting one of the retrieved voice secretary servers; and
    transmitting the voice recognition result to the selected voice secretary server.
  2. The method of claim 1, further comprising,
    before the receiving,
    registering user terminal information, information on the voice secretary services linked to the user terminal, and information on the priorities of the linked voice secretary services.
  3. The method of claim 2,
    wherein the selecting selects a voice secretary server according to the priorities.
  4. The method of claim 1, further comprising,
    between the identifying of the target terminal and the transmitting,
    translating the voice recognition result into a language used by the target terminal.
  5. The method of claim 4,
    wherein the transmitting transmits the translated voice recognition result.
  6. A method for providing an integrated voice secretary service, the method comprising:
    receiving a recorded voice command from a first terminal;
    a first selection step of selecting a voice secretary server to provide a service to the first terminal;
    transmitting the recorded voice command to the voice secretary server selected in the first selection step (hereinafter, 'first voice secretary server');
    receiving a recognition result of the recorded voice command (hereinafter, 'voice recognition result') from the first voice secretary server;
    analyzing the voice recognition result to identify a terminal to which the voice recognition result is to be transmitted (hereinafter, 'second terminal');
    a second selection step of selecting a voice secretary server to provide a service to the second terminal;
    transmitting the voice recognition result to the voice secretary server selected in the second selection step (hereinafter, 'second voice secretary server');
    receiving a service packet according to the voice recognition result from the second voice secretary server; and
    transmitting the service packet to the second terminal.
  7. The method of claim 6, further comprising,
    before the receiving of the recorded voice command,
    registering user terminal information, information on the voice secretary services that the user terminal wishes to receive, and information on the priorities of the voice secretary services to be received.
  8. The method of claim 7,
    wherein the second selection step selects a voice secretary server according to the priorities.
  9. The method of claim 6, further comprising,
    between the identifying of the second terminal and the transmitting of the voice recognition result to the second voice secretary server,
    translating the voice recognition result into a language used by the second terminal.
  10. The method of claim 9,
    wherein the transmitting of the voice recognition result to the second voice secretary server transmits the translated voice recognition result.
  11. An integrated voice secretary server comprising:
    a first receiving unit configured to receive a recorded voice command from a first terminal;
    a first selection unit configured to select a voice secretary server to provide a service to the first terminal;
    a first transmission unit configured to transmit the recorded voice command to the voice secretary server selected by the first selection unit (hereinafter, 'first voice secretary server');
    a second receiving unit configured to receive a recognition result of the recorded voice command (hereinafter, 'voice recognition result') from the first voice secretary server;
    a determination unit configured to analyze the voice recognition result and identify a terminal to which the voice recognition result is to be transmitted (hereinafter, 'second terminal');
    a second selection unit configured to select a voice secretary server to provide a service to the second terminal;
    a second transmission unit configured to transmit the voice recognition result to the voice secretary server selected by the second selection unit (hereinafter, 'second voice secretary server');
    a third receiving unit configured to receive a service packet according to the voice recognition result from the second voice secretary server; and
    a third transmission unit configured to transmit the service packet to the second terminal.
  12. The integrated voice secretary server of claim 11,
    further comprising registering user terminal information, information on the voice secretary services that the user terminal wishes to receive, and information on the priorities of the voice secretary services to be received.
  13. The integrated voice secretary server of claim 12,
    wherein the second selection unit selects a voice secretary server according to the priorities.
  14. The integrated voice secretary server of claim 11, further comprising
    a translation unit configured to translate the voice recognition result into another language.
  15. The integrated voice secretary server of claim 14,
    wherein the second transmission unit transmits the voice recognition result translated by the translation unit.
  16. A computer-readable recording medium on which a program for executing the method of any one of claims 1 to 10 is recorded.
PCT/KR2017/013512 2017-11-23 2017-11-24 Method and device for providing integrated voice secretary service WO2019103200A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR20170157064 2017-11-23
KR10-2017-0157064 2017-11-23

Publications (1)

Publication Number Publication Date
WO2019103200A1 true WO2019103200A1 (en) 2019-05-31

Family

ID=66631529

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2017/013512 WO2019103200A1 (en) 2017-11-23 2017-11-24 Method and device for providing integrated voice secretary service

Country Status (1)

Country Link
WO (1) WO2019103200A1 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20010100386A (en) * 2000-05-01 2001-11-14 김헌재 Method to support business using multi-purpose call diverters, internet, and wireless data communication
KR100792208B1 (en) * 2005-12-05 2008-01-08 한국전자통신연구원 Method and Apparatus for generating a response sentence in dialogue system
KR20090000279A (en) * 2007-02-13 2009-01-07 홍성훈 Method for acquiring and providing knowledge using wired and wireless networks and the system therefor
KR20090002297A (en) * 2007-06-26 2009-01-09 신흥순 Cyber and mobile secretary system
KR20160142802A (en) * 2011-09-30 2016-12-13 애플 인크. Using context information to facilitate processing of commands in a virtual assistant

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021025542A1 (en) * 2019-08-08 2021-02-11 Samsung Electronics Co., Ltd. Method, system and device for sharing intelligence engine by multiple devices
US11490240B2 (en) 2019-08-08 2022-11-01 Samsung Electronics Co., Ltd. Method, system and device for sharing intelligence engine by multiple devices
CN112466300A (en) * 2019-09-09 2021-03-09 百度在线网络技术(北京)有限公司 Interaction method, electronic device, intelligent device and readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17933164

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17933164

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 20/01/2021)