WO2019045250A1

WO2019045250A1 - Push-to-talk communication service operation method and system using same

Info

Publication number: WO2019045250A1
Application number: PCT/KR2018/007623
Authority: WO
Inventors: 박규철
Original assignee: 주식회사 인스파이어모바일
Priority date: 2017-08-31
Filing date: 2018-07-05
Publication date: 2019-03-07
Also published as: KR20190024361A; KR102040370B1

Abstract

Disclosed are a service operation method that provides speech recognition-based text and converts text into speech data in a push-to-talk (PTT) service application, and a system using the same. The PTT communication service operation method is implemented in a user terminal equipped with a push-to-talk service application capable of speech recognition, and comprises the steps of: converting speech data (that is, input speech) received from the user terminal through a network into text through a speech recognition technique and transmitting the text to a receiver; converting text data including the text into speech through a speech synthesis technique and outputting the speech; and applying the transmitted and received speech or text content to an interactive user interface, wherein the user terminal outputs, through a speaker and a display screen, previously transmitted speech and the text converted by the speech recognition according to a user input or a predetermined input, such that a user can confirm the corresponding content.

Description

Push to talk communication service operation method and system using the same

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a service operation method that provides text based on voice recognition and converts text into voice data in a push-to-talk service application, and to a system using the method.

Push-to-talk (PTT) communication is a two-way communication method in which a user's voice is transmitted to the other party while a voice transmission button is pressed, and voice from the other party is received while the button is not pressed. It allows many people to communicate in a simple manner.
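The press-to-send, release-to-receive behaviour described above can be sketched as a very small state holder. This is only an illustration; the class name `PushToTalkButton` and the `direction` property are hypothetical and do not come from the publication.

```python
class PushToTalkButton:
    """Hypothetical model of the PTT button described in the text."""

    def __init__(self):
        # The terminal starts in the receiving state (button not pressed).
        self.pressed = False

    def press(self):
        self.pressed = True

    def release(self):
        self.pressed = False

    @property
    def direction(self):
        # Pressed: the user's voice is sent; not pressed: voice is received.
        return "transmit" if self.pressed else "receive"
```

A terminal would consult `direction` to decide whether microphone audio is sent or incoming audio is played.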

PTT communication based on Internet Protocol (IP) enables walkie-talkie-like voice communication without using the voice service network of a mobile communication terminal, by transmitting and receiving voice over the data network in various wireless communication environments such as Wi-Fi, 2G, 3G, LTE, and satellite.

These PTT communication technologies are utilized for collaboration in existing industrial sites for the purpose of task assignment and distribution in distribution, transportation, logistics, airports, factories, and construction sites.

However, when PTT communication is used in a harsh environment such as a domestic shipyard, a car factory, or a construction site, and the noise around the user is very strong, the caller's voice is not transmitted clearly and it is difficult for the recipient to properly understand it.

Also, because the medium of communication between the sender and the recipient is voice, PTT communication delivers the speaker's voice directly to the recipient terminal regardless of the recipient's current state, such as when the recipient is engaged in another task. As a result, important voice transmissions are often missed.

SUMMARY OF THE INVENTION

It is an object of the present invention to solve the above problems and to provide a method of operating a push-to-talk (PTT) communication service that allows a mobile communication terminal to convey information accurately even in a place where voice communication is difficult due to a poor environment, and that can provide a visual assistance function.

Another object of the present invention is to provide a method of operating a push to talk communication service in which a recipient does not miss an important message of a caller by sequentially displaying voice data transmitted in real time along with text on a dialog window basis.

It is another object of the present invention to provide a push-to-talk communication service operating method capable of visually assisting voice communication at a user terminal through a push-to-talk service application capable of voice recognition.

It is still another object of the present invention to provide a system using the push to talk communication service operating method.

According to an aspect of the present invention, there is provided a method of operating a push-to-talk (PTT) communication service, implemented in a user terminal equipped with a push-to-talk service application capable of voice recognition, the method comprising: converting voice data (that is, input voice) received through a network from the user terminal into text through a voice recognition technique and transmitting the text to a receiver; converting text data including the text into speech through a speech synthesis technique and outputting the speech; and applying an interactive user interface (UI) to the transmitted or received voice or text content, wherein the user terminal outputs, through a speaker and a display screen, previously transmitted voice and the text converted by speech recognition according to a user input or a predetermined input, so that the user can confirm the contents.

According to another aspect of the present invention, there is provided a method of operating a push-to-talk communication service, the method comprising: establishing a communication channel with a second user terminal through a network; detecting a text-to-speech (TTS) input while supporting text-to-speech communication in an interactive text window of a push-to-talk service application; and generating, according to the TTS input, a TTS request signal for converting text data input from the first or second user terminal in the interactive text window into voice data.

In one embodiment, the TTS request signal may be delivered to a TTS manager of the service application installed in the first or second user terminal that generated the signal. Alternatively, depending on the implementation, the TTS request signal may be transmitted, together with the corresponding text data, to the speech synthesis support apparatus connected through the network to the first or second user terminal that generated the signal. The TTS manager or the speech synthesis support apparatus converts the text data into voice data according to the TTS request signal, and the voice data can then be transmitted to the counterpart terminal of that user terminal.
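The two routing options for a TTS request (a TTS manager inside the local application, or a networked speech synthesis support apparatus) can be sketched as follows. This is a minimal sketch under stated assumptions: `TTSRequest`, `route_tts_request`, and the two synthesizer callables are hypothetical names, and the synthesizers are plain functions standing in for whatever engine an implementation would use.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class TTSRequest:
    text: str         # text data typed in the interactive text window
    sender_id: str    # terminal that generated the request (hypothetical field)
    receiver_id: str  # counterpart terminal that should receive the voice


def route_tts_request(
    request: TTSRequest,
    local_tts: Callable[[str], bytes],
    remote_tts: Optional[Callable[[str], bytes]] = None,
) -> Tuple[str, bytes]:
    """Convert the request text to voice and return (destination, voice data).

    Assumption: if a remote synthesis apparatus is configured it is preferred,
    otherwise the TTS manager inside the local application is used.
    """
    synthesize = remote_tts if remote_tts is not None else local_tts
    voice_data = synthesize(request.text)
    # The voice data is addressed to the counterpart terminal.
    return request.receiver_id, voice_data
```

The preference order between the local and remote synthesizer is a design choice of this sketch; the text only states that either may handle the request.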

According to another aspect of the present invention, there is provided a method of operating a push-to-talk communication service, the method comprising: establishing a communication channel with a second user terminal through a network; detecting a secret conversation setting input while supporting text conversation communication in an interactive text window of a push-to-talk service application; and changing the mode of the calling user terminal to a voice transmission mode in the interactive text window according to the secret conversation setting input, wherein the receiving user terminal switches its operation mode to a text reception mode according to a mode change request signal issued in response to the secret conversation setting input from the calling user terminal.

In one embodiment, a service application installed on the calling user terminal, or a voice recognition support device connected via a network to the first and second user terminals, may convert the voice data of the calling user terminal into text data and provide it to the receiving user terminal. Here, the voice data may be transmitted to the receiving user terminal together with or separately from the text data so that the receiver can check it later.
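The secret conversation flow can be illustrated with a small sketch: the caller switches to a voice transmission mode, the receiver switches to a text reception mode on a mode change request, and each utterance arrives as text with the voice kept for later review. The names `Mode`, `Terminal`, `start_secret_conversation`, and `deliver`, as well as the `play_now` flag, are assumptions made for illustration.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"
    VOICE_TRANSMISSION = "voice_transmission"
    TEXT_RECEPTION = "text_reception"


class Terminal:
    def __init__(self, name):
        self.name = name
        self.mode = Mode.NORMAL
        self.inbox = []

    def on_mode_change_request(self, mode):
        # The receiving terminal switches mode when the request arrives.
        self.mode = mode


def start_secret_conversation(caller, receiver):
    """Caller speaks; receiver is asked to read instead of listen."""
    caller.mode = Mode.VOICE_TRANSMISSION
    receiver.on_mode_change_request(Mode.TEXT_RECEPTION)


def deliver(caller, receiver, voice, recognize):
    """Convert the caller's voice to text and hand both to the receiver."""
    text = recognize(voice)  # recognizer is supplied by the caller of this sketch
    # In text reception mode the voice is stored but not played immediately.
    play_now = receiver.mode is not Mode.TEXT_RECEPTION
    receiver.inbox.append({"text": text, "voice": voice, "play_now": play_now})
```

Keeping the voice alongside the text mirrors the statement that the receiver can check the voice data later.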

According to another aspect of the present invention, there is provided a system using the push-to-talk communication service operation method, the system including a push-to-talk service application capable of voice recognition mounted on a user terminal or a counterpart terminal. The terminal includes: a first functional unit for converting a voice signal into text through a speech-to-text (STT) function while a voice transmission button is pressed; a second functional unit for simultaneously transmitting the converted text together with the speech; and a third functional unit for mapping the content transmitted by voice to the text transmitted with it, so that delivered PTT voice content can be found again through keyword search. The receiving-side wireless terminal may further include: a fourth functional unit for presenting the received text along with the received voice on a screen in the form of a dialog window; a fifth functional unit for converting received text into voice in real time through a text-to-speech (TTS) function and outputting it if no voice is received; and a sixth functional unit for mapping the content received by voice to the received text, so that delivered PTT voice content can be found again through keyword search.

In one embodiment, the calling terminal is a mobile terminal or a wireless terminal equipped with a push-to-talk service application. It supports a first mode in which voice is transmitted while the voice transmission button is pressed, a second mode in which a text transmission service is performed with the voice converted to text, and a third mode in which only text is transmitted. When the voice recognition service is not executed, only the modes that do not rely on it can be performed.

In one embodiment, the receiving-side terminal is a mobile terminal or a wireless terminal equipped with a push-to-talk service application. It supports a first mode in which a voice receiving service is performed while the voice transmission button is not pressed, a second mode in which text is displayed visually on the screen in the form of a dialog window, a third mode in which text is displayed in the form of a dialog window when only text is delivered, and a fourth mode in which text is converted into speech. If the voice synthesis service is not executed, the first mode or the third mode may be performed.
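One way to read the four receiving-side modes is as a selection function over what arrived and whether voice synthesis is running. The sketch below is an interpretation, not the publication's logic: the enum names, the `wants_speech` flag, and the assumption that the second mode applies when voice and text arrive together are all hypothetical.

```python
from enum import Enum


class ReceiveMode(Enum):
    VOICE = 1            # first mode: play received voice
    VOICE_WITH_TEXT = 2  # second mode: show text in the dialog window (assumed: alongside voice)
    TEXT_ONLY = 3        # third mode: show text when only text was delivered
    TEXT_TO_SPEECH = 4   # fourth mode: convert received text into speech


def select_receive_mode(has_voice, has_text, tts_running, wants_speech=False):
    """Pick a receiving mode; falls back to mode 3 when synthesis is not running."""
    if has_voice and has_text:
        return ReceiveMode.VOICE_WITH_TEXT
    if has_voice:
        return ReceiveMode.VOICE
    if has_text:
        if wants_speech and tts_running:
            return ReceiveMode.TEXT_TO_SPEECH
        # Without a running voice synthesis service, text is only displayed.
        return ReceiveMode.TEXT_ONLY
    raise ValueError("nothing was received")
```

With synthesis stopped, a text-only message is displayed rather than spoken, which matches the fallback to the first or third mode described above.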

In one embodiment, the service application maps transmitted and received voice and the text converted from that voice into groups, provides a function of searching the history of sent and received text by keyword, and provides a function of outputting the voice conversation content associated with the retrieved text so that the user can confirm it.
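The mapping between text and voice, and the keyword search over it, can be sketched with an in-memory history. This is a minimal sketch under assumptions: `HistoryEntry`, `ConversationHistory`, and the `voice_ref` field (a reference to a stored voice clip) are hypothetical, and a real application would likely persist the history.

```python
from dataclasses import dataclass


@dataclass
class HistoryEntry:
    timestamp: float  # when the utterance was sent or received
    speaker: str
    text: str         # text converted from the voice
    voice_ref: str    # hypothetical reference to the stored voice clip


class ConversationHistory:
    def __init__(self):
        self._entries = []

    def add(self, timestamp, speaker, text, voice_ref):
        # Each entry maps recognized text to the voice it came from.
        self._entries.append(HistoryEntry(timestamp, speaker, text, voice_ref))

    def search(self, keyword):
        """Return entries whose text contains the keyword, ignoring case."""
        needle = keyword.casefold()
        return [e for e in self._entries if needle in e.text.casefold()]
```

A caller would play back `voice_ref` for each hit so the user can confirm the original voice content.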

According to another aspect of the present invention, there is provided a system using the push-to-talk communication service operation method, the system including a server connected to user terminals through a network, wherein the server receives a download request signal for a PTT service application from a first user terminal or a second user terminal, and provides the PTT service application to the first user terminal or the second user terminal in response to the download request signal. Here, the PTT service application provides an interactive text window.

In one embodiment, the PTT service application may set the operating mode of the calling user terminal to a voice input mode according to a secret conversation request or a predetermined request that is activated via the user interface of the interactive text window.

In one embodiment, the PTT service application may set the operation mode of the receiving user terminal to a text output mode according to a secret conversation request or a predetermined request that is activated through the user interface of the interactive text window.

In one embodiment, the PTT service application may automatically convert the text data of the calling user terminal to voice data and transmit the voice data to the receiving user terminal according to a TTS request activated through another user interface of the interactive text window.

In one embodiment, the system using the push to talk communication service operating method may comprise a mobile terminal, a personal computer, or a desktop computer.

When the push-to-talk communication service operating method and the system using it are employed, voice communication between a user terminal and a counterpart terminal through a push-to-talk (PTT) service application capable of voice recognition can be supplemented with text based on the voice, and the like.

Further, according to the present invention, a system using the push-to-talk communication service operation method described above can be provided on various user terminals or counterpart terminals, such as mobile terminals and personal computers, through a push-to-talk service application capable of voice recognition. A push-to-talk service with voice recognition support can therefore be applied easily to various users and environments simply by installing a single service application.

In addition, according to the present invention, it is possible to provide an alternative method for clearly recognizing the real-time delivery contents of a caller even in a receiver environment where noise is strong and voice reception is difficult.

Further, according to the present invention, it is possible to confirm the voice contents of the PTT communication at a receiver terminal through keyword search even after a predetermined time has elapsed.

In addition, in a receiver situation where security is required or PTT voice communication is difficult, the voice of the calling side can be delivered as text alone, so the operation mode of the PTT communication service can be set according to the situation. This improves the usability of the system by increasing its adaptability to various usage environments. In addition, by providing an operating mode of the PTT communication service suited to the situation, the load on the data network can be substantially reduced.

FIG. 1 is a diagram schematically showing a configuration of a push-to-talk (PTT) communication service operating system according to an embodiment of the present invention.

FIG. 2 is a detailed diagram showing a configuration of a transmitting-side terminal and a receiving-side terminal that can be employed in the system of FIG. 1.

FIG. 3 is a detailed block diagram of a control unit of a transmitting terminal according to an embodiment of the present invention.

FIG. 4 is a detailed block diagram of a control unit of a receiving terminal according to an embodiment of the present invention.

FIG. 5 is a flowchart illustrating a push-to-talk communication service operation method according to another embodiment of the present invention.

FIG. 6 is an exemplary view of a display screen of a system using a push-to-talk communication service operating method according to another embodiment of the present invention.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings. In the following description of the exemplary embodiments of the present invention, descriptions of known techniques that are well known in the art and are not directly related to the present invention will be omitted.

Terms used in the following embodiments have the same meaning as commonly understood by one of ordinary skill in the art, and commonly used terms, such as those defined in dictionaries, should be construed as consistent with their contextual meaning in the related art. The following embodiments are provided as examples so that the ideas of the present invention can be sufficiently conveyed, and the present invention is not limited to the embodiments described below but may be embodied in other forms.

As shown in FIG. 1, the PTT communication service system of the present embodiment is configured so that a mobile terminal performs PTT communication with an external terminal based on a push-to-talk (PTT) communication service. That is, the PTT communication service operating system may include the PTT communication system 100, the transmitting terminal 200, the receiving terminal 300, the voice recognition support apparatus 400, and the voice synthesis support apparatus 500. The voice recognition support apparatus 400 may be referred to as a speech processing support apparatus. Further, when the two apparatuses are viewed as separate servers distinguished by function, the voice recognition support apparatus 400 may be referred to as a first server and the voice synthesis support apparatus 500 as a second server.

In the PTT communication service operating system of the present embodiment, the transmitting terminal 200 transmits the user's voice signal to the receiving terminal 300 in response to a pressing or activation signal generated while the voice transmission button is pressed or activated by the user. During operation of the PTT communication service, the voice recognition support apparatus 400 may be operated according to a predetermined operation mode or the user's selection to generate additional information for the data transmitted from the transmitting terminal. In this way, a supplementary communication service can be operated to suit the user environment.

In addition, while the PTT communication service is being operated, the receiving terminal 300 may operate the voice synthesis support apparatus 500 according to a predetermined operation mode or the user's selection, convert the received text data into voice data, and output the voice data through the speaker. In this way, a supplementary communication service can be operated to suit the user environment.

Here, the supplementary service can be one of a service for transmitting a voice signal from a user as text or a service for converting received text into voice and outputting the voice.

More specifically, once the transmitting terminal 200 forms a data communication channel with the receiving terminal 300, the PTT communication service operating system may be configured to transmit data to the voice recognition support apparatus 400 and the voice synthesis support apparatus 500.

For example, when the user presses the voice transmission button and a pressing signal is generated, the transmitting terminal 200 can operate a speech-to-text (STT) service that recognizes the voice signal from the user and generates text. In this case, the transmitting terminal 200 transmits the collected voice data to the receiving terminal 300 through the PTT communication system (corresponding to the network) and, at the same time, has the voice recognition support apparatus 400 perform voice recognition so that the voice can be converted into text. This provides the additional service while departing as little as possible from the characteristics of the PTT communication service.
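The idea of sending voice immediately while recognition runs in parallel can be sketched with a background worker, so that the voice path is not held up by speech recognition. The function name `on_button_press`, the message dictionaries, and the `send` and `recognize` callables are assumptions for illustration; the recognizer stands in for the voice recognition support apparatus 400.

```python
from concurrent.futures import ThreadPoolExecutor


def on_button_press(voice_data, send, recognize):
    """Send voice right away and send the recognized text once it is ready.

    `send` delivers a message to the receiving terminal; `recognize` converts
    voice data to text. Both are supplied by the caller of this sketch.
    """
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Start speech recognition in the background.
        pending_text = pool.submit(recognize, voice_data)
        # The voice is transmitted without waiting for recognition.
        send({"type": "voice", "payload": voice_data})
        text = pending_text.result()
    send({"type": "text", "payload": text})
    return text
```

Because the voice message is sent before the recognition result is awaited, a slow recognizer delays only the text, not the voice.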

Also, the receiving terminal 300 can operate a text to speech (TTS) service for converting the received text into voice. In this case, the receiving terminal 300 may output the received text to the display unit 340 or convert it into voice data through the voice synthesis support apparatus 500 and output the voice data to the speaker 330.

The voice recognition support apparatus 400 recognizes the voice data provided by the transmitting terminal 200 at its request, converts the recognized voice into text, and provides the text to the transmitting terminal 200. The voice recognition support apparatus 400 may be implemented as a separate server that the transmitting terminal 200 can reach through wireless communication such as Wi-Fi or short-range wireless communication, as another form of separate server, or as a service application run internally by a calling or receiving user terminal.

The voice synthesis support apparatus 500 recognizes the text delivered to it in support of the TTS service of the receiving terminal 300, converts the recognized text into voice data, and provides the voice data to the receiving terminal 300. The voice synthesis support apparatus 500 may be implemented as a separate server that the receiving terminal 300 can reach through Wi-Fi or short-range wireless communication, as a separate server accessible through a mobile communication system or an Internet network, or in a form that runs internally on the terminal.

As described above, the PTT communication service operating system according to the present embodiment converts a voice signal into text using a voice recognition function and converts text into voice using a voice synthesis function, so that a communication service better suited to the user's environment can be used.

In the following description, the configurations of the transmitting terminal 200 and the receiving terminal 300 are illustrated as separate configurations, but the present invention is not limited thereto. That is, the transmitting terminal 200 may serve as a receiving terminal in the process of using the PTT communication service with the receiving terminal 300, and the receiving terminal 300 may serve as a transmitting terminal. As a result, the configurations of the transmitting terminal 200 and the receiving terminal 300, which will be described below, may be integrated into one PTT communication terminal.

Accordingly, the configuration of the transmitting terminal 200 can be understood as a configuration that the receiving terminal 300 can have while performing the transmitting function, and the configuration of the receiving terminal 300 can be understood as a configuration that the transmitting terminal 200 can have while performing the receiving function.

Referring to FIG. 2, the transmitting terminal 200 can convert a voice signal into text using the voice recognition support apparatus 400 and transmit the converted text to the receiving terminal 300.

The transmitting terminal 200 may include the input unit 210, the microphone 220, the display unit 240, the communication unit 250, and the control unit 260 in order to support the PTT communication service operation according to the present embodiment.

The transmitting terminal 200 having the above-described configuration operates the voice recognition support apparatus 400 to provide a speech-to-text (STT) service that converts the voice signal input from the user into text and transmits it to the receiving terminal 300. To this end, the transmitting terminal 200 can control, according to the terminal setting, the conversion of the voice signal collected by the microphone 220 into text by voice recognition.

The transmission-side input unit 210 generates various input signals required for operation of the transmission-side terminal 200. The input unit 210 may be formed as a button for transmitting a voice by a user or may be provided as a touch map. The generated input signal is transmitted to the controller 260 and can perform a function support according to the input signal.

The transmitting-side microphone 220 is activated according to the operation of the transmitting-side terminal 200 to collect surrounding audio signals, particularly a voice signal. The voice signal collected by the transmitting-side microphone 220 is delivered to the transmitting-side control unit 260, is voice-recognized under the control of the transmitting-side control unit 260, and is converted into text and transmitted to the receiving-side terminal 300.

The transmission-side display unit 240 provides various screen interfaces necessary for operation of the transmission-side terminal 200. The transmitting-side display unit 240 can provide respective screens according to the type of communication service with the receiving-side terminal 300. For example, the transmission-side display unit 240 can output one of a screen showing a voice transmission status, a text service support screen, or a screen for outputting text generated by speech recognition, according to each service operation.

The transmitting side communication unit 250 can form a data communication channel with the receiving side terminal 300 through the communication system 100. The communication unit 250 may be configured as a communication module that supports various types of communication methods according to device characteristics of the transmitting terminal 200. For example, the communication unit 250 may include various communication modules such as a mobile communication module supporting 2G, 3G, long term evolution (LTE) and the like, and a communication module supporting Wi-Fi. In particular, the communication unit 250 may form a data communication channel for text transmission based on the speech recognition according to the present embodiment with the receiving side terminal 300 according to user input.

The transmission-side control unit 260 supports a signal control required for operation of the transmission-side terminal 200 according to the present embodiment. In particular, the transmission-side control unit 260 can control signal control and data transmission for supporting communication service operation of this embodiment. For this, the transmitting-side control unit 260 may include a configuration as shown in FIG.

Referring to FIG. 3, the transmission-side control unit 260 may include a voice processing unit 261, a text processing unit 262, an STT manager 263, a network support unit 265, and a media synchronization processing unit 266.

The transmission-side voice processing unit 261 processes the audio signals collected by the transmission-side microphone 220 to generate voice data. For example, the transmission-side voice processing unit 261 may be an encoding unit for processing voice signals. The voice data processed by the transmission-side voice processing unit 261 can be delivered to the STT manager 263.

The transmission-side text processing unit 262 converts the signals entered through the transmission-side input unit 210 and through the input function of the transmission-side display unit 240 into characters.

The STT manager 263 controls the voice recognition function of the transmitting terminal 200. The STT manager 263 can control voice recognition of the voice data delivered by the voice processing unit 261 so that it is converted into text. At this time, the STT manager 263 delivers the voice data provided by the voice processing unit 261 to the voice recognition support apparatus 400 in real time. The text processed by the STT manager 263 is delivered to the network support unit 265.

The transmission-side network support unit 265 can support activation control of the transmission-side communication unit 250 and formation of a PTT communication service channel with the receiving-side terminal 300 through the transmission-side communication unit 250. After the PTT communication service channel is connected, the network support unit 265 can transmit at least one of the voice data delivered from the voice processing unit 261 and the text data delivered from the STT manager 263 to the receiving-side terminal 300 through the transmission-side communication unit 250.

The transmission-side media synchronization processing unit 266 may receive, from the voice processing unit 261, the time stamp information indicating when the voice signal was collected, and may include that time stamp information in the text that is produced through the voice recognition support apparatus 400 and transmitted to the receiving-side terminal 300.
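Attaching the capture time of the voice to the recognized text can be sketched as follows. The data classes and the millisecond field `captured_at_ms` are hypothetical; the publication only states that time stamp information from the voice processing unit is included in the transmitted text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceChunk:
    captured_at_ms: int  # time at which the voice signal was collected
    audio: bytes


@dataclass(frozen=True)
class TextMessage:
    captured_at_ms: int  # copied from the voice the text was recognized from
    text: str


def stamp_text(chunk: VoiceChunk, recognized_text: str) -> TextMessage:
    """Carry the voice capture time over to the recognized text."""
    return TextMessage(captured_at_ms=chunk.captured_at_ms, text=recognized_text)
```

Using the capture time rather than the time recognition finished keeps text and voice aligned even when recognition is slow.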

As described above, the transmitting terminal 200 according to the embodiment of the present invention can operate the STT service according to the terminal setting.

Referring back to FIG. 2, the receiving terminal 300 supports connection of a corresponding communication service according to a communication service connection request of the transmitting terminal 200 and a mode set in the terminal.

The receiving-side input unit 310 is a configuration for generating various input signals required for the operation of the receiving-side terminal 300. The input unit 310 may be formed in a button shape or provided as a touch map. The generated input signal is transmitted to the control unit 360 and can perform the function support according to the input signal.

The receiver-side speaker 330 outputs the audio signals of the receiver-side terminal 300, including the audio signal received by the receiver-side communication unit 350. The receiver-side speaker 330 can be activated under the control of the receiver-side controller 360 to output the audio signal. In particular, when the TTS service is supported according to the setting of the receiver-side terminal 300, it can output a voice signal synthesized from the received text.

The receiving-side display unit 340 provides various screen interfaces necessary for the operation of the receiving-side terminal 300. The receiving-side display unit 340 can provide respective screens according to the communication service type with the transmitting-side terminal 200. For example, the receiving-side display unit 340 can individually output one of a screen showing a voice receiving state, a text service supporting screen, or a screen for outputting received text according to each service operation.

The receiving-side communication unit 350 can form a communication channel with the transmitting-side communication unit 250 of the transmitting-side terminal 200. To this end, the receiving-side communication unit 350 may be configured as a communication module capable of communicating with the transmitting-side communication unit 250.

The receiving-side control unit 360 supports the signal control necessary for the operation of the receiving-side terminal 300 according to the embodiment of the present invention. In particular, the receiving-side controller 360 can control signal control and data transmission for supporting communication service operation of the present embodiment.

Referring to FIG. 4, the receiving-side control unit 360 includes a receiving-side voice processing unit 361, a receiving-side text processing unit 362, a TTS manager 364, a network support unit 365, and a media synchronization processing unit 366.

The receiving-side voice processing unit 361 processes the voice signal transmitted through the PTT communication system and outputs voice.

The receiving-side text processing unit 362 may transmit the text transmitted through the PTT communication system to the receiving-side display unit 340 or transmit the text to the TTS manager 364 to perform the voice synthesizing function.

The TTS manager 364 controls the voice synthesis function of the reception-side terminal 300. The TTS manager 364 can have the text data delivered by the reception-side text processing unit 362 converted into speech through the speech synthesis support apparatus 500. At this time, the TTS manager 364 delivers the text data provided by the reception-side text processing unit 362 to the speech synthesis support apparatus 500 in real time.

The reception side network support unit 365 can support activation control of the reception side communication unit 350 and formation of a PTT communication service channel with the transmission side terminal 200 through the reception side communication unit 350. [ After the PTT communication service channel connection, at least one of the voice and text data transmitted through the receiving side communication unit 350 may be transmitted to the voice processing unit 361 or the text processing unit 362.

The receiving-side media synchronization processing unit 366 can arrange the text data so as to match the time stamp of the voice data, using the time stamp information extracted from the text data transmitted from the transmitting-side terminal 200.

As described above, in the PTT communication service operating system according to the present embodiment, the transmitting-side terminal 200 generates text through speech recognition using the speech recognition support apparatus 400 and transmits the text to the receiving-side terminal 300, and the receiving-side terminal 300 can convert the text into speech using the speech synthesis support apparatus 500 and output the speech.

In addition, the transmitting terminal and the receiving terminal may form a separate channel to support text transmission / reception in a state where a service channel for PTT communication is formed.

In addition, the transmitting terminal can transmit the text generated according to the Speech To Text (STT) service operation, which provides text based on speech recognition, to the receiving terminal together with the voice data transmitted for voice call service support.

In addition, the transmitting terminal can synchronize text and voice data using time stamp information indicating when the voice signals corresponding to the generated text data were collected.

Also, the receiving-side terminal can extract the time stamp information of the received voice data corresponding to the received text data from the data transmitted by the transmitting-side terminal.

Further, the receiving-side terminal can use the extracted time stamp information to arrange the text data on the screen so as to match the received voice.
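The time stamp handling described above can be sketched as follows. This is a minimal Python illustration, assuming that each text segment simply carries the collection time of the voice segment it was recognized from. The types `VoiceSegment` and `TextSegment`, the millisecond unit, and the function `align` are assumptions made for the example, not details from the disclosure.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class VoiceSegment:
    timestamp_ms: int   # assumed: time at which the voice signal was collected
    audio: bytes


@dataclass
class TextSegment:
    timestamp_ms: int   # assumed: copied from the voice segment it was recognized from
    text: str


def align(voice: List[VoiceSegment],
          texts: List[TextSegment]) -> List[Tuple[int, Optional[str]]]:
    """Pair each voice segment, in time order, with the text bearing the same time stamp."""
    by_time = {t.timestamp_ms: t.text for t in texts}
    ordered = sorted(voice, key=lambda v: v.timestamp_ms)
    return [(v.timestamp_ms, by_time.get(v.timestamp_ms)) for v in ordered]


# Segments may arrive out of order; the third one has no recognized text yet.
voice = [VoiceSegment(2000, b"\x01"), VoiceSegment(0, b"\x00"), VoiceSegment(4000, b"\x02")]
texts = [TextSegment(2000, "copy that"), TextSegment(0, "unit two, report")]
rows = align(voice, texts)
```

Each row can then be drawn on the screen next to the playback position of the matching voice segment. A segment without text is kept with `None` so that the display order still follows the audio.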

However, the present invention is not limited to the configuration described above. The voice recognition support apparatus and the voice synthesis support apparatus may be implemented as a single server system having a voice recognition function and a voice synthesis function, or may be implemented in the form of a service application on the first user terminal corresponding to the calling terminal and/or on the second user terminal. In that case, the service application may include a voice recognition function and a voice synthesis function.

Next, various aspects of a push to talk communication service operating method according to another embodiment of the present invention will be described.

FIG. 5 is a flowchart illustrating a push-to-talk communication service operation method according to another embodiment of the present invention. FIG. 6 is an exemplary view of a display screen of a system using a push-to-talk communication service operating method according to another embodiment of the present invention.

The push-to-talk communication service operating method according to the present embodiment may be carried out in a PTT communication service system including a first user terminal 20, a second user terminal 30, a voice recognition support device 400, and a voice synthesis support device 500.

The first user terminal 20 may correspond to a transmitting terminal and the second user terminal 30 may correspond to a receiving terminal, but the present invention is not limited thereto. The second user terminal 30 may correspond to a second mobile terminal or a second computing device capable of transmitting and receiving signals and data through a network.

Also, to use a push-to-talk (PTT) communication service, the first user terminal 20 and the second user terminal 30 can establish a communication channel according to a request from one of the terminals and a response from the other. The information and addresses of the speech recognition support device and the speech synthesis support device that support the PTT communication service can be shared when the communication channel is set up. In this case, the speech recognition support apparatus and the speech synthesis support apparatus may be referred to as a first server and a second server, respectively. The first server and the second server may be implemented as a single server system 600 having a voice recognition unit and a voice synthesis unit.

Meanwhile, the process of sharing information and addresses for the first server and the second server may be omitted when the voice recognition support apparatus and the voice synthesis support apparatus are implemented as functions of the service application or as software modules.
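The channel setup described above can be pictured with a small Python sketch. It assumes a setup record that optionally carries the addresses of the first server (speech recognition) and the second server (speech synthesis). The names `ChannelSetup` and `build_setup`, the `on_device` flag, and the placeholder host names are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChannelSetup:
    caller: str
    callee: str
    stt_address: Optional[str] = None  # first server (speech recognition), if remote
    tts_address: Optional[str] = None  # second server (speech synthesis), if remote


def build_setup(caller: str, callee: str, on_device: bool,
                stt_address: str = "stt.example.invalid",
                tts_address: str = "tts.example.invalid") -> ChannelSetup:
    # When recognition and synthesis run inside the service application,
    # there is no server address to share, so both fields stay empty.
    if on_device:
        return ChannelSetup(caller, callee)
    return ChannelSetup(caller, callee, stt_address, tts_address)


remote = build_setup("terminal-20", "terminal-30", on_device=False)
local = build_setup("terminal-20", "terminal-30", on_device=True)
```

Leaving the address fields empty in the on-device case mirrors the omission of the sharing step mentioned above.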

Referring to FIG. 5, when the PTT transmission button is activated in the first user terminal 20 (S51), the first user terminal 20 transmits voice data to the second user terminal 30 through the PTT communication service (hereinafter simply referred to as the PTT service) (S53).

The PTT transmission button may be at least one specific hardware button of the first user terminal 20. Alternatively, the PTT transmission button may be a button provided in the user interface of the PTT service application installed in the first user terminal 20. The button provided in the user interface may be a graphical user interface element, but the present invention is not limited thereto, and the button may be a virtual button recognized through voice recognition or screen image processing.

Meanwhile, the second user terminal 30 can output the voice data received from the first user terminal 20 through the speaker (S55). In addition, the second user terminal 30 may request the speech recognition support apparatus 400 to perform text conversion on the voice data (S57). This STT request may be performed according to the usage environment of the second user terminal 30, user setting, or real time user input command. The STT request message may include voice data or may include identification information of voice data.

In the case described above, the speech recognition support apparatus 400 may generate additional information according to the STT request (S59). Here, generating the additional information may include converting the voice data into text data. The converted text data may be transmitted back to the second user terminal 30 (S61). The second user terminal 30 may output the received text data on the screen of the PTT service application or on a screen displaying a text message or a multimedia message (S63).
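Steps S57 to S61 can be sketched in Python as below. The sketch assumes a request object that carries either the voice data itself or an identifier of voice data the server already holds, as the STT request description allows. `SttRequest`, `SttServer`, and the toy recognizer are hypothetical, and no real speech recognition engine is involved.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class SttRequest:
    voice_data: Optional[bytes] = None  # the audio itself, or
    voice_id: Optional[str] = None      # an identifier of audio the server already holds


class SttServer:
    """Hypothetical stand-in for the speech recognition support apparatus 400."""

    def __init__(self, recognize: Callable[[bytes], str]):
        self.recognize = recognize        # placeholder recognizer, not a real engine
        self.stored: Dict[str, bytes] = {}

    def handle(self, req: SttRequest) -> str:
        # Resolve the audio from the request body or from its identifier (S59).
        if req.voice_data is not None:
            audio = req.voice_data
        else:
            audio = self.stored[req.voice_id]
        return self.recognize(audio)      # text sent back to the requester (S61)


server = SttServer(recognize=lambda audio: audio.decode("utf-8"))  # toy recognizer
server.stored["msg-1"] = b"hold position"
by_data = server.handle(SttRequest(voice_data=b"move out"))
by_id = server.handle(SttRequest(voice_id="msg-1"))
```

Sending only an identifier avoids uploading the audio a second time when the server has already seen it, which is one plausible reason for allowing both request forms.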

Meanwhile, the voice recognition support apparatus 400 may monitor the voice data of the first user terminal 20 and automatically store the voice data or convert it into text data according to a user setting corresponding to the first user terminal 20. The converted text data may be provided to the second user terminal 30 registered on the current data channel, or to a preset location or address.

The first user terminal 20 can also exchange text messages 70 with the second user terminal 30 via the interactive text window 60 of the push-to-talk (PTT) service application 50. The interactive text window 60 may include a user interface or input/output interface such as a character input window 80, a transmission button 86, and a keyboard 90. The second user terminal 30 may likewise output its own user's text messages and the text messages transmitted from the first user terminal 20 to the interactive text window through the PTT service application.

Referring again to FIG. 5, when a PTT text transmission input is detected in the interactive text window (S71), the first user terminal 20 or the PTT service application (also simply referred to as a service application) transmits the text message to the second user terminal 30 through the data communication network (S73).

The text message may include a TTS request message requesting to convert the text data into voice data. The TTS request may be entered in a toggle manner or on / off manner via the TTS button 82 located in the interactive text window 60, as shown in FIG.

In this case, the voice synthesis support apparatus 500 monitors, in real time, the text data of the service user or the text message containing the text data on the TTS communication system including the data communication network and, in accordance with the TTS request message, may convert the text data into voice data and transmit the voice data to the second user terminal 30 (S75, S77).
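The TTS toggle and the conditional conversion can be sketched as follows. This Python example assumes that the TTS request travels as a flag on the text message. `TextMessage`, `TextWindow`, `relay`, and the toy synthesizer are hypothetical names introduced for the illustration, and the "audio" returned here is only the encoded text.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TextMessage:
    text: str
    tts_requested: bool = False  # assumed flag carried with the text message


class TextWindow:
    """Hypothetical model of the interactive text window with a TTS button."""

    def __init__(self) -> None:
        self.tts_on = False

    def toggle_tts(self) -> None:
        # Pressing the TTS button flips the request on or off.
        self.tts_on = not self.tts_on

    def compose(self, text: str) -> TextMessage:
        return TextMessage(text, tts_requested=self.tts_on)


def relay(msg: TextMessage, synthesize: Callable[[str], bytes]) -> Optional[bytes]:
    # Convert to voice only when the message carries a TTS request (S75).
    return synthesize(msg.text) if msg.tts_requested else None


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")  # toy synthesizer, not real audio


window = TextWindow()
plain = relay(window.compose("hello"), fake_tts)    # TTS off: no voice produced
window.toggle_tts()
voiced = relay(window.compose("hello"), fake_tts)   # TTS on: voice data produced
```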

On the other hand, when the speech synthesis support apparatus 500 is mounted on the first user terminal 20 in the form of a software module, the TTS manager of the first user terminal 20 (see 364 in FIG. 4) may convert the text data into voice in the on-board TTS module according to a predetermined processing procedure corresponding to the command, or may convert it into voice in cooperation with an external voice synthesis support device, so that the voice is transmitted to the second user terminal 30.

For example, after the first user terminal establishes a communication channel with the second user terminal via a network, the push-to-talk service application may detect a text-to-speech (TTS) input or command while supporting text communication in the interactive text window, and may generate, according to the detected TTS input or command, a TTS request signal for converting the text data entered at the user terminal in the interactive text window into voice data.

The generated TTS request signal may be transmitted to the TTS manager of the service application installed in the user terminal that generated the signal. Alternatively, depending on the implementation, the generated TTS request signal may be transmitted through the network, together with the corresponding text data, to the voice synthesis support apparatus connected to the user terminal that generated the TTS request signal. In this case, the TTS manager or the voice synthesis support apparatus can convert the text data into voice data according to the TTS request signal and operate so that the converted voice data is transmitted to the counterpart terminal of that user terminal.

In addition, in the method for operating a push-to-talk communication service according to the present embodiment, after the first user terminal establishes a communication channel with the second user terminal through a network, a secret conversation setting input can be detected while text chat communication is supported in the interactive text window of the push-to-talk service application.

The secret conversation setting input may be generated or suspended in a toggle manner or in an active / inactive manner via a predetermined button (S, 84) disposed in the interactive text window 60 as shown in FIG.

In the above case, the first user terminal 20, that is, the calling-side user terminal, is switched to the voice transmission mode in the interactive text window according to the secret conversation setting input, and the second user terminal 30, that is, the receiving-side user terminal, can switch its operation mode to the text receiving mode according to the mode switching request signal, corresponding to the secret conversation setting input, from the calling-side user terminal.

At this time, the service application installed on the calling-side user terminal, or the voice recognition support device connected to the first or second user terminal through the network, converts the text data of the calling-side user terminal into voice data according to the signal corresponding to the TTS request and transmits the voice data to the receiving-side user terminal. Here, the text data may be transmitted to the receiving-side user terminal together with the voice data or separately, so that the user of the receiving-side user terminal can check it later.
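The mode switching triggered by the secret conversation setting input can be modeled as a small state machine. The Python sketch below covers only the mode change itself, following the wording of the mode-switching paragraph above: the calling side enters a voice transmission mode, the peer enters a text receiving mode, and a second toggle restores both. The `Mode` values and the `Terminal` class are hypothetical, and the mode switching request signal is reduced to a direct method call.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    NORMAL = "normal"
    VOICE_TX = "voice transmission"
    TEXT_RX = "text receiving"


class Terminal:
    """Hypothetical user terminal that reacts to the secret conversation toggle."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.mode = Mode.NORMAL
        self.peer: Optional["Terminal"] = None

    def toggle_secret(self) -> None:
        # Calling side: flip the local mode, then signal the peer to follow.
        entering = self.mode is Mode.NORMAL
        self.mode = Mode.VOICE_TX if entering else Mode.NORMAL
        if self.peer is not None:
            self.peer.on_mode_switch_request(entering)

    def on_mode_switch_request(self, entering: bool) -> None:
        # Receiving side: follow the mode switching request signal.
        self.mode = Mode.TEXT_RX if entering else Mode.NORMAL


caller, receiver = Terminal("terminal-20"), Terminal("terminal-30")
caller.peer, receiver.peer = receiver, caller
caller.toggle_secret()
modes_on = (caller.mode, receiver.mode)
caller.toggle_secret()
modes_off = (caller.mode, receiver.mode)
```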

On the other hand, if a text message including a TTS request is detected by the voice synthesis support apparatus 500, the voice synthesis support apparatus 500 may provide a dummy message for the text message to the second user terminal 30 (S81). The dummy message may contain no text message itself, but may include information indicating a record that the text message was delivered and location information indicating where the text message is stored. With this dummy message, the user of the second user terminal 30 can transmit a signal requesting the corresponding text data to the speech synthesis support apparatus 500 or the like at a later time, such as after the TTS request is terminated (S83).
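The dummy message mechanism (S81, S83) can be sketched in Python as below. The sketch assumes that the server keeps the original text under a message identifier and hands the receiver only a placeholder recording the delivery and the storage location. `DummyMessage`, `TtsServer`, and the use of the identifier as the location are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class DummyMessage:
    delivered: bool   # record that a text message was delivered
    location: str     # where the stored text can be requested later


class TtsServer:
    """Hypothetical stand-in for the voice synthesis support apparatus 500."""

    def __init__(self) -> None:
        self.store: Dict[str, str] = {}

    def on_tts_text(self, msg_id: str, text: str) -> DummyMessage:
        # Keep the text on the server and give the receiver a placeholder only (S81).
        self.store[msg_id] = text
        return DummyMessage(delivered=True, location=msg_id)

    def fetch(self, dummy: DummyMessage) -> str:
        # Later request for the original text, e.g. after the TTS request ends (S83).
        return self.store[dummy.location]


server = TtsServer()
dummy = server.on_tts_text("msg-7", "meet at gate B")
later_text = server.fetch(dummy)
```

The placeholder carries no message body, so the receiver sees that something was delivered but must ask the server to read the text.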

While the preferred embodiments of the present invention have been shown and described, it is to be understood that the foregoing description is by way of example only and is not to be construed in any way as limiting. The scope of the invention should be determined by rational interpretation of the appended claims.

Claims

1. A service operating system including a mobile terminal for performing a push to talk (PTT) communication service,

A transmitting side terminal for converting user input voice data into text and transmitting the converted text data according to a setting mode or a user setting mode in a pressed state of a button capable of transmitting voice;

A receiving terminal for converting the text data received after the PTT communication service is connected by the request of the transmitting terminal into a voice according to a preset mode or a mode set by the user and outputting the voice; and

A voice recognition support device for converting voice data input from a user into text data or converting received text data into voice data and outputting the voice data;

A push to talk communication service operating system.
2. The system according to claim 1,

Wherein the transmitting terminal and the receiving terminal form a separate channel to support text transmission and reception in a state where a service channel for PTT communication is formed.
3. The system of claim 2,

Wherein the transmitting terminal transmits the text generated according to the Speech To Text (STT) service operation, which provides voice recognition-based text, to the receiving terminal together with the voice data transmitted for supporting the voice call service.
4. The system of claim 3,

Wherein the transmitting terminal performs synchronization of the text and the voice data using time stamp information indicating when the voice signal corresponding to the generated text data was collected.
5. The system according to claim 1,

Wherein the receiving terminal extracts time stamp information of the received voice data corresponding to the received text data from data transmitted by the transmitting terminal.
6. The system of claim 5,

Wherein the receiving terminal uses the extracted time stamp information to arrange the text data on the screen so as to match the received voice.