WO2015111256A1 - Speech adjustment system, server, and in-vehicle device - Google Patents

Speech adjustment system, server, and in-vehicle device

Info

Publication number
WO2015111256A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
voice
processing unit
server
vehicle device
Prior art date
Application number
PCT/JP2014/077446
Other languages
French (fr)
Japanese (ja)
Inventor
古郡 弘滋
Original Assignee
Clarion Co., Ltd. (クラリオン株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Clarion Co., Ltd.
Priority to JP2015558732A (JPWO2015111256A1)
Publication of WO2015111256A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 13/00: Speech synthesis; Text to speech systems
    • G10L 13/08: Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 13/00: Speech synthesis; Text to speech systems
    • G10L 13/02: Methods for producing synthetic speech; Speech synthesisers
    • G10L 13/033: Voice editing, e.g. manipulating the voice of the synthesiser

Definitions

  • The present invention relates to a voice adjustment system, a server, and an in-vehicle device.
  • As background art in this technical field, there is JP-A-2006-301059 (Patent Document 1).
  • This publication states that "a voice output request including a voice data ID or text data is converted using a conversion table corresponding to the voice quality of the narrator set by the user; if the converted data is a voice data ID, the corresponding voice data is output as voice, and in the case of text data, the voice synthesis unit synthesizes voice data and outputs it as voice."
  • an object of the present invention is to provide a voice adjustment system, a server, and an in-vehicle device that can diversify information to be provided to a user without depending on the storage capacity and processing capability of the in-vehicle device.
  • To achieve this object, a voice adjustment system of the present invention includes a mobile terminal that receives user operations, a server with which the mobile terminal can communicate, and an in-vehicle device that outputs voice based on voice data.
  • The mobile terminal has a terminal-side processing unit that accepts input of parameters for voice adjustment via user operation; the server has a storage unit that stores data indicating utterance content and a server-side processing unit that acquires the parameters and generates voice data from the stored data based on those parameters; and the in-vehicle device has a vehicle-side processing unit that acquires the voice data and outputs voice based on the acquired voice data.
  • According to the present invention, it is possible to diversify the information provided to the user without depending on the storage capacity and processing capability of the in-vehicle device.
  • FIG. 1 is a diagram showing an audio adjustment system according to an embodiment of the present invention.
  • FIG. 2 is a block diagram showing the configuration of the in-vehicle device.
  • FIG. 3 is a block diagram showing the configuration of the mobile terminal.
  • FIG. 4 is a diagram showing an operation when the mobile terminal executes the voice adjustment program.
  • FIG. 5 shows an example of the adjustment screen.
  • FIG. 6 is a diagram illustrating an operation at the time of starting the in-vehicle device.
  • FIG. 7 shows an example of the edit screen.
  • FIG. 8 is a diagram illustrating an operation when the user of the in-vehicle device performs destination setting using voice recognition.
  • FIG. 9 is a diagram showing an example of the operation when the mobile terminal executes the mail reading program.
  • FIG. 1 is a diagram showing an audio adjustment system according to an embodiment of the present invention.
  • The voice adjustment system 1 includes an in-vehicle device 100 mounted on a vehicle 10 that is a mobile body, a mobile terminal 200 carried by various users, and a service providing server 300 connected to a public communication line network N1 such as the Internet. As will be described later, the in-vehicle device 100 and the mobile terminal 200 can be connected by short-range wireless communication, and the mobile terminal 200 and the service providing server 300 can be connected to each other via the public communication line network N1.
  • FIG. 2 is a block diagram showing the configuration of the in-vehicle device 100.
  • the in-vehicle device 100 includes an information processing unit 101, a storage unit 102, a display unit 103, an input unit 104, a GPS receiving device 105, a near field communication unit 106, and a voice input / output unit 107. Each of these devices is electrically connected by a bus 108 and configured to be able to exchange data with each other.
  • the information processing unit 101 includes a CPU 109, a ROM 110, a RAM 111, and peripheral circuits (not shown) connected to each other via a bus 112, and functions as a computer (vehicle-mounted processing unit) that centrally controls the vehicle-mounted device 100.
  • By executing a control program stored in the storage unit 102, the information processing unit 101 of the in-vehicle device 100 executes various processes, such as the route guidance processing and voice processing performed by known car navigation and car audio devices.
  • For example, the information processing unit 101 identifies the current location of the vehicle 10 based on the GPS radio waves received by the GPS receiver 105, and searches for a route from a departure place (for example, the current location) to a destination using the map data stored in the storage unit 102.
  • The information processing unit 101 also generates graphics information such as a map image and traffic information and outputs it to the display unit 103, and reads voice data for route guidance from the storage unit 102 and emits a guidance voice from a speaker (not shown) installed in the vehicle.
  • The storage unit 102 stores the control program executed by the information processing unit 101 and various data.
  • For example, an HDD (Hard Disk Drive), a semiconductor memory, a CD-ROM, or a DVD-ROM can be used as the storage unit 102.
  • The storage unit 102 also stores a control program that performs navigation processing and a voice interactive application program (hereinafter referred to as a conversation program) that transmits voice uttered by the user to the service providing server 300 via the mobile terminal 200.
  • These programs may be programs acquired (downloaded) via the public communication network N1 (FIG. 1), or may be programs installed in the in-vehicle device 100 in advance.
  • the display unit 103 displays various images under the control of the information processing unit 101.
  • a liquid crystal display device is used for the display unit 103.
  • the input unit 104 is a device that detects a user operation and notifies the information processing unit 101, and includes an operation switch and a transmissive touch panel arranged on the display screen.
  • the GPS receiver 105 receives GPS radio waves transmitted from GPS satellites, calculates the position and direction of the vehicle based on the GPS radio waves, and outputs them to the information processing unit 101.
  • the short-range communication unit 106 is a wireless communication interface for performing short-range wireless communication under the control of the information processing unit 101. By using the short-range communication unit 106, wireless communication can be performed between the in-vehicle device 100 and the portable terminal 200 in the vehicle.
  • For the short-range wireless communication, Bluetooth (registered trademark) or Wi-Fi (registered trademark), for example, is used.
  • the voice input / output unit 107 includes a decoder, an amplifier, and the like, and outputs various voices from a speaker installed in the vehicle 10 under the control of the information processing unit 101.
  • the audio input / output unit 107 generates an audio signal from the audio data stored in the storage unit 102 and emits the sound from a speaker.
  • Specific examples of the voice include guidance voice for navigation, CD voice, radio voice, and various voices transmitted from the portable terminal 200.
  • the voice input / output unit 107 has a microphone and an AD conversion circuit, and has a function of converting voice uttered by the user into voice data.
  • the in-vehicle device 100 may further include other configurations included in a known in-vehicle device.
  • FIG. 3 is a block diagram illustrating a configuration of the mobile terminal 200.
  • The mobile terminal 200 is, for example, a smartphone or a PDA (Personal Digital Assistant).
  • The mobile terminal 200 includes an information processing unit 201, a storage unit 202, a display unit 203, an input unit 204, a GPS receiving device 205, a communication unit 206, a short-range communication unit 207, and a voice input / output unit 208.
  • Each of these devices is electrically connected by a bus 209 and configured to be able to exchange data with each other.
  • the information processing unit 201 includes a CPU 210, a ROM 211, a RAM 212, and peripheral circuits (not shown) connected to each other via a bus 213, and functions as a computer (terminal side processing unit) that centrally controls the mobile terminal 200.
  • By executing a control program stored in the storage unit 202, the information processing unit 201 of the mobile terminal 200 provides the functions of a known mobile terminal: a telephone function, a mail function for sending, receiving, and browsing e-mail, a web browsing function, and a function of executing various application programs acquired from the Internet or the like.
  • the storage unit 202 stores a control program executed by the information processing unit 201 and various data.
  • a semiconductor memory or an HDD is applied to the storage unit 202.
  • The storage unit 202 stores an application program for adjusting the voice (navigation guidance voice) used in the in-vehicle device 100 (hereinafter referred to as a voice adjustment program), an application program for communicating with the in-vehicle device 100 (hereinafter referred to as a communication program), and an application program for editing text data stored in the service providing server 300 (hereinafter referred to as an editing program).
  • These programs may be programs acquired (downloaded) via the public communication network N1 (FIG. 1), or may be programs installed in advance in the mobile terminal 200.
  • the display unit 203 is a device that displays various images under the control of the information processing unit 201.
  • For example, a liquid crystal display device is used as the display unit 203.
  • the input unit 204 is a device that detects a user operation and notifies the information processing unit 201, and includes an operation switch and a transmissive touch panel arranged on the display screen.
  • the GPS receiving device 205 receives GPS radio waves transmitted from GPS satellites, calculates the current location and orientation of the mobile terminal 200 based on the GPS radio waves, and outputs them to the information processing unit 201.
  • The communication unit 206 is a communication interface that accesses the public communication line network N1 (FIG. 1) via a wireless communication network (a mobile phone communication network in this embodiment) and communicates with devices connected to the public communication line network N1.
  • The communication unit 206 enables communication between the mobile terminal 200 and the service providing server 300 (FIG. 1). The communication unit 206 also enables calls and mail exchange with other mobile terminals.
  • the near field communication unit 207 is a wireless communication interface for performing near field wireless communication under the control of the information processing unit 201. By using the short-range communication unit 207, wireless communication can be performed between the mobile terminal 200 and the in-vehicle device 100.
  • The voice input / output unit 208 outputs various voices from a speaker included in the mobile terminal 200 under the control of the information processing unit 201, and inputs voice uttered by the user via a microphone and converts it into voice data. By using the voice input / output unit 208, it is possible to make calls with other mobile terminals and to reproduce voice data, such as music, stored in the storage unit 202 of the mobile terminal 200.
  • The mobile terminal 200 may further include other configurations included in a known mobile terminal.
  • the service providing server 300 is a server that provides a service such as creating voice data from text data, that is, a TTS (Text to speech) service.
  • The service providing server 300 includes an information processing unit 301, a storage unit 302, and a communication unit 303, and is configured with higher processing capability and larger storage capacity than the in-vehicle device 100.
  • the information processing unit 301 includes a CPU, a ROM, a RAM, and the like, and functions as a computer (server side processing unit) that centrally controls the service providing server 300.
  • the storage unit 302 stores a control program executed by the information processing unit 301 and various data.
  • the information processing unit 301 can function as a TTS engine that performs voice conversion processing for creating voice data from text data by executing a control program stored in the storage unit 302.
  • The storage unit 302 also stores a database of text data indicating the utterance content (such as route guidance voice) reproduced by the in-vehicle device 100, and a point-of-interest (POI) database used for route search and the like.
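As a rough illustration of the two server-side stores described above, the following sketch models the fixed-phrase text database and the POI database as plain Python structures. All contents and names here are hypothetical examples; the patent does not specify a schema.

```python
# Illustrative sketch of the databases held in storage unit 302:
# fixed-phrase text data and a POI database. Entries are invented examples.

PHRASE_DB = {
    "greeting": "Hello, Mr. ***",
    "arrival": "Thank you for your hard work",
}

POI_DB = [
    {"name": "Central Station", "lat": 35.68, "lon": 139.77},
    {"name": "Central Park", "lat": 35.67, "lon": 139.75},
]

def poi_search(keyword: str) -> list[dict]:
    """Search the POI database using a recognition result as the key (cf. step S1C)."""
    return [p for p in POI_DB if keyword.lower() in p["name"].lower()]

hits = poi_search("central")
```

A real service would back these with a full database and geospatial index; the dictionaries above only show the shape of the data the description implies.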
  • FIG. 4 is a diagram illustrating the operation when the mobile terminal 200 executes the voice adjustment program.
  • When the information processing unit 201 of the mobile terminal 200 executes the voice adjustment program in response to a user operation, it first performs a login process to the service providing server 300 (step S1A).
  • In this login process, the information processing unit 201 accesses the service providing server 300 through the communication unit 206, causes the display unit 203 to display a login screen, and transmits the login information entered by the user (for example, a user ID and a password) to the service providing server 300.
  • When the login is completed, the information processing unit 301 of the service providing server 300 displays a voice adjustment screen (adjustment screen) on the screen of the mobile terminal 200 (step S2A).
  • FIG. 5 is a diagram showing an example of the adjustment screen.
  • The adjustment screen is a screen that accepts input of parameters for voice adjustment; more specifically, it accepts input of the voice's pitch, speed, height, intonation, gender, and type.
  • the parameters are not limited to the above parameters, and may be added or changed as appropriate.
  • The information processing unit 301 of the service providing server 300 stores (saves) these parameter groups in the storage unit 302 as individual TTS parameters (step S3A).
  • The storage unit 302 stores the individual TTS parameters in association with identification information (a user ID or terminal ID) for identifying the mobile terminal 200.
  • As described above, the user of the mobile terminal 200 can set desired parameters by operating the mobile terminal 200 that he or she owns and store them in the service providing server 300.
  • When these parameters are set (when the voice adjustment program is executed), it is not necessary to connect the mobile terminal 200 to the in-vehicle device 100. The user can therefore set parameters not only in the vehicle 10 but also at any place outside the vehicle 10.
  • Note that the service providing server 300 may convert predetermined sample text data into voice data based on the individual TTS parameters and send the voice data to the mobile terminal 200 so that the user can reproduce and confirm the adjusted voice.
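The parameter-setting flow of steps S1A to S3A can be sketched with plain function calls standing in for the network exchange between the mobile terminal 200 and the service providing server 300. The session handling, token format, and parameter names are assumptions for illustration only.

```python
# Sketch of FIG. 4 (steps S1A-S3A). Real communication would go through the
# mobile terminal's communication unit 206; here simple dicts play the roles
# of the server's session store and storage unit 302.

SESSIONS = {}       # login sessions (step S1A)
TTS_PARAMS = {}     # individual TTS parameters, keyed by user ID (step S3A)

def login(user_id: str, password: str) -> str:
    """Step S1A: log the mobile terminal in and return a session token
    (stand-in for real authentication)."""
    token = f"session-{user_id}"
    SESSIONS[token] = user_id
    return token

def submit_parameters(token: str, params: dict) -> bool:
    """Steps S2A-S3A: accept adjustment-screen input and save it per user."""
    user_id = SESSIONS.get(token)
    if user_id is None:
        return False
    # Keep only the adjustment items named in the description.
    allowed = {"pitch", "speed", "height", "intonation", "gender", "type"}
    TTS_PARAMS[user_id] = {k: v for k, v in params.items() if k in allowed}
    return True

token = login("user-001", "secret")
ok = submit_parameters(token, {"pitch": 1.1, "speed": 0.9, "gender": "female"})
```

Because the terminal talks only to the server, this flow needs no connection to the in-vehicle device, matching the point made above.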
  • FIG. 6 is a diagram illustrating an operation when the in-vehicle device 100 is activated.
  • In this case, it is assumed that the in-vehicle device 100 and the mobile terminal 200 are connected for communication.
  • When activated, the information processing unit 101 of the in-vehicle device 100 causes the mobile terminal 200 to execute the communication program. As a result, the in-vehicle device 100 can access and communicate with the service providing server 300 via the mobile terminal 200.
  • The information processing unit 101 of the in-vehicle device 100 then downloads the voice data adjusted by the individual TTS parameters from the service providing server 300 (step S1B).
  • In this case, the service providing server 300 may acquire identification information (a user ID or terminal ID) from the mobile terminal 200 and identify the corresponding individual TTS parameters based on that information. The method is not limited to this; any known technique for identifying the individual TTS parameters corresponding to the mobile terminal 200 or the in-vehicle device 100 is widely applicable.
  • The voice data is obtained by voice-converting the group of fixed-phrase text data stored in the storage unit 302 of the service providing server 300. That is, the service providing server 300 converts each piece of text data into voice data with its TTS engine based on the individual TTS parameters. In this way, voice data corresponding to the text of each entry is generated with the voice adjusted by the individual TTS parameters.
  • This voice conversion process may be executed when a download instruction is issued from the in-vehicle device 100, or at a timing before the download instruction (for example, at an appropriate timing after the individual TTS parameters are stored).
  • The information processing unit 101 of the in-vehicle device 100 stores the downloaded voice data in the storage unit 102, and the stored voice data is reproduced by the voice input / output unit 107 at a timing specified in advance (step S2B).
  • For example, immediately after the download, phrases such as "Hello, Mr. ***", "Happy birthday", and "Let's drive safely today as well" can be played, and "Thank you for your hard work" can be played at the timing of arriving at a home location set in advance.
  • In this way, the service providing server 300 can convert text into the voice desired by the user of the mobile terminal 200, and that voice can be reproduced by the in-vehicle device 100.
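Steps S1B and S2B, in which the server converts the fixed phrases with the stored individual TTS parameters and the in-vehicle device caches and plays the result, could look roughly like this. The `synthesize()` function is a hypothetical stand-in for the server's TTS engine; a real implementation would return audio data rather than a tagged string.

```python
# Sketch of steps S1B-S2B. Phrase keys and the tagged-string "audio" format
# are invented for illustration.

FIXED_PHRASES = {
    "greeting": "Hello, Mr. ***",
    "arrival": "Thank you for your hard work",
}

def synthesize(text: str, params: dict) -> str:
    """Hypothetical TTS engine: tag the text with the adjustment parameters."""
    tag = ",".join(f"{k}={v}" for k, v in sorted(params.items()))
    return f"<audio {tag}>{text}</audio>"

def download_voice_data(user_params: dict) -> dict:
    """Step S1B: server converts every fixed phrase with the user's
    individual TTS parameters and returns the resulting voice-data set."""
    return {key: synthesize(text, user_params)
            for key, text in FIXED_PHRASES.items()}

# In-vehicle device side: cache the downloaded data in storage unit 102.
local_cache = download_voice_data({"pitch": 1.2, "speed": 0.9})

def play(phrase_key: str) -> str:
    """Step S2B: reproduce a cached phrase at a pre-specified timing."""
    return local_cache[phrase_key]
```

Pre-converting and caching in this way is what lets playback work regardless of the in-vehicle device's own processing capability, per the point of the embodiment.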
  • The text data stored in the service providing server 300 can be edited by the user operating the mobile terminal 200 and executing the editing program.
  • When the editing program is executed, a login process similar to step S1A shown in FIG. 4 is performed; when the login is completed, the information processing unit 301 of the service providing server 300 displays a text data editing screen (editing screen) on the screen of the mobile terminal 200.
  • FIG. 7 is a diagram showing an example of the editing screen.
  • This editing screen is a screen that accepts editing of text data of fixed phrases corresponding to the audio content reproduced by the in-vehicle device 100.
  • a text data group stored in the storage unit 302 of the service providing server 300 is displayed on the editing screen, and editing of each text data is accepted.
  • The edited text data is stored in the storage unit 302 in association with identification information (a user ID or terminal ID) for identifying the mobile terminal 200.
  • the user of the mobile terminal 200 can edit the text data by operating the mobile terminal 200 that the user owns, and store the text data in the service providing server 300.
  • the user can edit text data at an arbitrary location such as outside the vehicle.
  • FIG. 8 is a diagram illustrating an operation when the user of the in-vehicle device 100 performs destination setting using voice recognition. In this case as well, it is assumed that the in-vehicle device 100 and the mobile terminal 200 are connected for communication. The information processing unit 101 of the in-vehicle device 100 executes the conversation program in response to a user operation and waits for input of a voice specifying a destination via the voice input / output unit 107; when a voice is input, the corresponding voice data is transmitted to the service providing server 300 via the mobile terminal 200. The information processing unit 301 of the service providing server 300 recognizes the voice data received via the mobile terminal 200 and performs a POI search (POI database search) using the voice recognition result as a search key (step S1C).
  • Next, the information processing unit 301 of the service providing server 300 uses the TTS engine to generate voice data that reads out the search results using the individual TTS parameters, and transmits the voice data to the in-vehicle device 100 via the mobile terminal 200 (step S2C).
  • In other words, voice data adjusted by the individual TTS parameters, that is, voice data with the voice desired by the user, is generated and transmitted to the in-vehicle device 100.
  • the information processing unit 101 of the in-vehicle device 100 performs a process of reproducing the received audio data (step S3C).
  • As a result, the in-vehicle device 100 reproduces the search results with the voice desired by the user, for example, "The search results are XX" or "First, ... Second, ...".
  • Thereafter, the information processing unit 101 of the in-vehicle device 100 performs processing (destination setting) for searching for a recommended route from the current location to the destination (step S4C).
  • When route guidance is subsequently performed, the guidance voice is emitted as a voice adjusted by the service providing server 300 based on the individual TTS parameters. That is, by having the service providing server 300 convert the guidance voice text data stored in the storage unit 302 into voice data based on the individual TTS parameters in advance, transmitting that voice data from the service providing server 300 to the in-vehicle device 100 via the mobile terminal 200, and having the in-vehicle device 100 store it in the storage unit 102, the in-vehicle device 100 can reproduce the adjusted guidance voice. In this way, route guidance can be performed with the voice desired by the user. Note that the timing at which the service providing server 300 performs the voice conversion and the timing at which the voice data is transmitted to the in-vehicle device 100 can each be set to any timing before the guidance voice is reproduced.
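The voice-recognition destination search of FIG. 8 (steps S1C to S3C) can be sketched end to end as below. Here `recognize()` and the tagged-string synthesis are hypothetical stand-ins for the server's speech recognition and TTS engines, and the POI entries are invented examples.

```python
# Sketch of steps S1C-S3C: recognize spoken input, search the POI database,
# and synthesize an announcement of the results with the user's individual
# TTS parameters.

POI_DB = ["Central Station", "Central Park", "City Hall"]

def recognize(audio: str) -> str:
    """Hypothetical speech recognizer; for the sketch, 'audio' is already text."""
    return audio.strip().lower()

def poi_search_and_announce(audio: str, params: dict) -> str:
    keyword = recognize(audio)                          # step S1C: recognition
    hits = [p for p in POI_DB if keyword in p.lower()]  # step S1C: POI search
    text = f"The search results are {len(hits)} items: " + ", ".join(hits)
    tag = ",".join(f"{k}={v}" for k, v in sorted(params.items()))
    return f"<audio {tag}>{text}</audio>"               # steps S2C-S3C

announcement = poi_search_and_announce(" Central ", {"pitch": 1.1})
```

The heavy steps (recognition, database search, synthesis) all run on the server, which is the design point the description stresses: the in-vehicle device only transmits the utterance and plays the returned voice data.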
  • As described above, in the present embodiment, the mobile terminal 200 accepts input of individual TTS parameters (parameters for voice adjustment) through user operation; the service providing server 300, which has a storage unit 302 that stores text data indicating utterance content, acquires the individual TTS parameters and generates voice data from the text data based on those parameters; and the in-vehicle device 100 acquires the voice data and outputs voice based on the acquired voice data. The voice desired by the user can therefore be reproduced on the in-vehicle device 100 side without depending on the storage capacity or processing capability of the in-vehicle device 100.
  • Moreover, since the voice conversion is performed by the service providing server 300, which has higher processing capability than the in-vehicle device 100, the adjustment range of the voice conversion can be made wider than when the in-vehicle device 100 performs the conversion. As a result, the information provided to the user can be diversified.
  • Since the individual TTS parameters are set solely through communication between the mobile terminal 200 and the service providing server 300, they can be set without operating the in-vehicle device 100. Individual TTS parameters can therefore be set at any place and at any time, which improves convenience. Note that if individual TTS parameters were set on the in-vehicle device 100 side, a dedicated menu screen or program would be needed, making the setting dependent on the function and performance of the in-vehicle device 100. In other words, in the present embodiment, it is possible to set parameters and to acquire and reproduce voice data adjusted to the desired voice without depending on the function and performance of the in-vehicle device 100.
  • Since the individual TTS parameters include the pitch, speed, height, and the like of the voice, it is easy to adjust the voice to the one desired by the user.
  • Since the voice data is generated from text data, known voice conversion techniques can be widely applied, and the voice content can be easily edited and created.
  • In addition, the information processing unit (terminal-side processing unit) 201 of the mobile terminal 200 accepts editing of the text data stored in the service providing server 300 through user operation, and the information processing unit (server-side processing unit) 301 of the service providing server 300 edits the text data accordingly, so the voice content can easily be adjusted to what the user desires.
  • In this case as well, the text data can be edited without operating the in-vehicle device 100; it can be edited at any place and at any time, which improves convenience, and the editing does not depend on the function and performance of the in-vehicle device 100.
  • the above-described embodiment is merely one aspect of the present invention, and can be arbitrarily changed within the scope of the present invention.
  • For example, in the above embodiment, the case where the pitch, speed, height, intonation, gender, and the like of the voice are adjustable has been described; however, it is sufficient that any one or more of these adjustment items be adjustable, and the adjustment items may be increased or decreased as appropriate.
  • Also, the case where text data is converted into voice data based on the individual TTS parameters has been described; however, reference voice data may be stored in the service providing server 300 in advance, and its pitch, speed, height, and the like may be adjusted based on the individual TTS parameters.
  • Further, an application program for reading mail aloud (hereinafter referred to as a mail reading program) may be stored in the mobile terminal 200, and using this program, voice data adjusted by the individual TTS parameters may be generated by the service providing server 300 and played back by the in-vehicle device 100.
  • FIG. 9 is a diagram illustrating an example of an operation when the mobile terminal 200 executes the mail reading program.
  • When the mail reading program is executed, the text of the mail is transmitted to the service providing server 300 (step S1D).
  • Next, the information processing unit 301 of the service providing server 300 uses the TTS engine to generate voice data that reads out the text of the mail using the individual TTS parameters, and transmits the voice data to the in-vehicle device 100 via the mobile terminal 200 (step S2D).
  • the information processing unit 101 of the in-vehicle device 100 performs a process of reproducing the received audio data (step S3D).
  • As a result, a voice indicating the content of the mail, such as "Today's meeting: the start time has been changed", can be reproduced with the voice desired by the user.
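The mail-reading variation of FIG. 9 reuses the same server-side conversion: the mail body simply replaces the fixed-phrase text as the TTS input. A minimal sketch, with `synthesize()` again a hypothetical stand-in for the server's TTS engine:

```python
# Sketch of steps S1D-S3D: the mail text goes to the server, which generates
# voice data with the user's individual TTS parameters; the in-vehicle device
# then reproduces it. The tagged-string "audio" format is illustrative only.

def synthesize(text: str, params: dict) -> str:
    tag = ",".join(f"{k}={v}" for k, v in sorted(params.items()))
    return f"<audio {tag}>{text}</audio>"

def read_mail_aloud(mail_body: str, user_params: dict) -> str:
    """Steps S1D-S2D: server-side conversion of the mail text to voice data."""
    return synthesize(mail_body, user_params)

# Step S3D: the in-vehicle device reproduces the received voice data.
voice = read_mail_aloud("Today's meeting: the start time has been changed",
                        {"pitch": 1.0, "speed": 1.1})
```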
  • the in-vehicle device 100 may have various functions of the mobile terminal 200.
  • the in-vehicle device 100 may receive an input of an individual TTS parameter that is a parameter for voice adjustment through a user operation, and cause the service providing server 300 to acquire the individual TTS parameter.
  • Further, the in-vehicle device 100 may include a communication unit for wirelessly communicating with the service providing server 300 connected to the public communication line network N1 such as the Internet, and may communicate with the service providing server 300 directly rather than via the mobile terminal 200.
  • the in-vehicle device 100 is not limited to being mounted on a four-wheeled vehicle such as an automobile, and may be mounted on a two-wheeled vehicle such as a bicycle.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Navigation (AREA)

Abstract

Provided are a speech adjustment system, a server and an in-vehicle device capable of diversifying information provided for a user without depending on the storage capacity and processing ability of the in-vehicle device. A mobile terminal (200) accepts the input of an individual-specific TTS parameter as a parameter for speech adjustment through user operation. A service provider server (300) with a storage unit (302) for storing text data representing utterance content acquires the individual-specific TTS parameter, and generates speech data from the text data on the basis of the parameter. An in-vehicle device (100) acquires the speech data and outputs a speech on the basis of the acquired speech data.

Description

音声調整システム、サーバ及び車載装置Audio adjustment system, server, and in-vehicle device
 本発明は、音声調整システム、サーバ及び車載装置に関する。 The present invention relates to a sound adjustment system, a server, and an in-vehicle device.
 本技術分野の背景技術として、特開2006-301059号公報(特許文献1)がある。この公報には、「音声データIDまたはテキストデータを含む音声出力要求を、ユーザが設定したナレータの音質に対応した変換テーブルを用いて変換し、変換後のデータが音声データIDの場合は音声データを使って音声出力し、テキストデータの場合は音声合成部で音声データを合成して音声出力する。」と記載されている。 As a background art in this technical field, there is JP-A-2006-301059 (Patent Document 1). This gazette states that “a voice output request including a voice data ID or text data is converted using a conversion table corresponding to the voice quality of the narrator set by the user, and if the converted data is a voice data ID, the voice data Is output using voice, and in the case of text data, the voice synthesis unit synthesizes the voice data and outputs the voice. "
特開2006-301059号公報JP 2006-301059 A
 特許文献1では、例えば、ユーザが選択できるメッセージの声質の種類を多くするためには、その特徴を示す特徴データをその分だけ記憶する必要があるが、車両用ナビゲーション装置における記憶装置の容量には限界がある。
 そこで、本発明は、車載装置の記憶容量や処理能力に依存せずに、ユーザに提供する情報の多様化を図ることができる音声調整システム、サーバ及び車載装置を提供することを目的とする。
In Patent Document 1, for example, in order to increase the types of voice quality of a message that can be selected by the user, it is necessary to store the feature data indicating the feature correspondingly, but the capacity of the storage device in the vehicle navigation device is increased. There are limits.
Therefore, an object of the present invention is to provide a voice adjustment system, a server, and an in-vehicle device that can diversify information to be provided to a user without depending on the storage capacity and processing capability of the in-vehicle device.
 上記目的を達成するために、本発明の音声調整システムは、ユーザ操作を受け付ける携帯端末と、前記携帯端末が通信可能なサーバと、音声データに基づいて音声を出力する車載装置とを備え、前記携帯端末は、前記ユーザ操作を介して音声調整用のパラメータの入力を受け付ける端末側処理部を有し、前記サーバは、発話内容を示すデータを記憶する記憶部と、前記パラメータを取得し、前記パラメータに基づいて前記データから音声データを生成するサーバ側処理部とを有し、前記車載装置は、前記音声データを取得し、取得した前記音声データに基づいて音声を出力する車載側処理部を有することを特徴とする。 In order to achieve the above object, an audio adjustment system of the present invention includes a mobile terminal that receives a user operation, a server with which the mobile terminal can communicate, and an in-vehicle device that outputs audio based on audio data, The mobile terminal has a terminal-side processing unit that accepts input of parameters for voice adjustment via the user operation, the server acquires the parameter, a storage unit that stores data indicating utterance content, A server-side processing unit that generates voice data from the data based on a parameter, and the in-vehicle device acquires the voice data and outputs a voice based on the acquired voice data. It is characterized by having.
According to the present invention, the information provided to the user can be diversified without depending on the storage capacity or processing capability of the in-vehicle device.
FIG. 1 is a diagram showing a voice adjustment system according to an embodiment of the present invention. FIG. 2 is a block diagram showing the configuration of the in-vehicle device. FIG. 3 is a block diagram showing the configuration of the mobile terminal. FIG. 4 is a diagram showing the operation when the mobile terminal executes the voice adjustment program. FIG. 5 is a diagram showing an example of the adjustment screen. FIG. 6 is a diagram showing the operation when the in-vehicle device starts up. FIG. 7 is a diagram showing an example of the editing screen. FIG. 8 is a diagram showing the operation when the user of the in-vehicle device sets a destination using voice recognition. FIG. 9 is a diagram showing an example of the operation when the mobile terminal executes a mail read-aloud program.
Embodiments of the present invention will be described below with reference to the drawings.
FIG. 1 is a diagram showing a voice adjustment system according to an embodiment of the present invention.
The voice adjustment system 1 includes an in-vehicle device 100 mounted on a vehicle 10, which is a mobile body, mobile terminals 200 carried by various users, and a service providing server 300 connected to a public communication network N1 such as the Internet.
As will be described later, the in-vehicle device 100 and the mobile terminal 200 can be communicatively connected by short-range wireless communication, and the mobile terminal 200 and the service providing server 300 can be communicatively connected via the public communication network N1.
FIG. 2 is a block diagram showing the configuration of the in-vehicle device 100.
The in-vehicle device 100 includes an information processing unit 101, a storage unit 102, a display unit 103, an input unit 104, a GPS receiver 105, a short-range communication unit 106, and a voice input/output unit 107. These devices are electrically connected by a bus 108 and configured to exchange data with one another.
The information processing unit 101 has a CPU 109, a ROM 110, a RAM 111, and peripheral circuits (not shown) interconnected by a bus 112, and functions as a computer (in-vehicle-side processing unit) that centrally controls the in-vehicle device 100. For example, by executing a control program stored in the storage unit 102, the information processing unit 101 of the in-vehicle device 100 performs various processes, such as route guidance processing and voice processing, performed by known car navigation devices and car audio devices.
In route guidance processing, the information processing unit 101 identifies the current location of the vehicle 10 based on GPS radio waves received by the GPS receiver 105 and, using map data stored in the storage unit 102, searches for a route from a departure point (for example, the current location) to a destination. The information processing unit 101 also generates graphics information such as map images and traffic information and outputs it to the display unit 103, and reads voice data for route guidance from the storage unit 102 to emit guidance voice from a speaker (not shown) installed in the vehicle.
The storage unit 102 stores the control program executed by the information processing unit 101 and various data. For the storage unit 102, for example, an HDD (Hard Disk Drive), a semiconductor memory, a CD-ROM, or a DVD-ROM is used. In the present embodiment, the storage unit 102 stores a control program that performs navigation processing, and a voice-interactive application program (hereinafter referred to as the conversation program) that transmits, to the service providing server 300 via the mobile terminal 200, voice uttered by the user of the mobile terminal 200 and the like.
These programs may be programs acquired (downloaded) via the public communication network N1 (FIG. 1), or programs installed in the in-vehicle device 100 in advance.
The display unit 103 displays various images under the control of the information processing unit 101. For the display unit 103, for example, a liquid crystal display device is used.
The input unit 104 is a device that detects user operations and notifies the information processing unit 101 of them, and includes operation switches and a transmissive touch panel arranged over the display screen.
The GPS receiver 105 receives GPS radio waves transmitted from GPS satellites, calculates the vehicle's position, heading, and the like based on the GPS radio waves, and outputs them to the information processing unit 101.
The short-range communication unit 106 is a wireless communication interface for performing short-range wireless communication under the control of the information processing unit 101. Using the short-range communication unit 106, wireless communication can be performed between the in-vehicle device 100 and a mobile terminal 200 in the vehicle. In the present embodiment, Bluetooth (registered trademark) is used for the short-range wireless communication, but Wi-Fi (registered trademark) or the like may be used instead.
The voice input/output unit 107 includes a decoder, an amplifier, and the like, and outputs various voices from a speaker installed in the vehicle 10 under the control of the information processing unit 101. For example, the voice input/output unit 107 generates a voice signal from voice data stored in the storage unit 102 and emits sound from the speaker. Specific examples of such voices include navigation guidance voice, CD audio, radio audio, and various voices transmitted from the mobile terminal 200. The voice input/output unit 107 also has a microphone and an AD conversion circuit, and has a function of converting voice uttered by the user into voice data. The in-vehicle device 100 may further include other components provided in known in-vehicle devices.
FIG. 3 is a block diagram showing the configuration of the mobile terminal 200.
The mobile terminal 200 is a smartphone or a PDA (Personal Digital Assistant). As shown in FIG. 3, the mobile terminal 200 includes an information processing unit 201, a storage unit 202, a display unit 203, an input unit 204, a GPS receiver 205, a communication unit 206, a short-range communication unit 207, and a voice input/output unit 208. These devices are electrically connected by a bus 209 and configured to exchange data with one another.
The information processing unit 201 has a CPU 210, a ROM 211, a RAM 212, and peripheral circuits (not shown) interconnected by a bus 213, and functions as a computer (terminal-side processing unit) that centrally controls the mobile terminal 200.
For example, by executing a control program stored in the storage unit 202, the information processing unit 201 of the mobile terminal 200 provides a telephone function as found in known mobile terminals, a mail function for sending, receiving, and viewing e-mail, a browsing function for browsing the Internet, and a function of executing various application programs acquired from the Internet or elsewhere.
The storage unit 202 stores the control program executed by the information processing unit 201 and various data. For the storage unit 202, for example, a semiconductor memory or an HDD is used.
In the present embodiment, the storage unit 202 stores an application program for adjusting the voice (navigation guidance voice) used by the in-vehicle device 100 (hereinafter referred to as the voice adjustment program), an application program for communicating with the in-vehicle device 100 (hereinafter referred to as the communication program), and an application program for editing text data stored in the service providing server 300 (hereinafter referred to as the editing program).
These programs may be programs acquired (downloaded) via the public communication network N1 (FIG. 1), or programs installed in the mobile terminal 200 in advance.
The display unit 203 is a device that displays various images under the control of the information processing unit 201; for example, a liquid crystal display device is used.
The input unit 204 is a device that detects user operations and notifies the information processing unit 201 of them, and includes operation switches and a transmissive touch panel arranged over the display screen.
The GPS receiver 205 receives GPS radio waves transmitted from GPS satellites, calculates the current location, heading, and the like of the mobile terminal 200 based on the GPS radio waves, and outputs them to the information processing unit 201.
The communication unit 206 is a communication interface that, under the control of the information processing unit 201, accesses the public communication network N1 (FIG. 1) and the like via a wireless communication network (in this embodiment, a mobile phone network) and communicates with devices connected to the public communication network N1 and the like. The communication unit 206 enables communication between the mobile terminal 200 and the service providing server 300 (FIG. 1), as well as telephone calls and mail exchange with other mobile terminals.
The short-range communication unit 207 is a wireless communication interface for performing short-range wireless communication under the control of the information processing unit 201. Using the short-range communication unit 207, wireless communication can be performed between the mobile terminal 200 and the in-vehicle device 100.
The voice input/output unit 208, under the control of the information processing unit 201, outputs various voices from a speaker provided in the mobile terminal 200, and inputs voice uttered by the user via a microphone and converts it into voice data. Using the voice input/output unit 208, it is possible to make telephone calls with other mobile terminals and to play back voice data such as music stored in the storage unit 202 of the mobile terminal 200. The mobile terminal 200 may further include other components provided in known mobile terminals.
The service providing server 300 is a server that provides services such as creating voice data from text data, that is, a TTS (Text To Speech) service. As shown in FIG. 1, the service providing server 300 includes an information processing unit 301, a storage unit 302, and a communication unit 303, and is configured as a device with higher processing capability and larger storage capacity than the in-vehicle device 100.
The information processing unit 301 includes a CPU, a ROM, a RAM, and the like, and functions as a computer (server-side processing unit) that centrally controls the service providing server 300.
The storage unit 302 stores the control program executed by the information processing unit 301 and various data. By executing the control program stored in the storage unit 302, the information processing unit 301 can function as a TTS engine that performs voice conversion processing for creating voice data from text data.
The storage unit 302 also stores a database of text data describing the utterance content to be played back by the in-vehicle device 100 (such as route guidance voice), and a Point of Interest (POI) database used for route searches and the like.
Next, processing executed in the voice adjustment system 1 will be described.
FIG. 4 is a diagram showing the operation when the mobile terminal 200 executes the voice adjustment program.
When the information processing unit 201 of the mobile terminal 200 executes the voice adjustment program in response to a user operation, it performs login processing to the service providing server 300 (step S1A). In this processing, the information processing unit 201 accesses the service providing server 300 via the communication unit 206 and displays a login screen on the display unit 203. The user of the mobile terminal 200 then enters login information (for example, a user ID and password) via the input unit 204, completing the login.
When the login is completed, the information processing unit 301 of the service providing server 300 displays a screen for voice adjustment (adjustment screen) on the screen of the mobile terminal 200 (step S2A).
FIG. 5 is a diagram showing an example of the adjustment screen.
The adjustment screen is a screen that accepts input of voice-adjustment parameters; more specifically, it accepts input of voice pitch, speed, height, intonation, gender, and voice type. The parameters are not limited to these, and may be added to or changed as appropriate.
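The parameter set accepted by the adjustment screen can be sketched as a small data structure. This is an illustrative assumption: the description names only the parameter kinds (pitch, speed, height, intonation, gender, type), so the field names, value ranges, and validation below are not part of the disclosure.

```python
from dataclasses import dataclass

# Hypothetical encoding of the voice-adjustment parameters entered on the
# adjustment screen (FIG. 5). Ranges are illustrative assumptions.
@dataclass
class TtsParams:
    pitch: float = 1.0       # relative pitch multiplier
    speed: float = 1.0       # speaking-rate multiplier
    height: float = 1.0      # overall voice height
    intonation: float = 1.0  # strength of intonation
    gender: str = "female"   # "female" or "male"
    voice_type: str = "narrator1"

    def validate(self) -> None:
        # Reject out-of-range values before they reach the TTS engine.
        for name in ("pitch", "speed", "height", "intonation"):
            value = getattr(self, name)
            if not 0.5 <= value <= 2.0:
                raise ValueError(f"{name} out of range: {value}")
        if self.gender not in ("female", "male"):
            raise ValueError(f"unknown gender: {self.gender}")
```

As the text notes, items could be added or removed from such a structure as appropriate.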
Returning to FIG. 4, when voice-adjustment parameters are input via the input unit 204 of the mobile terminal 200, the information processing unit 301 of the service providing server 300 stores (saves) this group of parameters in the storage unit 302 as individual TTS parameters (step S3A).
In this case, the individual TTS parameters are stored in the storage unit 302 in association with identification information (a user ID or terminal ID) that identifies the mobile terminal 200. This makes it possible to identify the individual TTS parameters based on the identification information.
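A minimal sketch of this per-user storage, assuming a simple in-memory mapping: one individual TTS parameter set is kept per identification key (user ID or terminal ID), and a later lookup by the same key retrieves it. The dict-based store is an assumption; the description requires only that the parameters be identifiable from the identification information.

```python
# Sketch of step S3A (save) and the lookup later used at download time:
# parameters are keyed by the identification information sent from the
# mobile terminal, and saving again overwrites the previous settings.
class ParamStore:
    def __init__(self):
        self._by_id = {}

    def save(self, ident: str, params: dict) -> None:
        self._by_id[ident] = dict(params)  # copy so callers can't mutate the store

    def lookup(self, ident: str) -> dict:
        # The server identifies a single individual parameter set from the
        # identification information supplied with the request.
        return self._by_id[ident]
```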
In this way, the user of the mobile terminal 200 can set desired parameters by operating his or her own mobile terminal 200 and save them on the service providing server 300. When setting these parameters (that is, while the voice adjustment program is running), the mobile terminal 200 does not need to be connected to the in-vehicle device 100. The user can therefore set the parameters not only inside the vehicle 10 but at any location outside it.
At the time of this parameter setting, the service providing server 300 may convert predetermined sample text data into voice data (voice conversion) based on the individual TTS parameters, send it to the mobile terminal 200, and have it played back on the mobile terminal 200 side. With this configuration, the user can quickly check the voice produced by the parameters he or she has set.
Furthermore, because the service providing server 300 has high processing capability, it can perform the voice conversion almost in real time, allowing the user to adjust the parameters while listening to the resulting voice.
FIG. 6 is a diagram showing the operation when the in-vehicle device 100 starts up. As a precondition, the in-vehicle device 100 and the mobile terminal 200 are assumed to be communicatively connected.
When the in-vehicle device 100 starts up, its information processing unit 101 causes the mobile terminal 200 to execute the communication program. The in-vehicle device 100 thereby accesses the service providing server 300 via the mobile terminal 200 and becomes able to communicate with the server 300.
In this case, the information processing unit 101 of the in-vehicle device 100 downloads, from the service providing server 300, voice data adjusted according to the individual TTS parameters (step S1B). The service providing server 300 may acquire the identification information (user ID or terminal ID) from the mobile terminal 200 and identify a single set of individual TTS parameters based on that identification information. The method is not limited to this; any known technique for identifying the individual TTS parameters corresponding to the mobile terminal 200 or the in-vehicle device 100 is broadly applicable.
Here, the above voice data is data obtained by voice-converting a group of fixed-phrase text data stored in the storage unit 302 of the service providing server 300. That is, the service providing server 300 uses its TTS engine to convert each piece of text data into voice data based on the individual TTS parameters, thereby generating voice data corresponding to the characters of each text in the voice adjusted by those parameters. This voice conversion processing may be executed when a download instruction is received from the in-vehicle device 100, or at a timing before the download instruction (for example, at a suitable timing after the individual TTS parameters are saved).
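The batch conversion of the fixed-phrase database can be sketched as below. `synthesize` is a stand-in for the real TTS engine, which the description does not specify; the phrase IDs and the byte encoding are likewise assumptions made for illustration.

```python
# Placeholder for the TTS engine: a real engine would return encoded audio
# for `text` rendered with the given pitch/speed/gender settings.
def synthesize(text: str, params: dict) -> bytes:
    return f"{params['voice_type']}|{text}".encode("utf-8")

# Sketch of the server-side conversion: every fixed-phrase text is
# synthesized once with the user's individual TTS parameters, producing
# the audio set the in-vehicle device downloads in step S1B.
def convert_fixed_phrases(phrases: dict, params: dict) -> dict:
    # phrases: phrase_id -> text; result: phrase_id -> audio bytes
    return {pid: synthesize(text, params) for pid, text in phrases.items()}
```

Because the result depends only on the texts and the saved parameters, this batch can run at any time after step S3A, matching the flexible timing described above.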
When the download is completed, the information processing unit 101 of the in-vehicle device 100 stores the voice data in the storage unit 102 and plays back the stored voice data via the voice input/output unit 107 at pre-specified timings (step S2B). Thus, for example, immediately after the download, phrases such as "Hello, ***", "Happy birthday", and "Let's drive safely today" can be played back, and at a timing such as arrival at a pre-registered home location, "Good work today" can be played back.
In this way, the service providing server 300 converts the text into the voice desired by the user of the mobile terminal 200, and that voice can be played back on the in-vehicle device 100.
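The "pre-specified timing" playback of step S2B can be pictured as an event-to-phrase mapping held by the in-vehicle device. The event names and the lookup scheme are assumptions for illustration only; the description says merely that stored audio is played at designated timings such as just after download or on arrival at home.

```python
# Sketch: downloaded audio is kept in storage, and each event (download
# finished, arrival at home, ...) triggers the phrase registered for it.
class PhrasePlayer:
    def __init__(self, audio_by_phrase: dict, phrase_for_event: dict):
        self.audio = audio_by_phrase      # phrase_id -> audio bytes
        self.schedule = phrase_for_event  # event name -> phrase_id
        self.played = []

    def on_event(self, event: str) -> None:
        pid = self.schedule.get(event)
        if pid is not None and pid in self.audio:
            self.played.append(pid)  # stand-in for sending audio to the speaker
```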
The user can also edit the text data stored in the service providing server 300 by operating the mobile terminal 200 and executing the editing program. In this case, login processing similar to step S1A shown in FIG. 4 is performed, and when the login is completed, the information processing unit 301 of the service providing server 300 displays a screen for editing text data (editing screen) on the screen of the mobile terminal 200.
FIG. 7 is a diagram showing an example of the editing screen.
The editing screen is a screen that accepts editing of the fixed-phrase text data corresponding to the voice content played back by the in-vehicle device 100. As shown in FIG. 7, the editing screen displays the group of text data stored in the storage unit 302 of the service providing server 300 and accepts edits to each piece of text data. The edited text data is stored in the storage unit 302 in association with the identification information (user ID or terminal ID) that identifies the mobile terminal 200.
In this way, the user of the mobile terminal 200 can edit the text data by operating his or her own mobile terminal 200 and save it on the service providing server 300. Since the mobile terminal 200 does not need to be connected to the in-vehicle device 100 during editing (while the editing program is running), the user can edit the text data at any location, such as outside the vehicle.
FIG. 8 is a diagram showing the operation when the user of the in-vehicle device 100 sets a destination using voice recognition. In this case as well, it is assumed as a precondition that the in-vehicle device 100 and the mobile terminal 200 are communicatively connected. The information processing unit 101 of the in-vehicle device 100 executes the conversation program in response to a user operation and waits, via the voice input/output unit 107, for input of voice specifying a destination; when voice is input, it transmits the corresponding voice data to the service providing server 300 via the mobile terminal 200.
The information processing unit 301 of the service providing server 300 performs voice recognition on the voice data received via the mobile terminal 200 and performs a POI search (a search of the POI database) using the voice recognition result as a search key (step S1C).
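Step S1C reduces to using the recognized text as a key over the POI database. A plain substring match stands in for whatever index the service providing server actually uses, which the description leaves open.

```python
# Sketch of the POI search in step S1C: the voice recognition result
# (`query`) is matched against POI names; each POI here is a
# (name, address) tuple, which is an illustrative assumption.
def poi_search(query: str, poi_db: list) -> list:
    return [poi for poi in poi_db if query in poi[0]]
```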
Next, the information processing unit 301 of the service providing server 300 uses the TTS engine to generate voice data that reads out the search results using the individual TTS parameters, and transmits it to the in-vehicle device 100 via the mobile terminal 200 (step S2C). Voice data in the voice adjusted by the individual TTS parameters, that is, in the voice desired by the user, is thereby generated and transmitted to the in-vehicle device 100.
Subsequently, the information processing unit 101 of the in-vehicle device 100 performs processing to play back the received voice data (step S3C). As a result, the in-vehicle device 100 plays back, in the voice desired by the user, announcements such as "There are XX search results", "First result ..., second result ...".
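Before synthesis in step S2C, the server must turn the POI hits into an announcement string. The wording below is modeled loosely on the example utterances in the text; the exact phrasing is an assumption.

```python
# Sketch of the readout generation in step S2C: the search results are
# flattened into one announcement string, which is then synthesized with
# the individual TTS parameters and sent to the in-vehicle device.
def build_readout(results: list) -> str:
    lines = [f"{len(results)} result(s) found."]
    for i, name in enumerate(results, start=1):
        lines.append(f"Result {i}: {name}")
    return " ".join(lines)
```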
Thereafter, when an appropriate candidate (destination) is selected by a user operation, the information processing unit 101 of the in-vehicle device 100 performs processing to search for a recommended route from the current location to the destination (destination setting) (step S4C).
When the recommended route is determined, the information processing unit 101 of the in-vehicle device 100 starts route guidance processing and plays back guidance voice such as "Starting route guidance ... turn right in 300 m ..." (step S5C).
In this case, the guidance voice is emitted in the voice adjusted by the service providing server 300 based on the individual TTS parameters. That is, the service providing server 300 converts the guidance-voice text data stored in the storage unit 302 into voice data in advance based on the individual TTS parameters and transmits it to the in-vehicle device 100 via the mobile terminal 200; the in-vehicle device 100 stores it in the storage unit 102, so that already-adjusted guidance voice can be played back on the in-vehicle device 100.
In this way, route guidance can be performed in the voice desired by the user. The timing at which the service providing server 300 performs the voice conversion and the timing at which the voice data is transmitted to the in-vehicle device 100 can each be set to any timing before the guidance voice is played back.
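The pre-conversion flow above amounts to a prefetch-then-play cache on the in-vehicle side: audio is synthesized and transferred at any convenient time, and guidance playback never waits on the server. The cache interface is an assumption sketched for illustration.

```python
# Sketch of pre-converted guidance audio: `prefetch` runs at any time
# before guidance (server synthesis + transfer via the mobile terminal),
# while `play` only reads local storage at guidance time.
class GuidanceCache:
    def __init__(self, synthesize):
        self.synthesize = synthesize  # injected stand-in for the server TTS
        self.cache = {}

    def prefetch(self, phrase_id: str, text: str, voice: str) -> None:
        self.cache[phrase_id] = self.synthesize(text, voice)

    def play(self, phrase_id: str):
        # Guidance must never block on synthesis here; only cached audio plays.
        return self.cache[phrase_id]
```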
As described above, according to the present embodiment, the mobile terminal 200 accepts input of individual TTS parameters, which are voice-adjustment parameters, via user operations; the service providing server 300 has the storage unit 302 that stores text data indicating utterance content, acquires the individual TTS parameters, and generates voice data from the text data based on those parameters; and the in-vehicle device 100 acquires the voice data and outputs voice based on it. The voice desired by the user can therefore be played back on the in-vehicle device 100 side without depending on the storage capacity or processing capability of the in-vehicle device 100. Moreover, because the voice conversion is performed by the service providing server 300, which has higher processing capability than the in-vehicle device 100, the adjustment range of the voice conversion can be made wider than when the conversion is performed by the in-vehicle device 100. These features make it possible to diversify the information provided to the user.
Furthermore, because the individual TTS parameters are set solely through communication between the mobile terminal 200 and the service providing server 300, they can be set without operating the in-vehicle device 100. The parameters can therefore be set at any location and at any timing, improving convenience.
If one were instead to set individual TTS parameters on the in-vehicle device 100 side, a dedicated menu screen and program would have to be prepared, making the setting dependent on the functions and performance of the in-vehicle device 100. In other words, in the present embodiment, parameter setting and the acquisition and playback of voice data adjusted to the desired voice can be performed without depending on the functions or performance of the in-vehicle device 100.
 Furthermore, since the individual TTS parameters include the pitch, speed, tone (height), and so on of the voice, the voice can easily be adjusted to the one the user desires.
 In addition, since the voice data is generated from text data, known voice conversion techniques can be widely applied, and the spoken content can easily be edited and created.
 Also, the information processing unit (terminal-side processing unit) 201 of the mobile terminal 200 accepts, through a user operation, edits to the text data stored in the service providing server 300, and the information processing unit (server-side processing unit) 301 of the service providing server 300 edits the text data to apply those edits, so the spoken content can easily be adjusted to what the user desires. In this case as well, the data can be edited without operating the in-vehicle device 100, so it can be edited at any place and at any time, which improves convenience and makes editing possible without depending on the functions and performance of the in-vehicle device 100.
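The edit path just described — the terminal accepts an edit through a user operation, and the server applies it to its stored text — can be sketched as follows. The class and method names are illustrative assumptions, not from the patent.

```python
class UtteranceStore:
    """Minimal stand-in for the server-side storage unit holding the
    utterance text (storage unit 302 in the embodiment); names are
    illustrative."""
    def __init__(self, text: str) -> None:
        self._text = text

    def apply_edit(self, new_text: str) -> None:
        # Invoked when the terminal-side processing unit forwards an
        # edit accepted through a user operation.
        self._text = new_text

    def current_text(self) -> str:
        return self._text

store = UtteranceStore("Good morning. Today's traffic is light.")
# The terminal sends the edited text; the in-vehicle device is not involved.
store.apply_edit("Good morning. Expect light traffic on your route.")
print(store.current_text())
```

Since the next synthesis request reads `current_text()`, the edited wording is what the in-vehicle device will eventually hear, without the device ever being operated.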
 The above-described embodiment is merely one aspect of the present invention and can be modified arbitrarily within the scope of the present invention. For example, in the above embodiment, the pitch, speed, tone (height), intonation, gender, and so on of the voice are adjustable, but the invention is not limited to this; it suffices that at least one of these is adjustable, and the adjustment items can be increased or decreased as appropriate.
 Also, in the above embodiment, text data is converted into voice data based on the individual TTS parameters. Alternatively, reference voice data may be stored in the service providing server 300 in advance, and the pitch, speed, height, and so on of this voice data may be adjusted based on the individual TTS parameters.
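The variant above starts from pre-stored reference audio rather than text. As one hedged illustration of adjusting such audio, the sketch below changes playback speed by naive linear-interpolation resampling of raw samples; a production server would use a pitch-preserving algorithm instead, and nothing here is taken from the patent.

```python
def change_speed(samples: list[float], speed: float) -> list[float]:
    """Naive speed adjustment of PCM samples via linear-interpolation
    resampling (also shifts pitch; shown only for illustration)."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    n_out = max(1, int(len(samples) / speed))
    out = []
    for i in range(n_out):
        pos = i * speed            # position in the source signal
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out

# A tiny reference waveform standing in for stored server-side audio.
reference = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
faster = change_speed(reference, 2.0)  # double speed: half as many samples
print(len(faster))  # 4
```

The same resampling idea, applied with a factor taken from the individual TTS parameters, is one way the server could realize the "adjust the reference voice data" variant.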
 Further, in the above-described embodiment, the mobile terminal 200 may additionally store an application program for reading e-mail aloud (hereinafter, a mail read-aloud program), and this program may be used to have the service providing server 300 generate voice data of an e-mail in a voice adjusted with the individual TTS parameters and have the in-vehicle device 100 reproduce it.
 FIG. 9 is a diagram illustrating an example of the operation when the mobile terminal 200 executes the mail read-aloud program.
 As illustrated in FIG. 9, when the information processing unit 201 of the mobile terminal 200 executes the program in response to a user operation, it transmits the text of the e-mail to the service providing server 300 (step S1D). Next, the information processing unit 301 of the service providing server 300 uses the TTS engine to generate voice data that reads the e-mail text aloud using the individual TTS parameters, and transmits the voice data to the in-vehicle device 100 via the mobile terminal 200 (step S2D). Subsequently, the information processing unit 101 of the in-vehicle device 100 reproduces the received voice data (step S3D). In this way, the in-vehicle device 100 can reproduce, in the voice the user desires, audio conveying the content of the e-mail, for example, "The start time of today's meeting has been changed."
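The three steps of FIG. 9 can be sketched as a simple pipeline. Each function stands in for one of steps S1D–S3D; the function names and the placeholder "TTS engine" are assumptions made for illustration only.

```python
def step_s1d_send_text(mail_text: str) -> str:
    """S1D: the mobile terminal forwards the mail text to the server."""
    return mail_text

def step_s2d_generate_voice(text: str, params: dict) -> dict:
    """S2D: the server renders the text with the user's individual TTS
    parameters (placeholder record, not a real TTS engine)."""
    return {"text": text, "params": params,
            "audio": f"<audio for {len(text)} chars>"}

def step_s3d_play(voice_data: dict) -> str:
    """S3D: the in-vehicle device reproduces the received voice data."""
    return f"playing: {voice_data['text']}"

mail = "The start time of today's meeting has been changed."
played = step_s3d_play(
    step_s2d_generate_voice(step_s1d_send_text(mail), {"pitch": 1.2}))
print(played)
```

The composition mirrors the figure: the terminal only relays text and audio, while synthesis stays on the server and playback stays in the vehicle.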
 In the above-described embodiment, the in-vehicle device 100 may have the various functions of the mobile terminal 200. For example, the in-vehicle device 100 may accept input of the individual TTS parameters, which are parameters for voice adjustment, through a user operation, and cause the service providing server 300 to acquire them. Further, the in-vehicle device 100 may include a communication unit for wirelessly communicating with the service providing server 300 and the like connected to a public communication network N1 such as the Internet, so that it can communicate wirelessly with the service providing server 300 and the like without going through the mobile terminal 200.
 Furthermore, the in-vehicle device 100 is not limited to one mounted on a four-wheeled vehicle such as an automobile, and may be mounted on a two-wheeled vehicle such as a bicycle.
DESCRIPTION OF SYMBOLS
1 Voice adjustment system
100 In-vehicle device
101 Information processing unit (in-vehicle side processing unit)
102, 202, 302 Storage unit
200 Mobile terminal
201 Information processing unit (terminal-side processing unit)
300 Service providing server
301 Information processing unit (server-side processing unit)

Claims (9)

  1.  A voice adjustment system comprising:
     a mobile terminal that accepts user operations;
     a server with which the mobile terminal can communicate; and
     an in-vehicle device that outputs voice based on voice data,
     wherein the mobile terminal has a terminal-side processing unit that accepts input of parameters for voice adjustment through the user operations,
     the server has a storage unit that stores data indicating utterance content, and a server-side processing unit that acquires the parameters and generates the voice data from the data based on the parameters, and
     the in-vehicle device has an in-vehicle side processing unit that acquires the voice data and outputs voice based on the acquired voice data.
  2.  The voice adjustment system according to claim 1, wherein the parameters include at least one of the pitch, speed, and height of the voice.
  3.  The voice adjustment system according to claim 1, wherein the data is text data.
  4.  The voice adjustment system according to claim 1, wherein the terminal-side processing unit accepts, through the user operations, edits to the data stored in the server, and
     the server-side processing unit edits the data to apply the edits accepted by the terminal-side processing unit.
  5.  A server capable of communicating with a mobile terminal, the server comprising:
     a storage unit that stores data indicating utterance content; and
     a server-side processing unit that acquires parameters for voice adjustment accepted by the mobile terminal through a user operation, and generates, from the data and based on the parameters, voice data of the utterance content to be reproduced by an in-vehicle device.
  6.  The server according to claim 5, wherein the parameters include at least one of the pitch, speed, and height of the voice.
  7.  The server according to claim 5, wherein the data is text data.
  8.  The server according to claim 5, wherein a terminal-side processing unit of the mobile terminal accepts, through the user operation, edits to the data stored in the server, and
     the server-side processing unit edits the data to apply the edits accepted by the terminal-side processing unit.
  9.  An in-vehicle device that outputs voice based on voice data, the in-vehicle device comprising:
     an in-vehicle side processing unit that acquires the voice data from a server, the server comprising a storage unit that stores data indicating utterance content and a server-side processing unit that acquires parameters for voice adjustment accepted by a mobile terminal through a user operation and generates the voice data from the data based on the parameters, and that outputs voice based on the acquired voice data.
PCT/JP2014/077446 2014-01-24 2014-10-15 Speech adjustment system, server, and in-vehicle device WO2015111256A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2015558732A JPWO2015111256A1 (en) 2014-01-24 2014-10-15 Audio adjustment system, server, and in-vehicle device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2014011288 2014-01-24
JP2014-011288 2014-01-24

Publications (1)

Publication Number Publication Date
WO2015111256A1 true WO2015111256A1 (en) 2015-07-30

Family

ID=53681079

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2014/077446 WO2015111256A1 (en) 2014-01-24 2014-10-15 Speech adjustment system, server, and in-vehicle device

Country Status (2)

Country Link
JP (1) JPWO2015111256A1 (en)
WO (1) WO2015111256A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015155977A (en) * 2014-02-20 2015-08-27 シャープ株式会社 Voice synthesizer and control program

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05233565A (en) * 1991-11-12 1993-09-10 Fujitsu Ltd Voice synthesization system
JPH1078792A (en) * 1996-07-12 1998-03-24 Konami Co Ltd Voice processing method, game system and recording medium
JP2004246129A (en) * 2003-02-14 2004-09-02 Arcadia:Kk Voice synthesis controller
JP2004295379A (en) * 2003-03-26 2004-10-21 Seiko Epson Corp Data providing system, data providing method, and data providing program
JP2005055607A (en) * 2003-08-01 2005-03-03 Toyota Motor Corp Server, information processing terminal and voice synthesis system
JP2006301059A (en) * 2005-04-18 2006-11-02 Denso Corp Voice output system
JP2006350091A (en) * 2005-06-17 2006-12-28 Nippon Telegr & Teleph Corp <Ntt> Voice synthesis method, voice synthesis information processing method, client terminal, voice synthesis information processing server, client terminal program, and voice synthesis information processing program
WO2012008437A1 (en) * 2010-07-13 2012-01-19 富士通テン株式会社 Information provision system and vehicle-mounted device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3974419B2 (en) * 2002-02-18 2007-09-12 株式会社日立製作所 Information acquisition method and information acquisition system using voice input



Also Published As

Publication number Publication date
JPWO2015111256A1 (en) 2017-03-23


Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 14880258

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2015558732

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14880258

Country of ref document: EP

Kind code of ref document: A1