JP2015041243A

JP2015041243A - Electronic device, voice recognition operation method of mobile terminal connected thereto, and in-vehicle system

Info

Publication number: JP2015041243A
Application number: JP2013171870A
Authority: JP
Inventors: Noboru Yamazaki; Kenji Shinta
Original assignee: Alpine Electronics Inc
Current assignee: Alpine Electronics Inc
Priority date: 2013-08-22
Filing date: 2013-08-22
Publication date: 2015-03-02

Abstract

PROBLEM TO BE SOLVED: To provide an electronic device, and a voice recognition operation method for a mobile terminal connected to it, that enable a tracing operation on the touch panel of the mobile terminal on the basis of voice recognition in the electronic device.

SOLUTION: An electronic device includes a conversion table storage unit 13, which stores conversion table information for converting a voice input item into a predetermined coordinate sequence, and a control unit 17. When an input item determination unit 16 determines that speech recognized by a voice recognition unit 15 corresponds to a voice input item, the control unit 17 uses the conversion table information to generate the predetermined coordinate sequence from that voice input item and transmits it to a mobile terminal 20. Thus, when a voice input item is recognized in an in-vehicle device 10, it is converted into the predetermined coordinate sequence and sent to the mobile terminal 20, which performs the same processing as if a user had traced the positions corresponding to the coordinate sequence on the touch panel of the mobile terminal 20.

Description

The present invention relates to an electronic device, a voice recognition operation method for a mobile terminal connected to the electronic device, and an in-vehicle system. It is particularly suitable for application to a system in which an image displayed on the screen of a mobile terminal is displayed on an electronic device so that the mobile terminal can be operated from the electronic device.

Conventionally, techniques have been provided in which a second device is connected to a first device so that processing relating to the first device can be executed by operating the second device (see, for example, Patent Document 1). In the system described in Patent Document 1, another device (an accessory) is connected to a portable media device, and the portable media device supplies the GUI image displayed on itself to the accessory. When the accessory sends the portable media device a user action performed on an input control through that GUI image, the portable media device updates the GUI image in response to the user action.

More recently, a technology called MirrorLink (registered trademark; the same applies hereinafter) has been provided that links a mobile terminal such as a smartphone with an in-vehicle device. MirrorLink shows the image displayed on the mobile terminal on the display of the in-vehicle device as it is, like a mirror, and allows the mobile terminal to be operated through the image shown on the in-vehicle device's display. As a result, a MirrorLink-compatible application installed on the mobile terminal (hereinafter referred to as a mobile application) can be operated on the display of the in-vehicle device.

For example, as shown in FIG. 4, a mobile terminal 101 on which a map application is installed can transmit the map image it is currently displaying to an in-vehicle device 102, so that the display of the in-vehicle device 102 shows the same map image as the mobile terminal 101. In addition, position information (coordinate information) representing a touch position on the touch panel of the in-vehicle device 102 can be transmitted from the in-vehicle device 102 to the mobile terminal 101. Based on this position information, the mobile terminal 101 can perform the same processing (map scrolling, enlargement/reduction, rotation, and so on) as if the same position had been touched on its own touch panel.

As described above, to allow a mobile application to be operated on the display of the in-vehicle device, MirrorLink is configured to transmit the touch position on the in-vehicle device's touch panel from the in-vehicle device to the mobile terminal as coordinate information. Consequently, information for operating the mobile application based on a command obtained by voice recognition in the in-vehicle device cannot be transmitted to the mobile terminal. To operate the mobile application using voice recognition in the in-vehicle device, it has therefore been necessary to make the mobile application itself support voice recognition and to use some communication control other than MirrorLink.

A technique is also known in which text code data is determined from speech uttered by a user, and the area (coordinates) in which the determined text code data is located on the screen is transmitted to a mobile terminal (see, for example, Patent Document 2). In Patent Document 2, the in-vehicle device extracts text code data, such as the labels of operation buttons, from the image data received from the mobile terminal and identifies in advance the area of the screen in which each piece of text code data is located. When the user then speaks the name of an operation button or the like, the coordinates of the area in which that operation button is located are transmitted to the mobile terminal.

Patent Document 1: Japanese Patent No. 5137899
Patent Document 2: JP 2012-213132 A

However, the technique described in Patent Document 2 only allows voice recognition to operate buttons that are arranged at predetermined positions on the image transmitted from the mobile terminal to the in-vehicle device. For an image transmitted from the mobile terminal and displayed on the in-vehicle device, there has therefore been a problem that voice recognition cannot be used to scroll the image by a flick operation on the touch panel (quickly tracing with a finger), to enlarge or reduce the image by a pinch operation (widening or narrowing the gap between two fingers), or to rotate the image by a rotation operation (tracing with a finger in a turning motion).

Using the technique described in Patent Document 2, it would be possible to display a scroll button, an enlarge button, a reduce button, a rotate button, and so on over the image, extract text code data from each of them, and identify in advance the area (coordinates) in which each piece of text code data is located on the screen. Operations such as scrolling, enlargement/reduction, and rotation could then be performed by voice recognition.

In that case, however, a plurality of operation buttons must be placed on the image, which narrows the area available for the image itself and degrades its visibility. Another conceivable method is to display only a show/hide button in the initial state, for instructing whether the operation buttons are to be displayed, and to display the operation buttons when that button is operated by voice recognition. This approach, however, increases the number of steps the user must perform.

The present invention has been made to solve these problems. Its object is to enable, in a system in which an image displayed on the screen of a mobile terminal is displayed on an electronic device so that the mobile terminal can be operated from the electronic device, a tracing operation (flick, pinch, rotation, and so on) on the touch panel of the mobile terminal to be performed based on voice recognition in the electronic device.

To solve the problems described above, in the present invention, an electronic device that receives and displays image data generated by a mobile terminal with a touch panel, and that can operate the mobile terminal, includes a conversion table storage unit storing conversion table information for converting a voice input item into a predetermined coordinate sequence. When the speech recognized by a voice recognition unit corresponds to a voice input item, the electronic device uses the conversion table information to generate the predetermined coordinate sequence from the voice input item and transmits it to the mobile terminal.

According to the present invention configured as described above, when a voice input item is recognized in the electronic device, that voice input item is converted into a predetermined coordinate sequence and transmitted to the mobile terminal. Based on that coordinate sequence, the mobile terminal can perform the same processing as if the positions corresponding to the coordinate sequence had been traced (by a flick, pinch, rotation, and so on) on its touch panel. A tracing operation on the touch panel of the mobile terminal can thus be performed based on voice recognition in the electronic device.

FIG. 1 is a block diagram showing an example of the functional configuration of an in-vehicle device according to an embodiment of the electronic device of the present invention.
FIG. 2 is a diagram showing an example of the conversion table information stored in the conversion table storage unit according to the present embodiment.
FIG. 3 is a flowchart showing an example of the operation of the in-vehicle device according to the present embodiment.
FIG. 4 is a diagram for explaining the operation of an in-vehicle device and a mobile terminal linked by MirrorLink.

Hereinafter, an embodiment of the present invention will be described with reference to the drawings. FIG. 1 is a block diagram showing an example of the functional configuration of an in-vehicle device according to an embodiment of the electronic device of the present invention. The in-vehicle device 10 of the present embodiment is linked by MirrorLink with a mobile terminal 20 that has a touch panel. The in-vehicle device 10 receives and displays image data generated by the mobile terminal 20, and an application installed on the mobile terminal 20 (a map application in the present embodiment) can be operated (map scrolling, enlargement/reduction, rotation) from the in-vehicle device 10.

As shown in FIG. 1, the in-vehicle device 10 of the present embodiment includes, as its functional configuration, an image data receiving unit 11, an image display control unit 12, a conversion table storage unit 13, a voice input unit 14, a voice recognition unit 15, an input item determination unit 16, and a control unit 17. A display 30 with a touch panel (hereinafter also referred to simply as the display 30) and a microphone 40 are either connected to the in-vehicle device 10 or built into it.

Each of the functional blocks 11, 12, and 14 to 17 can be implemented in hardware, in a DSP (Digital Signal Processor), or in software. When implemented in software, for example, each of these functional blocks is actually built around the CPU, RAM, ROM, and so on of a computer in the in-vehicle device 10, and is realized by running a program stored in a recording medium such as the RAM, the ROM, a hard disk, or a semiconductor memory.

The image data receiving unit 11 receives, from the mobile terminal 20, image data for the map image being displayed on the display 21 with a touch panel of the mobile terminal 20 (hereinafter also referred to simply as the display 21). The image display control unit 12 displays the map image on the display 30 based on the image data received by the image data receiving unit 11.

Note that the display 21 of the mobile terminal 20 is portrait-oriented, whereas the display 30 connected to the in-vehicle device 10 is landscape-oriented. When the mobile terminal 20 is connected to the in-vehicle device 10 by MirrorLink, it converts the portrait map image being displayed on the display 21 into a landscape map image suitable for the display 30 of the in-vehicle device 10, switches its own display 21 to the landscape map image, and then transmits the converted image data to the in-vehicle device 10. The image data receiving unit 11 of the in-vehicle device 10 receives this image data.

When the image data receiving unit 11 receives image data from the mobile terminal 20 and the image display control unit 12 displays the map image on the display 30 based on that image data, coordinate information indicating where the map image is displayed on the display 30 (for example, the coordinates of the four corners of the display 30) is returned from the in-vehicle device 10 to the mobile terminal 20. The mobile terminal 20 stores the coordinates of the four corners of the display 30 returned from the in-vehicle device 10 in association with the coordinates of the four corners of the display 21 (this stored information is hereinafter referred to as "coordinate association information").

This makes the following possible: coordinate information representing a touch position on the touch panel of the in-vehicle device 10 is transmitted from the in-vehicle device 10 to the mobile terminal 20, and the mobile terminal 20 uses the coordinate association information to convert it into coordinate information on its own touch panel. The mobile terminal 20 can then perform the same processing as if the same position as on the in-vehicle device 10 had been touched on the touch panel of the mobile terminal 20.
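The conversion based on the coordinate association information can be pictured as a linear rescaling between the two panels. The following is only a minimal sketch: the `Rect` type, the `map_point` function, and both panel resolutions are assumptions made for illustration and do not appear in the publication.

```python
from typing import NamedTuple, Tuple


class Rect(NamedTuple):
    """Display area described by its corner coordinates."""
    left: float
    top: float
    right: float
    bottom: float


def map_point(x: float, y: float, src: Rect, dst: Rect) -> Tuple[float, float]:
    """Map a point on the source panel to the matching point on the destination panel."""
    # Normalise the point to the range 0..1 within the source rectangle.
    u = (x - src.left) / (src.right - src.left)
    v = (y - src.top) / (src.bottom - src.top)
    # Scale the normalised point into the destination rectangle.
    return (dst.left + u * (dst.right - dst.left),
            dst.top + v * (dst.bottom - dst.top))


# Hypothetical corner coordinates for display 30 (in-vehicle) and display 21 (terminal).
DISPLAY_30 = Rect(0, 0, 800, 480)
DISPLAY_21 = Rect(0, 0, 1280, 720)

# A touch at the centre of display 30 lands at the centre of display 21.
print(map_point(400, 240, DISPLAY_30, DISPLAY_21))
```

A linear mapping suffices here only because both images are assumed to fill their panels in the same landscape orientation.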

The conversion table storage unit 13 stores, in advance, conversion table information for converting each of a plurality of voice input items (voice commands) that are to be recognized into a coordinate sequence on the touch panel of the display 30. In the present embodiment, the voice input items include voice commands relating to at least one of scrolling, enlargement/reduction, and rotation of the map image.

For example, for scrolling the map image, information is stored for converting voice commands such as "right scroll", "left scroll", "up scroll", and "down scroll" into their respective predetermined coordinate sequences. For enlarging and reducing the map image, information is stored for converting voice commands such as "enlarge" and "reduce" into their respective predetermined coordinate sequences. For rotating the map image, information is stored for converting voice commands such as "rotate right" and "rotate left" into their respective predetermined coordinate sequences.

FIG. 2 shows, as an example, conversion table information for converting the "right scroll" and "enlarge" voice commands into their respective predetermined coordinate sequences. As shown in FIG. 2, for the "right scroll" voice command, the conversion table storage unit 13 stores the coordinate sequence that would be obtained if a flick operation of a predetermined length were performed with one finger from right to left, from a predetermined start position to an end position on the touch panel. For the "enlarge" voice command, it stores the coordinate sequence that would be obtained if a pinch operation were performed in which two fingers are spread apart around a predetermined start position on the touch panel (for example, the center point of the touch panel).
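As an illustration of what such conversion table information might contain, the sketch below builds a one-finger flick and a two-finger pinch as lists of frames, each frame holding one point per finger. The helper names, the panel size of 800 x 480, the start and end positions, and the number of steps are all assumptions. The publication states only that predetermined sequences are stored, not how they are produced.

```python
def flick(start, end, steps=5):
    """One-finger trace: frames evenly spaced from start to end."""
    (x0, y0), (x1, y1) = start, end
    return [[(x0 + (x1 - x0) * i / steps, y0 + (y1 - y0) * i / steps)]
            for i in range(steps + 1)]


def pinch_out(center, start_gap, end_gap, steps=5):
    """Two-finger trace: fingers move apart horizontally around center."""
    cx, cy = center
    frames = []
    for i in range(steps + 1):
        half = (start_gap + (end_gap - start_gap) * i / steps) / 2
        frames.append([(cx - half, cy), (cx + half, cy)])
    return frames


# Hypothetical table for an 800 x 480 panel; keys are the recognized command text.
CONVERSION_TABLE = {
    "right scroll": flick((600, 240), (200, 240)),   # right-to-left flick
    "enlarge": pinch_out((400, 240), 40, 240),       # spread around the center point
}

for command, frames in CONVERSION_TABLE.items():
    print(command, frames[0], "...", frames[-1])
```

Precomputing the sequences once matches the description of storing the table in advance. The commands for the other scroll directions, reduction, and rotation would be added as further entries.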

The voice input unit 14 takes in the speech uttered by the user through the microphone 40. The voice recognition unit 15 recognizes the speech input by the voice input unit 14. Specifically, the voice recognition unit 15 has a voice recognition dictionary, determines which of the entries registered in the dictionary the input speech matches, and outputs the matching entry as, for example, text information.

The input item determination unit 16 refers to the conversion table information stored in the conversion table storage unit 13 and determines whether the speech recognized by the voice recognition unit 15 corresponds to a voice input item. That is, the input item determination unit 16 determines whether the text information of the recognized speech output from the voice recognition unit 15 is stored in the conversion table storage unit 13 as the text information of a voice command.

When the input item determination unit 16 determines that the speech recognized by the voice recognition unit 15 corresponds to a voice input item, the control unit 17 uses the conversion table information stored in the conversion table storage unit 13 to generate the predetermined coordinate sequence from that voice input item and transmits it to the mobile terminal 20.
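The determination and transmission steps amount to a table lookup followed by a send. The following minimal sketch assumes the recognizer yields plain text. The function name `handle_recognized_text`, the `send` callback, and the table contents are hypothetical placeholders.

```python
def handle_recognized_text(text, table, send):
    """Send the stored coordinate sequence if text is a registered voice input item.

    Returns True when a sequence was sent and False when the text is not in the table.
    """
    sequence = table.get(text)      # role of the input item determination unit 16
    if sequence is None:
        return False
    send(sequence)                  # role of the control unit 17
    return True


# Hypothetical table with a single, shortened coordinate sequence.
table = {"right scroll": [(600, 240), (400, 240), (200, 240)]}
sent = []                           # stands in for the link to the mobile terminal

print(handle_recognized_text("right scroll", table, sent.append))
print(handle_recognized_text("play music", table, sent.append))
print(sent)
```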

Based on the coordinate sequence sent from the in-vehicle device 10 by the control unit 17, the mobile terminal 20 performs the same processing as if the positions corresponding to that coordinate sequence had been traced (by a flick, pinch, rotation, and so on) on its touch panel. That is, using the coordinate association information described above, the mobile terminal 20 converts the coordinate sequence sent from the in-vehicle device 10 (a simulated operation coordinate sequence on the touch panel of the display 30) into a simulated operation coordinate sequence on its own touch panel, and performs the same processing as if the positions corresponding to the converted coordinate sequence had been traced.

FIG. 3 is a flowchart showing an example of the operation of the in-vehicle device 10 according to the present embodiment configured as described above. The processing of the flowchart in FIG. 3 starts when the in-vehicle device 10 and the mobile terminal 20 are connected and linked by MirrorLink. First, the image data receiving unit 11 receives, from the mobile terminal 20, image data for the map image being displayed on the display 21 of the mobile terminal 20 (step S1). The image display control unit 12 then displays the map image on the display 30 based on the image data received by the image data receiving unit 11 (step S2). After that, the in-vehicle device 10 returns to the mobile terminal 20 the coordinates of the four corners of the display 30 on which the map image is displayed (step S3).

Next, the voice input unit 14 determines whether speech uttered by the user has been input from the microphone 40 (step S4). If no speech has been input, the in-vehicle device 10 determines whether the MirrorLink link with the mobile terminal 20 has ended (step S5), and if it has, the processing of the flowchart in FIG. 3 ends. If the link has not ended, the processing returns to step S1. In this way, as the map image is updated on the mobile terminal 20, the map image displayed on the in-vehicle device 10 is updated accordingly.

If, on the other hand, speech has been input by the voice input unit 14, the voice recognition unit 15 recognizes that speech (step S6). The input item determination unit 16 then determines whether the speech recognized by the voice recognition unit 15 corresponds to a predetermined voice input item (step S7). If it is determined that the recognized speech does not correspond to a predetermined voice input item, the input item determination unit 16 displays on the display 30 an error message informing the user that the speech does not correspond to a voice input item (step S8). The processing then returns to step S1.

If, on the other hand, the input item determination unit 16 determines that the speech recognized by the voice recognition unit 15 corresponds to a voice input item, the control unit 17 uses the conversion table information stored in the conversion table storage unit 13 to generate the predetermined coordinate sequence from the voice input item and transmits it to the mobile terminal 20 (step S9). The processing then returns to step S1. The mobile terminal 20 thus executes the corresponding processing based on the coordinate sequence sent from the in-vehicle device 10, and the map image updated as a result is also displayed on the display 30.
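The control flow of steps S1 to S9 can be traced with a small simulation. This is a simplified sketch rather than the flowchart of FIG. 3 itself: the function `run_link`, the pairing of one utterance with each received frame, and the log format are assumptions, and running out of input stands in for the end of the link detected in step S5.

```python
def run_link(frames, utterances, table):
    """Simulate steps S1 to S9 over paired (frame, utterance) inputs and return a log."""
    log = []
    for frame, utterance in zip(frames, utterances):
        # S1 to S3: receive the image, display it, return the corner coordinates.
        log.append(("S1-S3", frame))
        if utterance is None:
            # S4: no speech input; S5: link still active, so go back to S1.
            continue
        if utterance not in table:
            # S6 and S7: recognized, but not a voice input item; S8: show an error.
            log.append(("S8", "error"))
            continue
        # S9: transmit the coordinate sequence for the voice input item.
        log.append(("S9", table[utterance]))
    return log


table = {"right scroll": "SEQ_RIGHT_SCROLL"}   # placeholder for a real sequence
for entry in run_link(["f1", "f2", "f3"], [None, "hello", "right scroll"], table):
    print(entry)
```

Every branch leads back to receiving the next frame, which mirrors the return to step S1 after steps S5, S8, and S9.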

As described in detail above, the in-vehicle device 10 of the present embodiment includes the conversion table storage unit 13, which stores conversion table information for converting voice input items (voice commands) for voice recognition into predetermined coordinate sequences. When the speech recognized by the voice recognition unit 15 corresponds to a voice input item, the in-vehicle device 10 uses the conversion table information to generate the predetermined coordinate sequence from that voice input item and transmits it to the mobile terminal 20.

According to the present embodiment configured in this way, when a voice input item is recognized in the in-vehicle device 10, that voice input item is converted into a predetermined coordinate sequence and transmitted to the mobile terminal 20. Based on that coordinate sequence, the mobile terminal 20 can perform the same processing as if the positions corresponding to the coordinate sequence had been traced (by a flick, pinch, rotation, and so on) on its touch panel. A tracing operation on the touch panel of the mobile terminal 20 can thus be performed based on voice recognition in the in-vehicle device 10.

In the above embodiment, when a word expressing speed (high speed or low speed) is uttered in addition to the operation to be performed, the time taken to transmit the coordinate sequence from the in-vehicle device 10 to the mobile terminal 20 may be made shorter or longer than usual. In this case, the voice input items include a first voice input item relating to the type of operation on the touch panel and a second voice input item relating to the speed of the operation on the touch panel. The conversion table storage unit 13 stores in advance conversion table information for converting each first voice input item into a coordinate sequence on the touch panel of the display 30.

The input item determination unit 16 determines whether the speech recognized by the voice recognition unit 15 corresponds to the first voice input item and whether it corresponds to the second voice input item. Specifically, the input item determination unit 16 determines whether the recognized speech corresponds to the first voice input item alone or to a combination of the first voice input item and the second voice input item.

When the input item determination unit 16 determines that the speech recognized by the voice recognition unit 15 corresponds to the first voice input item alone, the control unit 17 uses the conversion table information to generate the predetermined coordinate sequence from the first voice input item and transmits it to the mobile terminal 20 at a predetermined speed. This operation is the same as in the embodiment described above. For example, when the recognized speech is "right scroll", the control unit 17 generates the coordinate sequence corresponding to "right scroll" and transmits it to the mobile terminal 20 at the normal speed.

If, on the other hand, the input item determination unit 16 determines that the speech recognized by the voice recognition unit 15 corresponds to both the first voice input item and the second voice input item, the control unit 17 uses the conversion table information to generate the predetermined coordinate sequence from the first voice input item and transmits it to the mobile terminal 20 at a speed that depends on the second voice input item. For example, when the recognized speech is "high-speed right scroll", the control unit 17 generates the coordinate sequence corresponding to "right scroll" and transmits it to the mobile terminal 20 at a predetermined speed faster than normal (that is, with a shorter interval between the transmission of successive coordinates). When the recognized speech is "low-speed right scroll", the control unit 17 generates the coordinate sequence corresponding to "right scroll" and transmits it to the mobile terminal 20 at a predetermined speed slower than normal.
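The split into an operation type and a speed modifier might be sketched as follows. The normal interval of 20 ms, the scaling factors, and the assumption that the speed word precedes the operation word with a space between them are illustrative choices only. The publication states just that the interval between successive coordinates is shortened or lengthened.

```python
NORMAL_INTERVAL_MS = 20                      # assumed normal spacing between coordinates
SPEED_FACTORS = {"high-speed": 0.5,          # shorter interval: faster trace
                 "low-speed": 2.0}           # longer interval: slower trace


def parse_command(text, operations):
    """Split recognized text into (operation, interval_ms), or return None if unknown."""
    for prefix, factor in SPEED_FACTORS.items():
        if text.startswith(prefix + " "):
            operation = text[len(prefix) + 1:]
            if operation in operations:
                # First and second voice input items combined.
                return (operation, NORMAL_INTERVAL_MS * factor)
            return None
    if text in operations:
        # First voice input item alone: normal speed.
        return (text, float(NORMAL_INTERVAL_MS))
    return None


operations = {"right scroll", "enlarge"}
for spoken in ("right scroll", "high-speed right scroll", "low-speed right scroll"):
    print(spoken, "->", parse_command(spoken, operations))
```

Because only the interval changes, the same stored coordinate sequence serves all three speeds.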

When the coordinate sequence corresponding to "right scroll" is transmitted to the mobile terminal 20 at a speed faster than normal, the mobile terminal 20 scrolls the map image to the right faster than normal. As a result, the map image displayed on the in-vehicle device 10 also scrolls to the right faster than normal. Similarly, when the coordinate sequence corresponding to "right scroll" is transmitted to the mobile terminal 20 at a speed slower than normal, the mobile terminal 20 scrolls the map image to the right more slowly than normal, and the map image displayed on the in-vehicle device 10 also scrolls to the right more slowly than normal.
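The speed variation described above can be illustrated with a minimal sketch. It assumes that an utterance is an optional speed word followed by an operation name, and that "speed" is realized purely as the time interval between successive coordinates. The table contents, the interval values, and the names `CONVERSION_TABLE`, `SPEED_INTERVALS_MS`, and `plan_transmission` are hypothetical and do not come from the specification. Instead of actually sending data, the function returns a schedule of send times so the effect of the speed word can be inspected.

```python
from typing import Dict, List, Optional, Tuple

Coord = Tuple[int, int]

# Hypothetical conversion table: first voice input item -> coordinate sequence.
# The coordinates are placeholders, not values from the specification.
CONVERSION_TABLE: Dict[str, List[Coord]] = {
    "right scroll": [(600, 240), (500, 240), (400, 240), (300, 240), (200, 240)],
}

# Hypothetical intervals (ms) between successive coordinates.
NORMAL_INTERVAL_MS = 20
SPEED_INTERVALS_MS: Dict[str, int] = {"high-speed": 10, "low-speed": 40}


def plan_transmission(utterance: str) -> Optional[List[Tuple[int, Coord]]]:
    """Return (send time in ms, coordinate) pairs, or None if nothing matches."""
    interval = NORMAL_INTERVAL_MS
    operation = utterance
    # Second voice input item (speed): an optional leading modifier word.
    for speed_word, speed_interval in SPEED_INTERVALS_MS.items():
        prefix = speed_word + " "
        if utterance.startswith(prefix):
            interval = speed_interval
            operation = utterance[len(prefix):]
            break
    # First voice input item (operation type): looked up in the table.
    coords = CONVERSION_TABLE.get(operation)
    if coords is None:
        return None
    # The same coordinates are used at every speed; only the spacing changes.
    return [(i * interval, coord) for i, coord in enumerate(coords)]


if __name__ == "__main__":
    for phrase in ("right scroll", "high-speed right scroll", "low-speed right scroll"):
        print(phrase, plan_transmission(phrase))
```

In this sketch the three utterances yield identical coordinates, and only the spacing of the send times differs, which mirrors the statement that the same "right scroll" coordinate sequence is sent at a different speed.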

Also, in the above embodiment, when a word expressing a count or a number of steps is uttered in addition to the operation content, the coordinate sequence may be transmitted from the in-vehicle device 10 to the mobile terminal 20 multiple times according to that count or number of steps. In this case, the voice input items include a first voice input item relating to the type of operation on the touch panel and a third voice input item relating to the number of operations on the touch panel. The conversion table storage unit 13 stores in advance conversion table information for converting each first voice input item into a coordinate sequence on the touch panel of the touch-panel display 30.

The input item determination unit 16 determines whether the uttered voice recognized by the voice recognition unit 15 corresponds to the first voice input item and whether it corresponds to the third voice input item. Specifically, the input item determination unit 16 determines whether the uttered voice recognized by the voice recognition unit 15 corresponds only to the first voice input item, or to a combination of the first voice input item and the third voice input item.

When the input item determination unit 16 determines that the uttered voice recognized by the voice recognition unit 15 corresponds only to the first voice input item, the control unit 17 uses the conversion table information to generate a predetermined coordinate sequence from the first voice input item and transmits it to the mobile terminal 20 once. This operation is the same as in the embodiment described above. For example, when the uttered voice recognized by the voice recognition unit 15 is "enlarge", the control unit 17 generates the coordinate sequence corresponding to "enlarge" and transmits it to the mobile terminal 20 only once.

On the other hand, when the input item determination unit 16 determines that the uttered voice recognized by the voice recognition unit 15 corresponds to both the first voice input item and the third voice input item, the control unit 17 uses the conversion table information to generate a predetermined coordinate sequence from the first voice input item and transmits it to the mobile terminal 20 the number of times corresponding to the third voice input item.

For example, when the uttered voice recognized by the voice recognition unit 15 is "two-step enlarge", the control unit 17 generates the coordinate sequence corresponding to "enlarge" and transmits it to the mobile terminal 20 twice. When the coordinate sequence corresponding to "enlarge" is transmitted to the mobile terminal 20 twice, the mobile terminal 20 enlarges the map image by two steps. As a result, the map image displayed on the in-vehicle device 10 is also enlarged by two steps.
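The repeat-count variation can be sketched in the same spirit. The sketch assumes that a count word precedes the operation name and that the pinch gesture for "enlarge" is stored as a list of frames, each holding one touch position per finger. The frame values and the names `CONVERSION_TABLE`, `COUNT_WORDS`, and `transmissions_for` are hypothetical illustrations, not part of the specification.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]
Frame = Tuple[Point, ...]  # one touch position per finger

# Hypothetical conversion table: first voice input item -> coordinate sequence.
# "enlarge" is stored as a two-finger pinch-out with placeholder coordinates.
CONVERSION_TABLE: Dict[str, List[Frame]] = {
    "enlarge": [
        ((390, 240), (410, 240)),
        ((350, 240), (450, 240)),
        ((300, 240), (500, 240)),
    ],
}

# Hypothetical third voice input items (count words).
COUNT_WORDS: Dict[str, int] = {"two-step": 2, "three-step": 3}


def transmissions_for(utterance: str) -> List[List[Frame]]:
    """Return one coordinate sequence per transmission (empty if no match)."""
    count = 1  # first voice input item alone: transmit once
    operation = utterance
    for word, n in COUNT_WORDS.items():
        prefix = word + " "
        if utterance.startswith(prefix):
            count = n
            operation = utterance[len(prefix):]
            break
    sequence = CONVERSION_TABLE.get(operation)
    if sequence is None:
        return []
    # The same sequence is repeated; each repetition is a separate transmission.
    return [list(sequence) for _ in range(count)]


if __name__ == "__main__":
    print(len(transmissions_for("enlarge")))           # one transmission
    print(len(transmissions_for("two-step enlarge")))  # two transmissions
```

Keeping the count outside the conversion table means the table needs only one entry per operation type, which matches the description that only the first voice input items are converted into coordinate sequences.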

In the above embodiment, a map application has been described as an example of the mobile application installed on the mobile terminal 20, but the mobile applications to which the present invention can be applied are not limited to map applications. That is, the present invention can be applied to any mobile application that has a function of executing predetermined processing in response to a tracing operation such as a flick, pinch, or rotation.

Also, in the above embodiment, the in-vehicle device 10 has been described as an example of the electronic device, but the electronic devices to which the present invention can be applied are not limited to the in-vehicle device 10. That is, the present invention can be applied to any electronic device that supports MirrorLink or a similar communication control technology.

Also, in the above embodiment, an example has been described in which "enlarge" and "reduce" are used as the voice input items (voice commands) for enlarging and reducing the map image, and the map image is enlarged or reduced about a single fixed point (for example, the center of the screen). However, the position of the center point of the enlargement or reduction may be used as a fourth voice input command. For example, when the uttered voice recognized by the voice recognition unit 15 is "upper-right enlarge", the control unit 17 generates the coordinate sequence that would be obtained if a pinch operation spreading two fingers apart were performed about a predetermined point within the upper-right area of the touch-panel display 30, and transmits it to the mobile terminal 20.
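The position variation can be sketched as follows. The sketch assumes an 800 x 480 touch panel, takes the middle of each screen quadrant as the "predetermined point" of that area, and models the pinch as two fingers moving apart horizontally. The screen size, area names, gap values, and the functions `pinch_out` and `enlarge_sequence` are hypothetical and are not taken from the specification.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[int, int]
Frame = Tuple[Point, Point]  # touch positions of the two fingers

# Hypothetical touch panel size in pixels.
SCREEN_W, SCREEN_H = 800, 480

# Hypothetical fourth voice input items: area word -> pinch center.
AREA_CENTERS: Dict[str, Point] = {
    "upper-left": (SCREEN_W // 4, SCREEN_H // 4),
    "upper-right": (3 * SCREEN_W // 4, SCREEN_H // 4),
    "lower-left": (SCREEN_W // 4, 3 * SCREEN_H // 4),
    "lower-right": (3 * SCREEN_W // 4, 3 * SCREEN_H // 4),
}
DEFAULT_CENTER: Point = (SCREEN_W // 2, SCREEN_H // 2)  # fixed point: screen center


def pinch_out(center: Point, start_gap: int = 20, end_gap: int = 200,
              steps: int = 4) -> List[Frame]:
    """Two fingers moving apart horizontally, symmetric about `center`."""
    cx, cy = center
    frames: List[Frame] = []
    for i in range(steps + 1):
        half = (start_gap + (end_gap - start_gap) * i // steps) // 2
        frames.append(((cx - half, cy), (cx + half, cy)))
    return frames


def enlarge_sequence(utterance: str) -> Optional[List[Frame]]:
    """Handle 'enlarge' and '<area> enlarge'; return None for anything else."""
    if utterance == "enlarge":
        return pinch_out(DEFAULT_CENTER)
    for area, center in AREA_CENTERS.items():
        if utterance == f"{area} enlarge":
            return pinch_out(center)
    return None


if __name__ == "__main__":
    print(enlarge_sequence("enlarge"))
    print(enlarge_sequence("upper-right enlarge"))
```

Because the center is a parameter of the generator rather than part of a stored sequence, a single pinch definition can serve every area word; a real conversion table could equally store one precomputed sequence per area.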

In addition, each of the above embodiments is merely one concrete example of carrying out the present invention, and the technical scope of the present invention must not be interpreted restrictively on the basis of them. That is, the present invention can be carried out in various forms without departing from its gist or its main features.

DESCRIPTION OF SYMBOLS
10 In-vehicle device (electronic device)
11 Image data receiving unit
12 Image display control unit
13 Conversion table storage unit
14 Voice input unit
15 Voice recognition unit
16 Input item determination unit
17 Control unit
20 Mobile terminal

Claims

An electronic device configured to receive and display image data generated by a mobile terminal with a touch panel and to operate the mobile terminal, the electronic device comprising:
a conversion table storage unit that stores conversion table information for converting a voice input item into a predetermined coordinate sequence;
a voice input unit that inputs voice uttered by a user;
a voice recognition unit that recognizes the uttered voice input through the voice input unit;
an input item determination unit that determines whether the uttered voice recognized by the voice recognition unit corresponds to the voice input item; and
a control unit that, when the input item determination unit determines that the uttered voice recognized by the voice recognition unit corresponds to the voice input item, generates the predetermined coordinate sequence from the voice input item using the conversion table information and transmits it to the mobile terminal.

The electronic device according to claim 1, wherein
the voice input items include a first voice input item relating to the type of operation on the touch panel and a second voice input item relating to the speed of operation on the touch panel,
the input item determination unit determines whether the uttered voice recognized by the voice recognition unit corresponds to the first voice input item and whether it corresponds to the second voice input item, and
the control unit, when the input item determination unit determines that the uttered voice recognized by the voice recognition unit corresponds only to the first voice input item, generates the predetermined coordinate sequence from the first voice input item using the conversion table information and transmits it to the mobile terminal at a predetermined speed, and, when the input item determination unit determines that the uttered voice recognized by the voice recognition unit corresponds to both the first voice input item and the second voice input item, generates the predetermined coordinate sequence from the first voice input item using the conversion table information and transmits it to the mobile terminal at a speed corresponding to the second voice input item.

The electronic device according to claim 1, wherein
the voice input items include a first voice input item relating to the type of operation on the touch panel and a third voice input item relating to the number of operations on the touch panel,
the input item determination unit determines whether the uttered voice recognized by the voice recognition unit corresponds to the first voice input item and whether it corresponds to the third voice input item, and
the control unit, when the input item determination unit determines that the uttered voice recognized by the voice recognition unit corresponds only to the first voice input item, generates the predetermined coordinate sequence from the first voice input item using the conversion table information and transmits it to the mobile terminal once, and, when the input item determination unit determines that the uttered voice recognized by the voice recognition unit corresponds to both the first voice input item and the third voice input item, generates the predetermined coordinate sequence from the first voice input item using the conversion table information and transmits it to the mobile terminal the number of times corresponding to the third voice input item.

The electronic device according to claim 1, wherein
the voice input items include a first voice input item relating to the type of operation on the touch panel and a fourth voice input item relating to the start position of the operation on the touch panel,
the input item determination unit determines whether the uttered voice recognized by the voice recognition unit corresponds to the first voice input item and whether it corresponds to the fourth voice input item, and
the control unit, when the input item determination unit determines that the uttered voice recognized by the voice recognition unit corresponds only to the first voice input item, uses the conversion table information to generate, from the first voice input item, the coordinate sequence obtained when the operation is started from a predetermined start position and transmits it to the mobile terminal, and, when the input item determination unit determines that the uttered voice recognized by the voice recognition unit corresponds to both the first voice input item and the fourth voice input item, uses the conversion table information to generate, from the first voice input item, the coordinate sequence obtained when the operation is started from the start position corresponding to the fourth voice input item and transmits it to the mobile terminal.

The electronic device according to claim 1, wherein the voice input item includes at least one of scrolling, enlargement/reduction, and rotation of an image.

The electronic device according to claim 2, wherein the first voice input item includes at least one of scrolling, enlargement/reduction, and rotation of an image.

A voice recognition operation method for operating a mobile terminal from an electronic device using voice recognition, in a system configured so that an image displayed on the screen of the mobile terminal with a touch panel is displayed on the electronic device and the mobile terminal can be operated from the electronic device, the method comprising:
a first step in which a voice input unit of the electronic device inputs voice uttered by a user;
a second step in which a voice recognition unit of the electronic device recognizes the uttered voice input through the voice input unit;
a third step in which an input item determination unit of the electronic device determines whether the uttered voice recognized by the voice recognition unit corresponds to a predetermined voice input item; and
a fourth step in which a control unit of the electronic device, when the input item determination unit determines that the uttered voice recognized by the voice recognition unit corresponds to the voice input item, generates a predetermined coordinate sequence from the voice input item using conversion table information for converting the voice input item into the predetermined coordinate sequence, and transmits the generated coordinate sequence to the mobile terminal.

An in-vehicle system configured to display, on an in-vehicle device, an image displayed on the screen of a mobile terminal with a touch panel and to operate the mobile terminal from the in-vehicle device, wherein
the in-vehicle device comprises:
a conversion table storage unit that stores conversion table information for converting a voice input item into a predetermined coordinate sequence;
a voice input unit that inputs voice uttered by a user;
a voice recognition unit that recognizes the uttered voice input through the voice input unit;
an input item determination unit that determines whether the uttered voice recognized by the voice recognition unit corresponds to the voice input item; and
a control unit that, when the input item determination unit determines that the uttered voice recognized by the voice recognition unit corresponds to the voice input item, generates the predetermined coordinate sequence from the voice input item using the conversion table information and transmits it to the mobile terminal, and
the mobile terminal, based on the coordinate sequence transmitted from the in-vehicle device by the control unit, performs the same processing as when the positions corresponding to the coordinate sequence are traced on the touch panel.