WO2023062816A1 - Content output device, content output method, program, and storage medium - Google Patents

Content output device, content output method, program, and storage medium

Info

Publication number
WO2023062816A1
WO2023062816A1 (PCT/JP2021/038223)
Authority
WO
WIPO (PCT)
Prior art keywords
content
output device
variable
information
voice
Prior art date
Application number
PCT/JP2021/038223
Other languages
French (fr)
Japanese (ja)
Inventor
敦博 山中
高志 飯澤
敬太 倉持
勇志 角田
将士 高野
敬介 栃原
航太朗 宮部
Original Assignee
パイオニア株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by パイオニア株式会社 filed Critical パイオニア株式会社
Priority to PCT/JP2021/038223 priority Critical patent/WO2023062816A1/en
Publication of WO2023062816A1 publication Critical patent/WO2023062816A1/en

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/26Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
    • G01C21/34Route searching; Route guidance
    • G01C21/36Input/output arrangements for on-board computers
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00Speech synthesis; Text to speech systems
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00Speech synthesis; Text to speech systems
    • G10L13/08Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G10L13/10Prosody rules derived from text; Stress or intonation

Definitions

  • the present invention relates to technology that can be used in content output.
  • Conventionally known technology is to output content corresponding to the information to the user based on various information obtained through sensors and the like.
  • Specifically, for example, Patent Document 1 discloses a technique for outputting a greeting voice when a passenger gets in or out of a vehicle, based on information obtained through a vibration sensor or the like that detects opening and closing of the vehicle door.
  • However, Patent Document 1 does not disclose a method of outputting content while distinguishing the portion that changes according to the situation in which the content is output from the remaining portion of the content.
  • Therefore, with the configuration disclosed in Patent Document 1, the user to whom the content is output may bear an unnecessary mental burden in recognizing important information that the content may contain.
  • The present invention has been made to solve the above problem, and its main object is to provide a content output device that makes important information that may be contained in content easier to recognize than before.
  • The claimed invention is a content output device comprising a content acquisition unit that acquires, according to the driving situation of a vehicle, content including a variable part and tag information for outputting voice while emphasizing the variable part, and an output unit that outputs the content.
  • The claimed invention is also a content output method in which content including a variable part and tag information for outputting voice while emphasizing the variable part is acquired according to the driving situation of a vehicle, and the content is output.
  • The claimed invention is also a program executed by a content output device provided with a computer; the program causes the computer to function as a content acquisition unit that acquires, according to the driving situation of the vehicle, content including a variable part and tag information for outputting voice while emphasizing the variable part, and as an output unit that outputs the content.
  • FIG. 1 is a diagram showing a configuration example of an audio output system according to an embodiment;
  • FIG. 2 is a block diagram showing a schematic configuration of the audio output device;
  • FIG. 3 is a diagram showing an example of a schematic configuration of the server device;
  • FIG. 4 is a diagram for explaining the data structure of text data stored in the server device;
  • FIG. 5 is a diagram showing a specific example of text data stored in the server device;
  • FIG. 6 is a flowchart for explaining processing performed in the server device.
  • In one preferred embodiment of the present invention, a content output device includes a content acquisition unit that acquires, according to the driving situation of the vehicle, content including a variable part and tag information for outputting voice while emphasizing the variable part, and an output unit that outputs the content.
  • the above content output device includes a content acquisition unit and an output unit.
  • the content acquisition unit acquires content including a variable part and tag information for outputting voice while emphasizing the variable part according to the driving situation of the vehicle.
  • the output unit outputs the content. This makes it easier to recognize important information that may be included in content than in the past.
  • In one aspect, the tag information contains information for setting at least one of the volume, pitch, and speed used when outputting the variable portion as voice to a value different from the corresponding setting used when outputting the fixed portion included in the content as voice.
  • the tag information includes information for setting the volume setting value when outputting the variable portion as sound to be higher than the volume setting value of the fixed portion.
  • the tag information includes information for setting the pitch setting value when outputting the variable portion as sound higher than the pitch setting value of the fixed portion.
  • the tag information includes information for making the set value of the speed when outputting the variable portion slower than the set value of the speed of the fixed portion.
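The three settings above (louder, higher-pitched, slower for the variable portion) resemble SSML-style prosody markup. The following sketch shows one way such tag information might look; the tag name, attribute values, and helper functions are illustrative assumptions, not the patent's actual tag format.

```python
# Hypothetical sketch: wrap a variable part in prosody-style tag information
# so a TTS engine would render it louder, higher-pitched, and slower than
# the surrounding fixed parts. All tag names and values are assumptions.

def emphasize_variable(text: str) -> str:
    """Attach tag information to a variable part: louder, higher, slower."""
    return ('<prosody volume="+6dB" pitch="+15%" rate="80%">'
            + text + '</prosody>')

def build_tagged_content(fixed_before: str, variable: str, fixed_after: str) -> str:
    # Fixed parts stay untagged, so they are spoken at the default settings.
    return fixed_before + emphasize_variable(variable) + fixed_after

print(build_tagged_content("The required time is ", "50 minutes shorter", "."))
```

Keeping the emphasis settings in the tags, rather than in the fixed text, lets the same template be reused with different variable values while the emphasis travels with the variable part.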
  • In one preferred embodiment, a content output method acquires, according to the driving situation of a vehicle, content including a variable part and tag information for outputting voice while emphasizing the variable part, and outputs the content. This makes important information that may be included in content easier to recognize than before.
  • In one preferred embodiment, a program executed by a content output device provided with a computer causes the computer to function as a content acquisition unit that acquires, according to the driving situation of the vehicle, content including a variable part and tag information for outputting voice while emphasizing the variable part, and as an output unit that outputs the content.
  • This program can be stored in a storage medium and used.
  • FIG. 1 is a diagram illustrating a configuration example of an audio output system according to an embodiment.
  • a voice output system 1 according to this embodiment includes a voice output device 100 and a server device 200 .
  • the audio output device 100 is mounted on the vehicle Ve.
  • the server device 200 communicates with a plurality of audio output devices 100 mounted on a plurality of vehicles Ve.
  • the voice output device 100 basically performs route search processing, route guidance processing, etc. for the user who is a passenger of the vehicle Ve. For example, when a destination or the like is input by the user, the voice output device 100 transmits an upload signal S1 including position information of the vehicle Ve and information on the designated destination to the server device 200 . Server device 200 calculates the route to the destination by referring to the map data, and transmits control signal S2 indicating the route to the destination to audio output device 100 . The voice output device 100 provides route guidance to the user by voice output based on the received control signal S2.
  • the voice output device 100 provides various types of information to the user through interaction with the user.
  • the audio output device 100 supplies the server device 200 with an upload signal S1 including information indicating the content or type of the information request and information about the running state of the vehicle Ve.
  • the server device 200 acquires and generates information requested by the user, and transmits it to the audio output device 100 as a control signal S2.
  • the audio output device 100 provides the received information to the user by audio output.
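The exchange described above can be pictured as two message shapes: the upload signal S1 from the device and the control signal S2 from the server. The field names below are assumptions made only for this sketch; the patent does not specify the message format.

```python
# Illustrative shapes for the upload signal S1 and control signal S2
# exchanged between the audio output device 100 and the server device 200.
# Field names and the route representation are assumptions.

def make_upload_signal_s1(position, destination, info_request=None):
    """S1: sent by the audio output device (position, destination, request)."""
    return {"position": position, "destination": destination,
            "info_request": info_request}

def make_control_signal_s2(route=None, content=None):
    """S2: returned by the server device (guidance route and/or content)."""
    return {"route": route, "content": content}

s1 = make_upload_signal_s1((35.93, 139.49), "Kawagoe Station")
s2 = make_control_signal_s2(route=["node_1", "node_7", "node_12"])
```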
  • the voice output device 100 moves together with the vehicle Ve and performs route guidance mainly by voice so that the vehicle Ve travels along the guidance route.
  • Route guidance based mainly on voice refers to route guidance in which the user can grasp the information necessary for driving the vehicle Ve along the guidance route from voice alone; it does not exclude the voice output device 100 auxiliarily displaying a map of the area around the current position.
  • the voice output device 100 outputs at least various information related to driving, such as points on the route that require guidance (also referred to as “guidance points”), by voice.
  • the guidance point corresponds to, for example, an intersection at which the vehicle Ve turns right or left, or other passing points important for the vehicle Ve to travel along the guidance route.
  • the voice output device 100 provides voice guidance regarding guidance points such as, for example, the distance from the vehicle Ve to the next guidance point and the traveling direction at the guidance point.
  • the voice regarding the guidance for the guidance route will also be referred to as "route voice guidance”.
  • the audio output device 100 is installed, for example, on the upper part of the windshield of the vehicle Ve or on the dashboard. Note that the audio output device 100 may be incorporated in the vehicle Ve.
  • FIG. 2 is a block diagram showing a schematic configuration of the audio output device 100.
  • The audio output device 100 mainly includes a communication unit 111, a storage unit 112, an input unit 113, a control unit 114, a sensor group 115, a display unit 116, a microphone 117, a speaker 118, an exterior camera 119, and an in-vehicle camera 120.
  • Each element in the audio output device 100 is interconnected via a bus line 110 .
  • the communication unit 111 performs data communication with the server device 200 under the control of the control unit 114 .
  • the communication unit 111 may receive, for example, map data for updating a map DB (DataBase) 4 to be described later from the server device 200 .
  • the storage unit 112 is composed of various memories such as RAM (Random Access Memory), ROM (Read Only Memory), and non-volatile memory (including hard disk drive, flash memory, etc.).
  • the storage unit 112 stores a program for the audio output device 100 to execute predetermined processing.
  • the above programs may include an application program for providing route guidance by voice, an application program for playing back music, an application program for outputting content other than music (such as television), and the like.
  • Storage unit 112 is also used as a working memory for control unit 114 . Note that the program executed by the audio output device 100 may be stored in a storage medium other than the storage unit 112 .
  • the storage unit 112 also stores a map database (hereinafter, the database is referred to as "DB") 4. Various data required for route guidance are recorded in the map DB 4 .
  • the map DB 4 stores, for example, road data representing a road network by a combination of nodes and links, and facility data indicating facilities that are candidates for destinations, stop-off points, or landmarks.
  • the map DB 4 may be updated based on the map information received by the communication section 111 from the map management server under the control of the control section 114 .
  • the input unit 113 is a button, touch panel, remote controller, etc. for user operation.
  • the display unit 116 is a display or the like that displays based on the control of the control unit 114 .
  • the microphone 117 collects sounds inside the vehicle Ve, particularly the driver's utterances.
  • a speaker 118 outputs audio for route guidance to the driver or the like.
  • the sensor group 115 includes an external sensor 121 and an internal sensor 122 .
  • the external sensor 121 is, for example, one or more sensors for recognizing the surrounding environment of the vehicle Ve, such as a lidar, radar, ultrasonic sensor, infrared sensor, and sonar.
  • the internal sensor 122 is a sensor that performs positioning of the vehicle Ve, and is, for example, a GNSS (Global Navigation Satellite System) receiver, a gyro sensor, an IMU (Inertial Measurement Unit), a vehicle speed sensor, or a combination thereof.
  • the sensor group 115 may have a sensor that allows the control unit 114 to directly or indirectly derive the position of the vehicle Ve from the output of the sensor group 115 (that is, by performing estimation processing).
  • the vehicle exterior camera 119 is a camera that captures the exterior of the vehicle Ve.
  • The exterior camera 119 may be only a front camera that captures the front of the vehicle, or may include a rear camera that captures the rear of the vehicle in addition to the front camera.
  • the in-vehicle camera 120 is a camera for photographing the interior of the vehicle Ve, and is provided at a position capable of photographing at least the vicinity of the driver's seat.
  • The control unit 114 includes a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), etc., and controls the audio output device 100 as a whole. For example, the control unit 114 estimates the position (including the traveling direction) of the vehicle Ve based on the outputs of one or more sensors in the sensor group 115. Further, when a destination is specified by the input unit 113 or the microphone 117, the control unit 114 generates route information indicating a guidance route to the destination and provides route guidance based on the position information and the map DB 4. In this case, the control unit 114 causes the speaker 118 to output route voice guidance. Further, the control unit 114 controls the display unit 116 to display information about the music being played, video content, a map of the vicinity of the current position, or the like.
  • control unit 114 is not limited to being implemented by program-based software, and may be implemented by any combination of hardware, firmware, and software. Also, the processing executed by the control unit 114 may be implemented using a user-programmable integrated circuit such as an FPGA (field-programmable gate array) or a microcomputer. In this case, this integrated circuit may be used to implement the program executed by the control unit 114 in this embodiment. Thus, the control unit 114 may be realized by hardware other than the processor.
  • the configuration of the audio output device 100 shown in FIG. 2 is an example, and various changes may be made to the configuration shown in FIG.
  • the control unit 114 may receive information necessary for route guidance from the server device 200 via the communication unit 111 .
  • The audio output device 100 may be connected, electrically or by known communication means, to an audio output unit configured separately from the audio output device 100, and the audio may be output from that audio output unit.
  • the audio output unit may be a speaker provided in the vehicle Ve.
  • the audio output device 100 does not have to include the display section 116 .
  • The audio output device 100 need not perform any display-related control itself; such control may be executed by a separate device.
  • The audio output device 100 may acquire, from the vehicle Ve, information output by sensors installed in the vehicle Ve, based on a communication protocol such as CAN (Controller Area Network).
  • the server device 200 generates route information indicating a guidance route that the vehicle Ve should travel based on the upload signal S1 including the destination and the like received from the voice output device 100 .
  • the server device 200 then generates a control signal S2 relating to information output in response to the user's information request based on the user's information request indicated by the upload signal S1 transmitted by the audio output device 100 and the running state of the vehicle Ve.
  • The server device 200 then transmits the generated control signal S2 to the audio output device 100.
  • the server device 200 generates content for providing information to the user of the vehicle Ve and for interacting with the user, and transmits the content to the audio output device 100 .
  • the provision of information to the user is primarily a push-type information provision that is triggered by the server device 200 when the vehicle Ve reaches a predetermined driving condition.
  • the dialog with the user is basically a pull-type dialog that starts with a question or inquiry from the user. However, interaction with the user may start with push-type content provision.
  • FIG. 3 is a diagram showing an example of a schematic configuration of the server device 200.
  • the server device 200 mainly has a communication section 211 , a storage section 212 and a control section 214 .
  • Each element in the server device 200 is interconnected via a bus line 210 .
  • the communication unit 211 performs data communication with an external device such as the audio output device 100 under the control of the control unit 214 .
  • The storage unit 212 is composed of various types of memory such as RAM, ROM, and nonvolatile memory (including hard disk drives, flash memory, etc.). The storage unit 212 stores a program for the server device 200 to execute predetermined processing.
  • the control unit 214 includes a CPU, GPU, etc., and controls the server device 200 as a whole. Further, the control unit 214 operates together with the audio output device 100 by executing a program stored in the storage unit 212, and executes route guidance processing, information provision processing, and the like for the user. For example, based on the upload signal S1 received from the audio output device 100 via the communication unit 211, the control unit 214 generates route information indicating a guidance route or a control signal S2 relating to information output in response to a user's information request. Then, the control unit 214 transmits the generated control signal S2 to the audio output device 100 through the communication unit 211 .
  • push-type content provision means that when the vehicle Ve is in a predetermined driving situation, the audio output device 100 outputs content related to the driving situation to the user by voice. Specifically, the voice output device 100 acquires the driving situation information indicating the driving situation of the vehicle Ve based on the output of the sensor group 115 as described above, and transmits it to the server device 200 .
  • the server device 200 stores table data for providing push-type content in the storage unit 212 .
  • The server device 200 refers to the table data, and when the driving situation information received from the voice output device 100 mounted on the vehicle Ve matches a trigger condition defined in the table data, the server device 200 acquires the content for output generated from the text data corresponding to that trigger condition and transmits it to the audio output device 100.
  • the audio output device 100 audio-outputs the content for output received from the server device 200 . In this way, the content corresponding to the driving situation of the vehicle Ve is output to the user by voice.
  • The driving situation information may include at least one piece of information that can be acquired based on the functions of the units of the audio output device 100, such as the position of the vehicle Ve, the direction of the vehicle, traffic information around the position of the vehicle Ve (including speed regulations and congestion information), the current time, and the destination. The driving situation information may also include any of the sound (excluding the user's speech) obtained by the microphone 117, an image captured by the exterior camera 119, and an image captured by the in-vehicle camera 120. The driving situation information may further include information received from the server device 200 through the communication unit 111.
  • FIG. 4 is a diagram for explaining the data structure of text data stored in the server device.
  • the storage unit 212 of the server device 200 stores, for example, text data TX having a data structure as shown in FIG.
  • The text data TX has fixed parts, in which predetermined wording is maintained regardless of the driving situation indicated by the driving situation information, and variable parts, in which the wording changes according to that driving situation.
  • Specifically, the text data TX includes three fixed parts FD corresponding to the words "required time", "distance", and "will be", a variable part VDA corresponding to the words "50 minutes shorter", and a variable part VDB corresponding to the words "10 km shorter". The variable part VDA is arranged sandwiched between the tag information TGA and TGB, and the variable part VDB is arranged sandwiched between the tag information TGC and TGD.
  • The tag information TGA to TGD includes setting information for voice output while emphasizing the variable parts VDA and VDB.
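The layout of text data TX just described can be pictured as a flat sequence of fixed parts and tagged variable parts. The sketch below is a reconstruction from the description of FIG. 4; the literal strings and the tuple representation are assumptions for illustration only.

```python
# Reconstructed layout of text data TX (FIG. 4): fixed parts FD interleaved
# with variable parts VDA/VDB, each variable part sandwiched between opening
# and closing tag information (TGA/TGB and TGC/TGD).
text_data_tx = [
    ("fixed",     "required time"),
    ("tag_open",  "TGA"),
    ("variable",  "50 minutes shorter"),   # VDA, emphasized on voice output
    ("tag_close", "TGB"),
    ("fixed",     "distance"),
    ("tag_open",  "TGC"),
    ("variable",  "10 km shorter"),        # VDB, emphasized on voice output
    ("tag_close", "TGD"),
    ("fixed",     "will be"),
]

# The renderer can then tell, for each piece of text, whether it should be
# spoken at default settings (fixed) or with the tag's emphasis (variable).
variable_parts = [text for kind, text in text_data_tx if kind == "variable"]
print(variable_parts)
```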
  • FIG. 5 is a diagram showing a specific example of text data stored in the server device.
  • Text data having a data structure similar to that of text data TX includes, for example, the data shown in FIG. Note that in the text data of FIG. 5, description of the tag information shown in the description of the text data TX is omitted for convenience of illustration.
  • [Variable-A] corresponds to the variable part, and the other part corresponds to the fixed part. Therefore, for example, if the driving situation information includes information indicating that the set value of the volume of the speaker 118 is 8, the text data TXA in which [Variable-A] is replaced with "8" is obtained. Further, according to such text data TXA, the portion of "8" is output as voice while being emphasized. Note that [Variable-A] may be replaced with any numerical value as long as it can be included in the driving situation information.
  • [Variable-B] corresponds to the variable part, and the other part corresponds to the fixed part. Therefore, for example, if the driving status information includes information indicating that the current location of the vehicle Ve is Kawagoe City, Saitama Prefecture, then [Variable-B] is replaced with "Kawagoe City, Saitama Prefecture". Text data TXB is obtained. Further, according to such text data TXB, the portion of "Kawagoe City, Saitama Prefecture" is output as voice while being emphasized. Note that [Variable-B] may be replaced with any place name as long as it can be included in the driving situation information.
  • [Variable-C] corresponds to the variable part, and the rest corresponds to the fixed part. Therefore, for example, if the driving situation information includes information indicating that the reservation at restaurant R, where a passenger of the vehicle Ve wishes to stop, has succeeded, text data TXC is acquired in which [Variable-C] is replaced with "restaurant R". According to such text data TXC, the "restaurant R" portion is output as voice while being emphasized. Note that [Variable-C] may be replaced with any store name that can be included in the driving situation information.
  • [Variable-F] may be replaced with a wording different from "today", or may be set to blank (silent period).
  • [Variable-G] may be replaced with any place name as long as it can be included in the driving situation information.
  • [Variable-H] may be replaced with a word representing any weather condition as long as it can be included in the driving situation information.
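The substitutions described for [Variable-A] through [Variable-H] amount to template filling from the driving situation information. A minimal sketch follows; the bracket placeholder syntax is taken from FIG. 5, while the substitution code itself is an assumption.

```python
import re

# Replace [Variable-X] placeholders in stored text data with values drawn
# from the driving situation information; unknown placeholders are kept
# as-is, mirroring the note that a variable may also stay unset.
def fill_variables(template: str, situation: dict) -> str:
    def lookup(m: re.Match) -> str:
        return str(situation.get(m.group(1), m.group(0)))
    return re.sub(r"\[([A-Za-z0-9-]+)\]", lookup, template)

situation = {"Variable-B": "Kawagoe City, Saitama Prefecture"}
text_b = fill_variables("You are now driving through [Variable-B].", situation)
print(text_b)  # → You are now driving through Kawagoe City, Saitama Prefecture.
```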
  • FIG. 6 is a flowchart for explaining the processing performed in the server device 200.
  • control unit 114 of the voice output device 100 acquires driving situation information related to the current driving situation of the vehicle Ve and transmits it to the server device 200 .
  • the server device 200 acquires the driving situation information from the voice output device 100 (step S11).
  • control unit 214 of the server device 200 determines whether or not the driving status information acquired in step S11 satisfies the trigger condition (step S12).
  • When the control unit 214 determines that the driving situation information acquired in step S11 of FIG. 6 does not satisfy the trigger condition of the table data TB (step S12: NO), it performs the operation of step S11 again.
  • When the control unit 214 determines that the driving situation information acquired in step S11 satisfies the trigger condition (step S12: YES), it acquires text data in which tag information for outputting voice while emphasizing the variable part is attached to the variable part (step S13).
  • control unit 214 sets the wording of the variable portion included in the text data acquired in step S13 based on the driving situation information acquired in step S11 (step S14).
  • control unit 214 acquires the text data in which the wording of the variable part is set in step S14 as content for output, and outputs the acquired content for output to the voice output device 100 (step S15). In this way, the content acquisition by the server device 200 ends.
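Steps S11 through S15 above can be condensed into a single dispatch function. In this sketch the table data rows, trigger predicates, and field names are all assumptions; only the step structure follows FIG. 6.

```python
# Condensed sketch of the server-side flow of FIG. 6 (steps S11-S15).
# One hypothetical table-data row: a trigger condition plus a tagged
# template whose variable part is filled from the driving situation.
TABLE_DATA = [
    {
        "trigger": lambda s: s.get("reservation_succeeded", False),
        "template": "Your reservation at [Variable-C] has succeeded.",
        "variable": "Variable-C",
    },
]

def handle_situation(situation: dict):
    """Steps S12-S15: match a trigger, fetch tagged text, fill the variable."""
    for row in TABLE_DATA:
        if row["trigger"](situation):                # S12: trigger satisfied
            text = row["template"]                   # S13: acquire text data
            value = situation.get(row["variable"], "")
            content = text.replace(f'[{row["variable"]}]', value)  # S14
            return content                           # S15: content for output
    return None  # S12: NO -> wait for the next situation report (back to S11)

content = handle_situation(
    {"reservation_succeeded": True, "Variable-C": "restaurant R"})
print(content)  # → Your reservation at restaurant R has succeeded.
```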
  • the audio output device 100 audio-outputs the content received from the server device 200 to passengers of the vehicle Ve.
  • control unit 214 of the server device 200 has a function as a content acquisition unit. Further, according to this embodiment, the communication unit 211 of the server device 200 has a function as an output unit.
  • As described above, according to this embodiment, text data including a variable part and tag information for outputting voice while emphasizing the variable part is acquired according to the driving situation of the vehicle, and the acquired text data is output as voice. Therefore, important information that may be included in content can be recognized more easily than before. Further, for example, by creating data such as text data TXA to TXF in advance, important information in various categories can be voice-output while being emphasized.
  • In a modification, the control unit 114 may function as the content acquisition unit and the speaker 118 may function as the output unit; in that case, a series of processes substantially similar to those in FIG. 6 can be performed in the audio output device 100.
  • Non-transitory computer readable media include various types of tangible storage media.
  • Examples of non-transitory computer-readable media include magnetic storage media (e.g., floppy disks, magnetic tapes, hard disk drives), magneto-optical storage media (e.g., magneto-optical discs), CD-ROM, CD-R, CD-R/W, and semiconductor memory (e.g., mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, RAM (Random Access Memory)).
  • 100 audio output device; 200 server device; 111, 211 communication unit; 112, 212 storage unit; 113 input unit; 114, 214 control unit; 115 sensor group; 116 display unit; 117 microphone; 118 speaker; 119 exterior camera; 120 in-vehicle camera

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Automation & Control Theory (AREA)
  • General Physics & Mathematics (AREA)
  • Navigation (AREA)

Abstract

A content output device is provided with a content acquisition unit and an output unit. The content acquisition unit acquires, in accordance with a vehicle driving status, content including a variable part, and tag information for outputting audio while highlighting the variable part. The output unit outputs the content.

Description

Content output device, content output method, program, and storage medium
 The present invention relates to technology that can be used in content output.
 Techniques for outputting, to a user, content corresponding to various information obtained through sensors and the like are conventionally known.
 Specifically, for example, Patent Document 1 discloses a technique for outputting a greeting voice when a passenger gets in or out of a vehicle, based on information obtained through a vibration sensor or the like that detects opening and closing of the vehicle door.
JP-A-2003-237453
 However, Patent Document 1 does not disclose a method of outputting content while distinguishing the portion that changes according to the situation in which the content is output from the remaining portion of the content.
 Therefore, with the configuration disclosed in Patent Document 1, the user to whom the content is output may bear an unnecessary mental burden in recognizing important information that the content may contain.
 The present invention has been made to solve the above problem, and its main object is to provide a content output device that makes important information that may be contained in content easier to recognize than before.
 The claimed invention is a content output device comprising a content acquisition unit that acquires, according to the driving situation of a vehicle, content including a variable part and tag information for outputting voice while emphasizing the variable part, and an output unit that outputs the content.
 The claimed invention is also a content output method in which content including a variable part and tag information for outputting voice while emphasizing the variable part is acquired according to the driving situation of a vehicle, and the content is output.
 The claimed invention is also a program executed by a content output device provided with a computer, the program causing the computer to function as: a content acquisition unit that acquires, according to the driving situation of a vehicle, content including a variable part and tag information for outputting the variable part as voice while emphasizing it; and an output unit that outputs the content.
FIG. 1 is a diagram showing a configuration example of an audio output system according to an embodiment.
FIG. 2 is a block diagram showing a schematic configuration of the audio output device.
FIG. 3 is a diagram showing an example of a schematic configuration of the server device.
FIG. 4 is a diagram for explaining the data structure of text data stored in the server device.
FIG. 5 is a diagram showing a specific example of text data stored in the server device.
FIG. 6 is a flowchart for explaining processing performed in the server device.
 In one preferred embodiment of the present invention, a content output device comprises: a content acquisition unit that acquires, according to the driving situation of a vehicle, content including a variable part and tag information for outputting the variable part as voice while emphasizing it; and an output unit that outputs the content.
 The above content output device includes a content acquisition unit and an output unit. The content acquisition unit acquires, according to the driving situation of the vehicle, content including a variable part and tag information for outputting the variable part as voice while emphasizing it. The output unit outputs the content. This makes important information that may be included in the content easier to recognize than before.
 In one aspect of the above content output device, the tag information includes information for making the set value of at least one of the volume, pitch, and speed used when the variable part is output as voice different from the set value used when the fixed part included in the content is output as voice.
 In one aspect of the above content output device, the tag information includes information for making the volume set value used when the variable part is output as voice larger than the volume set value of the fixed part.
 In one aspect of the above content output device, the tag information includes information for making the pitch set value used when the variable part is output as voice higher than the pitch set value of the fixed part.
 In one aspect of the above content output device, the tag information includes information for making the speed set value used when the variable part is output as voice slower than the speed set value of the fixed part.
 In another embodiment of the present invention, a content output method acquires, according to the driving situation of a vehicle, content including a variable part and tag information for outputting the variable part as voice while emphasizing it, and outputs the content. This makes important information that may be included in the content easier to recognize than before.
 In still another embodiment of the present invention, a program executed by a content output device provided with a computer causes the computer to function as: a content acquisition unit that acquires, according to the driving situation of a vehicle, content including a variable part and tag information for outputting the variable part as voice while emphasizing it; and an output unit that outputs the content. By executing this program on a computer, the above content output device can be realized. This program can be stored in a storage medium and used.
 Preferred embodiments of the present invention will be described below with reference to the drawings.
 [System Configuration]
 (Overall Configuration)
 FIG. 1 is a diagram showing a configuration example of an audio output system according to an embodiment. The audio output system 1 according to this embodiment includes an audio output device 100 and a server device 200. The audio output device 100 is mounted on a vehicle Ve. The server device 200 communicates with a plurality of audio output devices 100 mounted on a plurality of vehicles Ve.
 The audio output device 100 basically performs route search processing, route guidance processing, and the like for the user who is a passenger of the vehicle Ve. For example, when the user inputs a destination or the like, the audio output device 100 transmits to the server device 200 an upload signal S1 including position information of the vehicle Ve and information on the designated destination. The server device 200 calculates the route to the destination by referring to map data, and transmits to the audio output device 100 a control signal S2 indicating the route to the destination. Based on the received control signal S2, the audio output device 100 provides route guidance to the user by voice output.
 The audio output device 100 also provides various kinds of information to the user through dialogue with the user. For example, when the user makes an information request, the audio output device 100 supplies the server device 200 with an upload signal S1 including information indicating the content or type of the information request and information on the running state of the vehicle Ve. The server device 200 acquires or generates the information requested by the user and transmits it to the audio output device 100 as a control signal S2. The audio output device 100 provides the received information to the user by voice output.
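The upload signal S1 / control signal S2 exchange described above can be sketched as two simple message structures. All field names (`vehicle_position`, `info_request`, `content_text`, and so on) are illustrative assumptions made for this sketch and do not appear in the specification.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class UploadSignalS1:
    """Sent from the audio output device 100 to the server device 200."""
    vehicle_position: tuple                 # (latitude, longitude) of the vehicle Ve
    destination: Optional[str] = None       # set when the user specifies a destination
    info_request: Optional[str] = None      # set when the user makes an information request
    driving_status: dict = field(default_factory=dict)  # running state of the vehicle Ve

@dataclass
class ControlSignalS2:
    """Returned from the server device 200 to the audio output device 100."""
    route: Optional[list] = None            # guidance route, e.g. as a list of link IDs
    content_text: Optional[str] = None      # text to be output to the user as voice

# Example: the user enters a destination; the server answers with a route.
s1 = UploadSignalS1(vehicle_position=(35.9251, 139.4858),
                    destination="Kawagoe Station")
s2 = ControlSignalS2(route=["L1", "L7", "L12"])
```

In the pull-type case, `info_request` would be filled instead of `destination`, and the server would answer with `content_text` rather than a route.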
 (Audio Output Device)
 The audio output device 100 moves together with the vehicle Ve and performs route guidance, mainly by voice, so that the vehicle Ve travels along the guidance route. Note that "route guidance mainly by voice" refers to route guidance in which the user can grasp the information necessary for driving the vehicle Ve along the guidance route at least from voice alone, and does not exclude the audio output device 100 additionally displaying a map of the surroundings of the current position or the like. In this embodiment, the audio output device 100 outputs by voice at least various kinds of information related to driving, such as points on the route that require guidance (also referred to as "guidance points"). A guidance point corresponds to, for example, an intersection at which the vehicle Ve turns right or left, or any other passing point important for the vehicle Ve to travel along the guidance route. The audio output device 100 provides voice guidance regarding guidance points, such as the distance from the vehicle Ve to the next guidance point and the traveling direction at that guidance point. Hereinafter, voice concerning guidance along the guidance route is also referred to as "route voice guidance".
 The audio output device 100 is attached, for example, to the upper part of the windshield of the vehicle Ve or on the dashboard. Note that the audio output device 100 may be built into the vehicle Ve.
 FIG. 2 is a block diagram showing a schematic configuration of the audio output device 100. The audio output device 100 mainly includes a communication unit 111, a storage unit 112, an input unit 113, a control unit 114, a sensor group 115, a display unit 116, a microphone 117, a speaker 118, an exterior camera 119, and an interior camera 120. The elements in the audio output device 100 are interconnected via a bus line 110.
 The communication unit 111 performs data communication with the server device 200 under the control of the control unit 114. The communication unit 111 may receive from the server device 200, for example, map data for updating a map DB (database) 4 described later.
 The storage unit 112 is composed of various kinds of memory such as a RAM (Random Access Memory), a ROM (Read Only Memory), and a non-volatile memory (including a hard disk drive, a flash memory, and the like). The storage unit 112 stores programs for the audio output device 100 to execute predetermined processing. These programs may include an application program for providing route guidance by voice, an application program for playing music, an application program for outputting content other than music (such as television), and the like. The storage unit 112 is also used as a working memory for the control unit 114. Note that the programs executed by the audio output device 100 may be stored in a storage medium other than the storage unit 112.
 The storage unit 112 also stores a map database (hereinafter, database is abbreviated as "DB") 4. Various kinds of data required for route guidance are recorded in the map DB 4. The map DB 4 stores, for example, road data representing a road network as a combination of nodes and links, and facility data indicating facilities that are candidates for destinations, stop-off points, or landmarks. The map DB 4 may be updated under the control of the control unit 114 based on map information that the communication unit 111 receives from a map management server.
 The input unit 113 is a button, a touch panel, a remote controller, or the like for user operation. The display unit 116 is a display or the like that performs display under the control of the control unit 114. The microphone 117 collects sound inside the vehicle Ve, particularly the driver's utterances. The speaker 118 outputs voice for route guidance to the driver or the like.
 The sensor group 115 includes an external sensor 121 and an internal sensor 122. The external sensor 121 is one or more sensors for recognizing the surrounding environment of the vehicle Ve, such as a lidar, a radar, an ultrasonic sensor, an infrared sensor, and a sonar. The internal sensor 122 is a sensor for positioning the vehicle Ve, and is, for example, a GNSS (Global Navigation Satellite System) receiver, a gyro sensor, an IMU (Inertial Measurement Unit), a vehicle speed sensor, or a combination thereof. Note that the sensor group 115 only needs to include sensors from whose output the control unit 114 can derive the position of the vehicle Ve directly or indirectly (that is, by performing estimation processing).
 The exterior camera 119 is a camera that captures the exterior of the vehicle Ve. The exterior camera 119 may be only a front camera that captures the area in front of the vehicle, may include a rear camera that captures the area behind the vehicle in addition to the front camera, or may be an omnidirectional camera capable of capturing the entire surroundings of the vehicle Ve. The interior camera 120, on the other hand, is a camera that captures the interior of the vehicle Ve, and is provided at a position from which at least the vicinity of the driver's seat can be captured.
 The control unit 114 includes a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), and the like, and controls the audio output device 100 as a whole. For example, the control unit 114 estimates the position (including the traveling direction) of the vehicle Ve based on the outputs of one or more sensors in the sensor group 115. When a destination is specified via the input unit 113 or the microphone 117, the control unit 114 generates route information indicating a guidance route to that destination, and provides route guidance based on the route information, the estimated position information of the vehicle Ve, and the map DB 4. In this case, the control unit 114 causes the speaker 118 to output route voice guidance. The control unit 114 also controls the display unit 116 to display information on the music being played, video content, a map of the vicinity of the current position, or the like.
 Note that the processing executed by the control unit 114 is not limited to being realized by software based on programs, and may be realized by any combination of hardware, firmware, and software. The processing executed by the control unit 114 may also be realized using a user-programmable integrated circuit such as an FPGA (field-programmable gate array) or a microcomputer. In that case, the integrated circuit may be used to realize the program that the control unit 114 executes in this embodiment. In this way, the control unit 114 may be realized by hardware other than a processor.
 The configuration of the audio output device 100 shown in FIG. 2 is an example, and various changes may be made to it. For example, instead of the storage unit 112 storing the map DB 4, the control unit 114 may receive the information necessary for route guidance from the server device 200 via the communication unit 111. In another example, instead of including the speaker 118, the audio output device 100 may be connected, electrically or by known communication means, to an audio output unit configured separately from the audio output device 100, and cause that audio output unit to output voice. In this case, the audio output unit may be a speaker provided in the vehicle Ve. In still another example, the audio output device 100 need not include the display unit 116. In this case, the audio output device 100 need not perform any display-related control, and may instead be electrically connected, by wire or wirelessly, to a display unit provided in the vehicle Ve or the like and cause that display unit to perform predetermined display. Similarly, instead of including the sensor group 115, the audio output device 100 may acquire information output by sensors installed in the vehicle Ve from the vehicle Ve based on a communication protocol such as CAN (Controller Area Network).
 (Server Device)
 The server device 200 generates route information indicating the guidance route that the vehicle Ve should travel, based on the upload signal S1 including the destination and the like received from the audio output device 100. The server device 200 then generates a control signal S2 relating to information output in response to the user's information request, based on the user's information request indicated by the upload signal S1 subsequently transmitted by the audio output device 100 and on the running state of the vehicle Ve. The server device 200 then transmits the generated control signal S2 to the audio output device 100.
 Furthermore, the server device 200 generates content for providing information to the user of the vehicle Ve and for dialogue with the user, and transmits it to the audio output device 100. The provision of information to the user is mainly push-type information provision that the server device 200 initiates, triggered by the vehicle Ve entering a predetermined driving situation. The dialogue with the user is basically pull-type dialogue that starts with a question or inquiry from the user. However, dialogue with the user may also start from push-type content provision.
 FIG. 3 is a diagram showing an example of a schematic configuration of the server device 200. The server device 200 mainly includes a communication unit 211, a storage unit 212, and a control unit 214. The elements in the server device 200 are interconnected via a bus line 210.
 The communication unit 211 performs data communication with external devices such as the audio output device 100 under the control of the control unit 214. The storage unit 212 is composed of various kinds of memory such as a RAM, a ROM, and a non-volatile memory (including a hard disk drive, a flash memory, and the like). The storage unit 212 stores programs for the server device 200 to execute predetermined processing. The storage unit 212 also contains the map DB 4.
 The control unit 214 includes a CPU, a GPU, and the like, and controls the server device 200 as a whole. By executing programs stored in the storage unit 212, the control unit 214 operates together with the audio output device 100 and executes route guidance processing, information provision processing, and the like for the user. For example, based on the upload signal S1 received from the audio output device 100 via the communication unit 211, the control unit 214 generates route information indicating a guidance route, or a control signal S2 relating to information output in response to the user's information request. The control unit 214 then transmits the generated control signal S2 to the audio output device 100 via the communication unit 211.
 [Push-Type Content Provision]
 Next, push-type content provision will be described. Push-type content provision means that, when the vehicle Ve enters a predetermined driving situation, the audio output device 100 outputs to the user, by voice, content related to that driving situation. Specifically, the audio output device 100 acquires driving situation information indicating the driving situation of the vehicle Ve based on the output of the sensor group 115 as described above, and transmits it to the server device 200. The server device 200 stores in the storage unit 212 table data for push-type content provision. The server device 200 refers to the table data, and when the driving situation information received from the audio output device 100 mounted on the vehicle Ve matches a trigger condition defined in the table data, acquires output content using the text data corresponding to that trigger condition and transmits it to the audio output device 100. The audio output device 100 outputs by voice the output content received from the server device 200. In this way, content corresponding to the driving situation of the vehicle Ve is output to the user by voice.
 The driving situation information only needs to include at least one piece of information that can be acquired based on the functions of the units of the audio output device 100, such as the position of the vehicle Ve, the direction of the vehicle, traffic information around the position of the vehicle Ve (including speed regulations, congestion information, and the like), the current time, and the destination. The driving situation information may also include any of the sound collected by the microphone 117 (excluding the user's utterances), images captured by the exterior camera 119, and images captured by the interior camera 120. The driving situation information may further include information received from the server device 200 via the communication unit 111.
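The table lookup described above (driving situation information matched against trigger conditions, each mapped to text data) can be sketched as follows. The table entries, status keys, and condition functions are illustrative assumptions; only the overall flow, matching a trigger condition and picking the corresponding text data, comes from the description above.

```python
# Each entry pairs a trigger condition over the driving situation
# information with the text data (template) to use when it matches.
TRIGGER_TABLE = [
    (lambda s: "congestion_km_ahead" in s,   # congestion reported ahead
     "この先、[Variable-D]キロ以内で、[Variable-E]。"),
    (lambda s: "current_city" in s,          # the vehicle entered a new city
     "ここは[Variable-B]です。"),
]

def match_trigger(status):
    """Return the text data for the first matching trigger condition, else None."""
    for condition, text_data in TRIGGER_TABLE:
        if condition(status):
            return text_data
    return None

# Driving situation information reporting congestion 5 km ahead selects
# the TXD-style template.
template = match_trigger({"congestion_km_ahead": 5})
```

The selected template still contains its [Variable-X] placeholders; filling them from the driving situation information is a separate step.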
 (Data Structure)
 FIG. 4 is a diagram for explaining the data structure of text data stored in the server device.
 The storage unit 212 of the server device 200 stores, as the data on which the content to be output as voice is based, text data TX having a data structure as shown in FIG. 4, for example.
 The text data TX includes fixed parts, which correspond to portions whose wording is maintained regardless of the driving situation indicated by the driving situation information, and variable parts, which correspond to portions whose wording changes according to that driving situation.
 Specifically, the text data TX includes three fixed parts FD corresponding to the phrases "所要時間は" (the required time is), "距離が" (the distance is), and "なります。" (will be), a variable part VDA corresponding to the phrase "50分短く" (50 minutes shorter), and a variable part VDB corresponding to the phrase "10km短く" (10 km shorter). The variable part VDA is placed between the tag information TGA and TGB, and the variable part VDB is placed between the tag information TGC and TGD.
 The "<Speed="DOWN">" in the tag information TGA and the "</Speed>" in the tag information TGB correspond to setting information for making the speed set value used when the variable part VDA is output as voice slower than the speed set value of the fixed parts FD.
 The "<Pitch="UP">" in the tag information TGA and the "</Pitch>" in the tag information TGB correspond to setting information for making the pitch set value used when the variable part VDA is output as voice higher than the pitch set value of the fixed parts FD.
 The "<Volume="UP">" in the tag information TGA and the "</Volume>" in the tag information TGB correspond to setting information for making the volume set value used when the variable part VDA is output as voice larger than the volume set value of the fixed parts FD.
 The "<Speed=90%>" in the tag information TGC and the "</Speed>" in the tag information TGD correspond to setting information for setting the speed set value used when the variable part VDB is output as voice to 90% of the speed set value of the fixed parts FD.
 The "<Pitch=120%>" in the tag information TGC and the "</Pitch>" in the tag information TGD correspond to setting information for setting the pitch set value used when the variable part VDB is output as voice to 120% of the pitch set value of the fixed parts FD.
 The "<Volume=120%>" in the tag information TGC and the "</Volume>" in the tag information TGD correspond to setting information for setting the volume set value used when the variable part VDB is output as voice to 120% of the volume set value of the fixed parts FD.
 That is, the text data TX having the data structure described above includes setting information for outputting the variable parts VDA and VDB as voice while emphasizing them.
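As a sketch of how a speech synthesizer on the receiving side might interpret this structure, the tagged text can be split into segments, each carrying the Speed/Pitch/Volume settings active at that point. The tag syntax follows the examples above (<Speed="DOWN">, <Pitch=120%>, </Volume>, and so on); the parsing code itself is an illustrative assumption, not part of the specification.

```python
import re

# Matches an opening tag with a value (<Speed="DOWN">, <Pitch=120%>)
# or a closing tag (</Volume>).
TAG_RE = re.compile(r'<(/?)(Speed|Pitch|Volume)(?:=("?)([A-Z]+|\d+%)\3)?>')

def segments(tagged_text):
    """Split tagged text into (text, settings) pairs, where settings
    maps each active tag name to its value ("UP", "DOWN", "120%", ...)."""
    settings = {}   # tag settings currently in effect
    pos = 0
    out = []
    for m in TAG_RE.finditer(tagged_text):
        if m.start() > pos:
            out.append((tagged_text[pos:m.start()], dict(settings)))
        closing, name, value = m.group(1), m.group(2), m.group(4)
        if closing:
            settings.pop(name, None)   # tag closed: revert to the fixed-part value
        else:
            settings[name] = value     # tag opened: override for the variable part
        pos = m.end()
    if pos < len(tagged_text):
        out.append((tagged_text[pos:], dict(settings)))
    return out

text = ('所要時間は<Speed="DOWN"><Pitch="UP"><Volume="UP">50分短く'
        '</Volume></Pitch></Speed>なります。')
parts = segments(text)
```

Here `parts[0]` and `parts[2]` are fixed parts with no tag settings, while `parts[1]` is the variable part VDA carrying Speed=DOWN, Pitch=UP, Volume=UP; each segment's settings could then be mapped onto the synthesizer's speed, pitch, and volume parameters.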
 (Specific Examples of Text Data)
 FIG. 5 is a diagram showing specific examples of text data stored in the server device.
 Examples of text data having the same data structure as the text data TX are shown in FIG. 5. Note that, for convenience of illustration, the tag information shown in the description of the text data TX is omitted from the text data in FIG. 5.
 In the text data TXA represented in FIG. 5 as "音量を[Variable-A]にしました。" (The volume has been set to [Variable-A].), [Variable-A] corresponds to the variable part and the remaining portion corresponds to the fixed part. Therefore, for example, when the driving situation information includes information indicating that the volume set value of the speaker 118 is 8, the text data TXA is acquired with [Variable-A] replaced by "8". According to such text data TXA, the "8" portion is output as voice while being emphasized. Note that [Variable-A] may be replaced by any numerical value as long as it can be included in the driving situation information.
 In the text data TXB represented in FIG. 5 as "ここは[Variable-B]です。" (This is [Variable-B].), [Variable-B] corresponds to the variable part and the remaining portion corresponds to the fixed part. Therefore, for example, when the driving situation information includes information indicating that the current position of the vehicle Ve is Kawagoe City, Saitama Prefecture, the text data TXB is acquired with [Variable-B] replaced by "埼玉県川越市" (Kawagoe City, Saitama Prefecture). According to such text data TXB, the "Kawagoe City, Saitama Prefecture" portion is output as voice while being emphasized. Note that [Variable-B] may be replaced by any place name as long as it can be included in the driving situation information.
 In the text data TXC represented in FIG. 5 as "[Variable-C]の予約が完了しました。案内を開始します。" (The reservation for [Variable-C] has been completed. Guidance will now start.), [Variable-C] corresponds to the variable part and the remaining portion corresponds to the fixed part. Therefore, for example, when the driving situation information includes information indicating that a reservation has been successfully made at a restaurant R where a passenger of the vehicle Ve wishes to stop, the text data TXC is acquired with [Variable-C] replaced by "restaurant R". According to such text data TXC, the "restaurant R" portion is output as voice while being emphasized. Note that [Variable-C] may be replaced by any store name as long as it can be included in the driving situation information.
　図5の「この先、[Variable-D]キロ以内で、[Variable-E]。」として表されるテキストデータTXDにおいては、[Variable-D]及び[Variable-E]が変数部に相当するとともに、それ以外の部分が固定部に相当する。そのため、例えば、車両Veの現在地から5km以内における渋滞の発生を示す情報が運転状況情報に含まれている場合には、[Variable-D]が「5」に置き換えられ、[Variable-E]が「渋滞が発生しました」に置き換えられた状態のテキストデータTXDが取得される。また、このようなテキストデータTXDによれば、「5」の部分、及び、「渋滞が発生しました」の部分が強調されつつ音声出力される。なお、[Variable-D]は、運転状況情報に含まれ得る限りにおいては、任意の数値に置き換えてもよい。また、[Variable-E]は、「渋滞です」に置き換えてもよい。 In the text data TXD represented in FIG. 5 as "Within [Variable-D] kilometers ahead, [Variable-E].", [Variable-D] and [Variable-E] correspond to the variable parts, and the remainder corresponds to the fixed part. Therefore, for example, if the driving situation information includes information indicating the occurrence of a traffic jam within 5 km of the current location of the vehicle Ve, text data TXD is obtained in which [Variable-D] has been replaced with "5" and [Variable-E] has been replaced with "a traffic jam has occurred". With such text data TXD, the "5" portion and the "a traffic jam has occurred" portion are emphasized when output as voice. Note that [Variable-D] may be replaced with any numerical value, as long as it can be included in the driving situation information. Also, [Variable-E] may be replaced with "there is a traffic jam".
　図5の「[Variable-F][Variable-G]の天気は[Variable-H]です。」として表されるテキストデータTXEにおいては、[Variable-F]、[Variable-G]及び[Variable-H]が変数部に相当するとともに、それ以外の部分が固定部に相当する。そのため、例えば、埼玉県川越市の天気が晴れであることを示す情報が運転状況情報に含まれている場合には、[Variable-F]が「今日の」に置き換えられ、[Variable-G]が「川越」に置き換えられ、[Variable-H]が「晴れ」に置き換えられた状態のテキストデータTXEが取得される。また、このようなテキストデータTXEによれば、「今日の」の部分、「川越」の部分、及び、「晴れ」の部分が強調されつつ音声出力される。なお、[Variable-F]は、「今日の」とは異なる表現の文言に置き換えてもよく、または、空白(無言期間)に設定してもよい。また、[Variable-G]は、運転状況情報に含まれ得る限りにおいては、任意の地名に置き換えてもよい。また、[Variable-H]は、運転状況情報に含まれ得る限りにおいては、任意の気象状態を表す文言に置き換えてもよい。 In the text data TXE represented in FIG. 5 as "The [Variable-F] weather in [Variable-G] is [Variable-H].", [Variable-F], [Variable-G], and [Variable-H] correspond to the variable parts, and the remainder corresponds to the fixed part. Therefore, for example, if the driving situation information includes information indicating that the weather in Kawagoe City, Saitama Prefecture is sunny, text data TXE is obtained in which [Variable-F] has been replaced with "today's", [Variable-G] with "Kawagoe", and [Variable-H] with "sunny". With such text data TXE, the "today's", "Kawagoe", and "sunny" portions are emphasized when output as voice. Note that [Variable-F] may be replaced with a different expression from "today's", or may be set to a blank (silent period). Also, [Variable-G] may be replaced with any place name, and [Variable-H] with wording representing any weather condition, as long as these can be included in the driving situation information.
　図5の「[Variable-I]まで[Variable-J]です。」として表されるテキストデータTXFにおいては、[Variable-I]及び[Variable-J]が変数部に相当するとともに、それ以外の部分が固定部に相当する。そのため、例えば、案内経路に沿って走行している車両Veが川越駅前交差点から3kmの地点を通過したことを示す情報が運転状況情報に含まれている場合には、[Variable-I]が「川越駅前交差点」に置き換えられ、[Variable-J]が「3km」に置き換えられた状態のテキストデータTXFが取得される。また、このようなテキストデータTXFによれば、「川越駅前交差点」の部分、及び、「3km」の部分が強調されつつ音声出力される。なお、[Variable-I]は、運転状況情報に含まれ得る限りにおいては、任意の案内地点に置き換えてもよい。また、[Variable-J]は、運転状況情報に含まれ得る限りにおいては、任意の距離を表す文言に置き換えてもよい。 In the text data TXF represented in FIG. 5 as "It is [Variable-J] to [Variable-I].", [Variable-I] and [Variable-J] correspond to the variable parts, and the remainder corresponds to the fixed part. Therefore, for example, if the driving situation information includes information indicating that the vehicle Ve traveling along the guidance route has passed a point 3 km from the intersection in front of Kawagoe Station, text data TXF is obtained in which [Variable-I] has been replaced with "intersection in front of Kawagoe Station" and [Variable-J] with "3 km". With such text data TXF, the "intersection in front of Kawagoe Station" portion and the "3 km" portion are emphasized when output as voice. Note that [Variable-I] may be replaced with any guidance point, and [Variable-J] with wording representing any distance, as long as these can be included in the driving situation information.
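The template examples TXB to TXF above all follow one pattern: fixed text with bracketed variable parts, filled in from the driving situation information and wrapped with tag information so the speech synthesizer can emphasize them. The following is a minimal sketch of that substitution step; the `[Variable-*]` placeholder syntax follows FIG. 5, but the `<emphasis>` tag name and the `fill_variables` helper are illustrative assumptions, not the disclosed implementation.

```python
import re

# Template TXD from FIG. 5: fixed text with two variable parts.
TEMPLATE_TXD = "この先、[Variable-D]キロ以内で、[Variable-E]。"

def fill_variables(template, values):
    # Replace each variable part with its wording, wrapped in a tag so
    # that the TTS engine can emphasize it relative to the fixed part.
    return re.sub(
        r"\[(Variable-[A-Z])\]",
        lambda m: "<emphasis>" + values[m.group(1)] + "</emphasis>",
        template,
    )

# Wording derived from the driving situation information (traffic jam
# within 5 km of the vehicle's current location).
status = {"Variable-D": "5", "Variable-E": "渋滞が発生しました"}
print(fill_variables(TEMPLATE_TXD, status))
# → この先、<emphasis>5</emphasis>キロ以内で、<emphasis>渋滞が発生しました</emphasis>。
```

Only the substituted wording carries the emphasis tag; the fixed part is left untouched, matching the fixed-part/variable-part distinction described above.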
 (処理フロー)
 図6は、サーバ装置200において行われる処理を説明するためのフローチャートである。
(processing flow)
FIG. 6 is a flowchart for explaining the processing performed in the server device 200.
　まず、音声出力装置100の制御部114は、車両Veの現在の運転状況に係る運転状況情報を取得し、サーバ装置200へ送信する。サーバ装置200は、音声出力装置100から、運転状況情報を取得する(ステップS11)。 First, the control unit 114 of the voice output device 100 acquires driving situation information related to the current driving situation of the vehicle Ve and transmits it to the server device 200. The server device 200 acquires the driving situation information from the voice output device 100 (step S11).
 次に、サーバ装置200の制御部214は、ステップS11において取得した運転状況情報がトリガ条件を満たすか否かを判定する(ステップS12)。 Next, the control unit 214 of the server device 200 determines whether or not the driving status information acquired in step S11 satisfies the trigger condition (step S12).
 制御部214は、図6のステップS11において取得した運転状況情報がテーブルデータTBのトリガ条件を満たさないと判定した場合(ステップS12:NO)には、ステップS11の動作を再度行う。 When the control unit 214 determines that the driving status information acquired in step S11 of FIG. 6 does not satisfy the trigger condition of the table data TB (step S12: NO), it performs the operation of step S11 again.
　制御部214は、ステップS11において取得した運転状況情報がトリガ条件を満たすと判定した場合(ステップS12:YES)には、変数部を強調しつつ音声出力するためのタグ情報が当該変数部に付加されたテキストデータを取得する(ステップS13)。 When determining that the driving situation information acquired in step S11 satisfies the trigger condition (step S12: YES), the control unit 214 acquires text data in which tag information for emphasized voice output of the variable part is attached to that variable part (step S13).
 続いて、制御部214は、ステップS11において取得した運転状況情報に基づき、ステップS13により取得したテキストデータに含まれる変数部の文言を設定する(ステップS14)。 Subsequently, the control unit 214 sets the wording of the variable portion included in the text data acquired in step S13 based on the driving situation information acquired in step S11 (step S14).
　そして、制御部214は、ステップS14により変数部の文言が設定されたテキストデータを出力用コンテンツとして取得し、当該取得した出力用コンテンツを音声出力装置100へ出力する(ステップS15)。こうして、サーバ装置200によるコンテンツの取得は終了する。音声出力装置100は、サーバ装置200から受信したコンテンツを、車両Veの搭乗者に対して音声出力する。 Then, the control unit 214 acquires, as output content, the text data whose variable-part wording has been set in step S14, and outputs the acquired output content to the voice output device 100 (step S15). The content acquisition by the server device 200 thus ends. The voice output device 100 outputs the content received from the server device 200 as voice to the passengers of the vehicle Ve.
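The steps S11 to S15 above can be sketched as a single pass over the table data: check the trigger condition, pick the tagged template, and set the variable-part wording. The table structure, trigger predicates, and function names below are illustrative assumptions, not the disclosed implementation.

```python
import re

# Table data TB (assumed shape): each entry pairs a trigger condition
# with a template whose variable parts carry tag information, plus a
# rule for deriving the variable wording from the driving status.
TABLE_DATA = [
    (lambda s: "congestion_km" in s,                    # trigger condition (S12)
     "この先、[Variable-D]キロ以内で、[Variable-E]。",    # tagged template (S13)
     lambda s: {"Variable-D": str(s["congestion_km"]),
                "Variable-E": "渋滞が発生しました"}),     # wording rule (S14)
]

def acquire_content(status):
    """One pass of S12-S15 for driving status information acquired in S11."""
    for trigger, template, make_values in TABLE_DATA:
        if trigger(status):                              # S12: YES
            values = make_values(status)                 # S14: set the wording
            return re.sub(                               # substitute with emphasis
                r"\[(Variable-[A-Z])\]",
                lambda m: "<emphasis>" + values[m.group(1)] + "</emphasis>",
                template,
            )                                            # S15: output content
    return None                                          # S12: NO → wait for next S11

print(acquire_content({"congestion_km": 5}))
print(acquire_content({}))  # no trigger satisfied → None
```

Returning `None` stands in for the "step S12: NO" branch, in which the server simply waits for the next driving status report.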
 本実施例によれば、サーバ装置200の制御部214は、コンテンツ取得部としての機能を有する。また、本実施例によれば、サーバ装置200の通信部211は、出力部としての機能を有する。 According to this embodiment, the control unit 214 of the server device 200 has a function as a content acquisition unit. Further, according to this embodiment, the communication unit 211 of the server device 200 has a function as an output unit.
　以上に述べたように、本実施例によれば、変数部と、当該変数部を強調しつつ音声出力するためのタグ情報と、を含むテキストデータ(コンテンツ)が車両の運転状況に応じて取得され、当該取得されたテキストデータが音声出力される。そのため、本実施例によれば、コンテンツに含まれ得る重要な情報を従来よりも認識し易くすることができる。また、本実施例によれば、例えば、テキストデータTXA~TXFのようなデータを予め作成しておくことにより、様々なカテゴリにおける重要な情報を強調しつつ音声出力することができる。 As described above, according to the present embodiment, text data (content) including a variable part and tag information for emphasized voice output of that variable part is acquired according to the driving situation of the vehicle, and the acquired text data is output as voice. Therefore, according to the present embodiment, important information that may be included in content can be recognized more easily than before. Further, by preparing data such as the text data TXA to TXF in advance, important information in various categories can be emphasized when output as voice.
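As an illustration of the kind of tag information this embodiment describes, a variable part could be wrapped with prosody settings that differ from the fixed part's: louder volume, higher pitch, and slower speed (cf. claims 2 to 5). The SSML-style `<prosody>` attribute names and the specific values below are assumptions for illustration, not the actual tag format of the embodiment.

```python
def emphasize(text, volume="+6dB", pitch="+2st", rate="slow"):
    # Wrap a variable part so that a TTS engine renders it louder,
    # higher pitched, and slower than the surrounding fixed part.
    return ('<prosody volume="%s" pitch="%s" rate="%s">%s</prosody>'
            % (volume, pitch, rate, text))

# The fixed part is plain text; only the variable part carries the tag.
print("ここは" + emphasize("埼玉県川越市") + "です。")
```

Any one of the three attributes alone would also satisfy the "at least one of volume, pitch, and speed" condition of claim 2.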
　なお、本実施例によれば、例えば、制御部114がコンテンツ取得部としての機能を有し、かつ、スピーカ118が出力部としての機能を有する場合に、図6の一連の処理と略同様の処理を音声出力装置100において行うことができる。 Note that, according to this embodiment, for example, when the control unit 114 functions as the content acquisition unit and the speaker 118 functions as the output unit, substantially the same processing as the series of processes in FIG. 6 can be performed in the voice output device 100.
 上述した実施例において、プログラムは、様々なタイプの非一時的なコンピュータ可読媒体(non-transitory computer readable medium)を用いて格納され、コンピュータである制御部等に供給することができる。非一時的なコンピュータ可読媒体は、様々なタイプの実体のある記憶媒体(tangible storage medium)を含む。非一時的なコンピュータ可読媒体の例は、磁気記憶媒体(例えばフレキシブルディスク、磁気テープ、ハードディスクドライブ)、光磁気記憶媒体(例えば光磁気ディスク)、CD-ROM(Read Only Memory)、CD-R、CD-R/W、半導体メモリ(例えば、マスクROM、PROM(Programmable ROM)、EPROM(Erasable PROM)、フラッシュROM、RAM(Random Access Memory))を含む。 In the above-described embodiments, the program can be stored using various types of non-transitory computer readable media and supplied to a control unit or the like that is a computer. Non-transitory computer readable media include various types of tangible storage media. Examples of non-transitory computer-readable media include magnetic storage media (e.g., floppy disks, magnetic tapes, hard disk drives), magneto-optical storage media (e.g., magneto-optical discs), CD-ROMs (Read Only Memory), CD-Rs, CD-R/W, semiconductor memory (eg mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, RAM (Random Access Memory)).
 以上、実施形態を参照して本願発明を説明したが、本願発明は上記実施形態に限定されるものではない。本願発明の構成や詳細には、本願発明のスコープ内で当業者が理解し得る様々な変更をすることができる。すなわち、本願発明は、請求の範囲を含む全開示、技術的思想にしたがって当業者であればなし得るであろう各種変形、修正を含むことは勿論である。また、引用した上記の特許文献等の各開示は、本書に引用をもって繰り込むものとする。 Although the present invention has been described with reference to the embodiments, the present invention is not limited to the above embodiments. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within the scope of the present invention. That is, the present invention naturally includes various variations and modifications that a person skilled in the art can make according to the entire disclosure including the scope of claims and technical ideas. In addition, the disclosures of the cited patent documents and the like are incorporated herein by reference.
 100 音声出力装置
 200 サーバ装置
 111、211 通信部
 112、212 記憶部
 113 入力部
 114、214 制御部
 115 センサ群
 116 表示部
 117 マイク
 118 スピーカ
 119 車外カメラ
 120 車内カメラ
100 audio output device 200 server device 111, 211 communication unit 112, 212 storage unit 113 input unit 114, 214 control unit 115 sensor group 116 display unit 117 microphone 118 speaker 119 exterior camera 120 interior camera

Claims (8)

  1.  変数部と、前記変数部を強調しつつ音声出力するためのタグ情報と、を含むコンテンツを、車両の運転状況に応じて取得するコンテンツ取得部と、
     前記コンテンツを出力する出力部と、
     を備えるコンテンツ出力装置。
    a content acquisition unit configured to acquire content including a variable part and tag information for outputting voice while emphasizing the variable part according to the driving situation of the vehicle;
    an output unit that outputs the content;
    A content output device comprising:
  2.  前記タグ情報には、前記変数部を音声出力する際の音量、音程、及び、速度のうちの少なくともいずれか1つの設定値を、前記コンテンツに含まれる固定部を音声出力する際の設定値とは異なる値とするための情報が含まれている請求項1に記載のコンテンツ出力装置。 2. The content output device according to claim 1, wherein the tag information includes information for making a setting value of at least one of the volume, the pitch, and the speed used when the variable part is output as voice different from the corresponding setting value used when the fixed part included in the content is output as voice.
  3.  前記タグ情報には、前記変数部を音声出力する際の音量の設定値を、前記固定部の音量の設定値よりも大きな音量にするための情報が含まれている請求項2に記載のコンテンツ出力装置。 3. The content output device according to claim 2, wherein the tag information includes information for making a volume setting value used when the variable part is output as voice larger than the volume setting value of the fixed part.
  4.  前記タグ情報には、前記変数部を音声出力する際の音程の設定値を、前記固定部の音程の設定値よりも高くするための情報が含まれている請求項2に記載のコンテンツ出力装置。 4. The content output device according to claim 2, wherein the tag information includes information for making a pitch setting value used when the variable part is output as voice higher than the pitch setting value of the fixed part.
  5.  前記タグ情報には、前記変数部を音声出力する際の速度の設定値を、前記固定部の速度の設定値よりも遅くするための情報が含まれている請求項2に記載のコンテンツ出力装置。 5. The content output device according to claim 2, wherein the tag information includes information for making a speed setting value used when the variable part is output as voice slower than the speed setting value of the fixed part.
  6.  変数部と、前記変数部を強調しつつ音声出力するためのタグ情報と、を含むコンテンツを、車両の運転状況に応じて取得し、
     前記コンテンツを出力するコンテンツ出力方法。
    Acquiring content including a variable part and tag information for outputting voice while emphasizing the variable part according to the driving situation of the vehicle,
    A content output method for outputting the content.
  7.  コンピュータを備えるコンテンツ出力装置により実行されるプログラムであって、
     変数部と、前記変数部を強調しつつ音声出力するためのタグ情報と、を含むコンテンツを、車両の運転状況に応じて取得するコンテンツ取得部、及び、
     前記コンテンツを出力する出力部として前記コンピュータを機能させるプログラム。
    A program executed by a content output device comprising a computer,
    a content acquisition unit that acquires content including a variable part and tag information for outputting voice while emphasizing the variable part according to the driving situation of the vehicle;
    A program that causes the computer to function as an output unit that outputs the content.
  8.  請求項7に記載のプログラムを記憶した記憶媒体。 A storage medium storing the program according to claim 7.
PCT/JP2021/038223 2021-10-15 2021-10-15 Content output device, content output method, program, and storage medium WO2023062816A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/038223 WO2023062816A1 (en) 2021-10-15 2021-10-15 Content output device, content output method, program, and storage medium


Publications (1)

Publication Number Publication Date
WO2023062816A1 true WO2023062816A1 (en) 2023-04-20

Family

ID=85988220

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/038223 WO2023062816A1 (en) 2021-10-15 2021-10-15 Content output device, content output method, program, and storage medium

Country Status (1)

Country Link
WO (1) WO2023062816A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002312157A (en) * 2001-04-13 2002-10-25 Yoshito Suzuki Voice guidance monitor software
JP2010175717A (en) * 2009-01-28 2010-08-12 Mitsubishi Electric Corp Speech synthesizer



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21960672

Country of ref document: EP

Kind code of ref document: A1