WO2017187678A1 - Information processing device, information processing method, and program - Google Patents
Information processing device, information processing method, and program
- Publication number
- WO2017187678A1 (PCT/JP2017/002309)
- Authority
- WO
- WIPO (PCT)
Classifications
- G06F3/165—Management of the audio stream, e.g. setting of volume, audio stream path
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/0483—Interaction with page-structured environments, e.g. book metaphor
- G06F3/0484—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
- G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
- G10L13/00—Speech synthesis; Text to speech systems
Definitions
- the present disclosure relates to an information processing apparatus, an information processing method, and a program.
- Information obtained by users from information processing terminals connected to a network can be divided into two types: visual information and sound information.
- Visual information can be presented with high image quality and high resolution, so intuitive and easy-to-understand presentation is possible.
- With visual information, however, the user's field of view is narrowed, and viewing a display screen while moving is dangerous.
- With sound information, the user's field of view is not narrowed, and information can be presented even while the user is moving.
- the present disclosure proposes an information processing apparatus, an information processing method, and a program capable of improving convenience when confirming the read voice information.
- the information processing system includes an information processing device 1 worn by the user, a server 2, and a display device 3.
- the information processing device 1, the server 2, and the display device 3 can transmit and receive data to and from each other via the network 4.
- the display device 3 may be an information processing terminal such as a smartphone, a mobile phone, a tablet terminal, or a notebook PC that is carried by the user. Further, when such a display device 3 is paired with the information processing device 1 and connected by wireless communication, the information processing device 1 can perform data transmission / reception with the server 2 via the display device 3.
- “right” indicates the direction of the user's right side
- “left” indicates the direction of the user's left side
- “up” indicates the direction of the user's head
- “down” indicates the direction of the user's feet
- “front” indicates the direction the user's body faces
- “rear” indicates the direction of the user's back side.
- the mounting unit may be mounted in close contact with the user's neck or may be mounted separately.
- Other shapes of the neck-mounted wearing unit are conceivable, for example a pendant type hung on the user by a neck strap, or a headset type with a neck band passing behind the neck instead of a head band worn on the head.
- the usage form of the wearing unit may be a form used by being directly worn on the human body.
- the form that is directly worn and used refers to a form that is used without any object between the wearing unit and the human body.
- the case where the mounting unit shown in FIG. 1 is mounted so as to contact the skin of the user's neck corresponds to this embodiment.
- various forms such as a headset type and a glasses type that are directly attached to the head are conceivable.
- the wearing unit may also be used by being indirectly worn on the human body.
- the form that is used by being indirectly worn refers to a form that is used in a state where some object exists between the wearing unit and the human body; for example, the case where the wearing unit shown in FIG. 1 is worn so that it touches the user through clothing corresponds to this form.
- this embodiment deals with auditory information presentation, that is, information presentation by voice.
- two types of presented information are assumed: pull type and push type.
- the pull type is information that is presented on request when the user wants to know it, and is triggered by a button operation, a screen operation, or a voice operation.
- the push type is information that is presented automatically without the user being conscious of it; for example, e-mail notifications, incoming calls, calls from applications, notifications, and remaining-battery warnings are assumed.
- Push-type voice notification has the advantage that information is automatically presented, while it has the disadvantage that it takes time to check detailed information.
- in addition, visual information such as text and images cannot be referred to, and information that was missed cannot easily be referred to later.
- in this embodiment, the time, the user's position (name of a place, etc.), and the user's action at the time the voice information is presented are linked and recorded, and the text of the presented (read-out) voice information is presented together with the user's action and position as a timeline UI (User Interface).
- in the voice information timeline UI, by displaying bookmarked information or displaying detailed information (with an image, if there is one), the user can easily confirm later the voice information that he or she is interested in.
- FIG. 2 is a block diagram illustrating an example of the configuration of the information processing apparatus 1 according to the present embodiment.
- the information processing apparatus 1 includes a control unit 10, a communication unit 11, a microphone 12, a camera 13, a 9-axis sensor 14, a speaker 15, a position positioning unit 16, and a storage unit 17.
- the control unit 10 functions as an arithmetic processing device and a control device, and controls the overall operation in the information processing device 1 according to various programs.
- the control unit 10 is realized by an electronic circuit such as a CPU (Central Processing Unit) or a microprocessor, for example.
- the control unit 10 may include a ROM (Read Only Memory) that stores programs to be used, calculation parameters, and the like, and a RAM (Random Access Memory) that temporarily stores parameters that change as appropriate.
- as shown in FIG. 2, the control unit 10 functions as a reading information acquisition unit 10a, a reading control unit 10b, a user situation recognition unit 10c, an operation recognition unit 10d, and a reading history transmission control unit 10e.
- the reading information acquisition unit 10a acquires information to be presented to the user (read out).
- the read-out information may be received by the communication unit 11 from an external device (for example, a smartphone) or from the network (for example, the server 2), may be acquired from the storage unit 17, or may be acquired from an application running on the information processing device 1. Further, the read-out information acquisition unit 10a may acquire information from a website with an RSS reader.
- the read-out control unit 10b controls output of the read-out information acquired by the read-out information acquisition unit 10a from the speaker 15. For example, the read-out control unit 10b performs speech synthesis based on the read-out information (text information) to convert it into speech, outputs the generated voice information from the speaker 15, and presents it to the user. The read-out control unit 10b also controls reading out only a part of the acquired information (only the title; the title and summary; the title and the first sentence of the text; and so on), and, if it determines based on a user operation that additional reading is necessary, further controls output of the remaining read-out information.
- the user situation recognition unit 10c recognizes the user situation based on various sensor information. Specifically, the user situation recognition unit 10c recognizes the user's position and action (running, walking, riding a bicycle, etc.) using at least one of the user voice and surrounding environmental sound collected by the microphone 12, the surrounding captured image captured by the camera 13, the sensor values detected by the 9-axis sensor 14 (acceleration sensor value, gyro sensor value, geomagnetic sensor value, etc.), and the position information acquired by the position positioning unit 16. Furthermore, in addition to action recognition (low context) such as walking, bicycle, running, stationary, and vehicle, the user situation recognition unit 10c can recognize a high context of the action. The high context of the action is the result of recognizing the action in more detail, for example, at home, returning home, commuting, at the office, or going out.
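As an illustrative sketch only (the patent does not disclose a concrete algorithm), low-context recognition from accelerometer data could look like the following; the thresholds and the gravity-deviation heuristic are assumptions for demonstration.

```python
# Hypothetical sketch of low-context action recognition from 3-axis
# accelerometer samples; thresholds are illustrative, not from the patent.

def recognize_low_context(accel_samples):
    """Classify the user's action from accelerometer magnitudes (m/s^2)."""
    g = 9.8
    # Mean deviation of the magnitude from gravity as a rough activity level.
    mags = [(x * x + y * y + z * z) ** 0.5 for x, y, z in accel_samples]
    activity = sum(abs(m - g) for m in mags) / len(mags)
    if activity < 0.5:
        return "still"
    if activity < 3.0:
        return "walking"
    return "running"
```

In a real system the gyro, geomagnetic, position, sound, and image inputs mentioned above would be fused with a trained classifier rather than fixed thresholds.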
- the operation recognition unit 10d recognizes an operation input by the user.
- the operation recognizing unit 10d performs voice recognition of the user voice collected by the microphone 12 and accepts an operation instruction by voice from the user.
- voice operation by the user for example, “Skip, More, Bookmark, Again, Previous” is assumed.
- “Skip” is an instruction to proceed to the next audio information
- “More” is an instruction to request further information
- “Bookmark” is an instruction to mark the current audio information
- “Again” is an instruction to reproduce the current audio information again from the beginning (repeat instruction)
- “Previous” is an instruction to return to the previous information.
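The five voice operations above could be dispatched as in the following sketch; the class and the returned action names are assumptions for illustration, not the embodiment's implementation.

```python
# Illustrative dispatch of the five voice operations over a list of
# read-out items; names and structure are assumptions.

class ReadoutSession:
    def __init__(self, items):
        self.items = items      # read-out information items, in order
        self.index = 0          # currently read item
        self.bookmarks = set()  # indices marked by "Bookmark"

    def handle(self, command):
        """Apply one voice operation; return what playback should do next."""
        if command == "Skip":                       # proceed to next item
            self.index = min(self.index + 1, len(self.items) - 1)
            return "play_next"
        if command == "Previous":                   # return to previous item
            self.index = max(self.index - 1, 0)
            return "play_previous"
        if command == "Again":                      # repeat from the beginning
            return "replay_from_start"
        if command == "More":                       # request detailed reading
            return "read_additional_detail"
        if command == "Bookmark":                   # mark current item
            self.bookmarks.add(self.index)
            return "marked"
        return "ignored"
```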
- the reading history transmission control unit 10e controls the communication unit 11 to transmit to the server 2 the history of reading information (hereinafter referred to as the “reading history”) whose voice output was controlled by the reading control unit 10b.
- the reading history includes the situation at the time of reading (time, position, action (high context, low context)), the user operation, the reading information, and the read-out information (the portion of the reading information that was actually output as voice).
- the communication unit 11 is a communication module for transmitting and receiving data to and from other devices by wire / wireless.
- the communication unit 11 communicates with other devices wirelessly, directly or via a network access point, by a system such as wired LAN (Local Area Network), wireless LAN, Wi-Fi (Wireless Fidelity, registered trademark), infrared communication, Bluetooth (registered trademark), or short-range / non-contact communication.
- the microphone 12 collects the user's voice and the surrounding environmental sound, and outputs them to the control unit 10 as voice data.
- the camera 13 includes a lens system including an imaging lens, an aperture, a zoom lens, a focus lens, and the like, a drive system that causes the lens system to perform focus and zoom operations, and a solid-state image sensor array that photoelectrically converts the imaging light obtained by the lens system to generate an imaging signal.
- the solid-state image sensor array may be realized by, for example, a CCD (Charge Coupled Device) sensor array or a CMOS (Complementary Metal Oxide Semiconductor) sensor array.
- the camera 13 is provided so that the front of the user can be imaged in a state where the information processing apparatus 1 (mounting unit) is mounted on the user.
- the camera 13 can capture the scenery around the user and the scenery in the direction in which the user is looking. Further, the camera 13 may be provided so that the user's face can be imaged in a state where the information processing apparatus 1 is worn by the user. In this case, the information processing apparatus 1 can specify the user's line-of-sight direction and facial expression from the captured image. In addition, the camera 13 outputs captured image data that is a digital signal to the control unit 10.
- the 9-axis sensor 14 includes a 3-axis gyro sensor (detecting angular velocity (rotational speed)), a 3-axis acceleration sensor (also referred to as a G sensor, detecting acceleration during movement), and a 3-axis geomagnetic sensor (a compass, detecting the absolute direction (azimuth)).
- the 9-axis sensor 14 has a function of sensing the state of the user wearing the information processing apparatus 1 or the surrounding state.
- the 9-axis sensor 14 is an example of a sensor unit, and the present embodiment is not limited thereto.
- a speed sensor or a vibration sensor may be used in addition, or at least one of an acceleration sensor, a gyro sensor, and a geomagnetic sensor may be used.
- the sensor unit may be provided in a device different from the information processing device 1 (mounting unit), or may be provided in a distributed manner in a plurality of devices.
- an acceleration sensor, a gyro sensor, and a geomagnetic sensor may be provided in a device (for example, an earphone) attached to the head, and a speed sensor or a vibration sensor may be provided in a smartphone.
- the 9-axis sensor 14 outputs information indicating the sensing result (sensor information) to the control unit 10.
- the speaker 15 reproduces the audio signal processed by the reading control unit 10b according to the control of the control unit 10.
- the speaker 15 may have directivity.
- the position positioning unit 16 has a function of detecting the current position of the information processing apparatus 1 based on an externally acquired signal.
- the position positioning unit 16 is realized, for example, by a GPS (Global Positioning System) positioning unit, which receives radio waves from GPS satellites, detects the position where the information processing apparatus 1 is, and outputs the detected position information to the control unit 10.
- besides GPS, the position positioning unit 16 may detect the position by, for example, Wi-Fi (registered trademark), Bluetooth (registered trademark), transmission / reception with a mobile phone / PHS / smartphone, or short-range communication.
- the storage unit 17 stores programs and parameters for the above-described control unit 10 to execute each function. Further, the storage unit 17 according to the present embodiment may accumulate a reading history to be transmitted to the server 2.
- FIG. 3 is a block diagram illustrating an example of the configuration of the server 2 according to the present embodiment.
- the server 2 includes a control unit 20, a communication unit 21, and a storage unit 22.
- the control unit 20 functions as an arithmetic processing device and a control device, and controls the overall operation in the server 2 according to various programs.
- the control unit 20 is realized by an electronic circuit such as a CPU or a microprocessor.
- the control unit 20 may include a ROM that stores programs to be used, calculation parameters, and the like, and a RAM that temporarily stores parameters that change as appropriate.
- as shown in FIG. 3, the control unit 20 functions as a storage control unit 20a, a timeline UI generation unit 20b, and a transmission control unit 20c.
- the storage control unit 20a controls the storage unit 22 to store the reading history transmitted from the information processing apparatus 1 and received by the communication unit 21.
- the timeline UI generation unit 20b generates a timeline UI to be provided when the user confirms read-out information later based on the read-out history stored in the storage unit 22.
- a specific example of the timeline UI to be generated will be described later with reference to FIGS.
- the transmission control unit 20c controls to transmit the generated timeline UI from the communication unit 21 to the display device 3 (for example, a user's smartphone).
- the communication unit 21 is a communication module for transmitting and receiving data to and from other devices by wire / wireless.
- the communication unit 21 is connected to the information processing apparatus 1 via the network 4 and receives a reading history.
- the communication unit 21 is connected to the display device 3 via the network 4 and transmits the timeline UI generated by the control unit 20.
- the storage unit 22 stores programs and parameters for the above-described control unit 20 to execute each function. Further, the storage unit 22 according to the present embodiment accumulates the reading history transmitted from the information processing apparatus 1. Here, a data example of the reading history will be described with reference to FIG.
- FIG. 4 is a diagram showing an example of the reading history data according to the present embodiment.
- the reading history data stores, for example, the reading date / time, position (for example, latitude / longitude information), position name, high context of the action, low context of the action, operation (operation input by the user), reading information, and read-out information in association with each other.
- the position name can be acquired with reference to map data based on latitude and longitude information, for example.
- the position name may be recognized by the user situation recognition unit 10c of the information processing apparatus 1 or may be recognized on the server 2 side.
- the “reading information” field indicates the acquisition source of the reading information (for example, the URL when it was acquired from the network).
- the information actually read out is stored as the “read-out information”.
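One possible in-memory shape for a single record of this reading history, mirroring the fields of FIG. 4, is sketched below; the class and field names are assumptions for illustration.

```python
# Hypothetical record shape for one reading-history entry (cf. FIG. 4).
from dataclasses import dataclass

@dataclass
class ReadingHistory:
    read_at: str        # reading date / time
    position: tuple     # (latitude, longitude)
    position_name: str  # e.g. resolved from map data
    high_context: str   # e.g. "commuting", "going out"
    low_context: str    # e.g. "walking", "train"
    operation: str      # user operation during reading, "" if none
    source: str         # acquisition source of the reading information (URL etc.)
    read_out_text: str  # the portion actually output as voice
```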
- the configuration of the information processing system according to the present embodiment is not limited to the example illustrated in FIG. 1; a system configuration in which the functions of the server 2 described above are provided in the display device 3, realized by an information processing terminal such as a smartphone, may also be employed.
- FIG. 5 is a flowchart showing a reading process by the information processing apparatus 1 according to the present embodiment.
- the information processing apparatus 1 recognizes a user situation by the user situation recognition unit 10c (step S103).
- a reading event occurs at a preset time, periodically, irregularly, or when new information is acquired. For example, a reading event of the latest news or event information may occur at a fixed time of the day. Alternatively, the user situation may be continuously recognized, and a read-out event may be generated when the recognition result satisfies a predetermined condition.
- the user situation can be recognized based on various information acquired from the microphone 12, the camera 13, the 9-axis sensor 14 (acceleration sensor, gyro sensor, geomagnetic sensor, etc.), and the position positioning unit 16 (GPS, etc.).
- the user situation recognition unit 10c recognizes the user's position, the high context of the action, the low context, and the like.
- the information processing apparatus 1 acquires reading information (step S106).
- the information processing apparatus 1 performs information reading control (that is, audio output control from the speaker 15) (step S109).
- in step S115, the reading control unit 10b of the information processing apparatus 1 determines whether or not to perform additional reading.
- user operations during reading include, for example, Skip, More, Bookmark, Again, and Previous. Since “More” is a request for more detailed information, the information processing apparatus 1 performs additional reading in response to it.
- the information processing device 1 transmits to the server 2 a reading history including the reading date and time, position, high context and low context of the action, user operation during reading (Skip, More, Bookmark, Again, Previous), reading information, and read-out information (step S118).
- FIG. 6 is a flowchart showing timeline UI generation processing by the server 2 of this embodiment.
- when the server 2 receives a timeline UI acquisition request from an external device (here, the display device 3) (step S120), the server 2 acquires the reading history of the target user stored in the storage unit 22 (step S123).
- the timeline UI generation unit 20b of the server 2 determines the user load based on the behavior information (high context, low context) included in the reading history (step S126).
- the user load indicates the degree to which it is difficult for the user to listen to voice information (to concentrate on the voice information). For example, during running or cycling, the user concentrates on running or driving, so it is difficult to listen to voice information, and the user load is determined to be high. During walking, the user load is determined to be lower than during running or cycling. When the user is on a train, the user load is determined to be lower than during walking, that is, the voice information is easy to listen to.
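The load determination of step S126 could be sketched as follows; the mapping follows the examples in the text (running and cycling high, walking medium, train low), but the three discrete levels are an assumption.

```python
# Sketch of the user-load determination (step S126); levels are illustrative.

def user_load(low_context):
    """Map a low-context action to a coarse user-load level."""
    if low_context in {"running", "cycling"}:
        return "high"    # hard to concentrate on voice information
    if low_context == "walking":
        return "medium"
    return "low"         # e.g. on a train: easy to listen
```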
- the timeline UI generation unit 20b performs a preference determination on the user's voice information based on the operation information included in the reading history (step S129). For example, when a “Skip” operation was performed, a negative determination is made (the user is judged not to like, i.e. not to be interested in, the information), and when a “More”, “Bookmark”, or “Again” operation was performed, a positive determination is made (the user is judged to like, i.e. to be interested in, the information). When the “Previous” operation was performed, or when no operation was performed, neither a negative nor a positive determination is made.
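The rules above map directly to a small function; this is a sketch of step S129, with `None` standing in for "no determination".

```python
# Sketch of the preference determination (step S129), following the rules in
# the text; returns None when neither determination is made.

def preference(operation):
    if operation == "Skip":
        return "negative"                    # not interested
    if operation in ("More", "Bookmark", "Again"):
        return "positive"                    # interested
    return None                              # "Previous" or no operation
```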
- the timeline UI generation unit 20b calculates the granularity of the display information based on the user load and the preference determination result (step S132).
- the granularity of the display information indicates how much detail of the audio information is displayed on the timeline UI (whether only the title is displayed, or the text as well). For example, the timeline UI generation unit 20b calculates the granularity as “large” when the user load is high or a positive determination was made, “medium” when the user load is medium or there was no preference determination, and “small” when the user load is low or a negative determination was made.
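Combining the two inputs of step S132 could look like the sketch below. The text gives the per-input rules but not their precedence when they conflict; letting the preference determination take priority over the user load is an assumption made here.

```python
# Sketch of the granularity calculation (step S132). Assumption: an explicit
# preference determination takes priority over the user load.

def display_granularity(load, pref):
    """load: "high"/"medium"/"low"; pref: "positive"/"negative"/None."""
    if pref == "positive" or (pref is None and load == "high"):
        return "large"
    if pref == "negative" or (pref is None and load == "low"):
        return "small"
    return "medium"
```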
- the timeline UI generation unit 20b generates a timeline UI based on the calculated granularity information and various information included in the reading history (step S135). For example, the timeline UI generation unit 20b arranges the read-out information in time series together with an icon, a position name, and a time indicating the high context of the user's action at the time of reading. In addition, the timeline UI generation unit 20b controls how detailed the read-out information is displayed according to the calculated granularity information.
- the process in the case of generating the timeline UI based on the user load, the preference determination result, and the granularity information has been described above, but the present embodiment is not limited to this.
- for example, the read-out information alone may be displayed in time series, or captured images captured at the time of reading may be displayed in time series. Specific examples of such timeline UI variations will be described later.
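Assembling the timeline UI model of step S135 — grouping entries by the high context of the action and trimming the shown text by granularity — could be sketched as follows; the record shape and the character cutoffs are assumptions for illustration.

```python
# Sketch of timeline assembly (step S135): group read-out entries by high
# context and truncate shown text by granularity. Cutoffs are illustrative.

def build_timeline(histories):
    """histories: time-sorted dicts with 'time', 'high_context',
    'text', 'granularity'. Returns a list of display fields."""
    timeline = []
    for h in histories:
        # Start a new display field whenever the high context switches.
        if not timeline or timeline[-1]["high_context"] != h["high_context"]:
            timeline.append({"high_context": h["high_context"], "entries": []})
        if h["granularity"] == "large":
            shown = h["text"]          # full text
        elif h["granularity"] == "medium":
            shown = h["text"][:40]     # e.g. title and summary
        else:
            shown = h["text"][:15]     # e.g. title only
        timeline[-1]["entries"].append((h["time"], shown))
    return timeline
```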
- steps S123 to S135 are repeated until all of the reading history for one day has been processed (step S138); if unprocessed history remains, the process returns to step S123.
- the server 2 transmits the generated timeline UI to the external device that is the acquisition request source of the timeline UI, for example, the display device 3 (step S141).
- FIG. 7 is a flowchart showing a timeline UI display process by the display device 3 of the present embodiment. As illustrated in FIG. 7, first, the display device 3 makes a timeline UI acquisition request to the server 2 (step S150).
- when the display device 3 acquires the timeline UI from the server 2 (step S153 / Yes), the display device 3 displays the timeline UI on the display unit (step S156).
- when a user operation is performed (step S159 / Yes), the display device 3 determines whether or not to update the display according to the user operation (step S162).
- when updating the display (step S162 / Yes), the display device 3 returns to step S156 and updates the display of the timeline UI. For example, when the user taps the map displayed together with the timeline UI on the touch-panel display of the display device 3, the display device 3 updates the display by scrolling the timeline UI so as to show the voice information read out at the tapped position.
- FIG. 8 is a diagram showing a screen display example according to the first example of the present embodiment.
- In the timeline UI according to this example, a display field including one timeline map image (a map image showing the timeline trajectory) is displayed for each switch of user action (high context).
- For example, the display screen 30 shows a "6:50-7:30 On your way home" column and a "7:45-8:30 Outside" column in chronological order.
- Here, only the two display fields "on your way home" and "outside" are shown, but scrolling the screen reveals the display columns for other actions.
- The "Outside" display field includes a display image 301 of "7:45-8:30 Outside" indicating the high context of time and action, a map image 302, a display 303 of reading-related information, and a display 304 of read-out information.
- The information shown in the reading-related information display 303 and the read-out information display 304 is the information that was read out in the vicinity of a point arbitrarily tapped on the timeline trajectory on the map (the movement route during the action; here, the movement route while "outside"). On the timeline trajectory, pins are displayed at the points where information was read out.
- In addition, a captured image of the vicinity of the arbitrarily tapped point (an image captured by the camera 13 of the information processing apparatus 1 when the user was passing that point) is displayed.
- The captured images at each point may also be displayed sequentially.
- By looking at the scene in the captured image, the user can recall, for example, "I want to confirm the information I heard when I was at this place," and can thus easily search for information using the scene as a clue.
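A minimal sketch of mapping a tapped map coordinate to the nearest read-out point, assuming each reading record carries the latitude/longitude where it was read (the function name, record layout, and distance threshold are illustrative, and plane distance stands in for true geodesic distance):

```python
import math

def nearest_reading(readings, tap, max_dist=0.002):
    """Return the reading whose recorded (lat, lon) is closest to the
    tapped map coordinate, or None if nothing was read out nearby."""
    best, best_d = None, float("inf")
    for r in readings:
        d = math.hypot(r["lat"] - tap[0], r["lon"] - tap[1])
        if d < best_d:
            best, best_d = r, d
    return best if best_d <= max_dist else None

readings = [
    {"lat": 35.6595, "lon": 139.7005, "title": "Weather"},
    {"lat": 35.6610, "lon": 139.7030, "title": "News"},
]
hit = nearest_reading(readings, (35.6596, 139.7006))
```

A hit would populate the reading-related information display and read-out information display for that pin; a miss (tap far from any pin) would leave them unchanged.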
- The reading-related information display 303 includes the reading time, the type of the read-out information (news, event, etc.), the low context of the action (running, walking, riding a bicycle, riding a train, etc.), and the location (place name).
- In the read-out information display 304, the text of the read-out information is displayed.
- Specifically, as shown in FIG. 8, a title display 305, an information provider, and a body text 306 are displayed.
- When a link to the information provider (for example, a news site) is selected,
- the screen transitions to the news site.
- In the body text 306, the text that has been read out (read-out information) is displayed small, and the text that has not been read out is displayed large. For example, when only the title and the first sentence of the body were read out at the time of reading, the second and subsequent sentences are displayed larger. In this way, information that has not been read out is highlighted.
- Here, a display mode in which characters are enlarged is used for the highlighting.
- However, the present embodiment is not limited to this; the highlighting may be performed by, for example, using a different font, typeface, or background, or by adding an animation.
- A typical situation in which the user confirms the audio information later is that the user was interested in the presented audio information and wants to know more details. Therefore, if it can be intuitively grasped whether or not information is unheard information, convenience is further improved.
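A small sketch of this highlighting, assuming the body sentences are tracked together with how many of them were read aloud, and using bold markup as a stand-in for the enlarged font (the names are illustrative, not the patented implementation):

```python
def render_body(sentences, read_count):
    """Mark sentences that were not read aloud so the UI can emphasize
    them; here unread text is wrapped in <b>...</b> as a stand-in for a
    larger font size."""
    parts = []
    for i, s in enumerate(sentences):
        parts.append(s if i < read_count else f"<b>{s}</b>")
    return " ".join(parts)

html = render_body(["First sentence.", "Second sentence."], read_count=1)
```

Swapping the `<b>` wrapper for a different font, background, or animation trigger would give the alternative highlighting modes mentioned above.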
- The "on your way home" display column is composed in the same way. That is, it includes a display image 307 of "6:50-7:30 On your way home" indicating the high context of time and action, a map image 308, a display 309 of reading-related information, and a display 310 of read-out information.
- In the read-out information display 310, a title display 311, an information provider, and a body text 312 are displayed as the text of the read-out information. In the body text 312, the information that has not been read out (the second and subsequent sentences of the body) is highlighted (displayed in a large character size).
- FIGS. 9 to 14 are diagrams for explaining screen transition of the timeline UI according to the present embodiment.
- Items indicating each high context of the user's behavior of the day are displayed in chronological order as "Today's Timeline".
- When an item is tapped, a map image showing the timeline trajectory during that action is displayed. For example, as shown in FIG. 9, when "7:45-8:30 Outside" (outing) is tapped among the items, a map image 314 showing the timeline trajectory while outside is displayed, as shown in the screen 32 on the right side of FIG. 9.
- Next, the user taps an arbitrary point on the timeline trajectory.
- A pin standing on the timeline trajectory indicates a point where information was read out.
- Then, the information read out at the tapped point is displayed.
- When another point is tapped, as shown in the screen 36 on the right side of FIG. 11, the display is switched to the information of that point (including a reading-related information display 321 and a read-out information display 322).
- weather forecast information read out when running in the park at 8:25 is displayed.
- the timeline locus is first shown on the map image, and when the user taps an arbitrary point, the scene (captured image) of the tapped point is displayed.
- the text of the information read out at the same point is displayed, and the information read out at the same point is output again by voice.
- the text displayed here may be the text of the voice information that has been read out.
- a plurality of high context timeline trajectories are displayed on the map image.
- a pin indicating the point where the information is read out, and the type, time and action (low context) of the read out information are displayed together on the map.
- When the user taps an arbitrary point, the scene (captured image) at the tapped point is displayed on the map image, the text of the information read out at that point is further displayed, and the information read out at that point is output again by voice.
- In this way, by outputting the read-out information again by voice and reproducing the situation at the time of reading, the user can be helped to remember which piece of voice information he or she wanted to hear in more detail.
- <4-2. Second Example> Next, a screen display example according to the second example will be described with reference to FIG. 16.
- In the second example, the voice information read out during each user action is displayed for each action.
- At this time, the user's information search is supported by changing the display granularity of the voice information according to the user preference based on the user operation at the time of reading.
- Items 450 and 454 indicating the user's behavior of the day (here, low context as an example) are displayed in chronological order as "Today's Timeline", and the information read out during each action is displayed under each item.
- In addition, icons 451, 457, and 459 indicating the user operations performed at the time of reading are displayed. For example, if the user performed a voice operation instructing "Bookmark" (uttering "Bookmark") for the event information read out while running in the park at 7:45, the timeline UI generation unit 20b of the server 2 determines that positive feedback was performed. Accordingly, since the event information reflects the user's interest, display control is performed with the information granularity "large"; that is, the title 452 of the read-out information and the full body text 453 are displayed. It is assumed that only the title and the first sentence of the body were read out at the time of reading.
- Similarly, the icon 459 indicates that the "More" voice operation was performed; this is judged as a positive feedback operation, and the news information is displayed with a large granularity. That is, for example, as shown in FIG. 16, when the "More" voice operation was performed on the third news item read out while the user was on the train at 7:10, the title 460 and the full body text 461 are displayed.
- The user operations determined to be positive feedback operations are, for example, "More", "Again", and "Bookmark"; in these cases, the information can be displayed with a large granularity.
- On the other hand, the "Skip" voice operation is determined to be a negative feedback operation.
- In this case, the information is displayed with the granularity "small". For example, as shown in FIG. 16, when the "Skip" voice operation was performed on the second news item read out while the user was on the train at 7:10, the timeline UI generation unit 20b of the server 2 determines that negative feedback was performed and displays only the title 458.
- A predetermined icon 457 indicating that the "Skip" voice operation was performed is also displayed.
- In this way, by reducing the amount displayed for information the user is not interested in, the user's search for the information he or she wants to check is supported.
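The feedback-to-granularity rule of this example can be sketched as follows (a hedged illustration: the operation sets and rendering rules mirror the text above, but the function names and the "medium" default for undetermined preference are assumptions):

```python
POSITIVE_OPS = {"More", "Again", "Bookmark"}
NEGATIVE_OPS = {"Skip"}

def granularity_from_feedback(voice_op):
    """Map the voice operation performed at reading time to a display
    granularity for the timeline UI."""
    if voice_op in POSITIVE_OPS:
        return "large"   # title + full body
    if voice_op in NEGATIVE_OPS:
        return "small"   # title only
    return "medium"      # preference not determined: title + first sentence

def render(title, body_sentences, granularity):
    """Render a timeline entry at the chosen granularity."""
    if granularity == "large":
        return title + " " + " ".join(body_sentences)
    if granularity == "medium":
        return title + " " + body_sentences[0]
    return title
```

For example, a "Bookmark" operation would yield the title plus full body, while "Skip" would collapse the entry to its title.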
- Next, a screen display example according to the third example will be described with reference to FIG. 17.
- In the second example described above, the granularity of the information display was changed according to the user preference based on the user operation on the read-out information.
- However, the present embodiment is not limited to this; the granularity of the information display may also be changed according to the user load based on the user action at the time of reading.
- In the third example, such a change of the information display granularity according to the user load will be described.
- Items 461, 462, 463, and 464 indicating the user's behavior of the day (here, low context as an example) are displayed in chronological order as "Today's Timeline", and under each item the text of the voice information read out during that action is displayed.
- The user load corresponding to the user's behavior indicates the degree to which it is difficult for the user to hear the voice information (to concentrate on the voice information). For example, the user load is high during running or cycling (that is, it is difficult to hear voice information). Therefore, as shown in FIG. 17, the event information read out while the user was running at 7:45 (item 461) is highly likely to have been missed, so display control is performed with the information granularity "large"; specifically, for example, the title and the full body text are displayed.
- It is assumed here that, unless a voice operation such as "More" was performed, only the title and the first sentence of the body were read out.
- When the user load is medium, the read-out information is displayed with the information granularity "medium". For example, as shown in FIG. 17, the news information read out while the user was walking at 7:10 (item 462) is likely to have been only partially heard, so it is displayed with the information granularity "medium"; specifically, for example, the title and the first sentence of the body are displayed.
- When the user load is low, the read-out information is displayed with the information granularity "small". For example, as shown in FIG. 17, the news information read out while the user was on the train at 7:12 (item 463) is likely to have been heard well, so it is displayed with the information granularity "small"; specifically, for example, only the title is displayed. Since information read aloud on the train was clearly heard, it is unlikely to be reconfirmed later; by reducing the amount of such voice information displayed, it can be kept from getting in the way when the user checks other read-out information.
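The load-based rule can be sketched similarly (an assumption-laden illustration: the text only states that running/cycling imply high load, walking medium, and riding a train low; the lookup table, default, and names below are otherwise invented):

```python
# Assumed load level per low-context action.
USER_LOAD = {
    "running": "high",
    "cycling": "high",
    "walking": "medium",
    "train": "low",
}

def granularity_from_load(action):
    """Higher load while listening means the information was more likely
    missed, so more of it is shown when reviewed later."""
    load = USER_LOAD.get(action, "medium")
    return {"high": "large", "medium": "medium", "low": "small"}[load]
```

Entries read out while running would thus be expanded (title plus full body), while entries read out on the train would be collapsed to their titles.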
- In the second and third examples described above, the granularity of the display information in the timeline UI was changed according to the preference determination result based on the user operation at the time of reading or the user load based on the user action at the time of reading.
- However, the present embodiment is not limited to this.
- For example, the font size of the display information may further be changed according to the preference determination result or the user load.
- On the screen 47, items 471 and 474 indicating the user's behavior of the day are displayed in chronological order as "Today's Timeline", and below each item
- the text of the voice information read out during that action is displayed.
- For example, when a voice operation instructing "Bookmark" was performed on the first event information read out while the user was running in the park at 7:45, the timeline UI generation unit 20b of the server 2 determines that positive feedback was performed.
- In this case, display control is performed with the information granularity "large" and, in addition, with the font size "large". That is, the title and the full body text of the read-out information are displayed in a font size larger than that of negatively fed-back audio information and of audio information whose preference has not been determined.
- A predetermined icon 472 indicating that the "Bookmark" voice operation was performed is also displayed.
- On the other hand, no operation was performed on the second event information read out while the user was running in the park at 7:45.
- When a voice operation instructing "Skip" was performed on the third event information (the icon 473 means that the "Skip" operation was performed), display control is performed with the font size "small". In this way, by displaying audio information that the user is not interested in in a small size, it can be kept from getting in the way when the user searches for information while scrolling through the timeline UI.
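A sketch of the font-size variant, assuming three preference states (positive feedback, negative feedback, and undetermined); the concrete point sizes and the function name are illustrative only:

```python
def font_size_for(feedback):
    """Pick a font size from the preference determination result.
    Point sizes are placeholders, not values from the document."""
    if feedback == "positive":   # e.g. "Bookmark", "More"
        return 18
    if feedback == "negative":   # e.g. "Skip"
        return 10
    return 14                    # preference not determined

sizes = [font_size_for(f) for f in ("positive", None, "negative")]
```

The key invariant is only the ordering: positively fed-back entries render larger than undetermined ones, which render larger than negatively fed-back ones.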
- On the screen 48, items 481, 483, and 486 indicating the user's behavior of the day are displayed in chronological order as "Today's Timeline".
- Under each item, the text of the voice information read out during that action is displayed.
- Here, the voice information that was read out is displayed, and when reading was interrupted according to the context of the user's action, "paused" (displays 482 and 485) is shown; when reading was resumed, "continue" (displays 484 and 487) is shown.
- FIG. 20 is a diagram for explaining another timeline UI according to the present embodiment.
- As "Today's Timeline", captured images 491, 492, and 493 showing the scenes at the points where information was read out during the day (for example, peripheral images captured by the camera 13 of the information processing apparatus 1 at the time of reading, or images of each point prepared in advance) are displayed in chronological order.
- Since the memory of where one was when voice information was presented tends to remain, the user can, when later checking information he or she missed, easily search for the target information using the scenery he or she saw, as on the screen 49 shown in FIG. 20, as a clue.
- In addition, the text, time, information type, and the like of the voice information read out at each place may be displayed on the captured images.
- The text displayed on the screen 52 is the voice information that was actually read out (read-out information), and a predetermined keyword in the read-out information is highlighted (for example, displayed with an increased font size).
- The predetermined keyword is assumed to be a word likely to remain in the memory of a user who heard the read-out information, such as a proper noun or a noun used in the title.
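A minimal sketch of such keyword emphasis, assuming the keywords (e.g. proper nouns or nouns from the title) have already been extracted by some upstream step not described here; `<big>` markup stands in for the increased font size:

```python
import re

def highlight_keywords(text, keywords):
    """Wrap assumed memorable keywords so the UI can enlarge them;
    matching uses word boundaries so partial words are not touched."""
    for kw in keywords:
        text = re.sub(rf"\b{re.escape(kw)}\b", f"<big>{kw}</big>", text)
    return text

out = highlight_keywords("Rain expected in Shibuya tonight",
                         ["Shibuya", "Rain"])
```

The enlarged words then act as visual anchors when the user scans the timeline for half-remembered information.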
- Note that the present technology can also take the following configurations.
- (1) An information processing apparatus comprising: an output control unit that outputs information of a display screen that displays the text of voice information in chronological order at an information granularity determined based on a user operation at the time of reading out the voice information, the user operation being included in an acquired reading history of the voice information.
- (2) The information processing apparatus according to (1), wherein the information granularity is an amount of information and is controlled to be one of only the title, the title and a part of the body, or the title and the whole body.
- (3) The information processing apparatus according to (2), wherein the user operation is a voice input operation indicating a skip instruction, a repeat instruction, a detailed reproduction instruction, a bookmark instruction, or a go-back instruction.
- (4) The information processing apparatus according to any one of (1) to (3), wherein the output control unit further outputs information of a display screen in which the font size of the voice information is changed according to a user preference estimated based on the user operation.
- (5) The information processing apparatus according to any one of (1) to (4), wherein the information granularity of the voice information is controlled according to a user load estimated based on a user action at the time of reading out the voice information included in the reading history.
- (6) The information processing apparatus according to any one of (1) to (5), wherein the display screen further includes at least one of a recognition result of the user action at the time of reading out the voice information included in the reading history, the date and time of reading, the place, and the type of the read-out information.
- (7) The information processing apparatus according to any one of (1) to (6), wherein, in the read-out information displayed on the display screen, the portion of the text that has not been read out is highlighted relative to the text of the already-read-out information.
- (8) The information processing apparatus according to any one of (1) to (7), wherein a movement trajectory based on the user's position history at the time of reading out each piece of voice information included in the reading history is displayed on a map image included in the display screen, and voice information read out in the vicinity of an arbitrary point of the movement trajectory designated by the user is further displayed.
Description
1. Overview of an information processing system according to an embodiment of the present disclosure
2. Configuration
2-1. Configuration of the information processing apparatus 1
2-2. Configuration of the server 2
3. Operation processing
3-1. Reading processing
3-2. Timeline UI generation processing
3-3. Timeline UI display processing
4. Screen display examples
4-1. First example
4-2. Second example
4-3. Third example
4-4. Fourth example
4-5. Fifth example
4-6. Others
5. Conclusion
First, an overview of an information processing system according to an embodiment of the present disclosure will be described with reference to FIG. 1. FIG. 1 is a diagram illustrating an overview of the information processing system according to the present embodiment.
Here, when a user wears a wearable device and routinely enjoys services such as information search, entertainment information, and action support information, there has been a problem that information presented visually cannot be checked in many of the "while doing something else" situations of daily life. For example, checking visual information while walking, riding a bicycle, or doing housework is dangerous because vision is temporarily taken away.
<2-1. Configuration of the information processing apparatus 1>
Next, the configuration of the information processing apparatus 1 according to the present embodiment will be described with reference to FIG. 2. FIG. 2 is a block diagram showing an example of the configuration of the information processing apparatus 1 according to the present embodiment. As shown in FIG. 2, the information processing apparatus 1 includes a control unit 10, a communication unit 11, a microphone 12, a camera 13, a 9-axis sensor 14, a speaker 15, a position measuring unit 16, and a storage unit 17.
The control unit 10 functions as an arithmetic processing device and a control device, and controls the overall operation in the information processing apparatus 1 according to various programs. The control unit 10 is realized by an electronic circuit such as a CPU (Central Processing Unit) or a microprocessor. The control unit 10 may also include a ROM (Read Only Memory) that stores programs to be used, calculation parameters, and the like, and a RAM (Random Access Memory) that temporarily stores parameters that change as appropriate.
The communication unit 11 is a communication module for transmitting and receiving data to and from other devices by wire or wirelessly. The communication unit 11 communicates wirelessly with external devices, directly or via a network access point, using, for example, wired LAN (Local Area Network), wireless LAN, Wi-Fi (Wireless Fidelity, registered trademark), infrared communication, Bluetooth (registered trademark), or short-range/contactless communication.
The microphone 12 picks up the user's voice and the surrounding environment and outputs them to the control unit 10 as audio data.
The camera 13 includes a lens system composed of an imaging lens, an aperture, a zoom lens, a focus lens, and the like; a drive system that causes the lens system to perform focus and zoom operations; and a solid-state image sensor array that photoelectrically converts the imaging light obtained by the lens system to generate an imaging signal. The solid-state image sensor array may be realized by, for example, a CCD (Charge Coupled Device) sensor array or a CMOS (Complementary Metal Oxide Semiconductor) sensor array. For example, the camera 13 is provided so as to be able to capture the area in front of the user while the information processing apparatus 1 (wearable unit) is worn by the user. In this case, the camera 13 can capture the scenery around the user or in the direction the user is looking. The camera 13 may also be provided so as to be able to capture the user's face while the information processing apparatus 1 is worn by the user. In this case, the information processing apparatus 1 can identify the user's line-of-sight direction and facial expression from the captured image. The camera 13 outputs the captured image data, converted into a digital signal, to the control unit 10.
The 9-axis sensor 14 includes a 3-axis gyro sensor (detecting angular velocity (rotational speed)), a 3-axis acceleration sensor (also called a G sensor; detecting acceleration during movement), and a 3-axis geomagnetic sensor (a compass; detecting the absolute direction (azimuth)). The 9-axis sensor 14 has a function of sensing the state of the user wearing the information processing apparatus 1 or the surrounding state. Note that the 9-axis sensor 14 is an example of a sensor unit, and the present embodiment is not limited to this; for example, a speed sensor, a vibration sensor, or the like may further be used, or at least one of an acceleration sensor, a gyro sensor, and a geomagnetic sensor may be used. The sensor unit may also be provided in a device other than the information processing apparatus 1 (wearable unit), or may be distributed over a plurality of devices. For example, the acceleration sensor, gyro sensor, and geomagnetic sensor may be provided in a device worn on the head (for example, an earphone), and the speed sensor and vibration sensor may be provided in a smartphone. The 9-axis sensor 14 outputs information indicating the sensing result (sensor information) to the control unit 10.
The speaker 15 reproduces the audio signal processed by the reading control unit 10b under the control of the control unit 10. The speaker 15 may have directivity.
The position measuring unit 16 has a function of detecting the current position of the information processing apparatus 1 based on an externally acquired signal. Specifically, for example, the position measuring unit 16 is realized by a GPS (Global Positioning System) positioning unit; it receives radio waves from GPS satellites, detects the position of the information processing apparatus 1, and outputs the detected position information to the control unit 10. In addition to GPS, the information processing apparatus 1 may detect its position by, for example, Wi-Fi (registered trademark), Bluetooth (registered trademark), transmission to and reception from a mobile phone, PHS, or smartphone, or short-range communication.
The storage unit 17 stores programs and parameters for the above-described control unit 10 to execute each function. The storage unit 17 according to the present embodiment may also accumulate the reading history to be transmitted to the server 2.
Next, the configuration of the server 2 according to the present embodiment will be described with reference to FIG. 3. FIG. 3 is a block diagram showing an example of the configuration of the server 2 according to the present embodiment. As shown in FIG. 3, the server 2 includes a control unit 20, a communication unit 21, and a storage unit 22.
The control unit 20 functions as an arithmetic processing device and a control device, and controls the overall operation in the server 2 according to various programs. The control unit 20 is realized by an electronic circuit such as a CPU or a microprocessor. The control unit 20 may also include a ROM that stores programs to be used, calculation parameters, and the like, and a RAM that temporarily stores parameters that change as appropriate.
The communication unit 21 is a communication module for transmitting and receiving data to and from other devices by wire or wirelessly. For example, the communication unit 21 connects to the information processing apparatus 1 via the network 4 and receives the reading history. The communication unit 21 also connects to the display device 3 via the network 4 and transmits the timeline UI generated by the control unit 20.
The storage unit 22 stores programs and parameters for the above-described control unit 20 to execute each function. The storage unit 22 according to the present embodiment also accumulates the reading history transmitted from the information processing apparatus 1. Here, an example of the reading history data will be described with reference to FIG. 4.
<3-1. Reading processing>
FIG. 5 is a flowchart showing the reading processing by the information processing apparatus 1 of the present embodiment. As shown in FIG. 5, first, when a reading event occurs (step S100), the information processing apparatus 1 recognizes the user situation with the user situation recognition unit 10c (step S103). A reading event occurs, for example, at a preset time, periodically, irregularly, or when new information is acquired. For example, a reading event for the latest news or event information may be generated at a fixed time of day. Alternatively, the user situation may be recognized continuously, and a reading event may be generated when the recognition result satisfies a predetermined condition. As described above, the user situation can be recognized based on various information acquired from the microphone 12, the camera 13, the 9-axis sensor 14 (acceleration sensor, gyro sensor, geomagnetic sensor, etc.), and the position measuring unit 16 (GPS, etc.). For example, the user situation recognition unit 10c recognizes the user's position and the high context and low context of the user's action.
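As a rough illustration of such low-context recognition (the actual recognition logic is not specified in this document; the thresholds, feature names, and function below are all hypothetical), a simple rule-based classifier over sensor-derived features might look like:

```python
def recognize_low_context(speed_mps, step_rate_hz):
    """Very rough low-context classifier from assumed sensor features:
    speed estimated from GPS positions and step rate from the
    acceleration sensor. Thresholds are illustrative placeholders."""
    if speed_mps > 15:       # too fast for self-powered movement
        return "train"
    if speed_mps > 4:        # fast but not vehicle-fast
        return "cycling"
    if step_rate_hz > 2.2:   # rapid stepping cadence
        return "running"
    if step_rate_hz > 0.5:   # steady, slower cadence
        return "walking"
    return "still"
```

In practice such a classifier would be far more elaborate (and could combine gyro, geomagnetic, and microphone data), but it shows how raw sensor features can be reduced to the low-context labels used throughout the examples.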
FIG. 6 is a flowchart showing the timeline UI generation processing by the server 2 of the present embodiment. As shown in FIG. 6, first, when the server 2 receives a timeline UI acquisition request from an external device (here, the display device 3) (step S120), it acquires the reading history of the target user stored in the storage unit 17 (step S123).
For example, the timeline UI generation unit 20b arranges the read-out information in chronological order together with an icon indicating the high context of the user's action at the time of reading, the place name, and the time. The timeline UI generation unit 20b also controls how much detail of the read-out information is displayed according to the calculated granularity information. For example, when the granularity is "large", the title and the whole body may be displayed; when the granularity is "medium", the title and the first sentence of the body may be displayed; and when the granularity is "small", only the title may be displayed. The operation processing shown in FIG. 6 describes the case of generating the timeline UI based on the user load, the preference determination result, and the granularity information, but the present embodiment is not limited to this. For example, the read-out information may simply be displayed in chronological order, or images captured at the time of reading may be displayed in chronological order. Specific examples of such various timeline UIs will be described later.
FIG. 7 is a flowchart showing the timeline UI display processing by the display device 3 of the present embodiment. As shown in FIG. 7, first, the display device 3 makes a timeline UI acquisition request to the server 2 (step S150).
Next, screen display examples of the timeline UI according to the present embodiment will be described in detail using several examples.
FIG. 8 is a diagram showing a screen display example according to the first example of the present embodiment. First, in the timeline UI according to this example, a display field including one timeline map image (a map image showing the timeline trajectory) is displayed for each switch of user action (high context). For example, in the example shown in FIG. 8, a "6:50-7:30 On your way home" column and a "7:45-8:30 Outside" column are displayed in chronological order on the display screen 30. In the example of FIG. 8, only the two display fields "on your way home" and "outside" are shown, but scrolling the screen reveals the display columns for other actions.
With reference to FIG. 15, a modification in which the text of voice information is displayed together with a map image showing the timeline trajectory will be described. In this modification, the user's information search is supported by reproducing the situation at the time the voice was presented.
Next, a screen display example according to the second example will be described with reference to FIG. 16. In the second example, the voice information read out during each user action is displayed for each action. At this time, the user's information search is supported by changing the display granularity of the voice information according to the user preference based on the user operation at the time of reading.
The operation is judged to be a positive feedback operation, and the news information is displayed with a large granularity. That is, for example, as shown in FIG. 16, when the "More" voice operation was performed on the third news item read out while the user was on the train at 7:10, the title 460 and the full body text 461 are displayed.
Next, a screen display example according to the third example will be described with reference to FIG. 17. In the second example described above, the granularity of the information display was changed according to the user preference based on the user operation on the read-out information, but the present embodiment is not limited to this; for example, the granularity of the information display may be changed according to the user load based on the user action at the time of reading. In the third example, such a change of the information display granularity according to the user load will be described.
Next, a screen display example according to the fourth example will be described with reference to FIG. 18. In the second and third examples described above, the granularity of the display information in the timeline UI was changed according to the preference determination result based on the user operation at the time of reading or the user load based on the user action at the time of reading, but the present embodiment is not limited to this; for example, the font size of the display information may further be changed according to the preference determination result or the user load.
Next, a screen display example according to the fifth example will be described with reference to FIG. 19. In this example, when the reading of information is interrupted and resumed according to the context of the user's action, that interruption and resumption information is also displayed in the timeline UI, making it easier for the user to remember the situation in which he or she heard the read-out information and thereby supporting information search.
The timeline UI according to the present embodiment has been described above in detail using several examples. Note that the timeline UI according to the present embodiment is not limited to the examples described above, and may further be as follows.
FIG. 20 is a diagram illustrating another timeline UI according to the present embodiment. As shown in FIG. 20, captured images 491, 492, and 493 showing the scenes at the points where information was read out during the day (for example, peripheral images captured by the camera 13 of the information processing apparatus 1 at the time of reading, or images of each point prepared in advance) are displayed in chronological order on the screen 49 as "Today's Timeline". Since the memory of where one was when voice information was presented tends to remain, the user can, when later checking information he or she missed, easily find the target information by using the scenery he or she saw, as on the screen 49 of FIG. 20, as a clue. In addition, the text, time, information type, and the like of the voice information read out at each place (read-out information) may be displayed on the captured images shown on the screen 49.
FIG. 21 is a diagram illustrating another timeline UI according to the present embodiment. As shown on the left side of FIG. 21, a timeline trajectory of the user's actions (the trajectory of the movement route) is displayed on a map image 501 on the screen 50 as "Today's Timeline". When the user taps an arbitrary point on the timeline trajectory, the text (or image) of the information read out at that point is displayed in the reading display area. On the screen 50, for example, the weather forecast information read out at the tapped point (item 502) is displayed in the reading display area (below the map image 501 in the example shown in FIG. 21).
FIG. 22 is a diagram illustrating another timeline UI according to the present embodiment. As shown in FIG. 22, items 521 and 522 indicating the user's actions of the day (here, low context as an example) are displayed in chronological order on the screen 52 as "Today's Timeline", and under each item the text of the voice information read out during that action is displayed.
As described above, the information processing apparatus 1 according to the embodiment of the present disclosure makes it possible to improve the convenience of checking voice information that has been read aloud. Specifically, by providing a UI that enables search based on the date and time, position, action, operation, or scene at the time of reading, it becomes easier for the user to later find information he or she missed or was interested in.
(1)
An information processing apparatus comprising: an output control unit that outputs information of a display screen that displays the text of voice information in chronological order at an information granularity determined based on a user operation at the time of reading out the voice information, the user operation being included in an acquired reading history of the voice information.
(2)
The information processing apparatus according to (1), wherein the information granularity is an amount of information and is controlled to be one of only the title, the title and a part of the body, or the title and the whole body.
(3)
The information processing apparatus according to (2), wherein the user operation is a voice input operation indicating a skip instruction, a repeat instruction, a detailed reproduction instruction, a bookmark instruction, or a go-back instruction.
(4)
The information processing apparatus according to any one of (1) to (3), wherein the output control unit further outputs information of a display screen in which the font size of the voice information is changed according to a user preference estimated based on the user operation.
(5)
The information processing apparatus according to any one of (1) to (4), wherein the information granularity of the voice information is controlled according to a user load estimated based on a user action at the time of reading out the voice information included in the reading history.
(6)
The information processing apparatus according to any one of (1) to (5), wherein the display screen further includes at least one of a recognition result of the user action at the time of reading out the voice information included in the reading history, the date and time of reading, the place, and the type of the read-out information.
(7)
The information processing apparatus according to any one of (1) to (6), wherein, in the read-out information displayed on the display screen, the portion of the text that has not been read out is highlighted relative to the text of the already-read-out information.
(8)
The information processing apparatus according to any one of (1) to (7), wherein a movement trajectory based on the user's position history at the time of reading out each piece of voice information included in the reading history is displayed on a map image included in the display screen, and voice information read out in the vicinity of an arbitrary point of the movement trajectory designated by the user is further displayed.
(9)
The information processing apparatus according to (8), wherein a captured image showing a scene in the vicinity of an arbitrary point of the movement trajectory designated by the user is further displayed on the display screen.
(10)
An information processing method comprising: outputting, by a processor, information of a display screen that displays the text of voice information in chronological order at an information granularity determined based on a user operation at the time of reading out the voice information, the user operation being included in an acquired reading history of the voice information.
(11)
A program for causing a computer to function as an output control unit that outputs information of a display screen that displays the text of voice information in chronological order at an information granularity determined based on a user operation at the time of reading out the voice information, the user operation being included in an acquired reading history of the voice information.
10 Control unit
10a Reading information acquisition unit
10b Reading control unit
10c User situation recognition unit
10d Operation recognition unit
10e Reading history transmission control unit
11 Communication unit
12 Microphone
13 Camera
14 9-axis sensor
15 Speaker
16 Position measuring unit
17 Storage unit
2 Server
20 Control unit
20a Storage control unit
20b Timeline UI generation unit
20c Transmission control unit
21 Communication unit
22 Storage unit
3 Display device
4 Network
Claims (11)
- An information processing apparatus comprising: an output control unit that outputs information of a display screen that displays the text of voice information in chronological order at an information granularity determined based on a user operation at the time of reading out the voice information, the user operation being included in an acquired reading history of the voice information.
- The information processing apparatus according to claim 1, wherein the information granularity is an amount of information and is controlled to be one of only the title, the title and a part of the body, or the title and the whole body.
- The information processing apparatus according to claim 2, wherein the user operation is a voice input operation indicating a skip instruction, a repeat instruction, a detailed reproduction instruction, a bookmark instruction, or a go-back instruction.
- The information processing apparatus according to claim 1, wherein the output control unit further outputs information of a display screen in which the font size of the voice information is changed according to a user preference estimated based on the user operation.
- The information processing apparatus according to claim 1, wherein the information granularity of the voice information is controlled according to a user load estimated based on a user action at the time of reading out the voice information included in the reading history.
- The information processing apparatus according to claim 1, wherein the display screen further includes at least one of a recognition result of the user action at the time of reading out the voice information included in the reading history, the date and time of reading, the place, and the type of the read-out information.
- The information processing apparatus according to claim 1, wherein, in the read-out information displayed on the display screen, the portion of the text that has not been read out is highlighted relative to the text of the already-read-out information.
- The information processing apparatus according to claim 1, wherein a movement trajectory based on the user's position history at the time of reading out each piece of voice information included in the reading history is displayed on a map image included in the display screen, and voice information read out in the vicinity of an arbitrary point of the movement trajectory designated by the user is further displayed.
- The information processing apparatus according to claim 8, wherein a captured image showing a scene in the vicinity of an arbitrary point of the movement trajectory designated by the user is further displayed on the display screen.
- An information processing method comprising: outputting, by a processor, information of a display screen that displays the text of voice information in chronological order at an information granularity determined based on a user operation at the time of reading out the voice information, the user operation being included in an acquired reading history of the voice information.
- A program for causing a computer to function as an output control unit that outputs information of a display screen that displays the text of voice information in chronological order at an information granularity determined based on a user operation at the time of reading out the voice information, the user operation being included in an acquired reading history of the voice information.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/085,419 US11074034B2 (en) | 2016-04-27 | 2017-01-24 | Information processing apparatus, information processing method, and program |
CN201780024799.XA CN109074240B (zh) | 2016-04-27 | 2017-01-24 | 信息处理设备、信息处理方法和程序 |
JP2018514113A JP6891879B2 (ja) | 2016-04-27 | 2017-01-24 | 情報処理装置、情報処理方法、およびプログラム |
EP17788968.0A EP3451149A4 (en) | 2016-04-27 | 2017-01-24 | INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND PROGRAM |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2016089227 | 2016-04-27 | ||
JP2016-089227 | 2016-04-27 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017187678A1 true WO2017187678A1 (ja) | 2017-11-02 |
Family
ID=60161278
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2017/002309 WO2017187678A1 (ja) | 2016-04-27 | 2017-01-24 | 情報処理装置、情報処理方法、およびプログラム |
Country Status (5)
Country | Link |
---|---|
US (1) | US11074034B2 (ja) |
EP (1) | EP3451149A4 (ja) |
JP (1) | JP6891879B2 (ja) |
CN (1) | CN109074240B (ja) |
WO (1) | WO2017187678A1 (ja) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019198299A1 (ja) * | 2018-04-11 | 2019-10-17 | ソニー株式会社 | 情報処理装置及び情報処理方法 |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11363953B2 (en) * | 2018-09-13 | 2022-06-21 | International Business Machines Corporation | Methods and systems for managing medical anomalies |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001005634A (ja) * | 1999-06-24 | 2001-01-12 | Hitachi Ltd | 電子メール受信装置 |
JP2010026813A (ja) * | 2008-07-18 | 2010-02-04 | Sharp Corp | コンテンツ表示装置、コンテンツ表示方法、プログラム、記録媒体、および、コンテンツ配信システム |
JP2015169768A (ja) * | 2014-03-06 | 2015-09-28 | クラリオン株式会社 | 対話履歴管理装置、対話装置および対話履歴管理方法 |
Family Cites Families (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7020663B2 (en) * | 2001-05-30 | 2006-03-28 | George M. Hay | System and method for the delivery of electronic books |
JP2006023860A (ja) * | 2004-07-06 | 2006-01-26 | Sharp Corp | 情報閲覧装置、情報閲覧プログラム、情報閲覧プログラム記録媒体及び情報閲覧システム |
JP5250827B2 (ja) * | 2008-09-19 | 2013-07-31 | 株式会社日立製作所 | 行動履歴の生成方法及び行動履歴の生成システム |
WO2010105246A2 (en) * | 2009-03-12 | 2010-09-16 | Exbiblio B.V. | Accessing resources based on capturing information from a rendered document |
JP2012063526A (ja) * | 2010-09-15 | 2012-03-29 | Ntt Docomo Inc | 端末装置、音声認識方法および音声認識プログラム |
US10672399B2 (en) * | 2011-06-03 | 2020-06-02 | Apple Inc. | Switching between text data and audio data based on a mapping |
JP5821307B2 (ja) | 2011-06-13 | 2015-11-24 | ソニー株式会社 | 情報処理装置、情報処理方法及びプログラム |
CN102324191B (zh) * | 2011-09-28 | 2015-01-07 | Tcl集团股份有限公司 | 一种有声读物逐字同步显示方法及系统 |
KR101309794B1 (ko) * | 2012-06-27 | 2013-09-23 | 삼성전자주식회사 | 디스플레이 장치, 디스플레이 장치의 제어 방법 및 대화형 시스템 |
CN103198726A (zh) * | 2013-04-23 | 2013-07-10 | 李华 | 英语学习设备 |
CN103365988A (zh) * | 2013-07-05 | 2013-10-23 | 百度在线网络技术(北京)有限公司 | 对移动终端的图片文字朗读的方法、装置和移动终端 |
GB2518002B (en) * | 2013-09-10 | 2017-03-29 | Jaguar Land Rover Ltd | Vehicle interface system |
US20150120648A1 (en) * | 2013-10-26 | 2015-04-30 | Zoom International S.R.O | Context-aware augmented media |
US9794511B1 (en) * | 2014-08-06 | 2017-10-17 | Amazon Technologies, Inc. | Automatically staged video conversations |
CN107193841B (zh) * | 2016-03-15 | 2022-07-26 | 北京三星通信技术研究有限公司 | 媒体文件加速播放、传输及存储的方法和装置 |
-
2017
- 2017-01-24 US US16/085,419 patent/US11074034B2/en active Active
- 2017-01-24 JP JP2018514113A patent/JP6891879B2/ja active Active
- 2017-01-24 EP EP17788968.0A patent/EP3451149A4/en not_active Withdrawn
- 2017-01-24 WO PCT/JP2017/002309 patent/WO2017187678A1/ja active Application Filing
- 2017-01-24 CN CN201780024799.XA patent/CN109074240B/zh not_active Expired - Fee Related
Non-Patent Citations (1)
Title |
---|
See also references of EP3451149A4 * |
Also Published As
Publication number | Publication date |
---|---|
US11074034B2 (en) | 2021-07-27 |
EP3451149A4 (en) | 2019-04-17 |
CN109074240B (zh) | 2021-11-23 |
EP3451149A1 (en) | 2019-03-06 |
CN109074240A (zh) | 2018-12-21 |
JPWO2017187678A1 (ja) | 2019-02-28 |
JP6891879B2 (ja) | 2021-06-18 |
US20190073183A1 (en) | 2019-03-07 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 2018514113 Country of ref document: JP |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2017788968 Country of ref document: EP |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 17788968 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2017788968 Country of ref document: EP Effective date: 20181127 |