WO2022176440A1 - Receiving device, transmitting device, information processing method, and program - Google Patents
Receiving device, transmitting device, information processing method, and program
- Publication number
- WO2022176440A1 (PCT application PCT/JP2022/000744)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- reproduction
- signal
- information
- data
- haptic
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R5/00—Stereophonic arrangements
- H04R5/033—Headphones for stereophonic communication
- H04R5/0335—Earpiece support, e.g. headbands or neckrests
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/25—Output arrangements for video game devices
- A63F13/28—Output arrangements for video game devices responding to control signals received from the game device for affecting ambient conditions, e.g. for vibrating players' seats, activating scent dispensers or affecting temperature or light
- A63F13/285—Generating tactile feedback signals via the game input device, e.g. force feedback
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/20—Input arrangements for video game devices
- A63F13/21—Input arrangements for video game devices characterised by their sensors, purposes or types
- A63F13/215—Input arrangements for video game devices characterised by their sensors, purposes or types comprising means for detecting acoustic signals, e.g. using a microphone
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/40—Processing input control signals of video game devices, e.g. signals generated by the player or derived from the environment
- A63F13/42—Processing input control signals of video game devices, e.g. signals generated by the player or derived from the environment by mapping the input signals into game commands, e.g. mapping the displacement of a stylus on a touch screen to the steering angle of a virtual vehicle
- A63F13/424—Processing input control signals of video game devices, e.g. signals generated by the player or derived from the environment by mapping the input signals into game commands, e.g. mapping the displacement of a stylus on a touch screen to the steering angle of a virtual vehicle involving acoustic input signals, e.g. by using the results of pitch or rhythm extraction or voice recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/016—Input arrangements with force or tactile feedback as computer generated output to the user
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
- H04L65/70—Media network packetisation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/41—Structure of client; Structure of client peripherals
- H04N21/4104—Peripherals receiving signals from specially adapted client devices
- H04N21/4131—Peripherals receiving signals from specially adapted client devices home appliance, e.g. lighting, air conditioning system, metering devices
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/4302—Content synchronisation processes, e.g. decoder synchronisation
- H04N21/4307—Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
- H04N21/43079—Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen of additional data with content streams on multiple devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2203/00—Indexing scheme relating to G06F3/00 - G06F3/048
- G06F2203/01—Indexing scheme relating to G06F3/01
- G06F2203/013—Force feedback applied to a game
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G10L21/16—Transforming into a non-visible representation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2460/00—Details of hearing devices, i.e. of ear- or headphones covered by H04R1/10 or H04R5/033 but not provided for in any of their subgroups, or of hearing aids covered by H04R25/00 but not provided for in any of its subgroups
- H04R2460/13—Hearing devices using bone conduction transducers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/12—Circuits for transducers, loudspeakers or microphones for distributing signals to two or more loudspeakers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R5/00—Stereophonic arrangements
- H04R5/02—Spatial or constructional arrangements of loudspeakers
Definitions
- the present technology relates to the technical field of a receiving device, a transmitting device, an information processing method, and a program for transmitting and receiving data related to tactile presentation.
- tactile presentation means letting the user experience the sensation of touching an object, or of colliding with an object, by means of vibration, pressure, or the like.
- a tactile signal for the tactile stimulus is required.
- a tactile signal is generated, for example, by attaching various sensors to the user and using the values measured by those sensors.
- substantial financial and time costs are required to fully prepare such an environment for generating haptic signals.
- Patent Document 1 discloses a technique of generating a tactile signal using an acoustic signal (audio signal).
- when the haptic signal generated from the acoustic signal is to be provided to a playback device, a mechanism for wirelessly transmitting it together with the acoustic signal needs to be established, which is a problem.
- acoustic signals include not only signals suitable for presenting tactile stimuli, such as explosion sounds, but also signals unsuitable for it, such as background music and dialogue.
- a tactile sensation generated in response to such an unnecessary acoustic signal fails to improve the user's sense of realism and may even cause discomfort.
- This technology was created in view of such problems, and aims to provide an environment that presents appropriate tactile stimulation to the user.
- a receiving device includes a reception processing unit that receives data including an acoustic signal and reproduction permission/prohibition information for a haptic signal, and a haptic signal generation unit that generates the haptic signal based on the acoustic signal received by the reception processing unit.
- the haptic signal generation unit generates the haptic signal when the reproduction permission/prohibition information indicates that reproduction is permitted, and does not generate it when the information indicates that reproduction is prohibited. This makes it possible to create sections in which no tactile sensation is presented along with the acoustic signal; for example, a section in which tactile presentation matching the acoustic signal would be inappropriate can be marked with information indicating that the haptic signal is not to be generated.
- the reproduction permission/prohibition information may be provided for each piece of acoustic frame data, i.e., the acoustic signal divided at predetermined time intervals; the haptic signal generation unit generates the haptic signal based on acoustic frame data whose information indicates that reproduction is permitted, and does not generate it based on acoustic frame data whose information indicates that reproduction is prohibited.
- the reproduction permission/prohibition information may be 1-bit flag information, which keeps the amount of data received by the reception processing unit small.
- the received data may be encoded data produced by an audio encoding scheme, with the acoustic frame data stored in a payload area of the encoded data and the reproduction permission/prohibition information stored in a reserved area.
- in this way, reception of the reproduction permission/prohibition information is realized using the existing mechanism for transmitting acoustic frame data.
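As an illustration, the 1-bit flag could be carried in a reserved bit of a per-frame header. The following Python sketch assumes a hypothetical 4-byte header layout (15-bit payload length, 1 flag bit, 16-bit sequence number); the actual encoded-data structure is codec-specific and is not defined by this document.

```python
import struct

# Hypothetical 4-byte frame header: 15-bit payload length plus one reserved
# bit reused to carry the 1-bit haptic playback flag, then a 16-bit sequence
# number. Real audio codecs define their own reserved bits; this layout is
# purely illustrative.
FLAG_BIT = 0x8000

def pack_header(payload_len: int, seq: int, haptic_ok: bool) -> bytes:
    first = (payload_len & 0x7FFF) | (FLAG_BIT if haptic_ok else 0)
    return struct.pack(">HH", first, seq & 0xFFFF)

def unpack_header(header: bytes):
    first, seq = struct.unpack(">HH", header)
    return first & 0x7FFF, seq, bool(first & FLAG_BIT)

length, seq, ok = unpack_header(pack_header(960, 7, True))
```

Because the flag rides in a bit the audio decoder already skips, a legacy receiver can ignore it while a haptic-aware receiver extracts it per frame.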
- the reproduction permission/prohibition information may be generated based on the partial video data reproduced in synchronization with the acoustic frame data. In some cases it cannot be determined from the acoustic signal alone whether a tactile stimulus should be presented to the user; this configuration prevents inappropriate information from being generated from the acoustic frame data alone.
- the haptic signal generation unit in the receiving device described above may perform fade-in processing and fade-out processing on the generated haptic signal.
- Fade processing such as fade-in and fade-out gradually increases or decreases a signal over time, for example by multiplying the signal by a predetermined gain function.
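A minimal sketch of such fade processing, multiplying a frame of haptic samples by a linear gain ramp; the ramp shape and its length (the whole frame) are illustrative assumptions, and a real implementation might instead use a raised-cosine window or ramp only part of the frame.

```python
def apply_fade(frame, fade_in=False, fade_out=False):
    """Multiply a haptic frame by a linear gain ramp.

    fade_in ramps the gain 0 -> 1 across the frame; fade_out ramps 1 -> 0.
    Assumes the frame has at least two samples.
    """
    n = len(frame)
    out = list(frame)
    if fade_in:
        out = [s * i / (n - 1) for i, s in enumerate(out)]
    if fade_out:
        out = [s * (n - 1 - i) / (n - 1) for i, s in enumerate(out)]
    return out
```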
- the haptic signal generation unit in the receiving device described above may perform fade-in processing on the haptic signal generated from a target piece of acoustic frame data when the reproduction permission/prohibition information for that frame indicates that reproduction is permitted while the information for the immediately preceding frame indicates that it is prohibited, and may perform fade-out processing on the haptic signal generated from the immediately preceding frame when the information for the target frame indicates prohibition while the information for the preceding frame indicates permission. That is, a fade-in or fade-out is executed at each timing where the reproduction permission/prohibition information changes.
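The decision at each frame boundary can be sketched as a pure function of the previous and current flags; this is one illustrative reading of the behavior described above.

```python
def fade_actions(prev_flag: bool, cur_flag: bool):
    """Decide fades at a frame boundary from the playback-permission flags.

    Returns (fade_in_current, fade_out_previous): fade the current frame's
    haptic signal in when the flag turns on, and fade the previous frame's
    haptic signal out when the flag turns off. No fade when the flag is
    unchanged.
    """
    return (cur_flag and not prev_flag, prev_flag and not cur_flag)
```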
- a transmission device includes an analysis processing unit that performs analysis processing on content data including at least an acoustic signal and generates reproduction permission/prohibition information indicating whether a haptic signal may be reproduced, and a transmission processing unit that transmits the information together with the acoustic signal.
- the analysis processing unit in the transmission device described above may determine, for each piece of acoustic frame data (the acoustic signal divided at predetermined time intervals), whether the haptic signal may be reproduced, and the transmission processing unit may transmit each piece of acoustic frame data in association with its reproduction permission/prohibition information. Setting the information per frame allows the sections in which the haptic signal is generated to be controlled finely.
- the transmission device described above may include an encoding unit that generates encoded data containing the acoustic frame data and its corresponding reproduction permission/prohibition information, with the transmission processing unit transmitting the encoded data. In this way, standardized encoded data having a predetermined data structure is transmitted.
- the analysis processing unit in the transmission device described above may generate the reproduction permission/prohibition information based on the analysis result of the acoustic signal, making it possible to determine whether tactile presentation matching the acoustic signal is appropriate.
- when the content data includes moving image data reproduced in synchronization with the acoustic signal, the analysis processing unit may perform analysis processing on the moving image data and generate the reproduction permission/prohibition information based on the analysis result.
- the analysis processing unit in the transmission device described above may generate the reproduction permission/prohibition information based on spectral flatness in the acoustic frame data. This makes it possible to decide whether to perform tactile presentation based on the spectral flatness itself, its rate of increase, or the like, raising the likelihood of appropriate tactile presentation.
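Spectral flatness is conventionally the ratio of the geometric mean to the arithmetic mean of the power-spectrum bins: near 1 for noise-like frames (broadband bursts such as explosions), near 0 for tonal frames (music, speech harmonics). A sketch follows; the 0.5 decision threshold is an illustrative assumption, not a value from this document.

```python
import math

def spectral_flatness(power_spectrum):
    """Geometric mean / arithmetic mean of power-spectrum bins (0..1)."""
    eps = 1e-12  # guard against log(0) for empty bins
    p = [max(x, eps) for x in power_spectrum]
    geo = math.exp(sum(math.log(x) for x in p) / len(p))
    arith = sum(p) / len(p)
    return geo / arith

def haptic_flag_from_flatness(power_spectrum, threshold=0.5):
    """Permit haptic reproduction for noise-like (flat-spectrum) frames."""
    return spectral_flatness(power_spectrum) >= threshold
```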
- the analysis processing unit in the transmission device described above may generate the reproduction permission/prohibition information based on the total value of the power spectrum for frequency components at or below a threshold in the acoustic frame data. Thus, whether to present the tactile sensation can be decided from the total low-frequency power in the frame, its rate of increase, or the like.
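A sketch of the low-frequency power criterion, assuming the power spectrum and its bin resolution (sample rate divided by FFT size) are already available; the 200 Hz cutoff and the decision threshold are illustrative assumptions.

```python
def low_band_power(power_spectrum, bin_hz, cutoff_hz=200.0):
    """Sum power-spectrum bins whose center frequency is <= cutoff_hz.

    bin_hz is the frequency resolution of the spectrum; bin i sits at
    i * bin_hz. Bass-heavy events such as explosions or impacts concentrate
    energy in this band.
    """
    n = int(cutoff_hz // bin_hz) + 1
    return sum(power_spectrum[:min(n, len(power_spectrum))])

def haptic_flag_from_low_band(power_spectrum, bin_hz, power_threshold):
    """Permit haptic reproduction when low-band energy is large enough."""
    return low_band_power(power_spectrum, bin_hz) >= power_threshold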
- the analysis processing unit in the transmission device may generate the reproduction permission/prohibition information based on the total luminance value of a plurality of pixels in the moving image data. This makes it possible to detect scenes in which the luminance changes sharply, such as explosion scenes.
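A sketch of the luminance criterion, summing BT.601 luma over the pixels of a frame and flagging a sharp jump between consecutive frames; the 2x jump ratio is an illustrative assumption, not a value from this document.

```python
def luminance_sum(frame):
    """Total luminance of a frame given as a flat list of (R, G, B)
    pixel tuples, using the BT.601 luma weights."""
    return sum(0.299 * r + 0.587 * g + 0.114 * b for r, g, b in frame)

def is_flash_scene(prev_frame, cur_frame, ratio=2.0):
    """Flag a scene (e.g. an explosion) when total luminance jumps by
    at least `ratio` between consecutive frames."""
    prev = luminance_sum(prev_frame) or 1.0  # avoid division by zero
    return luminance_sum(cur_frame) / prev >= ratio
```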
- the analysis processing unit in the transmission device may generate the reproduction permission/prohibition information based on whether a human face of at least a predetermined size is detected in the moving image data. For example, a scene showing a person's face in close-up is presumed to be a scene in which a person is speaking, and presenting a tactile sensation in response to the speaking voice could make the user uncomfortable; to avoid this, tactile presentation is suppressed when such a close-up is detected.
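The close-up criterion can be sketched as a pure decision function over face bounding boxes produced by any face detector; the 15% area ratio is an illustrative assumption.

```python
def suppress_haptics_for_closeup(face_boxes, frame_w, frame_h,
                                 min_area_ratio=0.15):
    """Return True (suppress tactile presentation) when any detected face
    occupies at least min_area_ratio of the frame area, suggesting a
    dialogue close-up. face_boxes are (x, y, w, h) rectangles from any
    upstream face detector.
    """
    frame_area = frame_w * frame_h
    return any(w * h / frame_area >= min_area_ratio
               for (x, y, w, h) in face_boxes)
```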
- An information processing method causes a computer device to execute processing that receives data including an acoustic signal and reproduction permission/prohibition information for a haptic signal, generates the haptic signal based on the received acoustic signal when the information indicates that reproduction is permitted, and determines not to generate the haptic signal when the information indicates that reproduction is prohibited.
- Another information processing method causes a computer device to execute processing that performs analysis on content data including at least an acoustic signal, generates reproduction permission/prohibition information indicating whether a haptic signal may be reproduced, and transmits the information together with the acoustic signal.
- a program causes an arithmetic processing unit to execute functions of receiving data including an acoustic signal and reproduction permission/prohibition information for a haptic signal, generating the haptic signal based on the received acoustic signal when the information indicates that reproduction is permitted, and determining not to generate the haptic signal when the information indicates that reproduction is prohibited.
- another program according to the present technology causes an arithmetic processing unit to execute functions of performing analysis processing on content data including at least an acoustic signal, generating reproduction permission/prohibition information indicating whether a haptic signal may be reproduced, and transmitting the information together with the acoustic signal. With such information processing methods and programs, the transmitting device and receiving device of the present technology can be easily realized.
- FIG. 1 is a schematic diagram showing a configuration example of a tactile sense presentation system
- FIG. 1 is a schematic diagram showing an aspect of a transmitting device and a receiving device for realizing acoustic output and tactile presentation using VOD content data
- FIG. 1 is a perspective view of a neckband speaker
- FIG. 1 is a schematic diagram showing one aspect of a transmitting device and a receiving device for realizing acoustic output and tactile presentation using content data stored in a recording medium
- FIG. 1 is a schematic diagram illustrating one aspect of a transmitting device and a receiving device for providing audio output and tactile presentation of game content
- It is a diagram which shows an example of the data structure of encoded data.
- FIG. 3 is a block diagram for showing a configuration example of a transmission device;
- FIG. 4 is a diagram showing a functional configuration example of an analysis processing unit;
- It is a block diagram showing a configuration example of a receiving device.
- FIG. 4 is a diagram for showing a functional configuration example of a decoding unit;
- FIG. 10 is a graph showing the power spectrum of audio frame data before low-pass filter processing;
- FIG. 10 is a graph showing a power spectrum of audio frame data after low-pass filter processing;
- FIG. 4 is a diagram showing an example of a haptic signal before fade-in processing and fade-out processing;
- FIG. 10 is a diagram showing an example of a gain function and a haptic signal after fading;
- It is a flow chart which shows an example of the processing flow of a transmission device;
- It is a flow chart which shows an example of the processing flow of a receiving device.
- FIG. 4 is a diagram for explaining scene information;
- FIG. 4 is a diagram for explaining the correspondence between scene type IDs and scene contents;
- a tactile presentation system 1 performs various processes for presenting a tactile sense to the user.
- tactile presentation means providing a tactile stimulus to the user by reproducing a tactile signal.
- the tactile presentation system 1 includes a transmission device 2, a reception device 3, an acoustic reproduction device 4, and a haptic reproduction device 5.
- the transmission device 2 acquires the content data CD containing the acoustic signal, and acquires the acoustic signal from the content data CD. Further, the transmission device 2 performs a process of dividing the acquired acoustic signal into acoustic frame data SFD, which are acoustic signals divided at predetermined time intervals. The predetermined time is a relatively short time such as several tens of milliseconds.
- the transmitting device 2 performs encoding processing for each sound frame data SFD to generate encoded data ED.
- the transmitting device 2 transmits the encoded data ED to the receiving device 3.
- the encoded data ED includes reproduction enable/disable information indicating whether or not to reproduce the haptic signal synchronized with the sound frame data SFD whose reproduction length is set to a predetermined time.
- the playability information may be generated by analysis processing for each sound frame data SFD by the transmission device 2, or may be generated by other analysis processing for each data segment.
- the reproducibility information is either 1 indicating reproducibility or 0 indicating non-reproducibility, and may be, for example, 1-bit flag information.
- 1-bit flag information is used as one aspect of the reproducibility information, and the reproducibility information is referred to as "haptic reproduction flag PF".
- the haptic reproduction flag PF may be generated based on the partial moving image data, which is the video signal reproduced in synchronization with the sound frame data SFD. That is, whether or not to present the tactile sensation to the user is determined based on the audio signal or video signal of the content data CD.
- the transmission device 2 may acquire the content data CD containing the acoustic signal from another information processing device, by reading it from a storage medium, or from a storage unit provided in the transmission device 2 itself.
- the receiving device 3 performs decoding processing on the encoded data ED received from the transmitting device 2 to obtain the acoustic signal and the reproduction propriety information.
- the receiving device 3 realizes an acoustic output for the user by transmitting an acoustic signal to the acoustic reproduction device 4.
- the receiving device 3 generates a haptic signal based on the haptic reproduction flag PF as the reproduction propriety information, and transmits the haptic signal to the haptic reproduction device 5, thereby presenting the haptic sensation to the user.
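The receiver-side behavior can be sketched as follows; the audio-to-haptic conversion itself is implementation-defined, so a trivial placeholder stands in for it here.

```python
def process_frames(frames):
    """Receiver-side sketch: for each (audio_frame, haptic_flag) pair,
    generate a haptic signal only when the flag permits reproduction.

    generate_haptic is a trivial stand-in (simple scaling); the actual
    acoustic-to-haptic conversion is implementation-defined.
    """
    def generate_haptic(audio):
        return [0.5 * s for s in audio]  # placeholder conversion

    out = []
    for audio, flag in frames:
        haptic = generate_haptic(audio) if flag else None
        out.append((audio, haptic))
    return out
```

The audio frame is always passed through for sound output; only the haptic branch is gated by the flag, matching the behavior described above.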
- the sound reproduction device 4 is a device that outputs sound based on sound signals, and is, for example, earphones, headphones, or a speaker device.
- the tactile sense reproduction device 5 is a device that produces output for giving the user a tactile stimulus based on a haptic signal. Various forms are conceivable, such as a device provided with a heat-generating part; in the following, a device having a vibrating section with vibrators and actuators is taken as an example.
- the sound reproduction device 4 and the haptic reproduction device 5 may be provided as independent devices separate from the receiving device 3, or one or both of them may be provided integrally with the receiving device 3 as an acoustic output section or a haptic reproduction section.
- FIG. 2 shows an aspect of a transmitting device 2A and a receiving device 3A for realizing acoustic output and tactile presentation using content data CD in VOD (Video On Demand) format.
- the transmission device 2A is a smart phone, a tablet terminal, a television receiver, a PC (Personal Computer), or the like.
- the transmitting device 2A receives the content data CD from the content server 100 and transmits the generated encoded data ED to the neckband speaker as the receiving device 3A.
- FIG. 3 shows a configuration example of a receiving device 3A as a neckband speaker.
- the receiving device 3A as a neckband speaker is a neck-mounted speaker device, and includes an acoustic output unit 7L arranged on the left side of the housing 6 and an acoustic output unit 7R arranged on the right side.
- the receiving device 3A also includes a tactile sense reproducing section 8L arranged at the left end portion of the housing 6 and a tactile sense reproducing portion 8R arranged at the right end portion.
- the receiving device 3A has various operators 9 such as a power button.
- the sound output units 7L and 7R are one aspect of the sound reproduction device 4, and the haptic reproduction units 8L and 8R are one aspect of the haptic reproduction device 5.
- the receiving device 3A outputs the acoustic frame data SFD obtained by decoding to the acoustic output units 7L and 7R. Further, the receiving device 3A generates a haptic signal from the acoustic frame data SFD based on the acquired haptic reproduction flag PF, and outputs it to the haptic reproduction units 8L and 8R. That is, in the embodiment shown in FIG. 2, the user enjoys the content by watching the image on the display unit of the transmission device 2A (or a monitor device connected to it), listening to the sound output from the sound output units 7L and 7R of the receiving device 3A, and feeling the vibration stimulus reproduced by the haptic reproduction units 8L and 8R.
- FIG. 4 shows an aspect of a transmitting device 2B and a receiving device 3B for realizing acoustic output and tactile presentation using content data CD stored in a recording medium RM such as a CD-ROM (Compact Disc Read Only Memory), DVD (Digital Versatile Disc), or BD (Blu-ray Disc (registered trademark)).
- the transmission device 2B is a device for reading or reproducing the recording medium RM, and transmits a video signal stored in the recording medium RM to the monitor device 10, whereby video is displayed on the monitor device 10.
- The transmitting device 2B generates encoded data ED based on the acoustic signal stored in the recording medium RM, and transmits the encoded data ED to the receiving device 3B.
- the receiving device 3B is an audio reproducing device such as headphones or earphones equipped with audio output units 7L and 7R, and performs reproduction processing of audio frame data.
- the receiving device 3B does not itself have a haptic reproduction unit; it generates a haptic signal based on the haptic reproduction flag PF and transmits it to a bracelet-type or vest-type haptic reproduction device 5B.
- the tactile sense reproduction device 5B presents the tactile sense by reproducing the received haptic signal.
- The user enjoys the content by watching the image displayed on the monitor device 10 connected to the transmitting device 2B, listening to the sounds output from the acoustic output units 7L and 7R provided in the receiving device 3B, and experiencing the vibration stimuli reproduced by the haptic reproduction device 5B.
- FIG. 5 shows one aspect of the transmitting device 2C and the receiving device 3C for realizing sound output and tactile presentation to the user who enjoys the game.
- the transmission device 2C is a game machine main body, and is a device that reproduces game data stored in the recording medium RM or an internal storage unit.
- the transmission device 2C transmits a video signal included in the game data to the monitor device 10 (or a television receiver) connected to the transmission device 2C, so that the video is displayed on the monitor device 10.
- the transmitting device 2C generates encoded data ED based on the acoustic signal included in the game data, and transmits it to the receiving device 3C.
- the receiving device 3C is a game controller or the like having a haptic reproduction unit 8, and reproduces a haptic signal generated based on the haptic reproduction flag PF from the sound frame data included in the encoded data ED.
- the receiving device 3C acquires the acoustic frame data included in the encoded data ED and transmits it to the acoustic reproducing device 4C such as earphones or headphones.
- the sound reproduction device 4C performs sound output by reproducing the received sound frame data.
- The user enjoys the content by watching the image displayed on the monitor device 10 connected to the transmitting device 2C, experiencing the tactile stimuli reproduced by the haptic reproduction unit provided in the receiving device 3C, and listening to the sound output from the sound reproduction device 4C.
- the user can experience tactile stimulation corresponding to the movement of the character that is moved by the user's operation, so that the immersion in the game can be enhanced.
- the encoded data ED has a data structure for transmitting the sound frame data SFD. Specifically, data structures such as SBC (Sub Band Coding), MP3 (MPEG1 Audio Layer-III), AAC (Advanced Audio Coding), and LDAC can be used.
- the encoded data ED is configured with a header area 20 and a payload area 21 .
- the encoded data ED may further include a check area.
- the header area 20 consists of a sync word area 22, a bit rate area 23, a sampling rate area 24, a channel mode area 25, and a reserved area 26.
- The sync word area 22 is an area in which a specific bit string for detecting the beginning of one frame of the encoded data ED is stored; for example, 0xFFFE is stored. Note that "0x" indicates hexadecimal notation, and 0xFFFE is a 16-bit string in which only the last bit (LSB: Least Significant Bit) is set to "0".
- the bit rate area 23 is an area in which a bit rate ID (Identification) is stored.
- The bit rate ID specifies the bit rate, representing the amount of data per second in the acoustic frame data SFD, by a bit string of, for example, 2 bits. Specifically, the bit rate ID takes a value from 0 to 3: "0" indicates 32 kbps, "1" indicates 64 kbps, "2" indicates 96 kbps, and "3" indicates 128 kbps.
- the sampling rate area 24 is an area in which the sampling rate ID is stored.
- The sampling rate ID specifies the sampling rate, representing the number of samples per second in the acoustic frame data SFD, by a bit string of, for example, 2 bits.
- The sampling rate ID takes a value from 0 to 3: "0" indicates 12 kHz, "1" indicates 24 kHz, "2" indicates 48 kHz, and "3" indicates 96 kHz.
- a channel mode area 25 is an area in which a channel mode ID is stored.
- The channel mode ID specifies the channel configuration of the acoustic frame data SFD by a bit string of, for example, 2 bits. Specifically, the channel mode ID takes a value from 0 to 3: "0" indicates that the acoustic frame data SFD is a monaural signal, "1" a stereo signal, "2" a 5.1-channel surround signal, and "3" a 7.1-channel surround signal.
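The three ID-to-value mappings above can be summarized as lookup tables. The following Python sketch is purely illustrative (the names are not from the document):

```python
# Lookup tables for the header IDs described above (illustrative sketch).
BITRATE_KBPS = {0: 32, 1: 64, 2: 96, 3: 128}    # bit rate ID
SAMPLERATE_KHZ = {0: 12, 1: 24, 2: 48, 3: 96}   # sampling rate ID
CHANNEL_MODE = {                                # channel mode ID
    0: "monaural",
    1: "stereo",
    2: "5.1ch surround",
    3: "7.1ch surround",
}
```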
- the reserved area 26 is an area prepared for future function expansion, and is provided to implement the function expansion without changing the data structure.
- the reserved area 26 may be an area of any size, such as 1 bit, 2 bits, or 4 bits.
- the above-described haptic reproduction flag PF is stored in the reserved area 26 .
- the reserved area 26 may be a 1-bit area in order to implement the present embodiment. That is, if the haptic reproduction flag PF is 1-bit flag information, the present embodiment can be implemented even if the reserved area 26 is a 1-bit area.
- In that case, the encoded data ED can be kept to a minimal data structure, so the communication bandwidth used to transmit and receive the encoded data ED can be kept small.
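As a concrete illustration of how the haptic reproduction flag PF can ride in a reserved bit of such a header, the following sketch packs and parses a hypothetical 3-byte header with the fields described above (16-bit sync word 0xFFFE, 2-bit bit rate ID, 2-bit sampling rate ID, 2-bit channel mode ID, and a 1-bit reserved area holding PF). The field order, the 3-byte size, and the 1-bit padding are assumptions for illustration, not the actual codec layout:

```python
SYNC_WORD = 0xFFFE  # 16-bit sync word; only the last bit (LSB) is 0

def pack_header(bitrate_id, samplerate_id, channel_id, haptic_flag):
    # Append each field after the sync word, most significant first.
    value = SYNC_WORD
    for field, width in ((bitrate_id, 2), (samplerate_id, 2),
                         (channel_id, 2), (haptic_flag, 1)):
        value = (value << width) | (field & ((1 << width) - 1))
    value <<= 1  # 23 bits so far; pad 1 bit to reach 3 whole bytes
    return value.to_bytes(3, "big")

def parse_header(data):
    # Undo the packing in reverse order.
    value = int.from_bytes(data[:3], "big") >> 1  # drop the pad bit
    haptic_flag = value & 0x1; value >>= 1
    channel_id = value & 0x3; value >>= 2
    samplerate_id = value & 0x3; value >>= 2
    bitrate_id = value & 0x3; value >>= 2
    assert value == SYNC_WORD, "sync word not found"
    return bitrate_id, samplerate_id, channel_id, haptic_flag
```

Because the flag occupies a bit that already exists in the header, adding it costs no extra bytes, which matches the bandwidth argument above.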
- the configuration of the transmission device 2 will be described with reference to FIG.
- the transmission device 2 includes an analysis processing section 30 , an encoding section 31 , a storage section 32 , a control section 33 , a communication section 34 and a bus 35 .
- the analysis processing unit 30 performs analysis processing on the content data CD input via the communication unit 34 .
- the case where the content data CD includes audio signals and moving image data will be taken as an example.
- the analysis processing unit 30 performs decoding processing when the content data CD is encoded. In the decoding process, moving image data and audio signals are extracted from the content data CD.
- The extracted moving image data and audio signal are subjected to analysis processing in units of partial moving image data MD and acoustic frame data SFD, each having a predetermined time width.
- In the analysis processing for the partial moving image data, whether or not the scene is suitable for tactile presentation is determined based on each piece of image data included in the partial moving image data. Details will be described later.
- In the analysis processing for the acoustic frame data SFD, whether or not the scene is suitable for tactile presentation is determined based on the spectrum values of the acoustic signal constituting the acoustic frame data SFD. Details will be described later.
- The analysis processing unit 30 performs analysis processing based on at least one of the partial moving image data MD and the acoustic frame data SFD, and generates a haptic reproduction flag PF according to the result. Specifically, when it is determined that the scene is suitable for tactile presentation, the haptic reproduction flag PF is set to "1", indicating that reproduction is possible. On the other hand, when it is determined that the scene is not suitable for tactile presentation, the haptic reproduction flag PF is set to "0", indicating that reproduction is not possible.
- the encoding unit 31 performs encoding processing using the acoustic frame data SFD generated by the analysis processing unit 30 and the haptic reproduction flag PF.
- the encoded data ED containing the information of the audio frame data SFD and the haptic reproduction flag PF is generated.
- As this encoding method, any existing method may be used as long as it can transmit the acoustic frame data SFD and store the haptic reproduction flag PF in an unused area such as the reserved area 26. This saves the trouble of developing a new encoding method for encoding the haptic reproduction flag PF.
- the actual data of the encoded data ED may be obtained by subjecting the sound frame data SFD to encoding processing such as compression.
- The storage unit 32 is configured with an HDD (Hard Disk Drive), an SSD (Solid State Drive), or the like, and stores various information such as the content data CD before analysis by the analysis processing unit 30 and the haptic reproduction flag PF obtained after analysis.
- The control unit 33 is configured with a microcomputer having a CPU (Central Processing Unit), ROM (Read Only Memory), RAM (Random Access Memory), etc., and performs overall control of the transmission device 2 by executing various processes according to programs stored in the ROM.
- the communication unit 34 performs wired or wireless data communication with other information processing devices.
- the communication unit 34 performs processing for receiving content data CD from another information processing device, processing for transmitting encoded data ED to the receiving device 3, and the like.
- analysis processing unit 30, the encoding unit 31, the storage unit 32, the control unit 33, and the communication unit 34 are connected via a bus 35 so that they can communicate with each other.
- FIG. 8 shows a specific functional configuration of the analysis processing unit 30.
- the analysis processing section 30 includes an input adjustment section 40 , a moving image data analysis section 41 , a sound analysis section 42 , a haptic reproduction determination section 43 and an output adjustment section 44 .
- The input adjustment unit 40 performs processing for decoding the content data CD input to the transmission device 2 and processing for extracting the moving image data and audio signal included in the content data CD. Further, the input adjustment unit 40 divides the extracted moving image data into partial moving image data MD having a predetermined time width, and divides the extracted audio signal into acoustic frame data SFD having a predetermined time width.
- the input adjustment unit 40 outputs the divided partial moving image data MD to the moving image data analysis unit 41 and outputs the divided sound frame data SFD to the sound analysis unit 42 .
- the moving image data analysis unit 41 performs image analysis on the partial moving image data MD and calculates a feature amount for each partial moving image data MD. Some examples of feature values to be calculated will be given.
- First, the moving image data analysis unit 41 calculates a feature amount for determining whether or not the scene contains a momentary flash of light.
- the feature amount A is obtained by calculating the sum of luminance values of all pixels for each image frame included in the partial moving image data MD.
- An image frame with a high feature amount A is an image that captures a bright scene, and is therefore likely to be an image that captures an explosion scene.
- In [Formula 1], p(m) represents the m-th luminance value of the pixel sequence p, and M represents the number of pixels.
- the moving image data analysis unit 41 calculates, as a feature amount B, a change in luminance value between image frames adjacent in the time direction.
- By calculating the feature amount B so that it increases as the rate of increase of the sum of the luminance values of all pixels relative to the immediately preceding image frame increases, the image frame that captures the moment an explosion starts can be identified.
- A' in [Formula 2] is the feature amount A calculated for the immediately preceding image frame, and represents the sum of the luminance values of all pixels in the immediately preceding image frame.
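The two luminance features above can be sketched as follows. The function names are illustrative, and since the exact form of [Formula 2] is not reproduced in this text, the rate of increase is modeled simply as the ratio of the current luminance sum A to the previous frame's sum A':

```python
def feature_a(pixels):
    # [Formula 1]: sum of the luminance values p(0)..p(M-1) of all pixels.
    return sum(pixels)

def feature_b(pixels, prev_a):
    # Rate of increase of the luminance sum relative to the previous
    # frame's sum A' (assumed here to be a simple ratio); large values
    # suggest a sudden brightening such as the onset of an explosion.
    return feature_a(pixels) / prev_a if prev_a > 0 else 0.0
```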
- The moving image data analysis unit 41 may also calculate feature amounts for identifying scenes unsuitable for tactile presentation. For example, a scene in which a person's face is shown in close-up is one in which the viewer should pay attention to the person's words and facial expressions; if tactile presentation is performed there, the user may feel as if the person on screen is being shaken, which may cause discomfort.
- Therefore, the moving image data analysis unit 41 calculates the feature amount C so that it becomes lower as the ratio of the image area occupied by a person's face to the image area of the image frame becomes higher. That is, even if the feature amounts A and B described above are high and the partial moving image data is highly likely to capture an explosion scene, the feature amount C is calculated to be low because the scene is a close-up of a person's face, so it can be decided not to present the tactile sensation.
- the acoustic analysis unit 42 performs acoustic analysis on the acoustic frame data SFD, and calculates feature amounts for each acoustic frame data SFD. Some examples of feature values to be calculated will be given.
- The acoustic analysis unit 42 calculates a feature amount for determining whether or not an impact sound, a collision sound, an explosion sound, a slashing sound, or the like has occurred. For example, the spectral flatness of the acoustic frame data SFD is calculated and used as the feature amount D. Impact sounds and collision sounds are characterized by high spectral flatness, so such sounds can be identified by the feature amount D.
- In [Formula 3], x(n) represents the n-th amplitude value of the signal sequence x, and N represents the number of signal samples.
- the acoustic analysis unit 42 calculates a feature amount based on the increase rate of the spectral flatness with respect to the immediately preceding acoustic frame data SFD.
- the feature amount E is calculated such that the higher the increase rate of the spectral flatness for the sound frame data SFD, the higher the feature amount E.
- D' in [Formula 4] is the feature amount D calculated for the immediately preceding sound frame data SFD, and represents the spectral flatness in the immediately preceding sound frame data SFD.
- the acoustic analysis unit 42 calculates feature amounts based on deep bass. Specifically, the feature amount F is calculated so as to increase as the total value of the power spectrum of 100 Hz or less for the sound frame data SFD increases. Deep bass often occurs in scenes with impact, and tactile presentation based on heavy bass is likely to be appropriate.
- In [Formula 5], X(k) represents the k-th spectrum value of the signal sequence X, and K represents the spectrum bin (BIN) corresponding to 100 Hz.
- the acoustic analysis unit 42 calculates the feature amount based on the rate of increase of the total value of the power spectrum of 100 Hz or less with respect to the immediately preceding acoustic frame data SFD.
- the feature amount G is calculated such that the higher the increase rate of the total value of the power spectrum of 100 Hz or less for the sound frame data SFD, the higher the feature amount G becomes.
- F' in [Equation 6] is the feature amount F calculated for the immediately preceding sound frame data SFD, and represents the total value of the low-frequency (for example, 100 Hz or less) power spectrum in the immediately preceding sound frame data SFD.
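The acoustic features can be sketched in the same spirit. Below, spectral flatness (feature amount D) is computed as the ratio of the geometric mean to the arithmetic mean of the power spectrum, which is the common definition, and feature amount F sums the power-spectrum bins up to the bin K corresponding to 100 Hz; feature amounts E and G would be rate-of-increase ratios analogous to feature amount B. The naive DFT and the function names are illustrative assumptions:

```python
import cmath
import math

def power_spectrum(x):
    # Naive DFT power spectrum |X(k)|^2 for k = 0 .. N/2 - 1
    # (fine for illustration on short frames).
    N = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / N)
                    for n in range(N))) ** 2 for k in range(N // 2)]

def feature_d(spec, eps=1e-12):
    # Spectral flatness: geometric mean / arithmetic mean of the
    # power spectrum; near 1 for noise-like (impact) sounds.
    n = len(spec)
    geo = math.exp(sum(math.log(s + eps) for s in spec) / n)
    arith = sum(spec) / n
    return geo / (arith + eps)

def feature_f(spec, k_100hz):
    # Total power in bins 0 .. K, with K the bin corresponding to 100 Hz.
    return sum(spec[:k_100hz + 1])
```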
- The haptic reproduction determination unit 43 performs processing to determine whether or not a tactile sensation should be presented in accordance with the partial moving image data MD and the acoustic frame data SFD, and sets the haptic reproduction flag PF for each acoustic frame data SFD based on the determination result. Specifically, the haptic reproduction determination unit 43 calculates an evaluation value EV based on the feature amounts A, B, and C calculated by the moving image data analysis unit 41 and the feature amounts D, E, F, and G calculated by the acoustic analysis unit 42.
- the evaluation value EV is calculated, for example, based on [Formula 7] shown below.
- Evaluation value EV = W1 × feature amount A + W2 × feature amount B + W3 × feature amount C + W4 × feature amount D + W5 × feature amount E + W6 × feature amount F + W7 × feature amount G ... [Formula 7]
- W1 to W7 in [Equation 7] are coefficients representing weights for the feature amounts A to G, respectively.
- The haptic reproduction determination unit 43 sets the haptic reproduction flag PF based on the evaluation value EV. Specifically, the haptic reproduction flag PF is set to "1" if the evaluation value EV is equal to or greater than a threshold TH, and to "0" if the evaluation value EV is less than the threshold TH.
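The evaluation and flag-setting step can be sketched directly from [Formula 7]. The weights W1..W7 and the threshold TH are tuning parameters whose values the text does not specify:

```python
def evaluation_value(features, weights):
    # [Formula 7]: weighted sum of the feature amounts.
    # features = (A, B, C, D, E, F, G), weights = (W1, ..., W7)
    return sum(w * f for w, f in zip(weights, features))

def haptic_reproduction_flag(ev, threshold):
    # Flag PF is "1" (reproducible) when EV >= TH, otherwise "0".
    return 1 if ev >= threshold else 0
```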
- the output adjustment unit 44 outputs the sound frame data SFD having a constant time width obtained by the input adjustment unit 40 and the haptic reproduction flag PF set by the haptic reproduction determination unit 43 .
- the configuration of the receiving device 3 will be described with reference to FIG.
- the receiving device 3 includes a decoding section 50 , DACs (Digital to Analog Converters) 51 and 52 , amplifiers 53 and 54 , a storage section 55 , a control section 56 , a communication section 57 and a bus 58 .
- the decoding unit 50 performs decoding processing on the encoded data ED input to the receiving device 3, and acquires the acoustic frame data SFD and the haptic reproduction flag PF.
- the decoding unit 50 checks the acquired haptic reproduction flag PF, and generates a haptic signal from the sound frame data SFD only when the haptic reproduction flag PF is "1" indicating that reproduction is possible.
- the decoding unit 50 outputs the sound frame data SFD acquired from the encoded data ED to the DAC 51 and outputs the generated haptic signal to the DAC 52 .
- the DAC 51 converts the sound frame data SFD, which has been converted into a digital signal, into an analog signal and outputs it to the subsequent amplifier 53 .
- the DAC 52 converts the digital haptic signal into an analog signal and outputs it to the subsequent amplifier 54 .
- the amplifier 53 outputs the audio signal converted into the analog signal to the audio reproduction device 4 .
- the amplifier 54 outputs the haptic signal converted into the analog signal to the haptic playback device 5 .
- Both the DACs 51 and 52 and the amplifiers 53 and 54 may instead be provided inside the sound reproduction device 4 or the haptic reproduction device 5.
- In that case, the acoustic frame data SFD and the haptic signal are transmitted as digital signals to the sound reproduction device 4 and the haptic reproduction device 5, respectively.
- The storage unit 55 includes an HDD, an SSD, etc., and may store the encoded data ED before decoding by the decoding unit 50, as well as the acoustic frame data SFD extracted from the encoded data ED, the data of the haptic reproduction flag PF, and the data of the generated haptic signal.
- the control unit 56 is configured with a microcomputer having a CPU, ROM, RAM, etc., and comprehensively controls the receiving device 3 by executing various processes according to programs stored in the ROM.
- the communication unit 57 performs reception processing of the encoded data ED, transmission processing of the acoustic signal as an analog signal amplified by the amplifier 53, and transmission processing of the haptic signal as an analog signal amplified by the amplifier 54.
- the communication unit 57 is capable of wired communication and wireless communication.
- When the receiving device 3 is provided with the acoustic output unit 7 and the haptic reproduction unit 8, the acoustic signal converted into an analog signal may be output to the acoustic output unit 7, and the haptic signal converted into an analog signal may be output to the haptic reproduction unit 8.
- the decoding unit 50, the storage unit 55, the control unit 56, and the communication unit 57 are connected via a bus 58 so that they can communicate with each other.
- FIG. 10 shows a specific functional configuration of the decoding unit 50.
- the decoder 50 includes an acoustic decoder 60 and a haptic signal generator 61 .
- The audio decoding unit 60 performs decoding processing on the input encoded data ED and acquires the acoustic frame data SFD and the haptic reproduction flag PF.
- the haptic signal generation unit 61 generates a haptic signal using the sound frame data SFD whose haptic reproduction flag PF is set to "1" indicating that reproduction is possible. That is, no haptic signal is generated for the sound frame data SFD whose haptic playback flag PF is set to "0" indicating that playback is not possible.
- Next, a method of generating a haptic signal from the acoustic frame data SFD will be explained. Various generation methods are conceivable; one example will be described with reference to FIGS. 11 and 12.
- the haptic signal generation unit 61 performs signal processing such as a low-pass filter to extract only the low-frequency component of the acoustic signal and treats it as a haptic signal.
- FIG. 11 is a graph showing the power spectrum of the acoustic frame data SFD before low-pass filter processing.
- FIG. 12 is a graph showing the power spectrum of the same acoustic frame data SFD as in FIG. 11, after processing by a low-pass filter with a cutoff frequency of 500 Hz.
- the haptic signal generation unit 61 treats the signals shown in FIG. 12 as haptic signals. In this way, by generating a tactile signal based on the acoustic frame data SFD as an acoustic signal, it is possible to present a tactile sense that is in harmony with the sound that the user hears.
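As a minimal sketch of this low-pass approach, the following uses a one-pole IIR filter as an assumed stand-in (a real implementation would likely use a steeper filter, and the function name is illustrative):

```python
import math

def lowpass_haptic(samples, fs, cutoff=500.0):
    # One-pole IIR low-pass: keeps the low-frequency component of the
    # acoustic frame and treats the result as the haptic signal.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff / fs)
    y, out = 0.0, []
    for x in samples:
        y += alpha * (x - y)  # exponential smoothing step
        out.append(y)
    return out
```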
- the haptic signal generation unit 61 further performs fade-in processing and fade-out processing on the haptic signal as shown in FIG.
- Without such processing, the tactile stimulus presented to the user would suddenly transition from OFF to ON (or from ON to OFF).
- In that case, unintended vibration may occur due to the transient response of the haptic reproduction device 5 or the haptic reproduction unit 8.
- This can happen when acoustic frame data SFD whose haptic reproduction flag PF is set to "1" and acoustic frame data SFD whose haptic reproduction flag PF is set to "0" are reproduced consecutively.
- fade-in processing and fade-out processing are performed when the haptic reproduction flag PF for the sound frame data SFD changes.
- FIG. 13 shows an example of a haptic signal before fade-in processing and fade-out processing.
- FIG. 13 shows, in order from the top, a plurality of audio frame data SFD as audio signals, the haptic reproduction flag PF corresponding to the audio frame data SFD, and the haptic signal before fading.
- the haptic signal generation unit 61 multiplies the haptic signal before fading by a gain function as shown in FIG. 14 to generate a haptic signal after fading.
- The gain function is set so that the haptic signal gradually strengthens when the haptic reproduction flag PF changes from "0" to "1", and gradually weakens when the haptic reproduction flag PF changes from "1" to "0".
- The portion of the gain function corresponding to the interval T2 gradually changes from 0 to 1 between time t0, which is the start timing of the interval T2, and a predetermined time t1, and then remains at 1 until the end of the interval T2.
- the change from 0 to 1 in the gain function may or may not be linear.
- The time t1 may be, for example, the timing at which half the duration of the interval T2 has elapsed, or may be earlier than that. Alternatively, the time t1 may be set to the end time of the interval T2 so that the gain changes from 0 to 1 over the entire interval T2.
- the portion of the gain function corresponding to section T3 is set to 1 from the start of section T3 to time t2, and gradually changes from 1 to 0 from time t2 to time t3, which is the end timing of section T3.
- the change from 1 to 0 in the gain function may or may not be linear.
- the time t2 may be, for example, the timing after half the time of the interval T3 has elapsed, or may be after that.
- Alternatively, the time t2 may be set to the start time of the interval T3 so that the gain changes from 1 to 0 over the entire interval T3.
- The haptic signal generation unit 61 performs fade-in processing and fade-out processing by multiplying the haptic signal before fade processing by a gain function such as that shown in FIG. 14. As a result, the haptic signal becomes 0 at time t0, when tactile presentation starts, and at time t3, when tactile presentation ends.
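The fade processing described above can be sketched with simple linear gain ramps (the text notes the ramps need not be linear; linear ramps and the sample-count parameters here are assumptions):

```python
def fade_in_gain(length, ramp):
    # Gain for an ON interval that begins at t0: ramps 0 -> 1 over the
    # first `ramp` samples, then stays at 1 (cf. interval T2).
    return [min(1.0, n / ramp) for n in range(length)]

def fade_out_gain(length, ramp):
    # Gain for an ON interval that ends at t3: stays at 1, then ramps
    # 1 -> 0 over the last `ramp` samples (cf. interval T3).
    return [min(1.0, (length - 1 - n) / ramp) for n in range(length)]

def apply_gain(haptic, gain):
    # Multiply the pre-fade haptic signal by the gain function.
    return [h * g for h, g in zip(haptic, gain)]
```

With these envelopes, the haptic signal is 0 at the start of a fade-in interval and at the end of a fade-out interval, avoiding the abrupt transitions discussed above.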
- The input adjustment unit 40 of the analysis processing unit 30 of the transmission device 2 executes decoding processing of the content data CD in step S101. Through this processing, the video signal and the audio signal are separated from the content data CD.
- In step S102, the input adjustment unit 40 of the analysis processing unit 30 generates partial moving image data MD by dividing the video signal by a predetermined time width, and generates acoustic frame data SFD by dividing the audio signal by a predetermined time width.
- the moving image data analysis unit 41 of the analysis processing unit 30 analyzes the partial moving image data in step S103. This processing is processing for calculating the feature amounts A to C described above.
- the acoustic analysis unit 42 of the analysis processing unit 30 analyzes the acoustic frame data SFD in step S104. This processing is processing for calculating the feature amounts D to G described above.
- the haptic reproduction determination unit 43 of the analysis processing unit 30 performs processing for calculating the evaluation value EV in step S105.
- the tactile sense reproduction determination unit 43 of the analysis processing unit 30 determines whether or not the tactile sense can be presented in step S106, and generates a tactile sense reproduction flag PF in step S107.
- step S108 the encoding unit 31 performs encoding processing to generate encoded data ED.
- the communication unit 34 of the transmitting device 2 transmits the encoded data ED to the receiving device 3 in step S109.
- step S201 the acoustic decoding unit 60 of the decoding unit 50 of the receiving device 3 analyzes the header area 20 of the encoded data ED and extracts information conforming to the data structure of the encoded data ED shown in FIG.
- step S202 the acoustic decoding unit 60 of the decoding unit 50 performs decoding processing on the actual data stored in the payload area 21 to obtain acoustic frame data SFD.
- the haptic signal generation unit 61 of the decoding unit 50 checks in step S203 whether or not the haptic reproduction flag PF corresponding to the sound frame data SFD is ON.
- a state in which the haptic reproduction flag PF is ON refers to a state in which the haptic reproduction flag PF is set to "1".
- When the haptic reproduction flag PF is ON, the haptic signal generation unit 61 of the decoding unit 50 generates a haptic signal based on the acoustic frame data SFD in step S204. On the other hand, when the haptic reproduction flag PF is OFF, the processing of step S204 is skipped.
- the haptic signal generation unit 61 of the decoding unit 50 determines in step S205 whether or not the haptic reproduction flag PF has changed.
- The haptic reproduction flag PF is considered to have changed when the haptic reproduction flag PF corresponding to the previous acoustic frame data SFD differs from the haptic reproduction flag PF corresponding to the acoustic frame data SFD currently being processed.
- If the haptic reproduction flag PF has not changed, the decoding unit 50 terminates the series of processes shown in FIG.
- the haptic signal generation unit 61 of the decoding unit 50 performs branch processing according to the change direction of the flag in step S206. Specifically, when the haptic reproduction flag PF changes from OFF to ON, the haptic signal generation unit 61 of the decoding unit 50 performs fade-in processing in step S207.
- On the other hand, when the haptic reproduction flag PF changes from ON to OFF, the haptic signal generation unit 61 of the decoding unit 50 performs fade-out processing in step S208.
- a first modified example is an example in which analysis is performed using program information such as an EPG (Electronic Programming Guide) for broadcast content such as television programs and distributed content.
- scene information can be acquired for each scene in broadcast content or distribution content.
- Since what kind of scene each scene is can be estimated to some extent from the scene information, whether or not a tactile sensation should be presented may be analyzed based on the scene information alone.
- analysis using scene information may be performed in addition to the feature amount and evaluation value described above. For example, if it is a specific scene, it may be decided not to present the tactile sense no matter how high the evaluation value is.
- FIG. 17 shows scene information in a table.
- the scene information includes, for example, a representative thumbnail image for each scene, a scene number that is a serial number of the scene, a start time and end time of the scene, and a scene type ID.
- The scene type ID is information for roughly specifying the scene content, and is associated with scene content as shown in FIG. 18, for example. Specifically, a scene type ID of "0001" is assigned to battle action scenes such as shooting and sword fights, "0002" to scenes in which fireworks, bombs, or the like explode, and "0011" to scenes in which people are talking.
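The mapping of FIG. 18 lends itself to a small lookup table. The sketch below uses assumed descriptions (only the three IDs named in the text) and an assumed helper illustrating how a specific scene type could veto tactile presentation regardless of the evaluation value:

```python
# Scene type IDs from FIG. 18 (only the three IDs given in the text).
SCENE_TYPE = {
    "0001": "battle action (shooting, sword fights)",
    "0002": "explosion (fireworks, bombs)",
    "0011": "conversation",
}

def allow_haptics(scene_type_id):
    # Hypothetical policy: never present tactile stimuli during
    # conversation scenes, however high the evaluation value EV is.
    return scene_type_id != "0011"
```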
- Such scene information is managed by the content server 100 (see FIG. 2) that distributes the content data CD, and may be distributed to the transmission device 2 together with the distribution of the content data CD.
- Alternatively, the scene information may be stored in the recording medium RM (see FIG. 4) and acquired from the recording medium RM when the content data CD is reproduced, or it may be acquired from a television receiver as part of the program guide information of television programs or as data-broadcasting data.
- a second modified example of analysis by the analysis processing unit 30 will be described.
- a second modified example is an example in which the sound frame data SFD is not predetermined, but changes according to a situation change caused by a user's operation or the like.
- In game content, the audio signals include BGM, dialogue, sound effects, and so on, and some of them are tied not to scenes but to user operations.
- In such a case, each audio signal is mixed and reproduced according to the scene type and user operations. The acoustic frame data SFD therefore differs each time according to the user's operation.
- the transmission device 2C as the main body of the game machine analyzes the state of the game content, determines whether or not the timing is suitable for presenting the tactile sensation, and sets the tactile sensation reproduction flag PF.
- The timing suitable for tactile presentation may be, for example, the timing at which the sound effect reproduced when a weapon such as a sword, or a fist, swung toward an enemy character by the in-game character operated by the user strikes the enemy character is generated, or the timing at which the sound effect reproduced when a bomb explodes is generated. In other words, it may be the timing at which a sound effect is generated when the character operated by the user receives some stimulus.
- As described above, the receiving device 3 includes a reception processing unit (communication unit 57) that receives data (encoded data ED) including reproduction permission/inhibition information for the haptic signal (haptic reproduction flag PF) and an acoustic signal (acoustic frame data SFD), and a haptic signal generation unit 61 that generates a haptic signal based on the acoustic signal received by the reception processing unit.
- the haptic signal generation unit 61 generates a haptic signal when the reproduction permission/prohibition information indicates that reproduction is permitted (for example, when the haptic reproduction flag PF is set to "1"), and does not generate a haptic signal when the information indicates that reproduction is prohibited (for example, when the haptic reproduction flag PF is set to "0").
- because some acoustic signals never have haptic signals generated for them, the processing burden associated with haptic signal generation is reduced. Further, by deciding whether to generate the haptic signal according to the reproduction permission/prohibition information, the haptic signal can be generated only for the periods in which it is needed. In particular, when a haptic signal is generated from an acoustic signal, a small acoustic signal is likely to yield an equally small haptic signal, and small haptic signals may be imperceptible to the user. Generating haptic signals in accordance with the reproduction permission/prohibition information avoids generating such unnecessary haptic signals.
- the reception processing unit receives reproduction propriety information indicating propriety of reproduction of the haptic signal instead of receiving data of the haptic signal.
- the reproducibility information is considered to be smaller data than the data of the haptic signal. Therefore, the amount of data received by the reception processing unit can be reduced compared to the case where both the tactile signal data and the acoustic signal data are received. As a result, it is possible to reduce the bandwidth used for communication and reduce the processing load required for reception processing.
- the reproduction permission/prohibition information (for example, the haptic reproduction flag PF) is provided for each piece of audio frame data SFD, which is the audio signal divided at predetermined time intervals (for example, the time length of interval T1). The haptic signal generation unit 61 generates a haptic signal based on the sound frame data SFD whose reproduction permission/prohibition information indicates that reproduction is permitted, and does not generate a haptic signal based on the sound frame data SFD whose information indicates that reproduction is prohibited.
- by providing reproduction permission/prohibition information for each piece of sound frame data SFD, the sections in which haptic signals are generated can be set finely. In particular, when the reproduction time length of the sound frame data SFD is short, such as less than 100 msec, the necessary and unnecessary haptic signal sections can be delimited precisely, enabling a wide variety of haptic presentations.
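The per-frame gating described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the frame representation, flag values, and function names are all assumptions.

```python
# Sketch: generate a haptic signal only for audio frames whose
# reproduction flag PF is 1; frames with PF == 0 are skipped entirely,
# so no generation cost is incurred for them.

def process_frames(frames, generate_haptic):
    """frames: list of (pf, samples) pairs. Returns a list containing
    a generated haptic signal per frame, or None where PF forbids it."""
    out = []
    for pf, samples in frames:
        if pf == 1:
            out.append(generate_haptic(samples))  # generate only when allowed
        else:
            out.append(None)                      # no processing for this frame
    return out

# Toy generator for illustration: pass the samples through unchanged.
frames = [(1, [0.1, 0.2]), (0, [0.9, 0.9]), (1, [0.3])]
result = process_frames(frames, lambda s: list(s))
```

In a real receiver the generator would be the low-pass-based haptic synthesis described later; here it is a placeholder.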
- the reproducibility information may be 1-bit flag information (haptic reproduction flag PF).
- the amount of data received by the reception processing unit (communication unit 57) is reduced. Therefore, the time required for the reception process can be shortened, and the communication band required for data transmission/reception can be reduced.
- reception data received by the receiving device 3 is encoded by an audio data encoding method (for example, SBC, MP3, AAC, LDAC, etc.).
- the encoded data ED has a structure including a payload area 21 in which the sound frame data SFD is stored and a reserved area 26, and the reproduction permission/prohibition information may be stored in the reserved area 26.
- reception of the reproduction propriety information is realized using a mechanism for transmitting the sound frame data SFD.
- the reproduction enable/disable information may be generated based on the partial moving image data MD reproduced in synchronization with the sound frame data SFD.
- Various scenes are included in the video viewed by the user, and it may be possible to determine whether or not the scene is suitable for tactile presentation by performing image analysis. In such a case, by generating reproduction availability information based on the partial moving image data MD, it is possible to increase the possibility of appropriate tactile presentation.
- the haptic signal generator 61 in the receiving device 3 may perform fade-in processing and fade-out processing on the generated haptic signal.
- Fade processing such as fade-in processing and fade-out processing gradually increases or decreases a signal over time, by multiplying the signal by a predetermined gain function.
- the haptic signal generation unit 61 in the receiving device 3 (3A, 3B, 3C) may perform the fade-in processing and fade-out processing based on the reproduction permission/prohibition information (for example, the haptic reproduction flag PF) corresponding to the target sound frame data SFD1 and the reproduction permission/prohibition information corresponding to the preceding sound frame data SFD2, which is the sound frame data SFD immediately before it. Specifically, fade-in processing may be performed on the haptic signal generated from the target sound frame data SFD1 when the information for the target sound frame data SFD1 indicates that reproduction is permitted and the information for the preceding sound frame data SFD2 indicates that it is prohibited, and fade-out processing may be performed on the haptic signal generated from the preceding sound frame data SFD2 when the information for the target sound frame data SFD1 indicates that reproduction is prohibited and the information for the preceding sound frame data SFD2 indicates that it is permitted. That is, either fade-in processing or fade-out processing is executed at the timing when the reproduction permission/prohibition information changes. As a result, a haptic signal is generated that enables tactile presentation without causing discomfort to the user, enhancing the sense of immersion in the content.
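The transition rule above (fade in when the flag turns on, fade out the preceding frame's signal when it turns off) can be sketched with a linear gain ramp. The ramp shape and frame handling are assumptions for illustration; the patent only specifies that a gain function is multiplied in.

```python
# Sketch: fade-in when PF changes 0 -> 1, fade-out of the *previous*
# frame's haptic signal when PF changes 1 -> 0, via a linear gain ramp.

def fade_gain(n, direction):
    """Linear ramp of length n (n >= 2): 'in' rises 0->1, 'out' falls 1->0."""
    if direction == "in":
        return [i / (n - 1) for i in range(n)]
    return [1 - i / (n - 1) for i in range(n)]

def apply_fades(flags, haptic_frames):
    """flags[i] is PF of frame i; haptic_frames[i] is the generated
    haptic signal (None where PF == 0). Fades apply at transitions."""
    out = [None if f is None else list(f) for f in haptic_frames]
    for i, pf in enumerate(flags):
        prev = flags[i - 1] if i > 0 else 0
        if pf == 1 and prev == 0:      # 0 -> 1: fade-in the current frame
            g = fade_gain(len(out[i]), "in")
            out[i] = [x * w for x, w in zip(out[i], g)]
        if pf == 0 and prev == 1:      # 1 -> 0: fade-out the previous frame
            g = fade_gain(len(out[i - 1]), "out")
            out[i - 1] = [x * w for x, w in zip(out[i - 1], g)]
    return out

flags = [0, 1, 1]
faded = apply_fades(flags, [None, [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]])
```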
- the transmission device 2 (2A, 2B, 2C) includes an analysis processing unit 30 that performs analysis processing on the content data CD, which includes at least an acoustic signal, and generates reproduction permission/prohibition information (for example, a haptic reproduction flag PF) indicating whether the haptic signal may be reproduced, and a transmission processing unit (communication unit 34) that transmits the reproduction permission/prohibition information and the content data CD.
- by transmitting reproduction permission/prohibition information, such as 1-bit flag information, to the receiving device instead of transmitting the haptic signal itself, the communication band can be reduced, as can the processing load and processing time required for transmission processing.
- the analysis processing unit 30 in the transmission device 2 (2A, 2B, 2C) may determine, for each piece of audio frame data SFD obtained by dividing the audio signal at predetermined time intervals, whether the haptic signal may be reproduced, and the transmission processing unit (communication unit 34) may perform the transmission processing by associating the reproduction permission/prohibition information (for example, the haptic reproduction flag PF) with each piece of sound frame data SFD.
- by providing reproduction permission/prohibition information for each piece of sound frame data SFD, the sections in which haptic signals are generated can be set finely. As a result, haptic signals can be reproduced as intended, providing the user with appropriate tactile stimuli. In particular, when the reproduction time length of the sound frame data SFD is short, such as less than 100 msec, the necessary and unnecessary haptic signal sections can be delimited precisely, enabling a wide variety of haptic presentations.
- an encoding unit 31 may be provided that generates encoded data ED including the audio frame data SFD and the reproduction permission/prohibition information (for example, the haptic reproduction flag PF) corresponding to that audio frame data, and the transmission processing unit (communication unit 34) may transmit the encoded data ED in the transmission processing.
- standardized encoded data ED having a predetermined data structure is transmitted. Therefore, inconsistency between data can be prevented, and the version upgrade of the data structure is facilitated.
- the analysis processing unit 30 in the transmission device 2 may generate the reproduction permission/prohibition information (for example, the haptic reproduction flag PF) based on the analysis result of the acoustic signal (for example, the analysis result of the acoustic frame data SFD). In some cases it is preferable that the tactile stimulus be presented in concert with the audio provided to the user.
- the acoustic signal is analyzed, and it is determined whether or not the tactile sensation should be presented based on the analysis result. Thereby, it is possible to determine whether or not the tactile sensation presentation matched to the sound is appropriate. Therefore, it is possible to present the tactile sense in accordance with the sound provided to the user. For example, if an explosive sound or the like can be identified, it is possible to provide a tactile sensation that matches the explosive sound.
- the content data CD may include moving image data that is reproduced in synchronization with the audio signal, and the analysis processing unit 30 may perform analysis processing on the moving image data and generate the reproduction permission/prohibition information (for example, the haptic reproduction flag PF) based on the analysis result of the moving image data.
- the analysis processing unit 30 in the transmission device 2 (2A, 2B, 2C) may generate the reproduction permission/prohibition information (for example, the haptic reproduction flag PF) based on the spectral flatness of the sound frame data SFD.
- the analysis processing unit 30 in the transmission device 2 (2A, 2B, 2C) may generate the reproduction permission/prohibition information (for example, the haptic reproduction flag PF) based on the total value of the power spectrum of frequency components at or below a threshold in the sound frame data SFD.
- the analysis processing unit 30 in the transmission device 2 (2A, 2B, 2C) may generate the reproduction permission/prohibition information (for example, the haptic reproduction flag PF) based on the total value of the luminance values of a plurality of pixels in the moving image data.
- the analysis processing unit 30 in the transmission device 2 may generate the reproduction permission/prohibition information (for example, the haptic reproduction flag PF) based on whether a human face of a predetermined size or larger is detected in the moving image data.
- a scene in which a person's face is shown in close-up is presumed to be a scene in which a person is speaking. In such a scene, if a tactile sensation is presented in response to a person's speaking voice, the user may feel uncomfortable. In order to avoid this, it is decided not to perform tactile presentation when a scene in which a person's face is enlarged is detected. As a result, it is possible to avoid a tactile presentation that gives the user the feeling of being shaken to the sound of a person's voice.
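As a rough sketch of this heuristic, the flag could be suppressed whenever a detected face occupies a large fraction of the frame. The face detector itself is abstracted away here, and the 20% area threshold is an assumed example value, not taken from the patent.

```python
# Sketch: suppress the haptic reproduction flag when a detected face
# occupies a large fraction of the frame (a close-up / speech-scene
# heuristic). Detection is assumed to have been done elsewhere; only
# the bounding-box sizes are used here.

def flag_for_frame(frame_area, face_boxes, area_ratio_threshold=0.2):
    """face_boxes: list of (width, height) of detected faces in pixels.
    Returns 0 (no haptics) if any face is 'large', otherwise 1."""
    for w, h in face_boxes:
        if (w * h) / frame_area >= area_ratio_threshold:
            return 0   # face close-up: likely speech, skip tactile output
    return 1
```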
- an information processing method executed by the receiving device 3 includes receiving data (encoded data ED) containing reproduction permission/prohibition information (for example, a haptic reproduction flag PF) for a haptic signal and an acoustic signal, generating the haptic signal based on the received acoustic signal when the information indicates that reproduction is permitted, and deciding not to generate the haptic signal when the information indicates that reproduction is prohibited.
- an information processing method executed by the transmission device 2 (2A, 2B, 2C) includes performing analysis processing on the content data CD, which includes at least an acoustic signal, generating reproduction permission/prohibition information (for example, a haptic reproduction flag PF) indicating whether the haptic signal may be reproduced, and transmitting the reproduction permission/prohibition information and the acoustic signal.
- a program to be executed by the receiving device 3 causes an arithmetic processing device such as a CPU provided in the receiving device 3 to execute functions of receiving data (encoded data ED) containing reproduction permission/prohibition information (for example, the haptic reproduction flag PF) for the haptic signal and an acoustic signal, generating the haptic signal based on the received acoustic signal when the information indicates that reproduction is permitted, and deciding not to generate the haptic signal when the information indicates that reproduction is prohibited.
- a program to be executed by the transmission device 2 (2A, 2B, 2C) causes an arithmetic processing device such as a CPU included in the transmission device 2 to execute functions of analyzing the content data CD, which includes at least an acoustic signal, generating reproduction permission/prohibition information (for example, a haptic reproduction flag PF) indicating whether the haptic signal may be reproduced, and transmitting the reproduction permission/prohibition information and the acoustic signal. With such a program, the analysis processing unit 30 and the encoding unit 31 described above can be realized by an arithmetic processing device such as a microcomputer.
- These programs can be recorded in advance in an HDD as a recording medium built into equipment such as a computer device, or in a ROM or the like in a microcomputer having a CPU.
- the program may be temporarily or permanently stored (recorded) in a removable recording medium such as a flexible disk, CD-ROM, MO (Magneto Optical) disk, DVD, Blu-ray disk, magnetic disk, semiconductor memory, or memory card.
- Such removable recording media can be provided as so-called package software.
- it can also be downloaded from a download site via a network such as a LAN (Local Area Network) or the Internet.
- the present technology can also adopt the following configurations. (1) A receiving device comprising: a reception processing unit that receives data including reproduction permission/prohibition information for a haptic signal and an acoustic signal; and a haptic signal generation unit that generates the haptic signal based on the acoustic signal received by the reception processing unit, wherein the haptic signal generation unit generates the haptic signal when the reproduction permission/prohibition information indicates that reproduction is permitted, and does not generate the haptic signal when the information indicates that reproduction is prohibited.
- the reproduction permission/prohibition information is provided for each piece of audio frame data, which is the audio signal divided at predetermined time intervals, and the haptic signal generation unit generates the haptic signal based on the sound frame data whose reproduction permission/prohibition information indicates that reproduction is permitted and does not generate the haptic signal based on the sound frame data whose information indicates that reproduction is prohibited; the receiving device according to (1) above.
- the receiving apparatus according to any one of (1) to (2) above, wherein the reproduction propriety information is 1-bit flag information.
- the receiving device according to (2) above, wherein the received data is encoded data encoded by an audio data encoding method, the encoded data has a structure including a payload area in which the audio frame data is stored and a reserved area, and the reproduction permission/prohibition information is stored in the reserved area.
- the haptic signal generation unit performs fade-in processing and fade-out processing on the generated haptic signal.
- the receiving device according to (6) above, wherein the haptic signal generation unit performs fade-in processing on the haptic signal generated from the target audio frame data when the reproduction permission/prohibition information corresponding to the target audio frame data indicates that reproduction is permitted and the information corresponding to the preceding audio frame data, which is the audio frame data immediately before it, indicates that reproduction is prohibited, and performs fade-out processing on the haptic signal generated from the preceding audio frame data when the information corresponding to the target audio frame data indicates that reproduction is prohibited and the information corresponding to the preceding audio frame data indicates that reproduction is permitted.
- an analysis processing unit that performs analysis processing on content data including at least an acoustic signal and generates reproduction propriety information indicating whether or not the haptic signal can be reproduced;
- a transmission device comprising: a transmission processing unit that transmits the reproduction propriety information and the acoustic signal.
- the analysis processing unit determines whether or not the haptic signal can be reproduced for each audio frame data that is the audio signal divided at predetermined time intervals, The transmission device according to (8), wherein the transmission processing unit performs the transmission by associating the reproduction availability information with each sound frame data.
- an encoding unit that generates encoded data including the audio frame data and the reproduction propriety information corresponding to the audio frame data;
- the analysis processing unit generates the reproduction propriety information based on an analysis result of the acoustic signal.
- the transmission device according to any one of (8) to (11) above, wherein the content data includes moving image data reproduced in synchronization with the audio signal, and the analysis processing unit performs analysis processing on the moving image data and generates the reproduction permission/prohibition information based on the analysis result of the moving image data.
- a program that causes an arithmetic processing unit to execute a function of transmitting the reproduction propriety information and the acoustic signal.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Signal Processing (AREA)
- Acoustics & Sound (AREA)
- General Physics & Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Automation & Control Theory (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
Description
In order to provide a tactile stimulus to a user, a haptic signal describing that stimulus is required. A haptic signal is generated, for example, by attaching various sensors to the user and using the measurement values obtained by those sensors.
However, fully preparing an environment for generating such haptic signals requires considerable cost in money and time.
In view of this situation, Patent Document 1 below, for example, discloses a technique for generating a haptic signal from an acoustic signal (audio signal).
A tactile stimulus generated from such an unnecessary acoustic signal not only fails to improve the user's sense of realism but may even cause discomfort.
This makes it possible to create sections in which tactile presentation matched to the acoustic signal is not performed. For example, for sections in which tactile presentation matched to the acoustic signal is not appropriate, reproduction permission/prohibition information indicating that no haptic signal is to be generated can be set.
By providing reproduction permission/prohibition information for each piece of sound frame data, the sections in which haptic signals are generated can be set finely.
This reduces the amount of data received by the reception processing unit.
By storing the reproduction permission/prohibition information in the reserved area, its reception is realized using the mechanism for transmitting the sound frame data.
There are cases in which it cannot be determined from the acoustic signal obtained from the sound frame data whether a tactile stimulus should be provided to the user. This configuration prevents inappropriate reproduction permission/prohibition information from being generated based on the sound frame data.
Fade processing such as fade-in and fade-out gradually increases or decreases a signal over time, for example by multiplying it by a predetermined gain function. By performing appropriate fade processing at the start and end of tactile presentation, the presentation can begin and end smoothly, giving a natural tactile experience.
That is, either fade-in processing or fade-out processing is executed at the timing when the reproduction permission/prohibition information changes.
By analyzing the content data, sections in which tactile presentation should be performed can be distinguished from sections in which it is better not performed. Generating reproduction permission/prohibition information according to the analysis result then makes it possible to perform tactile presentation matched to the content data.
By setting reproduction permission/prohibition information for each piece of sound frame data, the sections in which haptic signals are generated can be set finely.
As a result, standardized encoded data having a predetermined data structure is transmitted.
This makes it possible to determine whether tactile presentation matched to the acoustic signal is appropriate.
When the content data includes video, it may be preferable that the user be presented not only with tactile stimuli matched to the sound but also with tactile stimuli matched to the video. With this configuration, analysis processing of the moving image data makes it possible to determine whether a scene calls for tactile stimuli matched to the video.
This makes it possible to determine whether tactile presentation should be performed from the spectral flatness of the sound frame data itself or from its rate of increase, raising the likelihood of appropriate tactile presentation.
This makes it possible to determine whether tactile presentation should be performed from the total power spectrum of the low-frequency components of the sound frame data or from its rate of increase.
This makes it possible to detect scenes in which the luminance value changes greatly, such as explosion scenes.
For example, a scene in which a person's face is shown in close-up is presumed to be a scene in which the person is speaking. Performing tactile presentation in response to the speaking voice in such a scene may make the user uncomfortable. To avoid this, it is decided that tactile presentation is not performed when a close-up of a person's face is detected.
With such information processing methods and programs, the transmission device and receiving device of the present technology can be realized easily.
<1. System configuration>
<2. Data structure of encoded data>
<3. Configuration of transmission device>
<4. Configuration of receiving device>
<5. Processing flow>
<5-1. Processing flow of transmission device>
<5-2. Processing flow of receiving device>
<6. Modified examples>
<7. Summary>
<8. Present technology>
An overview of the configuration of the tactile presentation system 1 according to the present technology will be described with reference to FIG. 1.
The tactile presentation system 1 performs various kinds of processing for presenting tactile sensations to a user. Here, tactile presentation means providing the user with tactile stimuli by reproducing a haptic signal.
The tactile presentation system 1 includes a transmission device 2, a receiving device 3, a sound reproduction device 4, and a tactile reproduction device 5.
The transmission device 2 performs encoding processing for each piece of sound frame data SFD to generate encoded data ED, and transmits the encoded data ED to the receiving device 3.
That is, whether or not tactile presentation is performed for the user is determined based on the acoustic signal or video signal of the content data CD.
The receiving device 3 realizes sound output for the user by transmitting the acoustic signal to the sound reproduction device 4.
The receiving device 3 also generates a haptic signal based on the haptic reproduction flag PF serving as the reproduction permission/prohibition information and transmits it to the tactile reproduction device 5, thereby realizing tactile presentation for the user.
The receiving device 3A as a neckband speaker is a neck-worn speaker device, and includes a sound output unit 7L arranged in the left part of a housing 6 and a sound output unit 7R arranged in the right part.
The receiving device 3A also includes a tactile reproduction unit 8L arranged at the left end of the housing 6 and a tactile reproduction unit 8R arranged at the right end.
Further, the receiving device 3A includes various operating elements 9 such as a power button.
That is, in the mode shown in FIG. 2, the user enjoys content by watching images displayed on the display unit of the transmission device 2A (including a monitor device connected to the transmission device 2A), listening to the sound output from the sound output units 7L and 7R of the receiving device 3A, and experiencing the vibration stimuli reproduced by the tactile reproduction units 8L and 8R.
The transmission device 2B generates encoded data ED based on the acoustic signal stored in the recording medium RM and transmits it to the receiving device 3B.
The tactile reproduction device 5B performs tactile presentation by reproducing the received haptic signal.
The sound reproduction device 4C outputs sound by reproducing the received sound frame data.
In the mode shown in FIG. 5, the user can experience tactile stimuli corresponding to the movement of a character moved by the user's operation, which enhances immersion in the game.
The data structure of one frame of encoded data ED will be described with reference to FIG. 6.
The encoded data ED has a data structure for transmitting the sound frame data SFD. Specifically, data structures such as SBC (Sub Band Coding), MP3 (MPEG1 Audio Layer-III), AAC (Advanced Audio Coding), and LDAC can be used.
The encoded data ED includes a header area 20 and a payload area 21, and may further include a check area.
Specifically, the bitrate ID takes one of the values 0 to 3: "0" indicates 32 kbps, "1" indicates 64 kbps, "2" indicates 96 kbps, and "3" indicates 128 kbps.
Specifically, the sampling rate ID takes one of the values 0 to 3: "0" indicates 12 kHz, "1" indicates 24 kHz, "2" indicates 48 kHz, and "3" indicates 96 kHz.
Specifically, the channel mode ID takes one of the values 0 to 3: "0" indicates that the sound frame data SFD is a monaural signal, "1" indicates a stereo signal, "2" indicates a 5.1-channel surround signal, and "3" indicates a 7.1-channel surround signal.
The reserved area 26 may have any size, such as 1 bit, 2 bits, or 4 bits.
In the present embodiment, the haptic reproduction flag PF described above is stored in the reserved area 26. Note that a 1-bit reserved area 26 suffices to realize the present embodiment: since the haptic reproduction flag PF is 1-bit flag information, the embodiment can be realized even when the reserved area 26 is a 1-bit area. Moreover, this keeps the encoded data ED to a minimal data structure, so the communication band used for transmitting and receiving the encoded data ED can be kept low.
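As an illustration of how a 1-bit flag can ride in a header alongside the 2-bit field IDs described above, the following sketch packs and unpacks a single hypothetical header byte. The bit layout is an assumption for illustration only; the actual SBC/MP3/AAC/LDAC header formats differ.

```python
# Sketch: pack the 2-bit bitrate ID, sampling-rate ID and channel-mode
# ID together with a 1-bit haptic reproduction flag PF into one byte.
# Layout (assumed): [bitrate:2][sample:2][channel:2][unused:1][pf:1]

def pack_header(bitrate_id, sample_id, channel_id, pf):
    assert all(0 <= v <= 3 for v in (bitrate_id, sample_id, channel_id))
    assert pf in (0, 1)
    return (bitrate_id << 6) | (sample_id << 4) | (channel_id << 2) | pf

def unpack_header(byte):
    """Inverse of pack_header; returns (bitrate_id, sample_id, channel_id, pf)."""
    return (byte >> 6) & 3, (byte >> 4) & 3, (byte >> 2) & 3, byte & 1
```

The point of the sketch is that the flag costs a single reserved bit, so the existing audio transport carries it with no extra bandwidth.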
The configuration of the transmission device 2 will be described with reference to FIG. 7.
The transmission device 2 includes an analysis processing unit 30, an encoding unit 31, a storage unit 32, a control unit 33, a communication unit 34, and a bus 35.
In the partial moving image data analysis processing, it is determined, based on each piece of image data included in the partial moving image data, whether the scene is suitable for tactile presentation. Details will be described later.
This saves, for example, the trouble of developing a new encoding method for encoding the haptic reproduction flag PF.
The actual data of the encoded data ED may be the sound frame data SFD subjected to encoding processing such as compression.
The analysis processing unit 30 includes an input adjustment unit 40, a moving image data analysis unit 41, a sound analysis unit 42, a tactile reproduction determination unit 43, and an output adjustment unit 44.
The input adjustment unit 40 further divides the extracted moving image data into partial moving image data MD of a predetermined time width, and divides the extracted acoustic signal into sound frame data SFD of a predetermined time width.
Some examples of the feature quantities to be calculated are given below.
For example, the total luminance value of all pixels is calculated for each image frame included in the partial moving image data MD and used as feature quantity A. An image frame with a high feature quantity A shows a bright scene, and is therefore likely to capture an explosion scene.
For example, by calculating feature quantity B so that it becomes higher as the rate of increase of the total luminance value of all pixels relative to the immediately preceding image frame becomes higher, the image frame capturing the moment an explosion starts can be identified.
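A minimal sketch of feature quantities A and B, using nested lists as stand-in image frames. The exact normalization and scaling are assumptions; the patent does not fix them.

```python
# Sketch: feature A is the per-frame sum of pixel luminance; feature B
# grows with the rate of increase of that sum relative to the previous
# frame (useful for spotting the onset of a bright explosion scene).

def feature_a(frame):
    """Sum of luminance values over all pixels of one frame."""
    return sum(sum(row) for row in frame)

def feature_b(prev_frame, frame):
    """Relative increase of total luminance vs. the previous frame;
    clamped at 0 so a darkening frame scores nothing."""
    prev = feature_a(prev_frame)
    cur = feature_a(frame)
    if prev == 0:
        return 0.0
    return max(0.0, (cur - prev) / prev)

dark = [[10, 10], [10, 10]]        # 2x2 toy "frames"
bright = [[200, 200], [200, 200]]
```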
Some examples of the feature quantities to be calculated are given below.
For example, the spectral flatness of the sound frame data SFD is calculated and used as feature quantity D. Impact sounds and collision sounds are characterized by high spectral flatness, so by calculating feature quantity D to be higher as the spectral flatness is higher, scenes in which impact or collision sounds occur can be identified.
For example, feature quantity E is calculated so that it becomes higher as the rate of increase of the spectral flatness of the sound frame data SFD becomes higher.
By taking feature quantity E into account in addition to feature quantity D, the likelihood of identifying scenes in which impact or collision sounds occur can be increased.
For example, feature quantity G is calculated so that it becomes higher as the rate of increase of the total power spectrum at or below 100 Hz of the sound frame data SFD becomes higher.
By taking feature quantity G into account in addition to feature quantity F, scenes in which deep bass occurs, and especially scenes in which deep bass starts to occur, can be identified, enabling effective tactile presentation that begins at the same time the deep bass occurs.
The tactile reproduction determination unit 43 then sets the haptic reproduction flag PF based on the evaluation value EV. Specifically, when the evaluation value EV is equal to or greater than the threshold TH, the haptic reproduction flag PF is set to "1"; when the evaluation value EV is less than the threshold TH, it is set to "0".
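The audio-side features and the final flag decision can be sketched as follows. The spectral flatness here is the standard geometric-to-arithmetic mean ratio, and the 100 Hz cutoff for feature F follows the text; the threshold TH and the use of a single feature as the evaluation value EV are assumed simplifications.

```python
import math

# Sketch: spectral flatness (feature D) and low-frequency power sum
# (feature F) from a power spectrum, then thresholding an evaluation
# value into the haptic reproduction flag PF.

def spectral_flatness(power):
    """Geometric mean / arithmetic mean of a power spectrum (entries
    must be > 0). Near 1 for noise-like impact sounds, near 0 for tones."""
    n = len(power)
    geo = math.exp(sum(math.log(p) for p in power) / n)
    return geo / (sum(power) / n)

def low_freq_power(power, freqs, cutoff_hz=100.0):
    """Sum of spectral power at or below cutoff_hz (feature F)."""
    return sum(p for p, f in zip(power, freqs) if f <= cutoff_hz)

def haptic_flag(ev, th=0.5):
    """PF = 1 when the evaluation value EV reaches the threshold TH."""
    return 1 if ev >= th else 0
```

In the transmitter, EV would combine several such features (D, E, F, G and the video features); a single feature is used here only to keep the sketch short.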
The configuration of the receiving device 3 will be described with reference to FIG. 9.
The receiving device 3 includes a decoding unit 50, DACs (Digital to Analog Converters) 51 and 52, amplifiers 53 and 54, a storage unit 55, a control unit 56, a communication unit 57, and a bus 58.
The decoding unit 50 checks the acquired haptic reproduction flag PF and generates a haptic signal from the sound frame data SFD only when the haptic reproduction flag PF is "1", indicating that reproduction is permitted.
The amplifier 54 outputs the haptic signal converted into an analog signal to the tactile reproduction device 5.
The DACs 51 and 52 and the amplifiers 53 and 54 may all be provided inside the sound reproduction device 4 and the tactile reproduction device 5. In that case, the sound frame data SFD and the haptic signal are transmitted as digital signals to the sound reproduction device 4 and the tactile reproduction device 5, respectively.
The decoding unit 50 includes a sound decoding unit 60 and a haptic signal generation unit 61.
FIG. 11 is a graph showing the power spectrum of the sound frame data SFD before low-pass filtering.
FIG. 12 is a graph showing the power spectrum of the sound frame data SFD, like FIG. 11, but after processing by a low-pass filter with a cutoff frequency of 500 Hz.
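Band-limiting of this kind can be approximated with a simple one-pole low-pass filter. The 500 Hz cutoff matches the figures' description, but the filter topology and the 48 kHz sample rate are assumptions; the patent does not specify an implementation.

```python
import math

# Sketch: derive a low-frequency haptic signal from audio samples with
# a one-pole IIR low-pass filter: y[n] = y[n-1] + a * (x[n] - y[n-1]),
# where the smoothing coefficient a is derived from the cutoff.

def lowpass(samples, cutoff_hz=500.0, sample_rate_hz=48000.0):
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)   # analog RC time constant
    a = dt / (rc + dt)
    y, out = 0.0, []
    for x in samples:
        y = y + a * (x - y)                  # first-order smoothing step
        out.append(y)
    return out
```

Fed a step input, the output settles toward the input value, while high-frequency content is attenuated, leaving the low-frequency energy that suits a vibration actuator.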
The haptic signal generation unit 61 performs fade-in processing and fade-out processing by multiplying the haptic signal before fade processing by a gain function such as that shown in FIG. 14. As a result, the haptic signal is set to 0 at time t0, when tactile presentation starts, and at time t3, when tactile presentation ends.
The flow of processing executed by the transmission device 2 and the receiving device 3 of the tactile presentation system 1 will be described with reference to the attached drawings.
In the following description, each process is realized as software processing, but at least part of each process may be realized as hardware processing.
The flow of processing executed by the analysis processing unit 30 and the encoding unit 31 of the transmission device 2 will be described with reference to FIG. 15.
The flow of processing executed by the decoding unit 50 of the receiving device 3 will be described with reference to FIG. 16.
On the other hand, when the haptic reproduction flag PF is OFF, the processing of step S204 is skipped.
Specifically, when the haptic reproduction flag PF changes from OFF to ON, the haptic signal generation unit 61 of the decoding unit 50 performs fade-in processing in step S207.
Other forms of analysis by the analysis processing unit 30 will be described with reference to the attached drawings.
A first modified example uses program information such as an EPG (Electronic Programming Guide) for broadcast or distributed content such as television programs to perform the analysis.
Of course, analysis using scene information may be performed in addition to the feature quantities and evaluation values described above. For example, for a specific scene it may be decided not to perform tactile presentation no matter how high the evaluation value is.
The scene information may also be stored in the recording medium RM (see FIG. 4) and acquired from the recording medium RM in response to reproduction of the content data CD.
Alternatively, it may be acquired from a television receiver as part of the program guide information of television programs or as part of data broadcasting data.
Here, the timing suitable for tactile presentation may be, for example, the timing at which a sound effect is reproduced when a weapon such as a sword, or a fist, swung by the user-operated in-game character toward an enemy character makes contact with that enemy character, or the timing at which a sound effect reproduced when a bomb explodes is generated. In other words, it may be the timing at which a sound effect is generated because the character operated by the user experiences some stimulus.
As described in each of the examples above, the receiving device 3 (3A, 3B, 3C) according to the present technology includes a reception processing unit (communication unit 57) that receives data (encoded data ED) containing reproduction permission/prohibition information for a haptic signal (haptic reproduction flag PF) and an acoustic signal (sound frame data SFD), and a haptic signal generation unit 61 that generates a haptic signal based on the acoustic signal received by the reception processing unit.
The haptic signal generation unit 61 generates a haptic signal when the reproduction permission/prohibition information indicates that reproduction is permitted (for example, when the haptic reproduction flag PF is set to "1"), and does not generate a haptic signal when the information indicates that reproduction is prohibited (for example, when the haptic reproduction flag PF is set to "0").
As a result, there are acoustic signals (sound frame data SFD) for which the haptic signal generation unit 61 does not generate haptic signals and acoustic signals for which it does. That is, haptic signals are not generated for all acoustic signals.
Because some acoustic signals never have haptic signals generated for them, the processing burden associated with haptic signal generation is reduced. Further, by deciding whether to generate the haptic signal according to the reproduction permission/prohibition information, the haptic signal can be generated only for the periods in which it is needed. In particular, when a haptic signal is generated from an acoustic signal, a small acoustic signal is likely to yield an equally small haptic signal, and small haptic signals may be imperceptible to the user. Generating haptic signals in accordance with the reproduction permission/prohibition information avoids generating such unnecessary haptic signals. There are also acoustic signals containing sounds unsuited to tactile presentation, and performing tactile presentation based on such signals may make the user uncomfortable. By setting reproduction permission/prohibition information indicating that tactile presentation is not performed in the corresponding reproduction section, tactile presentation that would discomfort the user can be avoided.
The reception processing unit can also be regarded as receiving, instead of haptic signal data, reproduction permission/prohibition information indicating whether the haptic signal may be reproduced. The reproduction permission/prohibition information is considered smaller than haptic signal data, so the amount of data received by the reception processing unit can be kept small compared with receiving both haptic signal data and acoustic signal data. This reduces the bandwidth used for communication and lightens the processing burden of reception processing.
By providing reproduction permission/prohibition information for each piece of sound frame data SFD, the sections in which haptic signals are generated can be set finely.
As a result, haptic signals can be reproduced as intended, providing the user with appropriate tactile stimuli. In particular, when the reproduction time length of the sound frame data SFD is short, such as less than 100 msec, the necessary and unnecessary haptic signal sections can be delimited precisely, enabling a wide variety of haptic presentations.
This reduces the amount of data received by the reception processing unit (communication unit 57).
Accordingly, the time required for reception processing can be shortened, and the communication band required for transmitting and receiving data can be reduced.
By storing the reproduction permission/prohibition information in the reserved area 26, its reception is realized using the mechanism for transmitting the sound frame data SFD.
This removes the need to establish a dedicated data structure or communication method for receiving haptic signals or reproduction permission/prohibition information, reducing the cost of building the environment. It is also suitable when environments for generating or using data structures dedicated to haptic signals are not widespread.
There are cases in which it cannot be determined from the acoustic signal obtained from the sound frame data SFD whether a tactile stimulus should be provided to the user.
The video the user watches contains various scenes, and in some cases image analysis can determine whether a scene is suitable for tactile presentation. In such cases, generating reproduction permission/prohibition information based on the partial moving image data MD raises the likelihood of appropriate tactile presentation.
Fade processing such as fade-in and fade-out gradually increases or decreases a signal over time by multiplying it by a predetermined gain function. By performing appropriate fade processing at the start and end of tactile presentation, the presentation can begin and end smoothly, giving a natural tactile experience.
Accordingly, the user's attention is not disturbed, and the sense of immersion in the content can be enhanced.
That is, either fade-in processing or fade-out processing is executed at the timing when the reproduction permission/prohibition information changes.
As a result, a haptic signal is generated that enables tactile presentation without causing the user discomfort, so the sense of immersion in the content can be enhanced.
By analyzing the content data CD, reproduction sections in which tactile presentation should be performed can be distinguished from reproduction sections in which it is better not performed.
Accordingly, by setting the reproduction permission/prohibition information so that haptic signals are not reproduced in sections that do not merit tactile stimuli, appropriate tactile presentation can be provided to the user.
Moreover, by transmitting reproduction permission/prohibition information, for example 1-bit flag information, to the receiving device instead of the haptic signal itself, the communication band can be reduced, as can the processing burden and processing time required for transmission processing.
By setting reproduction permission/prohibition information for each piece of sound frame data SFD, the sections in which haptic signals are generated can be set finely.
As a result, haptic signals can be reproduced as intended, providing the user with appropriate tactile stimuli. In particular, when the reproduction time length of the sound frame data SFD is short, such as less than 100 msec, the necessary and unnecessary haptic signal sections can be delimited precisely, enabling a wide variety of haptic presentations.
As a result, standardized encoded data ED having a predetermined data structure is transmitted.
Accordingly, inconsistency between data can be prevented, and upgrading the data structure becomes easy.
It may be preferable that tactile stimuli be presented in concert with the sound provided to the user. With this configuration, analysis processing is performed on the acoustic signal, and whether tactile presentation should be performed is determined based on the analysis result. This makes it possible to determine whether tactile presentation matched to the sound is appropriate.
Accordingly, tactile presentation matched to the sound provided to the user becomes possible. For example, when an explosion sound or the like can be identified, tactile presentation matched to the explosion sound can be performed.
When the content data CD includes video, it may be preferable that the user be presented with tactile stimuli matched to the video rather than to the sound.
With this configuration, analysis processing of the moving image data makes it possible to determine whether a scene calls for tactile stimuli matched to the video.
Accordingly, tactile presentation matched to the moving image data becomes possible. In particular, when the acoustic signal includes background music, it may not be possible to appropriately determine whether tactile presentation should be performed for the user. In such cases, deciding whether to perform tactile presentation in consideration of the analysis results of the moving image data prevents inappropriate tactile presentation.
By determining whether tactile presentation should be performed from the spectral flatness of the sound frame data SFD or its rate of increase, the likelihood of appropriate tactile presentation can be raised.
Accordingly, the user can be given deep immersion in the content, together with satisfaction.
This makes it possible to determine whether tactile presentation should be performed based on the total power spectrum of the low-frequency components of the sound frame data SFD or its rate of increase.
Accordingly, tactile stimuli based on deep bass can be provided to the user, enabling tactile presentation that feels natural to the user.
This makes it possible to detect scenes in which the luminance value changes greatly, such as explosion scenes.
Accordingly, tactile presentation that feels natural to the user can be performed.
For example, a scene in which a person's face is shown in close-up is presumed to be a scene in which the person is speaking. Performing tactile presentation in response to the speaking voice in such a scene may make the user uncomfortable. To avoid this, it is decided that tactile presentation is not performed when a close-up of a person's face is detected.
This avoids tactile presentation that would give the user the sensation of being shaken in time with a person's voice.
With such a program, the decoding unit 50 described above can be realized by an arithmetic processing device such as a microcomputer.
With such a program, the analysis processing unit 30 and the encoding unit 31 described above can be realized by an arithmetic processing device such as a microcomputer.
Such a program can be installed on a personal computer or the like from a removable recording medium, or downloaded from a download site via a network such as a LAN (Local Area Network) or the Internet.
The present technology can also adopt the following configurations.
(1)
A reception apparatus including:
a reception processing unit that receives data including reproduction permission information for a haptic signal and an audio signal; and
a haptic signal generation unit that generates the haptic signal on the basis of the audio signal received by the reception processing unit, in which
the haptic signal generation unit
generates the haptic signal in a case where the reproduction permission information indicates that reproduction is permitted, and
does not generate the haptic signal in a case where the reproduction permission information indicates that reproduction is not permitted.
(2)
The reception apparatus according to (1) above, in which
the reproduction permission information is provided for each piece of audio frame data, which is the audio signal divided at predetermined time intervals, and
the haptic signal generation unit
generates the haptic signal on the basis of the audio frame data corresponding to the reproduction permission information indicating that reproduction is permitted, and
does not generate the haptic signal on the basis of the audio frame data corresponding to the reproduction permission information indicating that reproduction is not permitted.
(3)
The reception apparatus according to (1) or (2) above, in which
the reproduction permission information is flag information consisting of one bit.
(4)
The reception apparatus according to (2) above, in which
the received data is encoded data encoded by an audio data encoding scheme,
the encoded data has a structure including a payload area in which the audio frame data is stored and a reserved area, and
the reproduction permission information is stored in the reserved area.
(5)
The reception apparatus according to (2) or (4) above, in which
the reproduction permission information is generated on the basis of partial moving image data reproduced in synchronization with the audio frame data.
(6)
The reception apparatus according to any one of (1) to (5) above, in which
the haptic signal generation unit performs fade-in processing and fade-out processing on the generated haptic signal.
(7)
The reception apparatus according to (6) above, in which
the haptic signal generation unit
performs fade-in processing on the haptic signal generated from target audio frame data in a case where the reproduction permission information corresponding to the target audio frame data indicates that reproduction is permitted and the reproduction permission information corresponding to immediately preceding audio frame data, which is the audio frame data one frame before the target audio frame data, indicates that reproduction is not permitted, and
performs fade-out processing on the haptic signal generated from the immediately preceding audio frame data in a case where the reproduction permission information corresponding to the target audio frame data indicates that reproduction is not permitted and the reproduction permission information corresponding to the immediately preceding audio frame data indicates that reproduction is permitted.
(8)
A transmission apparatus including:
an analysis processing unit that performs analysis processing on content data including at least an audio signal and generates reproduction permission information indicating whether reproduction of a haptic signal is permitted; and
a transmission processing unit that transmits the reproduction permission information and the audio signal.
(9)
The transmission apparatus according to (8) above, in which
the analysis processing unit determines, for each piece of audio frame data that is the audio signal divided at predetermined time intervals, whether reproduction of the haptic signal is permitted, and
the transmission processing unit performs the transmission with the reproduction permission information associated with each piece of audio frame data.
(10)
The transmission apparatus according to (9) above, further including
an encoding unit that generates encoded data including the audio frame data and the reproduction permission information corresponding to the audio frame data, in which
the transmission processing unit transmits the encoded data in the transmission.
(11)
The transmission apparatus according to any one of (8) to (10) above, in which
the analysis processing unit generates the reproduction permission information on the basis of an analysis result of the audio signal.
(12)
The transmission apparatus according to any one of (8) to (11) above, in which
the content data includes moving image data reproduced in synchronization with the audio signal, and
the analysis processing unit
performs analysis processing on the moving image data, and
generates the reproduction permission information on the basis of an analysis result of the moving image data.
(13)
The transmission apparatus according to (9) or (10) above, in which
the analysis processing unit generates the reproduction permission information on the basis of spectral flatness in the audio frame data.
(14)
The transmission apparatus according to (9) or (10) above, in which
the analysis processing unit generates the reproduction permission information on the basis of a total value of the power spectrum of frequency components at or below a threshold in the audio frame data.
(15)
The transmission apparatus according to (12) above, in which
the analysis processing unit generates the reproduction permission information on the basis of a total value of luminance values of a plurality of pixels in the moving image data.
(16)
The transmission apparatus according to (12) above, in which
the analysis processing unit generates the reproduction permission information on the basis of whether a human face of a predetermined size or larger has been detected in the moving image data.
(17)
An information processing method in which a computer apparatus executes processing of:
receiving data including reproduction permission information for a haptic signal and an audio signal;
generating the haptic signal on the basis of the received audio signal in a case where the reproduction permission information indicates that reproduction is permitted; and
deciding not to generate the haptic signal in a case where the reproduction permission information indicates that reproduction is not permitted.
(18)
An information processing method in which a computer apparatus executes processing of:
performing analysis processing on content data including at least an audio signal to generate reproduction permission information indicating whether reproduction of a haptic signal is permitted; and
transmitting the reproduction permission information and the audio signal.
(19)
A program causing an arithmetic processing apparatus to execute functions of:
receiving data including reproduction permission information for a haptic signal and an audio signal;
generating the haptic signal on the basis of the received audio signal in a case where the reproduction permission information indicates that reproduction is permitted; and
deciding not to generate the haptic signal in a case where the reproduction permission information indicates that reproduction is not permitted.
(20)
A program causing an arithmetic processing apparatus to execute functions of:
performing analysis processing on content data including at least an audio signal to generate reproduction permission information indicating whether reproduction of a haptic signal is permitted; and
transmitting the reproduction permission information and the audio signal.
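Putting the receiver side of configurations (1), (6), and (7) above together, the frame-by-frame flag gating with fades can be sketched as follows. This is a simplified illustration under stated assumptions: the fade length, the linear fade envelope, and the pass-through copy used in place of a real haptic extraction step are not specified by the disclosure, and in-place mutation of the previous frame models an output buffer that has not yet been played out.

```python
import numpy as np

def generate_haptic_frames(audio_frames, flags, fade_len=64):
    """For each audio frame, generate a haptic frame only when its
    reproduction-permission flag is True. Fade in at a not-permitted ->
    permitted transition; fade out the previous frame's signal at a
    permitted -> not-permitted transition (per configurations (6)/(7))."""
    out = []
    prev_flag = False
    prev_haptic = None
    for frame, flag in zip(audio_frames, flags):
        if flag:
            haptic = frame.copy()  # placeholder: a real system would derive the haptic signal here
            if not prev_flag:
                haptic[:fade_len] *= np.linspace(0.0, 1.0, fade_len)  # fade-in
            out.append(haptic)
            prev_haptic = haptic
        else:
            if prev_flag and prev_haptic is not None:
                # fade out the tail of the signal generated from the previous frame
                prev_haptic[-fade_len:] *= np.linspace(1.0, 0.0, fade_len)
            out.append(np.zeros_like(frame))  # flag indicates not permitted: no haptics
            prev_haptic = None
        prev_flag = flag
    return out
```

The fade-out in configuration (7) acts on the previous frame's already generated signal, which is why the sketch mutates `prev_haptic` while it is still held in the (conceptual) output buffer.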
3, 3A, 3B, 3C Reception apparatus
21 Payload area
26 Reserved area
30 Analysis processing unit
31 Encoding unit
34 Communication unit (transmission processing unit)
57 Communication unit (reception processing unit)
61 Haptic signal generation unit
CD Content data
ED Encoded data
SFD Audio frame data
MD Partial moving image data
PF Haptic reproduction flag
Claims (20)
- A reception apparatus comprising:
a reception processing unit that receives data including reproduction permission information for a haptic signal and an audio signal; and
a haptic signal generation unit that generates the haptic signal on the basis of the audio signal received by the reception processing unit, wherein
the haptic signal generation unit
generates the haptic signal in a case where the reproduction permission information indicates that reproduction is permitted, and
does not generate the haptic signal in a case where the reproduction permission information indicates that reproduction is not permitted.
- The reception apparatus according to claim 1, wherein
the reproduction permission information is provided for each piece of audio frame data, which is the audio signal divided at predetermined time intervals, and
the haptic signal generation unit
generates the haptic signal on the basis of the audio frame data corresponding to the reproduction permission information indicating that reproduction is permitted, and
does not generate the haptic signal on the basis of the audio frame data corresponding to the reproduction permission information indicating that reproduction is not permitted.
- The reception apparatus according to claim 1, wherein
the reproduction permission information is flag information consisting of one bit.
- The reception apparatus according to claim 2, wherein
the received data is encoded data encoded by an audio data encoding scheme,
the encoded data has a structure including a payload area in which the audio frame data is stored and a reserved area, and
the reproduction permission information is stored in the reserved area.
- The reception apparatus according to claim 2, wherein
the reproduction permission information is generated on the basis of partial moving image data reproduced in synchronization with the audio frame data.
- The reception apparatus according to claim 1, wherein
the haptic signal generation unit performs fade-in processing and fade-out processing on the generated haptic signal.
- The reception apparatus according to claim 6, wherein
the haptic signal generation unit
performs fade-in processing on the haptic signal generated from target audio frame data in a case where the reproduction permission information corresponding to the target audio frame data indicates that reproduction is permitted and the reproduction permission information corresponding to immediately preceding audio frame data, which is the audio frame data one frame before the target audio frame data, indicates that reproduction is not permitted, and
performs fade-out processing on the haptic signal generated from the immediately preceding audio frame data in a case where the reproduction permission information corresponding to the target audio frame data indicates that reproduction is not permitted and the reproduction permission information corresponding to the immediately preceding audio frame data indicates that reproduction is permitted.
- A transmission apparatus comprising:
an analysis processing unit that performs analysis processing on content data including at least an audio signal and generates reproduction permission information indicating whether reproduction of a haptic signal is permitted; and
a transmission processing unit that transmits the reproduction permission information and the audio signal.
- The transmission apparatus according to claim 8, wherein
the analysis processing unit determines, for each piece of audio frame data that is the audio signal divided at predetermined time intervals, whether reproduction of the haptic signal is permitted, and
the transmission processing unit performs the transmission with the reproduction permission information associated with each piece of audio frame data.
- The transmission apparatus according to claim 9, further comprising
an encoding unit that generates encoded data including the audio frame data and the reproduction permission information corresponding to the audio frame data, wherein
the transmission processing unit transmits the encoded data in the transmission.
- The transmission apparatus according to claim 8, wherein
the analysis processing unit generates the reproduction permission information on the basis of an analysis result of the audio signal.
- The transmission apparatus according to claim 8, wherein
the content data includes moving image data reproduced in synchronization with the audio signal, and
the analysis processing unit
performs analysis processing on the moving image data, and
generates the reproduction permission information on the basis of an analysis result of the moving image data.
- The transmission apparatus according to claim 9, wherein
the analysis processing unit generates the reproduction permission information on the basis of spectral flatness in the audio frame data.
- The transmission apparatus according to claim 9, wherein
the analysis processing unit generates the reproduction permission information on the basis of a total value of the power spectrum of frequency components at or below a threshold in the audio frame data.
- The transmission apparatus according to claim 12, wherein
the analysis processing unit generates the reproduction permission information on the basis of a total value of luminance values of a plurality of pixels in the moving image data.
- The transmission apparatus according to claim 12, wherein
the analysis processing unit generates the reproduction permission information on the basis of whether a human face of a predetermined size or larger has been detected in the moving image data.
- An information processing method in which a computer apparatus executes processing of:
receiving data including reproduction permission information for a haptic signal and an audio signal;
generating the haptic signal on the basis of the received audio signal in a case where the reproduction permission information indicates that reproduction is permitted; and
deciding not to generate the haptic signal in a case where the reproduction permission information indicates that reproduction is not permitted.
- An information processing method in which a computer apparatus executes processing of:
performing analysis processing on content data including at least an audio signal to generate reproduction permission information indicating whether reproduction of a haptic signal is permitted; and
transmitting the reproduction permission information and the audio signal.
- A program causing an arithmetic processing apparatus to execute functions of:
receiving data including reproduction permission information for a haptic signal and an audio signal;
generating the haptic signal on the basis of the received audio signal in a case where the reproduction permission information indicates that reproduction is permitted; and
deciding not to generate the haptic signal in a case where the reproduction permission information indicates that reproduction is not permitted.
- A program causing an arithmetic processing apparatus to execute functions of:
performing analysis processing on content data including at least an audio signal to generate reproduction permission information indicating whether reproduction of a haptic signal is permitted; and
transmitting the reproduction permission information and the audio signal.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE112022001118.1T DE112022001118T5 (de) | 2021-02-18 | 2022-01-12 | Empfangsvorrichtung, übertragungsvorrichtung, informationsverarbeitungsverfahren und programm |
CN202280014615.2A CN116848498A (zh) | 2021-02-18 | 2022-01-12 | 接收装置、发送装置、信息处理方法和程序 |
US18/264,154 US20240307764A1 (en) | 2021-02-18 | 2022-01-12 | Reception apparatus, transmission apparatus, information processing method, and program |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2021024624 | 2021-02-18 | ||
JP2021-024624 | 2021-02-18 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022176440A1 true WO2022176440A1 (ja) | 2022-08-25 |
Family
ID=82930613
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2022/000744 WO2022176440A1 (ja) | Reception apparatus, transmission apparatus, information processing method, and program | |
Country Status (4)
Country | Link |
---|---|
US (1) | US20240307764A1 (ja) |
CN (1) | CN116848498A (ja) |
DE (1) | DE112022001118T5 (ja) |
WO (1) | WO2022176440A1 (ja) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
JP2015185168A * | 2014-03-21 | 2015-10-22 | Immersion Corporation | Automatic tuning of haptic effects
JP2016213667A * | 2015-05-08 | 2016-12-15 | NHK (Japan Broadcasting Corporation) | Sensory presentation device
WO2021019925A1 * | 2019-07-29 | 2021-02-04 | Sony Corporation | Wearable speaker
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9898085B2 (en) | 2013-09-06 | 2018-02-20 | Immersion Corporation | Haptic conversion system using segmenting and combining |
Also Published As
Publication number | Publication date |
---|---|
US20240307764A1 (en) | 2024-09-19 |
CN116848498A (zh) | 2023-10-03 |
DE112022001118T5 (de) | 2024-01-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7725203B2 (en) | Enhancing perceptions of the sensory content of audio and audio-visual media | |
US11132984B2 (en) | Automatic multi-channel music mix from multiple audio stems | |
US10433089B2 (en) | Digital audio supplementation | |
CN102100088B (zh) | 用于使用基于对象的元数据产生音频输出信号的装置和方法 | |
US8064322B2 (en) | Adaptive high fidelity reproduction system | |
JP4780375B2 (ja) | Device for embedding control codes into acoustic signals, and control system for time-series driven devices using acoustic signals | |
KR20070065401A (ko) | 오디오 데이터를 처리하는 시스템 및 방법, 프로그램구성요소, 및 컴퓨터-판독가능 매체 | |
EP3108672A2 (en) | Content-aware audio modes | |
US9148104B2 (en) | Reproduction apparatus, reproduction method, provision apparatus, and reproduction system | |
WO2021176904A1 (ja) | Bitstream generation method, encoding device, and decoding device | |
JP2009005369A (ja) | File generation device and data output device | |
WO2012029807A1 (ja) | Information processing device, sound processing device, sound processing system, program, and game program | |
WO2022176440A1 (ja) | Reception apparatus, transmission apparatus, information processing method, and program | |
JP2002078066A (ja) | Vibration waveform signal output device | |
US20120173008A1 (en) | Method and device for processing audio data | |
CN114915874A (zh) | Audio processing method and apparatus, device, medium, and program product | |
JP2010230972A (ja) | Sound signal processing device, method therefor, program therefor, and reproduction device | |
Suzuki et al. | AnnoTone: Record-time audio watermarking for context-aware video editing | |
WO2020149227A1 (ja) | Decoding device, decoding method, and program | |
JP2006186920A (ja) | Information reproduction device and information reproduction method | |
WO2024004924A1 (ja) | Signal processing device, cognitive function improvement system, signal processing method, and program | |
WO2023084933A1 (ja) | Information processing device, information processing method, and program | |
CN116504265A (zh) | System and method for controlling audio | |
Wang | A NEW APPROACH AND GUIDLINE FOR LOUDNESS IN GAME AUDIO: Developing Specific Loudness Standards for Each Section of Game Audio | |
WO2007106165A2 (en) | Enhancing perceptions of the sensory content of audio and audio-visual media |
Legal Events
Code | Title | Description
---|---|---
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22755765; Country of ref document: EP; Kind code of ref document: A1
WWE | WIPO information: entry into national phase | Ref document number: 18264154; Country of ref document: US
WWE | WIPO information: entry into national phase | Ref document number: 202280014615.2; Country of ref document: CN
WWE | WIPO information: entry into national phase | Ref document number: 112022001118; Country of ref document: DE
122 | EP: PCT application non-entry in European phase | Ref document number: 22755765; Country of ref document: EP; Kind code of ref document: A1
NENP | Non-entry into the national phase | Ref country code: JP