US11178465B2 - System and method for automatic subtitle display - Google Patents

System and method for automatic subtitle display

Info

Publication number
US11178465B2
Authority
US
United States
Prior art keywords
display
space
subtitle data
conversation language
language
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US16/149,996
Other versions
US20200107078A1 (en)
Inventor
Girisha Ganapathy
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harman International Industries Inc
Original Assignee
Harman International Industries Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harman International Industries Inc filed Critical Harman International Industries Inc
Priority to US16/149,996
Assigned to HARMAN INTERNATIONAL INDUSTRIES, INCORPORATED (assignment of assignors interest; see document for details). Assignor: GANAPATHY, GIRISHA
Priority to CN201910930371.2A
Priority to DE102019126688.2A
Publication of US20200107078A1
Application granted
Publication of US11178465B2
Status: Active
Adjusted expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/488 Data services, e.g. news ticker
    • H04N21/4884 Data services, e.g. news ticker for displaying subtitles
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435 Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • G06K9/00228
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/59 Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/414 Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance
    • H04N21/41422 Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance located in transportation means, e.g. personal vehicle
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42203 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] sound input device, e.g. microphone
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/4223 Cameras
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302 Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307 Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H04N21/43074 Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen of additional data with content streams on the same device, e.g. of EPG data or interactive icon with a TV program
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4394 Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44008 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/485 End-user interface for client configuration
    • H04N21/4856 End-user interface for client configuration for language selection, e.g. for the menu or subtitles
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8547 Content authoring involving timestamps for synchronizing content
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/278 Subtitling
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/005 Language recognition

Definitions

  • FIG. 4 depicts a process for subtitle operations according to one or more embodiments.
  • determining language for subtitle data may be based on one or more attributes and data types detected by a device.
  • process 400 may be performed by a control device of a vehicle (e.g., vehicle 105) including a display (e.g., display 110) for presentation of subtitle data.
  • process 400 may be performed by a control device of a display device (e.g., display device 160) for presentation of subtitle data.
  • process 400 can include at least one of detecting voice (e.g., speech) at block 405, detecting image data at block 410, and receiving user input at block 415.
  • One or more sources may be provided to determine language and perform subtitle requests at block 420.
  • Voice data may be detected at block 405 while a display device is presenting content.
  • Image data may be detected of viewers of a display device at block 410 .
  • User input at block 415 may include user settings and/or interactions with a display.
  • Synchronizing subtitle data at block 425 may be based on subtitle data received from a source different from the source of display content.
  • the display content may be received or output from a device local to the display, such as a media player.
  • Subtitle data employed in block 425 may be received over a network communication, such as communication with a server.
  • the subtitle data may be synchronized such that the graphical elements of the subtitle data presented are matched to the occurrence of voice and other sound in the display content.
  • FIG. 5 depicts another process for subtitle operations according to one or more embodiments.
  • user input may aid in identifying a conversation language and/or subtitle title to present.
  • process 500 may be performed by a control device of a vehicle (e.g., vehicle 105) including a display (e.g., display 110) for presentation of subtitle data.
  • process 500 may be performed by a control device of a display device (e.g., display device 160) for presentation of subtitle data.
  • Process 500 may be initiated by detecting display content at block 505 and identifying subtitle data at block 510 .
  • multiple sources or sets of subtitle data may be available.
  • display content at block 505 may relate to popular content, such as a well-known film.
  • subtitle data identified at block 510 may result in the identification of multiple files or sources of data.
  • subtitle data identified at block 510 may not match a conversation language identified.
  • process 500 includes operations to request user input at block 515 .
  • User input may be requested through display of a graphical element (e.g., graphical element 116, graphical element 166, etc.), an audible tone, or other device feedback.
  • user input can include selection of subtitle data for a language that is not spoken in the display content.
  • User input can include selection of a particular subtitle data set associated with an identified language or source.
  • the user input may be received and used by the control device to control display output at block 520.
  • Subtitle data presented in response to display output at block 520 may be based on user input.

Abstract

The present disclosure relates to systems, devices and methods for automatic subtitle display. In one embodiment, a method is provided that includes determining a conversation language for a space, and identifying display content presented in the space on a display. The method may also include requesting subtitle data for the display content based on the conversation language determined for the space, and controlling, by the control device, presentation of subtitle data for the display content for output on the device, wherein subtitle data presented is selected for the determined conversation language. Processes and configurations can include determining conversation language by one or more of speech recognition, facial recognition, and user profile settings. In addition, automatic subtitle display may be provided for displays in a vehicle cabin and viewing areas of a display device in general.

Description

FIELD
The present disclosure relates to systems, methods and devices for controlling display elements, and more particularly to presentation of automatic subtitle display for display devices and vehicles.
BACKGROUND
Media content typically includes sound in a single language. Sometimes, audio data for the media content is dubbed such that additional or supplementary recordings replace the original production sound in a post-production process. Dubbing sound for media content can be labor intensive, and the sound quality of the media is often reduced. For many types of media, viewers desire the ability to understand the voice or speech of the media. Some broadcast formats include secondary audio accompanying the media, and the media player can be set to include subtitles. There exists a need to provide display devices with additional subtitle information that is not limited to a fixed set of subtitle information provided with the media.
Many display devices are not configured to provide content other than information that is received by an input. Conventional display devices are usually programmed for a particular set of operation languages. There is a desire to provide display devices with the ability to access and present media in a desired language.
BRIEF SUMMARY OF THE EMBODIMENTS
Disclosed and claimed herein are methods, devices and systems for automatic subtitle display. One embodiment is directed to a method including determining, by a control device, a conversation language for a space. The method also includes identifying, by the control device, display content presented in the space on a display and requesting, by the control device, subtitle data for the display content based on the conversation language determined for the space. The method also includes controlling, by the control device, presentation of subtitle data for the display content for output on the device, wherein subtitle data presented is selected for the determined conversation language.
In one embodiment, determining conversation language includes performing a speech recognition operation on passenger voice data detected in the space.
In one embodiment, determining conversation language includes performing a facial recognition operation on image data detected in the space.
In one embodiment, determining conversation language includes determining a user profile setting for a passenger in the space.
In one embodiment, the space is a vehicle cabin, and conversation language includes passenger voice data detected for a vehicle cabin passenger.
In one embodiment, the space is associated with a viewing area of a display device, and conversation language includes voice data detected in the viewing area.
In one embodiment, identifying display content includes determining at least one of title, source, and identifier for the display content.
In one embodiment, subtitle data includes at least one of a textual and graphical representation of audio and speech data for the display content.
In one embodiment, controlling presentation of the subtitle data includes synchronizing output of the subtitle data to timing of the display content.
In one embodiment, the method includes displaying a notification for the subtitle data and receiving user input for the subtitle data, wherein presentation of the subtitle data is in response to user input received.
Another embodiment is directed to a system including a display and a control device coupled to the display. The control device is configured to determine a conversation language for a space, identify display content presented in the space on a display, and request subtitle data for the display content based on the conversation language determined for the space. The control device is also configured to control presentation of subtitle data for the display content for output on the device, wherein subtitle data presented is selected for the determined conversation language.
Other aspects, features, and techniques will be apparent to one skilled in the relevant art in view of the following detailed description of the embodiments.
BRIEF DESCRIPTION OF THE DRAWINGS
The features, objects, and advantages of the present disclosure will become more apparent from the detailed description set forth below when taken in conjunction with the drawings in which like reference characters identify correspondingly throughout and wherein:
FIGS. 1A-1B depict graphical representations of subtitle display according to one or more embodiments;
FIG. 2 depicts a process for automatic subtitle display according to one or more embodiments;
FIG. 3 depicts a graphical representation of device components according to one or more embodiments;
FIG. 4 depicts a process for subtitle operations according to one or more embodiments; and
FIG. 5 depicts another process for subtitle operations according to one or more embodiments.
DETAILED DESCRIPTION OF THE EXEMPLARY EMBODIMENTS Overview and Terminology
One aspect of the disclosure is directed to controlling operations of a display device. Processes and device configurations are provided to allow for automatic subtitle display. In one embodiment, a process is provided that includes controlling presentation of subtitle data for display content output on a device. The process may include performing at least one operation to determine a conversation language relative to the display. In one embodiment, subtitle data is presented for the determined conversation language in a vehicle. Other embodiments are directed to presentation of subtitle data for display devices in general.
In one embodiment, a system is provided including a display and a control device coupled to the display. The control device is configured to determine a conversation language for a space and identify display content presented on a display. Based on the conversation language, the control device may request subtitle data for the display content. The control device may also be configured to control presentation of subtitle data for the display content for output on the device. Subtitle data presented by the display may be selected by the control device for the determined conversation language.
Processes and configurations described herein may be configured to identify display content presented in the space and request subtitle data for display content based on the conversation language determined for the space. In one embodiment, determining conversation languages may be based on a space relative to a display. By way of example, conversation language may be relative to a space or area within a vehicle cabin. In other embodiments, determining conversation language of a space may be relative to a viewing area of a display device (e.g., TV, projector, etc.). Presentation of subtitle data for the display content may be controlled for output on the device.
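By way of a non-limiting illustration, the sequence of operations described above may be sketched as follows. The sketch is illustrative only; the callable parameters are hypothetical placeholders for the determining, identifying, requesting, and presenting operations described in this disclosure and are not part of the claimed embodiments.

```python
# Illustrative sketch only; each callable is a hypothetical placeholder for one
# of the operations described above (determine language, identify content,
# request subtitle data, control presentation).
def automatic_subtitle_display(determine_language, identify_content,
                               request_subtitles, present):
    language = determine_language()                            # conversation language for the space
    content_id = identify_content()                            # title/source/identifier of display content
    subtitle_data = request_subtitles(content_id, language)    # fetch subtitles for that language
    present(subtitle_data, language)                           # output synchronized with the content
    return subtitle_data
```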
According to one embodiment, conversation language in a vehicle cabin may be determined by one or more operations including speech recognition, natural language processing and/or artificial intelligence (AI). In certain embodiments, one or more parameters for determining conversation language include determining a language identified in a user profile. In other embodiments, determining conversation language can include performing facial recognition operations. Facial recognition may be performed to identify nationality of one or more individuals in a space relative to the display. The determined conversation language can be used to identify the most relevant subtitle. Operations are also described herein to download subtitle data with display content, such as video, automatically. With respect to vehicle configurations, such as a vehicle display for the vehicle cabin, determining conversation language as discussed herein can overcome issues with driver distraction. For example, requests by vehicle passengers, such as young children, to provide subtitle data can be handled by processes and configurations without requiring driver programming of the subtitle data.
According to one embodiment, operations and configurations can provide improvements to display devices such as televisions. For broadcast programming (e.g., live TV), operations discussed herein can provide functions to allow for determination of a conversation language relative to the display device and presentation of subtitle data.
As used herein, the terms “a” or “an” shall mean one or more than one. The term “plurality” shall mean two or more than two. The term “another” is defined as a second or more. The terms “including” and/or “having” are open ended (e.g., comprising). The term “or” as used herein is to be interpreted as inclusive or meaning any one or any combination. Therefore, “A, B or C” means “any of the following: A; B; C; A and B; A and C; B and C; A, B and C”. An exception to this definition will occur only when a combination of elements, functions, steps or acts are in some way inherently mutually exclusive.
Reference throughout this document to “one embodiment,” “certain embodiments,” “an embodiment,” or similar term means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. Thus, the appearances of such phrases in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner on one or more embodiments without limitation.
EXEMPLARY EMBODIMENTS
Referring now to the figures, FIGS. 1A-1B depict graphical representations of subtitle display according to one or more embodiments. FIG. 1A provides a graphical representation of a system 100 for vehicle 105 including display 110. According to one embodiment, the interior space of vehicle 105 may accommodate one or more passengers. In addition, vehicle 105 may include a control device (not shown in FIG. 1A) configured to control operation of display 110. According to one embodiment, display 110 may be configured to present display content 120. According to another embodiment, a control device of vehicle 105, such as control device 305 of FIG. 3, may be configured to determine a conversation language within the passenger compartment, or cabin, of vehicle 105. As will be discussed herein, configurations and processes are provided to request subtitle data for display content 120 based on a determined conversation language in a space, such as within vehicle 105.
Vehicle 105 may be configured to control presentation of subtitle data, such as subtitle text 115, with display content 120. As will be discussed in more detail below, a control device (e.g., control device 305 of FIG. 3) may be configured to detect a conversation language within vehicle 105. FIG. 1A includes representations of passenger speech 125 and 130. In one embodiment, passenger speech 125 and 130 may relate to regular or non-command conversation between occupants. According to one embodiment, the control device of vehicle 105 is configured to detect passenger speech 125 and 130 and determine a conversation language based on passenger speech. Subtitle text 115 may be presented based on the determined conversation language. As will be discussed in more detail below, subtitle text 115 may be presented based on a conversation language determined from imaging of passengers and/or one or more user settings for subtitle language. The control unit of display 110 and/or vehicle 105 may be configured to perform the processes (e.g., process 200, process 400, process 500, etc.) described herein for presentation of subtitle text 115.
In one embodiment, passenger speech 125 and 130 may relate to regular or non-command conversation between occupants. According to one embodiment, determining conversation language may be based on natural-language instructions from one or more occupants of the vehicle. By way of example, passenger speech, such as passenger speech 125 and 130, may be detected and interpreted such that commands to present subtitle data in one or more languages are recognized. In one embodiment, passenger speech 125 and 130 may relate to conversational language such as, "turn here," "have a good day," and "I'm turning here," in one or more languages. In one embodiment, passenger speech 125 and 130 may relate to one or more commands including identification of a desired language. By way of example, a natural language command of "English subtitles" may result in the control device identifying the language as English and controlling presentation of subtitle information in the detected language. Alternative examples of natural language commands detected by the control device may include "change subtitle language" and "display subtitles in my language." Passenger speech 125 and 130 may include commands to operate with one or more functions of display 110 and graphical element 116. As such, the language used for natural language commands may be identified to determine a conversation language for subtitle data.
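By way of a non-limiting illustration, a simple mapping from detected command phrases to a subtitle language might look like the following sketch; the phrases and language codes are examples chosen here for illustration, not a vocabulary defined by this disclosure.

```python
from typing import Optional

# Example phrases and language codes chosen for illustration only.
COMMAND_PHRASES = {
    "english subtitles": "en",
    "subtitulos en espanol": "es",
    "untertitel auf deutsch": "de",
}

def subtitle_language_from_command(transcript: str) -> Optional[str]:
    """Return a subtitle language code if the transcribed speech contains a known command."""
    text = transcript.lower()
    for phrase, code in COMMAND_PHRASES.items():
        if phrase in text:
            return code
    return None  # no explicit subtitle command detected
```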
FIG. 1A also shows graphical element 116, which may be presented on display 110 to indicate one or more of automatic subtitle display and the availability of subtitle data based on conversation language. According to one embodiment, graphical element 116 may be a selectable element configured for activating, modifying, and/or ending presentation of subtitle text 115.
FIG. 1B provides a graphical representation of a system 150 for display 155 in a viewing area or space 151. According to one embodiment, space 151, associated with display 155, may accommodate one or more viewers, such as viewers 170 1-n. Display 155 may include a control device (not shown in FIG. 1B) configured to control operation of display 155. According to one embodiment, display 155 may be configured to present display content 160. According to another embodiment, a control device of display 155, such as control device 305 of FIG. 3, may be configured to determine a conversation language within space 151. As will be discussed herein, configurations and processes are provided to request subtitle data for display content 160 based on a determined conversation language in space 151. The control unit of display 155 may be configured to perform the processes (e.g., process 200, process 400, process 500, etc.) described herein for presentation of subtitle text 165.
Display 155 may be configured to control presentation of subtitle data, such as subtitle text 165, with display content 160. As will be discussed in more detail below, a control device (e.g., control device 305 of FIG. 3) may be configured to detect a conversation language within space 151. FIG. 1B includes representations of viewer speech 175 and 180. According to one embodiment, the control device of display 155 is configured to detect viewer speech 175 and 180 and determine a conversation language based on the viewer speech. Subtitle text 165 may be presented based on the conversation language determined from viewer speech 175 and 180. As will be discussed in more detail below, subtitle text 165 may be presented based on a conversation language determined from imaging of viewers 170 1-n and/or one or more user settings for subtitle language.
FIG. 1B also shows graphical element 166 which may be presented on display 155 to indicate one or more of automatic subtitle display and the availability of subtitle data based on conversation language. According to one embodiment, graphical element 166 may be a selectable element configured for activation, modifying and/or ending subtitle presentation of subtitle text 165.
FIG. 2 depicts a process for automatic subtitle display according to one or more embodiments. According to one embodiment, process 200 may be performed by a control device of a vehicle (e.g., vehicle 105) including a display (e.g., display 110) for presentation of display content with subtitle data. According to another embodiment, process 200 may be performed by a control device of a display device (e.g., display device 160) for presentation of subtitle data.
Process 200 may be initiated at block 205 with determining language for a space. In one embodiment, determining language includes determining a conversation language for the space. As used herein, conversation language can include determining the spoken human language for communication including the use of words in a structured and conventional way. In some embodiments, a conversation language may be determined by analyzing spoken words. Conversation language may be determined at block 205 prior to display of content. In other embodiments, conversation language may be determined at block 205 in response to display of content.
According to one embodiment, determining conversation language at block 205 includes performing a speech recognition operation on passenger voice data detected in the space. Each command may be identified by identifying an action and a reference for the action.
According to one embodiment, determining conversation language at block 205 may include determining more than one language. In response to detecting more than one language, a control device can select a conversation language. Selection of the conversation language may be based on the word count of each conversation language. By way of example, a conversation language detected having a greater word count for passenger speech may be selected. In other embodiments, process 200 may account for one or more other factors when multiple languages are concerned. One or more of a user input preference for language of subtitle presentation and facial recognition performed in the space may be employed to select one language over another language when multiple languages are identified. In yet another embodiment, a graphical element (e.g., graphical element 116, graphical element 166, etc.) may be presented on a display to allow a user to select a language detected.
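A minimal sketch of this selection rule, assuming a per-utterance language identifier that reports a language code and word count for each detected utterance (an assumption for illustration, not a claimed component), might be:

```python
from collections import Counter

def select_conversation_language(utterances):
    """Select the dominant language when more than one is detected in the space.

    `utterances` is an iterable of (language_code, word_count) pairs; the language
    with the greatest total word count across passenger speech is selected.
    """
    totals = Counter()
    for language, word_count in utterances:
        totals[language] += word_count
    return totals.most_common(1)[0][0] if totals else None

# Example: mostly German speech with a few English words selects "de".
# select_conversation_language([("de", 42), ("en", 5), ("de", 17)])
```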
Determining conversation language at block 205 may include performing one or more operations to characterize speech detected in a space. In one embodiment, one or more of sound and keyword recognition are used to identify possible languages. Phrases and sentences may be determined in addition to determining words. Process 200 may include parameters for natural language processing. In addition, process 200 may load a plurality of language and sound data sets as a reference. Languages and sound parameters may be assigned identifiers to allow a control device to request subtitle data based on a determined language.
In one embodiment, determining conversation language at block 205 may include performing a voice recognition process including at least one of acoustic and language modeling. Acoustic modeling may include receiving audio data, detecting voice inputs, and identifying one or more linguistic units of the voice portion of audio data. The linguistic units may be used for language modeling, including matching at least one of sounds and sequences of sounds to terms or words. In addition, patterns of speech such as a temporal pattern may be used to identify a spoken language.
In one embodiment, determining conversation language at block 205 may include identifying a spoken language between multiple passengers using at least one of voice differentiation and voice location in the space. One or more microphones associated with the display or space may be used to detect human speech and characteristics of the speech. Speech detected in a first area of the space may be associated with a first passenger/viewer/individual, while speech associated with a second area, which may be non-overlapping or located in a second, different position, may be associated with a second passenger/viewer/individual. By associating detected speech with at least one of the first location of the space and a second location of the space, speech from each location may be sequenced. Sequences of speech may be used to identify terms or language.
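The following sketch illustrates one way such location-based sequencing could be represented, assuming a microphone arrangement that attributes each detected utterance to a zone of the space; the data layout is hypothetical and for illustration only.

```python
from collections import defaultdict

def sequence_speech_by_zone(segments):
    """Group detected speech by zone and order each zone's utterances in time.

    `segments` is an iterable of (zone_id, start_time_s, text) tuples; the returned
    per-zone sequences can then be passed to a language identification step.
    """
    zones = defaultdict(list)
    for zone_id, start_time, text in segments:
        zones[zone_id].append((start_time, text))
    return {zone: [text for _, text in sorted(items)] for zone, items in zones.items()}
```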
According to one embodiment, determining conversation language at block 205 includes performing a facial recognition operation on image data detected in the space. Conversation language can relate to a system of communication used by a particular community or country. In addition, parameters associated with people from a particular community or country may be associated with one or more national languages. According to one embodiment, a control unit may include one or more processes employing a trained data set for facial recognition. The trained data set may be based on a machine learned process for identifying facial features and correlating facial features to one or more languages. A trained data set and one or more processes for feature recognition may be performed by process 200.
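As a hedged sketch only, aggregating per-face predictions from such a trained data set could look like the following; `face_language_classifier` is a hypothetical callable standing in for the trained model and is assumed to return language codes with confidence scores for a single face image.

```python
def candidate_languages_from_faces(face_images, face_language_classifier, threshold=0.5):
    """Aggregate per-face language predictions into a set of candidate languages.

    `face_language_classifier` is a hypothetical trained model returning a mapping of
    language codes to confidence scores for one face image; languages whose best
    score across all detected faces meets the threshold are kept as candidates.
    """
    best = {}
    for image in face_images:
        for language, score in face_language_classifier(image).items():
            best[language] = max(best.get(language, 0.0), score)
    return {language for language, score in best.items() if score >= threshold}
```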
In one embodiment, determining conversation language at block 205 includes determining a user profile setting for a passenger in the space. A display may provide a graphical display element (e.g., graphical element 116, graphical element 166, etc.) that operates as a user interface where a user can identify a desired conversation language of choice.
In one embodiment, determining conversation language at block 205 includes sending one or more of audio data, a user setting and optical characteristics to a server for processing. The control device may communicate with a network device, such as a server, over a communication network to determine a conversation language for the space. In one embodiment, the space is a vehicle cabin, and conversation language includes passenger voice data detected for a vehicle cabin passenger. According to another embodiment, the space is associated with a viewing area of a display device, and conversation language includes voice data detected in the viewing area.
At block 210, process 200 can include identifying display content presented in the space on a display. In one embodiment, a control device identifies display content by determining at least one of title, source, and identifier for the display content. At block 215, the control device requests subtitle data for the display content based on the conversation language determined for the space. At least one of identified content and a title of display content may be transmitted with a determined conversation language to a server to obtain subtitle data for the display content. In one embodiment, subtitle data includes at least one of a textual and graphical representation of audio and speech data for the display content.
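By way of a non-limiting illustration, such a request might be issued as a simple HTTP query; the endpoint, parameters, and response format below are hypothetical, as this disclosure does not specify a particular subtitle service or API.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical subtitle service endpoint used only for illustration.
SUBTITLE_SERVICE_URL = "https://example.com/subtitles"

def request_subtitle_data(title, content_id, language):
    """Request subtitle data for identified content in the determined conversation language."""
    query = urllib.parse.urlencode({
        "title": title,            # e.g., title of the display content
        "content_id": content_id,  # source or identifier of the content
        "lang": language,          # conversation language determined for the space
    })
    with urllib.request.urlopen(f"{SUBTITLE_SERVICE_URL}?{query}") as response:
        return json.load(response)  # assumed to contain timed subtitle cues
```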
At block 220, process 200 includes controlling presentation of subtitle data for the display content for output on the device. The control device may output subtitle data for presentation in the determined conversation language with the display content. In one embodiment, controlling presentation of the subtitle data includes synchronizing output of the subtitle data to timing of the display content. The subtitle data may be output to be superimposed on display content or presented in a desired area of the display.
In certain embodiments, controlling presentation can include displaying a notification for the subtitle data and receiving user input for the subtitle data. Presentation of the subtitle data may be in response to user input received at optional block 225. Process 200 may be performed to provide automatic subtitle presentation. Automatic subtitle presentation can include detection of one or more parameters to identify conversation language without the knowledge of the individuals in the space. Control and output of the subtitle data may then be synchronized and displayed. In one embodiment, process 200 includes detecting voice and sounds of media in addition to voice within a space. Detected audio of the media may be filtered (e.g., ignored) to allow for identification of passenger speech. In other embodiments, detection of audio media may be identified, and a speech recognition process may be performed on media audio to determine timing for presentation of subtitle information.
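A minimal sketch of such filtering, assuming the intervals where the content itself contains voice are known (for example, from its own subtitle timing), might be:

```python
def passenger_speech_segments(detected, media_dialogue):
    """Drop detected speech that coincides with dialogue in the display content.

    `detected` is a list of (start_s, end_s, text) segments from microphones in the
    space; `media_dialogue` is a list of (start_s, end_s) intervals where the content
    itself contains voice. Overlapping segments are treated as media audio and ignored.
    """
    def overlaps_media(start, end):
        return any(start < m_end and m_start < end for m_start, m_end in media_dialogue)
    return [(s, e, t) for s, e, t in detected if not overlaps_media(s, e)]
```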
According to one embodiment, process 200 includes receiving user input at block 225. User input received at block 225 may be relative to a display, such as inputs to a graphical display element (e.g., graphical element 116, graphical element 166, etc.). In one embodiment, user input at block 225 includes a user selection of a graphical display element of the display to confirm subtitle data for an identified language.
According to one embodiment, process 200 includes receiving subtitle data at block 230. Subtitle data can include text and/or data to present text with display content. In certain embodiments, subtitle data may include metadata to synchronize the subtitle data with display content. By way of example, one or more of a time base and synchronization framework may be provided to control presentation of the subtitle text.
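A minimal sketch of using such timing metadata, assuming the subtitle data has been reduced to a list of timed cues, might be:

```python
def active_subtitle(cues, playback_time_s):
    """Return the subtitle text to display at the current playback position.

    `cues` is a list of (start_s, end_s, text) tuples derived from the time base /
    synchronization metadata delivered with the subtitle data; the cue whose interval
    contains the playback clock is shown, otherwise no subtitle is displayed.
    """
    for start, end, text in cues:
        if start <= playback_time_s < end:
            return text
    return None

# Example: active_subtitle([(0.0, 2.5, "Hello."), (2.5, 5.0, "How are you?")], 3.1)
# returns "How are you?".
```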
Process 200 allows a display to present content and subtitle data without requiring user activation. In that fashion, process 200 provides automatic presentation of subtitle information. For use in a vehicle, process 200 overcomes the need for a driver to select a subtitle set, and thus avoids driver distraction. For display device operations in other settings, such as television viewing, process 200 provides functionality that is not provided by conventional devices.
FIG. 3 depicts a graphical representation of display device components according to one or more embodiments. According to one embodiment, display device 300 relates to a display device such as a TV. In certain embodiments, display device 300 may be a display device configured for operation in a vehicle. Display device 300 includes control device 305, data storage unit 315, input/output module 320, microphone 321, speaker 322, and display 325. According to one embodiment, display device 300 includes optional camera 310. According to another embodiment, display device 300 relates to a vehicle display device, and thus may interoperate with one or more components of an optional vehicle system 330 to provide control signals.
According to one embodiment, display device 300 relates to a system including display 325 and control device 305. Control device 305 may be configured to determine a conversation language for a space, identify display content presented on display 325 for the space, and request subtitle data for the display content based on the conversation language determined for the space. Control device 305 may also be configured to control presentation of subtitle data for display content for output by display 325, wherein subtitle data presented is selected for the determined conversation language.
Control device 305 may be a processor and is configured to control operation of display device 300. According to one embodiment, control device 305 may be configured to provide a control module 306 to generate control commands for the display device. Control device 305 may be configured to provide a language detection module 307 with data received from at least one of microphone 321 and optional camera 310. In other embodiments, control module 306 and language detection module 307 may be physical hardware units of device 300.
Control device 305 may operate based on executable code of control module 306, language detection module 307, and data storage unit 315 to perform and control functions of display device 300. By way of example, control device 305 may execute process 200 of FIG. 2, process 400 of FIG. 4, and process 500 of FIG. 5. Control device 305 may execute and direct one or more processes and functional blocks described herein for display device operation, including presentation of subtitle data.
In certain embodiments, control device 305 may use one or more processes for identifying the conversation language based on parameters stored by data storage unit 315. By way of example, keywords, terms, and phrases may be stored for comparison to identify the language for which to request subtitle data. Voice and/or speech data detected by input/output module 320 may be converted to text or machine-readable representations to interpret the language.
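A minimal sketch of this keyword comparison, assuming per-language keyword lists stored by the data storage unit; the word lists and scoring rule are illustrative assumptions, and a deployed system might instead use a trained language-identification model.

```python
# Sketch: compare recognized words against stored keywords per candidate
# language and select the language with the most matches.
STORED_KEYWORDS = {
    "en": {"the", "and", "hello", "thanks"},
    "de": {"und", "hallo", "danke", "nicht"},
    "fr": {"et", "bonjour", "merci", "pas"},
}


def identify_language(transcribed_text: str) -> str:
    """Return the stored language whose keyword set overlaps most with the
    recognized speech (ties fall back to the first candidate)."""
    words = set(transcribed_text.lower().split())
    scores = {lang: len(words & vocab) for lang, vocab in STORED_KEYWORDS.items()}
    return max(scores, key=scores.get)

# Example: identify_language("hallo danke und bis bald")  # -> "de"
```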
Optional camera 310 may be mounted to image one or more viewers in a space and to provide image data to language detection module 307. Data storage unit 315 may be configured to store executable code to operate control device 305 and display device 300. Input/output (I/O) module 320 may be configured to receive inputs from a controller or input surface (e.g., touch screen, input buttons, etc.) and from display 325, and to output display content to display 325. Input/output (I/O) module 320 may operate display 325 and speaker 322 to output confirmation of one or more natural-language guidance instructions.
In certain embodiments, display device 300 and control device 305 may be configured to communicate with components of a vehicle, such as optional vehicle system 330. By way of example, optional vehicle system 330 may relate to a user interface system of a vehicle including one or more sensors, functions, and data capabilities.
FIG. 4 depicts a process for subtitle operations according to one or more embodiments. According to one embodiment, determining the language for subtitle data may be based on one or more attributes and data types detected by a device. According to one embodiment, process 400 may be performed by a control device of a vehicle (e.g., vehicle 105) including a display (e.g., display 110) for presentation of subtitle data. According to another embodiment, process 400 may be performed by a control device of a display device (e.g., display device 160) for presentation of subtitle data. In FIG. 4, process 400 can include at least one of detecting voice (e.g., speech) at block 405, detecting image data at block 410, and receiving user input at block 415. One or more of these sources may be used to determine the language and perform subtitle requests at block 420. Voice data may be detected at block 405 while a display device is presenting content. Image data of viewers of the display device may be detected at block 410. User input at block 415 may include user settings and/or interactions with a display.
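One way the sources feeding block 420 could be combined, sketched with an assumed precedence order (explicit user setting, then detected speech, then an image-derived hint); the ordering is an assumption made for illustration, not a requirement of process 400.

```python
# Sketch of block 420: fold the available inputs into a single language
# choice before issuing the subtitle request.
from typing import Optional


def determine_language(voice_language: Optional[str],
                       image_language_hint: Optional[str],
                       user_setting: Optional[str]) -> Optional[str]:
    """Prefer an explicit user setting, then the spoken language detected in
    the space, then a hint derived from image data (e.g., facial recognition)."""
    for candidate in (user_setting, voice_language, image_language_hint):
        if candidate:
            return candidate
    return None

# Example: determine_language("en", None, None)  # -> "en"
```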
Synchronizing subtitle data at block 425 may be based on subtitle data received from a source different from the source of display content. In one embodiment, the display content may be received or output from a device local to the display, such as a media player. Subtitle data employed in block 425 may be received over a network communication, such as communication with a server. The subtitle data may be synchronized such that the graphical elements of the subtitle data presented are matched to the occurrence of voice and other sound in the display content.
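A hedged sketch of this synchronization, assuming the subtitle cues fetched over the network are expressed relative to the start of the content and the local player's start time is known; the clock-offset approach is illustrative rather than a method defined by this disclosure.

```python
# Sketch of block 425: map cue times from the network payload's timeline onto
# the local playback clock so graphical elements appear with the matching
# voice or sound in the display content.
import time


class SubtitleSynchronizer:
    def __init__(self, content_start_wall_clock: float):
        # Wall-clock time at which the local media player started the content.
        self.content_start = content_start_wall_clock

    def playback_position(self) -> float:
        """Seconds of content elapsed according to the local clock."""
        return time.time() - self.content_start

    def cue_is_active(self, cue_start_s: float, cue_end_s: float) -> bool:
        """True when a cue from the network payload should be on screen now."""
        pos = self.playback_position()
        return cue_start_s <= pos < cue_end_s
```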
FIG. 5 depicts another process for subtitle operations according to one or more embodiments. According to one embodiment, user input may aid in identifying a conversation language and/or the subtitle data to present. According to one embodiment, process 500 may be performed by a control device of a vehicle (e.g., vehicle 105) including a display (e.g., display 110) for presentation of subtitle data. According to another embodiment, process 500 may be performed by a control device of a display device (e.g., display device 160) for presentation of subtitle data.
Process 500 may be initiated by detecting display content at block 505 and identifying subtitle data at block 510. In certain embodiments, multiple sources or sets of subtitle data may be available. By way of example, display content at block 505 may relate to popular content, such as a well-known film. As such, subtitle data identified at block 510 may result in the identification of multiple files or sources of data. Alternatively, subtitle data identified at block 510 may not match the conversation language identified. Accordingly, process 500 includes operations to request user input at block 515. User input may be requested through display of a graphical element (e.g., graphical element 116, graphical element 166, etc.), an audible tone, or other feedback of the device. By way of example, user input can include selection of subtitle data for a language that is not spoken in the display content. User input can include selection of a particular subtitle data set associated with an identified language or source. The user input may be received and used by the control device to control display output at block 520. Subtitle data presented in response to display output at block 520 may be based on the user input.
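A sketch of this selection logic, assuming each subtitle set is described by a language tag and user input is gathered through a callback tied to a graphical element; these structures are hypothetical placeholders for whatever representation an implementation uses.

```python
# Sketch of blocks 510-520: pick the subtitle set matching the detected
# conversation language when unambiguous; otherwise ask the user.
from typing import Callable, Dict, List, Optional


def choose_subtitle_set(available: List[Dict], conversation_language: str,
                        ask_user: Callable[[List[Dict]], Dict]) -> Optional[Dict]:
    """Return the subtitle set to present, deferring to user input whenever
    zero or several sets match the detected conversation language."""
    if not available:
        return None
    matches = [s for s in available if s.get("lang") == conversation_language]
    if len(matches) == 1:
        return matches[0]
    return ask_user(matches or available)  # user confirms or picks another set
```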
While this disclosure has been particularly shown and described with references to exemplary embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the claimed embodiments.

Claims (18)

What is claimed is:
1. A method for automatic subtitle display on a display device in a vehicle, the method comprising:
detecting, by a control device, speech of a passenger;
detecting, by the control device, audio of media in the vehicle;
determining, by the control device, a conversation language for a space based on the passenger speech, wherein the space is a passenger compartment of the vehicle, wherein determining the conversation language comprises identifying voice location, and at least one of acoustic modelling, keyword recognition, or temporal patterns, and wherein determining the conversation language further comprises filtering out audio of media in the vehicle;
identifying, by the control device, display content being presented in the space on the display device;
requesting, by the control device, subtitle data for the display content based on the conversation language determined for the space, wherein if more than one conversation language is determined for the space, the conversation language detected as having a greater word count for passenger speech is selected; and
controlling, by the control device, presentation of subtitle data for the display content for output on the display device, wherein subtitle data presented is selected for the determined conversation language.
2. The method of claim 1, wherein determining conversation language includes performing a facial recognition operation on image data detected in the space.
3. The method of claim 1, wherein determining conversation language includes determining a user profile setting for the passenger in the space.
4. The method of claim 1, wherein the space is associated with a viewing area of a display device, and conversation language includes voice data detected in the viewing area.
5. The method of claim 1, wherein identifying display content includes determining at least one of title, source, and identifier for the display content.
6. The method of claim 1, wherein subtitle data includes at least one of a textual and graphical representation of audio and speech data for the display content.
7. The method of claim 1, wherein controlling presentation of the subtitle data includes synchronizing output of the subtitle data to timing of the display content.
8. The method of claim 1, further comprising displaying a notification for the subtitle data and receiving user input for the subtitle data, wherein presentation of the subtitle data is in response to user input received,
wherein the user is the passenger of the vehicle.
9. The method of claim 1, wherein requesting subtitle data occurs without requiring driver programming of the subtitle data.
10. A system comprising:
a display; and
a control device coupled to the display, wherein the control device is configured to:
identify display content being presented in a space on the display;
determine conversation language based on speech of a passenger, wherein if more than one conversation language is determined for the space, the conversation language detected as having a greater word count for passenger speech is selected;
request subtitle data for the display content based on the conversation language determined for the space, wherein the space is a passenger compartment of a vehicle, wherein the subtitle data is stored in a source different than the source of the display content; and
control presentation of subtitle data for the display content for output on the display, wherein subtitle data presented is selected for the determined conversation language.
11. The system of claim 10, wherein determining conversation language includes performing a speech recognition operation on passenger voice data detected in the space and exclusion of audio of media in the space.
12. The system of claim 10, wherein determining conversation language includes performing a facial recognition operation on image data detected in the space.
13. The system of claim 10, wherein determining conversation language includes determining a user profile setting for the passenger in the space.
14. The system of claim 10, wherein the space is associated with a viewing area of a display device, and conversation language includes voice data detected in the viewing area.
15. The system of claim 10, wherein identifying display content includes determining at least one of title, source, and identifier for the display content.
16. The system of claim 10, wherein subtitle data includes at least one of a textual and graphical representation of audio and speech data for the display content.
17. The system of claim 10, wherein controlling presentation of the subtitle data includes synchronizing output of the subtitle data to timing of the display content.
18. The system of claim 10, wherein the control device is further configured to control display of a notification for the subtitle data and receiving user input for the subtitle data, wherein presentation of the subtitle data is in response to user input received, wherein the user is the passenger of the vehicle.
US16/149,996 2018-10-02 2018-10-02 System and method for automatic subtitle display Active 2039-03-14 US11178465B2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US16/149,996 US11178465B2 (en) 2018-10-02 2018-10-02 System and method for automatic subtitle display
CN201910930371.2A CN110996163B (en) 2018-10-02 2019-09-29 System and method for automatic subtitle display
DE102019126688.2A DE102019126688A1 (en) 2018-10-02 2019-10-02 SYSTEM AND METHOD FOR AUTOMATIC SUBTITLE DISPLAY

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US16/149,996 US11178465B2 (en) 2018-10-02 2018-10-02 System and method for automatic subtitle display

Publications (2)

Publication Number Publication Date
US20200107078A1 US20200107078A1 (en) 2020-04-02
US11178465B2 US11178465B2 (en) 2021-11-16

Family

ID=69781693

Family Applications (1)

Application Number Title Priority Date Filing Date
US16/149,996 Active 2039-03-14 US11178465B2 (en) 2018-10-02 2018-10-02 System and method for automatic subtitle display

Country Status (3)

Country Link
US (1) US11178465B2 (en)
CN (1) CN110996163B (en)
DE (1) DE102019126688A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017163719A1 (en) * 2016-03-23 2017-09-28 日本電気株式会社 Output control device, output control method, and program
US11341961B2 (en) * 2019-12-02 2022-05-24 National Cheng Kung University Multi-lingual speech recognition and theme-semanteme analysis method and device
CN111526382B (en) * 2020-04-20 2022-04-29 广东小天才科技有限公司 Live video text generation method, device, equipment and storage medium

Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060224438A1 (en) 2005-04-05 2006-10-05 Hitachi, Ltd. Method and device for providing information
US20110097056A1 (en) * 2008-06-24 2011-04-28 Shenzhen Tcl New Technology Ltd. System and method for resolution of closed captioning and subtitle conflict
US8156114B2 (en) * 2005-08-26 2012-04-10 At&T Intellectual Property Ii, L.P. System and method for searching and analyzing media content
US20120169583A1 (en) * 2011-01-05 2012-07-05 Primesense Ltd. Scene profiles for non-tactile user interfaces
US8260615B1 (en) * 2011-04-25 2012-09-04 Google Inc. Cross-lingual initialization of language models
US20150179184A1 (en) * 2013-12-20 2015-06-25 International Business Machines Corporation Compensating For Identifiable Background Content In A Speech Recognition Device
EP2933607A1 (en) 2014-04-14 2015-10-21 Bosch Automotive Products (Suzhou) Co., Ltd. Navigation system having language category self-adaptive function and method of controlling the system
US20150304727A1 (en) 2014-04-16 2015-10-22 Sony Corporation Method and system for displaying information
US20150325268A1 (en) * 2014-05-12 2015-11-12 Penthera Partners, Inc. Downloading videos with commercials to mobile devices
US20160055786A1 (en) * 2014-06-20 2016-02-25 Google Inc. Methods, systems, and media for detecting a presentation of media content on a display device
US20160127807A1 (en) * 2014-10-29 2016-05-05 EchoStar Technologies, L.L.C. Dynamically determined audiovisual content guidebook
US9571870B1 (en) * 2014-07-15 2017-02-14 Netflix, Inc. Automatic detection of preferences for subtitles and dubbing
US20180053518A1 (en) * 2016-08-17 2018-02-22 Vocollect, Inc. Method and apparatus to improve speech recognition in a high audio noise environment
US9934785B1 (en) * 2016-11-30 2018-04-03 Spotify Ab Identification of taste attributes from an audio signal
US20180233130A1 (en) * 2017-02-10 2018-08-16 Synaptics Incorporated Binary and multi-class classification systems and methods using connectionist temporal classification
US20180342239A1 (en) * 2017-05-26 2018-11-29 International Business Machines Corporation Closed captioning through language detection
US20190080691A1 (en) * 2017-09-12 2019-03-14 Toyota Motor Engineering & Manufacturing North America, Inc. System and method for language selection
US20190197430A1 (en) * 2017-12-21 2019-06-27 Lyft, Inc. Personalized ride experience based on real-time signals
US20200007946A1 (en) * 2018-06-29 2020-01-02 Rovi Guides, Inc. Selectively delivering a translation for a media asset based on user proficiency level in the foreign language and proficiency level required to comprehend the media asset

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100739680B1 (en) * 2004-02-21 2007-07-13 삼성전자주식회사 Storage medium for recording text-based subtitle data including style information, reproducing apparatus, and method therefor
US20110020774A1 (en) * 2009-07-24 2011-01-27 Echostar Technologies L.L.C. Systems and methods for facilitating foreign language instruction
CN102802044A (en) * 2012-06-29 2012-11-28 华为终端有限公司 Video processing method, terminal and subtitle server
US10321204B2 (en) * 2014-07-11 2019-06-11 Lenovo (Singapore) Pte. Ltd. Intelligent closed captioning
CN104681023A (en) * 2015-02-15 2015-06-03 联想(北京)有限公司 Information processing method and electronic equipment
CN106331893B (en) * 2016-08-31 2019-09-03 科大讯飞股份有限公司 Real-time caption presentation method and system
CN106504754B (en) * 2016-09-29 2019-10-18 浙江大学 A kind of real-time method for generating captions according to audio output
CN106864358A (en) * 2017-03-17 2017-06-20 东莞市立敏达电子科技有限公司 Subtitle dialog system between a kind of vehicle and vehicle
CN108600773B (en) * 2018-04-25 2021-08-10 腾讯科技(深圳)有限公司 Subtitle data pushing method, subtitle display method, device, equipment and medium

Also Published As

Publication number Publication date
CN110996163A (en) 2020-04-10
DE102019126688A1 (en) 2020-04-02
US20200107078A1 (en) 2020-04-02
CN110996163B (en) 2023-08-01

Similar Documents

Publication Publication Date Title
EP3190512B1 (en) Display device and operating method therefor
CN110996163B (en) System and method for automatic subtitle display
US7136817B2 (en) Method and apparatus for the voice control of a device appertaining to consumer electronics
US20180211659A1 (en) Ambient assistant device
US9959872B2 (en) Multimodal speech recognition for real-time video audio-based display indicia application
JP2008079018A (en) Closed caption generator, closed caption generation method and closed caption generation program
JP6945130B2 (en) Voice presentation method, voice presentation program, voice presentation system and terminal device
WO2018005334A1 (en) Systems and methods for routing content to an associated output device
CN112136102B (en) Information processing apparatus, information processing method, and information processing system
WO2019107145A1 (en) Information processing device and information processing method
US10089899B2 (en) Hearing and speech impaired electronic device control
WO2017222645A1 (en) Crowd-sourced media playback adjustment
CN110696756A (en) Vehicle volume control method and device, automobile and storage medium
US6757656B1 (en) System and method for concurrent presentation of multiple audio information sources
WO2019155716A1 (en) Information processing device, information processing system, information processing method, and program
CN115605948A (en) Arbitration between multiple potentially responsive electronic devices
US20200388268A1 (en) Information processing apparatus, information processing system, and information processing method, and program
JP2016206646A (en) Voice reproduction method, voice interactive device, and voice interactive program
WO2020003820A1 (en) Information processing device for executing plurality of processes in parallel
WO2022237381A1 (en) Method for saving conference record, terminal, and server
JP7229906B2 (en) Command controller, control method and control program
JP2018138987A (en) Information processing device and information processing method
US20230267942A1 (en) Audio-visual hearing aid
JP2007228624A (en) Motion picture reproducing apparatus and method, and computer program therefor
JP2008154258A (en) Motion picture playback apparatus, motion picture playback method and computer program therefor

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

AS Assignment

Owner name: HARMAN INTERNATIONAL INDUTRIES, INCORPORATED, CONNECTICUT

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:GANAPATHY, GIRISHA;REEL/FRAME:047349/0491

Effective date: 20180208

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STPP Information on status: patent application and granting procedure in general

Free format text: AWAITING TC RESP., ISSUE FEE NOT PAID

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF Information on status: patent grant

Free format text: PATENTED CASE