WO2024065799A1 - Vehicle passenger display modification

Vehicle passenger display modification

Info

Publication number
WO2024065799A1
Authority
WO
WIPO (PCT)
Prior art keywords
image data
vehicle
interest
processing circuitry
passenger
Prior art date
Application number
PCT/CN2022/123563
Other languages
French (fr)
Inventor
Cornelius Buerkle
Ping Guo
Mee Sim LAI
Meng Siong LEE
Kuan Heng Lee
Fabian Oboril
Frederik Pasch
Say Chuan Tan
Wei Seng Yeap
Chien Chern Yew
Original Assignee
Intel Corporation
Priority date
Filing date
Publication date
Application filed by Intel Corporation
Priority to PCT/CN2022/123563
Publication of WO2024065799A1

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60R: VEHICLES, VEHICLE FITTINGS, OR VEHICLE PARTS, NOT OTHERWISE PROVIDED FOR
    • B60R 1/00: Optical viewing arrangements; Real-time viewing arrangements for drivers or passengers using optical image capturing systems, e.g. cameras or video systems specially adapted for use in or on vehicles
    • B60R 1/20: Real-time viewing arrangements for drivers or passengers using optical image capturing systems, e.g. cameras or video systems specially adapted for use in or on vehicles
    • B60R 1/22: Real-time viewing arrangements for drivers or passengers using optical image capturing systems, e.g. cameras or video systems specially adapted for use in or on vehicles for viewing an area outside the vehicle, e.g. the exterior of the vehicle
    • B60R 1/23: Real-time viewing arrangements for drivers or passengers using optical image capturing systems, e.g. cameras or video systems specially adapted for use in or on vehicles for viewing an area outside the vehicle, e.g. the exterior of the vehicle with a predetermined field of view
    • B60R 1/24: Real-time viewing arrangements for drivers or passengers using optical image capturing systems, e.g. cameras or video systems specially adapted for use in or on vehicles for viewing an area outside the vehicle, e.g. the exterior of the vehicle with a predetermined field of view in front of the vehicle
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B61: RAILWAYS
    • B61L: GUIDING RAILWAY TRAFFIC; ENSURING THE SAFETY OF RAILWAY TRAFFIC
    • B61L 27/00: Central railway traffic control systems; Trackside control; Communication systems specially adapted therefor
    • B61L 27/04: Automatic systems, e.g. controlled by train; Change-over to manual control
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60R: VEHICLES, VEHICLE FITTINGS, OR VEHICLE PARTS, NOT OTHERWISE PROVIDED FOR
    • B60R 2300/00: Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle
    • B60R 2300/30: Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle characterised by the type of image processing
    • B60R 2300/307: Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle characterised by the type of image processing virtually distinguishing relevant parts of a scene from the background of the scene
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60R: VEHICLES, VEHICLE FITTINGS, OR VEHICLE PARTS, NOT OTHERWISE PROVIDED FOR
    • B60R 2300/00: Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle
    • B60R 2300/70: Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle characterised by an event-triggered choice to display a specific image among a selection of captured images
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60R: VEHICLES, VEHICLE FITTINGS, OR VEHICLE PARTS, NOT OTHERWISE PROVIDED FOR
    • B60R 2300/00: Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle
    • B60R 2300/80: Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle characterised by the intended use of the viewing arrangement
    • B60R 2300/8093: Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle characterised by the intended use of the viewing arrangement for obstacle warning

Definitions

  • the vehicle passenger protection system 130 is configured to use the output from the collision probability and severity estimation system 126 to determine whether to warn train passengers that a severe collision is to be expected (e.g., high severity due to a collision with a truck) , or alternatively, to provide the passengers with a virtual image.
  • the warning option is straightforward and may be a simple audio message.
  • the second option, providing a virtual image, is now described in more detail.
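As an illustration of this warn-or-virtual-image decision, the following minimal Python sketch consumes a collision list of the kind produced by the estimation system 126. The record fields, the probability threshold, and the severity rule are illustrative assumptions, not values taken from the disclosure.

```python
# Hedged sketch of the passenger protection decision: warn on a noticeable
# (severe) impact, switch to the virtual window for a non-noticeable one.
from dataclasses import dataclass

@dataclass
class PredictedCollision:
    time_to_impact_s: float   # e.g. 2.0 (collision expected in 2 seconds)
    object_type: str          # e.g. "animal", "truck"
    probability: float        # e.g. 0.8
    severe_impact: bool       # is a noticeable physical impact expected?

def select_protection_action(collisions, probability_threshold=0.5):
    """Return 'none', 'warn', or 'virtual_window' for the most imminent likely collision."""
    likely = [c for c in collisions if c.probability >= probability_threshold]
    if not likely:
        return "none"
    nearest = min(likely, key=lambda c: c.time_to_impact_s)
    return "warn" if nearest.severe_impact else "virtual_window"
```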
  • the vehicle passenger protection system 130 comprises an object removal from panoramic view system 132, a perspective correction system 134, and a windshield/window projection system 136.
  • An example of the object removal from panoramic view system 132 is described below with respect to FIGs. 4A-4B.
  • the perspective correction system 134 is described below with respect to FIGs. 2A, 2B, and 3.
  • the windshield/window projection system 136 may be any known display system.
  • FIGs. 2A-2B illustrate perspective correction system 200 (134 in FIG. 1B) .
  • the first option is to use windows 210 that can be made non-transparent by applying electronic signals. This is an inexpensive and fast solution that allows the system to make the entire window 210 non-transparent, or only a portion of the window 210. If only a portion of the window is to be made non-transparent, then a field-of-view (FOV) is estimated for each individual passenger using a head-tracking sensor 230, and the estimated FOV is correlated with positions of the object at different points in time (t, t+1, t+2 ...) until the collision.
  • the intersection of the passengers’ FOV and the object positions may be blurred out in a blending area 240.
  • the system 100 thereby prevents the passengers from becoming eyewitnesses to the collision.
  • the switching time of the windows 210 may be on the order of a few milliseconds, so the passengers will likely not notice the image change from actual to virtual.
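The following is a minimal sketch of the window-region selection described above: the predicted object positions are projected onto the window plane and intersected with a passenger's estimated FOV, with a pixel margin standing in for the blending area 240. The pinhole projection, the calibration matrices K and rt, and the rectangular FOV returned by the head-tracking sensor are all assumptions for illustration.

```python
# Sketch only: compute which sub-region of the window to switch non-transparent.
import numpy as np

def project_to_window(points_vehicle, K, rt):
    """Project Nx3 points (vehicle frame) onto the window plane in pixel coordinates.
    K: 3x3 intrinsics of the virtual window 'camera'; rt: 3x4 extrinsics."""
    pts = np.hstack([points_vehicle, np.ones((len(points_vehicle), 1))])
    cam = (rt @ pts.T).T
    uv = (K @ cam.T).T
    return uv[:, :2] / uv[:, 2:3]

def region_to_blank(object_positions, passenger_fov_box, K, rt, margin_px=50):
    """passenger_fov_box: (u_min, v_min, u_max, v_max) estimated from head tracking.
    Returns the padded window sub-region to blank/blend, or None if the object
    never enters the passenger's view before the collision."""
    uv = project_to_window(np.asarray(object_positions, dtype=float), K, rt)
    u0, v0, u1, v1 = passenger_fov_box
    visible = uv[(uv[:, 0] >= u0) & (uv[:, 0] <= u1) &
                 (uv[:, 1] >= v0) & (uv[:, 1] <= v1)]
    if len(visible) == 0:
        return None
    lo = visible.min(axis=0) - margin_px
    hi = visible.max(axis=0) + margin_px
    return (lo[0], lo[1], hi[0], hi[1])
```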
  • FIG. 3 illustrates a more advanced and less intrusive perspective correction system 300 (134 in FIG. 1B) with the window 300 also used as a display screen.
  • the train 10 is equipped with panoramic cameras 14 to capture the entire scenery.
  • the window 300A first shows the actual scenery.
  • the window 300B may optionally then be blanked out by applying an electric signal so that it acts as a screen.
  • the window 300C may show a virtual image with the object to be collided with either removed or placed in a non-collision location.
  • the camera stream is preferably processed in real time for a better passenger experience, that is, at least 25 to 30 frames per second are processed and generated, so that the human eye does not perceive a non-smooth environment. However, even if fewer frames per second can be generated (e.g., just one static image) , this is still better for the passengers than witnessing a collision.
  • FIGs. 4A-4B illustrate a virtual image creation system 400 using a Generative Adversarial Network (GAN) 430.
  • the virtual image creation system 400 comprises a camera 410, a semantic segmentation system 420, a GAN 430, and a window display 440.
  • the camera 410 is configured to obtain original image data, as shown in the left-hand image of FIG. 4A.
  • the semantic segmentation system 420 is configured to process the image data to identify an obstacle in the original image, as shown in the middle image of FIG. 4A. This processing is disclosed as being achieved by semantic segmentation, but may alternatively be achieved by another known process, such as YOLO.
  • the identified object region is provided to a GAN 430 together with the original image.
  • the GAN 430 is configured to create a virtual image without the object, as shown in the right-hand image of FIG. 4A. After a correction of the perspective, to match the windows position, this image is then displayed on the window display 440.
  • This is similar to an IMAX half dome in which perspective projections are used.
  • an observant person might feel a difference when this virtual world projection is activated, as all of a sudden the human eyes need to focus on something at a shorter distance. However, this is an acceptable sacrifice compared with the passenger experiencing a collision with an animal or another human.
  • Another option is to not only remove the object from the image, but instead use an AI algorithm to create a scenario in which the obstacle is moving away from the track. This creates a virtual solution, omitting the crash.
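To make the object-removal step of FIGs. 4A-4B concrete, here is a hedged sketch in which a segmentation mask marks the obstacle and the masked region is filled in from its surroundings. The disclosure uses a GAN 430 for this fill; classical OpenCV inpainting stands in below purely as a placeholder, and the dilation kernel and inpainting radius are illustrative assumptions.

```python
# Placeholder for the GAN-based fill: mask the obstacle, then inpaint it away.
import cv2
import numpy as np

def remove_object(frame_bgr: np.ndarray, obstacle_mask: np.ndarray) -> np.ndarray:
    """frame_bgr: HxWx3 uint8 camera frame; obstacle_mask: HxW uint8, 255 on the obstacle."""
    # Dilate the mask slightly so the fill also covers the obstacle's soft edges.
    mask = cv2.dilate(obstacle_mask, np.ones((15, 15), np.uint8))
    # A GAN would synthesize plausible background here; cv2.inpaint approximates it.
    return cv2.inpaint(frame_bgr, mask, inpaintRadius=7, flags=cv2.INPAINT_TELEA)
```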
  • the aspects disclosed herein may be used to activate windows to become non-transparent so as to deter gawkers on highways. Gawkers unnecessarily slow down traffic to take pictures or videos of accidents or other bad situations. They also often hinder rescue and emergency personnel, thereby risking other lives. As a consequence, rescue and emergency personnel need to install blinds to avoid unnecessary hindrance.
  • the aspects of this disclosure may be deployed in vehicles 10 to address this issue in different ways.
  • the police may remotely activate the feature using vehicle-to-everything (V2X) communication paths.
  • the vehicle 10 itself may activate the feature by using map information 18, localization, and knowledge of rescue missions.
  • the feature activation will cause the side windows to either become non-transparent or, similar to the train embodiment, display a “normal” scene.
  • if the driver is not required to pay attention (an L3 vehicle or above) , even the windshield could be handled in the same manner.
  • otherwise, the windshield should remain transparent, but more and more vehicles in the future will use camera-based mirrors, which will allow activation of the aspects of the disclosure for the side windows.
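A hedged sketch of the two activation paths just described: a remote request from authorities over V2X, or self-activation when the vehicle's position falls within a geofence around a known rescue or accident site derived from map information 18. The message fields and the geofence radius are assumptions.

```python
# Sketch only: decide whether to blank the side windows near an incident.
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 coordinates."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def should_blank_side_windows(v2x_messages, vehicle_pos, incident_sites, radius_m=300.0):
    # Path 1: an authorized V2X message explicitly requests privacy blanking.
    if any(m.get("type") == "privacy_blank_request" and m.get("authorized")
           for m in v2x_messages):
        return True
    # Path 2: the vehicle is inside a geofence around a known rescue/accident site.
    return any(haversine_m(*vehicle_pos, *site) <= radius_m for site in incident_sites)
```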
  • FIG. 5 illustrates a schematic diagram of a passenger enjoyment system overview 500 in accordance with aspects of the disclosure.
  • the system may highlight, translate, and bookmark an image of scenery along a traveling route.
  • the vehicle surroundings are captured in images using cameras 12, and optionally also as sounds using microphones 52.
  • the vehicle 10 is driving in a middle lane with its view blocked by vehicles in adjacent lanes.
  • the captured view is shown on one or more vehicle displays 54 (e.g., organic light-emitting diode (OLED) monitor, projector, AR/XR glasses, etc. ) .
  • the passenger can view the scenery through both the display 54 and the vehicle window.
  • a passenger interaction system 510 allows a vehicle passenger to obtain additional information about the scenery by pointing to a point of interest or clicking on the point of interest on the display 54. Then, the system modifies the image to caption or highlight the point of interest on the display 54, and may additionally translate the caption to another language.
  • the target language for the translation may be determined based on the scenery image, audio, and/or the passenger profile.
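As a small illustration of the point-and-click interaction above, the sketch below maps a tap on the display 54 to the detected object under it and attaches a caption in the passenger's preferred language. The detection format, the profile fields, and the translate () callback are hypothetical placeholders.

```python
# Sketch only: tap -> object under the finger -> captioned, optionally translated overlay.
def find_tapped_object(tap_xy, detections):
    """detections: list of dicts with 'box' = (x0, y0, x1, y1) and 'label'."""
    x, y = tap_xy
    hits = [d for d in detections
            if d["box"][0] <= x <= d["box"][2] and d["box"][1] <= y <= d["box"][3]]
    # Prefer the smallest box, i.e. the most specific object under the tap.
    return min(hits, default=None,
               key=lambda d: (d["box"][2] - d["box"][0]) * (d["box"][3] - d["box"][1]))

def build_overlay(tap_xy, detections, passenger_profile, translate):
    obj = find_tapped_object(tap_xy, detections)
    if obj is None:
        return None
    caption = f"This is a {obj['label']}."
    language = passenger_profile.get("language", "en")
    return {"box": obj["box"], "caption": translate(caption, language)}
```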
  • a scenery bookmark system 530 may record and organize the images as personalized bookmarks, which may be referred to during a subsequent trip.
  • FIG. 6 illustrates a schematic diagram of a vehicle passenger enjoyment system 600 in accordance with aspects of the disclosure.
  • the passenger enjoyment system 600 allows vehicle passengers to have an improved traveling experience.
  • the system 600 comprises passenger profiles 610, a point of interest extraction system 620 (more detail in FIG. 7) , a scenery caption generation system 630 (more detail in FIG. 8) , a bookmark generation system (more detail in FIG. 9) , and a content sharing system 650 (more detail in FIG. 10) .
  • cameras 12, 14 and microphones 52 capture a vehicle’s surroundings, both visually and audibly.
  • a first input is the video/image/audio data that captures scenery information.
  • the second input is the location, path, and motion plan that autonomous vehicles use for navigation.
  • the third input is passenger interest based on the human-vehicle interactions 510 (e.g., gaze 512, emotion 514, heart rate 516, pointing direction, touching a display location, etc. ) .
  • the outputs are scenery captions (e.g., text 632, audio 636, and/or highlighted regions in images/videos 638) , which are sent to vehicle passengers based on respective passenger profiles 610 (e.g., child 612, elderly 614, visually impaired 616, etc. ) .
  • the passenger may receive a modified image that is zoomed in to a point of interest for a better view and/or audio explanation 636.
  • the system 600 may also be configured to bookmark any recorded location, image, video, or audio.
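The sketch below illustrates one plausible way to route the generated outputs (text 632, audio 636, highlighted image 638) according to the passenger profiles 610 mentioned above. The profile categories follow the figure; the routing rules themselves are assumptions.

```python
# Sketch only: choose output modalities per passenger profile.
def route_outputs(caption_text, caption_audio, highlighted_image, profile):
    outputs = {}
    category = profile.get("category")
    if category == "visually_impaired":
        outputs["audio"] = caption_audio          # audio explanation 636
    elif category == "child":
        outputs["image"] = highlighted_image      # zoomed/highlighted region 638
        outputs["audio"] = caption_audio
    else:
        outputs["text"] = caption_text            # on-display caption text 632
        outputs["image"] = highlighted_image
    return outputs
```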
  • FIG. 7 illustrates a schematic diagram of a point of interest extraction system 700 in accordance with aspects of the disclosure.
  • a point of interest is, for example, natural scenery, historical building, etc.
  • the point of interest extraction system 700 comprises a passenger detection system 710, a region of interest (ROI) extraction system 720, an event of interest (EOI) extraction system 740, and an object type of interest (OTI) extraction system 760.
  • the passenger detection system 710 is configured to detect by any known method (e.g., RFID tag, NFC, etc. ) which passenger has entered the vehicle 10 so that the passenger’s profile may be retrieved.
  • a ROI is a sub-region of an image.
  • An EOI is an event happening on the road. An OTI is a type of object.
  • the ROI extraction system 720 is configured to output a modified image with a sub-region of the received image highlighted 730.
  • the EOI extraction system 740 is configured to output a modified image of an event, such as a traffic jam, traffic accident, emergency such as gun shot, etc.
  • the OTI extraction system 760 is configured to output a modified image of an object of interest, such as a plant, animal, building, etc.
  • the inputs to point of interest extraction system 700 include human-vehicle interactions 510, passenger profiles 610, video/image/audio signals, and navigation information.
  • the human-vehicle interactions 510 may comprise eye gaze tracking 512, emotion recognition 514, heart rate 516, etc.
  • the passenger profiles 610 indicate, for example, whether a vehicle passenger prefers an audio or video service, the scenery types of interest, etc.
  • the profile may be generated manually by the vehicle passenger or traffic work force, or may be generated automatically by a machine learning algorithm.
  • the video/image/audio signals are of vehicle surroundings.
  • the navigation information comprises location, map, and/or motion plans. Based on these inputs, the point of interest (POI) can be determined by any known process, such as a classifier (offline or online) .
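A minimal sketch of combining these inputs to pick a point of interest: object detections from the scenery camera, the gaze point from the human-vehicle interaction system 510, and the profile's preferred scenery types. The scoring weights are illustrative assumptions, not a disclosed method.

```python
# Sketch only: score detected objects by confidence, profile preference, and gaze proximity.
def select_point_of_interest(detections, gaze_xy, profile):
    """detections: list of dicts with 'center' = (x, y), 'label', and 'confidence'."""
    preferred = set(profile.get("scenery_types_of_interest", []))

    def score(d):
        gx, gy = gaze_xy
        cx, cy = d["center"]
        gaze_distance = ((gx - cx) ** 2 + (gy - cy) ** 2) ** 0.5
        preference_bonus = 0.5 if d["label"] in preferred else 0.0
        return d["confidence"] + preference_bonus - 0.001 * gaze_distance

    return max(detections, key=score, default=None)
```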
  • FIG. 8 illustrates a schematic diagram of a scenery caption generation system 800 in accordance with aspects of the disclosure.
  • the inputs to scenery caption generation system 800 comprise the point of interest extraction results 730, 750, 770, the scenery video/image/audio signals, and the navigation information (e.g., location, path, and motion plans) .
  • the outputs may comprise text, sentences, speech, and/or ROIs of images describing the scenery, generated using, for example, deep learning algorithms.
  • An example of a generated text caption is: “There are few vehicles on the road. It starts raining and there are maple trees on the roadside. Dogs are sitting on the roadside. ”
  • a passenger attention selection module 840 is configured to extract and select the vehicle passenger’s attention based on the vehicle passenger’s interests as indicated in the corresponding profile.
  • the scenery is represented by a semantic graph 850 and a geometry graph 860.
  • the semantic graph 850 is configured to describe object and event types, such as there are maple trees, dogs, people, etc.
  • the geometry graph 860 is configured to describe the location relationship between objects and events, for example, the sky is above the tree, building A is to the east of building B, etc.
  • An encoder-decoder 870 is configured to generate a language description. There may be multiple decoders, such as Long Short-Term Memory (LSTM) networks, transformers, or the like.
  • the encoder may employ a convolutional neural network (CNN) , which extracts objects and features from an image or video frame.
  • the decoder may employ a neural network that generates a natural sentence based on the available information.
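The following compact PyTorch sketch mirrors the encoder-decoder arrangement described above: a small CNN encoder turns the scenery frame into a feature vector and an LSTM decoder emits a token sequence. The layer sizes, vocabulary size, and setup are assumptions; a production system would likely use a pretrained backbone and could swap the LSTM for a transformer decoder as noted above.

```python
# Sketch only: CNN encoder + LSTM decoder for scenery caption generation.
import torch
import torch.nn as nn

class CaptionModel(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.encoder = nn.Sequential(               # toy CNN encoder
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.decoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, images, captions):
        """images: (B, 3, H, W); captions: (B, T) token ids (teacher forcing)."""
        img_feat = self.encoder(images).unsqueeze(1)        # (B, 1, embed_dim)
        tokens = self.embed(captions)                       # (B, T, embed_dim)
        inputs = torch.cat([img_feat, tokens], dim=1)       # prepend the image "token"
        hidden, _ = self.decoder(inputs)
        return self.head(hidden)                            # (B, T+1, vocab_size) logits

# Usage sketch:
# logits = CaptionModel()(torch.rand(2, 3, 224, 224), torch.randint(0, 5000, (2, 12)))
```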
  • FIG. 9 illustrates a schematic diagram of a bookmark generation system 900 in accordance with aspects of the disclosure.
  • the bookmark generation system 900 is configured to bookmark documented evidence of the trip.
  • the system 900 is configured to record images, videos, and/or audio based on the vehicle passenger points of interest.
  • the system 900 is also configured to record associated conversations regarding the points of interest, for example, “I used to come to this beach during my high school time, ” or “This is the church where my parents got married. ” Similarly, recordings may be recalled from the bookmark database when the vehicle subsequently travels by related points of interest.
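A hedged sketch of the bookmark behaviour described above: media and conversation snippets are stored keyed by GPS position and recalled when the vehicle later passes nearby. The storage format and the 200 m recall radius are assumptions.

```python
# Sketch only: location-keyed bookmarks with proximity-based recall.
import math
import time

def _distance_m(p1, p2):
    """Approximate great-circle distance in meters between two (lat, lon) pairs."""
    lat1, lon1 = map(math.radians, p1)
    lat2, lon2 = map(math.radians, p2)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(a))

class BookmarkStore:
    def __init__(self, recall_radius_m=200.0):
        self.recall_radius_m = recall_radius_m
        self.entries = []

    def add(self, position, media_path, transcript=""):
        self.entries.append({
            "position": position,        # (lat, lon) where the bookmark was taken
            "media": media_path,         # recorded image/video/audio
            "transcript": transcript,    # e.g. "This is the church where ..."
            "timestamp": time.time(),
        })

    def recall_nearby(self, position):
        return [e for e in self.entries
                if _distance_m(position, e["position"]) <= self.recall_radius_m]
```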
  • FIG. 10 illustrates a schematic diagram of a content sharing system 1000 in accordance with aspects of the disclosure.
  • the content sharing system 1000 allows vehicle passengers with a same view to share content, enabled by a client-server, peer-to-peer, or other network architecture.
  • the content sharing system 1000 allows a vehicle passenger to share a sub-set of content for their specific use cases. For example, vehicle passenger A might want to share a house image and related information with vehicle passenger B.
  • the specific ROI/EOI/OTI information 1010 holding vehicle passenger A’s attention, which is the house, is displayed in vehicle passenger B’s view, and vehicle passenger B is allowed to edit, add, or delete it and to communicate with vehicle passenger A about it.
  • the changes can be made either locally (e.g., at vehicle passenger B only) or globally (e.g., at both vehicle passengers A and B) .
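A minimal sketch of the local-versus-global edit scope described above: passenger B can annotate the shared ROI either only in their own view or in both passengers' views. The data structures are assumptions.

```python
# Sketch only: shared content with per-passenger (local) and shared (global) annotations.
class SharedContent:
    def __init__(self, roi, info=None):
        self.roi = roi                   # e.g. the house ROI/EOI/OTI information 1010
        self.info = info
        self.global_annotations = []     # visible to both passengers A and B
        self.local_annotations = {}      # passenger_id -> private notes

    def annotate(self, passenger_id, text, scope="local"):
        if scope == "global":
            self.global_annotations.append((passenger_id, text))
        else:
            self.local_annotations.setdefault(passenger_id, []).append(text)

    def view_for(self, passenger_id):
        private = [(passenger_id, t) for t in self.local_annotations.get(passenger_id, [])]
        return self.global_annotations + private
```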
  • image as defined herein may also encompass a portion of a video, which is a series of images.
  • FIG. 11 illustrates a block diagram of an exemplary computing device in accordance with aspects of the disclosure.
  • the computing device 1100 as shown and described with respect to FIG. 11 may be identified with a central controller and may be implemented as any suitable network infrastructure component, such as an Edge network server, controller, computing device, etc.
  • the computing device 1100 may serve the environment in accordance with the various techniques as discussed herein.
  • the computing device 1100 may perform the various functionality as described herein.
  • the computing device 1100 may include processing circuitry 1102, a transceiver 1104, communication interface 1106, and a memory 1108.
  • the components shown in FIG. 11 are provided for ease of explanation, and the computing device 1100 may implement additional, fewer, or alternative components to those shown in FIG. 11.
  • the processing circuitry 1102 may be configured as any suitable number and/or type of computer processors, which may function to control the computing device 1100.
  • the processing circuitry 1102 may be identified with one or more processors (or suitable portions thereof) implemented by the computing device 1100.
  • the processing circuitry 1102 may be identified with one or more processors such as a host processor, a digital signal processor, one or more microprocessors, graphics processors, baseband processors, microcontrollers, an application-specific integrated circuit (ASIC) , part (or the entirety of) a field-programmable gate array (FPGA) , etc.
  • the processing circuitry 1102 may be configured to carry out instructions to perform arithmetical, logical, and/or input/output (I/O) operations, and/or to control the operation of one or more components of computing device 1100 to perform various functions as described herein.
  • the processing circuitry 1102 may include one or more microprocessor cores, memory registers, buffers, clocks, etc., and may generate electronic control signals associated with the components of the computing device 1100 to control and/or modify the operation of these components.
  • the processing circuitry 1102 may communicate with and/or control functions associated with the transceiver 1104, the communication interface 1106, and/or the memory 1108.
  • the processing circuitry 1102 may additionally perform various operations to control the communications, communications scheduling, and/or operation of other network infrastructure components that are communicatively coupled to the computing device 1100.
  • the transceiver 1104 may be implemented as any suitable number and/or type of components configured to transmit and/or receive data packets and/or wireless signals in accordance with any suitable number and/or type of communication protocols.
  • the transceiver 1104 may include any suitable type of components to facilitate this functionality, including components associated with known transceiver, transmitter, and/or receiver operation, configurations, and implementations. Although depicted in FIG. 11 as a transceiver, the transceiver 1104 may include any suitable number of transmitters, receivers, or combinations of these that may be integrated into a single transceiver or as multiple transceivers or transceiver modules.
  • the transceiver 1104 may include components typically identified with an RF front end and include, for example, antennas, ports, power amplifiers (PAs) , RF filters, mixers, local oscillators (LOs) , low noise amplifiers (LNAs) , up-converters, down-converters, channel tuners, etc.
  • the communication interface 1106 may be configured as any suitable number and/or type of components configured to facilitate the transceiver 1104 receiving and/or transmitting data and/or signals in accordance with one or more communication protocols, as discussed herein.
  • the communication interface 1106 may be implemented as any suitable number and/or type of components that function to interface with the transceiver 1104, such as analog-to-digital converters (ADCs) , digital-to-analog converters, intermediate frequency (IF) amplifiers and/or filters, modulators, demodulators, baseband processors, etc.
  • the communication interface 1106 may thus work in conjunction with the transceiver 1104 and form part of an overall communication circuitry implemented by the computing device 1100, which may be used to transmit commands and/or control signals to the AMRs 111 to execute any of the functions described herein.
  • the memory 1108 is configured to store data and/or instructions that, when executed by the processing circuitry 1102, cause the computing device 1100 to perform various functions as described herein.
  • the memory 1108 may be implemented as any well-known volatile and/or non-volatile memory, including, for example, read-only memory (ROM) , random access memory (RAM) , flash memory, magnetic storage media, an optical disc, erasable programmable read only memory (EPROM) , programmable read only memory (PROM) , etc.
  • the memory 1108 may be non-removable, removable, or a combination of both.
  • the memory 1108 may be implemented as a non-transitory computer readable medium storing one or more executable instructions such as, for example, logic, algorithms, code, etc.
  • the instructions, logic, code, etc., stored in the memory 1108 are represented by the various modules/engines as shown in FIG. 11.
  • the modules/engines shown in FIG. 11 associated with the memory 1108 may include instructions and/or code to facilitate control and/or monitoring of the operation of such hardware components.
  • the modules/engines as shown in FIG. 11 are provided for ease of explanation regarding the functional association between hardware and software components.
  • the processing circuitry 1102 may execute the instructions stored in these respective modules/engines in conjunction with one or more hardware components to perform the various functions as discussed herein.
  • the term “model” as used herein may be understood as any kind of algorithm that provides output data from input data (e.g., any kind of algorithm generating or calculating output data from input data) .
  • a machine learning model may be executed by a computing system to progressively improve performance of a specific task.
  • parameters of a machine learning model may be adjusted during a training phase based on training data.
  • a trained machine learning model may be used during an inference phase to make predictions or decisions based on input data.
  • the trained machine learning model may be used to generate additional training data.
  • An additional machine learning model may be adjusted during a second training phase based on the generated additional training data.
  • a trained additional machine learning model may be used during an inference phase to make predictions or decisions based on input data.
  • the machine learning models described herein may take any suitable form or utilize any suitable technique (e.g., for training purposes) .
  • any of the machine learning models may utilize supervised learning, semi-supervised learning, unsupervised learning, or reinforcement learning techniques.
  • the model may be built using a training set of data including both the inputs and the corresponding desired outputs (illustratively, each input may be associated with a desired or expected output for that input) .
  • Each training instance may include one or more inputs and a desired output.
  • Training may include iterating through training instances and using an objective function to teach the model to predict the output for new inputs (illustratively, for inputs not included in the training set) .
  • a portion of the inputs in the training set may be missing the respective desired outputs (e.g., one or more inputs may not be associated with any desired or expected output) .
  • the model may be built from a training set of data including only inputs and no desired outputs.
  • the unsupervised model may be used to find structure in the data (e.g., grouping or clustering of data points) , illustratively, by discovering patterns in the data.
  • Techniques that may be implemented in an unsupervised learning model may include, e.g., self-organizing maps, nearest-neighbor mapping, k-means clustering, and singular value decomposition.
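As a concrete instance of the unsupervised techniques listed above, the snippet below groups unlabeled feature vectors with k-means; the data and feature choice are purely illustrative and not taken from the disclosure.

```python
# Sketch only: discover structure in unlabeled data with k-means clustering.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
features = np.vstack([rng.normal(0.0, 0.5, (50, 2)),    # e.g. one kind of scenery frame
                      rng.normal(3.0, 0.5, (50, 2))])   # e.g. another kind
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
print(np.bincount(labels))   # roughly 50 frames per discovered group
```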
  • Reinforcement learning models may include positive or negative feedback to improve accuracy.
  • a reinforcement learning model may attempt to maximize one or more objectives/rewards.
  • Techniques that may be implemented in a reinforcement learning model may include, e.g., Q-learning, temporal difference (TD) , and deep adversarial networks.
  • Various aspects described herein may utilize one or more classification models.
  • the outputs may be restricted to a limited set of values (e.g., one or more classes) .
  • the classification model may output a class for an input set of one or more input values.
  • An input set may include sensor data, such as image data, radar data, LIDAR data and the like.
  • a classification model as described herein may, for example, classify certain driving conditions and/or environmental conditions, such as weather conditions, road conditions, and the like.
  • references herein to classification models may contemplate a model that implements, e.g., any one or more of the following techniques: linear classifiers (e.g., logistic regression or naive Bayes classifier) , support vector machines, decision trees, boosted trees, random forest, neural networks, or nearest neighbor.
  • a regression model may output a numerical value from a continuous range based on an input set of one or more values (illustratively, starting from or using an input set of one or more values) .
  • References herein to regression models may contemplate a model that implements, e.g., any one or more of the following techniques (or other suitable techniques) : linear regression, decision trees, random forest, or neural networks.
  • a machine learning model described herein may be or may include a neural network.
  • the neural network may be any kind of neural network, such as a convolutional neural network, an autoencoder network, a variational autoencoder network, a sparse autoencoder network, a recurrent neural network, a deconvolutional network, a generative adversarial network, a forward thinking neural network, a sum-product neural network, and the like.
  • the neural network may include any number of layers.
  • the training of the neural network (e.g., adapting the layers of the neural network) may use or may be based on any kind of training principle, such as backpropagation (e.g., using the backpropagation algorithm) .
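A minimal sketch of adapting the layers of a small neural network with backpropagation, as referenced above; the toy data, architecture, and hyperparameters are illustrative assumptions.

```python
# Sketch only: train a tiny network with backpropagation on toy data.
import torch
import torch.nn as nn

x = torch.rand(128, 4)                          # toy input features
y = (x.sum(dim=1, keepdim=True) > 2).float()    # toy binary target

net = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)
loss_fn = nn.BCEWithLogitsLoss()

for _ in range(200):
    optimizer.zero_grad()
    loss = loss_fn(net(x), y)
    loss.backward()        # backpropagation computes the gradients
    optimizer.step()       # gradient step adapts the layer parameters
```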
  • Example 1 An apparatus, comprising: an interface for receiving image data in real-time of a surroundings of a vehicle; processing circuitry for: identifying a point of interest within the received image data; generating modified image data based on the received image data and the identified point of interest; and transmitting the modified image data to be displayed to a vehicle passenger.
  • Example 2 The apparatus of example 1, wherein: the point of interest is an object; and the processing circuitry is further for: detecting whether the vehicle will probably collide with the object; and if it is detected that the vehicle will probably collide with the object, generating the modified image data to obscure a collision of the vehicle with the object from the vehicle passenger.
  • Example 3 The apparatus of any of examples 1-2, wherein the processing circuitry, in detecting whether the vehicle will probably collide with the object, is further for: classifying a type of the object; predicting a motion of the object; estimating, based on the type and/or predicted motion of the object, a collision probability and/or collision severity; and generating the modified image data if the estimated collision probability and/or collision severity is greater than a predetermined threshold.
  • Example 4 The apparatus of any of examples 1-3, wherein the processing circuitry is further for: estimating the collision probability and/or collision severity based on map information and/or surroundings state information.
  • Example 5 The apparatus of any of examples 1-4, wherein the processing circuitry, in generating the modified image data to obscure the collision from the vehicle passenger, is further for: removing the object from the received image data; moving the object in the received image data to a different location within the modified image data; and/or outputting a signal to cause at least a portion of a window of the vehicle to be non-transparent.
  • Example 6 The apparatus of any of examples 1-5, wherein: the point of interest is an object; and the processing circuitry is further for: determining whether to generate modified image data based on a type of the object, a predicted motion of the object, an estimated probability of a collision between the vehicle and the object, and/or estimated severity of a collision between the vehicle and the object.
  • Example 7 The apparatus of any of examples 1-6, wherein the processing circuitry, in generating the modified image data to obscure the collision from the vehicle passenger, is further for: estimating, based on head-tracking image data received from a passenger head-tracking sensor, a field-of-view (FOV) of the passenger; correlating the FOV with positions of the object at respective points in time; and generating modified image data to modify the received image data within an intersection of the FOVs and correlated object positions.
  • Example 8 The apparatus of any of examples 1-7, wherein the modified image data or the received image data is displayed to the vehicle passenger in a window of the vehicle.
  • Example 9 The apparatus of any of examples 1-8, wherein when the received image data is not modified, the window of the vehicle is transparent.
  • Example 10 The apparatus of any of examples 1-9, wherein when the received image data is modified, the window of the vehicle is non-transparent.
  • Example 11 The apparatus of any of examples 1-10, wherein the processing circuitry is further for: identifying the point of interest based on a vehicle passenger profile, a vehicle passenger action, vehicle passenger speech, and/or a vehicle passenger emotion.
  • Example 12 The apparatus of any of examples 1-11, wherein the processing circuitry is further for: generating information about the point of interest, based on a vehicle passenger profile or a vehicle passenger action.
  • Example 13 The apparatus of any of examples 1-12, wherein the generated modified image data comprises a visual highlight of the point of interest.
  • Example 14 The apparatus of any of examples 1-13, wherein the generated modified image data comprises textual information related to the point of interest.
  • Example 15 The apparatus of any of examples 1-14, wherein the processing circuitry is further for: translating the generated information into audio information.
  • Example 16 The apparatus of any of examples 1-15, wherein the processing circuitry is further for: generating a bookmark of the point of interest.
  • Example 17 The apparatus of any of examples 1-16, wherein the point of interest is a region of interest (ROI) , an event of interest (EOI) , or an object type of interest (OTI) .
  • Example 18 The apparatus of any of examples 1-17, wherein the processing circuitry is further for: sharing the modified image data to be displayed to a person other than the vehicle passenger.
  • Example 19 The apparatus of any of examples 1-18, wherein the processing circuitry is further for: generating caption information about the point of interest, wherein the modified image data comprises the generated caption information.
  • Example 20 The apparatus of any of examples 1-19, wherein the processing circuitry is further for: identifying a profile related to the vehicle passenger.
  • Example 21 An autonomous system, comprising: the apparatus of any of examples 1-20.
  • Example 22 The autonomous system of any of examples 1-21, further comprising: a display for displaying the modified image data.
  • Example 23 The autonomous system of any of examples 1-22, wherein the display is comprised within a window of the vehicle.
  • Example 24 A component of a system, comprising: processing circuitry; and a non-transitory computer-readable storage medium including instructions that, when executed by the processing circuitry, cause the processing circuitry to: identify a point of interest within image data received in real-time of a surroundings of a vehicle; generate modified image data based on the received image data and the identified point of interest; and transmit the modified image data to be displayed to a vehicle passenger.
  • Example 25 The component of example 24, wherein the point of interest is an object, and the instructions further cause the processing circuitry to: detect whether the vehicle will probably collide with the object; and if it is detected that the vehicle will probably collide with the object, generate the modified image data to obscure a collision of the vehicle with the object from the vehicle passenger.
  • Example 26 An apparatus, comprising: an interface configured to receive image data in real-time of a surroundings of a vehicle; processing circuitry configured to: identify a point of interest within the received image data; generate modified image data based on the received image data and the identified point of interest; and transmit the modified image data to be displayed to a vehicle passenger.
  • Example 27 The apparatus of example 26, wherein: the point of interest is an object; and the processing circuitry is further configured to: detect whether the vehicle will probably collide with the object; and if it is detected that the vehicle will probably collide with the object, generate the modified image data to obscure a collision of the vehicle with the object from the vehicle passenger.
  • Example 28 The apparatus of example 27, wherein the processing circuitry, in detecting whether the vehicle will probably collide with the object, is further configured to: classify a type of the object; predict a motion of the object; estimate, based on the type and/or predicted motion of the object, a collision probability and/or collision severity; and generate the modified image data if the estimated collision probability and/or collision severity is greater than a predetermined threshold.
  • Example 29 The apparatus of example 28, wherein the processing circuitry is further configured to: estimate the collision probability and/or collision severity based on map information and/or surroundings state information.
  • Example 30 The apparatus of example 27, wherein the processing circuitry, in generating the modified image data to obscure the collision from the vehicle passenger, is further configured to: remove the object from the received image data; move the object in the received image data to a different location within the modified image data; and/or output a signal to cause at least a portion of a window of the vehicle to be non-transparent.
  • Example 31 The apparatus of example 26, wherein: the point of interest is an object; and the processing circuitry is further configured to: determine whether to generate modified image data based on a type of the object, a predicted motion of the object, an estimated probability of a collision between the vehicle and the object, and/or estimated severity of a collision between the vehicle and the object.
  • Example 32 The apparatus of example 27, wherein the processing circuitry, in generating the modified image data to obscure the collision from the vehicle passenger, is further configured to: estimate, based on head-tracking image data received from a passenger head-tracking sensor, a field-of-view (FOV) of the passenger; correlate the FOV with positions of the object at respective points in time; and generate modified image data to modify the received image data within an intersection of the FOVs and correlated object positions.
  • Example 33 The apparatus of example 26, wherein the modified image data or the received image data is displayed to the vehicle passenger in a window of the vehicle.
  • Example 34 The apparatus of example 33, wherein when the received image data is not modified, the window of the vehicle is transparent.
  • Example 35 The apparatus of example 33, wherein when the received image data is modified, the window of the vehicle is non-transparent.
  • Example 36 The apparatus of example 26, wherein the processing circuitry is further configured to: identify the point of interest based on a vehicle passenger profile, a vehicle passenger action, vehicle passenger speech, and/or a vehicle passenger emotion.
  • Example 37 The apparatus of example 26, wherein the processing circuitry is configured to: generate information about the point of interest, based on a vehicle passenger profile or a vehicle passenger action.
  • Example 38 The apparatus of example 37, wherein the generated modified image data comprises a visual highlight of the point of interest.
  • Example 39 The apparatus of example 37, wherein the generated modified image data comprises textual information related to the point of interest.
  • Example 40 The apparatus of example 37, wherein the processing circuitry is further configured to: translate the generated information into audio information.
  • Example 41 The apparatus of example 36, wherein the processing circuitry is further configured to: generate a bookmark of the point of interest.
  • Example 42 The apparatus of example 36, wherein the point of interest is a region of interest (ROI) , an event of interest (EOI) , or an object type of interest (OTI) .
  • Example 43 The apparatus of example 36, wherein the processing circuitry is further configured to: share the modified image data to be displayed to a person other than the vehicle passenger.
  • Example 44 The apparatus of example 26, wherein the processing circuitry is further configured to: generate caption information about the point of interest, wherein the modified image data comprises the generated caption information.
  • Example 45 The apparatus of example 36, wherein the processing circuitry is further configured to: identify a profile related to the vehicle passenger.
  • Example 46 An autonomous system, comprising: the apparatus of example 26.
  • Example 47 The autonomous system of example 46, further comprising: a display configured to display the modified image data.
  • Example 48 The autonomous system of example 47, wherein the display is comprised within a window of the vehicle.
  • Example 49 A component of a system, comprising: processing circuitry; and a non-transitory computer-readable storage medium including instructions that, when executed by the processing circuitry, cause the processing circuitry to: identify a point of interest within image data received in real-time of a surroundings of a vehicle; generate modified image data based on the received image data and the identified point of interest; and transmit the modified image data to be displayed to a vehicle passenger.
  • Example 50 The component of example 49, wherein the point of interest is an object, and the instructions further cause the processing circuitry to: detect whether the vehicle will probably collide with the object; and if it is detected that the vehicle will probably collide with the object, generate the modified image data to obscure a collision of the vehicle with the object from the vehicle passenger.

Landscapes

  • Engineering & Computer Science (AREA)
  • Mechanical Engineering (AREA)
  • Multimedia (AREA)
  • Traffic Control Systems (AREA)

Abstract

An apparatus, including: an interface configured to receive image data in real-time of a surroundings of a vehicle; processing circuitry configured to: identify a point of interest within the received image data; generate modified image data based on the received image data and the identified point of interest; and transmit the modified image data to be displayed to a vehicle passenger.

Description

VEHICLE PASSENGER DISPLAY MODIFICATION
Technical Field
Aspects described herein generally relate to a passenger vehicle visual augmentation system, and more particularly, to a passenger vehicle visual augmentation system for enhancing a passenger experience.
Background
Vehicle accidents are common. In Germany in 2020, there were 430 accidents injuring more than 500 people. And there were additional collisions with animals that went unreported. Unfortunately, even with the most advanced sensor technology, these events cannot be eliminated. Vehicles, such as trains, travelling at speeds above 150 km/h have stopping distances of several hundred meters up to a few kilometers, which is beyond sensor range.
Collisions are traumatic events for eyewitnesses. While today this is typically a single train driver, in future automated trains the number of eyewitnesses could be more than a handful of people sitting next to the windshield in the first coach vehicle. This is an attractive seating location for children as it offers the best view. Thus, solutions are desired to avoid people becoming eyewitnesses to collisions and the resulting psychological consequences.
In addition, vehicle passengers like to enjoy scenery along the traveling path. Scenery enjoyment needs vary depending on the particular passenger or situation. Child passengers may have questions about a type of tree, elderly and visually-impaired passengers may have poor eyesight, and any passenger may have a view blocked by a person or another vehicle (e.g., when the target vehicle is in a middle lane and the passenger view is blocked by a vehicle in a side lane) . Thus, solutions are desired to enhance passenger travel experience.
Description of the Drawings
FIGs. 1A-1C illustrate schematic diagrams of a system in accordance with aspects of the disclosure.
FIGs. 2A and 2B illustrate schematic diagrams of transparent window regions in accordance with aspects of the disclosure.
FIG. 3 illustrates a window optionally used as a screen in accordance with aspects of the disclosure.
FIGs. 4A-4B illustrate a virtual image creation system using a Generative Adversarial Network (GAN) .
FIG. 5 illustrates a schematic diagram of a passenger enjoyment system overview in accordance with aspects of the disclosure.
FIG. 6 illustrates a schematic diagram of a vehicle passenger enjoyment system in accordance with aspects of the disclosure.
FIG. 7 illustrates a schematic diagram of a point of interest extraction system in accordance with aspects of the disclosure.
FIG. 8 illustrates a schematic diagram of a scenery caption generation system in accordance with aspects of the disclosure.
FIG. 9 illustrates a schematic diagram of a bookmark generation system in accordance with aspects of the disclosure.
FIG. 10 illustrates a schematic diagram of a content sharing system in accordance with aspects of the disclosure.
FIG. 11 illustrates a block diagram of an exemplary computing device in accordance with aspects of the disclosure.
Description of Aspects of the Disclosure
I.  System 100
FIGs. 1A-1C illustrate schematic diagrams of a system 100 in accordance with aspects of the disclosure.
A.  Overview
The system 100 is configured to protect vehicle passengers from trauma due to a collision of the vehicle 10 with an object. The system 100 uses an object detection system 110 to perceive a surroundings of the vehicle (vehicle outside environment) , coupled with a collision detection system 120 and a passenger protection system 130 to virtually eliminate a collision obstacle from the perceived environment.
The vehicle 10 may be, for example, a train. It is likely not possible to detect an obstacle (or other point of interest) early enough for a high-speed train to stop in time to avoid a collision. The train 10 is equipped with one or more sensors 12, 14 to detect the object in front of the train. If the collision detection system 120 detects a potential collision, the passenger protection system 130 may be activated. In normal operation, the train windshield may be a transparent window. But if the object detection system 110 and the collision detection system 120 determine that a collision is inevitable, the passenger protection system 130 is configured to manipulate the windshield to display an augmented reality. This augmented reality may be either an image of the current vehicle environment with the object removed, or a non-transparent screen, to prevent a front automated-train passenger or a train driver from experiencing trauma due to witnessing the collision.
The vehicle 10 is not limited to being a train, but may alternatively be an airplane (e.g., to avoid panic about an engine failure) , an automobile, or any other suitable vehicle. Also, the system 100 may additionally or alternatively be configured to project information of interest (e.g., city name, region-specific advertisement, etc. ) .
With this overview in mind, a more detailed explanation of the operation of the system 100, using a passenger train as an example, follows.
The system 100 comprises an object detection system 110, a collision detection system 120, a passenger protection system 130, and/or a passenger enjoyment system 600.
B.  Object Detection System 110
The object detection system 110 receives sensor data from onboard sensors 12, 14 (e.g., RGB camera, LiDAR, radar, etc. ) that may be mounted in the front of the train 10 to detect potential obstacles on the track or objects that are about to cross the track. In addition, the object detection system 110 may receive sensor data from infrastructure sensors 16 (e.g., digital twin) . Using infrastructure information can improve the responsiveness of the system 100 because a possible collision can be detected earlier with such off-board sensor data than with only on-board sensors.
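By way of a non-limiting illustration, the following sketch shows how off-board (infrastructure/digital-twin) detections might be merged with on-board detections so that objects beyond on-board sensor range are considered earlier. The detection format, identifiers, and range value are assumptions for illustration only and are not specified by the disclosure.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """A detected object in track coordinates (hypothetical format)."""
    object_id: str
    distance_m: float      # distance ahead of the train along the track
    source: str            # "onboard" or "infrastructure"

def fuse_detections(onboard, infrastructure, max_onboard_range_m=400.0):
    """Merge on-board and off-board (digital twin) detections.

    Infrastructure sensors can report objects beyond on-board sensor range,
    which is what allows the system to react earlier.
    """
    fused = {d.object_id: d for d in onboard}
    for d in infrastructure:
        # Add infrastructure detections that on-board sensors do not yet see.
        fused.setdefault(d.object_id, d)
    return list(fused.values())

# Example: an obstacle 900 m ahead is only visible via the digital twin.
onboard = [Detection("car-7", 180.0, "onboard")]
infra = [Detection("car-7", 181.0, "infrastructure"),
         Detection("truck-3", 900.0, "infrastructure")]
print([d.object_id for d in fuse_detections(onboard, infra)])   # ['car-7', 'truck-3']
```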
C.  Collision Detection System 120
The collision detection system 120 comprises an object classification system 122, an object motion and prediction system 124, and a collision probability and severity estimation system 126.
1.  Object Classification System 122
The object classification system 122 is configured to use any known classification approach to classify the detected object as a person, vehicle, animal, unknown object, etc.
2.  Object Motion and Prediction System 124
The object motion and prediction system 124 is configured to estimate the object’s motion, predict future movement, and output an object list with corresponding trajectories. The object’s motion may be estimated using any known object tracker. The future movement may be predicted using any known solution.
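As a non-limiting stand-in for the "any known solution" for future movement prediction, the following sketch rolls an estimated object state forward with a constant-velocity model. The state format, horizon, and time step are assumptions for illustration.

```python
import numpy as np

def predict_trajectory(position, velocity, horizon_s=5.0, dt=0.5):
    """Constant-velocity trajectory prediction.

    position, velocity: 2D numpy arrays in a ground-fixed frame (metres, m/s).
    Returns an array of predicted (t, x, y) rows up to the horizon.
    """
    times = np.arange(dt, horizon_s + dt, dt)
    points = position + np.outer(times, velocity)   # one predicted point per time step
    return np.column_stack([times, points])

# Example: an object 120 m ahead, drifting towards the track at 1.5 m/s.
object_traj = predict_trajectory(np.array([120.0, 6.0]), np.array([0.0, -1.5]))
```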
3.  Collision Probability and Severity Estimation System 126
The collision probability and severity estimation system 126 is configured to use the object list with corresponding trajectories from the object motion and prediction system 124 to estimate the collision probability. This system 126 intersects the trajectory of the object with the trajectory of the train 10 , taking into consideration uncertainties. This system 126 may also consider any data from a map 18 or digital twin 16, if available. Useful map data could be, for example, information on track geometries and road crossings, with or without barriers. Data from the digital twin 16 may include data on the state of barriers, open or closed. The state of barriers may be used to improve the certainty of the trajectories. For example, if the train 10 is moving towards a closed barrier, the collision probability is relatively low. If, however, the object is a vehicle that is detected between closed barriers, the collision probability would be much higher.
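A minimal sketch of one way the trajectory intersection, its uncertainties, and the barrier state could be combined into a collision probability follows. The Gaussian distance model, the uncertainty value, and the barrier scaling factor are illustrative assumptions, not values taken from the disclosure.

```python
import numpy as np

def collision_probability(train_traj, object_traj, lateral_sigma_m=1.0,
                          barrier_closed=False, object_inside_barriers=False):
    """Rough collision-probability estimate from two (t, x, y) trajectories.

    The probability is driven by the minimum predicted separation between the
    train and the object at matching time steps, passed through a Gaussian
    uncertainty model. A closed barrier reported by the digital twin lowers
    the probability unless the object is already between the barriers.
    """
    separations = np.linalg.norm(train_traj[:, 1:] - object_traj[:, 1:], axis=1)
    min_separation = float(separations.min())
    prob = float(np.exp(-0.5 * (min_separation / lateral_sigma_m) ** 2))
    if barrier_closed and not object_inside_barriers:
        prob *= 0.2   # object likely held back by the closed barrier
    return prob
```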
Besides estimating how likely a collision is, the collision probability and severity estimation system 126 is configured to estimate the collision severity. For example, if the train 10 is about to collide with a heavy vehicle (e.g., classified as a truck), the passenger protection system 130 might activate a warning. A collision with a truck will likely result in a noticeable physical impact. Thus the passenger protection system 130 may decide in this case not to make the window non-transparent or project a scene on the window, but rather to keep the window in a transparent state. However, when a non-noticeable impact is expected, such as that with an animal, the passenger protection system 130 may activate the envisioned virtual window feature.
The collision probability and severity estimation system 126 outputs a list of likely collisions (which can be an empty list), with information on when and where the collision will happen (e.g., in 2 seconds, on the right side of the train), what type of object will be hit (e.g., animal), how likely the collision is to happen (e.g., 80%), and the expected severity (e.g., no injuries within the train).
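The output list may be represented, for example, by a simple record per likely collision; the field names below are illustrative assumptions matching the example values given above.

```python
from dataclasses import dataclass

@dataclass
class PredictedCollision:
    """One entry of the list output by system 126 (illustrative fields only)."""
    time_to_collision_s: float   # e.g., 2.0
    location: str                # e.g., "right side of train"
    object_type: str             # e.g., "animal"
    probability: float           # e.g., 0.8
    expected_severity: str       # e.g., "no injuries within train"

likely_collisions: list[PredictedCollision] = []   # may remain empty
```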
D.  Vehicle Passenger Protection System 130
The vehicle passenger protection system 130 is configured to use the output from the collision probability and severity estimation system 126 to determine whether to warn train passengers that a severe collision is to be expected (e.g., high severity due to a collision with a truck), or alternatively, to provide the passengers with a virtual image. The warning option is straightforward and may be a simple audio message. The second option, to provide a virtual image, is now described in more detail.
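A minimal decision sketch for choosing between the warning option and the virtual image option follows; the probability threshold and the severity test are illustrative assumptions, not values from the disclosure.

```python
def choose_protection_action(collision_probability, severe_impact_expected):
    """Decision sketch for the passenger protection system 130."""
    if collision_probability < 0.5:
        return "keep_window_transparent"      # no action needed
    if severe_impact_expected:
        return "audio_warning"                # e.g., collision with a truck
    return "virtual_image"                    # e.g., collision with an animal
```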
The vehicle passenger protection system 130 comprises an object removal from panoramic view system 132, a perspective correction system 134, and a windshield/window projection system 136. An example of the object removal from panoramic view system 132 is described below with respect to FIGs. 4A-4B. The perspective correction system 134 is described below with respect to FIGs. 2A, 2B, and 3. And the windshield/window projection system 136 may be any known display system.
FIGs. 2A-2B illustrate a perspective correction system 200 (134 in FIG. 1B). Disclosed here are two possible options for manipulating the surroundings perceived by the passengers. The first option is to use windows 210 that can be made non-transparent by applying electronic signals. This is an inexpensive and fast solution that allows the system to make the entire window 210 non-transparent, or only a portion of the window 210. If only portions of the window are to be made non-transparent, then a field-of-view (FOV) is estimated for each individual passenger using a head tracking sensor 230, and the estimated FOV is correlated with positions of the object at different points in time (t, t+1, t+2 ...) until the collision. The intersection of the passengers’ FOVs and the object positions may be blurred out in a blending area 240. The system 100 thereby prevents the passengers from becoming eye-witnesses to the collision. The switching time of the windows 210 may be on the order of a few milliseconds, so the passengers will probably not notice the image change from actual to virtual.
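A non-limiting sketch of the per-passenger computation follows: the sight line from the tracked head position to each predicted object position is intersected with the window plane, and the resulting strip (widened by a blending margin) is the region to make non-transparent. The planar-window geometry and coordinate convention are assumptions for illustration.

```python
def window_region_to_blank(head_position, object_positions, window_y_m,
                           blend_margin_m=0.3):
    """Project predicted object positions onto the window plane for one passenger.

    head_position: (x, y) of the passenger's head from the head-tracking sensor 230.
    object_positions: [(x, y), ...] object positions at t, t+1, t+2, ... until the collision.
    window_y_m: y-coordinate of the (planar) window, between the head and the object.
    Returns (x_min, x_max) of the window strip to make non-transparent, widened by
    a blending margin 240, or None if the object never enters the passenger's FOV.
    """
    hx, hy = head_position
    hits = []
    for ox, oy in object_positions:
        if oy <= window_y_m:
            continue                      # object already at or inside the window plane
        # Intersect the sight line head -> object with the window plane.
        scale = (window_y_m - hy) / (oy - hy)
        hits.append(hx + scale * (ox - hx))
    if not hits:
        return None
    return min(hits) - blend_margin_m, max(hits) + blend_margin_m
```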
FIG. 3 illustrates a more advanced and less intrusive perspective correction system 300 (134 in FIG. 1B) with the window also used as a display screen. The train 10 is equipped with panoramic cameras 14 to capture the entire scenery. The window 300A first shows the actual scenery. The window 300B may optionally then be blanked out by applying an electric signal so as to act as a screen. Then, additionally or alternatively, the window 300C may show a virtual image with the object to be collided with either removed or placed in a non-collision location.
For a better passenger experience, the stream of the camera (s) should be processed in real time, that is, at least 25 to 30 frames per second should be processed and generated, which is fast enough that the human eye does not observe a non-smooth environment. However, even if fewer frames per second can be generated (e.g., just one static image), this is still better for the passengers than witnessing a collision.
FIGs. 4A-4B illustrate a virtual image creation system 400 using a Generative Adversarial Network (GAN) 430. The virtual image creation system 400 comprises a camera 410, a semantic segmentation system 420, a GAN 430, and a window display 440. The camera 410 is configured to obtain original image data, as shown in the left-hand image of FIG. 4A. The semantic segmentation system 420 is configured to process the image data to identify an obstacle in the original image, as shown in the middle image of FIG. 4A. This processing is disclosed as being achieved by semantic segmentation, but may alternatively be achieved by another known process, such as YOLO. The identified object region is provided to the GAN 430 together with the original image. The GAN 430 is then configured to create a virtual image without the object, as shown in the right-hand image of FIG. 4A. After a correction of the perspective, to match the window’s position, this image is then displayed on the window display 440. This is similar to an IMAX half dome in which perspective projections are used. Of course, an observant person might notice a difference when this virtual world projection is activated, as all of a sudden the human eyes need to focus on something at a shorter distance. However, this is an acceptable sacrifice as compared with the passenger experiencing a collision with an animal or another human. Another option is to not only remove the object from the image, but instead use an AI algorithm to create a scenario in which the obstacle is moving away from the track. This creates a virtual solution, omitting the crash.
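A minimal pipeline sketch of the virtual image creation follows. The segmentation model, the GAN inpainter, and the perspective warp are represented by placeholder callables, since the disclosure does not tie the system to any particular implementation of these components.

```python
import numpy as np

def virtual_frame(frame, segment_objects, inpaint_with_gan, correct_perspective):
    """Pipeline sketch for system 400.

    The three callables are assumptions standing in for a semantic-segmentation
    model (420), a trained GAN inpainter (430), and the window-specific
    perspective warp, none of which are specified by name in the disclosure.
    """
    # 1. Identify the obstacle region (boolean mask with the same H x W as the frame).
    mask = segment_objects(frame)                 # True where the obstacle is
    if not mask.any():
        return frame                              # nothing to remove
    # 2. Let the GAN synthesise plausible background inside the masked region.
    filled = inpaint_with_gan(frame, mask)
    # 3. Warp the result so it lines up with the passenger's view through the window 440.
    return correct_perspective(filled)

# Usage with dummy stand-ins (a real deployment would plug in trained models):
frame = np.zeros((720, 1280, 3), dtype=np.uint8)
out = virtual_frame(
    frame,
    segment_objects=lambda f: np.zeros(f.shape[:2], dtype=bool),
    inpaint_with_gan=lambda f, m: f,
    correct_perspective=lambda f: f,
)
```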
Further, the aspects disclosed herein may be used to activate windows to become non-transparent so as to deter gawkers on highways. Gawkers unnecessarily slow down traffic to take pictures or videos of accidents or other bad situations. They also often hinder rescue and emergency personnel, thereby risking other lives. As a consequence, rescue and emergency personnel need to install blinds to avoid unnecessary hindrance.
The aspects of this disclosure may be deployed in vehicles 10 to address this issue in different ways. First, the police may remotely activate the feature using vehicle-to-everything (V2X) communication paths. Alternatively, the vehicle 10 itself may activate the feature by using map information 18, localization, and knowledge of rescue missions. The feature activation will be the side windows either becoming non-transparent or, similar to the train embodiment, displaying a “normal” scene. As long as the driver is not required to pay attention (L3-vehicle or above), even the windshield could be handled in the same manner. In case a driver is operating the vehicle 10, the windshield should remain transparent, but more and more vehicles in the future will use camera-based mirrors, which will allow activation of the aspects of the disclosure for the side windows.
E.  Passenger Enjoyment System 600
1.  Overview
FIG. 5 illustrates a schematic diagram of a passenger enjoyment system overview 500 in accordance with aspects of the disclosure. The system may highlight, translate, and bookmark an image of scenery along a traveling route.
The vehicle surroundings are captured in images using cameras 12, and optionally also in sounds using microphones 52. In the example shown, the vehicle 10 is driving in a middle lane with its view blocked by vehicles in adjacent lanes. The captured view is shown on one or more vehicle displays 54 (e.g., organic light-emitting diode (OLED) monitor, projector, AR/XR glasses, etc.). The passenger can view the scenery through both the display 54 and the vehicle window. A passenger interaction system 510 allows a vehicle passenger to obtain additional information about the scenery by pointing to a point of interest or clicking on the point of interest on the display 54. The system then modifies the image to caption or highlight the point of interest on the display 54, and may additionally translate the caption into another language. The target language may be determined based on the scenery image, audio, and/or the passenger profile. Additionally, a scenery bookmark system 530 may record and organize the images as personalized bookmarks, which may be referred to during a subsequent trip.
2.  Vehicle Passenger Enjoyment System 600
FIG. 6 illustrates a schematic diagram of a vehicle passenger enjoyment system 600 in accordance with aspects of the disclosure.
The passenger enjoyment system 600 allows vehicle passengers to have an improved traveling experience. The system 600 comprises passenger profiles 610, a point of interest extraction system 620 (more detail in FIG. 7), a scenery caption generation system 630 (more detail in FIG. 8), a bookmark generation system 640 (more detail in FIG. 9), and a content sharing system 650 (more detail in FIG. 10).
By way of overview, cameras 12, 14 and microphones 52 capture a vehicle’s surroundings, both visually and audibly. There are three possible inputs shown. A first input is the video/image/audio data that captures scenery information. The second input is the location, path, and motion plan that autonomous vehicles use for navigation. The third input is passenger interest based on the human-vehicle interactions 510 (e.g., gaze 512, emotion 514, heart rate 516, pointing direction, touched display location, etc.). Based on the inputs, the outputs are scenery captions (e.g., text 632, audio 636, and/or highlighted regions in images/videos 638), which are sent to vehicle passengers based on respective passenger profiles 610 (e.g., child 612, elderly 614, visually impaired 616, etc.). For example, an elderly passenger 614 or a visually impaired passenger 616 may receive a modified image that is zoomed in to a point of interest for a better view and/or an audio explanation 636. The system 600 may also be configured to bookmark any recorded location, image, video, or audio.
3.  Point of Interest (POI) Extraction System 620/700
FIG. 7 illustrates a schematic diagram of a point of interest extraction system 700 in accordance with aspects of the disclosure. A point of interest is, for example, natural scenery, historical building, etc.
The point of interest extraction system 700 comprises a passenger detection system 710, a region of interest (ROI) extraction system 720, an event of interest (EOI) extraction system 740, and an object type of interest (OTI) extraction system 760. The passenger detection system 710 is configured to detect, by any known method (e.g., RFID tag, NFC, etc.), which passenger has entered the vehicle 10 so that the passenger’s profile may be retrieved. A ROI is a sub-region of an image. An EOI is an event happening on the road. An OTI is a type of object. The ROI extraction system 720 is configured to output a modified image with a sub-region of the received image highlighted 730. The EOI extraction system 740 is configured to output a modified image of an event, such as a traffic jam, a traffic accident, an emergency such as a gunshot, etc. The OTI extraction system 760 is configured to output a modified image of an object of interest, such as a plant, animal, building, etc.
The inputs to the point of interest extraction system 700 include human-vehicle interactions 510, passenger profiles 610, video/image/audio signals, and navigation information. The human-vehicle interactions 510 may comprise eye gaze tracking 512, emotion recognition 514, heart rate 516, etc. The passenger profiles 610 indicate, for example, whether a vehicle passenger prefers an audio or video service, a scenery type of interest, etc. A profile may be generated manually by the vehicle passenger or traffic work force, or may be generated automatically by a machine learning algorithm. The video/image/audio signals are of the vehicle surroundings. The navigation information comprises location, map, and/or motion plans. Based on these inputs, the point of interest (POI) can be determined by any known process, such as a classifier (offline or online).
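As a non-limiting stand-in for the "any known process, such as a classifier," the following sketch scores candidate points of interest from the gaze direction and the passenger profile. The weighting, candidate format, and threshold are assumptions for illustration only.

```python
def rank_points_of_interest(candidates, gaze_direction_deg, profile_interests):
    """Score candidate POIs from gaze alignment and profile match (illustrative weights)."""
    ranked = []
    for poi in candidates:
        # poi example: {"label": "maple tree", "bearing_deg": 10.0, "category": "plant"}
        gaze_alignment = max(0.0, 1.0 - abs(poi["bearing_deg"] - gaze_direction_deg) / 45.0)
        profile_match = 1.0 if poi["category"] in profile_interests else 0.0
        ranked.append((0.7 * gaze_alignment + 0.3 * profile_match, poi))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return [poi for score, poi in ranked if score > 0.3]

# Example: the passenger is gazing roughly towards the maple tree.
pois = rank_points_of_interest(
    [{"label": "maple tree", "bearing_deg": 10.0, "category": "plant"},
     {"label": "billboard", "bearing_deg": -60.0, "category": "advertisement"}],
    gaze_direction_deg=8.0,
    profile_interests={"plant", "historical building"},
)
```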
4.  Scenery Caption Generation System 630/800
FIG. 8 illustrates a schematic diagram of a scenery caption generation system 800 in accordance with aspects of the disclosure.
The inputs to the scenery caption generation system 800 comprise the point of interest extraction results 730, 750, 770, the scenery video/image/audio signals, and the navigation information (e.g., location, path, and motion plans). The outputs may comprise text, sentences, speech, and/or ROIs of images describing the scenery, generated using, for example, deep learning algorithms. An example of a generated caption is: “There are few vehicles on the road. It starts raining and there are maple trees on the roadside. Dogs are sitting on the roadside. ”
A passenger attention selection module 840 is configured to extract and select the vehicle passenger’s attention based on the vehicle passenger’s interests as indicated in the corresponding profile. The scenery is represented by a semantic graph 850 and a geometry graph 860. The semantic graph 850 is configured to describe object and event types, such as that there are maple trees, dogs, people, etc. The geometry graph 860 is configured to describe the location relationship between objects and events, for example, the sky is above the tree, building A is to the east of building B, etc.
An encoder-decoder 870 is configured to generate the language description. There may be multiple decoders 870, such as Long Short-Term Memory (LSTM) networks, transformers, or the like. The encoder may employ a convolutional neural network (CNN), which extracts objects and features from an image or video frame. The decoder may employ a neural network that generates a natural sentence based on the available information.
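A minimal sketch of the decoder half of such an encoder-decoder follows, assuming a CNN encoder has already produced a fixed-length feature vector; the vocabulary size, dimensions, and single-layer LSTM choice are illustrative assumptions rather than details specified by the disclosure.

```python
import torch
import torch.nn as nn

class CaptionDecoder(nn.Module):
    """Minimal LSTM caption decoder sketch (dimensions and vocabulary are assumptions)."""
    def __init__(self, vocab_size=10000, feat_dim=512, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_dim)
        self.init_h = nn.Linear(feat_dim, hidden_dim)   # image features -> initial state
        self.lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, image_features, token_ids):
        # image_features: (B, feat_dim) from a CNN encoder; token_ids: (B, T) word indices.
        h0 = torch.tanh(self.init_h(image_features)).unsqueeze(0)
        c0 = torch.zeros_like(h0)
        emb = self.embed(token_ids)
        hidden, _ = self.lstm(emb, (h0, c0))
        return self.out(hidden)             # (B, T, vocab_size) logits over next words
```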
5.  Bookmark Generation System 640/900
FIG. 9 illustrates a schematic diagram of a bookmark generation system 900 in accordance with aspects of the disclosure.
The bookmark generation system 900 is configured to bookmark documented evidence of the trip. The system 900 is configured to record images, videos, and/or audio based on the vehicle passenger’s points of interest. The system 900 is also configured to record associated conversations in regard to the points of interest, for example, “I used to come to this beach during my high school time, ” or “This is the church where my parents got married. ” Similarly, recordings may be recalled from the bookmark database when the vehicle subsequently travels by related points of interest.
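A minimal sketch of a bookmark record and database with location-based recall follows; the field names and the radius-based nearness test are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Bookmark:
    """One trip bookmark (illustrative fields only)."""
    location: tuple        # (latitude, longitude)
    media_paths: list      # recorded image/video/audio files for the point of interest
    transcript: str = ""   # e.g., "This is the church where my parents got married."

class BookmarkDatabase:
    def __init__(self):
        self._bookmarks = []

    def add(self, bookmark):
        self._bookmarks.append(bookmark)

    def near(self, location, radius_deg=0.01):
        """Recall bookmarks close to the current position on a later trip."""
        lat, lon = location
        return [b for b in self._bookmarks
                if abs(b.location[0] - lat) < radius_deg
                and abs(b.location[1] - lon) < radius_deg]
```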
6.  Content Sharing System 650/1000
FIG. 10 illustrates a schematic diagram of a content sharing system 1000 in accordance with aspects of the disclosure.
The content sharing system 1000 allows vehicle passengers with a same view to share content, enabled by a client-server, peer-to-peer, or other network architecture. The content sharing system 1000 allows a vehicle passenger to share a sub-set of content for their specific use cases. For example, vehicle passenger A might want to share a house image and related information with vehicle passenger B. The specific ROI/EOI/OTI information 1010 of vehicle passenger A’s attention, which is the house, is displayed in vehicle passenger B’s view, and vehicle passenger B is allowed to edit, add, or delete content, with any other communication between the passengers also being possible. The changes can be made either locally (e.g., at vehicle passenger B only) or globally (e.g., at both vehicle passengers A and B).
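A minimal sketch of the local-versus-global edit behaviour follows; the class and field names are assumptions for illustration and do not correspond to any named component of the disclosure.

```python
import copy

class SharedPOIView:
    """Sketch of sharing a point-of-interest annotation between two passengers.

    'local' edits change only the recipient's copy; 'global' edits change the
    shared content seen by both passengers.
    """
    def __init__(self, shared_content):
        self.shared = shared_content          # seen by passengers A and B
        self.local_overrides = {}             # per-passenger view edits

    def edit(self, passenger_id, key, value, scope="local"):
        if scope == "global":
            self.shared[key] = value
        else:
            self.local_overrides.setdefault(passenger_id, {})[key] = value

    def view(self, passenger_id):
        merged = copy.deepcopy(self.shared)
        merged.update(self.local_overrides.get(passenger_id, {}))
        return merged

# Passenger A shares a house ROI; passenger B adds a note only B can see.
share = SharedPOIView({"type": "ROI", "label": "house", "image_id": "frame_0042"})
share.edit("passenger_B", "note", "Ask about the architecture", scope="local")
```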
In addition to autonomous vehicles, the present invention may be applicable in other autonomous systems including autonomous robots, drones, unmanned aerial vehicles, etc. Further, “image” as defined herein may also encompass a portion of a video, which is a series of images.
II.  Computing Device
FIG. 11 illustrates a block diagram of an exemplary computing device in accordance with aspects of the disclosure. In an aspect, the computing device 1100 as shown and described with respect to FIG. 11 may be identified with a central controller and be implemented as any suitable network infrastructure component, which may be implemented as an Edge network server, controller, computing device, etc. As further discussed below, the computing device 1100 may serve the environment in accordance with the various techniques as discussed herein. Thus, the computing device 1100 may perform the various functionality as described herein. To do so, the computing device 1100 may include processing circuitry 1102, a transceiver 1104, a communication interface 1106, and a memory 1108. The components shown in FIG. 11 are provided for ease of explanation, and the computing device 1100 may implement additional, fewer, or alternative components than those shown in FIG. 11.
The processing circuitry 1102 may be configured as any suitable number and/or type of computer processors, which may function to control the computing device 1100. The processing circuitry 1102 may be identified with one or more processors (or suitable portions thereof) implemented by the computing device 1100. The processing circuitry 1102 may be identified with one or more processors such as a host processor, a digital signal processor, one or more microprocessors, graphics processors, baseband processors, microcontrollers, an application-specific integrated circuit (ASIC) , part (or the entirety of) a field-programmable gate array (FPGA) , etc.
In any event, the processing circuitry 1102 may be configured to carry out instructions to perform arithmetical, logical, and/or input/output (I/O) operations, and/or to control the operation of one or more components of computing device 1100 to perform various functions as described herein. The processing circuitry 1102 may include one or more microprocessor cores, memory registers, buffers, clocks, etc., and may generate electronic control signals associated with the components of the computing device 1100 to control and/or modify the operation of these components. The processing circuitry 1102 may communicate with and/or control functions associated with the transceiver 1104, the communication interface 1106, and/or the memory 1108. The processing circuitry 1102 may additionally perform various operations to control the communications, communications scheduling, and/or operation of other network infrastructure components that are communicatively coupled to the computing device 1100.
The transceiver 1104 may be implemented as any suitable number and/or type of components configured to transmit and/or receive data packets and/or wireless signals in accordance with any suitable number and/or type of communication protocols. The transceiver  1104 may include any suitable type of components to facilitate this functionality, including components associated with known transceiver, transmitter, and/or receiver operation, configurations, and implementations. Although depicted in FIG. 11 as a transceiver, the transceiver 1104 may include any suitable number of transmitters, receivers, or combinations of these that may be integrated into a single transceiver or as multiple transceivers or transceiver modules. The transceiver 1104 may include components typically identified with an RF front end and include, for example, antennas, ports, power amplifiers (PAs) , RF filters, mixers, local oscillators (LOs) , low noise amplifiers (LNAs) , up-converters, down-converters, channel tuners, etc.
The communication interface 1106 may be configured as any suitable number and/or type of components configured to facilitate the transceiver 1104 receiving and/or transmitting data and/or signals in accordance with one or more communication protocols, as discussed herein. The communication interface 1106 may be implemented as any suitable number and/or type of components that function to interface with the transceiver 1104, such as analog-to-digital converters (ADCs), digital-to-analog converters, intermediate frequency (IF) amplifiers and/or filters, modulators, demodulators, baseband processors, etc. The communication interface 1106 may thus work in conjunction with the transceiver 1104 and form part of an overall communication circuitry implemented by the computing device 1100, which may be used to transmit commands and/or control signals to execute any of the functions described herein.
The memory 1108 is configured to store data and/or instructions that, when executed by the processing circuitry 1102, cause the computing device 1100 to perform various functions as described herein. The memory 1108 may be implemented as any well-known volatile and/or non-volatile memory, including, for example, read-only memory (ROM), random access memory (RAM), flash memory, magnetic storage media, an optical disc, erasable programmable read only memory (EPROM), programmable read only memory (PROM), etc. The memory 1108 may be non-removable, removable, or a combination of both. The memory 1108 may be implemented as a non-transitory computer readable medium storing one or more executable instructions such as, for example, logic, algorithms, code, etc.
As further discussed below, the instructions, logic, code, etc., stored in the memory 1108 are represented by the various modules/engines as shown in FIG. 11. Alternatively, if implemented via hardware, the modules/engines shown in FIG. 11 associated with the memory 1108 may include instructions and/or code to facilitate control and/or monitoring of the operation of such hardware components. In other words, the modules/engines as shown in FIG. 11 are provided for ease of explanation regarding the functional association between hardware and software components. Thus, the processing circuitry 1102 may execute the instructions stored in these respective modules/engines in conjunction with one or more hardware components to perform the various functions as discussed herein.
Various aspects herein may utilize one or more machine learning models to perform or control functions of the vehicle (or other functions described herein) . The term “model” as, for example, used herein may be understood as any kind of algorithm, which provides output data from input data (e.g., any kind of algorithm generating or calculating output data from input data) . A machine learning model may be executed by a computing system to progressively improve performance of a specific task. In some aspects, parameters of a machine learning model may be adjusted during a training phase based on training data. A trained machine learning model may be used during an inference phase to make predictions or decisions based on input data. In some aspects, the trained machine learning model may be used to generate additional training data. An additional machine learning model may be adjusted during a second training phase based on the generated additional training data. A trained additional machine learning model may be used during an inference phase to make predictions or decisions based on input data.
The machine learning models described herein may take any suitable form or utilize any suitable technique (e.g., for training purposes) . For example, any of the machine learning  models may utilize supervised learning, semi-supervised learning, unsupervised learning, or reinforcement learning techniques.
In supervised learning, the model may be built using a training set of data including both the inputs and the corresponding desired outputs (illustratively, each input may be associated with a desired or expected output for that input) . Each training instance may include one or more inputs and a desired output. Training may include iterating through training instances and using an objective function to teach the model to predict the output for new inputs (illustratively, for inputs not included in the training set) . In semi-supervised learning, a portion of the inputs in the training set may be missing the respective desired outputs (e.g., one or more inputs may not be associated with any desired or expected output) .
In unsupervised learning, the model may be built from a training set of data including only inputs and no desired outputs. The unsupervised model may be used to find structure in the data (e.g., grouping or clustering of data points) , illustratively, by discovering patterns in the data. Techniques that may be implemented in an unsupervised learning model may include, e.g., self-organizing maps, nearest-neighbor mapping, k-means clustering, and singular value decomposition.
Reinforcement learning models may include positive or negative feedback to improve accuracy. A reinforcement learning model may attempt to maximize one or more objectives/rewards. Techniques that may be implemented in a reinforcement learning model may include, e.g., Q-learning, temporal difference (TD) , and deep adversarial networks.
Various aspects described herein may utilize one or more classification models. In a classification model, the outputs may be restricted to a limited set of values (e.g., one or more classes) . The classification model may output a class for an input set of one or more input values. An input set may include sensor data, such as image data, radar data, LIDAR data and the like. A classification model as described herein may, for example, classify certain driving conditions  and/or environmental conditions, such as weather conditions, road conditions, and the like. References herein to classification models may contemplate a model that implements, e.g., any one or more of the following techniques: linear classifiers (e.g., logistic regression or naive Bayes classifier) , support vector machines, decision trees, boosted trees, random forest, neural networks, or nearest neighbor.
Various aspects described herein may utilize one or more regression models. A regression model may output a numerical value from a continuous range based on an input set of one or more values (illustratively, starting from or using an input set of one or more values) . References herein to regression models may contemplate a model that implements, e.g., any one or more of the following techniques (or other suitable techniques) : linear regression, decision trees, random forest, or neural networks.
A machine learning model described herein may be or may include a neural network. The neural network may be any kind of neural network, such as a convolutional neural network, an autoencoder network, a variational autoencoder network, a sparse autoencoder network, a recurrent neural network, a deconvolutional network, a generative adversarial network, a forward thinking neural network, a sum-product neural network, and the like. The neural network may include any number of layers. The training of the neural network (e.g., adapting the layers of the neural network) may use or may be based on any kind of training principle, such as backpropagation (e.g., using the backpropagation algorithm) .
The techniques of this disclosure may also be described in the following examples.
Example 1. An apparatus, comprising: an interface for receiving image data in real-time of a surroundings of a vehicle; processing circuitry for: identifying a point of interest within the received image data; generating modified image data based on the received image data and the identified point of interest; and transmitting the modified image data to be displayed to a vehicle passenger.
Example 2. The apparatus of example 1, wherein: the point of interest is an object; and the processing circuitry is further for: detecting whether the vehicle will probably collide with the object; and if it is detected that the vehicle will probably collide with the object, generating the modified image data to obscure a collision of the vehicle with the object from the vehicle passenger.
Example 3. The apparatus of any of examples 1-2, wherein the processing circuitry, in detecting whether the vehicle will probably collide with the object, is further for: classifying a type of the object; predicting a motion of the object; estimating, based on the type and/or predicted motion of the object, a collision probability and/or collision severity; and generating the modified image data if the estimated collision probability and/or collision severity is greater than a predetermined threshold.
Example 4. The apparatus of any of examples 1-3, wherein the processing circuitry is further for: estimating the collision probability and/or collision severity based on map information and/or surroundings state information.
Example 5. The apparatus of any of examples 1-4, wherein the processing circuitry, in generating the modified image data to obscure the collision from the vehicle passenger, is further for: removing the object from the received image data; moving the object in the received image data to a different location within the modified image data; and/or outputting a signal to cause at least a portion of a window of the vehicle to be non-transparent.
Example 6. The apparatus of any of examples 1-5, wherein: the point of interest is an object; and the processing circuitry is further for: determining whether to generate modified image data based on a type of the object, a predicted motion of the object, an estimated probability of a collision between the vehicle and the object, and/or estimated severity of a collision between the vehicle and the object.
Example 7. The apparatus of any of examples 1-6, wherein the processing circuitry, in generating the modified image data to obscure the collision from the vehicle passenger, is further for: estimating, based on head-tracking image data received from a passenger head-tracking sensor, a field-of-view (FOV) of the passenger; correlating the FOV with positions of the object at respective points in time; and generating modified image data to modify the received image data within an intersection of the FOVs and correlated object positions.
Example 8. The apparatus of any of examples 1-7, wherein the modified image data or the received image data is displayed to the vehicle passenger in a window of the vehicle.
Example 9. The apparatus of any of examples 1-8, wherein when the received image data is not modified, the window of the vehicle is transparent.
Example 10. The apparatus of any of examples 1-9, wherein when the received image data is modified, the window of the vehicle is non-transparent.
Example 11. The apparatus of any of examples 1-10, wherein the processing circuitry is further for: identifying the point of interest based on a vehicle passenger profile, a vehicle passenger action, vehicle passenger speech, and/or a vehicle passenger emotion.
Example 12. The apparatus of any of examples 1-11, wherein the processing circuitry is further for: generating information about the point of interest, based on a vehicle passenger profile or a vehicle passenger action.
Example 13. The apparatus of any of examples 1-12, wherein the generated modified image data comprises a visual highlight of the point of interest.
Example 14. The apparatus of any of examples 1-13, wherein the generated modified image data comprises textual information related to the point of interest.
Example 15. The apparatus of any of examples 1-14, wherein the processing circuitry is further for: translating the generated information into audio information.
Example 16. The apparatus of any of examples 1-15, wherein the processing circuitry is further for: generating a bookmark of the point of interest.
Example 17. The apparatus of any of examples 1-16, wherein the point of interest is a region of interest (ROI) , an event of interest (EOI) , or an object type of interest (OTI) .
Example 18. The apparatus of any of examples 1-17, wherein the processing circuitry is further for: sharing the modified image data to be displayed to a person other than the vehicle passenger.
Example 19. The apparatus of any of examples 1-18, wherein the processing circuitry is further for: generating caption information about the point of interest, wherein the modified image data comprises the generated caption information.
Example 20. The apparatus of any of examples 1-19, wherein the processing circuitry is further for: identifying a profile related to the vehicle passenger.
Example 21. An autonomous system, comprising: the apparatus of any of examples 1-20.
Example 22. The autonomous system of any of examples 1-21, further comprising: a display for displaying the modified image data.
Example 23. The autonomous system of any of examples 1-22, wherein the display is comprised within a window of the vehicle.
Example 24. A component of a system, comprising: processing circuitry; and a non-transitory computer-readable storage medium including instructions that, when executed by  the processing circuitry, cause the processing circuitry to: identify a point of interest within image data received in real-time of a surroundings of a vehicle; generate modified image data based on the received image data and the identified point of interest; and transmit the modified image data to be displayed to a vehicle passenger.
Example 25. The component of example 24, wherein the point of interest is an object, and the instructions are further to cause the processing circuitry to: detect whether the vehicle will probably collide with the object; and if it is detected that the vehicle will probably collide with the object, generate the modified image data to obscure a collision of the vehicle with the object from the vehicle passenger.
Example 26. An apparatus, comprising: an interface configured to receive image data in real-time of a surroundings of a vehicle; processing circuitry configured to: identify a point of interest within the received image data; generate modified image data based on the received image data and the identified point of interest; and transmit the modified image data to be displayed to a vehicle passenger.
Example 27. The apparatus of example 26, wherein: the point of interest is an object; and the processing circuitry is further configured to: detect whether the vehicle will probably collide with the object; and if it is detected that the vehicle will probably collide with the object, generate the modified image data to obscure a collision of the vehicle with the object from the vehicle passenger.
Example 28. The apparatus of example 27, wherein the processing circuitry, in detecting whether the vehicle will probably collide with the object, is further configured to: classify a type of the object; predict a motion of the object; estimate, based on the type and/or predicted motion of the object, a collision probability and/or collision severity; and generate the modified image data if the estimated collision probability and/or collision severity is greater than a predetermined threshold.
Example 29. The apparatus of example 28, wherein the processing circuitry is further configured to: estimate the collision probability and/or collision severity based on map information and/or surroundings state information.
Example 30. The apparatus of example 27, wherein the processing circuitry, in generating the modified image data to obscure the collision from the vehicle passenger, is further configured to: remove the object from the received image data; move the object in the received image data to a different location within the modified image data; and/or output a signal to cause at least a portion of a window of the vehicle to be non-transparent.
Example 31. The apparatus of example 26, wherein: the point of interest is an object; and the processing circuitry is further configured to: determine whether to generate modified image data based on a type of the object, a predicted motion of the object, an estimated probability of a collision between the vehicle and the object, and/or estimated severity of a collision between the vehicle and the object.
Example 32. The apparatus of example 27, wherein the processing circuitry, in generating the modified image data to obscure the collision from the vehicle passenger, is further configured to: estimate, based on head-tracking image data received from a passenger head-tracking sensor, a field-of-view (FOV) of the passenger; correlate the FOV with positions of the object at respective points in time; and generate modified image data to modify the received image data within an intersection of the FOVs and correlated object positions.
Example 33. The apparatus of example 26, wherein the modified image data or the received image data is displayed to the vehicle passenger in a window of the vehicle.
Example 34. The apparatus of example 33, wherein when the received image data is not modified, the window of the vehicle is transparent.
Example 35. The apparatus of example 33, wherein when the received image data is modified, the window of the vehicle is non-transparent.
Example 36. The apparatus of example 26, wherein the processing circuitry is further configured to: identify the point of interest based on a vehicle passenger profile, a vehicle passenger action, vehicle passenger speech, and/or a vehicle passenger emotion.
Example 37. The apparatus of example 26, wherein the processing circuitry is configured to: generate information about the point of interest, based on a vehicle passenger profile or a vehicle passenger action.
Example 38. The apparatus of example 37, wherein the generated modified image data comprises a visual highlight of the point of interest.
Example 39. The apparatus of example 37, wherein the generated modified image data comprises textual information related to the point of interest.
Example 40. The apparatus of example 37, wherein the processing circuitry is further configured to: translate the generated information into audio information.
Example 41. The apparatus of example 36, wherein the processing circuitry is further configured to: generate a bookmark of the point of interest.
Example 42. The apparatus of example 36, wherein the point of interest is a region of interest (ROI) , an event of interest (EOI) , or an object type of interest (OTI) .
Example 43. The apparatus of example 36, wherein the processing circuitry is further configured to: share the modified image data to be displayed to a person other than the vehicle passenger.
Example 44. The apparatus of example 26, wherein the processing circuitry is further configured to: generate caption information about the point of interest, wherein the modified image data comprises the generated caption information.
Example 45. The apparatus of example 36, wherein the processing circuitry is further configured to: identify a profile related to the vehicle passenger.
Example 46. An autonomous system, comprising: the apparatus of example 26.
Example 47. The autonomous system of example 46, further comprising: a display configured to display the modified image data.
Example 48. The autonomous system of example 47, wherein the display is comprised within a window of the vehicle.
Example 49. A component of a system, comprising: processing circuitry; and a non-transitory computer-readable storage medium including instructions that, when executed by the processing circuitry, cause the processing circuitry to: identify a point of interest within image data received in real-time of a surroundings of a vehicle; generate modified image data based on the received image data and the identified point of interest; and transmit the modified image data to be displayed to a vehicle passenger.
Example 50. The component of example 49, wherein the point of interest is an object, and the instructions are further to cause the processing circuitry to: detect whether the vehicle will probably collide with the object; and if it is detected that the vehicle will probably collide with the object, generate the modified image data to obscure a collision of the vehicle with the object from the vehicle passenger.
While the foregoing has been described in conjunction with exemplary aspects, it is understood that the term “exemplary” is merely meant as an example, rather than the best or optimal. Accordingly, the disclosure is intended to cover alternatives, modifications, and equivalents, which may be included within the scope of the disclosure.
Although specific aspects have been illustrated and described herein, it will be appreciated by those of ordinary skill in the art that a variety of alternate and/or equivalent implementations may be substituted for the specific aspects shown and described without departing from the scope of the present application. This application is intended to cover any adaptations or variations of the specific aspects discussed herein.

Claims (25)

  1. An apparatus, comprising:
    an interface for receiving image data in real-time of a surroundings of a vehicle;
    processing circuitry for:
    identifying a point of interest within the received image data;
    generating modified image data based on the received image data and the identified point of interest; and
    transmitting the modified image data to be displayed to a vehicle passenger.
  2. The apparatus of claim 1, wherein:
    the point of interest is an object; and
    the processing circuitry is further for:
    detecting whether the vehicle will probably collide with the object; and
    if it is detected that the vehicle will probably collide with the object, generating the modified image data to obscure a collision of the vehicle with the object from the vehicle passenger.
  3. The apparatus of any of claims 1-2, wherein the processing circuitry, in detecting whether the vehicle will probably collide with the object, is further for:
    classifying a type of the object;
    predicting a motion of the object;
    estimating, based on the type and/or predicted motion of the object, a collision probability and/or collision severity; and
    generating the modified image data if the estimated collision probability and/or collision severity is greater than a predetermined threshold.
  4. The apparatus of any of claims 1-3, wherein the processing circuitry is further for:
    estimating the collision probability and/or collision severity based on map information and/or surroundings state information.
  5. The apparatus of any of claims 1-4, wherein the processing circuitry, in generating the modified image data to obscure the collision from the vehicle passenger, is further for:
    removing the object from the received image data;
    moving the object in the received image data to a different location within the modified image data; and/or
    outputting a signal to cause at least a portion of a window of the vehicle to be non-transparent.
  6. The apparatus of any of claims 1-5, wherein:
    the point of interest is an object; and
    the processing circuitry is further for:
    determining whether to generate modified image data based on a type of the object, a predicted motion of the object, an estimated probability of a collision between the vehicle and the object, and/or estimated severity of a collision between the vehicle and the object.
  7. The apparatus of any of claims 1-6, wherein the processing circuitry, in generating the modified image data to obscure the collision from the vehicle passenger, is further for:
    estimating, based on head-tracking image data received from a passenger head-tracking sensor, a field-of-view (FOV) of the passenger;
    correlating the FOV with positions of the object at respective points in time; and
    generating modified image data to modify the received image data within an intersection of the FOVs and correlated object positions.
  8. The apparatus of any of claims 1-7, wherein the modified image data or the received image data is displayed to the vehicle passenger in a window of the vehicle.
  9. The apparatus of any of claims 1-8, wherein when the received image data is not modified, the window of the vehicle is transparent.
  10. The apparatus of any of claims 1-9, wherein when the received image data is modified, the window of the vehicle is non-transparent.
  11. The apparatus of any of claims 1-10, wherein the processing circuitry is further for:
    identifying the point of interest based on a vehicle passenger profile, a vehicle passenger action, vehicle passenger speech, and/or a vehicle passenger emotion.
  12. The apparatus of any of claims 1-11, wherein the processing circuitry is further for:
    generating information about the point of interest, based on a vehicle passenger profile or a vehicle passenger action.
  13. The apparatus of any of claims 1-12, wherein the generated modified image data comprises a visual highlight of the point of interest.
  14. The apparatus of any of claims 1-13, wherein the generated modified image data comprises textual information related to the point of interest.
  15. The apparatus of any of claims 1-14, wherein the processing circuitry is further for:
    translating the generated information into audio information.
  16. The apparatus of any of claims 1-15, wherein the processing circuitry is further for:
    generating a bookmark of the point of interest.
  17. The apparatus of any of claims 1-16, wherein the point of interest is a region of interest (ROI) , an event of interest (EOI) , or an object type of interest (OTI) .
  18. The apparatus of any of claims 1-17, wherein the processing circuitry is further for:
    sharing the modified image data to be displayed to a person other than the vehicle passenger.
  19. The apparatus of any of claims 1-18, wherein the processing circuitry is further for:
    generating caption information about the point of interest, wherein the modified image data comprises the generated caption information.
  20. The apparatus of any of claims 1-19, wherein the processing circuitry is further for:
    identifying a profile related to the vehicle passenger.
  21. An autonomous system, comprising:
    the apparatus of any of claims 1-20.
  22. The autonomous system of any of claims 1-21, further comprising:
    a display for displaying the modified image data.
  23. The autonomous system of any of claims 1-22, wherein the display is comprised within a window of the vehicle.
  24. A component of a system, comprising:
    processing circuitry; and
    a non-transitory computer-readable storage medium including instructions that, when executed by the processing circuitry, cause the processing circuitry to:
    identify a point of interest within image data received in real-time of a surroundings of a vehicle;
    generate modified image data based on the received image data and the identified point of interest; and
    transmit the modified image data to be displayed to a vehicle passenger.
  25. The component of claim 24, wherein the point of interest is an object, and the instructions are further to cause the processing circuitry to:
    detect whether the vehicle will probably collide with the object; and
    if it is detected that the vehicle will probably collide with the object, generate the modified image data to obscure a collision of the vehicle with the object from the vehicle passenger.
PCT/CN2022/123563 2022-09-30 2022-09-30 Vehicle passenger display modification WO2024065799A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/123563 WO2024065799A1 (en) 2022-09-30 2022-09-30 Vehicle passenger display modification


Publications (1)

Publication Number Publication Date
WO2024065799A1 true WO2024065799A1 (en) 2024-04-04

Family

ID=90475644

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/123563 WO2024065799A1 (en) 2022-09-30 2022-09-30 Vehicle passenger display modification

Country Status (1)

Country Link
WO (1) WO2024065799A1 (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005205974A (en) * 2004-01-21 2005-08-04 Denso Corp Windshield display device
CN103080983A (en) * 2010-09-06 2013-05-01 国立大学法人东京大学 Vehicle system
CN105513389A (en) * 2015-11-30 2016-04-20 小米科技有限责任公司 Method and device for augmented reality
CN109945887A (en) * 2017-12-20 2019-06-28 上海博泰悦臻网络技术服务有限公司 AR air navigation aid and navigation equipment
CN110717991A (en) * 2018-07-12 2020-01-21 通用汽车环球科技运作有限责任公司 System and method for in-vehicle augmented virtual reality system
JP2020071415A (en) * 2018-11-01 2020-05-07 マクセル株式会社 Head-up display system
CN111263133A (en) * 2020-02-26 2020-06-09 中国联合网络通信集团有限公司 Information processing method and system
CN112677740A (en) * 2019-10-17 2021-04-20 现代摩比斯株式会社 Apparatus and method for treating a windshield to make it invisible

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22960410

Country of ref document: EP

Kind code of ref document: A1