US20190078905A1 - Systems and methods for using real-time imagery in navigation - Google Patents
- Publication number
- US20190078905A1 (application US 16/188,215)
- Authority
- US
- United States
- Prior art keywords
- landmark
- visual
- navigation
- driver
- real
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C21/00—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
- G01C21/26—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
- G01C21/34—Route searching; Route guidance
- G01C21/36—Input/output arrangements for on-board computers
- G01C21/3626—Details of the output of route guidance instructions
- G01C21/3644—Landmark guidance, e.g. using POIs or conspicuous other objects
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C21/00—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
- G01C21/26—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
- G01C21/34—Route searching; Route guidance
- G01C21/3407—Route searching; Route guidance specially adapted for specific applications
- G01C21/3415—Dynamic re-routing, e.g. recalculating the route when the user deviates from calculated route or after detecting real-time traffic data or accidents
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C21/00—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
- G01C21/26—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
- G01C21/34—Route searching; Route guidance
- G01C21/36—Input/output arrangements for on-board computers
- G01C21/3602—Input other than that of destination using image analysis, e.g. detection of road signs, lanes, buildings, real preceding vehicles using a camera
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C21/00—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
- G01C21/26—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
- G01C21/34—Route searching; Route guidance
- G01C21/36—Input/output arrangements for on-board computers
- G01C21/3605—Destination input or retrieval
- G01C21/3623—Destination input or retrieval using a camera or code reader, e.g. for optical or magnetic codes
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C21/00—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
- G01C21/26—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
- G01C21/34—Route searching; Route guidance
- G01C21/36—Input/output arrangements for on-board computers
- G01C21/3679—Retrieval, searching and output of POI information, e.g. hotels, restaurants, shops, filling stations, parking facilities
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C21/00—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
- G01C21/26—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
- G01C21/34—Route searching; Route guidance
- G01C21/36—Input/output arrangements for on-board computers
- G01C21/3691—Retrieval, searching and output of information related to real-time traffic, weather, or environmental conditions
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/09—Arrangements for giving variable traffic instructions
- G08G1/0962—Arrangements for giving variable traffic instructions having an indicator mounted inside the vehicle, e.g. giving voice messages
- G08G1/0967—Systems involving transmission of highway information, e.g. weather, speed limits
- G08G1/096766—Systems involving transmission of highway information, e.g. weather, speed limits where the system is characterised by the origin of the information transmission
- G08G1/096783—Systems involving transmission of highway information, e.g. weather, speed limits where the system is characterised by the origin of the information transmission where the origin of the information is a roadside individual element
Definitions
- the present disclosure relates to navigation directions and, in particular, to using imagery in navigation directions.
- Systems that automatically route drivers between geographic locations generally rely on indications of distance, street names, and building numbers to generate navigation directions based on the route. For example, these systems can provide a driver with such instructions as “proceed for one-fourth of a mile, then turn right onto Maple Street.” However, it is difficult for drivers to judge distance accurately, and it is not always easy for drivers to see street signs. Moreover, there are geographic areas where street and road signage is poor.
- a system can generate such navigation directions as “in one fourth of a mile, you will see a McDonald's restaurant on your right; make the next right turn onto Maple Street.”
- locations (e.g., street addresses, coordinates)
- a system of this disclosure provides a driver with navigation directions using visual landmarks that are likely to be visible at the time when the driver reaches the corresponding geographic location.
- the system selects visual landmarks from a relatively large and redundant set of previously identified visual landmarks. To make the selection, the system can consider one or more of the time of day, the current weather conditions, the current season, etc.
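The selection among redundant landmarks described above might be sketched as a scoring function over per-condition usefulness metrics. This is an illustrative assumption, not the patent's implementation: the metric keys, the neutral 0.5 default, and the equal weighting are all hypothetical.

```python
# Hypothetical sketch: rank candidate visual landmarks by stored
# per-condition usefulness metrics for the current context.
def select_landmarks(candidates, time_of_day, weather, season, top_n=1):
    """Return the top_n candidates whose stored metrics best match
    the current time of day, weather, and season."""
    def score(landmark):
        # A missing metric defaults to a neutral 0.5 (an assumed convention).
        metrics = landmark["metrics"]
        return (metrics.get(("time", time_of_day), 0.5)
                + metrics.get(("weather", weather), 0.5)
                + metrics.get(("season", season), 0.5)) / 3.0
    return sorted(candidates, key=score, reverse=True)[:top_n]

# Example: an illuminated billboard scores well at night, a church by day.
billboard = {"name": "billboard", "metrics": {("time", "night"): 0.9,
                                              ("time", "day"): 0.2}}
church = {"name": "church", "metrics": {("time", "night"): 0.3,
                                        ("time", "day"): 0.8}}
best_at_night = select_landmarks([billboard, church], "night", "clear", "summer")
```

Under this sketch, `best_at_night` would contain the billboard, matching the disclosure's point that the same landmark can be useful at night but not by day.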
- the system can utilize real-time imagery collected by the dashboard camera, the camera of a smartphone mounted on the dashboard, or another camera that approximately corresponds to the vantage point of the driver.
- the system also can use implicit and explicit feedback regarding visibility and/or prominence of physical objects to improve subsequent references to visual landmarks.
- An example embodiment of these techniques is a method for generating navigation directions for drivers, executed by one or more processors.
- the method includes obtaining a route for guiding a driver of a vehicle to a destination, retrieving visual landmarks corresponding to prominent physical objects disposed along the route, obtaining real-time imagery collected at the vehicle approximately from a vantage point of the driver during navigation along the route, and using (i) the retrieved visual landmarks and (ii) the imagery collected at the vehicle, selecting a subset of the visual landmarks that are currently visible to the driver.
- the method further includes providing, to the driver, navigation directions describing the route, the navigation directions referencing the selected subset of the visual landmarks and excluding the remaining visual landmarks.
- Another example embodiment of these techniques is a mobile system operating in a vehicle. The system includes a camera configured to capture real-time imagery approximately from a vantage point of the driver, a positioning module configured to determine a current geographic location of the vehicle, a network interface to communicate with a server system via a communication network, a user interface, and processing hardware configured to (i) obtain, using the captured real-time imagery and the current geographic location of the vehicle, driving directions including an instruction that references a visual landmark automatically determined as being visible in the captured real-time imagery, and (ii) provide the instruction to the driver via the user interface.
- Yet another example embodiment of these techniques is a method in a mobile system operating in a vehicle for providing driving directions.
- the method comprises receiving a request for driving directions to a destination from a driver of the vehicle, receiving real-time imagery collected at the vehicle approximately from a vantage point of the driver, obtaining, using the real-time imagery and a current location of the vehicle, the driving directions including an instruction that references a visual landmark automatically determined as being visible in the real-time imagery, and providing the instruction to the driver in response to the request.
- Still another example embodiment of this technique is a method for generating navigation directions for drivers.
- the method includes obtaining, by one or more processors, a route for guiding a driver of a vehicle to a destination as well as real-time imagery collected at the vehicle approximately from a vantage point of the driver during navigation along the route.
- the method further includes automatically identifying, by the one or more processors, a physical object within the real-time imagery to be used as a visual landmark in navigation, including recognizing at least one of (i) one of a finite set of pre-set objects or (ii) text within the real-time imagery.
- the method includes determining a position of the physical object relative to a point on the route, and providing, to the driver, navigation directions describing the route, the navigation directions including a reference to the identified physical object.
- FIG. 1 is a block diagram of an example computing system that generates navigation directions in view of real-time imagery collected from approximately the user's vantage point, according to one implementation
- FIG. 2 is a flow diagram of an example method for generating navigation directions for drivers using real-time imagery, which can be implemented in the system of FIG. 1 ;
- FIG. 3 is a flow diagram of an example method for adjusting numeric metrics of landmark prominence based on user feedback, which can be implemented in the system of FIG. 1 .
- FIG. 4 is a block diagram that illustrates semantic segmentation of a scene and detecting poses of objects using a machine learning model, which can be implemented in the system of FIG. 1 ;
- FIG. 5 is a flow diagram of an example method for generating navigation directions that include a reference to a physical object not included in the original navigation directions;
- FIG. 6 is a block diagram that schematically illustrates two routing options for reaching a final or intermediate destination, from which the system of FIG. 1 selects in view of the live state of the traffic light, according to one implementation;
- FIG. 7 is a flow diagram of an example method for selecting a navigation option in view of the live state of a traffic signal, which can be implemented in the system of FIG. 1 .
- a system collects real-time imagery from approximately the user's vantage point (e.g., using a dashboard camera, a camera built into the vehicle, the user's smartphone mounted on the dashboard), retrieves a set of visual landmarks for the user's current position along the navigation route, and uses the real-time imagery to determine which of the retrieved visual landmarks should be used to augment step-by-step navigation directions for the navigation route, according to one implementation.
- the system omits visual landmarks that are occluded by trees or vehicles, obscured due to current lighting conditions, or poorly visible from the user's current vantage for some other reason.
- the system can identify dynamic visual landmarks, such as changing electronic billboards or trucks with machine-readable text.
- the system can position the object relative to the next navigation instruction and reference the object in the navigation instruction. For example, the system can modify the instruction “turn left in 200 feet” to “turn left by the red truck.”
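The “red truck” rewriting above could be sketched as simple instruction templating. The instruction format (maneuver followed by “in <distance>”) and the object-description fields are assumptions for illustration only.

```python
# Sketch: replace the distance clause of a maneuver instruction with a
# reference to a dynamically detected object near the maneuver point.
def rewrite_instruction(instruction, detected_object):
    """Turn 'turn left in 200 feet' into 'turn left by the red truck',
    assuming the instruction has the form '<maneuver> in <distance>'."""
    maneuver = instruction.split(" in ")[0]   # e.g. "turn left"
    return f"{maneuver} by the {detected_object['color']} {detected_object['kind']}"

new_text = rewrite_instruction("turn left in 200 feet",
                               {"color": "red", "kind": "truck"})
# new_text == "turn left by the red truck"
```

A production system would presumably use the natural language generation techniques mentioned for the navigation instructions generator 42 rather than string splitting.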
- the system in some scenarios may select a route from among multiple routing options based on live states of traffic lights. For example, the system may determine that the red light at the intersection the driver is approaching makes another routing option more appealing.
- the system can assess the usefulness of a certain visual landmark based on explicit and/or implicit user signals. For example, the driver can indicate that she cannot see a landmark, using a voice command. When it is desirable to collect more information about visual landmarks, the system can present visual landmarks in interrogative sentences, e.g., “do you see the billboard on the left?” As an example of an implicit signal, when drivers tend to miss a turn which the system describes using a visual landmark, the system may flag the visual landmark as not useful. The system can assess usefulness at different times and under weather conditions, so that a certain billboard can be marked as not useful during daytime but useful when illuminated at night. Further, the system can receive signals indicative of current time, weather conditions, etc.
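One way to fold the explicit and implicit signals above into the stored metrics is an exponential moving average per (landmark, condition) pair, so a billboard can end up marked useful at night but not by day. The smoothing factor, neutral prior, and key scheme here are assumptions, not taken from the patent.

```python
# Hypothetical sketch: blend binary feedback ("driver saw it" / "driver
# missed the turn") into a per-condition usefulness metric.
def update_usefulness(metrics, landmark_id, condition, positive, alpha=0.2):
    """Move the stored metric toward 1.0 on positive feedback and toward
    0.0 on negative feedback, using an exponential moving average."""
    key = (landmark_id, condition)
    current = metrics.get(key, 0.5)          # neutral prior (assumption)
    signal = 1.0 if positive else 0.0
    metrics[key] = (1 - alpha) * current + alpha * signal
    return metrics[key]

metrics = {}
update_usefulness(metrics, "billboard-17", "night", positive=True)   # -> 0.6
update_usefulness(metrics, "billboard-17", "day", positive=False)    # -> 0.4
```

Over repeated sessions the same landmark diverges per condition, which matches the daytime/nighttime billboard example in the text.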
- the system can use explicit and/or implicit user feedback to modify subsequent navigation directions even when no real-time video or still photography is available to a driver. For example, the system may be able to determine only that the driver is requesting navigation directions at nighttime, and accordingly provide indications of visual landmarks that have been determined to be visible, or particularly well noticeable, at night.
- the system can use object and/or character recognition techniques to automatically recognize vehicles, billboards, text written on surfaces of various kind, etc. Further, to identify currently visible landmarks within real-time imagery, the system can match features of an image captured with an image previously captured from the location and with the same orientation of the camera (i.e., with the same camera pose) and known to depict a visual landmark.
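The image-matching step described above could be approximated, for illustration, by normalized cross-correlation of a pre-stored landmark view against the captured scene. This is one of many possible approaches: a production system would more likely use local feature descriptors or a learned matcher, and the 0.9 threshold is an assumption.

```python
# Illustrative sketch: decide whether a pre-stored landmark image appears
# in the captured scene via sliding-window normalized cross-correlation.
import numpy as np

def landmark_visible(scene, template, threshold=0.9):
    """Slide the template over the (grayscale) scene and report whether
    any window correlates with it above the threshold."""
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t * t).sum())
    for y in range(scene.shape[0] - th + 1):
        for x in range(scene.shape[1] - tw + 1):
            w = scene[y:y + th, x:x + tw]
            wc = w - w.mean()
            denom = np.sqrt((wc * wc).sum()) * t_norm
            if denom > 0 and (wc * t).sum() / denom >= threshold:
                return True
    return False

# Toy example: embed the "landmark" in an otherwise empty scene.
scene = np.zeros((10, 10))
scene[2:5, 3:6] = np.arange(1, 10).reshape(3, 3)
template = scene[2:5, 3:6].copy()
```

Note this toy matcher assumes the same camera pose for both images, consistent with the text's requirement of matching against imagery captured from the same location and orientation.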
- the system uses a convolutional neural network to implement an object detector which determines whether a captured scene includes an object of one of predefined classes (e.g., car, person, traffic light). Further, the object detector can implement semantic segmentation to label every pixel in the image.
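Downstream of such a detector, the system would keep only confident detections belonging to the predefined classes. The sketch below post-processes detector output; the parallel-lists output format, the class list beyond the three named in the text, and the confidence threshold are assumptions.

```python
# Sketch: filter raw detector output down to confident detections of
# classes the navigation system can reference as landmarks.
PREDEFINED_CLASSES = {"car", "person", "traffic light", "billboard", "truck"}

def filter_detections(labels, scores, min_score=0.6):
    """Keep (label, score) pairs whose class is predefined and whose
    confidence meets the threshold."""
    return [(label, score)
            for label, score in zip(labels, scores)
            if label in PREDEFINED_CLASSES and score >= min_score]

detections = filter_detections(
    ["car", "tree", "traffic light", "person"],
    [0.95, 0.80, 0.55, 0.70])
```

Here the tree is dropped as an unknown class and the traffic light as a low-confidence detection, leaving the car and the person.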
- FIG. 1 illustrates an environment 10 in which at least some of the techniques for selecting salient visual landmarks can be implemented.
- the environment 10 includes a mobile system 12 and a server system 14 interconnected via a communication network 16 .
- the server system 14 in turn can communicate with various databases and, in some implementations, third-party systems such as a live traffic service or a weather service (not shown to avoid clutter).
- a landmark selection system 18 configured to select visual landmarks using real-time imagery and/or time of day, season, weather conditions, etc. can be implemented in the mobile system 12 , the server system 14 , or partially in the mobile system 12 and partially in the server system 14 .
- the mobile system 12 can include a portable electronic device such as a smartphone, a wearable device such as a smartwatch or a head-mounted display, or a tablet computer.
- the mobile system 12 also includes components embedded or mounted in a vehicle.
- a driver of a vehicle equipped with electronic components such as a head unit with a touchscreen or a built-in camera can use her smartphone for navigation.
- the smartphone can connect to the head unit via a short-range communication link such as Bluetooth® to access the sensors of the vehicle and/or to project the navigation directions onto the screen of the head unit.
- the user's smartphone can connect to a standalone dashboard camera mounted on the windshield of the vehicle.
- modules of a portable or wearable user device, modules of a vehicle, and external devices or modules of devices can operate as components of the mobile system 12 .
- a camera 20 can be a standard monocular camera mounted on the dashboard or windshield.
- the driver mounts the smartphone so that the camera of the smartphone faces the road similar to a dashboard camera.
- the vehicle includes a camera or even multiple cameras built into the dashboard or the exterior of the vehicle, and the mobile system 12 accesses these cameras via a standard interface (e.g., USB).
- the camera 20 is configured to collect a digital video stream or capture still photographs at certain intervals.
- the mobile system 12 in some implementations uses multiple cameras to collect redundant imagery in real time.
- One camera may be mounted on the left side of the dashboard and another camera may be mounted on the right side of the dashboard to generate slightly different views of the surroundings, which in some cases may make it easier for the landmark selection system 18 to compare real-time imagery to previously captured images of landmarks.
- the mobile system 12 also can include a processing module 22 , which can include one or more central processing units (CPUs), one or more graphics processing units (GPUs) for efficiently rendering graphics content, an application-specific integrated circuit (ASIC), or any other suitable type of processing hardware.
- the mobile system 12 can include a memory 24 made up of persistent (e.g., a hard disk, a flash drive) and/or non-persistent (e.g., RAM) components.
- the memory 24 stores instructions that implement a navigation application 26 .
- the mobile system 12 further includes a user interface 28 and a network interface 30 .
- the user interface 28 can correspond to the user interface of the portable electronic device or the user interface of the vehicle.
- the user interface 28 can include one or more input components such as a touchscreen, a microphone, a keyboard, etc. as well as one or more output components such as a screen or speaker.
- the network interface 30 can support short-range and/or long-range communications.
- the network interface 30 can support cellular communications, wireless local area network protocols such as IEEE 802.11 (e.g., Wi-Fi), and personal area network protocols such as IEEE 802.15 (Bluetooth).
- the mobile system 12 includes multiple network interface modules to interconnect multiple devices within the mobile system 12 and to connect the mobile system 12 to the network 16 .
- the mobile system 12 can include a smartphone, the head unit of a vehicle, and a camera mounted on the windshield.
- the smartphone and the head unit can communicate using Bluetooth
- the smartphone and the camera can communicate using USB
- the smartphone can communicate with the server 14 via the network 16 using a 4G cellular service, to pass information to and from various components of the mobile system 12 .
- the network interface 30 in some cases can support geopositioning.
- the network interface 30 can support Wi-Fi trilateration.
- the mobile system 12 can include a dedicated positioning module 32 such as a Global Positioning System (GPS) module.
- the mobile system 12 can include various additional components, including redundant components such as positioning modules implemented both in the vehicle and in the smartphone.
- the mobile system 12 can communicate with the server system 14 via the network 16 , which can be a wide-area network such as the Internet.
- the server system 14 can be implemented in one or more server devices, including devices distributed over multiple geographic locations.
- the server system 14 can implement a routing engine 40 , a navigation instructions generator 42 , and a visual landmark selection module 44 .
- the components 40 - 44 can be implemented using any suitable combination of hardware, firmware, and software.
- the server system 14 can access databases such as a map database 50 , a visual landmark database 52 , and a user profile database 54 , which can be implemented using any suitable data storage and access techniques.
- the routing engine 40 can receive a request for navigation directions from the mobile system 12 .
- the request can include a source, a destination, and constraints such as a request to avoid toll roads, for example.
- the routing engine 40 can retrieve road geometry data, road and intersection restrictions (e.g., one-way, no left turn), road type data (e.g., highway, local road), speed limit data, etc. from the map database 50 to generate a route from the source to the destination.
- the routing engine 40 also obtains live traffic data when selecting the best route.
- the routing engine 40 can generate one or several alternate routes.
- the map database 50 can store descriptions of geometry and location indications for various natural geographic features such as rivers, mountains, and forests, as well as artificial geographic features such as buildings and parks.
- the map data can include, among other data, vector graphics data, raster image data, and text data.
- the map database 50 organizes map data into map tiles, which generally correspond to a two-dimensional organization of geospatial data into a traversable data structure such as a quadtree.
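A quadtree tile organization like the one mentioned above can be illustrated with the common web-mercator tiling scheme, where each zoom level quarters the tiles of the level above and a "quadkey" string names the path from the root to a tile. The choice of projection and key encoding is an assumption; the patent does not specify either.

```python
# Sketch of quadtree map tiling: lat/lon -> tile indices -> quadtree key.
import math

def tile_for(lat_deg, lon_deg, zoom):
    """Convert a latitude/longitude to x/y tile indices at a zoom level
    (standard web-mercator slippy-map formulas)."""
    lat = math.radians(lat_deg)
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    return x, y

def quadkey(x, y, zoom):
    """Interleave the bits of x and y into a quadtree path string,
    one digit (0-3) per zoom level from root to leaf."""
    digits = []
    for i in range(zoom, 0, -1):
        mask = 1 << (i - 1)
        digit = (1 if x & mask else 0) + (2 if y & mask else 0)
        digits.append(str(digit))
    return "".join(digits)
```

A shared quadkey prefix then identifies a shared ancestor tile, which is what makes the structure traversable for retrieving map data around a route.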
- the navigation instructions generator 42 can use the one or more routes generated by the routing engine 40 and generate a sequence of navigation instructions. Examples of navigation instructions include “in 500 feet, turn right on Elm St.” and “continue straight for four miles.”
- the navigation instructions generator 42 can implement natural language generation techniques to construct these and similar phrases, in the language of the driver associated with the mobile system 12 .
- the instructions can include text, audio, or both.
- the visual landmark selection module 44 operates as part of the landmark selection system 18 , which also includes the navigation application 26 .
- the visual landmark selection module 44 can augment the navigation directions generated by the navigation instructions generator 42 with references to visual landmarks such as prominent buildings, billboards, traffic lights, stop signs, statues and monuments, and symbols representing businesses.
- the visual landmark selection module 44 initially can access the visual landmark database 52 to select a set of visual landmarks disposed along the navigation route.
- the landmark selection system 18 then can select a subset of these visual landmarks in accordance with the likelihood the driver can actually see the landmarks when driving, and/or dynamically identify visual landmarks that were not previously stored in the visual landmark database 52 .
- the visual landmark database 52 can store information regarding prominent geographic entities that can be visible when driving (or bicycling, walking, or otherwise moving along a navigation route) and thus serve as visual landmarks.
- the visual landmark database 52 can store one or several photographs, geographic coordinates, a textual description, remarks submitted by users, and numeric metrics indicative of usefulness of the visual landmark and/or of a particular image of the visual landmark.
- a landmark-specific record in the visual landmark database 52 stores multiple views of the visual landmark from the same vantage point, i.e., captured from the same location and with the same orientation of the camera. However, the multiple views of the visual landmark can differ according to the time of day, weather conditions, season, etc.
- the data record can include metadata that specifies these parameters for each image.
- the data record may include a photograph of a billboard at night when it is illuminated along with a timestamp indicating when the photograph was captured and another photograph of the billboard at daytime from the same vantage point along with the corresponding timestamp.
- the data record may include photographs of the billboard captured during snowy weather, during rainy weather, during foggy weather, etc., and corresponding indicators for each photograph.
- the data record may include photographs captured during different seasons.
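The landmark-specific record described above, with multiple same-vantage-point views tagged by capture conditions, might take the following shape. All field names and the fallback order are assumptions for illustration, not taken from the patent.

```python
# Hypothetical sketch of a visual landmark record with condition-tagged views.
from dataclasses import dataclass, field

@dataclass
class LandmarkView:
    image_uri: str
    time_of_day: str   # e.g. "day" or "night"
    weather: str       # e.g. "clear", "snow", "rain", "fog"
    season: str

@dataclass
class LandmarkRecord:
    landmark_id: str
    lat: float
    lon: float
    views: list = field(default_factory=list)

    def best_view(self, time_of_day, weather):
        """Prefer a view matching both conditions, then time of day
        alone, then fall back to any available view."""
        for match in (
            lambda v: v.time_of_day == time_of_day and v.weather == weather,
            lambda v: v.time_of_day == time_of_day,
            lambda v: True,
        ):
            for v in self.views:
                if match(v):
                    return v
        return None

record = LandmarkRecord("billboard-17", 41.9, -87.6, [
    LandmarkView("day.jpg", "day", "clear", "summer"),
    LandmarkView("night.jpg", "night", "clear", "summer"),
])
```

The fallback tiers mirror the idea that a view captured under merely similar conditions is still more useful for matching than an arbitrary one.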
- the visual landmark database 52 can store a large set of visual landmarks that in some cases is redundant both in terms of the number of landmarks available for the same maneuver (e.g., a billboard on the right and a church on the left near the same intersection) and in terms of imagery available for the same landmark.
- the landmark selection system 18 can determine which of the redundant landmarks are useful for particular lighting conditions, weather conditions, traffic conditions (as drivers may find it difficult to recognize certain visual landmarks when driving fast), and how well the corresponding scene is visible from the driver's vantage point (as inferred from real-time imagery).
- the visual landmark database 52 can store multiple descriptions of the same landmark, such as “the large glass building,” “the building with a large ‘M’ in front of it,” “the building with international flags,” etc. Operators of the server system 14 and/or users submitting landmark information as part of a crowd-sourcing effort can submit these descriptions, and the server system 14 can determine which description drivers find more helpful using the feedback processing techniques discussed in more detail below.
- the visual landmark database 52 in one example implementation stores an overall numeric metric for a visual landmark that can be used to assess whether the visual landmark should be referenced in navigation directions at all, separate numeric metrics for different times of day, different weather conditions, etc. and/or separate numeric metrics for different images.
- the server system 14 can receive satellite imagery, photographs and videos submitted by various users, street-level imagery collected by cars equipped with specialized panoramic cameras, street and sidewalk imagery collected by pedestrians and bicyclists, etc.
- the visual landmark database 52 can receive descriptions of landmarks from various sources such as operators of the server system 14 and people submitting user-generated content.
- the user profile database 54 can store user preferences regarding the types of visual landmarks they prefer to see. For example, the profile of a certain user can indicate that she prefers billboards as landmarks.
- the landmark selection system 18 can use user preferences as at least one of the factors when selecting visual landmarks from among redundant visual landmarks. In some implementations, the user provides an indication that he or she allows the landmark selection system 18 to utilize this data.
- the camera 20 can capture a scene 60 as a still photograph or a frame in a video feed.
- the scene 60 approximately corresponds to what the driver of the vehicle operating in the mobile system 12 currently sees.
- the landmark selection system 18 can determine that the driver can clearly see the landmark stadium depicted in a pre-stored image 70 , but that the landmark building depicted in a pre-stored image 72 is largely obscured.
- the better visibility of the landmark stadium is at least one of the signals the landmark selection system 18 can use to determine whether to reference the landmark stadium, the landmark building, or both.
- functionality of the landmark selection system 18 can be distributed between the mobile system 12 and the server system 14 in any suitable manner.
- in some cases, the processing capability of the mobile system 12 is insufficient to implement image processing.
- the mobile system 12 accordingly can capture photographs and/or video and provide the captured imagery to the server system 14 , where the visual landmark selection module 44 executes a video processing pipeline.
- the mobile system 12 has sufficient processing capability to implement image matching.
- the server system 14 in this case can provide relevant visual landmark imagery such as the images 70 and 72 to the mobile system 12 , and the navigation application 26 can compare the scene 60 to the images 70 and 72 to detect probable matches.
- the mobile system 12 implements a less constrained image processing pipeline and attempts to automatically recognize in the scene 60 objects of certain pre-defined types such as people, small cars, large cars, trucks, traffic lights, billboards, etc.
- example methods for generating navigation directions using real-time imagery and for adjusting visual landmark metrics are discussed with reference to FIGS. 2 and 3 , respectively, followed by a discussion of example image processing techniques that can be implemented in the system of FIG. 1 .
- Other techniques for selecting visual landmarks from a large, redundant pre-stored set or recognizing visual landmarks currently absent from the pre-stored set are then discussed with reference to the remaining drawings.
- a driver launches a navigation application on her smartphone and requests driving directions to her friends' home.
- She connects her smartphone to the camera mounted on the windshield of her car and starts driving.
- three objects potentially could serve as visual landmarks: a fast-food restaurant with an easily recognizable logo on the right, a bus stop shelter on the left, and a distinctive building on the left just past the intersection.
- the scene as captured by the driver's camera indicates that while the bus stop shelter is visible, the fast-food restaurant and the distinctive building are obscured by trees.
- the navigation application accordingly generates the audio message “turn left at the bus stop you will see on your left” when the driver is approximately 200 feet away from the intersection.
- FIG. 2 is a flow diagram of an example method 100 for generating navigation directions for drivers using real-time imagery as discussed in the example above.
- the method 100 can be implemented in the landmark selection system 18 of FIG. 1 or in another suitable system.
- the method 100 can be implemented as a set of software instructions stored on a non-transitory computer-readable medium and executable by one or more processors, for example.
- the method 100 begins at block 102 , where a route for driving to a certain destination from the current location of the user or from some other location is obtained.
- indications of landmarks corresponding to prominent physical objects disposed along the route are retrieved.
- Each indication can include the coordinates of the corresponding visual landmark and the corresponding pre-stored imagery (e.g., photographs or a video sequence of a short fixed duration).
- visual landmarks can be retrieved for the entire route or for a portion of the route, e.g., for the current location of the user. In a sense, these visual landmarks are only candidate visual landmarks for the current navigation session, and it can be determined that some or all of these visual landmarks are not visible (or, as discussed above, some currently visible visual landmarks may not be selected when better candidates are available).
- real-time imagery is collected at the vehicle approximately from the vantage point of the driver.
- the real-time imagery can be one or several still photographs defining a scene.
- feature comparison or recognition is more reliable when a video stream rather than a single photograph is available, and thus the real-time imagery defining the scene also can be a video feed of a certain duration (e.g., 0.5 sec).
- the real-time imagery of the scene then is processed at block 108 .
- the collected real-time imagery then can be uploaded to a network server.
- the real-time imagery can be processed at a mobile system such as the user's smartphone or the head unit of the vehicle.
- the mobile system 12 can receive a representative image of a visual landmark and locally process the real-time imagery using the processing module 22 to determine whether this candidate visual landmark is visible in the real-time imagery.
- processing of the real-time imagery can be distributed between the mobile system and the server system.
- the processing at block 108 can include comparing the captured scene to the pre-stored imagery of the landmarks obtained at block 106 .
- the processing can produce an indication of which of the visual landmarks identified at block 104 can be identified in the captured scene, and thus probably are visible to the driver.
- navigation directions referencing the one or more visible visual landmarks are provided to the driver, whereas the visual landmarks identified at block 104 but not located within the scene captured at block 106 are omitted.
- the instructions can include text to be displayed on the driver's smartphone or projected via the head unit and/or audio announcements, for example.
- a pre-stored image of a visual landmark referenced in the directions can be downloaded from the visual landmark database 52 to the mobile system 12 and displayed in the projected mode on the head unit of the vehicle, so that the user can glance at the display and see to which visual landmark the directions refer.
- the method 100 completes after block 110 .
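The flow of blocks 104-110 can be sketched as follows; the `Landmark` class, the feature-set matcher, and all names here are hypothetical stand-ins for whatever image-matching machinery an implementation actually uses:

```python
from dataclasses import dataclass

@dataclass
class Landmark:
    name: str
    image_id: str  # stands in for the pre-stored imagery retrieved at block 104

def select_visible_landmarks(candidates, scene_features, match):
    # Block 108: keep only the candidates whose pre-stored imagery
    # can be located in the scene captured at block 106.
    return [lm for lm in candidates if match(lm.image_id, scene_features)]

# Candidate set for the intersection in the example above.
candidates = [Landmark("fast-food restaurant", "img_ff"),
              Landmark("bus stop shelter", "img_bus"),
              Landmark("distinctive building", "img_bldg")]

# Toy scene: trees obscure everything except the bus stop shelter.
scene_features = {"img_bus"}
visible = select_visible_landmarks(candidates, scene_features,
                                   lambda img, scene: img in scene)

# Block 110: reference only the visible landmark in the instruction.
instruction = f"turn left at the {visible[0].name} you will see on your left"
```

In a real system the lambda would be replaced by image feature comparison against the captured scene; the filtering structure itself is what this sketch illustrates.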
- the system implementing the method 100 uses real-time imagery as a filter applied to the redundant set of visual landmarks.
- the visual landmarks can be further filtered based on other signals. Some of these signals, including the signals based on user feedback, are discussed below.
- if the landmark selection system 18 determines that the landmark of the image 70 is probably visible to the driver and that the landmark of the image 72 is probably not visible to the driver, and accordingly references the landmark of the image 70 in the navigation directions, the driver can provide an indication of whether the landmark of the image 70 was in fact helpful. Further, the landmark selection system 18 in some cases is not equipped with the camera 20 or fails to obtain real-time imagery at the vehicle for some reason (the landmark selection system 18 then can select the visual landmarks based on other signals). In these cases, the driver still can provide feedback regarding the quality of the visual landmarks referenced in the navigation directions. In other words, the landmark selection system 18 can collect driver feedback regardless of its capacity to process real-time imagery.
- an example method 150 for requesting and processing user feedback is discussed below with reference to the landmark selection system 18 , in which it can be implemented.
- the method 150 in general can be implemented in any suitable system, including navigation systems that receive navigation directions via a network connection, navigation systems built into vehicles and storing landmark data along with map data on a hard disk or other storage device, standalone navigation systems with pre-stored landmark and map databases, etc. It is noted further that the method 150 can be implemented in systems configured to receive real-time imagery as well as systems that are not configured to receive real-time imagery.
- the method 150 begins at block 152 .
- the landmark selection system 18 can select a visual landmark for a certain location and maneuver, during navigation.
- the landmark selection system 18 can provide an indication of the visual landmark to the driver at block 154 , and provide a prompt regarding this visual landmark at block 156 so as to assess the quality of the suggestion.
- the indication can be “after you pass the statue of a bull, turn right on Financial Pl.”
- the follow-up yes/no prompt at block 156 can be “did you see the statue of a bull?”
- the landmark selection system 18 does not generate a follow-up prompt every time the visual landmark is referenced but rather at a certain relatively low rate, such as once per hundred references to the visual landmarks. Additionally or alternatively, the landmark selection system 18 can collect implicit user feedback by determining whether the user successfully completed the maneuver or missed the turn. Thus, if the prompt above is provided to one hundred drivers over a certain period of time, and only 85% of the drivers turn right on Financial Pl., the landmark selection system 18 can infer that the statue of a bull may not be a reliable visual landmark for this maneuver.
- the landmark selection system 18 can utilize any suitable statistical technique to assess the probability of recognizing visual landmarks.
- the landmark selection system 18 can format the reference to the visual landmark at block 154 as a question.
- the navigation application can generate the question “do you see the statue of a bull on your right?” If the driver answers in the affirmative, the landmark selection system 18 can immediately provide the complete instruction “after you pass the statue of a bull, turn right on Financial Pl.” Otherwise, the landmark selection system 18 can select the next visual landmark, when available, and generate the next question.
- if the feedback indicates that the driver found the visual landmark helpful, the flow proceeds to block 160. Otherwise, the flow proceeds to block 162.
- the landmark selection system 18 can adjust the numeric metric for the visual landmark upward to indicate an instance of success.
- the landmark selection system 18 can adjust the numeric metric for the visual landmark downward to indicate an instance of failure. Further, depending on the implementation, the landmark selection system 18 can adjust the metric for a particular time of day, particular weather, particular season, particular lighting conditions, etc.
- the landmark selection system 18 can also adjust the probability of selecting other landmarks that belong to the same type (or images of landmarks of a certain type). For example, if it is determined at block 158 that the driver found a certain billboard to be a useful landmark, the probability of preferring billboards to other types of landmarks can increase.
- the flow proceeds to block 166 , where the next maneuver is selected. The flow then returns to block 152 , where a set of visual landmarks is selected for the new maneuver and the location of the driver.
- the landmark selection system 18 can utilize explicit and/or implicit driver feedback to determine which visual landmarks are more likely to be useful for the remainder of the navigation session, and which visual landmarks are likely to be useful to other drivers in the future.
- the overall accuracy of assessing usefulness of visual landmarks is expected to increase when the method 150 is executed for a large number of navigation sessions, and for a large number of drivers.
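One minimal way to maintain the per-landmark numeric metric of blocks 160 and 162 is a smoothed success ratio; the class below is an illustrative sketch under that assumption, not a specified implementation, and could be extended with separate counters per time of day, weather, or season as discussed above:

```python
class LandmarkMetric:
    """Tracks how often drivers confirm a visual landmark as helpful."""
    def __init__(self):
        self.successes = 0
        self.trials = 0

    def record(self, helpful: bool):
        # Block 160 (adjust upward on success) or block 162 (downward on failure).
        self.trials += 1
        if helpful:
            self.successes += 1

    def score(self) -> float:
        # Laplace smoothing: an unseen landmark starts at 0.5 rather than 0 or 1.
        return (self.successes + 1) / (self.trials + 2)

# 85 of 100 drivers reported seeing the statue of a bull.
metric = LandmarkMetric()
for helpful in [True] * 85 + [False] * 15:
    metric.record(helpful)
```

The resulting score can then feed into whatever statistical technique the system uses to rank candidate landmarks.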
- the method 150 can be extended to other types of navigation directions or geographic suggestions.
- a navigation system can use the method 150 to determine whether a certain reference to a street name is a reliable reference in navigation directions. Because street signs may be missing or poorly lit, and because some street and road information may be out of date, the navigation system can format certain directions as questions (e.g., “Do you see Elm St. 300 feet ahead?”), receive explicit feedback when the user chooses to comment on the previously provided directions (e.g., “In 300 feet, turn right on Elm St.”—“I cannot see Elm St.”), and/or collect implicit feedback (e.g., missed turn, sudden deceleration prior to the turn).
- the devices illustrated in FIG. 1 can use explicit and implicit driver feedback to identify easy-to-miss turns.
- the server system 14 can detect the tendencies of drivers to miss turns, quickly brake before upcoming turns, or otherwise not maneuver according to the instructions provided as part of the navigation directions. For example, if a certain percentage of the drivers miss the turn or appear to almost miss the turn by quickly changing their speed, the server system 14 can determine that the turn is an easy-to-miss turn. As discussed above, this percentage also can mean that the visual landmark referenced in the corresponding instruction may not be reliable.
- the navigation instruction generator 42 can automatically provide a warning to the driver, such as “slow down here, the next turn is easy to miss.” Further, the difficulty of the maneuver may indicate to the landmark selection system 18 that it should attempt to identify a suitable dynamic visual landmark, especially when no permanent visual landmarks are available. Dynamic visual landmarks are discussed in more detail below.
- the landmark selection system 18 compares the captured real-time imagery to pre-stored images to detect a match or absence of a match.
- the visual landmark database 52 of FIG. 1 can store images of the landmark depicted in the image 70 captured from various locations and with various orientations of the camera, i.e., camera poses. These images can be, for example, street-level images collected by a specialized vehicle and annotated to select only those pixels or portions of each image that depict the visual landmark. The annotation may be conducted manually, for example.
- a positioning module operating in the mobile system 12 determines the location from which the scene 60 was captured.
- the landmark selection system 18 then can retrieve those images of the landmarks depicted in the images 70 and 72 that match the pose of the camera 20 at the time of capture.
- the visual landmark database 52 can store numerous photographs of the stadium depicted in FIG. 1 , and the landmark selection system 18 can select one or several photographs from among these numerous photographs based on the camera pose and then determine whether the stadium is depicted in the scene 60 .
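Pose-based retrieval of the best pre-stored photograph can be approximated with a simple distance over (latitude, longitude, heading) tuples; the weighting below is arbitrary and merely illustrative of selecting a stored image whose camera pose matches the pose of the camera 20:

```python
import math

def pose_distance(a, b):
    """Crude distance between two camera poses (lat, lng, heading_degrees)."""
    dlat, dlng = a[0] - b[0], a[1] - b[1]
    # Headings wrap around at 360 degrees.
    dheading = abs(a[2] - b[2]) % 360
    dheading = min(dheading, 360 - dheading)
    # Scale lat/lng degrees up so position dominates small heading differences.
    return math.hypot(dlat, dlng) * 1e5 + dheading

def closest_pose_image(stored_images, camera_pose):
    """Select the pre-stored landmark photograph captured from the pose
    closest to the live capture pose."""
    return min(stored_images,
               key=lambda img: pose_distance(img["pose"], camera_pose))

# Hypothetical stadium photographs annotated with their capture poses.
stadium_images = [
    {"id": "north_gate", "pose": (41.8623, -87.6167, 180.0)},
    {"id": "south_gate", "pose": (41.8601, -87.6167, 0.0)},
]
best = closest_pose_image(stadium_images, (41.8600, -87.6165, 5.0))
```

A production system would use a proper geodesic distance and likely index stored images by pose; the point here is only that the live pose narrows numerous stored photographs down to the few worth comparing against the scene 60.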
- the landmark selection system 18 seeks to determine the presence or absence of a specified visual landmark.
- the landmark selection system 18 implements less constrained image processing.
- FIG. 4 illustrates the scene 60 along with a model 200 that positions automatically recognized entities such as cars and people in two- or three-dimensional space.
- the landmark selection system 18 can rely on models of certain types or classes of objects to identify presence or absence of objects of these types in the scene 60 using a deep-learning technique such as building a convolutional neural network (CNN), for example.
- these techniques can spatially localize hundreds of classes of objects in a relatively short time (e.g., 100 ms per image).
- the CNN can be trained using such datasets annotated with metadata as, for example, CityScapes available at www.cityscapes-dataset.com.
- the landmark selection system 18 generates bounding boxes 202 , 204 , 206 and 208 with respective confidence scores.
- the bounding boxes 202 , 204 and 206 correspond to vehicles of respective types, and the bounding box 208 corresponds to a standing person.
- the landmark selection system 18 then places the identified objects within the geographic model 200 of the corresponding area. Moreover, the landmark selection system 18 can determine the spatial orientation of these objects.
- the bounding boxes 212 - 218 enclose models of the corresponding object types.
- the bounding box 212 encloses a sample object of type “sports utility vehicle,” the bounding box 214 encloses a sample object of type “mid-size car,” the bounding box 216 encloses a sample object of type “sports car,” and bounding box 218 encloses a sample object of type “standing adult person.”
- types of objects can include bicycles, buses, billboards, traffic lights, certain chain store logos, etc.
- the landmark selection system 18 can align the objects identified in the scene 60 with these and other types of objects and determine the positions of these objects relative to static geographic features such as buildings with known coordinates, etc. In this manner, the landmark selection system 18 can describe the position of an identified object relative to static geographic features and generate navigation instructions of the type “turn where the sports car is now turning.”
- the landmark selection system 18 also can process color characteristics of the identified objects. Thus, the instruction above can become “turn where the red sports car is now turning,” which may be more helpful to the driver. Further, the landmark selection system 18 can be configured to recognize alphanumeric characters and generate such instructions as “keep going past the sign that says ‘car wash,’” when the camera captures an image of a person holding up a temporary car wash sign.
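Composing an instruction from a detected object type, its dominant color, and any recognized characters can be as simple as string assembly; the detection dictionary format here is assumed for illustration, not specified by the disclosure:

```python
def describe_detection(det: dict) -> str:
    """Build a spoken-style phrase from an object detection result."""
    parts = []
    if det.get("color"):
        parts.append(det["color"])       # dominant color, if analyzed
    parts.append(det["type"])            # object class from the detector
    phrase = " ".join(parts)
    if det.get("text"):
        # Recognized alphanumeric characters, e.g., from a hand-held sign.
        phrase += f" that says '{det['text']}'"
    return phrase

turn_instruction = (
    f"turn where the "
    f"{describe_detection({'type': 'sports car', 'color': 'red'})} "
    f"is now turning")
sign_instruction = (
    f"keep going past the "
    f"{describe_detection({'type': 'sign', 'text': 'car wash'})}")
```

The detector and character recognizer supplying `det` would be the CNN-based pipeline and OCR discussed above.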
- the landmark selection system 18 labels every pixel in the scene 60 in accordance with semantic segmentation techniques.
- semantic segmentation can produce an indication of where the sidewalk, the road, and the trees are located.
- a more robust image processing pipeline generally is required to conduct semantic segmentation, but using semantic segmentation the landmark selection system 18 can identify additional landmarks and/or generate better explanations of where visual landmarks are located. For example, the navigation instruction “turn right after you see a large yellow billboard” can be improved to “turn right after you see a large yellow billboard on the sidewalk.”
- the landmark selection system 18 can use the image processing techniques discussed above both to determine the presence or absence of pre-selected objects in captured real-time imagery and to dynamically identify objects in the real-time imagery that can work as visual landmarks, even where no information for such objects was stored in the visual landmarks database 52 .
- These dynamic visual landmarks typically are transient (e.g., a bus stopped at the corner, a truck parked in front of a convenience store, a bicyclist in a yellow shirt turning left), in which case the landmark selection system 18 can limit the use of these dynamic visual landmarks to the current navigation instructions only.
- the landmark selection system 18 in a similar fashion can identify new permanent landmarks that were missing from the visual landmark database 52 .
- no information about a recently installed billboard may be stored in the visual landmark database 52 , and the landmark selection system 18 in some cases can identify a potentially permanent landmark and automatically submit the corresponding image to the server system 14 , which in response may create a new record in the visual landmark database 52 .
- FIG. 5 illustrates an example method 300 for identifying prominent objects within a captured scene, which can be implemented in the system of FIG. 1 .
- the method 300 is discussed with reference to landmark selection system 18 , but it is noted that the method 300 can be implemented in any suitable system.
- the landmark selection system 18 can determine a route for guiding a driver to a destination.
- the route can include a graph traversing several road segments, and the corresponding navigation directions can include a sequence of descriptions of maneuvers.
- the navigation directions can be generated at the server system 14 and provided to the mobile system 12 in relevant portions.
- the landmark selection system 18 can receive real-time imagery for a scene, collected at a certain location of the vehicle.
- the real-time imagery is collected when the vehicle approaches the location of the next maneuver.
- the camera pose for the captured imagery approximately corresponds to the vantage point of the driver.
- the real-time imagery can be geographically tagged, i.e., include an indication of the location where the real-time imagery was captured.
- the landmark selection system 18 can identify objects of certain pre-defined types within the captured scene. As discussed above, this identification can be based on training data and can include semantic image segmentation. In some cases, the identification is based on the presence of letters, numbers, and other alphanumeric characters. To this end, the landmark selection system 18 can implement any suitable character recognition technique. Moreover, the landmark selection system 18 may implement both object identification and character recognition to identify objects of pre-defined types with alphanumeric characters.
- the landmark selection system 18 can determine which of the detected objects appear prominently within the scene. Referring back to FIG. 4 , not every object within the bounding boxes 202 - 208 is necessarily noticeable to a human observer. In other words, to generate useful dynamic visual landmarks, it is often insufficient for the landmark selection system 18 to simply identify objects. The landmark selection system 18 accordingly can assess the prominence of visual landmarks relative to the rest of the scene based on the difference in color, for example. More particularly, the landmark selection system 18 can determine that the car enclosed by the box 206 is bright red, and that the rest of the scene 60 lacks bright patches of color. The car enclosed by the box 206 thus can be determined to be a potentially useful visual landmark.
- the landmark selection system 18 can identify several buildings within a scene, determine that the buildings are disposed at a similar distance from the vehicle, and determine that one of the buildings is significantly larger than the other buildings.
- the landmark selection system 18 can use any number of suitable criteria of prominence, such as shape, presence of alphanumeric characters, etc.
- the landmark selection system 18 can determine the positions of the one or more prominent objects relative to the current location of the vehicle and/or to the locations of road intersections and other geographic waypoints, in a two- or three-dimensional coordinate system. Where relevant, the landmark selection system 18 can also determine the orientation of the prominent object. Referring back to FIG. 4 , after the sports car enclosed by the box 206 is identified as a prominent feature, the landmark selection system 18 can determine the location and orientation of the sports car relative to the streets.
- the landmark selection system 18 can include in the navigation directions a reference to the one or more prominent objects identified at block 306 . As discussed above, the landmark selection system 18 can generate such instructions as “turn left on Main. St., where the red sports car is turning” or “turn right on Central St. after the blue billboard.” The instructions can include any suitable combination of text and multimedia.
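The color-based prominence check of method 300 can be sketched as a Euclidean distance in RGB space between an object's mean color and the scene's mean color; the threshold value is an arbitrary assumption, and a real system could combine this with the size, shape, and character-presence criteria mentioned above:

```python
def color_prominence(obj_rgb, scene_rgb):
    """Distance between an object's average color and the scene's average color."""
    return sum((a - b) ** 2 for a, b in zip(obj_rgb, scene_rgb)) ** 0.5

def prominent_objects(objects, scene_rgb, threshold=100.0):
    """Keep detections that stand out from the scene, most prominent first."""
    scored = sorted(objects,
                    key=lambda o: color_prominence(o["rgb"], scene_rgb),
                    reverse=True)
    return [o for o in scored
            if color_prominence(o["rgb"], scene_rgb) >= threshold]

scene_mean = (110, 110, 110)  # mostly gray street scene
detections = [
    {"type": "sports car", "rgb": (230, 30, 40)},      # bright red (box 206)
    {"type": "mid-size car", "rgb": (120, 118, 115)},  # blends into the scene
]
salient = prominent_objects(detections, scene_mean)
```

Only the bright red car survives the filter, matching the FIG. 4 discussion in which the car in box 206 becomes the dynamic visual landmark.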
- the car 400 approaches an intersection 402 via Elm St., en route to a destination or intermediate waypoint 404 .
- a traffic light 404 includes a left-turn arrow indicator.
- prior to the car 400 reaching the intersection 402 , the routing engine 40 (see FIG. 1 ) may have determined that the route 410 is faster than the route 412 .
- the routing engine 40 may have applied routing algorithms based on graph theory and additionally considered live traffic data for the potentially relevant portions of the route.
- the landmark selection system 18 can analyze the scene to identify and properly classify a visual landmark, the traffic light 404 .
- the landmark selection system 18 can determine that the traffic light 404 is currently displaying a green arrow, and in response the routing engine 40 can re-evaluate the routing options and determine that the route 412 has become a better option.
- the navigation instructions generator can provide an updated notification advising the driver to turn left at the intersection 402 .
- when the landmark selection system 18 analyzes the scene and determines that the traffic light 404 is green, the routing engine 40 can confirm that the route 410 remains the better option. It is noted that in many cases, the current state of a traffic light cannot be obtained from other sources such as real-time database servers, or can be obtained with such difficulties that the approach becomes impractical.
- FIG. 7 depicts a flow diagram of an example method 450 for selecting a navigation option in view of the live state of a traffic signal, which can be implemented in the devices illustrated in FIG. 1 or any other suitable system.
- the method 450 begins at block 452 , where two or more routing options for reaching a certain intermediate point along the route or the endpoint of the route, from a certain location controlled by a traffic light, are identified.
- the current state of the traffic light is determined using real-time imagery captured at the vehicle approaching the location. If the traffic light is determined to be displaying the green arrow, the flow proceeds to block 460 , where the first routing option is selected. Otherwise, if the traffic light is determined to not be displaying the green arrow, the flow proceeds to block 462 , and the second routing option is selected. The corresponding navigation instruction then is provided to the user at block 464 .
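Blocks 454-462 of method 450 reduce to a conditional on the detected light state; the state strings below are assumed labels from whatever classifier processes the real-time imagery:

```python
def choose_route(light_state: str, arrow_route: str, default_route: str) -> str:
    """Blocks 454-462: select a routing option based on the live state of
    the traffic light detected in the vehicle's real-time imagery."""
    if light_state == "green_arrow":
        return arrow_route    # block 460: the protected turn is available now
    return default_route      # block 462: fall back to the other option

# In the FIG. 6 example: a green arrow makes the left-turn route 412 preferable.
chosen = choose_route("green_arrow", "route 412", "route 410")
```

Block 464 would then hand the chosen option to the navigation instruction generator for announcement to the driver.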
- the components of the landmark selection system 18 can use real-time imagery to improve lane guidance.
- positioning solutions such as GPS or Wi-Fi triangulation cannot yield a position fix precise enough to determine in which lane the vehicle is currently located.
- the landmark selection system 18 can recognize lane markings (e.g., white and yellow divider strips), arrows and highway signs painted on the road, the dimensionality of lanes based on detected boundaries of the sidewalk, the presence of other vehicles from which the existence of other lanes can be inferred, etc.
- the camera 20 of FIG. 1 can be positioned so as to capture the road immediately ahead of the vehicle.
- the captured imagery can include a first solid single white line on the left, a solid double yellow line to the right of the first white line, a dashed white line to the right of the double yellow line, and a second single white line on the right.
- the navigation application 26 can process the imagery (locally or by uploading the imagery to the server system 14 ) to determine, using the knowledge that the vehicle currently is in a geographic region where people drive on the right, that the road includes two lanes in the current direction of travel and one lane in the opposite direction. The navigation application 26 then can process the geometry of the detected lines to determine the current position of the vehicle relative to the lanes.
- the camera 20 may be mounted at a certain precise location, so that the navigation application 26 can account for the geometry of the vehicle (e.g., the navigation application 26 may be provisioned to assume that the camera is two feet above ground level, 30 inches away from the left edge of the vehicle and 40 inches away from the right edge of the vehicle). Additionally or alternatively, the camera 20 may be mounted so as to capture the front exterior corners of the vehicle to determine where the corners are located relative to the white and yellow lines on the road.
- the navigation application 26 can provide lane-specific guidance. For example, the navigation application 26 can guide the driver to avoid left-turn-only or right-turn-only lanes when the vehicle needs to travel straight, generate more relevant warnings regarding merging left or right, warn the driver when he or she is in a lane that is about to end, etc.
- the navigation application 26 and/or the navigation instructions generator 42 can also use lane data available in the map database 50 .
- the navigation application 26 can receive an indication that the vehicle is currently traveling in a three-lane road segment, based on the most recent GPS or Wi-Fi positioning fix. Using this information along with real-time imagery, the navigation application 26 can determine in which lane the vehicle is travelling and generate appropriate instructions when necessary.
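The lane-inference step can be sketched as follows: given detected road lines ordered left to right, the double yellow line separates opposing traffic in right-hand-drive regions, and each gap between consecutive lines is a lane. The line labels and their ordering are assumed detector outputs:

```python
def infer_lanes(lines):
    """Infer lane counts from detected road lines, ordered left to right,
    assuming a region where people drive on the right.

    Example input: ["solid_white", "double_yellow", "dashed_white", "solid_white"]
    """
    divider = lines.index("double_yellow")
    opposing = divider                         # gaps left of the divider
    same_direction = len(lines) - 1 - divider  # gaps right of the divider
    return same_direction, opposing

# The four-line example from the text above.
lanes_here, lanes_opposite = infer_lanes(
    ["solid_white", "double_yellow", "dashed_white", "solid_white"])
```

For the captured imagery described above, this yields two lanes in the current direction of travel and one in the opposite direction; the geometry of the detected lines would then pinpoint which of the two lanes the vehicle occupies.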
- the navigation application 26 can use the imagery captured by the camera 20 to automatically generate warnings regarding potential traffic violations. For example, drivers have been observed making an illegal right-on-red turn onto Shoreline Blvd. from US 101 North in Mountain View, Calif. It is believed that many drivers simply do not notice the “no right on red” sign. While the map database 50 can store an indication that the right turn on red is not allowed at this road junction, preemptively generating a warning whenever the driver is about to turn onto Shoreline Blvd. can be distracting and unnecessary, as the driver may be turning right on green.
- the landmark selection system 18 can process the state of the traffic light as discussed above when the driver enters the ramp.
- when the state of the traffic light is determined to be red, and when the driver appears to start moving based on the positioning data or vehicle sensor data, the landmark selection system 18 can automatically provide an instruction such as “no right on red here!,” for example.
- the landmark selection system 18 also can consider statistical indicators for the road junction, when available. For example, an operator can manually provision the server system 14 with an indication that this particular Shoreline Blvd exit is associated with frequent traffic violations. These indications also can be user-generated.
- the landmark selection system 18 also can process and interpret the “no right on red” sign prior to generating the warning.
- the map database 50 may not have specific turn restriction data for a certain residential area.
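The warning logic of this example can be sketched as a guard over three signals: the detected light state, vehicle motion, and a turn restriction known either from a sign recognized in the scene or from the map database. All names are hypothetical:

```python
def red_turn_warning(light_state: str, vehicle_moving: bool,
                     sign_detected: bool, map_says_restricted: bool):
    """Warn only when the driver appears about to make an illegal
    right-on-red turn; stay silent on green to avoid distracting the driver."""
    # The restriction may come from the map database 50 or, where the map has
    # no turn restriction data, from reading the sign in the real-time imagery.
    restriction_known = sign_detected or map_says_restricted
    if light_state == "red" and vehicle_moving and restriction_known:
        return "no right on red here!"
    return None  # no warning needed

warning = red_turn_warning("red", True,
                           sign_detected=True, map_says_restricted=False)
```

Gating the warning on the live light state is what keeps it from firing when the driver is lawfully turning right on green.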
- processors may be temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors may constitute processor-implemented modules that operate to perform one or more operations or functions.
- the modules referred to herein may, in some example embodiments, comprise processor-implemented modules.
- the methods or routines described herein may be at least partially processor-implemented. For example, at least some of the operations of a method may be performed by one or more processors or processor-implemented hardware modules. The performance of certain of the operations may be distributed among the one or more processors, not only residing within a single machine, but deployed across a number of machines. In some example embodiments, the processor or processors may be located in a single location (e.g., within a home environment, an office environment or as a server farm), while in other embodiments the processors may be distributed across a number of locations.
- the one or more processors may also operate to support performance of the relevant operations in a cloud computing environment or as a software as a service (SaaS). For example, at least some of the operations may be performed by a group of computers (as examples of machines including processors), these operations being accessible via a network (e.g., the Internet) and via one or more appropriate interfaces (e.g., application program interfaces (APIs)).
Description
- This application is a continuation of U.S. patent application Ser. No. 15/144,300, filed May 2, 2016; the disclosure of which is incorporated herein by reference in its entirety for all purposes.
- The present disclosure relates to navigation directions and, in particular, to using imagery in navigation directions.
- The background description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent it is described in this background section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present disclosure.
- Systems that automatically route drivers between geographic locations generally utilize indications of distance, street names, and building numbers to generate navigation directions based on the route. For example, these systems can provide to a driver such instructions as “proceed for one-fourth of a mile, then turn right onto Maple Street.” However, it can be difficult for drivers to accurately judge distance, and it is not always easy for drivers to see street signs. Moreover, there are geographic areas where street and road signage is poor.
- To provide guidance to a driver that is more similar to what another person may say to the driver, it is possible to augment navigation directions with references to prominent objects along the route, such as visually salient buildings or billboards. These prominent objects can be referred to as “visual landmarks.” Thus, a system can generate such navigation directions as “in one fourth of a mile, you will see a McDonald's restaurant on your right; make the next right turn onto Maple Street.” To this end, an operator can enter descriptions and indications of locations (e.g., street addresses, coordinates) for visual landmarks, so that the system can automatically select suitable visual landmarks when generating navigation directions.
- However, not every landmark is visible at all times. For example, some billboards may be brightly illuminated at night but may be generally unnoticeable during the day. On the other hand, an intricate façade of a building may be easy to notice during the day but may be poorly illuminated and accordingly unnoticeable at night.
- Generally speaking, a system of this disclosure provides a driver with navigation directions using visual landmarks that are likely to be visible at the time when the driver reaches the corresponding geographic location. In one implementation, the system selects visual landmarks from a relatively large and redundant set of previously identified visual landmarks. To make the selection, the system can consider one or more of the time of day, the current weather conditions, the current season, etc. Moreover, the system can utilize real-time imagery collected by the dashboard camera, the camera of a smartphone mounted on the dashboard, or another camera that approximately corresponds to the vantage point of the driver. As discussed in more detail below, the system also can use implicit and explicit feedback regarding visibility and/or prominence of physical objects to improve subsequent references to visual landmarks.
- An example embodiment of these techniques is a method for generating navigation directions for drivers, executed by one or more processors. The method includes obtaining a route for guiding a driver of a vehicle to a destination, retrieving visual landmarks corresponding to prominent physical objects disposed along the route, obtaining real-time imagery collected at the vehicle approximately from a vantage point of the driver during navigation along the route, and using (i) the retrieved visual landmarks and (ii) the imagery collected at the vehicle, selecting a subset of the visual landmarks that are currently visible to the driver. The method further includes providing, to the driver, navigation directions describing the route, the navigation directions referencing the selected subset of the visual landmarks and excluding the remaining visual landmarks.
- Another example embodiment of these techniques is a system operating in a vehicle. The system includes a camera configured to capture real-time imagery approximately from a vantage point of the driver, a positioning module configured to determine a current geographic location of the vehicle, a network interface to communicate with a server system via a communication network, a user interface, and processing hardware configured to (i) obtain, using the captured real-time imagery and the current geographic location of the vehicle, driving directions including an instruction that references a visual landmark automatically determined as being visible in the captured real-time imagery, and (ii) provide the instruction to the driver via the user interface.
- Yet another example embodiment of these techniques is a method in a mobile system operating in a vehicle for providing driving directions. The method comprises receiving a request for driving directions to a destination from a driver of the vehicle, receiving real-time imagery collected at the vehicle approximately from a vantage point of the driver, obtaining, using the real-time imagery and a current location of the vehicle, the driving directions including an instruction that references a visual landmark automatically determined as being visible in the real-time imagery, and providing the instruction to the driver in response to the request.
- Still another example embodiment of this technique is a method for generating navigation directions for drivers. The method includes obtaining, by one or more processors, a route for guiding a driver of a vehicle to a destination as well as real-time imagery collected at the vehicle approximately from a vantage point of the driver during navigation along the route. The method further includes automatically identifying, by the one or more processors, a physical object within the real-time imagery to be used as a visual landmark in navigation, including recognizing at least one of (i) one of a finite set of pre-set objects or (ii) text within the real-time imagery. Further, the method includes determining a position of the physical object relative to a point on the route, and providing, to the driver, navigation directions describing the route, the navigation directions including a reference to the identified physical object.
- FIG. 1 is a block diagram of an example computing system that generates navigation directions in view of real-time imagery collected from approximately the user's vantage point, according to one implementation;
- FIG. 2 is a flow diagram of an example method for generating navigation directions for drivers using real-time imagery, which can be implemented in the system of FIG. 1;
- FIG. 3 is a flow diagram of an example method for adjusting numeric metrics of landmark prominence based on user feedback, which can be implemented in the system of FIG. 1;
- FIG. 4 is a block diagram that illustrates semantic segmentation of a scene and detecting poses of objects using a machine learning model, which can be implemented in the system of FIG. 1;
- FIG. 5 is a flow diagram of an example method for generating navigation directions that include a reference to a physical object not included in the original navigation directions;
- FIG. 6 is a block diagram that schematically illustrates two routing options for reaching a final or intermediate destination, from which the system of FIG. 1 selects in view of the live state of the traffic light, according to one implementation; and
- FIG. 7 is a flow diagram of an example method for selecting a navigation option in view of the live state of a traffic signal, which can be implemented in the system of FIG. 1.
- To better guide a driver along a navigation route, a system collects real-time imagery from approximately the user's vantage point (e.g., using a dashboard camera, a camera built into the vehicle, or the user's smartphone mounted on the dashboard), retrieves a set of visual landmarks for the user's current position along the navigation route, and uses the real-time imagery to determine which of the retrieved visual landmarks should be used to augment step-by-step navigation directions for the navigation route, according to one implementation. In this manner, the system omits visual landmarks that are occluded by trees or vehicles, obscured due to current lighting conditions, or poorly visible from the user's current vantage point for some other reason.
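The filtering step described in the preceding paragraph (keeping only the retrieved landmarks that can be matched in the live scene) might be sketched as follows. The `scene_match_score` callable stands in for the actual image-comparison routine and is hypothetical:

```python
def filter_visible(candidate_landmarks, scene_match_score, threshold=0.5):
    """Keep only landmarks whose pre-stored imagery matches the live scene.

    scene_match_score(landmark) is assumed to return a similarity in [0, 1]
    between the landmark's reference image and the captured scene.
    """
    return [lm for lm in candidate_landmarks
            if scene_match_score(lm) >= threshold]

# Toy matcher standing in for real image comparison:
scores = {"stadium": 0.8, "glass building": 0.1}
visible = filter_visible(["stadium", "glass building"], scores.get)
```

Here the stadium passes the threshold while the partially occluded building does not, so only the stadium would be referenced in the directions.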
- In addition to selecting salient visual landmarks from among pre-stored static landmarks, the system can identify dynamic visual landmarks, such as changing electronic billboards or trucks with machine-readable text. When capable of automatically recognizing such an object in the video or photo feed, the system can position the object relative to the maneuver described in the next navigation instruction and reference the object in the navigation instruction. For example, the system can modify the instruction “turn left in 200 feet” to “turn left by the red truck.” Moreover, the system in some scenarios may select a route from among multiple routing options based on live states of traffic lights. For example, the system may determine that the red light at the intersection the driver is approaching makes another routing option more appealing.
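One possible way to rewrite a distance-based instruction around a recognized dynamic object is sketched below. The detection format, a list of (label, distance) pairs, is an illustrative assumption rather than the disclosed pipeline's actual output:

```python
def reference_dynamic_landmark(instruction, detections, maneuver="turn left"):
    """Rewrite a distance-based instruction around a recognized object.

    detections: (label, distance_ft) pairs for objects the image pipeline
    recognized near the maneuver point (hypothetical format).
    """
    if not detections:
        return instruction  # nothing recognized; keep the original wording
    # Prefer the detected object closest to the maneuver point.
    label, _ = min(detections, key=lambda d: d[1])
    return f"{maneuver} by the {label}"

out = reference_dynamic_landmark(
    "turn left in 200 feet",
    [("red truck", 190), ("bus shelter", 320)],
)
```

With these inputs the sketch reproduces the example rewrite from the text, falling back to the original instruction when no object is recognized.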
- Additionally or alternatively to processing real-time imagery, the system can assess the usefulness of a certain visual landmark based on explicit and/or implicit user signals. For example, the driver can indicate that she cannot see a landmark, using a voice command. When it is desirable to collect more information about visual landmarks, the system can present visual landmarks in interrogative sentences, e.g., “do you see the billboard on the left?” As an example of an implicit signal, when drivers tend to miss a turn which the system describes using a visual landmark, the system may flag the visual landmark as not useful. The system can assess usefulness at different times of day and under different weather conditions, so that a certain billboard can be marked as not useful during the day but useful when illuminated at night. Further, the system can receive signals indicative of the current time, weather conditions, etc. from other sources, such as a weather service, and select landmarks suitable for the current environmental conditions. The system can use explicit and/or implicit user feedback to modify subsequent navigation directions even when no real-time video or still photography is available to a driver. For example, the system may be able to determine only that the driver is requesting navigation directions at nighttime, and accordingly provide indications of visual landmarks that have been determined to be visible, or particularly noticeable, at night.
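The per-condition usefulness bookkeeping suggested by this passage could be sketched as a simple weighted update, with the metric keyed by landmark and condition. The update rule and its weight are illustrative assumptions, not the disclosed method:

```python
def update_metric(metric, saw_landmark, weight=0.1):
    """Exponentially weighted update of a usefulness metric in [0, 1]."""
    target = 1.0 if saw_landmark else 0.0
    return (1 - weight) * metric + weight * target

# Separate metrics can be kept per time of day, weather, season, etc.;
# the (landmark, condition) key shape here is a hypothetical layout.
metrics = {("billboard 17", "night"): 0.5}
metrics[("billboard 17", "night")] = update_metric(
    metrics[("billboard 17", "night")], saw_landmark=True)
```

A positive answer to “do you see the billboard on the left?” nudges the night-time metric up; a missed turn or a negative answer would nudge it down.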
- The system can use object and/or character recognition techniques to automatically recognize vehicles, billboards, text written on surfaces of various kinds, etc. Further, to identify currently visible landmarks within real-time imagery, the system can match features of a captured image against an image previously captured from the same location and with the same orientation of the camera (i.e., with the same camera pose) and known to depict a visual landmark. In some implementations, the system uses a convolutional neural network to implement an object detector, which determines whether a captured scene includes an object of one of several predefined classes (e.g., car, person, traffic light). Further, the object detector can implement semantic segmentation to label every pixel in the image.
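A minimal sketch of the feature-matching idea, assuming descriptors have already been extracted (e.g., by ORB or a learned embedder, which is out of scope here), might score a landmark by the fraction of its descriptors that find a close match in the scene; the cosine cutoff of 0.9 is an arbitrary illustrative choice:

```python
import numpy as np

def match_score(scene_desc, landmark_desc):
    """Fraction of landmark feature descriptors with a close match in the
    scene's descriptors (cosine similarity above a fixed cutoff).

    Both arguments are (N, D) arrays of local feature descriptors.
    """
    # Normalize rows so the dot product equals cosine similarity.
    s = scene_desc / np.linalg.norm(scene_desc, axis=1, keepdims=True)
    lm = landmark_desc / np.linalg.norm(landmark_desc, axis=1, keepdims=True)
    sims = lm @ s.T                   # pairwise cosine similarities
    best = sims.max(axis=1)           # best scene match per landmark feature
    return float((best > 0.9).mean())
```

A real pipeline would threshold this score (as in the filtering sketch earlier) to decide whether the landmark is probably visible to the driver.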
-
FIG. 1 illustrates an environment 10 in which at least some of the techniques for selecting salient visual landmarks can be implemented. The environment 10 includes a mobile system 12 and a server system 14 interconnected via a communication network 16. The server system 14 in turn can communicate with various databases and, in some implementations, third-party systems such as a live traffic service or a weather service (not shown to avoid clutter). A landmark selection system 18 configured to select visual landmarks using real-time imagery and/or time of day, season, weather conditions, etc. can be implemented in the mobile system 12, the server system 14, or partially in the mobile system 12 and partially in the server system 14. - The
mobile system 12 can include a portable electronic device such as a smartphone, a wearable device such as a smartwatch or a head-mounted display, or a tablet computer. In some implementations or scenarios, the mobile system 12 also includes components embedded or mounted in a vehicle. For example, a driver of a vehicle equipped with electronic components such as a head unit with a touchscreen or a built-in camera can use her smartphone for navigation. The smartphone can connect to the head unit via a short-range communication link such as Bluetooth® to access the sensors of the vehicle and/or to project the navigation directions onto the screen of the head unit. As another example, the user's smartphone can connect to a standalone dashboard camera mounted on the windshield of the vehicle. More generally, modules of a portable or wearable user device, modules of a vehicle, and external devices or modules of devices can operate as components of the mobile system 12. - These components can include a
camera 20, which can be a standard monocular camera mounted on the dashboard or windshield. In some scenarios, the driver mounts the smartphone so that the camera of the smartphone faces the road similar to a dashboard camera. In other scenarios, the vehicle includes a camera or even multiple cameras built into the dashboard or the exterior of the vehicle, and the mobile system 12 accesses these cameras via a standard interface (e.g., USB). Depending on the implementation, the camera 20 is configured to collect a digital video stream or capture still photographs at certain intervals. Moreover, the mobile system 12 in some implementations uses multiple cameras to collect redundant imagery in real time. One camera may be mounted on the left side of the dashboard and another camera may be mounted on the right side of the dashboard to generate slightly different views of the surroundings, which in some cases may make it easier for the landmark selection system 18 to compare real-time imagery to previously captured images of landmarks. - The
mobile system 12 also can include a processing module 22, which can include one or more central processing units (CPUs), one or more graphics processing units (GPUs) for efficiently rendering graphics content, an application-specific integrated circuit (ASIC), or any other suitable type of processing hardware. Further, the mobile system 12 can include a memory 24 made up of persistent (e.g., a hard disk, a flash drive) and/or non-persistent (e.g., RAM) components. In the example implementation illustrated in FIG. 1, the memory 24 stores instructions that implement a navigation application 26. - Further, the
mobile system 12 includes a user interface 28 and a network interface 30. Depending on the scenario, the user interface 28 can correspond to the user interface of the portable electronic device or the user interface of the vehicle. In either case, the user interface 28 can include one or more input components such as a touchscreen, a microphone, a keyboard, etc. as well as one or more output components such as a screen or speaker. - The
network interface 30 can support short-range and/or long-range communications. For example, the network interface 30 can support cellular communications, wireless local area network protocols such as IEEE 802.11 (Wi-Fi), and personal area network protocols such as IEEE 802.15 (Bluetooth). In some implementations, the mobile system 12 includes multiple network interface modules to interconnect multiple devices within the mobile system 12 and to connect the mobile system 12 to the network 16. For example, the mobile system 12 can include a smartphone, the head unit of a vehicle, and a camera mounted on the windshield. The smartphone and the head unit can communicate using Bluetooth, the smartphone and the camera can communicate using USB, and the smartphone can communicate with the server 14 via the network 16 using a 4G cellular service, to pass information to and from various components of the mobile system 12. - Further, the
network interface 30 in some cases can support geopositioning. For example, the network interface 30 can support Wi-Fi trilateration. In other cases, the mobile system 12 can include a dedicated positioning module 32 such as a Global Positioning System (GPS) module. In general, the mobile system 12 can include various additional components, including redundant components such as positioning modules implemented both in the vehicle and in the smartphone. - With continued reference to
FIG. 1, the mobile system 12 can communicate with the server system 14 via the network 16, which can be a wide-area network such as the Internet. The server system 14 can be implemented in one or more server devices, including devices distributed over multiple geographic locations. The server system 14 can implement a routing engine 40, a navigation instructions generator 42, and a visual landmark selection module 44. The components 40-44 can be implemented using any suitable combination of hardware, firmware, and software. The server system 14 can access databases such as a map database 50, a visual landmark database 52, and a user profile database 54, which can be implemented using any suitable data storage and access techniques. - In operation, the
routing engine 40 can receive a request for navigation directions from the mobile system 12. The request can include a source, a destination, and constraints such as a request to avoid toll roads, for example. The routing engine 40 can retrieve road geometry data, road and intersection restrictions (e.g., one-way, no left turn), road type data (e.g., highway, local road), speed limit data, etc. from the map database 50 to generate a route from the source to the destination. In some implementations, the routing engine 40 also obtains live traffic data when selecting the best route. In addition to the best, or “primary,” route, the routing engine 40 can generate one or several alternate routes. - In addition to road data, the
map database 50 can store descriptions of geometry and location indications for various natural geographic features such as rivers, mountains, and forests, as well as artificial geographic features such as buildings and parks. The map data can include, among other data, vector graphics data, raster image data, and text data. In an example implementation, the map database 50 organizes map data into map tiles, which generally correspond to a two-dimensional organization of geospatial data into a traversable data structure such as a quadtree. - The
navigation instructions generator 42 can use the one or more routes generated by the routing engine 40 and generate a sequence of navigation instructions. Examples of navigation instructions include “in 500 feet, turn right on Elm St.” and “continue straight for four miles.” The navigation instructions generator 42 can implement natural language generation techniques to construct these and similar phrases, in the language of the driver associated with the mobile system 12. The instructions can include text, audio, or both. - The visual
landmark selection module 44 operates as part of the landmark selection system 18, which also includes the navigation application 26. The visual landmark selection module 44 can augment the navigation directions generated by the navigation instructions generator 42 with references to visual landmarks such as prominent buildings, billboards, traffic lights, stop signs, statues and monuments, and symbols representing businesses. To this end, the visual landmark selection module 44 initially can access the visual landmark database 52 to select a set of visual landmarks disposed along the navigation route. However, as discussed in more detail below, the landmark selection system 18 then can select a subset of these visual landmarks in accordance with the likelihood that the driver can actually see the landmarks when driving, and/or dynamically identify visual landmarks that were not previously stored in the visual landmark database 52. - The
visual landmark database 52 can store information regarding prominent geographic entities that can be visible when driving (or bicycling, walking, or otherwise moving along a navigation route) and thus serve as visual landmarks. For each visual landmark, the visual landmark database 52 can store one or several photographs, geographic coordinates, a textual description, remarks submitted by users, and numeric metrics indicative of the usefulness of the visual landmark and/or of a particular image of the visual landmark. In some implementations, a landmark-specific record in the visual landmark database 52 stores multiple views of the visual landmark from the same vantage point, i.e., captured from the same location and with the same orientation of the camera. However, the multiple views of the visual landmark can differ according to the time of day, weather conditions, season, etc. The data record can include metadata that specifies these parameters for each image. For example, the data record may include a photograph of a billboard at night when it is illuminated, along with a timestamp indicating when the photograph was captured, and another photograph of the billboard during the day from the same vantage point, along with the corresponding timestamp. Further, the data record may include photographs of the billboard captured during snowy weather, during rainy weather, during foggy weather, etc., and corresponding indicators for each photograph. Still further, the data record may include photographs captured during different seasons. - In short, the
visual landmark database 52 can store a large set of visual landmarks that in some cases is redundant both in terms of the number of landmarks available for the same maneuver (e.g., a billboard on the right and a church on the left near the same intersection) and in terms of imagery available for the same landmark. The landmark selection system 18 can determine which of the redundant landmarks are useful under particular lighting conditions, weather conditions, and traffic conditions (as drivers may find it difficult to recognize certain visual landmarks when driving fast), and how well the corresponding scene is visible from the driver's vantage point (as inferred from real-time imagery). - In addition to multiple images of the same visual landmark, the
visual landmark database 52 can store multiple descriptions of the same landmark, such as “the large glass building,” “the building with a large ‘M’ in front of it,” “the building with international flags,” etc. Operators of the server system 14 and/or users submitting landmark information as part of a crowd-sourcing effort can submit these descriptions, and the server system 14 can determine which description drivers find more helpful using the feedback processing techniques discussed in more detail below. To keep track of drivers' feedback, the visual landmark database 52 in one example implementation stores an overall numeric metric for a visual landmark that can be used to assess whether the visual landmark should be referenced in navigation directions at all, separate numeric metrics for different times of day, different weather conditions, etc., and/or separate numeric metrics for different images. - To populate the
visual landmark database 52, the server system 14 can receive satellite imagery, photographs and videos submitted by various users, street-level imagery collected by cars equipped with specialized panoramic cameras, street and sidewalk imagery collected by pedestrians and bicyclists, etc. Similarly, the visual landmark database 52 can receive descriptions of landmarks from various sources such as operators of the server system 14 and people submitting user-generated content. - With continued reference to
FIG. 1, the user profile database 54 can store user preferences regarding the types of visual landmarks users prefer to see. For example, the profile of a certain user can indicate that she prefers billboards as landmarks. The landmark selection system 18 can use user preferences as at least one of the factors when selecting visual landmarks from among redundant visual landmarks. In some implementations, the user provides an indication that he or she allows the landmark selection system 18 to utilize this data. - In operation, the
camera 20 can capture a scene 60 as a still photograph or a frame in a video feed. The scene 60 approximately corresponds to what the driver of the vehicle operating in the mobile system 12 currently sees. Based on the captured scene 60, the landmark selection system 18 can determine that the driver can clearly see the landmark stadium depicted in a pre-stored image 70, but that the landmark building depicted in a pre-stored image 72 is largely obscured. The better visibility of the landmark stadium is at least one of the signals the landmark selection system 18 can use to determine whether to reference the landmark stadium, the landmark building, or both. - As indicated above, functionality of the
landmark selection system 18 can be distributed between the mobile system 12 and the server system 14 in any suitable manner. In some implementations, for example, the processing capability of the mobile system 12 is insufficiently robust to implement image processing. The mobile system 12 accordingly can capture photographs and/or video and provide the captured imagery to the server system 14, where the visual landmark selection module executes a video processing pipeline. In other implementations, the mobile system 12 has sufficient processing capability to implement image matching. The server system 14 in this case can provide relevant visual landmark imagery, such as the images 70 and 72, to the mobile system 12, and the navigation application 26 can compare the scene 60 to the images 70 and 72 locally. In still other implementations, the mobile system 12 implements a less constrained image processing pipeline and attempts to automatically recognize in the scene 60 objects of certain pre-defined types such as people, small cars, large cars, trucks, traffic lights, billboards, etc. - Next, example methods for generating navigation directions using real-time imagery and for adjusting visual landmark metrics are discussed with reference to
FIGS. 2 and 3, respectively, followed by a discussion of example image processing techniques that can be implemented in the system of FIG. 1. Other techniques for selecting visual landmarks from a large, redundant pre-stored set or recognizing visual landmarks currently absent from the pre-stored set are then discussed with reference to the remaining drawings. - In an example scenario, a driver launches a navigation application on her smartphone and requests driving directions to her friends' home. She connects her smartphone to the camera mounted on the windshield of her car and starts driving. As she drives through a busy part of town and approaches the intersection where she must turn left, three objects potentially could serve as visual landmarks: a fast-food restaurant with an easily recognizable logo on the right, a bus stop shelter on the left, and a distinctive building on the left just past the intersection. The scene as captured by the driver's camera indicates that while the bus stop shelter is visible, the fast-food restaurant and the distinctive building are obscured by trees. The navigation application accordingly generates the audio message “turn left at the bus stop you will see on your left” when the driver is approximately 200 feet away from the intersection.
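The example scenario above reduces to picking the one candidate confirmed visible in the captured scene; a toy sketch follows, with the names invented for illustration only:

```python
def landmark_for_maneuver(candidates, visible):
    """Return the first candidate landmark confirmed visible in the scene."""
    for name in candidates:
        if name in visible:
            return name
    return None  # fall back to a distance-based instruction

# Candidates near the left turn, in the order they were retrieved:
candidates = ["fast-food logo", "bus stop shelter", "distinctive building"]
# Per the captured scene, only the shelter is unobscured:
visible = {"bus stop shelter"}
chosen = landmark_for_maneuver(candidates, visible)
instruction = f"turn left at the {chosen} you will see on your left"
```

When nothing is visible, the sketch returns `None`, corresponding to the case where the directions omit landmark references entirely.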
-
FIG. 2 is a flow diagram of an example method 100 for generating navigation directions for drivers using real-time imagery as discussed in the example above. The method 100 can be implemented in the landmark selection system 18 of FIG. 1 or in another suitable system. The method 100 can be implemented as a set of software instructions stored on a non-transitory computer-readable medium and executable by one or more processors, for example. - The
method 100 begins at block 102, where a route for driving to a certain destination from the current location of the user or from some other location is obtained. At block 104, indications of landmarks corresponding to prominent physical objects disposed along the route are retrieved. Each indication can include the coordinates of the corresponding visual landmark and the corresponding pre-stored imagery (e.g., photographs or a video sequence of a short fixed duration). Depending on the implementation, visual landmarks can be retrieved for the entire route or for a portion of the route, e.g., for the current location of the user. In a sense, these visual landmarks are only candidate visual landmarks for the current navigation session, and it can be determined that some or all of these visual landmarks are not visible (or, as discussed above, some currently visible visual landmarks may not be selected when better candidates are available). - At
block 106, real-time imagery is collected at the vehicle approximately from the vantage point of the driver. The real-time imagery can be one or several still photographs defining a scene. For some image processing techniques, feature comparison or recognition is more reliable when a video stream rather than a single photograph is available, and thus the real-time imagery defining the scene also can be a video feed of a certain duration (e.g., 0.5 sec). - The real-time imagery of the scene then is processed at
block 108. To this end, the collected real-time imagery can be uploaded to a network server. Alternatively, the real-time imagery can be processed at a mobile system such as the user's smartphone or the head unit of the vehicle. For example, the mobile system 12 can receive a representative image of a visual landmark and locally process the real-time imagery using the processing module 22 to determine whether this candidate visual landmark is visible in the real-time imagery. As yet another alternative, processing of the real-time imagery can be distributed between the mobile system and the server system. The processing at block 108 can include comparing the captured scene to the pre-stored imagery of the landmarks obtained at block 104. The processing can produce an indication of which of the visual landmarks identified at block 104 can be identified in the captured scene, and thus probably are visible to the driver. - At
block 110, navigation directions referencing the one or more visible visual landmarks are provided to the driver, whereas the visual landmarks identified at block 104 but not located within the scene captured at block 106 are omitted. The instructions can include text to be displayed on the driver's smartphone or projected via the head unit and/or audio announcements, for example. Additionally, a pre-stored image of a visual landmark referenced in the directions can be downloaded from the visual landmark database 52 to the mobile system 12 and displayed in the projected mode on the head unit of the vehicle, so that the user can glance at the display and see to which visual landmark the directions refer. - The
method 100 completes after block 110. Thus, in a sense, the system implementing the method 100 uses real-time imagery as a filter applied to the redundant set of visual landmarks. Of course, if more than the necessary number of visual landmarks (typically one) are determined to be visible for a single maneuver, the visual landmarks can be further filtered based on other signals. Some of these signals, including the signals based on user feedback, are discussed below. - Referring back to
FIG. 1, after the landmark selection system 18 determines that the landmark of the image 70 is probably visible to the driver and that the landmark of the image 72 is probably not visible to the driver, and accordingly references the landmark of the image 70 in the navigation directions, the driver can provide an indication of whether the landmark of the image 70 was in fact helpful. Further, the landmark selection system 18 in some cases is not equipped with the camera 20 or fails to obtain real-time imagery at the vehicle for some reason (the landmark selection system 18 then can select the visual landmarks based on other signals). In these cases, the driver still can provide feedback regarding the quality of the visual landmarks referenced in the navigation directions. In other words, the landmark selection system 18 can collect driver feedback regardless of its capacity to process real-time imagery. - Now referring to
FIG. 3, an example method 150 for requesting and processing user feedback is discussed below with reference to the landmark selection system 18, in which it can be implemented. However, the method 150 in general can be implemented in any suitable system, including navigation systems that receive navigation directions via a network connection, navigation systems built into vehicles and storing landmark data along with map data on a hard disk or other storage device, standalone navigation systems with pre-stored landmark and map databases, etc. It is noted further that the method 150 can be implemented in systems configured to receive real-time imagery as well as systems that are not configured to receive real-time imagery. - The
method 150 begins at block 152. Here, the landmark selection system 18 can select a visual landmark for a certain location and maneuver, during navigation. Next, the landmark selection system 18 can provide an indication of the visual landmark to the driver at block 154, and provide a prompt regarding this visual landmark at block 156 so as to assess the quality of the suggestion. For example, the indication can be “after you pass the statue of a bull, turn right on Financial Pl.” To obtain explicit user feedback after the user completes the maneuver by turning right, the follow-up yes/no prompt at block 156 can be “did you see the statue of a bull?” In some implementations, the landmark selection system 18 does not generate a follow-up prompt every time the visual landmark is referenced but rather at a certain relatively low rate, such as once per hundred references to the visual landmark. Additionally or alternatively, the landmark selection system 18 can collect implicit user feedback by determining whether the user successfully completed the maneuver or missed the turn. Thus, if the prompt above is provided to one hundred drivers over a certain period of time, and only 85% of the drivers turn right on Financial Pl. (while the overall success rate for maneuvers specified in the navigation directions and augmented by references to visual landmarks is 99%, for example), it is probable that the statue of a bull is not a good visual landmark. The landmark selection system 18 can utilize any suitable statistical technique to assess the probability of recognizing visual landmarks. - Further, because some users may dislike any follow-up prompts, the
landmark selection system 18 can format the reference to the visual landmark at block 154 as a question. Thus, for example, the navigation application can generate the question “do you see the statue of a bull on your right?” If the driver answers in the affirmative, the landmark selection system 18 can immediately provide the complete instruction “after you pass the statue of a bull, turn right on Financial Pl.” Otherwise, the landmark selection system 18 can select the next visual landmark, when available, and generate the next question. - If it is determined at
block 158 that the user can see the visual landmark, the flow proceeds to block 160. Otherwise, the flow proceeds to block 162. At block 160, the landmark selection system 18 can adjust the numeric metric for the visual landmark upward to indicate an instance of success. On the other hand, at block 162 the landmark selection system 18 can adjust the numeric metric for the visual landmark downward to indicate an instance of failure. Further, depending on the implementation, the landmark selection system 18 can adjust the metric for a particular time of day, particular weather, particular season, particular lighting conditions, etc. - At
block 164, the landmark selection system 18 can also adjust the probability of selecting other landmarks that belong to the same type (or images of landmarks of a certain type). For example, if it is determined at block 158 that the driver found a certain billboard to be a useful landmark, the probability of preferring billboards to other types of landmarks can increase. After block 164, the flow proceeds to block 166, where the next maneuver is selected. The flow then returns to block 152, where a set of visual landmarks is selected for the new maneuver and the location of the driver.
- Thus, when a redundant set of visual landmarks is available, the
landmark selection system 18 can utilize explicit and/or implicit driver feedback to determine which visual landmarks are more likely to be useful for the remainder of the navigation session, and which visual landmarks are likely to be useful to other drivers in the future. The overall accuracy of assessing the usefulness of visual landmarks is expected to increase when the method 150 is executed for a large number of navigation sessions and for a large number of drivers.
- In some cases, the
method 150 can be extended to other types of navigation directions or geographic suggestions. For example, a navigation system can use the method 150 to determine whether a certain reference to a street name is a reliable reference in navigation directions. Because street signs may be missing or poorly lit, and because some street and road information may be out of date, the navigation system can format certain directions as questions (e.g., “Do you see Elm St. 300 feet ahead?”), receive explicit feedback when the user chooses to comment on the previously provided directions (e.g., “In 300 feet, turn right on Elm St.”—“I cannot see Elm St.”), and/or collect implicit feedback (e.g., a missed turn, or sudden deceleration prior to the turn).
- Further, in a generally similar manner, the devices illustrated in
FIG. 1 can use explicit and implicit driver feedback to identify easy-to-miss turns. For both “traditional” navigation directions and landmark-based navigation directions, the server system 14 can detect the tendencies of drivers to miss turns, quickly brake before upcoming turns, or otherwise not maneuver according to the instructions provided as part of the navigation directions. For example, if a certain percentage of the drivers miss the turn, or appear to almost miss the turn by quickly changing their speed, the server system 14 can determine that the turn is an easy-to-miss turn. As discussed above, this percentage also can mean that the visual landmark referenced in the corresponding instruction may not be reliable. In addition to determining that a new visual landmark may be needed for this location, the navigation instruction generator 42 can automatically provide a warning to the driver, such as “slow down here, the next turn is easy to miss.” Further, the difficulty of the maneuver may indicate to the landmark selection system 18 that it should attempt to identify a suitable dynamic visual landmark, especially when no permanent visual landmarks are available. Dynamic visual landmarks are discussed in more detail below.
- In some implementations, the
landmark selection system 18 compares the captured real-time imagery to pre-stored images to detect a match or absence of a match. As a more specific example, the visual landmark database 52 of FIG. 1 can store images of the landmark depicted in the image 70, captured from various locations and with various orientations of the camera, i.e., camera poses. These images can be, for example, street-level images collected by a specialized vehicle and annotated to select only those pixels or portions of each image that depict the visual landmark. The annotation may be conducted manually, for example.
- As the
camera 20 captures the scene 60, a positioning module operating in the mobile system 12 determines the location from which the scene 60 was captured. The landmark selection system 18 then can retrieve those images of the landmarks that best match the location and orientation of the camera 20 at the time of capture. Thus, the visual landmark database 52 can store numerous photographs of the stadium depicted in FIG. 1, and the landmark selection system 18 can select one or several photographs from among these numerous photographs based on the camera pose, and then determine whether the stadium is depicted in the scene 60. According to this approach, the landmark selection system 18 seeks to determine the presence or absence of a specified visual landmark.
- In another implementation, the
landmark selection system 18 implements less constrained image processing. FIG. 4 illustrates the scene 60 along with a model 200 that positions automatically recognized entities such as cars and people in two- or three-dimensional space. The landmark selection system 18 can rely on models of certain types or classes of objects to identify the presence or absence of objects of these types in the scene 60 using a deep-learning technique such as a convolutional neural network (CNN), for example. Experiments have shown that these techniques can spatially localize hundreds of classes of objects in relatively short time (e.g., 100 ms per image). The CNN can be trained using datasets annotated with metadata such as, for example, CityScapes, available at www.cityscapes-dataset.com.
- In the example scenario of
FIG. 4, the landmark selection system 18 generates bounding boxes 202-208 around the recognized objects: the boxes 202-206 correspond to cars, and the bounding box 208 corresponds to a standing person. The landmark selection system 18 then places the identified objects within the geographic model 200 of the corresponding area. Moreover, the landmark selection system 18 can determine the spatial orientation of these objects. The bounding boxes 212-218 enclose models of the corresponding object types. For example, the bounding box 212 encloses a sample object of type “sports utility vehicle,” the bounding box 214 encloses a sample object of type “mid-size car,” the bounding box 216 encloses a sample object of type “sports car,” and the bounding box 218 encloses a sample object of type “standing adult person.” Other examples of types of objects can include bicycles, buses, billboards, traffic lights, certain chain store logos, etc. The landmark selection system 18 can align the objects identified in the scene 60 with these and other types of objects and determine the positions of these objects relative to static geographic features, such as buildings with known coordinates. In this manner, the landmark selection system 18 can describe the position of an identified object relative to static geographic features and generate navigation instructions of the type “turn where the sports car is now turning.”
- The
landmark selection system 18 also can process color characteristics of the identified objects. Thus, the instruction above can become “turn where the red sports car is now turning,” which may be more helpful to the driver. Further, the landmark selection system 18 can be configured to recognize alphanumeric characters and generate such instructions as “keep going past the sign that says ‘car wash,’” when the camera captures an image of a person holding up a temporary car wash sign.
- In some implementations, the
landmark selection system 18 labels every pixel in the scene 60 in accordance with semantic segmentation techniques. For the example scene 60, semantic segmentation can produce an indication of where the sidewalk, the road, and the trees are located. A more robust image processing pipeline generally is required to conduct semantic segmentation, but using semantic segmentation the landmark selection system 18 can identify additional landmarks and/or generate better explanations of where visual landmarks are located. For example, the navigation instruction “turn right after you see a large yellow billboard” can be improved to “turn right after you see a large yellow billboard on the sidewalk.”
- Referring back to
FIG. 1, the landmark selection system 18 can use the image processing techniques discussed above both to determine the presence or absence of pre-selected objects in captured real-time imagery and to dynamically identify objects in the real-time imagery that can work as visual landmarks, even where no information for such objects was stored in the visual landmark database 52. These dynamic visual landmarks typically are transient (e.g., a bus stopped at the corner, a truck parked in front of a convenience store, a bicyclist in a yellow shirt turning left), in which case the landmark selection system 18 can limit the use of these dynamic visual landmarks to the current navigation instructions only. However, the landmark selection system 18 in a similar fashion can identify new permanent landmarks that were missing from the visual landmark database 52. For example, no information about a recently installed billboard may be stored in the visual landmark database 52, and the landmark selection system 18 in some cases can identify a potentially permanent landmark and automatically submit the corresponding image to the server system 14, which in response may create a new record in the visual landmark database 52.
- Next,
FIG. 5 illustrates an example method 300 for identifying prominent objects within a captured scene, which can be implemented in the system of FIG. 1. For convenience, the method 300 is discussed with reference to the landmark selection system 18, but it is noted that the method 300 can be implemented in any suitable system.
- At
block 302, the landmark selection system 18 can determine a route for guiding a driver to a destination. The route can include a graph traversing several road segments, and the corresponding navigation directions can include a sequence of descriptions of maneuvers. In some implementations, the navigation directions can be generated at the server system 14 and provided to the mobile system 12 in relevant portions.
- Next, at
block 304, the landmark selection system 18 can receive real-time imagery for a scene, collected at a certain location of the vehicle. Typically, but not necessarily, the real-time imagery is collected when the vehicle approaches the location of the next maneuver. The camera pose for the captured imagery approximately corresponds to the vantage point of the driver. When geo-positioning is available, the real-time imagery can be geographically tagged, i.e., include an indication of the location where the real-time imagery was captured.
- At
block 306, the landmark selection system 18 can identify objects of certain pre-defined types within the captured scene. As discussed above, this identification can be based on training data and can include semantic image segmentation. In some cases, the identification is based on the presence of letters, numbers, and other alphanumeric characters. To this end, the landmark selection system 18 can implement any suitable character recognition technique. Moreover, the landmark selection system 18 may implement both object identification and character recognition to identify objects of pre-defined types with alphanumeric characters.
- At
block 308, the landmark selection system 18 can determine which of the detected objects appear prominently within the scene. Referring back to FIG. 4, not every object within the bounding boxes 202-208 is necessarily noticeable to a human observer. In other words, to generate useful dynamic visual landmarks, it is often insufficient for the landmark selection system 18 to simply identify objects. The landmark selection system 18 accordingly can assess the prominence of visual landmarks relative to the rest of the scene based on the difference in color, for example. More particularly, the landmark selection system 18 can determine that the car enclosed by the box 206 is bright red and that the rest of the scene 60 lacks bright patches of color. The car enclosed by the box 206 thus can be determined to be a potentially useful visual landmark. As another example, the landmark selection system 18 can identify several buildings within a scene, determine that the buildings are disposed at a similar distance from the vehicle, and determine that one of the buildings is significantly larger than the other buildings. In addition to color and size, the landmark selection system 18 can use any number of suitable criteria of prominence, such as shape, presence of alphanumeric characters, etc.
- At
block 310, the landmark selection system 18 can determine the positions of the one or more prominent objects relative to the current location of the vehicle and/or to the locations of road intersections and other geographic waypoints, in a two- or three-dimensional coordinate system. Where relevant, the landmark selection system 18 also can determine the orientation of the prominent object. Referring back to FIG. 4, after the sports car enclosed by the box 206 is identified as a prominent feature, the landmark selection system 18 can determine the location and orientation of the sports car relative to the streets.
- At
block 312, the landmark selection system 18 can include in the navigation directions a reference to the one or more prominent objects identified at block 306. As discussed above, the landmark selection system 18 can generate such instructions as “turn left on Main St., where the red sports car is turning” or “turn right on Central St. after the blue billboard.” The instructions can include any suitable combination of text and multimedia.
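The color-based prominence assessment described for block 308 can be sketched as follows. This is a minimal illustration under stated assumptions: the bounding-box format, object labels, and mean-color scoring rule are chosen for the example and are not the patent's specification.

```python
import numpy as np

def color_prominence(scene, boxes):
    """Score each detected object by how far the mean color inside its
    bounding box lies from the mean color of the rest of the scene.
    `scene` is an H x W x 3 array; `boxes` maps a label to
    (top, left, bottom, right) pixel bounds (illustrative format)."""
    scores = {}
    for label, (t, l, b, r) in boxes.items():
        patch_mean = scene[t:b, l:r].reshape(-1, 3).mean(axis=0)
        mask = np.ones(scene.shape[:2], dtype=bool)
        mask[t:b, l:r] = False          # exclude the object itself
        rest_mean = scene[mask].mean(axis=0)
        scores[label] = float(np.linalg.norm(patch_mean - rest_mean))
    return scores

# Toy scene: a mostly gray image with one bright red patch.
scene = np.full((100, 100, 3), 128.0)
scene[40:60, 40:60] = (255.0, 0.0, 0.0)
boxes = {"sports car": (40, 40, 60, 60), "mid-size car": (0, 0, 20, 20)}
scores = color_prominence(scene, boxes)
most_prominent = max(scores, key=scores.get)  # -> "sports car"
```

The same ranking could instead use size, shape, or detected alphanumeric characters as criteria, as the description notes; color difference is simply the easiest to sketch.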
- In the scenario schematically illustrated in
FIG. 6, the car 400 approaches an intersection 402 via Elm St., en route to a destination or intermediate waypoint 404. From the intersection 402, there are two viable route options: one could continue driving down Elm St. past the intersection 402, turn left on Central St., and then left again on Oak St. (route 410). Alternatively, one could turn left on Main St. at the intersection 402 and then turn right on Oak St. (route 412). A traffic light 404 includes a left-turn arrow indicator.
- Prior to the
car 400 reaching the intersection 402, the routing engine 40 (see FIG. 1) may have determined that the route 410 is faster than the route 412. For example, the routing engine 40 may have applied routing algorithms based on graph theory and additionally considered live traffic data for the potentially relevant portions of the route. However, as the camera operating in the car 400 captures a scene that includes the traffic light 404, the landmark selection system 18 can analyze the scene to identify and properly classify a visual landmark, the traffic light 404. The landmark selection system 18 can determine that the traffic light 404 is currently displaying a green arrow, and in response the routing engine 40 can re-evaluate the routing options and determine that the route 412 has become a better option. In this case, the navigation instructions generator can provide an updated notification advising the driver to turn left at the intersection 402. On the other hand, if the landmark selection system 18 analyzes the scene to determine that the traffic light 404 is green, the routing engine 40 can confirm that the route 410 remains the better option. It is noted that in many cases, the current state of a traffic light cannot be obtained from other sources such as real-time database servers, or can be obtained only with such difficulty that the approach becomes impractical.
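The re-evaluation in the FIG. 6 scenario can be sketched as a simple cost comparison; the function name, the left-turn wait penalty, and the state strings below are assumptions made for illustration, not values from the patent.

```python
def pick_route(routes, light_state, left_turn_penalty_s=45.0):
    """Pick the fastest route after adjusting for the observed state of
    the traffic light. `routes` maps a route name to a tuple of
    (base_travel_time_s, needs_left_turn). When no green left-turn arrow
    is detected, left-turn routes are penalized by an assumed wait."""
    def cost(item):
        _, (base, needs_left) = item
        if needs_left and light_state != "green_left_arrow":
            return base + left_turn_penalty_s
        return base
    return min(routes.items(), key=cost)[0]

# Route 412 requires the left turn at the light; route 410 does not.
routes = {"route 410": (300.0, False), "route 412": (280.0, True)}
pick_route(routes, "green_left_arrow")  # -> "route 412" (no wait at the arrow)
pick_route(routes, "green")             # -> "route 410" (280 + 45 > 300)
```

The point of the sketch is only that a single live observation (the arrow state) changes the relative cost of the two candidate routes, which is the re-routing behavior method 450 formalizes.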
FIG. 7 depicts a flow diagram of an example method 450 for selecting a navigation option in view of the live state of a traffic signal, which can be implemented in the devices illustrated in FIG. 1 or any other suitable system.
- The
method 450 begins at block 452, where two or more routing options for reaching a certain intermediate point along the route, or the endpoint of the route, from a certain location controlled by a traffic light are identified. At block 454, the current state of the traffic light is determined using real-time imagery captured at the vehicle approaching the location. If the traffic light is determined to be displaying the green arrow, the flow proceeds to block 460, where the first routing option is selected. Otherwise, if the traffic light is determined to not be displaying the green arrow, the flow proceeds to block 462, and the second routing option is selected. The corresponding navigation instruction then is provided to the user at block 464.
- In some implementations, the components of the
landmark selection system 18 can use real-time imagery to improve lane guidance. In general, positioning solutions such as GPS or Wi-Fi triangulation cannot yield a position fix precise enough to determine in which lane the vehicle is currently located. Using the techniques discussed above and/or other suitable techniques, the landmark selection system 18 can recognize lane markings (e.g., white and yellow divider strips), arrows and highway signs painted on the road, the dimensionality of lanes based on detected boundaries of the sidewalk, the presence of other vehicles from which the existence of other lanes can be inferred, etc.
- For example, the
camera 20 of FIG. 1 can be positioned so as to capture the road immediately ahead of the vehicle. The captured imagery can include a first solid single white line on the left, a solid double yellow line to the right of the first white line, a dashed white line to the right of the solid yellow line, and a second single white line on the right. The navigation application 26 can process the imagery (locally or by uploading the imagery to the server system 14) to determine, using the knowledge that the vehicle currently is in a geographic region where people drive on the right, that the road includes two lanes in the current direction of travel and one lane in the opposite direction. The navigation application 26 then can process the geometry of the detected lines to determine the current position of the vehicle relative to the lanes. To this end, the camera 20 may be mounted at a certain precise location, so that the navigation application 26 can account for the geometry of the vehicle (e.g., the navigation application 26 may be provisioned to assume that the camera is two feet above ground level, 30 inches away from the left edge of the vehicle, and 40 inches away from the right edge of the vehicle). Additionally or alternatively, the camera 20 may be mounted so as to capture the front exterior corners of the vehicle to determine where the corners are located relative to the white and yellow lines on the road.
- Using lane recognition, the
navigation application 26 can provide lane-specific guidance. For example, the navigation application 26 can guide the driver to avoid left-turn-only or right-turn-only lanes when the vehicle needs to travel straight, generate more relevant warnings regarding merging left or right, warn the driver when he or she is in a lane that is about to end, etc.
- In some implementations, the
navigation application 26 and/or the navigation instructions generator 42 can also use lane data available in the map database 50. For example, the navigation application 26 can receive an indication that the vehicle is currently traveling in a three-lane road segment, based on the most recent GPS or Wi-Fi positioning fix. Using this information along with real-time imagery, the navigation application 26 can determine in which lane the vehicle is travelling and generate appropriate instructions when necessary.
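The lane inference described for the four-line example above can be sketched as follows, assuming the detected markings are reported left to right for a drive-on-the-right region; the marking labels and function name are illustrative, not an API from the disclosure.

```python
def count_lanes(markings):
    """Infer lane counts from lane-marking types detected left to right
    in the camera image. Each gap between two adjacent markings is one
    lane; lanes to the left of the double yellow divider carry opposing
    traffic in a drive-on-the-right region."""
    divider = markings.index("double_yellow")
    opposing = divider                        # gaps left of the divider
    same_direction = len(markings) - 1 - divider  # gaps right of it
    return same_direction, opposing

# The example from the text: solid white, double yellow, dashed white,
# solid white, detected left to right.
markings = ["solid_white", "double_yellow", "dashed_white", "solid_white"]
count_lanes(markings)  # -> (2, 1): two lanes in our direction, one opposing
```

Combined with the stored lane count from the map database 50, this kind of inference is what lets the application decide which of the lanes the vehicle currently occupies.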
- Further, the
navigation application 26 can use the imagery captured by the camera 20 to automatically generate warnings regarding potential traffic violations. For example, drivers have been observed making an illegal right-on-red turn onto Shoreline Blvd. from US 101 North in Mountain View, Calif. It is believed that many drivers simply do not notice the “no right on red” sign. While the map database 50 can store an indication that the right turn on red is not allowed at this road junction, preemptively generating a warning whenever the driver is about to turn onto Shoreline Blvd. can be distracting and unnecessary, as the driver may be turning right on green.
- Accordingly, the
landmark selection system 18 can process the state of the traffic light as discussed above when the driver enters the ramp. When the state of the traffic light is determined to be red, and when the driver appears to start moving based on the positioning data or vehicle sensor data, the landmark selection system 18 can automatically provide an instruction such as “no right on red here!” To determine whether such an instruction should be provided, the landmark selection system 18 also can consider statistical indicators for the road junction, when available. For example, an operator can manually provision the server system 14 with an indication that this particular Shoreline Blvd. exit is associated with frequent traffic violations. These indications also can be user-generated.
- In some embodiments, the
landmark selection system 18 also can process and interpret the “no right on red” sign prior to generating the warning. In particular, the map database 50 may not have specific turn restriction data for a certain residential area.
- The various operations of example methods described herein may be performed, at least partially, by one or more processors that are temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors may constitute processor-implemented modules that operate to perform one or more operations or functions. The modules referred to herein may, in some example embodiments, comprise processor-implemented modules.
- Similarly, the methods or routines described herein may be at least partially processor-implemented. For example, at least some of the operations of a method may be performed by one or more processors or processor-implemented hardware modules. The performance of certain of the operations may be distributed among the one or more processors, not only residing within a single machine, but deployed across a number of machines. In some example embodiments, the processor or processors may be located in a single location (e.g., within a home environment, an office environment or as a server farm), while in other embodiments the processors may be distributed across a number of locations.
- The one or more processors may also operate to support performance of the relevant operations in a cloud computing environment or as a software as a service (SaaS). For example, at least some of the operations may be performed by a group of computers (as examples of machines including processors), these operations being accessible via a network (e.g., the Internet) and via one or more appropriate interfaces (e.g., application program interfaces (APIs).)
- Upon reading this disclosure, those of ordinary skill in the art will appreciate still additional alternative structural and functional designs for the systems for using real-time imagery and/or driver feedback in navigation. Thus, while particular embodiments and applications have been illustrated and described, it is to be understood that the disclosed embodiments are not limited to the precise construction and components disclosed herein. Various modifications, changes and variations, which will be apparent to those skilled in the art, may be made in the arrangement, operation and details of the method and apparatus disclosed herein without departing from the spirit and scope defined in the appended claims.
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/188,215 US20190078905A1 (en) | 2016-05-02 | 2018-11-12 | Systems and methods for using real-time imagery in navigation |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/144,300 US10126141B2 (en) | 2016-05-02 | 2016-05-02 | Systems and methods for using real-time imagery in navigation |
US16/188,215 US20190078905A1 (en) | 2016-05-02 | 2018-11-12 | Systems and methods for using real-time imagery in navigation |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/144,300 Continuation US10126141B2 (en) | 2016-05-02 | 2016-05-02 | Systems and methods for using real-time imagery in navigation |
Publications (1)
Publication Number | Publication Date |
---|---|
US20190078905A1 true US20190078905A1 (en) | 2019-03-14 |
Family
ID=59215967
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/144,300 Active US10126141B2 (en) | 2016-05-02 | 2016-05-02 | Systems and methods for using real-time imagery in navigation |
US16/188,215 Abandoned US20190078905A1 (en) | 2016-05-02 | 2018-11-12 | Systems and methods for using real-time imagery in navigation |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/144,300 Active US10126141B2 (en) | 2016-05-02 | 2016-05-02 | Systems and methods for using real-time imagery in navigation |
Country Status (4)
Country | Link |
---|---|
US (2) | US10126141B2 (en) |
EP (1) | EP3452788A1 (en) |
CN (1) | CN109073404A (en) |
WO (1) | WO2017209878A1 (en) |
US20220316906A1 (en) * | 2021-04-03 | 2022-10-06 | Naver Corporation | Apparatus and Method for Generating Navigational Plans |
JP2022178701A (en) * | 2021-05-20 | 2022-12-02 | フォルシアクラリオン・エレクトロニクス株式会社 | navigation device |
CN115472026B (en) * | 2021-06-10 | 2023-06-20 | 湖南九九智能环保股份有限公司 | Method and system for commanding vehicle to travel through intelligent fluorescent landmarks in material shed |
CN113420692A (en) * | 2021-06-30 | 2021-09-21 | 北京百度网讯科技有限公司 | Method, apparatus, device, medium, and program product for generating direction recognition model |
US11367289B1 (en) * | 2021-07-16 | 2022-06-21 | Motional Ad Llc | Machine learning-based framework for drivable surface annotation |
Citations (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5220507A (en) * | 1990-11-08 | 1993-06-15 | Motorola, Inc. | Land vehicle multiple navigation route apparatus |
US5844505A (en) * | 1997-04-01 | 1998-12-01 | Sony Corporation | Automobile navigation system |
US6321161B1 (en) * | 1999-09-09 | 2001-11-20 | Navigation Technologies Corporation | Method and system for providing guidance about alternative routes with a navigation system |
US20030225508A9 (en) * | 2000-09-12 | 2003-12-04 | Bernd Petzold | Navigational system |
US20040225434A1 (en) * | 2003-05-07 | 2004-11-11 | Gotfried Bradley L. | Vehicle navigation and safety systems |
US20060009188A1 (en) * | 2004-07-09 | 2006-01-12 | Aisin Aw Co., Ltd. | Method of producing traffic signal information, method of providing traffic signal information, and navigation apparatus |
US20070061066A1 (en) * | 2003-06-26 | 2007-03-15 | Christian Bruelle-Drews | Method for assisting navigation and navigation system |
JP2010008068A (en) * | 2008-06-24 | 2010-01-14 | Denso Corp | Navigation device |
US20100063722A1 (en) * | 2008-09-09 | 2010-03-11 | Aisin Aw Co., Ltd. | On-board vehicle navigation device and program |
US20100131304A1 (en) * | 2008-11-26 | 2010-05-27 | Fred Collopy | Real time insurance generation |
US20100312466A1 (en) * | 2009-02-26 | 2010-12-09 | Navigon Ag | Method and device for calculating alternative routes in a navigation system |
US20110133956A1 (en) * | 2008-10-08 | 2011-06-09 | Toyota Jidosha Kabushiki Kaisha | Drive assist device and method |
US20120215432A1 (en) * | 2011-02-18 | 2012-08-23 | Honda Motor Co., Ltd. | Predictive Routing System and Method |
US20130253754A1 (en) * | 2012-03-26 | 2013-09-26 | Google Inc. | Robust Method for Detecting Traffic Signals and their Associated States |
US20130261969A1 (en) * | 2010-12-24 | 2013-10-03 | Pioneer Corporation | Navigation apparatus, control method, program, and storage medium |
US20140129121A1 (en) * | 2012-11-06 | 2014-05-08 | Apple Inc. | Routing Based On Detected Stops |
US20150057923A1 (en) * | 2013-08-21 | 2015-02-26 | Kyungpook National University Industry-Academic Cooperation Foundation | Method for car navigating using traffic signal data |
US20150226565A1 (en) * | 2013-06-21 | 2015-08-13 | Here Global B.V. | Method and apparatus for route determination based on one or more non-travel lanes |
US20160232785A1 (en) * | 2015-02-09 | 2016-08-11 | Kevin Sunlin Wang | Systems and methods for traffic violation avoidance |
US20170010612A1 (en) * | 2015-07-07 | 2017-01-12 | Honda Motor Co., Ltd. | Vehicle controller, vehicle control method, and vehicle control program |
US20170015239A1 (en) * | 2015-07-13 | 2017-01-19 | Ford Global Technologies, Llc | System and method for no-turn-on-red reminder |
US9569963B1 (en) * | 2015-12-03 | 2017-02-14 | Denso International America, Inc. | Systems and methods for informing driver of traffic regulations |
US20170069208A1 (en) * | 2015-09-04 | 2017-03-09 | Nokia Technologies Oy | Method and apparatus for providing an alternative route based on traffic light status |
US20170103652A1 (en) * | 2015-10-08 | 2017-04-13 | Denso Corporation | Driving support apparatus |
US9631943B2 (en) * | 2015-02-10 | 2017-04-25 | Mobileye Vision Technologies Ltd. | Systems and systems for identifying landmarks |
JP2017097684A (en) * | 2015-11-26 | 2017-06-01 | マツダ株式会社 | Sign recognition system |
US10126141B2 (en) * | 2016-05-02 | 2018-11-13 | Google Llc | Systems and methods for using real-time imagery in navigation |
Family Cites Families (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3919855B2 (en) * | 1996-10-17 | 2007-05-30 | 株式会社ザナヴィ・インフォマティクス | Navigation device |
JPH11271074A (en) * | 1998-03-20 | 1999-10-05 | Fujitsu Ltd | Device and method for comparing mark image and program storage medium |
US6323807B1 (en) * | 2000-02-17 | 2001-11-27 | Mitsubishi Electric Research Laboratories, Inc. | Indoor navigation with wearable passive sensors |
US20060244830A1 (en) * | 2002-06-04 | 2006-11-02 | Davenport David M | System and method of navigation with captured images |
US7831387B2 (en) * | 2004-03-23 | 2010-11-09 | Google Inc. | Visually-oriented driving directions in digital mapping system |
US20060271286A1 (en) | 2005-05-27 | 2006-11-30 | Outland Research, Llc | Image-enhanced vehicle navigation systems and methods |
US20070055441A1 (en) | 2005-08-12 | 2007-03-08 | Facet Technology Corp. | System for associating pre-recorded images with routing information in a navigation system |
US20070078596A1 (en) | 2005-09-30 | 2007-04-05 | John Grace | Landmark enhanced directions |
JP4935145B2 (en) * | 2006-03-29 | 2012-05-23 | 株式会社デンソー | Car navigation system |
US20070299607A1 (en) * | 2006-06-27 | 2007-12-27 | Verizon Laboratories Inc. | Driving directions with landmark data |
US8174568B2 (en) * | 2006-12-01 | 2012-05-08 | Sri International | Unified framework for precise vision-aided navigation |
US8478515B1 (en) | 2007-05-23 | 2013-07-02 | Google Inc. | Collaborative driving directions |
US7912637B2 (en) * | 2007-06-25 | 2011-03-22 | Microsoft Corporation | Landmark-based routing |
US20090177378A1 (en) * | 2008-01-07 | 2009-07-09 | Theo Kamalski | Navigation device and method |
WO2010081549A1 (en) * | 2009-01-14 | 2010-07-22 | Tomtom International B.V. | Navigation device & method |
US7868821B2 (en) * | 2009-01-15 | 2011-01-11 | Alpine Electronics, Inc | Method and apparatus to estimate vehicle position and recognized landmark positions using GPS and camera |
US8060302B2 (en) * | 2009-03-31 | 2011-11-15 | Microsoft Corporation | Visual assessment of landmarks |
US8370060B2 (en) | 2009-08-28 | 2013-02-05 | Navteq B.V. | Method of operating a navigation system to provide route guidance |
JP5569365B2 (en) * | 2010-11-30 | 2014-08-13 | アイシン・エィ・ダブリュ株式会社 | Guide device, guide method, and guide program |
JP5625987B2 (en) * | 2011-02-16 | 2014-11-19 | アイシン・エィ・ダブリュ株式会社 | Guide device, guide method, and guide program |
US9037411B2 (en) * | 2012-05-11 | 2015-05-19 | Honeywell International Inc. | Systems and methods for landmark selection for navigation |
DE102013011827A1 (en) * | 2013-07-15 | 2015-01-15 | Audi Ag | Method for operating a navigation device, navigation device and motor vehicle |
DE112013007522T5 (en) * | 2013-10-25 | 2016-07-07 | Mitsubishi Electric Corporation | Driving assistance device and driver assistance method |
JP6325806B2 (en) * | 2013-12-06 | 2018-05-16 | 日立オートモティブシステムズ株式会社 | Vehicle position estimation system |
US9476729B2 (en) * | 2014-05-29 | 2016-10-25 | GM Global Technology Operations LLC | Adaptive navigation and location-based services based on user behavior patterns |
US9676386B2 (en) * | 2015-06-03 | 2017-06-13 | Ford Global Technologies, Llc | System and method for controlling vehicle components based on camera-obtained image information |
US10024683B2 (en) * | 2016-06-06 | 2018-07-17 | Uber Technologies, Inc. | User-specific landmarks for navigation systems |
2016
- 2016-05-02 US US15/144,300 patent/US10126141B2/en active Active
2017
- 2017-05-01 EP EP17733192.3A patent/EP3452788A1/en active Pending
- 2017-05-01 CN CN201780004062.1A patent/CN109073404A/en active Pending
- 2017-05-01 WO PCT/US2017/030412 patent/WO2017209878A1/en unknown
2018
- 2018-11-12 US US16/188,215 patent/US20190078905A1/en not_active Abandoned
Patent Citations (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5220507A (en) * | 1990-11-08 | 1993-06-15 | Motorola, Inc. | Land vehicle multiple navigation route apparatus |
US5844505A (en) * | 1997-04-01 | 1998-12-01 | Sony Corporation | Automobile navigation system |
US6321161B1 (en) * | 1999-09-09 | 2001-11-20 | Navigation Technologies Corporation | Method and system for providing guidance about alternative routes with a navigation system |
US20030225508A9 (en) * | 2000-09-12 | 2003-12-04 | Bernd Petzold | Navigational system |
US20040225434A1 (en) * | 2003-05-07 | 2004-11-11 | Gotfried Bradley L. | Vehicle navigation and safety systems |
US20070061066A1 (en) * | 2003-06-26 | 2007-03-15 | Christian Bruelle-Drews | Method for assisting navigation and navigation system |
US20060009188A1 (en) * | 2004-07-09 | 2006-01-12 | Aisin Aw Co., Ltd. | Method of producing traffic signal information, method of providing traffic signal information, and navigation apparatus |
JP2010008068A (en) * | 2008-06-24 | 2010-01-14 | Denso Corp | Navigation device |
US20100063722A1 (en) * | 2008-09-09 | 2010-03-11 | Aisin Aw Co., Ltd. | On-board vehicle navigation device and program |
US20110133956A1 (en) * | 2008-10-08 | 2011-06-09 | Toyota Jidosha Kabushiki Kaisha | Drive assist device and method |
US20100131304A1 (en) * | 2008-11-26 | 2010-05-27 | Fred Collopy | Real time insurance generation |
US20100312466A1 (en) * | 2009-02-26 | 2010-12-09 | Navigon Ag | Method and device for calculating alternative routes in a navigation system |
US20130261969A1 (en) * | 2010-12-24 | 2013-10-03 | Pioneer Corporation | Navigation apparatus, control method, program, and storage medium |
US20120215432A1 (en) * | 2011-02-18 | 2012-08-23 | Honda Motor Co., Ltd. | Predictive Routing System and Method |
US20130253754A1 (en) * | 2012-03-26 | 2013-09-26 | Google Inc. | Robust Method for Detecting Traffic Signals and their Associated States |
US20140129121A1 (en) * | 2012-11-06 | 2014-05-08 | Apple Inc. | Routing Based On Detected Stops |
US20150226565A1 (en) * | 2013-06-21 | 2015-08-13 | Here Global B.V. | Method and apparatus for route determination based on one or more non-travel lanes |
US20150057923A1 (en) * | 2013-08-21 | 2015-02-26 | Kyungpook National University Industry-Academic Cooperation Foundation | Method for car navigating using traffic signal data |
US20160232785A1 (en) * | 2015-02-09 | 2016-08-11 | Kevin Sunlin Wang | Systems and methods for traffic violation avoidance |
US9631943B2 (en) * | 2015-02-10 | 2017-04-25 | Mobileye Vision Technologies Ltd. | Systems and systems for identifying landmarks |
US20170010612A1 (en) * | 2015-07-07 | 2017-01-12 | Honda Motor Co., Ltd. | Vehicle controller, vehicle control method, and vehicle control program |
US20170015239A1 (en) * | 2015-07-13 | 2017-01-19 | Ford Global Technologies, Llc | System and method for no-turn-on-red reminder |
US20170069208A1 (en) * | 2015-09-04 | 2017-03-09 | Nokia Technologies Oy | Method and apparatus for providing an alternative route based on traffic light status |
US20170103652A1 (en) * | 2015-10-08 | 2017-04-13 | Denso Corporation | Driving support apparatus |
JP2017097684A (en) * | 2015-11-26 | 2017-06-01 | マツダ株式会社 | Sign recognition system |
US20170154528A1 (en) * | 2015-11-26 | 2017-06-01 | Mazda Motor Corporation | Traffic sign recognition system |
US9569963B1 (en) * | 2015-12-03 | 2017-02-14 | Denso International America, Inc. | Systems and methods for informing driver of traffic regulations |
US10126141B2 (en) * | 2016-05-02 | 2018-11-13 | Google Llc | Systems and methods for using real-time imagery in navigation |
Non-Patent Citations (1)
Title |
---|
1 to 6, 8 to 16 *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111626332A (en) * | 2020-04-27 | 2020-09-04 | 中国地质大学(武汉) | Rapid semi-supervised classification method based on picture volume active limit learning machine |
WO2022188154A1 (en) * | 2021-03-12 | 2022-09-15 | 深圳市大疆创新科技有限公司 | Front view to top view semantic segmentation projection calibration parameter determination method and adaptive conversion method, image processing device, mobile platform, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN109073404A (en) | 2018-12-21 |
WO2017209878A1 (en) | 2017-12-07 |
US10126141B2 (en) | 2018-11-13 |
US20170314954A1 (en) | 2017-11-02 |
EP3452788A1 (en) | 2019-03-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10126141B2 (en) | Systems and methods for using real-time imagery in navigation | |
US11604077B2 (en) | Systems and method for using visual landmarks in initial navigation | |
US10331957B2 (en) | Method, apparatus, and system for vanishing point/horizon estimation using lane models | |
US11501104B2 (en) | Method, apparatus, and system for providing image labeling for cross view alignment | |
US11677930B2 (en) | Method, apparatus, and system for aligning a vehicle-mounted device | |
US20160178389A1 (en) | Method for operating a navigation system, navigation system and motor vehicle | |
US11055862B2 (en) | Method, apparatus, and system for generating feature correspondence between image views | |
US10515293B2 (en) | Method, apparatus, and system for providing skip areas for machine learning | |
US20210240762A1 (en) | Finding Locally Prominent Semantic Features for Navigation and Geocoding | |
JP2024020616A (en) | Provision of additional instruction regarding difficult steering during navigation | |
US20230175854A1 (en) | Explicit Signage Visibility Cues in Driving Navigation | |
US12050109B2 (en) | Systems and method for using visual landmarks in initial navigation | |
CN112991805A (en) | Driving assisting method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: GOOGLE INC., CALIFORNIA
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:GOLDING, ANDREW W.;MURPHY, KEVIN;SIGNING DATES FROM 20160428 TO 20160430;REEL/FRAME:047516/0447
Owner name: GOOGLE LLC, CALIFORNIA
Free format text: CHANGE OF NAME;ASSIGNOR:GOOGLE INC.;REEL/FRAME:047573/0540
Effective date: 20170929
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |