WO2021128834A1 - Navigation method and apparatus based on computer vision, computer device, and medium - Google Patents

Navigation method and apparatus based on computer vision, computer device, and medium

Info

Publication number
WO2021128834A1
WO2021128834A1 (PCT/CN2020/105015)
Authority
WO
WIPO (PCT)
Prior art keywords
image
obstacle
target
recognition
data
Prior art date
Application number
PCT/CN2020/105015
Other languages
French (fr)
Chinese (zh)
Inventor
温桂龙
Original Assignee
深圳壹账通智能科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司 filed Critical 深圳壹账通智能科技有限公司
Publication of WO2021128834A1 publication Critical patent/WO2021128834A1/en

Links

Images

Classifications

    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01C - MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 11/00 - Photogrammetry or videogrammetry, e.g. stereogrammetry; Photographic surveying
    • G01C 11/02 - Picture taking arrangements specially adapted for photogrammetry or photographic surveying, e.g. controlling overlapping of pictures
    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01C - MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 11/00 - Photogrammetry or videogrammetry, e.g. stereogrammetry; Photographic surveying
    • G01C 11/04 - Interpretation of pictures
    • G01C 11/30 - Interpretation of pictures by triangulation
    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01C - MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 21/00 - Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C 21/20 - Instruments for performing navigational calculations
    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01C - MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 21/00 - Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C 21/26 - Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
    • G01C 21/34 - Route searching; Route guidance
    • G01C 21/3407 - Route searching; Route guidance specially adapted for specific applications
    • G01C 21/3415 - Dynamic re-routing, e.g. recalculating the route when the user deviates from calculated route or after detecting real-time traffic data or accidents

Definitions

  • This application relates to the field of artificial intelligence navigation, and in particular to a navigation method, apparatus, computer device, and storage medium based on computer vision.
  • Existing navigation systems generally provide functions such as speech synthesis, text reading, zooming in and out, and touch feedback, which offer users convenience, help users plan routes, and provide travel mode suggestions.
  • However, the inventor found that users with visual inconvenience cannot perceive the real-time road conditions on the navigation route in real time, making them prone to danger when moving along the navigation route. A user with visual inconvenience here may be a visually impaired user or a user who cannot concentrate on watching the real-time road conditions for other reasons.
  • The embodiments of the present application provide a computer vision-based navigation method, apparatus, computer device, and storage medium to solve the problem that users with visual inconvenience are prone to danger when moving along the navigation route recommended by an existing navigation system.
  • a navigation method based on computer vision including:
  • the navigation request information includes a starting point position and an ending point position
  • a computer vision tool is used to perform binocular distance measurement on the obstacle to determine the distance data between the user's current position and the obstacle;
  • the voice playback system is used to play the evasion reminder information.
  • a navigation device based on computer vision including:
  • a navigation request information acquisition module configured to acquire navigation request information, where the navigation request information includes a starting point position and an ending point position;
  • the first target route acquisition module is configured to perform route planning according to the starting point position and the ending point position, acquire the first target route, and play the navigation voice data corresponding to the first target route by using a voice playback system;
  • the current recognition result obtaining module is used to obtain the real-time road condition video corresponding to the first target route, extract the image to be recognized from the real-time road condition video, preprocess the image to be recognized to obtain the target recognition image, and use the target obstacle recognition model to recognize the target recognition image to obtain the current recognition result;
  • the distance data acquisition module is configured to, if the current recognition result is that there is an obstacle, use a computer vision tool to perform binocular distance measurement on the obstacle, and determine the distance data between the user's current position and the obstacle;
  • the evasion reminder information acquisition module is configured to obtain corresponding evasion reminder information according to the distance data and preset alarm conditions, and use the voice playback system to play the evasion reminder information.
  • a computer device includes a memory, a processor, and computer-readable instructions that are stored in the memory and can run on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
  • the navigation request information includes a starting point position and an ending point position
  • a computer vision tool is used to perform binocular distance measurement on the obstacle to determine the distance data between the user's current position and the obstacle;
  • the voice playback system is used to play the evasion reminder information.
  • One or more readable storage media storing computer readable instructions, where when the computer readable instructions are executed by one or more processors, the one or more processors execute the following steps:
  • the navigation request information includes a starting point position and an ending point position
  • a computer vision tool is used to perform binocular distance measurement on the obstacle to determine the distance data between the user's current position and the obstacle;
  • the voice playback system is used to play the evasion reminder information.
  • The above-mentioned computer vision-based navigation method, apparatus, computer device, and storage medium perform route planning according to the starting point position and the ending point position, obtain the first target route, and use the voice playback system to play the navigation voice data corresponding to the first target route, so as to provide the user with voice navigation and make it convenient for the user to travel based on the navigation voice data heard.
  • The real-time road condition video corresponding to the first target route is acquired, the image to be recognized is extracted from the real-time road condition video, the image to be recognized is preprocessed to obtain the target recognition image, and the target obstacle recognition model is used to recognize the target recognition image to obtain the current recognition result, so as to determine whether there is an obstacle when the user advances along the first target route.
  • a computer vision tool is used to perform binocular distance measurement on the obstacle to quickly determine the distance data between the user's current position and the obstacle.
  • The corresponding evasion reminder information is obtained according to the distance data and the preset alarm conditions and is played by the voice playback system, so as to provide a barrier-free forward solution for users with visual inconvenience and avoid the danger that may arise when a user who cannot view road conditions in real time fails to notice existing obstacles.
  • Fig. 1 is a schematic diagram of an application environment of a computer vision-based navigation method in an embodiment of the present application
  • Fig. 2 is a flowchart of a computer vision-based navigation method in an embodiment of the present application
  • Fig. 3 is a flowchart of a computer vision-based navigation method in an embodiment of the present application
  • Fig. 4 is a flowchart of a computer vision-based navigation method in an embodiment of the present application.
  • Fig. 5 is a flowchart of a computer vision-based navigation method in an embodiment of the present application.
  • Fig. 6 is a flowchart of a computer vision-based navigation method in an embodiment of the present application.
  • Fig. 7 is a flowchart of a computer vision-based navigation method in an embodiment of the present application.
  • Fig. 8 is a functional block diagram of a navigation device based on computer vision in an embodiment of the present application.
  • FIG. 9 is a schematic diagram of the principle of binocular ranging in an embodiment of the present application.
  • Fig. 10 is a schematic diagram of a computer device in an embodiment of the present application.
  • the computer vision-based navigation method provided by the embodiments of the present application can be applied to the application environment as shown in FIG. 1.
  • the computer vision-based navigation method is applied to a navigation system.
  • The navigation system includes a client and a server, as shown in FIG. 1, and provides navigation for users with visual inconvenience as well as corresponding avoidance solutions, so as to ensure the user's travel safety.
  • The client, also called the user terminal, refers to the program that corresponds to the server and provides local services to the user.
  • the client can be installed on, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
  • the server can be implemented as an independent server or a server cluster composed of multiple servers.
  • a computer vision-based navigation method is provided. Taking the method applied to the server in FIG. 1 as an example for description, the method includes the following steps:
  • S201 Acquire navigation request information, where the navigation request information includes a starting point position and an ending point position.
  • The navigation request information refers to the information that the user sends to the server through the client to request that the server plan a route according to the starting point position and the ending point position.
  • The starting point position is the starting point of the navigation route, determined independently by the user.
  • The ending point position is the end point of the navigation route, which also needs to be determined independently by the user.
  • S202 Carry out route planning according to the starting point position and the ending point position, obtain the first target route, and use the voice playback system to play the navigation voice data corresponding to the first target route.
  • The first target route refers to the route from the starting point position to the ending point position obtained by planning according to the navigation request information.
  • Navigation voice data refers to the voice data that provides navigation for the user.
  • the navigation voice data corresponds to the first target route.
  • For example, the navigation voice data may be "please walk xx meters to the left and then turn right" or "you have deviated from the route", etc.
  • The voice playback system refers to a system used for voice playback.
  • the voice playback system can play the first target route.
  • After the server obtains the navigation request information, it inputs the starting point position and the ending point position in the navigation request information into the navigation system, obtains the first target route fed back by the navigation system, and uses the voice playback system to play the navigation voice data corresponding to the first target route, so as to provide the user with voice navigation and enable users with visual inconvenience to follow the first target route according to the played navigation voice data.
  • the route with the shortest walking time may be selected as the first target route.
  • S203 Obtain a real-time video of the road condition corresponding to the first target route, extract the image to be recognized from the real-time video of the road condition, preprocess the image to be recognized, obtain the target recognition image, use the target obstacle recognition model to recognize the target recognition image, and obtain the current recognition result.
  • the real-time video of road conditions refers to the video captured by the client in real time when the user is walking according to the navigation voice data.
  • the image to be recognized refers to the image that needs to be recognized.
  • the video image extraction software is used to extract the image to be recognized in the real-time video of the road condition.
  • For example, the video image extraction software may extract one image to be recognized from the real-time road condition video every 10 seconds; alternatively, an image acquisition port may be used to extract the image to be recognized from the real-time road condition video, for example at a frequency of one image to be recognized every 10 seconds.
  • the target recognition image refers to the image obtained by preprocessing the image to be recognized.
  • the target obstacle recognition model is a model used to recognize obstacle objects in an image.
  • the target obstacle recognition model is used to recognize the target recognition image, so as to determine whether there is an obstacle on the road that prevents the user from moving forward when the user walks along the first target route.
  • the current recognition result is the recognition result of the target recognition image by the target obstacle recognition model.
  • Obstacle objects refer to objects that hinder the user's progress when the user advances along the first target route.
  • Specifically, the client's camera is turned on to record video and obtain the real-time road condition video. The image to be recognized is extracted from the real-time road condition video using the video image extraction software or the image acquisition port, and the image to be recognized is preprocessed (including graying) to obtain the target recognition image. The target obstacle recognition model is then used to recognize the target recognition image and obtain the current recognition result, that is, whether there may be an obstacle when the user advances along the first target route, so that obstacle avoidance can subsequently be handled based on the current recognition result to ensure the user's travel safety.
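  • As an illustration of the frame-extraction step described above, the following Python sketch uses OpenCV (which the embodiments reference as a computer vision tool) to grab one frame from a road-condition video source roughly every 10 seconds; the capture source, the 10-second interval, and the generator structure are assumptions chosen for illustration only and are not specified by the patent.

```python
import cv2

def extract_frames_every_n_seconds(source=0, interval_s=10.0):
    """Yield one BGR frame from `source` about every `interval_s` seconds.

    `source` may be a camera index or a video file path; both the default
    camera index and the 10-second interval are illustrative assumptions.
    """
    cap = cv2.VideoCapture(source)
    if not cap.isOpened():
        raise RuntimeError("Could not open video source")

    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0        # fall back to 30 fps if unknown
    frames_per_sample = max(1, int(round(fps * interval_s)))

    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:                                 # end of stream or camera error
            break
        if frame_idx % frames_per_sample == 0:
            yield frame                            # this frame becomes an "image to be recognized"
        frame_idx += 1
    cap.release()

# Usage sketch: feed each sampled frame into the preprocessing pipeline.
# for image_to_recognize in extract_frames_every_n_seconds("road_conditions.mp4"):
#     ...  # grayscale, binarize, recognize obstacles, etc.
```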
  • Computer vision refers to machine vision that uses cameras and computers instead of human eyes to identify, track, and measure obstacles.
  • Computer vision tools include but are not limited to Halcon, MATLAB+Simulink and OpenCV.
  • The user's current position refers to the position where the user is currently located.
  • the distance data refers to the data of the distance between the user's current position and the obstacle. The distance data is specifically the distance between the three-dimensional coordinates of the obstacle and the three-dimensional coordinates of the user's current position.
  • the three-dimensional coordinates of the user's current position are the origin coordinates.
  • Binocular distance measurement refers to the process of calculating the image extracted from the real-time video of road conditions through computer vision tools to determine the distance between the user's current position and the obstacle.
  • Specifically, this embodiment uses the OpenCV tool to perform calculations on the image extracted from the real-time road condition video, so as to quickly obtain the distance data from the user's current position to the obstacle. When the user has visual inconvenience, the distance data between the user and the obstacle can be calculated with the computer vision tool, so that whether the obstacle will hinder the user from moving forward can be accurately determined; this provides data for subsequently obtaining the evasion reminder information corresponding to the obstacle and ensures the user's travel safety.
  • S205 Obtain corresponding evasion reminder information according to the distance data and preset alarm conditions, and use a voice playback system to play the evasion reminder information.
  • The preset alarm condition is an alarm condition set in advance according to whether the obstacle will hinder the user from moving forward.
  • Evasion reminder information refers to the reminder information generated by judging the distance data against the preset alarm conditions. For example, when the obstacle will not hinder the user, the evasion reminder message may be "Please note that there is an xx obstacle x meters in front of the left";
  • the evasion reminder message may also be "Please note that there is an xx obstacle x meters in front of the left, please stop", or, when the obstacle prevents the user from moving forward, "Please note that there is an xx obstacle x meters in front of the left, the first target route needs to be changed", etc.
  • The evasion reminder information can provide the user with a barrier-free forward solution, avoid the danger that may be caused when a user with visual inconvenience cannot see existing obstacles, and ensure the user's travel safety.
  • route planning is performed according to the starting point and the ending point, the first target route is obtained, and the voice playback system is used to play the navigation voice data corresponding to the first target route, so as to provide users with Voice navigation makes it convenient for users to travel based on the navigation voice data they hear.
  • The real-time road condition video corresponding to the first target route is obtained, the image to be recognized is extracted from the real-time road condition video, the image to be recognized is preprocessed to obtain the target recognition image, and the target obstacle recognition model is used to recognize the target recognition image to obtain the current recognition result, so as to determine whether there is an obstacle when the user moves along the first target route.
  • a computer vision tool is used to perform binocular distance measurement on the obstacle to quickly determine the distance data between the user's current position and the obstacle.
  • The corresponding evasion reminder information is obtained according to the distance data and the preset alarm conditions and is played by the voice playback system, so as to provide a barrier-free forward plan for users with visual inconvenience and avoid the danger that may be caused when a user who cannot view road conditions in real time fails to notice existing obstacles, ensuring the user's safety.
  • The navigation request information in step S201 is the information corresponding to the starting point position and the ending point position independently input by the user. Specifically, the starting point position and the ending point position may be input by the user directly on the client by text input, determined by automatic positioning technology, or input by voice. As shown in FIG. 3, step S201, namely obtaining the navigation request information, includes:
  • S301 Use the voice playback system to play the position input reminder data, and receive the voice data to be recognized that is input through the voice collection system based on the position input reminder data.
  • the location input reminder data refers to the data issued by the voice playback system to remind the user to input the location.
  • the position input reminder data specifically includes the start position input reminder data and the end position input reminder data.
  • the start position input reminder data may be "please enter the start position".
  • the voice data to be recognized is the data that the user said contains the starting position or the ending position.
  • the voice collection system is a system used to collect user voice data, which can be a microphone built into the client.
  • the user can independently select the voice input mode through the client.
  • Specifically, the voice playback system is used to play the position input reminder data.
  • The user inputs voice according to the position input reminder data within the preset waiting time.
  • the voice collection system collects the voice data to be recognized and sends it to the server.
  • the preset waiting time is a preset time for waiting for user feedback data. For example, the preset waiting time may be 1 minute.
  • S302 Use a voice recognition model to recognize the voice data to be recognized, and obtain the target text.
  • the voice recognition model is a model that is pre-trained to recognize the text content in the voice data to be recognized.
  • the target text refers to the text corresponding to the voice data to be recognized, specifically the text corresponding to the start position or the end position.
  • the voice recognition model is used to recognize the voice data to be recognized, and the target text including the starting point or the ending point can be quickly obtained, so as to subsequently plan a route for the user.
  • S303 Use the speech synthesis technology to perform speech synthesis on the target text, and obtain the to-be-confirmed speech data corresponding to the target text.
  • speech synthesis technology is a technology that converts text information generated or input by a computer into speech output.
  • the voice data to be confirmed refers to the voice data obtained after speech synthesis processing is performed on the target text.
  • speech synthesis is performed on the target text to obtain the to-be-confirmed speech data corresponding to the target text for the user to determine whether the starting point or the ending point is accurate, so as to ensure the accuracy of the subsequently generated route.
  • S304 Use the voice playback system to play the voice data to be confirmed, receive the location confirmation information sent by the client, and determine the navigation request information based on the target text and the location confirmation information.
  • the position confirmation information refers to the information that the user confirms that the start position or the end position of the target text is accurate.
  • the voice playback system is used to play the voice data to be confirmed, and the location confirmation information sent by the client is received within a preset waiting time.
  • The location confirmation information may be confirmation-correct information, that is, information used to confirm that the target text is accurate; it may also be confirmation-incorrect information, that is, information indicating that the target text is inaccurate and needs to be modified.
  • Determining the navigation request information based on the target text and the location confirmation information includes: if the location confirmation information is confirmation-correct information, determining the navigation request information based on the target text; if the location confirmation information is confirmation-incorrect information, repeating the step of using the voice playback system to play the position input reminder data and receiving the voice data to be recognized input through the voice collection system, and the subsequent steps, that is, repeating steps S301-S304 until confirmation-correct information is obtained and the navigation request information is determined according to the target text.
  • the user interacts with the client by means of human-computer interaction such as a voice playback system, so as to provide an intelligent location input method for users who are inconvenient with eyes, so as to plan a route later.
  • Specifically, the voice playback system plays the position input reminder data for the position that needs to be determined by the user, the voice data to be recognized is received through the voice collection system, and the voice data to be recognized is recognized to obtain the target text, so that the first target route can subsequently be planned for the user.
  • the speech synthesis technology is used to synthesize the target text to obtain the to-be-confirmed speech data corresponding to the target text, so that the user can determine whether the starting point or the ending point is accurate, so as to ensure the accuracy of the first target route generated subsequently.
  • The voice playback system is used to play the voice data to be confirmed, the location confirmation information sent by the client is received, and the navigation request information is determined based on the target text and the location confirmation information; the user interacts with the client through human-computer interaction means such as the voice playback system, which provides users with visual inconvenience with an intelligent location input method for subsequent route planning.
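  • The interaction of steps S301-S304 can be summarized as a confirm-or-retry loop, sketched below in Python. The callables play, record, recognize, and confirm are hypothetical placeholders standing in for the voice playback system, the voice collection system, the voice recognition model, and the speech-synthesis-plus-confirmation step; none of them are APIs defined by the patent.

```python
from typing import Callable

def acquire_position(
    prompt: str,
    play: Callable[[str], None],          # voice playback system (placeholder)
    record: Callable[[float], bytes],     # voice collection system (placeholder)
    recognize: Callable[[bytes], str],    # voice recognition model (placeholder)
    confirm: Callable[[str], bool],       # synthesizes and plays the text, returns the user's answer
    wait_s: float = 60.0,                 # preset waiting time, e.g. 1 minute
) -> str:
    """Repeat the S301-S304 loop until the user confirms the recognized text."""
    while True:
        play(prompt)                       # S301: play the position input reminder data
        audio = record(wait_s)             # S301: collect the voice data to be recognized
        target_text = recognize(audio)     # S302: recognize the voice data, obtain the target text
        if confirm(target_text):           # S303/S304: play the synthesized text and get confirmation
            return target_text             # confirmed position used to build the navigation request

# Usage sketch (all callables would be supplied by the real voice subsystems):
# start = acquire_position("Please enter the start position", play, record, recognize, confirm)
# end   = acquire_position("Please enter the end position",   play, record, recognize, confirm)
```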
  • Step S203, that is, preprocessing the image to be recognized to obtain the target recognition image, includes:
  • S401 Perform grayscale and binarization processing on the image to be recognized, and obtain the image to be processed.
  • grayscale refers to the process of converting a color image to be recognized into a grayscale image to be recognized, so as to reduce the workload of subsequent image processing.
  • Binarization refers to processing the image obtained after the image to be recognized is grayscaled to generate an image with only two gray levels, and the image with only two gray levels is determined as the image to be processed. Grayscale and binarization processing is performed on each image to be recognized to obtain the image to be processed, which speeds up the processing of subsequent images.
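  • A minimal OpenCV sketch of the grayscale and binarization preprocessing of S401 is shown below; the fixed threshold value of 127 is an illustrative assumption, since the description only requires producing an image with two gray levels.

```python
import cv2

def to_binary(image_to_recognize):
    """Grayscale then binarize an image (S401), returning the image to be processed."""
    gray = cv2.cvtColor(image_to_recognize, cv2.COLOR_BGR2GRAY)   # color -> grayscale
    # A fixed threshold of 127 is only an illustrative choice; any method that
    # yields a two-gray-level image satisfies the description.
    _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
    return binary
```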
  • S402 Use an edge detection algorithm and a straight line detection algorithm to process the image to be processed, and obtain a road condition recognition image.
  • the edge detection algorithm is used to measure, detect and locate the gray level change of the image to be processed, so as to determine the part of the image to be recognized with significant brightness change, and provide technical support for the subsequent segmentation of obstacles and background.
  • the edge detection algorithm includes but is not limited to the Canny edge detection algorithm.
  • the straight line detection algorithm is an algorithm used to identify a straight line from the image to be processed.
  • the straight line detection algorithm includes but is not limited to the Hough transform.
  • Specifically, the Hough transform is used to process the image to be processed, and the straight lines in the image to be processed are extracted to determine the sidewalks, blind paths (tactile paving), or roadways on the road, so as to obtain the road condition recognition image.
  • The edge detection algorithm is used to process the image to be processed and detect the parts of the image with significant brightness changes, and the straight line detection algorithm is used to determine the roads in the image to be processed, so as to efficiently identify road conditions such as sidewalks, blind paths, and roadways in the image to be processed.
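  • To illustrate S402, the sketch below applies OpenCV's Canny edge detector and the probabilistic Hough transform (HoughLinesP, chosen here for simplicity) to the image to be processed; all thresholds and the minimum line length are assumed values chosen for the example, not parameters specified by the patent.

```python
import cv2
import numpy as np

def detect_edges_and_lines(image_to_process):
    """Edge detection plus straight-line detection (S402) on the image to be processed."""
    edges = cv2.Canny(image_to_process, 50, 150)             # hysteresis thresholds are assumed
    lines = cv2.HoughLinesP(
        edges,
        rho=1, theta=np.pi / 180, threshold=80,               # accumulator resolution / votes (assumed)
        minLineLength=60, maxLineGap=10,                       # assumed values
    )
    road_condition_image = cv2.cvtColor(edges, cv2.COLOR_GRAY2BGR)
    if lines is not None:
        for x1, y1, x2, y2 in lines[:, 0]:
            # Detected line segments approximate sidewalks, tactile paving or road edges.
            cv2.line(road_condition_image, (x1, y1), (x2, y2), (0, 255, 0), 2)
    return edges, lines, road_condition_image
```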
  • S403 Use a threshold selection method to segment the obstacle object and the background of the road condition recognition image, and obtain a target recognition image.
  • the threshold selection method refers to the process of using the gray level difference between the target and the background to be extracted in the image, and dividing the pixel level into several categories by setting the gray threshold to realize the separation of the target and the background.
  • the gray threshold is preset, and is used to distinguish obstacles from the background.
  • the target recognition image refers to the image obtained after processing the road condition recognition image. Specifically, it is an image determined based on the comparison result of the gray level difference between the obstacle object and the background extracted from the road condition recognition image and the gray threshold value.
  • the target recognition image is an image that is likely to be an obstacle.
  • Threshold selection methods include, but are not limited to, threshold selection methods based on genetic algorithms.
  • The threshold selection method is used to segment the road condition recognition image into the obstacle object and the background: the part of the road condition recognition image whose gray value is greater than the gray threshold is determined as the obstacle object. This approach has the advantage of a small amount of calculation, so the target recognition image can be obtained quickly. The gray threshold is preset and is used to distinguish obstacle objects from the background in the road condition recognition image.
  • the image to be recognized in the real-time video of road conditions is extracted, the image to be recognized is grayed and binarized, and the image to be processed is obtained to speed up the processing of subsequent images.
  • The edge detection algorithm is used to process the image to be processed and determine the parts of the image with significant brightness changes, providing technical support for the subsequent segmentation of obstacles and background, and the straight line detection algorithm is used to process the image to be processed so as to efficiently identify road conditions on the road surface.
  • the threshold selection method is used to segment the obstacle object and the background of the road condition recognition image, which has the advantage of small calculation amount and can quickly obtain the target recognition image.
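  • The embodiment names a threshold selection method based on genetic algorithms for S403. As a simplified stand-in for a runnable example, the sketch below uses Otsu's method to pick the gray threshold and keeps regions brighter than the threshold as candidate obstacle regions; Otsu's method and the contour step are substitutions made purely for illustration, not the patent's algorithm.

```python
import cv2

def segment_obstacles(road_condition_image_gray):
    """Split a grayscale road-condition image into candidate obstacle regions and background (S403).

    Otsu's method replaces the genetic-algorithm threshold selection described
    in the patent purely for the sake of a runnable example.
    """
    thresh_value, target_recognition_image = cv2.threshold(
        road_condition_image_gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU
    )
    # Pixels above the gray threshold are treated as potential obstacle objects.
    # OpenCV 4.x returns (contours, hierarchy).
    contours, _ = cv2.findContours(
        target_recognition_image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
    )
    candidate_boxes = [cv2.boundingRect(c) for c in contours]   # (x, y, w, h) per region
    return target_recognition_image, candidate_boxes
```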
  • the target recognition image includes a left-eye recognition image and a right-eye recognition image.
  • If the current recognition result is that there is an obstacle, a computer vision tool is used to perform binocular distance measurement on the obstacle to determine the distance data between the user's current position and the obstacle, which includes:
  • S501 Use Zhang Zhengyou's calibration method to calibrate to obtain the parameter data of the binocular camera.
  • the binocular camera refers to the left and right cameras on the user client.
  • The distance data between the user's current position and the obstacle obtained by the binocular camera is more accurate than the distance data obtained by a monocular camera.
  • the Zhang Zhengyou calibration method is a single-plane checkerboard camera calibration method proposed by Professor Zhang Zhengyou in 1998 to obtain the parameter data of the binocular camera.
  • the parameter data includes internal parameter data and external parameter data, the internal parameter data includes focal length and lens distortion parameters, and the external parameter data includes rotation matrix and translation matrix.
  • Specifically, the binocular camera is used in advance to obtain multiple sets of calibration images at different angles and different distances, and then the Zhang Zhengyou calibration method is used to calibrate the multiple sets of calibration images to obtain the parameter data of the binocular camera, which provides technical support for the subsequent image correction of the left-eye recognition image and the right-eye recognition image.
  • the calibration image refers to an image used for calibration, specifically an image used to calculate and determine the parameter data of the binocular camera.
  • the calibration image includes a left target image and a right target image.
  • the image to be recognized includes the left-eye original image and the right-eye original image.
  • The left-eye recognition image is the image obtained by extracting the left-eye original image from the real-time road condition video captured by the left camera and then preprocessing the left-eye original image.
  • The right-eye recognition image is the image obtained by extracting the right-eye original image from the real-time road condition video captured by the right camera and then preprocessing the right-eye original image. It should be noted that the left-eye recognition image and the right-eye recognition image must be images obtained from the real-time road condition video at the same moment, so as to ensure the accuracy of the distance data calculated later.
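  • A compressed sketch of obtaining the binocular camera's parameter data with checkerboard (Zhang Zhengyou style) calibration in OpenCV is shown below. The checkerboard dimensions, square size, and image lists are assumptions; in practice many left/right image pairs taken at different angles and distances are needed, as noted above.

```python
import cv2
import numpy as np

def calibrate_stereo(left_images, right_images, board_size=(9, 6), square_size=0.025):
    """Zhang-style checkerboard calibration of a binocular camera (S501).

    `left_images`/`right_images` are lists of grayscale calibration images taken
    simultaneously; the board size and square size (meters) are assumed values.
    """
    # 3-D coordinates of the checkerboard corners in the board's own frame.
    objp = np.zeros((board_size[0] * board_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2) * square_size

    obj_points, left_points, right_points = [], [], []
    for left, right in zip(left_images, right_images):
        ok_l, corners_l = cv2.findChessboardCorners(left, board_size)
        ok_r, corners_r = cv2.findChessboardCorners(right, board_size)
        if ok_l and ok_r:
            obj_points.append(objp)
            left_points.append(corners_l)
            right_points.append(corners_r)

    image_size = left_images[0].shape[::-1]
    # Intrinsics (focal length, lens distortion) per camera, then the stereo extrinsics
    # (rotation matrix R and translation vector T between the two cameras).
    _, K1, D1, _, _ = cv2.calibrateCamera(obj_points, left_points, image_size, None, None)
    _, K2, D2, _, _ = cv2.calibrateCamera(obj_points, right_points, image_size, None, None)
    _, K1, D1, K2, D2, R, T, _, _ = cv2.stereoCalibrate(
        obj_points, left_points, right_points, K1, D1, K2, D2, image_size,
        flags=cv2.CALIB_FIX_INTRINSIC,
    )
    return K1, D1, K2, D2, R, T
```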
  • S502 Perform image correction on the left-eye recognition image and the right-eye recognition image based on the parameter data, and obtain a left-eye correction image and a right-eye correction image.
  • Image correction refers to mapping and transforming the left-eye recognition image and the right-eye recognition image according to the parameter data, so that the epipolar lines of matching points on the left-eye recognition image and the right-eye recognition image are collinear; collinear epipolar lines can be understood as meaning that the matching points on the left-eye recognition image and the right-eye recognition image lie on the same horizontal line.
  • Image correction based on the parameter data of the binocular camera can ensure the accuracy of the subsequent calculation of the distance data between the user's current position and the obstacle, and effectively reduce the amount of calculation.
  • the matching point on the left-eye recognition image and the right-eye recognition image refers to the point at the same position of the same object in the left-eye recognition image and the right-eye recognition image, for example, a point on the left ear of the same user on the left-eye recognition image and the right-eye recognition image .
  • the left-eye correction image is an image obtained after correcting the left-eye recognition image.
  • the right-eye correction image is an image obtained after correcting the right-eye recognition image.
  • The left-eye recognition image and the right-eye recognition image obtained by the binocular camera have image distortion; if the left-eye recognition image and the right-eye recognition image are used directly to calculate the distance data between the user's current position and the obstacle, the obtained distance data contains a large error.
  • the parameter data obtained by calibration is input into OpenCV, and the affine transformation function of OpenCV is used to realize the mapping transformation processing on the left target image and the right target image.
  • the mapping transformation includes but is not limited to translation, rotation, and scaling.
  • The left-eye image mapping table reflects the mapping relationship between the left target image and the left-eye correction image after the mapping transformation, and the right-eye image mapping table reflects the mapping relationship between the right target image and the right-eye correction image after the mapping transformation.
  • the left-eye recognition image is corrected according to the left-eye image mapping table to obtain the left-eye correction image.
  • the right-eye recognition image is corrected according to the right-eye image mapping table to obtain the right-eye correction image. Perform image correction on the left-eye recognition image and the right-eye recognition image to eliminate the influence of image distortion on the subsequent ranging and ensure the reliability of the subsequent calculation of the distance between the user's current position and the obstacle.
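  • Following S502, the left-eye and right-eye image mapping tables and the corrected images can be produced with OpenCV's stereo rectification and remapping functions, as sketched below; the sketch assumes the parameter data returned by the calibration sketch above.

```python
import cv2

def rectify_pair(left_image, right_image, K1, D1, K2, D2, R, T):
    """Build the left/right image mapping tables and return the corrected images (S502)."""
    image_size = left_image.shape[:2][::-1]        # (width, height)
    # Rectification transforms and projection matrices; Q re-projects disparity to 3-D.
    R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, D1, K2, D2, image_size, R, T)

    # Left-eye and right-eye image mapping tables.
    map1_l, map2_l = cv2.initUndistortRectifyMap(K1, D1, R1, P1, image_size, cv2.CV_16SC2)
    map1_r, map2_r = cv2.initUndistortRectifyMap(K2, D2, R2, P2, image_size, cv2.CV_16SC2)

    left_corrected = cv2.remap(left_image, map1_l, map2_l, cv2.INTER_LINEAR)
    right_corrected = cv2.remap(right_image, map1_r, map2_r, cv2.INTER_LINEAR)
    return left_corrected, right_corrected, Q
```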
  • S503 Use a stereo matching algorithm to perform stereo matching on the left-eye corrected image and the right-eye corrected image to obtain a disparity map.
  • the disparity map refers to an image whose image size is equal to the size of any one of the left-eye correction image and the right-eye correction image, and the element value is the disparity value.
  • the disparity value is the difference between the x-coordinates corresponding to the same point or object imaged by the left-eye camera and the right-eye camera.
  • Stereo matching refers to finding matching pixels in the left-eye correction image and right-eye correction image, and using the positional relationship between the corresponding pixels to obtain a disparity map.
  • Stereo matching algorithms include, but are not limited to, the local BM algorithm and the global SGBM algorithm provided in OpenCV.
  • the stereo matching algorithm used in this embodiment is a global SGBM.
  • The idea of SGBM is to select a disparity for each pixel to form a disparity map, and to set a global energy function related to the disparity map; minimizing this energy function achieves the purpose of solving the optimal disparity of each pixel.
  • Specifically, the stereo matching algorithm selects the disparities of the corresponding pixels in the left-eye correction image and the right-eye correction image to form a disparity map, sets a global energy function related to the disparity map, and minimizes the energy cost function to solve for the optimal disparity of each pixel; the optimal disparity of each pixel is used as the disparity value of that pixel to generate the disparity map, and the distance data between the user's current position and the obstacle can then be accurately calculated based on the disparity map.
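  • The global SGBM matcher referenced above is available in OpenCV as StereoSGBM; a sketch is given below. The parameter values (number of disparities, block size, the P1/P2 smoothness penalties) are typical assumed settings rather than values given by the patent.

```python
import cv2

def compute_disparity(left_corrected_gray, right_corrected_gray, num_disparities=96, block_size=7):
    """Stereo matching with OpenCV's SGBM to obtain a disparity map (S503)."""
    matcher = cv2.StereoSGBM_create(
        minDisparity=0,
        numDisparities=num_disparities,          # must be a multiple of 16
        blockSize=block_size,
        P1=8 * block_size * block_size,          # smoothness penalties of the global energy
        P2=32 * block_size * block_size,
        uniquenessRatio=10,
        speckleWindowSize=100,
        speckleRange=2,
    )
    # OpenCV returns fixed-point disparities scaled by 16.
    disparity = matcher.compute(left_corrected_gray, right_corrected_gray).astype("float32") / 16.0
    return disparity
```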
  • S504 Determine the distance data between the current position of the user and the obstacle based on the disparity map.
  • Specifically, as shown in FIG. 9, the location of the obstacle is point P, the imaging width of the left-eye camera and the right-eye camera is l, the focal length of the binocular camera is f, the distance between the left-eye camera and the right-eye camera is T, x_l and x_r represent the abscissas of the projection points of the obstacle in the left-eye correction image and the right-eye correction image respectively, y_r represents the ordinate of the projection point of the obstacle in the right-eye correction image, the imaging point of the obstacle in the left-eye camera is P_l, and the imaging point of the obstacle in the right-eye camera is P_r. By similar triangles, the depth Z of the obstacle satisfies Z = f * T / (x_l - x_r), where d = x_l - x_r is the disparity value read from the disparity map, so the distance data between the user's current position and the obstacle can be calculated from the disparity map.
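  • With the disparity map in hand, the depth of the obstacle follows from the triangulation relation Z = f * T / (x_l - x_r) described above (OpenCV's reprojectImageTo3D with the Q matrix from rectification is an equivalent route). In the sketch below, taking the median disparity inside the recognized obstacle's bounding box is an illustrative assumption.

```python
import numpy as np

def obstacle_distance(disparity, focal_length_px, baseline_m, obstacle_box):
    """Estimate the distance (meters) to an obstacle from its disparity values (S504).

    `obstacle_box` is an (x, y, w, h) region produced by obstacle recognition;
    using the median disparity inside the box is an illustrative choice.
    """
    x, y, w, h = obstacle_box
    region = disparity[y:y + h, x:x + w]
    valid = region[region > 0]                     # disparity <= 0 means no reliable match
    if valid.size == 0:
        return None                                # distance could not be measured
    d = float(np.median(valid))                    # disparity value d = x_l - x_r (pixels)
    return focal_length_px * baseline_m / d        # Z = f * T / d
```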
  • the Zhang Zhengyou calibration method is used for calibration to obtain parameter data of the binocular camera, which provides technical support for subsequent image correction of the left-eye recognition image and the right-eye recognition image.
  • the stereo matching algorithm is used to perform stereo matching on the left-eye correction image and the right-eye correction image to obtain a disparity map. According to the disparity map, the distance data between the user's current position and the obstacle can be accurately calculated, so as to provide the user with corresponding navigation based on the distance data.
  • the computer vision-based navigation method further includes:
  • S601 Obtain a training image and a test image, where the training image and the test image carry the type of obstacle and the tag of the obstacle.
  • the training image is an image used to train the neural network model to generate a target obstacle recognition model.
  • the test image is an image used to verify the original obstacle recognition model.
  • the obstacle type refers to the type of the object that hinders the user from moving forward.
  • the obstacle type may be a movable obstacle or a fixed obstacle.
  • Obstructive object tags are tags of objects that hinder the user from moving forward.
  • obstructive object tags may be people, dogs, bicycles, trees, and so on.
  • S602 Input the training image into the neural network model for training, and obtain the original obstacle recognition model.
  • Specifically, the training images carrying the obstacle object type and the obstacle object label are input into the neural network model for training.
  • When the neural network model converges, the original obstacle recognition model is obtained; the trained model makes it possible to quickly identify obstacle objects subsequently.
  • S603 Input the test image into the original obstacle recognition model, and obtain the recognition accuracy rate output by the original obstacle recognition model.
  • the recognition accuracy refers to the probability that the original obstacle recognition model can accurately identify the type of obstacle and the tag of the obstacle in the test image.
  • The recognition accuracy rate of the original obstacle recognition model refers to the quotient of the number of test images whose recognition results are accurate and the total number of test images.
  • The preset accuracy threshold is set in advance and is used as the threshold for determining whether the original obstacle recognition model can accurately recognize the obstacle object type and the obstacle object label.
  • the preset accuracy threshold may be 90%.
  • If the recognition accuracy rate is greater than the preset accuracy threshold, it indicates that the original obstacle recognition model has been trained successfully, and the original obstacle recognition model is determined as the target obstacle recognition model, so as to ensure that whether there is an obstacle in the target recognition image can be determined accurately according to the target obstacle recognition model and to ensure the accuracy of obstacle recognition.
  • The training images are input into the neural network model for training, and the original obstacle recognition model is obtained so that obstacle objects can be quickly identified subsequently.
  • The test images are input into the original obstacle recognition model, and the recognition accuracy rate output by the original obstacle recognition model is obtained to verify whether the original obstacle recognition model has been trained successfully.
  • If the recognition accuracy rate is greater than the preset accuracy threshold, the original obstacle recognition model is determined as the target obstacle recognition model, so as to ensure that whether there is an obstacle in the target recognition image can be determined accurately according to the target obstacle recognition model, ensuring the accuracy of obstacle recognition.
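  • The embodiment leaves the neural network model unspecified, so the PyTorch sketch below is only one possible instantiation of the train-then-verify procedure of S601-S603: the architecture, dataset loaders, hyperparameters, and the 90% acceptance threshold default are illustrative assumptions, and the obstacle object type and label are collapsed into a single class index for brevity.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def train_and_select(train_set, test_set, num_classes, accuracy_threshold=0.90, epochs=10):
    """Train an original obstacle recognition model and keep it only if accurate enough (S601-S603)."""
    # Small CNN stand-in for the unspecified neural network model.
    model = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        nn.Flatten(), nn.Linear(32, num_classes),
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
    for _ in range(epochs):                                  # training images -> original model
        for images, labels in train_loader:
            optimizer.zero_grad()
            loss_fn(model(images), labels).backward()
            optimizer.step()

    # Recognition accuracy rate = correctly recognized test images / total test images.
    correct = total = 0
    model.eval()
    with torch.no_grad():
        for images, labels in DataLoader(test_set, batch_size=32):
            predictions = model(images).argmax(dim=1)
            correct += (predictions == labels).sum().item()
            total += labels.numel()
    accuracy = correct / total

    # Only a sufficiently accurate model becomes the target obstacle recognition model.
    return (model, accuracy) if accuracy > accuracy_threshold else (None, accuracy)
```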
  • the obstacle object also carries an obstacle object type, and the obstacle object type includes, but is not limited to, a fixed obstacle object and a movable obstacle object.
  • the corresponding evasion reminder information is obtained according to the distance data and preset alarm conditions, and the evasion reminder information is played by the voice playback system, including:
  • the genetic algorithm is used for path planning, the second target route is obtained, and the second target route is used as the evasion reminder information;
  • the voice playback system is used to play the evasion reminder message.
  • the genetic algorithm is a computational model that simulates the biological evolution process of natural selection and genetic mechanism of Darwin's biological evolution theory, and is a way to search for the optimal solution by simulating the natural evolution process.
  • Specifically, the server uses a genetic algorithm to perform path planning according to the user's current position and the ending point position to obtain the second target route, uses the second target route as the evasion reminder information, and uses the voice playback system to play the evasion reminder information to the user, so that the user can walk without obstruction based on the evasion reminder information and can safely navigate to the ending point position without needing to check directly with his or her eyes. This is especially helpful for users with visual inconvenience or users who cannot check road conditions in real time for other reasons; planning the route for the user based on the distance data and the preset alarm conditions ensures the user's travel safety.
  • When the obstacle object type carried by the obstacle is a movable obstacle, the obstacle may or may not move away at this time, so the user is first reminded to stop, and then the obstacle is detected again. If no obstacle is detected within the preset stop time, the first target route is used as the evasion reminder information, and the voice playback system is used to play the evasion reminder information to remind the user to continue walking; if the obstacle is still detected within the preset stop time, the genetic algorithm is used for path planning to obtain the second target route, the second target route is used as the evasion reminder information, and the voice playback system is used to play the evasion reminder information.
  • During the user's walking, the voice playback system may be used to play continue-walking information according to the distance the user has walked.
  • For example, the continue-walking information may be "You have walked XX meters, please walk straight ahead for XX meters and then turn left; you are XX meters from the target location"; alternatively, a reminder time threshold may be used.
  • For example, the reminder time threshold may be 5 minutes.
  • When the reminder time threshold is reached, the voice playback system is used to play the continue-walking information.
  • When the obstacle object type is a fixed obstacle, the genetic algorithm is used for path planning based on the user's current position and the ending point position, the second target route is obtained, the second target route is used as the evasion reminder information, and the voice playback system is used to play the evasion reminder information.
  • In this case, the second target route is planned for the user as the evasion reminder information, so that the user can walk without obstruction based on the evasion reminder information and can safely navigate to the ending point position without needing to check directly with his or her eyes. This is especially helpful for users with visual inconvenience or users who cannot check road conditions in real time for other reasons, and it ensures the user's travel safety.
  • When the obstacle object type is a movable obstacle, the obstacle is detected, the evasion reminder information is generated based on the detection result, and the voice playback system is used to play the evasion reminder information, so that the user can walk without obstruction based on the evasion reminder information and safely navigate to the target location without needing to view it directly with his or her eyes.
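  • The avoidance logic of S205 for fixed versus movable obstacles can be summarized in a small decision routine, sketched below. The names plan_route_genetic, obstacle_still_present, and play are placeholders for the genetic-algorithm path planner, the re-detection step, and the voice playback system; the distance-threshold form of the preset alarm condition, the 30-second stop time, and the message wording are illustrative assumptions modeled on the examples above.

```python
import time
from typing import Callable

def build_evasion_reminder(
    distance_m: float,
    alarm_distance_m: float,                 # preset alarm condition (assumed form: a distance threshold)
    obstacle_type: str,                      # "fixed" or "movable"
    current_position, end_position, first_target_route,
    plan_route_genetic: Callable,            # placeholder for the genetic-algorithm planner
    obstacle_still_present: Callable[[], bool],
    play: Callable[[str], None],             # voice playback system (placeholder)
    stop_wait_s: float = 30.0,               # preset stop time (assumed)
):
    """Choose and play the evasion reminder information (S205) based on distance and obstacle type."""
    if distance_m > alarm_distance_m:
        play(f"Please note that there is an obstacle {distance_m:.0f} meters ahead.")
        return first_target_route            # obstacle does not yet hinder the user

    if obstacle_type == "fixed":
        # Fixed obstacle: re-plan from the current position to the ending point position.
        second_target_route = plan_route_genetic(current_position, end_position)
        play("Obstacle ahead; the route has been re-planned, please follow the new instructions.")
        return second_target_route

    # Movable obstacle: ask the user to stop, then re-detect after the preset stop time.
    play("Please note that there is an obstacle ahead, please stop.")
    time.sleep(stop_wait_s)
    if not obstacle_still_present():
        play("The obstacle has moved away, please continue walking.")
        return first_target_route
    second_target_route = plan_route_genetic(current_position, end_position)
    play("The obstacle is still present; the route has been re-planned.")
    return second_target_route
```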
  • a computer vision-based navigation device corresponds to the computer vision-based navigation method in the above-mentioned embodiment in a one-to-one correspondence.
  • the computer vision-based navigation device includes a navigation request information acquisition module 801, a first target route acquisition module 802, a current recognition result acquisition module 803, a distance data acquisition module 804, and an avoidance reminder information acquisition module 805.
  • the detailed description of each functional module is as follows:
  • the navigation request information obtaining module 801 is used to obtain navigation request information, and the navigation request information includes a starting point position and an ending point position.
  • the first target route acquisition module 802 is used for route planning according to the starting point position and the ending point position, acquiring the first target route, and playing the navigation voice data corresponding to the first target route by using a voice playback system.
  • the current recognition result acquisition module 803 is used to acquire the real-time road condition video corresponding to the first target route, extract the image to be recognized from the real-time road condition video, preprocess the image to be recognized to obtain the target recognition image, and use the target obstacle recognition model to recognize the target recognition image to obtain the current recognition result.
  • the distance data acquisition module 804 is configured to, if the current recognition result is that there is an obstacle, use a computer vision tool to perform binocular distance measurement on the obstacle, and determine the distance data between the user's current position and the obstacle.
  • the evasion reminder information acquisition module 805 is configured to acquire corresponding evasion reminder information according to the distance data and preset alarm conditions, and use a voice playback system to play the evasion reminder information.
  • the navigation request information acquisition module 801 includes: a location input reminder data playback unit, a target text acquisition unit, a speech synthesis unit, and a location confirmation information receiving unit.
  • the location input reminder data playback unit is used to play the position input reminder data using the voice playback system, and to receive the voice data to be recognized input through the voice collection system based on the position input reminder data.
  • the target text acquisition unit is used to recognize the voice data to be recognized by using a voice recognition model to obtain the target text.
  • the speech synthesis unit is used to synthesize the target text with speech synthesis technology, and obtain the to-be-confirmed speech data corresponding to the target text.
  • the location confirmation information receiving unit is used to play the voice data to be confirmed using the voice playback system, receive the location confirmation information sent by the client, and determine the navigation request information based on the target text and the location confirmation information.
  • the current recognition result acquisition module 803 includes: a to-be-processed image acquisition unit, a road condition recognition image acquisition unit, and a target recognition image acquisition unit.
  • the to-be-processed image acquisition unit is used to perform grayscale and binarization processing on the to-be-identified image to acquire the to-be-processed image.
  • the road condition recognition image acquisition unit is used to process the image to be processed by adopting the edge detection algorithm and the straight line detection algorithm to obtain the road condition recognition image.
  • the target recognition image acquisition unit is used to segment the obstacle object and the background of the road condition recognition image by using the threshold selection method to obtain the target recognition image.
  • the target recognition image includes a left-eye recognition image and a right-eye recognition image.
  • the distance data acquisition module 804 includes: a parameter data acquisition unit, an image correction unit, a disparity map acquisition unit, and a distance data determination unit.
  • the parameter data acquisition unit is used for calibration by Zhang Zhengyou calibration method to obtain parameter data of the binocular camera.
  • the image correction unit is used to perform image correction on the left-eye recognition image and the right-eye recognition image based on the parameter data, and obtain the left-eye correction image and the right-eye correction image.
  • the disparity map acquiring unit is used to perform stereo matching on the left-eye corrected image and the right-eye corrected image by using a stereo matching algorithm to obtain a disparity map.
  • the distance data determining unit is used to determine the distance data between the current position of the user and the obstacle based on the disparity map.
  • the computer vision-based navigation device further includes: training image and test image acquisition unit, original obstacle recognition model acquisition unit, recognition accuracy rate acquisition unit, and target obstacle recognition model determination unit.
  • the training image and test image acquisition unit is used to acquire the training image and the test image, and the training image and the test image carry the obstacle object type and the obstacle object label.
  • the original obstacle recognition model acquisition unit is used to input training images into the neural network model for training, and obtain the original obstacle recognition model.
  • the recognition accuracy rate acquisition unit is used to input the test image into the original obstacle recognition model to obtain the recognition accuracy rate output by the original obstacle recognition model.
  • the target obstacle recognition model determination unit is configured to determine the original obstacle recognition model as the target obstacle recognition model if the recognition accuracy rate is greater than the preset accuracy threshold.
  • the obstacle also carries the type of the obstacle;
  • the avoidance reminder information acquisition module 805 includes: a first judgment unit and a second judgment unit.
  • the first judging unit is configured to, if the distance data meets the preset warning condition, and the type of obstacle carried by the obstacle is a fixed obstacle, based on the user's current position and end position, the genetic algorithm is used to plan the path to obtain the second target route, The second target route is used as the evasion reminder information, and the voice playback system is used to play the evasion reminder information.
  • the second judging unit is used to detect the obstacle if the distance data meets the preset alarm condition and the obstacle type carried by the obstacle is a movable obstacle, generate the evasion reminder information based on the detection result, and use the voice playback system to play the evasion reminder information.
  • the various modules in the above-mentioned computer vision-based navigation device can be implemented in whole or in part by software, hardware, and a combination thereof.
  • the above-mentioned modules may be embedded in the form of hardware or independent of the processor in the computer equipment, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 10.
  • the computer equipment includes a processor, a memory, a network interface, and a database connected through a system bus. Among them, the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a readable storage medium and an internal memory.
  • the readable storage medium stores an operating system, computer readable instructions, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer readable instructions in the readable storage medium.
  • the database of the computer equipment is used to store evasion reminder information.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer-readable instructions are executed by the processor to realize a navigation method based on computer vision.
  • a computer device including a memory, a processor, and computer-readable instructions stored in the memory and capable of running on the processor.
  • When the processor executes the computer-readable instructions, the steps of the computer vision-based navigation method in the above embodiments are implemented, such as steps S201-S205 shown in FIG. 2 or the steps shown in FIG. 3 to FIG. 7, which are not repeated here to avoid repetition.
  • Alternatively, when the processor executes the computer-readable instructions, the functions of the modules/units in the embodiment of the computer vision-based navigation device are implemented, such as the functions of the navigation request information acquisition module 801, the first target route acquisition module 802, the current recognition result acquisition module 803, the distance data acquisition module 804, and the evasion reminder information acquisition module 805 shown in FIG. 8, which are not repeated here to avoid repetition.
  • one or more readable storage media storing computer-readable instructions are provided.
  • when the computer-readable instructions are executed by a processor, the steps of the computer vision-based navigation method in the above embodiments are implemented, such as steps S201-S205 shown in FIG. 2 or the steps shown in FIG. 3 to FIG. 7, which are not repeated here to avoid repetition.
  • alternatively, when the computer-readable instructions are executed by a processor, the functions of the modules/units in the embodiment of the computer vision-based navigation apparatus are implemented, such as the functions of the first target route acquisition module 802, the current recognition result acquisition module 803, the distance data acquisition module 804, and the avoidance reminder information acquisition module 805 shown in FIG. 8, which are not repeated here to avoid repetition.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Automation & Control Theory (AREA)
  • Navigation (AREA)
  • Traffic Control Systems (AREA)

Abstract

A navigation method and apparatus based on computer vision, a computer device, and a storage medium. The method comprises: obtaining a first target route, and playing navigation voice data corresponding to the first target route through a voice playback system; obtaining a real-time road condition video corresponding to the first target route, extracting an image to be recognized from the real-time road condition video, preprocessing the image to be recognized to obtain a target recognition image, and recognizing the target recognition image with a target obstacle recognition model to obtain a current recognition result; if the current recognition result is that an obstacle object exists, performing binocular ranging on the obstacle object with a computer vision tool to determine distance data between the current position of the user and the obstacle object; and obtaining corresponding avoidance reminder information according to the distance data and a preset alarm condition, and playing the avoidance reminder information through the voice playback system, thereby planning a navigation route for the user and ensuring the user's travel safety.

Description

Computer vision-based navigation method, apparatus, computer device, and medium
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on December 25, 2019, with application number 201911356786.X and entitled "Computer vision-based navigation method, apparatus, computer device, and medium", the entire content of which is incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence navigation, and in particular to a computer vision-based navigation method and apparatus, a computer device, and a storage medium.
Background
More and more users install navigation software on a client so as to be provided with route navigation according to a starting position and an end position. Existing navigation systems generally offer functions such as speech synthesis, text reading, zooming, and touch feedback, which provide convenience for users, help them plan routes, and suggest travel modes. When using an existing navigation system for route navigation, the inventor found that a user who cannot conveniently use his or her eyes is unable to perceive the real-time road conditions on the navigation route, and is therefore prone to danger when moving along the navigation route. Such a user may be a visually impaired user or a user who cannot concentrate on watching the real-time road conditions for other reasons.
Summary
The embodiments of the present application provide a computer vision-based navigation method and apparatus, a computer device, and a storage medium, to solve the problem that a user who cannot conveniently use his or her eyes is prone to danger while moving along a navigation route recommended by an existing navigation system.
A computer vision-based navigation method includes:
acquiring navigation request information, where the navigation request information includes a starting position and an end position;
performing route planning according to the starting position and the end position to obtain a first target route, and playing navigation voice data corresponding to the first target route through a voice playback system;
acquiring a real-time road condition video corresponding to the first target route, extracting an image to be recognized from the real-time road condition video, preprocessing the image to be recognized to obtain a target recognition image, and recognizing the target recognition image with a target obstacle recognition model to obtain a current recognition result;
if the current recognition result is that an obstacle object exists, performing binocular ranging on the obstacle object with a computer vision tool to determine distance data between the user's current position and the obstacle object; and
obtaining corresponding avoidance reminder information according to the distance data and a preset alarm condition, and playing the avoidance reminder information through the voice playback system.
A computer vision-based navigation apparatus includes:
a navigation request information acquisition module, configured to acquire navigation request information, where the navigation request information includes a starting position and an end position;
a first target route acquisition module, configured to perform route planning according to the starting position and the end position to obtain a first target route, and play navigation voice data corresponding to the first target route through a voice playback system;
a current recognition result acquisition module, configured to acquire a real-time road condition video corresponding to the first target route, extract an image to be recognized from the real-time road condition video, preprocess the image to be recognized to obtain a target recognition image, and recognize the target recognition image with a target obstacle recognition model to obtain a current recognition result;
a distance data acquisition module, configured to, if the current recognition result is that an obstacle object exists, perform binocular ranging on the obstacle object with a computer vision tool to determine distance data between the user's current position and the obstacle object; and
an avoidance reminder information acquisition module, configured to obtain corresponding avoidance reminder information according to the distance data and a preset alarm condition, and play the avoidance reminder information through the voice playback system.
A computer device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the processor implements the following steps when executing the computer-readable instructions:
acquiring navigation request information, where the navigation request information includes a starting position and an end position;
performing route planning according to the starting position and the end position to obtain a first target route, and playing navigation voice data corresponding to the first target route through a voice playback system;
acquiring a real-time road condition video corresponding to the first target route, extracting an image to be recognized from the real-time road condition video, preprocessing the image to be recognized to obtain a target recognition image, and recognizing the target recognition image with a target obstacle recognition model to obtain a current recognition result;
if the current recognition result is that an obstacle object exists, performing binocular ranging on the obstacle object with a computer vision tool to determine distance data between the user's current position and the obstacle object; and
obtaining corresponding avoidance reminder information according to the distance data and a preset alarm condition, and playing the avoidance reminder information through the voice playback system.
One or more readable storage media storing computer-readable instructions are provided, where the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:
acquiring navigation request information, where the navigation request information includes a starting position and an end position;
performing route planning according to the starting position and the end position to obtain a first target route, and playing navigation voice data corresponding to the first target route through a voice playback system;
acquiring a real-time road condition video corresponding to the first target route, extracting an image to be recognized from the real-time road condition video, preprocessing the image to be recognized to obtain a target recognition image, and recognizing the target recognition image with a target obstacle recognition model to obtain a current recognition result;
if the current recognition result is that an obstacle object exists, performing binocular ranging on the obstacle object with a computer vision tool to determine distance data between the user's current position and the obstacle object; and
obtaining corresponding avoidance reminder information according to the distance data and a preset alarm condition, and playing the avoidance reminder information through the voice playback system.
The details of one or more embodiments of the present application are set forth in the following drawings and description, and other features and advantages of the present application will become apparent from the description, the drawings, and the claims.
In the above computer vision-based navigation method and apparatus, computer device, and storage medium, route planning is performed according to the starting position and the end position to obtain a first target route, and navigation voice data corresponding to the first target route is played through a voice playback system, so that voice navigation is provided for the user and the user can travel according to the navigation voice data he or she hears. A real-time road condition video corresponding to the first target route is acquired, an image to be recognized is extracted from the real-time road condition video and preprocessed to obtain a target recognition image, and the target recognition image is recognized with a target obstacle recognition model to obtain a current recognition result, so as to determine whether an obstacle object exists while the user advances along the first target route. When the current recognition result is that an obstacle object exists, binocular ranging is performed on the obstacle object with a computer vision tool to quickly determine the distance data between the user's current position and the obstacle object. Corresponding avoidance reminder information is obtained according to the distance data and the preset alarm condition and played through the voice playback system, which provides a barrier-free advancing plan for users who cannot conveniently use their eyes, avoids the danger that may arise because the user cannot see an existing obstacle object due to eye inconvenience or an inability to watch the road conditions in real time, and thus ensures the user's travel safety.
Description of the Drawings
To describe the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic diagram of an application environment of a computer vision-based navigation method in an embodiment of the present application;
FIG. 2 is a flowchart of a computer vision-based navigation method in an embodiment of the present application;
FIG. 3 is a flowchart of a computer vision-based navigation method in an embodiment of the present application;
FIG. 4 is a flowchart of a computer vision-based navigation method in an embodiment of the present application;
FIG. 5 is a flowchart of a computer vision-based navigation method in an embodiment of the present application;
FIG. 6 is a flowchart of a computer vision-based navigation method in an embodiment of the present application;
FIG. 7 is a flowchart of a computer vision-based navigation method in an embodiment of the present application;
FIG. 8 is a functional block diagram of a computer vision-based navigation apparatus in an embodiment of the present application;
FIG. 9 is a schematic diagram of the principle of binocular ranging in an embodiment of the present application;
FIG. 10 is a schematic diagram of a computer device in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of the present application rather than all of them. Based on the embodiments in the present application, all other embodiments obtained by a person of ordinary skill in the art without creative effort shall fall within the protection scope of the present application.
The computer vision-based navigation method provided by the embodiments of the present application can be applied in the application environment shown in FIG. 1. Specifically, the method is applied to a navigation system that includes a client and a server as shown in FIG. 1, where the client communicates with the server through a network. The navigation system uses computer vision tools to provide navigation for users who cannot conveniently use their eyes and to provide corresponding avoidance plans, thereby ensuring the users' travel safety. The client, also called the user terminal, refers to a program that corresponds to the server and provides local services for the user. The client may be installed on, but is not limited to, various personal computers, notebook computers, smartphones, tablet computers, and portable wearable devices. The server may be implemented as an independent server or as a server cluster composed of multiple servers.
In an embodiment, as shown in FIG. 2, a computer vision-based navigation method is provided. Taking the method applied to the server in FIG. 1 as an example, the method includes the following steps:
S201: Acquire navigation request information, where the navigation request information includes a starting position and an end position.
The navigation request information refers to information sent by the user to the server through the client, requesting the server to plan a route according to the starting position and the end position. The starting position is the position of the starting point of the navigation route, determined independently by the user. The end position is the position of the end point of the navigation route, determined independently by the user.
S202: Perform route planning according to the starting position and the end position to obtain a first target route, and play navigation voice data corresponding to the first target route through a voice playback system.
The first target route refers to a route from the starting position to the end position obtained by planning according to the navigation request information. The navigation voice data is voice data that provides navigation for the user and corresponds to the first target route; for example, the navigation voice data may be "please walk xx meters to the front left and then turn right" or "you have deviated from the route". The voice playback system is a system used for voice playback; for example, the voice playback system can play the first target route.
Specifically, after obtaining the navigation request information, the server inputs the starting position and the end position contained in it into the navigation system, obtains the first target route fed back by the navigation system, and plays the navigation voice data corresponding to the first target route through the voice playback system, so that voice navigation is provided and a user who cannot conveniently use his or her eyes can follow the first target route according to the played navigation voice data. As an example, the navigation system may plan multiple navigation routes according to the starting position and the end position; in this embodiment, the route with the shortest walking time may be selected as the first target route, as sketched below.
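To make the route-selection rule above concrete, the short sketch below simply picks the candidate route with the shortest estimated walking time. The Route structure and the candidate list are illustrative assumptions, not part of the original disclosure.

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Route:
        waypoints: List[str]       # e.g. ["start", "crossing A", "end"] (illustrative)
        walking_minutes: float     # estimated walking time for this candidate

    def pick_first_target_route(candidates: List[Route]) -> Route:
        # The embodiment selects the candidate route with the shortest walking time.
        return min(candidates, key=lambda r: r.walking_minutes)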
S203: Acquire a real-time road condition video corresponding to the first target route, extract an image to be recognized from the real-time road condition video, preprocess the image to be recognized to obtain a target recognition image, and recognize the target recognition image with a target obstacle recognition model to obtain a current recognition result.
The real-time road condition video refers to the video captured in real time by the client while the user walks according to the navigation voice data. The image to be recognized refers to an image that needs to be recognized. In this embodiment, video frame extraction software is used to extract the image to be recognized from the real-time road condition video; for example, the extraction frequency may be one image to be recognized every 10 seconds. Alternatively, an image acquisition port may be used to extract the image to be recognized from the real-time road condition video, for example also at a frequency of one image every 10 seconds. The target recognition image refers to the image obtained by preprocessing the image to be recognized.
The target obstacle recognition model is a model used to recognize obstacle objects in an image. In this embodiment, the target obstacle recognition model recognizes the target recognition image to determine whether an obstacle object that hinders the user appears on the road while the user walks along the first target route. The current recognition result is the result of recognizing the target recognition image with the target obstacle recognition model. An obstacle object refers to an object that hinders the user's advance along the first target route.
Specifically, before the user walks according to the navigation voice data, the camera of the client is turned on for video recording to obtain the real-time road condition video. The image to be recognized is extracted from the real-time road condition video by video frame extraction software or an image acquisition port, and is preprocessed (for example, converted to grayscale) to obtain the target recognition image. The target obstacle recognition model then recognizes the target recognition image to obtain the current recognition result indicating whether an obstacle object may exist as the user advances along the first target route, so that obstacle avoidance processing can subsequently be performed according to the current recognition result to ensure the user's travel safety. A frame-sampling sketch is given below.
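As a rough illustration of sampling one image to be recognized from the real-time road condition video every 10 seconds, the OpenCV sketch below reads frames from a camera or video file and yields roughly one frame per interval. The video source and the fallback frame rate are assumptions.

    import cv2

    def sample_frames(video_source=0, interval_seconds=10):
        # Yield approximately one frame every `interval_seconds` from the stream.
        cap = cv2.VideoCapture(video_source)
        fps = cap.get(cv2.CAP_PROP_FPS) or 30.0   # fall back to 30 fps if unknown
        step = max(1, int(fps * interval_seconds))
        index = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if index % step == 0:
                yield frame                        # one image to be recognized
            index += 1
        cap.release()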
S204: If the current recognition result is that an obstacle object exists, perform binocular ranging on the obstacle object with a computer vision tool to determine distance data between the user's current position and the obstacle object.
Computer vision refers to machine vision in which cameras and computers are used instead of human eyes to recognize, track, and measure obstacle objects. Computer vision tools include, but are not limited to, Halcon, MATLAB+Simulink, and OpenCV. The user's current position refers to the position where the user currently is. The distance data refers to the distance from the user's current position to the obstacle object; specifically, it is the distance between the three-dimensional coordinates of the obstacle object and the three-dimensional coordinates of the user's current position, with the user's current position taken as the origin. Binocular ranging refers to the process of computing, with a computer vision tool, on the images extracted from the real-time road condition video to determine the distance between the user's current position and the obstacle object.
Specifically, if the current recognition result is that an obstacle object exists, in order to judge whether the obstacle object will affect the user's advance, this embodiment uses the OpenCV tool to compute on the images extracted from the real-time road condition video, so as to quickly obtain the distance data from the user's current position to the position of the obstacle object. When the user cannot conveniently use his or her eyes, calculating the distance data between the user and the obstacle object with a computer vision tool makes it possible to accurately judge whether the obstacle object will hinder the user, and provides data for subsequently generating the avoidance reminder information corresponding to the obstacle object, thereby ensuring the user's travel safety.
S205: Obtain corresponding avoidance reminder information according to the distance data and a preset alarm condition, and play the avoidance reminder information through the voice playback system.
The preset alarm condition refers to a reminder condition set in advance according to whether the obstacle object will hinder the user's advance. The avoidance reminder information refers to the reminder information generated after judging the distance data against the preset alarm condition. For example, when the obstacle object will not hinder the user, the avoidance reminder information may be "please note that there is an xx obstacle x meters to the front left, please keep going"; when the obstacle object is likely to hinder the user, it may be "please note that there is an xx obstacle x meters to the front left, please stop"; and when the obstacle object makes it impossible for the user to advance, it may be "please note that there is an xx obstacle x meters to the front left, requesting a change of the first target route". The avoidance reminder information provides the user with a barrier-free advancing plan, avoids the danger that may arise because the user cannot see the existing obstacle object due to eye inconvenience, and ensures the user's travel safety.
Specifically, binocular ranging is performed on the target recognition image in which an obstacle object exists to obtain the distance data between the user's current position and the obstacle object, and then the corresponding avoidance reminder information is obtained according to the distance data and the preset alarm condition and played through the voice playback system, thereby providing a barrier-free advancing plan for users who cannot conveniently use their eyes and avoiding the danger that may arise because the user cannot see the existing obstacle object due to eye inconvenience or an inability to watch the road conditions in real time. Compared with existing navigation systems that can only provide routes, this embodiment recognizes obstacle objects and performs binocular ranging with a computer vision tool to quickly obtain the distance data between the user's current position and the obstacle object and generate corresponding avoidance reminder information, so that a user who cannot conveniently use his or her eyes can travel normally according to the avoidance reminder information, ensuring travel safety. A sketch of mapping the distance data to a reminder message is given below.
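The mapping from distance data and obstacle label to a spoken reminder can be sketched as below. The distance thresholds and the message wording are illustrative placeholders for the preset alarm condition; the actual condition and phrasing are defined by the navigation system.

    def build_avoidance_reminder(distance_m, obstacle_label,
                                 warn_threshold_m=5.0, stop_threshold_m=2.0):
        # Illustrative thresholds standing in for the preset alarm condition.
        if distance_m > warn_threshold_m:
            return (f"Please note: a {obstacle_label} is about {distance_m:.0f} m "
                    f"ahead, please keep going.")
        if distance_m > stop_threshold_m:
            return (f"Please note: a {obstacle_label} is about {distance_m:.0f} m "
                    f"ahead, please slow down.")
        return (f"Please stop: a {obstacle_label} is blocking the way, "
                f"requesting a change of the first target route.")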
In the computer vision-based navigation method provided by this embodiment, route planning is performed according to the starting position and the end position to obtain the first target route, and the navigation voice data corresponding to the first target route is played through the voice playback system, so that voice navigation is provided and the user can travel according to the navigation voice data he or she hears. The real-time road condition video corresponding to the first target route is acquired, the image to be recognized is extracted from it and preprocessed to obtain the target recognition image, and the target recognition image is recognized with the target obstacle recognition model to obtain the current recognition result, so as to determine whether an obstacle object exists while the user advances along the first target route. When the current recognition result is that an obstacle object exists, binocular ranging is performed on the obstacle object with a computer vision tool to quickly determine the distance data between the user's current position and the obstacle object. Corresponding avoidance reminder information is obtained according to the distance data and the preset alarm condition and played through the voice playback system, thereby providing a barrier-free advancing plan for users who cannot conveniently use their eyes, avoiding the danger that may arise because the user cannot see the existing obstacle object due to eye inconvenience or an inability to watch the road conditions in real time, and ensuring the user's travel safety.
In an embodiment, the navigation request information in step S201 is information corresponding to the starting position and the end position input independently by the user. Specifically, the starting position and the end position may be entered directly by the user on the client as text, determined by automatic positioning technology, or entered by voice input. As shown in FIG. 3, step S201, namely acquiring the navigation request information, includes:
S301: Play position input reminder data through the voice playback system, and receive the voice data to be recognized that is input through the voice collection system based on the position input reminder data.
The position input reminder data refers to data issued by the voice playback system to remind the user to input a position. It specifically includes starting position input reminder data and end position input reminder data; for example, the starting position input reminder data may be "please enter the starting position". The voice data to be recognized is the data spoken by the user that contains the starting position or the end position. The voice collection system is a system used to collect the user's voice data, and may be a microphone built into the client.
Specifically, the user can independently select the voice input mode through the client. In the voice input mode, the voice playback system plays the position input reminder data, and within a preset waiting time the user speaks the voice data to be recognized corresponding to the position input reminder data; the voice collection system collects the voice data to be recognized and sends it to the server. The preset waiting time is a preset time for waiting for the user's feedback; for example, it may be 1 minute.
S302: Recognize the voice data to be recognized with a voice recognition model to obtain target text.
The voice recognition model is a pre-trained model used to recognize the text content in the voice data to be recognized. The target text refers to the text corresponding to the voice data to be recognized, specifically the text corresponding to the starting position or the end position. In this embodiment, recognizing the voice data to be recognized with the voice recognition model makes it possible to quickly obtain the target text containing the starting position or the end position, so that a route can subsequently be planned for the user.
S303: Perform speech synthesis on the target text with speech synthesis technology to obtain voice data to be confirmed corresponding to the target text.
Speech synthesis technology converts text information generated by or input into a computer into voice output. The voice data to be confirmed refers to the voice data obtained after speech synthesis is performed on the target text. In this embodiment, speech synthesis is performed on the target text to obtain the corresponding voice data to be confirmed, so that the user can determine whether the starting position or the end position is accurate, ensuring the accuracy of the subsequently generated route.
S304: Play the voice data to be confirmed through the voice playback system, receive the position confirmation information sent by the client, and determine the navigation request information based on the target text and the position confirmation information.
The position confirmation information refers to information in which the user confirms whether the starting position or the end position in the target text is accurate. In this embodiment, the voice playback system plays the voice data to be confirmed, and the position confirmation information sent by the client is received within the preset waiting time. The position confirmation information may be confirmation-correct information, that is, information used to confirm that the target text is accurate, or confirmation-incorrect information, that is, information used to confirm that the target text is inaccurate and needs to be modified.
Determining the navigation request information based on the target text and the position confirmation information specifically includes: if the position confirmation information is confirmation-correct information, determining the navigation request information based on the target text; if the position confirmation information is confirmation-incorrect information, repeating the step of playing the position input reminder data through the voice playback system and receiving the voice data to be recognized input through the voice collection system, and the subsequent steps, that is, repeatedly performing steps S301-S304 until confirmation-correct information is obtained, and then determining the navigation request information according to the target text. In this embodiment, interaction between the user and the client is realized through human-computer interaction such as the voice playback system, providing an intelligent position input method for users who cannot conveniently use their eyes so that a route can subsequently be planned.
In the computer vision-based navigation method provided by this embodiment, the position input reminder data that requires the user's input is played through the voice playback system, the voice data to be recognized is received from the voice collection system and recognized to obtain the target text, so that the first target route can subsequently be planned for the user. Speech synthesis is performed on the target text to obtain the corresponding voice data to be confirmed, so that the user can determine whether the starting position or the end position is accurate, ensuring the accuracy of the subsequently generated first target route. The voice data to be confirmed is played through the voice playback system, the position confirmation information sent by the client is received, and the navigation request information is determined based on the target text and the position confirmation information. Interaction between the user and the client through the voice playback system and other human-computer interaction means provides an intelligent position input method for users who cannot conveniently use their eyes, so that a route can subsequently be planned. A sketch of this confirmation loop is given after this paragraph.
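A minimal sketch of the confirmation loop in steps S301-S304 is given below. The asr, tts_play, and listen callables are hypothetical stand-ins for the speech recognition model, the voice playback system, and the voice collection system; none of them are named in the original text.

    def collect_position(prompt, asr, tts_play, listen):
        # Repeat S301-S304 until the user confirms the recognized position.
        while True:
            tts_play(prompt)                       # play the position input reminder data
            target_text = asr(listen())            # recognize the voice data to be recognized
            tts_play(f"You said: {target_text}. Is that correct?")
            confirmation = asr(listen()).strip().lower()
            if confirmation in ("yes", "correct"):
                return target_text                 # position confirmed (confirmation-correct)
            # otherwise the reminder is played again and the loop repeats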
In an embodiment, as shown in FIG. 4, step S203, namely preprocessing the image to be recognized to obtain the target recognition image, includes:
S401: Perform grayscale conversion and binarization on the image to be recognized to obtain the image to be processed.
Grayscale conversion refers to the process of converting the color image to be recognized into a grayscale image, so as to reduce the workload of subsequent image processing. Binarization refers to processing the grayscaled image to generate an image with only two gray levels, and the image with only two gray levels is determined as the image to be processed. Each image to be recognized is grayscaled and binarized to obtain the image to be processed, which speeds up subsequent image processing.
S402: Process the image to be processed with an edge detection algorithm and a straight line detection algorithm to obtain a road condition recognition image.
The edge detection algorithm is used to measure, detect, and locate the grayscale changes of the image to be processed, so as to determine the parts of the image with significant brightness changes and provide technical support for the subsequent segmentation of obstacle objects from the background. Edge detection algorithms include, but are not limited to, the Canny edge detection algorithm.
The straight line detection algorithm is used to identify straight lines in the image to be processed; straight line detection algorithms include, but are not limited to, the Hough transform. In this embodiment, the Hough transform is used to process the image to be processed and extract the straight lines in it, so as to determine sidewalks, tactile paving, roads, and the like on the road surface and obtain the road condition recognition image.
The edge detection algorithm is applied to the image to be processed to detect the parts with significant brightness changes, and the straight line detection algorithm is applied to determine the roads in the image, so that road conditions such as sidewalks, tactile paving, and roads can be identified efficiently.
S403: Segment the obstacle object from the background in the road condition recognition image with a threshold selection method to obtain the target recognition image.
The threshold selection method refers to the process of separating the target to be extracted from the background by using the grayscale difference between them and setting a grayscale threshold to divide the pixels into several classes. The grayscale threshold is preset and is used to distinguish the obstacle object from the background. The target recognition image refers to the image obtained after processing the road condition recognition image; specifically, it is determined based on the comparison between the grayscale difference of the extracted obstacle object and background and the grayscale threshold, and is an image that is relatively likely to contain an obstacle object. Threshold selection methods include, but are not limited to, threshold selection based on a genetic algorithm. In this embodiment, the threshold selection method is used to segment the road condition recognition image into obstacle object and background, and the parts of the road condition recognition image whose gray values are greater than the grayscale threshold are determined as the obstacle object. This has the advantage of a small amount of computation, so the target recognition image can be obtained quickly.
In the computer vision-based navigation method provided by this embodiment, the image to be recognized is extracted from the real-time road condition video, grayscaled, and binarized to obtain the image to be processed, which speeds up subsequent image processing. The edge detection algorithm determines the parts of the image to be processed with significant brightness changes and provides technical support for segmenting obstacle objects from the background, while the straight line detection algorithm efficiently identifies the road conditions on the road surface. The threshold selection method segments the obstacle object from the background in the road condition recognition image with a small amount of computation, so the target recognition image can be obtained quickly. A rough OpenCV sketch of this preprocessing pipeline follows.
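A rough OpenCV sketch of the grayscale, binarization, edge detection, straight line detection, and threshold segmentation steps is given below. The grayscale threshold and the Canny/Hough parameters are illustrative choices, not values from the original text.

    import cv2
    import numpy as np

    def preprocess_to_target_image(bgr_image, gray_threshold=127):
        gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)          # grayscale conversion
        _, binary = cv2.threshold(gray, gray_threshold, 255,
                                  cv2.THRESH_BINARY)                 # binarization
        edges = cv2.Canny(binary, 50, 150)                           # edge detection
        lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                                minLineLength=60, maxLineGap=10)     # straight line detection
        # Threshold segmentation: keep pixels brighter than the grayscale
        # threshold as candidate obstacle regions (the target recognition image).
        target = cv2.bitwise_and(gray, gray, mask=binary)
        return target, edges, lines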
In an embodiment, as shown in FIG. 5, the target recognition image includes a left-eye recognition image and a right-eye recognition image, and performing binocular ranging on the obstacle object with a computer vision tool in step S204 to determine the distance data between the user's current position and the obstacle object includes:
S501: Perform calibration with the Zhang Zhengyou calibration method to obtain the parameter data of the binocular camera.
The binocular camera refers to the left camera and the right camera on the user's client; the distance data between the user's current position and the obstacle object obtained with a binocular camera is more accurate than that obtained with a monocular camera. The Zhang Zhengyou calibration method is a camera calibration method based on a single-plane checkerboard, proposed by Professor Zhang Zhengyou in 1998, and is used to obtain the parameter data of the binocular camera. The parameter data includes intrinsic parameter data and extrinsic parameter data: the intrinsic parameter data includes the focal length and the lens distortion parameters, and the extrinsic parameter data includes the rotation matrix and the translation matrix.
Specifically, multiple groups of calibration images at different angles and different distances are captured in advance with the binocular camera, and the Zhang Zhengyou calibration method is then used to calibrate these groups of calibration images to obtain the parameter data of the binocular camera, providing technical support for the subsequent image correction of the left-eye recognition image and the right-eye recognition image. A calibration image refers to an image used for calibration, specifically an image used to compute the parameter data of the binocular camera, and includes a left calibration image and a right calibration image. The image to be recognized includes a left-eye original image and a right-eye original image: the left-eye recognition image is obtained by extracting the left-eye original image from the real-time road condition video captured by the left camera and preprocessing it, and likewise the right-eye recognition image is obtained by extracting the right-eye original image from the real-time road condition video captured by the right camera and preprocessing it. It should be noted that the left-eye recognition image and the right-eye recognition image must be obtained from the real-time road condition video at the same moment, to ensure the accuracy of the subsequently calculated distance data. A calibration sketch with OpenCV is given below.
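The sketch below shows a Zhang-style single-camera calibration from checkerboard views with OpenCV; a full binocular setup would additionally call cv2.stereoCalibrate on the left/right image pairs to obtain the rotation and translation between the two cameras. The pattern size and square size are assumptions.

    import cv2
    import numpy as np

    def calibrate_camera(checkerboard_images, pattern_size=(9, 6), square_size=0.025):
        # Object points of the checkerboard corners in the board plane (Z = 0).
        objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
        objp[:, :2] = np.mgrid[0:pattern_size[0],
                               0:pattern_size[1]].T.reshape(-1, 2) * square_size
        obj_points, img_points, image_size = [], [], None
        for img in checkerboard_images:
            gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            image_size = gray.shape[::-1]
            found, corners = cv2.findChessboardCorners(gray, pattern_size)
            if found:
                obj_points.append(objp)
                img_points.append(corners)
        # Returns the intrinsic matrix and the lens distortion coefficients.
        _, K, dist, _, _ = cv2.calibrateCamera(obj_points, img_points,
                                               image_size, None, None)
        return K, dist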
S502: Perform image correction on the left-eye recognition image and the right-eye recognition image based on the parameter data to obtain a left-eye corrected image and a right-eye corrected image.
Image correction refers to applying a mapping transformation to the left-eye recognition image and the right-eye recognition image according to the parameter data, so that the epipolar lines of matching points in the two images become collinear; collinear epipolar lines can be understood as the matching points of the two images lying on the same horizontal line. Performing image correction based on the parameter data of the binocular camera ensures the accuracy of the subsequently calculated distance data between the user's current position and the obstacle object and effectively reduces the amount of computation. A matching point in the left-eye recognition image and the right-eye recognition image refers to a point at the same position on the same object in both images, for example, a point on the left ear of the same person in both images. The left-eye corrected image is the image obtained after correcting the left-eye recognition image, and the right-eye corrected image is the image obtained after correcting the right-eye recognition image.
Specifically, because the binocular camera is affected by radial and tangential lens distortion, the left-eye recognition image and the right-eye recognition image obtained with it are distorted; if they were used directly to calculate the distance data between the user's current position and the obstacle object, the obtained distance data would contain a large error. In this embodiment, the parameter data obtained by calibration is input into OpenCV, and OpenCV's affine transformation functions are used to apply a mapping transformation to the left calibration image and the right calibration image; the mapping transformation includes, but is not limited to, translation, rotation, and scaling, so that the epipolar lines of matching points in the left and right calibration images become collinear. Based on this mapping transformation, a left-eye image mapping table and a right-eye image mapping table are determined: the left-eye image mapping table reflects the mapping relationship between the left calibration image and the left-eye corrected image after the mapping transformation, and likewise the right-eye image mapping table reflects the mapping relationship between the right calibration image and the right-eye corrected image after the mapping transformation. In this embodiment, the left-eye recognition image is corrected according to the left-eye image mapping table to obtain the left-eye corrected image, and likewise the right-eye recognition image is corrected according to the right-eye image mapping table to obtain the right-eye corrected image. Image correction of the left-eye recognition image and the right-eye recognition image eliminates the influence of image distortion on the subsequent ranging and ensures the reliability of the subsequently calculated distance data between the user's current position and the obstacle object. A rectification sketch is given below.
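A rectification sketch based on the calibration output is shown below; it uses OpenCV's stereo rectification and remapping functions to build the left/right mapping tables and correct an image pair, under the assumption that the intrinsics (K1, D1, K2, D2) and the inter-camera rotation and translation (R, T) are already available.

    import cv2

    def rectify_pair(left_img, right_img, K1, D1, K2, D2, R, T):
        size = (left_img.shape[1], left_img.shape[0])
        # Compute rectification transforms so that matching points share a row.
        R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, D1, K2, D2, size, R, T)
        map1x, map1y = cv2.initUndistortRectifyMap(K1, D1, R1, P1, size, cv2.CV_32FC1)
        map2x, map2y = cv2.initUndistortRectifyMap(K2, D2, R2, P2, size, cv2.CV_32FC1)
        left_rect = cv2.remap(left_img, map1x, map1y, cv2.INTER_LINEAR)    # left mapping table
        right_rect = cv2.remap(right_img, map2x, map2y, cv2.INTER_LINEAR)  # right mapping table
        return left_rect, right_rect, Q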
S503: Perform stereo matching on the left-eye corrected image and the right-eye corrected image with a stereo matching algorithm to obtain a disparity map.
The disparity map refers to an image whose size equals that of either the left-eye corrected image or the right-eye corrected image and whose element values are disparity values. A disparity value is the difference between the x-coordinates of the same point or object as imaged by the left camera and by the right camera.
Stereo matching refers to finding matching pixels in the left-eye corrected image and the right-eye corrected image and using the positional relationship between the corresponding pixels to obtain the disparity map. Stereo matching algorithms include, but are not limited to, the local BM algorithm and the global SGBM algorithm provided in OpenCV. The stereo matching algorithm used in this embodiment is the global SGBM. The idea of SGBM is to select a disparity for each pixel to form a disparity map and to set a global energy function related to the disparity map; minimizing this energy function achieves the goal of solving for the optimal disparity of each pixel.
Specifically, the stereo matching algorithm selects the disparities of the corresponding pixels in the left-eye corrected image and the right-eye corrected image to form a disparity map, sets a global energy function related to the disparity map, and solves for the optimal disparity of each pixel by minimizing this energy cost function. The optimal disparity of each pixel is taken as the disparity value of that pixel to generate the disparity map, from which the distance data between the user's current position and the obstacle object can subsequently be calculated accurately. An SGBM sketch with OpenCV is given below.
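An SGBM sketch with OpenCV is shown below; the matcher parameters are typical values for StereoSGBM, not values taken from the original text.

    import cv2
    import numpy as np

    def compute_disparity(left_rect_gray, right_rect_gray,
                          num_disparities=128, block_size=5):
        sgbm = cv2.StereoSGBM_create(
            minDisparity=0,
            numDisparities=num_disparities,   # must be a multiple of 16
            blockSize=block_size,
            P1=8 * block_size * block_size,   # smoothness penalties of the
            P2=32 * block_size * block_size,  # global energy function
            uniquenessRatio=10,
            speckleWindowSize=100,
            speckleRange=2,
        )
        # StereoSGBM returns fixed-point disparities scaled by 16.
        return sgbm.compute(left_rect_gray, right_rect_gray).astype(np.float32) / 16.0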
S504:基于视差图确定用户当前位置与障碍物体的距离数据。S504: Determine the distance data between the current position of the user and the obstacle based on the disparity map.
Specifically, as shown in Figure 9, the obstacle is located at point P, the width of the left-eye and right-eye cameras is l, the focal length of the binocular camera is f, and the distance (baseline) between the left-eye camera and the right-eye camera is T. x_l and x_r denote the abscissas of the projection of the obstacle in the left-eye corrected image and the right-eye corrected image respectively, y_r denotes the ordinate of the projection of the obstacle in the right-eye corrected image, the imaging point of the obstacle in the left-eye camera is P_l, the imaging point of the obstacle in the right-eye camera is P_r, and the coordinates of the obstacle P are (X, Y, Z). By the triangle similarity principle,

(T - (x_l - x_r)) / (Z - f) = T / Z,

which simplifies to

Z = f·T / (x_l - x_r).

Since the disparity is known, d = x_l - x_r, and therefore

Z = f·T / d.

Taking the right-eye camera as the reference, x_r = f·X / Z and y_r = f·Y / Z, that is,

X = x_r·Z / f = x_r·T / d,

and

Y = y_r·Z / f = y_r·T / d.

In this way the distance data of the obstacle is obtained, so that corresponding navigation can subsequently be provided to the user according to the distance data.
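A tiny numeric sketch of this triangulation is given below; f, T and the pixel coordinates would come from the calibration and the disparity map, and the names are illustrative.

import math

def obstacle_position(x_r, y_r, d, f, T):
    # Z = f*T/d; X and Y follow from the right-camera projection equations.
    if d <= 0:            # zero/negative disparity: mismatch or point at infinity
        return None
    Z = f * T / d
    X = x_r * Z / f
    Y = y_r * Z / f
    return X, Y, Z

def obstacle_distance(x_r, y_r, d, f, T):
    p = obstacle_position(x_r, y_r, d, f, T)
    return None if p is None else math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2)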
In the computer-vision-based navigation method provided in this embodiment, the Zhang Zhengyou calibration method is used for calibration to obtain the parameter data of the binocular camera, which provides technical support for the subsequent image correction of the left-eye recognition image and the right-eye recognition image. Image correction is performed on the left-eye recognition image and the right-eye recognition image based on the parameter data to obtain the left-eye corrected image and the right-eye corrected image, eliminating the influence of image distortion on the subsequent ranging and ensuring the reliability of the distance data later calculated between the user's current position and the obstacle. A stereo matching algorithm then performs stereo matching on the left-eye corrected image and the right-eye corrected image to obtain a disparity map, from which the distance data between the user's current position and the obstacle can be calculated accurately, so that corresponding navigation can subsequently be provided to the user according to the distance data.
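For completeness, the following hedged sketch shows how the chessboard-based (Zhang Zhengyou) calibration summarized here can be carried out with OpenCV to obtain the parameter data consumed by the rectification sketch above; the board size, square size and image lists are assumptions made for illustration.

import cv2
import numpy as np

def stereo_calibrate(left_imgs, right_imgs, board=(9, 6), square=0.025):
    # 3D chessboard corner coordinates in the board's own plane (Z = 0).
    objp = np.zeros((board[0] * board[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board[0], 0:board[1]].T.reshape(-1, 2) * square
    obj_pts, left_pts, right_pts = [], [], []
    for l_img, r_img in zip(left_imgs, right_imgs):
        ok_l, c_l = cv2.findChessboardCorners(l_img, board)
        ok_r, c_r = cv2.findChessboardCorners(r_img, board)
        if ok_l and ok_r:
            obj_pts.append(objp)
            left_pts.append(c_l)
            right_pts.append(c_r)
    size = (left_imgs[0].shape[1], left_imgs[0].shape[0])
    # Per-camera intrinsics first, then the stereo extrinsics (R, T).
    _, K_l, D_l, _, _ = cv2.calibrateCamera(obj_pts, left_pts, size, None, None)
    _, K_r, D_r, _, _ = cv2.calibrateCamera(obj_pts, right_pts, size, None, None)
    _, K_l, D_l, K_r, D_r, R, T, _, _ = cv2.stereoCalibrate(
        obj_pts, left_pts, right_pts, K_l, D_l, K_r, D_r, size,
        flags=cv2.CALIB_FIX_INTRINSIC)
    return K_l, D_l, K_r, D_r, R, T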
在一实施例中,如图6所示,在步骤S203之前,即在采用目标障碍识别模型对目标识别图像进行识别,获取当前识别结果之前,基于计算机视觉的导航方法还包括:In one embodiment, as shown in FIG. 6, before step S203, that is, before the target obstacle recognition model is used to recognize the target recognition image and the current recognition result is obtained, the computer vision-based navigation method further includes:
S601:获取训练图像和测试图像,训练图像和测试图像携带有障碍物体类型和障碍物体标签。S601: Obtain a training image and a test image, where the training image and the test image carry the type of obstacle and the tag of the obstacle.
Among them, the training images are images used to train the neural network model in order to generate the target obstacle recognition model, and the test images are images used to verify the original obstacle recognition model. The obstacle type is the type of object that hinders the user from moving forward; for example, it may be a movable obstacle or a fixed obstacle. The obstacle label is the label of the object that hinders the user, for example a person, a dog, a bicycle or a tree. Further, when training the model, traffic lights and tactile (blind) paths can also be recognized, so as to guide users with eye inconvenience to walk along the tactile path, or to remind color-blind users when the traffic light is red.
S602:将训练图像输入到神经网络模型中进行训练,获取原始障碍识别模型。S602: Input the training image into the neural network model for training, and obtain the original obstacle recognition model.
Specifically, the training images carrying obstacle types and obstacle labels are input into the neural network model; when the neural network model converges, the original obstacle recognition model is obtained. Training the neural network model in this way allows obstacles to be identified quickly later on.
S603:将测试图像输入到原始障碍识别模型中,获取原始障碍识别模型输出的识别准确率。S603: Input the test image into the original obstacle recognition model, and obtain the recognition accuracy rate output by the original obstacle recognition model.
其中,识别准确率是指原始障碍识别模型可以准确识别出测试图像中障碍物体类型和障碍物体标签的概率。Among them, the recognition accuracy refers to the probability that the original obstacle recognition model can accurately identify the type of obstacle and the tag of the obstacle in the test image.
Specifically, multiple test images are input into the original obstacle recognition model to obtain the original recognition results, and each original recognition result is compared with the obstacle type and obstacle label of the corresponding test image to obtain the recognition accuracy of the original obstacle recognition model, so as to verify whether the original obstacle recognition model was trained successfully. Here the recognition accuracy of the original obstacle recognition model is the number of test images whose original recognition results are correct divided by the total number of test images.
S604:若识别准确率大于预设准确阈值,则将原始障碍识别模型确定为目标障碍识别模型。S604: If the recognition accuracy rate is greater than the preset accuracy threshold, determine the original obstacle recognition model as the target obstacle recognition model.
Among them, the preset accuracy threshold is a preset threshold used to judge whether the original obstacle recognition model can accurately recognize obstacle types and obstacle labels; for example, the preset accuracy threshold may be 90%.
Specifically, when the recognition accuracy is greater than the preset accuracy threshold, the original obstacle recognition model has been trained successfully and is determined as the target obstacle recognition model, so that whether an obstacle exists in the target recognition image can later be determined with the target obstacle recognition model, ensuring the accuracy of obstacle recognition.
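The train-then-validate flow of S601-S604 could look roughly like the sketch below, which uses a small generic Keras classifier as a stand-in for the unspecified neural network; the dataset objects, the architecture, and the 90% threshold are assumptions made for illustration only.

import tensorflow as tf

PRESET_ACCURACY_THRESHOLD = 0.90   # assumed value of the preset accuracy threshold

def train_and_select(train_ds, test_ds, num_classes):
    # S602: train a neural network on the labelled training images.
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(224, 224, 3)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(train_ds, epochs=10)          # original obstacle recognition model
    # S603: measure recognition accuracy on the test images.
    _, accuracy = model.evaluate(test_ds)
    # S604: promote the model only if it beats the preset accuracy threshold.
    return model if accuracy > PRESET_ACCURACY_THRESHOLD else None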
In the computer-vision-based navigation method provided in this embodiment, the training images are input into the neural network model for training to obtain the original obstacle recognition model, so that obstacles can be identified quickly later on. The test images are input into the original obstacle recognition model to obtain the recognition accuracy output by the original obstacle recognition model, so as to verify whether the original obstacle recognition model was trained successfully. When the recognition accuracy is greater than the preset accuracy threshold, the original obstacle recognition model is determined as the target obstacle recognition model, so that whether an obstacle exists in the target recognition image can later be determined with the target obstacle recognition model, ensuring the accuracy of obstacle recognition.
在一实施例中,如图7所示,障碍物体还携带有障碍物体类型,障碍物体类型包括但不限于固定障碍物体和可移动障碍物体。步骤S205中,即根据距离数据与预设告警条件,获取对应的规避提醒信息,采用语音播放系统播放规避提醒信息,包括:In an embodiment, as shown in FIG. 7, the obstacle object also carries an obstacle object type, and the obstacle object type includes, but is not limited to, a fixed obstacle object and a movable obstacle object. In step S205, the corresponding evasion reminder information is obtained according to the distance data and preset alarm conditions, and the evasion reminder information is played by the voice playback system, including:
S701: If the distance data meets the preset warning condition and the obstacle type carried by the obstacle is a fixed obstacle, perform path planning with a genetic algorithm based on the user's current position and the end position to obtain a second target route, use the second target route as the evasion reminder information, and play the evasion reminder information with the voice playback system.
Among them, the genetic algorithm (Genetic Algorithm) is a computational model that simulates the biological evolution process of natural selection and genetic mechanisms in Darwin's theory of evolution; it searches for an optimal solution by simulating the process of natural evolution.
Specifically, when the distance data meets the preset warning condition and the obstacle type carried by the obstacle is a fixed obstacle, the first target route cannot be followed any further. The server therefore performs path planning with a genetic algorithm based on the user's current position and the end position to obtain a second target route, uses the second target route as the evasion reminder information, and plays the evasion reminder information to the user with the voice playback system, so that the user can walk unobstructed according to the evasion reminder information and navigate safely to the end position without having to look at the road directly. Especially for users with eye inconvenience, or in other situations where the road conditions cannot be watched in real time, planning a route for the user according to the distance data and the preset warning condition keeps the user's travel safe.
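As a rough illustration of this replanning step, the sketch below runs a deliberately small genetic algorithm over routes encoded as a fixed number of 2D waypoints between the user's position and the destination; the fitness function, obstacle model, and GA parameters are assumptions made for illustration only, not the planner of this application.

import math
import random

def path_length(path):
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))

def fitness(path, obstacles, clearance=1.0):
    # Shorter routes are better; routes passing near obstacles are penalized.
    penalty = sum(1000.0 for p in path for o in obstacles if math.dist(p, o) < clearance)
    return path_length(path) + penalty

def ga_plan(start, goal, obstacles, n_way=5, pop=40, gens=100, bounds=(0.0, 50.0)):
    def rnd():
        return (random.uniform(*bounds), random.uniform(*bounds))
    population = [[start] + [rnd() for _ in range(n_way)] + [goal] for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=lambda p: fitness(p, obstacles))
        survivors = population[: pop // 2]                 # selection (elitism)
        children = []
        while len(survivors) + len(children) < pop:
            a, b = random.sample(survivors, 2)
            cut = random.randint(1, n_way)                 # one-point crossover
            child = a[:cut] + b[cut:-1] + [goal]
            if random.random() < 0.2:                      # mutation of one waypoint
                child[random.randint(1, n_way)] = rnd()
            children.append(child)
        population = survivors + children
    return min(population, key=lambda p: fitness(p, obstacles))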
S702:若距离数据符合预设告警条件,且障碍物体携带的障碍物体类型为可移动障碍物体,对障碍物体进行检测,基于检测结果生成规避提醒信息,采用语音播放系统播放规避提醒信息。S702: If the distance data meets the preset warning condition, and the type of obstacle carried by the obstacle is a movable obstacle, the obstacle is detected, and evasion reminder information is generated based on the detection result, and the evasion reminder information is played by a voice playback system.
Specifically, when the distance data meets the preset warning condition and the obstacle type carried by the obstacle is a movable obstacle, the obstacle may or may not move. The user is first reminded to stop, and then the obstacle is detected again. If no obstacle is detected within the preset stop time, the first target route is used as the evasion reminder information and played with the voice playback system to remind the user to continue walking; if the obstacle is still detected within the preset stop time, path planning is performed with a genetic algorithm based on the user's current position and the end position to obtain a second target route, the second target route is used as the evasion reminder information, and the evasion reminder information is played with the voice playback system.
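A small sketch of this stop-and-recheck behaviour is shown below; detect_obstacle(), speak(), replan() and the stop-time constant are hypothetical helpers standing in for the obstacle detection, voice playback, and genetic-algorithm planning described above.

import time

PRESET_STOP_TIME = 5.0   # seconds, assumed value of the preset stop time

def handle_movable_obstacle(first_route, user_pos, goal, detect_obstacle, speak, replan):
    speak("Obstacle ahead, please stop.")
    deadline = time.monotonic() + PRESET_STOP_TIME
    while time.monotonic() < deadline:
        if not detect_obstacle():
            # The movable obstacle has gone: resume the first target route.
            speak("The path is clear, please continue along the current route.")
            return first_route
        time.sleep(0.5)
    # Obstacle still present after the preset stop time: plan a second target route.
    speak("Obstacle still present, switching to a new route.")
    return replan(user_pos, goal)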
Further, when the distance data does not meet the preset warning condition, the voice playback system plays continue-walking information according to the distance the user has walked; for example, the continue-walking information may be "You have walked XX meters; walk straight ahead for XX meters and then turn left; you are XX meters from the target location". Alternatively, a reminder time threshold may be set, for example 5 minutes, and when the user's walking time reaches the reminder time threshold, the voice playback system plays the continue-walking information.
In the computer-vision-based navigation method provided in this embodiment, if the distance data meets the preset warning condition and the obstacle type carried by the obstacle is a fixed obstacle, path planning is performed with a genetic algorithm based on the user's current position and the end position to obtain a second target route, the second target route is used as the evasion reminder information, and the evasion reminder information is played with the voice playback system. When the obstacle type is a fixed obstacle, the second target route planned for the user serves as the evasion reminder information, so that the user can walk unobstructed according to it and navigate safely to the end position without having to look at the road directly, which is especially valuable for users with eye inconvenience or in other situations where the road conditions cannot be watched in real time, and keeps the user's travel safe. If the distance data meets the preset warning condition and the obstacle type carried by the obstacle is a movable obstacle, the obstacle is detected, evasion reminder information is generated based on the detection result, and the evasion reminder information is played with the voice playback system, so that the user can walk unobstructed according to the evasion reminder information and navigate safely to the target location without having to look directly.
应理解,上述实施例中各步骤的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。It should be understood that the size of the sequence number of each step in the foregoing embodiment does not mean the order of execution. The execution sequence of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiment of the present application.
在一实施例中,提供一种基于计算机视觉的导航装置,该基于计算机视觉的导航装置与上述实施例中基于计算机视觉的导航方法一一对应。如图8所示,该基于计算机视觉的导航装置包括导航请求信息获取模块801、第一目标路线获取模块802、当前识别结果获取模块803、距离数据获取模块804和 规避提醒信息获取模块805。各功能模块详细说明如下:In one embodiment, a computer vision-based navigation device is provided, and the computer vision-based navigation device corresponds to the computer vision-based navigation method in the above-mentioned embodiment in a one-to-one correspondence. As shown in FIG. 8, the computer vision-based navigation device includes a navigation request information acquisition module 801, a first target route acquisition module 802, a current recognition result acquisition module 803, a distance data acquisition module 804, and an avoidance reminder information acquisition module 805. The detailed description of each functional module is as follows:
导航请求信息获取模块801,用于获取导航请求信息,导航请求信息包含起点位置和终点位置。The navigation request information obtaining module 801 is used to obtain navigation request information, and the navigation request information includes a starting point position and an ending point position.
第一目标路线获取模块802,用于根据起点位置和终点位置进行路线规划,获取第一目标路线,采用语音播放系统播放与第一目标路线相对应的导航语音数据。The first target route acquisition module 802 is used for route planning according to the starting point position and the ending point position, acquiring the first target route, and playing the navigation voice data corresponding to the first target route by using a voice playback system.
当前识别结果获取模块803,用于获取第一目标路线对应的路况实时视频,从路况实时视频中提取待识别图像,对待识别图像进行预处理,获取目标识别图像,采用目标障碍识别模型对目标识别图像进行识别,获取当前识别结果。The current recognition result acquisition module 803 is used to acquire the real-time video of the road condition corresponding to the first target route, extract the image to be recognized from the real-time video of the road condition, preprocess the image to be recognized, obtain the target recognition image, and use the target obstacle recognition model to recognize the target The image is recognized and the current recognition result is obtained.
距离数据获取模块804,用于若当前识别结果为存在障碍物体,则采用计算机视觉工具对障碍物体进行双目测距,确定用户当前位置与障碍物体的距离数据。The distance data acquisition module 804 is configured to, if the current recognition result is that there is an obstacle, use a computer vision tool to perform binocular distance measurement on the obstacle, and determine the distance data between the user's current position and the obstacle.
规避提醒信息获取模块805,用于根据距离数据与预设告警条件,获取对应的规避提醒信息,采用语音播放系统播放规避提醒信息。The evasion reminder information acquisition module 805 is configured to acquire corresponding evasion reminder information according to the distance data and preset alarm conditions, and use a voice playback system to play the evasion reminder information.
进一步地,导航请求信息获取模块801,包括:位置输入提醒数据播放单元、目标文字获取单元、语音合成单元和位置确认信息接收单元。Further, the navigation request information acquisition module 801 includes: a location input reminder data playback unit, a target text acquisition unit, a speech synthesis unit, and a location confirmation information receiving unit.
The location input reminder data playback unit is used to play the location input reminder data with the voice playback system and to receive the to-be-recognized voice data input through the voice collection system based on the location reminder data.
目标文字获取单元,用于采用语音识别模型对待识别语音数据进行识别,获取目标文字。The target text acquisition unit is used to recognize the voice data to be recognized by using a voice recognition model to obtain the target text.
语音合成单元,用于采用语音合成技术对目标文字进行语音合成,获取与目标文字相对应的待确认语音数据。The speech synthesis unit is used to synthesize the target text with speech synthesis technology, and obtain the to-be-confirmed speech data corresponding to the target text.
位置确认信息接收单元,用于采用语音播放系统播放待确认语音数据,接收客户端发送的位置确认信息,基于目标文字和位置确认信息,确定导航请求信息。The location confirmation information receiving unit is used to play the voice data to be confirmed using the voice playback system, receive the location confirmation information sent by the client, and determine the navigation request information based on the target text and the location confirmation information.
进一步地,当前识别结果获取模块803,包括:待处理图像获取单元、路况识别图像获取单元和目标识别图像获取单元。Further, the current recognition result acquisition module 803 includes: a to-be-processed image acquisition unit, a road condition recognition image acquisition unit, and a target recognition image acquisition unit.
待处理图像获取单元,用于对待识别图像进行灰度化和二值化处理,获取待处理图像。The to-be-processed image acquisition unit is used to perform grayscale and binarization processing on the to-be-identified image to acquire the to-be-processed image.
路况识别图像获取单元,用于采用边缘检测算法和直线检测算法对待处理图像进行处理,获取路况识别图像。The road condition recognition image acquisition unit is used to process the image to be processed by adopting the edge detection algorithm and the straight line detection algorithm to obtain the road condition recognition image.
目标识别图像获取单元,用于采用阈值选取方法对路况识别图像进行障碍物体与背景进行分割,获取目标识别图像。The target recognition image acquisition unit is used to segment the obstacle object and the background of the road condition recognition image by using the threshold selection method to obtain the target recognition image.
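A compact OpenCV sketch of the preprocessing pipeline that these three units describe (grayscale plus binarization, edge and straight-line detection, and threshold-based foreground/background separation) is given below; the specific operators and parameter values are illustrative choices, not necessarily those of the embodiment.

import cv2
import numpy as np

def preprocess_frame(frame_bgr):
    # Unit 1: grayscale and binarization of the image to be recognized.
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Unit 2: edge detection followed by straight-line detection.
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                            minLineLength=30, maxLineGap=10)
    # Unit 3: threshold-based separation of obstacles from the background.
    _, target = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    return binary, edges, lines, target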
进一步地,目标识别图像包括左目识别图像和右目识别图像,距离数据获取模块804,包括:参数数据获取单元、图像校正单元、视差图获取单元和距离数据确定单元。Further, the target recognition image includes a left-eye recognition image and a right-eye recognition image. The distance data acquisition module 804 includes: a parameter data acquisition unit, an image correction unit, a disparity map acquisition unit, and a distance data determination unit.
参数数据获取单元,用于采用张正友标定法进行标定,获得双目摄像头的参数数据。The parameter data acquisition unit is used for calibration by Zhang Zhengyou calibration method to obtain parameter data of the binocular camera.
图像校正单元,用于基于参数数据对左目识别图像和右目识别图像进行图像校正,获取左目校正图像和右目校正图像。The image correction unit is used to perform image correction on the left-eye recognition image and the right-eye recognition image based on the parameter data, and obtain the left-eye correction image and the right-eye correction image.
视差图获取单元,用于采用立体匹配算法对左目校正图像和右目校正图像进行立体匹配,获取视差图。The disparity map acquiring unit is used to perform stereo matching on the left-eye corrected image and the right-eye corrected image by using a stereo matching algorithm to obtain a disparity map.
距离数据确定单元,用于基于视差图确定用户当前位置与障碍物体的距离数据。The distance data determining unit is used to determine the distance data between the current position of the user and the obstacle based on the disparity map.
进一步地,在当前识别结果获取模块803之前,基于计算机视觉的导航装置还包括:训练图像和测试图像获取单元、原始障碍识别模型获取单元、识别准确率获取单元和目标障碍识别模型确定单元。Further, before the current recognition result acquisition module 803, the computer vision-based navigation device further includes: training image and test image acquisition unit, original obstacle recognition model acquisition unit, recognition accuracy rate acquisition unit, and target obstacle recognition model determination unit.
训练图像和测试图像获取单元,用于获取训练图像和测试图像,训练图像和测试图像携带有障碍物体类型和障碍物体标签。The training image and test image acquisition unit is used to acquire the training image and the test image, and the training image and the test image carry the obstacle object type and the obstacle object label.
原始障碍识别模型获取单元,用于将训练图像输入到神经网络模型中进行训练,获取原始障碍识别模型。The original obstacle recognition model acquisition unit is used to input training images into the neural network model for training, and obtain the original obstacle recognition model.
识别准确率获取单元,用于将测试图像输入到原始障碍识别模型中,获取原始障碍识别模型输出的识别准确率。The recognition accuracy rate acquisition unit is used to input the test image into the original obstacle recognition model to obtain the recognition accuracy rate output by the original obstacle recognition model.
目标障碍识别模型确定单元,用于若识别准确率大于预设准确阈值,则将原始障碍识别模型确定为目标障碍识别模型。The target obstacle recognition model determination unit is configured to determine the original obstacle recognition model as the target obstacle recognition model if the recognition accuracy rate is greater than the preset accuracy threshold.
进一步地,障碍物体还携带有障碍物体类型;规避提醒信息获取模块805,包括:第一判断单元和第二判断单元。Further, the obstacle also carries the type of the obstacle; the avoidance reminder information acquisition module 805 includes: a first judgment unit and a second judgment unit.
The first judging unit is configured to, if the distance data meets the preset warning condition and the obstacle type carried by the obstacle is a fixed obstacle, perform path planning with a genetic algorithm based on the user's current position and the end position to obtain a second target route, use the second target route as the evasion reminder information, and play the evasion reminder information with the voice playback system.
The second judging unit is configured to, if the distance data meets the preset warning condition and the obstacle type carried by the obstacle is a movable obstacle, detect the obstacle, generate evasion reminder information based on the detection result, and play the evasion reminder information with the voice playback system.
关于基于计算机视觉的导航装置的具体限定可以参见上文中对于基于计算机视觉的导航方法的限定,在此不再赘述。上述基于计算机视觉的导航装置中的各个模块可全部或部分通过软件、硬件及其组合来实现。上述各模块可以硬件形式内嵌于或独立于计算机设备中的处理器中,也可以以软件形式存储于计算机设备中的存储器中,以便于处理器调用执行以上各个模块对应的操作。For the specific definition of the navigation device based on computer vision, please refer to the above definition of the navigation method based on computer vision, which will not be repeated here. The various modules in the above-mentioned computer vision-based navigation device can be implemented in whole or in part by software, hardware, and a combination thereof. The above-mentioned modules may be embedded in the form of hardware or independent of the processor in the computer equipment, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.
在一个实施例中,提供了一种计算机设备,该计算机设备可以是服务器,其内部结构图可以如图10所示。该计算机设备包括通过系统总线连接的处理器、存储器、网络接口和数据库。其中,该计算机设备的处理器用于提供计算和控制能力。该计算机设备的存储器包括可读存储介质、内存储器。该可读存储介质存储有操作系统、计算机可读指令和数据库。该内存储器为可读存储介质中的操作系统和计算机可读指令的运行提供环境。该计算机设备的数据库用于存储规避提醒信息。该计算机设备的网络接口用于与外部的终端通过网络连接通信。该计算机可读指令被处理器执行时以实现一种基于计算机视觉的导航方法。In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure diagram may be as shown in FIG. 10. The computer equipment includes a processor, a memory, a network interface, and a database connected through a system bus. Among them, the processor of the computer device is used to provide calculation and control capabilities. The memory of the computer device includes a readable storage medium and an internal memory. The readable storage medium stores an operating system, computer readable instructions, and a database. The internal memory provides an environment for the operation of the operating system and computer readable instructions in the readable storage medium. The database of the computer equipment is used to store evasion reminder information. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer-readable instructions are executed by the processor to realize a navigation method based on computer vision.
In one embodiment, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor. When the processor executes the computer-readable instructions, it implements the steps of the computer-vision-based navigation method in the above embodiments, such as steps S201-S205 shown in FIG. 2 or the steps shown in FIG. 3 to FIG. 7, which are not repeated here to avoid repetition. Alternatively, when the processor executes the computer-readable instructions, it implements the functions of the modules/units in the embodiment of the computer-vision-based navigation device, such as the functions of the navigation request information acquisition module 801, the first target route acquisition module 802, the current recognition result acquisition module 803, the distance data acquisition module 804, and the avoidance reminder information acquisition module 805 shown in FIG. 8, which are not repeated here to avoid repetition.
In an embodiment, one or more readable storage media storing computer-readable instructions are provided. The readable storage media store computer-readable instructions which, when executed by a processor, implement the steps of the computer-vision-based navigation method in the above embodiments, such as steps S201-S205 shown in FIG. 2 or the steps shown in FIG. 3 to FIG. 7, which are not repeated here to avoid repetition. Alternatively, when executed by a processor, the computer-readable instructions implement the functions of the modules/units in the embodiment of the computer-vision-based navigation device, such as the functions of the navigation request information acquisition module 801, the first target route acquisition module 802, the current recognition result acquisition module 803, the distance data acquisition module 804, and the avoidance reminder information acquisition module 805 shown in FIG. 8, which are not repeated here to avoid repetition.
A person of ordinary skill in the art can understand that all or part of the processes in the methods of the foregoing embodiments can be implemented by instructing the relevant hardware through computer-readable instructions. The computer-readable instructions can be stored in a non-volatile computer-readable storage medium, and when executed they may include the processes of the embodiments of the above methods. Any reference to memory, storage, database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
Those skilled in the art can clearly understand that, for convenience and conciseness of description, only the division of the above functional units and modules is used as an example. In practical applications, the above functions can be allocated to different functional units and modules as required; that is, the internal structure of the device is divided into different functional units or modules to complete all or part of the functions described above.
The above-mentioned embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some of their technical features can be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application, and shall all be included within the scope of protection of this application.

Claims (20)

  1. 一种基于计算机视觉的导航方法,其中,包括:A navigation method based on computer vision, which includes:
    获取导航请求信息,所述导航请求信息包含起点位置和终点位置;Acquiring navigation request information, where the navigation request information includes a starting point position and an ending point position;
    根据所述起点位置和所述终点位置进行路线规划,获取第一目标路线,采用语音播放系统播放与所述第一目标路线相对应的导航语音数据;Perform route planning according to the starting point position and the ending point position, obtain a first target route, and use a voice playback system to play navigation voice data corresponding to the first target route;
    获取所述第一目标路线对应的路况实时视频,从所述路况实时视频中提取待识别图像,对所述待识别图像进行预处理,获取目标识别图像,采用目标障碍识别模型对所述目标识别图像进行识别,获取当前识别结果;Acquire a real-time video of the road condition corresponding to the first target route, extract the image to be recognized from the real-time video of the road condition, preprocess the image to be recognized, obtain a target recognition image, and use a target obstacle recognition model to recognize the target Recognize the image and obtain the current recognition result;
    若所述当前识别结果为存在障碍物体,则采用计算机视觉工具对所述障碍物体进行双目测距,确定用户当前位置与所述障碍物体的距离数据;If the current recognition result is that there is an obstacle, a computer vision tool is used to perform binocular distance measurement on the obstacle to determine the distance data between the user's current position and the obstacle;
    根据所述距离数据与预设告警条件,获取对应的规避提醒信息,采用所述语音播放系统播放所述规避提醒信息。According to the distance data and preset alarm conditions, the corresponding evasion reminder information is obtained, and the voice playback system is used to play the evasion reminder information.
  2. 如权利要求1所述的基于计算机视觉的导航方法,其中,所述获取导航请求信息,包括:The computer vision-based navigation method according to claim 1, wherein said obtaining navigation request information comprises:
    Use a voice playback system to play position input reminder data, and receive the to-be-recognized voice data input by the voice collection system based on the position reminder data;
    采用语音识别模型对所述待识别语音数据进行识别,获取目标文字;Recognizing the to-be-recognized voice data using a voice recognition model to obtain the target text;
    采用语音合成技术对所述目标文字进行语音合成,获取与所述目标文字相对应的待确认语音数据;Using speech synthesis technology to perform speech synthesis on the target text, and obtain the to-be-confirmed speech data corresponding to the target text;
    采用所述语音播放系统播放所述待确认语音数据,接收客户端发送的位置确认信息,基于所述目标文字和所述位置确认信息,确定导航请求信息。The voice playback system is used to play the voice data to be confirmed, receive the location confirmation information sent by the client, and determine the navigation request information based on the target text and the location confirmation information.
  3. 如权利要求1所述的基于计算机视觉的导航方法,其中,所述对所述待识别图像进行预处理,获取目标识别图像,包括:The computer vision-based navigation method according to claim 1, wherein the preprocessing the image to be recognized to obtain the target recognition image comprises:
    对所述待识别图像进行灰度化和二值化处理,获取待处理图像;Performing grayscale and binarization processing on the image to be recognized to obtain the image to be processed;
    采用边缘检测算法和直线检测算法对所述待处理图像进行处理,获取路况识别图像;Use an edge detection algorithm and a straight line detection algorithm to process the to-be-processed image to obtain a road condition recognition image;
    采用阈值选取方法对所述路况识别图像进行障碍物体与背景进行分割,获取目标识别图像。The threshold selection method is adopted to segment the obstacle object and the background of the road condition recognition image to obtain the target recognition image.
  4. 如权利要求1所述的基于计算机视觉的导航方法,其中,所述目标识别图像包括左目识别图像和右目识别图像,The computer vision-based navigation method of claim 1, wherein the target recognition image includes a left-eye recognition image and a right-eye recognition image,
    所述采用计算机视觉工具对所述障碍物体进行双目测距,确定用户当前位置与所述障碍物体的距离数据,包括:The binocular distance measurement of the obstacle by using a computer vision tool to determine the distance data between the user's current position and the obstacle includes:
    采用张正友标定法进行标定,获得双目摄像头的参数数据;Use Zhang Zhengyou calibration method to calibrate to obtain the parameter data of the binocular camera;
    基于所述参数数据对所述左目识别图像和所述右目识别图像进行图像校正,获取左目校正图像和右目校正图像;Performing image correction on the left-eye recognition image and the right-eye recognition image based on the parameter data to obtain a left-eye correction image and a right-eye correction image;
    采用立体匹配算法对所述左目校正图像和右目校正图像进行立体匹配,获取视差图;Using a stereo matching algorithm to perform stereo matching on the left-eye corrected image and the right-eye corrected image to obtain a disparity map;
    基于所述视差图确定用户当前位置与所述障碍物体的距离数据。Determine the distance data between the current position of the user and the obstacle based on the disparity map.
  5. 如权利要求1所述的基于计算机视觉的导航方法,其中,在所述采用目标障碍识别模型对所述 目标识别图像进行识别,获取当前识别结果之前,所述基于计算机视觉的导航方法还包括:The computer vision-based navigation method according to claim 1, wherein, before the target recognition image is recognized by the target obstacle recognition model and the current recognition result is obtained, the computer vision-based navigation method further comprises:
    获取训练图像和测试图像,所述训练图像和所述测试图像携带有障碍物体类型和障碍物体标签;Acquiring a training image and a test image, where the training image and the test image carry the type of obstacle and the tag of the obstacle;
    将所述训练图像输入到神经网络模型中进行训练,获取原始障碍识别模型;Input the training image into a neural network model for training, and obtain an original obstacle recognition model;
    将所述测试图像输入到所述原始障碍识别模型中,获取所述原始障碍识别模型输出的识别准确率;Inputting the test image into the original obstacle recognition model, and obtaining the recognition accuracy rate output by the original obstacle recognition model;
    若所述识别准确率大于预设准确阈值,则将原始障碍识别模型确定为目标障碍识别模型。If the recognition accuracy rate is greater than the preset accuracy threshold, the original obstacle recognition model is determined as the target obstacle recognition model.
  6. 如权利要求1所述的基于计算机视觉的导航方法,其中,所述障碍物体还携带有障碍物体类型;The computer vision-based navigation method according to claim 1, wherein the obstacle object also carries the obstacle object type;
    所述根据所述距离数据与预设告警条件,获取对应的规避提醒信息,采用所述语音播放系统播放所述规避提醒信息,包括:The acquiring corresponding evasion reminder information according to the distance data and preset alarm conditions, and playing the evasion reminder information by using the voice playback system includes:
    If the distance data meets the preset warning condition and the type of obstacle carried by the obstacle is a fixed obstacle, perform path planning with a genetic algorithm based on the current position of the user and the end position to obtain a second target route, use the second target route as evasion reminder information, and play the evasion reminder information by using the voice playback system;
    If the distance data meets the preset warning condition and the type of obstacle carried by the obstacle is a movable obstacle, detect the obstacle, generate evasion reminder information based on the detection result, and play the evasion reminder information by using the voice playback system.
  7. 一种基于计算机视觉的导航装置,其中,包括:A navigation device based on computer vision, which includes:
    导航请求信息获取模块,用于获取导航请求信息,所述导航请求信息包含起点位置和终点位置;A navigation request information acquisition module, configured to acquire navigation request information, where the navigation request information includes a starting point position and an ending point position;
    第一目标路线获取模块,用于根据所述起点位置和所述终点位置进行路线规划,获取第一目标路线,采用语音播放系统播放与所述第一目标路线相对应的导航语音数据;The first target route acquisition module is configured to perform route planning according to the starting point position and the end point position, acquire a first target route, and use a voice playback system to play navigation voice data corresponding to the first target route;
    当前识别结果获取模块,用于获取所述第一目标路线对应的路况实时视频,从所述路况实时视频中提取待识别图像,对所述待识别图像进行预处理,获取目标识别图像,采用目标障碍识别模型对所述目标识别图像进行识别,获取当前识别结果;The current recognition result obtaining module is used to obtain the real-time video of the road conditions corresponding to the first target route, extract the image to be recognized from the real-time video of the road condition, preprocess the image to be recognized, obtain the target recognition image, and use the target The obstacle recognition model recognizes the target recognition image and obtains the current recognition result;
    距离数据获取模块,用于若所述当前识别结果为存在障碍物体,则采用计算机视觉工具对所述障碍物体进行双目测距,确定用户当前位置与所述障碍物体的距离数据;The distance data acquisition module is configured to, if the current recognition result is that there is an obstacle, use a computer vision tool to perform binocular distance measurement on the obstacle, and determine the distance data between the user's current position and the obstacle;
    规避提醒信息获取模块,用于根据所述距离数据与预设告警条件,获取对应的规避提醒信息,采用所述语音播放系统播放所述规避提醒信息。The evasion reminder information acquisition module is configured to obtain corresponding evasion reminder information according to the distance data and preset alarm conditions, and use the voice playback system to play the evasion reminder information.
  8. 如权利要求7所述的基于计算机视觉的导航装置,其中,导航请求信息获取模块,包括:8. The computer vision-based navigation device according to claim 7, wherein the navigation request information acquisition module comprises:
    位置输入提醒数据播放单元,用于采用语音播放系统播放位置输入提醒数据,接收语音采集系统基于所述位置提醒数据输入的待识别语音数据;The location input reminder data playback unit is configured to use the voice playback system to play location input reminder data, and receive the voice data to be recognized input by the voice collection system based on the location reminder data;
    目标文字获取单元,用于采用语音识别模型对所述待识别语音数据进行识别,获取目标文字;The target text acquisition unit is configured to recognize the to-be-recognized voice data using a voice recognition model to acquire the target text;
    语音合成单元,用于采用语音合成技术对所述目标文字进行语音合成,获取与所述目标文字相对应的待确认语音数据;A speech synthesis unit, configured to use speech synthesis technology to perform speech synthesis on the target text, and obtain the to-be-confirmed speech data corresponding to the target text;
    位置确认信息接收单元,用于采用所述语音播放系统播放所述待确认语音数据,接收客户端发送的位置确认信息,基于所述目标文字和所述位置确认信息,确定导航请求信息。The location confirmation information receiving unit is configured to use the voice playback system to play the voice data to be confirmed, receive the location confirmation information sent by the client, and determine the navigation request information based on the target text and the location confirmation information.
  9. 一种计算机设备,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机可读指令,其中,所述处理器执行所述计算机可读指令时实现如下步骤:A computer device includes a memory, a processor, and computer-readable instructions that are stored in the memory and can run on the processor, wherein the processor implements the following steps when the processor executes the computer-readable instructions:
    获取导航请求信息,所述导航请求信息包含起点位置和终点位置;Acquiring navigation request information, where the navigation request information includes a starting point position and an ending point position;
    根据所述起点位置和所述终点位置进行路线规划,获取第一目标路线,采用语音播放系统播放与所述第一目标路线相对应的导航语音数据;Perform route planning according to the starting point position and the ending point position, obtain a first target route, and use a voice playback system to play navigation voice data corresponding to the first target route;
    Acquire a real-time video of the road condition corresponding to the first target route, extract the image to be recognized from the real-time video of the road condition, preprocess the image to be recognized to obtain a target recognition image, and use a target obstacle recognition model to recognize the target recognition image to obtain the current recognition result;
    若所述当前识别结果为存在障碍物体,则采用计算机视觉工具对所述障碍物体进行双目测距,确定用户当前位置与所述障碍物体的距离数据;If the current recognition result is that there is an obstacle, a computer vision tool is used to perform binocular distance measurement on the obstacle to determine the distance data between the user's current position and the obstacle;
    根据所述距离数据与预设告警条件,获取对应的规避提醒信息,采用所述语音播放系统播放所述规避提醒信息。According to the distance data and preset alarm conditions, the corresponding evasion reminder information is obtained, and the voice playback system is used to play the evasion reminder information.
  10. 如权利要求9所述的计算机设备,其中,所述获取导航请求信息,包括:9. The computer device according to claim 9, wherein said obtaining navigation request information comprises:
    Use a voice playback system to play position input reminder data, and receive the to-be-recognized voice data input by the voice collection system based on the position reminder data;
    采用语音识别模型对所述待识别语音数据进行识别,获取目标文字;Recognizing the to-be-recognized voice data using a voice recognition model to obtain the target text;
    采用语音合成技术对所述目标文字进行语音合成,获取与所述目标文字相对应的待确认语音数据;Using speech synthesis technology to perform speech synthesis on the target text, and obtain the to-be-confirmed speech data corresponding to the target text;
    采用所述语音播放系统播放所述待确认语音数据,接收客户端发送的位置确认信息,基于所述目标文字和所述位置确认信息,确定导航请求信息。The voice playback system is used to play the voice data to be confirmed, receive the location confirmation information sent by the client, and determine the navigation request information based on the target text and the location confirmation information.
  11. 如权利要求9所述的计算机设备,其中,所述对所述待识别图像进行预处理,获取目标识别图像,包括:9. The computer device according to claim 9, wherein the preprocessing the image to be recognized to obtain the target recognition image comprises:
    对所述待识别图像进行灰度化和二值化处理,获取待处理图像;Performing grayscale and binarization processing on the image to be recognized to obtain the image to be processed;
    采用边缘检测算法和直线检测算法对所述待处理图像进行处理,获取路况识别图像;Use an edge detection algorithm and a straight line detection algorithm to process the to-be-processed image to obtain a road condition recognition image;
    采用阈值选取方法对所述路况识别图像进行障碍物体与背景进行分割,获取目标识别图像。The threshold selection method is adopted to segment the obstacle object and the background of the road condition recognition image to obtain the target recognition image.
  12. 如权利要求9所述的计算机设备,其中,所述目标识别图像包括左目识别图像和右目识别图像,所述采用计算机视觉工具对所述障碍物体进行双目测距,确定用户当前位置与所述障碍物体的距离数据,包括:The computer device of claim 9, wherein the target recognition image includes a left-eye recognition image and a right-eye recognition image, and the computer vision tool is used to perform binocular distance measurement on the obstacle to determine the current position of the user and the Distance data of obstacles, including:
    采用张正友标定法进行标定,获得双目摄像头的参数数据;Use Zhang Zhengyou calibration method to calibrate to obtain the parameter data of the binocular camera;
    基于所述参数数据对所述左目识别图像和所述右目识别图像进行图像校正,获取左目校正图像和右目校正图像;Performing image correction on the left-eye recognition image and the right-eye recognition image based on the parameter data to obtain a left-eye correction image and a right-eye correction image;
    采用立体匹配算法对所述左目校正图像和右目校正图像进行立体匹配,获取视差图;Using a stereo matching algorithm to perform stereo matching on the left-eye corrected image and the right-eye corrected image to obtain a disparity map;
    基于所述视差图确定用户当前位置与所述障碍物体的距离数据。Determine the distance data between the current position of the user and the obstacle based on the disparity map.
  13. 如权利要求9所述的计算机设备,其中,在所述采用目标障碍识别模型对所述目标识别图像进行识别,获取当前识别结果之前,所述基于计算机视觉的导航方法还包括:9. The computer device according to claim 9, wherein, before the target recognition image is recognized by the target obstacle recognition model and the current recognition result is obtained, the computer vision-based navigation method further comprises:
    获取训练图像和测试图像,所述训练图像和所述测试图像携带有障碍物体类型和障碍物体标签;Acquiring a training image and a test image, where the training image and the test image carry the type of obstacle and the tag of the obstacle;
    将所述训练图像输入到神经网络模型中进行训练,获取原始障碍识别模型;Input the training image into a neural network model for training, and obtain an original obstacle recognition model;
    将所述测试图像输入到所述原始障碍识别模型中,获取所述原始障碍识别模型输出的识别准确率;Inputting the test image into the original obstacle recognition model, and obtaining the recognition accuracy rate output by the original obstacle recognition model;
    若所述识别准确率大于预设准确阈值,则将原始障碍识别模型确定为目标障碍识别模型。If the recognition accuracy rate is greater than the preset accuracy threshold, the original obstacle recognition model is determined as the target obstacle recognition model.
  14. 如权利要求9所述的计算机设备,其中,所述障碍物体还携带有障碍物体类型;9. The computer device according to claim 9, wherein the obstacle object also carries an obstacle object type;
    所述根据所述距离数据与预设告警条件,获取对应的规避提醒信息,采用所述语音播放系统播放所述规避提醒信息,包括:The acquiring corresponding evasion reminder information according to the distance data and preset alarm conditions, and playing the evasion reminder information by using the voice playback system includes:
    If the distance data meets the preset warning condition and the type of obstacle carried by the obstacle is a fixed obstacle, perform path planning with a genetic algorithm based on the current position of the user and the end position to obtain a second target route, use the second target route as evasion reminder information, and play the evasion reminder information by using the voice playback system;
    If the distance data meets the preset warning condition and the type of obstacle carried by the obstacle is a movable obstacle, detect the obstacle, generate evasion reminder information based on the detection result, and play the evasion reminder information by using the voice playback system.
  15. 一个或多个存储有计算机可读指令的可读存储介质,其中,所述计算机可读指令被一个或多个处理器执行时,使得所述一个或多个处理器执行如下步骤:One or more readable storage media storing computer readable instructions, where when the computer readable instructions are executed by one or more processors, the one or more processors execute the following steps:
    获取导航请求信息,所述导航请求信息包含起点位置和终点位置;Acquiring navigation request information, where the navigation request information includes a starting point position and an ending point position;
    根据所述起点位置和所述终点位置进行路线规划,获取第一目标路线,采用语音播放系统播放与所述第一目标路线相对应的导航语音数据;Perform route planning according to the starting point position and the ending point position, obtain a first target route, and use a voice playback system to play navigation voice data corresponding to the first target route;
    Acquire a real-time video of the road condition corresponding to the first target route, extract the image to be recognized from the real-time video of the road condition, preprocess the image to be recognized to obtain a target recognition image, and use a target obstacle recognition model to recognize the target recognition image to obtain the current recognition result;
    若所述当前识别结果为存在障碍物体,则采用计算机视觉工具对所述障碍物体进行双目测距,确定用户当前位置与所述障碍物体的距离数据;If the current recognition result is that there is an obstacle, a computer vision tool is used to perform binocular distance measurement on the obstacle to determine the distance data between the user's current position and the obstacle;
    根据所述距离数据与预设告警条件,获取对应的规避提醒信息,采用所述语音播放系统播放所述规避提醒信息。According to the distance data and preset alarm conditions, the corresponding evasion reminder information is obtained, and the voice playback system is used to play the evasion reminder information.
  16. 如权利要求15所述的可读存储介质,其中,所述获取导航请求信息,包括:The readable storage medium according to claim 15, wherein said obtaining navigation request information comprises:
    Use a voice playback system to play position input reminder data, and receive the to-be-recognized voice data input by the voice collection system based on the position reminder data;
    采用语音识别模型对所述待识别语音数据进行识别,获取目标文字;Recognizing the to-be-recognized voice data using a voice recognition model to obtain the target text;
    采用语音合成技术对所述目标文字进行语音合成,获取与所述目标文字相对应的待确认语音数据;Using speech synthesis technology to perform speech synthesis on the target text, and obtain the to-be-confirmed speech data corresponding to the target text;
    采用所述语音播放系统播放所述待确认语音数据,接收客户端发送的位置确认信息,基于所述目标文字和所述位置确认信息,确定导航请求信息。The voice playback system is used to play the voice data to be confirmed, receive the location confirmation information sent by the client, and determine the navigation request information based on the target text and the location confirmation information.
  17. 如权利要求15所述的可读存储介质,其中,所述对所述待识别图像进行预处理,获取目标识别图像,包括:15. The readable storage medium of claim 15, wherein the preprocessing the image to be recognized to obtain the target recognition image comprises:
    对所述待识别图像进行灰度化和二值化处理,获取待处理图像;Performing grayscale and binarization processing on the image to be recognized to obtain the image to be processed;
    采用边缘检测算法和直线检测算法对所述待处理图像进行处理,获取路况识别图像;Use an edge detection algorithm and a straight line detection algorithm to process the to-be-processed image to obtain a road condition recognition image;
    采用阈值选取方法对所述路况识别图像进行障碍物体与背景进行分割,获取目标识别图像。The threshold selection method is adopted to segment the obstacle object and the background of the road condition recognition image to obtain the target recognition image.
  18. 如权利要求15所述的可读存储介质,其中,所述目标识别图像包括左目识别图像和右目识别图像,15. The readable storage medium of claim 15, wherein the target recognition image includes a left-eye recognition image and a right-eye recognition image,
    所述采用计算机视觉工具对所述障碍物体进行双目测距,确定用户当前位置与所述障碍物体的距离数据,包括:The binocular distance measurement of the obstacle by using a computer vision tool to determine the distance data between the user's current position and the obstacle includes:
    采用张正友标定法进行标定,获得双目摄像头的参数数据;Use Zhang Zhengyou calibration method to calibrate to obtain the parameter data of the binocular camera;
    基于所述参数数据对所述左目识别图像和所述右目识别图像进行图像校正,获取左目校正图像和右目校正图像;Performing image correction on the left-eye recognition image and the right-eye recognition image based on the parameter data to obtain a left-eye correction image and a right-eye correction image;
    采用立体匹配算法对所述左目校正图像和右目校正图像进行立体匹配,获取视差图;Using a stereo matching algorithm to perform stereo matching on the left-eye corrected image and the right-eye corrected image to obtain a disparity map;
    基于所述视差图确定用户当前位置与所述障碍物体的距离数据。Determine the distance data between the current position of the user and the obstacle based on the disparity map.
  19. 如权利要求15所述的可读存储介质,其中,在所述采用目标障碍识别模型对所述目标识别图像进行识别,获取当前识别结果之前,所述基于计算机视觉的导航方法还包括:15. The readable storage medium of claim 15, wherein, before the target recognition image is recognized by the target obstacle recognition model and the current recognition result is obtained, the computer vision-based navigation method further comprises:
    获取训练图像和测试图像,所述训练图像和所述测试图像携带有障碍物体类型和障碍物体标签;Acquiring a training image and a test image, where the training image and the test image carry the type of obstacle and the tag of the obstacle;
    将所述训练图像输入到神经网络模型中进行训练,获取原始障碍识别模型;Input the training image into a neural network model for training, and obtain an original obstacle recognition model;
    将所述测试图像输入到所述原始障碍识别模型中,获取所述原始障碍识别模型输出的识别准确率;Inputting the test image into the original obstacle recognition model, and obtaining the recognition accuracy rate output by the original obstacle recognition model;
    若所述识别准确率大于预设准确阈值,则将原始障碍识别模型确定为目标障碍识别模型。If the recognition accuracy rate is greater than the preset accuracy threshold, the original obstacle recognition model is determined as the target obstacle recognition model.
  20. The readable storage medium according to claim 15, wherein the obstacle object further carries an obstacle object type, and
    the acquiring corresponding evasion reminder information according to the distance data and a preset alarm condition and playing the evasion reminder information by using the voice playback system includes:
    if the distance data meets the preset alarm condition and the obstacle object type carried by the obstacle object is a fixed obstacle object, performing path planning by using a genetic algorithm based on the user's current position and the end position to obtain a second target route, using the second target route as the evasion reminder information, and playing the evasion reminder information by using the voice playback system;
    if the distance data meets the preset alarm condition and the obstacle object type carried by the obstacle object is a movable obstacle object, detecting the obstacle object, generating the evasion reminder information based on a detection result, and playing the evasion reminder information by using the voice playback system.
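For illustration only: a Python sketch of the alarm branching in claim 20. The genetic-algorithm route planner, the moving-obstacle detector, and the voice playback system are not specified in the application, so they are passed in as callables; every name and the 2-metre alarm distance are hypothetical.

    from typing import Any, Callable

    def handle_obstacle_alarm(distance_m: float,
                              obstacle_type: str,  # "fixed" or "movable"
                              current_position: Any,
                              end_position: Any,
                              plan_route_genetic: Callable[[Any, Any], str],
                              detect_obstacle_motion: Callable[[], str],
                              speak: Callable[[str], None],
                              alarm_distance_m: float = 2.0) -> None:
        if distance_m > alarm_distance_m:
            return  # preset alarm condition not met: no evasion reminder needed

        if obstacle_type == "fixed":
            # Re-plan around the fixed obstacle and read the second target route aloud.
            second_target_route = plan_route_genetic(current_position, end_position)
            speak("Obstacle ahead, new route: " + second_target_route)
        elif obstacle_type == "movable":
            # Describe the detected motion of the movable obstacle as the evasion reminder.
            detection_result = detect_obstacle_motion()
            speak("Moving obstacle " + detection_result +
                  ", about %.1f metres away" % distance_m)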
PCT/CN2020/105015 2019-12-25 2020-07-28 Navigation method and apparatus based on computer vision, computer device, and medium WO2021128834A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911356786.XA CN111060074A (en) 2019-12-25 2019-12-25 Navigation method, device, computer equipment and medium based on computer vision
CN201911356786.X 2019-12-25

Publications (1)

Publication Number Publication Date
WO2021128834A1 true WO2021128834A1 (en) 2021-07-01

Family

ID=70303426

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/105015 WO2021128834A1 (en) 2019-12-25 2020-07-28 Navigation method and apparatus based on computer vision, computer device, and medium

Country Status (2)

Country Link
CN (1) CN111060074A (en)
WO (1) WO2021128834A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111060074A (en) * 2019-12-25 2020-04-24 深圳壹账通智能科技有限公司 Navigation method, device, computer equipment and medium based on computer vision
CN112083908B (en) * 2020-07-29 2023-05-23 联想(北京)有限公司 Method for simulating relative movement direction of object and audio output device
CN111932792B (en) * 2020-08-14 2021-12-17 西藏洲明电子科技有限公司 Movable datamation storage mechanism
CN112902987B (en) * 2021-02-02 2022-07-15 北京三快在线科技有限公司 Pose correction method and device
CN113687810A (en) * 2021-05-25 2021-11-23 青岛海尔科技有限公司 Voice navigation method and device, storage medium and electronic equipment
CN113419257A (en) * 2021-06-29 2021-09-21 深圳市路卓科技有限公司 Positioning calibration method, device, terminal equipment, storage medium and program product

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102389361B (en) * 2011-07-18 2014-06-25 浙江大学 Blindman outdoor support system based on computer vision
CN102973395B (en) * 2012-11-30 2015-04-08 中国舰船研究设计中心 Multifunctional intelligent blind guiding method, processor and multifunctional intelligent blind guiding device
CA2950791C (en) * 2013-08-19 2019-04-16 State Grid Corporation Of China Binocular visual navigation system and method based on power robot
CN105105992B (en) * 2015-09-11 2017-09-22 广州杰赛科技股份有限公司 A kind of obstacle detection method, device and intelligent watch
CN106871906B (en) * 2017-03-03 2020-08-28 西南大学 Navigation method and device for blind person and terminal equipment
CN108618356A (en) * 2018-07-05 2018-10-09 上海草家物联网科技有限公司 A kind of intelligent school bag and its management system with barrier prompting function
CN109029452B (en) * 2018-07-10 2020-02-21 深圳先进技术研究院 Wearable navigation equipment and navigation method

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130258078A1 (en) * 2012-03-27 2013-10-03 Yu-Chien Huang Guide device for the blind
KR20180097962A (en) * 2017-02-24 2018-09-03 전자부품연구원 Situation determining guide apparatus and method based image analysis
CN108743266A (en) * 2018-06-29 2018-11-06 合肥思博特软件开发有限公司 A kind of blindmen intelligent navigation avoidance trip householder method and system
CN108844545A (en) * 2018-06-29 2018-11-20 合肥信亚达智能科技有限公司 A kind of auxiliary traveling method and system based on image recognition
CN109059920A (en) * 2018-06-29 2018-12-21 合肥信亚达智能科技有限公司 A kind of blind traffic safety monitoring intelligent navigation methods and systems
CN111060074A (en) * 2019-12-25 2020-04-24 深圳壹账通智能科技有限公司 Navigation method, device, computer equipment and medium based on computer vision

Also Published As

Publication number Publication date
CN111060074A (en) 2020-04-24

Similar Documents

Publication Publication Date Title
WO2021128834A1 (en) Navigation method and apparatus based on computer vision, computer device, and medium
US11727593B1 (en) Automated data capture
US11900619B2 (en) Intelligent vehicle trajectory measurement method based on binocular stereo vision system
WO2022083402A1 (en) Obstacle detection method and apparatus, computer device, and storage medium
TWI798305B (en) Systems and methods for updating highly automated driving maps
Peng et al. A smartphone-based obstacle sensor for the visually impaired
WO2021056841A1 (en) Positioning method, path determining method and apparatus, robot, and storage medium
JP2016029564A (en) Target detection method and target detector
US20170003132A1 (en) Method of constructing street guidance information database, and street guidance apparatus and method using street guidance information database
EP2720193A2 (en) Method and system for detecting uneven road surface
CA3083430C (en) Urban environment labelling
Shunsuke et al. GNSS/INS/on-board camera integration for vehicle self-localization in urban canyon
US20230326247A1 (en) Information processing device
WO2022041869A1 (en) Road condition prompt method and apparatus, and electronic device, storage medium and program product
WO2022227761A1 (en) Target tracking method and apparatus, electronic device, and storage medium
CN104123776A (en) Object statistical method and system based on images
WO2020133488A1 (en) Vehicle detection method and device
CN106920260B (en) Three-dimensional inertial blind guiding method, device and system
US20210150745A1 (en) Image processing method, device, electronic apparatus, and computer readable storage medium
Gundewar et al. A review on an obstacle detection in navigation of visually impaired
KR20180068483A (en) System and method for building a location information database of road sign, apparatus and method for estimating location of vehicle using the same
CN101964054A (en) Friendly track detection system based on visual processing
CN107844749B (en) Road surface detection method and device, electronic device and storage medium
CN113469045B (en) Visual positioning method and system for unmanned integrated card, electronic equipment and storage medium
TWI451990B (en) System and method for lane localization and markings

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 20907472

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN EP: public notification in the EP bulletin as the address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 04/11/2022)

122 EP: PCT application non-entry in European phase

Ref document number: 20907472

Country of ref document: EP

Kind code of ref document: A1