CN112634303B - Method, system, device and storage medium for assisting blind person in visual reconstruction - Google Patents

Method, system, device and storage medium for assisting blind person in visual reconstruction

Info

Publication number
CN112634303B
CN112634303B CN202011599079.6A
Authority
CN
China
Prior art keywords
data
key point
point cloud
visual
edge
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011599079.6A
Other languages
Chinese (zh)
Other versions
CN112634303A (en)
Inventor
史业民
俞益洲
李一鸣
乔昕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Shenrui Bolian Technology Co Ltd
Shenzhen Deepwise Bolian Technology Co Ltd
Original Assignee
Beijing Shenrui Bolian Technology Co Ltd
Shenzhen Deepwise Bolian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Shenrui Bolian Technology Co Ltd, Shenzhen Deepwise Bolian Technology Co Ltd filed Critical Beijing Shenrui Bolian Technology Co Ltd
Priority to CN202011599079.6A priority Critical patent/CN112634303B/en
Publication of CN112634303A publication Critical patent/CN112634303A/en
Application granted granted Critical
Publication of CN112634303B publication Critical patent/CN112634303B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61FFILTERS IMPLANTABLE INTO BLOOD VESSELS; PROSTHESES; DEVICES PROVIDING PATENCY TO, OR PREVENTING COLLAPSING OF, TUBULAR STRUCTURES OF THE BODY, e.g. STENTS; ORTHOPAEDIC, NURSING OR CONTRACEPTIVE DEVICES; FOMENTATION; TREATMENT OR PROTECTION OF EYES OR EARS; BANDAGES, DRESSINGS OR ABSORBENT PADS; FIRST-AID KITS
    • A61F9/00Methods or devices for treatment of the eyes; Devices for putting-in contact lenses; Devices to correct squinting; Apparatus to guide the blind; Protective devices for the eyes, carried on the body or in the hand
    • A61F9/08Devices or methods enabling eye-patients to replace direct visual perception by another kind of perception
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61NELECTROTHERAPY; MAGNETOTHERAPY; RADIATION THERAPY; ULTRASOUND THERAPY
    • A61N1/00Electrotherapy; Circuits therefor
    • A61N1/18Applying electric currents by contact electrodes
    • A61N1/32Applying electric currents by contact electrodes alternating or intermittent currents
    • A61N1/36Applying electric currents by contact electrodes alternating or intermittent currents for stimulation
    • A61N1/36046Applying electric currents by contact electrodes alternating or intermittent currents for stimulation of the eye
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Public Health (AREA)
  • Ophthalmology & Optometry (AREA)
  • Biomedical Technology (AREA)
  • Veterinary Medicine (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Health & Medical Sciences (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Vascular Medicine (AREA)
  • Image Analysis (AREA)
  • Processing Or Creating Images (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a method, a system, a device and a storage medium for assisting a blind person in visual reconstruction. It belongs to the technical field of computer vision and addresses the slow response speed and limited practicality of existing blind-guiding methods. The method comprises the following steps: acquiring visual data in real time; determining the edge of a target object from the visual data and forming a key point cloud; forming auxiliary data describing the contour of the target object from the key point cloud; and transmitting the auxiliary data to intracerebral electrodes in a contact or non-contact manner, so that the intracerebral electrodes coupled with the retina receive signal input. While still providing the data required for visual reconstruction by the blind, the invention reduces the visual data to a low-bit-rate representation, which lowers the amount of computation, improves the response speed, reduces latency and improves the user experience.

Description

Method, system, device and storage medium for assisting blind person in visual reconstruction
Technical Field
The invention relates to the technical field of computer vision reconstruction, in particular to a method, a system, equipment and a storage medium for assisting blind people in performing vision reconstruction.
Background
Blind people lose the ability to move about independently because their visual function is impaired. In most cases, however, the loss of vision is caused by eye disease or similar problems while the visual pathway remains intact and usable, which makes it possible to restore visual ability to a certain degree. Existing solutions mainly provide navigation through external feedback: the direction of travel is detected or predicted and then conveyed to the blind person as movement instructions in the form of speech or other stimuli. On the one hand, such methods are prone to missed and false detections; on the other hand, because communication between the equipment, the algorithm and the blind person is difficult, incorrect navigation that does not match the user's intention occurs easily. In addition, in existing assistive technology an intracerebral electrode is signal-coupled with the retina; when it receives a timing sequence it is activated and stimulates ganglion cells to generate biological pulse signals that are transmitted to the visual area of the cerebral cortex.
Disclosure of Invention
In view of the above problems, the embodiments of the invention provide a method, a system, a device and a storage medium for assisting blind persons in visual reconstruction, which address the slow response speed and limited practicality of existing blind-guiding methods.
In order to solve the technical problems, the invention provides the following technical scheme:
in a first aspect, the present invention provides a method for assisting a blind person in performing visual reconstruction, the method comprising:
acquiring visual data in real time;
determining the edge of a target object according to the visual data and forming a key point cloud;
forming auxiliary data of the outline of the target object according to the key point cloud;
the auxiliary data is transmitted to the intracerebral electrodes in a contact or non-contact manner, so that the intracerebral electrodes coupled with the retina obtain signal input.
In one embodiment, the determining the edge of the target object according to the visual data and forming a key point cloud comprises:
detecting an object area of the visual data in real time according to a target detection algorithm, and defining the object area as an effective area;
reserving edge data of an object in the effective area according to an edge data detection algorithm;
calculating the depth of each edge point by adopting a binocular camera;
and extracting key point clouds of key nodes in the edge data and encoding the key point clouds.
In one embodiment, the extracting and encoding the key point cloud of the key node in the edge data includes:
dividing the edge data into isolated points, straight lines and arc lines according to different types of the edge data;
taking the isolated points, two end points on a straight line and a plurality of sampling points obtained by sparse sampling on an arc line as key point clouds;
and coding according to the type of the key point cloud and the depth of the space coordinate.
In one embodiment, the forming auxiliary data of the target object contour according to the key point cloud includes:
decoding the key point cloud, and restoring the key point cloud into contour data;
restoring the key point cloud belonging to the straight line type into a straight line in a function fitting mode;
and restoring the key point cloud belonging to the arc type into an arc in a spline interpolation mode.
In a second aspect, the present invention provides a system for assisting blind persons in performing vision reconstruction, the system comprising:
an acquisition module: used for acquiring visual data in real time;
a first forming module: used for determining the edge of a target object according to the visual data and forming a key point cloud;
a second forming module: used for forming auxiliary data of the target object contour according to the key point cloud;
a transmission module: used for transmitting the auxiliary data to the intracerebral electrode in a contact or non-contact manner, so that the intracerebral electrode coupled with the retina obtains signal input.
In one embodiment, the first forming module specifically includes:
an object detection unit: used for detecting the object area of the visual data in real time according to a target detection algorithm and defining the object area as an effective area;
an edge detection unit: used for reserving the edge data of the object in the effective area according to an edge data detection algorithm;
a depth calculation unit: used for calculating the depth of each edge point by means of a binocular camera;
a key point cloud encoding unit: used for extracting the key point cloud of the key nodes in the edge data and encoding the key point cloud.
In an embodiment, the key point cloud encoding unit is specifically configured to:
dividing the edge data into isolated points, straight lines and arc lines according to different types of the edge data;
taking the isolated points, two end points on a straight line and a plurality of sampling points obtained by sparse sampling on an arc line as key point clouds;
and coding according to the type of the key point cloud and the depth of the space coordinate.
In an embodiment, the second forming module is specifically configured to:
decoding the key point cloud to restore the key point cloud into contour data;
restoring the key point cloud belonging to the straight line type into a straight line in a function fitting mode;
and restoring the key point cloud belonging to the arc type into an arc in a spline interpolation mode.
In a third aspect, the present invention provides an electronic device comprising:
a processor, a memory, and an interface for communicating with a gateway;
the memory is used for storing programs and data, and the processor calls the programs stored in the memory to execute the method for assisting the blind in visual reconstruction.
In a fourth aspect, the present invention provides a computer-readable storage medium comprising a program which, when executed by a processor, is adapted to perform a method of assisting a blind person in visual reconstruction as provided in any one of the first aspect.
From the above description, it can be seen that the present invention has the following advantages over the prior art: while still meeting the data requirements of visual reconstruction for the blind, the visual data can be reduced to a low-bit-rate key point cloud of object edges, so that the quality of the visual features is preserved with little loss of visual information. The small data volume shortens the computation cycle, improves the response speed and effectively reduces latency, so that simple outlines of objects can be reproduced in the blind person's brain in real time. After the brain interprets them, the blind person recovers a partial visual function for objects in the surrounding environment, achieving a certain degree of visual reconstruction. This makes autonomous movement possible, enhances practicality, improves the user experience, and in turn improves the blind person's quality of life and reduces the risks encountered in daily life.
Drawings
Fig. 1 is a schematic flow chart illustrating a method for assisting a blind person in performing visual reconstruction according to an embodiment of the present invention;
fig. 2 is a schematic diagram illustrating a process of extracting and encoding a key point cloud according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a system for assisting blind people in performing visual reconstruction according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer and more obvious, the present invention is further described below with reference to the accompanying drawings and the detailed description. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Based on the shortcomings of the prior art, the embodiment of the present invention provides a specific implementation of a method for assisting a blind person in performing visual reconstruction, and as shown in fig. 1, the method specifically includes the following steps:
and S110, acquiring visual data in real time.
Specifically, images are acquired synchronously by at least two cameras separated in space, forming stereoscopic vision and thus a sense of distance or depth of an object. Visual data is captured in real time from the surrounding environment; it comprises the brightness of the environment the blind person is in, the color of objects, the shape of objects and real-time pictures of object motion.
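As a minimal sketch of this acquisition step, the frame pair could be grabbed from two OpenCV capture devices; the device indices and the assumption that the binocular camera appears as two separate video devices are illustrative and not specified by the patent.

```python
# Minimal sketch of S110 (assumption: the binocular camera is exposed as two
# separate OpenCV capture devices with indices 0 and 1).
import cv2

left_cam = cv2.VideoCapture(0)
right_cam = cv2.VideoCapture(1)

def grab_stereo_frame():
    """Grab one roughly synchronized frame pair from the binocular camera."""
    ok_left, frame_left = left_cam.read()
    ok_right, frame_right = right_cam.read()
    if not (ok_left and ok_right):
        raise RuntimeError("failed to read from one of the cameras")
    return frame_left, frame_right
```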
And S120, determining the edge of the target object according to the visual data and forming a key point cloud.
Specifically, the content of the visual data is rich and complex and the resulting data volume is large; if the visual data were transmitted unchanged, the response would inevitably be slow and the effect of visual reconstruction would be seriously degraded. A target detection technique can therefore be used to obtain a quantitative description of the target object in the visual data, the edge range of the object is determined from that description, and the key point cloud is formed from the edge range, which simplifies the data volume.
And S130, forming auxiliary data of the outline of the target object according to the key point cloud.
Specifically, the key point cloud is used to form the determined contour data; the contour data forms electrode driving signals according to the matrix arrangement of the electrodes in the brain and a coding rule, and the time sequence of the set of electrode driving signals forms the auxiliary data.
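The patent does not specify the electrode matrix size or the coding rule, so the following sketch only illustrates the idea of rasterizing contour points onto an assumed electrode grid to obtain one frame of driving signals; the 32x32 grid and the nearest-cell mapping are assumptions.

```python
# Illustrative only: the 32x32 electrode grid and the nearest-cell mapping are
# assumptions, since the patent does not fix the matrix layout or coding rule.
import numpy as np

def contour_to_drive_frame(contour_points, frame_shape, grid=(32, 32)):
    """Rasterize contour points (x, y) in image coordinates onto a binary
    electrode-activation matrix serving as one frame of driving signals."""
    height, width = frame_shape
    drive = np.zeros(grid, dtype=np.uint8)
    for x, y in contour_points:
        row = min(int(y / height * grid[0]), grid[0] - 1)
        col = min(int(x / width * grid[1]), grid[1] - 1)
        drive[row, col] = 1
    return drive
```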
And S140, transmitting the auxiliary data to the intracerebral electrode in a contact or non-contact mode, so that the intracerebral electrode coupled with the retina obtains signal input.
Specifically, the contact manner may include, but is not limited to, wired contact input, and the non-contact manner may include, but is not limited to, interactive input via wireless or optical signals.
In this embodiment, while still meeting the data requirements of visual reconstruction for the blind, the visual data can be reduced to a low-bit-rate key point cloud containing object edges, so that the quality of the visual features is preserved with little loss of visual information. The small data volume shortens the computation cycle, improves the response speed and effectively reduces latency, so that simple outlines of objects can be reproduced in the blind person's brain in real time. After the brain interprets them, the blind person recovers a partial visual function for objects in the surrounding environment, achieving a certain degree of visual reconstruction. This makes autonomous movement possible, enhances practicality, improves the user experience, and in turn improves the blind person's quality of life and reduces the risks encountered in daily life.
Based on the foregoing embodiments, in an embodiment of the present invention, as shown in fig. 2, S120 specifically includes:
s121: and detecting the object area of the visual data in real time according to a target detection algorithm, and defining the object area as an effective area.
Specifically, in order to reduce redundant information as much as possible and to reduce the amount of data that needs to be transmitted while preserving the valid data, the object detection module of the YOLO algorithm may be used with the visual data captured by the binocular camera as input. Regions with no obvious influence, such as sky and flat ground, are ignored and only object regions are taken as points of interest, so the object region (i.e. the effective area) in each video frame is detected in real time. Using a binocular camera for detection also reduces missed and false detections.
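As an illustration of S121, the sketch below uses the ultralytics YOLO package as a stand-in detector; the patent only names the YOLO algorithm, so the specific package, weights file and confidence threshold are assumptions.

```python
# Stand-in for the YOLO object detection module (package, weights file and
# confidence threshold are assumptions, not specified by the patent).
from ultralytics import YOLO

detector = YOLO("yolov8n.pt")  # pretrained general-purpose weights

def detect_effective_areas(frame, conf=0.4):
    """Return bounding boxes (x1, y1, x2, y2) of detected objects; each box is
    treated as an effective area and everything else is ignored."""
    result = detector(frame, conf=conf, verbose=False)[0]
    return [tuple(map(int, box)) for box in result.boxes.xyxy.tolist()]
```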
S122: and reserving the edge data of the object in the effective area according to an edge data detection algorithm.
Specifically, in order to keep the interactive data at a low bit rate and sparse, the edge of the object is used as its description, which effectively preserves the visual effect while reducing the amount of data to be transmitted. When extracting object edges, the Canny operator can be used, with the visual data captured by the binocular camera as input, to retain the object edges in space.
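A possible sketch of S122, keeping only the Canny edges that fall inside the effective areas; the 50/150 thresholds are illustrative values, not taken from the patent.

```python
# Sketch of S122: compute Canny edges and zero them outside the effective
# areas (the 50/150 thresholds are illustrative).
import cv2
import numpy as np

def edges_in_effective_areas(gray_frame, boxes):
    """Return an edge map that keeps only edges inside the detected boxes."""
    edges = cv2.Canny(gray_frame, 50, 150)
    mask = np.zeros_like(edges)
    for x1, y1, x2, y2 in boxes:
        mask[y1:y2, x1:x2] = 255
    return cv2.bitwise_and(edges, mask)
```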
S123: the depth of each edge point is calculated using a binocular camera.
Specifically, the object edge can be regarded as being composed of numerous points, each edge point having its own spatial coordinates. To describe the three-dimensional information of each edge point, the depth can be obtained from the binocular camera's depth estimation, using the video together with the camera geometry, and the depth value is calculated by the following formula:
z = (f × b) / (x_l − x_r)
where z is the depth value; f is the focal length of the binocular camera; b is the baseline distance between the centers of the two cameras; x_l is the offset of the object from the center position in the current frame of the left camera; and x_r is the offset of the object from the center position in the current frame of the right camera.
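The formula translates directly into code; the sketch below assumes f, b and the pixel offsets are already expressed in consistent units, a calibration detail the patent does not spell out.

```python
# Direct transcription of the depth formula; f, b and the pixel offsets are
# assumed to be in consistent units.
def edge_point_depth(f, b, x_left, x_right):
    """z = f * b / (x_left - x_right), the standard binocular depth relation."""
    disparity = x_left - x_right
    if disparity == 0:
        return float("inf")  # zero disparity: point is effectively at infinity
    return f * b / disparity
```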
S124: and extracting key point clouds of key nodes in the edge data and encoding the key point clouds. Specifically, the edge of an object (i.e., the contour of the object) in the visual data is extracted, the edge of the object is composed of various line segments, important point locations of the connection relationship of the line segments are taken as key points, spatial information of the key points is extracted as key point clouds, the key point clouds are encoded in order to avoid transmission of large data volume, and the information of the key point clouds is transferred and stored into the formed encoded information with low code rate. In order to reduce the data amount as much as possible, only points where features are particularly prominent in the edge data are extracted as key nodes based on the edge data, and the key nodes may be isolated points having the greatest or smallest intensity on the feature attributes, end points of line segments, or points having the greatest local curvature on the curve. This key node point is defined as a key point cloud. And then coding is carried out according to the space coordinates and the depth values of the key point cloud, and the original graphic information is converted into digital information for transmission.
In this embodiment, the object region is determined as the effective area on the basis of the visual data from the binocular camera; only the edge data within that region, which represents the contour of each object, is kept; the depth value of each edge point is then calculated to obtain three-dimensional data of the object; and finally the extracted key point cloud is encoded. The visual data is thus converted from the original image into digital information containing the key point cloud, avoiding the heavy computation and severe latency that sending the image directly would cause, improving the overall response speed of the invention and giving the blind person a better user experience.
Based on the foregoing embodiments, in an embodiment of the present invention, as shown in fig. 2, S124 specifically includes:
s125: the edge data is divided into isolated points, straight lines and arcs according to the type of the edge data.
S126: and taking the isolated points, two end points on the straight line and a plurality of sampling points obtained by sparse sampling on the arc line as key point clouds.
S127: and coding according to the type of the key point cloud and the depth of the space coordinate.
Specifically, isolated points, straight lines and arcs can be encoded according to the following rules.
Isolated point coding: for an isolated point, only its type and three-dimensional coordinates are transmitted, and the isolated point is represented as:
0 1 x y d
wherein 01 is the code for the isolated point type; x is the x-axis coordinate of the isolated point; y is the y-axis coordinate of the isolated point; d is the depth of the isolated point.
Straight line coding: for a straight line, only two end points of the straight line are transmitted, and the straight line is represented as:
1 0 x1 y1 d1 x2 y2 d2
wherein 10 represents a straight line type of code; x1, y1, d1 respectively represent the x-axis coordinate, y-axis coordinate and depth of an end point of the straight line; x2, y2, d2 represent the x-axis coordinate, y-axis coordinate and depth, respectively, of the other end point of the straight line.
Arc coding: an arc varies in a more complex way, so it is sampled sparsely to reduce the data volume. For a continuously changing arc, key points are sampled with step length s; for an abruptly changing arc, sampling is done with step length s/2, and the arc is represented as:
1 1 x1 y1 d1 x2 y2 d2 . . . xn yn dn
wherein 11 is the code for the arc type; (xk, yk, dk), for k = 1…n, are the x-axis coordinate, y-axis coordinate and depth of the k-th sampling point of the arc.
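The three coding rules can be illustrated with simple records; the representation below (tuples with a two-character type prefix) is only one possible serialization, since the patent does not fix a concrete byte layout.

```python
# One possible serialization of the three record types described above; the
# strings "01", "10", "11" stand for the two-bit type prefixes.
def encode_isolated_point(x, y, d):
    return ("01", x, y, d)

def encode_line(end_point_1, end_point_2):
    (x1, y1, d1), (x2, y2, d2) = end_point_1, end_point_2
    return ("10", x1, y1, d1, x2, y2, d2)

def encode_arc(samples):
    """samples: list of (x, y, d) key points taken with step s, or s/2 for an
    abruptly changing arc."""
    record = ["11"]
    for x, y, d in samples:
        record.extend((x, y, d))
    return tuple(record)
```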
In the present embodiment, the edge data is reduced to three types (isolated points, straight lines and arcs), and only the two end points of each straight line and the sampling points of each arc are extracted. This further reduces the volume of interactive data, further shortens the overall response time and improves practicality.
Based on the foregoing embodiment, in an embodiment of the present invention, S130 specifically includes:
decoding the key point cloud to restore the key point cloud into contour data;
reducing the key point cloud belonging to the straight line type into a straight line in a function fitting mode;
specifically, since the data of the straight line type only includes two end points, the straight line passing through the two end points can be fitted through a linear function or a plurality of times of function fitting, and then the straight line can be restored.
And restoring the key point cloud belonging to the arc type into an arc in a spline interpolation mode.
Specifically, since arc-type data describes a complex change through multiple points, spline interpolation is used to restore the original data and ensure the smoothness of the arc.
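A sketch of the two restoration paths, using a first-order polynomial fit for straight lines and a B-spline from scipy for arcs; it restores only the x-y projection (depth can be interpolated the same way), and the resampling density of 100 points is an assumption.

```python
# Sketch of S130's restoration step: a degree-1 polynomial fit for straight
# lines (assumes the line is not vertical) and a B-spline for arcs.
import numpy as np
from scipy.interpolate import splev, splprep

def restore_line(end_point_1, end_point_2, n=100):
    xs = np.array([end_point_1[0], end_point_2[0]], dtype=float)
    ys = np.array([end_point_1[1], end_point_2[1]], dtype=float)
    coeffs = np.polyfit(xs, ys, 1)             # y = a*x + b through both end points
    x_dense = np.linspace(xs[0], xs[1], n)
    return x_dense, np.polyval(coeffs, x_dense)

def restore_arc(samples, n=100):
    pts = np.asarray(samples, dtype=float)      # (k, 2) array of sampled x, y points
    tck, _ = splprep([pts[:, 0], pts[:, 1]], s=0, k=min(3, len(pts) - 1))
    u_dense = np.linspace(0.0, 1.0, n)
    x_dense, y_dense = splev(u_dense, tck)
    return x_dense, y_dense
```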
In this embodiment, the key point clouds of straight lines and arcs are restored into contour data by the above method. The computation is small, the restoration error is reduced, and the object contour can be restored more faithfully, which improves the fidelity of the restored contour data, further improves stability and practicality, and ensures a good user experience.
Based on the same inventive concept, the embodiment of the present application further provides a system for assisting the blind to perform visual reconstruction, which can be used to implement the method for assisting the blind to perform visual reconstruction described in the above embodiment, as described in the following embodiments. The principle of solving the problems of the system for assisting the blind to perform the visual reconstruction is similar to that of the method for assisting the blind to perform the visual reconstruction, so the implementation of the system for assisting the blind to perform the visual reconstruction can refer to the implementation of the method for assisting the blind to perform the visual reconstruction, and repeated parts are not repeated. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. While the system described in the embodiments below is preferably implemented in software, implementations in hardware, or a combination of software and hardware are also possible and contemplated.
The invention provides a system for assisting blind people in visual reconstruction, which is shown in figure 3. In fig. 3, the system comprises:
the acquisition module 210: used for acquiring visual data in real time;
the first forming module 220: used for determining the edge of a target object according to the visual data and forming a key point cloud;
the second forming module 230: used for forming auxiliary data of the target object contour according to the key point cloud;
the transmission module 240: used for transmitting the auxiliary data to the intracerebral electrode in a contact or non-contact manner, so that the intracerebral electrode coupled with the retina obtains signal input.
Based on the above embodiment, in the system for assisting the blind to perform visual reconstruction according to an embodiment of the present invention, the first forming module 220 specifically includes:
the target detection unit 221: used for detecting the object area of the visual data in real time according to a target detection algorithm and defining the object area as an effective area;
the edge detection unit 222: used for reserving the edge data of the object in the effective area according to an edge data detection algorithm;
the depth calculation unit 223: used for calculating the depth of each edge point by means of a binocular camera;
the key point cloud encoding unit 224: used for extracting the key point cloud of the key nodes in the edge data and encoding the key point cloud.
Based on the above embodiment, in the system for assisting the blind to perform visual reconstruction according to an embodiment of the present invention, the key point cloud encoding unit 224 is specifically configured to:
dividing the edge data into isolated points, straight lines and arc lines according to different types of the edge data;
taking the isolated points, two end points on a straight line and a plurality of sampling points obtained by sparse sampling on an arc line as key point clouds;
and coding according to the type of the key point cloud and the depth of the space coordinate.
Based on the above embodiment, in the system for assisting the blind to perform visual reconstruction according to an embodiment of the present invention, the second forming module 230 is specifically configured to:
decoding the key point cloud to restore the key point cloud into contour data;
restoring the key point cloud belonging to the straight line type into a straight line in a function fitting mode;
and restoring the key point cloud belonging to the arc type into an arc in a spline interpolation mode.
An embodiment of the present application further provides a specific implementation manner of an electronic device, which is capable of implementing all steps in the method in the foregoing embodiment, and referring to fig. 4, the electronic device 300 specifically includes the following contents:
a processor 310, a memory 320, a communication unit 330, and a bus 340;
the processor 310, the memory 320 and the communication unit 330 complete communication with each other through the bus 340; the communication unit 330 is used for implementing data transmission between server-side devices and terminal devices and other related devices.
The processor 310 is used to call the computer program in the memory 320, and when the processor executes the computer program, the processor realizes all the steps of the method for assisting the blind to perform the visual reconstruction in the above embodiments.
Those of ordinary skill in the art will understand that: the memory may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like. The memory is used for storing programs, and the processor executes the programs after receiving the execution instructions. Further, the software programs and modules within the aforementioned memories may also include an operating system, which may include various software components and/or drivers for managing system tasks (e.g., memory management, storage device control, power management, etc.), and may communicate with various hardware or software components to provide an operating environment for other software components.
The processor may be an integrated circuit chip having signal processing capabilities. The processor may be a general-purpose processor, and includes a Central Processing Unit (CPU), a Network Processor (NP), and the like. The various methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The present application further provides a computer readable storage medium comprising a program which, when executed by a processor, is adapted to perform a method of assisting a blind person in performing visual reconstruction as provided in any of the method embodiments described above.
Those of ordinary skill in the art will understand that: all or a portion of the steps of implementing the above-described method embodiments may be performed by hardware associated with program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs steps comprising the method embodiments described above; and the aforementioned storage medium includes: various media capable of storing program codes, such as ROM, RAM, magnetic or optical disk, etc., and the specific type of media is not limited in this application.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (4)

1. A method of assisting a blind person in visual reconstruction, the method comprising:
acquiring visual data in real time;
detecting an object area of the visual data in real time according to a target detection algorithm, and defining the object area as an effective area;
reserving edge data of an object in the effective area according to an edge data detection algorithm;
calculating the depth of each edge point by adopting a binocular camera;
extracting key point clouds of key nodes in the edge data and encoding the key point clouds;
dividing the edge data into isolated points, straight lines and arc lines according to different types of the edge data;
taking the isolated points, two end points on a straight line and a plurality of sampling points obtained by sparse sampling on an arc line as key point clouds;
coding according to the type of the key point cloud and the depth of the space coordinate; forming auxiliary data of the outline of the target object according to the key point cloud;
the forming of the auxiliary data of the target object outline according to the key point cloud comprises the following steps:
decoding the key point cloud to restore the key point cloud into contour data;
restoring the key point cloud belonging to the straight line type into a straight line in a function fitting mode;
restoring the key point cloud belonging to the arc type into an arc in a spline interpolation mode;
the auxiliary data is transmitted to the intracerebral electrodes in a contact or non-contact manner, so that the intracerebral electrodes coupled with the retina obtain signal input.
2. A system for assisting a blind person in performing visual reconstruction, the system comprising:
an acquisition module: used for acquiring visual data in real time;
a first forming module: used for determining the edge of a target object according to the visual data and forming a key point cloud;
the first forming module specifically includes:
an object detection unit: used for detecting the object area of the visual data in real time according to a target detection algorithm and defining the object area as an effective area;
an edge detection unit: used for reserving the edge data of the object in the effective area according to an edge data detection algorithm;
a depth calculation unit: used for calculating the depth of each edge point by means of a binocular camera;
a key point cloud encoding unit: used for extracting the key point cloud of key nodes in the edge data and encoding the key point cloud;
the key point cloud encoding unit is specifically configured to:
dividing the edge data into isolated points, straight lines and arc lines according to different types of the edge data;
taking the isolated points, two end points on a straight line and a plurality of sampling points obtained by sparse sampling on an arc line as key point clouds;
coding according to the type of the key point cloud and the depth of the space coordinate;
a second forming module: used for forming auxiliary data of the target object contour according to the key point cloud;
the second forming module is specifically configured to:
decoding the key point cloud to restore the key point cloud into contour data;
restoring the key point cloud belonging to the straight line type into a straight line in a function fitting mode;
restoring the key point cloud belonging to the arc type into an arc in a spline interpolation mode;
a transmission module: used for transmitting the auxiliary data to the intracerebral electrode in a contact or non-contact manner, so that the intracerebral electrode coupled with the retina obtains signal input.
3. An electronic device, comprising:
a processor, a memory, an interface to communicate with a gateway;
the memory is used for storing programs and data, and the processor calls the programs stored in the memory to execute the method for assisting the blind in visual reconstruction as claimed in claim 1.
4. A computer-readable storage medium, characterized in that the computer-readable storage medium comprises a program which, when being executed by a processor, is adapted to carry out a method of assisting a blind person in visual reconstruction as claimed in claim 1.
CN202011599079.6A 2020-12-29 2020-12-29 Method, system, device and storage medium for assisting blind person in visual reconstruction Active CN112634303B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011599079.6A CN112634303B (en) 2020-12-29 2020-12-29 Method, system, device and storage medium for assisting blind person in visual reconstruction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011599079.6A CN112634303B (en) 2020-12-29 2020-12-29 Method, system, device and storage medium for assisting blind person in visual reconstruction

Publications (2)

Publication Number Publication Date
CN112634303A CN112634303A (en) 2021-04-09
CN112634303B (en) 2022-02-25

Family

ID=75286518

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011599079.6A Active CN112634303B (en) 2020-12-29 2020-12-29 Method, system, device and storage medium for assisting blind person in visual reconstruction

Country Status (1)

Country Link
CN (1) CN112634303B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101409831A (en) * 2008-07-10 2009-04-15 浙江师范大学 Method for processing multimedia video object
CN102625126A (en) * 2012-03-22 2012-08-01 北京工业大学 Prediction-based three-dimensional mesh coding method
CN205334425U (en) * 2015-12-18 2016-06-22 介面光电股份有限公司 Touch panel
CN205621077U (en) * 2016-04-29 2016-10-05 黑龙江省科学院自动化研究所 Binocular vision reconsitution device based on range image
CN106137532A (en) * 2016-09-19 2016-11-23 清华大学 The image processing apparatus of visual cortex prosthese and method
CN106896948A (en) * 2015-12-18 2017-06-27 介面光电股份有限公司 Contact panel and its manufacture method
CN110113616A (en) * 2019-06-05 2019-08-09 杭州电子科技大学 A kind of multi-layer monitor video Efficient Compression coding, decoding apparatus and method
CN110390625A (en) * 2018-04-23 2019-10-29 英特尔公司 To the intelligence point cloud reconstruct of the object in visual scene in computer environment
CN110659333A (en) * 2019-08-23 2020-01-07 浙江省北大信息技术高等研究院 Multi-level visual feature description method and visual retrieval system
CN111241860A (en) * 2019-12-31 2020-06-05 徐波 Positioning and decoding method for arbitrary material annular code
CN111369681A (en) * 2020-03-02 2020-07-03 腾讯科技(深圳)有限公司 Three-dimensional model reconstruction method, device, equipment and storage medium
CN111680118A (en) * 2020-06-10 2020-09-18 四川易利数字城市科技有限公司 System and method for fusing graphic visual expression
CN111738095A (en) * 2020-05-28 2020-10-02 复旦大学 Character recognition method based on skeleton posture

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109171718B (en) * 2018-08-03 2020-11-20 北京大学 Microneedle electrode array device
CN109584270B (en) * 2018-11-13 2023-05-30 大连大学 Visual tracking method based on discriminant dictionary learning

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101409831A (en) * 2008-07-10 2009-04-15 浙江师范大学 Method for processing multimedia video object
CN102625126A (en) * 2012-03-22 2012-08-01 北京工业大学 Prediction-based three-dimensional mesh coding method
CN205334425U (en) * 2015-12-18 2016-06-22 介面光电股份有限公司 Touch panel
CN106896948A (en) * 2015-12-18 2017-06-27 介面光电股份有限公司 Contact panel and its manufacture method
CN205621077U (en) * 2016-04-29 2016-10-05 黑龙江省科学院自动化研究所 Binocular vision reconsitution device based on range image
CN106137532A (en) * 2016-09-19 2016-11-23 清华大学 The image processing apparatus of visual cortex prosthese and method
CN110390625A (en) * 2018-04-23 2019-10-29 英特尔公司 To the intelligence point cloud reconstruct of the object in visual scene in computer environment
CN110113616A (en) * 2019-06-05 2019-08-09 杭州电子科技大学 A kind of multi-layer monitor video Efficient Compression coding, decoding apparatus and method
CN110659333A (en) * 2019-08-23 2020-01-07 浙江省北大信息技术高等研究院 Multi-level visual feature description method and visual retrieval system
CN111241860A (en) * 2019-12-31 2020-06-05 徐波 Positioning and decoding method for arbitrary material annular code
CN111369681A (en) * 2020-03-02 2020-07-03 腾讯科技(深圳)有限公司 Three-dimensional model reconstruction method, device, equipment and storage medium
CN111738095A (en) * 2020-05-28 2020-10-02 复旦大学 Character recognition method based on skeleton posture
CN111680118A (en) * 2020-06-10 2020-09-18 四川易利数字城市科技有限公司 System and method for fusing graphic visual expression

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on obstacle detection algorithms in blind walking assistance technology; Liu Jin; China Excellent Doctoral Dissertations Full-text Database, Information Science and Technology; 2016-01-15; pp. 1-71 *

Also Published As

Publication number Publication date
CN112634303A (en) 2021-04-09

Similar Documents

Publication Publication Date Title
CN106846403B (en) Method and device for positioning hand in three-dimensional space and intelligent equipment
CN109660783B (en) Virtual reality parallax correction
CN110070595B (en) Single image 3D object reconstruction method based on deep learning
EP4307233A1 (en) Data processing method and apparatus, and electronic device and computer-readable storage medium
CN111179189B (en) Image processing method and device based on generation of countermeasure network GAN, electronic equipment and storage medium
CN111508013B (en) Stereo matching method
CN112040222B (en) Visual saliency prediction method and equipment
CN110136075A (en) It is a kind of to recycle the remote sensing image defogging method for generating confrontation network based on edge sharpening
CN113658040A (en) Face super-resolution method based on prior information and attention fusion mechanism
CN109214996A (en) A kind of image processing method and device
CN112749611B (en) Face point cloud model generation method and device, storage medium and electronic equipment
CN115409755A (en) Map processing method and device, storage medium and electronic equipment
KR20230116735A (en) Method and device for adjusting three-dimensional attitude, electronic equipment and storage medium
CN112785492A (en) Image processing method, image processing device, electronic equipment and storage medium
CN112634303B (en) Method, system, device and storage medium for assisting blind person in visual reconstruction
CN114418857A (en) Image display method and device, head-mounted display equipment and storage medium
CN113101158A (en) VR-based binocular video fusion training method and device
CN113632098A (en) Face image processing method and device and vehicle
CN108965873A (en) A kind of adaptive division methods of pulse array coding
CN116863078A (en) Three-dimensional human body model reconstruction method, three-dimensional human body model reconstruction device, electronic equipment and readable medium
CN115619986B (en) Scene roaming method, device, equipment and medium
CN116309158A (en) Training method, three-dimensional reconstruction method, device, equipment and medium of network model
CN115482481A (en) Single-view three-dimensional human skeleton key point detection method, device, equipment and medium
CN114655240A (en) Information display method and device, electronic equipment and storage medium
CN112884817B (en) Dense optical flow calculation method, dense optical flow calculation device, electronic device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant