CN117994744A - Image data processing method, image data processing device, storage medium and vehicle - Google Patents

Image data processing method, image data processing device, storage medium and vehicle

Info

Publication number
CN117994744A
CN117994744A
Authority
CN
China
Prior art keywords
image data
lane
road surface
information
road
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211330059.8A
Other languages
Chinese (zh)
Inventor
何佳男
况磊
孙昊
黄玉春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202211330059.8A priority Critical patent/CN117994744A/en
Publication of CN117994744A publication Critical patent/CN117994744A/en
Pending legal-status Critical Current

Landscapes

  • Image Processing (AREA)

Abstract

The application relates to an image data processing method and device, a storage medium, and a vehicle. The method comprises: acquiring image data of a road; determining road surface height information of the road from the image data and the parameters of a real-time kinematic (RTK) positioning device; and adjusting the image data using the road surface height information and determining lane information in the adjusted image data, where the lane information is used for constructing a lane-level map. According to embodiments of the application, lane data can be generated at low cost, high efficiency, and high precision, and the method can be applied at large scale to construct a lane-level map, achieving wide high-definition navigation coverage of complex road scenes.

Description

Image data processing method, image data processing device, storage medium and vehicle
Technical Field
The present application relates to the field of map surveying and mapping, and in particular to an image data processing method and device, a storage medium, and a vehicle.
Background
With national economic development and infrastructure construction, more and more complex road structures appear in modern cities, and the actual driving environment grows increasingly complex. Maps have become an integral part of daily vehicle travel, used mainly for viewing the environment and for route navigation. With the continuous development of map surveying technology, people's demand for driving navigation has gradually shifted from road-level navigation to high-definition navigation. The standard navigation map (Standard Definition Map, SD Map) can no longer satisfy users' demands for the richness and accuracy of map information; instead, an information-rich high-definition map that can more faithfully restore road scenes is required.
A lane-level map (Lane Definition Map, LD Map) is a high-definition map that contains rich lane information with data precision up to the centimeter level. It can improve the driving experience of a user or an automatic/assisted driving system, realize high-definition navigation, and resolve the pain points of the user's navigation experience in complex traffic environments.
Current methods for generating an LD Map, i.e. obtaining lane-level (LD) data, mainly include: deriving it from an SD Map, or degrading a high-definition map (High Definition Map, HD Map). Derivation from an SD Map depends on manual design, and the resulting data is generally poorly consistent with the real world. Degradation from an HD Map depends on existing HD data, whose coverage is low; moreover, acquiring HD data requires professional collection vehicles, with high production cost and long cycles, which is unfavorable to large-scale rollout and coverage. Therefore, a new way is needed to obtain high-precision lane data at low cost and high efficiency that is easy to apply at large scale.
Disclosure of Invention
In view of this, an image data processing method, apparatus, storage medium, and vehicle are proposed.
In a first aspect, an embodiment of the present application provides an image data processing method. The method comprises: acquiring image data of a road; determining road surface height information of the road from the image data and the parameters of a real-time kinematic (RTK) positioning device; and adjusting the image data using the road surface height information and determining lane information in the adjusted image data, where the lane information is used for constructing a lane-level map.
The image data may include one or more images.
According to this embodiment, the image data of the road is acquired by low-cost equipment, so lane information can be determined from the image data at low cost. By combining the parameters of the RTK device with the image data to obtain the road surface height information and using it to adjust the image data, the image data is optimized and the accuracy of the resulting lane information is further improved. Lane data can thus be generated at low cost, high efficiency, and high precision, and the method can be applied at large scale to construct a lane-level map, achieving wide high-definition navigation coverage of complex road scenes.
In a first possible implementation manner of the image data processing method according to the first aspect, determining the road surface height information of the road from the image data and the parameters of the real-time kinematic (RTK) positioning device includes: determining the pose of the image data from the image data and the parameters of the RTK device; and determining the road surface height information from the pose.
The pose may include a position parameter and an attitude parameter.
According to this embodiment, the pose of the image data is determined and the road surface height information is derived from the pose, so the actual height of the road surface can be calculated taking changes in road gradient into account. This preserves the lateral accuracy of the subsequently orthorectified map and improves the accuracy of lane information determination.
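As a rough illustration of how a pose-derived road height might be computed, the sketch below estimates the road elevation directly beneath the camera from an RTK-derived pose and a calibrated camera mounting height. This is a minimal sketch under assumed conventions (a local ENU frame, a known mount height), not the patent's implementation; all names are ours.

```python
import numpy as np

def road_height_under_camera(cam_position_enu, cam_rotation, mount_height):
    """Estimate the local road-surface height below the camera.

    cam_position_enu : (3,) array, camera position in a local ENU frame,
                       derived from RTK positioning (assumed input).
    cam_rotation     : (3, 3) rotation matrix, body-to-ENU attitude.
    mount_height     : scalar, camera height above the road measured when
                       the vehicle stands on flat ground (a calibration value).
    """
    # The vehicle's "down" axis in ENU follows the attitude, so on a slope
    # the vertical drop from camera to road differs from the mount height.
    down_in_enu = cam_rotation @ np.array([0.0, 0.0, -1.0])
    # Follow the body-frame down vector for a length of mount_height and
    # read off the vertical (Up) component of the resulting point.
    road_z = cam_position_enu[2] + mount_height * down_in_enu[2]
    return road_z
```

On flat ground (identity attitude) this reduces to camera height minus mount height; on a slope the attitude tilts the down vector and the estimate changes accordingly, which is the gradient effect the embodiment accounts for.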
In a second possible implementation form of the image data processing method according to the first aspect as such or according to the first possible implementation form of the first aspect, the lane information relates to a position of the lane in the adjusted image data and to a semantic meaning of the lane.
The position of a lane may indicate the position of each lane element in the adjusted image data, and may include any one or more of: the coordinates of the lane element in an image coordinate system, its coordinates in an absolute coordinate system, and its position relative to other elements in the image. The semantics of a lane may indicate the type of the lane element, e.g. road center line, lane dividing line, lane edge line, or lane surface marking.
According to this embodiment, by decoupling the position of the lane from its semantics, binary semantic segmentation is combined with multi-class object detection. This avoids the adverse effects of an uneven distribution of lane element types, reduces the model's demand for training samples, and supports rapid extension of lane element types, enabling efficient and accurate determination of lane information.
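The decoupling described above, class-agnostic segmentation for position plus multi-class detection for semantics, can be illustrated by a small fusion step that labels mask pixels with the class of the detection box covering them. This is a hypothetical sketch of the idea, not the patent's algorithm; the box format and function name are our assumptions.

```python
import numpy as np

def assign_semantics(lane_mask, detections):
    """Attach detection-box semantics to class-agnostic lane-mask pixels.

    lane_mask  : 2-D bool array from binary segmentation (the "position").
    detections : list of (x0, y0, x1, y1, label) boxes (the "semantics").
    Returns a dict mapping each label to the (row, col) lane pixels that
    fall inside a box of that class.
    """
    labelled = {}
    rows, cols = np.nonzero(lane_mask)
    for x0, y0, x1, y1, label in detections:
        # Select the mask pixels covered by this detection box.
        inside = (cols >= x0) & (cols <= x1) & (rows >= y0) & (rows <= y1)
        labelled.setdefault(label, []).extend(zip(rows[inside], cols[inside]))
    return labelled
```

Because the segmentation network never sees class labels, rare element types only affect the (much cheaper) detection branch, which matches the sample-efficiency argument above.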
In a third possible implementation form of the image data processing method according to the first aspect as such or according to the first or second possible implementation form of the first aspect, the adjusted image data is orthophoto image data.
According to this embodiment, by generating orthophoto data and determining the lane information in it, lane information can be determined at low cost and high efficiency, making the method suitable for large-scale application.
In a fourth possible implementation manner of the image data processing method according to the third possible implementation manner of the first aspect, adjusting the image data using the road surface height information to obtain the lane information in the adjusted image data includes: determining an image data set comprising one or more pieces of image data before the image data to be corrected and/or one or more pieces of image data after it; and performing orthorectification on the image data to be corrected using the road surface height information of the image data and the road surface masks of the image data set to obtain orthographic image data corresponding to the image data to be corrected, where a road surface mask is determined from image data and indicates the road surface portion of the road.
According to this embodiment, by determining the image data set, several pieces of image data before and after the image data to be corrected can be used in its orthorectification. By combining the road surface height information with the road surface masks, orthographic image data free of holes, occlusion, and ghosting can be obtained, so that the lane information can subsequently be determined more accurately.
In a fifth possible implementation manner of the image data processing method according to the fourth possible implementation manner of the first aspect, performing orthorectification on the image data to be corrected using the road surface height information of the image data and the road surface masks of the image data set to obtain the orthographic image data corresponding to the image data to be corrected includes: determining, using the road surface height information and the pose corresponding to the image data to be corrected, one or more second pixels in the image data set that correspond to a first pixel to be corrected; and traversing the road surface masks of the one or more second pixels and, where a road surface mask indicates that the corresponding second pixel is road surface, taking the pixel value of that second pixel as the pixel value of the third pixel corresponding to the first pixel in the orthographic image data, thereby determining the orthographic image data.
According to this embodiment, the pixel value of the corresponding third pixel in the orthographic image data is determined by traversing the road surface masks of the second pixels in the image data before and after the image data to be corrected. This effectively prevents lane holes, occluded lanes, ghosting, and similar defects in the projected orthographic image data, improving the accuracy of subsequent lane information determination.
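The per-pixel procedure of the fourth and fifth implementations can be sketched as a loop that, for each cell of the orthoimage, traverses candidate source frames and keeps the first sample whose road surface mask marks it as road. The projection step is left abstract here; everything else (names, shapes, the first-hit policy) is our assumption, not the patent's exact scheme.

```python
import numpy as np

def orthorectify(ortho_shape, ground_to_pixel, images, masks):
    """Fill an orthographic image by sampling road pixels from source frames.

    ortho_shape     : (H, W) of the output orthoimage.
    ground_to_pixel : callable (frame_idx, row, col) -> (u, v) or None,
                      projecting an ortho cell into a source frame using
                      that frame's pose and the road surface height
                      (the projection itself is assumed given).
    images, masks   : per-frame grayscale arrays and boolean road masks.
    """
    h, w = ortho_shape
    ortho = np.zeros((h, w), dtype=images[0].dtype)
    for r in range(h):
        for c in range(w):
            # Traverse candidate frames; keep the first sample whose road
            # mask marks the pixel as road (skips occluders and ghosts).
            for k in range(len(images)):
                uv = ground_to_pixel(k, r, c)
                if uv is None:
                    continue  # cell not visible in this frame
                u, v = uv
                if masks[k][v, u]:
                    ortho[r, c] = images[k][v, u]
                    break
    return ortho
```

Cells for which no frame offers a road-surface sample remain unfilled, which is exactly the hole/occlusion case the mask traversal is meant to minimize by drawing on neighboring frames.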
In a sixth possible implementation manner of the image data processing method according to the third or fourth or fifth possible implementation manner of the first aspect, the orthographic image data is stitched orthographic image data, and when the one or more pieces of orthographic image data include bidirectional road information, the stitch line between them does not pass through the road surface area indicated by the road surface mask.
According to this embodiment, when stitching bidirectional orthographic image data, the stitch lines between the orthographic images do not pass through the road surface area, so the road is not blocked by occluding objects after stitching. The stitched orthographic image data is therefore more complete and clear, improving the accuracy of lane information determination.
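The seam constraint can be illustrated by a helper that searches the overlap of two orthoimages for a stitch column crossing no road surface pixels. This is a simplified sketch (real seams need not be straight columns), and all names are our assumptions.

```python
import numpy as np

def seam_column(road_mask_a, road_mask_b, overlap_cols):
    """Pick a stitch column in the overlap that crosses no road pixels.

    road_mask_a/b : boolean road masks of the two orthoimages, already
                    resampled onto the shared overlap grid.
    overlap_cols  : iterable of candidate column indices in the overlap.
    Returns the first column where neither mask marks road, or None if
    every candidate column would cut through the road surface.
    """
    combined = road_mask_a | road_mask_b
    for c in overlap_cols:
        if not combined[:, c].any():  # seam must not cut the road surface
            return c
    return None
```

Keeping the seam off the road surface means any blending or exposure mismatch along the seam falls on non-road areas, which is the completeness argument made above.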
In a seventh possible implementation form of the image data processing method according to the first aspect as such or the first or second or third or fourth or fifth or sixth possible implementation form thereof, determining the road surface height information of the road from the image data and the parameters of the RTK device includes: adjusting the road surface height information using point cloud information to obtain adjusted road surface height information, where the point cloud information is determined from the image data and the parameters of the RTK device.
According to this embodiment, adjusting the road surface height information with point cloud information brings it closer to the actual road surface height, which further reduces deformation and distortion of lane elements caused by subsequent projection and improves the accuracy of subsequent lane information determination.
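One plausible way to adjust a pose-derived height estimate with point cloud information is a robust statistic over nearby reconstructed ground points, as in the sketch below. The median-of-inliers rule and all names are our assumptions; the patent does not prescribe this exact scheme.

```python
import numpy as np

def refine_road_height(initial_height, ground_points_z, max_dev=0.5):
    """Nudge a pose-derived road height toward nearby point-cloud evidence.

    initial_height  : scalar height from the pose-based estimate.
    ground_points_z : heights of reconstructed points near the footprint
                      (e.g. from image triangulation; assumed prefiltered).
    max_dev         : reject points further than this from the estimate,
                      so vehicles and curbs do not pull the height off.
    """
    z = np.asarray(ground_points_z, dtype=float)
    inliers = z[np.abs(z - initial_height) <= max_dev]
    if inliers.size == 0:
        return initial_height             # keep the prior if no support
    return float(np.median(inliers))      # robust to residual outliers
```

The median keeps isolated high points (e.g. triangulated off a passing vehicle) from distorting the surface, serving the deformation-reduction goal described above.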
In an eighth possible implementation form of the image data processing method according to the first aspect as such or the first or second or third or fourth or fifth or sixth or seventh possible implementation form thereof, the image data is acquired by a panoramic camera or an ordinary camera. The ordinary camera may be a black-and-white camera, a color camera, etc.
According to the embodiment of the application, the image data can be acquired at low cost, and conditions are created for large-scale popularization and application of the method of the embodiment of the application.
In a ninth possible implementation form of the image data processing method according to the first aspect as such or the first or second or third or fourth or fifth or sixth or seventh or eighth possible implementation form thereof, the lane information comprises lane line information and lane surface information.
The lane lines may include one or more of a road center line, a lane dividing line, and a lane edge line; the lane surface information may include one or more of a lane number, a lane type, a lane surface marking, and an intersection boundary range.
According to the embodiment of the application, the coverage of multiple lane environment scenes can be realized, and the application range is enlarged.
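For concreteness, the lane line and lane surface information enumerated above could be carried in a structure along the following lines. The field names and types are our hypothetical choices for illustration, not the patent's data model.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class LaneLine:
    kind: str                           # "center", "dividing", or "edge"
    points: List[Tuple[float, float]]   # polyline in map coordinates

@dataclass
class LaneSurfaceInfo:
    lane_number: int
    lane_type: str                      # e.g. "ordinary", "bus", "emergency"
    surface_marks: List[str] = field(default_factory=list)
    intersection_boundary: List[Tuple[float, float]] = field(default_factory=list)

@dataclass
class LaneInfo:
    lines: List[LaneLine] = field(default_factory=list)
    surfaces: List[LaneSurfaceInfo] = field(default_factory=list)
```

A lane-level map builder would aggregate one such record per road segment, covering both the line and surface elements listed in the ninth implementation form.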
In a second aspect, an embodiment of the present application provides an image data processing apparatus. The apparatus comprises: an acquisition module configured to acquire image data of a road; a first determining module configured to determine road surface height information of the road from the image data and the parameters of a real-time kinematic (RTK) positioning device; and an adjusting module configured to adjust the image data using the road surface height information and determine lane information in the adjusted image data, where the lane information is used for constructing a lane-level map.
In a first possible implementation manner of the image data processing apparatus according to the second aspect, the first determining module is configured to: determining the pose of the image data through the image data and parameters of RTK equipment; and determining the road surface height information according to the pose.
In a second possible implementation form of the image data processing device according to the second aspect as such or according to the first possible implementation form of the second aspect, the lane information relates to a position of the lane and a semantic meaning of the lane in the adjusted image data.
In a third possible implementation form of the image data processing apparatus according to the second aspect as such or according to the first or second possible implementation form of the second aspect, the adjusted image data is orthophoto data.
In a fourth possible implementation manner of the image data processing apparatus according to the third possible implementation manner of the second aspect, the adjusting module is configured to: determine an image data set comprising one or more pieces of image data before the image data to be corrected and/or one or more pieces of image data after it; and perform orthorectification on the image data to be corrected using the road surface height information of the image data and the road surface masks of the image data set to obtain orthographic image data corresponding to the image data to be corrected, where a road surface mask is determined from image data and indicates the road surface portion of the road.
In a fifth possible implementation manner of the image data processing apparatus according to the fourth possible implementation manner of the second aspect, performing orthorectification on the image data to be corrected using the road surface height information of the image data and the road surface masks of the image data set to obtain the orthographic image data corresponding to the image data to be corrected includes: determining, using the road surface height information and the pose corresponding to the image data to be corrected, one or more second pixels in the image data set that correspond to a first pixel to be corrected; and traversing the road surface masks of the one or more second pixels and, where a road surface mask indicates that the corresponding second pixel is road surface, taking the pixel value of that second pixel as the pixel value of the third pixel corresponding to the first pixel in the orthographic image data, thereby determining the orthographic image data.
In a sixth possible implementation manner of the image data processing apparatus according to the third or fourth or fifth possible implementation manner of the second aspect, the orthographic image data is stitched orthographic image data, and when the one or more pieces of orthographic image data include bidirectional road information, the stitch line between them does not pass through the road surface area indicated by the road surface mask.
In a seventh possible implementation manner of the image data processing apparatus according to the second aspect or the first or second or third or fourth or fifth or sixth possible implementation manner of the second aspect, the first determining module is configured to: adjust the road surface height information using point cloud information to obtain adjusted road surface height information, where the point cloud information is determined from the image data and the parameters of the RTK device.
In an eighth possible implementation form of the image data processing apparatus according to the second aspect as such or the first or second or third or fourth or fifth or sixth or seventh possible implementation form of the second aspect, the image data is acquired by a panoramic camera or an ordinary camera.
In a ninth possible implementation form of the image data processing device according to the second aspect as such or the first or second or third or fourth or fifth or sixth or seventh or eighth possible implementation form of the second aspect, the lane information comprises lane line information and lane surface information.
In a third aspect, an embodiment of the present application provides an image data processing apparatus including: a processor and a memory; the memory is used for storing programs; the processor is configured to execute a program stored in the memory, to cause the apparatus to implement the image data processing method in the first aspect or any one of the possible implementation manners of the first aspect.
In a fourth aspect, an embodiment of the present application provides a terminal device, which may perform the image data processing method of the first aspect or any one of the possible implementation manners of the first aspect.
In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium having stored thereon program instructions which, when executed by a computer, cause the computer to implement the image data processing method of the first aspect or any one of the possible implementations of the first aspect.
In a sixth aspect, embodiments of the present application provide a computer program product comprising program instructions which, when executed by a computer, cause the computer to implement the image data processing method of the first aspect or any one of the possible implementations of the first aspect.
In a seventh aspect, an embodiment of the present application provides a vehicle, the vehicle including a processor configured to perform the image data processing method of the first aspect or any one of the possible implementation manners of the first aspect.
These and other aspects of the application will be apparent from and elucidated with reference to the embodiment(s) described hereinafter.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features and aspects of the application and together with the description, serve to explain the principles of the application.
Fig. 1 shows a schematic diagram of an application scenario according to an embodiment of the application.
Fig. 2 shows a schematic diagram of a system architecture according to an embodiment of the application.
Fig. 3 (a) shows a schematic view of a collection vehicle according to an embodiment of the application.
Fig. 3 (b) shows a schematic view of a collection vehicle according to an embodiment of the application.
Fig. 4 shows a flowchart of an image data processing method according to an embodiment of the present application.
Fig. 5 (a) shows a schematic diagram of image data acquisition according to an embodiment of the present application.
Fig. 5 (b) shows a schematic diagram of image data acquisition according to an embodiment of the present application.
Fig. 6 shows a flowchart of an image data processing method according to an embodiment of the present application.
Fig. 7 shows a schematic diagram of determining road surface height information according to an embodiment of the present application.
Fig. 8 (a) shows an effect diagram after consideration of a road surface height change according to an embodiment of the present application.
Fig. 8 (b) shows an effect diagram after consideration of a road surface height change according to an embodiment of the present application.
Fig. 9 shows a schematic diagram of point cloud information according to an embodiment of the present application.
Fig. 10 shows a flowchart of an image data processing method according to an embodiment of the present application.
Fig. 11 shows a flowchart of an image data processing method according to an embodiment of the present application.
Fig. 12 (a) shows an effect diagram of orthorectification according to an embodiment of the present application.
Fig. 12 (b) shows an effect diagram of orthorectification according to an embodiment of the present application.
Fig. 12 (c) shows an effect diagram of orthorectification according to an embodiment of the present application.
Fig. 13 (a) shows an effect diagram of orthorectification according to an embodiment of the present application.
Fig. 13 (b) shows an effect diagram of orthorectification according to an embodiment of the present application.
Fig. 13 (c) shows an effect diagram of orthorectification according to an embodiment of the present application.
Fig. 14 (a) shows a schematic diagram of orthographic image data stitching according to an embodiment of the present application.
Fig. 14 (b) shows a schematic diagram of orthographic image data stitching according to an embodiment of the present application.
Fig. 15 (a) shows an effect diagram of performing orthographic image data stitching according to an embodiment of the present application.
Fig. 15 (b) shows an effect diagram of performing orthographic image data stitching according to an embodiment of the present application.
Fig. 16 is a schematic diagram showing a binary segmentation result according to an embodiment of the present application.
Fig. 17 shows a schematic diagram of a target detection result according to an embodiment of the present application.
FIG. 18 shows a schematic diagram of model training according to an embodiment of the application.
Fig. 19 shows an effect diagram of determining lane information according to an embodiment of the present application.
Fig. 20 shows a block diagram of an image data processing apparatus according to an embodiment of the present application.
Fig. 21 shows a block diagram of an electronic device 2200 according to an embodiment of the application.
Detailed Description
Various exemplary embodiments, features and aspects of the application will be described in detail below with reference to the drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Although various aspects of the embodiments are illustrated in the accompanying drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used herein to mean "serving as an example, instance, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
In addition, numerous specific details are set forth in the following description in order to provide a better illustration of the application. It will be understood by those skilled in the art that the present application may be practiced without some of these specific details. In some instances, well known methods, procedures, components, and circuits have not been described in detail so as not to obscure the present application.
The lane lines mentioned hereinafter may include the road center line, lane dividing lines, lane edge lines, etc. on a road; the lane surface information may include lane numbers, lane types, lane surface markings, intersection boundary range information, and the like. The image data may include one or more images, and a position may refer to coordinates in the image coordinate system of an image, coordinates in a world coordinate system, or a position relative to other elements.
With national economic development and infrastructure construction, more and more complex road structures appear in modern cities, and the actual driving environment grows increasingly complex. Maps have become an integral part of daily vehicle travel, used mainly for viewing the environment and for route navigation. With the continuous development of map surveying technology, people's demand for driving navigation has gradually shifted from road-level navigation to high-definition navigation. The standard navigation map (SD Map) can no longer satisfy users' demands for the richness and accuracy of map information; instead, an information-rich high-definition map that can more faithfully restore road scenes is required.
The lane-level map (LD Map) is a high-definition map that contains rich lane information with data precision up to the centimeter level. It can improve the driving experience of a user or an automatic/assisted driving system, realize high-definition navigation, and resolve the pain points of the user's navigation experience in complex traffic environments.
Current methods for generating an LD Map, i.e. obtaining lane-level (LD) data, mainly include: deriving it from an SD Map, or degrading a high-definition map (HD Map). Derivation from an SD Map depends on manual design, and the resulting data is generally poorly consistent with the real world. Degradation from an HD Map depends on existing HD data, whose coverage is low; moreover, acquiring HD data requires professional collection vehicles, with high production cost and long cycles, which is unfavorable to large-scale rollout and coverage. Therefore, a new way is needed to obtain high-precision lane data at low cost and high efficiency that is easy to apply at large scale.
To solve the above technical problems, the present application provides an image data processing method that acquires images of the real road environment with low-cost equipment and determines lane information from the image data at low cost. Road surface height information is obtained by combining the image data with the parameters of a real-time kinematic (RTK) positioning device and is used to adjust and optimize the image data, further improving the accuracy of the obtained lane information. The method can be applied on a terminal device or a server, so lane data can be generated at low cost, high efficiency, and high precision, and the method can be applied at large scale to construct a lane-level map, achieving wide high-definition navigation coverage of complex road scenes.
Fig. 1 shows a schematic diagram of an application scenario according to an embodiment of the application. The method of the embodiments can be used to obtain lane information and generate a lane-level map (LD Map), which can be applied in driving scenarios, for example as a navigation map while a user drives a vehicle. As shown in Fig. 1, in such a scenario the illustrated lane-level map may be used for navigation. The lane-level map may include lane line information, such as the road center line, lane dividing lines, and lane edge lines (as illustrated in the figure), and may further include lane surface information, such as lane number, lane type, lane surface markings, and intersection boundary ranges (as illustrated in the figure). From the map, the user can see the lane layout of the current road and which lane the vehicle occupies, obtaining a more complete and accurate high-definition navigation experience in complex traffic environments.
It should be noted that fig. 1 is only an example of application of the method according to the embodiment of the present application, and the method according to the embodiment of the present application may be applied to other scenarios, for example, the lane information or the lane-level map may be further used in an automatic or assisted driving system of a vehicle to achieve tasks such as positioning, sensing, and decision making of the vehicle.
Fig. 2 shows a schematic diagram of a system architecture according to an embodiment of the application. As shown in fig. 2, the image data processing system according to the embodiment of the present application may be connected to an off-vehicle acquisition system. The off-board acquisition system may include a camera and an RTK device. The RTK apparatus may also be connected to a power source (or battery).
The cameras may be one or more panoramic cameras or other ordinary cameras; an ordinary camera may be a color camera, a black-and-white camera, or the like. The cameras may be used to acquire image data and send the image data to the image data processing system.
The RTK apparatus may include an integrated navigation system, a pulse-per-second (PPS) box, a fourth-generation mobile communication technology (4G) box, a signal antenna connected to the 4G box, and two or more antennas (e.g., antenna 1 and antenna 2 in the figure) connected to the integrated navigation system. The integrated navigation system may be based on the global positioning system (GPS), or may be implemented based on the BeiDou system or other positioning systems. The PPS box may be used to achieve time synchronization. The 4G box may also be replaced by a third-generation (3G) or fifth-generation (5G) mobile communication technology box, etc., to implement data transmission, which is not limited in the present application. The RTK apparatus may provide the vehicle position, speed, time information, etc. to the image data processing system.
Referring to fig. 3 (a) and 3 (b), schematic diagrams of a collection vehicle according to an embodiment of the present application are shown. As shown in fig. 3 (a), the above-mentioned off-vehicle acquisition system (including the camera and the RTK device), together with a power supply and a storage module, may be installed as a whole on the top of the collection vehicle. The components may be placed in a device housing, and the housing secured to the roof of the vehicle by a vehicle-mounted structure. After fixing, the overall appearance of the collection vehicle can be seen in fig. 3 (b).
The collection vehicle may be any vehicle, and may include one or more different types of vehicles or movable objects that operate or move on land (e.g., on roads or rails). For example, the vehicle may include an automobile, a bicycle, a motorcycle, a train, a subway, or another type of transportation or movable object, and the embodiment of the present application is not limited thereto.
The image data processing system of the embodiment of the application can be deployed on a processor on a server or terminal equipment, can be used for acquiring image data of a road, determining road surface height information of the road according to the image data and parameters of RTK equipment, and can also utilize the road surface height information to adjust the image data so as to determine lane information in the adjusted image data to construct a lane-level map.
The server related to the present application may be located in the cloud or locally, and may be a physical device or a virtual device, such as a virtual machine or a container. The server has a wireless communication function, which may be provided by a chip (system) or another part or component of the server. The wireless connection function means that the server can connect with other servers or terminal devices through wireless connection modes such as Wi-Fi and Bluetooth; the server of the present application may also communicate through a wired connection.
The terminal device related to the present application may be a device with a wireless connection function, meaning that it can connect with other terminal devices or servers through wireless connection modes such as Wi-Fi and Bluetooth; the terminal device may also have a wired connection function for communication. The terminal device of the present application may have a touch screen, a non-touch screen, or no screen. A touch-screen device can be controlled by clicking, sliding, and the like on the display screen with a finger, a stylus, or the like; a non-touch-screen device can be connected to an input device such as a mouse, a keyboard, or a touch panel and controlled through that input device; a device without a screen may be, for example, a Bluetooth speaker without a screen.
For example, the terminal device of the present application may be a smart phone, a netbook, a tablet computer, a notebook computer, a wearable electronic device (e.g., a smart band, a smart watch, etc.), a TV, a virtual reality device, a speaker, an electronic ink device, etc. The terminal device of the embodiment of the present application may also be a vehicle-mounted terminal device: the processor may be built into the head unit of a vehicle as an in-vehicle computing unit, so that the image data processing process of the embodiment of the present application can be performed in real time at the vehicle end, further improving data production efficiency.
The image data processing method according to the embodiment of the present application will be described in detail with reference to fig. 4 to 19.
Fig. 4 shows a flowchart of an image data processing method according to an embodiment of the present application. The method may be used in the image data processing system described above, as shown in fig. 4, and includes:
in step S401, image data of a road is acquired.
Alternatively, the image data may be acquired by a panoramic camera or a conventional camera, and the image data may include one or more images. The panoramic camera may be a consumer-grade video camera. The common camera may include a color camera, a black-and-white camera, and the like. Data may be acquired by 1 or more cameras.
According to the embodiment of the application, the image data can be acquired at low cost, and conditions are created for large-scale popularization and application of the method of the embodiment of the application.
The road may correspond to the area to be collected by the collection vehicle, and the information on the road may include information of the driving lanes of vehicles, and may further include auxiliary information of the driving lanes (such as traffic signs, traffic lights, lane restriction information, etc.).
Referring to fig. 5 (a) and 5 (b), schematic diagrams of image data acquisition according to an embodiment of the present application are shown. When the road condition of the area to be collected meets a predetermined condition, a bidirectional single-strip acquisition mode can be adopted, that is, the collection vehicle travels on a predetermined lane while collecting image data corresponding to both the traveling lane and the opposite lane. When the bidirectional single-strip acquisition mode is adopted, the predetermined condition may include one or more of the following: the road type is a one-way road, a minor road, or a semi-closed road; there is no median strip, trees, shrubs, or similar obstruction in the middle of the road that would occlude information of the opposite lane; in the case of a bidirectional road, the total number of lanes does not exceed a predetermined number (e.g., 6); the total width of the driving lanes does not exceed a predetermined distance (e.g., 30 meters); and the like. The predetermined condition may also include other conditions than the above.
The road shown in fig. 5 (a) is a bidirectional road with 4 driving lanes in total. The collection vehicle can then keep traveling in the 2nd or 3rd lane (e.g., lane 2 or lane 3 in the figure) to collect the image data, so that the collected image data can include the information on the road as comprehensively as possible.
In the case where the above-described predetermined condition is not satisfied, for example, when the road type is a highway, or when the total number of lanes of a bidirectional road exceeds a predetermined number (e.g., 6), there is usually an obstruction in the middle of the road. At this time, a bidirectional double-strip acquisition mode may be adopted, that is, the road may be divided into traveling lanes and opposite lanes according to the traveling direction of the collection vehicle, and the acquisition may be performed on a predetermined lane in the traveling lanes and a predetermined lane in the opposite lanes, respectively.
The road shown in fig. 5 (b) is a bidirectional road with 8 lanes in total (4 traveling lanes and 4 opposite lanes), and there is a median barrier between the traveling lanes and the opposite lanes (shown as the black trapezoid area in the figure). The collection may then be performed on the traveling lanes and the opposite lanes separately. When collecting in the traveling lanes, the vehicle can keep traveling in the 2nd or 3rd lane (e.g., lane 2 or lane 3 in the figure); when collecting in the opposite lanes, it can keep to the 6th or 7th lane (e.g., lane 6 or lane 7 in the figure). In this way, the acquired image data can include the information on the road as comprehensively as possible.
In the process of collecting images, the speed of the collection vehicle can be controlled within a predetermined range (e.g., 60-80 km/h). Taking panoramic image data as an example, this helps ensure that the obtained panoramic image data are clear, and prevents distortion such as large blurring and deformation.
The method according to the embodiment of the present application will be described below by taking panoramic image data as an example of the image data. In order to obtain lane information more accurately, the image data may be adjusted and converted into a digital orthophoto map (DOM) base map (i.e., orthophoto data) suitable for measurement. In this process, by taking the road surface gradient change of the road into consideration, the influence of the gradient change on the orthophoto data during the adjustment can be reduced, and the lateral accuracy of the obtained orthophoto data can be ensured. Therefore, the road surface height of the road in the panoramic image data can be calculated first, as described below.
Step S402, determining the pavement height information of the road according to the image data and the parameters of the real-time dynamic positioning RTK equipment.
As shown in fig. 2, the parameters of the RTK apparatus may include position information of the collection vehicle in the world coordinate system determined by the RTK apparatus. The road surface height information may indicate the true road surface height corresponding to each road plane coordinate point in the image data, so that the gradient change of the road surface can be determined. A detailed example of calculating the road surface height information can be found below.
Fig. 6 shows a flowchart of an image data processing method according to an embodiment of the present application. As shown in fig. 6, optionally, the step S402 may include:
in step S4021, the pose of the image data is determined by the image data and the parameters of the RTK apparatus.
The pose of the image data may be represented by a position parameter and an attitude parameter of the image data in the world coordinate system, where the position parameter is, for example, the three-dimensional coordinate (representing a translation) of each pixel point in the world coordinate system, and the attitude parameter is, for example, a parameter R representing the rotation matrix between the camera coordinate system and the world coordinate system (a 3×3 matrix), which may indicate the transformation relationship between the camera coordinate system corresponding to the camera and the world coordinate system. Optionally, the parameters for representing the pose may further include a parameter indicating the order in which rotation and translation are performed, a parameter indicating the meaning of the rotation matrix, and the like.
The image data and the parameters of the RTK device may be processed by using a least-squares adjustment method, or by using real-time positioning and pose estimation techniques such as simultaneous localization and mapping (SLAM), so as to determine the pose of the image data. The least-squares adjustment may, for example, be implemented by using a deep learning model, which is not limited in this application.
Step S4022, determining road surface height information according to the pose.
In this regard, referring to fig. 7, a schematic diagram of determining road surface height information according to an embodiment of the present application is shown. As shown in fig. 7, the coordinate system o-XYZ may represent the world coordinate system, the sphere may indicate the position of the camera on the collection vehicle, and the coordinate system c-UVW may represent the camera coordinate system. H may represent the relative height of the camera (e.g., a panoramic camera) above the ground, which may be the sum of the height of the panoramic camera lens above the roof of the collection vehicle and the height of the collection vehicle above the ground.
According to the position parameter C(X_C, Y_C, Z_C) and the attitude parameter R corresponding to the image data, when the relative height from the panoramic camera lens to the ground is H, the coordinates of the point A directly under the panoramic camera (as shown in the figure) may be expressed as: A(x_A, y_A, z_A) = C + R × [0, -H, 0]^T. The normal vector n corresponding to the actual road surface (as shown in the figure) may then be expressed as: n = (C - A) / ||C - A||.

The road surface height information may include the calculated point A and normal vector n, so that a change in gradient of the real road plane in the panoramic image data can be indicated. That is, the point A and the normal vector n define the plane equation n · ((x, y, z) - A) = 0, which describes the coordinates (x, y, z) corresponding to each road surface point in the image data, where z may represent the actual road surface height of the corresponding road surface point.
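The computation above can be sketched as follows. This is a minimal sketch assuming NumPy, with hypothetical function names `road_plane` and `road_height_at`: it computes the ground point A = C + R × [0, -H, 0], takes the road normal as the unit vector from A toward C (an assumption consistent with that formula), and solves the plane equation for the road height z at a plane coordinate (x, y).

```python
import numpy as np

def road_plane(C, R, H):
    """Estimate the road plane directly under the camera.

    C: camera position in the world frame (3-vector), from the RTK pose.
    R: 3x3 rotation matrix from the camera frame to the world frame.
    H: relative height of the camera lens above the ground.
    Returns the ground point A and an (assumed) road-plane normal n.
    """
    C = np.asarray(C, dtype=float)
    A = C + R @ np.array([0.0, -H, 0.0])  # point directly below the camera
    n = (C - A) / np.linalg.norm(C - A)   # assumed normal: "up" from A toward C
    return A, n

def road_height_at(A, n, x, y):
    """Solve the plane equation n . ((x, y, z) - A) = 0 for z."""
    return A[2] - (n[0] * (x - A[0]) + n[1] * (y - A[1])) / n[2]
```

For instance, with a rotation that maps the camera's "up" axis to the world Z axis, the recovered plane is horizontal and `road_height_at` returns a constant height; with a tilted R, the returned height varies with (x, y), reflecting the road gradient.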
According to the embodiment of the application, the pose of the image data is determined, and the road surface height information is determined by utilizing the pose, so that the actual height of the road surface can be calculated in consideration of the change of the road surface gradient, the transverse accuracy of the map after the subsequent orthorectified is ensured, and the accuracy of the lane information determination can be improved.
As shown in fig. 7, for a certain point S on the actual road surface (as shown in the figure), if the road surface height is assumed to be fixed (i.e., the road surface gradient fluctuation is not considered), the point will be incorrectly projected onto S″ (see S″ on the base map projection surface in the figure) when subsequently projected onto the orthophoto data, with a large deviation from the position S′ where it should actually be projected (see S′ on the base map projection surface in the figure). On the orthophoto data, this appears as deformation and distortion of the lane elements.
Referring to fig. 8 (a) and 8 (b), schematic views of the effect of considering road surface height changes according to an embodiment of the present application are shown. As shown in fig. 8 (a), when the road surface height fluctuates, a change in the height of the road surface in the direction perpendicular to the driving direction, as shown in the left diagram of fig. 8 (a), may displace the lane elements during subsequent projection; as shown in the right diagram of fig. 8 (a), this may cause the lane edge line to deform toward the right. If the road surface height is assumed to be fixed, distortion of the lane elements may also be caused; as shown in the left diagram of fig. 8 (b), the projection may distort the position of a zebra crossing. With the method according to the embodiment of the present application, the road surface height information is taken into account during projection, and the distortion can be eliminated, as shown in the right diagram of fig. 8 (b).
Optionally, in determining the road surface height information, the step S402 may further include:
And adjusting the road surface height information by utilizing the point cloud information to obtain the adjusted road surface height information.
The point cloud information can be determined according to the image data and parameters of the real-time dynamic positioning RTK device.
The manner of determining the point cloud information according to the image data and the parameters of the RTK apparatus may be described in the above step S4021, and may be obtained by using a least square adjustment method, for example. Referring to fig. 9, a schematic diagram of point cloud information according to an embodiment of the present application is shown. As shown in fig. 9, the point cloud information may include sparse point cloud data, and information indicating an actual height of the road surface may be included in each point of the sparse point cloud data. Therefore, the obtained road surface height information can be adjusted according to the actual road surface height information included in the sparse point cloud, so that more accurate road surface height information can be obtained.
For example, the road surface height information obtained in step S4022 may be adjusted by actual height information included in each point of the corresponding road surface in the point cloud data, so as to obtain adjusted road surface height information. For example, when the height indicated by the pixel point corresponding to the point cloud data is inconsistent with the height indicated by the corresponding point in the road surface height information, the road surface height information may be adjusted by taking the height indicated by the point cloud data as the height of the corresponding point. The road surface height information can be adjusted by utilizing the point cloud information in other modes, and the application is not limited to the adjustment.
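One possible form of the adjustment strategy just described, trusting the sparse-cloud height where it disagrees with the plane-model height, can be sketched as follows. The grid keying, the tolerance value, and the function name are illustrative assumptions, not the patent's specification:

```python
def adjust_heights(plane_z, cloud_points, tol=0.05):
    """Adjust plane-model road heights using a sparse point cloud.

    plane_z: dict mapping (x, y) grid cells to plane-model heights.
    cloud_points: iterable of (x, y, z) sparse road-surface points.
    Where a cloud point disagrees with the plane model by more than
    tol metres, the cloud height is taken as the height of that cell.
    """
    adjusted = dict(plane_z)
    for x, y, z in cloud_points:
        key = (round(x), round(y))  # illustrative: snap point to a grid cell
        if key in adjusted and abs(adjusted[key] - z) > tol:
            adjusted[key] = z       # trust the measured cloud height
    return adjusted
```

A real implementation would interpolate between sparse points rather than overwrite single cells, but the replace-on-disagreement rule matches the strategy described above.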
According to the embodiment of the application, the road surface height information is adjusted by utilizing the point cloud information, and the adjusted road surface height information can be more close to the actual road surface height, so that the conditions of deformation, distortion and the like of the lane elements caused by subsequent projection can be further reduced, and the accuracy of subsequent determination of the lane information is improved.
By combining the road surface height information, the lane information in the image data can be determined with high accuracy and high efficiency by using the method of the embodiment of the application, see below.
Step S403, adjusting the image data by using the road surface height information, and determining the lane information in the adjusted image data.
Wherein the lane information may be used to construct a lane-level map. A schematic representation of a lane-level map can be seen in fig. 1 above.
Alternatively, the lane information may include lane line information and lane surface information. The lane line information includes, for example, line information on the road surface such as the road center line, lane lines, and lane edge lines; the lane surface information includes, for example, the number of lanes, the lane type (markings indicating a left-turn lane, a right-turn lane, and the like), lane markings (such as deceleration and speed-limit markings on the road surface), and surface indication information on the road such as intersection boundary range information. Therefore, coverage of multiple lane environment scenes can be achieved, and the application range is enlarged.
It should be noted that the lane information may further include lane environment information, such as road traffic signs, traffic signal lights, traffic monitoring points, and lane restriction scene information, so that the method of the present application can be used to determine high-definition information at low cost to construct a map with higher accuracy (e.g., an HD Map), which is not limited in the present application.
According to the embodiment of the present application, the image data of the road is acquired by low-cost equipment, so that the lane information can be determined at low cost using the image data. By combining the parameters of the RTK device with the image data, the road surface height information of the road can be obtained to adjust the image data, which further improves the accuracy of the obtained lane information. Lane data can thus be generated with low cost, high efficiency, and high precision, and the method can be applied on a large scale to construct a lane-level map, achieving wide high-definition navigation coverage of complex road scenes.
In the process of determining the lane information, occlusion by the collection vehicle may cause voids in the collected panoramic image data, moving objects (such as vehicles) on the road surface may form ghosts on the panoramic image data, and obstructions on the road surface may occlude lane elements (such as lane lines). Therefore, the image data may first be adjusted to repair the above-mentioned voids, ghosts, occlusions, etc. See below.
Alternatively, the adjusted image data may be orthophoto data. The image data may be orthorectified using the road surface height information to obtain the orthographic image data.
The method can perform orthorectification on each piece of collected panoramic image data, or can select one or more pieces from the continuous image data for orthorectification, to obtain the corresponding orthophoto data.
A two-dimensional plane range of a predetermined size (e.g., w × h) around the position parameter C corresponding to the image data to be corrected can be taken as the orthorectification range of the image data to be corrected. The size of the resulting orthophoto data of the image data to be corrected may be ⌈w/gsd⌉ × ⌈h/gsd⌉, where ⌈·⌉ represents rounding up and gsd represents the ground sampling distance (GSD), which may be determined according to the required base map resolution, for example, 2-5 cm.
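The raster-size expression above can be sketched as follows; the helper name is hypothetical, and `math.ceil` implements the upward rounding ⌈·⌉:

```python
import math

def ortho_raster_size(w, h, gsd):
    """Pixel size of the orthophoto covering a w x h metre ground area,
    given ground sampling distance gsd in metres (e.g., 0.02-0.05 m),
    with ceiling rounding as in the expression [w/gsd] x [h/gsd]."""
    return math.ceil(w / gsd), math.ceil(h / gsd)
```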
In performing the orthorectification, a pavement mask may be used. For the process of orthorectifying the image data by using the pavement mask, refer to fig. 10, which shows a flowchart of an image data processing method according to an embodiment of the present application. As shown in fig. 10, optionally, the step S403 may include:
step S4031, an image data set is determined.
Wherein the image data set may comprise the image data to be corrected, together with one or more pieces of image data before it and/or one or more pieces of image data after it.
For a road point S within the to-be-corrected range of the image data to be corrected, determining the image data within the image data set according to the position parameter of each image data may include: image data for which the horizontal distance between the corresponding position parameter and the point S is smaller than a predetermined threshold D (for example, 10 meters). The image data set P can be expressed as: P = {p | dis(C, S) < D}, where dis(·) represents a function that calculates the distance between two points.
The elements (i.e., the image data) in the image data set P may be further sorted by horizontal distance from the point S (e.g., from small to large), which may be expressed as: P = {p_0, p_1, p_2, …, p_n}, where p may represent image data within P and n may represent the total number of image data within P.
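Building and ordering the image data set P as described above might look like the following sketch; the function name is hypothetical, and the positions are assumed to be the horizontal components of each image's position parameter C:

```python
import math

def build_image_set(images, S, D=10.0):
    """Select and order images for correcting the road point S.

    images: list of (image_id, (x, y)) horizontal camera positions C.
    S: (x, y) of the road point to correct.
    Returns ids with dis(C, S) < D, sorted nearest-first, i.e. the set
    P = {p | dis(C, S) < D} ordered by horizontal distance to S.
    """
    dis = lambda C: math.hypot(C[0] - S[0], C[1] - S[1])
    near = [(dis(pos), img_id) for img_id, pos in images if dis(pos) < D]
    return [img_id for _, img_id in sorted(near)]
```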
Step S4032, performing orthorectification on the image data by utilizing the road surface height information of the image data and the pavement masks of the image data set, to obtain the orthophoto data corresponding to the image data to be corrected.
Wherein the pavement mask may be determined from the image data and may indicate the road surface portion in the image data. The process of determining the pavement mask may be performed before step S403. The pavement mask may be obtained by processing the image data using semantic segmentation. The semantic segmentation model may be implemented based on DeepLabV or another neural network model; for example, the input of the model may be panoramic image data, and the output may be a semantic segmentation result. The pavement mask in the embodiment of the present application may be the road surface segmentation result (mask) of the semantic segmentation result; for example, a mask value of 0 for a pixel point may indicate that the pixel point is a non-road-surface portion, and a mask value of 1 may indicate that it is a road surface portion.
For each image to be corrected, 1 piece of orthophoto data may be generated accordingly. The detailed procedure for performing orthorectification can be seen below.
According to the embodiment of the present application, by determining the image data set, the image data to be corrected can be orthorectified using multiple pieces of image data before and after it, and by combining the road surface height information and the pavement mask, orthophoto data without voids, occlusions, and ghosts can be obtained, so that the lane information can be determined more accurately later.
Referring to fig. 11, a flowchart of an image data processing method according to an embodiment of the present application is shown. As shown in fig. 11, optionally, the step S4032 may include:
Step S40321, determining one or more second pixel points corresponding to the first pixel point to be corrected in the image data set by using the road surface height information and the pose corresponding to the image data to be corrected.
The road surface height corresponding to the first pixel point S to be corrected can be calculated by using the road surface height information and the pose corresponding to the image data to be corrected, so as to determine the coordinates S(X_S, Y_S, Z_S) of the first pixel point. Then, by using the panoramic imaging equation, the first pixel point may be projected onto the corresponding image data according to the order of the image data in the image data set, so as to obtain the coordinates p_i(u, v) of the corresponding pixel point (i.e., the second pixel point) of the first pixel point on that image data.
Step S40322, traversing the pavement masks of the one or more second pixel points, and, in the case where the pavement mask indicates that a second pixel point is a road surface, taking the pixel value of that second pixel point as the pixel value of the third pixel point corresponding to the first pixel point in the orthophoto data, so as to determine the orthophoto data.
For example, the pavement mask result corresponding to each second pixel point p_i may be traversed in order; when the mask value is 1, that is, the second pixel point is indicated to be a valid road surface, the pixel value corresponding to p_i is taken as the pixel value of the third pixel point I(u, v) corresponding to the first pixel point on the orthophoto data. The correspondence between the first pixel point S and the third pixel point I may be expressed as: I(u, v) = ((X_S - X_C)/gsd, (Y_S - Y_C)/gsd), where C may represent the position parameter of the image data to be corrected.
When the mask value is 0, that is, the second pixel point corresponds to a non-road surface, the next second pixel point is traversed until a second pixel point with a mask value of 1 is found. After determining that the mask value of a second pixel point is 0, the first pixel point may be projected onto the next image data in the image data set P to obtain the coordinates of a new second pixel point; alternatively, the second pixel points corresponding to all the image data of the image data set may be determined at one time.
The second pixel points corresponding to the image data in the image data set P may be traversed in order. If none of the second pixel points corresponds to a valid road surface, the first pixel point S may be considered to be non-road-surface or completely occluded, and the third pixel point I on the orthophoto data may be set as an invalid pixel point, thereby completing the processing of the first pixel point S to be corrected.
By traversing all the pixel points in the to-be-corrected range of the image data to be corrected and performing the above orthorectification processing, the orthophoto data corresponding to the image data to be corrected can be obtained.
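The per-point traversal described in steps S40321 and S40322 can be sketched as follows. The `project` callback stands in for the panoramic imaging equation, and the mask and pixel lookups are simplified dictionary-based assumptions; the function name is hypothetical:

```python
def ortho_pixel(S_world, image_set, project, masks, pixels):
    """For one ortho cell at world point S_world, walk the ordered image
    set and return the first pixel value whose pavement mask is 1.

    project(img_id, S_world) -> (u, v) image coordinates (stands in for
    the panoramic imaging equation).
    masks[img_id][(u, v)] -> 0/1 pavement mask value.
    pixels[img_id][(u, v)] -> pixel value on that image.
    Returns None (invalid pixel) if S_world is non-road or occluded in
    every image of the set.
    """
    for img_id in image_set:
        u, v = project(img_id, S_world)
        if masks[img_id].get((u, v), 0) == 1:  # valid road surface
            return pixels[img_id][(u, v)]
    return None  # non-road-surface or completely occluded
```

Running this for every cell of the orthorectification range, with the third-pixel mapping I(u, v) = ((X_S - X_C)/gsd, (Y_S - Y_C)/gsd), yields one orthophoto per image to be corrected.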
According to the embodiment of the present application, the pixel value of the third pixel point on the orthophoto data is determined by traversing the pavement masks corresponding to the second pixel points on the image data before and after the image data to be corrected, so that problems such as lane voids, covered lanes, and ghosts in the projected orthophoto data can be well prevented, improving the accuracy of the subsequent lane information determination.
Referring to fig. 12 (a), 12 (b) and 12 (c), schematic diagrams of the effect of orthorectification according to an embodiment of the present application are shown. Fig. 12 (a) shows panoramic image data acquired by a collection vehicle. During collection, the collection vehicle may occlude part of the road surface and cause voids. As shown in fig. 12 (b), the left side may correspond to the image data to be corrected and its orthophoto data, and the right side may correspond to other image data in the image data set and their orthophoto data, where the orthophoto data may correspond to the base map projection surface in fig. 7. The "+" and "Δ" in the figure may represent points on the road surface. For the road surface point corresponding to "+", the camera may not capture a corresponding image due to occlusion by the collection vehicle, so that after projection onto the orthophoto data (i.e., the base map projection surface), a void area is formed, as shown by the arrow in the left diagram of fig. 12 (c). In this case, with the method according to the embodiment of the present application, the void portion can be filled with pixel information of the corresponding points in the other image data in the image data set, achieving the orthorectification effect: the right diagram of fig. 12 (c) shows the data after orthorectification, with the void portion corrected.
Fig. 13 (a), 13 (b) and 13 (c) show schematic views of the effect of orthorectification according to an embodiment of the present application. Fig. 13 (a) shows panoramic image data acquired by a collection vehicle. During acquisition, since other moving vehicles (such as the circled vehicle) exist on the road, these vehicles may form continuous afterimages (which may be called ghosts) on the obtained panoramic image data. As shown in fig. 13 (b), the camera in the middle may correspond to the image data to be corrected, and the cameras on the left and right may correspond to other image data in the image data set. The ghost formed by the moving vehicle after projection onto the orthophoto data may be as shown by the arrow in the left diagram of fig. 13 (c). In this case, with the method according to the embodiment of the present application, the ghost portion can be filled with pixel information of the corresponding points in the other image data in the image data set, achieving the orthorectification effect: the right diagram of fig. 13 (c) shows the data after orthorectification, with the ghost portion corrected.
According to the above steps, one piece of orthophoto data can be obtained for each image to be rectified, yielding one or more pieces of orthophoto data. These one or more pieces of orthophoto data may then be stitched to obtain stitched orthophoto data.
One or more pieces of the orthophoto data can be stitched together by mosaicking, so that one complete piece of orthophoto data is formed. In this process, the coordinate systems of the pieces of orthophoto data can be unified, the overlapping region between every two pieces determined, and corresponding mosaic (seam) lines constructed to eliminate tone differences and avoid misalignment in the overlapping region, thereby forming complete image data.
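The coordinate-unification and superposition step can be illustrated with a simplified paste-style mosaic. The georeferencing convention assumed here (each tile annotated with the (east, north) world coordinates of its top-left corner, at a fixed metres-per-pixel resolution) is hypothetical, and tone balancing and seam-line construction are omitted.

```python
import numpy as np

def mosaic_tiles(tiles, res):
    """Paste ortho tiles into one canvas on a shared world grid.

    tiles: list of (HxW array, (east, north) of the tile's top-left corner);
    res: metres per pixel. Later tiles overwrite overlaps; seam handling
    is a separate step.
    """
    min_e = min(e for _, (e, _) in tiles)
    max_n = max(n for _, (_, n) in tiles)
    rows = max(round((max_n - n) / res) + img.shape[0] for img, (_, n) in tiles)
    cols = max(round((e - min_e) / res) + img.shape[1] for img, (e, _) in tiles)
    canvas = np.zeros((rows, cols))
    for img, (e, n) in tiles:
        r = round((max_n - n) / res)  # north decreases downward in the image
        c = round((e - min_e) / res)
        canvas[r:r + img.shape[0], c:c + img.shape[1]] = img
    return canvas
```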
Referring to fig. 14 (a) and 14 (b), schematic diagrams of orthophoto data stitching according to an embodiment of the present application are shown. In the figures, "01", "02", "03", "11", "12", "13" may each correspond to a different piece of orthophoto data, where "01", "02", "03" and "11", "12", "13" may represent data acquired on opposite lanes of a bidirectional road.
As shown in fig. 14 (a), when image data is acquired in a bidirectional single-strip manner, the pieces of orthophoto data may be directly superimposed and mosaicked one by one to obtain complete orthophoto data.
When image data is acquired in a bidirectional double-strip manner as shown in fig. 14 (b), data acquired in the same direction can be mosaicked as in fig. 14 (a). However, if the data from the two directions are directly superimposed, a barrier such as a median strip in the road may cover lanes of the road. Therefore, constraints can be added to the mosaicking process so that the mosaic lines pass through the barrier region in the middle of the road (a non-road-surface region) as far as possible.
Optionally, the orthophoto data is stitched orthophoto data, and in the case where bidirectional road information is included in the one or more pieces of orthophoto data, the stitch lines between the one or more pieces of orthophoto data do not pass through the road surface region indicated by the road surface mask.
For example, as shown in fig. 14 (b), a stitch line (i.e., the above-mentioned mosaic line) may be made to pass through regions of the orthophoto data whose mask value is 0 (a value of 0 may indicate that the corresponding region is a non-road-surface region) and not through regions whose mask value is 1.
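A minimal sketch of such a mask-constrained seam: for each row of the overlap region, the seam column is chosen among pixels whose road surface mask value is 0, preferring those near the overlap centre. A real mosaicking pipeline would additionally enforce seam continuity between rows (e.g., via dynamic programming); that is omitted here, and the fallback behaviour when a row is entirely road surface is an assumption.

```python
import numpy as np

def seam_columns(road_mask):
    """Pick one seam column per row of the overlap region.

    road_mask: HxW array over the overlap, 1 = road surface,
    0 = non-road (e.g. median barrier). The seam prefers non-road
    pixels near the overlap centre so it avoids covering lanes.
    """
    h, w = road_mask.shape
    centre = w // 2
    cols = []
    for row in road_mask:
        off_road = np.flatnonzero(row == 0)
        if off_road.size:
            # Nearest non-road column to the overlap centre.
            cols.append(int(off_road[np.argmin(np.abs(off_road - centre))]))
        else:
            cols.append(centre)  # fallback: no non-road pixel in this row
    return cols
```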
According to the embodiment of the application, in the process of stitching bidirectional orthophoto data, the stitch lines between the pieces of orthophoto data do not pass through the road surface region. This prevents the road from being covered by occluders after stitching, makes the stitched orthophoto data more complete and clear, and can improve the accuracy of determining the lane information.
Referring to fig. 15 (a) and 15 (b), schematic diagrams of the effect of orthophoto data stitching according to an embodiment of the present application are shown. As shown in fig. 15 (a), if the pieces are directly superimposed during stitching, the occluder indicated by the arrow may cover the lane lines, so that the number of lanes in the figure is reduced. In contrast, as shown in fig. 15 (b), with the method according to the embodiment of the present application, adding a constraint to the stitch line during stitching prevents the lane lines from being covered by the occluder.
Due to factors such as traffic congestion and slow driving, the stitched orthophoto data may still contain occlusions, blur and missing regions. If lane information were determined directly by a multi-class semantic segmentation method, the detection effect might therefore be poor. Moreover, when labelling samples for multi-class semantic segmentation, both the class and the position of each marking line must be annotated, which makes the labelling workload large and time-consuming. The method according to the embodiment of the present application decouples the position of the lane from the semantics of the lane, so as to determine the lane information quickly and accurately, as follows.
Alternatively, the lane information may relate to the position of the lane and the semantics of the lane in the adjusted image data.
The adjusted image data may be the above-mentioned stitched orthophoto data. The position of the lane may indicate the position of each lane element (such as a lane line, a lane surface, etc.) in the adjusted image data (i.e., the orthophoto data), where the position may include any one or more of the coordinates of the lane element in the image, its coordinates in an absolute coordinate system, and its position relative to other elements in the image.
The position of the lane may be determined from a binary segmentation result. Binary segmentation processing is performed on the stitched orthophoto data by a binary segmentation model to determine the binary segmentation result corresponding to the orthophoto data. Referring to fig. 16, a schematic diagram of a binary segmentation result according to an embodiment of the present application is shown. In the binary segmentation result, a pixel value of 1 may indicate that the corresponding pixel contains lane information (for example, the position of a lane line, a lane surface, etc.), such as the white portion in fig. 16; a pixel value of 0 may indicate that the corresponding pixel does not contain lane information (e.g., is not the position of a lane line, a lane surface, etc.), such as the black portion in fig. 16.
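Extracting lane-element positions from such a binary result, including the absolute coordinates mentioned above, might look like the following. The georeferencing convention (an (east, north) origin at the top-left pixel and a metres-per-pixel resolution) is a hypothetical assumption for illustration.

```python
import numpy as np

def lane_pixel_world_coords(seg, origin_en, res):
    """seg: HxW binary segmentation result (1 = lane element pixel).

    origin_en: (east, north) of the top-left pixel of the orthophoto;
    res: metres per pixel. Returns an Nx2 array of (east, north) for
    every lane pixel.
    """
    rows, cols = np.nonzero(seg == 1)
    east = origin_en[0] + cols * res
    north = origin_en[1] - rows * res  # image rows grow southward
    return np.stack([east, north], axis=1)
```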
The semantics of the lane may indicate the type of each lane element, such as a road center line, a lane dividing line, a lane edge line, or a lane surface marking.
The semantics of the lane may be determined from a target detection result. Target detection is performed on the stitched orthophoto data by a target detection model to determine the target detection result corresponding to the orthophoto data. Referring to fig. 17, a schematic diagram of a target detection result according to an embodiment of the present application is shown. By performing target detection on the stitched orthophoto data, a set of target detection boxes may be obtained, where the set may include one or more detection boxes, such as the rectangular boxes in fig. 17. The target detection result may include the target type corresponding to each detection box, the pixel coordinates of one (e.g., the top-left corner) or more vertices of the box, and the width and height of the box, etc.
The binary segmentation model and the target detection model may be obtained through pre-training. Referring to fig. 18, a schematic diagram of model training according to an embodiment of the present application is shown. A DOM base map can be manually labelled to obtain binary segmentation sample data and target detection sample data, respectively. As shown in the figure, the binary segmentation sample data may be labelled with the positions of lane elements, and the target detection sample data may be labelled with detection boxes corresponding to typical regions of each lane element type. An encoder network can be trained based on a few-shot learning method to obtain the trained binary segmentation model, and a detection network using multi-scale spatial features, etc., can be trained to obtain the trained target detection model.
It should be noted that the training processes of the binary segmentation model and the target detection model may be performed separately, and training may be stopped once the binary segmentation model and/or the target detection model reaches a predetermined accuracy.
In the process of target detection, the target types can be expanded: new types can be added (for example, new types of elements such as lane lines and lane surfaces, or new types of road environment elements such as traffic signs and traffic lights), or existing types can be refined (for example, refining lane surface markings into different categories, distinguishing deceleration markings, speed-limit markings, etc.). When new types of elements such as lane lines and lane surfaces are added, or existing elements are refined into finer classes, it is sufficient to label sample data and retrain the target detection model, so that rapid expansion of the target types can be achieved.
When road environment elements are newly added, sample data containing the road environment information (such as panoramic image sample data containing the corresponding road environment information) can be labelled and the target detection model retrained. When the retrained model is applied, target detection can be performed directly on the image data of step S401, and the target types related to the road environment elements can be determined without adjusting the image data, so that rapid expansion of the target types can be achieved.
When binary segmentation is combined with the target detection task to determine the lane information, a smaller sample size is required for training the models than in the multi-class semantic segmentation approach. Referring to Table 1, a comparison is shown between multi-class semantic segmentation and the method of the embodiment of the present application in terms of the sample size used for training.
TABLE 1

Method                                        Sample size
Multi-class semantic segmentation             500
Method of the embodiment of the application   200 (binary segmentation) + 70 (target detection)
As shown in Table 1, at least 500 samples are needed to train the multi-class semantic segmentation model, whereas the method of the embodiment of the present application, which combines binary segmentation with target detection, uses only 200 samples to train the binary segmentation model and 70 samples to train the target detection model while achieving the same accuracy as the multi-class semantic segmentation model. Cost can therefore be saved and the demand on computing power reduced.
After the position of the lane and the semantics of the lane in the adjusted image data are obtained, the lane information can be determined through the position of the lane and the semantics of the lane.
The above binary segmentation result (which may be denoted F) and the target detection result (the above set of target detection boxes, which may be denoted B) may be combined to determine the lane information. For example, for each detection box b in B with range (xmin, ymin, xmin+w, ymin+h), where (xmin, ymin) may represent the pixel coordinates of the top-left vertex of the box and w and h may represent its width and height, if the value of F at a pixel in this range is 1, the type of that pixel of F may be set to the target type obtained in the target detection result. All elements of B can be traversed, and a class can be set for every pixel whose value in F is 1, so that the semantics and positions of the lane elements are integrated and put into correspondence; the result of this integration may be denoted L. Vectorization can then be performed on the integrated result L, for example converting pixel coordinates in a position into world coordinates, and converting a real lane line in the orthophoto data into a combination of points and directed lines (e.g., into vector lines), to determine lane information that can be used to build a lane-level map; see the lane-level map in fig. 1.
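The integration of the binary segmentation result F with the detection box set B described above can be sketched as follows. The class identifiers and the box tuple layout are illustrative assumptions; vectorization is omitted.

```python
import numpy as np

def integrate(F, boxes):
    """Combine binary segmentation with detection boxes.

    F: HxW binary segmentation result (1 = lane element pixel).
    boxes: iterable of (cls_id, xmin, ymin, w, h) from the detector,
    with cls_id > 0 (hypothetical encoding of the target type).
    Returns L: HxW map where lane pixels inside a detection box carry
    that box's class and all other pixels are 0.
    """
    L = np.zeros_like(F, dtype=np.int32)
    for cls_id, xmin, ymin, w, h in boxes:
        patch = F[ymin:ymin + h, xmin:xmin + w]
        # Only pixels the segmentation marked as lane receive the class.
        L[ymin:ymin + h, xmin:xmin + w][patch == 1] = cls_id
    return L
```

Lane pixels outside every detection box keep class 0, i.e. position without semantics; a real pipeline would decide how to handle those separately.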
According to the embodiment of the application, the binary semantic segmentation and the multi-class target detection are combined by decoupling the position of the lane and the semantics of the lane, so that adverse effects caused by uneven distribution of the lane element types can be avoided, the sample demand of the model on training data is reduced, and the rapid expansion of the lane element types can be supported, thereby realizing efficient and accurate determination of the lane information.
Fig. 19 shows the effect of determining lane information according to an embodiment of the present application. As shown in fig. 19, the first column may correspond to the stitched orthophoto data. Taking the determination of lane line elements in the lane information as an example, the second column may correspond to the ground-truth lane line elements (i.e., the true values in the figure), and the third column may correspond to the recognition result of the lane line elements (e.g., the above L) obtained by determining the lane information from the position of the lane and the semantics of the lane; the circled portions in the figure may represent lane line elements that were missed or mis-recognized. It can be seen that the method of the embodiment of the present application can achieve a recognition accuracy of 90% or more.
Fig. 20 shows a block diagram of an image data processing apparatus according to an embodiment of the present application. As shown in fig. 20, the apparatus includes:
an acquisition module 2101 for acquiring image data of a road.
A first determining module 2102, configured to determine road surface height information of a road according to the image data and parameters of the real-time dynamic positioning RTK device.
The adjusting module 2103 is configured to adjust the image data by using the road surface height information, and determine lane information in the adjusted image data, where the lane information is used to construct a lane-level map.
According to the embodiment of the application, the image data of the road is acquired by low-cost equipment, so that the lane information can be determined from the image data at low cost. The parameters of the RTK device can be combined with the image data to obtain the road surface height information of the road, which is used to adjust the image data and further improve the accuracy of the obtained lane information. Lane data can thus be generated at low cost, high efficiency and high accuracy, and the method can be applied on a large scale to construct lane-level maps, realizing wide high-definition navigation coverage of complex road scenes.
Alternatively, the image data is acquired by a panoramic camera or a normal camera.
According to the embodiment of the application, the image data can be acquired at low cost, and conditions are created for large-scale popularization and application of the device of the embodiment of the application.
Alternatively, the lane information includes lane line information and lane face information.
According to the embodiment of the application, the coverage of multiple lane environment scenes can be realized, and the application range is enlarged.
Optionally, the first determining module 2102 is configured to: determining the pose of the image data through the image data and parameters of RTK equipment; and determining the road surface height information according to the pose.
According to the embodiment of the application, the pose of the image data is determined, and the road surface height information is determined by utilizing the pose, so that the actual height of the road surface can be calculated in consideration of the change of the road surface gradient, the transverse accuracy of the map after the subsequent orthorectified is ensured, and the accuracy of the lane information determination can be improved.
Optionally, the first determining module 2102 is configured to: and adjusting the road surface height information by utilizing the point cloud information to obtain the adjusted road surface height information, wherein the point cloud information is determined according to the image data and parameters of the real-time dynamic positioning RTK equipment.
According to the embodiment of the application, the road surface height information is adjusted by utilizing the point cloud information, and the adjusted road surface height information can be more close to the actual road surface height, so that the conditions of deformation, distortion and the like of the lane elements caused by subsequent projection can be further reduced, and the accuracy of subsequent determination of the lane information is improved.
Optionally, the adjusted image data is orthophoto data.
According to the embodiment of the application, the lane information in the orthophoto data is determined by obtaining the orthophoto data and splicing the orthophoto data, so that the lane information can be determined with low cost and high efficiency, and the device is suitable for large-scale application.
Optionally, the adjusting module 2103 is configured to: determine an image data set comprising one or more pieces of image data preceding the image data to be rectified and/or one or more pieces of image data following it; and orthorectify the image data to be rectified by using the road surface height information of the image data and the road surface mask of the image data set to obtain the orthophoto data corresponding to the image data to be rectified, where the road surface mask is determined from the image data and indicates the road surface portion of the road.
According to the embodiment of the application, by determining the image data set, a plurality of pieces of image data before and after the image data to be rectified can be used to orthorectify it, and by combining the road surface height information with the road surface mask, orthophoto data free of holes, coverage and ghosts can be obtained, so that the lane information can be determined more accurately later.
Optionally, orthorectifying the image data to be rectified by using the road surface height information of the image data and the road surface mask of the image data set to obtain the orthophoto data corresponding to the image data to be rectified includes: determining, by using the road surface height information and the pose corresponding to the image data to be rectified, one or more second pixel points in the image data set corresponding to a first pixel point to be rectified; and traversing the road surface masks of the one or more second pixel points, and, in the case that a road surface mask indicates that the corresponding second pixel point belongs to the road surface, taking the pixel value of that second pixel point as the pixel value of the third pixel point corresponding to the first pixel point in the orthophoto data, so as to determine the orthophoto data.
According to the embodiment of the application, by traversing the road surface masks corresponding to the second pixel points in the image data before and after the image data to be rectified to correct the pixel value of the corresponding third pixel point in the orthophoto data, problems such as lane holes, covered lanes and ghosts in the projected orthophoto data can be effectively prevented, improving the accuracy of the subsequent determination of the lane information.
Optionally, the orthophoto data is stitched orthophoto data, and in the case where bidirectional road information is included in one or more pieces of the orthophoto data, the stitch lines between the one or more pieces of orthophoto data do not pass through the road surface region indicated by the road surface mask.
According to the embodiment of the application, in the process of stitching bidirectional orthophoto data, the stitch lines between the pieces of orthophoto data do not pass through the road surface region. This prevents the road from being covered by occluders after stitching, makes the stitched orthophoto data more complete and clear, and can improve the accuracy of determining the lane information.
Optionally, the lane information relates to the position of the lane and the semantics of the lane in the adjusted image data.
According to the embodiment of the application, the binary semantic segmentation and the multi-class target detection are combined by decoupling the position of the lane and the semantics of the lane, so that adverse effects caused by uneven distribution of the lane element types can be avoided, the sample demand of the model on training data is reduced, and the rapid expansion of the lane element types can be supported, thereby realizing efficient and accurate determination of the lane information.
An embodiment of the present application provides an image data processing apparatus including: a processor and a memory; the memory is used for storing programs; the processor is configured to execute a program stored in the memory to cause the apparatus to implement the above-described image data processing method.
An embodiment of the present application provides a computer-readable storage medium having stored thereon program instructions which, when executed by a computer, cause the computer to implement the above-described image data processing method.
An embodiment of the present application provides a terminal device, which may perform the above-described image data processing method.
Embodiments of the present application provide a computer program product comprising program instructions which, when executed by a computer, cause the computer to implement the above-described image data processing method.
An embodiment of the present application provides a vehicle including a processor for performing the above-described image data processing method.
Fig. 21 shows a block diagram of an electronic device 2200 according to an embodiment of the application. As shown in fig. 21, the electronic device 2200 may be the above-described image data processing system. The electronic device 2200 includes at least one processor 1801, at least one memory 1802, and at least one communication interface 1803. The electronic device may further comprise common components such as an antenna, which are not described in detail herein.
The respective constituent elements of the electronic device 2200 will be specifically described below with reference to fig. 21.
The processor 1801 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the above-described programs. The processor 1801 may include one or more processing units. For example, the processor 1801 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. The different processing units may be separate devices or may be integrated in one or more processors.
A communication interface 1803 is used for communicating with other electronic devices or communication networks, such as an Ethernet, a radio access network (RAN), a core network, a wireless local area network (WLAN), etc.
The memory 1802 may be, but is not limited to, a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory may be stand-alone and coupled to the processor via a bus, or may be integrated with the processor.
The memory 1802 is configured to store application program code for performing the above schemes, with execution controlled by the processor 1801. The processor 1801 is configured to execute the application code stored in the memory 1802.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to related descriptions of other embodiments.
The computer readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanical encoding device such as punch cards or in-groove raised structures having instructions stored thereon, and any suitable combination of the foregoing.
The computer readable program instructions or code described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present application may be assembler instructions, instruction set architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk or C++ and conventional procedural programming languages such as the "C" language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the application are implemented by personalizing electronic circuitry, such as programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA), with state information of computer-readable program instructions.
Various aspects of the present application are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by hardware (such as circuits or application-specific integrated circuits (ASICs)) that performs the corresponding functions or acts, or by combinations of hardware and software, such as firmware.
Although the invention is described herein in connection with various embodiments, other variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing the claimed invention, from a study of the drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other elements or steps, and the "a" or "an" does not exclude a plurality. A single processor or other unit may fulfill the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
The foregoing description of embodiments of the application has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope of the various embodiments described. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or the improvement of technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (24)

1. A method of processing image data, the method comprising:
acquiring image data of a road;
determining road surface height information of the road according to the image data and parameters of a real-time kinematic (RTK) positioning device;
and adjusting the image data using the road surface height information to obtain lane information in the adjusted image data, wherein the lane information is used for constructing a lane-level map.
2. The method of claim 1, wherein determining the road surface height information of the road according to the image data and the parameters of the RTK device comprises:
obtaining a pose of the image data from the image data and the parameters of the RTK device;
and determining the road surface height information according to the pose.
3. The method according to claim 1 or 2, characterized in that the lane information indicates a position of a lane and semantics of the lane in the adjusted image data.
4. The method according to any one of claims 1-3, wherein the adjusted image data is orthographic image data.
5. The method of claim 4, wherein adjusting the image data using the road surface height information to obtain the lane information in the adjusted image data comprises:
determining an image data set comprising image data to be corrected and one or more frames of the image data preceding and/or following the image data to be corrected;
and performing orthographic correction on the image data to be corrected using the road surface height information and a road surface mask of the image data set, to obtain orthographic image data corresponding to the image data to be corrected, wherein the road surface mask is determined according to the image data and indicates a road surface portion of the road.
6. The method according to claim 5, wherein performing orthographic correction on the image data to be corrected using the road surface height information and the road surface mask of the image data set, to obtain the orthographic image data corresponding to the image data to be corrected, comprises:
determining, using the road surface height information and a pose corresponding to the image data to be corrected, one or more second pixel points in the image data set that correspond to a first pixel point to be corrected;
and traversing the road surface mask of the one or more second pixel points, and, in a case where the road surface mask indicates that a corresponding second pixel point is road surface, taking a pixel value of that second pixel point as a pixel value of a third pixel point corresponding to the first pixel point in the orthographic image data, to determine the orthographic image data.
7. The method of any of claims 4-6, wherein the orthographic image data is stitched orthographic image data, and wherein, in a case where bi-directional road information is included in one or more frames of the orthographic image data, a stitching line between the frames does not pass through a road surface area indicated by the road surface mask.
8. The method of any of claims 1-7, wherein determining the road surface height information of the road according to the image data and the parameters of the RTK device comprises:
adjusting the road surface height information using point cloud information to obtain adjusted road surface height information, wherein the point cloud information is determined according to the image data and the parameters of the RTK device.
9. The method according to any one of claims 1-8, wherein the image data is acquired by a panoramic camera or a regular camera.
10. The method of any of claims 1-9, wherein the lane information includes lane line information and lane surface information.
11. An image data processing apparatus, characterized in that the apparatus comprises:
an acquisition module, configured to acquire image data of a road;
a first determining module, configured to determine road surface height information of the road according to the image data and parameters of a real-time kinematic (RTK) positioning device;
and an adjustment module, configured to adjust the image data using the road surface height information and determine lane information in the adjusted image data, wherein the lane information is used for constructing a lane-level map.
12. The apparatus of claim 11, wherein the first determining module is configured to:
determine a pose of the image data from the image data and the parameters of the RTK device;
and determine the road surface height information according to the pose.
13. The apparatus according to claim 11 or 12, characterized in that the lane information indicates a position of a lane and semantics of the lane in the adjusted image data.
14. The apparatus of any of claims 11-13, wherein the adjusted image data is orthographic image data.
15. The apparatus of claim 14, wherein the adjustment module is configured to:
determine an image data set comprising image data to be corrected and one or more frames of the image data preceding and/or following the image data to be corrected;
and perform orthographic correction on the image data to be corrected using the road surface height information and a road surface mask of the image data set, to obtain orthographic image data corresponding to the image data to be corrected, wherein the road surface mask is determined according to the image data and indicates a road surface portion of the road.
16. The apparatus of claim 15, wherein performing orthographic correction on the image data to be corrected using the road surface height information and the road surface mask of the image data set, to obtain the orthographic image data corresponding to the image data to be corrected, comprises:
determining, using the road surface height information and a pose corresponding to the image data to be corrected, one or more second pixel points in the image data set that correspond to a first pixel point to be corrected;
and traversing the road surface mask of the one or more second pixel points, and, in a case where the road surface mask indicates that a corresponding second pixel point is road surface, taking a pixel value of that second pixel point as a pixel value of a third pixel point corresponding to the first pixel point in the orthographic image data, to determine the orthographic image data.
17. The apparatus of any of claims 14-16, wherein the orthographic image data is stitched orthographic image data, and wherein, in a case where bi-directional road information is included in one or more frames of the orthographic image data, a stitching line between the frames does not pass through a road surface area indicated by the road surface mask.
18. The apparatus according to any one of claims 11-17, wherein the first determining module is configured to:
adjust the road surface height information using point cloud information to obtain adjusted road surface height information, wherein the point cloud information is determined according to the image data and the parameters of the RTK device.
19. The apparatus according to any one of claims 11-18, wherein the image data is acquired by a panoramic camera or a regular camera.
20. The apparatus of any one of claims 11-19, wherein the lane information includes lane line information and lane surface information.
21. An image data processing apparatus, comprising: a processor and a memory;
wherein the memory is configured to store a program;
and the processor is configured to execute the program stored in the memory, to cause the apparatus to implement the method of any one of claims 1-10.
22. A computer-readable storage medium having program instructions stored thereon which, when executed by a computer, cause the computer to implement the method of any one of claims 1-10.
23. A computer program product comprising program instructions which, when executed by a computer, cause the computer to carry out the method of any one of claims 1 to 10.
24. A vehicle, characterized in that it comprises a processor for performing the method according to any one of claims 1-10.
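The orthographic correction recited in claims 5 and 6 can be sketched in code. The sketch below is illustrative only and is not the patented implementation: it assumes a pinhole camera with intrinsics `K`, 4x4 world-to-camera pose matrices, a locally flat road at the RTK-derived height `ground_z`, and a per-frame boolean road-surface mask; all function and parameter names (`orthorectify`, `gsd`, `origin`) are hypothetical.

```python
import numpy as np

def orthorectify(images, masks, poses, K, ground_z, out_shape, gsd, origin):
    """Illustrative sketch of claims 5-6: for each cell of the output
    orthophoto, project the corresponding ground point into each candidate
    frame using the road-surface height and the frame's pose, and keep the
    first pixel that the road-surface mask marks as road."""
    rows, cols = out_shape
    ortho = np.zeros((rows, cols, 3), dtype=np.uint8)
    for r in range(rows):
        for c in range(cols):
            # Ground point for this ortho cell, at the RTK-derived road height.
            X = np.array([origin[0] + c * gsd, origin[1] + r * gsd, ground_z, 1.0])
            for img, mask, P in zip(images, masks, poses):
                uvw = K @ (P @ X)[:3]          # world -> camera -> homogeneous pixel
                if uvw[2] <= 0:
                    continue                    # ground point behind this camera
                u, v = int(uvw[0] / uvw[2]), int(uvw[1] / uvw[2])
                in_bounds = 0 <= v < img.shape[0] and 0 <= u < img.shape[1]
                if in_bounds and mask[v, u]:    # "second pixel point" is road surface
                    ortho[r, c] = img[v, u]     # copy it to the "third pixel point"
                    break                       # stop traversing candidate frames
    return ortho
```

Traversing the candidate frames and stopping at the first mask hit mirrors claim 6: the ortho cell is the "first pixel point", each candidate projection a "second pixel point", and the output cell the "third pixel point".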
CN202211330059.8A 2022-10-27 2022-10-27 Image data processing method, image data processing device, storage medium and vehicle Pending CN117994744A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211330059.8A CN117994744A (en) 2022-10-27 2022-10-27 Image data processing method, image data processing device, storage medium and vehicle


Publications (1)

Publication Number Publication Date
CN117994744A true CN117994744A (en) 2024-05-07

Family

ID=90893872




Legal Events

Date Code Title Description
PB01 Publication