CN116563812B - Target detection method, target detection device, storage medium and vehicle - Google Patents

Target detection method, target detection device, storage medium and vehicle Download PDF

Info

Publication number
CN116563812B
CN116563812B (Application CN202310834347.5A)
Authority
CN
China
Prior art keywords
target
point cloud
point
image
cloud data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310834347.5A
Other languages
Chinese (zh)
Other versions
CN116563812A (en)
Inventor
万韶华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiaomi Automobile Technology Co Ltd
Original Assignee
Xiaomi Automobile Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiaomi Automobile Technology Co Ltd filed Critical Xiaomi Automobile Technology Co Ltd
Priority to CN202310834347.5A priority Critical patent/CN116563812B/en
Publication of CN116563812A publication Critical patent/CN116563812A/en
Application granted granted Critical
Publication of CN116563812B publication Critical patent/CN116563812B/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/10Image acquisition
    • G06V10/16Image acquisition using multiple overlapping images; Image stitching
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/7715Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The disclosure relates to a target detection method, a target detection device, a storage medium and a vehicle. A target environment image of the driving environment in which the vehicle is currently located is acquired, and the boundary of a target object in the driving environment is identified from the target environment image by a target detection model. The target detection model is trained in advance on target samples, where each target sample includes a sample environment image in which the boundary of the target object has been labeled according to point cloud data of the driving environment.

Description

Target detection method, target detection device, storage medium and vehicle
Technical Field
The disclosure relates to the technical field of automatic driving, and in particular relates to a target detection method, a target detection device, a storage medium and a vehicle.
Background
In recent years, driver-assistance technologies of various levels have developed rapidly as automobile manufacturers pursue intelligent vehicle technology. Perceiving obstacles in the surrounding environment with sensors such as vision sensors, laser rangefinders, ultrasonic sensors and infrared sensors is one of the key technologies for automatic vehicle navigation, and one of the key capabilities for automated driving and automated parking.
In an automated driving task, the drivable area (Freespace, FS) is the region of physical space in which the vehicle can travel safely and freely, while the non-drivable area is the region into which the vehicle cannot advance, or in which it cannot stop, because it is occupied by obstacles. Various types of obstacles exist in the driving environment: static obstacles such as parked vehicles, posts and traffic cones, and dynamic obstacles such as moving vehicles and pedestrians. The ground lines of these obstacles can be used to represent Freespace.
Disclosure of Invention
In order to overcome the problems in the related art, the present disclosure provides a target detection method, apparatus, storage medium, and vehicle.
According to a first aspect of an embodiment of the present disclosure, there is provided a target detection method, including:
acquiring a target environment image of a current running environment of a vehicle;
identifying the boundary of a target object in the running environment through a target detection model according to the target environment image;
wherein the target detection model is a model obtained in advance by training on a target sample, and the target sample comprises a sample environment image obtained by labeling the boundary of the target object according to point cloud data of the driving environment.
Optionally, the target detection model is pre-trained by:
acquiring point cloud data of the running environment and an environment image to be marked according to a preset frequency;
for each acquisition time, marking boundary points of the target object in the environment image to be marked acquired at that acquisition time according to first point cloud data acquired at that acquisition time, so as to obtain the sample environment image;
and carrying out model training on a preset detection model according to the sample environment image to obtain the target detection model.
Optionally, a plurality of image acquisition devices are arranged around the vehicle, and the marking of boundary points of the target object in the environment image to be marked acquired at the acquisition time according to the first point cloud data acquired at the acquisition time to obtain the sample environment image includes:
acquiring, for each image acquisition device, a field angle of the image acquisition device and a target position point of the image acquisition device in the first point cloud data;
determining point cloud boundary points of the target object from the first point cloud data according to the field angle and the target position point which are respectively corresponding to each image acquisition device;
determining a target boundary point of the target object in the environment image to be annotated according to the point cloud boundary point;
and marking the target boundary point in the environment image to be marked to obtain the sample environment image.
Optionally, the determining, according to the field angle and the target position point corresponding to each image acquisition device respectively, a point cloud boundary point of the target object from the first point cloud data includes:
determining, from the first point cloud data, second point cloud data lying within a preset height range, wherein the height of each point in the second point cloud data above the ground lies within the preset height range;
and for each image acquisition device, taking, as the point cloud boundary points, the points of the second point cloud data that are closest to the target position point in each preset direction within the field-of-view range.
Optionally, the determining, according to the point cloud boundary point, the target boundary point of the target object in the environment image to be annotated includes:
acquiring a preset conversion matrix, wherein the preset conversion matrix represents the mapping relation between each point in the point cloud data and each pixel in the environment image to be marked;
and determining the corresponding target boundary point of each point cloud boundary point in the environment image to be annotated according to the preset conversion matrix.
Optionally, the method further comprises:
for each acquisition time, acquiring jitter parameters corresponding to a preset number of frames of point cloud data corresponding to that acquisition time;
and deleting the point cloud data acquired at the acquisition time when it is determined from the jitter parameters that the vehicle passes over a target road surface at that acquisition time, wherein the target road surface comprises a convex road surface or a concave road surface.
Optionally, the target environment image comprises a bird's eye view; the obtaining the target environment image of the current running environment of the vehicle comprises the following steps:
respectively acquiring initial environment images of different orientations through a plurality of image acquisition devices arranged on the vehicle;
and stitching the initial environment images corresponding to the orientations to obtain the bird's eye view.
According to a second aspect of embodiments of the present disclosure, there is provided an object detection apparatus including:
the acquisition module is configured to acquire a target environment image of the current running environment of the vehicle;
an identification module configured to identify a boundary of a target object in the driving environment through a target detection model based on the target environment image;
the target detection model is a model which is obtained by training a target sample in advance, and the target sample comprises a sample environment image obtained after marking the boundary of the target object according to the point cloud data of the driving environment.
According to a third aspect of embodiments of the present disclosure, there is provided a vehicle comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the steps of the method of the first aspect of the present disclosure.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the method of the first aspect of the present disclosure.
The technical scheme provided by the embodiments of the disclosure can have the following beneficial effects: the boundary of the target object is labeled on the sample image of the driving environment using the point cloud data of that environment. Because three-dimensional point cloud data offer higher detection precision than two-dimensional image data, labeling the boundary of the target object based on point cloud data is more accurate than the related-art practice of determining the boundary through two-dimensional image recognition and then labeling it manually. Moreover, automatically labeling the boundary with three-dimensional point cloud data is markedly more efficient than manual labeling and saves labor cost. Furthermore, a target detection model trained on target samples with such high labeling accuracy greatly improves the accuracy of target object detection.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a flow chart illustrating a method of object detection according to an exemplary embodiment.
Fig. 2 is a flow chart of a target detection method according to the embodiment shown in fig. 1.
Fig. 3 is a flow chart of a target detection method according to the embodiment shown in fig. 2.
Fig. 4 is a block diagram illustrating an object detection apparatus according to an exemplary embodiment.
Fig. 5 is a block diagram of an object detection apparatus according to the embodiment shown in fig. 4.
FIG. 6 is a block diagram of a vehicle, according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of apparatus and methods consistent with aspects of the present disclosure.
The method and the device are mainly applied to scenarios in which the drivable area of a vehicle is detected. Various types of obstacles exist in the vehicle's driving environment, and the ground line, or the projection line perpendicular to the ground, of each obstacle may be used to determine the drivable area. The Freespace algorithm provided in the related art mainly uses four surround-view cameras arranged on the vehicle to perceive obstacles around the vehicle body and assist the downstream construction of an obstacle map. The input of the Freespace algorithm is a bird's eye view (BEV), also referred to as a BEV plan view, and the output is the projection lines, perpendicular to the ground, of the various obstacles on the BEV plan view.
However, if an obstacle overhangs the ground, such as the nose or tail of a vehicle, the edge of the obstacle determined by the related-art Freespace algorithm in the generated BEV plan view differs noticeably from the obstacle's actual projection line perpendicular to the ground. In this case, during manual annotation the projection line can only be estimated from the visible edge of the obstacle, which yields poor labeling accuracy and consistency, while manual labeling is also inefficient and costly.
In order to solve the above-mentioned problems, the present disclosure provides a target detection method, a device, a storage medium, and a vehicle. The following detailed description of specific embodiments of the present disclosure refers to the accompanying drawings.
Fig. 1 is a flow chart illustrating a target detection method according to an exemplary embodiment, as shown in fig. 1, the method comprising the steps of:
in step S101, a target environment image of the running environment in which the vehicle is currently located is acquired.
In a possible implementation, a plurality of image acquisition devices (such as cameras) may be arranged around the vehicle body, so that initial environment images of different orientations around the vehicle can be acquired by the respective image acquisition devices, and the target environment image can then be determined from these initial environment images, where the target environment image includes a bird's eye view.
Therefore, in this step, the initial environment images of the different orientations can be acquired by the plurality of image acquisition devices arranged on the vehicle, and the initial environment images corresponding to the orientations are stitched to obtain the bird's eye view. The stitching can be performed based on the extrinsic and intrinsic parameters of the image acquisition devices.
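For illustration, the following is a minimal sketch of how such stitching might be implemented, assuming that a ground-plane homography from each camera image to the BEV grid has already been derived from the camera's intrinsic and extrinsic parameters; the function name, the overwrite-style compositing and the BEV size are illustrative assumptions, not part of the disclosure.

```python
import numpy as np
import cv2  # OpenCV is used only for the perspective warp

def stitch_bev(images, ground_homographies, bev_size=(800, 800)):
    """Warp each surround-view image onto a common ground-plane (BEV) grid
    and overlay the results. `ground_homographies` maps image pixels to BEV
    pixels and is assumed to be pre-computed from each camera's intrinsic
    and extrinsic parameters."""
    bev = np.zeros((bev_size[1], bev_size[0], 3), dtype=np.uint8)
    for img, H in zip(images, ground_homographies):
        warped = cv2.warpPerspective(img, H, bev_size)
        mask = warped.any(axis=2)   # pixels actually covered by this camera
        bev[mask] = warped[mask]    # simple overwrite; blending is also possible
    return bev
```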
In step S102, the boundary of the target object in the driving environment is identified according to the target environment image through a target detection model, where the target detection model is a model trained in advance according to a target sample, and the target sample includes a sample environment image obtained after labeling the boundary of the target object according to point cloud data of the driving environment.
The object detection model may include a neural network model for performing object detection, and the model type and model structure of the object detection model are not particularly limited in the present disclosure. The boundary of the target object generally refers to the ground line of the target object, or the projection line of the target object perpendicular to the ground.
In this step, after the target environment image is input to the target detection model, the boundary of the target object on the target environment image may be output through the model, so that the drivable region may be further determined based on the boundary of the target object.
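The disclosure does not fix the output format of the model. As one hedged example of the post-processing described here, assume the model yields, for every BEV column, the row index of the nearest obstacle boundary (negative if none); a drivable-region mask could then be derived roughly as follows, with hypothetical function and argument names.

```python
import numpy as np

def drivable_mask_from_boundary(boundary_rows, bev_shape, ego_row):
    """Given, for each BEV column, the row of the nearest obstacle boundary
    (as predicted by the detection model), mark the cells between the ego
    vehicle row and that boundary as drivable. Columns with no detected
    boundary are treated as drivable up to the image edge."""
    h, w = bev_shape
    mask = np.zeros((h, w), dtype=bool)
    for col in range(w):
        r = boundary_rows[col] if boundary_rows[col] >= 0 else 0
        lo, hi = sorted((r, ego_row))
        mask[lo:hi, col] = True
    return mask
```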
The target object may include, for example, objects on either side of the road on which the vehicle is currently travelling or objects on the road itself, such as other vehicles, buildings, trees and pedestrians. In the present disclosure, a target object is an obstacle in the driving environment with respect to the travel of the vehicle.
In general, the point cloud data of the driving environment are acquired by a rotating lidar. The lidar can be mounted on the roof of the vehicle and emits laser beams (for example, a 128-line laser) toward the surroundings; one frame of radar point cloud data is obtained after the 128-line laser completes one full revolution. In an automated driving scene, the distance and azimuth of a target object can be accurately determined from the radar point cloud data.
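As a hedged illustration of how one revolution of such a rotating lidar yields a Cartesian point cloud frame (the beam-angle layout and array shapes are assumptions made for the sketch, not taken from the disclosure):

```python
import numpy as np

def spin_to_points(ranges, elevations_deg, azimuths_deg):
    """Convert one full revolution of a rotating lidar into Cartesian points.
    `ranges` has shape (n_rings, n_azimuths); elevations/azimuths are the
    beam angles in degrees (e.g. 128 rings for a 128-line lidar)."""
    el = np.deg2rad(elevations_deg)[:, None]   # (n_rings, 1)
    az = np.deg2rad(azimuths_deg)[None, :]     # (1, n_azimuths)
    x = ranges * np.cos(el) * np.cos(az)
    y = ranges * np.cos(el) * np.sin(az)
    z = ranges * np.sin(el)
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)
```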
According to the method, the boundary of the target object is labeled on the sample image of the driving environment using the point cloud data of that environment. Because three-dimensional point cloud data offer higher detection precision than two-dimensional image data, labeling the boundary of the target object based on point cloud data is more accurate than the related-art practice of determining the boundary through two-dimensional image recognition and then labeling it manually. Moreover, automatically labeling the boundary with three-dimensional point cloud data is markedly more efficient than manual labeling and saves labor cost. Furthermore, a target detection model trained on target samples with such high labeling accuracy greatly improves the accuracy of target object detection.
FIG. 2 is a flow chart of a method of target detection according to the embodiment shown in FIG. 1, as shown in FIG. 2, the target detection model may be pre-trained by:
in step S201, point cloud data of the driving environment and an environment image to be marked are collected according to a preset frequency.
For example, four surround-view cameras are arranged around the vehicle and a 360-degree lidar is mounted on the roof. The four surround-view cameras collect initial environment images in the four orientations of the vehicle, and these images are then stitched, based on the extrinsic and intrinsic parameters of the cameras, to obtain the environment image to be marked. The lidar is used to acquire the point cloud data of the driving environment.
The preset frequency may be, for example, 10 Hz, so that the four surround-view cameras capture initial environment images at a 10 Hz acquisition rate and the lidar acquires three-dimensional point cloud data at the same 10 Hz rate.
In a possible implementation, to ensure that enough three-dimensional point cloud data are available to represent a target object, multiple frames of point cloud data corresponding to each acquisition time can be acquired and merged into a single frame that serves as the first point cloud data for that acquisition time. For example, 10 frames of point cloud data before and 10 frames after the acquisition time can be taken, and the 20 frames in total are merged into one frame of point cloud data.
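A minimal sketch of this accumulation step is given below, assuming each frame comes with a lidar-to-world pose (for example from the vehicle's odometry) so that the surrounding frames can be expressed in the reference frame of the acquisition time; the pose source and the function name are assumptions, not part of the disclosure.

```python
import numpy as np

def accumulate_frames(frames, poses, ref_pose):
    """Merge several point-cloud frames (e.g. 10 before and 10 after the
    acquisition time) into a single cloud expressed in the reference frame.
    `poses` and `ref_pose` are 4x4 lidar-to-world transforms."""
    ref_inv = np.linalg.inv(ref_pose)
    merged = []
    for pts, pose in zip(frames, poses):
        homo = np.hstack([pts, np.ones((len(pts), 1))])    # Nx4 homogeneous points
        merged.append((homo @ (ref_inv @ pose).T)[:, :3])  # frame i -> reference frame
    return np.vstack(merged)
```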
In addition, if the vehicle body shakes during actual data collection, unnecessary systematic errors are introduced that degrade the accuracy of target object boundary detection; the body shakes, for example, when the vehicle passes over an uneven road surface such as a speed bump. Therefore, in the present disclosure, for each acquisition time, shake parameters corresponding to a preset number of frames of point cloud data around that acquisition time may be obtained, and the point cloud data acquired at that acquisition time is deleted when the shake parameters indicate that the vehicle passed over a target road surface at that time, where the target road surface includes a convex road surface or a concave road surface.
The shake parameter may be pitch and/or roll information acquired through the vehicle's IMU (Inertial Measurement Unit). For each acquisition time, the shake parameters of the 5 frames of point cloud data before and the 5 frames after that time (i.e. a preset number of 10 frames) may be taken; when the standard deviation of these 10 frames' shake parameters is greater than or equal to a preset shake parameter threshold (e.g. 0.2), the vehicle is considered to have been shaking while passing over an uneven target road surface, and the 10 frames of point cloud data need to be deleted. This is only an example, and the present disclosure is not limited thereto.
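A sketch of this jitter check is given below, under the assumption that per-frame IMU pitch and roll readings are aligned with the point cloud frames; the window size and threshold mirror the 10-frame / 0.2 example above but remain configurable.

```python
import numpy as np

def frames_to_drop(pitch, roll, window=5, threshold=0.2):
    """Return the indices of frames that should be discarded because the body
    was shaking (e.g. driving over a speed bump). `pitch` and `roll` are
    per-frame IMU readings; `window` frames before and after each acquisition
    time are inspected, matching the 5-before / 5-after example above."""
    drop = set()
    for t in range(len(pitch)):
        lo, hi = max(0, t - window), min(len(pitch), t + window)
        if np.std(pitch[lo:hi]) >= threshold or np.std(roll[lo:hi]) >= threshold:
            drop.update(range(lo, hi))
    return sorted(drop)
```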
In step S202, for each acquisition time, according to the first point cloud data acquired at the acquisition time, a boundary point is marked on a target object in the environmental image to be marked acquired at the acquisition time, so as to obtain the sample environmental image.
The environment image to be marked acquired at an acquisition time is the bird's eye view obtained by stitching the environment images of all orientations acquired at that acquisition time.
It should be noted that, in an actual application scenario, the first point cloud data are obtained by accumulating multiple frames of point cloud data around each acquisition time, so laser noise points near the ground may be present in the first point cloud data. To improve the accuracy of marking the projection line of the target object perpendicular to the ground, the first point cloud data may therefore be denoised first so that these laser noise points do not affect the labeling result.
After the first point cloud data have been denoised, boundary points of the target object are marked in the environment image to be marked based on the first point cloud data according to the method shown in Fig. 3, so as to obtain the sample environment image.
Fig. 3 is a flowchart of a target detection method according to the embodiment shown in fig. 2, and as shown in fig. 3, step S202 includes the following sub-steps:
in step S2021, for each Of the image pickup apparatuses, the FOV (Field View) Of the image pickup apparatus and the target position point Of the image pickup apparatus in the first point cloud data are acquired.
As described above, the vehicle is provided with a plurality of image pickup devices around. In this step, the field angle corresponding to each image capturing device may be obtained, and the target position point corresponding to each image capturing device in the first point cloud data may be obtained.
The view angle corresponding to each image acquisition device can be preset, the target position point can be calibrated in advance based on different deployment positions of each image acquisition device on the vehicle, and therefore in the step, the target position point corresponding to each image acquisition device in the first point cloud data can be obtained directly according to a pre-calibration result.
The FOV of each image acquisition device is its angle of view, i.e. the central angle of a sector area whose vertex is the position point of the image acquisition device. The two sides of the sector area are the boundary lines of the maximum angular range that the image acquisition device can capture.
In step S2022, a point cloud boundary point of the target object is determined from the first point cloud data according to the field angle and the target position point corresponding to each of the image capturing devices.
In this step, second point cloud data lying within a preset height range may be determined from the first point cloud data, where the height of each point in the second point cloud data above the ground lies within the preset height range; then, for each image acquisition device, the point of the second point cloud data closest to the target position point in each preset direction within the field-of-view range is taken as a point cloud boundary point.
Since the present disclosure mainly labels the projection lines, perpendicular to the ground, of obstacles suspended above the ground, the first point cloud data need to be filtered to obtain the second point cloud data, in which every point belongs to an obstacle and lies within the preset height range. For example, points with height values between 0.2 and 1.5 meters may be selected from the first point cloud data collected by the lidar as the second point cloud data. This is only an illustration; the preset height range can be set arbitrarily according to actual requirements, which is not limited in this disclosure.
In addition, the image acquisition devices are deployed relatively low on the vehicle, around the body, whereas the lidar is usually mounted on the roof, high up and at the center of the body. The two sensors therefore have different fields of view: with this deployment the lidar can see more distant obstacles, while the image acquisition devices can only see nearby ones. Consequently, when determining the boundary points of the target object, point cloud data belonging to objects that are invisible to the cameras must be removed. In the present disclosure, for each image acquisition device, the point of the second point cloud data closest to the target position point in each preset direction within the field-of-view range may be taken as a point cloud boundary point.
For example, take the camera arranged on the left side of the vehicle body: the target position point corresponding to this camera in the second point cloud data is determined first, and the preset directions are a plurality of directions obtained by dividing the camera's field-of-view range at a preset angular interval. For each preset direction, with the target position point as the starting point, the point of a target object closest to the target position point along that direction is taken as a point cloud boundary point of the target object. In this way, ground-truth boundary points of the target objects around the vehicle can be determined from the point cloud data. This example is merely illustrative, and the present disclosure is not limited thereto.
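The following sketch combines the height filtering and per-direction nearest-point selection described above for a single camera; the camera yaw, angular bin width and 0.2–1.5 m height band are illustrative parameters, not values mandated by the disclosure.

```python
import numpy as np

def point_cloud_boundary(points, cam_xy, fov_deg, cam_yaw_deg,
                         h_range=(0.2, 1.5), angle_step_deg=1.0):
    """Select point-cloud boundary points for one camera: filter to the preset
    height band, keep only points inside the camera's field of view, then take
    the point closest to the camera in each angular bin (preset direction)."""
    pts = points[(points[:, 2] >= h_range[0]) & (points[:, 2] <= h_range[1])]
    rel = pts[:, :2] - cam_xy                            # offsets from the camera position
    ang = np.degrees(np.arctan2(rel[:, 1], rel[:, 0])) - cam_yaw_deg
    ang = (ang + 180.0) % 360.0 - 180.0                  # wrap to [-180, 180)
    in_fov = np.abs(ang) <= fov_deg / 2.0
    pts, rel, ang = pts[in_fov], rel[in_fov], ang[in_fov]
    dist = np.linalg.norm(rel, axis=1)
    bins = np.round(ang / angle_step_deg).astype(int)
    boundary = {}
    for b, d, p in zip(bins, dist, pts):
        if b not in boundary or d < boundary[b][0]:
            boundary[b] = (d, p)                         # keep the nearest point per direction
    return np.array([p for _, p in boundary.values()])
```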
In step S2023, a target boundary point of the target object in the environmental image to be annotated is determined according to the point cloud boundary point.
In this step, a preset conversion matrix may be obtained, where the preset conversion matrix characterizes the mapping relationship between each point in the point cloud data and each pixel in the environment image to be marked; the target boundary point corresponding to each point cloud boundary point in the environment image to be annotated is then determined according to the preset conversion matrix.
That is, for each point cloud boundary point of the target object, the corresponding target boundary point can be determined in the environment image to be marked based on the preset conversion matrix. The target boundary point is the boundary point of the target object to be labeled; for a suspended obstacle, it is the projection point, perpendicular to the ground, of the obstacle's boundary.
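As a hedged sketch of this projection step, the preset conversion matrix is treated below as a 3x4 projection from the lidar frame into pixel coordinates of the image to be annotated (for a BEV the perspective divide is trivial); points falling outside the image are discarded. The matrix layout is an assumption made for the sketch.

```python
import numpy as np

def project_boundary_points(points, T, image_shape):
    """Map 3-D point-cloud boundary points into pixel coordinates of the image
    to be annotated. `T` is the preset conversion matrix (here assumed to be a
    3x4 projection from the lidar frame to image pixels)."""
    homo = np.hstack([points, np.ones((len(points), 1))])   # Nx4 homogeneous points
    uvw = homo @ T.T                                         # Nx3
    uv = uvw[:, :2] / uvw[:, 2:3]                            # perspective divide
    h, w = image_shape[:2]
    keep = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    return uv[keep].astype(int)
```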
In step S2024, after the target boundary point is labeled in the environmental image to be labeled, the sample environmental image is obtained.
In step S203, the target detection model is obtained after model training is performed on the preset detection model according to the sample environment image.
In this way, by executing steps S201 to S203, the boundary of the target object in the two-dimensional environment image can be labeled based on the three-dimensional laser point cloud data to obtain the sample environment image, and the sample environment image can then be used as a target sample for training the preset detection model to obtain the target detection model.
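The disclosure does not prescribe a particular network or loss. The following is a minimal training-loop sketch assuming a segmentation-style preset detection model that predicts boundary pixels on the BEV and a dataset yielding (image, boundary-mask) pairs; it is written with PyTorch purely for concreteness.

```python
import torch
from torch.utils.data import DataLoader

def train_detector(model, dataset, epochs=10, lr=1e-4, device="cuda"):
    """Fine-tune a preset detection model on the auto-labelled sample images.
    The dataset is assumed to yield (bev_image, boundary_mask) pairs, where the
    mask marks the point-cloud-derived boundary pixels; the loss and network
    head are placeholders, not mandated by the patent."""
    model = model.to(device)
    loader = DataLoader(dataset, batch_size=8, shuffle=True)
    criterion = torch.nn.BCEWithLogitsLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for bev, mask in loader:
            bev, mask = bev.to(device), mask.to(device)
            loss = criterion(model(bev), mask)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```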
According to the above scheme, the boundary of the target object is labeled on the sample image of the driving environment using the point cloud data of that environment. Because three-dimensional point cloud data offer higher detection precision than two-dimensional image data, labeling the boundary of the target object based on point cloud data is more accurate than the related-art practice of determining the boundary through two-dimensional image recognition and then labeling it manually. Moreover, automatically labeling the boundary with three-dimensional point cloud data is markedly more efficient than manual labeling and saves labor cost. Furthermore, a target detection model trained on target samples with such high labeling accuracy greatly improves the accuracy of target object detection.
Fig. 4 is a block diagram of an object detection apparatus according to an exemplary embodiment, as shown in fig. 4, the apparatus including:
an acquisition module 401 configured to acquire a target environment image of a running environment in which the vehicle is currently located;
an identification module 402 configured to identify a boundary of a target object in the running environment by a target detection model from the target environment image;
the target detection model is a model which is obtained by training a target sample in advance, and the target sample comprises a sample environment image obtained after marking the boundary of the target object according to the point cloud data of the driving environment.
Optionally, fig. 5 is a block diagram of an object detection device according to the embodiment shown in fig. 4, and as shown in fig. 5, the device further includes:
a model training module 403 configured to pre-train to the target detection model by:
acquiring point cloud data of the running environment and an environment image to be marked according to a preset frequency;
for each acquisition time, marking boundary points of the target object in the environment image to be marked acquired at that acquisition time according to first point cloud data acquired at that acquisition time, so as to obtain the sample environment image;
and carrying out model training on a preset detection model according to the sample environment image to obtain the target detection model.
Optionally, a plurality of image acquisition devices are arranged around the vehicle, and the model training module 403 is configured to: acquire, for each image acquisition device, a field angle of the image acquisition device and a target position point of the image acquisition device in the first point cloud data; determine point cloud boundary points of the target object from the first point cloud data according to the field angle and the target position point respectively corresponding to each image acquisition device; determine a target boundary point of the target object in the environment image to be annotated according to the point cloud boundary points; and mark the target boundary point in the environment image to be marked to obtain the sample environment image.
Optionally, the model training module 403 is configured to: determine, from the first point cloud data, second point cloud data lying within a preset height range, where the height of each point in the second point cloud data above the ground lies within the preset height range; and for each image acquisition device, take, as the point cloud boundary points, the points of the second point cloud data closest to the target position point in each preset direction within the field-of-view range.
Optionally, the model training module 403 is configured to: obtain a preset conversion matrix, where the preset conversion matrix characterizes the mapping relationship between each point in the point cloud data and each pixel in the environment image to be annotated; and determine the target boundary point corresponding to each point cloud boundary point in the environment image to be annotated according to the preset conversion matrix.
Optionally, the model training module 403 is configured to: obtain, for each acquisition time, jitter parameters corresponding to a preset number of frames of point cloud data corresponding to that acquisition time; and delete the point cloud data acquired at the acquisition time when it is determined from the jitter parameters that the vehicle passes over a target road surface, where the target road surface includes a convex road surface or a concave road surface.
Optionally, the target environment image comprises a bird's eye view; the acquisition module 401 is configured to: acquire initial environment images of different orientations through a plurality of image acquisition devices arranged on the vehicle; and stitch the initial environment images corresponding to the orientations to obtain the bird's eye view.
The specific manner in which the various modules perform their operations in the apparatus of the above embodiments has been described in detail in the embodiments of the method and will not be elaborated here.
The present disclosure also provides a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the object detection method provided by the present disclosure.
FIG. 6 is a block diagram of a vehicle, according to an exemplary embodiment. For example, vehicle 600 may be a hybrid vehicle, but may also be a non-hybrid vehicle, an electric vehicle, a fuel cell vehicle, or other type of vehicle. The vehicle 600 may be an autonomous vehicle, a semi-autonomous vehicle, or a non-autonomous vehicle.
Referring to fig. 6, a vehicle 600 may include various subsystems, such as an infotainment system 610, a perception system 620, a decision control system 630, a drive system 640, and a computing platform 650. The vehicle 600 may also include more or fewer subsystems, and each subsystem may include multiple components. In addition, the subsystems and components of the vehicle 600 may be interconnected by wired or wireless means.
In some embodiments, the infotainment system 610 may include a communication system, an entertainment system, a navigation system, and the like.
The perception system 620 may include several sensors for sensing information about the environment surrounding the vehicle 600. For example, the perception system 620 may include a global positioning system (which may be a GPS system, a BeiDou system, or another positioning system), an inertial measurement unit (IMU), a lidar, a millimeter-wave radar, an ultrasonic radar, and a camera device.
Decision control system 630 may include a computing system, a vehicle controller, a steering system, a throttle, and a braking system.
The drive system 640 may include components that provide powered movement of the vehicle 600. In one embodiment, the drive system 640 may include an engine, an energy source, a transmission, and wheels. The engine may be one or a combination of an internal combustion engine, an electric motor, an air compression engine. The engine is capable of converting energy provided by the energy source into mechanical energy.
Some or all of the functions of the vehicle 600 are controlled by the computing platform 650. The computing platform 650 may include at least one processor 651 and memory 652, the processor 651 may execute instructions 653 stored in the memory 652.
The processor 651 may be any conventional processor, such as a commercially available CPU. The processor may also include, for example, a graphics processing unit (GPU), a field-programmable gate array (FPGA), a system on chip (SoC), an application-specific integrated circuit (ASIC), or a combination thereof.
The memory 652 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
In addition to instructions 653, memory 652 may store data such as road maps, route information, vehicle location, direction, speed, and the like. The data stored by memory 652 may be used by computing platform 650.
In an embodiment of the present disclosure, the processor 651 may execute instructions 653 to perform all or part of the steps of the target detection method described above.
In another exemplary embodiment, a computer program product is also provided, comprising a computer program executable by a programmable apparatus, the computer program having code portions for performing the above-described object detection method when executed by the programmable apparatus.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following the general principles of the disclosure and including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof.

Claims (8)

1. A method of detecting an object, comprising:
acquiring a target environment image of a current running environment of a vehicle;
identifying the boundary of a target object in the running environment through a target detection model according to the target environment image;
the target detection model is a model which is obtained by training a target sample in advance, wherein the target sample comprises a sample environment image obtained after marking the boundary of the target object according to the point cloud data of the driving environment;
the target detection model is obtained by pre-training in the following way:
acquiring point cloud data of the running environment and an environment image to be marked according to a preset frequency;
according to each acquisition time, marking boundary points of target objects in the environment image to be marked acquired at the acquisition time according to first point cloud data acquired at the acquisition time, and obtaining the sample environment image;
performing model training on a preset detection model according to the sample environment image to obtain the target detection model;
the method for obtaining the sample environment image comprises the steps of:
acquiring, for each image acquisition device, a field angle of the image acquisition device and a target position point of the image acquisition device in the first point cloud data;
determining point cloud boundary points of the target object from the first point cloud data according to the field angle and the target position point which are respectively corresponding to each image acquisition device;
determining a target boundary point of the target object in the environment image to be annotated according to the point cloud boundary point;
and marking the target boundary point in the environment image to be marked to obtain the sample environment image.
2. The method according to claim 1, wherein the determining the point cloud boundary point of the target object from the first point cloud data according to the field angle and the target position point respectively corresponding to each of the image capturing devices includes:
determining second point cloud data positioned in a preset height range from the first point cloud data, wherein the height between each point in the second point cloud data and the ground is positioned in the preset height range;
and aiming at each image acquisition device, taking the point closest to the target position point in the point cloud data in each preset direction in the view angle range in the second point cloud data as the point cloud boundary point.
3. The method according to claim 1, wherein determining the target boundary point of the target object in the environmental image to be annotated according to the point cloud boundary point comprises:
acquiring a preset conversion matrix, wherein the preset conversion matrix represents the mapping relation between each point in the point cloud data and each pixel in the environment image to be marked;
and determining the corresponding target boundary point of each point cloud boundary point in the environment image to be annotated according to the preset conversion matrix.
4. The method according to claim 1, wherein the method further comprises:
aiming at each acquisition time, acquiring jitter parameters corresponding to point cloud data of a preset frame number corresponding to the acquisition time;
and deleting the point cloud data acquired at the acquisition time under the condition that the vehicle passes through a target road surface according to the jitter parameters, wherein the target road surface comprises a convex road surface or a concave road surface.
5. The method of any of claims 1-4, wherein the target environmental image comprises a bird's eye view BEV; the obtaining the target environment image of the current running environment of the vehicle comprises the following steps:
respectively acquiring initial environment images in different directions through a plurality of image acquisition devices arranged on the vehicle;
and splicing the initial environment images corresponding to all the directions to obtain the BEV.
6. An object detection apparatus, comprising:
the acquisition module is configured to acquire a target environment image of the current running environment of the vehicle;
an identification module configured to identify a boundary of a target object in the driving environment through a target detection model based on the target environment image;
the target detection model is a model which is obtained by training a target sample in advance, wherein the target sample comprises a sample environment image obtained after marking the boundary of the target object according to the point cloud data of the driving environment;
the apparatus further comprises:
a model training module configured to pre-train to the target detection model by:
acquiring point cloud data of the running environment and an environment image to be marked according to a preset frequency;
according to each acquisition time, marking boundary points of target objects in the environment image to be marked acquired at the acquisition time according to first point cloud data acquired at the acquisition time, and obtaining the sample environment image;
performing model training on a preset detection model according to the sample environment image to obtain the target detection model;
the vehicle is provided with a plurality of image acquisition devices in a surrounding manner, and the model training module is configured to: acquiring, for each image acquisition device, a field angle of the image acquisition device and a target position point of the image acquisition device in the first point cloud data; determining point cloud boundary points of the target object from the first point cloud data according to the field angle and the target position point which are respectively corresponding to each image acquisition device; determining a target boundary point of the target object in the environment image to be annotated according to the point cloud boundary point; and marking the target boundary point in the environment image to be marked to obtain the sample environment image.
7. A vehicle, characterized by comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the steps of the method of any of claims 1-5.
8. A computer readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the steps of the method of any of claims 1-5.
CN202310834347.5A 2023-07-07 2023-07-07 Target detection method, target detection device, storage medium and vehicle Active CN116563812B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310834347.5A CN116563812B (en) 2023-07-07 2023-07-07 Target detection method, target detection device, storage medium and vehicle


Publications (2)

Publication Number Publication Date
CN116563812A CN116563812A (en) 2023-08-08
CN116563812B (en) 2023-11-14

Family

ID=87493218

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310834347.5A Active CN116563812B (en) 2023-07-07 2023-07-07 Target detection method, target detection device, storage medium and vehicle

Country Status (1)

Country Link
CN (1) CN116563812B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102310613B1 (en) * 2021-05-14 2021-10-12 주식회사 인피닉 Method for tracking object in continuous 2D image, and computer program recorded on record-medium for executing method therefor
CN113674287A (en) * 2021-09-03 2021-11-19 阿波罗智能技术(北京)有限公司 High-precision map drawing method, device, equipment and storage medium
CN115861741A (en) * 2023-03-01 2023-03-28 小米汽车科技有限公司 Target calibration method and device, electronic equipment, storage medium and vehicle
CN116385528A (en) * 2023-03-28 2023-07-04 小米汽车科技有限公司 Method and device for generating annotation information, electronic equipment, vehicle and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023002093A1 (en) * 2021-07-23 2023-01-26 Sensible 4 Oy Systems and methods for determining road traversability using real time data and a trained model


Also Published As

Publication number Publication date
CN116563812A (en) 2023-08-08


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant