CN115576990A - Method, device, equipment and medium for evaluating visual truth value data and perception data - Google Patents
- Publication number
- CN115576990A (application CN202211216241.0A)
- Authority
- CN
- China
- Prior art keywords
- data
- perception
- truth
- value
- true value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24553—Query execution of query operations
- G06F16/24558—Binary matching operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2474—Sequence data queries, e.g. querying versioned data
Abstract
The invention provides a method for evaluating visual truth data and perception data, which comprises the following steps: acquiring video data and truth data comprising truth objects; inputting the video data into a pre-constructed perception algorithm model to obtain perception object data; aligning the truth data and the perception object data; setting a plurality of time periods, segmenting the aligned truth data and perception object data based on the time periods, and matching the truth object and the perception object belonging to the same time period to obtain a matched object, wherein the matched object is a perception object successfully matched with the truth object; associating the truth object with the perception object successfully matched with it to form an association pair; and evaluating the perception algorithm model based on the association pairs. The method effectively addresses the association between the truth object and the perception object, making the association more accurate.
Description
Technical Field
The application relates to the technical field of automatic driving of automobiles, in particular to a method, a device, equipment and a medium for evaluating visual truth value data and perception data.
Background
A large number of tests are required in the production of automobiles. In particular, before leaving the factory, a vehicle must complete various prescribed tests under the supervision of a professional organization, and may be sold only after the test results meet regulatory requirements. Each major automobile manufacturer therefore pays particular attention to safety and environmental tests before a vehicle leaves the factory: before testing under the supervision of a professional organization, the manufacturer first tests the vehicle itself, and sends it to the organization only once it is confident the vehicle will pass. At present, intelligent driving automobiles are in a competitive research and development stage among automobile manufacturers, so the accuracy of this pre-testing is very important, and when a test platform site is built, the accuracy of test results produced there must be ensured. If the results produced by the test platform are inaccurate, a vehicle that should pass may be judged to fail, or one that should fail may be judged to pass, and the pre-test becomes meaningless.
Disclosure of Invention
In view of the above-mentioned shortcomings in the prior art, the present invention provides a method, an apparatus, a device and a medium for evaluating visual truth data and perception data, so as to solve at least one of the above-mentioned technical problems.
The invention provides a method for evaluating visual truth data and perception data, which comprises the following steps:
acquiring video data and truth value data comprising truth value objects;
inputting the video data into a pre-constructed perception algorithm model to obtain perception object data including a perception object output by the perception algorithm model;
aligning the truth data and the perception object data;
setting a plurality of time periods, and segmenting the aligned truth value data and the perception object data based on the time periods to obtain a plurality of first data segments corresponding to the truth value data and a plurality of second data segments corresponding to the perception object data; the plurality of first data segments and the plurality of second data segments are in one-to-one correspondence according to a time sequence order;
matching the truth value object and the perception object belonging to the same time period to obtain a matched object, wherein the matched object is the perception object which is successfully matched with the truth value object;
associating the true value object with a perception object successfully matched with the true value object, and forming an associated pair;
and evaluating the perception algorithm model based on the association.
In an embodiment of the present invention, matching a true value object and a perception object belonging to a same time period includes:
calculating a running track distance error between a true value object and a perception object to obtain a plurality of error values;
obtaining an average value of a plurality of error values;
and comparing the average value with a preset error threshold value to complete the matching of the true value object and the perception object.
In an embodiment of the invention, when the error value is smaller than the error threshold, the true object and the perception object are successfully matched.
In an embodiment of the present invention, if there are a plurality of sensing objects successfully matched with the true object and timestamps of the plurality of sensing objects are different, the method further includes:
determining an association priority according to the magnitude of the error value, wherein the smaller the error value is, the larger the association priority is, and the association priority is used for representing the association sequence;
and sequentially associating a plurality of perception objects successfully matched with the truth object based on the association priority.
In an embodiment of the invention, before associating the sensing objects matching with the truth object, the method further includes:
and setting a plurality of sampling time points, and sampling a first data segment and a second data segment belonging to the same time period based on the plurality of sampling time points to obtain a true value object and a plurality of perception objects which are successfully matched.
In an embodiment of the present invention, if there are a plurality of sensing objects successfully matched with the true object and the timestamps of the plurality of sensing objects are the same, the method further includes:
obtaining error values between the truth object and each of the plurality of perception objects successfully matched with the truth object;
comparing the error values pairwise to determine a minimum error value;
and when the truth object is associated with a perception object, associating the perception object corresponding to the minimum error value with the truth object.
In one embodiment of the present invention, the error value is calculated by equation (1), i.e. the mean Euclidean distance between the truth and perception trajectories over the segment time points:

$$\mathrm{error}=\frac{1}{n}\sum_{i=1}^{n}\sqrt{(XD_i-XG_i)^2+(YD_i-YG_i)^2}\tag{1}$$

where XD denotes the value of the perception object in the X direction in the vehicle coordinate system, XG denotes the value of the truth object in the X direction in the vehicle coordinate system, YD denotes the value of the perception object in the Y direction in the vehicle coordinate system, YG denotes the value of the truth object in the Y direction in the vehicle coordinate system, n denotes the number of segment time points, and i denotes the i-th segment time point.
In an embodiment of the present invention, the truth data is acquired by a laser radar, a millimeter wave radar, or an ultrasonic radar.
The invention provides an evaluating device for visual truth data and perception data, which comprises:
the data acquisition module is used for acquiring video data and truth value data comprising truth value objects;
the perception data acquisition module is used for inputting the video data into a pre-constructed perception algorithm model to obtain perception object data including a perception object and output by the perception algorithm model;
an alignment module, configured to perform time alignment on the truth data and the perception object data;
the segmentation module is used for setting a plurality of time periods, and segmenting the aligned true value data and the aligned perception object data based on the time periods to obtain a plurality of first data segments corresponding to the true value data and a plurality of second data segments corresponding to the perception object data; the plurality of first data segments and the plurality of second data segments are in one-to-one correspondence according to a time sequence order;
the matching module is used for matching the true value object and the perception object belonging to the same time period to obtain a matching object, wherein the matching object is the perception object which is successfully matched with the true value object;
the association module is used for associating the true value object with a perception object successfully matched with the true value object and forming an association pair;
and the evaluating module is used for evaluating the perception algorithm model based on the association.
The invention provides an electronic device, comprising:
one or more processors;
the storage device is used for storing one or more programs, and when the one or more programs are executed by the one or more processors, the electronic equipment is enabled to realize the steps of the visual truth data and perception data evaluating method.
The invention provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor of a computer, the computer is caused to execute the steps of the above method for evaluating visual truth data and perception data.
The present invention provides a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the above method for evaluating visual truth data and perception data.
The invention has the beneficial effects that: the method for evaluating visual truth data and perception data comprises the following steps: acquiring video data and truth data comprising truth objects; inputting the video data into a pre-constructed perception algorithm model to obtain perception object data, output by the perception algorithm model, comprising perception objects; aligning the truth data and the perception object data; setting a plurality of time periods, and segmenting the aligned truth data and perception object data based on the time periods to obtain a plurality of first data segments corresponding to the truth data and a plurality of second data segments corresponding to the perception object data, the first data segments and the second data segments corresponding one-to-one in time order; matching the truth object and the perception object belonging to the same time period to obtain a matched object, the matched object being a perception object successfully matched with the truth object; associating the truth object with the perception object successfully matched with it to form an association pair; and evaluating the perception algorithm model based on the association pairs. The method effectively addresses the association between the truth object and the perception object, making the association more accurate.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application. It is obvious that the drawings in the following description are only some embodiments of the application, and that for a person skilled in the art, other drawings can be derived from them without inventive effort. In the drawings:
fig. 1 is a schematic diagram of an implementation environment of an evaluation method for visual truth data and perceptual data according to an exemplary embodiment of the present application;
FIG. 2 is a flow chart illustrating a method for evaluating visual truth data and perception data according to an exemplary embodiment of the present application;
FIG. 3 is a schematic diagram of a true object trajectory and a sense object trajectory within a coordinate system as shown in an exemplary embodiment of the present application;
FIG. 4 is a schematic diagram illustrating error value calculation in an exemplary embodiment of the present application;
FIG. 5 illustrates mapping truth and perception associated pairs in time slices based on a minimum-error approach according to an exemplary embodiment of the present application;
FIG. 6 is a block diagram of a visual truth data and perception data evaluating device according to an exemplary embodiment of the present application;
FIG. 7 illustrates a schematic structural diagram of a computer system suitable for use in implementing an electronic device of an embodiment of the present application;
wherein the visual truth data may be denoted GT (ground truth) and the perception data DUT (device under test).
Detailed Description
Other advantages and effects of the present invention will become apparent to those skilled in the art from the disclosure herein, wherein the embodiments of the present invention are described in detail with reference to the accompanying drawings and preferred embodiments. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be understood that the preferred embodiments are illustrative of the invention only and are not limiting upon the scope of the invention.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the components related to the present invention are only shown in the drawings rather than drawn according to the number, shape and size of the components in actual implementation, and the type, quantity and proportion of the components in actual implementation may be changed freely, and the layout of the components may be more complicated.
In the following description, numerous details are set forth to provide a more thorough explanation of embodiments of the present invention, however, it will be apparent to one skilled in the art that embodiments of the present invention may be practiced without these specific details, and in other embodiments, well-known structures and devices are shown in block diagram form, rather than in detail, in order to avoid obscuring embodiments of the present invention.
FIG. 1 is a schematic diagram of an exemplary implementation environment for evaluating visual truth data and perception data according to the present application. Referring to fig. 1, the implementation environment includes a terminal device 101 and a server 102, which communicate with each other through a wired or wireless network. The terminal device obtains video data and truth data comprising a truth object and transmits the data to the server; the server inputs the video data into a pre-constructed perception algorithm model to obtain perception object data, output by the perception algorithm model, comprising a perception object; aligns the truth data and the perception object data; sets a plurality of time periods and segments the aligned truth data and perception object data based on the time periods to obtain a plurality of first data segments corresponding to the truth data and a plurality of second data segments corresponding to the perception object data, the first data segments and the second data segments corresponding one-to-one in time order; matches the truth object and the perception object belonging to the same time period to obtain a matched object, the matched object being a perception object successfully matched with the truth object; associates the truth object with the perception object successfully matched with it to form an association pair; and evaluates the perception algorithm model based on the association pairs.
It should be understood that the number of terminal devices 101 and servers 102 in fig. 1 is merely illustrative. There may be any number of terminal devices 101 and servers 102, as desired.
The terminal device 101 corresponds to a client, and may be any electronic device having a user input interface, including but not limited to a smart phone, a tablet, a notebook computer, a vehicle-mounted computer, and the like, where the user input interface includes but not limited to a touch screen, a keyboard, a physical key, an audio pickup device, and the like.
The server 102 corresponds to the server side and may be a server providing various services: an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms, which is not limited herein.
The terminal device 101 may communicate with the server 102 through a wireless network such as 3G (third generation mobile information technology), 4G (fourth generation mobile information technology), 5G (fifth generation mobile information technology), and the like, which is not limited herein.
Intelligent technology is changing by the day, systems are becoming ever more complex, artificial intelligence and automotive technology are developing rapidly, and autonomous vehicles are gradually moving from the experimental demonstration stage to the market. Visual perception outside the vehicle is very important for automatic driving, and visually perceived obstacles have an important influence on the driving decisions of an autonomous vehicle. In the visual perception process, image information is collected by a camera and labeled, a perception algorithm model is trained on it, the image information is processed by the algorithm to output the visually perceived obstacles of the automatic driving scene, the result is then fused with the information of other sensors, and detailed perception data are output. The existing means of verifying the output of visually perceived obstacles and their attributes in an automatic driving scene is mainly road testing with a truth vehicle: environmental data are collected by laser radar, millimeter wave radar, ultrasonic radar and the like, the collected data are classified and labeled, the collected video is run through the perception algorithm model, and the truth data are compared with the perception data to judge the accuracy of the model. However, effectively associating the obstacle objects output by the truth system with the perceived objects is difficult: the obstacle objects are numerous, their behaviors are complex, and the associations are hard to establish. The present application fits the data through a specific algorithm and computation scheme, compares the associated pairs with an evaluation tool, feeds low-accuracy frames into a long-tail library for retraining and verification, and continuously improves the accuracy of the perception algorithm model through a continuous-integration continuous-testing platform.
Because there is a problem in the prior art that the accuracy of association between visual truth data and perception data is not high enough, in order to solve these problems, embodiments of the present application provide a method for evaluating visual truth data and perception data, an apparatus for evaluating visual truth data and perception data, an electronic device, and a computer-readable storage medium, which will be described in detail below.
Referring to fig. 2, fig. 2 is a flowchart illustrating a method for evaluating visual truth data and perception data according to an exemplary embodiment of the present application. The method may be applied to the implementation environment shown in fig. 1 and specifically executed by the terminal device 101 in the implementation environment. It should be understood that the method may also be applied to other exemplary implementation environments and specifically executed by devices in other implementation environments, and the embodiment does not limit the implementation environment to which the method is applied.
As shown in fig. 2, the method for evaluating visual truth data and perception data includes at least steps S210 to S270, which are described in detail below:
step S210, acquiring video data and truth value data including a truth value object;
step S220, inputting the video data into a pre-constructed perception algorithm model to obtain perception object data including perception objects output by the perception algorithm model;
step S230, aligning the truth data and the perception object data;
step S240, setting a plurality of time periods, and segmenting the aligned true value data and perception object data based on the time periods to obtain a plurality of first data segments corresponding to the true value data and a plurality of second data segments corresponding to the perception object data; the plurality of first data segments and the plurality of second data segments correspond to each other one by one according to a time sequence order;
step S250, matching the truth value object and the perception object belonging to the same time period to obtain a matching object, wherein the matching object is the perception object successfully matched with the truth value object;
step S260, associating the true value object with the perception object successfully matched with the true value object, and forming an associated pair;
and step S270, evaluating the perception algorithm model based on the association.
The method effectively addresses the association between the truth object and the perception object, making the association more accurate; the overall flow is sketched below.
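For illustration only, the following minimal Python sketch mirrors the flow of steps S210 to S270; every function here is a placeholder supplied by the caller, an assumption of this sketch rather than part of the disclosed system.

```python
from typing import Callable, List, Sequence, Tuple

def evaluate_pipeline(
    acquire: Callable[[], Tuple[object, object]],                # S210: video + truth data
    perceive: Callable[[object], object],                        # S220: perception model
    align: Callable[[object, object], Tuple[object, object]],    # S230: time alignment
    segment: Callable[[object, object], Sequence[Tuple[object, object]]],  # S240: paired segments
    match_and_pair: Callable[[object, object], List[tuple]],     # S250-S260: association pairs
    score: Callable[[List[tuple]], float],                       # S270: model evaluation
) -> float:
    video, truth = acquire()
    dut = perceive(video)
    truth, dut = align(truth, dut)
    pairs: List[tuple] = []
    for gt_seg, dut_seg in segment(truth, dut):  # segments correspond one-to-one in time order
        pairs += match_and_pair(gt_seg, dut_seg)
    return score(pairs)
```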
The respective steps are explained in detail below.
In step S210, video data and truth data including a truth object are acquired;
It should be noted that the video data is collected by the camera on the truth vehicle while the truth vehicle runs along the planned collection path.
The truth-value video data can be obtained by performing road test acquisition through a truth-value system on a truth-value vehicle according to a sampling plan. The truth value object is included in the truth value video data, and the truth value object can be regarded as a real existing object which can be used as a reference standard. The performance of the perception algorithm model can be evaluated by combining the truth data with the perception data.
The main metadata of the truth object information output by the truth system includes a target timestamp (Obj_stamp_sec), a frame timestamp (stamp_sec), an X-direction distance (distance_X), and a Y-direction distance (distance_Y).
In an embodiment, the truth data may be acquired by a laser radar, a millimeter wave radar, or an ultrasonic radar disposed on the truth vehicle.
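As a rough illustration, such a truth-object record could be represented as follows; the class itself and the obj_id field are assumptions of this sketch, while the other field names follow the metadata listed above.

```python
from dataclasses import dataclass

@dataclass
class TruthObjectSample:
    obj_stamp_sec: float   # target timestamp (Obj_stamp_sec)
    stamp_sec: float       # frame timestamp (stamp_sec)
    distance_x: float      # X-direction distance (distance_X) in the vehicle frame
    distance_y: float      # Y-direction distance (distance_Y) in the vehicle frame
    obj_id: int = 0        # track identifier, assumed here for association
```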
In step S220, inputting the video data into a pre-constructed perceptual algorithm model to obtain perceptual object data including a perceptual object output by the perceptual algorithm model;
the perception algorithm model can be a neural network-based model, and the perception algorithm model is trained by acquiring image information through a camera, labeling the image information, putting the image information into an initial neural network, and training the initial neural network to finally obtain a qualified perception algorithm model. The perception algorithm model is used for carrying out algorithm processing on the image information and outputting a visual perception object of the automatic driving scene. The sensing object may be an obstacle, a pedestrian, a vehicle, or the like.
The perception algorithm model can be arranged in a perception system in advance, then video data collected by the truth vehicle are input into the perception algorithm model, and a result output by the perception algorithm model is used as perception object data.
In step S230, time-aligning the truth data and the perception object data;
It should be noted that the truth data and the perception object data are both collected by the truth vehicle, so in theory they carry the same timestamps. In practice, however, the truth data is acquired by the laser radar, millimeter wave radar or ultrasonic radar on the truth vehicle, while the perception object data is derived from the camera on the truth vehicle, so the timestamps of the two actually acquired streams differ and an error exists between them. Therefore, before the subsequent analysis, the truth data and the perception object data need to be aligned so that they share the same timestamps. Specifically, a linear interpolation algorithm may be employed to align the truth data and the perception object data.
Since the truth data and the perception object data have the same timestamps after alignment, the metadata of the perception object information output by the perception system also includes the target timestamp (Obj_stamp_sec), the frame timestamp (stamp_sec), the X-direction distance (distance_X), and the Y-direction distance (distance_Y). As shown in fig. 3, fig. 3 is a schematic diagram of GT object trajectories and DUT object trajectories within a coordinate system.
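A minimal sketch of the alignment in step S230, assuming each track is stored as parallel arrays of timestamps and X/Y distances; it resamples the perception (DUT) track onto the truth (GT) timestamps with linear interpolation so that both share one timebase.

```python
import numpy as np

def align_to_gt(gt_t: np.ndarray, dut_t: np.ndarray,
                dut_x: np.ndarray, dut_y: np.ndarray):
    """Linearly interpolate the DUT track onto the GT timestamps."""
    x_aligned = np.interp(gt_t, dut_t, dut_x)  # DUT X-distance at GT times
    y_aligned = np.interp(gt_t, dut_t, dut_y)  # DUT Y-distance at GT times
    return x_aligned, y_aligned
```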
In step S240, setting a plurality of time periods, and segmenting the aligned true value data and perception object data based on the time periods to obtain a plurality of first data segments corresponding to the true value data and a plurality of second data segments corresponding to the perception object data; the plurality of first data segments and the plurality of second data segments correspond to each other one by one according to a time sequence order;
the time period may be understood as a time range, and after segmenting the truth value video data and the perceptual object data based on a plurality of time periods, a plurality of first data segments corresponding to the truth value video data and a plurality of second data segments corresponding to the perceptual object data may be obtained. It should be noted that the first data segments of the true value video data correspond to the second data segments of the perceptron data one-to-one. For example, the truth video data is divided into 10 data segments, which are respectively named as data segment 1, data segment 2, data segment 11, data segment 21, data segment 1, data segment n1, and the perception object data is divided into 10 data segments, which are respectively named as data segment 11, data segment 21, data segment n, and data segment n1, then data segment 1 corresponds to data segment 11, data segment 2 corresponds to data segment 21, data segment n corresponds to data segment n1, and it should be further noted that in the truth video data and the perception object data, corresponding data segments have the same time stamp, that is, data segment 1 and data segment 11 have the same data stamp.
When the real-valued video data and the perception object data are segmented, the segmentation can be performed at fixed time intervals, or can be performed at any time intervals as long as the corresponding time periods are kept to have the same time stamp.
Taking the example of segmenting the truth value video data, the segmenting can be performed at intervals of 5min, 8min, and the like. For example, the data segment 1 is data within 0-5 min, the data segment 2 is data within 6-10 min, and the data segment 3 is data within 11-15 min.
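A sketch of the segmentation in step S240 under the fixed-interval variant described above; both aligned streams are cut on the same period boundaries, so segment i of the truth data corresponds to segment i of the perception data. The five-minute boundaries are just the example values from the text.

```python
def segment_by_period(timestamps, boundaries):
    """For each (t_start, t_end) period, return the sample indices inside it."""
    return [
        [k for k, t in enumerate(timestamps) if t_start <= t < t_end]
        for t_start, t_end in boundaries
    ]

# Example: fixed 5-minute periods over a 15-minute recording (seconds).
bounds = [(0, 300), (300, 600), (600, 900)]
gt_segments = segment_by_period([0, 150, 400, 650, 899], bounds)
# -> [[0, 1], [2], [3, 4]]
```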
In step S250, matching the true value object and the perception object belonging to the same time period to obtain a matching object, where the matching object is a perception object successfully matched with the true value object;
the objective of this step is to find a sensing object that coincides with the true value object, and therefore, the true value object and the sensing object in the same time period need to be matched, and the matching result includes matching success or matching failure.
The term "overlap" as used herein means overlap in distance, and for example, the closer the distance, the higher the degree of overlap, and the smaller the distance, the lower the degree of overlap. Therefore, in order to find a perception object coinciding or matching with a truth object, it is necessary to match the truth object and the perception object belonging to the same first segment time point, specifically, the method includes: calculating a running track distance error between a true value object and a perception object to obtain a plurality of error values; obtaining an average value of a plurality of error values; and comparing the average value with a preset error threshold value to complete the matching of the true value object and the perception object.
After calculating the running-track distance error between each perception object and the truth object and obtaining the average error value, the average error value is compared with the preset error threshold: if the average error value is smaller than the error threshold, the truth object and the perception object are matched successfully; otherwise the matching fails. The preset error threshold may be set by a person skilled in the art according to the actual application scenario and is not further limited herein.
In one embodiment, the error value is calculated by equation (1), the mean Euclidean distance between the truth and perception trajectories over the segment time points:

$$\mathrm{error}=\frac{1}{n}\sum_{i=1}^{n}\sqrt{(XD_i-XG_i)^2+(YD_i-YG_i)^2}\tag{1}$$

where XD denotes the value of the perception object in the X direction in the vehicle coordinate system, XG denotes the value of the truth object in the X direction in the vehicle coordinate system, YD denotes the value of the perception object in the Y direction in the vehicle coordinate system, YG denotes the value of the truth object in the Y direction in the vehicle coordinate system, n denotes the number of segment time points, and i denotes the i-th segment time point.
The Y-direction value can also be understood as the lateral position and the X-direction value as the longitudinal position.
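A sketch of this matching test, implementing equation (1) as the mean Euclidean distance between the two tracks over the n segment time points; the 2.0 m default is only an assumed placeholder for the preset error threshold.

```python
import math

def trajectory_error(xg, yg, xd, yd):
    """Equation (1): mean Euclidean GT-DUT distance over the n segment time points."""
    n = len(xg)
    return sum(math.hypot(xd[i] - xg[i], yd[i] - yg[i]) for i in range(n)) / n

def is_match(xg, yg, xd, yd, threshold=2.0):
    """Matching succeeds when the average error is below the preset threshold."""
    return trajectory_error(xg, yg, xd, yd) < threshold
```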
In step S260, associating the true value object with a perception object successfully matched with the true value object, and forming an association pair; in step S270, the perception algorithm model is evaluated based on the association.
After the association pair is generated, the association pair can be input into an evaluation system to evaluate the perception accuracy of the perception algorithm model.
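The patent does not fix a particular accuracy metric, so the following is only one plausible scoring sketch: counting association pairs against the total numbers of truth and perception objects.

```python
def score_model(num_pairs: int, num_gt_objects: int, num_dut_objects: int) -> dict:
    """Toy accuracy figures from association counts (an assumed metric)."""
    recall = num_pairs / num_gt_objects if num_gt_objects else 0.0       # GT objects found
    precision = num_pairs / num_dut_objects if num_dut_objects else 0.0  # DUT objects confirmed
    return {"precision": precision, "recall": recall}
```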
The invention effectively addresses the association between the truth object and the perception object and integrates with a continuous-integration continuous-testing platform, thereby realizing continuous evaluation testing and continuously improving the accuracy of the perception algorithm model.
It should be noted that, in the sensing process, a case that a plurality of sensing objects are associated with the same true object may occur, and a classification comparison is required for the case.
In the first case, one real object may be perceived as two objects: for example, when the true object is occluded by an object in front of it, the radar may still recognize a single GT truth object, while the perception system may recognize two objects at different times; when such a result occurs, the time needs to be further divided and compared. In the second case, if two perception objects are associated with the same truth object in the same time slot, the two perception objects are sorted by distance-fitting error, the perception object with the minimum error is associated with the truth object, and the other perception object is left unassociated and placed in an unassociated object list. Finally, the associated objects are fed into the evaluation system for evaluation of the perception accuracy.
In one embodiment, if there are multiple sensing objects successfully matched with the true object and the timestamps of the multiple sensing objects are not the same, the method further includes: determining an association priority according to the magnitude of the error value, wherein the smaller the error value is, the larger the association priority is, and the association priority is used for representing the association sequence; and sequentially associating a plurality of perception objects successfully matched with the truth object based on the association priority.
The smaller the error value, the higher the association priority and the earlier the perception object is associated; the larger the error value, the lower the association priority and the later it is ranked in the association order.
As shown in FIG. 4, taking gt_1: [dut_1, dut_2] as an example, gt_1 is the track of the truth object and dut_1, dut_2 are the tracks of the two perception objects associated with gt_1.
First, the candidates are ranked by error value. If the error value between dut_1 and gt_1 is less than that between dut_2 and gt_1, the time segment of dut_1, dut_1[start_time, end_time], is preferentially used for the gt_1 association, and the remaining time segment of gt_1 is then associated with dut_2; if the error value between dut_2 and gt_1 is less than that between dut_1 and gt_1, the time segment dut_2[start_time, end_time] is preferentially used for the gt_1 association, and the remaining time segment of gt_1 is associated with dut_1.
In FIG. 4, dut_1 has the smallest error with gt_1, so it is associated first and occupies as much of gt_1 as possible; dut_2 then occupies only the portion of gt_1 not occupied by dut_1. The resulting association pairs are gt_1: [dut_1] within the dut_1[start_time, end_time] period, and gt_1: [dut_2] within the remaining dut_2[start_time, end_time] period.
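A minimal sketch of this priority rule: candidate DUT tracks are sorted by ascending error to gt_1, and each in turn claims whatever part of the gt_1 timeline is still unoccupied. Representing occupancy as (start, end) intervals is an assumption of this sketch.

```python
def associate_by_priority(gt_span, candidates):
    """gt_span: (t0, t1); candidates: list of (error, (start, end), dut_name)."""
    pairs, free = [], [gt_span]
    for _err, (s, e), name in sorted(candidates):  # smallest error first
        remaining = []
        for f0, f1 in free:
            lo, hi = max(f0, s), min(f1, e)
            if lo < hi:                      # overlap: associate it and split the gap
                pairs.append((name, (lo, hi)))
                if f0 < lo:
                    remaining.append((f0, lo))
                if hi < f1:
                    remaining.append((hi, f1))
            else:                            # no overlap: the interval stays free
                remaining.append((f0, f1))
        free = remaining
    return pairs

# With dut_1 the lower-error track, dut_2 only gets the uncovered remainder:
print(associate_by_priority((0, 100), [(0.4, (10, 60), "dut_1"),
                                       (0.9, (0, 100), "dut_2")]))
# [('dut_1', (10, 60)), ('dut_2', (0, 10)), ('dut_2', (60, 100))]
```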
In one embodiment, before associating a plurality of perception objects matching the truth object with the truth object, the method further comprises:
and setting a plurality of sampling time points, and sampling a first data segment and a second data segment which belong to the same time segment based on the plurality of sampling time points to obtain a true value object and a plurality of perception objects which are successfully matched.
Taking the sampling of the truth video data as an example, sampling may be performed frame by frame at 33 ms intervals; as shown in fig. 5, the segmentation is likewise performed frame by frame at 33 ms, and of course a start time point t0 needs to be determined before segmentation. In fig. 5, the first sampling time point may be t0 + 33 ms, the second t0 + 2 × 33 ms, the third t0 + 3 × 33 ms, and the n-th t0 + n × 33 ms.
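For example, the sampling time points shown in fig. 5 could be generated as below; t0 and the 33 ms frame interval follow the figure, and the rest is an illustrative assumption.

```python
def sampling_points(t0: float, n: int, step_ms: float = 33.0):
    """t0 + k * 33 ms for k = 1..n, returned in seconds."""
    return [t0 + k * step_ms / 1000.0 for k in range(1, n + 1)]

print(sampling_points(0.0, 3))  # [0.033, 0.066, 0.099]
```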
In one embodiment, if there are multiple percepts that match the true object successfully and the timestamps of the multiple percepts are the same, the method further comprises:
obtaining error values between the truth object and each of the plurality of perception objects successfully matched with the truth object;
comparing the error values pairwise to determine a minimum error value;
and when the truth object is associated with a perception object, associating the perception object corresponding to the minimum error value with the truth object, as the sketch below illustrates.
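A small sketch of this same-timestamp rule, with each candidate held as an (error, dut_id) tuple; only the minimum-error perception object is associated, and the rest go to an unassociated list.

```python
def pick_min_error(candidates):
    """candidates: list of (error_value, dut_id); returns (winner, unassociated)."""
    best = min(candidates)                        # pairwise comparison -> minimum error
    unassociated = [c for c in candidates if c is not best]
    return best, unassociated

best, rest = pick_min_error([(0.8, "dut_2"), (0.3, "dut_1"), (1.1, "dut_3")])
# best == (0.3, 'dut_1'); rest holds the unassociated objects
```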
Fig. 6 is a block diagram of an evaluation device for visual truth data and perception data according to an exemplary embodiment of the present application. The device can be applied to the implementation environment shown in fig. 1 and is specifically configured in the terminal equipment. The apparatus may also be applied to other exemplary implementation environments, and is specifically configured in other devices, and the embodiment does not limit the implementation environment to which the apparatus is applied.
As shown in fig. 6, the present application provides an apparatus for evaluating visual truth data and perception data, the apparatus comprising:
a first data acquisition module 610 for acquiring video data and truth data including a truth object;
the second data obtaining module 620 is configured to input the video data into a pre-constructed perceptual algorithm model to obtain perceptual object data including a perceptual object output by the perceptual algorithm model;
an alignment module 630, configured to time-align the truth data and the percept data;
a segmenting module 640, configured to set multiple time periods, and segment the aligned true value data and perception object data based on the multiple time periods to obtain multiple first data segments corresponding to the true value data and multiple second data segments corresponding to the perception object data; the plurality of first data segments and the plurality of second data segments correspond to each other one by one according to a time sequence order;
the matching module 650 is configured to match the true value object and the perception object belonging to the same time period to obtain a matching object, where the matching object is a perception object that is successfully matched with the true value object;
the association module 660 is configured to associate the true value object with a perception object that is successfully matched with the true value object, and form an association pair;
an evaluating module 670 for evaluating the perceptual algorithm model based on the association.
It should be noted that the evaluation device for visual truth data and sensing data provided in the foregoing embodiment and the evaluation method for visual truth data and sensing data provided in the foregoing embodiment belong to the same concept, and the specific manner in which each module and unit performs operations has been described in detail in the method embodiment, and is not described herein again. In practical applications, the evaluation apparatus for visual truth data and perceptual data provided in the above embodiment may distribute the functions to different functional modules according to needs, that is, divide the internal structure of the apparatus into different functional modules to complete all or part of the functions described above, which is not limited herein.
An embodiment of the present application further provides an electronic device, including: one or more processors; the storage device is used for storing one or more programs, and when the one or more programs are executed by the one or more processors, the electronic equipment is enabled to realize the evaluation method of the visual truth data and the perception data provided in the above embodiments.
FIG. 7 illustrates a schematic structural diagram of a computer system suitable for implementing the electronic device of the embodiments of the present application. It should be noted that the computer system 700 of the electronic device shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 7, the computer system 700 includes a Central Processing Unit (CPU) 701, which can perform various appropriate actions and processes, such as executing the methods described in the above embodiments, according to a program stored in a Read-Only Memory (ROM) 702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data necessary for system operation are also stored. The CPU 701, ROM 702, and RAM 703 are connected to each other via a bus 704. An Input/Output (I/O) interface 705 is also connected to the bus 704.
The following components are connected to the I/O interface 705: an input portion 706 including a keyboard, a mouse, and the like; an output section 707 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN (Local area network) card, a modem, and the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that the computer program read out therefrom is mounted into the storage section 708 as necessary.
In particular, according to embodiments of the application, the processes described above with reference to the flow diagrams may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated by the flowchart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. When the computer program is executed by the Central Processing Unit (CPU) 701, the various functions defined in the system of the present application are executed.
It should be noted that the computer readable medium shown in the embodiments of the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. The computer readable storage medium may be, for example, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM), a flash Memory, an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer-readable signal medium may comprise a propagated data signal with computer-readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. The program embodied on the computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. Each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software, or may be implemented by hardware, and the described units may also be disposed in a processor. Wherein the names of the elements do not in some way constitute a limitation on the elements themselves.
Yet another aspect of the present application provides a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor of a computer, causes the computer to perform the method for evaluating visual truth data and perception data as described above. The computer-readable storage medium may be included in the electronic device described in the above embodiment, or may exist separately without being incorporated in the electronic device.
Another aspect of the application also provides a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and executes the computer instructions, so that the computer device executes the method for evaluating the visual truth data and the perception data provided in the above embodiments.
The foregoing embodiments are merely illustrative of the principles of the present invention and its efficacy, and are not to be construed as limiting the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present invention.
Claims (12)
1. A method for evaluating visual truth data and perception data is characterized by comprising the following steps:
acquiring video data and truth value data comprising a truth value object;
inputting the video data into a pre-constructed perception algorithm model to obtain perception object data including a perception object output by the perception algorithm model;
time-aligning the truth data and the percept data;
setting a plurality of time periods, and segmenting aligned true value data and perception object data based on the time periods to obtain a plurality of first data segments corresponding to the true value data and a plurality of second data segments corresponding to the perception object data; the plurality of first data segments and the plurality of second data segments are in one-to-one correspondence according to a time sequence order;
matching a true value object and a perception object belonging to the same time period to obtain a matched object, wherein the matched object is the perception object successfully matched with the true value object;
associating the true value object with a perception object successfully matched with the true value object, and forming an association pair;
and evaluating the perception algorithm model based on the association.
2. The method for evaluating visual truth data and perception data according to claim 1, wherein matching truth objects and perception objects belonging to the same time period comprises:
calculating a running track distance error between a true value object and a perception object to obtain a plurality of error values;
obtaining an average value of a plurality of error values;
and comparing the average value with a preset error threshold value to complete the matching of the true value object and the perception object.
3. The method for evaluating visual truth data and perceptual data according to claim 2, wherein the matching of the truth object and the perceptual object is successful when the error value is less than the error threshold.
4. The method for evaluating visual truth data and perception data according to claim 3, wherein if there are multiple perception objects successfully matched with the truth object and the timestamps of the multiple perception objects are not the same, the method further comprises:
determining an association priority according to the magnitude of the error value, wherein the smaller the error value is, the larger the association priority is, and the association priority is used for representing the association sequence;
and sequentially associating a plurality of perception objects successfully matched with the truth object based on the association priority.
5. The method for evaluating visual truth data and perception data according to claim 4, wherein, prior to associating a plurality of perception objects matching a truth object with the truth object, the method further comprises:
and setting a plurality of sampling time points, and sampling a first data segment and a second data segment which belong to the same time segment based on the plurality of sampling time points to obtain a true value object and a plurality of perception objects which are successfully matched.
6. The method for evaluating visual truth data and perception data according to claim 3, wherein if there are multiple perception objects successfully matched with the truth object and the multiple perception objects have the same timestamp, the method further comprises:
obtaining error values between the truth object and each of the plurality of perception objects successfully matched with the truth object;
comparing the error values pairwise to determine a minimum error value;
and when the truth object is associated with a perception object, associating the perception object corresponding to the minimum error value with the truth object.
7. The method for evaluating visual truth data and perception data according to any one of claims 2 to 6, wherein the error value is calculated by equation (1):

$$\mathrm{error}=\frac{1}{n}\sum_{i=1}^{n}\sqrt{(XD_i-XG_i)^2+(YD_i-YG_i)^2}\tag{1}$$

where XD denotes the value of the perception object in the X direction in the vehicle coordinate system, XG denotes the value of the truth object in the X direction in the vehicle coordinate system, YD denotes the value of the perception object in the Y direction in the vehicle coordinate system, YG denotes the value of the truth object in the Y direction in the vehicle coordinate system, n denotes the number of segment time points, and i denotes the i-th segment time point.
8. The method for evaluating the visual truth data and the perception data according to claim 1, wherein the truth data is acquired by a laser radar, a millimeter wave radar or an ultrasonic radar.
9. An apparatus for evaluating visual truth data and perception data, the apparatus comprising:
the data acquisition module is used for acquiring video data and truth value data comprising a truth value object;
the perception data acquisition module is used for inputting the video data into a pre-constructed perception algorithm model to obtain perception object data including a perception object and output by the perception algorithm model;
an alignment module, configured to perform time alignment on the truth data and the perception object data;
the segmentation module is used for setting a plurality of time periods, and segmenting the aligned true value data and the aligned perception object data based on the time periods to obtain a plurality of first data segments corresponding to the true value data and a plurality of second data segments corresponding to the perception object data; the plurality of first data segments and the plurality of second data segments are in one-to-one correspondence according to a time sequence order;
the matching module is used for matching the true value object and the perception object which belong to the same time period to obtain a matching object, wherein the matching object is the perception object which is successfully matched with the true value object;
the association module is used for associating the true value object with a perception object successfully matched with the true value object and forming an association pair;
and the evaluating module is used for evaluating the perception algorithm model based on the association.
10. An electronic device, characterized in that the electronic device comprises:
one or more processors;
storage means for storing one or more programs which, when executed by the one or more processors, cause the electronic device to carry out the steps of the method of evaluating visual truth data and perceptual data according to any one of claims 1 to 8.
11. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor of a computer, causes the computer to carry out the steps of the method for evaluating visual truth data and perceptual data according to any one of claims 1 to 8.
12. A computer program product, characterized in that it comprises a computer program carried on a computer readable medium, the computer program containing program code for performing the method for evaluating visual truth data and perception data according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211216241.0A CN115576990A (en) | 2022-09-30 | 2022-09-30 | Method, device, equipment and medium for evaluating visual truth value data and perception data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211216241.0A CN115576990A (en) | 2022-09-30 | 2022-09-30 | Method, device, equipment and medium for evaluating visual truth value data and perception data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115576990A true CN115576990A (en) | 2023-01-06 |
Family
ID=84583980
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211216241.0A Pending CN115576990A (en) | 2022-09-30 | 2022-09-30 | Method, device, equipment and medium for evaluating visual truth value data and perception data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115576990A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116500565A (en) * | 2023-06-28 | 2023-07-28 | 小米汽车科技有限公司 | Method, device and equipment for evaluating automatic driving perception detection capability |
CN116500565B (en) * | 2023-06-28 | 2023-10-13 | 小米汽车科技有限公司 | Method, device and equipment for evaluating automatic driving perception detection capability |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||