CN114091587A - Method, apparatus, device and medium for determining object class for high-precision map - Google Patents


Info

Publication number
CN114091587A
Authority
CN
China
Prior art keywords
target object
weight
determining
image frame
class
Prior art date
Legal status
Pending
Application number
CN202111329955.8A
Other languages
Chinese (zh)
Inventor
吴启扬
张瀚天
周尧
彭亮
万国伟
Current Assignee
Apollo Intelligent Technology Beijing Co Ltd
Original Assignee
Apollo Intelligent Technology Beijing Co Ltd
Priority date
Filing date
Publication date
Application filed by Apollo Intelligent Technology Beijing Co Ltd filed Critical Apollo Intelligent Technology Beijing Co Ltd
Priority to CN202111329955.8A
Publication of CN114091587A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/254 Fusion techniques of classification results, e.g. of results related to same input data

Abstract

The disclosure provides a method of determining an object class, relates to the technical field of automatic driving, and particularly relates to computer vision technology and high-precision map technology. The specific implementation scheme is as follows: for N image frames, determining a weight of a first class of a target object in each image frame to obtain N weights in one-to-one correspondence with the N image frames, wherein there are M first classes, each first class corresponds to at least one of the N weights, and N and M are integers greater than or equal to 1; for the M first classes, determining a fusion weight of each first class according to the at least one weight corresponding to that first class, to obtain M fusion weights; and determining a second class of the target object according to the M fusion weights. The disclosure also provides an apparatus, an electronic device and a storage medium for determining the object class.

Description

Method, apparatus, device and medium for determining object class for high-precision map
Technical Field
The present disclosure relates to the field of automatic driving technology, and in particular to computer vision technology and high-precision map technology. More specifically, the present disclosure provides a method, an apparatus, an electronic device, and a storage medium for determining a class of an object.
Background
A high-precision map, also called an HD map, is used by autonomous vehicles. A high-precision map contains accurate vehicle position information and rich road element data, can help a vehicle predict complex road surface information such as gradient, curvature and heading, and can better avoid potential risks; for example, an autonomous vehicle can obtain its current position by matching against the lane line information in the high-precision map. The class of a given object can be identified by manual labeling or by recognition performed on a single image frame. For example, the line type or color of a point on a lane line of a road can be identified by manual labeling or by recognition on a single frame of a road-capture video.
Disclosure of Invention
The disclosure provides a method, an apparatus, a device and a storage medium for determining object class.
According to a first aspect, there is provided a method of determining a class of objects, the method comprising: for N image frames, determining a weight of a first class of a target object in each image frame to obtain N weights in one-to-one correspondence with the N image frames, wherein there are M first classes, each first class corresponds to at least one of the N weights, and N and M are integers greater than or equal to 1; for the M first classes, determining a fusion weight of each first class according to the at least one weight corresponding to that first class, to obtain M fusion weights; and determining a second class of the target object according to the M fusion weights.
According to a second aspect, there is provided an apparatus for determining a class of an object, the apparatus comprising: a first determining module, configured to determine, for N image frames, a weight of a first class of a target object in each image frame to obtain N weights in one-to-one correspondence with the N image frames, wherein there are M first classes, each first class corresponds to at least one of the N weights, and N and M are integers greater than or equal to 1; a second determining module, configured to determine, for the M first classes, a fusion weight of each first class according to the at least one weight corresponding to that first class, so as to obtain M fusion weights; and a third determining module, configured to determine a second class of the target object according to the M fusion weights.
According to a third aspect, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method provided in accordance with the present disclosure.
According to a fourth aspect, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform a method provided in accordance with the present disclosure.
According to a fifth aspect, there is provided a computer program product comprising a computer program which, when executed by a processor, implements a method provided according to the present disclosure.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1A is a schematic diagram of an exemplary system architecture to which the method and apparatus for determining object classes may be applied, according to one embodiment of the present disclosure;
FIG. 1B is a schematic diagram of an application scenario of the method and apparatus for determining object class according to an embodiment of the present disclosure;
FIG. 2 is a flow diagram of a method of determining a class of objects according to one embodiment of the present disclosure;
FIG. 3 is a flow diagram of a method of determining a class of objects according to another embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a plurality of objects according to one embodiment of the present disclosure;
FIG. 5 is a schematic output diagram of a method of determining a class of objects according to one embodiment of the present disclosure;
FIG. 6 is a block diagram of an apparatus for determining an object class according to one embodiment of the present disclosure; and
FIG. 7 is a block diagram of an electronic device for implementing the method of determining a class of objects according to one embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the related art, the class of an object can be identified by manual labeling. However, manual labeling requires a great deal of manpower and time, and can only be applied to small-scale object recognition scenarios.
A single frame image may be one image frame of a surveillance video or of a video captured by a vehicle. The same object may appear in multiple image frames of a video, and the class of the object may be identified from each of these image frames. However, an incorrect object class may be recognized due to limitations of the recognition algorithm, occlusion of the object, the shooting angle, the shooting distance, and the like, so different classes may be obtained for the same object from different image frames.
Fig. 1A is a schematic diagram of an exemplary system architecture to which the method and apparatus for determining object classes may be applied, according to one embodiment of the present disclosure. It should be noted that fig. 1A is only an example of a system architecture to which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, and does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios.
As shown in fig. 1A, the system architecture 100 according to this embodiment may include a plurality of terminal devices 101, a network 102, and a server 103. Network 102 is the medium used to provide communication links between terminal devices 101 and server 103. Network 102 may include various connection types, such as wired and/or wireless communication links, and so forth.
A user may use terminal device 101 to interact with server 103 over network 102 to receive or send messages and the like. Terminal device 101 may be a variety of electronic devices including, but not limited to, a smart phone, a tablet computer, a laptop portable computer, and the like.
The method for determining the object class provided by the embodiment of the present disclosure may be generally performed by the server 103. Accordingly, the apparatus for determining the object class provided by the embodiment of the present disclosure may be generally disposed in the server 103. The method for determining the object class provided by the embodiment of the present disclosure may also be performed by a server or a server cluster that is different from the server 103 and is capable of communicating with the terminal device 101 and/or the server 103. Correspondingly, the device for determining the object class provided by the embodiment of the present disclosure may also be disposed in a server or a server cluster different from the server 103 and capable of communicating with the terminal device 101 and/or the server 103.
Fig. 1B is a schematic application scenario diagram of a method and an apparatus for determining an object class according to an embodiment of the present disclosure.
As shown in fig. 1B, the road 104 may include 4 lanes, a first lane 1041, a second lane 1042, a third lane 1043, and a fourth lane 1044. The road 104 is provided with 3 lane lines, which are a first lane line 1051, a second lane line 1052 and a third lane line 1053. The first lane line 1051 has a plurality of objects thereon, two of which are the object 106 and the object 107, respectively.
The detection device can move on the 4 lanes and is provided with an image pickup device.
In one video capture process, the detection device A108 may move on the first lane 1041 and capture video using its camera. During the movement of the detection device A108, a plurality of image frames in the captured first video each contain the object 106. For the plurality of image frames in the first video containing the object 106, a first class of the object 106 in each image frame may be determined using a classification algorithm or a semantic recognition algorithm. The first class of the object 106 is a single dashed line in a portion of the plurality of image frames containing the object 106. The first class of the object 106 is a single solid line in another portion of the plurality of image frames containing the object 106.
During another video capture process, detection device B109 may move on the fourth lane 1044 and capture video with its camera. During the movement of the detection device B109, a plurality of image frames in the captured second video each contain the object 106. For a plurality of image frames in the second video containing the object 106, a first class of the object 106 in each image frame may be determined using a classification algorithm or a semantic recognition algorithm. The first category of the object 106 is a single dashed line in a portion of the plurality of image frames containing the object 106. The first category of the object 106 is a single solid line in another portion of the plurality of image frames containing the object 106.
FIG. 2 is a flow diagram of a method of determining a class of objects according to one embodiment of the present disclosure.
As shown in fig. 2, the method 200 may include operations S210 to S230.
In operation S210, for N image frames, a weight of a first class of a target object in each image frame is determined, resulting in N weights corresponding to the N image frames one to one.
In an embodiment of the disclosure, there are M first classes, each first class corresponds to at least one of the N weights, and N and M are integers greater than or equal to 1.
In one example, M is a positive integer less than or equal to N.
For example, 10 image frames may be provided, and 3 first categories of the target objects may be determined from the 10 image frames.
In the disclosed embodiment, the target object may be a point of a lane line of a road.
For example, the target object may be, for example, the object 106 on the first lane line 1051 in fig. 1B. As another example, the target object may be, for example, the object 107 on the first lane line 1051 in fig. 1B.
In the disclosed embodiment, the N image frames may be image frames containing a target object.
For example, each of the N image frames contains an object 106, such as in fig. 1B.
In the disclosed embodiment, the N image frames may be obtained from a video captured by a detection device.
For example, the detection device may be a movable image acquisition device. In one example, the detection apparatus may be a vehicle equipped with a camera device.
In the embodiment of the present disclosure, the N image frames may be obtained from one video or may be obtained from a plurality of videos.
For example, the detection device A108 in fig. 1B may move on the first lane 1041 and record a video with its camera for video capture. N image frames may be acquired from the video recorded by the detection device A108.
For example, the detection device B109 in fig. 1B may move on the fourth lane 1044 and record a video with its camera for video capture. N image frames may be acquired from the video recorded by the detection device B109.
For example, the detection device A108 in fig. 1B may move on the first lane 1041 while the detection device B109 in fig. 1B moves on the fourth lane 1044, each recording a video with its own camera for video capture. The N image frames may be acquired collectively from the two videos recorded by the detection device A108 and the detection device B109.
In the disclosed embodiment, the first category of the target object may be a line type or a color of a point of the lane line.
For example, the line type may include a single solid line, a single dashed line, a double solid line, a double dashed line, a left-solid right-dashed line, a left-dashed right-solid line, and the like.
For example, the colors may include yellow and white, and so on.
In an embodiment of the present disclosure, for each image frame, an actual distance between the target object and the detection device may be obtained to determine a first weight of a first class of the target object in the image frame.
For example, a first weight for a first category of a target object in the image frame may be determined by the following formula:
w_i1 = f(d)    (formula one)
where d is the actual distance between the target object and the detection device, and w_i1 is the first weight of the first class of the target object in the image frame. When the distance d between the detection device and the target object does not exceed d_l, the image distortion of the image frames containing the target object is small, and the first weight of the target object in these image frames may be 1.0. When the distance d increases to d_h or beyond, although the video acquired by the detection device still contains the target object, the image frames are severely distorted because the detection device is far from the target object, and the first class determined from these image frames is inaccurate; the first weight of the target object in these image frames may then be set to an empirical value of 0.6. In one example, f(d) is an inverse scaling function.
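The exact expression of formula one is given only as an image in the original filing; the Python sketch below assumes a linear falloff between the two plateau values stated above (1.0 up to d_l, 0.6 beyond d_h), with d_l and d_h chosen arbitrarily for illustration.

```python
def distance_weight(d: float, d_l: float = 10.0, d_h: float = 50.0) -> float:
    """First weight w_i1 = f(d) of the first class in an image frame.

    Only the two plateau values (1.0 and 0.6) come from the description;
    the thresholds and the linear falloff in between are assumptions.
    """
    if d <= d_l:
        return 1.0          # little image distortion at close range
    if d >= d_h:
        return 0.6          # empirical value for heavily distorted frames
    # Assumed linear interpolation between the two plateaus.
    return 1.0 - 0.4 * (d - d_l) / (d_h - d_l)
```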
In one example, a first actual position (such as geographic coordinates obtained via satellite positioning) at which the object 106 in fig. 1B is located may be obtained, and a second actual position (such as geographic coordinates obtained via satellite positioning), corresponding to the image frame, at which the detection device A108 in fig. 1B is located may be obtained. The actual distance d between the detection device A108 and the object 106 may then be acquired.
In an embodiment of the present disclosure, a weight of a first category of a target object in the image frame may be determined according to the first weight.
For example, the first weight may be taken as the weight of the first category of the target object in the image frame.
In an embodiment of the present disclosure, for each image frame, a first actual position at which the target object is located and a second actual position at which the detection device is located are obtained to determine a second weight of the first class of the target object in the image frame.
For example, the second actual position at which the detection device is located may vary, and may be different for different image frames. In one example, for each image frame, the second actual position of the detection device corresponding to that image frame may be obtained.
For example, the second weight for the first class of the target object in the image frame may be determined by the following formula:
w_i2 = g(k)    (formula two)
where k is an integer greater than or equal to 0 representing the number of changes of advancing direction required for the detection device to travel to the first actual position, and w_i2 is the second weight of the first class of the target object in the image frame.
In one example, for one image frame, the lane in which the target object is located may be determined according to a first actual position in which the target object is located. The lane in which the detection device is located may be determined based on the second actual location in which the detection device is located. According to the lane where the target object is located and the lane where the detection device is located, the number of times of changing the advancing direction required by the detection device to advance to the first actual position where the target object is located can be determined.
It should be noted that k may be the minimum number of changes of advancing direction required for the detection device to travel to the first actual position.
For example, the lane in which the object 106 is located may be determined to be the first lane 1041 or the second lane 1042 according to the first actual position (e.g., geographic coordinates obtained by satellite positioning) of the object 106 in fig. 1B. The lane in which the detection device A108 is located may be determined to be the first lane 1041 based on the second actual position (e.g., geographic coordinates obtained by satellite positioning) of the detection device A108 in fig. 1B. From the lane in which the detection device A108 is located and the lane in which the object 106 is located, it can be determined that the number of changes of advancing direction required for the detection device A108 to travel to the lane in which the object 106 is located is 0. That is, in this case, k in formula two is 0.
For another example, the lane in which the object 106 is located may be determined to be the first lane 1041 or the second lane 1042 according to the actual position (e.g., geographic coordinates obtained by satellite positioning) of the object 106 in fig. 1B. The lane in which the detection device B109 is located may be determined to be the fourth lane 1044 according to the actual position (e.g., geographic coordinates obtained by satellite positioning) of the detection device B109 in fig. 1B. From the lane in which the detection device B109 is located and the lane in which the object 106 is located, it can be determined that the number of changes of advancing direction required for the detection device B109 to travel to the lane in which the object 106 is located is 2. That is, in this case, k in formula two is 2.
In the application scenario of fig. 1B, the number of changes may also be understood as the number of lane changes of the detection device. Under the relevant traffic regulations, only one lane may be crossed in a single lane change. That is, the detection device B109 in fig. 1B needs at least two lane changes to reach the second lane 1042.
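As a rough illustration of how k could be derived from the lane layout of FIG. 1B, the sketch below computes the minimum number of lane changes needed to reach a lane adjacent to the lane line carrying the target object; the 1/(1 + k) weight is only an assumed stand-in for formula two, whose exact expression is not reproduced here.

```python
def lane_changes_needed(device_lane: int, lane_line_left_lane: int) -> int:
    """Minimum number of lane changes k for the detection device to reach
    a lane adjacent to the lane line on which the target object lies.

    A lane line separating lane i and lane i + 1 can be observed from
    either of those lanes, so k is the distance to the closer one.
    """
    adjacent = (lane_line_left_lane, lane_line_left_lane + 1)
    return min(abs(device_lane - lane) for lane in adjacent)


def direction_change_weight(k: int) -> float:
    """Second weight w_i2 as a function of k; the decaying form is assumed."""
    return 1.0 / (1 + k)


# FIG. 1B: object 106 lies on the first lane line, between lanes 1 and 2.
assert lane_changes_needed(device_lane=1, lane_line_left_lane=1) == 0  # device A108
assert lane_changes_needed(device_lane=4, lane_line_left_lane=1) == 2  # device B109
```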
In an embodiment of the present disclosure, a weight of a first class of the target object in the image frame is determined based on the second weight.
For example, the second weight may be taken as the weight of the first category of the target object in the image frame.
In an embodiment of the disclosure, for each image frame, a third weight of the first class of the target object in the image frame may be determined according to the first weight, the second weight and the moving speed of the detection device.
For example, the moving speed of the detection device is variable, and may be different for different image frames. In one example, for each image frame, the moving speed of the detection device corresponding to that image frame may be acquired.
For example, the third weight for the first category of the target object in the image frame may be determined by the following formula:
w_i3 = h(w_i1, w_i2, v_i)    (formula three)
where w_i1 is the first weight of the first class of the target object in the image frame, w_i2 is the second weight of the first class of the target object in the image frame, v_i is the moving speed of the detection device corresponding to the image frame, and w_i3 is the third weight of the first class of the target object in the image frame. When the vehicle speed is low, the class determined from the image frame is more accurate, so the third weight obtained from formula three allows the class of the object to be determined more accurately.
In an embodiment of the disclosure, a weight of the first class of the target object in the image frame is determined based on the third weight.
For example, the third weight may be taken as the weight of the first category of the target object in the image frame.
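Formula three itself is not reproduced above. A plausible combination, sketched here only under the stated assumption that slower speeds yield more reliable classifications, is to attenuate the product of the first two weights by the speed; the v_ref scale is an illustration parameter, not a value from the patent.

```python
def third_weight(w_i1: float, w_i2: float, v_i: float, v_ref: float = 10.0) -> float:
    """Third weight w_i3 combining w_i1, w_i2 and the moving speed v_i.

    The speed attenuation v_ref / (v_ref + v_i) is an assumption chosen so
    that slower detection devices contribute larger weights, as described
    in the text; it is not the patent's formula three.
    """
    return w_i1 * w_i2 * v_ref / (v_ref + v_i)
```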
In operation S220, for the M first classes, according to at least one weight corresponding to each first class, a fusion weight of the first class is determined, and M fusion weights are obtained.
For example, the fusion weight for the first category may be determined by the following formula:
W_I = w_1 + w_2 + ... + w_J    (formula four)
where W_I is the fusion weight of the first class, the first class corresponds to J weights of the N weights (denoted w_1 through w_J), and J is a positive integer greater than or equal to 1 and less than or equal to N.
In operation S230, a second class of the target object is determined according to the M fusion weights.
In the disclosed embodiment, the first category corresponding to the largest fusion weight may be determined as the second category of the target object.
For example, during movement of the detection device A108 of fig. 1B in the first lane 1041, the captured video includes N image frames containing the object 106. In one example, N = 5 and M = 2. In 3 of the 5 image frames, the first class of the object 106 is determined to be a single dashed line, and the fusion weight of this first class is 2.8. In the other 2 of the 5 image frames, the first class of the object 106 is determined to be a single solid line, and the fusion weight of this first class is 1.2. The maximum fusion weight is 2.8, the first class corresponding to it is the single dashed line, and the single dashed line may therefore be determined as the second class of the object 106.
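A minimal sketch of operations S220 and S230, treating formula four as a plain sum of the per-frame weights, which is consistent with the 2.8 / 1.2 worked example above; the individual per-frame weights used here are illustrative and are not taken from the patent.

```python
from collections import defaultdict


def fuse_and_classify(frame_results):
    """Sum the per-frame weights per first class and pick the class with
    the largest fusion weight as the second class of the target object.

    `frame_results` is a list of (first_class, weight) pairs, one per
    image frame containing the target object.
    """
    fusion = defaultdict(float)
    for first_class, weight in frame_results:
        fusion[first_class] += weight  # W_I = sum of the J corresponding weights
    return max(fusion, key=fusion.get), dict(fusion)


# Illustrative weights: 3 "single dashed" frames fusing to 2.8 and
# 2 "single solid" frames fusing to 1.2, as in the example above.
second_class, weights = fuse_and_classify([
    ("single dashed", 1.0), ("single dashed", 1.0), ("single dashed", 0.8),
    ("single solid", 0.6), ("single solid", 0.6),
])
assert second_class == "single dashed"
```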
Through the embodiments of the present disclosure, the cost of manual labeling can be effectively reduced, and the class of the object can be determined more accurately. The problem of incorrect class determination caused by the distance between the detection device and the target object can be alleviated, making automated production of crowdsourced lane-line maps more feasible.
In some embodiments, determining the weight of the first category of the target object in each image frame comprises: for each image frame, the following operations are performed: acquiring an actual distance between a target object and a detection device to determine a first weight of a first category of the target object in the image frame; acquiring a first actual position of the target object and a second actual position of the detection device to determine a second weight of the first category of the target object in the image frame; determining a third weight of the first category of the target object in the image frame according to the first weight, the second weight and the moving speed of the detection device; and determining a weight of the first class of the target object in the image frame according to the third weight.
Fig. 3 is a flow diagram of a method of determining a class of objects according to another embodiment of the present disclosure.
As shown in fig. 3, the method may be performed, for example, after operation S230 of fig. 2, which will be described in detail with reference to operations S340 through S350.
In operation S340, a number of second categories, different from the target object, of the at least one neighboring object is determined according to the category of the at least one neighboring object neighboring the target object.
In the embodiment of the present disclosure, the number of the second categories different from the target object, of the two adjacent objects, is determined according to the categories of the two adjacent objects adjacent to the target object.
For example, in the advancing direction of the detection apparatus, one adjacent object adjacent to the target object is in front of the target object, and another adjacent object adjacent to the target object is behind the target object.
In one example, it may be determined that the number of second classes different from the target object, of the two neighboring objects, is 2.
In one example, it may be determined that the number of second classes different from the target object, of the two neighboring objects, is 1.
In operation S350, in response to the number being greater than or equal to the preset number threshold, the second category of the target object is changed.
For example, in the advancing direction of the detection apparatus, one adjacent object adjacent to the target object is in front of the target object, and another adjacent object adjacent to the target object is behind the target object. In this case, the preset number threshold may be 2.
In one example, it may be determined that the number of second classes different from the target object, of the two neighboring objects, is 2. The second category of the target object may be changed to a category of a neighboring object.
In one example, it may be determined that the number of second classes different from the target object, of the two neighboring objects, is 1. The second category of the target object may not be altered.
Through the embodiments of the present disclosure, the class of the target object can be determined more accurately, and abrupt changes in the class of individual points on the lane line are avoided.
FIG. 4 is a schematic diagram of a second category of multiple objects according to one embodiment of the present disclosure.
As shown in fig. 4, after operation S230 of fig. 2, for example, the second classes of a plurality of objects may be obtained. In fig. 4, objects with different second classes are drawn with different shapes. For example, the second class of the object 401 is different from that of the object 402, while the second classes of the object 402 and the object 403 are the same. In fig. 4, the advancing direction of the detection device is the direction from the object 401 toward the object 403.
For example, when the target object is the object 401, in the advancing direction of the detection device, one adjacent object 405 adjacent to the object 401 is in front of the object 401 and another adjacent object 404 adjacent to the target object is behind the object 401. In this case, it may be determined that the classes of both neighboring objects 404 and 405 are different from that of the target object 401, so the number of second classes different from the object 401 is 2, and the second class of the object 401 may be changed to the class of the neighboring object 404 or of the neighboring object 405.
For example, when the target object is the object 402, one adjacent object 406 adjacent to the object 402 is in front of the object 402, and another adjacent object 405 adjacent to the target object is behind the object 402 in the advancing direction of the detection apparatus. At this time, it may be determined that, of the two adjacent objects 405 and 406, the category of 405 is the same as the target object 402 and the category of 406 is different from the target object 402, so that the number of second categories different from the object 402 is 1 and the second category of the object 402 may not be changed.
For example, when the target object is the object 403, one adjacent object 408 adjacent to the object 403 is in front of the object 403, and another adjacent object 407 adjacent to the target object is behind the object 403 in the advancing direction of the detection apparatus. At this time, it may be determined that the categories of the two neighboring objects 407 and 408 are both different from the target object 403, the number of second categories different from the object 403 is 2, and the second category of the object 403 may be changed to the category of the neighboring object.
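A minimal sketch of this post-processing (operations S340 and S350) applied to the sequence of second classes along a lane line; which neighbor's class is adopted when both neighbors differ is an assumption, since the text only states that the class is changed to that of a neighboring object.

```python
def smooth_second_classes(classes, threshold=2):
    """Smooth the per-object second classes along the lane line.

    If at least `threshold` of an object's two neighbors (the objects just
    before and after it in the advancing direction) have a different second
    class, the object's class is replaced by a neighbor's class, which
    suppresses isolated jumps such as objects 401 and 403 in FIG. 4.
    """
    smoothed = list(classes)
    for i in range(1, len(classes) - 1):
        neighbors = (classes[i - 1], classes[i + 1])
        differing = sum(1 for c in neighbors if c != classes[i])
        if differing >= threshold:
            smoothed[i] = classes[i - 1]  # adopting the preceding neighbor is an assumption
    return smoothed


# The isolated "B" is changed back to "A", like object 401 in FIG. 5.
assert smooth_second_classes(["A", "B", "A", "A"]) == ["A", "A", "A", "A"]
```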
In one example, the plurality of objects shown in FIG. 4 are, for example, a plurality of points on the lane line 1051 in FIG. 1B.
FIG. 5 is a schematic output diagram of a method of determining a class of objects according to one embodiment of the present disclosure.
As shown in fig. 5, the second class of the object 401 and the second class of the object 403 in fig. 4 are each changed to the same class as their neighboring objects.
Fig. 6 is a block diagram of an apparatus to determine a class of an object according to one embodiment of the present disclosure.
As shown in fig. 6, the apparatus 600 may include a first determination module 610, a second determination module 620, and a third determination module 630.
A first determining module 610, configured to determine, for N image frames, a weight of a first category of a target object in each image frame, to obtain N weights in one-to-one correspondence with the N image frames, wherein there are M first categories, each first category corresponds to at least one of the N weights, and N and M are integers greater than or equal to 1.
The second determining module 620 is configured to determine, for the M first classes, a fusion weight of each first class according to at least one weight corresponding to the first class, so as to obtain M fusion weights.
A third determining module 630, configured to determine a second category of the target object according to the M fusion weights.
In some embodiments, the first determining module comprises: a first obtaining sub-module, configured to obtain, for each image frame, an actual distance between the target object and a detection device to determine a first weight of a first category of the target object in the image frame; and the first determining submodule is used for determining the weight of the first category of the target object in the image frame according to the first weight.
In some embodiments, the first determining module comprises: the second acquisition sub-module is used for acquiring a first actual position where the target object is located and a second actual position where the detection device is located for each image frame so as to determine a second weight of the first category of the target object in the image frame; and the second determining submodule is used for determining the weight of the first category of the target object in the image frame according to the second weight.
In some embodiments, the first determining module comprises: an execution submodule configured to perform, for each image frame, the relevant operations via: a first obtaining unit, configured to obtain an actual distance between the target object and a detection device to determine a first weight of a first category of the target object in the image frame; a second obtaining unit, configured to obtain a first actual position where the target object is located and a second actual position where the detection device is located, so as to determine a second weight of the first category of the target object in the image frame; a first determining unit, configured to determine a third weight of the first category of the target object in the image frame according to the first weight, the second weight, and the moving speed of the detection device; and a second determining unit, configured to determine a weight of the first category of the target object in the image frame according to the third weight.
In some embodiments, the first determining module is further configured to: determining a first weight for a first class of the target object in the image frame by:
w_i1 = f(d)    (formula one)
wherein d is the actual distance between the target object and the detection device, and w_i1 is the first weight of the first class of the target object in the image frame.
In some embodiments, the first determining module is further configured to: determining a second weight for the first class of the target object in the image frame by:
w_i2 = g(k)    (formula two)
wherein k is an integer greater than or equal to 0 representing the number of changes of advancing direction required for the detection device to travel to the first actual position, and w_i2 is the second weight of the first class of the target object in the image frame.
In some embodiments, the first determining module is further configured to: determining a third weight of the first class of the target object in the image frame by the following formula:
w_i3 = h(w_i1, w_i2, v_i)    (formula three)
wherein w_i1 is the first weight of the first class of the target object in the image frame, w_i2 is the second weight of the first class of the target object in the image frame, and v_i is the moving speed of the detection device corresponding to the image frame.
In some embodiments, the second determination module is further configured to: determining a fusion weight for the first class by:
W_I = w_1 + w_2 + ... + w_J    (formula four)
wherein W_I is the fusion weight of the first class, the first class corresponds to J weights of the N weights (denoted w_1 through w_J), and J is a positive integer greater than or equal to 1.
In some embodiments, the third determination module is further configured to: determining the first class corresponding to the largest fusion weight as the second class of the target object.
In some embodiments, the apparatus 600 further comprises: a fourth determining module, configured to determine, according to a category of at least one neighboring object neighboring the target object, a number of second categories, different from the target object, of the at least one neighboring object; a changing module, configured to change the second category of the target object in response to the number being greater than or equal to a preset number threshold.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure and other handling of the personal information of the users involved all comply with the relevant laws and regulations and do not violate public order and good morals.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 7 illustrates a schematic block diagram of an example electronic device 700 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the device 700 comprises a computing unit 701, which may perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM)702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 can also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
Various components in the device 700 are connected to the I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, or the like; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
Computing unit 701 may be a variety of general purpose and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The calculation unit 701 executes the respective methods and processes described above, such as the method of determining the object class. For example, in some embodiments, the method of determining a class of objects may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 708. In some embodiments, part or all of a computer program may be loaded onto and/or installed onto device 700 via ROM 702 and/or communications unit 709. When the computer program is loaded into the RAM 703 and executed by the computing unit 701, one or more steps of the method of determining a class of objects described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured by any other suitable means (e.g. by means of firmware) to perform the method of determining the object class.
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (23)

1. A method of determining a class of objects, comprising:
for N image frames, determining a weight of a first class of a target object in each image frame to obtain N weights in one-to-one correspondence with the N image frames, wherein there are M first classes, each first class corresponds to at least one of the N weights, and N and M are integers greater than or equal to 1;
for the M first classes, determining a fusion weight of each first class according to the at least one weight corresponding to that first class, to obtain M fusion weights; and
determining a second class of the target object according to the M fusion weights.
2. The method of claim 1, wherein the determining a weight of the first class of the target object in each image frame comprises:
for each image frame, acquiring an actual distance between the target object and a detection device to determine a first weight of a first category of the target object in the image frame;
determining a weight of the first class of the target object in the image frame according to the first weight.
3. The method of claim 1, wherein the determining a weight of the first class of the target object in each image frame comprises:
for each image frame, acquiring a first actual position of the target object and a second actual position of the detection device to determine a second weight of the first category of the target object in the image frame;
determining a weight of the first class of the target object in the image frame based on the second weight.
4. The method of claim 1, wherein the determining a weight of the first class of the target object in each image frame comprises:
for each image frame, the following operations are performed:
acquiring an actual distance between the target object and a detection device to determine a first weight of a first category of the target object in the image frame;
acquiring a first actual position of the target object and a second actual position of the detection device to determine a second weight of the first category of the target object in the image frame;
determining a third weight of the first category of the target object in the image frame according to the first weight, the second weight and the moving speed of the detection device; and
determining a weight of the first class of the target object in the image frame based on the third weight.
5. The method of any of claims 2 or 4, wherein the determining the first weight for the first category of the target object in the image frame comprises:
determining a first weight for a first class of the target object in the image frame by:
w_i1 = f(d)    (formula one)
wherein d is the actual distance between the target object and the detection device, and w_i1 is the first weight of the first class of the target object in the image frame.
6. The method of any of claims 3 or 4, wherein said determining a second weight for the first category of the target object in the image frame comprises:
determining a second weight for the first class of the target object in the image frame by:
w_i2 = g(k)    (formula two)
wherein k is an integer greater than or equal to 0 representing the number of changes of advancing direction required for the detection device to travel to the first actual position, and w_i2 is the second weight of the first class of the target object in the image frame.
7. The method of claim 4, wherein said determining a third weight for the first class of the target object in the image frame comprises:
determining a third weight for the first class of the target object in the image frame by:
w_i3 = h(w_i1, w_i2, v_i)    (formula three)
wherein w_i1 is the first weight of the first class of the target object in the image frame, w_i2 is the second weight of the first class of the target object in the image frame, and v_i is the moving speed of the detection device corresponding to the image frame.
8. The method of claim 7, wherein said determining the fusion weight for the first category comprises:
determining a fusion weight for the first class by:
W_I = w_1 + w_2 + ... + w_J    (formula four)
wherein W_I is the fusion weight of the first class, the first class corresponds to J weights of the N weights (denoted w_1 through w_J), and J is a positive integer greater than or equal to 1.
9. The method of claim 1, wherein the determining the second class of the target object comprises:
determining the first class corresponding to the largest fusion weight as the second class of the target object.
10. The method of claim 1, further comprising:
determining the number of second classes different from the target object in at least one adjacent object according to the class of the at least one adjacent object adjacent to the target object;
in response to the number being greater than or equal to a preset number threshold, altering a second category of the target object.
11. An apparatus for determining a class of objects, comprising:
a first determining module, configured to determine, for N image frames, a weight of a first category of a target object in each image frame to obtain N weights in one-to-one correspondence with the N image frames, wherein there are M first categories, each first category corresponds to at least one of the N weights, and N and M are integers greater than or equal to 1;
a second determining module, configured to determine, for the M first classes, a fusion weight of each first class according to at least one weight corresponding to the first class, so as to obtain M fusion weights; and
and the third determining module is used for determining the second category of the target object according to the M fusion weights.
12. The apparatus of claim 11, wherein the first determining means comprises:
a first obtaining sub-module, configured to obtain, for each image frame, an actual distance between the target object and a detection device to determine a first weight of a first category of the target object in the image frame;
and the first determining submodule is used for determining the weight of the first category of the target object in the image frame according to the first weight.
13. The apparatus of claim 11, wherein the first determining means comprises:
the second acquisition sub-module is used for acquiring a first actual position where the target object is located and a second actual position where the detection device is located for each image frame so as to determine a second weight of the first category of the target object in the image frame;
and the second determining submodule is used for determining the weight of the first category of the target object in the image frame according to the second weight.
14. The apparatus of claim 11, wherein the first determining means comprises:
an execution submodule configured to perform, for each image frame, the relevant operations via:
a first obtaining unit, configured to obtain an actual distance between the target object and a detection device to determine a first weight of a first category of the target object in the image frame;
a second obtaining unit, configured to obtain a first actual position where the target object is located and a second actual position where the detection device is located, so as to determine a second weight of the first category of the target object in the image frame;
a first determining unit for determining a third weight of the first category of the target object in the image frame according to the first weight, the second weight and the moving speed of the detecting device; and
a second determining unit for determining a weight of the first category of the target object in the image frame according to the third weight.
15. The apparatus of any of claims 12 or 14, wherein the first determining module is further configured to:
determining a first weight for a first class of the target object in the image frame by:
w_i1 = f(d)    (formula one)
wherein d is the actual distance between the target object and the detection device, and w_i1 is the first weight of the first class of the target object in the image frame.
16. The apparatus of any of claims 13 or 14, wherein the first determining module is further configured to:
determining a second weight for the first class of the target object in the image frame by:
w_i2 = g(k)    (formula two)
wherein k is an integer greater than or equal to 0 representing the number of changes of advancing direction required for the detection device to travel to the first actual position, and w_i2 is the second weight of the first class of the target object in the image frame.
17. The apparatus of claim 14, wherein the first determining means is further configured to:
determining a third weight for the first class of the target object in the image frame by:
w_i3 = h(w_i1, w_i2, v_i)    (formula three)
wherein w_i1 is the first weight of the first class of the target object in the image frame, w_i2 is the second weight of the first class of the target object in the image frame, and v_i is the moving speed of the detection device corresponding to the image frame.
18. The apparatus of claim 17, wherein the second determining means is further configured to:
determining a fusion weight for the first class by:
W_I = w_1 + w_2 + ... + w_J    (formula four)
wherein W_I is the fusion weight of the first class, the first class corresponds to J weights of the N weights (denoted w_1 through w_J), and J is a positive integer greater than or equal to 1.
19. The apparatus of claim 11, wherein the third determining means is further for:
determining the first class corresponding to the largest fusion weight as the second class of the target object.
20. The apparatus of claim 11, further comprising:
a fourth determining module, configured to determine, according to a category of at least one neighboring object neighboring the target object, a number of second categories, different from the target object, of the at least one neighboring object;
a changing module, configured to change the second category of the target object in response to the number being greater than or equal to a preset number threshold.
21. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein the content of the first and second substances,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 10.
22. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1 to 10.
23. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 10.
CN202111329955.8A 2021-11-10 2021-11-10 Method, apparatus, device and medium for determining object class for high-precision map Pending CN114091587A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111329955.8A CN114091587A (en) 2021-11-10 2021-11-10 Method, apparatus, device and medium for determining object class for high-precision map

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111329955.8A CN114091587A (en) 2021-11-10 2021-11-10 Method, apparatus, device and medium for determining object class for high-precision map

Publications (1)

Publication Number Publication Date
CN114091587A (en) 2022-02-25

Family

ID=80299744

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111329955.8A Pending CN114091587A (en) 2021-11-10 2021-11-10 Method, apparatus, device and medium for determining object class for high-precision map

Country Status (1)

Country Link
CN (1) CN114091587A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination