CN114359231A - Parking space detection method, device, equipment and storage medium - Google Patents

Parking space detection method, device, equipment and storage medium

Info

Publication number
CN114359231A
Authority
CN
China
Prior art keywords: anchor, sample, target, image, frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210013342.1A
Other languages
Chinese (zh)
Inventor
杨一帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202210013342.1A priority Critical patent/CN114359231A/en
Publication of CN114359231A publication Critical patent/CN114359231A/en
Pending legal-status Critical Current

Landscapes

  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The embodiments of the present application disclose a parking space detection method, apparatus, device, and storage medium. The related embodiments can be applied in various scenarios such as cloud technology, artificial intelligence, and intelligent traffic, and are used to improve the accuracy of parking space detection and the parking efficiency. The method in the embodiments of the present application comprises the following steps: obtaining a target feature map corresponding to an image to be detected, wherein the target feature map comprises N anchor boxes and N is an integer greater than 1; calculating a confidence score of each anchor box, wherein the confidence score indicates the probability that the anchor box contains a target object; determining M target anchor boxes from the N anchor boxes according to the confidence scores, wherein M is an integer greater than 1 and smaller than N; performing keypoint calculation on each of the M target anchor boxes to obtain M keypoint coordinates; determining the position of a parking space in the image to be detected according to the M keypoint coordinates; and pushing the position of the parking space in the image to be detected to a target terminal device.

Description

Parking space detection method, device, equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of intelligent traffic, in particular to a parking space detection method, device, equipment and storage medium.
Background
With increasing urbanization and rising living standards, the number of automobiles in large cities keeps growing. Traffic in most urban areas is heavy, finding a parking space during peak periods is very difficult, and haphazard parking persists despite repeated prohibitions, seriously affecting people's travel and daily life; the problem of parking difficulty is becoming more and more prominent. Vehicle users urgently need accurate state information about the free parking spaces in a parking lot in order to improve parking efficiency.
The conventional method for detecting free parking spaces generally detects the entry points of a parking space based on deep learning, and then performs depth estimation of the parking space in combination with artificial prior knowledge, so as to generate the position information of the parking space.
However, the depth estimation of the parking space relies on correct prediction of the entry point: if the predicted entry point does not fit the entry point in the actual image well, the detected parking space position easily acquires a large angular deviation. It also relies on artificial prior knowledge, and artificially set rules cannot cover all parking space situations. As a result, the parking space detection error is easily increased, the detection accuracy is low, and the parking efficiency is reduced.
Disclosure of Invention
The embodiments of the present application provide a parking space detection method, apparatus, device, and storage medium, which screen M target anchor boxes by the confidence scores of the anchor boxes and calculate M keypoint coordinates within the M target anchor boxes, so as to draw the position of a parking space directly on the image to be detected. This avoids the prediction error of the entry-point prediction result and the influence of artificial prior knowledge, better fits the real parking space situation, and thereby improves the parking efficiency.
An embodiment of the present application provides a method for detecting a parking space, including:
acquiring a target feature map corresponding to an image to be detected, wherein the target feature map comprises N anchor boxes, and N is an integer greater than 1;
calculating a confidence score of each anchor box, wherein the confidence score is used for indicating the probability that the anchor box contains a target object;
determining M target anchor boxes from the N anchor boxes according to the confidence scores, wherein M is an integer greater than 1 and smaller than N;
performing keypoint calculation on each of the M target anchor boxes respectively to obtain M keypoint coordinates;
and determining the position of the parking space in the image to be detected according to the M keypoint coordinates, and pushing the position of the parking space in the image to be detected to a target terminal device.
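As a non-authoritative illustration only, the steps above can be sketched as follows. The anchor centers, confidence scores, keypoint offsets, and the "center plus offset" decoding rule are all invented placeholders for this sketch, not values or details fixed by the embodiment.

```python
# Hypothetical sketch of the claimed steps: score anchor boxes, keep the
# M most confident ones, decode one keypoint per kept anchor, and return
# the keypoint coordinates that outline the parking space.

def detect_parking_space(anchors, scores, offsets, m):
    """anchors: list of (cx, cy) anchor-box centers
    scores:  confidence per anchor (probability it contains the target)
    offsets: assumed predicted (dx, dy) keypoint offset per anchor
    m:       number of target anchor boxes to keep"""
    # Rank anchors by confidence score and keep the top M as target anchors.
    ranked = sorted(range(len(anchors)), key=lambda i: scores[i], reverse=True)
    targets = ranked[:m]
    # Keypoint = anchor center plus the predicted offset (an assumption here).
    return [(anchors[i][0] + offsets[i][0],
             anchors[i][1] + offsets[i][1]) for i in targets]

anchors = [(8, 8), (24, 8), (8, 24), (24, 24), (40, 40)]
scores  = [0.9, 0.8, 0.85, 0.7, 0.1]
offsets = [(1, -2), (-3, 0), (2, 2), (0, 1), (5, 5)]
print(detect_parking_space(anchors, scores, offsets, 4))
# -> [(9, 6), (10, 26), (21, 8), (24, 25)]
```

The returned keypoints would then be connected on the image to draw the parking space and pushed to the terminal device.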
Another aspect of the present application provides a parking space detection apparatus, including:
an acquisition unit, configured to acquire a target feature map corresponding to an image to be detected, wherein the target feature map comprises N anchor boxes, and N is an integer greater than 1;
a processing unit, configured to calculate a confidence score of each anchor box, wherein the confidence score is used for indicating the probability that the anchor box contains a target object;
a determining unit, configured to determine M target anchor boxes from the N anchor boxes according to the confidence scores, wherein M is an integer greater than 1 and smaller than N;
the processing unit is further configured to perform keypoint calculation on each of the M target anchor boxes respectively to obtain M keypoint coordinates;
and the determining unit is further configured to determine the position of the parking space in the image to be detected according to the M keypoint coordinates.
In a possible design, in an implementation manner of another aspect of the embodiment of the present application, the obtaining unit may be specifically configured to:
inputting an image to be detected into an image detection model, and performing feature extraction on the image to be detected through the image detection model to obtain a first feature map and a second feature map, wherein the first feature map comprises N first anchor point frames, the second feature map comprises N second anchor point frames, the size of the first feature map is smaller than that of the second feature map, and the size of the first anchor point frame is smaller than that of the second anchor point frame;
the processing unit may specifically be configured to: calculating a first confidence score for each first anchor box and calculating a second confidence score for each second anchor box;
the determining unit may specifically be configured to: determining M first target anchor boxes from the N first anchor boxes and M second target anchor boxes from the N second anchor boxes according to the confidence scores.
In a possible design, in an implementation manner of another aspect of the embodiment of the present application, the determining unit may be specifically configured to:
determining a first anchor box with a first confidence score greater than a first confidence threshold as a first target anchor box;
a second anchor box with a second confidence score greater than a second confidence threshold is determined as a second target anchor box.
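A minimal sketch of this per-scale threshold rule follows. The score values and the two thresholds are hypothetical; the embodiment does not fix concrete numbers.

```python
# Keep an anchor box as a target anchor box when its confidence score
# exceeds the threshold for its scale. All numbers below are invented.

def select_targets(confidences, threshold):
    """Return indices of anchor boxes whose score exceeds the threshold."""
    return [i for i, c in enumerate(confidences) if c > threshold]

first_scores  = [0.2, 0.91, 0.85, 0.4]  # first (smaller) anchor boxes
second_scores = [0.7, 0.05, 0.88]       # second (larger) anchor boxes

first_targets  = select_targets(first_scores, 0.8)   # first confidence threshold
second_targets = select_targets(second_scores, 0.6)  # second confidence threshold
print(first_targets, second_targets)  # -> [1, 2] [0, 2]
```

Using separate thresholds per scale lets the small-anchor and large-anchor branches be tuned independently.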
In a possible design, in an implementation manner of another aspect of the embodiment of the present application, the processing unit may be specifically configured to:
performing key point calculation on each first target anchor point frame in the M first target anchor point frames to obtain M first key point coordinates;
and respectively carrying out key point calculation on each second target anchor point frame in the M second target anchor point frames to obtain M second key point coordinates.
In a possible design, in an implementation manner of another aspect of the embodiment of the present application, the determining unit may be specifically configured to:
generating a first enclosure box according to the M first key point coordinates, and generating a second enclosure box according to the M second key point coordinates;
calculating a first overlapping degree between the first bounding box and the first detection frame, and calculating a second overlapping degree between the second bounding box and the second detection frame;
determining a target bounding box from the first bounding box and the second bounding box according to the first overlapping degree and the second overlapping degree;
and generating the position of the parking space in the image to be detected according to the target bounding box.
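The enclosing-box and overlap steps can be illustrated with the toy sketch below. The coordinates are made up, and intersection-over-union (IoU) is used as the overlap measure, which is an assumption; the embodiment only speaks of an "overlapping degree".

```python
# Build an axis-aligned enclosing box from keypoints, measure its overlap
# (IoU, assumed) against a detection frame, and keep the better box.

def enclosing_box(points):
    """Smallest axis-aligned box (x1, y1, x2, y2) containing all points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

first_box  = enclosing_box([(2, 2), (10, 2), (2, 6), (10, 6)])  # (2, 2, 10, 6)
second_box = enclosing_box([(0, 0), (4, 0), (0, 4), (4, 4)])    # (0, 0, 4, 4)
detection  = (2, 2, 10, 6)
target = max([first_box, second_box], key=lambda b: iou(b, detection))
print(target)  # -> (2, 2, 10, 6): the first box matches the detection frame
```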
In one possible design, in one implementation of another aspect of an embodiment of the present application,
the processing unit is further used for respectively calculating the auxiliary point confidence score of each target anchor point frame in the M target anchor point frames;
the determining unit is further used for determining the auxiliary anchor point frame from the M target anchor point frames according to the auxiliary point confidence score;
the processing unit is also used for carrying out auxiliary point calculation on the auxiliary anchor point frame to obtain a target auxiliary point coordinate;
the determining unit may specifically be configured to: and determining the position of the parking space in the image to be detected according to the M key point coordinates and the target auxiliary point coordinates.
In one possible design, in one implementation of another aspect of an embodiment of the present application,
the processing unit is further used for respectively calculating occupied confidence scores of each target anchor point frame in the M target anchor point frames;
the acquisition unit is further used for acquiring coordinates of K occupied anchor frames if K occupied anchor frames are determined from the M target anchor frames according to the occupied confidence scores, wherein K is an integer greater than or equal to 1 and smaller than M;
the determining unit may specifically be configured to: and determining the positions of occupied parking spaces in the image to be detected according to the coordinates of the M key points and the coordinates of the K occupied anchor point frames.
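A toy sketch of separating occupied spaces from free ones follows; the occupied-confidence threshold and the coordinates are hypothetical values chosen for illustration only.

```python
# Anchor boxes whose occupied-confidence score exceeds a threshold are the
# K occupied anchor boxes; their coordinates mark occupied parking spaces,
# and the remaining detected spaces can be reported as free.

def occupied_anchor_coords(anchor_coords, occupied_scores, threshold=0.5):
    """Return coordinates of anchors judged to hold an occupied space."""
    return [coord for coord, s in zip(anchor_coords, occupied_scores)
            if s > threshold]

coords = [(10, 10), (30, 10), (50, 10)]  # centers of M target anchor boxes
occupied_scores = [0.9, 0.2, 0.7]        # probability each space is occupied
print(occupied_anchor_coords(coords, occupied_scores))
# -> [(10, 10), (50, 10)]; (30, 10) would be a free space
```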
In one possible design, in one implementation of another aspect of an embodiment of the present application,
the acquiring unit is further configured to acquire a sample feature map corresponding to the sample image, where the sample feature map includes N anchor point frames, the sample image corresponds to labeling information, and the labeling information includes M sample key point coordinates of the sample object;
the determining unit is also used for determining a sample anchor point frame from the N anchor point frames according to the M sample key point coordinates;
the acquisition unit is further used for acquiring the confidence score and the truth label of the sample anchor frame and constructing a first loss function according to the confidence score and the truth label of the sample anchor frame;
the processing unit is also used for respectively calculating the offset between each sample key point coordinate in the M sample key point coordinates and the center point coordinate of the sample anchor point frame to obtain M sample offsets;
the processing unit is further used for constructing a second loss function according to the M sample offsets and the M sample key point coordinates;
and the processing unit is also used for updating the model parameters of the image detection model according to the first loss function and the second loss function.
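The two training losses can be sketched as below. The embodiment only names a "first" and "second" loss function; the concrete choices here (binary cross-entropy for the confidence score against its truth label, and an L1 penalty on the keypoint offsets relative to the anchor center) are assumptions for illustration.

```python
import math

# First loss (assumed BCE): penalizes the sample anchor box's confidence
# score against its truth label (1 if it actually holds a parking space).
def first_loss(confidence, truth_label):
    eps = 1e-7
    c = min(max(confidence, eps), 1 - eps)  # clamp for numerical safety
    return -(truth_label * math.log(c) + (1 - truth_label) * math.log(1 - c))

# Second loss (assumed L1): compares predicted offsets with the offsets
# between each labeled sample keypoint and the anchor box's center point.
def second_loss(pred_offsets, keypoints, anchor_center):
    total = 0.0
    for (px, py), (kx, ky) in zip(pred_offsets, keypoints):
        tx, ty = kx - anchor_center[0], ky - anchor_center[1]
        total += abs(px - tx) + abs(py - ty)
    return total / len(keypoints)

cls = first_loss(0.9, 1)  # small loss: confident true positive
reg = second_loss([(1.0, 1.0)], [(17.0, 17.5)], (16.0, 16.0))
total = cls + reg  # model parameters are updated against the combined loss
print(round(cls, 4), round(reg, 4))  # -> 0.1054 0.5
```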
In a possible design, in an implementation manner of another aspect of the embodiment of the present application, the obtaining unit may be specifically configured to:
inputting a sample image into an image detection model, and performing feature extraction on the sample image through the image detection model to obtain a first sample feature map and a second sample feature map, wherein the first sample feature map comprises N first anchor point frames, and the second sample feature map comprises N second anchor point frames;
the determining unit may specifically be configured to: and determining a first sample anchor point frame from the N first anchor point frames according to the M sample key point coordinates, and determining a second sample anchor point frame from the N second anchor point frames.
In one possible design, in one implementation of another aspect of an embodiment of the present application,
the processing unit is further used for calculating a first sample overlapping degree between the first sample anchor frame and the first sample detection frame and calculating a second sample overlapping degree between the second sample anchor frame and the second sample detection frame;
the determining unit is further used for determining a target sample anchor point frame from the first sample anchor point frame and the second sample anchor point frame according to the first sample overlapping degree and the second sample overlapping degree;
the obtaining unit may specifically be configured to: and acquiring the confidence score and the truth label of the target sample anchor box, and constructing a first loss function according to the confidence score and the truth label of the target sample anchor box.
In one possible design, in one implementation of another aspect of an embodiment of the present application,
the acquisition unit is also used for acquiring the confidence score of the sample auxiliary point of the sample anchor frame and the truth label of the auxiliary point;
the processing unit is further used for constructing a third loss function according to the sample auxiliary point confidence score and the auxiliary point truth value label of the sample anchor point frame;
the processing unit may specifically be configured to: and updating the model parameters of the image detection model according to the first loss function, the second loss function and the third loss function.
In one possible design, in one implementation of another aspect of an embodiment of the present application,
the acquisition unit is further used for acquiring the occupied confidence score and the occupied truth label of the sample anchor point frame according to the position of the third sample detection frame in the sample characteristic diagram;
the processing unit is further used for constructing a fourth loss function according to the occupied confidence score and the occupied truth label of the sample anchor box;
the processing unit may specifically be configured to: and updating the model parameters of the image detection model according to the first loss function, the second loss function and the fourth loss function.
Another aspect of the present application provides a computer device, including: a memory, a transceiver, a processor, and a bus system;
wherein, the memory is used for storing programs;
the processor, when executing the program in the memory, implements the methods as described above;
the bus system is used for connecting the memory and the processor so as to enable the memory and the processor to communicate.
Another aspect of the present application provides a computer-readable storage medium having stored therein instructions, which when executed on a computer, cause the computer to perform the method of the above-described aspects.
According to the technical scheme, the embodiment of the application has the following advantages:
By acquiring a target feature map that corresponds to an image to be detected and comprises N anchor boxes, a confidence score indicating the probability that each anchor box contains a target object can be calculated. M target anchor boxes can then be determined from the N anchor boxes according to the confidence scores, keypoint calculation is performed on each of the M target anchor boxes to obtain M keypoint coordinates, the position of the parking space in the image to be detected can be determined according to the M keypoint coordinates, and the position of the parking space in the image to be detected is pushed to a target terminal device. In this way, M target anchor boxes containing the features of the target object can be screened out of the N anchor boxes by obtaining, for each anchor box in the target feature map, a confidence score for containing the target object. The M keypoint coordinates within the M target anchor boxes can then be calculated, and the position of the parking space can be drawn directly on the image to be detected from the M keypoint coordinates. There is no need to estimate the depth of the parking space from the prediction result of the parking space entry point combined with prediction information assumed from artificial prior knowledge, so the prediction error of the entry-point prediction result and the influence of artificial prior knowledge can be avoided. Since the M target anchor boxes are determined directly on the actual image and the keypoint coordinates are calculated directly, the actual parking space situation can be fitted better, thereby improving the parking efficiency.
Drawings
fig. 1 is a schematic structural diagram of an image data control system in an embodiment of the present application;
Fig. 2 is a flowchart of an embodiment of a method for detecting a parking space in an embodiment of the present application;
fig. 3 is a flowchart of another embodiment of a parking space detection method in the embodiment of the present application;
fig. 4 is a flowchart of another embodiment of a parking space detection method in the embodiment of the present application;
fig. 5 is a flowchart of another embodiment of a parking space detection method in the embodiment of the present application;
fig. 6 is a flowchart of another embodiment of a parking space detection method in the embodiment of the present application;
fig. 7 is a flowchart of another embodiment of a parking space detection method in the embodiment of the present application;
fig. 8 is a flowchart of another embodiment of a parking space detection method in the embodiment of the present application;
fig. 9 is a flowchart of another embodiment of a parking space detection method in the embodiment of the present application;
fig. 10 is a flowchart of another embodiment of a parking space detection method in the embodiment of the present application;
fig. 11 is a flowchart of another embodiment of a parking space detection method in the embodiment of the present application;
fig. 12 is a flowchart of another embodiment of a parking space detection method in the embodiment of the present application;
fig. 13 is a schematic flow chart illustrating a method for detecting a parking space according to an embodiment of the present application;
fig. 14 is a schematic view of an anchor point box of the parking space detection method in the embodiment of the present application;
fig. 15(a) is a schematic diagram of a labeling key point of the parking space detection method in the embodiment of the present application;
fig. 15(b) is a schematic diagram of another labeled key point of the parking space detection method in the embodiment of the present application;
fig. 16(a) is a schematic view of a parking space detection effect of the parking space detection method in the embodiment of the present application;
fig. 16(b) is a schematic view of another parking space detection effect of the parking space detection method in the embodiment of the present application;
fig. 17(a) is a schematic diagram illustrating an indoor parking space detection effect of the parking space detection method in the embodiment of the present application;
fig. 17(b) is a schematic view illustrating another indoor parking space detection effect of the parking space detection method in the embodiment of the present application;
fig. 17(c) is a schematic view of another indoor parking space detection effect of the parking space detection method in the embodiment of the present application;
fig. 18(a) is a schematic view of an outdoor parking space detection effect of the parking space detection method in the embodiment of the present application;
fig. 18(b) is a schematic view of another outdoor parking space detection effect of the parking space detection method in the embodiment of the present application;
fig. 18(c) is a schematic view of another outdoor parking space detection effect of the parking space detection method in the embodiment of the present application;
fig. 19 is a schematic view of an embodiment of a parking space detection apparatus according to the embodiment of the present application;
FIG. 20 is a schematic diagram of an embodiment of a computer device in the embodiment of the present application.
Detailed Description
The embodiments of the present application provide a parking space detection method, apparatus, device, and storage medium, which screen M target anchor boxes by the confidence scores of the anchor boxes and calculate M keypoint coordinates within the M target anchor boxes, so as to draw the position of a parking space directly on the image to be detected. This avoids the prediction error of the entry-point prediction result and the influence of artificial prior knowledge, better fits the real parking space situation, and thereby improves the parking efficiency.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims and drawings of the present application, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "corresponding" and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
With the rapid development of information technology, cloud technology (Cloud technology) has gradually entered every aspect of people's lives. Cloud technology is a general term for the network, information, integration, management-platform, and application technologies applied on the basis of the cloud computing business model; it can form a resource pool that is used on demand and is flexible and convenient, and cloud computing technology will become an important support for it. Background services of technical network systems, such as video websites, picture websites, and portal websites, require a large amount of computing and storage resources. With the continued development of the internet industry, each article may come to have its own identification mark that needs to be transmitted to a background system for logical processing; data at different levels are processed separately, and all kinds of industrial data need strong system background support, which can only be realized through cloud computing.
Cloud security (Cloud Security) is a generic name for the security software, hardware, users, organizations, and security cloud platforms applied on the basis of the cloud computing business model. Cloud security integrates emerging technologies and concepts such as parallel processing, grid computing, and the judgment of unknown virus behavior. Abnormal software behavior in the network is monitored through a large number of meshed clients; the latest information on trojans and malicious programs on the internet is obtained and sent to the server for automatic analysis and processing, and the virus and trojan solutions are then distributed to every client. The parking space detection method provided by the embodiments of the present application can be implemented through cloud computing technology and cloud security technology.
It should be understood that the parking space detection method provided by the present application can be applied in fields such as cloud technology, artificial intelligence, and intelligent traffic, where detecting the position of parking spaces supports scenarios such as vehicle parking tasks or parking lot management. As one example, detecting the position of a parking space helps a user find an empty space better and faster, for instance by helping an intelligent vehicle complete an automatic parking task. As another example, detecting the position of parking spaces assists parking lot personnel in maintaining and managing the spaces. As yet another example, the position and number of free parking spaces can be obtained by detection and pushed to a display device of the parking lot to inform users of the parking situation there. In all of these scenarios, position detection is commonly completed by predicting the entry point of the parking space based on deep learning and performing depth estimation of the parking space in combination with artificial prior knowledge to generate the position information; however, because of the prediction error of the entry-point result and the influence of artificial prior knowledge, the detection accuracy is not high and the parking efficiency is reduced.
In order to solve the above problem, the present application proposes a parking space detection method, which is applied to the image data control system shown in fig. 1. Referring to fig. 1, a schematic structural diagram of the image data control system in an embodiment of the present application, the server obtains a target feature map, comprising N anchor boxes, that corresponds to an image to be detected provided by the terminal device. A confidence score indicating the probability that each anchor box contains a target object can be calculated; M target anchor boxes can then be determined from the N anchor boxes according to the confidence scores, and keypoint calculation is performed on each of the M target anchor boxes to obtain M keypoint coordinates. The position of the parking space in the image to be detected can then be determined according to the M keypoint coordinates, and the position of the parking space in the image to be detected is pushed to the target terminal device.
In this way, M target anchor boxes containing the features of the target object can be screened out of the N anchor boxes by obtaining, for each anchor box in the target feature map corresponding to the image to be detected, a confidence score for containing the target object. The M keypoint coordinates within the M target anchor boxes can then be calculated, and the position of the parking space can be drawn directly on the image to be detected from the M keypoint coordinates. There is no need to estimate the depth of the parking space from the prediction result of the parking space entry point combined with prediction information assumed from artificial prior knowledge, so the prediction error of the entry-point prediction result and the influence of artificial prior knowledge can be avoided. Since the M target anchor boxes are determined directly on the actual image and the keypoint coordinates are calculated directly, the actual parking space situation can be fitted better, thereby improving the parking efficiency.
It is understood that fig. 1 only shows one terminal device, and in an actual scene, a greater variety of terminal devices may participate in the data processing process, where the terminal devices include, but are not limited to, a mobile phone, a computer, an intelligent voice interaction device, an intelligent household appliance, a vehicle-mounted terminal, and the specific number and variety depend on the actual scene, and are not limited herein. In addition, fig. 1 shows one server, but in an actual scenario, a plurality of servers may participate, and particularly in a scenario of multi-model training interaction, the number of servers depends on the actual scenario, and is not limited herein.
It should be noted that in this embodiment, the server may be an independent physical server, may also be a server cluster or a distributed system formed by a plurality of physical servers, and may also be a cloud server that provides basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), and a big data and artificial intelligence platform. The terminal device and the server may be directly or indirectly connected through a wired or wireless communication manner, and the terminal device and the server may be connected to form a block chain network, which is not limited herein.
In order to solve the above problem, the present application proposes a parking space detection method, which is generally executed by a server or a terminal device; accordingly, the parking space detection apparatus is generally disposed in the server or the terminal device.
Referring to fig. 2, a method for detecting a parking space in the present application is described below, where an embodiment of the method for detecting a parking space in the present application includes:
in step S101, a target feature map corresponding to an image to be detected is obtained, where the target feature map includes N anchor point frames, and N is an integer greater than 1;
in this embodiment, when the image to be detected is acquired through the camera, feature extraction may be performed on the image to be detected to obtain a target feature map including N anchor point frames.
The image to be detected may be a frame captured by a vehicle-end front-view camera or surround-view fisheye camera, or by a parking-lot camera or road camera, covering various parking-space scenes, and it may contain one or more parking spaces. An anchor box is a grid cell used to detect the target object in the image to be detected; it is a pixel region over which predictions are made and is typically square. To detect different types of target objects, several grids of different shapes (different sizes and different aspect ratios) can be set at the same position; in other words, anchor boxes at several different scales can be set at the same position of the target feature map to predict the target object in its various states at that position.
Specifically, as shown in fig. 13, when the image to be detected is acquired through the vehicle-end front-view camera or the surround-view fisheye camera of a vehicle on which a relevant detection terminal device is installed (for example, a visual parking-space detection vehicle system, without specific limitation here), the acquired image may be sent to that detection terminal device, which performs feature extraction to obtain the corresponding feature map and divides the feature map by anchor boxes of a preset size, yielding a target feature map including N anchor boxes. Alternatively, when no relevant detection terminal device is installed on the vehicle, the acquired image to be detected may be sent to a cloud server, which performs the feature extraction and divides the resulting feature map by anchor boxes of a preset size to obtain the target feature map including N anchor boxes. The target feature map may also be obtained in other ways, without specific limitation here.
For example, as shown in fig. 14, assuming a 3 × 5 (height × width) feature map is obtained after feature extraction is performed on the image to be detected, the feature map can be divided into 3 × 5 cells by anchor boxes (e.g., the grid formed by black lines in fig. 14), yielding a target feature map including 15 anchor boxes.
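As a minimal illustrative sketch (not part of the patent; function names and the stride value are assumptions), dividing a feature map into an H × W grid of anchor cells and computing each cell's center can look like:

```python
# Hypothetical sketch: enumerate the anchor-box grid of a feature map and
# compute each anchor's center, as in the 3 x 5 example above.
def anchor_centers(height, width, stride=1.0):
    """Return a (row, col) -> (cy, cx) mapping for a height x width anchor grid."""
    centers = {}
    for row in range(height):
        for col in range(width):
            # Each cell's center lies half a stride from its top-left corner.
            centers[(row, col)] = ((row + 0.5) * stride, (col + 0.5) * stride)
    return centers

grid = anchor_centers(3, 5)
print(len(grid))     # 15 anchor boxes, matching the 3 x 5 example
print(grid[(0, 0)])  # (0.5, 0.5)
```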
In step S102, calculating a confidence score of each anchor box, where the confidence score is used to indicate a probability value that the anchor box contains the target object;
in this embodiment, after the target feature map including N anchor frames is acquired, the confidence level of each anchor frame may be calculated to acquire the confidence level score of each anchor frame.
The confidence score may be expressed as a number from 0 to 100 or from 0 to 1, set according to the actual application requirement. It indicates the probability that a given label is predicted correctly: the larger the score, the more likely the image contains that label. For example, if the confidence scores of several labels on a picture are beach 83, bag 17, sea 12, household articles 12, and other articles 11, the picture very likely contains a beach, while the probabilities of it containing bags, the sea, or household articles are relatively low.
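Interpreting such scores simply means ranking labels by confidence; a tiny sketch mirroring the example above (scores on a 0 to 100 scale, values taken from the text):

```python
# Rank labels by confidence score, following the beach/bag/sea example.
label_scores = {"beach": 83, "bag": 17, "sea": 12,
                "household articles": 12, "other articles": 11}

ranked = sorted(label_scores.items(), key=lambda kv: kv[1], reverse=True)
print(ranked[0])  # ('beach', 83): the label the picture most likely contains
```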
Specifically, after the target feature map including N anchor boxes is obtained, the confidence of each anchor box may be calculated by a conventional confidence algorithm, such as a sampling-based confidence algorithm, to obtain a confidence score for each anchor box. The higher an anchor box's confidence score, the higher the probability that the anchor box contains the target object, where the target object may specifically be a parking space, such as a temporary parking space or a fixed parking space; this is not specifically limited here.
For example, as shown in fig. 14, given the target feature map of 15 anchor boxes with the upper-left corner of the map as the origin (0,0), the anchor box at (1,2) may be indexed with a confidence score of 0.79, the anchor box at (2,2) with a score of 0.91, the anchor box at (3,2) with a score of 0.86, and the anchor box at (4,2) with a score of 0.66.
In step S103, determining M target anchor boxes from the N anchor boxes according to the confidence score, where M is an integer greater than 1 and less than N;
in this embodiment, after the confidence score of each anchor box is obtained, and since a higher confidence score means a higher probability that the anchor box contains the target object, M target anchor boxes containing target-object features may be selected from the N anchor boxes according to the obtained confidence scores, so that the position information of the target object can subsequently be estimated accurately from the M target anchor boxes.
Specifically, after the confidence score of each anchor box is obtained, the M target anchor boxes may be determined from the N anchor boxes as follows: sort the obtained confidence scores from largest to smallest, take the top N/2 or top (N+1)/2 scores according to the median, keep the scores among them that exceed a confidence threshold, for example 0.5, and use the corresponding anchor boxes as the M target anchor boxes. Alternatively, the confidence score of each anchor box may be compared directly with a confidence threshold such as 0.5, and the anchor boxes whose scores exceed the threshold are used as the M target anchor boxes. Other selection methods may also be used, without limitation here.
For example, as shown in fig. 14, given the target feature map of 15 anchor boxes with the upper-left corner as the origin (0,0), suppose the confidence scores are 0.31 at (0,2), 0.79 at (1,2), 0.91 at (2,2), 0.86 at (3,2), and 0.66 at (4,2), and the confidence threshold is 0.6. Comparing each indexed score with the threshold of 0.6 yields 4 target anchor boxes whose scores exceed the threshold: the anchor boxes at (1,2), (2,2), (3,2), and (4,2) shown in fig. 14.
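The thresholding variant of step S103 can be sketched directly from the example above (a simple illustration, not the patent's implementation; coordinates and scores are the ones quoted in the text):

```python
# Keep anchor boxes whose confidence score exceeds the threshold (0.6 here).
scores = {
    (0, 2): 0.31, (1, 2): 0.79, (2, 2): 0.91, (3, 2): 0.86, (4, 2): 0.66,
}
confidence_threshold = 0.6

target_anchors = [pos for pos, s in scores.items() if s > confidence_threshold]
print(sorted(target_anchors))
# [(1, 2), (2, 2), (3, 2), (4, 2)]: the 4 target anchor boxes
```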
In step S104, performing a key point calculation on each of the M target anchor frames to obtain M key point coordinates;
in this embodiment, because an ordinary unoccluded parking space can be described, as shown in fig. 15(a), by its two entry points (e.g., entry_0 and entry_1 in fig. 15(a)) and two side-line points (e.g., tail point_0 and tail point_1 in fig. 15(a)), after the M target anchor boxes that may contain target-object features are acquired, a keypoint calculation may be performed on each of the M target anchor boxes to obtain M keypoint coordinates.
Specifically, after the M target anchor boxes that may contain target-object features are acquired, the center-point coordinates of each target anchor box may be calculated from its pixel position in the target feature map, and the coordinates of each keypoint may be estimated from the offset between the keypoints the anchor box may contain and the anchor box's center-point coordinates, thereby obtaining the M keypoint coordinates.
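A hedged sketch of this center-plus-offset decoding (the offset values and stride are invented for illustration and are not from the patent):

```python
# Recover keypoint image coordinates from an anchor's center plus the
# offsets predicted for that anchor.
def decode_keypoints(center, offsets, stride):
    """center: (cx, cy) in feature-map cells; offsets: [(dx, dy), ...] in
    cells; stride: feature-map-to-image scale factor."""
    cx, cy = center
    return [((cx + dx) * stride, (cy + dy) * stride) for dx, dy in offsets]

# An anchor centered at cell (2.5, 1.5) on a feature map with stride 80.
kps = decode_keypoints((2.5, 1.5),
                       [(-0.4, -0.3), (0.4, -0.3), (0.4, 0.3), (-0.4, 0.3)],
                       80)
print(kps)  # four corner keypoints in image pixels
```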
In step S105, the position of the parking space in the image to be detected is determined according to the M key point coordinates.
In this embodiment, after the M keypoint coordinates are obtained, the corresponding M keypoints may be located in the image to be detected according to those coordinates, and the position of the target object (the parking space) may be drawn on the image through the M keypoints; that is, the position of the parking space in the image to be detected is obtained.
Specifically, after the M keypoint coordinates are obtained, the position of the parking space in the image to be detected may be determined as follows: locate the M corresponding keypoints in the image through their coordinates, connect the M keypoints with edges to form a polygon, and draw that polygon on the image as the position of the parking space, where the polygon may specifically be a quadrangle, a pentagon, or the like. Alternatively, the area enclosed by the M keypoints may be rendered on the image with a drawing tool according to the M keypoint coordinates and used as the position of the parking space. Other manners of determining the position of the parking space may also be used; this is not particularly limited here.
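Connecting the keypoints into a closed polygon can be sketched as follows (an illustration with invented coordinates; the shoelace-formula area is added only as a sanity check and is not part of the patent):

```python
# Connect M keypoints in order into a closed polygon (the parking-space
# outline to draw on the image), and compute its area with the shoelace formula.
def polygon_edges(points):
    return [(points[i], points[(i + 1) % len(points)]) for i in range(len(points))]

def polygon_area(points):
    area = 0.0
    for (x1, y1), (x2, y2) in polygon_edges(points):
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

corners = [(168, 96), (232, 96), (232, 144), (168, 144)]  # a rectangular slot
print(len(polygon_edges(corners)))  # 4 connecting edges
print(polygon_area(corners))        # 64 * 48 = 3072.0
```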
It can be understood that after the position of the parking space in the image to be detected is acquired, it can be color-rendered or otherwise visually processed to enrich the visual effect. Further, the position of the parking space can be pushed to a vehicle equipped with the relevant detection terminal device, to a display device of a parking lot, and the like, helping a user find a free parking space better and faster, helping an intelligent vehicle complete an automatic parking task, or helping a parking-lot manager maintain and manage the parking spaces.
In the embodiment of the application, a method for detecting a parking space is provided. In the above manner, by obtaining, for the target feature map corresponding to the image to be detected, a confidence score representing the probability that each anchor box contains the target object, M target anchor boxes containing target-object features can be screened out from the N anchor boxes, and M keypoint coordinates in the M target anchor boxes can then be calculated. The position of the parking space can be drawn directly on the image to be detected from the M keypoint coordinates, without inferring the depth of the parking space from the predicted entry points combined with prediction information assumed from artificial prior knowledge, so the prediction error of the entry-point result and the influence of the artificial prior knowledge are avoided. Because the M target anchor boxes are determined directly on the actual image and the keypoint coordinates are calculated directly, the result better fits the real parking-space situation, thereby improving parking efficiency and reducing detection error.
Optionally, on the basis of the embodiment corresponding to fig. 2, in another optional embodiment of the parking space detection method provided in the embodiment of the present application, as shown in fig. 3, the target feature map includes a first feature map and a second feature map, the anchor point frame includes a first anchor point frame and a second anchor point frame, and the target anchor point frame includes a first target anchor point frame and a second target anchor point frame; obtaining a target characteristic diagram corresponding to an image to be detected, comprising:
in step S201, an image to be detected is input to an image detection model, and feature extraction is performed on the image to be detected through the image detection model to obtain a first feature map and a second feature map, where the first feature map includes N first anchor boxes, the second feature map includes N second anchor boxes, the size of the first feature map is smaller than that of the second feature map, and the size of the first anchor box is larger than that of the second anchor box;
in step S202, a first confidence score of each first anchor box is calculated, and a second confidence score of each second anchor box is calculated;
in step S203, M first target anchor boxes are determined from the N first anchor boxes according to the confidence scores, and M second target anchor boxes are determined from the N second anchor boxes.
In this embodiment, when the image to be detected is acquired by the camera, it is input to an image detection model, which performs feature extraction to obtain a first feature map including N first anchor boxes and a second feature map including N second anchor boxes. A first confidence score can then be calculated for each first anchor box and a second confidence score for each second anchor box, after which M first target anchor boxes are determined from the N first anchor boxes according to the calculated first confidence scores and M second target anchor boxes are determined from the N second anchor boxes according to the calculated second confidence scores. In this way, the same target object (the same parking space) at the same position of the image to be detected can be predicted in various states (with parking spaces densely or sparsely arranged) through first and second feature maps of different sizes and first and second anchor boxes of different sizes, yielding one or more predicted values for the same target object (the same parking space). These predicted values can then be processed further to obtain a unique prediction for the parking space, so the accuracy of parking-space detection can be improved to a certain extent.
The image detection model may be embodied as a depthwise separable convolution network, a depthwise separable convolution network based on a feature pyramid structure, or a convolutional neural network, among other image detection models; this is not limited here.
Specifically, as shown in fig. 13, when the image to be detected is acquired through the camera, it is input to the image detection model, which performs feature extraction to obtain a first feature map including N first anchor boxes and a second feature map including N second anchor boxes, where the first feature map is smaller than the second feature map and the first anchor boxes are larger than the second anchor boxes. For example, suppose feature extraction on the image to be detected yields a 9 × 16 first feature map (size relative to the network input); N first anchor boxes with width and height of 575 and 111 (relative to a 1280 × 720 input) can be set to predict target objects on the first feature map. Similarly, suppose an 18 × 32 second feature map is obtained; N second anchor boxes with width and height of 217 and 42 can be set to predict target objects on the second feature map. Small parking spaces can then be predicted through the dense second feature map and its N second anchor boxes, and large parking spaces through the sparse first feature map and its N first anchor boxes.
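The two-scale setup can be summarized in a small sketch (dictionary and variable names are our own; the map and anchor sizes are the ones quoted in the text, relative to a 1280 × 720 input):

```python
# Coarse 9 x 16 map with large anchors for big slots; finer 18 x 32 map
# with small anchors for small slots.
INPUT_W, INPUT_H = 1280, 720

scales = {
    "coarse": {"feat_hw": (9, 16),  "anchor_wh": (575, 111)},
    "fine":   {"feat_hw": (18, 32), "anchor_wh": (217, 42)},
}

for name, cfg in scales.items():
    fh, fw = cfg["feat_hw"]
    stride = (INPUT_W / fw, INPUT_H / fh)  # input pixels per feature cell
    print(name, "stride:", stride, "anchors per map:", fh * fw)
# coarse: stride (80.0, 80.0), 144 anchors; fine: stride (40.0, 40.0), 576 anchors
```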
Further, when the first feature map including N first anchor boxes and the second feature map including N second anchor boxes are obtained, since the target-object features contained in first and second anchor boxes of different sizes differ, the confidence of each first anchor box may be calculated through a conventional confidence algorithm to obtain the first confidence score of each first anchor box, and similarly the confidence of each second anchor box may be calculated to obtain the second confidence score of each second anchor box.
Further, after the N first confidence scores and N second confidence scores are obtained, and since a higher confidence score means a higher probability that the anchor box contains the target object, the M first target anchor boxes may be determined from the N first anchor boxes according to the calculated first confidence scores and the M second target anchor boxes from the N second anchor boxes according to the calculated second confidence scores. Determining the M first target anchor boxes and the M second target anchor boxes is similar to determining M target anchor boxes from N anchor boxes according to the confidence scores in step S103, and details are not repeated here.
Optionally, on the basis of the embodiment corresponding to fig. 3, in another optional embodiment of the parking space detection method provided in the embodiment of the present application, as shown in fig. 4, determining M target anchor boxes from N anchor boxes according to the confidence score includes:
in step S301, a first anchor point frame with a first confidence score greater than a first confidence threshold is determined as a first target anchor point frame;
in step S302, a second anchor point box with a second confidence score greater than a second confidence threshold is determined as a second target anchor point box.
In this embodiment, after the first confidence score of each first anchor box and the second confidence score of each second anchor box are obtained, each first confidence score may be compared with the first confidence threshold to obtain the first target anchor boxes whose first confidence scores exceed it, and similarly each second confidence score may be compared with the second confidence threshold to obtain the second target anchor boxes whose second confidence scores exceed it. Predicting the same target object (the same parking space) in various states (parking spaces densely or sparsely arranged) at the same position of the image to be detected, through first and second feature maps of different sizes and first and second anchor boxes of different sizes, yields one or more predicted values for that target object; these can be processed further to obtain a unique prediction for the parking space, improving the accuracy of parking-space detection to a certain extent.
Specifically, after the N first confidence scores of the N first anchor boxes are acquired, a preset first confidence threshold may be obtained; it is set according to actual application requirements and is usually 0.5, without specific limitation here. Each first confidence score is then compared with the first confidence threshold, and the M first target anchor boxes whose first confidence scores exceed the threshold are screened out of the N first anchor boxes. Similarly, after the N second confidence scores of the N second anchor boxes are acquired, a preset second confidence threshold may be obtained (also set per application requirements, for example 0.5, without specific limitation here); each second confidence score is compared with it, and the M second target anchor boxes whose second confidence scores exceed it are screened out of the N second anchor boxes, so that the position of the parking space in the image to be detected can subsequently be predicted from the M first target anchor boxes and the M second target anchor boxes.
Optionally, on the basis of the embodiment corresponding to fig. 3, in another optional embodiment of the parking space detection method provided in the embodiment of the present application, as shown in fig. 4, the key point coordinates include a first key point coordinate and a second key point coordinate; performing key point calculation on each target anchor point frame in the M target anchor point frames respectively to obtain M key point coordinates, including:
in step S401, performing keypoint calculation on each of the M first target anchor frames, to obtain M first keypoint coordinates;
in step S402, a keypoint calculation is performed on each of the M second target anchor frames, so as to obtain M second keypoint coordinates.
In this embodiment, when the target object's keypoints are predicted through first and second feature maps of different sizes and first and second anchor boxes of different sizes, the pixel positions of the anchor boxes differ between the two feature maps, as do the pixel positions of the target object itself. To better describe the position and size of the same target object (parking space) in the image to be detected, after the M first target anchor boxes and the M second target anchor boxes are obtained, a keypoint calculation may be performed on each of the M first target anchor boxes to obtain M first keypoint coordinates, and similarly on each of the M second target anchor boxes to obtain M second keypoint coordinates. These can then be processed further to obtain a unique set of M keypoint coordinates, giving the unique position of the target object (parking space) on the image to be detected and improving the accuracy of parking-space detection to a certain extent.
Specifically, after the M first target anchor boxes and the M second target anchor boxes are acquired, the center-point coordinates of each first target anchor box may be calculated from its pixel position on the first feature map, and the first keypoint coordinates may be estimated from the offset between the keypoints each first target anchor box may contain and its center-point coordinates, yielding the M first keypoint coordinates. The M second keypoint coordinates may be obtained in a manner similar to step S401, which is not repeated here. The position of the parking space in the image to be detected can then be determined from the M first keypoint coordinates and the M second keypoint coordinates.
Optionally, on the basis of the embodiment corresponding to fig. 4, in another optional embodiment of the parking space detection method provided in the embodiment of the present application, as shown in fig. 5, the first feature map further includes a first detection frame, the second feature map further includes a second detection frame, and both the first detection frame and the second detection frame are used for indicating the target object; determining the position of the parking space in the image to be detected according to the M keypoint coordinates includes:
in step S501, a first bounding box is generated according to the M first keypoint coordinates, and a second bounding box is generated according to the M second keypoint coordinates;
in step S502, a first degree of overlap between the first bounding box and the first detection frame is calculated, and a second degree of overlap between the second bounding box and the second detection frame is calculated;
in step S503, determining a target bounding box from the first bounding box and the second bounding box according to the first degree of overlap and the second degree of overlap;
in step S504, the position of the parking space is generated in the image to be detected from the target bounding box.
In this embodiment, when the target object's keypoints are predicted through first and second feature maps of different sizes and first and second anchor boxes of different sizes, the differing densities mean the fitted M first keypoint coordinates and M second keypoint coordinates are not consistent, so the positions of two mutually intersecting, overlapping parking spaces can result (such as the parking space indicated by the arrow in fig. 16(a)); that is, one target object (parking space) corresponds to two predicted values. To avoid one target object (parking space) corresponding to multiple predicted values, after the M first keypoint coordinates and the M second keypoint coordinates are obtained, a first bounding box may be generated from the M first keypoint coordinates and a second bounding box from the M second keypoint coordinates. Then, by calculating the first degree of overlap between the first bounding box and the first detection frame containing the target object, and the second degree of overlap between the second bounding box and the second detection frame containing the target object, the more accurate keypoint coordinates can be screened out of the M first and M second keypoint coordinates, so the unique position of the parking space is obtained and the accuracy of parking-space detection is improved to a certain extent.
The first bounding box may be specifically represented as a frame surrounded by M first keypoints corresponding to the M first keypoint coordinates on the first feature map, and the first bounding box is used to indicate a selected region of the M first keypoint coordinates on the first feature map. The second bounding box may be specifically represented as a frame surrounded by M second keypoints corresponding to the M second keypoints coordinates on the second feature map, and the second bounding box is used to indicate a selected region of the M second keypoints coordinates on the second feature map. The first detection frame is a circumscribed frame of the target object on the first feature map, and the first detection frame may be embodied as a border frame or a maximum circumscribed rectangle frame. The second detection frame is a circumscribed frame of the target object on the second feature map.
Specifically, after the M first keypoint coordinates and the M second keypoint coordinates are obtained, the M first keypoints corresponding to the M first keypoint coordinates on the first feature map may be joined to enclose the first bounding box, and similarly the M second keypoints corresponding to the M second keypoint coordinates on the second feature map may be joined to enclose the second bounding box.
Further, before the first bounding box and the second bounding box are obtained, a first detection frame containing the target object on the first feature map and a second detection frame containing the target object on the second feature map may be obtained by an image segmentation algorithm. After the bounding boxes are obtained, the first degree of overlap between the first bounding box and the first detection frame containing the target object may be calculated, specifically as the Intersection over Union (IoU) between the first bounding box and the first detection frame, that is, the ratio of their intersection to their union; other algorithms, such as an area calculation formula, may also be used, without particular limitation here. Similarly, the second degree of overlap between the second bounding box and the second detection frame containing the target object may be calculated in a manner similar to the first degree of overlap, which is not repeated here.
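The IoU overlap measure named above can be sketched for two axis-aligned boxes given as (x1, y1, x2, y2); this is a generic illustration, not the patent's implementation:

```python
# Intersection over Union (IoU) between two axis-aligned boxes.
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 10, 10), (5, 0, 15, 10)))    # 50 / 150 = 1/3
print(iou((0, 0, 10, 10), (20, 20, 30, 30)))  # 0.0 (disjoint boxes)
```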
Further, after the first overlap degree and the second overlap degree are obtained, since the higher the overlap degree is, the higher the similarity is, the first overlap degree and the second overlap degree can be compared, and the bounding box with the larger overlap degree is taken as the target bounding box, and then, the position of the unique parking space corresponding to the target object can be drawn on the image to be detected according to the target bounding box, so that the accuracy of detecting the parking space can be improved to a certain extent.
It will be appreciated that after the M first keypoint coordinates and the M second keypoint coordinates are acquired, in order to avoid the inconsistent coordinates producing multiple predicted values for one target object (parking space) as illustrated in fig. 16(a), the redundant predicted values can be suppressed so that each target object (parking space) corresponds to one predicted value. Specifically, corresponding bounding boxes, namely the first bounding box and the second bounding box, are generated on the image to be detected from the M first keypoint coordinates and the M second keypoint coordinates; a polygon non-maximum suppression algorithm is then applied to find the best target bounding box among the first and second bounding boxes and eliminate the redundant ones, obtaining the one-prediction-per-target-object (parking space) situation illustrated in fig. 16(b). The polygon non-maximum suppression algorithm is similar to the Non-Maximum Suppression (NMS) algorithm in conventional target detection, except that conventional NMS computes over rectangular boxes, while the polygon version of this embodiment operates on closed polygons composed of any 4 or 5 points.
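A hedged sketch of such a suppression step follows. For brevity it measures overlap between the polygons' axis-aligned bounding boxes rather than clipping the 4- or 5-point polygons themselves as the patent describes, and all coordinates and scores are invented:

```python
# Greedy NMS over polygon predictions: keep the highest-scoring prediction,
# drop lower-scoring ones that overlap it too much (bounding-box approximation).
def poly_bbox(poly):
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    return (min(xs), min(ys), max(xs), max(ys))

def bbox_iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def polygon_nms(polys, scores, iou_thresh=0.5):
    order = sorted(range(len(polys)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(bbox_iou(poly_bbox(polys[i]), poly_bbox(polys[j])) < iou_thresh
               for j in keep):
            keep.append(i)
    return keep

# Two heavily overlapping predictions of one slot, plus a separate slot.
polys = [
    [(0, 0), (60, 0), (60, 30), (0, 30)],
    [(5, 2), (63, 2), (63, 31), (5, 31)],
    [(100, 0), (160, 0), (160, 30), (100, 30)],
]
scores = [0.91, 0.86, 0.70]
print(polygon_nms(polys, scores))  # [0, 2]: the duplicate at index 1 is suppressed
```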
It can be understood that, after testing, the accuracy and recall of the parking space detection of the present embodiment, for the indoor parking scene detection results illustrated in figs. 17(a) to 17(c) and the road-surface parking scene detection results illustrated in figs. 18(a) to 18(c), are as shown in table 1:
TABLE 1
[Table 1: accuracy and recall of the parking space detection of the present embodiment; the table content is provided as an image in the original publication.]
Optionally, on the basis of the embodiment corresponding to fig. 2, in another optional embodiment of the parking space detection method provided in the embodiment of the present application, as shown in fig. 6, before determining the position of the parking space in the image to be detected according to the coordinates of the M key points, the method further includes:
in step S601, an auxiliary point confidence score of each target anchor frame of the M target anchor frames is calculated respectively;
in step S602, an auxiliary anchor frame is determined from the M target anchor frames according to the auxiliary point confidence scores;
in step S603, auxiliary point calculation is performed on the auxiliary anchor frame to obtain a target auxiliary point coordinate;
in step S604, the position of the parking space in the image to be detected is determined according to the M key point coordinates and the target auxiliary point coordinates.
In this embodiment, when the target object (e.g. the parking space illustrated in fig. 15(b)) is incomplete or polygonal, the position and size of one target object (e.g. a parking space) can be described by two entry points (e.g. entry_point_0 and entry_point_1 in fig. 15(a)), two side boundary points (e.g. tail_point_0 and tail_point_1 in fig. 15(b)) and one auxiliary point (e.g. the point between tail_point_0 and tail_point_1 in fig. 15(b)). Therefore, before determining the position of the parking space in the image to be detected according to the M key point coordinates, it can further be detected whether the target object needs an auxiliary point. To this end, an auxiliary point confidence score of each target anchor frame in the M target anchor frames is calculated respectively, an auxiliary anchor frame possibly containing the auxiliary point is selected from the M target anchor frames according to the magnitudes of the auxiliary point confidence scores, auxiliary point calculation is then performed on the obtained auxiliary anchor frame to obtain a target auxiliary point coordinate, and the position of the parking space can be accurately described on the image to be detected according to the M key point coordinates and the target auxiliary point coordinate.
Specifically, before determining the position of the parking space in the image to be detected according to the M key point coordinates, confidence calculation may be performed on the auxiliary point labels of the M target anchor frames through a conventional confidence algorithm to obtain the auxiliary point confidence score of each target anchor frame. It can be understood that the higher the auxiliary point confidence score, the higher the probability that the target anchor frame contains an auxiliary point. Therefore, after the auxiliary point confidence score of each target anchor frame is obtained, the scores may be compared pairwise to obtain the maximum auxiliary point confidence score, which may then be compared with the auxiliary point confidence threshold. If the maximum auxiliary point confidence score is greater than or equal to the auxiliary point confidence threshold, the corresponding target anchor frame can be understood to contain an auxiliary point and can be determined as the auxiliary anchor frame. Auxiliary point calculation can then be performed on the auxiliary anchor frame, specifically by obtaining the coordinates of the center point of the auxiliary anchor frame and the offset of the auxiliary point to calculate the target auxiliary point coordinate, and the position of the parking space can be accurately described on the image to be detected according to the M key point coordinates and the target auxiliary point coordinate.
On the contrary, if the maximum auxiliary point confidence score is smaller than the auxiliary point confidence threshold, it can be understood that the target anchor point frame corresponding to the maximum auxiliary point confidence score does not include an auxiliary point, that is, no auxiliary point exists in the M target anchor point frames, and the position of the parking space can be described on the image to be detected according to the M key point coordinates.
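The auxiliary point decoding described above (anchor center plus offset) might be sketched as follows; the scaling of the offset by the anchor width and height mirrors the offset normalization used during training, and all names and values are illustrative assumptions:

```python
# Sketch of decoding the target auxiliary point coordinate from the
# auxiliary anchor box: predicted offset scaled by anchor (w, h) and
# added to the anchor center. Names and values are illustrative.

def decode_auxiliary_point(center, offset, anchor_size):
    """p = g + offset * a, with g the anchor center and a its (w, h)."""
    gx, gy = center
    ox, oy = offset
    aw, ah = anchor_size
    return (gx + ox * aw, gy + oy * ah)

print(decode_auxiliary_point((32.0, 32.0), (0.5, -0.25), (16.0, 16.0)))
# (40.0, 28.0)
```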
For example, as shown in fig. 14, assume a target feature map containing 4 target anchor boxes, with the upper left corner of the target feature map as the origin coordinate (0,0). Assume the auxiliary point confidence score of the target anchor box at (1,2) is 0.33, that at (2,2) is 0.41, that at (3,2) is 0.26, and that at (4,2) is 0.19. With an auxiliary point confidence threshold of 0.5, the maximum auxiliary point confidence score of 0.41 is smaller than the threshold, i.e., none of the 4 target anchor boxes contains an auxiliary point.
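As a sketch, the threshold comparison above could look like this in code, reusing the illustrative scores from the example; the function name and anything beyond the described max-and-threshold rule are assumptions:

```python
# Sketch of auxiliary anchor selection: take the anchor box with the
# maximum auxiliary point confidence score, accept it only if that
# score reaches the threshold. Function name is an assumption.

def select_auxiliary_anchor(scores, threshold=0.5):
    """Return the index of the anchor box most likely to contain an
    auxiliary point, or None when no score reaches the threshold."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best if scores[best] >= threshold else None

# Auxiliary point confidence scores of the four target anchor boxes
# from the example above.
scores = [0.33, 0.41, 0.26, 0.19]
print(select_auxiliary_anchor(scores))  # None: 0.41 < 0.5
```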
Optionally, on the basis of the embodiment corresponding to fig. 2, in another optional embodiment of the parking space detection method provided in the embodiment of the present application, as shown in fig. 7, before determining the position of the parking space in the image to be detected according to the coordinates of the M key points, the method further includes:
in step S701, an occupied confidence score of each target anchor frame of the M target anchor frames is calculated respectively;
in step S702, if K occupied anchor frames are determined from the M target anchor frames according to the occupied confidence scores, the coordinates of the K occupied anchor frames are obtained, where K is an integer greater than or equal to 1 and less than M;
in step S703, the position of the occupied parking space in the image to be detected is determined according to the coordinates of the M key points and the coordinates of the K occupied anchor point frames.
In this embodiment, since in an actual scene a parking space may be occupied by objects other than cars, such as a ground lock, a traffic sign, a bicycle or a motorcycle, it may be detected, before determining the position of the parking space in the image to be detected according to the M key point coordinates, whether the position of the target object (parking space) in the image to be detected is occupied by another object. To this end, an occupied confidence score of each target anchor frame in the M target anchor frames may be calculated, and whether an occupied anchor frame exists among the M target anchor frames may be determined according to the occupied confidence scores. When K occupied anchor frames exist among the M target anchor frames, the coordinates of the K occupied anchor frames on the target feature map may be obtained, and the position of the occupied parking space may be described in the image to be detected according to the M key point coordinates and the coordinates of the K occupied anchor frames, so as to inform the user in time that the current parking space is occupied, or to help parking lot managers deal with the occupying object in time.
Specifically, before determining the position of the parking space in the image to be detected according to the M key point coordinates, confidence calculation may be performed for occupying objects in the M target anchor frames through a conventional confidence algorithm to obtain the occupied confidence score of each target anchor frame. It can be understood that the higher the occupied confidence score, the higher the probability that the target anchor frame contains an occupying object. Therefore, after the occupied confidence score of each target anchor frame is obtained, it may be compared with the occupied confidence threshold; if an occupied confidence score is greater than or equal to the occupied confidence threshold, the corresponding target anchor frame can be understood to contain an occupying object, i.e., that target anchor frame can be determined as an occupied anchor frame. Further, the coordinates of the occupied anchor frame on the target feature map can be mapped onto the image to be detected, and the position of the occupied parking space can be described there according to the M key point coordinates and the coordinates of the occupied anchor frame. On the contrary, if all occupied confidence scores are smaller than the occupied confidence threshold, it can be understood that no target anchor frame contains an occupying object, and the position of the parking space can be described on the image to be detected according to the M key point coordinates alone.
For example, as shown in fig. 14, assume a target feature map containing 4 target anchor boxes, with the upper left corner of the target feature map as the origin coordinate (0,0). Assume the occupied confidence score of the target anchor box at (1,2) is 0.19, that at (2,2) is 0.34, that at (3,2) is 0.29, and that at (4,2) is 0.21. With an occupied confidence threshold of 0.5, each occupied confidence score is smaller than the occupied confidence threshold, that is, none of the 4 target anchor boxes contains an occupying object and the target object (parking space) is not occupied by another object.
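A minimal sketch of the occupied-anchor selection of steps S701-S702, using the illustrative scores from the example above; the per-anchor thresholding follows the text, while the function name is an assumption:

```python
# Sketch of selecting occupied anchor boxes: every anchor whose
# occupied confidence score reaches the threshold is an occupied
# anchor box. Function name is an assumption.

def occupied_anchor_boxes(scores, threshold=0.5):
    """Indices of target anchor boxes determined as occupied."""
    return [i for i, s in enumerate(scores) if s >= threshold]

# Occupied confidence scores of the four target anchor boxes from the
# example above: all below 0.5, so no anchor box is occupied.
print(occupied_anchor_boxes([0.19, 0.34, 0.29, 0.21]))  # []
```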
Optionally, on the basis of the embodiment corresponding to fig. 3, in another optional embodiment of the detection method for a parking space provided in the embodiment of the present application, as shown in fig. 8, the method further includes:
in step S801, a sample feature map corresponding to a sample image is obtained, where the sample feature map includes N anchor point frames, the sample image corresponds to labeling information, and the labeling information includes M sample key point coordinates of the sample object;
in this embodiment, after the sample image is acquired, feature extraction may be performed on the acquired sample image to acquire a sample feature map including N anchor frames.
The sample image can specifically be an image obtained by annotating, with a keypoint method, a parking space instance image acquired in an actual parking scene. For a general unobstructed parking space instance image, as shown in fig. 15(a), two entry points (e.g., entry_point_0 and entry_point_1 in fig. 15(a)) and two side boundary points (e.g., tail_point_0 and tail_point_1 in fig. 15(a)) of the parking space may be annotated to obtain 4 sample key point coordinates of the sample object (an unobstructed parking space). Similarly, for an irregular or obstructed parking space instance image, as shown in fig. 15(b), two entry points, two side boundary points (e.g., tail_point_0 and tail_point_1 in fig. 15(b)) and one auxiliary point (e.g., the point between tail_point_0 and tail_point_1 in fig. 15(b)) of the parking space may be annotated to obtain 4 sample key point coordinates and 1 sample auxiliary point coordinate of the sample object. The closed polygon formed by the key points and the auxiliary point can thus completely fit the various parking space shapes that may appear in practice.
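As a purely hypothetical illustration of the annotation scheme described above, the two cases might be represented as follows; the dictionary layout, field names and coordinates are assumptions, not the patent's actual storage format:

```python
# Hypothetical annotation records: a regular parking space has two
# entry points and two side boundary points (a 4-point polygon); an
# irregular one additionally has one auxiliary point (5-point polygon).

regular_space = {
    "entry_point_0": (120.0, 340.0),
    "entry_point_1": (220.0, 345.0),
    "tail_point_0": (115.0, 520.0),
    "tail_point_1": (215.0, 525.0),
    "auxiliary_point": None,  # not needed for a complete rectangle
}

irregular_space = {
    "entry_point_0": (400.0, 300.0),
    "entry_point_1": (500.0, 305.0),
    "tail_point_0": (390.0, 470.0),
    "tail_point_1": (510.0, 430.0),
    "auxiliary_point": (450.0, 490.0),  # closes the 5-point polygon
}

def num_polygon_vertices(space):
    """Vertex count of the closed polygon described by an annotation."""
    pts = [space[k] for k in ("entry_point_0", "entry_point_1",
                              "tail_point_1", "tail_point_0")]
    if space["auxiliary_point"] is not None:
        pts.append(space["auxiliary_point"])
    return len(pts)
```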
Specifically, as shown in fig. 13, the sample image may be obtained from a database or crawled from a big data platform, among other obtaining manners, which are not specifically limited here. Feature extraction may then be performed on the obtained sample image to obtain a corresponding feature map, the feature map is divided according to anchor frames of a preset size, and the sample image is given key point labels and auxiliary point labels, so that a sample feature map including N anchor frames is obtained.
In step S802, a sample anchor frame is determined from the N anchor frames according to the M sample keypoint coordinates;
in this embodiment, after the sample feature map including the N anchor frames is obtained, M sample key point coordinates of the sample object on the sample feature map may be obtained according to the labeling information corresponding to the sample feature map, and then, one sample anchor frame most likely including the sample object may be screened from the N anchor frames according to the M sample key point coordinates.
Specifically, after the sample feature map including the N anchor frames is obtained, the M sample key point coordinates of the sample object on the sample feature map may be obtained according to the labeling information corresponding to the sample feature map, and the center point coordinates of the sample object on the sample feature map may then be calculated from the M sample key point coordinates (the large black point in the (2,2) anchor frame illustrated in fig. 14). It can be understood that the anchor frame containing the center point of the sample object necessarily contains the sample object, so the anchor frame containing the center point coordinates of the sample object is taken as the sample anchor frame.
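A minimal sketch of this selection rule, mapping the centroid of the labeled key points to the anchor cell that contains it; the cell size and coordinates are illustrative assumptions:

```python
# Sketch of picking the sample anchor: the anchor cell containing the
# centroid of the sample keypoints is the sample anchor box.
# Grid cell size is an illustrative assumption.

def sample_anchor_index(keypoints, cell_w, cell_h):
    """Map the centroid of the sample keypoints to a (col, row) anchor
    cell index on the feature map."""
    cx = sum(x for x, _ in keypoints) / len(keypoints)
    cy = sum(y for _, y in keypoints) / len(keypoints)
    return int(cx // cell_w), int(cy // cell_h)

# Four keypoints whose centroid (40, 36) falls in cell (2, 2) when the
# feature map is divided into 16x16-pixel anchor cells.
kps = [(30.0, 28.0), (50.0, 28.0), (30.0, 44.0), (50.0, 44.0)]
print(sample_anchor_index(kps, 16, 16))  # (2, 2)
```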
In step S803, the confidence score and the truth label of the sample anchor frame are obtained, and a first loss function is constructed according to the confidence score and the truth label of the sample anchor frame;
in this embodiment, since the sample anchor frame necessarily includes the sample object, after the sample anchor frame is acquired, the confidence score and the truth label of the sample anchor frame may be calculated, and the first loss function may be constructed according to the confidence score and the truth label of the sample anchor frame, so that the anchor frame including the sample object (parking space) may be better fitted through the first loss function in the following process.
Specifically, after the sample anchor block is obtained, a confidence algorithm may be used to calculate a true value label (e.g., a parking space label) of the sample anchor block and a confidence score corresponding to the true value label, and a first loss function shown in the following formula (1) is constructed according to the confidence score and the true value label of the sample anchor block:
loss_clf = -∑_i clf_gt(i)·log(clf_pred(i))  (1)
where clf_pred is the confidence score of whether a parking space is present, index 0 represents background, index 1 represents a parking space, and clf_gt is a one-hot truth label: [1,0] if there is no parking space and [0,1] if there is a parking space.
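Formula (1) might be computed as in the following numpy sketch; the epsilon clipping is an added numerical-safety assumption not present in the formula:

```python
# Numpy sketch of the confidence loss: cross-entropy between the
# two-class prediction clf_pred (background vs. parking space) and the
# one-hot label clf_gt. The eps clip is an added safety assumption.

import numpy as np

def confidence_loss(clf_pred, clf_gt, eps=1e-12):
    clf_pred = np.clip(np.asarray(clf_pred, dtype=float), eps, 1.0)
    return float(-np.sum(np.asarray(clf_gt) * np.log(clf_pred)))

# A parking space is present ([0, 1]) and the model assigns it 0.9.
print(round(confidence_loss([0.1, 0.9], [0, 1]), 4))  # 0.1054
```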
In step S804, respectively calculating an offset between each sample key point coordinate of the M sample key point coordinates and a center point coordinate of the sample anchor point frame to obtain M sample offsets;
in this embodiment, the position and size of the parking space in the sample image are predicted based on key point offsets, so after the center point coordinates of the sample anchor frame are obtained, the offset between each of the M sample key point coordinates and the center point coordinates of the sample anchor frame can be calculated respectively to obtain the M sample offsets.
Specifically, after the coordinates of the center point of the sample anchor point frame are acquired, the offset between each of the coordinates of the key point of the M samples and the coordinates of the center point of the sample anchor point frame can be calculated through a distance formula, so as to acquire M sample offsets.
In step S805, a second loss function is constructed according to the offsets of the M samples and the coordinates of the key points of the M samples;
in this embodiment, in order to better learn the coordinates of each keypoint, the fitting may be performed by using an offset between the coordinates of each sample keypoint and the coordinates of the center point of the sample anchor point frame, that is, M sample offsets, so after the M sample offsets and the coordinates of the M sample keypoint are obtained, a second loss function may be constructed according to the M sample offsets and the coordinates of the M sample keypoint, so that the coordinates of each keypoint in the anchor point frame may be better fitted by using the second loss function subsequently.
It can be understood that, for sample feature maps of different sizes, the raw key point offset of a small parking space may be much smaller than that of a large parking space. To balance the key point offset computation across sample feature maps of different sizes, the offsets used in this embodiment are therefore expressed relative to the height and width of the anchor frame.
Specifically, after obtaining M sample offsets and M sample keypoint coordinates, a second loss function shown in the following formula (2) may be constructed according to the M sample offsets and the M sample keypoint coordinates:
loss_offset = |offset_gt − offset_pred|  (2)
offset_gt = (p_gt − g)/a,  offset_gt, offset_pred, p_gt, g, a ∈ R²
where a is the width and height of the anchor frame, g is the center point position of the sample anchor frame, offset_gt is the truth value of the coordinate offset, offset_pred is the predicted value of the coordinate offset, and p_gt is the sample key point coordinate of any one of the M sample key points.
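The offset encoding and the loss of formula (2) might be computed as in the following numpy sketch, with illustrative anchor and keypoint values:

```python
# Numpy sketch of formula (2): the ground-truth offset is the keypoint
# position relative to the anchor center, normalized by anchor width
# and height; the loss is the L1 distance to the predicted offset.

import numpy as np

def encode_offset(p_gt, g, a):
    """offset_gt = (p_gt - g) / a, all quantities in R^2."""
    return (np.asarray(p_gt, float) - np.asarray(g, float)) / np.asarray(a, float)

def offset_loss(offset_pred, offset_gt):
    """loss_offset = |offset_gt - offset_pred|, summed over x and y."""
    return float(np.abs(np.asarray(offset_gt) - np.asarray(offset_pred)).sum())

g = (32.0, 32.0)       # anchor-box center (illustrative)
a = (16.0, 16.0)       # anchor-box width and height (illustrative)
p_gt = (40.0, 28.0)    # one sample keypoint (illustrative)
off_gt = encode_offset(p_gt, g, a)            # [0.5, -0.25]
print(round(offset_loss([0.4, -0.25], off_gt), 6))  # 0.1
```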
In step S806, the model parameters of the image detection model are updated according to the first loss function and the second loss function.
Specifically, after the first loss function and the second loss function are obtained, a parameter adjustment operation may be performed on the image detection model; specifically, gradient descent with backpropagation may be adopted to update the model parameters of the image detection model until convergence. Further, sample images may be repeatedly obtained from the database, and the feature extraction operation, the sample anchor frame obtaining operation, the construction of the first and second loss functions, and the model parameter adjustment operation may be repeated on them until the model parameters stabilize, i.e., the image detection model converges, so that the image to be detected can subsequently be processed by the trained image detection model.
Optionally, on the basis of the embodiment corresponding to fig. 8, in another optional embodiment of the parking space detection method provided in the embodiment of the present application, as shown in fig. 9, the sample feature map includes a first sample feature map and a second sample feature map, and the sample anchor block includes a first sample anchor block frame and a second sample anchor block frame; obtaining a sample characteristic diagram corresponding to a sample image, wherein the sample characteristic diagram comprises the following steps:
in step S901, a sample image is input to an image detection model, and feature extraction is performed on the sample image through the image detection model to obtain a first sample feature map and a second sample feature map, where the first sample feature map includes N first anchor point frames and the second sample feature map includes N second anchor point frames;
in step S902, a first sample anchor block is determined from the N first anchor blocks according to the M sample keypoint coordinates, and a second sample anchor block is determined from the N second anchor blocks.
In this embodiment, after a sample image is acquired, it may be input into the image detection model, and feature extraction may be performed on it by the image detection model to acquire a first sample feature map including N first anchor frames and a second sample feature map including N second anchor frames. Through the first and second sample feature maps of different sizes, together with first and second anchor frames of different sizes, a sample anchor frame and a target sample feature map more suitable for detecting the sample object can be screened out, so that the position of the sample object on the sample feature map can be better fitted.
Specifically, after a sample image is obtained, it may be input into the image detection model. The image detection model adopted in this embodiment, shown in table 2, is a detection network built on depth separable convolution combined with a feature pyramid structure, a network structure with a low amount of computation. The last two feature maps of the image detection model shown in fig. 13 may be used as the first sample feature map and the second sample feature map, serving as the detection heads responsible for predicting parking space instance attributes.
TABLE 2
Input size | Operation | Expansion coefficient | Number of convolution kernels | Number of repetitions | Stride | Feature pyramid
144×256×3 | 3×3 convolution | - | 16 | 1 | 2 | -
72×128×16 | Linear bottleneck layer | 1 | 8 | 1 | 1 | -
72×128×8 | Linear bottleneck layer | 6 | 8 | 2 | 2 | -
36×64×16 | Linear bottleneck layer | 6 | 16 | 3 | 2 | -
18×32×24 | Linear bottleneck layer | 6 | 24 | 4 | 2 | √
9×16×32 | Linear bottleneck layer | 6 | 32 | 3 | 1 | -
9×16×32 | Linear bottleneck layer | 6 | 56 | 2 | 1 | √
Table 2 describes the detailed network configuration of the image detection model. The linear bottleneck layer is a basic structure in MobileNetV2; the expansion coefficient, the number of convolution kernels, the number of repetitions, and the stride are parameters of the linear bottleneck layer, and a √ in the feature pyramid column indicates that the corresponding feature map undergoes feature fusion through the feature pyramid. The three feature maps illustrated in fig. 13 are produced by the depth separable convolution backbone, a lightweight MobileNetV2-style network, and the two feature maps illustrated in fig. 13 represent the feature pyramid structure, whose upsampling structure is based on splicing depth separable upsampling with bilinear upsampling and can be used to fuse low-level and high-level semantic information.
It will be appreciated that, after testing, the image detection model runs at 52 MFLOPs, a very low amount of floating point computation, which enables real-time operation on low-end CPU devices.
Further, M first sample key point coordinates may be obtained according to the labeling information of the first feature map, the center point position of the sample object on the first sample feature map may be calculated according to the M first sample key point coordinates, then, a first sample anchor point frame including the center point position of the sample object on the first sample feature map may be screened from the N first anchor point frames, similarly, M second sample key point coordinates may be obtained according to the labeling information of the second feature map, the center point position of the sample object on the second sample feature map may be calculated according to the M second sample key point coordinates, and then, a second sample anchor point frame including the center point position of the sample object on the second sample feature map may be screened from the N second anchor point frames.
Optionally, on the basis of the embodiment corresponding to fig. 9, in another optional embodiment of the detection method for the parking space provided in the embodiment of the present application, as shown in fig. 10, the first feature diagram further includes a first sample detection frame, the second feature diagram further includes a second sample detection frame, and both the first sample detection frame and the second sample detection frame are used for indicating a sample object in the sample image; before obtaining the confidence score and the truth label of the sample anchor box and constructing the first loss function according to the confidence score and the truth label of the sample anchor box, the method further comprises:
in step S1001, a first sample overlap degree between a first sample anchor frame and a first sample detection frame is calculated, and a second sample overlap degree between a second sample anchor frame and a second sample detection frame is calculated;
in step S1002, determining a target sample anchor point frame from the first sample anchor point frame and the second sample anchor point frame according to the first sample overlap degree and the second sample overlap degree;
in step S1003, the confidence score and the truth label of the target sample anchor box are obtained, and a first loss function is constructed according to the confidence score and the truth label of the target sample anchor box.
In this embodiment, as shown in fig. 14, the anchor frames are distributed over the sample feature map so as to divide it uniformly, and the similarity or overlap between the sample object and a sample anchor frame can be characterized by the overlap degree between the detection frame and that sample anchor frame: the greater the overlap degree, the greater the similarity or overlap between the sample object and the sample anchor frame. Therefore, the overlap degrees between the maximum circumscribed frame formed by the sample object (a parking space instance) and the sample anchor frames on the first sample feature map and the second sample feature map can be computed, and the anchor frame with the larger overlap is taken as the target sample anchor frame responsible for predicting the sample object. In this way the dense second sample feature map and second sample anchor frames are responsible for predicting small parking spaces, while the sparse first sample feature map and first sample anchor frames are responsible for predicting large parking spaces. The confidence score and truth label of the target sample anchor frame can then be obtained, and the first loss function constructed according to them.
The first sample detection frame is an outline frame of the sample object on the first sample feature map, and can specifically be a bounding box or a maximum circumscribed rectangle. The second sample detection frame is likewise an outline frame of the sample object on the second sample feature map.
Specifically, a first sample detection frame containing the sample object on the first sample feature map and a second sample detection frame containing the sample object on the second sample feature map may be obtained by an image segmentation algorithm. After the first sample anchor frame containing the center point of the sample object on the first sample feature map and the second sample anchor frame containing the center point of the sample object on the second sample feature map are obtained, the first sample overlap degree between the first sample anchor frame and the first sample detection frame can be calculated. Specifically, it may be calculated as the intersection-over-union between the first sample anchor frame and the first sample detection frame, that is, the ratio of the area of their intersection to the area of their union; other algorithms, such as an area-ratio formula, may also be used, and no specific limitation is made here. The second sample overlap degree between the second sample anchor frame and the second sample detection frame can be calculated in a similar manner, which is not described again here.
Further, after the first sample overlap degree and the second sample overlap degree are obtained, the two may be compared, since a higher overlap degree indicates a higher similarity, and the sample anchor frame with the larger overlap degree is taken as the target sample anchor frame. The confidence score and truth label of the target sample anchor frame may then be obtained, in a manner similar to that of step S803 for the sample anchor frame, which is not described again here, and the first loss function may be constructed according to the confidence score and truth label of the target sample anchor frame.
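The intersection-over-union comparison of steps S1001-S1002 might be sketched as follows for axis-aligned boxes; the (x1, y1, x2, y2) box representation and the helper names are assumptions:

```python
# Sketch of IoU between axis-aligned boxes and of picking the sample
# anchor box with the larger overlap with its detection frame.

def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2-ax1)*(ay2-ay1) + (bx2-bx1)*(by2-by1) - inter
    return inter / union if union > 0 else 0.0

def pick_target_sample_anchor(anchor1, det1, anchor2, det2):
    """Return 1 or 2 for whichever (sample anchor, detection frame)
    pair overlaps more, mirroring steps S1001-S1002."""
    return 1 if iou(anchor1, det1) >= iou(anchor2, det2) else 2
```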
Optionally, on the basis of the embodiment corresponding to fig. 8, in another optional embodiment of the parking space detection method provided in the embodiment of the present application, as shown in fig. 11, the annotation information further includes a sample auxiliary point of the sample object; before updating the model parameters of the image detection model according to the first loss function and the second loss function, the method further includes:
in step S1101, a sample auxiliary point confidence score and an auxiliary point truth label of the sample anchor frame are obtained;
in step S1102, a third loss function is constructed according to the sample auxiliary point confidence score and the auxiliary point truth label of the sample anchor frame;
in step S1103, the model parameters of the image detection model are updated based on the first loss function, the second loss function, and the third loss function.
In this embodiment, since the annotation information further includes the sample auxiliary point of the sample object, the image detection model can be helped to better learn incomplete or polygonal sample objects. Therefore, before updating the model parameters of the image detection model according to the first loss function and the second loss function, a sample auxiliary point confidence score and an auxiliary point truth label of the sample anchor frame may be obtained, and a third loss function may be constructed from them, so that the position of the sample object's auxiliary point on the sample image can subsequently be better fitted by means of the third loss function. The model parameters of the image detection model can then be updated according to the first loss function, the second loss function and the third loss function, obtaining an image detection model with high detection precision.
Specifically, after the sample anchor frame is obtained, a confidence algorithm may be used to calculate the auxiliary point truth label of the sample anchor frame and the confidence score corresponding to it, and a third loss function as shown in the following formula (3) is constructed according to the auxiliary point confidence score and the auxiliary point truth label of the sample anchor frame:
L3 = -Σ_{i∈{0,1}} attach_gt[i] · log(attach_pred[i])    (3)
where attach_pred is the attachment point confidence score, index 0 represents no attachment point, index 1 represents an attachment point, and attach_gt is a one-hot truth label, which is [1, 0] if there is an attachment point and [0, 1] if not.
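The third loss function is built from a two-class attachment score and a one-hot truth label; a standard choice consistent with that description is cross-entropy. The sketch below is an illustration under that assumption (the function name and array shapes are not from the patent):

```python
import numpy as np

def attachment_loss(attach_pred, attach_gt):
    """Cross-entropy between the 2-class attachment point confidence
    scores (attach_pred) and the one-hot truth label (attach_gt)."""
    eps = 1e-12  # guard against log(0)
    pred = np.clip(np.asarray(attach_pred, dtype=float), eps, 1.0)
    gt = np.asarray(attach_gt, dtype=float)
    return float(-np.sum(gt * np.log(pred)))
```

A confident, correct prediction such as attach_pred = [1.0, 0.0] against attach_gt = [1, 0] yields a loss near zero, while less confident predictions are penalised more heavily.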
Optionally, on the basis of the embodiment corresponding to fig. 8, in another optional embodiment of the detection method for a parking space provided in the embodiment of the present application, as shown in fig. 12, the sample feature map further includes a third sample detection frame, where the third sample detection frame is used to indicate an occupied object in the sample image; before updating the model parameters of the image detection model according to the first loss function and the second loss function, the method further comprises:
in step S1201, acquiring an occupied confidence score and an occupied truth label of the sample anchor point frame according to a position of the third sample detection frame in the sample feature map;
in step S1202, a fourth loss function is constructed according to the occupied confidence score and the occupied truth label of the sample anchor box;
in step S1203, the model parameters of the image detection model are updated based on the first loss function, the second loss function, and the fourth loss function.
In this embodiment, in an actual scene a parking space may be occupied by objects other than cars, such as a ground lock, a traffic sign, a bicycle or a motorcycle. Therefore, before the model parameters of the image detection model are updated according to the first loss function and the second loss function, the occupied confidence score and the occupied truth label of the sample anchor frame may be obtained according to the position of the third sample detection frame containing the occupied object in the sample feature map, and a fourth loss function may be constructed from the occupied confidence score and the occupied truth label of the sample anchor frame, so that whether the sample object on the sample image is occupied can subsequently be better fitted by means of the fourth loss function. The model parameters of the image detection model can then be updated according to the first loss function, the second loss function and the fourth loss function, so that an image detection model with high detection precision is obtained.
Specifically, after the sample image is acquired, an image segmentation algorithm may be used to obtain the maximum external frame of the occupied object in the sample image, that is, the third sample detection frame. Then, after the sample anchor frame is obtained, the occupied truth label of the sample anchor frame and the confidence score corresponding to the occupied truth label may be calculated with a confidence algorithm according to the coordinates of the third sample detection frame on the sample feature map, and a fourth loss function as shown in the following formula (4) is constructed according to the occupied confidence score and the occupied truth label of the sample anchor frame:
L4 = -Σ_{i∈{0,1}} clf_gt[i] · log(occupy_pred[i])    (4)
where occupy_pred is the occupied confidence score, index 0 represents no occupancy, index 1 represents occupied, and clf_gt is a one-hot truth label, which is [1, 0] if the parking space is occupied and [0, 1] if it is unoccupied.
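Updating the model parameters according to several loss functions at once, as in step S1203, is commonly done by minimising a (possibly weighted) sum of the individual terms. The patent does not specify weights, so the sketch below treats them as hypothetical hyperparameters:

```python
def total_loss(loss_terms, weights=None):
    """Combine per-task losses (e.g. confidence, keypoint regression,
    occupancy) into one scalar; equal weights unless specified."""
    if weights is None:
        weights = [1.0] * len(loss_terms)
    return sum(w * l for w, l in zip(weights, loss_terms))
```

The resulting scalar is what a gradient-based optimiser would back-propagate through to update the image detection model's parameters.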
Referring to fig. 19, fig. 19 is a schematic view of an embodiment of a parking space detection apparatus according to an embodiment of the present application, and a parking space detection apparatus 20 includes:
an obtaining unit 201, configured to obtain a target feature map corresponding to an image to be detected, where the target feature map includes N anchor point frames, and N is an integer greater than 1;
a processing unit 202, configured to calculate a confidence score for each anchor box, where the confidence score is used to indicate a probability value that the anchor box contains the target object;
a determining unit 203, configured to determine M target anchor boxes from the N anchor boxes according to the confidence score, where M is an integer greater than 1 and smaller than N;
the processing unit 202 is further configured to perform key point calculation on each target anchor point frame of the M target anchor point frames, so as to obtain M key point coordinates;
the determining unit 203 is further configured to determine the position of the parking space in the image to be detected according to the M key point coordinates.
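The flow the units above describe — score every anchor frame, keep the high-confidence ones, then regress a keypoint per kept anchor — can be sketched as follows. The array layout and the threshold value are assumptions for illustration, not the patent's specification:

```python
import numpy as np

def detect_parking_space(anchor_scores, anchor_centers, keypoint_offsets,
                         confidence_threshold=0.5):
    """Keep anchors whose confidence exceeds a threshold, then decode each
    kept anchor's keypoint as anchor centre + predicted offset.

    anchor_scores:    (N,) probability that each anchor frame contains the
                      target object.
    anchor_centers:   (N, 2) anchor centre coordinates on the feature map.
    keypoint_offsets: (N, 2) regressed offsets from anchor centre to keypoint.
    """
    keep = anchor_scores > confidence_threshold          # M target anchors
    keypoints = anchor_centers[keep] + keypoint_offsets[keep]
    return keypoints  # (M, 2) keypoint coordinates of the parking space
```

The M returned keypoint coordinates are then used to locate the parking space in the image to be detected.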
Optionally, on the basis of the embodiment corresponding to fig. 19, in another embodiment of the detection apparatus for a parking space provided in the embodiment of the present application, the obtaining unit 201 may specifically be configured to:
inputting an image to be detected into an image detection model, and performing feature extraction on the image to be detected through the image detection model to obtain a first feature map and a second feature map, wherein the first feature map comprises N first anchor point frames, the second feature map comprises N second anchor point frames, the size of the first feature map is smaller than that of the second feature map, and the size of the first anchor point frame is smaller than that of the second anchor point frame;
the processing unit 202 may specifically be configured to: calculating a first confidence score for each first anchor box and calculating a second confidence score for each second anchor box;
the determining unit 203 may specifically be configured to: determining M first target anchor boxes from the N first anchor boxes and M second target anchor boxes from the N second anchor boxes according to the confidence scores.
Optionally, on the basis of the embodiment corresponding to fig. 19, in another embodiment of the detection apparatus for a parking space provided in the embodiment of the present application, the determining unit 203 may be specifically configured to:
determining a first anchor box with a first confidence score greater than a first confidence threshold as a first target anchor box;
a second anchor box with a second confidence score greater than a second confidence threshold is determined as a second target anchor box.
Optionally, on the basis of the embodiment corresponding to fig. 19, in another embodiment of the detection apparatus for a parking space provided in the embodiment of the present application, the processing unit 202 may be specifically configured to:
performing key point calculation on each first target anchor point frame in the M first target anchor point frames to obtain M first key point coordinates;
and respectively carrying out key point calculation on each second target anchor point frame in the M second target anchor point frames to obtain M second key point coordinates.
Optionally, on the basis of the embodiment corresponding to fig. 19, in another embodiment of the detection apparatus for a parking space provided in the embodiment of the present application, the determining unit 203 may be specifically configured to:
generating a first enclosure box according to the M first key point coordinates, and generating a second enclosure box according to the M second key point coordinates;
calculating a first overlapping degree between the first bounding box and the first detection frame, and calculating a second overlapping degree between the second bounding box and the second detection frame;
determining a target bounding box from the first bounding box and the second bounding box according to the first overlapping degree and the second overlapping degree;
and generating the position of the parking space in the image to be detected according to the target bounding box.
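The "overlapping degree" between a bounding box and a detection frame is commonly measured as intersection-over-union (IoU); the sketch below assumes that reading and axis-aligned (x1, y1, x2, y2) boxes:

```python
def overlap_degree(box_a, box_b):
    """Intersection-over-union between two axis-aligned boxes given as
    (x1, y1, x2, y2) tuples."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Comparing the first and second overlap degrees then amounts to comparing two such scalars and keeping the bounding box with the larger one.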
Alternatively, on the basis of the embodiment corresponding to fig. 19, in another embodiment of the parking space detection apparatus provided in the embodiment of the present application,
the processing unit 202 is further configured to calculate an attachment point confidence score of each target anchor frame in the M target anchor frames;

the determining unit 203 is further configured to determine an attachment anchor frame from the M target anchor frames according to the attachment point confidence scores;

the processing unit 202 is further configured to perform attachment point calculation on the attachment anchor frame to obtain target attachment point coordinates;

the determining unit 203 may specifically be configured to: determine the position of the parking space in the image to be detected according to the M key point coordinates and the target attachment point coordinates.
Alternatively, on the basis of the embodiment corresponding to fig. 19, in another embodiment of the parking space detection apparatus provided in the embodiment of the present application,
the processing unit 202 is further configured to calculate an occupied confidence score of each target anchor frame in the M target anchor frames, respectively;
the obtaining unit 201 is further configured to obtain coordinates of K occupied anchor frames if K occupied anchor frames are determined from the M target anchor frames according to the occupied confidence score, where K is an integer greater than or equal to 1 and less than M;
the determining unit 203 may specifically be configured to: and determining the positions of occupied parking spaces in the image to be detected according to the coordinates of the M key points and the coordinates of the K occupied anchor point frames.
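One simple way to combine the M keypoint coordinates with the K occupied anchor frame coordinates is to mark a parking space as occupied when an occupied anchor centre falls within the bounds spanned by the space's keypoints. This rule is a simplifying assumption for illustration; the patent does not fix the matching logic:

```python
def mark_occupied(space_keypoints, occupied_anchor_coords):
    """Return True if any occupied anchor frame centre lies inside the
    axis-aligned bounds of the parking space's keypoints."""
    xs = [p[0] for p in space_keypoints]
    ys = [p[1] for p in space_keypoints]
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
    return any(x_min <= x <= x_max and y_min <= y <= y_max
               for x, y in occupied_anchor_coords)
```

A polygon point-in-region test could replace the bounding-box check for slanted parking spaces.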
Alternatively, on the basis of the embodiment corresponding to fig. 19, in another embodiment of the parking space detection apparatus provided in the embodiment of the present application,
the obtaining unit 201 is further configured to obtain a sample feature map corresponding to the sample image, where the sample feature map includes N anchor point frames, the sample image corresponds to labeling information, and the labeling information includes M sample key point coordinates of the sample object;
the determining unit 203 is further configured to determine a sample anchor point frame from the N anchor point frames according to the M sample key point coordinates;
the obtaining unit 201 is further configured to obtain the confidence score and the truth label of the sample anchor frame, and construct a first loss function according to the confidence score and the truth label of the sample anchor frame;
the processing unit 202 is further configured to calculate offsets between each sample key point coordinate of the M sample key point coordinates and the center point coordinate of the sample anchor point frame, respectively, to obtain M sample offsets;
the processing unit 202 is further configured to construct a second loss function according to the offsets of the M samples and the coordinates of the key points of the M samples;
the processing unit 202 is further configured to update the model parameters of the image detection model according to the first loss function and the second loss function.
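The training step above computes, for each of the M sample keypoint coordinates, its offset from the centre of the sample anchor frame; under an assumed (M, 2) coordinate layout this is an elementwise subtraction:

```python
import numpy as np

def sample_offsets(sample_keypoints, anchor_center):
    """Offsets between each of the M sample keypoint coordinates and the
    centre of the sample anchor frame (regression targets for the second
    loss function)."""
    kp = np.asarray(sample_keypoints, dtype=float)   # (M, 2)
    return kp - np.asarray(anchor_center, dtype=float)
```

The M offsets then serve as the regression targets from which the second loss function is constructed.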
Optionally, on the basis of the embodiment corresponding to fig. 19, in another embodiment of the detection apparatus for a parking space provided in the embodiment of the present application, the obtaining unit 201 may specifically be configured to:
inputting a sample image into an image detection model, and performing feature extraction on the sample image through the image detection model to obtain a first sample feature map and a second sample feature map, wherein the first sample feature map comprises N first anchor point frames, and the second sample feature map comprises N second anchor point frames;
the determining unit 203 may specifically be configured to: and determining a first sample anchor point frame from the N first anchor point frames according to the M sample key point coordinates, and determining a second sample anchor point frame from the N second anchor point frames.
Alternatively, on the basis of the embodiment corresponding to fig. 19, in another embodiment of the parking space detection apparatus provided in the embodiment of the present application,
the processing unit 202 is further configured to calculate a first sample overlap degree between the first sample anchor frame and the first sample detection frame, and calculate a second sample overlap degree between the second sample anchor frame and the second sample detection frame;
a determining unit 203, configured to determine a target sample anchor frame from the first sample anchor frame and the second sample anchor frame according to the first sample overlap degree and the second sample overlap degree;
the obtaining unit may specifically be configured to: and acquiring the confidence score and the truth label of the target sample anchor box, and constructing a first loss function according to the confidence score and the truth label of the target sample anchor box.
Alternatively, on the basis of the embodiment corresponding to fig. 19, in another embodiment of the parking space detection apparatus provided in the embodiment of the present application,
the obtaining unit 201 is further configured to obtain a sample attached point confidence score and an attached point truth label of the sample anchor frame;
the processing unit 202 is further configured to construct a third loss function according to the sample attached point confidence score and the attached point truth label of the sample anchor box;
the processing unit 202 may specifically be configured to: and updating the model parameters of the image detection model according to the first loss function, the second loss function and the third loss function.
Alternatively, on the basis of the embodiment corresponding to fig. 19, in another embodiment of the parking space detection apparatus provided in the embodiment of the present application,
the obtaining unit 201 is further configured to obtain an occupied confidence score and an occupied truth label of the sample anchor point frame according to a position of the third sample detection frame in the sample feature map;
the processing unit 202 is further configured to construct a fourth loss function according to the occupied confidence score and the occupied truth label of the sample anchor box;
the processing unit 202 may specifically be configured to: and updating the model parameters of the image detection model according to the first loss function, the second loss function and the fourth loss function.
Another exemplary computer device is provided. As shown in fig. 20, fig. 20 is a schematic structural diagram of a computer device provided in this embodiment. The computer device 300 may vary considerably in configuration or performance, and may include one or more central processing units (CPUs) 310 (e.g., one or more processors), a memory 320, and one or more storage media 330 (e.g., one or more mass storage devices) storing an application 331 or data 332. The memory 320 and the storage medium 330 may each be transient or persistent storage. The program stored on the storage medium 330 may include one or more modules (not shown), each of which may include a sequence of instructions operating on the computer device 300. Still further, the central processor 310 may be configured to communicate with the storage medium 330 to execute the series of instruction operations in the storage medium 330 on the computer device 300.
The computer device 300 may also include one or more power supplies 340, one or more wired or wireless network interfaces 350, one or more input-output interfaces 360, and/or one or more operating systems 333, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
The computer device 300 described above is also used to perform the steps in the embodiments corresponding to fig. 2 to 12.
Another aspect of the present application provides a computer-readable storage medium having stored therein instructions, which when run on a computer, cause the computer to perform the steps in the method as described in the embodiments shown in fig. 2 to 12.
Another aspect of the application provides a computer program product comprising instructions which, when run on a computer or processor, cause the computer or processor to perform the steps of the method as described in the embodiments shown in fig. 2 to 12.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed to by the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a read-only memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

Claims (16)

1. A method for detecting a parking space, comprising:
acquiring a target characteristic diagram corresponding to an image to be detected, wherein the target characteristic diagram comprises N anchor point frames, and N is an integer greater than 1;
calculating a confidence score for each of the anchor boxes, wherein the confidence score is indicative of a probability value that the anchor box contains a target object;
determining M target anchor boxes from the N anchor boxes according to the confidence score, wherein M is an integer greater than 1 and smaller than N;
performing key point calculation on each target anchor point frame in the M target anchor point frames to obtain M key point coordinates;
and determining the position of the parking space in the image to be detected according to the M key point coordinates.
2. The method of claim 1, wherein the target feature map comprises a first feature map and a second feature map, wherein the anchor box comprises a first anchor box and a second anchor box, and wherein the target anchor box comprises a first target anchor box and a second target anchor box;
the method for acquiring the target characteristic diagram corresponding to the image to be detected comprises the following steps:
inputting the image to be detected into an image detection model, and performing feature extraction on the image to be detected through the image detection model to obtain a first feature map and a second feature map, wherein the first feature map comprises N first anchor point frames, the second feature map comprises N second anchor point frames, the size of the first feature map is smaller than that of the second feature map, and the size of the first anchor point frame is smaller than that of the second anchor point frame;
the calculating the confidence score of each anchor box comprises:
calculating a first confidence score for each of the first anchor boxes and calculating a second confidence score for each of the second anchor boxes;
the determining M target anchor boxes from the N anchor boxes according to the confidence scores includes:
determining M first target anchor boxes from the N first anchor boxes and M second target anchor boxes from the N second anchor boxes according to the confidence scores.
3. The method of claim 2, wherein determining M target anchor boxes from the N anchor boxes according to the confidence scores comprises:
determining a first anchor box with the first confidence score greater than a first confidence threshold as the first target anchor box;
determining a second anchor box with the second confidence score greater than a second confidence threshold as the second target anchor box.
4. The method of claim 2, wherein the keypoint coordinates comprise a first keypoint coordinate and a second keypoint coordinate;
the calculating key points of each target anchor point frame in the M target anchor point frames to obtain M key point coordinates includes:
performing key point calculation on each first target anchor point frame in the M first target anchor point frames to obtain M first key point coordinates;
and respectively carrying out key point calculation on each second target anchor point frame in the M second target anchor point frames to obtain M second key point coordinates.
5. The method of claim 4, wherein the first feature map further comprises a first detection box, wherein the second feature map further comprises a second detection box, and wherein the first detection box and the second detection box are both used to indicate the target object;
determining the position of the parking space in the image to be detected according to the M key point coordinates, comprising the following steps:
generating a first enclosure box according to the M first key point coordinates, and generating a second enclosure box according to the M second key point coordinates;
calculating a first degree of overlap between the first bounding box and the first detection frame, and calculating a second degree of overlap between the second bounding box and the second detection frame;
determining a target bounding box from the first bounding box and the second bounding box according to the first degree of overlap and the second degree of overlap;
and generating the position of the parking space in the image to be detected according to the target bounding box.
6. The method according to claim 1, wherein before determining the position of the parking space in the image to be detected according to the M key point coordinates, the method further comprises:
respectively calculating the attached point confidence score of each target anchor point frame in the M target anchor point frames;
determining an affiliated anchor box from the M target anchor boxes according to the affiliated point confidence score;
performing auxiliary point calculation on the auxiliary anchor point frame to obtain a target auxiliary point coordinate;
determining the position of the parking space in the image to be detected according to the M key point coordinates, comprising the following steps:
and determining the position of the parking space in the image to be detected according to the M key point coordinates and the target auxiliary point coordinates.
7. The method according to claim 1, wherein before determining the position of the parking space in the image to be detected according to the M key point coordinates, the method further comprises:
respectively calculating occupied confidence scores of each target anchor point frame in the M target anchor point frames;
according to the occupied confidence score, if K occupied anchor points are determined from the M target anchor points, obtaining coordinates of the K occupied anchor points, wherein K is an integer which is greater than or equal to 1 and smaller than M;
determining the position of the parking space in the image to be detected according to the M key point coordinates, comprising the following steps:
and determining the positions of occupied parking spaces in the image to be detected according to the M key point coordinates and the coordinates of the K occupied anchor point frames.
8. The method of claim 2, further comprising:
acquiring a sample feature map corresponding to a sample image, wherein the sample feature map comprises the N anchor points, the sample image corresponds to labeling information, and the labeling information comprises M sample key point coordinates of a sample object;
determining a sample anchor frame from the N anchor frames according to the M sample keypoint coordinates;
acquiring the confidence score and the truth label of the sample anchor frame, and constructing a first loss function according to the confidence score and the truth label of the sample anchor frame;
respectively calculating the offset between each sample key point coordinate in the M sample key point coordinates and the center point coordinate of the sample anchor point frame to obtain M sample offsets;
constructing a second loss function according to the M sample offsets and the M sample key point coordinates;
and updating the model parameters of the image detection model according to the first loss function and the second loss function.
9. The method of claim 8, wherein the sample feature map comprises a first sample feature map and a second sample feature map, and wherein the sample anchor block comprises a first sample anchor block and a second sample anchor block;
the obtaining of the sample feature map corresponding to the sample image includes:
inputting the sample image into the image detection model, and performing feature extraction on the sample image through the image detection model to obtain a first sample feature map and a second sample feature map, wherein the first sample feature map comprises N first anchor points, and the second sample feature map comprises N second anchor points;
said determining a sample anchor box from said N anchor boxes according to said M sample keypoint coordinates comprises:
determining the first sample anchor block from the N first anchor blocks and the second sample anchor block from the N second anchor blocks according to the M sample keypoint coordinates.
10. The method of claim 9, wherein the first feature map further comprises a first sample detection box, wherein the second feature map further comprises a second sample detection box, and wherein the first sample detection box and the second sample detection box are both used for indicating the sample object in the sample image;
before the obtaining the confidence score and the truth label of the sample anchor block and constructing the first loss function according to the confidence score and the truth label of the sample anchor block, the method further includes:
calculating a first sample overlap between the first sample anchor frame and the first sample detection frame, and calculating a second sample overlap between the second sample anchor frame and the second sample detection frame;
determining a target sample anchor frame from the first sample anchor frame and the second sample anchor frame according to the first sample overlap degree and the second sample overlap degree;
the obtaining the confidence score and the truth label of the sample anchor box, and constructing a first loss function according to the confidence score and the truth label of the sample anchor box, including:
and acquiring the confidence score and the truth label of the target sample anchor box, and constructing the first loss function according to the confidence score and the truth label of the target sample anchor box.
11. The method of claim 8, wherein the annotation information further comprises a sample attachment point of the sample object;
before the updating the model parameters of the image detection model according to the first loss function and the second loss function, the method further includes:
acquiring a sample attached point confidence score and an attached point truth label of the sample anchor point frame;
constructing a third loss function according to the sample auxiliary point confidence score and the auxiliary point truth value label of the sample anchor point frame;
the updating the model parameters of the image detection model according to the first loss function and the second loss function includes:
and updating the model parameters of the image detection model according to the first loss function, the second loss function and the third loss function.
12. The method of claim 8, wherein the sample feature map further comprises a third sample detection box for indicating an occupied object in the sample image;
before the updating the model parameters of the image detection model according to the first loss function and the second loss function, the method further includes:
acquiring occupied confidence score and occupied truth label of the sample anchor point frame according to the position of the third sample detection frame in the sample feature map;
constructing a fourth loss function according to the occupied confidence score and the occupied truth label of the sample anchor box;
the updating the model parameters of the image detection model according to the first loss function and the second loss function includes:
and updating the model parameters of the image detection model according to the first loss function, the second loss function and the fourth loss function.
13. A parking space detection device, comprising:
the device comprises an acquisition unit, a detection unit and a processing unit, wherein the acquisition unit is used for acquiring a target characteristic diagram corresponding to an image to be detected, the target characteristic diagram comprises N anchor point frames, and N is an integer greater than 1;
the processing unit is used for calculating a confidence score of each anchor box, wherein the confidence score is used for indicating the probability value that the anchor box contains the target object;
a determining unit, configured to determine M target anchor boxes from the N anchor boxes according to the confidence score, where M is an integer greater than 1 and less than N;
the processing unit is further configured to perform key point calculation on each target anchor point frame of the M target anchor point frames, so as to obtain M key point coordinates;
the determining unit is further configured to determine the position of the parking space in the image to be detected according to the M key point coordinates.
14. A computer device, comprising: a memory, a transceiver, a processor, and a bus system;
wherein the memory is used for storing programs;
the processor, when executing the program in the memory, implementing the method of any of claims 1 to 12;
the bus system is used for connecting the memory and the processor so as to enable the memory and the processor to communicate.
15. A computer-readable storage medium comprising instructions that, when executed on a computer, cause the computer to perform the method of any one of claims 1 to 12.
16. A computer program product comprising a computer program/instructions which, when executed by a processor, implement the steps of the method of any one of claims 1 to 12.
CN202210013342.1A 2022-01-06 2022-01-06 Parking space detection method, device, equipment and storage medium Pending CN114359231A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210013342.1A CN114359231A (en) 2022-01-06 2022-01-06 Parking space detection method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114359231A true CN114359231A (en) 2022-04-15

Family

ID=81106576

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210013342.1A Pending CN114359231A (en) 2022-01-06 2022-01-06 Parking space detection method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114359231A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110348297A (en) * 2019-05-31 2019-10-18 纵目科技(上海)股份有限公司 Detection method, system, terminal and storage medium for identifying a stereoscopic parking garage
CN111814773A (en) * 2020-09-07 2020-10-23 广州汽车集团股份有限公司 Lineation parking space identification method and system
WO2020238284A1 (en) * 2019-05-29 2020-12-03 北京市商汤科技开发有限公司 Parking space detection method and apparatus, and electronic device
US20210181758A1 (en) * 2019-10-26 2021-06-17 Zoox, Inc. Object detection and tracking
CN113327453A (en) * 2021-05-27 2021-08-31 山东巍然智能科技有限公司 Parking lot vacancy guiding system based on high-point video analysis
US20210342606A1 (en) * 2020-04-30 2021-11-04 Boe Technology Group Co., Ltd. Parking Detection Method, System, Processing Device and Storage Medium
CN113674199A (en) * 2021-07-06 2021-11-19 浙江大华技术股份有限公司 Parking space detection method, electronic device and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YANG KE; ZHANG XIAOYU; XU WEIQING: "Research on Surround-View Parking Space Detection Based on Convolutional Neural Networks" (基于卷积神经网络的环视车位检测研究), Auto Time (时代汽车), no. 04, 25 February 2020 (2020-02-25), pages 6-9 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024032856A1 (en) * 2022-08-12 2024-02-15 Continental Autonomous Mobility Germany GmbH Method for determining a parking space and a target position for a vehicle in the parking space
CN117649737A (en) * 2024-01-30 2024-03-05 云南电投绿能科技有限公司 Method, device, equipment and storage medium for monitoring equipment in park
CN117649737B (en) * 2024-01-30 2024-04-30 云南电投绿能科技有限公司 Method, device, equipment and storage medium for monitoring equipment in park

Similar Documents

Publication Publication Date Title
JP7262503B2 (en) Method and apparatus, electronic device, computer readable storage medium and computer program for detecting small targets
CN108304835B (en) character detection method and device
US10354433B2 (en) Method and apparatus for generating an abstract texture for a building facade or model
CN105898205B (en) Method and apparatus for monitoring target object by multiple cameras
CN114359231A (en) Parking space detection method, device, equipment and storage medium
US11255678B2 (en) Classifying entities in digital maps using discrete non-trace positioning data
US11699234B2 (en) Semantic segmentation ground truth correction with spatial transformer networks
US20240077331A1 (en) Method of predicting road attributers, data processing system and computer executable code
CN114494436A (en) Indoor scene positioning method and device
CN116628123A (en) Dynamic slice generation method and system based on spatial database
US11423611B2 (en) Techniques for creating, organizing, integrating, and using georeferenced data structures for civil infrastructure asset management
CN116721229B (en) Method, device, equipment and storage medium for generating road isolation belt in map
CN113658203A (en) Method and device for extracting three-dimensional outline of building and training neural network
CN113379748A (en) Point cloud panorama segmentation method and device
CN115773744A (en) Model training and road network processing method, device, equipment, medium and product
CN111765892B (en) Positioning method, positioning device, electronic equipment and computer readable storage medium
CN114359352A (en) Image processing method, apparatus, device, storage medium, and computer program product
CN110361016B (en) Picture construction method and system
CN114743395A (en) Signal lamp detection method, device, equipment and medium
Chu et al. Convergent application for trace elimination of dynamic objects from accumulated lidar point clouds
CN114841084B (en) Aerial image-based personnel evacuation simulation method, system and equipment
CN116665157B (en) Road image processing method, device, computer equipment and storage medium
CN113313101B (en) Building contour automatic aggregation method, device, equipment and storage medium
CN116383451B (en) Map segmentation method and device, electronic equipment and storage medium
KR100933877B1 (en) Data processing method and geographic information system of 3D map service

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code
Ref country code: HK; Ref legal event code: DE; Ref document number: 40071007; Country of ref document: HK