WO2023066142A1 - Method and apparatus for target detection in a panoramic image, computer device, and storage medium - Google Patents

Method and apparatus for target detection in a panoramic image, computer device, and storage medium

Info

Publication number
WO2023066142A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
detected
panoramic image
boundary position
detection
Prior art date
Application number
PCT/CN2022/125242
Other languages
English (en)
French (fr)
Inventor
林晓帆
姜文杰
Original Assignee
影石创新科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 影石创新科技股份有限公司
Publication of WO2023066142A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • the present application relates to the field of computers, and in particular to a method, device, computer equipment and storage medium for target detection of panoramic images.
  • Computer vision is a science that studies how to make machines "see". More specifically, it refers to using cameras and computers in place of human eyes to identify, track, and measure targets, and to perform further graphics processing so that the result becomes an image better suited for human observation or for transmission to an instrument for detection.
  • Computer vision studies related theories and technologies, attempting to build artificial intelligence systems that can obtain information from images or multidimensional data.
  • Target detection in panoramic images is one of the research topics of computer vision.
  • Target detection in panoramic images belongs to the subdivided field of target detection. It is a computer vision technology based on the statistical characteristics and semantic information of panoramic targets, which can simultaneously obtain the category information and location information of targets in a panoramic image.
  • A panoramic image is a special image with an aspect ratio of 2:1, stitched together from multiple images. It follows the latitude-longitude expansion: the width of the image spans latitude 0-2π and the height spans longitude 0-π, so it can record all information across 360 degrees horizontally and 180 degrees of pitch.
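  • As a hedged illustration (not code from the patent; the function and variable names are assumptions), the latitude-longitude expansion can be sketched as a pixel-to-angle mapping that follows the document's convention of the width spanning 0-2π and the height spanning 0-π:

```python
import math

def pixel_to_angles(x, y, width, height):
    """Map a pixel (x, y) in an equirectangular panorama to angular
    coordinates: the horizontal axis spans 0-2*pi and the vertical
    axis spans 0-pi, following the convention stated above."""
    return (x / width * 2 * math.pi, y / height * math.pi)

# The image centre maps to (pi, pi/2), i.e. the middle of the sphere.
cx, cy = pixel_to_angles(960, 480, 1920, 960)
```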
  • Target detection is generally performed on the planar expanded image of the panoramic image, so some objects in the panoramic image are distorted, and the rectangular frame of the detection result cannot reasonably enclose the deformed and stretched targets, resulting in deviations in the detection results.
  • BFoV (Bounding Field-of-View), the bounding field of view.
  • BFoV treats the panoramic image as a spherical surface, uses the latitude and longitude coordinates of the target to represent its center point, and uses its two field angles in the horizontal and vertical directions to represent the space it occupies.
  • However, the detection area it represents still cannot always contain the tilted or deformed targets in the panoramic image well, which affects the effect of target detection.
  • a target detection method for a panoramic image comprising:
  • the target boundary position point data includes target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected;
  • the target detection result corresponding to the panoramic image to be detected is acquired.
  • performing convolution processing on the panoramic image to be detected through the convolutional neural network including the preset target deformation adaptive convolution operator to obtain the target category of the detection target in the panoramic image to be detected and the target boundary position point data includes:
  • inputting the panoramic image to be detected into a convolutional neural network including a preset target deformation adaptive convolution operator, and obtaining the heat map, target category, and target boundary position point data corresponding to the initial detection target in the panoramic image to be detected;
  • the initial detection target includes a non-boundary position target
  • the convolutional neural network further includes a conventional convolution operator
  • inputting the panoramic image to be detected into a convolutional neural network including a preset target deformation adaptive convolution operator, and obtaining the heat map, target category, and target boundary position point data corresponding to the initial detection target in the panoramic image to be detected includes:
  • the initial detection target includes a boundary position target
  • inputting the panoramic image to be detected into a convolutional neural network including a preset target deformation adaptive convolution operator to obtain the heat map, target category, and target boundary position point data corresponding to the initial detection target in the panoramic image to be detected includes:
  • the target attribute indicates that the first detection target and the second detection target are the same detection target
  • before performing convolution processing on the panoramic image to be detected through the convolutional neural network including the preset target deformation adaptive convolution operator to obtain the target category and target boundary position point data of the detection target in the panoramic image to be detected, the method further includes:
  • Target boundary position point data includes target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected
  • the initial convolutional neural network including the preset target deformation adaptive convolution operator is trained to obtain the convolutional neural network including the preset target deformation adaptive convolution operator.
  • the acquisition of the target detection result corresponding to the panorama image to be detected according to the target category of the detected target and the target boundary position point data includes:
  • the data groups corresponding to each detected target are filled into the preset target detection result list, and the target detection result corresponding to the panoramic image to be detected is generated.
  • a target detection device for a panoramic image comprising:
  • a data acquisition module configured to acquire a panoramic image to be detected
  • the convolution processing module is used to perform convolution processing on the panoramic image to be detected through a convolutional neural network including a preset target deformation adaptive convolution operator, and obtain the target category and target boundary position point data of the detection target in the panoramic image to be detected, where the target boundary position point data includes target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected;
  • the target detection module is configured to acquire a target detection result corresponding to the panoramic image to be detected according to the target category of the detected target and target boundary position point data.
  • the convolution processing module is specifically configured to: input the panoramic image to be detected into a convolutional neural network including a preset target deformation adaptive convolution operator, and obtain the heat map, target category, and target boundary position point data corresponding to the initial detection target; filter the initial detection target according to the heat map to obtain the detection target; and acquire the target category and target boundary position point data corresponding to the detection target.
  • a computer device comprising a memory and a processor, the memory stores a computer program, and the processor implements the following steps when executing the computer program:
  • the target boundary position point data includes target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected;
  • the target detection result corresponding to the panoramic image to be detected is acquired.
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented:
  • the target boundary position point data includes target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected;
  • the target detection result corresponding to the panoramic image to be detected is acquired.
  • With the target detection method, device, computer equipment, and storage medium for a panoramic image described above, the panoramic image to be detected is acquired; convolution processing is performed on it through a convolutional neural network including a preset target deformation adaptive convolution operator to obtain the target category and target boundary position point data of the detection target, where the target boundary position point data includes target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected; and the target detection result corresponding to the panoramic image to be detected is obtained according to the target category and the target boundary position point data.
  • When detecting a panoramic image, the present application extracts the target category and target boundary position point data of the detection target with the preset target deformation adaptive convolution operator, and then obtains the final target detection result from them. Because the location of the detection target is represented by target boundary position points, the area of every detection target, including detection targets at the boundary of the panoramic image, can be effectively determined, thereby improving the accuracy of target detection in panoramic images.
  • Fig. 1 is the application environment diagram of the target detection method of panoramic image in an embodiment
  • Fig. 2 is a schematic flow chart of a target detection method for a panoramic image in an embodiment
  • Fig. 3 is a schematic subflow diagram of step 203 in Fig. 2 in one embodiment
  • Fig. 4 is a schematic subflow diagram of step 302 in Fig. 3 in one embodiment
  • Fig. 5 is a schematic flow chart of neural network model training steps in an embodiment
  • Fig. 6 is a structural block diagram of a target detection device for a panoramic image in an embodiment
  • Fig. 7 is an internal block diagram of a computer device in one embodiment
  • Panoramic distortion means that, during the scanning and imaging of a panoramic image, the image distance remains unchanged while the object distance increases with the scanning angle, which causes the scale on the image to gradually shrink from the center toward both sides.
  • Most of the existing target detection algorithms for panoramic images use the target's Bounding-Box (BBox) or Bounding Field-of-View (BFoV).
  • BBox Bounding-Box
  • BFoV Bounding Field-of-View
  • BFoV regards the panoramic image as a spherical surface, uses the latitude and longitude coordinates of the detection target to represent its center point, and uses its two field-of-views in the horizontal and vertical directions to represent the space it occupies.
  • BFoV is specifically defined as ( ⁇ , ⁇ , h, w).
  • ⁇ and ⁇ are the latitude and longitude coordinates of the target on the spherical surface, respectively;
  • h and w represent the two field angles of the target in the horizontal and vertical directions, similar to height and width.
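  • The (θ, φ, h, w) definition might be modelled as a small data structure like the sketch below; the class, field names, and the simplified containment check are illustrative assumptions, not the patent's implementation:

```python
from dataclasses import dataclass
import math

@dataclass
class BFoV:
    """Bounding Field-of-View: a spherical centre plus two field angles."""
    theta: float  # latitude coordinate of the centre on the sphere
    phi: float    # longitude coordinate of the centre on the sphere
    h: float      # vertical field angle, similar to height
    w: float      # horizontal field angle, similar to width

    def contains(self, theta, phi):
        """Rough test: is an angular point inside the field of view?
        (Ignores wrap-around at the 0/2*pi seam for simplicity.)"""
        return (abs(theta - self.theta) <= self.w / 2
                and abs(phi - self.phi) <= self.h / 2)

# A target centred at (pi, pi/2) occupying a 30 x 40 degree field of view.
person = BFoV(theta=math.pi, phi=math.pi / 2,
              h=math.radians(30), w=math.radians(40))
```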
  • For distortion in the upper and lower regions, the BFoV can be stretched to contain the target.
  • Because BFoV is defined on a spherical surface, it also avoids the problem that targets on the left and right sides cannot be judged to be the same target.
  • However, the area detected by BFoV still does not always contain the detection target well, which affects the accuracy of target detection. In view of this situation, the applicant proposed the target detection method of the present application.
  • the target detection method for a panoramic image provided in this application can be applied to the application environment shown in FIG. 1 .
  • the terminal 102 communicates with the server 104 through the network.
  • the server 104 performs object detection on the panoramic image to be detected submitted by the terminal 102.
  • the server 104 acquires the panoramic image to be detected; performs convolution processing on the panoramic image to be detected through the convolutional neural network including the preset target deformation adaptive convolution operator, and obtains the target category and the target boundary position point data of the detected target in the panoramic image to be detected , the target boundary position point data includes target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected; according to the target category of the detected target and the target boundary position point data, the target detection result corresponding to the panoramic image to be detected is obtained.
  • the terminal 102 can be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers and portable wearable devices, and the server 104 can be realized by an independent server or a server cluster composed of multiple servers.
  • a method for target detection of a panoramic image is provided.
  • the method is applied to the server 104 in FIG. 1 as an example for illustration, including the following steps:
  • Step 201 acquire a panorama image to be detected.
  • Step 203 perform convolution processing on the panoramic image to be detected through the convolutional neural network including the preset target deformation adaptive convolution operator, and obtain the target category and target boundary position point data of the detection target in the panoramic image to be detected, where the target boundary position point data includes target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected.
  • A panoramic image is a special image, generally with an aspect ratio of 2:1, stitched together from multiple images. It follows the latitude-longitude expansion: the width of the image spans latitude 0-2π and the height spans longitude 0-π, so it can record all information across 360 degrees horizontally and 180 degrees of pitch.
  • When target detection is performed on a panoramic image, some objects are split onto the left and right sides of the image in the horizontal direction, so they cannot be detected as the same object.
  • When target detection is performed with a rectangular frame, the detection method cannot effectively enclose the detection target, which affects the accuracy of target detection in panoramic images.
  • Accurate target detection for panoramic images can be achieved through the target detection method for panoramic images of the present application.
  • An operator is the basic unit of neural network computation, and the convolution operation is the main component of a neural network, used to extract the statistical and semantic features of targets.
  • The preset target deformation adaptive convolution operator means that the present application modifies an existing convolutional neural network model for target detection, replacing some convolution operators with convolution operators that can adapt to target deformation, such as deformable convolution, equirectangular projection convolution, and spherical convolution. These preset target deformation adaptive convolution operators are obtained by training with panoramic images.
  • A target boundary position point can specifically be a coordinate point on the boundary of the detection target.
  • For a detection target, there can be multiple boundary coordinate points.
  • For example, nine boundary coordinate points, a total of 9 points, can be used to represent the position of the target in the panoramic image: upper left, upper middle, upper right, middle left, center, middle right, lower left, lower middle, and lower right.
  • The area formed by connecting the nine points selected on the boundary of the target represents the area where the target is located.
  • This representation has better scalability and can represent more complex target shapes, effectively adapting to target distortions such as inclination and stretching, so as to detect the target more accurately.
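  • For an axis-aligned region, the layout of the nine points can be sketched as below; in practice the network regresses each point independently so the points can follow a deformed target. The function and its arguments are illustrative assumptions:

```python
def nine_points(cx, cy, w, h):
    """Return the nine target boundary position points (upper-left,
    upper-middle, upper-right, middle-left, center, middle-right,
    lower-left, lower-middle, lower-right) of an axis-aligned region
    centred at (cx, cy) with width w and height h."""
    xs = (cx - w / 2, cx, cx + w / 2)
    ys = (cy - h / 2, cy, cy + h / 2)
    return [(x, y) for y in ys for x in xs]

points = nine_points(cx=100.0, cy=50.0, w=40.0, h=20.0)
```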
  • The target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected specifically include target boundary position points with negative coordinates and target boundary position points whose coordinates are greater than the image width: negative coordinates represent overflow beyond the left boundary, and coordinates greater than the image width represent overflow beyond the right boundary.
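  • Under this convention, the true column of an overflowing point can be recovered by wrapping its x coordinate modulo the image width. A minimal sketch with assumed names:

```python
def wrap_x(x, image_width):
    """Map an out-of-range x coordinate back into the image.
    Negative x means the point overflows past the left boundary;
    x greater than the width means it overflows past the right."""
    return x % image_width

# A point 10 px past the left edge reappears near the right edge,
# and a point 10 px past the right edge reappears near the left.
left_overflow = wrap_x(-10, 1920)
right_overflow = wrap_x(1930, 1920)
```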
  • The target category of the detection target is preset data. According to the purpose of target detection, the categories of targets to be recognized can be set when the convolutional neural network is trained.
  • The convolutional neural network is not limited here; it can be realized with anchor-free target detection neural network models such as CornerNet, CenterNet, and FCOS.
  • the terminal 102 can submit the panoramic image to be detected to the server 104, so that the server 104 can perform target detection corresponding to the panoramic image to be detected, and determine the target in the panoramic image to be detected.
  • the server 104 receives the panoramic image to be detected.
  • The convolutional neural network including the preset target deformation adaptive convolution operator can be used to perform convolution processing on the panoramic image to be detected to obtain the target category and target boundary position point data of the detection target, where the target boundary position point data includes target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected.
  • For objects split onto the left and right sides, the detection model needs to be able to extract the features of both sides of the image and judge that they belong to the same target.
  • Traditional convolutional neural networks are weak at handling this situation. Therefore, the present application constructs a convolution model more suitable for panoramic images by replacing some traditional convolution operators with preset target deformation adaptive convolution operators.
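  • As one simplified stand-in for such seam-aware operators (an illustration only, not the patent's actual operator), a convolution with circular padding lets the filter at the left edge also see pixels from the right edge:

```python
def circular_conv1d(row, kernel):
    """Convolve one image row with wrap-around (circular) padding, so
    the filter applied at the left edge also sees pixels from the
    right edge — a toy stand-in for seam-aware convolution."""
    n, k = len(row), len(kernel)
    half = k // 2
    out = []
    for i in range(n):
        acc = 0.0
        for j in range(k):
            acc += kernel[j] * row[(i + j - half) % n]  # wrapped index
        out.append(acc)
    return out

# With an averaging kernel, the first output mixes row[-1], row[0], row[1].
row = [9.0, 0.0, 0.0, 0.0, 3.0]
smoothed = circular_conv1d(row, [1 / 3, 1 / 3, 1 / 3])
```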
  • With the preset target deformation adaptive convolution operator, the network has better adaptability to target deformation in the panoramic image.
  • By using the preset target deformation adaptive convolution operator to perform convolution processing on target detection candidate areas in the boundary part of the panoramic image, it can be effectively judged whether the targets on the left and right sides of the panoramic image are the same target, and the corresponding target category and set of target boundary position point data can be output.
  • Targets at non-boundary positions can be detected by the other, conventional target detection convolution operators of the convolutional neural network.
  • Step 205 according to the target category of the detected target and the target boundary position point data, the target detection result corresponding to the panoramic image to be detected is obtained.
  • After the target categories and target boundary position point data corresponding to all detection targets in the panoramic image to be detected are obtained, this data can be sorted: the target category and target boundary position point data of each detection target are merged and organized, and the target detection result corresponding to the panoramic image to be detected is then output.
  • The above target detection method for a panoramic image acquires the panoramic image to be detected, performs convolution processing on it through a convolutional neural network including a preset target deformation adaptive convolution operator to obtain the target category and target boundary position point data of the detection target, where the target boundary position point data includes target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected, and obtains the target detection result corresponding to the panoramic image to be detected according to the target category and the target boundary position point data.
  • When detecting a panoramic image, the present application extracts the target category and target boundary position point data of the detection target with the preset target deformation adaptive convolution operator, and then obtains the final target detection result from them. Because the location of the detection target is represented by target boundary position points, the area of every detection target, including detection targets at the boundary of the panoramic image, can be effectively determined, thereby improving the accuracy of target detection in panoramic images.
  • step 203 includes:
  • Step 302 input the panoramic image to be detected into the convolutional neural network including the preset target deformation adaptive convolution operator, and obtain the heat map, target category and target boundary position point data corresponding to the initial detection target in the panoramic image to be detected.
  • Step 304 Perform filtering processing on the initial detection target according to the heat map to obtain the detection target.
  • Step 306 acquire the target category corresponding to the detected target and the target boundary position point data.
  • the heat map is the feature map output by the neural network.
  • Each point in the map represents the confidence that a target exists at that position, so the heat map can be used to determine whether there is a detection target at each point of the panoramic image to be detected.
  • The obtained panoramic image to be detected can be input into the trained convolutional neural network containing the preset target deformation adaptive convolution operator, and the convolutional neural network outputs multiple branches: the heat map of the detection target, the target category, and the target boundary position point data.
  • the target boundary position point data can specifically include the offset of the target boundary position point, and the target boundary position point can be calculated by the offset of the target boundary position point.
  • The server 104 analyzes the output of the convolutional neural network, uses the heat map to filter out detection targets below a certain confidence threshold, and retains detection targets with high confidence; finally, the target category and target boundary position point data corresponding to each required detection target are obtained, realizing target detection on the panoramic image.
  • Filtering the initial detection targets through the heat map can effectively exclude initial detection targets that do not meet the requirements and improve the accuracy of target detection.
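  • The filtering step might look like the sketch below, which thresholds candidates by their heat-map confidence; the threshold value and data layout are assumptions, not specified by the patent:

```python
def filter_by_heatmap(candidates, heatmap, threshold=0.5):
    """Keep only initial detection targets whose heat-map confidence
    at their centre point reaches the threshold.  Each candidate is a
    dict with an integer centre (cx, cy) indexing into the heat map."""
    kept = []
    for c in candidates:
        score = heatmap[c["cy"]][c["cx"]]
        if score >= threshold:
            kept.append({**c, "score": score})
    return kept

heatmap = [
    [0.1, 0.9, 0.2],
    [0.3, 0.4, 0.8],
]
candidates = [{"cx": 1, "cy": 0}, {"cx": 0, "cy": 1}, {"cx": 2, "cy": 1}]
detections = filter_by_heatmap(candidates, heatmap)
```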
  • the initial detection target includes a non-boundary position target
  • the convolutional neural network also includes a conventional convolution operator.
  • Step 302 includes: extracting the panoramic image features of the panoramic image to be detected through the conventional convolution operator; determining the non-boundary position targets in the panoramic image to be detected based on the panoramic image features; and obtaining the heat map, target category, and target boundary position point data corresponding to the non-boundary position targets.
  • The non-boundary position target refers to an initial detection target that is not split onto the two ends of the panoramic image; a non-boundary target is a complete target, generally located in the middle of the panoramic image.
  • The panoramic image features may include the coordinate position, in the panoramic image to be detected, of each currently detected initial detection target, so that the features can be used to determine which initial detection targets belong to non-boundary position targets.
  • For these targets, convolution can be performed with the usual convolution operators of the convolutional neural network to extract the corresponding heat map, target category, and target boundary position point data.
  • Specifically, non-boundary position targets in the panoramic image to be detected can be identified according to whether the coordinates of an initial detection target in the panoramic image features include coordinates at the boundary position of the panoramic image to be detected.
  • If they do not, the initial detection target is regarded as a non-boundary position target.
  • In this way, initial detection targets at non-boundary positions can be effectively detected, ensuring the detection effect of target detection.
  • the initial detection target also includes a boundary position target
  • step 302 includes:
  • Step 401 extracting panoramic image features of the panoramic image to be detected by using a preset target deformation adaptive convolution operator.
  • Step 403 based on the feature of the panoramic image, determine the boundary position target in the panoramic image to be detected at the boundary position, and obtain the initial detection target.
  • Step 405 based on the features of the panoramic image, identify the target attributes between the first detection target and the second detection target, where the first detection target and the second detection target are initial detection targets in relative positions in the panoramic image.
  • Step 407 when the target attribute indicates that the first detection target and the second detection target are the same detection target, acquire the heat map, target category and target boundary position point data corresponding to the first detection target and the second detection target;
  • Step 409 according to the heat map, target category and target boundary position point data corresponding to the first detection target and the second detection target, acquire the heat map, target category and target boundary position point data corresponding to the boundary position target in the panorama image to be detected.
  • The boundary position target refers to a split initial detection target; different parts of a boundary position target generally lie at the left and right ends of the panoramic image.
  • The preset target deformation adaptive convolution operator can effectively adapt to target changes, so extracting features from the panoramic image with it can effectively extract the features corresponding to boundary position targets from the panoramic image to be detected.
  • A boundary position target is a target in the panoramic image to be detected whose position lies at the boundary of the panoramic image; such targets are split by the panoramic image onto the two sides of the image.
  • The target attribute is used to judge whether two detection targets at relative positions, the first detection target and the second detection target, are the same target. When the two detection targets at relative positions are the same target, their target attributes are the same; when they are not the same target, their target attributes are different.
  • The panoramic image features corresponding to these objects can be extracted by the preset target deformation adaptive convolution operator. Based on the extracted panoramic image features, it is further determined which targets belong to boundary position targets, and the target attributes of the two detection targets at relative positions are identified. For example, for a panoramic image to be detected, the width of the image spans latitude 0-2π and the height spans longitude 0-π.
  • Detection targets at relative positions specifically refer to detection targets sharing the same Y-axis coordinate. For example, if the coordinates of a detection target A include (0, 0.5π), the detection target B whose coordinates include (2π, 0.5π) can be determined to be the boundary position target at the position relative to detection target A.
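  • The relative-position check from this example can be sketched as follows, assuming x runs from 0 to 2π as in the text; the function name and tolerance are illustrative:

```python
import math

def is_relative_position(target_a, target_b, width=2 * math.pi, tol=1e-6):
    """Return True when target_a touches the left boundary (x = 0) and
    target_b touches the right boundary (x = width) at the same Y-axis
    coordinate, i.e. they may be two halves of one split target."""
    ax, ay = target_a
    bx, by = target_b
    return (abs(ax - 0.0) <= tol
            and abs(bx - width) <= tol
            and abs(ay - by) <= tol)

# The example from the text: A at (0, 0.5*pi), B at (2*pi, 0.5*pi).
same = is_relative_position((0.0, 0.5 * math.pi), (2 * math.pi, 0.5 * math.pi))
```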
  • In this way, the target boundary position point data corresponding to the detection target can be obtained. That is, according to the heat map, target category, and target boundary position point data corresponding to the first detection target and the second detection target, the heat map, target category, and target boundary position point data corresponding to the boundary position target in the panoramic image to be detected are obtained. Because the first detection target and the second detection target are the same target, one of them can be selected as the final boundary position target during target recognition; generally, the detection target on a fixed side (left or right) is taken as the final boundary position target. In this embodiment, by extracting the panoramic image features of the target detection candidate areas, the target boundary position point data corresponding to the detection target can be effectively detected, ensuring the detection effect of target detection.
  • Before step 203, the method also includes:
  • Step 502 acquiring the historical panoramic image marked with the target category and the target boundary position point.
  • the target boundary position point data includes the target boundary position point whose coordinates exceed the boundary of the panoramic image to be detected.
  • Step 504 constructing a model training data set according to historical panoramic images.
  • Step 506 train the initial convolutional neural network including the preset target deformation adaptive convolution operator through the model training data set, and obtain the convolutional neural network including the preset target deformation adaptive convolution operator.
  • Here, historical panoramic images specifically refer to panoramic images in the historical data that contain detection targets under each target category. These historical panoramic images can be used to train the convolutional neural network in its initial state, to obtain the convolutional neural network containing the preset target-deformation-adaptive convolution operator.
  • Specifically, when constructing the model training data, the category and target boundary position points corresponding to each detection target in the historical panoramic images can first be annotated manually, and the model training data set can be constructed from the annotated historical panoramic images. Then, the initial convolutional neural network containing the target-deformation-adaptive convolution operator is trained through the model training data set, to obtain the convolutional neural network containing the preset target-deformation-adaptive convolution operator.
  • In other embodiments, besides the model training data set, the historical panoramic images can also be used to build a model validation group for verifying the trained convolutional neural network. Only when the recognition accuracy on the validation group exceeds a preset threshold can the trained model be used as the convolutional neural network containing the preset target-deformation-adaptive convolution operator; otherwise, the model parameters need to be adjusted before training again.
  • In this embodiment, by constructing the model training data set, the training of the neural network model can be completed effectively, ensuring the accuracy of target detection.
  • In one embodiment, step 205 includes: generating a data group corresponding to each detection target according to the target category and target boundary position point data of the detection target; and filling the data groups corresponding to the detection targets into a preset target detection result list, to generate the target detection result corresponding to the panoramic image to be detected.
  • Specifically, a blank target detection result list can be constructed in advance. After the target category and target boundary position point data of each detection target in the panoramic image are obtained through the neural network model, a data group corresponding to each detection target can be generated; the data group may specifically be an array including the target category and the target boundary position point data. The data groups corresponding to the detection targets are then filled into the blank preset target detection result list, so that the target detection result corresponding to the panoramic image to be detected is obtained in the form of a final list.
  • In this embodiment, generating the target detection result corresponding to the panoramic image to be detected by filling the data groups corresponding to the detection targets into the preset target detection result list can effectively improve the intuitiveness of the target detection result.
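The data-group and result-list assembly described above can be sketched as follows; the field names (`category`, `boundary_points`) and the list-of-arrays layout are illustrative assumptions, since the patent does not fix a concrete data format.

```python
def build_detection_result_list(detections):
    """Fill per-target data groups into a blank target detection result list.

    Each detection is assumed to carry a target category and its target
    boundary position points; each data group is an array of [category,
    boundary points], matching the description in the text.
    """
    result_list = []  # the pre-constructed blank target detection result list
    for det in detections:
        data_group = [det["category"], det["boundary_points"]]
        result_list.append(data_group)
    return result_list
```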
  • In one embodiment, a target detection apparatus for panoramic images is provided, including:
  • the data acquisition module 601 is configured to acquire the panoramic image to be detected.
  • the convolution processing module 603 is configured to perform convolution processing on the panoramic image to be detected through the convolutional neural network containing the preset target-deformation-adaptive convolution operator, and obtain the target category and target boundary position point data of each detection target in the panoramic image to be detected, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected.
  • the target detection module 605 is configured to obtain a target detection result corresponding to the panoramic image to be detected according to the target category of the detected target and the target boundary position point data.
  • In one embodiment, the convolution processing module 603 is specifically configured to: input the panoramic image to be detected into the convolutional neural network containing the preset target-deformation-adaptive convolution operator, and obtain the heat map, target category, and target boundary position point data corresponding to each initial detection target in the panoramic image to be detected; filter the initial detection targets according to the heat map to obtain the detection targets; and obtain the target category and target boundary position point data corresponding to each detection target.
  • In one embodiment, the initial detection targets include non-boundary position targets, and the convolutional neural network further includes a conventional convolution operator; the convolution processing module 603 is specifically configured to: extract panoramic image features of the panoramic image to be detected through the conventional convolution operator; and determine the non-boundary position targets in the panoramic image to be detected based on the panoramic image features, and obtain the heat map, target category, and target boundary position point data corresponding to the non-boundary position targets.
  • In one embodiment, the convolution processing module 603 is specifically configured to: extract panoramic image features of the panoramic image to be detected through the preset target-deformation-adaptive convolution operator; determine the boundary position targets located at the boundary of the panoramic image to be detected based on the panoramic image features, to obtain initial detection targets; identify, based on the panoramic image features, the target attribute between a first detection target and a second detection target, the first detection target and the second detection target being initial detection targets at opposite positions in the panoramic image; when the target attribute indicates that the first detection target and the second detection target are the same detection target, obtain the heat maps, target categories, and target boundary position point data corresponding to the first detection target and the second detection target; and obtain the heat map, target category, and target boundary position point data corresponding to the boundary position target in the panoramic image to be detected according to the heat maps, target categories, and target boundary position point data corresponding to the first detection target and the second detection target.
  • In one embodiment, the apparatus further includes a model training module, configured to: acquire historical panoramic images annotated with target categories and target boundary position points, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected; construct a model training data set according to the historical panoramic images; and train an initial convolutional neural network containing the preset target-deformation-adaptive convolution operator through the model training data set, to obtain the convolutional neural network containing the preset target-deformation-adaptive convolution operator.
  • In one embodiment, the target detection module 605 is specifically configured to: generate a data group corresponding to each detection target according to the target category and target boundary position point data of the detection target; and fill the data groups corresponding to the detection targets into a preset target detection result list, to generate the target detection result corresponding to the panoramic image to be detected.
  • Each module in the above target detection apparatus for panoramic images can be realized in whole or in part by software, hardware, or a combination thereof. The above modules can be embedded in or independent of the processor of the computer device in the form of hardware, or stored in the memory of the computer device in the form of software, so that the processor can invoke and execute the operations corresponding to the above modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure may be as shown in FIG. 7 .
  • the computer device includes a processor, a memory, and a network interface connected by a system bus, where the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer programs and databases.
  • the internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium.
  • the database of the computer device is used to store traffic forwarding data.
  • the network interface of the computer device is used to communicate with an external terminal via a network connection.
  • FIG. 7 is only a block diagram of part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied. A specific computer device may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.
  • In one embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program, wherein the processor implements the following steps when executing the computer program: acquiring a panoramic image to be detected; performing convolution processing on the panoramic image to be detected through a convolutional neural network containing a preset target-deformation-adaptive convolution operator, and obtaining the target category and target boundary position point data of each detection target in the panoramic image to be detected, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected; and obtaining the target detection result corresponding to the panoramic image to be detected according to the target category and target boundary position point data of each detection target.
  • When the processor executes the computer program, the following steps are also implemented: inputting the panoramic image to be detected into the convolutional neural network containing the preset target-deformation-adaptive convolution operator, and obtaining the heat map, target category, and target boundary position point data corresponding to each initial detection target in the panoramic image to be detected; filtering the initial detection targets according to the heat map to obtain the detection targets; and obtaining the target category and target boundary position point data corresponding to each detection target.
  • When the processor executes the computer program, the following steps are also implemented: extracting panoramic image features of the panoramic image to be detected through a conventional convolution operator; and determining the non-boundary position targets in the panoramic image to be detected based on the panoramic image features, and obtaining the heat map, target category, and target boundary position point data corresponding to the non-boundary position targets.
  • When the processor executes the computer program, the following steps are also implemented: extracting panoramic image features of the panoramic image to be detected through the preset target-deformation-adaptive convolution operator; determining the boundary position targets located at the boundary of the panoramic image to be detected based on the panoramic image features, to obtain initial detection targets; identifying, based on the panoramic image features, the target attribute between a first detection target and a second detection target, the first detection target and the second detection target being initial detection targets at opposite positions in the panoramic image; when the target attribute indicates that the first detection target and the second detection target are the same detection target, obtaining the heat maps, target categories, and target boundary position point data corresponding to the first detection target and the second detection target; and obtaining the heat map, target category, and target boundary position point data corresponding to the boundary position target in the panoramic image to be detected according to the heat maps, target categories, and target boundary position point data corresponding to the first detection target and the second detection target.
  • When the processor executes the computer program, the following steps are also implemented: acquiring historical panoramic images annotated with target categories and target boundary position points, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected; constructing a model training data set according to the historical panoramic images; and training an initial convolutional neural network containing the preset target-deformation-adaptive convolution operator through the model training data set, to obtain the convolutional neural network containing the preset target-deformation-adaptive convolution operator.
  • When the processor executes the computer program, the following steps are also implemented: generating a data group corresponding to each detection target according to the target category and target boundary position point data of the detection target; and filling the data groups corresponding to the detection targets into a preset target detection result list, to generate the target detection result corresponding to the panoramic image to be detected.
  • In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the following steps: acquiring a panoramic image to be detected; performing convolution processing on the panoramic image to be detected through a convolutional neural network containing a preset target-deformation-adaptive convolution operator, and obtaining the target category and target boundary position point data of each detection target in the panoramic image to be detected, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected; and obtaining the target detection result corresponding to the panoramic image to be detected according to the target category and target boundary position point data of each detection target.
  • When the computer program is executed by the processor, the following steps are also implemented: inputting the panoramic image to be detected into the convolutional neural network containing the preset target-deformation-adaptive convolution operator, and obtaining the heat map, target category, and target boundary position point data corresponding to each initial detection target in the panoramic image to be detected; filtering the initial detection targets according to the heat map to obtain the detection targets; and obtaining the target category and target boundary position point data corresponding to each detection target.
  • When the computer program is executed by the processor, the following steps are also implemented: extracting panoramic image features of the panoramic image to be detected through a conventional convolution operator; and determining the non-boundary position targets in the panoramic image to be detected based on the panoramic image features, and obtaining the heat map, target category, and target boundary position point data corresponding to the non-boundary position targets.
  • When the computer program is executed by the processor, the following steps are also implemented: extracting panoramic image features of the panoramic image to be detected through the preset target-deformation-adaptive convolution operator; determining the boundary position targets located at the boundary of the panoramic image to be detected based on the panoramic image features, to obtain initial detection targets; identifying, based on the panoramic image features, the target attribute between a first detection target and a second detection target, the first detection target and the second detection target being initial detection targets at opposite positions in the panoramic image; when the target attribute indicates that the first detection target and the second detection target are the same detection target, obtaining the heat maps, target categories, and target boundary position point data corresponding to the first detection target and the second detection target; and obtaining the heat map, target category, and target boundary position point data corresponding to the boundary position target in the panoramic image to be detected according to the heat maps, target categories, and target boundary position point data corresponding to the first detection target and the second detection target.
  • When the computer program is executed by the processor, the following steps are also implemented: acquiring historical panoramic images annotated with target categories and target boundary position points, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected; constructing a model training data set according to the historical panoramic images; and training an initial convolutional neural network containing the preset target-deformation-adaptive convolution operator through the model training data set, to obtain the convolutional neural network containing the preset target-deformation-adaptive convolution operator.
  • When the computer program is executed by the processor, the following steps are also implemented: generating a data group corresponding to each detection target according to the target category and target boundary position point data of the detection target; and filling the data groups corresponding to the detection targets into a preset target detection result list, to generate the target detection result corresponding to the panoramic image to be detected.
  • any references to memory, storage, database or other media used in the various embodiments provided in the present application may include at least one of non-volatile memory and volatile memory.
  • the non-volatile memory may include read-only memory (Read-Only Memory, ROM), magnetic tape, floppy disk, flash memory or optical memory, and the like.
  • Volatile memory may include random access memory (Random Access Memory, RAM) or external cache memory.
  • the RAM can be in various forms, such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present application relates to a target detection method and apparatus for panoramic images, a computer device, and a storage medium. The method acquires a panoramic image to be detected; performs convolution processing on the panoramic image through a convolutional neural network containing a preset target-deformation-adaptive convolution operator to obtain the target category and target boundary position point data of each detection target in the image, where the target boundary position point data includes target boundary position points whose coordinates exceed the boundary of the panoramic image; and obtains the target detection result corresponding to the panoramic image according to the target category and target boundary position point data of each detection target. The present application can effectively determine the regions of all detection targets, including targets split across the two ends of the panoramic image, thereby improving the accuracy of target detection in panoramic images.

Description

Target detection method and apparatus for panoramic images, computer device, and storage medium — Technical Field
The present application relates to the field of computers, and in particular to a target detection method and apparatus for panoramic images, a computer device, and a storage medium.
Background
With the development of artificial intelligence, computer vision technology has been applied more and more widely. Computer vision is a science that studies how to make machines "see": it uses cameras and computers in place of human eyes to identify, track, and measure targets, and further processes the images so that they become more suitable for human observation or for transmission to instruments for inspection. As a scientific discipline, computer vision studies related theories and technologies, attempting to build artificial intelligence systems that can obtain "information" from images or multi-dimensional data. Target detection in panoramic images is one of the research subjects of computer vision; it belongs to a sub-field of target detection and is a computer vision technique based on the statistical features and semantic information of panoramic targets, which can simultaneously obtain the category information and position information of targets in a panoramic image.
A panoramic image is a special kind of image, generally with an aspect ratio of 2:1, stitched from multiple images. According to the equirectangular projection, the width of the image corresponds to longitude 0–2π and the height to latitude 0–π, so it records all information over 360 degrees horizontally and 180 degrees vertically. At present, target detection on panoramic images is generally performed on the planar unfolded image, so some objects in the panoramic image are distorted, and the rectangular boxes in the detection results cannot reasonably enclose deformed and stretched targets, which causes deviations in the detection results.
To address the phenomenon that some objects in panoramic images are distorted and detection results deviate, BFoV (Bounding Field-of-View) representations can currently be used to represent targets in panoramic images. BFoV treats the panoramic image as a sphere, represents a target's center point by its latitude and longitude coordinates, and represents the space the target occupies by its two fields of view in the horizontal and vertical directions. However, because of the particular symmetry of BFoV, the detection region it represents still cannot always well contain tilted or deformed targets in the panoramic image, which affects the detection performance of target detection.
Summary of the Invention
On this basis, in view of the above technical problems, it is necessary to provide a target detection method and apparatus for panoramic images, a computer device, and a storage medium that can improve the accuracy of target detection in panoramic images.
A target detection method for panoramic images, the method comprising:
acquiring a panoramic image to be detected;
performing convolution processing on the panoramic image to be detected through a convolutional neural network containing a preset target-deformation-adaptive convolution operator, and obtaining the target category and target boundary position point data of each detection target in the panoramic image to be detected, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected;
obtaining the target detection result corresponding to the panoramic image to be detected according to the target category and target boundary position point data of each detection target.
In one embodiment, performing convolution processing on the panoramic image to be detected through the convolutional neural network containing the preset target-deformation-adaptive convolution operator, and obtaining the target category and target boundary position point data of each detection target in the panoramic image to be detected, includes:
inputting the panoramic image to be detected into the convolutional neural network containing the preset target-deformation-adaptive convolution operator, and obtaining the heat map, target category, and target boundary position point data corresponding to each initial detection target in the panoramic image to be detected;
filtering the initial detection targets according to the heat map to obtain the detection targets;
obtaining the target category and target boundary position point data corresponding to each detection target.
In one embodiment, the initial detection targets include non-boundary position targets, and the convolutional neural network further includes a conventional convolution operator;
inputting the panoramic image to be detected into the convolutional neural network containing the preset target-deformation-adaptive convolution operator, and obtaining the heat map, target category, and target boundary position point data corresponding to each initial detection target in the panoramic image to be detected, includes:
extracting panoramic image features of the panoramic image to be detected through the conventional convolution operator;
determining the non-boundary position targets in the panoramic image to be detected based on the panoramic image features, and obtaining the heat map, target category, and target boundary position point data corresponding to the non-boundary position targets.
In one embodiment, the initial detection targets include boundary position targets, and inputting the panoramic image to be detected into the convolutional neural network containing the preset target-deformation-adaptive convolution operator and obtaining the heat map, target category, and target boundary position point data corresponding to each initial detection target in the panoramic image to be detected includes:
extracting panoramic image features of the panoramic image to be detected through the preset target-deformation-adaptive convolution operator;
determining the boundary position targets located at the boundary of the panoramic image to be detected based on the panoramic image features, to obtain initial detection targets;
identifying, based on the panoramic image features, the target attribute between a first detection target and a second detection target, the first detection target and the second detection target being initial detection targets at opposite positions in the panoramic image;
when the target attribute indicates that the first detection target and the second detection target are the same detection target, obtaining the heat maps, target categories, and target boundary position point data corresponding to the first detection target and the second detection target;
obtaining the heat map, target category, and target boundary position point data corresponding to the boundary position target in the panoramic image to be detected according to the heat maps, target categories, and target boundary position point data corresponding to the first detection target and the second detection target.
In one embodiment, before performing convolution processing on the panoramic image to be detected through the convolutional neural network containing the preset target-deformation-adaptive convolution operator and obtaining the target category and target boundary position point data of each detection target, the method further includes:
acquiring historical panoramic images annotated with target categories and target boundary position points, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected;
constructing a model training data set according to the historical panoramic images;
training an initial convolutional neural network containing the preset target-deformation-adaptive convolution operator through the model training data set, to obtain the convolutional neural network containing the preset target-deformation-adaptive convolution operator.
In one embodiment, obtaining the target detection result corresponding to the panoramic image to be detected according to the target category and target boundary position point data of each detection target includes:
generating a data group corresponding to each detection target according to the target category and target boundary position point data of the detection target;
filling the data groups corresponding to the detection targets into a preset target detection result list, to generate the target detection result corresponding to the panoramic image to be detected.
A target detection apparatus for panoramic images, the apparatus comprising:
a data acquisition module, configured to acquire a panoramic image to be detected;
a convolution processing module, configured to perform convolution processing on the panoramic image to be detected through a convolutional neural network containing a preset target-deformation-adaptive convolution operator, and obtain the target category and target boundary position point data of each detection target in the panoramic image to be detected, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected;
a target detection module, configured to obtain the target detection result corresponding to the panoramic image to be detected according to the target category and target boundary position point data of each detection target.
In one embodiment, the convolution processing module is specifically configured to: input the panoramic image to be detected into the convolutional neural network containing the preset target-deformation-adaptive convolution operator, and obtain the heat map, target category, and target boundary position point data corresponding to each initial detection target in the panoramic image to be detected; filter the initial detection targets according to the heat map to obtain the detection targets; and obtain the target category and target boundary position point data corresponding to each detection target.
A computer device, comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the following steps when executing the computer program:
acquiring a panoramic image to be detected;
performing convolution processing on the panoramic image to be detected through a convolutional neural network containing a preset target-deformation-adaptive convolution operator, and obtaining the target category and target boundary position point data of each detection target in the panoramic image to be detected, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected;
obtaining the target detection result corresponding to the panoramic image to be detected according to the target category and target boundary position point data of each detection target.
A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the following steps:
acquiring a panoramic image to be detected;
performing convolution processing on the panoramic image to be detected through a convolutional neural network containing a preset target-deformation-adaptive convolution operator, and obtaining the target category and target boundary position point data of each detection target in the panoramic image to be detected, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected;
obtaining the target detection result corresponding to the panoramic image to be detected according to the target category and target boundary position point data of each detection target.
In the above target detection method and apparatus for panoramic images, computer device, and storage medium, a panoramic image to be detected is acquired; convolution processing is performed on the panoramic image through a convolutional neural network containing a preset target-deformation-adaptive convolution operator to obtain the target category and target boundary position point data of each detection target, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image; and the target detection result corresponding to the panoramic image is obtained according to the target category and target boundary position point data of each detection target. When detecting a panoramic image, the present application extracts the target category and target boundary position point data of each detection target through the preset target-deformation-adaptive convolution operator, and then obtains the final target detection result from these data. Representing the position of a detection target by its target boundary position points can effectively determine the regions of all detection targets, including detection targets at the boundary of the panoramic image, thereby improving the accuracy of target detection in panoramic images.
Brief Description of the Drawings
FIG. 1 is an application environment diagram of a target detection method for panoramic images in one embodiment;
FIG. 2 is a schematic flowchart of a target detection method for panoramic images in one embodiment;
FIG. 3 is a schematic sub-flowchart of step 203 in FIG. 2 in one embodiment;
FIG. 4 is a schematic sub-flowchart of step 302 in FIG. 3 in one embodiment;
FIG. 5 is a schematic flowchart of the neural network model training steps in one embodiment;
FIG. 6 is a structural block diagram of a target detection apparatus for panoramic images in one embodiment;
FIG. 7 is an internal structure diagram of a computer device in one embodiment.
Detailed Description of the Embodiments
To make the purpose, technical solutions, and advantages of the present application clearer, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present application, and are not intended to limit it.
The applicant found that panoramic distortion generally exists in current panoramic images. Panoramic distortion means that, during the scanning and imaging of a panoramic image, the image distance remains constant while the object distance increases with the scanning angle, so the scale of the image gradually shrinks from the center to the two sides. Most existing target detection algorithms for panoramic images use the target's Bounding-Box (BBox) or Bounding Field-of-View (BFoV). In panoramic images, however, because of distortion, the rectangular boxes detected by BBox-based methods cannot reasonably enclose deformed and stretched detection targets, causing detection to fail. As for BFoV, it treats the panoramic image as a sphere, represents a detection target's center point by its latitude and longitude coordinates, and represents the space the target occupies by its two fields of view (Field-of-Views) in the horizontal and vertical directions. BFoV is specifically defined as (φ, θ, h, w): φ and θ are the latitude and longitude coordinates of the target on the sphere, and h and w are the target's two fields of view in the horizontal and vertical directions, similar to height and width. BFoV can stretch to contain targets distorted in the upper and lower regions, and because it is defined on a sphere, it also avoids the problem of being unable to determine that the left and right edges belong to the same target. However, because of the particular symmetry of BFoV, when an object in the panoramic image is asymmetric across the two boundaries, the region detected by BFoV still cannot always well contain the detection target, which affects detection accuracy. In view of this situation, the applicant proposes the target detection method of the present application.
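As context for the BFoV representation above, a minimal sketch of mapping a BFoV center (φ, θ) onto a 2:1 equirectangular image is shown below; the pixel convention and function name are assumptions, since the patent defines BFoV only as (φ, θ, h, w) without fixing a pixel mapping.

```python
import math

def bfov_center_to_pixel(phi, theta, img_w, img_h):
    """Map a BFoV center (phi = latitude-like coordinate in [0, pi],
    theta = longitude in [0, 2*pi]) to pixel coordinates on a 2:1
    equirectangular image. This is the standard equirectangular mapping,
    used here only to illustrate the representation discussed in the text.
    """
    x = theta / (2 * math.pi) * img_w   # longitude spans the image width
    y = phi / math.pi * img_h           # latitude spans the image height
    return x, y
```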
The target detection method for panoramic images provided by the present application can be applied in the application environment shown in FIG. 1, in which a terminal 102 communicates with a server 104 over a network. When a data processing worker on the terminal 102 side needs to detect targets in a panoramic image, the panoramic image to be detected can be sent to the server 104, and the server 104 performs target detection on the panoramic image submitted by the terminal 102: the server 104 acquires the panoramic image to be detected; performs convolution processing on it through a convolutional neural network containing a preset target-deformation-adaptive convolution operator to obtain the target category and target boundary position point data of each detection target, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image; and obtains the target detection result corresponding to the panoramic image according to the target category and target boundary position point data of each detection target. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet computer, or portable wearable device, and the server 104 may be implemented by an independent server or a server cluster composed of multiple servers.
In one embodiment, as shown in FIG. 2, a target detection method for panoramic images is provided. Taking the method applied to the server 104 in FIG. 1 as an example, it includes the following steps:
Step 201: acquire a panoramic image to be detected.
Step 203: perform convolution processing on the panoramic image to be detected through a convolutional neural network containing a preset target-deformation-adaptive convolution operator, and obtain the target category and target boundary position point data of each detection target in the panoramic image to be detected, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected.
Here, a panoramic image is a special kind of image, generally with an aspect ratio of 2:1, stitched from multiple images. According to the equirectangular projection, the width of the image corresponds to longitude 0–2π and the height to latitude 0–π, so it records all information over 360 degrees horizontally and 180 degrees vertically. At present, when performing target detection on a panoramic image, some objects are split across the left and right sides of the image in the horizontal direction, so they cannot be detected as the same object; meanwhile, because of panoramic distortion, detection methods that use rectangular boxes cannot effectively enclose the detection targets, which reduces the accuracy of target detection in panoramic images. Accurate target detection for panoramic images can be achieved through the target detection method of the present application. An operator is the basic unit of neural network computation, and the convolution operation is the main component of a neural network, used to extract the statistical and semantic features of targets. The preset target-deformation-adaptive convolution operator refers to modifying an existing target detection convolutional neural network model by replacing some convolution operators with convolution operators that can adapt to target deformation, such as deformable convolution, equirectangular projection convolution, and spherical convolution; these preset target-deformation-adaptive convolution operators are obtained by training with panoramic images. A target boundary position point may specifically be a coordinate point on the boundary of a detection target, and one detection target may have multiple boundary coordinate points. In one embodiment, there are nine boundary coordinate points: the left, center, and right points of the top, middle, and bottom of the target, i.e., top-left, top-center, top-right, middle-left, center, middle-right, bottom-left, bottom-center, and bottom-right. The region formed by connecting these nine boundary points represents the region where the target is located; compared with a four-point rectangular box, this representation is more extensible and can express more complex shapes, effectively adapting to distortions such as tilting and stretching, thereby detecting targets more accurately. Target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected specifically include points with negative coordinates and points with coordinates greater than the image width, where negative coordinates mean the target extends beyond the left boundary and coordinates greater than the image width mean it extends beyond the right boundary. In a panoramic image, such out-of-boundary coordinates are meaningful: they indicate that part of the target also exists on the other side of the image. The target categories of the detection targets are preset data; depending on the purpose of target detection, the categories to be recognized can be set when the convolutional neural network is trained. The convolutional neural network is not limited here and may specifically be implemented by anchor-free target detection neural network models such as CornerNet, CenterNet, and FCOS.
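The nine-point representation and the out-of-boundary coordinates described above can be illustrated with a minimal sketch; the function name, point ordering, and use of Python's modulo for wrap-around are illustrative assumptions, not part of the patent.

```python
def nine_point_region(points, img_w):
    """Normalize the 9 boundary points of a target on an equirectangular image.

    Points may have x < 0 (part of the target wraps past the left edge) or
    x > img_w (wraps past the right edge); taking x modulo the image width
    recovers where each point actually lies on the image. Python's `%`
    returns a non-negative result for a positive modulus, so -10 % 100 == 90.
    """
    assert len(points) == 9  # top/middle/bottom rows x left/center/right columns
    return [(x % img_w, y) for (x, y) in points]
```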
Specifically, when the terminal 102 side needs to perform target detection on a panoramic image, the panoramic image to be detected can be submitted to the server 104 through the terminal 102, so that the server 104 performs the corresponding target detection and determines the types and positions of the detection targets in the image. After receiving the panoramic image to be detected, the server 104 can perform convolution processing on it through the convolutional neural network containing the preset target-deformation-adaptive convolution operator, and obtain the target category and target boundary position point data of each detection target, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the image. During target detection, the detection model needs to be able to extract features from the left and right sides of the image and determine that they belong to the same target; traditional convolutional neural networks handle this situation poorly. Therefore, the present application replaces some traditional convolution operators with preset target-deformation-adaptive convolution operators to build a convolution model better suited to panoramic images, which adapts better to target deformation in panoramic images. By performing convolution processing on target detection candidate regions at the boundary of the panoramic image through the preset target-deformation-adaptive convolution operator, it can be effectively determined whether the targets on the left and right sides of the panoramic image are the same target, and for the same target the corresponding target category and a set of target boundary position point data are output. Targets at non-boundary positions can be detected by the other conventional target detection convolution operators of the convolutional neural network.
Step 205: obtain the target detection result corresponding to the panoramic image to be detected according to the target category and target boundary position point data of each detection target.
Specifically, after the target category and target boundary position point data of all detection targets in the panoramic image to be detected are obtained, these data can be organized: the target category and target boundary position point data of each detection target are merged and organized, and the target detection result corresponding to the panoramic image to be detected is output.
In the above target detection method for panoramic images, a panoramic image to be detected is acquired; convolution processing is performed on it through a convolutional neural network containing a preset target-deformation-adaptive convolution operator to obtain the target category and target boundary position point data of each detection target, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image; and the target detection result corresponding to the panoramic image is obtained according to these data. When detecting a panoramic image, the present application extracts the target category and target boundary position point data of each detection target through the preset target-deformation-adaptive convolution operator, and then obtains the final target detection result from these data. Representing the position of a detection target by its target boundary position points can effectively determine the regions of all detection targets, including detection targets at the boundary of the panoramic image, thereby improving the accuracy of target detection in panoramic images.
In one embodiment, as shown in FIG. 3, step 203 includes:
Step 302: input the panoramic image to be detected into the convolutional neural network containing the preset target-deformation-adaptive convolution operator, and obtain the heat map, target category, and target boundary position point data corresponding to each initial detection target in the panoramic image to be detected.
Step 304: filter the initial detection targets according to the heat map to obtain the detection targets.
Step 306: obtain the target category and target boundary position point data corresponding to each detection target.
Here, the heat map is the feature map output by the neural network; each point in it represents the confidence that a target exists at that position, so whether each point of the panoramic image to be detected contains a detection target can be determined based on the heat map.
Specifically, during target detection, the panoramic image to be detected can be input into the trained convolutional neural network containing the preset target-deformation-adaptive convolution operator. The network outputs multiple branches: the heat map, target category, and target boundary position point data of the detection targets, where the target boundary position point data may specifically include the offsets of the target boundary position points, through which the points can be located. The server 104 then parses the network output and uses the heat map to filter out detection targets below a certain threshold, keeping high-confidence detection targets; finally, the target category and target boundary position point data corresponding to the required detection targets are obtained, achieving target detection on the panoramic image. In this embodiment, filtering the initial detection targets through the heat map can effectively exclude initial detection targets that do not meet the requirements and improve the accuracy of target detection.
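The heat-map filtering step described above can be sketched as follows; the field name `heatmap_score` and the threshold value are illustrative assumptions (the patent only requires filtering out targets below a certain threshold).

```python
def filter_by_heatmap(initial_targets, threshold=0.5):
    """Keep only initial detection targets whose heat-map confidence meets
    the threshold, discarding low-confidence candidates. Each target is
    assumed to carry the heat-map value at its location as `heatmap_score`.
    """
    return [t for t in initial_targets if t["heatmap_score"] >= threshold]
```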
In one embodiment, the initial detection targets include non-boundary position targets and the convolutional neural network further includes a conventional convolution operator, and step 302 includes: extracting panoramic image features of the panoramic image to be detected through the conventional convolution operator; and determining the non-boundary position targets in the panoramic image to be detected based on the panoramic image features, and obtaining the heat map, target category, and target boundary position point data corresponding to the non-boundary position targets.
Here, a non-boundary position target refers to an initial detection target that is not split across the two ends of the panoramic image; a non-boundary target is a complete target, generally located in the middle of the image. The panoramic image features may include the coordinate positions of the currently detected initial detection targets in the panoramic image to be detected, so which initial detection targets are non-boundary position targets can be determined based on the panoramic image features.
Specifically, non-boundary position targets can be detected by performing convolution with the conventional convolution operators of the convolutional neural network to extract the corresponding heat map, target category, and target boundary position point data. When identifying non-boundary position targets, it can specifically be determined whether the coordinates of an initial detection target include coordinates at the boundary of the panoramic image to be detected. When the coordinates of an initial detection target do not include boundary coordinates, i.e., all of its coordinates lie within the boundary of the panoramic image to be detected, the initial detection target is taken as a non-boundary position target. In this embodiment, by extracting the panoramic image features of the panoramic image to be detected, initial detection targets corresponding to non-boundary position targets can be detected effectively, ensuring the detection effect.
In one embodiment, as shown in FIG. 4, the initial detection targets further include boundary position targets, and step 302 includes:
Step 401: extract panoramic image features of the panoramic image to be detected through the preset target-deformation-adaptive convolution operator.
Step 403: determine the boundary position targets located at the boundary of the panoramic image to be detected based on the panoramic image features, to obtain initial detection targets.
Step 405: identify, based on the panoramic image features, the target attribute between a first detection target and a second detection target, the first detection target and the second detection target being initial detection targets at opposite positions in the panoramic image.
Step 407: when the target attribute indicates that the first detection target and the second detection target are the same detection target, obtain the heat maps, target categories, and target boundary position point data corresponding to the first detection target and the second detection target.
Step 409: obtain the heat map, target category, and target boundary position point data corresponding to the boundary position target in the panoramic image to be detected according to the heat maps, target categories, and target boundary position point data corresponding to the first detection target and the second detection target.
Here, a boundary position target refers to an initial detection target that has been split; the different parts of a boundary position target are generally located at the left and right ends of the panoramic image. The preset target-deformation-adaptive convolution operator can effectively adapt to target deformation, and extracting features from the panoramic image through it can effectively extract the panoramic image features corresponding to boundary position targets. The detection targets here are targets located at the boundary of the panoramic image to be detected, which have been split by the panoramic image to the two sides of the image. The target attribute is specifically used to determine whether the two detection targets at opposite positions, the first detection target and the second detection target, are the same target: when the two detection targets at opposite positions are the same target, their target attribute is "same"; when they are not the same target, their target attribute is "different".
Specifically, when identifying coordinates at the boundary, the target may already have been split across the two opposite boundaries of the panoramic image, producing target deformation. Therefore, the panoramic image features corresponding to these targets can be extracted through the preset target-deformation-adaptive convolution operator. Based on the extracted panoramic image features, it is further determined which targets are detection targets, and the target attribute between the two detection targets at opposite positions is identified. For example, for a panoramic image whose width spans longitude 0–2π and whose height spans latitude 0–π, a two-dimensional plane coordinate system can be established with the bottom-left corner of the image as the origin, the width direction as the X axis, and the height direction as the Y axis; the boundary positions of the image are then the left boundary X = 0 and the right boundary X = 2π. Detection targets at opposite positions specifically refer to detection targets that share the same Y-axis coordinate. For example, if the coordinates of a detection target A are recognized to include (0, 0.5π), it can be determined that a detection target B whose coordinates include (2π, 0.5π) is the boundary position target at the position opposite to detection target A. Then, based on the panoramic image features extracted by the convolutional neural network, it can be further determined whether the two detection targets at opposite positions are the same detection target. When the target attribute indicates that the detection targets at opposite positions are the same target, the target boundary position point data corresponding to the detection target can be obtained: that is, the heat map, target category, and target boundary position point data corresponding to the boundary position target in the panoramic image to be detected are obtained from the heat maps, target categories, and target boundary position point data of the first and second detection targets. Because the first detection target and the second detection target are the same target, either one can be selected as the final boundary position target during recognition; generally, the detection target on a fixed side (left or right) is taken as the final boundary position target. In this embodiment, by extracting the panoramic image features of the target detection candidate regions, the target boundary position point data corresponding to the detection targets can be detected effectively, ensuring the detection effect.
In one embodiment, as shown in FIG. 5, before step 203, the method further includes:
Step 502: acquire historical panoramic images annotated with target categories and target boundary position points; the target boundary position point data includes target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected.
Step 504: construct a model training data set according to the historical panoramic images.
Step 506: train an initial convolutional neural network containing the preset target-deformation-adaptive convolution operator through the model training data set, to obtain the convolutional neural network containing the preset target-deformation-adaptive convolution operator.
Here, historical panoramic images specifically refer to panoramic images in the historical data that contain detection targets under each target category; these images can be used to train the convolutional neural network in its initial state, to obtain the convolutional neural network containing the preset target-deformation-adaptive convolution operator.
Specifically, when constructing the model training data, the category and target boundary position points corresponding to each detection target in the historical panoramic images can first be annotated manually, and the model training data set can be constructed from the annotated historical panoramic images. Then, the initial convolutional neural network containing the target-deformation-adaptive convolution operator is trained through the model training data set, to obtain the convolutional neural network containing the preset target-deformation-adaptive convolution operator. In other embodiments, besides the model training data set, the historical panoramic images can also be used to build a model validation group for verifying the trained convolutional neural network; only when the recognition accuracy on the validation group exceeds a preset threshold can the trained model be used as the convolutional neural network containing the preset target-deformation-adaptive convolution operator, and otherwise the model parameters need to be adjusted before training again. In this embodiment, by constructing the model training data set, the training of the neural network model can be completed effectively, ensuring the accuracy of target detection.
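The train-then-validate gate described in this embodiment can be sketched as follows; the function names, round limit, and the 0.9 threshold default are illustrative assumptions (the patent only requires the validation-group accuracy to exceed a preset threshold before the model is accepted).

```python
def train_until_valid(train_fn, validate_fn, threshold=0.9, max_rounds=5):
    """Train/validate loop sketch: train the initial network, verify it on
    the validation group, and accept the model only once validation accuracy
    exceeds the preset threshold; otherwise adjust parameters and retrain.
    `train_fn` and `validate_fn` stand in for the actual training and
    validation routines, which the patent does not specify.
    """
    for round_idx in range(max_rounds):
        model = train_fn(round_idx)        # parameters may be adjusted per round
        if validate_fn(model) > threshold: # accuracy gate on the validation group
            return model                   # accepted as the final network
    return None                            # never passed the validation gate
```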
In one embodiment, step 205 includes: generating a data group corresponding to each detection target according to the target category and target boundary position point data of the detection target; and filling the data groups corresponding to the detection targets into a preset target detection result list, to generate the target detection result corresponding to the panoramic image to be detected.
Specifically, a blank target detection result list can be constructed in advance. After the target category and target boundary position point data of each detection target in the panoramic image are obtained through the neural network model, a data group corresponding to each detection target can be generated; the data group may specifically be an array including the target category and the target boundary position point data. The data groups corresponding to the detection targets are then filled into the blank preset target detection result list, so that the target detection result corresponding to the panoramic image to be detected is obtained in the form of a final list. In this embodiment, generating the target detection result corresponding to the panoramic image to be detected by filling the data groups corresponding to the detection targets into the preset target detection result list can effectively improve the intuitiveness of the target detection result.
It should be understood that, although the steps in the flowcharts of FIGS. 2-5 are displayed in sequence as indicated by the arrows, these steps are not necessarily executed in that order. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in FIGS. 2-5 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments, and their execution order is not necessarily sequential; they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 6, a target detection apparatus for panoramic images is provided, including:
a data acquisition module 601, configured to acquire a panoramic image to be detected;
a convolution processing module 603, configured to perform convolution processing on the panoramic image to be detected through a convolutional neural network containing a preset target-deformation-adaptive convolution operator, and obtain the target category and target boundary position point data of each detection target in the panoramic image to be detected, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected;
a target detection module 605, configured to obtain the target detection result corresponding to the panoramic image to be detected according to the target category and target boundary position point data of each detection target.
In one embodiment, the convolution processing module 603 is specifically configured to: input the panoramic image to be detected into the convolutional neural network containing the preset target-deformation-adaptive convolution operator, and obtain the heat map, target category, and target boundary position point data corresponding to each initial detection target in the panoramic image to be detected; filter the initial detection targets according to the heat map to obtain the detection targets; and obtain the target category and target boundary position point data corresponding to each detection target.
In one embodiment, the initial detection targets include non-boundary position targets and the convolutional neural network further includes a conventional convolution operator, and the convolution processing module 603 is specifically configured to: extract panoramic image features of the panoramic image to be detected through the conventional convolution operator; and determine the non-boundary position targets in the panoramic image to be detected based on the panoramic image features, and obtain the heat map, target category, and target boundary position point data corresponding to the non-boundary position targets.
In one embodiment, the convolution processing module 603 is specifically configured to: extract panoramic image features of the panoramic image to be detected through the preset target-deformation-adaptive convolution operator; determine the boundary position targets located at the boundary of the panoramic image to be detected based on the panoramic image features, to obtain initial detection targets; identify, based on the panoramic image features, the target attribute between a first detection target and a second detection target, the first detection target and the second detection target being initial detection targets at opposite positions in the panoramic image; when the target attribute indicates that the first detection target and the second detection target are the same detection target, obtain the heat maps, target categories, and target boundary position point data corresponding to the first detection target and the second detection target; and obtain the heat map, target category, and target boundary position point data corresponding to the boundary position target in the panoramic image to be detected according to the heat maps, target categories, and target boundary position point data corresponding to the first detection target and the second detection target.
In one embodiment, the apparatus further includes a model training module, configured to: acquire historical panoramic images annotated with target categories and target boundary position points, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected; construct a model training data set according to the historical panoramic images; and train an initial convolutional neural network containing the preset target-deformation-adaptive convolution operator through the model training data set, to obtain the convolutional neural network containing the preset target-deformation-adaptive convolution operator.
In one embodiment, the target detection module 605 is specifically configured to: generate a data group corresponding to each detection target according to the target category and target boundary position point data of the detection target; and fill the data groups corresponding to the detection targets into a preset target detection result list, to generate the target detection result corresponding to the panoramic image to be detected.
For specific limitations on the target detection apparatus for panoramic images, reference may be made to the limitations on the target detection method for panoramic images above, which are not repeated here. Each module in the above target detection apparatus for panoramic images can be realized in whole or in part by software, hardware, or a combination thereof. The above modules can be embedded in or independent of the processor of the computer device in the form of hardware, or stored in the memory of the computer device in the form of software, so that the processor can invoke and execute the operations corresponding to the above modules.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure diagram may be as shown in FIG. 7. The computer device includes a processor, a memory, and a network interface connected by a system bus, where the processor of the computer device is used to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used to store traffic forwarding data. The network interface of the computer device is used to communicate with an external terminal through a network connection. When executed by the processor, the computer program implements a target detection method for panoramic images.
Those skilled in the art can understand that the structure shown in FIG. 7 is only a block diagram of part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied. A specific computer device may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program. When executing the computer program, the processor implements the following steps:
acquiring a panoramic image to be detected;
performing convolution processing on the panoramic image to be detected through a convolutional neural network containing a preset target-deformation-adaptive convolution operator, to obtain the target class and target boundary position point data of a detection target in the panoramic image to be detected, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected; and
obtaining, according to the target class and target boundary position point data of the detection target, a target detection result corresponding to the panoramic image to be detected.
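As an intuition for the convolution step above: an equirectangular panorama wraps around horizontally, so a convolution can be fed circularly padded input and thereby "see" across the left/right seam. The sketch below uses plain circular padding as an analogy only; it is not the preset target-deformation-adaptive convolution operator, whose exact form is defined elsewhere in this application, and all names are illustrative.

```python
import numpy as np

# Analogy only: a 2-D convolution that is valid vertically but circularly
# padded horizontally, so its receptive field crosses the panorama seam.

def circular_pad_conv(image, kernel):
    """Convolve `kernel` over `image`, wrapping the left/right edges."""
    kh, kw = kernel.shape
    pad = kw // 2
    # wrap the edges around before convolving
    padded = np.concatenate([image[:, -pad:], image, image[:, :pad]], axis=1)
    h, w = image.shape
    out = np.zeros((h - kh + 1, w))
    for i in range(h - kh + 1):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

img = np.arange(16, dtype=float).reshape(4, 4)
k = np.ones((3, 3)) / 9.0
print(circular_pad_conv(img, k).shape)  # (2, 4): full width kept by wrapping
```

Because the padding wraps, a feature near the left edge is computed from pixels on both sides of the seam, which is one way a network could localize a target whose boundary points lie outside [0, width).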
In one embodiment, when executing the computer program, the processor further implements the following steps: inputting the panoramic image to be detected into the convolutional neural network containing the preset target-deformation-adaptive convolution operator, to obtain heat maps, target classes and target boundary position point data corresponding to initial detection targets in the panoramic image to be detected; filtering the initial detection targets according to the heat maps to obtain detection targets; and obtaining the target classes and target boundary position point data corresponding to the detection targets.
In one embodiment, when executing the computer program, the processor further implements the following steps: extracting panoramic image features of the panoramic image to be detected through the conventional convolution operator; and determining the non-boundary-position targets in the panoramic image to be detected based on the panoramic image features, to obtain the heat maps, target classes and target boundary position point data corresponding to the non-boundary-position targets.
In one embodiment, when executing the computer program, the processor further implements the following steps: extracting panoramic image features of the panoramic image to be detected through the preset target-deformation-adaptive convolution operator; determining, based on the panoramic image features, boundary-position targets located at boundary positions in the panoramic image to be detected, to obtain initial detection targets; identifying, based on the panoramic image features, a target attribute between a first detection target and a second detection target, the first detection target and the second detection target being initial detection targets located at opposite positions in the panoramic image; when the target attribute indicates that the first detection target and the second detection target are the same detection target, obtaining the heat maps, target classes and target boundary position point data corresponding to the first detection target and the second detection target; and obtaining, according to the heat maps, target classes and target boundary position point data corresponding to the first detection target and the second detection target, the heat map, target class and target boundary position point data corresponding to the boundary-position target in the panoramic image to be detected.
In one embodiment, when executing the computer program, the processor further implements the following steps: acquiring historical panoramic images annotated with target classes and target boundary position points, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected; constructing model training data groups from the historical panoramic images; and training, with the model training data groups, an initial convolutional neural network containing the preset target-deformation-adaptive convolution operator, to obtain the convolutional neural network containing the preset target-deformation-adaptive convolution operator.
In one embodiment, when executing the computer program, the processor further implements the following steps: generating, according to the target class and target boundary position point data of the detection target, a data group corresponding to the detection target; and filling the data groups corresponding to the respective detection targets into a preset target detection result list, to generate the target detection result corresponding to the panoramic image to be detected.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program implements the following steps:
acquiring a panoramic image to be detected;
performing convolution processing on the panoramic image to be detected through a convolutional neural network containing a preset target-deformation-adaptive convolution operator, to obtain the target class and target boundary position point data of a detection target in the panoramic image to be detected, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected; and obtaining, according to the target class and target boundary position point data of the detection target, a target detection result corresponding to the panoramic image to be detected.
In one embodiment, when executed by a processor, the computer program further implements the following steps: inputting the panoramic image to be detected into the convolutional neural network containing the preset target-deformation-adaptive convolution operator, to obtain heat maps, target classes and target boundary position point data corresponding to initial detection targets in the panoramic image to be detected; filtering the initial detection targets according to the heat maps to obtain detection targets; and obtaining the target classes and target boundary position point data corresponding to the detection targets.
In one embodiment, when executed by a processor, the computer program further implements the following steps: extracting panoramic image features of the panoramic image to be detected through the conventional convolution operator; and determining the non-boundary-position targets in the panoramic image to be detected based on the panoramic image features, to obtain the heat maps, target classes and target boundary position point data corresponding to the non-boundary-position targets.
In one embodiment, when executed by a processor, the computer program further implements the following steps: extracting panoramic image features of the panoramic image to be detected through the preset target-deformation-adaptive convolution operator; determining, based on the panoramic image features, boundary-position targets located at boundary positions in the panoramic image to be detected, to obtain initial detection targets; identifying, based on the panoramic image features, a target attribute between a first detection target and a second detection target, the first detection target and the second detection target being initial detection targets located at opposite positions in the panoramic image; when the target attribute indicates that the first detection target and the second detection target are the same detection target, obtaining the heat maps, target classes and target boundary position point data corresponding to the first detection target and the second detection target; and obtaining, according to the heat maps, target classes and target boundary position point data corresponding to the first detection target and the second detection target, the heat map, target class and target boundary position point data corresponding to the boundary-position target in the panoramic image to be detected.
In one embodiment, when executed by a processor, the computer program further implements the following steps: acquiring historical panoramic images annotated with target classes and target boundary position points, the target boundary position point data including target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected; constructing model training data groups from the historical panoramic images; and training, with the model training data groups, an initial convolutional neural network containing the preset target-deformation-adaptive convolution operator, to obtain the convolutional neural network containing the preset target-deformation-adaptive convolution operator.
In one embodiment, when executed by a processor, the computer program further implements the following steps: generating, according to the target class and target boundary position point data of the detection target, a data group corresponding to the detection target; and filling the data groups corresponding to the respective detection targets into a preset target detection result list, to generate the target detection result corresponding to the panoramic image to be detected.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments may be implemented by instructing relevant hardware through a computer program. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, database or other media used in the embodiments provided in the present application may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, and the like. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM may take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The technical features of the above embodiments may be combined arbitrarily. For conciseness of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of these technical features, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and their descriptions are relatively specific and detailed, but they should not therefore be construed as limiting the scope of the invention patent. It should be noted that, for those of ordinary skill in the art, several variations and improvements can be made without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of the patent of the present application shall be subject to the appended claims.

Claims (10)

  1. A method for target detection in panoramic images, the method comprising:
    acquiring a panoramic image to be detected;
    performing convolution processing on the panoramic image to be detected through a convolutional neural network containing a preset target-deformation-adaptive convolution operator, to obtain a target class and target boundary position point data of a detection target in the panoramic image to be detected, the target boundary position point data comprising target boundary position points whose coordinates exceed a boundary of the panoramic image to be detected; and
    obtaining, according to the target class and the target boundary position point data of the detection target, a target detection result corresponding to the panoramic image to be detected.
  2. The method according to claim 1, wherein the performing convolution processing on the panoramic image to be detected through a convolutional neural network containing a preset target-deformation-adaptive convolution operator, to obtain a target class and target boundary position point data of a detection target in the panoramic image to be detected comprises:
    inputting the panoramic image to be detected into the convolutional neural network containing the preset target-deformation-adaptive convolution operator, to obtain heat maps, target classes and target boundary position point data corresponding to initial detection targets in the panoramic image to be detected;
    filtering the initial detection targets according to the heat maps to obtain detection targets; and
    obtaining the target classes and the target boundary position point data corresponding to the detection targets.
  3. The method according to claim 2, wherein the initial detection targets comprise non-boundary-position targets, and the convolutional neural network further comprises a conventional convolution operator;
    the inputting the panoramic image to be detected into the convolutional neural network containing the preset target-deformation-adaptive convolution operator, to obtain heat maps, target classes and target boundary position point data corresponding to initial detection targets in the panoramic image to be detected comprises:
    extracting panoramic image features of the panoramic image to be detected through the conventional convolution operator; and
    determining the non-boundary-position targets in the panoramic image to be detected based on the panoramic image features, to obtain the heat maps, target classes and target boundary position point data corresponding to the non-boundary-position targets.
  4. The method according to claim 2, wherein the initial detection targets comprise boundary-position targets;
    the inputting the panoramic image to be detected into the convolutional neural network containing the preset target-deformation-adaptive convolution operator, to obtain heat maps, target classes and target boundary position point data corresponding to initial detection targets in the panoramic image to be detected comprises:
    extracting panoramic image features of the panoramic image to be detected through the preset target-deformation-adaptive convolution operator;
    determining, based on the panoramic image features, boundary-position targets located at boundary positions in the panoramic image to be detected, to obtain the initial detection targets;
    identifying, based on the panoramic image features, a target attribute between a first detection target and a second detection target, the first detection target and the second detection target being initial detection targets located at opposite positions in the panoramic image;
    when the target attribute indicates that the first detection target and the second detection target are the same detection target, obtaining heat maps, target classes and target boundary position point data corresponding to the first detection target and the second detection target; and
    obtaining, according to the heat maps, target classes and target boundary position point data corresponding to the first detection target and the second detection target, the heat map, target class and target boundary position point data corresponding to the boundary-position target in the panoramic image to be detected.
  5. The method according to claim 1, wherein before the performing convolution processing on the panoramic image to be detected through a convolutional neural network containing a preset target-deformation-adaptive convolution operator, to obtain a target class and target boundary position point data of a detection target in the panoramic image to be detected, the method further comprises:
    acquiring historical panoramic images annotated with target classes and target boundary position points, the target boundary position point data comprising target boundary position points whose coordinates exceed the boundary of the panoramic image to be detected;
    constructing model training data groups from the historical panoramic images; and
    training, with the model training data groups, an initial convolutional neural network containing the preset target-deformation-adaptive convolution operator, to obtain the convolutional neural network containing the preset target-deformation-adaptive convolution operator.
  6. The method according to claim 1, wherein the obtaining, according to the target class and the target boundary position point data of the detection target, a target detection result corresponding to the panoramic image to be detected comprises:
    generating, according to the target class and the target boundary position point data of the detection target, a data group corresponding to the detection target; and
    filling the data group corresponding to each detection target into a preset target detection result list, to generate the target detection result corresponding to the panoramic image to be detected.
  7. An apparatus for target detection in panoramic images, the apparatus comprising:
    a data acquisition module, configured to acquire a panoramic image to be detected;
    a convolution processing module, configured to perform convolution processing on the panoramic image to be detected through a convolutional neural network containing a preset target-deformation-adaptive convolution operator, to obtain a target class and target boundary position point data of a detection target in the panoramic image to be detected, the target boundary position point data comprising target boundary position points whose coordinates exceed a boundary of the panoramic image to be detected; and
    a target detection module, configured to obtain, according to the target class and the target boundary position point data of the detection target, a target detection result corresponding to the panoramic image to be detected.
  8. The apparatus according to claim 7, wherein the convolution processing module is specifically configured to: input the panoramic image to be detected into the convolutional neural network containing the preset target-deformation-adaptive convolution operator, to obtain heat maps, target classes and target boundary position point data corresponding to initial detection targets in the panoramic image to be detected; filter the initial detection targets according to the heat maps to obtain detection targets; and obtain the target classes and the target boundary position point data corresponding to the detection targets.
  9. A computer device, comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 6.
  10. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 6.
PCT/CN2022/125242 2021-10-22 2022-10-14 Target detection method and apparatus for panoramic image, computer device and storage medium WO2023066142A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111233006.X 2021-10-22
CN202111233006.XA CN114005052A (zh) 2021-10-22 2021-10-22 Target detection method and apparatus for panoramic image, computer device and storage medium

Publications (1)

Publication Number Publication Date
WO2023066142A1 true WO2023066142A1 (zh) 2023-04-27

Family

ID=79923854

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/125242 WO2023066142A1 (zh) 2021-10-22 2022-10-14 Target detection method and apparatus for panoramic image, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN114005052A (zh)
WO (1) WO2023066142A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114005052A (zh) 2021-10-22 2022-02-01 影石创新科技股份有限公司 Target detection method and apparatus for panoramic image, computer device and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170116498A1 (en) * 2013-12-04 2017-04-27 J Tech Solutions, Inc. Computer device and method executed by the computer device
CN107844750A * 2017-10-19 2018-03-27 华中科技大学 Target detection and recognition method for water-surface panoramic images
CN110163271A * 2019-05-13 2019-08-23 武汉大学 Panoramic image target detection method based on spherical projection grid and spherical convolution
CN110826391A * 2019-09-10 2020-02-21 中国三峡建设管理有限公司 Bleeding-area detection method, system, computer device and storage medium
CN111091117A * 2019-12-31 2020-05-01 北京城市网邻信息技术有限公司 Target detection method, apparatus, device and medium for two-dimensional panoramic images
CN111402228A * 2020-03-13 2020-07-10 腾讯科技(深圳)有限公司 Image detection method, apparatus and computer-readable storage medium
CN112784810A * 2021-02-08 2021-05-11 风变科技(深圳)有限公司 Gesture recognition method, apparatus, computer device and storage medium
CN114005052A * 2021-10-22 2022-02-01 影石创新科技股份有限公司 Target detection method and apparatus for panoramic image, computer device and storage medium


Also Published As

Publication number Publication date
CN114005052A (zh) 2022-02-01


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22882751

Country of ref document: EP

Kind code of ref document: A1