CN111429463A - Instance segmentation method, instance segmentation device, electronic device and storage medium - Google Patents

Instance segmentation method, instance segmentation device, electronic device and storage medium

Info

Publication number
CN111429463A
Authority
CN
China
Prior art keywords
features
extracted
size
feature map
global
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010142850.0A
Other languages
Chinese (zh)
Inventor
王钰晴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd filed Critical Beijing Sankuai Online Technology Co Ltd
Priority to CN202010142850.0A
Publication of CN111429463A
Legal status: Pending

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G06T 7/13 Edge detection
    • G06T 7/60 Analysis of geometric attributes
    • G06T 7/66 Analysis of geometric attributes of image moments or centre of gravity
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V 10/462 Salient features, e.g. scale invariant feature transforms [SIFT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an instance segmentation method, an instance segmentation device, an electronic device and a storage medium. The method comprises the following steps: generating a heatmap feature map, a size feature map, a shape feature map and a global saliency feature map of an image based on the backbone network of a one-stage network; determining the center point of an instance according to the heatmap feature map; extracting the size feature corresponding to the center point and the shape feature corresponding to the center point from the size feature map and the shape feature map respectively; extracting a global saliency feature from the global saliency feature map according to the center point and the extracted size feature; and generating an instance segmentation result according to the extracted size feature, the extracted shape feature and the extracted global saliency feature. Because the instance position is determined by center point prediction, computation on irrelevant points is avoided and speed is improved; moreover, fine segmentation can be performed at the pixel level, giving higher precision.

Description

Instance segmentation method, instance segmentation device, electronic device and storage medium
Technical Field
The present application relates to the field of computer vision, and in particular, to an instance segmentation method, apparatus, electronic device, and storage medium.
Background
When building high-precision maps, images must be recognized, for example to distinguish different traffic signs. Likewise, in an autonomous driving scenario, nearby pedestrians and vehicles on the road need to be distinguished.
Currently, computer vision techniques are commonly employed to accomplish such tasks. Semantic segmentation alone cannot accomplish them, because different individuals must be distinguished, which requires instance segmentation. The prior art typically adopts two-stage neural networks for instance segmentation, but these suffer from high computational cost and low speed.
Disclosure of Invention
In view of the above, the present application has been developed to provide an instance segmentation method, apparatus, electronic device and storage medium for computer vision tasks that overcome, or at least partially solve, the above problems.
According to one aspect of the present application, there is provided an instance segmentation method in a computer vision task, comprising: generating a heatmap feature map, a size feature map, a shape feature map and a global saliency feature map of an image based on the backbone network of a one-stage network; determining the center point of an instance according to the heatmap feature map; extracting the size feature corresponding to the center point and the shape feature corresponding to the center point from the size feature map and the shape feature map respectively; extracting a global saliency feature from the global saliency feature map according to the center point and the extracted size feature; and generating an instance segmentation result according to the extracted size feature, the extracted shape feature and the extracted global saliency feature.
Optionally, the determining the center point of the instance according to the heatmap feature map comprises: extracting a local maximum response point in the heatmap feature map, and determining the center point position of the instance according to the position of the local maximum response point.
Optionally, the determining the center point position of the instance according to the position of the local maximum response point comprises: generating an offset feature map of the image based on the backbone network of the one-stage network; extracting a corresponding offset feature from the offset feature map according to the position of the local maximum response point; and determining the center point position of the instance according to the position of the local maximum response point and the offset feature.
Optionally, the extracting a global saliency feature from the global saliency feature map according to the center point position and the extracted size feature comprises: determining the region occupied by the instance according to the center point position and the extracted size feature; and cropping the global saliency feature corresponding to the occupied region from the global saliency feature map.
Optionally, the generating an instance segmentation result according to the extracted size feature, the extracted shape feature and the extracted global saliency feature comprises: determining the contour feature of the instance according to the extracted size feature and the extracted shape feature; and generating a mask of the instance according to the contour feature and the extracted global saliency feature.
Optionally, the backbone network is a convolutional neural network.
Optionally, each feature map has the same width and height.
According to another aspect of the present application, there is provided an instance segmentation apparatus in a computer vision task, comprising: a feature map generating unit, configured to generate a heatmap feature map, a size feature map, a shape feature map and a global saliency feature map of an image based on the backbone network of a one-stage network; a center point determining unit, configured to determine the center point of an instance according to the heatmap feature map; a feature extraction unit, configured to extract the size feature corresponding to the center point and the shape feature corresponding to the center point from the size feature map and the shape feature map respectively, and to extract a global saliency feature from the global saliency feature map according to the center point and the extracted size feature; and an instance segmentation unit, configured to generate an instance segmentation result according to the extracted size feature, the extracted shape feature and the extracted global saliency feature.
Optionally, the center point determining unit is configured to extract a local maximum response point in the heatmap feature map, and determine the center point position of the instance according to the position of the local maximum response point.
Optionally, the feature map generating unit is configured to generate an offset feature map of the image based on the backbone network of the one-stage network; the feature extraction unit is configured to extract a corresponding offset feature from the offset feature map according to the position of the local maximum response point, and to determine the center point position of the instance according to the position of the local maximum response point and the offset feature.
Optionally, the feature extraction unit is configured to determine the region occupied by the instance according to the center point position and the extracted size feature, and to crop the global saliency feature corresponding to the occupied region from the global saliency feature map.
Optionally, the instance segmentation unit is configured to determine the contour feature of the instance according to the extracted size feature and the extracted shape feature, and to generate a mask of the instance according to the contour feature and the extracted global saliency feature.
Optionally, the backbone network is a convolutional neural network.
Optionally, each feature map has the same width and height.
In accordance with yet another aspect of the present application, there is provided an electronic device comprising: a processor; and a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform any of the methods described above.
According to a further aspect of the present application, there is provided a computer-readable storage medium storing one or more programs which, when executed by a processor, implement any of the methods described above.
As can be seen from the above, the present application proposes a one-stage instance segmentation scheme: a heatmap feature map, a size feature map, a shape feature map and a global saliency feature map of an image are generated based on the backbone network of a one-stage network; the center point of an instance is determined according to the heatmap feature map; the size feature and the shape feature corresponding to the center point are extracted from the size feature map and the shape feature map respectively; a global saliency feature is extracted from the global saliency feature map according to the center point and the extracted size feature; and an instance segmentation result is generated according to the extracted size feature, the extracted shape feature and the extracted global saliency feature. Because instance positions are determined by center point prediction, computation on irrelevant points is avoided and speed is improved; moreover, fine segmentation can be performed at the pixel level, giving higher precision.
The foregoing is merely an overview of the technical solutions of the present application. To make the technical means of the present application clearer, so that it can be implemented according to this specification, and to make the above and other objects, features and advantages of the present application more readily understandable, detailed embodiments of the present application are set forth below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 shows a flow diagram of an instance segmentation method in a computer vision task according to an embodiment of the present application;
FIG. 2 shows a flow diagram of an instance segmentation method according to one embodiment of the present application;
FIG. 3 shows a block diagram of an instance segmentation apparatus in a computer vision task according to one embodiment of the present application;
FIG. 4 shows a schematic structural diagram of an electronic device according to an embodiment of the present application;
FIG. 5 shows a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present application.
Detailed Description
Mask R-CNN (mask region-based convolutional neural network) is a representative two-stage instance segmentation network. It must first predict regions of interest (ROI boxes), generating a large number of meaningless boxes before segmenting, so its computational cost is high and its speed is low.
TensorMask is a one-stage instance segmentation scheme that, while not requiring boxes to be predicted in advance, is time-consuming because a mask must be predicted at every point.
PolarMask is also a one-stage instance segmentation scheme. It likewise predicts masks directly without relying on pre-extracted ROIs, but it models a mask as a polygon formed by rays emitted from a center point, and therefore cannot describe object contours finely.
In view of this, the technical solution of the present application performs instance segmentation with a one-stage network, splitting the task into two parts: one part predicts coarse contour features based on the center point to distinguish different instances; the other achieves fine pixel-level segmentation through global saliency.
Exemplary embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
FIG. 1 shows a flow diagram of an instance segmentation method in a computer vision task according to one embodiment of the present application. As shown in FIG. 1, the method includes:
Step S110: based on the backbone network of a one-stage network, generate a heatmap feature map, a size feature map, a shape feature map and a global saliency feature map of the image.
Specifically, the backbone network may be a convolutional neural network whose output feeds multiple prediction branches.
The number of channels of the heatmap feature map (heatmap) may equal the number of instance categories. The technical solution of the present application can be applied to various scenarios that rely on computer vision tasks, including but not limited to traffic sign recognition for high-precision map building, obstacle detection for autonomous driving, and object detection for surveillance, and further to business scenarios such as express logistics and food delivery. The scheme can therefore be used online or offline.
Correspondingly, the instances to be segmented depend on the scenario; for example, in an autonomous driving scenario, pedestrians and vehicles need to be segmented as instances, and even vehicles of the same type need to be segmented into separate instances.
The size feature characterizes the size of an instance, usually expressed as width and height, so the size feature map may have 2 channels; the shape feature characterizes the shape of an instance, and the number of channels of the shape feature map may be related to the instance area. The global saliency feature map may be a single-channel grayscale image whose gray values represent the global saliency of the corresponding pixels in the original image.
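To make the multi-branch structure concrete, below is a minimal PyTorch-style sketch of such a prediction head. The class name, layer choices and the `shape_channels` default are illustrative assumptions rather than the patented architecture; the offset branch shown here belongs to an embodiment introduced later in the description.

```python
import torch
import torch.nn as nn

class OneStageSegHead(nn.Module):
    """Illustrative multi-branch head on top of a shared backbone feature map."""

    def __init__(self, in_channels: int, num_classes: int, shape_channels: int = 64):
        super().__init__()
        def branch(out_channels: int) -> nn.Sequential:
            return nn.Sequential(
                nn.Conv2d(in_channels, in_channels, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_channels, out_channels, 1),
            )
        self.heatmap = branch(num_classes)   # one channel per instance category
        self.size = branch(2)                # (height, width) at each location
        self.shape = branch(shape_channels)  # shape descriptor, tied to area (assumed width)
        self.saliency = branch(1)            # single-channel global saliency
        self.offset = branch(2)              # (dy, dx) center refinement (see later embodiment)

    def forward(self, feats: torch.Tensor) -> dict:
        # All outputs keep the spatial resolution of the backbone features,
        # so positions correspond one-to-one across the five maps.
        return {
            "heatmap": torch.sigmoid(self.heatmap(feats)),
            "size": self.size(feats),
            "shape": self.shape(feats),
            "saliency": torch.sigmoid(self.saliency(feats)),
            "offset": self.offset(feats),
        }
```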
Step S120: determine the center point of an instance according to the heatmap feature map. Generally, one instance corresponds to one center point. Compared with pre-selecting multiple ROIs, this greatly reduces the involvement of irrelevant regions and thus the computation spent on candidates.
The feature maps correspond spatially to the original image, so determining the center point amounts to determining the position of the corresponding point in each feature map and in the original image.
Step S130: extract the size feature corresponding to the center point and the shape feature corresponding to the center point from the size feature map and the shape feature map, respectively.
Step S140: extract the global saliency feature from the global saliency feature map according to the center point and the extracted size feature.
Step S150: generate an instance segmentation result according to the extracted size feature, the extracted shape feature and the extracted global saliency feature.
Therefore, the method shown in FIG. 1 determines instance positions by center point prediction, omitting computation on irrelevant points and improving speed; and unlike PolarMask, which models instance masks in a polar coordinate system, it performs fine pixel-level segmentation by extracting richer features, achieving higher precision.
In one embodiment of the present application, determining the center point of an instance according to the heatmap feature map comprises: extracting a local maximum response point in the heatmap feature map, and determining the center point position of the instance according to the position of the local maximum response point.
In a heatmap, each region of high response generally corresponds to one instance; however, the heatmap is not used to directly obtain the region occupied by an instance, but only a single point corresponding to the instance, and the occupied region is then determined with the help of other features.
Therefore, the present application computes local responses and determines the center point position of an instance by extracting local maximum response points.
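One common way to extract such local maxima, used by CenterNet-style center point detectors, is a max-pooling comparison. The sketch below assumes that convention; the neighborhood size, `k` and `threshold` values are illustrative.

```python
import torch
import torch.nn.functional as F

def extract_center_points(heatmap: torch.Tensor, k: int = 100, threshold: float = 0.3):
    """Return class, (y, x) and score of local maximum response points.

    `heatmap` has shape (num_classes, H, W); a point is a local maximum if it
    equals the max of its 3x3 neighborhood and its response exceeds `threshold`.
    """
    c, h, w = heatmap.shape
    pooled = F.max_pool2d(heatmap.unsqueeze(0), 3, stride=1, padding=1).squeeze(0)
    peaks = heatmap * (heatmap == pooled).float()   # zero out non-maxima
    k = min(k, peaks.numel())
    scores, flat_idx = peaks.reshape(-1).topk(k)    # strongest candidate centers
    keep = scores > threshold
    flat_idx = flat_idx[keep]
    cls = flat_idx // (h * w)                       # category channel index
    ys = (flat_idx % (h * w)) // w
    xs = flat_idx % w
    return cls, ys, xs, scores[keep]
```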
Taking the local maximum response point directly as the center point of the instance is feasible but not highly precise. Therefore, in one embodiment of the present application, determining the center point position of the instance according to the position of the local maximum response point comprises: generating an offset feature map of the image based on the backbone network of the one-stage network; extracting the corresponding offset feature from the offset feature map according to the position of the local maximum response point; and determining the center point position of the instance according to the position of the local maximum response point and the offset feature.
The offset feature, which may specifically be a position offset, works together with the heatmap feature to further improve the accuracy of the center point.
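A minimal sketch of this refinement step, assuming the offset map stores per-location (dy, dx) values in its two channels (the channel layout is an assumption):

```python
import torch

def refine_center(ys: torch.Tensor, xs: torch.Tensor, offset_map: torch.Tensor):
    """Shift integer peak positions by the predicted offsets.

    `offset_map` has shape (2, H, W): channel 0 is assumed to hold dy and
    channel 1 dx. Returns sub-pixel center coordinates.
    """
    dy = offset_map[0, ys, xs]
    dx = offset_map[1, ys, xs]
    return ys.float() + dy, xs.float() + dx
```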
In one embodiment of the present application, extracting the global saliency feature from the global saliency feature map according to the center point position and the extracted size feature comprises: determining the region occupied by the instance according to the center point position and the extracted size feature; and cropping the global saliency feature corresponding to the occupied region from the global saliency feature map.
For example, if the center point coordinate is (30, 50) and the size feature is 16 × 16 (height × width), the occupied region is the rectangle with vertices (22, 42), (22, 58), (38, 42) and (38, 58).
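A sketch of this cropping step under the same assumptions; the rounding and boundary-clamping conventions are illustrative choices, not specified by the source.

```python
import torch

def crop_saliency(saliency: torch.Tensor, cy: float, cx: float,
                  height: float, width: float):
    """Crop the (H, W) global saliency map around a center point.

    The rectangle is centered at (cy, cx) with the predicted height and width,
    clamped to the map boundaries. Returns the crop and its top-left origin.
    """
    h, w = saliency.shape
    y0 = max(int(round(cy - height / 2)), 0)
    y1 = min(int(round(cy + height / 2)), h)
    x0 = max(int(round(cx - width / 2)), 0)
    x1 = min(int(round(cx + width / 2)), w)
    return saliency[y0:y1, x0:x1], (y0, x0)
```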
In the embodiments of the present application, if the feature maps had different sizes, extracting features from one feature map according to a point in another would require coordinate transformations, which is inconvenient.
With feature maps of equal size, once the center point is determined it is determined in every feature map at once, and the corresponding features can be extracted conveniently.
In one embodiment of the present application, generating the instance segmentation result according to the extracted size feature, the extracted shape feature and the extracted global saliency feature comprises: determining the contour feature of the instance according to the extracted size feature and the extracted shape feature; and generating the mask of the instance according to the contour feature and the extracted global saliency feature.
The shape feature describes the shape of the object at the point, and the size feature describes the size of that object, so the two combined determine the contour feature of the instance. The contour feature can then be multiplied by the extracted global saliency feature to generate the mask of the instance.
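The following sketch shows one plausible reading of this combination: the shape vector extracted at the center point is reshaped into a coarse contour grid, resized to the predicted footprint, and multiplied element-wise with the cropped saliency. How the shape channels encode the contour is an assumption for illustration.

```python
import torch
import torch.nn.functional as F

def build_instance_mask(shape_feat: torch.Tensor, saliency_crop: torch.Tensor,
                        threshold: float = 0.5) -> torch.Tensor:
    """Combine the contour cue and the saliency cue into a binary mask.

    Assumes the shape channel count is a perfect square (e.g. 64 -> 8x8 grid).
    """
    side = int(len(shape_feat) ** 0.5)
    contour = shape_feat.reshape(1, 1, side, side)
    contour = F.interpolate(contour, size=tuple(saliency_crop.shape),  # scale to footprint
                            mode="bilinear", align_corners=False)[0, 0]
    mask = torch.sigmoid(contour) * saliency_crop   # contour x global saliency
    return (mask > threshold).float()
```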
FIG. 2 shows a flow diagram of an instance segmentation method according to one embodiment of the present application. As shown in FIG. 2, after the image is input into the backbone network, five branches output the global saliency (Saliency) feature map, the shape (Shape) feature map, the size (Size) feature map, the heatmap (Heatmap) feature map and the offset (Offset) feature map. The feature maps have the same height and width, so positions correspond one-to-one, which is convenient for computation.
Then, local maximum responses are first extracted on the heatmap feature map to obtain the corresponding position points, and the center point of each instance is determined by combining each position point with the value at the corresponding position of the offset feature map, to support subsequent feature extraction.
Next, the shape feature at the position corresponding to the center point is extracted from the shape feature map as the shape of the object (i.e., the instance to be segmented) at that point, and the size feature at the corresponding position in the size feature map is extracted as the size of the object at that point. Combining the two yields the contour feature.
Then, the global saliency feature map is cropped using the position corresponding to the center point and the determined instance size, yielding the global saliency feature.
Finally, the contour feature of the center point is multiplied by the global saliency feature to obtain the final predicted instance segmentation mask. 'person' and 'bench' in the figure denote instance categories.
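Stitching the sketched helpers together, a hypothetical end-to-end inference pass over the FIG. 2 pipeline might look as follows; again this is an illustration under the assumptions stated above, not the patented implementation.

```python
import torch

def segment_instances(backbone_feats: torch.Tensor, head: "OneStageSegHead", k: int = 100):
    """Run the FIG. 2 pipeline: centers -> per-center features -> masks.

    `backbone_feats` has shape (1, C, H, W); returns (category, mask, origin)
    tuples, where origin locates each mask crop in the feature map.
    """
    out = head(backbone_feats)
    cls, ys, xs, scores = extract_center_points(out["heatmap"][0], k=k)
    cy, cx = refine_center(ys, xs, out["offset"][0])
    results = []
    for i in range(len(cls)):
        h_i = max(float(out["size"][0, 0, ys[i], xs[i]]), 1.0)  # predicted height
        w_i = max(float(out["size"][0, 1, ys[i], xs[i]]), 1.0)  # predicted width
        crop, origin = crop_saliency(out["saliency"][0, 0],
                                     float(cy[i]), float(cx[i]), h_i, w_i)
        if crop.numel() == 0:                 # center fell outside the map
            continue
        shape_feat = out["shape"][0, :, ys[i], xs[i]]  # shape vector at center
        results.append((int(cls[i]), build_instance_mask(shape_feat, crop), origin))
    return results
```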
FIG. 3 shows a schematic diagram of an instance segmentation apparatus in a computer vision task according to an embodiment of the present application. As shown in FIG. 3, the instance segmentation apparatus 300 in a computer vision task includes:
A feature map generating unit 310, configured to generate a heatmap feature map, a size feature map, a shape feature map and a global saliency feature map of the image based on the backbone network of a one-stage network.
Specifically, the backbone network may be a convolutional neural network whose output feeds multiple prediction branches.
The number of channels of the heatmap feature map (heatmap) may equal the number of instance categories. The technical solution of the present application can be applied to various scenarios that rely on computer vision tasks, including but not limited to traffic sign recognition for high-precision map building, obstacle detection for autonomous driving, and object detection for surveillance, and further to business scenarios such as express logistics and food delivery. The scheme can therefore be used online or offline.
Correspondingly, the instances to be segmented depend on the scenario; for example, in an autonomous driving scenario, pedestrians and vehicles need to be segmented as instances, and even vehicles of the same type need to be segmented into separate instances.
The size feature characterizes the size of an instance, usually expressed as width and height, so the size feature map may have 2 channels; the shape feature characterizes the shape of an instance, and the number of channels of the shape feature map may be related to the instance area. The global saliency feature map may be a single-channel grayscale image whose gray values represent the global saliency of the corresponding pixels in the original image.
A center point determining unit 320, configured to determine the center point of an instance according to the heatmap feature map.
Generally, one instance corresponds to one center point. Compared with pre-selecting multiple ROIs, this greatly reduces the involvement of irrelevant regions and thus the computation spent on candidates.
The feature maps correspond spatially to the original image, so determining the center point amounts to determining the position of the corresponding point in each feature map and in the original image.
A feature extraction unit 330, configured to extract the size feature corresponding to the center point and the shape feature corresponding to the center point from the size feature map and the shape feature map, respectively, and to extract the global saliency feature from the global saliency feature map according to the center point and the extracted size feature.
An instance segmentation unit 340, configured to generate an instance segmentation result according to the extracted size feature, the extracted shape feature and the extracted global saliency feature.
Therefore, the apparatus shown in FIG. 3 determines instance positions by center point prediction, omitting computation on irrelevant points and improving speed; and unlike PolarMask, which models instance masks in a polar coordinate system, it performs fine pixel-level segmentation by extracting richer features, achieving higher precision.
In one embodiment of the present application, the center point determining unit 320 is configured to extract a local maximum response point in the heatmap feature map, and determine the center point position of the instance according to the position of the local maximum response point.
In a heatmap, each region of high response generally corresponds to one instance; however, the heatmap is not used to directly obtain the region occupied by an instance, but only a single point corresponding to the instance, and the occupied region is then determined with the help of other features.
Therefore, the apparatus computes local responses and determines the center point position of an instance by extracting local maximum response points.
Taking the local maximum response point directly as the instance center point is feasible but not highly precise. Therefore, in one embodiment of the present application, the center point determining unit 320 is configured to generate an offset feature map of the image based on the backbone network of the one-stage network; extract the corresponding offset feature from the offset feature map according to the position of the local maximum response point; and determine the center point position of the instance according to the position of the local maximum response point and the offset feature.
The offset feature, which may specifically be a position offset, works together with the heatmap feature to further improve the accuracy of the center point.
In one embodiment of the present application, the feature extraction unit 330 is configured to determine the region occupied by the instance according to the center point position and the extracted size feature, and to crop the global saliency feature corresponding to the occupied region from the global saliency feature map.
For example, if the center point coordinate is (30, 50) and the size feature is 16 × 16 (height × width), the occupied region is the rectangle with vertices (22, 42), (22, 58), (38, 42) and (38, 58).
In the embodiments of the present application, if the feature maps had different sizes, extracting features from one feature map according to a point in another would require coordinate transformations, which is inconvenient.
With feature maps of equal size, once the center point is determined it is determined in every feature map at once, and the corresponding features can be extracted conveniently.
In one embodiment of the present application, the instance segmentation unit 340 is configured to determine the contour feature of the instance according to the extracted size feature and the extracted shape feature, and to generate the mask of the instance according to the contour feature and the extracted global saliency feature.
The shape feature describes the shape of the object at the point, and the size feature describes the size of that object, so the two combined determine the contour feature of the instance. The contour feature can then be multiplied by the extracted global saliency feature to generate the mask of the instance.
In summary, the present application proposes a one-stage instance segmentation scheme: a heatmap feature map, a size feature map, a shape feature map and a global saliency feature map of an image are generated based on the backbone network of a one-stage network; the center point of an instance is determined according to the heatmap feature map; the size feature and the shape feature corresponding to the center point are extracted from the size feature map and the shape feature map respectively; a global saliency feature is extracted from the global saliency feature map according to the center point and the extracted size feature; and an instance segmentation result is generated according to the extracted size feature, the extracted shape feature and the extracted global saliency feature. Because instance positions are determined by center point prediction, computation on irrelevant points is avoided and speed is improved; moreover, fine segmentation can be performed at the pixel level, giving higher precision.
It should be noted that:
the algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose devices may be used with the teachings herein. The required structure for constructing such a device will be apparent from the description above. In addition, this application is not directed to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the present application as described herein, and any descriptions of specific languages are provided above to disclose the best modes of the present application.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the application may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the application, various features of the application are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed application requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this application.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features that are included in other embodiments but not others, combinations of features of different embodiments are meant to be within the scope of the application and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the present application may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of an instance segmentation apparatus in a computer vision task according to embodiments of the present application. The present application may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present application may be stored on a computer-readable medium, or may be in the form of one or more signals; such a signal may be downloaded from an internet website, provided on a carrier signal, or provided in any other form.
For example, FIG. 4 shows a schematic structural diagram of an electronic device according to an embodiment of the present application. In an autonomous driving scenario, for instance, the electronic device may be an autonomous driving device. The electronic device 400 comprises a processor 410 and a memory 420 arranged to store computer-executable instructions (computer-readable program code). The memory 420 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read-only memory), an EPROM, a hard disk or a ROM. The memory 420 has a storage space 430 storing computer-readable program code 431 for performing any of the method steps described above. For example, the storage space 430 may include respective pieces of computer-readable program code 431 for implementing the various steps of the above method. The computer-readable program code 431 can be read from or written to one or more computer program products, which comprise a program code carrier such as a hard disk, a compact disc (CD), a memory card or a floppy disk. Such a computer program product is typically a computer-readable storage medium such as that described in FIG. 5.
FIG. 5 shows a schematic diagram of a computer-readable storage medium according to an embodiment of the present application. The computer-readable storage medium 500 stores computer-readable program code 431 for performing the method steps according to the present application, which can be read by the processor 410 of the electronic device 400. When executed by the electronic device 400, the computer-readable program code 431 causes the electronic device 400 to perform the steps of the method described above; in particular, the stored computer-readable program code 431 can perform the method shown in any of the above embodiments. The computer-readable program code 431 may be compressed in a suitable form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the application, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The application may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The words first, second, third, etc. do not indicate any ordering; they may be interpreted as names.

Claims (10)

1. An instance segmentation method in a computer vision task, comprising:
generating a heatmap feature map, a size feature map, a shape feature map and a global saliency feature map of an image based on the backbone network of a one-stage network;
determining the center point of an instance according to the heatmap feature map;
extracting the size feature corresponding to the center point and the shape feature corresponding to the center point from the size feature map and the shape feature map respectively;
extracting a global saliency feature from the global saliency feature map according to the center point and the extracted size feature;
and generating an instance segmentation result according to the extracted size feature, the extracted shape feature and the extracted global saliency feature.
2. The method of claim 1, wherein said determining the center point of an instance according to said heatmap feature map comprises:
extracting a local maximum response point in the heatmap feature map, and determining the center point position of the instance according to the position of the local maximum response point.
3. The method of claim 2, wherein determining the center point position of the instance according to the position of the local maximum response point comprises:
generating an offset feature map of the image based on the backbone network of the one-stage network;
extracting a corresponding offset feature from the offset feature map according to the position of the local maximum response point;
and determining the center point position of the instance according to the position of the local maximum response point and the offset feature.
4. The method of claim 1, wherein said extracting a global saliency feature from said global saliency feature map according to said center point position and said extracted size feature comprises:
determining the region occupied by the instance according to the center point position and the extracted size feature;
and cropping the global saliency feature corresponding to the occupied region from the global saliency feature map.
5. The method of claim 1, wherein generating an instance segmentation result according to the extracted size feature, the extracted shape feature and the extracted global saliency feature comprises:
determining the contour feature of the instance according to the extracted size feature and the extracted shape feature;
and generating a mask of the instance according to the contour feature and the extracted global saliency feature.
6. The method of any one of claims 1-5, wherein the backbone network is a convolutional neural network.
7. The method of any of claims 1-5, wherein each feature map has the same width and height.
8. An instance segmentation apparatus in a computer vision task, comprising:
a feature map generating unit, configured to generate a heatmap feature map, a size feature map, a shape feature map and a global saliency feature map of an image based on the backbone network of a one-stage network;
a center point determining unit, configured to determine the center point of an instance according to the heatmap feature map;
a feature extraction unit, configured to extract the size feature corresponding to the center point and the shape feature corresponding to the center point from the size feature map and the shape feature map respectively, and to extract a global saliency feature from the global saliency feature map according to the center point and the extracted size feature;
and an instance segmentation unit, configured to generate an instance segmentation result according to the extracted size feature, the extracted shape feature and the extracted global saliency feature.
9. An electronic device, wherein the electronic device comprises: a processor; and a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform the method of any one of claims 1-7.
10. A computer readable storage medium, wherein the computer readable storage medium stores one or more programs which, when executed by a processor, implement the method of any of claims 1-7.
CN202010142850.0A 2020-03-04 2020-03-04 Instance segmentation method, instance segmentation device, electronic device and storage medium Pending CN111429463A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010142850.0A CN111429463A (en) 2020-03-04 2020-03-04 Instance segmentation method, instance segmentation device, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010142850.0A CN111429463A (en) 2020-03-04 2020-03-04 Instance segmentation method, instance segmentation device, electronic device and storage medium

Publications (1)

Publication Number Publication Date
CN111429463A (en) 2020-07-17

Family

ID=71547518

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010142850.0A Pending CN111429463A (en) 2020-03-04 2020-03-04 Instance splitting method, instance splitting device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111429463A (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103679173A (en) * 2013-12-04 2014-03-26 清华大学深圳研究生院 Method for detecting image salient region
EP3101594A1 (en) * 2015-06-04 2016-12-07 Omron Corporation Saliency information acquisition device and saliency information acquisition method
US20190063998A1 (en) * 2017-08-23 2019-02-28 Wistron Corp. Image processing device and method
CN109741293A (en) * 2018-11-20 2019-05-10 武汉科技大学 Conspicuousness detection method and device
CN110532955A (en) * 2019-08-30 2019-12-03 中国科学院宁波材料技术与工程研究所 Example dividing method and device based on feature attention and son up-sampling
CN110675407A (en) * 2019-09-17 2020-01-10 北京达佳互联信息技术有限公司 Image instance segmentation method and device, electronic equipment and storage medium
CN110751157A (en) * 2019-10-18 2020-02-04 厦门美图之家科技有限公司 Image saliency segmentation and image saliency model training method and device
CN110751641A (en) * 2019-10-18 2020-02-04 山东贝特建筑项目管理咨询有限公司 Anchor bolt information detection method and storage medium
CN110838125A (en) * 2019-11-08 2020-02-25 腾讯医疗健康(深圳)有限公司 Target detection method, device, equipment and storage medium of medical image

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112053439A (en) * 2020-09-28 2020-12-08 腾讯科技(深圳)有限公司 Method, device and equipment for determining instance attribute information in image and storage medium
CN112053439B (en) * 2020-09-28 2022-11-25 腾讯科技(深圳)有限公司 Method, device and equipment for determining instance attribute information in image and storage medium
WO2022211409A1 (en) * 2021-03-31 2022-10-06 현대자동차주식회사 Method and device for coding machine vision data by using reduction of feature map
DE102023005045A1 (en) 2022-12-16 2024-06-27 Mercedes-Benz Group AG System and method for recording occupancy within a vehicle
CN116681892A (en) * 2023-06-02 2023-09-01 山东省人工智能研究院 Image precise segmentation method based on multi-center polar mask model improvement
CN116681892B (en) * 2023-06-02 2024-01-26 山东省人工智能研究院 Image precise segmentation method based on multi-center polar mask model improvement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20200717)