WO2020029708A1 - Image processing method and apparatus, electronic device, storage medium, and program product - Google Patents

Image processing method and apparatus, electronic device, storage medium, and program product

Publication number
WO2020029708A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
vector
feature map
weight
map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2019/093646
Other languages
English (en)
French (fr)
Chinese (zh)
Inventor
赵恒爽
张熠
石建萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Sensetime Technology Co Ltd
Original Assignee
Shenzhen Sensetime Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Sensetime Technology Co Ltd
Priority to JP2020554362A (patent JP7065199B2)
Priority to SG11202005737WA
Publication of WO2020029708A1
Priority to US16/905,478 (publication US20200356802A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/02 Control of position or course in two dimensions
    • G05D1/021 Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0231 Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454 Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30248 Vehicle exterior or interior
    • G06T2207/30252 Vehicle exterior; Vicinity of vehicle

Definitions

  • the present application relates to machine learning technology, and in particular, to an image processing method and device, an electronic device, a storage medium, and a program product.
  • a feature is a corresponding (essential) feature or characteristic that distinguishes a certain type of object from other types of objects, or a collection of these characteristics and features.
  • Features are data that can be extracted by measurement or processing. Each image has its own features that distinguish it from other types of images. Some are natural features that can be perceived intuitively, such as brightness, edges, texture, and color; others can be obtained only through transformation or processing, such as histograms and principal components.
  • An embodiment of the present application provides an image processing technology.
  • an image processing method including:
  • the feature information of the feature points corresponding to the feature weights is respectively transmitted to a plurality of other feature points included in the feature map to obtain a feature map with enhanced features.
  • an image processing apparatus including:
  • a feature extraction unit configured to perform feature extraction on an image to be processed to generate a feature map of the image
  • a weight determining unit configured to determine a feature weight corresponding to each feature point in the multiple feature points included in the feature map
  • a feature enhancement unit is configured to transmit feature information of feature points corresponding to the feature weights to a plurality of other feature points included in the feature map, respectively, to obtain a feature map after feature enhancement.
  • an electronic device including a processor, where the processor includes the image processing apparatus according to any one of the foregoing.
  • an electronic device comprising: a memory for storing executable instructions;
  • a processor configured to communicate with the memory to execute the executable instructions to complete the operations of the image processing method according to any one of the above.
  • a computer storage medium for storing computer-readable instructions, characterized in that, when the instructions are executed, the operations of the image processing method according to any one of the foregoing are performed.
  • a computer program product including computer-readable code, characterized in that, when the computer-readable code runs on a device, a processor in the device executes instructions for implementing the image processing method according to any one of the above.
  • feature extraction is performed on an image to be processed to generate a feature map of the image; a feature weight corresponding to each of a plurality of feature points included in the feature map is determined; the feature information of the feature points corresponding to the feature weights is respectively transmitted to a plurality of other feature points included in the feature map to obtain a feature-enhanced feature map. Transmitting information between the feature points allows context information to be better used, so that the feature-enhanced feature map contains more information.
  • FIG. 1 is a flowchart of an embodiment of an image processing method of the present application.
  • FIG. 2 is a schematic diagram of transmitting information between feature points in an optional example of the image processing method of the present application.
  • FIG. 3 is a schematic diagram of a network structure of another embodiment of an image processing method of the present application.
  • FIG. 4-a is a schematic diagram of obtaining a weight vector of an information collection branch in another embodiment of the image processing method of the present application.
  • FIG. 4-b is a schematic diagram of obtaining a weight vector of an information distribution branch in another embodiment of the image processing method of the present application.
  • FIG. 5 is an exemplary schematic structural diagram of network training in the image processing method of the present application.
  • FIG. 6 is another exemplary structure diagram of network training in the image processing method of the present application.
  • FIG. 7 is a schematic structural diagram of an embodiment of an image processing apparatus of the present application.
  • FIG. 8 is a schematic structural diagram of an electronic device suitable for implementing a terminal device or a server according to an embodiment of the present application.
  • Embodiments of the invention can be applied to a computer system / server, which can operate with many other general or special purpose computing system environments or configurations.
  • Examples of well-known computing systems, environments, and/or configurations suitable for use with computer systems/servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, small computer systems, mainframe computer systems, and distributed cloud computing technology environments including any of the above.
  • a computer system / server may be described in the general context of computer system executable instructions, such as program modules, executed by a computer system.
  • program modules may include routines, programs, target programs, components, logic, data structures, and so on, which perform specific tasks or implement specific abstract data types.
  • the computer system / server can be implemented in a distributed cloud computing environment. In a distributed cloud computing environment, tasks are performed by remote processing devices linked through a communication network. In a distributed cloud computing environment, program modules may be located on a local or remote computing system storage medium including a storage device.
  • FIG. 1 is a flowchart of an embodiment of an image processing method of the present application. As shown in FIG. 1, the method in this embodiment includes:
  • Step 110 Perform feature extraction on the image to be processed to generate a feature map of the image.
  • the image in this embodiment may be an image that has not been subjected to feature extraction processing, or may be a feature map obtained after one or more feature extractions, etc.
  • the present application does not limit the specific form of the image to be processed.
  • step 110 may be executed by a processor calling a corresponding instruction stored in a memory, or may be executed by a feature extraction unit 71 run by the processor.
  • Step 120 Determine a feature weight corresponding to each feature point in the multiple feature points included in the feature map.
  • the multiple feature points in this embodiment may be all or some of the feature points in the feature map. To transfer information between feature points, a transfer probability must be determined: all or part of the information of one feature point is transferred to other feature points, and the proportion transferred is determined by the feature weights.
  • FIG. 2 is a schematic diagram of transferring information between feature points in an optional example of the image processing method of the present application.
  • In FIG. 2-a, there is only one-way transmission between the feature points, which realizes the collection of information.
  • the feature point receives the feature information transmitted to it from the surrounding feature points;
  • In FIG. 2-b, there is only one-way transmission between the feature points, which realizes the distribution of information.
  • the feature information of the feature point is transmitted outward to the surrounding feature points.
  • In FIG. 2-c, two-way transmission is performed: each feature point not only transmits information outward but also receives information transmitted by the surrounding feature points, which realizes bidirectional transmission of information.
  • the feature weight includes an inward receiving weight and an outward sending weight; while the product of the outward sending weight and the feature information is transmitted to the surrounding feature points, the product of the inward receiving weight and the feature information of the surrounding feature points is transmitted to the feature point.
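  • The three transmission modes of FIG. 2 can be sketched in a few lines of plain Python. This is a minimal illustration rather than the application's implementation: the function name `transfer`, the scalar features, and the dense weight tables `recv` and `send` are all assumptions made for the example.

```python
def transfer(features, recv=None, send=None):
    """Return the information each feature point gains from the other points.

    features: one scalar feature per point (a hypothetical simplification).
    recv[i][j]: inward receiving weight with which point i collects from point j.
    send[j][i]: outward sending weight with which point j distributes to point i.
    """
    n = len(features)
    gained = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # only *other* feature points take part
            if recv is not None:  # collection, as in FIG. 2-a
                gained[i] += recv[i][j] * features[j]
            if send is not None:  # distribution, as in FIG. 2-b
                gained[i] += send[j][i] * features[j]
    return gained


features = [1.0, 2.0, 4.0]
recv = [[0.0, 0.5, 0.0],
        [0.0, 0.0, 0.25],
        [1.0, 0.0, 0.0]]
send = [[0.0, 0.0, 0.5],
        [0.5, 0.0, 0.0],
        [0.0, 0.0, 0.0]]

collected = transfer(features, recv=recv)        # one-way collection
distributed = transfer(features, send=send)      # one-way distribution
both = transfer(features, recv=recv, send=send)  # two-way, as in FIG. 2-c
```

  • Passing only `recv` or only `send` gives the two one-way modes; passing both gives the bidirectional mode, in which each point's gain is the sum of what it collects and what the others distribute to it.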
  • step 120 may be executed by a processor calling a corresponding instruction stored in a memory, or may be executed by a weight determining unit 72 run by the processor.
  • Step 130 The feature information of the feature points corresponding to the feature weights is transmitted to a plurality of other feature points included in the feature map, respectively, to obtain a feature enhanced feature map.
  • the other feature points refer to feature points other than the corresponding feature points in the feature map.
  • Each feature point has its own information transfer, which is represented by a point-by-point spatial attention mechanism (the feature weights). These information transfers can be learned by a neural network and are highly adaptive. When the information transfer between different feature points is learned, the relative positional relationship between the feature points is taken into account.
  • step 130 may be executed by the processor calling a corresponding instruction stored in the memory, or may be executed by the feature enhancement unit 73 run by the processor.
  • feature extraction is performed on an image to be processed to generate a feature map of the image; a feature weight corresponding to each feature point among a plurality of feature points included in the feature map is determined; and the feature information of the feature points corresponding to the feature weights is transmitted to a plurality of other feature points included in the feature map to obtain a feature-enhanced feature map. In this way, context information can be better used, and the feature-enhanced feature map contains more information.
  • the method in this embodiment may further include: performing scene analysis processing or object segmentation processing on the image based on the feature map after feature enhancement.
  • each feature point in the feature map can collect information about other points to help predict the current point, and can also distribute information about the current point to help predict the other points.
  • the adaptive learning adjustment is related to the positional relationship.
  • the method in this embodiment may further include: performing robot navigation control or vehicle intelligent driving control according to a result of scene analysis processing or a result of object segmentation processing.
  • the results of the scene analysis processing or object segmentation processing are more accurate and closer to the results of human visual processing, so that, when applied to robot navigation control or vehicle intelligent driving control, they can achieve results close to manual control.
  • the feature weights of feature points included in the feature map include an inward receiving weight and an outward sending weight.
  • the inward receiving weight represents the weight when the feature point receives feature information of other feature points included in the feature map
  • the outward sending weight represents the weight when the feature point transmits feature information to other feature points included in the feature map
  • the embodiment of the present application realizes two-way propagation of information between feature points through the inward receiving weights and outward sending weights, so that each feature point in the feature map can collect information from other feature points to help predict the current feature point, and can also distribute the information of the current feature point to help predict other feature points. This two-way propagation of information improves the accuracy of prediction.
  • step 120 may include:
  • a second branch process is performed on the feature map to obtain a second weight vector of outward sending weights of each of the plurality of feature points included.
  • each feature point corresponds to at least one inward receiving weight and at least one outward sending weight.
  • the feature map is processed through two branches to obtain the two weight vectors separately.
  • this improves the efficiency of two-way information transmission between feature points and enables faster information transmission.
  • performing a first branch processing on the feature map to obtain a first weight vector including an inward receiving weight of each feature point in the multiple feature points includes:
  • the invalid information in the first intermediate weight vector is removed to obtain a first weight vector.
  • the invalid information indicates information in the first intermediate weight vector that has no influence on feature transmission or the degree of influence is less than a set condition.
  • the first intermediate weight vector obtained through the processing of the neural network includes a lot of meaningless invalid information. Because such invalid information has only one transmitting end (feature point), whether or not it is transmitted has no effect on feature transmission, or an effect smaller than the set condition. Removing this invalid information yields the first weight vector, which remains comprehensive while excluding useless information, and this improves the efficiency of transmitting useful information.
  • processing the feature map through a neural network to obtain a first intermediate weight vector includes:
  • a first intermediate weight vector is obtained based on the first transfer ratio vector.
  • each feature point in the feature map is used as an input point.
  • positions around the input point are used as output points.
  • the surrounding positions include multiple feature points in the feature map and multiple adjacent positions of the first input point in the spatial position.
  • all positions around the first input point may be used as the first output points corresponding to the first input point.
  • the multiple feature points may be all or some of the feature points in the feature map. For example, the surrounding positions include all the feature points in the feature map and the 8 positions spatially adjacent to the input point, which are determined by a 3 × 3 grid centered on the input point. Where a feature point coincides with one of the 8 adjacent positions, the coincident position is used as an output point.
  • removing invalid information in the first intermediate weight vector to obtain the first weight vector includes:
  • a first weight vector is determined based on the incoming weights.
  • At least one feature point (for example, every feature point) is used as the first input point. Therefore, when there is no feature point at a position around the first input point, the first transfer ratio vector for that position is useless: zero multiplied by any value is zero, which is equivalent to transmitting no information. In this embodiment, the inward receiving weights are obtained by removing these useless first transfer ratio vectors, and the first weight vector is then determined.
  • the embodiment of the present application first learns a large intermediate weight vector and then selects from it, so that the relative position information of the feature information is taken into account.
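  • The "learn a large intermediate weight vector, then select" step can be illustrated for the 3 × 3 neighbourhood example. This is a minimal sketch, assuming one transfer ratio per relative offset; the names `OFFSETS` and `select_valid_weights` are hypothetical, and the constant placeholder ratios stand in for values that a neural network would predict.

```python
# Relative offsets (row, column) of the 8 neighbouring positions in a
# 3 x 3 grid centred on the input point.
OFFSETS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]


def select_valid_weights(intermediate, height, width):
    """Keep only transfer ratios whose output point lies inside the feature map.

    intermediate[r][c] holds 8 ratios for input point (r, c), one per entry of
    OFFSETS. Returns {input point: {output point: ratio}} without the entries
    that point outside the map, where no feature point exists.
    """
    selected = {}
    for r in range(height):
        for c in range(width):
            kept = {}
            for (dr, dc), ratio in zip(OFFSETS, intermediate[r][c]):
                rr, cc = r + dr, c + dc
                if 0 <= rr < height and 0 <= cc < width:
                    kept[(rr, cc)] = ratio
            selected[(r, c)] = kept
    return selected


H, W = 2, 3
# Placeholder ratios; in practice these would come from the network.
intermediate = [[[0.1] * len(OFFSETS) for _ in range(W)] for _ in range(H)]
weights = select_valid_weights(intermediate, H, W)
```

  • Because the ratios are indexed by relative offset before selection, the retained weights carry the relative position of each output point with respect to its input point. In this toy 2 × 3 map, a corner point keeps 3 of its 8 ratios and the middle point of the top row keeps 5.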
  • determining the first weight vector based on the inward receiving weights includes:
  • the inward receiving weights are arranged according to the positions of the corresponding first output points to obtain a first weight vector.
  • this embodiment arranges the inward receiving weights obtained for each feature point according to the positions of the corresponding first output points for use in subsequent information transmission, where the multiple first output points corresponding to a feature point are sorted according to the inward receiving weights; optionally, in the subsequent information transfer process, the information transmitted by the multiple output points to the feature point can be received in order.
  • before the feature map is processed by a neural network to obtain the first intermediate weight vector, the method further includes:
  • the feature map is processed by a neural network to obtain a first intermediate weight vector, including:
  • the first intermediate feature map after dimensionality reduction is processed by a neural network to obtain a first intermediate weight vector.
  • the feature maps can also be subjected to dimensionality reduction processing to reduce the amount of calculation by reducing the number of channels.
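  • One common way to reduce the number of channels is a 1 × 1 convolution, which applies the same linear projection to the channel vector of every feature point. The text above does not name the operation, so the sketch below is an assumption; `reduce_channels` and the projection matrix are hypothetical.

```python
def reduce_channels(feature_map, projection):
    """Project each feature point's channel vector onto fewer channels.

    feature_map[r][c] is a list of C channel values.
    projection is a C_out x C matrix (list of rows), with C_out < C.
    This is equivalent to a bias-free 1 x 1 convolution.
    """
    return [
        [
            [sum(w * x for w, x in zip(row, point)) for row in projection]
            for point in line
        ]
        for line in feature_map
    ]


# A 1 x 2 feature map with 4 channels, reduced to 2 channels.
feature_map = [[[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]]
projection = [[1.0, 0.0, 0.0, 0.0],   # keep channel 0
              [0.0, 0.5, 0.0, 0.5]]   # average channels 1 and 3
reduced = reduce_channels(feature_map, projection)
```

  • The spatial size of the map is unchanged; only the per-point channel count drops, which shrinks the cost of every later per-point operation.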
  • processing the first intermediate feature map after dimensionality reduction through a neural network to obtain a first intermediate weight vector includes:
  • a first intermediate weight vector is obtained based on the first transfer ratio vector.
  • each first intermediate feature point in the dimensionality-reduced first intermediate feature map is used as an input point, and all positions around the input point are used as output points, where the surrounding positions include multiple feature points in the first intermediate feature map and multiple positions spatially adjacent to the first input point. The multiple feature points may be all or some of the first intermediate feature points in the first intermediate feature map. For example, the surrounding positions include all the first intermediate feature points in the first intermediate feature map and the 8 positions spatially adjacent to the input point, which are determined by a 3 × 3 grid centered on the input point. Where a feature point coincides with one of the 8 adjacent positions, the coincident position is used as an output point.
  • performing a second branch processing on the feature map to obtain a second weight vector of an outward sending weight of each of the multiple feature points included includes:
  • the invalid information indicates information in the second intermediate weight vector that has no influence on the feature transmission or the degree of influence is less than a set condition.
  • the second intermediate weight vector obtained through the processing of the neural network includes a lot of meaningless invalid information. Because such invalid information has only one transmitting end (feature point), whether or not it is transmitted has no effect on feature transmission, or an effect smaller than the set condition. Removing this invalid information yields the second weight vector, which remains comprehensive while excluding useless information, and this improves the efficiency of information transmission.
  • processing the feature map through a neural network to obtain a second intermediate weight vector includes:
  • Each feature point in the feature map is used as a second output point, and the positions around the second output point are used as second input points corresponding to the second output point;
  • a second intermediate weight vector is obtained based on the second transfer ratio vector.
  • each feature point in the feature map is used as an output point.
  • the positions around the output point are used as input points.
  • the surrounding positions include multiple feature points in the feature map and multiple adjacent positions of the second output point in the spatial position.
  • all positions around the second output point may be used as the second input points corresponding to the second output point.
  • the multiple feature points may be all or some of the feature points in the feature map. For example, the surrounding positions include all the feature points in the feature map and the 8 positions spatially adjacent to the output point, which are determined by a 3 × 3 grid centered on the output point. Where a feature point coincides with one of the 8 adjacent positions, the coincident position is used as an input point.
  • information is transmitted according to the transfer ratio; according to this embodiment, the transfer ratio of the information transmitted between every two feature points can be obtained.
  • removing invalid information in the second intermediate weight vector to obtain the second weight vector includes:
  • a second weight vector is determined based on the outgoing weights.
  • At least one feature point (for example, every feature point) is used as the second output point. Therefore, when there is no feature point at a position around the second output point, the second transfer ratio vector for that position is useless: zero multiplied by any value is zero, which is equivalent to transmitting no information. In this embodiment, the outward sending weights are obtained by removing these useless second transfer ratio vectors, and the second weight vector is then determined.
  • the embodiment of the present application first learns a large intermediate weight vector and then selects from it, so that the relative position information of the feature information is taken into account.
  • determining the second weight vector based on the outward sending weight includes:
  • the outward sending weights are arranged according to the positions of the corresponding second input points to obtain a second weight vector.
  • this embodiment arranges the outward sending weights obtained for each feature point according to the positions of the corresponding second input points for use in subsequent information transmission, where the multiple second input points corresponding to a feature point are sorted according to the outward sending weights; optionally, in the subsequent information transfer process, the information of the feature point may be sent to the multiple input points in order.
  • before the feature map is processed through a neural network to obtain the second intermediate weight vector, the method further includes:
  • the feature map is processed through a neural network to obtain a second intermediate weight vector, including:
  • the second intermediate feature map after dimensionality reduction is processed by a neural network to obtain a second intermediate weight vector.
  • the feature maps can also be subjected to dimensionality reduction processing to reduce the amount of calculation by reducing the number of channels.
  • the same neural network may be used to perform dimensionality reduction.
  • the first intermediate feature map and the second intermediate feature map obtained by reducing the dimensionality of the feature map may be the same or different.
  • processing the second intermediate feature map after dimensionality reduction through a neural network to obtain a second intermediate weight vector includes:
  • a second intermediate weight vector is obtained based on the second transfer ratio vector.
  • each second intermediate feature point in the dimensionality-reduced second intermediate feature map is used as an output point, and all positions around the output point are used as input points.
  • all surrounding positions include a plurality of second intermediate feature points in the second intermediate feature map and multiple positions spatially adjacent to the second output point.
  • all second transfer ratio vectors corresponding to the output point are generated, and the information of the output point is transmitted to the input points according to the transfer ratio; according to this embodiment, the transfer ratio of the information transmitted between every two second intermediate feature points can be obtained.
  • step 130 may include:
  • An enhanced feature map is obtained based on the first feature vector, the second feature vector, and the feature map.
  • the feature information received by the feature points in the feature map is obtained through the first weight vector and the feature map
  • the feature information sent by the feature points in the feature map is obtained through the second weight vector and the feature map.
  • based on the bidirectionally transmitted feature information and the feature map, an enhanced feature map that includes more information can be obtained.
  • obtaining the first feature vector based on the first weight vector and the feature map; and obtaining the second feature vector based on the second weight vector and the feature map includes:
  • Matrix multiplication processing is performed on the second weight vector and the second intermediate feature map after the feature map is subjected to dimensionality reduction processing to obtain a second feature vector.
  • Matrix multiplication is performed on the second weight vector and the feature map to obtain a second feature vector.
  • the first weight vector obtained and the first intermediate feature map after dimensionality reduction satisfy the requirements of matrix multiplication.
  • each feature point in the first intermediate feature map is multiplied by the weights corresponding to that feature point, so that the feature information is transmitted to at least one feature point (for example, each feature point) according to the weights; and the second feature vector realizes the outward transmission of feature information from at least one feature point (for example, each feature point) according to the corresponding weights.
  • each feature point in the feature map is multiplied by the weights corresponding to that feature point, so that the feature information is transmitted to each feature point according to the weights; and the second feature vector realizes the outward transmission of feature information from each feature point according to the corresponding weights.
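  • The matrix multiplication described above can be illustrated with a minimal sketch. It assumes the feature map is flattened to one row per feature point, and it assumes a particular weight layout (`receive_w[i][j]` for point i receiving from point j, `send_w[j][i]` for point j sending to point i); the names, the layout, and the toy values are illustrative assumptions, not taken from the application.

```python
def matmul(a, b):
    """Multiply an (n x m) matrix by an (m x k) matrix, both given as nested lists."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    """Swap rows and columns of a nested-list matrix."""
    return [list(row) for row in zip(*m)]


# Toy feature map with N = 3 feature points and C = 2 channels (one row per point).
features = [[1.0, 0.0],
            [0.0, 2.0],
            [4.0, 4.0]]

# Assumed layout: receive_w[i][j] is the weight with which point i receives from point j.
receive_w = [[0.5, 0.5, 0.0],
             [0.0, 1.0, 0.0],
             [0.25, 0.25, 0.5]]

# Assumed layout: send_w[j][i] is the weight with which point j sends to point i.
send_w = [[1.0, 0.0, 0.0],
          [0.5, 0.5, 0.0],
          [0.0, 0.0, 1.0]]

# Information collected by each point (first feature vector).
first_feature_vector = matmul(receive_w, features)
# Information distributed to each point (second feature vector); the transpose
# turns "who sends to whom" into "who receives from whom".
second_feature_vector = matmul(transpose(send_w), features)
```

  • With this layout, row i of each result holds the information that arrives at feature point i, so both results have the same shape as the flattened feature map.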
  • obtaining the enhanced feature map based on the first feature vector, the second feature vector, and the feature map includes:
  • the stitched feature vector and the feature map are stitched in the channel dimension to obtain the feature-enhanced feature map.
  • the first feature vector and the second feature vector are combined through stitching to obtain the bidirectionally transmitted information, and then the bidirectionally transmitted information and the feature map are stitched to obtain the feature-enhanced feature map.
  • This feature-enhanced feature map includes not only the feature information of each feature point in the original feature map, but also the feature information transmitted in both directions between every two feature points.
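  • As a small illustration of the two stitching steps, the sketch below concatenates per-point rows along the channel dimension. The helper name `concat_channels` and the toy values are hypothetical; a real implementation would normally use a tensor library.

```python
def concat_channels(*maps):
    """Concatenate per-point feature rows along the channel dimension."""
    n = len(maps[0])
    if any(len(m) != n for m in maps):
        raise ValueError("all inputs must have the same number of feature points")
    return [[v for m in maps for v in m[i]] for i in range(n)]


# Toy values with N = 2 feature points.
feature_map = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   # original features, 3 channels
first_vec = [[0.1], [0.2]]                          # collected information, 1 channel
second_vec = [[0.7], [0.8]]                         # distributed information, 1 channel

# Step 1: stitch the two feature vectors into the bidirectional information.
stitched = concat_channels(first_vec, second_vec)
# Step 2: stitch the bidirectional information with the original feature map.
enhanced = concat_channels(stitched, feature_map)
```

  • The number of feature points is unchanged; only the channel count grows, here from 3 to 5.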
  • the method further includes:
  • the stitched feature vector and the feature map are stitched in the channel dimension to obtain the feature-enhanced feature map, including:
  • the processed stitched feature vector and the feature map are stitched in the channel dimension to obtain the feature-enhanced feature map.
  • a neural network (for example, a cascade of a convolution layer and a non-linear activation layer) is used to implement feature projection; through the feature projection, the stitched feature vector and the feature map are unified in the dimensions other than the channel dimension, so that they can be stitched along the channel dimension.
  • FIG. 3 is a schematic diagram of a network structure of another embodiment of an image processing method of the present application. As shown in FIG. 3, the input image features are processed in two branches: an information collection stream, which is responsible for information collection, and an information distribution stream, which is responsible for information distribution. In each branch, a convolution operation that reduces the number of channels is performed first, and the amount of computation is reduced through this feature dimensionality reduction.
  • the dimension-reduced feature map is then passed through a small neural network (usually a concatenation of some convolutional layers and non-linear activation layers, which are the basic modules of a convolutional neural network) to predict (adaption) feature weights whose size is close to 2 times the size of the feature map (for example, if the size of the feature map is H × W (height H, width W), the number of feature weights predicted for each feature point is (2H-1) × (2W-1)), which ensures that each point can exchange information with all points in the full map while the relative positional relationship is taken into account.
  • a matrix product between the obtained weight matrix and the previously dimension-reduced features is used to perform the information transfer.
  • feature projection is performed on the obtained features, for example by processing them with a neural network (such as a cascade of a convolution layer and a non-linear activation layer), to obtain global features.
  • the obtained global features and the initial input features are stitched to obtain the final output feature expression, where the stitching may be performed along the feature dimension.
  • This feature includes both the semantic information in the original feature and the global context information corresponding to the global feature.
  • the enhanced feature can be used for scene analysis, for example: directly input to a classifier implemented by a small convolutional neural network to classify each point.
  • FIG. 4-a is a schematic diagram of obtaining a weight vector of an information collection branch in another embodiment of the image processing method of the present application.
  • the non-compact weight feature is center-aligned with the target feature point i; the non-compact feature weight of size (2H-1) × (2W-1) predicted for each feature point can be expanded into a translucent rectangle covering the whole image, with the center of the rectangle aligned with that point. This step ensures that the relative positional relationship between feature points is accurately taken into account when predicting feature weights.
  • FIG. 4-b is a schematic diagram of obtaining a weight vector of an information distribution branch in another embodiment of the image processing method of the present application.
  • the center point of the alignment is the information starting point j.
  • the non-compact feature weight of (2H-1) ⁇ (2W-1) predicted by each point can be expanded into a semi-transparent rectangle covering the full image, and the semi-transparent rectangle is the mask.
  • the overlapping area is indicated by a dotted frame, which is an effective weight feature.
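  • The alignment and overlap described for FIG. 4-a and FIG. 4-b can be sketched as an index calculation. The sketch assumes the (2H-1) × (2W-1) weights of one point are stored as a nested list whose center entry (H-1, W-1) corresponds to the point itself; the function name `crop_valid_weights` and the toy data are hypothetical.

```python
def crop_valid_weights(overcomplete, h, w, row, col):
    """Return the H x W weights that overlap the feature map for point (row, col).

    `overcomplete` has shape (2H-1) x (2W-1); its centre (H-1, W-1) is aligned
    with the point, so entry [H-1 + dr][W-1 + dc] is the weight for the position
    at relative offset (dr, dc) from the point.
    """
    if len(overcomplete) != 2 * h - 1 or len(overcomplete[0]) != 2 * w - 1:
        raise ValueError("expected a (2H-1) x (2W-1) weight map")
    top, left = h - 1 - row, w - 1 - col
    return [r[left:left + w] for r in overcomplete[top:top + h]]


H, W = 2, 3
# Toy over-complete map in which each entry records its own relative offset (dr, dc).
overcomplete = [[(dr, dc) for dc in range(-(W - 1), W)] for dr in range(-(H - 1), H)]

# Effective weights for the feature point in row 1, column 0.
valid = crop_valid_weights(overcomplete, H, W, row=1, col=0)
```

  • In the toy data the entry at the point's own position is the zero offset, which confirms that the crop keeps exactly the region overlapping the feature map; the remaining entries of the over-complete map fall outside the map and correspond to the invalid information that is removed.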
  • the method of this embodiment is implemented by using a feature extraction network and a feature enhancement network;
  • the sample image has an annotation processing result
  • the annotation processing result includes an annotation scene analysis result or an annotation object segmentation result.
  • the feature extraction network involved in this implementation may be pre-trained or untrained. When the feature extraction network is pre-trained, one may choose to train only the feature enhancement network, or to train both the feature extraction network and the feature enhancement network; when the feature extraction network is untrained, the sample image is used to train both the feature extraction network and the feature enhancement network.
  • using a sample image to train a feature enhancement network includes:
  • the sample image is input into a feature extraction network and a feature enhancement network to obtain prediction processing results;
  • the feature enhancement network is trained based on the obtained prediction processing result and the annotation processing result.
  • the proposed point-wise spatial weighting module (PSA, Point-wise Spatial Attention, corresponding to the feature enhancement network provided by the above embodiment) is embedded into the framework of scene analysis.
  • FIG. 5 is an exemplary structural schematic diagram of network training in the image processing method of the present application. As shown in FIG. 5, the input image passes through an existing scene parsing model, and the output feature map is sent to the PSA module structure for information aggregation. The final features are input to a classifier for scene analysis. A main loss is obtained based on the predicted scene analysis result and the annotation processing result. The main loss corresponds to the first loss in the above embodiment, and the network is trained based on the main loss.
  • using a sample image to train a feature extraction network and a feature enhancement network includes:
  • the sample image is input into a feature extraction network and a feature enhancement network to obtain prediction processing results;
  • Feature extraction network and feature enhancement network are trained based on the main loss.
  • because the feature extraction network and the feature enhancement network are connected in sequence, when the obtained first loss (for example, the main loss) is fed back to the feature enhancement network and the feedback continues backward, the feature extraction network can be trained or fine-tuned (when the feature extraction network is pre-trained, it is only fine-tuned), so that the feature extraction network and the feature enhancement network are trained at the same time, which makes the results of scene analysis tasks or object segmentation tasks more accurate.
  • the method in this embodiment may further include:
  • the parameters of the feature extraction network are adjusted based on the second loss.
  • FIG. 6 is another exemplary structural schematic diagram of network training in the image processing method of the present application.
  • the PSA module can act on the final feature expression (such as Stage5) of a fully convolutional network based on the residual network (ResNet). This results in better information integration and better use of the contextual information of the scene.
  • the residual network is composed of 5 stages. After the input image passes through the first 4 stages, the processing is divided into two branches.
  • in the main branch, the feature map is obtained through stage 5 and then input to the PSA structure.
  • the final feature map is input to the classifier to classify each point, and the main loss obtained is used to train the residual network and the feature enhancement network; the main loss corresponds to the first loss in the above embodiment.
  • in the side branch, the output of the 4th stage is directly input to a classifier for scene analysis.
  • the side branch is mainly used during the training of the neural network, where the auxiliary loss it yields assists the training.
  • the auxiliary loss corresponds to the second loss in the above embodiment; the scene analysis result of the main branch is taken as the primary result.
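  • One common way to use a main loss and an auxiliary loss together is a weighted sum of per-point cross-entropy losses; the sketch below illustrates that idea only. The text does not state how the two losses are combined, so the weighted sum, the factor `AUX_WEIGHT`, and the toy scores are all assumptions.

```python
import math


def cross_entropy(logits, label):
    """Cross-entropy of one point's class scores against its annotated label."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[label]


def mean_loss(all_logits, labels):
    """Average cross-entropy over all points."""
    return sum(cross_entropy(l, y) for l, y in zip(all_logits, labels)) / len(labels)


AUX_WEIGHT = 0.4  # hypothetical balancing factor, not specified in the text

labels = [0, 1]                           # annotated class for each of two points
main_logits = [[2.0, 0.0], [0.0, 2.0]]    # main branch: stage 5, PSA, classifier
aux_logits = [[0.0, 0.0], [0.0, 0.0]]     # side branch: stage 4, classifier

main_loss = mean_loss(main_logits, labels)   # corresponds to the first loss
aux_loss = mean_loss(aux_logits, labels)     # corresponds to the second loss
total_loss = main_loss + AUX_WEIGHT * aux_loss
```

  • Because the side branch depends only on the first four stages, the auxiliary term in such a sum would influence only the parameters of the feature extraction network, which is consistent with the second loss being used to adjust that network.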
  • the foregoing program may be stored in a computer-readable storage medium; when the program is executed, the steps of the foregoing method embodiments are performed; and the foregoing storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
  • FIG. 7 is a schematic structural diagram of an embodiment of an image processing apparatus of the present application.
  • the device in this embodiment may be used to implement the foregoing method embodiments of the present application.
  • the apparatus of this embodiment includes:
  • a feature extraction unit 71 is configured to perform feature extraction on an image to be processed to generate a feature map of the image.
  • the image in this embodiment may be an image that has not been subjected to feature extraction processing, or may be a feature map obtained after one or more feature extractions, etc.
  • the present application does not limit the specific form of the image to be processed.
  • the weight determining unit 72 is configured to determine a feature weight corresponding to each feature point in the multiple feature points included in the feature map.
  • the multiple feature points in this embodiment may be all or part of the feature points in the feature map; to realize information transfer between the feature points, it is necessary to determine the transfer probability, that is, all or part of the information of one feature point is transferred to other feature points, and the proportion transferred is determined by the feature weights.
  • a feature enhancement unit 73 is configured to transmit feature information of feature points corresponding to feature weights to a plurality of other feature points included in the feature map, respectively, to obtain a feature map after feature enhancement.
  • the other feature points refer to feature points other than the corresponding feature points in the feature map.
  • feature extraction is performed on an image to be processed to generate a feature map of the image; a feature weight corresponding to each feature point in a plurality of feature points included in the feature map is determined; and the feature information of the feature point corresponding to the feature weight is transmitted to a plurality of other feature points included in the feature map to obtain a feature-enhanced feature map.
  • the context information can be better used and the features are enhanced.
  • the subsequent feature map contains more information.
  • the apparatus further includes:
  • An image processing unit is configured to perform scene analysis processing or object segmentation processing on an image based on a feature map after feature enhancement.
  • each feature point in the feature map can collect information about other points to help predict the current point, and also distribute information about the current point to help predict the other points.
  • the point-wise spatial weighting scheme designed in this solution is adaptively learned and adjusted, and is related to the positional relationship.
  • the feature map after feature enhancement can better use the context information of complex scenes to help scene analysis or object segmentation processing.
  • the apparatus in this embodiment further includes:
  • the result application unit is configured to perform robot navigation control or vehicle intelligent driving control according to a result of scene analysis processing or a result of object segmentation processing.
  • the feature weights of the feature points included in the feature map include an inward receiving weight and an outward sending weight; the inward receiving weight indicates the weight with which a feature point receives feature information from other feature points included in the feature map; the outward sending weight indicates the weight with which a feature point transmits feature information to other feature points included in the feature map.
  • through the inward receiving weights and the outward sending weights, two-way propagation of information among the feature points is achieved, so that each feature point in the feature map can collect information from other feature points to help predict the current feature point, and can distribute the information of the current feature point to help predict other feature points.
  • the weight determining unit 72 includes:
  • a first weighting module configured to perform a first branch processing on the feature map to obtain a first weight vector of an inward receiving weight of each feature point among a plurality of feature points included;
  • a second weighting module is configured to perform a second branching process on the feature map to obtain a second weighting vector of an outward sending weight of each of the plurality of feature points included.
  • the first weighting module includes:
  • a first intermediate vector module configured to process the feature map through a neural network to obtain a first intermediate weight vector
  • the first information removing module is configured to remove invalid information in the first intermediate weight vector to obtain a first weight vector.
  • the invalid information indicates information in the first intermediate weight vector that has no influence on feature transmission or the degree of influence is less than a set condition.
  • the first intermediate weight vector obtained through the processing of the neural network includes a lot of meaningless invalid information. Because this invalid information has only one transmitting end (feature point), whether it is transmitted has no influence on feature transmission, or the degree of influence is less than the set condition; removing this invalid information yields the first weight vector, which retains comprehensive information without including useless information, thereby improving the efficiency of transmitting useful information.
  • the first intermediate vector module is configured to use each feature point in the feature map as a first input point, and use the positions around the first input point as the first output points corresponding to the first input point, where the surrounding positions include multiple feature points in the feature map and multiple spatially adjacent positions of the first input point; obtain a first transfer ratio vector between the first input point in the feature map and the first output points corresponding to the first input point; and obtain the first intermediate weight vector based on the first transfer ratio vector.
  • the first information removing module is configured to identify, from the first intermediate weight vector, the first transfer ratio vectors for which the information included in the first output point is empty; remove from the first intermediate weight vector the first transfer ratio vectors for which the information included in the first output point is empty, to obtain the inward receiving weights of the feature map; and determine the first weight vector based on the inward receiving weights.
  • when determining the first weight vector based on the inward receiving weights, the first information removing module is configured to arrange the inward receiving weights according to the positions of the corresponding first output points to obtain the first weight vector.
  • the first weighting module further includes:
  • a first dimensionality reduction module configured to perform dimensionality reduction processing on the feature map through a convolution layer to obtain a first intermediate feature map
  • a first intermediate vector module configured to process the first intermediate feature map after dimensionality reduction through a neural network to obtain a first intermediate weight vector.
  • the second weighting module includes:
  • a second intermediate vector module configured to process the feature map through a neural network to obtain a second intermediate weight vector
  • the second information removing module is configured to remove invalid information in the second intermediate weight vector to obtain a second weight vector.
  • the invalid information indicates information in the second intermediate weight vector that has no influence on the feature transmission or the degree of influence is less than a set condition.
  • the second intermediate weight vector obtained through the processing of the neural network includes a lot of meaningless invalid information. Because this invalid information has only one transmitting end (feature point), whether it is transmitted has no influence on feature transmission, or the degree of influence is less than the set condition; removing this invalid information yields the second weight vector, which retains comprehensive information without including useless information, thereby improving the efficiency of transmitting useful information.
  • the second intermediate vector module is configured to use each feature point in the feature map as a second output point, and use the positions around the second output point as the second input points corresponding to the second output point.
  • the second information removing module is configured to identify, from the second intermediate weight vector, the second transfer ratio vectors for which the information included in the second output point is empty; remove from the second intermediate weight vector the second transfer ratio vectors for which the information included in the second output point is empty, to obtain the outward sending weights of the feature map; and determine the second weight vector based on the outward sending weights.
  • when determining the second weight vector based on the outward sending weights, the second information removing module is configured to arrange the outward sending weights according to the positions of the corresponding second input points to obtain the second weight vector.
  • the second weighting module further includes:
  • a second dimension reduction module configured to perform dimension reduction processing on the feature map through a convolution layer to obtain a second intermediate feature map
  • a second intermediate vector module configured to process the second intermediate feature map after dimensionality reduction through a neural network to obtain a second intermediate weight vector.
  • the feature enhancement unit includes:
  • a feature vector module configured to obtain a first feature vector based on a first weight vector and a feature map; obtain a second feature vector based on a second weight vector and a feature map;
  • the enhanced feature map module is used to obtain a feature enhanced feature map based on the first feature vector, the second feature vector and the feature map.
  • the feature information received by the feature points in the feature map is obtained through the first weight vector and the feature map
  • the feature information sent by the feature points in the feature map is obtained through the second weight vector and the feature map.
  • based on the bidirectionally transmitted feature information and the original feature map, an enhanced feature map including more information is obtained.
  • the feature vector module is configured to perform matrix multiplication on the first weight vector and the feature map, or the first intermediate feature map obtained after dimensionality reduction of the feature map, to obtain the first feature vector; and to perform matrix multiplication on the second weight vector and the feature map, or the second intermediate feature map obtained after dimensionality reduction of the feature map, to obtain the second feature vector.
  • the enhanced feature map module is configured to stitch the first feature vector and the second feature vector in the channel dimension to obtain a stitched feature vector, and to stitch the stitched feature vector and the feature map in the channel dimension to obtain the feature-enhanced feature map.
  • the feature enhancement unit further includes:
  • a feature projection module configured to perform feature projection processing on the stitched feature vector to obtain the processed stitched feature vector
  • the enhanced feature map module is configured to stitch the processed stitched feature vector and the feature map in the channel dimension to obtain the feature-enhanced feature map.
  • the apparatus of this embodiment is implemented by using a feature extraction network and a feature enhancement network;
  • the training unit is configured to use the sample image to train the feature enhancement network, or use the sample image to train the feature extraction network and the feature enhancement network.
  • the sample image has an annotation processing result
  • the annotation processing result includes an annotation scene analysis result or an annotation object segmentation result.
  • the feature extraction network involved in this implementation may be pre-trained or untrained. When the feature extraction network is pre-trained, one may choose to train only the feature enhancement network, or to train both the feature extraction network and the feature enhancement network; when the feature extraction network is untrained, the sample image is used to train both the feature extraction network and the feature enhancement network.
  • a training unit is configured to input a sample image into a feature extraction network and a feature enhancement network to obtain a prediction processing result; and train the feature enhancement network based on the prediction processing result and the annotation processing result.
  • a training unit is configured to input a sample image into a feature extraction network and a feature enhancement network to obtain a prediction processing result; obtain a first loss based on the prediction processing result and the annotation processing result; and train the feature extraction network and the feature enhancement network based on the first loss.
  • the training unit is further configured to determine an intermediate prediction processing result based on a feature map output from an intermediate layer in the feature extraction network; obtain a second loss based on the intermediate prediction processing result and the annotation processing result; and adjust the parameters of the feature extraction network based on the second loss.
  • an electronic device including a processor, where the processor includes the image processing apparatus according to any one of the foregoing.
  • the electronic device may be a vehicle-mounted electronic device.
  • an electronic device including: a memory for storing executable instructions;
  • a processor configured to communicate with the memory to execute the executable instructions to complete the operations of the image processing method according to any one of the above.
  • a computer-readable storage medium for storing computer-readable instructions that, when executed, perform the operations of the image processing method according to any one of the foregoing.
  • a computer program product including computer-readable code, where, when the computer-readable code runs on a device, a processor in the device executes instructions for implementing the image processing method according to any one of the above.
  • FIG. 8 illustrates a schematic structural diagram of an electronic device 800 suitable for implementing a terminal device or a server in the embodiment of the present application.
  • the electronic device 800 includes one or more processors and a communication unit.
  • the one or more processors are, for example, one or more central processing units (CPUs) 801, and / or one or more special-purpose processors.
  • the special-purpose processors may serve as the acceleration unit 813, which may include, but is not limited to, images.
  • the processor may perform various appropriate actions and processes according to executable instructions stored in a read-only memory (ROM) 802 or executable instructions loaded from a storage portion 808 into a random access memory (RAM) 803.
  • the communication unit 812 may include, but is not limited to, a network card, and the network card may include, but is not limited to, an IB (Infiniband) network card.
  • the processor may communicate with the read-only memory 802 and / or the random access memory 803 to execute executable instructions, connect to the communication unit 812 through the bus 804, and communicate with other target devices via the communication unit 812, thereby completing operations corresponding to any of the methods provided in the embodiments of the present application, for example: performing feature extraction on an image to be processed to generate a feature map of the image; determining a feature weight corresponding to each feature point among a plurality of feature points included in the feature map; and transmitting the feature information of the feature point corresponding to the feature weight to a plurality of other feature points included in the feature map, respectively, to obtain a feature-enhanced feature map.
  • RAM 803 can also store various programs and data required for device operation.
  • the CPU 801, the ROM 802, and the RAM 803 are connected to each other through a bus 804.
  • ROM802 is an optional module.
  • the RAM 803 stores executable instructions, or writes executable instructions to the ROM 802 at runtime, and the executable instructions cause the central processing unit 801 to perform operations corresponding to the foregoing communication method.
  • An input / output (I / O) interface 805 is also connected to the bus 804.
  • the communication unit 812 may be provided in an integrated manner, or may be provided with a plurality of sub-modules (for example, a plurality of IB network cards) and connected on a bus link.
  • the following components are connected to the I / O interface 805: an input portion 806 including a keyboard, a mouse, and the like; an output portion 807 including a cathode ray tube (CRT), a liquid crystal display (LCD), and a speaker; a storage portion 808 including a hard disk and the like ; And a communication section 809 including a network interface card such as a LAN card, a modem, and the like. The communication section 809 performs communication processing via a network such as the Internet.
  • the drive 810 is also connected to the I / O interface 805 as needed.
  • a removable medium 811 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc., is installed on the drive 810 as needed, so that a computer program read therefrom is installed into the storage section 808 as needed.
  • FIG. 8 is only an optional implementation manner. In the specific practice process, the number and types of components in FIG. 8 may be selected, deleted, added or replaced according to actual needs. For different functional component settings, separate or integrated settings can also be used.
  • the acceleration unit 813 and the CPU 801 can be set separately, or the acceleration unit 813 can be integrated on the CPU 801.
  • the communication unit can be set separately, or can be integrated on the CPU 801 or on the acceleration unit 813, and so on.
  • the process described above with reference to the flowchart may be implemented as a computer software program.
  • embodiments of the present application include a computer program product including a computer program tangibly embodied on a machine-readable medium.
  • the computer program includes program code for performing the method shown in the flowchart, and the program code may include instructions corresponding to the method steps provided in the embodiments of the present application, for example: performing feature extraction on an image to be processed to generate a feature map of the image; determining a feature weight corresponding to each feature point among a plurality of feature points included in the feature map; and transmitting the feature information of the feature points corresponding to the feature weights to a plurality of other feature points included in the feature map, respectively, to obtain a feature-enhanced feature map.
  • the computer program may be downloaded and installed from a network through the communication section 809, and / or installed from a removable medium 811.
  • The methods and apparatus of the present application may be implemented in many ways.
  • For example, the methods and apparatus of the present application may be implemented in software, hardware, firmware, or any combination of software, hardware, and firmware.
  • The order of the method steps given above is for illustration only; unless otherwise specifically stated, the steps of the method of the present application are not limited to the order described above.
  • The present application may also be implemented as programs recorded on a recording medium, the programs including machine-readable instructions for implementing the method according to the present application.
  • The present application therefore also covers a recording medium storing a program for executing the method according to the present application.
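
The example steps listed above (determining a feature weight for each feature point, then transmitting feature information between feature points to obtain an enhanced feature map) can be sketched in plain Python. This is only an illustrative sketch under stated assumptions, not the implementation of the present application: the names `softmax` and `propagate_features` are hypothetical, the weights are assumed to come from a softmax over dot-product similarity, the received features are assumed to be added to the original ones, and the feature map is assumed to be already extracted and given as an H x W x C nested list.

```python
import math


def softmax(row):
    """Normalise a list of scores so that they sum to 1."""
    m = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def propagate_features(feature_map):
    """Return a feature map of the same H x W x C shape in which every
    feature point has received weighted information from all points.

    Assumptions (not taken from the source text):
      * weights[i][j] is a softmax over dot-product similarity;
      * the received features are added to the original features.
    """
    h, w = len(feature_map), len(feature_map[0])
    # Flatten the H x W grid into a list of N feature points (vectors).
    points = [feature_map[r][c] for r in range(h) for c in range(w)]
    n, channels = len(points), len(points[0])

    # Step 1: one weight per (receiving point i, sending point j) pair.
    weights = [
        softmax([sum(a * b for a, b in zip(points[i], points[j]))
                 for j in range(n)])
        for i in range(n)
    ]

    # Step 2: each point i collects the weighted features of the points j.
    enhanced = []
    for i in range(n):
        received = [
            sum(weights[i][j] * points[j][k] for j in range(n))
            for k in range(channels)
        ]
        enhanced.append([points[i][k] + received[k] for k in range(channels)])

    # Restore the original H x W layout.
    return [enhanced[r * w:(r + 1) * w] for r in range(h)]


if __name__ == "__main__":
    tiny_map = [[[1.0, 0.0], [0.0, 1.0]]]  # 1 x 2 map with 2 channels
    print(propagate_features(tiny_map))
```

Because the weights for each receiving point sum to 1, the received vector is a weighted average of the feature points in the map; a real system would normally learn the weights with a neural network rather than derive them from a fixed similarity.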

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Electromagnetism (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Automation & Control Theory (AREA)
  • Image Analysis (AREA)
PCT/CN2019/093646 2018-08-07 2019-06-28 Image processing method and apparatus, electronic device, storage medium, and program product Ceased WO2020029708A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2020554362A JP7065199B2 (ja) 2018-08-07 2019-06-28 画像処理方法及び装置、電子機器、記憶媒体並びにプログラム製品
SG11202005737WA SG11202005737WA (en) 2018-08-07 2019-06-28 Image processing method and apparatus, electronic device, storage medium, and program product
US16/905,478 US20200356802A1 (en) 2018-08-07 2020-06-18 Image processing method and apparatus, electronic device, storage medium, and program product

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810893153.1A CN109344840B (zh) 2018-08-07 2018-08-07 Image processing method and apparatus, electronic device, storage medium, and program product
CN201810893153.1 2018-08-07

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/905,478 Continuation US20200356802A1 (en) 2018-08-07 2020-06-18 Image processing method and apparatus, electronic device, storage medium, and program product

Publications (1)

Publication Number Publication Date
WO2020029708A1 (zh) 2020-02-13

Family

ID=65291562

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/093646 Ceased WO2020029708A1 (zh) 2018-08-07 2019-06-28 Image processing method and apparatus, electronic device, storage medium, and program product

Country Status (5)

Country Link
US (1) US20200356802A1
JP (1) JP7065199B2
CN (1) CN109344840B
SG (1) SG11202005737WA
WO (1) WO2020029708A1

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109344840B (zh) * 2018-08-07 2022-04-01 深圳市商汤科技有限公司 图像处理方法和装置、电子设备、存储介质、程序产品
CN109798888B (zh) * 2019-03-15 2021-09-17 京东方科技集团股份有限公司 移动设备的姿态确定装置、方法和视觉里程计
JP6965298B2 (ja) * 2019-03-18 2021-11-10 株式会社東芝 物体検出装置、物体検出方法、プログラム、および移動体
CN110135440A (zh) * 2019-05-15 2019-08-16 北京艺泉科技有限公司 一种适用于海量文物图像检索的图像特征提取方法
US11080884B2 (en) * 2019-05-15 2021-08-03 Matterport, Inc. Point tracking using a trained network
CN111767925B (zh) * 2020-04-01 2024-09-24 北京沃东天骏信息技术有限公司 物品图片的特征提取和处理方法、装置、设备和存储介质
CN111951252B (zh) * 2020-08-17 2024-01-23 中国科学院苏州生物医学工程技术研究所 多时序图像处理方法、电子设备及存储介质
CN112191055B (zh) * 2020-09-29 2021-12-31 武穴市东南矿业有限公司 一种矿山机械用具有空气检测结构的降尘装置
US20220237434A1 (en) * 2021-01-25 2022-07-28 Samsung Electronics Co., Ltd. Electronic apparatus for processing multi-modal data, and operation method thereof
CN112926595B (zh) * 2021-02-04 2022-12-02 深圳市豪恩汽车电子装备股份有限公司 深度学习神经网络模型的训练装置、目标检测系统及方法
CN113065997B (zh) * 2021-02-27 2023-11-17 华为技术有限公司 一种图像处理方法、神经网络的训练方法以及相关设备
CN112987765B (zh) * 2021-03-05 2022-03-15 北京航空航天大学 一种仿猛禽注意力分配的无人机/艇精准自主起降方法
CN113485750B (zh) * 2021-06-29 2024-01-23 海光信息技术股份有限公司 数据处理方法及数据处理装置
CN113191461B (zh) * 2021-06-29 2021-09-17 苏州浪潮智能科技有限公司 一种图片识别方法、装置、设备及可读存储介质
TWI806640B (zh) * 2022-01-11 2023-06-21 旺宏電子股份有限公司 記憶體裝置及其操作方法
CN115115680A (zh) * 2022-06-08 2022-09-27 腾讯科技(深圳)有限公司 图像处理方法、装置、设备以及存储介质
CN115496998B (zh) * 2022-06-17 2025-09-16 中国人民解放军网络空间部队信息工程大学 一种遥感影像码头目标检测方法

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101517538B1 (ko) * 2013-12-31 2015-05-15 전남대학교산학협력단 중심 가중치 맵을 이용한 중요도 영역 검출 장치 및 방법, 이를 위한 프로그램을 기록한 기록 매체
CN105023253A (zh) * 2015-07-16 2015-11-04 上海理工大学 基于视觉底层特征的图像增强方法
CN106127208A (zh) * 2016-06-16 2016-11-16 北京市商汤科技开发有限公司 对图像中的多个对象进行分类的方法和系统、计算机系统
CN107527059A (zh) * 2017-08-07 2017-12-29 北京小米移动软件有限公司 文字识别方法、装置及终端
CN108053028A (zh) * 2017-12-21 2018-05-18 深圳云天励飞技术有限公司 数据定点化处理方法、装置、电子设备及计算机存储介质
CN108154222A (zh) * 2016-12-02 2018-06-12 北京市商汤科技开发有限公司 深度神经网络训练方法和系统、电子设备
CN108364023A (zh) * 2018-02-11 2018-08-03 北京达佳互联信息技术有限公司 基于注意力模型的图像识别方法和系统
CN109344840A (zh) * 2018-08-07 2019-02-15 深圳市商汤科技有限公司 图像处理方法和装置、电子设备、存储介质、程序产品

Family Cites Families (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102801972B (zh) * 2012-06-25 2017-08-29 北京大学深圳研究生院 基于特征的运动矢量估计和传递方法
CN105095833B (zh) * 2014-05-08 2019-03-15 中国科学院声学研究所 用于人脸识别的网络构建方法、识别方法及系统
US9792521B2 (en) * 2014-12-26 2017-10-17 Here Global B.V. Extracting feature geometries for localization of a device
US20160358069A1 (en) * 2015-06-03 2016-12-08 Samsung Electronics Co., Ltd. Neural network suppression
US11449061B2 (en) * 2016-02-29 2022-09-20 AI Incorporated Obstacle recognition method for autonomous robots
US11927965B2 (en) * 2016-02-29 2024-03-12 AI Incorporated Obstacle recognition method for autonomous robots
JP6858002B2 (ja) * 2016-03-24 2021-04-14 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカPanasonic Intellectual Property Corporation of America 物体検出装置、物体検出方法及び物体検出プログラム
CN106022221B (zh) * 2016-05-09 2021-11-30 腾讯科技(深圳)有限公司 一种图像处理方法及处理系统
CN107516103B (zh) * 2016-06-17 2020-08-25 北京市商汤科技开发有限公司 一种影像分类方法和系统
JP6776696B2 (ja) * 2016-07-26 2020-10-28 富士通株式会社 並列情報処理装置、情報処理方法、およびプログラム
US20180039853A1 (en) * 2016-08-02 2018-02-08 Mitsubishi Electric Research Laboratories, Inc. Object Detection System and Object Detection Method
US11544539B2 (en) * 2016-09-29 2023-01-03 Tsinghua University Hardware neural network conversion method, computing device, compiling method and neural network software and hardware collaboration system
KR101879207B1 (ko) * 2016-11-22 2018-07-17 주식회사 루닛 약한 지도 학습 방식의 객체 인식 방법 및 장치
CN108229274B (zh) * 2017-02-28 2020-09-04 北京市商汤科技开发有限公司 多层神经网络模型训练、道路特征识别的方法和装置
US11010595B2 (en) * 2017-03-23 2021-05-18 Samsung Electronics Co., Ltd. Facial verification method and apparatus
CN108205803B (zh) * 2017-07-19 2020-12-25 北京市商汤科技开发有限公司 图像处理方法、神经网络模型的训练方法及装置
CN108229497B (zh) * 2017-07-28 2021-01-05 北京市商汤科技开发有限公司 图像处理方法、装置、存储介质、计算机程序和电子设备
WO2019041360A1 (zh) * 2017-09-04 2019-03-07 华为技术有限公司 行人属性识别与定位方法以及卷积神经网络系统
CN108229307B (zh) * 2017-11-22 2022-01-04 北京市商汤科技开发有限公司 用于物体检测的方法、装置和设备
US11270523B2 (en) * 2017-11-29 2022-03-08 Sdc U.S. Smilepay Spv Systems and methods for constructing a three-dimensional model from two-dimensional images
JP7094702B2 (ja) * 2018-01-12 2022-07-04 キヤノン株式会社 画像処理装置及びその方法、プログラム
US11768292B2 (en) * 2018-03-14 2023-09-26 Uatc, Llc Three-dimensional object detection
US10592780B2 (en) * 2018-03-30 2020-03-17 White Raven Ltd. Neural network training system
CN110660037B (zh) * 2018-06-29 2023-02-10 京东方科技集团股份有限公司 图像间脸部交换的方法、装置、系统和计算机程序产品
CN111666960B (zh) * 2019-03-06 2024-01-19 南京地平线机器人技术有限公司 图像识别方法、装置、电子设备及可读存储介质

Also Published As

Publication number Publication date
JP7065199B2 (ja) 2022-05-11
JP2021507439A (ja) 2021-02-22
CN109344840A (zh) 2019-02-15
CN109344840B (zh) 2022-04-01
SG11202005737WA (en) 2020-07-29
US20200356802A1 (en) 2020-11-12

Similar Documents

Publication Publication Date Title
WO2020029708A1 (zh) 图像处理方法和装置、电子设备、存储介质、程序产品
US11734851B2 (en) Face key point detection method and apparatus, storage medium, and electronic device
CN111369427B (zh) 图像处理方法、装置、可读介质和电子设备
US11670023B2 (en) Artificial intelligence techniques for performing image editing operations inferred from natural language requests
JP2023547917A (ja) 画像分割方法、装置、機器および記憶媒体
CN109543549A (zh) 用于多人姿态估计的图像数据处理方法及装置、移动端设备、服务器
US20260030809A1 (en) Image processing method and apparatus, storage medium, and electronic device
US20230036338A1 (en) Method and apparatus for generating image restoration model, medium and program product
CN111368685A (zh) 关键点的识别方法、装置、可读介质和电子设备
US20230017578A1 (en) Image processing and model training methods, electronic device, and storage medium
CN113762321B (zh) 多模态分类模型生成方法和装置
US20250104453A1 (en) Image description generation method and apparatus, device, medium, and product
CN110796721A (zh) 虚拟形象的颜色渲染方法、装置、终端及存储介质
US20250022136A1 (en) Image cropping method and apparatus, model training method and apparatus, electronic device, and medium
CN116580184B (zh) 一种基于YOLOv7的轻量化模型
CN114912629A (zh) 联合感知模型训练、联合感知方法、装置、设备和介质
US20250299463A1 (en) Segmentation-assisted detection and tracking of objects or features
US20260065488A1 (en) Image processing method and apparatus, electronic device, and storage medium
WO2024240222A1 (zh) 图像风格化处理方法、装置、设备、存储介质和程序产品
CN115131464A (zh) 图像生成方法、装置、设备以及存储介质
US20240221126A1 (en) Image splicing method and apparatus, and device and medium
CN116168451A (zh) 一种图像活体检测方法、装置、存储介质及电子设备
WO2024131630A1 (zh) 车牌识别方法、装置、电子设备及存储介质
CN115775318A (zh) 一种面向道路场景的目标实例分割方法和装置
CN116935425A (zh) 表格图像识别方法、装置、设备及计算机存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19845983

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2020554362

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 14/05/2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19845983

Country of ref document: EP

Kind code of ref document: A1