CN111476806B - Image processing method, image processing device, computer equipment and storage medium

Image processing method, image processing device, computer equipment and storage medium

Info

Publication number
CN111476806B
CN111476806B (application number CN202010578425.6A)
Authority
CN
China
Prior art keywords: target, feature, characteristic, matrix, region
Prior art date
Legal status
Active
Application number
CN202010578425.6A
Other languages
Chinese (zh)
Other versions
CN111476806A (en)
Inventor
胡一凡
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202010578425.6A
Publication of CN111476806A
Application granted
Publication of CN111476806B

Classifications

    • G06T 7/11: Region-based segmentation
    • G06T 7/194: Segmentation or edge detection involving foreground-background segmentation
    • G06F 18/253: Fusion techniques of extracted features
    • G06N 3/045: Combinations of networks
    • G06N 3/084: Backpropagation, e.g. using gradient descent
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06T 2207/20081: Training; Learning
    • G06T 2207/20084: Artificial neural networks [ANN]

Abstract

The application relates to an image processing method, an image processing device, a computer device and a storage medium. The method comprises the following steps: acquiring a target characteristic matrix corresponding to a target image to be processed; determining a plurality of characteristic areas obtained by dividing the target characteristic matrix; determining a target weight corresponding to each characteristic region based on the characteristic values in the characteristic regions; obtaining a weighted feature matrix corresponding to the target feature matrix according to each feature region and the corresponding target weight; and processing the weighted feature matrix to obtain a target processing result corresponding to the target image. The image processing result can be obtained based on an image processing model, the image processing model can be an artificial intelligence model, and the accuracy of the image processing result can be improved by adopting the method.

Description

Image processing method, image processing device, computer equipment and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image processing method and apparatus, a computer device, and a storage medium.
Background
With the development of science and technology, images are used more and more widely. In many scenarios, an image needs to be processed; for example, the image may be segmented to obtain an image corresponding to the foreground and an image corresponding to the background.
At present, images can be processed by an artificial-intelligence-based image processing model. However, the result produced by the image processing model often does not match the expected image processing result, that is, the accuracy of the obtained image processing result is low.
Disclosure of Invention
In view of the above, it is necessary to provide an image processing method, an apparatus, a computer device and a storage medium for solving the above technical problems.
A method of image processing, the method comprising: acquiring a target characteristic matrix corresponding to a target image to be processed; determining a plurality of characteristic areas obtained by dividing the target characteristic matrix; determining a target weight corresponding to each characteristic region based on the characteristic values in the characteristic regions; obtaining a weighted feature matrix corresponding to the target feature matrix according to each feature region and the corresponding target weight; and processing the weighted feature matrix to obtain a target processing result corresponding to the target image.
An image processing apparatus, the apparatus comprising: the target characteristic matrix acquisition module is used for acquiring a target characteristic matrix corresponding to a target image to be processed; the characteristic region determining module is used for determining a plurality of characteristic regions obtained by dividing the target characteristic matrix; a target weight obtaining module, configured to determine, based on the feature values in the feature regions, a target weight corresponding to each feature region; a weighted feature matrix obtaining module, configured to obtain a weighted feature matrix corresponding to the target feature matrix according to each feature region and the corresponding target weight; and the target processing result obtaining module is used for processing the weighted characteristic matrix to obtain a target processing result corresponding to the target image.
In some embodiments, the feature region determination module comprises: a division number acquisition unit for acquiring the number of area divisions; and the characteristic map dividing unit is used for dividing the target characteristic matrix according to the distribution of characteristic values in the target characteristic matrix to obtain the characteristic areas of the area dividing quantity.
In some embodiments, the feature map dividing unit is configured to: dividing a target characteristic value range corresponding to the target characteristic matrix to obtain sub-characteristic value ranges with the number of the sub-characteristic value ranges being the number of the region division; and dividing the characteristic points of which the characteristic values are in the same sub-characteristic value range in the target characteristic matrix into the same characteristic region to obtain a plurality of characteristic regions corresponding to the target characteristic matrix.
In some embodiments, the feature map dividing unit is configured to: acquiring a maximum eigenvalue and a minimum eigenvalue corresponding to the target eigenvalue matrix; according to the region division quantity, determining a characteristic value equipartition point in a target characteristic value range corresponding to the maximum characteristic value and the minimum characteristic value; and taking the range between the equal division points of the characteristic value as a sub characteristic value range.
In some embodiments, the target processing result is obtained by processing using an image processing model, and the division number obtaining unit is configured to: and acquiring the number of candidate result types corresponding to the image processing model, and acquiring the number of the region partitions according to the number of the candidate result types.
In some embodiments, the target weight derivation module comprises: the pooling unit is used for pooling the target feature matrix by taking a region as a unit to obtain pooling values respectively corresponding to the feature regions; the target weight determining unit is used for obtaining target weights corresponding to the characteristic regions according to the first pooling vector; the first pooling vector is composed of pooling values corresponding to the feature regions.
In some embodiments, the target feature matrix is a plurality of matrices, and the module for obtaining the first pooling vector comprises: and combining the pooling values corresponding to the characteristic areas in the target characteristic matrixes as vector values to obtain a first pooling vector.
In some embodiments, the target weight determination unit is to: inputting the first pooling vector into a first fully-connected layer to obtain a first weight vector; determining the region type corresponding to each feature region in the target feature matrix, splitting the first weight vector according to the region type of the feature region, and obtaining a second weight vector corresponding to each region type; and inputting the second weight vector into a second full-connection layer corresponding to the region type to obtain the target weight corresponding to each characteristic region.
In some embodiments, the target processing result obtaining module is configured to: processing the weighted feature matrix to obtain the segmentation class probability corresponding to each pixel point in the target image; determining a target class corresponding to the pixel point according to the segmentation class probability corresponding to the pixel point; and segmenting the target image according to the target category corresponding to the pixel point to obtain an image segmentation result.
In some embodiments, the target feature matrix acquisition module is to: inputting the target image into an image processing model, and performing feature extraction through a plurality of feature extraction layers in the image processing model to obtain a first feature matrix; and fusing the first characteristic matrix and the forward characteristic matrix of the first characteristic matrix to obtain a target characteristic matrix corresponding to the target image.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the steps of the image processing method described above when executing the computer program.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of the image processing method described above.
the image processing method, the image processing device, the computer equipment and the storage medium can determine a plurality of characteristic areas obtained by dividing the target characteristic matrix for the target characteristic matrix of the target image, determine the target weight corresponding to each characteristic area based on the characteristic value in the characteristic area, perform weighting processing on each characteristic area based on the target weight to obtain the weighted characteristic matrix, and obtain the image processing result based on the weighted characteristic matrix. Because the target weight corresponding to each characteristic region in the target characteristic matrix can be determined according to the characteristic value in the characteristic region, the characteristic region can be differentiated based on the weight reflecting the characteristic value difference of the characteristic region in the target characteristic matrix, so that the weighted characteristic matrix can reflect more accurate characteristics in different characteristic regions, and the accuracy of an image processing result can be improved.
Drawings
FIG. 1 is a diagram of an application environment of an image processing method in some embodiments;
FIG. 2 is a flow diagram illustrating a method of image processing in some embodiments;
FIG. 3 is a schematic illustration of segmentation of a target image in some embodiments;
FIG. 4 is a schematic diagram of an image processing model in further embodiments;
FIG. 5 is a schematic diagram illustrating a principle of region partition of a target feature matrix in some embodiments;
FIG. 6 is a schematic flow chart illustrating obtaining target weights corresponding to feature regions in some embodiments;
FIG. 7 is a schematic diagram of the processing of the weight determination module in some embodiments;
FIG. 8 is a network architecture diagram of an image processing model in some embodiments;
FIG. 9 is a schematic diagram of an image processing apparatus according to some embodiments;
FIG. 10 is a diagram of the internal structure of a computer device in some embodiments.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly includes computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
Computer Vision (CV) is a science that studies how to make machines "see"; more specifically, it uses cameras and computers in place of human eyes to identify, track and measure targets, and further processes the images so that they become more suitable for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and techniques in an attempt to build artificial intelligence systems that can capture information from images or multidimensional data. Computer vision technologies generally include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technologies, virtual reality, augmented reality, simultaneous localization and mapping, and other technologies, and also include common biometric technologies such as face recognition and fingerprint recognition.
Machine Learning (ML) is a multi-domain interdisciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other disciplines. It studies how a computer can simulate or realize human learning behavior so as to acquire new knowledge or skills and reorganize the existing knowledge structure to continuously improve its own performance. Machine learning is the core of artificial intelligence, is the fundamental way to make computers intelligent, and is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
Automatic driving technology generally includes technologies such as high-precision maps, environment perception, behavior decision-making, path planning, and motion control; autonomous driving technology has broad application prospects.
with the research and progress of artificial intelligence technology, the artificial intelligence technology is developed and applied in a plurality of fields, such as common smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical care, smart customer service, and the like.
The scheme provided by the embodiment of the application relates to the computer vision technology of artificial intelligence and the like, and is specifically explained by the following embodiment:
the image processing method provided by the embodiment of the application can be applied to the application environment shown in fig. 1. Wherein the terminal 102 communicates with the server 104 via a network. A user can send an image processing instruction, for example, an image processing instruction to divide an image, to the server 104 by operating the terminal 102. The image processing instruction may carry an image identifier of the target image or the target image itself, the server 104 may obtain a pre-stored target image according to the image identifier or may obtain the target image carried in the image processing instruction in real time, and execute the image processing method provided in the embodiment of the present application to obtain an image processing result (target processing result), for example, a result of segmenting the image, where the image is divided into a foreground portion and a background portion. The server 104 may return the image processing result to the terminal 106, and the server 104 may perform further processing according to the image segmentation result. As an actual example, during automatic driving, a visual perception device (terminal) in an automobile may acquire an ambient image in real time, transmit the ambient image to the server 104, and the server 104 performs segmentation processing on the ambient image to distinguish a drivable road surface from a non-drivable road surface from the ambient image, thereby making a driving decision.
The terminals 102 and 106 may be, but are not limited to, various personal computers, notebook computers, smart phones, automobiles, tablet computers, medical imaging devices, or portable wearable devices. Terminal 102 and terminal 106 may be the same terminal or different terminals. The server 104 may be implemented as a stand-alone server or as a server cluster comprised of multiple servers. It is understood that the image processing method provided by the embodiment of the present application may also be executed in a terminal.
In some embodiments, as shown in fig. 2, an image processing method is provided, which is described by taking the method as an example applied to the server in fig. 1, and includes the following steps:
step S202, a target characteristic matrix corresponding to a target image to be processed is obtained.
The target image is an image to be subjected to image processing, such as an image to be segmented, or an image to be subjected to target recognition. For example, the target image may be an image that needs to be segmented into foreground and background or an image that needs to identify whether a target object is present. The target image may be a medical image, such as an image obtained by cross-sectional scanning with a CT (Computed Tomography) imaging apparatus, and the medical image may be subjected to image segmentation into a region of interest in which the target object exists and a region of non-interest in which the target object does not exist. The target object may also be at least one of a plant or an item.
The target feature matrix includes a plurality of feature values used to reflect characteristics of the target image. A feature value is the value corresponding to a feature point in the feature matrix. For example, the gray value of each pixel point in the target image may be taken as the feature value of the corresponding feature point in the feature matrix, or features of the target image may be extracted to obtain the target feature matrix. The extracted target feature matrix may be referred to as a feature map: in image processing, a feature matrix obtained by extracting features from an image can itself be regarded as an image, and is therefore called a feature map. During feature extraction, the input image may be processed, for example convolved, using the model parameters of a feature extraction layer in the image processing model to obtain a feature matrix representing the features of the image. The feature extraction layer may have a plurality of feature channels, and the feature matrix extracted by each feature channel may be obtained as a target feature matrix. For example, the feature maps output by a feature extraction layer may be represented as H × W × C, where H is the height of the feature map, that is, the number of feature points in the height direction (one feature point corresponds to one feature value), W is the width of the feature map, that is, the number of feature points in the width direction, and C is the number of feature maps, which may equal the number of channels of the feature extraction layer.
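As an illustrative sketch only (not part of the claimed method), the following PyTorch snippet shows how a feature extraction layer with C feature channels produces C feature maps of size H × W; the layer configuration and tensor shapes are arbitrary assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical feature extraction layer: one convolution with C = 16 feature channels.
feature_extraction = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3, padding=1)

target_image = torch.randn(1, 1, 64, 64)   # one single-channel target image, H = W = 64
feature_maps = feature_extraction(target_image)
print(feature_maps.shape)                  # torch.Size([1, 16, 64, 64]): C feature maps of size H x W
```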
The image processing model is an artificial intelligence model for processing images, and may be a neural network model obtained through supervised training, for example an Inception model, a VGG (Visual Geometry Group) network, or a ResNet (Residual Neural Network). The feature extraction layer is a network layer in the image processing model for extracting features of an image; for example, a convolutional layer may be used to extract features of the image to obtain a feature map.
Specifically, the server may acquire an image sent by the terminal, and use the image as a target image to be processed, for example, an image scanned by a medical imaging device as the target image to be processed. The server may also acquire a pre-stored image, and take the image as a target image to be processed. The server can input the target image into the image processing model, and the target characteristic matrix is obtained by performing characteristic extraction through a characteristic extraction layer of the image processing model, and pixel values in the target image can also be obtained to form the target characteristic matrix.
In some embodiments, an image processing model may include a plurality of feature extraction layers, and the method provided in the embodiments of the present application may be performed on a feature matrix extracted by each feature extraction layer. The method provided by the embodiment of the present application may also be performed on the feature matrix extracted by the partial feature extraction layer. Or fusing the feature matrixes extracted by a plurality of feature extraction layers to obtain a target feature matrix. For example, the feature matrices extracted by the first layer and the third layer may be added to obtain a target feature matrix. Or fusing a feature matrix extracted by the feature extraction layer with a feature matrix formed by the gray value of the target image to obtain a target feature matrix.
Step S204, a plurality of characteristic areas obtained by dividing the target characteristic matrix are determined.
Specifically, "plurality" means at least two. The principle of segmenting the target feature matrix may be set as required, for example, the target feature matrix may be divided into a plurality of regions based on a semantic segmentation method. When a plurality of target feature matrices are available, each target feature matrix is divided to obtain a feature area corresponding to each target feature matrix.
It is understood that in the embodiment of the present application, feature points corresponding to one "feature region" may be continuously distributed or may be discretely distributed. For example, when the target feature matrix is divided, the target feature matrix is divided into two feature regions, feature points having a feature value greater than 100 are divided into a first feature region, and feature points having a feature value of 100 or less are divided into a second feature region.
In some embodiments, when the target feature matrix is segmented, the segmentation may be performed based on feature values of the target feature matrix, and feature points having the same feature are classified into the same region. For example, the segmentation may be performed based on the distribution of the feature values in the target feature matrix, and feature points having gray values distributed in the same range may be classified into the same feature region.
In some embodiments, the number of the feature regions may be set as needed, for example, may be a fixed number, or may be flexibly determined according to specific situations, for example, determined according to the size of the target feature matrix, where the larger the target feature matrix is, the larger the number of the feature regions is.
Step S206, based on the characteristic values in the characteristic regions, determining the target weight corresponding to each characteristic region.
The weight may represent the importance of the corresponding feature region in the target feature matrix. The weight may be obtained by processing the feature value in the feature region according to the model parameter. The model parameters are obtained by training the image processing model.
Specifically, the image processing model includes a weight determining module, configured to determine weights corresponding to each feature map region, and the target feature matrix may be input to the weight determining module in the image processing model, and the weight determining module may process feature values in the feature regions by using model parameters obtained through training, so as to obtain a target weight corresponding to each feature region.
For example, assuming there are C feature maps and each feature map is divided into K feature regions, the weight determination module may output K × C weights. As an actual example, assume that there are 2 target feature matrices and each feature map is divided into 3 feature regions, where the feature regions of the first target feature matrix are denoted T_{1,1}, T_{2,1}, T_{3,1} and the feature regions of the second target feature matrix are denoted T_{1,2}, T_{2,2}, T_{3,2}. The weight determination module may output 6 target weights, namely the target weights q_{1,1}, q_{2,1}, q_{3,1}, q_{1,2}, q_{2,2}, q_{3,2} respectively corresponding to T_{1,1}, T_{2,1}, T_{3,1}, T_{1,2}, T_{2,2}, T_{3,2}.
And step S208, obtaining a weighted feature matrix corresponding to the target feature matrix according to each feature area and the corresponding target weight.
Specifically, the feature values of each feature region may be multiplied by the target weight corresponding to that feature region to obtain the weighted feature matrix. For example, for the first target feature matrix, each feature value in T_{1,1} may be multiplied by q_{1,1}, each feature value in T_{2,1} by q_{2,1}, and each feature value in T_{3,1} by q_{3,1}, thereby obtaining the weighted feature matrix.
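A minimal sketch of this weighting step, assuming the feature regions are represented by an index map and the target weights are already known; all tensors below are hypothetical.

```python
import torch

F = torch.randn(4, 4)                    # one target feature matrix (H x W)
region_index = (F > F.median()).long()   # toy division into 2 feature regions, indexed 0 and 1
target_weights = torch.tensor([0.3, 0.9])

# Each feature value is multiplied by the target weight of the feature region it belongs to.
weighted_feature_matrix = target_weights[region_index] * F
```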
And step S210, processing the weighted feature matrix to obtain a target processing result corresponding to the target image.
Specifically, the target processing result is the image processing result obtained by processing the target image. It may be an image segmentation result, or a result corresponding to target detection, and is set according to actual needs. For example, the image processing result may be the position information of the target object in the target image, or the probability that each pixel point belongs to each segmentation category. The next feature extraction layer of the image processing model may be used to perform further feature extraction on the weighted feature matrix to obtain the feature matrix output by that layer, which is taken as the target feature matrix, and the process returns to step S204. The weighted feature matrix may also be input into an output layer to obtain the image processing result.
The image processing method may determine a plurality of feature regions obtained by dividing a target feature matrix of a target image, determine a target weight corresponding to each feature region based on feature values in the feature regions, perform weighting processing on each feature region based on the target weight to obtain a weighted feature matrix, and obtain an image processing result based on the weighted feature matrix. Because the target weight corresponding to each characteristic region in the target characteristic matrix can be determined according to the characteristic value in the characteristic region, the characteristic region can be differentiated based on the weight reflecting the characteristic value difference of the characteristic region in the target characteristic matrix, so that the weighted characteristic matrix can reflect more accurate characteristics corresponding to different characteristic regions, and the accuracy of the image processing result can be improved.
In some embodiments, processing the weighted feature matrix to obtain a target processing result corresponding to the target image includes: processing the weighted feature matrix to obtain the segmentation class probability corresponding to each pixel point in the target image; determining a target class corresponding to the pixel point according to the segmentation class probability corresponding to the pixel point; and segmenting the target image according to the target category corresponding to the pixel point to obtain an image segmentation result.
Specifically, when image segmentation is performed, it may be set in advance that the image is to be segmented into image portions corresponding to a plurality of segmentation categories, where "plurality" means at least two. For example, a face image may be segmented into a portion corresponding to the nose, a portion corresponding to the mouth, a portion corresponding to the eyes, and so on. The segmentation class probability is the probability that a pixel point belongs to a segmentation class, such as the probability that the pixel belongs to the nose, the probability that it belongs to the mouth, and the probability that it belongs to the eyes. After the segmentation class probabilities are obtained, the class with the maximum segmentation class probability for the pixel point may be taken as its target class. The target image is then segmented according to the target class of each pixel point to obtain the image portion corresponding to each target class. For example, for a given pixel point, if the segmentation class probability output by the image processing model for the nose portion is 0.1, the probability for the mouth portion is 0.8, and the probability for the eye portion is 0.1, then the target class of the pixel point is the mouth class.
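A small sketch of selecting the target class from the segmentation class probabilities, assuming the model outputs one probability channel per segmentation class; shapes and class names are illustrative.

```python
import torch

# probs: hypothetical segmentation class probabilities, one channel per class (K = 3), shape (K, H, W).
probs = torch.softmax(torch.randn(3, 64, 64), dim=0)
target_class = probs.argmax(dim=0)   # (H, W) map of target classes; e.g. 0 = nose, 1 = mouth, 2 = eye
```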
In some embodiments, obtaining a target feature matrix corresponding to a target image to be processed includes: inputting a target image into an image processing model, and performing feature extraction through a plurality of feature extraction layers in the image processing model to obtain a first feature matrix; and fusing the first characteristic matrix and the forward characteristic matrix of the first characteristic matrix to obtain a target characteristic matrix corresponding to the target image.
Specifically, the first feature matrix may be features extracted by a previous feature extraction layer of the weight determination module in the image processing model. The previous feature extraction layer corresponding to the weight determination module is a feature extraction layer connected with the weight determination module in the image processing model, and the previous feature extraction layer is arranged in front of the weight determination module. The forward feature matrix of the first feature matrix is a feature matrix extracted by a feature extraction layer before the previous feature extraction layer. For example, assuming that the first feature matrix is a feature matrix extracted by the 3 rd feature extraction layer, the forward feature matrix may be at least one of a feature matrix extracted by the 2 nd feature extraction layer or a feature matrix extracted by the 1 st feature extraction layer. Fusion may refer to addition. For example, the first feature matrix is added to the forward feature matrix to obtain the target feature matrix. The target image is subjected to feature extraction through a plurality of feature extraction layers, and a first feature matrix which reflects deep features of the target image can be extracted and obtained. And because the forward feature matrix is extracted before, and the number of the extracted feature layers is relatively small, more details in the target image can be reserved, so that the first feature matrix is fused with the forward feature matrix, and the feature of the target image can be better embodied by the target feature matrix.
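A minimal sketch of the fusion by element-wise addition described above; it assumes the two feature matrices already have the same shape (in practice the forward feature matrix may need to be resized or projected first).

```python
import torch

first_feature_matrix = torch.randn(16, 64, 64)    # extracted by the later feature extraction layer
forward_feature_matrix = torch.randn(16, 64, 64)  # extracted by an earlier feature extraction layer

# Fusion by element-wise addition to obtain the target feature matrix.
target_feature_matrix = first_feature_matrix + forward_feature_matrix
```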
In some embodiments, the image processing model provided in the embodiments of the present application may be an end-to-end image segmentation model: the input data is a target image to be segmented, and the output is the image segmented into K classes, namely a probability map output over K-1 or K channels, where the segmentation class probability of each pixel point in the probability map lies in the range [0, 1]. The target class of a pixel point is the channel class corresponding to the maximum segmentation class probability of that pixel point in the probability map. As shown in fig. 3, for example, assuming K = 3, the image may be divided into image portions corresponding to categories A, B, and C, respectively.
In some embodiments, for the image processing model, the weight determination module may be arranged after each feature extraction layer and connected with the feature extraction layer, or the weight determination module may be arranged after a part of the feature extraction layers and connected with the feature extraction layer. As shown in fig. 4, the image processing model may include 3 feature extraction layers, and the weight determination module may be disposed after the feature extraction layer 2. Then, for the feature matrix extracted by the feature extraction layer 1, the feature matrix may be directly input into the feature extraction layer 2, when the feature matrix extracted by the feature extraction layer 2 is obtained, the feature matrix is used as a target feature matrix, the target feature matrix obtained by dividing the feature area is input into the weight determination module, the weight determination module outputs the weight corresponding to each feature area, and each feature area of the target feature matrix is multiplied by the corresponding weight (weighting processing) to obtain a weighted feature matrix. The weighted feature matrix is input to the feature extraction layer 3 to continue feature extraction. It is to be understood that the weight determination module may be provided after the feature extraction layer 1 and the feature extraction layer 2. That is, the weight determining module provided in the embodiment of the present application may be pluggable, and may determine which feature extraction layers are connected to the module according to the actual needs, so that the target processing model has extensibility and high universality, and may be nested on different classical networks, including but not limited to ResNet series, inclusion series, VGG series, and the like.
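A rough sketch of the pluggable placement shown in fig. 4, with the weight determination module inserted after feature extraction layer 2; the module body is left as a placeholder and all layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class RegionWeightModule(nn.Module):
    """Placeholder for the weight determination module; the real module would weight feature regions."""
    def forward(self, x):
        return x

# Hypothetical backbone mirroring fig. 4: the module is inserted after feature extraction layer 2.
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1),   # feature extraction layer 1
    nn.Conv2d(16, 16, 3, padding=1),  # feature extraction layer 2
    RegionWeightModule(),             # pluggable weight determination module
    nn.Conv2d(16, 16, 3, padding=1),  # feature extraction layer 3
)
out = model(torch.randn(1, 1, 32, 32))
```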
In some embodiments, determining the plurality of feature regions obtained by segmenting the target feature matrix comprises: acquiring the number of area divisions; and segmenting the target characteristic matrix according to the distribution of characteristic values in the target characteristic matrix to obtain characteristic regions with the number of region partitions.
The number of the area divisions may be set as needed, and may be fixedly set to 4, for example. The target processing result is obtained by processing with an image processing model, and the number of the region partitions can also be determined according to the number of the candidate result types corresponding to the image processing model. For example, the number of region divisions is N times the number of candidate result categories, where N is a positive integer, and the number of candidate result categories may be set as the number of region divisions, for example. The number of candidate result categories corresponding to an image processing model refers to the number of categories of image processing results that the model can output. For example, for a K-class image segmentation model, the number of candidate result classes is K, so the target feature matrix may be divided into K blocks. By obtaining the number of the region partitions according to the number of the candidate result categories, the feature map can be flexibly and adaptively partitioned according to the number of the candidate result categories of the image processing model, and the flexibility is high.
The distribution of the feature values refers to how the feature values are distributed over the target feature matrix, for example, the feature value corresponding to each feature point, or the number of feature points whose feature values fall within each feature value range. After the feature value distribution is obtained, the server divides the target feature matrix according to the feature value distribution. For example, feature points whose feature values are ranked within the last 10% may be divided into a first feature region, feature points whose feature values are ranked within the top 20% may be divided into a second feature region, and the remaining feature points may be divided into a third feature region.
Specifically, after obtaining the distribution of the feature values, the server may divide the target feature matrix into feature areas of which the number is the area division number according to the distribution. For example, if the number of area divisions is 3, the target feature matrix is divided into 3 feature areas.
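A sketch of the percentile-based division described above (last 10%, top 20%, remainder), using quantile thresholds; the thresholds and the example tensor are illustrative assumptions.

```python
import torch

F = torch.randn(8, 8)                                      # target feature matrix
q10, q80 = torch.quantile(F, torch.tensor([0.10, 0.80]))

first_region = F <= q10                                    # feature values ranked in the last 10%
second_region = F >= q80                                   # feature values ranked in the top 20%
third_region = ~(first_region | second_region)             # remaining feature points
```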
In some embodiments, segmenting the target feature matrix according to the distribution of the feature values in the target feature matrix, and obtaining the feature regions with the region segmentation quantities includes: dividing a target characteristic value range corresponding to the target characteristic matrix to obtain sub-characteristic value ranges with the number of the sub-characteristic value ranges being the number of the area divisions; and dividing the characteristic points of the characteristic values in the same sub-characteristic value range in the target characteristic matrix into the same characteristic region to obtain a plurality of characteristic regions corresponding to the target characteristic matrix.
Specifically, the target feature value range is the range in which the feature values of the target feature matrix lie. For example, the maximum feature value and the minimum feature value of the target feature matrix may be obtained, and the range between the maximum feature value and the minimum feature value is used as the target feature value range.
In some embodiments, when performing segmentation, the range of feature values may be divided evenly. For example, the feature value equal division points in the target feature value range between the maximum feature value and the minimum feature value may be determined according to the number of region divisions, and the range between adjacent equal division points is used as a sub feature value range, so that the target feature matrix is divided into the number of region divisions of feature regions. A feature value equal division point is a point that uniformly divides the feature value range of the target feature matrix into a plurality of feature value ranges. For example, if the target feature matrix needs to be divided into 3 feature regions, the maximum value of the target feature matrix is 180, and the minimum value is 30, then the equal division points are 80 and 130. The first sub feature value range runs from the minimum value 30 to 80, and the feature points whose feature values lie between 30 and 80 are divided into a first feature region. The second sub feature value range runs from 80 to 130, and the feature points whose feature values lie between 80 and 130 are divided into a second feature region. The third sub feature value range runs from 130 to 180, and the feature points whose feature values lie between 130 and 180 are divided into a third feature region. It is understood that 80 may fall within the first sub feature value range or within the second sub feature value range. The formula for determining the feature value equal division points can be expressed as formula (1), where torch.max represents taking the maximum value, torch.min represents taking the minimum value, K represents the number of region divisions, m represents the serial number of the equal division point, F_x represents the x-th target feature matrix, and d_{m,x} represents the m-th equal division point of the x-th target feature matrix:

d_{m,x} = torch.min(F_x) + m × (torch.max(F_x) - torch.min(F_x)) / K    (1)
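A sketch of formula (1) and the resulting region assignment, reproducing the numeric example above (maximum 180, minimum 30, K = 3 gives dividing points 80 and 130); the helper name and the use of torch.bucketize are assumptions.

```python
import torch

def equal_division_points(F, K):
    """Formula (1): K - 1 equally spaced dividing points between the minimum and maximum feature value."""
    lo, hi = torch.min(F), torch.max(F)
    return torch.stack([lo + m * (hi - lo) / K for m in range(1, K)])

F = torch.tensor([[30.0, 75.0], [120.0, 180.0]])
d = equal_division_points(F, 3)        # tensor([ 80., 130.])
region_index = torch.bucketize(F, d)   # 0, 1 or 2: the sub feature value range of each feature point
```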
In some embodiments, when performing segmentation, the range of feature values may also be divided non-uniformly. The specific setting can be according to needs.
In the embodiment of the application, the feature points with the feature values in the same sub-feature value range are divided into the same feature region, so that the region division can be performed on the target feature matrix based on the size of the feature values, the feature points with the similar feature values are divided into the same feature region, and therefore the calculated target weight is the weight of the feature regions with the similar feature values, and the weight corresponding to the feature regions can be determined adaptively according to the difference of the feature values between the feature regions, so that the flexibility and the accuracy of the obtained target weight are improved.
In some embodiments, determining the target weight corresponding to each feature region based on the feature values in the feature regions includes: performing pooling treatment on the target characteristic matrix by taking the area as a unit to obtain pooling values respectively corresponding to each characteristic area; obtaining target weights corresponding to the characteristic regions according to the first pooling vector; the first pooling vector is composed of pooling values corresponding to the feature regions.
Specifically, the target feature matrix may be input into a pooling layer of the weight determining module, and the pooling layer performs pooling on the target feature matrix by using the region as a unit to obtain pooling values respectively corresponding to each feature region; and inputting the first pooling vector into a weight determination layer of a weight determination module to obtain target weights corresponding to the characteristic regions.
The pooling layer is used for pooling the feature map, and the pooling may be maximum pooling or average pooling. Pooling the target feature matrix in units of regions means that, when pooling the target feature matrix, pooling is performed separately for each feature region. In this way, differences between different regions are not cancelled out by global pooling. For example, when an image needs to be segmented into foreground and background and the image has a large background and a small target, the global pooling result is close to the pooled value of the background and the foreground region is essentially ignored; pooling in units of regions reduces the likelihood of ignoring the foreground region.
As an actual example, assuming that the target feature matrix is divided into two feature areas, an average value of feature values in a first feature area may be obtained as a pooling value corresponding to the first feature area. An average value of the feature values in the second feature region may be obtained as the pooling value corresponding to the second feature region.
The weight determination layer is a network layer after the pooling layer in the weight determination module and may include fully connected layers. For example, the weight determination layer may include a first fully connected layer, a first activation function layer (e.g., a ReLU layer), a second fully connected layer, a second activation function layer (e.g., a sigmoid layer), and the like, where the ReLU layer processes its input using the ReLU function and the sigmoid layer processes its input using the sigmoid function.
In some embodiments, after obtaining the pooling value, the pooling value is used as a vector value in the first pooling vector to form the first pooling vector. And inputting the first pooling vector into a weight determination layer of a weight determination module to obtain the target weight corresponding to each characteristic region.
In some embodiments, the target feature matrix is a plurality of matrices, and the step of obtaining the first pooling vector comprises: and taking the pooling values corresponding to the characteristic areas in the target characteristic matrixes as vector values to obtain first pooling vectors.
Specifically, a feature extraction layer in the image processing model may have C feature channels, each feature channel outputting one feature matrix, so that the number of target feature matrices is C. The first pooling vector is obtained by combining the pooling values as vector values. For example, if each target feature matrix is divided into K feature regions, there are K × C pooling values in total, and each pooling value is taken as a vector value to form a first pooling vector [s_{1,1}, ..., s_{K,C}] having K × C vector values, where the first subscript of s denotes the serial number of the feature region and the second subscript denotes the serial number of the target feature matrix in which the feature region is located. For example, s_{1,1} represents the pooling value corresponding to the 1st feature region in the target feature matrix corresponding to the 1st feature channel, and s_{K,C} represents the pooling value corresponding to the K-th feature region in the target feature matrix corresponding to the C-th feature channel (the C-th target feature matrix).
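A minimal sketch of assembling the first pooling vector from region-wise average pooling over C target feature matrices with K feature regions each; the tensors, the ordering of the K × C entries, and the random region assignment are illustrative assumptions.

```python
import torch

C, K, H, W = 2, 3, 8, 8
feature_maps = torch.randn(C, H, W)               # C target feature matrices
region_index = torch.randint(0, K, (C, H, W))     # hypothetical feature region of each feature point

pooling_values = []
for x in range(C):                                # for each target feature matrix
    for m in range(K):                            # average pooling over each feature region
        in_region = region_index[x] == m
        pooling_values.append(feature_maps[x][in_region].mean())
first_pooling_vector = torch.stack(pooling_values)  # K * C vector values
```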
In the embodiment of the application, since the first pooling vector is composed of pooling values of each feature region in the plurality of target feature matrices, and the first pooling vector is input to the weight determination layer of the weight determination module for processing, in the process of determining the weight corresponding to the feature region, feature information corresponding to the feature region between different target feature matrices and feature information corresponding to different feature regions of the same target feature matrix can be interacted, so that more valuable weight information can be generated, and the accuracy of the obtained target weight corresponding to the feature region can be improved.
In some embodiments, pooling the feature regions of the target feature matrix may be represented by formula (2), where mask denotes a binary matrix whose values are 0 and 1. When the mask is applied to the target feature matrix, the feature points at the positions with value 1 in the mask are processed, and the feature points at the positions with value 0 are not processed. F_x denotes the x-th target feature matrix, that is, the target feature matrix corresponding to the x-th feature channel, where x is an integer between 1 and C; s_{m,x} denotes the pooling value of the m-th feature region of the x-th target feature matrix; ⊙ is the element-wise product of two matrices; sum denotes summing all elements of a matrix; K denotes the number of region divisions; m denotes the m-th feature region in a target feature matrix and is an integer between 1 and K; mask_{m,x} denotes the m-th mask corresponding to the x-th target feature matrix, whose size is the same as that of F_x, namely H × W. The masks may be calculated according to formula (3) and formula (4), where formula (4) calculates the 1st mask of the target feature matrix (m = 1) and formula (3) calculates the m-th mask when m is greater than 1. For the last mask of the target feature matrix, that is, the K-th mask, formula (5), which obtains it as the complement of the other masks (J denoting the all-ones matrix of size H × W), may also be adopted to improve the calculation efficiency. The sign function returns the sign of its input: 1 for values greater than 0, -1 for values less than 0, and 0 for values equal to 0. ReLU (Rectified Linear Unit) denotes the rectified linear function. The equal division points d_{m,x} are given by formula (1).

s_{m,x} = sum(mask_{m,x} ⊙ F_x) / sum(mask_{m,x})    (2)

mask_{m,x} = ReLU(sign(d_{m,x} - F_x)) - ReLU(sign(d_{m-1,x} - F_x))    (3)

mask_{1,x} = ReLU(sign(d_{1,x} - F_x))    (4)

mask_{K,x} = J - Σ_{m=1}^{K-1} mask_{m,x}    (5)
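The following sketch implements formulas (2) to (5) as reconstructed above; note that expressing the last mask as the complement of the others (formula (5)) is an assumption made for this illustration, and the input tensors reuse the numeric example given earlier.

```python
import torch

def region_masks(F, d):
    """Binary masks per formulas (3)-(5); d holds the K - 1 equal division points from formula (1)."""
    K = len(d) + 1
    masks = [torch.relu(torch.sign(d[0] - F))]          # formula (4): mask of the 1st feature region
    for m in range(1, K - 1):                           # formula (3): masks of the middle feature regions
        masks.append(torch.relu(torch.sign(d[m] - F)) - torch.relu(torch.sign(d[m - 1] - F)))
    masks.append(torch.ones_like(F) - sum(masks))       # formula (5): last mask as the complement (assumed form)
    return masks

def region_average_pooling(F, masks):
    """Formula (2): average pooling of each feature region through its mask."""
    return torch.stack([(mask * F).sum() / mask.sum() for mask in masks])

F = torch.tensor([[30.0, 75.0], [120.0, 180.0]])
d = torch.tensor([80.0, 130.0])
pooled = region_average_pooling(F, region_masks(F, d))  # one pooling value per feature region
```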
Taking the division of the target feature matrix into 2 feature regions and the calculation of the mask of the 1st feature region, mask_{1,x}, as an example, the principle of calculating the pooling value through formulas (1) to (5) in the embodiment of the present application is analyzed as follows. d_{1,x} - F_x is the matrix obtained by subtracting each feature value in the target feature matrix from the 1st equal division point corresponding to the target feature matrix, and its size is the same as that of F_x, namely H × W. Since the sign function is applied to d_{1,x} - F_x, each value in the resulting matrix is 0, -1 or 1: a value of "0" indicates that the feature value of the feature point at the corresponding position of F_x is equal to d_{1,x}; a value of "-1" indicates that the feature value of the feature point at the corresponding position is greater than d_{1,x}; a value of "1" indicates that the feature value of the feature point at the corresponding position is less than d_{1,x}. Since the ReLU function takes the maximum of its input and 0, the mask mask_{1,x} obtained by calculating ReLU(sign(d_{1,x} - F_x)) contains only the values "0" and "1", and the positions where the value is "1" correspond exactly to the 1st feature region of F_x. Therefore, the element-wise product mask_{1,x} ⊙ F_x is equivalent to separating out the 1st feature region of F_x, sum(mask_{1,x}) gives the number of feature points in the 1st feature region, and summing the feature values of the feature region, that is, calculating sum(mask_{1,x} ⊙ F_x), and then dividing this sum by sum(mask_{1,x}) performs average pooling on the 1st feature region of F_x.
As an actual example, as shown in fig. 5, the left graph is a target feature matrix and the right graph is the feature value histogram of the target feature matrix; the division is performed using the feature value histogram information. For example, if the image segmentation result divides the image into 2 parts, the target feature matrix may be divided into 2 parts. Since the maximum value in the target feature matrix is 29 and the minimum value is -27, the equal division point may be calculated as (maximum value + minimum value)/2, that is, (-27 + 29)/2 = 1. The target feature matrix can thus be divided into two feature regions: a first feature region (the portion with feature values smaller than 1), displayed in gray, and a second feature region (the portion with feature values larger than 1), displayed in white. When pooling the feature regions, in the mask corresponding to the first feature region the matrix values at the positions of the first feature region are 1 and the matrix values at the other positions are 0; in the mask corresponding to the second feature region, the matrix values at the positions of the second feature region are 1 and the matrix values at the other positions are 0.
In some embodiments, as shown in fig. 6, obtaining the target weight corresponding to each feature region according to the first pooling vector includes:
Step S602, inputting the first pooling vector into the first fully-connected layer to obtain a first weight vector.
Specifically, the fully-connected (FC) layer performs full-connection processing on the first pooling vector to obtain the intermediate weight corresponding to each feature region in each target feature matrix, and these intermediate weights form the first weight vector. The first weight vector may be output directly by the first fully-connected layer, or by another network layer connected after it, for example a ReLU layer. By inputting the first pooling vector into the first fully-connected layer, the feature information of corresponding feature regions across different target feature matrices and the feature information of different feature regions within the same target feature matrix can interact, so that a more informative first weight vector is obtained.
Step S604, determining the region category corresponding to each feature region in the target feature matrix, and splitting the first weight vector according to the region categories of the feature regions to obtain a second weight vector corresponding to each region category.
Specifically, the category of a feature region is determined by the partitioning rule of the target feature matrix. For example, when the target feature matrix is divided by a semantic division method, regions with the same semantic division feature are regarded as one category. For another example, when the target feature matrix is divided according to feature value ranges, the feature regions can be classified by feature value range: in the target feature matrix, feature regions whose feature values lie between the minimum value and the first bisection point belong to the first region category, and feature regions whose feature values lie between the first bisection point and the second bisection point belong to the second region category. As a practical example, assume there are 2 target feature matrices and that each target feature matrix is divided into 3 feature regions by feature value range. Denote the feature regions of the first target feature matrix as T11, T21 and T31, and the feature regions of the second target feature matrix as T12, T22 and T32. Then T11 and T12 belong to the same region category, T21 and T22 belong to the same region category, and T31 and T32 belong to the same region category.
The first weight vector is composed of the intermediate weights corresponding to each feature region in the plurality of target feature matrices; the intermediate weights are further processed, for example by step S606, to obtain the target weights. A second weight vector is a sub-vector of the first weight vector composed of the intermediate weights of one region category. For example, the intermediate weights corresponding to T11 and T12 are used as vector values to form one second weight vector, the intermediate weights corresponding to T21 and T22 form another second weight vector, and the intermediate weights corresponding to T31 and T32 form a third second weight vector. That is, with 3 region categories, each region category corresponds to 1 second weight vector, giving 3 second weight vectors in total, as sketched below.
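The regrouping of the first weight vector into per-category second weight vectors can be sketched as follows; the layout of the vector (grouped matrix-by-matrix) is an assumption made only for illustration, since the ordering of the intermediate weights is not fixed here.

```python
import numpy as np

def split_by_region_category(first_weight_vector, num_matrices, num_regions):
    """Regroup a first weight vector of length C*K into K second weight
    vectors of length C, one per region category.
    Assumed layout: [weights of matrix 1, regions 1..K, weights of matrix 2, ...]."""
    w = np.asarray(first_weight_vector).reshape(num_matrices, num_regions)
    # Column k collects the intermediate weights of region category k
    return [w[:, k] for k in range(num_regions)]

# 2 target feature matrices, 3 region categories -> 3 second weight vectors of length 2
second_vectors = split_by_region_category([0.1, 0.5, 0.9, 0.2, 0.6, 0.8], 2, 3)
```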
Step S606, inputting each second weight vector into the second fully-connected layer corresponding to its region category to obtain the target weight corresponding to each feature region.
Specifically, the second weight vectors are respectively input into the corresponding second fully-connected layers for processing, so as to obtain a target weight vector for each region category; each vector value of a target weight vector is the weight of the feature region corresponding to that value. The target weight vector may be output directly by the second fully-connected layer, or another network layer may be connected after the second fully-connected layer, for example an activation function layer such as a sigmoid layer, and the target weight vector is then output through the activation of that layer.
For example, assume there are C target feature matrices and each target feature matrix is divided into K feature regions, so that there are K region categories. Then K target weight vectors may be output, one for each region category, and each target weight vector contains C values. The target weight vector of the m-th region category can be written as w_m = [w_m,1, w_m,2, ..., w_m,C], where w_m,1 denotes the target weight corresponding to the m-th feature region of the 1st target feature matrix.
After the target weights corresponding to the feature regions are obtained, obtaining the weighted feature matrix corresponding to the target feature matrix from each feature region and its corresponding target weight can be expressed by formulas (6) and (7), where X_w denotes the weighted feature matrix, X denotes the target feature matrix, M_k denotes the mask of the k-th feature region, w_k denotes its target weight, and A is obtained by taking the element-wise product of the target weight of each feature region with the corresponding mask and then adding the results:

X_w = A ⊙ X    (6)

A = Σ_{k=1..K} w_k · M_k    (7)
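Under the notation above, formulas (6) and (7) can be sketched in a few lines of Python; the function and argument names are illustrative assumptions.

```python
import numpy as np

def weighted_feature_matrix(X, masks, weights):
    """Apply formulas (6) and (7): A = sum_k w_k * M_k, then X_w = A ⊙ X.
    X       : (H, W) target feature matrix
    masks   : list of K binary (H, W) masks, one per feature region
    weights : list of K scalar target weights"""
    A = sum(w * M for w, M in zip(weights, masks))   # formula (7)
    return A * X                                      # formula (6), element-wise product
```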
In this embodiment of the application, the first weight vector is split according to the region categories of the feature regions to obtain the second weight vector corresponding to each region category, and each second weight vector is input into its second fully-connected layer for processing, so that information interaction between feature regions of the same region category can be realized and more valuable information can be learned.
Fig. 7 is a schematic diagram illustrating the processing principle of the weight determination module in some embodiments. C target feature matrices of size H×W are extracted by the feature extraction layer, and each target feature matrix is divided into K feature regions. The K feature regions are pooled respectively, and the resulting pooling values form the first pooling vector; here "maskKx pooling" means that the K-th feature region of the x-th target feature matrix is pooled through maskKx, corresponding to formula (2), where maskKx denotes the mask corresponding to the K-th feature region in the x-th feature map. The first pooling vector is input into the first fully-connected layer, where r is a scaling factor that can be set according to the actual situation, for example 1. The first pooling vector is processed by the first fully-connected layer and the ReLU layer to obtain the first weight vector. The first weight vector can then be split according to the region categories of the feature regions to obtain the second weight vectors corresponding to the respective region categories; each second weight vector is input into the second fully-connected layer corresponding to its region category and passes through the second fully-connected layer and the sigmoid layer, and the target weight vectors are output, giving the target weight corresponding to each feature region. The target weight is multiplied by the mask corresponding to the feature region, which corresponds to the term on the right of the equals sign in formula (7); for example, the weighting of mask1x means that the 1st feature region of the x-th target feature matrix is weighted through mask1x (that is, w_1 · M_1 is computed). From formula (7), A is obtained, and according to formula (6), A and the target feature matrix undergo an element-wise product calculation, that is, the weighting processing: the values at corresponding positions of the two matrices are multiplied to obtain the weighted feature matrix, so that different attention is applied to different feature regions. When the image processing model is based on a ResNet (residual network) structure, the weighted feature matrix obtained by this processing may be added to the input of the feature extraction layer, that is, identity mapping is performed. A module-level sketch is given below.
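The following is a compact PyTorch sketch of the weight determination module described above; the hidden size produced by the scaling factor r and the way the first weight vector is reshaped into per-category blocks are assumptions made for illustration, not details fixed by the embodiment.

```python
import torch
import torch.nn as nn

class WeightDetermination(nn.Module):
    """Sketch: region pooling -> first FC + ReLU -> split by region category ->
    per-category second FC + sigmoid -> target weights -> formulas (7) and (6)."""

    def __init__(self, num_maps: int, num_regions: int, r: int = 1):
        super().__init__()
        self.K = num_regions
        hidden = max(num_maps // r, 1)
        # first fully-connected layer over the C*K pooling values
        self.fc1 = nn.Linear(num_maps * num_regions, hidden * num_regions)
        self.relu = nn.ReLU()
        # one second fully-connected layer per region category
        self.fc2 = nn.ModuleList([nn.Linear(hidden, num_maps) for _ in range(num_regions)])

    def forward(self, feats: torch.Tensor, masks: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) target feature matrices
        # masks: (B, C, K, H, W) binary masks of the K feature regions of each map
        B, C, H, W = feats.shape
        region_sum = (masks * feats.unsqueeze(2)).sum(dim=(-2, -1))        # (B, C, K)
        region_cnt = masks.sum(dim=(-2, -1)).clamp(min=1.0)
        pooled = (region_sum / region_cnt).reshape(B, C * self.K)          # first pooling vector
        w = self.relu(self.fc1(pooled)).reshape(B, self.K, -1)             # split by category
        # per-category second FC + sigmoid -> target weights of shape (B, K, C)
        target = torch.stack(
            [torch.sigmoid(self.fc2[k](w[:, k])) for k in range(self.K)], dim=1)
        # formula (7): A = sum_k w_k * M_k; formula (6): element-wise product with feats
        A = (target.permute(0, 2, 1)[..., None, None] * masks).sum(dim=2)  # (B, C, H, W)
        return A * feats
```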
In some embodiments, the network structure of the image processing model may be as shown in fig. 8, where B1 denotes a network module in which no weight determination module is connected after the feature extraction layer, for example a plain ResNet block. B2 denotes a network module in which a weight determination module is connected after the feature extraction layer, for example a weight determination module connected after the residual branch in a ResNet block, where the weights produced by the weight determination module weight the feature map output by the residual branch. C denotes a convolutional layer, the number under each module is the number of feature channels output through that module, and the number to the left of B1 denotes the size of the feature map. The arrows indicate the data flow and the plus sign inside a circle indicates addition. For example, for the B2 connected to the last convolutional layer, its input is the sum of the output of the previous B2 and the output of the second B1. A sketch of such a weighted residual block is given below.
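A brief sketch of how the weight determination module might be attached to a residual block follows; WeightedResidualBlock and make_region_masks are illustrative names, and the weight module is assumed to take feature maps and their region masks as in the sketch above.

```python
import torch.nn as nn

class WeightedResidualBlock(nn.Module):
    """Residual branch -> weight determination -> add the block input (identity mapping)."""

    def __init__(self, residual: nn.Module, weight_module: nn.Module, make_region_masks):
        super().__init__()
        self.residual = residual                     # e.g. the conv stack of a ResNet block
        self.weight_module = weight_module           # e.g. WeightDetermination(C, K)
        self.make_region_masks = make_region_masks   # splits each feature map into K regions

    def forward(self, x):
        f = self.residual(x)                         # feature maps from the residual branch
        masks = self.make_region_masks(f)            # (B, C, K, H, W) binary region masks
        weighted = self.weight_module(f, masks)      # weighted feature matrices
        return weighted + x                          # skip connection, as in ResNet
```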
In some embodiments, the image processing model is trained in advance. When training the image processing model, different data sets may be used according to the needs of the task; for example, a medical image data set containing 4 modalities and 4 classes may be used to obtain an image processing model for segmenting medical images, or the model may be trained with the Cityscapes data set, which contains 30 classes. The original images may also be pre-processed, for example by normalizing the image data so that pixel values lie between 0 and 1. In addition, when the region of the target object has no fixed shape, size or orientation, data augmentation operations such as flipping, rotation, scaling or contrast enhancement may be applied to the original images, which increases the number of training samples and adds variation in orientation and scale. During training of the image processing model, the model parameters can be updated by an Adam-based gradient descent method, and the loss value of the model can be calculated by a Dice loss function. The initial learning rate at model training may be 0.05, and the beta parameter in Adam may be between 0.95 and 0.9995. During training, a predicted segmentation probability map is obtained through the neural network model (the image processing model), a segmentation loss value is obtained from the difference between the predicted segmentation probability map and the true segmentation probability map corresponding to the label, the error gradient is calculated from the loss value by minimizing the loss function, and the network is updated through back propagation of the gradient. After model training is finished, the target image is input into the image processing model and the image processing result is obtained from the final prediction probability values. A minimal training-loop sketch is given below.
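The following PyTorch sketch follows this description (Adam-based gradient descent, Dice loss, initial learning rate 0.05); the model, data loader, one-hot label format and the concrete beta value (chosen within the stated 0.95-0.9995 range) are placeholders rather than values taken from the embodiment.

```python
import torch
import torch.nn as nn

def dice_loss(probs: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Soft Dice loss between predicted probabilities and one-hot labels."""
    inter = (probs * target).sum(dim=(-2, -1))
    union = probs.sum(dim=(-2, -1)) + target.sum(dim=(-2, -1))
    return 1.0 - ((2.0 * inter + eps) / (union + eps)).mean()

def train(model: nn.Module, loader, epochs: int = 50, lr: float = 0.05, beta2: float = 0.999):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr, betas=(0.9, beta2))
    model.train()
    for _ in range(epochs):
        for images, labels in loader:                    # labels: one-hot segmentation maps
            probs = torch.softmax(model(images), dim=1)  # predicted segmentation probability map
            loss = dice_loss(probs, labels)              # segmentation loss value
            optimizer.zero_grad()
            loss.backward()                              # back-propagate the error gradient
            optimizer.step()
```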
The method provided in the embodiments of the present application can be applied to image segmentation tasks: an image segmentation model can be obtained through pre-training and then used to perform image segmentation. Image segmentation is an important research direction in the field of computer vision and a key part of image semantic understanding. It refers to the process of dividing an image into several regions with similar properties; from a mathematical point of view, it is the process of dividing an image into mutually disjoint regions. Techniques such as scene object segmentation, human foreground-background segmentation, face and human body parsing and matting, and three-dimensional reconstruction are widely applied in industries such as autonomous driving, augmented reality, security monitoring and medical image analysis. The principle of segmentation is to keep the similarity within each sub-image as large as possible and the similarity between sub-images as small as possible, and the gray value distributions or feature value distributions of different regions to be segmented usually differ significantly. The scheme provided by the embodiments of the present application can therefore apply different weights to regions with different feature value distributions, use the optimized feature value information to find the feature regions that deserve particular attention, and make the network pay more attention to the differences between the information of different regions, so that more accurate features of the different regions are extracted and more accurate image segmentation is achieved. Moreover, when the feature regions are obtained by segmenting according to the distribution of feature values, the histogram of the feature map can be used during segmentation to assign different adaptive weights to different regions, and the model parameters used to determine the weights can be obtained without supervised training of the weights alone, so the method is easy to implement.
The following describes an image processing method provided in an embodiment of the present application with reference to fig. 1 by taking image segmentation as an example, including the following steps:
1. and acquiring a target image to be processed.
Specifically, the front end A (the terminal 102) may acquire medical image data and pre-process it, for example, but not limited to, by normalizing, translating, rotating or scaling the image pixel values, to obtain a target image, and then upload the target image to the back end, which takes it as the target image to be processed. The front end A may be a medical device for acquiring medical images. The back end may be a computer device, such as the server of fig. 1, for extracting features of the medical image and classifying the image based on the extracted features. The front end B (the terminal 106) may be a display device for displaying the classification result of the medical image. The front end A and the front end B may be the same device or different devices.
2. And inputting the target image into an image processing model, and performing feature extraction through a plurality of feature extraction layers in the image processing model to obtain a first feature matrix extracted by a previous feature extraction layer corresponding to the weight determination module.
Specifically, the back end may input the target image into the image processing model, and perform feature extraction to obtain the first feature matrix. For example, a first feature matrix is obtained through feature extraction of 3 feature extraction layers.
3. And fusing the first characteristic matrix and the forward characteristic matrix of the first characteristic matrix to obtain a target characteristic matrix corresponding to the target image.
Specifically, the back end may add the first feature matrix output by the 3rd feature extraction layer to the feature matrix output by the preceding 2 feature extraction layers to obtain the target feature matrix.
4. And acquiring the number of candidate result types corresponding to the image processing model, and taking the number of the candidate result types as the number of region division.
Specifically, assuming that an image processing model is used to divide the target image into 3 categories, the resulting number of categories is 3, with 3 as the number of region divisions.
5. Acquiring the maximum feature value and the minimum feature value corresponding to the target feature matrix; determining, according to the number of region divisions, feature value equal division points within the target feature value range defined by the maximum feature value and the minimum feature value; and taking the ranges between the equal division points as sub-feature value ranges.
Specifically, assuming that the target feature matrix needs to be divided into 3 feature regions, the maximum value of the target feature matrix is 180, and the minimum value is 30, the feature value equal division points are 80 and 130.
6. And dividing the characteristic points of the characteristic values in the same sub-characteristic value range in the target characteristic matrix into the same characteristic region to obtain a plurality of characteristic regions corresponding to the target characteristic matrix.
7. And inputting the target characteristic matrix into a pooling layer of a weight determination module, and pooling the target characteristic matrix by using the pooling layer with the region as a unit to obtain pooling values respectively corresponding to the characteristic regions.
Specifically, the back end may obtain an average value of the feature values in each feature region as a pooling value corresponding to the feature region.
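Steps 5 to 7 can be illustrated with a short NumPy sketch; the function name and the toy matrix are examples only, and the equal-division points reproduce the values given in step 5 (minimum 30, maximum 180, 3 regions).

```python
import numpy as np

def split_and_pool(X: np.ndarray, k: int):
    """Divide feature matrix X into k feature regions by equally spaced
    division points between its minimum and maximum (steps 5-6), then
    average-pool each region (step 7)."""
    points = np.linspace(X.min(), X.max(), k + 1)[1:-1]   # k-1 equal division points
    labels = np.digitize(X, points)                        # region index 0..k-1 per feature point
    masks = [(labels == i) for i in range(k)]
    pools = [X[m].mean() if m.any() else 0.0 for m in masks]
    return masks, pools

# min 30, max 180, 3 regions -> division points 80 and 130
masks, pools = split_and_pool(np.array([[30., 95.], [140., 180.]]), 3)
```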
8. Inputting the first pooling vector into a weight determination layer of a weight determination module to obtain target weights corresponding to all the characteristic regions; the first pooling vector is composed of pooling values corresponding to the feature regions.
Specifically, the weight determination module comprises a pooling layer, a first fully-connected layer, a ReLU layer, a second fully-connected layer and a sigmoid layer; after the pooling layer, the first pooling vector passes through the first fully-connected layer, the ReLU layer, the second fully-connected layer and the sigmoid layer to obtain the target weight corresponding to each feature region.
9. And obtaining a weighted feature matrix corresponding to the target feature matrix according to each feature area and the corresponding target weight.
Specifically, the feature value of each feature point in a feature region is multiplied by the corresponding target weight to obtain the weighted feature value of that feature point in the weighted feature matrix.
10. And processing the weighted feature matrix to obtain a target processing result corresponding to the target image.
Specifically, the server may input the weighted feature matrix into the next feature extraction layer to continue feature extraction, or input the weighted feature matrix into the output layer to obtain an image processing result. For example, the classification probability corresponding to each pixel point is obtained, and the image is divided into image parts corresponding to three categories according to the classification probability. The back end can send the image processing result to the front end B for display.
It should be understood that, although the steps in the above flowcharts are shown in an order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated otherwise, the execution of these steps is not strictly limited to the order shown, and they may be performed in other orders. Moreover, at least some of the steps in the above flowcharts may include several sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments, and their execution order is not necessarily sequential; they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In some embodiments, as shown in fig. 9, there is provided an image processing apparatus, which may be a part of a computer device using a software module or a hardware module, or a combination of the two, the apparatus specifically includes: a target feature matrix obtaining module 902, a feature region determining module 904, a target weight obtaining module 906, a weighted feature matrix obtaining module 908, and a target processing result obtaining module 910. Wherein:
a target feature matrix obtaining module 902, configured to obtain a target feature matrix corresponding to a target image to be processed.
A feature region determining module 904, configured to determine a plurality of feature regions obtained by segmenting the target feature matrix.
And a target weight obtaining module 906, configured to determine a target weight corresponding to each feature region based on the feature value in the feature region.
A weighted feature matrix obtaining module 908, configured to obtain a weighted feature matrix corresponding to the target feature matrix according to each feature area and the corresponding target weight.
And an object processing result obtaining module 910, configured to process the weighted feature matrix to obtain an object processing result corresponding to the object image.
In some embodiments, the feature region determination module comprises: a division number acquisition unit for acquiring the number of region divisions; and a feature map dividing unit for dividing the target feature matrix according to the distribution of the feature values in the target feature matrix to obtain the feature regions of the region division number.
In some embodiments, the feature map dividing unit is configured to: dividing a target characteristic value range corresponding to the target characteristic matrix to obtain sub-characteristic value ranges with the number of the sub-characteristic value ranges being the number of the area divisions; and dividing the characteristic points of the characteristic values in the same sub-characteristic value range in the target characteristic matrix into the same characteristic region to obtain a plurality of characteristic regions corresponding to the target characteristic matrix.
In some embodiments, the feature map dividing unit is configured to: acquire the maximum feature value and the minimum feature value corresponding to the target feature matrix; determine, according to the number of region divisions, feature value equal division points within the target feature value range defined by the maximum feature value and the minimum feature value; and take the ranges between the equal division points as sub-feature value ranges.
In some embodiments, the target processing result is obtained by processing using an image processing model, and the division number obtaining unit is configured to: and acquiring the number of candidate result types corresponding to the image processing model, and acquiring the number of region partitions according to the number of the candidate result types.
In some embodiments, the target weight derivation module comprises: the pooling unit is used for pooling the target characteristic matrix by taking the region as a unit to obtain pooling values respectively corresponding to each characteristic region; the target weight determining unit is used for obtaining target weights corresponding to the characteristic regions according to the first pooling vector; the first pooling vector is composed of pooling values corresponding to the feature regions.
In some embodiments, the target feature matrix is a plurality, and the module for obtaining the first pooling vector comprises: and combining the pooling values corresponding to the characteristic areas in the target characteristic matrixes as vector values to obtain a first pooling vector.
In some embodiments, the target weight determination unit is to: inputting the first pooling vector into a first full-link layer of the image processing model to obtain a first weight vector; determining the region type corresponding to each feature region in the target feature matrix, splitting the first weight vector according to the region type of the feature region, and obtaining a second weight vector corresponding to each region type; and inputting the second weight vector into a second full-connection layer corresponding to the region type to obtain the target weight corresponding to each characteristic region.
In some embodiments, the target processing result obtaining module is to: processing the weighted feature matrix to obtain the segmentation class probability corresponding to each pixel point in the target image; determining a target class corresponding to the pixel point according to the segmentation class probability corresponding to the pixel point; and segmenting the target image according to the target category corresponding to the pixel point to obtain an image segmentation result.
In some embodiments, the target feature matrix acquisition module is to: inputting a target image into an image processing model, and performing feature extraction through a plurality of feature extraction layers in the image processing model to obtain a first feature matrix; and fusing the first characteristic matrix and the forward characteristic matrix of the first characteristic matrix to obtain a target characteristic matrix corresponding to the target image.
For specific limitations of the image processing apparatus, reference may be made to the above limitations of the image processing method, which are not described herein again. The respective modules in the image processing apparatus described above may be wholly or partially implemented by software, hardware, and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In some embodiments, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 10. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is for storing at least one of the target image or the image processing result. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement an image processing method.
Those skilled in the art will appreciate that the architecture shown in fig. 10 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer devices to which the solution is applied; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In some embodiments, there is further provided a computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the above method embodiments when executing the computer program.
In some embodiments, a computer-readable storage medium is provided, in which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods in the above embodiments can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory or optical storage. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM), among others.
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (22)

1. An image processing method, characterized in that the method comprises:
acquiring a target characteristic matrix corresponding to a target image to be processed;
determining a plurality of characteristic regions obtained by dividing the target characteristic matrix, wherein a target characteristic value range corresponding to the target characteristic matrix is divided into a plurality of sub-characteristic value ranges, when the target characteristic value range is divided, a mask corresponding to each sub-characteristic value range is obtained, and an element-wise product calculation is performed on the mask and the target characteristic matrix to obtain the characteristic region corresponding to each sub-characteristic value range;
determining a target weight corresponding to each characteristic region based on the characteristic values in the characteristic regions;
performing an element-wise product calculation of the target weight corresponding to each characteristic region with the corresponding mask respectively, adding the results, and performing an element-wise product calculation of the data obtained by the addition with the target characteristic matrix to obtain a weighted characteristic matrix;
and processing the weighted feature matrix to obtain a target processing result corresponding to the target image.
2. The method of claim 1, wherein determining the plurality of feature regions obtained by segmenting the target feature matrix comprises:
acquiring the number of area divisions;
and segmenting the target characteristic matrix according to the distribution of characteristic values in the target characteristic matrix to obtain the characteristic regions of the region segmentation quantity.
3. The method according to claim 2, wherein the segmenting the target feature matrix according to the distribution of the feature values in the target feature matrix to obtain the region segmentation quantity of the feature regions comprises:
dividing a target characteristic value range corresponding to the target characteristic matrix to obtain sub-characteristic value ranges with the number of the sub-characteristic value ranges being the number of the region division;
and dividing the characteristic points of which the characteristic values are in the same sub-characteristic value range in the target characteristic matrix into the same characteristic region to obtain a plurality of characteristic regions corresponding to the target characteristic matrix.
4. The method according to claim 3, wherein the dividing the target characteristic value range corresponding to the target characteristic matrix to obtain the sub-characteristic value ranges of the region division number comprises:
acquiring a maximum characteristic value and a minimum characteristic value corresponding to the target characteristic matrix;
according to the region division quantity, determining a characteristic value equipartition point in a target characteristic value range corresponding to the maximum characteristic value and the minimum characteristic value;
and taking the range between the equal division points of the characteristic value as a sub characteristic value range.
5. The method of claim 2, wherein the target processing result is processed using an image processing model, and wherein obtaining the number of region partitions comprises:
and acquiring the number of candidate result types corresponding to the image processing model, and acquiring the number of the region partitions according to the number of the candidate result types.
6. The method according to claim 1, wherein the determining the target weight corresponding to each feature region based on the feature values in the feature region comprises:
pooling the target feature matrix by taking a region as a unit to obtain pooling values respectively corresponding to the feature regions;
obtaining target weights corresponding to the characteristic regions according to the first pooling vector; the first pooling vector is composed of pooling values corresponding to the feature regions.
7. The method of claim 6, wherein the target feature matrix is plural, and the step of obtaining the first pooling vector comprises:
and combining the pooling values corresponding to the characteristic areas in the target characteristic matrixes as vector values to obtain a first pooling vector.
8. The method according to claim 6 or 7, wherein the obtaining the target weight corresponding to each feature region according to the first pooling vector comprises:
inputting the first pooling vector into a first full-link layer to obtain a first weight vector;
determining the region type corresponding to each feature region in the target feature matrix, splitting the first weight vector according to the region type of the feature region, and obtaining a second weight vector corresponding to each region type;
and inputting the second weight vector into a second full-connection layer corresponding to the region type to obtain the target weight corresponding to each characteristic region.
9. The method according to claim 1, wherein the processing the weighted feature matrix to obtain the target processing result corresponding to the target image comprises:
processing the weighted feature matrix to obtain the segmentation class probability corresponding to each pixel point in the target image;
determining a target class corresponding to the pixel point according to the segmentation class probability corresponding to the pixel point;
and segmenting the target image according to the target category corresponding to the pixel point to obtain an image segmentation result.
10. The method according to claim 1, wherein the obtaining of the target feature matrix corresponding to the target image to be processed comprises:
inputting the target image into an image processing model, and performing feature extraction through a plurality of feature extraction layers in the image processing model to obtain a first feature matrix;
and fusing the first characteristic matrix and the forward characteristic matrix of the first characteristic matrix to obtain a target characteristic matrix corresponding to the target image.
11. An image processing apparatus, characterized in that the apparatus comprises:
the target characteristic matrix acquisition module is used for acquiring a target characteristic matrix corresponding to a target image to be processed;
the characteristic region determining module is used for determining a plurality of characteristic regions obtained by dividing the target characteristic matrix, wherein a target characteristic value range corresponding to the target characteristic matrix is divided into a plurality of sub-characteristic value ranges, a mask corresponding to each sub-characteristic value range is obtained during division, and an element-wise product calculation is performed on the mask and the target characteristic matrix to obtain the characteristic region corresponding to each sub-characteristic value range;
a target weight obtaining module, configured to determine, based on the feature values in the feature regions, a target weight corresponding to each feature region;
a weighted feature matrix obtaining module, configured to perform an element-wise product calculation of the target weight corresponding to each characteristic region with the corresponding mask respectively, add the results, and perform an element-wise product calculation of the data obtained by the addition with the target characteristic matrix to obtain a weighted characteristic matrix;
and the target processing result obtaining module is used for processing the weighted characteristic matrix to obtain a target processing result corresponding to the target image.
12. The apparatus of claim 11, wherein the feature region determination module comprises:
a division number acquisition unit for acquiring the number of area divisions;
and the characteristic map dividing unit is used for dividing the target characteristic matrix according to the distribution of characteristic values in the target characteristic matrix to obtain the characteristic areas of the area dividing quantity.
13. The apparatus of claim 12, wherein the feature map partitioning unit is configured to:
dividing a target characteristic value range corresponding to the target characteristic matrix to obtain sub-characteristic value ranges with the number of the sub-characteristic value ranges being the number of the region division;
and dividing the characteristic points of which the characteristic values are in the same sub-characteristic value range in the target characteristic matrix into the same characteristic region to obtain a plurality of characteristic regions corresponding to the target characteristic matrix.
14. The apparatus of claim 13, wherein the feature map partitioning unit is configured to: acquire a maximum characteristic value and a minimum characteristic value corresponding to the target characteristic matrix;
according to the region division quantity, determining a characteristic value equipartition point in a target characteristic value range corresponding to the maximum characteristic value and the minimum characteristic value;
and taking the range between the equal division points of the characteristic value as a sub characteristic value range.
15. The apparatus according to claim 12, wherein the target processing result is obtained by processing using an image processing model, and the division number obtaining unit is configured to:
and acquiring the number of candidate result types corresponding to the image processing model, and acquiring the number of the region partitions according to the number of the candidate result types.
16. The apparatus of claim 11, wherein the target weight derivation module comprises:
the pooling unit is used for pooling the target feature matrix by taking a region as a unit to obtain pooling values respectively corresponding to the feature regions;
the target weight determining unit is used for obtaining target weights corresponding to the characteristic regions according to the first pooling vector; the first pooling vector is composed of pooling values corresponding to the feature regions.
17. The apparatus of claim 16, wherein the target feature matrix is plural, and wherein the means for obtaining the first pooling vector comprises:
and combining the pooling values corresponding to the characteristic areas in the target characteristic matrixes as vector values to obtain a first pooling vector.
18. The apparatus according to claim 16 or 17, wherein the target weight determination unit is configured to:
inputting the first pooling vector into a first full-link layer to obtain a first weight vector;
determining the region type corresponding to each feature region in the target feature matrix, splitting the first weight vector according to the region type of the feature region, and obtaining a second weight vector corresponding to each region type;
and inputting the second weight vector into a second full-connection layer corresponding to the region type to obtain the target weight corresponding to each characteristic region.
19. The apparatus of claim 11, wherein the target processing result obtaining module is configured to:
processing the weighted feature matrix to obtain the segmentation class probability corresponding to each pixel point in the target image;
determining a target class corresponding to the pixel point according to the segmentation class probability corresponding to the pixel point;
and segmenting the target image according to the target category corresponding to the pixel point to obtain an image segmentation result.
20. The apparatus of claim 11, wherein the target feature matrix acquisition module is configured to:
inputting the target image into an image processing model, and performing feature extraction through a plurality of feature extraction layers in the image processing model to obtain a first feature matrix;
and fusing the first characteristic matrix and the forward characteristic matrix of the first characteristic matrix to obtain a target characteristic matrix corresponding to the target image.
21. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor realizes the steps of the method of any one of claims 1 to 10 when executing the computer program.
22. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 10.
CN202010578425.6A 2020-06-23 2020-06-23 Image processing method, image processing device, computer equipment and storage medium Active CN111476806B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010578425.6A CN111476806B (en) 2020-06-23 2020-06-23 Image processing method, image processing device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010578425.6A CN111476806B (en) 2020-06-23 2020-06-23 Image processing method, image processing device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111476806A CN111476806A (en) 2020-07-31
CN111476806B true CN111476806B (en) 2020-10-23

Family

ID=71763994

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010578425.6A Active CN111476806B (en) 2020-06-23 2020-06-23 Image processing method, image processing device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111476806B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112200226B (en) * 2020-09-27 2021-11-05 北京达佳互联信息技术有限公司 Image processing method based on reinforcement learning, image processing method and related device
CN112163584A (en) * 2020-10-13 2021-01-01 安谋科技(中国)有限公司 Electronic device, and method and medium for extracting image features based on wide dynamic range
CN112703532B (en) * 2020-12-03 2022-05-31 华为技术有限公司 Image processing method, device, equipment and storage medium
CN112733652A (en) * 2020-12-31 2021-04-30 深圳赛安特技术服务有限公司 Image target identification method and device, computer equipment and readable storage medium
CN113449770B (en) * 2021-05-18 2024-02-13 科大讯飞股份有限公司 Image detection method, electronic device and storage device
CN117079058B (en) * 2023-10-11 2024-01-09 腾讯科技(深圳)有限公司 Image processing method and device, storage medium and electronic equipment


Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109118459B (en) * 2017-06-23 2022-07-19 南开大学 Image salient object detection method and device
US10810435B2 (en) * 2018-11-07 2020-10-20 Adobe Inc. Segmenting objects in video sequences
CN110363138A (en) * 2019-07-12 2019-10-22 腾讯科技(深圳)有限公司 Model training method, image processing method, device, terminal and storage medium
CN110765882B (en) * 2019-09-25 2023-04-07 腾讯科技(深圳)有限公司 Video tag determination method, device, server and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101599171A (en) * 2008-06-03 2009-12-09 宝利微电子系统控股公司 Auto contrast's Enhancement Method and device
CN110321936A (en) * 2019-06-14 2019-10-11 浙江鹏信信息科技股份有限公司 A method of realizing that picture two is classified based on VGG16 and SVM
CN110348515A (en) * 2019-07-10 2019-10-18 腾讯科技(深圳)有限公司 Image classification method, image classification model training method and device
CN111091576A (en) * 2020-03-19 2020-05-01 腾讯科技(深圳)有限公司 Image segmentation method, device, equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"基于CUDA的Adaboost 算法并行实现";程峰等;《计算机工程与科学》;20110215;第33卷(第2期);第118-123页 *

Also Published As

Publication number Publication date
CN111476806A (en) 2020-07-31

Similar Documents

Publication Publication Date Title
CN111476806B (en) Image processing method, image processing device, computer equipment and storage medium
CN110414432B (en) Training method of object recognition model, object recognition method and corresponding device
Laskar et al. Camera relocalization by computing pairwise relative poses using convolutional neural network
US20210264144A1 (en) Human pose analysis system and method
CN110991513B (en) Image target recognition system and method with continuous learning ability of human-like
EP3905194A1 (en) Pose estimation method and apparatus
CN110222718B (en) Image processing method and device
CN111368672A (en) Construction method and device for genetic disease facial recognition model
CN112232355B (en) Image segmentation network processing method, image segmentation device and computer equipment
CN112801236B (en) Image recognition model migration method, device, equipment and storage medium
CN110838125A (en) Target detection method, device, equipment and storage medium of medical image
CN113570684A (en) Image processing method, image processing device, computer equipment and storage medium
CN111783779A (en) Image processing method, apparatus and computer-readable storage medium
CN110765882A (en) Video tag determination method, device, server and storage medium
CN113705596A (en) Image recognition method and device, computer equipment and storage medium
CN111179270A (en) Image co-segmentation method and device based on attention mechanism
CN113112518A (en) Feature extractor generation method and device based on spliced image and computer equipment
CN115966010A (en) Expression recognition method based on attention and multi-scale feature fusion
CN114764870A (en) Object positioning model processing method, object positioning device and computer equipment
CN114626476A (en) Bird fine-grained image recognition method and device based on Transformer and component feature fusion
CN111444957B (en) Image data processing method, device, computer equipment and storage medium
CN113706550A (en) Image scene recognition and model training method and device and computer equipment
CN112183303A (en) Transformer equipment image classification method and device, computer equipment and medium
CN111914809A (en) Target object positioning method, image processing method, device and computer equipment
Ocegueda-Hernandez et al. A lightweight convolutional neural network for pose estimation of a planar model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40026173

Country of ref document: HK