CN117094959A

CN117094959A - Image processing method and training method of image processing model

Info

Publication number: CN117094959A
Application number: CN202311013022.7A
Authority: CN
Inventors: 刘伟; 周彦捷; 许静; 高远; 王宇; 吕乐
Original assignee: Alibaba Damo Institute Hangzhou Technology Co Ltd
Current assignee: Alibaba Damo Institute Hangzhou Technology Co Ltd
Priority date: 2023-08-11
Filing date: 2023-08-11
Publication date: 2023-11-21

Abstract

The embodiment of the specification provides an image processing method and a training method of an image processing model, wherein the image processing method comprises the following steps: receiving an image processing task, wherein the image processing task carries an image to be detected corresponding to a target detection area, and the image processing task is used for detecting whether the target detection area is abnormal or not; inputting the image to be detected into an image processing model to obtain a target detection result corresponding to the target detection area, wherein the image processing model generates an initial prediction result based on the image to be detected, and determines the target detection result based on the initial prediction result and a result relation matrix, and the result relation matrix is used for identifying association relations among a plurality of results. According to the method provided by the specification, the initial prediction result is screened through the result relation matrix, so that a more accurate target detection result is obtained.

Description

Image processing method and training method of image processing model

Technical Field

The embodiment of the specification relates to the technical field of computers, in particular to an image processing method.

Background

Along with the improvement of the living standard of people, more and more people begin to pay attention to the health problem of the people, skin is used as the largest organ of the human body, the skin is directly contacted with the external environment, and the skin diseases affect nearly one third of the population in the world due to the comprehensive reasons of various factors, but dermatologists face huge gaps, and many skin patients cannot be diagnosed in a professional way.

With the advent of the big data age, deep learning technology is increasingly applied to image recognition, and by performing image recognition on an image of a skin patient, it is possible to assist in judging whether a skin lesion exists in the image and whether a lesion type exists. At present, a skin disease image classification algorithm is only used for singly predicting a disease label, namely, a picture comprising skin is input, the predicted disease is output, in image recognition, the accuracy of the output result of the existing algorithm cannot reach more than 90 percent due to the complexity and diversity of the skin disease, accurate results are usually required to be screened out from the results of the first few predicted names, and in the existing image recognition, mutually exclusive prediction results appear in the prediction results, and the prediction is inaccurate. Therefore, how to accurately locate a lesion in an image and identify the type of the lesion is a problem to be solved by the technicians.

Disclosure of Invention

In view of this, the present embodiment provides an image processing method. One or more embodiments of the present specification relate to an image processing apparatus, a computing device, a computer-readable storage medium, and a computer program that solve the technical drawbacks of the related art.

According to a first aspect of embodiments of the present specification, there is provided an image processing method including:

receiving an image processing task, wherein the image processing task carries an image to be detected corresponding to a target detection area, and the image processing task is used for detecting whether the target detection area is abnormal or not;

inputting the image to be detected into an image processing model to obtain a target detection result corresponding to the target detection area, wherein the image processing model generates an initial prediction result based on the image to be detected, and determines the target detection result based on the initial prediction result and a result relation matrix, and the result relation matrix is used for identifying association relations among a plurality of results.

According to a second aspect of embodiments of the present specification, there is provided a skin-loss image processing method, including:

Receiving a skin damage image processing task, wherein the skin damage image processing task carries a skin damage image to be detected corresponding to a target detection area, and the skin damage image processing task is used for detecting whether the target detection area is abnormal or not;

inputting the skin damage image to be detected into a skin damage image processing model to obtain a target detection result corresponding to the target detection area, wherein the skin damage image processing model generates an initial prediction result based on the skin damage image to be detected, and determines the target detection result based on the initial prediction result and a result relation matrix, and the result relation matrix is used for identifying the association relation among a plurality of results.

According to a third aspect of embodiments of the present disclosure, there is provided a training method of an image processing model, applied to cloud-side equipment, including:

acquiring a sample image and a sample detection result corresponding to the sample image;

inputting the sample image into an image processing model to obtain a prediction detection result, prediction abnormal characteristic information and prediction non-abnormal characteristic information which are output by the image processing model;

calculating a model loss value according to the sample detection result, the prediction abnormal characteristic information and the prediction non-abnormal characteristic information;

Adjusting model parameters of the image processing model according to the model loss value until model training stopping conditions are reached, and obtaining the model parameters of the image processing model;

and sending the model parameters of the image processing model to end-side equipment.

According to a fourth aspect of embodiments of the present specification, there is provided an image processing method comprising:

receiving an image processing request sent by a user, wherein the image processing request comprises an image processing task, the image processing task carries an image to be detected corresponding to a target detection area, and the image processing task is used for detecting whether the target detection area is abnormal or not;

inputting the image to be detected into an image processing model to obtain a target detection result corresponding to the target detection area, wherein the image processing model generates an initial prediction result based on the image to be detected, and determines the target detection result based on the initial prediction result and a result relation matrix, and the result relation matrix is used for identifying association relations among a plurality of results;

and sending a target detection result corresponding to the target detection area to the user.

According to a fifth aspect of embodiments of the present specification, there is provided an image processing apparatus comprising:

The receiving module is configured to receive an image processing task, wherein the image processing task carries an image to be detected corresponding to a target detection area, and the image processing task is used for detecting whether the target detection area is abnormal or not;

the detection module is configured to input the image to be detected into an image processing model to obtain a target detection result corresponding to the target detection area, wherein the image processing model generates an initial prediction result based on the image to be detected, and determines the target detection result based on the initial prediction result and a result relation matrix, and the result relation matrix is used for identifying association relations among a plurality of results.

According to a sixth aspect of embodiments of the present specification, there is provided a computing device comprising:

a memory and a processor;

the memory is configured to store computer-executable instructions that, when executed by the processor, perform the steps of the method described above.

According to a seventh aspect of embodiments of the present description, there is provided a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the steps of the above-described method.

According to an eighth aspect of embodiments of the present specification, there is provided a computer program, wherein the computer program, when executed in a computer, causes the computer to perform the steps of the above method.

In the method provided by one embodiment of the present disclosure, during the process of processing an image to be detected, in the image processing model, during the process of processing the image to be detected to obtain an initial prediction result, abnormal feature information generated during the processing process is referred to. After the initial predicted result is obtained, the result in the initial predicted result is further screened by continuously referring to a preset result relation matrix, so that the mutually exclusive predicted result is avoided, and the final target detection result is more accurate.

Drawings

FIG. 1 is a block diagram of an image processing system according to one embodiment of the present disclosure;

FIG. 2 is a flow chart of an image processing method provided by one embodiment of the present description;

FIG. 3 is a schematic diagram of a resulting relationship matrix provided by one embodiment of the present disclosure;

FIG. 4 is a flowchart of a method for processing a skin lesion image according to an embodiment of the present disclosure;

FIG. 5 is a schematic diagram of a method for processing a skin damage image according to an embodiment of the present disclosure;

FIG. 6 is a flow chart of a training method for an image processing model provided in one embodiment of the present disclosure;

FIG. 7 is a schematic view of the structure of an image processing model provided in one embodiment of the present specification;

FIG. 8 is a flow chart of another image processing method provided by one embodiment of the present disclosure;

FIG. 9 is a process flow diagram of an image processing method according to one embodiment of the present disclosure;

fig. 10 is a schematic structural view of an image processing apparatus provided in one embodiment of the present specification;

FIG. 11 is a block diagram of a computing device provided in one embodiment of the present description.

Detailed Description

In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present description. This description may be embodied in many other forms than described herein and similarly generalized by those skilled in the art to whom this disclosure pertains without departing from the spirit of the disclosure and, therefore, this disclosure is not limited by the specific implementations disclosed below.

The terminology used in the one or more embodiments of the specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the specification. As used in this specification, one or more embodiments and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any or all possible combinations of one or more of the associated listed items.

It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various information, these information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first may also be referred to as a second, and similarly, a second may also be referred to as a first, without departing from the scope of one or more embodiments of the present description. The word "if" as used herein may be interpreted as "at … …" or "at … …" or "responsive to a determination", depending on the context.

Furthermore, it should be noted that, user information (including, but not limited to, user equipment information, user personal information, etc.) and data (including, but not limited to, data for analysis, stored data, presented data, etc.) according to one or more embodiments of the present disclosure are information and data authorized by a user or sufficiently authorized by each party, and the collection, use and processing of relevant data is required to comply with relevant laws and regulations and standards of relevant areas, and is provided with corresponding operation entries for the user to select authorization or rejection.

With the advent of the big data age, deep learning technology is increasingly applied to image recognition, and by performing image recognition on an image of a skin patient, it is possible to assist in judging whether a skin lesion exists in the image and whether a lesion type exists. At present, a skin disease image classification algorithm is only used for singly predicting a disease label, namely, a picture comprising skin is input, the predicted disease is output, in image recognition, the accuracy of the output result of the existing algorithm cannot reach more than 90 percent due to the complexity and diversity of the skin disease, accurate results are usually required to be screened out from the results of the first few predicted names, and in the existing image recognition, mutually exclusive prediction results appear in the prediction results, and the prediction is inaccurate.

Based on this, in the present specification, an image processing method is provided, and the present specification relates to an image processing apparatus, a computing device, and a computer-readable storage medium, which are described in detail one by one in the following embodiments.

Referring to fig. 1, fig. 1 illustrates an architecture diagram of an image processing system provided in one embodiment of the present disclosure, which may include a client 100 and a server 200;

the client 100 is configured to send an image processing task to the server 200, where the image processing task carries an image to be detected corresponding to a target detection area, and the image processing task is configured to detect whether the target detection area has an abnormality;

the server 200 is configured to input the image to be detected to an image processing model, and obtain a target detection result corresponding to the target detection area, where the image processing model generates an initial prediction result based on the image to be detected, and determines the target detection result based on the initial prediction result and a result relationship matrix, and the result relationship matrix is used to identify an association relationship between a plurality of results; sending a target detection result to the client 100;

The client 100 is further configured to receive a target detection result sent by the server 200.

By applying the scheme of the embodiment of the specification, an image processing task is received, wherein the image processing task carries an image to be detected corresponding to a target detection area, and the image processing task is used for detecting whether the target detection area is abnormal or not; inputting the image to be detected into an image processing model to obtain a target detection result corresponding to the target detection area, wherein the image processing model generates an initial prediction result based on the image to be detected, and determines the target detection result based on the initial prediction result and a result relation matrix, and the result relation matrix is used for identifying association relations among a plurality of results.

According to the scheme provided by the embodiment of the specification, the initial prediction result is generated according to the image to be detected, then a correction basis is provided for the initial prediction result according to the result relation matrix, the association relation among a plurality of results is reflected in the result relation matrix, and the result with larger prediction deviation in the initial prediction result is removed through the result relation matrix, so that the accuracy of the prediction result is further improved.

The image processing system may include a plurality of clients 100 and a server 200, wherein the clients 100 may be referred to as end-side devices and the server 200 may be referred to as cloud-side devices. Communication connection can be established between the plurality of clients 100 through the server 200, and in an image processing scenario, the server 200 is used to provide an image processing service between the plurality of clients 100, and the plurality of clients 100 can respectively serve as a transmitting end or a receiving end, so that communication is realized through the server 200.

The user may interact with the server 200 through the client 100 to receive data transmitted from other clients 100, or transmit data to other clients 100, etc. In the image processing scenario, it may be that the user issues a data stream to the server 200 through the client 100, and the server 200 generates a target detection result according to the data stream and pushes the target detection result to other clients that establish communication.

Wherein, the client 100 and the server 200 establish a connection through a network. The network provides a medium for a communication link between client 100 and server 200. The network may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others. The data transmitted by the client 100 may need to be encoded, transcoded, compressed, etc. before being distributed to the server 200.

The client 100 may be a browser, APP (Application), or a web Application such as H5 (HyperText Markup Language, hypertext markup language (htv) 5 th edition) Application, or a light Application (also called applet, a lightweight Application) or cloud Application, etc., and the client 100 may be based on a software development kit (SDK, software Development Kit) of a corresponding service provided by the server 200, such as a real-time communication (RTC, real Time Communication) based SDK development acquisition, etc. The client 100 may be deployed in an electronic device, need to run depending on the device or some APP in the device, etc. The electronic device may for example have a display screen and support information browsing etc. as may be a personal mobile terminal such as a mobile phone, tablet computer, personal computer etc. Various other types of applications are also commonly deployed in electronic devices, such as human-machine conversation type applications, model training type applications, text processing type applications, web browser applications, shopping type applications, search type applications, instant messaging tools, mailbox clients, social platform software, and the like.

The server 200 may include a server that provides various services, such as a server that provides communication services for multiple clients, a server for background training that provides support for a model used on a client, a server that processes data sent by a client, and so on. It should be noted that, the server 200 may be implemented as a distributed server cluster formed by a plurality of servers, or may be implemented as a single server. The server may also be a server of a distributed system or a server that incorporates a blockchain. The server may also be a cloud server for cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (CDN, content Delivery Network), and basic cloud computing services such as big data and artificial intelligence platforms, or an intelligent cloud computing server or an intelligent cloud host with artificial intelligence technology.

It should be noted that, the image processing method provided in the embodiment of the present disclosure is generally executed by the server, but in other embodiments of the present disclosure, the client may have a similar function to the server, so as to execute the image processing method provided in the embodiment of the present disclosure. In other embodiments, the image processing method provided in the embodiments of the present disclosure may be performed by the client and the server together.

Referring to fig. 2, fig. 2 shows a flowchart of an image processing method according to an embodiment of the present disclosure, which specifically includes the following steps:

step 202: and receiving an image processing task, wherein the image processing task carries an image to be detected corresponding to a target detection area, and the image processing task is used for detecting whether the target detection area is abnormal or not.

In practical application, the image processing task sent by the user can be received through the server side or the client side.

Specifically, the image processing task is a task for detecting whether an abnormality exists in the target detection area, and the image processing task carries an image to be detected corresponding to the target detection area, further, the target detection area can be understood as an area for predicting whether the abnormality exists, for example, the target detection area can be any organ of a human body, a mouth, eyes, local skin and the like, and by predicting whether the abnormality exists in the target detection area, the state of the object to be detected can be further judged in an auxiliary manner according to the prediction result, and help is provided for the object to be detected after the state of the object to be detected is determined.

It should be noted that, in one or more embodiments of the present disclosure, the image processing task may be applied to identify various medical images, and determine whether an abnormality exists in a target detection area in the medical image according to image features. For example, in an application scenario of eye disease detection, whether there is a focus on the eye or not, and the type of focus can be detected from a photograph of the eye; in the application scene of skin disease detection, whether skin damage exists and the type of the skin damage can be detected according to the photo of local skin; thereby helping doctors to judge whether the target detection area is abnormal or not, and further facilitating the subsequent treatment.

In an embodiment provided in the present disclosure, in an application scenario for detecting an eye disease, an image to be detected obtained is an image of an eye of a user, and specifically may be an image taken by a user through a mobile phone, a camera, or other portable devices. In practical application, the sizes of the images collected by various portable devices are different, and in the method provided by the specification, in order to facilitate subsequent processing, the images carrying the target detection area are subjected to image preprocessing, and the sizes of the images to be detected are unified.

In another embodiment provided in the present disclosure, taking an application scenario of detecting a skin disease as an example, an initial image of local skin shot by a user through a mobile phone is obtained, the initial image is processed to an image to be detected with 384×384 by using an OpenCV kit, and a normalization process of pixel values is performed on pixels in the image to be detected. In practical application, the size of the image to be detected may be set according to practical application, and in the present specification, the size of the image to be detected is not limited. By performing image detection processing on an image to be detected of local skin, it is detected whether the user suffers from a skin disease, and the type of skin disease.

The terminal receives the image processing task, and can take an image to be detected corresponding to the target detection area carried in the image processing task as input for detecting whether the target detection area is abnormal or not.

Step 204: inputting the image to be detected into an image processing model to obtain a target detection result corresponding to the target detection area, wherein the image processing model generates an initial prediction result based on the image to be detected, and determines the target detection result based on the initial prediction result and a result relation matrix, and the result relation matrix is used for identifying association relations among a plurality of results.

In practical application, after receiving an image processing task, acquiring an image to be detected carried by the image processing task from the image processing task, inputting the image to be detected into an image processing model for processing, and obtaining a target detection result corresponding to a target detection area output by the image processing model, wherein the target detection result specifically comprises information such as whether the target detection area is abnormal, an abnormal type of the abnormality, probability of the abnormal type and the like.

Specifically, the image processing model can extract detection region characteristic information corresponding to a target detection region based on an input image to be detected, predict based on the detection region characteristic information, obtain an initial prediction result, and compare the initial prediction result with a result relation matrix after obtaining the initial prediction result, so that a final target detection result is determined based on the initial prediction result. In practical application, the image processing model may be a traditional machine learning model or a deep learning model.

The result relation matrix specifically refers to that a plurality of currently known results are expressed in a matrix form, and the result relation matrix is used for identifying the association relation among the plurality of results. Referring to fig. 3, fig. 3 is a schematic diagram of a result relationship matrix provided in an embodiment of the present disclosure, in the result relationship matrix shown in fig. 3, 49 results are altogether, and the relevance between any two results is represented by numbers, where 2 represents autocorrelation, 1 represents similarity, 0 represents unknown relationship, and-1 represents mutual exclusion, for example, for result 0 and result 2, a relationship value between the 2 results can be queried through the result relationship matrix, which indicates that the result 0 and the result 2 are mutually exclusive, and if the result 0 is predicted, the result 2 is predicted with less possibility.

In the embodiment provided in the specification, the final target detection result is generated in the image processing model through the result relation matrix, and the target detection result can be more accurate through the result relation matrix.

Specifically, in one embodiment provided in the present specification, the image processing model includes an embedding layer, a feature processing layer, a classification layer, and an alignment layer;

inputting the image to be detected into an image processing model to obtain a target detection result corresponding to the target detection area, wherein the target detection result comprises S2042-S2048:

s2042, inputting the image to be detected into the embedding layer to obtain embedded image characteristics.

The embedding layer is used for carrying out embedding processing on the input image to be detected to obtain embedded image characteristics. Image embedding is the conversion of data into a fixed-size representation of features for ease of processing and computation. The embedding process is a coding technique, and the coding is represented by a low-dimensional vector, and the coding can express the relation between images through the optimization of a neural network. By the embedding process, the image can be encoded into a feature vector that can be recognized and processed by a computer.

Further, inputting the image to be detected to the embedding layer to obtain embedded image features, including:

dividing the image to be detected into a plurality of sub-images to be detected based on preset dividing information;

splicing a preset classifier and a plurality of sub-images to be detected into an image to be input;

and inputting the image to be input into the embedding layer to obtain the embedded image characteristics.

In practical application, in order to more accurately locate the abnormal position, the method can also divide the image to be detected into a plurality of sub-images to be detected, and respectively identify the plurality of sub-images to be detected, so that in the subsequent identification process, the abnormal position in the image to be detected can be more accurately located.

In the embodiment provided in the present specification, the preset division information specifically refers to predetermined information for dividing the image to be detected, for example, dividing the image to be detected into 16×16 sub-images to be detected, or dividing the image to be detected into 32×32 sub-images to be detected. The preset segmentation information is a preset super parameter, and can be set according to actual conditions.

According to preset division information, the image to be detected can be divided into a corresponding number of sub-images to be detected, for example, the preset division information is 16 x 16, then the image to be detected (384 x 384) is divided into 576 sub-images to be detected based on the division information, and each sub-image to be detected is marked according to the position information of each sub-image to be detected in the image to be detected, so that the plurality of sub-images to be detected can be restored according to the position information.

The preset classifier specifically refers to a "CLS" identifier, where the "CLS" is used for performing a classification task, and in practical application, information input into the image processing model is added with the [ CLS ] identifier at the beginning, so as to perform classification during the processing of the model. Taking 576 sub-images to be detected as examples, adding a preset classifier CLS, inputting the sub-images into an embedding layer for embedding treatment to obtain the embedded image characteristics output by the embedding layer, and taking 768 dimensions of the embedding layer as examples, wherein the dimensions of the embedded image characteristics are 577-768 dimensions.

S2044, inputting the embedded image features to the feature processing layer to obtain image decoding features and abnormal feature information.

The feature processing layer is specifically used for performing feature processing on the embedded image features. In practical application, the image processing model may be a model built based on the Vision Transformer model, and the feature processing layer may be understood as a plurality of Transformer Layer, and the embedded image features are sequentially input into a plurality of consecutive Transformer Layer, so as to obtain image decoding features and abnormal feature information.

The image decoding feature specifically refers to image feature information obtained after encoding and decoding processing is performed on an image to be detected. The abnormal feature information represents feature information of occurrence of an abnormality in the image to be detected.

In practical application, the embedded image features are sequentially input into a plurality of continuous Transformer Layer to be processed, each self-attention matrix A in each Layer is obtained, initial association features are extracted from each self-attention matrix A, the initial association features represent association between CLS (clear line) and each sub-image to be detected in the self-attention matrix, in practical application, the first line vector of each self-attention matrix A is taken out, and the first element of the first line vector is removed, so that the initial association features of the respective self-attention matrix A can be obtained.

And carrying out weighted average processing on the plurality of initial incidence relation features to obtain target incidence relation features, wherein the target incidence relation features are used for marking the positioning information of the abnormal positions obtained after the encoding and decoding processing is carried out for a plurality of times, the target incidence relation features correspond to the plurality of sub-images to be detected one by one, and the abnormal position information can be determined by restoring the target incidence relation features into the images to be detected according to the position information of each sub-image to be detected.

After the abnormal position information is determined, the abnormal feature information can be determined according to the abnormal position information and the image decoding feature, concretely, see the following formula 1:

Wherein f _lesion Representing abnormal characteristic information, f _non-lesion Representing the information of the non-abnormal characteristics,saliency information representing abnormal location, +.>Representing the image decoding characteristics.

Through the above formula 1, abnormal characteristic information can be calculated, and the abnormal characteristic information is used for providing corresponding characteristic information for the initial prediction result in the subsequent classification processing process.

S2046, inputting the image decoding characteristics and the abnormal characteristic information into the classification layer to obtain an initial prediction result.

After the abnormal feature information is obtained, the image decoding features and the abnormal feature information can be input into a classification layer for feature classification, so that an initial prediction result is obtained.

In practical application, the pre-added CLS features are used for classifying, specifically, classification feature information corresponding to a preset classifier CLS is extracted from the image decoding features, the classification feature information and the abnormal feature information are spliced to obtain spliced feature information, and the spliced feature information is processed in a classification layer according to the spliced feature information to obtain an initial prediction result.

In the method provided by the specification, the image decoding characteristics and the abnormal characteristic information are input into a classification layer, prediction is carried out in the classification layer according to the classification characteristic information and the abnormal characteristic information in the image decoding characteristics, an initial prediction result is obtained, specifically, the initial prediction result comprises a plurality of prediction sub-results and probabilities of all the prediction sub-results, all the prediction sub-results are ordered according to the probabilities of all the prediction sub-results, and a first preset number of sub-results are selected as the initial prediction result.

For example, in the method for detecting skin diseases, according to the image decoding characteristics and the abnormal characteristic information, the probabilities corresponding to 49 kinds of predictor results and each predictor result are obtained, the probabilities corresponding to each predictor result are ordered, and the predictor result with the top 10 ranks is selected as the initial predictor result.

S2048, inputting the initial prediction result into the comparison layer to obtain a target detection result, wherein the comparison layer comprises a result relation matrix.

The comparison layer is specifically used for correcting the initial prediction result. The result relation matrix is stored in the comparison layer. Further, the initial prediction result is corrected in the comparison layer based on the result relation matrix, so that a final target detection result is obtained.

In one embodiment provided in the present specification, the initial predictor includes a plurality of initial predictors;

inputting the initial prediction result to the comparison layer to obtain a target detection result, wherein the method comprises the following steps:

determining an initial predictor to be processed and at least one reference initial predictor, wherein the initial predictor to be processed is any one of a plurality of initial predictors;

Calculating a relevance score corresponding to the initial predictor result to be processed based on the result relation matrix, the initial predictor result to be processed and each reference initial predictor result;

and sequencing the initial prediction results based on the relevance scores corresponding to the initial predictor results to be processed, and determining a target detection result.

In the above step, it may be determined that a plurality of initial predictor results are included in the initial predictor results, voting is performed on each initial predictor result according to the result relationship matrix in the comparison layer, so as to obtain relevance scores of the initial predictor results, and finally, a final target detection result is determined according to the relevance scores.

In practical application, an initial predictor to be processed is first determined from a plurality of initial predictors, and other initial predictors except the initial predictor to be processed are determined as reference initial predictors. The initial predictor result to be processed specifically refers to an initial predictor result for which a relevance score needs to be determined in the current calculation.

For example, the initial predictor includes 5 initial predictor results, namely initial predictor result 1, initial predictor result 2, initial predictor result 3, initial predictor result 4, and initial predictor result 5. When the relevance score of the initial predictor 1 needs to be calculated, the initial predictor 1 is the initial predictor to be processed, the initial predictors 2-5 are the reference initial predictors, and the relevance score corresponding to the initial predictor to be processed is calculated by combining the result relation matrix, the initial predictor to be processed and the reference initial predictors.

Specifically, calculating, based on the result relation matrix, the to-be-processed initial predictor result, and each reference initial predictor result, a relevance score corresponding to the to-be-processed initial predictor result includes:

calculating the relevance sub-scores of the initial predictor results to be processed and the reference initial predictor results based on the result relation matrix;

and determining the relevance score corresponding to the initial predictor result to be processed based on each relevance score.

In practical application, the result relation matrix comprises the relevance between any two results, and if the two results are similar, the result relation matrix is expressed by 1; if the relation between the two results is unknown, the two results are expressed by 0; if the two results are mutually exclusive, then it is denoted by-1. Along the above example, with the initial predictor result 1 as the initial predictor result to be processed and the initial predictor results 2-5 as the reference initial predictor results, determining the relevance scores between the initial predictor result 1 and each reference initial predictor result in sequence through inquiring the result relation matrix, and adding the relevance scores to obtain the relevance score corresponding to the initial predictor result 1.

Based on the same method, calculating the relevance scores of the initial predictor results, then reordering the initial predictor results according to the relevance scores of the initial predictor results, and selecting a preset number of results as target detection results.

For example, there are 10 initial predictor results, and after the 10 initial predictor results are reordered according to each relevance score, the top 5 child results are selected as target detection results. In practical applications, the number of neutron results in the initial detection result is greater than the number of neutron results in the target detection result.

After the target detection result is determined, the probability of each sub-result in the target detection result is also required to be determined, specifically, according to the relevance score of each sub-result and the total relevance score calculation of all sub-results in the target detection result, for example, 5 results in the target detection result are respectively the relevance score of the result 1 of 10 points; the relevance score for result 2 was 8, the relevance score for result 3 was 7, the relevance score for result 4 was 6, and the relevance score for result 5 was 5. The total score of the 5 results is 36 points, wherein the probability of the result 1 is 10/36×100% =27% … …, and the probability of each sub-result and each sub-result in the target detection result can be determined by the same method.

In a specific embodiment provided in the present specification, during the process of processing an image to be detected, in the image processing model, during the process of processing the image to be detected to obtain an initial prediction result, abnormal feature information generated during the processing process is referred to. After the initial predicted result is obtained, the result in the initial predicted result is further screened by continuously referring to a preset result relation matrix, so that the mutually exclusive predicted result is avoided, and the final target detection result is more accurate.

In a specific embodiment provided in the present specification, the image processing model is obtained through training in the following S2062 to S2068:

s2062, acquiring a sample image and a sample detection result corresponding to the sample image.

Specifically, the sample image is specifically an image for training an image processing model, the sample detection result is specifically a known detection result corresponding to the sample image, and in practical application, the sample image is taken as a skin disease image as an example, and the sample detection result is a diagnosis result of a doctor. The sample image and the sample detection result form a sample pair, and the sample pair is used for carrying out weak supervision training on the image processing model.

The sample image is usually obtained by optimizing the sample image after being obtained according to the portable device in the same processing manner as the image to be detected. Taking a patient with skin disease diagnosis as an example, taking a skin damage image of the patient through a mobile phone, a camera and other equipment, and obtaining a diagnosis result of a doctor as a sample detection result. It should be noted that, the obtaining of the skin injury image of the patient is authorized and agreed by the patient, and the obtaining of the diagnosis result of the doctor is also authorized and agreed by the patient and the doctor.

After obtaining the skin damage image, filtering out the blurred and invalid image, and after the valid skin damage image is reserved, scaling the image to a preset size through an OpenCV kit, wherein a sample image is obtained.

In the method provided in the embodiment of the present disclosure, a sample image set for training is divided into a training set and a test set, an image processing model is trained by the training set, and a processing effect of the image processing model is verified by the test set.

S2064, inputting the sample image into an image processing model to obtain a prediction detection result, prediction abnormal characteristic information and prediction non-abnormal characteristic information which are output by the image processing model.

In the method provided in the embodiment of the present disclosure, after a training sample is obtained, according to a preset training batch, sample images are input into an image processing model, where the image processing model is an untrained image processing model, and in the image processing model, processing is performed according to the sample images to obtain a prediction detection result corresponding to the sample images, and meanwhile, prediction abnormal feature information and prediction non-abnormal feature information are also obtained.

The prediction detection result specifically refers to an image processing prediction result obtained after the image processing model is processed in the method provided in the specification. The predicted abnormal characteristic information specifically refers to characteristic information extracted from the sample image and possibly belonging to an abnormal target, and the predicted non-abnormal characteristic information specifically refers to characteristic information extracted from the sample image and possibly belonging to a non-abnormal target.

In practical application, a sample image is firstly segmented into a plurality of sample sub-images, an image processing model is used for determining which sample sub-image has an abnormality, and image features of the sub-images with the abnormality are extracted as prediction abnormality feature information. Similarly, it is also determined which sub-image of the sample has no abnormality, and at the same time, the image features of the sub-image having no abnormality are extracted as predicted non-abnormality feature information.

inputting the sample image into an image processing model to obtain a prediction detection result, prediction abnormal characteristic information and prediction non-abnormal characteristic information which are output by the image processing model, wherein the method comprises the following steps of:

inputting the sample image into the embedding layer to obtain embedded sample image characteristics;

inputting the embedded sample image features to the feature processing layer to obtain sample image decoding features, predicted abnormal feature information and predicted non-abnormal feature information;

and inputting the sample image decoding characteristics and the prediction abnormal characteristic information into the classification layer to obtain a prediction detection result.

In practical application, the image processing model is similar to the processing mode of the model application stage in the model training stage, and the image processing model comprises an embedding layer, a characteristic processing layer, a classification layer and a comparison layer. The sample image is input to the embedded layer to obtain the characteristics of the embedded sample image, and the processing method of the sample image in the embedded layer refers to the processing of the image to be detected by the embedded layer, and is not repeated here.

In the processing of the embedded sample image features, the feature processing layer in the embodiment of the present disclosure obtains, in addition to the sample image decoding features, predicted abnormal feature information and predicted non-abnormal feature information, where the predicted abnormal feature information and the predicted non-abnormal feature information are used to perform contrast learning on features that are abnormal and features that are not abnormal in the target detection area, so that the image processing model can learn to distinguish between an abnormal target and a normal area, and further can more accurately locate abnormal position points in the sample image.

In practical application, the sample image is usually captured by a portable device, and the background of the image is complex and contains many irrelevant environmental factors, which is unfavorable for the accurate positioning of the abnormal position of the image processing model. At present, a supervised training mode is usually used for processing the sample image, namely, the abnormal points on the sample image are marked, so that an image processing model can determine where the abnormal points are, but the supervised training mode requires a large amount of marking data, and the marking data consumes a large amount of manpower and material resources.

The image processing method provided by the specification uses a weak supervision training mode, and the used sample pairs are sample images and sample detection results corresponding to the sample images. Position information of abnormal points cannot be marked accurately in the sample image, predicted abnormal characteristic information and predicted non-abnormal characteristic information are obtained through processing of the sample image by the image processing model, and accurate positioning of the abnormal points can be achieved through comparison learning between the predicted abnormal characteristic information and the predicted non-abnormal characteristic information.

In a specific implementation manner provided in the embodiment of the present specification, at least one attention sub-layer is included in the feature processing layer;

inputting the embedded sample image features to the feature processing layer to obtain sample image decoding features, predicted abnormal feature information and predicted non-abnormal feature information, including:

sequentially inputting the embedded image features to all attention sublayers to obtain sample image decoding features and attention feature matrixes corresponding to all attention sublayers;

extracting initial association relation features corresponding to the attention feature matrixes from the attention feature matrixes, and determining target association relation features according to the initial association relation features, wherein the initial association relation features represent weights of abnormal targets included in each sample sub-image, and the target association relation features include distribution weights of the abnormal targets in the sample image;

And determining predicted abnormal characteristic information and predicted non-abnormal characteristic information according to the target association relation characteristics.

Specifically, taking a scene of skin lesion detection as an example, in the method provided in the present specification, the acquired sample image is a clinical image of a skin patient, and a diagnosis result of a doctor. And carrying out image preprocessing on the clinical image to obtain a sample image, and using the diagnosis result as a sample detection result of the sample image.

The size information of the sample image subjected to image preprocessing is 384 x 384, and meanwhile, the sample image is segmented according to a preset segmentation size, 576 non-overlapping sample sub-images are obtained, and the size of each sample sub-image is 16 x 16. After each sample sub-image is processed by an embedding layer, the embedded sub-image characteristic corresponding to each sample sub-image is obtained, and the dimension of each image characteristic is 1 x 768 dimension. And meanwhile, CLS is introduced for classification, and the dimension of the CLS token after embedding treatment is 1 x 768 dimensions.

Based on the position information of each sample sub-image on the original sub-image, the position code of each sub-image is obtained using Relative Position Embedding, and the position code of CLS token is initialized to 768-dimensional vectors of all 0. And splicing the position codes and the embedded sub-images to obtain the embedded image characteristics.

The embedded image features are input into a feature processing layer, the feature processing layer consists of a plurality of continuous Transformer Layer, the output features of the CLS token and the embedded sub-images are obtained after each Transformer Layer processing, and the output features of the CLS token are f _cls Representing the output characteristics of each embedded sub-imageIndicating that i has a value of 0 to 576.f (f) _cls And f _patch Constitutes a sample image decoding feature.

And extracting attention feature matrixes in the attention sub-layers in Transformer Layer for determining final predicted abnormal feature information and predicted non-abnormal feature information. Specifically, the first line vector in each attention feature matrix is extracted first, and the first element is removed, so that an initial association feature is obtained, the feature dimension of the initial association feature is 1 x 576, and because the first line vector in each attention feature matrix represents the relationship information between the CLS token and each other sample sub-image, the first element is the information of the CLS token itself, and therefore, after the first line vector in the attention feature matrix is extracted, the first element is removed, so that the initial association feature between each sample sub-image can be obtained.

Taking Transformer Layer with 12 layers as an example, at this time, 12 initial association characteristics can be obtained, and the target association characteristics can be obtained by means of weighted averaging.

Restoring the target association relation characteristics through the position information of each sample sub-image to obtain the positioning characteristic information of the abnormal target(i.e., the significance of the outlier target). Restoring the output characteristics of each embedded sub-image through the position information of each sample sub-image to obtain image decoding characteristics +.>

Calculating the predicted abnormal characteristic information f by the above formula 1 _lesion And predicting non-abnormal feature information f _non-lesion 。

S2066, calculating a model loss value from the sample detection result, the prediction abnormality characteristic information, and the prediction non-abnormality characteristic information.

And after a prediction detection result, the prediction abnormal characteristic information and the prediction non-abnormal characteristic information are obtained, a model loss value can be calculated by combining a sample detection result. There are many methods for calculating the model loss value, such as cross entropy loss function, maximum loss function, average loss function, etc., and in this specification, the specific manner of the loss function is not limited, and the actual application is based.

In another specific embodiment provided in the present specification, calculating a model loss value from the sample detection result, the prediction abnormality feature information, and the prediction non-abnormality feature information includes:

calculating a first loss value according to the sample detection result and the prediction detection result;

calculating a second loss value according to the predicted abnormal characteristic information and the predicted non-abnormal characteristic information;

and determining a model loss value according to the first loss value and the second loss value.

In practical application, the image processing model is used for accurately positioning the abnormal position, and meanwhile, a prediction result is determined according to the information of the abnormal position, and based on the prediction result, the abnormal position is positioned in a mode of comparing the predicted abnormal characteristic information and the predicted non-abnormal characteristic information. Based on this, the first loss value is calculated from the sample detection result and the prediction detection result, concretely, see the following formula 2:

L _d ＝-ylog(p _d ) Equation 2

Wherein L is _d For the first loss value, y is the sample detection result, p _d Is the predictive detection result. By the above formula 2, the first loss value can be calculated from the sample detection result and the prediction detection result. It should be noted that the predicted detection result p obtained during training _d The detection result which is not reordered by the result relation matrix is similar to the initial prediction result of the model in the practical application, but is not the target detection result of the model in the practical application.

Calculating a second loss value based on the predicted abnormal feature information and the predicted non-abnormal feature information, see the following equation 3:

L _l ＝-log(1-sim(f _lesion ,f _non-lesion ) Equation 3

Specifically, L _l Is the second loss value, f _lesion To predict abnormal feature information, f _non-lesion To predict non-abnormal featuresInformation. sim is cosine similarity. The prediction abnormal characteristic information and the prediction non-abnormal characteristic information can help the model to more accurately locate abnormal position points through comparison of learning loss functions.

After obtaining the first loss value and the second loss value, the model loss value may be determined by the following equation 4.

L＝L _l +L _d Equation 4

Wherein L is a model loss value.

S2068, adjusting model parameters of the image processing model according to the model loss value until a model training stopping condition is reached.

After the model loss value is obtained, the model parameters of the image processing model can be adjusted according to the model loss value, and specifically, the model parameters of the image processing model can be updated by back propagation of the model loss value.

After the model parameters are adjusted, the steps can be continuously repeated, and the image processing model is continuously trained until the training stopping condition is reached, and in practical application, the training stopping condition of the image processing model comprises the following steps:

the model loss value is smaller than a preset threshold value; and/or

The training round reaches the preset training round.

Specifically, in the process of training the image processing model, the training stop condition of the model may be set to be smaller than the preset threshold value, or the training stop condition may be set to be a preset training round, for example, 10 training rounds, where in the present specification, the preset threshold value of the loss value and/or the preset training round are not specifically limited, and the actual application is in order.

According to the method provided by the embodiment of the specification, in the process of training the image processing model, a comparison learning mode is used, so that the accuracy of abnormal positioning is improved, and particularly for the case of small abnormal points, the positioning of the abnormal position is more accurate. And meanwhile, the global image features and the local abnormal features are fused, and the information of the image is fully utilized, so that the classification of the prediction result is more accurate.

Referring to fig. 4, fig. 4 shows a flowchart of a skin damage image processing method according to an embodiment of the present disclosure, which specifically includes the following steps:

step 402: and receiving a skin damage image processing task, wherein the skin damage image processing task carries a skin damage image to be detected corresponding to a target detection area, and the skin damage image processing task is used for detecting whether the target detection area is abnormal or not.

Step 404: inputting the skin damage image to be detected into a skin damage image processing model to obtain a target detection result corresponding to the target detection area, wherein the skin damage image processing model generates an initial prediction result based on the skin damage image to be detected, and determines the target detection result based on the initial prediction result and a result relation matrix, and the result relation matrix is used for identifying the association relation among a plurality of results.

It should be noted that, the implementation manner of step 402 and step 404 is the same as the implementation manner of steps 202 to 204, and will not be described in detail in the embodiment of the present disclosure.

Illustratively, in the application scenario of skin disease detection, the target detection area is a local skin of the patient, and whether skin disease exists in the local skin or not and the type of skin disease are detected. And receiving a skin damage image processing task, wherein the skin damage image processing task comprises a skin damage image to be detected corresponding to local skin of a target user, and the skin damage image to be detected is obtained through shooting by a mobile phone or a camera and other portable equipment. The skin lesion image processing task is used to detect whether a user has skin disease.

The skin damage image processing model of the method according to the embodiment of the present disclosure is the image processing model of the above embodiment, and the model structure of the skin damage image processing model is the same as that of the image processing model of the above embodiment, which is not described herein, and the detection result corresponding to the local skin output by the image processing model can be obtained by inputting the skin damage image to be detected into the skin damage image processing model, so as to realize automatic detection of whether the local skin has skin diseases or not and the type of skin diseases.

Referring to fig. 5, fig. 5 is a schematic diagram illustrating a skin damage image processing method according to an embodiment of the present disclosure. As shown in fig. 5, the skin damage image to be detected is input into a skin damage image processing model, in the skin damage image processing model, the image decoding characteristics are obtained after the processing of an embedding layer and a characteristic processing layer, and the initial prediction result is obtained after the processing of a classification layer. And inputting the initial prediction result into a comparison layer, and comparing the initial prediction result with a result relation matrix in the comparison layer so as to obtain a final target prediction result.

In the skin damage image processing method provided by the embodiment of the specification, in the skin damage image processing model, the skin damage image to be detected is processed, and in the process of obtaining the initial prediction result, the abnormal characteristic information generated in the processing process is referred. After the initial predicted result is obtained, the result in the initial predicted result is further screened by continuously referring to a preset result relation matrix, so that the mutually exclusive predicted result is avoided, and the final target detection result is more accurate.

Referring to fig. 6, fig. 6 shows a flowchart of a training method of an image processing model according to an embodiment of the present disclosure, which is applied to cloud-side equipment, and specifically includes the following steps:

step 602: and acquiring a sample image and a sample detection result corresponding to the sample image.

Step 604: and inputting the sample image into an image processing model to obtain a prediction detection result, prediction abnormal characteristic information and prediction non-abnormal characteristic information which are output by the image processing model.

Step 606: and calculating a model loss value according to the sample detection result, the prediction abnormal characteristic information and the prediction non-abnormal characteristic information.

Step 608: and adjusting the model parameters of the image processing model according to the model loss value until the model training stopping condition is reached, and obtaining the model parameters of the image processing model.

Step 610: and sending the model parameters of the image processing model to end-side equipment.

It should be noted that, step 602 and step 608 are the same as the implementation manners of S2062-S2068, and are not repeated in the embodiment of the present disclosure.

In the process of training the image processing model through the cloud side equipment, if the trained image processing model needs to be deployed on the cloud side equipment, a notification of model training completion can be sent to a user; if the trained image processing model is deployed on the end side, model parameters of the image processing model may be sent to the user.

Referring to FIG. 7, FIG. 7 is a schematic diagram showing the structure of an image processing model according to an embodiment of the present disclosure, wherein a sample image is segmented into a plurality of sample sub-images as shown in FIG. 7, and input to a Patch Embedding layer for processing while introducing [ CLS ]]the token is used for subsequent classification. After Position Embedding (position coding) is carried out on each sample sub-image, embedded sample image characteristics are obtained, the embedded sample image characteristics are input into a plurality of Transformer Layer for processing, and the output characteristics f of the CLS token are obtained _cls Simultaneously restoring the embedded sub-image features of each sample sub-image to obtain image decoding featuresObtaining positioning characteristic information +.>And decoding the feature->And positioning characteristic information->Calculating predicted abnormal feature information f _lesion And predicting non-abnormal feature information f _non-lesion 。

Through f_cls and f _lesion After fusion, the fusion is input into classification for classification,obtaining a predicted detection result and calculating a first loss value L according to the predicted detection result and the sample detection result _d According to the predicted abnormal characteristic information f _lesion And predicting non-abnormal feature information f _non-lesion Calculating a second loss value L _f 。

And calculating a model loss value through the first loss value and the second loss value, and adjusting model parameters of the image processing model according to the back propagation of the model loss value, so that model training of the image processing model is realized.

In practical application, because a large amount of data and better computing resources are required for training the model, the terminal side equipment may not have corresponding processing capability, so that the model training process can be realized in cloud side equipment, and the cloud side equipment can also send the model parameters to the terminal side equipment after obtaining the model parameters of the image processing model. The terminal equipment can locally construct an image processing model according to model parameters of the image processing model, and further perform image processing by using the image processing model.

Referring to fig. 8, fig. 8 shows a flowchart of an image processing method according to an embodiment of the present disclosure, which specifically includes the following steps:

step 802: receiving an image processing request sent by a user, wherein the image processing request comprises an image processing task, the image processing task carries an image to be detected corresponding to a target detection area, and the image processing task is used for detecting whether the target detection area is abnormal or not.

Step 804: inputting the image to be detected into an image processing model to obtain a target detection result corresponding to the target detection area, wherein the image processing model generates an initial prediction result based on the image to be detected, and determines the target detection result based on the initial prediction result and a result relation matrix, and the result relation matrix is used for identifying association relations among a plurality of results.

Step 806: and sending a target detection result corresponding to the target detection area to the user.

It should be noted that, the specific implementation manner of the steps 802 to 804 is the same as the implementation manner of the steps 202 to 204, and will not be described in detail in the embodiment of the present disclosure.

In this embodiment, an image processing request sent by a user is received, where the image processing request includes an image processing task, and after the processing is completed by the image processing method of the foregoing embodiment to obtain a target detection result, the target detection result needs to be returned to the user, so that the user performs corresponding subsequent processing according to the target detection result.

In the method provided by the embodiment of the specification, during the process of processing the image to be detected, in the image processing model, the image to be detected is processed, and during the process of obtaining the initial prediction result, the abnormal characteristic information generated during the processing process is referred. After the initial predicted result is obtained, the result in the initial predicted result is further screened by continuously referring to a preset result relation matrix, so that the mutually exclusive predicted result is avoided, and the final target detection result is more accurate.

The application of the image processing method provided in the present specification to the detection of skin diseases will be further described with reference to fig. 9. Fig. 9 shows a flowchart of a processing procedure of an image processing method according to an embodiment of the present disclosure, which specifically includes the following steps:

step 902: and obtaining an image of the skin damage part of the patient, and scaling the image of the skin damage part to a preset size to obtain an image to be detected, wherein the image of the skin damage part is obtained through mobile phone shooting.

Step 904: inputting the image to be detected into a pre-trained image processing model, and obtaining global image characteristics, local image characteristics and skin damage distribution weight information of skin damage parts in the skin damage part image of the image to be detected in the image processing model.

Step 906: and obtaining predicted skin damage characteristic information and predicted non-skin damage characteristic information according to the local image characteristics and the skin damage distribution weight information.

Step 908: and fusing the global image characteristic F and the predicted skin damage characteristic information, and inputting the fused characteristic information into a classifier of an image processing model to obtain an initial prediction result of the image to be detected.

Step 910: inputting the initial comparison result into a comparison layer of the image processing model, wherein the comparison layer comprises a preset result relation matrix, and the result relation matrix is a similarity relation matrix between pre-constructed known skin disease diagnosis results.

Step 912: in the comparison layer, the initial comparison result is subjected to result voting according to the result relation matrix, and mutually exclusive prediction results are eliminated, so that a target detection result is obtained.

Corresponding to the above method embodiments, the present disclosure further provides an image processing apparatus embodiment, and fig. 10 shows a schematic structural diagram of an image processing apparatus according to one embodiment of the present disclosure. As shown in fig. 10, the apparatus includes:

a receiving module 1002, configured to receive an image processing task, where the image processing task carries an image to be detected corresponding to a target detection area, and the image processing task is configured to detect whether the target detection area has an abnormality;

the detection module 1004 is configured to input the image to be detected into an image processing model to obtain a target detection result corresponding to the target detection area, wherein the image processing model generates an initial prediction result based on the image to be detected, and determines the target detection result based on the initial prediction result and a result relation matrix, and the result relation matrix is used for identifying an association relation among a plurality of results.

Optionally, the image processing model comprises an embedding layer, a feature processing layer, a classifying layer and a comparison layer;

the detection module 1004 is further configured to:

inputting the image to be detected into the embedding layer to obtain embedded image characteristics;

inputting the embedded image features into the feature processing layer to obtain image decoding features and abnormal feature information;

inputting the image decoding characteristics and the abnormal characteristic information into the classification layer to obtain an initial prediction result;

and inputting the initial prediction result into the comparison layer to obtain a target detection result, wherein the comparison layer comprises a result relation matrix.

Optionally, the detection module 1004 is further configured to:

Optionally, the initial predictor includes a plurality of initial predictor results;

the detection module 1004 is further configured to:

Optionally, the detection module 1004 is further configured to:

Optionally, the apparatus further comprises a training module configured to:

And adjusting model parameters of the image processing model according to the model loss value until a model training stopping condition is reached.

the training module is further configured to:

Optionally, the feature processing layer includes at least one attention sub-layer;

the training module is further configured to:

Optionally, the training module is further configured to:

In the device provided in one embodiment of the present disclosure, during the process of processing an image to be detected, in the image processing model, during the process of processing the image to be detected to obtain an initial prediction result, abnormal feature information generated during the processing process is referred to. After the initial predicted result is obtained, the result in the initial predicted result is further screened by continuously referring to a preset result relation matrix, so that the mutually exclusive predicted result is avoided, and the final target detection result is more accurate.

The above is a schematic scheme of an image processing apparatus of the present embodiment. It should be noted that, the technical solution of the image processing apparatus and the technical solution of the image processing method belong to the same concept, and details of the technical solution of the image processing apparatus, which are not described in detail, can be referred to the description of the technical solution of the image processing method.

Fig. 11 illustrates a block diagram of a computing device 1100 provided according to one embodiment of the present description. The components of computing device 1100 include, but are not limited to, a memory 1110 and a processor 1120. Processor 1120 is coupled to memory 1110 via bus 1130, and database 1150 is used to hold data.

The computing device 1100 also includes an access device 1140, the access device 1140 enabling the computing device 1100 to communicate via one or more networks 1160. Examples of such networks include public switched telephone networks (PSTN, public Switched Telephone Network), local area networks (LAN, local Area Network), wide area networks (WAN, wide Area Network), personal area networks (PAN, personal Area Network), or combinations of communication networks such as the internet. The access device 1140 may include one or more of any type of network interface, wired or wireless, such as a network interface card (NIC, network interface controller), such as an IEEE802.11 wireless local area network (WLAN, wireless Local Area Network) wireless interface, a worldwide interoperability for microwave access (Wi-MAX, worldwide Interoperability for Microwave Access) interface, an ethernet interface, a universal serial bus (USB, universal Serial Bus) interface, a cellular network interface, a bluetooth interface, near field communication (NFC, near Field Communication).

In one embodiment of the present description, the above components of computing device 1100, as well as other components not shown in FIG. 11, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device illustrated in FIG. 11 is for exemplary purposes only and is not intended to limit the scope of the present description. Those skilled in the art may add or replace other components as desired.

Computing device 1100 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smart phone), wearable computing device (e.g., smart watch, smart glasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or personal computer (PC, personal Computer). Computing device 1100 may also be a mobile or stationary server.

The processor 1120 is configured to execute computer-executable instructions that, when executed by the processor, implement the steps of the image processing method or the skin damage image processing method or the training method of the image processing model.

The foregoing is a schematic illustration of a computing device of this embodiment. It should be noted that, the technical solution of the computing device and the technical solution of the foregoing image processing method or skin damage image processing method or training method of the image processing model belong to the same concept, and details of the technical solution of the computing device that are not described in detail may be referred to the description of the technical solution of the foregoing image processing method or skin damage image processing method or training method of the image processing model.

An embodiment of the present disclosure also provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the above-described image processing method or skin-lesion image processing method or training method of an image processing model.

The above is an exemplary version of a computer-readable storage medium of the present embodiment. It should be noted that, the technical solution of the storage medium and the technical solution of the above-mentioned image processing method or skin damage image processing method or training method of the image processing model belong to the same concept, and details of the technical solution of the storage medium which are not described in detail can be referred to the description of the technical solution of the above-mentioned image processing method or skin damage image processing method or training method of the image processing model.

An embodiment of the present disclosure further provides a computer program, wherein the computer program when executed in a computer causes the computer to perform the steps of the above-mentioned image processing method or skin-damaged image processing method or training method of an image processing model.

The above is an exemplary version of a computer program of the present embodiment. It should be noted that, the technical solution of the computer program and the technical solution of the above-mentioned image processing method or skin damage image processing method or training method of the image processing model belong to the same conception, and the details of the technical solution of the computer program which are not described in detail can be referred to the description of the technical solution of the above-mentioned image processing method or skin damage image processing method or training method of the image processing model.

The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

The computer instructions include computer program code that may be in source code form, object code form, executable file or some intermediate form, etc. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth.

It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of combinations of actions, but it should be understood by those skilled in the art that the embodiments are not limited by the order of actions described, as some steps may be performed in other order or simultaneously according to the embodiments of the present disclosure. Further, those skilled in the art will appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily all required for the embodiments described in the specification.

In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.

The preferred embodiments of the present specification disclosed above are merely used to help clarify the present specification. Alternative embodiments are not intended to be exhaustive or to limit the invention to the precise form disclosed. Obviously, many modifications and variations are possible in light of the teaching of the embodiments. The embodiments were chosen and described in order to best explain the principles of the embodiments and the practical application, to thereby enable others skilled in the art to best understand and utilize the invention. This specification is to be limited only by the claims and the full scope and equivalents thereof.

Claims

1. An image processing method, comprising:

2. The method of claim 1, the image processing model comprising an embedding layer, a feature processing layer, a classification layer, an alignment layer;

inputting the image to be detected into an image processing model to obtain a target detection result corresponding to the target detection area, wherein the method comprises the following steps:

3. The method of claim 2, inputting the image to be detected to the embedding layer to obtain embedded image features, comprising:

4. The method of claim 2, the initial predictor comprising a plurality of initial predictor results;

5. The method of claim 4, calculating a relevance score corresponding to the initial predictor to be processed based on the result relationship matrix, the initial predictor to be processed, and each reference initial predictor result, comprising:

6. The method of claim 1, the image processing model being obtained by training:

7. The method of claim 6, the image processing model comprising an embedding layer, a feature processing layer, a classification layer, an alignment layer;

8. The method of claim 7, the feature handling layer comprising at least one attention sub-layer;

sequentially inputting the embedded sample image features to all attention sublayers to obtain sample image decoding features and attention feature matrixes corresponding to all attention sublayers;

9. The method of claim 6, calculating a model loss value from the sample detection result, the predicted anomalous feature information, and the predicted non-anomalous feature information, comprising:

10. A skin damage image processing method, comprising:

11. The training method of the image processing model is applied to cloud side equipment and comprises the following steps:

12. An image processing method, comprising:

13. A computing device, comprising:

a memory and a processor;

the memory is configured to store computer executable instructions, the processor being configured to execute the computer executable instructions, which when executed by the processor, implement the steps of the method of any one of claims 1 to 12.

14. A computer readable storage medium storing computer executable instructions which when executed by a processor implement the steps of the method of any one of claims 1 to 12.