WO2023030097A1

WO2023030097A1 - Method and apparatus for determining cleanliness of tissue cavity, and readable medium and electronic device

Info

Publication number: WO2023030097A1
Application number: PCT/CN2022/114259
Authority: WO
Inventors: 边成; 杨志雄; 杨延展; 李永会
Original assignee: 北京字节跳动网络技术有限公司
Priority date: 2021-09-03
Filing date: 2022-08-23
Publication date: 2023-03-09
Also published as: CN113470030B; CN113470030A

Abstract

The present disclosure relates to a method and apparatus for determining the cleanliness of a tissue cavity, and a readable medium and an electronic device, and relates to the technical field of image processing. The method comprises: firstly, acquiring a tissue image that is collected by an endoscope; then, according to the tissue image and a pre-trained recognition model, determining an initial cleanliness and a target rounding mode, wherein the initial cleanliness is of a floating-point type; and finally, according to the target rounding mode, rounding the initial cleanliness, so as to obtain the cleanliness of the tissue image, wherein the cleanliness is of an integer type. In the present disclosure, a floating point-type initial cleanliness and a target rounding mode applicable to a tissue image are determined by means of a recognition model, and the initial cleanliness is then rounded by using the target rounding mode, so as to obtain the cleanliness of the tissue image, such that the accuracy of the cleanliness can be improved.

Description

Method, device, readable medium and electronic device for determining the cleanliness of a tissue cavity

Cross References to Related Applications

This application is based on the Chinese patent application with the application number 202111033610.8 and the filing date on September 03, 2021, entitled "Method, device, readable medium and electronic equipment for determining the cleanliness of tissue cavity", and requires the Chinese patent application The priority of this Chinese patent application, the entire content of this Chinese patent application is hereby incorporated into this application as a reference.

technical field

The present disclosure relates to the technical field of image processing, and in particular, to a method, device, readable medium and electronic equipment for determining the cleanliness of a tissue cavity.

Background technique

The endoscope is equipped with optical lens, image sensor, light source and other components, which can enter the inside of the human body for inspection, so that doctors can intuitively observe the internal conditions of the human body, and have been widely used in the medical field. To ensure the accuracy of endoscopic examination results, it is necessary to judge the cleanliness of the tissue cavity. If the cleanliness is too low, it means that the tissue preparation is insufficient, which may cause the problem of missing small polyps or adenomas, and even cause endoscopy. A mirror check failed and a repeat check is required. Therefore, accurately identifying the cleanliness of tissues can ensure the validity and accuracy of endoscopy. The endoscope may be, for example, a colonoscope, a gastroscope, or the like. For colonoscopy, the inspected tissue is the intestinal tract, and what needs to be identified is the cleanliness of the intestinal lumen. For gastroscopy, the inspected tissue is the esophagus or stomach, and what needs to be identified is the esophageal cavity or Cleanliness of gastric cavity.

However, the cleanliness of the tissue cavity is usually determined by professionals at the stage of withdrawing the endoscope after the endoscopic examination, based on the actual inspection of the tissue. The requirements for the experience and operation level of the professionals are relatively high, and there is a certain degree of subjectivity. However, it is difficult to ensure accurate identification of the cleanliness of the tissue cavity.

Contents of the invention

This Summary is provided to introduce a simplified form of concepts that are described in detail later in the Detailed Description. This summary of the invention is not intended to identify key features or essential features of the claimed technical solution, nor is it intended to be used to limit the scope of the claimed technical solution.

In a first aspect, the present disclosure provides a method for determining the cleanliness of a tissue cavity, the method comprising:

Obtain tissue images collected by the endoscope;

Determine the initial cleanliness and the target rounding method according to the tissue image and the pre-trained recognition model, and the initial cleanliness is a floating-point type;

According to the target rounding manner, the initial cleanliness is rounded to obtain the cleanliness of the tissue image, and the cleanliness is an integer.

In a second aspect, the present disclosure provides a device for determining the cleanliness of a tissue cavity, the device comprising:

An acquisition module, configured to acquire tissue images collected by the endoscope;

A recognition module, configured to determine an initial cleanliness and a target rounding method according to the tissue image and a pre-trained recognition model, where the initial cleanliness is a floating-point type;

The rounding module is configured to round the initial cleanliness according to the target rounding manner to obtain the cleanliness of the tissue image, and the cleanliness is integer.

In a third aspect, the present disclosure provides a computer-readable medium on which a computer program is stored, and when the program is executed by a processing device, the steps of the method described in the first aspect of the present disclosure are implemented.

In a fourth aspect, the present disclosure provides an electronic device, including:

a storage device on which a computer program is stored;

A processing device configured to execute the computer program in the storage device to implement the steps of the method described in the first aspect of the present disclosure.

Through the above technical solution, the present disclosure first obtains the tissue image collected by the endoscope, and then determines the initial cleanliness of the floating point type and the target rounding method according to the tissue image and the pre-trained recognition model. Finally, the initial cleanliness is rounded according to the target rounding method to obtain the cleanliness of the tissue image, and the cleanliness is an integer. The disclosure determines the initial cleanliness of the floating-point type and the target rounding method suitable for tissue images through the recognition model, thereby using the target rounding method to round the initial cleanliness to obtain the cleanliness of the tissue image, which can improve the cleanliness accuracy.

Other features and advantages of the present disclosure will be described in detail in the detailed description that follows.

Description of drawings

The above and other features, advantages and aspects of the various embodiments of the present disclosure will become more apparent with reference to the following detailed description in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numerals denote the same or similar elements. It should be understood that the drawings are schematic and that elements and elements are not necessarily drawn to scale. In the attached picture:

Fig. 1 is a flow chart of a method for determining the cleanliness of a tissue cavity according to an exemplary embodiment;

Fig. 2 is a flowchart of another method for determining the cleanliness of a tissue cavity according to an exemplary embodiment;

Fig. 3 is a schematic diagram of a recognition model shown according to an exemplary embodiment;

Fig. 4 is a flowchart showing a training recognition model according to an exemplary embodiment;

Fig. 5 is a flow chart of another method for determining the cleanliness of a tissue cavity according to an exemplary embodiment;

Fig. 6 is a flowchart of another method for determining the cleanliness of a tissue cavity according to an exemplary embodiment;

Fig. 7 is a flow chart showing a training classification model according to an exemplary embodiment;

Fig. 8 is a block diagram of a device for determining the cleanliness of a tissue cavity according to an exemplary embodiment;

Fig. 9 is a block diagram of another device for determining the cleanliness of a tissue cavity according to an exemplary embodiment;

Fig. 10 is a block diagram of another device for determining the cleanliness of a tissue cavity according to an exemplary embodiment;

Fig. 11 is a block diagram of another device for determining the cleanliness of a tissue cavity according to an exemplary embodiment;

Fig. 12 is a block diagram of an electronic device according to an exemplary embodiment.

Detailed ways

Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although certain embodiments of the present disclosure are shown in the drawings, it should be understood that the disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; A more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for exemplary purposes only, and are not intended to limit the protection scope of the present disclosure.

It should be understood that the various steps described in the method implementations of the present disclosure may be executed in different orders, and/or executed in parallel. Additionally, method embodiments may include additional steps and/or omit performing illustrated steps. The scope of the present disclosure is not limited in this regard.

As used herein, the term "comprise" and its variations are open-ended, ie "including but not limited to". The term "based on" is "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one further embodiment"; the term "some embodiments" means "at least some embodiments." Relevant definitions of other terms will be given in the description below.

It should be noted that concepts such as "first" and "second" mentioned in this disclosure are only used to distinguish different devices, modules or units, and are not used to limit the sequence of functions performed by these devices, modules or units or interdependence.

It should be noted that the modifications of "one" and "multiple" mentioned in the present disclosure are illustrative and not restrictive, and those skilled in the art should understand that unless the context clearly indicates otherwise, it should be understood as "one or more" multiple".

The names of messages or information exchanged between multiple devices in the embodiments of the present disclosure are used for illustrative purposes only, and are not used to limit the scope of these messages or information.

Fig. 1 is a flowchart of a method for determining the cleanliness of a tissue cavity according to an exemplary embodiment. As shown in Fig. 1, the method includes the following steps:

Step 101, acquire tissue images collected by an endoscope.

For example, when performing an endoscopic inspection, the endoscope will continuously collect images in the tissue according to the preset collection period. The tissue image in this embodiment can be the image collected by the endoscope at the current moment, or It can be an image collected by the endoscope at any time. That is to say, the tissue image can be an image collected by the endoscope during the process of entering the tissue (ie, the process of entering the mirror), or it can be an image collected by the endoscope during the process of withdrawing from the tissue (ie, the process of withdrawing the mirror). This is not specifically limited. Further, after the tissue image is obtained, the tissue image can also be preprocessed, which can be understood as performing enhancement processing on the data included in the tissue image. Preprocessing can include: random affine transformation, random brightness, contrast, saturation, chroma adjustment, random erasing of some pixels, flip processing (including: left-right flip, up-down flip, rotation, etc.), size transformation (English: Resize) and so on, the finally obtained preprocessed tissue image may be an image of a specified size (for example, it may be 224*224).

Step 102: Determine the initial cleanliness and the target rounding method according to the tissue image and the pre-trained recognition model, where the initial cleanliness is a floating-point type.

For example, the preprocessed tissue image can be input into the pre-trained recognition model, so that the recognition model can recognize the tissue image, and output the initial cleanliness and target rounding method of floating point. Specifically, the recognition model can determine the matching probabilities of the tissue image and multiple types of cleanliness, and then determine the initial cleanliness according to the multiple matching probabilities. The initial cleanliness is a floating point type, that is to say, the initial cleanliness is usually not an integer. Various types of cleanliness are used to indicate the cleanliness of tissue images. For example, an endoscope is a colonoscope, and a tissue image is an intestinal image. The multiple cleanliness types can be the Boston Bowel Cleanliness Score Standard (English: Boston Bowel Preparation Scale, abbreviation: BBPS) in the four types: that is, the cleanliness type is 0 points, corresponding to "the entire intestinal mucosa cannot be observed due to solid and liquid feces that cannot be removed", and the cleanliness type is 1 point, corresponding to "due to Part of the intestinal mucosa cannot be observed due to stains, turbid liquid, and residual feces", the cleanliness type is 2 points, corresponding to "the intestinal mucosa is well observed, but a small amount of stains, turbid liquid, and feces remain", and the cleanliness type is 3 Score, corresponding to "the intestinal mucosa is well observed, basically no residual stains, turbid liquid, and feces". Further, the recognition model can also determine the matching probability of the tissue image and multiple rounding methods, and then determine the target rounding method according to the multiple matching probabilities. The multiple rounding methods may include, for example: rounding up (such as the ceil function), rounding down (such as the floor function), and the like. Wherein, the recognition model can be trained according to a large number of pre-collected training images and a cleanliness label corresponding to each training image. The recognition model can be, for example, CNN (English: Convolutional Neural Networks, Chinese: convolutional neural network) or LSTM (English: Long Short-Term Memory, Chinese: long-term short-term memory network), or Transformer (such as Vision Transformer). Encoder, etc., this disclosure does not specifically limit it.

In step 103, the initial cleanliness is rounded according to the target rounding method to obtain the cleanliness of the tissue image, and the cleanliness is an integer.

For example, after obtaining the initial cleanliness output by the recognition model and the target rounding method, the initial cleanliness may be rounded according to the target rounding method, so as to obtain the cleanliness of the integer tissue image. If the target rounding method is upward rounding, then the initial cleanliness can be rounded up as the cleanliness of the tissue image; if the target rounding method is downward rounding, then the initial cleanliness can be rounded down Integrate as the cleanliness of the tissue image. For example, the initial cleanliness is 2.8, if the target rounding method is up, then the cleanliness is 3, if the target rounding method is down, then the cleanliness is 2. Compared with randomly selecting the rounding method to round the floating-point cleanliness, introducing random errors, and reducing the accuracy of the cleanliness, this embodiment can use the recognition model to learn from tissue images that are suitable for tissue images. The target rounding method is used to obtain the cleanliness of the tissue image, which can effectively improve the robustness and accuracy of the cleanliness. Furthermore, since the tissue image can be an image collected by the endoscope at any time, this embodiment can determine the current cleanliness of the tissue cavity in real time, without limiting the cleanliness judgment during the process of withdrawing the mirror, and can be timely based on The cleanliness of the tissue cavity is used to determine the next operation of the endoscope, avoiding problems such as invalid mirror entry and repeated inspections.

It should be noted that the endoscope described in the embodiments of the present disclosure can be, for example, a colonoscope or a gastroscope. If the endoscope is a colonoscope, then the above-mentioned tissue image is an intestinal image, and the tissue cavity is the intestinal cavity. Then this embodiment determines the cleanliness of the intestinal lumen. If the endoscope is a gastroscope, then the tissue image above can be an image of the esophagus, stomach or duodenum, and correspondingly, the tissue cavity can be the cavity of the esophagus, the cavity of the stomach, or the cavity of the duodenum, then In this embodiment, the cleanliness of esophageal cavity, gastric cavity and duodenal cavity are determined. In the present disclosure, the endoscope can also be used to collect images of other tissues with cavities to determine the cleanliness of the tissue cavities, which is not specifically limited in the present disclosure.

To sum up, the disclosure first obtains the tissue image collected by the endoscope, and then determines the initial cleanliness of the floating-point type and the target rounding method according to the tissue image and the pre-trained recognition model. Finally, the initial cleanliness is rounded according to the target rounding method to obtain the cleanliness of the tissue image, and the cleanliness is an integer. The disclosure determines the initial cleanliness of the floating-point type and the target rounding method suitable for tissue images through the recognition model, thereby using the target rounding method to round the initial cleanliness to obtain the cleanliness of the tissue image, which can improve the cleanliness accuracy.

Fig. 2 is a flow chart of another method for determining the cleanliness of a tissue cavity according to an exemplary embodiment. As shown in Fig. 2, the recognition model is shown in Fig. 3, which includes: model and rounded submodels. Specifically, the structure of the feature extraction sub-model may be, for example, the Encoder in the Vision Transformer, or other structures capable of extracting image features, which are not specifically limited in the present disclosure. The structure of the cleanliness sub-model can be, for example, two linear layers (which can be understood as fully connected layers), cascaded together through the ReLU nonlinear layer, or other structures. The structure of the rounding sub-model can be, for example, a linear layer. Other structures are also possible, which is not specifically limited in the present disclosure.

Correspondingly, the implementation of step 102 may include:

Step 1021, input the tissue image into the feature extraction sub-model to obtain the image features output by the feature extraction sub-model for characterizing the tissue image.

For example, the tissue image is first input into the feature extraction sub-model to obtain the image features output by the feature extraction sub-model for characterizing the tissue image. In the following, the structure of the feature extraction sub-model is the Encoder in the Vision Transformer as an example to describe the process of extracting image features in detail.

First divide the input tissue image into multiple sub-images of equal size (which can be expressed as patches) according to the specified size. For example, if the input tissue image is 224*224 and the specified size is 16*16, then the tissue image can be divided into 196 sub-image. Then, the linear projection layer (English: Linear Projection) can be used to flatten each sub-image to obtain an image vector corresponding to the sub-image (which can be expressed as patch embedding), and the image vector can represent the sub-image. Further, a position vector (which may be represented as position embedding) for indicating the position of the sub-image in the tissue image may also be generated, where the size of the position embedding is the same as that of the patch embedding. It should be noted that the position embedding can be randomly generated, and the Encoder can learn the representation of the position of the corresponding sub-image in the tissue image. Afterwards, according to the image vector and position vector of each sub-image, a token corresponding to the sub-image (which can be expressed as a token) can be generated. Specifically, the token corresponding to each sub-image may be obtained by concatenating the image vector and the position vector of the sub-image (which can be understood as concat).

Further, after the token corresponding to each sub-image is obtained, the token corresponding to the tissue image may also be generated. For example, an image vector and a position vector can be randomly generated and concatenated to serve as tokens corresponding to the tissue image.

Afterwards, the token corresponding to each sub-image and the token corresponding to the tissue image can be input into the encoder, and the encoder can generate a local encoding vector corresponding to each sub-image according to the token corresponding to each sub-image. At the same time, it can also be based on Tokens corresponding to all sub-images generate global encoding vectors corresponding to tissue images. Wherein, the local encoding vector can be understood as a vector learned by the encoder and can represent the corresponding sub-image, and the global encoding vector can be understood as the vector learned by the encoder and can represent the entire tissue image. Finally, the global encoding vector can be used as the output of the feature extraction sub-model, i.e. image features. It is also possible to concatenate the global encoding vector and the local encoding vector as the output of the feature extraction sub-model, that is, the image feature. In this way, the image feature can not only represent the global information, but also represent the local information.

In step 1022, the image features are respectively input into the cleanliness sub-model and the rounding sub-model to obtain the cleanliness vector output by the cleanliness sub-model and the rounding vector output by the rounding sub-model.

Step 1023, determine the initial cleanliness according to the cleanliness vector, and determine the target rounding method according to the rounding vector.

For example, the image features can be input into the cleanliness sub-model and the rounding sub-model respectively to obtain the cleanliness vector output by the cleanliness sub-model and the rounding vector output by the rounding sub-model. Among them, the dimension of the cleanliness vector output by the cleanliness sub-model is the same as the number of cleanliness types. For example, if the tissue image is an intestinal image (that is, an endoscope is a colonoscope), the BBPS includes four types of cleanliness types, then the dimension of the cleanliness vector can be 1*4, and each dimension corresponds to a cleanliness degree type. Similarly, the dimension of the rounding vector output by the rounding sub-model is the same as the number of rounding methods. For example, the rounding methods include two types of rounding up and rounding down, so the dimension of the rounding vector can be 1* 2. Each dimension corresponds to a rounding type. Finally, the initial cleanliness is determined according to the cleanliness vector, and the target rounding method is determined according to the rounding vector.

In one implementation, the manner of determining the initial cleanliness in step 1023 may include:

Step 1) According to the cleanliness vector, determine the matching probabilities of the tissue image and various cleanliness types.

Step 2) Determine the initial cleanliness according to the weight corresponding to each cleanliness type and the matching probability of the tissue image and multiple cleanliness types.

For example, the Softmax function can be used to process the cleanliness vector to obtain the matching probability of the tissue image and various cleanliness types, and then according to the weight corresponding to each cleanliness type, and the matching probability of the tissue image and various cleanliness types, The multiple matching probabilities are weighted and summed to get the initial cleanliness. In the case that the tissue image is an intestinal image (that is, the endoscope is a colonoscope), the weight corresponding to each type of cleanliness can be determined according to the score of the BBPS. Specifically, the initial cleanliness can be determined by formula 1:

Among them, S represents the initial cleanliness, N represents the number of cleanliness types, and a _i represents the weight corresponding to the i-th cleanliness type. p _i (x) represents the matching probability between the tissue image and the i-th cleanliness type (which can be understood as the output of the Softmax function), f _i (x) represents the value of the i-th dimension in the cleanliness vector, and x represents the image feature. Take the tissue image as the intestinal image (that is, the endoscope is the colonoscope), and determine the corresponding weight according to the BBPS as an example, where the cleanliness type is 0 points and the corresponding weight is 0, and the cleanliness type is 1 point and the corresponding weight is 1 , the cleanliness type being 2 points corresponds to a weight of 2, and the cleanliness type is 3 points corresponding to a weight of 3, then N=4, a _i =i.

The manner of determining the target rounding manner in step 1023 may include:

Step 3) Determine the matching probabilities of the tissue image and multiple rounding methods according to the rounding vector.

Step 4) Determine the target rounding method among the multiple rounding methods according to the matching probabilities between the tissue image and the multiple rounding methods.

As an example, the Softmax function can also be used to process the rounding vector to obtain the matching probability of the tissue image and multiple rounding methods, and then select the rounding with the highest corresponding matching probability among the matching probabilities of the tissue image and multiple rounding methods mode as the destination rounding mode. Specifically, formula 2 can be used to determine the matching probability of the tissue image and multiple rounding methods:

Among them, M represents the number of rounding methods, q _j (x) represents the matching probability of the tissue image and the jth rounding method, g _j (x) represents the value of the jth dimension in the rounding vector, and x represents the image feature. Taking the rounding method including rounding up and rounding down as an example, then M=2.

Fig. 4 is a flow chart showing a training recognition model according to an exemplary embodiment. As shown in Fig. 4, the recognition model is obtained by training in the following manner:

Step A, obtaining a first sample input set and a first sample output set, the first sample input set includes: a plurality of first sample inputs, each first sample input includes a sample tissue image, the first sample The output set includes a first sample output corresponding to each first sample input, each first sample output including the true cleanliness of the corresponding sample tissue image.

In step B, the first sample input set is used as the input of the recognition model, and the first sample output set is used as the output of the recognition model, so as to train the recognition model.

Among them, the loss of the identification model is determined according to the cleanliness loss and the rounding loss, the cleanliness loss is determined according to the output of the cleanliness sub-model and the first sample output set, and the rounding loss is determined according to the output of the rounding sub-model and the first sample output set This output set is OK.

For example, when training the recognition model, it is necessary to first obtain the first sample input set and the first sample output set used for training the recognition model. The first sample input set includes a plurality of first sample inputs, and each first sample input may be a sample tissue image, and the sample tissue image may be, for example, a tissue image collected during an endoscopic examination before. The first sample output set includes a first sample output corresponding to each first sample input, and each first sample output includes the true cleanliness of the corresponding sample tissue image. The real cleanliness is used to indicate the cleanliness of the sample tissue image. Taking an endoscope as a colonoscope and a sample tissue image as a sample intestinal image, the real cleanliness can be divided into four types according to the BBPS: 0 points, corresponding to "due to inability to Cleared solid and liquid feces lead to unobservable part of the intestinal mucosa", 1 point, corresponding to "partial intestinal mucosa cannot be observed due to stains, cloudy liquid, residual feces", 2 points, corresponding to "the intestinal mucosa is well observed, But a small amount of stains, turbid liquid, and feces remain", 3 points, corresponding to "the intestinal mucosa is well observed, and there are basically no residual stains, turbid liquid, and feces".

When training the recognition model, the first sample input set can be used as the input of the recognition model, and then the first sample output set can be used as the output of the recognition model to train the recognition model, so that when the first sample input set is input When , the output of the recognition model can match the output set of the first sample. For example, the loss function of the recognition model can be determined according to the output of the recognition model and the first sample output set, with the goal of reducing the loss function, and the backpropagation algorithm is used to correct the parameters of the neurons in the recognition model, the parameters of the neurons For example, it may be a weight (English: Weight) and a bias (English: Bias) of a neuron. Repeat the above steps until the loss function satisfies the preset condition, for example, the loss function is smaller than the preset loss threshold, so as to achieve the purpose of training the recognition model.

Specifically, the loss of the recognition model can be divided into two parts: cleanliness loss and rounding loss. Wherein, the root cleanliness loss is determined according to the output of the cleanliness sub-model and the first sample output set, and the rounding loss is determined according to the output of the rounding sub-model and the first sample output set.

The loss of the recognition model can be determined by Equation 3:

L＝L ₁ +γL ₂ formula three

Among them, L represents the loss of the recognition model, L ₁ represents the rounding loss, L ₂ represents the cleanliness loss, and γ represents the weight parameter corresponding to the cleanliness loss, for example, it can be set to 0.5.

Further, the cleanliness loss can be determined by formula 4:

L ₂ ＝|fy ₂ | ² ＝l ² Formula 4

Wherein, L ₂ represents the cleanliness loss, f represents the output of the cleanliness sub-model, y ₂ represents the real cleanliness included in the output of the first sample, l=fy ₂ .

The rounding loss can be determined by formula five, namely the cross entropy loss function (English: CrossEntropyLoss):

Among them, L ₁ represents the rounding loss, M represents the number of rounding methods, y _1,j represents the rounding method corresponding to the real cleanliness included in the first sample output, and g _j represents the rounding vector output by the rounding sub-model The value of the jth dimension in .

Further, the initial learning rate of the training recognition model can be set to: 5e-2, the Batch size can be set to: 128, the optimizer can be selected: SGD, Epoch can be set to: 60, Decay can be set to: 0.1, and the sample organization image The size can be: 224×224.

In one implementation, each sample tissue image includes a plurality of cleanliness labels, the true cleanliness of the sample tissue image is determined according to the multiple cleanliness labels of the sample tissue image, and the consistency of the sample tissue image is determined according to the sample Among the multiple cleanliness labels of the tissue image, the number of cleanliness labels matching the real cleanliness is determined. The first sample output also includes a degree of consistency of the corresponding sample tissue image.

Correspondingly, the cleanliness loss is determined from the output of the cleanliness sub-model, the true cleanliness and consistency contained in each first sample input.

As an example, taking the tissue image as an intestinal image (that is, the endoscope is a colonoscope) as an example, according to the BBPS scoring standard, it can be seen that the cleanliness is actually based on the intestinal mucosa and stains, turbid liquid, and residual feces. It is determined based on the area ratio, which makes it easy for professionals to be subject to subjective influence when labeling sample tissue images. Therefore, when training the recognition model, each first sample input included in the first sample input set can be marked by a plurality of professionals (for example, personnel with more than 5 years of experience in the industry). After marking, each sample Tissue images are included with multiple cleanliness labels. Then, the true cleanliness and consistency of each sample tissue image can be determined according to the plurality of cleanliness labels of the sample tissue image.

Specifically, the true cleanliness may be determined according to the number of identical cleanliness labels among the multiple cleanliness labels. For example, if a sample tissue image includes K cleanliness labels, among which more than K/2 cleanliness labels are 2 points, then it can be determined that the real cleanliness of the sample tissue image is 2 points. For another example, if a sample tissue image includes K cleanliness labels, and there are no more than K/2 identical cleanliness labels, then the sample tissue image can be deleted from the first sample input set, that is, the sample tissue image is discarded . In this way, the influence of subjectivity on the true cleanliness can be reduced, thereby ensuring the stability of the recognition model training.

The degree of consistency can be determined according to the number of cleanliness labels matching the real cleanliness among the multiple cleanliness labels of the sample tissue image. For example, a sample tissue image includes K cleanliness labels, among which, there are D (D≥K/2) cleanliness labels with 3 points, the true cleanliness of the sample tissue image is 3 points, and the consistency of the sample tissue image is The degree is D. The degree of consistency is used to indicate the degree of difficulty in distinguishing the image of the sample tissue. The higher the degree of consistency, the easier it is to distinguish the image of the sample tissue, and the lower the degree of consistency, the more difficult it is to distinguish the image of the sample tissue. Further, in the first sample output set, each first sample output may include not only the actual cleanliness of the corresponding sample tissue image, but also the consistency of the corresponding sample tissue image.

Correspondingly, the cleanliness loss can be determined from the output of the cleanliness sub-model, the true cleanliness and consistency contained in each first sample input. Specifically, the cleanliness loss can be determined by formula six:

Wherein, L ₂ represents the loss of cleanliness, t represents a preset threshold, and l=fy ₂ . α represents the preset control coefficient, for example, it can be set to 0.1; β represents the preset bias coefficient, which is used to ensure that when l=t, the upper and lower calculation results are the same, and can be set to 0.2; D represents the first sample output The degree of consistency included in , K represents the possible degree of consistency in the output of the first sample, taking each sample tissue image including 5 cleanliness labels as an example, then the possible degrees of consistency are 3, 4, 5.

The consistency of the sample tissue image is introduced in the cleanliness loss, which can reduce the influence of subjectivity on the recognition model training, thereby improving the stability and accuracy of the recognition model.

Fig. 5 is a flow chart of another method for determining the cleanliness of a tissue cavity according to an exemplary embodiment. As shown in Fig. 5, before step 102, the method may further include:

Step 104, classify the tissue image by using the pre-trained classification model to determine the target type of the tissue image.

Correspondingly, the implementation manner of step 102 may be:

If the target type indicates that the quality of the tissue image satisfies the preset condition, the initial cleanliness and the target rounding method are determined according to the tissue image and the recognition model.

For example, the tissue image collected by the endoscope can be input into a pre-trained classification model, so that the classification model can classify the tissue image, and the output of the classification model is the target type of the tissue image. The target type may include: the first type and the second type, the first type is used to indicate that the quality of the tissue image meets the preset condition, indicating that the quality of the tissue image is high, and the second type is used to indicate that the quality of the tissue image does not meet the preset condition condition, indicating poor quality of tissue images. Wherein, the classification model is used to identify the type of the input image, and the classification model can be trained according to a large number of pre-collected training images and a type label corresponding to each training image. The classification model can be, for example, CNN or LSTM, or an Encoder in Transformer (such as Vision Transformer), etc., which is not specifically limited in the present disclosure. When the endoscope is a colonoscope and the tissue image is an intestinal image, the preset conditions may include: the colonoscope is not blocked when the intestinal image is collected, and the distance between the colonoscope and the intestinal wall is greater than the preset distance when the intestinal image is collected Threshold, the exposure of the intestinal tract image is less than the preset exposure threshold, the blurriness of the intestinal tract image is less than the preset blurriness threshold, there is no intestinal adhesion in the intestinal tract image, and the like. For example, if the intestinal tract is covered by sewage, or the colonoscope is too close to the intestinal wall, the intestinal image is overexposed, the intestinal image is too blurred, or the intestinal adhesion occurs, the quality of the intestinal image does not meet the preset conditions.

Correspondingly, when the target type indicates that the quality of the tissue image satisfies the preset condition, the tissue image can be input into the recognition model, so that the recognition model can determine the initial cleanliness and the rounding method of the target. That is to say, when it is determined that the quality of the tissue image is high, the tissue image is then identified. In the case where the target type indicates that the quality of the tissue image does not meet the preset condition, the tissue image may be discarded directly. Further, the image collected by the endoscope in the next collection cycle may be selected, and the above steps may be repeated to determine the cleanliness of the tissue cavity.

Fig. 6 is a flowchart of another method for determining the cleanliness of a tissue cavity according to an exemplary embodiment. As shown in Fig. 6, the implementation of step 104 may include:

Step 1041, perform preprocessing on the tissue image, and divide the preprocessed tissue image into multiple sub-images of equal size.

Step 1042, according to the image vector corresponding to each sub-image and the position vector corresponding to the sub-image, determine the token corresponding to the sub-image, and the position vector is used to indicate the position of the sub-image in the preprocessed tissue image.

Step 1043: Input the token corresponding to each sub-image and the token corresponding to the tissue image into the encoder to obtain a local encoding vector corresponding to each sub-image and a global encoding vector corresponding to the tissue image.

Step 1044, input the global encoding vector and multiple local encoding vectors into the classification layer, so as to obtain the target type output by the classification layer.

Exemplarily, the classification model may include: an encoder and a classification layer, and may also include a linear projection layer. Among them, the encoder can be the Encoder in Vision Transformer, the classification layer can be MLP (English: Multilayer Perceptron Head), and the linear projection layer can be understood as a fully connected layer.

First, the tissue image can be preprocessed to enhance the data included in the tissue image. The preprocessing can include: random affine transformation, random brightness, contrast, saturation, hue adjustment, size transformation, etc., and finally get The preprocessed tissue image may be an image of a specified size (for example, 224*224). After that, the preprocessed tissue image can be divided into multiple sub-images of equal size (which can be represented as patches) according to the specified size. For example, the preprocessed tissue image is 224*224, and the specified size is 16*16, then you can Divide the preprocessed tissue image into 196 sub-images.

Afterwards, each sub-image can be flattened by using the linear projection layer to obtain an image vector corresponding to the sub-image (which can be expressed as patch embedding), and the image vector can represent the sub-image. Further, a position vector (may be expressed as position embedding) for indicating the position of the sub-image in the preprocessed tissue image may also be generated, where the size of the position embedding is the same as that of the patch embedding. It should be noted that the position embedding can be randomly generated, and the encoder can learn the representation of the position of the corresponding sub-image in the tissue image. Afterwards, according to the image vector and position vector of each sub-image, a token corresponding to the sub-image (which can be expressed as a token) can be generated. Specifically, the token corresponding to each sub-image may be obtained by concatenating the image vector and the position vector of the sub-image.

Then, the token corresponding to each sub-image and the token corresponding to the tissue image can be input into the encoder, and the encoder can generate a local encoding vector corresponding to each sub-image according to the token corresponding to each sub-image. At the same time, it can also be based on Tokens corresponding to all sub-images generate global encoding vectors corresponding to tissue images. Wherein, the local encoding vector can be understood as a vector learned by the encoder and can represent the corresponding sub-image, and the global encoding vector can be understood as the vector learned by the encoder and can represent the entire tissue image.

Finally, the global encoding vector and multiple local encoding vectors can be input into the classification layer, and the output of the classification layer is the target type. Specifically, the global encoding vector and multiple local encoding vectors can be concatenated to obtain a comprehensive encoding vector, and then the integrated encoding vector is input into the classification layer, and the classification layer can determine the matching of tissue images with various types according to the integrated encoding vector. Probability, and finally the type with the highest matching probability is used as the target type. Since the input of the classification layer includes both the global encoding vector and each local encoding vector, the characteristics of the entire tissue image and each sub-image are integrated, that is, the global information and local information are considered, which can effectively improve the classification accuracy of the classification model. .

Fig. 7 is a flow chart showing a training classification model according to an exemplary embodiment. As shown in Fig. 7, the classification model is obtained by training in the following manner:

Step C, obtaining a second sample input set and a second sample output set, the second sample input set includes: a plurality of second sample inputs, each second sample input includes a sample tissue image, and the second sample output set includes each The second sample input corresponds to the second sample output, and each second sample output includes the true type of the corresponding sample tissue image.

In step D, the second sample input set is used as the input of the classification model, and the second sample output set is used as the output of the classification model, so as to train the classification model.

For example, when training a classification model, it is necessary to obtain a second sample input set and a second sample output set for training the classification model. The second sample input set includes a plurality of second sample inputs, and each second sample input may be a sample tissue image, and the sample tissue image may be, for example, a tissue image collected during an endoscopic examination before. The second sample output set includes a second sample output corresponding to each second sample input, and each second sample output includes the true type of the corresponding sample tissue image, and the true type may include: the first type and the second type, The first type is used to indicate that the quality of the tissue image meets the preset condition, and the second type is used to indicate that the quality of the tissue image does not meet the preset condition.

When training the classification model, the second sample input set can be used as the input of the classification model, and then the second sample output set can be used as the output of the classification model to train the classification model, so that when the second sample input set is input, the classification The output of the model can be matched with the second sample output set. For example, according to the output of the classification model, the difference (or mean square error) with the second sample output set can be used as the loss function of the classification model, with the goal of reducing the loss function, and the backpropagation algorithm is used to correct the neurons in the classification model. Parameters, the parameters of the neuron may be, for example, the weight and bias of the neuron. Repeat the above steps until the loss function satisfies the preset condition, for example, the loss function is smaller than the preset loss threshold, so as to achieve the purpose of training the classification model. Specifically, the loss function of the classification model can be shown in Formula 7 (ie, the cross-entropy loss function):

Among them, L _class represents the loss function of the classification model,

Indicates the output of the classification model (which can be understood as the matching probability between the sample tissue image and the qth type), s _q represents the matching probability between the real type of the sample tissue image and the qth type, and F represents the number of real types. Taking the real type includes a first type and a second type, the first type is used to indicate that the quality of the tissue image meets the preset condition, and the second type is used to indicate that the quality of the tissue image does not meet the preset condition, then F=2.

Fig. 8 is a block diagram of a device for determining the cleanliness of a tissue cavity according to an exemplary embodiment. As shown in Fig. 8, the device 200 may include:

The obtaining module 201 is configured to obtain tissue images collected by the endoscope.

The recognition module 202 is configured to determine the initial cleanliness and the target rounding method according to the tissue image and the pre-trained recognition model, and the initial cleanliness is a floating-point type.

The rounding module 203 is configured to round the initial cleanliness according to the target rounding method to obtain the cleanliness of the tissue image, and the cleanliness is an integer.

Fig. 9 is a block diagram of another device for determining the cleanliness of a tissue cavity according to an exemplary embodiment. As shown in Fig. 9 , the identification model includes: a feature extraction sub-model, a cleanliness sub-model and a rounding sub-model.

Correspondingly, the identification module 202 may include:

The feature extraction sub-module 2021 is configured to input the tissue image into the feature extraction sub-model to obtain the image features output by the feature extraction sub-model for characterizing the tissue image.

The processing sub-module 2022 is used to input the image features into the cleanliness sub-model and the rounding sub-model respectively, so as to obtain the cleanliness vector output by the cleanliness sub-model and the rounding vector output by the rounding sub-model.

The determining sub-module 2023 is configured to determine the initial cleanliness according to the cleanliness vector, and determine the target rounding method according to the rounding vector.

In one implementation, the determining submodule 2023 can be used to perform the following steps:

In another implementation, the recognition model is trained by:

In yet another implementation, each sample tissue image includes multiple cleanliness labels, the real cleanliness of the sample tissue image is determined according to the multiple cleanliness labels of the sample tissue image, and the consistency of the sample tissue image is determined according to the Among the multiple cleanliness labels of the sample tissue image, the number of cleanliness labels matching the real cleanliness is determined. The first sample output also includes a degree of consistency of the corresponding sample tissue image.

Fig. 10 is a block diagram of another device for determining the cleanliness of a tissue cavity according to an exemplary embodiment. As shown in Fig. 10, the device 200 further includes:

The classification module 204 is configured to classify the tissue image by using the pre-trained classification model to determine the target type of the tissue image before determining the initial cleanliness and the target rounding method according to the tissue image and the pre-trained recognition model.

Correspondingly, the recognition module 202 may be configured to determine the initial cleanliness and the rounding method of the target according to the tissue image and the recognition model if the target type indicates that the quality of the tissue image satisfies a preset condition.

Fig. 11 is a block diagram of another device for determining the cleanliness of a tissue cavity according to an exemplary embodiment. As shown in Fig. 11 , the classification module 204 may include:

The preprocessing sub-module 2041 is configured to preprocess the tissue image, and divide the preprocessed tissue image into multiple sub-images of equal size.

The token determination sub-module 2042 is configured to determine the token corresponding to the sub-image according to the image vector corresponding to each sub-image and the position vector corresponding to the sub-image, and the position vector is used to indicate the organization of the sub-image after preprocessing position in the image.

The encoding sub-module 2043 is configured to input the token corresponding to each sub-image and the token corresponding to the tissue image into the encoder to obtain a local encoding vector corresponding to each sub-image and a global encoding vector corresponding to the tissue image.

The classification sub-module 2044 is configured to input the global encoding vector and multiple local encoding vectors into the classification layer, so as to obtain the target type output by the classification layer.

In one implementation, the classification model is trained by:

Regarding the apparatus in the foregoing embodiments, the specific manner in which each module executes operations has been described in detail in the embodiments related to the method, and will not be described in detail here.

Referring now to FIG. 12 , it shows a schematic structural diagram of an electronic device (for example, the execution subject in the above embodiments, which may be a terminal device or a server) 300 suitable for implementing the embodiments of the present disclosure. The terminal equipment in the embodiment of the present disclosure may include but not limited to such as mobile phone, notebook computer, digital broadcast receiver, PDA (personal digital assistant), PAD (tablet computer), PMP (portable multimedia player), vehicle terminal (such as mobile terminals such as car navigation terminals) and fixed terminals such as digital TVs, desktop computers and the like. The electronic device shown in FIG. 12 is only an example, and should not limit the functions and application scope of the embodiments of the present disclosure.

As shown in FIG. 12, an electronic device 300 may include a processing device (such as a central processing unit, a graphics processing unit, etc.) 301, which may be randomly accessed according to a program stored in a read-only memory (ROM) 302 or loaded from a storage device 308. Various appropriate actions and processes are executed by programs in the memory (RAM) 303 . In the RAM 303, various programs and data necessary for the operation of the electronic device 300 are also stored. The processing device 301, ROM 302, and RAM 303 are connected to each other through a bus 304. An input/output (I/O) interface 305 is also connected to the bus 304 .

Typically, the following devices can be connected to the I/O interface 305: input devices 306 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; including, for example, a liquid crystal display (LCD), speaker, vibrating an output device 307 such as a computer; a storage device 308 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 309. The communication means 309 may allow the electronic device 300 to perform wireless or wired communication with other devices to exchange data. While FIG. 12 shows electronic device 300 having various means, it should be understood that implementing or having all of the means shown is not a requirement. More or fewer means may alternatively be implemented or provided.

In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product, which includes a computer program carried on a non-transitory computer readable medium, where the computer program includes program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via communication means 309, or from storage means 308, or from ROM 302. When the computer program is executed by the processing device 301, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are performed.

It should be noted that the above-mentioned computer-readable medium in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the above two. A computer readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to, electrical connections with one or more wires, portable computer diskettes, hard disks, random access memory (RAM), read-only memory (ROM), erasable Programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. In the present disclosure, however, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave carrying computer-readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can transmit, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device . Program code embodied on a computer readable medium may be transmitted by any appropriate medium, including but not limited to wires, optical cables, RF (radio frequency), etc., or any suitable combination of the above.

In some embodiments, the terminal device and the server can communicate with any currently known or future-developed network protocols such as HTTP (HyperText Transfer Protocol, Hypertext Transfer Protocol), and can communicate with digital data in any form or medium The communication (eg, communication network) interconnections. Examples of communication networks include local area networks ("LANs"), wide area networks ("WANs"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network of.

The above-mentioned computer-readable medium may be included in the above-mentioned electronic device, or may exist independently without being incorporated into the electronic device.

The above-mentioned computer-readable medium carries one or more programs, and when the above-mentioned one or more programs are executed by the electronic device, the electronic device: acquires the tissue image collected by the endoscope; according to the tissue image and the pre-trained Identifying the model, determining the initial cleanliness and the target rounding method, the initial cleanliness is a floating-point type; according to the target rounding method, rounding the initial cleanliness to obtain the cleanliness of the tissue image , the cleanliness is an integer.

Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, or combinations thereof, including but not limited to object-oriented programming languages—such as Java, Smalltalk, C++, and Includes conventional procedural programming languages - such as "C" or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, using an Internet service provider to connected via the Internet).

The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more logical functions for implementing specified executable instructions. It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved. It should also be noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified functions or operations , or may be implemented by a combination of dedicated hardware and computer instructions.

The modules involved in the embodiments described in the present disclosure may be implemented by software or by hardware. Wherein, the name of the module does not constitute a limitation of the module itself under certain circumstances, for example, the obtaining module may also be described as a "module for obtaining tissue images".

The functions described herein above may be performed at least in part by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), System on Chips (SOCs), Complex Programmable Logical device (CPLD) and so on.

In the context of the present disclosure, a machine-readable medium may be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device. A machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media would include one or more wire-based electrical connections, portable computer discs, hard drives, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, compact disk read only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.

According to one or more embodiments of the present disclosure, Example 1 provides a method for determining the cleanliness of a tissue cavity, including: acquiring a tissue image collected by an endoscope; according to the tissue image and a pre-trained recognition model, determining an initial Cleanliness and target rounding method, the initial cleanliness is a floating point type; according to the target rounding method, the initial cleanliness is rounded to obtain the cleanliness of the tissue image, the cleanliness is an integer.

According to one or more embodiments of the present disclosure, Example 2 provides the method of Example 1, the recognition model includes: a feature extraction sub-model, a cleanliness sub-model and a rounding sub-model; The trained recognition model determines the initial cleanliness and the target rounding method, including: inputting the tissue image into the feature extraction sub-model to obtain an image output by the feature extraction sub-model for characterizing the tissue image feature; input the image feature into the cleanliness sub-model and the rounding sub-model respectively, to obtain the cleanliness vector output by the cleanliness sub-model, and the rounding vector output by the rounding sub-model; The initial cleanliness is determined according to the cleanliness vector, and the target rounding manner is determined according to the rounding vector.

According to one or more embodiments of the present disclosure, Example 3 provides the method of Example 2, the determining the initial cleanliness according to the cleanliness vector includes: determining the tissue image according to the cleanliness vector Matching probabilities with multiple types of cleanliness; determining the initial cleanliness according to the weight corresponding to each type of cleanliness and the matching probability of the tissue image with multiple types of cleanliness; The rounding vector, determining the target rounding method, includes: according to the rounding vector, determining the matching probability of the tissue image and multiple rounding methods; according to the tissue image and multiple rounding methods The matching probability of , and determine the target rounding method among multiple rounding methods.

According to one or more embodiments of the present disclosure, Example 4 provides the method of Example 2, the recognition model is obtained by training in the following manner: obtaining a first sample input set and a first sample output set, the first A sample input set includes: a plurality of first sample inputs, each of the first sample inputs includes a sample tissue image, and the first sample output set includes a corresponding to each of the first sample inputs The first sample output, each of the first sample outputs includes the true cleanliness of the corresponding sample tissue image; the first sample input set is used as the input of the recognition model, and the first The sample output set is used as the output of the recognition model to train the recognition model; the loss of the recognition model is determined according to the cleanliness loss and the rounding loss, and the cleanliness loss is determined according to the output of the cleanliness sub-model determined with the first sample output set, and the rounding loss is determined according to the output of the rounding sub-model and the first sample output set.

According to one or more embodiments of the present disclosure, Example 5 provides the method of Example 4, each of the sample tissue images includes a plurality of cleanliness labels, and the real cleanliness of the sample tissue image is based on the plurality of cleanliness labels of the sample tissue image. The cleanliness label is determined, and the consistency of the sample tissue image is determined according to the number of the cleanliness labels that match the real cleanliness among the multiple cleanliness labels of the sample tissue image; the first The sample output also includes the consistency of the corresponding sample tissue image; the cleanliness loss is based on the output of the cleanliness sub-model, the true cleanliness included in each of the first sample inputs, and the The degree of consistency is determined.

According to one or more embodiments of the present disclosure, Example 6 provides the method of Example 1, before determining the initial cleanliness and the target rounding method according to the tissue image and the pre-trained recognition model, the method further Including: using a pre-trained classification model to classify the tissue image to determine the target type of the tissue image; determining the initial cleanliness and target rounding method according to the tissue image and the pre-trained recognition model, The method includes: if the target type indicates that the quality of the tissue image satisfies a preset condition, determining the initial cleanliness and the target rounding method according to the tissue image and the recognition model.

According to one or more embodiments of the present disclosure, Example 7 provides the method of Example 6, the classification model includes: an encoder and a classification layer, and the pre-trained classification model is used to classify the tissue image to determine The target type of the tissue image includes: preprocessing the tissue image, and dividing the preprocessed tissue image into multiple sub-images of equal size; according to the image vector corresponding to each of the sub-images, The position vector corresponding to the sub-image determines the token corresponding to the sub-image, and the position vector is used to indicate the position of the sub-image in the preprocessed tissue image; each of the sub-images corresponds to token, and the token corresponding to the tissue image is input into the encoder to obtain a local encoding vector corresponding to each sub-image, and a global encoding vector corresponding to the tissue image; combine the global encoding vector and a plurality of the local encoding vectors An encoded vector is input to a classification layer to obtain the object type output by the classification layer.

According to one or more embodiments of the present disclosure, Example 8 provides the method of Example 7, the classification model is obtained by training in the following manner: obtaining a second sample input set and a second sample output set, the second sample The input set includes: a plurality of second sample inputs, each of which includes a sample tissue image, and the second sample output set includes a second sample output corresponding to each of the second sample inputs, each The second sample output includes the true type of the corresponding sample tissue image; the second sample input set is used as the input of the classification model, and the second sample output set is used as the output of the classification model, to train the classification model.

According to one or more embodiments of the present disclosure, Example 9 provides a device for determining the cleanliness of a tissue cavity, including: an acquisition module for acquiring tissue images collected by an endoscope; an identification module for The image and the pre-trained recognition model determine the initial cleanliness and the target rounding method, the initial cleanliness is a floating point type; the rounding module is used to round the initial cleanliness according to the target rounding method integer to obtain the cleanliness of the tissue image, and the cleanliness is integer.

According to one or more embodiments of the present disclosure, Example 10 provides a computer-readable medium on which a computer program is stored, and when the program is executed by a processing device, the steps of the methods described in Example 1 to Example 8 are implemented.

According to one or more embodiments of the present disclosure, Example 11 provides an electronic device, including: a storage device on which a computer program is stored; a processing device configured to execute the computer program in the storage device to Implement the steps of the method described in Example 1 to Example 8.

The above description is only a preferred embodiment of the present disclosure and an illustration of the applied technical principle. Those skilled in the art should understand that the disclosure scope involved in this disclosure is not limited to the technical solution formed by the specific combination of the above-mentioned technical features, but also covers the technical solutions formed by the above-mentioned technical features or Other technical solutions formed by any combination of equivalent features. For example, a technical solution formed by replacing the above-mentioned features with (but not limited to) technical features with similar functions disclosed in this disclosure.

In addition, while operations are depicted in a particular order, this should not be understood as requiring that the operations be performed in the particular order shown or performed in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while the above discussion contains several specific implementation details, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.

Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely example forms of implementing the claims. Regarding the apparatus in the foregoing embodiments, the specific manner in which each module executes operations has been described in detail in the embodiments related to the method, and will not be described in detail here.

Claims

A method for determining the cleanliness of a tissue cavity, wherein the method comprises:

Obtain tissue images collected by the endoscope;

Determine the initial cleanliness and the target rounding method according to the tissue image and the pre-trained recognition model, and the initial cleanliness is a floating-point type;

According to the target rounding manner, the initial cleanliness is rounded to obtain the cleanliness of the tissue image, and the cleanliness is an integer.
The method according to claim 1, wherein the identification model comprises: a feature extraction sub-model, a cleanliness sub-model and a rounding sub-model; the initial cleanliness is determined according to the tissue image and the pre-trained identification model and target rounding methods, including:

inputting the tissue image into the feature extraction sub-model to obtain image features output by the feature extraction sub-model for characterizing the tissue image;

Inputting the image features into the cleanliness sub-model and the rounding sub-model respectively to obtain the cleanliness vector output by the cleanliness sub-model and the rounding vector output by the rounding sub-model;

The initial cleanliness is determined according to the cleanliness vector, and the target rounding manner is determined according to the rounding vector.
The method according to claim 2, wherein said determining said initial cleanliness according to said cleanliness vector comprises:

determining the matching probabilities of the tissue image and multiple cleanliness types according to the cleanliness vector;

determining the initial cleanliness according to the weight corresponding to each of the cleanliness types and the matching probabilities of the tissue image and multiple cleanliness types;

The determining the target rounding method according to the rounding vector includes:

According to the rounding vector, determine the matching probability of the tissue image and multiple rounding methods;

The target rounding method is determined among the multiple rounding methods according to the matching probabilities of the tissue image and the multiple rounding methods.
The method according to claim 2, wherein the recognition model is obtained by training in the following manner:

Obtain a first sample input set and a first sample output set, the first sample input set includes: a plurality of first sample inputs, each of the first sample inputs includes a sample tissue image, the first sample input set includes: A sample output set includes a first sample output corresponding to each of the first sample inputs, and each of the first sample outputs includes the true cleanliness of the corresponding sample tissue image;

using the first sample input set as an input to the recognition model, and using the first sample output set as an output of the recognition model to train the recognition model;

The loss of the identification model is determined according to the cleanliness loss and the rounding loss, the cleanliness loss is determined according to the output of the cleanliness sub-model and the first sample output set, and the rounding loss is determined according to the The output of the rounded submodel is determined with the first set of sample outputs.
The method according to claim 4, wherein each of the sample tissue images includes a plurality of cleanliness labels, the true cleanliness of the sample tissue image is determined according to the plurality of the cleanliness labels of the sample tissue image, the sample The consistency of the tissue image is determined according to the number of the cleanliness labels that match the real cleanliness among the multiple cleanliness labels of the sample tissue image; the first sample output also includes the corresponding Consistency of sample tissue images;

The cleanliness loss is determined from the output of the cleanliness sub-model, the true cleanliness and the consistency included in each of the first sample inputs.
The method according to claim 1, wherein, before determining the initial cleanliness and the target rounding method according to the tissue image and the pre-trained recognition model, the method further comprises:

classifying the tissue image using a pre-trained classification model to determine the target type of the tissue image;

The determining the initial cleanliness and target rounding method according to the tissue image and the pre-trained recognition model includes:

If the target type indicates that the quality of the tissue image satisfies a preset condition, determine the initial cleanliness and the target rounding method according to the tissue image and the identification model.
The method according to claim 6, wherein said classification model comprises: an encoder and a classification layer, said classifying said tissue image using a pre-trained classification model to determine the target type of said tissue image, comprising :

Preprocessing the tissue image, and dividing the preprocessed tissue image into multiple sub-images of equal size;

According to the image vector corresponding to each of the sub-images, and the position vector corresponding to the sub-image, determine the token corresponding to the sub-image, and the position vector is used to indicate that the sub-image is in the preprocessed tissue image s position;

Inputting the token corresponding to each sub-image and the token corresponding to the tissue image into an encoder to obtain a local coding vector corresponding to each sub-image and a global coding vector corresponding to the tissue image;

Inputting the global encoding vector and the plurality of local encoding vectors into a classification layer to obtain the object type output by the classification layer.
The method according to claim 7, wherein the classification model is obtained by training in the following manner:

Obtain a second sample input set and a second sample output set, the second sample input set includes: a plurality of second sample inputs, each of the second sample inputs includes a sample tissue image, and the second sample output set includes a second sample output corresponding to each of the second sample inputs, each of the second sample outputs including the true type of the corresponding sample tissue image;

The second sample input set is used as an input of the classification model, and the second sample output set is used as an output of the classification model, so as to train the classification model.
A device for determining the cleanliness of a tissue cavity, the device comprising:

An acquisition module, configured to acquire tissue images collected by the endoscope;

A recognition module, configured to determine an initial cleanliness and a target rounding method according to the tissue image and a pre-trained recognition model, where the initial cleanliness is a floating-point type;

The rounding module is configured to round the initial cleanliness according to the target rounding manner to obtain the cleanliness of the tissue image, and the cleanliness is integer.
A computer-readable medium, on which a computer program is stored, and when the program is executed by a processing device, the steps of the method according to any one of claims 1-8 are implemented.
An electronic device comprising:

a storage device on which a computer program is stored;

A processing device configured to execute the computer program in the storage device to implement the steps of the method according to any one of claims 1-8.