WO2022042506A1

WO2022042506A1 - Convolutional neural network-based cell screening method and device

Info

Publication number: WO2022042506A1
Application number: PCT/CN2021/114165
Authority: WO
Inventors: 陈亮; 韩晓健; 侯媛媛; 哈斯木买买提依明; 梁国龙
Original assignee: 深圳太力生物技术有限责任公司
Priority date: 2020-08-26
Filing date: 2021-08-24
Publication date: 2022-03-03
Also published as: CN112037862B; CN112037862A

Abstract

The present application relates to a convolutional neural network-based cell screening method and device, the method comprising: acquiring gray-scale images of cells to be tested which respectively correspond to multiple cells to be tested in a cell culture pool; inputting the multiple gray-scale images of cells to be tested which correspond to the multiple cells to be tested into a target convolutional neural network model; obtaining the protein expression levels respectively corresponding to the multiple cells to be tested according to the output of the convolutional neural network model; and determining, according to the protein expression levels corresponding to the multiple cells to be tested, from the multiple cells to be tested target cells having protein expression levels satisfying a preset condition. Cells having high protein expression levels can be quickly determined, cell screening can be performed without repeated culturing and screening, and the screening cycle can be significantly shortened.

Description

Cell screening method and device based on convolutional neural network

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to Chinese patent application No. 202010869638.4, entitled "Cell Screening Method and Device Based on Convolutional Neural Networks", filed with the State Intellectual Property Office of China on August 26, 2020, the entire contents of which are incorporated into this application by reference.

TECHNICAL FIELD

The present application relates to the field of biotechnology, in particular to a cell screening method, device, computer equipment and storage medium based on convolutional neural network.

BACKGROUND

With the continuous development of genetic engineering technology, the isolation of monoclonal cell lines capable of expressing specific products from cell pools has become a common requirement in the biological field.

In the prior art, to obtain cells for culturing a monoclonal cell line, the cells in the cell pool are first transfected, and the cell pool is then processed by the limiting dilution method to obtain single cells. Each single cell is cultured into a homogeneous cell population, namely a cell line, and the cell lines with high target protein expression are then screened out.

However, obtaining single cells by the limiting dilution method is cumbersome and requires repeated culturing and screening. In addition, because transfection efficiency is limited, the proportion of cells with high expression of the target protein is low. As a result, screening is inefficient, the screening cycle is long, and it is difficult to obtain cells with high target protein expression quickly and accurately.

SUMMARY OF THE INVENTION

Based on this, it is necessary to provide a cell screening method, device, computer equipment and storage medium based on convolutional neural network in view of the above technical problems.

A cell screening method based on a convolutional neural network, the method comprising:

Obtain the grayscale images of the cells to be tested corresponding to the plurality of cells to be tested in the cell culture pool;

Input the plurality of grayscale images of the cells to be tested corresponding to the plurality of cells to be tested into a target convolutional neural network model; the target convolutional neural network model is obtained by training on a plurality of training cell grayscale images with expression labels, the expression label characterizes the real protein expression level of the cell in each training cell grayscale image, and the target convolutional neural network model is used to detect the protein expression level of the cells in the grayscale images of the cells to be tested that are input to the model;

According to the output of the target convolutional neural network model, obtain the corresponding protein expression levels of the plurality of cells to be tested;

According to the protein expression levels corresponding to the plurality of cells to be tested, target cells whose protein expression levels meet the set conditions are determined from the plurality of cells to be tested.

Optionally, the method further includes:

Obtain the training cell grayscale image and its corresponding training cell fluorescence image;

According to the training cell fluorescence map, the actual protein expression level of the cells in the corresponding training cell grayscale map is determined, and the expression level label corresponding to the training cell grayscale map is obtained based on the actual protein expression level;

The convolutional neural network model is trained by using the training cell grayscale image and the expression label to generate a target convolutional neural network model.

Optionally, determining, according to the training cell fluorescence map, the actual protein expression level of the cells in the corresponding training cell grayscale map, and obtaining the expression level label corresponding to the training cell grayscale map based on the actual protein expression level, includes:

determining the value of the green channel in the training cell fluorescence image;

According to the value of the green channel in the fluorescence image of the training cells, determine the actual protein expression of the cells in the corresponding gray-scale image of the training cells;

The real protein expression level is determined as the expression level label corresponding to the grayscale image of the training cells.

Optionally, training a convolutional neural network model using the training cell grayscale image and the expression label to generate a target convolutional neural network model includes:

Inputting the training cell grayscale image into the convolutional neural network model, and determining the protein expression corresponding to the training cell grayscale image;

Determine the training error according to the protein expression corresponding to the grayscale image of the training cells and the expression label;

According to the training error, the network parameters of the convolutional neural network model are adjusted by reducing the error to obtain optimal network parameters, and the target convolutional neural network model is generated by using the optimal network parameters.

Optionally, the convolutional neural network model includes a multi-layer structure, and adjusting the network parameters of the convolutional neural network model according to the training error, in the direction that reduces the error, to obtain optimal network parameters includes:

judging whether the training error has converged and is less than a preset error threshold;

If so, determine the network parameters of the current convolutional neural network model as the optimal network parameters;

If not, back-propagate the training error from the last layer of the convolutional neural network model, adjust the network parameters of each layer of the convolutional neural network model in the direction that reduces the error, and return to the step of inputting the training cell grayscale image into the convolutional neural network model and determining the protein expression level corresponding to the training cell grayscale image.

Optionally, the convolutional neural network model includes a first network structure, a second network structure, a third network structure, a fourth network structure and a fully connected layer; the training cell grayscale image comprises a plurality of training cell grayscale images;

Inputting the training cell grayscale image into the convolutional neural network model, and determining the protein expression corresponding to the training cell grayscale image, including:

inputting multiple grayscale images of training cells into the convolutional neural network model;

For each training cell grayscale image, obtain the corresponding training cell feature through the first network structure; input the training cell feature into the second network structure, the third network structure and the fourth network structure to obtain a corresponding first cell feature, second cell feature and third cell feature;

The first cell feature, the second cell feature and the third cell feature are connected in parallel to obtain the feature fusion result corresponding to each training cell grayscale image, wherein the first cell feature, the second cell feature and the third cell feature have different levels of abstract expression;

The feature fusion results corresponding to the grayscale images of the training cells are input to the fully connected layer, and the protein expression levels corresponding to the grayscale images of the training cells are determined according to the output results of the fully connected layer.
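The data flow of this optional structure can be sketched as follows. This is only an illustration under stated assumptions: the three branch structures are each assumed to take the shared training cell feature as input, "connected in parallel" is read as concatenating the three branch features, and the stem, branch functions, weights and bias are hypothetical toy stand-ins rather than the layers of the network described in this application.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def fuse_and_predict(
    image: Sequence[float],
    stem: Callable[[Sequence[float]], Vector],
    branches: Sequence[Callable[[Vector], Vector]],
    fc_weights: Sequence[float],
    fc_bias: float,
) -> float:
    """Shared stem -> three branches -> concatenation -> one linear output."""
    shared = stem(image)                 # first network structure
    fused: Vector = []
    for branch in branches:              # second, third and fourth structures
        fused.extend(branch(shared))     # assumed fusion: concatenation
    if len(fused) != len(fc_weights):
        raise ValueError("fully connected layer size does not match fused features")
    # Fully connected layer reduced to a single regression output.
    return sum(w * x for w, x in zip(fc_weights, fused)) + fc_bias


# Hypothetical toy stand-ins for the four network structures.
def toy_stem(img: Sequence[float]) -> Vector:
    return [sum(img) / len(img), max(img)]


def low_level(f: Vector) -> Vector:
    return [f[0]]


def mid_level(f: Vector) -> Vector:
    return [f[0] * f[1]]


def high_level(f: Vector) -> Vector:
    return [f[1] ** 2]


prediction = fuse_and_predict(
    [0.0, 0.5, 1.0], toy_stem, [low_level, mid_level, high_level], [1.0, 1.0, 1.0], 0.0
)
```

In a real model each stand-in would be a stack of convolutional and pooling layers; the sketch only shows where the three features of different abstraction levels meet before the fully connected layer.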

Optionally, according to the protein expression levels corresponding to the plurality of cells to be tested, the target cells whose protein expression levels meet the set conditions are determined from the plurality of cells to be tested, including:

Sorting the protein expression levels corresponding to the plurality of cells to be tested, and determining a preset number of the top-ranked protein expression levels in the sorted result as the target expression levels;

Determining the grayscale image of the cell to be tested corresponding to each target expression level, and determining the cell to be tested corresponding to that grayscale image as a target cell.
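The sort-and-select step above can be sketched as below. The cell identifiers, the predicted values and the function name `select_top_cells` are hypothetical; the application does not specify how cells are identified or what the preset number is.

```python
from typing import Dict, List


def select_top_cells(predicted: Dict[str, float], top_n: int) -> List[str]:
    """Return the IDs of the top_n cells by predicted expression, highest first."""
    if top_n < 0:
        raise ValueError("top_n must be non-negative")
    ranked = sorted(predicted, key=lambda cell_id: predicted[cell_id], reverse=True)
    return ranked[:top_n]


# Hypothetical predicted expression levels keyed by cell (image) identifier.
predicted = {"cell_a": 0.12, "cell_b": 0.87, "cell_c": 0.45, "cell_d": 0.91}
targets = select_top_cells(predicted, top_n=2)
```

Keying the predictions by image identifier keeps the link from each target expression level back to its grayscale image, and from there to the physical cell.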

A cell screening device based on a convolutional neural network, the device comprising:

The grayscale image acquisition module of the cells to be tested is used to acquire the grayscale images of the cells to be tested corresponding to a plurality of cells to be tested in the cell culture pool;

The first input module is used to input the plurality of grayscale images of the cells to be tested corresponding to the plurality of cells to be tested into the target convolutional neural network model; the target convolutional neural network model is obtained by training on a plurality of training cell grayscale images with expression labels, the expression label represents the real protein expression level of the cell in each training cell grayscale image, and the target convolutional neural network model is used to detect the protein expression level of the cells in the grayscale images of the cells to be tested that are input to the model;

A protein expression level prediction module, configured to obtain the respective protein expression levels corresponding to the plurality of cells to be tested according to the output of the target convolutional neural network model;

The cell screening module is used to determine, from the plurality of cells to be tested, the target cells whose protein expression meets the set condition according to the protein expression levels corresponding to the plurality of cells to be tested.

A computer device, comprising a memory and a processor, wherein the memory stores a computer program, and when the processor executes the computer program, the steps of the above-mentioned cell screening method based on a convolutional neural network are implemented.

A computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps of the above-mentioned cell screening method based on a convolutional neural network are realized.

With the above-mentioned cell screening method, device, computer equipment and storage medium based on a convolutional neural network, grayscale images of the cells to be tested corresponding to the plurality of cells to be tested in the cell culture pool are acquired, the plurality of grayscale images are input into the target convolutional neural network model, the protein expression levels corresponding to the plurality of cells to be tested are obtained according to the output of the target convolutional neural network model, and the target cells whose protein expression levels meet the set conditions are then determined from the plurality of cells to be tested according to those protein expression levels. Cells with high protein expression levels can thus be determined quickly, without repeated culturing and screening before cell screening can be performed. In addition, the application can quickly process millions of single cells, which widens the range of cell screening, reduces the workload of staff and effectively improves the efficiency of cell screening.

DESCRIPTION OF DRAWINGS

FIG. 1 is a schematic flowchart of a cell screening method based on a convolutional neural network in one embodiment;

FIG. 2 is a schematic flowchart of steps of model training in one embodiment;

FIG. 3a is a grayscale image of a training cell in one embodiment;

FIG. 3b is a fluorescence image of a training cell in one embodiment;

FIG. 4 is a schematic flowchart of steps of another model training in one embodiment;

FIG. 5 is a schematic flowchart of a step of model parameter adjustment in one embodiment;

FIG. 6 is a schematic flowchart of a step of predicting protein expression in one embodiment;

FIG. 7 is a structural block diagram of a cell screening device based on a convolutional neural network in one embodiment;

FIG. 8 is a diagram of the internal structure of a computer device in one embodiment.

DETAILED DESCRIPTION

In order to make the purpose, technical solutions and advantages of the present application more clearly understood, the present application will be described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present application, but not to limit the present application.

In order to facilitate the understanding of the embodiments of the present invention, the prior art is first described.

In the prior art, obtaining single cells by the limiting dilution method is cumbersome and requires repeated culturing and screening. In addition, because transfection efficiency is limited, the proportion of cells with high expression of the target protein is low, so screening is inefficient. The screening cycle is long: traditional methods often take 6 months or more. They consume substantial human and material resources and struggle to meet the needs of scale-up and industrialization.

In one embodiment, as shown in FIG. 1, a cell screening method based on a convolutional neural network is provided. In this embodiment, the method is described as applied to a terminal. It can be understood that the method can also be applied to a server, or to a system including a terminal and a server, in which case the method is implemented through interaction between the terminal and the server. In this embodiment, the method includes the following steps:

Step 101: Obtain grayscale images of the cells to be tested corresponding to a plurality of cells to be tested in the cell culture pool.

As an example, the cells to be tested may be cells treated with a transfection technique: cells that failed to obtain exogenous DNA fragments after transfection, cells that obtained exogenous DNA fragments without integrating them into a chromosome, or cells in which the exogenous DNA fragments have been integrated into a chromosome. The grayscale image of the cells to be tested is a grayscale image captured of a cell to be tested.

In practical applications, multiple cells in the cell culture pool can be transfected, so that some or all of the cells in the cell culture pool obtain exogenous DNA fragments. After transfection, grayscale images of the cells to be tested corresponding to the plurality of cells to be tested in the cell culture pool can be obtained.

Step 102: Input the plurality of grayscale images of the cells to be tested corresponding to the plurality of cells to be tested into a target convolutional neural network model; the target convolutional neural network model is obtained by training on a plurality of training cell grayscale images with expression labels, the expression label represents the real protein expression level of the cell in each training cell grayscale image, and the target convolutional neural network model is used to detect the protein expression level of the cells in the grayscale images of the cells to be tested that are input to it.

As an example, the grayscale image of the training cells is a picture used as a training sample, which is used to train a convolutional neural network model, and the cells in the grayscale image of the training cells are cells that have been transfected.

In practical applications, the convolutional neural network model can be trained with multiple training cell grayscale images carrying expression labels to obtain the target convolutional neural network model. The expression label corresponding to a training cell grayscale image represents the real protein expression level of the cell in that image. Since the main expression product of a gene is protein, the expression of the target gene in the cell in the training cell grayscale image can be determined from the actual protein expression level.

After the grayscale images of the cells to be tested are obtained, they can be input into the trained target convolutional neural network model, which detects the protein expression level of the cells in those images.

Step 103: According to the output of the target convolutional neural network model, obtain the respective protein expression levels corresponding to the plurality of cells to be tested.

As an example, the protein expression level corresponding to the cells to be tested is the protein expression level of the cells in the grayscale image of the cells to be tested, and the protein expression level is predicted by the target convolutional neural network model.

After inputting the grayscale images of the cells to be tested into the target convolutional neural network model, the protein expression levels corresponding to the cells to be tested can be obtained according to the output of the target convolutional neural network model.

Step 104: According to the protein expression levels corresponding to the plurality of cells to be tested, determine, from the plurality of cells to be tested, target cells whose protein expression levels meet the set condition.

After the protein expression levels corresponding to the plurality of cells to be tested are obtained, the plurality of cells to be tested can be screened, and the target cells whose protein expression levels meet the set conditions are determined from the plurality of cells to be tested.

In this embodiment, grayscale images of the cells to be tested corresponding to the plurality of cells to be tested in the cell culture pool are acquired and input into the target convolutional neural network model; the protein expression levels corresponding to the plurality of cells to be tested are obtained according to the output of the model; and the target cells whose protein expression levels meet the set conditions are determined from the plurality of cells to be tested according to those protein expression levels. Cells with high protein expression can thus be determined quickly, without repeated culturing and screening before cell screening, which greatly shortens the screening cycle. The application can also quickly process millions of single cells, which widens the range of cell screening, reduces the workload of staff and effectively improves the efficiency of cell screening.

In one embodiment, as shown in FIG. 2, the cell screening method based on a convolutional neural network may further include the following steps:

Step 201: Acquire a training cell grayscale image and its corresponding training cell fluorescence image.

In a specific implementation, a set of cells can be designated as the training set, and each cell can be photographed to obtain a training cell grayscale image and a corresponding training cell fluorescence image. The cells in the training set are used to train the convolutional neural network model, and a training cell grayscale image and its corresponding training cell fluorescence image can be a grayscale image and a fluorescence image of the same cell captured under the same shooting conditions.

The cells in the training set can be cells in the cell culture pool that have been treated with a transfection technique: cells that failed to obtain exogenous DNA fragments after transfection, cells that obtained exogenous DNA fragments without integrating them into a chromosome, or cells in which the exogenous DNA fragments have been integrated into a chromosome.

For the same batch of transfected cells in the cell culture pool, a grayscale image and a fluorescence image can be captured simultaneously with a microscope under the same shooting conditions. Each image can contain one or more cells, and the coordinates of each cell in the grayscale image correspond to the coordinates of that cell in the fluorescence image. Because a single grayscale image and fluorescence image can contain multiple cells, the two images can be segmented after acquisition to obtain the training cell grayscale image and training cell fluorescence image corresponding to each single cell, as shown in Fig. 3a and Fig. 3b.
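Because cell coordinates coincide in the two images, the same bounding boxes can be applied to both to produce paired single-cell crops. The sketch below assumes the boxes are already known; how cells are located is not specified in this application, and the `Box` layout and function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (top, left, height, width) in pixels


def crop(image: Sequence[Sequence], box: Box) -> List[list]:
    """Cut one rectangular region out of a row-major 2D image."""
    top, left, height, width = box
    return [list(row[left:left + width]) for row in image[top:top + height]]


def paired_single_cell_crops(gray, fluorescence, boxes: Sequence[Box]):
    """Apply the same boxes to both images so each pair shows the same cell."""
    if len(gray) != len(fluorescence) or len(gray[0]) != len(fluorescence[0]):
        raise ValueError("grayscale and fluorescence images must have the same size")
    return [(crop(gray, box), crop(fluorescence, box)) for box in boxes]


# Toy 4x4 field of view: gray value = row * 4 + col. The fluorescence image
# stores the same number in its green channel so the pairing is easy to check.
gray = [[r * 4 + c for c in range(4)] for r in range(4)]
fluorescence = [[(0, r * 4 + c, 0) for c in range(4)] for r in range(4)]
boxes = [(0, 0, 2, 2), (2, 2, 2, 2)]  # hypothetical cell locations
pairs = paired_single_cell_crops(gray, fluorescence, boxes)
```

Each element of `pairs` holds a grayscale crop and the fluorescence crop of the same region, which is the pairing that the labeling step relies on.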

Step 202: According to the training cell fluorescence map, determine the actual protein expression level of the cells in the corresponding training cell grayscale map, and obtain an expression level label corresponding to the training cell grayscale map based on the actual protein expression level.

After acquiring the training cell fluorescence map, the actual protein expression level of the cells in the corresponding training cell grayscale map can be determined according to the training cell fluorescence map, and the expression level label corresponding to the training cell grayscale map can be obtained based on the real protein expression level.

Step 203: Use the training cell grayscale image and the expression label to train a convolutional neural network model to generate a target convolutional neural network model.

After the expression label is obtained, the training cell grayscale image and the corresponding expression label can be used to train the convolutional neural network model to generate the target convolutional neural network model.

In this embodiment, training the convolutional neural network model with the training cell grayscale images and the expression labels to generate the target convolutional neural network model establishes the relationship between a cell grayscale image and the protein expression level of the cell in that image, which provides model support for the rapid screening of cells with high protein expression.

In one embodiment, determining, according to the training cell fluorescence image, the actual protein expression level of the cells in the corresponding training cell grayscale image, and obtaining the expression label corresponding to the training cell grayscale image based on the actual protein expression level, may include the following steps:

Determine the value of the green channel in the training cell fluorescence image; according to that value, determine the real protein expression level of the cell in the corresponding training cell grayscale image by accumulating the green channel values over the cell area in the fluorescence image; and determine the real protein expression level as the expression label corresponding to the training cell grayscale image.

In practical applications, proteins produced by genes of interest (for example, exogenous DNA fragments) can fluoresce at specific wavelengths. After the training cell fluorescence image is obtained, the value corresponding to its green channel (also called the fluorescence value) can be determined, the real protein expression level of the cell in the corresponding training cell grayscale image can be determined from that value, and the real protein expression level can then be used as the expression label corresponding to the training cell grayscale image. The fluorescence value and the protein expression level can be positively correlated; once the quantitative mapping between the two is obtained, the real protein expression level can be determined from the green brightness value.
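Accumulating the green channel over the cell area can be sketched as follows. The pixel layout (red, green, blue), the optional boolean mask marking the cell area, and the function name are assumptions of this sketch; the quantitative mapping from fluorescence value to protein amount is not given in this application and is therefore left out.

```python
from typing import Optional, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue)


def green_channel_label(
    fluorescence: Sequence[Sequence[Pixel]],
    cell_mask: Optional[Sequence[Sequence[bool]]] = None,
) -> int:
    """Accumulate the green-channel values over the cell area.

    cell_mask marks which pixels belong to the cell; if omitted, the whole
    crop is treated as the cell area (an assumption of this sketch).
    """
    total = 0
    for r, row in enumerate(fluorescence):
        for c, (_, green, _) in enumerate(row):
            if cell_mask is None or cell_mask[r][c]:
                total += green
    return total


# Hypothetical 2x2 fluorescence crop of one training cell.
fluorescence_crop = [
    [(0, 10, 0), (0, 20, 0)],
    [(0, 0, 0), (5, 30, 2)],
]
mask = [
    [True, False],
    [False, True],
]
label_masked = green_channel_label(fluorescence_crop, mask)
label_full = green_channel_label(fluorescence_crop)
```

The accumulated value would then be stored as the expression label of the matching grayscale crop, optionally after applying whatever fluorescence-to-expression mapping has been calibrated.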

In this embodiment, the actual protein expression level of the cells in the corresponding training cell grayscale image is determined from the value of the green channel in the training cell fluorescence image and is used as the expression label for that grayscale image. The green channel value thus serves as an intermediate variable that quantifies the protein expression of the cell in the training cell grayscale image, yielding the expression label and providing accurate real protein expression data.

In one embodiment, as shown in FIG. 4, training a convolutional neural network model with the training cell grayscale image and the expression label to generate a target convolutional neural network model may include the following steps:

Step 401: Input the grayscale image of the training cells into the convolutional neural network model, and determine the protein expression level corresponding to the grayscale image of the training cells.

In practical applications, when the convolutional neural network model is trained, the training cell grayscale image can be input into the model, and the protein expression level corresponding to the training cell grayscale image can be determined from the model's output. The convolutional neural network model predicts the protein expression level of the cell in the training cell grayscale image: it takes the grayscale image as input and outputs the predicted protein expression level.

Step 402: Determine the training error according to the protein expression level corresponding to the grayscale image of the training cells and the expression level label.

The protein expression level corresponding to the training cell grayscale image is the level predicted by the convolutional neural network model, and at the beginning of training there is a gap between the predicted protein expression level and the real protein expression level. The training error between the two can therefore be determined from the predicted protein expression level and the expression label. In practical applications, a cost function can be used to calculate this training error.

Step 403: According to the training error, adjust the network parameters of the convolutional neural network model in the direction that reduces the error to obtain optimal network parameters, and use the optimal network parameters to generate the target convolutional neural network model.

After the training error is determined, the network parameters of the convolutional neural network model can be adjusted according to the training error, with the aim of reducing the error, until the optimal network parameters are obtained. After the optimal network parameters are determined, the target convolutional neural network model can be generated based on the optimal network parameters.

In this embodiment, the training error is determined from the predicted protein expression level corresponding to the training cell grayscale image and the expression label, the network parameters of the convolutional neural network model are adjusted according to the training error to obtain the optimal network parameters, and the target convolutional neural network model is generated with the optimal network parameters. The model is thereby trained and optimized continuously using the gap between the predicted and actual protein expression levels.

In one embodiment, the convolutional neural network model includes a multi-layer structure, and adjusting the network parameters of the convolutional neural network model according to the training error, in the direction that reduces the error, to obtain optimal network parameters may include the following steps:

Determine whether the training error has converged and is smaller than the preset error threshold; if so, determine the network parameters of the current convolutional neural network model as the optimal network parameters; if not, back-propagate the training error from the last layer of the convolutional neural network model, adjust the network parameters of each layer in the direction that reduces the error, and return to the step of inputting the training cell grayscale image into the convolutional neural network model and determining the protein expression level corresponding to the training cell grayscale image.

In practical applications, the convolutional neural network model can include a multi-layer structure, for example max pooling layers, average pooling layers and multiple convolutional layers. When the convolutional neural network model is trained, it is determined whether the training error has converged and is smaller than the preset error threshold.

If so, the protein expression level predicted for the training cell grayscale image is considered close to the real protein expression level, and the network parameters of the current convolutional neural network model can be determined as the optimal network parameters. If not, the training error is back-propagated from the last layer of the convolutional neural network model, and the network parameters of each layer are adjusted in the direction that reduces the error. After the adjustment, the process returns to the step of inputting the training cell grayscale image into the convolutional neural network model and determining the protein expression level corresponding to the training cell grayscale image.

In order to enable those skilled in the art to better understand the above steps, an example is used below to illustrate the embodiment of the present application, but it should be understood that the embodiment of the present application is not limited thereto.

As shown in Figure 5, the training cell grayscale image and the training cell fluorescence image can be obtained. After the network parameters of each layer in the convolutional neural network model are initialized, the training cell grayscale image can be input into the convolutional neural network model, and the output value (that is, the protein expression level corresponding to the training cell grayscale image in this application) is obtained through forward propagation. The training error between the output value and the real value (that is, the expression label in this application) is then calculated.

After the training error is obtained, it can be judged whether the training error has converged and is small enough. If not, the training error can be back-propagated, the SGD (Stochastic Gradient Descent) algorithm or another optimization algorithm can be used to update the connection weights and biases of each layer (that is, the network parameters in this application), and forward propagation is performed again to obtain a new output value. If so, the current network parameters can be determined as the optimal network parameters, and the target convolutional neural network model is generated based on the optimal network parameters.
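The loop described above can be sketched as follows. This is only an illustration under stated assumptions: a two-parameter linear model stands in for the convolutional neural network, full-batch gradient descent stands in for SGD, and the names train_until_converged, error_threshold and convergence_tol are hypothetical and do not appear in this application.

```python
def train_until_converged(samples, labels, lr=0.05, error_threshold=1e-4,
                          convergence_tol=1e-8, max_iters=10_000):
    """Fit y = w*x + b; stop once the error has converged AND is below the threshold."""
    w, b = 0.0, 0.0                      # initialize the "network parameters"
    prev_error = float("inf")
    error = float("inf")
    n = len(samples)
    for step in range(max_iters):
        # Forward propagation: compute the output values.
        preds = [w * x + b for x in samples]
        # Training error between outputs and labels (mean squared error, an assumption).
        error = sum((p - y) ** 2 for p, y in zip(preds, labels)) / n
        converged = abs(prev_error - error) < convergence_tol
        if converged and error < error_threshold:
            return w, b, error, step     # current parameters are taken as optimal
        # Otherwise propagate the error back and move parameters in the error-reducing direction.
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, labels, samples)) / n
        grad_b = sum(2 * (p - y) for p, y in zip(preds, labels)) / n
        w -= lr * grad_w
        b -= lr * grad_b
        prev_error = error
    return w, b, error, max_iters        # iteration budget exhausted


if __name__ == "__main__":
    w, b, err, steps = train_until_converged([0, 1, 2, 3], [1, 3, 5, 7])
    print(w, b, err, steps)
```

Both conditions are checked together, so training continues when the error is small but still changing, and also when it has plateaued above the threshold.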

In practical applications, the transfected cells in the cell culture pool can be divided into a training set, a validation set and a test set. The cell grayscale images corresponding to the training cells in the training set can be used to train the convolutional neural network model. The cell grayscale images corresponding to the cells in the validation set can be used to verify the trained target convolutional neural network model, to prevent the model from overfitting on the training set, and the accuracy of the model during training can be determined on the validation set. The cells in the test set can be the cells to be tested in the present application: through the target convolutional neural network model, the target cells whose protein expression meets the set conditions can be determined from the plurality of cells to be tested.
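One possible way to perform such a division is sketched below. The split ratios, the random seed and the function name split_cells are assumptions made for illustration; this application does not specify how the three sets are sized.

```python
import random


def split_cells(cell_ids, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle cell identifiers and cut them into training, validation and test sets."""
    ids = list(cell_ids)
    random.Random(seed).shuffle(ids)     # seeded so the split is reproducible
    n_train = round(len(ids) * train_frac)
    n_val = round(len(ids) * val_frac)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]         # the remainder plays the role of the cells to be tested
    return train, val, test


if __name__ == "__main__":
    train, val, test = split_cells(range(100))
    print(len(train), len(val), len(test))
```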

In this embodiment, it is judged whether the training error has converged. If not, the training error can be back-propagated from the last layer of the convolutional neural network model, and the network parameters of each layer of the convolutional neural network model can be adjusted in the error-reducing direction. The network parameters can thus be continuously optimized through iterative calculation until the predicted protein expression level is close to the real protein expression level, which improves the accuracy with which the target convolutional neural network model predicts protein expression.

In one embodiment, the convolutional neural network model includes a first network structure, a second network structure, a third network structure, a fourth network structure and a fully connected layer, and the training cell grayscale image may be a plurality of training cell grayscale images.

As shown in FIG. 6 , inputting the training cell grayscale image into the convolutional neural network model, and determining the protein expression level corresponding to the training cell grayscale image, may include the following steps:

Step 601: Inputting multiple grayscale images of training cells into the convolutional neural network model.

In a specific implementation, multiple grayscale images of training cells can be input into the convolutional neural network model.

Step 602: For each training cell grayscale image, obtain the corresponding training cell feature through the first network structure; input the training cell feature into the second network structure, the third network structure and the fourth network structure to obtain the corresponding first cell feature, second cell feature and third cell feature.

For each training cell grayscale image, the training cell feature corresponding to that image can be obtained through the first network structure, and the training cell feature can be input into the second network structure, the third network structure and the fourth network structure to obtain the corresponding first cell feature, second cell feature and third cell feature.

Step 603: Connect the first cell feature, the second cell feature and the third cell feature in parallel to obtain a feature fusion result corresponding to each training cell grayscale image; wherein the first cell feature, the second cell feature and the third cell feature have different levels of abstract expression.

After obtaining the first cell feature, the second cell feature and the third cell feature, the first cell feature, the second cell feature and the third cell feature can be connected in parallel to obtain the feature fusion result corresponding to each training cell grayscale image , wherein the first cell feature, the second cell feature, and the third cell feature can have different levels of abstract expression.

In a specific implementation, the first network structure may be a feature extraction network composed of 10 convolutional layers; the second network structure may be composed of 11 convolutional layers and an average pooling layer; the third network structure may be composed of 2 convolutional layers and a max pooling layer; and the fourth network structure may be the third network structure followed by an added average pooling layer, that is, the fourth network structure may be composed of 2 convolutional layers, a max pooling layer and an average pooling layer.

In the convolutional neural network model, the shallow layers can extract simple features from the training cell grayscale image, such as cell shape, color, texture and cell edges, each of which reflects a specific characteristic of the cell in one dimension. The deep layers can abstract the features extracted by the shallow layers to obtain cell features that reflect the cell as a whole. Based on this, after the specific training cell features are extracted by the first network structure, they can be further input into the second network structure, the third network structure and the fourth network structure, so that cell features with different levels of abstract expression are obtained through networks of different depths.

In practical applications, the first cell feature, the second cell feature and the third cell feature can be output in the form of matrices. After the matrices corresponding to the first cell feature, the second cell feature and the third cell feature are obtained, each matrix can be multiplied by a different weight and the weighted matrices summed; the result is the feature fusion result. The weight of a matrix is positively correlated with the proportion contributed by the cell features extracted by the corresponding network structure, that is, the larger the weight, the higher the proportion of the cell features extracted by that network structure in the fusion result.

Step 604: Input the feature fusion results corresponding to the grayscale images of the training cells to the fully connected layer, and determine the protein expression levels corresponding to the grayscale images of the training cells according to the output results of the fully connected layer.

After the feature fusion results are determined, the feature fusion results corresponding to the plurality of training cell grayscale images can be input to the fully connected layer, and the protein expression levels corresponding to the plurality of training cell grayscale images can be determined according to the output of the fully connected layer.

For a plurality of training cell grayscale images, the input can be defined as B*3*448*448, where B is the number of training cell grayscale images input to the network in each training iteration of the convolutional neural network model, 3 is the number of image channels (R, G and B), and 448 is both the width and the height of each image.

After the plurality of training cell grayscale images are input into the first network structure and the training cell feature corresponding to each training cell grayscale image is obtained, each training cell feature can be input to the second network structure, the third network structure and the fourth network structure respectively. The second network structure, the third network structure and the fourth network structure can each output a matrix of size B*100, that is, the first cell feature, the second cell feature and the third cell feature. The matrix size can be adjusted during training, that is, the value 100 can be changed according to actual needs.

After the 3 matrices of size B*100 are obtained, each matrix can be multiplied by a different weight and the weighted matrices summed to obtain a single B*100 matrix, that is, the feature fusion result, where each weight can be a variable value in the interval from 0 to 1.
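A minimal sketch of this weighted fusion is given below, using nested Python lists in place of tensors. The function name fuse_features and the example weight values are assumptions; the application only states that the weights lie between 0 and 1.

```python
def fuse_features(first, second, third, weights=(0.5, 0.25, 0.25)):
    """Element-wise weighted sum of three B x D feature matrices, giving one B x D matrix."""
    if not all(0.0 <= w <= 1.0 for w in weights):
        raise ValueError("each weight is expected to lie in the interval [0, 1]")
    w1, w2, w3 = weights
    return [
        [w1 * a + w2 * b + w3 * c for a, b, c in zip(row1, row2, row3)]
        for row1, row2, row3 in zip(first, second, third)
    ]


if __name__ == "__main__":
    B, D = 2, 100
    first = [[1.0] * D for _ in range(B)]
    second = [[2.0] * D for _ in range(B)]
    third = [[3.0] * D for _ in range(B)]
    fused = fuse_features(first, second, third)
    print(len(fused), len(fused[0]), fused[0][0])
```

The output keeps the B*100 shape of each input, which is what allows it to feed a fully connected layer with 100 inputs.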

After the feature fusion result is obtained, it can be input to the fully connected layer. The number of inputs of the fully connected layer corresponds to the size of the matrix, and the number of outputs is 1. In this example, the feature fusion result passes through a fully connected layer with 100 inputs and 1 output, yielding an output vector of size B*1. Each component of the vector corresponds to one training cell grayscale image, and the value of the component is the protein expression level predicted for that image.
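The shape arithmetic of this last stage can be illustrated with a plain linear map. The function name fully_connected and the example weight and bias values are assumptions; in practice the weights would be learned during training.

```python
def fully_connected(fused, weight, bias=0.0):
    """Map each D-dimensional fused row to one scalar, i.e. a D -> 1 linear layer."""
    return [[sum(x * w for x, w in zip(row, weight)) + bias] for row in fused]


if __name__ == "__main__":
    fused = [[1.0] * 100, [2.0] * 100]       # B = 2 fused feature rows of size 100
    out = fully_connected(fused, weight=[0.5] * 100, bias=1.0)
    print(out)                               # one predicted expression level per image
```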

In one embodiment, according to the protein expression levels corresponding to the plurality of cells to be tested, determining from the plurality of cells to be tested the target cells whose protein expression meets the set condition may include the following steps:

Sort the protein expression levels corresponding to the plurality of cells to be tested, and, from the sorted protein expression levels, determine the top preset number of protein expression levels as the target expression levels; determine the grayscale image of the cell to be tested corresponding to each target expression level, and determine the cell to be tested corresponding to that grayscale image as a target cell.

In a specific implementation, after the protein expression levels corresponding to the plurality of cells to be tested are obtained, they can be sorted, and the top preset number of protein expression levels in the sorted order can be determined as the target expression levels.

Specifically, the protein expression levels corresponding to the plurality of cells to be tested can be sorted in descending order, that is, from largest to smallest. After sorting, the top N protein expression levels can be determined as the target expression levels. Of course, in practical applications, protein expression levels exceeding a preset expression level threshold can also be determined as target expression levels.

After the target expression level is determined, the grayscale image of the cell to be tested corresponding to the target expression level can be determined, and the cell to be tested corresponding to the grayscale image of the cell to be tested is determined as the target cell. The target cells can be used to culture cell lines.
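The selection described above can be sketched as follows, covering both the top-N variant and the threshold variant. The function name select_target_cells and the use of a dictionary keyed by cell identifier are assumptions made for illustration.

```python
def select_target_cells(expression_by_cell, top_n=None, threshold=None):
    """Rank cells by predicted expression (descending) and keep the top N and/or those above a threshold."""
    ranked = sorted(expression_by_cell.items(), key=lambda item: item[1], reverse=True)
    if top_n is not None:
        ranked = ranked[:top_n]
    if threshold is not None:
        ranked = [(cell, level) for cell, level in ranked if level > threshold]
    return [cell for cell, _ in ranked]


if __name__ == "__main__":
    predicted = {"a": 0.2, "b": 0.9, "c": 0.5, "d": 0.7}
    print(select_target_cells(predicted, top_n=2))
    print(select_target_cells(predicted, threshold=0.4))
```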

In this embodiment, the protein expression levels corresponding to the plurality of cells to be tested are sorted, and, according to the sorted protein expression levels, the cells to be tested whose protein expression levels rank within the top preset number are determined as target cells. Cells with high protein expression can thus be screened quickly, which greatly reduces the screening workload.

In one embodiment, the acquiring a grayscale image of the training cells may include the following steps:

Obtain the original cell grayscale image used for model training, and perform normalization processing on the original cell grayscale image; perform data enhancement processing on the processed original cell grayscale image to obtain the training cell grayscale image; the data enhancement processing includes any one or more of the following: rotation processing, flip processing, contrast enhancement processing, and random cropping processing.

In a specific implementation, the original cell grayscale image used for model training can be obtained and normalized, where the original cell grayscale image can be a grayscale image captured with a microscope of the cells used as the training set.

After normalization processing, data enhancement processing can be performed on the processed raw grayscale image, such as rotating, flipping, randomly cropping the image, or enhancing the contrast of the image.
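The four enhancement operations can be illustrated on a small image stored as a nested list. This is a sketch under assumptions: min-max scaling is used for normalization and a linear stretch around 0.5 is used for contrast enhancement, neither of which is specified in this application, and all function names are hypothetical.

```python
import random


def normalize(image):
    """Min-max scale pixel values into [0, 1] (one possible normalization)."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1                    # avoid division by zero on a constant image
    return [[(p - lo) / span for p in row] for row in image]


def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def enhance_contrast(image, factor=1.5):
    """Stretch normalized values away from 0.5 and clip back into [0, 1]."""
    return [[min(1.0, max(0.0, 0.5 + factor * (p - 0.5))) for p in row] for row in image]


def random_crop(image, size, rng):
    """Cut a size x size window at a random position."""
    top = rng.randrange(len(image) - size + 1)
    left = rng.randrange(len(image[0]) - size + 1)
    return [row[left:left + size] for row in image[top:top + size]]


def augment(image, crop_size, seed=0):
    """Normalize once, then return one variant per enhancement operation."""
    rng = random.Random(seed)
    base = normalize(image)
    return [rotate90(base), flip_horizontal(base),
            enhance_contrast(base), random_crop(base, crop_size, rng)]


if __name__ == "__main__":
    for variant in augment([[0, 50], [100, 200]], crop_size=1):
        print(variant)
```

Each original image yields several training images, which is how the enhancement step expands the training set.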

In this embodiment, the training cell grayscale images are obtained by performing data enhancement processing on the processed original cell grayscale images. This increases the number of training cell grayscale images available for training the convolutional neural network model, rapidly expanding the training samples and providing data support for the training of the convolutional neural network model.

It should be understood that although the steps in the flowcharts of FIGS. 1, 2 and 4-6 are displayed in sequence according to the arrows, these steps are not necessarily executed in the sequence indicated by the arrows. Unless explicitly stated herein, there is no strict restriction on the order in which the steps are performed, and the steps may be performed in other orders. Moreover, at least a part of the steps in FIGS. 1, 2 and 4-6 may include multiple steps or multiple stages. These steps or stages are not necessarily executed at the same time, but may be executed at different times; nor is their order of execution necessarily sequential, and they may be performed in turn or alternately with other steps, or with at least a portion of the steps or stages within other steps.

In one embodiment, as shown in FIG. 7, a cell screening device based on a convolutional neural network is provided, which may include:

The grayscale image acquisition module 701 of the cells to be tested is used to obtain grayscale images of the cells to be tested corresponding to a plurality of cells to be tested in the cell culture tank;

The first input module 702 is used to input the plurality of grayscale images of the cells to be tested, corresponding to the plurality of cells to be tested, into the target convolutional neural network model; the target convolutional neural network model is obtained by training on a plurality of training cell grayscale images with expression labels, the expression label is used to represent the real protein expression level of the cells in each training cell grayscale image, and the target convolutional neural network model is used to detect the protein expression level of the cells in the grayscale images of the cells to be tested that are input to it;

A protein expression level prediction module 703, configured to obtain the respective protein expression levels corresponding to the multiple cells to be tested according to the output of the target convolutional neural network model;

The cell screening module 704 is configured to determine, according to the protein expression levels corresponding to the plurality of cells to be tested, target cells whose protein expression meets the set condition from the plurality of cells to be tested.
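How the four modules could be chained in software is sketched below. This is an illustrative composition only: the class name CellScreeningDevice, its field names and the stub callables in the usage example are all hypothetical, and the application equally allows hardware or mixed implementations.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Image = List[List[float]]


@dataclass
class CellScreeningDevice:
    acquire_images: Callable[[], Dict[str, Image]]      # role of module 701
    model: Callable[[Sequence[Image]], List[float]]     # target model fed by module 702
    select: Callable[[Dict[str, float]], List[str]]     # role of module 704

    def run(self) -> List[str]:
        images = self.acquire_images()                   # 701: grayscale images per cell
        cell_ids = list(images)
        outputs = self.model([images[c] for c in cell_ids])   # 702: input images to the model
        expression = dict(zip(cell_ids, outputs))        # 703: expression level per cell
        return self.select(expression)                   # 704: cells meeting the set condition


if __name__ == "__main__":
    device = CellScreeningDevice(
        acquire_images=lambda: {"a": [[1.0]], "b": [[3.0]]},
        model=lambda imgs: [img[0][0] for img in imgs],  # stub standing in for the trained network
        select=lambda expr: [c for c, v in expr.items() if v > 2.0],
    )
    print(device.run())
```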

In one embodiment, the cell screening device based on convolutional neural network may further include:

The training cell grayscale image acquisition module is used to obtain the training cell grayscale image and its corresponding training cell fluorescence image;

The expression label determination module is used to determine the real protein expression level of the cells in the corresponding training cell grayscale image according to the training cell fluorescence image, and to obtain the expression label corresponding to the training cell grayscale image based on the real protein expression level;

A training module is used to train a convolutional neural network model by using the training cell grayscale map and the expression label to generate a target convolutional neural network model.

In one embodiment, the expression quantity label determination module includes:

The green brightness value determination submodule is used to determine the value of the green channel in the fluorescence image of the training cells;

The real protein expression determination submodule is used to determine the real protein expression of the cells in the corresponding training cell grayscale image according to the value of the green channel in the training cell fluorescence image;

The expression level label generation sub-module is used for determining the real protein expression level as the expression level label corresponding to the grayscale image of the training cells.
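The three submodules above can be sketched as a single function that reads the green channel of a fluorescence image and reduces it to one label. Averaging over all pixels is an assumption; this application states only that the label is determined from the green channel value. The function name expression_label is hypothetical.

```python
def expression_label(fluorescence_rgb):
    """Return the mean green-channel intensity of an H x W image of (R, G, B) tuples."""
    greens = [pixel[1] for row in fluorescence_rgb for pixel in row]   # index 1 is the G channel
    return sum(greens) / len(greens)


if __name__ == "__main__":
    fluorescence = [
        [(0, 10, 0), (0, 30, 0)],
        [(5, 20, 1), (0, 40, 0)],
    ]
    print(expression_label(fluorescence))
```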

In one embodiment, the training module includes:

A protein expression level determination submodule, configured to input the grayscale image of the training cells into the convolutional neural network model, and determine the protein expression level corresponding to the grayscale image of the training cells;

a training error determination submodule, used for determining the training error according to the protein expression corresponding to the grayscale image of the training cells and the expression label;

A parameter adjustment sub-module, configured to adjust the network parameters of the convolutional neural network model in the error-reducing direction according to the training error to obtain optimal network parameters, and to use the optimal network parameters to generate a target convolutional neural network model.

In one embodiment, the convolutional neural network model includes a multi-layer structure, and the parameter adjustment sub-module includes:

a judgment unit for judging whether the training error has converged and is less than a preset error threshold; if so, call the parameter determination unit; if not, call the backpropagation unit;

a parameter determination unit, used for determining the network parameters of the current convolutional neural network model as the optimal network parameters;

A back-propagation unit, configured to back-propagate the training error from the last layer of the convolutional neural network model, adjust the network parameters of each layer of the convolutional neural network model in the error-reducing direction, and return to the step of inputting the training cell grayscale image into the convolutional neural network model and determining the protein expression level corresponding to the training cell grayscale image.

In one embodiment, the convolutional neural network model includes a first network structure, a second network structure, a third network structure, a fourth network structure and a fully connected layer; the training cell grayscale image is a plurality of training cell grayscale images;

The protein expression level determination submodule includes:

The second input unit is used for inputting a plurality of training cell grayscale images into the convolutional neural network model;

The training cell feature acquisition unit is used to obtain, for each training cell grayscale image, the corresponding training cell feature through the first network structure, and to input the training cell feature into the second network structure, the third network structure and the fourth network structure to obtain the corresponding first cell feature, second cell feature and third cell feature;

A feature fusion result acquisition unit, configured to connect the first cell feature, the second cell feature and the third cell feature in parallel to obtain a feature fusion result corresponding to each training cell grayscale image; wherein the first cell feature, the second cell feature and the third cell feature have different levels of abstract expression;

The result output unit is used to input the feature fusion results corresponding to the plurality of training cell grayscale images into the fully connected layer, and to determine, according to the output of the fully connected layer, the protein expression levels corresponding to the plurality of training cell grayscale images.

In one embodiment, the cell screening module 704 includes:

The sorting submodule is used to sort the protein expression levels corresponding to the plurality of cells to be tested and, from the sorted protein expression levels, to determine the top preset number of protein expression levels as the target expression levels;

The target cell determination submodule is used to determine the grayscale image of the cell to be tested corresponding to the target expression level, and to determine the cell to be tested corresponding to the grayscale image of the cell to be tested as the target cell.

In one embodiment, the training cell grayscale image acquisition module includes:

The original cell grayscale image acquisition sub-module is used to obtain the original cell grayscale image used for model training, and normalize the original cell grayscale image;

The data enhancement processing submodule is used to perform data enhancement processing on the processed original cell grayscale image to obtain the training cell grayscale image; the data enhancement processing includes any one or more of the following: rotation processing, flip processing, contrast enhancement processing, and random cropping processing.

For specific limitations of the cell screening device based on a convolutional neural network, reference may be made to the limitations of the cell screening method based on a convolutional neural network above, which are not repeated here. Each module in the above cell screening device based on a convolutional neural network can be implemented in whole or in part by software, hardware, or a combination thereof. The above modules can be embedded in, or be independent of, the processor of the computer device in the form of hardware, or be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.

In one embodiment, a computer device is provided; the computer device may be a terminal, and its internal structure may be as shown in FIG. 8. The computer device includes a processor, a memory, a communication interface, a display screen and an input device connected by a system bus. The processor of the computer device is used to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The communication interface of the computer device is used for wired or wireless communication with an external terminal, and the wireless communication can be realized by WIFI, an operator network, NFC (Near Field Communication) or other technologies. The computer program, when executed by the processor, implements a cell screening method based on a convolutional neural network. The display screen of the computer device may be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer device may be a touch layer covering the display screen, a button, trackball or touchpad provided on the housing of the computer device, or an external keyboard, trackpad or mouse.

Those skilled in the art can understand that the structure shown in FIG. 8 is only a block diagram of the part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.

In one embodiment, a computer device is provided, including a memory and a processor, a computer program is stored in the memory, and the processor implements the following steps when executing the computer program:

In one embodiment, when the processor executes the computer program, it also implements the steps in the other embodiments described above.

In one embodiment, a computer-readable storage medium is provided on which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented:

In one embodiment, the computer program, when executed by the processor, also implements the steps in the other embodiments described above.

Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed it may include the processes of the above method embodiments. Any reference to memory, storage, database or other media used in the various embodiments provided in this application may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (Read-Only Memory, ROM), magnetic tape, floppy disk, flash memory, optical memory, and the like. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM may take various forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).

The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as a combination of these technical features involves no contradiction, it should be considered to be within the scope described in this specification.

The above-mentioned embodiments only represent several embodiments of the present application, and the descriptions thereof are specific and detailed, but should not be construed as a limitation on the scope of the invention patent. It should be pointed out that for those skilled in the art, without departing from the concept of the present application, several modifications and improvements can be made, which all belong to the protection scope of the present application. Therefore, the scope of protection of the patent of the present application shall be subject to the appended claims.

Claims

A cell screening method based on convolutional neural network, characterized in that the method comprises:

Obtain the grayscale images of the cells to be tested corresponding to the plurality of cells to be tested in the cell culture pool;

Input the plurality of grayscale images of cells to be tested, corresponding to the plurality of cells to be tested, into the target convolutional neural network model; the target convolutional neural network model is obtained by training on a plurality of training cell grayscale images with expression labels, the expression label is used to characterize the real protein expression level of the cells in each training cell grayscale image, and the target convolutional neural network model is used to detect the protein expression level of the cells in the grayscale images of the cells to be tested that are input to the model;

According to the output of the target convolutional neural network model, obtain the corresponding protein expression levels of the plurality of cells to be tested;

According to the protein expression levels corresponding to the plurality of cells to be tested, target cells whose protein expression levels meet the set conditions are determined from the plurality of cells to be tested.
The method of claim 1, further comprising:

Obtain the training cell grayscale image and its corresponding training cell fluorescence image;

According to the training cell fluorescence map, the actual protein expression level of the cells in the corresponding training cell grayscale map is determined, and the expression level label corresponding to the training cell grayscale map is obtained based on the actual protein expression level;

The convolutional neural network model is trained by using the training cell grayscale image and the expression label to generate a target convolutional neural network model.
The method according to claim 2, wherein determining, according to the training cell fluorescence image, the real protein expression level of the cells in the corresponding training cell grayscale image, and obtaining the expression label corresponding to the training cell grayscale image based on the real protein expression level, comprises:

determining the value of the green channel in the training cell fluorescence image;

According to the value of the green channel in the fluorescence image of the training cells, determine the actual protein expression of the cells in the corresponding gray-scale image of the training cells;

The real protein expression level is determined as the expression level label corresponding to the grayscale image of the training cells.
The method according to claim 2, wherein the training a convolutional neural network model by using the training cell grayscale image and the expression label to generate a target convolutional neural network model, comprising:

Inputting the training cell grayscale image into the convolutional neural network model, and determining the protein expression corresponding to the training cell grayscale image;

Determine the training error according to the protein expression corresponding to the grayscale image of the training cells and the expression label;

According to the training error, the network parameters of the convolutional neural network model are adjusted by reducing the error to obtain optimal network parameters, and the target convolutional neural network model is generated by using the optimal network parameters.
The method according to claim 4, wherein the convolutional neural network model comprises a multi-layer structure, and adjusting the network parameters of the convolutional neural network model in the error-reducing direction according to the training error, to obtain optimal network parameters, comprises:

judging whether the training error is converged and less than a preset error threshold;

If so, determine the network parameters of the current convolutional neural network model as the optimal network parameters;

If not, back-propagate the training error from the last layer of the convolutional neural network model, adjust the network parameters of each layer of the convolutional neural network model in the error-reducing direction, and return to the step of inputting the training cell grayscale image into the convolutional neural network model and determining the protein expression level corresponding to the training cell grayscale image.
The method according to claim 4, wherein the convolutional neural network model comprises a first network structure, a second network structure, a third network structure, a fourth network structure and a fully connected layer; the training cell grayscale image comprises multiple training cell grayscale images;

inputting the training cell grayscale image into the convolutional neural network model, and determining the protein expression level corresponding to the training cell grayscale image, comprises:

inputting the multiple training cell grayscale images into the convolutional neural network model;

for each training cell grayscale image, obtaining a corresponding training cell feature through the first network structure; inputting the training cell feature into the second network structure, the third network structure and the fourth network structure to obtain a corresponding first cell feature, second cell feature and third cell feature, respectively;

concatenating the first cell feature, the second cell feature and the third cell feature in parallel to obtain a feature fusion result corresponding to each training cell grayscale image, wherein the first cell feature, the second cell feature and the third cell feature have different levels of abstract expression;

inputting the feature fusion results corresponding to the multiple training cell grayscale images into the fully connected layer, and determining the protein expression levels corresponding to the multiple training cell grayscale images according to the output of the fully connected layer.
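The forward pass described in this claim can be illustrated with a small NumPy sketch. It is a toy under stated assumptions rather than the claimed architecture: the three branches are assumed to run in parallel on the shared feature, "different levels of abstract expression" is modelled simply as a different number of stacked convolution layers per branch, each branch is reduced to a single value by global average pooling, and all kernel sizes and weights are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv_relu(img, kernel):
    """Valid 2-D cross-correlation followed by ReLU."""
    windows = sliding_window_view(img, kernel.shape)  # (H-k+1, W-k+1, k, k)
    return np.maximum(np.einsum("ijkl,kl->ij", windows, kernel), 0.0)

def branch(feature, kernel, depth):
    """Apply `depth` conv+ReLU layers, then global-average-pool to one value."""
    for _ in range(depth):
        feature = conv_relu(feature, kernel)
    return float(feature.mean())

def predict_expression(images, stem_kernel, branch_kernel, fc_w, fc_b):
    """Shared stem -> three parallel branches -> concatenation -> dense layer."""
    fused = []
    for img in images:
        shared = conv_relu(img, stem_kernel)            # first network structure
        feats = [branch(shared, branch_kernel, depth)   # second to fourth structures
                 for depth in (1, 2, 3)]                # deeper = more abstract (assumed)
        fused.append(np.array(feats))                   # feature fusion by concatenation
    fused = np.stack(fused)                             # shape (N, 3)
    return fused @ fc_w + fc_b                          # fully connected layer

# Hypothetical inputs: four 16 x 16 grayscale images and averaging kernels.
rng = np.random.default_rng(0)
images = rng.random((4, 16, 16))
mean_kernel = np.full((3, 3), 1.0 / 9.0)
preds = predict_expression(images, mean_kernel, mean_kernel,
                           fc_w=np.ones(3), fc_b=0.0)
```

The output has one predicted expression level per input image; in a trained model the kernels and the dense-layer weights would be the learned network parameters.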
The method according to claim 1, wherein determining, from the plurality of cells to be tested and according to the protein expression levels corresponding to the plurality of cells to be tested, the target cells whose protein expression levels satisfy the preset condition comprises:

sorting the protein expression levels corresponding to the plurality of cells to be tested, and determining, from the sorted protein expression levels, the first preset number of protein expression levels as target expression levels;

determining the grayscale image of the cell to be tested corresponding to each target expression level, and determining the cell to be tested corresponding to that grayscale image as a target cell.
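The selection step can be sketched in a few lines. The claim says only that the expression levels are sorted and the first preset number is taken; the descending order used here is an assumption based on the stated goal of finding cells with high protein expression, and the function name and cell identifiers are hypothetical.

```python
def select_target_cells(expression_by_cell, top_n):
    """Return the IDs of the top_n cells ranked by predicted expression level.

    Assumption: ranking is descending, so the "first preset number" of entries
    are the highest-expressing cells.
    """
    if top_n < 0:
        raise ValueError("top_n must be non-negative")
    ranked = sorted(expression_by_cell.items(), key=lambda kv: kv[1], reverse=True)
    return [cell_id for cell_id, _ in ranked[:top_n]]

# Hypothetical predictions keyed by an identifier of each cell's grayscale image.
predicted = {"cell_a": 0.42, "cell_b": 0.91, "cell_c": 0.15, "cell_d": 0.77}
targets = select_target_cells(predicted, top_n=2)
```

Keying the predictions by image identifier preserves the mapping from target expression level back to grayscale image and then to the physical cell, which is the lookup the claim describes.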
A cell screening device based on convolutional neural network, characterized in that the device comprises:

a to-be-tested cell grayscale image acquisition module, configured to acquire grayscale images respectively corresponding to a plurality of cells to be tested in a cell culture pool;

a first input module, configured to input the multiple grayscale images corresponding to the plurality of cells to be tested into a target convolutional neural network model; wherein the target convolutional neural network model is obtained by training based on expression level labels and multiple training cell grayscale images, each expression level label represents the actual protein expression level of the cells in the corresponding training cell grayscale image, and the target convolutional neural network model is used to detect the protein expression level of the cells in a grayscale image input to the model;

A protein expression level prediction module, configured to obtain the respective protein expression levels corresponding to the plurality of cells to be tested according to the output of the target convolutional neural network model;

a cell screening module, configured to determine, from the plurality of cells to be tested and according to the protein expression levels corresponding to the plurality of cells to be tested, the target cells whose protein expression levels satisfy a preset condition.
A computer device, comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, when the processor executes the computer program, the steps of the convolutional neural network-based cell screening method according to any one of claims 1 to 7 are implemented.
A computer-readable storage medium on which a computer program is stored, characterized in that, when the computer program is executed by a processor, the steps of the convolutional neural network-based cell screening method according to any one of claims 1 to 7 are implemented.