WO2021189910A1

WO2021189910A1 - Image recognition method and apparatus, and electronic device and computer-readable storage medium

Info

Publication number: WO2021189910A1
Application number: PCT/CN2020/131990
Authority: WO
Inventors: 李楠楠; 叶苓; 刘新卉; 周云舒; 黄凌云
Original assignee: 平安科技（深圳）有限公司
Priority date: 2020-09-24
Filing date: 2020-11-27
Publication date: 2021-09-30
Also published as: CN111932564A; CN111932564B

Abstract

An image recognition method and apparatus, and an electronic device and a storage medium, which relate to the field of artificial intelligence, and can be applied to an application scenario of medical image recognition. The method comprises: performing global target region clipping and pixel normalization processing on an initial image set so as to obtain a first standard image set (S1); performing first model training using the first standard image set so as to obtain a first recognition model (S2); performing local target region clipping, data enhancement and pixel normalization processing on the initial image set so as to obtain a second standard image set (S3); performing second model training using the second standard image set so as to obtain a second recognition model (S4); and when an image to be subjected to recognition is received, performing recognition and result determination on said image using the first recognition model and the second recognition model so as to obtain a recognition result (S5). The image to be subjected to recognition can be stored in a blockchain node. By means of the method, the accuracy of image recognition can be improved.

Description

Picture recognition method, device, electronic equipment and computer readable storage medium

This application claims priority to Chinese patent application No. CN202011015349.4, filed with the Chinese Patent Office on September 24, 2020 and entitled "picture recognition method, device, electronic equipment, and computer-readable storage medium", the entire content of which is incorporated in this application by reference.

Technical field

This application relates to the field of artificial intelligence, and in particular to an image recognition method, device, electronic equipment, and computer-readable storage medium.

Background technique

With the development of artificial intelligence, image recognition models are used more and more widely to recognize images, not only in daily life but also in medical technology, for example to recognize a patient's chest CT images and assist the doctor in diagnosing tuberculosis.

However, the inventor realized that current picture recognition models recognize pictures globally and tend to ignore subtle local features, resulting in low picture recognition accuracy.

Summary of the invention

A picture recognition method provided by this application includes:

Acquiring an initial picture set, and performing global target area cropping conversion and pixel normalization processing on the initial picture set to obtain a first standard picture set;

Training the pre-built first deep learning model by using the first standard picture set to obtain the first recognition model;

Performing local target region cropping conversion, data enhancement and pixel normalization processing on the initial picture set to obtain a second standard picture set;

Training the pre-built second deep learning model by using the second standard picture set to obtain a second recognition model;

When the picture to be recognized is received, the first recognition model and the second recognition model are used to recognize and determine the result of the picture to be recognized to obtain a recognition result.

The present application also provides a picture recognition device, the device includes:

The global model generation module is used to obtain an initial picture set and perform global target area cropping conversion and pixel normalization processing on the initial picture set to obtain a first standard picture set, and to train the pre-built first deep learning model by using the first standard picture set to obtain the first recognition model;

The local model generation module is used to perform local target area cropping conversion, data enhancement, and pixel normalization processing on the initial picture set to obtain a second standard picture set, and to train the pre-built second deep learning model by using the second standard picture set to obtain the second recognition model;

The picture recognition module is configured to, when the picture to be recognized is received, use the first recognition model and the second recognition model to recognize and determine the result of the picture to be recognized to obtain a recognition result.

In order to solve the above-mentioned problems, the present application also provides an electronic device, which includes:

At least one processor; and,

A memory communicatively connected with the at least one processor; wherein,

The memory stores computer program instructions executable by the at least one processor, and the computer program instructions are executed by the at least one processor, so that the at least one processor can execute the following steps:

In order to solve the above-mentioned problems, the present application also provides a computer-readable storage medium in which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented:

Description of the drawings

FIG. 1 is a schematic flowchart of a picture recognition method provided by an embodiment of this application;

FIG. 2 is a detailed flowchart of one of the steps in the image recognition method provided in FIG. 1;

FIG. 3 is a schematic diagram of modules of a picture recognition device provided by an embodiment of the application;

FIG. 4 is a schematic diagram of the internal structure of an electronic device for implementing a picture recognition method provided by an embodiment of the application;

The realization, functional characteristics, and advantages of the purpose of this application will be further described in conjunction with the embodiments and with reference to the accompanying drawings.

Detailed description

It should be understood that the specific embodiments described here are only used to explain the present application, and are not used to limit the present application.

This application provides a picture recognition method. Referring to FIG. 1, it is a schematic flowchart of a picture recognition method provided by an embodiment of this application. The method can be executed by a device, and the device can be implemented by software and/or hardware.

In this embodiment, the picture recognition method includes:

S1. Obtain an initial picture set, and perform global target area cropping conversion and pixel normalization processing on the initial picture set to obtain a first standard picture set;

In the embodiment of the present application, the initial picture set may be a medical image picture set containing initial labels, such as a labeled set of patient chest X-ray films, wherein the initial labels are preset disease identification labels, such as tuberculosis and non-tuberculosis.

Further, in order to eliminate the interference of the background region and improve the training accuracy of the subsequent model, the embodiment of the present application crops the first region of interest (ROI) of each picture in the initial picture set to obtain the first global picture set. Preferably, in this embodiment of the present application, the first region of interest is the whole lung region.

Further, in order to facilitate subsequent unified processing of the model, the embodiment of the present application fills and interpolates each picture in the first global picture set to a preset size to obtain the second global picture set.

In detail, in the embodiment of the present application, filling and interpolating each picture in the first global picture set to a preset size includes: filling each picture in the first global picture set with blank pixels according to a preset rule to obtain a filled picture set; and interpolating each picture in the filled picture set to the preset size to obtain the second global picture set, wherein the pictures in the filled picture set have the same aspect ratio as the pictures in the second global picture set. For example, suppose picture A in the first global picture set is to be filled and interpolated to a preset size of 1024*1024, and the size of picture A is 256*240. Picture A is first filled with blank pixels to form picture B, the smallest square picture containing picture A, with a size of 256*256. Picture B is then interpolated to obtain picture C with a size of 1024*1024, which completes the filling and interpolation of picture A.
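The filling and interpolation of picture A can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of the application: the function names `pad_to_square` and `resize_nearest` are hypothetical, and the blank pixel value of 0, the centred placement of the original picture, and nearest-neighbour interpolation are assumed choices, since the text only refers to "a preset rule" and "picture interpolation".

```python
def pad_to_square(img, blank=0):
    """Pad a 2-D picture (list of rows) with blank pixels to the smallest enclosing square.

    Assumption: the original picture is centred; the source only says "preset rule".
    """
    h, w = len(img), len(img[0])
    side = max(h, w)
    top = (side - h) // 2
    left = (side - w) // 2
    out = [[blank] * side for _ in range(side)]
    for r in range(h):
        # Copy one source row into the padded canvas.
        out[top + r][left:left + w] = img[r]
    return out


def resize_nearest(img, size):
    """Resize a square picture to size x size by nearest-neighbour interpolation (assumed method)."""
    src = len(img)
    idx = [i * src // size for i in range(size)]
    return [[img[r][c] for c in idx] for r in idx]


# Picture A: 240 rows of 256 pixels, matching the 256*240 example.
picture_a = [[128] * 256 for _ in range(240)]
picture_b = pad_to_square(picture_a)          # 256 x 256
picture_c = resize_nearest(picture_b, 1024)   # 1024 x 1024
```

Padding before interpolation keeps the aspect ratio of the lung region unchanged, which is the property the paragraph above requires of pictures B and C.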

Further, in the embodiment of the present application, in order to speed up the training of the subsequent model, normalization processing is performed on each pixel value in each picture in the second global picture set to obtain the first standard picture set. The normalization of each original pixel value in each picture in the second global picture set can be calculated by the following formula:

P_g = P_x / 256

Wherein, P_x represents the original pixel value, and P_g represents the pixel value after normalization.
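As a small worked example of the formula above, the following sketch applies the division to every pixel of a picture. It assumes 8-bit pixel values in the range 0 to 255, which the text does not state explicitly; under that assumption the normalized values fall in the interval [0, 1). The function name `normalize_pixels` is hypothetical.

```python
def normalize_pixels(img, divisor=256):
    """Apply P_g = P_x / 256 to every pixel of a 2-D picture (list of rows)."""
    return [[p / divisor for p in row] for row in img]


sample = [[0, 64], [128, 255]]          # assumed 8-bit pixel values
normalized = normalize_pixels(sample)   # [[0.0, 0.25], [0.5, 0.99609375]]
```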

In summary, in the embodiment of the present application, performing global target area cropping conversion and pixel normalization processing on the initial picture set to obtain the first standard picture set includes: cropping the first region of interest of each picture in the initial picture set to obtain the first global picture set; filling and interpolating each picture in the first global picture set to a preset size to obtain the second global picture set; and normalizing each pixel value in each picture in the second global picture set to obtain the first standard picture set.

S2. Use the first standard picture set to train the pre-built first deep learning model to obtain the first recognition model;

In the embodiment of the present application, the first deep learning model may be a convolutional neural network model or a residual network model.

Further, using the first standard picture set to train the pre-built first deep learning model in the embodiment of the present application includes:

Step A: Perform a convolution pooling operation on the first standard picture set according to the preset number of convolution pooling times to obtain a feature set;

Step B: Use the preset activation function to calculate the predicted value of the feature set, obtain the label value of the initial label corresponding to each picture in the first standard picture set, and calculate the first loss value from the predicted value and the label value by using the pre-built first loss function;

In the embodiment of the present application, the label values correspond one-to-one to the initial labels. For example, if there are two initial labels, tuberculosis and non-tuberculosis, the label value corresponding to the tuberculosis label is 1, and the label value corresponding to the non-tuberculosis label is 0.

Step C: Compare the first loss value with the preset first loss threshold; when the first loss value is greater than or equal to the first loss threshold, return to Step A; when the first loss value is less than the first loss threshold, stop training to obtain the first recognition model.

In detail, in the embodiment of the present application, performing a convolution pooling operation on the first standard picture set to obtain the first feature set includes: performing a convolution operation on the first standard picture set to obtain the first convolution data set; and performing a maximum pooling operation on the first convolution data set to obtain the first feature set.
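The control flow of Steps A to C can be sketched as a loop that repeats a training round until the loss falls below the threshold. This is a minimal sketch under assumptions: `train_until_threshold` and `make_toy_step` are hypothetical names, the `max_rounds` safety cap is not mentioned in the text, and the toy step simply halves a stored loss instead of running a real convolution and pooling pass.

```python
def train_until_threshold(step_fn, loss_threshold, max_rounds=10_000):
    """Repeat training rounds until the loss drops below loss_threshold.

    step_fn is a hypothetical callable that performs one round (Steps A and B)
    and returns the loss value. max_rounds is an added safety cap.
    """
    for round_index in range(1, max_rounds + 1):
        loss = step_fn()
        if loss < loss_threshold:      # Step C: stop training
            return round_index, loss
        # Otherwise fall through and return to Step A.
    raise RuntimeError("loss did not fall below the threshold")


def make_toy_step(initial_loss=1.0, decay=0.5):
    """Stand-in for a real training round: the loss halves on every call."""
    state = {"loss": initial_loss}

    def step():
        state["loss"] *= decay
        return state["loss"]

    return step


rounds, final_loss = train_until_threshold(make_toy_step(), loss_threshold=0.1)
```

In a real training run `step_fn` would wrap the forward pass, loss computation, and parameter update of the first deep learning model.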

Further, the convolution operation is:

Where ω' represents the number of channels of the first convolution data set, ω represents the number of channels of the first standard picture set, k is the size of the preset convolution kernel, f is the step size of the preset convolution operation, and p is the preset data zero-filling matrix.
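The symbols listed above match the standard relation for the output size of a convolution, ω' = floor((ω − k + 2p) / f) + 1, with p read as the amount of zero padding on each side. Assuming that relation is the one intended (the text here does not confirm it), it can be checked numerically with a short sketch; the function name `conv_output_size` is hypothetical.

```python
def conv_output_size(omega, k, f, p):
    """Standard convolution output size: floor((omega - k + 2p) / f) + 1.

    Assumed relation; omega is the input size, k the kernel size,
    f the step size (stride), and p the zero padding per side.
    """
    return (omega - k + 2 * p) // f + 1


same_size = conv_output_size(1024, k=3, f=1, p=1)   # 1024
half_size = conv_output_size(1024, k=3, f=2, p=1)   # 512
```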

Further, the first activation function in the preferred embodiment of the present application includes:

Wherein, μ_t represents the predicted value, and s represents the data in the feature set.

In detail, the first loss function described in the preferred embodiment of the present application includes:

Wherein, L_ce represents the first loss value, N is the number of data in the first standard picture set, i is a positive integer, y_i is the label value, and p_i is the predicted value.
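With label values of 0 and 1 and a loss named L_ce, one common pairing is a sigmoid activation followed by binary cross-entropy. The sketch below assumes that pairing purely for illustration; the text here does not confirm which activation and loss formulas the application uses, and the names `sigmoid` and `binary_cross_entropy` as well as the `eps` clamp are additions of this sketch.

```python
import math


def sigmoid(s):
    """Map a raw feature score s to a predicted value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-s))


def binary_cross_entropy(labels, preds, eps=1e-12):
    """Mean binary cross-entropy over N samples; eps guards against log(0)."""
    total = 0.0
    for y, p in zip(labels, preds):
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(labels)


scores = [2.0, -1.0, 0.0]   # illustrative feature scores
labels = [1, 0, 1]          # 1 = tuberculosis, 0 = non-tuberculosis
preds = [sigmoid(s) for s in scores]
loss = binary_cross_entropy(labels, preds)
```

The loss is small when predictions agree with the labels and grows as they diverge, which is what makes it usable for the threshold comparison in Step C.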

S3. Perform local target region cropping conversion, data enhancement, and pixel normalization processing on the initial picture set to obtain a second standard picture set;

In the embodiment of the present application, the pictures in the first standard picture set are global pictures, so the first recognition model is a global recognition model. In actual application, however, a global recognition model usually ignores the subtle features of local positions, resulting in missed detections. Therefore, a local recognition model needs to be trained with a set of local position pictures as a supplement to the first recognition model. For example, the first recognition model is a global recognition model for recognizing the whole lung, but slight lesions in the upper lung (such as fibrosis or multiple small spots) are easily missed, so a local recognition model for the upper lung is trained with upper lung pictures as a supplement to the first recognition model. Accordingly, local target region cropping conversion, data enhancement, and pixel normalization processing are performed on the initial picture set to obtain a second standard picture set, wherein the pictures in the second standard picture set are local position pictures. For example, the pictures in the first standard picture set are pictures of the whole lung region, and the pictures in the second standard picture set are pictures of the upper lung region.

In detail, referring to FIG. 2, performing local target region cropping conversion, data enhancement, and pixel normalization processing on the initial picture set in the embodiment of the present application includes:

S31. Crop the second region of interest of each picture in the initial picture set to obtain an initial partial picture set;

Preferably, in this embodiment of the present application, the second region of interest is an upper lung region.

S32: Mark corresponding pictures in the initial partial picture set according to the initial label corresponding to each picture in the initial picture set to obtain a first partial picture set;

For example: the label of picture A in the initial picture set is tuberculosis, and the label position is the upper left lung. After the processing of S31, the upper left lung picture a and the upper right lung picture b are obtained from picture A. According to the label of picture A, picture a is marked with the tuberculosis label and picture b is marked with the non-tuberculosis label.
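The labeling rule in this example can be sketched as follows. The sketch assumes that each initial label comes with the set of regions where the lesion was marked; the function name `label_local_crops` and the region names `upper_left` and `upper_right` are hypothetical.

```python
def label_local_crops(picture_label, lesion_regions, crop_regions):
    """Give a local crop the tuberculosis label only if the lesion lies in that crop."""
    labels = {}
    for region in crop_regions:
        if picture_label == "tuberculosis" and region in lesion_regions:
            labels[region] = "tuberculosis"
        else:
            labels[region] = "non-tuberculosis"
    return labels


# Picture A: labeled tuberculosis, with the label position in the upper left lung.
crop_labels = label_local_crops(
    picture_label="tuberculosis",
    lesion_regions={"upper_left"},
    crop_regions=["upper_left", "upper_right"],
)
```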

In the embodiment of the present application, the following picture processing process only processes pictures in the first partial picture set, and does not affect the labels corresponding to the pictures.

S33: Filling and interpolating pictures in the first partial picture set to a preset size to obtain a second partial picture set;

S34. Perform normalization processing on each pixel value in each picture in the second partial picture set to obtain a third partial picture set;

S35: Rotate each picture in the third partial picture set by a preset angle, and mark the picture with the corresponding rotation angle to obtain the second standard picture set;

In the embodiment of the application, in order to improve the generalization ability of the subsequent model, a data processing method for self-supervised learning model training well known to those skilled in the art is used to rotate each picture in the third partial picture set and mark it with the corresponding angle label. For example, the pictures in the third partial picture set are randomly rotated by 0°, 90°, 180°, or 270° and marked with rotation angle labels to obtain the second standard picture set.

In detail, the pictures in the second standard picture set have double labels, which are the initial label and the rotation angle label. For example, the initial label of picture A is tuberculosis, and the rotation angle label is 90°.
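The rotation step and the resulting double label can be sketched as below. This is an illustration under assumptions: pictures are represented as lists of rows, rotation is taken to be counter-clockwise (the text does not state a direction), and the names `rotate` and `augment_with_rotation` are hypothetical.

```python
import random

ANGLES = (0, 90, 180, 270)


def rotate90_ccw(img):
    """Rotate a 2-D picture (list of rows) by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*img)][::-1]


def rotate(img, angle):
    """Rotate by a multiple of 90 degrees and return a new picture."""
    out = [list(row) for row in img]
    for _ in range(angle // 90):
        out = rotate90_ccw(out)
    return out


def augment_with_rotation(img, disease_label, rng=random):
    """Pick a random preset angle and attach both labels to the rotated picture."""
    angle = rng.choice(ANGLES)
    return rotate(img, angle), {"disease": disease_label, "rotation": angle}


picture = [[1, 2], [3, 4]]
rotated, labels = augment_with_rotation(picture, "tuberculosis", rng=random.Random(0))
```

Predicting the rotation angle gives the second model an auxiliary task whose labels are generated automatically by the rotation itself.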

S4. Use the second standard picture set to train the pre-built second deep learning model to obtain a second recognition model;

In the embodiment of the present application, the second deep learning model may be a convolutional neural network model or a residual network model.

In detail, in this embodiment of the present application, using the second standard picture set to train the pre-built second deep learning model includes:

Step I: Perform weight calculation according to the preset second loss function and the preset third loss function to obtain the target loss function;

In detail, in the embodiment of the present application, the pictures in the second standard picture set have double labels, namely the initial label and the rotation angle label. Therefore, two types of prediction results are generated during the model training process. To measure the two types of prediction results separately, two loss functions are required, namely the second loss function and the third loss function, wherein the second loss function is the loss function corresponding to the initial label, and the third loss function is the loss function corresponding to the rotation angle label.

Further, in order to better measure the training progress of the model, the weight calculation is performed according to the preset second loss function and the preset third loss function, and the weight calculation can be expressed by the following formula:

L = L_tb + α L_rot

Wherein, L is the target loss function, L_tb is the second loss function, L_rot is the third loss function, and α is a preset weight coefficient.

Preferably, the weight coefficient is 0.1.
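As a brief worked example of the weighting above, the following sketch combines two loss values with the preferred coefficient of 0.1. The function name `target_loss` and the numeric loss values are illustrative assumptions.

```python
def target_loss(loss_tb, loss_rot, alpha=0.1):
    """L = L_tb + alpha * L_rot, with the preferred weight alpha = 0.1."""
    return loss_tb + alpha * loss_rot


# Illustrative values: disease loss 0.40, rotation loss 1.20.
combined = target_loss(loss_tb=0.40, loss_rot=1.20)
```

With α = 0.1 the rotation task contributes only a small share of the total, so the disease label remains the dominant training signal.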

Step II: According to the target loss function, use the second standard picture set to train the second deep learning model; when the value of the target loss function is less than a second preset threshold, stop training to obtain the second recognition model.

S5. When the picture to be recognized is received, the first recognition model and the second recognition model are used to recognize and determine the result of the picture to be recognized to obtain a target recognition result.

The format of the picture to be recognized in the embodiment of this application is the same as that of the pictures in the initial picture set. Preferably, the picture to be recognized in the embodiment of this application is a medical imaging picture in medical technology, such as a patient's chest X-ray.

Further, in this embodiment of the application, the first recognition model and the second recognition model are each used to recognize the picture to be recognized: the first recognition model is used to recognize the picture to be recognized to obtain a first recognition result, and the second recognition model is used to recognize the picture to be recognized to obtain a second recognition result, wherein the second recognition result includes a disease recognition result and a picture rotation angle result. Preferably, the disease recognition result in the embodiment of the present application is the recognition result of tuberculosis.

Further, in the embodiment of the present application, a logical operation is performed on the first recognition result and the second recognition result to obtain the target recognition result, wherein the logical operation in the embodiment of the present application comprises the two logical operations OR and AND. For example: when the first recognition result is positive for pulmonary tuberculosis and the disease recognition result in the second recognition result is negative for pulmonary tuberculosis, or when the first recognition result is negative for pulmonary tuberculosis and the disease recognition result in the second recognition result is positive for pulmonary tuberculosis, the target recognition result is positive for pulmonary tuberculosis; when the first recognition result and the disease recognition result in the second recognition result are both negative for pulmonary tuberculosis, the target recognition result is negative for pulmonary tuberculosis; when the first recognition result and the disease recognition result in the second recognition result are both positive for pulmonary tuberculosis, the target recognition result is positive for pulmonary tuberculosis.
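Taken together, the three cases in the example amount to a logical OR of the two disease results: the target result is positive whenever either model reports positive. A minimal sketch of that reading, with the hypothetical function name `combine_results` and booleans standing in for positive and negative results:

```python
def combine_results(global_positive, local_positive):
    """Target result is positive if either model reports positive (logical OR).

    The rotation angle output of the second model is not used in this decision.
    """
    return global_positive or local_positive


truth_table = {
    (g, l): combine_results(g, l)
    for g in (False, True)
    for l in (False, True)
}
```

This rule favours sensitivity: a lesion missed by the global model can still be flagged by the local model, at the cost of accepting positives from either one.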

In another embodiment of the present application, in order to ensure the privacy of the data, the picture to be identified may be stored in the blockchain.

In the embodiment of the present application, global target area cropping conversion and pixel normalization processing are performed on the initial picture set to obtain a first standard picture set, and the first standard picture set is used to train the pre-built first deep learning model, which improves the training speed and accuracy of the model and realizes global recognition of the picture. Local target area cropping conversion, data enhancement, and pixel normalization processing are performed on the initial picture set to obtain a second standard picture set, and the second standard picture set is used to train the pre-built second deep learning model to obtain the second recognition model, which improves the training speed of the model, enhances the robustness of the model, and realizes local recognition of the picture. When the picture to be recognized is received, the first recognition model and the second recognition model are used to recognize and determine the result of the picture to be recognized to obtain a recognition result, so that global recognition and local recognition of the picture complement each other through the dual model and the accuracy of picture recognition is improved.

FIG. 3 is a functional block diagram of the picture recognition device of the present application.

The picture recognition apparatus 100 described in this application can be installed in an electronic device. According to the realized functions, the picture recognition device may include a global model generation module 101, a local model generation module 102, and a picture recognition module 103. The module described in the present invention can also be called a unit, which refers to a series of computer program segments that can be executed by the processor of an electronic device and can complete fixed functions, and are stored in the memory of the electronic device.

In this embodiment, the functions of each module/unit are as follows:

The global model generation module 101 is used to obtain an initial picture set and perform global target area cropping conversion and pixel normalization processing on the initial picture set to obtain a first standard picture set, and to train the pre-built first deep learning model by using the first standard picture set to obtain the first recognition model.

Further, in the embodiment of the present application, in order to eliminate the interference of the background area and improve the training accuracy of the subsequent model, the global model generation module 101 crops the first region of interest of each picture in the initial picture set to obtain the first global picture set. Preferably, in the embodiment of the present application, the first region of interest is the whole lung region.

Further, in order to facilitate subsequent unified model processing, the global model generation module 101 described in this embodiment of the present application fills and interpolates each picture in the first global picture set to a preset size to obtain a second global picture set.

In detail, the global model generation module 101 of the embodiment of the present application fills and interpolates each picture in the first global picture set to a preset size by the following means: filling each picture in the first global picture set with blank pixels according to a preset rule to obtain a filled picture set; and interpolating each picture in the filled picture set to the preset size to obtain the second global picture set, wherein the pictures in the filled picture set have the same aspect ratio as the pictures in the second global picture set. For example, suppose picture A in the first global picture set is to be filled and interpolated to a preset size of 1024*1024, and the size of picture A is 256*240. Picture A is first filled with blank pixels to form picture B, the smallest square picture containing picture A, with a size of 256*256. Picture B is then interpolated to obtain picture C with a size of 1024*1024, which completes the filling and interpolation of picture A.

Further, in this embodiment of the application, in order to speed up the training of the subsequent model, the global model generation module 101 normalizes each pixel value in each picture in the second global picture set to obtain the first standard picture set. The normalization of each original pixel value in each picture in the second global picture set by the global model generation module 101 can be calculated by the following formula:

P_g = P_x / 256

In summary, in the embodiment of the present application, the global model generation module 101 performing global target area cropping conversion and pixel normalization processing on the initial picture set to obtain the first standard picture set includes: cropping the first region of interest of each picture in the initial picture set to obtain the first global picture set; filling and interpolating each picture in the first global picture set to the preset size to obtain the second global picture set; and normalizing each pixel value in each picture in the second global picture set to obtain the first standard picture set.

Further, the global model generation module 101 in the embodiment of the present application uses the following methods to train the pre-built first deep learning model, including:

In detail, in the embodiment of the present application, the global model generation module 101 performs a convolution pooling operation on the first standard picture set to obtain a first feature set, including: performing a convolution operation on the first standard picture set to obtain A first convolution data set; performing a maximum pooling operation on the first convolution data set to obtain the first feature set.

Further, the convolution operation is:

Further, the activation function described in the preferred embodiment of the present application includes:

The local model generation module 102 is used to perform local target region cropping conversion, data enhancement and pixel normalization processing on the initial picture set to obtain a second standard picture set, and to train the pre-built second deep learning model by using the second standard picture set to obtain the second recognition model.

In detail, the local model generation module 102 in the embodiment of the present application uses the following methods to perform local target region cropping conversion, data enhancement, and pixel normalization processing on the initial picture set, including:

Crop the second region of interest of each picture in the initial picture set to obtain the initial partial picture set;

Marking the corresponding pictures in the initial partial picture set according to the initial label corresponding to each picture in the initial picture set to obtain a first partial picture set;

For example: the label of picture A in the initial picture set is tuberculosis, and the label position is the upper left lung. After the above processing, the upper left lung picture a and the upper right lung picture b are obtained from picture A. According to the label of picture A, picture a is marked with the tuberculosis label and picture b is marked with the non-tuberculosis label.

Filling and interpolating pictures in the first partial picture set to a preset size to obtain a second partial picture set;

Performing normalization processing on each pixel value in each picture in the second partial picture set to obtain a third partial picture set;

Performing a preset angle rotation on each picture in the third partial picture set, and labeling with a corresponding rotation angle to obtain the second standard picture set;

In detail, in the embodiment of the present application, the local model generation module 102 uses the following means to train the pre-built second deep learning model including:

Perform weight calculation according to the preset second loss function and the preset third loss function to obtain the target loss function;

In detail, in the embodiment of the present application, the pictures in the second standard picture set have double labels, namely the initial label and the rotation angle label. Therefore, two types of prediction results are generated during the model training process. To measure the two types of prediction results separately, two loss functions are required, namely the second loss function and the third loss function, wherein the second loss function is the loss function corresponding to the initial label, and the third loss function is the loss function corresponding to the rotation angle label.

The weight calculation can be expressed by the following formula:

L = L_tb + α L_rot

Preferably, the weight coefficient is 0.1.

According to the target loss function, use the second standard picture set to train the second deep learning model; when the value of the target loss function is less than a second preset threshold, stop training to obtain the second recognition model.

The picture recognition module 103 is configured to, when a picture to be recognized is received, use the first recognition model and the second recognition model to recognize and determine the result of the picture to be recognized to obtain a recognition result.

Further, the picture recognition module 103 in the embodiment of the present application respectively uses the first recognition model and the second recognition model to recognize the picture to be recognized, and uses the first recognition model to recognize the picture to be recognized. Perform recognition to obtain a first recognition result; use the second recognition model to recognize the picture to be recognized to obtain a second recognition result, wherein the second recognition result includes a disease recognition result and a picture rotation angle result, preferably Specifically, the disease recognition result in the embodiment of the present application is the pulmonary tuberculosis recognition result.

Further, the picture recognition module 103 in the embodiment of the present application uses the following means to perform logical operations according to the first recognition result and the second recognition result to obtain the target recognition result. The logical operation is two logical operations of OR and AND, for example: the first recognition result is positive for tuberculosis, the disease recognition result in the disease recognition result in the second recognition result is negative for tuberculosis, or the first recognition result The result is pulmonary tuberculosis negative, and the disease recognition result in the second recognition result is positive for pulmonary tuberculosis, then the target recognition result is positive for pulmonary tuberculosis; when the first recognition result and the disease recognition results in the second recognition result are both Is negative for pulmonary tuberculosis, the target recognition result is negative for pulmonary tuberculosis; when the disease recognition results in the first recognition result and the second recognition result are both positive for pulmonary tuberculosis, the target recognition result is positive for pulmonary tuberculosis .

FIG. 4 is a schematic structural diagram of an electronic device that implements the picture recognition method of the present application.

The electronic device 1 may include a processor 10, a memory 11, and a bus, and may also include a computer program stored in the memory 11 and running on the processor 10, such as a picture recognition program.

The memory 11 may be volatile or non-volatile. The memory 11 includes at least one type of readable storage medium, including flash memory, a mobile hard disk, a multimedia card, card-type memory (for example, SD or DX memory), magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 11 may be an internal storage unit of the electronic device 1, for example, a mobile hard disk of the electronic device 1. In other embodiments, the memory 11 may also be an external storage device of the electronic device 1, such as a plug-in mobile hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the electronic device 1. Further, the memory 11 may also include both an internal storage unit of the electronic device 1 and an external storage device. The memory 11 can be used not only to store application software installed in the electronic device 1 and various data, such as the code of the picture recognition program, but also to temporarily store data that has been output or will be output.

In some embodiments, the processor 10 may be composed of integrated circuits, for example, a single packaged integrated circuit, or multiple integrated circuits with the same function or different functions, including combinations of one or more central processing units (CPUs), microprocessors, digital processing chips, graphics processors, and various control chips. The processor 10 is the control unit of the electronic device. It uses various interfaces and lines to connect the components of the entire electronic device, and executes various functions of the electronic device 1 and processes data by running or executing the programs or modules stored in the memory 11 (such as the picture recognition program) and calling the data stored in the memory 11.

The bus may be a peripheral component interconnect standard (PCI) bus or an extended industry standard architecture (EISA) bus, etc. The bus can be divided into address bus, data bus, control bus and so on. The bus is configured to implement connection and communication between the memory 11 and at least one processor 10 and the like.

FIG. 4 only shows an electronic device with certain components. Those skilled in the art can understand that the structure shown in FIG. 4 does not constitute a limitation on the electronic device 1, which may include fewer or more components than shown in the figure, a combination of certain components, or a different arrangement of components.

For example, although not shown, the electronic device 1 may also include a power source (such as a battery) for supplying power to the various components. Preferably, the power source may be logically connected to the at least one processor 10 through a power management device, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management device. The power source may also include any components such as one or more DC or AC power supplies, recharging devices, power failure detection circuits, power converters or inverters, and power status indicators. The electronic device 1 may also include various sensors, Bluetooth modules, Wi-Fi modules, etc., which will not be repeated here.

Further, the electronic device 1 may also include a network interface. Optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface, a Bluetooth interface, etc.), which is usually used to establish a communication connection between the electronic device 1 and other electronic devices.

Optionally, the electronic device 1 may also include a user interface. The user interface may be a display and an input unit (such as a keyboard). Optionally, the user interface may also be a standard wired interface or a wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch liquid crystal display, an OLED (organic light-emitting diode) touch device, etc. The display may also be referred to as a display screen or a display unit, and is used to display the information processed in the electronic device 1 and to display a visualized user interface.

It should be understood that the embodiments are only for illustrative purposes, and the scope of the patent application is not limited by this structure.

The picture recognition program 12 stored in the memory 11 in the electronic device 1 is a combination of multiple instructions. When running in the processor 10, it can realize:

Specifically, for the specific implementation method of the above-mentioned instructions by the processor 10, reference may be made to the description of the relevant steps in the embodiment corresponding to FIG. 1, which will not be repeated here.

Further, if the integrated module/unit of the electronic device 1 is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium, which may be volatile or non-volatile. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a mobile hard disk, a magnetic disk, an optical disk, computer memory, and read-only memory (ROM).

Further, the computer-usable storage medium may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function, etc., and the data storage area may store data created according to the use of a blockchain node, etc.

In the several embodiments provided in this application, it should be understood that the disclosed equipment, device, and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. For example, the division of the modules is only a logical function division, and there may be other division methods in actual implementation.

The modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the modules can be selected according to actual needs to achieve the goals of the solutions of the embodiments.

In addition, the functional modules in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above-mentioned integrated unit may be implemented in the form of hardware, or may be implemented in the form of hardware plus software functional modules.

For those skilled in the art, it is obvious that the present application is not limited to the details of the foregoing exemplary embodiments, and the present application can be implemented in other specific forms without departing from the spirit or basic characteristics of the application.

Therefore, from any point of view, the embodiments should be regarded as exemplary and non-limiting. The scope of this application is defined by the appended claims rather than by the above description, and therefore all changes falling within the meaning and scope of equivalent elements of the claims are intended to be included in this application. Any reference signs in the claims should not be regarded as limiting the claims involved.

The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks generated in association with one another using cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain can include an underlying blockchain platform, a platform product service layer, and an application service layer.

In addition, it is obvious that the word "including" does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices stated in the system claims can also be implemented by one unit or device through software or hardware. Words such as "first" and "second" are used to indicate names, and do not indicate any specific order.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the application and not to limit them. Although the application has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that modifications or equivalent replacements can be made to the technical solutions of the application without departing from the spirit and scope of the technical solutions of the present application.

Claims

1. A picture recognition method, wherein the method includes:

Acquiring an initial picture set, and performing global target area cropping conversion and pixel normalization processing on the initial picture set to obtain a first standard picture set;

Training the pre-built first deep learning model by using the first standard picture set to obtain the first recognition model;

Performing local target region cropping conversion, data enhancement and pixel normalization processing on the initial picture set to obtain a second standard picture set;

Training the pre-built second deep learning model by using the second standard picture set to obtain a second recognition model;

When the picture to be recognized is received, the first recognition model and the second recognition model are used to recognize and determine the result of the picture to be recognized to obtain a recognition result.
2. The picture recognition method according to claim 1, wherein said performing global target area cropping conversion and pixel normalization processing on said initial picture set to obtain a first standard picture set comprises:

Crop the first region of interest of each picture in the initial picture set to obtain the first global picture set;

Filling and interpolating each picture in the first global picture set to a preset size to obtain a second global picture set;

Perform normalization processing on each pixel value in each picture in the second global picture set to obtain the first standard picture set.
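By way of illustration only, the cropping, filling and interpolating, and normalization steps recited above might be sketched in Python with NumPy as follows. The sketch rests on several assumptions that are not stated in the claim: the picture is a 2-D grayscale array, the region of interest is supplied as a `(top, left, bottom, right)` box, filling means zero-padding to a square, interpolation is nearest-neighbour, and normalization is min-max scaling to [0, 1]. The function name `preprocess_global` and the default size of 256 are hypothetical.

```python
from typing import Tuple

import numpy as np


def preprocess_global(image: np.ndarray,
                      roi: Tuple[int, int, int, int],
                      size: int = 256) -> np.ndarray:
    """Crop the ROI, zero-pad to a square, resize to size x size, scale to [0, 1]."""
    top, left, bottom, right = roi
    cropped = image[top:bottom, left:right].astype(np.float32)

    # Fill (zero-pad) the shorter side so that the crop becomes square.
    h, w = cropped.shape
    side = max(h, w)
    pad_h, pad_w = side - h, side - w
    padded = np.pad(
        cropped,
        ((pad_h // 2, pad_h - pad_h // 2), (pad_w // 2, pad_w - pad_w // 2)),
        mode="constant",
    )

    # Nearest-neighbour interpolation to the preset size (an assumed method).
    idx = np.arange(size) * side // size
    resized = padded[np.ix_(idx, idx)]

    # Min-max normalization; a constant picture maps to all zeros.
    lo, hi = resized.min(), resized.max()
    if hi == lo:
        return np.zeros_like(resized)
    return (resized - lo) / (hi - lo)
```

A production pipeline would more likely use a library resize with bilinear or bicubic interpolation; nearest-neighbour is used here only to keep the sketch dependency-free.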
3. The picture recognition method according to claim 1, wherein said performing local target region cropping conversion, data enhancement and pixel normalization processing on said initial picture set to obtain a second standard picture set comprises:

Crop the second region of interest of each picture in the initial picture set to obtain the initial partial picture set;

Marking the corresponding pictures in the initial partial picture set according to the initial label corresponding to each picture in the initial picture set to obtain the first partial picture set;

Filling and interpolating pictures in the first partial picture set to a preset size to obtain a second partial picture set;

Performing normalization processing on each pixel value in each picture in the second partial picture set to obtain a third partial picture set;

Rotate each picture in the third partial picture set by a preset angle, and mark it with a corresponding rotation angle to obtain the second standard picture set.
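By way of illustration only, the rotation and rotation-angle marking step recited above might be sketched as follows. The sketch assumes that the preset angles are multiples of 90 degrees, so that `numpy.rot90` (which rotates counterclockwise) can be used without interpolation; the claim does not specify the angles, and the function name `augment_with_rotations` is hypothetical.

```python
import numpy as np


def augment_with_rotations(picture, disease_label, angles=(0, 90, 180, 270)):
    """Return (rotated picture, disease label, rotation angle) for each preset angle."""
    samples = []
    for angle in angles:
        if angle % 90 != 0:
            raise ValueError("this sketch only supports multiples of 90 degrees")
        # np.rot90 rotates counterclockwise by k quarter turns.
        rotated = np.rot90(picture, k=angle // 90)
        samples.append((rotated, disease_label, angle))
    return samples
```

Each sample then carries two labels: the initial (disease) label inherited from the source picture and the rotation angle, which the second model can be trained to predict alongside the disease result.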
4. The picture recognition method according to claim 3, wherein said training a pre-built first deep learning model using said first standard picture set to obtain a first recognition model comprises:

Step A: Perform a convolution pooling operation on the first standard picture set according to the preset number of convolution pooling times to obtain a feature set;

Step B: Calculate the feature set by using a preset activation function to obtain a predicted value, obtain the label value of the initial label corresponding to each picture in the first standard picture set, and calculate a first loss value from the predicted value and the label value by using the pre-built first loss function;

Step C: Compare the first loss value with a first preset threshold; when the first loss value is greater than or equal to the first preset threshold, return to Step A; when the first loss value is less than the first preset threshold, stop training to obtain the first recognition model.
5. The picture recognition method according to claim 1, wherein said training a pre-built second deep learning model using said second standard picture set to obtain a second recognition model comprises:

Perform weight calculation according to the preset second loss function and the preset third loss function to obtain the target loss function;

Training the second deep learning model by using the second standard picture set according to the target loss function;

When the value of the target loss function is less than the second preset threshold, the training is stopped to obtain the second recognition model.
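By way of illustration only, the weighting of the second and third loss functions and the threshold-based stopping condition recited above might be sketched as follows. The sketch assumes that both losses are cross-entropy losses (one over the disease classes, one over the rotation-angle classes) and that the target loss is their weighted sum; the claim specifies neither the loss type nor the weights. The names `cross_entropy`, `target_loss`, `should_stop`, and the weights 0.7 and 0.3 are hypothetical.

```python
import numpy as np


def cross_entropy(probs: np.ndarray, labels: np.ndarray) -> float:
    """Mean cross-entropy of predicted class probabilities against integer labels."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(np.clip(picked, 1e-12, 1.0))))


def target_loss(disease_probs, disease_labels, angle_probs, angle_labels,
                disease_weight=0.7, angle_weight=0.3) -> float:
    """Weighted sum of the disease loss and the rotation-angle loss (assumed form)."""
    return (disease_weight * cross_entropy(disease_probs, disease_labels)
            + angle_weight * cross_entropy(angle_probs, angle_labels))


def should_stop(loss_value: float, threshold: float) -> bool:
    """Training stops once the target loss falls strictly below the preset threshold."""
    return loss_value < threshold
```

In a training loop, `target_loss` would be evaluated after each update and training would end the first time `should_stop` returns `True`.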
6. The picture recognition method according to claim 1, wherein said using said first recognition model and said second recognition model to recognize and determine the result of the picture to be recognized to obtain a target recognition result comprises:

Recognizing the picture to be recognized by using the first recognition model to obtain a first recognition result;

Recognizing the picture to be recognized by using the second recognition model to obtain a second recognition result;

Perform logical operations according to the first recognition result and the second recognition result to obtain the target recognition result.
7. The picture recognition method according to any one of claims 1 to 6, wherein the initial picture set is a patient chest X-ray picture set, the first standard picture set is a whole lung area picture set, and the second standard picture set is an upper lung area picture set.
8. A picture recognition device, wherein the device includes:

The global model generation module is used to obtain an initial picture set, perform global target area cropping conversion and pixel normalization processing on the initial picture set to obtain a first standard picture set, and train the pre-built first deep learning model by using the first standard picture set to obtain the first recognition model;

The local model generation module is used to perform local target area cropping conversion, data enhancement, and pixel normalization processing on the initial picture set to obtain a second standard picture set, and train the pre-built second deep learning model by using the second standard picture set to obtain the second recognition model;

The picture recognition module is configured to, when the picture to be recognized is received, use the first recognition model and the second recognition model to recognize and determine the result of the picture to be recognized to obtain a recognition result.
9. An electronic device, wherein the electronic device includes:

At least one processor; and,

A memory communicatively connected with the at least one processor; wherein,

The memory stores computer program instructions executable by the at least one processor, and the computer program instructions are executed by the at least one processor, so that the at least one processor can execute the following steps:

Acquiring an initial picture set, and performing global target area cropping conversion and pixel normalization processing on the initial picture set to obtain a first standard picture set;

Training the pre-built first deep learning model by using the first standard picture set to obtain the first recognition model;

Performing local target region cropping conversion, data enhancement and pixel normalization processing on the initial picture set to obtain a second standard picture set;

Training the pre-built second deep learning model by using the second standard picture set to obtain a second recognition model;

When the picture to be recognized is received, the first recognition model and the second recognition model are used to recognize and determine the result of the picture to be recognized to obtain a recognition result.
10. The electronic device according to claim 9, wherein said performing global target area cropping conversion and pixel normalization processing on said initial picture set to obtain a first standard picture set comprises:

Crop the first region of interest of each picture in the initial picture set to obtain the first global picture set;

Filling and interpolating each picture in the first global picture set to a preset size to obtain a second global picture set;

Perform normalization processing on each pixel value in each picture in the second global picture set to obtain the first standard picture set.
11. The electronic device according to claim 9, wherein said performing local target region cropping conversion, data enhancement and pixel normalization processing on said initial picture set to obtain a second standard picture set comprises:

Crop the second region of interest of each picture in the initial picture set to obtain the initial partial picture set;

Marking the corresponding pictures in the initial partial picture set according to the initial label corresponding to each picture in the initial picture set to obtain the first partial picture set;

Filling and interpolating pictures in the first partial picture set to a preset size to obtain a second partial picture set;

Performing normalization processing on each pixel value in each picture in the second partial picture set to obtain a third partial picture set;

Rotate each picture in the third partial picture set by a preset angle, and mark it with a corresponding rotation angle to obtain the second standard picture set.
12. The electronic device according to claim 11, wherein said training a pre-built first deep learning model using said first standard picture set to obtain a first recognition model comprises:

Step A: Perform a convolution pooling operation on the first standard picture set according to the preset number of convolution pooling times to obtain a feature set;

Step B: Calculate the feature set by using a preset activation function to obtain a predicted value, obtain the label value of the initial label corresponding to each picture in the first standard picture set, and calculate a first loss value from the predicted value and the label value by using the pre-built first loss function;

Step C: Compare the first loss value with a first preset threshold; when the first loss value is greater than or equal to the first preset threshold, return to Step A; when the first loss value is less than the first preset threshold, stop training to obtain the first recognition model.
13. The electronic device according to claim 9, wherein said training a pre-built second deep learning model using said second standard picture set to obtain a second recognition model comprises:

Perform weight calculation according to the preset second loss function and the preset third loss function to obtain the target loss function;

Training the second deep learning model by using the second standard picture set according to the target loss function;

When the value of the target loss function is less than the second preset threshold, the training is stopped to obtain the second recognition model.
14. The electronic device according to claim 9, wherein said using said first recognition model and said second recognition model to recognize and determine the result of said picture to be recognized to obtain a target recognition result comprises:

Recognizing the picture to be recognized by using the first recognition model to obtain a first recognition result;

Recognizing the picture to be recognized by using the second recognition model to obtain a second recognition result;

Perform logical operations according to the first recognition result and the second recognition result to obtain the target recognition result.
15. The electronic device according to any one of claims 9 to 14, wherein the initial picture set is a patient chest X-ray picture set, the first standard picture set is a whole lung area picture set, and the second standard picture set is an upper lung area picture set.
16. A computer-readable storage medium storing a computer program, wherein the computer program is executed by a processor to implement the following steps:

Acquiring an initial picture set, and performing global target area cropping conversion and pixel normalization processing on the initial picture set to obtain a first standard picture set;

Training the pre-built first deep learning model by using the first standard picture set to obtain the first recognition model;

Performing local target region cropping conversion, data enhancement and pixel normalization processing on the initial picture set to obtain a second standard picture set;

Training the pre-built second deep learning model by using the second standard picture set to obtain a second recognition model;

When the picture to be recognized is received, the first recognition model and the second recognition model are used to recognize and determine the result of the picture to be recognized to obtain a recognition result.
17. The computer-readable storage medium according to claim 16, wherein said performing global target area cropping conversion and pixel normalization processing on said initial picture set to obtain a first standard picture set comprises:

Crop the first region of interest of each picture in the initial picture set to obtain the first global picture set;

Filling and interpolating each picture in the first global picture set to a preset size to obtain a second global picture set;

Perform normalization processing on each pixel value in each picture in the second global picture set to obtain the first standard picture set.
18. The computer-readable storage medium according to claim 16, wherein said performing local target region cropping conversion, data enhancement and pixel normalization processing on said initial picture set to obtain a second standard picture set comprises:

Crop the second region of interest of each picture in the initial picture set to obtain the initial partial picture set;

Marking the corresponding pictures in the initial partial picture set according to the initial label corresponding to each picture in the initial picture set to obtain the first partial picture set;

Filling and interpolating pictures in the first partial picture set to a preset size to obtain a second partial picture set;

Performing normalization processing on each pixel value in each picture in the second partial picture set to obtain a third partial picture set;

Rotate each picture in the third partial picture set by a preset angle, and mark it with a corresponding rotation angle to obtain the second standard picture set.
19. The computer-readable storage medium according to claim 18, wherein said training a pre-built first deep learning model using said first standard picture set to obtain a first recognition model comprises:

Step A: Perform a convolution pooling operation on the first standard picture set according to the preset number of convolution pooling times to obtain a feature set;

Step B: Calculate the feature set by using a preset activation function to obtain a predicted value, obtain the label value of the initial label corresponding to each picture in the first standard picture set, and calculate a first loss value from the predicted value and the label value by using the pre-built first loss function;

Step C: Compare the first loss value with a first preset threshold; when the first loss value is greater than or equal to the first preset threshold, return to Step A; when the first loss value is less than the first preset threshold, stop training to obtain the first recognition model.
20. The computer-readable storage medium according to claim 16, wherein said training a pre-built second deep learning model using said second standard picture set to obtain a second recognition model comprises:

Perform weight calculation according to the preset second loss function and the preset third loss function to obtain the target loss function;

Training the second deep learning model by using the second standard picture set according to the target loss function;

When the value of the target loss function is less than the second preset threshold, the training is stopped to obtain the second recognition model.