CN118038487A - Training method of recognition model, body part recognition method, body part recognition device and medium - Google Patents


Info

Publication number
CN118038487A
CN118038487A (application CN202410108047.3A)
Authority
CN
China
Prior art keywords: body part, identification, historical, recognition model, type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410108047.3A
Other languages
Chinese (zh)
Inventor
史红涛
杨明雷
袁红美
武会杰
王玉栋
姚丹
刘志鹏
鞠熙慧
侯丽芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Wandong Medical Technology Co ltd
Original Assignee
Beijing Wandong Medical Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Wandong Medical Technology Co ltd
Priority to CN202410108047.3A
Publication of CN118038487A


Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Apparatus For Radiation Diagnosis (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a training method for a recognition model, a body part recognition method, a body part recognition device, and a medium. The training method may include the following steps: distinguishing the gray values corresponding to different tissue parts of the human body in a CT image by presetting window widths and window levels; inputting a historical image dataset and the corresponding window widths and window levels into an initial recognition model to find the possible position and corresponding type of each body part in each CT historical image; and training the initial model against the position identification information and type labeling information of the manually labeled body parts carried by the CT historical images to obtain the recognition model. By presetting different window widths and window levels for different body parts and training the initial recognition model on the historical image dataset, the embodiments of the application obtain a recognition model that can rapidly and accurately recognize and locate each body part. This overcomes the shortcomings of manual body part positioning, saves medical staff time, reduces manual intervention, and makes the whole recognition and positioning process more efficient and convenient.

Description

Training method of recognition model, body part recognition method, body part recognition device and medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a training method for a recognition model, a body part recognition method, a body part recognition device, and a medium.
Background
With the rapid development of medical imaging technology, computed tomography (Computed Tomography, CT) imaging plays an increasingly important role in clinical diagnosis. CT is an important medical imaging technique that can provide high-resolution, three-dimensional images of body parts, allowing the user to see the internal condition of the body without surgery.
In the related art, locating and identifying body parts in a CT image generally relies on manual operation by a medical technician: the technician must mark and adjust body parts according to the anatomy shown on the image. This is heavily affected by the technician's personal experience and can suffer from problems such as inconsistent identification and low efficiency.
Disclosure of Invention
The embodiment of the application provides a training method of an identification model, a body part identification method, a body part identification device and a medium, which are used for solving the problems of inconsistent manual operation identification, low efficiency and the like in the prior art.
In a first aspect, an embodiment of the present application provides a training method for an identification model, where the method includes:
Acquiring a historical image dataset; the historical image data set comprises a plurality of CT (computed tomography) historical images, wherein the CT historical images carry position identification information and type labeling information of each body part;
Determining the window width and the window level corresponding to each body part; wherein, the window width and window level corresponding to different body parts are not identical;
Inputting the window width and window level corresponding to each body part and the historical image dataset into an initial recognition model to obtain position prediction information and type prediction information corresponding to each of a plurality of CT historical images;
And training the initial recognition model based on the position prediction information and the type prediction information corresponding to each of the CT historical images and the position recognition information and the type labeling information of each body part carried by the CT historical images to obtain a recognition model.
In a second aspect, embodiments of the present application provide a body part identification method, the method comprising:
Acquiring a CT image to be identified;
inputting the CT image to be identified into an identification model to obtain the position information of an identification frame of each body part in the CT image to be identified and the type of the body part corresponding to the identification frame;
wherein the recognition model is trained by the training method of the recognition model according to any one of claims 1 to 7.
In a third aspect, an embodiment of the present application provides a training apparatus for identifying a model, where the apparatus includes:
a historical image data acquisition module for acquiring a historical image data set; the historical image data set comprises a plurality of CT (computed tomography) historical images, wherein the CT historical images carry position identification information and type labeling information of each body part;
a window information determining module for determining a window width and a window level corresponding to each body part; wherein, the window width and window level corresponding to different body parts are not identical;
The prediction information obtaining module is used for inputting the window width and window level corresponding to each body part and the historical image dataset into the initial recognition model to obtain position prediction information and type prediction information corresponding to each of the plurality of CT historical images;
The identification model obtaining module is used for training the initial identification model based on the position prediction information and the type prediction information corresponding to each of the CT historical images and the position identification information and the type labeling information of each body part carried by the CT historical images to obtain an identification model.
In a fourth aspect, embodiments of the present application provide a body part identification device, the device comprising:
the CT image acquisition module is used for acquiring a CT image to be identified;
the identification module is used for inputting the CT image to be identified into an identification model to obtain the position information of an identification frame of each body part in the CT image to be identified and the type of the body part corresponding to the identification frame;
wherein the recognition model is trained by the training method of the recognition model according to any one of claims 1 to 7.
In a fifth aspect, embodiments of the present application provide a computer storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the above-described method steps.
In a sixth aspect, an embodiment of the present application provides an electronic device, which may include: a processor and a memory;
Wherein the memory stores a computer program adapted to be loaded by the processor and to perform the above-mentioned method steps.
The technical scheme provided by the embodiments of the application has the beneficial effects that at least:
In the embodiments of the application, the gray values corresponding to different tissue parts of the human body in a CT image can be distinguished by presetting window widths and window levels; the historical image dataset and the corresponding window widths and window levels are input into an initial recognition model to find the possible position and corresponding type of each body part in each CT historical image. Further, the initial model is trained against the position identification information and type labeling information of the manually labeled body parts carried by the CT historical images to obtain the recognition model. Thus, by presetting different window widths and window levels for different body parts and training the initial recognition model on the historical image dataset, the embodiments of the application obtain a recognition model that can rapidly and accurately recognize and locate each body part, overcoming the shortcomings of manual body part positioning, saving medical staff time, reducing manual intervention, and making the whole recognition and positioning process more efficient and convenient.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a training method for an identification model according to an embodiment of the present application;
FIGS. 2A and 2B are schematic diagrams of CT images including a plurality of body part identification frames according to embodiments of the present application;
FIG. 3 is a schematic flow chart of a training process of a training method for an identification model according to an embodiment of the present application;
FIG. 4 is a flowchart of another training method for an identification model according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a training process of another training method for an identification model according to an embodiment of the present application;
Fig. 6 is a flowchart of a body part recognition method according to an embodiment of the present application;
FIGS. 7A-7C are schematic diagrams of CT images including 6 body part identification frames according to embodiments of the present application;
FIG. 8 is a schematic structural diagram of a training device for identifying a model according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of a body part recognition device according to an embodiment of the present application;
FIG. 10 is a schematic diagram of an electronic device including a training application for recognition models according to an embodiment of the present application;
Fig. 11 is a schematic structural diagram of an electronic device including a body part recognition application according to an embodiment of the present application.
Detailed Description
When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the application as detailed in the accompanying claims.
In the description of the present application, it should be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. The specific meaning of the above terms in the present application will be understood in specific cases by those of ordinary skill in the art. Furthermore, in the description of the present application, unless otherwise indicated, "a plurality" means two or more. "and/or", describes an association relationship of an association object, and indicates that there may be three relationships, for example, a and/or B, and may indicate: a exists alone, A and B exist together, and B exists alone. The character "/" generally indicates that the context-dependent object is an "or" relationship.
CT measures the human body with highly sensitive instruments, exploiting the differences in X-ray absorption and transmittance among different tissues. The measured data are fed into a computer, which processes them to produce cross-sectional or three-dimensional images of the examined body part. CT images are rendered in gray scales that reflect the degree to which organs and tissues absorb X-rays.
A CT image is composed of a matrix of pixels at different gray levels from black to white. These pixels reflect the X-ray absorption coefficient of the corresponding voxels. The pixel size and count differ across CT devices: the size may be, for example, 1.0×1.0 mm or 0.5×0.5 mm, and the count may be 256×256 (65,536 pixels) or 512×512 (262,144 pixels). The smaller and more numerous the pixels, the finer the image, i.e., the higher the spatial resolution.
In the related art, locating and identifying body parts in a CT image typically relies on manual operation by a medical technician, which may include the following. Image capture and manual marking: before a formal CT scan, the technician acquires CT images and manually marks the body parts in them. Manual auxiliary-tool positioning: in some cases, the physician uses auxiliary tools or software, such as image processing software or specific locating means, to assist in identifying and locating body parts. Manual adjustment and identification: the physician manually adjusts the positioning and confirms and identifies the body part based on personal experience combined with the anatomy shown in the CT image.
To address the inaccurate positioning and identification and the low efficiency of manual operation on CT images in the related art, the present application provides a solution aimed at improving the accuracy of body part recognition and the efficiency of positioning.
The training method and the body part recognition method of the recognition model provided by the embodiment of the application are described next in combination with the CT related information described above.
In one embodiment, as shown in FIG. 1, a flow diagram of a training method for an identification model is provided. As shown in fig. 1, the training method of the recognition model may include the following steps:
S101, acquiring a historical image dataset.
The historical image data set in the embodiment of the application can comprise a plurality of CT historical images, wherein the CT historical images carry position identification information and type labeling information of each body part.
It will be appreciated that in the data preparation stage of the training process, a certain number of CT scout scan images covering body parts such as the head, chest, abdomen, heart, pelvis, and cervical vertebra need to be collected in advance as training data (the present application places no limit on scout image size).
Specifically, the position identification information and type labeling information of each body part carried by the CT historical images in the embodiment of the present application means that the positions and types of the body parts are labeled in advance by manual means; for example, a doctor labels the positions and types of six body parts (head, chest, abdomen, heart, pelvis, cervical vertebra) in the CT historical images.
S102, determining the window width and the window level corresponding to each body part.
Window Width and Window Level are parameters for adjusting how a CT image is displayed. Unlike ordinary natural images, which are typically 8-bit, a CT image is a 16-bit image; physicians therefore adjust the window width and window level as needed to make the CT image easier to interpret and diagnose. Because the tissue structures and densities of organs differ across parts of the human body, different parts require different window width and window level parameters.
Specifically, the window width is the range of CT values mapped onto the displayed gray levels, and the following principles are generally followed. A narrow window width (50-350 HU) displays a small CT value range, so each gray level represents a small CT value span; contrast is high, which suits observing tissues of similar density (such as the brain). A wide window width (400-2000 HU) displays a large CT value range, so each gray level represents a large span; image contrast is lower but density rendering is more uniform, which suits observing structures with large density differences (such as bone tissue). The displayed CT value range = window level ± (window width / 2). The window level is the center of the displayed gray range, and adjusting it mainly changes image brightness: raising the window level darkens the image, and lowering it brightens the image.
It can be understood that the window level is set with the average CT value of the tissue of interest as a reference: when setting the CT window, the window level should first be determined from the CT value of the body part to be observed, and a suitable window width should then be chosen according to the contrast between that tissue and its surroundings and the display requirements of the whole image.
Specifically, the embodiment of the application can apply window width and window level processing to the CT image data by way of data normalization to highlight the target body part, normalizing the pixel values to between 0 and 1. For example, the head and cervical vertebra may use a window width of 1000 and a window level of 400; the chest, abdomen, and heart a window width of 1200 and a window level of 800; and the pelvis a window width of 800 and a window level of 300.
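To make the windowing step concrete, the following is a minimal Python sketch, assuming the input is a raw CT array in Hounsfield units; the function name and the NumPy-based implementation are illustrative assumptions, not code from the patent.

```python
import numpy as np

def apply_window(hu: np.ndarray, window_width: float, window_level: float) -> np.ndarray:
    """Clip a CT image to [WL - WW/2, WL + WW/2] and normalize to [0, 1]."""
    low = window_level - window_width / 2.0   # e.g. WW=1000, WL=400 gives low = -100 HU
    high = window_level + window_width / 2.0  # ... and high = 900 HU
    windowed = np.clip(hu, low, high)         # suppress values outside the window
    return (windowed - low) / (high - low)    # pixel values now lie in [0, 1]

# Window settings quoted in this embodiment, as (width, level) per body part:
WINDOWS = {
    "head": (1000, 400), "cervical_spine": (1000, 400),
    "chest": (1200, 800), "abdomen": (1200, 800), "heart": (1200, 800),
    "pelvis": (800, 300),
}
```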
S103, inputting the window width and the window level corresponding to each body part and the historical image dataset into an initial recognition model to obtain position prediction information and type prediction information corresponding to each of the plurality of CT historical images.
Possibly, the embodiment of the application can process the CT historical images by random rotation, Mosaic data augmentation, random color change, and the like, so as to enrich the number of samples in the historical image dataset.
Specifically, the embodiment of the application can rotate the CT historical images by arbitrary angles to obtain CT rotation images. For example, the CT historical images in the dataset are randomly flipped horizontally with probability 0.5 and randomly color-changed with probability 0.5, and the processed CT images are finally resized (scaled) to 640×640. Mosaic data augmentation includes, but is not limited to: random scaling, random cropping, and random arrangement.
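A minimal sketch of the augmentation just described, assuming the image has already been windowed to [0, 1]; the jitter ranges and the use of OpenCV for resizing are assumptions for illustration.

```python
import random
import numpy as np
import cv2  # assumed available for resizing

def augment(img: np.ndarray) -> np.ndarray:
    """Random horizontal flip and color change, each with probability 0.5,
    then scaling to the 640x640 network input size."""
    if random.random() < 0.5:
        img = img[:, ::-1].copy()            # horizontal flip
        # (in real training the box annotations must be flipped accordingly)
    if random.random() < 0.5:
        alpha = random.uniform(0.8, 1.2)     # illustrative contrast factor
        beta = random.uniform(-0.1, 0.1)     # illustrative brightness offset
        img = np.clip(img * alpha + beta, 0.0, 1.0)
    return cv2.resize(img, (640, 640))
```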
Aiming at the requirements of accurate positioning and automatic recognition of body parts in the field of medical imaging, the initial recognition model in the embodiment of the application can adopt the target detection and image segmentation algorithm YOLOv8 (You Only Look Once version 8), so as to achieve rapid, efficient, and accurate recognition and positioning of body parts.
Possibly, other deep learning models may also be employed in embodiments of the present application for target detection and image localization, such as Faster R-CNN and SSD (Single Shot MultiBox Detector). Faster R-CNN is a unified target detection network combining two modules: an RPN, which generates regions of interest, and Fast R-CNN, which classifies the target and performs bounding-box regression.
Specifically, the location prediction information in the embodiment of the present application may include: the location information of the plurality of identification frames, the type prediction information may include: probability of different body part types corresponding to each identification frame.
Further, the embodiment of the application can input the window width and the window level corresponding to each body part and the plurality of CT historical images into the initial recognition model to obtain the position information of the plurality of recognition frames corresponding to each CT historical image and the probabilities of different body part types corresponding to each recognition frame.
For example, a 640×640 CT image is input into the YOLOv8 model for recognition, and the model outputs 8400×10 data: 8400 indicates that 8400 recognition frames are detected in total, and 10 is the amount of data per frame. The first 4 items give the frame's position (the abscissa and ordinate of its center point, and its width and height), and the remaining 6 items give the recognition rate for each of the head, chest, abdomen, heart, pelvis, and cervical vertebra; each recognition rate may be any value between 0 and 1.
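The 8400×10 output described above can be decoded roughly as follows; this NumPy sketch and its names are assumptions consistent with the text, not the patent's code.

```python
import numpy as np

PART_NAMES = ["head", "chest", "abdomen", "heart", "pelvis", "cervical_spine"]

def decode_predictions(preds: np.ndarray):
    """preds: an (8400, 10) array. Columns 0-3 hold each candidate frame's
    center x, center y, width, and height; columns 4-9 hold the recognition
    rate for each of the six body parts."""
    boxes = preds[:, :4]                 # (cx, cy, w, h) per frame
    scores = preds[:, 4:]                # six per-part probabilities per frame
    best_type = scores.argmax(axis=1)    # most likely body part per frame
    best_score = scores.max(axis=1)      # its probability
    return boxes, best_type, best_score
```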
Refer to the schematic diagram of a CT historical image containing multiple items of position information and probability information shown in FIG. 2A. As can be seen from FIG. 2A, area 0 of the CT historical image corresponds to 3 recognition frames (01, 02, 03) and area 5 corresponds to 3 recognition frames (51, 52, 53). The 10 items of data corresponding to frame 01 are: (15,20,12,10) (0.9, 0.04, 0.02, 0.01, 0.05, 0.09); for frame 02: (12,18,12,10) (0.92, 0.03, 0.01, 0.05, 0.06, 0.03); for frame 03: (10,15,12,10) (0.89, 0.13, 0.02, 0.01, 0.04, 0.034); for frame 51: (21,10,8,7) (0.15, 0.1, 0.12, 0.13, 0.8, 0.18); for frame 52: (19,8,8,7) (0.13, 0.12, 0.14, 0.15, 0.83, 0.13); and for frame 53: (8,7,8,7) (0.12, 0.13, 0.15, 0.16, 0.79, 0.17).
S104, training the initial recognition model based on the position prediction information and the type prediction information corresponding to each of the plurality of CT historical images and the position recognition information and the type labeling information of each body part carried by the plurality of CT historical images to obtain a recognition model.
Possibly, the embodiment of the application can filter the probabilities of different body part types corresponding to each identification frame in the CT historical image based on the confidence threshold of the identification frame to obtain a CT historical filtered image; determining the body part type corresponding to the maximum probability value in probabilities of different body part types corresponding to each identification frame in the CT historical filtering image as the body part type corresponding to each identification frame; based on the cross-correlation threshold, performing de-duplication processing on a plurality of identification frames of the same type in the CT historical filtering image to obtain type identification frames corresponding to the types of all body parts in the CT historical image and position information corresponding to the type identification frames; training the initial recognition model based on the type recognition frame corresponding to each body part type in the CT historical images and the position information corresponding to the type recognition frame, and the position recognition information and the type labeling information of each body part carried by the CT historical images to obtain a recognition model.
For example, the recognition-frame confidence threshold is first set to 0.3 to filter out frames whose body-part-type probability in the CT historical image is below 0.3. The remaining frames are then further screened: nearby overlapping frames of the same type are removed by the non-maximum suppression algorithm (Non Maximum Suppression, NMS) with the Intersection over Union (IoU) threshold set to 0.1, so that only one frame is retained per body part type, namely the frame whose body part type has the maximum probability value.
Referring to FIGS. 2A and 2B, it is first determined whether FIG. 2A contains any frame whose body-part-type probability is below 0.3; such frames are excluded. For each remaining frame, the maximum of its 6 probability values is selected and the corresponding body part type is determined. Overlapping frames of the same type are then deleted using the IoU criterion, leaving a unique frame per type, which yields the CT historical image of FIG. 2B with head frame 02: (12,18,12,10) (0.92) and cervical spine frame 52: (19,8,8,7) (0.83).
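A compact sketch of this confidence filtering plus same-type de-duplication, using the 0.3 confidence and 0.1 IoU thresholds from the example; frames are taken in center/size form as output by the model, and all names are illustrative assumptions.

```python
import numpy as np

def iou(a, b):
    """IoU of two frames given as (cx, cy, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    ix = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    iy = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / (union + 1e-9)

def filter_and_nms(boxes, types, scores, conf_thr=0.3, iou_thr=0.1):
    """Drop frames below the confidence threshold, then keep only the
    highest-scoring frame among overlapping frames of the same type."""
    keep = scores >= conf_thr
    boxes, types, scores = boxes[keep], types[keep], scores[keep]
    order = np.argsort(-scores)                 # highest confidence first
    kept = []
    for i in order:
        if all(types[i] != types[j] or iou(boxes[i], boxes[j]) < iou_thr
               for j in kept):
            kept.append(i)
    return boxes[kept], types[kept], scores[kept]
```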
Possibly, the embodiment of the application can convert each body part's frame from the center-point abscissa and ordinate, height, and width of the unique recognition frame into the coordinates of its top-left and bottom-right corners, with each corner coordinate returned in proportional form, i.e., as the ratio of the corner position to the width and height of the whole CT historical image.
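A sketch of the corner conversion just described; the 640×640 image size is assumed per the training setup above.

```python
def center_to_corners(cx, cy, w, h, img_w=640, img_h=640):
    """Convert a frame's center/size into normalized top-left and bottom-right
    corners, each coordinate expressed as a ratio of image width or height."""
    x1, y1 = (cx - w / 2) / img_w, (cy - h / 2) / img_h  # top-left corner
    x2, y2 = (cx + w / 2) / img_w, (cy + h / 2) / img_h  # bottom-right corner
    return (x1, y1), (x2, y2)
```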
Further, the embodiment of the application can determine the position loss value based on the position information corresponding to each type of identification frame in the CT historical image and the position identification information of each body part carried by the CT historical image; determining a type loss value based on the type of the body part corresponding to each type of identification frame in the CT historical image and the type labeling information of each body part carried by the CT historical image; determining a loss value of the initial recognition model based on the position loss value and the type loss value; training the initial recognition model based on the loss value of the initial recognition model to obtain the recognition model.
For example, a plurality of 640×640 CT historical images are fed to the YOLOv8 algorithm model for training, where the batch size may be set to 8, the optimizer is stochastic gradient descent (SGD), and the learning rate is set to 0.01. The loss value comprises two parts: a type loss for part classification, which may be computed with the binary cross-entropy (BCE) loss, and a position loss for the part recognition frame, which may use the complete IoU (CIoU) loss.
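A minimal PyTorch sketch of combining the two loss terms named above; the equal weighting and the use of torchvision's CIoU helper are illustrative assumptions (YOLOv8 itself weights and assigns these terms internally).

```python
import torch.nn.functional as F
from torchvision.ops import complete_box_iou_loss  # CIoU loss, torchvision >= 0.13

def total_loss(pred_logits, target_onehot, pred_boxes, target_boxes):
    """pred_logits / target_onehot: (n, 6) class scores and one-hot labels;
    pred_boxes / target_boxes: (n, 4) matched frames as (x1, y1, x2, y2)."""
    type_loss = F.binary_cross_entropy_with_logits(pred_logits, target_onehot)
    box_loss = complete_box_iou_loss(pred_boxes, target_boxes, reduction="mean")
    return type_loss + box_loss  # equal weighting assumed for illustration
```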
In particular, the batch size is an important hyperparameter in deep learning and machine learning: it determines how many samples are processed together in each training iteration. The optimizer determines how the model adjusts its parameters during training to minimize the loss function. SGD is a gradient-based optimization algorithm for finding the parameter configuration that minimizes the loss; it updates the parameters from the gradient of each CT historical image, randomly selecting several CT historical images per update. The learning rate may be set to 0.5, 0.1, 0.05, 0.01, 0.005, 0.0005, 0.0001, or 0.00001, compared and judged against the actual situation; a small learning rate converges slowly but can drive the loss value lower.
It can be understood that the learning rate can be chosen according to the size of the historical image dataset and varied dynamically over time rather than fixed during training: at the start of training, far from the optimum, a somewhat larger learning rate can be used; as the optimum is approached, the adjustment is slowed, i.e., training continues with a smaller learning rate, to avoid overshooting the optimum.
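One common way to realize such a decaying schedule is a cosine decay from the initial rate toward a small floor; the sketch below is an illustrative choice, not a schedule mandated by the application.

```python
import math

def learning_rate(epoch: int, total_epochs: int = 200,
                  lr0: float = 0.01, lr_min: float = 1e-4) -> float:
    """Cosine decay: fast learning early, slower adjustment near the optimum."""
    return lr_min + 0.5 * (lr0 - lr_min) * (1 + math.cos(math.pi * epoch / total_epochs))
```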
Possibly, the embodiment of the application can also adaptively adjust the initial model to realize the identification of various parts of the body in the CT historical image, thereby obtaining an identification model. The parameters and the output structure of the initial model can be optimized and improved, so that the model has stronger multi-target recognition capability.
Specifically, adjustments to the model in the embodiment of the application include, but are not limited to the following. Adjusting the model output layer: the output layer is restructured so that the model can output recognition results applicable to multiple body parts. Optimizing model parameters: parameters such as color and brightness are optimized and adjusted to suit the different characteristics and forms of the various parts, improving recognition accuracy and robustness. Adding an attention mechanism module: to further improve recognition accuracy and emphasize specific parts, an attention mechanism module can be added to the model; during recognition it focuses on and optimizes for specific body parts, improving recognition accuracy and ensuring higher positioning precision.
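As one concrete form such an attention module could take, the following is a squeeze-and-excitation block in PyTorch; the application does not specify the module's structure, so this is purely an illustrative assumption.

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation channel attention: learns per-channel weights so
    the network can emphasize features relevant to specific body parts."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                        # squeeze: global context
            nn.Conv2d(channels, channels // reduction, 1),  # excitation bottleneck
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                                   # weights in (0, 1)
        )

    def forward(self, x):
        return x * self.gate(x)  # re-weight the feature channels
```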
Referring to FIG. 3, in a specific example, a historical image dataset is acquired that contains a plurality of CT historical images carrying the position identification information and type labeling information of each body part, and the dataset together with the window width and window level for each body part is input into the initial recognition model to obtain, for each CT historical image, multiple recognition frames per body part and the type prediction probabilities for each frame. It may first be determined whether the recognition result contains any frame whose body-part-type probability is below 0.3; such frames are excluded. For each remaining frame, the maximum of the 6 probability values is selected and the corresponding body part type is determined. Overlapping frames of the same type are then deleted using the IoU criterion, leaving a recognition result with a unique frame per type: the result for each CT historical image contains only 6 recognition frames, one for each of the 6 body parts (head, chest, abdomen, heart, pelvis, cervical vertebra), together with the position information of each frame. Further, the loss value of the model can be computed from the position and type information of each frame in the CT historical image and the manual annotation information carried by the image, i.e., the position identification information and type labeling information of each body part labeled by a doctor, and the initial recognition model is trained according to the loss value to obtain the trained recognition model.
The application can distinguish the gray values corresponding to different tissue parts of the human body in a CT image by presetting window widths and window levels, and can input the historical image dataset and the corresponding window widths and window levels into an initial recognition model to find the possible position and corresponding type of each body part in each CT historical image. Further, the initial model is trained against the position identification information and type labeling information of the manually labeled body parts carried by the CT historical images to obtain the recognition model. Thus, by presetting different window widths and window levels for different body parts and training the initial recognition model on the historical image dataset, the embodiments of the application obtain a recognition model that can rapidly and accurately recognize and locate each body part, overcoming the shortcomings of manual body part positioning, saving medical staff time, reducing manual intervention, and making the whole recognition and positioning process more efficient and convenient.
In some implementations, fig. 4 schematically shows a flowchart of a training method of an identification model according to an embodiment of the present application. As shown in fig. 4, the training method of the recognition model may at least include the following steps:
s401, acquiring a historical image dataset.
Specifically, S401 corresponds to S101, and will not be described here.
S402, determining the window width and the window level corresponding to each body part.
Specifically, S402 corresponds to S102, and will not be described here again.
S403, cutting the plurality of CT history images to obtain a plurality of CT history sub-images.
It can be appreciated that the embodiment of the application can arbitrarily clip the CT history image, that is, the length and width of the clipping position can be randomly changed.
S404, based on the size of the CT history image, at least two of the CT history sub-images are selected for stitching and combining, and a CT stitched image is obtained.
For example, the plurality of CT historical images in the historical image dataset are cropped with a probability of 1.0 to obtain a corresponding number of CT history sub-images; sub-images of suitable size are then selected according to the 640×640 target size and spliced together to obtain CT spliced images.
It can be understood that the CT spliced images obtained by random cropping and random arrangement enrich the historical image dataset; random cropping in particular adds many small targets, which improves the robustness of the model.
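A minimal sketch of this cropping-and-splicing step, assuming grayscale sub-images and a four-quadrant layout; the quadrant arrangement and function names are illustrative assumptions.

```python
import random
import numpy as np

def mosaic_splice(sub_images, out_size=640):
    """Splice four randomly chosen CT sub-images into one out_size x out_size
    training image, one sub-image per quadrant."""
    half = out_size // 2
    canvas = np.zeros((out_size, out_size), dtype=np.float32)
    for img, (y, x) in zip(random.sample(sub_images, 4),
                           [(0, 0), (0, half), (half, 0), (half, half)]):
        h, w = min(half, img.shape[0]), min(half, img.shape[1])
        canvas[y:y + h, x:x + w] = img[:h, :w]   # place (and truncate) the crop
    return canvas
```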
Possibly, the embodiment of the application can adjust the parameters of the initial recognition model an unlimited number of times until its loss value no longer decreases, and finally determine the model corresponding to the minimum loss value as the recognition model.
Possibly, the embodiment of the application can also adjust the parameters of the initial recognition model a preset number of times, obtaining a plurality of loss values for the initial recognition model and determining the trained recognition model from them. For example, the number of training iterations is set to 200 and the parameters of the initial recognition model are adaptively adjusted in each iteration, yielding 200 loss values; the model corresponding to the minimum of these 200 loss values is determined as the recognition model.
S405, based on a plurality of CT historical images and a plurality of CT spliced images, parameters of an initial recognition model are adjusted n times, and n first loss values are obtained, wherein n is an integer.
The first loss values are the loss values obtained by inputting the plurality of CT historical images and the plurality of CT spliced images into the initial recognition model; the n first loss values are the n loss values output by the initial recognition model under n different parameter settings after n adjustments of its parameters.
S406, based on the plurality of CT historical images, adjusting the parameters of the initial recognition model N-n times to obtain N-n second loss values, wherein N is greater than n, and N is an integer.
The second loss values are the loss values obtained by inputting the plurality of CT historical images into the initial recognition model; the N-n second loss values are the N-n loss values output by the initial recognition model under N-n different parameter settings after N-n adjustments of its parameters.
It will be appreciated that in the last N-n passes of the training process, the embodiment of the present application chooses to turn off Mosaic data augmentation and uses only unprocessed, authentic CT historical images until training completes, so as to ensure the authenticity and validity of model training.
S407, determining the minimum loss value of the n first loss values and the N-n second loss values.
And S408, taking the initial recognition model corresponding to the minimum loss value as a recognition model.
Referring to FIG. 5, in a specific example, a historical image dataset containing a plurality of CT historical images carrying the position identification information and type labeling information of each body part is acquired, and the CT historical images are cropped and spliced to obtain a plurality of CT spliced images. Further, the input training data are divided according to the iteration count of 200: the first 190 iterations use a combined dataset containing the plurality of CT historical images and the plurality of CT spliced images, and the last 10 iterations use the historical image dataset containing only the CT historical images. In this way, after 200 training passes of the initial recognition model, 200 loss values are obtained, comprising 190 first loss values from the combined dataset and 10 second loss values from the historical image dataset; finally, the initial recognition model corresponding to the minimum of the 200 loss values is taken as the recognition model.
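A schematic training loop for this two-phase schedule; `train_one_epoch` and the dataset objects are hypothetical stand-ins for the actual YOLOv8 training step, and the state-tracking logic is an assumed realization of "keep the model with the minimum loss".

```python
def train_with_schedule(model, combined_dataset, history_dataset, train_one_epoch,
                        total_epochs=200, real_only_epochs=10):
    """First 190 passes on the combined (real + spliced) data, last 10 passes
    on real CT historical images only; keep the weights with the minimum loss."""
    best_loss, best_state = float("inf"), None
    for epoch in range(total_epochs):
        data = (history_dataset if epoch >= total_epochs - real_only_epochs
                else combined_dataset)
        loss = train_one_epoch(model, data)   # hypothetical helper: one pass, returns loss
        if loss < best_loss:                  # track the minimum loss value
            best_loss = loss
            best_state = {k: v.clone() for k, v in model.state_dict().items()}
    model.load_state_dict(best_state)         # model corresponding to the minimum loss
    return model, best_loss
```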
Therefore, the embodiment of the application can effectively improve the stability and accuracy of the trained recognition model by increasing the number and variety of input training samples. In addition, because the added training samples are CT images obtained by cropping and splicing, the method is more general than training only on CT historical images of uniform size and pixel count, and can adapt to different CT imaging devices and scanning scenarios.
In some embodiments, fig. 6 schematically shows a flow chart of a body part identification method according to an embodiment of the present application. As shown in fig. 6, the body part identification method may include at least the steps of:
S601, acquiring a CT image to be identified.
Specifically, the embodiment of the application can be applied to scenarios such as hospitals, where a doctor can obtain the CT image to be identified by performing a CT scan.
In particular, CT generates images by exploiting the intensity of X-rays and the differing degrees to which the human body absorbs them. In a CT scan, X-rays pass through the body from different angles and are received by a detector. Since CT images provide high spatial resolution and clearly show the boundaries of bones and soft tissues, CT excels at detecting fractures, finding masses, and assessing the extent of damage.
S602, inputting the CT image to be identified into an identification model to obtain the position information of the identification frame of each body part in the CT image to be identified and the type of the body part corresponding to the identification frame.
The recognition model is obtained through training by the training method of the recognition model.
See FIGS. 7A-7C. According to the embodiment of the application, the acquired CT image to be identified can be input into the trained recognition model to obtain the position information of the recognition frame of each body part in the image and the body part type corresponding to each frame. For example, recognition frame 0 in FIG. 7A marks the head and frame 5 the cervical vertebra; frame 1 in FIG. 7B marks the chest and frame 2 the abdomen; and in FIG. 7C, frame 1 marks the chest, frame 2 the abdomen, frame 3 the heart, and frame 4 the pelvis.
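As a usage illustration, the recognition step could be realized with the Ultralytics YOLOv8 API roughly as follows; the weight and image file names are placeholders, and the class ordering is assumed to match the six parts listed above.

```python
from ultralytics import YOLO

PART_NAMES = ["head", "chest", "abdomen", "heart", "pelvis", "cervical_spine"]

model = YOLO("body_part_recognizer.pt")                    # weights trained as above
results = model("ct_to_identify.png", conf=0.3, iou=0.1)   # thresholds per the text

for box in results[0].boxes:
    part = PART_NAMES[int(box.cls)]            # body part type of this frame
    x1, y1, x2, y2 = box.xyxy[0].tolist()      # frame corners in pixels
    print(f"{part}: ({x1:.0f}, {y1:.0f})-({x2:.0f}, {y2:.0f}), p={float(box.conf):.2f}")
```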
Therefore, the identification model adopted by the embodiment of the application can automatically identify and position the body part, effectively reduces errors possibly caused by manual operation, obviously improves the accuracy and consistency of the identification and the positioning of each part of the human body, and provides more reliable image basis for medical diagnosis and treatment.
Fig. 8 is a schematic structural diagram of a training device for recognition model according to an exemplary embodiment of the present application. The training device of the recognition model can be arranged in a server, a terminal or other equipment to execute the training method of the recognition model in any embodiment of the application. As shown in fig. 8, the training apparatus of the recognition model may include:
A history image data acquisition module 81 for acquiring a history image dataset; the historical image data set comprises a plurality of CT (computed tomography) historical images, wherein the CT historical images carry position identification information and type labeling information of each body part;
A window information determining module 82 for determining a window width and a window level corresponding to each of the body parts; wherein, the window width and window level corresponding to different body parts are not identical;
the prediction information obtaining module 83 is configured to input the window width and window level corresponding to each body part and the historical image dataset into an initial recognition model, so as to obtain position prediction information and type prediction information corresponding to each of the plurality of CT historical images;
the identification model obtaining module 84 is configured to train the initial identification model based on the position prediction information and the type prediction information corresponding to each of the CT history images, and the position identification information and the type labeling information of each body part carried by the CT history images, so as to obtain an identification model.
In some embodiments, the location prediction information includes: position information of a plurality of identification frames; the type prediction information includes: probability of different body part types corresponding to each identification frame;
the prediction information obtaining module 83 is specifically configured to:
and inputting the window width and the window level corresponding to each body part and the CT historical images into the initial recognition model to obtain the position information of a plurality of recognition frames corresponding to each CT historical image and the probabilities of different body part types corresponding to each recognition frame.
In some embodiments, the identification model obtaining module 84 includes:
The first obtaining unit is used for filtering the probabilities of different body part types corresponding to each identification frame in the CT historical image based on the confidence threshold of the identification frame to obtain a CT historical filtered image;
A first determining unit, configured to determine a body part type corresponding to a maximum probability value in probabilities of different body part types corresponding to each identification frame in the CT history filtering image as the body part type corresponding to each identification frame;
the second obtaining unit is used for carrying out de-duplication processing on a plurality of identification frames of the same type in the CT historical filtering image based on an intersection ratio threshold value to obtain type identification frames corresponding to the types of all body parts in the CT historical image and position information corresponding to the type identification frames;
The third obtaining unit is configured to train the initial recognition model based on a type recognition frame corresponding to each body part type in the CT history images and position information corresponding to the type recognition frame, and position recognition information and type labeling information of each body part carried by the CT history images, so as to obtain the recognition model.
In some embodiments, the third deriving unit comprises:
A first determining subunit, configured to determine a position loss value based on position information corresponding to each type of identification frame in the CT history image and position identification information of each body part carried by the CT history image;
The second determining subunit is used for determining a type loss value based on the type of the body part corresponding to each type of identification frame in the CT historical image and the type labeling information of each body part carried by the CT historical image;
a third determining subunit configured to determine a loss value of the initial recognition model based on the position loss value and the type loss value;
the first obtaining subunit is configured to train the initial recognition model based on the loss value of the initial recognition model, so as to obtain the recognition model.
In some embodiments, the first obtaining subunit is specifically configured to:
Based on preset adjustment times, adjusting parameters of the initial recognition model to obtain a plurality of loss values corresponding to the initial recognition model; and taking the initial recognition model corresponding to the minimum loss value as the recognition model.
In some embodiments, after the historical image data obtaining module 81, before the prediction information obtaining module, the apparatus further includes:
the history sub-image obtaining module is used for cutting a plurality of CT history images to obtain a plurality of CT history sub-images;
And the spliced image obtaining module is used for selecting at least two of the CT history sub-images to be spliced and combined based on the size of the CT history image to obtain the CT spliced image.
In some embodiments, the preset number of adjustments is N, N being an integer;
The first obtaining subunit is specifically configured to: based on the plurality of CT historical images and the plurality of CT spliced images, adjust the parameters of the initial recognition model n times to obtain n first loss values, wherein N is greater than n and n is an integer; based on the plurality of CT historical images, adjust the parameters of the initial recognition model N-n times to obtain N-n second loss values; and determine the minimum loss value of the n first loss values and the N-n second loss values.
Fig. 9 is a schematic structural view of a body part recognition device according to an exemplary embodiment of the present application. The body part recognition device may be provided in a server, a terminal, or the like, and execute the body part recognition method according to any of the above embodiments of the present application. As shown in fig. 9, the body part identification device may include:
the CT image acquisition module is used for acquiring a CT image to be identified;
the identification module is used for inputting the CT image to be identified into an identification model to obtain the position information of an identification frame of each body part in the CT image to be identified and the type of the body part corresponding to the identification frame;
The recognition model is obtained through training by the training method of the recognition model.
It should be noted that, when the apparatus provided in the foregoing embodiments performs the corresponding method, the division into the above functional modules is merely illustrative; in practical applications, the above functions may be assigned to different functional modules as needed, i.e., the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus and the method provided in the foregoing embodiments belong to the same concept; the detailed implementation process is embodied in the method embodiments and is not repeated here.
The foregoing embodiment numbers of the present application are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
Referring to fig. 10, a schematic structural diagram of an electronic device is provided in an embodiment of the present application. As shown in fig. 10, the electronic device 10 may include: at least one processor 11, at least one network interface 14, a user interface 13, a memory 15, at least one communication bus 12.
Wherein the communication bus 12 is used to enable connected communication between these components.
The user interface 13 may include a Display screen (Display) and a Camera (Camera), and the optional user interface 13 may further include a standard wired interface and a standard wireless interface.
The network interface 14 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface), among others.
Wherein the processor 11 may comprise one or more processing cores. The processor 11 uses various interfaces and lines to connect the various parts of the electronic device 10, and performs the functions of the electronic device 10 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 15 and invoking data stored in the memory 15. Optionally, the processor 11 may be implemented in at least one hardware form among digital signal processing (DSP), field-programmable gate array (FPGA), and programmable logic array (PLA). The processor 11 may integrate one or a combination of a central processing unit (CPU), a graphics processing unit (GPU), a modem, and the like. The CPU mainly handles the operating system, user interface, application programs, and the like; the GPU renders and draws the content to be displayed by the display screen; the modem handles wireless communications. It will be appreciated that the modem may instead not be integrated into the processor 11 and may be implemented by a separate chip.
The memory 15 may include a random access memory (RAM) or a read-only memory (ROM). Optionally, the memory 15 includes a non-transitory computer-readable storage medium. The memory 15 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 15 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing the operating system, instructions for at least one function (such as a touch function, a sound playing function, or an image playing function), instructions for implementing the above method embodiments, and the like; the data storage area may store the data involved in the above method embodiments. The memory 15 may optionally also be at least one storage device located remotely from the processor 11. As shown in fig. 10, as a computer storage medium, the memory 15 may include an operating system, a network communication module, a user interface module, and a training application for the recognition model.
In the electronic device 10 shown in fig. 10, the user interface 13 is mainly used as an interface for providing input for a user, and obtains data input by the user; while processor 11 may be used to invoke a training application of the recognition model stored in memory 15 and specifically:
Acquiring a historical image dataset; the historical image data set comprises a plurality of CT (computed tomography) historical images, wherein the CT historical images carry position identification information and type labeling information of each body part;
Determining the window width and the window level corresponding to each body part; wherein, the window width and window level corresponding to different body parts are not identical;
Inputting the window width and window level corresponding to each body part and the historical image dataset into an initial recognition model to obtain position prediction information and type prediction information corresponding to each of a plurality of CT historical images;
And training the initial recognition model based on the position prediction information and type prediction information corresponding to each of the plurality of CT historical images, and the position identification information and type labeling information of each body part carried by the plurality of CT historical images, to obtain a recognition model.
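By way of an illustrative sketch only — the application does not prescribe a concrete windowing implementation — the preset window width and window level for a body part may be applied to a CT image's Hounsfield-unit values as follows. The preset values, the 8-bit output range, and the clipping behavior are all assumptions:

import numpy as np

def apply_window(hu_image: np.ndarray, window_width: float, window_level: float) -> np.ndarray:
    """Map CT Hounsfield-unit values into gray values using one body part's
    preset window width and window level; values outside the window are clipped."""
    lower = window_level - window_width / 2.0
    upper = window_level + window_width / 2.0
    clipped = np.clip(hu_image, lower, upper)
    return ((clipped - lower) / (upper - lower) * 255.0).astype(np.uint8)

# Illustrative presets only; the application leaves the per-part values unspecified.
presets = {"lung": (1500, -600), "abdomen": (400, 40), "brain": (80, 40)}
ct_slice = np.random.randint(-1000, 1000, size=(512, 512))  # stand-in CT slice in HU
lung_view = apply_window(ct_slice, *presets["lung"])

Windowing in this way separates the gray values of different tissue parts before the image is passed to the initial recognition model.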
In some embodiments, the position prediction information includes: position information of a plurality of identification frames; and the type prediction information includes: the probabilities of the different body part types corresponding to each identification frame;
when performing the step of inputting the window width and window level corresponding to each body part and the historical image dataset into the initial recognition model to obtain the position prediction information and type prediction information corresponding to each of the plurality of CT historical images, the processor 11 specifically performs:
And inputting the window width and window level corresponding to each body part and the plurality of CT historical images into the initial recognition model to obtain the position information of the plurality of identification frames corresponding to each CT historical image and the probabilities of the different body part types corresponding to each identification frame.
In some embodiments, when the processor 11 performs the training of the initial recognition model based on the position prediction information and type prediction information corresponding to each of the plurality of CT historical images, and the position identification information and type labeling information of each body part carried by the plurality of CT historical images, to obtain the recognition model, the specific implementation is:
Filtering probabilities of different body part types corresponding to each identification frame in the CT historical image based on the identification frame confidence threshold value to obtain a CT historical filtered image;
Determining the body part type corresponding to the maximum probability value in the probabilities of different body part types corresponding to the identification frames in the CT history filtering image as the body part type corresponding to the identification frames;
based on an intersection ratio threshold value, performing de-duplication processing on a plurality of identification frames of the same type in the CT historical filtering image to obtain type identification frames corresponding to the types of all body parts in the CT historical image and position information corresponding to the type identification frames;
And training the initial recognition model based on the type identification frames corresponding to the body part types in the plurality of CT historical images and the position information corresponding to the type identification frames, together with the position identification information and type labeling information of the body parts carried by the plurality of CT historical images, to obtain the recognition model.
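The confidence filtering, maximum-probability type selection, and intersection-ratio de-duplication described above may be sketched as follows. This is an illustrative assumption about the post-processing, not the application's prescribed implementation; the threshold values, the [x1, y1, x2, y2] box layout, and the array shapes are all assumed:

import numpy as np

def iou(a, b):
    """Intersection-over-union (intersection ratio) of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / (union + 1e-9)

def postprocess(boxes, probs, conf_thresh=0.5, iou_thresh=0.5):
    """boxes: (N, 4) identification frames as [x1, y1, x2, y2];
    probs: (N, C) probabilities of the C body part types for each frame.
    Applies the confidence filter, picks the maximum-probability type per
    frame, then de-duplicates frames of the same type by intersection ratio."""
    scores = probs.max(axis=1)                   # best type probability per frame
    types = probs.argmax(axis=1)                 # body part type with maximum probability
    keep = scores >= conf_thresh                 # identification-frame confidence threshold
    boxes, scores, types = boxes[keep], scores[keep], types[keep]

    kept = []
    for t in np.unique(types):                   # de-duplicate within one type at a time
        idx = np.where(types == t)[0]
        idx = idx[np.argsort(-scores[idx])]      # highest confidence first
        while idx.size:
            best = idx[0]
            kept.append(best)
            rest = idx[1:]
            if rest.size:
                ious = np.array([iou(boxes[best], boxes[r]) for r in rest])
                rest = rest[ious < iou_thresh]   # drop overlapping frames of the same type
            idx = rest
    kept = np.array(kept, dtype=int)
    return boxes[kept], types[kept], scores[kept]

The per-type de-duplication follows the standard non-maximum-suppression pattern, which matches the described use of an intersection ratio threshold over identification frames of the same type.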
In some embodiments, when the processor 11 performs the training of the initial recognition model based on the type identification frames corresponding to each body part type in the plurality of CT historical images and the position information corresponding to the type identification frames, together with the position identification information and type labeling information of each body part carried by the plurality of CT historical images, to obtain the recognition model, the specific implementation is:

Determining a position loss value based on the position information corresponding to each type identification frame in the CT historical image and the position identification information of each body part carried by the CT historical image;
determining a type loss value based on the type of the body part corresponding to each type of identification frame in the CT historical image and the type labeling information of each body part carried by the CT historical image;
Determining a loss value of the initial recognition model based on the position loss value and the type loss value;
Training the initial recognition model based on the loss value of the initial recognition model to obtain the recognition model.
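The application names a position loss and a type loss but does not fix their concrete forms. As a hedged sketch, a smooth-L1 box regression loss and a cross-entropy classification loss, combined with assumed weights, would realize the described combination:

import torch
import torch.nn.functional as F

def model_loss(pred_boxes, gt_boxes, type_logits, gt_types,
               pos_weight=1.0, type_weight=1.0):
    """Combine the position loss and the type loss into the loss value of the
    initial recognition model; the loss forms and weights are assumptions."""
    position_loss = F.smooth_l1_loss(pred_boxes, gt_boxes)   # identification-frame regression
    type_loss = F.cross_entropy(type_logits, gt_types)       # body part type classification
    return pos_weight * position_loss + type_weight * type_loss

# Example with stand-in tensors: 8 identification frames, 5 body part types.
loss = model_loss(torch.randn(8, 4), torch.randn(8, 4),
                  torch.randn(8, 5), torch.randint(0, 5, (8,)))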
In some embodiments, when the processor 11 performs the training on the initial recognition model based on the loss value of the initial recognition model to obtain the recognition model, the specific implementation is that:
based on preset adjustment times, adjusting parameters of the initial recognition model to obtain a plurality of loss values corresponding to the initial recognition model;
And taking the initial recognition model corresponding to the minimum loss value as the recognition model.
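A minimal sketch of this selection rule, assuming a PyTorch-style model whose forward pass returns predicted boxes and type logits; the batch layout and the `model_loss` function from the sketch above are assumptions:

import copy

def train(model, optimizer, batches, n_adjustments):
    """Adjust the model's parameters a preset number of times and keep the
    snapshot whose loss value is smallest, per the described selection rule.
    `batches` is assumed to yield (inputs, gt_boxes, gt_types) tuples."""
    best_loss, best_state = float("inf"), copy.deepcopy(model.state_dict())
    for _, (inputs, gt_boxes, gt_types) in zip(range(n_adjustments), batches):
        optimizer.zero_grad()
        pred_boxes, type_logits = model(inputs)          # assumed output layout
        loss = model_loss(pred_boxes, gt_boxes, type_logits, gt_types)
        loss.backward()
        optimizer.step()
        if loss.item() < best_loss:                      # track the minimum loss value
            best_loss = loss.item()
            best_state = copy.deepcopy(model.state_dict())
    model.load_state_dict(best_state)
    return model, best_loss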
In some embodiments, after acquiring the historical image dataset, and before inputting the window width and window level corresponding to each body part and the historical image dataset into the initial recognition model to obtain the position prediction information and type prediction information corresponding to each of the plurality of CT historical images, the processor 11 further performs:
Cropping the plurality of CT historical images to obtain a plurality of CT historical sub-images;
And selecting, based on the size of the CT historical image, at least two of the plurality of CT historical sub-images to be stitched and combined, to obtain a CT stitched image.
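A minimal sketch of the crop-and-stitch augmentation, assuming equal quadrant crops and a two-by-two stitched layout; the application fixes only that the stitched result follows the CT historical image's size, so the grid shape and the count of four sub-images are assumptions:

import numpy as np

def crop_quadrants(image: np.ndarray):
    """Cut one CT historical image into four equal sub-images (assumes even dimensions)."""
    h, w = image.shape[:2]
    return [image[:h // 2, :w // 2], image[:h // 2, w // 2:],
            image[h // 2:, :w // 2], image[h // 2:, w // 2:]]

def stitch(sub_images, rng=None):
    """Pick four sub-images at random and stitch them into one image of the
    original size; the two-by-two layout and the count of four are assumptions."""
    rng = rng or np.random.default_rng()
    picks = rng.choice(len(sub_images), size=4, replace=False)
    a, b, c, d = (sub_images[i] for i in picks)
    return np.vstack([np.hstack([a, b]), np.hstack([c, d])])

# Example: build a pool of sub-images from several slices, then one stitched image.
slices = [np.random.randint(-1000, 1000, size=(512, 512)) for _ in range(3)]
pool = [sub for s in slices for sub in crop_quadrants(s)]
mosaic = stitch(pool)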
In some embodiments, the preset number of adjustments is N, N being an integer;
when performing the step of adjusting the parameters of the initial recognition model based on the preset number of adjustments to obtain the plurality of loss values corresponding to the initial recognition model, the processor 11 specifically performs:
Adjusting the parameters of the initial recognition model n times based on the plurality of CT historical images and the plurality of CT stitched images to obtain n first loss values, wherein N is greater than n and n is an integer;
Adjusting the parameters of the initial recognition model N - n times based on the plurality of CT historical images to obtain N - n second loss values; and
Determining the minimum loss value among the n first loss values and the N - n second loss values.
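As a sketch under the same assumptions as above — a hypothetical `train_step` callable that performs one parameter adjustment and returns its loss value — the two-phase schedule reads:

def two_phase_adjustment(train_step, mixed_batches, history_batches, N, n):
    """n adjustments on CT historical plus CT stitched images, then N - n
    adjustments on CT historical images alone; returns the minimum loss value."""
    assert 0 < n < N
    first_losses = [train_step(batch) for batch in mixed_batches[:n]]         # n first loss values
    second_losses = [train_step(batch) for batch in history_batches[:N - n]]  # N - n second loss values
    return min(first_losses + second_losses)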
Referring to fig. 11, a schematic structural diagram of an electronic device is provided in an embodiment of the present application. As shown in fig. 11, the electronic device 20 may include: at least one processor 21, at least one network interface 24, a user interface 23, a memory 25, at least one communication bus 22.
Wherein the communication bus 22 is used to enable communication connections between these components.
The user interface 23 may include a display screen (Display) and a camera (Camera); optionally, the user interface 23 may further include a standard wired interface and a standard wireless interface.
The network interface 24 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface).
Wherein the processor 21 may include one or more processing cores. The processor 21 connects the various parts of the overall electronic device 20 through various interfaces and lines, and performs the various functions of the electronic device 20 and processes data by running or executing the instructions, programs, code sets, or instruction sets stored in the memory 25 and invoking the data stored in the memory 25. Alternatively, the processor 21 may be implemented in at least one hardware form among a digital signal processor (DSP), a field-programmable gate array (FPGA), and a programmable logic array (PLA). The processor 21 may integrate one or a combination of a central processing unit (CPU), a graphics processing unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU is used for rendering and drawing the content to be displayed on the display screen; and the modem is used for handling wireless communication. It will be appreciated that the modem may alternatively not be integrated into the processor 21 and may instead be implemented through a single separate chip.
The memory 25 may include a random access memory (RAM) or a read-only memory (ROM). Optionally, the memory 25 includes a non-transitory computer-readable storage medium. The memory 25 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 25 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the above-described method embodiments, and the like; and the data storage area may store the data involved in the above-described method embodiments, and the like. Optionally, the memory 25 may also be at least one storage device located remotely from the aforementioned processor 21. As shown in fig. 11, the memory 25, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and an application program for body part identification.
In the electronic device 20 shown in fig. 11, the user interface 23 is mainly used to provide an input interface for a user and to acquire the data input by the user, while the processor 21 may be configured to invoke the body part identification application stored in the memory 25 and specifically perform the following operations:
Acquiring a CT image to be identified;
inputting the CT image to be identified into an identification model to obtain the position information of an identification frame of each body part in the CT image to be identified and the type of the body part corresponding to the identification frame;
Wherein the recognition model is obtained by training with the training method of the recognition model described above.
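A usage sketch of this identification flow follows. The file name, tensor shape, and model output layout are assumptions, and `postprocess` refers to the post-processing sketch given earlier:

import torch

model = torch.load("recognition_model.pt")   # placeholder file name for the trained model
model.eval()
ct_tensor = torch.randn(1, 1, 512, 512)      # stand-in for a windowed CT image to be identified
with torch.no_grad():
    boxes, probs = model(ct_tensor)          # assumed output: frames and type probabilities
frames, part_types, scores = postprocess(boxes.squeeze(0).numpy(),
                                         probs.squeeze(0).numpy())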
Embodiments of the present application also provide a computer-readable storage medium having instructions stored therein which, when run on a computer or processor, cause the computer or processor to perform one or more of the steps of the embodiments shown in fig. 1 and fig. 4 above. The constituent modules of the above-described devices may likewise be stored in the computer-readable storage medium if they are implemented in the form of software functional units and sold or used as independent products.
In the above embodiments, the implementation may be realized in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted through a computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, or microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, hard disk, or magnetic tape), an optical medium (e.g., a digital versatile disc (DVD)), a semiconductor medium (e.g., a solid state disk (SSD)), or the like.
Those skilled in the art will appreciate that all or part of the flows of the above method embodiments may be accomplished by a computer program instructing the relevant hardware; the program may be stored in a computer-readable storage medium and, when executed, may include the flows of the method embodiments described above. The aforementioned storage medium includes a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or the like. The technical features of the above examples and embodiments may be combined arbitrarily provided no conflict arises.
The above-described embodiments merely illustrate preferred embodiments of the present application and are not intended to limit its scope. Various modifications and improvements made by those skilled in the art to the technical solution of the present application, without departing from its design spirit, shall fall within the protection scope defined by the claims of the present application.

Claims (12)

1. A method of training an identification model, the method comprising:
Acquiring a historical image dataset; the historical image data set comprises a plurality of CT (computed tomography) historical images, wherein the CT historical images carry position identification information and type labeling information of each body part;
Determining the window width and the window level corresponding to each body part; wherein, the window width and window level corresponding to different body parts are not identical;
Inputting the window width and window level corresponding to each body part and the historical image dataset into an initial recognition model to obtain position prediction information and type prediction information corresponding to each of a plurality of CT historical images;
And training the initial recognition model based on the position prediction information and type prediction information corresponding to each of the plurality of CT historical images, and the position identification information and type labeling information of each body part carried by the plurality of CT historical images, to obtain a recognition model.
2. The method of claim 1, wherein the position prediction information comprises: position information of a plurality of identification frames; and the type prediction information comprises: the probabilities of the different body part types corresponding to each identification frame;
and the inputting of the window width and window level corresponding to each body part and the historical image dataset into the initial recognition model to obtain the position prediction information and type prediction information corresponding to each of the plurality of CT historical images comprises:
inputting the window width and window level corresponding to each body part and the plurality of CT historical images into the initial recognition model to obtain the position information of the plurality of identification frames corresponding to each CT historical image and the probabilities of the different body part types corresponding to each identification frame.
3. The method according to claim 2, wherein training the initial recognition model based on the position prediction information and the type prediction information corresponding to each of the plurality of CT history images, and the position recognition information and the type labeling information of each body part carried by the plurality of CT history images to obtain a recognition model includes:
Filtering probabilities of different body part types corresponding to each identification frame in the CT historical image based on the identification frame confidence threshold value to obtain a CT historical filtered image;
Determining the body part type corresponding to the maximum probability value in the probabilities of different body part types corresponding to the identification frames in the CT history filtering image as the body part type corresponding to the identification frames;
based on an intersection ratio threshold value, performing de-duplication processing on a plurality of identification frames of the same type in the CT historical filtering image to obtain type identification frames corresponding to the types of all body parts in the CT historical image and position information corresponding to the type identification frames;
And training the initial recognition model based on the type recognition frames corresponding to the body part types in the CT historical images, the position information corresponding to the type recognition frames, the position recognition information and the type labeling information of the body parts carried by the CT historical images, and obtaining the recognition model.
4. The method according to claim 3, wherein training the initial recognition model based on the type recognition frame corresponding to each body part type in the plurality of CT history images and the position information corresponding to the type recognition frame, and the position recognition information and the type labeling information of each body part carried by the plurality of CT history images to obtain the recognition model includes:
Determining a position loss value based on position information corresponding to each type of identification frame in the CT historical image and position identification information of each body part carried by the CT historical image;
determining a type loss value based on the type of the body part corresponding to each type of identification frame in the CT historical image and the type labeling information of each body part carried by the CT historical image;
Determining a loss value for the initial recognition model based on the location loss value and the type loss value;
Training the initial recognition model based on the loss value of the initial recognition model to obtain the recognition model.
5. The method of claim 4, wherein training the initial recognition model based on the loss value of the initial recognition model to obtain the recognition model comprises:
based on preset adjustment times, adjusting parameters of the initial recognition model to obtain a plurality of loss values corresponding to the initial recognition model;
And taking the initial recognition model corresponding to the minimum loss value as the recognition model.
6. The method of claim 5, wherein after the acquiring of the historical image dataset and before the inputting of the window width and window level corresponding to each body part and the historical image dataset into the initial recognition model to obtain the position prediction information and type prediction information corresponding to each of the plurality of CT historical images, the method further comprises:
cropping the plurality of CT historical images to obtain a plurality of CT historical sub-images;
and selecting, based on the size of the CT historical image, at least two of the plurality of CT historical sub-images to be stitched and combined, to obtain a CT stitched image.
7. The method of claim 6, wherein the predetermined number of adjustments is N, N being an integer;
the step of adjusting the parameters of the initial recognition model based on the preset adjustment times to obtain a plurality of loss values corresponding to the initial recognition model comprises the following steps:
adjusting the parameters of the initial recognition model n times based on the plurality of CT historical images and the plurality of CT stitched images to obtain n first loss values, wherein N is greater than n and n is an integer;
adjusting the parameters of the initial recognition model N - n times based on the plurality of CT historical images to obtain N - n second loss values; and
determining the minimum loss value among the n first loss values and the N - n second loss values.
8. A method of body part identification, the method comprising:
Acquiring a CT image to be identified;
inputting the CT image to be identified into an identification model to obtain the position information of an identification frame of each body part in the CT image to be identified and the type of the body part corresponding to the identification frame;
wherein the recognition model is trained by the training method of the recognition model according to any one of claims 1 to 7.
9. A training device for identifying a model, the device comprising:
a historical image data acquisition module for acquiring a historical image data set; the historical image data set comprises a plurality of CT (computed tomography) historical images, wherein the CT historical images carry position identification information and type labeling information of each body part;
a window information determining module for determining a window width and a window level corresponding to each body part; wherein, the window width and window level corresponding to different body parts are not identical;
The prediction information obtaining module is used for inputting the window width and window level corresponding to each body part and the historical image dataset into the initial recognition model to obtain position prediction information and type prediction information corresponding to each of the plurality of CT historical images;
The identification model obtaining module is used for training the initial identification model based on the position prediction information and the type prediction information corresponding to each of the CT historical images and the position identification information and the type labeling information of each body part carried by the CT historical images to obtain an identification model.
10. A body part identification device, the device comprising:
the CT image acquisition module is used for acquiring a CT image to be identified;
the identification module is used for inputting the CT image to be identified into an identification model to obtain the position information of an identification frame of each body part in the CT image to be identified and the type of the body part corresponding to the identification frame;
wherein the recognition model is trained by the training method of the recognition model according to any one of claims 1 to 7.
11. An electronic device, the electronic device comprising:
A memory for storing executable program code;
A processor for calling and running the executable program code from the memory, causing the electronic device to perform the method of any one of claims 1 to 7 or claim 8.
12. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed, implements the method of any one of claims 1 to 7 or claim 8.
CN202410108047.3A 2024-01-25 2024-01-25 Training method of recognition model, body part recognition method, body part recognition device and medium Pending CN118038487A (en)

Priority Applications (1)

Application Number: CN202410108047.3A
Priority Date / Filing Date: 2024-01-25
Title: Training method of recognition model, body part recognition method, body part recognition device and medium

Publications (1)

Publication Number: CN118038487A
Publication Date: 2024-05-14

Family ID: 90997698

Country Status (1)

CN (1) CN118038487A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination