CN110648327B - Automatic ultrasonic image video tracking method and equipment based on artificial intelligence - Google Patents

Automatic ultrasonic image video tracking method and equipment based on artificial intelligence

Info

Publication number
CN110648327B
CN110648327B
Authority
CN
China
Prior art keywords
image
target
neural network
frame
position information
Prior art date
Legal status
Active
Application number
CN201910947137.0A
Other languages
Chinese (zh)
Other versions
CN110648327A (en)
Inventor
殷晨
赵明昌
Current Assignee
Chison Medical Technologies Co ltd
Original Assignee
Chison Medical Technologies Co ltd
Priority date
Filing date
Publication date
Application filed by Chison Medical Technologies Co., Ltd.
Priority to CN201910947137.0A priority Critical patent/CN110648327B/en
Publication of CN110648327A publication Critical patent/CN110648327A/en
Application granted granted Critical
Publication of CN110648327B publication Critical patent/CN110648327B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T7/0012 Biomedical image inspection
    • G06N3/045 Combinations of networks (neural network architecture)
    • G06T7/66 Analysis of geometric attributes of image moments or centre of gravity
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T2207/10016 Video; Image sequence
    • G06T2207/10132 Ultrasound image
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30096 Tumor; Lesion

Abstract

The invention discloses an artificial-intelligence-based method and device for automatically tracking ultrasonic image video. Through the training and use of neural networks, the target position in an ultrasonic image is identified and tracked automatically, which saves the doctor's manual marking time and obtains the target position accurately and quickly. While tracking in real time, a TIC curve can be generated in real time in combination with the current contrast image, improving the efficiency and accuracy of target identification and tracking in the ultrasonic image.

Description

Automatic ultrasonic image video tracking method and equipment based on artificial intelligence
Technical Field
The invention relates to the technical field of neural networks, in particular to an ultrasonic image video automatic tracking method and device based on artificial intelligence.
Background
In routine ultrasound examinations of organs such as the kidney and liver, pathological tissues such as suspicious masses, cysts and hemangiomas need to be characterized with contrast imaging: whether a lesion is benign or malignant is assessed by analyzing how the time-intensity curve (TIC) of the contrast agent changes within the pathological tissue.
In the existing analysis workflow, the doctor manually marks the suspected lesion tissue, and the contrast image intensity of each frame is then calculated to obtain the corresponding TIC curve. This traditional method is limited by the doctor's manual marking of the target of interest: it is inefficient, and as the probe moves or changes angle and the organ pulsates, the marked target position may drift from frame to frame, so that a certain error exists in the final result and the accuracy of the calculation is affected.
Disclosure of Invention
The embodiment of the invention aims to provide an artificial-intelligence-based ultrasonic image video automatic tracking method, an ultrasonic device and a storage medium, so as to solve the problem in the prior art of low calculation efficiency and accuracy caused by the doctor's manual marking.
In order to solve the technical problem, the embodiment of the application adopts the following technical scheme: an ultrasonic image tracking method based on a neural network, comprising the following steps: acquiring an Nth frame image of an ultrasonic image to be processed as an initial frame image of the ultrasonic image to be processed; determining an output parameter of a first neural network by taking the Nth frame image as an input parameter of the first neural network, wherein the output parameter of the first neural network is first position information of a target in the Nth frame image, and the first neural network is a trained neural network for identifying the position of the target in an ultrasonic image; and taking the first position information as a first input parameter of a second neural network, taking the (N+1)th frame image of the ultrasonic image to be processed as a second input parameter of the second neural network, and tracking second position information of the target in the (N+1)th frame through the second neural network, wherein N is a positive integer and the second neural network is a trained neural network for identifying the similarity between images.
Further, determining the output parameter of the first neural network includes: acquiring at least one piece of position information of the target in the Nth frame image output by the first neural network; determining a confidence corresponding to each piece of position information; and determining the first position information of the target in the Nth frame image according to the position information whose confidence is greater than a first threshold.
Further, the first position information includes at least the following information: the coordinates of the center point of the target in the Nth frame image, and the length and the width of the target in the Nth frame image.
Further, taking the (N+1)th frame image of the ultrasonic image to be processed as a second input parameter of a second neural network includes: determining a first search area on the (N+1)th frame image, wherein the center point coordinate of the first search area is the center point coordinate of the target in the Nth frame image, and the size of the first search area is a preset multiple of the size of the target in the Nth frame image; and taking the first search area as the second input parameter.
Further, tracking the second position information of the target in the (N+1)th frame through the second neural network includes: determining, by the second neural network, a sub-region in the first search region having similarity to the image at the first position information in the Nth frame; and determining the position information of the sub-region as the second position information of the target in the (N+1)th frame.
Further, determining, by the second neural network, a sub-region in the first search region having similarity to the image at the first position information in the Nth frame includes: calculating the distance between each pixel point in the first search area and the image center point at the first position information in the Nth frame; and determining a region composed of all pixel points whose distance is smaller than a second threshold in the first search region as the sub-region having similarity to the image at the first position information in the Nth frame.
Further, tracking the second position information of the target in the (N+1)th frame through the second neural network includes: calculating similarity scores between each point in the first search area and the center point of the target in the Nth frame image, and taking the coordinate of the point with the highest similarity score as the coordinate of the center point of the target in the (N+1)th frame image; determining the length and the width of the target in the (N+1)th frame image, wherein the length and the width of the target in the (N+1)th frame image are the same as the length and the width of the target in the Nth frame image; and determining the coordinates of the center point of the target in the (N+1)th frame image and the length and width of the target in the (N+1)th frame image, and outputting them as the position information of the target in the (N+1)th frame image.
The embodiment of the invention also discloses a training method of the neural network, which comprises the following steps: acquiring a first ultrasonic sample image, wherein the position of a target is marked in the ultrasonic sample image; training a first neural network according to the ultrasonic sample image, wherein the trained first neural network is used for identifying first position information of the target in the ultrasonic image; obtaining a second neural network, wherein the second neural network is used for calculating the similarity of two frames of images; acquiring a second ultrasonic sample image, wherein the second ultrasonic sample image is marked with the positions of the targets in two adjacent frames; and training the second neural network according to the second ultrasonic sample image, wherein the trained second neural network is used for identifying and obtaining the position of the target in the current frame according to the position of the target in the previous frame.
The embodiment of the invention also discloses an ultrasonic device, which at least comprises a memory and a processor, wherein a computer program is stored on the memory, and the processor implements the steps of the above artificial-intelligence-based ultrasonic image video automatic tracking method when executing the computer program on the memory.
The embodiment of the invention also discloses a storage medium on which a computer program is stored; when the computer program is executed by a processor, the artificial-intelligence-based ultrasonic image video automatic tracking method is executed.
The embodiment of the invention has the beneficial effects that: through the training and use of the neural networks, the target position in the ultrasonic image is identified and tracked automatically, which saves the doctor's manual marking time and acquires the target position accurately and quickly; while tracking in real time, a TIC curve can be generated in real time in combination with the current contrast image, improving the efficiency and accuracy of target identification and tracking in the ultrasonic image.
Drawings
FIG. 1 is a flowchart of a method for processing ultrasound images according to a first embodiment of the present invention;
FIG. 2 is a flowchart of a method for training a neural network according to a second embodiment of the present invention;
FIG. 3 is a diagram illustrating object labeling according to a second embodiment of the present invention;
FIG. 4 is a diagram illustrating a second neural network according to a second embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an ultrasonic apparatus in a third embodiment of the present invention.
Detailed Description
Various aspects and features of the present application are described herein with reference to the accompanying drawings.
It will be understood that various modifications may be made to the embodiments of the present application. Accordingly, the foregoing description should not be construed as limiting, but merely as exemplifications of embodiments. Those skilled in the art will envision other modifications within the scope and spirit of the application.
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the application and, together with a general description of the application given above and the detailed description of the embodiments given below, serve to explain the principles of the application.
These and other characteristics of the present application will become apparent from the following description of preferred forms of embodiment, given as non-limiting examples, with reference to the attached drawings.
It should also be understood that, although the present application has been described with reference to some specific examples, a person of skill in the art shall certainly be able to achieve many other equivalent forms of application, having the characteristics as set forth in the claims and hence all coming within the field of protection defined thereby.
The above and other aspects, features and advantages of the present application will become more apparent in view of the following detailed description when taken in conjunction with the accompanying drawings.
Specific embodiments of the present application are described hereinafter with reference to the accompanying drawings; however, it is to be understood that the disclosed embodiments are merely exemplary of the application, which can be embodied in various forms. Well-known and/or repeated functions and constructions are not described in detail to avoid obscuring the application with unnecessary detail. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a basis for the claims and as a representative basis for teaching one skilled in the art to variously employ the present application in virtually any appropriately detailed structure.
The description may use the phrases "in one embodiment," "in another embodiment," "in yet another embodiment," or "in other embodiments," which may each refer to one or more of the same or different embodiments in accordance with the application.
The first embodiment of the present invention provides an artificial intelligence-based method for automatically tracking an ultrasound image video, which can be implemented by an ultrasound device with a computing processing function or a computer or a server connected to the ultrasound device, and the flowchart is shown in fig. 1, and mainly includes steps S101 to S103:
s101, acquiring an Nth frame image of the to-be-processed ultrasonic image as a starting frame image of the to-be-processed ultrasonic image.
The ultrasonic image to be processed may be an ultrasonic image fed back in real time by the ultrasonic probe after it is connected to the ultrasonic device, or it may be a section of an existing ultrasonic image that has already been scanned and stored in a storage device or in the cloud. Before the lesion is identified and tracked, the ultrasonic image to be processed is preprocessed, which mainly includes removing sensitive parts and keeping only the effective ultrasonic image; the ultrasonic image to be processed in video format is then decoded into single-frame images, among which the Nth frame image is processed first, N being a positive integer. Preferably, the Nth frame image is generally the first frame image of the ultrasonic image, but it may also be any frame image selected by the user according to the actual scanning situation.
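By way of non-limiting illustration, this preprocessing and decoding step might be sketched as follows, assuming OpenCV is available; the file path and cropping coordinates in the usage comment are hypothetical:

```python
import cv2

def decode_ultrasound_video(path, crop=None):
    """Decode a stored ultrasound clip into single-frame images.

    crop: optional (x, y, w, h) region keeping only the effective
    ultrasound image (sensitive parts such as on-screen patient
    information are cut away before identification and tracking).
    """
    frames = []
    cap = cv2.VideoCapture(path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if crop is not None:
            x, y, w, h = crop
            frame = frame[y:y + h, x:x + w]
        frames.append(frame)
    cap.release()
    return frames

# Hypothetical usage: the Nth frame (here the first) becomes the start frame.
# frames = decode_ultrasound_video("exam_clip.avi", crop=(100, 80, 640, 480))
# start_frame = frames[0]
```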
S102, the Nth frame of image is used as an input parameter of a first neural network, an output parameter of the first neural network is determined, and the output parameter of the first neural network is first position information of a target in the Nth frame of image.
The target to be identified and tracked in this embodiment is mainly a lesion, such as a suspected lump, a cyst, and a hemangioma, which may exist in an organ, and the "lesion" is directly used for description in the subsequent embodiments.
The first neural network mentioned in this embodiment is a multilayer convolutional neural network, which is intended to be used for identifying the position of a lesion in an ultrasound image, is a basis for subsequently tracking the position of the lesion, and can be trained in advance by a computer or a server before use. The first neural network in this embodiment is a convolutional neural network that has been trained, and according to different types and characteristics of a lesion to be identified, corresponding first neural networks may be trained for different types of lesions, for example, a neural network for identifying a cyst, a neural network for identifying a tumor, a neural network for identifying a hemangioma, and different training samples may be used when training the neural networks for different types of lesions.
After the first neural network is trained, it can identify the position of the lesion in the ultrasonic image to be processed. Its input parameter is the Nth frame image of the ultrasonic image to be processed, i.e. the image of the start frame from which the user wishes identification to begin, preferably the normalized first frame image of the ultrasonic image to be processed. After calculation by the first neural network, the output result is determined as the first position information of the lesion in the Nth frame image, which completes the identification of the lesion in the ultrasonic image to be processed. Specifically, the first position information includes at least the following information: the coordinates of the center point of the lesion in the Nth frame image, and the length and the width of the lesion in the Nth frame image; by determining the center point of the lesion together with its length and width, the specific position and size of the lesion in the start frame image can be determined. It should be noted that if a rectangular mark is used to mark the lesion during identification, the first position information includes the length and width of the lesion in the Nth frame image; if a mark of another shape is used, the parameters corresponding to that shape are used as the first position information of the lesion. For example, if a circle is used as the mark, the circle center coordinates (the coordinates of the center point) and the radius of the lesion are used as the first position information; if an ellipse is used as the mark, the coordinates of the foci of the ellipse together with its major axis and minor axis are used as the first position information.
Further, if the first neural network simultaneously outputs position information for more than one candidate lesion in the Nth frame image, the final lesion position is determined by the confidence of each piece of position information. The confidence represents the probability that what the neural network identified is indeed the lesion: the higher the confidence, the more accurate the identification. Therefore, when the first neural network outputs the position information of a plurality of candidates, the confidence corresponding to each piece of position information is determined, and the position information whose confidence is greater than a first threshold is determined as the first position information of the lesion in the Nth frame image. If the confidences of several pieces of position information are all greater than the first threshold, the position information with the maximum confidence among them is selected as the first position information of the lesion in the Nth frame image.
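A minimal sketch of this confidence-based selection (the detection tuple format and the threshold value are assumptions):

```python
def select_lesion(detections, first_threshold=0.5):
    """Pick the first position information from the candidates output by
    the first neural network for the Nth frame.

    detections: list of (cx, cy, w, h, confidence) tuples.
    Returns the candidate with the maximum confidence among those whose
    confidence exceeds the first threshold, or None if none qualifies.
    """
    qualified = [d for d in detections if d[4] > first_threshold]
    if not qualified:
        return None
    return max(qualified, key=lambda d: d[4])
```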
S103, taking the first position information as a first input parameter of a second neural network, taking the (N+1)th frame image of the ultrasonic image to be processed as a second input parameter of the second neural network, and tracking second position information of the target in the (N+1)th frame through the second neural network.
The second neural network mentioned in this embodiment is also a multi-layer convolutional neural network, intended for real-time tracking of the lesion position in the ultrasound image. The second neural network in this embodiment is a convolutional neural network that has already been trained. Although the goal is real-time tracking of the lesion position, what is essentially done is to compare the similarity between the target to be tracked and the image to be searched so as to determine the position of the target in the image to be searched. The second neural network is therefore a neural network that can measure image similarity: it does not need to depend on what the target to be tracked is, and only needs to output a relatively high confidence for a position with a relatively high similarity, so the training target can be arbitrary. In this embodiment, a known stable tracking model from computer vision may be selected and then fine-tuned to improve its generalization capability, so as to further improve the tracking capability of the tracking model in this embodiment.
After the second neural network training is finished, the lesion position output by the first neural network can be correspondingly tracked and identified in subsequent images. In practical use, the first input parameter of the second neural network is the position information of the lesion in the Nth frame image, which mainly comprises the center point P0 of the lesion in the Nth frame image, with coordinates (cx1, cy1), and the length w1 and width h1 of the lesion in the Nth frame image. In order to reduce the actual amount of calculation of the second neural network, a first search region may first be determined on the (N+1)th frame image and used as the second input parameter of the second neural network, wherein the center point coordinate of the first search region is the center point coordinate of the lesion in the Nth frame image, and the size of the first search region is a preset multiple of the size of the lesion in the Nth frame image; for example, the length and width of the first search region may be set to the preset multiple of the length and width of the lesion in the Nth frame image.
In practical use, the first position information of the lesion output by the first neural network in the Nth frame image may also change according to the identifier used for marking the lesion, and the calculation of the first search region may be adjusted accordingly. For example, if a circle is used as the mark of the lesion, the first position information output by the first neural network is the circle center coordinates (the coordinates of the center point) and the radius of the lesion; the corresponding first search area is then adjusted to a circle, with the center point coordinates of the first search area set to the center coordinates of the lesion and the radius of the first search area set to a preset multiple of the lesion radius in the first position information, and so on.
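A minimal sketch of deriving a rectangular first search region from the first position information (the value of the preset multiple and the clamping to the image bounds are assumptions):

```python
def first_search_region(cx, cy, w, h, img_w, img_h, multiple=2.0):
    """Centre the first search region on the lesion centre from frame N
    and make its size a preset multiple of the lesion size, clamped to
    the bounds of the (N+1)th frame image."""
    sw, sh = w * multiple, h * multiple
    x0 = max(0, int(cx - sw / 2))
    y0 = max(0, int(cy - sh / 2))
    x1 = min(img_w, int(cx + sw / 2))
    y1 = min(img_h, int(cy + sh / 2))
    return x0, y0, x1, y1
```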
Taking a rectangular lesion mark as an example, when tracking the second position information of the lesion in the (N+1)th frame, a sub-region having similarity with the image at the first position information of the Nth frame is first determined within the first search region by the second neural network, and the position information of that sub-region is then determined as the second position information of the lesion in the (N+1)th frame. It should be noted that the similarity here refers to the structural similarity of the two image patches: the sub-region determined by the second neural network should have a structure similar to the image of the lesion at the first position information in the Nth frame, so that the sub-region can be taken as the position of the lesion in the (N+1)th frame. Specifically, the way the sub-region in the (N+1)th frame is determined in this embodiment is according to the distance between each pixel point in the first search region and the image center point at the first position information in the Nth frame: when the distance calculated for a pixel point is smaller than a preset second threshold, the pixel point is considered to fall within the range of the lesion in the first search region, and the region formed by all pixel points whose distance is smaller than the second threshold is taken as the sub-region. In this way the specific position of the lesion in the first search region is obtained correspondingly; the position information of the sub-region is the second position information of the lesion in the (N+1)th frame, with the center point coordinate of the sub-region as the center point coordinate in the second position information and the length and width of the sub-region as the length and width in the second position information. Specifically, the second threshold may be determined according to the first position information in combination with the preset multiple between the first search region and the lesion in the Nth frame image:
s(p) = +1, if k·|p − c| ≤ R; s(p) = −1, otherwise

wherein c is the coordinate of the center point of the lesion in the input image, p is the position of each point in the image output by the neural network (|p − c| being its distance from the center point), R is the radius of the lesion in the input image relative to the center point, and k is the scaling factor of the neural network; the second threshold is therefore R/k.
In practical implementation, the second neural network outputs the value of s(p), that is, a score map is obtained. After the score map is obtained, the second position information is determined according to the point with the highest score. For example, the coordinate of the point with the highest similarity score is taken as the coordinate of the lesion center point in the (N+1)th frame image: if the point with the highest similarity score in the score map is at (cx2, cy2), the scaling coefficient k of the second neural network (namely the preset multiple) is combined with the center point (cx1, cy1) of the lesion in the Nth frame image to determine the center point P'0 of the lesion in the (N+1)th frame image, that is, the score-map coordinates are mapped back to the image plane through k, of the form

cx2' = cx1 + k·cx2, cy2' = cy1 + k·cy2

with (cx2, cy2) taken as the offset of the peak from the score-map center. Then the length w2 and the width h2 of the lesion in the (N+1)th frame image are determined according to the length w1 and the width h1 of the lesion in the Nth frame image; in this embodiment the length and width of the lesion in the (N+1)th frame image are the same as those in the Nth frame image, that is, w2 = w1 and h2 = h1. Finally, the coordinates of the center point of the lesion in the (N+1)th frame image, together with the length w2 and the width h2 of the lesion in the (N+1)th frame image, are output as the second position information in the (N+1)th frame image.
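As a non-limiting sketch of this step, assuming (as one plausible reading of the description) that the peak offset from the score-map center, scaled by the factor k, gives the displacement of the lesion center:

```python
import numpy as np

def update_position(score_map, prev_cx, prev_cy, prev_w, prev_h, k):
    """Map the highest-scoring point of the similarity score map back to
    image coordinates to obtain the lesion centre in frame N+1.

    Assumes the peak offset from the score-map centre, scaled by the
    network scaling factor k, gives the displacement of the lesion
    centre; length and width are kept equal to those of frame N.
    """
    py, px = np.unravel_index(np.argmax(score_map), score_map.shape)
    off_x = px - (score_map.shape[1] - 1) / 2.0
    off_y = py - (score_map.shape[0] - 1) / 2.0
    new_cx = prev_cx + k * off_x
    new_cy = prev_cy + k * off_y
    return new_cx, new_cy, prev_w, prev_h
```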
It should be noted that the Nth frame image described in this embodiment is the start frame image of the ultrasound image to be processed at the time of identification; it may be the first frame image of the ultrasound image to be processed or any frame image selected by the user. The position information of the lesion in this frame is determined by the first neural network. Tracking and identification of the lesion in the frames after the start frame are performed by the second neural network, which calculates the similarity between the current frame and the position information of the lesion in the previous frame. That is, the position information of the lesion in the (N+1)th frame is determined according to the position information of the lesion in the Nth frame, the position information of the lesion in the (N+2)th frame is determined according to the position information of the lesion in the (N+1)th frame, and so on, until the position information of the lesion in the last frame image of the ultrasound image to be identified is determined, or until the position information of the lesion in a last frame image preset by the user is determined. The sketch below illustrates this frame-by-frame propagation.
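By way of non-limiting illustration, the propagation described above can be sketched as follows; the callables first_net and second_net are hypothetical wrappers around the two trained networks:

```python
def track_sequence(frames, first_net, second_net, start_index=0):
    """Run the two-network pipeline over a decoded frame sequence: the
    first network locates the lesion in the start frame, the second
    network propagates the position to every following frame using the
    position found in the previous frame."""
    positions = {start_index: first_net(frames[start_index])}
    for i in range(start_index + 1, len(frames)):
        positions[i] = second_net(positions[i - 1], frames[i])
    return positions
```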
According to the above method, through the training and use of the neural networks, the lesion position in the ultrasonic image is identified and tracked automatically, which saves the doctor's manual marking time and acquires the lesion position accurately and quickly; while tracking in real time, a TIC curve can be generated in real time in combination with the current contrast image, improving the efficiency and accuracy of target identification and tracking in the ultrasonic image.
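As a non-limiting sketch of how the tracked position could be combined with the contrast frames to accumulate a TIC curve (taking the mean pixel intensity inside the tracked box as the intensity measure is an assumption):

```python
import numpy as np

def append_tic_point(tic, contrast_frame, cx, cy, w, h, frame_index, frame_rate):
    """Append one (time, intensity) sample to the time-intensity curve,
    using the mean contrast intensity inside the tracked lesion box."""
    x0, y0 = int(cx - w / 2), int(cy - h / 2)
    roi = contrast_frame[y0:y0 + int(h), x0:x0 + int(w)]
    tic.append((frame_index / frame_rate, float(np.mean(roi))))
    return tic
```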
A second embodiment of the present invention provides a method for training a neural network, a flowchart of which is shown in fig. 2, and mainly includes:
s201, acquiring a first ultrasonic sample image, wherein the position of a target is marked in the ultrasonic sample image;
s202, training a first neural network according to the ultrasonic sample image, wherein the trained first neural network is used for identifying first position information of a target in the ultrasonic image;
s203, obtaining a second neural network, wherein the second neural network is used for calculating the similarity of the two frames of images;
s204, acquiring a second ultrasonic sample image, wherein the second ultrasonic sample image is marked with the positions of targets in two adjacent frames;
s205, training a second neural network according to the second ultrasonic sample image, wherein the trained second neural network is used for identifying and obtaining the position of the target in the current frame according to the position of the target in the previous frame.
The training process of the first neural network is described below in conjunction with fig. 3.
The purpose of the first neural network is to identify the lesion position in the ultrasound image, so a large number of historical ultrasound images with known lesion positions are required as the first ultrasound sample image to train the first neural network. It should be noted that the plurality of first ultrasound sample images selected here should be historical ultrasound images corresponding to the same type of lesion, so that the trained neural network can be used to identify the location of the type of lesion in the ultrasound images.
For example, referring to FIG. 3, the white rectangular frame in the figure is the position of a lesion in an ultrasonic sample image. If a plurality of lesions exist in an ultrasonic sample image at the same time, their positions can all be marked, and the marking can be carried out in a sequence chosen to improve sensitivity and specificity. Note that the method of marking the lesion position is not limited to a rectangular frame; a circular frame or a marking frame of another shape may also be used according to actual use conditions, personal habits and other factors.
During training, the neural network is first initialized and its weight parameters are set to random values between 0 and 1. The Nth frame image of each training sample, without the lesion position mark, is then normalized, and the normalized image is used as the actual input parameter when training the first neural network. After an output result is obtained, it is compared with the corresponding image marked with the actual position information of the lesion; a first loss function corresponding to the first neural network is determined according to the comparison result, and the first neural network is optimized iteratively, continuously adjusting its weights until the value of the first loss function is minimal. At this point the first neural network is considered trained, and the trained first neural network can be used to identify the first position information of the lesion in an ultrasonic image.
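A minimal sketch of such a training loop, assuming PyTorch; the optimizer, learning rate and the concrete form of the first loss function are assumptions supplied by the caller:

```python
import torch

def train_first_network(model, loader, loss_fn, epochs=50, lr=1e-4):
    """Generic training loop for the first (detection) network.

    loader yields (image, target_position) batches built from the first
    ultrasound sample images; loss_fn compares the predicted position
    with the annotated lesion position (the concrete form of the first
    loss function is not fixed here and is supplied by the caller).
    """
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for image, target in loader:
            optimizer.zero_grad()
            prediction = model(image)
            loss = loss_fn(prediction, target)
            loss.backward()
            optimizer.step()
    return model
```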
Further, the second neural network is a network used for calculating the similarity of two frames of images. When the second neural network is trained, the training sample used is a second ultrasonic sample image marked with the positions of the lesion in two adjacent frames; the second neural network is trained according to the second ultrasonic sample image, and the trained second neural network is used to identify the position of the lesion in the current frame according to the position of the lesion in the previous frame. The training process of the second neural network is described below in conjunction with FIG. 4.
The second neural network has two input parameters. The first input parameter x is the image at the position of the target in the previous frame of the second ultrasonic sample image; this position may be marked manually, or it may be determined from the lesion position output by the first neural network when the previous frame is the start frame. After cropping and normalization, it is used as the first input parameter x in the training process of the second neural network. The second input parameter y is the current frame image of the training sample. The purpose of the second neural network is to continuously determine the position in the ultrasonic image that is most similar to the target to be tracked. In order to reduce the amount of calculation, a search area may be defined in the current frame image: since the lesion position in this embodiment does not move greatly in real time and only changes slightly with the scanning angle or the pulsation of internal organs, it is sufficient to expand the lesion position determined in the previous frame, taken as the reference, to a small surrounding range, which ensures that the lesion does not move out of this range while reducing the amount of calculation of the neural network. When the lesion is marked with a mark of another shape, its position information is represented by the parameters of that shape and the search area is adjusted correspondingly; the parameters in the score map calculation formula described below should also be adjusted adaptively according to the different lesion marks, with the specific adjustment determined by the lesion mark selected by the user, which is not limited in this embodiment.
The first input parameter x and the second input parameter y pass through the same network structure with identical, shared training parameters, so the network can be regarded as a twin (Siamese) network, as shown in FIG. 4. The outputs of the two input parameters after passing through the neural network are φ(x) and φ(y) respectively. The two are then combined by a convolution operation, similar to a convolution layer in which φ(x) can be understood as the parameters of the convolution layer, and a similarity score map is finally obtained, namely the output result of the second neural network. The score map represents the similarity between the target to be tracked and the search area: a position with a larger score is more similar to the target to be tracked, and the position with the largest score is the center of the target's location within the search area.
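A minimal sketch of this twin-network cross-correlation, assuming PyTorch; the shared backbone network is a hypothetical stand-in for the trained convolutional layers:

```python
import torch
import torch.nn.functional as F

def similarity_score_map(backbone, exemplar, search):
    """Twin-network similarity: both inputs pass through the same
    backbone with shared weights, and the exemplar embedding is used as
    the kernel of a convolution over the search embedding
    (cross-correlation), yielding the similarity score map.

    exemplar: 1 x C x Hz x Wz patch of the target from the previous frame
    search:   1 x C x Hx x Wx search region from the current frame
    """
    phi_x = backbone(exemplar)      # embedding of the target patch
    phi_y = backbone(search)        # embedding of the search region
    return F.conv2d(phi_y, phi_x)   # score map of shape 1 x 1 x H' x W'
```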
It should be noted that the score map used as a reference in training the second neural network needs to be manually generated according to the training samples. Since the images used as training input all have a central point, the corresponding score map is calculated as:

s(p)' = +1, if k·|p − c| ≤ R; s(p)' = −1, otherwise    formula (1)

wherein c is the coordinate of the center point of the lesion in each input image, p is the position of each point in the image output by the neural network (|p − c| being its distance from the center point), R is the radius of the lesion in each input image relative to the center point, and k is the scaling factor of the neural network.
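A minimal sketch of generating this reference score map with NumPy (the map size and the coordinate convention are assumptions):

```python
import numpy as np

def ground_truth_score_map(shape, c, R, k):
    """Build the reference score map of formula (1): a point p gets +1
    when its distance from the centre c, scaled by the network scaling
    factor k, lies within the lesion radius R, and -1 otherwise.

    shape: (height, width) of the score map; c: (cx, cy) centre point.
    """
    ys, xs = np.indices(shape)
    dist = np.hypot(xs - c[0], ys - c[1])
    return np.where(k * dist <= R, 1.0, -1.0)
```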
The manually calculated score map S(p)' is used as the reference in the second loss function of the second neural network. The second loss function in this embodiment is calculated by summing the cross entropy over the score map and averaging, with the specific formula:

loss(S) = (1/|D|) · Σ_{p∈D} f(p)    formula (2)

where D is the set of positions in the score map and f(p) is the cross entropy function:

f(p) = −[S(p)·log S(p)' + (1 − S(p))·log(1 − S(p)')]    formula (3)

Here S(p) is the score map output by the second neural network and S(p)' is the real score map obtained manually. f(p) is calculated according to formula (3), loss(S) is obtained correspondingly, and the second neural network is continuously optimized according to the value of loss(S) until it is minimal, at which point the training of the second neural network is finished.
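A minimal sketch of this loss, assuming NumPy; it uses the conventional cross-entropy orientation (the ground-truth label weighting the logarithm of the prediction), maps the {-1, +1} labels to [0, 1], and clips the predictions to keep the logarithms finite; these choices are assumptions beyond formulas (2) and (3):

```python
import numpy as np

def score_map_loss(S_pred, S_true):
    """Cross entropy summed over the score map and averaged, in the
    spirit of formulas (2) and (3). The ground-truth labels weight the
    logarithm of the prediction, the {-1, +1} labels are mapped to
    [0, 1], and predictions are clipped to keep the logarithms finite;
    these choices are assumptions beyond the description."""
    p = np.clip((S_pred + 1.0) / 2.0, 1e-7, 1.0 - 1e-7)
    t = (S_true + 1.0) / 2.0
    f = -(t * np.log(p) + (1.0 - t) * np.log(1.0 - p))
    return float(np.mean(f))
```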
Assuming that the coordinates of the point with the highest score in the score map are (cx2, cy2), the scaling factor k of the second neural network (i.e., the preset multiple) is combined with the center point coordinates (cx1, cy1) of the lesion in the current frame to determine the center point P'0 of the lesion in the next frame image, for example as

cx2' = cx1 + k·cx2, cy2' = cy1 + k·cy2

with (cx2, cy2) taken as the offset of the peak from the score-map center. Then the length w2 and the width h2 of the lesion in the (N+1)th frame image are determined according to the length w1 and the width h1 of the lesion in the Nth frame image; in this embodiment the length and width of the lesion in the next frame image are the same as those in the Nth frame image, that is, w2 = w1 and h2 = h1. Finally, the coordinates of the center point of the lesion in the next frame image, together with the length w2 and the width h2 of the lesion in the (N+1)th frame image, are output as the second position information in the (N+1)th frame image.
A schematic structural diagram of an ultrasound apparatus according to a third embodiment of the present invention is shown in fig. 5, and the ultrasound apparatus at least includes a memory 100 and a processor 200, and may further include a scanning probe, a display screen, a keyboard, and other hardware devices necessary for performing ultrasound processing, where a computer program is stored on the memory 100, and the processor 200, when executing the computer program on the memory 100, implements the following steps S1 to S3:
S1, acquiring the Nth frame image of the ultrasonic image to be processed as the initial frame image of the ultrasonic image to be processed;
S2, determining an output parameter of a first neural network by taking the Nth frame image as an input parameter of the first neural network, wherein the output parameter of the first neural network is first position information of a target in the Nth frame image, and the first neural network is a trained neural network for identifying the position of the target in the ultrasonic image;
S3, taking the first position information as a first input parameter of a second neural network, taking the (N+1)th frame image of the ultrasonic image to be processed as a second input parameter of the second neural network, and tracking second position information of the target in the (N+1)th frame through the second neural network, wherein N is a positive integer and the second neural network is a trained neural network used for identifying the similarity between images.
The first position information in this embodiment includes at least the following information: the coordinates of the center point of the target in the image of the Nth frame, and the length and the width of the target area in the image of the Nth frame.
When the processor 200 executes the step of determining that the output parameter of the first neural network is the first position information of the target in the Nth frame image on the memory 100, the following computer program is specifically executed: acquiring at least one piece of position information of the target in the Nth frame image output by the first neural network; determining the confidence corresponding to each piece of position information; and determining the first position information of the target in the Nth frame image according to the position information whose confidence is greater than the first threshold.
When the processor 200 executes the step of using the (N+1)th frame image of the ultrasonic image to be processed as the second input parameter of the second neural network on the memory 100, the following computer program is specifically executed: determining a first search area on the (N+1)th frame image, wherein the center point coordinate of the first search area is the center point coordinate of the target in the Nth frame image, and the size of the first search area is a preset multiple of the size of the target in the Nth frame image; and taking the first search area as the second input parameter.
When the processor 200 executes the step of tracking the second position information of the target in the (N+1)th frame through the second neural network on the memory 100, the following computer program is specifically executed: determining, by the second neural network, a sub-region in the first search region having similarity to the image at the first position information in the Nth frame; and determining the position information of the sub-region as the second position information of the target in the (N+1)th frame.
When the processor 200 executes the step of determining, by the second neural network, a sub-region in the first search region having similarity to the image at the first position information in the Nth frame on the memory 100, the following computer program is specifically executed: calculating the distance between each pixel point in the first search area and the image center point at the first position information in the Nth frame; and determining a region composed of all pixel points whose distance is smaller than a second threshold in the first search region as the sub-region having similarity to the image at the first position information in the Nth frame.
When the processor 200 executes the step of tracking the second position information of the target in the (N+1)th frame through the second neural network on the memory 100, the following computer program is specifically executed: calculating similarity scores between each point in the first search area and the center point of the target in the Nth frame image, and taking the coordinate of the point with the highest similarity score as the coordinate of the center point of the target in the (N+1)th frame image; determining the length and the width of the target in the (N+1)th frame image, wherein the length and the width of the target in the (N+1)th frame image are the same as those of the target in the Nth frame image; and determining the coordinates of the center point of the target in the (N+1)th frame image and the length and width of the target in the (N+1)th frame image, and outputting them as the position information of the target in the (N+1)th frame image.
According to the above, through the training and use of the neural networks, the target position in the ultrasonic image is identified and tracked automatically, which saves the doctor's manual marking time and acquires the target position accurately and quickly; while tracking in real time, a TIC curve can be generated in real time in combination with the current contrast image, improving the efficiency and accuracy of target identification and tracking in the ultrasonic image.
A fourth embodiment of the present invention provides a storage medium having a computer program stored thereon, which, when being executed by a processor, performs the neural network-based ultrasound image tracking method as in the first embodiment of the present invention.
Moreover, although exemplary embodiments have been described herein, the scope thereof includes any and all embodiments based on the present invention with equivalent elements, modifications, omissions, combinations (e.g., of aspects across various embodiments), adaptations or alterations. The elements of the claims are to be interpreted broadly based on the language employed in the claims and not limited to examples described in the present specification or during the prosecution of the application, which examples are to be construed as non-exclusive. It is intended, therefore, that the specification and examples be considered as exemplary only, with a true scope and spirit being indicated by the following claims and their full scope of equivalents.
The above description is intended to be illustrative, and not restrictive. For example, the above-described examples (or one or more versions thereof) may be used in combination with each other. For example, other embodiments may be used by those of ordinary skill in the art upon reading the above description. In addition, in the above-described embodiments, various features may be grouped together to streamline the disclosure. This should not be interpreted as an intention that a disclosed feature not claimed is essential to any claim. Rather, inventive subject matter may lie in less than all features of a particular disclosed embodiment. Thus, the following claims are hereby incorporated into the detailed description as examples or embodiments, with each claim standing on its own as a separate embodiment, and it is contemplated that these embodiments may be combined with each other in various combinations or permutations. The scope of the invention should be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.
While the embodiments of the present invention have been described in detail, the present invention is not limited to these specific embodiments, and those skilled in the art can make various modifications and modifications of the embodiments based on the concept of the present invention, which fall within the scope of the present invention as claimed.

Claims (7)

1. An ultrasonic image video automatic tracking method based on artificial intelligence is characterized by comprising the following steps:
acquiring an Nth frame image of an ultrasonic image to be processed as an initial frame image of the ultrasonic image to be processed;
determining an output parameter of a first neural network by taking the Nth frame image as an input parameter of the first neural network, wherein the output parameter of the first neural network is first position information of a target in the Nth frame image, the first neural network is a trained neural network for identifying the position of the target in an ultrasonic image, and the first position information at least comprises the following information: the coordinates of the center point of the target in the Nth frame image, and the length and the width of the target in the Nth frame image;
taking the first position information as a first input parameter of a second neural network, taking the (N+1)th frame image of the ultrasonic image to be processed as a second input parameter of the second neural network, and tracking second position information of the target in the (N+1)th frame through the second neural network, wherein N is a positive integer, and the second neural network is a trained neural network for identifying similarity between images;
wherein taking the (N+1)th frame image of the ultrasonic image to be processed as the second input parameter of the second neural network comprises:
determining a first search area on the (N+1)th frame image, wherein the center point coordinate of the first search area is the center point coordinate of the target in the Nth frame image, and the size of the first search area is a preset multiple of the size of the target in the Nth frame image;
taking the first search area as the second input parameter;
wherein tracking the second position information of the target in the (N+1)th frame through the second neural network comprises:
calculating similarity scores between each point in the first search area and the center point of the target in the Nth frame image, and taking the coordinate of the point with the highest similarity score as the coordinate of the center point of the target in the (N+1)th frame image;
determining the length and the width of the target in the (N+1)th frame image, wherein the length and the width of the target in the (N+1)th frame image are the same as the length and the width of the target in the Nth frame image;
and determining the coordinates of the center point of the target in the (N+1)th frame image and the length and width of the target in the (N+1)th frame image, and outputting them as the position information of the target in the (N+1)th frame image.
2. The method of claim 1, wherein determining the output parameter of the first neural network comprises:
acquiring at least one position information of the target output by the first neural network in an Nth frame of image;
determining a confidence corresponding to each piece of position information;
and determining first position information of the target in the Nth frame of image according to the position information of which the confidence coefficient is greater than a first threshold value.
3. The method of claim 1, wherein tracking the second position information of the target in the (N+1)th frame through the second neural network comprises:
determining, by the second neural network, a sub-region in the first search region having similarity to the image at the first position information in the Nth frame;
determining the position information of the sub-region as the second position information of the target in the (N+1)th frame.
4. The method of claim 3, wherein determining, by the second neural network, a sub-region in the first search region having similarity to the image at the first position information in the Nth frame comprises:
calculating distances between each pixel point in the first search area and the image center point at the first position information in the Nth frame;
determining a region composed of all pixel points whose distance is smaller than a second threshold in the first search region as the sub-region having similarity to the image at the first position information in the Nth frame.
5. A method of training a neural network, comprising:
acquiring a first ultrasonic sample image, wherein the position of a target is marked in the ultrasonic sample image;
training a first neural network according to the ultrasonic sample image, wherein the trained first neural network is used for identifying first position information of the target in the ultrasonic image, and the first position information at least comprises the following information: the coordinate of the center point of the target in the Nth frame image, and the length and the width of the target in the Nth frame image;
obtaining a second neural network, wherein the second neural network is used for calculating the similarity of two frames of images;
acquiring a second ultrasonic sample image, wherein the second ultrasonic sample image is marked with the positions of the targets in two adjacent frames;
training the second neural network according to the second ultrasonic sample image, wherein the trained second neural network is used for identifying and obtaining the position of the target in the current frame according to the position of the target in the previous frame;
The first input parameter of the second neural network is the first position information, and the second input parameter of the second neural network is the (N+1)th frame image of the ultrasonic image to be processed; the second neural network calculates similarity scores between each point in a first search area and the center point of the target in the Nth frame image, and takes the coordinate of the point with the highest similarity score as the coordinate of the center point of the target in the (N+1)th frame image; the length and the width of the target in the (N+1)th frame image are determined and are the same as the length and the width of the target in the Nth frame image; the coordinates of the center point of the target in the (N+1)th frame image and the length and width of the target in the (N+1)th frame image are determined and output as the position information of the target in the (N+1)th frame image; the center point coordinate of the first search area is the center point coordinate of the target in the Nth frame image, and the size of the first search area is a preset multiple of the size of the target in the Nth frame image.
6. An ultrasound device comprising at least a memory, a processor, the memory having a computer program stored thereon, wherein the processor, when executing the computer program on the memory, implements the steps of the neural network based ultrasound image tracking method of any one of claims 1 to 4.
7. A storage medium having stored thereon a computer program which, when executed by a processor, performs the neural network-based ultrasound image tracking method of any one of claims 1 to 4.
CN201910947137.0A 2019-09-29 2019-09-29 Automatic ultrasonic image video tracking method and equipment based on artificial intelligence Active CN110648327B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910947137.0A CN110648327B (en) 2019-09-29 2019-09-29 Automatic ultrasonic image video tracking method and equipment based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910947137.0A CN110648327B (en) 2019-09-29 2019-09-29 Automatic ultrasonic image video tracking method and equipment based on artificial intelligence

Publications (2)

Publication Number Publication Date
CN110648327A CN110648327A (en) 2020-01-03
CN110648327B true CN110648327B (en) 2022-06-28

Family

ID=68993629

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910947137.0A Active CN110648327B (en) 2019-09-29 2019-09-29 Automatic ultrasonic image video tracking method and equipment based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN110648327B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111311635A (en) * 2020-02-08 2020-06-19 腾讯科技(深圳)有限公司 Target positioning method, device and system
CN115668278A (en) * 2020-05-29 2023-01-31 华为技术有限公司 Image processing method and related equipment
CN111862044A (en) * 2020-07-21 2020-10-30 长沙大端信息科技有限公司 Ultrasonic image processing method and device, computer equipment and storage medium
CN116965930B (en) * 2023-09-22 2023-12-22 北京智愈医疗科技有限公司 Ultrasonic image-based surgical instrument displacement monitoring device

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105976400A (en) * 2016-05-10 2016-09-28 北京旷视科技有限公司 Object tracking method and device based on neural network model
CN106909885A (en) * 2017-01-19 2017-06-30 博康智能信息技术有限公司上海分公司 A kind of method for tracking target and device based on target candidate
CN107833239A (en) * 2017-10-26 2018-03-23 辽宁工程技术大学 A kind of searching of optimal matching method for tracking target based on weighted model constraint
CN108898086A (en) * 2018-06-20 2018-11-27 腾讯科技(深圳)有限公司 Method of video image processing and device, computer-readable medium and electronic equipment
CN109726683A (en) * 2018-12-29 2019-05-07 北京市商汤科技开发有限公司 Target object detection method and device, electronic equipment and storage medium
CN109886356A (en) * 2019-03-08 2019-06-14 哈尔滨工程大学 A kind of target tracking method based on three branch's neural networks
CN110111362A (en) * 2019-04-26 2019-08-09 辽宁工程技术大学 A kind of local feature block Similarity matching method for tracking target
CN110223324A (en) * 2019-06-05 2019-09-10 东华大学 A kind of method for tracking target of the twin matching network indicated based on robust features


Also Published As

Publication number Publication date
CN110648327A (en) 2020-01-03


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant