CN113822791A - Image registration method, registration network training method, device, equipment and medium - Google Patents

Image registration method, registration network training method, device, equipment and medium

Info

Publication number
CN113822791A
Authority
CN
China
Prior art keywords
image
registration
initial
network
transformation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110625883.5A
Other languages
Chinese (zh)
Inventor
秦陈陈
姚建华
刘翌勋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202110625883.5A priority Critical patent/CN113822791A/en
Publication of CN113822791A publication Critical patent/CN113822791A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00: Geometric image transformations in the plane of the image
    • G06T 3/14: Transformations for image registration, e.g. adjusting or mapping for alignment of images
    • G06T 3/147: Transformations for image registration, e.g. adjusting or mapping for alignment of images, using affine transformations
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/22: Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image registration method, a registration network training method, a device, equipment, and a medium, belonging to the technical field of artificial intelligence. After an image is registered by the image registration network, a way of further optimizing the initial registration result is provided, so that when the initial registration result is not accurate enough, the optimization process makes the optimized registration result more accurate, alleviating the insufficient generalization capability of the image registration network on new data sets. In the optimization process, the transformation parameters are adjusted iteratively so that the similarity between the reference image and the second image increases, and a more accurate registration result is obtained.

Description

Image registration method, registration network training method, device, equipment and medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to an image registration method, a registration network training method, an apparatus, a device, and a medium.
Background
Image registration refers to the process of matching and superimposing two or more images acquired at different times, with different sensors (imaging devices), or under different conditions (weather, illuminance, camera position, angle, and the like).
At present, image registration methods generally train an image registration network in advance: the image to be registered and a reference image are input into the image registration network, which performs image registration on them and outputs a registration result.
In such methods, the image registration network is usually trained on a particular image set, and its generalization capability on new data sets is poor. The image to be registered and the reference image supplied during actual use are not from the training image set, so the obtained registration result may not be accurate enough.
Disclosure of Invention
The embodiment of the application provides an image registration method, a registration network training method, a device, equipment and a medium, and improves the accuracy of image registration. The technical scheme is as follows:
in one aspect, an image registration method is provided, the method including:
based on the image registration network, carrying out image registration on the reference image and the first image to obtain an initial registration result;
updating an initial transformation parameter corresponding to the initial registration result in response to the initial registration result satisfying a condition;
performing transformation processing on the first image based on the updated transformation parameters to obtain a second image;
and in response to the similarity between the reference image and the second image satisfying a condition, updating the transformation parameters, and continuing the transformation processing and parameter updating based on the updated transformation parameters until a target condition is met, to obtain a target registration result.
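The iterative optimization described by these steps can be sketched as follows. This is an illustrative sketch, not the patent's implementation: `transform` and `similarity` are hypothetical callables (e.g., an affine warp and a similarity measure), and a real system would typically use gradient-based updates on a differentiable similarity rather than the random search shown here.

```python
import numpy as np

def refine_registration(ref, moving, init_params, transform, similarity,
                        sim_threshold=0.95, max_iters=50, step=1.0):
    """Iteratively update the transformation parameters: transform the first
    image, compare the result with the reference image, and keep updates that
    raise the similarity, stopping when the target condition is met."""
    params = np.asarray(init_params, dtype=float).copy()
    best_sim = similarity(ref, transform(moving, params))
    for _ in range(max_iters):
        if best_sim >= sim_threshold:          # target condition: similar enough
            break
        # propose a small update to the transformation parameters
        candidate = params + step * np.random.randn(*params.shape)
        sim = similarity(ref, transform(moving, candidate))
        if sim > best_sim:                     # keep only improving updates
            params, best_sim = candidate, sim
    return transform(moving, params), params
```

The parameter vector could hold, for example, a flattened affine matrix or a displacement field; the loop structure (transform, compare, update, repeat until a target condition) is the same either way.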
In one aspect, an image registration network training method is provided, and the method includes:
acquiring a sample image pair, wherein the sample image pair comprises a sample reference image and a sample first image, and the sample image pair carries image type information;
according to the image type information carried by the sample image pair, carrying out image registration on the sample image pair based on a branch corresponding to the image type information in an image registration network to obtain a transformation parameter;
training network parameters of the image registration network based on the transformation parameters.
In one aspect, an image registration apparatus is provided, the apparatus including:
the registration module is used for carrying out image registration on the reference image and the first image based on the image registration network to obtain an initial registration result;
an updating module, configured to update an initial transformation parameter corresponding to the initial registration result in response to that the initial registration result satisfies a condition;
the transformation module is used for carrying out transformation processing on the first image based on the updated transformation parameters to obtain a second image;
the updating module and the transformation module are further configured to, in response to the similarity between the reference image and the second image satisfying a condition, update the transformation parameters and continue the transformation processing and parameter updating based on the updated transformation parameters, stopping when a target condition is satisfied to obtain a target registration result.
In some embodiments, the update module is to perform any of:
in response to receiving an optimization instruction for the initial registration result, executing the step of updating the initial transformation parameters corresponding to the initial registration result;
and in response to the similarity between the reference image and the initial second image (the first image after registration) in the initial registration result being smaller than a first similarity threshold, executing the step of updating the initial transformation parameters corresponding to the initial registration result.
In some embodiments, the image registration network comprises a first branch for rigid body registration and a second branch for non-rigid body registration;
the registration module is to:
responding to a rigid body registration instruction, and carrying out image registration on the reference image and the first image based on a first branch of the image registration network to obtain an initial registration result, wherein an initial transformation parameter corresponding to the initial registration result is an initial affine matrix;
and responding to a non-rigid body registration instruction, and performing image registration on the reference image and the first image based on a second branch of the image registration network to obtain an initial registration result, wherein an initial transformation parameter corresponding to the initial registration result is an initial vector field.
In some embodiments, the update module is to perform any of:
in response to the initial registration result meeting a condition, updating an initial affine matrix corresponding to the initial registration result;
updating an initial vector field corresponding to the initial registration result in response to the initial registration result satisfying a condition;
the transformation module is to perform any one of:
carrying out affine transformation on the first image based on the updated affine matrix to obtain a second image;
and carrying out strain transformation on the first image based on the updated vector field to obtain a second image.
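The two kinds of transformation processing can be illustrated with a minimal NumPy sketch (hypothetical helper names; 2-D images and nearest-neighbour sampling for brevity): an affine matrix applies one global linear mapping to every coordinate, whereas a vector field assigns each pixel its own displacement.

```python
import numpy as np

def affine_warp(img, A):
    """Warp a 2-D image with a 2x3 affine matrix [R | t]: each output pixel
    (y, x) samples the input at A @ [y, x, 1] (nearest neighbour)."""
    h, w = img.shape
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            sy, sx = A @ np.array([y, x, 1.0])
            sy, sx = int(round(sy)), int(round(sx))
            if 0 <= sy < h and 0 <= sx < w:
                out[y, x] = img[sy, sx]
    return out

def deformable_warp(img, field):
    """Warp with a dense displacement (vector) field of shape (h, w, 2):
    output pixel (y, x) samples the input at
    (y + field[y, x, 0], x + field[y, x, 1])."""
    h, w = img.shape
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            sy = int(round(y + field[y, x, 0]))
            sx = int(round(x + field[y, x, 1]))
            if 0 <= sy < h and 0 <= sx < w:
                out[y, x] = img[sy, sx]
    return out
```

Rigid-body registration (the first branch) constrains the affine matrix to rotation plus translation; non-rigid (strain) registration (the second branch) needs the full per-pixel vector field.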
In some embodiments, the update module is to update the transformation parameter in response to a similarity between the reference image and the second image being less than a second similarity threshold.
In some embodiments, the obtaining of the similarity between the reference image and the second image comprises:
and according to the modal information of the reference image and the second image, acquiring the similarity between the reference image and the second image by adopting a similarity acquisition mode corresponding to the modal information.
In some embodiments, the obtaining, according to the modality information of the reference image and the second image, the similarity between the reference image and the second image in a similarity obtaining manner corresponding to the modality information includes:
acquiring a normalized cross-correlation coefficient between the sub-image set of the reference image and the second image in response to the modality information of the reference image and the modality information of the second image being the same, wherein the normalized cross-correlation coefficient is a similarity between the reference image and the second image;
acquiring normalized mutual information between the sub-image set of the reference image and the second image in response to the modality information of the reference image differing from the modality information of the second image, and acquiring the similarity between the reference image and the second image based on the normalized mutual information.
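As an illustration of the two similarity measures, here is a simplified sketch: the measures are computed globally over whole images, whereas the text above refers to a sub-image set (patch-wise computation), and the exact formulas used by the patent may differ.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation: suited to same-modality pairs, where
    corresponding structures have linearly related intensities."""
    a, b = a - a.mean(), b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def nmi(a, b, bins=32):
    """Normalized mutual information: suited to cross-modality pairs, since
    it only assumes a statistical (not linear) intensity relation."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    def entropy(p):
        p = p[p > 0]
        return -(p * np.log(p)).sum()
    # (H(X) + H(Y)) / H(X, Y); equals 2 for identical images
    return float((entropy(px) + entropy(py)) / (entropy(pxy.ravel()) + 1e-8))
```

NCC ranges over [-1, 1] (1 for a perfect linear match); this form of NMI ranges over [1, 2] (2 when one image fully determines the other).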
In some embodiments, the obtaining the similarity between the reference image and the second image based on the normalized mutual information includes:
in response to the transformation parameter being a vector field and the transformation processing applied to the first image being a strain transformation, obtaining a smooth constraint value based on the vector field;
and acquiring the similarity between the reference image and the second image based on the smooth constraint value and the normalized mutual information.
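One plausible form of the smooth constraint value is a squared-gradient penalty on the vector field, a common regularizer in deformable registration; this is an assumption for illustration, not the patent's exact formula.

```python
import numpy as np

def smoothness(field):
    """Smooth constraint value for a displacement field of shape (h, w, 2):
    mean squared finite-difference gradient. Small values mean a gentle
    deformation; large values mean the field folds or tears the image."""
    dy = np.diff(field, axis=0)  # differences between adjacent rows
    dx = np.diff(field, axis=1)  # differences between adjacent columns
    return float((dy ** 2).mean() + (dx ** 2).mean())

def regularized_similarity(nmi_value, field, weight=0.1):
    """Similarity used during optimization: reward image agreement (NMI)
    while penalizing rough vector fields (hypothetical weighting)."""
    return nmi_value - weight * smoothness(field)
```

Combining the two terms this way keeps the optimization from chasing a high NMI through implausible, non-smooth deformations.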
In one aspect, an image registration apparatus is provided, the apparatus including:
the device comprises an acquisition module, a processing module and a display module, wherein the acquisition module is used for acquiring a sample image pair, the sample image pair comprises a sample reference image and a sample first image, and the sample image pair carries image type information;
the registration module is used for carrying out image registration on the sample image pair based on a branch corresponding to the image type information in an image registration network according to the image type information carried by the sample image pair to obtain a transformation parameter;
and the training module is used for training the network parameters of the image registration network based on the transformation parameters.
In some embodiments, the training module is to perform any one of:
processing the sample first image based on the transformation parameters to obtain a target second image, and training network parameters of the image registration network based on the similarity between the sample reference image and the target second image;
and training the network parameters of the image registration network based on the transformation parameters and the target transformation parameters carried by the sample image pair.
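The two training modes above can be sketched as two loss functions (illustrative NumPy stand-ins; the patent's actual similarity measure and parameterization may differ, and mean squared error is used here only for concreteness).

```python
import numpy as np

def unsupervised_loss(sample_ref, warped):
    """Mode 1: no ground-truth parameters are available, so the loss drives
    the warped sample first image toward the sample reference image."""
    return float(np.mean((sample_ref - warped) ** 2))  # lower is better

def supervised_loss(pred_params, target_params):
    """Mode 2: the sample pair carries target transformation parameters
    (e.g., a known affine matrix), and the network output is regressed
    toward them."""
    pred_params = np.asarray(pred_params, dtype=float)
    target_params = np.asarray(target_params, dtype=float)
    return float(np.mean((pred_params - target_params) ** 2))
```

In either mode, the loss is backpropagated to train the network parameters of the branch that produced the transformation parameters.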
In some embodiments, the image type information comprises a first image type and a second image type;
the registration module is to:
responding to the image type information carried by the sample image pair as a first image type, and carrying out image registration on the sample image pair based on a first branch in the image registration network to obtain an affine matrix, wherein the affine matrix is a transformation parameter, and the first branch is used for rigid body registration;
and in response to that the image type information carried by the sample image pair is a second image type, carrying out image registration on the sample image pair based on a second branch in the image registration network to obtain a vector field, wherein the vector field is a transformation parameter, and the second branch is used for non-rigid body registration.
In some embodiments, the obtaining module is further configured to obtain batch data in response to an update instruction of an image registration network, where the batch data includes a target image pair and the sample image pair, and the target image pair is an image pair obtained by optimizing an initial registration result output by the image registration network during use of the image registration network;
the training module is further used for training the trained image registration network based on the batch data to obtain an updated image registration network.
In one aspect, an electronic device is provided that includes one or more processors and one or more memories having stored therein at least one computer program that is loaded and executed by the one or more processors to implement various alternative implementations of the image registration method or the image registration network training method described above.
In one aspect, a computer-readable storage medium is provided, in which at least one computer program is stored, which is loaded and executed by a processor to implement various alternative implementations of the above-described image registration method or image registration network training method.
In one aspect, a computer program product or computer program is provided that includes one or more program codes stored in a computer-readable storage medium. The one or more processors of the electronic device read the one or more program codes from the computer-readable storage medium, and the one or more processors execute the one or more program codes to cause the electronic device to perform the image registration method or the image registration network training method of any of the above possible embodiments.
Drawings
To more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of an implementation environment of an image registration method provided in an embodiment of the present application;
fig. 2 is a flowchart of an image registration method provided in an embodiment of the present application;
fig. 3 is a flowchart of an image registration network training method provided in an embodiment of the present application;
fig. 4 is a schematic structural diagram of an image registration apparatus provided in an embodiment of the present application;
fig. 5 is a schematic diagram of an image registration network provided in an embodiment of the present application;
fig. 6 is a flowchart of an image registration network training method provided in an embodiment of the present application;
fig. 7 is a flowchart of an image registration method provided in an embodiment of the present application;
fig. 8 is a schematic structural diagram of an image registration apparatus provided in an embodiment of the present application;
fig. 9 is a schematic structural diagram of an image registration network training apparatus provided in an embodiment of the present application;
fig. 10 is a block diagram of an electronic device according to an embodiment of the present disclosure;
fig. 11 is a block diagram of a terminal according to an embodiment of the present disclosure;
fig. 12 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
The terms "first," "second," and the like in this application are used to distinguish between identical or similar items having substantially the same function and purpose; "first," "second," and "nth" imply no logical or temporal dependency and no limitation on number or order of execution. It will be further understood that, although the following description uses the terms first, second, etc. to describe various elements, these elements should not be limited by these terms; the terms are only used to distinguish one element from another. For example, a first image could be termed a second image, and similarly a second image could be termed a first image, without departing from the scope of the various described examples. The first image and the second image are both images and, in some cases, separate and distinct images.
The term "at least one" is used herein to mean one or more, and the term "plurality" is used herein to mean two or more, e.g., a plurality of packets means two or more packets.
It is to be understood that the terminology used in the description of the various described examples herein is for the purpose of describing particular examples only and is not intended to be limiting. As used in the description of the various described examples and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. "And/or" describes an association between objects and indicates that three relationships are possible; for example, "A and/or B" means: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" in the present application generally indicates that the former and latter related objects are in an "or" relationship.
It should also be understood that, in the embodiments of the present application, the size of the serial number of each process does not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
It should also be understood that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.
It will be further understood that the terms "comprises," "comprising," "includes," and/or "including," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also understood that the term "if" may be interpreted to mean "when," "upon," "in response to determining," or "in response to detecting." Similarly, the phrase "if it is determined" or "if [a stated condition or event] is detected" may be interpreted to mean "upon determining," "in response to determining," "upon detecting [the stated condition or event]," or "in response to detecting [the stated condition or event]," depending on the context.
The embodiment of the application relates to an artificial intelligence technology, in particular to image registration based on the artificial intelligence technology. The following is a brief description of artificial intelligence.
Artificial Intelligence (AI) is a theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence: to perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, involving both hardware-level and software-level technologies. The basic artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
Computer Vision (CV) technology is a science that studies how to make a machine "see": using cameras and computers in place of human eyes to identify, track, and measure targets, and performing further image processing so that the processed image becomes more suitable for human observation or for transmission to an instrument for detection. As a scientific discipline, computer vision studies related theories and techniques in an attempt to build artificial intelligence systems that can capture information from images or multidimensional data. Computer vision technologies generally include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, and simultaneous localization and mapping, and also include common biometric technologies such as face recognition and fingerprint recognition.
With the research and progress of artificial intelligence technology, the artificial intelligence technology is developed and applied in a plurality of fields, such as common smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical care, smart customer service, and the like.
The scheme provided by the embodiment of the application relates to the image processing technology in the computer vision of artificial intelligence, and the like, and is specifically explained by the following embodiment.
The following describes an embodiment of the present application.
Fig. 1 is a schematic diagram of an implementation environment of an image registration method provided in an embodiment of the present application. The implementation environment comprises a terminal 101, or the implementation environment comprises a terminal 101 and an image registration platform 102. The terminal 101 is connected to the image registration platform 102 through a wireless network or a wired network.
The terminal 101 is at least one of a smartphone, a game console, a desktop computer, a tablet computer, an e-book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, and a laptop computer. The terminal 101 has installed and runs an application that supports image registration, for example, a system application, an instant messaging application, a news push application, a shopping application, an online video application, or a social application.
Illustratively, the terminal 101 has an image capturing function and an image processing function: it processes the captured images and executes corresponding functions according to the processing results. Optionally, the terminal 101 may also perform image processing on images acquired by other devices. The terminal 101 can complete this work independently, or the image registration platform 102 can provide data services for it. The embodiments of the present application do not limit this.
The image registration platform 102 includes at least one of a server, a plurality of servers, a cloud computing platform, and a virtualization center. The image registration platform 102 is used to provide background services for applications that support image registration. Optionally, the image registration platform 102 undertakes primary processing and the terminal 101 undertakes secondary processing; or, the image registration platform 102 undertakes secondary processing work, and the terminal 101 undertakes primary processing work; alternatively, the image registration platform 102 or the terminal 101 may be separately responsible for processing. Alternatively, the image registration platform 102 and the terminal 101 perform collaborative computation by using a distributed computing architecture.
Optionally, the image registration platform 102 includes at least one server 1021 and a database 1022, where the database 1022 is used to store data, and in this embodiment, the database 1022 stores a sample image pair or a to-be-processed image pair, and provides a data service for the at least one server 1021.
The server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms. The terminal may be a smartphone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, or the like, but is not limited thereto.
Those skilled in the art will appreciate that there may be more or fewer terminals 101 and servers 1021. For example, there may be only one terminal 101 or one server 1021, or tens or hundreds of terminals 101 and servers 1021, or more; the number of terminals or servers and their device types are not limited in the embodiments of the present application.
Fig. 2 is a flowchart of an image registration method provided in an embodiment of the present application, where the method is applied to an electronic device, the electronic device being a terminal or a server, and referring to fig. 2, taking application of the method to a terminal as an example, the method includes the following steps.
201. The electronic equipment carries out image registration on the reference image and the first image based on the image registration network to obtain an initial registration result.
An image registration network is used to perform image registration on input image pairs. The process of image registration can be understood as follows: for an image pair, one image (the floating image, i.e., the image to be registered) is mapped onto the other image (the fixed image) by finding a spatial transformation, so that points corresponding to the same spatial position in the two images are placed in one-to-one correspondence, achieving the purpose of information fusion.
The image pair includes a reference image and a first image, where the first image is the image to be registered. During registration, the reference image is taken as the standard, and the relevant information of the first image is converted into the coordinate system of the reference image, so that the same or similar image content in the first image and the reference image is brought to the same position through image registration. Subsequent processing then makes the two images more comparable, and the correspondence of the same or similar image content in the two images can be observed more intuitively.
In some embodiments, the electronic device may input the reference image and the first image into an image registration network, perform image registration on the reference image and the first image by the image registration network based on network parameters, and output an initial second image after the first image registration. In some embodiments, the electronic device may also output a reference image. In other embodiments, the electronic device may further output an initial transformation parameter for transforming the first image in the image registration process, that is, an initial transformation parameter corresponding to the initial registration result. The initial transformation parameter is obtained by processing the reference image and the first image by the image registration network.
202. And the electronic equipment responds to the condition that the initial registration result meets the condition, and updates the initial transformation parameters corresponding to the initial registration result.
After the initial registration result is obtained through the image registration network, whether the initial registration result is accurate or not can be further determined, and if the initial registration result is not accurate, the electronic equipment can further optimize the initial registration result so as to improve the accuracy of the registration result.
The initial registration result satisfying the condition indicates that the initial registration result is not accurate enough, so the electronic device optimizes it. During optimization, the initial transformation parameters corresponding to the initial registration result can be updated, and the updated transformation parameters are then applied in subsequent steps so that the registered second image becomes more similar to the reference image.
The objective of the optimization is to update the transformation parameters so that the second image obtained by transforming the first image based on the transformation parameters is more similar to the reference image; the more similar the second image and the reference image are, the more accurate the image registration result is.
203. And the electronic equipment carries out transformation processing on the first image based on the updated transformation parameters to obtain a second image.
After the electronic device updates the transformation parameters, the first image may be transformed based on the transformation parameters to obtain a second image. It should be noted that the essence of the transformation process is to transform the image content of the first image into the coordinate system of the reference image, so that the position, size and shape of the same or similar image content in the reference image and the second image are the same in the images, so that when two images are applied, the two images can be effectively combined to analyze the target in the images.
For example, if the reference image and the first image are images obtained by photographing a human brain from different angles, the transformation processing can adjust the position, size, and shape of the same brain-related tissue in the two images to be consistent, so that when the two images are combined to analyze that tissue, there is no need to mentally reproject the tissue in the first image to the viewing angle of the reference image.
204. And the electronic equipment responds to the condition that the similarity between the reference image and the second image meets the condition, updates the transformation parameters, continues to perform transformation processing and update the transformation parameters based on the updated transformation parameters, and stops until the target condition is met to obtain a target registration result.
Steps 202 and 203 constitute one iteration of the optimization process. In the current iteration, the electronic device may obtain the similarity between the reference image and the second image and use it as the criterion for judging whether the registration result is accurate. If the registration result is not accurate enough, the transformation parameters are updated again, and steps 202 and 203 are repeated to increase the similarity between the reference image and the second image, thereby optimizing the initial registration result.
After the image is registered through the image registration network, a mode of continuously optimizing the initial registration result is provided, so that when the initial registration result is not accurate enough, the optimized registration result can be more accurate through the optimization process, and the problem that the generalization capability of the image registration network on a new data set is insufficient is solved. In the optimization process, the transformation parameters are continuously adjusted, so that the similarity between the reference image and the second image is higher, and a more accurate registration result can be obtained.
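The optimization loop of steps 202-204 can be illustrated with a minimal PyTorch sketch that iteratively refines an affine transformation parameter by gradient descent, so that the warped first image becomes more similar to the reference image. The function name, stopping threshold, and use of NCC as the similarity measure are illustrative assumptions, not the patent's exact procedure:

```python
import torch
import torch.nn.functional as F

def refine_affine(fixed, moving, theta_init, steps=100, lr=1e-3, tol=0.99):
    """Refine an initial affine parameter so the warped moving (first)
    image becomes more similar to the fixed (reference) image.
    fixed, moving: (1, 1, H, W); theta_init: (1, 2, 3) affine matrix."""
    theta = theta_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([theta], lr=lr)
    for _ in range(steps):
        grid = F.affine_grid(theta, moving.shape, align_corners=False)
        warped = F.grid_sample(moving, grid, align_corners=False)
        # normalized cross-correlation as the similarity measure
        f = fixed - fixed.mean()
        w = warped - warped.mean()
        ncc = (f * w).sum() / (f.norm() * w.norm() + 1e-8)
        if ncc.item() > tol:   # similar enough: target condition met, stop
            break
        opt.zero_grad()
        (-ncc).backward()      # maximizing similarity = minimizing its negative
        opt.step()             # update the transformation parameters
    return theta.detach()
```

When the two images are already well aligned the loop stops immediately, which corresponds to the target condition being met without further updates.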
Fig. 3 is a flowchart of an image registration network training method provided in an embodiment of the present application, where the method is applied to an electronic device, and the electronic device is a terminal or a server, and referring to fig. 3, taking the application of the method to a terminal as an example, the method includes the following steps.
301. The electronic device obtains a sample image pair comprising a sample reference image and a sample first image, the sample image pair carrying image type information.
Image registration includes rigid body registration and non-rigid body registration. The image type information is used to indicate the image registration mode used for the sample reference image and the sample first image.
In some embodiments, the image type information includes a first image type and a second image type. The first image type is indicative of rigid body registration of the sample reference image and the sample first image. The second image type is indicative of a non-rigid registration of the sample reference image and the sample first image.
In the embodiment of the application, the image registration network has both rigid body registration and non-rigid body registration functions. When the image registration network is trained, image type information can be set for the sample image pair, so that the network can freely switch between rigid body registration and non-rigid body registration according to this information. The two registration functions are thus trained simultaneously, effectively improving the automation and efficiency of training.
302. And the electronic equipment performs image registration on the sample image pair based on the branch corresponding to the image type information in the image registration network according to the image type information carried by the sample image pair to obtain a transformation parameter.
The transformation parameters obtained by different registration functions are different, and the image registration modes are different, so that different registration functions of the image registration network are realized through different branches, for example, the rigid body registration function is realized through a first branch, and the non-rigid body registration is realized through a second branch.
During training, the electronic device can read the image type information carried by the sample image pair and freely switch between the branches of the image registration network to perform image registration.
303. The electronic device trains network parameters of the image registration network based on the transformation parameters.
After the electronic device obtains the transformation parameters based on the image registration network, it may determine whether the transformation parameters obtained by the current network parameters are accurate based on the transformation parameters.
When the transformation parameters are measured accurately, the method can be realized in different modes. For example, if unsupervised training is adopted, the first image can be transformed through the transformation parameters, and whether the transformed second image is accurate or not is measured through the similarity between the transformed second image and the reference image. If the supervised training is adopted, the sample image pair can carry target transformation parameters, and the electronic equipment can measure whether the target transformation parameters are accurate or not by comparing the transformation parameters output by the image registration network with the target transformation parameters.
The image registration network trained by the embodiment of the application supports multiple image registration modes. When the network is trained, image type information can be set for the sample image pair, so that the image registration network automatically selects the matching image registration mode according to the image type information and trains the network parameters for that mode. Multiple image registration modes are thus trained synchronously, improving the training efficiency and the applicability of the image registration network.
Fig. 4 is a flowchart of an image registration network training method provided in an embodiment of the present application, and referring to fig. 4, the method includes the following steps.
401. The electronic device obtains a sample image pair comprising a sample reference image and a sample first image, the sample image pair carrying image type information.
The image registration network training method can be applied to two-dimensional image registration and also can be applied to three-dimensional image registration, that is, the images in the sample image pair can be two-dimensional images and also can be three-dimensional images, which is not limited in the embodiment of the present application.
The sample image pair may be stored at different locations and, accordingly, the electronic device may acquire the sample image pair in different ways.
In some embodiments, the sample image pair may be stored in the electronic device, and accordingly, the electronic device may retrieve the sample image pair from this storage.
In other embodiments, the sample image pairs may be stored in an image database. Accordingly, the electronic device can obtain the sample image pair from the image database.
The above provides two possible acquisition manners of the sample image pair, which is not limited in the embodiment of the present application.
The image type information is used to indicate the manner in which the image pair is registered.
In some embodiments, the image registration may include rigid body registration and non-rigid body registration. The image type information is used to indicate whether rigid body registration or non-rigid body registration is performed on the image pair. Rigid body registration refers to mapping a first image into a coordinate system of a reference image by performing rigid body transformation on the first image. Non-rigid registration refers to mapping the first image into the coordinate system of the reference image by non-rigid transformation of the first image.
A rigid body is an object whose shape, size, and the relative positions of its internal points remain unchanged after motion and applied forces. A rigid body transformation refers to motions of a geometric object such as rotation, translation, and mirroring, under which length, angle, area, gradient, divergence, curl, and so on remain unchanged. Non-rigid transformations are more complex: transformations such as scaling, affine, perspective, and polynomial transformations are all non-rigid. A non-rigid transformation describes changes to the size or shape of a geometric object.
In some embodiments, the image type information may be set by the relevant technician for the sample image pair, which may be understood as annotation data or a label for the sample image pair. By setting the image type information, the electronic device can know the image registration mode adopted for the sample image pair.
In other embodiments, the image type information is stored in association with the sample image pair, and the electronic device acquires the image type information synchronously with the acquisition of the sample image pair.
402. And the electronic equipment responds to the fact that the image type information carried by the sample image pair is a first image type, image registration is carried out on the sample image pair on the basis of a first branch in the image registration network, an affine matrix is obtained, the affine matrix is a transformation parameter, and the first branch is used for rigid body registration.
When the image type information is different, the image registration mode adopted by the electronic device may be different. In some embodiments, the image registration network rigid-body registers the pair of images when the image type information is of the first image type. And when the image type information is the second image type, the image registration network performs non-rigid registration on the image pair.
In some embodiments, the image type information is represented by 0 or 1, the image type information is 0 for representing the first image type, and the image type information is 1 for representing the second image type.
After the electronic device acquires the sample image pair, the sample image pair may be input into an image registration network, the image registration network may read image type information carried by the sample image pair, and when the image type information is of a first image type, the image registration network may be input into a first branch, and the first branch performs an image registration step.
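The branch selection described above can be sketched as a simple dispatch on the image type label (a hypothetical sketch; the 0/1 encoding follows the embodiment above, and the two branch callables are stand-ins for the actual network branches):

```python
def register(feature_map, image_type, affine_branch, flow_branch):
    """Route the shared feature map to the branch selected by the image
    type label: 0 (first type) -> rigid branch producing an affine
    matrix; 1 (second type) -> non-rigid branch producing a vector field."""
    if image_type == 0:
        return affine_branch(feature_map)
    return flow_branch(feature_map)

# toy stand-ins for the two branches
out = register("feat", 0, lambda f: "affine matrix", lambda f: "vector field")
print(out)  # affine matrix
```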
In some embodiments, the image registration network may perform feature extraction on the sample image pair, input the extracted feature map into the first branch, and process the feature map by the first branch to obtain an affine matrix.
Optionally, the processing of the feature map by the first branch may be: the first branch performs global pooling on the feature map to obtain a vector, and then performs dimension reduction and reshaping on the vector to obtain an affine matrix.
In some embodiments, the image registration network includes a feature extraction module and a registration module. The registration module includes a first branch and a second branch. After the electronic device inputs the sample image into the image registration network, the feature extraction module of the image registration network may perform feature extraction on the sample image to obtain a feature map of the sample reference image and the sample first image, then input the feature map into the first branch of the registration module, and process the feature map by the first branch to obtain an affine matrix.
In some embodiments, the feature extraction module may employ a feature extraction network, and the feature extraction module may be understood as an encoder through which the sample image pair is encoded to obtain the feature map.
For example, the feature extraction network or encoder may employ the backbone network of ResNet. The registration module may be understood as a decoder. Different registration functions correspond to different decoders. That is, rigid body registration corresponds to a decoder. Non-rigid registration corresponds to another decoder.
For the first branch, the first branch may further process the feature map to obtain an affine matrix, where the affine matrix is a transformation parameter for the first image. I.e. by analyzing the feature map, it is determined how to process the first image to transform it into the coordinate system of the reference image.
In some embodiments, the first branch may include a Global Average Pooling (GAP) layer, a linear layer, and a reshaping layer. The global pooling layer performs global pooling on the feature maps and compresses them into a vector, reducing the data dimensionality and the subsequent amount of computation. The linear layer performs dimensionality reduction on the vector. The reshaping layer reshapes (reshape) the vector, converting it into an affine matrix.
In a specific example, the sample image pair is subjected to feature extraction to obtain one or more feature maps, and for each feature map, the feature map can be converted into a numerical value through the GAP. The GAP process is to average the pixel values of all pixels in the feature map to obtain an average value. Therefore, a plurality of average values corresponding to a plurality of feature maps form a vector. In a specific example, the number of feature maps is 256, and a vector with a length of 256 can be obtained through GAP. The first branch then proceeds to process the length 256 vector through the linear layer, resulting in a length 12 vector. The first branch converts the vector with the length of 12 into a matrix form based on the reshaping layer, so that an affine matrix is obtained. For example, the affine matrix is a 3 × 4 matrix.
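The GAP, linear layer, and reshaping pipeline of this example can be sketched in PyTorch as follows (a hypothetical sketch; the module name and the 3D input shape are assumptions):

```python
import torch
import torch.nn as nn

class AffineHead(nn.Module):
    """Sketch of the first branch: GAP -> linear layer -> reshape."""
    def __init__(self, in_channels=256):
        super().__init__()
        self.gap = nn.AdaptiveAvgPool3d(1)        # each feature map -> one average value
        self.linear = nn.Linear(in_channels, 12)  # length-256 vector -> length-12 vector
    def forward(self, feat):                      # feat: (N, 256, D, H, W)
        v = self.gap(feat).flatten(1)             # (N, 256)
        v = self.linear(v)                        # (N, 12)
        return v.view(-1, 3, 4)                   # reshape into a 3x4 affine matrix

theta = AffineHead()(torch.randn(2, 256, 4, 8, 8))
print(theta.shape)  # torch.Size([2, 3, 4])
```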
For example, as shown in fig. 5, the Image registration network may obtain a feature map of an Image after inputting a to-be-registered Image (Mov Image)501 and a reference Image (Fixed Image)502 into a feature extraction module 503, and then may convert the feature map into a vector through a GAP504, and may obtain an affine matrix 507 through a linear layer 505 and a reshaping process 506.
403. And the electronic equipment responds to the fact that the image type information carried by the sample image pair is a second image type, image registration is carried out on the sample image pair on the basis of a second branch in the image registration network, a vector field is obtained, the vector field is a transformation parameter, and the second branch is used for non-rigid body registration.
After the electronic device acquires the sample image pair, the sample image pair may be input into an image registration network, the image registration network may read image type information carried by the sample image pair, and when the image type information is of a second image type, the image registration network may be input into a second branch, and the second branch performs an image registration step.
Similarly, in some embodiments, the image registration network may perform feature extraction on the sample image pair, input the extracted feature map into the second branch, and process the feature map by the second branch to obtain the vector field. A vector field here is a function that maps each position to a displacement vector.
Optionally, the processing of the feature map by the second branch may be: the second branch performs convolution processing on the feature map to obtain a vector field.
In some embodiments, the image registration network includes a feature extraction module and a registration module. The registration module includes a first branch and a second branch. After the electronic device inputs the sample image into the image registration network, a feature extraction module of the image registration network can perform feature extraction on the sample image to obtain a sample reference image and a feature map of the sample first image, then the feature map is input into a second branch of the registration module, and the second branch processes the feature map to obtain a vector field.
For the second branch, the second branch may further process the feature map to obtain a vector field, which is the transformation parameter for the first image. I.e. by analyzing the feature map, it is determined how to process the first image to transform it into the coordinate system of the reference image.
In some embodiments, the second branch may include a plurality of convolutional layers, which can be understood as a vector field decoder for converting the feature map into a vector field.
In some embodiments, the plurality of convolutional layers of the second branch and the convolutional layers in the feature extraction module are connected by skip connections, so that the decoder and encoder form a U-Net structure. In this way, when each convolutional layer decodes, its input includes not only the feature map finally output by the feature extraction module but also the output of a corresponding convolutional layer of the feature extraction module. The second branch is thus not solely dependent on the final output of the feature extraction module, and a more accurate vector field can be derived from more of the original data.
In some embodiments, the sample image pair is subjected to feature extraction to obtain one or more feature maps, and the second branch is subjected to convolution processing through the one or more feature maps to obtain a vector field.
For example, as shown in fig. 5, the image registration network may obtain a feature map after inputting the image to be registered (Mov Image) 501 and the reference image (Fixed Image) 502 into the feature extraction module 503, and the feature map can then be converted into a vector field 509 through the plurality of convolutional layers 508 of the second branch.
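A minimal PyTorch sketch of such a convolutional vector field decoder (layer widths and the 3D shapes are illustrative assumptions, and the U-Net skip connections are omitted for brevity):

```python
import torch
import torch.nn as nn

class FlowHead(nn.Module):
    """Sketch of the second branch: convolutional layers decoding the
    feature map into a dense displacement (vector) field."""
    def __init__(self, in_channels=256):
        super().__init__()
        self.decode = nn.Sequential(
            nn.Conv3d(in_channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv3d(64, 32, 3, padding=1), nn.ReLU(),
            nn.Conv3d(32, 3, 3, padding=1),  # 3 output channels: (dx, dy, dz)
        )
    def forward(self, feat):
        return self.decode(feat)

flow = FlowHead()(torch.randn(1, 256, 4, 8, 8))
print(flow.shape)  # torch.Size([1, 3, 4, 8, 8]): one vector per voxel
```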
The above-mentioned steps 402 and 403 are processes of performing image registration on the sample image pair based on a branch corresponding to the image type information in an image registration network according to the image type information carried by the sample image pair to obtain a transformation parameter, and the above-mentioned two steps are respectively a rigid body registration process and a non-rigid body registration process.
404. The electronic device trains network parameters of the image registration network based on the transformation parameters.
After the electronic device obtains the transformation parameters based on the image registration network, it may determine whether the transformation parameters obtained by the current network parameters are accurate based on the transformation parameters.
When the transformation parameters are measured accurately, the method can be realized in different modes. For example, if unsupervised training is adopted, the first image can be transformed through the transformation parameters, and whether the transformed second image is accurate or not is measured through the similarity between the transformed second image and the reference image. If the supervised training is adopted, the sample image pair can carry target transformation parameters, and the electronic equipment can measure whether the target transformation parameters are accurate or not by comparing the transformation parameters output by the image registration network with the target transformation parameters.
That is, the step 404 can be implemented by the following two ways:
in a first mode, the electronic device may process the sample first image based on the transformation parameter to obtain a target second image, and train a network parameter of the image registration network based on a similarity between the sample reference image and the target second image.
In the first mode, the sample first image is transformed by the obtained transformation parameters, and then the similarity between the target second image and the reference image obtained by the transformation processing is used to measure whether the transformation parameters are accurate.
It is to be understood that after the first branch and the second branch respectively obtain the affine matrix and the vector field, the first branch may perform an affine transformation on the first image based on the affine matrix to obtain the target second image, while the second branch may perform a deformable transformation on the first image based on the vector field to obtain the target second image. If the target second image is similar to the reference image, the transformation parameters are relatively accurate. If the target second image is not similar to the reference image, the transformation parameters are not accurate, and the network parameters need to be updated so that the first branch or the second branch obtains more accurate transformation parameters.
For example, as shown in fig. 5, after obtaining the affine matrix 507, the first branch may perform an affine transformation (Affine Transform) 510 on the image to be registered 501 through the affine matrix 507. For the second branch, after obtaining the vector field 509, it may perform a deformable transformation (Deformable Transform) 511 on the image to be registered 501 through the vector field 509. The transformed image is referred to here as the target second image.
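Both warping operations can be sketched with PyTorch's `affine_grid` and `grid_sample` (a minimal 2D sketch under the assumption that the displacement field is expressed in normalized grid coordinates):

```python
import torch
import torch.nn.functional as F

def affine_warp(moving, theta):
    """Affine-transform `moving` (N, 1, H, W) with matrix theta (N, 2, 3)."""
    grid = F.affine_grid(theta, moving.shape, align_corners=False)
    return F.grid_sample(moving, grid, align_corners=False)

def deformable_warp(moving, flow):
    """Deformably transform `moving` with a displacement field `flow`
    of shape (N, H, W, 2) given in normalized grid coordinates."""
    n = moving.shape[0]
    identity = torch.tensor([[[1., 0., 0.], [0., 1., 0.]]]).expand(n, 2, 3)
    grid = F.affine_grid(identity, moving.shape, align_corners=False)
    return F.grid_sample(moving, grid + flow, align_corners=False)

img = torch.rand(1, 1, 8, 8)
identity = torch.tensor([[[1., 0., 0.], [0., 1., 0.]]])
print(torch.allclose(affine_warp(img, identity), img, atol=1e-4))  # True
```

An identity affine matrix (or an all-zero displacement field) leaves the image unchanged, which is a quick sanity check for the warping code.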
And secondly, training the network parameters of the image registration network by the electronic equipment based on the transformation parameters and the target transformation parameters carried by the sample image pair.
In the second mode, training is performed in a supervised manner: the sample image pair carries target transformation parameters, which are the correct, ground-truth transformation parameters, that is, processing the sample first image with the target transformation parameters transforms it well into the coordinate system of the sample reference image. The electronic device can therefore judge whether the transformation parameters estimated by the network are accurate by comparing them with the target transformation parameters.
Wherein the target transformation parameters can be obtained by an open source registration tool (such as Elastix) or obtained from a historical registration record.
In the training process, the network parameters are essentially updated, and then the iterative process of registration is repeatedly executed. In training, a training target may be set, by which the timing of ending of training is determined.
In some embodiments, the electronic device may obtain, based on the transformation parameter, a value of an objective function indicating a degree of similarity between the sample reference image and the target second image, or indicating a degree of similarity between the transformation parameter and the target transformation parameter.
Taking as an example that the value of the objective function is used to indicate the similarity between the sample reference image and the target second image, the objective function may be an NCC (Normalized Cross Correlation) function, or an NMI (Normalized Mutual Information) function.
In some embodiments, the image registration network can image register single-modality image pairs as well as multi-modality image pairs. A pair of images of a single modality means that the pixel distribution of the images in the pair of images is the same or similar, or that the imaging devices are the same. A multimodal pair of images refers to the difference in pixel distribution of the images in the pair, or to the difference in imaging devices. For example, two images obtained by CT and MRI are a pair of images of the multi-modality.
In some embodiments, different objective functions may be employed for training for image pairs of different modalities. That is, the electronic device may acquire the similarity between the sample reference image and the sample second image by using a similarity acquisition manner corresponding to the modality information according to the modality information of the sample reference image and the sample first image.
In a first mode, in response to that the modality information of the reference image is the same as the modality information of the first image, acquiring a normalized cross-correlation coefficient between the sub-image set of the reference image and the second image, where the normalized cross-correlation coefficient is a similarity between the reference image and the second image.
The first method is to use the NCC function to obtain the similarity between the sample reference image and the sample second image for the single-mode image pair.
And secondly, acquiring normalized mutual information between the subgraph set of the reference image and the second image in response to the difference between the modal information of the reference image and the modal information of the first image, and acquiring the similarity between the reference image and the second image based on the normalized mutual information.
The second method uses the NMI function to obtain the similarity between the sample reference image and the sample second image for a multi-modal image pair.
For example, as shown in fig. 5, after the target second image is obtained by transforming the image to be registered, a Loss value, that is, a value of the target Function (NCC or NMI Loss Function), may be calculated based on the NCC or NMI Loss Function (Loss Function)512 based on the target second image and the reference image 502.
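A minimal sketch of the NCC-based loss value (computed globally over the whole image here for brevity, whereas the text computes it over sub-image sets; lower loss means more similar images):

```python
import torch

def ncc_loss(fixed, warped, eps=1e-8):
    """Loss based on normalized cross-correlation between two images."""
    f = fixed.flatten() - fixed.mean()
    w = warped.flatten() - warped.mean()
    ncc = torch.dot(f, w) / (f.norm() * w.norm() + eps)
    return -ncc  # minimizing the loss maximizes the similarity

x = torch.rand(1, 1, 16, 16)
print(round(float(ncc_loss(x, x)), 4))  # -1.0 for identical images
```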
For the training end time, the similarity between the sample reference image and the sample second image may be converged, or the iteration number may reach a preset number. The embodiments of the present application do not limit this.
In some embodiments, for the non-rigid body transformation, i.e., the registration result obtained through the second branch, the electronic device may measure the accuracy of the vector field by considering not only the above objective function but also the smoothness of the vector field transformation when determining the similarity between the sample reference image and the sample second image.
Specifically, in response to the transformation parameter being a vector field, that is, the sample first image undergoing a deformable transformation, the electronic device may obtain a smoothing constraint value based on the vector field, and then obtain the similarity between the sample reference image and the target second image based on the smoothing constraint value together with the NCC or NMI value.
In a specific example, the smoothing constraint value can be obtained by the following formula one:

L_smooth = \sum_{p} \| \nabla u(p) \|^2    (formula one)

where L_smooth is the smoothing constraint value and u(p) is the vector field (transform field) at position p; the constraint penalizes large spatial gradients of the field.

Thus, by adding the above smoothing constraint value, the overall objective function can be obtained as shown in the following formula two:

L_total = L_sim + L_smooth    (formula two)

where L_smooth is the smoothing constraint value, L_total is the value of the overall objective function, and L_sim is the value of the objective function obtained based on NCC or NMI.
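The gradient-based smoothing constraint on a displacement field can be sketched in PyTorch with finite differences (a minimal 2D sketch; the actual field may be 3D):

```python
import torch

def smoothness_loss(flow):
    """Smoothing constraint: mean squared finite difference of the
    displacement field u(p) along each spatial dimension.
    flow: (N, 2, H, W) 2D displacement field."""
    dy = flow[:, :, 1:, :] - flow[:, :, :-1, :]  # gradient along H
    dx = flow[:, :, :, 1:] - flow[:, :, :, :-1]  # gradient along W
    return (dy ** 2).mean() + (dx ** 2).mean()

constant = torch.ones(1, 2, 8, 8)        # spatially constant field
print(float(smoothness_loss(constant)))  # 0.0: perfectly smooth
```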
The training process is an iterative process, in each iterative process, the image registration network can process the input image pair and acquire the value of the target function, and if the value of the target function does not meet the target condition, the electronic equipment can optimize the parameters of the image registration network according to the value of the target function. If the value of the objective function meets the objective condition, the electronic device may determine that the training is finished, and use the network parameter used in the iteration as the final network parameter of the image registration network.
The optimization process may adopt various manners, for example, a gradient descent manner, and the electronic device may obtain a gradient of the network parameter according to a value of the objective function, update the network parameter based on the gradient, and obtain an updated network parameter. The updated network parameters may be used when the image registration network processes the input image pair at the next iteration.
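The gradient descent update described here can be illustrated with a toy one-parameter example (the parameter and the quadratic objective are stand-ins for the network parameters and the NCC/NMI objective):

```python
import torch

w = torch.tensor(2.0, requires_grad=True)  # stand-in network parameter
opt = torch.optim.SGD([w], lr=0.1)

loss = (w - 1.0) ** 2    # stand-in for the objective function value
opt.zero_grad()
loss.backward()          # gradient of the objective w.r.t. the parameter
opt.step()               # update: w <- w - lr * grad = 2.0 - 0.1 * 2.0
print(round(w.item(), 4))  # 1.8
```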
After the image registration network is trained, if the electronic device has an image registration requirement, the trained image registration network can be called to perform image registration. Specifically, the electronic device responds to an image registration instruction, acquires a reference image and a first image, and performs image registration on the reference image and the first image based on the trained image registration network to obtain an image registration result. The process of image registration by the image registration network is the same as the process of feature extraction, transformation parameter determination, and transformation processing in the above-mentioned step 402-404, and will not be described herein again.
The use of the image registration network is described in the embodiment shown in fig. 6 below. An online learning mode may also be adopted during use: based on target image pairs generated during use, combined with the original sample image pairs, the image registration network is updated and trained online, so that it is continuously optimized while in use. For details, refer to fig. 6; this is not described further here.
The image registration network trained by the embodiments of the present application supports multiple image registration modes. During training, image type information is set for each sample image pair, so that the network automatically selects the registration mode matching that image type information and trains the network parameters of that mode. The multiple registration modes are thus trained synchronously in one training process, which improves both the training efficiency and the applicability of the image registration network.
The above method describes an image registration network training process, and the following describes a use process of the image registration network.
Fig. 6 is a flowchart of an image registration method provided in an embodiment of the present application, and referring to fig. 6, the method includes the following steps.
601. The electronic device acquires a reference image and a first image.
In some embodiments, the reference image and the first image may be captured by another electronic device and transmitted to the electronic device; that is, when another electronic device has an image registration requirement, it may transmit the image pair to be registered to the electronic device.
In other embodiments, the reference image and the first image may be captured by the electronic device. For example, a certain object is photographed at different times, different lighting conditions or different angles.
For the two captured images, the electronic device may, in response to a setting instruction, set one of them as the reference image and the other as the first image.
The image registration network can perform image registration for both single-modality and multi-modality image pairs. A single-modality image pair means that the pixel distributions of the two images are the same or similar, or that the images come from the same imaging device. A multi-modality image pair means that the pixel distributions differ, or that the imaging devices differ; for example, a CT image and an MRI image form a multi-modality image pair. Accordingly, the reference image and the first image may be captured by the same imaging device or by different imaging devices.
602. The electronic device performs image registration on the reference image and the first image based on the image registration network to obtain an initial registration result.
When the electronic device has an image registration requirement, it can call the trained image registration network to perform the registration. The training process of the image registration network is described in the embodiment shown in fig. 4 above.
The image registration process of the network may be the same as the process of obtaining the target second image in steps 402-404 above.
Similarly, after acquiring the reference image and the first image, the electronic device may input them into the image registration network. The network performs feature extraction on the two images, processes the extracted feature map to obtain an initial transformation parameter, and transforms the first image based on that parameter to obtain an initial registration result. The initial registration result may include the initial second image obtained by registering the first image, and may also include the initial transformation parameter.
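The inference-style pipeline above (feature extraction, transformation-parameter prediction, transformation of the first image) can be sketched minimally as follows. The stub "network" functions and the pure-translation transform are illustrative assumptions only; the patent's network is a learned model, not these hand-written stand-ins.

```python
import numpy as np

def extract_features(ref, moving):
    """Stand-in for the network's feature extractor: stack and pool."""
    stacked = np.stack([ref, moving])
    return stacked.mean(axis=(1, 2))  # one pooled value per image

def predict_translation(features):
    """Stand-in for the registration head (hypothetical).

    A real network would regress transformation parameters from the
    features; this toy head returns a fixed 1-pixel shift.
    """
    return np.array([0, 1])

def warp_translate(img, shift):
    """Apply an integer translation (the 'transformation processing' step)."""
    return np.roll(img, shift=tuple(shift), axis=(0, 1))

ref = np.zeros((4, 4)); ref[1, 2] = 1.0
moving = np.roll(ref, -1, axis=1)            # moving image shifted left by one
feats = extract_features(ref, moving)
params = predict_translation(feats)          # "initial transformation parameters"
registered = warp_translate(moving, params)  # "initial second image"
print(np.array_equal(registered, ref))
```

The initial registration result here corresponds to the pair (registered image, predicted parameters), matching the two items the text says the result may contain.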
Similarly, in some embodiments the image registration network comprises a first branch for rigid registration and a second branch for non-rigid registration. In response to a rigid registration instruction, the electronic device performs image registration on the reference image and the first image based on the first branch of the network to obtain an initial registration result whose initial transformation parameter is an initial affine matrix. In response to a non-rigid registration instruction, the electronic device performs image registration on the reference image and the first image based on the second branch to obtain an initial registration result whose initial transformation parameter is an initial vector field.
The rigid body registration instruction and the non-rigid body registration instruction may be triggered by a registration operation of a user. The user can choose to perform rigid registration or non-rigid registration, and accordingly, the image registration network can perform the image registration step based on the corresponding branch.
Similarly, after the electronic device inputs the reference image and the first image into the image registration network, the network extracts a feature map from them. In response to a rigid registration instruction, the network feeds the feature map into the first branch, which processes it to obtain an initial affine matrix and performs affine transformation on the first image based on that matrix to obtain the initial registration result.
Alternatively, in response to a non-rigid registration instruction, the network feeds the extracted feature map into the second branch, which processes it to obtain an initial vector field and performs a strain transformation on the first image based on that field to obtain the initial registration result.
In some embodiments, the image registration network includes a feature extraction module and a registration module. The registration module includes the first branch and the second branch. The feature extraction module is used for extracting features of the reference image and the first image to obtain a feature map.
Similarly, in some embodiments, if rigid registration is performed, the first branch may include a Global Average Pooling (GAP) layer, a linear layer, and a reshaping layer. In a specific example, feature extraction on the reference image and the first image yields one or more feature maps, and the GAP converts each feature map into a single value, so that multiple feature maps become a vector. The first branch then processes this vector with the linear layer to obtain a shorter vector, and converts that vector into matrix form with the reshaping layer to obtain the initial affine matrix. The first branch may then perform affine transformation on the first image based on the initial affine matrix, yielding the initial second image.
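The GAP, linear layer, reshaping layer chain of the first branch can be illustrated as follows. This is a NumPy sketch with made-up feature-map sizes and random weights; for a 2D image, a full affine transform has 6 degrees of freedom, hence the 2x3 matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Suppose feature extraction produced C feature maps of size H x W.
C, H, W = 8, 16, 16
feature_maps = rng.normal(size=(C, H, W))

# Global Average Pooling: each feature map collapses to a single value,
# so the C maps become one vector of length C.
gap_vector = feature_maps.mean(axis=(1, 2))  # shape (C,)

# Linear layer: map the pooled vector to 6 numbers (2D affine has 6 DOF).
weight = rng.normal(size=(6, C)) * 0.01
bias = np.array([1, 0, 0, 0, 1, 0], dtype=float)  # bias near the identity transform
affine_flat = weight @ gap_vector + bias          # shape (6,)

# Reshaping layer: convert the 6-vector into a 2x3 affine matrix.
affine_matrix = affine_flat.reshape(2, 3)
print(affine_matrix.shape)  # (2, 3)
```

Initializing the bias near the identity transform is a common trick so the untrained branch starts from "no transformation"; whether the patent's network does this is not stated.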
In other embodiments, if non-rigid registration is performed, the second branch may comprise a plurality of convolutional layers, which can be understood as a vector-field decoder that converts the feature map into a vector field. In a specific example, feature extraction on the reference image and the first image yields one or more feature maps, which are convolved by the convolutional layers to obtain an initial vector field. The second branch may then perform a strain transformation on the first image based on the initial vector field, yielding the initial second image.
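Applying a dense vector field to an image (the "strain transformation") can be sketched as follows. Nearest-neighbour sampling and the (dy, dx) field layout are simplifying assumptions; real implementations typically use bilinear interpolation.

```python
import numpy as np

def warp_with_vector_field(img, field):
    """Warp img with a dense displacement field (nearest-neighbour sampling).

    field has shape (2, H, W): per-pixel (dy, dx) displacements. For each
    output pixel (y, x) the input is sampled at (y + dy, x + dx), clipped
    to the image border.
    """
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.round(ys + field[0]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + field[1]).astype(int), 0, w - 1)
    return img[src_y, src_x]

img = np.zeros((4, 4)); img[2, 3] = 1.0
field = np.zeros((2, 4, 4))
field[1, :, :] = 1.0  # every output pixel samples one column to the right
warped = warp_with_vector_field(img, field)
print(warped[2, 2] == 1.0)  # the bright pixel moved one column left
```

Unlike the 6-parameter affine matrix of the first branch, the vector field has two values per pixel, which is what lets non-rigid registration model locally varying deformations.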
603. In response to the initial registration result satisfying a condition, the electronic device updates the initial transformation parameters corresponding to the initial registration result.
The determination that the initial registration result satisfies the condition can be triggered by user operation, or the electronic device can analyze the initial registration result and trigger it itself. Specifically, either of the following two modes may be used.
In a first mode, in response to receiving an optimization instruction for the initial registration result, the step of updating the initial transformation parameters corresponding to the initial registration result is performed.
In the first mode, after obtaining the initial registration result, the electronic device may display the initial second image on screen, and the user judges whether it is accurate enough. If it is not, the user performs an optimization operation, which triggers the electronic device to execute the optimization step.
In a second mode, in response to that the similarity between the initial second image after the first image is registered and the reference image in the initial registration result is smaller than a first similarity threshold, the step of updating the initial transformation parameter corresponding to the initial registration result is performed.
In the second mode, after obtaining the initial registration result, the electronic device can analyze it: whether the initial registration result is accurate can be measured by the similarity between the registered initial second image and the reference image.
The first similarity threshold may be set by a related technician as required, which is not limited in the embodiment of the present application. The method for obtaining the similarity between the two images may be the same as that shown in step 404, or may adopt other methods, which is not limited in this embodiment of the present application.
Similarly to the embodiment shown in fig. 4, during rigid registration the transformation parameter corresponding to the initial registration result is an initial affine matrix, and the electronic device updates this initial affine matrix in response to the initial registration result satisfying the condition. During non-rigid registration the transformation parameter is an initial vector field, and the electronic device updates this initial vector field in response to the initial registration result satisfying the condition.
It should be noted that, in the embodiments of the present application, after updating the initial transformation parameters, the electronic device transforms the first image to obtain the second image and then determines the similarity between the second image and the reference image to decide whether the transformation parameters need further optimization. If they do, the optimization is the same as in step 603, and the electronic device repeats step 604 and the similarity-based decision. This is an iterative optimization process: over one or more iterations the transformation parameters are updated so that the second image obtained from them becomes more similar to the reference image and the registration result becomes more accurate.
604. And the electronic equipment carries out transformation processing on the first image based on the updated transformation parameters to obtain a second image.
Similarly to the embodiment shown in fig. 4, during rigid registration the electronic device may perform affine transformation on the first image based on the updated affine matrix to obtain the second image; during non-rigid registration, it may perform a strain transformation on the first image based on the updated vector field to obtain the second image.
Similarly to the embodiment shown in fig. 4, in response to the transformation parameter being a vector field (that is, the transformation applied to the first image is a strain transformation), the electronic device may obtain a smoothness-constraint value based on the vector field and then obtain the similarity between the reference image and the second image based on the smoothness-constraint value and the normalized mutual information.
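One common form of the smoothness constraint on a vector field is an L2 penalty on its spatial gradients; the patent does not specify the exact form, so the following is an assumed illustration.

```python
import numpy as np

def smoothness_constraint(field):
    """L2 penalty on the spatial gradients of a displacement field.

    field: array of shape (2, H, W). Returns the mean squared finite
    difference along both spatial axes -- 0 for a constant field, larger
    for a rapidly varying (non-smooth) one.
    """
    dy = np.diff(field, axis=1)
    dx = np.diff(field, axis=2)
    return (dy ** 2).mean() + (dx ** 2).mean()

constant = np.ones((2, 8, 8))
noisy = np.random.default_rng(0).normal(size=(2, 8, 8))
print(smoothness_constraint(constant))   # 0.0: a constant field is perfectly smooth
print(smoothness_constraint(noisy) > 0)  # True: a noisy field is penalized
```

Combining such a penalty with the NMI term discourages physically implausible, jagged deformations while still rewarding intensity alignment.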
605. In response to the similarity between the reference image and the second image satisfying a condition, the electronic device updates the transformation parameters, continues the transformation processing and parameter updating based on the updated parameters, and stops when a target condition is met, obtaining a target registration result.
The condition on the similarity can be set by relevant technicians as required. In some embodiments, the electronic device updates the transformation parameter in response to the similarity between the reference image and the second image being less than a second similarity threshold. The second similarity threshold may likewise be set as required, which is not limited in the embodiments of the present application.
The process of obtaining the similarity is similar to that shown in step 404 in the embodiment shown in fig. 4, and is not repeated here.
The procedures for updating the initial transformation parameters in step 603 and the transformation parameters in step 605 may be the same as the procedure for updating the network parameters in step 404, except that here the transformation parameters are updated. The initial transformation parameters may be updated based on the similarity between the initial second image and the reference image; in subsequent iterations, the electronic device updates the transformation parameters based on the similarity between the second image and the reference image.
The similarity between the second image and the reference image can be expressed by the value of the objective function, as in step 404 above. That is, the similarity between the second image and the reference image can be obtained by the objective function.
Similar to the embodiment shown in fig. 4, in some embodiments, the similarity obtaining step may be performed by using different objective functions for image pairs of different modalities. The acquisition process of the similarity between the reference image and the second image may be determined from the modality information of both images. Specifically, the electronic device may acquire the similarity between the reference image and the second image by using a similarity acquisition manner corresponding to the modality information according to the modality information of the reference image and the first image.
In some embodiments, in response to the modality information of the reference image being the same as that of the first image, the electronic device obtains the normalized cross-correlation coefficient between the sub-image set of the reference image and the second image; this coefficient is the similarity between the reference image and the second image. That is, when the reference image and the first image form a single-modality pair, the NCC function can be used to determine the similarity.
In response to the modality information of the reference image differing from that of the first image, the electronic device acquires the normalized mutual information between the sub-image set of the reference image and the second image, and obtains the similarity between the reference image and the second image based on it. That is, when the reference image and the first image form a multi-modality pair, the NMI function can be used to determine the similarity.
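The two similarity measures named above can be sketched as follows. These are simplified whole-image versions (global NCC and histogram-based NMI); the patent's sub-image-set formulation is not reproduced here.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation: ~1 for images identical up to an
    affine intensity change, suited to single-modality pairs."""
    a = (a - a.mean()) / (a.std() + 1e-8)
    b = (b - b.mean()) / (b.std() + 1e-8)
    return (a * b).mean()

def nmi(a, b, bins=32):
    """Normalized mutual information, (H(A) + H(B)) / H(A, B), computed
    from a joint intensity histogram; suited to multi-modality pairs."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1)
    py = pxy.sum(axis=0)
    def entropy(p):
        p = p[p > 0]
        return -(p * np.log(p)).sum()
    return (entropy(px) + entropy(py)) / entropy(pxy)

rng = np.random.default_rng(0)
img = rng.uniform(size=(32, 32))
print(round(ncc(img, img), 4))  # ~1.0: identical single-modality images
print(nmi(img, img) > nmi(img, rng.uniform(size=(32, 32))))  # True
```

NCC assumes a roughly linear intensity relationship between the images, which fails across modalities (e.g. CT vs. MRI); NMI only assumes a statistical dependency between intensity distributions, which is why it is the cross-modality choice.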
After obtaining the value of the objective function, in the same way as in step 404, the electronic device may obtain the gradient of the transformation parameters from that value and update the transformation parameters based on the gradient. The first image is then transformed in the next iteration using the updated transformation parameters.
Steps 603 and 604 above form an iterative process: when it is determined that another iteration is needed, a transformation-parameter updating process similar to step 603 is executed, step 604 is executed again, and whether the iteration has finished is determined.
The condition for ending the iteration, i.e., the target condition, includes either the number of iterations reaching a target number or the similarity converging. It can be set by relevant technicians as required; the target number may be set by a technician, or obtained from a setting operation by the user.
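The iterative refinement with both stopping conditions (target iteration count reached, or similarity converged) can be sketched with a toy one-parameter "registration" problem; the quadratic loss and learning rate are assumptions for illustration.

```python
import numpy as np

def iterative_refine(params, loss_and_grad, lr=0.1, max_iters=100, tol=1e-6):
    """Iteratively optimize transformation parameters.

    Stops when either the target number of iterations is reached or the
    objective has converged (change between iterations below tol).
    """
    prev = np.inf
    for i in range(max_iters):
        loss, grad = loss_and_grad(params)
        if abs(prev - loss) < tol:   # similarity has converged
            break
        prev = loss
        params = params - lr * grad  # gradient step on the parameters
    return params, i + 1

# Toy objective: registration is "perfect" when the shift parameter is 3.0.
loss_and_grad = lambda p: ((p - 3.0) ** 2, 2.0 * (p - 3.0))
best, iters = iterative_refine(np.array(0.0), loss_and_grad)
print(float(best))  # close to 3.0
```

Note that both exits are needed in practice: the iteration cap bounds runtime for hard cases, while the convergence test avoids wasted iterations once the result stops improving.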
For example, as shown in fig. 7, the reference image and the first image may first be registered by the image registration network to obtain an initial solution, i.e., the initial registration result, whose quality is then judged. If the initial registration result is good, registration ends directly without optimization; if it is poor, iterative optimization is performed to obtain the target registration result.
Steps 603 to 605 form an iterative optimization process. In some embodiments, updating the transformation parameters may also be implemented by updating the network parameters of the image registration network; that is, the electronic device may use the network itself for optimization. For example, in step 603 the electronic device may input the reference image and the initial second image of the initial registration result into the image registration network again. The network processes this image pair, the similarity is obtained through the objective function based on the network's output registration result, the network parameters are updated based on that similarity, and the reference image and the first image are registered again with the updated parameters. Repeating this over multiple iterations yields a better registration result. This iteration is similar to the training iteration of the image registration network and is not repeated here.
When iterative optimization is performed in this way, the network parameters of the image registration network change. The electronic device may then reacquire the sample image pairs of step 601 and train the network together with the target image pair composed of the reference image and the first image, obtaining an updated image registration network. Through this online learning mode, the network is trained during use on both new image pairs and the original training image pairs, so that the network parameters are continuously optimized and the performance and generalization ability of the network keep improving.
In other embodiments, the iterative optimization described above is not implemented through the image registration network; instead, many new image pairs, referred to here as target image pairs, are collected during use. For example, a target image pair is an image pair input to the image registration network during its use, i.e., the reference image and the first image. The network can then be further trained on these image pairs together with the original sample image pairs, so that it acquires the ability to register the image pairs generated during use while retaining its ability to register the original sample image pairs.
Specifically, in response to an update instruction for the image registration network, the electronic device acquires batch data comprising the target image pairs and the sample image pairs, and trains the trained image registration network on this batch data to obtain the updated image registration network.
The update instruction for the image registration network may be triggered periodically, or when the number of uses of the network reaches a threshold. For example, an update period may be set, with update training performed once per period: the electronic device takes the image pairs input to the network during that period as target image pairs, together with the sample image pairs used when the network was previously trained, and trains the network on them. The batch data thus consists of the image pairs input to the network during the update period plus the sample image pairs used in previous training. In a specific example, the network is updated once per day: at a fixed time each day, the electronic device trains the network on the image pairs input during the previous day combined with the sample image pairs used in previous training.
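Assembling the batch data from new target image pairs plus the original sample image pairs might look like the following sketch; the pair representation (string placeholders), batch size, and shuffling are assumptions for illustration.

```python
import random

def build_batch_data(target_pairs, sample_pairs, batch_size=4, seed=0):
    """Mix newly collected target image pairs with the original sample pairs.

    Training on this mixture lets the network learn the new data while
    retaining its ability to register the original training distribution.
    """
    pool = list(target_pairs) + list(sample_pairs)
    rng = random.Random(seed)
    rng.shuffle(pool)  # interleave new and old pairs within each batch
    return [pool[i:i + batch_size] for i in range(0, len(pool), batch_size)]

# Hypothetical identifiers standing in for actual (reference, first) image pairs.
target_pairs = [("ref_new_%d" % i, "img_new_%d" % i) for i in range(3)]
sample_pairs = [("ref_old_%d" % i, "img_old_%d" % i) for i in range(5)]
batches = build_batch_data(target_pairs, sample_pairs)
print(len(batches))  # 2 batches of 4 pairs each
```

Mixing old and new pairs in every batch, rather than training on the new pairs alone, is the standard guard against catastrophic forgetting in this kind of online update.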
For example, the image registration method described above may be applied to surgical navigation, where a patient is typically scanned in multiple modalities, such as CT (Computed Tomography) and MRI (Magnetic Resonance Imaging). A preoperative planning system requires registration of the multiple scan images, and registration accuracy is crucial because it indirectly determines navigation accuracy. Through image registration, findings about the patient's condition can be mapped across the images, so that the condition can be understood accurately. Because diseases are diverse, the images vary greatly. To ensure algorithm speed, inference-type registration is applied first: if the initial solution is good enough, registration ends, and if the initialization is not accurate enough, the initial solution is further optimized iteratively.
The method combines inference-type and iterative registration, balancing speed and accuracy and making full use of hardware acceleration on various devices. It also improves the generality of the algorithm: it can be used for single-modality, multi-modality, rigid, and non-rigid registration tasks. The registration network adopts a dynamic structure and switches among multiple tasks, and it is not limited to images of a particular dimension; that is, it applies equally to two-dimensional and three-dimensional registration tasks.
After an image is registered through the image registration network, a way of further optimizing the initial registration result is provided, so that when the initial registration result is not accurate enough, the optimization process makes the registration result more accurate and alleviates the insufficient generalization of the image registration network on new data sets. During optimization, the transformation parameters are continuously adjusted so that the similarity between the reference image and the second image increases and a more accurate registration result is obtained.
All of the above optional technical solutions may be combined arbitrarily to form optional embodiments of the present application, and are not described again here.
Fig. 8 is a schematic structural diagram of an image registration apparatus provided in an embodiment of the present application, and referring to fig. 8, the apparatus includes:
a registration module 801, configured to perform image registration on the reference image and the first image based on an image registration network to obtain an initial registration result;
an updating module 802, configured to update an initial transformation parameter corresponding to the initial registration result in response to that the initial registration result satisfies a condition;
a transformation module 803, configured to perform transformation processing on the first image based on the updated transformation parameter to obtain a second image;
the updating module 802 and the transforming module 803 are further configured to update the transformation parameter in response to that the similarity between the reference image and the second image satisfies a condition, continue to perform transformation processing and update the transformation parameter based on the updated transformation parameter, and stop until a target condition is satisfied, so as to obtain a target registration result.
In some embodiments, the update module 802 is configured to perform any of:
in response to receiving an optimization instruction for the initial registration result, executing the step of updating the initial transformation parameters corresponding to the initial registration result;
and in response to that the similarity between the initial second image after the first image is registered and the reference image in the initial registration result is smaller than a first similarity threshold, executing the step of updating the initial transformation parameters corresponding to the initial registration result.
In some embodiments, the image registration network comprises a first branch for rigid body registration and a second branch for non-rigid body registration;
the registration module 801 is configured to:
responding to a rigid body registration instruction, and carrying out image registration on the reference image and the first image based on a first branch of the image registration network to obtain an initial registration result, wherein an initial transformation parameter corresponding to the initial registration result is an initial affine matrix;
and responding to a non-rigid body registration instruction, and performing image registration on the reference image and the first image based on a second branch of the image registration network to obtain an initial registration result, wherein an initial transformation parameter corresponding to the initial registration result is an initial vector field.
In some embodiments, the update module 802 is configured to perform any of:
in response to the initial registration result meeting a condition, updating an initial affine matrix corresponding to the initial registration result;
updating an initial vector field corresponding to the initial registration result in response to the initial registration result satisfying a condition;
the transformation module 803 is configured to perform any of the following:
carrying out affine transformation on the first image based on the updated affine matrix to obtain a second image;
and carrying out strain transformation on the first image based on the updated vector field to obtain a second image.
In some embodiments, the update module 802 is configured to update the transformation parameter in response to a similarity between the reference image and the second image being less than a second similarity threshold.
In some embodiments, the obtaining of the similarity between the reference image and the second image comprises:
and according to the modal information of the reference image and the second image, acquiring the similarity between the reference image and the second image by adopting a similarity acquisition mode corresponding to the modal information.
In some embodiments, the obtaining, according to the modality information of the reference image and the second image, the similarity between the reference image and the second image in a similarity obtaining manner corresponding to the modality information includes:
acquiring a normalized cross-correlation coefficient between the sub-image set of the reference image and the second image in response to the modality information of the reference image and the modality information of the second image being the same, wherein the normalized cross-correlation coefficient is a similarity between the reference image and the second image;
acquiring normalized mutual information between the sub-image set of the reference image and the second image in response to the difference between the modal information of the reference image and the modal information of the second image, and acquiring the similarity between the reference image and the second image based on the normalized mutual information.
In some embodiments, the obtaining the similarity between the reference image and the second image based on the normalized mutual information includes:
in response to the transformation parameter being a vector field, transforming the first image into a strain transformation, and obtaining a smooth constraint value based on the vector field;
and acquiring the similarity between the reference image and the second image based on the smooth constraint value and the normalized mutual information.
After an image is registered through the image registration network, a way of further optimizing the initial registration result is provided, so that when the initial registration result is not accurate enough, the optimization process makes the registration result more accurate and alleviates the insufficient generalization of the image registration network on new data sets. During optimization, the transformation parameters are continuously adjusted so that the similarity between the reference image and the second image increases and a more accurate registration result is obtained.
It should be noted that the image registration apparatus provided in the above embodiment is illustrated only by the division of the above functional modules. In practical applications, the above functions may be allocated to different functional modules as needed; that is, the internal structure of the image registration apparatus may be divided into different functional modules to complete all or part of the functions described above. In addition, the image registration apparatus and the image registration method provided by the above embodiments belong to the same concept; the specific implementation is detailed in the method embodiments and is not repeated here.
Fig. 9 is a schematic structural diagram of an image registration network training apparatus provided in an embodiment of the present application, and referring to fig. 9, the apparatus includes:
an obtaining module 901, configured to obtain a sample image pair, where the sample image pair includes a sample reference image and a sample first image, and the sample image pair carries image type information;
a registration module 902, configured to perform image registration on the sample image pair based on a branch corresponding to the image type information in an image registration network according to the image type information carried by the sample image pair, so as to obtain a transformation parameter;
a training module 903, configured to train network parameters of the image registration network based on the transformation parameters.
In some embodiments, the training module 903 is configured to perform any of the following:
processing the sample first image based on the transformation parameters to obtain a target second image, and training network parameters of the image registration network based on the similarity between the sample reference image and the target second image;
and training the network parameters of the image registration network based on the transformation parameters and the target transformation parameters carried by the sample image pair.
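The two training options above can be sketched as two loss functions: an unsupervised loss driven by the similarity between the sample reference image and the target second image, and a supervised loss driven by the target transformation parameters carried by the sample pair. The concrete measures chosen here (1 − NCC and mean squared error) are assumptions for illustration, not the ones specified by the application.

```python
import numpy as np

def unsupervised_loss(sample_reference, target_second):
    """Option 1: train from image similarity alone — here 1 - NCC, so that
    more-similar images give a smaller loss."""
    a = (sample_reference - sample_reference.mean()) / (sample_reference.std() + 1e-12)
    b = (target_second - target_second.mean()) / (target_second.std() + 1e-12)
    return 1.0 - (a * b).mean()

def supervised_loss(predicted_params, target_params):
    """Option 2: train against target transformation parameters carried by
    the sample image pair (mean squared error is an assumption)."""
    return float(np.mean((predicted_params - target_params) ** 2))
```

Either loss can be minimized by standard gradient-based training of the network parameters; the choice depends on whether ground-truth transformation parameters are available for the sample pairs.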
In some embodiments, the image type information comprises a first image type and a second image type;
the registration module 902 is configured to:
in response to the image type information carried by the sample image pair being the first image type, performing image registration on the sample image pair based on a first branch of the image registration network to obtain an affine matrix, the affine matrix serving as the transformation parameter, and the first branch being used for rigid-body registration;
and in response to the image type information carried by the sample image pair being the second image type, performing image registration on the sample image pair based on a second branch of the image registration network to obtain a vector field, the vector field serving as the transformation parameter, and the second branch being used for non-rigid-body registration.
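A minimal sketch of the dispatch between the two branches, assuming 2-D images. The branch bodies are placeholders — a real network would regress the affine parameters or predict a dense displacement field — and only the output shapes follow the description (a 3x3 affine matrix vs. a per-pixel vector field).

```python
import numpy as np

def register(image_pair, image_type):
    """Route a sample pair to the branch matching its image type info."""
    reference, moving = image_pair
    if image_type == "first":       # rigid-body branch -> affine matrix
        return first_branch(reference, moving)
    elif image_type == "second":    # non-rigid branch -> vector field
        return second_branch(reference, moving)
    raise ValueError(f"unknown image type: {image_type}")

def first_branch(reference, moving):
    # Placeholder: a real branch would regress the affine parameters.
    return np.eye(3)

def second_branch(reference, moving):
    # Placeholder: a real branch would predict a dense displacement field.
    return np.zeros(reference.shape + (2,))
```

Because both branches live in one network and share this dispatch, a single training loop can train both registration modes, selecting the branch per sample from its image type information.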
In some embodiments, the obtaining module 901 is further configured to, in response to an update instruction of an image registration network, obtain batch data, where the batch data includes a target image pair and the sample image pair, and the target image pair is an image pair obtained by optimizing an initial registration result output by the image registration network during use of the image registration network;
the training module 903 is further configured to train the trained image registration network based on the batch data to obtain an updated image registration network.
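A minimal sketch of assembling the batch data for such an update, mixing deployment-time target pairs with the original sample pairs. The mixing ratio, batch size, and shuffling strategy are assumptions; the application only requires that both kinds of pair appear in the batch data.

```python
import random

def build_update_batch(target_pairs, sample_pairs, batch_size=8, seed=0):
    """Mix optimized target pairs collected during deployment with the
    original sample pairs into one training batch."""
    rng = random.Random(seed)
    pool = ([(p, "target") for p in target_pairs]
            + [(p, "sample") for p in sample_pairs])
    rng.shuffle(pool)
    return pool[:batch_size]
```

Fine-tuning the already-trained network on such mixed batches lets it absorb the deployment-time corrections without forgetting the original training distribution.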
The image registration network trained by the embodiments of the present application supports multiple image registration modes. During training, image type information is set for each sample image pair, so that the network automatically selects the registration mode matching that information and trains the network parameters of the corresponding branch. The multiple registration modes are thus trained synchronously, which improves both the training efficiency and the applicability of the image registration network.
It should be noted that the image registration network training apparatus provided in the above embodiment is described using the above division of functional modules only as an example; in practical applications, the functions may be allocated to different functional modules as needed, that is, the internal structure of the image registration network training apparatus may be divided into different functional modules to implement all or part of the functions described above. In addition, the image registration network training apparatus and the image registration network training method provided by the above embodiments belong to the same concept; their specific implementation is detailed in the method embodiments and is not repeated here.
Fig. 10 is a schematic structural diagram of an electronic device 1000 according to an embodiment of the present application. The electronic device 1000 may vary considerably in configuration and performance, and includes one or more processors (CPUs) 1001 and one or more memories 1002, where the memory 1002 stores at least one computer program that is loaded and executed by the processor 1001 to implement the image registration method or the image registration network training method provided by the foregoing method embodiments. The electronic device further includes other components for implementing device functions, for example, a wired or wireless network interface and an input/output interface for input and output, which are not described in detail here.
The electronic device in the above method embodiments may be implemented as a terminal. For example, fig. 11 is a block diagram of a terminal according to an embodiment of the present application. The terminal 1100 may be a portable mobile terminal, such as a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. Terminal 1100 may also be referred to by other names, such as user equipment, portable terminal, laptop terminal, or desktop terminal.
In general, terminal 1100 includes: a processor 1101 and a memory 1102.
Processor 1101 may include one or more processing cores, such as a 4-core or 8-core processor. The processor 1101 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), or a PLA (Programmable Logic Array). The processor 1101 may also include a main processor and a coprocessor: the main processor, also called a CPU (Central Processing Unit), processes data in the awake state, while the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 1101 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering the content that the display screen needs to display. In some embodiments, the processor 1101 may further include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
Memory 1102 may include one or more computer-readable storage media, which may be non-transitory. Memory 1102 can also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 1102 is used to store at least one instruction for execution by processor 1101 to implement an image registration method or an image registration network training method provided by method embodiments herein.
In some embodiments, the terminal 1100 may further include: a peripheral interface 1103 and at least one peripheral. The processor 1101, memory 1102 and peripheral interface 1103 may be connected by a bus or signal lines. Various peripheral devices may be connected to the peripheral interface 1103 by buses, signal lines, or circuit boards. Specifically, the peripheral device includes: at least one of radio frequency circuitry 1104, display screen 1105, camera assembly 1106, audio circuitry 1107, positioning assembly 1108, and power supply 1109.
The peripheral interface 1103 may be used to connect at least one peripheral associated with I/O (Input/Output) to the processor 1101 and the memory 1102. In some embodiments, the processor 1101, memory 1102, and peripheral interface 1103 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1101, the memory 1102 and the peripheral device interface 1103 may be implemented on separate chips or circuit boards, which is not limited by this embodiment.
The Radio Frequency circuit 1104 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 1104 communicates with communication networks and other communication devices via electromagnetic signals. The radio frequency circuit 1104 converts an electric signal into an electromagnetic signal to transmit, or converts a received electromagnetic signal into an electric signal. Optionally, the radio frequency circuit 1104 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 1104 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: the world wide web, metropolitan area networks, intranets, generations of mobile communication networks (2G, 3G, 4G, and 5G), Wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the rf circuit 1104 may further include NFC (Near Field Communication) related circuits, which are not limited in this application.
The display screen 1105 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 1105 is a touch display screen, the display screen 1105 also has the ability to capture touch signals on or over the surface of the display screen 1105. The touch signal may be input to the processor 1101 as a control signal for processing. At this point, the display screen 1105 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, display 1105 may be one, disposed on a front panel of terminal 1100; in other embodiments, the display screens 1105 can be at least two, respectively disposed on different surfaces of the terminal 1100 or in a folded design; in other embodiments, display 1105 can be a flexible display disposed on a curved surface or on a folded surface of terminal 1100. Even further, the display screen 1105 may be arranged in a non-rectangular irregular pattern, i.e., a shaped screen. The Display screen 1105 may be made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), and the like.
Camera assembly 1106 is used to capture images or video. Optionally, camera assembly 1106 includes a front camera and a rear camera. Generally, a front camera is disposed at a front panel of the terminal, and a rear camera is disposed at a rear surface of the terminal. In some embodiments, the number of the rear cameras is at least two, and each rear camera is any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, camera assembly 1106 may also include a flash. The flash lamp can be a monochrome temperature flash lamp or a bicolor temperature flash lamp. The double-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp, and can be used for light compensation at different color temperatures.
The audio circuitry 1107 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 1101 for processing or inputting the electric signals to the radio frequency circuit 1104 to achieve voice communication. For stereo capture or noise reduction purposes, multiple microphones may be provided, each at a different location of terminal 1100. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 1101 or the radio frequency circuit 1104 into sound waves. The loudspeaker can be a traditional film loudspeaker or a piezoelectric ceramic loudspeaker. When the speaker is a piezoelectric ceramic speaker, the speaker can be used for purposes such as converting an electric signal into a sound wave audible to a human being, or converting an electric signal into a sound wave inaudible to a human being to measure a distance. In some embodiments, the audio circuitry 1107 may also include a headphone jack.
Positioning component 1108 is used to locate the current geographic position of terminal 1100 for navigation or LBS (Location Based Service). The positioning component 1108 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, or the Galileo system of the European Union.
Power supply 1109 is configured to provide power to various components within terminal 1100. The power supply 1109 may be alternating current, direct current, disposable or rechargeable. When the power supply 1109 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. The wired rechargeable battery is a battery charged through a wired line, and the wireless rechargeable battery is a battery charged through a wireless coil. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, terminal 1100 can also include one or more sensors 1110. The one or more sensors 1110 include, but are not limited to: acceleration sensor 1111, gyro sensor 1112, pressure sensor 1113, fingerprint sensor 1114, optical sensor 1115, and proximity sensor 1116.
Acceleration sensor 1111 may detect acceleration levels in three coordinate axes of a coordinate system established with terminal 1100. For example, the acceleration sensor 1111 may be configured to detect components of the gravitational acceleration in three coordinate axes. The processor 1101 may control the display screen 1105 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 1111. The acceleration sensor 1111 may also be used for acquisition of motion data of a game or a user.
The gyro sensor 1112 may detect a body direction and a rotation angle of the terminal 1100, and the gyro sensor 1112 may cooperate with the acceleration sensor 1111 to acquire a 3D motion of the user with respect to the terminal 1100. From the data collected by gyroscope sensor 1112, processor 1101 may implement the following functions: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.
Pressure sensor 1113 may be disposed on a side bezel of terminal 1100 and/or at a lower layer of the display screen 1105. When the pressure sensor 1113 is disposed on the side bezel of the terminal 1100, it can detect the user's holding signal on the terminal 1100, and the processor 1101 performs left/right-hand recognition or shortcut operations according to the holding signal collected by the pressure sensor 1113. When the pressure sensor 1113 is disposed at the lower layer of the display screen 1105, the processor 1101 controls the operability controls on the UI according to the user's pressure operation on the display screen 1105. The operability controls include at least one of a button control, a scroll-bar control, an icon control, and a menu control.
The fingerprint sensor 1114 is used to collect the user's fingerprint, and the processor 1101 (or the fingerprint sensor 1114 itself) identifies the user according to the collected fingerprint. When the user's identity is recognized as trusted, the processor 1101 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings. The fingerprint sensor 1114 may be disposed on the front, back, or side of the terminal 1100. When a physical button or a vendor logo is provided on the terminal 1100, the fingerprint sensor 1114 may be integrated with the physical button or the vendor logo.
Optical sensor 1115 is used to collect ambient light intensity. In one embodiment, the processor 1101 may control the display brightness of the display screen 1105 based on the ambient light intensity collected by the optical sensor 1115. Specifically, when the ambient light intensity is high, the display brightness of the display screen 1105 is increased; when the ambient light intensity is low, the display brightness of the display screen 1105 is reduced. In another embodiment, processor 1101 may also dynamically adjust the shooting parameters of camera assembly 1106 based on the ambient light intensity collected by optical sensor 1115.
Proximity sensor 1116, also referred to as a distance sensor, is typically disposed on the front panel of terminal 1100. The proximity sensor 1116 is used to measure the distance between the user and the front face of terminal 1100. In one embodiment, when the proximity sensor 1116 detects that this distance gradually decreases, the processor 1101 controls the display screen 1105 to switch from the screen-on state to the screen-off state; when the proximity sensor 1116 detects that the distance gradually increases, the processor 1101 controls the display screen 1105 to switch from the screen-off state to the screen-on state.
Those skilled in the art will appreciate that the configuration shown in fig. 11 does not constitute a limitation of terminal 1100, and may include more or fewer components than those shown, or may combine certain components, or may employ a different arrangement of components.
The electronic device in the above method embodiments may be implemented as a server. For example, fig. 12 is a schematic structural diagram of a server 1200, which may vary considerably in configuration and performance and includes one or more processors (CPUs) 1201 and one or more memories 1202, where the memory 1202 stores at least one computer program that is loaded and executed by the processor 1201 to implement the image registration method or the image registration network training method provided by the above method embodiments. Certainly, the server may also have a wired or wireless network interface, an input/output interface, and other components for implementing device functions, which are not described again here.
In an exemplary embodiment, a computer readable storage medium, such as a memory including at least one computer program, executable by a processor to perform the image registration method or the image registration network training method of the above embodiments, is also provided. For example, the computer readable storage medium is a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product or a computer program is also provided, which comprises one or more program codes stored in a computer-readable storage medium. The one or more program codes are read from the computer-readable storage medium by one or more processors of the electronic device, and the one or more processors execute the one or more program codes to cause the electronic device to perform the image registration method or the image registration network training method described above.
In some embodiments, the computer program according to the embodiments of the present application may be deployed to be executed on one computer device, on multiple computer devices located at one site, or on multiple computer devices distributed at multiple sites and interconnected by a communication network; the multiple computer devices distributed at multiple sites and interconnected by a communication network may constitute a blockchain system.
Those skilled in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing relevant hardware; the program may be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disk.
The above description is intended only to be an alternative embodiment of the present application, and not to limit the present application, and any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (15)

1. A method of image registration, the method comprising:
based on the image registration network, carrying out image registration on the reference image and the first image to obtain an initial registration result;
updating an initial transformation parameter corresponding to the initial registration result in response to the initial registration result satisfying a condition;
performing transformation processing on the first image based on the updated transformation parameters to obtain a second image;
and in response to the similarity between the reference image and the second image satisfying a condition, updating the transformation parameters, and continuing to perform transformation processing and update the transformation parameters based on the updated transformation parameters until a target condition is satisfied, to obtain a target registration result.
2. The method according to claim 1, wherein the updating the initial transformation parameters corresponding to the initial registration result in response to the initial registration result satisfying a condition includes any one of:
in response to receiving an optimization instruction for the initial registration result, executing the step of updating the initial transformation parameters corresponding to the initial registration result;
and in response to the similarity between the reference image and the initial second image obtained by registering the first image in the initial registration result being smaller than a first similarity threshold, performing the step of updating the initial transformation parameters corresponding to the initial registration result.
3. The method of claim 1, wherein the image registration network comprises a first branch for rigid body registration and a second branch for non-rigid body registration;
the image registration of the reference image and the first image based on the image registration network to obtain an initial registration result comprises:
responding to a rigid body registration instruction, and carrying out image registration on the reference image and the first image based on a first branch of the image registration network to obtain an initial registration result, wherein an initial transformation parameter corresponding to the initial registration result is an initial affine matrix;
and responding to a non-rigid body registration instruction, and performing image registration on the reference image and the first image based on a second branch of the image registration network to obtain an initial registration result, wherein an initial transformation parameter corresponding to the initial registration result is an initial vector field.
4. The method according to claim 3, wherein the updating the initial transformation parameters corresponding to the initial registration result in response to the initial registration result satisfying a condition includes any one of:
in response to the initial registration result meeting a condition, updating an initial affine matrix corresponding to the initial registration result;
updating an initial vector field corresponding to the initial registration result in response to the initial registration result satisfying a condition;
the performing transformation processing on the first image based on the updated transformation parameters to obtain a second image comprises any one of:
carrying out affine transformation on the first image based on the updated affine matrix to obtain a second image;
and carrying out strain transformation on the first image based on the updated vector field to obtain a second image.
5. The method of claim 1, wherein the updating the transformation parameter in response to the similarity between the reference image and the second image satisfying a condition comprises:
updating the transformation parameter in response to a similarity between the reference image and the second image being less than a second similarity threshold.
6. The method according to claim 1, wherein the obtaining of the similarity between the reference image and the second image comprises:
acquiring, according to the modality information of the reference image and the first image, the similarity between the reference image and the second image in a similarity acquisition manner corresponding to the modality information.
7. The method according to claim 6, wherein the obtaining the similarity between the reference image and the second image by using a similarity obtaining manner corresponding to the modality information according to the modality information of the reference image and the first image comprises:
acquiring a normalized cross-correlation coefficient between the sub-image set of the reference image and the second image in response to the modality information of the reference image being the same as the modality information of the first image, the normalized cross-correlation coefficient being a similarity between the reference image and the second image;
and in response to the modality information of the reference image being different from the modality information of the first image, acquiring normalized mutual information between the sub-image set of the reference image and the second image, and acquiring the similarity between the reference image and the second image based on the normalized mutual information.
8. The method of claim 7, wherein the obtaining the similarity between the reference image and the second image based on the normalized mutual information comprises:
in response to the transformation parameter being a vector field and the transformation applied to the first image being a strain transformation, acquiring a smoothness constraint value based on the vector field;
and acquiring the similarity between the reference image and the second image based on the smoothness constraint value and the normalized mutual information.
9. An image registration network training method, the method comprising:
acquiring a sample image pair, wherein the sample image pair comprises a sample reference image and a sample first image, and the sample image pair carries image type information;
according to the image type information carried by the sample image pair, carrying out image registration on the sample image pair based on the branch corresponding to the image type information in an image registration network to obtain transformation parameters, wherein the image registration functions of different branches are different;
training network parameters of the image registration network based on the transformation parameters.
10. The method of claim 9, wherein the image type information comprises a first image type and a second image type;
the image registration of the sample image pair based on the branch corresponding to the image type information in the image registration network according to the image type information carried by the sample image pair to obtain the transformation parameters includes:
responding to the image type information carried by the sample image pair as a first image type, and carrying out image registration on the sample image pair based on a first branch in the image registration network to obtain an affine matrix, wherein the affine matrix is a transformation parameter, and the first branch is used for rigid body registration;
and in response to that the image type information carried by the sample image pair is a second image type, carrying out image registration on the sample image pair based on a second branch in the image registration network to obtain a vector field, wherein the vector field is a transformation parameter, and the second branch is used for non-rigid body registration.
11. The method of claim 9, further comprising:
in response to an update instruction of an image registration network, acquiring batch data, wherein the batch data comprises a target image pair and the sample image pair, and the target image pair is the image pair input into the image registration network in the use process of the image registration network;
and training the trained image registration network based on the batch data to obtain an updated image registration network.
12. An image registration apparatus, characterized in that the apparatus comprises:
the registration module is used for carrying out image registration on the reference image and the first image based on the image registration network to obtain an initial registration result;
an updating module, configured to update an initial transformation parameter corresponding to the initial registration result in response to that the initial registration result satisfies a condition;
the transformation module is used for carrying out transformation processing on the first image based on the updated transformation parameters to obtain a second image;
the updating module and the transformation module are further configured to: in response to the similarity between the reference image and the second image satisfying a condition, update the transformation parameters, and continue the transformation processing and parameter updating based on the updated transformation parameters until a target condition is satisfied, to obtain a target registration result.
13. An image registration apparatus, characterized in that the apparatus comprises:
an acquisition module, configured to acquire a sample image pair, wherein the sample image pair comprises a sample reference image and a sample first image, and the sample image pair carries image type information;
a registration module, configured to perform image registration on the sample image pair based on the branch of the image registration network corresponding to the image type information carried by the sample image pair, to obtain a transformation parameter;
and a training module, configured to train the network parameters of the image registration network based on the transformation parameter.
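The routing described in claim 13, where each sample is dispatched to the branch matching its image type tag and only that branch's parameters are trained, can be sketched as below. The per-type branches are reduced to small linear models and the loss is a supervised MSE on the transformation parameters; both are simplifications of the claim, which leaves the branch architecture and loss unspecified.

```python
import numpy as np

rng = np.random.default_rng(0)

# One small linear "branch" per image type, a stand-in for the
# type-specific sub-networks of the registration network.
branches = {"CT": rng.normal(size=(2, 8)) * 0.1,
            "MR": rng.normal(size=(2, 8)) * 0.1}

def train_step(branches, features, true_params, image_type, lr=0.05):
    """Route the sample to the branch named by its image-type tag,
    predict transformation parameters, and update only that branch's
    weights by one gradient step on the MSE loss."""
    W = branches[image_type]
    pred = W @ features
    err = pred - true_params
    branches[image_type] = W - lr * np.outer(err, features)
    return float((err ** 2).mean())
```

Because only the selected branch is updated, samples of one modality never perturb the weights serving another, which is the point of the type-conditional design.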
14. An electronic device, comprising one or more processors and one or more memories having at least one computer program stored therein, the at least one computer program being loaded and executed by the one or more processors to implement the image registration method of any one of claims 1 to 8 or to implement the image registration network training method of any one of claims 9 to 11.
15. A computer-readable storage medium, in which at least one computer program is stored, which is loaded and executed by a processor to implement the image registration method of any one of claims 1 to 8, or to implement the image registration network training method of any one of claims 9 to 11.
CN202110625883.5A 2021-06-04 2021-06-04 Image registration method, registration network training method, device, equipment and medium Pending CN113822791A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110625883.5A CN113822791A (en) 2021-06-04 2021-06-04 Image registration method, registration network training method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110625883.5A CN113822791A (en) 2021-06-04 2021-06-04 Image registration method, registration network training method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN113822791A true CN113822791A (en) 2021-12-21

Family

ID=78912503

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110625883.5A Pending CN113822791A (en) 2021-06-04 2021-06-04 Image registration method, registration network training method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN113822791A (en)

Similar Documents

Publication Publication Date Title
CN110348543B (en) Fundus image recognition method and device, computer equipment and storage medium
CN110163048B (en) Hand key point recognition model training method, hand key point recognition method and hand key point recognition equipment
CN109947886B (en) Image processing method, image processing device, electronic equipment and storage medium
WO2020224479A1 (en) Method and apparatus for acquiring positions of target, and computer device and storage medium
CN110807361B (en) Human body identification method, device, computer equipment and storage medium
CN111091166B (en) Image processing model training method, image processing device, and storage medium
CN110555839A (en) Defect detection and identification method and device, computer equipment and storage medium
CN111091576A (en) Image segmentation method, device, equipment and storage medium
CN111476783B (en) Image processing method, device and equipment based on artificial intelligence and storage medium
CN110570460B (en) Target tracking method, device, computer equipment and computer readable storage medium
CN110059652B (en) Face image processing method, device and storage medium
CN110544272A (en) face tracking method and device, computer equipment and storage medium
CN111243668B (en) Method and device for detecting molecule binding site, electronic device and storage medium
CN111062981A (en) Image processing method, device and storage medium
CN112749613B (en) Video data processing method, device, computer equipment and storage medium
CN111192262A (en) Product defect classification method, device, equipment and medium based on artificial intelligence
CN114332530A (en) Image classification method and device, computer equipment and storage medium
CN113610750A (en) Object identification method and device, computer equipment and storage medium
CN110675412A (en) Image segmentation method, training method, device and equipment of image segmentation model
CN111738914A (en) Image processing method, image processing device, computer equipment and storage medium
CN113705302A (en) Training method and device for image generation model, computer equipment and storage medium
CN113570645A (en) Image registration method, image registration device, computer equipment and medium
CN113724189A (en) Image processing method, device, equipment and storage medium
CN111598896A (en) Image detection method, device, equipment and storage medium
CN114283299A (en) Image clustering method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination