CN117437365A - Medical three-dimensional model generation method, device, electronic equipment and storage medium

Info

Publication number: CN117437365A
Application number: CN202311756147.9A
Authority: CN
Inventors: 秦文健; 陈鑫
Original assignee: Shenzhen Institute of Advanced Technology of CAS
Current assignee: Shenzhen Institute of Advanced Technology of CAS
Priority date: 2023-12-20
Filing date: 2023-12-20
Publication date: 2024-01-23
Anticipated expiration: 2043-12-20
Also published as: CN117437365B

Abstract

The application provides a method and a device for generating a medical three-dimensional model, an electronic device, and a storage medium, and relates to the field of computer technology. The method for generating the medical three-dimensional model comprises the following steps: acquiring medical text in response to an input operation in an interactive interface; invoking a first generation network and, under the guidance of the medical text, learning a first tensor input into the first generation network to obtain a medical two-dimensional image conforming to the description of the medical text, wherein the first generation network is a trained deep learning model capable of generating medical two-dimensional images from medical text; inputting the medical two-dimensional image into a second generation network to generate a medical three-dimensional model, wherein the second generation network is a trained deep learning model capable of generating medical three-dimensional models from medical two-dimensional images; and displaying the medical three-dimensional model generated by the second generation network in the interactive interface. The method and the device address the problem of poor realism of medical three-dimensional models in the related art.

Description

Medical three-dimensional model generation method, device, electronic equipment and storage medium

Technical field

The present application relates to the field of computer technology, and specifically to a method, a device, an electronic device, and a storage medium for generating a medical three-dimensional model.

Background

At present, both medical training and medical teaching require a large number of medical three-dimensional models. These models are usually created manually by three-dimensional model content companies. The realism of such virtual content is limited, editing the content consumes a great deal of time, and the models cannot truly reflect the changes of patients in different scenarios and states.

It follows that how to improve the realism of medical three-dimensional models remains to be solved.

Summary of the invention

The embodiments of the present application provide a method, a device, an electronic device, and a storage medium for generating a medical three-dimensional model, which can solve the problem of poor realism of medical three-dimensional models in the related art. The technical solutions are as follows:

According to one aspect of the present application, a method for generating a medical three-dimensional model includes: acquiring medical text in response to an input operation in an interactive interface; invoking a first generation network and, under the guidance of the medical text, learning a first tensor input into the first generation network to obtain a medical two-dimensional image that conforms to the description of the medical text, where the first generation network is a trained deep learning model capable of generating medical two-dimensional images from medical text; inputting the medical two-dimensional image into a second generation network to generate a medical three-dimensional model, where the second generation network is a trained deep learning model capable of generating medical three-dimensional models from medical two-dimensional images; and displaying the medical three-dimensional model generated by the second generation network in the interactive interface.

According to one aspect of the present application, a device for generating a medical three-dimensional model includes: a text acquisition module, configured to acquire medical text in response to an input operation in an interactive interface; an image generation module, configured to invoke a first generation network and, under the guidance of the medical text, learn a first tensor input into the first generation network to obtain a medical two-dimensional image that conforms to the description of the medical text, where the first generation network is a trained deep learning model capable of generating medical two-dimensional images from medical text; a model generation module, configured to input the medical two-dimensional image into a second generation network to generate a medical three-dimensional model, where the second generation network is a trained deep learning model capable of generating medical three-dimensional models from medical two-dimensional images; and a model display module, configured to display the medical three-dimensional model generated by the second generation network in the interactive interface.

According to one aspect of the present application, an electronic device includes at least one processor and at least one memory, where computer-readable instructions are stored on the memory, and the computer-readable instructions are executed by one or more of the processors so that the electronic device implements the method for generating a medical three-dimensional model described above.

According to one aspect of the present application, a storage medium has computer-readable instructions stored thereon, and the computer-readable instructions are executed by one or more processors to implement the method for generating a medical three-dimensional model described above.

According to one aspect of the present application, a computer program product includes computer-readable instructions stored in a storage medium. One or more processors of an electronic device read the computer-readable instructions from the storage medium, then load and execute them, so that the electronic device implements the method for generating a medical three-dimensional model described above.

The beneficial effects of the technical solutions provided by the present application are as follows:

In the above technical solution, once the medical text is obtained, the first generation network, which is capable of generating medical two-dimensional images from medical text, can be invoked to learn, under the guidance of the medical text, the first tensor input into the first generation network and obtain a medical two-dimensional image that conforms to the description of the medical text. The medical two-dimensional image is then input into the second generation network, which is capable of generating medical three-dimensional models from medical two-dimensional images, and the resulting medical three-dimensional model is displayed in the interactive interface. On the one hand, the two generation networks make it possible to generate the medical three-dimensional models required for medical training or medical teaching automatically and quickly. On the other hand, simple medical text can be entered in the interactive interface according to the actual needs of different medical training or teaching, guiding the generated medical three-dimensional model to truly reflect the changes of patients in different scenarios and states. This effectively solves the problem of poor realism of medical three-dimensional models in the related art.

Description of the drawings

To explain the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.

Figure 1 is a schematic diagram of an implementation environment involved in the present application;

Figure 2 is a flow chart of a method for generating a medical three-dimensional model according to an exemplary embodiment;

Figure 3 is a flow chart of the process by which the first generation network generates a medical two-dimensional image from medical text according to an exemplary embodiment;

Figure 4 is a schematic diagram of the path from medical text to a medical three-dimensional model according to an exemplary embodiment;

Figure 5 is a flow chart of the process of constructing a first training set and a second training set according to an exemplary embodiment;

Figure 6 is a schematic diagram of the first training set and the second training set in the embodiment corresponding to Figure 5;

Figure 7 is a detailed interaction diagram of a method for generating a medical three-dimensional model in an application scenario;

Figure 8 is a structural block diagram of a device for generating a medical three-dimensional model according to an exemplary embodiment;

Figure 9 is a hardware structure diagram of an electronic device according to an exemplary embodiment;

Figure 10 is a structural block diagram of an electronic device according to an exemplary embodiment.

Detailed description

The embodiments of the present application are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements, or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present application and shall not be construed as limiting it.

Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "said", and "the" used herein may also include the plural forms. It should be further understood that the word "comprising" as used in the description of the present disclosure refers to the presence of the stated features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It should be understood that when an element is said to be "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intervening elements may be present. In addition, "connected" or "coupled" as used herein may include wireless connection or wireless coupling. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.

Several terms used in the present application are introduced and explained below:

AIGC: short for AI-Generated Content, that is, generative artificial intelligence.

Stable Diffusion: the name of a diffusion model, rendered in Chinese as 稳定扩散 (literally "stable diffusion").

VR: short for Virtual Reality.

MR: short for Mixed Reality.

As mentioned above, current medical three-dimensional models are mainly created manually by three-dimensional model content companies, usually produced with 3D software or generated by segmenting, reconstructing, and rendering medical images. They still suffer from low realism. In addition, the operation steps are complex and the production cost is high, there is no way to freely change and edit the content style through a text description, and it is difficult to meet the demand for different content styles in different scenarios such as medical training or medical teaching.

With the rapid development of deep learning models, generative artificial intelligence (AIGC) has emerged. The core idea of AIGC technology is to use artificial intelligence algorithms to generate content with a certain degree of creativity and quality. By training models on large amounts of data, AIGC can generate content related to the input conditions or guidance. For example, given keywords, descriptions, or samples, AIGC can generate matching articles, images, audio, and so on.

Current AIGC performs acceptably on natural images and natural scenes. However, limited by the number of available medical content datasets, it cannot be trained on large-scale medical content datasets, so its results in generating medical content remain unsatisfactory.

It follows that the related art still suffers from poor realism of medical three-dimensional models, which makes accurate quantitative evaluation difficult in assessments for application scenarios such as medical training or medical teaching.

To this end, the method for generating a medical three-dimensional model provided by the present application can effectively improve the realism of medical three-dimensional models. Correspondingly, the method is applicable to a device for generating a medical three-dimensional model, which may be deployed on an electronic device. The electronic device may be a computer device with a von Neumann architecture, for example a desktop computer, a notebook computer, or a server.

To make the purpose, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings.

Figure 1 is a schematic diagram of an implementation environment involved in a method for generating a medical three-dimensional model. It should be noted that this implementation environment is only one example suited to the present application and shall not be regarded as limiting the scope of use of the present application in any way.

The implementation environment includes a user terminal 110 and a server 130.

Specifically, the user terminal 110 may be an electronic device that provides a display function. For example, the electronic device may be a desktop computer, a notebook computer, or a server equipped with a display screen, or a smartphone or tablet computer equipped with a touch screen.

The server 130 may be an electronic device such as a desktop computer, a notebook computer, or a server; it may also be a computer cluster composed of multiple servers, or even a cloud computing center composed of multiple servers. The server 130 provides background services, which include but are not limited to a medical three-dimensional model generation service.

A communication connection is established in advance between the server 130 and the user terminal 110 in a wired or wireless manner, and data is transmitted between the server 130 and the user terminal 110 over this connection. The transmitted data includes but is not limited to the first generation network and the second generation network.

In one application scenario, the server 130 invokes the medical three-dimensional model generation service in order to deploy the first generation network and the second generation network used for generating medical three-dimensional models to the user terminal 110. Specifically, the server trains the first generation network on a pre-constructed first training set, trains the second generation network on a pre-constructed second training set, and sends both networks to the user terminal 110.

Through its interaction with the server 130, the user terminal 110 receives the first generation network and the second generation network and completes their deployment. After the user enters medical text through the interactive interface of the user terminal 110, the first generation network and the second generation network can be invoked to generate a medical three-dimensional model under the guidance of the medical text, so that the model truly reflects the changes of a patient in different scenarios and states. The medical three-dimensional model is finally displayed in the interactive interface, which effectively solves the problem of poor realism of medical three-dimensional models in the related art.

Referring to Figure 2, an embodiment of the present application provides a method for generating a medical three-dimensional model. The method is applicable to an electronic device, which may be the user terminal 110 in the implementation environment shown in Figure 1.

In the following method embodiments, for ease of description, each step of the method is described as being executed by an electronic device, but this does not constitute a specific limitation.

As shown in Figure 2, the method may include the following steps:

Step 310: acquire medical text in response to an input operation in the interactive interface.

First, the interactive interface is essentially the page that interacts with the user during generation of the medical three-dimensional model. In some embodiments, the interactive interface is provided by a client running on the electronic device. It can be understood that, as the client runs on the electronic device, the interactive interface can be displayed on the screen with which the electronic device is equipped. The client may take the form of an application or a web page; correspondingly, the page may be a program window or a web page, which is not limited here.

Second, the medical text is used to describe a content style, in particular the content style of the medical two-dimensional image or medical three-dimensional model that the user expects; in other words, the medical text in essence reflects the changes of a patient in different scenarios and states. For example, the medical text may be "a lung with a tumor", or, more precisely, "a tumor 20 mm in diameter in the right lung".

To learn the content style of the medical two-dimensional image or medical three-dimensional model that the user expects, in some embodiments the interactive interface provides an input entry through which the user can enter medical text.

For example, an input box is displayed in the interactive interface, and the user can enter medical text in it. The electronic device can then detect the input operation in the input box and thereby learn the content style of the medical two-dimensional image or medical three-dimensional model that the user expects. Here, the input box serves as the input entry provided in the interactive interface, and the input operation in the input box serves as the input operation in the interactive interface.

Of course, in other embodiments, the input operation may differ according to the input components with which the electronic device is equipped. For example, if the electronic device is a desktop computer equipped with a keyboard, the input operation may be a mechanical operation such as pressing keys on the keyboard; if the electronic device is a tablet computer equipped with a touch screen, the input operation may be a gesture operation on the touch screen such as tapping or swiping. This is not specifically limited here.

Step 330: invoke the first generation network and, under the guidance of the medical text, learn the first tensor input into the first generation network to obtain a medical two-dimensional image that conforms to the description of the medical text.

The first generation network is a trained deep learning model capable of generating medical two-dimensional images from medical text. That is, training a deep learning model on the first training set yields the first generation network, where the first training set is constructed from a large number of medical two-dimensional images and their corresponding medical texts. The first generation network therefore in essence reflects the mathematical mapping between medical text and medical two-dimensional images, so invoking it maps a medical text to the corresponding medical two-dimensional image on the basis of that mapping.

In some embodiments, the first generation network includes a text encoder, a diffusion model, and an image decoder. The initial diffusion model may be a pre-trained Stable Diffusion model.

Figure 3 shows a schematic diagram of the process by which the first generation network generates a medical two-dimensional image from medical text. In Figure 3, the text encoder first converts the medical text into a second tensor, and a randomly generated first tensor is obtained. The second tensor then guides the diffusion model to perform diffusion learning on the first tensor according to the content style described by the medical text, yielding a third tensor. Finally, the image decoder converts the third tensor into a medical two-dimensional image.

The diffusion learning process specifically refers to the following, as shown in Figure 3: the first tensor is input into the diffusion model and denoised through the reverse process of the diffusion model; the second tensor is used as the guiding condition and introduced into the denoising of the first tensor, yielding the third tensor.
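The guided denoising loop described above can be illustrated with a deliberately simplified sketch. It is not the Stable Diffusion algorithm and not the applicant's implementation: `toy_noise_predictor` is a hypothetical stand-in for the trained denoising network, tensors are flattened to plain lists, and the step count and step size are arbitrary assumptions. It only shows the control flow, in which a random first tensor is repeatedly denoised while a second tensor steers the result.

```python
import random


def random_first_tensor(size, seed=0):
    """Randomly generated first tensor: Gaussian noise, flattened to a list."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def toy_noise_predictor(x, condition):
    """Hypothetical stand-in for the trained denoiser.

    It treats the offset from the conditioning values as the noise to remove;
    a real diffusion model would use a learned network here.
    """
    return [xi - ci for xi, ci in zip(x, condition)]


def guided_denoise(first_tensor, second_tensor, steps=50, step_size=0.1):
    """Iteratively denoise the first tensor under guidance of the second tensor."""
    x = list(first_tensor)
    for _ in range(steps):
        predicted_noise = toy_noise_predictor(x, second_tensor)
        # Remove a fraction of the predicted noise at every reverse step.
        x = [xi - step_size * ni for xi, ni in zip(x, predicted_noise)]
    return x  # plays the role of the third tensor


second_tensor = [1.0, -1.0, 0.5, 0.0]  # assumed output of a text encoder
first_tensor = random_first_tensor(len(second_tensor))
third_tensor = guided_denoise(first_tensor, second_tensor)
```

In this toy setting each step shrinks the distance to the conditioning values by a constant factor, so the third tensor ends up close to what the condition prescribes regardless of the random starting point.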

In the above process, because the medical text in essence describes the content style of the medical two-dimensional image that the user expects, the second tensor converted from the medical text also reflects that content style. Introducing the second tensor into the denoising of the first tensor guides the diffusion model to perform diffusion learning on the first tensor toward the content style the user expects, finally yielding a medical two-dimensional image that conforms to the description of the medical text, that is, one that matches the content style the user expects. For example, if the medical text is "a lung with a tumor", the resulting medical two-dimensional image not only contains a lung but also shows a tumor in it; if the medical text is "a tumor 20 mm in diameter in the right lung", the resulting medical two-dimensional image contains both the left and right lungs and shows a tumor 20 mm in diameter in the right lung. In this way, under the guidance of the medical text, the medical two-dimensional image truly reflects the changes of a patient in different scenarios and states.

It is worth mentioning that the first tensor is an image tensor of a set size, and correspondingly the third tensor is also an image tensor of that set size. In other words, the size of the first tensor determines the size of the third tensor and therefore the image size of the medical two-dimensional image. The set size can be configured flexibly according to the image size actually required for the medical two-dimensional image in the application scenario, and is not limited here.
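As a small illustration of how the size of the first tensor fixes the output image size, the sketch below assumes a Stable Diffusion style latent layout, with 4 latent channels and an image decoder that upsamples by a factor of 8 in each spatial dimension. The application does not state these numbers; they are assumptions taken from the commonly published Stable Diffusion configuration, and the function names are hypothetical.

```python
def latent_shape(image_height, image_width, channels=4, downsample=8):
    """Shape (C, H, W) of the first tensor needed for a desired image size."""
    if image_height % downsample or image_width % downsample:
        raise ValueError("image size must be a multiple of the downsample factor")
    return (channels, image_height // downsample, image_width // downsample)


def image_size_from_latent(shape, downsample=8):
    """Image (height, width) produced by decoding a tensor of the given shape."""
    _, latent_height, latent_width = shape
    return (latent_height * downsample, latent_width * downsample)


shape = latent_shape(512, 512)
size = image_size_from_latent((4, 64, 96))
```

Under these assumptions, requesting a 512 x 512 image means sampling a 4 x 64 x 64 first tensor, and the third tensor keeps the same shape throughout denoising.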

Step 350: input the medical two-dimensional image into the second generation network to generate a medical three-dimensional model.

The second generation network is a trained deep learning model capable of generating medical three-dimensional models from medical two-dimensional images. In some embodiments, the deep learning model may be the pre-trained end-to-end deep learning model Pixel2Mesh.

That is, training a deep learning model on the second training set yields the second generation network, which is capable of generating medical three-dimensional models from medical two-dimensional images. The second training set is constructed from a large number of medical two-dimensional images taken from the same or different viewing angles and their corresponding medical three-dimensional models. Here, a medical two-dimensional image corresponds to a medical three-dimensional model when the image, at whatever viewing angle, and the model have matching medical text; equivalently, both conform to the content style described by that matching medical text.
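The pairing rule described above, in which images from any viewing angle are matched to a three-dimensional model that shares the same medical text, can be sketched as follows. The record types, field names, and identifiers are hypothetical and exist only for illustration; the application does not specify how the training data is stored.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    text: str      # medical text that the image conforms to
    view: str      # viewing-angle label, e.g. "front" or "side"
    image_id: str  # hypothetical identifier of the stored 2D image


@dataclass(frozen=True)
class ModelRecord:
    text: str      # medical text that the 3D model conforms to
    model_id: str  # hypothetical identifier of the stored 3D model


def build_second_training_set(images, models):
    """Pair every 2D image with each 3D model that has matching medical text."""
    models_by_text = defaultdict(list)
    for model in models:
        models_by_text[model.text].append(model)

    pairs = []
    for image in images:
        # .get avoids creating empty entries for texts that have no model.
        for model in models_by_text.get(image.text, []):
            pairs.append((image, model))
    return pairs


images = [
    ImageRecord("lung with tumor", "front", "i1"),
    ImageRecord("lung with tumor", "side", "i2"),
    ImageRecord("healthy liver", "front", "i3"),
]
models = [ModelRecord("lung with tumor", "m1")]
pairs = build_second_training_set(images, models)
```

Images without a matching model, such as the liver image here, simply contribute no training pair.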

On this basis, the second generation network in essence reflects the mathematical mapping between medical two-dimensional images and medical three-dimensional models. Inputting a medical two-dimensional image into the second generation network therefore maps it to the corresponding medical three-dimensional model on the basis of that mapping.

图4展示了医学文本到医学三维模型的示意图，在图4中，医学文本为“一个带肿瘤的肺”，通过第一生成网络，便得到带肿瘤的肺的图片，即医学二维图像，在该医学二维图像的基础上，通过第二生成网络，便得到一个带肿瘤的肺的三维网格模型，即医学三维模型。Figure 4 shows a schematic diagram of medical text to medical three-dimensional model. In Figure 4, the medical text is "a lung with tumors." Through the first generation network, a picture of a lung with tumors is obtained, that is, a two-dimensional medical image. On the basis of the two-dimensional medical image, through the second generation network, a three-dimensional mesh model of the lung with tumors, that is, a three-dimensional medical model, is obtained.
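The two-stage flow from medical text to medical three-dimensional model can be sketched as a simple composition of two callables. This is a minimal illustration only: the names `Mesh`, `generate_medical_model`, `fake_text_to_image`, and `fake_image_to_mesh` are hypothetical and do not appear in the application, and the stand-in networks merely return placeholder data so that the sketch runs without any trained model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image2D = List[List[float]]            # grayscale pixels, row-major (assumed layout)
Vertex = Tuple[float, float, float]
Face = Tuple[int, int, int]            # indices into the vertex list


@dataclass
class Mesh:
    """Hypothetical container for a three-dimensional mesh model."""
    vertices: List[Vertex]
    faces: List[Face]


def generate_medical_model(
    medical_text: str,
    first_network: Callable[[str], Image2D],
    second_network: Callable[[Image2D], Mesh],
) -> Mesh:
    """Chain the two generation networks: text -> 2D image -> 3D mesh."""
    image = first_network(medical_text)
    return second_network(image)


# Stand-in networks for illustration only; real networks would be trained models.
def fake_text_to_image(text: str) -> Image2D:
    return [[float(len(text))]]


def fake_image_to_mesh(image: Image2D) -> Mesh:
    height = image[0][0]
    vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, height)]
    faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
    return Mesh(vertices=vertices, faces=faces)


mesh = generate_medical_model("a lung with tumors", fake_text_to_image, fake_image_to_mesh)
```

Passing the networks in as parameters keeps the sketch independent of any particular model library; in practice each callable would wrap a trained network.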

步骤370，将第二生成网络生成的医学三维模型，显示在交互界面中。Step 370: Display the medical three-dimensional model generated by the second generation network in the interactive interface.

Once the medical three-dimensional model is obtained, it can be presented to the user in the interactive interface. It should be noted that display capabilities differ between electronic devices (for example, their display screens may have different resolutions). Therefore, before the medical three-dimensional model is displayed in the interactive interface, the generated model needs to be encoded in a data format compatible with the electronic device, so that a medical three-dimensional model adapted to that device can be output.
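As one possible illustration of encoding a generated model in a device-compatible data format, the sketch below serializes a triangle mesh to Wavefront OBJ text. The choice of OBJ is an assumption made here for illustration; the application does not name a specific format, and the function name `mesh_to_obj` is hypothetical.

```python
from typing import Iterable, Tuple


def mesh_to_obj(
    vertices: Iterable[Tuple[float, float, float]],
    faces: Iterable[Tuple[int, int, int]],
) -> str:
    """Serialize a triangle mesh to Wavefront OBJ text.

    `faces` holds 0-based vertex indices; OBJ face records are 1-based,
    so each index is shifted by one on output.
    """
    lines = ["v {:.6f} {:.6f} {:.6f}".format(x, y, z) for x, y, z in vertices]
    lines += ["f {} {} {}".format(a + 1, b + 1, c + 1) for a, b, c in faces]
    return "\n".join(lines) + "\n"


obj_text = mesh_to_obj([(0, 0, 0), (1, 0, 0), (0, 1, 0)], [(0, 1, 2)])
```

A viewer on the target device could then load `obj_text` from a file; other formats would need a different encoder.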

Through the above process, on the one hand, the two generation networks allow the medical three-dimensional models required for medical training or medical teaching to be generated automatically and quickly. On the other hand, according to the actual needs of different medical training or teaching tasks, simple medical text can be entered in the interactive interface to guide generation, so that the resulting medical three-dimensional model realistically reflects how a patient changes across different scenarios and states. This effectively solves the problem of poor realism of medical three-dimensional models in the related art.

请参阅图5，在一示例性实施例中，上述方法还可以包括以下步骤：Referring to Figure 5, in an exemplary embodiment, the above method may further include the following steps:

步骤410，获取医学原始图像，并对医学原始图像进行关于医学文本的标注，得到医学标注图像。Step 410: Obtain the original medical image and annotate the original medical image with the medical text to obtain the medical annotation image.

其中，医学标注图像是指携带医学文本的医学原始图像。Among them, medical annotation images refer to medical original images carrying medical text.

Original medical images may be obtained from medical images published on the Internet, from private medical imaging data held by organizations such as hospitals and medical schools, or from the data sets of public medical competitions. Further, the original medical images obtained in this way may be segmented by organ, tissue, lesion site, and so on. For example, an original medical image containing both the left lung and the right lung may be segmented by organ into two original medical images. In this way, a large-scale original medical image data set covering various organs, tissues, and lesion sites is formed.

其次说明的是，标注指的是为医学原始图像添加医学文本。在一些实施例中，医学文本可以作为文字标签添加至医学原始图像内，还可以通过命名为文件名的形式添加至医学原始图像，此处并未加以限定。Secondly, annotation refers to adding medical text to medical original images. In some embodiments, the medical text can be added to the original medical image as a text label, or can also be added to the original medical image in the form of a file name, which is not limited here.
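The two annotation options mentioned above (a text label attached to the image, or the medical text encoded in the file name) can be sketched as follows. The record type `AnnotatedImage`, both function names, and the underscore-to-space file naming convention are assumptions for illustration only.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class AnnotatedImage:
    """Hypothetical record: an original medical image plus the medical text it carries."""
    path: Path
    medical_text: str


def annotate_with_label(path: str, label: str) -> AnnotatedImage:
    """Attach the medical text as an explicit text label."""
    return AnnotatedImage(path=Path(path), medical_text=label.strip())


def annotate_from_filename(path: str) -> AnnotatedImage:
    """Recover the medical text from the file name (assumed convention:
    words separated by underscores, extension ignored)."""
    p = Path(path)
    return AnnotatedImage(path=p, medical_text=p.stem.replace("_", " "))


example = annotate_from_filename("data/a_lung_with_tumors.png")
```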

步骤430，对医学标注图像进行三维图像重建计算，得到三维标注模型。Step 430: Perform three-dimensional image reconstruction calculation on the medical annotation image to obtain a three-dimensional annotation model.

其中，三维标注模型携带有与医学标注图像相应的医学文本。Among them, the three-dimensional annotation model carries medical text corresponding to the medical annotation image.

Specifically, the original medical image and the medical text it carries are determined from the medical annotation image; three-dimensional image reconstruction is performed on the original medical image using a three-dimensional image reconstruction technique to obtain an original medical three-dimensional model; and the original medical three-dimensional model is annotated with the medical text carried by the original medical image to obtain the three-dimensional annotation model.

That is, using three-dimensional image reconstruction, a corresponding three-dimensional annotation model can be obtained for every medical annotation image. Here, "corresponding" means that the medical annotation image and the three-dimensional annotation model share matching medical text; in other words, both the original medical image and the original medical three-dimensional model conform to the content style described by that matching medical text.

步骤450，按照不同视角，将三维标注模型分解为多个二维标注图像。Step 450: Decompose the three-dimensional annotation model into multiple two-dimensional annotation images according to different viewing angles.

其中，各二维标注图像对应不同视角、且携带有与三维标注模型相应的医学文本。Among them, each two-dimensional annotation image corresponds to a different perspective and carries medical text corresponding to the three-dimensional annotation model.

Specifically, the original medical three-dimensional model and the medical text it carries are determined from the three-dimensional annotation model; the original medical three-dimensional model is decomposed into multiple original medical two-dimensional images from different viewing angles; and these images are annotated with the medical text carried by the original medical three-dimensional model to obtain multiple two-dimensional annotation images from different viewing angles.

It should be noted that each three-dimensional annotation model has multiple corresponding two-dimensional annotation images from different viewing angles. The three-dimensional annotation model and its corresponding two-dimensional annotation images share matching medical text; in other words, the original medical three-dimensional model and its corresponding original medical two-dimensional images all conform to the content style described by that matching medical text.
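One simple way to picture the decomposition of a three-dimensional model into views is to rotate its vertices about the vertical axis and project them orthographically, with every view inheriting the model's medical text. This is only a sketch under stated assumptions: the application does not specify a projection or rendering method, the function names are hypothetical, and a real pipeline would render full images rather than projected points.

```python
import math
from typing import Dict, List, Tuple

Vertex = Tuple[float, float, float]
Point2D = Tuple[float, float]


def project_view(vertices: List[Vertex], azimuth_deg: float) -> List[Point2D]:
    """Rotate about the y-axis by `azimuth_deg`, then drop the depth
    coordinate (orthographic projection onto the x-y plane)."""
    a = math.radians(azimuth_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    return [(x * cos_a + z * sin_a, y) for x, y, z in vertices]


def decompose_into_views(
    vertices: List[Vertex], medical_text: str, num_views: int = 4
) -> List[Dict]:
    """Produce `num_views` evenly spaced views; each carries the same medical text."""
    views = []
    for i in range(num_views):
        azimuth = 360.0 * i / num_views
        views.append(
            {
                "azimuth_deg": azimuth,
                "points": project_view(vertices, azimuth),
                "medical_text": medical_text,
            }
        )
    return views


views = decompose_into_views([(1.0, 2.0, 0.0)], "a lung with tumors", num_views=4)
```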

Step 470: Construct a first training set based on each two-dimensional annotation image and the medical text it carries, and construct a second training set based on the three-dimensional annotation model and each two-dimensional annotation image carrying the medical text corresponding to the three-dimensional annotation model.

其中，二维标注图像是指携带医学文本的医学原始二维图像；三维标注模型是指携带医学文本的医学原始三维模型。Among them, the two-dimensional annotated image refers to the original medical two-dimensional image carrying medical text; the three-dimensional annotated model refers to the original medical three-dimensional model carrying medical text.

如图6所示，第一训练集是由大量医学原始二维图像及其携带的医学文本构成的，即第一训练集是从医学文本到医学二维图像的数据集；而第二训练集则是由大量医学原始三维模型、相应的不同视角下的多个医学原始二维图像及其携带的医学文本构成的，即第二训练集是从医学二维图像到医学三维模型的数据集。As shown in Figure 6, the first training set is composed of a large number of original medical two-dimensional images and the medical texts they carry, that is, the first training set is a data set from medical texts to medical two-dimensional images; and the second training set It is composed of a large number of original medical three-dimensional models, corresponding multiple original medical two-dimensional images from different perspectives, and the medical texts they carry. That is, the second training set is a data set from medical two-dimensional images to medical three-dimensional models.
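The relationship between the two training sets can be sketched with plain records: the first set pairs medical text with two-dimensional views, and the second pairs each view with its three-dimensional model and the shared medical text. The record type `AnnotatedModel`, the identifier fields, and the tuple layouts are hypothetical choices made for this illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AnnotatedModel:
    """Hypothetical record for one three-dimensional annotation model."""
    model_id: str
    medical_text: str
    view_ids: List[str]   # identifiers of the 2D views derived from this model


def build_training_sets(
    models: List[AnnotatedModel],
) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str, str]]]:
    """Return (first_set, second_set).

    first_set:  (medical_text, view_id) pairs, for text -> 2D image training.
    second_set: (view_id, model_id, medical_text) triples, for 2D image -> 3D model training.
    """
    first_set: List[Tuple[str, str]] = []
    second_set: List[Tuple[str, str, str]] = []
    for model in models:
        for view_id in model.view_ids:
            first_set.append((model.medical_text, view_id))
            second_set.append((view_id, model.model_id, model.medical_text))
    return first_set, second_set


first_set, second_set = build_training_sets(
    [AnnotatedModel("lung_001", "a lung with tumors", ["lung_001_v0", "lung_001_v1"])]
)
```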

After the first training set is obtained, the first generation network can be trained based on it. Specifically, this may include the following steps: obtaining an initial diffusion model, the initial diffusion model being a pre-trained Stable Diffusion model; performing parameter-tuning training on the initial diffusion model based on the first training set and, if the parameter-tuning training satisfies a set condition, obtaining the trained diffusion model; and constructing the first generation network from a text encoder, the trained diffusion model, and an image decoder.

After the second training set is obtained, the second generation network can be trained based on it. Specifically, this may include the following steps: obtaining a deep learning model pre-trained on a natural image training set; performing parameter-tuning training on the deep learning model based on the second training set; and, if the parameter-tuning training of the deep learning model satisfies the set condition, obtaining the second generation network.

The set condition can be configured flexibly according to the actual needs of the application scenario. For example, in one application scenario, the set condition may be that the parameters reach their optimum, which improves the training accuracy of the model; in another, it may be that the number of iterations reaches a threshold, which improves training efficiency. No limitation is imposed here. Whether the parameters have reached their optimum may be determined by means of a loss function or the like, which is likewise not limited here.
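The two kinds of set condition can be illustrated with a small training loop that stops either when the loss stops changing (used here as a stand-in for "parameters reach their optimum") or when an iteration threshold is reached. The convergence test, the default values, and the toy objective are assumptions for illustration; the application does not prescribe any of them.

```python
from typing import Callable, Tuple


def fine_tune(
    step: Callable[[], float],
    max_iterations: int = 1000,
    loss_tolerance: float = 1e-4,
) -> Tuple[int, str]:
    """Call `step()` (one parameter update returning the current loss) until
    the loss changes by less than `loss_tolerance` or `max_iterations` is hit."""
    previous = float("inf")
    for iteration in range(1, max_iterations + 1):
        loss = step()
        if abs(previous - loss) < loss_tolerance:
            return iteration, "converged"
        previous = loss
    return max_iterations, "max_iterations"


def make_toy_step(learning_rate: float = 0.1) -> Callable[[], float]:
    """Toy stand-in for a training step: gradient descent on (w - 3)^2."""
    state = {"w": 0.0}

    def step() -> float:
        gradient = 2.0 * (state["w"] - 3.0)
        state["w"] -= learning_rate * gradient
        return (state["w"] - 3.0) ** 2

    return step


iterations, reason = fine_tune(make_toy_step())
capped_iterations, capped_reason = fine_tune(make_toy_step(), max_iterations=3)
```

The first call stops on the loss criterion; the second shows the iteration threshold taking effect before the loss has settled.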

In the above embodiments, a data set from medical text to medical three-dimensional models is constructed and used to train the pre-trained models, yielding an end-to-end network from medical text to medical three-dimensional model. This network is applicable to multiple clinical scenarios and multiple categories in medicine, and improves the performance of existing text-prompt-based three-dimensional AIGC generation models on medical content generation, thereby providing new techniques and tools for subsequent VR/MR clinical practice training and assessment.

目前以动物标本、人体标本和教学辅助器具为主的传统医科教育体系，在医科人才的培养过程中存在资源不足，对患者造成损害的风险较高等问题。虚拟现实技术（VR）是通过利用计算机生成虚拟的三维场景，给用户带来视觉、听觉以及触觉等沉浸感。随着近年来虚拟现实和混合现实（MR）技术的发展，将VR和MR技术应用到医学教育培训考核中成为新的趋势。相对于传统的教学模式，VR和MR技术有着十分显著的优势，通过对真实人体进行多维度数据采集，通过仿真建模构建数字化人体或者目标组织模型，可以进行低成本、可重复、可量化评估的数字化教学，允许学员在可重复练习的环境中学习成长，提供案例丰富、科学规范的仿真素材，能够有效改善教学资源不足和考核难以量化评估的难题。同时在临床教学以及术前规划中，使用混合现实技术能够达到更好的效果。The current traditional medical education system, which mainly focuses on animal specimens, human specimens and teaching aids, has problems such as insufficient resources and a high risk of harm to patients in the training process of medical talents. Virtual reality technology (VR) uses computers to generate virtual three-dimensional scenes, bringing users visual, auditory and tactile immersion. With the development of virtual reality and mixed reality (MR) technology in recent years, applying VR and MR technology to medical education, training and assessment has become a new trend. Compared with the traditional teaching model, VR and MR technology have very significant advantages. By collecting multi-dimensional data from the real human body and constructing a digital human body or target tissue model through simulation modeling, low-cost, repeatable, and quantifiable evaluation can be carried out. Digital teaching allows students to learn and grow in a repeatable practice environment, and provides rich case studies and scientific and standardized simulation materials, which can effectively improve the problems of insufficient teaching resources and difficulty in quantitative assessment. At the same time, in clinical teaching and preoperative planning, the use of mixed reality technology can achieve better results.

Current mixed reality medical training and teaching tasks require a large number of medical three-dimensional models. On the one hand, the mainstream way of producing medical three-dimensional virtual models is to build them directly in geometric modeling software; some solutions instead obtain three-dimensional models from medical images through steps such as segmentation, reconstruction, and rendering. However, these procedures are complex and costly, and they do not allow the content style to be freely changed and edited through simple text descriptions. On the other hand, current large AIGC three-dimensional generation models perform acceptably on natural image models and natural scenes but do not yet achieve satisfactory results on medical content, making it difficult to meet the demand for different content styles across the many scenarios of medical training.

Figure 7 is a schematic diagram of the medical three-dimensional model in an application scenario. As shown in Figure 7, in this application scenario the server side may be a server or the like, and the first user terminal and the second user terminal may each be an electronic device capable of interacting with a user; for example, the first user terminal may be a desktop computer and the second user terminal may be a laptop computer. The first user terminal can then generate a medical three-dimensional model from medical text input by the user through user interaction, and the second user terminal can assess the user's medical training or medical teaching through user interaction. It is worth noting that the first user terminal and the second user terminal may also be deployed on the same electronic device; no specific limitation is imposed here.

Specifically, the server side constructs the first training set and the second training set, trains the first generation network based on the first training set and the second generation network based on the second training set, and deploys both networks to the first user terminal.

After the first generation network and the second generation network have been deployed to the first user terminal, the user can enter medical text at the first user terminal according to the desired content style of the medical three-dimensional model. The first user terminal calls the first generation network to generate a medical two-dimensional image from the medical text, and calls the second generation network to generate a medical three-dimensional model from that image. The resulting medical three-dimensional model conforms to the content style the user expects and is then transmitted to the second user terminal.

As the client on the second user terminal runs, the second user terminal presents to the user a virtual scene into which the medical three-dimensional model has been imported. The virtual scene is a digital scene constructed using computer technology for assessments such as medical training or medical teaching. It simulates the environment required for medical training or medical teaching (for example, a surgical laboratory), and the user is assessed by capturing the user's simulated operations in that environment.

For example, when a user wishes to take part in a medical training assessment, the user can start the client and enter the corresponding virtual scene. The virtual scene may be, for example, a simulated surgical laboratory, and the medical three-dimensional model imported into it may be a patient's lung with a tumor. The simulated operations the user performs on the patient's tumor-bearing lung include, but are not limited to, simulated handling of surgical instruments and answers given to the assessment content. During the user's simulated operations, the second user terminal may also collect the corresponding operation video via an image acquisition device and the corresponding sensor data via a mixed reality device, in order to assess the user's medical training session.

In this application scenario, applying AIGC technology to the medical field enables AIGC-based generation of medical mixed reality models, filling an application gap in this area. It also compensates for the weakness of AIGC in generating medical content models, which results from its lack of training on medical content. With this generation scheme, the demand for three-dimensional models in the different scenarios of mixed reality medical simulation training and teaching can be met. In addition, generating from text makes it possible to quickly obtain medical three-dimensional content for a variety of conditions, reducing the cost of producing three-dimensional models for mixed reality medical development.

下述为本申请装置实施例，可以用于执行本申请所涉及的医学三维模型的生成方法。对于本申请装置实施例中未披露的细节，请参照本申请所涉及的医学三维模型的生成方法的实施例。The following are device embodiments of the present application, which can be used to execute the method for generating a medical three-dimensional model involved in the present application. For details not disclosed in the device embodiments of this application, please refer to the embodiments of the medical three-dimensional model generation method involved in this application.

请参阅图8，本申请实施例中提供了一种医学三维模型的生成装置900，包括但不限于：文本获取模块910、图像生成模块930、模型生成模块950、以及模型展示模块970。Referring to Figure 8, an embodiment of the present application provides a device 900 for generating a medical three-dimensional model, including but not limited to: a text acquisition module 910, an image generation module 930, a model generation module 950, and a model display module 970.

其中，文本获取模块910，用于响应于交互界面中的输入操作，获取医学文本。Among them, the text acquisition module 910 is used to acquire medical text in response to input operations in the interactive interface.

图像生成模块930，用于调用第一生成网络，在医学文本的引导下，对输入第一生成网络的第一张量进行学习，得到符合医学文本描述的医学二维图像。第一生成网络是经过训练、且具有从医学文本到医学二维图像的生成能力的深度学习模型。The image generation module 930 is used to call the first generation network, and under the guidance of the medical text, learn the first tensor input to the first generation network to obtain a medical two-dimensional image that conforms to the description of the medical text. The first generative network is a deep learning model that has been trained and has the ability to generate from medical text to medical two-dimensional images.

模型生成模块950，用于将医学二维图像输入第二生成网络进行医学三维模型的生成。第二生成网络是经过训练、且具有从医学二维图像到医学三维模型的生成能力的深度学习模型。The model generation module 950 is used to input the medical two-dimensional image into the second generation network to generate the medical three-dimensional model. The second generation network is a deep learning model that has been trained and has the ability to generate from medical two-dimensional images to medical three-dimensional models.

模型展示模块970，用于将第二生成网络生成的医学三维模型，显示在交互界面中。The model display module 970 is used to display the medical three-dimensional model generated by the second generation network in the interactive interface.

在一示例性实施例中，所述第一生成网络包括文本编码器、扩散模型、以及图像解码器。In an exemplary embodiment, the first generative network includes a text encoder, a diffusion model, and an image decoder.

The image generation module 930 is further configured to: convert the medical text into a second tensor using the text encoder, and obtain the randomly generated first tensor; control the second tensor to guide the diffusion model in performing diffusion learning on the first tensor according to the content style described by the medical text, to obtain a third tensor; and convert the third tensor into the medical two-dimensional image using the image decoder.

In an exemplary embodiment, the image generation module 930 is further configured to: input the first tensor into the diffusion model and denoise it through the reverse process of the diffusion model; and use the second tensor as a guiding condition, introducing the guiding condition into the denoising process of the first tensor to obtain the third tensor.
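For orientation, a text-conditioned reverse (denoising) step can be written in the standard DDPM form shown below. This parameterization is an assumption for illustration and is not stated in the application; practical Stable Diffusion samplers may use other update rules. Here $x_T$ plays the role of the randomly generated first tensor, $c$ the second tensor produced by the text encoder, and $x_0$ the third tensor passed to the image decoder; $\epsilon_\theta$ is the noise-prediction network, and $\alpha_t$, $\sigma_t$ are noise-schedule constants.

```latex
x_{t-1} = \frac{1}{\sqrt{\alpha_t}}
          \left( x_t - \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}}\,
                 \epsilon_\theta(x_t, t, c) \right) + \sigma_t z,
\qquad z \sim \mathcal{N}(0, I), \quad
\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s, \quad t = T, \dots, 1
```

The guiding condition enters only through the argument $c$ of $\epsilon_\theta$, which is how the medical text steers every denoising step.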

在一示例性实施例中，所述装置900还包括：训练集构建模块。In an exemplary embodiment, the apparatus 900 further includes: a training set construction module.

The training set construction module is configured to: obtain an original medical image and annotate it with medical text to obtain a medical annotation image, the medical annotation image being an original medical image that carries medical text; perform three-dimensional image reconstruction on the medical annotation image to obtain a three-dimensional annotation model, the three-dimensional annotation model carrying the medical text corresponding to the medical annotation image; decompose the three-dimensional annotation model into multiple two-dimensional annotation images according to different viewing angles, each two-dimensional annotation image corresponding to a different viewing angle and carrying the medical text corresponding to the three-dimensional annotation model; and construct a first training set based on each two-dimensional annotation image and the medical text it carries, and a second training set based on the three-dimensional annotation model and each two-dimensional annotation image carrying the medical text corresponding to the three-dimensional annotation model. The first training set is used to train the first generation network, and the second training set is used to train the second generation network.

在一示例性实施例中，所述装置900还包括：第一训练模块。In an exemplary embodiment, the device 900 further includes: a first training module.

The first training module is configured to: obtain an initial diffusion model, the initial diffusion model being a pre-trained Stable Diffusion model; perform parameter-tuning training on the initial diffusion model based on the first training set to obtain the trained diffusion model; and construct the first generation network from a text encoder, the trained diffusion model, and an image decoder.

在一示例性实施例中，所述装置900还包括：第二训练模块。In an exemplary embodiment, the device 900 further includes: a second training module.

其中，所述第二训练模块，用于获取利用自然图像训练集预训练得到的深度学习模型；基于所述第二训练集，对所述深度学习模型进行参数调优训练；若所述深度学习模型的参数调优训练满足设定条件，则得到所述第二生成网络。Wherein, the second training module is used to obtain a deep learning model pre-trained using a natural image training set; based on the second training set, perform parameter tuning training on the deep learning model; if the deep learning If the parameter tuning training of the model meets the set conditions, the second generating network is obtained.

在一示例性实施例中，所述装置900还包括：考核模块。In an exemplary embodiment, the device 900 further includes: an assessment module.

The assessment module is configured to: import the medical three-dimensional model into a constructed virtual scene, the virtual scene being constructed for medical training or medical teaching; and assess the medical training or medical teaching of a target object based on the target object's simulated operations on the medical three-dimensional model in the virtual scene.

It should be noted that, when the device for generating a medical three-dimensional model provided in the above embodiments generates a medical three-dimensional model, the division into the functional modules described above is only an example. In practical applications, the above functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to perform all or part of the functions described above.

In addition, the device for generating a medical three-dimensional model provided in the above embodiments belongs to the same concept as the embodiments of the method for generating a medical three-dimensional model. The specific manner in which each module performs its operations has been described in detail in the method embodiments and is not repeated here.

图9根据一示例性实施例示出的一种电子设备的结构示意。该电子设备适用于图1所示出实施环境中的用户端110。Figure 9 shows a schematic structural diagram of an electronic device according to an exemplary embodiment. The electronic device is suitable for the client 110 in the implementation environment shown in FIG. 1 .

需要说明的是，该电子设备只是一个适配于本申请的示例，不能认为是提供了对本申请的使用范围的任何限制。该电子设备也不能解释为需要依赖于或者必须具有图9示出的示例性的电子设备2000中的一个或者多个组件。It should be noted that this electronic device is only an example adapted to the present application and cannot be considered to provide any limitation on the scope of use of the present application. The electronic device is also not to be construed as being dependent on or required to have one or more components of the exemplary electronic device 2000 shown in FIG. 9 .

The hardware structure of the electronic device 2000 may vary considerably with configuration or performance. As shown in Figure 9, the electronic device 2000 includes: a power supply 210, an interface 230, at least one memory 250, and at least one central processing unit (CPU) 270.

具体地，电源210用于为电子设备2000上的各硬件设备提供工作电压。Specifically, the power supply 210 is used to provide operating voltage for each hardware device on the electronic device 2000 .

The interface 230 includes at least one wired or wireless network interface 231 for interacting with external devices. Of course, in other examples to which the present application is adapted, the interface 230 may further include at least one serial-to-parallel conversion interface 233, at least one input/output interface 235, at least one USB interface 237, and so on, as shown in Figure 9; no specific limitation is imposed here.

存储器250作为资源存储的载体，可以是只读存储器、随机存储器、磁盘或者光盘等，其上所存储的资源包括操作系统251、应用程序253及数据255等，存储方式可以是短暂存储或者永久存储。As a carrier for resource storage, the memory 250 can be a read-only memory, a random access memory, a magnetic disk or an optical disk, etc. The resources stored thereon include the operating system 251, application programs 253, data 255, etc., and the storage method can be short-term storage or permanent storage. .

The operating system 251 manages and controls the hardware devices and the application programs 253 on the electronic device 2000, so that the central processing unit 270 can compute on and process the massive data 255 in the memory 250. It may be Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.

The application program 253 consists of computer-readable instructions that perform at least one specific task on top of the operating system 251. It may include at least one module (not shown in Figure 9), and each module may contain computer-readable instructions for the electronic device 2000. For example, the device for generating a medical three-dimensional model may be regarded as an application program 253 deployed on the electronic device 2000.

数据255可以是存储于磁盘中的照片、图片等，还可以是第一生成网络、第二生成网络等等，存储于存储器250中。The data 255 may be photos, pictures, etc. stored in a disk, or may be a first generated network, a second generated network, etc., stored in the memory 250 .

The central processing unit 270 may include one or more processors and is configured to communicate with the memory 250 through at least one communication bus, so as to read the computer-readable instructions stored in the memory 250 and thereby compute on and process the massive data 255 in the memory 250. For example, the method for generating a medical three-dimensional model is carried out by the central processing unit 270 reading a series of computer-readable instructions stored in the memory 250.

In addition, the present application can equally be implemented by hardware circuits or by hardware circuits combined with software. Implementation of the present application is therefore not limited to any specific hardware circuit, software, or combination of the two.

Referring to FIG. 10, an embodiment of the present application provides an electronic device 4000. The electronic device 4000 may be a desktop computer, a notebook computer, a server, or the like.

In FIG. 10, the electronic device 4000 includes at least one processor 4001 and at least one memory 4003.

Data exchange between the processor 4001 and the memory 4003 may be carried out over at least one communication bus 4002. The communication bus 4002 may include a path for transferring data between the processor 4001 and the memory 4003. The communication bus 4002 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like, and may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in FIG. 10, but this does not mean that there is only one bus or only one type of bus.

Optionally, the electronic device 4000 may also include a transceiver 4004, which may be used for data exchange between the electronic device and other electronic devices, such as sending and/or receiving data. It should be noted that in practical applications the number of transceivers 4004 is not limited to one, and the structure of the electronic device 4000 does not limit the embodiments of the present application.

The processor 4001 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the disclosure of the present application. The processor 4001 may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.

The memory 4003 may be a ROM (Read Only Memory) or another type of static storage device that can store static information and instructions, or a RAM (Random Access Memory) or another type of dynamic storage device that can store information and instructions. It may also be an EEPROM (Electrically Erasable Programmable Read Only Memory), a CD-ROM (Compact Disc Read Only Memory) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program instructions or code in the form of instructions or data structures and that can be accessed by the electronic device 4000, but is not limited to these.

Computer-readable instructions are stored in the memory 4003, and the processor 4001 can read the computer-readable instructions stored in the memory 4003 through the communication bus 4002.

The computer-readable instructions are executed by one or more processors 4001 to implement the method for generating a medical three-dimensional model in the above embodiments.

In addition, an embodiment of the present application provides a storage medium on which computer-readable instructions are stored. The computer-readable instructions are executed by one or more processors to implement the method for generating a medical three-dimensional model as described above.

An embodiment of the present application provides a computer program product. The computer program product includes computer-readable instructions stored in a storage medium. One or more processors of an electronic device read the computer-readable instructions from the storage medium, then load and execute them, so that the electronic device implements the method for generating a medical three-dimensional model as described above.

It should be understood that although the steps in the flowcharts of the accompanying drawings are shown in sequence as indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, there is no strict restriction on the order of execution, and the steps may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment and may be executed at different times. Nor are they necessarily performed one after another; they may be performed in turn or alternately with other steps, or with at least part of the sub-steps or stages of other steps.

The above are only some embodiments of the present application. It should be noted that those of ordinary skill in the art may make improvements and refinements without departing from the principles of the present application, and such improvements and refinements shall also be regarded as falling within the scope of protection of the present application.

Claims

1. A method for generating a medical three-dimensional model, characterized in that the method includes:

Obtain medical text in response to input operations in the interactive interface;

Call the first generation network and, under the guidance of the medical text, learn the first tensor input to the first generation network to obtain a medical two-dimensional image that conforms to the description of the medical text; the first generation network is a trained deep learning model capable of generating medical two-dimensional images from medical text;

Input the medical two-dimensional image into a second generation network to generate a medical three-dimensional model; the second generation network is a deep learning model that has been trained and has the ability to generate a medical three-dimensional model from a medical two-dimensional image;

The medical three-dimensional model generated by the second generation network is displayed in the interactive interface.
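The four steps of claim 1 can be sketched as a small pipeline. This is only an illustrative outline under stated assumptions: `generate_medical_3d_model`, `stub_first_network`, `stub_second_network`, the `Mesh` type, and the `latent_size` and `seed` parameters are hypothetical names introduced here, and the two stub functions merely stand in for the trained first and second generation networks.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple

Tensor = List[float]
Image = List[List[float]]


@dataclass
class Mesh:
    """Hypothetical container for a generated three-dimensional model."""
    vertices: List[Tuple[float, float, float]]
    faces: List[Tuple[int, int, int]]


def stub_first_network(text: str, tensor: Tensor) -> Image:
    # Placeholder for the trained text-to-image network:
    # returns a 2x2 "image" derived from the text length and the tensor.
    scale = len(text) % 7 + 1
    return [[abs(tensor[0]) * scale, abs(tensor[1]) * scale],
            [abs(tensor[2]) * scale, abs(tensor[3]) * scale]]


def stub_second_network(image: Image) -> Mesh:
    # Placeholder for the trained image-to-3D network:
    # lifts each pixel (row, col, intensity) to one vertex.
    vertices = [(float(r), float(c), v)
                for r, row in enumerate(image)
                for c, v in enumerate(row)]
    return Mesh(vertices=vertices, faces=[(0, 1, 2), (1, 2, 3)])


def generate_medical_3d_model(
    medical_text: str,
    first_network: Callable[[str, Tensor], Image],
    second_network: Callable[[Image], Mesh],
    display: Callable[[Mesh], None],
    latent_size: int = 16,
    seed: int = 0,
) -> Mesh:
    # Step 1: medical_text is assumed to come from the interactive interface.
    # Step 2: a first tensor is passed to the first generation network,
    #         which produces a two-dimensional image guided by the text.
    rng = random.Random(seed)
    first_tensor = [rng.gauss(0.0, 1.0) for _ in range(latent_size)]
    image_2d = first_network(medical_text, first_tensor)
    # Step 3: the second generation network lifts the image to a 3D model.
    model_3d = second_network(image_2d)
    # Step 4: the model is handed back to the interface for display.
    display(model_3d)
    return model_3d
```

Passing the networks and the display callback as arguments keeps the sketch independent of any particular model library or user-interface toolkit.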

2. The method of claim 1, wherein the first generation network includes a text encoder, a diffusion model, and an image decoder;

Calling the first generation network and, under the guidance of the medical text, learning the first tensor input to the first generation network to obtain a medical two-dimensional image that conforms to the description of the medical text, includes:

Using the text encoder, convert the medical text into a second tensor, and obtain the randomly generated first tensor;

Control the second tensor to guide the diffusion model to perform diffusion learning on the first tensor according to the content style described in the medical text to obtain a third tensor;

Using the image decoder, the third tensor is converted into the medical two-dimensional image.

3. The method of claim 2, wherein controlling the second tensor to guide the diffusion model to perform diffusion learning on the first tensor according to the content style described in the medical text, to obtain the third tensor, includes:

Input the first tensor into the diffusion model, and perform denoising through the reverse process of the diffusion model;

Using the second tensor as a guiding condition and introducing the guiding condition into the denoising process of the first tensor, the third tensor is obtained.
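The idea of introducing a guiding condition into a reverse denoising process can be illustrated with a toy loop. This is a minimal sketch, not the Stable Diffusion sampler and not the applicant's implementation: `toy_noise_predictor` is a hypothetical stand-in for a trained conditional noise predictor, the update is a simplified deterministic step, and `steps` and `step_size` are illustrative parameters.

```python
from typing import List

Tensor = List[float]


def toy_noise_predictor(x: Tensor, cond: Tensor, t: int) -> Tensor:
    """Hypothetical stand-in for a trained conditional noise predictor.

    It treats the condition as the clean signal, so the "predicted noise"
    is simply the difference between the current tensor and the condition.
    A real model would be a neural network that also depends on t.
    """
    return [xi - ci for xi, ci in zip(x, cond)]


def guided_denoise(first_tensor: Tensor, second_tensor: Tensor,
                   steps: int = 50, step_size: float = 0.1) -> Tensor:
    """Run a simplified reverse process from t = steps down to t = 1.

    first_tensor  : the randomly generated starting tensor (pure noise)
    second_tensor : the encoded medical text, used as the guiding condition
    returns       : the denoised result, playing the role of the third tensor
    """
    x = list(first_tensor)
    for t in range(steps, 0, -1):
        # The condition enters every step through the noise prediction.
        eps = toy_noise_predictor(x, second_tensor, t)
        # Remove a fraction of the predicted noise.
        x = [xi - step_size * ei for xi, ei in zip(x, eps)]
    return x
```

In this toy setting the result drifts toward the condition tensor as the number of steps grows, which mirrors how the guiding condition shapes the denoised output.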

4. The method of claim 1, further comprising:

Obtain the original medical image, and annotate the original medical image with the medical text to obtain the medical annotated image; the medical annotated image refers to the original medical image carrying the medical text;

Perform three-dimensional image reconstruction calculations on the medical annotated image to obtain a three-dimensional annotation model; the three-dimensional annotation model carries the medical text corresponding to the medical annotated image;

Decompose the three-dimensional annotation model into multiple two-dimensional annotation images according to different viewing angles; each two-dimensional annotation image corresponds to a different viewing angle and carries medical text corresponding to the three-dimensional annotation model;

A first training set is constructed based on each of the two-dimensional annotation images and the medical text they carry, and a second training set is constructed based on the three-dimensional annotation model and each of the two-dimensional annotation images carrying the medical text corresponding to the three-dimensional annotation model;

Wherein the first training set is used to train the first generation network, and the second training set is used to train the second generation network.
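The construction of the two training sets can be sketched as follows. The sketch assumes that the three-dimensional annotation model is a voxel volume and that "different viewing angles" means the three axis-aligned views obtained by maximum-intensity projection; the claim fixes neither choice. The names `project` and `build_training_sets` are hypothetical.

```python
from typing import Dict, List, Tuple

Volume = List[List[List[float]]]  # indexed as volume[z][y][x]
Image = List[List[float]]


def project(volume: Volume, axis: str) -> Image:
    """Collapse one axis of the volume by taking the maximum intensity."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    if axis == "z":  # top view, image indexed [y][x]
        return [[max(volume[z][y][x] for z in range(depth))
                 for x in range(width)] for y in range(height)]
    if axis == "y":  # front view, image indexed [z][x]
        return [[max(volume[z][y][x] for y in range(height))
                 for x in range(width)] for z in range(depth)]
    if axis == "x":  # side view, image indexed [z][y]
        return [[max(volume[z][y][x] for x in range(width))
                 for y in range(height)] for z in range(depth)]
    raise ValueError(f"unknown axis: {axis!r}")


def build_training_sets(volume: Volume, medical_text: str):
    """Return (first_set, second_set) for one annotated volume.

    first_set  : (text, 2D view) pairs, for the text-to-image network
    second_set : ((text, 2D view), volume) pairs, for the image-to-3D network
    """
    views: Dict[str, Image] = {a: project(volume, a) for a in ("x", "y", "z")}
    first_set: List[Tuple[str, Image]] = [
        (medical_text, img) for img in views.values()
    ]
    second_set: List[Tuple[Tuple[str, Image], Volume]] = [
        ((medical_text, img), volume) for img in views.values()
    ]
    return first_set, second_set
```

Every two-dimensional view carries the same medical text as the volume it was derived from, so both sets stay aligned on one annotation.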

5. The method of claim 4, wherein before calling the first generation network and, under the guidance of the medical text, learning the first tensor input to the first generation network to obtain a medical two-dimensional image that conforms to the description of the medical text, the method further includes:

Obtain an initial diffusion model; the initial diffusion model is a pre-trained Stable Diffusion model;

Based on the first training set, perform parameter tuning training on the initial diffusion model to obtain the diffusion model that has completed training;

The first generation network is constructed based on a text encoder, the trained diffusion model, and an image decoder.

6. The method of claim 4, wherein before inputting the medical two-dimensional image into the second generation network to generate the medical three-dimensional model, the method further includes:

Obtain the deep learning model pre-trained using the natural image training set;

Based on the second training set, perform parameter tuning training on the deep learning model;

If the parameter tuning training of the deep learning model meets the set conditions, the second generation network is obtained.
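The fine-tuning flow of claim 6 (start from pre-trained parameters, tune on the second training set, stop once a set condition is met) can be illustrated with a toy example. The claim does not say what the "set conditions" are; this sketch assumes a loss threshold together with an epoch cap. The one-parameter linear model and the name `fine_tune` are hypothetical stand-ins for a real deep learning model and its training routine.

```python
from typing import List, Tuple


def fine_tune(
    weight: float,
    samples: List[Tuple[float, float]],
    lr: float = 0.05,
    loss_threshold: float = 1e-4,
    max_epochs: int = 1000,
) -> Tuple[float, int, bool]:
    """Tune a pre-trained weight of the toy model y = weight * x.

    Returns (tuned_weight, epochs_run, converged). The stopping rule
    (mean squared error below loss_threshold) is an assumed example
    of a set condition.
    """
    n = len(samples)
    for epoch in range(1, max_epochs + 1):
        # Gradient of the mean squared error with respect to the weight.
        grad = sum(2.0 * (weight * x - y) * x for x, y in samples) / n
        weight -= lr * grad
        loss = sum((weight * x - y) ** 2 for x, y in samples) / n
        if loss < loss_threshold:
            return weight, epoch, True  # condition met: tuning is done
    return weight, max_epochs, False  # epoch cap reached without meeting it


# "Pre-trained" weight from another domain, tuned on domain-specific pairs.
pretrained_weight = 0.5
second_training_set = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
tuned_weight, epochs_run, converged = fine_tune(pretrained_weight,
                                                second_training_set)
```

Returning the convergence flag alongside the weight lets the caller decide whether the tuned model qualifies as the second generation network or whether training should continue.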

7. The method according to any one of claims 1 to 6, wherein after displaying the medical three-dimensional model generated by the second generation network in the interactive interface, the method includes:

Import the medical three-dimensional model into the constructed virtual scene; the virtual scene is constructed for medical training or medical teaching;

Based on the target object's simulated operation on the medical three-dimensional model in the virtual scene, the medical training or medical teaching of the target object is assessed.

8. A device for generating a medical three-dimensional model, characterized in that the device includes:

A text acquisition module, used to acquire medical text in response to input operations in the interactive interface;

An image generation module, used to call the first generation network and, under the guidance of the medical text, learn the first tensor input to the first generation network to obtain a medical two-dimensional image that conforms to the description of the medical text; the first generation network is a trained deep learning model capable of generating medical two-dimensional images from medical text;

A model generation module, used to input the medical two-dimensional image into a second generation network to generate a medical three-dimensional model; the second generation network is a trained deep learning model capable of generating a medical three-dimensional model from a medical two-dimensional image;

A model display module, used to display the medical three-dimensional model generated by the second generation network in the interactive interface.

9. An electronic device, characterized by comprising: at least one processor and at least one memory, wherein,

Computer readable instructions are stored on the memory;

The computer-readable instructions are executed by one or more of the processors, so that the electronic device implements the method for generating a medical three-dimensional model according to any one of claims 1 to 7.

10. A storage medium with computer-readable instructions stored thereon, characterized in that the computer-readable instructions are executed by one or more processors to implement the method for generating a medical three-dimensional model according to any one of claims 1 to 7.