WO2023041166A1 - Inferring clinical preferences from data

Info

Publication number: WO2023041166A1
Application number: PCT/EP2021/075535
Authority: WO
Inventors: Jens Olof Sjolund
Original assignee: Elekta Ab (Publ)
Priority date: 2021-09-16
Filing date: 2021-09-16
Publication date: 2023-03-23

Abstract

Systems and methods are disclosed for estimating optimization problem preferences. The systems and methods perform operations comprising receiving a first set of radiotherapy treatment plan data; processing the first set of radiotherapy treatment plan data with a machine learning model to estimate one or more preferences in a multicriteria radiotherapy treatment plan optimization (MCO) problem, wherein the machine learning model is trained to establish a relationship between the one or more preferences and information associated with a plurality of training radiotherapy treatment plan data, wherein the machine learning model includes a neural network; generating a solution to the MCO problem based on the estimated one or more preferences; and generating a radiotherapy treatment device parameter based on the solution to the MCO problem.

Description

INFERRING CLINICAL PREFERENCES FROM DATA

TECHNICAL FIELD

[0001] Embodiments of the present disclosure pertain generally to generating radiation therapy treatment plans.

BACKGROUND

[0002] Radiation therapy (or “radiotherapy”) can be used to treat cancers or other ailments in mammalian (e.g., human and animal) tissue. External beam radiotherapy employs a device that emits high-energy particles (e.g., photons, electrons, protons, ions and the like) to irradiate a patient. One such radiotherapy technique is a Gamma Knife, by which a patient is irradiated by a large number of low-intensity gamma rays that converge with high intensity and high precision at a target (e.g., a tumor). In another embodiment, external beam radiotherapy is provided using a linear accelerator. Another form of radiotherapy is brachytherapy, where a radiation source is placed inside or next to the area requiring treatment.

[0003] The placement, direction, and shape of the radiation field must be accurately controlled to ensure the target receives the prescribed radiation dose, and the radiation from the beam should minimize damage to the surrounding healthy tissue, often called the organ(s) at risk (OARs). Radiation dose is termed “prescribed” because a physician orders a predefined amount of radiation dose to the target and surrounding organs similar to a prescription for medicine.

[0004] A specified or selectable beam energy can be used, such as for delivering a diagnostic energy level range or a therapeutic energy level range. Modulation of a radiation beam can be provided by one or more attenuators or collimators (e.g., a multi-leaf collimator (MLC)). The intensity and shape of the radiation beam can be adjusted by collimation to avoid damaging healthy tissue (e.g., OARs) adjacent to the targeted tissue by conforming the projected beam to a profile of the targeted tissue.

[0005] The treatment planning procedure may include using a three-dimensional (3D) image of the patient to identify a target region (e.g., the tumor) and to identify critical organs near the tumor. Creation of a treatment plan can be a time-consuming process where a planner tries to comply with various treatment objectives or constraints (e.g., dose volume histogram (DVH), overlap volume histogram (OVH)), taking into account their individual importance (e.g., weighting or clinical preferences) in order to produce a treatment plan that is clinically acceptable. This task can be a time-consuming trial-and-error process that is complicated by the various OARs because as the number of OARs increases (e.g., a dozen or more for a head-and-neck treatment), so does the complexity of the process. OARs distant from a tumor may be easily spared from radiation, while OARs close to or overlapping a target tumor may be difficult to spare.

[0006] Traditionally, for each patient, the initial treatment plan can be generated in an “offline” manner. The treatment plan can be developed well before radiation therapy is delivered, such as using one or more medical imaging techniques. Imaging information can include, for example, images from X-rays, computed tomography (CT), nuclear magnetic resonance (MR), positron emission tomography (PET), single-photon emission computed tomography (SPECT), or ultrasound. A health care provider, such as a physician, may use 3D imaging information indicative of the patient anatomy to identify one or more target tumors along with the OARs near the tumor(s). The health care provider can delineate the target tumor that is to receive a prescribed radiation dose using a manual technique, and the health care provider can similarly delineate nearby tissue, such as organs, at risk of damage from the radiation treatment. Alternatively or additionally, an automated tool (e.g., ABAS provided by Elekta AB, Sweden) can be used to assist in identifying or delineating the target tumor and organs at risk. A radiation therapy treatment plan (“treatment plan”) can then be created using numerical optimization techniques that minimize objective functions composed of clinical and dosimetric objectives and constraints (e.g., the maximum, minimum, and fraction of dose of radiation to a fraction of the tumor volume (“95% of target shall receive no less than 100% of prescribed dose”), and like measures for the critical organs). The optimized plan is comprised of numerical parameters that specify, for instance, the direction, cross-sectional shape, and intensity of each radiation beam.

[0007] The treatment plan can then be later executed by positioning the patient in the treatment machine and delivering the prescribed radiation therapy directed by the optimized plan parameters. The radiation therapy treatment plan can include dose “fractioning,” whereby a sequence of radiation treatments is provided over a predetermined period of time (e.g., 30-45 daily fractions), with each treatment including a specified fraction of a total prescribed dose. However, during treatment, the position of the patient and the position of the target tumor in relation to the treatment machine (e.g., linear accelerator - “linac”) are critical to ensuring that the target tumor, and not healthy tissue, is irradiated.

OVERVIEW

[0008] In some embodiments, methods, systems, and computer-readable medium are provided for inferring or generating one or more clinical preferences. The methods, systems, and computer-readable medium perform operations comprising: receiving, by processing circuitry, a first set of radiotherapy treatment plan data; processing, by the processing circuitry, the first set of radiotherapy treatment plan data with a machine learning model to estimate one or more preferences in a multicriteria radiotherapy treatment plan optimization (MCO) problem, wherein the machine learning model is trained to establish a relationship between the one or more preferences and information associated with a plurality of training radiotherapy treatment plan data, wherein the machine learning model includes a neural network; generating, by the processing circuitry, a solution to the MCO problem based on the estimated one or more preferences; and generating a radiotherapy treatment device parameter based on the solution to the MCO problem.

[0009] In some examples, the solution to the MCO problem is further generated using the first set of radiotherapy treatment plan data.

[0010] In some examples, the operations include selecting the machine learning model from a plurality of machine learning models, each of the plurality of machine learning models being trained based on a different set of training radiotherapy treatment plan data corresponding to different medical facilities or clinicians.

[0011] In some examples, the first set of radiotherapy treatment plan data comprises one or more medical images.

[0012] In some examples, the one or more preferences are expressed using at least one of a weighted sum of objectives or an ordinal ranking of objectives.

[0013] In some examples, a first preference of the one or more preferences corresponds to an amount of dose to a target region, a second preference of the one or more preferences corresponds to a level of normal tissue sparing, and a third preference of the one or more preferences corresponds to a treatment complexity parameter.

[0014] In some examples, the treatment complexity parameter comprises at least one of treatment delivery time, number of required geometrical configurations, or robustness to positioning uncertainty.

[0015] In some examples, the operations include: generating a display of a graphical user interface for generating a radiotherapy treatment plan; displaying, within the graphical user interface, the one or more preferences as default values for solving the MCO problem; and detecting user interaction with the graphical user interface modifying the default values.

[0016] In some examples, the operations include training the machine learning model by performing training operations comprising: obtaining the plurality of training radiotherapy treatment plan data corresponding to a known set of radiotherapy treatment plans; selecting a first batch of the plurality of training radiotherapy treatment plan data; processing the first batch of the plurality of training radiotherapy treatment plan data with the machine learning model to estimate a set of preferences in the MCO problem; solving the MCO problem based on the estimated set of preferences to generate a batch of training radiotherapy treatment plans; computing a gradient of a loss function with respect to the estimated set of preferences, wherein the loss function is based on the batch of training radiotherapy treatment plans and the known set of radiotherapy treatment plans; and updating parameters of the machine learning model based on the gradient.

[0017] In some examples, the operations include performing a plurality of iterations of the training operations across additional batches of the plurality of training radiotherapy treatment plan data until a stopping criterion is met.

[0018] In some examples, the batch of training radiotherapy treatment plans is a first batch of training radiotherapy treatment plans, and the operations further include after updating the parameters of the machine learning model: processing the first batch of the plurality of training radiotherapy treatment plan data with the machine learning model again to estimate a new set of preferences of the MCO problem; solving the MCO problem based on the estimated new set of preferences to generate a new batch of training radiotherapy treatment plans; computing a new gradient of the loss function with respect to the new batch of training radiotherapy treatment plans and the first batch of training radiotherapy treatment plans; and updating parameters of the machine learning model based on the new gradient.

[0019] In some examples, the MCO problem is solved in accordance with a gradient-based optimization method.

[0020] In some examples, the machine learning model comprises a generative adversarial network (GAN) comprising a discriminative machine learning model or a generative machine learning model; and the loss function comprises at least one of (i) a deviation between information associated with the batch of training radiotherapy treatment plans and information associated with the known set of radiotherapy treatment plans, (ii) an adversarial loss function, (iii) an evidence lower bound (ELBO), or (iv) a likelihood.

[0021] In some examples, solving the MCO problem based on the estimated set of preferences comprises processing the estimated set of preferences in accordance with at least one of a deep equilibrium model or a convex optimization layers process.

[0022] In some examples, the generated radiotherapy treatment device parameter comprises at least one of fluence maps, control points, beam-angles, collimator apertures, seed placements, dwell times, isocenter locations, or beam-on times.

[0023] In some embodiments, methods, systems, and computer-readable medium are provided for inferring or generating one or more clinical preferences. The methods, systems, and computer-readable medium perform operations comprising: receiving a first set of radiotherapy treatment plan data; and training a machine learning model to estimate one or more preferences in a multicriteria radiotherapy treatment plan optimization (MCO) problem by establishing a relationship between the one or more preferences and information associated with a plurality of training radiotherapy treatment plan data, wherein the machine learning model includes a neural network. The one or more preferences can comprise weighted objectives in the MCO problem.

[0024] The above overview is intended to provide an overview of subject matter of the present patent application. It is not intended to provide an exclusive or exhaustive explanation of the inventive subject matter. The detailed description is included to provide further information about the present patent application.

BRIEF DESCRIPTION OF THE DRAWINGS

[0025] In the drawings, which are not necessarily drawn to scale, like numerals describe substantially similar components throughout the several views. Like numerals having different letter suffixes represent different instances of substantially similar components. The drawings illustrate generally, by way of example but not by way of limitation, various embodiments discussed in the present document.

[0026] FIG. 1 illustrates an example radiotherapy system, according to some embodiments of the present disclosure.

[0027] FIG. 2A illustrates an example radiation therapy system that can include radiation therapy output configured to provide a therapy beam, according to some embodiments of the present disclosure.

[0028] FIG. 2B illustrates an example system including a combined radiation therapy system and an imaging system, such as a cone beam computed tomography (CBCT) imaging system, according to some embodiments of the present disclosure.

[0029] FIG. 3 illustrates a partially cut-away view of an example system including a combined radiation therapy system and an imaging system, such as a nuclear magnetic resonance (MR) imaging (MRI) system, according to some embodiments of the present disclosure.

[0030] FIGS. 4A and 4B depict the differences between an example MRI image and a corresponding CT image, respectively, according to some embodiments of the present disclosure.

[0031] FIG. 5 illustrates an example collimator configuration for shaping, directing, or modulating an intensity of a radiation therapy beam, according to some embodiments of the present disclosure.

[0032] FIG. 6 illustrates an example Gamma Knife radiation therapy system, according to some embodiments of the present disclosure.

[0033] FIGS. 7A and 7B illustrate example flow diagrams for deep learning, according to some embodiments of the present disclosure.

[0034] FIGS. 8 and 9 illustrate example data flows for training and use of a machine learning model to generate preferences of a multicriteria optimization (MCO) problem, according to some embodiments of the present disclosure.

[0035] FIG. 10 illustrates an example user interface for solving an MCO problem using estimated preferences, according to some embodiments of the present disclosure.

[0036] FIG. 11 illustrates an example block diagram of a machine on which one or more of the methods as discussed herein can be implemented.

DETAILED DESCRIPTION

[0037] In the following detailed description, reference is made to the accompanying drawings, which form a part hereof and in which are shown, by way of illustration, specific embodiments in which the present disclosure may be practiced. These embodiments, which are also referred to herein as “examples,” are described in sufficient detail to enable those skilled in the art to practice the disclosure, and it is to be understood that the embodiments may be combined, or that other embodiments may be utilized and that structural, logical and electrical changes may be made without departing from the scope of the present disclosure. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present disclosure is defined by the appended claims and their equivalents.

[0038] Intensity modulated radiotherapy (IMRT) and volumetric modulated arc therapy (VMAT) have become the standards of care in modern cancer radiation therapy. Creating individual patient IMRT or VMAT treatment plans is a trial-and-error process, weighing target dose versus OAR sparing tradeoffs, and adjusting program constraints whose effects on the plan quality metrics and the dose distribution can be very difficult to anticipate. Indeed, the order in which the planning constraints are adjusted can itself result in dose differences. Treatment plan quality depends on often subjective judgements by the planner that depend on his/her experience and skill. Even the most skilled planners still have no assurance that their plans are close to the best possible or whether a little or a lot of effort will result in a significantly better plan.

[0039] The process for preparing a radiotherapy treatment plan usually involves solving an optimization problem that balances various conflicting objectives or preferences, such as high dose to target, normal tissue sparing, and treatment complexity. One substantial challenge in radiation therapy is the conflict between multiple clinical goals or objectives, such as delivering a sufficient amount of radiation dose to the tumor yet sparing nearby healthy organs. Moreover, each healthy organ responds to radiation differently; thus, various clinical objectives are required to address issues with different organs. Treatment planners typically employ multicriteria optimization (MCO) techniques to find optimal treatment parameter settings that generate desirable trade-offs among the set of objectives. Such optimization problems are so-called MCO problems.

[0040] Commonly, the different criteria or preferences are combined using a weighted sum, where each weight determines the relative importance of that criterion. Finding acceptable weights (or preferences) is often a manual and tedious process of trial-and-error, especially so because evaluating a single choice of preferences and associated weights requires solving the full MCO problem, which may take from a few seconds up to an hour depending on the application. Moreover, different planners often have different preferences, so there is no choice of weights or preferences that works for everyone. The entire planning process is often slow and leads to inconsistent, suboptimal treatment quality. Specifically, the chosen preferences can affect computational complexity as well as the realism of the optimization problem. For example, linear objective functions can make the problem computationally more efficient, but the resulting formulation may not adequately reflect clinical reality.
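As a non-limiting illustration of the weighted-sum formulation described above, the following minimal sketch solves a toy two-objective problem (target coverage versus OAR sparing) for several OAR weights. The dose-influence matrices, the prescribed dose, and the function name `solve_weighted_sum` are hypothetical and chosen only for illustration; quadratic objectives are assumed so that each solve has a closed form, and the non-negativity constraint a real fluence optimization would impose is omitted.

```python
import numpy as np

# Hypothetical toy dose model: rows are voxels, columns are beamlet intensities.
A_target = np.array([[1.0, 0.5],
                     [0.5, 1.0]])    # dose-influence rows for two target voxels
A_oar = np.array([[0.6, 0.2]])       # dose-influence row for one OAR voxel
prescribed = np.array([1.0, 1.0])    # prescribed target dose (arbitrary units)


def solve_weighted_sum(w_target, w_oar):
    """Minimize w_target*||A_target x - prescribed||^2 + w_oar*||A_oar x||^2.

    Both objectives are quadratic, so the minimizer solves a linear system.
    Non-negativity of x is deliberately ignored in this sketch.
    """
    H = w_target * A_target.T @ A_target + w_oar * A_oar.T @ A_oar
    g = w_target * A_target.T @ prescribed
    return np.linalg.solve(H, g)


# Each choice of weights requires a full solve; here the solve is trivial,
# but in a clinical problem it is the expensive step.
for w_oar in (0.0, 1.0, 10.0):
    x = solve_weighted_sum(1.0, w_oar)
    print(f"w_oar={w_oar:5.1f}  target dose={A_target @ x}  OAR dose={A_oar @ x}")
```

In this toy setting, raising the OAR weight lowers the OAR dose and moves the target dose away from the prescription, which is the trade-off a planner otherwise explores by manually adjusting weights and re-solving.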

[0041] Certain systems exist that train machine learning models to predict preferences by learning from training data where the MCO problems that generated the data are known. For example, a set of training data is retrieved that includes treatment plan data and associated objectives with known relative weights. The machine learning models are then trained, based on the set of training data, to predict the same or similar objectives when presented with a new set of treatment plan data. While such systems generally work well, they require training data that already includes the known set of objectives. No existing system can train a machine learning model to estimate preferences using training data that does not include such preferences in the first place.

[0042] The disclosed embodiments provide an individualized method of predicting suitable weights or preferences (objectives) to speed up radiotherapy treatment plan generation. Specifically, the present disclosure includes various techniques to improve and enhance radiotherapy treatment planning by providing a machine learning model (e.g., a neural network, a generative adversarial network (GAN), a generative model, or a discriminative model) that predicts, estimates, or generates preferences (or objectives) of an MCO problem. As referred to herein, the terms “preferences” and “objectives” of an MCO problem represent clinically desired data that can be translated in various ways into mathematical expressions that in turn can be translated into the preferences and/or objective functions and/or constraints of the MCO problem. The MCO problem can then be solved based on such estimated preferences to generate a radiotherapy treatment device parameter based on the solution to the MCO problem. Namely, rather than having to rely on the expertise of an individual clinician to select a given set of preferences and test those preferences to see if the solution to the MCO problem is suitable in a trial-and-error process, the disclosed techniques provide a machine learning model that predicts, estimates, or generates such preferences that may already provide a suitable solution to the MCO problem, which avoids unnecessary time and expense of solving an MCO problem based on preferences that may or may not be optimal. The technical benefits include reduced radiotherapy treatment plan creation time. The disclosed techniques may be applicable to a variety of medical treatment and diagnostic settings or radiotherapy treatment equipment and devices.

[0043] According to some embodiments, the disclosure, indirectly or directly, learns (trains a machine learning model based on) the preferences for an MCO problem based on information associated with a plurality of training radiotherapy treatment plan data. Namely, the machine learning model is trained based on a set of training radiotherapy treatment plans to learn the preferences used to generate such training radiotherapy treatment plans without being provided the objectives or preferences. For example, the machine learning model is trained to estimate what preferences were used to generate a set of radiotherapy treatment plans when such preferences are not known. To do so, an MCO problem is solved to generate a radiotherapy treatment plan using a set of estimated preferences (such as based on a set of radiotherapy treatment plan data) and then a deviation is computed between the generated radiotherapy treatment plan and the known radiotherapy treatment plan that corresponds to the set of radiotherapy treatment plan data. This deviation is used to compute a gradient of a loss function to update parameters of the machine learning model.
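One way the gradient described above could be obtained is by implicit differentiation of the MCO solution, sketched here under assumptions that are not stated in the disclosure: the preferences are weights $w$ of a weighted sum of twice-differentiable objectives $f_i$, the problem is unconstrained, the matrix $H$ is invertible, and the weights are produced by a model $g_\theta$ from plan data $d$. The symbols $f_i$, $g_\theta$, $d$, $\ell$, and $x^{\mathrm{known}}$ are introduced only for this illustration.

```latex
\[
\begin{aligned}
x^{*}(w) &= \arg\min_{x} \sum_{i} w_i f_i(x), \qquad w = g_{\theta}(d), \\
0 &= \sum_{i} w_i \nabla f_i\bigl(x^{*}\bigr)
  \quad \text{(stationarity at the solution)}, \\
\frac{\partial x^{*}}{\partial w_j} &= -H^{-1} \nabla f_j\bigl(x^{*}\bigr),
  \qquad H = \sum_{i} w_i \nabla^{2} f_i\bigl(x^{*}\bigr), \\
\frac{\partial \ell}{\partial w_j} &=
  -\nabla_{x}\ell\bigl(x^{*}, x^{\mathrm{known}}\bigr)^{\top} H^{-1} \nabla f_j\bigl(x^{*}\bigr), \\
\nabla_{\theta}\ell &= \sum_{j} \frac{\partial \ell}{\partial w_j}\, \nabla_{\theta} w_j .
\end{aligned}
\]
```

The third line follows from differentiating the stationarity condition with respect to $w_j$. The last line is the chain rule that carries the deviation between the generated and known plans back to the model parameters $\theta$.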

[0044] In an implementation, the machine learning model can be trained by selecting a batch of training radiotherapy treatment plan data corresponding to a known set of radiotherapy treatment plans to process with the machine learning model to estimate a set of preferences of the MCO problem. The MCO problem is then solved based on the estimated set of preferences to generate a batch of training radiotherapy treatment plans. A gradient of a loss function is computed with respect to the estimated set of preferences, wherein the loss function is based on the batch of training radiotherapy treatment plans and the known set of radiotherapy treatment plans. Parameters of the machine learning model are then updated based on the gradient.
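The training operations above can be sketched, purely for illustration, on a toy problem in which the MCO solve has a closed form. Everything named in the sketch is an assumption rather than part of the disclosure: the shared dose-influence matrices, the log-linear stand-in for the machine learning model (`predict_weight`), the squared plan deviation used as the loss, and the use of implicit differentiation to obtain the gradient. A practical embodiment would use a neural network and a differentiable solver for a constrained problem.

```python
import numpy as np

# Hypothetical toy dose model shared by all training cases (an assumption).
A_T = np.array([[1.0, 0.5], [0.5, 1.0]])  # target dose-influence matrix
A_O = np.array([[0.6, 0.2]])              # OAR dose-influence matrix
P = np.array([1.0, 1.0])                  # prescribed target dose


def solve_mco(w_oar):
    """Weighted-sum solve: min ||A_T x - P||^2 + w_oar * ||A_O x||^2."""
    H = A_T.T @ A_T + w_oar * A_O.T @ A_O
    return np.linalg.solve(H, A_T.T @ P), H


def predict_weight(theta, phi):
    """Stand-in model: log-linear map from case features to a positive weight."""
    return np.exp(theta @ phi)


def loss_and_grad(theta, features, known_plans):
    """Mean squared plan deviation over a batch and its gradient w.r.t. theta."""
    loss, grad = 0.0, np.zeros_like(theta)
    for phi, x_known in zip(features, known_plans):
        w = predict_weight(theta, phi)
        x, H = solve_mco(w)
        r = x - x_known
        loss += r @ r
        # Implicit differentiation of H(w) x = A_T^T P: dx/dw = -H^{-1} A_O^T A_O x.
        dL_dw = -np.linalg.solve(H, 2.0 * r) @ (A_O.T @ (A_O @ x))
        grad += dL_dw * w * phi  # chain rule through w = exp(theta . phi)
    n = len(features)
    return loss / n, grad / n


def train(features, known_plans, lr=0.5, epochs=200, batch_size=4, tol=1e-10, seed=0):
    """Mini-batch gradient descent with a loss-based stopping criterion."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(features.shape[1])
    for _ in range(epochs):
        order = rng.permutation(len(features))
        for start in range(0, len(order), batch_size):
            idx = order[start:start + batch_size]
            _, grad = loss_and_grad(theta, features[idx], known_plans[idx])
            theta -= lr * grad
        if loss_and_grad(theta, features, known_plans)[0] < tol:
            break
    return theta


# Synthetic "known" plans produced with hidden weights the model never sees.
features = np.stack([np.ones(8), np.linspace(-1.0, 1.0, 8)], axis=1)
theta_hidden = np.array([0.5, 1.0])
known_plans = np.array([solve_mco(predict_weight(theta_hidden, phi))[0]
                        for phi in features])

theta_fit = train(features, known_plans)
print("initial loss:", loss_and_grad(np.zeros(2), features, known_plans)[0])
print("final loss:  ", loss_and_grad(theta_fit, features, known_plans)[0])
```

Note that the training data contains only case features and the resulting plans; the weights used to create those plans appear nowhere in the loss, which mirrors the setting in which the preferences behind the known plans are not available.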

[0045] In this way, by using machine learning to generate the preferences of the MCO problem, computational complexity of the total plan optimization process is reduced and the time needed to create a treatment plan for a given patient is reduced.

[0046] FIG. 1 illustrates an example radiotherapy system 100 for providing radiation therapy to a patient. The radiotherapy system 100 includes an image processing device 112 (e.g., a data processing device). The image processing device 112 may be connected to a network 120. The network 120 may be connected to the Internet 122. The network 120 can connect the image processing device 112 with one or more of a database 124, a hospital database 126, an oncology information system (OIS) 128, a radiation therapy device 130, an image acquisition device 132, a display device 134, and a user interface 136. The image processing device 112 can be configured to generate radiation therapy treatment plans 142 to be used by the radiation therapy device 130. In some implementations, some or all of the components of the image processing device 112 can be integrated or incorporated into any one of the database 124, a hospital database 126, an oncology information system (OIS) 128, a radiation therapy device 130, an image acquisition device 132, a display device 134, and/or a user interface 136.

[0047] As an example, the image processing device 112 can include a processor 114 that stores in a memory 116 a trained machine learning model (e.g., a neural network or generative machine learning model) for generating or predicting preferences of an MCO problem to generate a radiotherapy treatment device parameter for the radiation therapy device 130. In an example, the image processing device 112 receives a set of radiotherapy treatment plan data and processes the set of radiotherapy treatment plan data with a machine learning model to estimate one or more preferences in an MCO problem. The image processing device 112 generates a solution to the MCO problem based on the estimated preferences and then generates a radiotherapy treatment device parameter based on the solution.

[0048] The image processing device 112 may include a memory device 116, a processor 114, and a communication interface 118. The memory device 116 may store computer-executable instructions, such as an operating system 143, radiation therapy treatment plans 142 (e.g., original treatment plans, adapted treatment plans and the like), software programs 144 (e.g., artificial intelligence, deep learning, neural networks, radiotherapy treatment plan software, trained generative machine learning model(s) of one or more radiation therapy devices 130), and any other computer-executable instructions to be executed by the processor 114.

[0049] In one embodiment, the software programs 144 may convert medical images of one format (e.g., MRI) to another format (e.g., CT) by producing synthetic images, such as pseudo-CT images. For instance, the software programs 144 may include image processing programs to train a predictive model for converting a medical image 146 in one modality (e.g., an MRI image) into a synthetic image of a different modality (e.g., a pseudo CT image); alternatively, the trained predictive model may convert a CT image into an MRI image. In another embodiment, the software programs 144 may register the patient image (e.g., a CT image or an MR image) with that patient’s dose distribution (also represented as an image) so that corresponding image voxels and dose voxels are associated appropriately by the network. In yet another embodiment, the software programs 144 may substitute functions of the patient images such as signed distance functions or processed versions of the images that emphasize some aspect of the image information. Such functions might emphasize edges or differences in voxel textures or any other structural aspect useful to neural network learning. In another embodiment, the software programs 144 may substitute functions of the dose distribution that emphasize some aspect of the dose information. Such functions might emphasize steep gradients around the target or any other structural aspect useful to neural network learning. The memory device 116 may store data, including medical images 146, patient data 145, and other data required to create and implement a radiation therapy treatment plan 142.

[0050] In yet another embodiment, the software programs 144 may use a generative machine learning model (e.g., a GAN, normalizing flow, variational autoencoder (VAE) and/or diffusion model), a discriminative machine learning model, and/or a neural network to generate one or more preferences in an MCO problem based on a set of radiotherapy treatment plan data (e.g., patient images). For example, the software programs 144 can store or access the machine learning model, which has been previously trained to generate the one or more preferences and can generate new preferences for a new set of radiotherapy treatment plan data (e.g., medical images 146 and/or patient data 145) based on an output of the machine learning model. The software programs 144 then generate a solution to the MCO problem based on the one or more preferences and use the solution to generate a radiotherapy treatment plan that includes one or more radiotherapy treatment device parameters (e.g., fluence maps, control points, beam-angles, collimator apertures, seed placements, dwell times, isocenter locations, and/or beam-on times).

[0051] In some examples, the software programs 144 can present the one or more preferences in a graphical user interface comprising a display device 134 and a user interface 136 which is used to generate a radiotherapy treatment plan. The display device 134 can present such preferences as default preferences. The graphical user interface is configured to receive input through a user interface 136 that selects or modifies any one of the preferences to generate the radiotherapy treatment plan. For example, after a clinician is satisfied with the default and/or modified preferences, input can be received from the clinician via the user interface 136 that instructs the software programs 144 to solve the MCO problem based on the specified preferences. The solution to the MCO problem is then used to generate a radiotherapy treatment plan, which can also be presented on the graphical user interface.

[0052] In some examples, multiple machine learning models can be trained and selected, such that each is configured to generate or estimate the preferences associated with a different medical facility or clinician. In some implementations, a first preference of the one or more preferences corresponds to an amount of dose to a target region, a second preference of the one or more preferences corresponds to a level of normal tissue sparing, and a third preference of the one or more preferences corresponds to a treatment complexity parameter. The treatment complexity parameter comprises at least one of treatment delivery time, number of required geometrical configurations, or robustness to positioning uncertainty.
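The selection among multiple trained models could be organized, in one hypothetical arrangement, as a registry keyed by facility or clinician. The identifiers, the preference names, and the constant stand-in models below are illustrative assumptions only; a practical embodiment would load trained models and pass real treatment plan data.

```python
from typing import Callable, Dict, Sequence

# A preference model maps plan data to named, normalized preference weights.
PreferenceModel = Callable[[Sequence[float]], Dict[str, float]]


def make_constant_model(target: float, sparing: float, complexity: float) -> PreferenceModel:
    """Stand-in for a trained model: ignores its input and returns fixed weights."""
    total = target + sparing + complexity

    def model(plan_data: Sequence[float]) -> Dict[str, float]:
        return {
            "target_dose": target / total,
            "tissue_sparing": sparing / total,
            "complexity": complexity / total,
        }

    return model


# Hypothetical registry: one model per medical facility or clinician.
MODEL_REGISTRY: Dict[str, PreferenceModel] = {
    "facility_a": make_constant_model(5.0, 3.0, 2.0),
    "clinician_b": make_constant_model(2.0, 6.0, 2.0),
}


def estimate_preferences(source_id: str, plan_data: Sequence[float]) -> Dict[str, float]:
    """Select the model for the given facility or clinician and apply it."""
    try:
        model = MODEL_REGISTRY[source_id]
    except KeyError:
        raise KeyError(f"no preference model trained for {source_id!r}") from None
    return model(plan_data)


print(estimate_preferences("facility_a", [0.0]))
print(estimate_preferences("clinician_b", [0.0]))
```

Keeping the models separate, rather than pooling all training plans, reflects the observation that different planners often have different preferences.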

[0053] In one example, the software programs 144 train a given one of the machine learning models based on training data that includes training radiotherapy treatment plan data (e.g., medical images 146 and patient data 145) corresponding to a known set of radiotherapy treatment plans. The training data can be stored locally on image processing device 112 or retrieved via network 120 from database 124, hospital database 126, OIS 128, radiation therapy device 130 and/or image acquisition device 132. The known set of radiotherapy treatment plans may be generated by a first clinical facility or clinician based on the training radiotherapy treatment plan data. Specifically, the software programs 144 select a first batch of the plurality of training radiotherapy treatment plan data (e.g., a first batch of images of training patients). The first batch of the plurality of training radiotherapy treatment plan data may not include any information about the objectives or preferences used to generate the corresponding known set of radiotherapy treatment plans. The first batch of the plurality of training radiotherapy treatment plan data is processed with the given one of the machine learning models to estimate a first set of preferences of the MCO problem. The one or more preferences can be expressed using at least one of a weighted sum of objectives or an ordinal ranking of objectives. The MCO problem is solved based on the estimated set of preferences to generate a batch of training radiotherapy treatment plans, such as by processing the estimated set of preferences in accordance with at least one of a deep equilibrium model or a convex optimization layers process. The software programs 144 compute a gradient of a loss function with respect to the estimated set of preferences. The loss function can be based on the batch of training radiotherapy treatment plans and the known set of radiotherapy treatment plans. 
For example, the software programs 144 can compute a deviation between the generated batch of training radiotherapy treatment plans and the known set of radiotherapy treatment plans in the training data. Based on the deviation or the gradient of the loss function, parameters of the given machine learning model are updated. The software programs 144 perform a plurality of iterations of the training operations across additional batches of the plurality of training radiotherapy treatment plan data until a stopping criterion is met.
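
The training operations of paragraph [0053] can be illustrated with a deliberately small sketch. It is not the disclosed implementation: the MCO problem is reduced to a one-variable weighted sum with a closed-form solution, so the gradient through the solve is written by hand rather than obtained from a deep equilibrium model or a convex optimization layer, and each iteration uses all training patients instead of a batch. The names `solve_mco` and `train`, the logistic preference model with parameters `a` and `b`, the fixed iteration count used as the stopping criterion, and all numeric values are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def solve_mco(w, prescription):
    """Toy weighted-sum MCO: minimize w*(x - prescription)**2 + (1 - w)*x**2 over x.

    Setting the derivative to zero gives the closed form x* = w * prescription.
    """
    return w * prescription

def train(features, prescriptions, known_plans, lr=1.0, steps=3000):
    a, b = 0.0, 0.0  # parameters of the hypothetical preference model
    history = []
    n = len(features)
    for _ in range(steps):  # stopping criterion (assumed): fixed number of iterations
        grad_a = grad_b = loss = 0.0
        for f, t, x_known in zip(features, prescriptions, known_plans):
            w = sigmoid(a * f + b)        # estimated preference (weight on the target objective)
            x = solve_mco(w, t)           # plan generated from the estimated preference
            dev = x - x_known             # deviation from the known plan
            loss += dev * dev / n
            dloss_dw = 2.0 * dev * t / n  # chain rule through the solve: dx*/dw = t
            dw_dz = w * (1.0 - w)         # derivative of the sigmoid
            grad_a += dloss_dw * dw_dz * f
            grad_b += dloss_dw * dw_dz
        a -= lr * grad_a                  # update the model parameters
        b -= lr * grad_b
        history.append(loss)
    return a, b, history

# Synthetic training data: the known plans were produced with a hidden weight of 0.8,
# which the model never observes directly.
features = [-1.0, -0.5, 0.0, 0.5, 1.0]
prescriptions = [1.0, 1.2, 0.8, 1.1, 0.9]
hidden_w = 0.8
known_plans = [hidden_w * t for t in prescriptions]

a, b, history = train(features, prescriptions, known_plans)
estimated = [sigmoid(a * f + b) for f in features]
```

In this toy setting the estimated weights approach the hidden weight even though the training data contain only plans, not the preferences that produced them.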

[0054] In an implementation, a second one of the machine learning models is trained based on training data that includes training radiotherapy treatment plan data (e.g., medical images 146 and patient data 145) corresponding to a second known set of radiotherapy treatment plans for a second clinical facility or clinician. The second one of the machine learning models can be trained by federated learning, ensembling, bagging or boosting with the given one of the machine learning models. In such cases, the software programs 144 select the first batch of the plurality of training radiotherapy treatment plan data (e.g., a first batch of images of training patients). The first batch of the plurality of training radiotherapy treatment plan data may not include any information about the objectives or preferences used to generate the corresponding known set of radiotherapy treatment plans. The first batch of the plurality of training radiotherapy treatment plan data is processed with the second one of the machine learning models to estimate a second set of preferences of the MCO problem. The MCO problem is solved based on the estimated second set of preferences to generate a second batch of training radiotherapy treatment plans. The software programs 144 compute a gradient of a loss function with respect to the estimated set of preferences. The loss function can be based on the second batch of training radiotherapy treatment plans and the second known set of radiotherapy treatment plans. For example, the software programs 144 can compute a deviation between the generated second batch of training radiotherapy treatment plans and the second known set of radiotherapy treatment plans in the training data. Based on the deviation or the gradient of the loss function, parameters of the second machine learning model are updated. 
The software programs 144 perform a plurality of iterations of the training operations across additional batches of the plurality of training radiotherapy treatment plan data until a stopping criterion is met. In this way, different machine learning models can be trained to estimate preferences for different clinical facilities or clinicians.

[0055] The second one of the machine learning models can be trained and stored entirely separately from the given one of the machine learning models for a different facility or clinician. In another example, a single machine learning model can be generated based on training the second one of the machine learning models together with the given one of the machine learning models. For example, the models can be trained independently and make independent predictions which are then fused/aggregated into a collective model. Ensembling can work for both centralized and decentralized data. As another example, the multiple models can be trained gradually to generate more accurate models by feeding back the errors (residuals), such as based on bagging, boosting and stacking. As another example, federated learning can be used to train the multiple models, where a shared model is trained based on decentralized data. This can be done by computing gradients on local data and communicating the gradients to a central server that aggregates the local gradients before updating and redistributing the shared model.
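
The federated learning scheme described at the end of paragraph [0055] can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed system: the shared model is a single scalar `theta` fitted by least squares, the functions `local_gradient` and `federated_round` are hypothetical, and weighting each site's gradient by its sample count is one common aggregation choice that the text does not specify.

```python
def local_gradient(theta, data):
    """Mean gradient of the squared error (theta*x - y)**2 over one site's local data."""
    return sum(2.0 * (theta * x - y) * x for x, y in data) / len(data)

def federated_round(theta, sites, lr):
    """One round: each site computes a gradient locally, the server aggregates and updates."""
    total = sum(len(data) for data in sites)
    # Only gradients leave the sites; the raw (x, y) pairs stay local.
    aggregated = sum(local_gradient(theta, data) * len(data) / total for data in sites)
    return theta - lr * aggregated  # updated shared model, redistributed to all sites

# Two hypothetical sites whose data follow y = 2*x.
sites = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
theta = 0.0
for _ in range(200):
    theta = federated_round(theta, sites, lr=0.05)
```

With sample-count weighting, the aggregated gradient equals the gradient that would be computed on the pooled data, so the shared model converges to the same value as centralized training in this example.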

[0056] In some examples, the MCO problem is solved, during training and/or prediction, in accordance with a gradient-based optimization method. In some examples, solving the MCO problem based on the estimated set of preferences includes processing the estimated set of preferences in accordance with at least one of a deep equilibrium model or a convex optimization layers process during training of the machine learning technique.

[0057] In one example, a generative machine learning model is trained to generate the parameters of the MCO problem. For example, the generative machine learning model can include a GAN in which a generative model is trained using a discriminative model. In such circumstances, parameter values applied by the generative model and the discriminative model are established using adversarial training between the discriminative model and the generative model. For example, the adversarial training includes training the generative model to generate one or more preferences of the MCO problem and using those one or more preferences to generate a solution to the MCO problem that provides a synthetic radiotherapy treatment plan. The discriminative model is trained to classify the synthetic radiotherapy treatment plan as a synthetic or a real radiotherapy treatment plan. In some cases, the discriminative model is trained to classify the estimated one or more preferences directly (instead of having the generative model generate the synthetic radiotherapy treatment plan) as a synthetic or a real preference. In such cases, the discriminative model can use a distribution of real preferences to classify the estimated preferences as real or synthetic. In either case, an output of the generative model is used for training the discriminative model and an output of the discriminative model is used for training the generative model. Specifically, parameters of the generative model and the discriminative model are updated based on the classification of the discriminative model with respect to the underlying training data. In one implementation, a known radiotherapy treatment plan is obtained from the set of training data, and training loss for the discriminative model is computed based on a result of comparing the classification output by the discriminative model with the radiotherapy treatment plan obtained from the set of training data. 
For example, the generative machine learning model or the discriminative machine learning model can be trained and updated based on a loss function. The loss function can include (i) a deviation between information associated with the batch of training radiotherapy treatment plans and information associated with the known set of radiotherapy treatment plans; (ii) an adversarial loss function; (iii) an evidence lower bound (ELBO); and/or (iv) a likelihood.

[0058] In addition to the memory device 116 storing the software programs 144, it is contemplated that software programs 144 may be stored on a removable computer medium, such as a hard drive, a computer disk, a CD-ROM, a DVD, a HD, a Blu-Ray DVD, USB flash drive, a SD card, a memory stick, or any other suitable medium; and when downloaded to image processing device 112, the software programs 144 may be executed by processor 114.

[0059] The processor 114 may be communicatively coupled to the memory device 116, and the processor 114 may be configured to execute computer-executable instructions stored thereon. The processor 114 may send or receive medical images 146 to memory device 116. For example, the processor 114 may receive medical images 146 from the image acquisition device 132 via the communication interface 118 and network 120 to be stored in memory device 116. The processor 114 may also send medical images 146 stored in memory device 116 via the communication interface 118 to the network 120 to be either stored in database 124 or the hospital database 126.

[0060] Further, the processor 114 may utilize software programs 144 (e.g., a treatment planning software) along with the medical images 146 and patient data 145 to create the radiation therapy treatment plan 142. Medical images 146 may include information such as imaging data associated with a patient anatomical region, organ, or volume of interest segmentation data. Patient data 145 may include information such as (1) functional organ modeling data (e.g., serial versus parallel organs, appropriate dose response models, etc.); (2) radiation dosage data (e.g., DVH information); or (3) other clinical information about the patient and course of treatment (e.g., other surgeries, chemotherapy, previous radiotherapy, etc.).

[0061] In addition, the processor 114 may utilize software programs to generate intermediate data such as updated parameters to be used, for example, by a machine learning model, such as a neural network model; or generate intermediate 2D or 3D images, which may then subsequently be stored in memory device 116. The processor 114 may subsequently then transmit the executable radiation therapy treatment plan 142 via the communication interface 118 to the network 120 to the radiation therapy device 130, where the radiation therapy plan will be used to treat a patient with radiation. In addition, the processor 114 may execute software programs 144 to implement functions such as image conversion, image segmentation, artefact correction, dimensionality reduction and function estimation. For instance, the processor 114 may execute software programs 144 that train or contour a medical image; such software programs 144 when executed may train a boundary detector or utilize a shape dictionary.

[0062] The processor 114 may be a processing device and may include one or more general-purpose processing devices such as a microprocessor, a central processing unit (CPU), a graphics processing unit (GPU), an accelerated processing unit (APU), or the like. More particularly, the processor 114 may be a complex instruction set computing (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a processor implementing other instruction sets, or processors implementing a combination of instruction sets. The processor 114 may also be implemented by one or more special-purpose processing devices such as an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), a digital signal processor (DSP), a System on a Chip (SoC), or the like. As would be appreciated by those skilled in the art, in some embodiments, the processor 114 may be a special-purpose processor, rather than a general-purpose processor. The processor 114 may include one or more known processing devices, such as a microprocessor from the Pentium™, Core™, Xeon™, or Itanium® family manufactured by Intel™, the Turion™, Athlon™, Sempron™, Opteron™, FX™, Phenom™ family manufactured by AMD™, or any of various processors manufactured by Sun Microsystems. The processor 114 may also include GPUs such as a GPU from the GeForce®, Quadro®, Tesla® family manufactured by Nvidia™, GMA, Iris™ family manufactured by Intel™, or the Radeon™ family manufactured by AMD™. The processor 114 may also include accelerated processing units such as the Xeon Phi™ family manufactured by Intel™. The disclosed embodiments are not limited to any type of processor(s) otherwise configured to meet the computing demands of identifying, analyzing, maintaining, generating, and/or providing large amounts of data or manipulating such data to perform the methods disclosed herein. 
In addition, the term “processor” may include more than one processor (for example, a multi-core design or a plurality of processors each having a multi-core design). The processor 114 can execute sequences of computer program instructions, stored in memory device 116, to perform various operations, processes, methods that will be explained in greater detail below.

[0063] The memory device 116 can store medical images 146. In some embodiments, the medical images 146 may include one or more MRI images (e.g., two-dimensional (2D) MRI, 3D MRI, 2D streaming MRI, four-dimensional (4D) MRI, 4D volumetric MRI, 4D cine MRI, projection images, graphical aperture images, and pairing information between projection images and graphical aperture images, etc.), functional MRI images (e.g., fMRI, DCE-MRI, diffusion MRI), CT images (e.g., 2D CT, cone beam CT, 3D CT, 4D CT), ultrasound images (e.g., 2D ultrasound, 3D ultrasound, 4D ultrasound), one or more projection images representing views of an anatomy depicted in the MRI, synthetic CT (pseudo-CT), and/or CT images at different angles of a gantry relative to a patient axis, PET images, X-ray images, fluoroscopic images, radiotherapy portal images, SPECT images, computer generated synthetic images (e.g., pseudo-CT images), training data for training one or more machine learning models, and the like. Further, the medical images 146 may also include medical image data and training images, contoured images, and dose images. In an embodiment, the medical images 146 may be received from the image acquisition device 132. Accordingly, image acquisition device 132 may include an MRI imaging device, a CT imaging device, a PET imaging device, an ultrasound imaging device, a fluoroscopic device, a SPECT imaging device, an integrated linac and MRI imaging device, or other medical imaging devices for obtaining the medical images of the patient. The medical images 146 may be received and stored in any type of data or any type of format that the image processing device 112 may use to perform operations consistent with the disclosed embodiments.

[0064] The memory device 116 may be a non-transitory computer-readable medium, such as a read-only memory (ROM), a phase-change random access memory (PRAM), a static random access memory (SRAM), a flash memory, a random access memory (RAM), a dynamic random access memory (DRAM) such as synchronous DRAM (SDRAM), an electrically erasable programmable read-only memory (EEPROM), a static memory (e.g., flash memory, flash disk, static random access memory) as well as other types of random access memories, a cache, a register, a CD-ROM, a DVD or other optical storage, a cassette tape, other magnetic storage device, or any other non-transitory medium that may be used to store information including image, data, or computer executable instructions (e.g., stored in any format) capable of being accessed by the processor 114, or any other type of computer device. The computer program instructions can be accessed by the processor 114, read from the ROM, or any other suitable memory location, and loaded into the RAM for execution by the processor 114. For example, the memory device 116 may store one or more software applications. Software applications stored in the memory device 116 may include, for example, an operating system 143 for common computer systems as well as for software-controlled devices. Further, the memory device 116 may store an entire software application, or only a part of a software application, that are executable by the processor 114. For example, the memory device 116 may store one or more radiation therapy treatment plans 142.

[0065] The image processing device 112 can communicate with the network 120 via the communication interface 118, which can be communicatively coupled to the processor 114 and the memory device 116. The communication interface 118 may provide communication connections between the image processing device 112 and radiotherapy system 100 components (e.g., permitting the exchange of data with external devices). For instance, the communication interface 118 may, in some embodiments, have appropriate interfacing circuitry to connect to the user interface 136, which may be a hardware keyboard, a keypad, or a touch screen through which a user may input information into radiotherapy system 100.

[0066] Communication interface 118 may include, for example, a network adaptor, a cable connector, a serial connector, a USB connector, a parallel connector, a high-speed data transmission adaptor (e.g., such as fiber, USB 3.0, thunderbolt, and the like), a wireless network adaptor (e.g., such as a WiFi adaptor), a telecommunication adaptor (e.g., 3G, 4G/LTE, 5G and the like), and the like. Communication interface 118 may include one or more digital and/or analog communication devices that permit image processing device 112 to communicate with other machines and devices, such as remotely located components, via the network 120.

[0067] The network 120 may provide the functionality of a local area network (LAN), a wireless network, a cloud computing environment (e.g., software as a service, platform as a service, infrastructure as a service, etc.), a client-server, a wide area network (WAN), and the like. For example, network 120 may be a LAN or a WAN that may include other systems S1 (138), S2 (140), and S3 (141). Systems S1, S2, and S3 may be identical to image processing device 112 or may be different systems. In some embodiments, one or more of the systems in network 120 may form a distributed computing/simulation environment that collaboratively performs the embodiments described herein. In some embodiments, one or more systems S1, S2, and S3 may include an image acquisition device 132. In addition, network 120 may be connected to Internet 122 to communicate with servers and clients that reside remotely on the Internet.

[0068] Therefore, network 120 can allow data transmission between the image processing device 112 and a number of various other systems and devices, such as the OIS 128, the radiation therapy device 130, and the image acquisition device 132. Further, data generated by the OIS 128 and/or the image acquisition device 132 may be stored in the memory device 116, the database 124, and/or the hospital database 126. The data may be transmitted/received via network 120, through communication interface 118, in order to be accessed by the processor 114, as required.

[0069] The image processing device 112 may communicate with database 124 through network 120 to send/receive a plurality of various types of data stored on database 124. For example, database 124 may include machine data (control points or radiotherapy treatment device parameters) that includes information associated with a radiation therapy device 130, image acquisition device 132, or other machines relevant to radiotherapy. Machine data information may include control points, such as radiation beam size, arc placement, beam on and off time duration, machine parameters, segments, MLC configuration, gantry speed, MRI pulse sequence, and the like. Database 124 may be a storage device and may be equipped with appropriate database administration software programs. One skilled in the art would appreciate that database 124 may include a plurality of devices located either in a central or a distributed manner.

[0070] In some embodiments, database 124 may include a processor-readable storage medium (not shown). While the processor-readable storage medium in an embodiment may be a single medium, the term “processor-readable storage medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of computer-executable instructions or data. The term “processor-readable storage medium” shall also be taken to include any medium that is capable of storing or encoding a set of instructions for execution by a processor and that cause the processor to perform any one or more of the methodologies of the present disclosure. The term “processor-readable storage medium” shall accordingly be taken to include, but not be limited to, solid-state memories, optical and magnetic media. For example, the processor-readable storage medium can be one or more volatile, non-transitory, or non-volatile tangible computer-readable media.

[0071] Processor 114 may communicate with database 124 to read images into memory device 116 or store images from memory device 116 to database 124. For example, the database 124 may be configured to store a plurality of images (e.g., 3D MRI, 4D MRI, 2D MRI slice images, CT images, 2D Fluoroscopy images, X-ray images, raw data from MR scans or CT scans, Digital Imaging and Communications in Medicine (DICOM) data, projection images, graphical aperture images, etc.) that the database 124 received from image acquisition device 132. Database 124 may store data to be used by the processor 114 when executing software programs 144 or when creating radiation therapy treatment plans 142. Database 124 may store the data produced by the trained machine learning model, such as a neural network, including the network parameters constituting the model learned by the network and the resulting predicted data. The image processing device 112 may receive the imaging data, such as a medical image 146 (e.g., 2D MRI slice images, CT images, 2D Fluoroscopy images, X-ray images, 3D MRI images, 4D MRI images, projection images, graphical aperture images, etc.) from the database 124, the radiation therapy device 130 (e.g., an MRI-linac), and/or the image acquisition device 132 to generate a treatment plan 142.

[0072] In an embodiment, the radiotherapy system 100 can include an image acquisition device 132 that can acquire medical images (e.g., MRI images, 3D MRI, 2D streaming MRI, 4D volumetric MRI, CT images, cone-beam CT, PET images, functional MRI images (e.g., fMRI, DCE-MRI, and diffusion MRI), X-ray images, fluoroscopic images, ultrasound images, radiotherapy portal images, SPECT images, and the like) of the patient. Image acquisition device 132 may, for example, be an MRI imaging device, a CT imaging device, a PET imaging device, an ultrasound device, a fluoroscopic device, a SPECT imaging device, or any other suitable medical imaging device for obtaining one or more medical images of the patient. Images acquired by the image acquisition device 132 can be stored within database 124 as imaging data and/or test data. By way of example, the images acquired by the image acquisition device 132 can also be stored by the image processing device 112 as medical images 146 in memory device 116.

[0073] In an embodiment, for example, the image acquisition device 132 may be integrated with the radiation therapy device 130 as a single apparatus (e.g., an MRI-linac). Such an MRI-linac can be used, for example, to determine a location of a target organ or a target tumor in the patient, so as to direct radiation therapy accurately according to the radiation therapy treatment plan 142 to a predetermined target.

[0074] The image acquisition device 132 can be configured to acquire one or more images of the patient’s anatomy for a region of interest (e.g., a target organ, a target tumor, or both). Each image, typically a 2D image or slice, can include one or more parameters (e.g., a 2D slice thickness, an orientation, and a location, etc.). In an embodiment, the image acquisition device 132 can acquire a 2D slice in any orientation. For example, an orientation of the 2D slice can include a sagittal orientation, a coronal orientation, or an axial orientation. The processor 114 can adjust one or more parameters, such as the thickness and/or orientation of the 2D slice, to include the target organ and/or target tumor. In an embodiment, 2D slices can be determined from information such as a 3D MRI volume. Such 2D slices can be acquired by the image acquisition device 132 in “real-time” while a patient is undergoing radiation therapy treatment, for example, when using the radiation therapy device 130, with “real-time” meaning acquiring the data in milliseconds or less.

[0075] The image processing device 112 may generate and store radiation therapy treatment plans 142 for one or more patients. The radiation therapy treatment plans 142 may provide information about a particular radiation dose to be applied to each patient. The radiation therapy treatment plans 142 may also include other radiotherapy information, such as control points including beam angles, gantry angles, beam intensity, dose-volume histogram (DVH) information, the number of radiation beams to be used during therapy, the dose per beam, and the like.

[0076] The processor 114 may generate the radiation therapy treatment plan 142 by using software programs 144 such as treatment planning software (such as Monaco®, manufactured by Elekta AB of Stockholm, Sweden). In order to generate the radiation therapy treatment plans 142, the processor 114 may communicate with the image acquisition device 132 (e.g., a CT device, an MRI device, a PET device, an X-ray device, an ultrasound device, etc.) to access images of the patient and to delineate a target, such as a tumor. In some embodiments, the delineation of one or more OARs, such as healthy tissue surrounding the tumor or in close proximity to the tumor, may be required. Therefore, segmentation of the OAR may be performed when the OAR is close to the target tumor. In addition, if the target tumor is close to the OAR (e.g., prostate in near proximity to the bladder and rectum), then by segmenting the OAR from the tumor, the radiotherapy system 100 may study the dose distribution not only in the target but also in the OAR.

[0077] In order to delineate a target organ or a target tumor from the OAR, medical images, such as MRI images, CT images, PET images, fMRI images, X-ray images, ultrasound images, radiotherapy portal images, SPECT images, and the like, of the patient undergoing radiotherapy may be obtained non-invasively by the image acquisition device 132 to reveal the internal structure of a body part. Based on the information from the medical images, a 3D structure of the relevant anatomical portion may be obtained. In addition, during a treatment planning process, many parameters may be taken into consideration to achieve a balance between efficient treatment of the target tumor (e.g., such that the target tumor receives enough radiation dose for an effective therapy) and low irradiation of the OAR(s) (e.g., the OAR(s) receives as low a radiation dose as possible). Other parameters that may be considered include the location of the target organ and the target tumor, the location of the OAR, and the movement of the target in relation to the OAR. For example, the 3D structure may be obtained by contouring the target or contouring the OAR within each 2D layer or slice of an MRI or CT image and combining the contour of each 2D layer or slice. The contour may be generated manually (e.g., by a physician, dosimetrist, or health care worker using a program such as MONACO™ manufactured by Elekta AB of Stockholm, Sweden) or automatically (e.g., using a program such as the Atlas-based auto-segmentation software, ABAS™, and a successor auto-segmentation software product ADMIRE™, manufactured by Elekta AB of Stockholm, Sweden). In certain embodiments, the 3D structure of a target tumor or an OAR may be generated automatically by the treatment planning software.

[0078] After the target tumor and the OAR(s) have been located and delineated, a dosimetrist, physician, or healthcare worker may determine a dose of radiation to be applied to the target tumor, as well as any maximum amounts of dose that may be received by the OAR proximate to the tumor (e.g., left and right parotid, optic nerves, eyes, lens, inner ears, spinal cord, brain stem, and the like). After the radiation dose is determined for each anatomical structure (e.g., target tumor, OAR), a process known as inverse planning may be performed to determine one or more treatment plan parameters that would achieve the desired radiation dose distribution. Examples of treatment plan parameters include volume delineation parameters (e.g., which define target volumes, contour sensitive structures, etc.), margins around the target tumor and OARs, beam angle selection, collimator settings, and beam-on times. During the inverse-planning process, the physician may define dose constraint parameters that set bounds on how much radiation an OAR may receive (e.g., defining full dose to the tumor target and zero dose to any OAR; defining 95% of dose to the target tumor; defining that the spinal cord, brain stem, and optic structures receive < 45Gy, < 55Gy, and < 54Gy, respectively). The result of inverse planning may constitute a radiation therapy treatment plan 142 that may be stored in memory device 116 or database 124. Some of these treatment parameters may be correlated. For example, tuning one preference or objective (e.g., weights for different objectives, such as increasing the dose to the target tumor) in an attempt to change the treatment plan may affect at least one other preference or objective, which in turn may result in the development of a different treatment plan. 
Thus, the image processing device 112 can generate a tailored radiation therapy treatment plan 142 by automating tuning of these preferences or objectives in order for the radiation therapy device 130 to provide radiotherapy treatment to the patient. The image processing device 112 performs the automated tuning of the preferences or objectives by processing a set of radiotherapy treatment plan data with a machine learning model to estimate one or more preferences in an MCO problem and generating a solution to the MCO problem based on the estimated preferences.
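
The coupling between preferences noted in paragraph [0078] can be made concrete with a small worked example. The sketch below is an assumption-laden toy, not the disclosed planning method: a single plan variable (the target dose `d`) is chosen by minimizing a weighted sum of a target objective and an OAR objective, the OAR is assumed to receive a fixed fraction of the target dose, and the 60 Gy prescription, the 0.8 coupling fraction, and the function `solve_weighted_sum` are hypothetical. Only the 45 Gy spinal cord bound is taken from the text.

```python
def solve_weighted_sum(w, prescription_gy=60.0, oar_fraction=0.8):
    """Minimize w*(d - prescription)**2 + (1 - w)*(oar_fraction*d)**2 over target dose d.

    Setting the derivative to zero gives
    d* = w*prescription / (w + (1 - w)*oar_fraction**2).
    Returns (target_dose, oar_dose).
    """
    d = w * prescription_gy / (w + (1.0 - w) * oar_fraction ** 2)
    return d, oar_fraction * d

SPINAL_CORD_LIMIT_GY = 45.0  # bound quoted in paragraph [0078]

results = {}
for w in (0.5, 0.9, 0.95):
    target, oar = solve_weighted_sum(w)
    # Record both doses and whether the OAR bound is still respected.
    results[w] = (target, oar, oar < SPINAL_CORD_LIMIT_GY)
```

Raising the weight on the target objective moves the target dose toward the prescription but raises the OAR dose with it; in this toy the OAR bound holds at a weight of 0.9 and is exceeded at 0.95, which is the kind of trade-off that the estimated preferences are meant to settle.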

[0079] In addition, the radiotherapy system 100 may include a display device 134 and a user interface 136. The display device 134 may include one or more display screens that display medical images, interface information, treatment planning parameters (e.g., projection images, graphical aperture images, contours, dosages, beam angles, etc.), treatment plans, a target, localizing a target and/or tracking a target, or any related information to the user. The user interface 136 may be a keyboard, a keypad, a touch screen, or any type of device with which a user may input information to radiotherapy system 100. Alternatively, the display device 134 and the user interface 136 may be integrated into a device such as a tablet computer (e.g., Apple iPad®, Lenovo Thinkpad®, Samsung Galaxy®, etc.).

[0080] Furthermore, any and all components of the radiotherapy system 100 may be implemented as a virtual machine (e.g., VMWare, Hyper-V, and the like). For instance, a virtual machine can be software that functions as hardware. Therefore, a virtual machine can include at least one or more virtual processors, one or more virtual memories, and one or more virtual communication interfaces that together function as hardware. For example, the image processing device 112, the OIS 128, and the image acquisition device 132 could be implemented as a virtual machine. Given the processing power, memory, and computational capability available, the entire radiotherapy system 100 could be implemented as a virtual machine.

[0081] FIG. 2A illustrates an example radiation therapy device 202 that may include a radiation source, such as an X-ray source or a linear accelerator, a couch 216, an imaging detector 214, and a radiation therapy output 204. The radiation therapy device 202 may be configured to emit a radiation beam 208 to provide therapy to a patient. The radiation therapy output 204 can include one or more attenuators or collimators, such as an MLC as described in the illustrative embodiment of FIG. 5, below.

[0082] Referring back to FIG. 2A, a patient can be positioned in a region 212 and supported by the treatment couch 216 to receive a radiation therapy dose, according to a radiation therapy treatment plan. The radiation therapy output 204 can be mounted or attached to a gantry 206 or other mechanical support. One or more chassis motors (not shown) may rotate the gantry 206 and the radiation therapy output 204 around couch 216 when the couch 216 is inserted into the treatment area. In an embodiment, gantry 206 may be continuously rotatable around couch 216 when the couch 216 is inserted into the treatment area. In another embodiment, gantry 206 may rotate to a predetermined position when the couch 216 is inserted into the treatment area. For example, the gantry 206 can be configured to rotate the therapy output 204 around an axis (“A”). Both the couch 216 and the radiation therapy output 204 can be independently moveable to other positions around the patient, such as moveable in transverse direction (“T”), moveable in a lateral direction (“L”), or as rotation about one or more other axes, such as rotation about a transverse axis (indicated as “R”). A controller communicatively connected to one or more actuators (not shown) may control the couch 216 movements or rotations in order to properly position the patient in or out of the radiation beam 208 according to a radiation therapy treatment plan. Both the couch 216 and the gantry 206 are independently moveable from one another in multiple degrees of freedom, which allows the patient to be positioned such that the radiation beam 208 can target the tumor precisely. The MLC may be integrated and included within gantry 206 to deliver the radiation beam 208 of a certain shape.

[0083] The coordinate system (including axes A, T, and L) shown in FIG. 2A can have an origin located at an isocenter 210. The isocenter can be defined as a location where the central axis of the radiation beam 208 intersects the origin of a coordinate axis, such as to deliver a prescribed radiation dose to a location on or within a patient. Alternatively, the isocenter 210 can be defined as a location where the central axis of the radiation beam 208 intersects the patient for various rotational positions of the radiation therapy output 204 as positioned by the gantry 206 around the axis A. As discussed herein, the gantry angle corresponds to the position of gantry 206 relative to axis A, although any other axis or combination of axes can be referenced and used to determine the gantry angle.

[0084] Gantry 206 may also have an attached imaging detector 214. The imaging detector 214 is preferably located opposite to the radiation source, and in an embodiment, the imaging detector 214 can be located within a field of the radiation beam 208.

[0085] The imaging detector 214 can be mounted on the gantry 206 (preferably opposite the radiation therapy output 204), such as to maintain alignment with the radiation beam 208. The imaging detector 214 rotates about the rotational axis as the gantry 206 rotates. In an embodiment, the imaging detector 214 can be a flat panel detector (e.g., a direct detector or a scintillator detector). In this manner, the imaging detector 214 can be used to monitor the radiation beam 208, or the imaging detector 214 can be used for imaging the patient’s anatomy, such as portal imaging. The control circuitry of the radiation therapy device 202 may be integrated within the system 100 or remote from it.

[0086] In an illustrative embodiment, one or more of the couch 216, the therapy output 204, or the gantry 206 can be automatically positioned, and the therapy output 204 can establish the radiation beam 208 according to a specified dose for a particular therapy delivery instance. A sequence of therapy deliveries can be specified according to a radiation therapy treatment plan, such as using one or more different orientations or locations of the gantry 206, couch 216, or therapy output 204. The therapy deliveries can occur sequentially but can intersect in a desired therapy locus on or within the patient, such as at the isocenter 210. A prescribed cumulative dose of radiation therapy can thereby be delivered to the therapy locus while damage to tissue near the therapy locus can be reduced or avoided.
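By way of a non-limiting illustration, the accumulation of dose at a common therapy locus from several beam orientations can be sketched with a purely geometric toy model. The grid size, the gantry angles, the unit dose per beam, and the function name `accumulate_beams` are all hypothetical assumptions introduced for this sketch; attenuation, scatter, and beam width are ignored.

```python
import math

def accumulate_beams(size, angles_deg, dose_per_beam=1.0):
    """Sum dose on a size x size grid from thin beams that all pass
    through the central cell (standing in for the isocenter)."""
    c = size // 2
    grid = [[0.0] * size for _ in range(size)]
    for angle in angles_deg:
        dx = math.cos(math.radians(angle))
        dy = math.sin(math.radians(angle))
        cells = set()  # each beam deposits dose at most once per cell
        for t in range(-c, c + 1):
            cells.add((c + round(t * dy), c + round(t * dx)))
        for row, col in cells:
            grid[row][col] += dose_per_beam
    return grid

# Hypothetical set of eight gantry angles, 22.5 degrees apart.
angles = [i * 22.5 for i in range(8)]
dose = accumulate_beams(size=21, angles_deg=angles)
center_dose = dose[10][10]
max_elsewhere = max(
    dose[r][c] for r in range(21) for c in range(21) if (r, c) != (10, 10)
)
```

In this sketch only the central cell lies on every beam path, so it receives the full cumulative dose while every other cell receives less.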

[0087] FIG. 2B illustrates an example radiation therapy device 202 that may include a combined linac and an imaging system, such as a CT imaging system. The radiation therapy device 202 can include an MLC (not shown). The CT imaging system can include an imaging X-ray source 218, such as providing X-ray energy in a kiloelectron-Volt (keV) energy range. The imaging X-ray source 218 can provide a fan-shaped and/or a conical radiation beam 208 directed to an imaging detector 222, such as a flat panel detector. The radiation therapy device 202 can be similar to the system described in relation to FIG. 2A, such as including a radiation therapy output 204, a gantry 206, a couch 216, and another imaging detector 214 (such as a flat panel detector). The X-ray source 218 can provide a comparatively-lower-energy X-ray diagnostic beam for imaging.

[0088] In the illustrative embodiment of FIG. 2B, the radiation therapy output 204 and the X-ray source 218 can be mounted on the same rotating gantry 206, rotationally separated from each other by 90 degrees. In another embodiment, two or more X-ray sources can be mounted along the circumference of the gantry 206, such as each having its own detector arrangement to provide multiple angles of diagnostic imaging concurrently. Similarly, multiple radiation therapy outputs 204 can be provided.

[0089] FIG. 3 depicts an example radiation therapy system 300 that can include combining a radiation therapy device 202 and an imaging system, such as an MR imaging system (e.g., known in the art as an MR-linac) consistent with the disclosed embodiments. As shown, system 300 may include a couch 216, an image acquisition device 320, and a radiation delivery device 330. System 300 delivers radiation therapy to a patient in accordance with a radiotherapy treatment plan. In some embodiments, image acquisition device 320 may correspond to image acquisition device 132 in FIG. 1 that may acquire origin images of a first modality (e.g., MRI image shown in FIG. 4A) or destination images of a second modality (e.g., CT image shown in FIG. 4B).

[0090] Couch 216 may support a patient (not shown) during a treatment session. In some implementations, couch 216 may move along a horizontal translation axis (labelled “I”), such that couch 216 can move the patient resting on couch 216 into and/or out of system 300. Couch 216 may also rotate around a central vertical axis of rotation, transverse to the translation axis. To allow such movement or rotation, couch 216 may have motors (not shown) enabling the couch 216 to move in various directions and to rotate along various axes. A controller (not shown) may control these movements or rotations in order to properly position the patient according to a treatment plan.

[0091] In some embodiments, image acquisition device 320 may include an MRI machine used to acquire 2D or 3D MRI images of the patient before, during, and/or after a treatment session. Image acquisition device 320 may include a magnet 321 for generating a primary magnetic field for magnetic resonance imaging. The magnetic field lines generated by operation of magnet 321 may run substantially parallel to the central translation axis I. Magnet 321 may include one or more coils with an axis that runs parallel to the translation axis I. In some embodiments, the one or more coils in magnet 321 may be spaced such that a central window 323 of magnet 321 is free of coils. In other embodiments, the coils in magnet 321 may be thin enough or of a reduced density such that they are substantially transparent to radiation of the wavelength generated by radiation delivery device 330. Image acquisition device 320 may also include one or more shielding coils, which may generate a magnetic field outside magnet 321 of approximately equal magnitude and opposite polarity in order to cancel or reduce any magnetic field outside of magnet 321. As described below, radiation source 331 of radiation delivery device 330 may be positioned in the region where the magnetic field is cancelled, at least to a first order, or reduced.

[0092] Image acquisition device 320 may also include two gradient coils 325 and 326, which may generate a gradient magnetic field that is superposed on the primary magnetic field. Coils 325 and 326 may generate a gradient in the resultant magnetic field that allows spatial encoding of the protons so that their position can be determined. Gradient coils 325 and 326 may be positioned around a common central axis with the magnet 321 and may be displaced along that central axis. The displacement may create a gap, or window, between coils 325 and 326. In embodiments where magnet 321 can also include a central window 323 between coils, the two windows may be aligned with each other.

[0093] In some embodiments, image acquisition device 320 may be an imaging device other than an MRI, such as an X-ray, a CT, a CBCT, a spiral CT, a PET, a SPECT, an optical tomography, a fluorescence imaging, ultrasound imaging, radiotherapy portal imaging device, or the like. As would be recognized by one of ordinary skill in the art, the above description of image acquisition device 320 concerns certain embodiments and is not intended to be limiting.

[0094] Radiation delivery device 330 may include the radiation source 331, such as an X-ray source or a linac, and an MLC 332 (shown below in more detail in FIG. 5). Radiation delivery device 330 may be mounted on a chassis 335. One or more chassis motors (not shown) may rotate the chassis 335 around the couch 216 when the couch 216 is inserted into the treatment area. In an embodiment, the chassis 335 may be continuously rotatable around the couch 216, when the couch 216 is inserted into the treatment area. Chassis 335 may also have an attached radiation detector (not shown), preferably located opposite to radiation source 331 and with the rotational axis of the chassis 335 positioned between the radiation source 331 and the detector. Further, the device 330 may include control circuitry (not shown) used to control, for example, one or more of the couch 216, image acquisition device 320, and radiation delivery device 330. The control circuitry of the radiation delivery device 330 may be integrated within the system 300 or remote from it.

[0095] During a radiotherapy treatment session, a patient may be positioned on couch 216. System 300 may then move couch 216 into the treatment area defined by the magnet 321, coils 325, 326, and chassis 335. Control circuitry may then control radiation source 331, MLC 332, and the chassis motor(s) to deliver radiation to the patient through the window between coils 325 and 326 according to a radiotherapy treatment plan.

[0096] FIG. 2A, FIG. 2B, and FIG. 3 illustrate generally embodiments of a radiation therapy device configured to provide radiotherapy treatment to a patient, including a configuration where a radiation therapy output can be rotated around a central axis (e.g., an axis “A”). Other radiation therapy output configurations can be used. For example, a radiation therapy output can be mounted to a robotic arm or manipulator having multiple degrees of freedom. In yet another embodiment, the therapy output can be fixed, such as located in a region laterally separated from the patient, and a platform supporting the patient can be used to align a radiation therapy isocenter with a specified target locus within the patient.

[0097] As discussed above, radiation therapy devices described by FIG. 2A, FIG. 2B, and FIG. 3 include an MLC for shaping, directing, or modulating an intensity of a radiation therapy beam to the specified target locus within the patient. FIG. 5 illustrates an example MLC 332 that includes leaves 532A through 532J that can be automatically positioned to define an aperture approximating a tumor 540 cross-section or projection. The leaves 532A through 532J permit modulation of the radiation therapy beam. The leaves 532A through 532J can be made of a material specified to attenuate or block the radiation beam in regions other than the aperture, in accordance with the radiation treatment plan. For example, the leaves 532A through 532J can include metallic plates, such as comprising tungsten, with a long axis of the plates oriented parallel to a beam direction and having ends oriented orthogonally to the beam direction (as shown in the plane of the illustration of FIG. 2A). A “state” of the MLC 332 can be adjusted adaptively during a course of radiation therapy treatment, such as to establish a therapy beam that better approximates a shape or location of the tumor 540 or another target locus. This is in comparison to using a static collimator configuration or as compared to using an MLC 332 configuration determined exclusively using an “offline” therapy planning technique. A radiation therapy technique using the MLC 332 to produce a specified radiation dose distribution to a tumor or to specific areas within a tumor can be referred to as IMRT. The resulting beam shape that is output using the MLC 332 is represented as a graphical aperture image. Namely, a given graphical aperture image is generated to represent how a beam looks (beam shape) and its intensity after being passed through and output by MLC 332.
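By way of a non-limiting illustration, the positioning of leaves to define an aperture approximating a target projection can be sketched as follows. The sketch assumes the projection is available as a binary mask with one row per opposing leaf pair; the mask layout, the example values, and the function name `leaf_positions` are hypothetical and are not taken from the disclosed embodiments.

```python
from typing import List, Optional, Tuple

def leaf_positions(mask: List[List[int]]) -> List[Optional[Tuple[int, int]]]:
    """For each leaf pair (one mask row), return the (left, right) column
    indices bounding the open gap, or None if the pair stays closed."""
    positions: List[Optional[Tuple[int, int]]] = []
    for row in mask:
        open_cols = [i for i, value in enumerate(row) if value]
        if open_cols:
            # A leaf pair can only open one contiguous gap, so the gap
            # spans from the first to the last target column in the row.
            positions.append((open_cols[0], open_cols[-1]))
        else:
            positions.append(None)
    return positions

# Hypothetical 5-row target projection (1 = target, 0 = outside target).
target = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
aperture = leaf_positions(target)
```

Because each row is reduced to a single interval, a row whose target cells are not contiguous would be over-covered in this sketch, which is one reason an aperture only approximates the target cross-section.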

[0098] FIG. 6 illustrates an embodiment of another type of radiotherapy device 630 (e.g., a Leksell Gamma Knife), according to some embodiments of the present disclosure. As shown in FIG. 6, in a radiotherapy treatment session, a patient 602 may wear a coordinate frame 620 to keep stable the patient’s body part (e.g., the head) undergoing surgery or radiotherapy. Coordinate frame 620 and a patient positioning system 622 may establish a spatial coordinate system, which may be used while imaging a patient or during radiation surgery. Radiotherapy device 630 may include a protective housing 614 to enclose a plurality of radiation sources 612. Radiation sources 612 may generate a plurality of radiation beams (e.g., beamlets) through beam channels 616. The plurality of radiation beams may be configured to focus on an isocenter 210 from different directions. While each individual radiation beam may have a relatively low intensity, isocenter 210 may receive a relatively high level of radiation when multiple doses from different radiation beams accumulate at isocenter 210. In certain embodiments, isocenter 210 may correspond to a target under surgery or treatment, such as a tumor.

[0099] FIG. 7A illustrates an example data flow for training and use of a GAN adapted for generating one or more preferences of an MCO problem to generate a radiotherapy treatment plan for a radiotherapy treatment device. For instance, the generator model 732 of FIG. 7A, which is trained to produce a trained generator model 760, may be trained to implement the processing functionality provided as part of the processor 114 in the radiotherapy system 100 of FIG. 1. Accordingly, a data flow of the GAN model usage 750 (prediction) is depicted in FIG. 7A as the provision of new data 770 (e.g., images or specified locations of a region of interest for a given patient or treatment device) to a trained generator model 760, and the use of the trained generator model 760 to produce a prediction or estimate of a generator output (one or more preferences of an MCO problem and, in some cases, a radiotherapy treatment plan based on a solution to the MCO problem with the one or more preferences) 734 and/or generated results 780 (e.g., one or more preferences of the MCO problem based on the new data 770).

[0100] GANs comprise two networks: a generative network (e.g., generator model 732) that is trained to produce an output (sample) 734 that tries to fool a discriminative network (e.g., discriminator model 740) that samples the generative network’s output distribution (e.g., generator output (samples) 734) and decides whether that sample is the same or different from the true test distribution (obtained from training data 720, such as the information of the radiotherapy treatment plan or distribution of preferences of known radiotherapy treatment plans). The goal for this system of networks is to drive the generator network to learn the ground truth model or distribution of preferences associated with the radiotherapy treatment plans in the training data 720 as accurately as possible such that the discriminator network can only determine the correct origin for generator samples with 50% chance, which reaches an equilibrium with the generator network. The discriminator can access the ground truth but the generator only accesses the training data through the response of the discriminator to the generator’s output.

[0101] The data flow of FIG. 7A illustrates the receipt of training input 710, including various values of model parameters 712 and training data 720 and conditions or constraints 726. The training input 710 is provided to a GAN model training 730 to produce a trained generator model 760 used in the GAN model usage 750.

[0102] As part of the GAN model training 730, the generator model 732 is trained on real training batches of radiotherapy treatment plan data (also depicted in FIG. 7B as 723), to produce preferences of an MCO problem used to generate a radiotherapy treatment plan for those batches of radiotherapy treatment plan data. In this fashion, the generator model 732 is trained to produce, as generator output 734, simulated or synthetic preferences and/or synthetic solutions to the MCO problem based on such preferences. The discriminator model 740 decides whether a simulated preference or solution is from the training data (e.g., the training or true information associated with the radiotherapy treatment plans) or from the generator (e.g., the synthetic preferences and/or corresponding synthetic radiotherapy treatment plans generated by solving an MCO problem with the synthetic preferences, as communicated between the generator model 732 and the discriminator model 740). The discriminator output 736 is a decision of the discriminator model 740 indicating whether the received preference or radiotherapy treatment plan information is simulated or real and is used to train the generator model 732. In some cases, the generator model 732 is trained utilizing the discriminator on the generated samples. This training process results in backpropagation of weight adjustments 738, 742 to improve the generator model 732 and the discriminator model 740.

[0103] FIG. 7B illustrates an example convolutional neural network (CNN) model adapted for generating one or more preferences of an MCO problem, according to the present disclosure. Specifically, the model shown in FIG. 7B depicts an arrangement of a “U-Net” deep CNN designed for generating an output data set (output synthetic preferences of the MCO problem) based on an input training set (e.g., training radiotherapy treatment plan data corresponding to a known set of radiotherapy treatment plans). The name derives from the “U” configuration, and as is well understood, this form of CNN model can produce pixel-wise classification or regression results. In some cases, a first path leading to the CNN model includes one or more deformable offset layers and one or more convolution layers including convolution, batch normalization, and an activation such as the rectified linear unit (ReLU) or one of its variants. The model generates as output data set synthetic preferences of an MCO problem. In an example the model includes a neural network that outputs a fixed size one-dimensional output and, in such cases, the network ends with a linear layer or multi-layer perceptron.

[0104] The left side of the model operations (the “encoding” operations 792) learns a set of features that the right side (the “decoding” operations 794) uses to reconstruct an output result. The U-Net has n levels consisting of conv/BN/ReLU (convolution/batch normalization/rectified linear units) blocks 790, and each block has a skip connection to implement residual learning. The block sizes are denoted in FIG. 7B by “S” and “F” numbers, input images are SxS in size, and the number of feature layers is equal to F. The output of each block is a pattern of feature responses in arrays the same size as the images.

[0105] Proceeding down the encoding path, the size of the blocks decreases by a factor of ½ (2⁻¹) at each level while the size of the features by convention increases by a factor of 2. The decoding side of the network goes back up in scale from S/2ⁿ while adding in feature content from the left side at the same level; this is the copy/concatenate data communication. The differences between the output image and the training version of that image drive the generator network weight adjustments by backpropagation. For inference, or testing, with use of the model, the input would be a set of patient information or radiotherapy treatment plan data and the output would be synthetic preferences of the MCO problem used to generate a radiotherapy treatment plan.
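By way of a non-limiting illustration, the halving of the block size and the doubling of the feature count along the encoding path can be tabulated as follows. The starting values S = 256 and F = 32, the depth n = 4, and the function name `unet_level_shapes` are hypothetical choices made only for this sketch.

```python
def unet_level_shapes(s: int, f: int, n: int):
    """Return (spatial_size, feature_count) for encoder levels 0..n.
    The spatial size halves and the feature count doubles at each level."""
    return [(s // 2**level, f * 2**level) for level in range(n + 1)]

# Hypothetical configuration: 256 x 256 inputs, 32 features at the top level.
encoder_shapes = unet_level_shapes(s=256, f=32, n=4)

# The decoding path revisits the same levels from the bottom (S / 2**n) up,
# concatenating the encoder features copied across at each matching level.
decoder_shapes = encoder_shapes[::-1]
```

With these assumed values the bottom level has a spatial size of 256/2⁴ = 16 and 512 features.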

[0106] In deep CNN training, the learned model is the values of layer node parameters θ (node weights and layer biases) determined during training. Training employs maximum likelihood or the cross entropy between the training data and the model distribution. A cost function expressing this relationship is

$$J(\theta) = -\mathbb{E}_{x,y \sim p_{\text{data}}} \log p_{\text{model}}(y \mid x)$$

[0107] The exact form of the cost function for a specific problem depends on the nature of the model used. A Gaussian model

$$p_{\text{model}}(y \mid x) = \mathcal{N}(y; f(x; \theta))$$

implies a cost function such as:

$$J(\theta) = \tfrac{1}{2}\,\mathbb{E}_{x,y \sim p_{\text{data}}} \lVert y - f(x; \theta) \rVert_2^2 + \text{const}$$

[0108] which includes a constant term that does not depend on θ. Thus, minimizing J(θ) generates the mapping f(x; θ) that approximates the training data distribution.

[0109] The representation of the model of FIG. 7B illustrates the training and prediction of a generative model, which is adapted to perform regression rather than classification.
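By way of a non-limiting illustration, the statement that a Gaussian model leads to a squared-error cost plus a constant that does not depend on θ can be checked numerically. The sketch below assumes a scalar output with unit variance; the sample pairs and the function names are hypothetical.

```python
import math

def gaussian_nll(y: float, mu: float, sigma: float = 1.0) -> float:
    """Negative log-likelihood of y under a normal density N(mu, sigma^2)."""
    return (
        0.5 * ((y - mu) / sigma) ** 2
        + math.log(sigma)
        + 0.5 * math.log(2.0 * math.pi)
    )

def half_squared_error(y: float, mu: float) -> float:
    """The part of the cost that depends on the model prediction mu."""
    return 0.5 * (y - mu) ** 2

# Hypothetical (target, prediction) pairs; the prediction plays the role of f(x; theta).
pairs = [(1.0, 0.5), (2.0, 2.5), (-1.0, 0.0)]

mean_nll = sum(gaussian_nll(y, mu) for y, mu in pairs) / len(pairs)
mean_half_sq = sum(half_squared_error(y, mu) for y, mu in pairs) / len(pairs)

# With unit variance the only remaining term is this constant.
const = 0.5 * math.log(2.0 * math.pi)
```

Because the constant is the same for every choice of prediction, minimizing the mean negative log-likelihood and minimizing the mean half squared error select the same parameters in this setting.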

[0110] Consistent with embodiments of the present disclosure, the treatment modeling methods, systems, devices, and/or processes based on such models include two stages: training of the generative model, with use of a discriminator/generator pair in a GAN; and prediction with the generative model, with use of a GAN-trained generator. It will be understood that other variations and combinations of the type of deep learning model and other neural-network processing approaches may also be implemented with the present techniques.

[0111] A useful extension of the GAN is the conditional GAN. Conditional adversarial networks learn a mapping from observed image x and random noise z as G: {x, z} → y. Both adversarial networks consist of two networks: a discriminator (D) and a generator (G). The generator G is trained to produce outputs that cannot be distinguished from “real” or actual training information associated with radiotherapy treatment plans by an adversarially trained discriminator D that is trained to be maximally accurate at detecting “fakes” or outputs of G . The conditional GAN differs from the unconditional GAN in that both discriminator and generator inferences are conditioned on an example image of the type X.

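[0112] The traditional GAN loss function is:

$$\mathcal{L}_{GAN}(G, D) = \mathbb{E}_{y}[\log D(y)] + \mathbb{E}_{x,z}[\log(1 - D(G(x, z)))]$$

[0113] The conditional GAN loss function is:

$$\mathcal{L}_{cGAN}(G, D) = \mathbb{E}_{x,y}[\log D(x, y)] + \mathbb{E}_{x,z}[\log(1 - D(x, G(x, z)))]$$

[0114] where G tries to minimize this loss against an adversarial D that tries to maximize it, or,

$$G^* = \arg\min_G \max_D \mathcal{L}_{cGAN}(G, D)$$

In addition, one wants the generator G to minimize the difference between the predicted radiotherapy treatment plan information generated using estimated preferences of the MCO problem (or distribution preferences) and the actual training ground truth samples,

$$\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}[\lVert y - G(x, z) \rVert_1]$$

so the complete loss is the λ-weighted sum of two losses

$$G^* = \arg\min_G \max_D \mathcal{L}_{cGAN}(G, D) + \lambda \mathcal{L}_{L1}(G)$$

The generator in the conditional GAN may be a U-Net.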
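By way of a non-limiting illustration, the adversarial loss and its λ-weighted combination with a difference term can be evaluated on a small batch as follows. The sketch assumes the discriminator outputs are already available as probabilities in (0, 1) and assumes an L1 (absolute) difference; the function names, the sample values, and the choice of λ are hypothetical.

```python
import math
from typing import Sequence

def adversarial_loss(d_real: Sequence[float], d_fake: Sequence[float]) -> float:
    """Mean log D on real samples plus mean log(1 - D) on generated samples.
    The discriminator tries to increase this value; the generator tries to
    decrease it."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def l1_difference(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute difference between known and generated values."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def generator_objective(d_fake, y_true, y_pred, lam: float) -> float:
    """Generator-side objective: the adversarial term that depends on the
    generator plus the lambda-weighted difference term."""
    adv = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return adv + lam * l1_difference(y_true, y_pred)

# A discriminator that cannot tell real from generated outputs 0.5 everywhere.
equilibrium = adversarial_loss([0.5, 0.5], [0.5, 0.5])

# Hypothetical generated values versus known values, with lambda = 10.
objective = generator_objective([0.4, 0.6], [1.0, 2.0], [1.5, 2.0], lam=10.0)
```

When the discriminator outputs 0.5 for every sample, the adversarial loss in this sketch equals −log 4, which corresponds to the situation in which the discriminator identifies the origin of a sample with only 50% chance.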

[0115] According to some embodiments, the generator of the conditional GAN is trained to receive a batch of training radiotherapy treatment plan data corresponding to a known set of radiotherapy treatment plans and generate a synthetic set of preferences of the MCO problem for that batch. A solution to the MCO problem can then be generated to provide a synthetic set of information associated with the radiotherapy treatment plans. The discriminator receives the synthetic set of information associated with the radiotherapy treatment plans and/or the synthetic set of preferences from the generator and is trained to distinguish the received synthetic information and/or preferences from being a real set of known information associated with the radiotherapy treatment plans (provided by the training data 720) or a fake or synthetic set of information associated with the radiotherapy treatment plans. The generator is trained to minimize a difference between the synthetic set of information associated with the radiotherapy treatment plans and the corresponding known information associated with the radiotherapy treatment plans. To this end, after the generator generates the synthetic set of information associated with the radiotherapy treatment plans, the training known information associated with the radiotherapy treatment plans is retrieved. A comparison is made between the generated synthetic information and this retrieved training known information. Parameters of the generator are then updated based on the difference in an attempt to minimize the difference. The discriminator is similarly trained based on whether the discriminator correctly classifies the generated information and/or preferences as real or fake. The parameters of the generator are updated based on a total loss function that takes into account the discriminator errors and the generator differences.

[0116] Subsequently, a second batch of training data is retrieved. The generator receives this batch of training data and generates a second synthetic set of preferences of the MCO problem, which are used to solve the MCO problem to generate a second synthetic set of information associated with the radiotherapy treatment plans. A difference between the known training set of information associated with the radiotherapy treatment plans and the synthetically generated second set of information associated with the radiotherapy treatment plans is computed, and parameters of the generator are again updated based on this difference. Once all or a substantial portion of the training data has been processed and used to update parameters of the generator, and/or once a specified number of epochs has been completed, or when the error (between synthetic information and/or preferences and training information) is within a threshold, the training ends and the generator’s parameters are output.

[0117] In an embodiment, to begin network training, an iteration index can be set to an initial value of zero. A batch of training data can be formed from a subset of the received sets of training radiotherapy treatment plan data. The batch of training data can be provided to the GAN and the GAN parameters can be updated based thereon. In an embodiment, parameters of the GAN, Θ, can be updated, such as to minimize or reduce a cost function, such as the cost function

$$J(\Theta^*) = \arg\min_{\Theta} \lVert Y - Y^* \rVert^2$$

where Y can represent the synthetic information associated with the radiotherapy treatment plans determined by the GAN based on preferences of the MCO problem generated by the GAN, where Y* can represent the known radiotherapy treatment plan information, and where Θ* can represent parameters of the GAN (e.g., layer node weights and biases as described above) corresponding to a minimized square error between Y and Y*.

[0118] In an embodiment, the cost function can include a probabilistic function where parameters of the GAN can be determined according to the expression

$$\Theta_{\text{train}} = \arg\max_{\Theta} P(Y^* \mid X; \Theta)$$

where Θ_train can represent the parameters of the fully trained GAN, and X can represent a collection of batches of training radiotherapy treatment plan data.

[0119] After updating the parameters of the GAN, the iteration index can be incremented by a value of one. The iteration index can correspond to a number of times that the parameters of the GAN have been updated. Stopping criteria can be computed, and if the stopping criteria are satisfied, then the GAN model can be saved in a memory, such as the memory device 116 of image processing device 112, and the training can be halted. If the stopping criteria are not satisfied, then the training can continue by obtaining another batch of training radiotherapy treatment plan data from the same set (or associated with the same clinical facility or clinician) or another set. In an embodiment, the stopping criteria can include a value of the iteration index (e.g., the stopping criteria can include whether the iteration index is greater than or equal to a determined maximum number of iterations). In an embodiment, the stopping criteria can include an accuracy of the output distribution of the preferences of the MCO problem provided by the generative model.
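By way of a non-limiting illustration, the iteration index and the stopping criteria described above can be sketched with a deliberately simple stand-in for the network. The sketch assumes a single scalar parameter updated by gradient descent on a squared error between Y and Y*; it does not train a GAN, and the learning rate, the tolerance, the batches, and the function name `train` are hypothetical.

```python
def train(batches, lr=0.1, max_iters=1000, tol=1e-6):
    """Update a one-parameter stand-in model until a stopping criterion holds."""
    theta = 0.0     # stand-in for all network parameters
    iteration = 0   # iteration index starts at zero
    while True:
        x, y_star = batches[iteration % len(batches)]  # obtain the next batch
        y = theta * x                                  # stand-in model output Y
        grad = 2.0 * (y - y_star) * x                  # gradient of (Y - Y*)^2
        theta -= lr * grad
        iteration += 1                                 # count one parameter update
        # Stopping criteria: iteration budget reached or error small enough.
        error = sum((theta * xb - yb) ** 2 for xb, yb in batches) / len(batches)
        if iteration >= max_iters or error < tol:
            return theta, iteration, error

# Hypothetical batches generated by y* = 2x.
batches = [(1.0, 2.0), (2.0, 4.0), (0.5, 1.0)]
theta, iterations, final_error = train(batches)
```

In this sketch the accuracy criterion is reached well before the iteration budget is exhausted, so training halts on the error threshold rather than on the maximum number of iterations.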

[0120] A motivation for predicting preferences of the MCO problem, using a generative machine learning model (or neural network), is to accelerate treatment planning computations. Current, conventional treatment planning requires a clinician to input weights for various objectives and preferences of the MCO problem in a trial and error fashion to develop a treatment plan. The development of the treatment plan in this manner consumes a great deal of resources and takes a very long time. The disclosed embodiments improve the quality and speed at which the treatment plans are created using a generative machine learning model that predicts the weights of the objectives and preferences of the MCO problem. Such weights of the objectives and preferences can provide the optimal or close to optimal solution to the MCO problem, which can reduce the amount of time and iterations it takes to develop a radiotherapy treatment plan. In this way, the overall processing and efficiency of the electronic device and processor is improved.

[0121] FIG. 8 illustrates a flowchart of a process 800 of example operations for training a generative model adapted for outputting synthetic or simulated clinical preference information (e.g., weights of preferences or objectives in an MCO problem). The process 800 is illustrated from the perspective of a radiotherapy system 100, which trains and utilizes a generative model using a GAN, as discussed in the preceding examples. However, corresponding operations may be performed by other devices or systems (including in offline training or verification settings separate from a particular image processing workflow or medical treatment).

[0122] As shown, a first phase of the flowchart workflow begins with operations (810, 820) to establish the parameters of training and model operations. The process 800 begins with operations to receive (e.g., obtain, extract, identify) training sample data (operation 810) and constraints or conditions for training (operation 820). In an example, this training sample data may comprise training radiotherapy treatment plan data corresponding to known radiotherapy treatment plans. Also in an example, the constraints may relate to an imaging device, a treatment device, a patient, or medical treatment considerations. In an example, these constraints may include adversarial losses.

[0123] The second phase of the process 800 continues with training operations, including adversarial training of generative and discriminative models in a generative adversarial network (operation 830). In an example, the adversarial training includes training the generative model to generate simulated preferences of an MCO problem and corresponding simulated radiotherapy treatment plans by processing an input training radiotherapy treatment plan data (operation 842). The collection of simulated preferences and corresponding simulated radiotherapy treatment plans is provided to a discriminative model to train the discriminative model to classify the generated simulated preferences and corresponding simulated radiotherapy treatment plans as simulated or real training data (operation 844). Also in this adversarial training, the output of the generative model is used for training the discriminative model, and the output of the discriminative model is used for training the generative model.

[0124] In various examples, the generative model and the discriminative model comprise respective convolutional neural networks.

[0125] The process 800 continues with the output of the generative model for use in generating a synthetic set of preferences of the MCO problem (operation 850).

[0126] The process 800 continues with the utilization of the trained generative model to generate a simulated set of preferences of the MCO problem. The simulated set of preferences of the MCO problem is used to solve the MCO problem and ultimately to generate a radiotherapy treatment plan or radiotherapy device parameter.

[0127] FIG. 9 is a flowchart illustrating example operations of the image processing device 112 in performing process 900, according to example embodiments. The process 900 may be embodied in computer-readable instructions for execution by one or more processors such that the operations of the process 900 may be performed in part or in whole by the functional components of the image processing device 112; accordingly, the process 900 is described below by way of example with reference thereto. However, in other embodiments, at least some of the operations of the process 900 may be deployed on various other hardware configurations. The process 900 is therefore not intended to be limited to the image processing device 112 and can be implemented in whole, or in part, by any other component. Some or all of the operations of process 900 can be performed in parallel, out of order, or entirely omitted.

[0128] At operation 910, image processing device 112 receives a first set of radiotherapy treatment plan data, as discussed above.

[0129] At operation 920, image processing device 112 processes the first set of radiotherapy treatment plan data with a machine learning model to estimate one or more preferences in an MCO problem, wherein the machine learning model is trained to establish a relationship between the one or more preferences and information associated with a plurality of training radiotherapy treatment plan data, wherein the machine learning model includes a neural network, as discussed above.

[0130] At operation 930, image processing device 112 generates a solution to the MCO problem based on the estimated one or more preferences, as discussed above.

[0131] At operation 940, image processing device 112 generates a radiotherapy treatment device parameter based on the solution to the MCO problem, as discussed above.
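As a non-limiting illustration, operations 910 through 940 can be sketched as a short pipeline. The sketch rests on several assumptions that are not part of the disclosure: the plan data is reduced to two hypothetical fields (`target_dose` and `oar_overlap`), the trained neural network is replaced by a hand-written stand-in function, the MCO problem is a toy weighted sum of two quadratic objectives with a closed-form solution, and the device parameter is a beam-on time computed from an assumed dose rate.

```python
from dataclasses import dataclass


@dataclass
class PlanData:
    # Hypothetical, heavily simplified stand-in for radiotherapy plan data.
    target_dose: float  # prescribed dose in Gy
    oar_overlap: float  # fraction of target overlapping an organ at risk, in [0, 1]


def estimate_preference(data: PlanData) -> float:
    # Operation 920 stand-in: a trained model would be used here. This
    # placeholder returns a weight w on target coverage, clipped to (0, 1).
    return max(0.05, min(0.95, 0.9 - 0.5 * data.oar_overlap))


def solve_mco(w: float, data: PlanData) -> float:
    # Operation 930 stand-in: minimize w*(d - target)^2 + (1 - w)*d^2 over
    # the dose d. Setting the derivative to zero gives d = w * target.
    return w * data.target_dose


def device_parameter(dose: float, dose_rate_gy_per_min: float = 2.0) -> float:
    # Operation 940 stand-in: beam-on time in minutes for an assumed dose rate.
    return dose / dose_rate_gy_per_min


def process_900(data: PlanData) -> dict:
    w = estimate_preference(data)  # operation 920
    dose = solve_mco(w, data)      # operation 930
    return {
        "preference": w,
        "dose": dose,
        "beam_on_time_min": device_parameter(dose),  # operation 940
    }
```

For example, calling `process_900(PlanData(target_dose=20.0, oar_overlap=0.2))` runs the three stages in order on data received at operation 910.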

[0132] In an example, the machine learning model includes a neural network trained to generate the one or more preferences based on a plurality of training radiotherapy treatment plan data corresponding to a known set of radiotherapy treatment plans. In such cases, the training process includes selecting a first batch of the plurality of training radiotherapy treatment plan data and processing the first batch of the plurality of training radiotherapy treatment plan data with the machine learning model to estimate a set of preferences in the MCO problem. The MCO problem is solved based on the estimated set of preferences to generate a batch of training radiotherapy treatment plans, and a gradient of a loss function is computed with respect to the estimated set of preferences, wherein the loss function is based on the batch of training radiotherapy treatment plans and the known set of radiotherapy treatment plans. Parameters of the machine learning model are then updated based on the gradient. In this way, given a set of radiotherapy treatment plan data for which preferences in the MCO problem are unknown and not provided in the training data, the neural network can estimate such preferences. For example, the neural network can operate on training data that includes treatment plan information and the corresponding treatment plan data used to generate the treatment plan information but does not include the underlying preferences of the MCO problem used to generate such treatment plan information. In these cases, the neural network is trained to generate the preferences, which are then used to solve the MCO problem to generate an estimated or synthetic treatment plan information. The estimated or synthetic treatment plan information is then compared with the actual or known treatment plan information to compute a loss function and update parameters of the neural network.
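By way of a non-limiting illustration, the training operations described above can be sketched on a toy problem in which the MCO solution is available in closed form, so that the gradient of the loss with respect to the estimated preference can be written by hand. The following is a minimal sketch under stated assumptions and not the disclosed implementation: the "neural network" is a single logistic unit with hypothetical parameters `a` and `b`, each training case is a hypothetical `(feature, target)` pair, the MCO problem has two quadratic objectives, and the "known plans" are synthetic doses generated from hidden preferences that are never shown to the model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def solve_mco(w, target):
    # Toy MCO: argmin over d of w*(d - target)^2 + (1 - w)*d^2, i.e. d = w*target.
    return w * target


def train(batch, known_doses, lr=1.0, epochs=3000):
    """batch: list of (feature, target) pairs; known_doses: doses of known plans."""
    a, b = 0.0, 0.0  # model parameters: w = sigmoid(a * feature + b)
    n = len(batch)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for (feature, target), d_known in zip(batch, known_doses):
            w = sigmoid(a * feature + b)  # estimate the preference
            d = solve_mco(w, target)      # solve the MCO problem
            # Gradient of the loss (d - d_known)^2 with respect to the
            # estimated preference w, using d = w * target.
            dloss_dw = 2.0 * (d - d_known) * target
            # Chain rule back to the model parameters.
            dw_dz = w * (1.0 - w)
            grad_a += dloss_dw * dw_dz * feature
            grad_b += dloss_dw * dw_dz
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


# Synthetic "known plans": produced with hidden preferences that are not
# part of the training data (an assumption made for this illustration).
features = [-1.0, -0.5, 0.0, 0.5, 1.0]
batch = [(f, 1.0) for f in features]
known = [solve_mco(sigmoid(1.5 * f - 0.5), t) for f, t in batch]
```

The design point illustrated is that the loss compares plans, not preferences: the model only ever sees the known doses, yet the gradient flows through the solved MCO problem back to the preference and then to the model parameters. In a realistic setting the closed-form solve would be replaced by a differentiable solver.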

[0133] In some examples, after the parameters of the neural network are updated, the same first batch of the plurality of training radiotherapy treatment plan data is processed again with the updated neural network to estimate a new set of preferences of the MCO problem. The MCO problem is solved again based on the estimated new set of preferences to generate a new batch of training radiotherapy treatment plans, which are used to compute a new gradient of the loss function with respect to the new batch of training radiotherapy treatment plans and the first batch of training radiotherapy treatment plans. The new gradient is then used to further update parameters of the neural network.

[0134] In some examples, different neural networks can be trained to generate preferences for different clinicians or medical facilities. For example, a first neural network can receive a first set of known radiotherapy treatment plans generated by a first clinician or medical facility. The first neural network can operate on a batch of training radiotherapy treatment data to generate the set of preferences of the MCO problem. These preferences can be used to generate a solution to the MCO problem that provides a synthetic radiotherapy treatment plan. This synthetic radiotherapy treatment plan can be compared with the first set of known radiotherapy treatment plans generated by a first clinician or medical facility to compute a loss function and update parameters of the first neural network based on the computed loss function.

[0135] Similarly, a second neural network can receive a second set of known radiotherapy treatment plans generated by a second clinician or medical facility. The second neural network can operate on the same or different batch of training radiotherapy treatment data (as the first neural network) to generate the set of preferences of the MCO problem. These preferences can be used to generate a solution to the MCO problem that provides a second synthetic radiotherapy treatment plan. This second synthetic radiotherapy treatment plan can be compared with the second set of known radiotherapy treatment plans generated by a second clinician or medical facility to compute a loss function and update parameters of the second neural network based on the computed loss function.

[0136] A clinician can then tailor their preferences to those of another clinician or medical facility by selecting the neural network that was trained based on the desired clinician or medical facility treatment plans. For example, the clinician can select the first neural network to generate the preferences of the MCO problem if the clinician desires to create a radiotherapy treatment plan using preferences generated by the first clinician or medical facility. As another example, the clinician can select the second neural network to generate the preferences of the MCO problem if the clinician desires to create a radiotherapy treatment plan using preferences generated by the second clinician or medical facility.
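As a non-limiting illustration, selecting among models trained on different clinicians' or facilities' plans can be sketched as a lookup in a registry keyed by the source of the training plans. Everything below is hypothetical: the keys `facility_a` and `facility_b`, the `make_model` stand-in (which takes the place of a separately trained neural network), and the single `oar_overlap` input feature.

```python
from typing import Callable, Dict

PreferenceModel = Callable[[float], Dict[str, float]]


def make_model(target_weight: float) -> PreferenceModel:
    # Stand-in for a neural network trained on one facility's known plans;
    # target_weight mimics that facility's learned emphasis on target dose.
    def model(oar_overlap: float) -> Dict[str, float]:
        w = max(0.0, min(1.0, target_weight - 0.3 * oar_overlap))
        return {"target_dose": w, "normal_tissue_sparing": 1.0 - w}

    return model


# Hypothetical registry: one model per clinician or medical facility.
MODELS: Dict[str, PreferenceModel] = {
    "facility_a": make_model(0.9),
    "facility_b": make_model(0.6),
}


def estimate_preferences(source: str, oar_overlap: float) -> Dict[str, float]:
    # Select the model trained on the desired source, then run it.
    if source not in MODELS:
        raise KeyError(f"no model trained for source {source!r}")
    return MODELS[source](oar_overlap)
```

With this arrangement, the same input data yields different estimated preferences depending on which facility's model is selected.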

[0137] FIG. 10 illustrates an example user interface 1000 for solving an MCO problem using estimated preferences, according to some embodiments of the present disclosure. For example, a clinician can access the user interface 1000 through a computing device, such as image processing device 112 via a user interface 136. The image processing device 112 can receive input from the clinician, such as a file, that includes various radiotherapy treatment data (e.g., medical images and/or patient information). The image processing device 112 can process the received data with a trained machine learning model to generate or estimate a set of preferences of an MCO problem. This set of preferences can be displayed in the user interface 1000 as a set of default preference values 1021 of each of a plurality of MCO preferences 1020.

[0138] As an example, the user interface 1000 can include a region that allows a clinician to define preferences of an MCO problem 1010. The region can include one or more preferences 1020 with corresponding preference values 1021 (e.g., weights associated with the objectives or preferences) used to solve the MCO problem. The image processing device 112 can receive the output of the machine learning model and can display that output as the different preference values 1021 for each or a portion of the set of preferences 1020. For example, the image processing device 112 can include an amount of dose to a target region preference 1020 and corresponding default preference value 1021 (e.g., generated by the machine learning model). As another example, the image processing device 112 can include a level of normal tissue sparing preference 1020 and corresponding default preference value 1021 (e.g., generated by the machine learning model). As another example, the image processing device 112 can include a treatment complexity parameter preference 1020 and corresponding default preference value 1021 (e.g., generated by the machine learning model).

[0139] The user interface 1000 can receive input from the clinician modifying any one of the preference values 1021. For example, the clinician can increase or decrease any one of the weights of the preference values 1021 provided in the user interface 1000. After the clinician is satisfied with the preference values 1021, the user interface 1000 can receive input that selects the solve MCO problem option 1030. In response, the user interface 1000 generates a solution to the MCO problem which provides or can be used to generate a radiotherapy treatment plan that includes or can be used to generate one or more radiotherapy treatment device parameters.
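By way of a non-limiting illustration, combining model-generated default preference values with clinician modifications before the MCO problem is solved can be sketched as follows. The preference names, the example values, and in particular the renormalization of the weights to sum to one are assumptions made for this sketch; the disclosure does not specify how modified values are validated or scaled.

```python
from typing import Dict


def apply_overrides(defaults: Dict[str, float],
                    overrides: Dict[str, float]) -> Dict[str, float]:
    """Merge clinician edits into the default preference values."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"unknown preferences: {sorted(unknown)}")
    merged = {**defaults, **overrides}
    if any(v < 0 for v in merged.values()):
        raise ValueError("preference weights must be non-negative")
    total = sum(merged.values())
    if total <= 0:
        raise ValueError("at least one preference weight must be positive")
    # Assumption: weights are rescaled so that they sum to one.
    return {name: value / total for name, value in merged.items()}


# Hypothetical default values, as if produced by the machine learning model.
DEFAULTS = {
    "target_dose": 0.6,
    "normal_tissue_sparing": 0.3,
    "treatment_complexity": 0.1,
}
```

For example, `apply_overrides(DEFAULTS, {"normal_tissue_sparing": 0.6})` doubles the weight on normal tissue sparing relative to its default and rescales the remaining weights accordingly; the result would then be passed to whatever routine solves the MCO problem.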

[0140] FIG. 11 illustrates a block diagram of an embodiment of a machine 1100 on which one or more of the methods as discussed herein can be implemented. In one or more embodiments, one or more items of the image processing device 112 can be implemented by the machine 1100. In alternative embodiments, the machine 1100 operates as a standalone device or may be connected (e.g., networked) to other machines. In one or more embodiments, the image processing device 112 can include one or more of the items of the machine 1100. In a networked deployment, the machine 1100 may operate in the capacity of a server or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. The machine may be a personal computer (PC), a tablet PC, a set-top box (STB), a Personal Digital Assistant (PDA), a cellular telephone, a web appliance, a network router, switch or bridge, or any machine capable of executing instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while only a single machine is illustrated, the term “machine” shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.

[0141] The example machine 1100 includes processing circuitry or processor 1102 (e.g., a CPU, a GPU, an ASIC, circuitry, such as one or more transistors, resistors, capacitors, inductors, diodes, logic gates, multiplexers, buffers, modulators, demodulators, radios (e.g., transmit or receive radios or transceivers), sensors 1121 (e.g., a transducer that converts one form of energy (e.g., light, heat, electrical, mechanical, or other energy) to another form of energy), or the like, or a combination thereof), a main memory 1104, and a static memory 1106, which communicate with each other via a bus 1108. The machine 1100 (e.g., computer system) may further include a video display device 1110 (e.g., a liquid crystal display (LCD) or a cathode ray tube (CRT)). The machine 1100 also includes an alphanumeric input device 1112 (e.g., a keyboard), a user interface navigation device 1114 (e.g., a mouse), a disk drive or mass storage unit 1116, a signal generation device 1118 (e.g., a speaker), and a network interface device 1120.

[0142] The disk drive unit 1116 includes a machine-readable medium 1122 on which is stored one or more sets of instructions and data structures (e.g., software) 1124 embodying or utilized by any one or more of the methodologies or functions described herein. The instructions 1124 may also reside, completely or at least partially, within the main memory 1104 and/or within the processor 1102 during execution thereof by the machine 1100, the main memory 1104 and the processor 1102 also constituting machine-readable media.

[0143] The machine 1100 as illustrated includes an output controller 1128. The output controller 1128 manages data flow to/from the machine 1100. The output controller 1128 is sometimes called a device controller, with software that directly interacts with the output controller 1128 being called a device driver.

[0144] While the machine-readable medium 1122 is shown in an embodiment to be a single medium, the term "machine-readable medium" may include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more instructions or data structures. The term "machine-readable medium" shall also be taken to include any tangible medium that is capable of storing, encoding or carrying instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the present disclosure, or that is capable of storing, encoding or carrying data structures utilized by or associated with such instructions. The term "machine-readable medium" shall accordingly be taken to include, but not be limited to, solid-state memories, and optical and magnetic media. Specific examples of machine-readable media include non-volatile memory, including by way of example semiconductor memory devices, e.g., Erasable Programmable Read-Only Memory (EPROM), EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.

[0145] The instructions 1124 may further be transmitted or received over a communications network 1126 using a transmission medium. The instructions 1124 may be transmitted using the network interface device 1120 and any one of a number of well-known transfer protocols (e.g., HTTP). Examples of communication networks include a LAN, a WAN, the Internet, mobile telephone networks, Plain Old Telephone Service (POTS) networks, and wireless data networks (e.g., WiFi and WiMax networks). The term "transmission medium" shall be taken to include any intangible medium that is capable of storing, encoding, or carrying instructions for execution by the machine, and includes digital or analog communications signals or other intangible media to facilitate communication of such software.

[0146] As used herein, “communicatively coupled between” means that the entities on either end of the coupling must communicate through an item therebetween and that those entities cannot communicate with each other without communicating through the item.

[0147] The above detailed description includes references to the accompanying drawings, which form a part of the detailed description. The drawings show, by way of illustration but not by way of limitation, specific embodiments in which the disclosure can be practiced. These embodiments are also referred to herein as “examples.” Such examples can include elements in addition to those shown or described. However, the present inventors also contemplate examples in which only those elements shown or described are provided. Moreover, the present inventors also contemplate examples using any combination or permutation of those elements shown or described (or one or more aspects thereof), either with respect to a particular example (or one or more aspects thereof), or with respect to other examples (or one or more aspects thereof) shown or described herein.

[0148] All publications, patents, and patent documents referred to in this document are incorporated by reference herein in their entirety, as though individually incorporated by reference. In the event of inconsistent usages between this document and those documents so incorporated by reference, the usage in the incorporated reference(s) should be considered supplementary to that of this document; for irreconcilable inconsistencies, the usage in this document controls.

[0149] In this document, the terms “a,” “an,” “the,” and “said” are used when introducing elements of aspects of the disclosure or in the embodiments thereof, as is common in patent documents, to include one or more than one of the elements, independent of any other instances or usages of “at least one” or “one or more.” In this document, the term “or” is used to refer to a nonexclusive or, such that “A or B” includes “A but not B,” “B but not A,” and “A and B,” unless otherwise indicated.

[0150] In the appended claims, the terms “including” and “in which” are used as the plain-English equivalents of the respective terms “comprising” and “wherein.” Also, in the following claims, the terms “comprising,” “including,” and “having” are intended to be open-ended to mean that there may be additional elements other than the listed elements, such that subject matter that includes elements in addition to those listed after such a term (e.g., comprising, including, having) in a claim is still deemed to fall within the scope of that claim. Moreover, in the following claims, the terms “first,” “second,” “third,” and so forth, are used merely as labels and are not intended to impose numerical requirements on their objects.

[0151] Embodiments of the disclosure may be implemented with computer-executable instructions. The computer-executable instructions (e.g., software code) may be organized into one or more computer-executable components or modules. Aspects of the disclosure may be implemented with any number and organization of such components or modules. For example, aspects of the disclosure are not limited to the specific computer-executable instructions or the specific components or modules illustrated in the figures and described herein. Other embodiments of the disclosure may include different computer-executable instructions or components having more or less functionality than illustrated and described herein.

[0152] Method examples (e.g., operations and functions) described herein can be machine or computer-implemented at least in part (e.g., implemented as software code or instructions). Some examples can include a computer-readable medium or machine-readable medium encoded with instructions operable to configure an electronic device to perform methods as described in the above examples. An implementation of such methods can include software code, such as microcode, assembly language code, a higher-level language code, or the like (e.g., “source code”). Such software code can include computer-readable instructions for performing various methods (e.g., “object” or “executable code”). The software code may form portions of computer program products. Software implementations of the embodiments described herein may be provided via an article of manufacture with the code or instructions stored thereon, or via a method of operating a communication interface to send data via a communication interface (e.g., wirelessly, over the internet, via satellite communications, and the like).

[0153] Further, the software code may be tangibly stored on one or more volatile or non-volatile computer-readable storage media during execution or at other times. These computer-readable storage media may include any mechanism that stores information in a form accessible by a machine (e.g., computing device, electronic system, and the like), such as, but not limited to, floppy disks, hard disks, removable magnetic disks, any form of magnetic disk storage media, CD-ROMs, magnetic-optical disks, removable optical disks (e.g., compact disks and digital video disks), flash memory devices, magnetic cassettes, memory cards or sticks (e.g., secure digital cards), RAMs (e.g., CMOS RAM and the like), recordable/non-recordable media (e.g., ROMs), EPROMs, EEPROMs, or any type of media suitable for storing electronic instructions, and the like. Such a computer-readable storage medium is coupled to a computer system bus to be accessible by the processor and other parts of the OIS.

[0154] In an embodiment, the computer-readable storage medium may have encoded a data structure for treatment planning, wherein the treatment plan may be adaptive. The data structure for the computer-readable storage medium may be at least one of a Digital Imaging and Communications in Medicine (DICOM) format, an extended DICOM format, an XML format, and the like. DICOM is an international communications standard that defines the format used to transfer medical image-related data between various types of medical equipment. DICOM RT refers to the communication standards that are specific to radiation therapy.

[0155] In various embodiments of the disclosure, the method of creating a component or module can be implemented in software, hardware, or a combination thereof. The methods provided by various embodiments of the present disclosure, for example, can be implemented in software by using standard programming languages such as, for example, C, C++, Java, Python, and the like; and combinations thereof. As used herein, the terms “software” and “firmware” are interchangeable and include any computer program stored in memory for execution by a computer.

[0156] A communication interface includes any mechanism that interfaces to any of a hardwired, wireless, optical, and the like, medium to communicate to another device, such as a memory bus interface, a processor bus interface, an Internet connection, a disk controller, and the like. The communication interface can be configured by providing configuration parameters and/or sending signals to prepare the communication interface to provide a data signal describing the software content. The communication interface can be accessed via one or more commands or signals sent to the communication interface.

[0157] The present disclosure also relates to a system for performing the operations herein. This system may be specially constructed for the required purposes, or it may comprise a general purpose computer selectively activated or reconfigured by a computer program stored in the computer. The order of execution or performance of the operations in embodiments of the disclosure illustrated and described herein is not essential, unless otherwise specified. That is, the operations may be performed in any order, unless otherwise specified, and embodiments of the disclosure may include additional or fewer operations than those disclosed herein. For example, it is contemplated that executing or performing a particular operation before, contemporaneously with, or after another operation is within the scope of aspects of the disclosure.

[0158] In view of the above, it will be seen that the several objects of the disclosure are achieved and other advantageous results attained. Having described aspects of the disclosure in detail, it will be apparent that modifications and variations are possible without departing from the scope of aspects of the disclosure as defined in the appended claims. As various changes could be made in the above constructions, products, and methods without departing from the scope of aspects of the disclosure, it is intended that all matter contained in the above description and shown in the accompanying drawings shall be interpreted as illustrative and not in a limiting sense.

[0159] The above description is intended to be illustrative and not restrictive. For example, the above-described examples (or one or more aspects thereof) may be used in combination with each other. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the disclosure without departing from its scope. Many other embodiments will be apparent to those of skill in the art upon reviewing the above description. The scope of the disclosure should, therefore, be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.

[0160] Also, in the above Detailed Description, various features may be grouped together to streamline the disclosure. This should not be interpreted as intending that an unclaimed disclosed feature is essential to any claim. Rather, inventive subject matter may lie in less than all features of a particular disclosed embodiment. Thus, the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separate embodiment. The scope of the disclosure should be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled. Further, the limitations of the following claims are not written in means-plus-function format and are not intended to be interpreted based on 35 U.S.C. § 112, sixth paragraph, unless and until such claim limitations expressly use the phrase “means for” followed by a statement of function void of further structure.

[0161] The Abstract is provided to comply with 37 C.F.R. § 1.72(b), to allow the reader to quickly ascertain the nature of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims.

Claims

What is Claimed is:

1. A method comprising: receiving, by processing circuitry, a first set of radiotherapy treatment plan data; processing, by the processing circuitry, the first set of radiotherapy treatment plan data with a machine learning model to estimate one or more preferences in a multicriteria radiotherapy treatment plan optimization (MCO) problem, wherein the machine learning model is trained to establish a relationship between the one or more preferences and information associated with a plurality of training radiotherapy treatment plan data, wherein the machine learning model includes a neural network; generating, by the processing circuitry, a solution to the MCO problem based on the estimated one or more preferences; and generating a radiotherapy treatment device parameter based on the solution to the MCO problem.

2. The method of claim 1, wherein the solution to the MCO problem is further generated using the first set of radiotherapy treatment plan data.

3. The method of any one of claims 1-2, further comprising selecting the machine learning model from a plurality of machine learning models, each of the plurality of machine learning models being trained based on a different set of training radiotherapy treatment plan data corresponding to different medical facilities or clinicians, the different set of training radiotherapy treatment plan data being stored on a remote storage device, the different set of training radiotherapy treatment plan data being updated periodically or in real-time.

4. The method of any one of claims 1-3, wherein the first set of radiotherapy treatment plan data comprises one or more medical images comprising a computed tomography (CT) image and a magnetic resonance (MR) image.

5. The method of any one of claims 1-4, wherein the one or more preferences are expressed using at least one of a weighted sum of objectives or an ordinal ranking of objectives.

6. The method of any one of claims 1-5, wherein a first preference of the one or more preferences corresponds to an amount of dose to a target region, a second preference of the one or more preferences corresponds to a level of normal tissue sparing, and a third preference of the one or more preferences corresponds to a treatment complexity parameter.

7. The method of claim 6, wherein the treatment complexity parameter comprises at least one of treatment delivery time, number of required geometrical configurations, or robustness to positioning uncertainty.

8. The method of any one of claims 1-7, further comprising: generating a display of a graphical user interface for generating a radiotherapy treatment plan; displaying, within the graphical user interface, the one or more preferences as default values for solving the MCO problem; and detecting user interaction with the graphical user interface modifying the default values.

9. The method of claim 8, wherein the graphical user interface comprises a first option for specifying a first preference corresponding to an amount of dose to a target region, a second option for specifying a second preference corresponding to a level of normal tissue sparing, and a third option for specifying a third preference corresponding to a treatment complexity parameter.

10. The method of any one of claims 1-9, further comprising training the machine learning model by performing training operations comprising: obtaining the plurality of training radiotherapy treatment plan data corresponding to a known set of radiotherapy treatment plans; selecting a first batch of the plurality of training radiotherapy treatment plan data; processing the first batch of the plurality of training radiotherapy treatment plan data with the machine learning model to estimate a set of preferences in the MCO problem; solving the MCO problem based on the estimated set of preferences to generate a batch of training radiotherapy treatment plans; computing a gradient of a loss function with respect to the estimated set of preferences, wherein the loss function is based on the batch of training radiotherapy treatment plans and the known set of radiotherapy treatment plans; and updating parameters of the machine learning model based on the gradient.

11. The method of claim 10, further comprising performing a plurality of iterations of the training operations across additional batches of the plurality of training radiotherapy treatment plan data until a stopping criterion is met.

12. The method of claim 11, wherein the batch of training radiotherapy treatment plans is a first batch of training radiotherapy treatment plans, further comprising after updating the parameters of the machine learning model: processing the first batch of the plurality of training radiotherapy treatment plan data with the machine learning model again to estimate a new set of preferences of the MCO problem; solving the MCO problem based on the estimated new set of preferences to generate a new batch of training radiotherapy treatment plans; computing a new gradient of the loss function with respect to the new batch of training radiotherapy treatment plans and the first batch of training radiotherapy treatment plans; and updating parameters of the machine learning model based on the new gradient.

13. The method of claim 10, wherein the MCO problem is solved in accordance with a gradient-based optimization method.

14. The method of any one of claims 10-13, wherein the machine learning model comprises a generative adversarial network (GAN) comprising a discriminative machine learning model or a generative machine learning model; and wherein the loss function comprises at least one of (i) a deviation between information associated with the batch of training radiotherapy treatment plans and information associated with the known set of radiotherapy treatment plans, (ii) an adversarial loss function, or (iii) an evidence lower bound (ELBO).

15. The method of any one of claims 1-14, wherein solving the MCO problem based on the estimated set of preferences comprises processing the estimated set of preferences in accordance with a deep equilibrium model.

16. The method of any one of claims 1-15, wherein solving the MCO problem based on the estimated set of preferences comprises processing the estimated set of preferences in accordance with a convex optimization layers process.

17. The method of any one of claims 1-16, wherein the generated radiotherapy treatment device parameter comprises at least one of fluence maps, control points, beam-angles, collimator apertures, seed placements, dwell times, isocenter locations, or beam-on times.

18. A system for generating one or more radiotherapy treatment plans, the system comprising: one or more processors configured to perform operations comprising: receiving a first set of radiotherapy treatment plan data; processing the first set of radiotherapy treatment plan data with a machine learning model to estimate one or more preferences in a multicriteria radiotherapy treatment plan optimization (MCO) problem, wherein the machine learning model is trained to establish a relationship between the one or more preferences and information associated with a plurality of training radiotherapy treatment plan data, wherein the machine learning model includes a neural network; generating a solution to the MCO problem based on the estimated one or more preferences; and generating a radiotherapy treatment device parameter based on the solution to the MCO problem.

19. The system of claim 18, wherein the solution to the MCO problem is further generated using the first set of radiotherapy treatment plan data.

20. The system of any one of claims 18-19, wherein the operations further comprise selecting the machine learning model from a plurality of machine learning models, each of the plurality of machine learning models being trained based on a different set of training radiotherapy treatment plan data corresponding to different medical facilities or clinicians, the different set of training radiotherapy treatment plan data being stored on a remote storage device, the different set of training radiotherapy treatment plan data being updated periodically or in real-time.

21. The system of any one of claims 18-20, wherein the first set of radiotherapy treatment plan data comprises one or more medical images comprising a computed tomography (CT) image and a magnetic resonance (MR) image.

22. The system of any one of claims 18-21, wherein the one or more preferences are expressed using at least one of a weighted sum of objectives or an ordinal ranking of objectives.

23. The system of any one of claims 18-22, wherein a first preference of the one or more preferences corresponds to an amount of dose to a target region, a second preference of the one or more preferences corresponds to a level of normal tissue sparing, and a third preference of the one or more preferences corresponds to a treatment complexity parameter.

24. The system of claim 23, wherein the treatment complexity parameter comprises at least one of treatment delivery time, number of required geometrical configurations, or robustness to positioning uncertainty.

25. The system of any one of claims 18-24, wherein the operations further comprise: generating a display of a graphical user interface for generating a radiotherapy treatment plan; displaying, within the graphical user interface, the one or more preferences as default values for solving the MCO problem; and detecting user interaction with the graphical user interface modifying the default values.

26. The system of any one of claims 18-25, wherein the operations further comprise training the machine learning model by performing training operations comprising: obtaining the plurality of training radiotherapy treatment plan data corresponding to a known set of radiotherapy treatment plans; selecting a first batch of the plurality of training radiotherapy treatment plan data; processing the first batch of the plurality of training radiotherapy treatment plan data with the machine learning model to estimate a set of preferences in the MCO problem; solving the MCO problem based on the estimated set of preferences to generate a batch of training radiotherapy treatment plans; computing a gradient of a loss function with respect to the estimated set of preferences, wherein the loss function is based on the batch of training radiotherapy treatment plans and the known set of radiotherapy treatment plans; and updating parameters of the machine learning model based on the gradient.

27. The system of claim 26, wherein the operations further comprise performing a plurality of iterations of the training operations across additional batches of the plurality of training radiotherapy treatment plan data until a stopping criterion is met.

28. The system of claim 27, wherein the batch of training radiotherapy treatment plans is a first batch of training radiotherapy treatment plans; and wherein the operations further comprise after updating the parameters of the machine learning model: processing the first batch of the plurality of training radiotherapy treatment plan data with the machine learning model again to estimate a new set of preferences of the MCO problem; solving the MCO problem based on the estimated new set of preferences to generate a new batch of training radiotherapy treatment plans; computing a new gradient of the loss function with respect to the new batch of training radiotherapy treatment plans and the first batch of training radiotherapy treatment plans; and updating parameters of the machine learning model based on the new gradient.

29. The system of claim 26, wherein the MCO problem is solved in accordance with a gradient-based optimization method.

30. The system of claim 29, wherein the machine learning model comprises a generative adversarial network (GAN) comprising a discriminative machine learning model or a generative machine learning model; and wherein the loss function comprises at least one of (i) a deviation between information associated with the batch of training radiotherapy treatment plans and information associated with the known set of radiotherapy treatment plans, (ii) an adversarial loss function, or (iii) an evidence lower bound (ELBO).

31. The system of any one of claims 18-30, wherein solving the MCO problem based on the estimated set of preferences comprises processing the estimated set of preferences in accordance with a deep equilibrium model.

32. The system of any one of claims 18-31, wherein solving the MCO problem based on the estimated set of preferences comprises processing the estimated set of preferences in accordance with a convex optimization layers process.

33. The system of any one of claims 18-32, wherein the generated radiotherapy treatment device parameter comprises at least one of fluence maps, control points, beam-angles, collimator apertures, seed placements, dwell times, isocenter locations, or beam-on times.

34. A method for training a machine learning model comprising: receiving a first set of radiotherapy treatment plan data; and training a machine learning model to estimate one or more preferences in a multicriteria radiotherapy treatment plan optimization (MCO) problem by establishing a relationship between the one or more preferences and information associated with a plurality of training radiotherapy treatment plan data, wherein the machine learning model includes a neural network.

35. The method of claim 34, wherein the one or more preferences comprise weighted objectives in the MCO problem.

36. A transitory or non-transitory computer-readable medium comprising non-transitory computer-readable instructions for performing operations of any of the preceding claims.
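
As a non-limiting illustration of the training operations recited in claims 26 and 27, the following Python sketch runs one possible version of the loop on synthetic data. Everything concrete in it is an assumption made for illustration and is not taken from the disclosure: a single linear layer followed by a softmax stands in for the neural network, the MCO problem is a toy weighted sum of quadratic objectives with a closed-form solution, the loss is a mean squared deviation between generated and known plans, and the names `solve_mco`, `plan_loss`, `train`, and `anchors` are hypothetical.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax; maps raw model outputs to weights that sum to 1."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def solve_mco(weights, anchors):
    """Toy weighted-sum MCO (assumption): minimize sum_i w_i * ||x - a_i||^2.

    When the weights sum to 1 the minimizer is x* = sum_i w_i * a_i.
    """
    return weights @ anchors


def plan_loss(theta, features, known_plans, anchors):
    """Mean squared deviation between generated plans and known plans."""
    plans = solve_mco(softmax(features @ theta), anchors)
    return float(np.mean(np.sum((plans - known_plans) ** 2, axis=1)))


def train(features, known_plans, anchors, lr=0.5, batch_size=8,
          max_epochs=300, tol=1e-6, seed=0):
    """Mini-batch gradient descent on the parameters of the stand-in model."""
    rng = np.random.default_rng(seed)
    theta = np.zeros((features.shape[1], anchors.shape[0]))
    for _ in range(max_epochs):
        order = rng.permutation(len(features))
        for start in range(0, len(order), batch_size):
            idx = order[start:start + batch_size]      # select a batch
            f, y = features[idx], known_plans[idx]
            w = softmax(f @ theta)                     # estimate preferences
            plans = solve_mco(w, anchors)              # solve the MCO problem
            # Gradient of the loss with respect to the estimated preferences.
            grad_w = 2.0 * (plans - y) @ anchors.T / len(idx)
            # Chain rule through the softmax, then through the linear layer.
            grad_z = w * (grad_w - np.sum(grad_w * w, axis=1, keepdims=True))
            theta -= lr * (f.T @ grad_z)               # update model parameters
        # Stopping criterion (assumption): loss on all training data below tol.
        if plan_loss(theta, features, known_plans, anchors) < tol:
            break
    return theta


# Synthetic data, made up for this sketch only.
rng = np.random.default_rng(1)
anchors = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])  # per-objective optima
features = rng.normal(size=(64, 3))                        # stand-in plan data
true_theta = rng.normal(size=(3, 3))
known_plans = solve_mco(softmax(features @ true_theta), anchors)

theta = train(features, known_plans, anchors)
print("loss before training:",
      plan_loss(np.zeros((3, 3)), features, known_plans, anchors))
print("loss after training: ",
      plan_loss(theta, features, known_plans, anchors))
```

In this sketch the gradient can be written by hand because the toy MCO problem has a closed-form minimizer. An MCO problem without a closed form would need a differentiable solver, for example one of the approaches named in claims 15, 16, and 29.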