WO2022005337A1 - Veins mask projection alignment - Google Patents

Veins mask projection alignment

Info

Publication number
WO2022005337A1
Authority
WO
WIPO (PCT)
Prior art keywords
mask
veins
image
alignment
projection
Prior art date
Application number
PCT/RU2021/050188
Other languages
French (fr)
Inventor
Dmitry Vladimirovich Dylov
Vito Michele LELI
Oleg Yur'yevich ROGOV
Aleksandr Yevgenyevich SARACHAKOV
Aleksandr Aleksandrovich RUBASHEVSKII
Original Assignee
Autonomous Non-Profit Organization For Higher Education «Skolkovo Institute Of Science And Technology»
Priority date
Filing date
Publication date
Application filed by Autonomous Non-Profit Organization For Higher Education «Skolkovo Institute Of Science And Technology»
Publication of WO2022005337A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00 ICT specially adapted for the handling or processing of medical images
    • G16H30/40 ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10048 Infrared image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30101 Blood vessel; Artery; Vein; Vascular

Definitions

  • the following generally relates to mixed reality facilities and visible object augmentation with a specific application to a real-time veins imaging and medical manipulations using the same.
  • the number of performed blood tests can be estimated as more than 1.5 million per day, with roughly 45% of them involving discomfort of various degrees of severity, e.g., rashes, hematomas, or damaged veins due to repeated venipuncture.
  • the risk group includes people with obesity (2.1 billion people in the world), diabetes (415 million), chronic venous damage (30 million), infants and children up to 10 years (more than 1.5 billion).
  • The peripheral difficult venous access (PDVA) problem is characterized by poorly discernible and non-palpable veins, such that even a highly experienced doctor resorts to the use of technological aids for guiding the needle of the vein-puncturing device. The most frequent causes are thin or deep veins, excessive adipose tissue layers, loss of color contrast due to the tone or the hairiness of the skin, edemas, and prior puncture damage.
  • the most promising line of augmenting products in the field of veins imaging are the vein scanners that employ NIR cameras to gain the extra vein contrast present in the NIR part of the spectrum.
  • the NIR image can clearly show the contours of the veins on a separate screen or directly on the body part of the patient using a visible light projector.
  • the use of such instruments is reported to result in improvement of the catheterization process by 81%, in reduction of the procedure time by 78%, and in doubling of the first-time venipuncture success rate in the pediatric care segment.
  • the pending invention is focused on solving the problems relating to proper alignment of a veins mask projection on the part of a human body for which the mask is obtained using a camera, according to some embodiments, using a NIR camera.
  • the proposed method and system are focused on aligning some visible image (veins mask projected by a projector on the surface) with a partially invisible one (the actual veins contour that is partially invisible to an unaided eye).
  • the applicant assumes that the proposed invention may be further successfully applied not only to veins imaging, but also to other practical uses of mixed or augmented reality techniques that face the same problems of properly aligning the augmenting elements with reference to real objects.
  • the term “projection” used hereinafter is meant to relate to the augmenting picture of the veins contour provided by means of a projector that projects the veins mask over the target area of a human or animal body part.
  • the proposed invention uses a noise-based reinforcement learning (RL) alignment approach that ’rewards’ the system when the noise fluctuations from the visible projection overlap with the NIR mask.
  • the proposed invention takes advantage of the OpenCV and OpenAI stable baselines for RL to do so in real time, performing affine transformations of the projection until proper alignment with the vasculature is achieved (translation, rotation, scaling).
  • the proposed invention is configured for relatively easy integration in a real-time vein-imaging scheme.
  • the proposed projection alignment method provides a continuous process of veins mask projection and alignment.
  • the invention relates to a method for veins mask projection alignment over the actual veins of a target area of a human or animal body part via reinforcement learning.
  • the proposed method comprises the following steps:
  • the set of agent states is defined by a set of transformation values needed to adjust the projected veins mask over the veins as registered by the last taken image via an affine transformation: planar translations, rotation about the axis perpendicular to the projection plane, and scaling; the reward function is based on the alignment value; and the policy function is an optimal policy learned via reinforcement learning algorithms.
  • the method further includes taking an image of a target area of a human or animal body part and obtaining a tracking mask from the last taken image, wherein the tracking mask is a part of the image containing the minimum set of pixels required to localize the veins of the target area on the image, with the processing of the further image to determine an alignment value evaluating the projection alignment further including processing the further image to define a region of interest, wherein the region of interest is the area inside the further image which best matches the tracking mask, and returning back to obtaining a veins mask from the NIR image if the region of interest is not defined on the image, otherwise processing the defined ROI to determine an alignment value evaluating the projection alignment.
  • the method prior to the projecting the method further comprises obtaining a veins mask from the last taken image, the veins mask containing a contour of the veins of said body part.
  • the proposed method can also comprise obtaining a tracking mask which includes:
  • the processing of the further image to define the region of interest is performed using the template matching function.
  • the images are images of a near-infrared camera.
  • the veins mask is a black-and-white image with veins represented by white.
  • the alignment value is a value indicative of red channel intensity over the region of interest or is based on the noise fluctuations between the visible mask projection over the region of interest registered on the last taken image.
  • the reinforcement learning algorithms are Q-learning, Deep Q Network (DQN), Proximal Policy Optimization (PPO), Hierarchical Reinforcement Learning (HRL), or a combination thereof.
  • the veins mask transformation via one or more artificial agents based on reinforcement learning to obtain an optimal projection alignment includes:
  • said veins mask transformation via a group of multiple artificial agents based on reinforcement learning.
  • each agent is configured to perform only a subset of opposite actions present in the action space, and the agents' operation is controlled by a supervising agent.
  • the invention proposes a system for veins mask projection alignment over the actual veins of a target area of a human or animal body part, wherein the system comprises a camera configured to take images of the target area, a visible light projector configured to project an input image on the target area, a processing module functionally coupled with the camera and the projector and configured to:
  • the processing module is further configured to: obtain a tracking mask from the last image taken by the camera, wherein the tracking mask is a part of the image containing the minimum set of pixels required to localize the veins of target area on the image.
  • the processing module is further configured to:
  • the processing module is further configured to obtain a veins mask from the last image taken by the camera, the veins mask containing a contour of the veins of said body part.
  • processing module is further configured to:
  • the processing module is configured to use a template matching function to process the further image from the camera to define the region of interest.
  • the camera is a near-infrared camera.
  • the veins mask is a black-and-white image with veins represented by white.
  • the alignment value is a value indicative of red channel intensity over the region of interest or the alignment value is based on the noise fluctuations between the visible mask projection over the region of interest registered on the last taken image.
  • the reinforcement learning algorithms based on which the processing module operates, according to some embodiments of the system, are Q-learning, Deep Q Network (DQN), Proximal Policy Optimization (PPO), Hierarchical Reinforcement Learning (HRL), or a combination thereof.
  • DQN Deep Q Network
  • PPO Proximal Policy Optimization
  • HRL Hierarchical Reinforcement Learning
  • the veins mask transformation via a group of multiple artificial agents is based on reinforcement learning.
  • each agent is configured to perform only a subset of opposite actions present in the action space, and the agents' operation is controlled by a supervising agent.
  • the invention may take form in various components and arrangements of components, and in various steps and arrangements of steps.
  • the drawings are only for purposes of explaining the essential features of the proposed method and system and illustrating that preferred embodiments and are not to be construed as limiting the invention.
  • FIGURE 1 schematically illustrates an embodiment of the veins mask registration and projection system.
  • FIGURE 2 flowcharts a tracking mask obtaining according to an embodiment of a method for veins mask projection alignment.
  • FIGURE 3 shows agent-environment interaction in a reinforcement learning decision process.
  • FIGURE 4 shows the alignment procedure via an artificial agent.
  • FIGURE 5 shows the mixed reality metric function for an agent operative for translation on the X, Y axes according to one of the embodiments.
  • FIGURE 6 illustrates the exemplary reward dependency for an agent operative for vertical alignment.
  • FIGURE 7 shows the steps of image-projection alignment, wherein insets indicate gradual positions of the registered image and the plot shows mixed reality contrast (being the alignment value for the illustrated embodiment) in relation to the camera frame number.
  • FIGURE 8 shows a functional diagram of an embodiment with multiple agents.
  • system 1 is schematically illustrated.
  • the system comprises a source of IR illumination, a camera configured to capture the scattered illumination from a human or animal body part, a computational core configured to obtain a veins mask of the body part and transform it for proper alignment, and a projector for projecting the veins mask on the body part.
  • the system 1 scheme is built with simple and commercially available components, which function as follows.
  • IR diodes operating at the wavelength range from 750 nm to 900 nm (for example, Kingbright L-7113SF4C, Taipei, Taiwan, may be used) illuminate the area of interest on the patient's forearm.
  • the light scattered and absorbed in the forearm tissues is then collected by a NIR camera (e.g. Raspberry NoIR camera V2).
  • the IR filter placed before the camera reduces the stray light effect of the ambient light, mostly in the visible spectral range.
  • crossed polarizers are placed after the diodes and before the camera.
  • the raw image acquired by the camera is then processed by the processing module (e.g. Raspberry Pi 4 with 4Gb RAM) and the processed image goes into projector.
  • the processing module itself contains various modules for image-mask alignment (CV- and RL- based).
  • the processing module is also configured for obtaining the veins mask (for some embodiments Frangi-filtered input for the U-Net were successfully used).
  • a projector (for example, XGIMI Z3, Chengdu, Sichuan, China) is used to translate the transformed segmented veins back to the forearm for proper hand-mask alignment as shown in FIG. 1.
  • the camera has a frame rate of 30 fps and, given the above-mentioned exemplary processing capabilities, the system is configured to provide a new image every 5 seconds.
  • a wide range of other equipment may be used having higher processing and imaging capabilities resulting in faster alignment procedure, with some embodiments being configured to both quickly obtain new veins mask and promptly align it despite frequent movement of the body part.
  • the system can be based on ubiquitous components, which results in low cost and facilitates the integration of the claimed methods into a wide range of in-place equipment.
  • veins mask is obtained from a NIR image of a target area of a human or animal body part based on the property of the NIR light to be absorbed or scattered in the forward direction by blood, whereas it is scattered in skin and subcutaneous fat.
  • the “veins mask” is hereinafter a black-and-white image with veins represented by white.
  • the proposed alignment method can be successfully used with different techniques providing different types of veins mask.
  • Each iteration of the whole alignment method is based on comparing two successive images (F and F’) of the target area. Since only a part of a camera frame may contain the region where the veins are found, in order to make the comparison more efficient, according to some of the embodiments there are defined a tracking mask (TM) containing the minimum set of pixels required to localize the veins of the body part inside the previous camera frame (F) and a corresponding region of interest (ROI) inside the further camera frame (F’).
  • TM tracking mask
  • ROI region of interest
  • the further camera frame (F’) does not comprise the part corresponding to the part of the first frame (F) according to which the current veins mask was obtained. Therefore a new veins mask should be obtained from the further frame (F’), which is then taken as a first frame (F) while repeating the step of defining the ROI on the following camera frame.
  • these operations provide continuous and correct veins mask generation and projection even when there are substantive changes in the camera field of view.
  • the tracking procedure for defining a ROI is as follows. Starting with a camera frame (F) and a veins mask (VM) extracted from that frame:
  • VM veins mask
  • CA contiguous convex area
  • a candidate tracking mask is obtained by filtering the frame (F) with the convex area (CA), thus collecting a convex map of pixels which can be tested for tracking with a Template Matching operation (wherein the template matching operation is used according to commonly known steps, examples of which are provided below);
  • Region Of Interest is defined as the area tracked by the tracking mask (TM).
  • The procedure of obtaining a tracking mask for an embodiment when the veins mask is extracted from the current frame is shown in FIG. 2.
  • the veins mask is extracted only if no ROI was defined on the current frame.
  • a template matching formula is used, wherein I denotes the image, T the tracking mask, R the result, and M another mask used to contour the tracking mask T.
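The masked template matching can be sketched in NumPy as follows (a minimal illustration under the variable names above; the function names are invented for this sketch, and OpenCV's `cv2.matchTemplate` with its `mask` argument provides an optimized equivalent of the same operation):

```python
import numpy as np

def masked_match(image, template, mask):
    """Slide `template` (T) over `image` (I), scoring only the pixels where
    `mask` (M) is non-zero: mask-restricted normalized cross-correlation."""
    ih, iw = image.shape
    th, tw = template.shape
    scores = np.zeros((ih - th + 1, iw - tw + 1))
    t = template * mask                              # M contours the tracking mask T
    t_norm = np.sqrt((t * t).sum()) + 1e-9
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            w = image[y:y + th, x:x + tw] * mask
            w_norm = np.sqrt((w * w).sum()) + 1e-9
            scores[y, x] = (t * w).sum() / (t_norm * w_norm)
    return scores                                    # R: the result map

def locate_roi(image, template, mask):
    """Top-left corner (y, x) of the best-matching window, i.e. the ROI."""
    r = masked_match(image, template, mask)
    return np.unravel_index(np.argmax(r), r.shape)
```

In practice `cv2.matchTemplate(frame, tm, cv2.TM_CCORR_NORMED, mask=mask)` performs the same scoring far faster than the explicit double loop above.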
  • alignment of the mask projection is provided by the veins mask transformations which are found by the means of one or more artificial agents which are based on reinforcement learning (RL).
  • RL reinforcement learning
  • The exemplary operating diagram of the process of agent-environment interaction is illustrated in FIG. 3, where a quite simple illustrative embodiment is schematically depicted with only one agent performing the actions according to the received environment information from the camera.
  • the illustrated aspect of the inventive concept may also be called a closed-loop system that provides the proper alignment of the augmentation projection over the real object with use of artificial agent based on reinforcement learning (RL).
  • FIG. 4 provides an operative block-scheme of the agent logic.
  • the states space for at least some of the artificial agents is defined by a set of values needed to adjust the projected mask over the real veins via an affine transformation: planar translations, rotation about the axis perpendicular to the projection plane, and scaling.
  • the action cost function is fixed as -1 for each step, which means that every action costs one unit of time. This provides the technical effect of minimizing the number of actions needed to obtain the proper alignment of the projection.
  • the objective function drives the agent towards the proper alignment.
  • the video acquisition element of the system is a NIR camera, thus the normal light spectrum is not available to be processed.
  • POMDP Partially Observable Markov Decision Processes
  • the objective function is defined as the lowest possible value of the red channel intensity over the target area Region of Interest (ROI) (the area where the augmentation mask needs to be projected) which was found in the previous steps of veins mask extraction and tracking.
  • the projected image is shifted from the correct position. Since the projected mask is obtained according to a camera frame, the mask corresponds to the point of view of the acquisition camera, whose position differs from that of the projector. At least this difference should be compensated by transformations of the mask in order to obtain an optimal alignment.
  • the nonsignificant (not causing any substantial changes of the camera image resulting in failure of ROI definition) movements of the body part can also be taken into account here, and the steps of obtaining another mask are skipped only if the ROI corresponding to the mask is defined on the further camera frame (F').
  • the mask transformation is completed via the affine transform matrix. If the set of agent states is $s_t \in S$, actions $a_t \in A$, and rewards $r_t \in R$, a state $s_t$ representing a transformation can be written as a column-wise $3 \times 3$ transformation matrix responsible for the mask shifts against the original image. $s_t$ can also be parameterized by 5 parameters: translations $(t_x, t_y)$, scaling $(s_x, s_y)$, and a rotation angle $\varphi$.
  • An example of such transformation matrix (1) according to one of the embodiments of the invention is given below:
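One standard homogeneous form combining the five parameters (translations, scaling, and rotation; the exact composition order used in the patented embodiment is an assumption here) is:

```latex
s_t =
\begin{pmatrix}
s_x \cos\varphi & -s_y \sin\varphi & t_x \\
s_x \sin\varphi & \phantom{-}s_y \cos\varphi & t_y \\
0 & 0 & 1
\end{pmatrix}
```

Applying this matrix to homogeneous pixel coordinates $(x, y, 1)^{T}$ scales, rotates, and then translates the mask in the projection plane.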
  • the reward function for the one or more agents used according to the invention is based on some parameter of the image that is considered to be indicative of the projection alignment accuracy (as it would be understood by a person skilled in the art, there might be different parameters used for different embodiments of the invention and their modifications).
  • FIG. 5 illustrates such a mixed reality metric in relation to possible projection positions within the X, Y axes.
  • FIG. 6 shows the example of mixed reality reward in relation to particular vertical displacement of the veins mask projection along with particular images from which some of the shown values were obtained.
  • the reward function may account for particular properties and special aspects, and different threshold values may be preset to consider the observable alignment proper.
  • the Q-Value family of RL algorithms is used to find proper policies and actions that guide the decision process of the artificial agent at this step.
  • the policy learning process is considered as an RL problem of determining the optimal function in the action-value space, and the Q-value is: $Q^{*}(s, a) = \max_{\pi} \mathbb{E}\left[ r_t + \gamma r_{t+1} + \gamma^2 r_{t+2} + \dots \mid s_t = s, a_t = a, \pi \right]$
  • the action state space is discrete, thus it is possible to predict the optimal action by a plethora of RL algorithms, which is done according to some of the invention embodiments.
  • the examples of such RL algorithms for the proposed method and system are Deep Q Network (DQN) or Proximal Policy Optimization (PPO).
  • in the DQN loss $L_i(\theta_i) = \mathbb{E}\left[ \left( r + \gamma \max_{a'} Q(s', a'; \theta_i^{-}) - Q(s, a; \theta_i) \right)^2 \right]$: $\gamma$ is the discount factor determining the agent's horizon; $\theta_i$ are the parameters of the Q-network at iteration $i$; $\theta_i^{-}$ are the network parameters used to compute the target at iteration $i$ (usually those are the parameters from the $i-1$ iteration).
  • in the PPO clipped objective $L^{CLIP}(\theta) = \hat{\mathbb{E}}_t\left[ \min\left( r_t(\theta) \hat{A}_t,\ \mathrm{clip}(r_t(\theta), 1-\epsilon, 1+\epsilon) \hat{A}_t \right) \right]$: $\theta$ stands for the policy parameters; $\hat{\mathbb{E}}_t$ is the empirical expectation; $\hat{A}_t$ is the estimated advantage at time $t$; $\epsilon$ is a hyper-parameter; $r_t(\theta)$ is the ratio of the probability under the new and old policies, respectively: $r_t(\theta) = \pi_\theta(a_t \mid s_t) / \pi_{\theta_{old}}(a_t \mid s_t)$.
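The Q-value family on a discrete action space can be illustrated with minimal tabular Q-learning on a toy one-dimensional alignment task (the environment, constants, and function names below are simplifications invented for illustration, not the patented system):

```python
import random

# Toy 1-D alignment task: the state is the horizontal offset of the
# projected mask in pixels; the two actions translate the mask one pixel
# left or right. As in the description, every action costs -1, so the
# agent learns to reach perfect alignment (offset 0) in as few steps as
# possible; offset 0 terminates the episode.
OFFSETS = range(-5, 6)
ACTIONS = (-1, +1)

def q_learning(episodes=2000, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in OFFSETS for a in ACTIONS}
    for _ in range(episodes):
        s = rng.choice([o for o in OFFSETS if o != 0])
        while s != 0:
            if rng.random() < eps:                         # explore
                a = rng.choice(ACTIONS)
            else:                                          # exploit
                a = max(ACTIONS, key=lambda act: q[(s, act)])
            s2 = max(min(s + a, 5), -5)
            r = -1.0                                       # each action costs one unit of time
            target = r if s2 == 0 else r + gamma * max(q[(s2, b)] for b in ACTIONS)
            q[(s, a)] += alpha * (target - q[(s, a)])
            s = s2
    return q

q = q_learning()
# Greedy policy: from any misalignment, shift the mask toward offset 0.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in OFFSETS if s != 0}
```

In the patented system a DQN or PPO policy network replaces this lookup table, but the reward structure (a fixed per-step cost driving the agent to the aligned state) is the same idea.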
  • MLP Multi-layer Perceptron
  • CNN Convolutional Neural Network
  • the aligning strategy can only be based on intensity heuristics over the ROI and an RL approach with the above-mentioned constraints that converges to the solution.
  • the optimal alignment is considered to be achieved when each pixel of the ROI has somehow increased its red channel intensity.
  • the displacement metric is also called the alignment value.
  • the alignment value is based on the contrast of the ROI on the further camera frame (F’).
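One of the alignment-value choices named above (a red-channel statistic over the ROI) can be sketched as follows; the ROI layout, function names, and threshold are illustrative assumptions, and particular embodiments may seek a minimum or a maximum of this statistic:

```python
import numpy as np

def alignment_value(frame_rgb, roi):
    """Mean red-channel intensity over the ROI of the last taken frame.
    `frame_rgb` is an (H, W, 3) array; `roi` is (y, x, height, width)."""
    y, x, h, w = roi
    return frame_rgb[y:y + h, x:x + w, 0].astype(float).mean()

def keep_mask_intact(frame_rgb, roi, threshold):
    """Keep the projected mask unchanged when the alignment value reaches
    the predetermined threshold; otherwise the agents transform the mask."""
    return alignment_value(frame_rgb, roi) >= threshold
```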
  • an artificial agent is defined to align any misaligned projection over the proper area with the following steps (Fig. 4):
  • each action influences the position of the projected augmentation through changes in planar translation, rotation or scaling;
  • the system continues to monitor the alignment via the measurement of the displacement metric (alignment value): if this metric remains at the achievable minimum, the projection is in the proper position; if the metric increases, it means that the target position has changed (see FIG. 5);
  • the agent will align the mask with respect to the new target position by repeating steps 1 to 5.
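The monitoring steps above can be sketched as a closed loop; all three callables are hypothetical stand-ins for the camera, the alignment metric, and the agent's action step:

```python
def align(take_frame, measure, agent_step, threshold, max_steps=100):
    """Closed-loop monitoring sketch: while the alignment value stays
    below the threshold, the agent keeps transforming the projection.

    take_frame : () -> frame      camera stand-in
    measure    : frame -> float   alignment value of the frame
    agent_step : float -> None    apply one affine action to the mask
    Returns the number of actions taken before alignment was reached.
    """
    for step in range(max_steps):
        value = measure(take_frame())
        if value >= threshold:        # metric at its optimum: proper position
            return step
        agent_step(value)             # misaligned or target moved: act again
    return max_steps
```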
  • a multi-agent RL (MARL) problem is to be considered, which addresses the sequential decision-making problem of multiple autonomous agents that operate in a common environment, each of which aims to optimize its own long-term return by interacting with the environment and other agents.
  • MARL multi-agent RL
  • both the evolution of the system state and the reward received by each agent are influenced by the joint actions of all agents.
  • Each agent has its own long-term reward to optimize, which now becomes a function of the policies of all other agents.
  • a group of agents is defined where each one is configured to perform only a subset of opposite actions present in the action space (translation of one pixel up/down, translation of one pixel left/right, clockwise or counterclockwise rotation of one degree, increase/decrease of scale by one percent).
  • the combined execution is interleaved, with only one agent at a time having the opportunity to execute one action influencing the shared projection mask.
  • when an agent has the operative priority, it will follow the mixed reality reward function to the local minimum across its available actions and afterwards pass the priority to another agent with a different set of actions.
  • the agents receive a different mixed reality reward value, thus starting the aligning process again.
  • the operation of said group of multiple agents is controlled by a supervising agent.
  • the supervising agent is configured to prioritize the operation of the multiple agents and/or use some overall reward function to control the multiple agents operation.
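A minimal sketch of this supervising scheme, assuming round-robin priority and a greedy search per sub-agent over its own pair of opposite actions (class and parameter names are invented for illustration; here the reward is maximized, i.e. its negative is the displacement to be minimized):

```python
class SubAgent:
    """Owns one pair of opposite actions (e.g. shift one pixel up / down).
    `apply(direction)` performs the action and returns the new reward."""
    def __init__(self, apply_action):
        self.actions = (+1, -1)
        self.apply = apply_action

    def descend(self, reward):
        """Greedily follow the mixed reality reward to a local optimum
        across this agent's own actions only."""
        improved = True
        while improved:
            improved = False
            for a in self.actions:
                new_reward = self.apply(a)
                if new_reward > reward:
                    reward, improved = new_reward, True
                else:
                    self.apply(-a)            # the opposite action undoes the move
        return reward

class Supervisor:
    """Interleaves the sub-agents so that only one at a time influences
    the shared projection mask, passing the operative priority around."""
    def __init__(self, agents):
        self.agents = agents

    def align(self, reward, rounds=10):
        for _ in range(rounds):
            for agent in self.agents:         # grant priority in turn
                reward = agent.descend(reward)
        return reward
```

For example, with one sub-agent owning horizontal shifts and another owning vertical shifts, the supervisor alternates them until the shared reward stops improving.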

Abstract

A method and a system are provided for veins mask projection alignment over the actual veins of a target area. The method comprises the steps: taking a further image of a target area with the projected veins mask; processing the further image to determine an alignment value evaluating the projection alignment; keeping the mask intact if the determined alignment value is greater than or equal to a predetermined threshold, otherwise performing the veins mask transformation via one or more artificial agents based on reinforcement learning to obtain an optimal projection alignment. The system for veins mask projection alignment comprises a camera configured to take images of the target area, a visible light projector configured to project an input image on the target area, and a processing module functionally coupled with the camera and the projector.

Description

VEINS MASK PROJECTION ALIGNMENT
FIELD OF THE INVENTION
The following generally relates to mixed reality facilities and visible object augmentation with a specific application to a real-time veins imaging and medical manipulations using the same.
BACKGROUND OF THE INVENTION
According to analytical studies, the number of performed blood tests can be estimated as more than 1.5 million per day, with roughly 45% of them involving discomfort of various degrees of severity, e.g., rashes, hematomas, or damaged veins due to repeated venipuncture. The risk group includes people with obesity (2.1 billion people in the world), diabetes (415 million), chronic venous damage (30 million), infants and children up to 10 years (more than 1.5 billion).
The peripheral difficult venous access (PDVA) problem is characterized by poorly discernible and non-palpable veins, such that even a highly experienced doctor resorts to the use of technological aids for guiding the needle of the vein-puncturing device. The most frequent causes are thin or deep veins, excessive adipose tissue layers, loss of color contrast due to the tone or the hairiness of the skin, edemas, and prior puncture damage.
Though there can be found several systems capable of partially solving this problem through the hardware and software integration, today there has not been a universally adopted solution, especially when particularly difficult PDVA cases occur.
The most promising line of augmenting products in the field of veins imaging are the vein scanners that employ NIR cameras to gain the extra vein contrast present in the NIR part of the spectrum. The NIR image can clearly show the contours of the veins on a separate screen or directly on the body part of the patient using a visible light projector. The use of such instruments is reported to result in improvement of the catheterization process by 81%, in reduction of the procedure time by 78%, and in doubling of the first-time venipuncture success rate in the pediatric care segment.
Besides unwieldy dimensions of the hardware, the factors that hinder development of a universal vein imager include the algorithmic and the image-processing challenges. These challenges could be summarized to originate from the following two physical reasons, having influenced manufacturers to develop separate products for different patient cohorts and different applications:
- the large variation in vasculature contrast and in the dynamic range of the NIR measurement among the patients (e.g., due to the skin tone and thickness, etc.), requiring highly sensitive vasculature detection methods capable of operating at low signal-to-noise ratios;
- the variation in size, flatness, and position of the imaged body part, requiring non-trivial alignment of the measurement with the visualization modality (the screen, or the projector).
The pending invention is focused on solving the problems relating to proper alignment of a veins mask projection on the part of a human body for which the mask is obtained using a camera, according to some embodiments, using a NIR camera. In other words, it can be said that the proposed method and system are focused on aligning some visible image (veins mask projected by a projector on the surface) with a partially invisible one (the actual veins contour that is partially invisible to an unaided eye).
Considering all the problems set out in this disclosure, the applicant assumes that the proposed invention may be further successfully applied not only to veins imaging, but also to other practical uses of mixed or augmented reality techniques that face the same problems of properly aligning the augmenting elements with reference to real objects.
SUMMARY OF THE INVENTION
The aspects of the invention described herein address the above-referenced problem of proper projection alignment of a veins mask; the preceding procedures of obtaining and enhancing the mask are not covered in detail by the present description.
The issue of alignment and modality co-registration is challenging since the NIR camera, or a visible range camera with an IR filter, which is used for obtaining a veins mask, cannot see the visible projection of the segmented mask. Different products utilize different factory calibration methods to assure proper alignment of the recorded and the visualized images. However, this problem should be considered together with the shape variations in the imaged body parts caused by the difference between the 3D space of the acquiring camera and the projector, and therefore requires advanced feedback-based alignment approaches, which are the object of the pending invention.
It is to be noted that the term “projection” used hereinafter is meant to relate to the augmenting picture of the veins contour provided by means of a projector that projects the veins mask over the target area of a human or animal body part. For solving the above mentioned problems of proper veins mask alignment, the proposed invention uses a noise-based reinforcement learning (RL) alignment approach that ’rewards’ the system when the noise fluctuations from the visible projection overlap with the NIR mask. The proposed invention takes advantage of the OpenCV and OpenAI stable baselines for RL to do so in real time, performing affine transformations of the projection until proper alignment with the vasculature is achieved (translation, rotation, scaling).
While the use of RL in mixed-reality applications has been given a passing mention as a means for improving medical imaging projection (for example, in US 20200260066), there is no consistent technical solution offered in the prior art providing effective mutual alignment of an object and its veins imaging projection in a real-time mode.
Moreover, the prior art in the area of vein visualization is closely related to the products available in the market today, which include hand-held single and multi-screen tools, augmented reality glasses, surgical systems, projection-based fixed and hand-held devices. Therefore, it will be particularly appreciated that the techniques described hereby provide a possibility to use conventional commercially available hardware and open source software baselines to achieve effective real-time mask projection alignment.
As a further advantageous aspect, the proposed invention is configured for relatively easy integration in a real-time vein-imaging scheme.
Speaking of the practical use of the claimed invention, even if the imaged body part shifts in a way requiring obtaining a new veins mask, the proposed projection alignment method provides a continuous process of veins mask projection and alignment.
All the advantageous effects and benefits of the invention, including the above mentioned will be described in further detail below and are provided by the following aspects of the invention.
According to one aspect, the invention relates to a method for veins mask projection alignment over the actual veins of a target area of a human or animal body part via reinforcement learning.
According to one aspect of the present invention, the proposed method comprises the following steps:
- projecting the veins mask of a target area of a human or animal body part over said area;
- taking a further image of a target area with the projected veins mask;
- processing the further image to determine an alignment value evaluating the projection alignment;
- keeping the mask intact if the determined alignment value is greater than or equal to a predetermined threshold, otherwise performing the veins mask transformation via one or more artificial agents based on reinforcement learning to obtain an optimal projection alignment;
wherein the set of the agents' states is defined by a set of transformation values needed to adjust the projected veins mask over the veins as registered by the last taken image via the affine transformation: planar translations, rotation on the axis perpendicular to the projection plane, scaling; the reward function is based on the alignment value; and the policy function is an optimal policy learned via reinforcement learning algorithms.
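The steps above can be sketched as a closed loop; all callables here are assumptions standing in for the hardware and models described in the text:

```python
def alignment_loop(mask, project, capture, alignment_value, agent, threshold):
    """Project the veins mask, take a further image, score the alignment,
    and let an RL agent transform the mask until the score reaches the
    predetermined threshold (at which point the mask is kept intact)."""
    while True:
        project(mask)                   # project the current veins mask
        frame = capture()               # take a further image of the area
        score = alignment_value(frame)  # evaluate the projection alignment
        if score >= threshold:
            return mask                 # keep the mask intact
        mask = agent.transform(mask, frame)  # RL-driven affine adjustment
```

The agent's `transform` step would apply one of the affine adjustments (translation, rotation, scaling) selected by its learned policy.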
According to some embodiments, the method further includes taking an image of a target area of a human or animal body part and obtaining a tracking mask from the last taken image, wherein the tracking mask is a part of the image containing the minimum set of pixels required to localize the veins of the target area on the image, with the processing of the further image to determine an alignment value evaluating the projection alignment further including processing the further image to define a region of interest, wherein the region of interest is the area inside the further image which best matches the tracking mask, and returning back to obtaining a veins mask from the NIR image if the region of interest is not defined on the image, otherwise processing the defined ROI to determine an alignment value evaluating the projection alignment.
According to some embodiments, prior to the projecting the method further comprises obtaining a veins mask from the last taken image, the veins mask containing a contour of the veins of said body part.
The proposed method can also comprise obtaining a tracking mask which includes:
- obtaining a rectangular region from the last taken image, the rectangular region surrounding the veins according to the veins mask;
- obtaining a contiguous convex area of the veins by performing a convex hull operation over the veins mask;
- obtaining a candidate tracking mask by filtering the frame with the convex area, thus collecting a convex map of pixels;
- testing whether the collected convex map of pixels correctly tracks the target area using the template matching function; and
- if the candidate tracking mask correctly tracks the desired area inside the last taken image, selecting the candidate tracking mask as a tracking mask,
- otherwise expanding the candidate tracking mask and returning back to the testing with the expanded candidate tracking mask as a candidate tracking mask.
According to some embodiments, the processing of the further image to define the region of interest is performed using the template matching function.
According to some embodiments, the images are images of a near-infrared camera.
According to some embodiments, the veins mask is a black-and-white image with veins represented by white.
According to some embodiments, the alignment value is a value indicative of red channel intensity over the region of interest or is based on the noise fluctuations between the visible mask projection over the region of interest registered on the last taken image.
According to some embodiments, the reinforcement learning algorithms are Q-learning, Deep Q Network (DQN), Proximal Policy Optimization (PPO), Hierarchical Reinforcement Learning (HRL), or a combination thereof.
According to some embodiments, the veins mask transformation via one or more artificial agents based on reinforcement learning to obtain an optimal projection alignment includes:
- defining the current state of the one or more artificial agents based on one or more last taken images of the target area;
- calculating the alignment value using artificial neural networks;
- selecting the action that increases the alignment value using an optimal policy learned via reinforcement learning algorithms, wherein each action influences the position of the projected augmentation through changes in planar translation, rotation or scaling;
- repeating the previous steps of state defining, alignment value calculating and action selecting until the optimal projection alignment value is achieved.
According to some embodiments, said veins mask transformation is performed via a group of multiple artificial agents based on reinforcement learning.
According to some embodiments, each agent is configured to perform only a subset of opposite actions present in the action space, and the agents' operation is controlled by a supervising agent.
In another aspect, the invention proposes a system for veins mask projection alignment over the actual veins of a target area of a human or animal body part, wherein the system comprises a camera configured to take images of the target area, a visible light projector configured to project an input image on the target area, a processing module functionally coupled with the camera and the projector and configured to:
- provide the projector with a veins mask of the target area to be projected over said area;
- process the further image of a target area with the projected veins mask from the camera to determine an alignment value evaluating the projection alignment;
- keep the mask intact if the determined alignment value is greater than or equal to a predetermined threshold, otherwise perform the veins mask transformation via one or more artificial agents based on reinforcement learning to obtain an optimal projection alignment; wherein the set of the agents' states is defined by a set of transformation values needed to adjust the projected veins mask over the veins as registered by the last image from the camera via the affine transformation: planar translations, rotation on the axis perpendicular to the projection plane, scaling; the reward function is based on the alignment value; and the policy function is an optimal policy learned via reinforcement learning algorithms.
According to some embodiments, the processing module is further configured to obtain a tracking mask from the last image taken by the camera, wherein the tracking mask is a part of the image containing the minimum set of pixels required to localize the veins of the target area on the image.
According to some embodiments, the processing module is further configured to:
- process the further image of a target area with the projected veins mask from the camera to define a region of interest, wherein the region of interest is the area inside the further image which best matches the tracking mask;
- return back to obtaining a veins mask from the NIR image if the region of interest is not defined on the image, otherwise processing the defined ROI to determine an alignment value evaluating the projection alignment.
According to some embodiments, the processing module is further configured to obtain a veins mask from the last image taken by the camera, the veins mask containing a contour of the veins of said body part.
In yet another group of embodiments, the processing module is further configured to:
- obtain a rectangular region from the last taken image, the rectangular region surrounding the veins according to the veins mask;
- obtain a contiguous convex area of the veins by performing a convex hull operation over the veins mask;
- obtain a candidate tracking mask by filtering the frame with the convex area, thus collecting a convex map of pixels;
- test whether the collected convex map of pixels correctly tracks the target area using the template matching function; and
- if the candidate tracking mask correctly tracks the desired area inside the last taken image, select the candidate tracking mask as a tracking mask,
- otherwise expand the candidate tracking mask and return back to the testing step with the expanded candidate tracking mask as a candidate tracking mask.
According to some embodiments, the processing module is configured to use a template matching function to process the further image from the camera to define the region of interest.
In some embodiments of the system, the camera is a near-infrared camera.
According to some embodiments, the veins mask is a black-and-white image with veins represented by white.
In some embodiments of the system, the alignment value is a value indicative of red channel intensity over the region of interest or the alignment value is based on the noise fluctuations between the visible mask projection over the region of interest registered on the last taken image.
The reinforcement learning algorithms based on which the processing module operates, according to some embodiments of the system, are Q-learning, Deep Q Network (DQN), Proximal Policy Optimization (PPO), Hierarchical Reinforcement Learning (HRL), or a combination thereof.
According to some embodiments of the system, the veins mask transformation is performed via a group of multiple artificial agents based on reinforcement learning. According to a further embodiment, each agent is configured to perform only a subset of opposite actions present in the action space, and the agents' operation is controlled by a supervising agent.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention may take form in various components and arrangements of components, and in various steps and arrangements of steps. The drawings are only for purposes of explaining the essential features of the proposed method and system and illustrating the preferred embodiments, and are not to be construed as limiting the invention.
FIGURE 1 schematically illustrates an embodiment of the veins mask registration and projection system.
FIGURE 2 flowcharts a tracking mask obtaining according to an embodiment of a method for veins mask projection alignment.
FIGURE 3 shows agent-environment interaction in a reinforcement learning decision process.
FIGURE 4 shows the alignment procedure via an artificial agent.
FIGURE 5 shows the mixed reality metric function for an agent operative for translation on the X, Y axes according to one of the embodiments.
FIGURE 6 illustrates the exemplary reward dependency for an agent operative for vertical alignment.
FIGURE 7 shows the steps of image-projection alignment, wherein insets indicate gradual positions of the registered image and the plot shows mixed reality contrast (being the alignment value for the illustrated embodiment) in relation to the camera frame number.
FIGURE 8 shows a functional diagram of an embodiment with multiple agents.
DETAILED DESCRIPTION OF EMBODIMENTS
Initially referring to FIG. 1, system 1 is schematically illustrated. The system comprises a source of IR illumination, a camera configured to capture the scattered illumination from a human or animal body part, a computational core configured to obtain a veins mask of the body part and transform it for proper alignment, and a projector for projecting the veins mask on the body part.
According to the most basic embodiment, the system 1 is built with simple and commercially available components which function as follows. IR diodes operating in the wavelength range from 750 nm to 900 nm (for example, Kingbright L-7113SF4C, Taipei, Taiwan, may be used) illuminate the area of interest on the patient’s forearm. The light scattered and absorbed in the forearm tissues is then collected by a NIR camera (e.g. Raspberry NoIR camera V2). The IR filter placed before the camera reduces the stray light effect of the ambient light, mostly in the visible spectral range. In order to reduce reflection from the air-skin interface and minimize the flare, crossed polarizers are placed after the diodes and before the camera. The raw image acquired by the camera is then processed by the processing module (e.g. Raspberry Pi 4 with 4 Gb RAM) and the processed image goes into the projector. The processing module itself contains various modules for image-mask alignment (CV- and RL-based). In some embodiments, the processing module is also configured for obtaining the veins mask (in some embodiments a Frangi-filtered input for the U-Net was successfully used). Finally, a projector (for example, XGIMI Z3, Chengdu, Sichuan, China) is used to translate the transformed segmented veins back to the forearm for proper hand-mask alignment as shown in FIG. 1.
According to the most basic embodiment, the camera has a frame rate of 30 fps and, given the above-mentioned exemplary processing capabilities, the system is configured to provide a new image every 5 seconds. However, a wide range of other equipment with higher processing and imaging capabilities may be used, resulting in a faster alignment procedure, with some embodiments being configured to both quickly obtain a new veins mask and promptly align it despite frequent movement of the body part.
Thus, the system can be based on ubiquitous components, which results in low cost and facilitates the integration of the claimed methods into a wide range of in-place equipment.
Now moving further to FIG. 2, the focus will be made on the projection alignment process - therefore, we assume that the veins mask is already obtained.
According to an exemplary embodiment, veins mask is obtained from a NIR image of a target area of a human or animal body part based on the property of the NIR light to be absorbed or scattered in the forward direction by blood, whereas it is scattered in skin and subcutaneous fat. In this relation, the “veins mask” is hereinafter a black-and-white image with veins represented by white. However, with corresponding modifications, the proposed alignment method can be successfully used with different techniques providing different types of veins mask.
Each iteration of the whole alignment method is based on comparing two successive images (F and F’) of the target area. Since only a part of a camera frame may contain the region where the veins are found, in order to make the comparison more efficient, according to some of the embodiments a tracking mask (TM) is defined, containing the minimum set of pixels required to localize the veins of the body part inside the previous camera frame (F), along with a corresponding region of interest (ROI) inside the further camera frame (F’). Thereby, if the body part for which the first frame (F) was obtained has moved for some reason, it is possible to define whether the same veins mask is relevant for the further camera frame (F’) or a new veins mask should be obtained from the further frame (F’). If the ROI is not defined, it means that for some reason (related to a change in the imaged area) the further camera frame (F’) does not comprise the part corresponding to the part of the first frame (F) from which the current veins mask was obtained. Therefore a new veins mask should be obtained from the further frame (F’), which is then taken as a first frame (F) while repeating the step of defining the ROI on the following camera frame. According to the particular embodiment, these operations provide continuous and correct veins mask generation and projection even when there are substantive changes in the camera field of view.
So, according to some of the embodiments, as a frame is obtained from the NIR camera, a tracking procedure is performed to properly localize the ROI position inside the last frame. This ROI is then used for providing a proper alignment of the projected mask over the target area of the human or animal body part. According to one of the embodiments, the tracking procedure for defining a ROI is as follows. Starting with a camera frame (F) and a veins mask (VM) extracted from that frame:
1) a rectangular region (RECT) surrounding the veins identified in VM is obtained from the frame (F);
2) a convex hull operation is performed over the veins mask (VM) to obtain a contiguous convex area (CA) of the veins;
3) a candidate tracking mask (CTM) is obtained by filtering the frame (F) with the convex area (CA), thus collecting a convex map of pixels which can be tested for tracking with a Template Matching operation (wherein the template matching operation is used according to commonly known steps, examples of which are provided below);
4) if the candidate tracking mask (CTM) correctly tracks the desired area inside the frame (F), it is selected as the tracking mask (TM); otherwise it is expanded and re-tested, i.e. the procedure goes back to step 3 taking the expanded tracking mask as a candidate tracking mask for the new iteration; this process continues with areas of increasing size and is limited by the rectangular area (RECT), which is used if a smaller tracking mask (TM) is not found;
5) as each next camera frame (F’) is obtained, the Region Of Interest (ROI) is defined as the area tracked by the tracking mask (TM).
The procedure of obtaining a tracking mask for an embodiment in which the veins mask is extracted from the current frame is shown in FIG. 2. However, it is not always convenient to extract the veins mask from each frame; therefore, according to some of the embodiments, the veins mask is extracted only if no ROI was defined on the current frame.
The template matching operation used herein according to some embodiments of the invention can be implemented as a commonly known function that slides through the image, compares the overlapped patches of size w×h against the template using the specified method and stores the comparison results. The summation is done over the image patch: x' = 0...w−1, y' = 0...h−1.
According to some of the embodiments, the following formula is used:
R(x, y) = Σ_{x',y'} ((T(x', y') − I(x + x', y + y')) · M(x', y'))²

wherein I denotes the image, T the tracking mask, R the result, and M is another mask used to contour the tracking mask T. According to the invention, alignment of the mask projection is provided by the veins mask transformations which are found by means of one or more artificial agents based on reinforcement learning (RL). The exemplary operating diagram of the process of agent-environment interaction is illustrated on FIG. 3, where a quite simple illustrative embodiment is schematically depicted with only one agent performing the actions according to the environment information received from the camera. The illustrated aspect of the inventive concept may also be called a closed-loop system that provides the proper alignment of the augmentation projection over the real object with the use of an artificial agent based on reinforcement learning (RL).
FIG. 4 provides an operative block-scheme of the agent logic. According to one of the embodiments, the states space for at least some of the artificial agents is defined by a set of values needed to adjust the projected mask over the real veins via an affine transformation: planar translations, rotation on axis perpendicular to the projection plane, and scaling.
These dimensions also represent the action space where the agent can slightly change one value at a time: according to some of the embodiments, those are a one-pixel up translation, a one-degree clockwise rotation, and a one-percent scale increase for each step, wherein a step is the minimal action that the system can do to adjust the projection without recalculating the value of the objective function.
The action cost function is fixed as -1 for each step, which means that every action costs one unit of time. This provides the technical effect of minimizing the number of actions needed to obtain the proper alignment of the projection.
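A minimal sketch of this discrete action space and its fixed step cost; the dictionary layout is an assumption, while the step sizes are the ones given above:

```python
# State: current transform parameters of the projected mask.
ACTIONS = {
    "up":     lambda s: {**s, "ty": s["ty"] - 1},       # one pixel up
    "down":   lambda s: {**s, "ty": s["ty"] + 1},       # one pixel down
    "left":   lambda s: {**s, "tx": s["tx"] - 1},
    "right":  lambda s: {**s, "tx": s["tx"] + 1},
    "cw":     lambda s: {**s, "angle": s["angle"] + 1}, # one degree clockwise
    "ccw":    lambda s: {**s, "angle": s["angle"] - 1},
    "grow":   lambda s: {**s, "scale": s["scale"] * 1.01},  # one percent
    "shrink": lambda s: {**s, "scale": s["scale"] / 1.01},
}
STEP_COST = -1  # every action costs one unit of time

def step(state, action):
    """Apply one minimal action and return (new_state, reward)."""
    return ACTIONS[action](state), STEP_COST
```

The fixed −1 cost per step is what pushes the learned policy towards alignments reached in as few actions as possible.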
The objective function drives the agent towards the proper alignment.
According to some of the embodiments, the video acquisition element of the system is a NIR camera, thus the normal light spectrum is not available to be processed. Solving the above-mentioned problem of finding proper transformations without access to the real state of the system but only to a partial one can be considered as a Partially Observable Markov Decision Process (POMDP). Thus, according to the invention, the objective function (displacement metric) is defined as the lowest possible value of the red channel intensity over the target area Region of Interest (ROI) (the area where the augmentation mask needs to be projected) which was found in the previous steps of veins mask extraction and tracking.
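As a sketch of such an intensity-based displacement metric (the ROI convention and the averaging are our assumptions; OpenCV frames are BGR, so the red channel is index 2):

```python
import numpy as np

def red_channel_metric(frame_bgr, roi):
    """Mean red-channel intensity over the tracked Region of Interest;
    the agent's reward is derived from how this value changes as the
    projected mask moves over the ROI."""
    x, y, w, h = roi
    patch = frame_bgr[y:y + h, x:x + w, 2]   # red channel of the ROI
    return float(patch.mean())
```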
Given the original image and the mask, according to some embodiments it is further considered whether the projected image is shifted from the correct position. Since the projected mask is obtained according to a camera frame, the mask corresponds to the point of view of the acquisition camera which position differs from the projector. At least this difference should be compensated by the transformations of the mask in order to obtain an optimal alignment.
As noted above, nonsignificant movements of the body part (i.e. those not causing substantial changes of the camera image that would result in a failure of ROI definition) can also be taken into account here, and the steps of obtaining another mask are skipped only if the ROI corresponding to the mask is defined on the further camera frame (F’).
The mask transformation is completed via the affine transform matrix. If the set of agent states is s ∈ S, actions a ∈ A, and rewards r_t ∈ R, a state s_t representing a transformation can be written as a column-wise 3 × 3 transformation matrix responsible for the mask shifts against the original image. Also, s_t can be parameterized by 5 parameters: translations {t_x, t_y}, scalings {s_x, s_y} and a rotation angle φ. An example of such a transformation matrix (1) according to one of the embodiments of the invention is given below:
[ s_x·cos φ   −s_y·sin φ   t_x ]
[ s_x·sin φ    s_y·cos φ   t_y ]      (1)
[ 0            0            1  ]
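A state matrix of this kind can be composed in numpy as follows; the composition order (translation applied after rotation/scaling) is an assumption:

```python
import numpy as np

def state_matrix(tx, ty, sx, sy, phi):
    """3x3 homogeneous transform parameterised by translations {tx, ty},
    scalings {sx, sy} and rotation angle phi (radians)."""
    c, s = np.cos(phi), np.sin(phi)
    rot_scale = np.array([[sx * c, -sy * s, 0.0],
                          [sx * s,  sy * c, 0.0],
                          [0.0,     0.0,    1.0]])
    trans = np.array([[1.0, 0.0, tx],
                      [0.0, 1.0, ty],
                      [0.0, 0.0, 1.0]])
    return trans @ rot_scale   # scale/rotate first, then translate
```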
When a veins mask is projected over the target area of the body, the camera registers an image of augmented or mixed reality. The reward function for the one or more agents used according to the invention is based on some parameter of the image that is considered to be indicative of the projection alignment accuracy (as would be understood by a person skilled in the art, different parameters may be used for different embodiments of the invention and their modifications). FIG. 5 illustrates such a mixed reality metric in relation to possible projection positions within the X, Y axes. FIG. 6 shows an example of the mixed reality reward in relation to a particular vertical displacement of the veins mask projection, along with particular images from which some of the shown values were obtained. In view of the wide variety of possible embodiments, the reward function may account for particular properties and special aspects, and different threshold values may be preset to consider the observable alignment proper.
According to some embodiments, the Q-Value family of RL algorithms is used to find proper policies and actions that guide the decision process of the artificial agent at this step. As a non-limiting example of such embodiments, the policy learning process is considered as an RL problem of determining the optimal function in the action-value space, and the Q-value is:
Q*(s_t, a_t) = max_π E[ r_t + γ·r_{t+1} + γ²·r_{t+2} + ... | s_t, a_t, π ]

Taking a discount factor γ ∈ [0, 1], the Q-value function Q_t(s_t, a_t) is maximized over π, a state-action-state associated transition function.
The action-state space is discrete, thus it is possible to predict the optimal action by a plethora of RL algorithms, which is done according to some of the invention embodiments. Examples of such RL algorithms for the proposed method and system are Deep Q Network (DQN) and Proximal Policy Optimization (PPO).
An example of a DQN loss function for the update according to one of the embodiments:

L_i(θ_i) = E_{(s,a,r,s')} [ ( r + γ·max_{a'} Q(s', a'; θ_i⁻) − Q(s, a; θ_i) )² ]

Here γ is the discount factor determining the agent’s horizon, θ_i are the parameters of the Q-network at iteration i, and θ_i⁻ are the network parameters used to compute the target at iteration i (usually those are the parameters from iteration i − 1).
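The target computation behind this loss can be sketched in numpy (array names are illustrative; a full DQN would add a replay memory and the networks themselves):

```python
import numpy as np

def dqn_targets(rewards, next_q, dones, gamma=0.99):
    """Bellman targets r + gamma * max_a' Q(s', a'; theta_i^-);
    `next_q` holds the target network's Q-values, one row per sample,
    and `dones` zeroes the bootstrap term for terminal transitions."""
    return rewards + gamma * (1.0 - dones) * next_q.max(axis=1)

def dqn_loss(q_sa, targets):
    """Mean squared TD error between Q(s, a; theta_i) and the targets."""
    return float(((targets - q_sa) ** 2).mean())
```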
An example of a PPO loss function for the update according to one of the embodiments:

L_t(θ) = Ê_t[ min( r_t(θ)·Â_t, clip(r_t(θ), 1 − ε, 1 + ε)·Â_t ) ]

Here θ stands for the policy parameter, Ê_t is the empirical expectation, Â_t is the estimated advantage at time t, ε is a hyper-parameter, and r_t(θ) is the ratio of the probability under the new and old policies, respectively:

r_t(θ) = π_θ(a_t | s_t) / π_{θ_old}(a_t | s_t)
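The clipped surrogate itself is straightforward to sketch in numpy:

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """min(r_t(theta) * A_t, clip(r_t(theta), 1 - eps, 1 + eps) * A_t),
    averaged over the batch (the empirical expectation E_t)."""
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return float(np.minimum(ratio * advantage, clipped).mean())
```

Clipping prevents any single policy update from moving the probability ratio far from 1, which stabilises the on-line alignment training.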
As it would be appreciated by a person skilled in the art, other RL algorithms can also be used. However, according to the Applicant’s experience over different embodiments, the reward function converges better with the use of Deep Q-network (DQN).
Overall, the benefit of using RL algorithms is that they can infer value functions of states that have not been visited, using function approximation systems such as a Multi-layer Perceptron (MLP) or a Convolutional Neural Network (CNN).
Since according to some of the embodiments a vein image is visible to the NIR camera while the veins mask projection is not, for these embodiments the aligning strategy can only be based on intensity heuristics over the ROI and an RL approach with the above-mentioned constraints that converges to the solution. In particular, according to an embodiment, the optimal alignment is considered to be achieved when each pixel of the ROI has increased its red channel intensity. According to some embodiments, the displacement metric (also called the alignment value) is based on the contrast of the ROI on the further camera frame (F’). Thus, according to an embodiment, an artificial agent is defined to align any misaligned projection over the proper area with the following steps (FIG. 4):
1) definition of the current state of the artificial agent based on one or more images received from the NIR acquisition camera and processed with the aforementioned intensity heuristics;
2) calculation of the displacement metric (alignment value) from observations using algorithms and artificial neural networks, such as a Multi-layer Perceptron (MLP) or a Convolutional Neural Network (CNN) (see FIG. 7);
3) selecting the action that reduces the displacement (increases the alignment) using an optimal policy π* (an association between sensing information and actions that lead to the desired objective) learned via reinforcement learning algorithms, wherein each action influences the position of the projected augmentation through changes in planar translation, rotation or scaling;
4) repeating the process of state determination, displacement (alignment) evaluation and action selection for a plurality of steps until the projected image is on the target area;
5) if the projected image arrives on target, the system continues to monitor the alignment via the measurement of the displacement metric (alignment value): if this metric (value) remains at the minimum achievable, the projection is in the proper position; if the metric increases, it means that the target position has changed (see FIG. 5);
6) if during steps 1 to 5 the target area changes position, orientation or scale, the agent will align the mask with respect to the new target position by repeating steps 1 to 5.
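Steps 5 and 6 amount to a monitoring loop wrapped around the alignment procedure; a sketch follows, with `align_once` standing in for steps 1-4 and `tolerance` for a preset drift threshold (both names are illustrative):

```python
def monitor_and_align(capture, metric, align_once, tolerance):
    """After alignment, keep measuring the displacement metric on new
    frames; if it rises above the achieved minimum by more than
    `tolerance`, the target has moved, so align again (steps 1-6)."""
    best = align_once()                 # steps 1-4: align; returns the metric
    while True:
        value = metric(capture())       # step 5: monitor the alignment
        if value > best + tolerance:    # step 6: target moved
            best = align_once()
        yield value
```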
According to some of the embodiments, multiple agents are used in solving the problem of proper veins mask projection alignment; in this case a multi-agent RL (MARL) problem is to be considered, which addresses the sequential decision-making problem of multiple autonomous agents that operate in a common environment, each of which aims to optimize its own long-term return by interacting with the environment and the other agents. In particular, both the evolution of the system state and the reward received by each agent are influenced by the joint actions of all agents. Each agent has its own long-term reward to optimize, which now becomes a function of the policies of all other agents. According to an embodiment of the invention, a group of agents is defined where each one is configured to perform only a subset of opposite actions present in the action space (translation of one pixel up/down, translation of one pixel left/right, clockwise or counterclockwise rotation of one degree, increase/decrease of scale by one percent). At any step t, the combined execution is interleaved so that only one agent has the opportunity to execute one action influencing the same projection mask. According to one of the embodiments, when an agent has the operative priority, it follows the mixed reality reward function to the local minimum across its available actions and afterwards passes the priority to another agent with a different set of actions. Thus, when all the agents observe the same value of the mixed reality metric and the overall system has reached the global minimum, it is considered that the veins mask is projected over the correct 3D position. The stationary point reached will remain unchanged unless the user/patient moves the arm. In this case the metric used will increase from the previously obtained minimum, allowing the agents ensemble to start the search for the correct projection again.
The agents cooperate in a potential game setting where the common global function (called the potential function) is the mixed reality metric (which for some of the embodiments is the intensity heuristic, as described above).
If the limb motion causes sufficient shift, the agents receive a different mixed reality reward value, thus starting the aligning process again.
According to some embodiments, the operation of said group of multiple agents is controlled by a supervising agent. In different examples of such embodiments, the supervising agent is configured to prioritize the operation of the multiple agents and/or use some overall reward function to control the multiple agents operation.
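The priority-passing scheme over opposite-action subsets can be sketched as a coordinate-descent-style loop; the supervising agent is reduced here to a fixed priority order, and all names are illustrative:

```python
def supervised_align(state, agent_moves, metric):
    """Each agent owns a subset of opposite actions (moves); when it has
    priority it descends the shared mixed-reality metric to a local
    minimum over its moves, then passes priority on.  The loop stops
    when no agent can improve the common metric any further."""
    improved = True
    while improved:
        improved = False
        for moves in agent_moves:       # supervising agent: priority order
            while True:
                best = min((m(state) for m in moves), key=metric)
                if metric(best) >= metric(state):
                    break               # local minimum for this agent
                state, improved = best, True
    return state
```

For a metric that is separable over the agents' action subsets, this interleaved descent reaches the shared minimum exactly.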
As would be appreciated by a person skilled in the art, the particular realization of the invention within the above description is limited only by the performance of the available components, among which the camera and the processing module are the most limiting ones. It would be understood that a low quality of the projection made by the projector used may imply further additional improvements that may be used in combination with the claimed invention.
It is also understood that the offered solution to the MARL problem can also be altered and is not limited by the examples provided herein.
The invention has been described with reference to the preferred embodiments. Modifications and alterations may occur to others upon reading and understanding the preceding detailed description. It is intended that the invention be construed as including all such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

CLAIMS:
1. A method for veins mask projection alignment over the actual veins of a target area of a human or animal body part via reinforcement learning, the method comprising the following steps: projecting the veins mask of a target area of a human or animal body part over said area; taking a further image of the target area with the projected veins mask; processing the further image to determine an alignment value evaluating the projection alignment; keeping the mask intact if the determined alignment value is greater than or equal to a predetermined threshold, otherwise performing the veins mask transformation via one or more artificial agents based on reinforcement learning to obtain an optimal projection alignment; wherein the set of the agents' states is defined by a set of transformation values needed to adjust the projected veins mask over the veins as registered by the last taken image via the affine transformation: planar translations, rotation about an axis perpendicular to the projection plane, scaling; the reward function is based on the alignment value; and the policy function is an optimal policy learned via reinforcement learning algorithms.
2. A method of claim 1, wherein the method further includes: taking an image of a target area of a human or animal body part; obtaining a tracking mask from the last taken image, wherein the tracking mask is a part of the image containing the minimum set of pixels required to localize the veins of the target area on the image; and the processing of the further image to determine an alignment value evaluating the projection alignment includes: processing the further image to define a region of interest, wherein the region of interest is the area inside the further image which best matches the tracking mask; returning to obtaining a veins mask from the NIR image if the region of interest is not defined on the image, otherwise processing the defined ROI to determine an alignment value evaluating the projection alignment.
3. A method of any one of claims 1 to 2, wherein prior to the projecting the method further comprises: obtaining a veins mask from the last taken image, the veins mask containing a contour of the veins of said body part.
4. A method of any one of claims 2 to 3, wherein said obtaining a tracking mask comprises: obtaining a rectangular region from the last taken image, the rectangular region surrounding the veins according to the veins mask; obtaining a contiguous convex area of the veins by performing a convex hull operation over the veins mask; obtaining a candidate tracking mask by filtering the frame with the convex area, thus collecting a convex map of pixels; testing whether the collected convex map of pixels correctly tracks the target area using the template matching function; and if the candidate tracking mask correctly tracks the desired area inside the last taken image, selecting the candidate tracking mask as a tracking mask, otherwise expanding the candidate tracking mask and returning to the testing with the expanded candidate tracking mask as a candidate tracking mask.
5. A method of any one of claims 2 to 4, wherein the processing of the further image to define the region of interest is performed using the template matching function.
6. A method of any one of claims 1 to 5 wherein the images are images of a near-infrared camera.
7. A method of any one of claims 1 to 6, wherein the veins mask is a black-and-white image with veins represented by white.
8. A method of any one of claims 1 to 7, wherein the alignment value is a value indicative of red channel intensity over the region of interest.
9. A method of any one of claims 1 to 7, wherein the alignment value is based on the noise fluctuations of the visible mask projection over the region of interest registered on the last taken image.
10. A method of any one of claims 1 to 9, wherein the reinforcement learning algorithms are Q-learning, Deep Q-Network (DQN), Proximal Policy Optimization (PPO), Hierarchical Reinforcement Learning (HRL), or a combination thereof.
11. A method of any one of claims 1 to 10, wherein said performing the veins mask transformation via one or more artificial agents based on reinforcement learning to obtain an optimal projection alignment includes: defining the current state of the one or more artificial agents based on one or more last taken images of the target area; calculating the alignment value using artificial neural networks; selecting the action that increases the alignment value using an optimal policy learned via reinforcement learning algorithms, wherein each action influences the position of the projected augmentation through changes in planar translation, rotation or scaling; repeating the previous steps of state defining, alignment value calculating and action selecting until the optimal projection alignment value is achieved.
12. A method of any one of claims 1 to 11, wherein said veins mask transformation is performed via a group of multiple artificial agents based on reinforcement learning.
13. A method of claim 12, wherein each agent is configured to perform only a subset of opposite actions present in the action space, and the agents' operation is controlled by a supervising agent.
14. A system for veins mask projection alignment over the actual veins of a target area of a human or animal body part, the system comprising: a camera configured to take images of the target area, a visible light projector configured to project an input image on the target area, a processing module functionally coupled with the camera and the projector and configured to: provide the projector with a veins mask of the target area to be projected over said area; process the further image of the target area with the projected veins mask from the camera to determine an alignment value evaluating the projection alignment; keep the mask intact if the determined alignment value is greater than or equal to a predetermined threshold, otherwise perform the veins mask transformation via one or more artificial agents based on reinforcement learning to obtain an optimal projection alignment; wherein the set of the agents' states is defined by a set of transformation values needed to adjust the projected veins mask over the veins as registered by the last image from the camera via the affine transformation: planar translations, rotation about an axis perpendicular to the projection plane, scaling; the reward function is based on the alignment value; and the policy function is an optimal policy learned via reinforcement learning algorithms.
15. The system for veins mask projection alignment according to claim 14, wherein the processing module is further configured to: obtain a tracking mask from the last image taken by the camera, wherein the tracking mask is a part of the image containing the minimum set of pixels required to localize the veins of target area on the image.
16. The system for veins mask projection alignment according to claim 15, wherein the processing module is further configured to: process the further image of a target area with the projected veins mask from the camera to define a region of interest, wherein the region of interest is the area inside the further image which best matches the tracking mask; return to obtaining a veins mask from the NIR image if the region of interest is not defined on the image, otherwise process the defined ROI to determine an alignment value evaluating the projection alignment.
17. The system for veins mask projection alignment according to any one of claims 14 to 16, wherein the processing module is further configured to obtain a veins mask from the last image taken by the camera, the veins mask containing a contour of the veins of said body part.
18. The system for veins mask projection alignment according to any one of claims 14 to 17, wherein the processing module is further configured to: obtain a rectangular region from the last taken image, the rectangular region surrounding the veins according to the veins mask; obtain a contiguous convex area of the veins by performing a convex hull operation over the veins mask; obtain a candidate tracking mask by filtering the frame with the convex area, thus collecting a convex map of pixels; test whether the collected convex map of pixels correctly tracks the target area using the template matching function; and if the candidate tracking mask correctly tracks the desired area inside the last taken image, select the candidate tracking mask as a tracking mask, otherwise expand the candidate tracking mask and return to the testing step with the expanded candidate tracking mask as a candidate tracking mask.
19. The system for veins mask projection alignment according to any one of claims 16 to 18, wherein the processing module is configured to use a template matching function to process the further image from the camera to define the region of interest.
20. The system for veins mask projection alignment according to any one of claims 14 to 19, wherein the camera is a near-infrared camera.
21. The system for veins mask projection alignment according to any one of claims 14 to 20, wherein the veins mask is a black-and-white image with veins represented by white.
22. The system for veins mask projection alignment according to any one of claims 14 to 21, wherein the alignment value is a value indicative of red channel intensity over the region of interest.
23. The system for veins mask projection alignment according to any one of claims 14 to 21, wherein the alignment value is based on the noise fluctuations of the visible mask projection over the region of interest registered on the last taken image.
24. The system for veins mask projection alignment according to any one of claims 14 to 23, wherein the reinforcement learning algorithms are Q-learning, Deep Q-Network (DQN), Proximal Policy Optimization (PPO), Hierarchical Reinforcement Learning (HRL), or a combination thereof.
25. The system for veins mask projection alignment according to any one of claims 14 to 24, wherein said veins mask transformation is performed via a group of multiple artificial agents based on reinforcement learning.
26. The system for veins mask projection alignment according to claim 25, wherein each agent is configured to perform only a subset of opposite actions present in the action space, and the agents' operation is controlled by a supervising agent.
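For illustration, the threshold decision of claim 1 combined with the red-channel intensity alignment value of claim 8 might be sketched as follows. The array shapes, the synthetic frame, and the threshold of 128 are assumptions introduced for the example only, not values from the patent.

```python
import numpy as np

def alignment_value(frame_rgb, roi):
    """Mean red-channel intensity over the region of interest.
    frame_rgb: HxWx3 uint8 image of the target area with the projected
    mask; roi: boolean HxW mask of the region of interest."""
    red = frame_rgb[..., 0].astype(float)
    return float(red[roi].mean())

def needs_transformation(frame_rgb, roi, threshold=128.0):
    """Decision step of claim 1: keep the mask intact if the alignment
    value is greater than or equal to the threshold; otherwise hand
    control to the RL agents to transform the projected mask."""
    return alignment_value(frame_rgb, roi) < threshold

# Synthetic frame: the red projection fills the ROI, so the alignment
# value is high and no transformation is triggered.
frame = np.zeros((8, 8, 3), dtype=np.uint8)
roi = np.zeros((8, 8), dtype=bool)
roi[2:6, 2:6] = True
frame[2:6, 2:6, 0] = 200
```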
PCT/RU2021/050188 2020-06-29 2021-06-29 Veins mask projection alignment WO2022005337A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202063045386P 2020-06-29 2020-06-29
US63/045,386 2020-06-29

Publications (1)

Publication Number Publication Date
WO2022005337A1 (en) 2022-01-06

Family

ID=79317809

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/RU2021/050188 WO2022005337A1 (en) 2020-06-29 2021-06-29 Veins mask projection alignment

Country Status (1)

Country Link
WO (1) WO2022005337A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150332459A1 (en) * 2012-12-18 2015-11-19 Koninklijke Philips N.V. Scanning device and method for positioning a scanning device
CN106056041A (en) * 2016-05-18 2016-10-26 天津工业大学 Near-infrared palm vein image identification method
US20170337682A1 (en) * 2016-05-18 2017-11-23 Siemens Healthcare Gmbh Method and System for Image Registration Using an Intelligent Artificial Agent



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21833162; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21833162; Country of ref document: EP; Kind code of ref document: A1)