GB2549068A - Image adjustment - Google Patents

Image adjustment

Info

Publication number
GB2549068A
Authority
GB
United Kingdom
Prior art keywords
clique
node
possible pixel
unary
potential
Prior art date
Legal status
Granted
Application number
GB1604792.0A
Other versions
GB201604792D0 (en)
GB2549068B (en)
Inventor
Zach Christopher
Current Assignee
Toshiba Europe Ltd
Original Assignee
Toshiba Research Europe Ltd
Priority date
Filing date
Publication date
Application filed by Toshiba Research Europe Ltd filed Critical Toshiba Research Europe Ltd
Priority to GB1604792.0A (granted as GB2549068B)
Publication of GB201604792D0
Priority to JP2017029719A (JP6448680B2)
Priority to US15/455,849 (US10078886B2)
Publication of GB2549068A
Application granted
Publication of GB2549068B
Legal status: Active
Anticipated expiration


Classifications

    • G06T5/70
    • G06T7/162 Segmentation; Edge detection involving graph-based methods
    • G06T7/593 Depth or shape recovery from multiple images from stereo images
    • G06T1/00 General purpose image data processing
    • G06T5/00 Image enhancement or restoration
    • G06T7/11 Region-based segmentation
    • G06T7/143 Segmentation; Edge detection involving probabilistic approaches, e.g. Markov random field [MRF] modelling
    • G06T2207/10012 Stereo images
    • G06T2207/10024 Color image
    • G06T2207/10072 Tomographic images
    • G06T2207/20072 Graph-based image processing

Abstract

Image processing comprises associating each pixel of an input image with a node of a graph 612. Each possible pixel label is associated with a unary potential of assigning that label to a given node, and a clique potential is associated with each permutation of pixel-label assignments to the nodes of a clique. Messages are initialised between each pair of nodes of each clique 614. For each node, the method determines a set of possible pixel labels for which the associated unary potential is known 618; computes a unary potential of a possible pixel label for which the unary potential was unknown and updates the determined set accordingly 620-624; adjusts, based on the known unary potentials, the clique potentials associated with each clique to which the node belongs 626; and, based on the adjusted clique potentials, adjusts a message between the node and the other nodes of each clique containing the node 628. This per-node process is repeated until a convergence criterion is met. The adjusted messages determine pixel labels for each node, which produce an image adjustment such as segmentation, stereo correspondence, denoising or zoom 634. Calculating unary potentials lazily during the message-passing belief propagation algorithm saves memory.

Description

Image Adjustment
Field
This disclosure relates to image processing. In particular, but without limitation, this disclosure relates to a method of adjusting an image using message passing.
Background
As computer vision and image processing systems become more complex, it is increasingly important to build models in a way that makes it possible to manage this complexity.
Maximum a-posteriori (MAP) inference in graphical models, and especially in random fields defined over image domains, is one of the most useful tools in computer vision and related fields. If all potentials defining the objective are of parametric shape, then in certain cases non-linear optimization is the method of choice for best efficiency. On the other hand, if the potentials are not of a parametric shape, then methods such as loopy belief propagation (BP) or its convex variants are the method of choice. BP and related algorithms face two limitations if the state space is large: first, the intrinsic message passing step requires at least linear time in terms of the state space size, and it is superlinear in general. Thus, the runtime of these methods does not scale well with the state space size. Second, the memory consumption grows linearly with the state space size, since belief propagation requires the maintenance of messages for each state.
If the state space is huge, then even optimizing non-parametric unary potentials (usually referred to as data terms) by explicit enumeration may be computationally too expensive for many applications (e.g. when implemented on embedded devices). Certain data terms allow more efficient computation via integral images or running sums, and data terms need not always be computed to full precision, but these methods are only suitable for very specific problem instances.
The present invention seeks to provide improved methods and systems for adjusting an image.
Summary
Aspects and features of an invention are set out in the appended claims.
Brief description of the drawings
Illustrative embodiments of the present disclosure will now be described, by way of example only, with reference to the drawings. In the drawings:
Fig. 1 shows the architecture of an example apparatus or device;
Figs. 2a, 2b and 2c show example graphs comprising four nodes;
Fig. 3 shows the evolution of dual energies with respect to the number of passes of the image for a dense stereo correspondence problem;
Figs. 4a and 4b show converged disparity maps;
Fig. 5 is a graphical plot showing the primal energy evolution with respect to number of passes over an image using different traversal schedules and weights according to a method disclosed herein;
Fig. 6 is a flowchart illustrating a method of adjusting an image;
Fig. 7 is a flowchart illustrating an optional process that may be integrated with the process of Fig. 6;
Fig. 8 shows graphical plots depicting the evolution of primal energies for dense disparity estimation with respect to wall time;
Fig. 9 is a visual illustration of converging stereo results;
Fig. 10 shows the evolution of primal energies for dense optical flow;
Fig. 11 is a visual illustration of the convergence of the optical flow field for the “Schefflera” dataset; and
Fig. 12 is a visual illustration of converging optical flow fields for various datasets.
Throughout the description and the drawings, like reference numerals refer to like parts.
Detailed description
A method for adjusting an image using message passing comprises associating each pixel of an image with a node of a graph and one or more cliques of nodes, determining for a node of the graph a respective set of possible pixel labels for which a unary potential is known, computing for that node a unary potential of a possible pixel label for which the unary potential is unknown, adjusting a clique potential associated with each clique to which that node belongs based on the unary potentials, and adjusting, based on the adjusted clique potential associated with each clique to which that node belongs, at least one of the messages between that node and the other nodes of each clique. Once a convergence criterion is met, an adjusted image is produced having pixel labels determined from the adjusted messages.
A computer implemented method for producing an adjusted image using message passing is provided. The method comprises receiving an input image that comprises a plurality of pixels. Each pixel has an input value and each input value corresponds to one of a number of possible pixel labels. The method further comprises associating each pixel with a node of a graph and one or more cliques of nodes.
Each possible pixel label is associated with a unary potential, and, for each permutation of possible pixel labels of the pixels associated with the nodes of each clique, a clique potential is associated. The method further comprises, for each clique, initialising at least one message between each pair of nodes of that clique.
The method further comprises, for each node, a) determining, for that node, a respective set of possible pixel labels for which the unary potential is known, b) computing a unary potential of a selected possible pixel label for which the unary potential is unknown and updating the respective set of possible pixel labels to include the selected possible pixel label, c) for each clique to which that node belongs, based on the unary potentials associated with the updated respective set of possible pixel labels, adjusting the clique potential associated with that clique, and, d) for each clique to which that node belongs, adjusting, based on the adjusted clique potential associated with that clique, at least one of the messages between that node and the other nodes of that clique. For each node, steps a)-d) are repeated until a convergence criterion is met.
The method further comprises producing an adjusted image having pixel labels determined from the adjusted messages.
A unary potential is a mapping from a pair to a real value, the pair comprising a pixel and a label for that pixel. This mapping or function may be based on the image values near the pixel of interest. A unary potential can be thought of as a cost associated with assigning a pixel label or state or value to a pixel (node). A clique potential of a clique is a cost associated with assigning pixel labels to the nodes of that clique. Accordingly, in an optimisation problem in image processing, the unary potentials and clique potentials may act as a measure of how close a solution of the problem is to an optimal solution. One may therefore attempt to find an optimal solution by finding a minimum sum of unary potentials and clique potentials, i.e. the assignment of pixel labels to pixels for which there is a minimal cost. A unary potential may be defined based on the image processing problem to be solved. For example, in some circumstances the unary potentials may all comprise constant values for a respective possible pixel label. In other circumstances, the unary potential may be defined by a polynomial function. Any suitable unary potential may be used and examples will be clear from the examples below.
In some situations, such as when the pixel labels are to be used to identify disparities or flow vectors, the unary potential for a particular pixel is a function comparing an image patch centred at that particular pixel in the reference image with an image patch from a second (“moving”) image centred at the particular pixel shifted by the label value.
In some situations, such as when the pixel labels are to be used to identify an object category, the unary potential is a function that has the image patch centred at the particular pixel and an object category as arguments. The unary potential would therefore represent a cost associated with assigning the pixel to an object category. This function can be trained beforehand using, for example, machine learning techniques.
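By way of illustration, the two flavours of unary potential just described might be sketched as follows. This is a minimal sketch assuming grayscale images stored as 2-D numpy arrays; the function names, patch size, classifier-score layout and the absence of bounds handling are assumptions made for the example, not features prescribed by the method.

    import numpy as np

    def matching_unary(reference, moving, x, y, label, half=2):
        """Correspondence-style unary: cost of assigning shift `label` to pixel (x, y),
        obtained by comparing the patch around (x, y) in the reference image with the
        patch around (x - label, y) in the moving image (bounds checks omitted)."""
        ref = reference[y - half:y + half + 1, x - half:x + half + 1].astype(float)
        mov = moving[y - half:y + half + 1, x - label - half:x - label + half + 1].astype(float)
        return float(np.abs(ref - mov).mean())

    def category_unary(classifier_scores, x, y, label):
        """Recognition-style unary: cost of assigning object category `label` to pixel
        (x, y), read from per-pixel scores of a previously trained classifier with
        shape (height, width, num_categories); lower cost means a more likely category."""
        return float(-classifier_scores[y, x, label])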
The methods disclosed herein demonstrate a memory-efficient and fast way of performing image processing operations such as object segmentation, distortion correction, deblurring and zooming. By calculating a unary potential for a selected possible pixel label in each iteration of steps a) to d), not all of the unary potentials necessarily need to be calculated for the method to converge on a solution to an image processing problem. Accordingly, the method requires less memory to be able to converge on a solution to an image processing problem, while maintaining a high level of accuracy.
The method may further comprise, after step a) and before step c), updating the respective set of possible pixel labels by removing one of the possible pixel labels therefrom. Optionally, for each clique to which the node belongs, the method uses an associated parameter related to messages passed between the node and other nodes of that clique. The parameter can be dependent on the possible pixel labels for that node. The one of the possible pixel labels that is removed from the respective set of possible pixel labels may be the possible pixel label of the respective set of possible pixel labels for which the sum of the associated parameters is greatest.
At step c), adjusting the clique potential associated with that clique may comprise adjusting the clique potential based on a weighted sum of the unary potentials associated with the respective set of possible pixel labels. The weighting may be an even or uneven weighting. Adjusting the clique potential associated with that clique may further comprise adding a constant value to the weighted sum for each possible pixel label which is not included in the respective set of possible pixel labels.
At step b), the selected possible pixel label may be selected from a candidate set of possible pixel labels. The candidate set of possible pixel labels may be based on an estimated set of possible pixel labels of an adjacent node for which the unary potential is known.
Repeating steps a)-d) for each node until a convergence criterion is met may comprise repeating steps a)-d) a predetermined number of times. Repeating steps a)-d) for each node until a convergence criterion is met may comprise repeating steps a)-d) until repetition does not change the messages.
At step b), computing a unary potential of a selected possible pixel label may comprise estimating a unary potential of the selected possible pixel label based on a unary potential associated with a possible pixel label of an adjacent node in the graph. For example, if a possible pixel label of an adjacent node is associated with a particular unary potential (for example a constant value such as 1), then computing the unary potential of the selected possible pixel label may comprise assigning the particular unary potential (the constant value) to the selected possible pixel label. Such an assignment may indicate that adjacent nodes are likely to take the same value, for example by indicating that two adjacent pixels are associated with the same surface depicted in an image.
At step d), adjusting at least one of the messages may comprise adjusting the at least one of the messages according to one or more predetermined message weightings. For example, the messages may be evenly weighted. The messages may be unevenly weighted.
The unary potentials may be configured so as to cause the method to perform a segmentation operation upon the input image. The unary potentials may be configured so as to cause the method to perform a distortion correction operation upon the input image. The unary potentials may be configured so as to cause the method to perform an optical flow operation upon the input image. The unary potentials may be configured so as to cause the method to perform a dense stereo correspondence operation upon the input image. The unary potentials may be configured so as to cause the method to perform an image denoising operation upon the input image. The unary potentials may be configured so as to cause the method to perform a deblurring and/or zooming operation upon the input image. Furthermore the clique potentials may be initially configured so as to cause the method to perform one or more of said operations upon the input image.
An apparatus or system is provided, the apparatus or system arranged to perform a method as disclosed herein. The apparatus or system may comprise input/output means arranged to receive an image. The apparatus or system may comprise a memory storing instructions for causing a processor to perform steps of a method as disclosed herein. The apparatus or system may comprise a processor for performing steps of a method as disclosed herein. A non-transitory computer-readable medium is provided. The computer-readable medium comprises machine-readable instructions arranged, upon execution by one or more processors, to cause the one or more processors to carry out a method as disclosed herein.
Further optional features will be appreciated from the following description.
Fig. 1 shows the architecture of an example apparatus or device 100 for performing the methods described herein. The apparatus or device 100 comprises a processor 110, a memory 115, and a display 135. These are connected to an optional central bus structure, the display 135 being connected via a display adaptor 130. The example apparatus or device 100 also comprises an input device 125 (such as a mouse and/or keyboard) and a communications adaptor 105 for connecting the apparatus or device to other apparatuses, devices or networks. The input device 125 and communications adaptor 105 are also connected to the central bus structure, the input device 125 being connected via an input device adaptor 120. The example apparatus or device 100 also comprises an output device 145 (such as a supplementary display device and/or a sound system). The output device 145 is connected to the central bus structure via an output device adaptor 140.
In operation the processor 110 can execute computer-executable instructions stored in the memory 115 and the results of the processing can be displayed to a user on the display 135. User inputs for controlling the operation of the computer may be received via input device(s) 125. Results of at least part of the processing can also be outputted via output device(s) 145.
MAP Inference and Graphical Models
One of the first steps in creating a computer vision system is the establishment of the overall computational paradigm that will be used to compute the final solution. One of the most flexible ways to implement a solution is through the combination of an energy function and maximum a-posteriori (MAP) inference.
The MAP inference strategy begins with the definition of a conditional probability distribution p(X|Y), where X is a vector of random variables estimated from observations Y. In MAP inference, the actual estimate X* is found by finding the vector X* that maximises p(X|Y).
The connection with energy functions can be seen by expressing p(X|Y) as a Gibbs distribution:

p(X|Y) = (1/Z) exp( − Σ_c E(X_c; Y) ),

where E(X_c; Y) denotes an energy function over a set X_c of elements of X. Accordingly, the sum in the exponent is a sum over different sets of elements of X. The structure of these sets is characteristic of the model used, as will be discussed below.
The constant Z is a normalisation constant that ensures that p(X|Y) is a valid probability distribution, and accordingly is not usually important for finding X*. It can therefore be seen that to perform MAP inference, i.e. to maximise p(X|Y), one must find the vector X* that minimises the total energy Σ_c E(X_c; Y).
The next step in MAP inference for image adjustment is to decide on the form of the distribution p(X|Y), and here it is useful to use a graphical model. A graph G is defined by a pair of sets, G = (V, E), with nodes, or vertices, s belonging to the set V of vertices and edges e belonging to the set E of edges. Each pixel of a received image may be represented as a node, or vertex, on a graph. Relationships between different pixels of the received image may be represented by edges between nodes on the graph.
For example, assume that the vector X represents an image comprising 4 pixels, with each pixel able to take on one of 256 values or labels. If one were to specify just p(X) and were to ignore the vector Y describing observations, then one would need to determine 256^4 values in order to account for every possible interaction between pixels. This is shown in Fig. 2A, which shows a graph comprising four nodes, each node representing a pixel of a received image. In the graph, node A is positioned next to node B and node C is positioned next to node D. Nodes A and B are positioned above nodes C and D respectively. As there are no edges in the graph of Fig. 2A, no interrelationships between pixels are accounted for or modelled.
If, instead, one were to model some of the interactions between nodes, for example the interactions between horizontal and vertical neighbours, then the number of required values is reduced. In Fig. 2B, the interactions between node pairs A & B, A & C, C & D, and B & D are modelled and so only 4 × 256^2 values need be considered to specify the distribution. The reduction is due to the fact that relationships between diagonal neighbours are captured indirectly by horizontal and vertical neighbours. For example, the interaction between B & C is not specified (there is no edge between nodes B & C on the graph of Fig. 2B) but the interaction between B & C is indirectly captured via the interactions between node pairs A & C and A & B, and node pairs C & D and B & D.
Fig. 2C shows another graph of four pixels A, B, C & D, in which the interactions between all nodes are explicitly modelled by edges. The model in Fig. 2C is more descriptive than the model in Fig. 2B. A clique is a subset of nodes (vertices) of a graph that are completely connected, i.e. every distinct node of a clique is adjacent to every other distinct node of the clique. For example, in Fig. 2B, node A belongs to a clique comprising nodes A & B as nodes A and B are adjacent. Node A also belongs to a clique comprising nodes A & C as nodes A and C are adjacent. However, in Fig. 2B, there is no clique comprising nodes A & D as nodes A & D are not adjacent - they are not connected by an edge.
In Fig. 2C, node A belongs to a clique comprising nodes A & B and a clique comprising nodes A & C. However, in Fig. 2C, node A also belongs to a clique comprising nodes A & D as nodes A & D are adjacent (there is an edge connecting nodes A & D). Additionally, node A belongs to a clique comprising nodes A & B & C, a clique comprising nodes A & B & D, a clique comprising nodes A & C & D, and a clique comprising nodes A & B & C & D.
The cliques loosely capture the direct interactions between nodes. For example, if node A of Fig. 2B is assigned a particular value or label, then such an assignment would have a direct effect on node B and node C. Accordingly, cliques are related to the model used for modelling the computer vision problem.
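For the 4-connected image grids used in the examples below, the pairwise cliques can be enumerated directly. The following sketch assumes row-major node numbering; the function name and clique representation (pairs of node indices) are illustrative choices, not part of the patented method.

    def grid_cliques(height, width):
        """Pairwise cliques of a 4-connected image grid: each pixel is a node and each
        horizontal or vertical neighbour pair forms a clique (as in Fig. 2B)."""
        node = lambda x, y: y * width + x
        cliques = []
        for y in range(height):
            for x in range(width):
                if x + 1 < width:
                    cliques.append((node(x, y), node(x + 1, y)))
                if y + 1 < height:
                    cliques.append((node(x, y), node(x, y + 1)))
        return cliques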
In view of the above, a labelling or MAP inference problem is determining the optimal label x_s assigned at each node s ∈ V, where the objective is over unary terms and clique terms (where cliques are indexed by Greek letters α etc.),

min_x Σ_{s∈V} θ_s(x_s) + Σ_α θ_α(x_α),   (Eq. 1)

where x = (x_s)_{s∈V}, and x_α is the subvector of x comprising the labels assigned at the nodes belonging to the clique, x_α = (x_s)_{s∈α}.
The symbol θ_s(x_s) represents a unary potential, or cost, associated with assigning the label x_s to the node s. In other words, the potential θ_s(x_s) is the “energy” associated with assigning the label, or state, x_s to node s. The symbol θ_α(x_α) represents a clique potential, or cost, associated with assigning particular labels, or states, to the nodes of clique α. In other words, the potential θ_α(x_α) is the “energy” associated with assigning labels x_α to the nodes of clique α and can be thought of as an energy term associated with interactions between nodes.
The label assignment problem of (Eq. 1) is generally intractable to solve, and one highly successful approach to approximately solve this problem is to employ the corresponding linear programming (LP) relaxation,

E_MAP(b | θ):  min_b Σ_s Σ_{x_s} θ_s(x_s) b_s(x_s) + Σ_α Σ_{x_α} θ_α(x_α) b_α(x_α)
subject to  Σ_{x_α ~ x_s} b_α(x_α) = b_s(x_s) for all α, s ∈ α and x_s,  and  Σ_{x_s} b_s(x_s) = 1 for all s,   (Eq. 2)

where E_MAP is the energy function to be minimised. In (Eq. 2), b ≥ 0.

The expression x_α ~ x_s is shorthand for {x_α : (x_α)_s = x_s}. In what follows α ∋ s is written instead of {α : s ∈ α}. The unknowns {b_s(x_s)} and {b_α(x_α)} are “one-hot” encodings of the assigned labels, e.g. if b* is the optimal solution of E_MAP and the relaxation is tight, then b_s(x_s) is ideally 1 if and only if state x_s is the optimal label at node s and 0 otherwise (similarly for clique states x_α). The first set of constraints is usually called the set of marginalization constraints, and the unit sum constraint is typically referred to as a normalization constraint. The linear program in (Eq. 2) is not unique, since redundant non-negativity and normalization constraints can be added to E_MAP without affecting the optimal solution or value. Consequently, different duals are solved in the literature. The particular LP dual of E_MAP, which will be used in the remainder, is given by
E_DUAL(λ | θ):  max_{λ,ρ} Σ_{s∈V} ρ_s
subject to  ρ_s ≤ θ_s(x_s) + Σ_{α∋s} λ_{α→s}(x_s) for all s and x_s,  and  Σ_{s∈α} λ_{α→s}((x_α)_s) ≤ θ_α(x_α) for all α and x_α.   (Eq. 3)

The first set of constraints will be referred to as the balance constraints and the second set of constraints is termed the capacity constraints. Since the unknowns ρ_s play only the role of auxiliary variables, they are dropped as arguments to E_DUAL to simplify the notation. Without loss of generality one requires θ ≥ 0 (pointwise), such that λ ≡ 0 is dual feasible.
Even if (Eq. 3) is a convex problem (a linear program), optimizing E_DUAL is not straightforward. Generic LP codes do not exploit the very particular structure of the problem, and first order methods exhibit slow convergence in practice due to the non-smooth objective. A successful class of algorithms to solve E_DUAL approximately is based on block coordinate ascent, which performs repeated optimization over small but varying subsets of unknowns. Different algorithms are obtained by different choices of dual energies and subsets of optimized unknowns. One important aspect for the success of these algorithms is that the subproblems can be solved efficiently, for example in closed form. These algorithms usually resemble the classical belief propagation algorithm (which has few guarantees if run on cyclic graphs) and fall under the umbrella term convex belief propagation. These algorithms have in common that the dual objective improves monotonically in each iteration, which renders them convergent (under the mild assumption that the optimal value of E_DUAL is finite, i.e. the problem is bounded).
Dual coordinate ascent
In this section a convex belief propagation algorithm is provided which optimizes over all variables λ_{α→s}(·) for all cliques α containing s in each step (i.e. all messages incoming at node s). The convex belief propagation described in this section may be used with any of the methods of adjusting an image using message passing described herein, as will be demonstrated below. It is further shown that this convex belief propagation algorithm is equivalent to optimizing over both incoming and outgoing messages, i.e. block coordinate ascent is performed on a full star-like subgraph. Further, it will be shown that there are additional tuning weights in the algorithm which, depending on the node traversal schedule, have a substantial impact on the observed performance.
If one considers a particular node s and fixes all unknowns other than ρ_s and λ_{α→s}(·) for α ∋ s, the subproblem induced by (Eq. 3) reads as

max_{ρ_s, λ} ρ_s  subject to  ρ_s ≤ θ_s(x_s) + Σ_{α∋s} λ_{α→s}(x_s)  and  λ_{α→s}(x_s) ≤ μ_{α→s}(x_s)  for all α ∋ s and all x_s.   (Eq. 4)

The right hand sides of the inequality constraints are defined as

μ_{α→s}(x_s) := min_{x_α ~ x_s} ( θ_α(x_α) − Σ_{t∈α\s} λ_{α→t}((x_α)_t) ).   (Eq. 5)

Non-negative weights w_{α→s} are introduced such that Σ_{α∋s} w_{α→s} = 1 (but otherwise chosen arbitrarily), and the ansatz

λ_{α→s}(x_s) = μ_{α→s}(x_s) − w_{α→s} ν_s(x_s)

is used for some ν_s(x_s) ≥ 0 to obtain the problem equivalent to (Eq. 4),

max_{ρ_s, ν_s} ρ_s  subject to  ρ_s ≤ θ_s(x_s) + Σ_{α∋s} μ_{α→s}(x_s) − ν_s(x_s)  and  ν_s(x_s) ≥ 0  for all x_s.   (Eq. 6)

The choice of these weights and their impact on the convergence rate will be discussed later in this section. Since w_{α→s} > 0, the largest allowed value for ρ_s is given by

ρ_s = min_{x_s} v_s(x_s),  where  v_s(x_s) := θ_s(x_s) + Σ_{α∋s} μ_{α→s}(x_s),   (Eq. 7)

and ν_s(x_s) and λ_{α→s}(x_s) are consequently given by ν_s(x_s) = v_s(x_s) − ρ_s and λ_{α→s}(x_s) = μ_{α→s}(x_s) − w_{α→s} (v_s(x_s) − ρ_s).
Via complementary slackness it is easy to see that if λ̂ is dual optimal, then v_s(x_s) > ρ_s implies b_s(x_s) = 0 in the primal solution of E_MAP. Algorithm 1 summarizes this convex BP method.
Algorithm 1: Node-based message passing
Require: arbitrary feasible λ and ρ, weights w_{α→s}
1: while not converged do
2:   loop over s ∈ V and assign, for all α ∋ s and all x_s:
3:     Node update: μ_{α→s}(x_s) as in (Eq. 5), ρ_s = min_{x_s} v_s(x_s) as in (Eq. 7), and λ_{α→s}(x_s) = μ_{α→s}(x_s) − w_{α→s} (v_s(x_s) − ρ_s)
4:   end loop
5: end while
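For pairwise cliques, a node update in the spirit of Algorithm 1 can be sketched as follows. The concrete update (computing μ_{α→s} as in (Eq. 5), taking ρ_s = min_{x_s} v_s(x_s), and redistributing the slack with the weights) is a reconstruction consistent with the description above rather than a verbatim statement of the patented update; the data structures and names are illustrative assumptions.

    import numpy as np

    def node_update(s, unary, neighbours, pair_pot, lam, weights):
        """One weighted node update for node s, pairwise cliques only.

        Conventions (illustrative, not prescribed by the patent): lam[(a, b)] is the
        message from clique {a, b} into node b, indexed by the labels of b;
        weights[(t, s)] are non-negative and sum to one over the neighbours t of s;
        pair_pot(s, t) returns the clique potential as a matrix indexed [x_s, x_t].
        Returns the updated dual contribution rho_s."""
        mu = {}
        for t in neighbours[s]:
            # mu_{st->s}(x_s) = min_{x_t} ( theta_st(x_s, x_t) - lambda_{st->t}(x_t) )   (cf. Eq. 5)
            mu[t] = (pair_pot(s, t) - lam[(s, t)][None, :]).min(axis=1)
        v = unary[s] + sum(mu.values())        # reparametrized unary v_s(x_s)   (cf. Eq. 7)
        rho = v.min()                          # largest feasible dual contribution rho_s
        for t in neighbours[s]:
            # spread the slack v_s(x_s) - rho_s over the incoming messages using the weights
            lam[(t, s)] = mu[t] - weights[(t, s)] * (v - rho)
        return float(rho)

Summing the returned values ρ_s over a full pass gives the current dual objective, and repeating passes according to a traversal schedule reproduces the behaviour of Algorithm 1.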
Before discussing the impact of the choice of weights the following is stated:
Result 1. Algorithm 1 is equivalent to performing dual coordinate ascent with respect to both the incoming messages λ_{α→s} and the reverse messages λ_{α→t}, t ∈ α \ s. This means that for a fixed node s messages are updated for all cliques α containing s.
Proof of Result 1: Update λ_{α→s} and λ_{α→t} for a node s and its neighbouring nodes t (i.e. nodes sharing a clique with s), i.e. maximise the corresponding restriction of (Eq. 3), where θ̂_t(x_t) is the reparametrized unary potential θ̂_t(x_t) = θ_t(x_t) + Σ_{β∋t, β≠α} λ_{β→t}(x_t). Observe that one can replace each λ_{α→t} by its updated value and adjust ρ_t accordingly (for all t ∈ α \ s) without changing the objective (or violating constraints). Thus, it is possible to assume λ_{α→t} ≡ 0 without loss of generality. Plugging this into the maximization problem yields a subproblem of the same form as (Eq. 4). This variant of dual coordinate ascent is therefore not stronger than optimizing solely over the incoming messages λ_{α→s}: as one has the freedom to arbitrarily assign values to all ρ_t, the values λ_{α→t} can be kept constant, i.e. one does not need to maximize with respect to λ_{α→t} at all. In this case λ_{α→t} is fixed to its old value, and maximization is performed only with respect to ρ_s and λ_{α→s}; thus optimizing with respect to both messages is equivalent to the original method in Algorithm 1.
The block coordinate method in Algorithm 1 optimizes over fewer unknowns than, for example, the tree block-coordinate method, but updates a larger set of unknowns than min-sum diffusion or MPLP.
Stable points: Recall that a block-coordinate method applied to a (not strictly) convex (or concave) problem is only guaranteed to monotonically improve the objective, but does not necessarily achieve an optimal solution as its fixed point. Convergence to a fixed point follows from monotonicity. Reasoning about fixed points is actually too restrictive, since the dual objective will usually remain constant long before λ reaches a fixed point. Hence we relax the fixed point condition and will introduce stable points shortly. Another slight complication arises from the fact that the assignment of contributions ρ_s to the dual objective value E_DUAL = Σ_s ρ_s is not unique: one can arbitrarily shift quantities between nodes by adjusting the messages without changing the objective or the active constraints (i.e. without changing the primal solution). Fortunately, this ambiguity is fixed by updating only ρ_s for a single node s in each node update step in Algorithm 1 (i.e. the freedom to modify ρ_t at neighbouring nodes t is not used), and one can define stable points:
Definition 1. λ is called a stable point for E_DUAL(· | θ) if the following condition is met:

for every node s there exists a state x_s such that λ_{α→s}(x_s) = μ_{α→s}(x_s) for all α ∋ s,   (Eq. 8)

(using the definition of μ_{α→s} as in (Eq. 5)).
In other words, λ is stable if for all nodes there exists a state with all capacity constraints being active. In some sense stable points are fixed points for node updates:
Result 2. If λ is a stable point, running Algorithm 1 (with any traversal schedule for nodes) will not improve the dual objective E_DUAL(λ | θ).
Before sketching the proof, the notion of active states is introduced:
Definition 2. Let potentials θ and messages λ be given. Using the notation as in Algorithm 1, a state x_s is called active if v_s(x_s) = min_{x'_s} v_s(x'_s) = ρ_s. A state being active at node s means that all capacity constraints are active for the cliques α containing s, thus λ_{α→s}(x_s) = μ_{α→s}(x_s) for all α ∋ s.
If w_{α→s} > 0, i.e. the weights are chosen from the interior of the unit simplex, the converse is also true.
Result 2 can be seen as follows: a node update at s (Algorithm 1) can only improve the value of ρ_s if at least one capacity constraint for a clique α ∋ s becomes inactive for every previously active state. If this is not the case, then v_s(x_s) = θ_s(x_s) + Σ_{α∋s} μ_{α→s}(x_s) also remains constant, and it is easy to see that then λ_{α→s}(x_s) also remains constant for active states. Messages may change for inactive states.
Traversal schedule and choice of weights: The scheduling policy in which order the nodes s ∈ V are traversed and the exact choice of w_{α→s} are unspecified parameters of the algorithm. Intuitively, different choices for the node traversal schedule and weights may be beneficial for the speed of convergence, since relevant message information may be propagated faster depending on the schedule and employed weights. A non-uniform weighting assigning larger weights to forward (hyper-)edges means that messages incoming at successor nodes t will have larger upper bounds and the subproblem (Eq. 4) at node t is therefore less constrained. In Fig. 3 the evolution of dual energies with respect to the number of passes of the image for a dense stereo correspondence problem is illustrated. “Seq” refers to a schedule that alternates between sequential top-to-bottom, left-to-right traversal (and its reverse), “par” is the schedule of (possibly simultaneously) updating every other pixel in the image, and “row” refers to a row-parallel schedule that simultaneously traverses every other row in the image and reverses the direction after every pass. We show results for uniform and non-uniform weight assignments. The combination “seq/non-uniform” has the fastest convergence speed if run on a sequential processor. Since the policy “row/non-uniform” is suitable for parallel implementation, we use this policy in our experiments.
The higher resolution stereo pair of the “Cones” dataset is used to generate the graph in Fig. 3. The unary potentials are NCC-induced costs, truncated at τ = 0.5, with the NCC score computed on a 5 × 5 grayscale patch; ZNCC is the zero-mean NCC of 5 × 5 gray-scale image patches. The P1-P2 smoothness model is used. For the “weak regularization” setting P1 = 1/4, P2 = 1 was chosen and Fig. 4a shows the converged result. To obtain strong regularization P1 = 1, P2 = 4 were chosen and Fig. 4b shows the converged result.
The meaning of the traversal schedules is as follows: • Seq: sequential scan from top-left pixel to bottom-right one (for odd passes) and its reverse (for even passes). • Row: Sequential scan from left to right for odd rows first, then even ones. The direction is reversed to right-to-left in every other pass. • Par: update white pixels in a checkerboard pattern first, then black ones.
The weights are assigned according to the following: • Uniform weighting: all w_{α→s} are set to 1/deg(s). • Non-uniform weighting: w_{α→s} is set to ε = 1/32 for backward edges in the traversal schedule. For the “seq” schedule, forward edges have weight 1/2 − ε, such that the total sum of weights is 1. In the “row” schedule, edges to pixels in the previous and next row have weight 1/4, and therefore the forward edge has weight 1 − 2 × 1/4 − ε = 1/2 − ε.
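The traversal schedules listed above can be generated as plain lists of pixel coordinates. The sketch below assumes row-major (x, y) coordinates; whether "odd" rows are counted from zero or one is an implementation choice, not something the description fixes.

    def checkerboard_schedule(height, width):
        """'Par' schedule: visit the white pixels of a checkerboard first, then the black ones."""
        white = [(x, y) for y in range(height) for x in range(width) if (x + y) % 2 == 0]
        black = [(x, y) for y in range(height) for x in range(width) if (x + y) % 2 == 1]
        return white + black

    def row_schedule(height, width, reverse=False):
        """'Row' schedule: every other row first (odd rows, counting from one), scanning
        left-to-right, with the direction reversed on alternate passes via `reverse`."""
        cols = range(width - 1, -1, -1) if reverse else range(width)
        rows = list(range(0, height, 2)) + list(range(1, height, 2))
        return [(x, y) for y in rows for x in cols]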
Lazy evaluation and pessimistic potentials
In this section it is assumed that the potentials are not given in advance for each state and need to be computed on demand. One strategy is to utilize a proxy for not-yet queried potentials, and to determine the states considered promising for subsequent queries while performing inference. Thus, reparametrized costs arising in the inference procedure can guide the exploration of true values for the respective potentials. It will be shown that using an upper bound for not queried potentials, i.e. “pessimistic” potentials, is highly beneficial in this context.
This section provides the necessary background and focuses on “lazy evaluation” of computationally costly unary potentials, but the inference algorithm keeps the full representation of messages. In the next section it is described how delayed evaluation of clique potentials yields compressed message representations, and how it enables more efficient inference.
Pessimistic potentials: Let θ be the true but only partially known potentials, and assume that pessimistic upper bound potentials θ̂ ≥ θ (pointwise) are available. By construction we have
(Eq. 9)
Definition 3. For given potentials θ let θ̄ be a reparametrization of θ such that θ̄_s(x_s) = 0 for all s and x_s. Without loss of generality we set
(Eq. 10)
Instead of reasoning about stable points of E_DUAL(· | θ) it is beneficial to work with the reparametrized E_DUAL(· | θ̄), since modifying the unary potentials θ_s then only affects the capacity constraints but not the balance conditions. For clarity, E_DUAL(· | θ̄) is here restated:

max_ρ Σ_{s∈V} ρ_s  subject to  ρ_s ≤ Σ_{α∋s} λ_{α→s}(x_s) for all s and x_s,  and  Σ_{s∈α} λ_{α→s}((x_α)_s) ≤ θ̄_α(x_α) for all α and x_α.   (Eq. 11)
This correspondence also carries over to the attributed potentials such as θ̂ ≥ θ etc. For these reparametrized potentials it is clear that if λ is feasible for E_DUAL(· | θ), then it is also feasible for E_DUAL(· | θ̂). The interesting question is the following: when are stable points λ for E_DUAL(· | θ̂) also stable points for E_DUAL(· | θ)?
Result 3. Let λ be a stable point for E_DUAL(· | θ̂). If λ is feasible for E_DUAL(· | θ), then λ is also a stable point for E_DUAL(· | θ).
Proof. Let λ be a stable point for E_DUAL(· | θ̂) and feasible with respect to E_DUAL(· | θ).
Fix a node s. The essential quantities in Algorithm 1 applied on λ with respect to the pessimistic potentials θ̂ are

(Eq. 12)

Since λ is a stable point for E_DUAL(· | θ̂) we have

(Eq. 13)

Updating the messages incoming at s now with respect to the true potentials θ requires computation of

(Eq. 14)

and ρ_s = min_{x_s} v_s(x_s). By construction we have μ_{α→s}(x_s) ≤ μ̂_{α→s}(x_s), v_s(x_s) ≤ v̂_s(x_s), and ρ_s ≤ ρ̂_s. But since λ is feasible with respect to θ it is known that

(Eq. 15)

i.e. λ_{α→s}(x_s) ≤ μ_{α→s}(x_s) for all α ∋ s and x_s. This implies that

(Eq. 16)

and therefore ρ_s = ρ̂_s. Thus, the objective does not improve by updating the messages incoming at s. Further, for active states x_s one has λ_{α→s}(x_s) = μ̂_{α→s}(x_s) (since the capacity constraints are active for active states), and combining this with λ_{α→s}(x_s) ≤ μ_{α→s}(x_s) and μ_{α→s}(x_s) ≤ μ̂_{α→s}(x_s) one obtains

(Eq. 17)

Hence, for active states μ_{α→s}(x_s) = μ̂_{α→s}(x_s), and the updated messages for these states are given by

(Eq. 18)

and λ is a stable point for E_DUAL(· | θ).
The relevance of this result in our setting is the following: given upper bounds on the true potentials, one can interleave sampling (or exploration) of their true values with MAP inference via successive node updates (or exploitation in a wider sense), and in the limit that combined exploration-exploitation strategy still leads to a stable point of the full inference problem. It also tells us, given current messages λ, which states are good candidates for querying the respective true potential: they are the ones that are more likely to violate the corresponding capacity constraint.
Lazy evaluation of unary potentials: The typical setting is that the unary potentials are non-parametric and costly to evaluate data terms, and that clique potentials (usually pairwise ones) are parametric and inexpensive to compute. Hence for the purposes of the present embodiment, one is interested in an approach that leads to “guided” evaluation of unseen data terms, which are then used in subsequent node updates.
In the following we will assume that the unary potentials are bounded from above, e.g. θ_s(x_s) ≤ 1. The upper bound may depend on s, but for simplicity assume a constant upper bound. Partial knowledge of the (unary) potentials combined with an upper bound on the unknown ones leads to a related MAP inference problem:
Definition 4. For each s ∈ V let L(s) be the set of resident states for which the true unary potentials are known, and θ̂ is constructed as follows:

θ̂_s(x_s) := θ_s(x_s) if x_s ∈ L(s), and θ̂_s(x_s) := 1 otherwise.   (Eq. 19)
Note that for θ̂ there are corresponding reparametrized potentials with vanishing unary potentials (recall Def. 3). With these definitions it is possible to present a meta-algorithm for MAP inference with lazy evaluation of data terms in Algorithm 2.
The same remarks on the node traversal schedule and choice of weights w_{α→s} as for Algorithm 1 apply. In the limit T → ∞ every state is explored by the algorithm, and a stable point for the full MAP inference problem is obtained. The algorithm may also stop earlier if no violating state is found at any of the nodes. The most relevant application will be when T is a constant value to meet, for example, a runtime budget. This leads to the main open design choice in the algorithm: how to find a state x̂_s such that instantiating θ_s(x̂_s) leads to the largest subsequent reduction in the objective. In order to describe the principle, we assume for now that all states are considered at pixel s. In practice neighbouring pixels are utilized to generate a small set of candidate states, which will be described in the section entitled “Limited-memory PM-CBP” below. The selected possible pixel label, for which the unary potential is to be calculated, may therefore be selected from a candidate set of possible pixel labels, wherein the candidate set of possible pixel labels is based on an estimated set of possible pixel labels of an adjacent node for which the unary potential is known. In this example, the state x̂_s to evaluate the unary potential for is determined as the state with the smallest value of the reparametrized cost, using an estimated cost θ̃_s(x_s) in place of the unknown θ_s(x_s), i.e. set to

x̂_s = argmin_{x_s ∉ L(s)} ( θ̃_s(x_s) + Σ_{α∋s} μ_{α→s}(x_s) ),   (Eq. 20)

where θ̃ is obtained from θ̂ by substituting θ̂_s(x_s) with the estimate θ̃_s(x_s). Note that θ̃_s(x_s) should be the true data term θ_s(x_s) or a lower bound thereof for Result 3 to hold. In practice it is possible to use the smallest unary potential from the neighbours as the estimate, which appears to work well. Overall, determining x̂_s according to (Eq. 20) essentially amounts to performing one node update step of Algorithm 1 and has the same runtime complexity.
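A sketch of this selection rule follows. The array names are assumptions made for the example: `estimated_unary` holds, for each label, e.g. the smallest known unary among the neighbouring nodes (or the upper bound where nothing is known), and `mu_sum` holds Σ_{α∋s} μ_{α→s}(x_s) under the current messages.

    import numpy as np

    def select_state_to_query(resident, estimated_unary, mu_sum):
        """Return the not-yet-evaluated label minimising the estimated reparametrized
        cost estimated_unary(x_s) + sum_alpha mu_{alpha->s}(x_s), in the spirit of (Eq. 20)."""
        v = np.asarray(estimated_unary, dtype=float) + np.asarray(mu_sum, dtype=float)
        v[list(resident)] = np.inf   # resident labels already have their true unary
        return int(np.argmin(v))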
While it was shown above that the schedule and the weights influence the convergence speed for standard convex BP, PM-CBP benefits as well, especially if the algorithm is run for only a few passes. Fig. 5 shows that the “row/non-uniform” setting achieves a lower primal energy much more quickly than schedules using a uniform weighting.
Fig. 6 is a flowchart showing a computer implemented method for adjusting an image using message passing. The method may be performed by, for example, the architecture described above in relation to Fig. 1. At step 610 an input image is received. The input image comprises a plurality of pixels, each pixel having an input value and each input value corresponding to one of a number of possible pixel labels or states.
At step 612, each pixel of the input image is associated with a node of a graph and one or more cliques of nodes. Each possible pixel label is associated with a unary potential and, for each permutation of possible pixel labels of the pixels associated with the nodes of each clique, a clique potential is associated.
At step 614, for each clique of the graph, at least one message between each pair of nodes of that clique is initialised. For example, as with Algorithm 2 described above, all messages λ are initialised to 0.
At step 616, a node s is selected according to a node traversal scheduling policy. For example, the node traversal scheduling policy may indicate that nodes of the graph are to be traversed sequentially in the same order each time.
At step 618, for that node a respective set of possible pixel labels for which the unary potential is known is determined. In other words, for that node the set of resident states L(s) is determined. An array θ̂_s may then be defined accordingly (see (Eq. 19)).
At step 620 a state x̂_s is selected, the state not belonging to the respective set of possible pixel labels for which the unary potential is known. As described above, the state x̂_s may be selected based on a determination of the state with the smallest value of the estimated reparametrized cost, using an estimated cost θ̃_s(x_s).
At step 622, the unary potential of the selected possible pixel label is calculated.
At step 624, the respective set L(s) of possible pixel labels for which the unary potential is known is updated to include the selected state x̂_s. The array θ̂_s may be updated accordingly.
At step 626, for each clique to which node s belongs, the clique potential associated with that clique is adjusted or reparameterised based on the updated respective set of possible pixel labels L(s). For example, the clique potentials are adjusted as in (Eq. 10), which also shows the unary potentials being adjusted to zero.
At step 628, a node update is performed according to Algorithm 1, using the adjusted clique potentials. This has the effect of, for each clique to which the node s belongs, adjusting, based on the adjusted clique potential associated with that clique, at least one of the messages between node s and the other nodes of that clique.
At step 630, a determination is made as to whether or not a convergence criterion has been met. If the convergence criterion has not been met, the method proceeds to step 632, in which a different node is selected according to the node traversal scheduling policy. The method then returns to step 618. The convergence criterion may be, for example, that a solution to the dual problem of (Eq. 11) has been found. The convergence criterion may be, for example, that the method has traversed over all nodes a predetermined number of times.
If, at step 630, a determination is made that the convergence criterion has been met, then the method proceeds to step 634 and an adjusted image is produced having pixel labels determined from the adjusted messages. If, for example, the convergence criterion is that a solution to the dual problem of (Eq. 11) has been found, then the adjusted image may be produced by translating the solution to a solution of a corresponding primal problem and assigning the labels to the pixels accordingly.
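The control flow of Fig. 6 can be summarised in the following sketch. The callables passed in are hypothetical stand-ins for steps 616-634 and are not part of the patent's terminology; a fixed pass budget is used as the convergence criterion for simplicity.

    def lazy_message_passing(nodes, schedule, passes, select_label, evaluate_unary,
                             reparametrize_cliques, node_update, extract_labels):
        """Control-flow sketch of the loop in Fig. 6; each argument is a callable
        implementing the correspondingly numbered step."""
        for _ in range(passes):                   # step 630: fixed pass budget as the
            for s in schedule(nodes):             #           convergence criterion
                x_new = select_label(s)           # step 620: pick an unevaluated label
                evaluate_unary(s, x_new)          # steps 622-624: compute it lazily and
                                                  #                add it to the resident set
                reparametrize_cliques(s)          # step 626: adjust the clique potentials
                node_update(s)                    # step 628: adjust the messages
        return extract_labels()                   # step 634: labels from the adjusted messages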
Limited-memory PM-CBP
The main benefit of the basic PM-CBP algorithm is only apparent when early stopping is applied: in this case only a subset of unary potentials is evaluated, and the final convex BP iterations converge to the solution of a proxy MAP instance with partially pessimistic potentials. Runtime savings come from not evaluating all data terms (which can lead to substantial savings). Interestingly, one can go far beyond that to derive a limited memory version of PM-CBP, which maintains a constant number of states and respective messages. This is achieved using the following two observations: (i) by using proxy upper bounds for clique potentials, it is possible to represent all messages for non-resident states at a node by a single value, and (ii) at least one resident state can be made non-resident without decreasing the objective.
It is therefore sufficient to maintain only three resident states per node. The method is outlined in Algorithm 2 (now including the framed instructions), and is explained in more detail in the following.
Group state: The basic concept is to extend the use of upper bounds for unseen unary potentials (i.e. for non-resident states) to clique potentials. If any element of a clique state x_α is not resident (i.e. not in the respective resident set), the assigned potential θ̂_α(x_α) is an upper bound of the true value. In the experiments described herein truncated pairwise potentials are used, hence an upper bound is easily available. Adding a state x_s to the resident set does not only update θ̂_s(x_s) (to the true value), but it may also lead to the substitution of θ̂_α(x_α) by the true θ_α(x_α) if x_α now consists of only resident states. Since every unary and clique potential involving a non-resident state is constant for all states x_s ∉ L(s), it also implies that the messages after a node update will attain the same value for all x_s ∉ L(s).
Hence, the set of messages {λ_{α→s}(x_s) : x_s ∉ L(s)} can be represented by a single value, which is denoted by λ_{α→s}(*). The group state “*” simultaneously represents all non-resident states. Thus, introducing a single message for all non-resident states does not affect the validity of the basic PM-CBP algorithm. The resident sets L(s) will grow by one element in each pass. In order to have a fixed-size resident set, the “least active” state x̌_s is discarded, i.e. the state x_s ∈ L(s) with the largest value of the summed parameters. That is, for each clique α to which the node s belongs, there is an associated parameter μ_{α→s}(x_s), the associated parameter being dependent on the possible pixel labels for node s. The one of the possible pixel labels which is removed from the resident set L(s) is the possible pixel label for which the sum of the parameters is greatest. Removal of x̌_s from L(s) will therefore not reduce the objective.
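A possible fixed-size data structure for the resident labels and their messages is sketched below. The class and field names are illustrative, and the eviction rule follows the "least active" criterion described above (largest sum of the per-clique, message-related parameters); the exact parameter used by the patented method may differ.

    class ResidentSet:
        """Fixed-size per-node label bookkeeping for a limited-memory scheme: a handful
        of resident labels with per-clique message values, plus one shared '*' entry per
        clique whose single value stands in for every non-resident label."""

        def __init__(self, capacity, num_cliques):
            self.capacity = capacity
            self.messages = {}                            # label -> list of per-clique values
            self.group_message = [0.0] * num_cliques      # one '*' value per clique containing s

        def admit(self, label, per_clique_values):
            """Add a newly evaluated label; if full, first evict the 'least active' resident
            label, i.e. the one whose summed message-related parameters are largest."""
            if label in self.messages:
                return
            if len(self.messages) >= self.capacity:
                worst = max(self.messages, key=lambda x: sum(self.messages[x]))
                del self.messages[worst]
            self.messages[label] = list(per_clique_values)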
Choice of x̂_s: The state x̂_s to add to L(s) was determined by scanning over all non-resident states in the previous section (recall (Eq. 20)). If we maintain a small set of resident states L(s), this exhaustive scan may dominate the overall runtime complexity. Hence, it is sensible to generate potential candidates at node s based on resident states at neighbouring nodes t. Since (parametric) clique potentials usually encode smoothness assumptions, it is reasonable to randomly sample states using a density p(x_s | x_{α\s}) ∝ exp(−θ_α(x_α)). Since we do not assume the potentials to be calibrated, there is a global scale ambiguity in addition to local bias ambiguities (one degree of freedom per clique), which do not affect the overall MAP solution. Therefore, in practice it is possible to estimate the distribution parameters from training data. Training data can be ground truth labelings or MAP solutions generated by full-scale inference. The set of resident clique states not containing s is
(Eq. 21) which allows the definition of the random proposal sets,
(Eq. 22) and C(s) = ∪_{α∋s} C_α(s) \ L(s). C(s) is the set of random proposals generated from neighbouring resident states (minus the already resident states at s). Finally, the new resident state x̂_s is determined in analogy to (Eq. 20).
If the estimated costs are the true potential values or respective lower bounds, and p(x_s | x_{α\s}) assigns a positive probability to every state, such that every state is in C(s) infinitely often when T → ∞, then limited memory PM-CBP will reach a stable point of the full inference problem. This follows from Result 3 and the fact that feasibility of the current messages is tested infinitely often. It also implies that maintaining messages for three states is sufficient: one for the group state *, one for an active state, and one slot for x̂_s.
The data structure for messages in limited-memory PM-CBP can be just a fixed-size array. The slots for discarded states can be reused for newly admitted ones. In general, the algorithm requires only fixed-size, pre-allocated data structures, which is a large benefit for GPU or embedded implementations.
An optional subprocess is now described in relation to Fig. 7, which may take place within the process described above in relation to Fig. 6. Fig. 7 therefore shows a number of steps already described above in relation to Fig.6 and which are accordingly allocated the same reference numerals.
After step 618, in which a respective set L(s) of possible pixel labels is determined for node s, the method proceeds to step 702 in which a state x̌_s of L(s) is selected. As described above, the state x̌_s may be selected based on a determination of the state with the greatest value of the summed parameters.
At step 704 the respective set of possible pixel labels is updated by removing one of the possible pixel labels therefrom. Accordingly, and with reference to (Eq. 19) above, in the array θ̂_s(·), the known unary potential θ_s(x̌_s) is replaced with 1. As described above, this allows the messages for non-resident states to be represented by the group state “*”.
After step 704, the method proceeds to step 620 in which a state x̂_s for which the unary potential is unknown is selected, and the method proceeds as in Fig. 6. In this way, the resident set L(s) stays a constant size for each node update (step 626) of node s.
Applications
The methods described herein may be used to perform a number of image processing tasks, as described below. In particular, by performing the methods described herein a solution, or approximate solution, to a primal problem such as that of (Eq. 2) may be found in a time-efficient and memory-efficient manner. The image processing task performed on the image depends on the parameters used to define the problem.
The performance of the methods described herein on dense correspondence problems is demonstrated. The general parameters are as follows: PM-CBP is performed with 5 resident states (plus one to represent all non-resident states) for the indicated number of passes T, followed by a fixed number of 32 convex BP iterations to refine the messages. Since a 4-neighborhood is used for the pairwise terms, the memory consumption is 6 × 4 = 24 floating-point values per pixel (i.e. 24 times the image resolution). Primal solutions are extracted simply by reporting, for each pixel, the state with the smallest min-marginal v_s(x_s). The algorithm is implemented in straightforward C++ with OpenMP enabled, and the runtimes are reported for a dual Xeon E5-2690 system with 64 GB of main memory. GPU acceleration is not employed.
Dense disparity estimation: Results are demonstrated on dense stereo instances from the Middlebury benchmark datasets. The state space contains integral disparity values and has between 60 and 240 elements (depending on the image pair). The data term (unary potential) attains values in [0, 1] and is given by a truncated ZNCC-induced cost, where ZNCC is the zero-mean NCC of 5 × 5 gray-scale image patches and the truncation parameter τ is fixed to 1/2. Results are shown for two related pairwise potentials. The first one is a Potts smoothness model, and the second one is the 3-way pairwise potential which is also known as the P1-P2 smoothness. From ground truth disparity maps the relative frequencies of the events distinguished by this potential (equal disparities, a disparity difference of one, and larger differences) are estimated for neighbouring pixels s and t. This defines how candidate states are sampled in Algorithm 2. Fig. 8 shows the evolution of the attained primal objective with respect to wall clock time for full scale convex BP (Algorithm 1) and limited memory PM-CBP. Clearly, PM-CBP achieves a lower energy much faster than convex BP, with much lower memory requirements (5% for “Teddy” and “Cones”, and 2.5% for “Aloe”). The corresponding labelling results returned by PM-CBP after T = 4, 8, 16, 32, 64, 128 passes are illustrated in Fig. 9. Fig. 9 is a visual illustration of converging stereo results for (starting at the top) the “Cones”, “Teddy”, “Cones hires”, “Teddy hires”, “Aloe”, and “Baby3” datasets after the respective number T of passes.
As the objective optimized in the above discussion is the dual program to the original linear (i.e. primal) program in (Eq. 2), a primal solution can be extracted by complementary slackness. In practice this means that, given a solution of (Eq. 3), an approximate solution of (Eq. 1) is obtained by setting x*_s = argmin_{x_s} v_s(x_s) (recall (Eq. 7) for the definition of v_s(x_s)).
For dense disparity estimation the unknown label values are the disparities, and the unary potentials are computed by comparing the image patch centered at the current pixel in the reference image with the image patch in the “moving” image centered at the current pixel shifted by the disparity value under consideration. The comparison of image patches is based on a truncated zero-mean normalized cross correlation value.
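A sketch of such a data term, together with the P1-P2 smoothness mentioned above, is given below. The exact truncation and normalisation used in the experiments may differ from the `min(1 - ZNCC, tau)` form chosen here, non-negative integer disparities are assumed, and the default P1, P2 values correspond to the "weak regularization" setting mentioned earlier.

    import numpy as np

    def zncc_unary(reference, moving, x, y, disparity, half=2, tau=0.5):
        """Truncated zero-mean NCC data term for dense disparity estimation, computed on
        (2*half+1)^2 grayscale patches; out-of-image labels get the pessimistic bound tau."""
        h, w = reference.shape
        if not (half <= y < h - half and half + disparity <= x < w - half):
            return tau
        a = reference[y - half:y + half + 1, x - half:x + half + 1].astype(float).ravel()
        b = moving[y - half:y + half + 1,
                   x - disparity - half:x - disparity + half + 1].astype(float).ravel()
        a -= a.mean(); b -= b.mean()
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        zncc = a @ b / denom if denom > 0 else 0.0
        return float(min(1.0 - zncc, tau))   # assumed truncation; the experiments may scale differently

    def p1_p2(d_s, d_t, p1=0.25, p2=1.0):
        """Three-way P1-P2 smoothness on neighbouring disparities: free if equal,
        small penalty p1 for a one-level jump, larger penalty p2 otherwise."""
        diff = abs(d_s - d_t)
        return 0.0 if diff == 0 else (p1 if diff == 1 else p2)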
Optical flow estimation: Similar numerical experiments were run for optical flow instances. The state space contains 129^2 flow vectors, corresponding to a 16 pixel search range at quarter-pixel accuracy. The original grayscale images were upscaled to 400% and the same ZNCC based score as for dense stereo was used (but computed on 11 × 11 patches from the upscaled images). The pairwise smoothness term is the P1-P2 model applied separately to the horizontal and vertical components of the motion vector. The decrease in primal energy for the solution returned after the respective number of passes with respect to wall clock time is shown in Fig. 10. In this case the memory consumption is 6/129^2, or less than 0.04%, of running full inference, and usable motion fields are obtained after a few seconds of CPU time. Visualizations of the corresponding flow fields are depicted in Figs. 11 and 12. Fig. 11 is a visual illustration of the convergence of the optical flow field for the “Schefflera” dataset after a) 4, b) 8, c) 16, d) 32, e) 64 and f) 128 passes. Fig. 12 is a visual illustration of converging optical flow fields for various datasets after a) 4, b) 8, c) 16, d) 32, e) 64 and f) 128 passes. The color coding is similar to the Middlebury one, but uses more highly saturated colors for better visibility.
Optical flow can be addressed in a very similar way to dense disparity estimation. In order to allow subpixel motion vectors, one can upscale the original pair of input images to 400% of the original size, and estimate integral motion vectors at this resolution. This yields quarter pixel motion vectors for the original image resolution.
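To make the relationship between discrete states and sub-pixel motion vectors concrete, a small sketch is given below; the state indexing scheme is an assumption chosen for illustration (any consistent enumeration of the 129 × 129 candidate displacements would do).

#include <utility>

// Map a discrete flow state in {0, ..., 129*129 - 1} to a motion vector at the
// original image resolution. With a +/-16 pixel search range sampled at
// quarter-pixel steps there are 129 candidate displacements per axis
// (integral displacements in [-64, 64] at the 4x upscaled resolution).
std::pair<float, float> state_to_flow(int state)
{
    const int range = 129;                  // candidate displacements per axis
    const int dx4 = state % range - 64;     // displacement in upscaled pixels
    const int dy4 = state / range - 64;
    return { dx4 / 4.0f, dy4 / 4.0f };      // quarter-pixel flow at original scale
}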
Image segmentation: In this setting the pixel label values to infer are object categories such as “sky”, “vegetation”, “road”, “building”, “pedestrian”, “car”, etc. (these categories are useful if outdoor images are segmented; for medical images, for example, the categories might be different types of tissues and organs). The unary potentials can typically be trained from ground-truth data using a machine learning method, and the pairwise clique potential will usually be set to the Potts smoothness model, θst(xs, xt) = τ if xs ≠ xt (with τ greater than 0), and 0 otherwise.
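A minimal sketch of this Potts clique potential follows; the value of τ is an illustrative assumption.

// Potts smoothness: a constant penalty tau whenever two neighbouring pixels
// receive different category labels, and no penalty otherwise.
inline float potts(int xs, int xt, float tau = 1.0f)
{
    return (xs != xt) ? tau : 0.0f;
}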
Image denoising: For image denoising the pixel label values are unknown pixel values. The unary potential is derived from the (known or assumed) noise model of the imaging sensor, and the pairwise (or higher-order) clique potentials encode desired image statistics (e.g. how correlated two neighbouring pixel values are in natural images).
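For instance, under an assumed additive zero-mean Gaussian noise model the unary potential could simply penalise the squared deviation of a candidate pixel value from the observed one; the sketch below illustrates that assumption and is not a prescribed choice.

// Unary potential for denoising under an assumed zero-mean Gaussian noise
// model with standard deviation sigma: the negative log-likelihood of the
// candidate value 'label' given the observed value 'observed' (up to a constant).
inline float denoise_unary(float observed, float label, float sigma)
{
    const float r = label - observed;
    return 0.5f * r * r / (sigma * sigma);
}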
Image deblurring and zooming: Image deblurring and zooming may be carried out using a similar method to image denoising. However, a blur kernel is added to the unary potential for each pixel.
Variations of the described arrangements are envisaged. For example, receiving an image may comprise receiving data from an external data source and processing the received data to produce an image. Receiving an image may comprise generating an image.
In the above discussion, a received image comprises a plurality of pixels, each pixel having an input value and each input value corresponding to one of a number of possible pixel labels. The number of pixel labels may vary from node to node or may be constant. The number of possible pixel labels may depend on the application for which the methods described herein are used. For example, if a mask is to be generated, then the number of possible pixel labels may be 2, with a first pixel label being assigned to the pixel if a determination is made that the pixel is related to an object, and a second pixel label being assigned to the pixel if a determination is made that the pixel is not related to the object.
Cliques of the graph may be formed of any number of nodes. Any two cliques may contain the same or a different number of nodes. The assignment of a node to a particular clique may be performed based on the problem to be solved.
In the discussion above, in the course of Algorithm 2 the messages were initialised at 0. Messages may be initialised at any suitable value.
Determining, for a node, a respective set of possible pixel labels for which the unary potential is known, may comprise retrieving information concerning the respective set from memory.
Computing a unary potential of a selected possible pixel label for which the unary potential is unknown may comprise any suitable method for computing a unary potential. For example, one or more known unary potentials of possible pixel labels associated with adjacent nodes may be considered, and a suitable one of these unary potentials may be associated with the selected possible pixel label. For example, promising labels may be propagated to neighbouring pixels in order to rank the candidate states for which the true data term is then queried.
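One possible way to form such a candidate list by propagating the current resident labels of the neighbouring pixels is sketched below; the data layout and the choice to rank candidates by their freshly computed unary potential are assumptions made for illustration.

#include <vector>
#include <algorithm>
#include <functional>

// Gather candidate labels from the resident (already evaluated) label sets of
// the neighbouring pixels, evaluate the true unary potential for each candidate
// not yet resident at this pixel, and return the candidates ranked by that
// potential (most promising, i.e. lowest cost, first).
std::vector<int> propose_candidates(
    const std::vector<std::vector<int>>& neighbour_resident_labels,
    const std::vector<int>& own_resident_labels,
    const std::function<float(int)>& unary)   // true data term for this pixel
{
    std::vector<int> candidates;
    for (const auto& labels : neighbour_resident_labels)
        for (int l : labels)
            if (std::find(own_resident_labels.begin(), own_resident_labels.end(), l)
                    == own_resident_labels.end()
                && std::find(candidates.begin(), candidates.end(), l)
                    == candidates.end())
                candidates.push_back(l);

    std::sort(candidates.begin(), candidates.end(),
              [&](int a, int b) { return unary(a) < unary(b); });
    return candidates;
}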
The selected possible pixel label may be selected according to any criterion. In the discussion above, the possible pixel label was chosen based on a determination of the state with the smallest corresponding value. However, the selected possible pixel label may be selected, for example, randomly.
Adjusting the clique potential for each clique to which a node belongs based on the unary potentials associated with the updated respective set of possible pixel labels may comprise, for example, weighting the unary potentials. For example, the unary potentials may be weighted evenly according to the number of nodes adjacent to the node under observation, or may be weighted unevenly.
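An even weighting could, for example, distribute a node's unary potential uniformly over the cliques containing it; the sketch below illustrates this assumed scheme (uneven weightings would replace the uniform factor by per-clique weights summing to one).

// Distribute the unary potential phi_s(x_s) evenly over the n_cliques cliques
// containing node s, so that each clique potential absorbs an equal share.
inline float unary_share(float phi_s_of_xs, int n_cliques)
{
    return phi_s_of_xs / static_cast<float>(n_cliques);
}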
Producing an adjusted image having pixel labels determined from adjusted messages may comprise producing a new image having pixel labels determined from adjusted messages. Alternatively, producing an adjusted image may comprise altering the pixel labels of the received input image.
The methods described herein work well if the unary potentials (data terms) are reasonably discriminative for most pixels. In cases when larger regions in the image are non-discriminative (e.g. uniformly coloured sky regions in a stereo image pair for dense depth computation), then the output of the algorithm can look “patchy”. In order to avoid this, the algorithm may be run on lower resolution versions of the input image(s), followed by upscaling the obtained result. This result can be used to initialize one state in the resident set at the finer level. This scheme can be applied recursively, i.e. the algorithm can be run on very coarse images and subsequently rerun at higher resolutions with the upscaled result from the previous level used as initializer.
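A sketch of this recursive coarse-to-fine strategy is given below; the image and label-map types as well as the downscaling, label-upscaling and inference routines are placeholders supplied as callables and are only assumed to exist in this form.

#include <functional>

struct Image {};     // placeholder image type
struct LabelMap {};  // placeholder per-pixel label map

// Recursive coarse-to-fine wrapper around the limited-memory inference:
// the result at each coarser level initialises one resident state per pixel
// at the next finer level, mitigating "patchy" output in non-discriminative regions.
LabelMap coarse_to_fine(
    const Image& img, int levels,
    const std::function<Image(const Image&, int)>& downscale,
    const std::function<LabelMap(const LabelMap&, int)>& upscale_labels,
    const std::function<LabelMap(const Image&, const LabelMap*)>& run_inference)
{
    if (levels == 0)
        return run_inference(img, nullptr);         // coarsest level: no initializer
    const Image coarse = downscale(img, 2);         // halve the resolution
    const LabelMap coarse_result =
        coarse_to_fine(coarse, levels - 1, downscale, upscale_labels, run_inference);
    const LabelMap init = upscale_labels(coarse_result, 2);  // seeds one resident state
    return run_inference(img, &init);
}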
The described methods may be implemented by a computer program. The computer program, which may be in the form of a web application or ‘app’, comprises computer-executable instructions or code arranged to instruct or cause a computer or processor to perform one or more functions of the described methods. The computer program may be provided to an apparatus, such as a computer, on a computer readable medium or computer program product. The computer readable medium or computer program product may comprise non-transitory media such as semiconductor or solid state memory, magnetic tape, a removable computer memory stick or diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disc, and an optical disk, such as a CD-ROM, CD-R/W, DVD or Blu-ray. The computer readable medium or computer program product may comprise a transmission signal or medium for data transmission, for example for downloading the computer program over the Internet.
An apparatus or device such as a computer may be configured to perform one or more functions of the described methods. The apparatus or device may comprise a mobile phone, tablet, laptop or other processing device. The apparatus or device may take the form of a data processing system. The data processing system may be a distributed system. For example, the data processing system may be distributed across a network or through dedicated local connections.
The apparatus or device typically comprises at least one memory for storing the computer-executable instructions and at least one processor for performing the computer-executable instructions.
While certain arrangements have been described, these arrangements have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the methods, apparatuses and systems described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the apparatuses described herein may be made.

Claims (15)

Claims
1. A computer implemented method for producing an adjusted image using message passing, the method comprising performing the following steps: i) receiving an input image that comprises a plurality of pixels, each pixel having an input value and each input value corresponding to one of a number of possible pixel labels; ii) associating each pixel with a node of a graph and one or more cliques of nodes, wherein each possible pixel label is associated with a unary potential, and wherein, for each permutation of possible pixel labels of the pixels associated with the nodes of each clique, a clique potential is associated; iii) for each clique, initialising at least one message between each pair of nodes of that clique; iv) for each node: a) determining, for that node, a respective set of possible pixel labels for which the unary potential is known; b) computing a unary potential of a selected possible pixel label for which the unary potential is unknown and updating the respective set of possible pixel labels to include the selected possible pixel label; c) for each clique to which that node belongs, based on the unary potentials associated with the updated respective set of possible pixel labels, adjusting the clique potential associated with that clique; and d) for each clique to which that node belongs, adjusting, based on the adjusted clique potential associated with that clique, at least one of the messages between that node and the other nodes of that clique; v) repeating step iv) until a convergence criterion is met; and vi) producing an adjusted image having pixel labels determined from the adjusted messages.
2. A method according to claim 1, wherein step iv) further comprises, after step a) and before step c), updating the respective set of possible pixel labels by removing one of the possible pixel labels therefrom.
3. A method according to claim 2, wherein for each clique to which that node belongs there is an associated parameter related to messages passed between that node and other nodes of that clique, the parameter dependent on the possible pixel labels for that node; and wherein the one of the possible pixel labels that is removed from the respective set of possible pixel labels is the possible pixel label of the respective set of possible pixel labels for which the sum of the associated parameters is greatest.
4. A method according to any preceding claim, wherein, at step c), adjusting the clique potential associated with that clique comprises adjusting the clique potential based on a weighted sum of the unary potentials associated with the respective set of possible pixel labels.
5. A method according to claim 4, wherein, at step c), adjusting the clique potential associated with that clique further comprises adding a constant value to the weighted sum for each possible pixel label which is not included in the respective set of possible pixel labels.
6. A method according to any preceding claim, wherein, at step b), the selected possible pixel label is selected from a candidate set of possible pixel labels, wherein the candidate set of possible pixel labels is based on an estimated set of possible pixel labels of an adjacent node for which the unary potential is known.
7. A method according to any preceding claim, wherein repeating step iv) until a convergence criterion is met comprises repeating step iv) a predetermined number of times.
8. A method according to any of claims 1 to 6, wherein repeating step iv) until a convergence criterion is met comprises repeating step iv) until repetition of step iv) does not change the messages.
9. A method according to any preceding claim, wherein at step b), computing a unary potential of a selected possible pixel label comprises estimating a unary potential of the selected possible pixel label based on a unary potential associated with a possible pixel label of an adjacent node in the graph.
10. A method according to any preceding claim, wherein, at step d), adjusting at least one of the messages comprises adjusting the at least one of the messages according to one or more predetermined message weightings.
11. A method according to any preceding claim, wherein the unary potentials are configured so as to cause the method to perform one or more of the following operations upon the input image: segmentation; distortion correction; optical flow; dense stereo correspondence; image denoising, deblurring and zooming.
12. A method according to any preceding claim, wherein the clique potentials are initially configured so as to cause the method to perform one or more of the following operations upon the input image: segmentation; distortion correction; optical flow; dense stereo correspondence; image denoising, deblurring and zooming.
13. An apparatus or system arranged to perform the method of any preceding claim.
14. The apparatus or system of claim 13, wherein the apparatus or system comprises: input/output means arranged to receive an image; a memory storing instructions for causing a processor to perform steps of the method of any of claims 1 to 12; and a processor for performing steps of the method of any of claims 1 to 12.
15. A non-transitory computer-readable medium comprising machine-readable instructions arranged, upon execution by one or more processors, to cause the one or more processors to carry out the method of any of claims 1 to 12.
GB1604792.0A 2016-03-22 2016-03-22 Image adjustment Active GB2549068B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
GB1604792.0A GB2549068B (en) 2016-03-22 2016-03-22 Image adjustment
JP2017029719A JP6448680B2 (en) 2016-03-22 2017-02-21 Image adjustment
US15/455,849 US10078886B2 (en) 2016-03-22 2017-03-10 Image adjustment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB1604792.0A GB2549068B (en) 2016-03-22 2016-03-22 Image adjustment

Publications (3)

Publication Number Publication Date
GB201604792D0 GB201604792D0 (en) 2016-05-04
GB2549068A true GB2549068A (en) 2017-10-11
GB2549068B GB2549068B (en) 2021-09-29

Family

ID=55968649

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1604792.0A Active GB2549068B (en) 2016-03-22 2016-03-22 Image adjustment

Country Status (3)

Country Link
US (1) US10078886B2 (en)
JP (1) JP6448680B2 (en)
GB (1) GB2549068B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019050013A1 (en) 2017-09-11 2019-03-14 キヤノン株式会社 Developer carrier, process cartridge and electrophotographic apparatus
JP6463534B1 (en) 2017-09-11 2019-02-06 キヤノン株式会社 Developer carrier, process cartridge, and electrophotographic apparatus
CN111417981A (en) * 2018-03-12 2020-07-14 华为技术有限公司 Image definition detection method and device
US10468062B1 (en) * 2018-04-03 2019-11-05 Zoox, Inc. Detecting errors in sensor data
CN108924385B (en) * 2018-06-27 2020-11-03 华东理工大学 Video de-jittering method based on width learning
KR20210062381A (en) 2019-11-21 2021-05-31 삼성전자주식회사 Liveness test method and liveness test apparatus, biometrics authentication method and biometrics authentication apparatus

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010135586A1 (en) * 2009-05-20 2010-11-25 The Trustees Of Columbia University In The City Of New York Systems devices and methods for estimating
WO2012123505A2 (en) * 2011-03-14 2012-09-20 Ecole Centrale Paris Method and device for efficient parallel message computation for map inference

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001022942A (en) * 1999-06-18 2001-01-26 Mitsubishi Electric Inf Technol Center America Inc Method for presuming scene from test image
JP2008084076A (en) * 2006-09-28 2008-04-10 Toshiba Corp Image processor, method, and program
US9153031B2 (en) * 2011-06-22 2015-10-06 Microsoft Technology Licensing, Llc Modifying video regions using mobile device input
US9595134B2 (en) * 2013-05-11 2017-03-14 Mitsubishi Electric Research Laboratories, Inc. Method for reconstructing 3D scenes from 2D images
US10022544B2 (en) * 2013-07-22 2018-07-17 National Ict Australia Limited Vision enhancement apparatus for a vision impaired user
US9996925B2 (en) * 2013-10-30 2018-06-12 Worcester Polytechnic Institute System and method for assessing wound
BE1023147B1 (en) * 2015-07-03 2016-12-01 Cnh Industrial Belgium Nv CONTROLLER FOR A VEHICLE VEHICLE
US9881380B2 (en) * 2016-02-16 2018-01-30 Disney Enterprises, Inc. Methods and systems of performing video object segmentation

Also Published As

Publication number Publication date
US10078886B2 (en) 2018-09-18
US20170278223A1 (en) 2017-09-28
GB201604792D0 (en) 2016-05-04
JP2017174414A (en) 2017-09-28
GB2549068B (en) 2021-09-29
JP6448680B2 (en) 2019-01-09