55:148 Digital Image Processing
55:247 Image Analysis and Understanding
Chapter 8, Part V
Image understanding: Scene labeling and constraint propagation
Chapter 8.5 Overview:
Scene labeling and constraint propagation
- Context plays a significant role in image understanding; the previous section was
devoted to context present in pixel data configurations, and this section deals with
semantic labeling of regions and objects.
- Assume that regions have been detected in an image that correspond to objects or other
image entities, and let the objects and their inter-relationships be described by a region
adjacency graph and/or a semantic net.
- Object properties are described by unary relations, and inter-relationships between
objects are described by binary (or n-ary) relations.
- The goal of scene labeling is to assign a label (a meaning) to each image object to
achieve an appropriate image interpretation.
- The resulting interpretation should correspond with available scene knowledge.
- The labeling should be consistent, and should favor more probable interpretations if
there is more than one option.
- Consistency means that no two objects of the image appear in an illegal
configuration - e.g. an object labeled house in the middle of an object labeled lake
will be considered inconsistent in most scenes.
- Conversely, an object labeled house surrounded by an object labeled lawn
in the middle of a lake may be fully acceptable.
- Two main approaches may be chosen to achieve this goal.
- Discrete labeling allows only one label to be assigned to each object in the
final labeling.
- Effort is directed to achieving a consistent labeling all over the image.
- Probabilistic labeling allows multiple labels to co-exist in objects.
- Labels are probabilistically weighted, with a label confidence being assigned to each
object label.
- The main difference is in interpretation robustness.
- Discrete labeling always finds either a consistent labeling or detects the impossibility
of assigning consistent labels to the scene.
- Often, as a result of imperfect segmentation, discrete labeling fails to find a
consistent interpretation even if only a small number of local inconsistencies is
detected.
- Probabilistic labeling always gives an interpretation result together with a measure of
confidence in the interpretation.
- Even if the result may be locally inconsistent, it often gives a better scene
interpretation than a consistent and possibly very unlikely interpretation resulting from
a discrete labeling.
- Note that discrete labeling may be considered a special case of probabilistic labeling
with one label probability always being 1 and all the others being 0 for each
object.
- The scene labeling problem is specified by
- A set of objects R_i, i=1,...,N.
- A finite set of labels Omega_i for each object R_i.
- A finite set of relations between objects.
- The existence of a compatibility function (reflecting constraints) between interacting
objects.
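The four ingredients listed above can be held in one small data structure. The following is a minimal Python sketch, not a definitive design: the class name `LabelingProblem`, the relation triples, and the house/lake compatibility rule are illustrative assumptions chosen to mirror the consistency example earlier in this section.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class LabelingProblem:
    """Hypothetical container for the ingredients of a scene labeling problem."""
    objects: List[str]                      # object identifiers R_1 .. R_N
    labels: Dict[str, List[str]]            # finite label set Omega_i per object
    relations: List[Tuple[str, str, str]]   # observed (R_i, relation name, R_j)
    # compatibility(label_i, relation, label_j) -> value in [0, 1]
    compatibility: Callable[[str, str, str], float]

    def is_consistent(self, assignment: Dict[str, str]) -> bool:
        """True if no observed relation has zero compatibility."""
        return all(
            self.compatibility(assignment[a], rel, assignment[b]) > 0.0
            for a, rel, b in self.relations
        )


def compat(label_a: str, relation: str, label_b: str) -> float:
    # Assumed rule: a house directly inside a lake is an illegal configuration.
    if relation == "inside" and (label_a, label_b) == ("house", "lake"):
        return 0.0
    return 1.0


problem = LabelingProblem(
    objects=["R1", "R2"],
    labels={"R1": ["house", "lawn"], "R2": ["lake", "lawn"]},
    relations=[("R1", "inside", "R2")],
    compatibility=compat,
)

print(problem.is_consistent({"R1": "house", "R2": "lake"}))
print(problem.is_consistent({"R1": "house", "R2": "lawn"}))
```

A compatibility function that returns only 0 or 1 corresponds to discrete labeling; graded values in between are what the probabilistic approach uses.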
- Solving the labeling problem by considering the direct interaction of all objects in an
image is computationally very expensive, so approaches to solving labeling problems are
usually based on constraint propagation.
- This means that local constraints result in local consistencies (local optima), and by
applying an iterative scheme the local consistencies adjust to global consistencies
(global optima) in the whole image.
- Many types of relaxation exist, some of them being used in statistical physics, for
example, simulated annealing, stochastic relaxation, etc.
- Others, such as relaxation labeling, are typical in image understanding.
- To provide a better understanding of the idea, the discrete relaxation approach is
considered first.
Discrete relaxation
- Consider a scene in which six objects are present, including the background.
- Let the labels be background (B), window (W), table (T), drawer (D), and phone (P),
and let the unary properties of object interpretations be
- A window is rectangular
- A table is rectangular
- A drawer is rectangular
- Let the binary constraints be
- A window is located above a table
- A phone is above a table
- A drawer is inside a table
- Background is adjacent to the image border
- Given these constraints, the labeling in the right panel is inconsistent.
- Discrete relaxation assigns all existing labels to each object and iteratively removes
every label that cannot be assigned to an object without violating the constraints.
- A possible relaxation sequence is shown.
- Note the mechanism of constraint propagation.
- Relations between objects may influence labeling in distant locations of the scene after
several steps, making it possible to achieve a globally consistent scene interpretation
although all the label-removing operations are local.
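The label-removal loop of discrete relaxation can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure from the original figure: the scene measurements in `SCENE` and the observed relations in `RELATIONS` are hypothetical, the border constraint is treated as a per-object test, and each binary constraint is read both as a requirement (a window, phone, or drawer needs a table in the stated relation) and as the only explanation allowed for an observed relation.

```python
LABELS = {"B", "W", "T", "D", "P"}

# Hypothetical measurements for six objects (the original figure is not available).
SCENE = {
    1: {"rectangular": False, "border": True},
    2: {"rectangular": True, "border": False},
    3: {"rectangular": True, "border": False},
    4: {"rectangular": True, "border": False},
    5: {"rectangular": True, "border": False},
    6: {"rectangular": False, "border": False},
}
# Hypothetical observed relations: (subject, relation, object).
RELATIONS = [(2, "above", 3), (6, "above", 3), (4, "inside", 3), (5, "inside", 3)]

# Binary constraints from the text: label -> (relation, required partner label).
REQUIRES = {"W": ("above", "T"), "P": ("above", "T"), "D": ("inside", "T")}


def unary_ok(label, props):
    """Windows, tables and drawers are rectangular; background touches the border."""
    if label in {"W", "T", "D"} and not props["rectangular"]:
        return False
    if label == "B" and not props["border"]:
        return False
    return True


def supported(label, obj, labels, relations):
    """True if the label can still be explained by the labels left on related objects."""
    as_subject = [(r, b) for a, r, b in relations if a == obj]
    as_object = [(a, r) for a, r, b in relations if b == obj]
    # A window, phone or drawer needs a table partner, so it must be a subject.
    if label in REQUIRES and not as_subject:
        return False
    # Each relation with obj as subject must match this label's constraint.
    for r, b in as_subject:
        if not any(REQUIRES.get(label) == (r, lb) for lb in labels[b]):
            return False
    # Each relation with obj as object must be explained by some subject label.
    for a, r in as_object:
        if not any(REQUIRES.get(la) == (r, label) for la in labels[a]):
            return False
    return True


def discrete_relaxation(scene, relations):
    """Start from all unary-feasible labels and remove unsupported ones until stable."""
    labels = {obj: {l for l in LABELS if unary_ok(l, props)}
              for obj, props in scene.items()}
    changed = True
    while changed:
        changed = False
        for obj in labels:
            keep = {l for l in labels[obj] if supported(l, obj, labels, relations)}
            if keep != labels[obj]:
                labels[obj] = keep
                changed = True
    return labels


for obj, remaining in sorted(discrete_relaxation(SCENE, RELATIONS).items()):
    print(obj, sorted(remaining))
```

With these assumed inputs, object 2 keeps both W and P, because none of the listed constraints separates a rectangular phone from a window. The procedure removes only labels that are provably inconsistent, so an empty label set for any object would signal that no consistent labeling exists.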
Probabilistic relaxation
- Constraints are a typical tool in image understanding.
- The classical problem of discrete relaxation labeling was first introduced by Waltz in
1972 in understanding perspective line drawings depicting 3D objects.
- Discrete relaxation results in an unambiguous labeling; in a majority of real
situations, however, it represents an oversimplified approach to image data understanding
because it cannot cope with incomplete or imprecise segmentation.
- Using semantics and knowledge, image understanding is supposed to solve segmentation
problems which cannot be solved by bottom-up interpretation approaches.
- Probabilistic relaxation may overcome the segmentation problems of missing objects or
extra regions in the scene, but it results in an ambiguous image interpretation which
is often inconsistent.
- Consider the relaxation problem as specified above (regions R_i and sets of labels
Omega_i) and in addition, let each object R_i be described by a set of unary properties
X_i.
- Similarly to discrete relaxation, object labeling depends on the object properties and
on a measure of compatibility of the potential object labels with the labeling of other
directly interacting objects.
- All the image objects may be considered directly interacting and a general form of the
algorithm will be given assuming this.
- Nevertheless, only adjacent objects are usually considered to interact directly to
reduce computational demands of the relaxation.
- However, as before, more distant objects still interact with each other as a result of
the constraint propagation.
- A region adjacency graph is usually used to store the adjacency information.
- Confidence in the label theta_i of an object R_i depends on the configuration of
labels of directly interacting objects.
- Let r(theta_i=omega_k, theta_j=omega_l) represent the value of a compatibility function
for two interacting objects R_i and R_j with labels theta_i and theta_j (the probability
that two objects with labels theta_i and theta_j appear in a specific relation).
- The relaxation algorithm is iterative and its goal is to achieve the locally best
consistency in the entire image.
- The support q_j^s for a label theta_i of the object R_i resulting from the binary
relation with the object R_j at the s-th step of the iteration process is
q_j^s(theta_i=omega_k) = sum_{omega_l in Omega_j} r(theta_i=omega_k, theta_j=omega_l) P^s(theta_j=omega_l)
- where P^s(theta_j=omega_l) is the probability that region R_j should be labeled omega_l.
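- The support Q^s for the same label theta_i of the same object R_i resulting from
all N directly interacting objects R_j and their labels theta_j at the s-th step is
Q^s(theta_i=omega_k) = sum_{j=1}^{N} c_ij q_j^s(theta_i=omega_k)
= sum_{j=1}^{N} c_ij sum_{omega_l in Omega_j} r(theta_i=omega_k, theta_j=omega_l) P^s(theta_j=omega_l)
- where c_ij are positive weights with a unit sum, sum_{j=1}^{N} c_ij = 1.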
- The coefficients c_ij represent the strength of interaction between objects R_i and R_j.
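- Originally, an updating formula was given which specified the new probability of a label
theta_i according to the previous probability P^s(theta_i=omega_k) and probabilities
of labels of interacting objects
P^{s+1}(theta_i=omega_k) = (1/K) P^s(theta_i=omega_k) Q^s(theta_i=omega_k)
- where K is a normalizing constant,
K = sum_{omega_l in Omega_i} P^s(theta_i=omega_l) Q^s(theta_i=omega_l)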
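The support computation and the multiplicative update with normalization can be sketched in Python. This is a minimal illustration, assuming a particular data layout that is not part of the original text: `P[i][k]` holds the probability of label k for object i, `r[(i, j)][k][l]` holds the compatibility of label k on object i with label l on object j, and `c[i][j]` holds the interaction weights. The numeric values are invented for the example.

```python
def support(i, k, P, r, c):
    """Q^s(theta_i = omega_k): weighted support from all interacting objects."""
    total = 0.0
    for j, weight in c[i].items():
        # q_j^s: support from object j, summed over its labels.
        q_j = sum(r[(i, j)][k][l] * P[j][l] for l in range(len(P[j])))
        total += weight * q_j
    return total


def nonlinear_update(P, r, c):
    """One step: new probability is proportional to old probability times support."""
    new_P = []
    for i in range(len(P)):
        raw = [P[i][k] * support(i, k, P, r, c) for k in range(len(P[i]))]
        K = sum(raw)  # normalizing constant
        new_P.append([v / K for v in raw] if K > 0 else list(P[i]))
    return new_P


# Two objects, two labels each (assumed: index 0 = table, index 1 = phone).
P = [[0.5, 0.5], [0.9, 0.1]]
# Assumed compatibilities: different labels on the two objects are likely.
r = {
    (0, 1): [[0.1, 0.9], [0.9, 0.1]],
    (1, 0): [[0.1, 0.9], [0.9, 0.1]],
}
c = {0: {1: 1.0}, 1: {0: 1.0}}  # each object interacts only with the other

P_next = nonlinear_update(P, r, c)
print(P_next)
```

In this example the undecided object 0 moves towards the label that is compatible with the confident labeling of object 1, while object 1 receives equal support for both labels and keeps its probabilities.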
- This form of the algorithm is usually referred to as a nonlinear relaxation scheme.
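- A linear scheme looks for probabilities such that
P(theta_i=omega_k) = Q(theta_i=omega_k) for all i, k
- with a non-contextual probability
P^0(theta_i=omega_k) = P(theta_i=omega_k | X_i)
being used only to start the relaxation process.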
- A relaxation algorithm can also be treated as an optimization problem, the goal being
maximization of the global confidence in the labeling.
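- The global objective function is
F = sum_{i=1}^{N} sum_{omega_k in Omega_i} P(theta_i=omega_k) sum_{j=1}^{N} c_ij sum_{omega_l in Omega_j} r(theta_i=omega_k, theta_j=omega_l) P(theta_j=omega_l)
- subject to the constraint that the solution satisfies
sum_{omega_k in Omega_i} P(theta_i=omega_k) = 1 for all i, and P(theta_i=omega_k) >= 0 for all i, k.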
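To make the optimization view concrete, the sketch below evaluates a global confidence of the form "label probability times weighted support, summed over all objects and labels" and checks the probability constraints. The data layout (`P[i][k]`, `r[(i, j)][k][l]`, `c[i][j]`) and all numeric values are assumptions made for this example only.

```python
def objective(P, r, c):
    """Global confidence: sum over objects and labels of probability times support."""
    F = 0.0
    for i, P_i in enumerate(P):
        for k, p_ik in enumerate(P_i):
            for j, weight in c[i].items():
                q_j = sum(r[(i, j)][k][l] * P[j][l] for l in range(len(P[j])))
                F += p_ik * weight * q_j
    return F


def is_feasible(P, tol=1e-9):
    """Each object's label probabilities must be non-negative and sum to one."""
    return all(
        abs(sum(row) - 1.0) <= tol and all(p >= 0.0 for p in row)
        for row in P
    )


# Assumed compatibilities for two objects with two labels each.
r = {
    (0, 1): [[0.1, 0.9], [0.9, 0.1]],
    (1, 0): [[0.1, 0.9], [0.9, 0.1]],
}
c = {0: {1: 1.0}, 1: {0: 1.0}}

same_labels = [[1.0, 0.0], [1.0, 0.0]]       # both objects take label 0
different_labels = [[0.0, 1.0], [1.0, 0.0]]  # the compatible combination

print(objective(same_labels, r, c), objective(different_labels, r, c))
print(is_feasible(same_labels), is_feasible([[0.7, 0.7], [1.0, 0.0]]))
```

The compatible labeling scores higher, which is the behavior an optimization-based relaxation would exploit when searching over feasible probability assignments.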
- Optimization approaches to relaxation can be generalized to allow n-ary relations among
objects.
- Convergence is an important property of iterative algorithms; as far as relaxation is
concerned, convergence problems have not yet been satisfactorily solved.
- Although convergence of a discrete relaxation scheme can always be achieved by an
appropriate design of label updating scheme (e.g. to remove the inconsistent labels),
convergence of more complex schemes where labels may be added, or of probabilistic
relaxation, often cannot be guaranteed mathematically.
- Despite this fact, the relaxation approach may still be quite useful.
- Relaxation algorithms are one of the cornerstones of the high-level vision understanding
processes, and applications can also be found outside the area of computer vision.
- Relaxation algorithms are naturally parallel since the label updating may be done
on all objects at the same time.
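- Many parallel implementations exist, and parallel relaxation does not differ in essence
from the serial version. In general terms, the steps are
- Define the initial (non-contextual) label probabilities for all objects R_i.
- Evaluate the quality of the current scene labeling, e.g. through the global objective
function.
- Update the label probabilities of all objects to increase the labeling quality.
- Repeat the last two steps until the labeling no longer improves.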
- Relaxation algorithms are still being developed.
- One existing problem with their behavior is that the labeling improves rapidly during
early iterations followed by a degradation, which may be very severe.
- The reason is that the search for the global optimum over the image may cause highly
non-optimal local labeling.
- A possible treatment that allows spatial consistency to be developed while avoiding
labeling degradation is based on decreasing the neighborhood influence with the iteration
count.
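One way to picture a neighborhood influence that decreases with the iteration count is to blend the contextual support into the update with a shrinking weight. The sketch below is only an illustration of that idea: the geometric schedule in `neighborhood_influence`, the blending rule in `damped_update`, and the fixed support vector `q` are assumptions, not a scheme taken from the original text.

```python
def neighborhood_influence(step, initial=1.0, decay=0.5):
    """Assumed schedule: influence shrinks geometrically with the iteration count."""
    return initial * decay ** step


def damped_update(p, q, influence):
    """Update one object's label probabilities p using the support values q.

    influence = 1 gives the plain product rule p_k * q_k (then normalized);
    influence = 0 ignores the neighbors and leaves p unchanged.
    """
    raw = [p_k * ((1.0 - influence) + influence * q_k) for p_k, q_k in zip(p, q)]
    total = sum(raw)
    return [v / total for v in raw] if total > 0 else list(p)


p = [0.5, 0.5]
q = [0.2, 0.8]  # support held fixed here; in practice it is recomputed each step
for step in range(4):
    a = neighborhood_influence(step)
    p = damped_update(p, q, a)
    print(step, round(a, 3), [round(v, 3) for v in p])
```

Early iterations let the context reshape the probabilities strongly, while later iterations change them less and less, which limits how far the labeling can drift after the initial improvement.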
Searching interpretation trees
- Note that relaxation is not the only way to solve discrete labeling problems and
classical methods of interpretation tree searching may be applied.
- A tree has as many levels as there are objects present in the scene; nodes are assigned
all possible labels, and a depth-first search based on backtracking is applied.
- Starting with a label assigned to the first object node (tree root), a consistent label
is assigned to the second object node, to the third object node, etc.
- If a consistent label cannot be assigned, a backtracking mechanism changes the label of
the closest node at the higher level.
- All the label changes are done in a systematic way.
- An interpretation tree search tests all possible labelings, and therefore computational
inefficiency is common, especially if an appropriate tree pruning algorithm is not
available.
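The depth-first search with backtracking described above can be sketched in a few lines of Python. The function name `search`, the `consistent` callback, and the two-object example with its "above" relation are assumptions introduced for illustration; partial labelings that are already inconsistent are not expanded, which is the simplest form of tree pruning.

```python
def search(objects, labels, consistent, assignment=None):
    """Depth-first interpretation tree search.

    Returns the first complete consistent labeling found, or None if none exists.
    """
    assignment = assignment or {}
    if len(assignment) == len(objects):
        return dict(assignment)
    obj = objects[len(assignment)]       # one tree level per object
    for label in labels[obj]:
        assignment[obj] = label
        if consistent(assignment):       # prune infeasible partial labelings
            result = search(objects, labels, consistent, assignment)
            if result is not None:
                return result
        del assignment[obj]              # backtrack and try the next label
    return None


# Assumed example: r1 is observed above r2; only (W, T) and (P, T) are legal.
ABOVE = [("r1", "r2")]
ALLOWED = {("W", "T"), ("P", "T")}


def consistent(assignment):
    """Check every observed relation whose two objects are both labeled."""
    for a, b in ABOVE:
        if a in assignment and b in assignment:
            if (assignment[a], assignment[b]) not in ALLOWED:
                return False
    return True


objects = ["r1", "r2"]
labels = {"r1": ["T", "W"], "r2": ["W", "T"]}
print(search(objects, labels, consistent))
```

In this run the search first tries T for r1, finds no consistent label for r2, backtracks, and then succeeds with W for r1 and T for r2.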
- An efficient method for searching the interpretation trees was introduced by Grimson.
- The search is heuristically guided towards a good interpretation by a quality of match
that is derived from the constraints and may thus reflect the feasibility of the
interpretation.
- Clearly, an infeasible interpretation makes all interpretations below it in the tree
infeasible as well.
- To represent the possibility of discarding the evaluated patch, an additional
interpretation tree branch is added to each node.
- The general search strategy is based on a depth-first approach in which the search is for
the best interpretation.
- However, the search for the best solution can be very time consuming.
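The extra branch that discards a patch, combined with a depth-first search for the best interpretation, can be illustrated as follows. This is not Grimson's actual algorithm; it is a simple branch-and-bound sketch in which the quality of an interpretation is assumed to be the number of patches that keep a label, and a discarded patch is represented by `None`. All names and the three-patch example are hypothetical.

```python
def best_interpretation(objects, labels, consistent):
    """Depth-first search for the consistent labeling that keeps the most patches."""
    best = {"score": -1, "assignment": None}

    def descend(depth, assignment, score):
        remaining = len(objects) - depth
        if score + remaining <= best["score"]:
            return                          # bound: cannot beat the best so far
        if depth == len(objects):
            best["score"], best["assignment"] = score, dict(assignment)
            return
        obj = objects[depth]
        for label in labels[obj]:
            assignment[obj] = label
            if consistent(assignment):      # infeasible nodes are not expanded
                descend(depth + 1, assignment, score + 1)
            del assignment[obj]
        assignment[obj] = None              # extra branch: discard this patch
        descend(depth + 1, assignment, score)
        del assignment[obj]

    descend(0, {}, 0)
    return best["assignment"], best["score"]


# Assumed scene: r1 and r3 are observed above r2; r3 is a spurious region.
ABOVE = [("r1", "r2"), ("r3", "r2")]
ALLOWED = {("W", "T"), ("P", "T")}


def consistent(assignment):
    """Discarded (None) or unlabeled patches impose no constraint."""
    for a, b in ABOVE:
        la, lb = assignment.get(a), assignment.get(b)
        if la is not None and lb is not None and (la, lb) not in ALLOWED:
            return False
    return True


objects = ["r1", "r2", "r3"]
labels = {"r1": ["W"], "r2": ["T"], "r3": ["D"]}
print(best_interpretation(objects, labels, consistent))
```

Here the only label available for r3 conflicts with the table below it, so the best interpretation keeps r1 and r2 and discards r3. The bound prunes branches that cannot improve on the best result found so far, but in the worst case the search still visits a large part of the tree.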
Last Modified: April 1, 1997