55:148 Digital Image Processing
55:247 Image Analysis and Understanding
Chapter 6, Part II
Shape representation and description: Contour-based shape representation and description
Chapter 6.2 Overview:
Contour-based shape representation and description
- Region borders must be expressed in some mathematical form.
Chain codes
- Chain codes describe an object by a sequence of unit-size line segments with a given
orientation.
- The first element of such a sequence must bear information about its position to permit
the region to be reconstructed.
- If the chain code is used for matching it must be independent of the choice of the first
border pixel in the sequence. One possibility for normalizing the chain code is to find
the pixel in the border sequence which results in the minimum integer number if the
description chain is interpreted as a base four number -- that pixel is then used as the
starting pixel.
- The sequence of differences between successive chain code elements, taken mod 4
(4-connectivity) or mod 8 (8-connectivity), is called a chain code derivative.
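As a minimal sketch of the two ideas above (start-point normalization and the chain code derivative), the following Python fragment assumes a 4-connected code with 0 = right, 1 = up, 2 = left, 3 = down. The direction convention and the function names are illustrative assumptions, not taken from the text.

```python
def normalize_start(chain):
    """Return the cyclic rotation of `chain` that is smallest as a base-4 number."""
    n = len(chain)
    rotations = [chain[i:] + chain[:i] for i in range(n)]
    # All rotations have the same length, so lexicographic order of the
    # digit lists equals numeric order of the corresponding numbers.
    return min(rotations)


def derivative(chain, directions=4):
    """Differences between successive codes, modulo `directions`, taken cyclically."""
    n = len(chain)
    return [(chain[(i + 1) % n] - chain[i]) % directions for i in range(n)]


# Border of a 2x2-step square traced anti-clockwise (assumed direction convention).
square = [0, 0, 1, 1, 2, 2, 3, 3]
# The same border with a different starting pixel.
other_start = square[3:] + square[:3]

same_code = normalize_start(square) == normalize_start(other_start)
turns = derivative(square)
```

Rotating the object by 90 degrees adds 1 (mod 4) to every code element, so the derivative is unchanged, while the normalized code no longer depends on which border pixel was visited first.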
Simple geometric border representation
- The following descriptors are mostly based on geometric properties of described regions.
Because of the discrete character of digital images, all of them are sensitive to image
resolution.
- Boundary length
- Curvature
- Chord: a line joining any two points of the region boundary.
- Let b(x,y)=1 represent the contour points, and b(x,y)=0 represent all other points.
- Rotation-independent radial distribution:
- The angular distribution h_a(theta) is independent of scale, while rotation causes a
proportional offset
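The radial distribution can be illustrated on discrete data by counting chords by length over all pairs of contour points. The sketch below is an assumed discretization (pairwise counting with unit-width bins), not the continuous definition; `radial_chord_histogram` and the sample contour are hypothetical.

```python
import math
from collections import Counter


def radial_chord_histogram(points, bin_width=1.0):
    """Histogram of chord lengths over all unordered pairs of contour points."""
    hist = Counter()
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            (x1, y1), (x2, y2) = points[i], points[j]
            r = math.hypot(x2 - x1, y2 - y1)
            hist[int(r // bin_width)] += 1
    return hist


# Contour pixels of a small square, i.e. the points where b(x,y)=1.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
# The same contour rotated by 90 degrees about the origin.
rotated = [(-y, x) for x, y in square]

unchanged = radial_chord_histogram(square) == radial_chord_histogram(rotated)
```

Chord lengths do not change under rotation, which is why a histogram over length alone is rotation independent; on a pixel grid this holds exactly only for rotations that map the grid onto itself.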
Fourier transforms of boundaries
- Suppose C is a closed curve (boundary) in the complex plane.
- Traveling anti-clockwise along this curve keeping constant speed, a complex function z(t)
is obtained, where t is a time variable.
- The coefficients T_n of the series are called the Fourier descriptors of the
curve C. It is more useful to parameterize the curve by the distance s along it than by time
- where L is the curve length. The Fourier descriptors T_n are given by
- The descriptors are influenced by the curve shape and by the initial point of the curve.
Working with digital image data, boundary co-ordinates are discrete and the function z(s)
is not continuous. Assume that z(k) is a discrete version of z(s), where 4-connectivity is
used to get a constant sampling interval; the descriptors T_n can be computed from the
discrete Fourier transform.
- The Fourier descriptors can be invariant to translation and rotation if the co-ordinate
system is appropriately chosen.
- The coefficients a_n, b_n are not invariant, but after the transform
- r_n are translation and rotation invariant.
- To achieve a magnification invariance the descriptors w_n are used:
- The first 10 -- 15 descriptors w_n are found to be sufficient for character description.
- A closed boundary can be represented as a function of angle tangents versus the distance
between the boundary points from which the angles were determined
- The descriptor set is then
- The high quality boundary shape representation obtained using only a few lower order
coefficients is a favorable property common to Fourier descriptors.
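A minimal sketch of discrete Fourier descriptors follows. It treats each boundary point as a complex number z(k) = x + iy, computes a plain DFT, and uses the magnitudes |T_n| divided by |T_1| in the role of the descriptors w_n. That normalization, the function names, and the test shape are assumptions made for illustration; the boundary is assumed to be traced anti-clockwise so that T_1 is non-zero.

```python
import cmath


def fourier_descriptors(boundary, count):
    """Return |T_n| / |T_1| for n = 1..count from a closed boundary."""
    z = [complex(x, y) for x, y in boundary]
    size = len(z)

    def coefficient(n):
        # Direct evaluation of the discrete Fourier transform at frequency n.
        return sum(z[k] * cmath.exp(-2j * cmath.pi * n * k / size)
                   for k in range(size)) / size

    # T_0 (the centroid) is skipped, which removes the effect of translation.
    r1 = abs(coefficient(1))
    return [abs(coefficient(n)) / r1 for n in range(1, count + 1)]


def square_boundary(side):
    """4-connected border of a square, traced anti-clockwise with unit steps."""
    pts = [(i, 0) for i in range(side)]
    pts += [(side, i) for i in range(side)]
    pts += [(side - i, side) for i in range(side)]
    pts += [(0, side - i) for i in range(side)]
    return pts


shape = square_boundary(4)
# Same shape scaled by 3, rotated by 90 degrees, shifted, with a new start point.
moved = [(-3 * y + 7, 3 * x - 2) for x, y in shape]
moved = moved[5:] + moved[:5]

original = fourier_descriptors(shape, 5)
transformed = fourier_descriptors(moved, 5)
```

The two descriptor lists agree up to rounding error: taking magnitudes discards rotation and the choice of starting point, and dividing by |T_1| discards scale.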
Boundary description using segment sequences
- If the segment type is known for all segments, the boundary can be described as a
chain of segment types, a code-word consisting of representatives of a type alphabet.
- A polygonal representation approximates a region by a polygon, the region being
represented using its vertices. Polygonal representations are obtained as a result of a
simple boundary segmentation.
- Another method for determining the boundary vertices is a tolerance interval approach
based on setting a maximum allowed difference e.
- Recursive boundary splitting.
- Boundary segmentation into segments of constant curvature or curve segmentation into
circular arcs and straight lines is used.
- Segments are considered as primitives for syntactic shape recognition procedures.
- Sensitivity of shape descriptors to scale is also important if a curve is to be divided
into segments; this motivates a scale-space approach to curve segmentation.
- Only new segmentation points can appear at higher resolutions, and no existing
segmentation points can disappear.
- Fine details of the curve disappear in pairs with increasing size of the Gaussian
smoothing kernel, and two segmentation points always merge to form a closed contour
showing that any segmentation point existing in coarse resolution must also exist in finer
resolution.
- Moreover, the position of a segmentation point is most accurate in finest resolution and
this position can be traced from coarse to fine resolution using the scale-space image.
- A multiscale curve description can be represented by an interval tree.
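Recursive boundary splitting with a maximum allowed difference e can be sketched as below. The code uses the familiar split-at-the-farthest-point scheme (often called the Ramer or Douglas-Peucker method) on an open curve; that this is the exact variant intended here is an assumption, and the function names are hypothetical.

```python
import math


def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = math.hypot(bx - ax, by - ay)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs((bx - ax) * (ay - py) - (ax - px) * (by - ay)) / length


def split_boundary(points, tolerance):
    """Return polygon vertices approximating `points` within `tolerance`."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    distances = [point_line_distance(p, a, b) for p in points[1:-1]]
    # Index (in `points`) of the point farthest from the chord a-b.
    worst = max(range(len(distances)), key=distances.__getitem__) + 1
    if distances[worst - 1] <= tolerance:
        return [a, b]
    left = split_boundary(points[:worst + 1], tolerance)
    right = split_boundary(points[worst:], tolerance)
    return left[:-1] + right  # the split vertex appears only once


# An L-shaped boundary fragment sampled at unit steps.
curve = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
vertices = split_boundary(curve, tolerance=0.5)
```

For a closed boundary, one common choice is to split it first at two mutually distant points and apply the procedure to each half; a smaller tolerance yields more vertices.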
B-spline representation
- Representation of curves using piecewise polynomial interpolation to obtain smooth
curves is widely used in computer graphics.
- B-splines are piecewise polynomial curves whose shape is closely related to their
control polygon - a chain of vertices giving a polygonal representation of a curve.
- B-splines of the third-order are most common because this is the lowest order which
includes the change of curvature.
- Splines have very good representation properties and are easy to compute:
- Firstly, they change their shape less than their control polygon, and do not oscillate
between sampling points as many other representations do. A spline curve is always
positioned inside a convex (n+1)-polygon for a B-spline of the n-th order.
- Secondly, the interpolation is local in character. If a control polygon vertex changes
its position, a resulting change of the spline curve will occur only in a small
neighborhood of that vertex.
- Thirdly, methods of matching region boundaries represented by splines to image data are
based on a direct search of original image data.
- Each part of a cubic B-spline curve is a third-order polynomial, meaning that it and its
first and second derivatives are continuous. B-splines are given by
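One segment of a uniform cubic B-spline can be evaluated as a weighted sum of four consecutive control polygon vertices. The sketch below uses the standard uniform cubic basis polynomials; the assumption of uniform knots, the function name, and the sample polygon are illustrative.

```python
def cubic_bspline_point(v0, v1, v2, v3, t):
    """Point on a uniform cubic B-spline segment for parameter t in [0, 1]."""
    b0 = (1 - t) ** 3 / 6
    b1 = (3 * t**3 - 6 * t**2 + 4) / 6
    b2 = (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6
    b3 = t**3 / 6
    # The four weights are non-negative and sum to 1, so the point lies in
    # the convex hull of the four vertices.
    return tuple(b0 * a + b1 * b + b2 * c + b3 * d
                 for a, b, c, d in zip(v0, v1, v2, v3))


control_polygon = [(0, 0), (1, 2), (3, 3), (4, 1), (6, 0)]

# End of the first segment and start of the second segment.
end_of_first = cubic_bspline_point(*control_polygon[0:4], 1.0)
start_of_second = cubic_bspline_point(*control_polygon[1:5], 0.0)
```

The two points coincide, which illustrates the continuity at segment joins; moving one vertex affects only the segments whose four-vertex window contains it, which is the local character mentioned above.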
Other contour-based shape description approaches
- Many other methods and approaches can be used to describe two-dimensional curves and
contours.
- The Hough transform has excellent shape description abilities.
- Region-based shape description using statistical moments is covered below.
- Fractal approach to shape is gaining attention in image shape description.
- Mathematical morphology can be used for shape description, typically in
connection with region skeleton construction.
- Neural networks can be used to recognize shapes in raw boundary representations
directly. Contour sequences of noiseless reference shapes are used for training, and noisy
data are used in later training stages to increase robustness; effective representations
of closed planar shapes result.
Shape invariants
- Shape invariants represent a very active current research area in machine vision.
- The importance of shape invariance has been known for a long time; however, it is a
somewhat novel approach in machine vision.
- Invariant theory is not new and many of its principles were introduced in the nineteenth
century.
- Shape descriptors discussed so far depend on viewpoint, meaning that object recognition
may often be impossible as a result of changed object or observer position.
- The role of shape description invariance is obvious -- shape invariants represent
properties of such geometric configurations which remain unchanged under an appropriate
class of transforms.
- Machine vision is especially concerned with the class of projective transforms.
- Collinearity is the simplest example of a projectively invariant image feature. Any
straight line is projected as a straight line under any projective transform.
- Similarly, the basic idea of the projection-invariant shape description is to find such
shape features that are unaffected by the transform between the object and the image
plane.
- A standard technique of projection-invariant description is to hypothesize the pose
(position and orientation) of an object and transform this object into a specific
co-ordinate system; then shape characteristics measured in this co-ordinate system yield
an invariant description.
- However, the pose must be hypothesized for each object and each image which makes this
approach difficult and unreliable.
- Application of invariant theory, where invariant descriptors can be computed
directly from image data without the need for a particular co-ordinate system, represents
another approach.
- Let corresponding entities in two different co-ordinate systems be distinguished by
large and small letters. An invariant of a linear transformation may be defined as:
- An invariant, I(P), of a geometric structure described by a parameter vector P, subject
to a linear transformation T of the co-ordinates x=TX, is transformed according to
I(p)=I(P)|T|^w.
- Here I(p) is the function of the parameters after the linear transformation, and |T| is
the determinant of the matrix T.
- In this definition, w is referred to as the weight of the invariant. If w=0, the
invariants are called scalar invariants, which are considered below.
- Invariant descriptors are unaffected by object pose, by perspective projection, and by
the intrinsic parameters of the camera.
- Several examples of invariants are now given.
- Cross ratio:
- The cross ratio represents a classic invariant of a projective line.
- A straight line is always projected as a straight line. Any four collinear points
A,B,C,D may be described by the cross-ratio invariant
- where (A-C) represents the distance between points A and C. Note that the cross ratio
depends on the order in which the four collinear points are labeled.
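The invariance of the cross ratio can be checked numerically. The sketch below assumes the commonly used form I = (A-C)(B-D) / ((A-D)(B-C)), represents each collinear point by its signed coordinate along the line, and applies an arbitrary one-dimensional projective map; the map coefficients and names are hypothetical.

```python
from fractions import Fraction


def cross_ratio(a, b, c, d):
    """Cross ratio of four collinear points given by 1-D coordinates."""
    return ((a - c) * (b - d)) / ((a - d) * (b - c))


def projective_map(x, p=2, q=1, r=1, s=3):
    """1-D projective transform x -> (p x + q) / (r x + s), with p s - q r != 0."""
    return (p * x + q) / (r * x + s)


# Exact rational arithmetic avoids any rounding question in the comparison.
points = [Fraction(v) for v in (0, 1, 3, 7)]
mapped = [projective_map(x) for x in points]

before = cross_ratio(*points)
after = cross_ratio(*mapped)
# Swapping the labels of the first two points gives a different value.
relabeled = cross_ratio(points[1], points[0], points[2], points[3])
```

Here `before` and `after` are equal even though the spacing between the points changes under the map, while `relabeled` differs, which illustrates the dependence on labeling order noted above.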
Practical Experiment 5.1
- Open the image shape-invariants.pgm or shape-invariants.tif using cantata (located in
~dip/examples/images.dir)
- Determine coordinates of 4 collinear points from object type 1 and type 2 (corresponding
within object types).
- Calculate the invariant I (eq. 6.25) for the two shape classes - you may use the simple
Matlab shape.m program (located in ~dip/examples/khoros.dir).
- Are the invariants equal for the same objects imaged in different poses?
- Are the invariants different for the two shapes?
Hint: For the star-like object, no collinear points exist directly. However, 4 coplanar
lines can always be used to generate 4 collinear points.
- A system of five general coplanar lines forms two invariants
where M_ijk=(l_i, l_j, l_k).
l_i=(l_i^1,l_i^2,l_i^3)^T is a representation of a line l_i^1x+l_i^2y+ l_i^3=0, where i
is from interval [1,5], and |M| is the determinant of M.
If the three lines forming the matrix M_ijk are concurrent, the matrix becomes singular
and the invariant is undefined.
- A system of five coplanar points is dual to a system of five lines and the same two
invariants are formed.
- These two functional invariants can also be formed as two cross ratios of two coplanar
concurrent line quadruples.
- Note that even though combinations other than those given in Figure may be formed, only
the two presented functionally independent invariants exist.
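The two five-element invariants can be sketched with 3x3 determinants. The formulas did not survive in these notes, so the code assumes the form commonly quoted in the invariant literature, I_1 = |M_431||M_521| / (|M_421||M_531|) and I_2 = |M_421||M_532| / (|M_432||M_521|), and applies it to five points (the dual of five lines). The sample points, the matrix T, and the per-point scale factors are hypothetical.

```python
from fractions import Fraction


def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))


def five_element_invariants(e):
    """`e` maps indices 1..5 to homogeneous 3-vectors (lines or points)."""
    def m(i, j, k):
        return det3(e[i], e[j], e[k])
    i1 = (m(4, 3, 1) * m(5, 2, 1)) / (m(4, 2, 1) * m(5, 3, 1))
    i2 = (m(4, 2, 1) * m(5, 3, 2)) / (m(4, 3, 2) * m(5, 2, 1))
    return i1, i2


def apply(t, v, scale=1):
    """Return scale * T v; homogeneous vectors are defined only up to scale."""
    return tuple(scale * sum(t[r][c] * v[c] for c in range(3)) for r in range(3))


raw = {1: (0, 0, 1), 2: (1, 0, 1), 3: (0, 1, 1), 4: (1, 1, 1), 5: (2, 3, 1)}
points = {i: tuple(Fraction(c) for c in p) for i, p in raw.items()}

T = [[2, 1, 0], [0, 1, 3], [1, 0, 1]]          # non-singular, determinant 5
scales = {1: 1, 2: 2, 3: 3, 4: -1, 5: 5}        # arbitrary non-zero factors
moved = {i: apply(T, p, scales[i]) for i, p in points.items()}

same = five_element_invariants(points) == five_element_invariants(moved)
```

Each index occurs equally often in numerator and denominator, so the arbitrary scale of every homogeneous vector cancels, and the determinant of T cancels because both contain the same number of determinants. If any three of the elements used in one matrix are concurrent (or, for points, collinear), the corresponding determinant is zero and the value is undefined.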
- Plane conics:
- A plane conic may be represented by an equation
for x=(x,y,1)^T.
Then the conic may also be defined by a matrix C
- For any conic represented by a matrix C, and any two coplanar lines not tangent to the
conic, one invariant may be defined
- The same invariant can be formed for a conic and two coplanar points.
- Two invariants can be determined for a pair of conics represented by their respective
matrices C_1, C_2 normalized so that |C_i|=1
- For non-normalized conics, the invariants of associated quadratic forms are
- and two true invariants of the conics are
- Two plane conics uniquely determine four points of intersection, and any point that is
not an intersection point may be chosen to form a five-point system together with the four
intersection points.
- Therefore, two invariants exist for the pair of conics, as for the five-point system.
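The invariant of a conic and two coplanar points can be checked in the same way. The sketch below assumes the form I = (x_1^T C x_2)^2 / ((x_1^T C x_1)(x_2^T C x_2)), which is an assumption since the formula is not reproduced in these notes. Two views are related by a hypothetical matrix S with x = S x', so the conic matrix in the second view is S^T C S.

```python
from fractions import Fraction as F


def bilinear(x, c, y):
    """Return x^T C y for homogeneous 3-vectors x, y and a 3x3 matrix C."""
    return sum(x[r] * c[r][k] * y[k] for r in range(3) for k in range(3))


def conic_two_point_invariant(c, x1, x2):
    """Assumed form: (x1^T C x2)^2 / ((x1^T C x1) (x2^T C x2))."""
    return bilinear(x1, c, x2) ** 2 / (bilinear(x1, c, x1) * bilinear(x2, c, x2))


def apply(m, v):
    """Return the matrix-vector product M v."""
    return tuple(sum(m[r][k] * v[k] for k in range(3)) for r in range(3))


def congruence(s, c):
    """Return S^T C S, the conic matrix after substituting x = S x'."""
    return [[sum(s[a][r] * c[a][b] * s[b][k] for a in range(3) for b in range(3))
             for k in range(3)] for r in range(3)]


# View A: the unit circle x^2 + y^2 - 1 = 0 as a conic matrix.
conic_a = [[F(1), F(0), F(0)], [F(0), F(1), F(0)], [F(0), F(0), F(-1)]]
# Hypothetical non-singular map from view-B to view-A coordinates.
S = [[F(2), F(1), F(0)], [F(0), F(1), F(3)], [F(1), F(0), F(1)]]

x1_b = (F(2), F(0), F(1))
x2_b = (F(0), F(3), F(1))
conic_b = congruence(S, conic_a)

invariant_a = conic_two_point_invariant(conic_a, apply(S, x1_b), apply(S, x2_b))
invariant_b = conic_two_point_invariant(conic_b, x1_b, x2_b)
```

The two values agree exactly. Each point appears twice in both numerator and denominator, so its homogeneous scale cancels; the points must not lie on the conic, otherwise the denominator vanishes.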
- Many man-made objects consist of a combination of straight lines and conics, and these
invariants may be used for their description.
- However, if the object has a contour which cannot be represented by an algebraic curve,
the situation is much more difficult.
- Differential invariants can be formed (e.g. curvature, torsion, Gaussian
curvature) which are not affected by projective transforms.
- These invariants are local - that is, the invariants are found for each point on the
curve, which may be quite general.
- Unfortunately, these invariants are extremely large and complex polynomials, requiring
up to seventh derivatives of the curve, which makes them practically unusable due to image
noise and acquisition errors, although noise-resistant local invariants are beginning to
appear.
- However, if additional information is available, higher derivatives may be avoided.
- Stability of invariants is another crucial property which affects their applicability.
The robustness of invariants to image noise and errors introduced by image sensors is of
prime importance, although not much is known about this.
- Different invariants have different stability and distinguishing powers.
- An example of recognition of man-made objects using invariant description of four
coplanar lines, a conic and two lines, and a pair of coplanar conics is given.
- The recognition system is based on a model library containing over thirty object models,
significantly more than reported for other recognition systems.
- Moreover, the construction of the model library is extremely easy; no special
measurements are needed, the object is digitized in a standard way and the projectively
invariant description is stored as a model.
- Further, there is no need for camera calibration. The recognition accuracy is 100% for
occluded objects viewed from different viewpoints if the objects are not severely
disrupted by shadows and specularities.
Last Modified: February 3, 1997