55:148 Digital Image Processing
55:247 Image Analysis and Understanding

Chapter 8, Part I
Image understanding: Image understanding control strategies

Chapter 8.1 Overview:

Parallel and serial processing control
Hierarchical control
Bottom-up control strategies
Model-based control strategies
Combined control strategies
Non-hierarchical control

Image understanding control strategies

Image understanding requires mutual interaction of processing steps.
A human being is well prepared to do image processing, analysis and understanding. Despite this fact, it may sometimes be difficult to recognize what is seen if what to expect is not known.

The main difference between a human observer and an artificial vision system is that the latter lacks widely applicable, general, and modifiable knowledge of the real world.
Machine vision systems construct internal models of the processed scene, verify them, and update them, and an appropriate sequence of processing steps must be performed to fulfill the given task.
If the internal model matches the reality, image understanding is achieved.

On the other hand, the existence of an image model is a prerequisite for perception; there is no inconsistency in this.

The image representation has an incremental character; new data or perceptions are compared with an existing model, and are used for model modification.

Image data interpretation does not depend on the image data alone. Variations in starting models, as well as differences in previous experience, cause the same data to be interpreted differently, although always consistently with the constructed model; any final interpretation can be considered correct if the only criterion evaluated is the match between a model and the image data.

Machine vision consists of lower and upper processing levels, and image understanding is the highest processing level in this classification.
The main task of this processing level is to define control strategies that ensure an appropriate sequence of processing steps.
A machine vision system must be able to deal with a large number of interpretations that are hypothetical and ambiguous.
Generally viewed, the organization of the machine vision system consists of a weak hierarchical structure of image models.

Many important results have been achieved in image understanding in recent years. Despite that, the image understanding process remains an open area of computer vision and is under continued investigation.

Image understanding is one of the most complex challenges of AI, and to cover this complicated area of computer vision in detail it would be necessary to discuss relatively independent branches of AI:

knowledge representation
relational structures
semantic networks
general matching
inference
production systems
problem solving
planning
control
feedback
learning from experience
etc.

Image understanding control strategies
Image understanding can be achieved only as a result of cooperation of complex information processing tasks and appropriate control of these tasks.
Biological systems include a very complicated and complex control strategy incorporating parallel processing, dynamic sensing subsystem allocation, behavior modifications, interrupt driven shifts of attention, etc.

As in other AI problems, the main goal of computer vision is to achieve machine behavior similar to that of biological systems by applying technically available procedures.

Parallel and serial processing control

Both parallel and serial approaches can be applied to image processing, although sometimes it is not obvious which steps should be processed in parallel and which serially.
Parallel processing performs several computations simultaneously. An extremely important consideration is the synchronization of processing actions; that is, the decision of when, or if, the processing should wait for other processing steps to complete.

Operations are always sequential in serial processing.

Almost all low-level image processing can be done in parallel.
High-level processing using higher levels of abstraction is usually serial in essence.
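
The split between parallel low-level work and serial high-level work can be sketched as follows. This is only an illustration under simple assumptions: the function names, the threshold value, and the tiny test image are hypothetical, and rows are thresholded in a thread pool purely to show where the synchronization point falls.

```python
from concurrent.futures import ThreadPoolExecutor

def threshold_row(row, level=128):
    """Low-level step: each pixel is processed independently of the others."""
    return [1 if pixel >= level else 0 for pixel in row]

def count_foreground(binary_image):
    """High-level step (serial): needs the complete thresholded image."""
    return sum(sum(row) for row in binary_image)

def process(image, workers=4):
    # Rows are independent, so they can be thresholded in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order; list() blocks until every
        # row is done, which is the synchronization point.
        binary = list(pool.map(threshold_row, image))
    # The serial step starts only after every parallel step has finished.
    return binary, count_foreground(binary)

# Hypothetical 3x3 grey-level image.
image = [[10, 200, 130], [255, 0, 128], [127, 129, 50]]
binary, n = process(image)
print(binary, n)
```

The serial step cannot begin until all rows have been collected, which is exactly the "wait for other processing steps to complete" decision mentioned above.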

There is an obvious comparison with the human strategy of solving complex sensing problems: A human always concentrates on a single topic during later phases of vision even if the early steps are done in parallel.

Hierarchical control

Should the processing be controlled by the image data information or by higher level knowledge?

Control by the image data (bottom-up control):

Processing proceeds from the raster image to segmented image, to region description, and to their recognition.

Model-based control (top-down control):

A set of assumptions and expected properties is constructed from applicable knowledge.
The satisfaction of those properties is tested in image representations at different processing levels in a top-down direction.
The image understanding is an internal model verification, and the model is either accepted or rejected.

The two basic control strategies do not differ in the types of operation applied, but do differ in the sequence of their application, in the application either to all image data or just to selected image data, etc.

The control mechanism chosen is not only a route to the processing goal; it also influences the whole control strategy.

Neither top-down nor bottom-up control strategies can explain the vision process or solve complex vision sensing problems in their standard forms.
Their appropriate combination can yield a more flexible and powerful vision control strategy.

Bottom-up control strategies

A general bottom-up algorithm follows the sequence given above: raster image, segmented image, region description, and recognition.

It is obvious that the bottom-up control strategy is based on the construction of data structures for the processing steps that follow.
Each algorithm step can consist of several substeps; however, the image representation remains unchanged within the substeps.
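
A minimal sketch of such a bottom-up chain is given below. It is not the algorithm from the course text; the thresholding segmentation, the region descriptors, and the "bar"/"blob" recognition rule are all hypothetical choices made only to show how each step builds the data structure consumed by the next one.

```python
def segment(image, level=128):
    """Raster image -> label image, using 4-connected flood fill."""
    h, w = len(image), len(image[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for r in range(h):
        for c in range(w):
            if image[r][c] < level or labels[r][c]:
                continue  # background pixel, or already labeled
            next_label += 1
            labels[r][c] = next_label
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and image[ny][nx] >= level and not labels[ny][nx]):
                        labels[ny][nx] = next_label
                        stack.append((ny, nx))
    return labels

def describe(labels):
    """Label image -> region descriptions (area, occupied rows and columns)."""
    regions = {}
    for r, row in enumerate(labels):
        for c, lab in enumerate(row):
            if lab:
                d = regions.setdefault(lab, {"area": 0, "rows": set(), "cols": set()})
                d["area"] += 1
                d["rows"].add(r)
                d["cols"].add(c)
    return regions

def recognize(regions):
    """Region descriptions -> class names, using a hypothetical toy rule."""
    names = {}
    for lab, d in regions.items():
        height, width = len(d["rows"]), len(d["cols"])
        elongated = max(height, width) >= 3 * min(height, width)
        names[lab] = "bar" if elongated else "blob"
    return names

# Hypothetical raster image with one elongated and one compact bright region.
image = [
    [200, 200, 200, 200, 0],
    [0, 0, 0, 0, 0],
    [0, 220, 220, 0, 0],
    [0, 220, 220, 0, 0],
]
regions = describe(segment(image))
names = recognize(regions)
print(names)
```

Note that no step looks back at an earlier representation: an error made in segmentation is passed on unchanged to description and recognition.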

The bottom-up control strategy is advantageous if a simple and efficient processing method is available that is independent of the image data content.
Bottom-up control yields good results if unambiguous data are processed and if the processing gives reliable and precise representations for later processing steps.
If the input data are of low quality, bottom-up control can yield good results only if unreliability of the data causes just a limited number of insubstantial errors in each processing step.

This implies that the main image understanding role must be played by a control strategy that is not only a concatenation of processing operations in the bottom-up direction, but that also uses an internal model, goal specifications, planning, and complex cognitive processes.

Model-based control strategies

There is no general form of top-down control as was presented in the bottom-up control algorithm.
The main top-down control principle is the construction of an internal model and its verification, meaning that the main principle is goal oriented processing.

Goals at higher processing levels are split into subgoals at lower processing levels, which are split again into subgoals etc., until the subgoals can either be accepted or rejected directly.

An example - looking for your car from a hotel room window.

The general mechanism of top-down control is hypothesis generation and its testing.
The internal model generator predicts what a specific part of the model must look like in lower image representations.

The image understanding process consists of sequential hypothesis generation and testing.
The internal model is updated during the processing according to the results of the hypothesis tests.
The hypothesis testing relies on a (relatively small) amount of information acquired from lower representation levels, and the processing control is based on the fact that just the necessary image processing is required to test each hypothesis.
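
The hypothesize-and-verify loop, with goals split into subgoals, might be sketched as below. The goal representation (nested tuples), the evidence dictionary, and the car hypotheses are assumptions introduced for illustration only.

```python
def verify(goal, evidence, log):
    """Accept a goal only if all of its subgoals are accepted.

    A goal is either (name, test), where test is a predicate on low-level
    evidence, or (name, [subgoals]).
    """
    name, body = goal
    if callable(body):
        log.append(name)  # record which low-level tests were actually run
        return body(evidence)
    # all() stops at the first failing subgoal, so only the processing
    # needed to reject the hypothesis is carried out.
    return all(verify(sub, evidence, log) for sub in body)

def understand(hypotheses, evidence):
    """Test hypotheses one by one; return the first verified one and the test log."""
    log = []
    for goal in hypotheses:
        if verify(goal, evidence, log):
            return goal[0], log
    return None, log

def is_car_sized(e):
    return 3.0 <= e["length_m"] <= 5.5

# Hypothetical low-level measurements of one candidate region.
evidence = {"colour": "white", "length_m": 4.3, "on_parking_lot": True}

red_car = ("red car", [
    ("is red", lambda e: e["colour"] == "red"),
    ("is car sized", is_car_sized),
])
white_car = ("white car", [
    ("is white", lambda e: e["colour"] == "white"),
    ("plausible car", [
        ("is car sized", is_car_sized),
        ("on parking lot", lambda e: e["on_parking_lot"]),
    ]),
])

result, tests_run = understand([red_car, white_car], evidence)
print(result, tests_run)
```

The first hypothesis is rejected after a single colour test, so its size test is never run; this is the sense in which only the image processing needed to test each hypothesis is performed.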

The model-based control strategy (hypothesize and verify) seems to be a way of solving computer vision tasks while avoiding brute force processing; at the same time, this does not mean that parallel processing should not be applied whenever possible.

Combined control strategies

A combined control mechanism usually gives better results than any of the previously discussed, separately applied, basic control strategies.

An example of a robust approach to automated coronary border detection in angiographic images illustrates the combined control strategy.

A frequent problem of model-based control strategies is that the model control necessary in some parts of the image is too strong in other parts.
This is the rationale for a multi-stage approach where a strong model is applied at low resolution, and a weaker model leaves enough freedom for the search to be guided predominantly by image data at full resolution, thereby achieving higher overall accuracy.
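
The multi-stage idea can be illustrated with a toy border search. This is not the coronary border detection method itself: the straight-line model, the search window, the downsampling factor, and the edge-strength grid are all hypothetical simplifications.

```python
def downsample(edges, factor):
    """Average edge strength over factor x factor blocks."""
    h, w = len(edges) // factor, len(edges[0]) // factor
    return [[sum(edges[r * factor + i][c * factor + j]
                 for i in range(factor) for j in range(factor)) / factor ** 2
             for c in range(w)] for r in range(h)]

def coarse_border(edges):
    """Strong model: the border is a single straight vertical line."""
    width = len(edges[0])
    return max(range(width), key=lambda c: sum(row[c] for row in edges))

def refine_border(edges, centre, window):
    """Weak model: each row may deviate from the coarse border by `window` columns."""
    width = len(edges[0])
    lo, hi = max(0, centre - window), min(width, centre + window + 1)
    # Within the window the choice is driven by the image data alone.
    return [max(range(lo, hi), key=lambda c: row[c]) for row in edges]

def detect_border(edges, factor=2, window=2):
    coarse = coarse_border(downsample(edges, factor))
    centre = coarse * factor + factor // 2  # map back to full resolution
    return refine_border(edges, centre, window)

# Hypothetical edge-strength image: the true border runs near columns 4-5,
# and row 1 contains a stronger spurious response at column 0.
edges = [
    [0, 0, 0, 0, 9, 0, 0, 0],
    [10, 0, 0, 0, 0, 9, 0, 0],
    [0, 0, 0, 0, 0, 9, 0, 0],
    [0, 0, 0, 0, 9, 0, 0, 0],
]
border = detect_border(edges)
print(border)
```

A purely data-driven search that took the strongest response in every row would be drawn to the spurious response in row 1, while the coarse model alone would force a straight line; the combination follows the curved border and ignores the outlier.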

Non-hierarchical control

There is always an upper and a lower level in hierarchical control.
Conversely, non-hierarchical control can be seen as a cooperation of competing experts at the same level.

Non-hierarchical control can be applied to problems that can be separated into a number of subproblems, each of which requires some expertise.
The order in which the expertise should be deployed is not fixed.

The basic idea of non-hierarchical control is to ask for assistance from the expert that can help most to obtain the final solution.
The chosen expert may be known, for instance, for high reliability, high efficiency, or for the ability to provide the most information under given conditions, etc.
Criteria for selection of an expert from the set may differ; one possibility is to let the experts calculate their own abilities to contribute to the solution in particular cases - the choice is based on these local and individual evaluations.
Another option is to assign a fixed evaluation to each expert beforehand and help is then requested from the expert with the highest evaluation under given conditions.
The criterion for expert choice may be based on some appropriate combination of empirically detected evaluations computed by experts, and evaluations dependent on the actual state of the problem solution.
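
The selection criteria described above can be sketched as a weighted combination of a fixed rating and an expert's own state-dependent estimate. The `Expert` class, the `weight` parameter, and the example experts are hypothetical; the notes do not prescribe any particular formula.

```python
class Expert:
    def __init__(self, name, fixed_rating, self_evaluate):
        self.name = name
        self.fixed_rating = fixed_rating    # evaluation assigned beforehand
        self.self_evaluate = self_evaluate  # state -> estimated contribution in [0, 1]

def choose_expert(experts, state, weight=0.5):
    """Pick the expert with the highest combined evaluation.

    weight = 0 uses only the fixed ratings; weight = 1 uses only the
    experts' own estimates for the current state of the solution.
    """
    def score(expert):
        return ((1 - weight) * expert.fixed_rating
                + weight * expert.self_evaluate(state))
    return max(experts, key=score)

# Hypothetical experts and problem states.
experts = [
    Expert("edge detector", 0.8, lambda s: 0.9 if s["contrast"] > 0.5 else 0.2),
    Expert("texture classifier", 0.6, lambda s: 0.7),
]
low_contrast = {"contrast": 0.3}
high_contrast = {"contrast": 0.9}

print(choose_expert(experts, low_contrast).name)
print(choose_expert(experts, high_contrast).name)
print(choose_expert(experts, low_contrast, weight=0).name)
```

Because the choice is recomputed for every state, the order in which experts are called is not fixed in advance.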

A system for analysis of complex aerial photographs (Nagao) is an example of a successful application of non-hierarchical control - the blackboard principle was used for competing experts.

The blackboard usually includes a mechanism that retrieves specialized subsystems which can immediately affect the standard control.
These subsystems are very powerful and are called daemons.
The blackboard must include a mechanism that synchronizes the daemon activity.

The blackboard is sometimes called the short term memory - it contains information about interpretation of the processed image.
The long term memory, the knowledge base, consists of more general information that is valid for (almost) all representations of the problems to be solved.

The primary aim of the blackboard system is to identify places of interest in the image that should be processed with higher accuracy; that is, to locate places with a high probability of a target region being present.
For example, the approximate region borders are found first based on a fast computation of just a few basic characteristics, saving computational time and making the detailed analysis easier.

The control process follows the production system principle, using the information that comes from the region detection subsystems via the blackboard.

The blackboard serves as a place where all conflicts between region labelings are resolved (one region can be marked by two or more region detection subsystems at the same time, and it is necessary to decide which label is the best one).

Furthermore, the labeling errors are detected in the blackboard, and are corrected using backtracking principles.
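
A very small sketch of a blackboard that resolves labeling conflicts and supports backtracking is given below. It does not reproduce the aerial photograph system; the confidence-based conflict rule, the method names, and the region labels are assumptions made for illustration.

```python
class Blackboard:
    """Short-term memory: the current interpretation of the processed image."""

    def __init__(self):
        self.proposals = {}  # region id -> list of (confidence, label, source)

    def post(self, region, label, confidence, source):
        """Called by a region detection subsystem to propose a label."""
        self.proposals.setdefault(region, []).append((confidence, label, source))

    def label_of(self, region):
        """Resolve conflicts: the proposal with the highest confidence wins."""
        candidates = self.proposals.get(region)
        return max(candidates)[1] if candidates else None

    def retract(self, region):
        """Backtrack: discard the winning proposal so the next best takes over."""
        candidates = self.proposals.get(region)
        if candidates:
            candidates.remove(max(candidates))

# Two hypothetical subsystems mark the same region with different labels.
bb = Blackboard()
bb.post("R1", "forest", 0.6, "texture expert")
bb.post("R1", "field", 0.8, "colour expert")
first = bb.label_of("R1")

# Suppose a later consistency check finds the winning label to be an error.
bb.retract("R1")
second = bb.label_of("R1")
print(first, second)
```

Keeping every proposal rather than only the winner is what makes the backtracking step possible: a rejected label can be replaced without rerunning the detection subsystems.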

Last Modified: February 18, 1997
